09:02:47 #startmeeting ha
09:02:48 Meeting started Wed Dec 21 09:02:47 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:52 The meeting name has been set to 'ha'
09:03:12 hi o/
09:03:22 hi :) anyone else here today?
09:03:51 I'll try to do a better job with minutes than last time ...
09:04:54 OK, I guess just us
09:05:03 #topic specs
09:05:47 OK I've just updated the VM recovery spec.
09:05:58 #info VM recovery spec is making progress
09:06:38 yeah I saw, thanks. I will be working on these over the vacation while it is quiet
09:06:49 I have had a lot of urgent customer issues recently :-(
09:07:15 And I put a comment on the compute node monitoring spec also
09:07:32 great thanks!
09:07:37 I'll reply to that ASAP
09:07:50 a general format makes a lot of sense
09:08:11 yeah, I would like to have more discussion on that.
09:08:33 my feeling is that the specs need to be quite precise about the implementation
09:08:44 to ensure we can have compatibility between implementations
09:09:07 aspiers: sure
09:09:42 but ddeja does not seem so sure
09:10:58 well, I guess I need to think about this more
09:11:15 did you see the conversation from last week? if not, worth reading I think
09:11:32 yes, I read it
09:11:36 ok cool
09:12:11 I suggested that the libvirtd/nova-compute RAs should do normal process monitoring *and* recovery, but also send an HTTP message when the monitor fails and when starting/ending recovery
09:12:40 hey
09:12:40 then the external process recovery component could decide its own policy
09:12:44 oh hey haukebruno :)
09:12:55 haukebruno: hi!
09:13:18 aspiers: I agree
09:13:36 samP: ok great, I think I will propose that as the preferred implementation in the spec
09:13:42 samP: but I will list alternatives
09:14:00 samP: maybe including one where Pacemaker gets enhanced
09:14:32 aspiers: I'm wondering how much further I should write in the "VM recovery" spec
09:14:33 and definitely including the one where we do service-disable on every stop
09:15:25 samP: I think the VM recovery spec needs to document the interface point
09:16:31 aspiers: if we want to put more details about implementation in the spec, I prefer to fix the notification format first
09:16:49 uhhh
09:16:57 #topic VM recovery spec
09:17:01 :)
09:17:11 samP: yes, that makes sense
09:18:13 samP: did you see Qiming's suggestion about versioning?
09:18:28 https://review.openstack.org/#/c/406659/2/specs/newton/approved/newton-instance-ha-host-monitoring-spec.rst@205
09:19:58 #agreed that we should decide on a standard format for all HTTP notifications, across all components
09:20:04 how about having a simple spec for the notification format and referencing it from each spec
09:20:32 samP: ok sure
09:20:36 aspiers: yes, about JSON ver?
09:21:04 the versioning sounds like a good idea to me
09:21:21 I guess there is already an oslo standard?
09:23:32 hmm, maybe just nova
09:23:34 http://developer.openstack.org/api-guide/compute/microversions.html
09:23:57 I couldn't find any doc in oslo
09:24:20 this looks like a good place to start
09:24:31 * ddeja forgot about the meeting...
09:24:38 sorry guys, hello all
09:24:40 ddeja: haha no problem :)
09:24:49 ddeja: hi
09:25:00 but, I remember on Monday that there is no meeting
09:25:20 that's a good start anyway :)
09:25:22 ddeja: samP has a nice suggestion to standardize the common bits of HTTP message format across all components
09:25:43 that's good
09:26:12 whereas I'm still not sure if it's a good idea to standardize it at all, as I stated in one of the reviews
09:26:33 ddeja: yes I saw that
09:26:52 ddeja: I'm fine with your driver idea, but I think we have to specify the message format in the specs
09:27:22 ddeja: since the main point of the specs is to allow different implementations of each component to be compatible
09:27:30 so that they are interchangeable
09:27:49 OK
09:28:10 we can add an abstraction layer for compatibility reasons
09:28:31 I'm afraid that it may be seen as an overhead for others
09:28:37 and no one would use it
09:29:20 what kind of overhead do you mean?
09:29:27 performance, or implementation?
09:30:33 neither of them
09:30:39 let me show with an example
09:30:44 ok
09:30:56 let's say we have someone who wants to use Masakari
09:31:01 for VM ha
09:31:12 he would have to a) set up Pacemaker
09:31:22 b) set up an agent that monitors VMs
09:31:38 c) set up an agent that receives alarms from the monitor and informs Masakari
09:31:45 and d) Masakari itself
09:31:53 what would he think about it?
09:31:55 no, c) and d) are the same
09:32:04 OK
09:32:27 then, we need Masakari to be able to read the http message that the monitor would send
09:32:31 same for Mistral
09:32:31 yes
09:32:34 same for everything
09:33:19 * ddeja is now entering 'mistral core mode'
09:33:25 so, as for Mistral
09:33:27 sometimes it might be easier to write a shim which proxies notifications
09:33:34 but that's up to the individual implementation
09:33:48 unless that http message is openstack compliant, I see no way to implement something like that in Mistral
09:34:05 * ddeja closed 'mistral core mode' :)
09:34:12 ok so in the mistral case maybe a shim is required
09:34:39 is there a problem with that?
09:34:49 aspiers: for me there is not
09:35:02 but if we go this way
09:35:08 we get a, b, c and d
09:35:15 (as in my example)
09:35:28 right
09:35:38 and that someone may think "hm, why do I need something that just catches the alarm and sends it to the recovery service"
09:35:53 "why can't I just notify the recovery service itself?"
09:36:11 and that someone would end up writing his own monitor
09:36:30 that's why I think a driver-based architecture would be better
09:37:26 in the VM monitor spec, we just specify what information would be passed to the driver layer
09:37:44 and what to do with it is up to the driver writer/maintainer
09:37:57 but then we have no guarantee of compatibility
09:38:21 yes, but what kind of compatibility do you need at this point?
09:38:40 you just want to catch the alarm and send it to the recovery service of your choice
09:38:57 compatibility between monitors and recovery services
09:39:19 it would be guaranteed by the driver
09:39:52 the driver in the monitor?
09:39:56 which means, we don't need to specify what kind of information the recovery service needs to understand
09:40:18 we just specify that we have a driver that can notify the recovery service and it would understand it
09:41:12 but if that's undocumented then there is no reliable way for multiple implementations to be compatible
09:41:29 aspiers: agree
09:41:48 aspiers: taking it the other way around: why do we want it to be compatible?
09:42:08 as I said before: you just want to catch the alarm and send it to the recovery service of your choice
09:42:11 so that we can incrementally move from existing vendor implementations to a single upstream converged implementation
09:42:42 aspiers: so you can change each piece independently?
09:42:45 yes
09:43:03 that's one thing I didn't take into account
09:43:05 that was the whole point of the approach we agreed in Tokyo
09:43:31 OK
09:43:42 what if we have a driver that still talks "the old way"
09:43:49 because several of us have existing implementations which are in production and supported for customers
09:44:04 and then, when you change the recovery service, you just switch the driver?
09:44:44 ddeja: yes, that would be perfect. any support for communicating an "old way" is of course still allowed, but outside the scope of the spec
09:44:51 or, why not both?
09:44:54 I mean
09:44:58 both is fine
09:45:09 let's do the driver architecture
09:45:20 specify what information is given to it
09:45:30 the specs should only require support for the new way, they are not exclusive deals which prohibit the old ways :)
09:45:43 and give one driver that is "the" driver
09:46:01 for which component?
09:46:06 for monitor
09:46:16 that talks using the specified http message
09:46:17 VM monitor or host monitor?
09:46:24 or both
09:46:26 VM monitor
09:46:37 but for host monitor it should also work
09:47:09 then we can specify the modular architecture between monitors<->Recovery services
09:47:13 using standard drivers
09:47:29 I like the idea of drivers, but I don't see why the spec would need to require a driver architecture
09:47:51 any component is free to use a driver architecture, but why force all implementations to have one?
09:47:53 so that we can omit point 'c' for some recovery services ;)
09:48:37 sure, but why does it have to be in the spec?
09:48:40 we just let the user choose: either your recovery service understands the standard http message
09:48:53 aspiers: I didn't say it needs to be in the spec :)
09:48:56 oh ok :)
09:49:18 so I think we are agreed on everything then :)
09:49:39 hmm
09:49:40 I think so
09:49:41 maybe not
09:49:43 :)
09:49:46 ?
09:49:47 I am thinking ...
09:50:08 OK, so let's say we implement a VM monitor with notification drivers
09:50:27 I just don't like the idea of writing some proxy service that translates our standard http message to mistral
09:50:31 then we implement a) a driver for notifying Mistral
09:50:36 sounds good, let's discuss more in the spec
09:50:44 b) a driver for sending standard HTTP message
09:50:48 aspiers: because of this -> https://xkcd.com/927/
09:51:07 * aspiers guesses that is the one about standards
09:51:18 right :)
09:51:21 lol
09:51:46 aspiers: I think we need to only implement b)
09:51:59 as a starting point
09:52:27 ddeja: but driver b) requires a shim, and you want to eliminate that for mistral
09:52:46 ddeja: how do you propose to implement drivers?
09:53:00 aspiers: and for anything else that cannot understand the standard http message
09:53:36 but if you don't provide the shim, the only way for each VM monitor implementation to be compatible with mistral is by talking directly to it
09:53:51 aspiers: yes, I want to eliminate it, but let's make it work in one standard way, then add the next implementation
09:53:58 that's how I see it
09:54:30 samP: I think of it as some standard base class, to which we would pass a given set of input parameters
09:54:39 and then call some method, like 'notify' on it
09:54:44 that's all
09:54:53 the rest is on the driver writer's side
09:55:01 ddeja: oh, so you are saying we *would* write a shim for the initial case?
09:55:15 ddeja: and then later optimise by adding mistral-specific driver?
09:55:23 aspiers: maybe
09:55:28 that could work
09:55:40 or maybe just make it work with only Masakari at first
09:56:01 or we can write a driver at nearly the same time, it shouldn't be a big deal
09:56:24 ddeja: OK, got it.
09:56:31 what is more, with drivers it may be easier to be compliant with someone's existing solution
09:56:52 he just uses/writes a driver that is compliant with the existing recovery workflow
09:57:02 service
09:57:21 yes I'm definitely in favour of drivers where it makes sense
09:57:37 I'm just trying to figure out what should be in the specs though
09:57:39 then switches to Masakari/Mistral (are there any other standards?)
09:57:48 and switches the driver
09:58:00 aspiers: I don't think it must be specified in the spec itself
09:58:12 right
09:58:15 maybe just information about 'modular architecture'
09:58:33 so it would be easy to notify various recovery services?
09:58:43 IMO, spec should contain all the info we can obtain from monitors. Then drivers can choose what they want
09:59:00 samP: that's what I was thinking too :)
09:59:43 ok, still not sure I follow 100% but we're out of time so let's just continue working in gerrit :)
09:59:56 good discussion though, thanks!
10:00:10 yep. thanks ddeja
10:00:11 aspiers: I'll make some diagram to show my idea :)
10:00:17 ddeja: perfect!
10:00:19 thanks guys!
10:00:22 thanks :)
10:00:28 #endmeeting
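
Note appended to the log: the driver idea discussed above (a standard base class that receives a fixed set of inputs and exposes a 'notify' method, plus one driver that sends the versioned standard HTTP message) could be sketched roughly as below. This is only an illustration, not anything agreed in the meeting: the class names, the event fields, the payload layout, and the "1.0" version string are all hypothetical, since the notification format was still to be decided in the specs.

```python
import abc
import json
import urllib.request
from dataclasses import asdict, dataclass


@dataclass
class MonitorEvent:
    """Information a monitor hands to its driver (all field names hypothetical)."""
    source: str       # e.g. "vm-monitor" or "host-monitor"
    event_type: str   # e.g. "monitor-failed", "recovery-started"
    hostname: str
    resource_id: str


class NotificationDriver(abc.ABC):
    """Base class: the monitor only ever calls notify()."""

    @abc.abstractmethod
    def notify(self, event: MonitorEvent) -> None:
        """Deliver the event to a recovery service in whatever way it understands."""


class StandardHTTPDriver(NotificationDriver):
    """Hypothetical driver 'b)': sends a versioned JSON message over HTTP."""

    VERSION = "1.0"  # assumed version string, not taken from any spec

    def __init__(self, url, send=None):
        self.url = url
        # The sender is injectable so the driver can be exercised without a network.
        self._send = send or self._post

    def build_payload(self, event: MonitorEvent) -> dict:
        # Wrap the event in an envelope that carries the format version.
        return {"version": self.VERSION, "event": asdict(event)}

    def notify(self, event: MonitorEvent) -> None:
        body = json.dumps(self.build_payload(event)).encode("utf-8")
        self._send(self.url, body)

    @staticmethod
    def _post(url, body):
        # Plain HTTP POST using only the standard library.
        request = urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(request, timeout=10) as response:
            response.read()
```

A recovery-service-specific driver (for example one that talks to Mistral directly, as option a) in the discussion) would be another subclass of the same base class, so switching recovery services would mean switching the driver and leaving the monitor untouched.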