09:02:46 <aspiers> #startmeeting ha
09:02:47 <openstack> Meeting started Mon Sep 5 09:02:46 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:49 <beekhof> morning all
09:02:51 <openstack> The meeting name has been set to 'ha'
09:03:00 <aspiers> oh hi beekhof :)
09:03:05 <aspiers> cool
09:03:12 * beekhof was on vacation last week
09:03:24 <aspiers> hope you had a good time and your inbox is not too bloated ;-)
09:03:43 <aspiers> mine tends to get ~1000/week (excluding mailing lists)
09:04:04 <aspiers> well, including the mailing lists I try to stay on top of
09:04:08 <aspiers> so not openstack-dev ;-)
09:04:21 <beekhof> hahaha
09:04:21 <samP> same here...lol
09:04:30 <aspiers> #topic HA guide
09:04:31 <beekhof> yes, i went somewhere warm to escape winter
09:04:43 <aspiers> does Australia even get winter? ;-)
09:05:42 <beekhof> indeed
09:05:42 <aspiers> #info ddeja submitted a review to HA guide adding instance HA info https://review.openstack.org/#/c/359955/
09:05:52 <aspiers> samP: I just added you as a reviewer
09:06:15 <samP> aspiers: thank you. I'll take a look
09:06:28 <aspiers> however ddeja is now supposed to be on vacation
09:06:58 <aspiers> so I'll submit a new patchset taking my feedback into account
09:07:35 <aspiers> #action aspiers to update the HA guide review
09:07:52 <aspiers> beekhof: you wanna mention anything else about the HA guide?
09:08:12 <beekhof> can i defer? i got pulled into a meeting real quick
09:08:19 <beekhof> 5 minutes or so
09:08:36 <aspiers> sure
09:10:30 <aspiers> samP: shall we wait a few minutes for him to get back?
I guess there's not too much to discuss today
09:10:48 <aspiers> other topics: specs and Barcelona
09:10:57 <aspiers> and anything else you want to discuss
09:11:02 <samP> sure, sorry I dont have much topic to discuss
09:12:05 <aspiers> that's fine
09:14:41 <aspiers> OK that's 5 mins or so ;-)
09:14:45 <aspiers> #topic specs
09:15:05 <aspiers> so the VM monitoring spec is almost finished
09:15:16 <aspiers> just 1 or 2 FIXMEs I think
09:15:52 <aspiers> samP: how much detail do you think the spec should give about the event data?
09:16:31 <aspiers> samP: also I think it should mention the need to work with https so that the communication is secure
09:17:08 <beekhof> i'm pretty much back
09:17:27 <aspiers> cool
09:18:02 <aspiers> #link https://review.openstack.org/#/c/352217 is the VM monitoring spec
09:18:11 <samP> at least, we need events about unexpected stop, crash and IO error
09:18:37 <aspiers> samP: yes
09:18:42 <aspiers> samP: and what about event filtering?
09:18:51 <aspiers> that should be configurable in the monitoring service, right?
09:18:59 <samP> aspiers: yes
09:19:02 <aspiers> better to filter at source than destination
09:19:02 <samP> in client side
09:19:21 <samP> aspiers: agree
09:19:27 <aspiers> maybe it's enough for now to make the spec request filtering just by event type?
09:19:38 <aspiers> later we could improve if required
09:19:52 <aspiers> but the spec doesn't need to be the final perfect solution
09:20:22 <samP> aspiers: event type is enough, as u said we can add more details later
09:20:37 <aspiers> ok cool
09:21:27 <aspiers> I think masakari-instancemonitor already implements 95% of what we need, which is good news :)
09:22:20 <samP> aspiers: yes. it also has client side event monitoring. though hard coded
09:22:25 <aspiers> yep
09:22:33 <aspiers> and we need https
09:23:14 <aspiers> how should it get the server's certificate?
09:23:36 <aspiers> a) just trust there is no middle-man attack the first time, and cache the cert
09:23:46 <aspiers> b) require the sysadmin to provide the cert
09:24:05 <aspiers> s/sysadmin/Chef or Puppet or Ansible or .../
09:24:39 <aspiers> I think b) is better
09:24:43 <samP> b) is feasible
09:24:50 <aspiers> since b) still allows the possibility of a)
09:25:13 <aspiers> ok cool
09:25:25 <haukebruno> from my ops view: also b) sounds better
09:25:33 <aspiers> oh hi haukebruno :)
09:25:37 <samP> aspiers: yes. when we add a new compute node, things are automated and b) is not a big issue
09:25:37 <haukebruno> morning all \o/
09:25:40 <aspiers> that's good to hear
09:25:49 <beekhof> ok, so you wanted to talk ha guide?
09:25:49 <samP> haukebruno: morning..
09:26:01 <beekhof> its mine, all mine i tell you!
09:26:03 <aspiers> beekhof: we're currently talking about the VM monitoring spec
09:26:06 <aspiers> hah
09:26:23 <aspiers> beekhof: anything from you on that? we can switch back to HA guide if you have more to add
09:26:34 <beekhof> i've not done much
09:26:51 <beekhof> i'd like to get around a table at summit and come up with a plan
09:26:57 <aspiers> +1
09:27:09 <samP> +1
09:27:14 <beekhof> something we can then tell people what they can do to help
09:27:22 <aspiers> yep
09:27:24 <beekhof> there was a couple of folks
09:27:43 <aspiers> #topic HA guide (part 2)
09:27:57 <beekhof> we kind of hashed most of it out over email
09:28:14 <beekhof> but we should document it and circulate it a bit more
09:28:25 <aspiers> #info beekhof has ideas for future of HA guide
09:28:32 <aspiers> #info we should discuss in Barcelona
09:28:32 <beekhof> and of course it ties into the other conversation about the new RH arch
09:28:36 <aspiers> right
09:28:52 <beekhof> which i expect to get grilled about in spain
09:29:02 <beekhof> we can call it a spanish inquisition!
09:29:11 <aspiers> or an Australian BBQ ;-)
09:29:15 <beekhof> no-one expects those
09:29:22 <haukebruno> lol.
are you guys all heading to barcelona?
09:29:22 <beekhof> hahah, or that
09:29:27 <beekhof> yep
09:29:33 <aspiers> my travel's not confirmed yet
09:29:52 <aspiers> since I didn't get a talk approved this time
09:29:56 <beekhof> doh
09:30:09 <aspiers> beekhof: do you know if RH saw a big reduction in approved talks?
09:30:19 <aspiers> it was a massive drop for SUSE
09:30:19 <beekhof> neither of my 2 got accepted
09:30:27 <beekhof> i dont know about others
09:30:29 <aspiers> very strange
09:30:36 <beekhof> maybe florian doesnt like you anymore?
09:30:40 <aspiers> hah
09:30:45 <aspiers> it's not up to him
09:30:47 <beekhof> if i can get some time, i might try and get started on the docs plans
09:31:27 <aspiers> #topic Barcelona
09:31:38 <beekhof> have you had time to absorb the RH plans? anything you still want to ask or critique?
09:31:47 <aspiers> I think I already mentioned that it was not possible to get any official HA track this time :-/
09:31:53 <beekhof> yeah
09:32:02 <beekhof> which kinda blows
09:32:12 <aspiers> but it sounds like when they split the event in two it will be a lot easier
09:32:21 <beekhof> unclear
09:32:31 <aspiers> Thierry suggested that will be the case, IIRC
09:32:42 <beekhof> i expect that is the intention
09:33:02 <aspiers> #info still no official HA track, hopefully the future event split will fix this though
09:33:18 <aspiers> #topic RH's new generation HA architecture
09:33:34 <beekhof> aka. why beekhof is wrong
09:33:39 <samP> can we get the fish bowl?
09:33:48 <aspiers> beekhof: we don't need a dedicated topic to discuss that ;-)
09:34:02 <beekhof> does anyone not know what RH is planning regarding the HA arch?
09:34:11 <beekhof> would be hard to keep up if not :)
09:34:18 <aspiers> beekhof: I think your blog posts are pretty clear
09:34:27 <aspiers> beekhof: although I guess there is more they probably don't cover
09:34:28 <beekhof> maybe not everyone read it
09:34:43 <aspiers> http://blog.clusterlabs.org/blog/2016/next-openstack-ha-arch
09:35:12 <beekhof> in any case, if anyone has questions or concerns... now is your chance
09:35:22 <aspiers> http://blog.clusterlabs.org/blog/2016/composable-openstack-ha
09:35:40 <aspiers> one question is: how are you going to implement the service-level monitoring?
09:36:06 <aspiers> and I just had a crazy idea for you to shoot down in flames
09:36:10 <beekhof> i expect there will be two layers
09:36:40 <beekhof> simple systemd based + more advanced nagios style external monitoring
09:36:46 <aspiers> since I plan to maintain the OCF RAs which have a "monitor" action which should do service-level monitoring (and in the cases where it doesn't, I can fix it)
09:36:55 <beekhof> or sensu or whatever the flavor of the month is
09:37:04 <aspiers> yes, by "service-level" I meant the non-systemd layer
09:37:27 <beekhof> ok
09:37:27 <aspiers> IOW, how will your $nagios_or_similar know how to monitor each service?
09:37:33 <beekhof> yeah, that will all be external
09:37:34 <aspiers> I assume it will need some kind of plugin per service
09:37:40 <aspiers> so here's my crazy idea ...
09:37:49 <aspiers> reuse "monitor" action of the OCF RAs :-)
09:37:56 <beekhof> there will be something that gets called, yes
09:38:03 <beekhof> thats one possibility
09:38:30 <beekhof> of course we might all be in containers by then, so we'd be doing something like http://kubernetes.io/docs/user-guide/production-pods/#liveness-and-readiness-probes-aka-health-checks
09:38:34 <aspiers> if the OCF RA needs to do some extra non-RA stuff to work with your monitoring layer, I'd be more than happy to accommodate it
09:39:15 <beekhof> i dont think there is any great desire to write this stuff, so if it exists in some consumable form i bet it would get reused
09:39:16 <aspiers> yeah I've already been looking at that and stackanetes
09:40:03 <aspiers> a simple service readiness probe could be too naive in some cases
09:40:08 <beekhof> i've not come across that one
09:40:17 <beekhof> aspiers: agreed
09:40:59 <aspiers> so let's agree that in principle, we're aligned on the idea of sharing/reusing code which does service-level monitoring
09:41:08 <beekhof> thats one of the problems the kubernetes proponents will need to find a solution for if they want it adopted
09:41:16 <beekhof> yes
09:41:36 <beekhof> the only real wrinkle, is if your agents expect parameters
09:41:39 <aspiers> if OCF RAs seems to be a suitable home for that code, as the maintainer I'm 100% happy to look after it
09:41:52 <beekhof> though...
09:41:54 <aspiers> well that should be easy to deal with
09:42:03 <aspiers> simply pass the right environment variables
09:42:11 <beekhof> and i get that this is contrary to everything i've said for 14 years
09:42:22 <aspiers> and if k8s doesn't support that, a wrapper script can set them easily
09:42:29 <beekhof> what if the monitor logic lived in a separate file/script
09:42:44 <aspiers> I'm OK with that too
09:42:58 <beekhof> well the point is that we wouldn't have any...
anything would be in the sysconfig file
09:43:11 <beekhof> since systemd doesnt have parameters
09:43:25 <beekhof> sep file would make them easier to consume == less resistance
09:43:30 <aspiers> sure
09:43:36 * beekhof has to run again... kids bedtime
09:43:38 <aspiers> I'm totally fine with that
09:43:50 <beekhof> i'll try and make it a bit more often
09:43:57 <aspiers> separate files could still potentially live in the o-r-a repo
09:43:57 <beekhof> it == this meeting
09:44:04 <beekhof> yes
09:44:11 <aspiers> beekhof: cool, was great to have you here this time
09:44:52 <aspiers> samP: sorry yes, we should try to book a decent room in advance
09:45:03 <aspiers> samP, haukebruno: you want to discuss anything else?
09:45:27 <haukebruno> not from my side :(
09:45:37 <aspiers> ok np
09:45:39 <haukebruno> just looking forward to the barcelona summit to see some of you folks
09:45:48 <aspiers> yes I hope I can make it! :-/
09:46:31 <samP> aspiers: not from my side
09:46:43 <aspiers> ok then, let's close for today
09:46:51 <aspiers> thanks all and bye for now!
09:47:06 <samP> thank you all...
09:47:41 <haukebruno> have a nice day all
09:47:55 <aspiers> you too :)
09:47:57 <aspiers> #endmeeting
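
Note on the wrapper idea raised at 09:42:03 and 09:42:22 (passing parameters to an OCF RA's "monitor" action through environment variables): the following is a minimal sketch of what such a wrapper might look like, not something agreed in the meeting. It relies on the OCF convention that agent parameters arrive as OCF_RESKEY_<name> environment variables and that "monitor" exits 0 when the service is running and 7 when it is cleanly stopped. The function names, the default OCF_ROOT location and any agent path or parameter names used with it are assumptions for illustration only.

```python
import os
import subprocess
from typing import Mapping, Sequence

# Exit codes defined by the OCF resource agent API for the "monitor" action.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def build_ocf_env(params: Mapping[str, str], ocf_root: str = "/usr/lib/ocf") -> dict:
    """Return a copy of the current environment with OCF variables added.

    Each parameter is exported as OCF_RESKEY_<name>, which is how resource
    agents expect to receive their configuration. The default ocf_root is
    an assumption about where the agents are installed.
    """
    env = dict(os.environ)
    env["OCF_ROOT"] = ocf_root
    for name, value in params.items():
        env[f"OCF_RESKEY_{name}"] = str(value)
    return env


def run_monitor(agent_cmd: Sequence[str], params: Mapping[str, str]) -> int:
    """Run the agent's "monitor" action and return its exit code."""
    result = subprocess.run([*agent_cmd, "monitor"], env=build_ocf_env(params))
    return result.returncode


def describe(rc: int) -> str:
    """Translate a monitor exit code into a short human-readable status."""
    if rc == OCF_SUCCESS:
        return "running"
    if rc == OCF_NOT_RUNNING:
        return "not running"
    return f"error (rc={rc})"
```

An external checker (Nagios-style plugin, k8s probe, or similar) could call run_monitor() with the path to an installed agent and a dict of its parameters, then map the result with describe(). If the monitor logic later moves into a separate script, as suggested at 09:42:29, only the command passed as agent_cmd would need to change.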