09:02:46 <aspiers> #startmeeting ha
09:02:47 <openstack> Meeting started Mon Sep  5 09:02:46 2016 UTC and is due to finish in 60 minutes.  The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:48 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:49 <beekhof> morning all
09:02:51 <openstack> The meeting name has been set to 'ha'
09:03:00 <aspiers> oh hi beekhof :)
09:03:05 <aspiers> cool
09:03:12 * beekhof was on vacation last week
09:03:24 <aspiers> hope you had a good time and your inbox is not too bloated ;-)
09:03:43 <aspiers> mine tends to get ~1000/week (excluding mailing lists)
09:04:04 <aspiers> well, including the mailing lists I try to stay on top of
09:04:08 <aspiers> so not openstack-dev ;-)
09:04:21 <beekhof> hahaha
09:04:21 <samP> same here...lol
09:04:30 <aspiers> #topic HA guide
09:04:31 <beekhof> yes, i went somewhere warm to escape winter
09:04:43 <aspiers> does Australia even get winter? ;-)
09:05:42 <beekhof> indeed
09:05:42 <aspiers> #info ddeja submitted a review to HA guide adding instance HA info https://review.openstack.org/#/c/359955/
09:05:52 <aspiers> samP: I just added you as a reviewer
09:06:15 <samP> aspiers: thank you. I'll take a look
09:06:28 <aspiers> however ddeja is now supposed to be on vacation
09:06:58 <aspiers> so I'll submit a new patchset taking my feedback into account
09:07:35 <aspiers> #action aspiers to update the HA guide review
09:07:52 <aspiers> beekhof: you wanna mention anything else about the HA guide?
09:08:12 <beekhof> can i defer? i got pulled into a meeting real quick
09:08:19 <beekhof> 5 minutes or so
09:08:36 <aspiers> sure
09:10:30 <aspiers> samP: shall we wait a few minutes for him to get back? I guess there's not too much to discuss today
09:10:48 <aspiers> other topics: specs and Barcelona
09:10:57 <aspiers> and anything else you want to discuss
09:11:02 <samP> sure, sorry I don't have many topics to discuss
09:12:05 <aspiers> that's fine
09:14:41 <aspiers> OK that's 5 mins or so ;-)
09:14:45 <aspiers> #topic specs
09:15:05 <aspiers> so the VM monitoring spec is almost finished
09:15:16 <aspiers> just 1 or 2 FIXMEs I think
09:15:52 <aspiers> samP: how much detail do you think the spec should give about the event data?
09:16:31 <aspiers> samP: also I think it should mention the need to work with https so that the communication is secure
09:17:08 <beekhof> i'm pretty much back
09:17:27 <aspiers> cool
09:18:02 <aspiers> #link https://review.openstack.org/#/c/352217 is the VM monitoring spec
09:18:11 <samP> at least, we need events about unexpected stop, crash and IO error
09:18:37 <aspiers> samP: yes
09:18:42 <aspiers> samP: and what about event filtering?
09:18:51 <aspiers> that should be configurable in the monitoring service, right?
09:18:59 <samP> aspiers: yes
09:19:02 <aspiers> better to filter at source than destination
09:19:02 <samP> in client side
09:19:21 <samP> aspiers: agree
09:19:27 <aspiers> maybe it's enough for now to make the spec request filtering just by event type?
09:19:38 <aspiers> later we could improve if required
09:19:52 <aspiers> but the spec doesn't need to be the final perfect solution
09:20:22 <samP> aspiers: event type is enough, as you said we can add more details later
09:20:37 <aspiers> ok cool
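
The filter-at-source idea agreed above can be sketched in a few lines. This is only an illustration, not code from masakari-instancemonitor or the spec: the event dictionaries, the `type` key, the event type names and the `filter_events` helper are all hypothetical, chosen to mirror the three cases samP listed (unexpected stop, crash, IO error).

```python
from typing import Dict, FrozenSet, Iterable, Iterator

# Hypothetical event type names mirroring the cases mentioned in the meeting.
# In a real monitor this set would come from configuration, not a constant.
FORWARDED_EVENT_TYPES: FrozenSet[str] = frozenset(
    {"unexpected_stop", "crash", "io_error"}
)


def filter_events(
    events: Iterable[Dict[str, str]],
    allowed_types: FrozenSet[str] = FORWARDED_EVENT_TYPES,
) -> Iterator[Dict[str, str]]:
    """Yield only the events whose type is configured to be forwarded.

    Filtering happens on the monitoring (client) side, so uninteresting
    events are never sent to the recovery service at all.
    """
    for event in events:
        if event.get("type") in allowed_types:
            yield event
```

Filtering by type alone keeps the configuration to a single list, which matches the "good enough for now, improve later" position taken in the discussion.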
09:21:27 <aspiers> I think masakari-instancemonitor already implements 95% of what we need, which is good news :)
09:22:20 <samP> aspiers: yes. it also has client side event monitoring. though hard coded
09:22:25 <aspiers> yep
09:22:33 <aspiers> and we need https
09:23:14 <aspiers> how should it get the server's certificate?
09:23:36 <aspiers> a) just trust there is no middle-man attack the first time, and cache the cert
09:23:46 <aspiers> b) require the sysadmin to provide the cert
09:24:05 <aspiers> s/sysadmin/Chef or Puppet or Ansible or .../
09:24:39 <aspiers> I think b) is better
09:24:43 <samP> b) is feasible
09:24:50 <aspiers> since b) still allows the possibility of a)
09:25:13 <aspiers> ok cool
09:25:25 <haukebruno> from my ops view: also b) sounds better
09:25:33 <aspiers> oh hi haukebruno :)
09:25:37 <samP> aspiers: yes. when we add a new compute node, things are automated and b) is not a big issue
09:25:37 <haukebruno> morning all \o/
09:25:40 <aspiers> that's good to hear
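
Option b) above, where the certificate is provisioned by the admin or by configuration management, might look roughly like this on the client side. It is a sketch using only the Python standard library `ssl` module; the function name `build_tls_context` and the `ca_cert_path` parameter are hypothetical, and nothing here is taken from the spec under review.

```python
import ssl
from typing import Optional


def build_tls_context(ca_cert_path: Optional[str] = None) -> ssl.SSLContext:
    """Return a client-side TLS context that verifies the server.

    If ca_cert_path is given (option b: the file is provisioned by the
    sysadmin, Chef, Puppet, Ansible, ...), only that file is trusted.
    If it is omitted, the system's default trust store is used instead;
    that fallback is an assumption of this sketch.
    """
    # create_default_context enables certificate verification and
    # hostname checking for client connections.
    return ssl.create_default_context(cafile=ca_cert_path)
```

The resulting context can be handed to an HTTPS client, for example via the `context` argument of `urllib.request.urlopen`.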
09:25:49 <beekhof> ok, so you wanted to talk ha guide?
09:25:49 <samP> haukebruno: morning..
09:26:01 <beekhof> its mine, all mine i tell you!
09:26:03 <aspiers> beekhof: we're currently talking about the VM monitoring spec
09:26:06 <aspiers> hah
09:26:23 <aspiers> beekhof: anything from you on that? we can switch back to HA guide if you have more to add
09:26:34 <beekhof> i've not done much
09:26:51 <beekhof> i'd like to get around a table at summit and come up with a plan
09:26:57 <aspiers> +1
09:27:09 <samP> +1
09:27:14 <beekhof> something we can then tell people what they can do to help
09:27:22 <aspiers> yep
09:27:24 <beekhof> there was a couple of folks
09:27:43 <aspiers> #topic HA guide (part 2)
09:27:57 <beekhof> we kind of hashed most of it out over email
09:28:14 <beekhof> but we should document it and circulate it a bit more
09:28:25 <aspiers> #info beekhof has ideas for future of HA guide
09:28:32 <aspiers> #info we should discuss in Barcelona
09:28:32 <beekhof> and of course it ties into the other conversation about the new RH arch
09:28:36 <aspiers> right
09:28:52 <beekhof> which i expect to get grilled about in spain
09:29:02 <beekhof> we can call it a spanish inquisition!
09:29:11 <aspiers> or an Australian BBQ ;-)
09:29:15 <beekhof> no-one expects those
09:29:22 <haukebruno> lol. are you guys all heading to barcelona?
09:29:22 <beekhof> hahah, or that
09:29:27 <beekhof> yep
09:29:33 <aspiers> my travel's not confirmed yet
09:29:52 <aspiers> since I didn't get a talk approved this time
09:29:56 <beekhof> doh
09:30:09 <aspiers> beekhof: do you know if RH saw a big reduction in approved talks?
09:30:19 <aspiers> it was a massive drop for SUSE
09:30:19 <beekhof> neither of my 2 got accepted
09:30:27 <beekhof> i dont know about others
09:30:29 <aspiers> very strange
09:30:36 <beekhof> maybe florian doesnt like you anymore?
09:30:40 <aspiers> hah
09:30:45 <aspiers> it's not up to him
09:30:47 <beekhof> if i can get some time, i might try and get started on the docs plans
09:31:27 <aspiers> #topic Barcelona
09:31:38 <beekhof> have you had time to absorb the RH plans?  anything you still want to ask or critique?
09:31:47 <aspiers> I think I already mentioned that it was not possible to get any official HA track this time :-/
09:31:53 <beekhof> yeah
09:32:02 <beekhof> which kinda blows
09:32:12 <aspiers> but it sounds like when they split the event in two it will be a lot easier
09:32:21 <beekhof> unclear
09:32:31 <aspiers> Thierry suggested that will be the case, IIRC
09:32:42 <beekhof> i expect that is the intention
09:33:02 <aspiers> #info still no official HA track, hopefully the future event split will fix this though
09:33:18 <aspiers> #topic RH's new generation HA architecture
09:33:34 <beekhof> aka. why beekhof is wrong
09:33:39 <samP> can we get the fishbowl?
09:33:48 <aspiers> beekhof: we don't need a dedicated topic to discuss that ;-)
09:34:02 <beekhof> does anyone not know what RH is planning regarding the HA arch?
09:34:11 <beekhof> would be hard to keep up if not :)
09:34:18 <aspiers> beekhof: I think your blog posts are pretty clear
09:34:27 <aspiers> beekhof: although I guess there is more they probably don't cover
09:34:28 <beekhof> maybe not everyone read it
09:34:43 <aspiers> http://blog.clusterlabs.org/blog/2016/next-openstack-ha-arch
09:35:12 <beekhof> in any case, if anyone has questions or concerns... now is your chance
09:35:22 <aspiers> http://blog.clusterlabs.org/blog/2016/composable-openstack-ha
09:35:40 <aspiers> one question is: how are you going to implement the service-level monitoring?
09:36:06 <aspiers> and I just had a crazy idea for you to shoot down in flames
09:36:10 <beekhof> i expect there will be two layers
09:36:40 <beekhof> simple systemd based + more advanced nagios style external monitoring
09:36:46 <aspiers> since I plan to maintain the OCF RAs which have a "monitor" action which should do service-level monitoring (and in the cases where it doesn't, I can fix it)
09:36:55 <beekhof> or sensu or whatever the flavor of the month is
09:37:04 <aspiers> yes, by "service-level" I meant the non-systemd layer
09:37:27 <beekhof> ok
09:37:27 <aspiers> IOW, how will your $nagios_or_similar know how to monitor each service?
09:37:33 <beekhof> yeah, that will all be external
09:37:34 <aspiers> I assume it will need some kind of plugin per service
09:37:40 <aspiers> so here's my crazy idea ...
09:37:49 <aspiers> reuse "monitor" action of the OCF RAs :-)
09:37:56 <beekhof> there will be something that gets called, yes
09:38:03 <beekhof> thats one possibility
09:38:30 <beekhof> of course we might all be in containers by then, so we'd be doing something like http://kubernetes.io/docs/user-guide/production-pods/#liveness-and-readiness-probes-aka-health-checks
09:38:34 <aspiers> if the OCF RA needs to do some extra non-RA stuff to work with your monitoring layer, I'd be more than happy to accommodate it
09:39:15 <beekhof> i dont think there is any great desire to write this stuff, so if it exists in some consumable form i bet it would get reused
09:39:16 <aspiers> yeah I've already been looking at that and stackanetes
09:40:03 <aspiers> a simple service readiness probe could be too naive in some cases
09:40:08 <beekhof> i've not come across that one
09:40:17 <beekhof> aspiers: agreed
09:40:59 <aspiers> so let's agree that in principle, we're aligned on the idea of sharing/reusing code which does service-level monitoring
09:41:08 <beekhof> thats one of the problems the kubernetes proponents will need to find a solution for if they want it adopted
09:41:16 <beekhof> yes
09:41:36 <beekhof> the only real wrinkle, is if your agents expect parameters
09:41:39 <aspiers> if OCF RAs seems to be a suitable home for that code, as the maintainer I'm 100% happy to look after it
09:41:52 <beekhof> thought...
09:41:54 <aspiers> well that should be easy to deal with
09:42:03 <aspiers> simply pass the right environment variables
09:42:11 <beekhof> and i get that this is contrary to everything i've said for 14 years
09:42:22 <aspiers> and if k8s doesn't support that, a wrapper script can set them easily
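
The wrapper idea above can be sketched as follows. It relies on the OCF convention that resource parameters reach the agent as `OCF_RESKEY_<name>` environment variables and that the action name is the agent's first argument; the helper names, the default `OCF_ROOT` path and any agent path passed in are assumptions for illustration only.

```python
import os
import subprocess
from typing import Dict, Mapping

# Exit codes defined by the OCF resource agent convention.
OCF_SUCCESS = 0
OCF_NOT_RUNNING = 7


def build_ocf_env(
    params: Mapping[str, str], ocf_root: str = "/usr/lib/ocf"
) -> Dict[str, str]:
    """Build the environment an OCF resource agent expects.

    Each parameter is exported as OCF_RESKEY_<name>. The default
    ocf_root is an assumption and may differ per distribution.
    """
    env = dict(os.environ)
    env["OCF_ROOT"] = ocf_root
    for name, value in params.items():
        env[f"OCF_RESKEY_{name}"] = str(value)
    return env


def run_monitor(agent_path: str, params: Mapping[str, str]) -> int:
    """Run the agent's "monitor" action and return its exit code."""
    result = subprocess.run([agent_path, "monitor"], env=build_ocf_env(params))
    return result.returncode
```

An external checker (Nagios, Sensu, a container health probe, ...) could then treat a return value of `OCF_SUCCESS` as healthy and anything else as a failure.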
09:42:29 <beekhof> what if the monitor logic lived in a separate file/script
09:42:44 <aspiers> I'm OK with that too
09:42:58 <beekhof> well the point is that we wouldn't have any... anything would be in the sysconfig file
09:43:11 <beekhof> since systemd doesnt have parameters
09:43:25 <beekhof> sep file would make them easier to consume == less resistance
09:43:30 <aspiers> sure
09:43:36 * beekhof has to run again... kids bedtime
09:43:38 <aspiers> I'm totally fine with that
09:43:50 <beekhof> i'll try and make it a bit more often
09:43:57 <aspiers> separate files could still potentially live in the o-r-a repo
09:43:57 <beekhof> it == this meeting
09:44:04 <beekhof> yes
09:44:11 <aspiers> beekhof: cool, was great to have you here this time
09:44:52 <aspiers> samP: sorry yes, we should try to book a decent room in advance
09:45:03 <aspiers> samP, haukebruno: you want to discuss anything else?
09:45:27 <haukebruno> not from my side :(
09:45:37 <aspiers> ok np
09:45:39 <haukebruno> just looking forward to the barcelona summit to see some of you folks
09:45:48 <aspiers> yes I hope I can make it! :-/
09:46:31 <samP> aspiers: not from my side
09:46:43 <aspiers> ok then, let's close for today
09:46:51 <aspiers> thanks all and bye for now!
09:47:06 <samP> thank you all...
09:47:41 <haukebruno> have a nice day all
09:47:55 <aspiers> you too :)
09:47:57 <aspiers> #endmeeting