#openstack-meeting log

09:03:11 <aspiers> #startmeeting ha
09:03:12 <openstack> Meeting started Mon Jan 18 09:03:11 2016 UTC and is due to finish in 60 minutes.  The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:03:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:03:15 <openstack> The meeting name has been set to 'ha'
09:03:21 <aspiers> welcome everyone
09:03:40 <_gryf> hello again :)
09:03:50 <aspiers> I guess it might be a bit short today but that's no problem :)
09:04:03 <masahito> hello
09:04:13 <aspiers> let's start with some status updates as per normal
09:04:21 <aspiers> #topic Current status (progress, issues, roadblocks, further plans)
09:04:37 * aspiers picks someone at random
09:04:46 <aspiers> masahito: anything you'd like to report?
09:05:19 <masahito> I'm working on my work items for Masakari, so nothing special to report.
09:05:30 <aspiers> ok
09:06:23 <aspiers> hey kazuIchikawa, welcome to our small group :) are you also working on masakari?
09:07:03 <kazuIchikawa> Yes, I'm working on masakari with masahito.
09:07:08 <aspiers> cool
09:07:41 <kazuIchikawa> He is my colleague, actually; we work in the same office.
09:07:51 <aspiers> nice
09:08:11 <aspiers> I don't have too much to report
09:08:15 <aspiers> briefly:
09:08:32 <aspiers> I merged one or two patches into openstack-resource-agents
09:08:37 <_gryf> also, nothing from my side
09:08:41 <_gryf> but ddeja was able to bring his setup alive last Friday
09:08:43 <aspiers> there are still a few outstanding
09:08:46 <_gryf> and currently he is back on track and working on tuning the mistral workflows for evacuation
09:08:57 <aspiers> _gryf: sounds cool!
09:09:09 <bogdando> hi
09:09:09 <aspiers> _gryf: I guess the github will continue to be updated?
09:09:15 <aspiers> oh hey bogdando :)
09:09:18 <_gryf> aspiers, I hope :)
09:09:40 <aspiers> I've continued to have a long discussion with beekhof about the future of OCF RAs
09:09:57 <aspiers> and their relationship to systemd
09:10:27 <aspiers> I am still thinking it might be a good idea if the RAs are changed to wrap around systemd unit services
09:10:42 <aspiers> instead of reimplementing the logic / config for starting/stopping the daemons
09:10:54 <aspiers> beekhof strongly disagrees ;-)
09:11:15 <aspiers> for reasons which I don't fully understand yet
09:11:18 <bogdando> aspiers, do you mean only stateless resources by OCF or stateful as well?
09:11:25 <bogdando> like clusters
09:11:31 <aspiers> bogdando: possibly both
09:11:47 <aspiers> hey ddeja :)
09:11:51 <ddeja> hi all, sorry for being late :)
09:11:55 <aspiers> no problem
09:11:56 <bogdando> the point is for the clusters, start/stop differs
09:11:57 <ddeja> I left my car for repair and didn't expect public transport to take so long
09:12:03 <aspiers> hehe
09:12:09 <_gryf> heh
09:12:12 <bogdando> and systemd start/stop logic cannot fit the needs perhaps
09:12:22 <bogdando> as well as any init-like one
09:12:23 <aspiers> well, let's finish the status round first, and then maybe we can come back to this topic?
09:12:36 <aspiers> want to make sure everyone gets a chance to give their status
09:13:04 <aspiers> the other thing I have to report is that I'm still working on our automation of compute node HA setup via Chef
09:13:16 <aspiers> but it's quite close to completion now
09:13:32 <aspiers> bogdando: you want to give a status update on anything?
09:13:59 <bogdando> not so far, only a few updates to ha-guide related things
09:14:29 <aspiers> yeah, ha-guide project seems very active :)
09:14:38 <aspiers> ddeja: anything from your side?
09:14:46 <ddeja> aspiers: yes, thanks
09:14:52 <bogdando> galera patch was reworked, thanks to Kenneth, https://review.openstack.org/#/c/263075/
09:15:07 <ddeja> I have finished working with my setup for auto-evac testing
09:15:32 <aspiers> cool
09:15:45 <ddeja> And I'm hardening the mistral workflow, but on Friday I hit something that might be a bug
09:15:59 <ddeja> But not sure yet, must do a double check
09:16:30 <ddeja> as soon as I have my workflow fully working, I'll let you know and I'll describe it in etherpad/github
09:16:31 <aspiers> ddeja: do you have any updates you could push to https://github.com/gryf/mistral-evacuate ?
09:16:37 <aspiers> ok great!
09:17:11 <aspiers> alright, I think that's everyone, or does anyone else want to report anything?
09:17:24 <ddeja> aspiers: yes, but I must confirm that there is a bug in mistral itself, not in my workflow, before I push it; hopefully I'll do it tomorrow :)
09:17:36 <aspiers> ddeja: cool :)
09:17:54 <aspiers> also, if you want to discuss a particular topic today (or in the future), please say so now
09:18:14 <aspiers> I would quite like to return to this topic of OCF RAs and systemd to get other people's opinions
09:18:24 <aspiers> but happy to discuss anything else
09:19:05 <ddeja> I'm OK with OCF RAs stuff
09:19:11 <aspiers> ok
09:19:16 <aspiers> #topic OCF RAs and systemd
09:19:26 <aspiers> firstly, is anyone *not* using systemd?
09:20:15 <aspiers> so my original idea, which I think I have mentioned in previous meetings, or on #openstack-ha, is to make small changes to the existing OCF RAs so that they wrap around systemd
09:20:18 <masahito> for HA one?
09:20:27 <aspiers> or at least, that they wrap around service(8)
09:20:37 <aspiers> masahito: I mean, in general
09:20:44 <aspiers> masahito: vs. sysvinit, upstart etc.
09:21:10 <aspiers> there are a few problems with the existing OCF RAs
09:21:13 <masahito> got it. we're using upstart.
09:21:16 <bogdando> let's create a spec first. There is high dev & ops impact
09:21:29 <aspiers> bogdando: sure. this is very much in the proposal phase right now
09:21:33 <aspiers> a spec is a good idea
09:21:35 <bogdando> I'd like to see this change backwards compatible
09:21:47 <aspiers> bogdando: I think it would be, but we'd have to check
09:21:55 <aspiers> the problems I want to solve are:
09:22:01 <bogdando> with some deprecation perhaps unless there is ubuntu 16.04 at least :)
09:22:06 <aspiers> 1. duplication of daemon management logic
09:22:16 <aspiers> 2. distro-specific code in openstack-resource-agents repo
09:22:30 <aspiers> 3. inconsistency of daemon management between HA clouds and non-HA clouds
09:22:56 <aspiers> 4. delegation of maintenance of daemon management to distro packages
09:23:09 <aspiers> (2. and 4. are kind of the same thing)
09:23:22 <aspiers> so currently the OCF RAs start up daemons in their own way
09:23:30 <aspiers> they don't care how the daemons are packaged by the distro
09:23:48 <aspiers> if the distro package changes, the RA might also need a change
09:23:54 <aspiers> or at least the Pacemaker primitive
09:24:05 <bogdando> another point is probably a single point of entry for the resources control
09:24:24 <bogdando> would systemd allow to effectively handle this?
09:24:39 <aspiers> bogdando: could you expand a bit on what you mean by that please?
09:24:49 <bogdando> I mean that the init-based control plane shall not be used, or shall be disabled, when pacemaker takes care
09:25:12 <aspiers> well that's already true
09:25:27 <aspiers> if Pacemaker manages the service, then the admin must not also try to start/stop it
09:25:38 <aspiers> I don't think my proposal changes that
09:25:46 <bogdando> yes, but I thought systemd may integrate things a bit better
09:25:54 <bogdando> must not -> cannot
09:26:11 <bogdando> or transparently "will not", in fact
09:26:28 <aspiers> I don't think systemd lets us prevent admins interfering with Pacemaker-controlled services
09:26:32 <bogdando> ;(
09:26:46 <aspiers> which is a shame
09:26:53 <aspiers> maybe there is some way to hack that, I'm not sure
09:26:54 <masahito> my concern is how pacemaker works when a process launched by systemd goes down.
09:27:14 <aspiers> masahito: in that case, Pacemaker must be in charge of restarting it
09:27:17 <bogdando> yes, good point to address in the spec, failure modes
09:27:27 <bogdando> and avoiding split brain to control planes
09:27:38 <aspiers> masahito: in my proposal, Pacemaker launches the process *via* systemd
09:28:12 <aspiers> so e.g. the 'start' action of the OCF RA will call something like "service openstack-keystone start"
09:28:17 <masahito> aspiers: got it.
09:28:19 <bogdando> so the status in systemd will be correct, but systemd shall not try to respawn the service and race with pacemaker
09:28:26 <aspiers> bogdando: correct
09:28:58 <aspiers> yes, this also solves the problem that 5. currently "systemctl status" is not guaranteed accurate if the service is started by Pacemaker
09:29:13 <aspiers> since that relies on the OCF RA using the same pid file and/or daemon name as systemd
09:29:14 <bogdando> looks like a very good idea, but complex to address (considering keeping it backwards compatible unless there is an LTS ubuntu with systemd)
09:29:26 <aspiers> bogdando: I think it would be extremely simple to do
09:29:27 <bogdando> so we need a spec and PoC perhaps
09:29:36 <aspiers> bogdando: it's just a few lines of code changed in each RA
09:29:47 <aspiers> beekhof highlighted one issue with systemctl though
09:29:50 <bogdando> yes perhaps
09:29:54 <aspiers> systemctl start/stop are asynchronous
09:30:08 <aspiers> they do not block until the service is started/stopped
09:30:14 <bogdando> oh, that may be a big problem for stop logic in pacemaker
09:30:22 <aspiers> so after you call systemctl, you have to poll until the service is really started or stopped
09:30:25 <bogdando> and unexpected STONITHes :)
09:30:43 <aspiers> but I think that can be solved simply by a loop which calls the 'monitor' action
09:30:51 <bogdando> then the action stop times out
09:31:04 <aspiers> where 'monitor' should do application-level monitoring (e.g. HTTP layer tests)
09:31:08 <bogdando> but in fact is running async
09:31:18 <bogdando> another failure mode to catch up
09:31:23 <aspiers> bogdando: yes, the polling loop would need a timeout, but this is easy to do
09:31:34 <aspiers> also the systemctl would need a timeout, which again is easy
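
The polling loop with a timeout that aspiers describes above could look roughly like the following. This is only a minimal sketch in Python, not code from openstack-resource-agents: the name `wait_for_state` and the `monitor` callable are hypothetical, and `monitor` stands in for whatever application-level check the RA's 'monitor' action performs.

```python
import time
from typing import Callable


def wait_for_state(monitor: Callable[[], bool], want_running: bool,
                   timeout: float, interval: float = 0.5) -> bool:
    """Poll monitor() until it reports the wanted state or the timeout expires.

    monitor() is a hypothetical stand-in for the RA's 'monitor' action:
    it returns True if the service is up, False if it is down.
    Returns True if the wanted state was reached in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if monitor() == want_running:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Under this assumption, a 'start' action would ask the init system to start the service and then call `wait_for_state(monitor, True, timeout)`, and a 'stop' action would do the same with `want_running=False`, reporting failure to Pacemaker if the call returns False.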
09:31:53 <aspiers> I have discussed this proposal a LOT with beekhof
09:32:01 <aspiers> he doesn't like it ;-)
09:32:05 <aspiers> but I don't understand why yet
09:32:14 <aspiers> he says systemd is too unreliable to use
09:32:38 <bogdando> :)
09:32:38 <aspiers> but I don't know what technical issues he is referring to
09:32:46 <aspiers> so I guess that conversation will continue
09:33:10 <bogdando> by the way
09:33:13 <ddeja> maybe we should ask beekhof to elaborate more on this?
09:33:20 <aspiers> ddeja: I already did
09:33:26 <ddeja> aspiers: cool
09:33:27 <aspiers> ddeja: waiting for an answer
09:33:34 <ddeja> aspiers: OK
09:33:35 <bogdando> for stop, I believe we should use a unified proc_stop which relies on several iterations of SIGTERM
09:33:50 <bogdando> followed by SIGKILL if nothing helps to stop gracefully
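
The stop strategy bogdando outlines here (several rounds of SIGTERM, then SIGKILL as a last resort) can be sketched as below. This is an illustrative Python sketch, not the `proc_stop` function from the rabbitmq OCF script he links later: the name `stop_process`, its parameters, and the `is_alive` callable are assumptions made for the example. How the PID is found is deliberately left to the caller, since that is the point under debate in this meeting.

```python
import os
import signal
import time
from typing import Callable


def stop_process(pid: int, is_alive: Callable[[], bool],
                 term_attempts: int = 3, grace: float = 1.0) -> str:
    """Try SIGTERM up to term_attempts times, then fall back to SIGKILL.

    is_alive() is a caller-supplied (hypothetical) check that returns
    True while the process still exists.
    Returns 'term' if a SIGTERM round stopped it, 'kill' if SIGKILL was
    needed, or 'failed' if the process survived even SIGKILL.
    """
    def wait_gone(seconds: float) -> bool:
        # Poll until the process disappears or the grace period runs out.
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            if not is_alive():
                return True
            time.sleep(0.05)
        return not is_alive()

    def send(sig: int) -> None:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            pass  # the process has already exited

    for _ in range(term_attempts):
        send(signal.SIGTERM)
        if wait_gone(grace):
            return "term"
    send(signal.SIGKILL)
    return "kill" if wait_gone(grace) else "failed"
```

A 'failed' result corresponds to the case where the stop operation would have to report an error to Pacemaker.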
09:33:54 <aspiers> but in any case, I agree that a spec and some PoC (e.g. a WIP pull request to openstack-resource-agents) is a good way to go
09:33:59 <_gryf> aspiers, is that thread on the ml and somehow i've missed it?
09:34:07 <bogdando> that would be a more classic unix way, rely on the signals, IMO
09:34:20 <bogdando> I have an example to show
09:34:42 <aspiers> _gryf: no, unfortunately it had to be private since there were some minor political topics involved :-/
09:34:58 <_gryf> aspiers, oh, got it ;)
09:35:09 <aspiers> _gryf: but a spec / gerrit review would help keep the technical discussion open
09:35:13 <aspiers> I would much prefer that
09:35:31 <bogdando> https://github.com/rabbitmq/rabbitmq-server/blob/6fd4eb5bcb39be7f5ac26dcc78e3a4b4df4c6fbb/scripts/rabbitmq-server-ha.ocf#L303-L463
09:35:55 <aspiers> #action aspiers to write a spec proposing OCF RAs wrap systemd, and possibly a gerrit review showing a PoC
09:35:58 <_gryf> aspiers, yeah. and for me both (gerrit/ml) work fine
09:35:59 <bogdando> we could contribute this to the ocf-shell-funcs of the resource-agents
09:36:19 <bogdando> and use this instead of the init control plane for stopping things
09:36:22 <bogdando> and
09:36:24 <aspiers> #action beekhof to give technical details of systemd issues which prevent reliable building of OCF RAs on top of it
09:36:49 <bogdando> thoughts?
09:36:50 <aspiers> bogdando: yes, maybe there is an opportunity for reuse of library code there
09:37:03 <aspiers> bogdando: however the idea is to leave the majority of the logic to systemd
09:37:23 <bogdando> I believe posix signals will make happy even the ones with the longest beards among us :)
09:37:53 <aspiers> so effectively the OCF RAs don't add much around systemd except 1. polling and error handling to cover systemd timing / failure issues and 2. application-layer monitoring
09:38:06 <bogdando> at least please put this as an alternative to the stop case
09:38:36 <aspiers> bogdando: well, determining which PID to kill is a detail I would prefer to be handled by systemd
09:38:43 <bogdando> cuz you know, in pacemaker the stop is the most sensitive thing
09:38:48 <aspiers> since that depends on how the daemon is coded and packaged by the distro
09:38:58 <aspiers> yes I agree, stop is really critical
09:39:21 <aspiers> but I believe that if the distro packages can't reliably stop the service via systemd, then that is a distro bug
09:40:01 <aspiers> there is no 100% reliable cross-distro way to determine the pid of the daemon
09:40:07 <bogdando> procfs?
09:40:09 <bogdando> not 100%?
09:40:25 <aspiers> no, because you have to make assumptions about the daemon process name
09:40:29 <aspiers> or the location of the pid file
09:40:45 <aspiers> and then this has to be a parameter to the Pacemaker primitive
09:40:49 <bogdando> the latter one is under OCF RA control, as a parameter perhaps
09:40:57 <aspiers> which is a duplication of config data already defined by the distro packages
09:41:03 <aspiers> and that's what I want to avoid
09:41:24 <bogdando> well, we could rely on the systemd to get the pid
09:41:32 <aspiers> yes, that's exactly my proposal :)
09:41:33 <bogdando> but still use proc_stop for stopping
09:41:43 <aspiers> bogdando: what's wrong with systemctl stop?
09:41:48 <bogdando> async
09:41:55 <aspiers> bogdando: solved by polling
09:42:02 <bogdando> doesn't guarantee results
09:42:16 <aspiers> then isn't that a distro bug?
09:42:25 <aspiers> what could go wrong?
09:42:31 <bogdando> we want stop like STONITH, to be sure the one was nuked down
09:42:43 <bogdando> in the specified op stop timeout
09:42:52 <aspiers> why can't systemd do this?
09:43:06 <bogdando> some folks say it is not reliable enough ;)
09:43:27 <bogdando> while kill -TERM, kill -KILL have been proven to be so for decades
09:43:34 <ddeja> but it's one of the pacemaker features - it'll kill the host if it's unable to stop one of its services
09:43:39 <aspiers> but that's what systemd does
09:43:45 <aspiers> bogdando: see the systemd.kill(5) man page
09:43:55 <bogdando> okay, thanks will look
09:43:57 <ddeja> so what's the big deal if systemd will not stop service?
09:44:05 <aspiers> bogdando: systemd can send SIGKILL
09:44:07 <aspiers> if needed
09:44:30 <aspiers> ddeja: we want to avoid STONITH if we possibly can, since it's much more disruptive
09:45:18 <ddeja> aspiers: ok, sure thing
09:45:19 <aspiers> but I can't imagine a scenario where systemd would fail to kill the process via SIGTERM/HUP/KILL, unless there was some kind of kernel bug
09:45:28 <aspiers> and in that case, we want STONITH anyway :)
09:45:36 <bogdando> sigkill should work indeed
09:45:41 <aspiers> there may well be other issues I am not aware of
09:45:47 <aspiers> beekhof certainly seems to think so
09:45:53 <aspiers> but I haven't heard the details yet
09:45:55 <bogdando> but what if the pid was lost?
09:46:05 <aspiers> bogdando: what do you mean by lost?
09:46:06 <bogdando> proc_stop assumes one may use name-based matching
09:46:29 <aspiers> bogdando: name-based matching is not cross-distro and also unreliable
09:46:30 <bogdando> imagine a byzantine case when the app crashed and removed its pid too early
09:46:58 <aspiers> bogdando: in that case, systemd can detect the service is already stopped
09:47:06 <aspiers> so there is no problem
09:47:30 <bogdando> I can only say that pidfiles do not always contain what one expects
09:48:00 <bogdando> especially for high load systems
09:48:08 <aspiers> bogdando: perhaps you can give some concrete examples on #openstack-ha later, or via email?
09:48:14 <bogdando> okay
09:48:17 <aspiers> cool, thanks
09:48:51 <aspiers> btw, there was a report on #openstack-ha in the last few days about a bug where detecting pid via name-based matching failed
09:49:30 <aspiers> so this is not just a theoretical issue
09:49:44 <aspiers> ok, so I will write a spec
09:49:59 <aspiers> apologies in advance: I will probably not have time to do it in the next few weeks :-(
09:50:08 <aspiers> we have a major release coming up, and I am also going away for 2 weeks
09:50:16 <aspiers> but I don't think this proposal is urgent anyway
09:50:49 <aspiers> anyone have any other topics to discuss? bogdando, anything about the ha-guide maybe? e.g. new meeting time?
09:51:48 <aspiers> #action bogdando to give some concrete examples where pid files (and perhaps hence systemd) cannot be relied on
09:52:40 <aspiers> #info success of systemctl stop is considered very important, to avoid STONITH unless we really need it
09:53:00 <aspiers> #info systemd seems to be quite powerful in capabilities for killing misbehaving processes
09:55:02 <aspiers> ok I guess not
09:55:11 <aspiers> in that case let's end the meeting
09:55:28 <aspiers> thanks all, and see you next week, or on #openstack-ha before then!
09:55:32 <masahito> ok, bye
09:55:35 <aspiers> bye for now :)
09:55:44 <kazuIchikawa> bye
09:55:45 <ddeja> bye
09:56:09 <aspiers> #endmeeting