09:03:11 #startmeeting ha
09:03:12 Meeting started Mon Jan 18 09:03:11 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:03:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:03:15 The meeting name has been set to 'ha'
09:03:21 welcome everyone
09:03:40 <_gryf> hello again :)
09:03:50 I guess it might be a bit short today but that's no problem :)
09:04:03 hello
09:04:13 let's start with some status updates as per normal
09:04:21 #topic Current status (progress, issues, roadblocks, further plans)
09:04:37 * aspiers picks someone at random
09:04:46 masahito: anything you'd like to report?
09:05:19 I'm working on my work items for Masakari, so nothing special to report.
09:05:30 ok
09:06:23 hey kazuIchikawa, welcome to our small group :) are you also working on masakari?
09:07:03 Yes, I'm working on masakari with masahito.
09:07:08 cool
09:07:41 He is my colleague, actually, working in the same office.
09:07:51 nice
09:08:11 I don't have too much to report
09:08:15 briefly:
09:08:32 I merged one or two patches into openstack-resource-agents
09:08:37 <_gryf> also, nothing from my side
09:08:41 <_gryf> but ddeja was able to bring his setup alive last Friday
09:08:43 there are still a few outstanding
09:08:46 <_gryf> and currently he is back on track and working on tuning the mistral workflows for evacuation
09:08:57 _gryf: sounds cool!
09:09:09 hi
09:09:09 _gryf: I guess the github will continue to be updated?
09:09:15 oh hey bogdando :)
09:09:18 <_gryf> aspiers, I hope :)
09:09:40 I've continued to have a long discussion with beekhof about the future of OCF RAs
09:09:57 and their relationship to systemd
09:10:27 I am still thinking it might be a good idea if the RAs are changed to wrap around systemd unit services
09:10:42 instead of reimplementing the logic / config for starting/stopping the daemons
09:10:54 beekhof strongly disagrees ;-)
09:11:15 for reasons which I don't fully understand yet
09:11:18 aspiers, do you mean only stateless resources by OCF or stateful as well?
09:11:25 like clusters
09:11:31 bogdando: possibly both
09:11:47 hey ddeja :)
09:11:51 hi all, sorry for being late :)
09:11:55 no problem
09:11:56 the point is that for the clusters, start/stop differs
09:11:57 I left my car for repair and didn't expect public transport to take so long
09:12:03 hehe
09:12:09 <_gryf> heh
09:12:12 and systemd start/stop logic perhaps cannot fit the needs
09:12:22 as well as any init-like one
09:12:23 well, let's finish the status round first, and then maybe we can come back to this topic?
09:12:36 want to make sure everyone gets a chance to give their status
09:13:04 the other thing I have to report is that I'm still working on our automation of compute node HA setup via Chef
09:13:16 but it's quite close to completion now
09:13:32 bogdando: you want to give a status update on anything?
09:13:59 not so far, only a few updates to ha-guide related things
09:14:29 yeah, ha-guide project seems very active :)
09:14:38 ddeja: anything from your side?
09:14:46 aspiers: yes, thanks
09:14:52 galera patch was reworked, thanks to Kenneth, https://review.openstack.org/#/c/263075/
09:15:07 I have finished working with my setup for auto-evac testing
09:15:32 cool
09:15:45 And I'm hardening the mistral workflow, but on Friday I hit something that might be a bug
09:15:59 But not sure yet, must do a double check
09:16:30 as soon as I have my workflow fully working, I'll let you know and I'll describe it in etherpad/github
09:16:31 ddeja: do you have any updates you could push to https://github.com/gryf/mistral-evacuate ?
09:16:37 ok great!
09:17:11 alright, I think that's everyone, or does anyone else want to report anything?
09:17:24 aspiers: yes, but I must confirm that there is a bug in mistral itself, not in my workflow, before I push it; hopefully I'll do it tomorrow :)
09:17:36 ddeja: cool :)
09:17:54 also, if you want to discuss a particular topic today (or in the future), please say so now
09:18:14 I would quite like to return to this topic of OCF RAs and systemd to get other people's opinions
09:18:24 but happy to discuss anything else
09:19:05 I'm OK with OCF RAs stuff
09:19:11 ok
09:19:16 #topic OCF RAs and systemd
09:19:26 firstly, is anyone *not* using systemd?
09:20:15 so my original idea, which I think I have mentioned in previous meetings, or on #openstack-ha, is to make small changes to the existing OCF RAs so that they wrap around systemd
09:20:18 for HA one?
09:20:27 or at least, that they wrap around service(8)
09:20:37 masahito: I mean, in general
09:20:44 masahito: vs. sysvinit, upstart etc.
09:21:10 there are a few problems with the existing OCF RAs
09:21:13 got it. we're using upstart.
09:21:16 let's create a spec first. There is high dev & ops impact
09:21:29 bogdando: sure.
this is very much in the proposal phase right now
09:21:33 a spec is a good idea
09:21:35 I'd like to see this change backwards compatible
09:21:47 bogdando: I think it would be, but we'd have to check
09:21:55 the problems I want to solve are:
09:22:01 with some deprecation perhaps, unless there is Ubuntu 16.04 at least :)
09:22:06 1. duplication of daemon management logic
09:22:16 2. distro-specific code in openstack-resource-agents repo
09:22:30 3. inconsistency of daemon management between HA clouds and non-HA clouds
09:22:56 4. delegation of maintenance of daemon management to distro packages
09:23:09 (2. and 4. are kind of the same thing)
09:23:22 so currently the OCF RAs start up daemons in their own way
09:23:30 they don't care how the daemons are packaged by the distro
09:23:48 if the distro package changes, the RA might also need a change
09:23:54 or at least the Pacemaker primitive
09:24:05 another point is probably a single point of entry for the resources control
09:24:24 would systemd allow to effectively handle this?
09:24:39 bogdando: could you expand a bit on what you mean by that please?
09:24:49 I mean that the init-based control plane shall not be used, or disabled, when pacemaker takes care
09:25:12 well that's already true
09:25:27 if Pacemaker manages the service, then the admin must not also try to start/stop it
09:25:38 I don't think my proposal changes that
09:25:46 yes, but I thought systemd may integrate things a bit better
09:25:54 must not -> cannot
09:26:11 or transparently "will not", in fact
09:26:28 I don't think systemd lets us prevent admins interfering with Pacemaker-controlled services
09:26:32 ;(
09:26:46 which is a shame
09:26:53 maybe there is some way to hack that, I'm not sure
09:26:54 my concern is how pacemaker works when a process launched by systemd goes down.
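[Editor's note: as a rough illustration of the proposal being discussed, an RA's actions could delegate daemon management to systemctl along the following lines. This is only a sketch, not anything agreed in the meeting: the service name, function names, and return-code handling are illustrative, and the polling accounts for the systemctl asynchrony caveat raised later in the discussion.]

```shell
#!/bin/sh
# Illustrative sketch: an OCF RA whose actions delegate to systemd
# instead of reimplementing start/stop logic per distro. All names
# here are hypothetical, not from any agreed design.

: "${OCF_SUCCESS:=0}" "${OCF_ERR_GENERIC:=1}" "${OCF_NOT_RUNNING:=7}"

SERVICE="${SERVICE:-openstack-keystone}"

# A real RA's monitor should also do application-level checks (e.g. an
# HTTP request against the API), not only ask systemd for unit state.
ra_monitor() {
    if systemctl is-active --quiet "$SERVICE"; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}

# systemctl start/stop return before the unit finishes changing state,
# so both actions poll ra_monitor until the desired state is reached
# or a deadline (in practice the Pacemaker op timeout) expires.
ra_start() {
    systemctl start "$SERVICE" || return "$OCF_ERR_GENERIC"
    deadline=$(( $(date +%s) + ${RA_TIMEOUT:-60} ))
    while ! ra_monitor; do
        [ "$(date +%s)" -ge "$deadline" ] && return "$OCF_ERR_GENERIC"
        sleep 1
    done
    return "$OCF_SUCCESS"
}

ra_stop() {
    systemctl stop "$SERVICE" || return "$OCF_ERR_GENERIC"
    deadline=$(( $(date +%s) + ${RA_TIMEOUT:-60} ))
    while ra_monitor; do
        [ "$(date +%s)" -ge "$deadline" ] && return "$OCF_ERR_GENERIC"
        sleep 1
    done
    return "$OCF_SUCCESS"
}
```

The point of the sketch is that everything distro-specific (pid files, daemon binaries, users, config paths) stays behind the systemd unit, which the distro package already maintains.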
09:27:14 masahito: in that case, Pacemaker must be in charge of restarting it
09:27:17 yes, good point to address in the spec, failure modes
09:27:27 and avoiding split brain of control planes
09:27:38 masahito: in my proposal, Pacemaker launches the process *via* systemd
09:28:12 so e.g. the 'start' action of the OCF RA will call something like "service openstack-keystone start"
09:28:17 aspiers: got it.
09:28:19 so the status in systemd will be correct, but it shall not try to do a respawn race
09:28:26 bogdando: correct
09:28:58 yes, this also solves the problem that 5. currently "systemctl status" is not guaranteed accurate if the service is started by Pacemaker
09:29:13 since that relies on the OCF RA using the same pid file and/or daemon name as systemd
09:29:14 looks like a very good idea, but complex to address (considering keeping it backwards compatible unless there is LTS ubuntu with systemd)
09:29:26 bogdando: I think it would be extremely simple to do
09:29:27 so we need a spec and PoC perhaps
09:29:36 bogdando: it's just a few lines of code changed in each RA
09:29:47 beekhof highlighted one issue with systemctl though
09:29:50 yes perhaps
09:29:54 systemctl start/stop are asynchronous
09:30:08 they do not block until the service is started/stopped
09:30:14 oh, that may be a big problem for stop logic in pacemaker
09:30:22 so after you call systemctl, you have to poll until the service is really started or stopped
09:30:25 and unexpected STONITHes :)
09:30:43 but I think that can be solved simply by a loop which calls the 'monitor' action
09:30:51 otherwise the stop action times out
09:31:04 where 'monitor' should do application-level monitoring (e.g.
HTTP layer tests)
09:31:08 but in fact is running async
09:31:18 another failure mode to catch up
09:31:23 bogdando: yes, the polling loop would need a timeout, but this is easy to do
09:31:34 also the systemctl would need a timeout, which again is easy
09:31:53 I have discussed this proposal a LOT with beekhof
09:32:01 he doesn't like it ;-)
09:32:05 but I don't understand why yet
09:32:14 he says systemd is too unreliable to use
09:32:38 :)
09:32:38 but I don't know what technical issues he is referring to
09:32:46 so I guess that conversation will continue
09:33:10 by the way
09:33:13 maybe we should ask beekhof to elaborate more on this?
09:33:20 ddeja: I already did
09:33:26 aspiers: cool
09:33:27 ddeja: waiting for an answer
09:33:34 aspiers: OK
09:33:35 for stop, I believe we should use a unified proc_stop which relies on several iterations of SIGTERM
09:33:50 followed by SIGKILL if nothing helps to stop gracefully
09:33:54 but in any case, I agree that a spec and some PoC (e.g. a WIP pull request to openstack-resource-agents) is a good way to go
09:33:59 <_gryf> aspiers, is that thread on the ml and somehow I've missed it?
09:34:07 that would be a more classic unix way, to rely on the signals, IMO
09:34:20 I have an example to show
09:34:42 _gryf: no, unfortunately it had to be private since there were some minor political topics involved :-/
09:34:58 <_gryf> aspiers, oh, got it ;)
09:35:09 _gryf: but a spec / gerrit review would help keep the technical discussion open
09:35:13 I would much prefer that
09:35:31 https://github.com/rabbitmq/rabbitmq-server/blob/6fd4eb5bcb39be7f5ac26dcc78e3a4b4df4c6fbb/scripts/rabbitmq-server-ha.ocf#L303-L463
09:35:55 #action aspiers to write a spec proposing OCF RAs wrap systemd, and possibly a gerrit review showing a PoC
09:35:58 <_gryf> aspiers, yeah.
and for me both (gerrit/ml) work fine
09:35:59 we could contribute this to the ocf-shell-funcs of the resource-agents
09:36:19 and use this instead of the init control plane for stopping things
09:36:22 and
09:36:24 #action beekhof to give technical details of systemd issues which prevent reliable building of OCF RAs on top of it
09:36:49 thoughts?
09:36:50 bogdando: yes, maybe there is an opportunity for reuse of library code there
09:37:03 bogdando: however the idea is to leave the majority of the logic to systemd
09:37:23 I believe posix signals will make happy even the ones with the longest beards among us :)
09:37:53 so effectively the OCF RAs don't add much around systemd except 1. polling and error handling to cover systemd timing / failure issues and 2. application-layer monitoring
09:38:06 at least please put this as an alternative for the stop case
09:38:36 bogdando: well, determining which PID to kill is a detail I would prefer to be handled by systemd
09:38:43 cuz you know, in pacemaker the stop is the most sensitive thing
09:38:48 since that depends on how the daemon is coded and packaged by the distro
09:38:58 yes I agree, stop is really critical
09:39:21 but I believe that if the distro packages can't reliably stop the service via systemd, then that is a distro bug
09:40:01 there is no 100% reliable cross-distro way to determine the pid of the daemon
09:40:07 procfs?
09:40:09 not 100%?
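[Editor's note: the signal-escalation stop bogdando advocates here — repeated SIGTERM for a graceful shutdown, falling back to SIGKILL before the op timeout — can be sketched roughly as below. The name `proc_stop` echoes the helper in the rabbitmq-server-ha.ocf script linked above, but this sketch is simplified and illustrative, not a reproduction of it.]

```shell
# Illustrative signal-based stop: keep sending SIGTERM until the
# process exits, and escalate to SIGKILL if the deadline is reached.
# Timings and return-code conventions are made up for illustration.
proc_stop() {
    pid="$1"
    timeout="${2:-30}"
    deadline=$(( $(date +%s) + timeout ))
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            # graceful shutdown failed; last resort before STONITH
            kill -KILL "$pid" 2>/dev/null
            sleep 1
            if kill -0 "$pid" 2>/dev/null; then
                return 1    # still alive: caller must escalate
            fi
            return 0
        fi
        kill -TERM "$pid" 2>/dev/null
        sleep 1
    done
    return 0    # process is gone
}
```

Note that this is also essentially what systemd's own stop logic does internally, which is the counter-argument aspiers raises next.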
09:40:25 no, because you have to make assumptions about the daemon process name
09:40:29 or the location of the pid file
09:40:45 and then this has to be a parameter to the Pacemaker primitive
09:40:49 the latter one is under OCF RA control, as a parameter perhaps
09:40:57 which is a duplication of config data already defined by the distro packages
09:41:03 and that's what I want to avoid
09:41:24 well, we could rely on systemd to get the pid
09:41:32 yes, that's exactly my proposal :)
09:41:33 but still use proc_stop for stopping
09:41:43 bogdando: what's wrong with systemctl stop?
09:41:48 async
09:41:55 bogdando: solved by polling
09:42:02 no guaranteed results
09:42:16 then isn't that a distro bug?
09:42:25 what could go wrong?
09:42:31 we want stop to be like STONITH, to be sure the thing was nuked down
09:42:43 in the specified op stop timeout
09:42:52 why can't systemd do this?
09:43:06 some folks say it is not reliable enough ;)
09:43:27 while kill -TERM, kill -KILL was proven to be so for decades
09:43:34 but it's one of the pacemaker features - it'll kill the host if it's unable to stop one of its services
09:43:39 but that's what systemd does
09:43:45 bogdando: see the systemd.kill(5) man page
09:43:55 okay, thanks, will look
09:43:57 so what's the big deal if systemd will not stop a service?
09:44:05 bogdando: systemd can send SIGKILL
09:44:07 if needed
09:44:30 ddeja: we want to avoid STONITH if we possibly can, since it's much more disruptive
09:45:18 aspiers: ok, sure thing
09:45:19 but I can't imagine a scenario where systemd would fail to kill the process via SIGTERM/HUP/KILL, unless there was some kind of kernel bug
09:45:28 and in that case, we want STONITH anyway :)
09:45:36 sigkill should work indeed
09:45:41 there may well be other issues I am not aware of
09:45:47 beekhof certainly seems to think so
09:45:53 but I haven't heard the details yet
09:45:55 but what if the pid was lost?
09:46:05 bogdando: what do you mean by lost?
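[Editor's note: the systemd.kill(5) escalation behaviour referred to above is configurable per unit. A unit drop-in along these lines controls how "systemctl stop" escalates from SIGTERM to SIGKILL; the path and values are illustrative, while the directive names come from systemd.kill(5) and systemd.service(5).]

```ini
# /etc/systemd/system/openstack-keystone.service.d/stop.conf (illustrative)
[Service]
# Signal sent first on "systemctl stop"
KillSignal=SIGTERM
# Deliver signals to every process in the unit's control group
KillMode=control-group
# If processes survive SIGTERM until TimeoutStopSec, send SIGKILL
SendSIGKILL=yes
TimeoutStopSec=30
```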
09:46:06 proc_stop assumes one may use name-based matching
09:46:29 bogdando: name-based matching is not cross-distro and also unreliable
09:46:30 imagine a byzantine case where the app crashed and removed its pid file too early
09:46:58 bogdando: in that case, systemd can detect the service is already stopped
09:47:06 so there is no problem
09:47:30 I can only say that pid files don't always contain what one expects
09:48:00 especially for high load systems
09:48:08 bogdando: perhaps you can give some concrete examples on #openstack-ha later, or via email?
09:48:14 okay
09:48:17 cool, thanks
09:48:51 btw, there was a report on #openstack-ha in the last few days about a bug where detecting the pid via name-based matching failed
09:49:30 so this is not just a theoretical issue
09:49:44 ok, so I will write a spec
09:49:59 apologies in advance: I will probably not have time to do it in the next few weeks :-(
09:50:08 we have a major release coming up, and I am also going away for 2 weeks
09:50:16 but I don't think this proposal is urgent anyway
09:50:49 anyone have any other topics to discuss? bogdando, anything about the ha-guide maybe? e.g. new meeting time?
09:51:48 #action bogdando to give some concrete examples where pid files (and perhaps hence systemd) cannot be relied on
09:52:40 #info success of systemctl stop is considered very important, to avoid STONITH unless we really need it
09:53:00 #info systemd seems to be quite powerful in capabilities for killing misbehaving processes
09:55:02 ok I guess not
09:55:11 in that case let's end the meeting
09:55:28 thanks all, and see you next week, or on #openstack-ha before then!
09:55:32 ok, bye
09:55:35 bye for now :)
09:55:44 bye
09:55:45 bye
09:56:09 #endmeeting