09:02:34 <aspiers> #startmeeting ha
09:02:35 <openstack> Meeting started Mon Aug 8 09:02:34 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:02:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:02:39 <openstack> The meeting name has been set to 'ha'
09:02:49 <ddeja> o/
09:02:50 <aspiers> ok hi everyone, and welcome to the "new" (old) meeting time
09:03:11 <samP> hi o/
09:03:17 <aspiers> please bear with me today if I am slow, as I am currently on two meetings at once :)
09:03:18 <Dinesh_Bhor> hi o/
09:03:30 <aspiers> but hopefully the other one will not require too much of my attention
09:03:37 <katomo> o/
09:04:02 <aspiers> hi all, and welcome to any recent joiners :)
09:04:29 <aspiers> #topic Current status (progress, issues, roadblocks, further plans)
09:04:47 <aspiers> samP: thanks a lot for your spec! I am currently reviewing it
09:05:07 <aspiers> samP: if it's ok I will just upload a new patch set with some minor edits?
09:05:11 <samP> aspiers: Thanks
09:05:21 <samP> aspiers: sure
09:05:23 <aspiers> ok
09:05:49 <aspiers> btw I think my comment about a link in the index was wrong
09:05:53 <aspiers> it seems to have automatically linked
09:06:08 <aspiers> the HTML rendered view is viewable from the Jenkins job build link
09:06:20 <aspiers> http://docs-draft.openstack.org/17/352217/2/check/gate-openstack-resource-agents-specs-docs-ubuntu-xenial/84fd365//doc/build/html/specs/newton/approved/newton-instance-ha-vm-monitoring-spec.html
09:07:00 <aspiers> BTW this is the review we are talking about, for those who don't know: https://review.openstack.org/#/c/352217/
09:07:33 <ddeja> My status: only 3 days of regular work last week. Mostly adjusting to new role in mistral by doing lots of reviews, not directly related to VM HA unfortunately.
09:08:12 <aspiers> no problem :) good luck with the reviews
09:08:49 <samP> aspiers: about review, I need your attention on "Proposed Change"
09:09:02 <aspiers> samP: sure, I will review the whole thing
09:09:15 <samP> aspiers: thanks
09:09:23 * ddeja will also review
09:09:34 <samP> ddeja: thanks
09:09:44 <aspiers> samP: but I think we probably already agreed on the approach - the masakari way seems best IIRC
09:10:40 <samP> aspiers: yes we did. Ok then, I will proceed with libvirt monitoring for VM monitoring
09:11:08 <aspiers> samP: I think the spec will be more about documenting this decision and then deciding the integration points
09:11:13 <samP> Now, I can complete the VM recovery spec
09:11:52 <aspiers> another update: I attended a meeting about cinder-volume active/active
09:11:52 <samP> aspiers: sure
09:11:57 <aspiers> I think I forgot to report that last week
09:12:24 <aspiers> there is still design discussion to be had, especially around fencing
09:12:39 <aspiers> they have a weekly meeting in case anyone is interested
09:12:55 <samP> aspiers: great. that's also a very important topic for us
09:12:56 <ddeja> aspiers: well, I can contact you with my colleague who is Cinder core reviewer
09:12:58 <aspiers> https://etherpad.openstack.org/p/cinder-active-active-HA
09:13:10 <ddeja> and doing cinder A/A from liberty
09:13:20 <ddeja> and is sitting next to me in office ;)
09:14:09 <aspiers> oh, cool!
09:14:26 <aspiers> will he attend the future cinder meetings?
09:14:34 <aspiers> what is his name?
09:14:39 <ddeja> I'm sure he will
09:14:41 <ddeja> dulek
09:14:45 <ddeja> on IRC
09:15:23 <aspiers> maybe he can be added to the list of nicks to ping for that meeting?
09:15:48 <ddeja> he's on two weeks vacation starting today, so I'm not sure if he would attend.
09:16:41 <aspiers> ok
09:16:49 <aspiers> I'll add his nick for future meetings
09:16:56 <ddeja> OK
09:17:40 <aspiers> any other status reports?
09:19:00 <samP> other than spec, not from my side
09:19:06 <aspiers> ok
09:19:12 <aspiers> #topic Barcelona
09:19:23 <aspiers> IIRC, today is the deadline for voting for sessions
09:19:36 <aspiers> so if you haven't yet voted (like me), please do it quickly ;-)
09:20:44 <aspiers> also I got a reply from ttx about the HA track
09:20:58 <aspiers> as you can see on the list
09:21:37 <samP> aspiers: thanks for mail
09:22:02 <aspiers> http://lists.openstack.org/pipermail/openstack-dev/2016-August/100679.html
09:22:19 <aspiers> then I followed up on openstack-operators
09:22:40 <aspiers> http://lists.openstack.org/pipermail/openstack-operators/2016-August/011154.html
09:22:44 <aspiers> but no replies yet :-(
09:23:01 <aspiers> so it looks like we will have to wait until after Barcelona
09:23:17 <aspiers> although maybe we can chase ops meetup organisers directly
09:23:54 <samP> aspiers: ops meetup at NY would be a good place
09:24:03 <aspiers> samP: yes, you are going right?
09:24:12 <samP> aspiers: yes,
09:24:17 <aspiers> samP: maybe you could talk to them about this last email?
09:25:24 <samP> aspiers: I'm discussing with some ops people, hopefully I can spread the word and ask some support
09:25:30 <aspiers> perfect!
09:25:37 <aspiers> on the compute HA topic, I think we should set some goals to achieve before Barcelona
09:25:44 <aspiers> IMHO minimum would be:
09:25:51 <aspiers> - finish all specs (including cross-project spec)
09:26:01 <aspiers> - update HA guide with status quo
09:26:26 <aspiers> - start the ball rolling with at least a little bit of collaborative hacking :)
09:26:31 <aspiers> anything else?
09:27:03 <ddeja> I think no
09:27:46 <samP> shouldn't we document our roadmap? is that the presentation or spec?
09:29:27 <aspiers> samP: good question
09:29:32 <samP> If the presentation gets accepted, then that will be the doc for future roadmap
09:29:54 <aspiers> samP: yes, although the specs will also define the roadmap
09:30:09 <beekhof> are we still going?
09:30:14 <aspiers> beekhof: yes
09:30:20 <beekhof> i have a few moments :)
09:30:30 <aspiers> beekhof: anything you wanna bring up?
09:30:35 <aspiers> Barcelona or otherwise?
09:30:43 <beekhof> i should probably do a docs blueprint
09:31:00 <aspiers> ok
09:31:27 <ddeja> beekhof: please add me for review when you submit it
09:31:29 <beekhof> also, are there any grievances about my arch proposals?
09:31:50 <aspiers> beekhof: yes :)
09:32:22 <katomo> beekhof: thanks
09:32:23 <aspiers> beekhof: but I think I mentioned most of them privately already, so no huge surprises
09:32:55 <aspiers> beekhof: IIRC, the main ones are monitoring and DRBD8
09:32:56 <beekhof> ok, so only disagreements about things you're wrong about :)
09:33:21 <aspiers> if you don't want real monitoring, that's fine by me ;-p
09:33:24 <beekhof> i meant to ask... you're using drbd for the database?
09:33:32 <aspiers> some customers, not all
09:33:36 <aspiers> shared storage also supported
09:33:38 <beekhof> you're the one without real monitoring :)
09:34:29 <beekhof> unless you replace all the systemd scripts with OCF, i would argue that "pacemaker" monitoring is worse than nagios and friends
09:34:44 <beekhof> BUT
09:34:47 <aspiers> agreed, and that's what we'll do
09:35:03 <beekhof> you'll maintain OCF agents?
09:35:09 <aspiers> #topic control plane architecture
09:35:21 <aspiers> beekhof: I'm the maintainer already
09:35:36 <beekhof> but for every single openstack daemon?
09:35:41 <aspiers> sure
09:35:53 <aspiers> beekhof: not claiming to be doing a great job yet, but I believe in the mission
09:36:08 * beekhof runs from the mission
09:36:15 <aspiers> pid-only monitoring is dumb
09:36:21 <beekhof> agreed
09:37:23 <beekhof> in any case, i think we can structure the docs to accommodate both variants with minimal problems
09:37:35 <beekhof> someone from your side is prepared to write that bit?
09:39:12 <aspiers> sure
09:39:55 <aspiers> puzzled why you are choosing an approach you just agreed is dumb, but I won't try to stop you ;-)
09:40:16 <aspiers> I appreciate it makes things a lot simpler though
09:41:28 <aspiers> ok
09:41:32 <aspiers> #topic AOB
09:41:34 <beekhof> nagios etc won't be doing pid only monitoring either
09:41:36 <aspiers> anything else?
09:41:55 <aspiers> nagios won't be doing fencing either
09:41:56 <beekhof> but neither will pacemaker be doing pid only monitoring via systemd
09:42:32 <beekhof> no-one ever wants the node fenced because the openstack service died
09:42:51 <beekhof> we'll still fence (via pacemaker-remote failures) if the node as a whole goes down
09:43:04 <beekhof> AOB?
09:43:15 <aspiers> Any Other Business
09:43:18 <beekhof> ah
09:44:20 <aspiers> for some openstack services, I *definitely* want the node fenced if the service died and can't be restarted
09:45:58 <beekhof> which ones? because this point has been made very clear to me
09:46:02 <beekhof> cinder i guess?
09:47:24 <beekhof> aspiers: ^^^
09:47:59 <aspiers> anything stateful, at least
09:48:51 <beekhof> my understanding is that all the state is in rabbit queues
09:49:29 <aspiers> nope
09:49:52 <beekhof> and the daemons are just pulling jobs off a queue and processing them
09:50:44 <aspiers> the API daemons, yes
09:51:48 <aspiers> but maybe we should declare the meeting over before we continue this :)
09:52:26 <aspiers> OTOH, it might make for interesting reading in the minutes
09:52:35 <aspiers> so what are you planning to monitor via nagios?
09:54:00 <aspiers> beekhof: ^^^
09:57:00 <beekhof> i dont think its nagios
09:57:12 <beekhof> they have some package in mind but the name escapes me
09:57:25 <aspiers> ok, same question applies though
09:57:28 <aspiers> but let's close the meeting and continue on #openstack-ha
09:57:33 <beekhof> happy to close the meeting for now.... bedtime run is imminent :)
09:57:40 <aspiers> ok :)
09:57:43 <aspiers> #endmeeting