09:00:08 <aspiers> #startmeeting ha
09:00:09 <openstack> Meeting started Mon Mar 21 09:00:08 2016 UTC and is due to finish in 60 minutes. The chair is aspiers. Information about MeetBot at http://wiki.debian.org/MeetBot.
09:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
09:00:12 <openstack> The meeting name has been set to 'ha'
09:00:22 <aspiers> hi
09:00:30 <aspiers> who do we have today?
09:00:36 <ddeja> o/
09:01:10 <aspiers> if it's just us two, we can discuss our talk ;-)
09:01:16 <ddeja> yup
09:01:37 <ddeja> but maybe someone else will show, we can wait 2 minutes more ;)
09:01:44 <aspiers> sure
09:04:19 <ddeja> ok, it looks like it's really us two, aspiers
09:04:22 <aspiers> yep
09:04:24 <aspiers> no problem
09:04:47 <aspiers> well let's use normal structure anyway, it will work fine
09:04:51 <aspiers> #topic Current status (progress, issues, roadblocks, further plans)
09:04:58 <aspiers> not much from my end
09:05:06 <aspiers> I did a few tweaks to openstack-resource-agents
09:05:13 <aspiers> and reported some HA bugs
09:05:23 <aspiers> e.g. http://bugs.clusterlabs.org/show_bug.cgi?id=5271
09:05:24 <openstack> bugs.clusterlabs.org bug 5271 in Documentation "usage of status action in OCF RAs needs clarifying or eliminating" [Normal,Unconfirmed] - Assigned to oalbrigt
09:05:43 <ddeja> so only short status from my side - I have prepared images for demo "Mistral HA". I didn't complete it yet since I worked on getting US visa
09:05:47 <aspiers> when I noticed that most of the RAs have the status action, and that's pointless
09:05:49 <aspiers> so I will drop it
09:05:57 <ddeja> ok
09:06:22 <aspiers> what type of images?
09:06:34 <ddeja> oh, images == pictures
09:06:39 <aspiers> oh, heh
09:06:49 <ddeja> like diagrams in visio ;)
09:06:51 <aspiers> too early in the morning for me ;-)
09:06:57 <aspiers> cool!
09:07:16 <aspiers> I guess you also took a look at reveal.js, right?
09:07:26 <aspiers> or at least at Florian's talk
09:07:48 <ddeja> yes, I have watched Florian's talk, didn't have time to play with the technology itself but will surely do so this week
09:07:52 <aspiers> ok
09:08:03 <aspiers> I was also working on our neutron-ha-tool OCF RA which we use in SUSE OpenStack Cloud 5 and 6 for neutron HA
09:08:26 <aspiers> I guess for the next release we will probably switch to the standard upstream approach
09:08:39 <ddeja> that's good
09:09:01 <aspiers> yeah
09:09:08 <aspiers> or at least in the DVR case
09:09:35 <aspiers> oh
09:09:43 <aspiers> I just remembered
09:09:48 <ddeja> hm?
09:10:06 <aspiers> last Tuesday was the cross-project meeting
09:10:20 <aspiers> that was last week, right?
09:10:28 <ddeja> I think so
09:10:37 <aspiers> I guess most people read the minutes / logs the next day
09:10:41 <bogdando> hi
09:10:43 <aspiers> but I should mention it for completeness
09:10:46 <aspiers> oh hey bogdando :)
09:10:51 <ddeja> hello :)
09:11:11 <aspiers> so we talked about the auto-evacuation spec
09:11:30 <aspiers> #link https://review.openstack.org/#/c/257809/2
09:11:44 <aspiers> there seemed to be a LOT of interest in the topic
09:12:06 <aspiers> and I explained why I thought it was cross-project
09:12:43 <aspiers> however it was decided that whilst this is important work, a high-level cross-project spec doesn't really make sense
09:13:01 <aspiers> I think specs are supposed to have concrete low-level action items
09:13:12 <aspiers> rather than be technical strategy documents
09:13:26 <masahito> hi
09:13:30 <aspiers> so I think the spec will be abandoned
09:13:35 <aspiers> hi masahito :)
09:13:52 <aspiers> I was just summarising the cross-project meeting
09:14:28 <masahito> got it. I'll check eavesdrop.
09:14:33 <aspiers> http://eavesdrop.openstack.org/meetings/crossproject/2016/crossproject.2016-03-15-21.01.log.html
09:14:47 <aspiers> http://eavesdrop.openstack.org/meetings/crossproject/2016/crossproject.2016-03-15-21.01.html
09:15:01 <bogdando> any action items to the spec decomposition then?
09:15:04 <aspiers> #info the conclusion was: keep working on it and collaborating with the different projects
09:15:09 <bogdando> we shall not just abandon and forget :)
09:15:34 <aspiers> #info and submit specs to individual projects when there are APIs missing
09:16:30 <aspiers> bogdando: well I guess until now, each group was working on their individual solutions
09:16:44 <bogdando> and looking to the comments, I believe it was a good idea to make it CP
09:16:54 <aspiers> but maybe we need to start collaborating on the convergence
09:17:06 <bogdando> otherwise we'd never have collected so many comments from folks from different areas / projects
09:17:11 <ddeja> aspiers: I think some groups are still working on their solutions
09:17:22 <aspiers> ddeja: yes, I think we all are
09:17:36 <aspiers> ddeja: maybe it is still too early
09:17:46 <aspiers> but we could start thinking about convergence action items
09:18:19 <bogdando> so, CP is eactly perfect place for the *high-level* concepts
09:18:26 <aspiers> #topic how to evaluate possible convergence strategies for auto evac
09:18:32 <bogdando> exactly
09:18:53 <aspiers> bogdando: high-level architectural discussion is cross-project for sure
09:19:04 <aspiers> but apparently specs are not the way to do it
09:19:38 <aspiers> maybe because that reaches too wide an audience
09:19:43 <bogdando> So I'd prefer we finish that instead of allowing the initiative to be split into detached local activities
09:19:59 <aspiers> I think a wiki or etherpad is fine
09:20:06 <aspiers> and mailing list / IRC
09:20:21 <bogdando> but only spec provides a path for future code reviews...
09:20:24 <aspiers> or we could move the spec somewhere else, e.g. mistral
09:20:39 <aspiers> several people said they thought it looked like a mistral spec
09:20:54 <bogdando> maybe yes
09:21:10 <ddeja> aspiers: from my experience with Mistral I don't think such a spec fits in it
09:21:27 <aspiers> ddeja: yeah I'm not convinced by that either
09:21:39 <ddeja> since (despite some bugs) Mistral is only a Workflow executor
09:21:59 <ddeja> but there were some plans of having 'very good, reliable workflows' in Mistral repo
09:22:00 <aspiers> Sean Dague said "A cross project spec should either be a thing which affects nearly all openstack projects, or a thing that all the projects involved have agreed to already"
09:22:01 <bogdando> yes, we cannot design things like fencing there
09:22:13 <bogdando> or nova API changes, if any
09:22:35 <aspiers> bogdando: I think the point with nova API changes was that currently it is believed to do everything we need
09:22:52 <aspiers> so in that sense, nova is not involved until we find a bug or a gap in functionality
09:22:54 <ddeja> bogdando: for now, fencing is done outside OpenStack, so I don't know if we should discuss fencing at all
09:23:04 <bogdando> so we may end up having 3 specs - Mistral (if anything to be changed in the Mistral), Nova (ditto), OpenStack resource agents space ?
09:23:23 <ddeja> IMO we should just state that fencing must be configured
09:23:32 <aspiers> ddeja: that's a good point - fencing will probably never be inside OpenStack (unless using Ironic / Triple-O?)
09:23:43 <bogdando> yes, we can just note and leave out of scope
09:23:57 <aspiers> but fencing is a crucial part of the architecture of course
09:23:57 <ddeja> yeah, also we can provide OCF agent for Mistral
09:24:20 <aspiers> ddeja: that's a great example of something which deserves a spec in openstack-resource-agents :)
09:24:21 * ddeja has OCF agent that calls mistral API prepared
09:24:25 <bogdando> indeed, new OCF RA for Mistral fits well into the latter of 3 specs
09:24:27 <aspiers> which reminds me I need to set up a specs repo
09:24:37 <aspiers> #action aspiers to set up a specs repo for openstack-resource-agents
09:25:05 <bogdando> and the rest things might be just put to the HA guide
09:25:07 <aspiers> (we also need that for planning Fuel reconvergence)
09:25:23 <aspiers> bogdando: hmm, I think it's maybe too WIP for the HA guide now?
09:25:24 <bogdando> like - make sure you configured the pcmk that way, and enabled fencing
09:25:36 <aspiers> oh, you mean just fencing?
09:25:43 <bogdando> yes, if we have a clear vision, nothing blocks us
09:25:59 <bogdando> everything you mentioned would not go as a part of OpenStack setup/op
09:26:16 <ddeja> since everyone is using pacemaker for fencing, it can be mentioned in HA guide
09:26:22 <aspiers> TBH I think we need an architecture diagram
09:26:30 <bogdando> we have a section for controllers HA setup
09:26:36 <aspiers> which maps all the required components of auto-evac
09:26:44 <ddeja> +1
09:26:45 <bogdando> we could add there all details near to the existing pacemaker/corosync sections
09:26:57 <aspiers> maybe we could use google drive to collaborate on drawing on?
09:27:00 <aspiers> *one
09:27:05 <bogdando> and add details how one shall configure pcmk remote for computes HA, for example
09:27:36 <aspiers> or some other tool which is like "ether-visio"
09:27:36 <bogdando> and we can add diagrams
09:27:40 <bogdando> we have many now :)
09:28:14 <aspiers> ddeja: I think we need this map for our talk anyway ;-)
09:29:03 <bogdando> we have bright examples Pacemaker Cluster Manager http://docs.openstack.org/ha-guide/intro-ha-arch-pacemaker.html and Keepalived http://docs.openstack.org/ha-guide/intro-ha-arch-keepalived.html architecture details and limitations
09:29:08 <ddeja> aspiers: you mean something like that https://github.com/gryf/mistral-evacuate/blob/master/Automatic%20evacuate%20design.jpg
09:29:15 <aspiers> bogdando: are there diagrams in the ha-guide?
09:29:20 <aspiers> bogdando: oh... thanks :)
09:29:22 <bogdando> :)
09:29:47 <ddeja> so it looks like we have enough diagrams ;)
09:30:02 <aspiers> bogdando: is there a standard way to produce diagrams for upstream docs?
09:30:10 <bogdando> but we need to add more specific to the Instance ha + pacemaker remote now
09:30:15 <bogdando> I have no idea ;(
09:30:24 <bogdando> let's ask openstack docs folks
09:30:31 <aspiers> good idea
09:30:38 <aspiers> the neutron docs have tons of great docs
09:30:44 <aspiers> I wonder how they collaborate on them
09:30:45 <bogdando> or Andrew Beekhof, who probably created those above
09:30:47 <bogdando> :)
09:31:51 <beekhof> what i do?
09:31:54 <aspiers> Google Drawings is probably the easiest way
09:31:58 <aspiers> beekhof: it's all your fault!
09:32:08 <bogdando> so do we agree we can start working on the docs update w/o waiting for accepted implementations?
09:32:19 <bogdando> as it seems 100% to be pcmk_remote with OCF RA
09:32:21 <ddeja> beekhof: out of nowhere!
09:32:24 <aspiers> :)
09:32:39 * beekhof was cooking dinner - it's been one of those days
09:32:42 <bogdando> beekhof, those? Pacemaker Cluster Manager http://docs.openstack.org/ha-guide/intro-ha-arch-pacemaker.html and Keepalived http://docs.openstack.org/ha-guide/intro-ha-arch-keepalived.html architecture details and limitations
09:33:00 <beekhof> yep, i made those
09:33:00 <aspiers> bogdando: what do you mean by docs update? a new section in ha-guide?
09:33:06 <bogdando> yes
09:33:23 <bogdando> to cover everything that's missing to the setup required for the Instances HA
09:33:36 <bogdando> and in that we are all 100% sure
09:33:58 <ddeja> fencing is such thing
09:34:00 <bogdando> like, pacemaker_remote, fencing, Mistral OCF RA
09:34:21 <masahito> I find good section.
09:34:26 <masahito> http://docs.openstack.org/ha-guide/compute-node-ha-api.html
09:34:30 <bogdando> exactly
09:35:01 <aspiers> bogdando: well, are we 100% sure on Mistral? for me it is a very good option, but I think we still have to do a lot of work and testing to be 100%
09:35:15 <bogdando> we can skip Mistral then
09:35:22 <aspiers> that's still implementation details
09:35:33 <aspiers> I think for now, only architecture should be covered in ha-guide
09:35:38 <beekhof> those diagrams were made in keynote (apple application like powerpoint)
09:35:56 <bogdando> hm, I'm only trying to find a place for things that will not go to any specs but still must be known (how-to)
09:35:56 <beekhof> i can probably export it into powerpoint which google docs might import
09:36:12 <aspiers> bogdando: yes, that is the challenge we need to figure out :)
09:36:21 <ddeja> aspiers: as far as I know triple-O guys are willing to use Mistral
09:36:31 <aspiers> bogdando: I think probably a wiki
09:36:47 <aspiers> bogdando: unless we want to use gerrit for review
09:37:22 <bogdando> do we agree on adding pacemaker_remote topics to the HA guide compute nodes HA?
09:37:35 <bogdando> it seems like 100% will go into the final solution
09:37:53 <masahito> bogdando: Just pacemaker_remote?
09:37:53 <bogdando> and fencing details!
09:37:55 <aspiers> bogdando: +1
09:38:13 <ddeja> for now I see no alternative - maybe Ironic would be able to do fencing someday...
09:38:36 <aspiers> agreed
09:38:58 <aspiers> bogdando: I think it would also be good to document that compute HA is still WIP
09:39:06 <aspiers> bogdando: the ha-guide could point to our community here
09:39:13 <bogdando> btw, aspiers I saw you asked about remote stonith agents, so maybe you could add some things you know now :)
09:39:20 <aspiers> bogdando: link to etherpad, weekly meetings etc.
09:39:32 <ddeja> aspiers: Big +1 on that
09:39:37 <bogdando> great then
09:39:42 <masahito> it sounds nice.
09:40:00 <aspiers> bogdando: can you take care of that? and add us as reviewers?
09:40:22 <bogdando> If I had a time to play with setup verification
09:40:36 <aspiers> maybe we could attract more people to our community that way
09:40:56 <bogdando> would be nice though if someone who already did just shared results and notes
09:41:07 <bogdando> so I could expose them as the guide (and test them as well)
09:41:28 <aspiers> bogdando: I think the wiki is the right place to link to them
09:41:32 <masahito> aspiers: agree. I noticed I can't find our etherpad on google.
09:41:35 <aspiers> since they are WIP and change quickly
09:41:45 <ddeja> bogdando: you mean steps how to configure pacemaker_remote?
09:41:53 <aspiers> we can also use wiki for evolving arch/design docs
09:41:53 <ddeja> and fencing of compute nodes?
09:42:13 <bogdando> yes, and fencing agents to use with computes probably (w/o devices specific things)
09:42:31 <aspiers> bogdando: there's not much to say about stonith of remote nodes, it's all documented already
09:42:41 <aspiers> bogdando: I just failed to find the docs the first time
09:42:55 <bogdando> well I'd like to put only the very specific things, no cross posting
09:43:18 <bogdando> or a link if that just works the way we want
09:43:44 <bogdando> the idea is to document something verified, even if as PoC
09:44:17 <ddeja> I believe that the way RH explained how to set up stonith is OK
09:44:22 <ddeja> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/ch-fencing-HAAR.html
09:44:52 <bogdando> folks, it should be not just common things, but specific to OpenStack computes and pacemaker_*remote*
09:45:15 <bogdando> with existing limitations like stonith resources may not be running on the compute nodes
09:45:25 <bogdando> or which type of remote fencing agents to use
09:45:32 <bogdando> that we *do* recommend
09:45:47 <ddeja> ok, I see
09:46:04 <aspiers> I don't think there is a need to favour any type of fencing agent
09:46:11 <bogdando> like maximum of nodes supported in the cluster (16 afaik)
09:46:15 <aspiers> any is good
09:46:21 <bogdando> so, more practical things
09:46:26 <bogdando> useful for ops
09:47:14 <bogdando> so that would not be just a reference to the existing guides AFAICT
09:47:16 <aspiers> bogdando: in the period before we have a working solution to document, what do you think is the goal of this section in the ha-guide?
09:47:36 <aspiers> I'm not sure it makes sense to document how to do 20% of a full solution
09:47:41 <bogdando> the goal is to make community prepared for Instances HA solutions
09:48:01 <bogdando> and shed some light to required tooling, setups
09:48:15 <bogdando> there are not many things about pacemaker remote IIUC
09:48:28 <bogdando> only few tech talks, and many things WIP, am I right?
09:48:31 <ddeja> bogdando: I have a setup with fencing configured, but it's based on this https://access.redhat.com/articles/1544823 The only things I've added are custom fencing agent (and Mistral of course)
09:48:35 <aspiers> IMHO it's fine to say "you will need pacemaker_remote"
09:48:43 <aspiers> and maybe link to docs on pacemaker_remote
09:48:57 <aspiers> but I'm not sure it makes sense to document details of how to set it up
09:48:59 <bogdando> okay, let's just try to draft something...
09:49:14 <aspiers> bogdando: yeah, please just submit a review and add us to cc
09:49:30 <bogdando> ok I'll try :)
09:49:30 <aspiers> bogdando: I will definitely review anything you cc me on :)
09:49:39 <aspiers> oops, that was a dangerous promise ;)
09:49:56 <ddeja> aspiers: prepare to review some fuel patches ;)
09:50:03 <aspiers> lol
09:50:19 <bogdando> haha
09:50:23 <aspiers> bogdando: I think the most important thing to add is info on our WIP
09:50:49 <aspiers> bogdando: i.e. http://eavesdrop.openstack.org/#High_Availability_Meeting
09:50:56 <aspiers> https://etherpad.openstack.org/p/automatic-evacuation
09:51:12 <aspiers> and maybe we need a wiki page which is a more friendly landing page for this topic
09:51:23 <aspiers> bogdando: also you could link to the user story
09:51:38 <aspiers> although that is the first link in the etherpad :)
09:52:46 <aspiers> #action bogdando to submit an ha-guide review adding info about community WIP on auto-evac
09:53:24 <aspiers> shall we also have a play with google drawing?
09:53:56 <aspiers> https://docs.google.com/drawings/d/1q50txuu3vVx2WadhWGAeSO25PaEy_DbmwhVryT4FXCY/edit?usp=sharing
09:54:14 <aspiers> #action anyone who wants to, to experiment with google drawing
09:54:29 <aspiers> I think an architecture map would really help us
09:55:09 <ddeja> I can paste there drawing I have prepared for my demo to have something to start with
09:55:16 <aspiers> ok
09:55:19 <bogdando> ddeja, great!
09:55:29 <aspiers> #topic AOB
09:55:33 <ddeja> It's Mistral-oriented, but it's still something to start
09:55:33 <aspiers> any other business before we finish?
09:55:59 <ddeja> wait one minute guys and tell if my drawing makes any sense ;)
09:56:33 <ddeja> Done, pasted
09:57:45 <masahito> I'll paste Masakari architecture.
09:57:57 <masahito> in the page.
09:58:07 <ddeja> but I don't know if it is anywhere close to what you guys have in mind :)
09:58:16 <haukebruno> well, hi \o/ being a _bit_ late, sorry :p
09:58:26 <aspiers> haha hi haukebruno, we are just finishing :)
09:58:30 <masahito> I think it's easy to compare both and others.
09:58:42 <ddeja> so feel free to delete it, I have it in another docs
09:58:46 <ddeja> masahito: +1
09:58:53 <bogdando> good point
09:58:54 <haukebruno> aspiers, yeah. I apologize... pretty bad timings today
09:59:10 <ddeja> I just wonder if we can add second page in this drawing?
09:59:10 <aspiers> ddeja: I was thinking we need a process diagram
09:59:39 <aspiers> ddeja: I'll try to sketch something so it makes more sense
10:00:03 <aspiers> something a bit like https://github.com/ntt-sic/masakari/blob/master/contents/architecture.png
10:00:17 <aspiers> but more generic
10:00:24 <aspiers> not specific to any implementation
10:00:28 <aspiers> anyway, we are out of time
10:00:29 <ddeja> I see
10:00:36 <aspiers> let's continue on #openstack-ha
10:00:46 <aspiers> thanks all!
10:00:55 <aspiers> #endmeeting