15:58:42 <inc0> #startmeeting kolla
15:58:43 <openstack> Meeting started Wed Mar 1 15:58:42 2017 UTC and is due to finish in 60 minutes. The chair is inc0. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:58:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:58:47 <openstack> The meeting name has been set to 'kolla'
15:58:58 <inc0> #topic rollcall
15:59:02 <inc0> hello all:)
15:59:07 <britthouser4> hey!
15:59:11 <britthouser4> 0/
15:59:13 <duonghq> o/
15:59:19 <egonzalez> woot /
15:59:59 <sp_> woot /
16:00:04 <akwasnie> o/
16:00:07 <pbourke> o/
16:00:14 <krtaylor> o/
16:00:52 <Jeffrey4l> woot
16:01:33 <portdirect> w00t
16:01:43 <sdake> o/ :)
16:02:40 <jascott1> woot
16:02:48 <inc0> ok, we have busy agenda so I'll move on
16:02:56 <inc0> #topic Announcements
16:03:05 <inc0> I have 2: 1. thank you all for great PTG
16:03:07 * krtaylor was happy to meet everyone at PTG
16:03:32 <inc0> notes and session list are available here:
16:03:35 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-schedule
16:03:47 <inc0> and 2. We release ocata next week!
16:03:56 <inc0> I'd like to encourage everyone to step up testing
16:04:08 <inc0> and fix these last few things that are broken
16:04:41 <inc0> also we'd need good answer about whether or not Kolla runs on docker 1.13:)
16:04:49 <zhubingbing> o/
16:04:51 <zhubingbing> sorry
16:04:55 <inc0> any announements from community?
16:05:04 <inc0> no worries zhubingbing, welcome
16:06:17 <inc0> guess no announcements
16:06:31 <inc0> #topic Applying for the Stable project maturity tag
16:06:36 <inc0> sdake: I assume it's yours
16:06:46 <inc0> can we make it shorter than 20min plz?
16:07:02 <inc0> there is topic in agenda I'd love to talk about today
16:07:08 <sdake> yup
16:07:09 <sdake> not sure if it can be
16:07:09 <sdake> but will try
16:07:11 <sdake> so - we applied for the stable maturity tag in the past
16:07:22 <sdake> now that liberty is EOL we can do so again
16:07:32 <sdake> and daviey will be our liason (confirmed on irc)
16:07:54 <inc0> from release team?
16:07:55 <sdake> the one thing that Jeffrey4l had a q about was the lifetime of mitaka (2.0.2)
16:08:00 <sdake> daviey is a core reviewer
16:08:05 <sdake> and he is also on the stable maint team
16:08:12 <inc0> ok makes sense
16:08:24 <sdake> I can submit the review if you like
16:08:30 <inc0> I'll do it
16:08:35 <sdake> cool
16:08:41 <sdake> i guess we can leave the unanswered question unanswered
16:08:53 <sdake> although liberty (1.x) is EOL
16:08:58 <sdake> and 2.0.2 (newton) is about to be eoled March 3rd
16:09:08 <inc0> mitaka
16:09:10 <sdake> so heads up to everyoen involved ;)
16:09:17 <inc0> newton still has 6 months in it
16:09:18 <sdake> sorry newton^mitaka
16:09:27 <Jeffrey4l> we need release last tag before
16:09:35 <Jeffrey4l> for mitaka branch .
16:09:37 <sdake> right we do need one last tag for pip
16:09:47 <sdake> maybe its march 10th
16:09:50 <sdake> i foget which it is :)
16:09:53 <Jeffrey4l> no matter whether kolla branch is remove or not.
16:10:00 <sdake> this is the undefined part
16:10:01 <inc0> yeah I agree
16:10:18 <inc0> Jeffrey4l: can we tag this week?
16:10:20 <sdake> trailing cycle projects don't have a defined lifetime
16:10:28 <inc0> I don't expect any patches merging to stable/mitaka
16:10:33 <inc0> or stable/newton
16:10:38 <Jeffrey4l> possible.
16:10:42 <sdake> there are a bunch in th ebacklog
16:10:48 <Jeffrey4l> there is nothing much for mitaka branch.
16:10:50 <sdake> although we can punt on mitaka
16:10:55 <sdake> oh cool then all good :)
16:11:17 <inc0> #action inc0 to submit review for maturity tag
16:11:31 <sdake> ok i think that sums it up - inc0 i'll point you at my last take at this PMT
16:11:51 <inc0> I'd say we should all observe this review and address issues if release team will have it
16:11:54 <sdake> inc0 there are a bunch of requirements needed - and we are hitting almost all of them
16:12:06 <sdake> i'd love to pull it up now, however, i don't have it handy
16:12:11 <inc0> I'll take that with you later
16:12:15 <sdake> if you can move on i can link it in the closing of the meeting
16:12:25 <inc0> ok, let's move on
16:12:30 <sdake> tia ;-)
16:12:35 <inc0> #topic serial in kolla-ansible
16:12:39 <inc0> Jeffrey4l: you're up
16:12:43 <Jeffrey4l> thanks.
16:12:49 <Jeffrey4l> this may related to next topic.
16:12:54 <Jeffrey4l> by duonghq
16:13:13 <Jeffrey4l> i saw some issue when test upgrade from newton to ocata.
16:13:20 <Jeffrey4l> check this link https://etherpad.openstack.org/p/kolla-ansible-serial
16:13:45 <Jeffrey4l> serial try to upgrade service one node by another one.
16:13:59 <Jeffrey4l> which will cause some unexpected issue.
16:14:26 <Jeffrey4l> for example the sighup part in nova.
16:14:32 <inc0> but that's what rolling upgrade is
16:14:38 <inc0> ahh
16:14:42 <inc0> I see what you mean
16:14:55 <Jeffrey4l> i am not trying to use rolling upgrade.
16:14:59 <inc0> we are hitting ansible wall
16:15:05 <Jeffrey4l> serial case some issue ;(
16:15:12 <duonghq> agreed with Jeffrey4l
16:15:18 <Jeffrey4l> and serial only works with playbook, which is bad, too.
16:15:19 <inc0> right
16:15:23 <inc0> yeah
16:15:34 <inc0> this is an example when serial would be needed at task level really
16:15:40 <Jeffrey4l> one propose i made is disable serial.
16:15:54 <Jeffrey4l> inc0, yes. but found nothing about this.
16:15:56 <duonghq> we need some task run only on 1st node, and some task run only on last or 1st node in the end, egonzalez also faced it
16:16:02 <inc0> but then it's not a rolling upgrade or no-downtime upgrade
16:16:16 <Jeffrey4l> one possible solution is use a dynamic delete_to variable, but still testing this.
16:16:43 <Jeffrey4l> kolla do not promise no-downtime upgrade.
16:16:59 <inc0> yeah
16:17:09 <duonghq> but if service has native support zero-downtime upgrade, we should support it too
16:17:09 <inc0> however if we have clear problem with ansible
16:17:15 <Jeffrey4l> and no-downtime is what duonghq is doing and solving.
16:17:28 <inc0> yeah we want to optimize it as much as possible
16:17:34 <sp_> Jeffrey4l: yes
16:17:38 <inc0> no downtime is not a promise but it is a goal
16:17:47 <duonghq> Jeffrey4l, you mean delegate_to?
16:17:53 <Jeffrey4l> any way, we face some issue when during upgrade.
16:18:08 <Jeffrey4l> duonghq, yep. i haven't test it. But i guess it should works.
16:18:10 <inc0> that's a good observation, maybe we should talk on #ansible to ask ansible people for opinions?
16:18:19 <Jeffrey4l> inc0, good idea.
16:18:25 <sp_> inc0: yes actual zero down time would be possible its our goal
16:18:37 <sp_> would not be*
16:18:44 <duonghq> any kolla-k8s people around? will we face same issue with k8s?
16:18:51 <egonzalez> not all projects support no zero-downtime upgrade
16:19:02 <Jeffrey4l> we need solve the serial issue. and better implement zone down time later. but they are two different thing, right?
16:19:03 <inc0> if Ansible will make this impossible, we should note that and work with them to make it possible
16:19:19 <inc0> egonzalez: right, but more and more does
16:19:29 <inc0> which means this is problem we definetly need to fix
16:19:43 <egonzalez> yeah, no doubt of it
16:19:45 <Jeffrey4l> duonghq, i guess kolla-k8s start trying to solve upgrade issue . sdake right?
16:20:01 <inc0> k8s will not have this issue
16:20:02 <sdake> Jeffrey4l no - we are working on basic ugprades at some point in the future for 1.0.0
16:20:08 <inc0> will have different issues;)
16:20:18 <inc0> but yeah, that's a goal
16:20:29 <Jeffrey4l> at last, one propose i want is disable serial, or at least disable this in default.
16:20:39 <inc0> I wouldn't surrender just yet tho, let's talk about it on #ansible after meeting
16:20:54 <inc0> yeah I tend to agree on that
16:20:55 <Jeffrey4l> ok.
16:21:05 <inc0> we should also change "stop all schedulers" tasks
16:21:15 <sp_> sdake: basic upgrade means with downtime . right ? for kolla-k8s 1.0.0
16:21:18 <inc0> as without serial they will only cause downtime we don't need
16:21:26 <duonghq> sp_, noop
16:21:32 <duonghq> ah, sorry, yes
16:21:33 <Jeffrey4l> openstack is more than a control plane. stop the services won't affect vms.
16:21:34 <sdake> sp_ possibly
16:21:37 <inc0> well it's still gonna be ~minute of downtime per service
16:21:57 <duonghq> for service need db migration, it'll take quite long time
16:22:12 <inc0> but we don't need to turn it off because of this issue
16:22:13 <sdake> i think zero downtime upgrades is a great objective for kolla-ansible, whereas any upgrades are a great objective for kolla-kubernetes
16:22:18 <Jeffrey4l> duonghq, when db migration, the service is still running.
16:22:35 <inc0> it's only about restarting containers
16:22:38 <duonghq> Jeffrey4l, not sure, it's depended on service
16:22:40 <sdake> although we can punt zero downtime upgrades t if it wont make the deadline (for kolla-ansible)
16:22:54 <sp_> sdake: yes,
16:23:04 <inc0> duonghq: right, but again, that's not this issue
16:23:15 <Jeffrey4l> duonghq, yes. but checked nova/neutron/cinder/glance, these both support this.
16:23:26 <duonghq> some service cannot working while db is in migration progress
16:23:28 <inc0> sdake: we are discussing that it's just impossible today with ansible being what it is
16:23:38 <inc0> it's not about us, it's about ansible
16:23:46 <sdake> got i t
16:23:57 <Jeffrey4l> duonghq, that's OK for before implement zero-downtime upgrade.
16:24:05 <Jeffrey4l> serial can not solve such issue too.
16:24:16 <inc0> duonghq: right, but we can't help with it, that's on services themselfes
16:24:20 <inc0> themselves*
16:24:24 <duonghq> inc0, right
16:24:46 <duonghq> ah, we can do something
16:24:49 <inc0> Jeffrey4l: on the other hand
16:24:58 <inc0> we *need* serial for compute nodes
16:25:13 <Jeffrey4l> duonghq, delete_to may can do the magic
16:25:16 <inc0> so we're back at square one
16:25:30 <Jeffrey4l> inc0, hrm reason?
16:25:34 <duonghq> Jeffrey4l, delegate_to?
16:25:37 <srwilkers> o/
16:25:41 <duonghq> hi srwilkers
16:25:44 <Jeffrey4l> duonghq, yep. but i will try it.
16:25:47 <Jeffrey4l> test it.
16:25:50 <inc0> if you start pushing new containes to all of compute nodes at the same time
16:25:56 <inc0> it can be really bad
16:25:57 <duonghq> we already use it at some point
16:25:59 <inc0> really quick
16:26:12 <Jeffrey4l> i can pull the image before upgrade and this should be recommended too.
16:26:18 <inc0> ofc you can do *quasi* serial by modifying "forks"
16:26:32 <inc0> Jeffrey4l: still good to do it in serial
16:27:19 <inc0> I'd hold on before we talk to #ansible
16:27:25 <Jeffrey4l> forks handle task by task. it is different. but it may be helpful.
16:27:37 <Jeffrey4l> ok. let's move on.
16:27:55 <inc0> #topic ks-rolling-upgrade
16:28:00 <inc0> duonghq: you're up
16:28:09 <duonghq> thank you inc0
16:28:09 <Jeffrey4l> ( almost the same topic lol)
16:28:26 <duonghq> basically, it's about how we test the upgrade progress
16:28:31 <sp_> Jeffrey4l: same :)
16:28:41 <duonghq> especially when doing zero-downtime and rolling upgrade
16:28:46 <inc0> Jeffrey4l: it means it's important;)
16:28:59 <Jeffrey4l> yep.
16:29:16 <Jeffrey4l> upgrading with load?
16:29:33 <duonghq> testing service before and after the upgrade it easier than testing if it still working when upgrade is been done
16:29:35 <duonghq> up
16:29:38 <duonghq> yup
16:29:41 <sp_> Jeffrey4l: yes
16:29:53 <Jeffrey4l> in gate or in locally env?
16:29:59 <inc0> duonghq: there was session at ptg about gates to do it
16:30:01 <sp_> Jeffrey4l: thats why we call zero downtime
16:30:08 <inc0> best way I can think of to test rolling upgrade
16:30:13 <duonghq> Jeffrey4l, in gate
16:30:33 <inc0> is deploy old -> test-old -> upgrade 50% of nodes -> test -> upgrade all -> test
16:30:38 <Jeffrey4l> for load, we can use rally, right?
16:30:55 <inc0> for gates, we had session in ptg
16:31:08 <zhubingbing> +1
16:31:10 <duonghq> seem that I missed many thing in PTG :(
16:31:11 <inc0> I'd say let's not start by focusing on most complex scenerio
16:31:17 <zhubingbing> look so good
16:31:23 <inc0> let's start by:
16:31:27 <inc0> 1. multinode deploy gates
16:31:39 <sp_> duonghq: me too :( missed thing of PTG
16:31:53 <inc0> 2. multinode upgrade gates in a way:
16:32:00 <sdake> duonghq indeed, its important to attend ptgs - even when remote participatoin is an option
16:32:17 <sdake> duonghq we have a travel program to help out those who don't have funding to make it
16:32:20 <inc0> we deploy old from tarballs/registry -> we run test suite -> pull new from tarballs/registry -> upgrade -> test
16:32:27 <sdake> we being the broader OpenStack here
16:32:36 <zhubingbing> ;)
16:32:45 <Daviey_> (sdake: i'm here, but in another meeting)
16:32:59 <inc0> and frankly? I'd love to have this as one of highest priorities in Kolla for Pike
16:33:05 <duonghq> inc0, I think it's ok atm,
16:33:06 <sdake> Daviey_ all good - you did agree to serve as our stable liason - correct? if so, then I think we are gtg
16:33:13 <Daviey_> sdake: ack
16:33:13 <Jeffrey4l> inc0, ++ the first thing kolla-ansible need to do is multi node gate.
16:33:13 <duonghq> it worth a highest bp
16:33:15 <inc0> if we can make full upgrade gates, that's going to be awesome
16:33:18 <duonghq> maybe long running bp
16:33:20 <Daviey_> sdake: Unless anyone else is super keen
16:33:31 <sdake> Daviey_ i think everyone else is stretched thin
16:33:38 <inc0> agree duonghq also volunteers:) I volunteer myself but I'll need help
16:33:48 <duonghq> sdake, I applied PTG :(
16:33:59 <sdake> Daviey_ we need someone to coach and guide us on stable processes, kolla has become good at handling backports
16:34:00 <Jeffrey4l> this should be split into multi bp. multi gate, upgrade, load , and combine all of those.
16:34:06 <Daviey_> duonghq: If you really want to do it, i don't mind. :)
16:34:17 <egonzalez> i'll work on zero-downtimes upgrade
16:34:19 <inc0> yeah, and for Pike I'd focus on first 2
16:34:27 <sdake> Daviey_ nah he meant he applied for the tpg travel support and it was not accepted
16:34:29 <inc0> and get them rock solid
16:34:44 <duonghq> *applied for TSP of PTG
16:34:46 <inc0> sdake: can we plz move this outside of meeting?;)
16:34:55 <inc0> let's stick to single topic
16:35:03 <sdake> inc0 wfm
16:35:03 <Daviey_> Sorry, my fault.
16:35:24 <Jeffrey4l> yes. we really need multi node gate.
16:35:29 <inc0> ok, so duonghq is there anything else on your topic?
16:35:35 <sdake> Jeffrey4l zuul v3 is COMING
16:35:41 <inc0> Jeffrey4l: my next topic will help getting this done;)
16:35:44 <duonghq> no, I think some bps is good atm
16:35:45 <sdake> Jeffrey4l we have to wiat for that
16:35:59 <Jeffrey4l> hrm, actually i do not think zuul v3 is required. even though sam think so.
16:36:09 <sdake> i think zuulv3 is required
16:36:12 <inc0> sdake: no, we need to get it in place now and extend it when zuul gets on, that'd be my approach
16:36:14 <sdake> and sam thinks it so
16:36:23 <inc0> why it's required?
16:36:26 <inc0> 2 nodes is something
16:36:29 <sdake> 2 people think it - my thoughts are not based upon sam's opinion
16:36:31 <duonghq> I'll work on gating in this cycle
16:36:38 <Jeffrey4l> yep sam thinks so, but i am not.
16:36:43 <inc0> we still need to crack the networking
16:36:44 <sdake> 2 nodes is something - so it shouldn't block multinode, bu tfor more then 2 nodes, zuulv3 is needed
16:36:46 <inc0> all the same
16:36:53 <sdake> infra won't enable 3+ nodes without zuulv3
16:37:06 <inc0> ok, but let's not "wait" for v3
16:37:06 <Jeffrey4l> why we need 3+ nodes?
16:37:11 <sdake> ya no blocking
16:37:13 <inc0> let's do 2 nodes now and extend
16:37:19 <Jeffrey4l> inc0, +
16:37:23 <sdake> agreed - so lets rock :)
16:37:28 <inc0> ok
16:37:33 <egonzalez> +1
16:37:33 <inc0> next topic then
16:37:39 <Jeffrey4l> ( we are out of the topic ... )
16:37:40 <duonghq> +1
16:37:42 <inc0> #topic post-ptg bps
16:37:51 <inc0> ok that's an experiment for next 20min;)
16:38:04 <inc0> I'd like everyone to look at ptg notes
16:38:14 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-schedule
16:38:33 <sp_> inc0: gone through
16:38:49 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-blueprints
16:39:05 <inc0> pick a session and draft blueprints they see in ^ this etherpad
16:39:23 <inc0> then, on following meetings we'll add bluepritns with notes
16:39:33 <inc0> I don't want our ptg effort to go to waste
16:40:02 <inc0> we should record blueprints out of notes and assign ourselves if we feel we want to do osmething
16:40:21 <zhubingbing> agre
16:40:34 <sp_> inc0: +1
16:40:47 <inc0> So I'll start writing down upgrade bps
16:41:01 <duonghq> inc0, nice
16:41:04 <inc0> I'd encourage everyone to do the same for other sessions
16:41:10 <inc0> timebox - till 16:55
16:42:43 <zhubingbing> sup sdake
16:43:20 <sdake> zhubingbing working my ass off :)
16:43:37 <zhubingbing> ok
16:43:41 <sdake> zhubingbing although I was at the ptg, I want other people to write the blueprints
16:44:01 <sdake> i'll add what I think are necessary in future meetings
16:44:03 <zhubingbing> understand
16:47:04 <jascott1> should we have one BP for all `blocking 1.0 reqs`?
16:51:16 <sdake> jascott1 i think we need to be careful with what we define as blocking 1.0 reqs, as some of the zero downtime upgrades are not really blocking - however put in whatever ou think is helpful and we can follow the standard openstack blueprint process
16:51:57 <jascott1> sdake was talking about this one https://etherpad.openstack.org/p/kolla-pike-ptg-k8s-release-roadmap
16:52:42 <sdake> jascott1 right i know - i think it makes sense to sort those out into blueprints
16:52:58 <jascott1> oh ok
16:53:02 <sdake> jascott1 if you want to get that started, that would rock :)
16:53:02 <duonghq> agree with sdake about zero-downtime upgrade for kolla-k8s 1.0.0, it seems that we have many works for 1.0.0
16:53:27 <sdake> duonghq we can record them all and then use standard blueprint process to select the ones that are essential
16:53:43 <duonghq> sdake, ack
16:53:51 <inc0> ok few last remarks on this
16:53:52 <sp_> sdake: +1
16:54:15 <inc0> we'll repeat this on next meeting too as we surely can get more blueprints out of these notes
16:54:23 <duonghq> sdake, do we need some Y-stream version before 1.0.0?
16:54:26 <inc0> also I encourage everyone to take some time to do it outside meeting
16:54:40 <inc0> also feel free to post blueprint and link it to etherpad
16:55:11 <inc0> that will make easier for us to track how useful sessions were in ptg and how much of them turned into code
16:55:18 <inc0> questions?
16:55:41 <inc0> #topic open discussion
16:55:45 <inc0> 4 minutes:)
16:56:32 <inc0> anyone?
16:56:44 <inc0> or can we end meeting and give our life back?:)
16:57:05 <inc0> right...ok, thank you all for coming and see you in #openstack-kolla!
16:57:10 <inc0> #endmeeting kolla