15:58:42 <inc0> #startmeeting kolla
15:58:43 <openstack> Meeting started Wed Mar  1 15:58:42 2017 UTC and is due to finish in 60 minutes.  The chair is inc0. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:58:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:58:47 <openstack> The meeting name has been set to 'kolla'
15:58:58 <inc0> #topic rollcall
15:59:02 <inc0> hello all:)
15:59:07 <britthouser4> hey!
15:59:11 <britthouser4> 0/
15:59:13 <duonghq> o/
15:59:19 <egonzalez> woot /
15:59:59 <sp_> woot /
16:00:04 <akwasnie> o/
16:00:07 <pbourke> o/
16:00:14 <krtaylor> o/
16:00:52 <Jeffrey4l> woot
16:01:33 <portdirect> w00t
16:01:43 <sdake> o/ :)
16:02:40 <jascott1> woot
16:02:48 <inc0> ok, we have busy agenda so I'll move on
16:02:56 <inc0> #topic Announcements
16:03:05 <inc0> I have 2: 1. thank you all for great PTG
16:03:07 * krtaylor was happy to meet everyone at PTG
16:03:32 <inc0> notes and session list are available here:
16:03:35 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-schedule
16:03:47 <inc0> and 2. We release ocata next week!
16:03:56 <inc0> I'd like to encourage everyone to step up testing
16:04:08 <inc0> and fix these last few things that are broken
16:04:41 <inc0> also we'd need good answer about whether or not Kolla runs on docker 1.13:)
16:04:49 <zhubingbing> o/
16:04:51 <zhubingbing> sorry
16:04:55 <inc0> any announcements from community?
16:05:04 <inc0> no worries zhubingbing, welcome
16:06:17 <inc0> guess no announcements
16:06:31 <inc0> #topic Applying for the Stable project maturity tag
16:06:36 <inc0> sdake: I assume it's yours
16:06:46 <inc0> can we make it shorter than 20min plz?
16:07:02 <inc0> there is topic in agenda I'd love to talk about today
16:07:08 <sdake> yup
16:07:09 <sdake> not sure if it can be
16:07:09 <sdake> but will try
16:07:11 <sdake> so - we applied for the stable maturity tag in the past
16:07:22 <sdake> now that liberty is EOL we can do so again
16:07:32 <sdake> and daviey will be our liaison (confirmed on irc)
16:07:54 <inc0> from release team?
16:07:55 <sdake> the one thing that Jeffrey4l had a q about was the lifetime of mitaka (2.0.2)
16:08:00 <sdake> daviey is a core reviewer
16:08:05 <sdake> and he is also on the stable maint team
16:08:12 <inc0> ok makes sense
16:08:24 <sdake> I can submit the review if you like
16:08:30 <inc0> I'll do it
16:08:35 <sdake> cool
16:08:41 <sdake> i guess we can leave the unanswered question unanswered
16:08:53 <sdake> although liberty (1.x) is EOL
16:08:58 <sdake> and 2.0.2 (newton) is about to be eoled March 3rd
16:09:08 <inc0> mitaka
16:09:08 <sdake> so heads up to everyone involved ;)
16:09:17 <inc0> newton still has 6 months in it
16:09:18 <sdake> sorry newton^mitaka
16:09:27 <Jeffrey4l> we need release last tag before
16:09:35 <Jeffrey4l> for mitaka branch .
16:09:37 <sdake> right we do need one last tag for pip
16:09:47 <sdake> maybe its march 10th
16:09:50 <sdake> i forget which it is :)
16:09:53 <Jeffrey4l> no matter whether kolla branch is remove or not.
16:10:00 <sdake> this is the undefined part
16:10:01 <inc0> yeah I agree
16:10:18 <inc0> Jeffrey4l: can we tag this week?
16:10:20 <sdake> trailing cycle projects don't have a defined lifetime
16:10:28 <inc0> I don't expect any patches merging to stable/mitaka
16:10:33 <inc0> or stable/newton
16:10:38 <Jeffrey4l> possible.
16:10:42 <sdake> there are a bunch in the backlog
16:10:48 <Jeffrey4l> there is nothing much for mitaka branch.
16:10:50 <sdake> although we can punt on mitaka
16:10:55 <sdake> oh cool then all good :)
16:11:17 <inc0> #action inc0 to submit review for maturity tag
16:11:31 <sdake> ok i think that sums it up - inc0 i'll point you at my last take at this PMT
16:11:51 <inc0> I'd say we should all observe this review and address issues if release team will have it
16:11:54 <sdake> inc0 there are a bunch of requirements needed - and we are hitting almost all of them
16:12:06 <sdake> i'd love to pull it up now, however, i don't have it handy
16:12:11 <inc0> I'll take that with you later
16:12:15 <sdake> if you can move on i can link it in the closing of the meeting
16:12:25 <inc0> ok, let's move on
16:12:30 <sdake> tia ;-)
16:12:35 <inc0> #topic serial in kolla-ansible
16:12:39 <inc0> Jeffrey4l: you're up
16:12:43 <Jeffrey4l> thanks.
16:12:49 <Jeffrey4l> this may related to next topic.
16:12:54 <Jeffrey4l> by duonghq
16:13:13 <Jeffrey4l> i saw some issue when test upgrade from newton to ocata.
16:13:20 <Jeffrey4l> check this link https://etherpad.openstack.org/p/kolla-ansible-serial
16:13:45 <Jeffrey4l> serial try to upgrade service one node by another one.
16:13:59 <Jeffrey4l> which will cause some unexpected issue.
16:14:26 <Jeffrey4l> for example the sighup part in nova.
16:14:32 <inc0> but that's what rolling upgrade is
16:14:38 <inc0> ahh
16:14:42 <inc0> I see what you mean
16:14:55 <Jeffrey4l> i am not trying to use rolling upgrade.
16:14:59 <inc0> we are hitting ansible wall
16:15:05 <Jeffrey4l> serial causes some issues ;(
16:15:12 <duonghq> agreed with Jeffrey4l
16:15:18 <Jeffrey4l> and serial only works with playbook, which is bad, too.
16:15:19 <inc0> right
16:15:23 <inc0> yeah
16:15:34 <inc0> this is an example when serial would be needed at task level really
16:15:40 <Jeffrey4l> one propose i made is disable serial.
16:15:54 <Jeffrey4l> inc0, yes. but found nothing about this.
16:15:56 <duonghq> we need some task run only on 1st node, and some task run only on last or 1st node in the end, egonzalez also faced it
16:16:02 <inc0> but then it's not a rolling upgrade or no-downtime upgrade
16:16:16 <Jeffrey4l> one possible solution is use a dynamic delete_to variable, but still testing this.
16:16:43 <Jeffrey4l> kolla do not promise no-downtime upgrade.
16:16:59 <inc0> yeah
16:17:09 <duonghq> but if service has native support zero-downtime upgrade, we should support it too
16:17:09 <inc0> however if we have clear problem with ansible
16:17:15 <Jeffrey4l> and no-downtime is what duonghq is doing and solving.
16:17:28 <inc0> yeah we want to optimize it as much as possible
16:17:34 <sp_> Jeffrey4l: yes
16:17:38 <inc0> no downtime is not a promise but it is a goal
16:17:47 <duonghq> Jeffrey4l, you mean delegate_to?
16:17:53 <Jeffrey4l> any way, we face some issue when during upgrade.
16:18:08 <Jeffrey4l> duonghq, yep. i haven't test it. But i guess it should works.
16:18:10 <inc0> that's a good observation, maybe we should talk on #ansible to ask ansible people for opinions?
16:18:19 <Jeffrey4l> inc0, good idea.
16:18:25 <sp_> inc0:  yes actual zero down time would be possible its our goal
16:18:37 <sp_> would not be*
16:18:44 <duonghq> any kolla-k8s people around? will we face same issue with k8s?
16:18:51 <egonzalez> not all projects support no zero-downtime upgrade
16:19:02 <Jeffrey4l> we need solve the serial issue. and better implement zero downtime later. but they are two different thing, right?
16:19:03 <inc0> if Ansible will make this impossible, we should note that and work with them to make it possible
16:19:19 <inc0> egonzalez: right, but more and more does
16:19:29 <inc0> which means this is problem we definitely need to fix
16:19:43 <egonzalez> yeah, no doubt of it
16:19:45 <Jeffrey4l> duonghq, i guess kolla-k8s start trying to solve upgrade issue . sdake right?
16:20:01 <inc0> k8s will not have this issue
16:20:02 <sdake> Jeffrey4l no - we are working on basic upgrades at some point in the future for 1.0.0
16:20:08 <inc0> will have different issues;)
16:20:18 <inc0> but yeah, that's a goal
16:20:29 <Jeffrey4l> at last, one propose i want is disable serial, or at least disable this in default.
16:20:39 <inc0> I wouldn't surrender just yet tho, let's talk about it on #ansible after meeting
16:20:54 <inc0> yeah I tend to agree on that
16:20:55 <Jeffrey4l> ok.
16:21:05 <inc0> we should also change "stop all schedulers" tasks
16:21:15 <sp_> sdake: basic upgrade means with downtime . right ? for kolla-k8s 1.0.0
16:21:18 <inc0> as without serial they will only cause downtime we don't need
16:21:26 <duonghq> sp_, noop
16:21:32 <duonghq> ah, sorry, yes
16:21:33 <Jeffrey4l> openstack is more than a control plane. stop the services won't affect vms.
16:21:34 <sdake> sp_ possibly
16:21:37 <inc0> well it's still gonna be ~minute of downtime per service
16:21:57 <duonghq> for service need db migration, it'll take quite long time
16:22:12 <inc0> but we don't need to turn it off because of this issue
16:22:13 <sdake> i think zero downtime upgrades is a great objective for kolla-ansible, whereas any upgrades are a great objective for kolla-kubernetes
16:22:18 <Jeffrey4l> duonghq, when db migration, the service is still running.
16:22:35 <inc0> it's only about restarting containers
16:22:38 <duonghq> Jeffrey4l, not sure, it's depended on service
16:22:40 <sdake> although we can punt zero downtime upgrades if it won't make the deadline (for kolla-ansible)
16:22:54 <sp_> sdake: yes,
16:23:04 <inc0> duonghq: right, but again, that's not this issue
16:23:15 <Jeffrey4l> duonghq, yes. but checked nova/neutron/cinder/glance, these both support this.
16:23:26 <duonghq> some service cannot working while db is in migration progress
16:23:28 <inc0> sdake: we are discussing that it's just impossible today with ansible being what it is
16:23:38 <inc0> it's not about us, it's about ansible
16:23:46 <sdake> got it
16:23:57 <Jeffrey4l> duonghq, that's OK for before implement zero-downtime upgrade.
16:24:05 <Jeffrey4l> serial can not solve such issue too.
16:24:16 <inc0> duonghq: right, but we can't help with it, that's on services themselfes
16:24:20 <inc0> themselves*
16:24:24 <duonghq> inc0, right
16:24:46 <duonghq> ah, we can do something
16:24:49 <inc0> Jeffrey4l: on the other hand
16:24:58 <inc0> we *need* serial for compute nodes
16:25:13 <Jeffrey4l> duonghq, delete_to may can do the magic
16:25:16 <inc0> so we're back at square one
16:25:30 <Jeffrey4l> inc0, hrm reason?
16:25:34 <duonghq> Jeffrey4l, delegate_to?
16:25:37 <srwilkers> o/
16:25:41 <duonghq> hi srwilkers
16:25:44 <Jeffrey4l> duonghq, yep. but i will try it.
16:25:47 <Jeffrey4l> test it.
16:25:50 <inc0> if you start pushing new containers to all of compute nodes at the same time
16:25:56 <inc0> it can be really bad
16:25:57 <duonghq> we already use it at some point
16:25:59 <inc0> really quick
16:26:12 <Jeffrey4l> i can pull the image before upgrade and this should be recommended too.
16:26:18 <inc0> ofc you can do *quasi* serial by modifying "forks"
16:26:32 <inc0> Jeffrey4l: still good to do it in serial
16:27:19 <inc0> I'd hold on before we talk to #ansible
16:27:25 <Jeffrey4l> forks handle task by task. it is different. but it may be helpful.
16:27:37 <Jeffrey4l> ok. let's move on.
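The batching behaviour discussed under this topic can be sketched in plain Python. This is an illustrative model only, not Ansible code: `batches`, `run_play`, the host names, and the task names are all hypothetical. It assumes the documented behaviour that a play-level `serial` setting runs the whole task list for one batch of hosts before starting the next batch, which is why a step meant to happen once after every node is upgraded (the SIGHUP example above) ends up running per batch instead.

```python
def batches(hosts, size):
    """Split hosts into consecutive batches of at most `size` hosts."""
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def run_play(hosts, tasks, serial=None):
    """Return the (task, host) execution order for a simplified play.

    Without `serial`, there is one batch containing every host, so each
    task runs on all hosts before the next task starts.
    With `serial`, the full task list runs for one batch before the
    next batch begins.
    """
    size = serial or max(len(hosts), 1)
    order = []
    for batch in batches(hosts, size):
        for task in tasks:
            for host in batch:
                order.append((task, host))
    return order


# Hypothetical inventory and task names, for illustration only.
hosts = ["control1", "control2", "control3"]
tasks = ["restart_container", "send_sighup"]

for step in run_play(hosts, tasks):
    print("no serial:", step)
for step in run_play(hosts, tasks, serial=1):
    print("serial=1: ", step)
```

In the first run every container is restarted before any SIGHUP is sent; in the second, control1 receives its SIGHUP while control2 and control3 are still on the old version.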
16:27:55 <inc0> #topic ks-rolling-upgrade
16:28:00 <inc0> duonghq: you're up
16:28:09 <duonghq> thank you inc0
16:28:09 <Jeffrey4l> ( almost the same topic lol)
16:28:26 <duonghq> basically, it's about how we test the upgrade progress
16:28:31 <sp_> Jeffrey4l: same :)
16:28:41 <duonghq> especially when doing zero-downtime and rolling upgrade
16:28:46 <inc0> Jeffrey4l: it means it's important;)
16:28:59 <Jeffrey4l> yep.
16:29:16 <Jeffrey4l> upgrading with load?
16:29:33 <duonghq> testing service before and after the upgrade is easier than testing if it still working when upgrade is being done
16:29:35 <duonghq> up
16:29:38 <duonghq> yup
16:29:41 <sp_> Jeffrey4l: yes
16:29:53 <Jeffrey4l> in gate or in locally env?
16:29:59 <inc0> duonghq: there was session at ptg about gates to do it
16:30:01 <sp_> Jeffrey4l:  thats why we call zero downtime
16:30:08 <inc0> best way I can think of to test rolling upgrade
16:30:13 <duonghq> Jeffrey4l, in gate
16:30:33 <inc0> is deploy old -> test-old -> upgrade 50% of nodes -> test -> upgrade all -> test
16:30:38 <Jeffrey4l> for load, we can use rally, right?
16:30:55 <inc0> for gates, we had session in ptg
16:31:08 <zhubingbing> +1
16:31:10 <duonghq> seem that I missed many thing in PTG :(
16:31:11 <inc0> I'd say let's not start by focusing on most complex scenario
16:31:17 <zhubingbing> look so good
16:31:23 <inc0> let's start by:
16:31:27 <inc0> 1. multinode deploy gates
16:31:39 <sp_> duonghq:  me too :( missed thing of PTG
16:31:53 <inc0> 2. multinode upgrade gates in a way:
16:32:00 <sdake> duonghq indeed, its important to attend ptgs - even when remote participation is an option
16:32:17 <sdake> duonghq we have a travel program to help out  those who don't have funding to make it
16:32:20 <inc0> we deploy old from tarballs/registry -> we run test suite -> pull new from tarballs/registry -> upgrade -> test
16:32:27 <sdake> we being the broader OpenStack here
16:32:36 <zhubingbing> ;)
16:32:45 <Daviey_> (sdake: i'm here, but in another meeting)
16:32:59 <inc0> and frankly? I'd love to have this as one of highest priorities in Kolla for Pike
16:33:05 <duonghq> inc0, I think it's ok atm,
16:33:06 <sdake> Daviey_ all good - you did agree to serve as our stable liaison - correct?  if so, then I think we are gtg
16:33:13 <Daviey_> sdake: ack
16:33:13 <Jeffrey4l> inc0, ++  the first thing kolla-ansible need to do is multi node gate.
16:33:13 <duonghq> it worth a highest bp
16:33:15 <inc0> if we can make full upgrade gates, that's going to be awesome
16:33:18 <duonghq> maybe long running bp
16:33:20 <Daviey_> sdake: Unless anyone else is super keen
16:33:31 <sdake> Daviey_ i think everyone else is stretched thin
16:33:38 <inc0> agree duonghq also volunteers:) I volunteer myself but I'll need help
16:33:48 <duonghq> sdake, I applied PTG :(
16:33:59 <sdake> Daviey_ we need someone to coach and guide us on stable processes, kolla has become good at handling backports
16:34:00 <Jeffrey4l> this should be split into multi bp.   multi gate,  upgrade,  load , and combine all of those.
16:34:06 <Daviey_> duonghq: If you really want to do it, i don't mind. :)
16:34:17 <egonzalez> i'll work on zero-downtimes upgrade
16:34:19 <inc0> yeah, and for Pike I'd focus on first 2
16:34:27 <sdake> Daviey_ nah he meant he applied for the ptg travel support and it was not accepted
16:34:29 <inc0> and get them rock solid
16:34:44 <duonghq> *applied for TSP of PTG
16:34:46 <inc0> sdake: can we plz move this outside of meeting?;)
16:34:55 <inc0> let's stick to single topic
16:35:03 <sdake> inc0 wfm
16:35:03 <Daviey_> Sorry, my fault.
16:35:24 <Jeffrey4l> yes. we really need multi node gate.
16:35:29 <inc0> ok, so duonghq is there anything else on your topic?
16:35:35 <sdake> Jeffrey4l zuul v3 is COMING
16:35:41 <inc0> Jeffrey4l: my next topic will help getting this done;)
16:35:44 <duonghq> no, I think some bps is good atm
16:35:45 <sdake> Jeffrey4l we have to wait for that
16:35:59 <Jeffrey4l> hrm, actually i do not think zuul v3 is required. even though sam think so.
16:36:09 <sdake> i think zuulv3 is required
16:36:12 <inc0> sdake: no, we need to get it in place now and extend it when zuul gets on, that'd be my approach
16:36:14 <sdake> and sam thinks it so
16:36:23 <inc0> why it's required?
16:36:26 <inc0> 2 nodes is something
16:36:29 <sdake> 2 people think it - my thoughts are not based upon sam's opinion
16:36:31 <duonghq> I'll work on gating in this cycle
16:36:38 <Jeffrey4l> yep  sam thinks so, but i am not.
16:36:43 <inc0> we still need to crack the networking
16:36:44 <sdake> 2 nodes is something - so it shouldn't block multinode, but for more than 2 nodes, zuulv3 is needed
16:36:46 <inc0> all the same
16:36:53 <sdake> infra won't enable 3+ nodes without zuulv3
16:37:06 <inc0> ok, but let's not "wait" for v3
16:37:06 <Jeffrey4l> why we need 3+ nodes?
16:37:11 <sdake> ya no blocking
16:37:13 <inc0> let's do 2 nodes now and extend
16:37:19 <Jeffrey4l> inc0, +
16:37:23 <sdake> agreed - so lets rock :)
16:37:28 <inc0> ok
16:37:33 <egonzalez> +1
16:37:33 <inc0> next topic then
16:37:39 <Jeffrey4l> ( we are out of the topic ... )
16:37:40 <duonghq> +1
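The rolling-upgrade test sequence proposed under this topic (deploy old, test, upgrade 50% of nodes, test, upgrade all, test) can be written down as an ordered stage list. This is a minimal sketch under stated assumptions, not a gate job definition: `rolling_upgrade_stages`, the stage names, and the node names are hypothetical, and the two-node inventory only mirrors the "2 nodes now and extend" agreement.

```python
def rolling_upgrade_stages(nodes):
    """Return the ordered (stage, target_nodes) list for a rolling-upgrade test.

    The first half of the nodes is upgraded, the deployment is tested in
    its mixed-version state, then the remaining nodes are upgraded and the
    deployment is tested again.
    """
    middle = len(nodes) // 2
    first_half, remainder = nodes[:middle], nodes[middle:]
    return [
        ("deploy_old", nodes),
        ("test", nodes),
        ("upgrade", first_half),
        ("test", nodes),       # old and new versions running side by side
        ("upgrade", remainder),
        ("test", nodes),
    ]


# Hypothetical two-node gate inventory.
for stage, targets in rolling_upgrade_stages(["node1", "node2"]):
    print(stage, targets)
```

The middle test stage is the one that distinguishes this from a plain deploy, upgrade, test gate: it exercises the deployment while only part of it has been upgraded.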
16:37:42 <inc0> #topic post-ptg bps
16:37:51 <inc0> ok that's an experiment for next 20min;)
16:38:04 <inc0> I'd like everyone to look at ptg notes
16:38:14 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-schedule
16:38:33 <sp_> inc0: gone through
16:38:49 <inc0> #link https://etherpad.openstack.org/p/kolla-pike-ptg-blueprints
16:39:05 <inc0> pick a session and draft blueprints they see in ^ this etherpad
16:39:23 <inc0> then, on following meetings we'll add blueprints with notes
16:39:33 <inc0> I don't want our ptg effort to go to waste
16:40:02 <inc0> we should record blueprints out of notes and assign ourselves if we feel we want to do something
16:40:21 <zhubingbing> agree
16:40:34 <sp_> inc0: +1
16:40:47 <inc0> So I'll start writing down upgrade bps
16:41:01 <duonghq> inc0, nice
16:41:04 <inc0> I'd encourage everyone to do the same for other sessions
16:41:10 <inc0> timebox - till 16:55
16:42:43 <zhubingbing> sup sdake
16:43:20 <sdake> zhubingbing working my ass off :)
16:43:37 <zhubingbing> ok
16:43:41 <sdake> zhubingbing although I was at the ptg, I want other people to write the blueprints
16:44:01 <sdake> i'll add what I think are necessary in future meetings
16:44:03 <zhubingbing> understand
16:47:04 <jascott1> should we have one BP for all `blocking 1.0 reqs`?
16:51:16 <sdake> jascott1 i think we need to be careful with what we define as blocking 1.0 reqs, as some of the zero downtime upgrades are not really blocking - however put in whatever you think is helpful and we can follow the standard openstack blueprint process
16:51:57 <jascott1> sdake was talking about this one https://etherpad.openstack.org/p/kolla-pike-ptg-k8s-release-roadmap
16:52:42 <sdake> jascott1 right i know - i think it makes sense to sort those out into blueprints
16:52:58 <jascott1> oh ok
16:53:02 <sdake> jascott1 if you want to get that started, that would rock :)
16:53:02 <duonghq> agree with sdake  about zero-downtime upgrade for kolla-k8s 1.0.0, it seems that we have many works for 1.0.0
16:53:27 <sdake> duonghq we can record them all and then use standard blueprint process to select the ones that are essential
16:53:43 <duonghq> sdake, ack
16:53:51 <inc0> ok few last remarks on this
16:53:52 <sp_> sdake: +1
16:54:15 <inc0> we'll repeat this on next meeting too as we surely can get more blueprints out of these notes
16:54:23 <duonghq> sdake,  do we need some Y-stream version before 1.0.0?
16:54:26 <inc0> also I encourage everyone to take some time to do it outside meeting
16:54:40 <inc0> also feel free to post blueprint and link it to etherpad
16:55:11 <inc0> that will make easier for us to track how useful sessions were in ptg and how much of them turned into code
16:55:18 <inc0> questions?
16:55:41 <inc0> #topic open discussion
16:55:45 <inc0> 4 minutes:)
16:56:32 <inc0> anyone?
16:56:44 <inc0> or can we end meeting and give our life back?:)
16:57:05 <inc0> right...ok, thank you all for coming and see you in #openstack-kolla!
16:57:10 <inc0> #endmeeting kolla