15:59:23 <inc0> #startmeeting kolla
15:59:23 <openstack> Meeting started Wed Mar 15 15:59:23 2017 UTC and is due to finish in 60 minutes.  The chair is inc0. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:59:27 <openstack> The meeting name has been set to 'kolla'
15:59:37 <inc0> #topic rollcall, w00t
15:59:40 <duonghq> o/
15:59:43 <inc0> you know what to do
15:59:48 <mnaser> o/ bonjour
15:59:50 <hrw> o/
15:59:51 <pbourke> WOOT
15:59:57 <duonghq> oops
16:00:01 <akwasnie> o/
16:00:28 <berendt> o/
16:00:39 <Jeffrey4l> woot
16:00:44 <egonzalez> woot o/
16:00:54 <spsurya_> o/
16:01:03 <jascott1> o/
16:01:10 <sayantan_> woot
16:01:15 <zhubingbing_> woot
16:01:21 <spsurya_> woot
16:01:22 <zhubingbing_> o/
16:01:28 <vhosakot> o/ w00t w00t
16:02:19 <qwang> O/
16:02:23 <inc0> #topic announcements
16:02:30 <inc0> 1. we released ocata!
16:02:32 <Jeffrey4l> so many people today ;)
16:02:39 <inc0> congrats everyone
16:03:18 <spsurya_> gratz
16:03:19 <qwang> Jeffrey4l: for DST
16:03:27 <hrw> yay! more reviewers!
16:03:49 <Jeffrey4l> aha
16:03:55 <inc0> 2. One more week for voting for duonghq to become core, if anyone from core team missed it, please vote
16:04:19 <vhosakot> yep, will vote
16:04:20 <duonghq> thank inc0
16:04:35 <spsurya_> duonghq: congrats
16:04:43 <inc0> so last week we cannibalized the regular agenda for release discussion, so now let's get back to it
16:05:03 <inc0> #topic Need to formalize policy around pushing to dockerhub
16:05:10 <inc0> agree ^
16:05:14 <inc0> formalize and automate
16:05:37 <inc0> #link https://bugs.launchpad.net/kolla-ansible/+bug/1669075/comments/4
16:05:37 <openstack> Launchpad bug 1669075 in kolla-ansible "kolla-ansible pull with kolla_ansible-4.0.0.0rc1 fails, because of missing tag in docker registry" [Low,Invalid]
16:05:42 <berendt> regarding automate: i can add this to our jenkins instance
16:05:59 <pbourke> #link https://wiki.openstack.org/wiki/Meetings/Kolla#Agenda_for_next_meeting_.28Mar_8th_2017.29
16:06:18 <Jeffrey4l> rc is unstable, pushing them will cause lots of issues, imo.
16:06:35 <mnaser> but wont they technically never be pulled
16:06:41 <Jeffrey4l> especially for hub.docker.com.
16:06:45 <mnaser> unless you're running an rc release of kolla-ansible/kubernetes?
16:06:58 <Jeffrey4l> but push them into tarballs.openstack.org is OK, i think.
16:07:01 <berendt> Jeffrey4l: when you visit our docs, the published master documents reference the tag 4.0.0
16:07:08 <inc0> alternatively, instead of rc
16:07:13 <berendt> because of this david opened this bug
16:07:15 <inc0> keep pushing stable/ocata
16:07:19 <pbourke> is there a reason we can't push to dockerhub along side tarballs.oo
16:07:22 <inc0> with some meaningful tag
16:07:31 <inc0> like 4.0.0-latest
16:07:44 <berendt> the master branch is only usable when building own images
16:08:00 <mnaser> i think what inc0 said makes a lot of sense, that means backports can make their way much faster
16:08:06 <Jeffrey4l> pbourke, i want to know how to keep hub.docker.com credential in ci.
16:08:26 <inc0> yeah and also fixes what egonzalez mentioned on main channel - some other project deploys critical fix
16:08:36 <Jeffrey4l> inc0, 4.0.0-latest is a good idea.
16:08:37 <inc0> we have it upstream immediately
16:08:51 <inc0> and :latest for master
16:09:04 <inc0> berendt: my question is...what jenkins instance?:)
16:09:07 <mnaser> inc0: well technically, you wouldnt, unless you manually trigger stable/<branch> (i could be wrong)?
16:09:19 <berendt> inc0:  company one
16:09:30 <berendt> not sure if we have to add it to the openstack jenkins
16:09:32 <inc0> right
16:09:35 <Jeffrey4l> berendt, re docs, sorry, i do not get your point ;(
16:10:03 <berendt> Jeffrey4l: david opened the bug because the kolla-ansible repository on the master branch is not usable without building own images
16:10:07 <inc0> so how about we will create crontab entries and keep them in our repo
16:10:14 <Jeffrey4l> mnaser, we can , there is a period pipeline in zuul.
16:10:32 <mnaser> oh cool!
16:10:52 <inc0> Jeffrey4l: really? so we can run a gate daily?
16:10:57 <Jeffrey4l> inc0, yep.
16:11:00 <inc0> or rather, job to build+push?
16:11:01 <Jeffrey4l> pretty sure.
16:11:01 <inc0> cool
16:11:10 <mnaser> yes i recall now the periodic pipeline
16:11:29 <mnaser> https://docs.openstack.org/infra/system-config/zuul.html > periodic
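
A rough, untested sketch of how a periodic build-and-push job could be wired into the zuul layout mentioned above; the project entry and job name below are made-up placeholders, not existing infra config:

projects:
  - name: openstack/kolla
    periodic:
      # hypothetical job that rebuilds images daily and pushes the rolling tags
      - periodic-kolla-build-and-push-images
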
16:11:50 <inc0> do we agree that we create tag :4.0.0-latest for daily stable ocata and :latest for daily master?
16:11:57 <Jeffrey4l> #link https://docs.openstack.org/infra/system-config/zuul.html
16:11:58 <inc0> or maybe not latest
16:12:04 <inc0> let's call it master or trunk
16:12:11 <inc0> as latest is default tag
16:12:12 <mnaser> would it be a lot more work to add newton? :X
16:12:21 <inc0> no it wouldnt
16:12:25 <inc0> we can do neutron too
16:12:36 <Jeffrey4l> neutron?
16:12:38 <mnaser> it would be quite beneficial (as ocata is still "fresh")
16:12:40 <inc0> newton
16:12:40 <mnaser> i think he means newton :-P
16:12:42 <inc0> sorry
16:12:54 <inc0> I'm still waking up;)
16:13:23 <Jeffrey4l> so i guess pushing branches is acceptable to everyone, right?
16:13:25 <inc0> #action inc0: write bp for daily gerrit jobs
16:13:31 <Jeffrey4l> tag name is not a big deal.
16:13:35 <inc0> yeah
16:13:56 <inc0> we can continue discussion in bp and as usual
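
As a sketch of the tagging scheme being discussed (a rolling 4.0.0-latest tag for stable/ocata builds, :latest for master), a hypothetical periodic playbook could simply retag and push; the image name and namespace are assumptions, and it presumes the node is already logged in to Docker Hub:

- hosts: localhost
  tasks:
    # retag the image built from stable/ocata with the rolling tag
    - name: Tag the ocata build as 4.0.0-latest
      command: docker tag kolla/centos-binary-keystone:4.0.0 kolla/centos-binary-keystone:4.0.0-latest

    - name: Push the rolling ocata tag to Docker Hub
      command: docker push kolla/centos-binary-keystone:4.0.0-latest
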
16:14:00 <pbourke> inc0: how are we going to get credentials into these jobs
16:14:03 <Jeffrey4l> another thing related to this is: auto bump the service tag in source.
16:14:10 <Jeffrey4l> pbourke, good point.
16:14:23 <inc0> pbourke: that's a good question, I'll check with infra for secret storage
16:14:31 <pbourke> inc0: cool
16:14:47 <inc0> I think they have hiera (they need to;))
16:14:55 <inc0> maybe we can somehow tap into it
16:15:00 <mnaser> they do have hiera
16:15:23 <mnaser> pypi credentials are stored in there for example
16:15:29 <Jeffrey4l> cool.
16:15:41 <Jeffrey4l> mnaser, you know lots of things about ci?
16:16:05 * mnaser has been in openstack since 2011
16:16:13 <Jeffrey4l> wow
16:16:37 <mnaser> our cloud is running newton (but it started its life off as bexar actually) -- looking to get more involved but we can get into that later :)
16:16:48 <hrw> mnaser: perfect person for '10y of openstack experience' offers
16:16:54 <Jeffrey4l> lol
16:16:59 <inc0> mnaser: so from Bexar?:0
16:17:00 <berendt> lol
16:17:09 <egonzalez> lol
16:17:09 <mnaser> http://jeffrose.wpengine.netdna-cdn.com/wp-content/uploads/2011/12/dr.-evil-million-dollar-term-policy-300x241.jpg
16:17:20 <inc0> ok let's move on
16:17:24 <berendt> we started first environment with bexar, too, funny times
16:17:36 <inc0> #topic drop root
16:17:43 <inc0> duonghq: you're up
16:17:50 <duonghq> thank inc0
16:18:05 <duonghq> I see we have 2 bugs related to the drop root topic:
16:18:26 <duonghq> #info keystone https://bugs.launchpad.net/kolla/+bug/1576794
16:18:26 <openstack> Launchpad bug 1576794 in kolla "drop root for keystone" [Critical,In progress] - Assigned to Surya Prakash Singh (confisurya)
16:18:39 <duonghq> #info crontab https://bugs.launchpad.net/kolla/+bug/1560744
16:18:39 <openstack> Launchpad bug 1560744 in kolla "drop root for crontab" [Critical,Confirmed]
16:19:10 <duonghq> for crontab, I see that sdake commented it cannot be dropped in centos, for keystone, I'm not sure
16:19:21 <spsurya_> inc0:  first thing to check: is this a valid bug, does it need to be fixed?
16:19:25 <duonghq> so if we can confirm for crontab one, I think we can close the bug
16:19:50 <spsurya_> we have pbourke's comment too for the keystone one
16:19:57 <spsurya_> that root can't be dropped
16:20:29 <inc0> well for keystone, and other apache based apis, it can't be dropped
16:20:40 <inc0> afair
16:20:52 <duonghq> pbourke, how do you think?
16:20:57 <pbourke> would be interested in what the keystone guys have to say on this
16:21:20 <pbourke> suddenly forcing root on operators is a strange decision
16:21:28 <pbourke> regardless of the benefits brought by running behind apache
16:21:37 <hrw> if it can run on a port >1024 then it should be doable without root
16:21:38 <Jeffrey4l> copy from net: Apache has to run as root initially in order to bind to port 80. If you don't run it as root initially then you cannot bind to port 80. If you want to bind to some port above 1024 then yes, you can.
16:21:38 <spsurya_> pbourke: +1
16:21:42 <Jeffrey4l> https://superuser.com/questions/316705/running-apache-as-a-different-user
16:21:59 <Jeffrey4l> all the ports we are using now are > 1024
16:22:13 <inc0> Jeffrey4l: horizon is still 80/443
16:22:17 <inc0> well 80
16:22:27 <Jeffrey4l> oh, right. horizon is.
16:22:29 <duonghq> so, we can move it to higher port, and drop root?
16:22:31 <mnaser> haproxy as well?
16:22:32 <mnaser> for the horizon backends
16:22:50 <inc0> but technically we could run horizon on a port >1024 and bind 80 on haproxy
16:23:02 <inc0> just not backwards compatible change so let's not do it
16:23:02 <spsurya_> seems like >1024 would be ok for dropping
16:23:07 <Jeffrey4l> inc0, haproxy is optional.
16:23:11 <mnaser> aio deployments might become a bit weird though ^
16:23:20 <inc0> everything is optional;)
16:23:27 <inc0> but yeah, can break stuff
16:23:28 <Jeffrey4l> kolla support run without haproxy.
16:23:32 <duonghq> mnaser, in the default setting, AIO still uses haproxy
16:23:48 <mnaser> it seems like the root requirement is there, regardless
16:23:49 <inc0> yeah, and keepalived;)
16:24:03 <mnaser> there's quite a few components which will need root at the end of the day
16:24:04 <inc0> well, either way
16:24:08 <duonghq> we still can bind port from docker side
16:24:11 <inc0> keystone shouldn't need it because of apache
16:24:11 <Jeffrey4l> so we can drop root for apache with port > 1024 , right?
16:24:32 <mnaser> yes Jeffrey4l
16:24:53 <inc0> duonghq: that's a good alternative, but we would need to drop net=host for apis
16:24:59 <inc0> which I wouldn't be opposed to
16:25:04 <hrw> at linaro we only deploy nova/neutron/cinder/glance/horizon/keystone + openvswitch + ceph iirc
16:25:24 <Jeffrey4l> hrw, so?
16:25:41 <duonghq> inc0, hmm, forgot that, one of our goals
16:25:47 <mnaser> i like net=host being there.  it makes life simple.  once you get out of it, you have to start playing around with overlay networks and you start adding a lot of complexity (imho)
16:25:58 <Jeffrey4l> there is another parameter that may be helpful to drop root: docker run --cap-add
16:26:03 <Jeffrey4l> but i am not sure.
16:26:11 <mnaser> actually thats a really good suggestion
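
For reference, a hedged sketch of the --cap-add idea: grant only the low-port capability instead of full root. This is untested here; the image and user names are illustrative, and for a process running as a non-root user the binary may additionally need file capabilities (setcap) before it can bind ports below 1024:

- name: Start horizon as a non-root user, adding only NET_BIND_SERVICE
  command: >
    docker run -d --name horizon --user horizon
    --cap-add NET_BIND_SERVICE
    kolla/centos-binary-horizon:4.0.0
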
16:26:11 <inc0> yeah, and also there were performance issues
16:26:15 <pbourke> how many people see this as high priority?
16:26:29 <Jeffrey4l> pbourke, drop root, or?
16:26:31 <mnaser> as a deployer, i dont really care about the keystone container running as root (honestly)
16:26:42 <pbourke> breaking out of a container is a theoretical exploit... meanwhile we have world readable passwords on all target nodes
16:26:58 <Jeffrey4l> btw, even though the keystone container runs as root, the keystone wsgi processes run as the keystone user.
16:27:03 <mnaser> httpd is going to be running as root in most other deployment methods in the first place and the keystone processes fork under keystone
16:27:08 <inc0> and getting into the container is arguably harder than getting root on the host
16:27:18 <inc0> as we don't run any services there besides the one we need
16:27:54 <Jeffrey4l> can i say: drop root is not a critical issue, but nice to have?
16:28:00 <mnaser> i would agree with that ^
16:28:07 <pbourke> I think so
16:28:16 <inc0> but regardless, can we examine drop root for ks as there doesn't seem to be a compelling reason why not?
16:28:30 <inc0> it's still better to remove it
16:28:37 <Jeffrey4l> so if drop-root for any container is possible, and anyone who interested in this? please implement it :)
16:28:39 <inc0> just not critical
16:28:40 <spsurya_> Jeffrey4l:  can we set its importance to medium?
16:28:43 <duonghq> sure
16:28:50 <pbourke> someone should investigate and update the docs if it's not currently feasible
16:28:56 <inc0> yeah let's make all drop root medium bugs
16:29:08 <Jeffrey4l> medium, agree.
16:29:09 <duonghq> so we drop its "importance"?
16:29:15 <duonghq> lol
16:29:19 <inc0> lol
16:29:22 <spsurya_> duonghq: lol
16:29:34 <duonghq> I'll ask sdake later when I see him
16:29:38 <duonghq> about crontab
16:29:38 <inc0> (and I bet *nobody* actually laughed out loud)
16:29:48 <spsurya_> although we need to fix it anyway
16:30:04 <spsurya_> inc0: +1
16:30:11 <duonghq> ya, alright
16:30:12 <inc0> right, let's move on
16:30:19 <spsurya_> yes, nobody
16:30:26 <inc0> #topic canonical k8s deployment
16:31:02 <inc0> so I think we don't have our canonical guys around
16:31:11 <inc0> (do we have kolla-k8s people?)
16:31:22 <Jeffrey4l> Canonical company? interesting
16:31:38 <inc0> kfox1111 around?
16:32:11 <inc0> ok it seems we don't have quorum for that, pushing to next meeting
16:32:20 <zhubingbing_> ;)
16:32:25 <inc0> #topic open discussion
16:32:42 <inc0> since we ran out of agenda items, anything needing our immediate attention?
16:32:44 <vhosakot> I'm still deploying kolla-k8s and will update docs as needed.
16:32:45 <duonghq> can I?
16:32:52 <inc0> duonghq: go ahead
16:33:12 <duonghq> forgot to add this to the agenda, I drafted a bp last week
16:33:14 <duonghq> #link https://blueprints.launchpad.net/kolla/+spec/unix-signals-handling
16:33:22 <duonghq> can you give me some comment?
16:33:47 <duonghq> hmm, where is the bot, I think bot'll put the title
16:33:52 <duonghq> Unix signals handling in Kolla image
16:33:59 <inc0> duonghq: first, we need to figure out which services allow sighup
16:34:10 <inc0> second, that won't work with COPY_ONCE
16:34:16 <berendt> duonghq: i think he doesn't because of the leading #link
16:34:31 <duonghq> berendt, roger
16:34:52 <duonghq> inc0, ya, but in COPY_ALWAYS, it'll be a nice feature to reload settings w/o downtime
16:34:54 <Jeffrey4l> duonghq, have u tried sighup. it should work with dumb-init.
16:34:54 <mnaser> also, i think this is a bit of a weird situation because not all config values are reloaded
16:34:57 <duonghq> w/o restart container
16:35:17 <mnaser> so for example oslo_log might notice the change but some other part of another component won't
16:35:33 <duonghq> Jeffrey4l, I'm not sure w/ dumb-init, just plain service, it's ok
16:35:48 <mnaser> so i think it's important to keep in mind the possible complexity this might introduce: knowing which config values will reload and which ones won't
16:35:55 <duonghq> mnaser, ya, and we also have some services that support graceful shutdown by signal
16:36:14 <mnaser> i think graceful shutdown is miles more important especially for cases like nova-compute for example
16:36:18 <Jeffrey4l> sighup should be handled properly, as long as the real service can handle it.
16:36:55 <Jeffrey4l> currently, we use sighup for haproxy configure reload.
16:37:38 <Jeffrey4l> so i think this bp is already done ;)
16:37:39 <inc0> yeah, sigkill is more important
16:37:41 <mnaser> but i think on reconfigure we'd send a signal instead of just killing the container (unless docker already does that?)
16:38:06 <duonghq> mnaser, it depends on the argument we pass to docker
16:38:11 <inc0> docker sends sigterm and then waits for a timeout (30s I believe) before force termination
16:38:15 <duonghq> the signal indeed
16:38:26 <Jeffrey4l> inc0, 10s
16:38:31 <mnaser> gotcha inc0 that's good for nova-compute
16:38:44 <mnaser> but 10 seconds might be a bit too short but i think that's another discussion
16:38:52 <Jeffrey4l> it is configurable.
16:39:00 <Jeffrey4l> for each container.
16:39:13 <mnaser> thats good to know, thanks Jeffrey4l
16:39:15 <Jeffrey4l> docker stop -t <num>
16:39:33 <inc0> but I don't believe we use this config
16:39:42 <inc0> maybe that's a good bug for kolla_docker?
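
A small sketch of what an overridable stop timeout looks like at the docker level (Jeffrey4l's docker stop -t); the two-minute value and the task wording are assumptions about what a kolla_docker option might eventually expose, not current behaviour:

- name: Stop nova_compute, allowing two minutes for in-flight work to drain
  command: docker stop -t 120 nova_compute
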
16:39:52 <duonghq> Jeffrey4l, should we figure out which services support SIGHUP to reload the whole service config, then pass the signal through to that service?
16:39:53 <portdirect> sorry to burst in late - we can also control the signals in k8s - so would be great to get kfox and sbezverk to have some input on that
16:39:57 <Jeffrey4l> kolla-ansible do not support this parameter so far.
16:40:24 <inc0> portdirect: yeah k8s is better in this space
16:40:33 <Jeffrey4l> duonghq, not all parameters support SIGHUP, just part of them, iirc.
16:40:56 <duonghq> Jeffrey4l, is it a docker-py issue, a docker issue, or our issue?
16:41:05 <inc0> our issue
16:41:21 <Jeffrey4l> wait 1 min. which issue are u talking?
16:41:22 <inc0> well, we don't allow overriding the 10s
16:41:31 <inc0> that's it
16:41:52 <mnaser> i think a summary of what inc0 is saying is overriding the docker kill timeout for containers
16:42:11 <mnaser> (aka the time period from when it sends a signal to stop until it forcibly terminates the container)
16:42:12 <Jeffrey4l> 1. kolla containers support sighup, it is passed to the real process    2. the container is killed after 10s if it hasn't stopped.
16:42:50 <inc0> and for 2 - let's add this config so we can extend the period for services like n-cpu or heat
16:43:04 <Jeffrey4l> inc0, ++
16:43:11 <duonghq> Jeffrey4l, just to be sure, do we already support passing SIGHUP to the container?
16:43:27 <mnaser> as you're using dumb-init i believe it should happen automagically
16:43:31 <duonghq> inc0, +1
16:43:32 <Jeffrey4l> duonghq, yep. with dumb-init, SIGHUP is handled properly.
16:43:39 <duonghq> mnaser, Jeffrey4l roger
16:43:46 <mnaser> i have a few things to bring up if we're done with this
16:43:47 <Jeffrey4l> you can try it simply.
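
To "try it simply" as suggested: dumb-init runs as PID 1 and forwards the signal to the wrapped service, so a quick reload test could look like the task below. The container name matches what kolla-ansible deploys for haproxy, but treat this as an illustration rather than the project's reconfigure path:

- name: Send SIGHUP to haproxy so it reloads its configuration
  command: docker kill --signal=HUP haproxy
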
16:43:53 <duonghq> iirc, we're planning to move to another init
16:44:00 <inc0> yeah, correct me if I'm wrong but we don't *really* use sighup during reconfigure
16:44:01 <duonghq> tini?
16:44:03 <Jeffrey4l> but another thing is: not all parameters in nova.conf support SIGHUP.
16:44:08 <duonghq> inc0,  yup
16:44:14 <duonghq> Jeffrey4l, of course
16:44:22 <Jeffrey4l> inc0, for haproxy, yes. others no.
16:44:29 <duonghq> it's as mnaser said: it makes things go weird
16:44:39 <inc0> question is, is it a big deal really
16:44:53 <duonghq> i.e. all oslo log support it,
16:44:54 <Jeffrey4l> it is impossible, imo.
16:45:21 <inc0> at least very hard
16:45:28 <Jeffrey4l> we do not know which parameter changed, so we cannot know whether we should restart or sighup.
16:45:33 <Jeffrey4l> so it is impossible.
16:45:44 <inc0> right
16:45:48 <mnaser> i think if you want to revise the bp duonghq you would maybe look into merge_configs to notice what changed
16:45:50 <duonghq> but for glance, it supports SIGHUP for all config
16:45:51 <inc0> safer to do full restart
16:46:11 <duonghq> I mean, in time, maybe more services will support this kind of reconfiguration
16:46:14 <mnaser> and then maybe if SIGHUP becomes "the way to go" long term, you'd easily be able to do that
16:46:45 <Jeffrey4l> if a service announces it supports SIGHUP for all config, i think we can implement this.
16:46:46 <duonghq> so, for services that don't support it yet, we can ignore that,
16:46:58 <duonghq> we can have some kind of fully supported list
16:47:00 <mnaser> just on a deployer perspective
16:47:03 <inc0> duonghq: but if we introduce 2 different modes of reload
16:47:06 <inc0> that's complexity
16:47:14 <mnaser> i would much rather have a full restart
16:47:22 <mnaser> i doubt SIGHUP reloads have undergone heavy testing
16:47:29 <Jeffrey4l> COPY_ONCE is another big concern when using SIGHUP.
16:47:38 <Jeffrey4l> mnaser, ++
16:47:40 <duonghq> inc0, sure,
16:47:44 <mnaser> deploy X change, send SIGHUP, make sure everything is working: that's probably not something that's tested
16:47:54 <Jeffrey4l> in most cases, restart is not a big deal.
16:48:19 <duonghq> ok
16:48:25 <inc0> another question
16:48:31 <mnaser> if it is a big deal then you have multiple controllers and serial will do controlled restarts so you should be okay
16:48:32 <inc0> different topic
16:48:39 <inc0> draining of connections on haproxy
16:48:44 <inc0> during upgrade
16:48:49 <Jeffrey4l> restart means: kill the process and start it again,   reload/sighup means recreate the inner class/object again.
16:49:22 <Jeffrey4l> inc0, at that point, we should support rolling upgrade first.
16:49:37 <inc0> right...
16:49:40 <mnaser> instead of draining connections, i think shutting all services down and letting haproxy return 502 is an acceptable thing
16:49:42 <inc0> any ideas about that btw
16:49:44 <inc0> ?
16:49:46 <Jeffrey4l> otherwise the remaining connections won't work.
16:49:49 <duonghq> about draining connections on haproxy, iirc, egonzalez has a solution
16:50:00 <Jeffrey4l> mnaser, i like you.
16:50:03 <Jeffrey4l> lolo
16:50:08 <egonzalez> inc0, yep, ansible support setting a haproxy backend as maintenance mode
16:50:28 <inc0> yay.. but it doesn't support the serial way we need it ;)
16:50:29 <duonghq> we can drain connections then upgrade the node, so there appears to be no downtime at that point
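
The ansible haproxy module egonzalez refers to talks to the haproxy admin socket and can put a backend server into maintenance before that node is upgraded. A minimal sketch; the backend name, host name and socket path are assumptions about a particular deployment:

- name: Drain one keystone backend before upgrading it
  haproxy:
    state: disabled
    host: controller01
    backend: keystone_internal
    socket: /var/lib/kolla/haproxy/haproxy.sock
    wait: yes
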
16:50:43 <Jeffrey4l> serial is not rolling upgrade. we talked about this
16:50:59 <inc0> ok, anyway, rolling upgrade
16:51:05 <inc0> that's what I meant
16:51:28 <mnaser> i would: pull all new images, shutdown all $service containers, run db syncs, start all $service containers.  naturally, during the time of this happening, haproxy will be giving back 502s
16:51:29 <duonghq> about rolling upgrade, graceful shutdown is important to achieve that
16:51:49 <mnaser> for rolling upgrades, here's what i'd throw on the table: add multiple steps to it (or maybe even multiple kolla-ansible steps)
16:51:59 <mnaser> step #1, upgrade control plane (this happens with no serial)
16:52:01 <Jeffrey4l> mnaser, shutdown all service means shutdown haproxy.
16:52:11 <mnaser> nope, shut down a specific service, ex: glance
16:52:21 <Jeffrey4l> got.
16:52:34 <mnaser> step #2, upgrade data plane (this happens with say, 20% serial or whatever)
16:52:35 <Jeffrey4l> duonghq, what does graceful shutdown mean?
16:52:46 <mnaser> as part of step #1, you'd set upgrade_levels on the controllers too
16:52:51 <inc0> yeah, we thought of 2 different plays
16:52:58 <mnaser> and then the final step would be, remove all upgrade_levels and restart $service
16:53:11 <duonghq> glance (for example) has glance-control to coordinate its microservices, we have not supported that
16:53:21 <inc0> hmm, upgrade playbook can call 3 plays with serial
16:53:29 <duonghq> Jeffrey4l, we send some signal to container, it'll drain connection by itself
16:53:35 <inc0> 1 - upgrade control, no serial, set upgrade_levels
16:53:42 <inc0> 2- upgrade compute, with serial
16:53:52 <inc0> 3 - remove controller upgrade_levels
16:54:16 <mnaser> ideally id like to see those split (and one that is combined).  we usually prefer to upgrade control plane and make sure everything is a-ok
16:54:25 <Jeffrey4l> do all services support upgrade_levels?
16:54:34 <mnaser> the large scale ones do (aka neutron+nova)
16:54:58 <mnaser> the rest i dont really know but they're so lightweight that it's not as big of a deal
16:55:04 <mnaser> most people dont have 300 heat-engine instances for example
16:55:14 <Jeffrey4l> yep.
16:56:07 <inc0> separating upgrade to multiple plays - I really like that
16:56:08 <Jeffrey4l> draining connections is trying to reduce the downtime in #1
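
A skeleton of the three-play split inc0 outlines above, purely as a hedged sketch: the host group names, role names and the 20% batch size are placeholders, and setting and later removing nova's upgrade_levels pin would live inside the respective roles:

# play 1: upgrade the control plane all at once, pinning RPC via upgrade_levels
- name: Upgrade control plane
  hosts: control
  roles:
    - nova-control-upgrade

# play 2: upgrade computes in small batches
- name: Upgrade compute nodes
  hosts: compute
  serial: "20%"
  roles:
    - nova-compute-upgrade

# play 3: drop upgrade_levels again and restart control services
- name: Unpin RPC versions
  hosts: control
  roles:
    - nova-unpin-rpc
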
16:56:15 <mnaser> i have few things
16:56:19 <mnaser> before the end if people dont mind
16:56:19 <inc0> I'd do it after we make upgrade gates really
16:56:23 <Jeffrey4l> two topics we are having.
16:56:34 <duonghq> Jeffrey4l, minute, does dumb-init support passing SIGKILL to the process? in general, every signal?
16:56:36 <Jeffrey4l> mnaser, please.
16:56:41 <mnaser> https://review.openstack.org/#/c/445690/ keystone-ssh is broken
16:56:43 <duonghq> inc0, +1 for your 3 plays
16:56:43 <Jeffrey4l> duonghq, yep.
16:56:47 <mnaser> multinode rotation of fernet tokens doesnt work
16:56:49 <duonghq> Jeffrey4l, cool
16:56:57 <mnaser> if people can give some love to that review, it would be wonderful
16:57:05 <mnaser> ill backport afterwards
16:57:39 <Jeffrey4l> duonghq, dumb-init works like systemd.
16:57:46 <mnaser> and as a closer for next time maybe, i want to float the idea of using bindep when installing from source to avoid problems like this - https://review.openstack.org/#/c/446032/
16:58:01 <Jeffrey4l> +2ed
16:58:08 <duonghq> Jeffrey4l, ok, I'll experiment that, thanks
16:58:28 <inc0> ok, we're running out of time
16:58:35 <inc0> thank you all for coming
16:58:41 <duonghq> thanks
16:58:41 <inc0> #endmeeting kolla