#openstack-meeting-4 log

16:00:27 <Jeffrey4l> #startmeeting  kolla
16:00:27 <openstack> Meeting started Wed May 30 16:00:27 2018 UTC and is due to finish in 60 minutes.  The chair is Jeffrey4l. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:30 <openstack> The meeting name has been set to 'kolla'
16:00:35 <Jeffrey4l> #topic rollcall
16:00:41 <bmace> o/
16:00:43 <gema> o/
16:00:45 <yankcrime> o/
16:00:52 <caoyuan> o/
16:01:17 <duonghq> o/
16:01:27 <Chason_> o/
16:01:49 <Jeffrey4l> #topic Announcements
16:02:04 <Jeffrey4l> no news from me. any news from the community?
16:03:17 <Jeffrey4l> ok. move on
16:03:25 <ktibi_> o/
16:03:39 <Jeffrey4l> #topic vancouver summit
16:03:49 <Jeffrey4l> the summit is done.
16:04:05 <Jeffrey4l> since i was absent from the summit,
16:04:18 <Jeffrey4l> there is not much from me about this.
16:04:39 <Jeffrey4l> but from the kolla ops feedback session, it seems it was not bad
16:04:44 * Jeffrey4l is finding the etherpad link
16:05:09 <Jeffrey4l> here it is
16:05:11 <Jeffrey4l> #link https://etherpad.openstack.org/p/kolla-rocky-ops-and-user-feedback
16:05:50 <Jeffrey4l> There are several production envs deployed by kolla
16:05:53 <Jeffrey4l> good to know
16:06:10 <Jeffrey4l> they also mentioned the kolla cli part
16:06:45 <Jeffrey4l> seems it is attractive
16:08:45 <Jeffrey4l> feel free to pick up the concerns raised by the operators, and try to implement or fix them in kolla :D
16:08:59 <Jeffrey4l> and thanks to pbourke and spsurya for holding this
16:09:03 <gema> what would a kolla client do?
16:09:42 <bmace> right now the client does a lot of things so you don't have to hand edit files, like updating properties, and manipulating the inventory
16:09:42 <Jeffrey4l> there is no detail info on the etherpad.
16:10:00 <Jeffrey4l> but i guess just like bmace's kolla-cli does
16:10:20 <yankcrime> kolla-cli is here: https://github.com/openstack/kolla-cli
16:10:29 <bmace> well, it isn't mine, it is the community's now :)
16:10:43 <Jeffrey4l> yeah, definitely.
16:10:44 <gema> thanks
16:11:01 <yankcrime> and kayobe is here: https://github.com/openstack/kayobe
16:11:29 <Jeffrey4l> they also talked about check and diff mode.
16:11:41 <Jeffrey4l> kolla does not support this.
16:11:55 <Jeffrey4l> But please review this https://review.openstack.org/568422, which is trying to close the gap
16:12:00 <bmace> right, i need to add some blueprints but have some ideas on how to support this, and config versioning in the cli
16:13:27 <Jeffrey4l> upgrade is another important area we should improve, including upgrade from other deployment environments.
16:13:47 <Jeffrey4l> for the latter, does anyone know who has done this successfully?
16:14:19 <gema> improve how?
16:14:25 <gema> what are we not doing?
16:14:37 <Jeffrey4l> docs about how to upgrade.
16:14:43 <gema> Jeffrey4l: we have a pending brownfield upgrade
16:14:57 <gema> still in the process of deploying from scratch one of the clouds
16:15:04 <Jeffrey4l> great.
16:15:20 <gema> is there a procedure to do upgrades?
16:15:31 <gema> as in, an expected / defined way to do it?
16:15:34 <ktibi> For now kolla upgrades just the containers, but maybe operators need to upgrade the system too, like docker version, kernel, ... should that be supported by kolla ?
16:15:41 <Jeffrey4l> iirc dreamhost tried to upgrade to kolla from brownfield, but i don't know the result.
16:15:58 <gema> Jeffrey4l: ok, we'll try to document it
16:16:05 <gema> we are also going from newton to queens
16:16:13 <gema> so it'll be rocky :P
16:16:17 <Jeffrey4l> gema, there is nothing special. you just need to get the new images and kolla-ansible, then update globals.yml, and try the upgrade.
16:16:29 <gema> Jeffrey4l: ack, so that is documented
16:16:37 <gema> we just need to document the brownfield one
16:16:40 <Jeffrey4l> i tried upgrading from newton to ocata / ocata to pike, which worked fine.
16:16:51 <gema> Jeffrey4l: we cannot do such an upgrade
16:16:53 <gema> we have to jump
16:16:59 <gema> no useful images on ocata or pike for us
16:17:06 <Jeffrey4l> ktibi, yes.
16:17:17 <Jeffrey4l> ffu?
16:17:22 <gema> yes
16:17:28 <Jeffrey4l> it should work. but i haven't tried it.
16:17:33 <gema> we'll let you know
16:17:35 <ktibi> and maybe we need to add rollback docs too ?
16:17:46 <Jeffrey4l> does every openstack service support this?
16:17:54 <Jeffrey4l> ktibi, rollback is hard..
16:17:58 <yankcrime> not all openstack services (currently) support ffu
16:18:07 <yankcrime> some have historically, e.g. keystone
16:18:09 <Jeffrey4l> the db cannot be downgraded, unless you back it up.
16:18:19 <yankcrime> that's true in pretty much all cases
16:18:21 <ktibi> Jeffrey4l, and if an upgrade fails ??
16:18:29 <gema> no worries, we'll do backups
16:18:34 <yankcrime> rollback is database restore and then restart old versions of containers
16:18:37 <gema> that's the only thing I know for sure we need to do
16:18:40 <gema> before starting
16:18:45 <gema> that and announcing an outage of a week :D
16:18:56 <Jeffrey4l> kolla should implement the ffu for upgrade process.
16:19:04 <yankcrime> fwiw we've done upgrades from o -> p -> q without much drama
16:19:13 <yankcrime> and also run mixed releases of services
16:19:21 <gema> Jeffrey4l: there are a lot of conversations ongoing about ffus upstream
16:19:24 <yankcrime> i.e magnum and keystone on q, rest on p
16:19:27 <gema> I have been to a few in the different PTGs
16:19:30 <Jeffrey4l> yankcrime, yeah, that should work. it needs to be documented too.
16:20:01 <Jeffrey4l> i think once ffu is done, openstack could have a kind of lts version ;D
16:20:03 <yankcrime> Jeffrey4l: with the right testing infrastructure in place the process would almost be self-documenting...
16:20:27 <duonghq> ktibi, there is no service that supports downgrade yet,
16:20:46 <duonghq> regarding ffu, in general, OpenStack services do not support this
16:20:56 <duonghq> at least for configuration migration
16:20:57 <Jeffrey4l> yankcrime, think about downgrading ceph.. it may be impossible for some services.
16:21:00 <ktibi> duonghq, yes of course, but with a backup of the DB you can roll back, no ?
16:21:10 <yankcrime> Jeffrey4l: i'm thinking about only openstack services
16:21:20 <duonghq> operators need to do it by themselves, db is ok
16:21:23 <yankcrime> with things like ceph, yes - almost certainly impossible for any sizeable deployment
16:21:52 <duonghq> ktibi, and you'll lose some data if you do it in a zero-downtime/rolling upgrade/minimal downtime manner
16:22:25 <Jeffrey4l> for openstack services, it should be OK to downgrade with a db backup.
16:22:26 <ktibi> Jeffrey4l, yes, so maybe some services like ceph or elasticsearch need to have their own upgrade procedure, uncorrelated with the upgrade of openstack services, no ?
16:22:30 <duonghq> offline upgrade may be just fine
16:22:36 <yankcrime> if you test you'll also find you can run newer versions of the database schema for some services with an older version of the application
16:22:40 <Jeffrey4l> But it will be full of tricks and hacks.
16:22:47 <gema> ktibi: yes
16:23:37 <ktibi> duonghq, i prefer to have rollback instead of zero-downtime
16:24:10 <duonghq> ktibi, the requirement is different case-by-case
16:24:15 <Jeffrey4l> if we only talk about openstack services, the upgrade is unlikely to fail. the only issue that may happen is a failed db upgrade, but i think that would be easy to fix.
16:25:02 <ktibi> Jeffrey4l, hmm, what if you upgrade but it introduces a new bug ?
16:26:13 <Jeffrey4l> i would rather fix the bug than roll back.
16:26:20 <Jeffrey4l> roll back is more dangerous.
16:27:27 <ktibi> Jeffrey4l, yes but you think like a dev, not like an ops ^^
16:27:27 <Jeffrey4l> for this point, i think we could provide docs to explain how to back up, and how to roll back step by step. we could even provide some shell scripts to do this.
16:28:07 <Jeffrey4l> but i do not think adding a command like "kolla-ansible rollback" is a good idea.
16:28:23 <Jeffrey4l> i am a devops :D
16:28:42 <ktibi> An ops who upgrades during the night, starts tempest after the upgrade (success) and gets 90 failures, you think he can fix the bug ? :p
16:29:07 <ktibi> no just a doc I think :)
16:29:12 <yankcrime> on the subject of backups, i did look at adding that functionality a while back - a new container image for percona-xtrabackup and an additional command to kolla-ansible that creates a new volume containing backup of mariadb
16:29:56 <yankcrime> there's a blueprint for the feature somewhere
16:30:38 <yankcrime> it doesn't handle backup lifecycle though, so no scheduling and no xfer of backup to any other target
16:30:40 <Jeffrey4l> yankcrime, why a new volume? how about just cp -r volumes/mariadb?
16:31:17 <Jeffrey4l> ktibi, those are just my thoughts. if you could provide a good solution for "kolla-ansible rollback", that will be cool.
16:31:24 <yankcrime> Jeffrey4l: without stopping the service?
16:31:55 <Jeffrey4l> yankcrime, you mean backing up mariadb periodically, right?
16:32:10 <ktibi> Jeffrey4l, I think DB backup restore + restoring the old images can be good. But you need to stop all actions on your platform during the upgrade.
16:32:13 <yankcrime> Jeffrey4l: ad-hoc
16:32:52 <Jeffrey4l> yankcrime, kolla has a cron container, which we could reuse. ad-hoc backup is nice to have
16:33:04 <Jeffrey4l> this was also discussed during the ptg.
16:33:09 <yankcrime> Jeffrey4l: you won't get a consistent backup if you just copy the volume
16:33:49 <Jeffrey4l> sorry, i misunderstood. yes, we need percona-xtrabackup to do ad-hoc backups.
16:34:59 <Jeffrey4l> ktibi, yes. and we need to take care with some services like mq / db / ceph etc
16:35:48 <ktibi> Jeffrey4l, yes, it's almost obligatory to create an upgrade procedure just for services like ceph....
16:39:20 <Jeffrey4l> yeah.. so feel free to improve this ;)
16:40:02 <Jeffrey4l> could we move on? anything else about the summit?
16:40:18 <yankcrime> for the backup stuff, as there's a few moving parts, what's the best way of getting something up for review?  a mini spec?
16:40:45 <Jeffrey4l> yeah, a mini spec is a good start.
16:40:49 <yankcrime> ok, thanks
16:41:22 <Jeffrey4l> ok. let us move on
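
The upgrade and rollback steps discussed in the topic above can be sketched in Python. This is a minimal sketch under stated assumptions, not a kolla feature: it only builds the command lines and the step order mentioned in the discussion and runs nothing. The inventory name "multinode" is a hypothetical placeholder, using "kolla-ansible pull" to get the new images is an assumption (images could also be built locally), and globals.yml is assumed to have been updated by hand beforehand.

```python
from typing import List


def upgrade_commands(inventory: str) -> List[List[str]]:
    """Return the kolla-ansible invocations for an in-place upgrade.

    Assumes globals.yml already points at the new release and that the
    new kolla-ansible version is installed, as described in the meeting.
    """
    base = ["kolla-ansible", "-i", inventory]
    return [
        base + ["pull"],     # assumption: fetch the new images from a registry
        base + ["upgrade"],  # then run the upgrade itself
    ]


def rollback_steps() -> List[str]:
    """Manual rollback order as described in the meeting: database first."""
    return [
        "restore the database from the pre-upgrade backup",
        "restart the old versions of the containers",
    ]


if __name__ == "__main__":
    # "multinode" is a hypothetical inventory file name.
    for cmd in upgrade_commands("multinode"):
        print(" ".join(cmd))
    for step in rollback_steps():
        print(step)
```

The rollback is kept as a list of manual steps rather than a command, in line with the view in the discussion that a documented procedure is preferable to a "kolla-ansible rollback" command.
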
16:41:40 <Jeffrey4l> #topic open discussion
16:41:52 <Jeffrey4l> any volunteer?
16:42:25 <bmace> i would just like to quickly mention that Steve Noyes has done some great work to improve the cli testing, and it is at a point where if other people want to get involved, they are most welcome
16:42:48 <bmace> right now the vagrant based kolla-ansible dev environment also includes the cli by default
16:43:09 <bmace> right now i am adding some python scripting so that rather than following the README process to get it working you can just run the script.
16:43:23 <Jeffrey4l> great. thanks bmace and Steve
16:43:52 <bmace> so far no non-cli cores seem to look at it at all, so Mark and I have been having to force through reviews, which is non optimal.. i am hoping eventually other cores might give the pending changes there a look occasionally
16:44:34 <bmace> that is it from me.  if anyone tries it and has questions or comments they are all welcome!
16:45:02 <Jeffrey4l> thanks bmace
16:45:14 <ktibi> I have question ;)
16:45:18 <Jeffrey4l> hope the developers could keep an eye on this.
16:45:24 <Jeffrey4l> ktibi, please
16:45:37 <ktibi> As everyone knows, redhat now uses kolla. Redhat has added a lot of new features like SSL on the internal network. RedHat uses a special docker version. For the moment kolla uses 1.12.6 which is not at all up to date. I think we should make an effort on the docker version deployed by kolla. What do you think ?
16:46:00 <Jeffrey4l> fyi, they are using 1.13.x now.
16:46:38 <Jeffrey4l> ktibi, yeah, i tried to do this
16:46:39 <Jeffrey4l> check https://review.openstack.org/533337
16:46:43 <ktibi> yes 1.13 is a fork of CE. I don't really know how their docker works  ><
16:47:32 <ktibi> but kolla needs to be compatible with docker-ce, no ?
16:47:49 <Jeffrey4l> i guess they do not want to follow docker-ce. and once oci is mature, they may move to it
16:47:57 <ktibi> and not with the redhat version ?
16:48:16 <Jeffrey4l> the above patch is trying to work with the redhat version..
16:48:22 <Jeffrey4l> even though it is not merged ;(
16:49:07 <Jeffrey4l> technically, kolla works with docker 1.12.0
16:49:11 <ktibi> Jeffrey4l, yep :) I am trying to make it work too, but I think we need to make a decision. For centos, do we use docker-ce or the redhat fork ?
16:49:29 <Jeffrey4l> the redhat version just made small changes to the original docker.
16:49:44 <Jeffrey4l> ktibi, in the gate, we use docker-ce.
16:50:01 <ktibi> Jeffrey4l, not really, live-restore is not supported, secrets are very different, ...
16:50:09 <Jeffrey4l> i think this is a question for the operator.
16:50:28 <Jeffrey4l> kolla does not depend on live restore and secrets (need to be disabled on the redhat version)
16:51:06 <ktibi> yes, it was just to say the fork has a lot of differences from CE
16:51:40 <Jeffrey4l> ktibi, so we are trying to work on both.
16:52:20 <Jeffrey4l> as for which one the operator wants to use, that is the operator's choice :)
16:52:46 <ktibi> Jeffrey4l, ok :)
16:53:17 <ktibi> so do we need to upgrade for the gate or keep 1.12 ?
16:53:49 <Jeffrey4l> for the gate, we will try to use the latest docker version.
16:54:23 <ktibi> ok great, I am working on 17.12.1
16:54:55 <Jeffrey4l> cool. 17.09 is most used in my company :D
16:56:17 <Jeffrey4l> ok. time is almost up.
16:56:22 <Jeffrey4l> let us end the meeting.
16:56:31 <Jeffrey4l> thanks everyone for coming
16:56:41 <Jeffrey4l> have a good day/night
16:56:44 <Jeffrey4l> #endmeeting