16:00:27 <Jeffrey4l> #startmeeting kolla
16:00:27 <openstack> Meeting started Wed May 30 16:00:27 2018 UTC and is due to finish in 60 minutes. The chair is Jeffrey4l. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:30 <openstack> The meeting name has been set to 'kolla'
16:00:35 <Jeffrey4l> #topic rollcall
16:00:41 <bmace> o/
16:00:43 <gema> o/
16:00:45 <yankcrime> o/
16:00:52 <caoyuan> o/
16:01:17 <duonghq> o/
16:01:27 <Chason_> o/
16:01:49 <Jeffrey4l> #topic Announcements
16:02:04 <Jeffrey4l> no news from my. any news from community?
16:03:17 <Jeffrey4l> ok. move on
16:03:25 <ktibi_> o/
16:03:39 <Jeffrey4l> #topic vancouver summit
16:03:49 <Jeffrey4l> the summit is done.
16:04:05 <Jeffrey4l> since i am absent the summit.
16:04:18 <Jeffrey4l> there is not much thing from me about this.
16:04:39 <Jeffrey4l> but from the kolla ops feedback session, seem it is not bad
16:04:44 * Jeffrey4l is find the etherpad link
16:05:09 <Jeffrey4l> here it is
16:05:11 <Jeffrey4l> #link https://etherpad.openstack.org/p/kolla-rocky-ops-and-user-feedback
16:05:50 <Jeffrey4l> There are several production env deployed by kolla
16:05:53 <Jeffrey4l> good to know
16:06:10 <Jeffrey4l> they also mentioned the kolla cli part
16:06:45 <Jeffrey4l> seems it is attractive
16:08:45 <Jeffrey4l> feel free to take the concern by the operator, and try to implement or fix it in kolla :D
16:08:59 <Jeffrey4l> and thanks pbourke and spsurya to hold this
16:09:03 <gema> what would a kolla client do?
16:09:42 <bmace> right now the client does a lot of things so you don't have to hand edit files, like updating properties, and manipulating the inventory
16:09:42 <Jeffrey4l> there is no detail info on the etherpad.
16:10:00 <Jeffrey4l> bug i guss just like bmace's kolla-cli does
16:10:08 <Jeffrey4l> but*
16:10:20 <yankcrime> kolla-cli is here: https://github.com/openstack/kolla-cli
16:10:29 <bmace> well, it isn't mine, it is the communities now :)
16:10:43 <Jeffrey4l> yeah, definitely.
16:10:44 <gema> thanks
16:11:01 <yankcrime> and kayobe is here: https://github.com/openstack/kayobe
16:11:29 <Jeffrey4l> they also talked the check and diff mode.
16:11:41 <Jeffrey4l> kolla do not support this.
16:11:55 <Jeffrey4l> But please review this https://review.openstack.org/568422, which trying to fix the gap
16:12:00 <bmace> right, i need to add some blueprints but have some ideas on how to support this, and config versioning in the cli
16:13:27 <Jeffrey4l> upgrade is another one important area we should improve. including upgrade from other deploy envrionment.
16:13:47 <Jeffrey4l> for the latter one, any guys know who did this successfully?
16:14:19 <gema> improve how?
16:14:25 <gema> what are we not doing?
16:14:37 <Jeffrey4l> docs about how to upgrade.
16:14:43 <gema> Jeffrey4l: we have a pending brownfield upgrade
16:14:57 <gema> still in the process of deploying from scratch one of the clouds
16:15:04 <Jeffrey4l> great.
16:15:20 <gema> is there a procedure to do upgrades?
16:15:31 <gema> as in, an expected / defined way to do it?
16:15:34 <ktibi> For now kolla upgrade just container but maybe operator need upgrade of system too like docker version, kernel, ... supported by kolla ?
16:15:41 <Jeffrey4l> iirc dreamhost tried to upgrade to kolla from brownfield. But don't know the result.
16:15:58 <gema> Jeffrey4l:ok, we'll try to document it
16:16:05 <gema> we are also going from newton to queens
16:16:13 <gema> so it'll be rocky :P
16:16:17 <Jeffrey4l> gema, there is no special. just need get new images. and kolla-ansible. then update the globals.yml, and try upgrade.
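The steps Jeffrey4l lists at 16:16:17 (new images, new kolla-ansible, update globals.yml, run the upgrade) can be sketched as below. This is only an illustration, not an official procedure: `upgrade_plan` is a hypothetical helper, the inventory name is a placeholder, and it assumes the default config path /etc/kolla/globals.yml with the `openstack_release` key. It builds the commands without running them, so they can be reviewed first.

```python
def upgrade_plan(inventory, release):
    """Build (but do not run) the steps for a kolla-ansible upgrade.

    Hypothetical helper for illustration. It assumes the new
    kolla-ansible package is already installed on the deploy host.
    """
    # Value to set by hand in /etc/kolla/globals.yml before upgrading.
    globals_change = {"openstack_release": release}
    commands = [
        # Fetch the new images onto the hosts in the inventory.
        ["kolla-ansible", "-i", inventory, "pull"],
        # Run the upgrade playbooks against the same inventory.
        ["kolla-ansible", "-i", inventory, "upgrade"],
    ]
    return globals_change, commands
```

Calling `upgrade_plan("multinode", "queens")` returns the globals.yml change and the two command lines in the order they would be run; database backups, discussed later in the meeting, are not covered here.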
16:16:29 <gema> Jeffrey4l: ack, so that is documented
16:16:37 <gema> we just need to document the brownfield one
16:16:40 <Jeffrey4l> i tried upgrade from newton to ocata / ocata to pike, which works fine.
16:16:51 <gema> Jeffrey4l: we cannot do such upgrade
16:16:53 <gema> we have to jump
16:16:59 <gema> no useful images on ocata or pike for us
16:17:06 <Jeffrey4l> ktibi, yes.
16:17:17 <Jeffrey4l> ffu?
16:17:22 <gema> yes
16:17:28 <Jeffrey4l> it should work. but i haven't tried it.
16:17:33 <gema> we'll let you know
16:17:35 <ktibi> and maybe need to add a rollback docs too ?
16:17:46 <Jeffrey4l> is every openstack service support this?
16:17:54 <Jeffrey4l> ktibi, rollback is hard..
16:17:58 <yankcrime> not all openstack services (currently) support ffu
16:18:07 <yankcrime> some have done historically i.e keystone
16:18:09 <Jeffrey4l> the db can not be downgrade, unless you backup it.
16:18:19 <yankcrime> that's true in pretty much all cases
16:18:21 <ktibi> Jeffrey4l, if an upgrade fail ??
16:18:29 <gema> no worries, we'll do backups
16:18:34 <yankcrime> rollback is database restore and then restart old versions of containers
16:18:37 <gema> that's the only thing I know for sure we need to do
16:18:40 <gema> before starting
16:18:45 <gema> that and announcing an outage of a week :D
16:18:56 <Jeffrey4l> kolla should implement the ffu for upgrade process.
16:19:04 <yankcrime> fwiw we've done upgrades from o -> p -> q without much drama
16:19:13 <yankcrime> and also run mixed releases of services
16:19:21 <gema> Jeffrey4l: there are a lot of conversations ongoing about ffus upstream
16:19:24 <yankcrime> i.e magnum and keystone on q, rest on p
16:19:27 <gema> I have been to a few in the different PTGs
16:19:30 <Jeffrey4l> yankcrime, yeah, that should work. need be documented too.
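For the newton to queens jump gema describes, a small sketch of which releases lie on the path may help. It assumes a fast-forward upgrade still has to step through each intermediate release (for example for database migrations), which the meeting itself does not confirm for every service; `RELEASES` and `releases_to_step_through` are hypothetical names introduced for illustration.

```python
# OpenStack release order around the time of this meeting.
RELEASES = ["newton", "ocata", "pike", "queens", "rocky"]


def releases_to_step_through(current, target):
    """Return each release after `current`, up to and including `target`."""
    start = RELEASES.index(current)
    end = RELEASES.index(target)
    if end <= start:
        raise ValueError("target must be newer than current")
    return RELEASES[start + 1 : end + 1]
```

For `releases_to_step_through("newton", "queens")` the result is ocata, pike, queens, which matches the o -> p -> q sequence yankcrime reports having done one release at a time.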
16:20:01 <Jeffrey4l> i think one ffu is done, openstack could have a kind of lts version ;D
16:20:03 <yankcrime> Jeffrey4l: with the right testing infrastructure in place the process would almost be self-documenting...
16:20:27 <duonghq> ktibi, there is no service support downgrade yet,
16:20:46 <duonghq> regarding ffu, in general, OpenStack services do not support this
16:20:56 <duonghq> at least for configuration migration
16:20:57 <Jeffrey4l> yankcrime, think about downgrade ceph.. it may impossible for some service.
16:21:00 <ktibi> duonghq, yes of course, but with a backup of DB you can rollback no ?
16:21:10 <yankcrime> Jeffrey4l: i'm thinking about only openstack services
16:21:20 <duonghq> operator need to do it by theirself, db is ok
16:21:23 <yankcrime> with things like ceph, yes - almost certainly impossible for any sizeable deployment
16:21:52 <duonghq> ktibi, and you'll lost some data if you do it in zero-downtime/rolling upgrade/minimal downtime manner
16:22:25 <Jeffrey4l> for openstack service , it should be OK to downgrade with a db backups.
16:22:26 <ktibi> Jeffrey4l, yes so maybe somes service like ceph or elasticsearch need to have their own upgrade procedure. uncorrelated with upgrade of openstack service no ?
16:22:30 <duonghq> offline upgrade maybe just fine
16:22:36 <yankcrime> if you test you'll also fine you can run newer versions of database schema for some services with older version of application
16:22:39 <yankcrime> *find
16:22:40 <Jeffrey4l> But it will be full of tricks and hacks.
16:22:47 <gema> ktibi: yes
16:23:37 <ktibi> duonghq, i prefer to have rollback instead of zero-downtime
16:24:10 <duonghq> ktibi, the requirement is different case-by-case
16:24:15 <Jeffrey4l> if we only talk about openstack service, the upgrade may not be failure. the only issue may happen the db upgrade failed. but i think it wil be fixed easy.
16:25:02 <ktibi> Jeffrey4l, hum, If you upgrade but you include a new bug ?
16:26:13 <Jeffrey4l> i would like fix the bug rather than roll back.
16:26:20 <Jeffrey4l> roll back is more dangoure.
16:27:27 <ktibi> Jeffrey4l, yes but you think like a dev, not like a ops ^^
16:27:27 <Jeffrey4l> for this point, i think we could provide docs to explain how to backup, and how to rollback step by step. even we could provide some shell scripts to do such thing.
16:28:07 <Jeffrey4l> but i do not think add a command like "kolla-ansible rollback" is a good idea.
16:28:23 <Jeffrey4l> i am a devops :D
16:28:42 <ktibi> A ops who update during the night, start tempest after upgrade (success) and have 90 fails, you think he can fix the bug ? :p
16:29:07 <ktibi> no just a doc I think :)
16:29:12 <yankcrime> on the subject of backups, i did look at adding that functionality a while back - a new container image for percona-xtrabackup and an additional command to kolla-ansible that creates a new volume containing backup of mariadb
16:29:56 <yankcrime> there's a blueprint for the feature somewhere
16:30:38 <yankcrime> it doesn't handle backup lifecycle though, so no scheduling and no xfer of backup to any other target
16:30:40 <Jeffrey4l> yankcrime, why a new volume? how about just cp -r volumes/mariadb?
16:31:17 <Jeffrey4l> ktibi, that just my thoughts. if you could provide a good solution for "kolla-ansible rollback", that will be cool.
16:31:24 <yankcrime> Jeffrey4l: without stopping the service?
16:31:55 <Jeffrey4l> yankcrime, you are saying backup mariadb periodic, right?
16:32:10 <ktibi> Jeffrey4l, I think DB backup restore + restore old image can be good. But need to stop all actions on your plateform during the upgrade.
16:32:13 <yankcrime> Jeffrey4l: ad-hoc
16:32:52 <Jeffrey4l> yankcrime, kolla have a cron container, which we could reuse. ad-hoc backup is nice to have
16:33:04 <Jeffrey4l> this is also talked during ptg.
16:33:09 <yankcrime> Jeffrey4l: you won't get a consistent backup if you just copy the volume
16:33:49 <Jeffrey4l> sorry, i understood in wrong. yes, need percona-xtrabackup to do adhoc backup.
16:34:59 <Jeffrey4l> ktibi, yes. and need care about some service like mq / db / ceph etc
16:35:48 <ktibi> Jeffrey4l, yes, it's almost obligatory to create a upgrade procedure just for service like ceph....
16:39:20 <Jeffrey4l> yeah.. so feel free to improve this ;)
16:40:02 <Jeffrey4l> could we move on? any thing about the summit?
16:40:18 <yankcrime> for the backup stuff, as there's a few moving parts, what's the best way of getting something up for review? a mini spec?
16:40:45 <Jeffrey4l> yeah, a mini spec is a good start.
16:40:49 <yankcrime> ok, thanks
16:41:22 <Jeffrey4l> ok. let us move on
16:41:40 <Jeffrey4l> #topic open discussion
16:41:52 <Jeffrey4l> any volunteer?
16:42:25 <bmace> i would just like to quickly mention that Steve Noyes has done some great work to improve the cli testing, and it is at a point where if other people want to get involved, they are most welcome
16:42:48 <bmace> right now the vagrant based kolla-ansible dev environment also included the cli by default
16:43:09 <bmace> right now i am adding some python scripting so that rather than following the README process to get it working you can just run the script.
16:43:23 <Jeffrey4l> great. thanks bmace and Steve
16:43:52 <bmace> so far no non-cli cores seem to look at it at all, so Mark and I have been having to force through reviews, which is non optimal.. i am hoping eventually other cores might give the pending changes there a look occasionally
16:44:34 <bmace> that is it from me. if anyone tries it and has questions or comments they are all welcome!
16:45:02 <Jeffrey4l> thanks bmace
16:45:14 <ktibi> I have question ;)
16:45:18 <Jeffrey4l> hope the develop clould keep eyes on this .
16:45:24 <Jeffrey4l> ktibi, please
16:45:37 <ktibi> As everyone knows, redhat now uses kolla. Redhat has added a lot of new features like SSL on the internal network. RedHat uses a special docker version. For the moment kolla uses 1.12.6 which is not at all up to date. I think we should make an effort on the docker version deployed by kolla. What do you think ?
16:46:00 <Jeffrey4l> fyi, they using 1.13.x now.
16:46:38 <Jeffrey4l> ktibi, yeah, i tried to do this
16:46:39 <Jeffrey4l> check https://review.openstack.org/533337
16:46:43 <ktibi> yes 1.13 is a fork of CE. I don't really know how their docker works ><
16:47:32 <ktibi> but kolla need to be compatible with
16:47:32 <ktibi> docker-ce no ?
16:47:49 <Jeffrey4l> i guess they do not wanna to follow docker-ce. and once oci is mature, they may move to it
16:47:57 <ktibi> and not with the redhat version ?
16:48:16 <Jeffrey4l> above patch trying to work with redhat version..
16:48:22 <Jeffrey4l> even though it is not merged ;(
16:49:07 <Jeffrey4l> technially, kolla works with docker 1.12.0
16:49:11 <ktibi> Jeffrey4l, yep :) I try to works too but I think we need to take a decision. For centos we use docker-ce or redhat fork ?
16:49:29 <Jeffrey4l> just redhat version made a small change on the origial docker.
16:49:44 <Jeffrey4l> ktibi, in the gate, we use docker-ce.
16:50:01 <ktibi> Jeffrey4l, not really, live-restore no support, secret are very different, ...
16:50:09 <Jeffrey4l> i think this is a question for operator.
16:50:28 <Jeffrey4l> kolla do not dependes on live restore and secret( need to be disabled on redhat version )
16:51:06 <ktibi> yes it was just for say the fork have a lot of diff with CE
16:51:40 <Jeffrey4l> ktibi, so we are trying to works on both.
16:52:20 <Jeffrey4l> as far as which one operator wanna to use, this is a choice for him :)
16:52:46 <ktibi> Jeffrey4l, ok :)
16:53:17 <ktibi> so need to upgrade for gate or keep 1.12 ?
16:53:49 <Jeffrey4l> for gate, we will try to use latest docker version.
16:54:23 <ktibi> ok great, I am working on 17.12.1
16:54:55 <Jeffrey4l> cool. 17.09 is most used in my company :D
16:56:17 <Jeffrey4l> ok. time is almost up.
16:56:22 <Jeffrey4l> let us end the meeting.
16:56:31 <Jeffrey4l> thank every for comming
16:56:41 <Jeffrey4l> have a good day/night
16:56:44 <Jeffrey4l> #endmeeting