15:00:08 <mgoddard> #startmeeting kolla
15:00:09 <openstack> Meeting started Wed May 19 15:00:08 2021 UTC and is due to finish in 60 minutes. The chair is mgoddard. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:13 <openstack> The meeting name has been set to 'kolla'
15:00:15 <mgoddard> #topic rollcall
15:00:22 <yoctozepto> \o/
15:00:43 <yoctozepto> \_o_/
15:00:55 <yoctozepto> /_o_\
15:01:00 <mgoddard> dob
15:01:02 <hrw> /-o-\
15:01:06 <yoctozepto> no, my arms broke
15:01:51 <headphoneJames> o/
15:02:11 <wuchunyang> o
15:03:42 <mgoddard> #topic agenda
15:03:54 <mgoddard> * Roll-call
15:03:56 <mgoddard> * Agenda
15:03:58 <mgoddard> * Announcements
15:04:00 <mgoddard> * Review action items from the last meeting
15:04:02 <mgoddard> * CI status
15:04:04 <mgoddard> * Wallaby release planning
15:04:06 <mgoddard> ** Debian bullseye
15:04:08 <mgoddard> ** chrony
15:04:10 <mgoddard> * Xena cycle planning
15:04:12 <mgoddard> ** master branch life cycle
15:04:14 <mgoddard> * Open discussion
15:04:16 <mgoddard> #topic announcements
15:04:18 <mgoddard> I have none. Anyone else?
15:04:49 <hrw> nope
15:05:12 <openstackgerrit> Merged openstack/kayobe-config stable/wallaby: Synchronize kayobe-config https://review.opendev.org/c/openstack/kayobe-config/+/792109
15:05:15 <mgoddard> #topic Review action items from the last meeting
15:05:31 <mgoddard> mgoddard email openstack-discuss about quay.io credentials
15:05:33 <mgoddard> mgoddard draft proposal for new release process
15:05:35 <mgoddard> no
15:05:37 <mgoddard> yes
15:05:41 <mgoddard> #action mgoddard email openstack-discuss about quay.io credentials
15:05:47 <mgoddard> #topic CI status
15:05:47 <yoctozepto> lol, that email is hard
15:06:20 <yoctozepto> don't get me started on CI
15:06:24 <yoctozepto> of course it's RED
15:06:31 <yoctozepto> but we are patching :-)
15:06:46 <mgoddard> [all distros] [docker SDK] docker py 5.0.0 removed six but imports it and fails prechecks/any functionality
15:07:00 <mgoddard> fix merged to master, backporting in progress
15:07:13 <yoctozepto> on that note
15:07:23 <yoctozepto> we are trying to move to distro-provided sdk
15:07:36 <yoctozepto> it should be good enough as long as it is provided
15:07:39 <yoctozepto> :-)
15:07:40 <hrw> +2
15:07:43 <mgoddard> well
15:07:47 <mgoddard> do we need it?
15:07:54 <mgoddard> now we have an upper limit on docker
15:08:02 <yoctozepto> nicer than pip-installing
15:08:17 <mgoddard> what happens if we need a feature from a newer version?
15:08:18 <yoctozepto> the crowds will love you - right, kevko?
15:08:21 <mgoddard> or if RDO drops
15:08:32 <yoctozepto> then we drop centos support
15:09:03 <yoctozepto> well, the newer versions seem be lacking behind in feature support anyhow
15:09:22 <yoctozepto> that's why I mentioned we are not really losing anything but using an older version
15:09:29 <mgoddard> ok, well let's not rush into it
15:09:42 <yoctozepto> for cgroups namespace I am directly modifying the API query contents
15:10:56 <yoctozepto> ok, let's move on
15:10:58 <mgoddard> ok, let's move on. Lots to get through
15:11:01 <yoctozepto> more interesting stuff ahead
15:11:19 <mgoddard> #topic Debian bullseye
15:11:34 <mgoddard> who wants to give a status update?
15:11:43 <yoctozepto> the bulls' only eye has arrived at the green station: https://review.opendev.org/c/openstack/kolla-ansible/+/790678
15:11:50 <hrw> yay!
15:11:58 <yoctozepto> now checking without machined
15:12:00 <yoctozepto> we'll see
15:12:14 <yoctozepto> anyhow, we have a working solution *with* cgroups v2
15:12:21 <yoctozepto> which is awesome
15:12:27 <yoctozepto> the other issue is with the upgrade
15:12:31 <yoctozepto> ovs 2.15
15:12:37 <yoctozepto> just like in UCA before we pinned it
15:12:41 <yoctozepto> mnasiadka debugging this
15:13:10 <yoctozepto> no idea about the current progress, mnasiadka silent recently
15:13:17 <yoctozepto> perhaps ovs broke his internets
15:13:52 <yoctozepto> this is worse on debian because we can't pin any "older version"
15:13:58 <yoctozepto> as this is from base bullseye
15:14:28 <yoctozepto> also, with current knowledge, expect breakage when rdo moves to ovs 2.15
15:14:29 <yoctozepto> ;d
15:14:49 <mgoddard> could we use a buster repo?
15:14:59 <yoctozepto> I don't know, hrw, wdyt?
15:15:10 <yoctozepto> the other side to that isssue is that
15:15:13 <yoctozepto> it's not 100%
15:15:18 <yoctozepto> it may succeed
15:15:23 <yoctozepto> and it always succeeds in multinode
15:15:33 <hrw> mgoddard: shouldn't
15:15:34 <yoctozepto> perhaps ovs likes to have friends to get to work ;d
15:15:47 <yoctozepto> and it is 80% sad when alon
15:16:01 <yoctozepto> (the estimation is made up)
15:16:10 <yoctozepto> alone*
15:16:31 <mgoddard> have we spoken to debian openstack team about it?
15:17:36 <mgoddard> or ubuntu even
15:17:53 <yoctozepto> me not
15:17:55 <yoctozepto> hrw, mnasiadka?
15:18:10 <mgoddard> probably should
15:18:26 <hrw> not me
15:18:33 <hrw> I had to dig in non kolla stuff
15:18:47 <mgoddard> or ask on openstack-discuss
15:19:01 <yoctozepto> well, one can argue nobody runs ovs on singlenode because it does not make much sense ;d
15:20:18 <yoctozepto> I asked zigo on #debian-openstack now
15:20:23 <yoctozepto> (on oftc)
15:20:58 <mgoddard> ok
15:21:06 <mgoddard> can we go back to debian & libvirt
15:21:42 <mgoddard> https://docs.docker.com/engine/api/version-history/ suggests the CgroupnsMode parameter is API v1.41
15:22:27 <mgoddard> anyone know how to map that to a docker version?
15:23:07 <mgoddard> 20.10.0
15:23:08 <yoctozepto> 20.10
15:23:10 <mgoddard> https://docs.docker.com/engine/release-notes/
15:23:13 <yoctozepto> i mentioned this above
15:23:19 <yoctozepto> before the meeting
15:23:20 <yoctozepto> :-)
15:23:29 <yoctozepto> they decided to change the default
15:23:32 <yoctozepto> for fun
15:23:35 <mgoddard> I was elsewhere
15:23:38 <yoctozepto> and added a knob to change it back
15:23:47 <yoctozepto> so much for backward compat
15:24:04 <yoctozepto> well, at least they default to "more secure"
15:24:04 <mgoddard> people in glass houses etc.
15:24:23 <yoctozepto> I know
15:24:38 <yoctozepto> I like to rant
15:25:10 <yoctozepto> since it's going to happen on bullseye
15:25:17 <yoctozepto> we can condition it on being on bullseye
15:25:33 <yoctozepto> and then we relax as more platforms move to cgroups v2
15:25:36 <mgoddard> just so I'm clear
15:25:43 <mgoddard> this default changed in 20.10
15:25:50 <mgoddard> but only affects cgroups v2
15:25:54 <yoctozepto> indeed
15:26:09 <mgoddard> which so far only debian bullseye uses out of our supported platforms
15:26:14 <yoctozepto> exactly
15:26:16 <mgoddard> k
15:26:26 <yoctozepto> and 20.10 is the only to support cgroups v2 out of the box
15:26:37 <yoctozepto> so older ones are supposed to fail / be unsupported
15:27:43 <mgoddard> so we need to either set 20.10 / 1.41 as our minimum supported version, or have some check for cgroups v2 in the module
15:28:11 <yoctozepto> I say we use this know only on bullseye
15:28:18 <yoctozepto> that is easiest ;d
15:28:26 <yoctozepto> knob*
15:28:28 <hrw> 20.10 is available for each of our host distros
15:28:37 <hrw> so we can depend 20.10 for wallaby+
15:29:05 <mgoddard> current min docker version is 1.10 :)
15:29:15 <yoctozepto> I would not discriminate people running non-upstream docker ;d
15:29:26 <mgoddard> well, just because it's available, doesn't mean its in use
15:29:31 <yoctozepto> well, we can bump to 18.09
15:29:41 <yoctozepto> as that seems to work
15:29:46 <mgoddard> does it?
15:29:47 <yoctozepto> 1.10 is scary
15:29:50 <mgoddard> +1
15:30:16 <yoctozepto> yes, though checked for sure on train/ussuri
15:30:17 <hrw> https://www.docker.com/blog/docker-1-10/ - 4th Feb 2016
15:30:22 <hrw> time to upgrade indeed
15:30:27 <mgoddard> but I thought the new parameter was added in 20.10?
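
The alternative raised at 15:27:43 (a check for cgroups v2 in the module, with the knob passed only where it is needed and supported) could be sketched roughly as below. This is an illustrative sketch and not the kolla-ansible implementation: the function names `host_uses_cgroups_v2` and `build_host_config` are hypothetical, the check assumes the cgroup hierarchy is mounted at `/sys/fs/cgroup`, and the value `host` is an assumption about which setting restores the pre-20.10 behaviour. `CgroupnsMode` is the HostConfig field from Docker Engine API v1.41 mentioned in the discussion.

```python
import os


def host_uses_cgroups_v2(cgroup_root="/sys/fs/cgroup"):
    """Return True if the unified (v2) cgroup hierarchy is mounted at cgroup_root.

    The root of a cgroups v2 hierarchy exposes a `cgroup.controllers` file;
    a cgroups v1 mount point does not.
    """
    return os.path.isfile(os.path.join(cgroup_root, "cgroup.controllers"))


def build_host_config(base, cgroups_v2, api_version):
    """Return a copy of `base`, adding CgroupnsMode only when needed and supported.

    `api_version` is a (major, minor) tuple for the Docker Engine API.
    The "host" value is an assumption made for this sketch.
    """
    host_config = dict(base)
    if cgroups_v2 and api_version >= (1, 41):
        host_config["CgroupnsMode"] = "host"
    return host_config
```

Keeping the decision in a pure function leaves older daemons and cgroups v1 hosts untouched, which matches the intent of limiting the knob to the one platform that needs it.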
15:30:47 <yoctozepto> yes, that's why we limit it to bullseye which requires both 20.10 and the knob
15:30:54 <yoctozepto> and we have it dealt with
15:31:42 <mgoddard> ok
15:31:55 <yoctozepto> btw, it's passing without machined
15:31:56 <mgoddard> can we rely on distros not to switch to cgroupsv2?
15:32:05 <yoctozepto> I will check libvirt logs if no oddiness happened and refactor
15:32:27 <yoctozepto> oh, I am pretty sure they will not
15:32:45 <yoctozepto> neither would risk pissing off enterprise users
15:33:09 <mgoddard> ok
15:33:13 <mgoddard> added some notes to the patch
15:33:22 <hrw> ubuntu 22.04 will be cg2
15:33:28 <hrw> similar with centos 9
15:33:40 <yoctozepto> yes
15:33:52 <yoctozepto> I meant in their current versions
15:33:55 <mgoddard> we'll have a release that supports a migration
15:34:00 <yoctozepto> I hope I was not misread
15:34:02 <mgoddard> I guess we can handle that when we get to it
15:34:11 <yoctozepto> yeas
15:34:11 <hrw> cs9 will land in Xena or Yeti (I forgot dates)
15:34:11 <mgoddard> Let's move on
15:34:27 <mgoddard> #topic chrony
15:34:36 <yoctozepto> noo
15:34:38 <yoctozepto> one more thing
15:34:44 <mgoddard> #undo
15:34:45 <openstack> Removing item from minutes: #topic chrony
15:34:51 <yoctozepto> so Wallaby and Debian
15:35:04 <yoctozepto> is this the release where we support both Buster and Bullseye, right?
15:35:07 <yoctozepto> (on host)
15:35:14 <yoctozepto> (as the images are simply bullseye)
15:35:14 <hrw> yeah
15:35:26 <hrw> both can have same docker version but buster is cg1
15:35:29 <yoctozepto> ok; should we test both in CI then? I would
15:35:40 <hrw> good point
15:36:13 * yoctozepto is moving to debian-based setup and would love good CI coverage
15:36:32 <hrw> yay
15:36:51 <mgoddard> here's what we said for ubuntu in victoria
15:36:54 <mgoddard> The Victoria release adds support for Ubuntu Focal 20.04 as a host operating system. Ubuntu users upgrading from Ussuri should first upgrade OpenStack containers to Victoria, which uses the Ubuntu Focal 20.04 base container image. Hosts should then be upgraded to Ubuntu Focal 20.04.
15:36:59 <mgoddard> (from https://docs.openstack.org/kolla-ansible/latest/user/operating-kolla.html)
15:37:17 <mgoddard> I don't know if anyone ever tested it :)
15:37:33 <hrw> ;P
15:37:50 <mgoddard> so if we assume the same approach for debian
15:37:57 <mgoddard> we provide bullseye based containers in wallaby
15:38:07 <yoctozepto> yes, and it looks worky
15:38:08 <mgoddard> and support both buster and bullseye hosts
15:38:12 <yoctozepto> it looked worky with ubuntu too
15:38:24 <yoctozepto> but we did not test it in CI
15:38:52 <yoctozepto> I need to check it
15:39:01 <mgoddard> buster host, victoria/buster containers -> buster host, wallaby/bullseye containers -> bullseye host, wallaby/bullseye containers
15:39:19 <hrw> yeah
15:39:26 <mgoddard> so our upgrade jobs should use buster in wallaby
15:39:36 <mgoddard> and our host OS checks should allow both
15:40:31 <yoctozepto> yeah, we have done it for ubuntu
15:40:36 <yoctozepto> using bionic in upgrade
15:40:38 <yoctozepto> and focal in others
15:40:45 <yoctozepto> so let's do the same here
15:40:51 <yoctozepto> buster in upgrade
15:40:57 <yoctozepto> bullseye in others
15:41:01 <mgoddard> I'll add add notes to the patch
15:41:14 <yoctozepto> (there is only one but I might throw more in Xena)
15:41:24 <yoctozepto> thanks
15:42:32 <mgoddard> can I chrony yet?
15:43:00 <mgoddard> #topic chrony
15:43:11 <mgoddard> #link https://review.opendev.org/c/openstack/kolla-ansible/+/792119
15:43:30 <mgoddard> wallaby deprecates chrony, and disables it by default
15:43:40 <mgoddard> therefore we should clean up the container, if disabled
15:44:05 <mgoddard> but, how do we do this cleanly without losing time sync?
15:44:31 <yoctozepto> good q
15:44:51 <yoctozepto> well, if there was chrony container to remove
15:44:55 <yoctozepto> and it worked correctly
15:44:56 <hrw> how do we handle it on fresh installs?
15:45:01 <yoctozepto> then we are very likely breaking it
15:45:29 <yoctozepto> can we do it like this
15:45:38 <yoctozepto> if we do upgrade
15:45:43 <yoctozepto> and chrony is disabled
15:45:54 <yoctozepto> but it was enabled (i.e., the playbook sees containers to go down)
15:46:04 <yoctozepto> we pause the playbook
15:46:12 <yoctozepto> and wait for user to acknowledge this
15:46:22 <yoctozepto> we can have a variable to skip this acknowledgment
15:46:32 <yoctozepto> (to support automated users who read renos)
15:46:43 <yoctozepto> and also we will not pause if no containers exist
15:46:55 <yoctozepto> this way we target the right people
15:47:22 <mgoddard> or people who run it twice :)
15:47:50 <yoctozepto> twice? I excluded those
15:48:08 <mgoddard> we have a time sync precheck
15:48:20 <yoctozepto> it's used in different situations
15:48:22 <mgoddard> could we use that?
15:48:25 <yoctozepto> and it would succeed right after
15:48:37 <yoctozepto> we just need to let users *know for sure*
15:48:44 <mgoddard> how long would it take to not succeed?
15:48:44 <yoctozepto> also, I meant this pause ~> https://docs.ansible.com/ansible/latest/collections/ansible/builtin/pause_module.html
15:49:13 <yoctozepto> mgoddard: I think it depends on kernel observing the clock stability
15:49:40 <yoctozepto> I noticed it being set as unsync after 24h
15:49:52 <yoctozepto> no, we will not add a wait for this ;-)
15:50:40 <yoctozepto> well, the fun fact is
15:50:48 <yoctozepto> the host would have worky ntp
15:50:59 <yoctozepto> if not for kolla-ansible which broke it on purpose to get chrony on board :-)
15:51:14 <mgoddard> how about this
15:51:18 <yoctozepto> yes
15:51:30 <mgoddard> check systemd for known ntp daemons
15:51:52 <mgoddard> add a flag to override, aka acknowledge the change
15:52:18 <mgoddard> provide a command/playbook to cleanup chrony before upgrade
15:52:43 <mgoddard> so ideal workflow would be
15:52:54 <mgoddard> kolla-ansible cleanup_chrony
15:52:56 <yoctozepto> 1) kill_my_chrony
15:52:58 <mgoddard> kolla-ansible prechecks
15:53:01 <mgoddard> kolla-ansible upgrade
15:53:12 <mgoddard> but, for those who ignore renos
15:53:18 <mgoddard> kolla-ansible upgrade
15:53:36 <mgoddard> will check for ntp daemons before cleaning up chrony
15:53:41 <yoctozepto> well, we can always teach people a lesson to read renos
15:54:09 <mgoddard> my clients tend not to like it if I teach them a lesson...
15:54:31 <yoctozepto> but are not you the one doing their upgrades?
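
The upgrade-time check proposed from 15:51:30 onwards (look for a known NTP daemon on the host before removing the chrony container, with a flag to acknowledge the change) could be sketched roughly as below. Everything here is an assumption for illustration: the function name `chrony_cleanup_decision`, the `acknowledged` flag, the returned strings, and the list of systemd unit names are not taken from kolla-ansible.

```python
# Assumed list of host time-sync units; adjust per distribution.
KNOWN_NTP_UNITS = (
    "chronyd.service",
    "chrony.service",
    "ntpd.service",
    "ntp.service",
    "systemd-timesyncd.service",
)


def chrony_cleanup_decision(chrony_container_exists, active_units, acknowledged=False):
    """Decide what an upgrade should do about a leftover chrony container.

    Returns "skip" when there is no container to remove, "cleanup" when
    removal is safe (a host NTP daemon is active) or explicitly
    acknowledged, and "fail" when removal would leave the host without
    time synchronisation.
    """
    if not chrony_container_exists:
        return "skip"
    host_ntp_active = any(unit in active_units for unit in KNOWN_NTP_UNITS)
    if acknowledged or host_ntp_active:
        return "cleanup"
    return "fail"
```

Only deployments that still have the container and no host daemon are stopped, so fresh installs and operators who already ran the cleanup step are not interrupted.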
15:54:36 <yoctozepto> (or someone else from stackhpc)
15:54:43 <mgoddard> not always
15:55:02 <yoctozepto> well, I think upgrades are the pinnacle of openstack support
15:55:12 <yoctozepto> so they should rethink their attitude
15:55:25 <yoctozepto> but I get you
15:55:44 <mgoddard> I will pass on your message :D
15:55:46 <mgoddard> anyway
15:55:58 <mgoddard> needs more thought, but we have some ideas
15:56:09 <mgoddard> #topic master branch life cycle
15:56:18 <mgoddard> #link https://etherpad.opendev.org/p/kolla-release-process-draft
15:56:22 <mgoddard> did anyone read it?
15:56:27 <yoctozepto> I didn't have time to think about time frame
15:56:31 <yoctozepto> but I read it
15:56:58 <hrw> I did
15:57:00 <hrw> commented even
15:59:12 <yoctozepto> my biggest concern is
15:59:18 <yoctozepto> in this simplistic view
15:59:28 <yoctozepto> we lose the ability to e.g. test bifrost master
16:00:04 <hrw> nope
16:00:08 <yoctozepto> or perhaps not
16:00:19 <yoctozepto> because I think I switch the reference forcibly
16:00:26 <mgoddard> and bifrost master now has a job that uses wallaby?
16:00:28 <hrw> R+9 is when we use master source instead of stable/previous
16:00:59 <yoctozepto> mgoddard: no, just realised the code is replacing the reference with what is in the change
16:01:06 <mgoddard> R+9 is next week
16:01:10 <yoctozepto> so I was just confusing myself and you
16:01:23 <yoctozepto> yeah, the timeframe is to be discussed really
16:01:30 <yoctozepto> and for the meeting as well
16:01:34 <openstackgerrit> Merged openstack/kolla-ansible stable/wallaby: baremetal: Install Docker SDK less than 5.0.0 https://review.opendev.org/c/openstack/kolla-ansible/+/792110
16:01:35 <yoctozepto> as we missed its timeframe
16:01:53 <yoctozepto> thanks mgoddard
16:02:00 <mgoddard> bifrost job may use master, but we will not be testing that, so who knows if it works?
16:02:51 <yoctozepto> mgoddard: I would say differently: we may break the job on bifrost queue now and not know it
16:03:41 <mgoddard> or they may require a change on our side, but we cannot test it
16:04:20 <yoctozepto> we can test it "once"
16:04:31 <yoctozepto> but then it reverts back to stable for subsequent runs
16:04:44 <mgoddard> I'll put the draft onto openstack-discuss
16:05:01 <mgoddard> and also announce end of feature freeze, which should have happened some time ago
16:05:16 <mgoddard> thanks all
16:05:23 <mgoddard> #endmeeting