15:00:11 <mnasiadka> #startmeeting kolla
15:00:11 <opendevmeet> Meeting started Wed Oct 27 15:00:11 2021 UTC and is due to finish in 60 minutes. The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:11 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:11 <opendevmeet> The meeting name has been set to 'kolla'
15:00:16 <frickler> mnasiadka: where do you keep that ping list? I'd like to add myself
15:00:18 <mnasiadka> #topic rollcall
15:00:21 <priteau> o/
15:00:23 <mgoddard> \o
15:00:30 <mnasiadka> frickler: it's currently at https://wiki.openstack.org/wiki/Meetings/Kolla
15:00:41 <mnasiadka> o/
15:00:42 <frickler> mnasiadka: thx
15:00:53 <hrw> [°][_]
15:02:53 <yoctozepto> o/
15:03:05 <mnasiadka> #topic agenda
15:03:06 <em_> My management network (provider based vswitch), which is used to also tunnel the (non dvr) ovn network, has an mtu of 1400. When adding the provider network, i used a mtu of 1400 - but that seems not to be enough. Do i need to set something with kolla in general? I have found https://docs.openstack.org/neutron/queens/admin/config-mtu.html but not sure that applies here
15:03:18 <yoctozepto> mnasiadka: need to move it to the whiteboard (the official ping list)
15:03:28 <mnasiadka> em_: we have a weekly meeting now - please wait until it ends (around 1hr)
15:03:31 <mnasiadka> yoctozepto: yup
15:03:40 <em_> (oh sorry, was not aware about the irc based meeting, will shut up. Excuse me)
15:03:41 <mnasiadka> * Roll-call
15:03:41 <mnasiadka> * Agenda
15:03:41 <mnasiadka> * Announcements
15:03:41 <mnasiadka> * Review action items from the last meeting
15:03:41 <mnasiadka> * CI status
15:03:43 <mnasiadka> * Release tasks
15:03:43 <mnasiadka> * Yoga cycle planning
15:03:45 <mnasiadka> * Open discussion
15:04:07 <mnasiadka> #topic Announcements
15:04:24 <mnasiadka> I have none - anyone anything?
15:04:38 <mgoddard> RC!?
15:04:40 <mgoddard> RC1?
15:04:52 <mgoddard> PTG?
15:04:55 <yoctozepto> RC!!!!!1111oneoneoneeleven
15:04:55 <opendevreview> wu.chunyang proposed openstack/kolla-ansible master: Fix wrong opts in cyborg.conf https://review.opendev.org/c/openstack/kolla-ansible/+/815672
15:05:10 <mnasiadka> Ah, right - RC1 for Kolla, Kolla-Ansible and Kayobe has been cut.
15:05:19 <mgoddard> #info
15:05:26 <mnasiadka> #info RC1 for Kolla, Kolla-Ansible and Kayobe has been cut.
15:06:19 <mnasiadka> Ok then, let's move on I guess - unless somebody else wants to announce anything?
15:07:03 <mnasiadka> #topic Review action items from the last meeting
15:07:06 <mnasiadka> Seems there were none.
15:07:22 <mnasiadka> #topic CI Status
15:07:25 <mnasiadka> Are we green?
15:08:32 <mnasiadka> Seems we are - based on the whiteboard.
15:09:16 <mnasiadka> #topic Release tasks
15:09:36 <mnasiadka> So, do we have a list of blockers for doing RC2?
15:09:48 <mnasiadka> I think all MariaDB related patches have been merged?
15:10:43 <mnasiadka> yoctozepto: ?
15:11:19 <yoctozepto> yeah, I think so
15:11:30 <yoctozepto> any release tasks still to do?
15:11:45 <yoctozepto> centos-openstack-release done?
15:12:39 <mgoddard> the gerrit dashboards seem a bit broken
15:12:44 <mgoddard> no project filtering
15:12:58 <yoctozepto> mayhaps we need also https://review.opendev.org/c/openstack/kolla-ansible/+/814276
15:12:59 <yoctozepto> for mariadb
15:13:57 <frickler> mgoddard: for dashboards you need to remove the /#/ from the path
15:14:01 <opendevreview> Uwe Grawert proposed openstack/kolla-ansible master: [Grafana] Add unified alerting and smtp options https://review.opendev.org/c/openstack/kolla-ansible/+/815694
15:14:03 <frickler> or reload
15:14:39 <frickler> see https://gerrit-review.googlesource.com/c/gerrit/+/321535
15:14:39 <mgoddard> frickler: thanks
15:14:47 <hrw> yoctozepto: c-r-o-xena exists
15:15:01 <mnasiadka> and we use it
15:15:23 <yoctozepto> ok
15:15:30 <yoctozepto> so only that mariadb patch
15:15:37 <yoctozepto> but I'm not sure what the impact is
15:15:44 <yoctozepto> perhaps it garbles the config
15:15:47 <mnasiadka> around gerrit dashboards - I see master branch in stable branch backports section of Kolla dashboard, so maybe we need to revisit those and check what's going on
15:16:38 <priteau> About CI status, we are amber on wallaby for Kayobe (just updated the board)
15:16:47 <priteau> It's caused by frequent disk full issues
15:17:02 <priteau> The wallaby images must be a bit bigger than other releases
15:17:25 <priteau> We have a proposed workaround which is to disable heat from CI upgrade jobs
15:18:22 <mnasiadka> Ok, I think one of the changes to make it better for Wallaby is merging/merged today
15:18:41 <mnasiadka> yoctozepto: I don't see a bug report in that patch, so can't really tell we should wait to get it reviewed and merged.
15:18:58 <mnasiadka> So, should we post RC2 for Kolla/Kolla-Ansible/Kayobe?
15:20:24 <mgoddard> https://review.opendev.org/c/openstack/kolla-ansible/+/814942
15:20:35 <yoctozepto> mnasiadka: it's in the reno
15:20:58 <yoctozepto> mgoddard: good catch
15:21:04 <mnasiadka> yoctozepto: what about closes-bug?
15:21:19 <yoctozepto> mnasiadka: yeah, you can comment that on it
15:21:31 <yoctozepto> but the bug report is not satisfactory tbh
15:21:48 <yoctozepto> it's like writing "it doesn't work."
15:22:47 <hrw> mnasiadka: let https://review.opendev.org/c/openstack/kolla/+/815440 merge and then RC2?
15:23:22 <mnasiadka> ok, so two changes are +w and we need to wait for them to merge
15:23:29 <mgoddard> if we're going to merge this revert then let's do it before release https://review.opendev.org/c/openstack/kolla-ansible/+/814949
15:23:52 <mgoddard> (discuss)
15:24:50 <opendevreview> Merged openstack/kayobe stable/wallaby: Set proxy option in early dnf invocation https://review.opendev.org/c/openstack/kayobe/+/814658
15:24:50 <mnasiadka> mgoddard: I think you owe us some more description and reason ;-)
15:25:28 <mgoddard> I would say the same about the original patch :)
15:25:30 <mnasiadka> I added enable_host_ntp and cinder-volume fix as RC2 blockers in the whiteboard (L297)
15:25:55 <mnasiadka> Well, the original patch stated everything works, and CI didn't explode - so I'd like to know what does it break ;-)
15:26:46 <yoctozepto> mgoddard: like mnasiadka said - it was well described :-)
15:27:09 <priteau> If we are issuing RC2 for Kayobe we may want to merge https://review.opendev.org/c/openstack/kayobe/+/812687 in xena
15:27:34 <yoctozepto> I don't mind reverting if it really breaks something; but then again we should probably rewrite it to work differently as it does not make sense to use it with ovs native firewall and ovn
15:28:17 <mnasiadka> priteau: especially that the comment says "since Xena"... ;-)
15:28:35 <mgoddard> I don't see where neutron is loading that module
15:29:18 <mgoddard> I do see neutron will print a warning if it is not loaded
15:30:52 <priteau> I think br_netfilter can be loaded by docker
15:31:00 <mgoddard> if it uses iptables
15:31:09 <mnasiadka> we default to disable iptables now I think
15:31:14 <mgoddard> I can try to justify, but not in real time
15:31:31 <yoctozepto> mgoddard: did it break somewhere IRL?
15:31:43 <mgoddard> not yet
15:32:03 <mnasiadka> but that means we could have non-working SGs?
15:32:05 <yoctozepto> then try to make it break :-)
15:32:09 <yoctozepto> I tried and it works
15:32:18 <yoctozepto> hence did not bother to improve, just removed
15:32:31 <mgoddard> it jumped out as one of those patches that could bite us
15:33:04 <yoctozepto> well, at least we know how to fix it quickly
15:33:09 <mgoddard> and my gut has often been right on those in the past but I let them slide then suffer later
15:33:24 <yoctozepto> and we fix when it bites
15:33:25 <mnasiadka> So why change a tradition?
15:33:32 <mgoddard> anyway
15:33:35 <kevko> guys , is this visible in CI ?
15:33:35 <kevko> 2021-10-27 15:11:40.879 25 ERROR octavia.api.drivers.driver_factory [-] Unable to load provider driver ovn due to: No module named 'ovn_octavia_provider.common': ModuleNotFoundError: No module named 'ovn_octavia_provider.common'
15:33:36 <yoctozepto> oh well, that's it for the scientific method :D
15:33:47 <yoctozepto> kevko: guy, we are in a meeting
15:33:49 <yoctozepto> :-)
15:33:57 <kevko> oh, sorry :D
15:34:22 <priteau> I just have a freshly deployed kolla using xena branches, br_netfilter is loaded on compute hosts
15:34:40 <priteau> I can try and see what enabled it
15:34:42 <mgoddard> do you know how/when it got loaded?
15:35:28 <priteau> I don't know yet
15:35:37 <priteau> [Wed Oct 27 14:11:49 2021] Bridge firewalling registered
15:36:08 <yoctozepto> now we know when
15:36:18 <yoctozepto> your mileage may vary though
15:36:43 <priteau> But in neutron-openvswitch-agent logs:
15:36:54 <priteau> 2021-10-27 14:08:13.127 7 WARNING neutron.agent.linux.iptables_firewall [req-60c2a81f-edaa-4c34-a2ae-37017aeac72f - - - - -] Kernel module br_netfilter is not loaded.
15:36:54 <priteau> 2021-10-27 14:08:13.128 7 WARNING neutron.agent.linux.iptables_firewall [req-60c2a81f-edaa-4c34-a2ae-37017aeac72f - - - - -] Please ensure that netfilter options for bridge are enabled to provide working security groups.
15:37:26 <mnasiadka> well, so after the warning something did it ;-)
15:37:49 <hrw> this should be done by something on host before containers start
15:38:12 <hrw> otherwise we would need to have hostos == containeros
15:38:16 <mgoddard> Systems that don't override default settings for those knobs would work
15:38:19 <mgoddard> fine except for this exception in the log file and agent resync. This is
15:38:21 <mgoddard> because the first attempt to add a iptables rule using 'physdev' module
15:38:23 <mgoddard> (-m physdev) will trigger the kernel module loading. In theory, we could
15:38:25 <mgoddard> silently swallow missing knobs, and still operate correctly. But on
15:38:27 <mgoddard> second thought, it's quite fragile to rely on that implicit module
15:38:29 <mgoddard> loading. In the case where we can't detect whether firewall is enabled,
15:38:31 <mgoddard> it's better to fail than hope for the best.
15:38:37 <mgoddard> neutron commit e83a44b96a8e3cd81b7cc684ac90486b283a3507
15:39:38 <mgoddard> which I linked to in the br_netfilter patch 2 weeks ago
15:39:42 <priteau> It was loaded when I launched a VM
15:40:06 <priteau> created | 2021-10-27T14:11:45Z
15:40:18 <priteau> Loaded 4 seconds late
15:40:20 <priteau> later
15:40:49 <mgoddard> we should move on
15:41:10 <mnasiadka> Yup, what's the plan? Leave it as is since it seems to work?
15:42:56 <priteau> It works but it produces WARNING messages in logs, that's not nice
15:43:18 <priteau> What do we gain from not loading it?
15:44:01 <yoctozepto> priteau: this is for ovs native firewall and ovn to not have this oddity
15:44:12 <yoctozepto> but we can revert, it does not hurt
15:44:21 <yoctozepto> I can make a better version of it
15:44:30 <mnasiadka> well, we can tweak it to at least not enable when neutron_plugin_agent=ovn
15:44:31 <yoctozepto> at some point ;p
15:45:26 <mnasiadka> ok, let's revert for now and post a tweak
15:45:34 <yoctozepto> yeah, that makes sense
15:46:09 <mnasiadka> #agreed to revert https://review.opendev.org/c/openstack/kolla-ansible/+/814949 and post a better version (to skip loading when not required e.g. ovn)
15:46:21 <mnasiadka> #topic Yoga cycle planning
15:46:30 <mgoddard> sorry, laptop died
15:47:00 <mnasiadka> I have a draft summary mail I'm going to send to openstack-discuss after the PTG and then will populate the Whiteboard with priorities/tasks
15:47:57 <mnasiadka> I'm also planning to use Kolla Klub mailing list to get feedback around our single-distro plans for Kolla
15:47:59 <hrw> cool
15:48:15 <mnasiadka> Anything else that needs to be done?
15:48:44 <opendevreview> Radosław Piliszek proposed openstack/kolla-ansible master: Revert "Do not load br_netfilter" https://review.opendev.org/c/openstack/kolla-ansible/+/814949
15:48:44 <hrw> should we deprecate CentOS now in Xena to be able to drop it with Yoga or we deprecate in Yoga to drop in Zeus?
15:49:04 <mgoddard> I think we should wait until yoga
15:49:16 <yoctozepto> mgoddard, mnasiadka: I improved the error message there
15:49:19 <mgoddard> still a lot of uncertainty
15:49:21 <yoctozepto> sorry, the revert reason
15:49:26 <mgoddard> thanks yoctozepto
15:49:27 * yoctozepto tired lol
15:49:31 <yoctozepto> mgoddard: yw
15:49:46 <mnasiadka> Well, we need to make sure we're not going cs9
15:49:48 <hrw> yoctozepto: tired like old lamb?
15:49:54 <yoctozepto> hrw: yup
15:50:17 <mnasiadka> Does that mean we need to drop centos-binary in Yoga?
15:50:26 <hrw> mnasiadka: if RDO goes cs9 only for yoga then no binary in yoga for us
15:50:43 <hrw> deprecate & drop in one cycle
15:51:19 <mnasiadka> Ok, so we need to carry out the plan to deprecate all binary, and then if RDO goes cs9 (which they most probably will) in Yoga - we will drop that with an appropriate message?
15:51:49 <hrw> looks like
15:51:58 <mnasiadka> Ok, at least that's clear :)
15:52:16 <yoctozepto> ok, makes sense
15:52:43 <mnasiadka> I'll phrase it like this in the summary mail and in the whiteboard items, so it's clear for everybody.
15:53:15 <mnasiadka> And then we also deprecate CentOS as whole in Yoga and drop it in Zeus, right?
15:53:34 <mnasiadka> But still we need to rework centos-source to not pull in anything from RDO packages, so centos-source on cs8 works...
15:54:12 <mgoddard> I thought we said A for dropping?
15:54:34 <hrw> mgoddard: I hoped for A to be Debian/source only iirc
15:55:25 <hrw> as "Y drops binary, Z drops distros" but that can be one release too short
15:55:37 <mnasiadka> I remember we wanted a clean slate from A, but maybe we're just rushing it.
15:56:30 <mgoddard> I think we need a period where we're deploying the chosen containers on all host distros by default
15:56:49 <mgoddard> ideally have that released and in real world use for a while before dropping
15:57:17 <mgoddard> of course this is going to explode the test matrix
15:57:41 <yoctozepto> only debian on all
15:57:46 <yoctozepto> others dedicated
15:57:53 <yoctozepto> the question was about cs9
15:57:59 <mgoddard> ok, double the test matrix
15:58:01 <yoctozepto> as we can't rely on rdo then ;d
15:58:07 <yoctozepto> yeah, double
15:58:50 <mgoddard> on the libvirt question, I found a bit more context on the issue I mentioned
15:58:51 <yoctozepto> I guess in practice we don't need to verify anything other than qemu/kvm
15:58:57 <hrw> infra from xena times would probably be fine
15:59:00 <yoctozepto> mgoddard: which one?
15:59:28 <mgoddard> we had problems with centos 8.2 libvirt containers on an 8.1 host
15:59:44 <mgoddard> qemu-kvm: error: failed to set MSR 0x48e to 0xfff9fffe04006172
15:59:46 <mgoddard> qemu-kvm: /builddir/build/BUILD/qemu-4.2.0/target/i386/kvm.c:2695: kvm_buf_set_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
16:00:21 <mnasiadka> Ok, I think we need to continue the discussion next week (or after the meeting).
16:00:32 <mnasiadka> Thanks for attending.
16:00:34 <mnasiadka> #endmeeting