13:00:09 <mnasiadka> #startmeeting kolla
13:00:09 <opendevmeet> Meeting started Wed Oct 18 13:00:09 2023 UTC and is due to finish in 60 minutes.  The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:09 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:09 <opendevmeet> The meeting name has been set to 'kolla'
13:00:13 <mnasiadka> #topic rollcall
13:00:53 <mmalchuk> o/
13:00:58 <jangutter> \o
13:01:16 <mnasiadka> o/
13:01:16 <SvenKieske> o/
13:01:23 <frickler> \o
13:01:24 <mattcrees> o/
13:01:27 <jsuazo> o/
13:03:32 <mnasiadka> #topic Agenda
13:03:32 <mnasiadka> * Announcements
13:03:32 <mnasiadka> * CI status
13:03:32 <mnasiadka> * Release tasks
13:03:32 <mnasiadka> * Current cycle planning
13:03:34 <mnasiadka> * Additional agenda (from whiteboard)
13:03:34 <mnasiadka> * Open discussion
13:03:37 <mnasiadka> #topic Announcements
13:03:43 <mnasiadka> PTG is next week
13:03:51 <mnasiadka> #link https://etherpad.opendev.org/p/kolla-caracal-ptg
13:04:00 <SvenKieske> \o/
13:04:04 <mnasiadka> Sign up please in L37
13:04:07 * mmalchuk ready
13:04:16 <mnasiadka> And please add topics (L101 and beyond)
13:04:37 <mnasiadka> #topic CI status
13:04:50 <mnasiadka> I think it's mainly green, not counting our Ansible breakage
13:04:51 <opendevreview> Juan Pablo Suazo proposed openstack/kolla-ansible master: Enable the Fluentd Plugin Systemd  https://review.opendev.org/c/openstack/kolla-ansible/+/875983
13:04:56 <opendevreview> Juan Pablo Suazo proposed openstack/kolla master: Adds TAAS Neutron plugin to support OVS port mirrors  https://review.opendev.org/c/openstack/kolla/+/885151
13:04:59 <mnasiadka> Trying to wrap my head around it, but it's not going to be an easy one
13:05:08 <mnasiadka> Did anybody else have a look into that as well?
13:05:33 <frickler> not yet. maybe building a simple reproducer will be needed?
13:05:35 <SvenKieske> not really, I still suspect a real upstream bug, or did we change anything in that area?
13:06:00 <mnasiadka> We did not, that's the usual "we fixed that bug in Ansible" :D
13:06:01 <mmalchuk> there is an issue upstream, SvenKieske have a link?
13:06:01 <SvenKieske> there are also already user reports about this on the ML, I answered with the workaround
13:06:27 <SvenKieske> #link https://github.com/ansible/ansible/issues/81945
13:06:39 <mmalchuk> right
13:07:21 <mnasiadka> Anyway, let's move on
13:07:26 <mnasiadka> #topic Release tasks
13:07:35 <mnasiadka> Time to have a look at them - we merged switching the sources to Bobcat
13:08:28 <mnasiadka> seems still no sign of RDO repos on the centos stream mirror
13:08:49 <mnasiadka> I'll wait instead of using trunk.rdoproject.org
13:08:58 <opendevreview> Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` grafana  https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:09:02 <mnasiadka> no other release tasks per se - just reviewing existing patches
13:09:39 <opendevreview> Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume  https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:09:54 <SvenKieske> not sure I can finish the fluentd stuff in time.. currently have a problem checking stuff inside the container. I think it's best to split fluentd out of the "common" role, which will require some more work.
13:10:13 <SvenKieske> not sure if I should move this point to "open discussion"?
13:10:47 <mmalchuk> good idea to split
13:10:58 <mnasiadka> it's ok to discuss it here
13:11:04 <mnasiadka> we should probably move this cycle
13:11:09 <mnasiadka> but it seems we don't have to
13:11:46 <SvenKieske> yeah, so if there are no objections I'd move fluentd into a dedicated role, I'm also on vacation friday and with PTG coming up.. how much time is left? :D
13:12:07 <opendevreview> Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume  https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:12:57 <SvenKieske> I guess I take the id software dev approach: "It's done, when it's done".
13:13:08 <mnasiadka> it doesn't need to be done before the PTG
13:13:18 <mnasiadka> we're cycle trailing, so we still have some time to release
13:13:29 <mnasiadka> although I would prefer we do it sooner than last minute :)
13:14:16 <frickler> is fluentd the only upgrade missing? I lost track, are we good with prometheus plugins?
13:14:18 <SvenKieske> yeah sure; I just need to do some other stuff as well - mainly writing docs it seems - will check internally what has higher priority, maybe the answer is even fluentd.
13:14:23 <mmalchuk> or yesterday)
13:15:02 <SvenKieske> afaik I asked last - or the meeting before that - if someone could take a look at the prometheus plugins, no? I have lost track as well.
13:15:47 <mnasiadka> That might be looked into even last minute, although I would prefer that we would finally have a proper solution (I remember hrw started a script for checking versions on github and updating sources.py)
13:16:09 <mnasiadka> I might have a look into that after that crappy Ansible breakage
13:17:56 <mnasiadka> ok then
13:18:04 <mnasiadka> Podman - seems it's waiting for some reviews, probably mine
13:18:07 <mnasiadka> Let's Encrypt the same
13:18:17 <mnasiadka> frickler: do you have any cycles for looking at those two?
13:18:19 <SvenKieske> https://github.com/openstack-exporter/openstack-exporter/releases is at least at the latest :)
13:18:23 <mnasiadka> It would be nice to merge them this cycle
13:18:55 <frickler> I can check podman
13:19:43 <SvenKieske> be sure to also look at the (linked?) ansible-collection-kolla change for podman, I guess that's also still open
13:20:02 <frickler> ack
13:20:19 <SvenKieske> ah it's in the depends on: https://review.opendev.org/c/openstack/ansible-collection-kolla/+/852240
13:20:22 <mmalchuk> https://review.opendev.org/c/openstack/kolla/+/887347 can be merged
13:20:35 <mnasiadka> I commented on ack patch - I think we need podman based jobs there
13:20:41 <mnasiadka> a-c-k it is
13:21:05 <SvenKieske> yeah, I agree, there should be some testing going on :)
13:21:27 <mnasiadka> Ok, Let's Encrypt - I'll ask bbezak to review those next week
13:21:53 <mnasiadka> and let's try to get those in, even if the code is not perfect-ish ;)
13:22:07 <mnasiadka> #topic Additional agenda
13:22:24 <mnasiadka> 4 patches from jsuazo (the same ones again)
13:22:25 <mnasiadka> let's see
13:22:34 <mnasiadka> https://review.opendev.org/c/openstack/kolla/+/885151
13:22:56 <mnasiadka> +2 from me
13:23:09 <mnasiadka> https://review.opendev.org/c/openstack/kolla-ansible/+/885417
13:23:14 <mnasiadka> (k-a side of TAAS)
13:23:25 <mnasiadka> already has +2 from me
13:23:34 <mnasiadka> frickler: willing to have a look or should we wait for bbezak?
13:24:09 <frickler> better ask bbezak, I'd be too picky for this
13:24:21 <mnasiadka> understood
13:24:27 <mnasiadka> https://review.opendev.org/c/openstack/kolla-ansible/+/844614 - Glance/Cinder-backup S3
13:25:10 <mmalchuk> ready 2 weeks ago
13:26:15 <mnasiadka> commented, but basically looks good
13:26:42 <mnasiadka> ok then
13:26:48 <mnasiadka> nothing more on the whiteboard
13:26:55 <mnasiadka> #topic Open discussion
13:27:06 <mnasiadka> I'll cancel next week's meeting because we'll have the PTG sessions
13:27:11 <mnasiadka> Anything else?
13:27:27 <SvenKieske> if someone has free time, and it's maybe also a topic for PTG: https://review.opendev.org/c/openstack/kolla-ansible/+/898543
13:28:01 <SvenKieske> just a hack to enable quorum queues; all the feedback I read was that they are really nice to have
13:30:00 <mmalchuk> interesting. is this tested under heavy load?
13:30:15 <mnasiadka> well, the questions I have are 1) do we test that in CI 2) do we want to switch it to default in C 3) Migration docs for users - since it's breaking-ish
13:31:10 <mnasiadka> but since classic queue mirroring is deprecated for removal in 4.0 - it's either quorum queues or streams
13:31:35 <mnasiadka> Seems there is some movement in oslo.messaging
13:31:38 <mnasiadka> #link https://review.opendev.org/c/openstack/oslo.messaging/+/888479
13:31:44 <mmalchuk> 2 - no
13:31:44 <SvenKieske> mmalchuk: untested from my side (the patchset), from what I understand OVH does use quorum queues under heavy load, but they don't use k-a
13:31:44 <mmalchuk> this enough, so we need this in k-a
13:31:44 <frickler> we can start with adding the flag and allow this for new deployments. migration can be done next cycle then
13:31:44 <SvenKieske> no docs yet :) there's also an open bug to enable streams from OVH, but afaik that's not even implemented in oslo
13:31:47 <mnasiadka> #link https://review.opendev.org/c/openstack/oslo.messaging/+/890825
13:31:48 <SvenKieske> frickler: that was my intention as well :)
13:31:50 <mmalchuk> frickler +1
13:32:10 <SvenKieske> similar to what we did with the HA flag for rabbit
13:32:11 <mnasiadka> frickler: would feel safer if we had at least one CI job that uses it
13:32:36 <SvenKieske> I agree on the CI job, didn't have an immediate idea how that would best be implemented
13:32:49 <frickler> I don't think we need to test all possible configuration combinations. having a job when we switch the default is good enough IMO
13:32:49 <mmalchuk> mnasiadka test ha flag or +queues?
13:32:59 <mnasiadka> quorum queues
13:33:37 <opendevreview> Dawud proposed openstack/kolla-ansible stable/yoga: Remove the `grafana` volume  https://review.opendev.org/c/openstack/kolla-ansible/+/898736
13:35:19 <mnasiadka> we don't need to test all possible configuration combinations, but it would be useful to test this - with a vision that we'll move to this as default since queue mirroring will be gone in RMQ 4.0
13:35:47 <mnasiadka> like let's use quorum on Ubuntu and HA on Rocky - or something similar
13:36:39 <SvenKieske> sure; would that just entail a set of jobs with it enabled? but which scenario should this run? I've never written upstream CI jobs "from scratch" so I would be grateful for some pointers on what that should look like
13:36:54 <frickler> I would still prefer to defer that to the next cycle. but feel free to add a patch for that on top of SvenKieske's, it shouldn't block that change though
13:37:12 <SvenKieske> I can see both sides of the argument
13:37:46 <SvenKieske> I can add a warning doc "this is untested" :D on the other hand, when I looked at other deployment projects, they enabled quorum queues without additional tests ;)
13:38:24 <mmalchuk> you said OVH uses it
13:38:26 <SvenKieske> ¯\_(ツ)_/¯
13:38:32 <SvenKieske> they don't use k-a
13:38:43 <mnasiadka> I can see people raising bugs about it and if we can't test it in CI - and no one uses that in production from the Kolla community - we're kind of not supporting it? :)
13:38:44 <mmalchuk> k-a is only a deployment tool
13:38:47 <SvenKieske> if you search for quorum queues on the ML there are some happy users :)
13:39:24 <frickler> to have some real testing, one would at least need to do upgrades, if not host failovers. for upgrades we need to add the flag now so we can test next cycle
13:39:36 <mmalchuk> my two cents: do not enable it by default
13:39:39 <SvenKieske> fwiw the SCS project is interested in quorum queues because there was some perceived instability in certain upgrade scenarios, so I hope I get some testers there as well
13:40:07 <mnasiadka> not enable it by default just yet
13:40:19 <SvenKieske> mmalchuk: the current patch disables it
13:40:23 <mnasiadka> just have a minimal testing coverage
13:40:25 <SvenKieske> frickler: agree
13:40:47 <SvenKieske> okay, so a basic (HA?) job that just tests if enabling the flags works?
13:41:00 <mnasiadka> well, I guess multinode would be best
13:41:09 <SvenKieske> seems like a good compromise?
13:41:13 <mnasiadka> but singlenode would at least tell us it works
13:41:14 <mnasiadka> :)
13:41:17 <SvenKieske> yeah multinode, that's what I meant
13:41:24 <mnasiadka> because now we can only assume it works
13:42:00 <mnasiadka> Ok, I'll think of something
13:42:12 <SvenKieske> it should work ;) it's at least supported by oslo :D I'll add some very basic test, probably not this week, so if someone beats me, I'm also fine with that :)
13:42:13 <mnasiadka> Not counting we just changed the default to HA
13:42:29 <mnasiadka> and mattcrees spent some time to enable it in upgrade jobs including a cleanup ;)
13:42:54 <mnasiadka> and probably moving from HA to quorum includes the same dance
13:43:11 <SvenKieske> yeah, it's a little different, but in practice you have to recreate all the queues.
13:43:15 <jangutter> what about adding it to the "experimental" pipeline for now? Won't guarantee it won't regress, but at least that way the job lives in the repo.
13:43:56 <mnasiadka> or switch multinode (not multinode-upgrade) to quorum queues and fix the upgrade bit in C
13:44:11 <mnasiadka> this way we will have some better testing coverage and a path forward
13:44:12 <SvenKieske> let's discuss further stuff in the patch review, what do you think?
13:44:18 <mnasiadka> but anyway, we've spent too much time on this :)
13:44:31 <mnasiadka> yes, let's discuss there
13:44:35 * kevko is upgrading to zed right now, but watching
13:44:45 <opendevreview> Merged openstack/kayobe stable/2023.1: bifrost: Populate bifrost host vars on deprovision  https://review.opendev.org/c/openstack/kayobe/+/898561
13:45:33 <mnasiadka> ok then
13:45:42 <mnasiadka> I don't think there is anything more
13:45:59 <mmalchuk> we still lack Kayobe reviews
13:46:01 <mmalchuk> https://review.opendev.org/c/openstack/kayobe/+/861397
13:46:07 <mmalchuk> and https://review.opendev.org/c/openstack/kayobe/+/879554
13:46:17 <mmalchuk> half a year
13:46:31 <mmalchuk> almost
13:46:56 <mnasiadka> Passed that internally in SHPC
13:47:11 <mmalchuk> thanks
13:47:18 <mnasiadka> Let's finish for today - thank you all for coming and speak to you on Monday :)
13:47:21 <mnasiadka> #endmeeting