13:02:37 <mnasiadka> #startmeeting kolla
13:02:37 <opendevmeet> Meeting started Wed Sep  3 13:02:37 2025 UTC and is due to finish in 60 minutes.  The chair is mnasiadka. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:02:37 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:02:37 <opendevmeet> The meeting name has been set to 'kolla'
13:02:40 <mnasiadka> #topic rollcall
13:02:41 <yuval__> R0
13:02:41 <mnasiadka> o/
13:02:43 <mnasiadka> (now)
13:02:49 <frickler> \o
13:02:52 <amir58118> o/
13:02:56 <bbezak_> O/
13:03:56 <seunghunlee> o/
13:04:15 <bbezak_> (Real bbezak is on broken matrix.org) here is the imposter :)
13:04:43 <mnasiadka> #topic agenda
13:04:43 <mnasiadka> * CI status
13:04:43 <mnasiadka> * Release tasks
13:04:43 <mnasiadka> * Regular stable releases (first meeting in a month)
13:04:43 <mnasiadka> * Current cycle planning
13:04:45 <mnasiadka> * Additional agenda (from whiteboard)
13:04:45 <mnasiadka> * Open discussion
13:04:47 <mnasiadka> #topic CI status
13:05:38 <mnasiadka> so, I've noticed Venus job is failing - check-logs.sh is not finding any fluentd data for it - so I proposed https://review.opendev.org/c/openstack/kolla-ansible/+/959298
13:06:04 <mnasiadka> zun is also failing due to changes in pyroute2 - proposing to drop jobs for now - https://review.opendev.org/c/openstack/kolla-ansible/+/959307
13:06:57 <mnasiadka> We still have that intermittent dns_assignment issue in the OVN jobs - we merged a fix that retries the check in a loop and it seems to be helping a bit more - frickler agreed to look into a follow-up, testing it more and finding out whether it's a Neutron bug
13:07:13 <mnasiadka> If there are any other jobs persistently failing - please let us know
13:08:03 <frickler> venus is ptl-less, might be retired soon. does anyone care for it?
13:08:25 <mnasiadka> I don't - let's wait for the TC's hunt for a PTL, but I doubt anyone serious is using it
13:08:47 <mnasiadka> #topic Release tasks
13:09:25 <mnasiadka> We're not doing cycle highlights, because it's hard to predict - other than that we're good - we dropped RDO, switched UCA, Debian OpenStack is not releasing Bookworm 2025.2
13:10:08 <mnasiadka> By dropping RDO we might have broken skyline (not that I care a lot for this project) - it seems it's using MySQLDb pip package
13:10:14 <bbezak_> So what are they releasing? Trixie 2025.2?
13:10:19 <mnasiadka> Trixie 2025.2
13:10:47 <frickler> trixie has 2025.1 natively already, iiuc. so yes
13:10:54 <mnasiadka> Anyway, let's move on
13:11:01 <mnasiadka> #topic Current cycle planning
13:11:13 <mnasiadka> I think seunghunlee's MariaDB discussion fits more here
13:11:27 <seunghunlee> hello
13:11:28 <mnasiadka> I'm not going to paste the full drama here
13:11:30 <mnasiadka> #link https://etherpad.opendev.org/p/KollaWhiteBoard#L68
13:11:58 <mnasiadka> TL;DR: MariaDB changed the default collation of the utf8mb3 charset from utf8mb3_general_ci to utf8mb3_uca1400_ai_ci starting with 11.5 https://jira.mariadb.org/browse/MDEV-25829
13:12:15 <mnasiadka> And it even breaks deployments
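For context, a minimal sketch of how the collation drift could be spotted on a running deployment. It assumes the PyMySQL driver and placeholder host/credentials, neither of which the meeting specifies:

```python
# Hedged sketch: inspect server-wide and per-database collation defaults so that
# utf8mb3_general_ci vs utf8mb3_uca1400_ai_ci mismatches show up after an upgrade.
# Host and credentials are placeholders for whatever the deployment actually uses.
import pymysql

conn = pymysql.connect(host="127.0.0.1", user="root", password="secret")
try:
    with conn.cursor() as cur:
        cur.execute("SELECT @@version, @@character_set_server, @@collation_server")
        print(cur.fetchone())

        # Databases created before and after the 11.5 default change can end up
        # with different default collations for the same charset.
        cur.execute(
            "SELECT schema_name, default_character_set_name, default_collation_name "
            "FROM information_schema.schemata "
            "WHERE schema_name NOT IN "
            "('mysql', 'information_schema', 'performance_schema', 'sys')"
        )
        for schema, charset, collation in cur.fetchall():
            print(f"{schema}: {charset} / {collation}")
finally:
    conn.close()
```

Recent MariaDB releases also appear to provide a `character_set_collations` server variable that can pin the per-charset default back to utf8mb3_general_ci, but that should be verified against the MariaDB documentation before relying on it.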
13:13:21 <mnasiadka> noonedeadpunk hinted at this problem in yesterday's TC meeting - OSA is in the same boat - but they have already moved
13:13:52 <mnasiadka> From my perspective - it would be good to send a mail to openstack-discuss ML and be part of the discussion in the TC weekly meeting next week
13:14:21 <mnasiadka> I'll wait some minutes for bbezak_, frickler, seunghunlee and others to chime in
13:14:22 <seunghunlee> I also wanted to suggest that, as this is an inter-service problem
13:14:38 <bbezak_> I agree, this is potentially a problem for OpenStack as a whole
13:14:43 <mnasiadka> We don't HAVE TO migrate - mariadb 10.11 is LTS and has two more years of support
13:14:47 <frickler> I don't think we need to rush this, sticking to 11.4 or even 10.11 for another cycle or two should be fine
13:15:11 <mnasiadka> Well, maybe the solution is to jump to 11.4 this cycle
13:15:20 <mnasiadka> Because that sounds like a safe-ish solution
13:15:34 <seunghunlee> That can work too
13:15:36 <mnasiadka> buy some popcorn and wait for the evolution
13:15:39 <frickler> https://mariadb.org/11-8-is-lts/ has a nice table
13:15:47 <yuval__> :)
13:16:15 <mnasiadka> seunghunlee: so from my perspective - send out a mail with summary to openstack-discuss ML - and attend the next weekly TC meeting
13:16:21 <mnasiadka> and let's see how it evolves
13:16:41 <mnasiadka> and in parallel - we can migrate to 11.4 which will give us some more years of support
13:16:42 <seunghunlee> Sounds good
13:16:51 <bbezak_> 11.4 and popcorn
13:16:53 <mnasiadka> and should not introduce any other problems ;-)
13:16:53 <bbezak_> good
13:17:01 <seunghunlee> hopefully
13:17:21 <mnasiadka> ok then
13:17:23 <seunghunlee> 11.4 still needs an adjusted recovery procedure iirc
13:17:38 <seunghunlee> which is https://review.opendev.org/c/openstack/kolla-ansible/+/958281
13:17:46 <mnasiadka> if you can prepare a Gerrit topic with the patches needed for the upgrade to 11.4 - we're happy to review it
13:18:04 <bbezak_> And 11.4 has even longer regular support than 11.8 and future 12.3
13:18:18 <bbezak_> (community)
13:18:36 <seunghunlee> Will do. Although I think this one is in the topic
13:18:57 <seunghunlee> Yeah. As in kolla_mariadb_11
13:19:01 <opendevreview> Pierre Riteau proposed openstack/kayobe master: Support CentOS Stream 10 and Rocky Linux 10 images  https://review.opendev.org/c/openstack/kayobe/+/959306
13:19:24 <frickler> seunghunlee: is that new method compatible with older mariadb versions?
13:20:53 <seunghunlee> It passed CI with both 10.11 and 11.8
13:21:04 <seunghunlee> and CI tests recovery
13:21:29 <frickler> ok, that should be good enough I think, thx, will put it on my review list
13:21:36 <seunghunlee> Thanks
13:21:46 <mnasiadka> ok then, let's move on
13:21:59 <mnasiadka> Ah, skipped one
13:22:01 <mnasiadka> #topic Regular stable releases (first meeting in a month)
13:22:14 <mnasiadka> bbezak_: have we merged the mariadb unpin in 2025.1?
13:22:37 <mnasiadka> #link https://review.opendev.org/c/openstack/kolla/+/956538
13:22:38 <mnasiadka> nope
13:22:43 <mnasiadka> so no stable releases yet
13:23:29 <mnasiadka> let's try to prepare for it correctly
13:23:41 <bbezak_> We haven't had those in a while now
13:24:04 <mnasiadka> #topic Open discussion
13:24:10 <mnasiadka> Anybody anything?
13:24:13 <bbezak_> @fungi
13:24:13 <yuval__> yea
13:24:20 <bbezak_> #link https://review.opendev.org/c/openstack/kolla/+/958763
13:24:38 <bbezak_> As promised in some IRC meeting, here is the doc update for contributors
13:24:50 <dougszu> Group IDs below 1000
13:25:00 <Vii> hey I think this is important before releasing a new version https://review.opendev.org/c/openstack/kolla-ansible/+/958888
13:25:15 <Vii> ssl hardening
13:25:45 <mnasiadka> Vii: it's not getting into stable as a backport - we were discussing stable branch point releases
13:26:14 <Vii> It would be nice if this worked too :) it's a great help for test environments https://review.opendev.org/c/openstack/kolla-ansible/+/958861
13:26:19 <mnasiadka> bbezak_: thanks, let me and frickler review this (if he has spare cycles) and maybe fungi can chime in
13:26:29 <yuval__> I am running with ubuntu jammy, this commit in nova breaks the deployment: https://opendev.org/openstack/nova/commit/e4340cd8e54fe3b447192e7982693793c60569cd#diff-93e955ea3f3f2784989c87ebc799c16b175cbb1b
13:26:53 <mnasiadka> yuval__: which branch? master?
13:27:22 <mnasiadka> We don't see any problem with that in our master CI testing, that's for sure
13:27:35 <mnasiadka> So if you do - it's rather a topic for Nova team
13:28:03 <yuval__> the nova docker image is missing the module needed to import oslo_service.backend
13:28:11 <yuval__> the commit was merged 3 weeks ago
13:28:22 <yuval__> so docker images older than that probably see the same issue
13:28:36 <yuval__> https://quay.io/repository/openstack.kolla/nova-libvirt?tab=tags
13:28:58 <mnasiadka> we're not running any nova component in nova-libvirt
13:29:02 <mnasiadka> it's pure libvirt there
13:29:39 <mnasiadka> Basically - if you're seeing issues that we're not seeing in master branch CI - you're either using old images or old kolla-ansible code
13:29:46 <yuval__> I can check exactly where its failing
13:29:53 <yuval__> but yes I agree with you
13:30:10 <bbezak_> Please create a bug if you feel this is a kolla problem
13:30:11 <priteau> This patch for cyborg dev mode has been open for 2 years, I think it's good to go - I tested it back in April: https://review.opendev.org/c/openstack/kolla-ansible/+/890883
13:30:18 <mnasiadka> Yeah, please do - and come back with some outputs (use paste.openstack.org)
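For context, a quick hedged check along the lines of the oslo_service.backend discussion above, meant to be run inside a nova container; the package name and the idea that image age is the deciding factor are taken from the discussion, not verified:

```python
# Hedged check, intended to run inside a nova container (e.g. nova_compute):
# report the installed oslo.service version and whether oslo_service.backend,
# which the referenced nova commit imports, is actually available.
from importlib import import_module, metadata

try:
    print("oslo.service version:", metadata.version("oslo.service"))
except metadata.PackageNotFoundError:
    print("oslo.service is not installed in this image")

try:
    import_module("oslo_service.backend")
    print("oslo_service.backend imports fine - image should be recent enough")
except ImportError as exc:
    print("oslo_service.backend is missing - images are likely stale:", exc)
```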
13:30:55 <mnasiadka> priteau: I'll have a look later
13:31:04 <mnasiadka> Ok then, anything else?
13:31:16 <dougszu> This change: https://review.opendev.org/c/openstack/kolla/+/930931 creates a side effect where the group IDs in the container get out of sync with the host
13:31:49 <seunghunlee> I left comment on https://review.opendev.org/c/openstack/kolla-ansible/+/928487 and https://review.opendev.org/c/openstack/kolla-ansible/+/927096 to try unblocking. They are also in kolla_mariadb_11 topic
13:32:20 <dougszu> Perhaps it is easier to manage group IDs below 1000 explicitly, as we do for Kolla groups? I have amended this patch to do it that way: https://review.opendev.org/c/openstack/kolla/+/955388
13:33:41 <mnasiadka> Well, today we use 424XX for Kolla users/groups
13:33:53 <mnasiadka> So what is getting 999 now?
13:34:10 <dougszu> On Noble 999 is now the systemd-journal
13:34:38 <dougszu> But in the Noble container, GID 999 gets taken by something random from udev, since we have removed the config for systemd-journal in the container
13:36:12 <dougszu> The current approach I have taken is to reserve 999 in advance of udev taking it, which seems to be good enough for now.
13:36:19 <mnasiadka> Another option would probably be to set some MIN uid/gid range for the users/groups that get added
13:36:35 <mnasiadka> But I guess that's fine for now
13:36:50 <mnasiadka> But I think we need a test in kolla-ansible to make sure we don't break it next time
13:36:55 <dougszu> This is all happening in post-install scripts from debian packages
13:37:02 <frickler> is this only for noble? or do other distros have the same issue with gid allocation?
13:37:25 <Vii> One more thing. What do you think? Is this worth continuing? So that not all users are in every image? https://review.opendev.org/c/openstack/kolla/+/951944
13:37:25 <Vii> users.py file - keystone user - example
13:37:29 <dougszu> It is only Noble where I see the issue (specifically for systemd)  - I think the issue was triggered by this: https://review.opendev.org/c/openstack/kolla/+/930931
13:37:41 <mnasiadka> We're setting LAST_SYSTEM_UID in https://github.com/openstack/kolla/blob/cc328dc8656399ec4a68156573e91e6930fa1948/docker/base/Dockerfile.j2#L236
13:37:52 <mnasiadka> Maybe there's a variable for FIRST_SYSTEM_UID?
13:38:55 <dougszu> Ideally we would bind mount in the contents of /usr/lib/sysusers.d from the host and then everything would match in the container
13:39:28 <mnasiadka> Maybe that's an option
13:39:41 <mnasiadka> But still I think if you care for this functionality - we need to have it tested in CI
13:40:39 <dougszu> So we could turn on forwarding of the journal in CI perhaps
13:41:47 <dougszu> I can do that
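For context, roughly the kind of consistency check discussed above, sketched in Python. The container name and the use of docker exec are assumptions; an actual kolla-ansible CI test would likely be expressed as Ansible tasks instead:

```python
# Hedged sketch of a host-vs-container GID consistency check: parse /etc/group on
# both sides and flag any system group (GID < 1000) whose numeric ID differs,
# e.g. systemd-journal ending up on 999 on the host but not in the container.
import subprocess


def gid_map(group_file_text):
    """Map group name -> GID for system groups (GID < 1000)."""
    mapping = {}
    for line in group_file_text.splitlines():
        name, _password, gid, _members = line.split(":", 3)
        mapping[name] = int(gid)
    return {name: gid for name, gid in mapping.items() if gid < 1000}


with open("/etc/group") as f:
    host = gid_map(f.read())

# "fluentd" is just an example container name, relevant to journal forwarding.
container = gid_map(subprocess.check_output(
    ["docker", "exec", "fluentd", "cat", "/etc/group"], text=True))

for name in sorted(set(host) & set(container)):
    if host[name] != container[name]:
        print(f"GID mismatch for {name}: host={host[name]} container={container[name]}")
```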
13:42:05 <dougszu> I will grep for the registry pwd :D
13:42:25 <mnasiadka> Anything ;-)
13:42:42 <mnasiadka> Ok then, let's continue discussion in Gerrit
13:42:51 <dougszu> thanks
13:43:39 <mnasiadka> Vii: It probably makes sense, but I think we have higher-priority things - we can discuss it another time, if you refresh that patch
13:43:51 <mnasiadka> Ok then, thanks for coming - see you next week :-)
13:43:53 <mnasiadka> #endmeeting