19:00:06 <clarkb> #startmeeting infra
19:00:06 <opendevmeet> Meeting started Tue Sep 30 19:00:06 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:06 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:06 <opendevmeet> The meeting name has been set to 'infra'
19:00:17 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/XWCKOXZRNPPBJ7KRRRS5PPKFPFUHANR5/ Our Agenda
19:00:25 <clarkb> #topic Announcements
19:00:34 <clarkb> The OpenStack release is tomorrow October 1
19:00:37 <clarkb> where did september go
19:01:17 <fungi> down the time drain
19:01:54 <clarkb> then next week is the opendev pre ptg (more details later)
19:02:12 <clarkb> and then the week after that is the summit (though not until the weekend)
19:02:38 <clarkb> then the ptg and then it's november
19:02:50 <fungi> and i'll be travelling for another conference in between the pre-ptg and summit weeks, so won't be around much
19:03:10 <clarkb> anything else to announce?
19:04:26 <clarkb> #topic Gerrit 3.11 Upgrade Planning
19:04:53 <clarkb> I don't have anything new on this unfortunately. But as mentioned the openstack release (and I think starlingx's upcoming release) make this less of an issue
19:05:18 <clarkb> we did discover that we cannot delete images in openmetal because some gerrit autoholds I have in place still reference them; that shows up in the zuul launcher logs but otherwise doesn't seem to be an issue
19:05:53 <clarkb> that will just serve as a reminder that it would be great if glance could do reference counting and treat the image itself as deleted but not delete content until all the references to it (VMs) go away
19:05:58 <clarkb> but that is not a new issue
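As a rough sketch of the reference-counting behavior described above (the ImageStore class and its methods are hypothetical, not Glance's actual API): the image is hidden immediately on delete, but its backing content is only reclaimed once the last VM referencing it goes away.

    # Minimal sketch of deferred image deletion via reference counting.
    # ImageStore and its methods are hypothetical, not Glance's real API.
    class ImageStore:
        def __init__(self):
            self.refcounts = {}       # image id -> number of VMs still using it
            self.pending_delete = set()

        def boot_vm(self, image_id):
            self.refcounts[image_id] = self.refcounts.get(image_id, 0) + 1

        def delete_vm(self, image_id):
            self.refcounts[image_id] -= 1
            self._maybe_reclaim(image_id)

        def delete_image(self, image_id):
            # The image is treated as deleted (hidden from users) right away...
            self.pending_delete.add(image_id)
            self._maybe_reclaim(image_id)

        def _maybe_reclaim(self, image_id):
            # ...but the backing content only goes away once nothing references it.
            if image_id in self.pending_delete and self.refcounts.get(image_id, 0) == 0:
                print(f"reclaiming content for {image_id}")
                self.pending_delete.discard(image_id)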
19:06:14 <clarkb> #topic Upgrading old servers
19:06:37 <clarkb> last week we discovered a small issue with the kerberos upgrades that impacted our periodic daily deployment jobs
19:06:59 <clarkb> tl;dr is that we needed to remove the borg virtualenv on kdc03 so that ansible could rebuild it against the new python version on ubuntu noble
19:07:14 <clarkb> this was done after the jump from focal to jammy but not after jammy to noble until last week
19:07:24 <fungi> i did it as part of the upgrade to jammy but forgot to do it again once on noble
19:07:25 <clarkb> just something to keep in mind if we do more in place upgrades
19:07:47 <fungi> i blame going on vacation in between those
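A minimal sketch of the kind of check that catches this class of problem after an in-place distro upgrade, assuming a hypothetical /opt/borg virtualenv path (the real deployment may differ): compare the virtualenv's interpreter version against the system python and flag the venv for removal if they no longer match, so ansible can rebuild it.

    # Hedged sketch: detect a virtualenv whose interpreter predates the system
    # python after an in-place upgrade, so it can be removed and rebuilt.
    # The /opt/borg path is an assumption for illustration only.
    import subprocess
    import sys
    from pathlib import Path

    VENV = Path("/opt/borg")  # hypothetical location of the borg virtualenv

    def venv_python_version(venv: Path) -> str:
        out = subprocess.run(
            [str(venv / "bin" / "python3"), "-c",
             "import sys; print('%d.%d' % sys.version_info[:2])"],
            capture_output=True, text=True, check=True)
        return out.stdout.strip()

    system_version = "%d.%d" % sys.version_info[:2]
    if venv_python_version(VENV) != system_version:
        print(f"{VENV} was built against an older python; remove it so "
              "ansible can recreate it against the system interpreter")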
19:08:00 <clarkb> I don't think there were any other server upgrade updates, but please chime in if there is anything else to add
19:08:13 <clarkb> we've always got more servers to upgrade too so help is always appreciated on this topic
19:09:29 <clarkb> #topic AFS mirror content cleanup
19:10:18 <clarkb> fungi: did you get a chance to look into whether or not we could safely run the removed release cleanup steps in the reprepro update script? I know you ran into a problem where that process deleted a file early and that impacted subsequent updates, so I'm thinking this is complicated?
19:10:32 <clarkb> (it's ok to say that isn't reasonable to do, I'm just wondering if we've figured that out yet)
19:10:43 <fungi> yeah, i'm still not sure yet
19:10:49 <tonyb> I started looking at how I can modify my metadata checker to emit an rsync filter.
19:10:57 <fungi> because of the gotcha you mentioned, i didn't get a good measurement
19:11:15 <fungi> we should remember to revisit it when we remove the bionic mirror
19:11:30 <tonyb> I think it's going to need at least 2 passes but should work.
19:11:45 <clarkb> fungi: ack we can table this until we're cleaning up bionic then
19:12:05 <clarkb> tonyb: oh cool. 2 passes should be fine since we don't publish until the vos release happens
19:12:14 <clarkb> (so our ci jobs won't see the incomplete first pass if that is a concern)
19:12:49 <tonyb> Yeah that's my thinking.  Just hoped for better.
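A minimal sketch of the rsync filter idea under discussion, assuming a hypothetical wanted_files() helper standing in for the real metadata checker: the first pass collects the paths the metadata still references, and the generated filter includes those paths (plus their parent directories) while excluding everything else.

    # Hedged sketch: turn a list of files still referenced by repository metadata
    # into an rsync filter that keeps those files and prunes everything else.
    # wanted_files() is hypothetical; the real checker's interface may differ.
    def wanted_files():
        # First pass: walk the metadata and return relative paths still in use.
        return ["dists/trixie/Release",
                "pool/main/h/hello/hello_2.10.orig.tar.gz"]

    def write_rsync_filter(paths, filter_path="mirror.filter"):
        with open(filter_path, "w") as f:
            for path in sorted(set(paths)):
                # Include every parent directory so rsync can descend to the file.
                parts = path.split("/")
                for i in range(1, len(parts)):
                    f.write("+ /%s/\n" % "/".join(parts[:i]))
                f.write("+ /%s\n" % path)
            # Everything not explicitly included above gets excluded.
            f.write("- *\n")

    write_rsync_filter(wanted_files())

The resulting file can then be fed to rsync (e.g. via --filter="merge mirror.filter") on a later pass, which is where the second pass over the tree effectively happens.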
19:13:56 <clarkb> I was also going to ask if we think we've trimmed enough content to consider mirroring new stuff like Trixie for example.
19:14:39 <clarkb> CentOS Stream 10, Rocky Linux, and Alma are all other things we could consider too, but it seems like people want to discuss this stuff at the summit, so maybe we wait to figure out what openstack's strategy around enterprise linux will be before we invest too heavily in any one option in the mirrors
19:15:39 <tonyb> Yeah.  I think that makes sense.
19:16:17 <clarkb> and also a reminder that there is a long tail of smaller impact cleanups that we can probably do (puppetlabs stuff, ceph stuff, python wheel stuff, etc)
19:16:44 <clarkb> based on what I see in grafana I suspect that adding trixie wouldn't be an issue
19:17:06 <tonyb> Please remind me to ask about puppet in open discussion
19:17:16 <clarkb> tonyb: will do
19:17:39 <clarkb> any other openafs cleanup related discussion?
19:18:36 <fungi> not from me
19:18:43 <clarkb> #topic Zuul Launcher Updates
19:19:11 <clarkb> The dual nic saga should be over now. This ended up being a bug in zuul itself that duplicated the network config when it was both supplied directly and inherited in particular ways
19:19:40 <clarkb> corvus fixed zuul and that got deployed and since then we dropped the config from clouds.yaml and added it to zuul's config and all seems to work at this point
19:20:07 <clarkb> this means we can control this stuff going forward without restarting the service to pick up clouds.yaml updates which is nice. We're also providing useful feedback to the zuul project as it transitions away from nodepool
19:20:44 <clarkb> Then separately we've got a bunch of ERROR'd nodes in rax-ord left over from nodepool errors that are counting against our quota; I emailed them about cleanup for those. Haven't heard anything on that yet and the nodes were still there yesterday when I checked
19:21:19 <clarkb> in that same email I asked about rax-dfw api performance/reliability since we disabled that region due to errors there. It's been long enough that I'm thinking we should reenable rax-dfw and see if the problem persists
19:21:46 <clarkb> corvus: ^ maybe we do that tomorrow after the openstack release is done (just so that we don't slow down the release process with rax-dfw grabbing node requests and needing to fail several times in a row before another provider can fulfill the request)
19:22:05 <clarkb> I'll make a note for that now as I think that is a reasonable thing to try once we're out of the way of the release
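For reference, a sketch of how the leaked ERROR nodes mentioned above can be audited with openstacksdk, assuming a clouds.yaml entry named rax-ord (adjust to whatever the real cloud entry is called):

    # Hedged sketch: list servers stuck in ERROR that still count against quota.
    # Assumes a clouds.yaml entry named "rax-ord"; adjust to the real cloud name.
    import openstack

    conn = openstack.connect(cloud="rax-ord")
    for server in conn.compute.servers():
        if server.status == "ERROR":
            print(server.id, server.name, server.created_at)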
19:22:38 <clarkb> And then this wasn't on the agenda but mnasiadka just pushed a change to add alma linux test nodes
19:23:18 <clarkb> I do think that we should be aware that means we'll have 3 different enterprise linux distros with ~5 different images on x86 alone
19:23:50 <clarkb> but as long as we're not investing in mirrors and the images build without trouble I don't think that is a major issue. It's the mirrors that create the huge overhead more than the images themselves
19:23:59 <fungi> the 3 being alma, rocky and ubuntu?
19:24:07 <clarkb> #link https://review.opendev.org/c/opendev/zuul-providers/+/962564
19:24:17 <clarkb> fungi: alma, rocky, centos stream
19:24:36 <fungi> oh, i didn't realize centos stream was considered "enterprise"
19:24:43 <fungi> sorry for the confusion
19:24:48 <clarkb> fungi: I think it continues to identify itself that way
19:24:58 <clarkb> like glean isn't doing special things to check for it I don't think
19:25:09 <clarkb> "enteprise linux" is a family of linuxes
19:25:26 <fungi> i guess "enterprise" means rhel derivatives, not necessarily server-targeted commercial distros
19:25:38 <clarkb> ya
19:25:42 <clarkb> per /etc/os-release iirc
19:25:51 <fungi> got it
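A small sketch of the family check being described, assuming a plain read of /etc/os-release rather than anything glean actually does: CentOS Stream, Rocky, and Alma all list rhel in ID_LIKE, which is what puts them in the "enterprise linux" family.

    # Hedged sketch: decide whether a host is in the "enterprise linux" family
    # by checking ID/ID_LIKE in /etc/os-release.
    def os_release(path="/etc/os-release"):
        data = {}
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and "=" in line:
                    key, _, value = line.partition("=")
                    data[key] = value.strip('"')
        return data

    info = os_release()
    family = {info.get("ID", "")} | set(info.get("ID_LIKE", "").split())
    print("enterprise linux family:", "rhel" in family)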
19:26:47 <clarkb> #topic Matrix for OpenDev comms
19:27:26 <clarkb> I have also not had time to dig into this. I feel like this deserves a long enough stretch of time that I can commit to helping people move and redirecting people, and that's why I haven't jumped on it. But maybe I should start thinking about the bootstrap steps and then worry about the migration as a distinct step
19:27:53 <clarkb> any other thoughts/concerns/input on matrix for communications?
19:29:16 <tonyb> Not from me.   I'll save it for the preptg
19:30:15 <clarkb> #topic Pre PTG Planning
19:30:24 <clarkb> this is happening next week!
19:30:32 <clarkb> Times: Tuesday October 7 1800-2000 UTC, Wednesday October 8 1500-1700 UTC, Thursday October 9 1500-1700 UTC
19:30:41 <clarkb> #link https://etherpad.opendev.org/p/opendev-preptg-october-2025 Planning happening in this document
19:30:49 <clarkb> we will also use the meetpad that maps to ^ for the location
19:31:43 <clarkb> if you have topic ideas feel free to add them
19:32:08 <clarkb> tonyb: ^ you may have something to add there based on your last message?
19:32:48 <tonyb> No I meant the questions I have about matrix for the preptg
19:32:53 <clarkb> aha got it
19:33:37 <clarkb> #topic Etherpad 2.5.0 Upgrade
19:33:42 <clarkb> 104.130.127.119 is a held node for testing. You need to edit /etc/hosts to point etherpad.opendev.org at that IP.
19:33:52 <clarkb> has anyone else had a chance to check the rendering of etherpad on this held node?
19:34:37 <clarkb> mostly I just want a sanity check before I go complaining upstream further
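For anyone testing, the /etc/hosts override described above is a single line like the following (remove it again once testing is done):

    104.130.127.119 etherpad.opendev.org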
19:34:43 <mnasiadka> clarkb: adding alma is mostly meant to let us use v2 nodes for EL10 testing, I don't know how centos stream should be treated in the future (sorry for posting under a different agenda topic)
19:35:21 <clarkb> mnasiadka: ya I think there are upsides to alma for sure. I just want us to be aware of the redundancy too, but as mentioned it sounds like there may be discussions around this at the summit so we can wait for that before trying to optimize within opendev
19:36:30 <clarkb> if someone manages to check that held etherpad please let me know what you see and think of the front page in particular. Checks of the pads too would be great
19:36:34 <clarkb> #topic Open Discussion
19:36:58 <clarkb> A reminder that if you're able to keep Friday evening of the summit clear, that is a night when I'll be able to meet up for an informal dinner
19:37:40 <tonyb> It's not looking good for me
19:37:50 <frickler> missed the announcements to note that I'll be offline next week and maybe longer
19:37:53 <clarkb> there is a marketplace mixer running until 6pm that evening so we can probably meet there and then find food afterwards
19:38:07 <clarkb> frickler: ack thank you for the heads up
19:38:20 <clarkb> tonyb: you wanted me to ask you about puppet during open discussion
19:39:01 <tonyb> My puppet question.   Who uses the various puppet repos like puppet-nova?
19:39:24 <clarkb> I think zigo's deployment tooling is built on top of those puppet modules
19:39:57 <tonyb> I noticed a bunch of open reviews to add the .gitreview etc. for stable branches
19:40:32 <tonyb> Okay.   Nothing more formal?
19:40:50 <fungi> https://governance.openstack.org/tc/reference/projects/puppet-openstack.html
19:41:09 <fungi> it's an official openstack project team
19:41:16 <clarkb> I think people use them but I don't think it's as widespread as kolla or openstack ansible
19:41:58 <tonyb> Noted
19:42:50 <clarkb> I do suspect that due to changes in the puppet ecosystem we're probably going to see fewer people using it over time
19:43:04 <clarkb> (specifically puppetlabs, now perforce, has decided to stop publishing builds of the software aiui)
19:44:13 <tonyb> Ahh okay.   I'll leave it for now.
19:44:17 <tonyb> Thanks
19:44:28 <clarkb> anything else before we call it a meeting?
19:44:48 * tonyb is done
19:44:54 <fungi> i've nothing else
19:45:03 <clarkb> Reminder that next week's meeting will be the pre ptg block on tuesday
19:45:09 <clarkb> so we won't be here we'll be on meetpad
19:45:12 <tonyb> Noted
19:45:20 <fungi> thanks clarkb!
19:45:28 <clarkb> This also means I won't send an agenda email. I will treat the pre ptg agenda as the agenda for that
19:45:38 <clarkb> and thank you to everyone who helps keep opendev up and running
19:45:44 <tonyb> ++
19:45:46 <clarkb> See you on meetpad next week
19:45:48 <clarkb> #endmeeting