19:00:33 #startmeeting infra
19:00:33 Meeting started Tue Sep 2 19:00:33 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:33 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:33 The meeting name has been set to 'infra'
19:00:43 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/AYKNDGLH46IV3N5NI2BBSVMYMI6W4MQP/ Our Agenda
19:00:47 #topic Announcements
19:01:10 There is a matrix.org outage right now which I think may be impacting the irc bridge to oftc as well as any matrix accounts hosted by matrix.org (like mine)
19:01:30 https://status.matrix.org/ is tracking the issue if you want to know when things return to normal
19:01:57 Also fungi is out today so this meeting may just be me running through the agenda
19:02:09 but feel free to jump in on any topics if there is anything to share and you are following along
19:02:44 #topic Gerrit 3.11 Upgrade Planning
19:02:58 I don't have any updates on this item. I've been distracted by other things lately
19:03:54 #link https://review.opendev.org/c/opendev/system-config/+/957555 Gerrit image updates for bugfix releases
19:04:13 this change could still use reviews though and I'll land it amongst the other container update changes when it's convenient to restart gerrit
19:04:21 #topic Upgrading old servers
19:04:47 As mentioned last week fungi managed to update most of the openafs cluster to jammy and I expect when he gets back that we will continue that effort all the way to noble
19:05:11 i'll pick the afs/kerberos upgrades back up toward the end of this week once i'm home
19:05:12 this is major progress in getting off of old ubuntu releases and onto more modern stuff
19:05:18 fungi: thanks!
19:05:32 then the major remaining nodes on the list are the backup servers and the graphite server
19:06:12 Help continues to be very much appreciated if anyone else is able to dig into this
19:06:48 #topic Matrix for OpenDev comms
19:06:54 #link https://review.opendev.org/c/opendev/infra-specs/+/954826 Spec outlining the motivation and plan for Matrix trialing
19:07:01 timely
19:07:04 followup reviews on the spec are where we're sort of treading water
19:07:23 the bridge appears to be broken right now
19:07:25 and yes the matrix.org outage may provide new thoughts/ideas on this spec if you want to wait and see how that gets resolved
19:07:46 corvus_: yes I think anything hosted by matrix.org (including my account and the oftc irc bridge) is not working with matrix right now
19:07:59 https://status.matrix.org/ is tracking the outage
19:08:36 reviews still very much welcome and I understand if we want to wait and see some further triage/resolution of the current issue before doing so
19:08:48 yeah, my weechat is logging repeated connection refusal errors from the matrix.org servers
19:09:15 i only noticed because of this meeting
19:09:27 yes my browser was getting 429s from cloudflare. I suspect they've configured limits quite low to ease traffic on the backend while they fix it
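A quick command-line way to reproduce the symptoms described above, as a rough sketch rather than any official procedure: matrix.org serves the standard, unauthenticated Matrix client-server versions endpoint, so a plain curl shows whether the homeserver is answering at all. The 10 second timeout is an arbitrary choice.

    # Probe the matrix.org homeserver; during the outage expect a timeout,
    # a refused connection, or an HTTP 429 from the fronting CDN
    curl -sS -m 10 -o /dev/null -w "%{http_code}\n" https://matrix.org/_matrix/client/versions
    # https://status.matrix.org/ remains the authoritative status page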
19:10:14 #topic Pre PTG Planning
19:10:20 #link https://etherpad.opendev.org/p/opendev-preptg-october-2025 Planning happening in this document
19:10:25 Times: Tuesday October 7 1800-2000 UTC, Wednesday October 8 1500-1700 UTC, Thursday October 9 1500-1700 UTC
19:10:33 This will replace our team meeting on October 7
19:10:47 please add discussion topics to the agenda on that etherpad
19:10:59 and I'll see you there in just over a month
19:11:14 #topic Loss of upstream Debian bullseye-backports mirror
19:11:22 Zuul-jobs will no longer enable debian backports by default on September 9
19:11:26 #link https://lists.zuul-ci.org/archives/list/zuul-announce@lists.zuul-ci.org/thread/NZ54HYFHIYW3OILYYIQ72L7WAVNSODMR/
19:11:55 Once zuul-jobs' default is updated then we'll be able to delete the debian bullseye backports repo from our mirror and drop our workaround
19:12:23 just waiting for sufficient time to pass since this was announced on the zuul-announce list
19:13:30 #topic Etherpad 2.5.0 Upgrade
19:13:36 #link https://github.com/ether/etherpad-lite/blob/v2.5.0/CHANGELOG.md
19:13:39 regarding matrix: btw the opendev ems instance is working (gerritbot msgs are going through) and eavesdrop is logging
19:13:47 ack
19:14:10 etherpad claims the 2.5.0 release fixes our problems. I still think the root page's css is weird but the js errors related to 2.4.2 did go away
19:14:17 #link https://review.opendev.org/c/opendev/system-config/+/956593/
19:14:23 104.130.127.119 is a held node for testing. You need to edit /etc/hosts to point etherpad.opendev.org at that IP.
19:14:44 if you want to test you can punch that ip into /etc/hosts and check out the root page locally as well as the clarkb-test pad (or any other pad you create)
19:15:15 again I don't think this is urgent. Mostly looking for feedback on whether we think this is workable so that we don't fall behind, or whether we should keep pestering them to fix things better
19:16:13 #topic Moving OpenDev's python-base/python-builder/uwsgi-base Images to Quay
19:16:47 last week corvus_ suggested that the way to not wait forever on merging this change is to prep changes to update all the child images as reminders that those need updating at some point, then proceed with moving the base image publication location
19:16:56 I did propose those changes and it caught a problem!
19:17:13 Turns out that to use speculative images via the buildset registry when building with docker, we always need to build with a custom buildx builder
19:17:50 earlier this year I had changed image building to use podman by default in system-config. I don't remember the details but I suspect I ran into this problem and just didn't track it down fully and this was the out. The problem with that is multiarch builds
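As a rough, hand-run illustration of the buildx point above (not the exact zuul-jobs implementation, which is what 958783 covers): the default builder in the docker daemon does not pick up the per-builder registry wiring that speculative builds need, whereas a docker-container builder does, and that same builder type is also what multi-arch builds require. The builder name, image tag, and platform list below are placeholders.

    # Create and select a docker-container buildx builder
    docker buildx create --name example-builder --driver docker-container --use
    # Build for multiple architectures; --push is used because the docker-container
    # driver typically cannot --load a multi-platform result into the local daemon
    docker buildx build --platform linux/amd64,linux/arm64 -t quay.io/example/python-base:latest --push .
    # In CI, zuul-jobs additionally points this builder at the buildset registry so
    # speculative parent images are consumed; that wiring is what 958783 standardizes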
19:18:28 it's actually probably better for us to keep using docker to build images and switch to the custom buildx builder for all image builds so that single arch and multiarch builds use the same toolchain (podman doesn't support multiarch in our jobs yet but the underlying tool does)
19:18:43 #link https://review.opendev.org/c/zuul/zuul-jobs/+/958783 Always build docker images with custom buildx builder
19:18:58 this change updates zuul-jobs to do that for everyone as it plays nice with speculative image builds
19:19:33 so I think the rough plan here for moving base images to quay is: land that zuul-jobs change, then move the base images to quay, then update the child images to both build with docker and pull base images from quay
19:20:05 that zuul-jobs change has a child followup change that adds testing to zuul-jobs to cover all of this too so we should be good moving forward and any regressions should be caught early
19:20:38 #topic Adding Debian Trixie Base Python Container Images
19:20:50 Then once base images move to quay we can also add trixie-based python container images
19:20:55 #link https://review.opendev.org/c/opendev/system-config/+/958480
19:21:39 with plans to worry about python3.13 after trixie is in place
19:21:53 just to keep the total number of images we're juggling to a reasonable number
19:22:27 #topic Dropping Ubuntu Bionic Test Nodes
19:22:51 After last week's meeting I think I convinced myself we don't need to do any major announcements for Bionic cleanups yet. Mostly because the release is long EOL at this point
19:22:57 #link https://review.opendev.org/q/hashtag:%22drop-bionic%22+status:open
19:23:22 I did write a few changes to continue to remove opendev's dependence on bionic under that hashtag. I think all of those changes are likely quick reviews and easy approvals
19:24:00 and dropping releases like bionic reduces the total storage used in openafs which should make things like upgrading openafs servers easier
19:24:17 (though I'm not sure we'll get this done before fungi completes the upgrades. I'm just trying to justify the effort generally)
19:25:38 #topic Temporary Shutdown of raxflex sjc3 for provider maintenance window
19:26:05 last week Rackspace notified us via email that the cinder volume backing the rax flex sjc3 mirror would undergo maintenance tomorrow from 10:30am to 12:30pm central time
19:26:10 #link https://review.opendev.org/c/opendev/zuul-providers/+/959200
19:26:46 this change disables this region in zuul launcher so that I can safely shut down the mirror while they do that work. My plan is to approve that change after lunch today and then manually shut down the mirror before EOD
19:26:58 that should be plenty of time for running jobs to complete
19:27:18 then tomorrow after the maintenance window completes I can start the mirror back up again and revert 959200
19:28:20 #topic Fixing Zuul's Trixie Image Builds
19:28:38 This item wasn't on the agenda (my bad) but it was pointed out that our Trixie images are still actually debian testing
19:28:52 #link https://review.opendev.org/c/opendev/zuul-providers/+/958561 Build actual Trixie now that it is released
19:29:03 958561 will fix that but depends on a DIB update
19:29:27 did anyone else want to review the DIB update? I'm thinking I may approve that one today with mnasiadka's review as the sole +2 in order to not let this problem fester for too long
19:29:58 #topic Open Discussion
19:30:11 And with that we have reached the end of the agenda. Anything else?
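Circling back to the etherpad 2.5.0 held node from 19:14 for anyone who wants to test: the local check is just a temporary hosts override. A minimal sketch, assuming 104.130.127.119 is still the held node; its certificate will likely not validate, so expect a browser warning.

    # Point etherpad.opendev.org at the held node, then browse to
    # https://etherpad.opendev.org/ and https://etherpad.opendev.org/p/clarkb-test
    echo "104.130.127.119 etherpad.opendev.org" | sudo tee -a /etc/hosts
    # Remove that line again when finished so the name resolves to the production server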
19:31:23 I know I kinda speedran through that but with fungi out, corvus impacted by matrix bridging issues, and frickler and tonyb not typically attending I figured I should just get through it
19:31:40 I'll leave the floor open until 19:35 UTC then call it a meeting if nothing comes up
19:31:52 as always feel free to continue any discussion on the mailing list or in #opendev
19:32:40 finishing the afs/kerberos upgrades shouldn't take long, btw, it's fairly mechanical now and hopefully i can have the rw volume migration back to the noble afs01.dfw going by the weekend
19:32:48 fyi there's a zuul-scheduler memory leak, but i think i have a fix
19:33:01 we'll probably need to restart the schedulers tomorrow whether or not it lands
19:33:06 corvus_: oh right that came up on Friday and over the weekend
19:33:15 that was the root cause of the connection issues last week?
19:33:22 #link https://review.opendev.org/959228 fix zuul-scheduler memory leak
19:33:27 yeah i think so
19:33:44 i mean, this is all well-informed supposition, not hard proof
19:34:01 I've been following along just didn't have thoughts
19:34:02 but i think we're at "fix obvious things first" and if stuff is still broken, dig deeper.
19:34:11 corvus_: sounds good
19:34:21 I'll make a note now for tomorrow to restart schedulers
19:35:25 and we're at the time I noted we'd end. Thank you everyone!
19:35:35 We should be back here at the same time and location next week
19:35:38 see you then
19:35:43 thanks clarkb !
19:35:44 #endmeeting