Tuesday, 2025-09-02

clarkbmeeting time19:00
clarkb#startmeeting infra19:00
opendevmeetMeeting started Tue Sep  2 19:00:33 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:00
opendevmeetThe meeting name has been set to 'infra'19:00
clarkb#link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/AYKNDGLH46IV3N5NI2BBSVMYMI6W4MQP/ Our Agenda19:00
clarkb#topic Announcements19:00
clarkbThere is a matrix.org outage right now which I think may be impacting the irc bridge to oftc as well as any matrix accounts hosted by matrix.org (like mine)19:01
clarkbhttps://status.matrix.org/ is tracking the issue if you want to know when things return to normal19:01
clarkbAlso fungi is out today so this meeting may just be me running through the agenda19:01
clarkbbut feel free to jump in on any topics if there is anything to share and you are following along19:02
clarkb#topic Gerrit 3.11 Upgrade Planning19:02
clarkbI don't have any updates on this item. I've been distracted by other things lately19:02
clarkb#link https://review.opendev.org/c/opendev/system-config/+/957555 Gerrit image updates for bugfix releases19:03
clarkbthis change could still use reviews though and I'll land it amongst the other container update changes when it's convenient to restart gerrit19:04
clarkb#topic Upgrading old servers19:04
clarkbAs mentioned last week fungi managed to update most of the openafs cluster to jammy and I expect when he gets back that we will continue that effort all the way to noble19:04
fungii'll pick the afs/kerberos upgrades back up toward the end of this week once i'm home19:05
clarkbthis is major progress in getting off of old ubuntu releases and onto more modern stuff19:05
clarkbfungi: thanks!19:05
clarkbthen the major remaining nodes on the list are backup servers and the graphite server19:05
clarkbHelp continues to be very much appreciated if anyone else is able to dig into this19:06
clarkb#topic Matrix for OpenDev comms19:06
clarkb#link https://review.opendev.org/c/opendev/infra-specs/+/954826 Spec outlining the motivation and plan for Matrix trialing19:06
corvus_timely19:07
clarkbfollowup reviews on the spec are where we're sort of treading water19:07
corvus_the bridge appears to be broken right now19:07
clarkband yes the matrix.org outage may provide new thoughts/ideas on this spec if you want to wait and see how that gets resolved19:07
clarkbcorvus_: yes I think anything hosted by matrix.org (including my account and the oftc irc bridge) is not working with matrix right now19:07
clarkbhttps://status.matrix.org/ is tracking the outage19:07
clarkbreviews still very much welcome and I understand if we want to wait and see some further triage/resolution on the current issue before doing so19:08
fungiyeah, my weechat is logging repeated connection refusal errors from the matrix.org servers19:08
corvus_i only noticed because of this meeting19:09
clarkbyes my browser was getting 429s from cloudflare. I suspect they've configured limits quite low to ease traffic on the backend while they fix it19:09
clarkb#topic Pre PTG Planning19:10
clarkb#link https://etherpad.opendev.org/p/opendev-preptg-october-2025 Planning happening in this document19:10
clarkbTimes: Tuesday October 7 1800-2000 UTC, Wednesday October 8 1500-1700 UTC, Thursday October 9 1500-1700 UTC19:10
clarkbThis will replace our team meeting on October 719:10
clarkbplease add discussion topics to the agenda on that etherpad19:10
clarkband I'll see you there in just over a month19:10
clarkb#topic Loss of upstream Debian bullseye-backports mirror19:11
clarkbZuul-jobs will no longer enable debian backports by default on September 919:11
clarkb#link https://lists.zuul-ci.org/archives/list/zuul-announce@lists.zuul-ci.org/thread/NZ54HYFHIYW3OILYYIQ72L7WAVNSODMR/19:11
clarkbOnce zuul-jobs' default is updated then we'll be able to delete the debian bullseye backports repo from our mirror and drop our workaround19:11
clarkbjust waiting for sufficient time to pass since this was announced on the zuul announce list19:12
clarkb#topic Etherpad 2.5.0 Upgrade19:13
clarkb#link https://github.com/ether/etherpad-lite/blob/v2.5.0/CHANGELOG.md19:13
corvus_regarding matrix: btw the opendev ems instance is working (gerritbot msgs are going through) and eavesdrop is logging19:13
clarkback19:13
clarkbetherpad claims the 2.5.0 release fixes our problems. I still think the root page's css is weird but the js errors related to 2.4.2 did go away19:14
clarkb#link https://review.opendev.org/c/opendev/system-config/+/956593/19:14
clarkb104.130.127.119 is a held node for testing. You need to edit /etc/hosts to point etherpad.opendev.org at that IP.19:14
clarkbif you want to test you can punch that ip into /etc/hosts and check out the root page locally as well as the clarkb-test pad (or any other pad you create)19:14
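A minimal sketch of that /etc/hosts override, assuming a Linux or macOS test machine and the held node IP above (remove the entry again once testing is done):
    echo "104.130.127.119 etherpad.opendev.org" | sudo tee -a /etc/hosts
    # then browse to https://etherpad.opendev.org/ and https://etherpad.opendev.org/p/clarkb-test
    # afterwards, delete the added line from /etc/hosts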
clarkbagain I don't think this is urgent. Mostly looking for feedback on whether we think this is workable so that we don't fall behind, or whether we need to continue to pester them to fix things19:15
clarkb#topic Moving OpenDev's python-base/python-builder/uwsgi-base Images to Quay19:16
clarkblast week corvus_ suggested that the way to not wait forever on merging this change is to prep changes to update all the child images as reminders that those need updating at some point, then proceed with moving the base image publication location19:16
clarkbI did propose those changes and it caught a problem!19:16
clarkbTurns out that to use speculative images via the buildset registry when building with docker, we always need to build with a custom buildx builder19:17
clarkbearlier this year I had changed image building to use podman by default in system-config. I don't remember the details but I suspect I ran into this problem and just didn't track it down fully and this was the out. The problem with that is multiarch builds19:17
clarkbit's actually probably better for us to keep using docker to build images and switch to the custom buildx builder for all image builds so that single arch and multiarch builds use the same toolchain (podman doesn't support multiarch in our jobs yet but the underlying tool does)19:18
clarkb#link https://review.opendev.org/c/zuul/zuul-jobs/+/958783 Always build docker images with custom buildx builder19:18
clarkbthis change updates zuul-jobs to do that for everyone as it plays nice with speculative image builds19:18
clarkbso I think the rough plan here for moving base images to quay is land that zuul-jobs change, then move base images to quay, then update the child images to both build with docker and pull base images from quay19:19
clarkbthat zuul-jobs change has a child followup change that adds testing to zuul-jobs to cover all of this too so we should be good moving forward and any regressions should be caught early19:20
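A rough sketch of the custom buildx builder approach described above, assuming plain docker on a build node; the zuul-jobs roles additionally wire the buildset registry into the builder's configuration so speculative parent images get picked up, and the names here are illustrative:
    # create a builder using the docker-container driver and make it the default
    docker buildx create --name opendev-builder --driver docker-container --use
    # build with that builder; --load writes the result back to the local image store
    docker buildx build --load -t example/python-base:latest .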
clarkb#topic Adding Debian Trixie Base Python Container Images19:20
clarkbThen once base images move to quay we can also add trixie based python container images19:20
clarkb#link https://review.opendev.org/c/opendev/system-config/+/95848019:20
clarkbwith plans to worry about python3.13 after trixie is in place19:21
clarkbjust to keep the total number of images we're juggling to a reasonable number19:21
clarkb#topic Dropping Ubuntu Bionic Test Nodes19:22
clarkbAfter last week's meeting I think I convinced myself we don't need to do any major announcements for Bionic cleanups yet. Mostly because the release is long EOL at this point19:22
clarkb#link https://review.opendev.org/q/hashtag:%22drop-bionic%22+status:open19:22
clarkbI did write a few changes to continue to remove opendev's dependence on bionic under that hashtag. I think all of those changes are likely quick reviews and easy approvals19:23
clarkband dropping releases like bionic reduces the total storage used in openafs which should make things like upgrading openafs servers easier19:24
clarkb(though I'm not sure we'll get this done before fungi completes the upgrades. I'm just trying to justify the effort generally)19:24
clarkb#topic Temporary Shutdown of raxflex sjc3 for provider maintenance window19:25
clarkblast week rackspace notified us via email that the cinder volume backing the rax flex sjc3 mirror would undergo maintenance tomorrow from 10:30am to 12:30pm central time19:26
clarkb#link https://review.opendev.org/c/opendev/zuul-providers/+/95920019:26
clarkbthis change disables this region in zuul launcher so that I can safely shut down the mirror while they do that work. My plan is to approve that change after lunch today and then manually shut down the mirror before EOD19:26
clarkbthat should be plenty of time for running jobs to complete19:26
clarkbthen tomorrow after the maintenance window completes I can start the mirror back up again and revert 95920019:27
clarkb#topic Fixing Zuul's Trixie Image Builds19:28
clarkbThis item wasn't on the agenda (my bad) but it was pointed out that our Trixie images are still actually debian testing19:28
clarkb#link https://review.opendev.org/c/opendev/zuul-providers/+/958561 Build actual Trixie now that it is released19:28
clarkb958561 will fix that but depends on a DIB update19:29
clarkbdid anyone else want to review the DIB update? I'm thinking I may approve that one today with mnasiadka's review as the sole +2 in order to not let this problem fester for too long19:29
clarkb#topic Open Discussion19:29
clarkbAnd with that we have reached the end of the agenda. Anything else?19:30
clarkbI know I kinda speedran through that but with fungi out, corvus impacted by matrix bridging issues, and frickler and tonyb not typically attending I figured I should just get through it19:31
clarkbI'll leave the floor open until 19:35 UTC then call it a meeting if nothing comes up19:31
clarkbas always feel free to continue any discussion on the mailing list or in #opendev19:31
fungifinishing the afs/kerberos upgrades shouldn't take long, btw, it's fairly mechanical now and hopefully i can have the rw volume migration back to the noble afs01.dfw going by the weekend19:32
corvus_fyi there's a zuul-scheduler memory leak, but i think i have a fix19:32
corvus_we'll probably need to restart the schedulers tomorrow whether or not it lands19:33
clarkbcorvus_: oh right that came up on Friday and over the weekend19:33
fungithat was the root cause for the connection issues last week?19:33
corvus_#link https://review.opendev.org/959228 fix zuul-scheduler memory leak19:33
corvus_yeah i think so19:33
corvus_i mean, this is all well-informed supposition, not hard proof19:33
tonybI've been following along just didn't have thoughts 19:34
corvus_but i think we're at "fix obvious things first" and if stuff is still broken, dig deeper.19:34
clarkbcorvus_: sounds good19:34
clarkbI'll make a note now for tomorrow to restart schedulers19:34
clarkband we're at the time I noted we'd end. Thank you everyone!19:35
clarkbWe should be back here at the same time and location next week19:35
clarkbsee you then19:35
corvus_thanks clarkb !19:35
clarkb#endmeeting19:35
opendevmeetMeeting ended Tue Sep  2 19:35:44 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:35
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2025/infra.2025-09-02-19.00.html19:35
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-09-02-19.00.txt19:35
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2025/infra.2025-09-02-19.00.log.html19:35
clarkbcorvus_: the zuul fix lgtm. I suppose if simon reviews it before my day starts tomorrow that may be ready for manual restarts tomorrow?19:36
corvus_clarkb: yep, though i feel that for this particular change, simon would probably be fine if we just merged it and got a jump on validating it as a fix.  :)19:40
clarkbcorvus_: that also works for me if you want to approve it now19:48
