19:00:59 <clarkb> #startmeeting infra
19:00:59 <opendevmeet> Meeting started Tue May 21 19:00:59 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:59 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:59 <opendevmeet> The meeting name has been set to 'infra'
19:01:06 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WQXQPYPP4PVET6PBVJH3FJISF5KPKGTU/ Our Agenda
19:01:39 <clarkb> #topic Announcements
19:02:11 <clarkb> The OpenInfra Summit CFP closes May 29. Get your submissions in before then (you have about a week from today).
19:02:20 <clarkb> #link https://openinfrafoundation.formstack.com/forms/openinfra_asia_summit_2024 OpenInfra Summit CFP Submission Page
19:03:45 <clarkb> #topic Upgrading Old Servers
19:04:13 <clarkb> I believe that tonyb has started poking at some wiki upgrade testing. I don't know how far along that has gotten or if any major concerns have shown up
19:04:24 <fungi> yeah, sounds like tony is making headway on a containerized mediawiki poc
19:04:32 <clarkb> Unfortunately I'm also not sure that tonyb will make it to the meeting today so we'll need to followup on that later
19:04:48 <clarkb> fungi: any idea if there are big issues yet or is it still looking straightforward
19:05:34 <fungi> based on what he said in irc, it sounds like mediawiki improved their upgrade story so we don't have to visit every single version between here and there, and we may be able to do it all on a single platform
19:05:44 <clarkb> oh nice
19:06:03 <fungi> and also our must-have extensions are all currently maintained
19:06:17 <fungi> the one question which came up was the canonical hostname
19:06:57 <fungi> in my opinion, since it's currently used by more projects than just openstack, if we're going to be maintaining the server going forward then it ought to be wiki.opendev.org with a permanent redirect from the old name
19:07:22 <clarkb> yes I think the only reason we haven't done that sooner is it's been in a limbo dead end state for a while
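[For context, a permanent redirect like the one fungi describes would be a small apache vhost change; this is only a sketch of the idea — nothing had been decided or written at this point, and the actual hostnames/config layout in system-config may differ:]

```apache
# Hypothetical sketch: serve the wiki at wiki.opendev.org and
# permanently redirect the old openstack name to it
<VirtualHost *:443>
    ServerName wiki.openstack.org
    # 301 so links, bookmarks, and search engines follow the move
    Redirect permanent / https://wiki.opendev.org/
</VirtualHost>
```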
19:07:43 <fungi> which also means we can completely not care about porting the current openstack theme to the latest version, just start with the default mediawiki theme and see if anyone feels up for making an opendev one later
19:07:57 <frickler> +1
19:08:00 <clarkb> ++ I wouldn't worry too much about fancy theming
19:08:53 <fungi> yes, in my opinion (as someone not doing most of the work but expecting to review it at least), cosmetic concerns are unimportant for this
19:09:27 <fungi> having something that works well enough that we can tear down the old server once and for all is the priority
19:09:55 <clarkb> big thank you to tonyb for looking into that
19:09:55 <fungi> and it gets us one server away from being able to finally kiss puppet goodbye (i think?)
19:09:59 <clarkb> yup
19:10:08 <clarkb> though it was only lightly puppeted since we never did it properly
19:10:15 <clarkb> but yes it removes one more server from that era
19:10:15 <fungi> just think of how much we're going to be able to delete!
19:10:29 <fungi> great point, we're not puppeting it anyway
19:10:48 <fungi> we could, i guess, rip puppet out as soon as cacti is done
19:11:08 <fungi> (either upgraded to ansible+docker or replaced by prometheus)
19:12:02 <clarkb> yup
19:12:06 <clarkb> #topic AFS Mirror Cleanups
19:12:28 <clarkb> Moving on to the next topic devstack-gate did get retired which is one less thing relying on xenial
19:12:50 <clarkb> The next step is going to be clearing projects out of the zuul config I think then deleting xenial job configs from whatever is left over
19:12:59 <clarkb> I haven't started on that but hope to be able to do so this week
19:13:02 <fungi> and no shouting mobs with torches and pitchforks on this one
19:13:26 <clarkb> there was one small issue with devstack listing devstack-gate as a required-project on a base job when it didn't need to do so. Fixing that fixed the jobs and all was well after
19:13:44 <clarkb> I'm hopeful that xenial can be cleaned up by early/mid june
19:13:56 <clarkb> just have to keep up with the slow and steady progress removing things
19:14:39 <clarkb> #topic Building Ubuntu Noble Test Nodes
19:15:08 <clarkb> over the last few days fungi has managed to push noble package mirroring into our mirrors for x86-64 and arm64
19:15:47 <clarkb> I think the next step is to ask nodepool to build the images. I would probably start with x86-64 again, just because that way if there are problems we can fix them on the easier-to-debug arch (I assume it's easier, anyway, as I can boot images locally on x86-64 for example) before trying arm64
19:15:53 <fungi> we've got a bit of breathing room on the volume quotas for those
19:16:05 <fungi> not a ton, but looks safe for a while
19:16:17 <clarkb> oh right I also suggested we could trim them back after we synced up with upstream
19:16:19 <fungi> should tide us over until xenial gets deleted at least
19:16:33 <clarkb> but during the initial sync I felt it better to far overestimate than come up short
19:16:41 <fungi> i don't think i'd trim them back, looking at the graphs
19:16:51 <clarkb> ack
19:16:55 <fungi> your overestimates were a good idea
19:17:27 <clarkb> does anyone know if we have a change yet to have nodepool build a noble image? If not I can probably push one up today
19:17:43 <fungi> we do not yet, i was about to ask if i should push one
19:17:44 <frickler> haven't seen one
19:17:59 <clarkb> fungi: go for it
19:18:06 <fungi> will do right after the meeting
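[The change fungi offers to push would add a noble diskimage to the nodepool config; as a rough sketch only — the real change hadn't been written yet, and the element list here is assumed from the pattern of the existing ubuntu-jammy entries:]

```yaml
# Hypothetical sketch mirroring the existing ubuntu-jammy diskimage
# entry, swapping in the new release codename
- diskimage:
    name: ubuntu-noble
    elements:
      - ubuntu-minimal        # element names assumed from prior entries
      - infra-package-needs
    env-vars:
      DIB_RELEASE: noble      # diskimage-builder picks the release from this
```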
19:18:37 <clarkb> The other thing to make note of is dansmith reported that newer ansible is required for sshing into noble nodes. I believe because noble has python3.12 on it and older ansible doesn't play nice with that version
19:18:57 <clarkb> this isn't a big issue for us as zuul does support a new enough ansible as far as I can tell, but it isn't the default version zuul will select
19:19:09 <clarkb> a good reminder that we should probably bump up the default in our zuul tenants as part of this effort
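[Bumping the default clarkb mentions is a one-line tenant config change; a hedged sketch — the tenant name is illustrative and the exact version to pin was not decided in the meeting:]

```yaml
# Hypothetical sketch: raise the tenant-wide default so jobs get an
# Ansible release that copes with python3.12 on noble nodes
- tenant:
    name: example-tenant        # illustrative; real tenants differ
    default-ansible-version: "9"  # version number assumed, not decided here
```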
19:20:06 <clarkb> all in all good progress and we should have test nodes soon
19:20:15 <clarkb> #topic Gerrit 3.9 Upgrade Planning
19:20:28 <clarkb> #link https://etherpad.opendev.org/p/gerrit-upgrade-3.9 Upgrade prep and process notes
19:21:02 <clarkb> I managed to test Gerrit upgrades with our updated images and also tested the revert process
19:21:10 <clarkb> not sure if that happened before or after our last meeting.
19:21:23 <clarkb> Overall it seems straightforward and I didn't find any problems that make me hesitate.
19:21:47 <clarkb> Does May 31, 2024, let's say from 1600-1700, work for an upgrade? If so I can announce that today
19:22:28 <fungi> i'll be on my way to visit family, driving all day, but don't let that stop you
19:23:29 <clarkb> ok maybe I'll wait for tonyb to weigh in before announcing anything as I suspect he'll be primary backup in that case (I don't expect frickler to work on a friday evening)
19:23:42 <clarkb> as an alternative we could do the 30th I suppose
19:24:28 <clarkb> anyway I'll followup with tonyb before making a decision
19:24:46 <clarkb> frickler: I suppose if you are interested I could also wake up early one day. Just let me know
19:24:57 <clarkb> #topic Linaro Cloud SSL Cert Renewal
19:25:06 <clarkb> It has apparently been a couple of months since we last did this (time flies)
19:25:17 <fungi> yeah, just saw the first notification this morning
19:25:27 <clarkb> We are getting warning emails that we have just under a month to rerun the LE acme.sh renewal and move certs into place
19:25:39 <clarkb> The process should be properly documented now /me finds a link
19:25:58 <clarkb> #link https://docs.opendev.org/opendev/system-config/latest/letsencrypt.html#linaro-arm64-cloud-cert-renewal
19:26:11 <clarkb> anyone interested in running through that and renewing the cert?
19:26:51 <clarkb> that can be done at any time as the script should be set up to renew with less than 30 days of life remaining. And its a free cert so no costs involved
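[The renewal run itself follows the linked docs; as a rough illustration of the 30-day expiry check the warning emails are keyed off, here is a self-contained openssl sketch using a throwaway self-signed cert as a stand-in for the real one — nothing here is the actual renewal procedure:]

```shell
# Generate a throwaway self-signed cert expiring in 20 days
# (stand-in for the real Linaro cloud cert)
openssl req -x509 -newkey rsa:2048 -keyout /tmp/demo.key -out /tmp/demo.crt \
  -days 20 -nodes -subj "/CN=example.opendev.org" 2>/dev/null
# Show the expiry date
openssl x509 -enddate -noout -in /tmp/demo.crt
# -checkend exits non-zero if the cert expires within the given number
# of seconds (2592000s = 30 days), matching the renewal threshold
openssl x509 -checkend 2592000 -noout -in /tmp/demo.crt || echo "renewal needed"
```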
19:27:28 <fungi> not volunteering yet, but might as the deadline draws closer if nobody else is up for it
19:28:00 <clarkb> ack I'm in a similar boat. Have a few things to catch up on between gerrit upgrades, xenial cleanup, any noble fallout, etc, but I should be able to make time for it later if it becomes more urgent
19:28:25 <clarkb> #topic Open Discussion
19:28:50 <clarkb> Small note that gitea still hasn't produced a 1.22 release yet. I'm waiting for that in order to have the updated db doctor fixup tool
19:28:58 <clarkb> Anything else?
19:30:23 <fungi> nothing springs to mind
19:30:41 <frickler> just mentioning the new zuul config errors again
19:31:04 <clarkb> that's related to the negative regex and re2 stuff?
19:31:23 <frickler> triggered by files/irrelevant-files now also triggering warnings, yes
19:31:43 <clarkb> aha that's the bit I was missing, the rule is being applied to more places, but it is still a warning and not a true error yet right?
19:31:56 <frickler> afaict yes
19:32:46 <frickler> https://review.opendev.org/c/zuul/zuul/+/916141 was the change in zuul
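[The warnings come from Zuul's move to re2, which cannot compile Perl-style negative lookahead; with the linked change that check also covers files/irrelevant-files matchers. A hypothetical before/after — job name and paths are illustrative only:]

```yaml
# Before (now warns): re2 cannot compile negative lookahead
- job:
    name: example-job           # hypothetical job name
    files:
      - ^(?!docs/).*$
# After: express the negation with irrelevant-files instead
- job:
    name: example-job
    irrelevant-files:
      - ^docs/.*$
```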
19:32:47 <fungi> also i brought up something in the zuul matrix about the impact of tenant removal on jobs added to pipelines with the removed project as a required-projects entry being silently removed, took some projects by surprise when they approved a change and only their docs job ran before merging
19:33:04 <clarkb> oh I should've mentioned this when discussing the gerrit upgrade. Gerrit upstream made some bugfix releases since the last time I tested things. I pushed a change to update our images to those updates: https://review.opendev.org/c/opendev/system-config/+/920115
19:33:25 <clarkb> ideally we land that soonish and restart gerrit on the new image (tomorrow?) and then I can do some quick retesting of the upgrade between those image builds
19:33:36 <fungi> sgtm
19:33:59 <clarkb> fungi: I was thinking about that behavior and I think it is similar to the expectation that deleting a job from say stable/foo will stop running that job on stable/foo for another project
19:34:13 <clarkb> this has been historically useful for grenade cleanups. Though I'm not sure that is how grenade is stopped against old branches now
19:34:24 <fungi> yeah, i'm not sure how best to guard against it
19:34:36 <clarkb> well in the grenade case it was actually desirable behavior
19:34:57 <clarkb> but also I'm not sure that is how the grenade jobs are managed any longer due to the longer stable timeframes and the handover to extended maintenance or whatever it is called now
19:37:27 <clarkb> sounds like that may be everything
19:37:35 <clarkb> Thank you for your time and I'll let you have some of it back now
19:37:38 <clarkb> #endmeeting