19:00:04 <clarkb> #startmeeting infra
19:00:04 <opendevmeet> Meeting started Tue Apr 22 19:00:04 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:04 <opendevmeet> The meeting name has been set to 'infra'
19:00:10 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/6CDVW2M7T3K4QDPYT2TKHMPIHN7TSGV6/ Our Agenda
19:00:47 <clarkb> #topic Announcements
19:01:17 <clarkb> I will be travelling May 6th and unable to host the regular meeting. (that is two weeks from today)
19:01:28 <clarkb> I think we can either skip that one or someone else can host
19:01:41 <clarkb> That was all I had to announce. Anything else?
19:01:46 <fungi> i'll be travelling as well
19:01:50 <fungi> on that date
19:02:31 <fungi> probably worth noting that the both of us will be fairly distracted that entire week with in-person meetings
19:02:58 <clarkb> ya I guess we'll be mostly out Monday-Friday that week
19:03:06 <clarkb> directly impacting this meeting but potentially other things
19:03:12 <fungi> but still in approximately the same time zones and can probably still do things in an emergency
19:03:53 <fungi> (with some latency)
19:04:29 <clarkb> If that is all for announcements I think we can dive into the agenda
19:04:35 <clarkb> #topic Zuul-launcher image builds
19:04:37 <corvus> canceling that mtg sounds good
19:05:01 <clarkb> corvus: sent email asking for volunteers to help with image builds and now we have volunteers \o/
19:05:26 <fungi> it's like magic
19:05:35 <corvus> we got some volunteers!
19:05:48 <clarkb> then unrelated to ^ one thing corvus noticed dogfooding with zuul is that nodepool sets the private_ip vars to the public_ip values if there is no private_ip
19:05:52 <fungi> (very formidable magic)
19:06:06 <clarkb> this was originally done to make it easy for openstack testing to avoid using NAT and always use the "private ip"
19:06:33 <corvus> mnasiadka is working on arm64 stuff
19:06:43 <corvus> #link arm64 image builds https://review.opendev.org/c/opendev/zuul-providers/+/947841
19:07:10 <clarkb> however other users may find this behavior less useful or counterproductive so for niz we discussed having public_ip* and private_ip* always be the respective public and private values then we can have additional vars like interface IP for this is how you access the test node from the outside world and maybe local_ip for interfaces that avoid NAT
19:07:37 <clarkb> so there may be a small API change that jobs will need to accommodate as we shift from nodepool to niz
19:07:43 <corvus> oh we're talking about the other thing?
19:07:44 <corvus> neil volunteered to work on rocky image builds
19:07:59 <clarkb> sorry I just started braindumping. Feel free to proceed on the volunteers and we can discuss IPs after
19:08:54 <corvus> on the private_ip thing -- i advocate having the nodepool vars maintain consistent behavior, then in the future, switch to using new zuul vars that have different behavior
19:09:13 <fungi> that makes sense
19:09:34 <fungi> basically backward-compatible but deprecated nodepool vars
19:09:40 <corvus> so we can continue with the idea that the switch to niz won't require any changes at the time
19:09:41 <corvus> but once done, jobs will be using a deprecated api that will need to change
19:09:42 <corvus> at some point in the future
19:09:43 <corvus> that will require a change in zuul which is in progress
19:09:54 <corvus> yep
19:10:09 <clarkb> sounds good to me
19:10:12 <corvus> so we shouldn't move any more tenants over until that's finished
19:11:06 <clarkb> any other concerns with what we're learning so far?
19:11:36 <corvus> nope; i also started the change to emit statsd gauges for nodes/limits so we can has graphs
19:12:00 <corvus> so hopefully in a few weeks, we'll have ducks all lined up for wider rollout
19:12:04 <clarkb> seems like good steady progress
19:12:07 <clarkb> ++
19:12:07 <corvus> i think that's about it
19:12:29 <clarkb> #topic Container hygiene tasks
19:12:40 <clarkb> #link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.12
19:12:58 <corvus> (oh btw i just noticed: opendev-build-diskimage-ubuntu-noble-arm64 https://zuul.opendev.org/t/opendev/build/26bf8c4c705e40639e57208bae1f8c18 : SUCCESS in 1h 17m 29s )
19:13:02 <clarkb> maybe this topic would've been better after the next one (gerrit server move) but now that gerrit is largely done this is going to be back on my radar
19:13:04 <clarkb> corvus: nice!
19:13:10 <clarkb> very reasonable runtime too
19:13:59 <clarkb> but tl;dr is I'm still trying to update us to python3.12. Gerrit and limnoria are two big ones that are outstanding and both need a bit of care to land as we should restart gerrit on the new image once built to ensure it works and I don't want to do that right after the server move and limnoria updates may impact meetings like this one if we interrupt a meeting
19:14:20 <clarkb> I should just look at a calendar and meeting schedule and find a time to do limnoria then also plan for gerrit
19:14:37 <fungi> the gerrit restart will be a good opportunity to exercise the sigint change too
19:14:43 <clarkb> ++
19:15:27 <clarkb> so just be aware of that I guess and I'll try to get these over the finish line soonish
19:15:33 <clarkb> #topic Switching Gerrit to run on Review03
19:15:45 <clarkb> This happened yesterday as scheduled and well within the hour we allocated
19:15:52 <clarkb> #link https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 Notes on the migration plan
19:15:56 <fungi> 15 minutes downtime!
19:16:03 <clarkb> I tried to keep notes there and mark things off as we went if anyone wants to look back on it
19:16:26 <clarkb> the current situation is that review02 is in the emergency file and review03 is in our inventory as a normal review server
19:16:43 <clarkb> There are a number of followup changes that we should work through as we're confident that rollback is less and less likely
19:16:48 <clarkb> #link https://review.opendev.org/c/opendev/zone-opendev.org/+/947858 Reset DNS TTL
19:16:53 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/947758 Drop review02 from inventory
19:16:58 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/947759 Drop ignored docker compose version specifier
19:17:04 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/882900 Migrate gerrit images to quay
19:17:34 <clarkb> Reviews very much appreciated. I'd also like to talk about that last change for a bit, but before I do I also would like everyone to look at review02 and double check there isn't anything they want to be preserved on review03
19:17:59 <clarkb> for example I had gerrit user database cleanup notes in my homedir that I copied over to review03. It wouldn't be the end of the world to go to backup to get that info but making it easy seemed nice
19:18:08 <fungi> already looked, i don't need any of my files
19:18:53 <clarkb> cool. Once you've done that the other thing to weigh in on is when you feel comfortable with cleaning up review02. I think we'll keep its bfv volume and data volume around longer than the instance itself. But still removing the instance is a big step
19:19:13 <clarkb> don't need answers on that now, but maybe followup in #opendev with thoughts/concerns/timeline ideas
19:19:30 <corvus> i have no files
19:20:26 <fungi> nor i now
19:20:35 <clarkb> for that last linked change I discovered that building the gerrit container image with docker fails in the move to quay because we don't set up buildkitd mirror data
19:20:35 <fungi> baleeted
19:20:48 <clarkb> The latest patchset to that change updates the jobs to use podman to build the images instead
19:21:10 <clarkb> it seems to work based on the fact that the system-config-run jobs manage to run gerrit and pass our test cases and take screenshots
19:21:39 <clarkb> however I don't think we should switch only gerrit to build with podman. I think we should switch the entirety of our image builds that move to quay (leaving the old image builds that are stuck on docker using docker seems fine)
19:22:04 <fungi> wfm
19:22:06 <clarkb> any upfront concerns with doing that? I know corvus said earlier today that it would be good to keep zuul and opendev somewhat in sync here so that we have a battle tested consistent approach
19:23:14 <fungi> that coordination makes sense, sure
19:23:32 <clarkb> I can push a change up to do the switch across that set of jobs then rebase the gerrit image move onto that
19:23:37 <clarkb> that should make it more reviewable/mergeable
19:24:02 <corvus> zuul does it differently currently
19:24:07 <fungi> sounds good, i'll prioritize reviewing those
19:24:23 <clarkb> corvus: it did look like the initial test of having zuul use podman worked though?
19:24:30 <clarkb> so at least no Dockerfile content problems
19:24:42 <clarkb> but we may want to inspect the images for difference?
19:25:24 <corvus> i pushed a change up to zuul to see if podman works
19:25:25 <corvus> https://review.opendev.org/947848
19:25:26 <corvus> looks like it does
19:25:26 <corvus> so probably the only reason we needed those settings in the zuul project was for nodepool-builder
19:25:28 <corvus> which we don't care about long-term
19:25:55 <clarkb> ya maybe we can just let nodepool-builder be special for now then phase it out as part of the zuul-launcher effort
19:26:12 <clarkb> similar to how opendev would phase out docker builds as we move to quay
19:26:16 <corvus> does opendev do any multi-arch builds?
19:26:17 <corvus> (also, sorry, i'm laggy)
19:26:55 <clarkb> corvus: no multi arch builds that I am aware of. But double checking that is a good idea
19:27:38 <fungi> i'd be fine letting nodepool-builder images be unique/different until we no longer produce updates for them
19:28:27 <clarkb> and we don't have to commit to anything in this meeting. I just want to get this out there so we can start considering the impacts and options
19:28:28 <corvus> system-config-build-image-python-builder-3.12-bookworm looks like it's multi-arch
19:28:33 <clarkb> oh right the base images
19:28:40 <clarkb> because nodepool-builder depends on them
19:29:14 <clarkb> no opendev service images do multi arch, but we make the base images multiarch so that nodepool can build on them
19:29:22 <corvus> oh, well if it's only for nodepool.... then nbd.
19:29:54 <clarkb> corvus: yes I think that is the only place "we" use that
19:30:02 <corvus> that wfm then.
19:30:33 <clarkb> ok cool sounds like we have a rough plan and I can get the opendev side organized around that
19:30:41 <clarkb> Anything else Gerrit related?
19:31:32 <clarkb> #topic Upgrading old servers
19:31:36 <clarkb> I think we can continue then
19:31:53 <clarkb> the other thing that is back on my todo list radar now that Gerrit is largely updated is picking up this process for other services
19:32:04 <clarkb> mirror-update and eavesdrop are both on the todo list maybe I can find some time for them soon
19:32:18 <clarkb> Help is appreciated if anyone else is able to pick a server or two or three as well
19:32:51 <clarkb> I don't really have any updates on this topic other than to say that gerrit has been the focus for a while
19:32:54 <clarkb> anyone else have updates?
19:33:19 <fungi> i don't, other than working on shutting refstack down instead of upgrading it
19:33:34 <clarkb> oh ya I guess that is related. refstack won't be updated it will be shutdown
19:33:45 <clarkb> solves the problem in a different but still valid way
19:33:47 <clarkb> thanks!
19:33:49 <fungi> but that's pending some feedback from foundation staff on whether they want to announce it
19:34:05 <fungi> hopefully later this week
19:34:28 <clarkb> #topic Working through our TODO list
19:34:34 <clarkb> #link https://etherpad.opendev.org/p/opendev-january-2025-meetup
19:34:52 <clarkb> just a reminder that if everything else we're discussing isn't keeping you busy enough we've got a big list on this etherpad
19:35:13 <clarkb> a good place for people who want to get more involved to look as well. Feel free to ask me any questions you may have. I'm super happy to help anyone work on these things
19:36:05 <NeilHanlon> o/ just stopping in to say i'll be poking at rocky/centos images this week if mnasiadka doesn't beat me to it :)
19:36:39 <fungi> thanks NeilHanlon!!!
19:36:44 <clarkb> ++
19:37:00 <clarkb> #topic Rotating mailman 3 logs
19:37:14 <fungi> no progress, sorry
19:37:26 <clarkb> ack. I wasn't sure and wanted to give you the opportunity if I had missed it
19:37:35 <clarkb> #topic Moving hound image to quay.io
19:37:55 <clarkb> so this is related to the earlier discussion we had with gerrit and also the change to do this landed late yesterday after I posted the meeting agenda
19:38:14 <clarkb> I think the process here is basically we'll update the image build to use podman as part of the earlier discussed changes. Then roll forward and all should be well
19:38:36 <clarkb> at this point lodgeit and hound are on quay but neither are affected by the issue that hit gerrit so they are in a midway point
19:38:48 <clarkb> but we'll fix them up before getting to gerrit and then we can do gerrit
19:40:06 <clarkb> #topic Renewing wiki's cert
19:40:27 <clarkb> As you may have noticed our cert checker is unhappy because this cert expires in just over two weeks
19:40:51 <clarkb> due to travel mentioned earlier I think my plan is to get a new cert issued and in place next week
19:41:09 <clarkb> apologies for the continued email alerts but they should go away soonish
19:41:21 <fungi> thanks!
19:41:37 <clarkb> #topic Open Discussion
19:41:39 <clarkb> Anything else?
19:41:57 <fungi> i didn't have anything
19:43:19 <clarkb> I think the screen is still running on review03
19:43:39 <clarkb> I was going to clean that up and then got distracted today. I've got it on my todo list so it should go away soonish
19:44:15 <clarkb> The openinfra summit europe 2025 cfp is open until sometime in june
19:44:35 <clarkb> looks like June 13 it closes
19:45:39 <fungi> i'm going to duck out early, thanks clarkb!
19:46:01 <clarkb> yup we're winding down anyway
19:46:12 <clarkb> I'll leave the floor open for a few more minutes if there is anything else but then end the meeting
19:48:23 <clarkb> Sounds like that may be everything. Thanks Everyone!
19:48:36 <clarkb> We'll be back next week at this time and location. but then likely cancelling the meeting the week after
19:48:43 <clarkb> #endmeeting