19:00:04 <clarkb> #startmeeting infra
19:00:04 <opendevmeet> Meeting started Tue Apr 22 19:00:04 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:04 <opendevmeet> The meeting name has been set to 'infra'
19:00:10 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/6CDVW2M7T3K4QDPYT2TKHMPIHN7TSGV6/ Our Agenda
19:00:47 <clarkb> #topic Announcements
19:01:17 <clarkb> I will be travelling May 6th and unable to host the regular meeting. (that is two weeks from today)
19:01:28 <clarkb> I think we can either skip that one or someone else can host
19:01:41 <clarkb> That was all I had to announce. Anything else?
19:01:46 <fungi> i'll be travelling as well
19:01:50 <fungi> on that date
19:02:31 <fungi> probably worth noting that the both of us will be fairly distracted that entire week with in-person meetings
19:02:58 <clarkb> ya I guess we'll be mostly out Monday-Friday that week
19:03:06 <clarkb> directly impacting this meeting but potentially other things
19:03:12 <fungi> but still in approximately the same time zones and can probably still do things in an emergency
19:03:53 <fungi> (with some latency)
19:04:29 <clarkb> If that is all for announcements I think we can dive into the agenda
19:04:35 <clarkb> #topic Zuul-launcher image builds
19:04:37 <corvus> canceling that mtg sounds good
19:05:01 <clarkb> corvus: sent email asking for volunteers to help with image builds and now we have volunteers \o/
19:05:26 <fungi> it's like magic
19:05:35 <corvus> we got some volunteers!
19:05:48 <clarkb> then unrelated to ^ one thing corvus noticed dogfooding with zuul is that nodepool sets the private_ip vars to the public_ip values if there is no private_ip
19:05:52 <fungi> (very formidable magic)
19:06:06 <clarkb> this was originally done to make it easy for openstack testing to avoid using NAT and always use the "private ip"
19:06:33 <corvus> mnasiadka is working on arm64 stuff
19:06:43 <corvus> #link arm64 image builds https://review.opendev.org/c/opendev/zuul-providers/+/947841
19:07:10 <clarkb> however other users may find this behavior less useful or counterproductive, so for niz we discussed having public_ip* and private_ip* always be the respective public and private values; then we can have additional vars like interface IP for "this is how you access the test node from the outside world" and maybe local_ip for interfaces that avoid NAT
19:07:37 <clarkb> so there may be a small API change that jobs will need to accommodate as we shift from nodepool to niz
19:07:43 <corvus> oh we're talking about the other thing?
19:07:44 <corvus> neil volunteered to work on rocky image builds
19:07:59 <clarkb> sorry I just started braindumping.
Feel free to proceed on the volunteers and we can discuss IPs after
19:08:54 <corvus> on the private_ip thing -- i advocate having the nodepool vars maintain consistent behavior, then in the future, switch to using new zuul vars that have different behavior
19:09:13 <fungi> that makes sense
19:09:34 <fungi> basically backward-compatible but deprecated nodepool vars
19:09:40 <corvus> so we can continue with the idea that the switch to niz won't require any changes at the time
19:09:41 <corvus> but once done, jobs will be using a deprecated api that will need to change
19:09:42 <corvus> at some point in the future
19:09:43 <corvus> that will require a change in zuul which is in progress
19:09:54 <corvus> yep
19:10:09 <clarkb> sounds good to me
19:10:12 <corvus> so we shouldn't move any more tenants over until that's finished
19:11:06 <clarkb> any other concerns with what we're learning so far?
19:11:36 <corvus> nope; i also started the change to emit statsd gauges for nodes/limits so we can has graphs
19:12:00 <corvus> so hopefully in a few weeks, we'll have ducks all lined up for wider rollout
19:12:04 <clarkb> seems like good steady progress
19:12:07 <clarkb> ++
19:12:07 <corvus> i think that's about it
19:12:29 <clarkb> #topic Container hygiene tasks
19:12:40 <clarkb> #link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.12
19:12:58 <corvus> (oh btw i just noticed: opendev-build-diskimage-ubuntu-noble-arm64 https://zuul.opendev.org/t/opendev/build/26bf8c4c705e40639e57208bae1f8c18 : SUCCESS in 1h 17m 29s )
19:13:02 <clarkb> maybe this topic would've been better after the next one (gerrit server move) but now that gerrit is largely done this is going to be back on my radar
19:13:04 <clarkb> corvus: nice!
19:13:10 <clarkb> very reasonable runtime too
19:13:59 <clarkb> but tl;dr is I'm still trying to update us to python3.12.
Gerrit and limnoria are two big ones that are outstanding and both need a bit of care to land: we should restart gerrit on the new image once built to ensure it works, and I don't want to do that right after the server move; limnoria updates may impact meetings like this one if we interrupt a meeting
19:14:20 <clarkb> I should just look at a calendar and meeting schedule and find a time to do limnoria then also plan for gerrit
19:14:37 <fungi> the gerrit restart will be a good opportunity to exercise the sigint change too
19:14:43 <clarkb> ++
19:15:27 <clarkb> so just be aware of that I guess and I'll try to get these over the finish line soonish
19:15:33 <clarkb> #topic Switching Gerrit to run on Review03
19:15:45 <clarkb> This happened yesterday as scheduled and well within the hour we allocated
19:15:52 <clarkb> #link https://etherpad.opendev.org/p/i_vt63v18c3RKX2VyCs3 Notes on the migration plan
19:15:56 <fungi> 15 minutes downtime!
19:16:03 <clarkb> I tried to keep notes there and mark things off as we went if anyone wants to look back on it
19:16:26 <clarkb> the current situation is that review02 is in the emergency file and review03 is in our inventory as a normal review server
19:16:43 <clarkb> There are a number of followup changes that we should work through as we're confident that rollback is less and less likely
19:16:48 <clarkb> #link https://review.opendev.org/c/opendev/zone-opendev.org/+/947858 Reset DNS TTL
19:16:53 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/947758 Drop review02 from inventory
19:16:58 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/947759 Drop ignored docker compose version specifier
19:17:04 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/882900 Migrate gerrit images to quay
19:17:34 <clarkb> Reviews very much appreciated.
I'd also like to talk about that last change for a bit, but before I do I also would like everyone to look at review02 and double check there isn't anything they want to be preserved on review03
19:17:59 <clarkb> for example I had gerrit user database cleanup notes in my homedir that I copied over to review03. It wouldn't be the end of the world to go to backup to get that info but making it easy seemed nice
19:18:08 <fungi> already looked, i don't need any of my files
19:18:53 <clarkb> cool. Once you've done that the other thing to weigh in on is when you feel comfortable with cleaning up review02. I think we'll keep its bfv volume and data volume around longer than the instance itself. But still removing the instance is a big step
19:19:13 <clarkb> don't need answers on that now, but maybe followup in #opendev with thoughts/concerns/timeline ideas
19:19:30 <corvus> i have no files
19:20:26 <fungi> nor i now
19:20:35 <clarkb> for that last linked change I discovered that building the gerrit container image with docker fails in the move to quay because we don't set up buildkitd mirror data
19:20:35 <fungi> baleeted
19:20:48 <clarkb> The latest patchset to that change updates the jobs to use podman to build the images instead
19:21:10 <clarkb> it seems to work based on the fact that the system-config-run jobs manage to run gerrit and pass our test cases and take screenshots
19:21:39 <clarkb> however I don't think we should switch only gerrit to build with podman. I think we should switch the entirety of our image builds that move to quay (leaving the old image builds that are stuck on docker using docker seems fine)
19:22:04 <fungi> wfm
19:22:06 <clarkb> any upfront concerns with doing that?
I know corvus said earlier today that it would be good to keep zuul and opendev somewhat in sync here so that we have a battle tested consistent approach
19:23:14 <fungi> that coordination makes sense, sure
19:23:32 <clarkb> I can push a change up to do the switch across that set of jobs then rebase the gerrit image move onto that
19:23:37 <clarkb> that should make it more reviewable/mergeable
19:24:02 <corvus> zuul does it differently currently
19:24:07 <fungi> sounds good, i'll prioritize reviewing those
19:24:23 <clarkb> corvus: it did look like the initial test of having zuul use podman worked though?
19:24:30 <clarkb> so at least no Dockerfile content problems
19:24:42 <clarkb> but we may want to inspect the images for differences?
19:25:24 <corvus> i pushed a change up to zuul to see if podman works
19:25:25 <corvus> https://review.opendev.org/947848
19:25:26 <corvus> looks like it does
19:25:26 <corvus> so probably the only reason we needed those settings in the zuul project was for nodepool-builder
19:25:28 <corvus> which we don't care about long-term
19:25:55 <clarkb> ya maybe we can just let nodepool-builder be special for now then phase it out as part of the zuul-launcher effort
19:26:12 <clarkb> similar to how opendev would phase out docker builds as we move to quay
19:26:16 <corvus> does opendev do any multi-arch builds?
19:26:17 <corvus> (also, sorry, i'm laggy)
19:26:55 <clarkb> corvus: no multi arch builds that I am aware of. But double checking that is a good idea
19:27:38 <fungi> i'd be fine letting nodepool-builder images be unique/different until we no longer produce updates for them
19:28:27 <clarkb> and we don't have to commit to anything in this meeting.
I just want to get this out there so we can start considering the impacts and options
19:28:28 <corvus> system-config-build-image-python-builder-3.12-bookworm looks like it's multi-arch
19:28:33 <clarkb> oh right the base images
19:28:40 <clarkb> because nodepool-builder depends on them
19:29:14 <clarkb> no opendev service images do multi arch, but we make the base images multiarch so that nodepool can build on them
19:29:22 <corvus> oh, well if it's only for nodepool.... then nbd.
19:29:54 <clarkb> corvus: yes I think that is the only place "we" use that
19:30:02 <corvus> that wfm then.
19:30:33 <clarkb> ok cool sounds like we have a rough plan and I can get the opendev side organized around that
19:30:41 <clarkb> Anything else Gerrit related?
19:31:32 <clarkb> #topic Upgrading old servers
19:31:36 <clarkb> I think we can continue then
19:31:53 <clarkb> the other thing that is back on my todo list radar now that Gerrit is largely updated is picking up this process for other services
19:32:04 <clarkb> mirror-update and eavesdrop are both on the todo list maybe I can find some time for them soon
19:32:18 <clarkb> Help is appreciated if anyone else is able to pick a server or two or three as well
19:32:51 <clarkb> I don't really have any updates on this topic other than to say that gerrit has been the focus for a while
19:32:54 <clarkb> anyone else have updates?
19:33:19 <fungi> i don't, other than working on shutting refstack down instead of upgrading it
19:33:34 <clarkb> oh ya I guess that is related. refstack won't be updated it will be shutdown
19:33:45 <clarkb> solves the problem in a different but still valid way
19:33:47 <clarkb> thanks!
19:33:49 <fungi> but that's pending some feedback from foundation staff on whether they want to announce it
19:34:05 <fungi> hopefully later this week
19:34:28 <clarkb> #topic Working through our TODO list
19:34:34 <clarkb> #link https://etherpad.opendev.org/p/opendev-january-2025-meetup
19:34:52 <clarkb> just a reminder that if everything else we're discussing isn't keeping you busy enough we've got a big list on this etherpad
19:35:13 <clarkb> a good place for people who want to get more involved to look as well. Feel free to ask me any questions you may have. I'm super happy to help anyone work on these things
19:36:05 <NeilHanlon> o/ just stopping in to say i'll be poking at rocky/centos images this week if mnasiadka doesn't beat me to it :)
19:36:39 <fungi> thanks NeilHanlon!!!
19:36:44 <clarkb> ++
19:37:00 <clarkb> #topic Rotating mailman 3 logs
19:37:14 <fungi> no progress, sorry
19:37:26 <clarkb> ack. I wasn't sure and wanted to give you the opportunity if I had missed it
19:37:35 <clarkb> #topic Moving hound image to quay.io
19:37:55 <clarkb> so this is related to the earlier discussion we had with gerrit and also the change to do this landed late yesterday after I posted the meeting agenda
19:38:14 <clarkb> I think the process here is basically we'll update the image build to use podman as part of the earlier discussed changes.
Then roll forward and all should be well
19:38:36 <clarkb> at this point lodgeit and hound are on quay but neither are affected by the issue that hit gerrit so they are in a midway point
19:38:48 <clarkb> but we'll fix them up before getting to gerrit and then we can do gerrit
19:40:06 <clarkb> #topic Renewing wiki's cert
19:40:27 <clarkb> As you may have noticed our cert checker is unhappy; this cert expires in just over two weeks
19:40:51 <clarkb> due to travel mentioned earlier I think my plan is to get a new cert issued and in place next week
19:41:09 <clarkb> apologies for the continued email alerts but they should go away soonish
19:41:21 <fungi> thanks!
19:41:37 <clarkb> #topic Open Discussion
19:41:39 <clarkb> Anything else?
19:41:57 <fungi> i didn't have anything
19:43:19 <clarkb> I think the screen is still running on review03
19:43:39 <clarkb> I was going to clean that up and then got distracted today. I've got it on my todo list so it should go away soonish
19:44:15 <clarkb> The openinfra summit europe 2025 cfp is open until sometime in june
19:44:35 <clarkb> looks like June 13 it closes
19:45:39 <fungi> i'm going to duck out early, thanks clarkb!
19:46:01 <clarkb> yup we're winding down anyway
19:46:12 <clarkb> I'll leave the floor open for a few more minutes if there is anything else but then end the meeting
19:48:23 <clarkb> Sounds like that may be everything. Thanks Everyone!
19:48:36 <clarkb> We'll be back next week at this time and location. but then likely cancelling the meeting the week after
19:48:43 <clarkb> #endmeeting
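
Appended note on the private_ip discussion under "Zuul-launcher image builds": a minimal sketch of the two behaviors described in the meeting, not the actual nodepool or zuul implementation. The class and function names are hypothetical, and the selection rules for `interface_ip` and `local_ip` are assumptions based only on how those vars were described in the log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeAddresses:
    """Hypothetical container for the addresses a provider reports for a node."""
    public_ip: Optional[str]
    private_ip: Optional[str]


def nodepool_style_vars(addr: NodeAddresses) -> dict:
    """Behavior described in the meeting: private_ip falls back to public_ip."""
    return {
        "public_ip": addr.public_ip,
        "private_ip": addr.private_ip or addr.public_ip,
    }


def proposed_vars(addr: NodeAddresses) -> dict:
    """Proposed behavior: public/private report real values, extra vars carry intent."""
    return {
        "public_ip": addr.public_ip,
        "private_ip": addr.private_ip,
        # Assumption: the address used to reach the node from the outside world.
        "interface_ip": addr.public_ip or addr.private_ip,
        # Assumption: the address that avoids NAT when one exists.
        "local_ip": addr.private_ip or addr.public_ip,
    }


# A node with no private address, the case raised in the meeting.
node = NodeAddresses(public_ip="203.0.113.10", private_ip=None)
old = nodepool_style_vars(node)
new = proposed_vars(node)
```

In this sketch a job that reads `private_ip` keeps working under the nodepool-style vars, which matches the plan to keep those vars backward compatible and move jobs to new vars later.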