19:00:15 <clarkb> #startmeeting infra
19:00:15 <opendevmeet> Meeting started Tue Jul 30 19:00:15 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:15 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:15 <opendevmeet> The meeting name has been set to 'infra'
19:00:21 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/GBQUFAFSFBGZGLEYVTCCSUABJXCS6ZUJ/ Our Agenda
19:00:30 <clarkb> #topic Upgrading old servers
19:00:33 <clarkb> #undo
19:00:33 <opendevmeet> Removing item from minutes: #topic Upgrading old servers
19:00:37 <clarkb> I'm getting ahead of myself here
19:00:40 <clarkb> #topic Announcements
19:01:06 <clarkb> As mentioned above frickler and tonyb won't make the meeting today. But we'll take notes here so that they and others can review them later
19:01:53 <clarkb> I've realized that I made a bit of a mistake in planning family obligations and will likely miss the next two Tuesdays... My brother has midweek "weekends" and so late summer family stuff all got planned midweek
19:02:10 <clarkb> Anyway I probably won't be around on August 6 or 13 to run this meeting.
19:02:49 <fungi> i expect to be around and can chair either/both if we want to meet
19:02:56 <clarkb> thanks!
19:03:18 <fungi> we can probably defer that decision until closer to the end of the week since we have some people not around today
19:03:28 <clarkb> I'm also waffling on whether or not to attend Fossy which starts Thursday afternoon. I probably should go and use it as an opportunity to thank the folks that work on the osuosl stuff
19:03:33 <clarkb> fungi: yup no rush on that
19:03:56 <clarkb> I do have a working laptop again though so if I do go to fossy I won't be completely disconnected from the world.
Just highly distracted :)
19:04:53 <clarkb> #topic Upgrading Old Servers
19:05:34 <clarkb> tonyb can't make it but did mention that noble images have been uploaded to all cloud regions but one. I'm not sure which that one is, but this implies that at least two rax regions were uploaded to which is good news as I expected those to be most difficult
19:06:08 <fungi> that's exciting progress. thanks tonyb!
19:06:29 <clarkb> ++
19:06:40 <clarkb> looking forward to more info when tonyb is able to share
19:07:23 <clarkb> anything else to mention about upgrading servers? I've been distracted by services and clouds and stuff not servers lately
19:07:41 <fungi> i don't think so
19:08:09 <clarkb> #topic AFS Mirror Cleanups
19:08:40 <clarkb> we just landed the change to cleanup centos 8 stream nodesets and similar stuff from opendev/base-jobs
19:09:36 <clarkb> The next step there is to clean the centos 8 stream images out of nodepool entirely.
19:09:46 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/924790 Next step in CentOS 8 Stream Cleanup
19:09:51 <clarkb> also topic:drop-centos-8-stream
19:10:42 <clarkb> before we merge 924790 we should check if there are any stuck nodes that need deleting, but I think the solution to that is likely not going to change regardless of how we merge that change
19:10:47 <clarkb> just something to check on and clean up
19:11:12 <clarkb> apologies that progress has been so slow on this, but we are slowly making progress which I'll take as a small win
19:11:49 <clarkb> #topic Testing Rackspace's New Cloud Offering
19:12:11 <clarkb> I'm going to send a followup email today/tomorrowish after double checking that this is more likely to be a better time for them
19:12:41 <clarkb> I was told to wait a week and it's been a week. Just want to check first before things get lost in the ether again.
But otherwise nothing new here
19:13:00 <clarkb> #topic Reconsider queue depth for integrated gate
19:13:18 <clarkb> The alternative proposal for this, to implement early failure detection, has been implemented for tempest jobs
19:13:59 <clarkb> The reason for doing this in tempest is that those jobs are long running so have a larger impact if we can detect failures early and reset ASAP, but also because these tempest base jobs are used everywhere and we can make a small change in one spot and have a large impact across many jobs
19:14:38 <clarkb> So far haven't seen any complaints or even questions about why things are failing before they are done. This would imply that it is working in a manner that isn't too intrusive which is great
19:14:46 <fungi> yeah, a tempest job can fail on a test 25 minutes in and then keep running more tests for a further hour or more
19:15:35 <clarkb> a good next step may be to update openstack-tox-pyXY jobs to do similar though their impact will be smaller
19:16:40 <clarkb> other than that I guess try to observe gate behavior the next time you notice a large queue and we can see if this has helped at all
19:16:58 <clarkb> #topic Removing the Works on ARM Linaro Cloud from Nodepool
19:17:48 <clarkb> Last week I received an email from the Works on ARM folks letting us know that the hardware hosting the linaro cloud will be removed/no longer accessible on August 10. I committed to shutting down our usage of that cloud this week which is now done as of this morning local time for me
19:18:22 <clarkb> I have since sent a followup email to Kevin at linaro letting him know the cleanup on our side is done and that we can clean up the cloud stuff if necessary.
I also offered to help
19:18:40 <clarkb> I suspect that might end up being a die on the vine situation and the cloud will go away when the hardware goes away
19:19:01 <clarkb> This does mean we are down to operating arm jobs in one cloud region (the osuosl arm cloud)
19:19:56 <clarkb> fungi: you've been helping out with this process, anything else to add?
19:21:38 <fungi> nope, i think we're basically done other than whatever kevin comes back with
19:22:20 <clarkb> #topic Service Coordinator Election
19:23:24 <clarkb> It is that time of year again. I looked at when we had the last election and added 6 months and ended up with this proposal: Open candidate nominations from August 6 to August 20. Then if necessary we can have an election August 21 to 28. This timing also works out nicely to avoid the summit
19:23:58 <clarkb> considering that attendance is light today, maybe we can see if others can ack this proposal as reasonable and then I'll send out email to the list making it official later this week?
19:24:11 <fungi> sounds good, thanks for keeping up with that!
19:24:22 <clarkb> and if there are any concerns or questions feel free to raise those too
19:25:03 <clarkb> thanks!
19:25:13 <clarkb> #topic Open Discussion
19:25:38 <clarkb> I have more things too that never made it to the agenda proper (you might have noticed the last item was missed too, I was very distracted by laptop reinstall things)
19:25:57 <clarkb> First up dansmith is trying to clarify some confusion he ran into creating a new project here: https://review.opendev.org/c/opendev/infra-manual/+/925275
19:26:14 <clarkb> I think landing that now if we can is a good idea just because it is straightforward and I don't want to forget later
19:26:21 <clarkb> fungi maybe you have time to review that?
it should be a quick one
19:26:26 <fungi> on it
19:29:31 <clarkb> Next I've been looking ahead to mid term goals I've got for OpenDev: I would like to see us bump the default nodeset to ubuntu noble and change the default ansible version in our zuul tenants to ansible 9 for example. A late october early november ish upgrade of Gerrit to 3.10 would also be great. I've also been thinking about trying to have more "OpenDev Zuul for our users"
19:29:33 <clarkb> content this is still very early brainstorming but I've noticed a lot of people struggling with zuul concepts recently...
19:30:19 <clarkb> of these the most straightforward might be the ansible bump. Noble presents challenges due to it having python3.12 which is a fairly big leap in the python world
19:30:22 <fungi> content in the infra-manual?
19:30:44 <clarkb> fungi: maybe? another thought was some sort of video
19:32:12 <clarkb> addressing things like people not understanding depends on and speculative git states, rarely providing direct links to debug failures, understanding when jobs run and why, and so on are what I want to address
19:32:23 <fungi> makes sense
19:33:15 <clarkb> Zuul is this really great powerful tool and I want people to be able to take advantage of it
19:34:10 <clarkb> anyway all of that gets mixed in with things like the summit and the ptg and the next openstack release. A lot of stuff is happening over the next few months and I wanted to bring these ideas up early so that we can try and fit them in
19:34:11 <corvus> ++
19:35:58 <clarkb> feedback and ideas welcome.
19:36:09 <clarkb> Anything else to go over in the meeting today?
19:37:34 <fungi> i didn't have anything else
19:39:33 <clarkb> sounds like that is everything. Thank you for your time today. And we can figure out what we are doing next week as we get closer to that point in time.
19:39:39 <clarkb> #endmeeting
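
Note on the "Reconsider queue depth for integrated gate" topic: the idea of early failure detection discussed there can be sketched roughly as below. This is only an illustration of the general idea (stop at the first failing test instead of waiting for the whole run), not how Zuul or the tempest jobs actually implement it. The function name `first_failure`, the `FAILURE_PATTERN` regex, and the sample log lines are all hypothetical.

```python
import re
from typing import Iterable, Optional

# Hypothetical pattern: the exact text a test runner prints for a failed
# test is an assumption made for this sketch.
FAILURE_PATTERN = re.compile(r"\.\.\. (FAILED|ERROR)\s*$")


def first_failure(lines: Iterable[str],
                  pattern: re.Pattern = FAILURE_PATTERN) -> Optional[int]:
    """Return the 0-based index of the first line matching `pattern`.

    Returns None if no line matches. Reading stops at the first match,
    which is the point of early detection: the overall result is already
    known, so the remaining output does not need to be waited for.
    """
    for index, line in enumerate(lines):
        if pattern.search(line):
            return index
    return None


# Made-up streaming output from a long-running test job.
sample_log = [
    "test_create_server ... ok",
    "test_attach_volume ... FAILED",
    "test_delete_server ... ok",
]

failed_at = first_failure(sample_log)
if failed_at is not None:
    print(f"early failure detected at line {failed_at}")
```

In the scenario fungi describes (a failure 25 minutes in, followed by another hour of tests), a check like this would let the failure be reported at the 25 minute mark so the queue can be reset sooner.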