19:00:05 <clarkb> #startmeeting infra
19:00:05 <opendevmeet> Meeting started Tue Apr 30 19:00:05 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:06 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:06 <opendevmeet> The meeting name has been set to 'infra'
19:00:14 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/M4JKPPJYDIJT5EQTKPCIANUZ6WNFOO5T/ Our Agenda
19:00:29 <clarkb> #topic Announcements
19:01:06 <clarkb> I didn't have anything to announce. More just trying to get back into the normal swing of things after the PTG and all that
19:01:18 <clarkb> Did anyone have anything to announce?
19:01:43 <frickler> just that I'm off for the remainder of this week
19:02:09 <clarkb> I'll be around. Though I do have parent teacher conference stuff Thursday morning. But otherwise I'm generally around
19:02:51 <clarkb> #topic Upgrading Old Servers
19:03:30 <clarkb> tonyb: with our timezones a bit better aligned for a bit I'm happy to help dive into this again if your time constraints allow it
19:03:31 <tonyb> No progress, but I'm in the US and it's next on my todo list to clear out my older reviews
19:03:43 <clarkb> cool, feel free to ping me if I can help in any way
19:03:49 <tonyb> Will do
19:04:03 <tonyb> I'll also start an etherpad for the focal servers
19:04:29 <fungi> i'll be disappearing on thursday and gone for 11 days
19:04:31 <clarkb> ++ getting a sense of the scale of the next round of things would be good. I'm hoping that generally as we're more and more in containers this becomes easier
19:05:01 <fungi> (sorry, i missed the #topic change)
19:05:06 <clarkb> no problem
19:05:15 <clarkb> #topic MariaDB Upgrades
19:05:26 <clarkb> We've done all of the services except for Gerrit at this point
19:05:42 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/916848
19:06:20 <clarkb> This is a change to do that, but it won't be automated after landing (all of the other upgrades were). I'm 99% positive that this will simply update the docker-compose file instead, and we'll need to pull and down and up -d by hand to get that through
19:06:35 <clarkb> that means there will be a short gerrit outage, so any thoughts on when we should do that?
19:08:30 <tonyb> Not really.
19:08:49 <clarkb> ok, I think this one is also relatively low priority since gerrit barely depends on that db at this point
19:09:06 <clarkb> reviews would be great, then if an opportunity presents itself we can merge it quickly and restart services?
19:09:15 <tonyb> Sounds good
19:09:53 <clarkb> #topic AFS Mirror Cleanups
19:10:02 <clarkb> topic:drop-ubuntu-xenial has changes up for review.
19:10:20 <clarkb> That said, this is going to be a slow process of chipping away at things.
19:10:48 <clarkb> One question I had for the group is whether or not you think I should push up changes to remove projects from zuul's tenant config rather than try and fix up their zuul configs properly
19:11:19 <clarkb> in particular there are a number of x/* projects with python35 jobs. I think Xenial sort of coincided with the era of everyone making a project for everything and as a result ended up with test configs that want to run on xenial
19:11:38 <clarkb> and for those it may just be easiest to remove them from zuul's tenant config entirely rather than try and coerce their configs into the future
19:12:40 <tonyb> I feel like dropping projects in that state is reasonable. Assuming it's announced and we restore any somewhat quickly?
19:13:00 <clarkb> ya, restoring them shouldn't be an issue. They will have to make their first change a zuul config cleanup but otherwise should be straightforward
19:13:07 <clarkb> (and that fixup might be to reset to noop jobs)
19:13:26 <frickler> +1
19:14:51 <clarkb> ok, I'll continue down that path then and hopefully we eventually reach a point where it's like 80% done and we can announce a removal date and let the fallout fall from there
19:14:54 <fungi> i'm in favor
19:15:03 <frickler> I also just noticed an issue with the deb-ceph-reef mirror I created
19:16:00 <clarkb> frickler: looks like we need the symlink into the apache web space since it is a separate mirror entirely
19:16:03 <frickler> seems the volume is named mirror.ceph-deb-reef, while the correct name would be just mirror.deb-reef
19:16:38 <clarkb> hrm, we try to keep those short due to afs limits, not sure what the limit is. Maybe that name is ok as is?
19:16:49 <clarkb> and then we just need to realign things? Or maybe we can simply rename the volume?
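[Editor's note on the volume-name length question above: OpenAFS stores volume names in a fixed-size field, commonly cited as 32 bytes including the trailing NUL, and `vos create` is generally said to limit base names to 22 characters so that the 9-character `.readonly`/`.backup` clone suffixes still fit. Both limits here are assumptions to verify against the OpenAFS documentation; this is only a sketch of the check.]

```python
# Sketch: sanity-check a proposed AFS volume name before "vos create".
# Assumptions (verify against OpenAFS docs/source): names live in a 32-byte
# field (31 usable chars), and vos limits base names to 22 chars so the
# ".readonly"/".backup" suffixes of replicas still fit.
MAX_BASE = 22          # base-name limit assumed to be enforced by vos create
MAX_TOTAL = 31         # assumed 32-byte field minus the trailing NUL
SUFFIX = ".readonly"   # longest clone suffix (9 chars)

def volume_name_ok(name: str) -> bool:
    """Return True if name fits both the base and replica-name limits."""
    return len(name) <= MAX_BASE and len(name + SUFFIX) <= MAX_TOTAL

# The volume discussed in the meeting is 20 chars, so under these
# assumptions "that name is ok as is":
print(volume_name_ok("mirror.ceph-deb-reef"))
```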
19:17:19 <frickler> I'm not sure, I just noticed that the vos release in the reprepro log is failing with "vldb not found"
19:17:24 <frickler> anyway, if it is not considered urgent, I can look at that next week
19:17:39 <clarkb> ya, I don't think it is urgent since it is new rather than affecting existing jobs
19:17:42 <frickler> but if one of you wants to fix it, go ahead
19:17:45 <clarkb> ack
19:18:23 <clarkb> #topic Adding Ubuntu Noble Test Nodes
19:18:56 <clarkb> The changes we needed in glean and opendev nodepool elements all landed, and dib is building noble nodes in its testing
19:19:20 <clarkb> I think the next step is to add noble to our ubuntu mirror, then we can add noble image builds to nodepool
19:19:49 <frickler> I also just mentioned in the TC meeting that help on this would likely be welcome
19:20:14 <clarkb> need to check that we have room in the ubuntu mirror volume first (and bump quotas if necessary. Hopefully we don't have to split the volume due to the 2TB limit, but I don't think we are there yet)
19:20:25 <frickler> (even more so for the devstack etc. part that would still need to follow)
19:20:33 <clarkb> but then it should be some pretty straightforward copy pasta from the existing mirror stuff for ubuntu
19:21:25 <frickler> mirror.ubuntu is at 850G in total, so not close to 2T at least
19:21:28 <clarkb> ubuntu is 6GB short of the 850GB quota limit. And I think openafs is limited to 2TB
19:21:44 <clarkb> so ya, we probably need to bump the quota to something like 1200GB and then we should be good to land the change
19:22:14 <clarkb> and once that is done, similar copy paste with the nodepool configs to build new images there. And then it's the long process of getting stuff onto the new node type
19:22:25 <frickler> oh, note to self: adding the reef volume to grafana is also missing
19:23:32 <clarkb> assuming I'm able to come up for air on the things I've got in flight I can probably look at noble stuff, but it might be a good thing for someone else to push along. On the config side it's largely copy paste and then checking results. Only really need infra root for the quota bump and possibly to hold locks for a manual seed of the mirror
19:23:54 <clarkb> I guess if anyone sees this and is interested in helping out let us know and we'll point you in the right direction
19:24:04 <clarkb> #topic Etherpad 2.0.3 Upgrade
19:24:10 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/914119 Upgrade etherpad to 2.0.3
19:24:22 <fungi> i think that's ready to go now
19:24:36 <clarkb> we are currently running 1.9.7 or something like that. At first the 2.0 update wasn't a big deal as it mostly changed how you install etherpad, so we update the docker file and are good to go
19:25:07 <clarkb> then 2.0.2 broke api auth and we needed 2.0.3 to add a viable alternative. That release happened yesterday, and a node is up and held for testing which fungi and I used for testing and all seems well
19:25:31 <clarkb> so ya, if the rest of the change (docker file updates, docs updates, test updates) look good I think we can proceed with upgrading this service
19:25:47 <clarkb> We do need to add a new private var to bridge before landing the change
19:26:16 <clarkb> fungi: maybe we give others a chance to review between now and tomorrow morning, then send it tomorrow morning if no one objects by then?
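[Editor's note on the quota bump above: `vos setquota` takes its `-maxquota` argument in 1 KB blocks (per the OpenAFS vos man page), so the GB figures discussed in the meeting need converting and are easy to get wrong by a factor of 1024. A minimal sketch of that arithmetic; the ~2 TB per-volume ceiling is taken from the meeting discussion, not independently verified:]

```python
# Sketch: compute the -maxquota argument for "vos setquota", which is
# expressed in 1 KB blocks per the OpenAFS vos man page.
AFS_VOLUME_CEILING_GB = 2048  # ~2TB per-volume limit mentioned in the meeting

def quota_blocks(gb: int) -> int:
    """Convert a quota in GB to the 1KB-block count vos expects."""
    if gb > AFS_VOLUME_CEILING_GB:
        raise ValueError(f"{gb}GB exceeds the ~2TB per-volume limit")
    return gb * 1024 * 1024

# Bumping mirror.ubuntu from 850GB to the proposed 1200GB:
print(quota_blocks(1200))  # 1258291200 blocks
```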
19:26:23 <fungi> wfm
19:27:41 <clarkb> #topic Gerrit 3.9 Upgrade Planning
19:27:49 <clarkb> #link https://etherpad.opendev.org/p/gerrit-upgrade-3.9 Upgrade prep and process notes
19:28:18 <clarkb> I've started on this (need to actually perform the upgrade process and downgrade process on a test node in order to take notes) and I think overall this is a straightforward upgrade
19:28:54 <clarkb> You can go over my notes and skim the release notes yourselves to see if I've made any errors in judgement or overlooked important changes so far. Feel free to add them to the etherpad if so
19:29:26 <clarkb> There are a few things worth mentioning. First is that we have the option of making diff3 the diff method for merge changes. This adds a third piece of info, which is the base file state, in addition to theirs and ours
19:29:42 <clarkb> There is a new default limit of 5000 changes per topic
19:30:05 <clarkb> we can increase that value if we think it is too low. I suspect that our largest topics are going to be things for like openstack releases which maybe have a few hundred?
19:30:46 <clarkb> And finally there is a new option to build the gerrit docs without external resources, but that option isn't part of the release war build, so I've been asking upstream (with no luck yet) how to combine this option with building a release war
19:31:40 <fungi> any idea how the changes per topic limit can be checked ahead of time, and what the outcome is if it's exceeded during upgrade?
19:32:07 <clarkb> fungi: no, but those are good questions. I'll try to followup with them upstream. Worst case, thursday is the monthly community meeting and I should be able to get more info there
19:32:14 <fungi> also what happens when the 5001st change with the same topic is pushed (rejection with some specific error message i guess)
19:32:35 <clarkb> we should be able to test some of that on a held node easily. Just set the limit to 1 and then try and add a second change to a topic
19:32:57 <fungi> good point
19:33:17 <clarkb> fungi: maybe scribble those notes under the item on the etherpad and I'll followup with more info
19:33:53 <clarkb> as far as upgrade planning goes, I suspect we can upgrade before the end of May. Maybe on the last day of May given various holidays and vacation and all that
19:34:10 <clarkb> I'll propose something more concrete next week after a bit more testing, then we can announce it
19:35:04 <tonyb> If I'm doing it right, the new-release topic would be waaaay over 5k
19:35:04 <clarkb> #topic Wiki Cert Renewal
19:35:12 <clarkb> #undo
19:35:12 <opendevmeet> Removing item from minutes: #topic Wiki Cert Renewal
19:35:26 <clarkb> tonyb: you mean that openstack releases produce a topic with over 5000 changes?
19:36:04 <frickler> does that limit include merged changes or only open ones?
19:36:05 <tonyb> https://review.opendev.org/q/topic:new-release,5000
19:36:25 <clarkb> frickler: that is one of the open questions I raised upstream that hasn't had a response yet
19:36:33 <tonyb> If it's only open then we'd be fine
19:36:41 <tonyb> Okay.
19:36:50 <clarkb> tonyb: oh I see, releases use the same topic each time so they've built up over time.
19:37:27 <fungi> i tried to cover all those items in the pad
19:37:43 <clarkb> interesting that we're right around the limit too. Seems like somewhere between 5100 and 5200 changes attached to that topic
19:37:47 <frickler> yes, I was checking "formal-vote" which came to my mind first, but that's only at 750
19:38:20 <tonyb> clarkb: Ok so waaaay over was an overstatement
19:38:33 <clarkb> tonyb: but if there are problems, 1 over is probably sufficient to find them :)
19:38:44 <clarkb> I'll continue to followup and try to attend the community meeting to ask directly there as well
19:38:52 <tonyb> clarkb: Thanks
19:39:02 <clarkb> #topic Wiki Cert Renewal
19:39:19 <clarkb> This is just a note to make sure people know I've said I'll deal with this ~1 week before expiry
19:39:33 <clarkb> Don't really have anything new to say. But didn't remove it from the agenda since the cert hasn't been renewed yet
19:39:53 <fungi> i'll be back from vacation a few days before it expires and can do the file installation part then
19:40:02 <clarkb> ack, thanks
19:40:07 <clarkb> #topic Open Discussion
19:40:09 <clarkb> Anything else?
19:43:45 <clarkb> sounds like that is probably it
19:43:47 <clarkb> Thank you everyone
19:44:16 <clarkb> we'll be back here next week at the same time and location. I suspect there will be fewer of us, but enough to have the meeting and sync up on what is going on
19:44:20 <clarkb> #endmeeting
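
[Editor's postscript on the topic-limit discussion: the size of a topic can be measured ahead of the upgrade with Gerrit's change-query REST endpoint, paging with the `S` offset until the last result no longer carries `_more_changes`. The endpoint, the `)]}'` anti-XSSI prefix, and the pagination flag are standard Gerrit REST behavior; the server URL is the one from the log, and this sketch assumes network access and anonymous query permission.]

```python
import json
import urllib.request

GERRIT = "https://review.opendev.org"  # server from the meeting log

def parse_gerrit_json(body: str):
    """Gerrit prefixes JSON responses with )]}' to defeat XSSI; strip it."""
    return json.loads(body.split("\n", 1)[1])

def count_topic(topic: str, page: int = 500) -> int:
    """Count changes (open and merged) attached to a topic, paging with S."""
    total = 0
    start = 0
    while True:
        url = f"{GERRIT}/changes/?q=topic:{topic}&n={page}&S={start}"
        with urllib.request.urlopen(url) as resp:
            changes = parse_gerrit_json(resp.read().decode())
        total += len(changes)
        # Gerrit sets _more_changes on the last entry of a truncated page:
        if not changes or not changes[-1].get("_more_changes"):
            return total
        start += len(changes)

# e.g. count_topic("new-release") -- the meeting estimated roughly 5100-5200
```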