19:00:12 #startmeeting infra
19:00:12 Meeting started Tue Nov 18 19:00:12 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:12 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:12 The meeting name has been set to 'infra'
19:00:17 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/R36BWJAGRO7EBAJ6NH7SB2646MP555D3/ Our Agenda
19:00:23 #topic Gerrit 3.11 Upgrade Planning
19:00:42 Last week we managed to fix the bind mount issue with gerrit's docker compose file and updated our images to the latest bugfix releases for 3.10 and 3.11
19:00:57 since then I have held new nodes using those new images
19:01:16 If you want to look at them yourselves you can do so via 3.10: https://217.182.142.186 and 3.11: https://174.143.59.58
19:01:36 my hope/goal is that starting tomorrow I'll be able to pick up the gerrit upgrade effort as my primary focus for the next little bit and we'll get it done
19:01:49 3.10 went eol like last week. So it's a good idea for us to upgrade as soon as is reasonable
19:02:06 #link https://www.gerritcodereview.com/3.11.html
19:02:28 Please do take a look over the release notes if you have time and flag any issues you see
19:02:34 #link https://etherpad.opendev.org/p/gerrit-upgrade-3.11 Planning Document for the eventual Upgrade
19:02:39 notes will be captured in this etherpad
19:03:04 also I skipped announcements today because I didn't have any and figured if it's going to be a shorter meeting anyway we may as well make the most of it. We can call out anything important to note that I miss at the end of our time
19:03:36 Any concerns or questions around the Gerrit upgrade process? I know we were able to walk tonyb through the bugfix update process and I think that was helpful
19:04:19 very
19:04:44 thank you all for making the time to do that
19:05:06 happy to. And I think we can try and organize the actual upgrade to 3.11 in a similar way so that more people can gain that familiarity if there is interest
19:05:19 that might mean doing the upgrade on a sunday afternoon (for me) if tonyb is interested
19:05:28 ianw did at least one upgrade that way too
19:06:41 anyway as noted my plan is to page this all back in starting tomorrow and start (re)building the confidence we need in the upgrade and downgrade, then we can set a time to do the actual upgrade
19:06:57 thinking out loud here, maybe the first-ish week of december? I think that may be enough time for me to figure things out with the upcoming holiday
19:08:05 #topic Upgrading old servers
19:08:26 that sounds good to me. if you're willing to give up some of your Sunday
19:08:39 tonyb: ya that should work, I'll keep it in mind with the planning
19:09:04 for server upgrades are there any updates? I know tonyb had some changes going but not sure if they got new patchsets yet
19:09:45 they have not.
19:10:06 sorry
19:10:12 today for sure!
19:10:18 ack thanks. I think we've all been busy with many other things so understandable
19:10:47 The other thing I wanted to call out on this topic is that there is even some value to upgrading servers like etherpad or the giteas to noble despite them already being on jammy. And that is due to docker hub rate limits
19:11:12 the recent etherpad upgrades got hit by rate limits and I had to do /etc/hosts surgery to force ipv4 as a temporary workaround
19:11:31 Not terrible and the focus is definitely still on the older nodes, but once those are done doing more updates has value too
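The /etc/hosts workaround above was only described in passing. A minimal sketch of that kind of pin, assuming the rate-limited host was Docker Hub's registry-1.docker.io endpoint (the hostname and port are assumptions, not details from the meeting):

```python
# Hypothetical sketch of pinning a registry hostname to IPv4 in /etc/hosts so pulls
# skip the IPv6 path; registry-1.docker.io is an assumed hostname, not from the log.
import socket

HOSTNAME = "registry-1.docker.io"

# Resolve only A (IPv4) records for the hostname.
addrs = {info[4][0] for info in socket.getaddrinfo(HOSTNAME, 443, socket.AF_INET)}

# Print candidate /etc/hosts lines; append one by hand as a temporary workaround
# and remove it once the rate limit window has passed.
for addr in sorted(addrs):
    print(f"{addr}\t{HOSTNAME}")
```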
19:12:13 #topic Matrix for OpenDev comms
19:12:19 #link https://docs.opendev.org/opendev/infra-specs/latest/specs/matrix_for_opendev.html
19:12:32 The room exists now (thank you tonyb) and corvus has a change up to add statusbot support for matrix
19:12:36 #link https://review.opendev.org/q/hashtag:%22opendev-matrix%22+status:open
19:13:08 per the spec the next steps will be adding the gerritbot and eavesdrop bots to the room. I can write changes for that soon
19:13:11 it's a wee bit of a larger change than i hoped
19:13:23 then we can also update statusbot and get it joined to the opendev room and the zuul room
19:13:30 mostly due to the way statusbot assumed there would only ever be one bot, plus also, the matrix lib is async.
19:13:55 corvus: I think the change ends up being readable enough despite the mix of async and threads
19:14:13 i disabled the feature which changes the room topics. we haven't used that in ages, and it's always been a bit fragile
19:14:27 i'm hoping that gives us a bit more confidence in the system as a whole
19:14:32 the link above is for a hashtag in gerrit that we'll use to track the work, so keep an eye out for the changes and please review them
19:14:34 corvus: ++
19:15:00 then once we have all the bots staged I think we can start the planned cutover and begin using the new room as the primary synchronous comms location
19:15:15 also i *think* tonyb said he fixed up the opendev admin account to be present and admin in all the channels on our homeserver now? or at least most of the ones i might have screwed up in the past
19:15:23 i've tested it, but not completely, but i think it's ready to merge and for us to update it
19:15:36 corvus: ya I was happy with it as is
19:15:43 (though that was just code review on my part)
19:16:12 i think we should go ahead and do the statusbot upgrade and test it out. then add the others. we know that eavesdrop and gerrit work, but statusbot needs exercising
19:16:16 yup I definitely tried to make admin:opendev.org join all the rooms and make it the admin
19:16:37 thanks tonyb! and sorry for all the makework
19:16:48 corvus: makes sense
19:17:04 fungi: no problem at all
19:17:05 one thing I haven't considered is that matrix eavesdrop and limnoria eavesdrop may try to log to the same directory on disk
19:17:16 that just occurred to me and is worth looking into /me makes a note
19:17:50 we could maybe just have a test to make sure there's no overlap in channel/room names being logged?
19:18:06 the matrix eavesdrop does intentionally do that, mostly because that was our cutover strategy for zuul
19:18:10 on the matrix stuff. did we decide we're okay to include the ops and new opendev rooms as publicly listed?
19:18:10 fungi: ya that might be one solution
19:18:37 tonyb: I think we decided that zuul is publicly listed so publicly listing those two should also be fine
19:18:56 okay. I'll double check I did that
19:18:56 i mean, as long as we're not logging the zuul irc channel when we start logging the zuul room in matrix (i know this already happened, simply an example) then it should be fine
19:19:14 fungi: yup if only one channel is logged at a time we'll be ok
19:19:35 I may take a quick look to see if there is a way to dedup somehow in limnoria and/or matrix eavesdrop in case we want that feature
19:20:10 strip the two lists to bare names, concatenate and sort, then count duplicates and error
19:20:33 anyway real progress is being made on this spec now. I'll work on a gerritbot change since that is easy. Everyone else should review the statusbot change. And I'll see about the logging collision problem
19:21:00 and then once we're happy with the bots we can start moving over, potentially as early as this week or next?
19:21:17 fwiw, i like the flag-day cutover and continuing the same logfile
19:21:30 corvus: ack
19:21:51 yeah, sounds like an ideal approach
19:22:07 any other questions, concerns, comments on this effort?
19:22:11 simple and straightforward
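For the overlap test floated at 19:17:50 and spelled out at 19:20:10, a minimal sketch of the idea; the channel and room lists below are placeholders, not the actual limnoria or matrix-eavesdrop configs:

```python
# Sketch of the suggested check: strip both lists to bare names and error out if any
# channel/room would be logged by both bots. The lists here are made-up examples.
import sys

irc_channels = ["#opendev", "#opendev-meeting", "#zuul"]        # placeholder limnoria list
matrix_rooms = ["#zuul:opendev.org", "#opendev:opendev.org"]    # placeholder eavesdrop list


def bare_name(channel: str) -> str:
    """Reduce '#zuul:opendev.org' or '#zuul' to 'zuul'."""
    return channel.lstrip("#").split(":", 1)[0]


overlap = sorted({bare_name(c) for c in irc_channels} & {bare_name(r) for r in matrix_rooms})
if overlap:
    print("ERROR: logged on both IRC and Matrix: " + ", ".join(overlap))
    sys.exit(1)
print("no overlapping channel/room logs")
```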
19:23:35 #topic Gitea Performance
19:23:52 I left this on the agenda because i wanted to see if anyone had noticed problems since the load balancer and memcached updates
19:24:11 For me personally things appear to be much more consistent in terms of performance since those changes
19:24:36 we did also upgrade to 1.25.1 and add the canonical link headers to apache but I don't think either of those was expected to improve performance
19:25:21 I haven't noticed any problems, performance or otherwise
19:25:55 great. If you do notice anything feel free to note it. Improvements like this are often driven by users noticing issues
19:26:00 and accessing specific gitea backends is pretty easy with a SOCKS proxy to the LB.
19:26:30 yes I've done that in my browser but then have relied on direct port forwards for git operations
19:26:45 you probably can get git to talk socks (since it uses curl which almost certainly understands socks somehow) but I haven't figured that out yet
19:27:08 the deployment window between one and all backends getting updated is also brief enough that i'm fine just checking through the lb unless we're troubleshooting a problem with a specific backend
19:27:32 this isn't a five-nines operation
19:27:46 neither is cloudflare so I think we're doing ok
19:27:59 touch'e
19:28:08 hmm, my compose key is busted
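On getting git to talk SOCKS (19:26:45): git's HTTPS transport uses libcurl, and curl accepts socks5h:// URLs for http.proxy, so something like the following sketch should work. It is untested here, and the proxy address and repo URL are assumptions:

```python
# Hedged sketch: run a one-off git clone through a local SOCKS tunnel by setting
# http.proxy just for this command. socks5h:// makes curl resolve hostnames through
# the proxy as well. The tunnel address and repo URL are assumptions.
import subprocess

SOCKS_PROXY = "socks5h://localhost:1080"  # e.g. from: ssh -D 1080 <load balancer>

subprocess.run(
    [
        "git", "-c", f"http.proxy={SOCKS_PROXY}",
        "clone", "https://opendev.org/opendev/system-config",
    ],
    check=True,
)
```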
19:28:32 #topic Open Discussion
19:28:49 that was everything on the agenda.
19:28:54 https://review.opendev.org/c/opendev/zuul-providers/+/966200 will add trixie arm64 images
19:29:15 https://review.opendev.org/c/opendev/zuul-providers/+/967599 adds trixie x86 nodesets (since we don't quite have arm64 images yet)
19:29:29 reran xmodmap and can now touché correctly. i wonder why the call through .xinitrc didn't take
19:29:30 corvus: fungi ^ should I go ahead and approve 599 at this point? It shouldn't affect anything that is running yet so I think it's fine
19:29:49 sgtm
19:30:05 yeah, i'm cool with it
19:30:19 I haven't checked my scrollback. any objection to me using the cli admin interface to remove the unwanted gitea user?
19:30:25 fingers crossed we can add the missing ones later today or tomorrow
19:30:37 tonyb: no objections from me. I think you can do it through the web ui as admin too though?
19:30:41 tonyb: i didn't see anyone comment on it, but please feel free
19:30:55 tonyb: we've only used the cli admin interface once to fix the db records iirc
19:31:01 so it's a bit more of an unknown but no objections to getting that sorted out
19:31:41 that one user account slipping through was due to a brief window where we hadn't disabled the feature, but it hasn't posed enough of a risk that any of us has prioritized cleaning it up until now
19:31:44 I mentioned this yesterday when prepping the agenda. I think we got a fair bit done in the last couple of weeks between gerrit updates and zuul launcher bugfixes and etherpad and gitea and zookeeper upgrades. Thank you everyone for helping keep the ball rolling forward and keeping the lights on
19:31:45 clarkb: yeah I thought the cli admin interface was "easier" than web clicky clicky
19:32:18 probably so now that we don't have trivial webui access to each backend
19:32:27 you're welcome and thank you (all) :)
19:32:36 i'll likely prefer the cli in future now too
19:33:15 next week is a major US holiday but I expect to be around monday and tuesday as well as most of wednesday (at some point I need to start food prep though)
19:33:25 I don't expect to be around thursday or friday
19:33:26 i'll be around if people want to meet
19:34:03 I'll be here, and trying to watch a little closer
19:34:06 i'll be around most of thursday and friday too and can keep an eye on things, i don't have plans to go anywhere or cater to a crowd this year
19:34:40 fungi: hopefully you enjoy the quiet. I'm hosting so a bit of the opposite for me this year
19:34:56 yeah, you have my sympathies
19:35:16 the most i'll end up doing is making homemade pizza or grilling a couple of burgers
19:35:31 ok sounds like that may be everything?
19:35:55 nothing more from me
19:36:07 feel free to continue discussion (or start new ones) in #opendev or on the mailing list. We'll probably meet next week per usual
19:36:17 and until then thanks again for all the help
19:36:24 #endmeeting