| clarkb | meeting time! | 19:00 |
|---|---|---|
| clarkb | #startmeeting infra | 19:00 |
| opendevmeet | Meeting started Tue Nov 11 19:00:16 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
| opendevmeet | The meeting name has been set to 'infra' | 19:00 |
| clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/XEGEBPR2GSFB5UDOI5WGMTUIXHKQAEAP/ Our Agenda | 19:00 |
| clarkb | #topic Announcements | 19:01 |
| clarkb | we are just over 2 weeks away from a major holiday in the US. I expect to be around tuesday and probably wednesday that week but not thursday and friday | 19:01 |
| clarkb | All that to say I don't think it will affect our meeting schedule, but it probably will affect when people are around and active | 19:02 |
| clarkb | Was there anything else to announce? | 19:02 |
| fungi | i have nothing | 19:04 |
| clarkb | #topic Gerrit 3.11 Upgrade Planning | 19:05 |
| clarkb | Gerrit 3.13 has released | 19:05 |
| clarkb | this means the pressure to upgrade to 3.11 is increasing | 19:06 |
| clarkb | Before we do that there are new bugfix releases | 19:06 |
| clarkb | #link https://review.opendev.org/c/opendev/system-config/+/966084 Update to Gerrit 3.10.9 and 3.11.7 | 19:06 |
| clarkb | and before we update to address the bugfix releases we have a docker compose bug to fix | 19:06 |
| clarkb | #link https://review.opendev.org/c/opendev/system-config/+/966083 Fix container bind mounts for Gerrit | 19:06 |
| clarkb | Landing these two changes and restarting Gerrit is going to be a big goal for me this week. I'm still catching up on stuff after being out yesterday but expect to be able to merge these changes and restart Gerrit sometime this week. Maybe friday if we are trying to cut down on impacts but possibly sooner | 19:07 |
| clarkb | We also heard back from vexxhost on the gerrit server and it was a memory issue which should be mitigated now | 19:08 |
| clarkb | (which makes updating gerrit and restarting things safer) | 19:08 |
| clarkb | Any other questions or concerns about Gerrit? | 19:08 |
| tonyb | if we can schedule the restart while I'm around I'd like to be a second set of eyes | 19:09 |
| tonyb | mostly to confirm what a normal start looks like | 19:09 |
| clarkb | oh yes we should do that. So maybe thursday afternoon (for me)/friday morning for you | 19:09 |
| tonyb | sounds good | 19:10 |
| clarkb | tonyb: feel free to propose some time blocks. I'm generally pretty flexible late week | 19:10 |
| clarkb | #topic Upgrading old servers | 19:10 |
| clarkb | tonyb has the wiki change stack been updated for quay and/or noble? | 19:11 |
| tonyb | noble yes quay no | 19:11 |
| clarkb | ack thanks. I think updating the image build change to do that is the next step for this effort. | 19:12 |
| tonyb | also the ansible changes are going to be restructured a little to move ansible-next to jammy!+3.11 | 19:12 |
| tonyb | I'll get them updated this week | 19:13 |
| clarkb | thanks | 19:13 |
| clarkb | any other server upgrade updates? (I don't think so but want to double check before we move on) | 19:14 |
| tonyb | (sorry about the typos, speed and accuracy are low on my phone) | 19:14 |
| tonyb | nothing more from me | 19:15 |
| clarkb | #topic Matrix for OpenDev comms | 19:15 |
| clarkb | tonyb offered to look into creating the new room last week. Not sure if that happened | 19:15 |
| clarkb | that is step -2 of many to get this moving forward but it is an important step | 19:15 |
| tonyb | nope. today! | 19:15 |
| clarkb | thanks! | 19:16 |
| clarkb | #topic Upgrade Zuul Zookeeper Cluster to 3.9 | 19:16 |
| clarkb | #link https://review.opendev.org/c/opendev/system-config/+/966612 | 19:16 |
| tonyb | I was thinking I might also make a tooling test room .... to target with tools ... for testing | 19:16 |
| corvus | we have one | 19:16 |
| tonyb | oh! never mind then | 19:16 |
| clarkb | the zookeeper cluster is running 3.8 which is the stable release | 19:16 |
| clarkb | 3.9 is the current release and has existed for enough time now to probably also be considered stable | 19:17 |
| clarkb | the normal upgrade process is to upgrade each of the non leaders first then the leader which our ansible is not smart enough to do | 19:17 |
| corvus | i think it's very likely that zuul is going to make 3.9 a requirement for zuul-launcher | 19:17 |
| clarkb | corvus: what specific feature(s) make 3.9 useful for the launcher? | 19:18 |
| corvus | so getting ahead of that would be beneficial | 19:18 |
| corvus | the watch event returns the zk transaction id starting with 3.9, so we can tell our current position in cache replays | 19:18 |
| clarkb | got it | 19:19 |
| corvus | https://review.opendev.org/966501 is the zuul change that takes advantage of it | 19:19 |
| corvus | i've written a fallback change for zuul | 19:19 |
| clarkb | as far as upgrading goes I have no objections to moving to 3.9. i think I have a slight preference for manually doing the upgrade to employ the correct expected process | 19:19 |
| clarkb | note you have to check the status of each member after each restart because sometimes the leader moves | 19:19 |
| corvus | so this doesn't have to be in the critical path, we can upgrade whenever, but i'd like soon to increase our confidence | 19:19 |
| clarkb | but I'm happy to help with the process which is something like put servers in emergency file, edit docker-compose.yaml by hand and upgrade the first follower, repeat on the second follower after checking which node is leader, then finally do the last node | 19:20 |
| clarkb | the release notes for 3.9 say no special steps are required to upgrade from 3.8 to 3.9 so it should be straightforward if we use the normal process | 19:20 |
| corvus | i could do it this saturday morning (my time) | 19:21 |
| corvus | yeah, i also went over the notes and didn't see anything | 19:21 |
| corvus | also, a lot of our zuul tests have already been using 3.9 | 19:21 |
| clarkb | and when done we can merge that change and pull the nodes out of the emergency file | 19:21 |
| clarkb | so I guess heads up, review the upgrade change but don't approve it and if you have any concerns please raise them | 19:21 |
| corvus | that process sounds good to me, and it sounds like if no one objects we could do it saturday | 19:22 |
| corvus | we should make sure to take a zuul zk backup before starting too, just in case | 19:22 |
| clarkb | ++ | 19:22 |
| corvus | (with zuul-client) | 19:22 |
| clarkb | I just approved the test fix that the zk upgrade is a child of | 19:23 |
| tonyb | sounds good to me | 19:23 |
| clarkb | #topic Gitea 1.25.1 Upgrade | 19:24 |
| clarkb | #link https://review.opendev.org/c/opendev/system-config/+/965960 Upgrade Gitea to 1.25.1 | 19:24 |
| clarkb | https://158.69.67.86/opendev/system-config is a held node you can interact with to check this upgrade | 19:24 |
| clarkb | gerrit bug fix upgrades, gitea new release upgrade, and zookeeper upgrades all on tap this week | 19:25 |
| clarkb | I'd appreciate reviews of the change itself to make sure I haven't done anything silly when updating templates, but also read over the release notes and make sure there aren't new features we need to enable/disable/configure | 19:25 |
| clarkb | This release seemed to avoid big changes like that so I think it should be easy but let me know | 19:25 |
| clarkb | mostly just trying to keep up so we don't fall behind | 19:26 |
| clarkb | #topic Gitea Performance | 19:26 |
| clarkb | Then related to that I spot checked giteas today and they all look busy but not to the point where they are slow | 19:26 |
| clarkb | both the memcached memory increase and the "force everything through the load balancer" changes merged | 19:27 |
| clarkb | probably a bit early to claim improvement, but not having evidence of problems is something | 19:27 |
| clarkb | fungi: related I noticed this morning when prepping for the meeting that the lists server seems sad again | 19:27 |
| clarkb | I think mariadb is busy so we may have something crawling apis again and maybe we need to double check iops look reasonable still | 19:28 |
| fungi | mmm | 19:28 |
| clarkb | but wanted to call that out if we're discussing general performance issues related to crawlers | 19:28 |
| fungi | load average is hovering around 10 at the moment, yeah | 19:29 |
| clarkb | I suspect it's the same story just hitting us in new and exciting ways as we continue to improve bottlenecks | 19:29 |
| clarkb | every fixed bottleneck is an opportunity to find a new one | 19:29 |
| clarkb | Please say something if you notice problems in gitea (or any other service). | 19:30 |
| clarkb | #topic Raxflex DFW3 Disabled | 19:31 |
| clarkb | I don't think this server has been fixed or replaced yet | 19:31 |
| clarkb | last week we basically said if after a week it wasn't fixed we'd boot a new one | 19:31 |
| clarkb | I think we can probably proceed with that plan now if anyone has time | 19:31 |
| clarkb | (my focus is probably on gerrit and gitea and whatever lists needs to be performant, but I'm happy to help if you point me to specific actions that are needed) | 19:31 |
| clarkb | #topic Open Discussion | 19:33 |
| tonyb | I'll try but if someone else has cycles don't let me stop you | 19:33 |
| clarkb | That was all I had on the agenda. I cut out afs stuff since trixie is mirrored now. I cut out launcher things because the major bug there was fixed. We also got vexxhost to address the gerrit vm issues. We upgraded etherpad too | 19:34 |
| clarkb | all that to say we got a lot done last week and I was able to trim the agenda as a result. Thank you everyone for making that happen | 19:34 |
| tonyb | yeah well done! | 19:35 |
| fungi | great work everyone! | 19:36 |
| clarkb | maybe we can upgrade gitea tomorrow and plan for gerrit thursday. tonyb we can sync up outside of the meeting on timing for gerrit | 19:37 |
| clarkb | and with that I think we can probably end early if there is nothing else | 19:37 |
| clarkb | I have some zuul launcher bug fix code reviews I need to do then lunch | 19:37 |
| clarkb | thanks everyone. We'll be back here at the same time and location next week | 19:38 |
| clarkb | #endmeeting | 19:38 |
| opendevmeet | Meeting ended Tue Nov 11 19:38:27 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:38 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2025/infra.2025-11-11-19.00.html | 19:38 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-11-11-19.00.txt | 19:38 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2025/infra.2025-11-11-19.00.log.html | 19:38 |
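
Editor's note: the ZooKeeper upgrade order discussed in the meeting (followers first, leader last, re-checking which node is leader after every restart) can be sketched roughly as below. This is only an illustration, not the team's actual tooling. The node names, the `FakeCluster` class, and the `get_mode` / `upgrade` callables are all hypothetical stand-ins; in a real run `get_mode` might, for example, read the `Mode:` line reported by ZooKeeper's `srvr` command, and `upgrade` would be the manual compose file edit and restart.

```python
from typing import Callable, Dict, List


def rolling_upgrade(
    nodes: List[str],
    get_mode: Callable[[str], str],
    upgrade: Callable[[str], None],
) -> List[str]:
    """Upgrade followers first and the leader last; return the order used."""
    remaining = list(nodes)
    done: List[str] = []
    while remaining:
        # Re-query every remaining node each round, because leadership
        # can move after a restart.
        followers = [n for n in remaining if get_mode(n) != "leader"]
        # Prefer a follower; take the leader only when nothing else is left.
        target = followers[0] if followers else remaining[0]
        upgrade(target)
        remaining.remove(target)
        done.append(target)
    return done


class FakeCluster:
    """Hypothetical in-memory stand-in for a three-node ensemble."""

    def __init__(self, leader: str, moves: Dict[str, str]) -> None:
        self.leader = leader
        # Maps "node that was restarted" -> "node that becomes leader".
        self.moves = moves
        self.upgraded: List[str] = []

    def get_mode(self, node: str) -> str:
        return "leader" if node == self.leader else "follower"

    def upgrade(self, node: str) -> None:
        self.upgraded.append(node)
        if node in self.moves:
            self.leader = self.moves[node]


# Simulate leadership moving from zk02 to zk03 after zk01 restarts.
cluster = FakeCluster(leader="zk02", moves={"zk01": "zk03"})
order = rolling_upgrade(["zk01", "zk02", "zk03"], cluster.get_mode, cluster.upgrade)
print(order)  # ['zk01', 'zk02', 'zk03']
```

A fixed order computed once up front would have restarted zk03 second in this simulation, even though it had become the leader by then. Re-checking before each step is what keeps the leader last.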