19:01:11 <clarkb> #startmeeting infra
19:01:11 <opendevmeet> Meeting started Tue Nov  7 19:01:11 2023 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:11 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:11 <opendevmeet> The meeting name has been set to 'infra'
19:01:17 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/MRP4DFT7DBT56U56R6LCFHG7X36SS554/ Our Agenda
19:01:20 <clarkb> #topic Announcements
19:01:38 <clarkb> I believe that the majority (all?) of us have had DST start or end over the last month. Double check your meeting times :)
19:01:55 <clarkb> Related to that the OpenInfra Foundation Board meeting for November will start in 2 hours
19:03:02 <clarkb> also I'll be AFK November 10-13 (that's Friday and Monday on both ends of the weekend)
19:03:44 <clarkb> #topic Mailman 3
19:04:10 <clarkb> All lists are now hosted on mailman 3, the mailman 3 services are upgraded to their latest versions, and the old mailman2 servers have been deleted
19:04:33 <clarkb> We're just about done with this item (thank you fungi!), but there was a django template parsing error during the upgrade we need to run down as we thought that was corrected
19:04:46 <clarkb> https://paste.opendev.org/show/bc7jfeZCt97fZm0dCPKw/ is the paste of that I pulled out of logs when the upgrade occurred
19:05:10 <fungi> yeah, i need to check whether those show up in the log in zuul
19:05:13 <clarkb> it doesn't appear to be fatal (probably because we aren't relying on social media logins or similar functionality in django) so I think the bulk of the issue here is to understand why this happened
19:05:31 <fungi> also whether it was only during the initial restart or whether it recurs
19:05:36 <clarkb> ++
19:06:44 <clarkb> I think we can probably drop this off of next week's agenda
19:07:02 <fungi> agreed
19:07:04 <clarkb> thanks again for getting this over the finish line fungi
19:07:06 <tonyb> #GreatSuccess
19:07:15 <fungi> i just hope it keeps working
19:07:31 <fungi> keep an ear to the ground for people talking about delivery issues
19:08:17 <clarkb> #topic Upgrading Servers
19:08:33 <clarkb> tonyb has started pushing on this for mirrors.
19:08:42 <clarkb> And I think started investigating meetpad servers
19:08:51 <clarkb> tonyb: any concerns or items that need review etc?
19:09:33 <tonyb> Nope.  I think the mirror servers are ready to launch new versions.  I'm assuming that's paused due to 900220
19:09:43 <tonyb> I think meetpad will be pretty quick
19:10:17 <tonyb> after that it's just the hard ones, cacti, wiki, translate and storyboard
19:10:20 <clarkb> yup I think we go through the 900220 stuff and use this all as a good learning experience
19:10:34 <clarkb> let's move on. We'll discuss 900220 shortly
19:10:42 <clarkb> #topic Python container updates
19:10:50 <clarkb> Everything is running python3.11 except for zuul-operator
19:11:06 <clarkb> The reason for that is zuul-operator's k8s jobs haven't been working
19:11:26 <clarkb> dpawlik was poking at it and details ended up in https://review.opendev.org/c/zuul/zuul-operator/+/881245 and its depends on
19:11:56 <clarkb> We don't use the zuul-operator so I don't have a ton of context for this stuff. Despite that I've been meaning to try and page it in just haven't had time
19:12:06 <tonyb> I can work with dpawlik to get that all finished.
19:12:20 <fungi> just in time to start talking about 3.12 ;)
19:12:28 <tonyb> \o/
19:12:39 <clarkb> I think the short version is that something with the way k8s is deployed there causes the operator to not function. dpawlik's changes address the k8s issues and now there is maybe a problem in zuul-operator itself that needs fixing
19:13:05 <clarkb> but ya start from that change and its depends-on and once we can get it green then we should be good to land changes that update the python version for zuul-operator as well
19:13:34 <tonyb> ++
19:13:42 <clarkb> #topic Gitea 1.21
19:13:58 <clarkb> I've left this item on the agenda because each week I think "this is the week there will be a release and changelog we can discuss"
19:14:04 <clarkb> unfortunately this week is not that week
19:14:31 <clarkb> I saw a message from one of the gitea maintainers on discord/matrix saying that the main release blocker at this point is the blog post. I think this must include writing up a change log because the change log doesn't exist yet
19:14:56 <fungi> changelog-as-an-afterthought always baffles me
19:15:33 <clarkb> maybe next week will be the week :)
19:15:41 <clarkb> #topic Gerrit 3.8 Planning
19:15:52 <clarkb> #link https://etherpad.opendev.org/p/gerrit-upgrade-3.8
19:16:01 <clarkb> if others could look over that etherpad I think it is ready for review
19:16:55 <clarkb> Otherwise I think we are about as ready as we can be. We got the commentlink update in and restarted Gerrit 3.7 to ensure that is working as expected. The downgrade back to 3.7 is tested and the only issue we've found so far is related to a plugin bug in a plugin we don't use
19:17:17 <fungi> 898989 isn't marked as done, should be though, yeah?
19:17:34 <fungi> we restarted onto it and manually tested
19:17:57 <clarkb> yup marked as done now
19:18:12 <fungi> awesome, just wanted to be sure there wasn't anything outstanding there
19:18:41 <clarkb> as far as gerrit upgrades go this one seems to be an easy one (I've just jinxed it)
19:18:55 <fungi> uncool man
19:18:56 <clarkb> feel free to review the change log as well to make sure I didn't miss anything
19:19:06 <clarkb> but I tried to put the important bits in the etherpad
19:19:22 <fungi> yeah, seems to me like we're ready for maintenance day
19:19:35 <fungi> ~1.5 weeks out?
19:19:39 <clarkb> which as a reminder is November 17, 2023 at 15:30 UTC
19:19:52 <fungi> just shy of 10 days now
19:19:54 <clarkb> I actually failed to remember that I would be on standard time for that day so 15:30 UTC is a bit early for me
19:20:01 <clarkb> but I'll be fine, just get up a little early
19:20:13 <fungi> 07:30 pst i guess
19:20:49 <clarkb> yup
19:20:54 * tonyb will be around for the morning FWIW
19:20:55 <clarkb> I thought it was 8:30 am
19:20:59 <fungi> i'm happy to run the maintenance if you want to focus on getting your tea steeped
19:21:00 <clarkb> tonyb: awesome
19:21:08 <tonyb> DST strikes again
19:21:17 <clarkb> fungi: cool we can decide when we get closer to the day of
19:21:21 <fungi> wfm
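The 07:30 versus 8:30 confusion above comes down to which UTC offset applies on the maintenance day. A minimal sketch of the conversion follows; the fixed offsets (PST as UTC-8, PDT as UTC-7) are assumptions added for illustration and are not stated in the log.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets, for illustration only: PST is UTC-8, PDT is UTC-7.
PST = timezone(timedelta(hours=-8), "PST")
PDT = timezone(timedelta(hours=-7), "PDT")

# Maintenance start as given in the meeting: November 17, 2023 at 15:30 UTC.
maintenance = datetime(2023, 11, 17, 15, 30, tzinfo=timezone.utc)

# Mid-November falls in standard time, so the PST result is the relevant one.
standard = maintenance.astimezone(PST)
daylight = maintenance.astimezone(PDT)

print(standard.strftime("%H:%M %Z"))  # 07:30 PST
print(daylight.strftime("%H:%M %Z"))  # 08:30 PDT
```

Under daylight time the same UTC instant would land an hour later locally, which matches the 8:30 figure clarkb had in mind.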
19:22:20 <clarkb> #topic Adding tonyb to infra-root
19:22:42 <fungi> rocketship emoji
19:22:48 <clarkb> we've had discussions about this outside of the meeting, but tonyb is willing to be added to infra-root and help us out with even more stuff :)
19:22:50 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/900220 Will make it official
19:22:56 <clarkb> thank you tonyb!
19:23:02 <tonyb> Thank you all
19:23:10 <corvus> yay!
19:23:20 <tonyb> I understand the level of trust that's being shown here
19:23:31 <tonyb> I appreciate that
19:23:42 <corvus> try not to give away the homeworld
19:23:45 <fungi> feel free to pester me for access to things as you find you're missing something (we don't really have a checklist of everything)
19:24:23 <clarkb> the "plan" I've got here is we can approve this change after the meeting. Then I need to edit gerrit groups and some other things. Maybe tomorrow and/or thursday we can meet up and work through things like server boots and adding a gerrit admin account and so on
19:24:24 <tonyb> fungi: will do.  It will be a slow process as my "comfort zone" increases.
19:24:43 <fungi> yeah, there's no need to ask for access to stuff until you're ready to do something with it anyway
19:24:53 <tonyb> Sounds good.
19:25:05 <corvus> it doesn't come up that often, because it doesn't change that often, but i do think a lot of the docs are mostly current: https://docs.opendev.org/opendev/system-config/latest/sysadmin.html#root-only-information
19:25:08 <clarkb> yup I mostly want to make sure we've given a reasonable base line of access so that you aren't in a weird spot of not being able to say approve changes but can ssh into things
19:25:18 <tonyb> If you want to do that via meetpad or similar I can make sure I'm in a quiet place
19:25:39 <clarkb> tonyb: ya I was thinking a call like that then we can use shared screen sessions (gnu screen) to share context
19:25:52 <tonyb> perfect
19:26:35 <fungi> also for stuff like the upcoming gerrit upgrade maintenance we explicitly start a screen session on the server so that other sysadmins can observe or participate as needed
19:27:22 <fungi> (you'll see it called out in the maintenance plan)
19:27:25 <tonyb> Cool.  I'll have to page in my gnu screen keybindings etc
19:27:47 <tonyb> I recently "switched" to tmux/tmate
19:28:06 <clarkb> for a long time we used screen because not all the systems (there were old centos systems for cgit) had tmux
19:28:12 <clarkb> and then we never switched
19:28:14 <fungi> i've been using tmux personally for a decade or more, but still fall back on screen for some stuff it does better
19:28:38 <fungi> these days though, about the only thing screen does better is connect to serial lines
19:28:56 <clarkb> I've got a usb to rs232 cable I use with screen :)
19:29:02 <tonyb> Yeah that was the only thing I really noticed
19:29:04 <fungi> bingo
19:29:20 <clarkb> #topic Open Discussion
19:29:27 <clarkb> That was it for the posted agenda, is there anything else?
19:29:55 <fungi> reminder that there's an openinfra foundation board of directors meeting in 2.5 hours
19:30:04 <fungi> #link https://board.openinfra.dev/en/meetings/2023-11-07
19:30:04 <clarkb> 1.5 I think
19:30:19 <fungi> 1.5, yep, i can't count
19:30:25 <fungi> 21:00 utc
19:30:41 <clarkb> yup I've got lunch then that consuming my next 3.5 hours or whatever the scheduled time is
19:30:54 <fungi> spoiler: the budget discussion will probably have nice things to say about our work
19:31:15 <tonyb> \o/
19:31:27 <fungi> also there's discussion of upcoming bylaws changes, updating the diversity and inclusion wg's charter, and use of ai in code contributions
19:31:45 <fungi> something for everyone
19:31:49 <tonyb> ooo that could be fun
19:31:52 <clarkb> yup I think it will be one where there is a lot of interesting content which isn't always the case
19:32:31 <fungi> as long as you can make it through the first 15 minutes of rollcall
19:32:40 <tonyb> LOL
19:33:23 <corvus> oh hi
19:33:39 <fungi> heh
19:33:44 <corvus> just a heads up that we merged a nodepool change that is having a small performance impact
19:33:52 <clarkb> corvus: is this the ssh keyscanning state machine change?
19:33:59 <corvus> yep
19:34:11 <clarkb> I keep meaning to look at what motivated that
19:34:17 <corvus> i don't think it's user-visible, but i did notice some extra time-to-ready
19:34:25 <corvus> and some extra launch retries
19:34:27 <corvus> i have a fix up
19:34:27 <clarkb> seems like scan in a loop until good or timeout doesn't really need a proper state machine :)
19:34:39 <corvus> clarkb: parallelization
19:34:57 <corvus> we could only do 10 before; get 10 slow machines booting and everything stops
19:35:21 <clarkb> ah is that the size of our threadpool?
19:35:31 <corvus> yep.  and increasing thread pool workers was :( because it would 2x the threads thanks to paramiko
19:35:39 <fungi> yeah, i guess you want to be able to have fewer active loops than node requests
19:35:47 <corvus> so now it's N+1 instead of 2N
19:35:55 <corvus> threads
19:36:17 <clarkb> got it
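The trade-off corvus describes, a fixed pool of blocking workers versus one poller driving many small state machines, can be sketched roughly as below. This is an illustrative simulation only, not nodepool's actual code: `KeyscanMachine`, `poll_all`, the states, and the tick counts are all hypothetical names and values, and no real SSH scanning is performed.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    CONNECTING = auto()
    SCANNING = auto()
    DONE = auto()
    FAILED = auto()


TERMINAL = (State.DONE, State.FAILED)


@dataclass
class KeyscanMachine:
    """Hypothetical per-node state machine (not a nodepool class)."""
    node: str
    ticks_until_up: int        # simulated boot delay, in poll ticks
    timeout_ticks: int = 50    # assumed timeout, for illustration
    state: State = State.CONNECTING
    ticks: int = 0

    def advance(self) -> None:
        """Take one small non-blocking step, then return to the poller."""
        if self.state in TERMINAL:
            return
        self.ticks += 1
        if self.ticks > self.timeout_ticks:
            self.state = State.FAILED
        elif self.state is State.CONNECTING and self.ticks >= self.ticks_until_up:
            self.state = State.SCANNING
        elif self.state is State.SCANNING:
            self.state = State.DONE


def poll_all(machines):
    """A single loop drives every machine, so slow nodes do not block fast ones."""
    finished_order = []
    while any(m.state not in TERMINAL for m in machines):
        for m in machines:
            before = m.state
            m.advance()
            if m.state is not before and m.state in TERMINAL:
                finished_order.append(m.node)
    return finished_order


# Ten slow-booting nodes no longer hold up a fast one queued behind them.
machines = [KeyscanMachine(f"slow-{i}", ticks_until_up=40) for i in range(10)]
machines.append(KeyscanMachine("fast", ticks_until_up=1))
order = poll_all(machines)
print(order[0])  # fast
```

With a pool of 10 blocking workers, the eleventh node would wait for a free worker; here the poller visits every node on each pass, so the number of in-flight scans is not tied to a worker count.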
19:36:29 <fungi> polling state machine architecture takes me back to my mud coding days
19:37:09 <corvus> anyway, i don't think we need to revert or anything, and i'll be monitoring it.  but wanted to bring it up so folks are aware.
19:37:31 <clarkb> thanks. I'll try to review that change (as well as rereview that one zuul error handling change) this afternoon either during or after the board meeting
19:37:44 <corvus> cool, thx :)
19:38:30 <clarkb> sounds like that may be everything. I'm going to hit +A on 900220 then go find lunch
19:38:44 <clarkb> thank you for your time today everyone and for all the help running these services
19:38:46 <fungi> thanks!
19:38:48 <clarkb> #endmeeting