19:00:02 <clarkb> #startmeeting infra
19:00:02 <opendevmeet> Meeting started Tue Mar 25 19:00:02 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:02 <opendevmeet> The meeting name has been set to 'infra'
19:00:07 <clarkb> #topic Announcements
19:00:32 <clarkb> OpenStack is making its 2025.1 Epoxy release next week. And the week after that is the virtual PTG
19:01:17 <clarkb> Things to be aware of when making changes. I suspect we'll put the meetpad hosts in the emergency file late next week after testing that things work, to avoid unexpected upgrades
19:01:38 <clarkb> Also we aren't going to have OpenDev ptg time as we all tend to be busy participating in other groups' ptg blocks
19:02:06 <clarkb> Anything else to announce?
19:02:14 <fungi> openinfra summit date and venue have been announced, volunteers sought for the programming committee with a kickoff call scheduled for thursday. contact Helena Spease <helena@openinfra.dev> if you want to help in any way
19:02:26 <fungi> #link https://openinfra.org/blog/openinfra-summit-2025 Announcing the OpenInfra Summit Europe 2025!
19:02:33 <clarkb> oh ya that just happened
19:04:26 <clarkb> #topic Zuul-launcher image builds
19:04:44 <clarkb> corvus recently added all of our x86_64 regions to zuul-launcher
19:05:00 <corvus> all the x86 clouds are present now, but i haven't chased down the image builds
19:05:05 <clarkb> the clouds that have smaller and bigger flavor sizes also support the smaller and bigger labels
19:05:07 <corvus> (to make sure they all have images built)
19:05:11 <clarkb> gotcha
19:05:28 <corvus> sounds like that may be bad timing because of the noble thing?  :)
19:06:06 <corvus> but the noble kernel thing got me thinking
19:06:12 <corvus> zuul
19:06:13 <corvus> er
19:06:59 <corvus> zuul-launcher does have the ability to validate builds.  we're not exercising that yet.  and i don't think we want to validate too much.  but i think maybe validating things like basic network connectivity and iptables, etc, would be reasonable.
19:07:36 <corvus> so if we think it's a good idea, and if anyone wants to volunteer to write a validation job that does something like that, i'd be happy to mentor/review.
19:07:52 <clarkb> a "does it boot and have our base configuration" check does seem reasonable
19:08:04 <clarkb> I agree we don't want to try and catch every possible problem our jobs may have though
19:08:56 <clarkb> #action someone look into basic image validation with zuul-launcher
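(For reference, a minimal sketch of the kind of checks such a validation job could run on a booted test node; the target hostname and the assumption that we verify iptables/ip6tables rules are illustrative, not the actual base configuration:)

  #!/bin/bash
  # hypothetical post-boot sanity checks for a zuul-launcher image
  # validation job; adjust targets to match the real base config
  set -e
  # outbound connectivity and DNS resolution
  ping -c 3 -W 5 opendev.org
  getent hosts opendev.org
  # IPv6, where the cloud provides it (non-fatal elsewhere)
  ping -6 -c 3 -W 5 opendev.org || echo "no ipv6 connectivity"
  # the base firewall rules were applied without error
  sudo iptables -S
  sudo ip6tables -S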
19:09:01 <clarkb> anything else on this subject?
19:09:07 <corvus> nope
19:09:11 <clarkb> #topic Container hygiene tasks
19:09:17 <clarkb> #link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.12
19:09:36 <clarkb> we did end up managing to update our base python container images. The next step is to start rolling out updates to rebuild the containers to use python3.12
19:09:53 <clarkb> this ensures we're using the newly built base image content and moves us from python3.11 to 3.12
19:10:08 <clarkb> this isn't terribly urgent but I think it is good hygiene to make these updates periodically if people have time to do reviews
19:10:21 <clarkb> I'm going to pop out this afternoon but I'm happy to approve things tomorrow when I can monitor, if reviews happen
19:10:42 <clarkb> the early changes to python3.12 have just worked thankfully as well
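(As a rough sketch of what each of those changes looks like; the image tag pattern and paths here are assumptions, check the actual Dockerfiles in each repo:)

  # find images still pinned to the python3.11 base (pattern illustrative)
  grep -rn 'python-base:3.11\|python-builder:3.11' docker/
  # bump a given Dockerfile to the 3.12 tags
  sed -i 's/:3\.11/:3.12/g' docker/<image>/Dockerfile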
19:11:03 <corvus> are we happy with the performance of py 3.12 these days?  i want to say at one point we were not super motivated to bump zuul for performance reasons, but maybe that time has passed?
19:11:34 <clarkb> corvus: it's been fine for the small utilities we use it for within opendev. I think the zuul unittest jobs are still consistently slower on noble with py312 than they are with jammy/bookworm and py311
19:11:58 <clarkb> I think zuul is likely to be the only thing where the performance might be noticed by us? since zuul is often quite busy. Maybe nodepool too
19:12:11 <clarkb> but I'm also happy to try it and see if we notice the difference isn't massive
19:12:19 <corvus> do you happen to know about 3.13?
19:12:49 <clarkb> I have it installed locally but haven't done much with it. We don't have base images for 3.13 yet but we can add those now that 3.10 is gone
19:13:02 <corvus> wonder if maybe we should just wait for that vs going to 12...
19:13:05 <clarkb> I don't know if there is a good way to install it for zuul unittests other than pyenv
19:13:25 <fungi> unless we add debian-trixie images
19:13:36 <clarkb> and unfortunately with pyenv you either do an hour-long compile to get something production-like, or a 2 minute compile that's slower. Good for testing compatibility and less so for performance
19:13:47 <fungi> might be getting close to time to think about that, it's already entered soft freeze time
19:14:06 <corvus> okay.  i think that's good info.  i'm not going to rush out and bump zuul right now; and instead will keep an eye out for a 3.13 opportunity.  and we can bump to 3.12 when it's more pressing.
19:14:19 <clarkb> works for me
19:14:22 <fungi> though the official trixie release is probably still a few months away
19:14:53 <clarkb> maybe we add an early pyenv job to check for compatibility issues but don't read into performance too much
19:15:10 <clarkb> unless miraculously the 3.13 compiled without optimizations manages to be faster than the optimized 3.11 and 3.12 builds from the distro
19:15:21 <clarkb> in that case we can probably safely assume the optimized builds will be at least as performant
19:15:47 <corvus> i might try some time trials locally with docker images, but that is not at the top of my list right now
19:16:04 <clarkb> sounds good
19:16:07 <fungi> yeah, part of the challenge with optimizing cpython builds is that it needs to compile, then run a battery of tests with profiling on, then build again based on the results
19:16:26 <clarkb> anything else on this topic?
19:16:27 <fungi> so it's basically double the compile time plus the minimal make test time
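(For anyone trying this locally, a sketch of the two pyenv paths being described; the version number is illustrative, and the optimized build is the one that takes roughly an hour:)

  # quick unoptimized build: fast to produce, noticeably slower at runtime,
  # fine for checking compatibility but not for benchmarking
  pyenv install 3.13.2

  # production-like build with PGO and LTO: cpython compiles, runs the
  # profiling test battery, then rebuilds using the collected profiles
  env PYTHON_CONFIGURE_OPTS="--enable-optimizations --with-lto" pyenv install 3.13.2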
19:16:39 <corvus> good here
19:16:45 <clarkb> #topic Dropping uWSGI
19:17:06 <clarkb> this is a related item where I'd like to drop uWSGI. After rebuilding the base container images last week this is a bit less urgent so I've deprioritized it
19:17:18 <clarkb> but my goal is still to switch lodgeit to granian or similar and then we can stop worrying about uwsgi entirely
19:18:02 <clarkb> the main gotcha here is the deployment and image have to change so we're potentially taking a small downtime. I think the best way to approach that is to put the service in the emergency file, land the image update, manually edit the docker-compose.yaml and pull the new image and restart things, then land a system-config update to reflect that
19:18:39 <clarkb> all of which I'm happy to do when I have time or for someone else to push along if they are interested. It's not like I'm really into wsgi servers, I just wanted to improve our paste service and container image builds
19:19:15 <clarkb> in the meantime reviews welcome but I think we had general consensus last week that this was ok for a service like paste
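(Roughly, the cutover described above would look something like the following; the compose directory and the use of docker-compose are illustrative, not the real paths on the server:)

  # 1) on bridge: add the paste server to the emergency file so ansible
  #    leaves it alone during the cutover (exact file/format per our docs)
  # 2) merge the image change and wait for the new image to publish
  # 3) on the paste server (paths illustrative):
  cd /etc/lodgeit-docker
  sudo $EDITOR docker-compose.yaml      # swap the uwsgi entry for granian
  sudo docker-compose pull
  sudo docker-compose down && sudo docker-compose up -d
  # 4) merge the matching system-config change and pull the host back out
  #    of the emergency file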
19:19:19 <clarkb> #topic Upgrading old servers
19:19:48 <clarkb> I have continued to upgrade servers. Since last week I've upgraded all of the nodepool builders and launchers and the osuosl mirror
19:20:06 <clarkb> no new problems have been found with podman / docker compose / noble which is nice
19:20:34 <clarkb> but it is worth mentioning that since I completed that work yesterday Noble's kernel updated and now has a bug managing ipv6 firewall rules that I suspect may cause problems for launching new servers until fixed
19:20:52 <clarkb> the next server on my todo list is the rax iad mirror and it can serve as a canary for ^
19:21:12 <clarkb> #link https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2104134 Noble kernel bug
19:21:30 <clarkb> also I think the new osuosl builder and mirror may perform better than the old ones
19:21:37 <clarkb> the image builds definitely seemed quicker
19:22:20 <clarkb> anyone else have updates for replacing servers?
19:23:53 <fungi> not i
19:23:56 <clarkb> then I'll end this topic with a reminder that every little bit helps. There is a fairly large backlog of servers to replace and I appreciate all the help I can get doing so
19:24:03 <clarkb> #topic Running certcheck on bridge
19:24:23 <clarkb> I don't have any updates on this. Too many more urgent things keep popping up. But I would still like to explore ianw's suggestion on this
19:24:34 <clarkb> particularly since it doesn't seem to be urgent at the moment to move to bridge
19:24:56 <clarkb> related to that, LetsEncrypt is going to stop sending email reminders that your certs expire soon. We've never relied on their emails so not a big deal
19:25:25 <fungi> yeah, i'm happy to abandon my change in gerrit
19:25:42 <fungi> it started out as a tiny thing, single ansible task, and snowballed
19:26:06 <clarkb> I think if it were broken right now we'd hurry up and fix it one way or another
19:26:16 <clarkb> but it was just a github blip iirc so we're limping along otherwise
19:26:47 <clarkb> #topic Working through our TODO list
19:26:52 <clarkb> #link https://etherpad.opendev.org/p/opendev-january-2025-meetup
19:27:04 <clarkb> just a friendly reminder we have this list if you need to find something impactful to do
19:27:12 <clarkb> this applies to existing opendev contributors and new contributors alike
19:27:22 <clarkb> and feel free to reach out if there are questions about a topic you'd like to get involved in
19:28:39 <clarkb> #topic Upgrading to Gitea 1.23.6
19:28:44 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/945414
19:29:09 <clarkb> gitea just made a new release. I pushed a change up to upgrade our installation with links to the changelog. It looks fairly straightforward to me but double checking is always great. Particularly with the openstack release next week
19:29:16 <fungi> thanks, i meant to review that yesterday but time got away from me
19:29:29 <clarkb> good news is the memcached and firewall changes seem to be working well. I want to say gitea performance has been very consistent for me since we implemented those two changes
19:29:40 <fungi> yeah, awesome work!
19:29:50 <clarkb> and again a reminder that I probably won't be able to monitor today but happy to do so tomorrow if people review and want to defer approvals
19:30:21 <fungi> yeah, i'm going over the changelog now, but will refrain from approving tonight
19:30:42 <clarkb> #topic Rotating mailman 3 logs
19:30:59 <fungi> oh, right, we still need... a logrotate config?
19:31:05 <clarkb> this is mostly a reminder that we're not rotating mm3 logs. That is an oversight in our config management, but one that sort of worked out for us because upstream doesn't support rotating logs?
19:31:25 <clarkb> I don't want us to forget and think we should start trying something and see what breaks. Maybe even push up a logrotate config then hold a node?
19:31:44 <fungi> yeah, i remember i looked into it, now i don't recall the details. something about not gracefully following the inode change
19:31:47 <clarkb> we don't need to test in prod necessarily, but I would like to see us fix this before we have a 20Gb log file that fills a disk and we're scrambling to fix it
19:32:01 <clarkb> fungi: ya it keeps writing to the old file until you restart it or something
19:32:10 <clarkb> and maybe our logrotate config rotates the file and restarts the service?
19:32:12 <fungi> seems like if we use the copytruncate method in logrotate it will probably work? at least those were the workarounds i saw mentioned
19:32:15 <clarkb> not ideal but doable
19:32:32 <clarkb> ya copy truncate might work but I think someone reported it had problems too
19:32:42 <fungi> mmm
19:32:48 <clarkb> anyway this is on here as a reminder I don't have any answers. Happy to help review and reread issues though
19:32:55 <fungi> i'll try to revisit that and get something up for it
19:33:00 <clarkb> thanks
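(A minimal sketch of what that logrotate config might look like, assuming the logs live under the mounted volume path shown; copytruncate sidesteps the reopen problem since mailman keeps writing to the original file descriptor:)

  # /etc/logrotate.d/mailman3  (log path illustrative)
  /var/lib/mailman/core/var/logs/*.log {
      weekly
      rotate 12
      compress
      delaycompress
      missingok
      notifempty
      copytruncate
  }

Running logrotate -d /etc/logrotate.d/mailman3 gives a dry run before cron picks it up.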
19:33:18 <clarkb> #topic Open Discussion
19:33:25 <clarkb> #link https://etherpad.opendev.org/p/opendev_newsletter
19:33:44 <clarkb> we've been asked to help write a blurb for the openinfra newsletter going out soon. I put a draft on that etherpad
19:34:07 <clarkb> would be great if you have a moment to read it and check for accuracy and also think about whether or not we want to get involved in the ai web crawler mess
19:34:28 <clarkb> it's possible that calling it out like that might get a target put on our backs, and so far it sounds like we've had it relatively easy compared to other open source projects
19:35:03 <clarkb> I think the foundation newsletter editors want a final draft tomorrow
19:35:07 <clarkb> so provide feedback soon
19:36:21 <clarkb> oh and while I keep saying I can approve things tomorrow, that may end up being weather dependent. The weather pattern that is making it super nice and warm today for a bike ride is going to get smashed into by cooler weather tomorrow and generate thunderstorms with possible tornadoes and large (for us) hail
19:36:41 <clarkb> I don't expect problems because thunderstorms here tend to be mild, but I thought I'd call that out
19:37:10 <clarkb> I might clear out the garage tonight though
19:38:01 <clarkb> anything else?
19:38:14 <fungi> i got nothin'
19:38:58 <clarkb> if that is everything we can all have ~20 minutes to do something else
19:39:01 <clarkb> thank you everyone!
19:39:07 <clarkb> we'll be back here next week at the same time and location
19:39:11 <clarkb> #endmeeting