19:00:02 <clarkb> #startmeeting infra
19:00:02 <opendevmeet> Meeting started Tue Mar 25 19:00:02 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:02 <opendevmeet> The meeting name has been set to 'infra'
19:00:07 <clarkb> #topic Announcements
19:00:32 <clarkb> OpenStack is making its 2025.1 Epoxy release next week. And the week after that is the virtual PTG
19:01:17 <clarkb> Things to be aware of when making changes. I suspect we'll put the meetpad hosts in the emergency file late next week after testing that things work to avoid unexpected upgrades
19:01:38 <clarkb> Also we aren't going to have OpenDev PTG time as we all tend to be busy participating in other groups' PTG blocks
19:02:06 <clarkb> Anything else to announce?
19:02:14 <fungi> openinfra summit date and venue have been announced, volunteers sought for the programming committee with a kickoff call scheduled for thursday. contact Helena Spease <helena@openinfra.dev> if you want to help in any way
19:02:26 <fungi> #link https://openinfra.org/blog/openinfra-summit-2025 Announcing the OpenInfra Summit Europe 2025!
19:02:33 <clarkb> oh ya that just happened
19:04:26 <clarkb> #topic Zuul-launcher image builds
19:04:44 <clarkb> corvus recently added all of our x86_64 regions to zuul-launcher
19:05:00 <corvus> all the x86 clouds are present now, but i haven't chased down the image builds
19:05:05 <clarkb> the clouds that have smaller and bigger flavor sizes also support the smaller and bigger labels
19:05:07 <corvus> (to make sure they all have images built)
19:05:11 <clarkb> gotcha
19:05:28 <corvus> sounds like that may be bad timing because of the noble thing? :)
19:06:06 <corvus> but the noble kernel thing got me thinking
19:06:12 <corvus> zuul
19:06:13 <corvus> er
19:06:59 <corvus> zuul-launcher does have the ability to validate builds. we're not exercising that yet. and i don't think we want to validate too much. but i think maybe validating things like basic network connectivity and iptables, etc, would be reasonable.
19:07:36 <corvus> so if we think it's a good idea, and if anyone wants to volunteer to write a validation job that does something like that, i'd be happy to mentor/review.
19:07:52 <clarkb> a "does it boot and have our base configuration" check does seem reasonable
19:08:04 <clarkb> I agree we don't want to try and catch every possible problem our jobs may have though
19:08:56 <clarkb> #action someone look into basic image validation with zuul-launcher
19:09:01 <clarkb> anything else on this subject?
19:09:07 <corvus> nope
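For context, a validation job along the lines corvus describes might boot a node from the freshly built image and run a small script like the sketch below. This is only an illustration of the idea, not the actual job: the hostnames, the sudo assumption, and the specific checks are all placeholders.

    #!/usr/bin/env python3
    """Sketch of a basic image validation check: does the node resolve DNS,
    reach the network, and have firewall rules loaded? Illustrative only."""

    import socket
    import subprocess
    import urllib.request


    def check_dns(host="opendev.org"):
        # Fails if the image's resolver configuration is broken.
        socket.getaddrinfo(host, 443)


    def check_https(url="https://opendev.org/"):
        # Basic outbound connectivity through whatever firewall is active.
        urllib.request.urlopen(url, timeout=30).read(1)


    def check_firewall():
        # Assumes the base configuration installs iptables/ip6tables rules and
        # that the job user can sudo; an empty ruleset should fail the build.
        for cmd in (["sudo", "iptables", "-S"], ["sudo", "ip6tables", "-S"]):
            out = subprocess.run(cmd, capture_output=True, text=True, check=True)
            rules = [line for line in out.stdout.splitlines() if not line.startswith("-P")]
            if not rules:
                raise RuntimeError("no rules beyond default policies: %s" % " ".join(cmd))


    if __name__ == "__main__":
        for check in (check_dns, check_https, check_firewall):
            check()
        print("basic image validation passed")

The point is to catch "image does not boot or cannot reach the network" class failures early, without trying to reproduce everything the CI jobs already test.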
19:09:11 <clarkb> #topic Container hygiene tasks
19:09:17 <clarkb> #link https://review.opendev.org/q/topic:%22opendev-python3.12%22+status:open Update images to use python3.12
19:09:36 <clarkb> we did end up managing to update our base python container images. The next step is to start rolling out updates to rebuild the containers to use python3.12
19:09:53 <clarkb> this ensures we're using the newly built base image content and moves us from python3.11 to 3.12
19:10:08 <clarkb> this isn't terribly urgent but I think it is good hygiene to make these updates periodically if people have time to do reviews
19:10:21 <clarkb> I'm going to pop out this afternoon but I'm happy to approve things tomorrow when I can monitor if reviews happen
19:10:42 <clarkb> the early changes to python3.12 have just worked thankfully as well
19:11:03 <corvus> are we happy with the performance of py 3.12 these days? i want to say at one point we were not super motivated to bump zuul for performance reasons, but maybe that time has passed?
19:11:34 <clarkb> corvus: it's been fine for the small utilities we use it for within opendev. I think the zuul unittest jobs are still consistently slower on noble with py312 than they are with jammy/bookworm and py311
19:11:58 <clarkb> I think zuul is likely to be the only thing where the performance might be noticed by us? since zuul is often quite busy. Maybe nodepool too
19:12:11 <clarkb> but I'm also happy to try it and see if we notice the difference isn't massive
19:12:19 <corvus> do you happen to know about 3.13?
19:12:49 <clarkb> I have it installed locally but haven't done much with it. We don't have base images for 3.13 yet but we can add those now that 3.10 is gone
19:13:02 <corvus> wonder if maybe we should just wait for that vs going to 12...
19:13:05 <clarkb> I don't know if there is a good way to install it for zuul unittests other than pyenv
19:13:25 <fungi> unless we add debian-trixie images
19:13:36 <clarkb> and unfortunately with pyenv you either do an hour-long compile to get something production-like or a 2 minute compile and it's slower. Good for testing compatibility and less for performance
19:13:47 <fungi> might be getting close to time to think about that, it's already entered soft freeze time
19:14:06 <corvus> okay. i think that's good info. i'm not going to rush out and bump zuul right now; and instead will keep an eye out for a 3.13 opportunity. and we can bump to 3.12 when it's more pressing.
19:14:19 <clarkb> works for me
19:14:22 <fungi> though the official trixie release is probably still a few months away
19:14:53 <clarkb> maybe we add an early pyenv job to check for compatibility issues but don't read into performance too much
19:15:10 <clarkb> unless miraculously the 3.13 compiled without optimizations manages to be faster than the optimized 3.11 and 3.12 builds from the distro
19:15:21 <clarkb> in that case we can probably safely assume the optimized builds will be at least as performant
19:15:47 <corvus> i might try some time trials locally with docker images, but that is not at the top of my list right now
19:16:04 <clarkb> sounds good
19:16:07 <fungi> yeah, part of the challenge with optimizing cpython builds is that it needs to compile, then run a battery of tests with profiling on, then build again based on the results
19:16:26 <clarkb> anything else on this topic?
19:16:27 <fungi> so it's basically double the compile time plus the minimal make test time
19:16:39 <corvus> good here
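For context, the "time trials" corvus mentions could start as simply as running the same stdlib timeit micro-benchmark inside each upstream interpreter image. The sketch below assumes Docker and the official python:3.11/3.12/3.13 bookworm image tags are available locally; the benchmark itself is arbitrary and only illustrates the approach, not Zuul's real workload.

    #!/usr/bin/env python3
    """Rough interpreter comparison: run the same stdlib timeit micro-benchmark
    inside each upstream image and print its summary line. Illustrative only."""

    import subprocess

    IMAGES = ["python:3.11-bookworm", "python:3.12-bookworm", "python:3.13-bookworm"]
    # Arbitrary micro-benchmark; swap in something closer to the real workload.
    BENCH = ["-m", "timeit", "-s", "data = list(range(10000))", "sorted(data, reverse=True)"]

    for image in IMAGES:
        result = subprocess.run(
            ["docker", "run", "--rm", image, "python"] + BENCH,
            capture_output=True, text=True, check=True,
        )
        # timeit prints its own summary, e.g. "1000 loops, best of 5: ... per loop"
        print(image, "->", result.stdout.strip())

Something more representative (pyperformance, or Zuul's own test suite) would be needed before drawing real conclusions about 3.12 or 3.13 performance.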
19:16:45 <clarkb> #topic Dropping uWSGI
19:17:06 <clarkb> this is a related item where I'd like to drop uWSGI. After rebuilding the base container images last week this is a bit less urgent so I've deprioritized it
19:17:18 <clarkb> but my goal is still to switch lodgeit to granian or similar and then we can stop worrying about uwsgi entirely
19:18:02 <clarkb> the main gotcha here is the deployment and image have to change so we're potentially taking a small downtime. I think the best way to approach that is to put the service in the emergency file, land the image update, manually edit the docker-compose.yaml and pull the new image and restart things, then land a system-config update to reflect that
19:18:39 <clarkb> all of which I'm happy to do when I have time or for someone else to push along if they are interested. It's not like I'm really into wsgi servers, I just wanted to improve our paste service and container image builds
19:19:15 <clarkb> in the meantime reviews welcome but I think we had general consensus last week that this was ok for a service like paste
19:19:19 <clarkb> #topic Upgrading old servers
19:19:48 <clarkb> I have continued to upgrade servers. Since last week I've upgraded all of the nodepool builders and launchers and the osuosl mirror
19:20:06 <clarkb> no new problems have been found with podman / docker compose / noble which is nice
19:20:34 <clarkb> but it is worth mentioning that since I completed that work yesterday Noble's kernel updated and now has a bug managing ipv6 firewall rules that I suspect may cause problems for launching new servers until fixed
19:20:52 <clarkb> the next server on my todo list is the rax iad mirror and it can serve as a canary for ^
19:21:12 <clarkb> #link https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2104134 Noble kernel bug
19:21:30 <clarkb> also I think the new osuosl builder and mirror may perform better than the old ones
19:21:37 <clarkb> the image builds definitely seemed quicker
19:22:20 <clarkb> anyone else have updates for replacing servers?
19:23:53 <fungi> not i
19:23:56 <clarkb> then I'll end this topic with a reminder that every little bit helps. There is a fairly large backlog of servers to replace and I appreciate all the help I can get doing so
19:24:03 <clarkb> #topic Running certcheck on bridge
19:24:23 <clarkb> I don't have any updates on this. Too many more urgent things keep popping up. But I would still like to explore ianw's suggestion on this
19:24:34 <clarkb> particularly since it doesn't seem to be urgent at the moment to move to bridge
19:24:56 <clarkb> related to that, LetsEncrypt is going to stop sending email reminders that your certs expire soon. We've never relied on their emails so not a big deal
19:25:25 <fungi> yeah, i'm happy to abandon my change in gerrit
19:25:42 <fungi> it started out as a tiny thing, single ansible task, and snowballed
19:26:06 <clarkb> I think if it were broken right now we'd hurry up and fix it one way or another
19:26:16 <clarkb> but it was just a github blip iirc so we're limping along otherwise
19:26:47 <clarkb> #topic Working through our TODO list
19:26:52 <clarkb> #link https://etherpad.opendev.org/p/opendev-january-2025-meetup
19:27:04 <clarkb> just a friendly reminder we have this list if you need to find something impactful to do
19:27:12 <clarkb> this applies to existing opendev contributors and new contributors alike
19:27:22 <clarkb> and feel free to reach out if there are questions about a topic you'd like to get involved in
19:28:39 <clarkb> #topic Upgrading to Gitea 1.23.6
19:28:44 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/945414
19:29:09 <clarkb> gitea just made a new release. I pushed a change up to upgrade our installation with links to the changelog. It looks fairly straightforward to me but double checking is always great. Particularly with the openstack release next week
19:29:16 <fungi> thanks, i meant to review that yesterday but time got away from me
19:29:29 <clarkb> good news is the memcached and firewall changes seem to be working well. I want to say gitea performance has been very consistent for me since we implemented those two changes
19:29:40 <fungi> yeah, awesome work!
19:29:50 <clarkb> and again a reminder that I probably won't be able to monitor today but happy to do so tomorrow if people review and want to defer approvals
19:30:21 <fungi> yeah, i'm going over the changelog now, but will refrain from approving tonight
19:30:42 <clarkb> #topic Rotating mailman 3 logs
19:30:59 <fungi> oh, right, we still need... a logrotate config?
19:31:05 <clarkb> this is mostly a reminder that we're not rotating mm3 logs. That is an oversight in our config management, but one that sort of worked out for us because upstream doesn't support rotating logs?
19:31:25 <clarkb> I don't want us to forget, and I think we should start trying something and see what breaks. Maybe even push up a logrotate config then hold a node?
19:31:44 <fungi> yeah, i remember i looked into it, now i don't recall the details. something about not gracefully following the inode change
19:31:47 <clarkb> we don't need to test in prod necessarily, but I would like to see us fix this before we have a 20GB log file that fills a disk and we're scrambling to fix it
19:32:01 <clarkb> fungi: ya it keeps writing to the old file until you restart it or something
19:32:10 <clarkb> and maybe our logrotate config rotates the file and restarts the service?
19:32:12 <fungi> seems like if we use the copytruncate method in logrotate it will probably work? at least those were the workarounds i saw mentioned
19:32:15 <clarkb> not ideal but doable
19:32:32 <clarkb> ya copytruncate might work but I think someone reported it had problems too
19:32:42 <fungi> mmm
19:32:48 <clarkb> anyway this is on here as a reminder, I don't have any answers. Happy to help review and reread issues though
19:32:55 <fungi> i'll try to revisit that and get something up for it
19:33:00 <clarkb> thanks
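For context, the inode problem fungi refers to is easy to reproduce: a service that keeps its log file open (as Python's logging module does) follows the renamed file rather than the new one, while copytruncate leaves the inode in place. The sketch below is a generic illustration under the assumption of a long-lived append-mode handle; it is not mailman-specific.

    #!/usr/bin/env python3
    """Why renaming a log out from under a running service loses new entries,
    and why logrotate's copytruncate mode avoids that. Illustrative only."""

    import os
    import shutil
    import tempfile

    workdir = tempfile.mkdtemp()

    # Default rotation (rename + create): the open handle follows the inode.
    log = os.path.join(workdir, "rename.log")
    writer = open(log, "a", buffering=1)   # long-lived handle, never reopened
    writer.write("before rotation\n")
    os.rename(log, log + ".1")             # what logrotate does by default
    open(log, "w").close()                 # fresh, empty rename.log
    writer.write("after rotation\n")       # lands in rename.log.1, not rename.log
    print("rename.log:", repr(open(log).read()))
    print("rename.log.1:", repr(open(log + ".1").read()))
    writer.close()

    # copytruncate: the inode stays put, so the writer keeps working.
    log = os.path.join(workdir, "copytruncate.log")
    writer = open(log, "a", buffering=1)
    writer.write("before rotation\n")
    shutil.copy(log, log + ".1")           # archive a copy of the contents
    os.truncate(log, 0)                    # then empty the original in place
    writer.write("after rotation\n")       # append mode writes at the new end
    print("copytruncate.log:", repr(open(log).read()))
    print("copytruncate.log.1:", repr(open(log + ".1").read()))
    writer.close()

The usual caveat with copytruncate is that anything logged between the copy and the truncate is lost, which is generally an acceptable trade-off for low-volume logs.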
19:33:18 <clarkb> #topic Open Discussion
19:33:25 <clarkb> #link https://etherpad.opendev.org/p/opendev_newsletter
19:33:44 <clarkb> we've been asked to help write a blurb for the openinfra newsletter going out soon. I put a draft on that etherpad
19:34:07 <clarkb> would be great if you have a moment to read it and check for accuracy and also think about whether or not we want to get involved in the ai web crawler mess
19:34:28 <clarkb> it's possible that calling it out like that might get a target put on our backs and so far it sounds like we've had it relatively easy compared to other open source projects
19:35:03 <clarkb> I think the foundation newsletter editors want a final draft tomorrow
19:35:07 <clarkb> so provide feedback soon
19:36:21 <clarkb> oh and while I keep saying tomorrow I can approve things, that may end up being weather dependent. The weather pattern that is making it super nice and warm today for a bike ride is going to get smashed into by cooler weather tomorrow and generate thunderstorms with possible tornadoes and large (for us) hail
19:36:41 <clarkb> I don't expect problems because thunderstorms here tend to be mind, but I thought I'd call that out
19:36:46 <clarkb> *tend to be mild
19:37:10 <clarkb> I might clear out the garage tonight though
19:38:01 <clarkb> anything else?
19:38:14 <fungi> i got nothin'
19:38:58 <clarkb> if that is everything we can all have ~20 minutes to do something else
19:39:01 <clarkb> thank you everyone!
19:39:07 <clarkb> we'll be back here next week at the same time and location
19:39:11 <clarkb> #endmeeting