Tuesday, 2022-08-02

clarkbmeeting will start in just a few minutes18:58
fungioh, yep!19:00
clarkb#startmeeting infra19:01
opendevmeetMeeting started Tue Aug  2 19:01:24 2022 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.19:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.19:01
opendevmeetThe meeting name has been set to 'infra'19:01
clarkb#link https://lists.opendev.org/pipermail/service-discuss/2022-August/000348.html Our Agenda19:01
clarkbI am prepared with an agenda :)19:01
clarkb#topic Announcements19:01
clarkbThe Service Coordinator nomination period has officially begun19:02
ianwo/19:02
clarkbIt started today and will run through August 16, 2022. I'll send a followup email to the thread I started last week warning people of this timeline :)19:02
clarkb#link https://lists.opendev.org/pipermail/service-discuss/2022-July/000347.html19:02
clarkb#topic Topics19:03
clarkb#topic Improving OpenDev CD Throughput19:03
clarkbI don't have anything new on this item. I'm thinking maybe we can pull it off the agenda until we've got new developments? Seems like we've got a lot of other stuff going on in the meantime19:03
clarkbAny objections to that?19:04
funginone from me19:05
ianwnot really, it is a permanent todo :)19:06
clarkbok cool19:06
clarkb#topic Updating Grafana Management Tooling19:06
clarkb#link https://review.opendev.org/q/topic:grafana-json19:06
clarkbianw: ^ I think this stack is largely ready to go though I had some comments on it. Not sure if you want to respin or land as is and then improve in followups19:06
ianwsorry, i wanted to get back and make sure i responded to comments before merging19:07
clarkbeither approach is fine with me. I just didn't want anyone to feel my comments were necessary improvements. I did +2 after all19:07
ianwi should have time to get to it soon19:08
clarkbsounds good19:08
clarkb#topic Bastion Host Updates19:08
clarkbThis item is largely a proxy for the zuul streaming log file cleanup work19:08
clarkbat least for now19:09
clarkbianw: did those changes get included in the weekend upgrade of zuul?19:09
*** kopecmartin_ is now known as kopecmartin19:09
clarkbIf so I think we should manually clear out the remaining files on bridge (and static) and then we can monitor to see how many sneak through due to aborted jobs and similar situations with zuul19:09
ianwi haven't yet pushed the +w on those changes as i haven't responded to corvus' comments on the files potentially not being removed for aborted jobs19:10
ianwthe suggestion was a background thread to remove them19:11
clarkbgot it. FWIW it was my impression that we can land the stack you've got as it is an improvement. Just that we will also want to look into a tmpreaper setup for the straggler files19:11
ianwi'm starting to think perhaps documenting the situation a bit better first, and we can see how much of an issue it is, and perhaps if we can land something that puts them in more of a reserved namespace i'd feel better about a generic cleaner19:12
ianwso i have a half-written doc change that i'll clean up ... very soon :)19:12
clarkbbefore landing the current improvements?19:12
ianwi'll push that on top and feel ok about landing what's there, i think19:13
clarkbgot it19:13
clarkbI just didn't want this to get forgotten as the creation of the tmpfiles will eventually bite us I think :)19:14
clarkbthis plan seems reasonable though. I think we can move on19:14
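The cleanup idea discussed above (an age-based reaper limited to a reserved filename namespace) can be sketched roughly as follows. This is only an illustration, not the tmpreaper tool or anything Zuul ships: the function names, the `example-stream-` prefix used in the usage note, and the age threshold are all hypothetical.

```python
import time
from pathlib import Path


def find_stale_files(directory, prefix, max_age_seconds, now=None):
    """Return regular files in `directory` whose name starts with `prefix`
    and whose modification time is older than `max_age_seconds`."""
    now = time.time() if now is None else now
    stale = []
    for entry in Path(directory).iterdir():
        # Only plain files in the reserved namespace are candidates;
        # symlinks and directories are never touched.
        if entry.is_symlink() or not entry.is_file():
            continue
        if not entry.name.startswith(prefix):
            continue
        if now - entry.stat().st_mtime > max_age_seconds:
            stale.append(entry)
    return sorted(stale)


def reap(directory, prefix, max_age_seconds, dry_run=True):
    """Delete stale files; with dry_run=True only report what would go."""
    stale = find_stale_files(directory, prefix, max_age_seconds)
    for path in stale:
        if not dry_run:
            path.unlink()
    return stale
```

A call such as `reap("/tmp", "example-stream-", 24 * 3600)` would only list candidates; passing `dry_run=False` removes them. Restricting deletion to a known prefix is what makes a generic cleaner safer to run on a shared directory.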
clarkb#topic Upgrading Bionic Servers to Focal/Jammy19:14
clarkb#link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades Notes on the work that needs to be done.19:14
clarkbI've been using the mailman3 work as a good exercise for checking Jammy generally works for our config management19:15
clarkbThere are two changes related to improving Jammy support that are landable today19:15
clarkb#link https://review.opendev.org/c/opendev/system-config/+/851094/2 Run system-config-run-base against Jammy19:15
clarkb#link https://review.opendev.org/c/opendev/system-config/+/851266/1 Fix install-docker role for Jammy19:15
fungithanks, i'll make sure to check those out19:15
clarkbOverall nothing really crazy about using Jammy yet which is a good thing19:16
fungiahh, i already reviewed one of them ;)19:16
clarkbBut I haven't gotten to the apache config for mailman3 yet as I've been struggling with various mailman3 related things recently19:16
clarkbI'm hoping we'll have functioning apache configs in the mailman3 context soon and that will hopefully expose if that server has any new things we need to accommodate19:16
clarkbIf you are curious about the mailman3 work it isn't for a bionic upgrade but it is updating some other old software19:17
clarkb#link https://review.opendev.org/c/opendev/system-config/+/851248 WIP change to run mailman3 on Jammy19:17
ianwthanks, mm3 seems like something we have to do eventually :)19:18
ianw#link https://review.opendev.org/c/zuul/nodepool/+/849273 Dockerfile: move into separate group when running under cgroupsv219:18
ianwis i guess related, and maybe invalidates your "nothing really crazy" bit :)19:18
clarkbThe latest thing is hacking around assumptions in the upstream docker image configs. And for some reason port 8000 doesn't show up as listening even though I've gotten the errors out of the service logs now as far as I can tell19:18
clarkboh ya cgroupsv2 is definitely something that falls under crazy :)19:18
clarkbI'll take a look at that change today19:18
clarkbAnyway there is progress here. Slow but it is happening :)19:19
clarkb#topic Gitea 1.17 Upgrade19:20
clarkbGitea made their 1.17.0 release over the weekend. Yesterday I updated my WIP change that was deploying and testing the gitea release candidates to this final release version19:20
clarkb#link https://review.opendev.org/c/opendev/system-config/+/84720419:20
clarkbI don't think we are in a rush to upgrade. I've tried to call out all of the breaking changes in my commit message and give details why they do or do not affect us19:21
clarkbThe screenshots and our testing seem to show it generally works though. So if we are happy after reviews I think we can upgrade whenever we feel ready19:21
clarkbSeems like we're making good time through the agenda19:23
clarkb#topic Rocky Images Not Booting19:23
clarkbLate last week it was pointed out that our Rocky Linux images were no longer booting19:23
clarkbI pulled one up in a rescue instance and noticed there were no kernels in /boot and there were no entries in grub.cfg to boot19:24
clarkbianw: managed to trace this back to the machine ID missing in the image which prevents kernels from being installed to /boot and that prevents grub from updating its config to boot a kernel19:24
clarkbThe fix for this has landed in DIB. The next steps in addressing this are to make a DIB release and update nodepool to use that release.19:24
clarkbHowever, there is one change that we'd like to get into the DIB release that hasn't merged yet which adds support for rocky linux 919:25
clarkbIn any case we should expect this to be happy in the near future19:25
clarkbianw: out of curiosity how did you trace it back to the machine id?19:26
ianwi started by looking at the dib functional boot jobs; luckily we had one that did work with logs still so had a comparison point19:27
ianweventually i realised that the upstream container had updated since that run, so that cracked open something to explore19:27
ianwcomparing to the older version of the container, the new one didn't have a /boot directory ... which i thought might be the problem at first19:28
ianwwhen that didn't pan out, i started wondering about how the kernel actually got into /boot, which led me to the rpm scripts run by kernel-core package, which led to tracing /bin/kernel-install, which led to realising it was looking for a machine-id ...19:29
ianwjust now though, weirdly, following this same thread on the rocky9 images, there doesn't seem to be a kernel installed either.  but the jobs are booting ok there when built under dib.  so i don't know what's going on with that19:30
clarkbI also wonder if the rhel stuff needs similar fixes? Or are they likely shielded by not starting from a container image?19:32
ianwi do feel like we've been down a similar path with machine-ids and kernel installs on the other build types, in various incarnations19:33
ianwhttps://review.opendev.org/c/openstack/diskimage-builder/+/675056/ even19:34
clarkbha ya ok19:34
ianwoh god, how depressing19:35
ianwhttps://bugzilla.redhat.com/show_bug.cgi?id=1737355#c719:35
ianw"Oh, and I actually debugged this once before 2 years ago and forgot about it!  :) 19:35
ianwhttps://review.opendev.org/#/c/504300/"19:35
ianwso now i've debugged it 3 times?!19:35
fungithat sounds familiar19:36
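Since this failure mode has apparently been rediscovered several times, a small sanity check over an unpacked image root could make it quicker to spot. The sketch below is an illustration only and not part of DIB: `check_image_root` is a hypothetical helper, "missing machine ID" is assumed to mean that `/etc/machine-id` is absent, and kernels are assumed to appear as `/boot/vmlinuz-*`.

```python
from pathlib import Path


def check_image_root(root):
    """Return a list of human-readable problems found under an unpacked
    image root directory (hypothetical helper, not a DIB API)."""
    root = Path(root)
    problems = []

    # Per the discussion above, a missing machine ID kept kernels from
    # being installed into /boot.
    if not (root / "etc" / "machine-id").is_file():
        problems.append("missing /etc/machine-id")

    # Assumption: installed kernels appear as /boot/vmlinuz-<version>.
    boot = root / "boot"
    kernels = sorted(boot.glob("vmlinuz-*")) if boot.is_dir() else []
    if not kernels:
        problems.append("no vmlinuz-* kernels in /boot")

    return problems
```

Running `check_image_root("/path/to/mounted/image")` against a mounted image returns an empty list when both checks pass, and the list of problems otherwise.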
clarkb#topic Open Discussion19:38
clarkbSeems like the last topic was well covered :) and that was all on the agenda19:38
ianwcould i ask for reviews on19:39
ianw#link https://review.opendev.org/q/topic:ansible-lint-update-619:39
fungioh, yep19:40
ianwthere's a lot, but hopefully nothing too controversial.  it should bring everything in sync19:40
ianweverything being our *-jobs repos19:40
fungilots are already merged too19:40
clarkbrelated to linting the new rules about spaces after keywords hit zuul19:41
clarkbI expect that will continue to crop up in places19:41
fungirelated to the rocky troubleshooting, 851520 is probably safe to merge now19:41
ianwspeaking of that, i proposed 19:41
ianw#link https://review.opendev.org/c/openstack/releases/+/851273 hacking: release 5.0.019:42
ianwi don't know who usually looks after that.  that would bring in a flake8 that is 3.10 compatible19:42
fungialso on an entirely separate note, i have a wip behavior change proposed for git-review which could use some feedback as to whether it's desirable: https://review.opendev.org/85006119:42
clarkbianw: I think that openstack often updates those at the beginning of a cycle so may be too late now and have to wait for ~end of october?19:43
ianwyeah, true, i guess it has potential to do more than just support 3.1019:43
fricklerianw: I think it belongs to qa, so kopecmartin 19:44
clarkbAnything else? Last call19:47
fungii got nuthin19:47
fungioh, though i don't expect to be around much thursday, just a heads up to everyone19:49
clarkbthank you for the heads up.19:49
clarkbThanks everyone. We'll see you back here next week19:49
clarkb#endmeeting19:49
opendevmeetMeeting ended Tue Aug  2 19:49:44 2022 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)19:49
opendevmeetMinutes:        https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.html19:49
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.txt19:49
opendevmeetLog:            https://meetings.opendev.org/meetings/infra/2022/infra.2022-08-02-19.01.log.html19:49
fungithanks clarkb!19:49
