19:01:18 <clarkb> #startmeeting infra
19:01:18 <opendevmeet> Meeting started Tue Apr 26 19:01:18 2022 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:18 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:18 <opendevmeet> The meeting name has been set to 'infra'
19:01:32 <frickler> o/
19:01:34 <clarkb> #link https://lists.opendev.org/pipermail/service-discuss/2022-April/000333.html Our Agenda
19:01:46 <clarkb> #topic Announcements
19:01:52 <clarkb> #link https://lists.opendev.org/pipermail/service-announce/2022-April/000037.html Sent an OpenDev Update
19:02:02 <clarkb> Wanted to call out this email I sent. Many of you reviewed it too.
19:02:19 <clarkb> Basic idea is to try and expose more of the user facing things we have done and plan to do on a ~monthlyish time frame
19:02:54 <clarkb> Hopefully in the next one we'll be able to talk about how jammy is now our jam and gerrit 3.5 upgrade is done or will be shortly :)
19:03:10 <clarkb> Also worth noting that the ELK services have been removed and the servers have been shut down
19:03:25 <clarkb> if anyone is looking for that tooling the openstack project has an opensearch that people can look at instead
19:03:35 <clarkb> #topic Actions from last meeting
19:03:39 <clarkb> #link http://eavesdrop.openstack.org/meetings/infra/2022/infra.2022-04-12-19.01.txt minutes from last meeting
19:03:48 <clarkb> It's been a couple of weeks since we had a meeting, but no actions to go over.
19:03:54 <clarkb> #topic Topics
19:04:04 <clarkb> #topic Improving OpenDev CD Throughput
19:04:21 <clarkb> I've reviewed most of the Zuul ansible upgrade and security stance change stack
19:04:33 <clarkb> the last change I need to review in that stack is the 2k line change that adds ansible 5 support.
19:05:18 <clarkb> Overall it is basically as expected. I did ask that the bwrap isolation get a bit of testing in the zuul test framework so that we are confident in some of the assertions we are making with bwrap and those tests got added
19:06:24 <clarkb> I would encourage others to at least familiarize themselves with the major aspects of the change and compare that against their understanding of our CD model to ensure there aren't any major gaps we need to plug
19:06:34 <clarkb> But reviewing the 2k line change is probably overkill for everyone :)
19:07:09 <ianw> ++ thanks for keeping a close eye on it from an opendev infra perspective
19:07:12 <fungi> to be fair, it's mostly file deletions ;)
19:07:19 <clarkb> fungi: not the ansible 5 support addition
19:07:26 <clarkb> it's like +1.8k-200
19:08:08 <clarkb> anything else to add on this topic?
19:08:39 <fungi> oh, sorry, i meant the ansible fork removal
19:09:57 <clarkb> #topic Container Maintenance
19:10:20 <clarkb> I did some work around this recently to trim our base images down to a more manageable set
19:10:39 <clarkb> All of our buster base images are now not being updated as they were removed from config management
19:11:02 <clarkb> docker hub will keep hosting the builds of the buster images until they get timed out due to no use and get deleted
19:11:15 <clarkb> I also removed the python3.7 bullseye images as all our stuff builds against python3.8 or newer now
19:11:22 <clarkb> And finally python3.10 images were added.
19:12:04 <clarkb> If you notice anything using buster (I checked and moved what I could find to bullseye) that should get updated to bullseye instead of buster
19:12:30 <clarkb> Then once we've got jammy up (to be discussed soon) and things have python3.10 testing we can start shifting to the new python3.10 images
19:12:50 <clarkb> #topic Spring Cleaning Old Reviews
19:12:57 <clarkb> #link https://review.opendev.org/q/project:opendev/system-config+status:open+topic:system-config-cleanup could use second reviews
19:13:11 <clarkb> That topic could use second reviews on a few changes. I think ianw reviewed most/all of what remains
19:13:38 <clarkb> Also if you've got a spare half hour looking at your old changes and abandoning anything no longer relevant or stale DNM changes etc would help trim the list down
19:14:15 <clarkb> I've got a number of repo retirements proposed at https://review.opendev.org/q/topic:retire-elk
19:14:38 <clarkb> fungi: ^ last time I did a bunch of retirements you used gertty to mass abandon changes on the repos. Any chance you'd be willing to do that for this set here?
19:14:54 <clarkb> #link https://review.opendev.org/q/topic:retire-elk More repo retirements which means more change abandonments
19:14:58 <fungi> gladly!
19:15:02 <clarkb> thanks!
19:15:06 <fungi> i'll put that on my agenda for after dinner
19:15:42 <clarkb> #topic Support for Jammy Jellyfish
19:16:01 <clarkb> Thanks to the hard work of frickler we've got jammy jellyfish images built with nodepool and the instances boot (at least on some clouds)
19:16:08 <fungi> jam and jelly are similar, but not the same
19:16:25 <clarkb> The mirroring of updates and security repos is possibly not happy based on some checking that frickler has done so that may need investigating
19:16:41 <clarkb> We've also not mirrored arm64 jammy in ubuntu-ports yet or the docker repo
19:16:50 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/839422 Docker package mirroring for Jammy
19:17:00 <fungi> worst case, we could probably rebuild mirror-update.o.o on jammy
19:17:06 <frickler> also need to set up wheels builds I guess
19:17:23 <fungi> yeah, wheel builds will need working nodes of course
19:17:23 <clarkb> frickler: ++ I suspect there will be a number of these small things that we need to address over time. We don't need to solve it all at once
19:18:03 <clarkb> #topic Pruning AFS to make room for Jammy ports and more
19:18:18 <frickler> whom should I nag about getting a nodepool release with the fixes needed to build jammy? my comment in #zuul hasn't received much attention
19:18:27 <clarkb> frickler: corvus typically handles those
19:19:10 <clarkb> For the topic of jammy ports mirroring I think we may need to make more room on the openafs fileservers
19:19:31 <clarkb> recently I did a bunch of pruning of our various rpm mirrors and that actually got us about 320 GB iirc
19:19:53 <clarkb> currently jammy x86_64 is only using about half that so there isn't significant pressure to clean things up but it's good hygiene
19:20:04 <clarkb> One thing fungi and I noticed today is that we mirror source packages for ubuntu and debian repos
19:20:20 <clarkb> but basically nothing should use source packages in our ci system and if a one off needs it pulling from upstream is probably ok
19:20:29 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/839421 Drop source packages for ubuntu ports mirror
19:20:33 <ianw> frickler: i'm looking at nodepool now, unfortunately gate looks broken with boto3 changes. i'll look into it today
19:20:44 <clarkb> This is a change to try removing source packages from the ports repos as a first step
19:20:58 <clarkb> if that ends up being helpful and not problematic we should do the same for ubuntu and debian mirrors
19:21:08 <ianw> #link https://review.opendev.org/c/opendev/system-config/+/837637
19:21:16 <clarkb> One thing to note is that the zuul-jobs configure-mirrors role configures deb-src entries for debian hosts (but not ubuntu)
19:21:18 <ianw> i have that to cleanup some extra fedora too
19:21:23 <clarkb> oh cool
19:21:35 <clarkb> I thought that had landed but I guess not yet
19:22:01 <clarkb> I figure once we've removed source packages we can take stock and decide if removing xenial is necessary or if we should add more disk to the fileservers
19:22:01 <ianw> i can watch it through today, didn't want to walk away from it in case it deleted more than i hoped :)
19:22:15 <clarkb> adding disk is not a problem now that we have deleted the elasticsearch volumes
19:22:22 <clarkb> but if we don't need to add disk that is preferable
19:22:37 <ianw> i guess we also have f36 coming soonish; i think maybe we currently only carry one fedora because 34 was broken
19:22:46 <clarkb> so ya, let's try cleaning up fedora a bit more and clean out source packages and see where we are
19:22:52 <clarkb> ianw: yup that seems correct based on my memory
19:22:52 <frickler> do we have a plan to retire xenial? that would also have the potential for some cleanup
19:23:01 <fungi> #link https://review.opendev.org/839421 Drop source package mirroring for ubuntu-ports
19:23:02 <frickler> and it is eol for quite some time now
19:23:11 <clarkb> frickler: yes, it was brought up with the openstack tc as they apparently use it for old stable branch jobs
19:23:17 <clarkb> the main issue is that opendev is using it too
19:23:19 <fungi> oh, clarkb already linked that earlier
19:23:25 <fungi> #undo
19:24:08 <clarkb> frickler: I think that we should figure out what of opendev's stuff would potentially break with xenial removal and if nothing super important remains (puppet jobs?) we can remove it and switch those jobs to upstream mirrors
19:24:19 <clarkb> and communicate to openstack etc that they are going away
19:24:25 <fungi> worth noting, the frequency of openstack stable branch jobs which use xenial is fairly small, we could look into swapping out the mirror urls for xenial with official mirrors
19:24:32 <clarkb> fungi: ++
19:24:36 <frickler> the issue might be that upstream repos no longer exist?
19:24:46 <clarkb> ubuntu keeps them around unlike red hat distros
19:24:58 <fungi> oh, good point, might need paid support to get to them
19:25:03 <clarkb> there is a secondary question of when do we remove the images themselves but this can be a staged removal starting with the mirror
19:25:05 <clarkb> fungi: you don't
19:25:13 <fungi> right, okay, so only if you want newer updates than when they reached eol
19:25:16 <clarkb> ubuntu keeps really old package mirrors up for a really long time
19:25:21 <clarkb> you just don't get security updates etc
19:25:32 <clarkb> yes, that is my understanding
19:26:03 <clarkb> But ya doing a staged removal starting with the xenial mirror and "fixing" what breaks to use the upstream mirrors makes sense to me
19:26:20 <clarkb> the major thing is identifying what would break in opendev land if that happened just to avoid wedging ourselves
19:26:53 <clarkb> and then we can check if adding disk to afs makes sense after all the cleanups
19:27:55 <ianw> fungi: on 839421; it looks to me that configure-mirrors is by default adding deb-src lines?
19:28:10 <clarkb> ianw: only for debian not ubuntu if I read the ansible correctly
19:28:25 <clarkb> ianw: they each use different sources.list templates
19:28:29 <clarkb> (why I do not know)
19:28:46 <clarkb> and yes removing deb-src from the debian sources.list templates would be a good next step too
19:28:53 <clarkb> or making it a toggleable flag
19:30:03 <ianw> hrm, isn't ubuntu pulling in from https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/tasks/mirror/Ubuntu.yaml#L10 ?
19:30:13 <ianw> i guess checking on an actual node would be easy ...
19:30:30 <clarkb> ianw: yup which is https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/templates/apt/etc/apt/sources.list.j2
19:30:37 <clarkb> and that file doesn't have deb-src in it
19:30:54 <clarkb> https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/tasks/mirror/Debian.yaml#L10-L14 those all have deb-src
19:31:09 <fungi> don't even need to check a node, just look at random job output
19:32:10 <ianw> oh how confusing, ubuntu uses sources.list.j2 and debian sources.list.d
19:32:14 <clarkb> yup
19:32:41 <fungi> https://zuul.opendev.org/t/openstack/build/e88573e7fad746e798fb6b9e1ad47d9b/console#1/0/16/ubuntu-focal
19:33:00 <fungi> it only "get"s the deb lines, no source
19:33:33 <ianw> ok, thanks
19:33:48 <clarkb> anyway I think we've got a few good avenues forward here. Will just take some time to see what sort of impact we get out of changes and make our next decisions based on that
19:33:56 <clarkb> Anything else on the topic of AFS disk pruning?
19:34:33 <fungi> er, i meant to link https://zuul.opendev.org/t/openstack/build/e88573e7fad746e798fb6b9e1ad47d9b/console#0/4/23/ubuntu-focal
19:35:28 <ianw> cool, i've approved that, but i have a feeling we may need a manual run to actually delete things
19:35:39 <clarkb> ianw: yup that was another concern. We're learning as we go :)
19:35:53 <clarkb> I suppose if that happens rm'ing all the .xz files isn't too hard with find. Just annoying
19:36:28 <ianw> i'll have to look it up, but when we've dropped things before there's some clearrepo type commands ...
19:37:02 <clarkb> ianw: in ubuntu-ports I think we can clear out the xenial cruft too
19:37:19 <clarkb> we don't have xenial packages in the pool in ports but do have the indexes which are now stale and not related to an actual mirror we have
19:37:30 <ianw> it was debian-security from my notes when we did some cleanups
19:37:38 <fungi> the idea behind 839421 is that we'll be able to see if it does or doesn't remove the source packages after the next run
19:37:43 <ianw> and the stretch removals
19:38:09 <fungi> yes, removing entire releases definitely included manual commands
19:38:18 <fungi> not sure if the same goes for removing architectures
19:39:07 <ianw> looks like i didn't log *what* i did, just that i did something :) https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2021-11-12.log.html#t2021-11-12T00:54:32
19:39:35 <clarkb> heh
19:39:39 <ianw> "ianw: #status log debian-stretch has been yeeted from nodepool and AFS mirrors"
19:39:54 <clarkb> Alright we don't have to solve the details of cleanup now. I'm happy to help work through it though. Just let me know
19:40:00 <clarkb> let's move on
19:40:00 <ianw> ++
19:40:05 <clarkb> #topic Deleting the subunit2sql MySQL Trove instance
19:40:30 <clarkb> As mentioned earlier I deleted the ELK services and servers. At the same time I cleaned up the openstack health and subunit worker servers that were shutdown a few months ago
19:40:45 <clarkb> The health and subunit workers sat in front of a subunit2sql mysql trove instance in rax
19:40:57 <clarkb> nothing is using that database right now and it is quite large: 286GB
19:40:58 <fungi> i can't speak for the qa team, but it seems like that data is getting increasingly stale so keeping it around is unlikely to be much benefit
19:41:12 <clarkb> ya the question is can we just yeet the trove db entirely and delete it?
19:41:18 <clarkb> does anyone here have concerns with doing that?
19:41:40 <clarkb> or do we need to make a backup of the data first? I lean towards deleting it since no one was using the service for a while after it broke which is what led us to turning it off
19:41:51 <clarkb> I can triple check with gmann before deleting it too if we are happy with doing so ourselves
19:42:31 * frickler had to look up "yeet" in a dictionary
19:42:50 <frickler> but no objection
19:43:01 <fungi> i hope it was a recent dictionary
19:43:10 <fungi> dict(1) doesn't return any results
19:43:24 <clarkb> ianw: if you don't have any objection I'll take that as the plan to gmann (just a straight delete)
19:43:37 <frickler> duckduck found some urban dict refs
19:44:14 <ianw> yeah that seems fine, i can't imagine there is anything useful now
19:44:35 <clarkb> great, I'll proceed with that being the plan and double check with gmann. Thanks everyone!
19:44:39 <clarkb> #topic Open Discussion
19:44:41 <ianw> my urban dictionary is a 10yo boy :)
19:44:48 <clarkb> And now a few minutes for anything else we might have to bring up
19:45:03 <frickler> just in case you hadn't seen it, nova has dropped py36 support
19:45:18 <frickler> so devstack is failing on centos 8 now
19:45:36 <fungi> i expect the rest of openstack to follow suit before the end of zed (and hopefully before milestone 1)
19:45:52 <ianw> yeah i saw the removal change, i think 9-stream is in a state to take over
19:46:15 <ianw> (from an infra pov, builds are stable)
19:46:32 <frickler> well I still think either is stable in devstack, but that's another discussion
19:46:34 <clarkb> supposedly centos 9 stream should be better about addressing issues than 8 is/was
19:46:37 <fungi> apparently rhel 9 is still in beta
19:46:42 <frickler> s/either/neither/
19:47:16 <clarkb> I think that if projects want a long term stable python 3.6 then rocky linux may be the thing we offer best suited to that
19:47:35 <clarkb> but sounds like they are in a hurry to leave 3.6 behind so not a big deal for us
19:48:01 <ianw> well except for old branches, but centos jobs have never run there i don't think
19:49:03 <clarkb> I pushed up a gerrit 3.5 stack of changes at https://review.opendev.org/c/opendev/system-config/+/839250/ and child
19:49:26 <clarkb> I noticed our builds were a bit out of date compared to upstream and also wanted to set a config option based on upstream ml discussion to improve memory use
19:50:14 <ianw> oh good, probably can start thinking about preparing for that upgrade (checklists, etc.)?
19:50:32 <clarkb> ya that's my goal to try and start actively planning the upgrade
19:50:53 <clarkb> One thing I wanted to check is if 3.5 can downgrade to 3.4
19:50:55 <ianw> i'm probably in a better position to drive an upgrade at some point soon-ish than i have been previously this year, if we like
19:51:06 <clarkb> since that has been a nice get out of jail card we have had for the last couple of upgrades
19:51:20 <clarkb> ianw: that would be great. I'm also happy to help as I've been more plugged into upstream lately
19:52:01 <fungi> yeah, i'm always happy to pitch in as well
19:52:03 <ianw> ok, we've had downgrade instructions in the checklists before, so maybe we should start a new checklist page as a first step?
19:52:10 <clarkb> ianw: ++
19:52:20 <ianw> i can put that near the top of the todo
19:52:28 <ianw> #link https://review.opendev.org/q/topic:loop-no-log-json
19:52:56 <ianw> ^ that's a couple of changes after we found that yum/dnf/package+yum/dnf was not showing up in console
19:53:20 <ianw> zuul console. i've added a test-case so hopefully that addresses your comment clarkb?
19:53:31 <clarkb> cool I'll take a look and see
19:54:14 <ianw> #link https://review.opendev.org/c/zuul/zuul/+/837458
19:54:54 <ianw> is another low-priority one that a user reported problems with their substitution strings and turns out i think zuul can do better
19:56:15 <clarkb> I'll do my best to get to the various reviews but I promised corvus that ansible 5 zuul change review today so that will be the priority.
19:56:36 <ianw> no probs, just background things compared to that :)
19:57:04 <clarkb> And that takes us basically to the end of our hour. I'm hungry so I'll call it here :)
19:57:21 <clarkb> thank you everyone. As always feel free to reach out in #opendev or at service-discuss@lists.opendev.rog
19:57:30 <clarkb> except it's opendev.org not opendev.rog
19:57:34 <clarkb> #endmeeting