19:00:28 <clarkb> #startmeeting infra
19:00:28 <opendevmeet> Meeting started Tue Jan 14 19:00:28 2025 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:28 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:28 <opendevmeet> The meeting name has been set to 'infra'
19:00:35 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/3WNI573NTZ5VYPTQ5DBQVKXVJSC2QTGB/ Our Agenda
19:00:41 <clarkb> #topic Announcements
19:01:01 <clarkb> The OpenInfra foundation individual board member election is happening right now. It ends at 1900 UTC Friday
19:01:17 <clarkb> this is a good time to check your email inbox for your ballot if you are a foundation member
19:02:10 <clarkb> and if you did get one please vote. There is a great list of candidates
19:02:25 <clarkb> Anything else to announce?
19:03:12 <fungi> i'm going to be around at odd hours for the next week, just a heads up
19:03:19 <fungi> (travelling tomorrow through monday)
19:03:44 <fungi> i'll still be back in time for our thing starting on tuesday though
19:04:21 <clarkb> thanks for the heads up
19:04:35 <clarkb> #topic Zuul-launcher image builds
19:04:41 <clarkb> #link https://review.opendev.org/q/hashtag:niz+status:open is next set of work happening in Zuul
19:05:04 <clarkb> I believe this set of changes is next up. And I believe that was hung up on reliability of docker images which should be improving now that we're mirroring images
19:05:19 <clarkb> corvus: is now a good time to start looking into that or does it still need to settle a bit?
19:05:48 <corvus> i'm about to refresh the change to switch to using quay images in zuul
19:05:58 <corvus> so i think things are settled enough now to start merging changes
19:06:03 <clarkb> great!
19:06:09 <corvus> i'll wait until that stack is fully reviewed before i approve them
19:06:17 <corvus> (i think that's an all-or-none stack)
19:06:21 <clarkb> I'll try to take a look at them before too long
19:06:38 <corvus> thanks, i'd love to merge them this week
19:07:01 <corvus> stack has +2s on half of it, so i think it's plausible
19:07:04 <clarkb> anything else on this topic?
19:07:15 <corvus> nope
19:07:23 <clarkb> #topic Upgrading old servers
19:07:35 <clarkb> tonyb isn't around today so don't expect any updates on the wiki work
19:07:47 <clarkb> given that I think we can jump straight into the noble and podman status
19:07:56 <clarkb> #topic Deploying new Noble Servers
19:08:20 <clarkb> Last week I discovered that the Noble image tonyb uploaded everywhere didn't actually make it into rax. I suspect that the ansible playbook using the openstack modules isn't capable of uploading to rax
19:08:47 <clarkb> I manually uploaded the vhd image tonyb created in his bridge homedir using openstacksdk which is capable of uploading to rax
19:09:14 <fungi> did you upload it to all regions or just dfw?
19:09:16 <clarkb> During testing of those images I discovered that OSC cannot boot instances with network addresses in rax, but again using the sdk directly worked (launch node uses the sdk)
19:09:18 <clarkb> fungi: all three regions
19:09:21 <fungi> thanks!
19:09:48 <clarkb> then yesterday I booted a replacement paste server on top of this image. Deployment of this then got hung up on docker rate limits
19:10:12 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/939124 deploy paste02 on noble
19:10:26 <clarkb> the parent change of ^ switches paste to pulling mariadb:10.11 from quay instead of docker hub to work around that
19:11:00 <clarkb> if we're comfortable doing that for production deployments it would be great to get those changes in so we can start on migrating the paste data and get some real world feedback on podman on noble
19:11:14 <fungi> interestingly, we also needed to use configdrive with the noble image tony built, but not with the jammy image rackspace supplies
19:11:47 <clarkb> oh right. The image is from ubuntu upstream's qcow2 converted to vhd aiui and the cloud init there needs config drive (I'm guessing rax metadata services isn't compatible?)
19:12:10 <clarkb> the other thing we discovered is that podman on noble will only start containers on boot if they are marked restart:always
19:12:15 <fungi> might be down to a difference in cloud-init version, yeah. it's definitely installed at least
19:12:48 <clarkb> this is a behavior change from docker which would restart under other situations as well (though that restarting wasn't super consistent due to spontaneous reboots putting containers in a failed state)
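The podman-on-Noble restart behavior described above boils down to one compose setting. A minimal sketch (service name and image path are illustrative, not from the actual system-config change):

```yaml
# docker-compose.yaml fragment: per the discussion above, podman on
# Noble only starts a container at boot when its restart policy is
# "always"; other policies that docker would honor are not restarted.
services:
  mariadb:
    image: quay.io/opendevmirror/mariadb:10.11
    restart: always
```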
19:13:17 <clarkb> anyway long story short reviews on the quay.io source for mariadb:10.11 would be great that way we can proceed if comfortable with that
19:13:26 <clarkb> and we'll see if we learn more new things afterwards
19:13:36 <corvus> +2 from me i say +w at will
19:13:36 <fungi> i'm almost certain we will
19:13:50 <clarkb> excellent thanks
19:13:52 <clarkb> #topic Mirroring Useful Container Images
19:14:14 <clarkb> which takes us to a followup on this. Indications so far are that this will be super helpful. Thank you corvus for pushing this along
19:14:28 <fungi> nice segue
19:14:37 <corvus> \o/ whew!  i hope it works!
19:14:49 <fungi> yes, thank you!!!
19:14:55 <clarkb> currently the list of tags is somewhat limited so I had to add 10.11 to mariadb yesterday. If you see or think of other tags we should add then adding them early helps keep things moving along
19:15:02 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/939169
19:15:08 <clarkb> the change to do that is simple for an existing image ^
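For context, consuming a mirrored tag from a deployment is just an image source swap. A hedged sketch, assuming the mirror org is quay.io/opendevmirror (the exact path in the linked change may differ):

```yaml
# docker-compose.yaml fragment: pull mariadb from the quay mirror
# instead of Docker Hub to avoid docker.io rate limits.
services:
  mariadb:
    # was: image: docker.io/library/mariadb:10.11
    image: quay.io/opendevmirror/mariadb:10.11
```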
19:15:22 <fungi> i also mentioned during the openstack tc meeting a few minutes ago that we're making some progress on a reusable pattern for this sort of solution
19:15:28 <corvus> also, protip, if adding a new image+job, don't forget to add the new job to the project pipeline config.  ;)
19:15:43 <clarkb> the other thing I noticed this morning in checking that the 10.11 was mirrored properly is that we get different manifests due to different arches
19:16:13 <clarkb> corvus: I don't know if there is a way to have docker or skopeo etc copy all available arch images in the manifest?
19:16:29 <clarkb> that might result in over mirroring but would make checking status simpler and avoid problems with arm for example
19:16:33 <corvus> yes i think there is -- but isn't everything x86 right now?
19:16:43 <clarkb> corvus: yes everything but nodepool is x86 in opendev
19:17:03 <corvus> like, this should/will be a problem once we do something with arm...
19:17:08 <clarkb> so I think we're most likely to hit this when building nodepool images that try to fetch the python base image from quay?
19:17:21 <clarkb> since I think we're only mirroring x86 right now
19:17:27 <clarkb> and maybe that is a non issue due to niz?
19:17:32 <corvus> but for the moment, we're mirroring on x86 nodes and using on x86 nodes so shouldn't be an immediate problem
19:17:32 <corvus> okay sounds like we're on the same page
19:17:43 <corvus> it's certainly a low priority for me due to niz :)
19:17:49 <clarkb> yup I just wanted to call it out as something I noticed. It's not impactful yet
19:18:19 <fungi> if we ever end up containering anything on the mirror servers in our cloud providers, i could see it coming up
19:18:30 <corvus> if someone does want to address that; i would look into existing patterns in zuul-jobs first because i'm pretty sure we have some "all arches" stuff in there somewhere with skopeo or similar.
19:18:31 <fungi> but yeah, nothing on the horizon
19:18:32 <clarkb> I was able to confirm the 10.11 images matched between quay and docker hub by fetching them locally. A docker image list showed them with the same id and docker inspect collapses them both into the same json output
19:18:57 <clarkb> just in case anyone else needs to do that later ^ seems to work
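The verification clarkb describes can be sketched as the following commands (mirror path is an assumption; requires a local docker daemon and network access):

```
# Pull the same tag from both registries.
docker pull docker.io/library/mariadb:10.11
docker pull quay.io/opendevmirror/mariadb:10.11

# Both tags should report the same IMAGE ID if the mirror is faithful.
docker image list | grep mariadb

# docker inspect on both references collapses into one json document
# when they point at the same underlying image.
docker inspect docker.io/library/mariadb:10.11 quay.io/opendevmirror/mariadb:10.11
```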
19:19:24 <clarkb> any other questions concerns or feedback with the mirrored images?
19:19:51 <corvus> i think --all is the arg to skopeo
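The `--all` flag corvus mentions copies every architecture in a manifest list rather than just the host platform's image. A command sketch (registry paths are illustrative; pushing requires credentials for the destination):

```
# Mirror an image including all arches in its manifest list.
skopeo copy --all \
  docker://docker.io/library/mariadb:10.11 \
  docker://quay.io/opendevmirror/mariadb:10.11
```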
19:20:10 <fungi> pipelines... do we run the mirror routine in promote/post in addition to periodic?
19:20:27 <clarkb> fungi: no currently it is only periodic
19:20:39 <fungi> or i guess, what i meant to ask is should we?
19:20:40 <clarkb> but it is also primarily for images we don't produce
19:20:45 <clarkb> I think the only exception to that is the base python images
19:20:54 <clarkb> so we could potentially run the mirroring for those as part of promote/post
19:21:01 <corvus> we won't need to once we switch canonical locations to quay
19:21:06 <clarkb> true
19:21:15 <corvus> which depends on the noble upgrade.  so there's a window where this might be useful
19:21:20 <fungi> well, more for updates to system-config which add a new image, we might want to merge another change to use it from that location without waiting a day
19:21:41 <clarkb> oh I see
19:21:56 <clarkb> ya I think that could be optimized
19:22:10 <fungi> mainly pondering how we might be able to iterate more quickly on switching image locations or updating versions
19:22:29 <corvus> i think we could do both things; should look at file matchers for general images in system-config.  can just chain jobs for python-base updates.
19:22:34 <fungi> i suppose it wouldn't just be on adding a new image but also adding different versions when we're pinning specific ones
19:22:43 <clarkb> corvus: ++
19:22:50 <fungi> yeah, that should work
19:22:57 <corvus> fun trick btw:
19:23:08 <corvus> add file matchers that never match anything, then the job only runs on job config updates.
19:23:14 <clarkb> hacks
19:23:21 <fungi> heh
19:23:25 <corvus> that's where i'd start for the system-config general new image problem
19:23:43 <clarkb> will the file matchers get ignored in periodic though?
19:23:49 <clarkb> or will that stop us from running in periodic?
19:23:57 <clarkb> maybe we need a child job for post/promote and that hack
19:24:07 <corvus> i would only add them in a project-pipeline variant
19:24:12 <clarkb> aha
19:24:18 <fungi> yeah, that seems cleaner
19:24:20 <corvus> so not to the job, just to the invocation of it on system-config
19:24:32 <corvus> in post or deploy or whatever
19:24:36 <clarkb> yup makes sense
19:24:54 <fungi> that way the funky nonsensical overrides are all in one spot and become more obvious/don't get forgotten
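The trick corvus describes relies on Zuul always running a job whose own configuration is touched by a change, regardless of file matchers. A sketch of the project-pipeline variant (job, pipeline, and regex names are illustrative, not the actual system-config layout):

```yaml
# zuul.d/project.yaml in system-config (names illustrative)
- project:
    promote:
      jobs:
        - system-config-promote-mirror-images:
            # This files matcher never matches any real path, so the
            # only thing that triggers the job in this pipeline is a
            # change that modifies the job's own configuration --
            # e.g. adding a new image to mirror.
            files:
              - ^this-file-never-exists$
```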
19:25:05 <clarkb> fungi: is that someting you might be interested in pushing up?
19:25:26 <fungi> i want to say yes, but i'm not sure i'd get to it before next week with other stuff currently on my plate
19:25:35 <clarkb> ack, its not urgent. Lets see where we end up
19:25:36 <fungi> so happy to if nobody else beats me to it
19:26:29 <clarkb> #topic Gerrit 3.10.4
19:26:36 <clarkb> Gerrit made new bugfix releases
19:26:50 <clarkb> which seems like a good time to also bundle in the h2 compaction timeout increase change
19:26:57 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/939167 New bugfix releases available
19:27:03 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/938000
19:27:30 <clarkb> fungi: ^ maybe if you get a quiet time this week during travel we can sit down and land those and do a couple of restarts
19:27:41 <fungi> sure!
19:28:01 <clarkb> I don't think the bugfixes are super urgent. But they do look like good ones (including a NPE fix)
19:28:38 <clarkb> #topic Running certcheck on bridge
19:29:23 <clarkb> the certcheck tool uses a script that is a bit old on cacti and doesn't support SNI. If we move the check to running on bridge we get two things: the latest version and SNI support and we'd better match how we test this in our CI jobs (since we always have a bridge it runs there I guess)
19:29:34 <fungi> i guess in retrospect, the three of us discussing this yesterday are the only ones in today's meeting, so deferring the decision was sorta pointless ;)
19:29:44 <clarkb> fungi wanted to bring this up to check if there were any concerns with adding extra tools like this to the bastion host
19:30:14 <clarkb> I think the concern is valid but the risk in checking a tls connection once per day to a set of services that we (mostly) run seems low
19:30:24 <clarkb> I think I would be ok with moving this to bridge
19:30:28 <fungi> seems like we established yesterday that none of the three of us had concerns, so i guess let's plan to move it
19:30:35 <corvus> ++
19:30:45 <clarkb> sounds good!
19:31:03 <clarkb> #topic Service Coordinator Election
19:31:14 <fungi> i have a wip change to install it from a distro package instead of from git, i'll try to find time to amend that soon to move it off cacti and onto bridge
19:31:48 <clarkb> I haven't seen or heard any objections to the proposal I made last week for election plans
19:32:07 <clarkb> proposal: Nominations Open From February 4, 2025 to February 18, 2025. Voting February 19, 2025 to February 26, 2025. All times are UTC
19:32:22 <clarkb> Considering that I'll plan to make this official via email to the service-discuss list shortly
19:32:38 <clarkb> you have until then to lodge your objections :)
19:32:39 <fungi> thanks!
19:32:48 <clarkb> #topic Beginning of the Year (Virtual) Meetup
19:32:54 <clarkb> #link https://etherpad.opendev.org/p/opendev-january-2025-meetup
19:33:16 <clarkb> I have started gathering discussion topics on this etherpad which will also serve as the meetpad location for this event
19:34:07 <clarkb> the main thing to decide is timing. Last week I suggested the January 21-23 timeframe which is next week. No objections to that and frickler indicated that we could proceed with planning and not worry too much about their ability to attend
19:34:41 <fungi> yeah, i'll be getting home the afternoon before that, so wfm
19:34:43 <clarkb> I'm thinking spreading it out over multiple days is a good idea so that we can do shorter gatherings rather than one long one. The 21st is also the day of our normal meeting
19:35:56 <clarkb> maybe on the 21st we do something like 2100-2300 UTC. Then 22 and 23 we do 1800-2000 if frickler can attend and 2100-2300?
19:36:11 <clarkb> then if we decide we don't need all of that time we can skip the 23rd too or something
19:36:15 <fungi> i've got a couple of unrelated conference calls on tuesday and wednesday, but will be available most of the time
19:36:30 <clarkb> (basically I'm trying to be flexible and accommodate timezones and the massive block of meetings on Tuesday everyone seems to have)
19:37:02 <clarkb> corvus: fungi since you're here are those blocks of time particularly problematic?
19:37:47 <corvus> 1 sec
19:38:12 <fungi> before 19z on monday and 15-16z on tuesday are my other calls, so your proposed times look fine for me
19:38:49 <corvus> i don't currently have any blockers there
19:39:06 <fungi> er, i mean before 19z on tuesday and 15-16z on wednesday are my other calls
19:39:17 <fungi> anyway, wfm
19:40:04 <clarkb> excellent I'll get that added to the etherpad and we can be flexible if necessary (like dropping the time on the 23rd if we don't need it)
19:40:35 <clarkb> #topic Open Discussion
19:40:45 <clarkb> I failed to put the Gitea 1.23.1 upgrade on the agenda
19:40:49 <clarkb> but there is also a proposed Gitea upgrade
19:41:02 <clarkb> I guess we could try landing that and see if we hit the mariadb problems and switch over images there too?
19:41:09 <clarkb> not sure if anyone else wants to inspect the held node
19:43:24 <clarkb> I guess see where we get with paste02 and if that makes progress look to gitea next
19:43:36 <clarkb> Anything else that wasn't in the agenda that we should cover before calling it a meeting?
19:44:10 <fungi> there's a bunch of changes for bindep that could use another pair of eyes (many of which are mine)
19:44:36 <fungi> testing fixes, support updates, packaging modernization
19:44:46 <clarkb> fungi: did we ever end up being able to explain why the old setup stopped working?
19:45:00 <fungi> which old setup?
19:45:03 <clarkb> Thats probably not critical and most likely due to changes in setuptools since the original change was first proposed
19:45:21 <clarkb> fungi: the original change I pushed to have bindep use pyproject.toml worked but then you had to rework it to get it working today
19:45:32 <clarkb> but ya I'll put reviewing that on the todo list
19:45:49 <fungi> i think you're talking about an earlier iteration of the pyproject.toml changes, i incorporated your feedback around that
19:46:19 <fungi> i think it's split up fairly logically now
19:47:15 <clarkb> right. I'm just curious if we know why that became necessary
19:47:26 <clarkb> it did work when originally proposed but then that same pyproject.toml stopped working
19:47:44 <fungi> looking
19:48:20 <fungi> we're talking about 816741 i guess
19:48:50 <clarkb> yes looks like it
19:49:01 <clarkb> ps9 passed then it stopped passing and you had to add the py_module stuff
19:49:13 <clarkb> I guess setuptools changed somehow to make that mandatory
19:49:21 <fungi> it's virtually identical to what you had before, but needed a tweak to account for newer setuptools and editable installs, looks like?
19:49:52 <fungi> and changes to how setuptools did module searching
19:50:10 <clarkb> ya the nox stuff is a side effect of setuptools and pip not supporting editable wheels on python3.6
19:50:13 <fungi> but yeah, i think it's all down to setuptools differences
19:50:33 <clarkb> https://review.opendev.org/c/opendev/bindep/+/816741/21/setup.cfg its this difference that I'm looking at I guess and ya setuptools changed seems to be the answer
19:50:59 <fungi> right
19:51:23 <fungi> one of those worked on 3.6 but then setuptools changed the name of the option
19:51:38 <clarkb> I think we can land everything up to https://review.opendev.org/c/opendev/bindep/+/938522 then maybe make a release? then land the pyproject.toml update?
19:52:27 <fungi> yeah, also this all started because we wanted a candidate to exercise newer pbr when that gets tagged
19:52:34 <clarkb> then we can decide if we do a release with pyproject.toml before dropping python3.6
19:53:20 <clarkb> and I'll try to review the later half of that stack soon so that I can have an informed opinion on ^
19:53:25 <fungi> right, the later changes in that series are more like "we can do this, should we?"
19:53:56 <fungi> in order to settle on a good example for how we might want to update the packaging for our other tools
19:54:52 <fungi> and also the drop 3.6 change is as much to confirm all the todo comments are correct about when we'll be able to clean bits up
19:55:12 <clarkb> ++
19:56:32 <clarkb> anything else? last call
19:57:12 <clarkb> sounds like that is it. Thanks everyone!
19:57:16 <clarkb> We'll be back next week
19:57:21 <clarkb> #endmeeting