19:01:10 <clarkb> #startmeeting infra
19:01:10 <opendevmeet> Meeting started Tue May 30 19:01:10 2023 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:10 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:10 <opendevmeet> The meeting name has been set to 'infra'
19:01:13 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/G2YQVAPBGOGDJKUKZKDUKAWMFMIWIRRD/ Our Agenda
19:01:22 <clarkb> #topic Announcements
19:01:52 <clarkb> I've gone ahead and written down that we should skip the meeting on June 13 as many of us will be in vancouver for the summit
19:02:06 <fungi> works for me
19:02:23 <clarkb> I will also be unable to attend the meeting on June 20th as I'll either be on a plane or in a tsa line or something
19:02:27 <fungi> i expect to be travelling and likely miss the meeting on the 27th as well
19:02:32 <clarkb> but I'm happy for the meeting to happen without me
19:02:44 <ianw> i will also be AFK then!
19:02:48 <fungi> i can likely chair the one on the 20th unless someone else wants to
19:02:54 <corvus> Lol see everyone in September!
19:03:01 <fungi> i like that plan even better
19:03:23 <corvus> Jk
19:03:52 <clarkb> I do plan to have a meeting here next week though before the hiatus
19:05:10 <clarkb> Also good to generally be aware that a summit + forum + ptg is happening that week (June 13-15)
19:05:15 <clarkb> #topic Topics
19:05:22 <clarkb> #topic Migrating to quay.io
19:05:36 <clarkb> I thought about pulling this off of the agenda but decided to keep it for today in order to do a recap
19:06:05 <corvus> Run as root is new info
19:06:15 <fungi> sudo all the things
19:06:31 <clarkb> the tl;dr is that after migrating about half of the issues I discovered that transparent mirroring of images outside of docker.io does not work when using docker
19:06:50 <clarkb> this eventually led me to revert all of the moves I had already done. This is largely done except for base image locations in the zuul/* repos
19:07:12 <clarkb> and ya rootless podman really really wants a systemd session like you're logging in on a desktop
19:07:29 <clarkb> as far as I can tell our test nodes do create a session with systemd when sshing in (we have all that pam setup in place)
19:07:49 <corvus> User mapping is the bigger production issue
19:07:59 <clarkb> but that isn't sufficient to make it happy. This leads to cgroupfs override options. Then on top of that you cannot run podman su'd to another user because you lack even more systemd session stuff in that case
19:08:01 <corvus> That requires root
19:09:40 <clarkb> When I did my test conversions of system-config stuff to podman it was all running as root because that is the simple 1:1 mapping away from docker
19:09:57 <clarkb> I don't think this is a big regression compared to our use of docker but does remove some of the functionality you would hope to get out of podman
19:11:11 <clarkb> The other thing I want to call out is that dib folks are asking for some resolution on speculative image testing with nodepool. https://review.opendev.org/c/zuul/nodepool/+/884632 has been proposed now which does the hack up image names and set them via vars option we considered in my brainstorming document
19:11:28 <clarkb> I personally think rolling forward with podman there is the best way forward so I -1'd it and pointed to the change that does that
19:11:45 <clarkb> but might be good to try and close that out soon one way or another
19:12:01 <corvus> oh yeah i'm going to -2 that
19:12:18 <corvus> we haven't been working on the actual fix for weeks just to give up now that it's actually working
19:12:50 <clarkb> I think some of this confusion occurred due to the holiday creating two disparate groups of people attacking the same problem
19:13:01 <clarkb> but I'm with you I've put a lot of effort into this and would like to see us fix it more properly
19:13:18 <fungi> the fedora mirror change for dib has moments ago been revised to drop the dep on the other nodepool change anyway
19:13:50 <ianw> ++
19:14:29 <clarkb> I think that is about it on quay.io stuff. Basically reviews and progress on the zuul side of things is what remains
19:14:41 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/883311 A role to install podman. Clarkb needs to update this role
19:14:55 <clarkb> oh also this change is on my list of things to update so that we can start pushing on converting existing jammy nodes
19:15:51 <corvus> i'm not sure the thing i raised before was adequately articulated
19:16:40 <corvus> the thing that is new since the last meeting is that due to the way we bind mount in files that are owned by the in-container nodepool user, the only way we can find to make that work right now is to run podman as root so that bind-mount happens with the correct perms, then we can still run the nodepool container as the nodepool user.
19:17:13 <clarkb> corvus: right we execute `podman` as root but then the container workload can still run as a dedicated user
19:17:28 <corvus> so the implication is that we need to be okay with running the podman command as the root user (which is effectively the same as what is happening now with docker) in production, at least unless/until someone figures out a way of subuid mapping to allow that to happen with a host-level nodepool user.
19:17:33 <corvus> ya
19:18:23 <fungi> if it's basically already the case with how we run the docker client, i don't see the concern
19:18:44 <fungi> unless it's just that podman might have otherwise been an opportunity to avoid doing that
19:19:39 <corvus> fungi: yep, from my pov, it's mostly just a sad face
19:19:40 <clarkb> ya I think we had hoped we could run things more betterer
19:19:48 <clarkb> but this isn't any worse
19:20:25 <clarkb> alright anything else on this? We can pick up the zuul work in matrix since its largely zuul specific at this point
19:20:43 <tonyb> sounds good
19:21:16 <clarkb> #topic Bastion Host Change
19:21:22 <clarkb> #link https://review.opendev.org/q/topic:bridge-backups
19:21:29 <clarkb> I think this topic still needs reviews
19:21:52 <clarkb> I do like the functionality and would like to move forward with it but also think it is sensitive enough it should be carefully reviewed (eg not move forward with just my review)
19:22:18 <tonyb> I promise to review it tomorrow
19:22:32 <tonyb> just another set of eyes
19:22:56 <clarkb> thanks!
19:23:06 <fungi> oh, right, i keep meaning to look at that too
19:23:20 <clarkb> #topic Mailman 3
19:23:25 <fungi> no news yet. next i need to initiate some delivery tests so i can check what urls end up embedded in the list-oriented headers
19:23:47 <clarkb> fungi: is the held node the same one as last week?
19:23:47 <fungi> slightly worried they'll go to the default domain instead of the list-specific domains
19:23:54 <fungi> yeah, same held node still
19:24:14 <fungi> at least now that the default domain is completely separate from the list-specific domains, it'll be easier to check for
19:24:23 <clarkb> fungi: re email headers I think that may just work because django and the email bits are separated and it was only django that we had trouble with
19:24:41 <clarkb> I think when we create the list and set the domain that is with the email backend and it should be more happy on that side of the mm3 house
19:24:44 <clarkb> but ++ to testing it
19:24:53 <fungi> yes, hyperkitty and postorius specifically are the concern, so mailman-core should probably be unaffected
19:25:02 <fungi> but i want to make sure
19:26:48 <clarkb> Sounds good
19:26:51 <clarkb> anything else related to mm3?
19:27:21 <fungi> nada
19:27:28 <clarkb> #topic Gerrit Updates
19:28:03 <clarkb> With all the quay.io stuff I haven't had a chance to look at this. I still would really like to but realistically with summit and travel etc it is unlikely. For this reason I'll push up a revert for the bind mount which we can fallback to if necessary
19:28:12 <clarkb> (again I don't think this is urgent more just super annoying)
19:29:07 <fungi> i guess the main new bit of news is that there's actually a gerrit 3.8.0 release now?
19:29:19 <clarkb> yes since my bugs haven't gotten any traction
19:29:28 <clarkb> there is a community meeting on thursday morning which I'll attempt to attend
19:29:41 <clarkb> but they cancelled the last two because no one at google would start the google meeting instance
19:29:48 <clarkb> I'm not getting my hopes up
19:30:12 <fungi> oh, i guess technically 3.8.0 was released a few days before last week's meeting
19:30:29 <fungi> time has been an absolute blur lately
19:30:35 <clarkb> #topic Upgrading Servers
19:31:01 <clarkb> As with gerrit replication task file cleanup this has been on the back burner. Unlike Gerrit replication leaks I'm hoping I might do a server or two between now and the summit
19:31:10 <clarkb> fingers crossed! but other than that I don't have any real updates
19:31:18 <corvus> i upgraded zuul mergers to jammy
19:31:19 <clarkb> corvus: did replace our zuul mergers with jammy nodes.
19:31:22 <clarkb> jinx!
19:31:27 <corvus> :)
19:31:28 <fungi> thanks!
19:32:45 <clarkb> This continues to be the perpetual example of slow and steady progress
19:32:59 <clarkb> never as fast as I would like but never completely stalling out. Hopefully I can continue the trend before the summit
19:33:06 <clarkb> #topic Fedora Cleanup
19:33:06 <fungi> corvus: i guess, judging from the inventory/dns changes, you were able to do the full set of mergers in one shot?
19:33:13 <clarkb> #undo
19:33:13 <opendevmeet> Removing item from minutes: #topic Fedora Cleanup
19:33:28 <fungi> i think i was nodding off that afternoon
19:33:56 <clarkb> fungi: that is my understanding. In part because the executors also run mergers so we didn't need all of the mergers running at all times
19:34:13 <fungi> cool
19:34:20 <fungi> makes sense to me, thanks
19:34:27 <clarkb> #topic Fedora Cleanup
19:34:57 <clarkb> tonyb: I went looking for any changes around the disabling of mirrors for fedora test nodes and didn't find one. But I may have looked poorly.
19:35:12 <clarkb> I think that is the next step here, I'm happy to help if you need direction or reviews etc
19:35:15 <tonyb> I didn't get my patch published but I did a bunch of local testing
19:35:30 <tonyb> I'll push it up today after I land
19:35:35 <clarkb> sounds good thanks
19:35:38 <ianw> i really should have thought about DIB first
19:35:50 <ianw> this + the quay changes have unfortunately caused quite some confusion
19:35:57 <ianw> #link https://review.opendev.org/c/openstack/diskimage-builder/+/883798
19:36:05 <clarkb> ya, but we have changes to fix things on both sides so we should be able to make progress shortly
19:36:12 <ianw> is I think ~ right
19:36:58 <ianw> however we saw one weird failure where we couldn't parse out the .qcow2 path from a curl to the mirror
19:37:44 <ianw> it's undetermined why, but i don't think as is the work-around in there is required per my comment
19:39:02 <clarkb> once the nodepool stuff is running again we can iterate on the dib side more easily too
19:39:06 <clarkb> to figure that curl thing out
19:40:21 <clarkb> I think that is it for fedora
19:40:22 <ianw> ++ ; i mean we could also just drop building fedora from .qcow2's
19:40:34 <ianw> i don't know if anyone actually uses it, other than the test
19:40:40 <clarkb> eh its a feature people like to have since it allows them to modify existing images pretty easily
19:40:56 <clarkb> or at least I'm always told this is why guestfs or whatever it is called is popular
19:41:01 <clarkb> and I have to remind people that dib does that too :)
19:41:37 <clarkb> #topic Storyboard
19:41:53 <clarkb> fungi: I saw some openstack-helm discussion today about this. Anything else to report?
19:42:29 <fungi> nah, i merged that change and am about to deactivate the projects it removes
19:42:39 <fungi> that's basically the extent of it
19:42:58 <clarkb> #topic Open Discussion
19:43:01 <clarkb> #link https://review.opendev.org/c/openstack/project-config/+/884563 Github merge method zuul configuration error fixes
19:43:02 <fungi> and it's just cleanup for some already retired repos, the team isn't moving their active repos off sb for now
19:43:09 <clarkb> ack
19:43:16 <clarkb> frickler: called out this change to fix some configuration errors in zuul
19:43:47 <fungi> frickler also re-lit a fire under the openstack tc to get back to cleaning up their errors
19:43:50 <clarkb> corvus: ^ Is the underlying issue there that the projects in github have chosen a merge method that zuul doesn't support so we have to override? I do think frickler is correct that the harm here is minimal since we aren't gating those projects and are instead doing third party ci
19:43:56 <corvus> i think that may be more complex than anticipated; i left a comment on the change
19:44:05 <clarkb> ah ok I should refresh and read it then
19:44:37 <clarkb> corvus: is the issue that zuul is detecting a mismatch between it and the github project configuration?
19:44:39 <corvus> well, even if the change were correct, i don't think we should merge comments that are incorrect
19:45:18 <corvus> clarkb: yes, zuul is saying that it's configured to use the "merge" merge method with a certain repo, and github says that's not an option
19:45:23 <clarkb> the ideal fix would be to set the merge mode to match the upstream project in that case. Assuming zuul supports that mode.
19:45:43 <clarkb> and then we can drop the comment entirely
19:46:00 <corvus> hrm?
19:46:23 <corvus> i mean, the comment says "this shouldn't be necessary since we're not gating" but it is necessary even if not gating
19:46:47 <corvus> so i don't want to propagate the incorrect idea that this is only important for gating, it's not. it's always important for zuul to merge changes locally the same way they are merged remotely.
19:46:56 <clarkb> right, my point is if we set merge-mode to what github wants then the comment isn't relevant anymore and can be removed
19:47:05 <fungi> because zuul needs to be able to match its merge method in order to faithfully predict what a pr might look like once it merges
19:47:16 <corvus> i mean, we don't need a comment
19:47:40 <corvus> but if we do feel that, then let's say something like "we're setting this to match the upstream method"
19:47:43 <corvus> my objection to the comment is that it says something about zuul's behavior which could mislead people
19:48:01 <corvus> fungi: exactly
19:48:03 <clarkb> got it. I'm trying to clarify what the actual fix is here since you also say that this change won't change any behavior
19:48:18 <corvus> yes, i think that's the more important thing
19:48:23 <corvus> and i don't have an answer to that
19:48:58 <corvus> you can see right now if you look at the error it says that the 'merge' merge-mode isn't supported
19:49:14 <corvus> it's probably just not reporting an error in this case because it's not tripping the "is this a new error?" check
19:49:35 <corvus> but i bet a nickel if you merge that change the error will still be present since the conditions are the same
19:50:10 <corvus> so, what is the correct merge mode? and if it is 'merge', then why does zuul think it's not allowed? are the $10k questions
19:50:22 <corvus> maybe $11k now with inflation
19:50:47 <clarkb> looking at some closed PRs there doesn't seem to be a clear indication of the method being used unless 'foo merged commit 1234567' means merged explicitly
19:50:52 <tonyb> is that USD?
19:51:05 <fungi> canadian
19:51:18 <fungi> i need to use up all my leftover canadian currency
19:51:19 <tonyb> still better than AUD
19:51:33 <clarkb> but ya we can run that down maybe by asking someone at ansible or querying the github api like zuul or something
19:51:50 <corvus> maybe there's a bug where no merge methods show up as permissible to the zuul user or something. just brainstorm.
19:51:50 <clarkb> corvus: I wonder if zuul can list the acceptable merge methods when it logs the unacceptable ones (I don't know if it has that knowledge)
19:51:55 <corvus> yeah i'd start with the latter.
19:52:04 <corvus> clarkb: that would be ideal for debugging this
19:52:18 <fungi> does sound like a useful addition
19:53:29 <clarkb> anything else?
19:55:13 <fungi> nothing here
19:55:22 <fungi> at least not that i can remember after the weekend
19:55:24 <clarkb> sounds like that is everything. Thank you for your time. Reminder we'll be back next week then take at least a one week break
19:55:31 <clarkb> possibly a two week break.
19:55:46 <clarkb> And then this meeting will occur at 6am for me so I'll feel tonyb and ianw's pain
19:55:52 <clarkb> thanks again!
19:55:55 <clarkb> #endmeeting