19:00:17 <fungi> #startmeeting infra
19:00:17 <opendevmeet> Meeting started Tue Apr  2 19:00:17 2024 UTC and is due to finish in 60 minutes.  The chair is fungi. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:17 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:17 <opendevmeet> The meeting name has been set to 'infra'
19:00:52 <fungi> i didn't send out an agenda to the ml yesterday, but will be following the one in the wiki
19:01:09 <fungi> #link https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting#Agenda_for_next_meeting
19:01:24 <fungi> #topic Announcements
19:02:01 <fungi> #info OpenStack has a major release occurring tomorrow, please be slushy with configuration change approvals until it's done
19:02:14 <tonyb> lol
19:02:46 <fungi> #info The PTG is occurring next week, and using our Meetpad platform by default, so please be mindful of changes that might impact Jitsi-Meet or Etherpad servers
19:03:20 <fungi> #info Join our PTG sessions next week with your questions or whatever you need help with, see the schedule for time and room link
19:03:45 <fungi> #link https://ptg.opendev.org/ptg.html PTG schedule
19:03:49 <tonyb> Noted
19:04:18 <fungi> #info https://etherpad.opendev.org/p/apr2024-ptg-opendev PTG discussion topics
19:04:24 <fungi> #undo
19:04:24 <opendevmeet> Removing item from minutes: #info https://etherpad.opendev.org/p/apr2024-ptg-opendev PTG discussion topics
19:04:28 <fungi> #link https://etherpad.opendev.org/p/apr2024-ptg-opendev PTG discussion topics
19:04:47 <fungi> #topic Upgrading Bionic servers to Focal/Jammy (clarkb 20230627)
19:05:06 <fungi> #link https://etherpad.opendev.org/p/opendev-bionic-server-upgrades worklist for upgrades
19:05:19 <fungi> #link https://review.opendev.org/q/topic:jitsi_meet-jammy-update outstanding changes for upgrades
19:05:42 <fungi> i'm guessing there's been no new progress here in the past week
19:05:48 <clarkb> not that I've seen
19:05:49 <tonyb> Correct.
19:05:55 <fungi> okay, cool. moving on!
19:06:05 <fungi> #topic MariaDB Upgrades (clarkb 20240220)
19:06:19 <fungi> #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
19:06:34 <fungi> as previously discussed, that's still waiting for the ptg to end
19:06:51 <fungi> i also said i'd work on the mailman server and then promptly did zilch
19:07:18 <fungi> #action fungi Propose a change to upgrade the MariaDB container for our Mailman deployment
19:07:37 <fungi> we also talked about holding gerrit/gitea upgrades until we have more time after ptg week
19:07:43 <fungi> any other updates since last meeting?
19:07:49 <clarkb> not from me
19:07:58 <clarkb> this is definitely in the slushy category of change
19:08:07 <fungi> agreed
19:08:18 <fungi> #info Being slushy, work will resume post-PTG
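For context, these MariaDB upgrades amount to bumping the pinned image tag in each service's docker-compose file. A minimal sketch for the Mailman deployment, assuming it pins its MariaDB service the same way the Etherpad one does (the service name, current tag, and environment values here are illustrative, not the actual system-config contents):

    # Hypothetical docker-compose excerpt; only the image tag changes.
    services:
      mariadb:
        image: mariadb:10.11   # bumped from an older 10.x tag
        restart: always
        environment:
          MYSQL_DATABASE: mailman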
19:08:30 <fungi> #topic AFS Mirror cleanups (clarkb 20240220)
19:08:47 <fungi> also slushy
19:08:58 <fungi> #info Being slushy, work will resume post-PTG
19:09:34 <fungi> i suppose we could do a webserver log analysis as a group activity during the ptg, if we get bored?
19:09:43 <fungi> i'll stick it on the pad
19:10:08 <clarkb> I also suggested that volunteers could add noble at this point
19:10:09 <corvus1> fungi: any particular goal for the log analysis?
19:10:21 <clarkb> which was one of the goals of the cleanup: to make room for new distros like noble
19:10:40 <clarkb> corvus1: it came up at our pre-PTG; the idea was to do log analysis to identify other afs content that can be removed because it is largely unused
19:10:42 <fungi> corvus1: to see if there's stuff we're mirroring/proxying that nobody actually uses in jobs
19:10:49 <clarkb> like say possibly the wheel content
19:10:54 <corvus1> ah that one, got it, thx
19:11:21 <fungi> anyway, i stuck it on the pad just now in case we run out of other activities
19:11:38 <fungi> #topic Rebuilding Gerrit Images (clarkb 20240312)
19:11:48 <fungi> this is also on slush-wait?
19:11:59 <clarkb> Last week I said I'd try to do this after the openstack release so Thursday or Friday this week
19:12:08 <clarkb> I think that is still possible though Monday may end up being more likely at this point
19:12:33 <fungi> #info Holding until the OpenStack release is out, may resume early next week
19:12:46 <fungi> #topic Review02 had an oops last night (clarkb 20240326)
19:13:00 <fungi> i haven't seen any rca/post-mortem on this
19:13:14 <clarkb> After our meeting last week I pinged mnaser and guillermesp asking if they had more info and didn't hear back
19:13:17 <fungi> do we know anything new since last week? do we want to leave it on the agenda?
19:13:37 <clarkb> I think we should leave it on for now as a reminder to see if we can track that down. Avoiding gerrit shutdowns would be a good thing to do if we can
19:14:11 <fungi> #info We'll see if we can get more details from the provider on the root cause
19:14:22 <fungi> #topic Rackspace MFA Requirement (clarkb 20240312)
19:14:37 <fungi> i didn't see any problems after we did it, nor after the deadline passed
19:14:52 <fungi> in particular, job log uploads seem to have continued uninterrupted
19:14:54 <clarkb> ya assuming rax made the changes (and I have no reason to think they didn't) I think we're in the clear for now
19:15:04 <clarkb> we can probably drop this from the agenda at this point
19:15:33 <fungi> i agree. we can always discuss it again if we see any issues we think might be related
19:15:44 <fungi> #topic Project Renames (clarkb 20240227)
19:16:02 <fungi> #link https://review.opendev.org/c/opendev/system-config/+/911622 Move gerrit replication queue aside during project renames
19:16:21 <clarkb> mostly just looking for extra reviews on that change
19:16:38 <fungi> that's been open for a while with no additional reviews, yeah, but i guess there's no hurry as long as we approve it before the maintenance
19:16:39 <clarkb> when we get closer I'll compile the historical record changes that act as an input to the rename playbook
19:16:43 <clarkb> exactly
19:16:55 <fungi> #info Penciled in April 19, 2024 submit your rename requests now
19:17:19 <fungi> #topic Nodepool image delete after upload (clarkb 20240319)
19:17:31 <fungi> i haven't seen the change to configure this yet
19:17:40 <clarkb> I pushed one and corvus1 landed it last week
19:17:46 <fungi> oh!
19:17:59 <fungi> i was living under a rock for the past week, sorry about that
19:18:00 <clarkb> this actually exposed a bug in the implementation so we caught a real problem and corvus fixed that. It should now be in effect
19:18:15 <fungi> very cool. have we observed a reduction in filesystem utilization on the builders?
19:18:31 <corvus1> i double checked the results and it works as expected now
19:18:37 <fungi> yay!
19:18:49 <corvus1> (so no raw images kept on nb01/02)
19:19:24 <clarkb> does look like nb01 could use some cleanup unrelated to that change though (I see some orphaned intermediate vhd files for example)
19:19:41 <corvus1> yeah a bunch of old zero-byte orphans
19:19:44 <fungi> yeah, i just pulled up the cacti graph for /opt and saw the same
19:20:03 <fungi> looks like it will fill up in a few hours if we don't intervene
19:20:41 <fungi> though i do see a drop in utilization back on thursday. i guess that was when the change went into effect?
19:20:57 <clarkb> yes that sounds right
19:21:40 <fungi> ouch, nb02 looks like its /opt filled up in the past day as well
19:21:52 <clarkb> there must be some issue going on
19:22:18 <fungi> yeah, the climb on both looks linear, and seems to start the same time the image cleanup change took effect, so could be related i suppose
19:22:38 <corvus1> hrm, i'll look into that this afternoon
19:22:44 <fungi> thanks corvus1!
19:22:59 <fungi> i can try to find a few minutes to help with cleanup activities if needed
19:23:20 <clarkb> nodepool_dib is under 300GB on nb02. Maybe something to do with dib filling up dib_tmp instead
19:23:40 <fungi> #info This went into effect, but it seems we've sprung a new leak, so need to dig deeper
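For reference, the feature discussed in this topic is a per-diskimage setting in the Nodepool builder configuration. A minimal sketch, assuming the option names delete-after-upload and keep-formats from recent Nodepool releases (the image name and formats are illustrative):

    # Hypothetical nodepool builder config excerpt
    diskimages:
      - name: ubuntu-jammy
        formats:
          - qcow2
          - vhd
        delete-after-upload: true   # drop local copies once all uploads finish
        keep-formats:
          - qcow2                   # optionally retain one format locally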
19:24:48 <fungi> anything else we want to cover on this topic? we can probably do troubleshooting discussion in #opendev unless we want it recorded as part of the meeting log
19:24:59 <clarkb> ya I think we can troubleshoot later
19:25:07 <clarkb> even if the disks fill it isn't urgent for a little while
19:25:16 <fungi> agreed
19:25:27 <fungi> #topic Should the proposal-bot Gerrit account be usable by non-OpenStack jobs (fungi 20240402)
19:25:31 <fungi> #link https://review.opendev.org/912904 "Implement periodic job to update Packstack constraints"
19:25:41 <fungi> we discussed this a little in #opendev earlier today as well
19:26:23 <fungi> for a little historical background, we used a single proposal-bot account back when everything was openstack
19:26:41 <clarkb> At a high level I think it would be ok to have a shared proposal bot since people should be reviewing changes pushed by the bot either way (however they may just trust the bot, in which case :( ). My bigger concern is that a global shared account implies managing the secret for that account centrally with the jobs that use the secret. For this reason I think it is better to
19:26:43 <clarkb> ask projects to have their own bot accounts if they need them.
19:27:07 <fungi> now we have some non-openstack usage (opendev projects.yaml normalization changes) and the packstack project is asking to use it for proposing their own manifest updates as well
19:28:10 <fungi> it does seem like in the packstack case there's only one repo it's going to propose changes to, so they could define the secret for their dedicated bot account there along with the job that uses it, from what i'm seeing
19:28:47 <clarkb> there might be some way to construct jobs that makes my concern moot as well but I haven't thought through that well enough yet
19:29:02 <fungi> if, for example, starlingx wanted their own translation update jobs that work similarly to openstack's, i wonder how we'd recommend approaching that, since it would need to be in a trusted config repo to get shared, right?
19:29:08 <clarkb> like maybe we push via post-run tasks from the executor
19:30:04 <clarkb> fungi: ya I think that's another case where the ideal would be starlingx having their own central config repo and managing those credentials independent of however openstack is doing it
19:30:13 <clarkb> but that implies tenant moves etc so may not be straightforward
19:30:42 <fungi> yeah, obviously for projects that use a dedicated zuul tenant it's easy enough to do
19:31:50 <fungi> for older non-openstack projects sharing the openstack tenant (including our own opendev use), there are possible compromises to avoid moving to a new tenant
19:32:18 <tonyb> I'm not sure I understand the concern with sharing the one bot?
19:32:25 <tonyb> the secret is centrally managed and IIUC can't be exposed
19:32:44 <clarkb> tonyb: the problem is I don't want to be responsible for reviewing changes for how packstack wants to automatically update deps
19:32:44 <fungi> it's not as much a security concern as a "we're on the hook to review their job configuration" concern
19:33:22 <clarkb> that is a packstack concern in my opinion and we should try as much as possible to use the tools we have to keep that off of our plates
19:33:30 <fungi> we have similar challenges with some openstack projects that haven't moved their job definitions out of our config repo too
19:33:34 <tonyb> Ahh okay.
19:33:38 <clarkb> and in an ideal world we wouldn't care about openstack and starlingx either but for historical reasons we're not there yet
19:34:15 <clarkb> just to be clear this isn't specific to packstack, it's more about trying to leverage tooling to avoid us becoming bottlenecks
19:34:35 <fungi> i suppose our recommendations are 1. use a separate zuul tenant, 2. if #1 isn't feasible and you're only in need of proposals to a single project then put the secret and job definition in that project, 3. if #1 isn't feasible and you need to propose changes to multiple repos then we'll help you work something out
19:34:37 <tonyb> Can we make the job that generates the update live in one of their repos and depend on it in the pipeline, so it's just the actual proposal that we'd be on the hook for?
19:35:08 <tonyb> (also sorry my IRC client seems to keep disconnecting)
19:36:01 <clarkb> tonyb: ya it might be possible to define a standard interface for pushing stuff to gerrit in zuul jobs in such a way that we share the account but not the figure-out-what-to-update-in-git steps
19:36:19 <clarkb> that is what I was referring to above but I haven't given it enough thought to be confident in that approach one way or another
19:36:23 <fungi> like return the generated patch as an artifact and then have a standardized proposal job push it to gerrit
19:36:42 <tonyb> Yeah something like that
19:36:56 <tonyb> I was trying to switch requirements to that model
19:37:07 <clarkb> yup I think it may be possible to do that
19:37:27 <corvus1> the afs publishing jobs may have some tricks relevant here (embedding restrictions in secrets, etc)
19:39:05 <tonyb> Is there any time pressure?
19:39:26 <clarkb> not from my side. packstack may want to get this done sooner than later though
19:39:54 <fungi> i suppose we could tell them that the roll-your-own solution is available to them today, or they're welcome to wait
19:40:05 <tonyb> Okay.
19:40:10 <clarkb> or help with the central job idea path if they want to explore that
19:40:17 <fungi> yes, that too definitely
19:41:17 <fungi> also i was thinking about how a shared account might work across tenants; we obviously can't use the same key because a tenant owner could extract it with minimal effort, but we could assign per-tenant keys and add them all to the one account in gerrit if we want
19:41:43 <clarkb> ++
19:42:00 <fungi> though i guess the key could be used to authenticate to the ssh api and make account changes, so maybe one account per tenant is still preferable
19:42:02 <clarkb> This has me thinking that gerrit having an anonymous coward code submission process would be neat
19:42:08 <clarkb> but probably full of foot guns
19:42:46 <tonyb> Yup totally agree
19:43:10 <fungi> if gerrit allowed us to scope a key to specific permissions, it would be safer
19:43:56 <fungi> also, if we wanted to switch to https pushes, you're restricted to one api key per account so it wouldn't work there at all
19:45:47 <fungi> #agreed Let the PackStack maintainers know that they can implement this inside their x/packstack repository with a dedicated Gerrit account, but they're welcome to work with us on a refactor to better support shared account pushes
19:46:31 <fungi> #info We're looking into a split job model where only the push to Gerrit tasks are defined in the config repo
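A rough sketch of what that split job model could look like in Zuul job configuration; all job, playbook, and secret names here are hypothetical, with only the proposal job and its secret living in the trusted config repo:

    # Hypothetical: defined in the project's own repo; builds the patch and
    # returns it as an artifact (e.g. via zuul_return in the playbook).
    - job:
        name: packstack-generate-constraints-patch
        run: playbooks/generate-patch.yaml

    # Hypothetical: defined centrally in the trusted config repo; only takes
    # the artifact produced above and pushes it to Gerrit with the bot account.
    - job:
        name: opendev-propose-patch
        run: playbooks/propose-patch.yaml
        secrets:
          - name: proposal_bot_ssh_key
            secret: proposal-bot-ssh-key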
19:46:43 <fungi> do those two items capture everyone's takeaways from this discussion?
19:46:53 <clarkb> lgtm
19:46:59 <tonyb> and me
19:47:18 <fungi> anything else on this topic?
19:47:32 <clarkb> not from me
19:48:13 <fungi> #topic Open discussion
19:48:27 <fungi> did anyone have something to discuss that wasn't covered in the agenda so far?
19:48:54 <tonyb> Not from me
19:49:04 <clarkb> I was just notified that multiple people have tested positive for covid after our family easter lunch saturday. I really hope I don't end up sick again but warning that I may be useless again in the near future
19:49:40 <fungi> that sounds even worse than ill-prepared egg salad
19:50:12 <clarkb> I definitely did not enjoy my time with covid last summer. Would not recommend. If it's anything like that again I probably would risk bad eggs :)
19:50:24 <clarkb> er, if I had the choice of replacing one with the other, you know what I mean
19:50:33 <ianychoi> Hi, I just wanted to share this for visibility - translate.zanata.org sunsets in Sep 2024
19:50:35 <ianychoi> https://lists.osci.io/hyperkitty/list/zanata-sunset@lists.osci.io/thread/6F2D6JRPFF6RRKYURB2WMCXSJ6C4AFBS/
19:50:35 <tonyb> Ergh.  Good luck
19:50:46 <tonyb> LOL
19:51:15 <fungi> i guess the good news is we don't use translate.zanata.org and zanata itself can't get any less-maintained than it already has been for years now
19:51:28 <clarkb> it's a race now to see who can shut down faster :)
19:51:31 <clarkb> I want to win this race
19:52:16 <tonyb> I think we all do
19:52:23 <ianychoi> Oh nice analogy as race :p
19:52:37 <fungi> yes, it would be nice if ours isn't the last zanata standing
19:53:38 <ianychoi> I will try to put my effort to win the race - thank u for all the help
19:53:48 <clarkb> and thank you for being on top of it
19:54:19 <ianychoi> Thank u too!
19:57:06 <fungi> seems like that's about it. i thank you all for your attention, and return the unused 3 minutes
19:57:09 <fungi> #endmeeting