19:00:17 <clarkb> #startmeeting infra
19:00:17 <opendevmeet> Meeting started Tue Mar 12 19:00:17 2024 UTC and is due to finish in 60 minutes.  The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:17 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:17 <opendevmeet> The meeting name has been set to 'infra'
19:00:23 <clarkb> #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/IIJAA3YB34I5JUJLM7SXXRGGQYL2JXGI/ Our Agenda
19:00:30 <clarkb> #topic Announcements
19:01:17 <clarkb> north america did its DST switch. I think europe and australia have changes coming up in the near future too. Just keep that in mind as you manage your calendars :)
19:01:35 * fungi shakes his fist at the hour the universe has borrowed from him
19:01:38 * corvus puts food in face
19:02:16 <clarkb> Also I'm going to be fishing thursday and I think my back is demanding I finally go find a new office chair sooner than later so that will probably happen tomorrow. Tl;dr I'll be in and out this week
19:02:46 <fungi> i'll be starting a little later than usual on thursday (routine medical checkup scheduled)
19:03:02 <fungi> but expect to be around the rest of the day
19:03:11 <clarkb> The OpenStack TC election is happening right now. Go vote if you can
19:04:22 <frickler> also openstack rc1 releases are due this week
19:05:23 <clarkb> #topic Server Upgrades
19:05:55 <clarkb> Haven't seen any movement on this. We'll discuss more later but the rackspace clouds.yaml updates should be in place now. Please say something if you run into trouble booting new servers related to that
19:06:07 <clarkb> the dns updates may not work, which for 99% of things is probably fine
19:06:28 <clarkb> it's not too bad to add reverse records by hand when we need them and forward records are mostly in opendev.org now
19:06:57 <fungi> i didn't see any issue with dns when i tested the launch-node script with an api key only cloud definition
19:07:21 <fungi> easy enough to rerun that test and check for it explicitly though
19:08:03 <clarkb> fungi: dns uses its own auth credentials which I think are the old ones
19:08:10 <clarkb> so they will stop working once MFA is enabled
19:08:13 <fungi> oh, not clouds.yaml then
19:08:17 <clarkb> correct
19:08:33 <clarkb> that's what the files we source are for when we do the dns stuff
19:08:40 <clarkb> one sources creds, the other the virtualenv for the tool
19:09:36 <clarkb> anyway we'll talk more about rax mfa in a bit
19:09:43 <clarkb> #topic MariaDB Upgrades
19:10:02 <clarkb> I've got two changes up to do some more upgrades. One will do refstack and the other etherpad. They both need reviews. I'm happy to approve and babysit though
19:10:08 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.11
19:10:13 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
19:10:37 <clarkb> based on paste's upgrade these should be straightforward but always worth monitoring database changes
19:11:26 <clarkb> let me know if you have any questions or concerns
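A minimal post-upgrade sanity check along the lines of the monitoring mentioned above, sketched with the pymysql library; the host/user/password values are placeholders, not the real service settings:

```python
# Confirm the upgraded container reports the expected MariaDB server version.
import pymysql

# host/user/password are placeholders; substitute the service's real settings.
conn = pymysql.connect(host="127.0.0.1", user="root", password="CHANGEME")
try:
    with conn.cursor() as cur:
        cur.execute("SELECT VERSION()")
        (version,) = cur.fetchone()
        print("server version:", version)  # expect a 10.11.x string after the upgrade
finally:
    conn.close()
```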
19:11:35 <clarkb> #topic AFS Mirror cleanups
19:11:59 <clarkb> We're in the "idle" period between active mirror/nodepool cleanup where we try and reduce the impact across our repos
19:12:34 <clarkb> On the infrastructure side of things I think we are now ready to remove the centos-7 base-jobs nodeset and the images from nodepool but I announced that would happen on friday
19:12:47 <clarkb> I don't have changes up for that yet but will get them up before Friday in order to make it easy to land them
19:13:02 <clarkb> in the meantime topic:drop-centos-7 is worth keeping an eye on in case there are further cleanups
19:13:12 <frickler> ubuntu-ports mirror was broken likely as a result of running into quota limits
19:13:34 <frickler> should be fixed now but implies we'd better try to avoid this repeating if possible
19:13:47 <clarkb> ++ I'm somewhat surprised that reprepro can't resolve that on its own though
19:14:01 <frickler> maybe I just didn't do the right things
19:14:22 <clarkb> and if you didn't I don't blame you. reprepro docs are a bit incomprehensible :)
19:15:10 <clarkb> I also started pushing some changes for xenial removal under topic:drop-ubuntu-xenial
19:15:26 <clarkb> I don't think we are in a hurry there as there will be plenty to untangle for xenial. But reviews are always welcome
19:16:05 <clarkb> slow but steady progress including in the projects dropping old configs. That is nice to see
19:17:05 <clarkb> #topic Rebuilding Gerrit Container Images
19:17:05 <frickler> the centos7 removal from devstack is still blocked
19:17:12 <clarkb> #undo
19:17:12 <opendevmeet> Removing item from minutes: #topic Rebuilding Gerrit Container Images
19:17:26 <clarkb> frickler: that is due to jobs in other projects like keystone?
19:17:47 <frickler> yes, see https://review.opendev.org/c/openstack/devstack/+/910986
19:18:30 <clarkb> thanks. I don't see a change or changes yet to remove it from keystone
19:18:36 <clarkb> fungi: was that something you were planning on pushing?
19:18:50 <fungi> yeah, i need to pick that back up now
19:19:01 <frickler> keystone wasn't mentioned in the list of affected projects so far, either
19:19:44 <frickler> but since this is also on master, it might be a good idea to postpone the removal until after the openstack release if this can't get fixed in time
19:19:53 <frickler> says /me with release hat on
19:20:22 <clarkb> frickler: wouldn't it be just as easy to disable that job/remove that job as landing any fixes to keystone that might be necessary for the release?
19:20:26 <clarkb> I guess I don't see this as a hard blocker
19:20:40 <clarkb> because either way you're talking about $something that needs updating in keystone and if we can do that we can drop the job
19:21:03 <clarkb> but we can check in on friday and make a call if it seems particularly painful
19:21:06 <frickler> we don't know how many other projects might also be affected that we don't know yet about
19:21:23 <frickler> and we will only discover them by iterating over merging fixes
19:21:27 <fungi> we can add removal changes for stable/2024.1 before or after the release
19:21:51 <clarkb> ya I think my main push back is that the only way that should affect the release is if you need to update the job config for one of those projects anyway
19:22:05 <clarkb> in which case having a job that needs to be deleted is equivalent to whatever else is blocking you
19:22:32 <fungi> well, or if zuul configuration is broken and the release jobs for the project's tag don't run
19:22:54 <fungi> though i think those jobs are all defined in other repos
19:23:01 <clarkb> I thought zuul would continue to run the jobs in that case, yes
19:23:08 <clarkb> I don't know that for certain though
19:23:10 <fungi> project-config and openstack-zuul-jobs mainly
19:23:42 <clarkb> part of my concern is that we can't avoid merging these until every project is fixed because we know not all will be fixed ahead of time
19:23:58 <frickler> yes, that should be fine, I was more thinking about possible last minute fixes being needed
19:23:58 <fungi> i thought zuul wouldn't run jobs for a project if it couldn't load its configuration, but specifically for things defined via in-project configuration which hopefully the release jobs aren't
19:24:36 <clarkb> fungi: and you can have errors in your config that are only a problem when you try to modify the config
19:24:42 <clarkb> otherwise it will use cached configs
19:25:00 <clarkb> (which isn't something to rely on, but config errors aren't usually a hard stop)
19:25:37 <clarkb> anyway we can take stock in a few days and make a call then
19:25:49 <clarkb> #topic Rebuilding Gerrit Container Images
19:26:08 <clarkb> Gerrit finally released a new version of 3.9 to update mina ssh for that mitm thing
19:26:13 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/912470 Update our 3.9 image to 3.9.2
19:26:44 <clarkb> I try to keep our images up to date so that we're testing what we'll actually be upgrading with. However, merging this change will produce new 3.8 images too so we should try and restart gerrit even though it is running the older image
19:27:11 <clarkb> historically we've upgraded gerrit around april/may and november/december
19:27:28 <clarkb> would be great to get this up to date then try and work towards a gerrit upgrade in the next month or two
19:27:43 <frickler> should we try to combine with project renames?
19:27:52 <clarkb> my preference is that we don't
19:28:19 <frickler> ok
19:28:26 <fungi> yes, it can result in a bigger mess or additional delays if something needs to be rolled back
19:28:27 <clarkb> I think project renames are hacky enough that combining it with an upgrade is more risk than necessary. Also both should be relatively quick so we won't have massive downtimes
19:29:14 <fungi> i agree, two brief outages a week or two apart is preferable to one slightly longer outage which has an increased risk of something going wrong
19:29:28 <clarkb> #topic Project Renames
19:29:55 <clarkb> That's a good jump to another one of today's topics. A reminder we've pencilled in April 19. If you know of people who may want to rename projects remind them to get that info pushed up
19:30:08 <clarkb> #link https://review.opendev.org/c/opendev/system-config/+/911622 Move gerrit replication queue aside during project renames.
19:30:26 <clarkb> I also wrote this change to add the workaround for thousands of errors on startup, which we've been using when manually restarting gerrit, to the playbook that automates restarts
19:31:39 <clarkb> Don't think there is much else to say and I expect that is the primary prep we need before we get there (we actually test renames in our system-config-run-review jobs so should be good)
19:31:51 <clarkb> #topic Rackspace MFA Requirement
19:32:11 <clarkb> As noted earlier all of our clouds.yaml files should be updated now to use the api key auth method and the rackspaceauth plugin
19:32:23 <clarkb> this should make launch node, openstackclient, and nodepool happy
19:32:39 <clarkb> one major exception is the dns client and dns updates for dns hosted by rax
19:32:57 <clarkb> and then there is some question about swift hosted job logs but we're like 95% certain that is dedicated swift accounts which shouldn't be affected
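A minimal smoke-test sketch for the updated api key auth, assuming openstacksdk and the rackspaceauth plugin are installed; the cloud entry name used here is a placeholder, not the real clouds.yaml entry:

```python
# Smoke-test that the api-key based clouds.yaml entry still authenticates
# and can reach the compute API.
import openstack

# "openstackci-rax" is a placeholder cloud name; use the real clouds.yaml entry.
conn = openstack.connect(cloud="openstackci-rax")
for server in conn.compute.servers():
    print(server.name, server.status)
```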
19:33:37 <clarkb> The enforced deadline for the change is March 26
19:33:54 <clarkb> which means we can either wait until then or since we think we're prepared opt into MFA now and see what breaks
19:33:58 <fungi> see also https://review.opendev.org/912632 per our earlier discussion, dns should be fine but that change ought to help avoid future divergence
19:34:43 <clarkb> oh interesting it was already using the api key
19:34:46 <fungi> basically, the dns api module was already set up using api keys, so it transitioned to the new approach long ago
19:35:17 <clarkb> there are three accounts that we will need to manage MFA for. I think we'll just do totp like we've done for other accounts
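A minimal sketch of what doing totp for those accounts could look like using the pyotp library; the base32 secret shown is a placeholder, not a real credential:

```python
# Generate the current TOTP code from an account's shared secret.
import pyotp

secret = "JBSWY3DPEHPK3PXP"  # placeholder base32 secret from the provider's MFA setup
totp = pyotp.TOTP(secret)
print("current code:", totp.now())  # 6-digit code, rotates every 30 seconds
```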
19:36:01 <clarkb> fungi: actually I don't think that change (912632) is safe ? dns is managed via the third account
19:36:07 <clarkb> which is different than the control plane account
19:36:16 <fungi> dns is, rdns is not
19:36:27 <clarkb> oh this is rdns specific got it
19:36:33 <fungi> i did not touch dns just rdns
19:36:36 <clarkb> got it
19:36:53 <fungi> as mentioned in the commit message
19:37:00 <frickler> maybe try to switch to mfa early next week? then we'd have some days left in case it doesn't work as expected
19:37:03 <clarkb> I suspect that updating the nodepool account is going to have the biggest impact if something goes wrong
19:37:21 <clarkb> frickler: ya so maybe we start with our control plane account, test launch node stuff again, then if that works do the other two accounts
19:37:45 <clarkb> and next week works for me after the way this week filled up
19:38:04 <clarkb> though I won't stop anyone from getting to it sooner
19:38:10 <fungi> agreed
19:38:35 <clarkb> cool /me scribbles a note to try and find time for that next week
19:38:45 <clarkb> let me know if you want to help I'll happily defer :)
19:38:45 <fungi> a lot of my week is consumed by openstack release<->foundation marketing coordination
19:38:54 <fungi> next week should be better though
19:39:20 <clarkb> sounds like a plan
19:39:23 <clarkb> #topic PTG Planning
19:39:32 <clarkb> #link https://ptg.opendev.org/ptg.html
19:39:37 <clarkb> we are now on the schedule there
19:39:59 <clarkb> and I think we'll use the default etherpad that was "created" for us
19:40:01 <clarkb> #link https://etherpad.opendev.org/p/apr2024-ptg-opendev Feel free to add agenda content here.
19:40:01 <fungi> wednesday/thursday
19:40:22 <clarkb> yup I used the times we talked about in here last week.
19:41:16 <clarkb> I need to start adding some stuff to the agenda but everyone should feel welcome to. My intent is to make this more of an externally facing block of time but we can use it for our own stuff as well
19:41:36 <clarkb> for example ubuntu noble nodepool/dib/mirror stuff would be good to discuss
19:42:31 <clarkb> #topic Open Discussion
19:42:47 <clarkb> fungi: I meant to followup on git-review things but then we got distracted
19:42:58 <clarkb> fungi: have we approved the changes we're already happy with?
19:43:11 <fungi> frickler still wanted to take a look, it sounded like
19:43:32 <fungi> mainly i was looking for feedback on what already merged changes warranted adding release notes
19:43:37 <clarkb> got it
19:43:41 <frickler> oh, I missed that, sorry
19:43:41 <clarkb> https://review.opendev.org/q/project:opendev/git-review+status:open is the list of open changes
19:43:55 <clarkb> my suggestion would be that we go ahead and make a release (with release notes if necessary) for the stuff that has already landed
19:44:05 <clarkb> then we can followup and do a second release for those changes if we want
19:44:06 <fungi> i'll put together an omnubus reno change with the additions requested for the already merged changes since the last tag
19:44:12 <fungi> er, omnibus
19:44:16 <clarkb> but I think trying to do everything will just lead to more delays
19:44:27 <corvus> fungi: sounds ominous
19:44:43 <fungi> omnomnominous
19:45:19 <clarkb> Also I mentioned this in #opendev yesterday but got nerd sniped by eBPF and bcc as profiling tools that may be useful particularly in ci jobs
19:45:36 <fungi> but yeah, the rackspace mfa notification sort of derailed my finishing up the git-review release prep
19:46:13 <clarkb> I think the tools are neat and the way they work is particularly interesting to me because I don't have to care too much about specific test job workloads to profile them reasonably well. You can just do it through the lens of the kernel
19:46:41 <clarkb> that said it's not all perfect and they seem to be somewhat neglected on debuntu compared to rpm distributions
19:47:07 <clarkb> the runqslower command doesn't work on ubuntu for example and the python ustat command crashes
19:47:27 <clarkb> mostly mentioning them because you may find them useful as debugging aids
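For anyone curious what the bcc Python bindings look like, a minimal sketch (assumes bcc and matching kernel headers are installed and it is run as root) that traces execve() calls, a tiny slice of what the execsnoop tool does:

```python
# Trace execve() calls system-wide via a kprobe, using the bcc Python bindings.
from bcc import BPF

prog = """
int trace_exec(void *ctx) {
    bpf_trace_printk("execve called\\n");
    return 0;
}
"""

b = BPF(text=prog)
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_exec")
print("Tracing execve()... hit Ctrl-C to stop")
b.trace_print()  # streams bpf_trace_printk output from the kernel trace pipe
```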
19:48:42 <clarkb> Anything else?
19:51:01 <clarkb> I'll take that as a no. Thank you for your time and help everyone. See you around and we'll be back here same time and place next week.
19:51:06 <clarkb> #endmeeting