19:00:17 #startmeeting infra
19:00:17 Meeting started Tue Mar 12 19:00:17 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:17 The meeting name has been set to 'infra'
19:00:23 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/IIJAA3YB34I5JUJLM7SXXRGGQYL2JXGI/ Our Agenda
19:00:30 #topic Announcements
19:01:17 north america did its DST switch. I think europe and australia have changes coming up in the near future too. Just keep that in mind as you manage your calendars :)
19:01:35 * fungi shakes his fist at the hour the universe has borrowed from him
19:01:38 * corvus puts food in face
19:02:16 Also I'm going to be fishing thursday, and I think my back is demanding I finally go find a new office chair sooner rather than later, so that will probably happen tomorrow. Tl;dr I'll be in and out this week
19:02:46 i'll be starting a little later than usual on thursday (routine medical checkup scheduled)
19:03:02 but expect to be around the rest of the day
19:03:11 The OpenStack TC election is happening right now. Go vote if you can
19:04:22 also openstack rc1 releases are due this week
19:05:23 #topic Server Upgrades
19:05:55 Haven't seen any movement on this. We'll discuss more later, but the rackspace clouds.yaml updates should be in place now. Please say something if you run into trouble booting new servers related to that
19:06:07 the dns updates may not work, which for 99% of things is probably fine
19:06:28 it's not too bad to add reverse records by hand when we need them, and forward records are mostly in opendev.org now
19:06:57 i didn't see any issue with dns when i tested the launch-node script with an api key only cloud definition
19:07:21 easy enough to rerun that test and check for it explicitly though
19:08:03 fungi: dns uses its own auth credentials which I think are the old ones
19:08:10 so they will stop working once MFA is enabled
19:08:13 oh, not clouds.yaml then
19:08:17 correct
19:08:33 that's what the files we source are for when we do the dns stuff
19:08:40 one sources creds, the other the virtualenv for the tool
19:09:36 anyway we'll talk more about rax mfa in a bit
19:09:43 #topic MariaDB Upgrades
19:10:02 I've got two changes up to do some more upgrades. One will do refstack and the other etherpad. They both need reviews. I'm happy to approve and babysit though
19:10:08 #link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.11
19:10:13 #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
19:10:37 based on paste's upgrade these should be straightforward, but database changes are always worth monitoring
19:11:26 let me know if you have any questions or concerns
19:11:35 #topic AFS Mirror cleanups
19:11:59 We're in the "idle" period between active mirror/nodepool cleanups where we try to reduce the impact across our repos
19:12:34 On the infrastructure side of things I think we are now ready to remove the centos-7 base-jobs nodeset and the images from nodepool, but I announced that would happen on friday
19:12:47 I don't have changes up for that yet but will get them up before Friday in order to make it easy to land them
19:13:02 in the meantime topic:drop-centos-7 is worth keeping an eye on in case there are further cleanups
19:13:12 ubuntu-ports mirror was broken, likely as a result of running into quota limits
19:13:34 should be fixed now, but it implies we'd better try to avoid this repeating if possible
19:13:47 ++ I'm somewhat surprised that reprepro can't resolve that on its own though
19:14:01 maybe I just didn't do the right things
19:14:22 and if you didn't I don't blame you. reprepro docs are a bit incomprehensible :)
19:15:10 I also started pushing some changes for xenial removal under topic:drop-ubuntu-xenial
19:15:26 I don't think we are in a hurry there as there will be plenty to untangle for xenial. But reviews are always welcome
19:16:05 slow but steady progress, including projects dropping old configs. That is nice to see
19:17:05 #topic Rebuilding Gerrit Container Images
19:17:05 the centos7 removal from devstack is still blocked
19:17:12 #undo
19:17:12 Removing item from minutes: #topic Rebuilding Gerrit Container Images
19:17:26 frickler: that is due to jobs in other projects like keystone?
19:17:47 yes, see https://review.opendev.org/c/openstack/devstack/+/910986
19:18:30 thanks. I don't see a change or changes yet to remove it from keystone
19:18:36 fungi: was that something you were planning on pushing?
19:18:50 yeah, i need to pick that back up now
19:19:01 keystone wasn't mentioned in the list of affected projects so far either
19:19:44 but since this is also on master, it might be a good idea to delay the removal until after the openstack release if this can't get fixed in time
19:19:53 says /me with release hat on
19:20:22 frickler: wouldn't it be just as easy to disable/remove that job as to land any fixes to keystone that might be necessary for the release?
19:20:26 I guess I don't see this as a hard blocker
19:20:40 because either way you're talking about $something that needs updating in keystone, and if we can do that we can drop the job
19:21:03 but we can check in on friday and make a call if it seems particularly painful
19:21:06 we don't know yet how many other projects might also be affected
19:21:23 and we will discover that only by iterating over merging fixes
19:21:27 we can add removal changes for stable/2024.1 before or after the release
19:21:51 ya I think my main pushback is that the only way that should affect the release is if you need to update the job config for one of those projects anyway
19:22:05 in which case having a job that needs to be deleted is equivalent to whatever else is blocking you
19:22:32 well, or if zuul configuration is broken and the release jobs for the project's tag don't run
19:22:54 though i think those jobs are all defined in other repos
19:23:01 I thought zuul will continue to run the jobs in that case, yes
19:23:08 I don't know that for certain though
19:23:10 project-config and openstack-zuul-jobs mainly
19:23:42 part of my concern is that we can't avoid merging these until every project is fixed, because we know not all will be fixed ahead of time
19:23:58 yes, that should be fine, I was more thinking about possible last minute fixes being needed
19:23:58 i thought zuul wouldn't run jobs for a project if it couldn't load its configuration, but specifically for things defined via in-project configuration, which hopefully the release jobs aren't
19:24:36 fungi: and you can have errors in your config that are only a problem when you try to modify the config
19:24:42 otherwise it will use cached configs
19:25:00 (which isn't something to rely on, but config errors aren't usually a hard stop)
19:25:37 anyway we can take stock in a few days and make a call then
19:25:49 #topic Rebuilding Gerrit Container Images
19:26:08 Gerrit finally released a new version of 3.9 to update mina ssh for that mitm thing
19:26:13 #link https://review.opendev.org/c/opendev/system-config/+/912470 Update our 3.9 image to 3.9.2
19:26:44 I try to keep our images up to date so that we're testing what we'll actually be upgrading with. However, merging this change will produce new 3.8 images too, so we should try to restart gerrit even though it is running the older image
19:27:11 historically we've upgraded gerrit around april/may and november/december
19:27:28 would be great to get this up to date then try and work towards a gerrit upgrade in the next month or two
19:27:43 should we try to combine with project renames?
19:27:52 my preference is that we don
19:27:55 *don't
19:28:19 ok
19:28:26 yes, it can result in a bigger mess or additional delays if something needs to be rolled back
19:28:27 I think project renames are hacky enough that combining them with an upgrade is more risk than necessary. Also both should be relatively quick so we won't have massive downtimes
19:29:14 i agree, two brief outages a week or two apart is preferable to one slightly longer outage which has an increased risk of something going wrong
19:29:28 #topic Project Renames
19:29:55 That's a good jump to another one of today's topics. A reminder we've pencilled in April 19. If you know of people who may want to rename projects, remind them to get that info pushed up
19:30:08 #link https://review.opendev.org/c/opendev/system-config/+/911622 Move gerrit replication queue aside during project renames.
19:30:26 I also wrote this change to add the workaround for thousands of errors on startup, which we've been using when manually restarting gerrit, to the playbook that automates it
19:31:39 Don't think there is much else to say, and I expect that is the primary prep we need before we get there (we actually test renames in our system-config-run-review jobs so we should be good)
19:31:51 #topic Rackspace MFA Requirement
19:32:11 As noted earlier, all of our clouds.yaml files should now be updated to use the api key auth method and the rackspaceauth plugin
19:32:23 this should make launch node, openstackclient, and nodepool happy
19:32:39 one major exception is the dns client and dns updates for dns hosted by rax
19:32:57 and then there is some question about swift hosted job logs, but we're like 95% certain those use dedicated swift accounts which shouldn't be affected
19:33:37 The enforced deadline for the change is March 26
19:33:54 which means we can either wait until then, or since we think we're prepared, opt into MFA now and see what breaks
19:33:58 see also https://review.opendev.org/912632 per our earlier discussion, dns should be fine but that change ought to help avoid future divergence
19:34:43 oh interesting, it was already using the api key
19:34:46 basically, the dns api module was already set up using api keys, so it transitioned to the new approach long ago
19:35:17 there are three accounts that we will need to manage MFA for. I think we'll just do totp like we've done for other accounts
19:36:01 fungi: actually I don't think that change (912632) is safe? dns is managed via the third account
19:36:07 which is different from the control plane account
19:36:16 dns is, rdns is not
19:36:27 oh this is rdns specific, got it
19:36:33 i did not touch dns, just rdns
19:36:36 got it
19:36:53 as mentioned in the commit message
19:37:00 maybe try to switch to mfa early next week? then we'd have some days left in case it doesn't work as expected
19:37:03 I suspect that updating the nodepool account is going to have the biggest impact if something goes wrong
19:37:21 frickler: ya, so maybe we start with our control plane account, test launch node stuff again, then if that works do the other two accounts
19:37:45 and next week works for me after the way this week filled up
19:38:04 though I won't stop anyone from getting to it sooner
19:38:10 agreed
19:38:35 cool /me scribbles a note to try and find time for that next week
19:38:45 let me know if you want to help, I'll happily defer :)
19:38:45 a lot of my week is consumed by openstack release<->foundation marketing coordination
19:38:54 next week should be better though
19:39:20 sounds like a plan
19:39:23 #topic PTG Planning
19:39:32 #link https://ptg.opendev.org/ptg.html
19:39:37 we are now on the schedule there
19:39:59 and I think we'll use the default etherpad that was "created" for us
19:40:01 #link https://etherpad.opendev.org/p/apr2024-ptg-opendev Feel free to add agenda content here.
19:40:01 wednesday/thursday
19:40:22 yup, I used the times we talked about in here last week.
19:41:16 I need to start adding some stuff to the agenda but everyone should feel welcome to. My intent is to make this more of an externally focused block of time, but we can use it for our own stuff as well
19:41:36 for example ubuntu noble nodepool/dib/mirror stuff would be good to discuss
19:42:31 #topic Open Discussion
19:42:47 fungi: I meant to follow up on git-review things but then we got distracted
19:42:58 fungi: have we approved the changes we're already happy with?
19:43:11 frickler still wanted to take a look, it sounded like
19:43:32 mainly i was looking for feedback on which already-merged changes warranted adding release notes
19:43:37 got it
19:43:41 oh, I missed that, sorry
19:43:41 https://review.opendev.org/q/project:opendev/git-review+status:open is the list of open changes
19:43:55 my suggestion would be that we go ahead and make a release (with release notes if necessary) for the stuff that has already landed
19:44:05 then we can follow up and do a second release for those changes if we want
19:44:06 i'll put together an omnubus reno change with the additions requested for the already merged changes since the last tag
19:44:12 er, omnibus
19:44:16 but I think trying to do everything will just lead to more delays
19:44:27 fungi: sounds ominous
19:44:43 omnomnominous
19:45:19 Also, I mentioned this in #opendev yesterday but got nerd-sniped by eBPF and bcc as profiling tools that may be useful particularly in ci jobs
19:45:36 but yeah, the rackspace mfa notification sort of derailed my finishing up the git-review release prep
19:46:13 I think the tools are neat, and the way they work is particularly interesting to me because I don't have to care too much about specific test job workloads to profile them reasonably well. You can just do it through the lens of the kernel
19:46:41 that said, it's not all perfect and they seem to be somewhat neglected on debuntu compared to rpm distributions
19:47:07 the runqslower command doesn't work on ubuntu for example and the python ustat command crashes
19:47:27 mostly mentioning them because you may find them useful as debugging aids
19:48:42 Anything else?
19:51:01 I'll take that as a no. Thank you for your time and help everyone. See you around and we'll be back here same time and place next week.
19:51:06 #endmeeting
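
For reference on the eBPF/bcc tangent discussed above: the bcc tools mentioned (runqslower, ustat, and friends) are prebuilt front-ends over small kernel probes. Below is a minimal illustrative sketch, not taken from any actual CI profiling setup, assuming the bcc Python bindings (python3-bpfcc on debuntu), matching kernel headers, and root privileges; the script name and probe are made up for the example. It counts execve() calls per pid for a short window, the same "through the lens of the kernel" style of observation described in the meeting.

    # exec_count.py - hypothetical minimal bcc sketch: count execve() calls per pid.
    # Assumes the bcc python bindings, kernel headers, and root privileges.
    from time import sleep

    from bcc import BPF

    program = r"""
    BPF_HASH(counts, u64);

    int trace_exec(void *ctx) {
        u64 pid = bpf_get_current_pid_tgid() >> 32;
        counts.increment(pid);
        return 0;
    }
    """

    b = BPF(text=program)
    # get_syscall_fnname resolves the arch-specific symbol, e.g. __x64_sys_execve
    b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_exec")

    print("tracing execve() for 10 seconds...")
    sleep(10)

    for pid, count in sorted(b["counts"].items(), key=lambda kv: kv[1].value):
        print(f"pid {pid.value}: {count.value} exec calls")

Run it as root (e.g. sudo python3 exec_count.py); per-job profiling like the kind mentioned for ci jobs would build on the same primitives.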