19:00:20 #startmeeting infra
19:00:20 Meeting started Tue Mar 5 19:00:20 2024 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:00:20 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:00:20 The meeting name has been set to 'infra'
19:00:27 #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/UG2JFEL6XFFLDT5UYDHCBYNAJF72XXHZ/ Our Agenda
19:00:43 #topic Announcements
19:00:53 hold bowl with one hand, chopsticks with second hand, type with third hand
19:01:33 small note that I'll be AFK through a good chunk of tomorrow. Taking advantage of a morning matinee and kids being in school to see Dune
19:02:16 #topic Server Upgrades
19:02:27 I haven't seen any new movement on this
19:02:55 nope. I'll address the review feedback and boot the new servers today
19:03:04 Worth calling out that the announced rackspace mfa switch may impact our ability to run launch node. I've got notes to discuss that further at the tail end of the meeting
19:03:12 tonyb: ah, if you boot today you should be fine
19:03:24 #topic MariaDB Upgrades
19:03:56 The paste db upgrade went as expected. It seems to have only touched system tables, and it backed those tables up first; the backup was less than 1MB, so it's reasonable to keep letting the process do that backup
19:04:04 #link https://review.opendev.org/c/opendev/system-config/+/910999 Upgrade refstack mariadb to 10.11
19:04:09 #link https://review.opendev.org/c/opendev/system-config/+/911000 Upgrade etherpad mariadb to 10.11
19:05:03 I went ahead and pushed these two changes to upgrade refstack and etherpad's backing databases. I did have to make a small change to etherpad's test cases because the log output from 10.11 now says mariadb is ready instead of mysql is ready
19:06:31 reviews welcome, as well as any feedback on whether we're comfortable with docker-compose kicking the upgrade off automatically or if we'd prefer manual intervention for up-to-the-minute backups
19:07:10 after these two, gerrit, gitea, and mailman 3 are the remaining dbs that need upgrades. I'll try to continue to step through them
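A minimal sketch of what the manual-intervention option above could look like, assuming a docker-compose managed MariaDB service like etherpad's: take a fresh dump first, then let compose pull and restart on the 10.11 image. The container name, dump path, and the omitted credential flags are placeholders, not the real deployment's values.

```python
#!/usr/bin/env python3
"""Sketch of a manual pre-upgrade backup ahead of a docker-compose image bump.
Container name, dump path, and credential handling are assumptions."""
import subprocess
from datetime import datetime, timezone

CONTAINER = "etherpad-docker_mariadb_1"   # assumed container name
DUMP = f"/var/backups/mariadb-pre-10.11-{datetime.now(timezone.utc):%Y%m%d%H%M%S}.sql"

# Take an up-to-the-minute dump of all databases before the new image is pulled.
# Authentication flags are intentionally elided; a real deployment would pass
# its root credentials from the compose environment.
with open(DUMP, "w") as out:
    subprocess.run(
        ["docker", "exec", CONTAINER,
         "mysqldump", "--all-databases", "--single-transaction"],
        stdout=out, check=True)

# Only once the dump succeeded do we pull and restart on the new image
# (run from the directory containing the docker-compose.yaml).
subprocess.run(["docker-compose", "pull"], check=True)
subprocess.run(["docker-compose", "up", "-d"], check=True)
```

Letting docker-compose kick the upgrade off automatically skips this explicit dump and relies on the regular backups plus the small system-table backup mentioned above; which trade-off we prefer is exactly the feedback being asked for.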
19:07:57 #topic AFS Mirror cleanups
19:08:14 OpenSUSE Leap and Debian Buster have been removed from afs mirroring as well as nodepool
19:08:30 Next up is CentOS 7, which we've got some stuff in progress for under topic:drop-centos-7
19:09:04 I did realize that CentOS 7 had/has far more reach than the other two, so I decided to announce a removal date of March 15 in order to minimize impact to the openstack release process
19:09:24 the impact should still be minimal, but there were enough places that centos 7 was still showing up that I didn't want to just blaze ahead like I did with the others
19:10:02 we're currently cleaning up project configs, then late this week or early next week I'll drop zuul-jobs testing of centos 7 and remove wheel caching for centos 7
19:10:13 the custom nodeset definition in devstack is nearly done merging backports across 8 active branches
19:10:27 then we can do the actual nodeset and nodepool removal on the 15th, and once that is done clean up afs
19:10:38 #link https://review.opendev.org/c/opendev/system-config/+/906013 Improve DKMS for CentOS OpenAFS testing/packaging
19:10:58 though i expect whichever tries merging last to fail with errors we can then use to see what old branches of other projects are using the devstack nodeset
19:11:04 this change isn't directly related to the cleanup but involves centos and afs, and I think it will make it easier to understand failures with dkms on the platform
19:11:12 fungi: ya
19:11:25 fungi: keystone for example
19:12:03 slow but steady progress. And we've already freed up like 400GB of openafs consumption
19:12:20 this cleanup effort is likely to be pretty vast, since copies of bits like custom nodesets and jobs are declared across many, many branches and only the last removal will actually tell you what was using it
19:12:33 I will note that last friday when I tried to clean up the buster mirror content afs01.dfw.openstack.org lost a "disk" and everything went sideways
19:12:56 it isn't clear to me if this was due to deleting a few hundred gigabytes of data or just coincidence
19:13:13 something we should be aware of when making other large changes to openafs. Rax addressed it quickly at least
19:13:40 yeah, and other than retrying some vos releases there wasn't any lasting impact
19:13:40 fungi: yes, I mentioned it elsewhere, but we really need openstack to clean up old stuff early in the branching process instead of at eol time
19:14:01 because we're ending up with ancient configs that make no sense in modern openstack and that continue to be carried forward release after release, increasing the cleanup time/cost
19:14:34 i think people define shared resources in branched repos without considering how zuul uses them
19:15:07 and not realizing that even if you delete something out of your master branch, other projects will just keep using it from a branch 5 releases ago
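Since only the final removal tends to reveal the stragglers, a quick way to get ahead of it in any one repo is to grep every branch rather than just master. The sketch below is a rough illustration; the clone path and the nodeset name are hypothetical, not the actual devstack nodeset.

```python
#!/usr/bin/env python3
"""Grep every remote branch of a clone for a string (e.g. a nodeset name),
since removing it from master says nothing about older branches.
The repo path and pattern are hypothetical examples."""
import subprocess

REPO = "/path/to/openstack/devstack"          # assumed local clone
PATTERN = "devstack-single-node-centos-7"     # hypothetical nodeset name

# List the remote branches, then git-grep each one for the pattern.
branches = subprocess.run(
    ["git", "-C", REPO, "for-each-ref",
     "--format=%(refname:short)", "refs/remotes/origin"],
    capture_output=True, text=True, check=True).stdout.split()

for branch in branches:
    hits = subprocess.run(
        ["git", "-C", REPO, "grep", "-l", PATTERN, branch],
        capture_output=True, text=True)
    if hits.returncode == 0:   # git grep exits 0 only when it found matches
        print(f"{branch} still references {PATTERN}")
```

That only covers a single repo, of course; the old branches of other projects are the part that, as noted above, usually only surfaces when the last removal fails.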
19:15:09 Another issue that I ran into was that openafs doesn't load on debian bookworm arm64
19:15:11 #link https://gerrit.openafs.org/#/c/15668/ Fix for openafs on arm with newer gcc
19:15:58 Once upstream merges this fix I'll submit a bug to debian to see if we can get it fixed there (it doesn't work at all, so it should be a good candidate for a fixup)
19:16:44 Once we've chipped enough of this old stuff out we can add in new things :)
19:16:59 if anyone wants to get a headstart on that, a new dib job to start building Ubuntu 24.04 might be helpful
19:17:30 is it coincidence that openafs is using gerrit and we are using openafs? /me just notices this
19:17:46 frickler: yes, I think it is
19:19:15 #topic OpenDev Email Hosting
19:19:32 Don't think we have anything new to mention on this. But kept it on the agenda in case we had any stronger opinions
19:19:44 * clarkb will give everyone a couple minutes to chime in if so. Otherwise we can continue on
19:20:22 I'd be fine with dropping it from the agenda and reviving it once we consider it to be more urgent again
19:20:58 wfm, I can do that
19:21:33 #topic Project Renames
19:21:53 This is mostly a reminder that we're planning to do renames after the openstack release, on April 19
19:21:59 we can adjust this timing as necessary
19:22:08 so please say something if that timing is especially bad for some reason
19:22:50 the release should happen earlier, the date is after the PTG
19:23:06 correct. It's basically release, then ptg, then the 19th
19:23:21 we didn't want to conflict with the ptg or the release so we're doing it late
19:23:55 which is a good lead into our next topic
19:24:03 #topic PTG Planning
19:24:09 #link https://ptg.opendev.org/ptg.html
19:24:26 I was hoping this schedule would be a bit more filled in before picking times, but it is very empty
19:24:47 rather than wait for others to fill in, I think we can go ahead and grab some time.
19:24:56 Something like Wednesday 0400-0600 and Thursday 1400-1600 UTC. Gives enough time between blocks to catch up on sleep.
19:25:49 monday and tuesday tend to be busy so I'm trying to accommodate that
19:26:22 +1
19:26:34 Works for me. I admit I'll only be attending APAC friendly meetings
19:26:38 ever since the ptg organizers stopped trying to pre-schedule times for all registered teams, many teams tend to wait until the last week to book any slots
19:26:59 I was thinking of dropping into the openeuler session
19:27:11 tonyb: that doesn't conflict with the times I proposed, does it?
19:27:28 no, it is on friday so we're good there
19:27:43 I wasn't even aware that the scheduling is already happening. Seems it is only announced to PTLs/session leaders?
19:28:02 I don't think so. the one I saw was Friday
19:28:05 frickler: yes, emails did go out to the session leaders. Not sure if emails went out more broadly.
19:28:28 I can make a note that we may need to communicate this more widely
19:28:54 I think it only goes to session leaders
19:31:07 anyway I'll get us signed up for those two blocks later today
19:31:17 #topic Rax MFA Requirement
19:31:17 sounds good
19:31:44 fungi received email today announcing that rax will require MFA for authentication starting March 26, 2024
19:32:06 they've also added a similar notice on the login page for their portal
19:32:06 enabling MFA breaks normal openstack api auth. We have to either use a rax api key or a bearer token
19:32:35 this means all of our automation is impacted.
19:33:08 Since bearer tokens expire (relatively quickly too), we've decided to investigate using the api_key method. To do this we need to install rackspaceauth as a keystoneauth1 plugin in all the places we use the api
19:33:18 then we need to use the api key value instead of regular user auth
19:33:51 the rough plan here is to test this with nodepool using a single region to start, that way we can check that launcher and builder operations work (or don't)
19:34:16 do we know the lifetime for those api keys?
19:34:27 then when that works we can switch all rax nodepool providers over to the new system and update our control plane management to use the same api-key stuff. Then we can opt in to MFA when ready
19:34:44 fungi: ^ do you know the answer to frickler's question? You were testing this with your personal account; any indication of a lifetime?
19:34:45 frickler: i generated one for my personal rackspace account years ago and it's never changed
19:35:12 from what i can tell it only changes if you click the "reset" button next to it in the account settings
19:35:32 If you'd like to help with reviews or pitch in pushing changes, we're using topic:rackspace-mfa
19:35:48 I'm not really seeing how that helps with security at all?
19:35:59 tonyb: it helps with security theater
19:36:10 if you force people to make changes then you can't say you didn't do anything
19:36:24 fungi: for the system-config change we need to put new secrets in private vars. Is that done yet?
19:36:38 thinking about our next steps, I think it is roughly: add the new private vars, land the system-config change, then update the nodepool config
19:36:46 yes, i left a comment on the change saying i did it too
19:37:28 then we can either land the nodepool change or try it out of the intermediate registry. Pulling from the intermediate registry will only work for the launcher image I think, since the builder is multiarch and docker isn't able to negotiate multiarch images out of the intermediate registry currently :/
19:37:31 fungi: thanks!
19:38:06 fungi: we should be able to push up a project-config update with a depends-on on system-config too if we haven't yet
19:38:16 but I think that is where we're at until a couple of things merge
19:38:17 i think if it works for the launcher that's good enough to land the nodepool change
19:38:50 as an alternative we can manually install the lib into the image if the launcher is multiarch too and can't be fetched out of testing
19:38:52 clarkb: what needs changing in project-config? i can do that
19:38:59 (i don't think we need to prove it works to land the nodepool change; it's pretty simple. but still, it'd be nice to avoid churn or errors there since there's no real way to test it other than in prod)
19:39:23 fungi: we have to update the nodepool/nl01.opendev.org and nodepool/nodepool.yaml files to force one of the three rax providers to use your newly defined clouds.yaml entries
19:39:32 oh, right that
19:39:34 corvus: ++
19:39:37 yeah i'll get that proposed
19:39:51 though probably not until after 21z
19:39:51 fungi: I would pick the rax region with the lowest capacity to reduce impact if it doesn't work
19:39:58 good idea
19:40:38 we have three weeks to get this working, which seems like plenty, but if we run into problems that time can disappear on us very quickly
19:41:03 agreed, best to get as much info as we can as early as possible and then adjust our plan as necessary
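As a rough illustration of what the api_key switch looks like from the consumer side, the sketch below assumes the rackspaceauth plugin is installed and that a clouds.yaml entry (the name rax-iad-apikey is made up) selects whatever auth_type that plugin registers for API-key auth and carries the username and API key instead of a password; ordinary openstacksdk calls then work unchanged.

```python
#!/usr/bin/env python3
"""Smoke-test a Rackspace region through openstacksdk using an API-key based
clouds.yaml entry. The cloud name is made up and the auth plugin details are
assumptions about what rackspaceauth provides."""
import openstack

# openstacksdk resolves "rax-iad-apikey" from clouds.yaml; with the plugin
# installed, keystoneauth obtains tokens from the API key, so portal MFA
# should not get in the way of the OpenStack APIs.
conn = openstack.connect(cloud="rax-iad-apikey")

# The sort of compute operations the nodepool launcher and builder rely on:
for server in conn.compute.servers():
    print("server:", server.name)
for flavor in conn.compute.flavors():
    print("flavor:", flavor.name)
```

Nodepool and the launch-node tooling read the same clouds.yaml plumbing, so flipping a single low-capacity region over first, as planned above, should exercise the same path before anything else depends on it.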
19:41:55 what about log uploads, are these also affected or not? the earlier discussion in #opendev didn't seem conclusive to me
19:42:18 we use swift account credentials for that, not keystone
19:42:25 as i understand it
19:42:45 those are separate accounts defined in swift itself and scoped to specific swift acls
19:42:46 ya, so I don't think they will be affected, but we should double check on that (check that we are using special creds and check that they aren't affected, though i'm not sure how we do this second thing)
19:43:00 corvus: you may recall the details, as I think you set that up?
19:43:57 we can also, worst case, fall back to only uploading to ovh in the interim while we work it out
19:44:05 not ideal but ya that would work
19:44:41 as far as actual MFA implementation goes, their docs refer to phone authenticator apps. Typically this means they are doing totp, so we should be able to do that here as well
19:44:57 similar to how some of our other accounts have done totp
19:46:12 Still a lot of unknowns for now, but we've got a plan to learn more. Next week we can catch up and make sure there aren't any glaring issues we need to address
19:46:16 #topic Open Discussion
19:46:21 Anything else before we end the meeting?
19:47:00 clarkb: i don't recall the details....
19:47:21 corvus: ack, we should be able to log in to the swift stuff and check and/or look at our secrets in zuul
19:47:35 yeah, probably worth looking into ahead of time
19:47:41 because i agree, something is different about it
19:48:45 openstack is starting to get into release mode. Keep that in mind when making changes
19:48:57 and that's about all I had
19:51:50 sounds like that is everything for today. Thank you everyone for your time and effort operating and improving opendev
19:51:55 #endmeeting