clarkb | just about meeting time | 18:58 |
---|---|---|
clarkb | #startmeeting infra | 19:00 |
opendevmeet | Meeting started Tue Aug 19 19:00:08 2025 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:00 |
opendevmeet | The meeting name has been set to 'infra' | 19:00 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/JQ73GENJBNTUU4QZD7SY6OO2NZ2H7WGU/ Our Agenda | 19:00 |
clarkb | #topic Announcements | 19:00 |
clarkb | I didn't have anything to announce. Did anyone else? | 19:01 |
fungi | i did not | 19:01 |
clarkb | #topic Gerrit 3.11 Upgrade Planning | 19:02 |
clarkb | This continues to be half on the back burner. Except that upstream has made our lives more difficult again by publishing new releases :) | 19:02 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/957555 Upgrade Gerrit images to 3.10.8 and 3.11.5 | 19:02 |
clarkb | Probably a good idea to land that and restart gerrit and recycle the holds so that when I do get to testing things I'm testing with up to date gerrit versions | 19:03 |
clarkb | maybe tomorrow or thursday for the gerrit restart depending on how zuul stuff goes between now and then? | 19:03 |
clarkb | the zuul upgrade and reboot playbook is looking happy so far just needs time and zuul launchers haven't run out of disk yet so they also look happy | 19:04 |
clarkb | makes me think I probably will have time for that soon | 19:04 |
clarkb | any comments or concerns around the gerrit 3.11 upgrade? | 19:04 |
clarkb | #topic Upgrading old servers | 19:05 |
clarkb | Fungi cleaned up refstack and the old eavesdrop server stuff last week | 19:06 |
clarkb | Next on the list are kerberos, openafs, graphite, and backup servers | 19:06 |
fungi | i'll try to get to kerberos/openafs later this week | 19:06 |
clarkb | fungi: that would be great. Feel free to ping me if I can help in any way | 19:06 |
fungi | those seem to all be on focal at the moment | 19:06 |
clarkb | Then I wanted to call out a milestone that we appear to have reached: there are no more bionic servers | 19:06 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/957950 cleanup bionic node testing in system-config | 19:07 |
fungi | my plan is to stick them in emergency disable, update our testing to jammy, in-place upgrade them, one by one to jammy, then maybe increase our testing again to noble and repeat, finally removing them from the disable list | 19:07 |
clarkb | This came up because ansible 11 (our new default in zuul) isn't compatible with the python version on bionic. I did a quick workaround yesterday then dug in more this morning and I believe that we don't have any servers running bionic so we can drop testing for that platform and use ansible 11 | 19:07 |
corvus | huzzah! | 19:08 |
clarkb | fungi: that sounds like a great plan. I like the idea of going step by step and checkpointing just to catch any problems early | 19:08 |
fungi | at least i hope our system-config jobs will give an early warning of serious problems before attempting to upgrade | 19:08 |
clarkb | feel free to double check me on the no more bionic assertion but best I can tell the hosts in our ansible fact cache reporting bionic are no longer in our inventory | 19:08 |
clarkb | fungi: ya it should catch the more obvious stuff | 19:09 |
clarkb | its a good check | 19:09 |
clarkb | I also shutdown gitea-lb02.opendev.org this morning as frickler reported its cronspam was not verifying now that its dns records are gone | 19:09 |
clarkb | we kept the server around as a debugging aid so I didn't want to delete it. But shutting it down for now seemed fine and I've done so | 19:09 |
clarkb | any other server upgrade/replacement/deletion questions comments or concerns? | 19:10 |
corvus | that reminds me, we have a dns record for review02 but i think we're on review03 now? | 19:10 |
fungi | cacti/storyboard are xenial, looks like, and wiki is older | 19:10 |
fungi | so i agree no more bionic that i can find | 19:11 |
fungi | we have stuff older than bionic but we also can't really test it at this point | 19:11 |
clarkb | corvus: correct we're on 03 now. I'll make a note in my todo list to look at that | 19:11 |
clarkb | ya and we removed testing for that older stuff a little while ago | 19:12 |
clarkb | so ship has sailed... | 19:12 |
fungi | review02 is also still in the emergency disable list | 19:12 |
fungi | along with gitea-lb02 | 19:12 |
clarkb | they aren't in our inventory any more so should be able to be removed from the emergency file. I can check on that too | 19:13 |
clarkb | #topic Matrix for OpenDev comms | 19:14 |
clarkb | #link https://review.opendev.org/c/opendev/infra-specs/+/954826 Spec outlining the motivation and plan for Matrix trialing | 19:14 |
clarkb | I've updated the spec with the feedback that I got. It looks like ianw is happy with it now. Anyone else care to rereview? | 19:14 |
clarkb | I guess corvus did ask about double checking the possibility of EMS hosted mjolnir which I haven't done | 19:14 |
clarkb | we should probably keep most of the discussion on this topic within the spec review | 19:15 |
corvus | yeah, still seems like a good idea, but probably doesn't radically alter the next steps which are: run mjolnir (either ourselves or via ems) | 19:15 |
clarkb | so please follow up there. | 19:16 |
corvus | though i'm still at "learn how to speel mjolnir" which is step 0 | 19:16 |
clarkb | I think it is literally the hardest word to remember the spelling of and type | 19:16 |
clarkb | #topic Pre PTG Planning | 19:16 |
fungi | sadly i'm enough of a mythology geek that i have no trouble spelling it | 19:17 |
clarkb | #link https://etherpad.opendev.org/p/opendev-preptg-october-2025 Planning happening in this document | 19:17 |
clarkb | I think we can consider this proposed schedule pretty well settled at this point as I haven't heard any feedback to the contrary | 19:17 |
fungi | sgtm, thanks! | 19:18 |
clarkb | please add agenda items to the etherpad if you have ideas for things to do or change etc | 19:18 |
clarkb | I'll continue to add items myself as I think of them | 19:18 |
clarkb | #topic Service Coordinator Election Planning | 19:19 |
clarkb | The service coordinator nomination period ends at EOD today on a UTC clock | 19:19 |
clarkb | #link https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/YXRD23ZWJGDPZ3WESBNZNEYO7NBCXFT4/ | 19:19 |
clarkb | yesterday I went ahead and nominated myself | 19:19 |
clarkb | https://lists.opendev.org/archives/list/service-discuss@lists.opendev.org/thread/WNUYDT47NMYC3SC5QA44OG4PWK5ENQEF/ | 19:20 |
clarkb | it seemed like no one else was going for it and I wanted to make sure that I was well ahead of the deadline. If however I misread the room please speak up. I'm happy to step to the side or work together with someone etc | 19:20 |
clarkb | (or have an election) | 19:20 |
clarkb | but I suspect its a tag you're it again situation for myself. And that is fine too. That said I think some variety would be a good thing and would happily support someone else in the role | 19:21 |
fungi | your sacrifice is appreciated | 19:22 |
clarkb | #topic Loss of upstream Debian bullseye-backports mirror | 19:22 |
clarkb | I think we had a rough outline of a plan here which included potentially breaking a small number of jobs | 19:22 |
fungi | sounds like we have a way forward for this yes, just haven't had time to start in on it | 19:22 |
clarkb | any concerns with the plan since we last spoke about it here? | 19:23 |
clarkb | (the plan is basically to clean up backports for bullseye and force jobs to find an alternative since that most accurately reflects the state with upstream) | 19:23 |
fungi | i just need to remember if we said base-jobs first or straight to zuul-jobs with an advance announcement | 19:23 |
clarkb | should be in the meeting logs (I believe it was zuul-jobs but double check) | 19:24 |
fungi | the latter is less work for greater benefit in the long run, but will take a bit longer | 19:24 |
fungi | yeah, that's what i thought | 19:24 |
clarkb | #topic Etherpad 2.4.2 Upgrade | 19:26 |
clarkb | #link https://github.com/ether/etherpad-lite/issues/7065 Theming updates have broken the no-skin and custom skin themes. We use no-skin. | 19:26 |
clarkb | I was hopeful that there would be followup after they responded to the issue last week just after our meeting | 19:26 |
clarkb | but no. Basically they said no-skin isn't expected to be used its only there as an example for other skins and I responded that no-skin is/was the only skin until colibris was added and we kept it for user familiarity and density of the text | 19:27 |
fungi | it does seem like there's been some upstream turnover, to the point where the current maintainers aren't aware of what etherpad-lite used to look like | 19:27 |
clarkb | still waiting for a response indicating whether or not no-skin is no longer expected to be used | 19:27 |
clarkb | but hopefully we have an answer soon on whether or not we have to accept colibris or can continue as is | 19:27 |
fungi | but also custom skins (presumably based on no-skin) are now broken too | 19:28 |
clarkb | yes at least one person responded that this is the case I think they have some stuff to fix at least | 19:28 |
fungi | so no-skin isn't currently useful even for the purpose they thought it was for | 19:28 |
fungi | and if they fix that, it will likely be usable for us again too | 19:29 |
clarkb | here's hoping | 19:29 |
clarkb | #topic Moving OpenDev's python-base/python-builder/uwsgi-base Images to Quay | 19:29 |
clarkb | #link https://review.opendev.org/c/opendev/system-config/+/957277 | 19:29 |
clarkb | last week I noted that I had some concern that we may not use speculative images in builds with this switch. Since then I dug into the docker docs, our test job roles, and the change I wrote and I think this is a non issue | 19:30 |
clarkb | our past selves already addressed these problems | 19:30 |
clarkb | at this point I think the main concern is that we'll want to rebuild all of the python based images after this lands | 19:31 |
fungi | yay! /pats selves on back | 19:31 |
clarkb | probably don't need to rush to do that but also won't want to delay | 19:31 |
clarkb | I'm not sure there is ever a good time for something like that so its mostly calling it out as a todo once that lands so we don't forget and use the old stale images forever | 19:31 |
clarkb | so ya reviews welcome and feedback on timing too | 19:31 |
clarkb | #topic Bootstrapping rax-flex iad3 | 19:32 |
clarkb | there is a third rax flex region and cloudnull has given us the go ahead to use it. Yesterday we landed a change to update our clouds.yaml files and set up the cloud launcher to preconfigure things | 19:33 |
clarkb | cloud launcher failed on auth issues and it seems we still need to login to skyline first to have the new env sync account stuff from the old env | 19:33 |
fungi | (along with increased quotas in all 3 regions) | 19:33 |
clarkb | I have done this for both of our accounts in iad3's skyline service and the openstack client can list images in both accounts now | 19:33 |
clarkb | I think I'll wait for our daily cloud launcher run to happen at ~0200 UTC and then look into bootstrapping a mirror tomorrow | 19:34 |
clarkb | we also need to set the network mtu to 1500 which is an extra step post cloud launcher | 19:34 |
clarkb | then cloudnull also suggested we reenable the rax classic regions and see if they are happier now. Looks like the first change to do that just merged | 19:35 |
clarkb | If we do see new or renewed problems with that we can use the existing email thread I started to followup | 19:35 |
clarkb | (and if anyone would prefer I do the emailing just let me know) | 19:36 |
fungi | yeah, the change for the first two regions just merged | 19:36 |
clarkb | I split them because ord and dfw had a different failure mode to iad | 19:36 |
clarkb | so want to reenable them separately for ease of reverts / debugging | 19:36 |
fungi | right, i didn't approve the second for now | 19:36 |
clarkb | #topic Open Discussion | 19:37 |
clarkb | Anything else? | 19:37 |
fungi | i'll be gone next week and half the following week | 19:38 |
clarkb | I mentioned earlier that the zuul upgrades and reboots seem happier now. I am running that playbook in a root screen on bridge out of band due to several consecutive failures the last few weeks | 19:38 |
clarkb | fungi: enjoy your time of | 19:38 |
clarkb | *off | 19:38 |
fungi | thanks! | 19:38 |
clarkb | the ansible 11 switch has gone really well I think. The main issues we have encountered are Bionic and older nodes not being supported by ansible 11 due to python version incompatibilities | 19:41 |
corvus | ++ | 19:41 |
clarkb | then we also discovered skyline used a list for playbook vars: https://opendev.org/openstack/skyline-apiserver/src/branch/master/playbooks/devstack/pre.yaml#L8-L9 and ansible 11 doesn't like that | 19:41 |
fungi | and that one weird skyline job | 19:41 |
fungi | yeah that | 19:41 |
clarkb | but I'm not sure older ansible was even doing the right thing with that config. It should also be a trivial problem to fix | 19:42 |
fungi | it was likely ignored | 19:42 |
corvus | i get the idea that was like removing prehistoric syntax | 19:42 |
corvus | you think? i thought it was just a super weird old way to specify vars | 19:42 |
fungi | i should be surprised that turned up in one of the newest openstack projects, but i'm not | 19:43 |
fungi | yeah maybe it did actually work in 9 | 19:43 |
clarkb | ya its possible it just worked until ansible decided it was weird and stopped being backward compatible | 19:43 |
clarkb | in any case straightforward to fix | 19:43 |
clarkb | if you see any other ansible 11 issues its good to make note of them as this sort of info can go into zuul's changelog for ansible things | 19:44 |
clarkb | helps other zuul users | 19:44 |
clarkb | ok last call. Anything else? Otherwise we can end about 15 minutes early today | 19:44 |
fungi | bindep and git-review were testing on older python versions we'll need to make some decisions about | 19:45 |
fungi | pbr too | 19:45 |
clarkb | I think for bindep and git review we just drop the old stuff and move on. They have old releases that can run with old python | 19:45 |
clarkb | pbr is trickier and probably worth keeping python2.7 still since swift only just dropped support for that version | 19:46 |
fungi | seems like we can move pbr's py27 testing to newer platforms, but will probably need to drop 3.5-3.7 testing | 19:46 |
clarkb | (and that means updating the python2.7 test job to jammy I think) | 19:46 |
clarkb | ya exactly | 19:46 |
fungi | well, or focal | 19:46 |
clarkb | sounds like that may be everything. Thanks everyone! | 19:48 |
fungi | thanks clarkb! | 19:48 |
clarkb | I'll probably run a meeting next week despite the expected lower attendance. Its good to capture the goings on for people to review if nothing else | 19:48 |
clarkb | #endmeeting | 19:48 |
opendevmeet | Meeting ended Tue Aug 19 19:48:39 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:48 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/infra/2025/infra.2025-08-19-19.00.html | 19:48 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/infra/2025/infra.2025-08-19-19.00.txt | 19:48 |
opendevmeet | Log: https://meetings.opendev.org/meetings/infra/2025/infra.2025-08-19-19.00.log.html | 19:48 |
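
The skyline issue raised in open discussion (playbook vars written as a list, which ansible 11 rejects) can be illustrated with a small Python sketch. It assumes the legacy form was a list of mappings and that the fix is to collapse it into one mapping; the log does not show the file's contents, so the variable names below are hypothetical, and "later entries win" on duplicate keys is a choice made for this sketch, not confirmed Ansible behaviour. Whether older Ansible merged or ignored such a list is left open in the discussion above.

```python
from collections.abc import Mapping, Sequence


def normalize_play_vars(vars_spec):
    """Return play vars as a single dict.

    Accepts the mapping form unchanged, or a legacy list of mappings
    that is merged left to right (later entries win; an assumption).
    """
    if isinstance(vars_spec, Mapping):
        return dict(vars_spec)
    # Strings are sequences too, so reject them explicitly.
    if isinstance(vars_spec, (str, bytes)) or not isinstance(vars_spec, Sequence):
        raise TypeError("vars must be a mapping or a list of mappings")
    merged = {}
    for entry in vars_spec:
        if not isinstance(entry, Mapping):
            raise TypeError("each list entry must be a mapping")
        merged.update(entry)
    return merged


# Hypothetical legacy list form, as it might look after YAML parsing.
legacy_vars = [{"example_dir": "/opt/example"}, {"example_flag": True}]
print(normalize_play_vars(legacy_vars))
```

Writing the merged mapping back to the playbook as a plain `vars:` dictionary is the "trivial" fix mentioned in the meeting; the helper only shows the shape of the change.
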