Sunday, 2026-04-12

-@gerrit:opendev.org- Dr. Jens Harbott proposed: [openstack/project-config] 984220: Drop *-wheel-cache-ubuntu-bionic jobs https://review.opendev.org/c/openstack/project-config/+/98422009:39
-@gerrit:opendev.org- Zuul merged on behalf of Dmitriy Rabotyagov: [openstack/project-config] 974773: Properly re-retire OSA Monasca roles https://review.opendev.org/c/openstack/project-config/+/97477309:55
-@gerrit:opendev.org- Zuul merged on behalf of Dr. Jens Harbott: [openstack/project-config] 984220: Drop *-wheel-cache-ubuntu-bionic jobs https://review.opendev.org/c/openstack/project-config/+/98422010:20
-@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 983860: Add entries for mirror03.ord.rax https://review.opendev.org/c/opendev/zone-opendev.org/+/98386013:03
-@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 983860: Add entries for mirror03.ord.rax https://review.opendev.org/c/opendev/zone-opendev.org/+/98386013:05
-@gerrit:opendev.org- Michal Nasiadka marked as active: [opendev/zone-opendev.org] 983860: Add entries for mirror03.ord.rax https://review.opendev.org/c/opendev/zone-opendev.org/+/98386013:05
-@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 984030: Replace whitespaces with tabs https://review.opendev.org/c/opendev/zone-opendev.org/+/98403013:05
-@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 984030: Replace whitespaces with tabs https://review.opendev.org/c/opendev/zone-opendev.org/+/98403013:07
-@gerrit:opendev.org- Michal Nasiadka marked as active: [opendev/zone-opendev.org] 984030: Replace whitespaces with tabs https://review.opendev.org/c/opendev/zone-opendev.org/+/98403013:07
@clarkb:matrix.orgI am awake and my internet connection seems to be working. I'll pop back in in about 3 hours or so so that we can start on the prep work for the Gerrit upgrade15:56
@fungicide:matrix.orgyeah, everything seems fine16:52
@fungicide:matrix.orgin about an hour i'll do steps 5 and 6 from https://etherpad.opendev.org/p/gerrit-upgrade-3.1218:04
@clarkb:matrix.orgthanks. I'm figuring out an early lunch now18:15
@clarkb:matrix.organd do you want to do a meetpad call (can call it ptg prep/testing) or just coordinate here?18:17
@fungicide:matrix.orghere's fine with me, but whatever you prefer18:30
@fungicide:matrix.orgi just finished cooking/eating an early dinner and am working here and there on replacing a bathroom faucet/drain18:31
@fungicide:matrix.orgwith a bit of laundry thrown in for good measure... typical sunday18:31
@clarkb:matrix.orgone of the kids woke up sick in the middle of the night so we're having a very lazy Sunday on our end. Up until we start doing the upgrade then I won't feel lazy anymore18:51
@fungicide:matrix.orgno worries. i've got the pre-20z bits covered and then we can divvy up the rest when the time comes18:52
@fungicide:matrix.org#status notice Gerrit on review.opendev.org is being upgraded to version 3.12 and will be offline starting at 2000 UTC. We have allocated an hour for the outage window lasting until 2100 UTC.18:59
@status:opendev.org@fungicide:matrix.org: sending notice18:59
@fungicide:matrix.orgemergency file has the hosts from step #6 included19:01
@fungicide:matrix.orgi've also started a root screen session on review03 and enabled logging, per step #819:02
-@status:opendev.org- NOTICE: Gerrit on review.opendev.org is being upgraded to version 3.12 and will be offline starting at 2000 UTC. We have allocated an hour for the outage window lasting until 2100 UTC.19:02
@status:opendev.org@fungicide:matrix.org: finished sending notice19:02
@clarkb:matrix.orgI've joined the screen. I see you've got the docker-compose down command ready to go.19:07
@fungicide:matrix.orgyeah, just one less thing to do when we start i guess19:08
@clarkb:matrix.orgyup no reason not to get that ready now19:08
@clarkb:matrix.orgI've jumped on zuul01 and can pause the zuul queues concurrently with the gerrit stuff if you like.19:10
@clarkb:matrix.orgbut that is all still ~50 minutes from now19:10
@fungicide:matrix.orgthat'll be great19:11
@clarkb:matrix.orgI also have a change I can push up at the end19:14
@fungicide:matrix.orgeven better19:17
@clarkb:matrix.orgfungi: if you send the next status notice I'll pause zuul at that time19:53
@clarkb:matrix.orgin ~6 minutes19:53
@fungicide:matrix.orgwill do!19:53
@fungicide:matrix.org#status notice Gerrit on review.opendev.org is being upgraded to version 3.12 and will be offline momentarily. We have allocated an hour for the outage window lasting until 2100 UTC.19:59
@status:opendev.org@fungicide:matrix.org: sending notice19:59
@clarkb:matrix.orgZuul is paused now too and the web ui is showing the expected banner19:59
@fungicide:matrix.orggo ahead and pause zuul, once statusbot returns we can shut down gerrit19:59
@fungicide:matrix.orgperfect19:59
@clarkb:matrix.orgyup all done19:59
-@status:opendev.org- NOTICE: Gerrit on review.opendev.org is being upgraded to version 3.12 and will be offline momentarily. We have allocated an hour for the outage window lasting until 2100 UTC.20:01
@status:opendev.org@fungicide:matrix.org: finished sending notice20:01
@fungicide:matrix.orgstopping gerrit now, then starting mariadb again20:02
@clarkb:matrix.orgwhat is the over under on this. I'll say 150 seconds20:02
@fungicide:matrix.orgroughly20:02
@fungicide:matrix.orgi give it 18020:02
@clarkb:matrix.orgwe were both too optimistic20:05
@fungicide:matrix.orgnow it's exceeded even my guess, yes20:05
@fungicide:matrix.org4 minutes now, maybe we'll get to exercise the timeout increase20:06
@clarkb:matrix.orgmaybe. It moved from gerrit_file_diff to git_file_diff per the lock files20:06
@fungicide:matrix.orgstill going after 5 minutes!20:07
@clarkb:matrix.orgI think once it is done with git_file_diff it should complete quickly20:07
@clarkb:matrix.orgthe best part is we're going to delete/move these files aside afterwards too20:08
@fungicide:matrix.org398.4s20:08
@fungicide:matrix.orgmariadb is started again20:09
@clarkb:matrix.orgI think we can proceed with the backup command20:09
@fungicide:matrix.orgbackup in progress20:09
@clarkb:matrix.orglooks like we can stop backing up /var/lib/containers20:10
@fungicide:matrix.orgbackup done, stopping mariadb yet again now20:10
@clarkb:matrix.orgper the log that seems to have completed successfully. fs backup reported 1 which is warning usually due to something changing and mariadb reported 020:10
@clarkb:matrix.orgthis step will take a few minutes20:11
@fungicide:matrix.orgfor step #17 we're just doing the commands from sub-steps #17.4-17.5?20:14
@fungicide:matrix.org(when we get there in a few minutes)20:14
@clarkb:matrix.orgfungi: yes at a minimum those steps. I noted we could delete some of the caches. But it's a same-fs mv so should be fast and I don't think we need to preclean anything20:14
@fungicide:matrix.orgokay, i'm good just sticking with those fwiw20:15
@clarkb:matrix.orgwfm20:15
@clarkb:matrix.orgthere is plenty of disk space too20:15
@fungicide:matrix.orgcompose file is updated now20:16
@clarkb:matrix.orgyup I think we can pull next20:16
@fungicide:matrix.orgin progress20:16
@fungicide:matrix.org`gerrit Skipped - Image is already being pulled by shell`20:16
@fungicide:matrix.orgis that expected?20:17
@clarkb:matrix.orgyes20:17
@clarkb:matrix.orgshell and gerrit are the same image so it pulls it once and for whatever reason "shell" wins20:17
@fungicide:matrix.orgah20:17
@clarkb:matrix.orgthe docker image ls and then inspect should confirm we pulled it properly20:17
@clarkb:matrix.orgnote this command needs editing20:17
@clarkb:matrix.orgok that is unexpected20:18
@fungicide:matrix.org`API version 1.41 is not supported by this client: the minimum supported API version is 1.44`20:18
@clarkb:matrix.orgthis is due to using podman not actual docker20:18
@clarkb:matrix.orgcan you try docker compose image ls ?20:18
@clarkb:matrix.orgor podman image ls20:18
@fungicide:matrix.orgdid the latter20:18
@fungicide:matrix.org`quay.io/opendevorg/gerrit@sha256:52742a46a266e076f6c2b94b5af69477977dc41e47c205e7cb479b7c8d75e732`20:19
@fungicide:matrix.orgthat seems to match what we've pulled20:19
@clarkb:matrix.orgyup those hashes match what I see20:19
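The digest comparison described above could be scripted roughly as follows. This is only a sketch: the helper names `image_digest` and `same_image` are hypothetical, and `expected` is a stand-in for whatever digest the upgrade plan recorded (here it simply reuses the reference quoted in the chat).

```python
def image_digest(reference):
    """Return the 'sha256:...' part of a 'name@digest' reference, or None."""
    _name, separator, digest = reference.partition("@")
    return digest if separator else None


def same_image(ref_a, ref_b):
    """True only when both references carry a digest and the digests match."""
    digest_a = image_digest(ref_a)
    digest_b = image_digest(ref_b)
    return digest_a is not None and digest_a == digest_b


# Reference as reported in the chat above.
pulled = (
    "quay.io/opendevorg/gerrit@sha256:"
    "52742a46a266e076f6c2b94b5af69477977dc41e47c205e7cb479b7c8d75e732"
)
# Assumption: stand-in for the digest recorded ahead of time in the plan.
expected = pulled

print(same_image(pulled, expected))
```

Comparing digests rather than tags avoids being fooled by a tag that was moved between the test run and the real pull.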
@clarkb:matrix.orgI think we're good (other than docker updating, but we seem to be ok with docker compose and podman so I think it's ok)20:19
@clarkb:matrix.org(but also yet another reason to be grumpy about docker I guess)20:20
@fungicide:matrix.orgyeah, i used `podman inspect` too20:20
@clarkb:matrix.orgI think we can proceed and figure that out later since podman is working20:20
@fungicide:matrix.orgstarting mariadb again now20:20
@clarkb:matrix.orgthat does mean some services that use docker now will need to be updated to use podman (I don't believe gerrit is one of them but gitea may be?) again something to sort out later20:20
@fungicide:matrix.orgready for me to `gerrit init`?20:21
@clarkb:matrix.orggo for gerrit init20:21
@clarkb:matrix.orgthat lgtm20:21
@fungicide:matrix.orgthe warnings seem to be the same as recorded in the pad20:21
@clarkb:matrix.orgyup exactly20:22
@clarkb:matrix.orgI think we can proceed with starting gerrit then checking the post start things20:22
@fungicide:matrix.orgon its way up now20:22
@clarkb:matrix.org`[2026-04-12T20:22:59.821Z] [main] INFO  com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.12.6-dirty ready` is in the log20:23
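A minimal sketch of how the readiness check on that log line might be automated, assuming the message format is exactly as quoted above. The function name `gerrit_ready_version` is hypothetical and not part of Gerrit or any OpenDev tooling.

```python
import re

# Pattern modeled on the single log line quoted above; other Gerrit
# releases may word this message differently (assumption).
READY_RE = re.compile(r"Gerrit Code Review (\S+) ready")


def gerrit_ready_version(log_lines):
    """Return the version from the first 'ready' line, or None if absent."""
    for line in log_lines:
        match = READY_RE.search(line)
        if match:
            return match.group(1)
    return None


sample = (
    "[2026-04-12T20:22:59.821Z] [main] INFO  "
    "com.google.gerrit.pgm.Daemon : Gerrit Code Review 3.12.6-dirty ready"
)
print(gerrit_ready_version([sample]))
```

Returning the version string rather than a boolean makes it possible to confirm that the daemon came up on the intended release, not just that it started.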
@clarkb:matrix.orgweb ui loads for me. As expected I'm logged out. Will log in now20:23
@fungicide:matrix.org`Reindexing changes`20:24
@fungicide:matrix.orgit's at 5% done. do we want to wait for reindexing to complete before unpausing zuul?20:25
@clarkb:matrix.orgno I think we can unpause zuul. I was able to log in20:25
@clarkb:matrix.orgI'll unpause zuul now20:25
@clarkb:matrix.orgdone20:26
@clarkb:matrix.orgfungi: I dropped my WIP from https://review.opendev.org/c/opendev/system-config/+/98347420:27
@clarkb:matrix.organd I'm going to push that change now20:27
@fungicide:matrix.orggerrit logs look typical, h2 cache files are there, config diff is empty...20:27
-@gerrit:opendev.org- Clark Boylan proposed: [opendev/git-review] 984279: Add py313 testing against Gerrit 3.12 https://review.opendev.org/c/opendev/git-review/+/98427920:27
@clarkb:matrix.orghttps://review.opendev.org/c/opendev/git-review/+/984279 lgtm and it replicated https://opendev.org/opendev/git-review/commit/36cc0f0b2ee9ace29da3eac683fe08ca9e2c5cda20:28
@clarkb:matrix.orgre config diff that is great. It was expected to be an empty diff this time20:28
@fungicide:matrix.orgother than exercising typical interactions, we're just waiting for the reindexing to complete now before closing out the screen session, merging 983474 and cleaning up the emergency file20:29
@clarkb:matrix.orgI'm not seeing zuul see events yet20:29
@clarkb:matrix.orgI seem to recall we may have run into this last time too? As it's processing the backlog of events from ssh stream?20:30
@fungicide:matrix.orgmaybe it hasn't reconnected?20:30
@fungicide:matrix.orgoh, or that, sure20:30
@fungicide:matrix.orglooks like we also want to restart zuul schedulers and web?20:30
@fungicide:matrix.orgonce things reach a steady state20:31
@clarkb:matrix.orgyes, but that should only be to sync up the version info which shouldn't be necessary here20:31
@fungicide:matrix.orgokay, so no zuul behavior changes we need for 3.12 then20:31
@clarkb:matrix.orgI don't think so. I'm tailing both scheduler debug logs now as I think only one listens to the ssh event stream at a time and I'm not sure which it is20:31
@fungicide:matrix.orgsince we explicitly enabled robot comments support20:31
@fungicide:matrix.orgdon't have to worry about detecting that until 3.13?20:32
@clarkb:matrix.org`2026-04-12 20:32:04,078 DEBUG zuul.GerritConnection.ssh: Received data from Gerrit event stream` <- just saw this on zuul01 so I think it is starting to read that data20:32
@clarkb:matrix.orgfungi: yup when we upgrade to 3.13 the zuul scheduler and web restarts will be a more critical part of the puzzle20:32
@clarkb:matrix.orgok zuul processed my test change and it is failing on a zuul config issue. I also rechecked it and it reprocessed and reported back -1 again. I'll work on fixing it so we can see more of it in action20:34
-@gerrit:opendev.org- Clark Boylan proposed: [opendev/git-review] 984279: Add py313 testing against Gerrit 3.12 https://review.opendev.org/c/opendev/git-review/+/98427920:36
@clarkb:matrix.orghrm why is label debian-trixie not found20:36
@fungicide:matrix.orgis zuul still rereading configs?20:37
@clarkb:matrix.orgI don't think so20:37
@clarkb:matrix.orgit hasn't been restarted yet so shouldn't be rereading any configs20:37
@clarkb:matrix.orgoh I know20:38
@clarkb:matrix.orgit's because trixie doesn't have the generic label20:38
-@gerrit:opendev.org- Clark Boylan proposed: [opendev/git-review] 984279: Add py313 testing against Gerrit 3.12 https://review.opendev.org/c/opendev/git-review/+/98427920:39
@clarkb:matrix.orgok that fixed it. Jobs are queued up now20:39
@clarkb:matrix.orgin the opendev tenant (not openstack) for git-review20:39
@clarkb:matrix.orgfungi: I marked off the items in step 26 because I believe with that latest patchset I have validated each of them now20:40
@clarkb:matrix.orgfungi: do you want to strikeout the steps you did like config comparison and the startup process etc?20:40
@clarkb:matrix.orgwe are about halfway through reindexing20:42
@clarkb:matrix.orgfungi: anyway I think the issues with zuul were just slowness after the pause then me failing to write a proper change for zuul config. Now that that is sorted it all looks good to me. I think we're just waiting for reindexing so that we can proceed20:43
@fungicide:matrix.orgmakes sense. also +2 on the change20:44
@clarkb:matrix.orgI'm also monitoring the h2 cache db files and they are growing which is expected due to the reindexing. After reindexing I would expect the growth to become less crazy20:44
@clarkb:matrix.orgfungi: I think gating that change may take about an hour. Should we go ahead and approve it now?20:44
@fungicide:matrix.orgdone20:44
@fungicide:matrix.orgshould i also approve 983474 now or wait for reindexing to complete?20:45
@clarkb:matrix.orgfungi: that's the one I thought you approved :)20:47
@clarkb:matrix.orglooks like you approved the git-review change. I think its fine to approve both of them at this point20:47
@clarkb:matrix.orgwe are accumulating a number of futures to reindex the things that have changed during reindexing. But the general trend on the queue size is downward so I think we're ok20:48
@clarkb:matrix.orglooks like git-review is failing linters now. I think we can ignore that one as it did its job for checking replication and recheck and all that. Instead let's focus on 983474 and the rest of our tasks on the etherpad20:48
@fungicide:matrix.orgokay, both are approved now20:51
@fungicide:matrix.orgi had thought you were referring to the git-review change you'd just pushed initially20:51
@clarkb:matrix.orgya no worries. i think both things are fine to approve (though git-review doesn't appear to be in a mergeable state)20:52
@fungicide:matrix.orgagreed20:52
@clarkb:matrix.orgdoing a quick grep against system-config we seem to primarily use `docker` for docker image pruning at this point. We do also use it to check some things in testing and to run a few commands here and there (like for registry garbage collection)20:56
@clarkb:matrix.orgbut overall I think the impact of docker the command not speaking to podman the runtime is fairly low at the moment20:57
@clarkb:matrix.orgso I'm going to try and not worry about it until later :)20:57
@fungicide:matrix.orgsounds like a monday problem, not a sunday problem20:57
@clarkb:matrix.orgright20:57
@clarkb:matrix.orgreindexing changes completed with the expected 3 failures21:00
@clarkb:matrix.organd the task list is basically empty so I think this portion of the upgrade is done21:00
@clarkb:matrix.orgoh this is interesting. I thought we might be running upstream docker which would explain this issue. But no, noble-updates updated docker. So we've got https://packages.ubuntu.com/noble-updates/docker.io now instead of https://packages.ubuntu.com/noble/docker.io so we did the right thing and used the distro packaged version so that it would be in "sync" with podman but apparently that wasn't enough21:02
@clarkb:matrix.orgI wonder if we should file a bug about that21:03
@clarkb:matrix.orgfungi: I detached from the screen if you want to shut it down and preserve the log file21:03
@clarkb:matrix.orgI will work on restarting zuul web and scheduler services now21:04
@fungicide:matrix.orgscreen session closed down and log moved per step #2921:06
@fungicide:matrix.orgi guess we can't clean up at least the review03 entry in the emergency file until 983474 merges21:07
@clarkb:matrix.org++21:07
@fungicide:matrix.orgor shouldn't anyway21:07
@clarkb:matrix.orgwell i think we can as long as we put it back if that change does not merge21:07
@clarkb:matrix.orgI've got zuul01's web restarting and zuul02's scheduler restarting. I will do the services the other way around once these two are done21:08
@clarkb:matrix.orgcomponents list reports those two restarts are done. I'll give them a minute before I do the next round21:12
@clarkb:matrix.orgok the last zuul restarts are in progress now21:17
@fungicide:matrix.orgcool21:21
@clarkb:matrix.organd as expected the h2 v2 cache file sizes seem stable after the reindexing21:22
@fungicide:matrix.orgexcellent21:22
@clarkb:matrix.organd now zuul reports all its components are back up and running21:22
@clarkb:matrix.orgso I think we're just waiting for this change to merge and we can remove the hosts from the emergency file and check that it runs and doesn't update our docker compose file21:23
-@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/system-config] 983474: Bump Gerrit to 3.12 https://review.opendev.org/c/opendev/system-config/+/98347421:24
@fungicide:matrix.orgthat was faster than expected21:25
@clarkb:matrix.orgdid we remove the emergency entries quickly enough?21:27
@clarkb:matrix.orgif not we can let this finish, remove the entries then reenqueue it21:27
@clarkb:matrix.orglooks like we did not clean up the emergency file quickly enough. So let's let this run then clean up emergency file and reenqueue21:28
@clarkb:matrix.organd yes that caught me by surprise I had gotten up for a few minutes thinking I had half an hour21:28
@clarkb:matrix.orgfungi: ok that's done. Does the plan above sound good to you?21:30
@clarkb:matrix.orgthat's done == the buildset ran with all the hosts in the emergency file. So now we can clean up the emergency file and reenqueue it21:30
@clarkb:matrix.orgI've logged into zuul and am ready to click the reenqueue buildset button on https://zuul.opendev.org/t/openstack/buildset/8bfc02126fe8498488690b0f2dbf078c if that buildset looks right and you're happy with the plan21:34
@fungicide:matrix.orgi didn't realize we wanted the emergency file cleaned up before that change merged21:34
@clarkb:matrix.orgfungi: I don't think it matters too much since we can reenqueue it. I think the goal is to land the change and ensure we run the infra-prod-service-review job and check docker-compose.yaml has 3.12 in it afterwards (it should noop essentially)21:35
@clarkb:matrix.orgthen we'll be in sync with what is on review03 and in ansible21:35
@clarkb:matrix.orgfungi: if you want to clean up the emergency file I can click that button21:35
@clarkb:matrix.org(or I can do both if it is easier)21:35
@fungicide:matrix.orgokay, entries i added to the emergency file have been removed now21:35
@clarkb:matrix.orgexcellent. Does the buildset I linked look correct to you?21:36
@fungicide:matrix.orgyes, looks like the buildset from 98347421:36
@clarkb:matrix.orgcool clicking the button to reenqueue momentarily21:37
@fungicide:matrix.orgi guess the pad wasn't clear that we wanted the emergency files cleaned up before 983474 deployed, since it had those steps in the opposite order21:38
@clarkb:matrix.org`Apr 12 20:15 /etc/gerrit-compose/docker-compose.yaml` is the current timestamp on the file21:38
@clarkb:matrix.orgfungi: ya I think the idea is to try and convey that we want to remove it and have that change apply and check idempotency. Happy for suggested edits on the etherpad to try and make that more clear for next time21:38
@clarkb:matrix.organd it's not a big deal we just do what we're doing if we miss the window of opportunity21:38
@fungicide:matrix.orgprobably what we did was the most reliable option, since that way we don't have to worry about racing periodic jobs21:41
@clarkb:matrix.orgthats true. This was a bit slower but less hectic and more controlled21:41
@clarkb:matrix.orgthe job reports success and the timestamp on that file hasn't changed21:41
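The idempotency check described here (the deployment job should leave the compose file untouched) can be sketched as a before/after comparison of the file's modification time. This is an illustration under assumptions: the helper names are hypothetical, and a temporary file stands in for `/etc/gerrit-compose/docker-compose.yaml`.

```python
import os
import tempfile


def snapshot_mtime(path):
    """Record the file's modification time in nanoseconds."""
    return os.stat(path).st_mtime_ns


def unchanged_since(path, recorded_mtime_ns):
    """True if the file has not been rewritten since the snapshot."""
    return os.stat(path).st_mtime_ns == recorded_mtime_ns


# Demonstration on a temporary file standing in for the compose file.
with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as handle:
    handle.write("# placeholder content\n")
    demo_path = handle.name

before = snapshot_mtime(demo_path)
# ... the deployment job would run here ...
print(unchanged_since(demo_path, before))
os.unlink(demo_path)
```

An unchanged timestamp only shows the file was not rewritten; confirming that it actually contains the 3.12 image reference is a separate check.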
@clarkb:matrix.orgthis is interesting the size of one of those large h2 v2 caches has slightly decreased21:42
@clarkb:matrix.orgmaybe that is a good sign. i'm going to take it as a good omen21:42
@clarkb:matrix.orghttps://zuul.opendev.org/t/openstack/buildset/28dc8bbfba5442c58a56f6ed3903f62e the rest of the buildset was also a success21:46
@clarkb:matrix.orgfungi: I think that is the end of the upgrade tasks proper21:47
@clarkb:matrix.orgfungi: I'm happy for you to double check the assertions I made above about the docker-compose file and the buildset etc. Or not. But I think we're done? Anything else you want to see done?21:47
@fungicide:matrix.orgno, checking back over everything i don't have any remaining concerns21:48
@clarkb:matrix.orgcool I guess we can catch up on any followup cleanups (like old cache removals, figuring out docker, dropping old images, etc etc) over the next week21:49
@clarkb:matrix.orgfungi: thank you for all the help. We're now one release behind again :) but that will only last for a few weeks. but that is ok we'll catch up to 3.13 soon enough21:50
@fungicide:matrix.orgthanks for putting together and testing such a solid plan!21:52
