Saturday, 2026-02-21

@harbott.osism.tech:regio.chat corvus: looks like we have multiple jobs stuck in queued state, likely after the zuul updates earlier today. like one in the openstack gate pipeline and one in periodic weekly, maybe you can take a closer look? 15:30
@harbott.osism.tech:regio.chat https://zuul.opendev.org/t/openstack/buildset/613857b0af394833b6a53ad8d494346c and https://zuul.opendev.org/t/openstack/buildset/aafb769b692c45239da7de452c1958e4 are the affected buildsets, just in case they get cleaned up somehow 15:36
@harbott.osism.tech:regio.chat schedulers were restarted ~ 2h ago, so well after those buildsets started 15:46
@harbott.osism.tech:regio.chat not sure if related, this infra-prod-service-zuul run failed at 12:00 since two mergers were found to be unreachable https://zuul.opendev.org/t/openstack/build/e10d1b49e67f416abd8da5458fd0378f 15:47
@harbott.osism.tech:regio.chat all component versions show `14.0.1.dev10 fa91e4b3a` which looks like the expected current master version 15:55
@harbott.osism.tech:regio.chat nothing that looks obvious to me in the scheduler logs, lots of zk errors but those seem to have been there before 16:05
@jim:acmegating.com Jens Harbott: it looks like neither rax-ord nor rax-iad is operating as expected, with hundreds of servers stuck in the openstack deleting state. i recommend taking those cloud regions out of service. 16:35
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/zuul-providers] 977548: Disable rax-ord and rax-iad https://review.opendev.org/c/opendev/zuul-providers/+/977548 16:44
@jim:acmegating.com infra-root: something like that ^ 16:44
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [opendev/zuul-providers] 977548: Disable rax-ord and rax-iad https://review.opendev.org/c/opendev/zuul-providers/+/977548 17:21
@harbott.osism.tech:regio.chat seems the launchers are getting `KeyError: 'servers'` when listing servers for that region. not sure yet if things are broken on the rax end or whether a fresh sdk update is hitting us, will try manually from bridge 17:23
@harbott.osism.tech:regio.chat `openstack server list` for rax-ord and rax-iad seems to be working fine and fast, so that makes the sdk or something else in the current image the suspect for me 17:26
@harbott.osism.tech:regio.chat corvus: do you expect the stuck buildsets to recover or should we dequeue them? 17:30
@harbott.osism.tech:regio.chat completely unrelated: wasn't it possible earlier to horizontally scroll long log lines in the "Task Summary" tab like e.g. for https://zuul.opendev.org/t/openstack/build/7e9365f398fa4ccd8bc86eaf35fac006 ? currently that doesn't work for me with either firefox or chromium 17:39
@jim:acmegating.com i think there is a good chance of self-recovery, but dequeuing builds should make it happen faster 17:44
@harbott.osism.tech:regio.chat well it is the weekend, so I'd be willing to give it a chance until tomorrow 17:46
@harbott.osism.tech:regio.chat looks like things are proceeding now at least for some of the older buildsets, like https://zuul.opendev.org/t/openstack/buildset/75f190d8968a4462969690ab674e52da , so there is some hope 17:51
although I note I also cannot horizontally scroll for the timeline of this buildset
@harbott.osism.tech:regio.chat it does look a bit however like most recent requests are getting served faster than the old pending ones, so maybe they'll have to wait until we have less load 18:09
@harbott.osism.tech:regio.chat corvus: looking at grafana, there are 42 nodes in "Requested" state for rax-ord, which don't seem to have moved since about 15:00, well before we disabled the region. so maybe there is some other issue at work there after all? 22:22
