| Nick | Message | Time |
|---|---|---|
| Clark[m] | Looks like it passed so that is working | 00:25 |
| tkajinam | https://review.opendev.org/c/openstack/zaqar/+/963027?tab=change-view-tab-header-zuul-results-summary | 02:55 |
| tkajinam | I wonder if some maintenance is ongoing? | 02:55 |
| tonyb | tkajinam: Not that I'm aware of | 03:36 |
| tonyb | let me poke in the logs | 03:36 |
| tkajinam | tonyb, thx ! | 04:26 |
| tonyb | tkajinam: Sorry, the reason for the failure is beyond my zuul knowledge. | 04:28 |
| tonyb | It looks like ultimately it comes down to: | 04:28 |
| tonyb | 2025-10-04 02:56:22,582 ERROR zuul.Launcher: [e: 6c2aa68a3b4f4952a2035cfc6f8795a5] [req: 3adb2de70cb64298896a80d3d070bf0f] Exception loading ZKObject at /zuul/nodeset/requests/3adb2de70cb64298896a80d3d070bf0f/revision | 04:28 |
| tkajinam | it seems the frequent failures started at 2025-10-04 02:13:24 | 04:28 |
| tkajinam | hmm wait probably even earlier | 04:28 |
| tkajinam | tonyb, np ! | 04:28 |
| tkajinam | 2025-10-04 00:41:12 | 04:29 |
| tonyb | but I really don't know what would cause ZK to fail to have those objects | 04:29 |
| tkajinam | https://zuul.opendev.org/t/openstack/builds?skip=1650 no node failure here | 04:29 |
| tkajinam | https://zuul.opendev.org/t/openstack/builds?skip=1600 it started here | 04:29 |
| tonyb | Wow just skip a cool 1650 builds ;P | 04:30 |
| tkajinam | we may have to skip further a few hours later :-P | 04:30 |
| tkajinam | maybe the zookeeper cluster is malfunctioning, but that's not something I'm familiar with | 04:31 |
| tkajinam | that timestamp should help identify the problem later | 04:31 |
| tonyb | I'll keep poking | 04:31 |
| tkajinam | https://zuul.opendev.org/t/openstack/build/90bb41a0fa1540bb964ee0e1b539c088 is the "first one", for the record | 04:31 |
| tonyb | https://grafana.openstack.org/d/21a6e53ea4/zuul-status?orgId=1&from=2025-10-04T00:20:00.000Z&to=2025-10-04T01:20:00.000Z&timezone=utc is also interesting | 04:35 |
| tkajinam | yeah | 04:42 |
| tonyb | I'm out of ideas; the zk cluster seems healthy and graphite says very little that's helpful. We'll need to wait for another infra-root for help | 05:20 |
| frickler | it looks like there may have been some temporary incompatibility between executors and launchers/schedulers. seems the issue resolved itself when the latter were upgraded 2h ago. I re-enqueued the reqs periodic-weekly jobs to verify | 11:00 |
| tonyb | hmm I avoided restarting things as I didn't want to confuse any debugging. | 12:10 |
| fungi | yeah, when things crop up with zuul in the early hours of a saturday utc, suspect something related to our automated zuul upgrades since that's when they kick off | 14:54 |
| fungi | i don't think zuul upstream does any testing with mismatched versions of components, so i guess it's not too surprising to occasionally have a change merge that assumes it is applied simultaneously to multiple components | 14:55 |
| Clark[m] | There are specific upgrade tests that test compatibility between mismatched versions. But that requires expecting problems and having test cases in advance | 15:32 |
| corvus | we do perform testing with mismatched components, but not so much for niz | 15:33 |
| fungi | which makes sense as it's still basically in beta | 16:50 |
| corvus | i'd call it alpha :) | 17:01 |
| fungi | wfm | 17:24 |
| fungi | aleph | 17:25 |
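
For the launcher traceback tonyb quotes at 04:28 ("Exception loading ZKObject at /zuul/nodeset/requests/…"), one way to dig further is to ask ZooKeeper directly whether that znode still exists. A minimal sketch using the kazoo client; the host list is a placeholder (not the real opendev ZooKeeper endpoints, which are not reachable like this), and the path is copied from the quoted log line.

```python
# Minimal sketch: check whether the nodeset-request znode from the launcher
# error still exists. ZK_HOSTS is a placeholder assumption, not a real cluster.
from kazoo.client import KazooClient

ZK_HOSTS = "zk01.example.org:2181,zk02.example.org:2181,zk03.example.org:2181"
PATH = "/zuul/nodeset/requests/3adb2de70cb64298896a80d3d070bf0f/revision"

zk = KazooClient(hosts=ZK_HOSTS, read_only=True)
zk.start(timeout=10)
try:
    stat = zk.exists(PATH)
    if stat is None:
        print(f"{PATH} is gone (consistent with the launcher error)")
    else:
        data, stat = zk.get(PATH)
        print(f"{PATH}: version={stat.version}, {len(data)} bytes")
finally:
    zk.stop()
    zk.close()
```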
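The build listings tkajinam pages through at 04:29 (`?skip=1650`, `?skip=1600`) can also be pulled from Zuul's REST API, which makes it easier to bracket when the NODE_FAILURE results started. A rough sketch against the public zuul.opendev.org API; the filter parameter names are assumptions based on what the web UI sends.

```python
# Rough sketch: list recent NODE_FAILURE builds in the openstack tenant via
# the Zuul REST API to bracket when the failures began. Filter names are
# assumed to mirror the web UI's query parameters.
import requests

API = "https://zuul.opendev.org/api/tenant/openstack/builds"
params = {"result": "NODE_FAILURE", "limit": 50, "skip": 0}

resp = requests.get(API, params=params, timeout=30)
resp.raise_for_status()
for build in resp.json():
    print(build["end_time"], build["project"], build["job_name"], build["result"])
```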
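On tonyb's "the zk cluster seems healthy" check at 05:20: ZooKeeper's four-letter-word commands (`ruok`, `mntr`) are a quick way to confirm each member is up and see which one is the leader. A sketch over a plain socket, assuming placeholder hostnames, the default 2181 plaintext port, and that the commands are whitelisted via `4lw.commands.whitelist` on the servers.

```python
# Sketch: poke each ZooKeeper member with the four-letter-word commands
# "ruok" (liveness) and "mntr" (stats, including leader/follower state).
# Hostnames and the plaintext 2181 port are placeholder assumptions.
import socket

SERVERS = ["zk01.example.org", "zk02.example.org", "zk03.example.org"]

def four_letter(host: str, cmd: bytes, port: int = 2181) -> str:
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(cmd)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode()

for host in SERVERS:
    print(host, "ruok ->", four_letter(host, b"ruok"))  # healthy servers answer "imok"
    state = [line for line in four_letter(host, b"mntr").splitlines()
             if line.startswith("zk_server_state")]
    print(host, state[0] if state else "mntr returned no state")
```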