| Nick | Message | Time |
|---|---|---|
| opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/957995 | 02:08 |
| daeun[m] | hi all | 12:25 |
| gibi | hi infra folks. It seems we have a traffic jam on the check-arm64 pipeline. Is it a known situation? It seems to be holding up otherwise legitimate, green patches from going to the gate | 14:46 |
| clarkb | gibi: the arm64 pipeline should not impact your ability to gate | 16:16 |
| clarkb | that is why it is a separate pipeline | 16:16 |
| clarkb | check votes +1 which allow you to gate. check-arm64 does not | 16:17 |
| clarkb | as for the traffic jam those resources are very limited and any time we have a huge pipe of demand this will happen | 16:17 |
| clarkb | https://grafana.opendev.org/d/2c6f499090/zuul-launcher3a-osuosl?orgId=1&from=now-24h&to=now&timezone=utc&var-region=$__all you can see on this dashboard that we're basically using our full quota all the time right now | 16:17 |
| clarkb | the solutions to that problem are either to A) find more arm64 resources or B) run fewer arm64 jobs | 16:18 |
| clarkb | if you have changes that are not gating but you expect them to be we can look closer but check-arm64 should not be the cause of that | 16:20 |
| gibi | clarkb: thanks. Then I don't know why this is not on the gate now https://review.opendev.org/c/openstack/nova/+/951640 | 17:29 |
| gibi | I thought it is not on the gate as it is still in the arm pipeline | 17:30 |
| gibi | but then there should be some other reason | 17:30 |
| gibi | it passed the check pipeline and it has two +2 and +W | 17:31 |
| clarkb | gibi: if you look at that change its parent is not current and not merged | 18:01 |
| clarkb | at least according to gerrit there. https://opendev.org/openstack/nova/commit/43d57ae63d1ecda24d8707b4750d404daadc980f seems to be the parent which is merged | 18:02 |
| clarkb | I wonder if we restarted gerrit on july 1 when that patchset was pushed and it didn't get an updated parent in the index | 18:03 |
| clarkb | gibi: basically when you approve a change in gerrit, zuul asks gerrit whether the change would be mergeable if tests pass. I think what has happened here is gerrit said "no this isn't mergeable because the parent is not current" but I'm going to see if I can confirm with the zuul logs | 18:05 |
| clarkb | looking at zuul scheduler logs I see the log message for where zuul reports Verified +1 at 12:18 UTC on the 29th. That should then cause Gerrit to emit a comment-added event for the new Verified +1 vote. But I haven't been able to find this comment added event which would enqueue things to the gate | 18:20 |
| clarkb | that change id and change number do not show up in the gerrit error log either | 18:23 |
| clarkb | I understand the parent thing now too. The parent is for a merge commit so there is no change belonging to that sha1. However, that hash shows up in https://review.opendev.org/c/openstack/nova-specs/+/951689/'s comment history so gerrit finds it and makes you think it is the parent | 18:24 |
| clarkb | I think that is a distraction (it's a confusing gerrit behavior but I don't think the indexing assumption earlier is correct) | 18:24 |
| clarkb | instead I think that zuul simply never received the comment-added event for the Verified +1 vote after your recheck, which is what would enqueue things to the gate | 18:24 |
| clarkb | gibi: sean-k-mooney gmaan stephenfin if you are still around can you try applying a new workflow +1 approval vote to https://review.opendev.org/c/openstack/nova/+/951640 to see if that event gets emitted and it enqueues to the gate? | 18:25 |
| gmaan | clarkb: done | 18:26 |
| clarkb | that looks like it worked. So I don't think there was anything inherently wrong with the change. Instead for some reason either gerrit didn't emit the event or zuul didn't read it (maybe there was a connection drop and the event was sent during that period. I haven't found that in the logs yet but could be the case) | 18:27 |
| gmaan | yeah https://zuul.opendev.org/t/openstack/status?change=951640%2C3&pipeline=gate | 18:27 |
| clarkb | I'll leave a note for corvus in #opendev in case he wants to follow up on this, but I'm quickly running out of leads in the logs | 18:28 |
| clarkb | I think I found it 2025-08-29 12:19:56,430 ERROR zuul.GerritConnection.ssh: kazoo.exceptions.ConnectionLoss | 18:36 |
| clarkb | the gerrit ssh connection, which is where we get the event stream, lost its zookeeper connectivity when trying to add an event to zookeeper. Presumably it was this particular event | 18:37 |
| clarkb | thank you for pointing this out I think there may be some improvement we can make to zuul around this | 18:40 |
| | *** gmaan is now known as gmaan_afk | 19:30 |
| | *** gmaan_afk is now known as gmaan | 23:20 |
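
The failure clarkb describes at 18:36 (an event dropped because the ZooKeeper connection was lost while the event was being stored) can be sketched in a few lines. This is only an illustration of the general retry-instead-of-drop idea, not Zuul's actual code or the improvement clarkb has in mind. `ConnectionLoss` here is a local stand-in for `kazoo.exceptions.ConnectionLoss`, and `FlakyQueue` and `put_with_retry` are hypothetical names invented for this example.

```python
import time


class ConnectionLoss(Exception):
    """Local stand-in for kazoo.exceptions.ConnectionLoss (assumption)."""


class FlakyQueue:
    """Hypothetical event queue whose first `failures` puts fail."""

    def __init__(self, failures):
        self.failures = failures
        self.items = []

    def put(self, event):
        # Simulate losing the ZooKeeper connection mid-write.
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionLoss("simulated ZooKeeper connection loss")
        self.items.append(event)


def put_with_retry(queue, event, attempts=3, delay=0.0):
    """Try to store `event`, retrying on ConnectionLoss rather than dropping it.

    Returns True if the event was stored, False if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            queue.put(event)
            return True
        except ConnectionLoss:
            if attempt == attempts:
                return False
            time.sleep(delay)
    return False


if __name__ == "__main__":
    queue = FlakyQueue(failures=2)
    event = {"type": "comment-added", "change": 951640}
    print(put_with_retry(queue, event), queue.items)
```

Without the retry loop, the first `ConnectionLoss` would lose the comment-added event, and the change would sit approved but never enqueued to the gate until someone voted again, which matches what happened here.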