Friday, 2025-08-29

<opendevreview> OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/957995 02:08
<daeun[m]> hi all 12:25
<gibi> hi infra folks. It seems we have a traffic jam on the check-arm64 pipeline. Is it a known situation? It seems to hold up otherwise legitimate and green patches from going to the gate 14:46
<clarkb> gibi: the arm64 pipeline should not impact your ability to gate 16:16
<clarkb> that is why it is a separate pipeline 16:16
<clarkb> check votes +1, which allows you to gate. check-arm64 does not 16:17
<clarkb> as for the traffic jam, those resources are very limited and any time we have a huge pipe of demand this will happen 16:17
<clarkb> https://grafana.opendev.org/d/2c6f499090/zuul-launcher3a-osuosl?orgId=1&from=now-24h&to=now&timezone=utc&var-region=$__all you can see on this dashboard that we're basically using our full quota all the time right now 16:17
<clarkb> the solutions to that problem are either to A) find more arm64 resources or B) run fewer arm64 jobs 16:18
<clarkb> if you have changes that are not gating but you expect them to be, we can look closer, but check-arm64 should not be the cause of that 16:20
<gibi> clarkb: thanks. Then I don't know why this is not on the gate now https://review.opendev.org/c/openstack/nova/+/951640 17:29
<gibi> I thought it is not on the gate as it is still in the arm pipeline 17:30
<gibi> but then there should be some other reason 17:30
<gibi> it passed the check pipeline and it has two +2 and +W 17:31
<clarkb> gibi: if you look at that change its parent is not current and not merged 18:01
<clarkb> at least according to gerrit there. https://opendev.org/openstack/nova/commit/43d57ae63d1ecda24d8707b4750d404daadc980f seems to be the parent which is merged 18:02
<clarkb> I wonder if we restarted gerrit on july 1 when that patchset was pushed and it didn't get an updated parent in the index 18:03
<clarkb> gibi: basically when you approve a change in gerrit, zuul asks gerrit whether the change would be mergeable if tests pass. I think what has happened here is gerrit said "no this isn't mergeable because the parent is not current" but I'm going to see if I can confirm with the zuul logs 18:05
<clarkb> looking at zuul scheduler logs I see the log message for where zuul reports Verified +1 at 12:18 UTC on the 29th. That should then cause Gerrit to emit a comment-added event for the new Verified +1 vote. But I haven't been able to find this comment-added event which would enqueue things to the gate 18:20
<clarkb> that change id and change number do not show up in the gerrit error log either 18:23
<clarkb> I understand the parent thing now too. The parent is for a merge commit so there is no change belonging to that sha1. However, that hash shows up in https://review.opendev.org/c/openstack/nova-specs/+/951689/'s comment history so gerrit finds it and makes you think it is the parent 18:24
<clarkb> I think that is a distraction (it's a confusing gerrit behavior but I don't think the indexing assumption earlier is correct) 18:24
<clarkb> instead I think that zuul simply never received the comment-added event for the Verified +1 vote after your recheck, which is what would enqueue things to the gate 18:24
<clarkb> gibi: sean-k-mooney gmaan stephenfin if you are still around can you try applying a new workflow +1 approval vote to https://review.opendev.org/c/openstack/nova/+/951640 to see if that event gets emitted and it enqueues to the gate? 18:25
<gmaan> clarkb: done 18:26
<clarkb> that looks like it worked. So I don't think there was anything inherently wrong with the change. Instead for some reason either gerrit didn't emit the event or zuul didn't read it (maybe there was a connection drop and the event was sent during that period. I haven't found that in the logs yet but could be the case) 18:27
<gmaan> yeah https://zuul.opendev.org/t/openstack/status?change=951640%2C3&pipeline=gate 18:27
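[Editor's sketch] The behaviour described above, where a change that already carries every required vote still sits outside the gate until a fresh event arrives, can be illustrated with a small model. This is not Zuul's code: `Change`, `CommentAdded`, `gate_ready`, `handle_events` and the exact label requirements are hypothetical names and assumptions chosen for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    number: int
    # Current votes as stored by the review system (label -> value).
    votes: dict = field(default_factory=dict)


@dataclass
class CommentAdded:
    """Hypothetical stand-in for a Gerrit comment-added event."""
    change: Change
    label: str
    value: int


def gate_ready(change):
    # Assumed requirements for this sketch: Verified >= 1 and Workflow >= 1.
    return (change.votes.get("Verified", 0) >= 1
            and change.votes.get("Workflow", 0) >= 1)


def handle_events(events):
    """Return the change numbers enqueued to the gate.

    A change is only evaluated when an event for it arrives, so a lost
    event leaves an otherwise ready change out of the gate.
    """
    enqueued = []
    for ev in events:
        is_trigger = (ev.label, ev.value) in {("Verified", 1), ("Workflow", 1)}
        if is_trigger and gate_ready(ev.change) and ev.change.number not in enqueued:
            enqueued.append(ev.change.number)
    return enqueued


change = Change(951640, {"Verified": 1, "Code-Review": 2, "Workflow": 1})

# The Verified +1 event was lost: nothing is processed, nothing is enqueued.
lost = handle_events([])

# A fresh Workflow +1 vote produces a new event and the change is enqueued.
reapproved = handle_events([CommentAdded(change, "Workflow", 1)])
```

In this model the state of the change is identical in both calls; only the arrival of an event differs, which matches why re-applying the workflow vote was enough.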
<clarkb> I'll leave a note for corvus in #opendev in case he wants to follow up on this, but I'm quickly running out of leads in the logs 18:28
<clarkb> I think I found it: 2025-08-29 12:19:56,430 ERROR zuul.GerritConnection.ssh:   kazoo.exceptions.ConnectionLoss 18:36
<clarkb> the gerrit ssh connection, which is where we get the event stream, lost its zookeeper connectivity when trying to add an event to zookeeper. Presumably it was this particular event 18:37
<clarkb> thank you for pointing this out. I think there may be some improvement we can make to zuul around this 18:40
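[Editor's sketch] One generic way to make a write survive a brief connection loss is to retry it a bounded number of times. The sketch below only illustrates that idea; it is not what Zuul does, nor a description of any fix made afterwards. `ConnectionLoss` here is a local stand-in for `kazoo.exceptions.ConnectionLoss` so the example runs without ZooKeeper, and `put_with_retry` and `FlakyQueue` are hypothetical names.

```python
import time


class ConnectionLoss(Exception):
    """Local stand-in for kazoo.exceptions.ConnectionLoss (assumption)."""


def put_with_retry(put, event, attempts=3, delay=0.0):
    """Call put(event), retrying on ConnectionLoss up to `attempts` times.

    Re-raises the last ConnectionLoss if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return put(event)
        except ConnectionLoss:
            if attempt == attempts:
                raise
            # Simple linear backoff between attempts.
            time.sleep(delay * attempt)


class FlakyQueue:
    """Hypothetical event queue that fails a fixed number of times."""

    def __init__(self, failures):
        self.failures = failures
        self.items = []

    def put(self, event):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionLoss()
        self.items.append(event)
        return len(self.items)


queue = FlakyQueue(failures=1)
event = {"type": "comment-added", "change": 951640}
position = put_with_retry(queue.put, event)
```

A real implementation would also have to consider that a connection loss leaves the outcome of the write unknown, so a blind retry can store the same event twice unless the consumer tolerates duplicates.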
*** gmaan is now known as gmaan_afk 19:30
*** gmaan_afk is now known as gmaan 23:20
