Tuesday, 2025-07-22

-@gerrit:opendev.org- Francisco Seruca Salgado proposed: [zuul/zuul-jobs] 955583: Trigger Test https://review.opendev.org/c/zuul/zuul-jobs/+/955583 11:09
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955545: Require multinode requests served from same provider https://review.opendev.org/c/zuul/zuul/+/955545 16:03
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: 16:17
- [zuul/zuul] 955040: Add QuotaCache class https://review.opendev.org/c/zuul/zuul/+/955040
- [zuul/zuul] 955106: Plumb zk_client through to endpoints https://review.opendev.org/c/zuul/zuul/+/955106
- [zuul/zuul] 955107: Update drivers to use QuotaCache https://review.opendev.org/c/zuul/zuul/+/955107
- [zuul/zuul] 955325: Implement zuul-launcher connection filter https://review.opendev.org/c/zuul/zuul/+/955325
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955617: Launcher: implement preferred provider https://review.opendev.org/c/zuul/zuul/+/955617 17:19
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955619: Launcher: skip non-pending uploads https://review.opendev.org/c/zuul/zuul/+/955619 17:35
@clarkb:matrix.org corvus: looks like the new test cases in https://review.opendev.org/c/zuul/zuul/+/955545 may still be failing? 21:50
@clarkb:matrix.org it almost looks like both providers end up failing the nodescans and then we're out of quota and eventually the job times out. Maybe we're using two counters for the myadvance method so they can each go to two? 22:22
@clarkb:matrix.org also it almost looks like deleting nodes in the test suite isn't returning their quota back to the available quota 22:23
@clarkb:matrix.org corvus: we're using the NodescanRequest.node.uuid to index into the counter dict for the number of failures. Do we actually need it to use the uuid of the NodeRequest? 22:28
@clarkb:matrix.org ya something like `[ns_request.node.request_id]` for the indexing? 22:30
@clarkb:matrix.org that way we fail node0a, node1a, and node0b with the `>0` check against the same request 22:32
@clarkb:matrix.org ok posted some comments on the change after digging through the test case logs 22:43
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955545: Require multinode requests served from same provider https://review.opendev.org/c/zuul/zuul/+/955545 22:59
@jim:acmegating.com Clark: i think it's the quota; the other thing should be fine. i left a reply 22:59
@clarkb:matrix.org corvus: looking at https://35824449a46d30e681e8-d5237dd035b9f3c7532839ca52b23c88.ssl.cf5.rackcdn.com/zuul/2a997f3e85f244aaaad3a829bab588a9/testr_results.html it looks like each AWSProviderNode has a different uuid value 23:01
@jim:acmegating.com yes, so after we've failed for 3 unique nodes, we're done 23:02
@jim:acmegating.com (the advance method can get called more than once for a node) 23:02
@clarkb:matrix.org ohhh we take the len of the keys: `failed_nodes = len(failed_count_by_node.keys())` 23:02
@clarkb:matrix.org In my head we were looking at the value in each key 23:03
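
A minimal sketch of the distinction worked out above: counting the keys of the failure dict gives the number of unique failed nodes, while the value under each key counts how often that one node failed. Only `failed_count_by_node` and the `len(...keys())` expression come from the discussion; `record_failure` and the node names used as keys are hypothetical stand-ins, not the actual test code.

```python
from collections import Counter

# Hypothetical stand-in for the test's bookkeeping: one counter per node uuid.
failed_count_by_node = Counter()


def record_failure(node_uuid):
    """Note one failed nodescan attempt for the given node."""
    failed_count_by_node[node_uuid] += 1


# The advance method can run more than once per node, so a single node may
# record several failures. The names below are illustrative only.
for node_uuid in ["node0a", "node0a", "node1a", "node0b", "node0b"]:
    record_failure(node_uuid)

# Number of unique nodes that have failed (what the test checks).
failed_nodes = len(failed_count_by_node.keys())

# Total number of failed attempts across all nodes (the other reading).
total_failures = sum(failed_count_by_node.values())
```

With this input the two numbers differ, which is why indexing by node uuid still stops after three unique nodes no matter how many times each one is retried.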
@clarkb:matrix.org re the quota issue: is decrementing the values unimplemented in the test framework, or are we just not waiting long enough for the caches to determine that we have quota again after deletions? 23:05
@jim:acmegating.com likely the second 23:05
@jim:acmegating.com and it's not important for this test, so i lifted the quotas 23:05
@clarkb:matrix.org ya makes sense 23:06
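
A rough sketch of the "not waiting long enough for the caches" explanation, assuming a cached view of used quota that is only refreshed after a time-to-live expires. Everything here is hypothetical (`StaleQuotaView`, `ttl`, the `cloud` dict); it is not Zuul's `QuotaCache` and makes no claim about how that class works.

```python
class StaleQuotaView:
    """Hypothetical cached view of used quota, refreshed only after ttl."""

    def __init__(self, fetch_used, ttl):
        self.fetch_used = fetch_used  # callable returning current real usage
        self.ttl = ttl                # seconds a cached value stays valid
        self._value = None
        self._fetched_at = None

    def used(self, now):
        # Refresh on first use or once the cached value is older than ttl.
        if self._fetched_at is None or now - self._fetched_at >= self.ttl:
            self._value = self.fetch_used()
            self._fetched_at = now
        return self._value


cloud = {"used": 2}  # pretend two instances are running
view = StaleQuotaView(lambda: cloud["used"], ttl=5)

before = view.used(now=0)  # first read populates the cache
cloud["used"] = 0          # both nodes are deleted in the "cloud"
stale = view.used(now=1)   # cache not yet expired: still reports old usage
fresh = view.used(now=6)   # after ttl the freed quota becomes visible
```

In a setup like this, a test that deletes nodes and immediately asks for more would still see the quota as exhausted until the cached value ages out.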
@clarkb:matrix.org for the second test failure I would've expected quota failures to cause it to fail the node request though 23:07
@clarkb:matrix.org in the log it looked like it was still trying to boot nodes 30 seconds after starting which is why I thought maybe we need to reduce the total number of attempts 23:07
@clarkb:matrix.org I guess quota handling has it retrying maybe and it will fail earlier now that it won't kick back quota errors? 23:08
@jim:acmegating.com yes that's how i read it 23:15
@clarkb:matrix.org semi related the zuul gate is really interesting right now 23:15
@clarkb:matrix.org the tip is ps4 of that change then after that is ps3 for some reason? 23:16
@clarkb:matrix.org I don't think this is a big issue as zuul has correctly identified that ps3 is unmergeable (says it has a merge conflict) just unexpected in the rendering of the state of things 23:16
@jim:acmegating.com i think 619 depends on ps3 23:16
@clarkb:matrix.org oh I see 23:17
@jim:acmegating.com since 619 was already approved, it probably raced the update 23:17
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955619: Launcher: skip non-pending uploads https://review.opendev.org/c/zuul/zuul/+/955619 23:17
@clarkb:matrix.org putting the new patchset of the child in the gate didn't evict the ps3 parent. But also I think it should resolve itself 23:19
@jim:acmegating.com yeah, ps3 is not in the main queue 23:22
@clarkb:matrix.org looking at https://review.opendev.org/c/zuul/zuul/+/955617 that doesn't seem necessary to address the underlying issue right? Is the idea there that end users could potentially supply that info via nodeset configuration? 23:31
@jim:acmegating.com oh no. definitely not. 23:34
@jim:acmegating.com it's just bringing zl up to par with nodepool 23:35
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 955617: Launcher: implement preferred provider https://review.opendev.org/c/zuul/zuul/+/955617 23:35
@jim:acmegating.com that's a feature that exists; it's just not very important in opendev right now, so no one has noticed it's missing 23:36
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 954886: Update s3 minio tests https://review.opendev.org/c/zuul/zuul-jobs/+/954886 23:36
@clarkb:matrix.org oh I see it's specific to paused builds 23:36
@clarkb:matrix.org and ya opendev I think is "resilient" to that as opendev can fetch container images from jobs in other clouds and very few other users within opendev use pause jobs 23:37
@jim:acmegating.com yep 23:37
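
A small sketch of what "preferring" a provider could look like, assuming the preference is only a hint that moves one provider to the front of the candidate list (for example, the provider already hosting a paused build's nodes). The function `order_providers` and the provider names are hypothetical and are not taken from change 955617.

```python
def order_providers(providers, preferred=None):
    """Return providers with the preferred one first, others in original order.

    sorted() is stable, so providers that are not preferred keep their
    relative order. If preferred is None or unknown, nothing moves.
    """
    return sorted(providers, key=lambda name: name != preferred)


candidates = ["provider-a", "provider-b", "provider-c"]

# A request with a preference tries that provider first.
with_hint = order_providers(candidates, preferred="provider-b")

# A request without a preference sees the list unchanged.
without_hint = order_providers(candidates)
```

Treating the preference as an ordering hint rather than a hard requirement means a request can still be served elsewhere when the preferred provider has no capacity.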
