-@gerrit:opendev.org- Shnaidman Sagi (Sergey) proposed on behalf of Sorin Sbârnea: [zuul/zuul-jobs] 803471: Include podman installation with molecule https://review.opendev.org/c/zuul/zuul-jobs/+/803471 | 13:36 | |
@jpew:matrix.org | jpew | 14:20 |
---|---|---|
@y2kenny:matrix.org | corvus I just want to follow up on https://review.opendev.org/c/zuul/zuul/+/823732 (git_over_ssh.) Can that go in? | 16:30 |
@jim:acmegating.com | Kenny Ho: yeah, thanks for double checking! | 17:01 |
@y2kenny:matrix.org | Awesome! Thanks! | 17:02 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 17:32 | |
- [zuul/zuul] 823587: Add some ZK debug scripts https://review.opendev.org/c/zuul/zuul/+/823587 | ||
- [zuul/zuul] 824077: Add a zk-shell debug script https://review.opendev.org/c/zuul/zuul/+/824077 | ||
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 | 17:33 | |
@tobias.henkel:matrix.org | corvus: q on ^ | 17:45 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 | 17:56 | |
@jim:acmegating.com | tobiash: thx, that was a silly copypasta | 17:57 |
@tobias.henkel:matrix.org | +2 | 17:57 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 824477: Improve documentation around ZK requirements https://review.opendev.org/c/zuul/zuul/+/824477 | 18:29 | |
@hanson76:matrix.org | Hi, we are running Zuul 4.11.0 and Nodepool 4.3.0 together with the aws driver to launch EC2 instances. | 18:29 |
We have recently seen that we get NODE_ERROR on builds from time to time and it is starting to become annoying to have every fifth build fail. | ||
I've done some digging around and figured out that the aws driver in nodepool is accessing the DescribeInstanceId API too quickly after | ||
the create instance API call has finished. | ||
It takes some time for the AWS backends to propagate information about the newly created instance. | ||
Nodepool ends up receiving a NotFound error from DescribeInstanceId in some cases because of this. | ||
I've added a story in the story board about this (https://storyboard.openstack.org/#!/story/2009781) | ||
My guess is that a simple loop with a sleep around the DescribeInstanceId could fix this problem and make the | ||
aws driver more robust. Is this something that could be fixed ? Just noticed that Nodepool 4.4.0 was released yesterday. | ||
@clarkb:matrix.org | a retry loop with a sleep up to some timeout seems reasonable. I believe there are other similar cases of code because clouds are weird :) | 18:32 |
@clarkb:matrix.org | Anders Hanson: if you have it, a copy of the full traceback would likely be helpful. I wouldn't know what exception to catch myself and don't have aws credentials to test with | 18:33 |
@clarkb:matrix.org | Also if you'd like to write the fix yourself I'd be happy to help with reviews and general process | 18:34 |
@hanson76:matrix.org | I'll start with digging up the stacktrace. | 18:35 |
@hanson76:matrix.org | I've added the stacktrace to the story. | 18:41 |
@clarkb:matrix.org | I see it thanks | 18:47 |
@clarkb:matrix.org | Anders Hanson: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/simple.py#L114-L119 is code in another driver that does similar to what you need I think | 18:49 |
@clarkb:matrix.org | You'd need to put that in the aws driver and update it to catch that exception and ignore it until the timeout | 18:50 |
@clarkb:matrix.org | If you'd prefer someone else write the change let us know, but this is probably a good bugfix to get involved if interested :) | 18:50 |
@hanson76:matrix.org | Thanks, I'll take a stab at it tomorrow, not written anything in python before. | 18:54 |
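The retry loop discussed above could look roughly like the following. This is a minimal sketch, not the nodepool fix: `InstanceNotFound`, `wait_for_instance`, and the `describe` callable are hypothetical names standing in for whatever exception and API call the aws driver actually uses (the real exception is in the traceback attached to the story).

```python
import time


class InstanceNotFound(Exception):
    """Hypothetical stand-in for the provider's 'not found' error."""


def wait_for_instance(describe, instance_id, timeout=60.0, interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Call describe(instance_id) until it stops raising InstanceNotFound.

    The error is ignored until `timeout` seconds have elapsed; after that
    the last InstanceNotFound is re-raised so the caller can fail the node.
    `sleep` and `clock` are injectable only to make the loop easy to test.
    """
    deadline = clock() + timeout
    while True:
        try:
            return describe(instance_id)
        except InstanceNotFound:
            if clock() >= deadline:
                raise
            # The backend may not have propagated the new instance yet.
            sleep(interval)
```

Any other exception propagates immediately, so only the "not yet visible" case is retried.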
-@gerrit:opendev.org- Zuul merged on behalf of Kenny Ho: [zuul/zuul] 823732: Add git_over_ssh option for Gerrit connection https://review.opendev.org/c/zuul/zuul/+/823732 | 19:01 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 824218: Pin tzlocal to avoid warnings https://review.opendev.org/c/zuul/zuul/+/824218 | 19:25 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824482: Add "zuul delete-pipeline-state" command https://review.opendev.org/c/zuul/zuul/+/824482 | 20:57 | |
@clarkb:matrix.org | corvus: swest for https://review.opendev.org/c/zuul/zuul/+/823782/8/zuul/zk/zkobject.py This will render gets with zkshell pretty much useless. I think zlib isn't directly decompressible with gzip too? I guess we'll have to run a python shell and import zlib and decompress that way? | 22:57 |
@clarkb:matrix.org | It's not the end of the world but I think keeping the database as human readable as possible is a useful thing, particularly since I've had to go diving in the db a non zero number of times | 22:58 |
@jim:acmegating.com | Clark: correct, thus i made https://review.opendev.org/824077 | 22:58 |
@clarkb:matrix.org | aha, thanks | 22:59 |
@jim:acmegating.com | Clark: also added a decompress option to the dump script in https://review.opendev.org/823587 | 22:59 |
@clarkb:matrix.org | fwiw opendev's db size is well under 500MB which isn't crazy, but I guess other installs are probably quite a bit bigger | 23:00 |
@jim:acmegating.com | Clark: and yes, i totally agree on keeping the db the same. apparently this will make a huge difference in performance for swest's use case. both in db size as well as throughput (like, it actually makes the scheduler run faster). so i think it's worth it, especially since we can mitigate the loss of functionality with those tools. | 23:01 |
@jim:acmegating.com | (i mean, it should make a 90% improvement for everyone, but that has outsized impact at larger scales) | 23:04 |
@clarkb:matrix.org | I guess once the extra tools land opendev should update their zk config to drop port 2181 | 23:04 |
@clarkb:matrix.org | since we'll want to use these tools anyway and they support the compressed data unlike zkshell | 23:05 |
@jim:acmegating.com | Clark: either way -- that's still firewalled only to the servers, and ssl is optional in the tool i wrote | 23:05 |
@clarkb:matrix.org | ya I just figure we'll be using the tools since they understand zuul's db better and since they support the certs may as well drop the easy mode | 23:05 |
@jim:acmegating.com | so we can use the new tool locally on zk04 with no ssl, or remotely on zuul01 with ssl | 23:05 |
@jim:acmegating.com | Clark: also, fyi `zlib-flate` (in the `qpdf` package in debuntu) is an easy way to decompress zlib from shell. there are other options which involve gzip and the magic header too, but `zlib-flate` is the most straightforward. | 23:09 |
@clarkb:matrix.org | TIL | 23:09 |
@clarkb:matrix.org | ya for gzip you have to prepend a header or something | 23:10 |
@clarkb:matrix.org | then it just works | 23:10 |
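For the "python shell and import zlib" route mentioned earlier, a small helper along these lines may be enough. It is only a sketch: `maybe_decompress` is a hypothetical name, and the fallback to the raw bytes assumes that some nodes may still hold uncompressed data, which the log does not confirm.

```python
import zlib


def maybe_decompress(raw: bytes) -> bytes:
    """Return the zlib-decompressed payload, or raw if it is not zlib data."""
    try:
        return zlib.decompress(raw)
    except zlib.error:
        # Not a zlib stream (assumed here to be plain, uncompressed data).
        return raw
```

Usage would be something like `print(maybe_decompress(data).decode("utf8"))` on bytes fetched from a ZooKeeper node.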
@clarkb:matrix.org | corvus: TIL about cmd as well | 23:30 |
@jim:acmegating.com | Clark: yeah, i, erm, wrote 50% of what cmd does myself before i stumbled on it and started over :) | 23:33 |
@jim:acmegating.com | i think there's a bit of room for improvement, but it's good enough | 23:33 |
@jim:acmegating.com | (like, i'd like to see something that's a combo of cmd+argparse) | 23:34 |
@clarkb:matrix.org | corvus: why is help_get different than the others? Is it because it is multiline docstring? | 23:39 |
@clarkb:matrix.org | re argparse you can feed argparse the string provided by args in cmd I think. But ya probably not necessary | 23:40 |
@jim:acmegating.com | Clark: yeah. that's one of the wonky things. i wanted to dedent it properly. | 23:40 |
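The two points about `cmd` above (a `help_get` method to dedent a multiline help text, and feeding the argument string to argparse) can be illustrated with a toy shell. This is a sketch only and not the script under review: `DemoShell`, the `-v` flag, and the output format are made up for illustration.

```python
import argparse
import cmd
import shlex
import textwrap


class DemoShell(cmd.Cmd):
    prompt = "demo> "

    def do_get(self, arg):
        # cmd passes everything after the command name as one string;
        # shlex.split turns it into an argv-style list for argparse.
        parser = argparse.ArgumentParser(prog="get", add_help=False)
        parser.add_argument("path")
        parser.add_argument("-v", "--verbose", action="store_true")
        try:
            opts = parser.parse_args(shlex.split(arg))
        except SystemExit:
            # argparse exits on bad input; keep the shell running instead.
            return
        self.stdout.write(f"get {opts.path} verbose={opts.verbose}\n")

    def help_get(self):
        # A help_* method takes precedence over the do_* docstring, which
        # leaves room to dedent a multiline help text before printing it.
        self.stdout.write(textwrap.dedent("""\
            get [-v] PATH

            Print the contents of PATH.
            """))

    def do_quit(self, arg):
        "Exit the shell"
        return True


if __name__ == "__main__":
    DemoShell().cmdloop()
```

Short commands such as `quit` can keep a one-line docstring, since `cmd` prints that as the help text when no `help_quit` method exists.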
@clarkb:matrix.org | corvus: I think I found a bug in https://review.opendev.org/c/zuul/zuul/+/824077 | 23:43 |
@clarkb:matrix.org | I left a comment | 23:43 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 824077: Add a zk-shell debug script https://review.opendev.org/c/zuul/zuul/+/824077 | 23:48 | |
@jim:acmegating.com | Clark: agree thx | 23:48 |
@clarkb:matrix.org | corvus: for https://review.opendev.org/c/zuul/zuul/+/817626 and children do we want to avoid landing anything until after 4.12.0? Or are we planning on incorporating all of this stuff into 5.0? | 23:52 |
@jim:acmegating.com | Clark: i'm leaning toward: land all the zkobject stuff, 4.12.0, then land that stack and 5.0. | 23:53 |
@clarkb:matrix.org | ok | 23:53 |
@jim:acmegating.com | (but i could be talked into wrapping all that into 5.0 -- i'm just thinking that a little more time with this before 5.0 would be best) | 23:54 |
@clarkb:matrix.org | no objections from me | 23:54 |