clarkb | I need to call it here and help with dinner. It seems like we can safely use ssh-keyscan -4 or 127.0.0.1 in place of localhost for now if we need a workaround | 00:01 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Remove statsd args to OpenStack API client call https://review.opendev.org/c/zuul/nodepool/+/786862 | 00:19 |
*** hamalq has quit IRC | 01:27 | |
openstackgerrit | Ian Wienand proposed zuul/zuul-jobs master: collect-container-logs: don't copy on failure https://review.opendev.org/c/zuul/zuul-jobs/+/787019 | 02:10 |
*** ajitha has joined #zuul | 02:27 | |
*** evrardjp has quit IRC | 02:33 | |
*** evrardjp has joined #zuul | 02:33 | |
*** sam_wan has joined #zuul | 03:15 | |
*** ykarel has joined #zuul | 03:33 | |
ianw | https://github.com/docker/for-linux/issues/1233 | 03:59 |
ianw | Since 20.10.6 it's not possible to run docker on a machine with disabled IPv6 interfaces | 03:59 |
ianw | oh, sorry, ignore that, it's more about kernel disabled ipv6 | 04:00 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: nodepool-functional : ignore errors copying logs https://review.opendev.org/c/zuul/nodepool/+/787065 | 04:18 |
*** vishalmanchanda has joined #zuul | 04:23 | |
*** saneax has joined #zuul | 04:26 | |
*** ianychoi__ has joined #zuul | 04:48 | |
*** sanjayu_ has joined #zuul | 04:48 | |
*** saneax has quit IRC | 04:50 | |
*** ianychoi_ has quit IRC | 04:50 | |
openstackgerrit | Merged zuul/zuul master: Use ssh-keyscan -4 in quick-start https://review.opendev.org/c/zuul/zuul/+/786988 | 04:56 |
*** jfoufas1 has joined #zuul | 05:11 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Account for resource usage of leaked nodes https://review.opendev.org/c/zuul/nodepool/+/785821 | 06:20 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: QuotaInformation : abstract resource recording https://review.opendev.org/c/zuul/nodepool/+/787093 | 06:20 |
*** bhavikdbavishi has joined #zuul | 06:39 | |
*** bhavikdbavishi has quit IRC | 06:43 | |
*** jcapitao has joined #zuul | 06:51 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: nodepool-functional : ignore errors copying logs https://review.opendev.org/c/zuul/nodepool/+/787065 | 06:52 |
*** harrymichal has quit IRC | 07:23 | |
ianw | i've run out of time to figure out why the nodepool functional jobs aren't working. it's something to do with the docker install | 07:34 |
ianw | https://review.opendev.org/c/zuul/nodepool/+/787065 is a change that makes the logs be captured correctly | 07:34 |
*** rpittau|afk is now known as rpittau | 07:35 | |
ianw | https://98b486d8b4772b9c610a-a795987d7c31db10daa7f58e24ee8596.ssl.cf1.rackcdn.com/787065/2/check/nodepool-functional-container-openstack-siblings/2d652ea/syslog | 07:35 |
ianw | is a sample syslog | 07:35 |
ianw | the relevant part seems to be | 07:35 |
ianw | Apr 20 07:29:15 ubuntu-bionic-inap-mtl01-0024121870 systemd[1]: Starting Docker Application Container Engine... | 07:35 |
ianw | Apr 20 07:29:15 ubuntu-bionic-inap-mtl01-0024121870 dockerd[5153]: time="2021-04-20T07:29:15.607437905Z" level=info msg="Starting up" | 07:35 |
ianw | Apr 20 07:29:15 ubuntu-bionic-inap-mtl01-0024121870 dockerd[5153]: failed to load listeners: no sockets found via socket activation: make sure the service was started by systemd | 07:35 |
ianw | Apr 20 07:29:15 ubuntu-bionic-inap-mtl01-0024121870 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE | 07:35 |
ianw | there is no obvious solution when googling that, but there are several hits | 07:35 |
ianw | if nobody else figures it out, i guess i'll get back to it tomorrow! | 07:35 |
openstackgerrit | Daniel Blixt proposed zuul/zuul-jobs master: WIP: Make build-sshkey handling windows compatible https://review.opendev.org/c/zuul/zuul-jobs/+/780662 | 07:44 |
*** tosky has joined #zuul | 07:45 | |
openstackgerrit | Daniel Blixt proposed zuul/zuul-jobs master: Make build-sshkey handling windows compatible https://review.opendev.org/c/zuul/zuul-jobs/+/780662 | 07:45 |
*** jpena|off is now known as jpena | 07:58 | |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: [reporter][elasticsearch] fix the timestamp when the system has a different timezone by forcing the UTC timezone https://review.opendev.org/c/zuul/zuul/+/786444 | 08:03 |
*** mgoddard has quit IRC | 08:08 | |
openstackgerrit | Merged zuul/zuul master: Add nodepool.external_id to inventory https://review.opendev.org/c/zuul/zuul/+/786738 | 08:26 |
openstackgerrit | Daniel Blixt proposed zuul/zuul-jobs master: Make build-sshkey handling windows compatible https://review.opendev.org/c/zuul/zuul-jobs/+/780662 | 09:23 |
*** ykarel is now known as ykarel|lunch | 09:40 | |
*** vishalmanchanda has quit IRC | 09:52 | |
*** nils has joined #zuul | 09:58 | |
*** rpittau is now known as rpittau|bbl | 10:04 | |
*** harrymichal has joined #zuul | 10:08 | |
*** harrymichal has quit IRC | 10:26 | |
*** ykarel|lunch is now known as ykarel | 10:47 | |
*** vishalmanchanda has joined #zuul | 10:49 | |
openstackgerrit | Daniel Blixt proposed zuul/zuul-jobs master: Make build-sshkey handling windows compatible https://review.opendev.org/c/zuul/zuul-jobs/+/780662 | 11:09 |
*** jcapitao is now known as jcapitao_lunch | 11:16 | |
*** jpena is now known as jpena|lunch | 11:31 | |
*** rlandy has joined #zuul | 11:46 | |
*** rlandy is now known as rlandy|rover | 11:46 | |
*** jcapitao_lunch is now known as jcapitao | 11:51 | |
*** ykarel_ has joined #zuul | 12:20 | |
*** ykarel has quit IRC | 12:21 | |
*** jpena|lunch is now known as jpena | 12:33 | |
*** rpittau|bbl is now known as rpittau | 12:37 | |
*** ykarel__ has joined #zuul | 13:16 | |
*** ykarel_ has quit IRC | 13:18 | |
*** vishalmanchanda has quit IRC | 13:39 | |
*** sean-k-mooney has quit IRC | 13:45 | |
tobiash | corvus: commented on 786744 | 13:46 |
tobiash | apart from that lgtm | 13:46 |
openstackgerrit | Tobias Henkel proposed zuul/zuul master: Don't check for existing refs in isUpdateNeeded https://review.opendev.org/c/zuul/zuul/+/787201 | 13:47 |
*** vishalmanchanda has joined #zuul | 13:47 | |
tobiash | corvus: and I've added a slight optimization on top ^ | 13:47 |
corvus | tobiash: thanks! i'll take a look at the getfiles usage | 14:02 |
clarkb | corvus: was a change pushed up to force ipv4 on the ssh keyscan in quickstart? | 14:38 |
clarkb | oh ya I see it now | 14:38 |
*** sanjayu_ has quit IRC | 14:39 | |
fungi | clarkb: however the nodepool jobs are also broken wrt docker around the same timeframe, making it increasingly suspicious that the change could have been in docker itself? | 14:41 |
clarkb | ya maybe docker changed how it does the port forwards | 14:42 |
clarkb | corvus: fungi ianw https://github.com/moby/moby/pull/42205 that looks suspicious for our port forwarding problems | 14:54 |
clarkb | "Fix a regression in docker 20.10, causing IPv6 addresses no longer to be bound by default when mapping ports" is how the release notes refer to that. Maybe we weren't getting an ipv6 before and that caused a different error code to ssh-keyscan when connecting, one which caused it to keep looping and try 127.0.01 | 14:55 |
corvus | clarkb: that sounds plausible | 14:55 |
clarkb | but now they "fixed it" which gives minimal ipv6 connectivity somewhere and that makes ssh-keyscan sad | 14:55 |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: Test if docker packages start on focal https://review.opendev.org/c/zuul/nodepool/+/787210 | 14:59 |
clarkb | that change is for more info gathering. I'm curious if docker struggles on focal the same way as on bionic | 15:00 |
avass | any tl;dr what the docker/keyscan issue is? | 15:01 |
fungi | not yet, but seems like we might be zeroing in on it | 15:02 |
clarkb | avass: the keyscan issue is that ssh-keyscan tries to connect to ::1[:29418] to keyscan gerrit but fails as there isn't anything listening there | 15:02 |
clarkb | avass: I'm beginning to suspect https://github.com/moby/moby/pull/42205 is related as they "fixed" ipv6 port forwarding in the most recent release | 15:02 |
fungi | because gerrit's listening on 0.0.0.0:29418 explicitly | 15:02 |
clarkb | my hunch is that prior to this release ipv6 wasn't even attempted because it wasn't configured at all | 15:02 |
fungi | but yeah, maybe the container is forwarding ::1[:29418] in somehow now? | 15:03 |
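A minimal sketch of the workaround merged in the quick-start change (ssh-keyscan -4), expressed here as an Ansible task for consistency with the other examples; the task name, registered variable, and use of port 29418/localhost follow the discussion above and are illustrative:

```yaml
# Force ssh-keyscan onto IPv4 so it never tries ::1, where nothing is listening.
- name: Scan the Gerrit SSH host key over IPv4 only
  command: ssh-keyscan -4 -p 29418 localhost
  register: gerrit_hostkey
```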
avass | huh | 15:03 |
avass | should be possible to test that by installing an older docker release without that change | 15:04 |
clarkb | yup | 15:04 |
clarkb | 20.10.5 if you can sort out how to install it | 15:05 |
avass | does zuul-jobs expose any docker version variable? https://docs.docker.com/engine/install/ubuntu/ | 15:06 |
avass | otherwise, maybe it should? | 15:06 |
clarkb | no I think it is just use upstream or use distro version | 15:08 |
avass | I think it should be possible to include_role and override _docker_upstream_distro__packages to use a specific version | 15:08 |
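A rough sketch of the pinning avass suggests, assuming the ensure-docker role reads a package-list variable like the one named above; the variable name and the Ubuntu package version strings are unverified assumptions:

```yaml
# Call ensure-docker but pin the upstream packages to 20.10.5.
- include_role:
    name: ensure-docker
  vars:
    _docker_upstream_distro_packages:
      - docker-ce=5:20.10.5~3-0~ubuntu-bionic
      - docker-ce-cli=5:20.10.5~3-0~ubuntu-bionic
      - containerd.io
```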
clarkb | I've just noticed the failing nodepool job appears to install docker.io first (the distro package), then uninstall docker.io and install docker-ce | 15:09 |
clarkb | I wonder if there is some sort of leak between docker.io and docker-ce that is breaking docker-ce | 15:09 |
clarkb | I need to eat some breakfast but I'll start pulling on that thread next | 15:09 |
*** bhavikdbavishi has joined #zuul | 15:10 | |
*** jfoufas1 has quit IRC | 15:16 | |
openstackgerrit | Albin Vass proposed zuul/nodepool master: Test docker with 20.10.5 https://review.opendev.org/c/zuul/nodepool/+/787217 | 15:20 |
avass | clarkb: I wonder if that ^ works | 15:20 |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: Test if not installing docker.io fixes docker-ce installs https://review.opendev.org/c/zuul/nodepool/+/787220 | 15:25 |
*** ykarel__ is now known as ykarel | 15:41 | |
clarkb | corvus: I think that ^ does fix the issue. However, that was done to support zookeeper tls in nodepool | 15:42 |
*** holser has joined #zuul | 15:42 | |
clarkb | the unittest jobs have passed though so maybe that was vestigial | 15:42 |
clarkb | If that passes I can write up a better commit message and then flip the order of the two changes there (one fixes logging the other fixes the job) | 15:43 |
avass | clarkb: docker.io is removed before installing docker-ce in ensure-docker for upstream installation | 15:44 |
clarkb | avass: yup, and then when docker-ce is installed it doesn't work | 15:44 |
clarkb | my hunch is that uninstalling docker.io without purging it is causing something to leak to the docker-ce install that breaks docker-ce | 15:44 |
clarkb | but if we don't need to install docker.io in the first place we can just avoid that extra step entirely | 15:45 |
avass | makes sense | 15:45 |
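If clarkb's leak hypothesis is right, one hedged way to test it would be purging rather than merely removing the distro package; the package name comes from the discussion, everything else is illustrative:

```yaml
# Remove docker.io and purge its config so no unit/socket state leaks
# into the subsequent docker-ce install.
- name: Purge the distro docker package
  apt:
    name: docker.io
    state: absent
    purge: true
```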
corvus | i wonder why we use docker-ci | 15:45 |
corvus | docker-ce | 15:45 |
avass | corvus: I was gonna question why you're using docker.io :) | 15:45 |
corvus | we might be able to flip the fix the other way around; but i don't think it's very important | 15:45 |
corvus | avass: bindep+distro packages is simpler for devs | 15:45 |
corvus | it's great that we have an ensure-docker role in zuul-jobs, but it's even better if we don't have to use it :) | 15:46 |
clarkb | corvus: ensure-docker in zuul-jobs defaults to upstream iirc | 15:46 |
corvus | i think we wrote it when distro dockers were ancient; maybe we can stop using it now | 15:47 |
corvus | clarkb: right, whereas docker.io is an ubuntu package | 15:47 |
clarkb | yup | 15:47 |
clarkb | we could set use_upstream_docker: false in nodepool and zuul project defs | 15:48 |
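A sketch of what clarkb's suggestion might look like in a Zuul job definition, assuming the jobs keep calling ensure-docker; the variable name is as used in the chat and the job name (taken from the failing job above) is illustrative:

```yaml
# Keep ensure-docker for its extra setup, but let it use the distro package.
- job:
    name: nodepool-functional-container-openstack-siblings
    vars:
      use_upstream_docker: false
```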
corvus | so to be clear: my point is that it seems the simplest system would be to use docker.io in bindep and not use ensure-docker in jobs. that would be simpler, faster zuul jobs in an environment that's more easily reproducible on dev workstations. but also, i don't think this is very important and we should merge whatever fixes this asap. | 15:49 |
clarkb | (it does appear to default to true) | 15:49 |
corvus | just that after it's fixed, maybe we can try dropping ensure-docker and switching to rely on bindep | 15:49 |
clarkb | corvus: I think you still want ensure-docker, but don't want it to install upstream in your scenario | 15:49 |
corvus | clarkb: why? | 15:49 |
clarkb | because ensure-docker does a few other things with config iirc | 15:49 |
avass | use-buildset-registry etc | 15:49 |
avass | though that could just be role: use-buildset-registry instead :) | 15:50 |
clarkb | looks like it modifies mtus if necessary, puts zuul in the docker group, and configures docker proxy | 15:50 |
clarkb | avass: ^ no, that stuff | 15:50 |
corvus | okay, that's worth keeping around then :) | 15:50 |
avass | clarkb: yup just saw that | 15:51 |
avass | clarkb, corvus: could even do include_role: ensure-docker, tasks_from: docker-setup and then role: use-buildset-registry | 15:52 |
clarkb | avass: ya though telling ensure-docker to use the distro package seems simpler? | 15:52 |
clarkb | it will notice the package is already there and move on | 15:52 |
corvus | avass: true, but we've lost the "simplicity" aspect of this | 15:52 |
corvus | i don't think it's worth changing anything if we're not removing the ensure-docker role. i don't really care where the packages come from :) | 15:53 |
fungi | we do something similar with ensure-pip i think | 15:53 |
fungi | use distro package by default, install with get_pip.py as an alternative | 15:53 |
avass | clarkb: heh, but think of the seconds you can shave off from the job ;) | 15:53 |
corvus | i just saw a potential opportunity to replace a role with a bindep entry, but if that's not possible, let's not worry about it. | 15:53 |
clarkb | corvus: ok, I can modify the change to have a better commit message and flip the order with its parent so they can both merge | 15:54 |
clarkb | want to make sure the tests pass though | 15:54 |
corvus | ++ | 15:54 |
fungi | if the ensure-docker role skipped installing when docker was already present, that would presumably sidestep a lot of this? | 15:54 |
clarkb | fungi: that would be a non backward compatible change though | 15:54 |
fungi | right | 15:54 |
clarkb | (granted the current situation doesn't work) | 15:54 |
clarkb | but probably better for things to fail and people to evaluate their options than to change the assumption they have worked with | 15:55 |
fungi | however there could be a var to tell ensure-docker to only install when there wasn't already a docker command in the path or something | 15:55 |
*** ykarel is now known as ykarel|away | 15:56 | |
fungi | that said, our usual approach with default behavior for ensure-.* roles is that they should make sure the tool is installed, by installing only if necessary | 15:57 |
clarkb | ya I think the assumption was that distro docker was unusable (until more recently) which meant if you didn't ask for it then we'd force the other thing | 15:58 |
*** jcapitao has quit IRC | 16:10 | |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: Stop installing docker via bindep https://review.opendev.org/c/zuul/nodepool/+/787220 | 16:30 |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: nodepool-functional : ignore errors copying logs https://review.opendev.org/c/zuul/nodepool/+/787065 | 16:31 |
clarkb | enough jobs have passed that I'm confident that ^ should work now | 16:31 |
clarkb | that is the better commit message + commit reordering stack | 16:31 |
corvus | clarkb: +3 both of those | 16:33 |
clarkb | cool | 16:33 |
*** sam_wan has quit IRC | 16:34 | |
*** ykarel|away has quit IRC | 16:39 | |
*** hamalq has joined #zuul | 16:45 | |
*** hamalq has quit IRC | 16:47 | |
*** hamalq has joined #zuul | 16:48 | |
clarkb | heh py36 failed after like 3 runs | 16:49 |
clarkb | I don't think it is related to this change | 16:50 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Pseudo-shard unique project names in keystore https://review.opendev.org/c/zuul/zuul/+/786983 | 16:53 |
corvus | tobiash, swest, fungi: i think the next steps for the keystore-in-zk work are to review https://review.opendev.org/786774 and https://review.opendev.org/786983 -- ignore the previous test failures (one docker related, one linter), they are ready for review. | 16:55 |
corvus | clarkb: ^ and that last change is the optional "do some extra sharding" change we talked about yesterday | 16:56 |
*** rpittau is now known as rpittau|afk | 16:59 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: WIP: Roles to snapshot and cleanup image builds for digitalocean https://review.opendev.org/c/zuul/zuul-jobs/+/786757 | 17:02 |
avass | there's no way to get an artifact produced by the same change but from the pipeline that merged the change other than something like: https://review.opendev.org/c/zuul/zuul-jobs/+/786757/3/roles/promote-image-digitalocean/tasks/main.yaml right? | 17:03 |
avass | (that queries the zuul api) | 17:05 |
corvus | avass: see download-artifact role | 17:07 |
corvus | avass: it uses the api; you might be able to reuse it | 17:08 |
*** jpena is now known as jpena|off | 17:08 | |
avass | corvus: I don't want to download the image, just rename it | 17:08 |
corvus | avass: good that's more efficient :) you might just double check that role for details to see if it does anything differently | 17:09 |
corvus | avass: but yes, i think that's a good approach; one other approach you might consider: | 17:09 |
corvus | avass: is what we do with docker images which is rather than using artifacts, we use the remote system to store the mapping | 17:10 |
corvus | avass: ie, we push images with complex tags derived from the change number, then we rename those tags | 17:10 |
corvus | avass: if you could do something similar, or maybe add image metadata based on change number, etc, then you could avoid using zuul api | 17:10 |
corvus | (but zuul api is fine and reliable; we use that for docs promotion) | 17:11 |
avass | corvus: yeah I think i've seen those. but that could cause problems where the image is created twice (somehow) | 17:11 |
avass | since docker images overwrite while machine images in digital ocean don't | 17:11 |
corvus | ack; makes sense | 17:11 |
avass | so I sort of need the exact image id | 17:11 |
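A hedged sketch of the API-query approach avass describes: look up the gate build of the same change via Zuul's builds API and read the image id back out of artifact metadata. The tenant, job name, artifact position, and metadata key here are assumptions for illustration:

```yaml
# Find this change's gate build and recover the image id it recorded.
- name: Query the Zuul API for the gate build of this change
  uri:
    url: "https://zuul.example.org/api/tenant/example/builds?change={{ zuul.change }}&patchset={{ zuul.patchset }}&pipeline=gate&job_name=build-digitalocean-image"
    return_content: true
  register: gate_builds

- name: Extract the image id from the newest matching build's artifacts
  set_fact:
    image_id: "{{ gate_builds.json[0].artifacts[0].metadata.image_id }}"
```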
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 17:19 |
clarkb | that must be the leaked conflict | 17:20 |
fungi | aha! | 17:21 |
fungi | yeah maybe the package removal doesn't clean up the named pipe? | 17:21 |
avass | wow | 17:21 |
clarkb | I left a review on https://review.opendev.org/c/zuul/zuul-jobs/+/787271 based on what I learned debugging the nodepool side of things | 17:33 |
*** goneri has joined #zuul | 17:40 | |
corvus | goneri: clarkb was just saying you may have found the missing piece of the docker puzzle :) | 17:42 |
clarkb | also left a comment on the change. I think we need to be more careful about when we stop the unit | 17:42 |
corvus | might be interesting to do a no-op nodepool change that depends on 787271 before 787220 lands | 17:43 |
clarkb | in particular if that unit doesn't already exist wouldn't stopping it be an error? | 17:43 |
corvus | (i'm busy with other things right now; if anyone wants to try that be my guest) | 17:43 |
goneri | Hi! pabelanger told me to come here :-) | 17:53 |
clarkb | goneri: hello, I left a comment on https://review.opendev.org/c/zuul/zuul-jobs/+/787271 you may want to check | 17:59 |
goneri | I actually just answered. | 18:00 |
clarkb | goneri: if it wasn't installed previously then what started that unit? | 18:01 |
clarkb | since it looks like in the ara that you skip the restart even | 18:02 |
goneri | I think it's a post-install script from the rpm that starts the socket too early. | 18:02 |
clarkb | huh I thought rpms generally didn't start services. But I guess this is the upstream package so they can do what they want | 18:04 |
clarkb | in the ubuntu case it seems to only be an issue if docker.io was there previously (though possibly also if docker-ce was there previously) | 18:05 |
clarkb | but I guess if we're doing this late enough that the unit has started and we subsequently restart docker itself then this should be fine? | 18:05 |
goneri | on Debian/Ubuntu, (sadly) the services are started during the installation. | 18:05 |
goneri | yup, this should address both cases. https://github.com/kata-containers/tests/issues/3103 is with Ubuntu actually. | 18:06 |
goneri | https://github.com/docker/docker-ce-packaging/blob/master/rpm/SPECS/docker-ce.spec#L115 | 18:07 |
*** iurygregory has quit IRC | 18:15 | |
openstackgerrit | Merged zuul/nodepool master: Stop installing docker via bindep https://review.opendev.org/c/zuul/nodepool/+/787220 | 18:35 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: WIP: Roles to snapshot and cleanup image builds for digitalocean https://review.opendev.org/c/zuul/zuul-jobs/+/786757 | 18:35 |
*** bhavikdbavishi has quit IRC | 18:46 | |
openstackgerrit | Merged zuul/nodepool master: nodepool-functional : ignore errors copying logs https://review.opendev.org/c/zuul/nodepool/+/787065 | 18:47 |
*** ajitha has quit IRC | 18:56 | |
*** vishalmanchanda has quit IRC | 19:09 | |
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 19:15 |
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 19:50 |
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 20:44 |
*** odyssey4me has quit IRC | 21:05 | |
*** odyssey4me has joined #zuul | 21:06 | |
goneri | I don't think the last CI problems are coming from my PR: https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 21:18 |
mordred | goneri: I agree, https://zuul.opendev.org/t/zuul/build/ade3b9c4127546f1873cecac8d3a434f doesn't seem particularly related | 21:24 |
ianw | goneri: i think we started running a periodic job for all zuul-jobs, we could/should look back and see where it started failing | 21:25 |
ianw | i'd say here : http://lists.zuul-ci.org/pipermail/zuul-jobs-failures/2021-April/000028.html | 21:26 |
mordred | ianw: kube 1.21: Remove deprecated --generator, --replicas, --service-generator, --service-overrides, --schedule from kubectl run Deprecate --serviceaccount, --hostport, --requests, --limits in kubectl run | 21:30 |
mordred | sorry, 1.20 actually | 21:30 |
mordred | https://kubernetes.io/docs/setup/release/notes/ | 21:30 |
mordred | https://github.com/kubernetes/kubernetes/pull/99732 | 21:31 |
mordred | according to the PR, the generator flag was a no-op | 21:32 |
mordred | https://github.com/kubernetes/kubernetes/pull/99732/files#diff-bbbdda93ca43398e7c554a57f7934e126ed841d46078afe8d601edc2f695b4f9L177-L178 | 21:32 |
ianw | unfortunately i guess it means squishing changes into the docker fix | 21:33 |
mordred | yeah | 21:35 |
mordred | run-pod/v1 <-- it seems that is the only generator that does anything - so it seems like, given our use, removing the parameter is appropriate | 21:35 |
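A minimal sketch of the fix mordred proposes, assuming the role launches its test pod with a command task; pod and image names are placeholders:

```yaml
# kubectl >= 1.20 removes the no-op --generator flag, so just drop it.
# old form: kubectl run --generator=run-pod/v1 test-pod --image=busybox ...
- name: Start a test pod without the deprecated --generator flag
  command: kubectl run test-pod --image=docker.io/library/busybox --command -- sleep infinity
```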
*** fsvsbs has joined #zuul | 21:36 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Remove use of --generator=run-pod/v1 https://review.opendev.org/c/zuul/zuul-jobs/+/787291 | 21:43 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Remove use of --generator=run-pod/v1 https://review.opendev.org/c/zuul/zuul-jobs/+/787291 | 21:43 |
mordred | ianw, goneri : ^^ I made that on top of the docker change - it should turn things green. if it does, we can squash | 21:44 |
ianw | mordred: LGTM, thanks -- i can keep an eye on it and do the squash today | 21:44 |
mordred | I'm curious if there is a way we could have caught this differently. I can't think of a good way | 21:45 |
ianw | i guess modulo noticing deprecated arg warnings, just more scrutiny of the periodic job failure report | 21:47 |
mordred | yeah. like - failing on deprecated args would have just shifted when the job failed - so not really much of a thing | 21:54 |
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 22:00 |
ianw | goneri: ^ we don't actually want that depends-on the other change; we want to test mordred's change applied on top of yours, and then to get through gate we'll need to squash them | 22:02 |
goneri | ok, sorry. | 22:03 |
ianw | it's not always exceedingly clear in gerrit, but you can see "Relation chain" on the right hand side | 22:03 |
openstackgerrit | Gonéri Le Bouder proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 22:04 |
mordred | yeah - if you want to re-push yours up without that depends on (and while you're at it, go ahead and take out the AWS line, since it's no longer true) - and I can rebase mine | 22:04 |
*** bodgix has quit IRC | 22:07 | |
*** bodgix_ has joined #zuul | 22:07 | |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: Remove use of --generator=run-pod/v1 https://review.opendev.org/c/zuul/zuul-jobs/+/787291 | 22:30 |
mordred | ianw, goneri: my patch went green on the k8s tests but broke on openshift - turns out I shouldn't have removed the arg from oc - just from kubectl *sigh* | 22:30 |
mordred | tristanC: ^^ you know more things about openshift - kubectl has deprecated the generator argument but oc has not - should we assume oc will also deprecate it and should we do anything to future-proof ourselves? | 22:32 |
*** holser has quit IRC | 22:33 | |
clarkb | mordred: does your change need to be rebased under the other change? | 22:35 |
mordred | clarkb: it's sitting on top of it | 22:39 |
mordred | clarkb: we'll need to squash the two to land it | 22:39 |
*** tosky has quit IRC | 23:17 | |
*** nils has quit IRC | 23:25 | |
*** goneri has quit IRC | 23:27 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Remove use of --generator=run-pod/v1 for oc https://review.opendev.org/c/zuul/zuul-jobs/+/787300 | 23:41 |
tristanC | mordred: that sounds good, let's see with ^ | 23:41 |
mordred | I'm going to squash the two changes, the second is now green | 23:43 |
openstackgerrit | Monty Taylor proposed zuul/zuul-jobs master: ensure-docker: ensure docker.socket is stopped https://review.opendev.org/c/zuul/zuul-jobs/+/787271 | 23:45 |
mordred | tristanC, clarkb, corvus : ^^ | 23:45 |
mordred | that's the squashed version of goneri's and my change. it should be green and good to go | 23:46 |
clarkb | one thing I noticed in a recent ps is that we use a handler to start and stop the docker service | 23:46 |
clarkb | are ansible handlers processed in order? | 23:46 |
clarkb | otherwise we could start then stop | 23:46 |
openstackgerrit | James E. Blair proposed zuul/zuul master: Lock node requests in fake nodepool https://review.opendev.org/c/zuul/zuul/+/787301 | 23:48 |
ianw | corvus: thanks! i was just trying to figure that out :) | 23:49 |
corvus | ianw: oh you saw that race too? :) that one hit me here: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_332/786774/3/check/zuul-tox-py36/33292a2/testr_results.html | 23:50 |
corvus | presents as a failed nodepool request in the scheduler log, which should never happen with a fake nodepool (unless we intentionally fail) | 23:51 |
ianw | nodepool.exceptions.ZKLockException: Did not get lock on /nodepool/nodes/0000000000/lock | 23:51 |
ianw | https://zuul.opendev.org/t/zuul/build/8709fbeed01847949faeda8b7b4d8a59 | 23:51 |
ianw | is what i was looking at | 23:51 |
ianw | well i'm presuming it's the same thing as it involves the word "lock" :) | 23:52 |
corvus | ianw: could be related, but not 100% sure; my thing fixes a lock on the requests; that could cascade to another failure. that error is at least a little different. | 23:53 |
corvus | ianw: oh that's nodepool; definitely not related | 23:54 |
ianw | ok then, back to the drawing board :) | 23:55 |
corvus | clarkb: handlers are run in the order *defined* (!), we define docker.socket stop before docker.socket and docker restart | 23:55 |
corvus | so i think that's gtg | 23:55 |
clarkb | TIL | 23:56 |
corvus | me too | 23:56 |
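A minimal sketch of the ordering corvus describes, with illustrative handler names: because handlers run in the order they are defined, listing the socket stop first guarantees it fires before the service restart regardless of which order the tasks notify them in:

```yaml
handlers:
  # Defined first, so it always runs before the restart below.
  - name: Stop docker.socket
    service:
      name: docker.socket
      state: stopped

  - name: Restart docker
    service:
      name: docker
      state: restarted
```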
corvus | clarkb: would you mind giving https://review.opendev.org/786983 at least a quick review on the idea? | 23:57 |
clarkb | ya the idea of using the "org" prefix makes sense | 23:58 |
corvus | kk; hopefully we can merge that stack tomorrow | 23:59 |