@iwienand:matrix.org | zuul-jobs-test-registry-buildset-registry-k8s tests on jammy remain a bit of a mystery | 01:07 |
---|---|---|
@iwienand:matrix.org | for reference | 01:07 |
@iwienand:matrix.org | PASSING - https://zuul.opendev.org/t/zuul/build/cebd86553be6415a9081683f69b7f4ab/logs - https://db6ae3143f05c54d635d-3d04a8ba77fbf8fa7387964c7149df37.ssl.cf5.rackcdn.com/861560/5/check/zuul-jobs-test-registry-buildset-registry-k8s-docker/cebd865/docker/ | 01:08 |
@iwienand:matrix.org | FAILING - https://zuul.opendev.org/t/zuul/build/a2d27399757048aa929ab358e511f2e6/logs - https://1817fe6a260e6956efeb-c205db93b3c29bdefc78a96a38b9c4c7.ssl.cf5.rackcdn.com/863582/4/check/zuul-jobs-test-registry-buildset-registry-k8s-docker/a2d2739/docker/ | 01:09 |
@iwienand:matrix.org | the failing one doesn't have any coredns containers -- but i also can't see any obvious error messages | 01:11 |
@iwienand:matrix.org | it might be logging stuff to the journal that we don't seem to collect | 01:11 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863781: test-registry-post: collect k8s logs https://review.opendev.org/c/zuul/zuul-jobs/+/863781 | 01:45 | |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: enable-kuburnetes: 22.04 updates https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 05:23 | |
-@gerrit:opendev.org- Dr. Jens Harbott proposed: [zuul/nodepool] 863812: Add username to detailed node list output https://review.opendev.org/c/zuul/nodepool/+/863812 | 06:44 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 862978: Add playbook semaphores https://review.opendev.org/c/zuul/zuul/+/862978 | 15:20 | |
@jpew:matrix.org | With Zuul `7.0.0` I see a lot of `opentelemetry.attributes - WARNING - Invalid type Project for attribute value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types` in the scheduler logs; am I missing something? | 15:59 |
@jim:acmegating.com | jpew: it's likely an unimportant bug as we work on adding otel support (which is a work in progress) you can ignore them | 16:02 |
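The warning jpew quotes comes from OpenTelemetry's attribute validation, which only accepts primitive values (bool, str, bytes, int, float) or sequences of them as span attribute values. A rough stdlib-only sketch of that check; the `Project` class and `is_valid_attribute_value` function here are illustrative stand-ins, not Zuul's or opentelemetry's actual code:

```python
# Illustrative sketch of OpenTelemetry-style attribute validation.
# `Project` stands in for the Zuul model object named in the warning.

VALID_TYPES = (bool, str, bytes, int, float)


class Project:
    def __init__(self, name):
        self.name = name


def is_valid_attribute_value(value):
    """Mimic the check behind the 'Invalid type ... for attribute value' warning."""
    if isinstance(value, VALID_TYPES):
        return True
    if isinstance(value, (list, tuple)):
        return all(isinstance(v, VALID_TYPES) for v in value)
    return False


# Passing the object itself would trigger the warning; passing a
# primitive field (e.g. its name) would not.
project = Project("zuul/zuul-jobs")
print(is_valid_attribute_value(project))       # False -> warning logged
print(is_valid_attribute_value(project.name))  # True
```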
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 862978: Add playbook semaphores https://review.opendev.org/c/zuul/zuul/+/862978 | 16:41 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 863904: Add an upgrading document https://review.opendev.org/c/zuul/zuul/+/863904 | 17:01 | |
@q:fricklercloud.de | Does the zuul community have some opinion on the use of storyboard for issue tracking? I checked some recent zuul stories and there wasn't any feedback there other than by the authors themselves. cf. https://lists.opendev.org/pipermail/service-announce/2022-November/000048.html and the service-discuss thread mentioned there, please send feedback to the discuss ML if you have some | 18:26 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: | 18:29 | |
- [zuul/zuul] 863104: Initialize tracing module in model tests https://review.opendev.org/c/zuul/zuul/+/863104 | ||
- [zuul/zuul] 862977: Remove legacy semaphore handling https://review.opendev.org/c/zuul/zuul/+/862977 | ||
@clarkb:matrix.org | corvus: on my failing py311 job runs I see https://paste.opendev.org/show/bW4jF8qZuJLqdNL053Qq/ filling the job-output log (it's like 1.5GB). Does that look like an actual issue or is that a side effect of zookeeper being sad? | 19:31 |
@clarkb:matrix.org | https://zuul.opendev.org/t/zuul/build/59f17c97afba43f593156d61445c9364 is the build | 19:32 |
@jim:acmegating.com | Clark: pretty sure that's a sad zk slow node | 19:32 |
@jim:acmegating.com | (and it's throwing off the test cleanup/shutdown, so it cascades to later tests) | 19:33 |
@clarkb:matrix.org | the zk audit log in that job is empty. Do you know if there is some sort of logging we could grab from zk that might confirm that? something showing timeouts maybe? | 19:36 |
@jim:acmegating.com | Clark: oh i don't necessarily think it's a zk error, sorry. it's a possibility, but more likely something zuul side. | 19:37 |
@jim:acmegating.com | (like maybe a zk operation timeout or something) | 19:38 |
@jim:acmegating.com | anyway, as far as getting zk logs to confirm, maybe we should think about running zk in a container.... | 19:38 |
@jim:acmegating.com | (basically, just using test-setup-docker straight up) | 19:38 |
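Running ZooKeeper in a container as suggested would make its server log trivially collectable as a job artifact. A rough sketch of the idea (image tag and file names are assumptions, not what test-setup-docker actually does):

```shell
# Hypothetical sketch: run ZooKeeper for tests in a container so its
# logs can be grabbed afterwards. Image tag and options are assumptions.
docker run -d --name test-zk -p 2181:2181 zookeeper:3.8

# ... run the test suite against localhost:2181 ...

# Collect the server log for the job artifacts, then clean up.
docker logs test-zk > zk.txt
docker rm -f test-zk
```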
@clarkb:matrix.org | oh I see zk is functional, but slower than zuul can deal with | 19:38 |
@clarkb:matrix.org | or at least that is possible | 19:38 |
@jim:acmegating.com | yeah, something like that's my guess | 19:38 |
@clarkb:matrix.org | the tox task eventually ends with `ubuntu-jammy | MODULE FAILURE: Killed` which is interesting | 19:41 |
@clarkb:matrix.org | did it OOM maybe? | 19:42 |
@clarkb:matrix.org | but that is why the job didn't timeout | 19:42 |
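A `MODULE FAILURE: Killed` would be consistent with the OOM killer ending the tox process; on a held node the kernel log would confirm or rule that out, e.g.:

```shell
# If the tox process was OOM-killed, the kernel log records it:
dmesg | grep -i -e 'out of memory' -e 'oom-killer' -e 'killed process'

# or, on a systemd host:
journalctl -k | grep -i oom
```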
@iwienand:matrix.org | zuul-maint: could i get another eye on https://review.opendev.org/c/zuul/zuul-jobs/+/863582 to make sure we're ok with making these kubernetes jobs non-voting on jammy for now. it seems pointless to pin them to focal and just ignore that they don't work on jammy -- we're not going to get anywhere with that. but i'd really like to get the zuul-jobs gate a bit happier. i started looking at https://review.opendev.org/c/zuul/zuul-jobs/+/863810/ and maybe it's just finding new package sources ... but there is quite a bit going on in setting up minikube and i'm not sure how much time i have to get it all working | 19:58 |
@jim:acmegating.com | ianw: i think those are used by the zuul-operator repo, which currently runs on bionic. i don't object to dropping those tests, but maybe pinning to bionic, or creating multi-platform (bionic, focal, jammy) jobs and then just disabling jammy are worth considering? | 20:16 |
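The multi-platform option corvus describes might look roughly like this in zuul-jobs config; the job names, parent, and labels below are illustrative, not the actual job definitions:

```yaml
# Illustrative sketch only: per-platform variants of the k8s test job,
# with the jammy variant made non-voting while it is broken.
- job:
    name: zuul-jobs-test-registry-buildset-registry-k8s-docker-focal
    parent: zuul-jobs-test-registry-buildset-registry-k8s-docker
    nodeset:
      nodes:
        - name: primary
          label: ubuntu-focal

- job:
    name: zuul-jobs-test-registry-buildset-registry-k8s-docker-jammy
    parent: zuul-jobs-test-registry-buildset-registry-k8s-docker
    voting: false
    nodeset:
      nodes:
        - name: primary
          label: ubuntu-jammy
```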
@iwienand:matrix.org | ok; do you know off-hand if zuul-operator is using bionic for a reason, or just it hasn't been updated? | 20:18 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: [wip] enable-kuburnetes: debugging 22.04 https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 20:21 | |
@iwienand:matrix.org | istm the problem on 22.04 is networking -> https://5ee8a394869b896f2ffd-b765094214961b8b34fe718f08477de4.ssl.cf1.rackcdn.com/863781/1/check/zuul-jobs-test-registry-buildset-registry-k8s-docker/c8eabb3/kubelet/kubelet.txt | 20:21 |
@jim:acmegating.com | ianw: i'm not sure, actually. i just did a quick look and i don't know of a reason not to at least use focal. | 20:21 |
@iwienand:matrix.org | > <@jim:acmegating.com> ianw: i'm not sure, actually. i just did a quick look and i don't know of a reason not to at least use focal. | 20:22 |
ok, i can try bumping those. i think focal should work; that's what it was using in zuul-jobs | ||
@iwienand:matrix.org | ```network plugin is not ready: cni config uninitialized``` | 20:22 |
@iwienand:matrix.org | i'm not really sure what networking setup was/is being done on < 22.04 -- it seems to "just work" there :/ | 20:23 |
@jim:acmegating.com | i think k8s itself changed some networking stuff in recent versions, just to throw extra variables in | 20:24 |
@iwienand:matrix.org | i think the ensure-kubernetes role is completing and passing tests, but isn't actually making a kubernetes that works on jammy. so i think it might be useful if i can figure out a few extra testing bits so we at least fail the job, and have something concrete to debug | 20:25 |
@jim:acmegating.com | ianw: a simple test would be a k8s "Job" that ran a pod with a shell command that exited 0. easy to wait and ensure job completion with a single kubectl command. | 20:27 |
@jim:acmegating.com | `kubectl wait --for=condition=complete job/JOBNAME --timeout=600s` would be a good functional test. but also, now that i think about it, if it's failing as early as you say, then maybe all we need is `kubectl version` ? | 20:36 |
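The Job-based functional test corvus describes could look something like this; the job name and image are illustrative:

```shell
# Hypothetical functional test: run a trivial k8s Job and wait for it.
# Only succeeds if the Job's pod actually scheduled and ran to completion.
kubectl create job test-job --image=busybox -- /bin/true
kubectl wait --for=condition=complete job/test-job --timeout=600s

# The even simpler smoke test mentioned above:
kubectl version
```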
@iwienand:matrix.org | https://zuul.opendev.org/t/zuul/build/44ab1102c080493e8df74c355892b434/console "kubectl get pods" shows a test pod as "pending" | 21:08 |
@iwienand:matrix.org | hrm, we do dump the pod status -> https://zuul.opendev.org/t/zuul/build/44ab1102c080493e8df74c355892b434/log/pods/test.txt | 21:12 |
@iwienand:matrix.org | ```0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }.``` | 21:12 |
@iwienand:matrix.org | i hate it when i have untolerated taint | 21:13 |
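The `node.kubernetes.io/not-ready` taint usually means the node's CNI never initialized, matching the kubelet error above; the taint and the node's Ready condition can be inspected directly:

```shell
# Show the node taints keeping the pod Pending.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'

# The Ready condition's message typically repeats the CNI error.
kubectl describe nodes | grep -i -A5 -e 'taints' -e 'ready'
```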
@jim:acmegating.com | ah we do make a test pod... then we can probably just change that to actually do some status verification (or change that to do the k8s job idea) | 21:15 |
@iwienand:matrix.org | it seems we make a test pod for cri-o path, but not docker? | 21:18 |
@iwienand:matrix.org | https://opendev.org/zuul/zuul-jobs/src/branch/master/test-playbooks/ensure-kubernetes/crio.yaml | 21:18 |
@iwienand:matrix.org | versus https://opendev.org/zuul/zuul-jobs/src/branch/master/test-playbooks/ensure-kubernetes/docker.yaml | 21:18 |
@jim:acmegating.com | seems oversightish. like everything after l16 could go in a common tasklist | 21:20 |
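Factoring the shared verification out, both test playbooks could include one common tasklist, roughly like this (the tasklist file name is hypothetical):

```yaml
# Illustrative sketch: crio.yaml and docker.yaml each keep their own
# setup, then include a shared tasklist for the post-setup checks.
- hosts: all
  tasks:
    - name: Run common kubernetes sanity checks
      include_tasks: common-test-tasks.yaml
```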
-@gerrit:opendev.org- Ian Wienand proposed: | 21:30 | |
- [zuul/zuul-jobs] 863810: [wip] enable-kubernetes: check pod is actually running https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | ||
- [zuul/zuul-jobs] 863940: ensure-kubernetes: move testing into common path https://review.opendev.org/c/zuul/zuul-jobs/+/863940 | ||
@michael_kelly_anet:matrix.org | bump: any other reviewers possible for https://review.opendev.org/c/zuul/zuul-operator/+/853592 ? | 22:00 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: [wip] enable-kubernetes: check pod is actually running https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 22:26 | |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-operator] 863948: zuul-operator-funcitonal-k8s: update to Focal nodes https://review.opendev.org/c/zuul/zuul-operator/+/863948 | 22:36 | |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: [wip] enable-kubernetes: check pod is actually running https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 23:19 | |
@iwienand:matrix.org | ```Ubuntu 20.04 not supported due to no known source of podman packages.``` https://zuul.opendev.org/t/zuul/build/a8d31a103c464318afdcf8f9d7f04e4a/console#1/0/14/ubuntu-focal | 23:21 |
@iwienand:matrix.org | that at least explains why zuul-operator has itself pinned to bionic, i guess | 23:21 |
@michael_kelly_anet:matrix.org | :P | 23:29 |
@michael_kelly_anet:matrix.org | Sounds like it's supported. | 23:30 |
@michael_kelly_anet:matrix.org | https://www.atlantic.net/dedicated-server-hosting/how-to-install-and-use-podman-on-ubuntu-20-04/ | 23:30 |
@clarkb:matrix.org | ya I think that obs repo didn't have focal packages when we first switched stuff to focal. But it does now | 23:34 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: [wip] enable-kubernetes: check pod is actually running https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 23:42 | |
@iwienand:matrix.org | yeah, ensure-podman on focal would probably get the zuul-operator jobs to focal. fixing ensure-kubernetes for jammy would hopefully allow a full jump | 23:43 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 863810: [wip] enable-kubernetes: check pod is actually running https://review.opendev.org/c/zuul/zuul-jobs/+/863810 | 23:53 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!