-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 851163: emit-job-header: noqa on error ignore https://review.opendev.org/c/zuul/zuul-jobs/+/851163 | 00:06 | |
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 851014: linters: fix spaces between filters https://review.opendev.org/c/zuul/zuul-jobs/+/851014 | 00:07 | |
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: | 00:22 | |
- [zuul/zuul-jobs] 851015: linters: add names to blocks https://review.opendev.org/c/zuul/zuul-jobs/+/851015 | ||
- [zuul/zuul-jobs] 851017: linters: update to ansible-lint 6 https://review.opendev.org/c/zuul/zuul-jobs/+/851017 | ||
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: | 00:23 | |
- [zuul/zuul-jobs] 851164: ansible-lint: disable progressive mode https://review.opendev.org/c/zuul/zuul-jobs/+/851164 | ||
- [zuul/zuul-jobs] 851263: ensure-kubernetes: pull cri-dockerd systemd from tag https://review.opendev.org/c/zuul/zuul-jobs/+/851263 | ||
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288 | 03:55 | |
-@gerrit:opendev.org- Ian Wienand proposed: | 04:08 | |
- [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288 | ||
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289 | ||
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334 | 04:19 | |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334 | 04:44 | |
@tony.breeds:matrix.org | I'm sure I'm not the first person to ask this but ... Given a build is there any way to get the tasks/roles/plays that were run? Ultimately my aim is to re-run *most* of a job that failed and debug it locally. I know it's somewhat easy to iterate on a job in zuul but I still feel like something like this would be potentially quicker? | 06:27 |
---|---|---|
-@gerrit:opendev.org- Ian Wienand proposed: | 06:28 | |
- [zuul/zuul-jobs] 851288: linters: standardise on newline at end of file https://review.opendev.org/c/zuul/zuul-jobs/+/851288 | ||
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289 | ||
- [zuul/zuul-jobs] 851343: Drop py27 tox testing https://review.opendev.org/c/zuul/zuul-jobs/+/851343 | ||
- [zuul/zuul-jobs] 851344: test-requirements: drop 3.5 dependencies https://review.opendev.org/c/zuul/zuul-jobs/+/851344 | ||
-@gerrit:opendev.org- Ian Wienand proposed: | 06:37 | |
- [zuul/zuul-jobs] 851344: test-requirements: drop 3.5 dependencies https://review.opendev.org/c/zuul/zuul-jobs/+/851344 | ||
- [zuul/zuul-jobs] 851334: test-requirements: bump to Ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/851334 | ||
- [zuul/zuul-jobs] 851289: linters: use Ansible 5 for ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/851289 | ||
@iwienand:matrix.org | zuul-main: https://review.opendev.org/q/topic:ansible-lint-update-6 , previously reviewed and thank you for that 😀 has a few more changes from cleaning up and i figured might as well get o-z-j , project-config and base-jobs on the same level too. | 07:47 |
@westphahl:matrix.org | corvus Clark yeah I think we should go with 851255 and see if that has any significant performance impact for us. If that should be the case we can still look at 851256 | 12:37 |
@fungicide:matrix.org | > <@tony.breeds:matrix.org> I'm sure I'm not the first person to ask this but ... Given a build is there any way to get the tasks/roles/plays that were run? Ultimately my aim is to re-run *most* of a job that failed and debug it locally. I know it's somewhat easy to iterate on a job in zuul but I still feel like something like this would be potentially quicker? | 12:38 |
the build console view is a rendering of the job-output.json preserved along with your build logs, and has all the playbook and task details | ||
@fungicide:matrix.org | whether that's sufficient to be able to re-run the essence of a job is another question altogether though. there's been off-again/on-again efforts for a "zuul-runner" tool to query the zuul api for a facsimile of the bits you'd need to recreate a given build | 12:44 |
@fungicide:matrix.org | but actually putting that to use gets pretty tough as soon as you exit the realm of trivial single-node jobs with no dependencies and no external state | 12:45 |
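To make the job-output.json suggestion concrete, here is a minimal sketch that lists the plays, roles and tasks recorded for a build. The JSON layout used here (a list of playbooks, each holding `plays`, each holding `tasks` with optional `role` entries) and the sample data are assumptions for illustration; check them against a real job-output.json from your build logs before relying on the field names.

```python
import json

# Hypothetical, trimmed sample in the layout assumed for job-output.json:
# a list of playbooks, each with plays, each with tasks.
SAMPLE = """
[
  {
    "playbook": "playbooks/run.yaml",
    "phase": "run",
    "plays": [
      {
        "play": {"name": "all"},
        "tasks": [
          {"task": {"name": "Install packages"}, "role": {"name": "ensure-tox"}},
          {"task": {"name": "Run tests"}}
        ]
      }
    ]
  }
]
"""


def list_tasks(playbooks):
    """Return (phase, playbook, play, role, task) tuples in recorded order."""
    rows = []
    for pb in playbooks:
        for play in pb.get("plays", []):
            play_name = play.get("play", {}).get("name", "")
            for task in play.get("tasks", []):
                rows.append((
                    pb.get("phase", ""),
                    pb.get("playbook", ""),
                    play_name,
                    # Tasks that do not come from a role have no "role" key.
                    task.get("role", {}).get("name", ""),
                    task.get("task", {}).get("name", ""),
                ))
    return rows


if __name__ == "__main__":
    for row in list_tasks(json.loads(SAMPLE)):
        print(" | ".join(row))
```

This only recovers what ran and in which order; as noted above, it does not recreate secrets, dependencies or multi-node state.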
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 851414: Revert cri-dockerd changes https://review.opendev.org/c/zuul/zuul-jobs/+/851414 | 14:22 | |
@jim:acmegating.com | Clark: ianw ^ that broke nodepool -- error at https://zuul.opendev.org/t/zuul/build/9ae631c37a384b4bb63515c6c5f04a00 | 14:23 |
@clarkb:matrix.org | corvus: I don't have any objections to reverting but the existing testing isn't passing against the revert due to the issues the changes were initially addressing | 14:43 |
@jim:acmegating.com | might be a good force-merge then? | 14:43 |
@jim:acmegating.com | or we could pin the minikube version in the role (which is what nodepool is doing to continue testing) | 14:44 |
@clarkb:matrix.org | corvus: or maybe nodepool will work if unpinning minikube? (I wonder if the fixes need new minikube) but ya I think pinning in the role may be a good workaround and then roll forward from there | 14:45 |
@jim:acmegating.com | Clark: any chance you can do that? i'm knee-deep in other changes to zuul-jobs | 14:46 |
@clarkb:matrix.org | I can take a look after my meeting. Might be an hour or so | 14:47 |
@jim:acmegating.com | i think we should consider this break somewhat urgent. i'd like to do a force-merge to correct it then. | 14:48 |
@clarkb:matrix.org | ok, just keep in mind any other zuul-jobs updates that will trigger those jobs will no longer be landable through normal gating | 14:48 |
@jim:acmegating.com | yeah, we're going back to status-quo before ianw's stack. i think that's fine since the stack introduced a regression. | 14:49 |
@clarkb:matrix.org | Yup. Just mentioning it as you indicated other zuul-jobs work is in progress | 14:49 |
@jim:acmegating.com | oh yeah, the other work is to try to patch the testing hole that let this through | 14:50 |
@clarkb:matrix.org | I've given it a +2 if you want to force merge it | 14:50 |
@jim:acmegating.com | cool, thanks | 14:51 |
-@gerrit:opendev.org- corvus.admin merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851414: Revert cri-dockerd changes https://review.opendev.org/c/zuul/zuul-jobs/+/851414 | 15:02 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 15:14 | |
- [zuul/zuul-jobs] 851418: Add jammy testing https://review.opendev.org/c/zuul/zuul-jobs/+/851418 | ||
- [zuul/zuul-jobs] 851419: Sort supported platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851419 | ||
- [zuul/zuul-jobs] 851420: Support subsets of platforms in update-test-platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851420 | ||
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421 | ||
@jim:acmegating.com | Clark: ianw ^ I think that series should get us the missing test coverage | 15:14 |
@clarkb:matrix.org | corvus: thanks I'm going to grab breakfast now that my meeting is over and then dig into that | 15:40 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 15:40 | |
- [zuul/zuul-jobs] 851418: Add jammy testing https://review.opendev.org/c/zuul/zuul-jobs/+/851418 | ||
- [zuul/zuul-jobs] 851419: Sort supported platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851419 | ||
- [zuul/zuul-jobs] 851420: Support subsets of platforms in update-test-platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851420 | ||
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421 | ||
- [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | ||
@jim:acmegating.com | that's a rebase on current master, plus a revert at the end of the stack so we can check the error. | 15:40 |
@jim:acmegating.com | Clark: cool, thanks. hopefully you can iterate on the last change in the stack and if it passes, we can just squash it with the next-to-last. the ones above it pass, so no logistical issues there. | 15:41 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 | 16:44 | |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | 16:44 | |
@clarkb:matrix.org | corvus: ^ I think maybe that will fix it. I'll squash everything together if testing looks good | 16:44 |
@jim:acmegating.com | Clark: thanks, that makes sense from what i've picked up so far :) | 16:48 |
@clarkb:matrix.org | and the reason it broke nodepool is ianw's stack for newer k8s support turned on crio globally | 16:49 |
@jim:acmegating.com | yep | 16:49 |
@clarkb:matrix.org | hrm jammy doesn't have packages in that opensuse obs build area | 17:06 |
@clarkb:matrix.org | oh maybe I need to use a different crio version | 17:08 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 | 17:11 | |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | 17:11 | |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 | 17:24 | |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | 17:24 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 851443: Clarify disjoint builders in docs https://review.opendev.org/c/zuul/nodepool/+/851443 | 17:31 | |
@clarkb:matrix.org | corvus: ianw ^ My latest changes have the non crio jobs working at the end of the stack but the crio jobs fail | 17:55 |
@clarkb:matrix.org | My hunch is that I'm not configuring minikube to use crio properly in the crio case just based on the difference between crio and docker jobs | 17:56 |
@jim:acmegating.com | but crio/bionic works | 17:57 |
@clarkb:matrix.org | corvus: yup, but it is installing old crio | 17:58 |
@clarkb:matrix.org | I think the issue must be using newer crio and not configuring something? It says the crio service is failing to start | 17:58 |
@jim:acmegating.com | (since i just went and checked -- the registry crio job runs on focal, so fixing focal for ensure-k8s/crio should fix that) | 17:58 |
@clarkb:matrix.org | Let me update to see if crio 1.21 is any different (the first version to have jammy packages) | 18:00 |
@clarkb:matrix.org | actually it is 1.22 | 18:01 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 | 18:05 | |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | 18:05 | |
@clarkb:matrix.org | "runtime validation: invalid runtime_path for runtime 'runc': \"stat /usr/lib/cri-o-runc/sbin/runc: no such file | 18:33 |
or directory\""" | ||
@clarkb:matrix.org | For some reason they make cri-o-runc a separate package and cri-o does not depend on it even though the service does indeed require it | 18:35 |
-@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul-jobs] 851434: Install crio on Jammy like Focal https://review.opendev.org/c/zuul/zuul-jobs/+/851434 | 18:37 | |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | 18:37 | |
@clarkb:matrix.org | * `runtime validation: invalid runtime_path for runtime 'runc': "stat /usr/lib/cri-o-runc/sbin/runc: no such file | 18:37 |
or directory""` | ||
@clarkb:matrix.org | Ok I think that last patchset solved the issue. I'll work on a squash after lunch once all the testing reports back | 18:46 |
-@gerrit:opendev.org- Clark Boylan proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: | 19:00 | |
- [zuul/zuul-jobs] 851425: Revert "Revert cri-dockerd changes" https://review.opendev.org/c/zuul/zuul-jobs/+/851425 | ||
- [zuul/zuul-jobs] 851421: Test ensure-kubernetes on all Ubuntu platforms https://review.opendev.org/c/zuul/zuul-jobs/+/851421 | ||
@clarkb:matrix.org | I swapped the order so that they are all mergeable now | 19:03 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/nodepool] 851259: Fix race in test_node_list_json https://review.opendev.org/c/zuul/nodepool/+/851259 | 19:51 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 847387: Revert "Pin minikube to 1.25.2" https://review.opendev.org/c/zuul/nodepool/+/847387 | 20:30 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 847387: Revert "Pin minikube to 1.25.2" https://review.opendev.org/c/zuul/nodepool/+/847387 | 20:32 | |
@jim:acmegating.com | that should test lifting the pin after the zuul-jobs stack is in place | 20:32 |
@clarkb:matrix.org | that seems to be failing with `nodepool.exceptions.LaunchNodepoolException: main-0000000000: couldn't find token for service account {stuff}` I wonder if newer k8s has api incompatibilities we need to address too? | 21:05 |
@clarkb:matrix.org | huh we ask k8s to create a service account token and wait 30 seconds for that to complete when creating a namespace. | 21:25 |
@clarkb:matrix.org | I'm surprised something like a token needs to be waited on and isn't synchronous. But I guess that may be failing for some reason | 21:25 |
@clarkb:matrix.org | https://zuul.opendev.org/t/zuul/build/4a94802b53074825a53e5179dde65579/log/minikube.txt#917-920 is that related? | 21:27 |
@jim:acmegating.com | Clark: is that error in response to our request, or is that a startup error? | 21:40 |
@clarkb:matrix.org | corvus: I think it is a startup error. But wondering if we can't create a service account for similar reasons | 21:43 |
@clarkb:matrix.org | In theory the authentication stuff should be completely independent of the container runtime driver. Which is why i suspect this is more related to the newer version of minikube and k8s | 21:44 |
@clarkb:matrix.org | Looking at the nodepool side of the logs we get the k8s response json logged while it goes through that 30 second wait for secrets. But I never see any secret info in the json | 21:58 |
@clarkb:matrix.org | I'm not familiar enough with the k8s api to know if we should expect that back in this case. But it does seem like maybe the API is doing something that we aren't expecting | 21:58 |
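The wait described above can be pictured as a bounded polling loop. This is an illustrative sketch only, not nodepool's actual code: `wait_for` and its parameters are hypothetical names, and `fetch` stands in for whatever call reads the service account and looks for a token secret in the response.

```python
import time


def wait_for(fetch, timeout=30.0, interval=1.0,
             clock=time.monotonic, sleep=time.sleep):
    """Call fetch() until it returns a non-None value or timeout elapses.

    clock and sleep are injectable so the loop can be tested without
    really waiting.
    """
    deadline = clock() + timeout
    while True:
        value = fetch()
        if value is not None:
            return value
        if clock() >= deadline:
            raise TimeoutError("no token after %.0f seconds" % timeout)
        sleep(interval)


if __name__ == "__main__":
    # Simulated API: the token shows up on the third poll.
    responses = iter([None, None, "example-token"])
    print(wait_for(lambda: next(responses), interval=0.01))
```

If the cluster never attaches a secret to the service account, a loop of this shape can only end in the timeout, which would be consistent with the response JSON never showing any secret info.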
@iwienand:matrix.org | thanks, the whole cri-o/jammy stack lgtm. haven't approved it yet: is the above saying that it installs but isn't actually happy running? | 21:59 |
@clarkb:matrix.org | ianw: isn't happy for nodepool. But I'm not yet sure if that is a nodepool or minikube+k8s issue | 22:00 |
@clarkb:matrix.org | https://zuul.opendev.org/t/zuul/build/4a94802b53074825a53e5179dde65579/log/job-output.txt#2804 is the start of where nodepool has problems in the job | 22:00 |
@clarkb:matrix.org | and it is a service account on the new namespace that we're waiting for tokens for. Not the namespace itself | 22:01 |
@clarkb:matrix.org | https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/ shows that secrets should be automatically created for you | 22:03 |
@clarkb:matrix.org | https://stackoverflow.com/questions/72256006/service-account-secret-is-not-listed-how-to-fix-it aha | 22:04 |
@clarkb:matrix.org | the k8s docs need updating :) | 22:04 |
@clarkb:matrix.org | but also I think that means it is a nodepool problem | 22:04 |
@clarkb:matrix.org | it looks like we can toggle an option to get the old behavior back, but I suspect the best thing for nodepool to do is to actually request tokens via the separate API | 22:06 |
@clarkb:matrix.org | as that should be most compatible with other k8s installs | 22:06 |
@iwienand:matrix.org | it really takes something to make devstack look like the "simple" option :) | 22:08 |
@clarkb:matrix.org | hrm the guide there indicates that we'd only be compatible back to 1.22 | 22:08 |
@clarkb:matrix.org | using the new thing I mean. Instead we can use time bounded token objects | 22:08 |
@clarkb:matrix.org | Maybe nodepool needs to grow an awareness of the k8s version it is speaking to in order to make choices about its behavior :/ | 22:09 |
@clarkb:matrix.org | Might be best to defer to people actually using k8s with nodepool to help determine what is the most appropriate action there. | 22:10 |
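A rough sketch of what version awareness could look like, under stated assumptions. The function names and strategy labels are hypothetical, and the 1.22 cut-off is taken from the compatibility remark above rather than verified here. The manifest helper shows a different route that was not raised in the discussion: Kubernetes documents that a long-lived token can be requested explicitly by creating a Secret of type `kubernetes.io/service-account-token` annotated with the service account name.

```python
import re


def parse_version(text):
    """Turn 'v1.24.3' or '1.22+' into a (major, minor) tuple."""
    match = re.match(r"v?(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognised version: %r" % text)
    return int(match.group(1)), int(match.group(2))


def token_strategy(server_version):
    # Assumption from the discussion: the separate token API is usable
    # back to 1.22; older clusters still auto-create the token secret.
    if parse_version(server_version) >= (1, 22):
        return "token-request"
    return "auto-created-secret"


def legacy_token_secret(service_account, namespace):
    """Build a manifest asking k8s to populate a long-lived token Secret."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "kubernetes.io/service-account-token",
        "metadata": {
            # The "-token" suffix is an arbitrary naming choice.
            "name": service_account + "-token",
            "namespace": namespace,
            "annotations": {
                "kubernetes.io/service-account.name": service_account,
            },
        },
    }


if __name__ == "__main__":
    for version in ("v1.21.9", "v1.24.0"):
        print(version, token_strategy(version))
```

Which of these nodepool should actually use is exactly the open question; the sketch only shows that the branching itself is cheap once the server version is known.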
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/nodepool] 851473: DNM: Test k8s roles update https://review.opendev.org/c/zuul/nodepool/+/851473 | 22:11 | |
@jim:acmegating.com | Clark ianw ^ there's a noop change just to verify the roles are good for nodepool. hopefully the zuul-jobs test coverage is sufficient, but since we can't remove the pin right now, that should double check that landing the stack won't break nodepool again. | 22:12 |
@clarkb:matrix.org | ++ | 22:13 |
@iwienand:matrix.org | so if i'm understanding; the zuul-jobs are uncapped on minikube but have the cri/docker bridge installed now. but nodepool doesn't seem to work with this version, and is currently pinned to 1.25.2 (which also hasn't introduced the changes that require the cri/docker bridge)? | 22:26 |
@iwienand:matrix.org | basically 851473 is installing 1.25.2+cri-dockerd via zuul-jobs; but the cri-dockerd is just a no-op and runs in the background but nothing uses it? | 22:28 |
@jim:acmegating.com | that's my understanding | 22:29 |
@jim:acmegating.com | so i guess what this is going to tell us is whether 1.25.2 will work with the new cri-dockerd stuff -- since that isn't tested by zuul-jobs | 22:30 |
@jim:acmegating.com | (if it doesn't work, however, i'm not sure there's a very good argument to be made to hold up the zuul-jobs changes. i don't know how much backwards compat for k8s versions we really expect ensure-kubernetes to provide) | 22:31 |
@iwienand:matrix.org | i guess we keep pulling on the thread about service accounts clarkb has started on. i need to do some serious reading but will see if i can help | 22:32 |
@jim:acmegating.com | (though still, i suppose wrapping the cri-dockerd in a version check in ensure-k8s might be reasonable) | 22:33 |
@iwienand:matrix.org | yeah, that should be easy enough | 22:33 |
@jim:acmegating.com | well it's moot anyway -- it looks like the job passed | 22:33 |
@jim:acmegating.com | https://zuul.opendev.org/t/zuul/build/6247ef9fb4634bb8be8cfdf3cfacb451 | 22:34 |
@iwienand:matrix.org | ok, cool, so basically it just sits there unused | 22:34 |
@jim:acmegating.com | i think everything is clear to land now | 22:34 |