@iwienand:matrix.org | (also, https://review.opendev.org/c/opendev/system-config/+/848562 is a related change that reworks the way we do ssl testing to remove the insecure flags) | 00:03 |
---|---|---|
@iwienand:matrix.org | > <@iwienand:matrix.org> i've filed https://github.com/containers/podman/issues/14884 but upon more research, i'm starting to think it's a cgroups v2 thing | 00:33 |
as an update; this is most certainly related to cgroups v2 in jammy vs. v1 in focal. i'm currently trying some changes that put the nested podman in a separate cgroup in https://review.opendev.org/c/openstack/diskimage-builder/+/849274/ | ||
@iwienand:matrix.org | i note that corvus probably wants to get nodepool out of the image building business and that is probably a good direction for this. all the problems we've had with the containerfile approach have been from the nesting trying to run podman inside the nodepool-builder docker container | 00:36 |
@jim:acmegating.com | ianw: yeah, i've got a back-burner change (just a few mins here and there) to try building a simple nodepool image in a job at https://review.opendev.org/848792 -- that was to validate the ideas in https://review.opendev.org/775042 a bit more. but i've also recently learned of a zuul user who is in fact doing almost exactly what is in that spec already, so it seems that doesn't really need a lot of validation. :) i think 848792 might be more useful as a seed for making some roles for image building. and we could pivot that to validating a fix for the cgroups stuff. i know of some cloud operators that may be interested in joining in on "roles for building images in zuul jobs" as well. i'm also starting a new spec that builds on those ideas to make an even more robust nodepool/zuul image building story; i hope to have something to show regarding that later this week. so all in all -- yeah, it's starting to feel like some stuff is lining up for getting some momentum on "build images in zuul jobs". | 01:57 |
@blaisepabon:matrix.org | OMG... it works! https://u.do.controlplane.info/tenants | 02:27 |
@blaisepabon:matrix.org | Thank you corvus (of course, I'll have to get it behind some kind of login page so Heath doesn't freak out). | 02:30 |
@iwienand:matrix.org | corvus: so after looking; trying to get all images from a "pip install dib" on a system is somewhere between impossible and very annoying at best. e.g. missing packages on focal, etc. so i'd propose a dib reference image, that is based on nodepool-builder; something along the lines of https://review.opendev.org/c/openstack/diskimage-builder/+/849454. i've not wanted to duplicate nodepool-builder, but perhaps it is time. I would propose this is used similarly to what is being done in https://review.opendev.org/c/openstack/diskimage-builder/+/791888. basically as a privileged container that you map various things into | 07:20 |
@iwienand:matrix.org | (that doesn't really solve the cgroups v2 issue, but that's something i think worth working through with upstream anyway) | 07:21 |
@avass:vassast.org | We're seeing ansible `command` modules getting stuck before executing in like 1/100 jobs, and when that happens in cleanup it's really bad because we've got jobs hung indefinitely unless they're dequeued and the process on the executor is killed. | 08:29 |
I've been digging for a while and from what I can see the on_task_start callback runs to print the task banner, but the build node never seems to run the actual task so ansible-playbook never exits. But I can't find any obvious race condition in logging, or in the command module anywhere. Is anyone else experiencing something similar? | ||
-@gerrit:opendev.org- Benjamin Schanzel proposed on behalf of Tobias Henkel: [zuul/nodepool] 743790: Check for images to upload single threaded https://review.opendev.org/c/zuul/nodepool/+/743790 | 11:19 | |
@avass:vassast.org | Maybe we're seeing this issue: https://github.com/ansible/ansible/issues/59642, if so that may be fixed in ansible 5 | 11:33 |
-@gerrit:opendev.org- Albin Vass proposed: [zuul/zuul-jobs] 849502: DNM: Run tests with ansible 5 https://review.opendev.org/c/zuul/zuul-jobs/+/849502 | 12:06 | |
@jim:acmegating.com | ianw: that makes sense, but one of the advantages of the zuul job approach is that we don't need a single system to make all images (you could request an ubuntu node to build ubuntu images; a fedora node to build fedora images, etc). only if the underlying cloud doesn't already have such a node would you need to cross-build. | 13:50 |
@jim:acmegating.com | ianw: (not arguing against the image; more just advocating that the non-image approach may still be an option) | 13:51 |
@avass:vassast.org | corvus: are you talking about building an image like what I did for digitalocean: https://review.opendev.org/c/zuul/zuul-jobs/+/786757 ? | 13:57 |
@avass:vassast.org | or is it more like 1) build an image and upload it somewhere 2) report it as an artifact that nodepool can pick up and use | 13:58 |
@fungicide:matrix.org | > <@avass:vassast.org> or is it more like 1) build an image and upload it somewhere 2) report it as an artifact that nodepool can pick up and use | 14:18 |
i think the latter, as a continuation of this discussion from last year: https://lists.zuul-ci.org/pipermail/zuul-discuss/2021-February/001503.html | ||
@jim:acmegating.com | yeah, i'm looking at dib based, not snapshot | 14:22 |
-@gerrit:opendev.org- Artem Goncharov proposed wip: [zuul/zuul] 849033: Initial implementation of the gitea driver https://review.opendev.org/c/zuul/zuul/+/849033 | 14:31 | |
-@gerrit:opendev.org- Artem Goncharov marked as active: [zuul/zuul] 849033: Initial implementation of the gitea driver https://review.opendev.org/c/zuul/zuul/+/849033 | 14:31 | |
@jim:acmegating.com | Albin Vass: btw the zuul tenant has been using ansible5 by default for a while (and now all of opendev) | 14:36 |
@avass:vassast.org | > <@jim:acmegating.com> Albin Vass: btw the zuul tenant has been using ansible5 by default for a while (and now all of opendev) | 15:15 |
Yeah i realized after pushing that change. Found out that the mirror-workspace role was already fixed and i bumped our internal fork. But Ansible 5 didn't solve our issue anyway :( | ||
-@gerrit:opendev.org- James E. Blair proposed: [zuul/zuul] 849570: Add pipeline-based merge op metrics https://review.opendev.org/c/zuul/zuul/+/849570 | 17:27 | |
-@gerrit:opendev.org- James E. Blair proposed: [zuul/nodepool] 849582: Speed up node listing https://review.opendev.org/c/zuul/nodepool/+/849582 | 21:06 |