| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 00:44 | |
| - [zuul/zuul] 960417: Add subnode support to launcher https://review.opendev.org/c/zuul/zuul/+/960417 | ||
| - [zuul/zuul] 960695: Add reuse label option to launcher https://review.opendev.org/c/zuul/zuul/+/960695 | ||
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 02:09 | |
| - [zuul/zuul] 960417: Add subnode support to launcher https://review.opendev.org/c/zuul/zuul/+/960417 | ||
| - [zuul/zuul] 960695: Add reuse label option to launcher https://review.opendev.org/c/zuul/zuul/+/960695 | ||
| -@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [zuul/zuul] 960679: Pull base python images from quay.io/opendevorg https://review.opendev.org/c/zuul/zuul/+/960679 | 11:00 | |
| @fungicide:matrix.org | Clark: not sure if you saw, but someone raised a question post-merge on your https://review.opendev.org/958783 change for zuul-jobs | 14:06 |
|---|---|---|
| -@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 958963: AWS: add kms-key-id for label https://review.opendev.org/c/zuul/zuul/+/958963 | 14:40 | |
| @mhuin:matrix.org | I might have asked before but I don't remember the answer; out of curiosity how many zookeeper nodes do you run on opendev? What about other deployments? | 14:49 |
| @jim:acmegating.com | mhu: 3 | 14:50 |
| @fungicide:matrix.org | basically 3 is the minimum required to have proper redundancy | 14:52 |
| @mhuin:matrix.org | yeah ... we're learning this the hard way | 14:53 |
| @mhuin:matrix.org | also for availability during upgrades | 14:53 |
| @clarkb:matrix.org | fungi: I hadn't seen that, I'll respond there I guess | 14:54 |
| -@gerrit:opendev.org- Jan Gutter proposed: [zuul/zuul-jobs] 960840: Make buildx builder image configurable https://review.opendev.org/c/zuul/zuul-jobs/+/960840 | 14:54 | |
| @jangutter:matrix.org | ^^ Clark I dunno if that's a good idea or not, or if it needs to be a job var instead of a rolevar. | 14:55 |
| @jangutter:matrix.org | feel free to take over (or I can abandon if it's a bad idea) | 14:55 |
| @clarkb:matrix.org | jangutter: are you working with sjabasti on this? I'm wondering if I still need to respond on the original change | 14:56 |
| @clarkb:matrix.org | I think it's fine to make that configurable | 14:56 |
| @jangutter:matrix.org | Clark: Nope, just saw some low hanging fruit and needed a mind cleanser of my on-call stuff this week :-p | 14:57 |
| @clarkb:matrix.org | thanks I'll respond with a link to your change then. | 14:57 |
| @jim:acmegating.com | https://zuul-ci.org/docs/zuul-jobs/latest/policy.html#role-variable-naming-policy | 14:57 |
| @jim:acmegating.com | should probably consider the variable name | 14:58 |
| @jangutter:matrix.org | Yeah, the cognitive dissonance to me is because this is technically a "global" definition that's common across those two builder roles. But I have no objection with namespacing them (the two roles are not expected to be drop-in replacements for each other) | 15:01 |
| @clarkb:matrix.org | I think the other variables those roles use are scoped | 15:01 |
| @jangutter:matrix.org | Yep | 15:01 |
| @clarkb:matrix.org | like the container image lists to build so doing it here isn't a problem either. | 15:01 |
| @jim:acmegating.com | that's why the vars use the "container_" prefix | 15:01 |
| @jim:acmegating.com | it's a slight relaxation of the general rule which would be "build_container_image_buildx_builder_image" to account for the multi-role use | 15:02 |
| @jangutter:matrix.org | Dammit, you didn't fall for my global namespace landgrab. Oh well.... | 15:02 |
| @jim:acmegating.com | not my first rodeo | 15:02 |
| @jangutter:matrix.org | But rolevar rather than jobvar? | 15:03 |
| @jangutter:matrix.org | (I've been caught by stashing defaults in roles....) | 15:03 |
| @jim:acmegating.com | yeah i think that's fine (rolevar means it can be set in both places). the docs for the job should be updated to include it as well. | 15:04 |
| @jangutter:matrix.org | 👍️ thanks for the quick out-side-of-gerrit review. I'll do a bit of a quick fix on that. | 15:05 |
| @jim:acmegating.com | Clark: https://review.opendev.org/959887 appeared to immediately succeed -- was that expected? | 15:12 |
| @clarkb:matrix.org | corvus: ish. I suspect the underlying issue was related to the same disk problems that the ubuntu mirrors had at the same time. Things were "working" but very slow and I think possibly geo location impacted results as well | 15:13 |
| @clarkb:matrix.org | fungi: do you know if canonical and ubuntu are all caught up after the mirror issues now? I suspect landing that change now is safe | 15:14 |
| @fungicide:matrix.org | Clark: i haven't seen any more evidence since the weekend | 15:15 |
| @jim:acmegating.com | the job with microk8s in the name ran in a reasonable amount of time https://zuul.opendev.org/t/zuul/runtime?job_name=zuul-jobs-test-registry-buildset-registry-k8s-microk8s&project=zuul/zuul-jobs&branch=master&pipeline=check | 15:16 |
| @jim:acmegating.com | i'll go ahead and +3 it then | 15:17 |
| @clarkb:matrix.org | sounds good | 15:17 |
| @jim:acmegating.com | Clark: i went ahead and signed up for the open source pavilion. i think based on your description of the environment, maybe the best approach would be to show opendev's zuul and do ad-hoc demos and discussion, rather than something structured. i can also have a quick-start demo on the laptop already running in case we want to talk about that. and this would be a zuul project booth, so anyone at the summit can help out. it's only an hour or two, so i don't think we need to coordinate shifts or anything. just show up and show and tell zuul i'm thinking. | 15:19 |
| -@gerrit:opendev.org- Jan Gutter proposed: [zuul/zuul-jobs] 960840: Make buildx builder image configurable https://review.opendev.org/c/zuul/zuul-jobs/+/960840 | 15:19 | |
| @jim:acmegating.com | i have a bunch of the one-page zuul flyers too, i'll bring them. | 15:21 |
| @clarkb:matrix.org | corvus: I like that. And ya we should be able to push changes to zuul itself to trigger behaviors we want to show off in the opendev zuul | 15:21 |
| @clarkb:matrix.org | that seems like an easy way to engage people with something more interactive than the flyer | 15:21 |
| @jim:acmegating.com | ++ | 15:21 |
| @jim:acmegating.com | Clark: *now* it's failing :) https://zuul.opendev.org/t/zuul/build/4d835ef03dd949e584d318d2f116303b/console | 15:35 |
| @jangutter:matrix.org | corvus: hitting a rate limit? | 15:36 |
| @jim:acmegating.com | unclear; `context canceled` is vague | 15:37 |
| @jangutter:matrix.org | looks like there's at least some other folks who believe it's vague https://bugs.launchpad.net/snapstore-server/+bug/2104066 | 15:40 |
| @clarkb:matrix.org | wow that's classic | 15:40 |
| @clarkb:matrix.org | it works when you think it is broken and broken when you suspect it should work | 15:41 |
| @clarkb:matrix.org | fwiw I'm trying to clear out my opendev backlog then I'm going to see if I can make sense of the bwrap trixie issue on that held node | 15:41 |
| -@gerrit:opendev.org- Zuul merged on behalf of Simon Westphahl: [zuul/zuul] 959428: Fix provider quota calculation https://review.opendev.org/c/zuul/zuul/+/959428 | 15:47 | |
| @jim:acmegating.com | jangutter: i did not realize there was a variable already for that for the docker role.... we could make an argument we should use that variable (which is what you did in PS1), or we could take the opportunity to improve things for the container role and use the container_ prefix. but we probably shouldn't actually change the variable in the docker role -- or if we do, we're going to need to support backwards compat for that and do a migration. | 15:55 |
| @jangutter:matrix.org | Oh hang on... I thought that wasn't templated out in the docker role... | 15:56 |
| @jim:acmegating.com | oh you're adding it to the docker role too | 15:57 |
| @jim:acmegating.com | * oh you're adding it to the docker role too? | 15:57 |
| @jim:acmegating.com | sorry i mistook how the docker role got involved; so this is expanding the scope of the change to do the same thing for both roles? | 15:58 |
| @clarkb:matrix.org | yes I think it is being added to both | 15:58 |
| @jangutter:matrix.org | Yeah, the code is mirrored, so it made sense to mirror the interface to me. | 15:58 |
| @jim:acmegating.com | okay, then i think the approach in PS2 may be the approach we want | 15:58 |
| @jim:acmegating.com | yeah, indentation aside that lgtm | 16:00 |
| @jangutter:matrix.org | So if someone is converting a project from `docker` to `container`, they can just replace the `docker_` section of the job vars with `container_` (following the same pattern as most of the other "common" bits). | 16:00 |
| @jangutter:matrix.org | That whitespace is going to haunt me. | 16:01 |
| @jim:acmegating.com | line 83 too | 16:01 |
| @jangutter:matrix.org | yep, serves me right for not doing a git show and being all obsessive. | 16:02 |
| @fungicide:matrix.org | `git diff --check` will also call out trailing whitespace | 16:03 |
| @jangutter:matrix.org | heh, trailing whitespace goes all red in my vim. | 16:04 |
| @fungicide:matrix.org | oh, well yes mine too | 16:05 |
| -@gerrit:opendev.org- Jan Gutter proposed: [zuul/zuul-jobs] 960840: Make buildx builder image configurable https://review.opendev.org/c/zuul/zuul-jobs/+/960840 | 16:05 | |
| @fungicide:matrix.org | i think --check might also call out mixed tabs and spaces | 16:05 |
| @fungicide:matrix.org | don't recall now, manpage likely says | 16:05 |
| @jangutter:matrix.org | I'm particularly miffed that vscode doesn't add trailing newlines by default. | 16:06 |
| @jangutter:matrix.org | Our codebase is a delightful mixture of text files that sometimes end in newlines. | 16:06 |
| @fungicide:matrix.org | indeed, manpage says by default --check warns about trailing whitespace, spaces immediately followed by tabs in an initial indent, and merge/rebase conflict markers | 16:07 |
| @clarkb:matrix.org | oh nice it will check for conflict markers | 16:08 |
| @fungicide:matrix.org | oh, yeah, dos vs unix text file formatting is a pain too, git core.whitespace might be able to be tweaked to identify it | 16:08 |
| @clarkb:matrix.org | corvus: is there a way to keep job dirs without restarting the executor? | 16:28 |
| @clarkb:matrix.org | I'm on that held node now and builds/ is empty so I think I need to keep jobdirs and then trigger a new test and see what things look like when zuul is trying to run bwrap | 16:28 |
| @jim:acmegating.com | Clark: `zuul-executor keep` on all the executors | 16:29 |
| @jim:acmegating.com | then `nokeep` on all of them once the job is done | 16:29 |
| @jim:acmegating.com | minimize the time it's enabled | 16:29 |
| @jim:acmegating.com | the weekly restart should clean up the detritus from other jobs | 16:30 |
| @clarkb:matrix.org | thanks in this case it's the quickstart setup on a held test node so I think it's ok to just enable for now | 16:30 |
| @jim:acmegating.com | oh sorry i thought you were asking about a meta level up :) | 16:30 |
| @clarkb:matrix.org | `drwx------ 7 root root 4096 Sep 12 16:32 5153f599f4cd4ceaaf29c026f69b222a` this is what the resulting build dir looks like so now I'm thinking the bwrap `Permission denied` is literal | 16:33 |
| @jim:acmegating.com | maybe the umask changed | 16:35 |
| @clarkb:matrix.org | we also don't set a user on the container in the docker compose file so it's running as root within the container so you'd think that should still work. But maybe the unshare of users by bwrap means that it won't use a 700 directory? | 16:37 |
| @clarkb:matrix.org | umask reports `0022` within the trixie container | 16:37 |
| @clarkb:matrix.org | let me see if I can figure out running the command from the logs to reproduce then modify the dir perms to see if the error goes away | 16:39 |
| @clarkb:matrix.org | ok I think the issue is that /var/lib/zuul is 0700 zuul:zuul | 16:46 |
| @clarkb:matrix.org | * ok I think the issue is that /var/lib/zuul is 0700 zuul:zuul (10001:10001 not 1000:1000 of the calling zuul user in the host env) | 16:49 |
| @clarkb:matrix.org | that directory is created by `"lib-zuul-executor:/var/lib/zuul:z"` in the docker compose file as a volume mount. But we create it in the container itself using useradd and set it as the 10001 zuul user's homedir | 16:50 |
| @clarkb:matrix.org | corvus: I'm thinking that useradd must've changed default permissions on homedirs | 16:50 |
| @clarkb:matrix.org | corvus: do you know why we don't run the containers as the 10001 user within the quickstart environment? Is the issue with podman running the containers as a regular user? | 16:50 |
| @clarkb:matrix.org | I'm going to test running this as the zuul user | 16:53 |
| @clarkb:matrix.org | heh deleting the lib zuul volume required deleting the gerrit config container which ended up automatically deleting the gerrit container so now I have a running executor I can't trigger jobs in. | 16:59 |
| @clarkb:matrix.org | I'm just going to push up a change to have zuul automate this testing for me | 16:59 |
| -@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 960682: Build container images on Debian Trixie https://review.opendev.org/c/zuul/zuul/+/960682 | 17:00 | |
| @fungicide:matrix.org | how meta | 17:02 |
| @clarkb:matrix.org | anyway the problem appears to be the mismatch between the /var/lib/zuul permissions and ownership and the uid running zuul-executor which ultimately runs the bwrap command | 17:03 |
| @clarkb:matrix.org | I suspect that this change is part of /var/lib/zuul being created in the container via useradd | 17:03 |
| @fungicide:matrix.org | oh, created as the homedir | 17:04 |
| @clarkb:matrix.org | yup | 17:04 |
| @clarkb:matrix.org | useradd might access an override to permissions but if running the executor as zuul works instead I think that is better | 17:04 |
| @fungicide:matrix.org | yeah, useradd may have started making more restrictively permissioned homedirs by default in trixie | 17:04 |
| @clarkb:matrix.org | then once I get this sorted out I think I want to update quickstart to use a different volume for /var/lib/zuul in gerritconfig so that the executor can be cleared out and restarted without impacting gerrit | 17:05 |
| @clarkb:matrix.org | fungi: ya that is my assumption right now. i haven't confirmed | 17:05 |
| @clarkb:matrix.org | I don't see any useradd options to set perms on the homedir when created with the `-m` flag. But it does appear to look at login.defs? I assume something like that is what changed. Anyway another option would be to drop the -m flag and start creating the dir ourselves if we want | 17:12 |
| @clarkb:matrix.org | but I think if just running the quickstart processes as the zuul user works then that is a better option | 17:12 |
| @jim:acmegating.com | Clark: i'd be concerned about fixing this with the quickstart; fundamentally it seems we're creating the directory in the container image with the wrong perms? i think we should keep them the same. | 17:13 |
| @fungicide:matrix.org | yeah, login.defs has HOME_MODE 0700 | 17:13 |
| @jim:acmegating.com | `drwxr-xr-x 2 zuul zuul` is what they're expected to be | 17:14 |
| @jim:acmegating.com | i also think it's curious that bwrap isn't overriding that when run as root. | 17:14 |
| @clarkb:matrix.org | corvus: I guess it depends on how we define what wrong permissions are. They are wrong in this instance because we create /var/lib/zuul for the zuul user's homedir then run processes as another user (root in this case) out of that directory | 17:14 |
| @clarkb:matrix.org | my initial impression is that the process uid should match the homedir ownership. But if that is not the case then I can modify the Dockerfile to create the homedir explicitly rather than via useradd (that way we can control the permissions). | 17:15 |
| @jim:acmegating.com | i think we expect people to bind mount /var/lib/zuul if they run it as a different user, but strictly speaking, we haven't set the USER in the dockerfile, so the default user is still root. we have some pending changes to update that, and i think it may make sense to merge those changes and move this project along. but if the goal is "make zuul work on trixie" then i think we should do that minimally. | 17:17 |
| @jim:acmegating.com | https://review.opendev.org/c/zuul/zuul/+/840758 | 17:19 |
| @clarkb:matrix.org | got it. In that case I'll rework the image build to create the directory in a separate command | 17:19 |
| @clarkb:matrix.org | something like useradd && mkdir && chown && chmod | 17:19 |
| @jim:acmegating.com | sgtm. and tbc, i'm totally in favor of opening the can of worms, but just want it to be a different can of worms than trixie. :) | 17:20 |
| @clarkb:matrix.org | or I guess I can let it create the dir then chmod | 17:20 |
| @jim:acmegating.com | (we should probably update 840758 and see where that stands in the current world since almost everything has changed since then) | 17:20 |
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 960875: Launcher: don't use the main node with slots https://review.opendev.org/c/zuul/zuul/+/960875 | 17:32 | |
| -@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 960682: Build container images on Debian Trixie https://review.opendev.org/c/zuul/zuul/+/960682 | 17:38 | |
| -@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 959397: Add "final" attribute to image/flavor/label objects https://review.opendev.org/c/zuul/zuul/+/959397 | 17:38 | |
| -@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 960878: Give quickstart gerritconfig a distinct /var/lib/zuul volume https://review.opendev.org/c/zuul/zuul/+/960878 | 17:44 | |
| @clarkb:matrix.org | corvus: I think 960692 is green now. But I want to drop the zuul-executor and scheduler debug flags unless you think we should keep those in quickstart. Then separately do you think we should merge that change before the new nodejs and golang images are mirrored to quay.io/opendevmirror? If not I'll push an edit up to try and use the mirror which will fail for now until opendev starts mirroring those images | 18:27 |
| @clarkb:matrix.org | https://review.opendev.org/c/opendev/system-config/+/960681 is the change to mirror those images | 18:28 |
| -@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 960682: Build container images on Debian Trixie https://review.opendev.org/c/zuul/zuul/+/960682 | 19:22 | |
| @clarkb:matrix.org | I decided to go ahead and make those two changes I think that is the end state we want and may as well point in that direction now | 19:23 |
| -@gerrit:opendev.org- Clark Boylan proposed: [zuul/zuul] 960682: Build container images on Debian Trixie https://review.opendev.org/c/zuul/zuul/+/960682 | 19:24 | |
| -@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 960208: Add image validation admin method https://review.opendev.org/c/zuul/zuul/+/960208 | 20:02 | |
| @clarkb:matrix.org | I just restarted the matrix eavesdrop bot and want to make sure it captures this message | 20:59 |
| @fajfer:reszka.org | and did it? | 21:31 |
| @clarkb:matrix.org | it did | 21:33 |
| @clarkb:matrix.org | https://meetings.opendev.org/irclogs/%23zuul/%23zuul.2025-09-12.log | 21:33 |
| @jim:acmegating.com | Clark: i reckon we need to wait till tomorrow (or do something to force a run) to get that mirrored | 22:37 |
| @clarkb:matrix.org | corvus: ya I'll recheck the change after mirroring happens | 22:47 |
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 23:51 | |
| - [zuul/zuul] 960875: Launcher: don't use the main node with slots https://review.opendev.org/c/zuul/zuul/+/960875 | ||
| - [zuul/zuul] 960921: Handle launch failures with subnodes https://review.opendev.org/c/zuul/zuul/+/960921 | ||
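
The homedir permission thread above (a `drwx------` directory versus the expected `drwxr-xr-x`) can be illustrated with a small Python sketch. This is only a rough illustration of the mode-bit side of the diagnosis: it does not model bwrap, user namespaces, or the container image build, and the helper names `mode_string` and `others_can_traverse` are hypothetical, not part of Zuul. A temporary directory stands in for `/var/lib/zuul`.

```python
import os
import stat
import tempfile


def mode_string(path: str) -> str:
    """Return the ls-style mode string for path, e.g. 'drwx------'."""
    return stat.filemode(os.stat(path).st_mode)


def others_can_traverse(path: str) -> bool:
    """True if the 'other' execute bit is set, meaning a uid that is neither
    the owner nor a member of the owning group may enter the directory."""
    return bool(os.stat(path).st_mode & stat.S_IXOTH)


if __name__ == "__main__":
    # Hypothetical stand-in for /var/lib/zuul; a temp dir avoids real paths.
    homedir = tempfile.mkdtemp()
    try:
        # 0700 is the HOME_MODE value reported from login.defs in the log.
        os.chmod(homedir, 0o700)
        print(mode_string(homedir), others_can_traverse(homedir))

        # 0755 corresponds to the drwxr-xr-x mode the log says is expected.
        os.chmod(homedir, 0o755)
        print(mode_string(homedir), others_can_traverse(homedir))
    finally:
        os.rmdir(homedir)
```

With mode 0700 only the owning uid passes the ordinary permission check, which is consistent with the suggestion in the log to create the directory via useradd and then chmod it. Whether root inside bwrap's user namespace bypasses that check is the open question raised in the discussion, and this sketch does not answer it.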