| clarkb | cool screenshots for hound indicate it is working. I was worried about updating go and nodejs that it would break something in the codebase but seems fine | 00:01 |
|---|---|---|
| *** | efoley_ is now known as efoley | 13:25 |
| opendevreview | Enzo Candotti proposed openstack/project-config master: Update app-seaweedfs config https://review.opendev.org/c/openstack/project-config/+/970408 | 15:37 |
| fungi | infra-root: for ^ i think we can do it as part of friday's project renames if I just add the group rename to https://review.opendev.org/970307 right? | 15:42 |
| opendevreview | Enzo Candotti proposed openstack/project-config master: Update app-seaweedfs config https://review.opendev.org/c/openstack/project-config/+/970408 | 15:45 |
| fungi | never mind, it looks like we already ended up with two groups because both the correct and incorrect spellings were included in the original acl | 15:45 |
| clarkb | fungi: I think there is a group deletion api now? So theoretically we could fix the acl then delete the unused group? | 15:52 |
| clarkb | oh its a plugin | 15:52 |
| clarkb | which I don't think we install. Meh, it's not a big deal | 15:52 |
| fungi | yeah, we have a ton of orphaned groups we've accumulated over the years, would probably be a good idea to clean them all up at some point | 15:52 |
| fungi | if we were going to that effort though, i'd want to script something to generate a list of all unused groups and then bulk delete them | 15:53 |
| clarkb | I think All-Users.git controls it, so the non-plugin approach now is to push a commit that removes groups. Due to conflicts in the notedb user db we can either first address those conflicts (a long standing todo) or do it while gerrit is offline and reindex groups before starting gerrit again | 15:57 |
| opendevreview | Merged openstack/project-config master: Update app-seaweedfs config https://review.opendev.org/c/openstack/project-config/+/970408 | 16:00 |
| clarkb | infra-root https://review.opendev.org/c/opendev/system-config/+/970322 https://review.opendev.org/c/opendev/system-config/+/970325 and https://review.opendev.org/c/opendev/system-config/+/970321 are three more trixie image updates. I'd like to get those in today as further sanity checking of trixie before we commit to it with Gerrit. I'm happy to approve these changes or have you | 16:04 |
| clarkb | approve them as you review. I should be around all day and can monitor | 16:04 |
| clarkb | let me know if there are any concerns. The hound one (listed first) is probably the most interesting | 16:04 |
| fungi | clarkb: looks like ems replied "email addresses must be unique across users" | 16:05 |
| fungi | so at least we have the answer | 16:06 |
| fungi | sort of odd that they've designed it with a one-account-per-person expectation, requiring every account to have a unique e-mail address | 16:07 |
| fungi | i don't understand the reasoning behind these new account restrictions, especially for accounts on a dedicated homeserver | 16:07 |
| clarkb | I suspect it has to do with being able to track abuse to verified email addresses. | 16:09 |
| clarkb | Which is maybe ok for humans but yes if I'm manually adding an account to my server for use by a bot I don't think that is important | 16:09 |
| fungi | yeah, but i don't see how having more than one account associated with an address breaks that ability | 16:10 |
| fungi | with every new change it seems like they're further dumbing down matrix to make it more like proprietary/commercial chat systems | 16:12 |
| clarkb | given that answer I guess I'll do something like infra-root+matrixstatus@ for the email address? | 16:13 |
| fungi | wfm | 16:13 |
| fungi | heading out for lunch errands, back in a while | 16:21 |
| opendevreview | Hediberto Cavalcante da Silva proposed openstack/project-config master: Add LVM CSI App to StarlingX https://review.opendev.org/c/openstack/project-config/+/970439 | 17:03 |
| mordred | clarkb: all three image changes lgtm | 17:19 |
| clarkb | mordred: thanks! | 17:22 |
| mordred | clarkb: out of curiosity - I was looking at the gerrit stack. we have ensure-nodejs on there, which is ensuring 18, which is EOL. If we're building gerrit with java 21, perhaps node should get updated? | 17:23 |
| mordred | I can't find any information about explicit versions supported | 17:24 |
| clarkb | mordred: I looked that up but the polygerrit-ui content in the gerrit stable-3.11 branch says you need nodejs 18. It is possible that newer nodejs would work too and the upstream docs are wrong | 17:26 |
| clarkb | mordred: it may be worth a change on the end of my stack to bump that up and see if it affects 3.11 or 3.12 image builds negatively | 17:26 |
| mordred | ++ I'll push one up for the fun of it, see what happens | 17:27 |
| clarkb | https://gerrit.googlesource.com/gerrit/+/refs/heads/stable-3.11/polygerrit-ui/#installing-node_js-and-npm-packages is the document I am referring to | 17:27 |
| opendevreview | Monty Taylor proposed opendev/system-config master: Update node to 24 for Gerrit builds https://review.opendev.org/c/opendev/system-config/+/970451 | 17:33 |
| clarkb | cool looks like that triggered image rebuild jobs too | 17:37 |
| fungi | will be interesting to see the result, yes | 17:37 |
| mordred | I also asked over in gerrit discord just to be complete | 17:42 |
| clarkb | fungi: if the hound, matrix-eavesdrop, and accessbot updates look good to you I'm happy to ensure they deploy as expected | 17:47 |
| fungi | yeah, they're on my list to go through in a few minutes | 17:47 |
| clarkb | thanks | 17:48 |
| clarkb | mordred: I think your jobs are failing because the parent change image builds are not available in the insecure ci registry anymore. They should be; we should only delete images after 30 days iirc, but I seem to recall hitting this before and not having time to debug it then | 18:03 |
| clarkb | I suspect that if we recheck the whole stack it will work because we'll have new images across the board | 18:03 |
| clarkb | corvus: ^ fyi. I'm not aware of any zuul registry changes that would impact this other than potentially the work we did to make swift expirations work at all (but maybe it is buggy?) | 18:04 |
| mordred | oh - nod. fun. well, I imagine we'll naturally recheck that stack at some point before actually moving forward | 18:04 |
| corvus | clarkb: yeah, i don't expect any errors, so this is a good opportunity to debug, but unfortunately i don't have that time right now. | 18:04 |
| clarkb | one thought I had was that maybe we're deleting layers that are shared between new images and old images because the cleanup routine sees them as associated with the old stuff. However, I want to say we should account for that via some sort of reference counting iirc | 18:05 |
| corvus | yep, if that's not working, it's not because we forgot to think of that :) | 18:06 |
| corvus | expiration: 15552000 # 180 days | 18:07 |
| clarkb | ok more than 30 days even | 18:07 |
| clarkb | /var/log/registry-prune/prune.log should have pruning logs on insecure-ci-registry. So I guess step 0 is looking there to try and confirm the objects returning 404 are listed as pruned. I can look at that | 18:09 |
| corvus | i'm pretty sure the registry is dealing with seconds as a time unit and we're converting the swift info into seconds. like, i don't think anything is using milliseconds. | 18:12 |
| clarkb | 2025-12-10 00:00:08,209 DEBUG registry.storage: Keep _local/repos/quay.io/opendevorg/gerrit-base/manifests/6e220809a5224a74a19ffdb3f60e0303_latest | 18:13 |
| clarkb | https://zuul.opendev.org/t/openstack/build/8a3e07f57b0044bab6def17aced4a1a5/console#2/1/13/localhost which is one of the two things that failed here | 18:14 |
| clarkb | so maybe it isn't pruning that is the problem? | 18:15 |
| clarkb | I'm trying to find where the registry itself is logging the 404 now as maybe that has a clue | 18:16 |
| clarkb | this service doesn't have its logs going to syslog and then to the container specific log files, so grepping out of docker logs is slow | 18:18 |
| clarkb | something we could fix up I guess | 18:18 |
| corvus | clarkb: you can use "-n 10000" to reduce the set | 18:19 |
| mordred | clarkb: gerrit-base should have been built in that changeset | 18:20 |
| mordred | so I'm not sure this is actually about pulling any images built by previous incarnations of these jobs | 18:21 |
| clarkb | mordred: yes but it's trying to get the gerrit-base from the parent change I think | 18:21 |
| clarkb | at 2025-12-08 21:10:31,523 I see registry logs check if the manifest exists and it doesn't, so then it proceeds with an upload to that manifest and things start returning 200 at that point | 18:22 |
| clarkb | but then at 2025-12-10 17:59:45,135 we get our first 404 | 18:22 |
| mordred | but why? the gerrit-base image job shouldn't be referencing the gerrit-base images from the previous jobs. and the gerrit-3.11 job should be using the gerrit-base image that should have been uploaded by the gerrit-base job | 18:23 |
| clarkb | the gets to swift return 200 then we return 404 for some reason (is the data getting corrupted? I probably need to expand my log grep context range) | 18:23 |
| clarkb | mordred: I think it's the default when you use the intermediate registry hooks. We don't know if each image is going to be used, but grab them from the artifacts lists of any parents just in case | 18:23 |
| clarkb | mordred: I think you are correct that in this case we don't actually use that image at all | 18:24 |
| mordred | oh - got it. the job itself doesn't actually need them, but the machinery has no way of knowing that | 18:24 |
| mordred | that makes much more sense | 18:24 |
| corvus | https://etherpad.opendev.org/p/Vcwu0q2fSfRDo_bfevkW | 18:25 |
| corvus | some logs | 18:25 |
| clarkb | corvus: oh is it some blob under the manifest? | 18:27 |
| corvus | looks like it, and i think we pruned it | 18:28 |
| corvus | /var/log/registry-prune/prune.log.1:2025-12-09 01:19:13,498 DEBUG urllib3.connectionpool: https://storage101.dfw1.clouddrive.com:443 "DELETE /v1/MossoCloudFS_622b11a1-5dfa-43b4-9f58-4ad3c6dbc4a0/intermediate_registry/_local/blobs/sha256:2dca1d89c1ce32840623231f8569b38d9a55a7c15503bd7f2fcc3a6887af23bc/data HTTP/11" 204 0 | 18:28 |
| clarkb | 2025-12-09 01:19:13,419 DEBUG registry.storage: Prune _local/blobs/sha256:2dca1d89c1ce32840623231f8569b38d9a55a7c15503bd7f2fcc3a6887af23bc/data | 18:29 |
| clarkb | 2025-12-09 01:19:13,499 DEBUG registry.storage: Keep _local/blobs/sha256:2dca1d89c1ce32840623231f8569b38d9a55a7c15503bd7f2fcc3a6887af23bc/ | 18:29 |
| clarkb | it's interesting that we decided to prune the /data but keep /. I wonder if that is a clue to where the bug may be | 18:29 |
| corvus | only objects have a timestamp, directories always have the current time and so are never deleted | 18:30 |
| corvus | they aren't real, so once the objects are gone, so are the dirs | 18:30 |
| clarkb | ah | 18:30 |
| corvus | the logic is to delete any blob older than our target expiration time as long as that blob isn't mentioned as one of the layers for the manifests we decided to keep (which we do the same way: manifests older than our target expiration time). | 18:33 |
| corvus | that suggests either the times are off, or the most likely i think is that we did not correctly identify all of the necessary layers for that manifest | 18:34 |
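The prune rule corvus describes can be sketched as follows. This is an illustrative reconstruction, not zuul-registry's actual code: `prune_blobs`, `layers_of`, and the `(mtime, body)` shapes are hypothetical names, and the 180-day window matches the config value quoted above.

```python
import json
import time

EXPIRATION = 180 * 24 * 3600  # 180 days, as in the quoted config

def layers_of(manifest_bytes):
    # Digests a manifest references: its config blob plus each layer.
    data = json.loads(manifest_bytes)
    digests = {data["config"]["digest"]}
    digests.update(layer["digest"] for layer in data["layers"])
    return digests

def prune_blobs(manifests, blobs, now=None):
    # manifests: list of (mtime, manifest_bytes); blobs: dict digest -> mtime.
    # Keep every blob referenced by a young-enough manifest; return the
    # set of blob digests that are both old and unreferenced.
    now = time.time() if now is None else now
    keep = set()
    for mtime, body in manifests:
        if now - mtime < EXPIRATION:   # manifest is young: keep it
            keep |= layers_of(body)    # and everything it references
    return {d for d, mtime in blobs.items()
            if now - mtime >= EXPIRATION and d not in keep}
```

The failure mode discussed next is exactly the second possibility corvus names: if `layers_of` cannot parse a manifest format, its layers never land in `keep` and get deleted despite being referenced.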
| corvus | 2025-12-09 00:12:32,297 DEBUG registry.storage: Unknown manifest _local/repos/quay.io/opendevorg/gerrit-base/manifests/6e220809a5224a74a19ffdb3f60e0303_latest | 18:35 |
| corvus | i think maybe the registry doesn't know how to read that manifest format | 18:35 |
| corvus | so if our build process is now producing a manifest format other than 'application/vnd.docker.distribution.manifest.v2+json' we are probably effectively just pruning everything every 24 hours | 18:36 |
| clarkb | oh, this might be related to quay.io also having the two arch entries, one for x86-64 and one for unknown | 18:37 |
| mordred | application/vnd.oci.image.manifest.v1+json | 18:37 |
| mordred | that's what current docker purportedly produces | 18:37 |
| clarkb | the big change we did semi recently is using buildx for everything | 18:37 |
| clarkb | to support speculative builds in docker | 18:38 |
| corvus | https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/storage.py#L296 | 18:38 |
| corvus | then i think the task would be to add parsing of application/vnd.oci.image.manifest.v1+json to that method | 18:38 |
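A minimal sketch of the fix being proposed, assuming the pruner dispatches on the manifest's top-level `mediaType`. The function and constant names are hypothetical, not the real zuul-registry method:

```python
import json

# Accept both the Docker v2 and OCI manifest media types when collecting
# the digests a manifest references during pruning.
MANIFEST_TYPES = {
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
}

def get_layers(manifest_bytes):
    data = json.loads(manifest_bytes)
    if data.get("mediaType") not in MANIFEST_TYPES:
        # Unknown format: nothing is kept -- the "Unknown manifest"
        # behavior that caused referenced blobs to be pruned above.
        return []
    digests = [layer["digest"] for layer in data.get("layers", [])]
    config = data.get("config", {}).get("digest")
    if config:
        digests.append(config)
    return digests
```

This works only because, as noted below, the two formats put their layer lists under the same JSON keys; a format with a different layout would need its own branch.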
| mordred | corvus: yeah. I agree | 18:38 |
| clarkb | ok so maybe related to buildx change or maybe just related to docker updates | 18:39 |
| clarkb | either way there is an additional format we need to be able to understand | 18:39 |
| mordred | Yeah - correct thing is update zuul-registry | 18:39 |
| mordred | in the short term, we could add --output type=docker to the buildx commands it seems, to produce the old format | 18:40 |
| clarkb | I suspect fixing this is probably easy. Just need to get one of these manifests (the hound image I proposed yesterday should still be in there I suspect?) | 18:40 |
| clarkb | well, actually the manifest itself is still there; the blobs aren't | 18:41 |
| clarkb | https://github.com/opencontainers/image-spec/blob/main/manifest.md this appears to be the spec | 18:42 |
| mordred | according to the internet, the formats are very similar, mostly differing in media type identifiers | 18:43 |
| clarkb | vs github.com/openshift/docker-distribution/blob/master/docs/spec/manifest-v2-2.md | 18:44 |
| clarkb | though I just noticed that is an openshift repo so probably not authoritative | 18:44 |
| clarkb | comparing those I think the datastructures are basically equivalent for what we are doing | 18:48 |
| mordred | corvus: in https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/main.py#L330-L333 - it seems like another place where, for completeness, we'd want to also validate the new default. Any chance you know why we'd only want to validate the docker v2? | 18:49 |
| mordred | we also already list application/vnd.oci.image.manifest.v1+json here: https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/main.py#L178-L185 - but as the lowest priority | 18:51 |
| clarkb | looking at the format specs I'm not sure I understand what https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/storage.py#L300-L301 is getting us. I'm half wondering if this would break for the supported format too | 18:52 |
| clarkb | since manifests don't seem to have keys off of their mediatype | 18:53 |
| corvus | mordred: i think we only added that validation to catch an error that was being produced; i think we presume it doesn't occur otherwise | 18:53 |
| mordred | nod | 18:53 |
| clarkb | oh wait, we get the blob as a separate step, so I guess that isn't getting the data I think it is in the first half of the get manifests method | 18:54 |
| corvus | mordred: yeah, so seems like the prune system should be able to handle all of those formats | 18:54 |
| mordred | ++ - want me to add the other three to the prune check while I'm at it? | 18:55 |
| mordred | it doesn't look like we have any tests to test manifest parsing for this | 18:55 |
| corvus | clarkb: yep, the manifest blob is an index which points to one or more other blobs which are the actual manifest content (ie, layer list and config) in different formats | 18:56 |
| corvus | mordred: yeah, if they all have the same layer list format (ie, lines 307-311 work for all of them) then that might be it | 18:57 |
| clarkb | corvus: https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/main.py#L380-L382 yup it seems to originate from here | 18:57 |
| corvus | if one of them puts their layer list in some other json key, we'd need to handle it there | 18:57 |
| clarkb | I was expecting that to be from docker but it is our own accounting data | 18:57 |
| corvus | it does come from docker | 18:58 |
| corvus | the content does at least | 18:59 |
| clarkb | corvus: I mean we're writing it in as data to account for data that docker gave us. So the value itself is from docker but it's not a direct blob or manifest | 18:59 |
| corvus | but yeah, if you're referring to the way we store the multiple versions, i see what you mean | 18:59 |
| clarkb | mordred: corvus manifest lists are slightly different https://github.com/openshift/docker-distribution/blob/main/docs/spec/manifest-v2-2.md#example-manifest-list | 18:59 |
| clarkb | I think for manifest lists we want for m in data[manifests]: m[digest]. And for manifests we want manifest[config][digest] and for l in manifest[layers]: l[digest] | 19:01 |
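clarkb's rule above, as a small sketch; `referenced_digests` is a hypothetical helper, and real manifests carry more fields than shown:

```python
import json

def referenced_digests(document):
    # For a manifest list (Docker list / OCI index): the digest of each
    # sub-manifest. For a plain manifest: the config digest plus each
    # layer digest.
    data = json.loads(document)
    if "manifests" in data:
        return [m["digest"] for m in data["manifests"]]
    digests = [data["config"]["digest"]]
    digests.extend(layer["digest"] for layer in data["layers"])
    return digests
```

As the following messages conclude, the manifest-list branch may be unnecessary in practice because lists and their sub-manifests are uploaded and aged together.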
| mordred | manifest lists are just references to other manifests. so for pruning, they really shouldn't come into play? | 19:01 |
| mordred | we should be pruning at the manifest level | 19:02 |
| clarkb | mordred: I think pruning also prunes manifests so if the manifest list is newer than our prune timeout we want to add the digest for each manifest in the manifest list to the keep list | 19:02 |
| clarkb | mordred: to preserve similar behavior to https://opendev.org/zuul/zuul-registry/src/branch/master/zuul_registry/storage.py#L309 in the single manifest case? | 19:03 |
| clarkb | mordred: I think otherwise we could make a new manifest list with old manifests in it that would get deleted, breaking the manifest list? | 19:03 |
| mordred | I think you're right in the pathological case, but in practice I don't know of any flow where we'd create and upload a manifest list using pre-existing manifests. we're super clever with buildx things, but we're not that clever :) | 19:04 |
| clarkb | that is a good point. Those get uploaded together | 19:04 |
| clarkb | and manifest lists aren't themselves artifacts within manifests | 19:05 |
| clarkb | (unlike layers) | 19:05 |
| clarkb | in any case it sounds like you're writing a change so I don't need to? | 19:05 |
| mordred | yeah. I'm on it | 19:06 |
| corvus | so we ignore manifest lists for pruning and assume that the manifest list and all the manifests it references will be pruned at the same time. | 19:06 |
| corvus | that's probably worth a code comment :) | 19:06 |
| clarkb | ++ and thanks | 19:06 |
| corvus | (i assume we won't have any collisions with manifests being used either alone or in multiple manifest lists, just because they should have build data that won't be identical in two builds. if i'm wrong about that, then we would need to handle manifest lists as an extra reference-counting step) | 19:07 |
| mordred | remote: https://review.opendev.org/c/zuul/zuul-registry/+/970480 Update manifest parsing to support oci manfiests [NEW] | 19:09 |
| clarkb | re testing: it might be easy to do that at a functional level | 19:09 |
| mordred | there's a first stab | 19:09 |
| clarkb | build image, upload it, prune, pull image | 19:09 |
| clarkb | the 24 hour sanity check may need overriding though | 19:10 |
| clarkb | or we can fast forward our clocks before pruning | 19:10 |
| corvus | yes, but otoh, with so many ways of making images, maybe a synthetic test where we have test fixtures for each of the manifest formats, and then can mock out the time in the test might let us keep more stable test coverage over time | 19:13 |
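A minimal sketch of that fixture-plus-mocked-clock approach; `is_expired` and `EXPIRATION` are illustrative stand-ins for the registry's real age check, and a real test would feed one canned manifest per format through the parser the same way:

```python
import time
from unittest import mock

EXPIRATION = 180 * 24 * 3600  # matches the config quoted earlier

def is_expired(mtime):
    # Age check of the kind the pruner performs.
    return time.time() - mtime >= EXPIRATION

def test_expiry_with_mocked_clock():
    # Freezing time.time() keeps fixtures deterministic forever,
    # instead of depending on the wall clock at test runtime.
    fixed = 1_000_000_000.0
    with mock.patch("time.time", return_value=fixed):
        assert is_expired(fixed - EXPIRATION - 1)
        assert not is_expired(fixed - 60)
```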
| clarkb | mordred: I left a note about manifest list handling on the change | 19:13 |
| clarkb | or wait are manifest lists never entering the code in the first place? | 19:14 |
| clarkb | (I'm trying to reconcile "don't do anything with manifest lists" with the fact that we have one in our acceptable content types) | 19:14 |
| mordred | ohhhh ... yeah. hang on | 19:18 |
| opendevreview | Merged opendev/system-config master: Update Hound Container to Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970322 | 19:18 |
| clarkb | hound is in the process of starting back up now after ^ deployed | 19:23 |
| mordred | corvus, clarkb: ok, updated. made it much simpler change | 19:24 |
| clarkb | mordred: you pushed a new change :) | 19:24 |
| mordred | wat? | 19:24 |
| mordred | oh - piddle. derp | 19:24 |
| clarkb | the change id changed. Maybe you overdeleted the content in the commit message when editing it or something | 19:25 |
| mordred | yup. it was a commit message edit derp | 19:25 |
| corvus | so what's a application/vnd.oci.image.index.v1+json ? | 19:26 |
| mordred | that's the manifest list version | 19:26 |
| corvus | i assumed that was application/vnd.docker.distribution.manifest.list.v2+json | 19:27 |
| clarkb | corvus: https://github.com/opencontainers/image-spec/blob/main/image-index.md ist the oci manifests list format I think | 19:27 |
| corvus | are both of those manifest lists then? | 19:27 |
| mordred | yeah. the docker non-oci version is that | 19:27 |
| mordred | yes | 19:27 |
| mordred | of the 4, two are manifest lists, and 2 are manifests | 19:27 |
| clarkb | they look almost identical too. So basically they formalized the docker things as oci things and maybe sprinkled a bit of extra info in at the same time | 19:27 |
| corvus | ok cool. i mean... you know, "cool" because... cool. | 19:28 |
| mordred | yup. there are some metadata differences in areas we don't care about :) | 19:28 |
| corvus | +3 thanks mordred ! | 19:31 |
| fungi | in unrelated news, deploy of the matrix-eavesdrop container trixie update failed on a "502 Bad Gateway" error from quay | 19:31 |
| clarkb | fungi: ya I was trying to confirm it was quay (it was pulling accessbot but the accessbot change was behind it in the gate, so I think it was quay) | 19:32 |
| clarkb | (if accessbot was ahead it would use the image from the gate job ahead I think) | 19:32 |
| clarkb | hound seems to be working https://codesearch.opendev.org/?q=foobar&i=nope&literal=nope&files=&excludeFiles=&repos= | 19:35 |
| clarkb | the accessbot eavesdrop job is finally running. If it looks ok I'll recheck the matrix-eavesdrop change | 19:51 |
| opendevreview | Merged opendev/system-config master: Build accessbot on trixie https://review.opendev.org/c/opendev/system-config/+/970321 | 20:08 |
| mordred | clarkb, corvus well piddle. functional tests fail :( will dig a little more | 20:55 |
| mordred | hrm. it fails on a skopeo task | 20:57 |
| clarkb | https://zuul.opendev.org/t/zuul/build/3fbe42da4bbb475ca1baf7161ce804c5/log/docker/functional-test_registry_1.txt#7 is the log on the other side I think | 20:58 |
| clarkb | looks like ssl problems before we do anything docker protocol specific | 20:58 |
| mordred | we haven't run that job since april | 20:59 |
| mordred | perhaps something has degraded/bitrotted | 20:59 |
| clarkb | it would not surprise me | 20:59 |
| mordred | or just changed (hi ssl) | 20:59 |
| clarkb | it is using cherrypy which zuul also uses. Maybe we need to update deps or pin versions or something | 21:00 |
| clarkb | mordred: some googling indicates that maybe its trying to do http on one side and https on another | 21:02 |
| mordred | that could be a thing | 21:02 |
| mordred | afk for a sec, but I'll poke more in just a bit | 21:02 |
| corvus | there is (was) a problem with cheroot, a cherrypy dep, causing python to hang instead of exiting after tests. sounds like that isn't the problem here, but if that shows up, we can copy the pin from zuul. | 21:02 |
| clarkb | https://opendev.org/zuul/zuul-registry/src/branch/master/playbooks/functional-test/conf/registry.yaml this indicates to me that the server side is configured to do ssl | 21:04 |
| clarkb | https://opendev.org/zuul/zuul-registry/src/branch/master/playbooks/functional-test/docker.yaml#L3-L17 is where we're failing | 21:06 |
| clarkb | maybe we need skopeo --tls-verify=false ? | 21:07 |
| clarkb | corvus: that cheroot issue claims to have been fixed in november, but I'm not sure if there is a new release for it yet | 21:10 |
| corvus | oh nice, last i checked they hadn't merged a fix | 21:11 |
| clarkb | mordred: actually I misidentified the docker compose file I think it is this one: https://opendev.org/zuul/zuul-registry/src/branch/master/playbooks/functional-test/docker-compose.yaml | 21:13 |
| clarkb | anyway that docker-compose file loads that registry.yaml config into the registry so yes should still be listening on port 9000 with ssl certs | 21:14 |
| clarkb | the error message implies the tcp connection made it to the process running in the container too | 21:14 |
| clarkb | I'm not seeing anything obviously wrong around that. We create a tls cert and it has a dns entry for localhost and an ip entry for 127.0.0.1 | 21:16 |
| clarkb | and we run update-ca-certificates against it | 21:16 |
| clarkb | mordred: corvus: that skopeo command is doing `skopeo copy docker-archive:/tmp/registry-test/test.img docker-daemon:localhost:9000/test/image:latest` but localhost:9000 is a docker registry not a docker daemon. | 21:21 |
| clarkb | I wonder if that just worked in the past | 21:22 |
| clarkb | but now we need to use docker:// ? | 21:22 |
| corvus | weird. i do have a successful build result from a change from april of this year, so it worked at least that recently | 21:28 |
| clarkb | the command name does say copy into local docker image storage. But localhost:9000 appears to be the zuul registry we start in the task just prior listening on port 9000 | 21:29 |
| corvus | but maybe it's the other way around: maybe we need to be copying to the daemon. maybe the "localhost:9000/" is the new thing | 21:29 |
| opendevreview | Merged opendev/system-config master: Update matrix-eavesdrop container to build on Debian Trixie https://review.opendev.org/c/opendev/system-config/+/970325 | 21:29 |
| corvus | can you link to the build console? | 21:29 |
| clarkb | corvus: https://zuul.opendev.org/t/zuul/build/3fbe42da4bbb475ca1baf7161ce804c5/console this is the failing task | 21:30 |
| corvus | the comment and structure of the task are correct: | 21:34 |
| corvus | we want to copy the test image into the local docker daemon's local image storage, and once there, we want that image to have the name "localhost:9000/test/image:latest" | 21:35 |
| corvus | in other words, we're not trying to get skopeo to talk to port 9000 | 21:35 |
| corvus | we're trying to create a local docker image with "localhost:9000" as part of its name | 21:35 |
| corvus | so that we can do a "docker push localhost:9000/test/image" in the next task | 21:36 |
| corvus | (which will cause the docker daemon to push the image named that to localhost:9000 -- that's the point at which we want something to talk to that port) | 21:36 |
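The two-step flow corvus lays out, sketched as the command lines involved. The helper returns the commands rather than executing them, so no skopeo install is assumed; paths mirror the job's:

```python
def push_via_daemon_cmds(archive, name):
    # Step 1: skopeo only tags the archive into the local daemon's image
    # storage; "docker-daemon:" is a destination transport, not a network
    # endpoint, so nothing should contact port 9000 here.
    copy = ["skopeo", "copy",
            "docker-archive:" + archive,
            "docker-daemon:" + name]
    # Step 2: only now does the daemon push to the registry named in the tag.
    push = ["docker", "push", name]
    return copy, push
```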
| clarkb | corvus: https://zuul.opendev.org/t/zuul/build/3fbe42da4bbb475ca1baf7161ce804c5/log/docker/functional-test_registry_1.txt#7 https://zuul.opendev.org/t/zuul/build/3fbe42da4bbb475ca1baf7161ce804c5/log/job-output.txt#3095 these are the two timestamps and they seem to line up almost? | 21:36 |
| corvus | maybe skopeo changed in such a way that it's trying to be helpful? | 21:37 |
| corvus | yeah, i mean, i'm not saying that it isn't talking to port 9000 | 21:37 |
| corvus | i'm saying it's not supposed to :) | 21:37 |
| clarkb | the timestamps are off by one second. It is enough to make me question that this is the issue but also close enough that it wouldn't surprise me if that is a logging artifact | 21:38 |
| corvus | i don't think we expect anything to contact port 9000 until the next task | 21:38 |
| clarkb | so ya, one theory is that newer skopeo (on noble?) is interpreting that as "talk to the docker daemon at localhost:9000" and it doesn't work, rather than just naming things that way | 21:38 |
| corvus | yep | 21:39 |
| clarkb | side note: the openstack zuul tenant is in an interesting state (lots of release approvals and other changes making my browser not render things quickly or at all) | 21:39 |
| corvus | which is a super weird thing for skopeo to do, since it entails not understanding that image names can have hostnames in them, which is kind of the rallying cry of the entire "containers lib" project... | 21:40 |
| clarkb | oh I think release-approval is just noise. The pipeline manager has to consider each change but really it's just a bunch of new changes pushed up? | 21:40 |
| corvus | yeah, just give it a sec to work through it | 21:40 |
| mordred | so maybe try plopping a docker:// in front of the localhost:9000 ? | 21:58 |
| clarkb | I was going to suggest maybe running skopeo locally against your docker daemon to see if it tries to hit some specific port, if you've got things installed (I don't think I have skopeo currently) | 22:01 |
| clarkb | the job ran on ubuntu noble and I'm also trying to remember when we switched the default nodeset but I think it was earlier than april so any distro packaging change would've been caught earlier? | 22:01 |
| mordred | yeah ... also, according to docs, docker:// implies http transport | 22:03 |
| mordred | so whatever the answer is, it's almost certainly not that | 22:03 |
| mordred | (trying to reproduce locally) | 22:04 |
| clarkb | matrix-eavesdrop updated on eavesdrop02.opendev.org and https://meetings.opendev.org/irclogs/%23zuul/%23zuul.2025-12-10.log is recording new logs | 22:04 |
| mordred | well. that's unfortunate. the skopeo command TOTALLY worked as expected for me locally | 22:06 |
| mordred | it did not try to hit localhost:9000 at all | 22:07 |
| clarkb | so maybe the timing being close there is just a distraction | 22:09 |
| clarkb | as I said its not exactly overlapping but within a second or so of one another | 22:09 |
| mordred | clarkb: gerrit channel confirms that things should work on node 24 | 22:12 |
| clarkb | cool I just caught up on discord and see that is what paladox uses | 22:15 |
| clarkb | kolla-ansible got a 65-change stack for fixing ansible lint errors | 22:20 |
| fungi | 65?!? | 22:20 |
| clarkb | yes sir https://review.opendev.org/c/openstack/kolla-ansible/+/970555/1 is a random one in the middle that I clicked on and the show all button says (65) | 22:21 |
| clarkb | I didn't count them to make sure gerrit's count is accurate | 22:21 |
| fungi | yikes | 22:22 |
| tonyb | #dedication! | 22:31 |
| opendevreview | Monty Taylor proposed zuul/zuul-jobs master: Update the default skopeo upstream version https://review.opendev.org/c/zuul/zuul-jobs/+/970566 | 23:51 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!