fungi | corvus: this might be related to my observation on friday, if so mnaser is aware that something recently broke their hosted cloud profile for vexxhost | 13:10 |
---|---|---|
corvus | fungi: hrm. well, it appears it is not fixed yet. i suppose we either need to completely remove vexxhost, or update our clouds.yaml to hold the content previously found in the hosted profile. does anyone know what that content is supposed to be? | 15:01 |
corvus | archive.org as of 2021 says: https://paste.opendev.org/show/b4iuUmOz9aBRYxguMRtO/ | 15:04 |
corvus | i'm working on that | 15:11 |
tkajinam | I wonder if anyone is aware of the consistent node failure in arm64 jobs ? | 15:14 |
corvus | tkajinam: yes, one of opendev's public clouds is currently unavailable | 15:16 |
fungi | tkajinam: we only have one provider for those (osuosl) and it's not working at the moment | 15:16 |
fungi | tkajinam: but if you know of any other openstack cloud providers interested in donating arm server instance quota, please do get them in touch with us | 15:16 |
fungi | we used to have two providers, but linaro decided they couldn't keep providing it | 15:17 |
opendevreview | James E. Blair proposed opendev/system-config master: Use local profile for vexxhost https://review.opendev.org/c/opendev/system-config/+/941933 | 15:17 |
corvus | i manually made those changes on zl01 and it looks like it's happy to me, so i think that has all the settings we need | 15:18 |
fungi | i adjusted my personal clouds.yaml similarly and it's working for my account too | 15:26 |
fungi | corvus: is there a reason requires_floating_ip: false is only set in the nodepool clouds.yaml, not for bridge? | 15:27 |
fungi | and i guess the nodepool builders don't look at the image_format from the profile to decide what to upload | 15:28 |
corvus | it looked like we set a lot fewer things in that file so i tried to keep it small; do you think we should add that one too? | 15:28 |
fungi | nah, just curious about the divergence | 15:28 |
corvus | fungi: i think the nodepool builders do look at that, but 'raw' is already in the file (i wonder why) | 15:29 |
corvus | that=image format | 15:29 |
fungi | aha, got it | 15:29 |
fungi | i totally missed that we were overriding it there | 15:29 |
fungi | probably we had it in there before it was added to the remote profile and could clean it up once it comes back | 15:30 |
corvus | sounds reasonable | 15:32 |
tkajinam | corvus fungi, ah, ok. thanks. got it. | 15:33 |
tkajinam | fungi, I know some people working for arm things so I'll discuss it if I get a chance to talk with them. | 15:34 |
fungi | thanks!!! | 15:34 |
Clark[m] | fungi: tkajinam: clarification: the linaro cloud was hosted on "Works on ARM" hardware hosted by equinix. That hardware was pulled by Works on ARM and wasn't a linaro decision. | 15:54 |
Clark[m] | Then sometime later equinix announced they are shutting all their hardware hosting down. Not sure if related | 15:55 |
fungi | yeah, linaro wasn't able to find other resources to continue providing it | 15:59 |
clarkb | any reason to not approve https://review.opendev.org/c/opendev/system-config/+/941679 and https://review.opendev.org/c/opendev/zone-opendev.org/+/941168 ? I guess we may want to see 941933 make things happy with clouds first? | 16:05 |
fungi | both lgtm | 16:13 |
corvus | i +3d them. if we bork the clouds.yaml on bridge -- well, deleting is a manual process anyway, so should be easy to detect and fix. | 16:38 |
fungi | fair enough | 16:38 |
clarkb | wfm | 16:38 |
opendevreview | Merged opendev/zone-opendev.org master: Cleanup zuul-lb01 and reset zuul ttl https://review.opendev.org/c/opendev/zone-opendev.org/+/941168 | 16:41 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: Change grub variables for style and timeout https://review.opendev.org/c/openstack/diskimage-builder/+/937684 | 16:47 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: Change grub variables for style and timeout https://review.opendev.org/c/openstack/diskimage-builder/+/937684 | 16:49 |
opendevreview | Merged opendev/system-config master: Use local profile for vexxhost https://review.opendev.org/c/opendev/system-config/+/941933 | 17:19 |
opendevreview | Merged zuul/zuul-jobs master: [remove-registry-tag] Allow using in a loop https://review.opendev.org/c/zuul/zuul-jobs/+/941516 | 17:21 |
fungi | fix for the missing vexxhost profile deployed successfully | 17:29 |
fungi | openstackclient on bridge is working with vexxhost again | 17:29 |
fungi | s/fix/workaround/ | 17:31 |
clarkb | I've just realized that the buildkit image used by docker buildx commands may be somewhat hardcoded into docker? | 17:53 |
clarkb | that didn't even occur to me as a potential issue when setting up a mirror for that image | 17:53 |
fungi | how so? | 17:53 |
corvus | oh i think there may be a way to start a builder with a specific image, then docker will use an already running builder | 17:56 |
clarkb | well I mirrored buildkit:buildx-stable-1 into quay.io so that we can use it in zuul-jobs | 17:56 |
corvus | 1 sec | 17:56 |
clarkb | but I don't see a docker buildx create flag to use a specific image | 17:56 |
clarkb | maybe the trick is to pull the image first, then rename it, then buildx won't pull? | 17:56 |
fungi | aha, got it. the tool is looking only to dockerhub | 17:57 |
fungi | on my way out to run a quick errand, hopefully back in half an hour | 17:57 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Use registry:2 image mirrored to quay.io https://review.opendev.org/c/zuul/zuul-jobs/+/941970 | 17:58 |
clarkb | that is one half of the update i was going to make | 17:59 |
corvus | clarkb: https://docs.docker.com/build/builders/drivers/docker-container/ | 17:59 |
corvus | clarkb: i think maybe if we start the builder ourselves with the quay image then i think docker builds should use that builder automatically and not try to start a new one | 18:00 |
clarkb | corvus: gotcha so docker buildx create --driver-put=image=quay.io/opendevmirror/buildkit:buildx-stable-1 ? | 18:00 |
clarkb | I'll push a change up to exercise ^ | 18:00 |
corvus | s/put/opt/ but yeah something like that | 18:01 |
clarkb | oh we need to set the docker-container driver as the driver as well since that is different than the default docker driver I guess | 18:01 |
corvus | yeah | 18:01 |
clarkb | thanks I'll look into that now. 941970 is a related change (registry is the other image we mirrored) | 18:01 |
corvus | i haven't tried this exact thing, so, some experimentation may be necessary :) | 18:02 |
corvus | i did include a step to start the docker builder in https://review.opendev.org/923084 -- but i constructed it to start the default builder with the default options | 18:03 |
corvus | (that was to address a race with podman starting multiple builders) | 18:03 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Use mirrored buildkit:buildx-stable-1 image https://review.opendev.org/c/zuul/zuul-jobs/+/941992 | 18:08 |
clarkb | first experimentation step | 18:08 |
opendevreview | Merged opendev/system-config master: Remove codesearch01 https://review.opendev.org/c/opendev/system-config/+/941679 | 18:24 |
opendevreview | Clark Boylan proposed opendev/system-config master: Adjust LE role file matchers on system-config-run-* jobs https://review.opendev.org/c/opendev/system-config/+/941997 | 18:25 |
clarkb | I think we can dequeue 940219,5 from the opendev promote pipeline now. That was stuck there due to the empty nodeset with zuul launcher bug iirc | 18:33 |
clarkb | any objections to me doing so now? | 18:33 |
fungi | no objection from me | 18:41 |
clarkb | done | 18:48 |
clarkb | speaking of bindep do we want to merge some bindep changes? | 18:52 |
clarkb | the buildx change failed on docker rate limits pulling the multiarch image before it got to buildx | 19:00 |
clarkb | I'll recheck in a bit to see if that even works at all. But also I'll get a change up for multiarch mirroring I guess | 19:00 |
fungi | yeah, i think we can merge whichever bindep changes folks are comfortable with and then work on porting the same ideas to, say, git-review or something next | 19:11 |
clarkb | I think we can do https://review.opendev.org/c/opendev/bindep/+/938568/9 and parents at least | 19:13 |
opendevreview | Clark Boylan proposed opendev/system-config master: Mirror multiarch/qemu-user-static https://review.opendev.org/c/opendev/system-config/+/942002 | 19:24 |
clarkb | corvus: that pointer seems to have done it. The multiarch jobs pass after a recheck and grepping the logs shows it fetching from quay.io with no logs indicating it also pulled from docker | 19:47 |
clarkb | last year we did a tour at dino lab in victoria bc and at the end they sit you down with some rocks containing fossils and the various air tools to slowly chip away at the rock surrounding the fossils. This container image stuff feels a lot like that. Slowly removing what we don't want and eventually we'll be where we want to be | 19:49 |
corvus | nice! | 20:04 |
corvus | i'm going to restart the remaining zuul components since the vexxhost fix has merged (and the zuul-web that was stuck has healed itself) | 20:05 |
corvus | it's just the components on zuul02 that are the old version now | 20:06 |
opendevreview | Merged opendev/system-config master: Mirror multiarch/qemu-user-static https://review.opendev.org/c/opendev/system-config/+/942002 | 20:06 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Replace debian:testing with quay.io/opendevmirror/httpd:alpine https://review.opendev.org/c/zuul/zuul-jobs/+/942008 | 20:08 |
clarkb | corvus: ack thanks | 20:08 |
clarkb | 942008 is another replace docker hub hosted image with something roughly equivalent that we can fetch from quay | 20:08 |
clarkb | hopefully with this set of updates the zuul-jobs role testing and the roles themselves will be a bit more reliable for us | 20:09 |
opendevreview | Merged openstack/diskimage-builder master: Change grub variables for style and timeout https://review.opendev.org/c/openstack/diskimage-builder/+/937684 | 21:12 |
clarkb | I'm going to approve the dns change to remove codesearch01 now that the inventory cleanup is done | 21:41 |
clarkb | I need to get a meeting agenda together | 21:42 |
clarkb | anything need to be edited in? The service coordinator nomination period ends tomorrow so I'll call that out | 21:42 |
opendevreview | Merged opendev/zone-opendev.org master: Remove codesearch01 from DNS https://review.opendev.org/c/opendev/zone-opendev.org/+/941681 | 21:48 |
corvus | all zuul components are running the same (latest) code now | 21:50 |
corvus | also, nl03 has recovered and we're launching nodes in osuosl again | 21:51 |
clarkb | thanks! | 21:51 |
clarkb | my initial meeting agenda edits are in. Let me know if I need to add or remove things | 22:06 |
fungi | infra-root: do we have consensus for changes 816741, 938520 and 938568 in bindep? (the pyproject.toml series up through dropping support for python 3.6 and associated cleanup/simplifications) | 22:32 |
fungi | maybe we can approve those tomorrow if there are no objections | 22:32 |
clarkb | no objections from me | 22:32 |
fungi | mainly because in the coming weeks i'd like to start applying a similar pattern to some of our other tools too | 22:33 |
clarkb | ++ | 22:34 |
clarkb | deleting zuul-lb01 and codesearch01 is approaching on my todo list. That can be another one for tomorrow if there are no objections before then | 22:35 |
clarkb | fungi: I'm noticing that we may want to set up logrotate rules for the logs in /var/lib/mailman/core/var/logs | 23:04 |
clarkb | specifically on lists.opendev.org | 23:04 |
opendevreview | James E. Blair proposed opendev/system-config master: Use a dedicated zuul launcher temp dir on /opt https://review.opendev.org/c/opendev/system-config/+/942018 | 23:16 |
corvus | we ran out of tmp space downloading images ^ | 23:16 |
clarkb | +2 from me | 23:17 |
clarkb | that may even auto restart things due to the docker compose config update | 23:18 |
corvus | i manually removed some orphaned image file (i just proposed a change to do that automatically) and the launcher seems to have recovered after that without further intervention | 23:27 |
corvus | 2025-02-17 23:21:46,734 INFO zuul.Launcher: Starting upload <ImageUpload 99fba2c3128743f0b0599414ca914d51 state: uploading endpoint: raxflex/raxflex-SJC3 artifact: 2276952861474de8aae5689a9999fdcf validated: True external_id: None> | 23:28 |
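
The buildx exchange between clarkb and corvus (17:53 to 18:02) comes down to starting a builder with the `docker-container` driver and pointing its `image` driver option at the quay.io mirror, so that later builds reuse that builder instead of pulling buildkit from Docker Hub. A minimal sketch of that invocation follows. It assumes the image name quoted in the log; the builder name `mirrored-builder` is a hypothetical placeholder, and the actual zuul-jobs change (941992) may be structured differently.

```python
import shlex
import subprocess

# Image name as quoted in the discussion above.
MIRRORED_BUILDKIT = "quay.io/opendevmirror/buildkit:buildx-stable-1"


def buildx_create_cmd(image=MIRRORED_BUILDKIT, name="mirrored-builder"):
    """Return the argv that creates a docker-container builder from `image`.

    `name` is a hypothetical placeholder, not a name used in the log.
    """
    return [
        "docker", "buildx", "create",
        "--name", name,
        # The default "docker" driver has no image option, so the
        # docker-container driver has to be selected explicitly.
        "--driver", "docker-container",
        "--driver-opt", f"image={image}",
        # Make this builder the one later `docker buildx build` calls use.
        "--use",
    ]


def create_builder(dry_run=True):
    """Return the command line as a string; run it too unless dry_run is set."""
    cmd = buildx_create_cmd()
    if not dry_run:
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


if __name__ == "__main__":
    # Dry run by default so the sketch is safe to execute without docker.
    print(create_builder(dry_run=True))
```

Passing `dry_run=False` would actually invoke docker. Whether subsequent builds then avoid Docker Hub entirely depends on the environment; the log reports that in clarkb's test the multiarch jobs fetched buildkit from quay.io after this kind of change.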