tonyb | clarkb: That's all totally fair. | 00:42 |
tonyb | If I understand the scrollback we can pause/revert the quay.io work, migrate the tooling to podman[1] and then resume the migration | 00:43 |
tonyb | [1] When people other than clarkb are able to push on it | 00:44 |
clarkb | fwiw I can work on it too. I just don't want the expectation to be clarkb is gonna get it all done in a few weeks for quay.io stuff :) | 00:44 |
clarkb | Ideally we can work on it together over time | 00:44 |
tonyb | Yeah. definitely *not* a clarkb thing. | 00:53 |
tonyb | FWIW, whenever I make suggestions I'm always of the opinion that if I'm not going to do the work I only count as 0.5 of a vote. | 00:55 |
clarkb | it looks like the nodepool podman test is going to or has timed out because image builds weren't happening | 01:01 |
clarkb | I can't poke at that more today. Feel free to if interested | 01:02 |
tonyb | clarkb: totally interested, but also it isn't exactly in my "wheelhouse". | 01:05 |
*** amoralej|off is now known as amoralej | 06:33 | |
*** mooynick is now known as yoctozepto | 09:14 | |
yoctozepto | morning | 09:14 |
opendevreview | Merged opendev/base-jobs master: buildset-registry: Always use Docker https://review.opendev.org/c/opendev/base-jobs/+/883869 | 11:45 |
fungi | yoctozepto: ^ should be able to recheck now | 11:55 |
yoctozepto | fungi: thanks, yeah, it went further: https://zuul.opendev.org/t/nebulous/build/a8d511bf7e6b487684e69210ef59d812 I just need to fix the references | 11:58 |
yoctozepto | and now it works :D | 12:12 |
*** amoralej is now known as amoralej|lunch | 12:13 | |
*** amoralej|lunch is now known as amoralej | 13:05 | |
fungi | excellent | 13:50 |
*** amoralej is now known as amoralej|off | 16:10 | |
fungi | the rackspace tickets i opened yesterday have been acted on, reclaiming 118 nodes worth of capacity for jobs | 16:18 |
clarkb | any indication if we should expect the problem to recur? | 16:20 |
fungi | no clue, but it did cause rackspace support to ask who was opening the tickets since (with both of our accounts) the internal employee advocate is the one whose contact information appears on the account rather than ours | 16:21 |
fungi | apparently the current account contact there is don norton, who was surprised by the ticket | 16:22 |
fungi | it finally happened... https://blog.pypi.org/posts/2023-05-23-removing-pgp/ | 16:50 |
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: DNM testing if depends-on parent change works with dib https://review.opendev.org/c/openstack/diskimage-builder/+/883958 | 17:04 |
yoctozepto | have you seen this error using the opendev container image promoting job? https://zuul.opendev.org/t/nebulous/build/e2e20e8bf84d4fc9b9b500fd1dea6e0e | 17:45 |
yoctozepto | https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/promote-container-image/tasks/promote-from-intermediate-registry.yaml | 17:45 |
yoctozepto | oh well, it means nothing was obtained from the api, strange | 17:46 |
yoctozepto | argh, a typo | 17:47 |
clarkb | yoctozepto: that means the job is looking for the gate job that built your image | 17:47 |
clarkb | and couldn't find it | 17:47 |
yoctozepto | yeah | 17:47 |
clarkb | (it uses that info to then find the artifact to fetch and promote) | 17:47 |
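Roughly what the promote role is doing behind the scenes, per clarkb's explanation: it asks the Zuul builds API for the gate-pipeline build of the image-build job on this change, then uses that build's artifact metadata to locate the image in the intermediate registry. A manual sketch of the same lookup; the job name and change number here are made-up placeholders:

```
curl -s "https://zuul.opendev.org/api/tenant/nebulous/builds?job_name=nebulous-build-image&change=12345&pipeline=gate" | jq '.[0]'
```

If the job name in the query doesn't match anything (e.g. is off by one letter), the API returns an empty list, which is the "nothing was obtained from the api" failure mode above.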
yoctozepto | off by one letter | 17:47 |
yoctozepto | yeah, that I figured :-) | 17:47 |
yoctozepto | but it's like | 17:47 |
yoctozepto | "huh, it should be there" | 17:47 |
yoctozepto | and then "oh well, one letter off" | 17:48 |
yoctozepto | nighty night! | 19:14 |
tonyb | I have a couple of "how does it work" questions. When people have time? | 19:54 |
fungi | i have time, hopefully even answers, and if you're lucky they'll even be correct | 19:55 |
tonyb | 1) For AFS utilization, is there a fine-grained way to see how much $something is using? Currently I'm looking at https://grafana.opendev.org/d/9871b26303/afs?orgId=1 for a general sense, but if I wanted to see how much we'd reclaim if we removed $x from storage, where should I go? | 19:55 |
fungi | if you want to install the openafs client locally, you can check quotas with fs subcommands | 19:56 |
clarkb | you can also look at the rsync logs iirc they are in afs too | 19:56 |
clarkb | and rsync has size info | 19:56 |
tonyb | 2) this one is more basic, how do I find which jobs/builds are using the Ubuntu cloud archive? I'm just using stable/$branch && ubuntu as a proxy but I don't know if that's valid | 19:57 |
tonyb | these both came up from me looking at: https://review.opendev.org/c/opendev/system-config/+/883468 | 19:57 |
fungi | fungi@dhole:~$ fs listquota /afs/.openstack.org/docs | 19:57 |
fungi | Volume Name Quota Used %Used Partition | 19:57 |
fungi | docs 50000000 30206816 60% 77% | 19:57 |
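For the UCA-specific version of tonyb's question, the same fs subcommands can presumably be pointed at the mirror tree (path borrowed from fungi's later example); a sketch:

```
# quota/usage for the volume backing the UCA mirror path
fs listquota /afs/.openstack.org/mirror/ubuntu-cloud-archive
# fs examine additionally reports the volume name and status
fs examine /afs/.openstack.org/mirror/ubuntu-cloud-archive
```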
fungi | tonyb: you might be able to query for a uca url in opensearch? | 19:58 |
tonyb | Okay. I'll look at that. Having OpenAFS locally is a little complex due to packaging on Fedora but I can make it work | 19:58 |
fungi | if you have a debian vm you could just apt install it | 19:59 |
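On Debian/Ubuntu that would be something like the following (the dkms package builds the kernel module, which can take a few minutes):

```
sudo apt install openafs-client openafs-modules-dkms openafs-krb5
```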
tonyb | Okay, so opensearch .... I don't know about that, is that essentially the old logstash? | 19:59 |
tonyb | fungi: very true, I could just do that. | 20:00 |
fungi | essentially, except it's being run by openstack community volunteers | 20:00 |
fungi | the project team guide has details i think, or maybe the tact sig page on governance... checking | 20:00 |
tonyb | Ahhh okay that's why I couldn't find it when I went poking in git | 20:00 |
tonyb | clarkb: rsync logs are .... https://static.opendev.org/mirror/logs/rsync-mirrors/ ? | 20:00 |
fungi | tonyb: https://governance.openstack.org/sigs/tact-sig.html#opensearch | 20:01 |
fungi | tonyb: yes | 20:02 |
fungi | for the mirror content rsyncing logs | 20:02 |
tonyb | fungi: Thanks x2 | 20:02 |
fungi | you bet | 20:04 |
fungi | if you have any other questions, i'm happy to answer any time i'm awake | 20:04 |
tonyb | I don't see anything "ubuntu" in .../mirror/logs | 20:04 |
tonyb | fungi: thanks | 20:04 |
fungi | ubuntu (and debian) mirrors are not mirrored with rsync | 20:04 |
fungi | they use a tool called reprepro | 20:04 |
tonyb | These 6 months suck WRT tz overlap | 20:05 |
fungi | we may not be splatting or copying reprepro logs into afs | 20:05 |
fungi | yet | 20:05 |
fungi | but that's something we could add, i'm sure | 20:05 |
tonyb | Okay: https://static.opendev.org/mirror/logs/reprepro/ | 20:05 |
tonyb | those logs don't help with the size thing | 20:06 |
fungi | that was quick! | 20:06 |
tonyb | It's next to rsync in the list ;P | 20:06 |
fungi | what's your size dilemma you need to answer? | 20:08 |
tonyb | fungi: It isn't a dilemma as such. I was curious how much AFS we'd get back if we merged: https://review.opendev.org/c/opendev/system-config/+/883468/ which would stop mirroring older UCA things | 20:10 |
fungi | oh, got it | 20:10 |
fungi | it's hard to know precisely because debian package repositories are often deduplicated in order to avoid carrying identical copies of packages which might be the same in more than one distribution release | 20:11 |
fungi | they don't use completely separate file trees like other distros tend to | 20:11 |
tonyb | Ahh of course the "pool" concept. | 20:12 |
fungi | yes, exactly | 20:12 |
fungi | reprepro deletes any packages not still referenced in the indices | 20:12 |
fungi | so removing an index will free up the space needed by any packages which are only listed in that index, but packages which were also listed in other indices are retained | 20:13 |
fungi | uca may trivially not pool its packages, so we might simply be able to du a subtree to get a good guess | 20:13 |
tonyb | Okay. I understand. This has been helpful. | 20:14 |
clarkb | ya I think UCA is pretty well segregated by openstack release and ubuntu release | 20:14 |
fungi | nevermind, uca is also pooled | 20:14 |
fungi | but we might be able to estimate it by parsing file sizes out of the indices | 20:15 |
clarkb | I restarted the merger on zm06 | 20:17 |
fungi | the "size" fields in indices like /afs/.openstack.org/mirror/ubuntu-cloud-archive/dists/bionic-updates/rocky/main/binary-amd64/Packages.gz | 20:17 |
fungi | tonyb: ^ you could collect up all the relevant indices and then parse those files | 20:17 |
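A rough sketch of that estimate: sum the Size: fields from each index slated for removal. Because of pooling this overcounts, since packages still referenced by a surviving index won't actually be freed. The index path is the one fungi cites above:

```
zcat /afs/.openstack.org/mirror/ubuntu-cloud-archive/dists/bionic-updates/rocky/main/binary-amd64/Packages.gz \
  | awk '/^Size:/ { total += $2 } END { printf "%.2f GiB\n", total / 1024^3 }'
```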
tonyb | Okay. | 20:17 |
clarkb | fungi: does du work against openafs mounted content? | 20:18 |
tonyb | I think so. | 20:18 |
clarkb | ya so du might work. Might also be a bit slow as it stats everything | 20:18 |
tonyb | It was mostly that I did one review and came up with a bunch of impacts from it, and I couldn't really answer any of them, so I figured 1) my review wasn't super helpful; and 2) I needed to ask :) | 20:20 |
clarkb | tonyb: re fedora + afs you might get away with kafs though I'm not sure if that gets you userland support you might need | 20:21 |
clarkb | I tried kafs on opensuse a while back and it didn't work but that was a while ago | 20:21 |
tonyb | I don't know about the userland stuff. ianw suggested it; I tried it and immediately found it is non-functional ATM :/ | 20:21 |
clarkb | at one point I had my fileserver doing openafs mounts for me because it is ubuntu based (with zfs!) | 20:23 |
tonyb | nice. | 20:23 |
fungi | clarkb: du won't work because, as i said, uca is pooled after all | 20:30 |
fungi | there's not separate subdirectories for each set of packages | 20:30 |
tonyb | Also using opensearch answered my which (UCA) releases are still in use question | 20:31 |
fungi | cool | 20:31 |
clarkb | fungi: I just mean generally. Du doesn't work on some filesystems. btrfs in particular has tripped me up because it gives you some naive view | 20:47 |
clarkb | though I think that may have improved over time | 20:48 |
fungi | i've used it with afs in the past | 20:48 |
fungi | though it can take a while | 20:49 |
clarkb | ya lots of stats is slow iirc | 20:50 |
fungi | fungi@dhole:~$ du -s /afs/.openstack.org/mirror/ubuntu-cloud-archive | 20:53 |
fungi | 6583494 /afs/.openstack.org/mirror/ubuntu-cloud-archive | 20:53 |
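(du's default unit is 1 KiB blocks, so the figure above works out to roughly 6.3 GiB for the whole UCA mirror tree.)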
opendevreview | Clark Boylan proposed openstack/diskimage-builder master: fedora: don't use CI mirrors https://review.opendev.org/c/openstack/diskimage-builder/+/883798 | 22:47 |
clarkb | fungi: ianw ^ I think that should fix the most recent error | 22:48 |
fungi | ah, cool! | 22:48 |
fungi | yep, so it was trying to use the mirrors which were no longer set | 22:50 |
clarkb | corvus: moving here because it's more opendev specific. I think our system-config-run-zuul jobs deploy and configure a zookeeper for ssl and all that right? we should be able to adapt that to the nodepool job and then maybe even have that job build an image to test things end to end? | 22:51 |
clarkb | I think the nodepool job currently doesn't do any workload because there is no zookeeper present | 22:52 |
corvus | clarkb: yeah, that could be done... but... two things: 1) that will take ages assuming a production image, and if we use a dummy image, i'm not sure that adds anything; 2) we CD nodepool, so that kind of breakage is more likely to come from the nodepool repo than system-config | 22:59 |
corvus | clarkb: also 3) it shouldn't be necessary once we move image building into jobs so may not be a great investment | 22:59 |
clarkb | that is a good point | 23:03 |
opendevreview | James E. Blair proposed opendev/system-config master: WIP: Test zuul on jammy https://review.opendev.org/c/opendev/system-config/+/883986 | 23:07 |
opendevreview | James E. Blair proposed opendev/system-config master: WIP: Test nodepool on jammy https://review.opendev.org/c/opendev/system-config/+/883987 | 23:11 |
corvus | clarkb: is there any current testing that would actually exercise that nested podman issue? | 23:15 |
corvus | https://github.com/containers/podman/issues/14884 | 23:16 |
clarkb | corvus: I think just what nodepool's testing was doing before the quay move broke speculative gating | 23:17 |
corvus | yeah, it looks like the container-release job should do that | 23:17 |
clarkb | we could do the workaround with skopeo and run it under docker instead of podman | 23:18 |
clarkb | and have a one off job on the side sort of deal just to cover that case | 23:18 |
clarkb | that probably wouldn't be too terrible since we can isolate the job | 23:18 |
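For reference, the skopeo workaround clarkb describes would look something like this: pull the speculative image out of the buildset registry with skopeo and hand it straight to the docker daemon, so no podman-in-docker pull is needed. The registry host/port and image name are illustrative placeholders:

```
skopeo copy --src-tls-verify=false \
  docker://buildset-registry.example:5000/example/nodepool:latest \
  docker-daemon:example/nodepool:latest
```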
corvus | hrm? in nodepool repo? i don't think that's necessary... | 23:18 |
corvus | i just want to know if https://review.opendev.org/883952 means it really worked | 23:19 |
corvus | and it looks like it did... though i should probably update that to also remove your sudos | 23:20 |
clarkb | corvus: but that would be podman nested in podman | 23:20 |
corvus | right, which is what we want | 23:20 |
clarkb | ah I see. I was confused I think due to the concern about how opendev is still podman in docker | 23:21 |
corvus | i think the specific question was: with https://github.com/containers/podman/issues/14884 merged can we now remove the cgroup hack | 23:21 |
clarkb | but ya I think that shows podman in podman is fine. The original issue was podman in docker (not sure if it exhibited with podman in podman or not) | 23:21 |
clarkb | corvus: right but the original issue was filed specifically about podman in docker being problematic. Unknown if the same issue existed as podman in podman | 23:22 |
clarkb | We can test the podman in docker case if we revert the podman change and use the skopeo hack or just run a separate job instead of a revert that does that | 23:22 |
clarkb | anyway I think it is probably sufficient to land that and if it breaks opendev we can revert and tackle with more robust testing | 23:22 |
clarkb | since the impact will be low | 23:22 |
corvus | yeah. i think if we can run podman-in-podman as a normal user without the cgroup hack in, oh, say about a month after debian releases, then i think we're in a good place. i think that's the key thing that, from opendev's perspective, would weigh in on whether it's okay to start landing the podman changes in the zuul project. | 23:23 |
corvus | put another way, if we can clear out the cgroup hack, then i think we're good to land the podman switch for now (with the cgroup hack in place and the sudo workaround; ignore everything in opendev because nothing substantial is changing), then land the cgroup cleanup later. | 23:24 |
corvus | if the cgroup cleanup doesn't work in our desired end-state, then i think opendev should raise that with the zuul project as a reason to hold off/reconsider podman | 23:25 |
clarkb | I think the only thing that opendev really cares about is whether or not podman in docker would work. Everything else should be well covered. | 23:26 |
clarkb | And the only reason that is in question is we don't know what sort of testing podman upstream did when they fixed it | 23:26 |
clarkb | (it is possible they made changes they thought would fix things but for whatever reason are insufficient) | 23:26 |
clarkb | looks like https://zuul.opendev.org/t/openstack/build/65d8dd29a0de4c55ba12eba75156a522/log/logs/fedora_build-succeeds.FAIL.log#1078 is still finding the mirror for some reason | 23:26 |
clarkb | (separate thing) | 23:26 |
corvus | clarkb: well, i think at the meeting today we said opendev wants to run nodepool in podman, so i think opendev cares if nodepool-on-podman works | 23:27 |
clarkb | well that too, it will just take a bit more time to get there. But yes that is doable with an upgrade of builders to jammy and running nodepool-builder with podman | 23:28 |
clarkb | And in the scheme of things, the nodepool builders might be one of the simplest services to swap out because they are almost entirely backend and not user facing (so no one will notice if we take an outage to work out the transition) | 23:29 |
corvus | right. since everything is containerized in the zuul system, node upgrades should be easy/fast. | 23:29 |
corvus | exactly that :) | 23:29 |
clarkb | the transition is the other concern I have and will need to start looking at. I think it is potentially going to lead to noticeable outages for user facing things because we have to stop the service, clean up content, then start it up after fetching images into the podman context | 23:30 |
clarkb | and that is mostly to avoid any unexpected interaction between docker run services and podman run services (since they will want the same ports and stuff) | 23:30 |
corvus | this is where the "don't always try to automatically start everything" approach is handy. we can install both and then manually switch. | 23:31 |
clarkb | ya so maybe we have an update on a service by service basis that stops starting things automatically until that service is moved over or something | 23:32 |
clarkb | then a human can cut it over, land a cleanup change for docker and have podman autostart... | 23:32 |