Tuesday, 2023-05-23

tonyb	clarkb: That's all totally fair.	00:42
tonyb	If I understand the scrollback we can pause/revert the quay.io work, migrate the tooling to podman[1] and then resume the migration	00:43
tonyb	[1] When people other than clarkb are able to push on it	00:44
clarkb	fwiw I can work on it too. I just don't want the expectation to be clarkb is gonna get it all done in a few weeks for quay.io stuff :)	00:44
clarkb	Ideally we can work on it together over time	00:44
tonyb	Yeah. definitely not a clarkb thing.	00:53
tonyb	FWIW, whenever I make suggestions I'm always of the opinion that if I'm not going to do the work I only count as 0.5 of a vote.	00:55
clarkb	it looks like the nodepool podman test is going to or has timed out because image builds weren't happening	01:01
clarkb	I can't poke at that more today. Feel free to if interested	01:02
tonyb	clarkb: totally interested, but also it isn't exactly in my "wheelhouse".	01:05
*** amoralej|off is now known as amoralej		06:33
*** mooynick is now known as yoctozepto		09:14
yoctozepto	morning	09:14
opendevreview	Merged opendev/base-jobs master: buildset-registry: Always use Docker https://review.opendev.org/c/opendev/base-jobs/+/883869	11:45
fungi	yoctozepto: ^ should be able to recheck now	11:55
yoctozepto	fungi: thanks, yeah, it went further: https://zuul.opendev.org/t/nebulous/build/a8d511bf7e6b487684e69210ef59d812 I just need to fix the references	11:58
yoctozepto	and now it works :D	12:12
*** amoralej is now known as amoralej|lunch		12:13
*** amoralej|lunch is now known as amoralej		13:05
fungi	excellent	13:50
*** amoralej is now known as amoralej|off		16:10
fungi	the rackspace tickets i opened yesterday have been acted on, reclaiming 118 nodes worth of capacity for jobs	16:18
clarkb	any indication if we should expect the problem to recur?	16:20
fungi	no clue, but it did cause rackspace support to ask who was opening the tickets since (with both of our accounts) the internal employee advocate is the one whose contact information appears on the account rather than ours	16:21
fungi	apparently the current account contact there is don norton, who was surprised by the ticket	16:22
fungi	it finally happened... https://blog.pypi.org/posts/2023-05-23-removing-pgp/	16:50
opendevreview	Clark Boylan proposed openstack/diskimage-builder master: DNM testing if depends-on parent change works with dib https://review.opendev.org/c/openstack/diskimage-builder/+/883958	17:04
yoctozepto	have you seen this error using the opendev container image promoting job? https://zuul.opendev.org/t/nebulous/build/e2e20e8bf84d4fc9b9b500fd1dea6e0e	17:45
yoctozepto	https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/promote-container-image/tasks/promote-from-intermediate-registry.yaml	17:45
yoctozepto	oh well, it means nothing was obtained from the api, strange	17:46
yoctozepto	argh, a typo	17:47
clarkb	yoctozepto: that means the job is looking for the gate job that built your image	17:47
clarkb	and couldn't find it	17:47
yoctozepto	yeah	17:47
clarkb	(it uses that info to then find the artifact to fetch and promote)	17:47
yoctozepto	off by one letter	17:47
yoctozepto	yeah, that I figured :-)	17:47
yoctozepto	but it's like	17:47
yoctozepto	"huh, it should be there"	17:47
yoctozepto	and then "oh well, one letter off"	17:48
yoctozepto	nighty night!	19:14
tonyb	I have a couple of "how does it work" questions. When people have time?	19:54
fungi	i have time, hopefully even answers, and if you're lucky they'll even be correct	19:55
tonyb	1) For AFS utilization is there a fine-grained way to see how much $something is using? Currently I'm looking at https://grafana.opendev.org/d/9871b26303/afs?orgId=1 for a general sense but if I wanted to see how much we'd reclaim if we removed $x from storage where should I go?	19:55
fungi	if you want to install the openafs client locally, you can check quotas with fs subcommands	19:56
clarkb	you can also look at the rsync logs iirc they are in afs too	19:56
clarkb	and rsync has size info	19:56
tonyb	2) this one is more basic, how do I find which jobs/builds are using the Ubuntu cloud archive? I'm just using stable/$branch && ubuntu as a proxy but I don't know if that's valid	19:57
tonyb	these both came up from me looking at: https://review.opendev.org/c/opendev/system-config/+/883468	19:57
fungi	fungi@dhole:~$ fs listquota /afs/.openstack.org/docs	19:57
fungi	Volume Name Quota Used %Used Partition	19:57
fungi	docs 50000000 30206816 60% 77%	19:57
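
The listquota output above can be turned into numbers for comparison across volumes. This is a minimal sketch, not part of any OpenDev tooling: the function name `parse_listquota` is hypothetical, it assumes the header line and column order shown above, it assumes a numeric quota, and it assumes the figures are reported in kilobyte blocks.

```python
def parse_listquota(output):
    """Parse `fs listquota` text into a list of dicts.

    Assumes the first line is the header and that each data line starts
    with: volume name, quota, used (both taken to be kilobyte blocks).
    """
    rows = []
    for line in output.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip anything that is not a data row
        name, quota, used = parts[0], int(parts[1]), int(parts[2])
        rows.append({
            "volume": name,
            "quota_kb": quota,
            "used_kb": used,
            "free_kb": quota - used,
            "percent_used": round(100 * used / quota),
        })
    return rows


sample = """Volume Name Quota Used %Used Partition
docs 50000000 30206816 60% 77%
"""
for row in parse_listquota(sample):
    print(row["volume"], row["free_kb"], row["percent_used"])
```

Note that this only reports per-volume usage, so it answers "how full is a volume" rather than "how much does one subtree of a volume use".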
fungi	tonyb: you might be able to query for a uca url in opensearch?	19:58
tonyb	Okay. I'll look at that. Having OpenAFS locally is a little complex due to packaging on Fedora but I can make it work	19:58
fungi	if you have a debian vm you could just apt install it	19:59
tonyb	Okay, so opensearch .... I don't know about that, is that essentially the old logstash?	19:59
tonyb	fungi: very true, I could just do that.	20:00
fungi	essentially, except it's being run by openstack community volunteers	20:00
fungi	the project team guide has details i think, or maybe the tact sig page on governance... checking	20:00
tonyb	Ahhh okay that's why I couldn't find it when I went poking in git	20:00
tonyb	clarkb: rsync logs are .... https://static.opendev.org/mirror/logs/rsync-mirrors/ ?	20:00
fungi	tonyb: https://governance.openstack.org/sigs/tact-sig.html#opensearch	20:01
fungi	tonyb: yes	20:02
fungi	for the mirror content rsyncing logs	20:02
tonyb	fungi: Thanks x2	20:02
fungi	you bet	20:04
fungi	if you have any other questions, i'm happy to answer any time i'm awake	20:04
tonyb	I don't see anything "ubuntu" in .../mirror/logs	20:04
tonyb	fungi: thanks	20:04
fungi	ubuntu (and debian) mirrors are not mirrored with rsync	20:04
fungi	they use a tool called reprepro	20:04
tonyb	These 6 months suck WRT tz overlap	20:05
fungi	we may not be splatting or copying reprepro logs into afs	20:05
fungi	yet	20:05
fungi	but that's something we could add, i'm sure	20:05
tonyb	Okay: https://static.opendev.org/mirror/logs/reprepro/	20:05
tonyb	those logs don't help with the size thing	20:06
fungi	that was quick!	20:06
tonyb	It's next to rsync in the list ;P	20:06
fungi	what's your size dilemma you need to answer?	20:08
tonyb	fungi: It isn't a dilemma as such. I was curious how much AFS we'd get back if we merged: https://review.opendev.org/c/opendev/system-config/+/883468/ which would stop mirroring older UCA things	20:10
fungi	oh, got it	20:10
fungi	it's hard to know precisely because debian package repositories are often deduplicated in order to avoid carrying identical copies of packages which might be the same in more than one distribution release	20:11
fungi	they don't use completely separate file trees like other distros tend to	20:11
tonyb	Ahh of course the "pool" concept.	20:12
fungi	yes, exactly	20:12
fungi	reprepro deletes any packages not still referenced in the indices	20:12
fungi	so removing an index will free up the space needed by any packages which are only listed in that index, but packages which were also listed in other indices are retained	20:13
fungi	uca may trivially not pool its packages, so we might simply be able to du a subtree to get a good guess	20:13
tonyb	Okay. I understand. This has been helpful.	20:14
clarkb	ya I think UCA is pretty well segregated by openstack release and ubuntu release	20:14
fungi	nevermind, uca is also pooled	20:14
fungi	but we might be able to estimate it by parsing file sizes out of the indices	20:15
clarkb	I restarted the merger on zm06	20:17
fungi	the "size" fields in indices like /afs/.openstack.org/mirror/ubuntu-cloud-archive/dists/bionic-updates/rocky/main/binary-amd64/Packages.gz	20:17
fungi	tonyb: ^ you could collect up all the relevant indices and then parse those files	20:17
tonyb	Okay.	20:17
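
The estimate fungi describes (sum the "Size" fields of packages that appear only in the indices being dropped) could be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not an existing OpenDev script: the function names are hypothetical, it only looks at binary `Packages` indices (not `Sources` or the index files themselves), and it assumes each stanza carries `Filename:` and `Size:` fields.

```python
import gzip
from pathlib import Path


def parse_packages(text):
    """Yield (filename, size_in_bytes) for each stanza of a Packages index."""
    for stanza in text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines (e.g. long descriptions) start with whitespace.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        if "Filename" in fields and "Size" in fields:
            yield fields["Filename"], int(fields["Size"])


def read_index(path):
    """Return the text of a Packages or Packages.gz file."""
    path = Path(path)
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            return handle.read()
    return path.read_text(encoding="utf-8")


def reclaimable_bytes(removed_texts, kept_texts):
    """Sum sizes of pool files referenced only by the removed indices."""
    kept = {name for text in kept_texts for name, _ in parse_packages(text)}
    only_removed = {}
    for text in removed_texts:
        for name, size in parse_packages(text):
            if name not in kept:
                # Keyed by pool path so a file shared by two removed
                # indices is counted once.
                only_removed[name] = size
    return sum(only_removed.values())
```

To use it, read every index that the change would stop mirroring with `read_index` into one list and every index that stays into the other, then pass both to `reclaimable_bytes`. Keying on the pool path is what accounts for the pooling discussed above: a package still listed by a retained index is not counted as freed.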
clarkb	fungi: does du work against openafs mounted content?	20:18
tonyb	I think so.	20:18
clarkb	ya so du might work. Might also be a bit slow as it stats everything	20:18
tonyb	It was mostly I did one review and came up with a bunch of impacts from it and I couldn't really answer any of them so I figured 1) my review wasn't super helpful; and 2) I needed to ask :)	20:20
clarkb	tonyb: re fedora + afs you might get away with kafs though I'm not sure if that gets you userland support you might need	20:21
clarkb	I tried kafs on opensuse a while back and it didn't work but that was a while ago	20:21
tonyb	I don't know about the userland stuff. ianw suggested it, tried it and immediately found it is non-functional ATM :/	20:21
clarkb	at one point I had my fileserver doing openafs mounts for me because it is ubuntu based (with zfs!)	20:23
tonyb	nice.	20:23
fungi	clarkb: du won't work because, as i said, uca is pooled after all	20:30
fungi	there's not separate subdirectories for each set of packages	20:30
tonyb	Also using opensearch answered my which (UCA) releases are still in use question	20:31
fungi	cool	20:31
clarkb	fungi: I just mean generally. Du doesn't work on some filesystems. btrfs in particular has tripped me up because it gives you some naive view	20:47
clarkb	though I think that may have improved over time	20:48
fungi	i've used it with afs in the past	20:48
fungi	though it can take a while	20:49
clarkb	ya lots of stats is slow iirc	20:50
fungi	fungi@dhole:~$ du -s /afs/.openstack.org/mirror/ubuntu-cloud-archive	20:53
fungi	6583494 /afs/.openstack.org/mirror/ubuntu-cloud-archive	20:53
opendevreview	Clark Boylan proposed openstack/diskimage-builder master: fedora: don't use CI mirrors https://review.opendev.org/c/openstack/diskimage-builder/+/883798	22:47
clarkb	fungi: ianw ^ I think that should fix the most recent error	22:48
fungi	ah, cool!	22:48
fungi	yep, so it was trying to use the mirrors which were no longer set	22:50
clarkb	corvus: moving here because it's more opendev specific. I think our system-config-run-zuul jobs deploy and configure a zookeeper for ssl and all that right? we should be able to adapt that to the nodepool job and then maybe even have that job build an image to test things end to end?	22:51
clarkb	I think the nodepool job currently doesn't do any workload because there is no zookeeper present	22:52
corvus	clarkb: yeah, that could be done... but... two things: 1) that will take ages assuming a production image, and if we use a dummy image, i'm not sure that adds anything; 2) we CD nodepool, so that kind of breakage is more likely to come from the nodepool repo than system-config	22:59
corvus	clarkb: also 3) it shouldn't be necessary once we move image building into jobs so may not be a great investment	22:59
clarkb	that is a good point	23:03
opendevreview	James E. Blair proposed opendev/system-config master: WIP: Test zuul on jammy https://review.opendev.org/c/opendev/system-config/+/883986	23:07
opendevreview	James E. Blair proposed opendev/system-config master: WIP: Test nodepool on jammy https://review.opendev.org/c/opendev/system-config/+/883987	23:11
corvus	clarkb: is there any current testing that would actually exercise that nested podman issue?	23:15
corvus	https://github.com/containers/podman/issues/14884	23:16
clarkb	corvus: I think just what nodepool's testing was doing before the quay move broke speculative gating	23:17
corvus	yeah, it looks like the container-release job should do that	23:17
clarkb	we could do the workaround with skopeo and run it under docker instead of podman	23:18
clarkb	and have a one off job on the side sort of deal just to cover that case	23:18
clarkb	that probably wouldn't be too terrible since we can isolate the job	23:18
corvus	hrm? in nodepool repo? i don't think that's necessary...	23:18
corvus	i just want to know if https://review.opendev.org/883952 means it really worked	23:19
corvus	and it looks like it did... though i should probably update that to also remove your sudos	23:20
clarkb	corvus: but that would be podman nested in podman	23:20
corvus	right, which is what we want	23:20
clarkb	ah I see. I was confused I think due to the concern about how opendev is still podman in docker	23:21
corvus	i think the specific question was: with https://github.com/containers/podman/issues/14884 merged can we now remove the cgroup hack	23:21
clarkb	but ya I think that shows podman in podman is fine. The original issue was podman in docker (not sure if it exhibited with podman in podman or not)	23:21
clarkb	corvus: right but the original issue was filed specifically as podman in docker being problematic. Unknown if the same issue existed as podman in podman	23:22
clarkb	We can test the podman in docker case if we revert the podman change and use the skopeo hack or just run a separate job instead of a revert that does that	23:22
clarkb	anyway I think it is probably sufficient to land that and if it breaks opendev we can revert and tackle with more robust testing	23:22
clarkb	since the impact will be low	23:22
corvus	yeah. i think if we can run podman-in-podman as a normal user without the cgroup hack in, oh, say about a month after debian releases, then i think we're in a good place. i think that's the key thing that, from opendev's perspective, would weigh in on whether it's okay to start landing the podman changes in the zuul project.	23:23
corvus	put another way, if we can clear out the cgroup hack, then i think we're good to land the podman switch for now (with the cgroup hack in place and the sudo workaround); ignore everything in opendev because nothing substantial is changing, then land the cgroup cleanup later.	23:24
corvus	if the cgroup cleanup doesn't work in our desired end-state, then i think opendev should raise that with the zuul project as a reason to hold off/reconsider podman	23:25
clarkb	I think the only thing that opendev really cares about is whether or not podman in docker would work. Everything else should be well covered.	23:26
clarkb	And the only reason that is at question is we don't know what sort of testing podman upstream did when they fixed it	23:26
clarkb	(it is possible they made changes they thought would fix things but for whatever reason are insufficient)	23:26
clarkb	looks like https://zuul.opendev.org/t/openstack/build/65d8dd29a0de4c55ba12eba75156a522/log/logs/fedora_build-succeeds.FAIL.log#1078 is still finding the mirror for some reason	23:26
clarkb	(separate thing)	23:26
corvus	clarkb: well, i think at the meeting today we said opendev wants to run nodepool in podman, so i think opendev cares if nodepool-on-podman works	23:27
clarkb	well that too, it will just take a bit more time to get there. But yes that is doable with an upgrade of builders to jammy and running nodepool-builder with podman	23:28
clarkb	And in the scheme of things swapping out nodepool builders might be one of the simplest services because it is almost entirely backend and not user facing (so no one will notice if we take an outage to work out the transition)	23:29
corvus	right. since everything is containerized in the zuul system, node upgrades should be easy/fast.	23:29
corvus	exactly that :)	23:29
clarkb	the transition is the other concern I have and will need to start looking at. I think it is potentially going to lead to noticeable outages for user facing things because we have to stop the service, clean up content, then start it up after fetching images into the podman context	23:30
clarkb	and that is mostly to avoid any unexpected interaction between docker run services and podman run services (since they will want the same ports and stuff)	23:30
corvus	this is where the "don't always try to automatically start everything" approach is handy. we can install both and then manually switch.	23:31
clarkb	ya so maybe we have an update on a service by service basis that stops starting things automatically until that service is moved over or something	23:32
clarkb	then a human can cut it over, land a cleanup change for docker and have podman autostart...	23:32
