Thursday, 2025-01-30

opendevreviewJames E. Blair proposed opendev/system-config master: Mirror node 23 container image  https://review.opendev.org/c/opendev/system-config/+/94041900:11
opendevreviewMerged opendev/system-config master: Mirror node 23 container image  https://review.opendev.org/c/opendev/system-config/+/94041900:19
fricklerinfra-root: nb04 has /opted to be full again. maybe looking at transitioning arm builds to zuul would be a better plan than having to keep babysitting that server?05:54
*** ykarel_ is now known as ykarel06:01
*** elodilles_pto is now known as elodilles08:13
tobias-urdinafter the review.opendev.org upgrade to 3.10 it's blazingly fast, something was really improved; it's never been this fast in all the time i can remember, and that's almost ten years by now09:03
tobias-urdinnice work! :)09:04
fungitobias-urdin: that's great to hear! i wonder if it's really gerrit 3.10 performing better, or the fact that we reset the h2 databases for its change caches... maybe the caches being so massive was causing them to do the opposite of their intended purpose12:53
funginormally we preserved the caches, but h2 doesn't really shrink backing files when records are deleted (just flags them so the engine can skip them) and they'd grown so massive over the years that they were causing other problems at shutdown/startup for the gerrit service12:56
fungiso during the last upgrade we decided to remove them and let gerrit start up with cold caches12:58
fungiand now we're thinking we should just do that every time we restart it for upgrades or config changes to keep the caches from growing so large12:58
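A minimal sketch of that cold-cache restart, in Python: stop the service, delete the persistent H2 cache files under the site's cache directory, and start it again so the caches are rebuilt from scratch. The site path and service name are assumptions for illustration, not opendev's production layout.

    #!/usr/bin/env python3
    # Sketch only: drop Gerrit's persistent H2 cache files so they are
    # rebuilt cold on the next startup. Paths/names below are assumed.
    import pathlib
    import subprocess

    GERRIT_SITE = pathlib.Path("/var/gerrit")  # assumed Gerrit site directory
    SERVICE = "gerrit"                         # assumed service name

    subprocess.run(["systemctl", "stop", SERVICE], check=True)
    for cache_file in (GERRIT_SITE / "cache").glob("*.h2.db"):
        print(f"removing {cache_file}")
        cache_file.unlink()
    subprocess.run(["systemctl", "start", SERVICE], check=True)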
fricklerseems github is a bit sad, just in case people notice failures https://www.githubstatus.com/14:48
fungithanks for the heads up!14:52
fungiin other news, i've had a bit of a eureka moment wrt the pypi warnings we started getting a few months ago about non-normalized sdist filenames... newer setuptools fixes that, so my zuul-jobs change to switch from direct setup.py invocation to using pyproject-build will solve it (by pulling in newer setuptools automatically)14:53
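For context on what "non-normalized sdist filenames" means: PEP 625 wants the project name in the sdist filename normalized (lowercased, runs of ".", "-" and "_" collapsed, then "-" replaced with "_"), which newer setuptools does automatically. A rough sketch of that normalization, with a made-up project name:

    import re

    def pep625_sdist_basename(name: str, version: str) -> str:
        # PEP 503-style normalization of the name, then '-' -> '_',
        # roughly what newer setuptools emits for sdist filenames.
        normalized = re.sub(r"[-_.]+", "-", name).lower().replace("-", "_")
        return f"{normalized}-{version}.tar.gz"

    print(pep625_sdist_basename("My.Example-Project", "1.0"))  # my_example_project-1.0.tar.gz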
clarkbyay automatic fixes15:49
fricklerfungi: the job failures in https://review.opendev.org/c/zuul/zuul-jobs/+/940273 are not relevant for https://review.opendev.org/c/zuul/zuul-jobs/+/940273 , do I understand that correctly? are we ready to approve that stack, then?15:54
clarkbfrickler: I think you linked the same change twice. Which failures?15:55
fungifrickler: i assume you mean 940314?15:55
fricklerargl, sorry, I meant https://review.opendev.org/c/opendev/bindep/+/94025815:55
fungiif so, that was an experiment based off a suggestion clarkb made, it's not relevant15:55
clarkbfrickler: I think the failures in 940258 are due to pbr only listing a dependency on setuptools for python3.12 and newer and line 2 in the pyproject.toml dropped setuptools15:57
fungithe failures on the current iteration of 940258 will be fixed once https://review.opendev.org/940118 merges and pbr 6.1.1 is on pypi15:57
clarkbthe next pbr release (either final or beta) will list setuptools for all python versions. So ya those failures are not relevant15:57
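To illustrate the environment-marker behaviour being described, a small sketch using the packaging library; the marker string here is an assumption based on the discussion, not a quote from pbr's metadata:

    from packaging.requirements import Requirement

    # A dependency gated on python_version >= "3.12" simply isn't pulled in
    # on older interpreters, so builds there ended up without setuptools.
    req = Requirement('setuptools; python_version >= "3.12"')
    print(req.marker.evaluate({"python_version": "3.11"}))  # False
    print(req.marker.evaluate({"python_version": "3.12"}))  # True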
fungifrickler: patchset 5 of 940258 is probably a better result to look at15:59
fungithat was the one that added the depends-on to the zuul-jobs change, ps6 was testing what happens when removing setuptools from the build-system.requires in pyproject.toml16:00
fungi(i was testing a variety of different things over the life of that dnm change)16:01
clarkbOnce I've loaded my ssh keys I'm going to clean up the etherpad held node16:01
clarkbI don't think we need it anymore16:01
clarkboh I don't think we need the held grafana node anymore either now that we're going to proxy. I can clean that one up too16:02
clarkbI have cleaned up my etherpad and grafana autoholds16:11
opendevreviewMerged zuul/zuul-jobs master: Add ensure-pyproject-build role  https://review.opendev.org/c/zuul/zuul-jobs/+/94026717:04
opendevreviewMerged zuul/zuul-jobs master: build-python-release: pyproject-build by default  https://review.opendev.org/c/zuul/zuul-jobs/+/94027317:19
clarkbfungi: for bindep we're still waiting on the pbr release right?18:42
clarkbwhich I guess can proceed nowish now that package build tools have updated18:43
fungiclarkb: basically yes, i mean it'll work without pbr 6.1.1 but the simplified build-system.requires won't be viable until it exists19:12
clarkbcorvus: https://review.opendev.org/c/opendev/system-config/+/940403 this is a container image mirroring change that is related to opendev's zuul deployment if you have a moment19:14
corvus+319:20
clarkbthanks19:28
opendevreviewMerged opendev/system-config master: Mirror haproxy container image to opendevmirror on quay.io  https://review.opendev.org/c/opendev/system-config/+/94040319:29
fricklerkolla may be the victim of its self-generated load, but all the timeouts I checked on https://review.opendev.org/c/openstack/kolla-ansible/+/938819 were on rax-dfw19:52
fricklerI've also seen an unusual number of timeouts on requirements checks over the last couple of days and they also seemed to be concentrated on that cloud19:53
fricklernothing we can really act upon I guess, but still worth mentioning I'd think19:53
fungiwhat do requirements checks do that makes them more prone to timeouts?19:54
fricklernothing special, normal tempest/devstack jobs, so I do think there is some general slowness of nodes in that cloud. or maybe IO slowness?19:57
fungioh, you mean jobs run for the openstack/requirements project20:16
clarkbya I mean we've always theorized that we are our own worst noisy neighbors21:21
clarkbI think that the biggest thing we can do to mitigate that is try and improve the efficiency of our jobs particularly when it comes to avoiding heavy swap use. That seems to thrash everything21:22
clarkbalso kolla-ansible running 68 jobs per patchset is something else that might be optimized. For example there are LE specific jobs. Why not just run LE all the time and drop the specific jobs?21:35
clarkbthere are mariadb specific jobs (as opposed to mysql?) maybe just run mariadb all the time?21:36
clarkbthere are ipv6 specific jobs could probably just do ipv6 all the time21:36
clarkb(I don't actually know what divisions make sense to collapse across; I'm just trying to illustrate what it might look like)21:36
clarkbthere are different bifrost and ironic jobs too21:37
clarkband kolla-ansible isn't the only offender; we saw something similar with tacker the other day21:41
clarkbI wonder if this is something we should put on the tc meeting agenda21:44
clarkbfor example tacker runs tacker-ft-v2-df-userdata-basic-max and tacker-ft-v2-df-userdata-basic-min. The -max job runs a single test case that takes 1600 seconds with a total job runtime of 1 hour 18 minutes in this example. The -min job runs 4 test cases that take 350 seconds with a runtime of 52 minutes in this example. Both are 4 node jobs. If we ran the 350 seconds of test cases in21:47
clarkbthe -max job we could save ~45 minutes * 4 test nodes per patchset just by collapsing these two jobs together21:47
clarkbbut there are ~36 * 4 node jobs for each tacker change so that only takes us to 35 * 4 nodes. Still a measurable improvement but a lot more needs to be done21:48
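A back-of-the-envelope version of that estimate, using the numbers quoted above (the rounding to roughly 45 minutes is from the log):

    # Folding the -min job's ~350s of tests into the -max job removes the
    # separate 52-minute, 4-node -min job while adding only ~6 minutes to -max.
    min_job_runtime_min = 52          # wall clock of the -min job
    extra_tests_min = 350 / 60        # tests folded into the -max job
    nodes_per_job = 4

    saved_node_minutes = (min_job_runtime_min - extra_tests_min) * nodes_per_job
    print(round(saved_node_minutes))  # ~185 node-minutes, i.e. ~46 min * 4 nodes per patchset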
clarkbI know once upon a time openstack was super concerned about resource utilization and projects like zuul and starlingx cannibalizing available resources to openstack's detriment, but time and time again we see that it is openstack's own house creating the problems21:48
clarkbanyway I don't want to put a ban hammer on anyone or any job. I just want to ask that developers look at the tests they are running and ask "does this make sense?" "can we do this more efficiently?"21:49
fungiyay! the openafs package maintainer for debian finally uploaded a new enough version to sid to work with linux 6.12 and 6.1323:11
fungii'll finally be able to upgrade my kernel again23:11
