opendevreview | tianyutong proposed openstack/project-config master: Allow tag creation for heterogeneous-distributed-training-framework https://review.opendev.org/c/openstack/project-config/+/953069 | 01:41 |
opendevreview | Merged openstack/project-config master: Allow tag creation for heterogeneous-distributed-training-framework https://review.opendev.org/c/openstack/project-config/+/953069 | 12:31 |
*** haleyb|out is now known as haleyb | 13:12 |
*** mmagr__ is now known as mmagr | 14:28 |
priteau | Hello. I have a job which has been queued for 5+ hours: https://zuul.opendev.org/t/openstack/status?change=952983 | 15:33 |
priteau | kayobe-seed-vm-ubuntu-noble towards the bottom of the list | 15:33 |
clarkb | looks like that job uses this nodeset: https://opendev.org/openstack/kayobe/src/branch/master/zuul.d/nodesets.yaml#L14-L18 which uses a standard ubuntu-noble label (so not arm or nested virt etc) | 15:42 |
fungi | and just a single node | 15:42 |
clarkb | 2025-06-23 15:42:55,713 DEBUG nodepool.PoolWorker.raxflex-sjc3-main: Active requests: ['200-0027294972'] | 15:43 |
clarkb | that's the nodepool provider that has the request in its todo list. It reports there isn't enough quota to fulfill the request. It is possible we have leaked fips there again. I'll check | 15:43 |
clarkb | yes, based on a floating ip listing I believe this is the case. I'll do what I did a week or two ago and delete all the fips that are not attached to anything | 15:44 |
clarkb | dfw3 is in the same situation, so I'll do the same there as well | 15:45 |
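For context, a minimal sketch of the kind of cleanup described above (deleting floating IPs that are not attached to any port), using openstacksdk. The cloud names and the exact filter are assumptions drawn from the discussion, not the commands that were actually run:

```python
# Minimal sketch, assuming openstacksdk and clouds.yaml entries named
# "raxflex-dfw3" and "raxflex-sjc3" (names assumed from the discussion).
# Deletes floating IPs that are not attached to anything.
import openstack

for cloud in ("raxflex-dfw3", "raxflex-sjc3"):
    conn = openstack.connect(cloud=cloud)
    for fip in conn.network.ips():  # list floating IPs in the project
        if fip.port_id is None:  # leaked: not attached to any port
            print(f"deleting unattached fip {fip.floating_ip_address} in {cloud}")
            conn.network.delete_ip(fip)
```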
clarkb | sjc3 api responses are a bit slow so I started with dfw3 | 15:50 |
clarkb | dfw3 is done. sjc3 is in progress but slow due to the api response timing. Hopefully things will be happy in the next 5-10 minutes | 15:53 |
priteau | Thank you! | 16:20 |
priteau | Should I have posted this to #opendev instead? | 16:20 |
clarkb | either is fine, but this is an opendev CI system issue, nothing specific to openstack | 16:21 |
clarkb | looking at grafana graphs I think we also have a number of nodes stuck in a deleting state in sjc3. Possibly due to the api slowness I've observed | 16:25 |
clarkb | I can try to manually delete a node there and see what happens | 16:26 |
clarkb | ok, manually deleting ~3 nodes seemed to get things moving. Possibly, given the api response times, nodepool was hitting some error that short-circuited things until I reduced the total size of the list? I'm not sure. Either way this also seems to have helped | 16:31 |
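A hedged sketch of how one might look for instances that appear stuck in a transient state from the cloud side, again with openstacksdk. The status values, the "deleting" task state, and the age cutoff are illustrative assumptions; in this case the actual cleanup was handled by nodepool and the cloud operator:

```python
# Minimal sketch, assuming openstacksdk; the ERROR status, "deleting"
# task state, and one-hour cutoff are assumptions for illustration.
import datetime
import openstack

conn = openstack.connect(cloud="raxflex-sjc3")  # cloud name assumed
cutoff = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(hours=1)

for server in conn.compute.servers(details=True):
    created = datetime.datetime.fromisoformat(server.created_at.replace("Z", "+00:00"))
    if created < cutoff and (server.status == "ERROR" or server.task_state == "deleting"):
        print(f"possibly stuck: {server.id} {server.name} "
              f"status={server.status} task_state={server.task_state}")
```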
clarkb | the job has a node now too | 16:32 |
fungi | agreed, i see it started running | 16:34 |
clarkb | actually I think my manual deletion just happened to coincide with james denton making a fix on the cloud side | 16:40 |
clarkb | so anyway the cloud helped us out; once that was sorted, nodepool could clean things up normally (except for the fips, since we have fip cleanup disabled) | 16:40 |
clarkb | priteau: thank you for the heads up and I think your change has reported now | 16:41 |
priteau | It did, thanks. | 16:46 |