Wednesday, 2023-10-18

gthiemongeHi Folks, one of our CI runs has been stuck for 15h:
gthiemongecan you do something?07:10
fricklergthiemonge: hmm, that queued job should have resulted in a node failure since we no longer have c8 non-stream nodes. I would however ask you to keep it in this state if it is not urgent so corvus has a chance to look at this later07:44
frickleryou still should make a zuul config change to drop this job completely if you cannot convert it to centos-8-stream07:45
gthiemongefrickler: ack, NP, we can keep it for the moment, and we will update our zuul configs to remove centos8, thanks!07:47
corvusgthiemonge: frickler the issue is that the linaro-regionone-main poolworker on nl03 has not had a chance to accept or decline the request yet; that is due to a persistent ssl certificate validation error on the linaro cloud.  if that is corrected, the request will be declined; if we remove that cloud from service, we'll need to dequeue that item from zuul in order to cancel the request.13:36
fricklercorvus: ah, thx for digging. a new PS/rebase should cancel that buildset, too, right? so gthiemonge could make their patch to drop the job and then rebase this one on top13:51
Clark[m]frickler: the first thing I would check for is if gitea has some sort of project reindex command/API like gerrit. Maybe it can check for tags directly rather than us needing to push again14:13
fricklerjust for reference this is a continuation of a discussion in #openstack-release about some tags not showing up in the gitea UI, like for devstack only three tags are listed14:15
clarkbI'm not seeing an api to rescan a project. There is apparently a way to "adopt" repos off of disk but I think that only works for repos that gitea doesn't already know about15:31
clarkbI suspect that running the adoption process would fix this15:32
clarkbfrickler: do you have time for a quick review on to fixup some of our job dependencies after container updates?15:34
clarkbcorvus: frickler: re linaro I thought we had certcheck checking that cloud?15:34
clarkbbut I guess we should send email to kevinz about it15:34 5000 is listed in our explicit certcheck list15:38
clarkbmaybe that is the wrong name or one of several?15:38 is what nodepool talks to. I'll propose a change to update our certcheck rule and write an email15:42
clarkbyou can see the periodic jobs enqueue into zuul on the zookeeper data size graphs in grafana16:13
clarkbI removed the relevant file entries from all the jobs as I'm not sure we want that to happen generally16:21
clarkbwe got the data we need and now that change is in a mergeable state16:22
Clark[m]RH emails came through. As far as I know we didn't change anything on our side19:31
JayFI'll note I'm getting duplicates of emails through the list from redhat employees19:49
JayFHmm. Maybe not, looks like minor differences in them19:50
JayFlikely human-duplicates from the other weirdness19:50
clarkbJayF: yes. Best I can tell people sent emails multiple times because they weren't going through. Now that whatever caused the holdup is apparently fixed all of them have gone through19:51
clarkbthe easiest tell is the timestamps on the emails19:51
JayFI don't always assume timestamps are accurate representatives of the issue when dealing with stuff from an ML, but yeah, it's human-dupes for sure19:52
johnsomHi neighbors, is there a plan for Ubuntu 23.10 nodepool (nested-virt-ubuntu-mantic) nodesets? They have added the igb qemu/libvirt support which I would like to play around with having pure virtual SR-IOV VFs for Octavia. I know it's not in the PTI, but wondering if it will be available anyway.21:10
clarkbjohnsom: no we don't do non lts nodes. They don't have a long enough support timeframe to make it worthwhile21:10
clarkbbasically the same issue that led to us removing fedora21:11
johnsomAck, NP. Just trying to understand the time horizon for an upstream gate. Thanks!21:11
clarkbwhat version of libvirt/qemu do you need?21:12
clarkbmantic is libvirt 9.6.0 and qemu 8.0.421:14
johnsomlibvirt 9.3.0 and qemu 8.021:14
johnsomYeah, I think mantic is the first distro to ship with a compatible version21:15
clarkbya just wondering if UCA might carry that stuff21:16
johnsommaybe? I can get everything ready to go on my local VMs so it's ready for April / 2024.2.21:17
clarkbI don't see newer libvirt and qemu in UCA unfortunately21:18
johnsomI would like to have a tempest job that uses SR-IOV ports in Octavia. I have hardware to develop with locally, but eventually would like an upstream test for it. The igb driver should let us do that.21:18
clarkbof course uwsgi doesn't want to build with new python21:21
* clarkb wonders if we should be using something other than uwsgi. It always breaks21:22
johnsomIt's been pretty good for us (Octavia)21:23
clarkbthis time around it looks like normal to be expected compilation errors with a new python. But in the past it has been very flaky building due to bugs in uwsgi itself21:24
clarkbiirc one of the issues was non determinism in the build order process creating errors. So you had to build single threaded and hope you don't get a bad order21:25
JayFclarkb: when you say things like that, you make me wanna go look to see if gentoo carries patches on it :D 21:25
JayFthe py maintainer there is very very peculiar about things like that 21:25
clarkbJayF: we set pip verbosity to a high level because that slowed things down or whatever and made the builds more reliable21:26
clarkbit was all terrible hacks and super frustrating21:26
JayFclarkb: there's basically a strong negative correlation between "length of gentoo ebuild" and "quality of program being built" and the uwsgi one says "yikes" now that I read it21:27
johnsomLooks like the fix merged two weeks ago:
JayFI don't know enough about their problem set to know if the yikes is justified or not though :D21:28
clarkbthat's a partial fix fwiw21:28
clarkbor at least it started that way. isn't clear if what merged actually runs properly21:29

