Wednesday, 2025-09-24

<corvus> clarkb: https://review.opendev.org/962146 [00:18]
*** mrunge_ is now known as mrunge [07:38]
*** dmellado7 is now known as dmellado [09:23]
*** mtreinish_ is now known as mtreinish [11:49]
<opendevreview> yatin proposed zuul/zuul-jobs master: [WIP] Make fips setup compatible to 10-stream  https://review.opendev.org/c/zuul/zuul-jobs/+/961208 [11:58]
<mnasiadka> Just wondering about ^^ - wouldn’t it be easier to just build the fips enabled image? [12:33]
*** ykarel_ is now known as ykarel [12:35]
<ykarel> mnasiadka, i don't have full background on building fips images but there were some discussions at least in tc channel as mentioned by sean-k-mooney https://meetings.opendev.org/irclogs/%23openstack-qa/%23openstack-qa.2025-09-16.log.html#openstack-qa.2025-09-16.log.html#t2025-09-16T11:21:28 [12:40]
<mnasiadka> ykarel: if that works and doesn’t prove problematic - sure, but I read the Fedora page about FIPS, and it seems the recommended way is to have a fips enabled image - and if we only need that for CentOS Stream 10, Rocky 10 and maybe Alma 10 in future - it’s not THAT many images to maintain. [12:42]
<opendevreview> yatin proposed zuul/zuul-jobs master: [WIP] Make fips setup compatible to 10-stream  https://review.opendev.org/c/zuul/zuul-jobs/+/961208 [12:48]
<ykarel> yes if that's done these custom roles will also not be needed [12:48]
<Clark[m]> mnasiadka: ykarel: I have a strong desire to not build fips specific images for opendev for two reasons. The first is due to the effort of building images. It is already rare that all image builds reliably succeed. Every new image has an impact on that. Doubling our image count is much worse. And second because every image is tech debt that the opendev team typically ends up responsible for cleaning up when everyone else has stopped caring enough [13:20]
<Clark[m]> to be involved. Removing old test platforms has not been fun lately [13:20]
<Clark[m]> Also worth keeping in mind that food is a fairly specific use case and one that is tied to a specific set of government requirements for which there are equivalents from other governments. I don't want to triple our images for another competing standard [13:22]
<Clark[m]> Heh "food" yay auto complete. *fips [13:22]
<fungi> yeah, for an international collaboration, it seems sort of backwards to go out of our way to support compliance with a government/military standard for one country that represents only 16% of openstack project contributors and 12% of merged changes in the current release [13:30]
<fungi> the eu is several times more involved (literally) in developing openstack these days than all of north america [13:31]
<mnasiadka> Clark[m]: Actually I thought FIPS is more useful than it is, so agree it doesn’t make any sense ;-) [13:42]
<fungi> fips actually forbids a number of newer algorithms simply because the process for getting them added to the standard is so very, very slow [13:50]
<fungi> fips compliance doesn't help you run a more secure system, it helps you run a system that meets specific requirements for usa military and government use [13:51]
<fungi> it's the usa's "federal information processing standard" [13:53]
<opendevreview> yatin proposed zuul/zuul-jobs master: [WIP] Make fips setup compatible to 10-stream  https://review.opendev.org/c/zuul/zuul-jobs/+/961208 [14:05]
*** diablo_rojo_phone_ is now known as diablo_rojo_phone [14:19]
*** diablo_rojo_phone is now known as Guest27275 [14:20]
*** masayukig_ is now known as masayukig [14:20]
*** mnasiadka_ is now known as mnasiadka [14:26]
*** mtreinish_ is now known as mtreinish [14:26]
*** TheJulia_ is now known as TheJulia [14:26]
*** prometheanfire is now known as Guest27314 [14:28]
*** hashar is now known as Guest27315 [14:28]
*** masayukig_ is now known as masayukig [14:39]
*** mtreinish_ is now known as mtreinish [14:39]
*** johnsom_ is now known as johnsom [14:39]
*** mnaser_ is now known as mnaser [14:39]
<clarkb> as a heads up my ISP was not able to make it out yesterday afternoon and rescheduled to this morning [14:55]
<fungi> fun! [14:59]
<fungi> i'll be disappearing to run an errand and get an early dinner around 18:30-21:00ish myself [15:00]
<opendevreview> yatin proposed zuul/zuul-jobs master: [WIP] Make fips setup compatible to 10-stream  https://review.opendev.org/c/zuul/zuul-jobs/+/961208 [15:23]
<opendevreview> Jan Gutter proposed zuul/zuul-jobs master: Fix up some EL10 compatibility  https://review.opendev.org/c/zuul/zuul-jobs/+/962194 [15:29]
*** cardoe_ is now known as cardoe [15:32]
<opendevreview> Stephen Finucane proposed openstack/project-config master: Initiate retirement of shade  https://review.opendev.org/c/openstack/project-config/+/961522 [15:53]
<opendevreview> Stephen Finucane proposed openstack/project-config master: Retire shade  https://review.opendev.org/c/openstack/project-config/+/961524 [15:53]
*** gthiemon1e is now known as gthiemonge [16:01]
<opendevreview> Merged openstack/project-config master: Initiate retirement of shade  https://review.opendev.org/c/openstack/project-config/+/961522 [16:08]
*** efoley_ is now known as efoley [17:05]
<fungi> heading out for errands/dinner but will check back in probably around 21:00 utc [18:33]
<clarkb> enjoy! I'm still waiting on the isp. Things are happier this morning and they did do things remotely yesterday but still wanted to send someone out. Hopefully I get this resolved soon [18:34]
<Clark[m]> ISP is here and I'm on mobile data only for the moment [20:07]
<clarkb> ok I think I'm back now [20:48]
<corvus> looks like you are [21:00]
<clarkb> I disconnected for a short bit again to work on the rat's nest of cables. The new ONT is POE which frees up a power outlet in the garage (yay!) but adds more devices to my corner of network doom [21:09]
<clarkb> they basically decided that my ONT is so old that it's likely it was the source of my problems and they should just replace it so they did [21:10]
<clarkb> I need to do more organization work, but I'm reasonably happy with it for now [21:12]
<fungi> here's hoping it holds fast tomorrow morning [21:39]
<clarkb> I just did a half hour of successful pings to review (after doing cable reorg) with no loss so that is a good sign at least [21:42]
<clarkb> I am taking this as an opportunity to do some long overdue local device organization and cleanup [22:17]
<fungi> cable management is necessary [22:18]
<clarkb> ya part of the problem is that my little network corner grew from a small router device + access point to a router + ap + switch to a router + ap + switch + fileserver to a router + ap + switch + fileserver + serverserver and recently I put a printer into the mix and then today there is a new poe injector for my WAN connection [22:20]
<clarkb> so I shut down everything I could get away with and started redoing where things lived rather than them being in whatever space was convenient at the time [22:20]
<fungi> don't forget the ups [22:24]
<clarkb> oh ya that is on the floor collecting dust. But also because the poe injector is on the ups my home network should still survive power outages for a couple of hours at least (I think the longest outage we've had was just under 4 hours and it made it that long) [22:25]
<fungi> i wall-mount all that stuff together to keep the mess out of view and also make it easier to work on [22:27]
<corvus> clarkb: the necessary changes for the launcher have merged and promote was successful; it should be okay to pull and restart the launchers, then configure flex in zuul-providers the way we want [22:27]
<clarkb> corvus: ++ considering I'm out tomorrow do we want to wait for me to do that Friday? Also do we want to remove the testing config or should we leave that for now to confirm things look good after a restart? [22:28]
<clarkb> though I guess I don't know how to lookup the list of label networks via the repl [22:28]
<corvus> clarkb: i am 99% confident in this fix so i feel like it's reasonable to confirm it in prod and then just revert if i'm wrong. [22:29]
<clarkb> I guess we can leave the clouds.yaml content as is for now (and in fact need to to avoid another outage). The first step is to restart launchers [22:29]
<clarkb> once launchers are restarted re-add the network config in zuul provider config then remove the clouds.yaml and restart launchers again [22:30]
<corvus> ++ [22:30]
<clarkb> I can go ahead and restart launchers now and get a revert of the zuul-provider config updates pushed [22:30]
<clarkb> I'll pull images first to make sure I've got the new stuff [22:30]
<corvus> that sounds good.  if the launchers blow up today we can deal with that now.  then you can do the other stuff with known good launchers on friday. [22:31]
<clarkb> ya [22:31]
<clarkb> zl01 is restarted. Once it looks like it is trying to boot things I'll do zl02 [22:32]
<clarkb> corvus: Exception: Unable to find flavor: gp.5.4.4 <- this is for sjc3 so I'm going to check that really quick before doing zl02 [22:33]
<clarkb> corvus: oh that's a me bug. I copied the dfw3 config when I made the fake sjc3 config [22:36]
<clarkb> corvus: it also appears to be in a tight loop trying to delete images that are in use in openmetal. It isn't clear to me if that is preventing it from proceeding to booting nodes. The log is so full of these errors that it's hard to tell if it is doing real work too or if this is a prestep that is stuck [22:37]
<clarkb> corvus: looking at grep -v ERROR I think it may not be doing any real work as a result of this [22:38]
<corvus> i'll take a look [22:38]
<corvus> yeah it's launching nodes [22:39]
<clarkb> corvus: is there an easy way to tell? [22:40]
<clarkb> also any concern about this filling the disk? [22:40]
<opendevreview> Clark Boylan proposed opendev/zuul-providers master: Fix test sjc3 region flavors  https://review.opendev.org/c/opendev/zuul-providers/+/962233 [22:41]
<clarkb> that is the fix for the silly flavor issue [22:41]
<corvus> definitely a concern about it filling the disk, but that's not a new concern [22:41]
<corvus> 2025-09-24 22:38:51,142 DEBUG zuul.Launcher: [e: 4aaa4354e3dc44739ba3f71558423eaa] [req: 8c817375d1ba444e83edc53c2739dd84] Building node <OpenstackProviderNode uuid=2fde142880b8415a85f219d1d12d3852, label=ubuntu-focal, state=requested, provider=opendev.org%2Fopendev%2Fzuul-providers/rax-ord-main> [22:41]
<corvus> that log line tells us it's building nodes [22:41]
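
The check corvus describes can be sketched in Python as a way to see past the ERROR noise. This is a minimal, hypothetical helper and not part of Zuul: the regular expression is modeled only on the single "Building node" line quoted above, the name `building_nodes` is made up for illustration, and the ERROR entry in the sample is a placeholder rather than real launcher output.

```python
import re

# Assumption: the field layout is inferred from the one log line quoted in the
# channel; real launcher logs may contain other variants of this message.
BUILDING = re.compile(
    r"DEBUG zuul\.Launcher: .*Building node <\w+ uuid=(?P<uuid>[0-9a-f]+), "
    r"label=(?P<label>[^,]+), state=(?P<state>[^,]+), "
    r"provider=(?P<provider>[^>]+)>"
)


def building_nodes(lines):
    """Yield (label, provider) for each 'Building node' line, skipping ERROR lines."""
    for line in lines:
        if " ERROR " in line:
            continue  # same idea as `grep -v ERROR`
        match = BUILDING.search(line)
        if match:
            yield match.group("label"), match.group("provider")


sample = [
    # Placeholder standing in for the repeated image-delete errors.
    "2025-09-24 22:38:50,000 ERROR zuul.Launcher: (placeholder error text)",
    # The line quoted in the channel.
    "2025-09-24 22:38:51,142 DEBUG zuul.Launcher: "
    "[e: 4aaa4354e3dc44739ba3f71558423eaa] "
    "[req: 8c817375d1ba444e83edc53c2739dd84] Building node "
    "<OpenstackProviderNode uuid=2fde142880b8415a85f219d1d12d3852, "
    "label=ubuntu-focal, state=requested, "
    "provider=opendev.org%2Fopendev%2Fzuul-providers/rax-ord-main>",
]

found = list(building_nodes(sample))
print(found)
```

Any non-empty result would suggest the launcher is still processing node requests despite the error loop; an empty result is not proof that it is stuck, since the pattern only covers the one message shape seen here.
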
<corvus> it would be worth understanding why we still have nodes using an image from 10 days ago [22:43]
<corvus> https://zuul.opendev.org/t/zuul/provider/openmetal-iad3-main/image/ubuntu-noble [22:43]
<clarkb> ya I started looking into that and did confirm at least one of the images is from september 14 [22:44]
<corvus> 10b4689e46af4cc986dd587163e56e5c is one of the ones it's trying to delete [22:44]
<clarkb> corvus: do you think I should proceed with zl02 at this point? e.g. it's safe given the issues I found appear to be older and also didn't stop startup? [22:44]
<corvus> yep [22:44]
<clarkb> ok doing that now then I'll see what I can learn about the images in openmetal [22:44]
<corvus> clarkb: could be the gerrit test node -- https://zuul.opendev.org/t/openstack/nodes [22:45]
<clarkb> and whatever 144bc90c6f094faf81f627f84f980e6c is maybe [22:46]
<clarkb> I really dislike that glance can't figure out reference counting [22:46]
<corvus> well the gerrit test node is from 9 days ago, the other one is newer [22:46]
<clarkb> zl02 has been restarted [22:47]
<clarkb> on the new image version [22:47]
<corvus> but yes, as a user i do not understand why i can't delete an image that was previously used to launch a server [22:47]
<clarkb> e77e2cc4-9966-484a-9c56-062e394e5bcd and 1f68c718-c68f-4613-b32b-4da34eb0192c appear to be the cloud side image ids [22:48]
<corvus> i haven't checked but i'd guess the other one is jammy since that's also a gerrit node in openmetal [22:48]
<clarkb> e77 is noble and 1f68 is jammy [22:48]
<clarkb> there is a good chance that gerrit is noble and bridge is jammy and it's that pair of held nodes [22:49]
<clarkb> I don't mind deleting those two and getting new held nodes if we think that is a good idea to avoid disk filling? though if this has been happening for ~9 days then we're probably ok? [22:49]
<corvus> yeah, we're probably leveled out. :) [22:51]
<corvus> since i think our rotation is 7 days [22:51]
<opendevreview> Clark Boylan proposed opendev/zuul-providers master: Revert "Remove raxflex networks config"  https://review.opendev.org/c/opendev/zuul-providers/+/962235 [22:52]
<clarkb> corvus: I wonder if we should slow down the delete requests though? [22:53]
<clarkb> corvus: also I guess you should let me know if 962233 is better off being a change that removes the test provider stuff entirely [22:54]
<clarkb> not sure how much value there is at this point [22:54]
<clarkb> confirmed np0ad4c944a1544 is booted on 1f68c718-c68f-4613-b32b-4da34eb0192c and npca82fb44ea454 is booted on e77e2cc4-9966-484a-9c56-062e394e5bcd so my held nodes are the cause of that error loop [22:57]
<opendevreview> Clark Boylan proposed opendev/system-config master: Revert "Reapply "Select the network to use in raxflex""  https://review.opendev.org/c/opendev/system-config/+/962237 [23:11]
<clarkb> and that is the change to clean up clouds.yaml. I'll pick this back up on Friday and am happy to babysit those changes and ensure we don't get duplicate networks again [23:12]
*** iurygregory_ is now known as iurygregory [23:41]
<clarkb> there is a docker hub outage right now. Just a heads up in case people wonder why jobs might be failing [23:44]
<tonyb> Looking at refreshing https://review.opendev.org/c/opendev/system-config/+/934937 do we really need to use the latest pip? These days isn't the distro packaging for pip+venv adequate? [23:54]
<tonyb> If we installed the distro pip+venv then we could update the calls into the pip module with the appropriate executables and virtualenv options, and maybe add a symlink? [23:55]
<tonyb> Like docker/podman we could make the jammy->noble transition the point at which things change [23:56]
<tonyb> I think that'd be neater than what I did in 934937 [23:57]
