Friday, 2024-09-27

*** ramereth[m]1 is now known as Ramereth[m]00:55
*** elodilles_pto is now known as elodilles07:00
mnasiadkaHas anything on CentOS 9 Stream nodes changed? Ansible started complaining about unsupported locale (https://38bbf4ec3cadfd43de08-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf2.rackcdn.com/930689/1/check/kolla-ansible-centos9s/cc5c91b/primary/logs/ansible/bootstrap-servers)09:55
fricklermnasiadka: I'm not aware of any change on the opendev side. to me it seems very likely that it is once again the stream that is floating10:29
* zigo really hates the move from release names to release dates. Can we revert that decision and use sane names for stable branches? :/12:56
fungithat's not an opendev decision. you want #openstack-tc12:59
zigoRight.13:04
opendevreviewJeremy Stanley proposed openstack/project-config master: Temporarily remove release docs semaphores  https://review.opendev.org/c/openstack/project-config/+/93070914:27
opendevreviewJeremy Stanley proposed openstack/project-config master: Revert "Temporarily remove release docs semaphores"  https://review.opendev.org/c/openstack/project-config/+/93071014:27
clarkbmnasiadka: frickler: looking at build logs on nb01.opendev.org and nb02.opendev.org for centos 9 stream and grepping for things like locale and the en_US.UTF-8 value we set things to, I don't see any obvious difference between the most recent image build and previous ones14:57
clarkbmnasiadka: frickler: I do note that we run `echo '%_install_langs C:en_US:en_US.UTF-8' | sudo tee -a /opt/dib_tmp/dib_build.Pdtq2kFz/mnt/etc/rpm/macros.langs` to configure which languages to generate content for14:57
clarkbI think ansible does require a utf8 locale/lang, which we're giving it, but maybe it requires C.utf8 now? That's a long-winded way of saying maybe it is ansible that updated and it doesn't like those options?14:58
clarkbI suspect we can probably modify that list of langs to include C.utf8 if that is the case14:58
clarkbbut neither ansible-core nor ansible has a new release in the last couple of days, so maybe that isn't it either15:00
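For context, a rough way to check what the built image actually provides, assuming shell access to a booted node or the build chroot (the macros.langs path matches the dib command quoted above):
    locale -a | grep -iE 'en_US|utf'                          # list the locales that were generated
    cat /etc/rpm/macros.langs                                 # should contain %_install_langs C:en_US:en_US.UTF-8
    python3 -c 'import locale; print(locale.getlocale())'     # what Python reports for the current locale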
mnasiadkaI love such cases15:15
clarkbanother possibility is a python update in centos 9 stream15:17
mnasiadkaWell, we're using python3.12 - let me try to reproduce it using centos:stream9 container image15:21
mnasiadkaclarkb: yup, it seems it needs C.UTF-8 for some reason on stream9 now... https://paste.openstack.org/show/b0oLyuO1QtnPLPw6F4T3/15:24
fungisuch fun15:25
fungigranted, i prefer C.UTF-8 for most things15:25
clarkbmnasiadka: can you run locale -a | grep en_US and make sure you have that locale installed?15:26
clarkb(it could be failing because it isn't installed in your env, but it should be present in our image builds based on the logs there)15:26
mnasiadkaok, it seems it's not, let me install that15:26
mnasiadkaok, after installing glibc-langpack-en it works15:28
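For anyone reproducing this in a bare centos:stream9 container, the fix mnasiadka describes is roughly:
    sudo dnf install -y glibc-langpack-en    # provides the en_US locales
    locale -a | grep en_US                   # should now include en_US.utf8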
clarkbI guess another possibility is that for some reason what the dib builds have done previously to install locales is no longer working15:28
clarkbmaybe next step is to hold a node/boot a node off that image and run locale -a and check that python3.12 works generally without ansible doing utf8 things15:30
clarkbif that doesn't work we can figure out what is wrong in the image. If that does work it would indicate some problem with ansible?15:30
clarkboh wait what if we've got the problem in reverse? Is it possible you are trying to force C.utf8 in your job somewhere?15:32
clarkbthen you get the same behavior you just saw in your container but with C.utf8 instead of en_US.utf815:32
clarkbI do think it is reasonable to add C.utf8 to the image as well. I suspect we didn't do that by default because it is a relatively new locale and centos 7 probably didn't support it15:33
mnasiadkaDoing a quick skim of kolla-ansible code - I don't think we're setting any locale variables anywhere15:34
mnasiadkaI can revisit that on Monday (it's 5:34pm here) - it's not that centos9 stream jobs are essential for us (we only use them as an early indicator that something might break in Rocky9 soon)15:35
clarkback enjoy your weekend15:35
corvusclarkb: with the fd leak fix in place, zuul tests are completing on rax flex.  looks to be about 16% faster than rax-dfw.18:37
corvushttps://zuul.opendev.org/t/zuul/buildset/818d09f16df34e57b272dc81b65e1467 is the battery if you want to see the results18:38
fungi15% faster with half as many processors18:43
fungier, 16%18:43
corvusyup not shabby18:44
mordredI feel like cloudnull should use that in marketing materials 18:58
JayF"not as slow and outdated as our outdated stuff" doesn't seem like a good pitch to me ;) 19:09
JayFI say that tongue in cheek, obviously it's a huge improvement but I gotta give cardoe and friends a hard time19:09
fungi"it's amazing how much faster our cloud got after jay left..." ;)19:10
JayFfungi: I /did/ brick like 20% of the OnMetal fleet once19:10
JayFfungi: but they got better19:10
fungibwahahaha19:10
cardoeMan that old hardware... it's so hateful19:11
JayFthe opencompute stuff we deployed for onmetal was very special19:11
JayFin lots of good ways, in lots of horrible ways19:11
JayFtrailblazing the path for opencompute at the whole company! I think we won to the count of a dozen racks total internationally when I left :P 19:11
clarkbcorvus: that is good evidence that using the fewer cpu flavor isn't a performance hit either19:26
clarkbsorry had an early lunch today19:26
fungialso suggests that the too many open files failure being hit by another project very well may be due to a similar cause (existing fd leak which only exceeds the limit when tests use fewer processors)19:34
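For anyone chasing a similar "too many open files" failure, two quick checks on the node (the pid below is a placeholder for the test worker process):
    ulimit -n                     # per-process soft limit on open descriptors
    ls /proc/<pid>/fd | wc -l     # descriptors currently open in a given process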
clarkb++19:34
fungiwas that neutron? i can't remember19:34
clarkbyes it was a neutron functional test19:35
fungiwhich means this is a great outcome... the variability in testing is exposing a previously unidentified bug19:35
JayFwhile you're right, I don't think any neutron devs will be throwing a party about it :D 19:44
* JayF would not want to track a leaky fd19:44
fungieveryone's a plumber here. it's all mains and drains really19:50
JayFyeah but in this case, you're the guy having to tell someone their crawlspace is moldy. no fun19:50
corvuspersonally, i do not think my time finding and fixing that was well spent.19:50
corvuszero bugs that affect the actual software were fixed19:50
corvusit's mostly the result of poor test isolation19:51
corvusso i don't blame the cloud for that, but i'm not celebrating it as an improvement.19:52
fungithough it probably wouldn't have existed (or would have been noticed/fixed sooner) if we had more such variability in the recent past19:52
fungibut osic was probably the last place we did 4cpu nodes and that's been a few years19:53
corvusit's been noticed before19:54
fungioh, just only tended to occur when the job ran slower/longer for other reasons?19:55
corvusnot in the gate19:56
fungiah, in local runs19:58
JayFhmm. Where would I make a change if I wanted the systemd unit files created by devstack available in the logs/../etc/ folder?20:16
JayFlike just point me at a repo and I'm sure I can figure it out, I just know there are 500 layers there and am hoping someone just knows20:16
clarkbJayF: in the devstack repo there should be stuff to collect logs20:17
clarkbyou'd add the paths you want to that20:17
JayFack, cool, I assumed it was higher up than that20:17
clarkbJayF: grep zuul_copy_output20:17
clarkbtypically jobs are responsible for collecting the logs they care about onto the executor then the executor publishes them to swift20:18
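Concretely, the variable is a mapping of source paths on the node to the 'logs' destination, so a hypothetical addition for the unit files would map something like /etc/systemd/system/devstack@*.service to logs; to find the existing entries in a devstack checkout:
    grep -n -A20 zuul_copy_output .zuul.yaml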
fungijust remember that collecting and publishing logs takes longer the more files there are, so recursively copying a directory full of stuff could make the post-run phase of the job take longer20:18
clarkbya it turns out the big bottleneck there is total swift requests more so than bandwidth when it comes to small text files20:19
clarkband each directory is a file20:19
fungitripleo jobs had a tendency to just collect the entirety of /etc, which took forever, so please don't do that ;)20:19
JayFthat is not included anywhere in devstack, and looking at it being configured, it's probably in one of those layered job repos 20:21
JayFI'll go digging20:21
JayFthe only mention of those terms in devstack is how to configure it 20:21
clarkbJayF: https://opendev.org/openstack/devstack/src/branch/master/.zuul.yaml#L354 its there20:23
JayFwhat the20:23
clarkbas a side note those unit files are pretty basic and I think come in only a couple of flavors, so you can probably just look at the source20:23
JayFoh no, does ripgrep exclude files starting with a `.` by default20:24
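ripgrep does skip hidden files and directories by default (and respects .gitignore), which is why .zuul.yaml never showed up; including them looks like:
    rg --hidden zuul_copy_output     # also search dotfiles such as .zuul.yaml
    rg -uu zuul_copy_output          # shorthand for --no-ignore --hidden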
JayFclarkb: I'm getting a weird error, so I specifically want to pull this and would rather get it from logs than ask you to hold a box20:24
JayFI'm trying to do something strange (I want to get ironic's BMC emulators to install in their own separate venv so we are no longer beholden to global-requirements/upper-constraints for it)20:24
JayFbut I can update the config of one of the ironic jobs and disable the others20:24
clarkbhttps://opendev.org/openstack/devstack/src/branch/master/functions-common#L1538-L1601 ok20:25
clarkber ok but there they are20:26
JayF2024-09-27 20:08:33.255044 | controller | Failed to start devstack@redfish-emulator.service: Unit devstack@redfish-emulator.service has a bad unit file setting. 20:26
JayFis what I'm digging into, and I'm directly screwing with vars that go into that template20:26
JayFso I'm just trying to find the shape of how I'm wrong :D 20:26
JayFre https://review.opendev.org/c/openstack/ironic/+/93077620:26
clarkbI thought systemctl would write out the bad bits when you daemon reload but maybe you have to query for them20:30
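A few ways to get systemd to say what it objects to, assuming shell access to a held node (unit name taken from the error above, unit path as devstack normally writes it):
    systemctl status devstack@redfish-emulator.service       # load state shows bad-setting and the reason
    systemctl cat devstack@redfish-emulator.service          # dump the unit file as systemd sees it
    systemd-analyze verify /etc/systemd/system/devstack@redfish-emulator.service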
JayFwell I have it printing output now, I'll poke at it off and on over the weekend20:35
clarkbJayF: I suspect the issue is your command isn't fully rooted20:37
clarkband PATH is set to devstack's $PATH, which likely doesn't include the root dir of your venv location?20:37
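In other words, the likely fix is making the command fully qualified: systemd rejects a relative executable path outright, and a bare command name is only looked up in systemd's own fixed search path, never in devstack's $PATH. A hypothetical before/after (venv path illustrative only):
    # problematic:  ExecStart=venv/bin/sushy-emulator
    # working:      ExecStart=/opt/stack/emulator-venv/bin/sushy-emulator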
JayFwell, I'm trying to set $cmd to a fully qualified path20:37
clarkbhttps://zuul.opendev.org/t/openstack/build/81c0541fc13448a98ed1d4fd6361564d/log/job-output.txt#10331 is what you're setting it to on the most recent run20:38
JayFand you all don't set a working directory, so I'd think that'd work20:38
JayFaha thank you clarkb 20:38
JayFthat was exactly what I was in search of20:38
JayFand I know likely what the issue is20:38
clarkbit gets expanded below, where you are actually calling to write the unit file, but it's easier to read where you define the variable, which is where I linked20:39
JayFit's obvious upon seeing your example :) but yes20:39
JayFlol20:39
JayFty clarkb :D https://review.opendev.org/c/openstack/ironic/+/930776/5..620:41
* JayF will be back in an hourish to check that20:41
