*** ramereth[m]1 is now known as Ramereth[m] | 00:55 | |
*** elodilles_pto is now known as elodilles | 07:00 | |
mnasiadka | Has anything on CentOS 9 Stream nodes changed? Ansible started complaining about unsupported locale (https://38bbf4ec3cadfd43de08-7d0e556db3075d25d1b91bbdcc8a4562.ssl.cf2.rackcdn.com/930689/1/check/kolla-ansible-centos9s/cc5c91b/primary/logs/ansible/bootstrap-servers) | 09:55 |
frickler | mnasiadka: I'm not aware of any change on the opendev side. to me it seems very likely that it is once again the stream that is floating | 10:29 |
* zigo really hates the move from release names to release dates. Can we revert that decision and use sane names for stable branches? :/ | 12:56 | |
fungi | that's not an opendev decision. you want #openstack-tc | 12:59 |
zigo | Right. | 13:04 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Temporarily remove release docs semaphores https://review.opendev.org/c/openstack/project-config/+/930709 | 14:27 |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Revert "Temporarily remove release docs semaphores" https://review.opendev.org/c/openstack/project-config/+/930710 | 14:27 |
clarkb | mnasiadka: frickler: looking at build logs on nb01.opendev.org and nb02.opendev.org for centos 9 stream and grepping for things like locale and the en_US.UTF-8 value we set things to, I don't see any obvious difference between the most recent image build and previous ones | 14:57 |
clarkb | mnasiadka: frickler: I do note that we run `echo '%_install_langs C:en_US:en_US.UTF-8' | sudo tee -a /opt/dib_tmp/dib_build.Pdtq2kFz/mnt/etc/rpm/macros.langs` to configure which languages to generate content for | 14:57 |
clarkb | I think ansible does require a utf8 locale/lang which we're giving to it, but maybe it requires C.utf8 now? That's a long-winded way of saying maybe it is ansible that updated and it doesn't like those options? | 14:58 |
clarkb | I suspect we can probably modify that list of langs to include C.utf8 if that is the case | 14:58 |
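A minimal sketch of what extending that list could look like inside the image build chroot (the $CHROOT placeholder and the verification step are assumptions, not the actual dib element):

```shell
# Append C.UTF-8 to the rpm install-langs filter so its locale data is
# kept when glibc langpacks are installed ($CHROOT stands in for the dib
# build mount, e.g. /opt/dib_tmp/dib_build.XXXX/mnt).
echo '%_install_langs C:en_US:en_US.UTF-8:C.UTF-8' \
  | sudo tee -a "$CHROOT/etc/rpm/macros.langs"

# Inside the finished image, confirm the locale actually shows up:
locale -a | grep -i 'c.utf'
```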
clarkb | but neither ansible-core nor ansible has had a new release in the last couple of days so maybe that isn't it either | 15:00 |
mnasiadka | I love such cases | 15:15 |
clarkb | another possibility is a python update in centos 9 stream | 15:17 |
mnasiadka | Well, we're using python3.12 - let me try to reproduce it using centos:stream9 container image | 15:21 |
mnasiadka | clarkb: yup, it seems it needs C.UTF-8 for some reason on stream9 now... https://paste.openstack.org/show/b0oLyuO1QtnPLPw6F4T3/ | 15:24 |
fungi | such fun | 15:25 |
fungi | granted, i prefer C.UTF-8 for most things | 15:25 |
clarkb | mnasiadka: can you run locale -a | grep en_US and make sure you have that locale installed? | 15:26 |
clarkb | (it could be failing because it isn't installed in your env, but it should be present in our image builds based on the logs there) | 15:26 |
mnasiadka | ok, it seems it's not, let me install that | 15:26 |
mnasiadka | ok, after installing glibc-langpack-en it works | 15:28 |
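For reference, a rough version of that reproduction in a throwaway Stream 9 container (the image location and podman invocation are just one way to do it):

```shell
# Check for the en_US locales, install the English langpack if missing,
# then list what is available afterwards.
podman run --rm quay.io/centos/centos:stream9 bash -c '
  locale -a | grep -i en_US || echo "en_US locales missing"
  dnf install -y -q glibc-langpack-en
  locale -a | grep -iE "en_US|c.utf"
'
```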
clarkb | I guess another possibility is that for some reason what the dib builds have done previously to install locales is no longer working | 15:28 |
clarkb | maybe next step is to hold a node/boot a node off that image and run locale -a and check that python3.12 works generally without ansible doing utf8 things | 15:30 |
clarkb | if that doesn't work we can figure out what is wrong in the image. If that does work it would indicate some problem with ansible? | 15:30 |
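A sanity check along those lines on a held node might look something like this (the grep patterns and one-liners are illustrative):

```shell
# What locales did the image build actually produce?
locale -a | grep -iE 'en_us|c\.utf'

# Does python3.12 pick up a UTF-8 encoding without ansible in the picture?
python3.12 -c 'import locale, sys; print(locale.getpreferredencoding(), sys.getfilesystemencoding())'
LANG=en_US.UTF-8 python3.12 -c 'print("round trip:", "\u00e9".encode("utf-8"))'
```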
clarkb | oh wait what if we've got the problem in reverse? Is it possible you are trying to force C.utf8 in your job somewhere? | 15:32 |
clarkb | then you get the same behavior you just saw in your container but with C.utf8 instead of en_US.utf8 | 15:32 |
clarkb | I do think it is reasonable to add C.utf8 to the image as well. I suspect we didn't do that by default because it is a relatively new locale and centos 7 probably didn't support it | 15:33 |
mnasiadka | Doing a quick skim of kolla-ansible code - I don't think we're setting any locale variables anywhere | 15:34 |
mnasiadka | I can revisit that on Monday (it's 5:34pm here) - it's not that centos9 stream jobs are essential for us (we only use them as a pointer if something can break in Rocky9 soon) | 15:35 |
clarkb | ack enjoy your weekend | 15:35 |
corvus | clarkb: with the fd leak fix in place, zuul tests are completing on rax flex. looks to be about 16% faster than rax-dfw. | 18:37 |
corvus | https://zuul.opendev.org/t/zuul/buildset/818d09f16df34e57b272dc81b65e1467 is the battery if you want to see the results | 18:38 |
fungi | 15% faster with half as many processors | 18:43 |
fungi | er, 16% | 18:43 |
corvus | yup not shabby | 18:44 |
mordred | I feel like cloudnull should use that in marketing materials | 18:58 |
JayF | "not as slow and outdated as our outdated stuff" doesn't seem like a good pitch to me ;) | 19:09 |
JayF | I say that tongue in cheek, obviously it's a huge improvement but I gotta give cardoe and friends a hard time | 19:09 |
fungi | "it's amazing how much faster our cloud got after jay left..." ;) | 19:10 |
JayF | fungi: I /did/ brick like 20% of the OnMetal fleet once | 19:10 |
JayF | fungi: but they got better | 19:10 |
fungi | bwahahaha | 19:10 |
cardoe | Man that old hardware... it's so hateful | 19:11 |
JayF | the opencompute stuff we deployed for onmetal was very special | 19:11 |
JayF | in lots of good ways, in lots of horrible ways | 19:11 |
JayF | trailblazing the path for opencompute at the whole company! I think we won to the count of a dozen racks total internationally when I left :P | 19:11 |
clarkb | corvus: that is good evidence that using the fewer cpu flavor isn't a performance hit either | 19:26 |
clarkb | sorry had an early lunch today | 19:26 |
fungi | also suggests that the too many open files failure being hit by another project very well may be due to a similar cause (existing fd leak which only exceeds the limit when tests use fewer processors) | 19:34 |
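One way to eyeball that theory on a test node is to compare the per-process fd limit against what the test runner is actually holding (the stestr process name is an assumption; substitute the real runner):

```shell
ulimit -n                                    # soft limit for new shells
pid=$(pgrep -f stestr | head -n1)
grep 'open files' /proc/"$pid"/limits        # limit for the running process
ls /proc/"$pid"/fd | wc -l                   # descriptors currently open
```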
clarkb | ++ | 19:34 |
fungi | was that neutron? i can't remember | 19:34 |
clarkb | yes it was a neutron functional test | 19:35 |
fungi | which means this is a great outcome... the variability in testing is exposing a previously unidentified bug | 19:35 |
JayF | while you're right, I don't think any neutron devs will be throwing a party about it :D | 19:44 |
* JayF would not want to track a leaky fd | 19:44 | |
fungi | everyone's a plumber here. it's all mains and drains really | 19:50 |
JayF | yeah but in this case, you're the guy having to tell someone their crawlspace is moldy. no fun | 19:50 |
corvus | personally, i do not think my time finding and fixing that was well spent. | 19:50 |
corvus | zero bugs that affect the actual software were fixed | 19:50 |
corvus | it's mostly the result of poor test isolation | 19:51 |
corvus | so i don't blame the cloud for that, but i'm not celebrating it as an improvement. | 19:52 |
fungi | though it probably wouldn't have existed (or would have been noticed/fixed sooner) if we had more such variability in the recent past | 19:52 |
fungi | but osic was probably the last place we did 4cpu nodes and that's been a few years | 19:53 |
corvus | it's been noticed before | 19:54 |
fungi | oh, just only tended to occur when the job ran slower/longer for other reasons? | 19:55 |
corvus | not in the gate | 19:56 |
fungi | ah, in local runs | 19:58 |
JayF | hmm. Where would I make a change if I wanted the systemd unit files created by devstack available in the logs/../etc/ folder? | 20:16 |
JayF | like just point me at a repo and I'm sure I can figure it out, I just know there are 500 layers there and am hoping someone just knows | 20:16 |
clarkb | JayF: in the devstack repo there should be stuff to collect logs | 20:17 |
clarkb | you'd add the paths you want to that | 20:17 |
JayF | ack, cool, I assumed it was higher up than that | 20:17 |
clarkb | JayF: grep zuul_copy_output | 20:17 |
clarkb | typically jobs are responsible for collecting the logs they care about onto the executor then the executor publishes them to swift | 20:18 |
fungi | just remember that collecting and publishing logs takes longer the more files there are, so recursively copying a directory full of stuff could make the post-run phase of the job take longer | 20:18 |
clarkb | ya it turns out the big bottleneck there is total swift requests more so than bandwidth when it comes to small text files | 20:19 |
clarkb | and each directory is a file | 20:19 |
fungi | tripleo jobs had a tendency to just collect the entirety of /etc, which took forever, so please don't do that ;) | 20:19 |
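For context, the copy list lives in devstack's own zuul config, and it's easy to gauge how many files a candidate path would add before listing it (paths below assume the usual /opt/stack checkout and the default systemd unit location; they're examples, not a suggestion to copy all of /etc):

```shell
# Where devstack declares what gets copied off the node:
grep -n zuul_copy_output /opt/stack/devstack/.zuul.yaml

# How many extra files a systemd-unit glob would add to the upload:
find /etc/systemd/system -maxdepth 1 -name 'devstack@*.service' | wc -l
```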
JayF | that is not included anywhere in devstack, and looking at it being configured, it's probably in one of those layered job repos | 20:21 |
JayF | I'll go digging | 20:21 |
JayF | the only mention of those terms in devstack is how to configure it | 20:21 |
clarkb | JayF: https://opendev.org/openstack/devstack/src/branch/master/.zuul.yaml#L354 its there | 20:23 |
JayF | what the | 20:23 |
clarkb | as a side note those unit files are pretty basic and I think come in only a couple of flavors; you can probably just look at the source | 20:23 |
JayF | oh no, does ripgrep exclude files starting with a `.` by default | 20:24 |
JayF | clarkb: I'm getting a weird error, so I specifically want to pull this and would rather get it from logs than ask you to hold a box | 20:24 |
JayF | I'm trying to do something strange (I want to get ironic's BMC emulators to install in their own separate venv so we are no longer beholden to global-requirements/upper-constraints for it) | 20:24 |
JayF | but I can update the config of one of the ironic jobs and disable the others | 20:24 |
clarkb | https://opendev.org/openstack/devstack/src/branch/master/functions-common#L1538-L1601 ok | 20:25 |
clarkb | er ok but there they are | 20:26 |
JayF | 2024-09-27 20:08:33.255044 | controller | Failed to start devstack@redfish-emulator.service: Unit devstack@redfish-emulator.service has a bad unit file setting. | 20:26 |
JayF | is what I'm digging into, and I'm directly screwing with vars that go into that template | 20:26 |
JayF | so I'm just trying to find the shape of how I'm wrong :D | 20:26 |
JayF | re https://review.opendev.org/c/openstack/ironic/+/930776 | 20:26 |
clarkb | I thought systemctl would write out the bad bits when you daemon reload but maybe you have to query for them | 20:30 |
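A few ways to get systemd to name the offending setting on the node (unit name taken from the error above; the path assumes the default /etc/systemd/system location devstack writes to):

```shell
sudo systemctl daemon-reload
systemctl cat devstack@redfish-emulator.service          # what was actually written
systemd-analyze verify /etc/systemd/system/devstack@redfish-emulator.service
journalctl -u devstack@redfish-emulator.service --no-pager | tail -n 20
```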
JayF | well I have it printing output now, I'll poke at it off and on over the weekend | 20:35 |
clarkb | JayF: I suspect the issue is your command isn't fully rooted | 20:37 |
clarkb | and path is set to devstack's $PATH which likely doesn't include the root dir of your venv location? | 20:37 |
JayF | well, I'm trying to set $cmd to a fully qualified path | 20:37 |
clarkb | https://zuul.opendev.org/t/openstack/build/81c0541fc13448a98ed1d4fd6361564d/log/job-output.txt#10331 is what you're setting it to on the most recent run | 20:38 |
JayF | and you all don't set a working directory, so I'd think that'd work | 20:38 |
JayF | aha thank you clarkb | 20:38 |
JayF | that was exactly what I was in search of | 20:38 |
JayF | and I know likely what the issue is | 20:38 |
clarkb | it gets expanded below where you are actually calling to write the unit file but it's easier to read where you define the variable, where I linked | 20:39 |
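A quick follow-up check that the expanded command is an absolute, executable path (file name follows the error message above; this is a diagnostic sketch, not the fix itself):

```shell
unit=/etc/systemd/system/devstack@redfish-emulator.service
grep -E '^(ExecStart|Environment)' "$unit"
cmd=$(grep '^ExecStart=' "$unit" | cut -d= -f2- | awk '{print $1}')
case "$cmd" in
  /*) test -x "$cmd" && echo "ok: $cmd" || echo "not executable: $cmd" ;;
  *)  echo "not an absolute path: $cmd" ;;
esac
```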
JayF | it's obvious upon seeing your example :) but yes | 20:39 |
JayF | lol | 20:39 |
JayF | ty clarkb :D https://review.opendev.org/c/openstack/ironic/+/930776/5..6 | 20:41 |
* JayF will be back in an hourish to check that | 20:41 |