Tuesday, 2024-06-25

clarkbthat works for me. Agenda sent00:02
tonybclarkb: Ahh okay, that makes sense.  I guess I can work on that as well.00:10
tonybQuestion about the wiki switch/update.  How do we want to handle logs from the apache process inside the container?  At the moment they're being written to stdout/stderr which means they end up in docker logs / syslog.  Should I switch that so they're written to something like /var/log/containers/apache2/* ?  I guess similar question for the mariadb/memcached/elasticsearch containers.00:19
tonybHow do we test/upgrade the bridge host?  It looks like it's running Focal and therefore is on the list to update.00:28
tonybI noticed this because we can no longer install ansible from git00:29
opendevreviewTony Breeds proposed openstack/project-config master: Build a stow of python 3.13.  https://review.opendev.org/c/openstack/project-config/+/91848200:44
Clark[m]tonyb: re wiki if you look at other container setups we redirect stdout docker logs to syslog which I think would be fine here00:46
tonybClark[m]: Oh cool.00:46
Clark[m]tonyb: re bridge all of our system-config jobs deploy a test bridge and it is tested somewhat implicitly by being able to run the config management for stuff. I think you can propose a change to see if that works on jammy/noble. Then it's a matter of deploying it, syncing stuff and switching zuul over00:47
Clark[m]tonyb: what do you mean we can no longer install Ansible from git. Do you mean in the Ansible devel job?00:47
Clark[m]I think we install from packages elsewhere00:48
tonybClark[m]: Yeah the Ansible devel job fails because python is too old00:48
Clark[m]Oh and we need to update ssh rules on nodes to allow ssh from the new host as well as the current one00:48
Clark[m]Ack00:49
tonybI get that we use packages elsewhere but that's an early warning.00:49
tonybClark[m]: okay so bridge is special but not as special as I was thinking.00:50
Clark[m]Yup. I just wanted to make sure we weren't breaking in production unexpectedly 00:50
tonybOkay.  I might try and get the ansible-devel job working again as an early step00:51
Clark[m]Ya it's special due to its position in the world but the testing of it works just like everything else for the most part00:51
tonybI guess we have the file matchers set so if we update system-config-run we also run all the services?00:52
Clark[m]Maybe? If we're missing matchers we should have then we can always add them00:54
tonybOh it looks like bridge99 is already jammy Hmm https://opendev.org/opendev/system-config/src/branch/master/zuul.d/system-config-run.yaml#L6200:55
tonybI'll do more reading/digging00:55
opendevreviewMerged opendev/glean master: testing: remove centos7 and 8  https://review.opendev.org/c/opendev/glean/+/92191100:58
opendevreviewIan Wienand proposed openstack/diskimage-builder master: dib-functests: run on bookworm  https://review.opendev.org/c/openstack/diskimage-builder/+/92269701:07
opendevreviewTakashi Kajinami proposed opendev/storyboard master: Adopt to recent tox  https://review.opendev.org/c/opendev/storyboard/+/92269902:25
opendevreviewTakashi Kajinami proposed opendev/storyboard master: Fix test executions  https://review.opendev.org/c/opendev/storyboard/+/92269902:31
opendevreviewJeremy Stanley proposed opendev/system-config master: Rebalance Mailman's and Exim's outgoing batch size  https://review.opendev.org/c/opendev/system-config/+/92270303:23
opendevreviewTakashi Kajinami proposed opendev/storyboard master: Fix test executions  https://review.opendev.org/c/opendev/storyboard/+/92269903:27
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270404:23
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270404:48
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270407:43
fricklerinfra-root: websites on static.o.o seem to be very slow, taking a closer look now08:27
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270408:41
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270409:25
*** diablo_rojo_phone is now known as Guest1073411:21
fungifrickler: did you manage to spot the cause? i see a slow afs warning in dmesg from ~30 minutes after you mentioned the issue13:12
fungiaha, never mind. i didn't finish reading everywhere13:13
corvuswhere can i read about slow static.o.o?14:54
opendevreviewClark Boylan proposed opendev/system-config master: Delete centos 8-stream mirror content  https://review.opendev.org/c/opendev/system-config/+/92274916:18
opendevreviewClark Boylan proposed opendev/system-config master: Remove centos rsync mirroring tooling  https://review.opendev.org/c/opendev/system-config/+/92275016:18
corvusi'm going to restart the schedulers and web servers again for another regression fix (that probably doesn't impact opendev much if at all, but just in case)16:57
opendevreviewClark Boylan proposed opendev/system-config master: Remove centos rsync mirroring tooling  https://review.opendev.org/c/opendev/system-config/+/92275017:03
opendevreviewMerged openstack/diskimage-builder master: Disable Stream-8, non-vote FC 37, and assume yes on debian functest  https://review.opendev.org/c/openstack/diskimage-builder/+/92182317:03
corvus#status log restarted zuul schedulers/web17:10
opendevstatuscorvus: finished logging17:10
fungithanks!17:21
opendevreviewJulia Kreger proposed openstack/diskimage-builder master: Provide an ability to disable serial console injection  https://review.opendev.org/c/openstack/diskimage-builder/+/92244117:27
opendevreviewJulia Kreger proposed openstack/diskimage-builder master: remove console entries when console is disabled  https://review.opendev.org/c/openstack/diskimage-builder/+/92244217:27
opendevreviewJulia Kreger proposed openstack/diskimage-builder master: minor ci: quick cleanup and remove centos8  https://review.opendev.org/c/openstack/diskimage-builder/+/92276317:29
dpanechHi we are getting infrastructure-related errors in Zuul jobs, eg here: https://zuul.opendev.org/t/openstack/build/b168543cf18f49b592967ee6b81a3aa6 (https://mirror-int.dfw.rax.opendev.org/ not reachable)17:38
clarkbdpanech: looks like the mirror returned a 404 response17:41
fungithat looks like an ansible change where it has started looking for debian-11.9 instead of debian-1117:45
clarkbya it seems like it can't find a pyzmq wheel on pypi that is valid for the current platform so falls back to the wheel cache and can't find anything there either17:46
clarkbbut in all cases the mirror appears to be accessible and returns appropriate responses17:46
clarkband they must be doing binary only installs because the sdist isn't fetched?17:47
dpanechSorry, is there anything wrong with our scripts?17:50
clarkbdpanech: I don't know enough about what the job is trying to do there to say one way or another. I'm just describing what I see in the logs and I don't think it is a mirror problem.17:52
dpanechclarkb: I'll ask one of our developers to join here17:53
clarkbhttps://zuul.opendev.org/t/openstack/build/b168543cf18f49b592967ee6b81a3aa6/console#0/4/23/debian-bullseye is where the wheel cache is configured so ya at least part of the issue here is the one fungi points out. Need to fix the config for that so that it points at 11 not 11.017:55
clarkb*11.917:55
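The version mismatch described above (a lookup for debian-11.9 where the mirror directory is debian-11-x86_64) can be illustrated with a small sketch. This is not the job's actual role code; `debian_wheel_dir` is a hypothetical helper, and it assumes the Debian wheel directories are keyed on the major release only, which is what the debian-11-x86_64 directory name suggests.

```python
def debian_wheel_dir(version: str, arch: str) -> str:
    """Build a Debian wheel-cache directory name such as 'debian-11-x86_64'.

    Hypothetical helper: it keeps only the major release, so a point
    release string like '11.9' still maps to the 'debian-11-...' directory.
    """
    major = version.split(".", 1)[0]
    return f"debian-{major}-{arch}"


# Both the bare major version and a point release resolve to the same path.
print(debian_wheel_dir("11", "x86_64"))
print(debian_wheel_dir("11.9", "x86_64"))
```

If the distribution version variable started carrying the point release, a path built from it verbatim would end in debian-11.9-x86_64 and miss the existing directory.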
clarkbhttps://mirror.dfw.rax.opendev.org/wheel/debian-11-x86_64/pyzmq/ and there is a wheel there17:56
clarkbhrm it does actually seem to find the sdist and wheels from pypi upstream. So why isn't it simply using one of those18:00
dpanechclarkb: >>Need to fix the config for that so that it points at 11<< -- I don't think we changed anything in Zuul config recently. What config would that be?18:01
clarkbdpanech: thats a separate issue. I don't think it explains why your job failed18:02
fungimost likely a behavior change in ansible, for how it populates distribution version variables18:02
fungias for the error itself, pip requires all index urls to be valid even if they don't contain matching packages18:02
clarkbit does this: Downloading https://mirror-int.dfw.rax.opendev.org/pypifiles/packages/58/9d/d26c8808cfc5a033d2fcb724767cc2e183af1c2af1440865776a113cc6f9/pyzmq-20.0.0-cp39-cp39-manylinux1_x86_64.whl.metadata18:02
clarkbthen throws an exception reading that metadata18:03
fungiand that's through the caching proxy, not from the wheel mirror18:05
clarkbcorrect18:05
clarkbto be clear I think the wheel mirror thing is a problem and one that should be addressed. But I think if fixing that fixes this problem we're going to mask the actual problem which is that you can't install the wheel from pypi proper which should be the preferred method18:06
fungithe last successful run for that job was 2024-06-21 20:28:48 (~4 days ago), so whatever changed did so after that18:08
fungithat sort of coincides with the pip 24.1 update, which happened the day before18:09
fungiand did include changes to dependency solving18:09
clarkbhttps://github.com/pypa/pip/blob/24.1/src/pip/_vendor/pkg_resources/__init__.py#L2863 ya this is the line that is exploding18:09
clarkbI wonder if latest pip is unable to parse that metadata for some reason18:09
fungii'm struggling to view the build output in a browser because it's so massive, which file did you find the traceback in?18:10
clarkbhttps://zuul.opendev.org/t/openstack/build/b168543cf18f49b592967ee6b81a3aa6/log/job-output.txt#63829-63840 its there18:10
clarkbfungi: you may need to fetch the file and view it locally in a text editor18:10
clarkbmy browser is struggling too but just managing18:10
clarkbI'll make a paste actually that seems friendliest18:11
fungithanks18:11
clarkbfungi: dpanech https://paste.opendev.org/show/bYdq2ryZhiRAJemQDnFp/18:13
fungiinterestingly i don't see "cpython" appearing in any of the metadata fields (other than a passing mention in the markdown description)18:14
clarkbya but also that exception occurs after the initial exception so it may just be in a bad state and trying to look up invalid stuff18:15
clarkbdpanech: fungi: probably the next step is to try and install pyzmq of that version with python3.9 on debian 11 and see if you can reproduce the explosion18:15
clarkband if so file a bug with pip / debug further depending on your interest level18:15
clarkbya https://github.com/pypa/pip/blob/24.1/src/pip/_vendor/pkg_resources/__init__.py#L3070 explodes then https://github.com/pypa/pip/blob/24.1/src/pip/_vendor/pkg_resources/__init__.py#L3072 is called which explodes further18:17
clarkbreading that code I suppose the idea is that its ok for the first explosion to occur but not the second as it seems dep map is populated by _compute_dependencies so maybe the cpython version string is the problem18:18
clarkbfungi: dpanech: I think it is tripping over the Requires-Dist lines in the metadata file18:19
fungihttps://github.com/pdm-project/pdm/issues/167518:20
fungilooks similarish18:20
clarkbRequires-Dist: py ; implementation_name === "pypy" and/or Requires-Dist: cffi ; implementation_name === "pypy" specifically18:20
clarkbyup that issue seems to have reached the same conclusion18:20
clarkbthe problem is the === ?18:21
clarkboh wait no its due to not listing require-dist for cpython?18:21
fungilooks like pip 24.0 used packaging 21.3 while pip 24.1 upgrades its vendored copy of packaging to 24.118:21
clarkboh wait no they say you can change it to == and it's fine18:21
clarkbso ya not an infrastructure problem. It's a pip problem18:21
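For anyone checking other packages for the same pattern, here is a minimal sketch that scans metadata text for Requires-Dist lines whose environment marker applies === to a quoted string. It only uses a regular expression, not pip or packaging, so it flags the pattern without reproducing the exception; `suspect_requires_dist` is a hypothetical name, and the sample lines are the two quoted above.

```python
import re
from typing import List

# A marker clause that applies '===' to a quoted string, e.g. === "pypy".
TRIPLE_EQ = re.compile(r"""===\s*["'][^"']*["']""")


def suspect_requires_dist(metadata: str) -> List[str]:
    """Return Requires-Dist lines whose marker uses '===' on a quoted string."""
    hits = []
    for line in metadata.splitlines():
        if not line.startswith("Requires-Dist:"):
            continue
        # Everything after the first ';' is the environment marker.
        _, _, marker = line.partition(";")
        if TRIPLE_EQ.search(marker):
            hits.append(line)
    return hits


sample = '''Name: pyzmq
Requires-Dist: py ; implementation_name === "pypy"
Requires-Dist: cffi ; implementation_name === "pypy"
'''

for line in suspect_requires_dist(sample):
    print(line)
```

Lines written with == instead of === are not flagged, matching the workaround mentioned in the linked pdm issue.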
dpanechclarkb: fungi: ok thank you, I'll pass this on18:23
clarkbalso based on that I'm not sure that changing the wheel mirror for debian path will help as presumably those wheels will have the similarly "corrupt" metadata18:23
clarkbdpanech: so ya I think your options are to file a bug with pip and see if they can fix it, or downgrade pip, or see if pyzmq ever changed their metadata in newer versions of their packages and possibly upgrade to a newer version of the package18:23
clarkbthen separately we can see about fixing access to the wheels built for debian 11 specifically, but I don't think that will help in this instance18:27
fungilooks like it's being pulled in as a dependency of a git install of https://github.com/0rpc/zerorpc-python but that doesn't seem to be what's setting the upper bound on it18:28
dpanechigor-soares: I'll send the chat transcript in a bit18:29
igor-soaresAlright. Thank you.18:30
fungidpanech: igor-soares: we publish the channel log at https://meetings.opendev.org/irclogs/%23opendev/latest.log.html#t2024-06-25T17:38:4918:30
fungiso pyzmq 20.0.0 was released almost 4 years ago, but seems to maybe be the last version which published manylinux1 wheels for cp3918:32
clarkboh that would explain why that version is being selected18:32
clarkbstill doesn't guarantee newer versions don't have the same issue though18:32
clarkbhttps://github.com/zeromq/pyzmq/blob/main/pyproject.toml#L42 I suspect that the latest stuff would work actually18:33
clarkbhttps://github.com/zeromq/pyzmq/blob/v21.0.0/setup.py#L1446-L1447 and even version 21 I guess18:35
clarkbso fixing the debian wheel mirror path may actually fix the CI jobs but then anyone trying to install in the real world would be broken18:35
fungilooks like that was the point where they switched to making manylinux2010 wheels for cp3918:36
clarkband bullseye isn't manylinux2010 compatible?18:37
fungithat's what's baffling, i think it should be?18:37
clarkbI would expect it to be. Bullseye is what 4 years old now? that's ~2020 not 201018:38
clarkbtonyb: following up on questions about bridge. Bridge in prod appears to be jammy18:41
clarkbtonyb: I think we can/should just correct the node type in our jobs18:41
fungihttps://github.com/0rpc/zerorpc-python/blob/99ee6e47c8baf909b97eec94f184a19405f392a2/setup.py#L40 is where the pyzmq>=13.1.0 dependency is taken from in the failing starlingx build18:44
clarkbmanylinux2010 requires pip 19 or newer, but the job seems to say it is using pip 24.1 which aligns with what we see in the tracebacks18:47
dpanechfungi: clarkb: thanks for your help, we are looking into it on our side18:50
fungidpanech: igor-soares: in summary, hopefully you can reproduce by trying to install pyzmq with pip 24.1 on debian 1118:51
clarkbdpanech: igor-soares as a side note our rackspace hosted mirrors have internal and external network interfaces. We set up the jobs to use the internal interfaces with the mirror-int.* names for reliability and throughput purposes. If you want to test things you can drop the -int portion of the string and hit the publicly accessible interface and all the services are the same18:51
clarkbbut I suspect this is reproducible talking to pypi directly and isn't directly related to our mirrors18:52
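The hostname tip above (drop the -int portion of a mirror-int.* name to reach the public interface) can be sketched as a small URL rewrite. `public_mirror_url` is a hypothetical helper for local testing, not part of any OpenDev tooling, and it assumes the -int suffix only ever appears on the first hostname label.

```python
from urllib.parse import urlsplit, urlunsplit


def public_mirror_url(url: str) -> str:
    """Rewrite a mirror-int.* URL to its public mirror.* equivalent."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Only the first label (e.g. 'mirror-int') carries the suffix.
    label, sep, rest = host.partition(".")
    if label.endswith("-int"):
        label = label[: -len("-int")]
    netloc = label + sep + rest
    if parts.port:
        netloc += f":{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(public_mirror_url("https://mirror-int.dfw.rax.opendev.org/"))
```

URLs that already use the public name pass through unchanged.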
igor-soares_fungi: clarkb: thanks for your input. This is good info to keep digging on our end.18:56
fungii would try to do it myself but don't have a debian-11 vm or chroot handy and would need to boot/make one18:58
clarkbdocker run debian:bullseye would probably work18:59
fungii also don't have docker installed on my workstation (tried once but it made an utter mess)19:00
fungii think it's still picking remaining traces of docker out of its teeth19:01
clarkbfungi: I have approved the mailman + exim change19:57
fungithanks, i'll try to check on it after deployment19:57
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Support .python-version files in ensure-python  https://review.opendev.org/c/zuul/zuul-jobs/+/92251520:10
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Support .python-version files in ensure-python  https://review.opendev.org/c/zuul/zuul-jobs/+/92251520:28
opendevreviewMerged opendev/system-config master: Rebalance Mailman's and Exim's outgoing batch size  https://review.opendev.org/c/opendev/system-config/+/92270320:47
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Support .python-version files in ensure-python  https://review.opendev.org/c/zuul/zuul-jobs/+/92251520:56
clarkbI think the mailman update deployed about 20 minutes ago21:22
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Support .python-version files in ensure-python  https://review.opendev.org/c/zuul/zuul-jobs/+/92251521:30
clarkbnow need someone to send mail to openstack-discuss and see if the time deltas shrink21:33
fungiuwsgi processes last started 21:07 so that sounds about right21:40
clarkbthis is interesting: if I boot nomodeset with tumbleweed's kernel I only get 1920x1080 (native is 2560x1440), but if I boot ubuntu noble's kernel nomodeset I get only native resolution22:53
clarkbboth are running X and not wayland which I thought maybe could be related. I suspect ubuntu's kernel is simply better about heuristicing the display when operating with the mesa driver instead of amdgpu22:53
clarkber I guess it is likely vesa not mesa22:54
clarkbanyway I'm still able to reproduce the problems when booting without nomodeset which loads the amdgpu driver under noble as well as jammy and tumbleweed and was not able to do so with similar hardware (my brother has almost the same laptop) so ya I think I have to conclude its hardware and get on the phone tomorrow22:55
