nick | message | time |
---|---|---|
corvus | infra-root: there appears to be an afs-related kernel issue on the new zuul-executors. https://paste.opendev.org/show/b8DDsy79bzfl3u4AUU9k/ | 00:49 |
corvus | i'm going to revert the jammy executor changes so that we'll be back on the focal hosts by monday | 00:50 |
opendevreview | James E. Blair proposed opendev/system-config master: Revert "Replace ze01-ze06" https://review.opendev.org/c/opendev/system-config/+/885179 | 00:50 |
opendevreview | James E. Blair proposed opendev/zone-opendev.org master: Revert "Replace ze01-ze06" https://review.opendev.org/c/opendev/zone-opendev.org/+/885180 | 00:51 |
opendevreview | Merged opendev/zone-opendev.org master: Revert "Replace ze01-ze06" https://review.opendev.org/c/opendev/zone-opendev.org/+/885180 | 01:01 |
opendevreview | Merged opendev/system-config master: Revert "Replace ze01-ze06" https://review.opendev.org/c/opendev/system-config/+/885179 | 01:12 |
opendevreview | Merged openstack/project-config master: Add nebulous/zuul-jobs https://review.opendev.org/c/openstack/project-config/+/885177 | 01:30 |
opendevreview | waleed mousa proposed openstack/diskimage-builder master: Add nm-dhcp-ib-interfaces element https://review.opendev.org/c/openstack/diskimage-builder/+/882507 | 05:45 |
frickler | it isn't obvious to me why executors need afs. also I'm not 100% convinced that the backtrace is afs-related, though it does sound likely | 09:24 |
fungi | frickler: jobs doing publication to afs volumes directly from executors rather than creating/revoking temporary permissions for job nodes to do so | 13:01 |
corvus | the backtrace happened after the executor process made an afs ioctl and then immediately crashed | 14:53 |
corvus | the strace: https://paste.opendev.org/show/bMJvauiX8ioQzhCxdoKH/ | 14:55 |
opendevreview | James E. Blair proposed opendev/zone-opendev.org master: Update serial on zone file https://review.opendev.org/c/opendev/zone-opendev.org/+/885191 | 15:13 |
fungi | looks like lunar and mantic have 1.8.9 packaged which we might be able to backport fairly trivially in a ppa if the issue is one that's been fixed upstream | 15:18 |
fungi | same packages as in debian bookworm and sid | 15:18 |
opendevreview | Merged opendev/zone-opendev.org master: Update serial on zone file https://review.opendev.org/c/opendev/zone-opendev.org/+/885191 | 15:18 |
fungi | or maybe the changes from 1.8.9 are what's in the -3 update build of 1.8.8 in jammy | 15:20 |
fungi | https://www.openafs.org/dl/openafs/1.8.9/ChangeLog | 15:22 |
fungi | focal's on linux 5.4 while jammy is 5.15 | 15:26 |
fungi | there's also apparently openafs 1.9.0 and 1.9.1 | 15:36 |
fungi | release notes for 1.8.8 indicate support for linux kernels up to 5.13: https://www.openafs.org/dl/openafs/1.8.8/RELNOTES-1.8.8 | 15:44 |
fungi | 1.8.9 adds support for up to linux 6.0 | 15:45 |
fungi | maybe 1.8.8 simply doesn't work for jammy's linux 5.15 | 15:46 |
fungi | at least the release notes don't say to expect it to work | 15:47 |
fungi | http://changelogs.ubuntu.com/changelogs/pool/universe/o/openafs/openafs_1.8.8.1-3ubuntu2~22.04.1/changelog | 15:47 |
fungi | aha | 15:47 |
fungi | "Import upstream patches to support linux through 5.18 (Closes: #1010764)" | 15:48 |
fungi | maybe something else was missed | 15:48 |
fungi | yeah, debian's 1.8.8-1 build included a patch for linux up to 5.14, 1.8.8-2 improved that, 1.8.8-3 fixed build issues on 5.15, 1.8.8.1-1 added support for kernels up to 5.16, 1.8.8.1-3 added support for linux up to 5.18, then for ubuntu's extra revisions on top of that there's support up through linux 5.19 to work with the hwe backport kernel in jammy | 15:52 |
fungi | maybe we're just hitting a different corner case they haven't accounted for | 15:53 |
*** | dmellado170 is now known as dmellado17 | 18:52 |
ianw | https://gerrit.openafs.org/#/c/14918/ is surely it; null deref to llseek | 23:47 |
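
The version comparison fungi walks through above (which OpenAFS build claims support for which kernel) can be sketched as a small lookup. This is only an illustration: the helper names and the example kernel release strings are hypothetical, and the ceilings are just the figures quoted in the log, not values read from OpenAFS or the packages.

```python
# Highest Linux kernel (major, minor) each OpenAFS build claims to support,
# as quoted in the discussion above. These are assumptions copied from the
# chat, not data queried from OpenAFS or the distro packages.
MAX_SUPPORTED_KERNEL = {
    "1.8.8": (5, 13),           # upstream release notes
    "1.8.9": (6, 0),            # upstream release notes
    "1.8.8.1-3": (5, 18),       # Debian build with imported patches
    "jammy package": (5, 19),   # Ubuntu revisions on top of 1.8.8.1-3
}


def parse_kernel(release):
    """Turn a release string such as '5.15.0-72-generic' into (5, 15)."""
    major, minor = release.split(".")[:2]
    return int(major), int(minor)


def claims_support(openafs_build, kernel_release):
    """True if the build's stated ceiling covers the given kernel."""
    return parse_kernel(kernel_release) <= MAX_SUPPORTED_KERNEL[openafs_build]


# Example release strings are made up for illustration.
focal_kernel = "5.4.0-1-generic"
jammy_kernel = "5.15.0-1-generic"

print(claims_support("1.8.8", focal_kernel))
print(claims_support("1.8.8", jammy_kernel))
print(claims_support("jammy package", jammy_kernel))
```

A stated ceiling only says the module is expected to build and load on that kernel. As the log shows, the jammy package claims support through 5.19 and the executors still crashed on 5.15, so a check like this can rule a combination out but cannot confirm it is safe.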