Nick | Message | Time |
---|---|---|
fungi | up to 48 uncaught bounce notifications to openstack-discuss-owner now | 00:49 |
fungi | hopefully should see some users with disabled subscriptions soon | 00:50 |
opendevreview | Dr. Jens Harbott proposed opendev/irc-meetings master: Typo fix for eventlet-removal meeting https://review.opendev.org/c/opendev/irc-meetings/+/935744 | 09:03 |
opendevreview | Merged opendev/irc-meetings master: Typo fix for eventlet-removal meeting https://review.opendev.org/c/opendev/irc-meetings/+/935744 | 11:35 |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 12:05 |
fungi | openstack-discuss-owner has now received 112 uncaught bounce notifications. more than 5 per address for most of the defunct subscribers. i would have expected to see their subscriptions disabled by this point, so will look closer in a bit | 14:19 |
fungi | huh, so there are subscribers with nonzero bounce scores, but checking a sample of the ones i'm getting uncaught bounce notifications for they all have bounce scores of 0. i guess that's what the "uncaught" means, i think they are set up as forwards, so the ndr which comes back isn't for the same address as what's in the subscription. i guess i need to try to correlate them to | 14:25 |
fungi | subscribed addresses (luckily 95% are red hat addresses and they only seem to differ by the domain part), then manually disable delivery for those | 14:25 |
fungi | as for bounce scores, the highest i'm seeing for any subscriber is 2, i think it must only increment them at most once a day | 14:27 |
fungi | so it will take until the weekend to reach the necessary threshold for disablement on the ones that are bouncing sensibly | 14:28 |
karolinku[m] | Hey folks, I'm working on adding CS10 support in DIB. Recently some issues with architecture appeared, so I'm testing it on the nested-virt label. Unfortunately, it looks like devstack jobs can't deal with nested virtualization https://github.com/openstack/devstack/blob/72f99641f15464dca45e42ab0bdae9d3e0cbbe0f/.zuul.yaml#L349-L350, so I wanted to try devstack's var LIBVIRT_TYPE=kvm. So the question is, can I somehow, in a tricky way, inject this | 14:51 |
karolinku[m] | variable from DIB repo level to devstack? | 14:51 |
fungi | karolinku[m]: there should be numerous examples of devstack-based jobs which override its envvars, for example in the openstack/neutron repo... i'll find you one | 15:16 |
fungi | karolinku[m]: i think this is the sort of thing you're looking for? https://opendev.org/openstack/neutron/src/branch/master/zuul.d/base.yaml#L55-L58 | 15:17 |
fungi | basically, child jobs inheriting from e.g. the devstack-minimal parent job then insert values into devstack's localrc via the devstack_localrc array | 15:18 |
fungi | this might also be a question to bring up with the devstack maintainers in #openstack-qa | 15:19 |
karolinku[m] | yeah, that may be something I need. Thanks for tips! | 15:19 |
fungi | any time! | 15:20 |
opendevreview | Karolina Kula proposed openstack/diskimage-builder master: WIP: Add support for CentOS Stream 10 https://review.opendev.org/c/openstack/diskimage-builder/+/934045 | 15:27 |
corvus | fungi: but shouldn't the bounce come from a verp which should match a subscribed address? | 15:27 |
fungi | corvus: for handled ("caught") bounces yes | 15:36 |
fungi | uncaught bounce notifications come with the following preface: | 15:37 |
fungi | "The attached message was received as a bounce, but either the bounce format was not recognized, or no member addresses could be extracted from it. This mailing list has been configured to send all unrecognized bounce messages to the list administrator(s)." | 15:37 |
fungi | interestingly, the "to" address in the ndr is <openstack-discuss-bounces+someuser=somedomain@lists.openstack.org> where someuser@somedomain does correspond to a list subscriber, so it's unclear why mailman is unable to map them back | 15:40 |
fungi | they seem to all be redhat.com or linux.vnet.ibm.com addresses which were at some point disabled or deleted at the receiving mta, but the way they're being bounced back seems to confound mailman's bounce processing | 15:43 |
corvus | fungi: yeah, that's the part that's weird. the "to" verp should be authoritative. | 15:44 |
corvus | i would only expect that message in response to a message to <openstack-discuss-bounces@lists.o.o>, not to one with a verp. | 15:44 |
corvus | makes me wonder if the verp bounces are not being delivered to mm in a way that it understands they are verp bounces. | 15:45 |
fungi | but it definitely is incrementing bounce counts for other subscribers, just not these | 15:45 |
corvus | fungi: or the "to" address does not match the envelope-to (rcpt to) | 15:46 |
fungi | it seems to though, expanding the full headers of the attached message/rfc822 part | 15:48 |
fungi | Received: from ... by lists01.opendev.org .. for openstack-discuss-bounces+someuser=somedomain@lists.openstack.org | 15:49 |
opendevreview | Marios Andreou proposed opendev/irc-meetings master: Update Watcher team meeting information https://review.opendev.org/c/opendev/irc-meetings/+/935806 | 15:59 |
fungi | https://docs.mailman3.org/projects/mailman/en/latest/src/mailman/model/docs/bounce.html has a fairly detailed explanation of how bounce processing works, but unfortunately doesn't explain why these specific bounces weren't caught | 16:01 |
corvus | maybe something in the logs? | 16:02 |
corvus | 0xb1/0xff = blob delete 69% complete | 16:03 |
fungi | yeah, i'm poring through /var/lib/mailman/core/var/logs/bounce.log currently | 16:03 |
fungi | lots of entries for "VERPed bounce message but not a recognized DSN" which seem to correspond | 16:03 |
fungi | looks like mailman tries to suss out whether the bounce was a temporary or permanent condition from the ndr text, and https://gitlab.com/mailman/mailman/-/merge_requests/913 switched things so that if it can't figure it out then it assumes it's non-permanent | 16:07 |
fungi | trying to work out whether flufl.bounce has an accessible vcs presence somewhere | 16:12 |
corvus | i wouldn't expect a bounce to represent a temporary condition; i'd expect that to cause our local mta to queue. iow, shouldn't any verp bounce be considered permanent? | 16:13 |
fungi | apparently some e.g. vacation autoresponders reply to the verp address | 16:13 |
fungi | hence the above mr | 16:13 |
corvus | of course they do. | 16:14 |
fungi | https://gitlab.com/warsaw/flufl.bounce/-/tree/master/flufl/bounce/_detectors?ref_type=heads | 16:14 |
fungi | that's where the various dsn matchers live | 16:14 |
corvus | cynical-corvus says score them as bounces anyway; autoresponders shouldn't respond to lists at all. | 16:15 |
fungi | yeah, per https://www.rfc-editor.org/rfc/rfc5230.html#section-4.6%3E | 16:16 |
fungi | er, https://www.rfc-editor.org/rfc/rfc5230.html#section-4.6 | 16:16 |
fungi | Implementations SHOULD NOT respond to any message that contains a "List-Id" ... | 16:17 |
fungi | anyway, it looks like people do submit patters for additional dsn formats at https://gitlab.com/warsaw/flufl.bounce/-/merge_requests?scope=all&state=all | 16:20 |
fungi | s/patters/patterns/ | 16:20 |
fungi | e.g. https://gitlab.com/warsaw/flufl.bounce/-/merge_requests/15/diffs | 16:21 |
JayF | I had more unread emails at one time in my inbox than I have for a decade, thanks bounce-processor-email-thingy /s :P hehe | 16:23 |
* JayF has set up a filter but it's just funny | 16:23 | |
fungi | i guess we could even shoehorn temporary flufl.bounce patches into our mailman container image builds and then push an mr once we see they're working | 16:23 |
JayF | I'm very sad to learn that is a pypi package and not a domain with a fun TLD | 16:24 |
fungi | at least this explains why we're seeing them from specific domains and not others | 16:25 |
corvus | the more i think about it, the more i'm convinced it was just a wrong path to go down. it neuters the point of verp. if there's an option to enable rfc-compliant bounce processing at the cost of vacation bounces increasing the score, i would be in favor of that. | 16:30 |
corvus | another argument for that: ^ vacation responses should only add one point, and that should roll off. if we get 5 vacation response bounces, then, honestly, seriously, that address should be removed from the list. | 16:30 |
fungi | yeah, i guess an option to toggle https://gitlab.com/mailman/mailman/-/blob/master/src/mailman/runners/bounce.py?ref_type=heads#L68 could do that, then it at least still comes with some logging and notification of those mismatches | 16:37 |
corvus | i can't believe they changed that with no accompanying toggle. i mean, i'm just waking up from a long slumber, but it seems to me that single-handedly (and i mean single-handedly -- where's the code review on that change?) undid mailman's near-perfect bounce handling (that it has had for decades!) | 16:40 |
corvus | that change is just wrong. | 16:41 |
clarkb | catching up: do we think the bounces we're getting are vacation bounces, so are handled "properly" according to mm3 rules? Or are they just caught up in buggy processing related to that? | 16:42 |
corvus | no they're caught in buggy processing | 16:43 |
corvus | the whole point of verp was that parsing every dsn any mta on the internet can produce is a losing proposition. it would never be complete. but with verp you don't need to do that. | 16:43 |
corvus | but this change has basically said "verp is not enough to detect a bounce, we still have to process the dsn". | 16:43 |
clarkb | ah | 16:43 |
corvus | i think what it effectively does is say that in order to remove someone from a list, we have to in all circumstances, recognize that a dsn says there is a permanent failure. then verp can be used to get the address so that we don't have to textually extract the address from the dsn. | 16:45 |
fungi | these are dsn messages from mimecast, quobyte, and similar third-party hosting/filtering services stating that the account is disabled or does not exist | 16:46 |
corvus | so basically, instead of "verp means we got a permanent bounce and can remove the member, full-stop" we now have "verp is a helper so that we don't have to fully parse a dsn, but we still have to recognize what it's trying to tell us". | 16:46 |
fungi | and yeah, they don't match the recognized patterns in flufl.bounce so mailman is erring on the side of (excessive) caution in not incrementing the bounce score and instead forwarding the dsn to the list owner | 16:47 |
corvus | if they want to do this, they should probably just call out to claude and ask if it's a permanent error. | 16:50 |
corvus | that's the only way this doesn't turn into whack-a-mole. which is exactly what the world of list processing was before verp. | 16:50 |
opendevreview | Clark Boylan proposed opendev/system-config master: Upgrade and reboot test nodes before openafs installation https://review.opendev.org/c/opendev/system-config/+/935812 | 17:08 |
opendevreview | Merged zuul/zuul-jobs master: Cap the ansible version used by ansible-lint https://review.opendev.org/c/zuul/zuul-jobs/+/935726 | 17:11 |
clarkb | the above change and remote: https://review.opendev.org/c/openstack/openstack-zuul-jobs/+/935813 Update system packages and reboot when building centos openafs should hopefully make openafs arm64 package builds on centos less flaky | 17:13 |
opendevreview | Clark Boylan proposed opendev/system-config master: Upgrade and reboot test nodes before openafs installation https://review.opendev.org/c/opendev/system-config/+/935812 | 17:17 |
opendevreview | Merged zuul/zuul-jobs master: Support new style mirror_info in use-docker-mirror https://review.opendev.org/c/zuul/zuul-jobs/+/935722 | 17:30 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile https://review.opendev.org/c/openstack/diskimage-builder/+/923985 | 17:33 |
clarkb | fungi: I think https://zuul.opendev.org/t/openstack/build/0a5188f5db5c435eb2dcc4039c29ab5a/console might show that the rpm dkms packaging for openafs is kernel specific? | 19:05 |
clarkb | whereas ubuntu/debian packaging is more flexible? | 19:05 |
clarkb | infra-root anyone else want to weigh in on https://review.opendev.org/c/openstack/project-config/+/935725 to disable use of docker proxy caching so that we can hopefully get more reliable use of docker again? | 19:28 |
clarkb | the depends on has merged | 19:28 |
clarkb | fungi: since package upgrades and rebooting may not fix things for openafs package builds/installs do we want to go ahead and start a new arm64 centos 9 image build and/or make that job non voting? | 19:33 |
clarkb | mostly I don't want to get hung up on this problem since it is somewhat orthogonal and has been ongoing. I'd rather we find a way forward | 19:33 |
fungi | sorry, pulled in too many directions at once. i don't think the noble openafs upgrades are super urgent, so if just waiting a little longer to let things work themselves out is an option, i'm in favor | 19:52 |
clarkb | ok I'll trigger an arm64 image rebuild now | 19:57 |
clarkb | more generally though it has been frustrating that anytime we trigger those jobs we're forced to wait for an image rebuild. I think I would be in favor of making that specific job nonvoting | 19:58 |
fungi | i would +2 that change | 19:59 |
clarkb | we are currently building an ubuntu image so may be a while to get through whatever build requests are queued up too | 20:00 |
* clarkb will push that up | 20:00 | |
opendevreview | Clark Boylan proposed opendev/system-config master: Make system-config-zuul-role-integration-centos-9-stream-arm64 nonvoting https://review.opendev.org/c/opendev/system-config/+/935828 | 20:04 |
opendevreview | Jay Faulkner proposed openstack/diskimage-builder master: [gentoo] Fix+Update CI for 23.0 profile https://review.opendev.org/c/openstack/diskimage-builder/+/923985 | 20:09 |
clarkb | fungi: any opinion on https://review.opendev.org/c/openstack/project-config/+/935725 ? I'm hoping that will make things less likely to fail when doing anything with docker | 20:34 |
fungi | lgtm, depends-on has already merged | 20:36 |
clarkb | course now that I've said that I have to climb up a roof to fix a leak | 20:37 |
clarkb | if it goes sideways feel free to revert quickly :) otherwise I'll check in when I can | 20:37 |
fungi | sure, i'm around, and not getting on a roof in the coming hours (afaik anyway) | 20:38 |
opendevreview | Merged openstack/project-config master: Disable docker hub mirror use in jobs https://review.opendev.org/c/openstack/project-config/+/935725 | 20:47 |
Clark[m] | Any sense yet if ^ is helping (or at least doesn't make it worse)? I need to turn laptop on and recheck some changes | 21:46 |
fungi | i've seen no complaints yet, though it's rather soon | 21:48 |
Clark[m] | There is a lodgeit change to recheck and corvus' nodepool rix | 21:49 |
Clark[m] | *fix | 21:49 |
clarkb | https://review.opendev.org/c/opendev/lodgeit/+/935712/3 and https://review.opendev.org/c/zuul/nodepool/+/935820 and https://review.opendev.org/c/zuul/zuul-jobs/+/849989 have been rechecked | 21:56 |
clarkb | I see a bug | 22:10 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Fix dockerhub check in use-docker-mirror role https://review.opendev.org/c/zuul/zuul-jobs/+/935837 | 22:14 |
clarkb | infra-root corvus fyi ^ | 22:14 |
clarkb | saw that here: https://zuul.opendev.org/t/zuul/build/e9afc70473a142428cc8c4594319b80f | 22:14 |
corvus | clarkb: aprvd | 22:22 |
clarkb | thanks | 22:24 |
ianw | re: 935812,2 there's two boots, but both seem to be into 5.14.0-529.el9.x86_64 (https://753af432ac11edc3fa55-e24395b1f226ab7bf437be1dd808e069.ssl.cf1.rackcdn.com/935812/2/check/system-config-zuul-role-integration-centos-9-stream/4560d78/messages.txt) | 22:42 |
ianw | which is actually the latest anyway -> https://mirror.stream.centos.org/9-stream/BaseOS/x86_64/os/Packages/kernel-5.14.0-529.el9.x86_64.rpm | 22:44 |
ianw | ohhhh, this is x86, where the image is up to date | 22:44 |
ianw | ok, on arm64 it seems like both boots were into 527 https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_4b7/935812/2/check/system-config-zuul-role-integration-centos-9-stream-arm64/4b7c3f8/messages.txt | 22:45 |
ianw | 2024-11-20 19:06:56.696207 | TASK [DNF Update] | 22:46 |
ianw | 2024-11-20 19:06:58.818939 | base | ok: Nothing to do | 22:46 |
ianw | seems unlikely | 22:46 |
ianw | https://mirror.iad.rax.opendev.org/centos-stream/9-stream/BaseOS/aarch64/os/Packages/kernel-5.14.0-529.el9.aarch64.rpm exists | 22:50 |
clarkb | oh weird | 22:59 |
opendevreview | Ian Wienand proposed opendev/system-config master: Upgrade and reboot test nodes before openafs installation https://review.opendev.org/c/opendev/system-config/+/935812 | 23:01 |
ianw | try with some debugging | 23:01 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Fix dockerhub check in use-docker-mirror role https://review.opendev.org/c/zuul/zuul-jobs/+/935837 | 23:06 |
clarkb | corvus: ^ yaml got me there | 23:06 |
clarkb | having a leading quote but then not quoting the entire string made the yaml parsing sad | 23:06 |
ianw | clarkb: it looks like it does see the updated packages, and i think it likely installs them @ https://zuul.opendev.org/t/openstack/build/d2751de1692e48bdb0eb2c6a5256fac4/console (roles-test/pre.yaml) | 23:18 |
ianw | but i suspect it's perhaps not booting into the new kernel :/ | 23:18 |
ianw | as for Failed to download packages: krb5-pkinit-1.21.1-4.el9.x86_64: Cannot download, all mirrors were already tried without success | 23:22 |
ianw | https://mirror.iad.rax.opendev.org/centos-stream/9-stream/BaseOS/x86_64/os/Packages/krb5-pkinit-1.21.1-4.el9.x86_64.rpm | 23:23 |
clarkb | I wonder if something about uefi and grub and gpt isn't happy in those images? | 23:24 |
clarkb | or maybe kvm is bypassing grub and picking a kernel for us? | 23:25 |
opendevreview | Merged zuul/zuul-jobs master: Fix dockerhub check in use-docker-mirror role https://review.opendev.org/c/zuul/zuul-jobs/+/935837 | 23:25 |
ianw | yeah it's kind of intersection of dnf/grub/disk-image-builder ... i don't have an immediate answer | 23:29 |
ianw | nor why that 1.21.1-4 package would fail to download. it's definitely in the mirror that test ran in (https://mirror.sjc3.raxflex.opendev.org/centos-stream/9-stream/BaseOS/x86_64/os/Packages/) | 23:30 |
ianw | unfortunately any useful debugging has been swallowed :/ | 23:30 |
clarkb | the internet suggests it could be related to bootloaderspec | 23:44 |
clarkb | basically grub doesn't update if you enable bootloaderspec | 23:44 |
clarkb | and at some point fedora went to only use bootloaderspec. Maybe centos 9 stream is similar? | 23:45 |
clarkb | looks like maybe people are complaining about that in rocky linux too so ya maybe this is related | 23:45 |
clarkb | I suspect we're using grub because dib uses grub but then dnf doesn't touch grub because of bls? maybe we should do a grub mkconfig and call it a day | 23:46 |
clarkb | looks like current centos builds fail on dns lookups for opendev.org when caching git repos... I was looking there to find paths for grub-mkconfig | 23:47 |
clarkb | `grub2-mkconfig --update-bls-cmdline -o /boot/grub2/grub.cfg` appears to be what we're running in image builds. I'll do that in ansible and see if that helps | 23:48 |
ianw | ... ohhh ... unbound issues? that would explain that download error | 23:50 |
clarkb | ianw: the dns lookup issue was on nb04 but ya that should run an unbound too | 23:51 |
clarkb | and maybe it's similar problems from that test node | 23:51 |
ianw | ++ on testing that ... grub+efi+arm64+dib images == basically a black box for me :) | 23:51 |
opendevreview | Clark Boylan proposed opendev/system-config master: Upgrade and reboot test nodes before openafs installation https://review.opendev.org/c/opendev/system-config/+/935812 | 23:52 |
clarkb | I think maybe ns04 is not resolving things | 23:53 |
clarkb | now to figure out hostkey things from sshfp records since I'm on a non home network on a laptop that hasn't ssh'd there before | 23:54 |
clarkb | #status log Restarted nsd on ns04 | 23:57 |
opendevstatus | clarkb: finished logging | 23:58 |
clarkb | it looks like maybe nsd is trying to startup and bind to the external address before it is available on the system post boot | 23:59 |
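
For readers following the VERP discussion from 15:40 onward: the point corvus makes is that the envelope recipient of a bounce already encodes the subscriber, so the address can be recovered without parsing the DSN body. A minimal sketch of that decoding follows, assuming the `<list>-bounces+<local>=<domain>@<host>` layout seen in the address fungi quoted. The function name `decode_verp` and the regular expression are illustrative only; this is not Mailman's or flufl.bounce's actual implementation.

```python
import re

# Assumed VERP layout, inferred from the address quoted in the log:
#   <list>-bounces+<local>=<domain>@<list host>
# The last "=" before the "@" separates the subscriber's local part
# from the subscriber's domain.
VERP_RE = re.compile(
    r"^(?P<list>[^+@]+)-bounces\+(?P<local>[^@]+)=(?P<domain>[^=@]+)@(?P<host>[^@]+)$"
)


def decode_verp(rcpt_to):
    """Return (list_name, subscriber_address), or None for a non-VERP address.

    Hypothetical helper for illustration; not part of any real library.
    """
    match = VERP_RE.match(rcpt_to.strip().strip("<>"))
    if match is None:
        return None
    subscriber = "{}@{}".format(match.group("local"), match.group("domain"))
    return match.group("list"), subscriber


if __name__ == "__main__":
    print(decode_verp(
        "<openstack-discuss-bounces+someuser=somedomain@lists.openstack.org>"
    ))
    # A plain -bounces address carries no subscriber information.
    print(decode_verp("openstack-discuss-bounces@lists.openstack.org"))
```

Under this reading, a message arriving at a VERP address identifies the subscriber on its own. What the log describes is that Mailman additionally wants the DSN body to match a known pattern before it increments the bounce score, which is why these notifications end up with the list owner.
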