Tuesday, 2024-09-17

<kevko> Good morning  07:24
<kevko> I would like to ask what this 'Ready to submit' is on https://review.opendev.org/c/openstack/kolla-ansible/+/928619 .. it seems the arm job is blocking it from merging? Is it something new?  07:24
<SvenKieske> looks like a gerrit bug to me, weird.  07:45
<tobias-urdin> kevko: perhaps it's that your chain is based on patchset 3 of the parent patch but patchset 4 is what merged; a simple rebase and approve again?  07:50
<kevko> tobias-urdin: Shouldn't it be irrelevant? Even if it's in the relation chain, it doesn't matter; the two patch sets don't modify the same lines of code or anything like that.  07:55
<kevko> tobias-urdin: I rebased and will see, thank you for your advice for now  08:00
<opendevreview> Tyler proposed openstack/diskimage-builder master: Fall back to extract-image on ubuntu build  https://review.opendev.org/c/openstack/diskimage-builder/+/926748  09:31
<stephenfin> clarkb: fungi: Would one of you be able to delete the pyparsing-update branch from openstack/cliff? I tried via the web UI but don't see any delete buttons (on https://review.opendev.org/admin/repos/openstack/cliff,branches)  11:04
*** dhill is now known as Guest3724  12:07
<fungi> kevko: what does matter is whether the parent commit id exists on the target branch. if the parent was changed and the child was not rebased onto the new commit, there is no way to merge it directly to the branch (this is a git fundamental even, nothing to do with gerrit)  12:56
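To illustrate fungi's point: a child change can only be submitted directly when its parent commit is already reachable from the target branch. A minimal sketch of that underlying git check, assuming a local clone at a hypothetical path (this is not how Gerrit performs it internally, just the git-level test):

    import subprocess

    def parent_is_on_branch(repo, child_ref="HEAD", branch="origin/master"):
        # Resolve the parent commit of the child change.
        parent = subprocess.check_output(
            ["git", "-C", repo, "rev-parse", f"{child_ref}^"], text=True).strip()
        # merge-base --is-ancestor exits 0 when the parent is reachable from the branch tip.
        result = subprocess.run(
            ["git", "-C", repo, "merge-base", "--is-ancestor", parent, branch])
        return result.returncode == 0

    # False here means the child needs a rebase onto the parent's merged revision first.
    print(parent_is_on_branch("/tmp/kolla-ansible"))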
<fungi> stephenfin: branch deletion is a permission which can be granted through the repository's acl, similar to branch creation. i can clean up that branch in a few minutes if there are no open changes still targeting it  12:58
<stephenfin> fungi: okay, makes sense. Thanks  13:20
<fungi> #status log Deleted the pyparsing-update branch from openstack/cliff (formerly 993972982739b2db3028278cb4d99be2a713d09c) at Stephen Finucane's request  14:10
<opendevstatus> fungi: finished logging  14:10
<fungi> stephenfin: ^  14:10
<stephenfin> ty :)  14:10
<fungi> yw  14:10
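For reference, the deletion fungi performed can also be done through Gerrit's REST API once the ACL grants the delete permission. A rough sketch, assuming HTTP-password credentials (the username and password below are placeholders):

    import requests
    from urllib.parse import quote

    GERRIT = "https://review.opendev.org"

    def delete_branch(project, branch, user, http_password):
        # DELETE /a/projects/{project}/branches/{branch}; names must be URL-encoded.
        url = (f"{GERRIT}/a/projects/{quote(project, safe='')}"
               f"/branches/{quote(branch, safe='')}")
        resp = requests.delete(url, auth=(user, http_password))
        resp.raise_for_status()  # Gerrit answers 204 No Content on success

    # delete_branch("openstack/cliff", "pyparsing-update", "myuser", "my-http-password")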
<frickler> did anyone else see this bunch of mails from our servers with no recipients?  14:22
<fungi> to root/aliases i guess?  14:22
<fungi> oh, ndr messages  14:23
<frickler> "Subject: Mail failure - no recipient addresses"  14:25
<fungi> looks like they started today around 06:02 utc and continued up until 07:08 utc  14:25
<fungi> looking at the one from paste01, this is what was logged by exim:  14:27
<fungi> 2024-09-17 06:02:35 1sqRIJ-009xlk-UX 1sqRIJ-009xlk-UX no recipients found in headers  14:27
<fungi> 2024-09-17 06:02:35 1sqRIJ-009xln-Uz <= <> R=1sqRIJ-009xlk-UX U=Debian-exim P=local S=776  14:28
<fungi> syslog has a traceback at that time from a process initiated by apt.systemd.daily  14:32
<fungi> ImportError: cannot import name 'HeaderWriteError' from 'email.errors' (/usr/lib/python3.8/email/errors.py)  14:33
<fungi> just after that we have:  14:33
<fungi> Sep 17 06:02:36 paste01 systemd[1]: apt-daily-upgrade.service: Succeeded.  14:33
<fungi> so it looks like something's broken in the e-mail notification part of unattended-upgrade  14:34
<fungi> there were python upgrades which happened in this batch, so maybe a race condition, maybe a regression  14:35
<fungi> i guess we should wait and see if it happens again  14:35
<frickler> ah, right, you can see the full traceback in the journal. likely related to the python upgrades I agree  14:40
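The traceback is consistent with a long-running process whose interpreter no longer matches the stdlib files apt just replaced on disk: HeaderWriteError only exists in newer email packages. A minimal sketch of the kind of defensive import that would avoid dying on the missing name (illustrative only, not the unattended-upgrades code):

    try:
        from email.errors import HeaderWriteError
    except ImportError:
        # Older email package still loaded (or files swapped mid-run); define a
        # stand-in so callers can still catch header-serialization failures.
        class HeaderWriteError(Exception):
            pass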
<johnsom> clarkb I can launch test jobs for those nested virt clouds, but I need to know how to target them. That patch looks like the same labels we normally use....  16:16
<clarkb> johnsom: yes it's the same label. You'll run the jobs and see if they work or not  16:16
<clarkb> adding extra labels is unfortunately a big maintenance burden (it's really hard to remove them later) so I want to avoid that if we can  16:17
<johnsom> Ok, so our normal runs "may" land on those hosts. Got it  16:17
<clarkb> yup  16:17
<clarkb> really the only thing we're concerned about is whether the nested virt is particularly unstable since jobs generally work in those regions  16:18
<johnsom> Sure,  16:18
<johnsom> I will keep an eye out and run a few DNM jobs  16:18
<johnsom> clarkb Looking at a patch I posted last night, both openmetal and raxflex instances have worked fine.  16:24
<opendevreview> Merged opendev/zuul-jobs master: Copy DIB elements from project-config  https://review.opendev.org/c/opendev/zuul-jobs/+/929140  16:24
<clarkb> thank you for confirming. Both clouds indicated they expected it to work so good to confirm it  16:25
<opendevreview> Merged opendev/zuul-jobs master: Convert python2 dib element scripts to python3  https://review.opendev.org/c/opendev/zuul-jobs/+/929168  16:25
<corvus> https://zuul.opendev.org/t/opendev/build/abadb70bd617424098f3d2fe3ac95bea/log/job-output.txt#5860  17:17
<corvus> interesting -- diskimage build failed on trying to get the zanata archive  17:17
<clarkb> corvus: I don't get a 403 from that now, could just be a momentary failure?  17:18
<corvus> same  17:18
<corvus> maybe that dib element should retry  17:19
<corvus> (i suspect this process may make our build errors more visible :)  17:19
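A sketch of the retry corvus suggests, assuming the element fetches the archive over HTTP (the real element likely shells out to curl, and the URL below is just a placeholder):

    import time
    import urllib.request

    def fetch_with_retries(url, dest, attempts=3, delay=10):
        for attempt in range(1, attempts + 1):
            try:
                urllib.request.urlretrieve(url, dest)
                return
            except OSError as exc:  # HTTPError/URLError are OSError subclasses
                if attempt == attempts:
                    raise
                print(f"attempt {attempt} failed ({exc}); retrying in {delay}s")
                time.sleep(delay)

    # fetch_with_retries("https://example.org/zanata-cli.tgz", "/tmp/zanata-cli.tgz")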
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Collection microk8s inspect info in k8s log collection role  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  17:30
<frickler> we likely do not notice most one-off dib failures currently, yes. looking at https://grafana.opendev.org/d/f3089338b3/nodepool3a-dib-status?orgId=1 I also note that a) arm64 builds look to be failing according to this and b) some old images could get removed; maybe new ones are missing, too?  17:35
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Collection microk8s inspect info in k8s log collection role  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  17:35
<clarkb> frickler: yes the dashboard could be updated to drop old images. re arm64: are they failing again? That is unfortunate. If I had to guess, we're using all the loopback devices again maybe, but I haven't looked  17:37
<frickler> clarkb: looks like it: failed to set up loop device: No such file or directory  17:39
<clarkb> ya so for whatever reason we seem to leak those more doing arm builds. One hunch I had is that the arm builds take much much longer so are more likely to be interrupted by updated nodepool container images that restart the containers. But I don't think we've had a nodepool container update in a while so that may not be it  17:39
<frickler> "Up 2 weeks", so not recently  17:42
<frickler> checking now when the last successful build was and whether we might still have the log from that and the first failure  17:44
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  17:45
<frickler> centos-9-stream-arm64-fe857118d17141abaed43b10ed7fbd64 ready 03:11:53:43 - that seems to be the most recent one  17:46
<clarkb> but the leaks can happen slowly since the last boot so it might be one a day or whatever  17:46
<clarkb> but having the container running for 2 weeks implies it isn't container restarts that are the primary cause  17:47
<frickler> looks like openeuler failed after the c9s build with a different issue. and the rocky9 build after that then shows the same loop device failure as above  17:49
<frickler> so my suggestion would be to pause openeuler arm64 and clean up the server and try again, but I won't have time to do that myself this week  17:50
<clarkb> ya the loopback is only used when compiling the final image together from the on disk stuff  17:50
<clarkb> if the on disk stuff fails early we don't hit the loopback issue  17:50
<clarkb> and ya maybe openeuler is failing in such a way that we leak loopback setup or something  17:51
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  17:56
<frickler> the second to last line in https://nb04.opendev.org/openEuler-22-03-LTS-arm64-5de13663a7fa45ebaf1df171d62ad94c.log is "LOOPDEV=/dev/loop7", so I was assuming that to be what's happening, yes. I haven't been able to see what the actual failure is, though, maybe just a timeout?  18:00
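If leaked loop devices are the culprit, one way to spot them on the builder is to list devices whose backing file has been deleted and detach them. A rough sketch using util-linux losetup via subprocess; running something like this blindly could detach a device an in-progress build still needs, so treat it as illustrative only:

    import json
    import subprocess

    def leaked_loop_devices():
        out = subprocess.check_output(["losetup", "--json", "--list"], text=True)
        devices = json.loads(out).get("loopdevices", []) if out.strip() else []
        # A "(deleted)" backing file usually means the build that created the
        # device is gone but the device was never detached.
        return [d["name"] for d in devices if "(deleted)" in (d.get("back-file") or "")]

    for dev in leaked_loop_devices():
        subprocess.run(["losetup", "--detach", dev], check=False)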
<opendevreview> Merged opendev/zuul-jobs master: Add debian-bullseye image  https://review.opendev.org/c/opendev/zuul-jobs/+/929141  18:01
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  18:15
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  18:27
<fungi> NeilHanlon: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/mirror-update/files/ in case you are looking for script inspiration  19:55
<NeilHanlon> thanks!  20:00
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  21:50
<opendevreview> Clark Boylan proposed zuul/zuul-jobs master: Bump the default ensure-kubernetes microk8s version to 1.31/stable  https://review.opendev.org/c/zuul/zuul-jobs/+/929689  22:05
<opendevreview> Merged openstack/diskimage-builder master: Adapt to upstream CentOS Stream mirror changes  https://review.opendev.org/c/openstack/diskimage-builder/+/922352  23:02
<opendevreview> Merged openstack/diskimage-builder master: dib-functests: run on bookworm  https://review.opendev.org/c/openstack/diskimage-builder/+/922697  23:13
<opendevreview> Merged openstack/diskimage-builder master: Drop logic for old python versions  https://review.opendev.org/c/openstack/diskimage-builder/+/927781  23:59
