Tuesday, 2022-04-26

clarkbcorvus: should be safe to kill that playbook at least00:01
*** dviroel|rover|afk is now known as dviroel|out00:16
*** rlandy|bbl is now known as rlandy00:34
*** rlandy is now known as rlandy|out00:55
*** ysandeep|out is now known as ysandeep02:12
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Move image validation into extract-image  https://review.opendev.org/c/openstack/diskimage-builder/+/83929403:53
*** ysandeep is now known as ysandeep|afk04:18
fricklerseems we are hitting https://bugs.launchpad.net/ubuntu/+source/reprepro/+bug/1968198 now, sadly there seems to be no indication on how to possibly fix this, except maybe run it on jammy?04:51
fricklerthe mention of zstd means it might be related to the change of compression method that seems to have happened04:51
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Move image validation into extract-image  https://review.opendev.org/c/openstack/diskimage-builder/+/83929405:07
ianwfrickler: sigh, or in a container with similar i guess05:14
ianwideally i guess we'd get jammy nodes and update mirror-update testing before migrating it to a new platform.  although setting up the mirror is a bit chicken-egg with that i guess05:17
fricklerianw: seems to have gotten past this now, error messages about two specific pkgs but now seem to be continuing05:17
fricklerlet's see how the export goes, maybe we have at least a working repo in the end, even if incomplete05:18
fricklerthe other idea would be to start building images without mirror set up, shouldn't be impossible either05:20
frickler2022-04-26 05:32:05  | reprepro completed successfully, running vos release ... let's see how long this takes for ~160G of new stuff05:38
*** ysandeep|afk is now known as ysandeep06:13
*** pojadhav- is now known as pojadhav|pto06:20
ianwit's capped at about 10mbit so i'd say ~3.5 hours06:36
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Move image validation into extract-image  https://review.opendev.org/c/openstack/diskimage-builder/+/83929406:46
*** jpena|off is now known as jpena07:13
*** ysandeep is now known as ysandeep|brb07:25
opendevreviewIan Wienand proposed openstack/diskimage-builder master: centos: avoid head pipe failure  https://review.opendev.org/c/openstack/diskimage-builder/+/83930707:28
*** ysandeep|brb is now known as ysandeep07:30
opendevreviewIan Wienand proposed openstack/diskimage-builder master: Move image validation into extract-image  https://review.opendev.org/c/openstack/diskimage-builder/+/83929407:34
frickler2022-04-26 07:33:16  | Done.08:01
fricklertested from a local jammy instance and seems to work fine, so I've gone ahead and self-approved https://review.opendev.org/c/openstack/project-config/+/83905708:04
opendevreviewMerged openstack/project-config master: Start bulding ubuntu-jammy images  https://review.opendev.org/c/openstack/project-config/+/83905708:15
*** benj_1 is now known as benj_08:18
opendevreviewOleksandr Kozachenko proposed zuul/zuul-jobs master: Add promote-container-image role  https://review.opendev.org/c/zuul/zuul-jobs/+/83891908:56
opendevreviewOleksandr Kozachenko proposed zuul/zuul-jobs master: Add promote-container-image role  https://review.opendev.org/c/zuul/zuul-jobs/+/83891909:35
opendevreviewMerged openstack/diskimage-builder master: Add a job to test building jammy  https://review.opendev.org/c/openstack/diskimage-builder/+/83622809:47
opendevreviewMerged openstack/diskimage-builder master: yum-minimal: clean up release package installs  https://review.opendev.org/c/openstack/diskimage-builder/+/83724809:54
opendevreviewMerged openstack/diskimage-builder master: Move reset-bls-entries to post-install  https://review.opendev.org/c/openstack/diskimage-builder/+/83879210:02
fricklerlooks like hole in one ;) https://nb01.opendev.org/ubuntu-jammy-0000000001.log10:21
*** rlandy|out is now known as rlandy10:24
*** ysandeep is now known as ysandeep|coffee10:37
ianwfrickler: nice work!10:37
*** dviroel|out is now known as dviroel|rover11:09
fungifrickler: what needed to be done for reprepro after i added the new key in 839261?11:17
fungii've started one more run to make sure it finishes clean before i release the file lock11:17
fungiahh, the zstd errors11:17
fungizstd: error 70 : Write error : cannot write decoded block : Broken pipe11:18
fungidid that turn out to be benign then?11:18
*** ysandeep|coffee is now known as ysandeep11:28
fricklerfungi: at least non-fatal, I didn't touch anything, just watched it run and finish somehow11:29
fricklermaybe the repo is now missing some pkgs, but it went well enough to build an image11:29
fricklerI'm hoping the misses will only affect jammy and not existing distros, so I didn't check in more detail11:30
fricklerhttps://review.opendev.org/c/openstack/devstack/+/839359 "Zuul: Unknown configuration error". I guess I missed step to add jammy to the launchers, but zuul/nodepool might be able to provide a more helpful error message?11:31
frickler*the step11:31
fungilooks like there are a lot of conditionals in https://opendev.org/zuul/zuul/src/branch/master/zuul/manager/__init__.py#L998-L1120 and that exception is a fall-through at the end11:35
fungii expect the "error in dynamic layout" exception is masking the catch-all exception raised in the try block11:38
fricklerfungi: corvus: that's the traceback from zuul log https://paste.opendev.org/show/bPl0n9kjkby3gPoLTNHd/11:47
funginot what i expected at all11:49
fungii really don't see where it's coming from, maybe a regression in 836086 (Fix implied branch matchers with regex chars)11:53
opendevreviewDr. Jens Harbott proposed openstack/project-config master: Start launching Jammy images  https://review.opendev.org/c/openstack/project-config/+/83936611:54
*** rlandy is now known as rlandy|mtg11:58
fungilooking at a recent build of a similar job, we do indeed have regex branch matchers inherited from common parents: https://zuul.opendev.org/t/openstack/build/6eae56a2d3354a6e86efedde2819eec3/log/zuul-info/inventory.yaml#278-27912:00
fungiokay, the ubuntu mirror update ran clean again, so i've released the lock and exited the screen session13:33
Clark[m]fungi: did it have the same errors as before?13:38
fungiif by "same errors" you mean the zstd broken pipe, yes13:39
fungiprobably we need newer libzstd to get around that13:40
Clark[m]Ya that. The lp bug I linked indicated some fixing was attempted but apparently incomplete13:40
fungibut so far there's no indication that it's an actual problem, certainly wasn't fatal from reprepro's perspective anyway13:40
Clark[m]Thank you frickler for pushing this along and getting it done. I should make note to add a last minute agenda item to discuss afs disk use and planning for jammy ports etc13:42
fungiClark[m]: i think 839366 would be the next step once you're settled in for the morning, already has my +213:45
Clark[m]Will look but will be a bit as kids need to be deposited at school before I'm at a proper computer13:47
fungisure, take your time!13:48
fricklerfungi: I verified that the config error only comes from the unknown label and is the same with a completely bogus label. I'll do some further testing locally to see if this is easily reproducible. unless maybe corvus wakes up and at once sees the solution ;)14:15
fungifrickler: good investigating, so it sounds like the original configuration error is probably tripping an untested code path related to branch matchers. probably a recent regression or we'd have already seen it at some point14:23
fricklerfungi: it also seems to need a more complex setup, all my easy test cases produce the expected NODE_FAILURE so far14:26
fricklerah, or maybe if it is very recent I should try running latest zuul version, too14:27
fungidid you test with inheriting from a parent with a regex branch matcher?14:27
fungiand yes, i have a feeling it might be a regression in 836086 (Fix implied branch matchers with regex chars) which would need 5.2.3 or later, but it could be something even more recent14:29
*** ysandeep is now known as ysandeep|afk14:36
*** ysandeep|afk is now known as ysandeep14:59
*** pojadhav|pto is now known as pojadhav15:03
*** lbragstad8 is now known as lbragstad15:05
clarkbfungi: frickler  I've approved 839366. We'll want to check that they can boot in all the clouds once up.15:22
fungiwhich we can do with rechecking the devstack change a bunch15:26
corvusfrickler: fungi ack; i'll look at that later today15:27
fungithanks corvus!15:28
opendevreviewMerged openstack/project-config master: Start launching Jammy images  https://review.opendev.org/c/openstack/project-config/+/83936615:32
fricklercorvus: fungi: the issue actually is triggered by the mismatch in the nodeset name I had in combination with the implied-branches pragma. reproduced that locally with zuul:latest, will try to bisect15:54
fricklermeh, doing too many zuul restarts in a row doesn't work well with github rate limits it seems16:05
*** dviroel|rover is now known as dviroel|rover|lunch16:11
*** marios is now known as marios|out16:12
*** rlandy|mtg is now known as rlandy16:14
*** ysandeep is now known as ysandeep|out16:17
*** jpena is now known as jpena|off16:17
fricklerso it seems I get the issue with 5.2.4 but not with 5.2.316:21
clarkbI've somehow failed to eat breakfast and also flail around without getting much done /me resets the morning and finds food16:25
fungifrickler: thanks, so i guess it's not 836086 after all since that merged prior to 5.2.316:31
frickleractually my test was wrong. 5.2.2 to 5.2.3 is the broken step16:33
fungiin that case 836086 seems increasingly likely to be the culprit (79c6717cea5abad479af98f40eb260472ef94648)16:39
clarkbinfra-root https://review.opendev.org/c/opendev/system-config/+/839250 and child are further gerrit 3.5 prep work16:50
clarkband https://review.opendev.org/q/topic:retire-elk has a bunch of elk related config management retirements now that the ELK servers are gone16:51
clarkbreviews much appreciated16:51
*** dviroel|rover|lunch is now known as dviroel|rover17:02
fricklerfungi: looking at that patch I tend to agree, but I'll leave it to corvus to carve out the details17:11
fricklerseems the jammy mirror isn't doing as well as I hoped, I had only tested the jammy tree, not -updates or -security https://zuul.opendev.org/t/openstack/build/ae536d3e855b48f99a6df7ad174849bb/log/job-output.txt#650-69817:25
* frickler afks for a bit17:25
clarkblooks like they both contain actual content at least17:27
clarkband looking at this more closely it seems we may be mirroring source packages?17:28
clarkbI wonder if we can clean those up to reduce the size of our ubuntu mirrors?17:29
clarkbya there are tar.xz's in the pool which I think are the source packages17:30
fungimaybe there's a flag to tell reprepro not to, though it will certainly render any deb-src lines in a sources.list broken17:30
clarkbhrm we do list deb-src in the configure-mirrors role17:31
fungiprobably not?17:31
clarkbno we do17:31
fungithough if any jobs add some, they'll become broken... probably safe to assume no17:31
clarkbright they'd break when apt-get updating17:31
clarkbsince that will try to pull the index at least. And we seem to set it in the role for everything unfortunately17:32
clarkbbut maybe we stop doing that and then we can clear out the source packages17:32
clarkband then separately we have https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/templates/apt/etc/apt/sources.list.j217:33
clarkbI wonder which is used? but we should probably remove the deb-src lines either way if doing this17:33
clarkbit does seem overkill to have a CI system care about source packages unless building packages, but in that case we probably don't want to be looking at our mirrors for the sources17:34
fungianyway, configure-mirrors is the only real risk i find via codesearch17:34
fungiall other uses of deb-src seem to be with specific (non-mirror) package repos17:35
clarkbI've got a note to talk about afs disk usage in the meeting (it didn't make the agenda because I realized ports were a problem too late)17:35
clarkbwe can discuss this option along with the others I've identified17:35
clarkbfungi: ubuntu uses https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/templates/apt/etc/apt/sources.list.j2 and debian uses the files in sources.list.d for maximum confusion17:38
clarkbbut I think that makes this even less risky as only debian jobs would expect deb-src currently17:38
clarkb(and their number is fewer)17:38
clarkbI wonder why we split them like that17:38
fungii would wager there's no conscious decision behind the difference17:43
clarkbthe fact that source packages use xz likely indicates we won't see as large a savings as we might hope for. But considering we did similar cleanup to the rpm distros this seems reasonable17:44
clarkband once you add up ubuntu and ubuntu ports across 3.5 releases the impact may still be quite large17:45
fungicould be in the neighborhood of 10-25%, gut feeling from having profiled package sizes in years past17:45
fungithe question is, does reprepro have an option to exclude them17:46
fungilooks like if we omit the "source" entry from the architectures line in the updates file, it may not pull them17:50
fungialso from the distributions file i guess17:50
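
A minimal sketch of the config change discussed above, assuming the reprepro files carry an `Architectures:` field that lists `source` alongside the binary architectures. The helper name `drop_source_architecture` and the sample stanza are hypothetical, not the actual system-config contents:

```python
def drop_source_architecture(config_text: str) -> str:
    """Return config_text with 'source' removed from every Architectures field."""
    out = []
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "architectures":
            # Keep every listed architecture except the source pseudo-architecture.
            archs = [arch for arch in value.split() if arch != "source"]
            line = f"{key}: {' '.join(archs)}"
        out.append(line)
    return "\n".join(out)


# Hypothetical stanza in the style of a reprepro conf/distributions entry.
SAMPLE = """\
Codename: jammy
Architectures: amd64 i386 source
Components: main universe
"""

if __name__ == "__main__":
    print(drop_source_architecture(SAMPLE))
```

The same edit would apply to both the updates and distributions files; whether reprepro then prunes the already-mirrored source packages on its own is a separate question.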
fungithough there's a good chance we'd have to manually clean up the existing source packages if we dropped them from the configs17:52
clarkbI think it may clean them up due to the same mechanism that clears out old packages?17:55
clarkbit knows if it isn't mirroring something to remove it17:55
fungiyeah, but if we take it out of the configuration, it may not know it's not mirroring it...17:56
clarkbI see17:56
clarkbdo we want to start with a small repo like UCA and test it?17:57
fungiyeah, i can push a simple change for that as an experiment17:57
fungibad news (or good?). we don't have a source architecture configured for uca17:59
fungiare we mirroring source packages for uca?17:59
* fungi checks17:59
opendevreviewMerged opendev/system-config master: Update gerritbot-matrix version to include wipness  https://review.opendev.org/c/opendev/system-config/+/83758017:59
clarkboh we may not be I just assumed we were17:59
fungiour experiment may already be done for us ;)17:59
clarkbwe do for debian-docker, debian, ubuntu, and ubuntu-ports according to our reprepro configs18:00
clarkbdebian-docker is another good experiment since it is small in scope assuming upstream publishes source packages at all18:00
clarkblooks like debian docker has no source packages either18:01
fungino source packages, so i guess that's the right way to do it at least18:01
clarkb++ and I think we can go ahead and remove it from debian docker to reflect reality of no upstream source packages but that won't serve as a test for us18:01
fungithose list source18:02
clarkbyup but no source packages I think because upstream doesn't have them18:02
fungii guess we could do ubuntu-ports, arm jobs are unlikely to hit a problem with it18:04
clarkband I think it's safe for us to do ubuntu as well from a job perspective as only debian instances get deb-src configured by default from my read of configure-mirrors18:05
fungidefinitely source packages in https://static.opendev.org/mirror/ubuntu-ports/pool/main/a/aalib/ right now18:05
clarkbbtw I found pool/main/v/ was a good prefix because there aren't a lot of 'v' packages ;)18:05
clarkbloads much more quickly than some of the other prefixes18:05
opendevreviewJeremy Stanley proposed opendev/system-config master: Drop source package mirroring for ubuntu-ports  https://review.opendev.org/c/opendev/system-config/+/83942118:09
clarkb+2 thanks18:10
clarkbwe can also mirror jammy docker packages: https://download.docker.com/linux/ubuntu/dists/jammy/ I'll get a change up for that18:13
opendevreviewClark Boylan proposed opendev/system-config master: Add Jammy Docker package mirroring  https://review.opendev.org/c/opendev/system-config/+/83942218:19
clarkbsomething like that maybe18:19
corvusfungi frickler https://review.opendev.org/839423 should fix it.  implied-branch matcher pragma is required which explains the rarity18:20
fungiclarkb: i guess we don't track quota for mirror.deb-docker in grafana yet?18:22
fungicorvus: thanks!18:22
clarkbfungi: ya we may not. Its super tiny though like 120MB per release?18:22
clarkbthe packages aren't small but there are only a handful of them18:23
fungiinteresting, the mirror.ubuntu graph shows that adding jammy took us from 667gb to 827gb of our 990gb quota (a 160gb or 16% increase)18:30
fungithat's only a 1/4 increase over the prior size18:31
clarkbya that does make me wonder if we don't prune all the old packages as well as we expect. Or maybe jammy is using newer better compression algorithms18:31
fungibut we're not mirroring 4 other releases of ubuntu that i can see18:31
clarkbhrm what do you mean about not mirroring 4 other releases?18:35
fungijammy accounts for ~1/5 of the used space18:38
fungiwhich suggests we're mirroring 5 releases including jammy18:38
fungiassuming all releases are roughly the same size18:38
fungithose are likely invalid assumptions, but suggests we may have some lurking cruft18:39
clarkbright. It wouldn't surprise me if there is lurking cruft just due to the size reduction on the rpm mirrors we just did18:39
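
The volume arithmetic in this exchange can be checked with a short sketch. The three input figures are the ones quoted from the mirror.ubuntu graph; the variable names are only illustrative, and the "implied releases" figure rests on the same rough assumption made in the discussion, that every release is about the size of jammy:

```python
QUOTA_GB = 990   # volume quota quoted above
BEFORE_GB = 667  # usage before jammy was added
AFTER_GB = 827   # usage after jammy was added

added_gb = AFTER_GB - BEFORE_GB             # space taken by jammy
share_of_quota = added_gb / QUOTA_GB        # the "16%" figure
growth_over_prior = added_gb / BEFORE_GB    # the "1/4 increase" figure
share_of_used = added_gb / AFTER_GB         # the "~1/5 of the used space" figure
implied_releases = AFTER_GB / added_gb      # releases implied if all were jammy-sized

if __name__ == "__main__":
    print(f"added: {added_gb} GB")
    print(f"share of quota: {share_of_quota:.1%}")
    print(f"growth over prior usage: {growth_over_prior:.1%}")
    print(f"share of used space: {share_of_used:.1%}")
    print(f"implied jammy-sized releases: {implied_releases:.1f}")
```

Note that the 16% is measured against the quota, while the one-quarter figure is measured against the prior usage, which is why the two numbers differ.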
opendevreviewGage Hugo proposed openstack/project-config master: End project gating for openstack-helm-docs  https://review.opendev.org/c/openstack/project-config/+/83910319:01
opendevreviewGage Hugo proposed openstack/project-config master: Retire openstack-helm-docs repo, step 3.3  https://review.opendev.org/c/openstack/project-config/+/83942719:01
opendevreviewGage Hugo proposed openstack/project-config master: Retire openstack-helm-docs repo, step 3.3  https://review.opendev.org/c/openstack/project-config/+/83942719:14
fungiclarkb: all open changes have been abandoned for the 5 repositories covered by topic:retire-elk19:58
funginow to work on dinner19:58
clarkbfungi: thanks!20:00
opendevreviewMerged opendev/system-config master: Drop source package mirroring for ubuntu-ports  https://review.opendev.org/c/opendev/system-config/+/83942120:05
opendevreviewClark Boylan proposed openstack/diskimage-builder master: Fix dhcp-all-interfaces on debuntu systems  https://review.opendev.org/c/openstack/diskimage-builder/+/83908020:45
opendevreviewClark Boylan proposed openstack/diskimage-builder master: Revert "Fallback to persistent netifs names with systemd"  https://review.opendev.org/c/openstack/diskimage-builder/+/83886320:45
corvusi'm going to restart the zuul schedulers now.20:52
fungiplease do!21:21
fungioh, that was 30 minutes ago ;)21:21
corvuszuul01 has restarted (was a little slow); doing zuul02 now21:27
fungi839421 has deployed and will in theory be used by the 22z ubuntu mirror run in ~30 minutes21:28
opendevreviewMerged zuul/zuul-jobs master: Add per-build WinRM cert generation  https://review.opendev.org/c/zuul/zuul-jobs/+/83741621:52
clarkbI'm going to bring up the question about deb-src in the zuul room21:52
fungigreat idea21:53
corvus#status log restarted zuul schedulers/web/finger on 77524b359cf427bca16d2a3339be9c1976755bc822:23
opendevstatuscorvus: finished logging22:23
clarkbgmann: now that the board meeting is over I wanted to mention the subunit2sql mysql database is the last thing still up from the health service. I kept it around to make sure that none of the infra-root had concerns with deleting it and there didn't seem to be concern on our end. Any on yours? I would back it up but it is quite large and considering the data is quite stale at this22:46
clarkbpoint deleting it seems better.22:46
gmannclarkb: afaik, health service was one of the users of it in OpenStack. but after health service down we have one usage in tempest test-removal process - https://docs.openstack.org/tempest/latest/test_removal.html#using-subunit2sql-directly22:51
gmannquestion will be if we can keep it up with new system? or we should change the test-removal process in tempest?22:52
gmannkopecmartin: mtreinish ^^ any opinion ? 22:52
gmanntempest test-removal are not very frequent nowadays and I think we should be able to verify the test status with new dashboard or so? but not to figure out how?22:53
fungiwell, it's currently not working for that because there's no way to query it with the openstack-health api server down22:53
clarkbgmann: I'm not sure what you mean by new system. There is no replacement for the health dashboard or subunit2sql workers as far as I know. The database is going away one way or another the question is if we need to back up the data first22:53
clarkbfungi: not just down, deleted22:53
fungiwell, yes, down because it's deleted ;)22:53
*** rlandy is now known as rlandy|out22:53
clarkbgmann: do you mean is there some way to use opensearch to query this info?22:54
gmannnew system i mean server we have for ELK services but not sure if that fits into that. as we already said health one is not going to be and we retired it also.22:54
gmannclarkb: yeah22:54
gmannclarkb: main usage from tempest side is to know test passing rate22:55
clarkbgmann: ok I think that is a separate discussion. Specifically what we are wondering about here is "Is there any reason to backup the 286GB subunit2sql database before we delete it?"22:55
gmann+ shutdown subunit2sql service right?22:55
fungino, that's already gone. this is the database it used22:56
clarkbgmann: the shutdown and deletion of the subunit2sql workers and the health api server is already complete. All that remains is the database22:56
gmannohk that is gone together. i see22:56
*** dviroel|rover is now known as dviroel|rover|out22:56
gmannI am not sure any benefits to keep that data. if we keep what is usage?22:56
clarkbgmann: its a 64GB memory 500GB disk trove instance in rax. We are going to delete it. If we backup the database it will cost us about 286GB of backup space (wherever that ends up)22:57
fungiif memory serves, it was pruned automatically to a 6-month window, so there's not that much retention and it will rapidly cease to be relevant anyway22:58
gmannand current data is of last 6 month right? old one should have been gone already?22:59
clarkbI think it may be older than that as things stopped working sometime last year22:59
clarkbwhich is why we identified it as being able to be shut down (it was broken and no one noticed)23:00
gmannIf I remember correctly, e-r also query on that right? or it was separate. and e-r instance is also planned to shutdown?  http://status.openstack.org/elastic-recheck/23:01
clarkbe-r has never queried health/subunit2sql23:01
clarkbthe e-r instance will be shutdown because the elasticsearch instance it talked to is gone23:01
clarkbbasically we need to know if you can think of any reason why that data would be important enough to backup before we delete the database instance. None of us can come up with a reason for that23:02
fungiand the server where e-r ran will be going away at the end of the week as well (status.openstack.org)23:02
gmannyeah, I do not know usage or so reason for backing up that. as services are already down, we cannot do much with that backup .23:04
clarkbok, I'll plan to proceed with deleting it tomorrow. I guess if something pops up between now and then feel free to ping me23:05
gmannI think you can remove that too. if anyone bring that up in some infra they can do it from scratch and do not need to import old data23:05
clarkbgmann: now for tracking tempest test success/fail rates with opensearch. The way opensearch is working is it takes the test job logs (various log files) and indexes them23:05
clarkbgmann: what this means is if your job logs somewhere tempest.test.foo = pass and tempest.test.bar = fail you can query on those strings and check them over time to see what the ratio between pass and fail is23:06
clarkbit depends on you logging the info in the first place though23:06
clarkbwhether or not that is currently done I don't know, but tracking that info in opensearch is theoretically possible if you log that and index those logs23:06
gmannclarkb: yeah that is what I was thinking to search with <test> = *pass|fail*23:06
gmannat least knowing the recent with 'fail' string can give us idea about it23:07
gmannI will try that and add it in tempest doc23:07
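
As a rough illustration of the idea above, the pass/fail ratio can be computed from any set of log lines that record results in the `tempest.test.foo = pass` form used as an example in the discussion. This sketch works on plain strings rather than an OpenSearch query, and the log line layout, the `pass_rates` helper, and the sample data are all assumptions:

```python
import re
from collections import Counter

# Assumed log format: "<test id> = pass" or "<test id> = fail" somewhere in the line.
RESULT_RE = re.compile(r"(?P<test>[\w.]+) = (?P<status>pass|fail)\b")


def pass_rates(lines):
    """Map each test id to the fraction of its logged results that passed."""
    counts = {}
    for line in lines:
        match = RESULT_RE.search(line)
        if match is None:
            continue  # line does not record a test result
        counts.setdefault(match["test"], Counter())[match["status"]] += 1
    return {
        test: tally["pass"] / (tally["pass"] + tally["fail"])
        for test, tally in counts.items()
    }


# Hypothetical sample lines; real job logs may look quite different.
SAMPLE_LOG = [
    "2022-04-26 10:00:00 tempest.test.foo = pass",
    "2022-04-26 10:05:00 tempest.test.foo = pass",
    "2022-04-26 10:06:00 tempest.test.bar = fail",
    "2022-04-26 10:07:00 tempest.test.bar = pass",
    "2022-04-26 10:08:00 unrelated log line",
]

if __name__ == "__main__":
    for test, rate in pass_rates(SAMPLE_LOG).items():
        print(f"{test}: {rate:.0%} passing")
```

As noted in the discussion, this only works if the jobs log per-test results in the first place and those logs get indexed.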
gmannbut i have not seen test removal since 1 (or 2 may be) years so its not very frequent things or something we need to worry much. 23:08
clarkbfungi: ianw: doesn't look like we're seeing disk utilization drop in the ubuntu-ports volume yet. I suspect we may need to do some manual intervention?23:14
clarkbI can try to take a look at that tomorrow but I'm just about to the point of needing to help with dinner23:14
ianwError: packages database contains unused 'bionic-backports|main|source' database.23:16
ianwThis usually means you removed some component, architecture or even23:16
ianwa whole distribution from conf/distributions.23:16
ianwIn that case you most likely want to call reprepro clearvanished to get rid23:16
ianwof the databases belonging to those removed parts.23:16
ianwiirc that's what i had to do with the security/stretch removal too.  i can look at it in a bit23:17
fungiyeah, so we need to prune the db23:23
