Friday, 2022-02-11

*** rlandy|ruck is now known as rlandy|out 00:02
opendevreview Merged opendev/system-config master: bridge production: fix mtime matching  https://review.opendev.org/c/opendev/system-config/+/828808 00:24
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 01:06
*** Guest2 is now known as prometheanfire 01:28
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 01:36
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 02:15
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 02:49
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 02:54
opendevreview Steve Baker proposed openstack/diskimage-builder master: Replace kpartx with qemu-nbd in extract-image  https://review.opendev.org/c/openstack/diskimage-builder/+/828617 03:05
opendevreview Steve Baker proposed openstack/diskimage-builder master: Move grub-install to the end, and skip for partition images  https://review.opendev.org/c/openstack/diskimage-builder/+/826976 03:05
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 03:28
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 04:09
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 04:15
*** ysandeep|out is now known as ysandeep 04:56
*** akahat|PTO is now known as akahat 05:22
*** clarkb is now known as Guest32 05:22
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:11
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:26
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:29
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:34
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:48
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 06:59
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:05
*** ysandeep is now known as ysandeep|afk 07:07
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:11
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:24
*** amoralej|off is now known as amoralej 07:24
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:28
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:33
*** ysandeep|afk is now known as ysandeep 07:36
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 07:42
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 08:11
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 08:32
*** jpena|off is now known as jpena 08:36
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 08:48
*** arxcruz is now known as arxcruz|ruck 09:13
*** ysandeep is now known as ysandeep|lunch 09:24
*** pojadhav is now known as pojadhav|afk 09:46
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 10:05
*** ysandeep|lunch is now known as ysandeep 10:15
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 10:26
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 10:27
opendevreview Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 10:36
opendevreview Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 10:43
opendevreview Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 10:51
frickler Clark[m]: Guest32: I started https://etherpad.opendev.org/p/ubuntu-xenial-jobs but kind of stopped halfway when I saw the size of the task at hand. I guess we will need to make that a priority item for some time and iterate through things 10:51
*** pojadhav|afk is now known as pojadhav 10:51
*** rlandy|out is now known as rlandy|ruck 11:05
*** dviroel|out is now known as dviroel 11:15
*** pojadhav is now known as pojadhav|brb 11:25
*** pojadhav|brb is now known as pojadhav 11:39
lourot o/ I have https://review.opendev.org/c/openstack/project-config/+/825138 and https://review.opendev.org/c/openstack/project-config/+/828166 in the pipe, if anyone has got a few cycles, thanks! 12:58
*** pojadhav is now known as pojadhav|brb 13:05
lourot thanks fungi! 13:08
*** rlandy|ruck is now known as rlandy|ruck|mtg 13:08
opendevreview Merged openstack/project-config master: Mirror cinder-solidfire and cinder-nimblestorage to Github  https://review.opendev.org/c/openstack/project-config/+/825138 13:12
opendevreview Merged openstack/project-config master: Mirror charm-ops-interface-* to GitHub  https://review.opendev.org/c/openstack/project-config/+/828166 13:16
*** amoralej is now known as amoralej|lunch 13:24
*** rcastillo is now known as rcastillo|rover 13:33
*** ysandeep is now known as ysandeep|out 14:01
*** amoralej|lunch is now known as amoralej 14:01
*** dviroel is now known as dviroel|lunch\ 14:54
*** ykarel is now known as ykarel|away 14:56
NeilHanlon hey folks, did a dib upgrade happen to occur yet, or is there anything I can do to help? 15:27
fungi NeilHanlon: https://pypi.org/project/diskimage-builder/ says we got a new 3.18.0 release yesterday, checking nodepool now... 15:31
fungi https://opendev.org/zuul/nodepool/src/branch/master/requirements.txt#L14 says diskimage-builder>=3.18.0 15:32
fungi now to dockerhub... 15:32
opendevreview Neil Hanlon proposed openstack/project-config master: Add rockylinux-8 to nodepool configuration  https://review.opendev.org/c/openstack/project-config/+/828435 15:34
fungi latest tag at https://hub.docker.com/r/zuul/nodepool-builder/tags has an upload time roughly corresponding with when https://review.opendev.org/828649 merged 15:34
fungi we probably need to restart our nodepool builders on new containers, but that's fairly quick 15:35
NeilHanlon awesome! thank you for the background info 15:35
frickler fungi: NeilHanlon: https://review.opendev.org/c/zuul/nodepool/+/828649 bumped nodepool reqs for that 15:35
fungi yeah, looks like it's been about a week since our last builder container restarts 15:36
NeilHanlon i rebased 828435 above so it's probably ready to merge, too 15:36
fungi centos-8-stream and centos-9-stream images just started building in the last few minutes... i wonder if those are failing in a tight loop 15:44
fungi also we have opensuse-15 images stuck in a building state for nearly 6 days 15:44
fungi i wonder what that's about 15:44
opendevreview Maksim Malchuk proposed openstack/diskimage-builder master: Correctly create DIB_ENV variable and dib_environment file  https://review.opendev.org/c/openstack/diskimage-builder/+/828899 15:45
*** Guest32 is now known as clarkb 15:47
fungi the /opt fs is at 100% used on both nb01 and nb02, while nb03 is entirely unreachable for me 15:49
fungi yeah, i can't even ping nb03 15:49
clarkb huh that's interesting. I checked them when we did recent server reboots and did some cleanup and there should've been a fair bit of disk space. I wonder if we are leaking with dib somehow 15:50
fungi `openstack server show` says nb03 is running/active, maybe neutron went toes up there 15:51
fungi there are a few gb left on 01 and 02, but it's less than 1% 15:51
clarkb could be focal upgrade fallout too if that tripped some bug 15:51
clarkb frickler: ++ to taking it incrementally 15:52
fungi also, `nodepool dib-image-list` mentions a bunch of opensuse-15 images in a "ready" state, i would normally expect only two 15:52
clarkb fungi: I think that can happen if a cloud isn't able to upload the newer versions. We'll keep the old versions around if that is the only version a cloud has managed to receive 15:53
clarkb but ya under normal operation there should be only two ready so that might point at a problem 15:53
fungi 18 in a ready state, many more in failed state 15:53
fungi the oldest "ready" opensuse-15 image has been there for three weeks 15:54
fungi we've got a lot in the providers themselves too... taking rax-dfw as a quick example, it has 17 opensuse-15 images in a ready state and 1 in a failed state 15:56
fungi so something seems to be breaking image deletion 15:57
fungi 2022-02-11 15:57:44,017 ERROR nodepool.zk.ZooKeeper: Error loading json data from image build /nodepool/images/opensuse-15/builds/0000147440 15:57
fungi i think we have some corrupted znodes, that could be doing it 15:58
fungi should i use zkshell to delete that node? 15:59
opendevreview Maksim Malchuk proposed openstack/diskimage-builder master: Correctly create DIB_ENV variable and dib_environment file  https://review.opendev.org/c/openstack/diskimage-builder/+/828899 16:00
fungi exploring a bit, it looks like the issue may be that /nodepool/images/opensuse-15/builds/0000147440/providers/rax-iad/images/ is empty 16:00
fungi infra-root: ^ what's the safest way to fix that up? should i try to stop all the builders and then delete the providers tree there? or the entire 0000147440 build? 16:03
*** dviroel|lunch\ is now known as dviroel 16:08
*** pojadhav|brb is now known as pojadhav|out 16:24
*** amoralej is now known as amoralej|off 16:28
Clark[m] fungi I think you can just delete the one file that is corrupted then nodepool clears the rest 16:33
Clark[m] And you don't have to stop services for that since zk won't let them do a half read 16:34
Clark[m] Er not file but znode 16:35
fungi well, the builder is complaining about "Error loading json data from image build /nodepool/images/opensuse-15/builds/0000147440" and raises "json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)" 16:46
*** rlandy|ruck|mtg is now known as rlandy|ruck 16:46
fungi so does that mean 0000147440 is corrupt, or one of the subnodes under it? 16:46
fungi that's mainly what i'm asking. do i delete all of 0000147440 or just 0000147440/providers/rax-iad/images or something in between? 16:47
fungi worried that if i delete higher up in that tree than necessary, it might orphan some image build on disk or in a provider we'll need to clean up manually 16:48
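The message quoted above is what Python's json module reports when it is asked to parse an empty string, which fits a znode that exists but holds no data. A minimal sketch, assuming the payload is UTF-8 encoded JSON; the helper names and the sample payload are hypothetical and are not nodepool code:

```python
import json


def classify_payload(raw: bytes) -> str:
    """Return 'empty', 'json' or 'invalid' for a raw znode payload."""
    if not raw:
        return "empty"
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "invalid"
    return "json"


def decode_error(raw: bytes) -> str:
    """Return the JSONDecodeError text for a payload, or '' if it parses."""
    try:
        json.loads(raw.decode("utf-8"))
    except json.JSONDecodeError as exc:
        return str(exc)
    return ""


# Hypothetical payloads: an empty znode and one holding a small JSON object.
print(classify_payload(b""), "-", decode_error(b""))
print(classify_payload(b'{"state": "ready"}'))
```

Checking each level of the tree this way separates znodes that are merely containers from znodes that were expected to carry JSON.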
clarkb fungi: ya you'll probably need to get the znode data and compare against other entries 16:50
clarkb but you are right you want to delete only the specific znode that is corrupted (and if it has children this requires deleting them too iirc) 16:50
fungi others seem to have a serial and a lock under providers/rax-iad/images 16:52
clarkb fungi: you should find znodes with json in them 16:52
fungi ahh, so i can do json_cat on them 16:53
clarkb in the zk-shell utility there is a get method iirc 16:53
fungi yeah, the serially-numbered znode has json data 16:53
clarkb and that gives you the raw data back. Then we've got the new zuul utility that handles sharded data but I don't think the nodepool stuff is sharded so zk-shell is good enough 16:54
fungi when i json_cat it 16:54
clarkb ya so you want to find the znode for that specific image and upload that is corrupt and delete it iirc 16:54
clarkb then nodepool will correct the rest of the tree 16:54
fungi okay, so since there are no nodes with json in them in any of the subnodes in the 0000147440 tree, i should be able to recursively delete 0000147440 16:54
clarkb if you give me a minute I can login and check really quickly 16:55
*** artom__ is now known as artom 16:56
fungi sure, it looks like it's been this way for several weeks, so a few more minutes aren't going to make much difference 16:56
clarkb fungi: ya so if you get /nodepool/images/opensuse-15/builds/0000147440 it is empty and if you get 0000147493 instead there is json. So ya /nodepool/images/opensuse-15/builds/0000147440 is the item to delete which necessitates deleting its children 16:57
fungi thanks for confirming 16:57
fungi okay, deleted 16:58
fungi #status log Manually deleted /nodepool/images/opensuse-15/builds/0000147440 from ZooKeeper, which had been holding back image cleanup for several weeks 16:59
opendevstatus fungi: finished logging 16:59
clarkb one thing that makes zk a bit "weird" compared to posix is that the direntries can hold data too 16:59
clarkb because really it is just a tree and each node can have data and children 16:59
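As a rough illustration of the point above that every znode can hold both data and children, the sketch below models the builds tree in memory and deletes a build children-first. The ZNode class, both helpers and the sample JSON payload are hypothetical stand-ins, not ZooKeeper or nodepool APIs; the two build numbers and the providers/rax-iad/images path come from the discussion.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ZNode:
    """Hypothetical in-memory stand-in for a znode: data plus named children."""
    data: bytes = b""
    children: Dict[str, "ZNode"] = field(default_factory=dict)


def empty_children(parent: ZNode) -> List[str]:
    """Names of direct children whose own data payload is empty."""
    return sorted(name for name, node in parent.children.items() if not node.data)


def delete_recursive(parent: ZNode, name: str) -> List[str]:
    """Remove parent.children[name], deleting descendants before ancestors.

    Returns the relative paths in the order they were deleted.
    """
    deleted: List[str] = []

    def walk(node: ZNode, path: str) -> None:
        for child_name in sorted(node.children):
            walk(node.children[child_name], f"{path}/{child_name}")
        node.children.clear()
        deleted.append(path)

    walk(parent.children[name], name)
    del parent.children[name]
    return deleted


# Sample tree: one build with no data anywhere, one with an assumed JSON payload.
builds = ZNode(children={
    "0000147440": ZNode(children={
        "providers": ZNode(children={
            "rax-iad": ZNode(children={"images": ZNode()}),
        }),
    }),
    "0000147493": ZNode(data=b'{"state": "ready"}'),
})

suspects = empty_children(builds)
order = delete_recursive(builds, suspects[0])
print(suspects, order)
```

Deleting leaves first mirrors the constraint mentioned earlier that removing a znode with children means removing the children too.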
fungi right, i tried to json_cat each level under there just to be sure they were all empty 17:00
clarkb (I guess technically posix dirs have data too but limited to who their children are) 17:00
fungi and attributes 17:00
clarkb fungi: one thing to be aware of is that the zuul/ side of the tree uses a lot of node sharding 17:00
fungi extended attrs in some posix filesystems can hold vast amounts of data 17:00
clarkb sharding and compression 17:00
fungi yeah 17:00
fungi the compression is new 17:00
clarkb what this means is that the zk-shell tools like get and so on don't give you a complete, accurate picture. There is a tool in the zuul tree that aims to emulate some zk-shell behavior but handles sharding and compression for you 17:01
fungi lots of deletions recorded in the builder logs now 17:01
*** marios is now known as marios|out 17:02
fungi once they're looking more sane i can see about restarting them, but i expect there's a backlog of images to build from the filesystems filling up 17:02
mnaser infra-root: I got an abuse notice of sorts that have been sent before I believe (RuskIQ) - what’s an email I can forward it to? 17:02
fungi mnaser: infra-root at openstack.org 17:03
fungi or service-incident at lists.opendev.org 17:03
fungi the latter is what we list as the security contact on the main opendev page 17:03
fungi it's a private ml with just the sysadmins subscribed 17:04
fungi okay, we're down to just the most recent two opensuse-15 images in a ready state now 17:06
mnaser I’ll do the first as it’s probably what gets abuse emails or whateva 17:06
fungi yeah, that's where most of them end up. thanks 17:07
mnaser Sent 17:07
fungi much appreciated 17:07
fungi okay, /opt is back down to ~70% used on nb01 and nb02 now. still can't reach nb03 17:08
fungi should i try rebooting it? 17:08
fungi `openstack console log show` isn't returning for me either 17:09
fungi finally returned "Gateway Timeout (HTTP 504)" 17:09
fungi same for `openstack console url show` 17:10
fungi i have a feeling a reboot isn't going to help. kevinz_ ^ are you aware of any network connectivity problems or related issues which would make nb03.opendev.org (e46ef968-9b36-465f-b2d8-0aa4cc034da8) unreachable or unresponsive? 17:12
clarkb you might not even be able to reboot depending on where the 504 is occurring 17:13
fungi right 17:13
fungi anyway, in the meantime we shouldn't need to restart it in order to be able to add x86 images for rocky linux 17:14
clarkb ya rocky images should not affect arm builder unless we are trying to build arm images 17:17
clarkb mnaser: fungi: fwiw I don't see an email yet, but my brain isn't quite working yet today 17:26
*** jpena is now known as jpena|off 17:33
fungi yeah, i didn't see it show up in that inbox, nor the spam folder for it, but maybe it's delayed due to greylisting or something 17:36
opendevreview Matthew Thode proposed openstack/project-config master: add python3.9 to generate-constraints command  https://review.opendev.org/c/openstack/project-config/+/828909 17:39
clarkb frickler: I annotated the etherpad with some thoughts. I expect a good number of things can be cleaned up now and we should probably focus on those then swing back around on the trickier items 18:47
mnaser fungi, clarkb: i forwarded to infra-root@opendev.org 19:39
clarkb mnaser: it needs to be openstack.org 19:40
clarkb (due to lack of mx for opendev.org) 19:40
mnaser ah oops 19:40
mnaser surprised i didnt get a bounce from google 19:40
mnaser sent again 19:40
fungi thanks! 19:41
opendevreview Jeremy Stanley proposed opendev/system-config master: Temporarily block x/stackalytics in Gitiles  https://review.opendev.org/c/opendev/system-config/+/828919 19:58
opendevreview Jeremy Stanley proposed opendev/system-config master: Temporarily block x/stackalytics in Gitiles  https://review.opendev.org/c/opendev/system-config/+/828919 20:15
*** dviroel is now known as dviroel|out 20:59
clarkb fungi: ^ that passes CI now. Should we go ahead and +A? 21:20
fungi yeah, feel free to single-core approve, it's easy enough to revert later 21:23
clarkb I think we'll hold off for now while we make sure that is the right decision 21:43
fungi wfm 21:48
opendevreview Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 22:52
opendevreview Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file  https://review.opendev.org/c/zuul/zuul-jobs/+/828818 23:00
clarkb ianw: not for today but if you get a chance early next week it would be great if you could look over the gitea 1.16.1 install that I held 23:00
clarkb maybe we can do that upgrade next week. (I should cross check against openstack release though) 23:00
ianw ++ will do 23:00
clarkb I haven't been able to find anything concerning and the updated test to check pushes via ssh should cover the gerrit replication 23:02
opendevreview Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs  https://review.opendev.org/c/opendev/system-config/+/828810 23:15
