*** rlandy|ruck is now known as rlandy|out | 00:02 | |
opendevreview | Merged opendev/system-config master: bridge production: fix mtime matching https://review.opendev.org/c/opendev/system-config/+/828808 | 00:24 |
---|---|---|
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 01:06 |
*** Guest2 is now known as prometheanfire | 01:28 | |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 01:36 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 02:15 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 02:49 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 02:54 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Replace kpartx with qemu-nbd in extract-image https://review.opendev.org/c/openstack/diskimage-builder/+/828617 | 03:05 |
opendevreview | Steve Baker proposed openstack/diskimage-builder master: Move grub-install to the end, and skip for partition images https://review.opendev.org/c/openstack/diskimage-builder/+/826976 | 03:05 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 03:28 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 04:09 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 04:15 |
*** ysandeep|out is now known as ysandeep | 04:56 | |
*** akahat|PTO is now known as akahat | 05:22 | |
*** clarkb is now known as Guest32 | 05:22 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:11 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:26 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:29 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:34 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:48 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 06:59 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:05 |
*** ysandeep is now known as ysandeep|afk | 07:07 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:11 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:24 |
*** amoralej|off is now known as amoralej | 07:24 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:28 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:33 |
*** ysandeep|afk is now known as ysandeep | 07:36 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 07:42 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 08:11 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 08:32 |
*** jpena|off is now known as jpena | 08:36 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 08:48 |
*** arxcruz is now known as arxcruz|ruck | 09:13 | |
*** ysandeep is now known as ysandeep|lunch | 09:24 | |
*** pojadhav is now known as pojadhav|afk | 09:46 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 10:05 |
*** ysandeep|lunch is now known as ysandeep | 10:15 | |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 10:26 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 10:27 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: [wip] encrypt-file : new role https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 10:36 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 10:43 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 10:51 |
frickler | Clark[m]: Guest32: I started https://etherpad.opendev.org/p/ubuntu-xenial-jobs but kind of stopped halfway when I saw the size of the task at hand. I guess we will need to make that a priority item for some time and iterate through things | 10:51 |
*** pojadhav|afk is now known as pojadhav | 10:51 | |
*** rlandy|out is now known as rlandy|ruck | 11:05 | |
*** dviroel|out is now known as dviroel | 11:15 | |
*** pojadhav is now known as pojadhav|brb | 11:25 | |
*** pojadhav|brb is now known as pojadhav | 11:39 | |
lourot | o/ I have https://review.opendev.org/c/openstack/project-config/+/825138 and https://review.opendev.org/c/openstack/project-config/+/828166 in the pipe, if anyone has got a few cycles, thanks! | 12:58 |
*** pojadhav is now known as pojadhav|brb | 13:05 | |
lourot | thanks fungi! | 13:08 |
*** rlandy|ruck is now known as rlandy|ruck|mtg | 13:08 | |
opendevreview | Merged openstack/project-config master: Mirror cinder-solidfire and cinder-nimblestorage to Github https://review.opendev.org/c/openstack/project-config/+/825138 | 13:12 |
opendevreview | Merged openstack/project-config master: Mirror charm-ops-interface-* to GitHub https://review.opendev.org/c/openstack/project-config/+/828166 | 13:16 |
*** amoralej is now known as amoralej|lunch | 13:24 | |
*** rcastillo is now known as rcastillo|rover | 13:33 | |
*** ysandeep is now known as ysandeep|out | 14:01 | |
*** amoralej|lunch is now known as amoralej | 14:01 | |
*** dviroel is now known as dviroel|lunch\ | 14:54 | |
*** ykarel is now known as ykarel|away | 14:56 | |
NeilHanlon | hey folks, did a dib upgrade happen to occur yet, or is there anything I can do to help? | 15:27 |
fungi | NeilHanlon: https://pypi.org/project/diskimage-builder/ says we got a new 3.18.0 release yesterday, checking nodepool now... | 15:31 |
fungi | https://opendev.org/zuul/nodepool/src/branch/master/requirements.txt#L14 says diskimage-builder>=3.18.0 | 15:32 |
fungi | now to dockerhub... | 15:32 |
opendevreview | Neil Hanlon proposed openstack/project-config master: Add rockylinux-8 to nodepool configuration https://review.opendev.org/c/openstack/project-config/+/828435 | 15:34 |
fungi | latest tag at https://hub.docker.com/r/zuul/nodepool-builder/tags has an upload time roughly corresponding with when https://review.opendev.org/828649 merged | 15:34 |
fungi | we probably need to restart our nodepool builders on new containers, but that's fairly quick | 15:35 |
NeilHanlon | awesome ! thank you for the background info | 15:35 |
frickler | fungi: NeilHanlon: https://review.opendev.org/c/zuul/nodepool/+/828649 bumped nodepool reqs for that | 15:35 |
fungi | yeah, looks like it's been about a week since our last builder container restarts | 15:36 |
NeilHanlon | i rebased 828435 above so it's probably ready to merge, too | 15:36 |
fungi | centos-8-stream and centos-9-stream images just started building in the last few minutes... i wonder if those are failing in a tight loop | 15:44 |
fungi | also we have opensuse-15 images stuck in a building state for nearly 6 days | 15:44 |
fungi | i wonder what that's about | 15:44 |
opendevreview | Maksim Malchuk proposed openstack/diskimage-builder master: Correctly create DIB_ENV variable and dib_environment file https://review.opendev.org/c/openstack/diskimage-builder/+/828899 | 15:45 |
*** Guest32 is now known as clarkb | 15:47 | |
fungi | the /opt fs is at 100% used on both nb01 and nb02, while nb03 is entirely unreachable for me | 15:49 |
fungi | yeah, i can't even ping nb03 | 15:49 |
clarkb | huh thats interesting I checked them when we did recent server reboots and did some cleanup and there should've been a fair bit of disk space. I wonder if we are leaking with dib somehow | 15:50 |
fungi | `openstack server show` says nb03 is running/active, maybe neutron went toes up there | 15:51 |
fungi | there are a few gb left on 01 and 02, but it's less than 1% | 15:51 |
clarkb | could be focal upgrade fallout too if that tripped some bug | 15:51 |
clarkb | frickler: ++ to taking it incrementally | 15:52 |
fungi | also, `nodepool dib-image-list` mentions a bunch of opensuse-15 images in a "ready" state, i would normally expect only two | 15:52 |
clarkb | fungi: I think that can happen if a cloud isn't able to upload the newer versions. We'll keep the old versions around if that is the only version a cloud has managed to receive | 15:53 |
clarkb | but ya under normal operation there should be only two ready so that might point at a problem | 15:53 |
fungi | 18 in a ready state, many more in failed state | 15:53 |
fungi | the oldest "ready" opensuse-15 image has been there for three weeks | 15:54 |
fungi | we've got a lot in the providers themselves too... taking rax-dfw as a quick example, it has 17 opensuse-15 images in a ready state and 1 in a failed state | 15:56 |
fungi | so something seems to be breaking image deletion | 15:57 |
fungi | 2022-02-11 15:57:44,017 ERROR nodepool.zk.ZooKeeper: Error loading json data from image build /nodepool/images/opensuse-15/builds/0000147440 | 15:57 |
fungi | i think we have some corrupted znodes, that could be doing it | 15:58 |
fungi | should i use zkshell to delete that node? | 15:59 |
opendevreview | Maksim Malchuk proposed openstack/diskimage-builder master: Correctly create DIB_ENV variable and dib_environment file https://review.opendev.org/c/openstack/diskimage-builder/+/828899 | 16:00 |
fungi | exploring a bit, it looks like the issue may be that /nodepool/images/opensuse-15/builds/0000147440/providers/rax-iad/images/ is empty | 16:00 |
fungi | infra-root: ^ what's the safest way to fix that up? should i try to stop all the builders and then delete the providers tree there? or the entire 0000147440 build? | 16:03 |
*** dviroel|lunch\ is now known as dviroel | 16:08 | |
*** pojadhav|brb is now known as pojadhav|out | 16:24 | |
*** amoralej is now known as amoralej|off | 16:28 | |
Clark[m] | fungi I think you can just delete the one file that is corrupted then nodepool clears the rest | 16:33 |
Clark[m] | And you don't have to stop services for that since zk won't let them do a half read | 16:34 |
Clark[m] | Er not file but znode | 16:35 |
fungi | well, the builder is complaining about "Error loading json data from image build /nodepool/images/opensuse-15/builds/0000147440" and raises "json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)" | 16:46 |
*** rlandy|ruck|mtg is now known as rlandy|ruck | 16:46 | |
fungi | so does that mean 0000147440 is corrupt, or one of the subnodes under it? | 16:46 |
fungi | that's mainly what i'm asking. do i delete all of 0000147440 or just 0000147440/providers/rax-iad/images or something in between? | 16:47 |
fungi | worried that if i delete higher up in that tree than necessary, it might orphan some image build on disk or in a provider we'll need to clean up manually | 16:48 |
clarkb | fungi: ya you'll probably need to get the znode data and compare against other entries | 16:50 |
clarkb | but you are right you want to delete only the specific znode that is corrupted (and if it has children this requires deleting them too iirc) | 16:50 |
fungi | others seem to have a serial and a lock under providers/rax-iad/images | 16:52 |
clarkb | fungi: you should find znodes with json in them | 16:52 |
fungi | ahh, so i can do json_cat on them | 16:53 |
clarkb | in the zk-shell utility there is a get method iirc | 16:53 |
fungi | yeah, the serially-numbered znode has json data | 16:53 |
clarkb | and that gives you the raw data back. Then we've got the new zuul utility that handles sharded data but I don't think the nodepool stuff is sharded so zk-shell is good enough | 16:54 |
fungi | when i json_cat it | 16:54 |
clarkb | ya so you want to find the znode for that specific image and upload that is corrupt and delete it iirc | 16:54 |
clarkb | then nodepool will correct the rest of the tree | 16:54 |
fungi | okay, so since there are no nodes with json in them in any of the subnodes in the 0000147440 tree, i should be able to recursively delete 0000147440 | 16:54 |
clarkb | if you give me a minute I can login and check really quickly | 16:55 |
*** artom__ is now known as artom | 16:56 | |
fungi | sure, it looks like it's been this way for several weeks, so a few more minutes aren't going to make much difference | 16:56 |
clarkb | fungi: ya so if you get /nodepool/images/opensuse-15/builds/0000147440 it is empty and if you get 0000147493 instead there is json. So ya /nodepool/images/opensuse-15/builds/0000147440 is the item to delete which necessitates deleting its children | 16:57 |
fungi | thanks for confirming | 16:57 |
fungi | okay, deleted | 16:58 |
fungi | #status log Manually deleted /nodepool/images/opensuse-15/builds/0000147440 from ZooKeeper, which had been holding back image cleanup for several weeks | 16:59 |
opendevstatus | fungi: finished logging | 16:59 |
clarkb | one thing that makes zk a bit "weird" compared to posix is that the direntries can hold data too | 16:59 |
clarkb | because really it is just a tree and each node can have data and children | 16:59 |
fungi | right, i tried to json_cat each level under there just to be sure they were all empty | 17:00 |
clarkb | (I guess technically posix dirs have data too but limited to who their children are) | 17:00 |
fungi | and attributes | 17:00 |
clarkb | fungi: one thing to be aware of is that the zuul/ side of the tree uses a lot of node sharding | 17:00 |
fungi | extended attrs in some posix filesystems can hold vast amounts of data | 17:00 |
clarkb | sharding and compression | 17:00 |
fungi | yeah | 17:00 |
fungi | the compression is new | 17:00 |
clarkb | what this means is that the zk-shell tools like get and so on don't give you a completely accurate picture. There is a tool in the zuul tree that aims to emulate some zk-shell behavior but handles sharding and compression for you | 17:01 |
fungi | lots of deletions recorded in the builder logs now | 17:01 |
*** marios is now known as marios|out | 17:02 | |
fungi | once they're looking more sane i can see about restarting them, but i expect there's a backlog of images to build from the filesystems filling up | 17:02 |
mnaser | infra-root: I got an abuse notice of a sort that has been sent before, I believe (RiskIQ). What's an email I can forward it to? | 17:02 |
fungi | mnaser: infra-root at openstack.org | 17:03 |
fungi | or service-incident at lists.opendev.org | 17:03 |
fungi | the latter is what we list as the security contact on the main opendev page | 17:03 |
fungi | it's a private ml with just the sysadmins subscribed | 17:04 |
fungi | okay, we're down to just the most recent two opensuse-15 images in a ready state now | 17:06 |
mnaser | I'll do the first as it's probably what gets abuse emails or whatever | 17:06 |
fungi | yeah, that's where most of them end up. thanks | 17:07 |
mnaser | Sent | 17:07 |
fungi | much appreciated | 17:07 |
fungi | okay, /opt is back down to ~70% used on nb01 and nb02 now. still can't reach nb03 | 17:08 |
fungi | should i try rebooting it? | 17:08 |
fungi | `openstack console log show` isn't returning for me either | 17:09 |
fungi | finally returned "Gateway Timeout (HTTP 504)" | 17:09 |
fungi | same for `openstack console url show` | 17:10 |
fungi | i have a feeling a reboot isn't going to help. kevinz_ ^ are you aware of any network connectivity problems or related issues which would make nb03.opendev.org (e46ef968-9b36-465f-b2d8-0aa4cc034da8) unreachable or unresponsive? | 17:12 |
clarkb | you might not even be able to reboot depending on where the 504 is occurring | 17:13 |
fungi | right | 17:13 |
fungi | anyway, in the meantime we shouldn't need to restart it in order to be able to add x86 images for rocky linux | 17:14 |
clarkb | ya rocky images should not affect arm builder unless we are trying to build arm images | 17:17 |
clarkb | mnaser: fungi: fwiw I don't see an email yet, but my brain isn't quite working yet today | 17:26 |
*** jpena is now known as jpena|off | 17:33 | |
fungi | yeah, i didn't see it show up in that inbox, nor the spam folder for it, but maybe it's delayed due to greylisting or something | 17:36 |
opendevreview | Matthew Thode proposed openstack/project-config master: add python3.9 to generate-constraints command https://review.opendev.org/c/openstack/project-config/+/828909 | 17:39 |
clarkb | frickler: I annotated the etherpad with some thoughts. I expect a good number of things can be cleaned up now and we should probably focus on those then swing back around on the trickier items | 18:47 |
mnaser | fungi, clarkb: i forwarded to infra-root@opendev.org | 19:39 |
clarkb | mnaser: it needs to be openstack.org | 19:40 |
clarkb | (due to lack of mx for opendev.org) | 19:40 |
mnaser | ah oops | 19:40 |
mnaser | surprised i didn't get a bounce from google | 19:40 |
mnaser | sent again | 19:40 |
fungi | thanks! | 19:41 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Temporarily block x/stackalytics in Gitiles https://review.opendev.org/c/opendev/system-config/+/828919 | 19:58 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Temporarily block x/stackalytics in Gitiles https://review.opendev.org/c/opendev/system-config/+/828919 | 20:15 |
*** dviroel is now known as dviroel|out | 20:59 | |
clarkb | fungi: ^ that passes CI now. Should we go ahead and +A? | 21:20 |
fungi | yeah, feel free to single-core approve, it's easy enough to revert later | 21:23 |
clarkb | I think we'll hold off for now while we make sure that is the right decision | 21:43 |
fungi | wfm | 21:48 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 22:52 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: encrypt-file : role to encrypt a file https://review.opendev.org/c/zuul/zuul-jobs/+/828818 | 23:00 |
clarkb | ianw: not for today but if you get a chance early next week it would be great if you could look over the gitea 1.16.1 install that I held | 23:00 |
clarkb | maybe we can do that upgrade next week. (I should cross check against the openstack release though) | 23:00 |
ianw | ++ will do | 23:00 |
clarkb | I haven't been able to find anything concerning and the updated test to check pushes via ssh should cover the gerrit replication | 23:02 |
opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] flesh out a little idea for exporting logs https://review.opendev.org/c/opendev/system-config/+/828810 | 23:15 |
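
The ZooKeeper cleanup in the 15:57 to 16:59 exchange rested on one idea: every znode can carry both data and children, and the build znode 0000147440 held empty data where nodepool expected JSON. A minimal sketch of that check follows, using an in-memory tree in place of a live ZooKeeper connection. The `Znode` class, the `find_undecodable` and `is_build` helpers, and the sample JSON payload are all hypothetical illustrations, not part of nodepool or zk-shell.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Znode:
    """Hypothetical stand-in for a znode: raw data plus named children."""
    data: bytes = b""
    children: dict = field(default_factory=dict)


def find_undecodable(node, path, expect_json):
    """Return paths at or below `path` that should hold JSON but fail to decode.

    `expect_json(path)` is a caller-supplied predicate, because intermediate
    znodes (such as a "providers" level) are legitimately empty.
    """
    bad = []
    if expect_json(path):
        try:
            json.loads(node.data.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            bad.append(path)
    for name, child in sorted(node.children.items()):
        bad.extend(find_undecodable(child, f"{path}/{name}", expect_json))
    return bad


def is_build(path):
    # Assumption for this sketch: build records are the direct
    # children of a ".../builds" znode.
    return path.rsplit("/", 1)[0].endswith("/builds")


# Sample tree mirroring the paths from the log; the JSON body is made up.
builds = Znode(children={
    "0000147440": Znode(data=b"", children={
        "providers": Znode(children={
            "rax-iad": Znode(children={"images": Znode()}),
        }),
    }),
    "0000147493": Znode(data=b'{"state": "ready"}'),
})

corrupt = find_undecodable(
    builds, "/nodepool/images/opensuse-15/builds", is_build
)
print(corrupt)
```

Decoding empty data raises the same "Expecting value: line 1 column 1 (char 0)" error quoted at 16:46, which is why the build znode itself, rather than one of its children, was the item to delete.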
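
clarkb's 17:00 to 17:01 remark that plain `get` calls do not give a complete picture of sharded, compressed data can be illustrated in isolation. The sketch below is not Zuul's actual storage format; it only assumes, for illustration, that a record is serialized as JSON, compressed with zlib, and split into ordered chunks. The `shard` and `unshard` helpers and the sample record are hypothetical.

```python
import json
import zlib


def shard(payload, shard_size):
    """Compress a JSON-serializable payload and split it into ordered chunks."""
    blob = zlib.compress(json.dumps(payload).encode("utf-8"))
    return [blob[i:i + shard_size] for i in range(0, len(blob), shard_size)]


def unshard(shards):
    """Join the chunks in order, decompress, and decode the JSON."""
    return json.loads(zlib.decompress(b"".join(shards)).decode("utf-8"))


record = {"jobs": [f"job-{n}" for n in range(50)]}
shards = shard(record, shard_size=32)

# Reading one chunk on its own yields bytes that are neither valid JSON
# nor a complete zlib stream.
try:
    json.loads(shards[0].decode("utf-8"))
    single_shard_readable = True
except (UnicodeDecodeError, json.JSONDecodeError):
    single_shard_readable = False

restored = unshard(shards)
print(len(shards), single_shard_readable, restored == record)
```

Under these assumptions, a tool that reads one znode at a time sees only an opaque fragment, which is the motivation for a helper that reassembles and decompresses before display.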