Friday, 2021-04-09

ianwok back to nb0200:04
ianwi wonder if all the nested mounts etc means docker pull is unreliable00:04
ianwseems to me "docker-untar / /var/lib/docker/overlay2/c4ccbf94c05603eb4e317f89117879b686031bbe666b77d5a172bf2a7b055126/diff" is what is stuck00:05
clarkbit's a different fs though00:06
clarkbor maybe not00:07
ianwsomething is up with this host00:10
ianw2021-04-08 07:13:13.896 | (0001 / 1391)00:10
ianw2021-04-08 14:54:41.895 | Cloning from ceilometer cache and applying ref *00:10
ianwit spent 7 hours in the git cloning loop00:11
ianwbefore something killed it00:11
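The "7 hours" figure follows from the two log timestamps ianw pasted above. A minimal sketch of that arithmetic, assuming both timestamps are in the same timezone (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamp format used by the two log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

start = datetime.strptime("2021-04-08 07:13:13.896", FMT)
end = datetime.strptime("2021-04-08 14:54:41.895", FMT)

# Difference between the first clone message and the last one seen.
elapsed = end - start
print(elapsed)
```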
ianwthat untar is going, but *really* slowly00:11
ianwdmesg has a bunch of dracut output ...00:13
ianwit's not "/opt", i just cleaned that up and it went quickly00:24
ianwi'm just going to reboot it, i don't have much else insight00:25
ianwit's back, "docker-compose pull" got new images quickly and it's building00:28
ianw#status log reboot after mystery slowdown, appears to be root disk related00:28
openstackstatusianw: finished logging00:28
openstackgerritIan Wienand proposed opendev/system-config master: Handle zuul-summary-results as .jar / per-project config
openstackgerritMerged opendev/system-config master: zuul-summary-status : handle SKIPPED and ERROR jobs
openstackgerritMerged opendev/system-config master: Set MaxConnectionsPerChild 8192 for Gitea backends
openstackgerritIan Wienand proposed opendev/system-config master: Handle zuul-summary-results as .jar / per-project config
openstackgerritIan Wienand proposed opendev/system-config master: review02: pin ipv6 configuration
openstackgerritDmitriy Rabotyagov proposed openstack/diskimage-builder master: Add Debian Bullseye Zuul job
openstackgerritDmitriy Rabotyagov proposed openstack/diskimage-builder master: Add Debian Bullseye Zuul job
*** gouthamr has quit IRC08:08
*** tosky has joined #opendev08:09
*** parallax has joined #opendev08:13
*** ysandeep|afk is now known as ysandeep11:47
*** jpena|lunch is now known as jpena12:29
*** whoami-rajat_ is now known as whoami-rajat12:39
*** amoralej is now known as amoralej|lunch12:58
openstackgerritDmitriy Rabotyagov proposed openstack/project-config master: Add Debian Bullseye nodepool images and wheels
openstackgerritDmitriy Rabotyagov proposed openstack/project-config master: Add Debian bullseye wheel cache publish jobs
*** ysandeep is now known as ysandeep|away13:16
*** ysandeep|away is now known as ysandeep13:35
*** amoralej|lunch is now known as amoralej13:42
fungi#status log Restarted the nodepool-launcher container on in order to free some indefinitely locked node requests15:10
openstackstatusfungi: finished logging15:10
corvusi'm restarting zuul now15:15
corvus#status log restarted zuul at commit 9c3fce2820fb46aa39dbf89984386420fd7a7f7015:16
openstackstatuscorvus: finished logging15:16
corvus2021-04-09 15:21:42,216 ERROR zuul.zk.SemaphoreHandler: Releasing leaked semaphore /zuul/semaphores/openstack/translations held by 39aedef4181a4c749936d98bd02b1a3b-upstream-translation-update15:24
corvusi believe that is expected and good!15:24
clarkbI'm getting settled in to start the account external id cleanups15:25
fungi#status log Deleted server instance "test" (created 2020-10-23) from nodepool tenant in linaro-us15:25
openstackstatusfungi: finished logging15:25
corvusthere are two leaked semaphore releases, and they match the last two acquired semaphores in the log (which were not released before i killed it)15:25
fungii also removed the lingering xxxlarge ready node in linaro-us a few minutes ago15:25
corvusre-enqueue complete15:32
clarkbI'm double checking my input list to external id cleanups against the list of retirements I did last week. Then will likely do these in batches just to keep things to a reasonable size15:32
fungii'm watching the console stream from a centos-8-arm64 job and it's doing jobbity-jobstuffs15:39
*** ysandeep|away is now known as ysandeep15:50
clarkbthere were two inactive accounts that showed up in the audit that weren't in the retirement I did. Looking at them they are both retireable and able to be cleaned up. I have gone ahead and retired them (they were already inactive so this just removes preferred email to prevent it from missing an external id later)15:53
clarkbI think I'm ready to do my first batch based on what I retired last week. Any objections?15:58
fungino objection from me15:59
openstackgerritJeremy Stanley proposed opendev/system-config master: Drop Debian PPA from openafs-client role
clarkbone day I'll remember to tell python to do unbuffered io the first time I run this script after a while16:10
fungii think you can set that in the script?16:10
fungias opposed to having to use the cli option16:10
clarkboh good idea /me looks16:10
fungibut i don't recall how, so maybe i dreamt it16:10
clarkbthe internet suggests replacing sys.stdout or flushing every call16:11
clarkbanyway I've decided to do 4 batches16:13
clarkbthe first one is running now with unbuffered io and had me confused initially :)16:13
fungiyeah, so not as trivial as just setting some value in the script anyway16:14
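For reference, the in-script options discussed above can be sketched as follows. This is a hedged illustration, not the audit script itself: the helper name enable_line_buffering is hypothetical, and the reconfigure() call assumes Python 3.7 or later with a regular text stdout. The cli option fungi mentions is "python -u" (or setting PYTHONUNBUFFERED in the environment).

```python
import sys


def enable_line_buffering(stream=None):
    """Switch a text stream to line buffering; return True if it worked.

    Hypothetical helper for illustration. Streams without a reconfigure()
    method (for example some replaced or captured stdout objects) are
    left untouched.
    """
    stream = sys.stdout if stream is None else stream
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    reconfigure(line_buffering=True)
    return True


# Option 1: set it once near the top of the script.
enable_line_buffering()
print("batch 1 started")

# Option 2: flush explicitly on each call that should appear immediately.
print("batch 1 progress", flush=True)
```

Line buffering only flushes on newlines, so progress output written without a trailing newline would still need an explicit flush.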
clarkbah cool it just finished16:14
clarkbthe output looks the way I expect it so I will do the next batch. When all 4 are done I'll copy the logs up16:14
clarkb(that was actually quicker than I expected too, fridays ftw!)16:15
clarkbI'll do a consistency check when done and ensure we didn't create preferred email address lack external id issues (shouldn't as I checked we had already retired all the accounts first) and see what our new count of external id conflicts is16:19
clarkblogs are up now.16:29
clarkbI'm working on the consistency check next16:29
fungii'm about to get started on adding the debian bullseye wheel volumes16:29
clarkbwhen does bullseye release?16:30
clarkbI was looking last night (briefly and not very in depth) to find out and couldn't see anything16:31
fungi"when it's ready"16:31
fungidebian doesn't really do hard timing on release schedules16:32
fungithere are metrics like number of release critical bugs open and such16:32
fungithough it's getting close to time for a target date to release, i think16:32
fungiexplained at
clarkband then when that happens you have ~1 year to upgrade buster to bullseye?16:36
clarkb(that was the rough info I was looking for)16:36
fungiyeah, basically, except there's also lts after that these days16:37
fungioldstable gets normal support for a year, then lts kicks in16:37
fungithe lts team is predicting to take over buster maintenance in july of next year16:38
clarkbnow down to 334 external id conflicts from 545 at last state (and original was 600 something)16:38
clarkbthe json for that is in the usual location too16:39
clarkbthere were no other issues16:40
clarkb(it's good to keep double checking that as removing external ids could orphan a preferred email address if we don't do the retirement first or get something related wrong)16:40
clarkbI'll rerun the audit script in a bit too16:42
clarkb643 was the original count of conflicts16:42
clarkbprogress indeed16:43
clarkbfungi: are we ready to tell Alex_Gaynor that things are happy again?17:08
clarkbAlex_Gaynor: I believe that centos 8 on arm64 is happy again, but fungi can confirm17:09
hrwbullseye is in hard freeze which for me means 'ready to use for most situations' as there will be small amount of changes only17:12
clarkbhrw: ya I'm not in a huge rush to use it, more wondering what the timeline on my buster installs looks like17:13
*** marios|out has quit IRC17:13
hrwclarkb: I moved some of my desktop systems, server will wait for May/June probably and some day of time to make sure all works17:14
hrwbut that system was ubuntu 13.04 - 13.10 - 14.04 - 16.04 - Debian 10 so things were complicated17:14
fungiclarkb: yeah, i was going to wait until i at least saw some centos builds succeed, but things do seem to be running again. we're *very* backed up though since all the changes waiting on centos nodes previously got completely reenqueued during the zuul restart. we'll likely need the weekend to clear that backlog17:15
hrwfungi: if that helps: I am fine with all kolla jobs queued >12h being killed17:15
fungihrw: it might, but at the moment i'm waiting to see if some of the centos-based ones succeed at least17:16
Alex_Gaynorclarkb: thanks! let's see how it goes17:16
hrwas some of patches got merged in meantime17:16
fungihrw: well, those check builds got their branch bases recreated during the reenqueue at ~15:30 utc, so are running merged with fairly current states17:17
JayFAlex_Gaynor: o/ didn't know you were still openstacking, hope you are well (we worked together in rax sf)17:19
clarkbJayF: we're helping the pyca group do aarch64 builds of some libs not really openstack, but helps openstack as cryptography and friends underpin much of python including openstack17:19
hrw lists set of centos-stream-arm64 jobs from today. marked as failure but all worked fine and gave proper output17:20
JayFWell cryptography is everywhere now :D I'm glad we're helpin' em out17:20
Alex_GaynorJayF: 👋 I'm not actually doing OpenStack stuff :-) I still do python crypto stuff and the opendev folks are very generously helping us out with arm64 CI (I assume because openstack cares about arm64 :D)17:20
hrwah. cryptography and that rust use...17:20
clarkbhrw: we're trying to help make that rust use less painful for everyone running aarch64 by providing systems to build wheels that go on pypi :)17:22
fungiyeah, having official aarch64 wheels of cryptography helps arm-based tests for openstack and our other projects go much faster since they don't need to recompile from sdist every time17:22
clarkbbut ya more generally it's a time savings when installing python (including openstack) on aarch64 since cryptography builds from source can sink a bit of time17:23
fungiand these days everything depends (at least indirectly) on cryptography because Alex_Gaynor and other contributors are doing such a great job with it17:23
hrwAlex_Gaynor: would you consider building and uploading also versions used by previous openstack releases?17:26
Alex_GaynorWe build and release our wheels as a part of our release process, because we only do wheels for platforms we're actively testing, we don't generally upload wheels for past releases.17:28
hrwok, always worth asking ;D17:28
fungiwe're still able to build and cache them in our wheel cache in afs anyway17:30
fungithey just won't be available directly from pypi17:31
hrwand linaro CI has own cache too17:31
Alex_Gaynorclarkb: good news, centos8 works. bad news our unrelated windows CI got itself broken17:32
funginot good when your jobs bsod ;)17:33
clarkbAlex_Gaynor: great! at least on the half we can help with :)17:36
* hrw off17:37
fungiargh, mirror.wheel.bullseyex64 is >22 bytes long17:39
fungiwe need to shave off two bytes... how about mirror.wheel.bullsex64 and mirror.wheel.bullsea64?17:40
hrwbullsa64 bullsx64 to not have 'sex' in url17:40
fungior we could switch to something like mirror.wheel.deb11x6417:40
hrweven better17:40
clarkbfungi: using the release number makes sense since those will stay short until they hit 100 :)17:40
clarkbunless debian pulls a suse17:41
clarkbthough suse went from 42 to 15 so even then we'll probably be ok :)17:41
fungiactually we can fit mirror.wheel.deb011x64 if we're really worried about supporting the next few hundred years of debian17:41
fungibut yeah, i'll go with deb11x64 and deb11a6417:42
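The byte counting in the naming discussion above can be checked with a short sketch. The 22-byte limit is taken from fungi's message; the constant and function names are made up for illustration.

```python
# Limit as stated in the discussion above (assumed here, not looked up).
MAX_VOLUME_NAME_BYTES = 22


def fits(name, limit=MAX_VOLUME_NAME_BYTES):
    """Return True if the volume name is within the byte limit."""
    return len(name.encode("utf-8")) <= limit


candidates = [
    "mirror.wheel.bullseyex64",  # original idea, two bytes too long
    "mirror.wheel.deb11x64",     # chosen name
    "mirror.wheel.deb11a64",     # chosen name
    "mirror.wheel.deb011x64",    # zero-padded variant, exactly at the limit
]

for name in candidates:
    size = len(name.encode("utf-8"))
    verdict = "ok" if fits(name) else "too long"
    print(f"{name}: {size} bytes, {verdict}")
```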
* TheJulia looks at the history and blinks a few times17:42
fungiTheJulia: yep, sorry, i should have read that more closely before suggesting it :/17:43
fungii really just trimmed the minimum number of bytes off "bullseye" but...17:43
JayFlook, don't have a cow, it was a mistake17:49
TheJuliaoh yeah, totally get that :)17:54
TheJuliajust... "surprising" to see ;)17:55
*** roman_g has quit IRC18:17
*** roman_g has joined #opendev18:17
fungii'll admit, i was also surprised to have typed it18:23
fungi#status log Mounted new AFS volumes mirror.wheel.deb11a64 at mirror/wheel/debian-11-aarch64 and mirror.wheel.deb11x64 at mirror/wheel/debian-11-x86_64 with our standard ACLs and base quotas18:44
openstackstatusfungi: finished logging18:44
clarkbI've got an audit of the remaining conflicts running now18:50
fungigood news, i see a kolla-build-centos8-source-aarch64 which succeeded18:55
fungikevinz: when you have time (sorry i know it's probably your weekend now, not urgent at all) we have 8 server instances in linaro-us which are stuck in either BUILD or ERROR state for several days now and would appreciate if you could clean them up so we can use the rest of that quota: a25e6f3b-5695-4729-b9c7-bb6160d7dd55 3ec7f816-52e5-480d-828a-767ebae7cf84 6accab6d-e1bd-4998-85eb-d7764d1ee0f518:58
fungia82e3b41-b0f7-4310-a89f-2a84eef8f8b0 15c38603-41c2-484d-8637-62941b871cbe 449f31dc-8207-420a-a182-da7c58d962bd 402bcb9d-f40f-4b71-81e9-3aba5b4b432c 74485f01-22e3-4df5-bc23-d408af99cd6f18:58
clarkbalright the audit completed and I've stashed it in the usual location. Includes the raw yaml which can be queried as well as some automated categorization in a separate file19:36
clarkbI'm going to stop doing user account stuff now. /me finds something else19:36
clarkb that sounds familiar19:41
clarkbtuning back ssh threads is something we can consider19:41
clarkbif we want to do that, the 3.2.8 upgrade might be a good time for it19:42
fungihuh, yeah, i wonder what the effect is though if you have an insufficient number of ssh threads for the requested operations19:49
clarkbI think they queue up then as timeouts happen clients will be told to go away19:50
clarkbalso you can set threads for bot groups separate from interactive users19:50
clarkbso we could prevent zuul from being starved (though it uses a bunch of http now)19:50
fungiyeah, maybe it'll also just encourage users to switch to https instead of ssh for typical client interactions too19:51
*** stewie925 has quit IRC20:00
openstackgerritClark Boylan proposed opendev/git-review master: Add option for disabling thin pushes
clarkbfungi: ^ something like that to make pushes when the pack issue arises might be good20:19
clarkbfor some reason I think I recall that the dest value can't have an underscore in it? but I may be wrong about that20:20
fungii think it's that gerrit maps underscores to spaces?20:33
fungiat least i recall going through a bunch of different options for getting spaces in topics working at push, and underscores were involved somehow, but ultimately nothing completely worked20:34
clarkbI read the argparse docs and I think it is that by default if I didn't set dest it would be no_thin because the -- prefix is removed and the - in the middle is replaced with a _20:34
clarkbit should work as is and while redundant is consistent with some of the other options there20:34
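The argparse behaviour clarkb describes can be demonstrated with a small sketch. The option name --no-thin is inferred from the no_thin dest mentioned above and may not match the actual git-review change; the two parsers here are illustrative only.

```python
import argparse

# Without dest: argparse derives it from the long option by stripping
# the leading "--" and replacing the inner "-" with "_".
implicit = argparse.ArgumentParser()
implicit.add_argument("--no-thin", action="store_true")

# With an explicit dest: redundant here, but it yields the same attribute.
explicit = argparse.ArgumentParser()
explicit.add_argument("--no-thin", dest="no_thin", action="store_true")

implicit_args = implicit.parse_args(["--no-thin"])
explicit_args = explicit.parse_args(["--no-thin"])

print(implicit_args.no_thin, explicit_args.no_thin)
```

Both parsers expose the flag as no_thin, so an underscore in dest is fine and setting it explicitly only makes the attribute name visible in the code.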
clarkbI also haven't tested this change yet.20:35
fungishouldn't be hard to add a functional test for it20:35
clarkbya I expect just copy a push test and add the no thin?20:36
clarkbit is hard to verify that git is doing what we want but we can at least ensure it doesn't flat out break20:36
clarkbI'll start looking into that next20:36
openstackgerritClark Boylan proposed opendev/git-review master: Add option for disabling thin pushes
fungiright, i don't think we need to test that git does what it says on the tin, that's for git's maintainers to worry about20:51
openstackgerritDmitriy Rabotyagov proposed openstack/project-config master: Add Debian Bullseye nodepool images and wheels
openstackgerritDmitriy Rabotyagov proposed openstack/project-config master: Add Debian bullseye wheel cache publish jobs
clarkbfungi: that did pass the test I added so I guess it's probably good? I will probably try using it a bit next week too22:56
fungiyeah, seems fine to me22:57
