*** hamalq has quit IRC | 00:02 | |
*** mlavalle has quit IRC | 00:10 | |
TheJulia | corvus: is it down? Looks like connections are timing out and I got a blank page load right before that | 00:22 |
corvus | TheJulia: it's extremely degraded, but hasn't lost queue state or events. it should eventually recover and restart all the jobs. | 00:23 |
TheJulia | okay | 00:25 |
TheJulia | thanks corvus! | 00:25 |
corvus | TheJulia: thanks for not throwing vegetables at me :) | 00:26 |
TheJulia | corvus: I've been there myself long ago | 00:26 |
*** ysandeep|away is now known as ysandeep | 00:41 | |
fungi | zuul-scheduler process still has a cpu completely pegged and the rest api is unresponsive, but the debug log does indicate it's still dispatching builds to executors (albeit slowly) | 00:47 |
*** diablo_rojo has quit IRC | 00:52 | |
fungi | WARNING kazoo.client: Connection dropped: socket connection error: EOF occurred in violation of protocol (_ssl.c:1125) | 00:57 |
fungi | is that what the zk connection timeouts look like? | 00:57 |
fungi | seeing them go by in the debug log every few minutes | 00:57 |
fungi | roughly 2-4 minutes apart for a while | 00:58 |
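The WARNING above comes from kazoo, the ZooKeeper client library Zuul uses, and indicates its TLS session to ZooKeeper dropped mid-protocol. Below is a minimal sketch (not Zuul's actual handler; the host name is hypothetical) of how kazoo surfaces those connection state changes to an application:

```python
import logging

from kazoo.client import KazooClient, KazooState

logging.basicConfig(level=logging.WARNING)


def zk_listener(state):
    # Called from kazoo's connection thread on every state transition.
    if state == KazooState.SUSPENDED:
        logging.warning("zk connection suspended; holding off on writes")
    elif state == KazooState.LOST:
        logging.warning("zk session lost; ephemeral nodes and locks are gone")
    else:  # KazooState.CONNECTED
        logging.warning("zk (re)connected; safe to resume")


# Hypothetical host; opendev's cluster details differ.
client = KazooClient(hosts="zk.example.org:2181")
client.add_listener(zk_listener)
client.start()
```

A steady trickle of SUSPENDED/LOST transitions every few minutes, as fungi describes, usually points to the client being too starved for CPU to answer session heartbeats in time rather than to a network problem.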
fungi | corvus: are we likely generating retry events because of zookeeper disconnects faster than we can process them, or do you still expect it to recover on its own without restarting? | 01:01 |
fungi | i'm happy to work on a scheduler restart to get things moving again and try to reenqueue everything from the periodic queue backups. looks like we have one from 23:41 utc | 01:06 |
fungi | which is roughly the time everything seems to have ground to an almost-halt | 01:06 |
corvus | fungi: i have a very large query running | 01:08 |
corvus | i'd like to let it finish | 01:08 |
fungi | no worries, wasn't sure if you were done yet | 01:08 |
corvus | i'll make sure it's running before i go to bed | 01:09 |
fungi | thanks! | 01:10 |
weshay|ruck | something going on.. seeing a ton of retry_limits on centos-8 jobs | 01:23 |
weshay|ruck | ? | 01:24 |
* weshay|ruck reads | 01:24 | |
fungi | weshay|ruck: yeah, we're trying to get to the bottom of a recent memory leak in zuul | 01:44 |
*** mfixtex has quit IRC | 01:46 | |
*** brinzhang_ is now known as brinzhang | 02:04 | |
johnsom | Is there an ETA on zuul coming back? | 03:07 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Planet OPML file https://review.opendev.org/c/opendev/system-config/+/784191 | 03:09 |
corvus | i'm going to restart it now | 03:16 |
corvus | #status log restarted zuul after freeze while debugging memleak | 03:23 |
corvus | should be up now | 03:24 |
TheJulia | \o/ | 03:46 |
*** gothicserpent has quit IRC | 03:49 | |
*** ykarel|away has joined #opendev | 03:50 | |
*** tkajinam has quit IRC | 03:51 | |
*** tkajinam has joined #opendev | 03:51 | |
*** tkajinam has quit IRC | 03:52 | |
*** tkajinam has joined #opendev | 03:53 | |
*** ykarel|away is now known as ykarel | 03:54 | |
*** whoami-rajat has joined #opendev | 04:17 | |
*** marios has joined #opendev | 05:25 | |
*** rosmaita has joined #opendev | 05:47 | |
*** sboyron has joined #opendev | 05:47 | |
*** tkajinam has quit IRC | 06:02 | |
*** tkajinam has joined #opendev | 06:03 | |
*** tkajinam has quit IRC | 06:03 | |
*** tkajinam has joined #opendev | 06:03 | |
*** bandini has joined #opendev | 06:10 | |
*** lpetrut has joined #opendev | 06:30 | |
*** hashar has joined #opendev | 06:39 | |
*** gibi_away is now known as gibi | 06:58 | |
openstackgerrit | Merged opendev/system-config master: Explicitly create empty reprepro dists https://review.opendev.org/c/opendev/system-config/+/784158 | 07:24 |
*** CeeMac has quit IRC | 07:24 | |
openstackgerrit | Merged opendev/system-config master: Correct debian-security repo codename for bullseye https://review.opendev.org/c/opendev/system-config/+/784169 | 07:24 |
openstackgerrit | xinliang proposed openstack/diskimage-builder master: Fix generate two grub.cfg files https://review.opendev.org/c/openstack/diskimage-builder/+/784203 | 07:39 |
*** tosky has joined #opendev | 07:45 | |
openstackgerrit | Daniel Blixt proposed zuul/zuul-jobs master: WIP: Make build-sshkey handling windows compatible https://review.opendev.org/c/zuul/zuul-jobs/+/780662 | 07:47 |
Tengu | hello there! is the "job retry_limit/pause" issue solved? or may I help on it if it's still relevant? | 07:55 |
*** ykarel has quit IRC | 08:03 | |
*** ykarel has joined #opendev | 08:05 | |
*** ysandeep is now known as ysandeep|lunch | 08:27 | |
*** jaicaa has quit IRC | 08:33 | |
*** jaicaa has joined #opendev | 08:36 | |
*** dtantsur|afk is now known as dtantsur | 08:44 | |
*** ykarel is now known as ykarel|lunch | 08:58 | |
openstackgerrit | Irene Calderón proposed opendev/storyboard master: Esto es una prueba https://review.opendev.org/c/opendev/storyboard/+/784329 | 09:37 |
*** elod is now known as elod_afk | 10:01 | |
*** ysandeep|lunch is now known as ysandeep | 10:05 | |
*** ykarel|lunch is now known as ykarel | 10:10 | |
openstackgerrit | xinliang proposed openstack/diskimage-builder master: Introduce openEuler distro https://review.opendev.org/c/openstack/diskimage-builder/+/784363 | 10:13 |
zbr|rover | do we use links to logs instead of the zuul build page in zuul comments on purpose or by accident? I kinda prefer being sent to the zuul page instead of the logs page. | 10:27 |
zbr|rover | i would personally find it more convenient if the links in comments were the same as the ones inside the new "zuul summary" tab. | 10:28 |
zbr|rover | funny, trying to load https://zuul.opendev.org/t/openstack/build/e14185c56a0f495ca21c3afd0c67a7aa managed to crash chrome. | 10:30 |
openstackgerrit | xinliang proposed openstack/diskimage-builder master: Introduce openEuler distro https://review.opendev.org/c/openstack/diskimage-builder/+/784363 | 10:36 |
*** hashar is now known as hasharLunch | 11:02 | |
*** ysandeep is now known as ysandeep|afk | 11:35 | |
*** hasharLunch is now known as hashar | 11:58 | |
*** bhagyashris has quit IRC | 12:28 | |
*** bhagyashris has joined #opendev | 12:29 | |
*** hrw has quit IRC | 12:29 | |
*** ysandeep|afk is now known as ysandeep | 12:37 | |
*** stand has joined #opendev | 12:55 | |
weshay|ruck | fungi, and all.. thanks!! | 12:57 |
*** gothicserpent has joined #opendev | 13:21 | |
*** roman_g has joined #opendev | 13:24 | |
*** gothicserpent has quit IRC | 13:25 | |
*** mailingsam has joined #opendev | 13:46 | |
fungi | zbr|rover: which comments are you talking about? | 13:50 |
zbr|rover | fungi: nevermind. i think it was PEBKAC on that, when i checked the urls manually they were identical. | 13:52 |
zbr|rover | the "lost between browser tabs" would describe it better | 13:53 |
*** gothicserpent has joined #opendev | 14:04 | |
*** gothicserpent has quit IRC | 14:04 | |
fungi | happens to me too, sure | 14:07 |
fungi | Tengu: solved (or at least gone for now) | 14:07 |
fungi | we've been trying to get to the bottom of a new memory leak in the zuul scheduler, but interactively debugging the live process was slowing it down considerably and causing side effects like spurious mass job retries | 14:08 |
fungi | the memory leak is not gone yet, we're still collecting data | 14:08 |
Tengu | fungi: ah, thanks for the info! | 14:12 |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Temporarily serve tarballs site from AFS R+W vols https://review.opendev.org/c/opendev/system-config/+/784424 | 14:14 |
fungi | infra-root: expedited approval of that ^ is appreciated so we can get back to serving current content on the tarballs site until the ord replication is finished | 14:14 |
*** elod_afk is now known as elod | 14:29 | |
*** ysandeep is now known as ysandeep|away | 14:30 | |
corvus | fungi: i'd like to try to do another data collection pass; hopefully not as terrible as last night, but still almost certainly disruptive | 14:32 |
*** lbragstad has quit IRC | 14:36 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: ensure-kubernetes: remove dns resolvers hack https://review.opendev.org/c/zuul/zuul-jobs/+/784427 | 14:37 |
fungi | corvus: probably the earlier the better | 14:38 |
fungi | lots of openstack teams are under a lot of stress since next week is final release candidates for wallaby | 14:40 |
fungi | so there's been quite a bit of scrambling to get final fixes merged, as usual | 14:40 |
*** roman_g has quit IRC | 14:45 | |
*** lpetrut has quit IRC | 14:45 | |
openstackgerrit | Sorin Sbârnea proposed openstack/diskimage-builder master: WIP: Add freebash disk image https://review.opendev.org/c/openstack/diskimage-builder/+/784432 | 14:56 |
openstackgerrit | Sorin Sbârnea proposed openstack/diskimage-builder master: WIP: Add freebsd disk image https://review.opendev.org/c/openstack/diskimage-builder/+/784432 | 14:56 |
*** chkumar|ruck is now known as raukadah | 15:01 | |
*** tkajinam has quit IRC | 15:02 | |
*** dtantsur is now known as dtantsur|afk | 15:06 | |
corvus | this current query is proving to be quite disruptive; i have a copy of the queues saved from before i started it though; so if we decide to abort it, i can re-enqueue | 15:11 |
corvus | i believe i have thought of a way to make objgraph nicer though; if we do abort/restart, i'll work on that | 15:12 |
corvus | fungi: no result yet; i think we should restart :( | 15:27 |
*** zbr|rover is now known as zbr | 15:29 | |
openstackgerrit | Jeremy Stanley proposed zuul/zuul-jobs master: Document algorithm var for remove-build-sshkey https://review.opendev.org/c/zuul/zuul-jobs/+/783988 | 15:30 |
fungi | corvus: okay, do you need help with the restart or want me to do it? | 15:30 |
corvus | fungi: i won't do any more debugging today; i'll resume tonight or tomorrow, and do so with a process which is hopefully nicer and can be aborted. | 15:30 |
corvus | fungi: nah, i got it | 15:30 |
fungi | thanks! | 15:30 |
*** mlavalle has joined #opendev | 15:33 | |
corvus | fungi, clarkb: i wonder if running the objgraph query in a fork would be effective? | 15:36 |
corvus | tobiash: ^ | 15:36 |
fungi | corvus: that's an interesting idea | 15:36 |
fungi | it would get its own copy of memory i guess | 15:36 |
corvus | yeah, i'm assuming all the objects would be there and leaked; we'd want to make sure all the tcp connections are closed. | 15:37 |
tobiash | corvus: a fork should work | 15:38 |
corvus | my first idea is to just modify the objgraph methods to add in a sleep between each call to gc.get_referrers, and to check for a stop flag; but if we can do the work in a forked process, we would have an entire cpu available. | 15:38 |
corvus | cool; i'll prototype the fork idea on a local zuul and if it works, try that out in the next debug session tonight/tomorrow. | 15:39 |
fungi | that does seem like it's worth trying anyway | 15:39 |
fungi | and the fork is still using all the same pointers, so shouldn't increase actual memory utilization significantly, right? | 15:40 |
fungi | assuming we don't memcpy everything | 15:40 |
corvus | fungi: yeah, i think mem usage would increase moderately slowly as pages get cow'd | 15:40 |
fungi | cool, that's what i was hoping | 15:41 |
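A minimal sketch of the fork idea being discussed here, assuming the query is kicked off from the REPL: the child gets a copy-on-write view of the scheduler's heap and burns its own CPU while the parent keeps running. The function name and output path are illustrative, not actual Zuul code:

```python
import os

import objgraph


def dump_leak_report(path="/tmp/objgraph-report.txt"):  # hypothetical path
    pid = os.fork()
    if pid > 0:
        # Parent (the real scheduler): note the child pid and carry on.
        return pid
    # Child: only the thread that called fork() survives, so the scheduler
    # and ZooKeeper threads do not run here and nothing fights over work.
    # A real version would also close inherited sockets before doing anything.
    try:
        with open(path, "w") as f:
            # Cheap overview first, then the expensive growth query.
            objgraph.show_most_common_types(limit=50, file=f)
            objgraph.show_growth(limit=50, file=f)
    finally:
        # Bypass atexit/cleanup handlers inherited from the parent.
        os._exit(0)
```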
*** diablo_rojo has joined #opendev | 15:41 | |
diablo_rojo | fungi, clarkb I assume you're already aware the zuul status site is not loading? | 15:43 |
*** ykarel is now known as ykarel|away | 15:43 | |
fungi | diablo_rojo: yeah, was just talking about that in #openstack-infra with some other folks, probably we need to restart zuul-web now that zuul-scheduler has been restarted | 15:44 |
fungi | corvus: shall i? or do you think it will recover on its own? | 15:44 |
corvus | fungi: it's up | 15:44 |
diablo_rojo | fungi, ah okay cool. Thanks! | 15:44 |
fungi | oh, perfect. thanks! | 15:44 |
diablo_rojo | Way ahead of me :) | 15:45 |
diablo_rojo | Thanks fungi and corvus! | 15:45 |
corvus | #status log restarted zuul after going unresponsive during debugging | 15:47 |
*** whoami-rajat has quit IRC | 15:47 | |
corvus | fungi: restart and re-enqueue is complete | 15:47 |
corvus | fungi: i'm done debugging for the day | 15:48 |
fungi | thanks again! | 15:48 |
fungi | i'll keep an eye on the memory graph | 15:48 |
*** bandini has quit IRC | 15:49 | |
*** ykarel|away has quit IRC | 15:53 | |
*** hashar has quit IRC | 15:58 | |
*** fressi has joined #opendev | 15:59 | |
*** fressi has left #opendev | 15:59 | |
*** sshnaidm is now known as sshnaidm|afk | 16:20 | |
*** marios is now known as marios|out | 16:31 | |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu https://review.opendev.org/c/zuul/zuul-jobs/+/765177 | 16:37 |
*** hamalq has joined #opendev | 16:40 | |
*** ysandeep|away is now known as ysandeep | 16:41 | |
openstackgerrit | Merged opendev/system-config master: Temporarily serve tarballs site from AFS R+W vols https://review.opendev.org/c/opendev/system-config/+/784424 | 16:59 |
*** marios|out has quit IRC | 17:20 | |
clarkb | corvus: re a fork, the risk there is we'll have two schedulers fighting to do the same work? I guess the thing you'll be POCing is convincing the child to be inactive while running? | 17:38 |
fungi | yeah, i took that as a given. the fork needs to explicitly do nothing, i think | 17:38 |
fungi | close all inherited file descriptors, maybe even just go into a busywait | 17:39 |
corvus | clarkb: the only thread in the fork should be the one where fork is called, yeah? so that would be the repl server thread becoming the main thread of the new process, i would think. | 17:39 |
clarkb | infra-root I'm going to try and dig into the vexxhost ipv6 stuff after lunch today, as I suspect that is impacting job runtimes in that cloud as well as our new review server. I think a good next step there will be to jump on some in-use test nodes and check their ipv6 networking configs to see if any patterns emerge and go from there, since ianw seems to have the tcpdumping covered on review | 17:39 |
corvus | (ie, the scheduler thread would not exist) | 17:39 |
fungi | ahh, yeah the repl as the only thread would solve it | 17:40 |
fungi | as long as you don't execute functions in the repl to make that no longer true, but the answer there is to just not do that | 17:41 |
clarkb | ah yup | 17:42 |
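For the point being settled here, a standalone demonstration (not Zuul code) that a forked child inherits only the thread that called fork(), so no scheduler-style worker thread can run in it:

```python
import os
import threading
import time


def worker():
    while True:
        time.sleep(1)


threading.Thread(target=worker, name="scheduler-ish", daemon=True).start()
print("parent threads:", [t.name for t in threading.enumerate()])

pid = os.fork()
if pid == 0:
    # Child: only the forking thread remains; "scheduler-ish" is gone.
    print("child threads:", [t.name for t in threading.enumerate()])
    os._exit(0)
os.waitpid(pid, 0)
```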
clarkb | fungi: have we landed the second pair of openedge cleanups yet? that was next on my list to check on from yesterday | 17:43 |
* clarkb finds change links | 17:44 | |
clarkb | https://review.opendev.org/c/opendev/system-config/+/783991 looks like that hasn't merged yet. Any reason to not do that now (sounds like zuul things are settling for the moment?) | 17:45 |
fungi | clarkb: no, haven't yet | 17:46 |
fungi | but should be safe now | 17:46 |
clarkb | ok I'll +A it now | 17:46 |
fungi | thanks! | 17:53 |
*** ysandeep is now known as ysandeep|away | 17:57 | |
*** ykarel|away has joined #opendev | 18:13 | |
*** ykarel|away has quit IRC | 18:25 | |
openstackgerrit | Merged opendev/system-config master: Clean up OpenEdge configuration https://review.opendev.org/c/opendev/system-config/+/783991 | 18:43 |
fungi | clarkb: i suppose 784086 can go in now too since that's merged | 18:48 |
clarkb | fungi: ++ do you want to +A or should I? | 18:52 |
fungi | feel free | 18:52 |
*** gothicserpent has joined #opendev | 18:59 | |
*** gothicserpent has quit IRC | 19:01 | |
openstackgerrit | Merged opendev/zone-opendev.org master: Clean up OpenEdge configuration https://review.opendev.org/c/opendev/zone-opendev.org/+/784086 | 19:01 |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu https://review.opendev.org/c/zuul/zuul-jobs/+/765177 | 19:07 |
*** mailingsam has quit IRC | 19:12 | |
clarkb | I've started looking at a vexxhost test node to see what is going on with its networking and try to work from that. One thing I checked while I was there is whether the svm cpu flag is present, and it is (this means amd nested virt is a possibility) | 19:17 |
clarkb | dmesg also confirms nested virt is enabled. What i am not seeing is glean or network config at first boot at all | 19:22 |
clarkb | checking some of the older hosts that exist in nodepool's list, none of them seem to have more than one globally routable address, and they have 2 default routes (2 default routes are expected on the public ipv6 interface iirc) | 19:30 |
clarkb | so that is all looking good from the test node side. Makes me wonder if they aren't typically sticking around long enough to have trouble | 19:30 |
*** gothicserpent has joined #opendev | 19:36 | |
fungi | which test node? | 19:37 |
fungi | oh, "a vexxhost test node" i see | 19:38 |
fungi | trying to find an explanation for the replacement gerrit server's ipv6 madness? | 19:38 |
clarkb | yup, and also the pip installation slowness in johnsom's example from yesterday which I suspect is also related | 19:39 |
clarkb | mirror.ca-ymq-1.vexxhost.opendev.org:/etc/netplan/50-cloud-init.yaml is the modified file that fixed this problem there | 19:39 |
clarkb | we set dhcp6 and accept-ra to false then manually set routes and addr based on the values that we had previously accepted via RA which mnaser confirmed should be stable | 19:40 |
clarkb | however we never really got any more info from the cloud side why this was happening | 19:40 |
clarkb | assuming things are expected to continue to be stable cloud side we could do similar for review02 but that seems really clunky and we should consider documenting/automating that configuration for vexxhost nodes if we do | 19:40 |
*** rosmaita has left #opendev | 19:41 | |
fungi | yeah, and bug 1844712 is still basically getting no traction | 19:41 |
openstack | bug 1844712 in OpenStack Security Advisory "RA Leak on tenant network" [Undecided,Incomplete] https://launchpad.net/bugs/1844712 | 19:41 |
johnsom | Just throwing a wild idea out, are you accidentally over restricting icmpv6? | 19:41 |
johnsom | If routers don't get all of the neighbor discovery goodness they can prune routes? | 19:42 |
clarkb | johnsom: the problem is we're getting RAs for networks we aren't on | 19:42 |
johnsom | oops, ?->. I see this on my comcast IPv6 | 19:42 |
johnsom | Ah, well, that is a whole different issue. lol | 19:42 |
fungi | johnsom: not filtering icmpv6, no. we're wondering if it's that we're getting announcements from gateways which aren't really valid gateways in addition to the correct ones | 19:42 |
clarkb | so then when the host tries to talk from that source addr over that route the packets end up in the bitbucket | 19:42 |
clarkb | we solved that on the mirror node by disabling dynamic configuration which is less than ideal when nova says use ipv6_slaac | 19:43 |
clarkb | johnsom: but I suspect that may explain some of the pip installation slowness in your timeout-on-vexxhost example, as pip may wait for ipv6 to time out then fall back to ipv4 (particularly notable is that the time spent is consistently ~60 seconds every time) | 19:44 |
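A rough sketch of why a blackholed v6 route shows up as a consistent stall: a naive client walks getaddrinfo() results in order, so the unreachable IPv6 address has to eat the whole connect timeout before the IPv4 address is even tried. Illustrative only; pip/urllib3's real connection logic is more involved, and the ~60 second figure depends on the configured timeout.

```python
import socket
import time


def connect_first_working(host, port, timeout=60):
    """Try each resolved address in order; a dead AAAA record costs the
    full timeout before the A record is attempted."""
    last_err = None
    for family, socktype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.connect(addr)
            print(f"connected to {addr} after {time.monotonic() - start:.1f}s")
            return sock
        except OSError as exc:
            print(f"{addr} failed after {time.monotonic() - start:.1f}s: {exc}")
            last_err = exc
            sock.close()
    raise last_err


# e.g. connect_first_working("pypi.org", 443)
```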
fungi | johnsom: the bug report above, we've seen it happen both in vexxhost and limestone, where it's leaking between tenants even, but i can imagine it's even more likely to occur within a tenant (some job sets up routing on a vnic, begins spewing ra packets onto the network, other nodes see those and add prefixes/routes) | 19:44 |
fungi | in theory neutron filters that, but it seems that sometimes that doesn't actually happen | 19:45 |
fungi | and as of yet, nobody's come up with a sound theory on why | 19:45 |
clarkb | and we set up a bunch of nodes once and tried to inject RAs ourselves and they never showed up on other hosts (as expected) | 19:46 |
fungi | yeah, in the past there have been races around things like port creation/deletion, et cetera, where filtering had gaps | 19:46 |
clarkb | fungi: I think my next step is to boot a vexxhost test node manually and see if I can reproduce there if the node hangs around long enough (say, check it tomorrow) | 19:47 |
clarkb | but otherwise on the test node side I didn't see anything amiss after checking about 10 instances | 19:47 |
fungi | i have a feeling it could happen in bursts, and relies on some specific set of circumstances | 19:47 |
clarkb | and maybe we set up review02 to mimic mirror01 when ianw returns | 19:47 |
fungi | you have to catch it when the right job has run there recently and misbehaved in that way while the other node was up and running | 19:48 |
clarkb | ya | 19:48 |
openstackgerrit | Dmitriy Rabotyagov proposed openstack/diskimage-builder master: Add Debian Bullseye Zuul job https://review.opendev.org/c/openstack/diskimage-builder/+/783790 | 19:50 |
*** slaweq_ has joined #opendev | 19:51 | |
*** slaweq has quit IRC | 19:52 | |
*** CeeMac has joined #opendev | 19:54 | |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu https://review.opendev.org/c/zuul/zuul-jobs/+/765177 | 20:03 |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu https://review.opendev.org/c/zuul/zuul-jobs/+/765177 | 20:20 |
openstackgerrit | Paul Belanger proposed zuul/zuul-jobs master: ensure-podman: Use official podman repos for ubuntu https://review.opendev.org/c/zuul/zuul-jobs/+/765177 | 20:39 |
*** gothicserpent has quit IRC | 21:07 | |
clarkb | fungi: have you had a chance to look at those gerrit account cleanup proposals? (I know it's been a busy week or longer of fires) if possible it would be nice to go through those tomorrow | 21:36 |
*** diablo_rojo has quit IRC | 21:38 | |
*** osmanlicilegi has quit IRC | 21:44 | |
fungi | not yet, but i can take a look now | 21:44 |
clarkb | cool and thank you | 21:44 |
fungi | they're in your homedir on review? | 21:44 |
clarkb | yes let me find the exact path for you | 21:45 |
clarkb | fungi: ~/gerrit_user_cleanups/notes.20210315 | 21:45 |
fungi | 784424 merged almost 5 hours ago and still hasn't deployed | 21:45 |
fungi | and the tarballs release is still running | 21:46 |
clarkb | are we waiting on available executor slots? | 21:46 |
clarkb | those don't use normal nodes so we shouldn't be queued up behind nodepool usage | 21:46 |
fungi | checking to see if it's still in the queue | 21:47 |
clarkb | ya they are all in the queues still | 21:47 |
clarkb | as waiting | 21:47 |
fungi | ahh, yep | 21:48 |
fungi | probably blocked on the periodics? | 21:48 |
clarkb | as well as a large number of tag jobs :/ | 21:48 |
clarkb | well but periodic is doing the same thing it is just waiting as well | 21:48 |
clarkb | is this possibly a side effect of corvus' debugging? | 21:48 |
fungi | yeah, nothing's actually running for those items | 21:49 |
clarkb | the infra-prod jobs do have a semaphore | 21:50 |
clarkb | do you know if those tag jobs do too? (just wondering if this is more semaphore weirdness) | 21:50 |
fungi | the times on all those items line up with the reenqueue | 21:50 |
fungi | so maybe something is weird about how they were reenqueued | 21:50 |
corvus | clarkb, fungi: i had a fleeting thought that we may have leaked actual semaphores in the crash | 21:51 |
corvus | we have semaphore cleanup as a todo item | 21:51 |
fungi | oh! right, so maybe leaked semaphores still sitting in znodes? | 21:51 |
corvus | yep; is this only affecting infra, or is it wider? how urgent? | 21:52 |
clarkb | corvus: it is affecting a bunch of openstack tag jobs | 21:52 |
clarkb | infra + those tag jobs are the only ones I've seen so far | 21:53 |
fungi | those tag jobs are all release note publication though | 21:53 |
fungi | so maybe not urgent | 21:53 |
clarkb | ah yup | 21:53 |
fungi | and the rest, yeah, just opendev infra deployment and config management runs | 21:53 |
fungi | so not a huge deal, i can manually apply 784424 in production, that's the only thing really causing lingering pain as far as i know | 21:54 |
corvus | how does "resolve within 4 hours" sound for priority on this? | 21:54 |
fungi | oh that's plenty soon as far as i'm concerned | 21:54 |
clarkb | wfm | 21:55 |
corvus | ok. if it's more urgent, i can increase that; but all things being equal, that's convenient for me. | 21:55 |
fungi | i've manually applied 784424 in production now, so nothing else urgent i know about | 21:55 |
fungi | also happy to try manual znode surgery, betting it's safe to delete semaphore znodes older than the restart | 21:55 |
corvus | fungi: i think it's actually a znode edit | 22:00 |
fungi | ahh, okay | 22:00 |
corvus | i think a semaphore is now a json list of jobs which hold it | 22:00 |
corvus | in the case of a semaphore max of 1, however, a delete should be ok | 22:01 |
fungi | so the semaphore itself is persistent, but may be empty | 22:01 |
corvus | my guess is we have a semaphore that looks like "/path/to/semaphores/infra-prod-something" and its contents are "['build-uuid-from-before-restart']" | 22:02 |
corvus | if that's the case, and the max is 1, we can delete that znode | 22:02 |
fungi | clarkb: on the account cleanup topic, i'm in the process of adapting openstack's election tooling to work around the lack of anonymous access to the emails method in the rest api, and i'm finding there are at least some accounts which have contributed changes recently but have no preferred e-mail. wonder if we can (or should) do anything about those | 22:02 |
corvus | but if it were "['build-uuid-from-before-restart', 'build-uuid-from-after-restart']" with a max of 2, then editing would be required. | 22:02 |
fungi | corvus: right, and i have a feeling we have the latter because in this case there wouldn't be builds waiting otherwise | 22:03 |
clarkb | fungi: those users can simply go in via the web ui and set up a preferred email | 22:03 |
corvus | fungi: i think we have the former because i think our max is 1? | 22:03 |
fungi | clarkb: if we can figure out how to contact them ;) | 22:03 |
clarkb | fungi: they likely have external ids with emails in them | 22:03 |
clarkb | or the git commits they have pushed | 22:03 |
fungi | corvus: oh, okay, i must have misunderstood. so waiting builds don't get added to the data structure in the semaphore znode | 22:04 |
corvus | fungi: correct, only builds which hold the lock | 22:04 |
fungi | anyway, i'll stop distracting you | 22:04 |
fungi | clarkb: excellent point, they will probably have a committer address on the change even if the owner account has no preferred address | 22:05 |
fungi | i can probably use that as a fallback even | 22:05 |
fungi | clarkb: looking at your list, i wonder if we can also identify accounts with invalid openids and no ssh keys and no password set (regardless of whether they have a username)? | 22:08 |
clarkb | fungi: checking if password is set is hard because you have to dig into the git repo directly | 22:09 |
clarkb | it is doable though | 22:09 |
clarkb | fungi: maybe take the set that meet the other criteria as a sublist then check the git repo directly for that? | 22:09 |
clarkb | (no apis expose that essentially) | 22:09 |
fungi | ahh, nevermind. they're not usable, though may have been used previously since we did wipe all the passwords after the incident | 22:09 |
fungi | yeah, i'm good with the stuff in your proposed list. i spot-checked some from each category | 22:10 |
fungi | also i'll be around tomorrow to help with the cleanup on these if you want | 22:10 |
fungi | aha! i just realized most of these changes owned by an account with no preferred e-mail are from "OpenStack Proposal Bot" | 22:12 |
clarkb | silly bot | 22:13 |
clarkb | thank you for checking and I'll need to get back up to speed on running my scripts again :) | 22:13 |
clarkb | ok I've reviewed the gerrit db change | 22:39 |
*** sboyron has quit IRC | 22:50 | |
*** eharney has quit IRC | 22:54 | |
clarkb | fungi: should we warn the release team about the tag jobs? I assume those tags were pushed by them? but I guess they could be independent? | 22:54 |
*** tkajinam has joined #opendev | 22:57 | |
*** tkajinam has quit IRC | 22:57 | |
*** tkajinam has joined #opendev | 22:58 | |
*** auristor has quit IRC | 23:04 | |
corvus | (CONNECTED [localhost:2181]) /zuul/semaphores/openstack> get publish-releasenotes | 23:17 |
corvus | ["baaab4cfbc074796b5be235775754aaf-publish-openstack-releasenotes-python3"] | 23:17 |
corvus | (CONNECTED [localhost:2181]) /zuul/semaphores/openstack> get infra-prod-playbook | 23:17 |
corvus | ["d573f07ba3094f52bf6a69cf7a0f02a7-infra-prod-service-registry"] | 23:17 |
corvus | those are the 2 semaphores currently held | 23:17 |
corvus | this is pretty cool; i like this level of visibility :) | 23:18 |
corvus | that's uuid-jobname | 23:18 |
corvus | oh, those are queue item uuids | 23:19 |
corvus | (thus the job name addition to make it unique) | 23:20 |
corvus | i think that's so that if the build uuid changes, we keep the semaphore | 23:20 |
corvus | last restart was at 20:32 | 23:20 |
corvus | baaab4cfbc074796b5be235775754aaf last appeared in the log at 14:47 | 23:21 |
corvus | wait that restart time doesn't look right | 23:21 |
corvus | 15:47 was last restart | 23:22 |
corvus | looks like my last log entry didn't make it to the wiki :/ | 23:22 |
corvus | anyway, that entry is confirmed as stale | 23:22 |
corvus | https://codesearch.opendev.org/?q=publish-releasenotes&i=nope&files=&excludeFiles=&repos= says max is 1 | 23:23 |
corvus | so i will remove the entry | 23:23 |
corvus | the same is true for infra-prod-playbook, but it's even older. removed | 23:26 |
corvus | top releasenotes job is queued now; top infra-prod job is running | 23:27 |
clarkb | infra-prod job is in the periodic queue if anyone has trouble finding it | 23:27 |
clarkb | corvus: when you say remove the entry you removed the znode entirely or made the znode content [] ? | 23:28 |
corvus | clarkb: removed entirely | 23:28 |
corvus | shortcut valid for max=1 semaphores only | 23:28 |
clarkb | for max>1 you would edit the json to remove invalid job entries? | 23:28 |
corvus | yep | 23:29 |
corvus | but i'm going to write code so no one ever has to do that :) | 23:29 |
clarkb | ++ | 23:31 |
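A minimal sketch of what that automated cleanup could look like, using kazoo directly. The znode layout and the delete-when-empty shortcut for max=1 semaphores follow the zkCli session above; the path handling, the set of live queue item uuids, and the function name are illustrative, not the code corvus ends up writing:

```python
import json

from kazoo.client import KazooClient


def cleanup_stale_semaphores(zk, tenant, live_item_uuids, sem_max=1):
    """Drop semaphore holders whose queue items no longer exist.

    Assumes each semaphore znode holds a JSON list of
    '<queue-item-uuid>-<job-name>' strings, as in the zkCli output above.
    """
    base = f"/zuul/semaphores/{tenant}"
    for name in zk.get_children(base):
        path = f"{base}/{name}"
        data, stat = zk.get(path)
        holders = json.loads(data.decode("utf8")) if data else []
        live = [h for h in holders if h.split("-", 1)[0] in live_item_uuids]
        if live == holders:
            continue  # nothing stale here
        if not live and sem_max == 1:
            # No valid holder remains; the max=1 shortcut is to drop the znode.
            zk.delete(path, version=stat.version)
        else:
            # Otherwise rewrite the holder list in place (the max>1 case).
            zk.set(path, json.dumps(live).encode("utf8"), version=stat.version)


# zk = KazooClient(hosts="localhost:2181"); zk.start()
# cleanup_stale_semaphores(zk, "openstack", running_item_uuids)
```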
*** tosky has quit IRC | 23:40 |