opendevreview | Merged openstack/project-config master: nodepool: infra-package-needs; cleanup python https://review.opendev.org/c/openstack/project-config/+/872476 | 00:16 |
opendevreview | Merged openstack/project-config master: nodepool: infra-package-needs; remove lvm2 https://review.opendev.org/c/openstack/project-config/+/872477 | 00:16 |
tonyb | I'm still seeing really slow responses from gitea08 | 02:32 |
ianw | load average: 78.44, 85.12, 99.48 | 02:33 |
ianw | it is .. unhappy | 02:34 |
ianw | nothing completely obvious | 02:36 |
ianw | Feb 6 02:31:24 gitea08 docker-gitea[847]: 2023/02/06 02:31:23 ...ules/context/repo.go:469:RepoAssignment() [E] [63e06679-3] GetUserByName: context canceled | 02:36 |
ianw | seems frequent | 02:37 |
ianw | those messages go back as far as we have logs though | 02:40 |
ianw | there's lots of oom kills | 02:44 |
ianw | i've restarted the container anyway | 02:45 |
fungi | oom kills on the gitea servers are usually a sign that some network behind a common nat is repeatedly cloning large repos like openstack/nova | 03:49 |
fungi | we saw that behavior when people had openstack-ansible deployments acting up and all their servers tried to independently clone all of openstack rather than caching a central copy in their deployment | 03:50 |
fungi | apparently the clone operation results in whole copies of the repository being temporarily stored in memory | 03:51 |
fungi | so it doesn't take many to exhaust one of the backends | 03:51 |
*** yadnesh|away is now known as yadnesh | 04:00 | |
*** bhagyashris_ is now known as bhagyashris | 04:28 | |
*** ysandeep is now known as ysandeep|ruck | 05:18 | |
*** ysandeep|ruck is now known as ysandeep|ruck|afk | 06:05 | |
*** ysandeep|ruck|afk is now known as ysandeep|ruck | 06:47 | |
jrosser | fungi ianw I did a bunch of work to make OSA use an identifiable user agent if you believe that is the cause | 06:58 |
ianw | jrosser: ++ i don't have time to look right now but definitely an angle. | 08:42 |
ianw | i wonder if we could somehow work that into some sort of static report like we do for the other services | 08:42 |
ianw | e.g. https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_757/periodic/opendev.org/opendev/system-config/master/docs-openstack-goaccess-report/757a49e/docs.openstack.org_goaccess_report.html | 08:42 |
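A minimal sketch of the kind of user-agent breakdown such a static report could start from, assuming a combined-format access log at a hypothetical path; the reports linked above are generated with goaccess, so the regex and path here are illustrative assumptions only.

```python
# Hypothetical sketch: tally user agents from a combined-format access log.
# The log path and format are assumptions; the real reports use goaccess.
import re
from collections import Counter

# In combined log format the final two quoted fields are referer and user agent.
UA_RE = re.compile(r'"[^"]*" "(?P<agent>[^"]*)"\s*$')

def top_user_agents(log_path, limit=20):
    counts = Counter()
    with open(log_path, errors="replace") as log:
        for line in log:
            match = UA_RE.search(line)
            if match:
                counts[match.group("agent")] += 1
    return counts.most_common(limit)

if __name__ == "__main__":
    for agent, hits in top_user_agents("/var/log/apache2/gitea-access.log"):
        print(f"{hits:8d}  {agent}")
```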
*** jpena|off is now known as jpena | 08:42 | |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: ensure-skopeo: fixup some typos https://review.opendev.org/c/zuul/zuul-jobs/+/872733 | 08:44 |
*** gibi_pto is now known as gibi | 08:47 | |
opendevreview | Merged zuul/zuul-jobs master: ensure-skopeo: add install from upstream option https://review.opendev.org/c/zuul/zuul-jobs/+/872617 | 08:56 |
opendevreview | Merged zuul/zuul-jobs master: zuul-jobs-test-registry-docker-* : update to jammy nodes https://review.opendev.org/c/zuul/zuul-jobs/+/872365 | 09:56 |
*** ysandeep|ruck is now known as ysandeep|ruck|break | 10:43 | |
opendevreview | Ade Lee proposed zuul/zuul-jobs master: Add ubuntu to enable-fips role https://review.opendev.org/c/zuul/zuul-jobs/+/866881 | 10:55 |
*** ysandeep|ruck|break is now known as ysandeep|ruck | 11:31 | |
*** yadnesh is now known as yadnesh|away | 13:38 | |
*** dasm|off is now known as dasm|rover | 13:52 | |
opendevreview | Scott Little proposed openstack/project-config master: Create a git for the storage of public keys and certificates https://review.opendev.org/c/openstack/project-config/+/872758 | 15:04 |
gthiemonge | FYI I see a lot of issues with ubuntu mirrors in the octavia-grenade job (I don't know why this particular job is so impacted) | 15:16 |
gthiemonge | https://zuul.opendev.org/t/openstack/build/ca0a84af6a704bb6b3b521c846faab19/log/controller/logs/dib-build/amphora-x64-haproxy.qcow2_log.txt#429 | 15:21 |
gthiemonge | I see similar issues in opensearch | 15:21 |
fungi | gthiemonge: any idea why those jobs don't use our package mirrors? | 15:25 |
tweining | at one point I saw in the log that it was trying an ipv6 address, not sure if that has something to do with it or not though. | 15:26 |
fungi | some of our providers have ipv6 access, some do not. if it was trying to reach an ipv6 address from a system which didn't have any v6 routes that could indicate an issue | 15:28 |
fungi | the zuul host info logged with the job should show the routing table the node had when the build started | 15:28 |
fungi | but also some tools "fall back" to trying ipv6 when v4 connections to something time out, and then report misleading errors | 15:29 |
fungi | so the error message ends up implying that a v4-only host tried to reach something over v6, but really the problem is that the v4 connection it correctly attempted first failed for some reason | 15:30 |
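A small diagnostic sketch for separating the two cases fungi describes: connect to each resolved address in its own address family so a v4 failure with a v6 fallback is not misread as an IPv6-only problem. The host name is just an example.

```python
# Hypothetical diagnostic: connect to every resolved address individually so an
# IPv4 failure with an IPv6 fallback is not misreported as an IPv6 problem.
import socket

def check_all_addresses(host, port=80, timeout=5):
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        label = "IPv6" if family == socket.AF_INET6 else "IPv4"
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            print(f"{label} {sockaddr[0]}: reachable")
        except OSError as exc:
            print(f"{label} {sockaddr[0]}: failed ({exc})")

check_all_addresses("archive.ubuntu.com")
```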
opendevreview | Scott Little proposed openstack/project-config master: Create a git for the storage of public keys and certificates https://review.opendev.org/c/openstack/project-config/+/872758 | 15:30 |
fungi | anyway, i'm able to manually download one of the packages that build said it couldn't, so the problem is likely either intermittent or location-dependent | 15:33 |
fungi | the addresses i see for it in dns appear to be hosted directly by canonical, so probably no cdn involved at least | 15:34 |
Clark[m] | fungi: it's likely not using our mirrors because it is within the dib image build's chroot. Dib has support for using our package mirrors though iirc. Maybe that's just part of dib's test suite though | 15:36 |
fungi | yeah, that would make sense | 15:40 |
fungi | also i tested downloading over ipv4 as well as ipv6, fwiw, both worked for me | 15:41 |
fungi | though i didn't try all 3 v4 and 3 v6 addresses in the round-robin | 15:41 |
fungi | could be one of the servers they list is having trouble | 15:42 |
gthiemonge | fungi: Clark[m]: thanks, I'll check how we can use our mirrors | 16:03 |
slittle1_ | Review please... https://review.opendev.org/c/openstack/project-config/+/872758 | 16:28 |
fungi | you bet, i was just about to pull it up, i was delayed by some local software updates which have just completed | 16:30 |
clarkb | hey I'm doing local software updates too | 16:30 |
clarkb | monday morning routine | 16:30 |
fungi | indeed, though i was way behind in recompiling all my python interpreters since the recent tags | 16:31 |
fungi | and then rebuilding all my venvs, including the one for my gertty | 16:32 |
clarkb | they should only need rebuilding when you change major versions? | 16:32 |
fungi | well, any time your interpreter's path changes, i think | 16:32 |
fungi | which in my case is the case even for new patch releases because i use separate directories for them | 16:33 |
clarkb | ah | 16:33 |
fungi | yeah, pyvenv.cfg embeds the real path to the interpreter in its "executable" key | 16:35 |
fungi | so mine just updated from /home/fungi/lib/cpython/3.11.0/bin/python3.11 to /home/fungi/lib/cpython/3.11.1/bin/python3.11 | 16:35 |
fungi | useful when you want to, say, easily compare behaviors between 3.11.0 and 3.11.1 since you can have them installed side by side and create different venvs referencing each | 16:37 |
fungi | i simply update a symlink in ~/bin to point to the new version when i want it to be the default build | 16:38 |
fungi | i've got my venv rebuilds scripted anyway, so it's just a matter of starting the script and waiting for that to complete (or error) | 16:41 |
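A minimal sketch of that kind of rebuild script, assuming a ~/bin/python3 symlink to the current default interpreter and hypothetical venv names; not fungi's actual script.

```python
# Hypothetical venv rebuild script: recreate each venv against the interpreter
# currently selected by the ~/bin symlink. Paths and venv names are assumptions.
import subprocess
from pathlib import Path

INTERPRETER = Path.home() / "bin" / "python3"   # symlink to the default build
VENVS = [Path.home() / ".venvs" / name for name in ("gertty", "git-review")]

for venv in VENVS:
    # --clear wipes the old tree so pyvenv.cfg records the new interpreter path
    subprocess.run([str(INTERPRETER), "-m", "venv", "--clear", str(venv)], check=True)
    subprocess.run([str(venv / "bin" / "pip"), "install", "--upgrade", "pip"], check=True)
```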
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 16:45 |
opendevreview | Merged openstack/project-config master: Create a git for the storage of public keys and certificates https://review.opendev.org/c/openstack/project-config/+/872758 | 17:03 |
*** jpena is now known as jpena|off | 17:36 | |
clarkb | mtreinish: super minor thing I've noticed cleaning up warnings in Zuul. stestr's subunit_runner opens an fd returning a python file object in SubunitTestRunner._list() and ends up returning that back up again to users of the TestRunner so that status results can be recorded. Python complains that this file object is never closed and raises a ResourceWarning | 18:11 |
clarkb | mtreinish: a quick fix wasn't super obvious to me otherwise I'd write a PR because the status object which uses that file object is passed back up and stats things are called against it | 18:12 |
fungi | infra-root: i think our recent changes to jeepyb may have broken manage-projects: https://paste.opendev.org/show/bwSUS8swuboT3Q6OSr6v/ | 18:32 |
fungi | possible chicken-and-egg problem? is it trying to fetch a ref from a project which doesn't exist yet? | 18:33 |
clarkb | fungi: that would be from the change that made the git errors an error rather than log and continue | 18:34 |
clarkb | and ya that hunch sounds correct. | 18:34 |
fungi | yeah, that's the change i was expecting it to be | 18:34 |
clarkb | We should be able to "safely" revert that jeepyb change that raised errors in that situation | 18:34 |
clarkb | (we'll just reintroduce the old behavior, which was problematic but probably less problematic than this) | 18:34 |
clarkb | an alternative would be to treat the fetch of the refs as special. If it fails, it's ok to continue and we'll try to push what we have anyway | 18:34 |
clarkb | that is probably a reasonably correct fix | 18:35 |
clarkb | the issue before was treating pushes as fail acceptable, here it's a fetch | 18:35 |
fungi | though... has that change actually merged yet? | 18:35 |
clarkb | hrm nope https://review.opendev.org/c/opendev/jeepyb/+/869873 | 18:36 |
fungi | right, so this must be something else | 18:36 |
fungi | maybe there was an intermittent connectivity failure | 18:37 |
clarkb | it's all to localhost I think | 18:37 |
clarkb | that would be highly unlikely but possible if the mina sshd ran out of threads maybe | 18:37 |
clarkb | fungi: the other change is the change of the base image | 18:37 |
fungi | gerrit says the project got created | 18:37 |
fungi | so maybe jeepyb raced something trying to access the config ref from it too soon | 18:38 |
fungi | the repo got prepopulated and synced to gitea too | 18:38 |
clarkb | fungi: if you look at fetch_config in manage_projects it has a loop for 20ish seconds waiting for the meta config to be available | 18:39 |
clarkb | perhaps 20 seconds is not long enough? | 18:39 |
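For illustration, a bounded retry of the sort being described (ten passes over roughly 20 seconds); this is a hypothetical sketch, not jeepyb's actual fetch_config.

```python
# Hypothetical illustration of a bounded retry for refs/meta/config; not the
# actual fetch_config code in jeepyb's manage_projects.
import subprocess
import time

def fetch_meta_config(repo_dir, remote_url, attempts=10, delay=2):
    for _ in range(attempts):
        result = subprocess.run(
            ["git", "fetch", remote_url,
             "+refs/meta/config:refs/remotes/gerrit-meta/config"],
            cwd=repo_dir, capture_output=True, text=True)
        if result.returncode == 0:
            return True
        time.sleep(delay)  # 10 attempts * 2s is roughly the 20 second window
    return False
```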
fungi | mmm, the repo actually didn't get prepopulated or synced, it was just created on gitea empty | 19:40 |
fungi | but it exists in gerrit and gitea at least | 18:40 |
clarkb | fungi: if you look in the manage projects log you should see it looping too as it seems to log each pass through that loop | 19:40 |
clarkb | oh but only at debug level? | 18:41 |
clarkb | what if "public-keys" is the problem | 18:42 |
clarkb | and we're tripping over some gerrit user public keys api path | 18:42 |
fungi | if i `git fetch ssh://review.opendev.org:29418/starlingx/public-keys +refs/meta/config:refs/remotes/gerrit-meta/config` i'm told "fatal: couldn't find remote ref refs/meta/config" | 18:43 |
clarkb | fungi: you have to do that as your admin account iirc | 18:43 |
clarkb | possibly in bootstrappers | 18:43 |
fungi | oh, right | 18:43 |
fungi | that worked | 18:43 |
fungi | git fetch ssh://fungi.admin@review.opendev.org:29418/starlingx/public-keys +refs/meta/config:refs/remotes/gerrit/config | 18:43 |
fungi | * [new ref] refs/meta/config -> gerrit/config | 18:44 |
fungi | so it seems to exist now, at least | 18:44 |
fungi | might have just been a race | 18:44 |
clarkb | ya maybe that 20 second time period isn't long enough depending on how busy gerrit is or how busy its disks are? | 18:45 |
johnsom | gthiemonge https://github.com/openstack/octavia-tempest-plugin/blob/master/zuul.d/jobs.yaml#L219 | 18:45 |
fungi | clarkb: should i try manually rerunning manage-projects and see if it succeeds? | 18:45 |
clarkb | fungi: I guess so? maybe with debug enabled so that you can see it loop through things. The only other thought I've got is maybe it has something to do with git in the new image or the git repos in the jeepyb cache on the new image | 18:46 |
clarkb | fungi: but we directly manage the gerrit uid already and that didn't change in the base image swap so that would surprise me I think | 18:46 |
clarkb | and the git versions were basically equivalent | 18:46 |
clarkb | (conversion from our security patched version to debians) | 18:47 |
clarkb | unrelated: Our CI jobs for fungi's gitea change are failing on apparmor for docker 23 now | 18:56 |
fungi | i saw that the build failed, but hadn't found time to see why yet | 18:56 |
fungi | noticed it about the same time as the manage-projects failure | 18:56 |
clarkb | our prod servers already have apparmor installed based on a quick sampling so I think I'll just push a change to add apparmor to our install docker role | 18:57 |
clarkb | fungi: re manage-projects I can't really come up with anything except for git versions/permissions issues due to the base image change, or simply a timeout with our loop not being long enough | 18:58 |
clarkb | fungi: I double checked group membership and project creator appears to have the correct perms | 18:58 |
clarkb | in gerrit I mean. | 18:58 |
opendevreview | Clark Boylan proposed opendev/system-config master: Install apparmor when we install docker-ce from upstream https://review.opendev.org/c/opendev/system-config/+/872801 | 19:00 |
opendevreview | Clark Boylan proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 19:01 |
clarkb | fungi: ^ rebased as thats a good check it fixes the issue | 19:01 |
fungi | sgtm | 19:03 |
clarkb | fungi: another variable that may have impacted refs/meta/config is if it overlapped with backups and that was eating up iops | 19:04 |
fungi | ooh | 19:04 |
clarkb | so ya I'm thinking the best next step is to rerun with debug on against that project specifically and see if its happy now. If so our 20 second retry loop may simply be too short | 19:04 |
clarkb | I guess the jdk changed too and maybe it's slower at doing that bootstrapping? | 19:07 |
clarkb | fungi: I'm going to go back to zuul warning cleanup while I've got it paged in but ping me if I can help further | 19:16 |
fungi | i'm trying to reverse-engineer the manage-projects playbook since just running it directly seems to have failed (probably in the same spot but it doesn't log to a file, just to stdout) | 19:31 |
fungi | what does this tasks_from do? https://opendev.org/opendev/system-config/src/branch/master/playbooks/manage-projects.yaml#L35 | 19:32 |
clarkb | fungi: it runs the tasks from the manage-projects file in the gerrit role | 19:32 |
clarkb | fungi: https://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/gerrit/tasks/manage-projects.yaml | 19:33 |
fungi | yep, thanks found it | 19:33 |
fungi | so i guess i can just run manage-projects on the gerrit server | 19:34 |
fungi | which seems to be a docker run wrapper | 19:34 |
clarkb | yes because we run jeepyb on the image with all the various dirs bind mounted in | 19:35 |
clarkb | doing that by hand would be annoying so we have the wrapper | 19:35 |
fungi | running with -v, i don't see any debug log entries | 19:38 |
fungi | 2023-02-06 19:37:40,534: manage_projects - ERROR - Failed to fetch refs/meta/config for project: starlingx/public-keys | 19:39 |
fungi | so whatever it's trying is still not working | 19:39 |
clarkb | that method is the only place we have log.debug() calls. I wonder if we didn't add an ability to actually record those | 19:39 |
fungi | i find it extra interesting that i can fetch that with a gerrit admin account | 19:40 |
fungi | it does seem to take at least 20 seconds before i get any output, which would suggest the retry loop is actually happening at least | 19:43 |
clarkb | fungi: I think it isn't creating the blank repo to fetch the config into | 19:44 |
clarkb | fungi: the jeepyb cache dir is at /opt/lib/jeepyb:/opt/lib/jeepyb so the same dir path in both host and container. /opt/lib/jeepyb/starlingx does not have any entries, but the public-keys dir should be there to fetch the config into | 19:45 |
fungi | /opt/lib/jeepyb/project.cache has an entry for it with project-created and pushed-to-gerrit both true but no acl-sha, which seems to match what we're observing at least | 19:47 |
clarkb | what I'm confused about is jeepyb's make_local_copy should error if it isn't able to git init I think | 19:49 |
clarkb | oh, except we don't raise there, so we could just be running several git commands that all just fail | 19:50 |
fungi | yeah, it looks like run_command would log.debug the output from those | 19:51 |
fungi | but maybe that doesn't go to stdout/stderr on a normal invocation | 19:51 |
fungi | manage-projects has a -l option to specify a log path | 19:51 |
fungi | we map /var/log into the container too but doesn't look like anything is writing a jeepyb or manage-projects log by default | 19:52 |
clarkb | ya I think because we use default logging which is stdout | 19:53 |
clarkb | oh wait we remove the dir in the cache | 19:54 |
clarkb | ok that explains some very confusing behavior | 19:54 |
clarkb | and the timestamps for that dir do show it was updated roughly when you ran it by hand ok thats making a bit more sense now | 19:55 |
fungi | i'm trying it's -l option | 19:57 |
fungi | which doesn't appear to do anything | 19:57 |
clarkb | fungi: I think the flag for debug is -d | 19:58 |
fungi | aha, the -- i was including was to blame | 19:58 |
clarkb | see setup_logging_arguments | 19:58 |
fungi | --help says it's -v | 19:58 |
clarkb | -v is verbose -d is debug | 19:58 |
fungi | oh! yes okay i see it now | 19:59 |
clarkb | and in this case we've set verbose at INFO and higher and debug at DEBUG and higher | 19:59 |
clarkb | however, that will just log the mostly useless message 10 times over 20 seconds since we already know it is taking roughly that long | 19:59 |
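Roughly how that -v/-d split maps to logging levels, as a hedged sketch; the option names follow the discussion, but this is not jeepyb's setup_logging_arguments verbatim.

```python
# Sketch of -v (INFO) vs -d (DEBUG) option handling; not jeepyb's actual code.
import argparse
import logging

def setup_logging(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("-v", "--verbose", action="store_true", help="log INFO and higher")
    parser.add_argument("-d", "--debug", action="store_true", help="log DEBUG and higher")
    parser.add_argument("-l", "--log-file", default=None, help="write the log to this path")
    args = parser.parse_args(argv)
    level = logging.DEBUG if args.debug else logging.INFO if args.verbose else logging.WARNING
    logging.basicConfig(filename=args.log_file, level=level)
    return args
```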
fungi | okay, i have a more useful log file in /var/log now | 20:00 |
fungi | here we go... | 20:00 |
clarkb | fatal: ssh variant 'simple' does not support setting port | 20:01 |
fungi | jeepyb.utils - DEBUG - Command said: fatal: not a git repository: '/opt/lib/jeepyb/starlingx/public-keys/.git' | 20:01 |
clarkb | yup, and if you scroll up a bit, "ssh variant 'simple' does not support setting port" seems to be why ^ that isn't a repository | 20:01 |
fungi | ahh, yeah that's even earlier | 20:01 |
fungi | looks like GIT_SSH_VARIANT=ssh is a workaround or `git config --global ssh.variant ssh` | 20:02 |
clarkb | ya and this is likely a side effect of our image change then I guess | 20:02 |
fungi | maybe different ssh client? | 20:03 |
clarkb | maybe? | 20:03 |
clarkb | fungi: we also set GIT_SSH to a wrapper script in order to set ssh flags for the key path and the username etc | 20:04 |
fungi | or it could be that the git command there has a built-in ssh client implementation now | 20:06 |
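A sketch of the workaround mentioned above, exporting GIT_SSH_VARIANT=ssh alongside the GIT_SSH wrapper so git knows the wrapper accepts a port flag; the helper function and its arguments are hypothetical.

```python
# Sketch of the GIT_SSH_VARIANT=ssh workaround when git is driven through a
# GIT_SSH wrapper script; a hypothetical helper, not jeepyb's run_command.
import os
import subprocess

def git_fetch_via_wrapper(repo_dir, remote, refspec, ssh_wrapper):
    env = dict(os.environ)
    env["GIT_SSH"] = ssh_wrapper       # wrapper setting key path, username, etc.
    env["GIT_SSH_VARIANT"] = "ssh"     # tell git the wrapper supports setting a port
    return subprocess.run(["git", "fetch", remote, refspec],
                          cwd=repo_dir, env=env, check=True)
```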
opendevreview | Clark Boylan proposed opendev/system-config master: Install openssh-client in our Gerrit docker image https://review.opendev.org/c/opendev/system-config/+/872802 | 20:20 |
clarkb | ok I think ^ will address it. Just a missing dep in the base image swap | 20:21 |
clarkb | note this is based on the apparmor change so that it can gate | 20:21 |
clarkb | the apparmor change should be considered carefully however, as I mentioned I think it's a noop for our prod hosts | 20:21 |
fungi | system-config-run-gitea is still underway for 869091,10 as our confirmation on that one | 20:24 |
fungi | hopefully we'll see that green shortly | 20:24 |
clarkb | with that largely sorted out (for now anyway) I'm going to eat lunch | 20:28 |
mtreinish | clarkb: yeah, it's been on my backlog to try and figure out how to handle that. There was an issue opened a while ago about all the resource warnings that get raised: https://github.com/mtreinish/stestr/issues/320 and masayukig fixed some of them but there are definitely still more | 20:49 |
mtreinish | some of them will definitely be tricky to fix, because it's all in weird inherited usage from subunit and unittest (mostly because I have to remind myself how that all works) | 20:50 |
clarkb | mtreinish: I can commiserate with that. See also the jeepyb debugging above :) | 20:53 |
ianw | my docker 23 issue was ultimately that i had an old devicemapper based container and the docker daemon wouldn't start | 21:13 |
ianw | it might have been able to with various flags, but it was easier to just start again | 21:13 |
ianw | i think this was from when linode (my host) was a Xen-based vm. at some point they migrated everything to kvm, but iirc at the time something about being xen made it use devicemapper | 21:14 |
clarkb | some linux archeaolgy | 21:15 |
ianw | we're testing with docker 23 now, but i don't think it will get pulled in anywhere in prod unless we explicitly update | 21:15 |
clarkb | (also I can't type) | 21:15 |
clarkb | ianw: correct because updating docker implies restarting containers and we try to control that | 21:15 |
ianw | i wonder if it's worth just making a list and doing it manually, starting with lower-impact hosts? | 21:16 |
clarkb | fungi: heh the latest donor change made the header and text align properly but now the donor logos are stacked on top of each other. I think I prefer this even if it is more scrolling though | 21:16 |
clarkb | ianw: not a bad idea | 21:16 |
clarkb | fungi: but I'm terrible at css and layout... | 21:17 |
ianw | i can start an etherpad and do that. it's probably not a bad idea to do a reboot anyway on some of these hosts | 21:17 |
clarkb | ++ | 21:17 |
ianw | tracking at https://etherpad.opendev.org/p/docker-23-prod | 21:21 |
clarkb | ianw: in the past what I've tried to do is stop service containers, upgrade docker, optionally reboot, start service containers again. I think the packaging will attempt to restart containers for you but I like doing it myself for most things | 21:23 |
ianw | ++ | 21:23 |
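The per-host order described above, as a hedged sketch; the compose file path and package list are assumptions, not how the upgrades were actually run.

```python
# Hypothetical per-host sequence: stop service containers, upgrade docker,
# optionally reboot, then start the containers again. Paths are assumptions.
import subprocess

def upgrade_docker(compose_file="/etc/service-compose/docker-compose.yaml"):
    subprocess.run(["docker-compose", "-f", compose_file, "down"], check=True)
    subprocess.run(["apt-get", "update"], check=True)
    subprocess.run(["apt-get", "-y", "install",
                    "docker-ce", "docker-ce-cli", "containerd.io"], check=True)
    # a reboot could go here before bringing services back up
    subprocess.run(["docker-compose", "-f", compose_file, "up", "-d"], check=True)
```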
mtreinish | clarkb: tbh, looking at the code in detail now I think I can just drop the fdopen call. I don't think it's really relevant. IIRC, I just ported that from subunit and/or unittest when I rewrote the runner to be based on unittest's run instead of testtools, but the stestr context is more limited and we're almost always just passing stdout as the result stream and won't ever need to open a new descriptor | 21:28 |
mtreinish | in that code | 21:28 |
mtreinish | I'm just going to simplify that logic (famous last words) | 21:28 |
opendevreview | Merged zuul/zuul-jobs master: ansible-lint: fix a bunch of command-instead-of-shell errors https://review.opendev.org/c/zuul/zuul-jobs/+/872490 | 21:36 |
opendevreview | Merged zuul/zuul-jobs master: ansible-lint: add names to blocks/includes, etc. https://review.opendev.org/c/zuul/zuul-jobs/+/872491 | 21:36 |
opendevreview | Merged zuul/zuul-jobs master: ansible-lint: ignore use of mkdir https://review.opendev.org/c/zuul/zuul-jobs/+/872492 | 21:36 |
mtreinish | clarkb: https://github.com/mtreinish/stestr/pull/342 it passed tests locally | 21:39 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Feature our cloud donors on opendev.org https://review.opendev.org/c/opendev/system-config/+/869091 | 21:48 |
clarkb | mtreinish: thanks! I was mostly motivated by the sqlalchemy 2.0 update and needing to filter out all the noise warnings from the useful warnings. | 21:48 |
fungi | clarkb: ^ looking at the other logos at the top of the page, i think i just incorrectly nested them | 21:48 |
*** dmitriis9 is now known as dmitriis | 21:51 | |
*** Tengu8 is now known as Tengu | 21:51 | |
*** mtreinish_ is now known as mtreinish | 21:51 | |
*** dtantsur_ is now known as dtantsur | 21:51 | |
*** noonedeadpunk_ is now known as noonedeadpunk | 21:51 | |
opendevreview | Merged opendev/system-config master: Install apparmor when we install docker-ce from upstream https://review.opendev.org/c/opendev/system-config/+/872801 | 22:29 |
opendevreview | Merged zuul/zuul-jobs master: ansible-lint: use pipefail https://review.opendev.org/c/zuul/zuul-jobs/+/872493 | 22:35 |
opendevreview | Merged zuul/zuul-jobs master: ansible-lint: ignore latest git pull https://review.opendev.org/c/zuul/zuul-jobs/+/872494 | 22:35 |
opendevreview | Ian Wienand proposed zuul/zuul-jobs master: build-docker-image: further cleanup buildx path https://review.opendev.org/c/zuul/zuul-jobs/+/872806 | 22:58 |
ianw | To ssh://review.opendev.org:29418/opendev/system-config.git | 23:16 |
ianw | ! [remote rejected] HEAD -> refs/for/master%topic=docker-apt-key (n/a (unpacker error)) | 23:16 |
ianw | is this my fault or gerrit's fault?? | 23:17 |
ianw | Caused by: java.io.IOException: Unpack error on project "opendev/system-config": | 23:18 |
ianw | in gerrit logs | 23:18 |
opendevreview | Ian Wienand proposed opendev/system-config master: install-docker: switch from deprecated apt-key https://review.opendev.org/c/opendev/system-config/+/872808 | 23:20 |
opendevreview | Ian Wienand proposed opendev/system-config master: install-docker: remove apt-key cleanup https://review.opendev.org/c/opendev/system-config/+/872809 | 23:20 |
ianw | $ zgrep 'Unpack error, check server log' * | wc -l | 23:32 |
ianw | 28 | 23:32 |
ianw | so it's not unique, but also not that frequent. maybe it was my client dropping packets or something | 23:32 |
JayF | is something upside down? | 23:37 |
JayF | ianw: I'm seeing exactly that | 23:37 |
JayF | fetch-pack: unexpected disconnect while reading sideband packet | 23:37 |
JayF | more like these, it errors in different places depending on when it times out | 23:38 |
JayF | looks like generically slow-remote-server stuff? but I know little about what goes on behind the covers here | 23:38 |
ianw | JayF: what was the operation you were doing? | 23:38 |
JayF | Trying to push a fresh patch. It died in the git remote update gerrit step | 23:39 |
JayF | and I can make that fail outside of `git review` | 23:39 |
ianw | JayF: hrm, i'm not seeing anything lining up in the gerrit logs, can you paste more context where it popped up? | 23:41 |
JayF | let me get a fresh reproduction then I'll paste it | 23:41 |
JayF | ianw: https://gist.github.com/jayofdoom/bbd2e080f66183192d5546f9a4591b9f | 23:42 |
JayF | web UI works as I'd expect, if a bit slow, so I think it's not connectivity | 23:42 |
ianw | ahh, ok, i see in logs now | 23:43 |
ianw | SshChannelNotFoundException: Received SSH_MSG_CHANNEL_WINDOW_ADJUST on unassigned channel 0 (last assigned=null) | 23:43 |
ianw | always great to see a new weird ssh error, it's been too long since the last one :) | 23:44 |
JayF | I'm running 9.1_p1-r3 | 23:44 |
JayF | on gentoo | 23:44 |
ianw | the last one almost turned clarkb into a java developer | 23:44 |
JayF | if it's possible the error is caused by shiny new openssh, it's likely I'm running the shiny new lol | 23:44 |
JayF | although there is a 9.2 in the repo too... | 23:44 |
opendevreview | Merged opendev/system-config master: Install openssh-client in our Gerrit docker image https://review.opendev.org/c/opendev/system-config/+/872802 | 23:46 |
ianw | there's references to this in a few places | 23:47 |
JayF | that's an ominous merge in time with this bug LOL | 23:48 |
ianw | https://bugs.chromium.org/p/gerrit/issues/detail?id=11491; an old wikimedia commit seems to have enabled the workaround -> https://gerrit.wikimedia.org/r/c/operations/puppet/+/755968/ | 23:48 |
JayF | looks like from what I'm seeing, most reports are when networking is slow or high latency | 23:49 |
JayF | makes me wonder if it's possible there's a network issue underlying this failure mode | 23:49 |
ianw | you're not the only user to have this error in the logs | 23:49 |
JayF | I'm talking more generally than just me; mainly based off a feeling (not quantitative data) that the Web UI is exhibiting some slowness too | 23:50 |
ianw | JayF: hrm, did you just upgrade or something? | 23:54 |
JayF | I don't think so; but I run updates on this thing very frequently | 23:55 |
ianw | there's a few users seeing this in a bit of a regular pattern | 23:55 |
ianw | 133 exceptionCaught(ServerSessionImpl[proliantci@ | 23:55 |
ianw | e.g. seems proliantci is experiencing it | 23:55 |
JayF | those are ironic third party CI :( | 23:55 |
ianw | 33 exceptionCaught(ServerSessionImpl[cisco-cinder-ci | 23:55 |
ianw | 19 exceptionCaught(ServerSessionImpl[hp-storage-blr-ci | 23:56 |
JayF | FWIW, looks like I've been running the same openssh client version for a couple weeks minimum | 23:56 |
ianw | that's the bot accounts, but there's user accounts too | 23:56 |
JayF | honestly, and I'm far from an expert in java ops (and even if I was, that info would be dusty) | 23:56 |
JayF | but this is the sort of thing I'd reboot first and ask questions later LOL | 23:56 |
ianw | i suppose it's possible all three of those are on the same distro with the same openssh | 23:57 |
JayF | honestly, I'd be amazed if anyone has modified config on HP third party CI in months | 23:57 |
ianw | JayF: so are you basically blocked from pushing changes with this atm? | 23:58 |
JayF | yes, but my day ends in 2 minutes | 23:58 |
JayF | so i'm happy to just close the laptop and re-run `git review` tomorrow lol | 23:59 |
JayF | but I can hang out and help w/testing if that's useful | 23:59 |
* JayF tries again for good measure | 23:59 |