Monday, 2022-10-03

opendevreview	Ian Wienand proposed opendev/system-config master: [dnm] attempting to trigger zuul syntax error https://review.opendev.org/c/opendev/system-config/+/860061	02:24
*** soniya29 is now known as soniya29|ruck		05:05
*** soniya29|ruck is now known as soniya29|ruck|afk		06:54
*** jpena|off is now known as jpena		07:17
*** soniya29|ruck|afk is now known as soniya29|ruck		07:37
*** pojadhav is now known as pojadhav|sick		07:55
*** marios is now known as marios|call		08:47
*** soniya29|ruck is now known as soniya29|ruck|lunch		09:00
*** marios|call is now known as marios		09:04
*** soniya29|ruck|lunch is now known as soniya29|ruck		10:26
*** rlandy|out is now known as rlandy		10:35
*** lbragstad4 is now known as lbragstad		11:05
*** dviroel_ is now known as dviroel		11:40
*** dasm|off is now known as dasm		12:59
*** rcastillo is now known as rcastillo|ruck		13:29
opendevreview	Neil Hanlon proposed openstack/project-config master: Add rockylinux 9 to OSA grafana https://review.opendev.org/c/openstack/project-config/+/860094	14:19
clarkb	infra-root I'm going to try and dig into the jammy launch node issues today. corvus iirc the issue was deleting the ubuntu user which we were currently ssh'd in as?	15:10
clarkb	hrm it looks like one of the first things that launch node does is switch to root if it isn't already root	15:11
*** dviroel is now known as dviroel|lunch		15:11
clarkb	ah ok it was a specific pid 1559 which may have been running independently of the ssh connection	15:12
clarkb	fungi: also before I dive too deeply into ^ I should probably go and check that I can trace a connection from the gitea lb to apache to gitea itself	15:19
fungi	makes sense	15:21
clarkb	ok breakfast first, then gitea, then jammy launches	15:21
mtreinish	random ssh key question. I'm trying to push a patch to gerrit and my public key (which I had been using on gerrit since ~2013) is being rejected. Looking at the verbose output it seems to be caused by "no mutual signature algorithm"	15:29
mtreinish	was there a change in the allowed ssh key algorithms that I missed?	15:30
clarkb	mtreinish: yes, but on the client side. Chances are you are running newish openssh which dropped support for rsa + sha1. But when they did that they didn't update the default to rsa + sha2 (a bug imo). Gerrit can do rsa+sha2 but doesn't support the key exchange extension to negotiate that so it fails	15:35
clarkb	mtreinish: I actually fixed that in newer gerrit but the backport to 3.5 is stalled because my account in upstream gerrit got deleted or something and other people won't manually cherry pick the change for me	15:36
clarkb	and google hasn't said what happened to my account yet	15:36
clarkb	mtreinish: there are two workarounds. One is to use a key that isn't an rsa key. The other is to specifically allow rsa + sha1 to review.opendev.org	15:36
mtreinish	heh, I guess it's the curse of archlinux again :)	15:37
clarkb	mtreinish: https://www.openssh.com/txt/release-8.8 the section on backward incompatible changes covers this as well as the rsa + sha1 workaround	15:37
mtreinish	I also can try pushing from a system that I haven't updated in a while I guess	15:37
mtreinish	thanks I'll give those a try	15:37
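As a rough sketch of the second workaround clarkb mentions, the snippet below writes a per-host stanza modelled on the one in the OpenSSH 8.8 release notes. The output path is a placeholder chosen for illustration (the stanza would normally live in ~/.ssh/config), and whether both options are actually needed for Gerrit is an assumption.

```shell
# Sketch only: writes to a scratch file rather than ~/.ssh/config.
# The path below is a hypothetical placeholder.
config_file="./ssh_config.sketch"

# Re-enable RSA/SHA-1 signatures for this one host only.
cat > "$config_file" <<'EOF'
Host review.opendev.org
    HostkeyAlgorithms +ssh-rsa
    PubkeyAcceptedAlgorithms +ssh-rsa
EOF
```

Scoping the options to a single Host block keeps the weaker algorithm disabled everywhere else. The other workaround from the discussion, using a non-RSA key, avoids the need for any of this.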
clarkb	originally we weren't going to backport the fix to gerrit 3.5 because that would require updating mina on 3.5 which required updating jgit. But then they did that last week for other reasons so now the key negotiation fix is valid	15:38
clarkb	I also looked at the openssh code to try and figure out how to update the fallback when the negotiation fails to sha2 since sha1 is basically never going to work. And I got lost in all the indirection they do to implement defaults	15:40
clarkb	I could probably figure it out if I took the time to do a debug build and attach gdb	15:40
clarkb	but meh	15:40
mtreinish	heh, yeah that's probably too much I would have given up long before that	15:41
priteau	Hello. Just got a POST_FAILURE due to a host key verification failure.	15:44
priteau	https://zuul.opendev.org/t/openstack/build/80949a6467644308837009c3c39a6ecd	15:44
clarkb	priteau: we believe those occur because openstack is reusing IP addresses in the cloud(s) that we boot test nodes in.	15:44
priteau	Would it make sense to clear known hosts somewhere in Zuul?	15:45
clarkb	priteau: unfortunately there isn't much we can do about that from our end other than try and encourage openstack to stop doing that. But since we don't run the clouds we don't have insight into when/why it happens (though aiui cells are suspected)	15:45
clarkb	priteau: no that won't help we'd just fail to ssh when the IP is attached to a host we don't control	15:45
clarkb	priteau: basically two (or more) hosts end up with the same IP then fight over populating ARP tables	15:46
priteau	Oh, that's bad	15:46
clarkb	whichever is currently in the ARP tables wins and gets the connections. If that isn't our host then you get the failure you see. It is entirely a bug in openstack	15:46
priteau	I thought you meant reusing as in reusing later. Like Neutron does everywhere.	15:46
clarkb	no that's fine	15:46
priteau	A genuine bug in openstack? Or something broken in one of the clouds opendev uses?	15:47
clarkb	priteau: I mean the fact that it is possible is a bug in openstack to me.	15:47
clarkb	it should never be possible for neutron/nova/whatever to give two different hosts the same IP at the same time	15:48
clarkb	even if the issue is in a third party driver nova/neutron/whatever should say "no"	15:48
clarkb	and fail to boot the second instance instead	15:48
clarkb	fungi: I opened a connection to https://opendev.org/opendev/git-review from my desk. Then traced that to the backend. One thing I notice is that apache -> gitea is using a single connection for many requests to the frontend which means this isn't perfect but it is an improvement on what we had before	15:49
clarkb	priteau: fwiw it is also possible for jobs to reset their ssh host keys which would also break this, but this is the standard openstack-tox-docs jobs which shouldn't do that unless the tox run is doing something very weird	15:51
*** marios is now known as marios|out		15:51
fungi	priteau: apparently cells v1 was really bad about losing track of virtual machines, but i'm not sure all the occurrences are attributable to that	15:58
fungi	but in essence yes, what happens is that some old vm which nova no longer knows about is running on one of the hypervisors, but it/neutron think the ip address is available again so they assign it to a test node we boot, and then we intermittently end up trying to connect to the old stale guest rather than our test node	15:59
fungi	the cloud providers where this is relatively common seem to run automated "cleanup" tasks to find those rogue vms and clear them out periodically	16:00
priteau	Indeed, I could see this happening	16:01
priteau	Rogue VMs	16:01
corvus	clarkb: yes, i think it's likely that userdel is just more careful now than in older versions, and the process (whatever it is) probably is running in older versions too. i suspect the right answer may be to find a way to ignore the error and proceed	16:02
mtreinish	clarkb: thanks I just created a second key with ecdsa and was able to push my patch: https://review.opendev.org/c/openstack/stevedore/+/860109	16:02
fungi	mtreinish: yep, that's probably the safest workaround	16:02
mtreinish	(the config option didn't work for me for whatever reason, I think I remember reading something in an arch package upgrade guide about the rsa keys, so it might be something on the package side)	16:02
clarkb	corvus: ya userdel has a --force option which will get around that but it has a bunch of other new behavior it brings in too that we may not want	16:03
clarkb	corvus just dropped but "This option forces the removal of the user account, even if the user is still logged in. It also forces userdel to remove the user's home directory and mail spool, even if another user uses the same home directory or if the mail spool is not owned by the specified user."	16:05
clarkb	the idea I wanted to look into is rebooting before ssh'ing back in as root which should ensure that any remaining ubuntu owned processes are gone	16:09
clarkb	but we can add the force option instead if others aren't worried about that (I think it may cause some stuff to get deleted for other old disabled users). Hrm maybe we move the regular disabled users and system image user disablement into different tasks and one can use force and the other won't	16:10
clarkb	I'll go ahead and write that change because I think it will be the least impactful and easiest to understand	16:11
opendevreview	Clark Boylan proposed opendev/system-config master: Disable distro cloud image users more forcefully https://review.opendev.org/c/opendev/system-config/+/860112	16:24
clarkb	something like that maybe	16:24
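The split clarkb describes (force removal for the distro image user, plain removal for everyone else) could look roughly like the sketch below. This is not the content of the proposed change; the helper name and the user names are made up for illustration, and the commands are only printed unless APPLY=1 is set, since userdel needs root.

```shell
# Sketch only, not the content of the proposed change. The helper and the
# user names are hypothetical. Commands are printed unless APPLY=1 is set.
remove_user() {
    # $1 = user name, $2 = "force" to add --force
    if [ "${2:-}" = "force" ]; then
        set -- userdel --force "$1"
    else
        set -- userdel "$1"
    fi
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Distro cloud image user: may still own running processes, so force it.
remove_user ubuntu force
# Ordinary disabled account: no --force, so its extra side effects are avoided.
remove_user olduser
```

Keeping the two cases in separate calls means the more aggressive behaviour of --force quoted above only ever applies to the image's default user.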
clarkb	fungi: https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/FVZW5DQJ7C3TW4LPIIU7ARI7XMVJYYWX/ that's the followup to my email about mailman3 docker images. TLDR it sounds like maxking is largely doing it alone and the others involved in the project aren't involved with the docker stuff :/	16:30
clarkb	fungi: I'm beginning to wonder if we shouldn't fork the images	16:30
*** jpena is now known as jpena|off		16:30
fungi	got it, so more of an example/starting point	16:31
clarkb	well I think the intention is that they be production ready	16:32
clarkb	but I'm not sure they receive the attention needed currently to manage that. I'd be happy to help maxking upstream, but I'm not sure how to reach out other than what I've already done (issues/PRs/mailing list)	16:33
fungi	yeah. also we could always un-fork later if maintenance picks back up on it	16:33
fungi	hopefully there's not a ton of churn for the projects being bundled into those images these days, since mm3 has had many years to stabilize now	16:34
clarkb	I'll have to think on this a bit more now that I'm leaning towards a local fork or modification. In particular we should decide if we want to build them up ourselves using a complete fork of the upstream docker files or if we just want to fetch the upstream images and modify them to our needs	16:36
fungi	sure, we do both in different places, depending on the situation	16:37
clarkb	I think the major upside to forking properly is we can set the uids and gids without needing to do a global chown across the image. However, if we do that we're more than likely going to never reconverge with upstream	16:38
*** dviroel|lunch is now known as dviroel		16:38
clarkb	if we just want to install lynx then doing that in a new layer is probably simplest and most likely to allow us to unfork later	16:38
clarkb	so that might be the best place to start as it keeps the delta small and options open	16:38
fungi	though it leaves us with concerns over the uid/gid conflicts	16:40
clarkb	right	16:42
fungi	looks like the container author was last seen responding to this thread: https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/NNYOXOE33DJEFWQ5WUBMJBB35IRAACQK/#S3KFJQD23IMACB7CR6K4ZWUQITREG6ID	16:42
fungi	that was roughly a month ago	16:42
clarkb	fyi jitsi meet updated about 4 days ago. I've gone ahead and subscribed to release notifications for https://github.com/jitsi/docker-jitsi-meet so that I'll get alerted when those updates happen	17:53
clarkb	But we may want to retest meetpad soon just to be sure the update is working for us	17:53
fungi	clarkb: spotz and i used it a few hours ago for the diversity wg meeting, seemed fine still	18:37
*** dviroel is now known as dviroel|afk		19:37
clarkb	fungi: ah cool	19:44
clarkb	fungi: over in gerrit land they are trying to run debian buster jobs and use openjdk-8. It seems that openjdk-8 is not in buster proper but is in sid. Do you know anywhere in our jobs where we might add debian unstable as an example I can show them?	19:59
clarkb	I showed them what zuul did to install libc previously	20:08
clarkb	I think that will work. Pin the default release to stable then install openjdk-8 which only exists in unstable	20:08
fungi	technically you don't need to pin testing or unstable if you have stable sources, since the repositories themselves set relative priorities, so you'll only ever wind up getting packages from sid if they don't exist in buster or you explicitly request them by version or suite name	20:56
clarkb	oh cool. I'd push a patch to them but I can't do that anymore because something broke my account. But I gave them lots of examples and hopefully they can address it themselves with that info	20:56
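A minimal sketch of the approach clarkb outlines (default release set to the stable suite, sid added as an extra source) is shown below. It writes into a scratch directory rather than /etc/apt; the file names and the package name openjdk-8-jdk are assumptions for illustration, and per fungi's comment the default-release setting may not be strictly required.

```shell
# Sketch only: writes into a scratch directory instead of /etc/apt.
# File names and the package name are assumptions for illustration.
apt_root="./apt.sketch"
mkdir -p "$apt_root/apt.conf.d" "$apt_root/sources.list.d"

# Prefer packages from buster whenever buster has them.
cat > "$apt_root/apt.conf.d/99default-release" <<'EOF'
APT::Default-Release "buster";
EOF

# Make sid available as an additional source.
cat > "$apt_root/sources.list.d/sid.list" <<'EOF'
deb http://deb.debian.org/debian sid main
EOF

# With the files installed under /etc/apt, the install step would then be:
echo "apt-get update && apt-get install -y openjdk-8-jdk"
```

Because the package only exists in sid, apt has nowhere else to take it from, while everything that buster does provide keeps coming from buster.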
ianw	clarkb: thanks for looking at the jammy launcher, i was wondering if that would work.	21:08
ianw	related to that; https://review.opendev.org/q/topic:bridge-ansible-venv is ready for review	21:09
*** dasm is now known as dasm|off		21:16
clarkb	ah cool I'll have to take a look at those now	21:21
clarkb	I think I reviewed some of them previously but that was very early on.	21:21
clarkb	fungi: I've nearly got a mm3 docker image fork change ready to push on top of the existing change	21:22
clarkb	however, I'm realizing that we may run into the problem with the locale stuff	21:22
clarkb	but we'll figure that out if it becomes a problem I guess	21:22
opendevreview	Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157	21:36
clarkb	I think the django msgfmt issue with the mailman3 images is addressed by https://github.com/django-extensions/django-extensions/pull/1740 which seems to be in the most recent release of that tool and is newer than the failures I saw upstream	21:37
opendevreview	Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157	21:41
clarkb	once ^ seems to be working we can layer in our additions as heavily as we like	21:44
clarkb	right now all I'm doing different than upstream is adding lynx	21:44
clarkb	ianw: re the ansible in venv. The plan is to switch the existing server over to the venv first right? So we've got to be careful about not updating ansible to start?	21:46
fungi	clarkb: awesome (wrt mm3 image fork), i was planning to do at least one more import on a fresh held node	21:51
fungi	and yes, i saw the thread on their ml about that pr which maxking suggested but then hasn't had time to review since	21:51
fungi	spotted it when i was browsing their archives earlier today	21:52
clarkb	https://github.com/maxking/docker-mailman/pull/555 is the related PR fwiw	21:54
clarkb	heh tox-linters explodes on the vendored scripts	21:55
opendevreview	Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157	22:03
clarkb	we apparently need buildkit to build these images	22:03
ianw	clarkb: yeah, that entire stack should be safe to apply to the existing host. but i do want to cut over to the new server sooner rather than later.	22:08
*** rlandy is now known as rlandy|bbl		22:11
clarkb	ianw: suggestion on https://review.opendev.org/c/opendev/system-config/+/857799/ which affects its child	22:17
ianw	thanks; yep we can add a bionic job, just to be sure	22:18
clarkb	ianw: do you know if the updated selenium works against older selenium that will install on bionic? I guess we'll find out :)	22:21
ianw	no the python would be too old for that. but in just a basic bridge deployment test i don't think we'd be trying to run selenium	22:21
clarkb	oh right	22:22
clarkb	it's only gitea, paste, codesearch etc that do the screenshots	22:22
clarkb	woot my images built this time in 860157	22:25
clarkb	ianw: also did you see https://review.opendev.org/c/zuul/zuul/+/855309/ merged?	22:29
clarkb	https://review.opendev.org/c/opendev/system-config/+/855472 which means that change is ready to land when we are I think. I've got it on the meeting agenda too	22:29
ianw	yeah, thanks. maybe merge early tomorrow (for me) and can monitor	22:30
clarkb	infra-root I finally updated the meeting agenda for tomorrow. Is anything important missing from that?	22:34
ianw	lgtm, thanks	22:35
opendevreview	Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157	23:05
clarkb	now with less bashate hate	23:05
clarkb	and agenda sent	23:09
opendevreview	Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157	23:14
clarkb	it helps to properly test that first (I did actually test it but because find prints a ton of lines I missed that it was still printing the lines I didn't want)	23:15
