opendevreview | Ian Wienand proposed opendev/system-config master: [dnm] attempting to trigger zuul syntax error https://review.opendev.org/c/opendev/system-config/+/860061 | 02:24 |
*** soniya29 is now known as soniya29|ruck | 05:05 | |
*** soniya29|ruck is now known as soniya29|ruck|afk | 06:54 | |
*** jpena|off is now known as jpena | 07:17 | |
*** soniya29|ruck|afk is now known as soniya29|ruck | 07:37 | |
*** pojadhav is now known as pojadhav|sick | 07:55 | |
*** marios is now known as marios|call | 08:47 | |
*** soniya29|ruck is now known as soniya29|ruck|lunch | 09:00 | |
*** marios|call is now known as marios | 09:04 | |
*** soniya29|ruck|lunch is now known as soniya29|ruck | 10:26 | |
*** rlandy|out is now known as rlandy | 10:35 | |
*** lbragstad4 is now known as lbragstad | 11:05 | |
*** dviroel_ is now known as dviroel | 11:40 | |
*** dasm|off is now known as dasm | 12:59 | |
*** rcastillo is now known as rcastillo|ruck | 13:29 | |
opendevreview | Neil Hanlon proposed openstack/project-config master: Add rockylinux 9 to OSA grafana https://review.opendev.org/c/openstack/project-config/+/860094 | 14:19 |
clarkb | infra-root I'm going to try and dig into the jammy launch node issues today. corvus iirc the issue was deleting the ubuntu user which we were currently ssh'd in as? | 15:10 |
clarkb | hrm it looks like one of the first things that launch node does is switch to root if it isn't already root | 15:11 |
*** dviroel is now known as dviroel|lunch | 15:11 | |
clarkb | ah ok it was a specific pid 1559 which may have been running independently of the ssh connection | 15:12 |
clarkb | fungi: also before I dive too deeply into ^ I should probably go and check that I can trace a connection from the gitea lb to apache to gitea itself | 15:19 |
fungi | makes sense | 15:21 |
clarkb | ok breakfast first, then gitea, then jammy launches | 15:21 |
mtreinish | random ssh key question. I'm trying to push a patch to gerrit and my public key (which I had been using on gerrit since ~2013) is being rejected. Looking at the verbose output it seems to be caused by "no mutual signature algorithm" | 15:29 |
mtreinish | was there a change in the allowed ssh key algorithms that I missed? | 15:30 |
clarkb | mtreinish: yes, but on the client side. Chances are you are running newish openssh, which dropped support for rsa + sha1, but when they did that they didn't update the default to rsa + sha2 (a bug imo). Gerrit can do rsa+sha2 but doesn't support the key exchange extension to negotiate that, so it fails | 15:35 |
clarkb | mtreinish: I actually fixed that in newer gerrit but the backport to 3.5 is stalled because my account in upstream gerrit got deleted or something and other people won't manually cherry pick the change for me | 15:36 |
clarkb | and google hasn't said what happened to my account yet | 15:36 |
clarkb | mtreinish: there are two workarounds. One is to use a key that isn't an rsa key. The other is to specifically allow rsa + sha1 to review.opendev.org | 15:36 |
mtreinish | heh, I guess it's the curse of archlinux again :) | 15:37 |
clarkb | mtreinish: https://www.openssh.com/txt/release-8.8 the section on backward-incompatible changes covers this as well as the rsa + sha1 workaround | 15:37 |
mtreinish | I also can try pushing from a system that I haven't updated in a while I guess | 15:37 |
mtreinish | thanks I'll give those a try | 15:37 |
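The host-scoped workaround clarkb describes maps to a client-side ssh config entry along these lines (a minimal sketch based on the OpenSSH 8.8 release notes linked above; the key path is illustrative):

```
# ~/.ssh/config -- scope the workaround to review.opendev.org only
Host review.opendev.org
    # option 1: re-allow RSA/SHA-1 signatures for this one host
    # (per the OpenSSH 8.8 release notes linked above)
    PubkeyAcceptedAlgorithms +ssh-rsa
    # option 2 (instead): point at a non-RSA key, e.g. one generated
    # with `ssh-keygen -t ed25519`; the path below is illustrative
    # IdentityFile ~/.ssh/id_ed25519
```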
clarkb | originally we weren't going to backport the fix to gerrit 3.5 because that would require updating mina on 3.5 which required updating jgit. But then they did that last week for other reasons so now the key negotiation fix is valid | 15:38 |
clarkb | I also looked at the openssh code to try and figure out how to update the fallback to sha2 for when the negotiation fails, since sha1 is basically never going to work. And I got lost in all the indirection they do to implement defaults | 15:40 |
clarkb | I could probably figure it out if I took the time to do a debug build and attach gdb | 15:40 |
clarkb | but meh | 15:40 |
mtreinish | heh, yeah that's probably too much I would have given up long before that | 15:41 |
priteau | Hello. Just got a POST_FAILURE due to a host key verification failure. | 15:44 |
priteau | https://zuul.opendev.org/t/openstack/build/80949a6467644308837009c3c39a6ecd | 15:44 |
clarkb | priteau: we believe those occur because openstack is reusing IP addresses in the cloud(s) that we boot test nodes in. | 15:44 |
priteau | Would it make sense to clear known host somewhere in Zuul? | 15:45 |
clarkb | priteau: unfortunately there isn't much we can do about that from our end other than try and encourage openstack to stop doing that. But since we don't run the clouds we don't have insight into when/why it happens (though aiui cells are suspected) | 15:45 |
clarkb | priteau: no that won't help we'd just fail to ssh when the IP is attached to a host we don't control | 15:45 |
clarkb | priteau: basically two (or more) hosts end up with the same IP then fight over populating ARP tables | 15:46 |
priteau | Oh, that's bad | 15:46 |
clarkb | whichever is currently in the ARP tables wins and gets the connections. If that isn't our host then you get the failure you see. It is entirely a bug in openstack | 15:46 |
priteau | I thought you meant reusing as in reusing later. Like Neutron does everywhere. | 15:46 |
clarkb | no that's fine | 15:46 |
priteau | A genuine bug in openstack? Or something broken in one of the clouds opendev uses? | 15:47 |
clarkb | priteau: I mean the fact that it is possible is a bug in openstack to me. | 15:47 |
clarkb | it should never be possible for neutron/nova/whatever to give two different hosts the same IP at the same time | 15:48 |
clarkb | even if the issue is in a third party driver nova/neutron/whatever should say "no" | 15:48 |
clarkb | and fail to boot the second instance instead | 15:48 |
clarkb | fungi: I opened a connection to https://opendev.org/opendev/git-review from my desk, then traced that to the backend. One thing I notice is that apache -> gitea uses a single connection for many requests to the frontend, which means this isn't perfect, but it is an improvement on what we had before | 15:49 |
clarkb | priteau: fwiw it is also possible for jobs to reset their ssh host keys which would also break this, but this is the standard openstack-tox-docs jobs which shouldn't do that unless the tox run is doing something very weird | 15:51 |
*** marios is now known as marios|out | 15:51 | |
fungi | priteau: apparently cells v1 was really bad about losing track of virtual machines, but i'm not sure all the occurrences are attributable to that | 15:58 |
fungi | but in essence yes, what happens is that some old vm which nova no longer knows about is running on one of the hypervisors, but it/neutron think the ip address is available again so they assign it to a test node we boot, and then we intermittently end up trying to connect to the old stale guest rather than our test node | 15:59 |
fungi | the cloud providers where this is relatively common seem to run automated "cleanup" tasks to find those rogue vms and clear them out periodically | 16:00 |
priteau | Indeed, I could see this happening | 16:01 |
priteau | Rogue VMs | 16:01 |
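For anyone diagnosing this class of failure, the neighbor table shows which MAC address currently answers for the disputed IP. A rough sketch, run from another host on the same network segment (the address and interface name are made up):

```shell
# which MAC does this host currently have cached for the IP?
ip neigh show to 203.0.113.45
# actively probe for duplicate claimants (iputils arping; -D enables
# duplicate address detection mode)
arping -D -c 3 -I eth0 203.0.113.45
```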
corvus | clarkb: yes, i think it's likely that userdel is just more careful now than in older versions, and the process (whatever it is) probably is running in older versions too. i suspect the right answer may be to find a way to ignore the error and proceed | 16:02 |
mtreinish | clarkb: thanks I just created a second key with ecdsa and was able to push my patch: https://review.opendev.org/c/openstack/stevedore/+/860109 | 16:02 |
fungi | mtreinish: yep, that's probably the safest workaround | 16:02 |
mtreinish | (the config option didn't work for me for whatever reason, I think I remember reading something in an arch package upgrade guide about the rsa keys, so it might be something on the package side) | 16:02 |
clarkb | corvus: ya userdel has a --force option which will get around that but it has a bunch of other new behavior it brings in too that we may not want | 16:03 |
clarkb | corvus just dropped but "This option forces the removal of the user account, even if the user is still logged in. It also forces userdel to remove the user's home directory and mail spool, even if another user uses the same home directory or if the mail spool is not owned by the specified user." | 16:05 |
clarkb | the idea I wanted to look into is rebooting before ssh'ing back in as root which should ensure that any remaining ubuntu owned processes are gone | 16:09 |
clarkb | but we can add the force option instead if others aren't worried about that (I think it may cause some stuff to get deleted for other old disabled users). Hrm, maybe we move the regular disabled users and system image user disablement into different tasks, and one can use force and the other won't | 16:10 |
clarkb | I'll go ahead and write that change because I think it will be the least impactful and easiest to understand | 16:11 |
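A rough sketch of the split clarkb has in mind, using Ansible's user module (force: true maps to userdel --force; the account names and the disabled_users variable are illustrative):

```yaml
# force removal only for the distro image account that may still own
# processes; other disabled accounts keep the gentler behavior
- name: Remove the distro cloud image user even if it is still logged in
  ansible.builtin.user:
    name: ubuntu              # illustrative; varies by image
    state: absent
    force: true

- name: Disable other legacy users without forcing
  ansible.builtin.user:
    name: "{{ item }}"
    state: absent
  loop: "{{ disabled_users | default([]) }}"   # hypothetical variable
```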
opendevreview | Clark Boylan proposed opendev/system-config master: Disable distro cloud image users more forcefully https://review.opendev.org/c/opendev/system-config/+/860112 | 16:24 |
clarkb | something like that maybe | 16:24 |
clarkb | fungi: https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/FVZW5DQJ7C3TW4LPIIU7ARI7XMVJYYWX/ that's the followup to my email about mailman3 docker images. TLDR it sounds like maxking is largely doing it alone and the others involved in the project aren't involved with the docker stuff :/ | 16:30 |
clarkb | fungi: I'm beginning to wonder if we shouldn't fork the images | 16:30 |
*** jpena is now known as jpena|off | 16:30 | |
fungi | got it, so more of an example/starting point | 16:31 |
clarkb | well I think the intention is that they be production ready | 16:32 |
clarkb | but I'm not sure they receive the attention needed currently to manage that. I'd be happy to help maxking upstream, but I'm not sure how to reach out other than what I've already done (issues/PRs/mailing list) | 16:33 |
fungi | yeah. also we could always un-fork later if maintenance picks back up on it | 16:33 |
fungi | hopefully there's not a ton of churn for the projects being bundled into those images these days, since mm3 has had many years to stabilize now | 16:34 |
clarkb | I'll have to think on this a bit more now that I'm leaning towards a local fork or modification. In particular we should decide if we want to build them up ourselves using a complete fork of the upstream docker files or if we just want to fetch the upstream images and modify them to our needs | 16:36 |
fungi | sure, we do both in different places, depending on the situation | 16:37 |
clarkb | I think the major upside to forking properly is we can set the uids and gids without needing to do a global chown across the image. However, if we do that we're more than likely going to never reconverge with upstream | 16:38 |
*** dviroel|lunch is now known as dviroel | 16:38 | |
clarkb | if we just want to install lynx then doing that in a new layer is probably simplest and most likely to allow us to unfork later | 16:38 |
clarkb | so that might be the best place to start as it keeps the delta small and options open | 16:38 |
fungi | though it leaves us with concerns over the uid/gid conflicts | 16:40 |
clarkb | right | 16:42 |
fungi | looks like the container author was last seen responding to this thread: https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/NNYOXOE33DJEFWQ5WUBMJBB35IRAACQK/#S3KFJQD23IMACB7CR6K4ZWUQITREG6ID | 16:42 |
fungi | that was roughly a month ago | 16:42 |
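The "new layer" option clarkb mentions would look roughly like this (a sketch assuming the upstream maxking images are Alpine-based, hence apk; the tag is illustrative):

```dockerfile
# extend the upstream image rather than forking its build; the delta is
# a single layer, which keeps un-forking later easy
FROM maxking/mailman-web:rolling
RUN apk add --no-cache lynx
```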
clarkb | fyi jitsi meet updated about 4 days ago. I've gone ahead and subscribed to release notifications for https://github.com/jitsi/docker-jitsi-meet so that I'll get alerted when those updates happen | 17:53 |
clarkb | But we may want to retest meetpad soon just to be sure the update is working for us | 17:53 |
fungi | clarkb: spotz and i used it a few hours ago for the diversity wg meeting, seemed fine still | 18:37 |
*** dviroel is now known as dviroel|afk | 19:37 | |
clarkb | fungi: ah cool | 19:44 |
clarkb | fungi: over in gerrit land they are trying to run debian buster jobs and use openjdk-8. It seems that openjdk-8 is not in buster proper but is in sid. Do you know anywhere in our jobs where we might add debian unstable as an example I can show them? | 19:59 |
clarkb | I showed them what zuul did to install libc previously | 20:08 |
clarkb | I think that will work. Pin the default release to stable then install openjdk-8 which only exists in unstable | 20:08 |
fungi | technically you don't need to pin testing or unstable if you have stable sources, since the repositories themselves set relative priorities, so you'll only ever wind up getting packages from sid if they don't exist in buster or you explicitly request them by version or suite name | 20:56 |
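Putting that together, a job step on a buster node might look like the following sketch (clarkb's pin is kept for safety even though, per fungi, it may be unnecessary):

```shell
# add unstable (sid) sources alongside the stock buster ones
echo 'deb http://deb.debian.org/debian sid main' \
    | sudo tee /etc/apt/sources.list.d/sid.list
# prefer stable by default
echo 'APT::Default-Release "buster";' \
    | sudo tee /etc/apt/apt.conf.d/99default-release
sudo apt-get update
# pull just this one package from sid
sudo apt-get install -y -t sid openjdk-8-jdk
```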
clarkb | oh cool. I'd push a patch to them but I can't do that anymore because something broke my account. But I gave them lots of examples and hopefully they can address it themselves with that info | 20:56 |
ianw | clarkb: thanks for looking at the jammy launcher, i was wondering if that would work. | 21:08 |
ianw | related to that; https://review.opendev.org/q/topic:bridge-ansible-venv is ready for review | 21:09 |
*** dasm is now known as dasm|off | 21:16 | |
clarkb | ah cool I'll have to take a look at those now | 21:21 |
clarkb | I think I reviewed some of them previously but that was very early on. | 21:21 |
clarkb | fungi: I've nearly got a mm3 docker image fork change ready to push on top of the existing change | 21:22 |
clarkb | however, I'm realizing that we may run into the problem with the locale stuff | 21:22 |
clarkb | but we'll figure that out if it becomes a problem I guess | 21:22 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157 | 21:36 |
clarkb | I think the django msgfmt issue with the mailman3 images is addressed by https://github.com/django-extensions/django-extensions/pull/1740 which seems to be in the most recent release of that tool and is newer than the failures I saw upstream | 21:37 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157 | 21:41 |
clarkb | once ^ seems to be working we can layer in our additions as heavily as we like | 21:44 |
clarkb | right now all I'm doing different than upstream is adding lynx | 21:44 |
clarkb | ianw: re the ansible in venv. The plan is to switch the existing server over to the venv first, right? So we've got to be careful about not updating ansible to start? | 21:46 |
fungi | clarkb: awesome (wrt mm3 image fork), i was planning to do at least one more import on a fresh held node | 21:51 |
fungi | and yes, i saw the thread on their ml about that pr which maxking suggested but then hasn't had time to review since | 21:51 |
fungi | spotted it when i was browsing their archives earlier today | 21:52 |
clarkb | https://github.com/maxking/docker-mailman/pull/555 is the related PR fwiw | 21:54 |
clarkb | heh tox-linters explodes on the vendored scripts | 21:55 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157 | 22:03 |
clarkb | we apparently need buildkit to build these images | 22:03 |
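If the daemon isn't configured with BuildKit as its default builder, it can be enabled per invocation with an environment variable (the image tag here is illustrative):

```shell
DOCKER_BUILDKIT=1 docker build -t opendevorg/mailman-web .
```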
ianw | clarkb: yeah, that entire stack should be safe to apply to the existing host. but i do want to cut over to the new server sooner rather than later. | 22:08 |
*** rlandy is now known as rlandy|bbl | 22:11 | |
clarkb | ianw: suggestion on https://review.opendev.org/c/opendev/system-config/+/857799/ which affects its child | 22:17 |
ianw | thanks; yep we can add a bionic job, just to be sure | 22:18 |
clarkb | ianw: do you know if the updated selenium works against older selenium that will install on bionic? I guess we'll find out :) | 22:21 |
ianw | no the python would be too old for that. but in just a basic bridge deployment test i don't think we'd be trying to run selenium | 22:21 |
clarkb | oh right | 22:22 |
clarkb | its only gitea, paste, codesearch etc that do the screenshots | 22:22 |
clarkb | woot my images built this time in 860157 | 22:25 |
clarkb | ianw: also did you see https://review.opendev.org/c/zuul/zuul/+/855309/ merged? | 22:29 |
clarkb | https://review.opendev.org/c/opendev/system-config/+/855472 which means that change is ready to land when we are, I think. I've got it on the meeting agenda too | 22:29 |
ianw | yeah, thanks. maybe merge early tomorrow (for me) and can monitor | 22:30 |
clarkb | infra-root I finally updated the meeting agenda for tomorrow. Is anything important missing from that? | 22:34 |
ianw | lgtm, thanks | 22:35 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157 | 23:05 |
clarkb | now with less bashate hate | 23:05 |
clarkb | and agenda sent | 23:09 |
opendevreview | Clark Boylan proposed opendev/system-config master: WIP fork the maxking/docker-mailman images https://review.opendev.org/c/opendev/system-config/+/860157 | 23:14 |
clarkb | it helps to properly test that first (I did actually test it, but because find prints a ton of lines I missed that it was still printing the lines I didn't want) | 23:15 |