Tuesday, 2021-01-12

*** brinzhang has joined #opendev00:13
*** tosky has quit IRC00:15
*** hamalq has quit IRC01:15
*** _mlavalle_1 has quit IRC02:55
*** d34dh0r53 has quit IRC03:32
*** d34dh0r53 has joined #opendev03:33
*** d34dh0r53 has quit IRC03:35
*** d34dh0r53 has joined #opendev03:35
*** d34dh0r53 has quit IRC04:19
*** d34dh0r53 has joined #opendev04:20
*** d34dh0r53 has quit IRC04:22
*** d34dh0r53 has joined #opendev04:22
*** d34dh0r53 has quit IRC04:59
*** d34dh0r53 has joined #opendev04:59
*** d34dh0r53 has quit IRC05:01
*** d34dh0r53 has joined #opendev05:01
*** d34dh0r53 has quit IRC05:03
*** d34dh0r53 has joined #opendev05:03
*** d34dh0r53 has joined #opendev05:04
*** ykarel has joined #opendev05:23
*** ykarel_ has joined #opendev05:26
*** brinzhang has quit IRC05:29
*** ykarel has quit IRC05:29
*** brinzhang has joined #opendev05:29
*** Vivek has joined #opendev05:40
*** ykarel_ has quit IRC06:01
*** ykarel has joined #opendev06:20
*** marios has joined #opendev06:45
*** ykarel_ has joined #opendev07:11
*** ykarel has quit IRC07:12
*** Vivek has quit IRC07:15
*** Vivek has joined #opendev07:24
*** hashar has joined #opendev07:37
*** eolivare has joined #opendev07:38
*** fressi has joined #opendev07:43
*** fressi has joined #opendev07:45
*** jpena|off is now known as jpena07:46
*** Vivek has quit IRC07:48
*** ralonsoh has joined #opendev07:48
*** DSpider has joined #opendev07:49
*** ykarel_ has quit IRC07:55
*** ykarel_ has joined #opendev07:56
*** ykarel__ has joined #opendev07:58
openstackgerritxinliang proposed openstack/diskimage-builder master: Fix building error with element dracut-regenerate  https://review.opendev.org/c/openstack/diskimage-builder/+/77024108:00
*** ykarel__ is now known as ykarel08:01
*** ykarel_ has quit IRC08:01
*** slaweq has joined #opendev08:03
*** rpittau|afk is now known as rpittau08:08
*** Vivek has joined #opendev08:13
*** andrewbonney has joined #opendev08:17
*** tosky has joined #opendev08:22
*** sboyron has joined #opendev08:33
*** ysandeep is now known as ysandeep|lunch08:55
*** Vivek has quit IRC09:09
*** Vivek has joined #opendev09:11
*** dtantsur|afk is now known as dtantsur09:42
*** hrw has joined #opendev09:43
hrwcan someone check aarch64 CI nodes?09:43
hrw2021-01-12 09:18:55.758917 | primary | W: Failed to fetch https://mirror.regionone.linaro-us.opendev.org/debian/dists/buster-backports/InRelease  Could not connect to mirror.regionone.linaro-us.opendev.org:443 (2604:1380:4111:3e54:f816:3eff:fe17:6b17). - connect (113: No route to host) Could not connect to mirror.regionone.linaro-us.opendev.org:443 ( - connect (113: No route to host)09:43
*** Vivek has quit IRC10:00
*** Vivek has joined #opendev10:18
fricklerinfra-root: ^^ can't reach the mirror via ssh, either. checking now whether it might be powered off, istr that that happened occasionally10:19
frickleryup, started the server, waiting for it to come back online. nb03 is also shutoff, but I'm not sure whether that might be intended10:23
fricklerhmm, afs service timed out on initial startup and doesn't behave well when trying to restart it, too. will try another reboot of the whole server10:33
hrwthanks frickler10:39
*** ysandeep|lunch is now known as ysandeep10:43
fricklerinfra-root: no success with the afs client, need someone with more experience to take a deeper look11:00
*** ShadowJonathan has quit IRC11:06
*** ShadowJonathan has joined #opendev11:06
*** rpittau has quit IRC11:06
*** rpittau has joined #opendev11:07
*** otherwiseguy has quit IRC11:07
*** otherwiseguy has joined #opendev11:12
*** Vivek has quit IRC11:49
*** jpena is now known as jpena|lunch12:25
*** jpena|lunch is now known as jpena13:21
*** mlavalle has joined #opendev13:58
*** Alex_Gaynor has joined #opendev14:06
Alex_Gaynorhttps://mirror.regionone.linaro-us.opendev.org/centos/8/AppStream/aarch64/os/repodata/repomd.xml (and I think other resources on the mirror) are returning 403s, all aarch64 builds are failing14:07
hrwAlex_Gaynor: yep14:07
*** fressi has quit IRC14:16
*** fressi has joined #opendev14:20
Alex_GaynorWhat can be done to address the outages this causes? The mirror 403ing is a pretty consistent cause of unavailability.14:41
hrwI asked admins on Linaro side to take a look14:47
fungiAlex_Gaynor: the main thing which can be done is to find a second donor. zuul/nodepool do a great job of absorbing provider outages (they happen all the time) so long as there are multiple providers for a node type14:49
fungifor general x86 nodes this isn't usually user-facing because we have roughly a dozen different providers of those and if one goes offline then our capacity just drops a little14:50
hrwand there is just one aarch64 provider14:50
Alex_GaynorIsn't the mirror run by the opendev team? I'm just assuming based on the domain14:50
clarkbAlex_Gaynor: it is, but the problem with it is that it was shutdown underneath us (eg by the cloud) and when we have started it up again it appears to be having network connectivity issues for the backing filesystem14:51
clarkbso yes we run it but the current problems appear hosting related14:51
fungiahh, i see, so the problem is the mirror not node availability this time. but still yes when there are multiple providers and something is preventing bits of supporting infrastructure like an in-provider mirror from working, we can quickly turn that provider down in our configuration and rely on the others until it's in good shape again14:52
clarkbfungi: ya that is just based on my reading of scrollback I've not yet had time to look at the server myself. I need to make tea and get something to eat before meetings14:52
Alex_GaynorI do wonder if there's a way to get apt/rpm to simply fall back to the upstream mirrors in the event of an error like this. The mirror is simply for efficiency, isn't it?14:53
clarkbAlex_Gaynor: no, the upstream mirrors tend to break often particularly for deb repos14:53
clarkbthis is a failure of the deb package mirroring process where you have index updates racing package removals14:54
clarkbwe address this by mirroring ourselves into a filesystem that gets released in verified snapshots14:54
clarkband those snapshots retain old packages for several hours to ensure somewhat stale index updates can still find packages14:54
Alex_GaynorRight, so the ideal behavior is that you need _at least one_ of the mirror or upstream in good working order.14:55
Alex_GaynorI don't know if deb can be configured to do that, unfortunately.14:55
clarkbya I'm not sure how apt(-get|itude) would treat multiple mirrors. Off the top of my head it expects each of them to be working but maybe that is configurable14:56
hrws/-get// as 'apt' is enough command now14:56
clarkbfungi: ok this looks like the stuck rmmod for openafs startup errors. I'm not sure those ever got traced back to a specific cause (and were they aarch64 only?)14:58
fungii think in one case we decided that an ungraceful shutdown corrupted the afs cache and removing it seemed to get things working on reboot14:59
clarkbsounds like if a package is in multiple indexes the first listed one wins. Not clear to me what happens if an index is broken though15:00
clarkbfungi: oh ya I think that is right15:00
fungiapt/apt-get won't fall back on another source if the download returns an error, even if there are multiple sources for the same package. not unless it was added recently and i missed the announcement15:01
clarkbthat is what I thought, thanks15:04
clarkbfungi: do you know how the cache was cleared? seems like fs flushmount isn't working because openafs wasn't working15:06
clarkbwas it direct rm of the cache dir?15:06
mordredyeah - apt can fallback to a second server for a given package, but it will read the index from all of the sources, and that index will be combined. because of how we construct our mirrors, unfortunately, there's no good way for us to list the upstream sources that would help15:13
fungiwell, also it will error15:14
fungiwhen it can't read one of them, the result will be nonzero exit15:14
fungiclarkb: we deleted the cachedir or its contents directly15:15
mordredit basically would increase our chances of failure15:15
mordredrather than decrease them15:15
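The point fungi and mordred make above can be checked with a little arithmetic. The sketch below is a simplified model, not a description of apt internals: it assumes the two sources fail independently, and the availability figures are invented for illustration. It compares a run that needs every listed source to be readable with the "at least one" behaviour Alex_Gaynor describes.

```python
def success_if_all_required(availabilities):
    """Chance a run succeeds when every listed source must be reachable."""
    result = 1.0
    for p in availabilities:
        result *= p
    return result


def success_if_any_suffices(availabilities):
    """Chance a run succeeds when one reachable source would be enough."""
    all_down = 1.0
    for p in availabilities:
        all_down *= (1.0 - p)
    return 1.0 - all_down


# Hypothetical availabilities, for illustration only.
mirror, upstream = 0.99, 0.95
print("every source required:", success_if_all_required([mirror, upstream]))
print("any one source enough:", success_if_any_suffices([mirror, upstream]))
```

Under these assumptions, adding a second source that must also be readable lowers the success rate below that of the mirror alone, which is the outcome mordred describes.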
fungiclarkb: were you going to delete the contents of /var/cache/openafs or shall i?15:25
clarkbfungi: if you can do it that would be great15:26
fungion it15:26
fungiand rebooting it again to make sure it starts cleanly15:28
*** lbragstad has quit IRC15:29
fungiit's still trying to shutdown according to the console stream, waiting for afsd to stop and to rmmod the lkm for it15:31
fungiit's proceeding now15:33
clarkbI think there is a timeout on the wait for processes to stop15:33
fungiyeah, it's moved on from sigterm to sigkill on them15:33
*** lbragstad has joined #opendev15:34
fungiand actually booting now15:34
*** lbragstad has quit IRC15:35
*** lbragstad has joined #opendev15:35
fungii can ssh into it again15:36
fungior, well, can get a reply from the sshd, other daemons are still starting15:36
fungiokay, *now* i can15:37
clarkband /afs is populated again15:37
fungii can get a directory listing from /afs/openstack.org/ on it yes15:37
clarkbhttps://mirror.regionone.linaro-us.opendev.org/ also looks happy15:37
fungi#status log manually deleted contents of /var/cache/openafs on mirror.regionone.linaro-us and rebooted to recover from a previous unclean shutdown which was preventing afsd from working15:38
openstackstatusfungi: finished logging15:38
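The log does not record which commands were used for the cleanup. As a minimal sketch of "delete the contents but keep the directory", assuming the AFS client is stopped first and the caller has permission to remove the files, something like the following would do it. The function name is made up for this example.

```python
import os
import shutil


def clear_directory_contents(path):
    """Delete everything inside path but leave path itself in place."""
    # Snapshot the entries first so we never delete while iterating.
    with os.scandir(path) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            shutil.rmtree(entry.path)
        else:
            os.unlink(entry.path)
        removed.append(entry.name)
    return sorted(removed)
```

It would be called as clear_directory_contents("/var/cache/openafs"), the directory named in the status log entry, followed by the reboot fungi mentions.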
fungihrw: Alex_Gaynor: looks like builds there should work again, still hoping we hear back from the admins at linaro as to why server instances there spontaneously got shutdown15:39
Alex_Gaynor👍 given this was not ultimately an issue with the hosting provider, is there something that can be done in the future to prevent this / make it easier to diagnose/recover from / etc.?15:40
fungiAlex_Gaynor: well, it was an issue with the hosting provider, persistent servers there were stopped suddenly15:41
clarkband that corrupted caches15:41
fungiwe could clear caches on reboot, but that would force normal reboots to start from a cold cache15:41
Alex_GaynorSure, but it was not the networking issue on the host that was originally suspected.15:41
clarkbAlex_Gaynor: no but the servers were hard off prior to that15:42
Alex_GaynorI don't think this is the first time this situation has occurred.15:42
clarkbI was just speculating as I read scrollback in the morning15:42
clarkbAlex_Gaynor: it is not and each time the root cause was the provider hard shutting down the VM15:42
fungi(also caches aren't the only thing which can go wrong on an unclean shutdown. thankfully this didn't want console intervention to fsck the rootfs or something)15:42
clarkbI guess what we are trying to say is we can add things on our end but none of that will fix the root cause15:42
clarkbideally the provider would address the root cause problem (and it sounds like hrw has reached out to them about it)15:43
fungiredundancy is the solution15:43
clarkbalso ^15:43
Alex_GaynorGiven there is not another aarch64 host in the waiting (AFAIK), and this one keeps hard shutting down VMs, being resilient to them being shut down seems like a valuable way to improve availability.15:43
*** andrii_ostapenko has joined #opendev15:43
fungii can try to think through some options when i'm not in the middle of an all-morning videoconference, but i'm hesitant to spend a lot of our time resources engineering a workaround for the lack of redundancy, when we ultimately want redundancy15:46
fungiour efforts might be better spent in outreach to potential donors of arm64 capacity15:47
*** d34dh0r53 has quit IRC15:51
clarkbalso worth noting that your jobs do not need to use any of our mirroring (at least until we get complaints about network bw usage)15:52
*** d34dh0r53 has joined #opendev15:53
clarkbwe provide them largely to try and make things more reliable. It is possible that the intersection of arm64 and mirroring is such that this isn't currently the case15:53
Alex_GaynorAre there instructions on how to disable them, the playbook that installs them seems to run before our playbooks (that is to say, I don't see any "setup mirroring" steps in ours)16:02
hrwAlex_Gaynor: you can always remove /etc/yum.repos.d/*repo and replace them with yours (same for Debian based)16:03
clarkbI think your jobs are inheriting the mirror setup from our default job. Ya probably the simplest step is to simply override them early in your job16:03
clarkbthe one concern with that is it may not be early enough in the job. If that is the case then may need to override the base job? I'll have to look at concrete job setup before I can give better advice and like fungi am distracted by meetings today16:04
*** fressi has quit IRC16:16
*** d34dh0r53 has quit IRC16:19
*** d34dh0r53 has joined #opendev16:19
hrwthanks for fixing aarch64 nodes16:20
hrwkolla jobs are in progress16:21
*** artom has joined #opendev16:23
artomOpenstack-specific question: What "thing" does those automatic updates to launchpad/storyboard whenever a patch that has 'Closes-bug:' or 'Implements blueprint' is proposed16:24
artomAnd where is the source?16:24
clarkbartom: opendev/jeepyb is the project and it has a number of scripts for various actions. We then tell gerrit to run them as hooks for when changes are proposed/updated/merged16:25
artomclarkb, aha, cool!16:26
clarkbartom: that is for launchpad. For storyboard there is an its-storyboard plugin that we install in gerrit16:27
clarkbartom: the source code for its-storyboard is hosted with the other gerrit plugins upstream in the gerrit code hosting16:27
artomclarkb, so it's driven by gerrit hooks... I want to replicate something similar in our downstream gerrit, but I'm not sure how easy/hard it will be for me to be able to get access to configure hooks16:27
clarkbyes they are gerrit hooks. The way those work is you have scripts with special names in a hooks dir on the server16:28
artomThanks for the tips :)16:30
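For anyone replicating this downstream, the core of such a hook is finding bug references in the commit message. The sketch below is not jeepyb's code: the regular expression, the accepted footer variants, and the function name are assumptions made for illustration, and the handling of the arguments Gerrit passes to a hook script is left out.

```python
import re

# Assumed footer syntax: "Closes-Bug: #1234567", plus "Partial-Bug" and
# "Related-Bug" variants. A real hook may accept more forms than this.
BUG_FOOTER = re.compile(
    r"^[ \t]*(closes|partial|related)-bug:[ \t]*#?(\d+)[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def find_bug_references(commit_message):
    """Return (kind, bug_number) pairs found in a commit message."""
    return [
        (kind.lower(), int(number))
        for kind, number in BUG_FOOTER.findall(commit_message)
    ]


message = "Fix mirror setup\n\nCloses-Bug: #12345\nRelated-bug: 678\n"
print(find_bug_references(message))
```

A hook script would then use the returned pairs to decide which tracker entries to update.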
*** sshnaidm|ruck is now known as sshnaidm|afk16:47
fungiartom: ultimately we'd prefer to replace all those (gerrit hooks and "its" plugins) with zero-node zuul jobs16:54
fungitriggered off gerrit's event stream16:55
*** openstackgerrit has quit IRC16:55
fungilooks like we saw a significant system load spike on review.o.o around 15:55-16:00z16:55
fungimay need to keep an eye on it16:56
andrii_ostapenkohey folks, do you have any insight what's going on with stackalytics.com?16:57
clarkbandrii_ostapenko: no we don't run that service, it has always been something mirantis built, ran, and hosted16:57
andrii_ostapenkoi know you don't manage it, though maybe you have some information16:57
*** marios is now known as marios|out16:59
andrii_ostapenkoclarkb, actually it's not operational for quite a while, and i'm asking because i host one for myself https://stackalytics.io and shared to some guys who were needing it. Though not sure how it will behave on higher load, so hesitating to advertise it for broader audience. However we may try17:00
clarkbsorry, we've never really been involved in it17:00
*** jpena is now known as jpena|off17:01
*** ysandeep is now known as ysandeep|away17:01
*** marios|out has quit IRC17:05
*** mlavalle has quit IRC17:07
*** mlavalle has joined #opendev17:07
*** slaweq has quit IRC17:10
*** artom has quit IRC17:10
*** d34dh0r53 has quit IRC17:10
*** d34dh0r53 has joined #opendev17:27
*** artom has joined #opendev17:30
*** rpittau is now known as rpittau|afk17:32
*** sboyron has quit IRC17:46
*** eolivare has quit IRC17:47
*** sboyron has joined #opendev17:48
*** dtantsur is now known as dtantsur|afk17:51
*** hamalq has joined #opendev17:58
*** ykarel has quit IRC18:00
*** ralonsoh has quit IRC18:02
*** d34dh0r53 has quit IRC18:13
clarkbother meetings have distracted me and I realized that I didn't send out a meeting agenda yesterday. I think we can still plan to meet in ~38 minutes but with a less formal agenda just to make sure any important items are called out. Sorry about that. I must've dismissed my reminder and not actually followed through with doing the thing :(18:23
*** d34dh0r53 has joined #opendev18:31
*** ianw_pto is now known as ianw18:58
*** andrewbonney has quit IRC19:07
*** hashar has quit IRC19:30
fungi2021-01-12 16:55:12     <--     openstackgerrit (~openstack@eavesdrop01.openstack.org) has quit (Quit: Changing servers)19:39
* fungi sighs and restarts it19:39
fungi#status log restarted gerritbot as it switched irc servers at 16:55 and never came back19:41
openstackstatusfungi: finished logging19:41
*** slaweq has joined #opendev19:44
fungizbr: maybe we could crowd-source a list of things we'd collectively like to see movement on and try to prioritize them? there's a ton of stuff we need/want to get done, so figuring out what not to work on in a sprint is probably harder19:58
zbri think we have lots of things we want, but if we pick one or two particular aspects, we could try to speed up progress on these.20:00
zbrwe could even use a special hashtag to mark the work on this and assure we review them daily.20:00
zbri see it more like an experiment, lets see if that approach does bring some good progress.20:01
fungii've merged the internal mirrors in rax change (760495) now, will try to keep an eye on jobs to make sure this doesn't destabilize anything once it's deployed to the executors20:01
fungizbr: i'm in favor of experimenting with hashtags, yep20:02
fungiinfra-root: do we want a zuul scheduler restart today to pick up wip support in the gerrit driver?20:02
zbrone benefit of the hashtag is that it is editable even after the change merged.20:02
zbrbut i think we need to tune the permissions on gerrit a little bit, to allow anyone to edit them.20:03
*** slaweq has quit IRC20:03
zbrif I am correct, now only the original author can.20:03
clarkbsomeone will have to explain how a hashtag is different than say a topic? But I don't mind experimenting with new processes to identify priority work as well as tracking it when in review20:03
zbrhashtag is google name for free form labels/tags.20:03
clarkbtoday and tomorrow are still a bit busy for me so I probably won't be able to give that the thought it deserves, but in the second half of the week I'm up for brainstorming20:04
fungijudging from the node requests graph zuul has a sizeable backlog right now, so i'd probably try to do the scheduler restart in a few hours if things quiet down some20:04
clarkbfungi: that sounds reasonable. I'm not opposed to a restart once things calm a bit20:04
clarkband now lunch20:05
fungigerrit events are trending downward, but node requests queue hasn't started to really drop yet20:05
zbris 8pm for me, so i will go out, but i will read stuff in the morning. mention me if needed.20:06
fungithanks zbr!20:06
zbrlast thing, we need to make some decisions on testing for the utility tools and libraries, like git-review.20:07
zbrsee https://review.opendev.org/c/openstack/project-config/+/763808/2/zuul.d/projects.yaml20:07
zbrIMHO, we should drop support for unsupported python but test with *all* python versions.20:08
zbri know for sure that there are bugs that can happen only on particular python version.20:08
zbrtesting only with extremes is not good enough, IMHO, at least for stuff like git-review or pbr.20:09
zbrif CI resources are really such an issue, we should just build an image with multiple pythons and use that.20:10
*** openstackgerrit has joined #opendev20:10
openstackgerritMerged openstack/project-config master: Use internal mirror for RAX IAD/DFW  https://review.opendev.org/c/openstack/project-config/+/76049520:10
zbror better, start using containers as simple tox jobs can run inside containers.20:10
zbri had a few projects (not zuul) where coverage was done combining results from running tox on each python version, which means you would need all python on the same machine (multiple jobs and artifact collection would have added too much complexity and implementation effort)20:12
openstackgerritSorin Sbârnea proposed openstack/project-config master: Update git-review test matrix (drop py27)  https://review.opendev.org/c/openstack/project-config/+/76380820:16
fungiso there are code patterns which e.g. work on python 3.5 and 3.8 but not 3.7?20:16
fungiclearly i should be starting dinner prep20:17
zbrfungi: i got cases where things failed only on py36.20:17
fungizbr: oh, i see, you're worried about codepaths which switch on sys.version20:17
zbrin general the chance is relatively small, but is not zero.20:17
fungiso for example new feature is added in 3.7 and someone matches on sys.version > (3, 6, 0) or whatever20:18
zbri am ok with doing only edge testing on some projects20:18
zbryep, is very easy to introduce bugs if you skip.20:19
fungiso use of the new feature tries to turn on in 3.6 but breaks, whereas it would have avoided using it in 3.520:19
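The failure mode fungi describes is easy to reproduce with tuple comparison. In this sketch plain tuples stand in for sys.version_info (comparing the sys.version string to a tuple would raise a TypeError on Python 3), and the function names are invented for the example.

```python
def wants_new_feature_buggy(version_info):
    # Intended meaning: "3.7 or newer". Actual meaning: anything after
    # 3.6.0, which includes every later 3.6.x point release.
    return version_info > (3, 6, 0)


def wants_new_feature_fixed(version_info):
    # Compare against the first version that really has the feature.
    return version_info >= (3, 7)


for version in [(3, 5, 10), (3, 6, 9), (3, 7, 0), (3, 8, 5)]:
    print(version,
          wants_new_feature_buggy(version),
          wants_new_feature_fixed(version))
```

Testing only the oldest and newest interpreters would exercise the 3.5 and 3.8 rows, where both guards agree, and would never reach the 3.6 row where the buggy guard turns the feature on.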
zbrvery common with requirements.txt20:19
zbrbut on projects where running unittests takes less than 2 mins, maybe chaining would be better to optimize resource usage.20:20
fungialso i should have clarified my -1 on 763808 was for the fact that the commit message didn't indicate it was dropping python 2.7 testing, the rest was questions20:20
zbrpip would cache deps, most of them are identical between python versions, and we avoid extra jobs.20:20
fungithe previous commit message was merely "Update git-review test matrix" with no subsequent description20:21
zbrfungi: sure, i agree. feel free to just edit the message (if you know what it should be).20:21
fungino, what you wrote in the new patchset was fine. and i'm okay with dropping 2.7 testing on git-review and with testing intermediate versions (especially if the jobs don't run long)20:23
fungiso long as there's some consensus20:24
fungithough for some reason zuul is only running openstack-zuul-jobs-linters against that change20:24
fungid'oh, because it's project-config20:25
fungiand i see 763803 exercises it20:25
zbrbtw, in case one of you were curious, tox4 had a pre-release, it is awesome and finally knows when requirements were updated. Bad part is that it does only work with simple setups. For example I filed a bug on constraints not working.20:25
zbrfungi: yep but because is project-config...20:26
fungiright, doesn't actually work20:26
zbrimho i would have required py36 but someone (clarkb?) told me to keep py35.20:26
fungiwe should probably move all that into the git-review repo20:26
zbri would support that.20:26
fungizbr: i want to say we still have (had?) ci jobs running on ubuntu-xenial for stable branches calling git-review. but maybe we can pin the version of git-review they install20:27
zbrfungi: no need to pin version, we can make proper use of python_requires20:27
fungiassuming we use new enough pip to know to rely on that20:28
zbrnext version will require 3.6, and older clients will stick to whatever was last release.20:28
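The effect zbr expects from python_requires can be modelled in a few lines. This is only a sketch of the selection idea, not pip's resolver: the release numbers are hypothetical, and only simple ">=" minimums are considered.

```python
# Hypothetical release history: (release version, minimum Python version).
RELEASES = [
    ((1, 28, 0), (2, 7)),
    ((2, 0, 0), (3, 5)),
    ((2, 1, 0), (3, 6)),
]


def newest_compatible(releases, python_version):
    """Pick the newest release whose minimum Python the interpreter meets."""
    compatible = [
        release for release, min_python in releases
        if python_version >= min_python
    ]
    return max(compatible) if compatible else None


print(newest_compatible(RELEASES, (3, 5)))  # an older interpreter
print(newest_compatible(RELEASES, (3, 9)))  # a current interpreter
```

As fungi notes just after this, the installer has to be new enough to read that metadata for the selection to happen at all.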
fungiif the project-pipeline definition were in the git-review repo, the change in jobs run would have been entirely self-testing20:28
zbrwell, if we have such an ancient pip, we have far bigger issues that git-review not working.20:28
fungizbr: yeah, i want to say we explicitly install newer pip from pypi on xenial because of that, and only rely on distro packaged pip on >=bionic20:29
zbri am far more stressed about lack of py39 testing than dropping py35.20:30
zbrmy local default python is 3.9 and i keep encountering issues (minor but enough to annoy) with lots of repos.20:31
fungiyeah, my default python is 3.9 as well (and i also have 3.10 alpha on hand)20:33
fungibut you should go enjoy your evening, and we can discuss this stuff tomorrow20:34
clarkbI still think that for tools that we maintain and run on python 3.5 we should continue to support that version of python20:36
clarkbI have no issue with also supporting and testing newer pythons20:36
clarkbgit review is an interesting case because its also a client side tool20:37
fungibut it's worth keeping in mind that 3.5.10 more than 4 months ago was the final (eol) release for 3.5. i don't think it's entirely out of the question to use that as a guideline for when we expect users to upgrade platforms if they want to directly consume new features/fixes without backporting20:39
clarkbexcept we are one of the users :P20:39
clarkbif we think that is important then we should invest in upgrading the places running python3.520:40
fungihence the "if they want to directly consume new features/fixes" part. do we need new git-review features on platforms using python 3.5?20:41
clarkbfungi: possibly if we upgrade gerrit and they change apis again20:42
clarkb(and it isn't entirely theoretical this happened with the 3.2 upgrade)20:42
clarkbit is possible I'm overly cautious with git review, but that is because it has a large diverse install base and is user facing20:43
clarkband I don't think git review gains much from stable dict sort orders or type annotations20:44
zbrtyping could help improve the code a lot, making safer to add other changes20:44
clarkbI mean its a tiny code base that is not difficult to test20:45
clarkbthe issues git review tends to run into have to do with differing gerrit versions and different runtime platforms (os x vs windows vs linux)20:45
clarkband it is a very stable code base20:46
clarkbwe're more likely to add bugs taking advantage of newer python features than we are to avoid them in the process20:47
fungii've become increasingly of the opinion that if software needs typed variables it should be (re)written in a language which implements them rather than bolting them on21:15
fungiand lots of software also doesn't need them. particularly software dealing almost exclusively with bytestreams or text21:16
ianwclarkb: finding it hard to let this ssh thing rest :)21:24
ianwhttps://gitbox.apache.org/repos/asf?p=mina-sshd.git;a=commit;h=84196d2bef1444048645787caaa2764b54dca0cc appears to implement server-sig-algs21:25
clarkbya it was quite the rabbit hole yesterday21:25
clarkbianw: that added a framework to do it, they added support in the mina ssh client but not sshd if I read the java properly21:25
clarkbianw: basically on the sshd side you are expected to implement a KexExtensionHandler to do it and that isn't provided for you21:26
clarkb(the interface is there but not the implementation)21:26
ianwhrm, is where it returns the list https://gitbox.apache.org/repos/asf?p=mina-sshd.git;a=blob;f=sshd-common/src/main/java/org/apache/sshd/common/kex/extension/parser/ServerSignatureAlgorithms.java;h=72b1364fabbdec8afa13527e069c10a2fda4cd0a;hb=84196d2bef1444048645787caaa2764b54dca0cc21:27
clarkbya I think all of the primitives are there just need to tie them together?21:29
clarkbthis is because the client side is implemented aiui21:29
clarkbthe gerrit sshd uses mina 2.4.0 and it definitely does not return a server-sig-algs to my client when observed with ssh -v21:30
ianwyeah, it seems that should have all gone into 2.3.021:30
*** hashar has joined #opendev21:31
*** lbragstad has quit IRC21:35
*** lbragstad has joined #opendev21:38
ianwmight be worth filing an issue upstream?21:39
clarkbya, I was hoping that the gerrit bug would confirm that reading of the issue, but probably quicker to also just file with mina and then link to that in the gerrit bug21:41
clarkbit does seem like fedora (and eventually openssh) will start pushing the issue21:41
ianwi agree; https://github.com/apache/mina-sshd/commit/84196d2bef1444048645787caaa2764b54dca0cc/#diff-1fd533c1b7162a203c3f4d7860a215eca4d7975b58b1802778b97f70621ae2ae isn't doing anything21:41
ianwi'll try writing something up21:43
fungior gerrit will declare ssh protocol unsupportable21:57
clarkbwell this only affects rsa, I could see them saying don't use rsa then21:58
clarkbbut mina should likely support this in the lib21:58
fungibut also gerrit's gerrit not exposing ssh access suggests there will come a time when they declare it unsupported22:03
clarkbthat is due to google's hosting rules aiui22:03
clarkbbut ya it is possible22:03
clarkb(google doesn't allow ssh in from outside or something like that)22:03
fungithey could argue that applies to shells, not ssh as a transport protocol22:04
fungiwhat gerrit provides isn't really ssh (secure shell) it's an ssh-based api socket22:05
clarkbya its the protocol they don't allow aiui22:05
clarkbI think because too many things can be hidden in it with no insight on their side (same for https really but they can't do business without it)22:05
fungiapi via rfc 4253 transport22:06
ianwclarkb/fungi: https://etherpad.opendev.org/p/2qcN-6GzRb5nW2rbux_m22:09
ianwi think my two questions are 1) why doesn't -oPubkeyAcceptedKeyTypes=rsa-sha2-512 work ... it seems like it should and 2) a gentle question if anyone is working on the server-sig-algs22:10
clarkbianw: for 1) it doesn't work because openssh client will only use rsa-sha2-512 if the server returns a server-sig-algs list that includes it. Otherwise it falls back to ssh-rsa which fails because we have only allowed rsa-sha2-51222:11
fungiright, PubkeyAcceptedKeyTypes doesn't instruct it to negotiate, just states a preference if there's a negotiation22:11
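The behaviour clarkb and fungi describe can be written down as a toy model. This is not OpenSSH's code, and the conversation that follows does not settle whether the real client behaves exactly this way; the function name and the use of None for "no server-sig-algs extension received" are assumptions made for the sketch.

```python
def choose_rsa_signature(client_accepted, server_sig_algs):
    """Return the signature algorithm the client would try, or None.

    client_accepted: algorithms the client is configured to allow,
        in preference order.
    server_sig_algs: list advertised by the server, or None when the
        server sends no server-sig-algs extension at all.
    """
    if server_sig_algs is None:
        # Nothing was negotiated, so the model falls back to plain
        # ssh-rsa, which only works if the client still allows it.
        return "ssh-rsa" if "ssh-rsa" in client_accepted else None
    for algorithm in client_accepted:
        if algorithm in server_sig_algs:
            return algorithm
    return None


# A server that advertises nothing, and a client restricted to rsa-sha2-512.
print(choose_rsa_signature(["rsa-sha2-512"], None))
```

In this model the restricted client ends up with no usable algorithm against a server that advertises nothing, which matches the "none of them match" output clarkb reports from ssh -vvv.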
ianwit feels like @ https://github.com/openssh/openssh-portable/blob/master/sshconnect2.c#L117222:14
ianwif the server_sig_algs == NULL then it matches what's in options.pubkey_key_types22:14
fungiif ... ssh->kex->server_sig_algs == NULL will be true if server-sig-algs negotiation is unsupported by the peer?22:16
fungior rather ssh->kex->server_sig_algs will be NULL if server-sig-algs negotiation is unsupported by the peer22:17
clarkbianw: fwiw that assertion is based on the ssh -vvv output when doing that22:17
clarkbit lists all the keys and say none of them match22:18
fungiand this is also master branch state of openssh-portable22:18
fungiso we should be clear what version of the client we're talking about and make sure the source is the same at that point in history22:19
clarkbOpenSSH_8.3p1 in my case which is pretty up to date22:20
fungiyeah, looks the same around line 112 under the V_8_3 branch22:22
fungier, 111222:23
ianwi got there via https://bugzilla.redhat.com/attachment.cgi?id=1719130&action=diff22:24
ianwthat's the fedora diff for buggy debian 10 era openssh 7.4 servers, which apparently support rsa-sha2-256,rsa-sha2-512 but don't correctly advertise it22:24
fungia marvellous example of a pot calling a kettle black22:27
fungi"when deviating from upstream recommendations, we found that we were unable to communicate with other systems which deviated from upstream recommendations in different ways"22:28
ianwperhaps it is not choosing correctly22:32
clarkboh interesting is it the server saying the rsa is wrong maybe? ya22:32
ianwi don't know, i'm more lost than ever.  i think it's probably worth a jira issue in mina just with the basics and see what they think22:33
ianwthere's several related things like this, but nothing i can see clearly directly related.  but even being marked a dup would be helpful22:34
fungitripleo's still got a 12-hour gate backlog at the moment, and grafana shows the node request queue dropping very slowly23:11
fungimay need to wait a bit longer before attempting a zuul scheduler restart23:12
fungiin good news i haven't seen any breakage from the site-variables.yaml update since it was deployed almost 3 hours ago23:13
*** hashar has quit IRC23:15
*** DSpider has quit IRC23:23
*** sboyron has quit IRC23:45
*** sboyron has joined #opendev23:46
*** sboyron has quit IRC23:54
*** sboyron has joined #opendev23:56
