*** brinzhang has joined #opendev | 00:13 | |
*** tosky has quit IRC | 00:15 | |
*** hamalq has quit IRC | 01:15 | |
*** _mlavalle_1 has quit IRC | 02:55 | |
*** d34dh0r53 has quit IRC | 03:32 | |
*** d34dh0r53 has joined #opendev | 03:33 | |
*** d34dh0r53 has quit IRC | 03:35 | |
*** d34dh0r53 has joined #opendev | 03:35 | |
*** d34dh0r53 has quit IRC | 04:19 | |
*** d34dh0r53 has joined #opendev | 04:20 | |
*** d34dh0r53 has quit IRC | 04:22 | |
*** d34dh0r53 has joined #opendev | 04:22 | |
*** d34dh0r53 has quit IRC | 04:59 | |
*** d34dh0r53 has joined #opendev | 04:59 | |
*** d34dh0r53 has quit IRC | 05:01 | |
*** d34dh0r53 has joined #opendev | 05:01 | |
*** d34dh0r53 has quit IRC | 05:03 | |
*** d34dh0r53 has joined #opendev | 05:03 | |
*** d34dh0r53 has joined #opendev | 05:04 | |
*** ykarel has joined #opendev | 05:23 | |
*** ykarel_ has joined #opendev | 05:26 | |
*** brinzhang has quit IRC | 05:29 | |
*** ykarel has quit IRC | 05:29 | |
*** brinzhang has joined #opendev | 05:29 | |
*** Vivek has joined #opendev | 05:40 | |
*** ykarel_ has quit IRC | 06:01 | |
*** ykarel has joined #opendev | 06:20 | |
*** marios has joined #opendev | 06:45 | |
*** ykarel_ has joined #opendev | 07:11 | |
*** ykarel has quit IRC | 07:12 | |
*** Vivek has quit IRC | 07:15 | |
*** Vivek has joined #opendev | 07:24 | |
*** hashar has joined #opendev | 07:37 | |
*** eolivare has joined #opendev | 07:38 | |
*** fressi has joined #opendev | 07:43 | |
*** fressi has joined #opendev | 07:45 | |
*** jpena|off is now known as jpena | 07:46 | |
*** Vivek has quit IRC | 07:48 | |
*** ralonsoh has joined #opendev | 07:48 | |
*** DSpider has joined #opendev | 07:49 | |
*** ykarel_ has quit IRC | 07:55 | |
*** ykarel_ has joined #opendev | 07:56 | |
*** ykarel__ has joined #opendev | 07:58 | |
openstackgerrit | xinliang proposed openstack/diskimage-builder master: Fix building error with element dracut-regenerate https://review.opendev.org/c/openstack/diskimage-builder/+/770241 | 08:00 |
---|---|---|
*** ykarel__ is now known as ykarel | 08:01 | |
*** ykarel_ has quit IRC | 08:01 | |
*** slaweq has joined #opendev | 08:03 | |
*** rpittau|afk is now known as rpittau | 08:08 | |
*** Vivek has joined #opendev | 08:13 | |
*** andrewbonney has joined #opendev | 08:17 | |
*** tosky has joined #opendev | 08:22 | |
*** sboyron has joined #opendev | 08:33 | |
*** ysandeep is now known as ysandeep|lunch | 08:55 | |
*** Vivek has quit IRC | 09:09 | |
*** Vivek has joined #opendev | 09:11 | |
*** dtantsur|afk is now known as dtantsur | 09:42 | |
*** hrw has joined #opendev | 09:43 | |
hrw | hello | 09:43 |
hrw | can someone check aarch64 CI nodes? | 09:43 |
hrw | 2021-01-12 09:18:55.758917 | primary | W: Failed to fetch https://mirror.regionone.linaro-us.opendev.org/debian/dists/buster-backports/InRelease Could not connect to mirror.regionone.linaro-us.opendev.org:443 (2604:1380:4111:3e54:f816:3eff:fe17:6b17). - connect (113: No route to host) Could not connect to mirror.regionone.linaro-us.opendev.org:443 (139.178.85.140). - connect (113: No route to host) | 09:43 |
*** Vivek has quit IRC | 10:00 | |
*** Vivek has joined #opendev | 10:18 | |
frickler | infra-root: ^^ can't reach the mirror via ssh, either. checking now whether it might be powered off, istr that that happened occasionally | 10:19 |
frickler | yup, started the server, waiting for it to come back online. nb03 is also shutoff, but I'm not sure whether that might be intended | 10:23 |
frickler | hmm, afs service timed out on initial startup and doesn't behave well when trying to restart it, too. will try another reboot of the whole server | 10:33 |
hrw | thanks frickler | 10:39 |
*** ysandeep|lunch is now known as ysandeep | 10:43 | |
frickler | infra-root: no success with the afs client, need someone with more experience to take a deeper look | 11:00 |
*** ShadowJonathan has quit IRC | 11:06 | |
*** ShadowJonathan has joined #opendev | 11:06 | |
*** rpittau has quit IRC | 11:06 | |
*** rpittau has joined #opendev | 11:07 | |
*** otherwiseguy has quit IRC | 11:07 | |
*** otherwiseguy has joined #opendev | 11:12 | |
*** Vivek has quit IRC | 11:49 | |
*** jpena is now known as jpena|lunch | 12:25 | |
*** jpena|lunch is now known as jpena | 13:21 | |
*** mlavalle has joined #opendev | 13:58 | |
*** Alex_Gaynor has joined #opendev | 14:06 | |
Alex_Gaynor | https://mirror.regionone.linaro-us.opendev.org/centos/8/AppStream/aarch64/os/repodata/repomd.xml (and I think other resources on the mirror) are returning 403s, all aarch64 builds are failing | 14:07 |
hrw | Alex_Gaynor: yep | 14:07 |
*** fressi has quit IRC | 14:16 | |
*** fressi has joined #opendev | 14:20 | |
Alex_Gaynor | What can be done to address the outages this causes? The mirror 403ing is a pretty consistent cause of unavailability. | 14:41 |
hrw | I asked admins on Linaro side to take a look | 14:47 |
fungi | Alex_Gaynor: the main thing which can be done is to find a second donor. zuul/nodepool do a great job of absorbing provider outages (they happen all the time) so long as there are multiple providers for a node type | 14:49 |
fungi | for general x86 nodes this isn't usually user-facing because we have roughly a dozen different providers of those and if one goes offline then our capacity just drops a little | 14:50 |
hrw | and there is just one aarch64 provider | 14:50 |
Alex_Gaynor | Isn't the mirror run by the opendev team? I'm just assuming based on the domain | 14:50 |
clarkb | Alex_Gaynor: it is, but the problem with it is that it was shutdown underneath us (eg by the cloud) and when we have started it up again it appears to be having network connectivity issues for the backing filesystem | 14:51 |
clarkb | so yes we run it but the current problems appear hosting related | 14:51 |
fungi | ahh, i see, so the problem is the mirror not node availability this time. but still yes when there are multiple providers and something is preventing bits of supporting infrastructure like an in-provider mirror from working, we can quickly turn that provider down in our configuration and rely on the others until it's in good shape again | 14:52 |
clarkb | fungi: ya that is just based on my reading of scrollback I've not yet had time to look at the server myself. I need to make tea and get something to eat before meetings | 14:52 |
Alex_Gaynor | I do wonder if there's a way to get apt/rpm to simply fall back to the upstream mirrors in the event of an error like this. The mirror is simply for efficiency, isn't it? | 14:53 |
clarkb | Alex_Gaynor: no, the upstream mirrors tend to break often particularly for deb repos | 14:53 |
clarkb | this is a failure of the deb package mirroring process where you have index updates racing package removals | 14:54 |
clarkb | we address this by mirroring ourselves into a filesystem that gets released in verified snapshots | 14:54 |
clarkb | and those snapshots retain old packages for several hours to ensure somewhat stale index updates can still find packages | 14:54 |
Alex_Gaynor | Right, so the ideal behavior is that you need _at least one_ of the mirror or upstream in good working order. | 14:55 |
Alex_Gaynor | I don't know if deb can be configured to do that, unfortunately. | 14:55 |
clarkb | ya I'm not sure how apt(-get|itude) would treat multiple mirrors. Off the top of my head it expects each of them to be working but maybe that is configurable | 14:56 |
hrw | s/-get// as 'apt' is enough command now | 14:56 |
clarkb | fungi: ok this looks like the stuck rmmod for openafs startup errors. I'm not sure those ever got traced back to a specific cause (and were they aarch64 only?) | 14:58 |
fungi | i think in one case we decided that an ungraceful shutdown corrupted the afs cache and removing it seemed to get things working on reboot | 14:59 |
clarkb | sounds like if a package is in multiple indexes the first listed one wins. Not clear to me what happens if an index is broken though | 15:00 |
clarkb | fungi: oh ya I think that is right | 15:00 |
fungi | apt/apt-get won't fall back on another source if the download returns an error, even if there are multiple sources for the same package. not unless it was added recently and i missed the announcement | 15:01 |
clarkb | that is what I thought, thanks | 15:04 |
clarkb | fungi: do you know how the cache was cleared? seems like fs flushmount isn't working because openafs wasn't working | 15:06 |
clarkb | was it direct rm of the cache dir? | 15:06 |
mordred | yeah - apt can fallback to a second server for a given package, but it will read the index from all of the sources, and that index will be combined. because of how we construct our mirrors, unfortunately, there's no good way for us to list the upstream sources that would help | 15:13 |
fungi | well, also it will error | 15:14 |
fungi | when it can't read one of them, the result will be nonzero exit | 15:14 |
mordred | yah | 15:14 |
fungi | clarkb: we deleted the cachedir or its contents directly | 15:15 |
mordred | it basically would increase our chances of failure | 15:15 |
mordred | rather than decrease them | 15:15 |
fungi | clarkb: were you going to delete the contents of /var/cache/openafs or shall i? | 15:25 |
clarkb | fungi: if you can do it that would be great | 15:26 |
fungi | on it | 15:26 |
fungi | and rebooting it again to make sure it starts cleanly | 15:28 |
*** lbragstad has quit IRC | 15:29 | |
fungi | it's still trying to shutdown according to the console stream, waiting for afsd to stop and to rmmod the lkm for it | 15:31 |
fungi | it's proceeding now | 15:33 |
clarkb | I think there is a timeout on the wait for processes to stop | 15:33 |
fungi | yeah, it's moved on from sigterm to sigkill on them | 15:33 |
*** lbragstad has joined #opendev | 15:34 | |
fungi | and actually booting now | 15:34 |
*** lbragstad has quit IRC | 15:35 | |
*** lbragstad has joined #opendev | 15:35 | |
fungi | i can ssh into it again | 15:36 |
fungi | or, well, can get a reply from the sshd, other daemons are still starting | 15:36 |
fungi | okay, *now* i can | 15:37 |
clarkb | and /afs is populated again | 15:37 |
fungi | i can get a directory listing from /afs/openstack.org/ on it yes | 15:37 |
clarkb | https://mirror.regionone.linaro-us.opendev.org/ also looks happy | 15:37 |
fungi | #status log manually deleted contents of /var/cache/openafs on mirror.regionone.linaro-us and rebooted to recover from a previous unclean shutdown which was preventing afsd from working | 15:38 |
openstackstatus | fungi: finished logging | 15:38 |
fungi | hrw: Alex_Gaynor: looks like builds there should work again, still hoping we hear back from the admins at linaro as to why server instances there spontaneously got shutdown | 15:39 |
Alex_Gaynor | 👍 given this was not ultimately an issue with the hosting provider, is there something that can be done in the future to prevent this / make it easier to diagnose/recover from / etc.? | 15:40 |
fungi | Alex_Gaynor: well, it was an issue with the hosting provider, persistent servers there were stopped suddenly | 15:41 |
clarkb | and that corrupted caches | 15:41 |
fungi | we could clear caches on reboot, but that would force normal reboots to start from a cold cache | 15:41 |
Alex_Gaynor | Sure, but it was not the networking issue on the host that was originally suspected. | 15:41 |
clarkb | Alex_Gaynor: no but the servers were hard off prior to that | 15:42 |
Alex_Gaynor | I don't think this is the first time this situation has occurred. | 15:42 |
clarkb | I was just speculating as I read scrollback in the morning | 15:42 |
clarkb | Alex_Gaynor: it is not and each time the root cause was the provider hard shutting down the VM | 15:42 |
fungi | (also caches aren't the only thing which can go wrong on an unclean shutdown. thankfully this didn't want console intervention to fsck the rootfs or something) | 15:42 |
clarkb | I guess what we are trying to say is we can add things on our end but none of that will fix the root cause | 15:42 |
clarkb | ideally the provider would address the root cause problem (and it sounds like hrw has reached out to them about it) | 15:43 |
fungi | redundancy is the solution | 15:43 |
clarkb | also ^ | 15:43 |
Alex_Gaynor | Given there is not another aarch64 host in the waiting (AFAIK), and this one keeps hard shutting down VMs, being resilient to them being shut down seems like a valuable way to improve availability. | 15:43 |
*** andrii_ostapenko has joined #opendev | 15:43 | |
fungi | i can try to think through some options when i'm not in the middle of an all-morning videoconference, but i'm hesitant to spend a lot of our time resources engineering a workaround for the lack of redundancy, when we ultimately want redundancy | 15:46 |
fungi | our efforts might be better spent in outreach to potential donors of arm64 capacity | 15:47 |
*** d34dh0r53 has quit IRC | 15:51 | |
clarkb | also worth noting that your jobs do not need to use any of our mirroring (at least until we get complaints about network bw usage) | 15:52 |
*** d34dh0r53 has joined #opendev | 15:53 | |
clarkb | we provide them largely to try and make things more reliable. It is possible that the intersection of arm64 and mirroring is such that this isn't currently the case | 15:53 |
Alex_Gaynor | Are there instructions on how to disable them, the playbook that installs them seems to run before our playbooks (that is to say, I don't see any "setup mirroring" steps in ours) | 16:02 |
hrw | Alex_Gaynor: you can always remove /etc/yum.repos.d/*repo and replace them with yours (same for Debian based) | 16:03 |
clarkb | I think your jobs are inheriting the mirror setup from our default job. Ya probably the simplest step is to simply override them early in your job | 16:03 |
clarkb | the one concern with that is it may not be early enough in the job. If that is the case then may need to override the base job? I'll have to look at concrete job setup before I can give better advice and like fungi am distracted by meetings today | 16:04 |
*** fressi has quit IRC | 16:16 | |
*** d34dh0r53 has quit IRC | 16:19 | |
*** d34dh0r53 has joined #opendev | 16:19 | |
hrw | thanks for fixing aarch64 nodes | 16:20 |
hrw | kolla jobs are in progress | 16:21 |
*** artom has joined #opendev | 16:23 | |
artom | Openstack-specific question: What "thing" does those automatic updates to launchpad/storyboard whenever a patch that has 'Closes-bug:' or 'Implements blueprint' is proposed | 16:24 |
artom | ? | 16:24 |
artom | And where is the source? | 16:24 |
clarkb | artom: opendev/jeepyb is the project and it has a number of scripts for various actions. We then tell gerrit to run them as hooks for when changes are proposed/updated/merged | 16:25 |
artom | clarkb, aha, cool! | 16:26 |
clarkb | artom: that is for launchpad. For storyboard there is an its-storyboard plugin that we install in gerrit | 16:27 |
clarkb | artom: the source code for its-storyboard is hosted with the other gerrit plugins upstream in the gerrit code hosting | 16:27 |
artom | clarkb, so it's driven by gerrit hooks... I want to replicate something similar in our downstream gerrit, but I'm not sure how easy/hard it will be for me to be able to get access to configure hooks | 16:27 |
clarkb | yes they are gerrit hooks. The way those work is you have scripts with special names in a hooks dir on the server | 16:28 |
artom | Thanks for the tips :) | 16:30 |
*** sshnaidm|ruck is now known as sshnaidm|afk | 16:47 | |
fungi | artom: ultimately we'd prefer to replace all those (gerrit hooks and "its" plugins) with zero-node zuul jobs | 16:54 |
fungi | triggered off gerrit's event stream | 16:55 |
*** openstackgerrit has quit IRC | 16:55 | |
fungi | looks like we saw a significant system load spike on review.o.o around 15:55-16:00z | 16:55 |
fungi | may need to keep an eye on it | 16:56 |
andrii_ostapenko | hey folks, do you have any insight what's going on with stackalytics.com? | 16:57 |
clarkb | andrii_ostapenko: no we don't run that service, it has always been something mirantis built, ran, and hosted | 16:57 |
andrii_ostapenko | i know you don't manage it, though maybe you have some information | 16:57 |
*** marios is now known as marios|out | 16:59 | |
andrii_ostapenko | clarkb, actually it's not operational for quite a while, and i'm asking because i host one for myself https://stackalytics.io and shared to some guys who were needing it. Though not sure how it will behave on higher load, so hesitating to advertise it for broader audience. However we may try | 17:00 |
clarkb | sorry, we've never really been involved in it | 17:00 |
*** jpena is now known as jpena|off | 17:01 | |
*** ysandeep is now known as ysandeep|away | 17:01 | |
*** marios|out has quit IRC | 17:05 | |
*** mlavalle has quit IRC | 17:07 | |
*** mlavalle has joined #opendev | 17:07 | |
*** slaweq has quit IRC | 17:10 | |
*** artom has quit IRC | 17:10 | |
*** d34dh0r53 has quit IRC | 17:10 | |
*** d34dh0r53 has joined #opendev | 17:27 | |
*** artom has joined #opendev | 17:30 | |
*** rpittau is now known as rpittau|afk | 17:32 | |
*** sboyron has quit IRC | 17:46 | |
*** eolivare has quit IRC | 17:47 | |
*** sboyron has joined #opendev | 17:48 | |
*** dtantsur is now known as dtantsur|afk | 17:51 | |
*** hamalq has joined #opendev | 17:58 | |
*** ykarel has quit IRC | 18:00 | |
*** ralonsoh has quit IRC | 18:02 | |
*** d34dh0r53 has quit IRC | 18:13 | |
clarkb | other meetings have distracted me and I realized that I didn't send out a meeting agenda yesterday. I think we can still plan to meet in ~38 minutes but with a less formal agenda just to make sure any important items are called out. Sorry about that. I must've dismissed my reminder and not actually followed through with doing the thing :( | 18:23 |
*** d34dh0r53 has joined #opendev | 18:31 | |
*** ianw_pto is now known as ianw | 18:58 | |
*** andrewbonney has quit IRC | 19:07 | |
*** hashar has quit IRC | 19:30 | |
fungi | 2021-01-12 16:55:12 <-- openstackgerrit (~openstack@eavesdrop01.openstack.org) has quit (Quit: Changing servers) | 19:39 |
* fungi sighs and restarts it | 19:39 | |
fungi | #status log restarted gerritbot as it switched irc servers at 16:55 and never came back | 19:41 |
openstackstatus | fungi: finished logging | 19:41 |
*** slaweq has joined #opendev | 19:44 | |
fungi | zbr: maybe we could crowd-source a list of things we'd collectively like to see movement on and try to prioritize them? there's a ton of stuff we need/want to get done, so figuring out what not to work on in a sprint is probably harder | 19:58 |
zbr | i think we have lots of things we want, but if we pick one of two particular aspects, we could try to speedup progress on these. | 20:00 |
zbr | we could even use a special hashtag to mark the work on this and assure we review them daily. | 20:00 |
zbr | i see it more like an experiment, lets see if that approach does bring some good progress. | 20:01 |
fungi | i've merged the internal mirrors in rax change (760495) now, will try to keep an eye on jobs to make sure this doesn't destabilize anything once it's deployed to the executors | 20:01 |
fungi | zbr: i'm in favor of experimenting with hashtags, yep | 20:02 |
fungi | infra-root: do we want a zuul scheduler restart today to pick up wip support in the gerrit driver? | 20:02 |
zbr | one benefit of the hashtag is that it is editable even after the change merged. | 20:02 |
zbr | but i think we need to tune the permissions on gerrit a little bit, to allow anyone to edit them. | 20:03 |
*** slaweq has quit IRC | 20:03 | |
zbr | if I am correct, now only the original author can. | 20:03 |
clarkb | someone will have to explain how a hashtag is different than say a topic? But I don't mind experimenting with new processes to identify priority work as well as tracking it when in review | 20:03 |
zbr | hashtag is google name for free form labels/tags. | 20:03 |
clarkb | today and tomorrow are still a bit busy for me so I probably won't be able to give that the thought it deserves, but in the second half of the week I'm up for brainstorming | 20:04 |
fungi | judging from the node requests graph zuul has a sizeable backlog right now, so i'd probably try to do the scheduler restart in a few hours if things quiet down some | 20:04 |
clarkb | fungi: that sounds reasonable. I'm not opposed to a restart once things calm a bit | 20:04 |
clarkb | and now lunch | 20:05 |
fungi | gerrit events are trending downward, but node requests queue hasn't started to really drop yet | 20:05 |
zbr | is 8pm for me, so i will go out, but i will read stuff in the morning. mention me if needed. | 20:06 |
fungi | thanks zbr! | 20:06 |
zbr | last thing, we need to make some decisions on testing for the utility tools and libraries, like git-review. | 20:07 |
zbr | see https://review.opendev.org/c/openstack/project-config/+/763808/2/zuul.d/projects.yaml | 20:07 |
zbr | IMHO, we should drop support for unsupported python but test with *all* python versions. | 20:08 |
zbr | i know for sure that there are bugs that can happen only on particular python version. | 20:08 |
zbr | testing only with extremes is not good enough, IMHO, at least for stuff like git-review or pbr. | 20:09 |
zbr | if CI resources are really such an issue, we should just build an image with multiple pythons and use that. | 20:10 |
*** openstackgerrit has joined #opendev | 20:10 | |
openstackgerrit | Merged openstack/project-config master: Use internal mirror for RAX IAD/DFW https://review.opendev.org/c/openstack/project-config/+/760495 | 20:10 |
zbr | or better, start using containers as simple tox jobs can run inside containers. | 20:10 |
zbr | i had a few projects (not zuul) where coverage was done combining results from running tox on each python version, which means you would need all pythons on the same machine (multiple jobs and artifact collection would have added too much complexity and implementation effort) | 20:12 |
openstackgerrit | Sorin Sbârnea proposed openstack/project-config master: Update git-review test matrix (drop py27) https://review.opendev.org/c/openstack/project-config/+/763808 | 20:16 |
fungi | so there are code patterns which e.g. wok on python 3.5 and 3.8 but not 3.7? | 20:16 |
fungi | s/wok/work/ | 20:17 |
fungi | clearly i should be starting dinner prep | 20:17 |
zbr | fungi: i got cases where things failed only on py36. | 20:17 |
fungi | zbr: oh, i see, you're worried about codepaths which switch on sys.version | 20:17 |
zbr | in general the chance is relatively small, but is not zero. | 20:17 |
fungi | so for example new feature is added in 3.7 and someone matches on sys.version > (3, 6, 0) or whatever | 20:18 |
zbr | i am ok with doing only edge testing on some projects | 20:18 |
zbr | yep, is very easy to introduce bugs if you skip. | 20:19 |
fungi | so use of the new feature tries to turn on in 3.6 but breaks, whereas it would have avoided using it in 3.5 | 20:19 |
zbr | very common with requirements.txt | 20:19 |
zbr | but on project where running unittests it takes less than 2 mins, maybe chaining would be better to optimize resource usage. | 20:20 |
fungi | also i should have clarified my -1 on 763808 was for the fact that the commit message didn't indicate it was dropping python 2.7 testing, the rest was questions | 20:20 |
zbr | pip would cache deps, most of them are identical between python versions, and we avoid extra jobs. | 20:20 |
fungi | the previous commit message was merely "Update git-review test matrix" with no subsequent description | 20:21 |
zbr | fungi: sure, i agree. feel free to just edit the message (if you know what it should be). | 20:21 |
fungi | no, what you wrote in the new patchset was fine. and i'm okay with dropping 2.7 testing on git-review and with testing intermediate versions (especially if the jobs don't run long) | 20:23 |
fungi | so long as there's some consensus | 20:24 |
fungi | though for some reason zuul is only running openstack-zuul-jobs-linters against that change | 20:24 |
fungi | d'oh, because it's project-config | 20:25 |
fungi | and i see 763803 exercises it | 20:25 |
zbr | btw, in case one of you were curious, tox4 had a pre-release, it is awesome and finally knows when requirements were updated. Bad part is that it only works with simple setups. For example I filed a bug on constraints not working. | 20:25 |
zbr | fungi: yep but because is project-config... | 20:26 |
fungi | right, doesn't actually work | 20:26 |
zbr | imho i would have required py36 but someone (clarkb?) told me to keep py35. | 20:26 |
fungi | we should probably move all that into the git-review repo | 20:26 |
zbr | i would support that. | 20:26 |
fungi | zbr: i want to say we still have (had?) ci jobs running on ubuntu-xenial for stable branches calling git-review. but maybe we can pin the version of git-review they install | 20:27 |
zbr | fungi: no need to pin version, we can make proper use of python_requires | 20:27 |
fungi | assuming we use new enough pip to know to rely on that | 20:28 |
zbr | next version will require 3.6, and older clients will stick to whatever was the last release. | 20:28 |
fungi | if the project-pipeline definition were in the git-review repo, the change in jobs run would have been entirely self-testing | 20:28 |
zbr | well, if we have such an ancient pip, we have far bigger issues than git-review not working. | 20:28 |
fungi | zbr: yeah, i want to say we explicitly install newer pip from pypi on xenial because of that, and only rely on distro packaged pip on >=bionic | 20:29 |
zbr | i am far more stressed about lack of py39 testing than dropping py35. | 20:30 |
zbr | my local default python is 3.9 and i keep encountering issues (minor but enough to annoy) with lots of repos. | 20:31 |
fungi | yeah, my default python is 3.9 as well (and i also have 3.10 alpha on hand) | 20:33 |
fungi | but you should go enjoy your evening, and we can discuss this stuff tomorrow | 20:34 |
clarkb | I still think that for tools that we maintain and run on python 3.5 we should continue to support that version of python | 20:36 |
clarkb | I have no issue with also supporting and testing newer pythons | 20:36 |
clarkb | git review is an interesting case because it's also a client side tool | 20:37 |
fungi | but it's worth keeping in mind that 3.5.10 more than 4 months ago was the final (eol) release for 3.5. i don't think it's entirely out of the question to use that as a guideline for when we expect users to upgrade platforms if they want to directly consume new features/fixes without backporting | 20:39 |
clarkb | except we are one of the users :P | 20:39 |
clarkb | if we think that is important then we should invest in upgrading the places running python3.5 | 20:40 |
fungi | hence the "if they want to directly consume new features/fixes" part. do we need new git-review features on platforms using python 3.5? | 20:41 |
clarkb | fungi: possibly if we upgrade gerrit and they change apis again | 20:42 |
clarkb | (and it isn't entirely theoretical this happened with the 3.2 upgrade) | 20:42 |
clarkb | it is possible I'm overly cautious with git review, but that is because it has a large diverse install base and is user facing | 20:43 |
clarkb | and I don't think git review gains much from stable dict sort orders or type annotations | 20:44 |
zbr | typing could help improve the code a lot, making it safer to add other changes | 20:44 |
clarkb | I mean it's a tiny code base that is not difficult to test | 20:45 |
clarkb | the issues git review tends to run into have to do with differing gerrit versions and different runtime platforms (os x vs windows vs linux) | 20:45 |
clarkb | and it is a very stable code base | 20:46 |
clarkb | we're more likely to add bugs taking advantage of newer python features than we are to avoid them in the process | 20:47 |
fungi | i've become increasingly of the opinion that if software needs typed variables it should be (re)written in a language which implements them rather than bolting them on | 21:15 |
fungi | and lots of software also doesn't need them. particularly software dealing almost exclusively with bytestreams or text | 21:16 |
ianw | clarkb: finding it hard to let this ssh thing rest :) | 21:24 |
ianw | https://gitbox.apache.org/repos/asf?p=mina-sshd.git;a=commit;h=84196d2bef1444048645787caaa2764b54dca0cc appears to implement server-sig-algs | 21:25 |
clarkb | ya it was quite the rabbit hole yesterday | 21:25 |
clarkb | ianw: that added a framework to do it, they added support in the mina ssh client but not sshd if I read the java properly | 21:25 |
clarkb | ianw: basically on the sshd side you are expected to implement a KexExtensionHandler to do it and that isn't provided for you | 21:26 |
clarkb | (the interface is there but not the implementation) | 21:26 |
ianw | hrm, is where it returns the list https://gitbox.apache.org/repos/asf?p=mina-sshd.git;a=blob;f=sshd-common/src/main/java/org/apache/sshd/common/kex/extension/parser/ServerSignatureAlgorithms.java;h=72b1364fabbdec8afa13527e069c10a2fda4cd0a;hb=84196d2bef1444048645787caaa2764b54dca0cc | 21:27 |
clarkb | ya I think all of the primitives are there just need to tie them together? | 21:29 |
ianw | https://github.com/apache/mina-sshd/commit/84196d2bef1444048645787caaa2764b54dca0cc/#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R17 | 21:29 |
clarkb | this is because the client side is implemented aiui | 21:29 |
clarkb | the gerrit sshd uses mina 2.4.0 and it definitely does not return a server-sig-algs to my client when observed with ssh -v | 21:30 |
ianw | yeah, it seems that should have all gone into 2.3.0 | 21:30 |
*** hashar has joined #opendev | 21:31 | |
*** lbragstad has quit IRC | 21:35 | |
*** lbragstad has joined #opendev | 21:38 | |
ianw | might be worth filing an issue upstream? | 21:39 |
clarkb | ya, I was hoping that the gerrit bug would confirm that reading of the issue, but probably quicker to also just file with mina and then link to that in the gerrit bug | 21:41 |
clarkb | it does seem like fedora (and eventually openssh) will start pushing the issue | 21:41 |
ianw | i agree; https://github.com/apache/mina-sshd/commit/84196d2bef1444048645787caaa2764b54dca0cc/#diff-1fd533c1b7162a203c3f4d7860a215eca4d7975b58b1802778b97f70621ae2ae isn't doing anything | 21:41 |
ianw | i'll try writing something up | 21:43 |
fungi | or gerrit will declare ssh protocol unsupportable | 21:57 |
clarkb | well this only affects rsa, I could see them saying don't use rsa then | 21:58 |
clarkb | but mina should likely support this in the lib | 21:58 |
fungi | but also gerrit's gerrit not exposing ssh access suggests there will come a time when they declare it unsupported | 22:03 |
clarkb | that is due to google's hosting rules aiui | 22:03 |
clarkb | but ya it is possible | 22:03 |
clarkb | (google doesn't allow ssh in from outside or something like that) | 22:03 |
fungi | they could argue that applies to shells, not ssh as a transport protocol | 22:04 |
fungi | what gerrit provides isn't really ssh (secure shell) it's an ssh-based api socket | 22:05 |
clarkb | ya its the protocol they don't allow aiui | 22:05 |
clarkb | I think because too many things can be hidden in it with no insight on their side (same for https really but they can't do business without it) | 22:05 |
fungi | api via rfc 4253 transport | 22:06 |
ianw | clarkb/fungi: https://etherpad.opendev.org/p/2qcN-6GzRb5nW2rbux_m | 22:09 |
ianw | i think my two questions are 1) why doesn't -oPubkeyAcceptedKeyTypes=rsa-sha2-512 work ... it seems like it should and 2) a gentle question if anyone is working on the server-sig-algs | 22:10 |
clarkb | ianw: for 1) it doesn't work because openssh client will only use rsa-sha2-512 if the server returns a server-sig-algs list that includes it. Otherwise it falls back to ssh-rsa which fails because we have only allowed rsa-sha2-512 | 22:11 |
fungi | right, PubkeyAcceptedKeyTypes doesn't instruct it to negotiate, just states a preference if there's a negotiation | 22:11 |
ianw | it feels like @ https://github.com/openssh/openssh-portable/blob/master/sshconnect2.c#L1172 | 22:14 |
ianw | if the server_sig_algs == NULL then it matches what's in options.pubkey_key_types | 22:14 |
fungi | if ... ssh->kex->server_sig_algs == NULL will be true if server-sig-algs negotiation is unsupported by the peer? | 22:16 |
fungi | or rather ssh->kex->server_sig_algs will be NULL if server-sig-algs negotiation is unsupported by the peer | 22:17 |
clarkb | ianw: fwiw that assertion is based on the ssh -vvv output when doing that | 22:17 |
clarkb | it lists all the keys and say none of them match | 22:18 |
fungi | and this is also master branch state of openssh-portable | 22:18 |
fungi | so we should be clear what version of the client we're talking about and make sure the source is the same at that point in history | 22:19 |
clarkb | OpenSSH_8.3p1 in my case which is pretty up to date | 22:20 |
fungi | yeah, looks the same around line 112 under the V_8_3 branch | 22:22 |
fungi | er, 1112 | 22:23 |
fungi | https://github.com/openssh/openssh-portable/blob/V_8_3_P1/sshconnect2.c#L1112 | 22:23 |
ianw | i got there via https://bugzilla.redhat.com/attachment.cgi?id=1719130&action=diff | 22:24 |
ianw | that's the fedora diff for buggy debian 10 era openssh 7.4 servers, which apparently support rsa-sha2-256,rsa-sha2-512 but don't correctly advertise it | 22:24 |
fungi | a marvellous example of a pot calling a kettle black | 22:27 |
fungi | "when deviating from upstream recommendations, we found that we were unable to communicate with other systems which deviated from upstream recommendations in different ways" | 22:28 |
ianw | https://issues.apache.org/jira/browse/SSHD-1105?jql=project%20%3D%20SSHD%20AND%20text%20~%20%22rsa-sha2-256%22 | 22:31 |
ianw | perhaps it is not choosing correctly | 22:32 |
clarkb | oh interesting is it the server saying the rsa is wrong maybe? ya | 22:32 |
ianw | i don't know, i'm more lost than ever. i think it's probably worth a jira issue in mina just with the basics and see what they think | 22:33 |
clarkb | ++ | 22:33 |
ianw | there's several related things like this, but nothing i can see clearly directly related. but even being marked a dup would be helpful | 22:34 |
fungi | tripleo's still got a 12-hour gate backlog at the moment, and grafana shows the node request queue dropping very slowly | 23:11 |
fungi | may need to wait a bit longer before attempting a zuul scheduler restart | 23:12 |
fungi | in good news i haven't seen any breakage from the site-variables.yaml update since it was deployed almost 3 hours ago | 23:13 |
*** hashar has quit IRC | 23:15 | |
*** DSpider has quit IRC | 23:23 | |
*** sboyron has quit IRC | 23:45 | |
*** sboyron has joined #opendev | 23:46 | |
*** sboyron has quit IRC | 23:54 | |
*** sboyron has joined #opendev | 23:56 |
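
The version-gating pitfall discussed around 20:17-20:19 (a feature added in 3.7 but guarded with a check like "> (3, 6, 0)") can be sketched as follows. This is only an illustration under stated assumptions: `naive_gate` and `safe_gate` are hypothetical names not taken from git-review or any project mentioned above, and the sketch compares against `sys.version_info` (the tuple form), since `sys.version` as written in the chat is a string.

```python
import sys


def naive_gate(version_info):
    """Hypothetical buggy guard: intended to enable a 3.7+ feature.

    Tuple comparison makes this true for any 3.6.x with x > 0, and for the
    real five-element sys.version_info on 3.6.0 as well, because a longer
    tuple with an equal prefix compares as greater.
    """
    return tuple(version_info) > (3, 6, 0)


def safe_gate(version_info):
    """Hypothetical fixed guard: compare only (major, minor) to (3, 7)."""
    return tuple(version_info[:2]) >= (3, 7)


if __name__ == "__main__":
    # Walk a few interpreter versions to see where the two guards disagree.
    for candidate in [(3, 5, 10), (3, 6, 0), (3, 6, 12), (3, 7, 9), (3, 9, 1)]:
        print(candidate, "naive:", naive_gate(candidate), "safe:", safe_gate(candidate))
    # The running interpreter, for reference.
    print("this interpreter:", safe_gate(sys.version_info))
```

The two guards disagree only on 3.6.x, which is why testing just the oldest and newest interpreters (3.5 and 3.8 in the example from the discussion) would not catch the bug, while a py36 job would.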
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!