*** tosky has quit IRC | 00:05 | |
clarkb | fungi: we've got high load on bridge again. I think because of the other servers you discovered that took a holiday recently | 00:32 |
---|---|---|
fungi | oh, likely so, i can take a look in a bit if you haven't | 00:33 |
clarkb | ya it appears to be leaky ansible again but for elasticsearch and logstash workers? I'm going to look at them then kill old processes | 00:34 |
*** akrpan-pure has quit IRC | 00:40 | |
clarkb | elasticsearch 02-04 and 07 all needed to be restarted and logstash-worker02 and 10 | 00:47 |
clarkb | this was based on info from stale ssh connections reported on bridge | 00:47 |
fungi | judging solely from the ones they opened tickets on for us, rax was doing a significant amount of host migrations over the holidays, so not surprised at all | 00:49 |
clarkb | I'm cleaning up the ansible processes now | 00:50 |
fungi | ahh, thanks! | 00:54 |
clarkb | fungi: are you able to keep an eye on the gitea change? | 00:55 |
clarkb | if so I'll finish this cleanup then go help with dinner | 00:55 |
fungi | yeah, i'll try to keep tabs on the servers | 00:59 |
clarkb | protip the ssh controlmaster persistent processes don't seem to get cleaned up once they get reparented to init when you kill their parents | 01:03 |
openstackgerrit | Merged opendev/system-config master: Upgrade to gitea 1.12.5 https://review.opendev.org/c/opendev/system-config/+/769225 | 01:03 |
clarkb | I got all the stale ansible-playbook processes and am now going through those ssh processes | 01:03 |
clarkb | fungi: ok I think bridge is much happier now | 01:06 |
clarkb | remote puppet else is currently running then the gitea deploy should happen | 01:06 |
fungi | yup, that's what it's looking like to me | 01:07 |
clarkb | fungi: I believe it should do the servers in order too so you can check https://gitea01.opendev.org:3000/ then 02 and so on as it works through them | 01:07 |
clarkb | fungi: looks like gitea01 just updated. I see the new version and can browse zuul via the web ui | 01:19 |
fungi | yeah, i've been testing them and not seen an issue yet | 01:31 |
fungi | and fiished | 01:32 |
fungi | finished | 01:32 |
fungi | loading the webuis has been sluggish for me, but could be my connection, or cold caches on the servers i guess | 01:32 |
fungi | poked around a bit and they seem fine | 01:33 |
fungi | once i get the assets loaded for theming and whatnot, it's snappy | 01:33 |
*** DSpider has quit IRC | 01:46 | |
*** hamalq_ has quit IRC | 01:59 | |
*** prometheanfire has quit IRC | 02:24 | |
*** prometheanfire has joined #opendev | 02:25 | |
*** tbarron has quit IRC | 02:52 | |
openstackgerrit | Brian Rosmaita proposed opendev/gerritlib master: Update documentation https://review.opendev.org/c/opendev/gerritlib/+/769241 | 03:19 |
fungi | i'm firing up an engagement stats run, queries might ratchet up the load on gerrit a little but i'll keep an eye on it, will probably run until around 06:00 based on prior experiences | 03:52 |
*** ykarel has joined #opendev | 04:26 | |
*** whoami-rajat___ has joined #opendev | 05:36 | |
*** marios has joined #opendev | 06:35 | |
*** lpetrut has joined #opendev | 07:12 | |
*** rpittau|afk is now known as rpittau | 07:33 | |
*** slaweq has joined #opendev | 07:42 | |
*** ralonsoh has joined #opendev | 07:52 | |
*** sboyron has joined #opendev | 07:53 | |
*** hashar has joined #opendev | 07:57 | |
openstackgerrit | daniel.pawlik proposed openstack/diskimage-builder master: Replace Fedora 31 functional tests with Fedora 33 https://review.opendev.org/c/openstack/diskimage-builder/+/768092 | 07:57 |
*** fressi has joined #opendev | 08:04 | |
*** andrewbonney has joined #opendev | 08:16 | |
*** tosky has joined #opendev | 08:39 | |
*** sboyron_ has joined #opendev | 09:04 | |
*** sboyron has quit IRC | 09:04 | |
*** sboyron_ has quit IRC | 09:06 | |
*** sboyron_ has joined #opendev | 09:07 | |
*** dardelean has joined #opendev | 09:07 | |
*** fressi has quit IRC | 09:08 | |
*** fressi has joined #opendev | 09:15 | |
openstackgerrit | Merged openstack/diskimage-builder master: Add Python3 Wallaby unit tests https://review.opendev.org/c/openstack/diskimage-builder/+/757247 | 09:37 |
*** tkajinam has quit IRC | 09:42 | |
*** ysandeep is now known as ysandeep|afk | 10:14 | |
*** DSpider has joined #opendev | 10:26 | |
*** rpittau is now known as rpittau|bbl | 10:32 | |
*** fressi has quit IRC | 11:03 | |
*** ysandeep|afk is now known as ysandeep | 11:26 | |
*** fressi has joined #opendev | 11:31 | |
*** sboyron_ has quit IRC | 12:07 | |
*** sboyron_ has joined #opendev | 12:07 | |
*** tbarron has joined #opendev | 12:32 | |
*** rpittau|bbl is now known as rpittau | 12:44 | |
*** sboyron_ has quit IRC | 13:08 | |
*** sboyron_ has joined #opendev | 13:09 | |
*** whoami-rajat___ is now known as whoami-rajat__ | 13:16 | |
*** fressi has quit IRC | 13:28 | |
*** fressi has joined #opendev | 13:31 | |
*** tosky has quit IRC | 14:15 | |
*** tosky_ has joined #opendev | 14:16 | |
*** tosky_ is now known as tosky | 14:16 | |
openstackgerrit | Thierry Carrez proposed openstack/project-config master: release-scripts: Remove misleading error message https://review.opendev.org/c/openstack/project-config/+/769353 | 14:30 |
*** roman_g has joined #opendev | 15:10 | |
*** ysandeep is now known as ysandeep|away | 15:13 | |
*** fressi has quit IRC | 15:17 | |
openstackgerrit | Hervé Beraud proposed zuul/zuul-jobs master: Allow to retrieve releasenotes requirements from a dedicated place https://review.opendev.org/c/zuul/zuul-jobs/+/769292 | 15:18 |
*** hashar is now known as hasharAway | 15:23 | |
*** DSpider has quit IRC | 15:43 | |
*** DSpider has joined #opendev | 15:43 | |
*** yoctozepto has quit IRC | 15:43 | |
*** yoctozepto has joined #opendev | 15:44 | |
*** roman_g has quit IRC | 15:47 | |
*** sboyron_ has quit IRC | 15:51 | |
*** rosmaita has joined #opendev | 15:59 | |
*** sboyron has joined #opendev | 16:02 | |
clarkb | fungi: looks like ethercalc, logstash-worker 17 & 20 may also be in similar situation to those other servers from yesterday. We've got some older stale ansible ssh connections to them since my previous cleanup | 16:04 |
clarkb | I half expected once the previous set was cleared out ansible would run further and find new ones. I'll take a look at them in a bit | 16:04 |
fungi | clarkb: yeah, i was getting to ethercalc after checking zm02 (we got a ticket from rax overnight about a host problem impacting it) | 16:05 |
fungi | seems to be up and working now though | 16:05 |
fungi | looks like it came up around 13:14 utc | 16:05 |
clarkb | ethercalc is up you mean or zm02? | 16:06 |
fungi | #status log zm02 was rebooted by the provider at 13:14 utc following recovery from a host outage | 16:06 |
openstackstatus | fungi: finished logging | 16:06 |
fungi | zm02 | 16:06 |
fungi | i'm checking ethercalc's console now | 16:07 |
clarkb | ethercalc seems to ask for my ssh key then stops there. the logstash workers don't even get that far. I'll reboot the logstash workers | 16:07 |
fungi | yeah, ethercalc may have lost contact with its rootfs or something | 16:07 |
fungi | the usual hung kernel tasks kmesg spam on the console | 16:08 |
fungi | i'll reboot it | 16:08 |
clarkb | ok | 16:08 |
clarkb | then I'll give it an hour and look at any leaked ansible processes (the time delta makes it easier to spot them in ps output) | 16:09 |
openstackgerrit | Merged openstack/project-config master: release-scripts: Remove misleading error message https://review.opendev.org/c/openstack/project-config/+/769353 | 16:11 |
*** sboyron has quit IRC | 16:17 | |
*** sboyron has joined #opendev | 16:17 | |
fungi | #status log rebooted ethercalc.o.o because the server hung some time in the past 24 hours | 16:20 |
openstackstatus | fungi: finished logging | 16:20 |
fungi | looks like we don't have the new ethercalc server in cacti | 16:21 |
*** lpetrut has quit IRC | 16:21 | |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Clean up ethercalc server replacement transition https://review.opendev.org/c/opendev/system-config/+/769396 | 16:29 |
fungi | that should solve it ^ | 16:29 |
*** hasharAway is now known as hashar | 16:43 | |
clarkb | fungi: https://zuul.opendev.org/t/openstack/build/d72fc062ed96415a8d62dade5503a212/log/gitea99.opendev.org/docker/gitea-docker_gitea-ssh_1.txt#5 is at least one of the problems preventing that gitea 1.13.1 upgrade from succeeding. Any idea what sshd is trying to tell us there? | 16:43 |
clarkb | also really curious why 1.12.5 was fine and 1.13.1 is not. They use the same debian:buster-slim base image | 16:45 |
clarkb | I believe that file is part of the image not bind mounted in so it should be consistent between those two builds | 16:46 |
clarkb | ya we don't bind mount it in and use an env var to change the listen port | 16:48 |
clarkb | we run /etc/s6/openssh/setup to generate host keys if necessary when starting the container which is what I think failed | 16:49 |
fungi | `docker-compose exec gitea-ssh ssh -Q key` needs to match up with HostKeyAlgorithms | 16:50 |
fungi | however we don't seem to set HostKeyAlgorithms in /etc/ssh/sshd_config so it should be the default? | 16:52 |
clarkb | fungi: right, I guess my confusion is that we're just using debian's docker image and ssh packaging. We don't appear to be changing any of that config. And it works in one case but not another | 16:52 |
clarkb | yup exactly. that is what has me extra confused (if we were editing the list we likely got it wrong or things changed under us but we don't do that) | 16:52 |
fungi | right, i'm just trying to work out where the problem lies first | 16:52 |
clarkb | is it possible buster updated openssh-server builds and broke their default config and no one has noticed because it works ok on an upgrade? | 16:52 |
clarkb | since host keys will already exist if you just update openssh-server it won't regen them and this is a new install only problem potentially | 16:53 |
fungi | fairly unlikely for something like that to break in stable | 16:53 |
fungi | but maybe it's something which broke in an update of the docker images themselves | 16:53 |
clarkb | maybe a regression in that setup script? | 16:53 |
clarkb | or a problem with how we call it directly? | 16:54 |
fungi | i'm not entirely sure how the docker images are built (and i want to say they're not really officially produced by debian, but i don't exactly recall) | 16:54 |
clarkb | oh ya it is possible the config is baked into the image and conflicts with the upstream packaging | 16:55 |
clarkb | "Debian Developers tianon and paultag" <- are the maintainers listed on docker hub | 16:55 |
fungi | right, managed by debian developers, but not necessarily produced by the debian project (to be official, builds have to be performed with dsa-controlled infrastructure, et cetera) | 16:56 |
clarkb | I guess we can try to reproduce by pulling debian:buster-slim, install openssh-server, then run the generate locally. I'll try that | 16:57 |
*** marios is now known as marios|out | 16:58 | |
fungi | as to why it hit one gitea build and not the other, was there a significant gap in when those builds ran? | 16:59 |
clarkb | I haven't checked but they were both in the queues at roughly the same time, but maybe a few minutes made all the difference here | 17:01 |
clarkb | aha | 17:04 |
clarkb | I think I may see it, need a few minutes to track it all down though | 17:04 |
fungi | can't wait for the deduction | 17:05 |
clarkb | we're reusing the ssh config from the upstream gitea docker images | 17:07 |
clarkb | and that is where sshd_config comes from | 17:07 |
clarkb | going to diff 1.25.5 and 1.31.1 contents | 17:07 |
clarkb | er 12.5 and 13.1 | 17:07 |
*** rpittau is now known as rpittau|afk | 17:08 | |
clarkb | they set CASignatureAlgorithms ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,sk-ecdsa-sha2-nistp256@openssh.com,ssh-ed25519,sk-ssh-ed25519@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-rsa and their images are based on alpine. Ours are based on debian | 17:11 |
clarkb | the mismatch must be between debian and alpine openssh-server builds | 17:11 |
fungi | aha | 17:12 |
*** marios|out has quit IRC | 17:12 | |
clarkb | sk-ecdsa-sha2-nistp256@openssh.com and sk-ssh-ed25519@openssh.com are the ones debian openssh doesn't like I think | 17:12 |
fungi | yep, that sounds entirely plausible | 17:12 |
clarkb | those are the new u2f key types. Is buster too old to support them? | 17:13 |
fungi | the sk key variants are for fido u2f "security keys" looks like | 17:14 |
fungi | yeah | 17:14 |
clarkb | ok I'm going to do the hacky thing and do a replacement that should work with debian just to see if we get a working system from that. Then we can decide how to unhack this from there | 17:14 |
fungi | added in openssh 8.2 looks like | 17:14 |
fungi | buster ships openssh 7.9 but has 8.4 in backports | 17:15 |
clarkb | oh hrm should we maybe just install a newer openssh | 17:15 |
clarkb | this is only for replication so adding fancy features like that for users doesn't help us much I don't think | 17:16 |
fungi | right, i think whichever is the simpler solution is what gets my vote | 17:17 |
*** sboyron has quit IRC | 17:22 | |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Update gitea to 1.13.1 https://review.opendev.org/c/opendev/system-config/+/769226 | 17:24 |
clarkb | I went with backports because that avoids needing to modify upstream things | 17:24 |
*** sboyron has joined #opendev | 17:24 | |
fungi | also that's using the openssh candidate for the next debian stable version (bullseye), so fewer surprises when we update the image to it i guess | 17:26 |
fungi | is repo.projects the key for the kanban bits? | 17:32 |
clarkb | fungi: ya poorly named. Writing that change was like 60% figuring that out | 17:33 |
clarkb | it jumped out to me because the repo html template added a new projects thing and I had to dig from there to figure out what it actually is | 17:34 |
clarkb | I'm 95% sure it's the kanban project management stuff | 17:34 |
fungi | clarkb: there's an e-mail confirmation probe from inmotion in the infra-root inbox, i assume that's you | 17:36 |
clarkb | it is | 17:36 |
*** ykarel has quit IRC | 17:36 | |
clarkb | however, now it wants a credit card and I'm letting them know that I will not do that after past experience with $cloud | 17:36 |
clarkb | so I'm just gonna let it sit in a half created state until they get back to me on next steps I think | 17:37 |
fungi | yeah, sensible | 17:37 |
fungi | it's not uncommon for billing departments to get their wires crossed and auto-charge folks for supposedly donated resources | 17:38 |
corvus | clarkb: who? | 17:38 |
clarkb | hpcloud | 17:38 |
corvus | i mean what are you working on now? | 17:38 |
clarkb | corvus: oh this is the inmotion hosted private cloud thing we talked about last year | 17:38 |
corvus | ok thanks :) | 17:39 |
clarkb | sounds like they are close to being able to spin something up and have asked us to create an account so I was pushing on that | 17:39 |
fungi | "InMotion Flex Metal Cloud" | 17:39 |
clarkb | sounds like I add them as owners on the account too then they can sort out billing so I've done that and am waiting for further instructions | 17:44 |
*** hashar is now known as hasharDinner | 17:49 | |
clarkb | I've updated the usual location with the details (those we have so far) | 17:51 |
clarkb | sounds like we need further setup on their side though | 17:51 |
*** d34dh0r53 has quit IRC | 17:57 | |
*** d34dh0r53 has joined #opendev | 18:08 | |
*** ralonsoh has quit IRC | 18:21 | |
corvus | i may be 5m late for mtg | 18:35 |
*** whoami-rajat__ has quit IRC | 18:39 | |
clarkb | I expect it will be pretty informal and just a catch up on things | 18:41 |
fungi | i'm happy to also be 5m late if that helps ;) | 18:42 |
*** andrewbonney has quit IRC | 19:15 | |
openstackgerrit | Jeremy Stanley proposed opendev/engagement master: Initial commit https://review.opendev.org/c/opendev/engagement/+/729293 | 19:33 |
*** _mlavalle3 has quit IRC | 19:38 | |
*** rosmaita has quit IRC | 19:45 | |
fungi | i'm getting some very minor discrepancies in subsequent runs of ^ for the same time periods, so adding more debugging output to see if i can work out whether gerrit is actually returning unstable query results | 19:48 |
*** rosmaita has joined #opendev | 20:02 | |
*** sboyron has quit IRC | 20:06 | |
*** hasharDinner is now known as hashar | 20:12 | |
*** slaweq has quit IRC | 20:33 | |
*** tosky_ has joined #opendev | 22:12 | |
*** tosky has quit IRC | 22:13 | |
*** tosky_ is now known as tosky | 22:17 | |
openstackgerrit | lotorev vitaly proposed zuul/zuul-jobs master: Clarity tox_environment accepts dictionary not list https://review.opendev.org/c/zuul/zuul-jobs/+/769433 | 22:33 |
openstackgerrit | lotorev vitaly proposed zuul/zuul-jobs master: Clarity tox_environment accepts dictionary not list https://review.opendev.org/c/zuul/zuul-jobs/+/769433 | 22:35 |
openstackgerrit | lotorev vitaly proposed zuul/zuul-jobs master: Document Python siblings handling for tox role https://review.opendev.org/c/zuul/zuul-jobs/+/768823 | 22:35 |
*** hashar has quit IRC | 22:38 | |
*** tkajinam has joined #opendev | 23:01 | |
*** diablo_rojo has joined #opendev | 23:48 | |
*** cloudnull has quit IRC | 23:59 |
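
The cleanup clarkb describes at 01:03 and 16:09 (spotting leaked ansible-playbook and ssh ControlMaster processes by their age in ps output) might be sketched roughly as follows. This is a minimal Python sketch, not the commands actually run on bridge. It assumes input shaped like the output of `ps -eo pid,ppid,etimes,args`; the names `find_stale` and `Proc`, the sample text, and the `[mux]` match string are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Sequence


class Proc(NamedTuple):
    pid: int
    ppid: int
    age_seconds: int
    args: str


def find_stale(ps_output: str, min_age_seconds: int,
               needles: Sequence[str]) -> List[Proc]:
    """Return processes at least min_age_seconds old whose args contain a needle."""
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 3)
        # Skip blank lines and the header row (first field is not numeric).
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        proc = Proc(int(fields[0]), int(fields[1]), int(fields[2]), fields[3])
        if proc.age_seconds >= min_age_seconds and any(
            needle in proc.args for needle in needles
        ):
            stale.append(proc)
    return stale


# Hand-written sample for illustration; NOT captured from bridge.
SAMPLE = """\
  PID  PPID ELAPSED COMMAND
  101     1   90000 ssh: /root/.ansible/cp/abc123 [mux]
  202   150     120 ansible-playbook site.yaml
  303     1  172800 ansible-playbook site.yaml
  404     1  500000 /usr/sbin/sshd -D
"""

# Report anything matching the (assumed) patterns that is over an hour old.
for proc in find_stale(SAMPLE, 3600, ["ansible-playbook", "[mux]"]):
    print(proc.pid, proc.ppid, proc.age_seconds, proc.args)
```

A PPID of 1 in the results corresponds to the reparented-to-init case mentioned at 01:03. The sketch only reports candidates and leaves the decision to kill them to a human.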
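
The sshd diagnosis from 17:11 to 17:15 comes down to a set difference: which names in the configured `CASignatureAlgorithms` list does the installed OpenSSH build not know about? Below is a minimal Python sketch of that comparison. The configured string is the one quoted in the log. The `ASSUMED_SUPPORTED` set and the helper name `unsupported_algorithms` are hypothetical, written by hand for illustration and not captured from any real sshd.

```python
from typing import Iterable, List


def unsupported_algorithms(configured: str, supported: Iterable[str]) -> List[str]:
    """Return configured algorithm names that are absent from supported, in order."""
    supported_set = set(supported)
    names = [name.strip() for name in configured.split(",") if name.strip()]
    return [name for name in names if name not in supported_set]


# Value quoted in the log from the upstream gitea sshd_config.
CA_SIGNATURE_ALGORITHMS = (
    "ecdsa-sha2-nistp256,ecdsa-sha2-nistp384,ecdsa-sha2-nistp521,"
    "sk-ecdsa-sha2-nistp256@openssh.com,ssh-ed25519,"
    "sk-ssh-ed25519@openssh.com,rsa-sha2-512,rsa-sha2-256,ssh-rsa"
)

# ASSUMPTION: an illustrative stand-in for what an OpenSSH build without
# FIDO/U2F support would accept; replace with data from the real system.
ASSUMED_SUPPORTED = {
    "ecdsa-sha2-nistp256",
    "ecdsa-sha2-nistp384",
    "ecdsa-sha2-nistp521",
    "ssh-ed25519",
    "rsa-sha2-512",
    "rsa-sha2-256",
    "ssh-rsa",
}

print(unsupported_algorithms(CA_SIGNATURE_ALGORITHMS, ASSUMED_SUPPORTED))
```

Under that assumed set, the only names left over are the two `sk-` entries, which matches clarkb's guess at 17:12. The fix proposed in the log (installing the newer openssh from backports) widens the supported set and leaves the upstream config unmodified.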