Saturday, 2023-12-09

clarkb	other thoughts: what if borg is failing too	00:02
clarkb	and that breaks the pipe and then mysql can no longer write	00:02
clarkb	the way logging is set up in the borg stream setup is that it logs the first error in the pipeline, which implies the problem is in the first command to fail, but I suppose it could be the second command (borg-backup) that is erroring	00:03
clarkb	I'm guessing that would be a good thing to try and get better logs for first	00:03
fungi	clarkb: i suspect there's some delay between when the dump is initiated and when the data begins to flow. if there's some middlebox terminating "idle" ssh sessions then using ssh keepalives might help?	13:10
opendevreview	Birger J. Nordølum proposed openstack/diskimage-builder master: feat: add almalinux-container element https://review.opendev.org/c/openstack/diskimage-builder/+/883855	13:57
opendevreview	Merged openstack/project-config master: Remove retired js-openstack-lib from infra https://review.opendev.org/c/openstack/project-config/+/798529	13:57
fungi	pbr.build made it into the charts on https://discuss.python.org/t/40629	14:07
Clark[m]	fungi: the error occurs after a minute or two. It shouldn't be long enough for typical keepalives to help. It's a really weird one. After sleeping on it I think I want to try the dump to file then cat it out process manually to see if that has different behavior. And also update the script to check errors for both sides of the pipe	15:27
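Checking errors on both sides of the pipe, as suggested above, could look roughly like the following. This is a minimal sketch and not the actual backup script: `run_pipeline` is a hypothetical helper, and the two commands in the usage lines are stand-ins for the real dump and borg invocations, which are not shown in this log.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Run `producer | consumer` and return (producer_rc, consumer_rc).

    Hypothetical helper: both exit codes are reported, so a failure on
    the consuming side is not hidden behind the producer's broken pipe.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Close the parent's copy of the read end so the producer sees a
    # broken pipe if the consumer exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc


# Stand-in commands (assumptions): the producer plays the role of the
# database dump, the consumer plays the role of the backup tool.
dump_cmd = [sys.executable, "-c", "print('data')"]
backup_cmd = [sys.executable, "-c", "import sys; sys.stdin.read(); sys.exit(3)"]

producer_rc, consumer_rc = run_pipeline(dump_cmd, backup_cmd)
if producer_rc != 0 or consumer_rc != 0:
    print(f"producer exited {producer_rc}, consumer exited {consumer_rc}")
```

Logging both exit codes makes it possible to tell whether the dump failed on its own or only failed because the receiving side went away first.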
fungi	i remember years ago the default "idle" session timeout on both juniper netscreen and f5 bigip was 120 seconds unless configured otherwise	15:29
fungi	so i usually set my ssh servers and clients to a 60-second keepalive ping	15:30
Clark[m]	Huh that seems really low. I guess we can try that	15:30
fungi	it does indeed seem low, when by comparison openbsd was something like 2 weeks	15:30
Clark[m]	That would go into .ssh/config?	15:30
fungi	yes	15:31
fungi	ServerAliveInterval is what you want in .ssh/config, according to the manpage	15:33
fungi	"Sets a timeout interval in seconds after which if no data has been received from the server, ssh(1) will send a message through the encrypted channel to request a response from the server."	15:34
fungi	it defaults to 0, which disables that behavior	15:35
fungi	there's also TCPKeepAlive but i've seen middleboxes ignore those	15:36
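The keepalive setting discussed above would look something like this in .ssh/config. This is a sketch only: the host name is a placeholder, the 60-second interval is the value fungi mentions, and ServerAliveCountMax is shown at its documented default of 3.

```
# Placeholder host name (assumption); substitute the real backup server.
Host backup.example.org
    # Send an in-channel keepalive after 60 seconds without data from the server.
    ServerAliveInterval 60
    # Give up after this many unanswered keepalives (3 is the default).
    ServerAliveCountMax 3
```

Because these messages travel inside the encrypted channel, a middlebox sees them as ordinary session traffic, unlike TCPKeepAlive probes.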
JayF	fungi: is the `openstackci` pypi account human-accessible at all?	21:16
JayF	fungi: asking in context of https://github.com/eventlet/eventlet/issues/824#issuecomment-1848673640 -- I'm thinking given that's a github repo, and we don't know the shape of the CI	21:16
JayF	might be wise to have openstackci + a human (me, for now?) get granted access	21:16
JayF	but I figured I'd ask about how frequently humans have to use that account already before worrying about complicating the message	21:17
fungi	JayF: yes, opendev sysadmins have access to shared login credentials and otp for that account	21:43
Clark[m]	I think uploads use tokens now? So in theory we can make a token for that repo and give that to the maintainers but I'm not sure if they can be scoped like that	22:27
fungi	they can, there are account-scoped and project-scoped upload tokens	23:41
