opendevreview | Felix Edel proposed zuul/zuul-jobs master: mirror-workspace-git-repos: Retry on failure in git update task https://review.opendev.org/c/zuul/zuul-jobs/+/902907 | 08:07 |
---|---|---|
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: rocky-container: Add installation of Minimal Install group https://review.opendev.org/c/openstack/diskimage-builder/+/899372 | 08:37 |
opendevreview | Michal Nasiadka proposed openstack/diskimage-builder master: rocky-container: Add installation of Minimal Install group https://review.opendev.org/c/openstack/diskimage-builder/+/899372 | 10:21 |
opendevreview | James E. Blair proposed zuul/zuul-jobs master: mirror-workspace-git-repos: Retry on failure in git update task https://review.opendev.org/c/zuul/zuul-jobs/+/902907 | 14:38 |
opendevreview | Merged zuul/zuul-jobs master: mirror-workspace-git-repos: Retry on failure in git update task https://review.opendev.org/c/zuul/zuul-jobs/+/902907 | 15:04 |
clarkb | the gitea09 backups to the one backup server are still failing... | 22:58 |
clarkb | I'm going to try a manual run | 22:58 |
clarkb | it fails when run manually so now we know the periodic jobs aren't at fault. The row it complained about changed between the last automated run and my manual run | 23:06 |
clarkb | after realizing I needed to set -o pipefail for accurate test results: running `bash /etc/borg-streams/mysql | gzip -9 > clarkb_test_db_backup.sql.gz` locally on the server, without piping to borg to stream off-server, succeeds. Which I expected, because the other backup host is backing up just fine | 23:24 |
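A minimal illustration of why `set -o pipefail` matters for judging these test runs (a standalone sketch, not the actual /etc/borg-streams/mysql script):

```bash
#!/bin/bash
# Without pipefail, a pipeline's exit status is that of the last command,
# so a failing dump hidden behind gzip still reports success.
false | gzip -9 > /dev/null
echo "without pipefail: $?"   # 0 -- the failure of `false` is masked

set -o pipefail
false | gzip -9 > /dev/null
echo "with pipefail: $?"      # 1 -- the pipeline now reflects the failure
```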
clarkb | this leads me to think that the problem has to do with the network connection between gitea09 and the vexxhost backup server causing backpressure in the stream such that mysqldump hits a network error | 23:25 |
opendevreview | Ghanshyam proposed openstack/project-config master: Remove retired js-openstack-lib from infra https://review.opendev.org/c/openstack/project-config/+/798529 | 23:27 |
clarkb | I tried adding --max-allowed-packet=256M since the internets say one reason these sorts of errors can occur is having the packet size too small for a row | 23:41 |
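A hedged sketch of where that flag would go; the real dump invocation inside /etc/borg-streams/mysql isn't shown in this log, so the container name and other options here are assumptions:

```bash
# Hypothetical dump command with the larger client-side packet limit applied.
# The actual container name and option set on gitea09 may differ.
docker exec mariadb mysqldump \
  --all-databases --single-transaction \
  --max-allowed-packet=256M
```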
clarkb | however, I didn't really expect that to help because the other backup works, and if the packet size were the issue I would expect this to be a universal problem | 23:42 |
clarkb | I undid that manual change and the server is back to the way it started. I think I'm going to need to sleep on this one. It feels like the sort of bug where bashing my head against it isn't going to help, since it has to do with buffer/networking/mariadb stuff | 23:44 |
clarkb | one thing I think we could do as a workaround is to have the backup write to a tmpfile on disk, cat the file to stream it out, then rm the file | 23:44 |
clarkb | then we'd replace the mysqldump that streams directly into borg-backup with a mysqldump to disk, then cat/zcat into borgbackup | 23:45 |
clarkb | ianw: ^ fyi struggles with the streaming backups | 23:45 |
clarkb | not sure if you have seen similar before and may have pointers | 23:45 |
clarkb | one thing that just occurred to me: This could be a regression in mariadb or mariadbdump/mysqldump since one of the things that does change over time is our mariadb container image | 23:47 |
clarkb | I've put everything back the way it was before. I suspect this will continue to fail until we do something, or if this is a mariadb regression they fix it and it magically goes away. | 23:51 |
clarkb | the more I think about it the more I like the idea of using a staging file locally. We should be able to do something like TMPFILE=$(mktemp tmp.XXXXXXXXXX.sql.gz) && <our current docker exec command> | gzip -9 > $TMPFILE && zcat $TMPFILE ; rm $TMPFILE | 23:53 |
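A hedged, spelled-out version of that staging-file idea; `docker exec mariadb mysqldump --all-databases` stands in for whatever the current stream script actually runs, so treat those names as assumptions:

```bash
#!/bin/bash
set -o pipefail

# Stage the dump on local disk so mysqldump never stalls on the network
# path to the backup server, then replay the file into the borg stream.
TMPFILE=$(mktemp tmp.XXXXXXXXXX.sql.gz)
docker exec mariadb mysqldump --all-databases | gzip -9 > "$TMPFILE" \
  && zcat "$TMPFILE"
rm -f "$TMPFILE"
```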
clarkb | that said debugging help to better understand is probably the first order of business | 23:54 |