| *** dmellado2 is now known as dmellad | 07:14 | |
| *** dmellad is now known as dmellado | 07:14 | |
| *** janders1 is now known as janders | 13:12 | |
| *** diablo_rojo_phone is now known as Guest26643 | 13:12 | |
| opendevreview | sean mooney proposed zuul/zuul-jobs master: make distro and pypi mirror configuration conditional https://review.opendev.org/c/zuul/zuul-jobs/+/961369 | 13:28 |
| *** janders0 is now known as janders | 13:49 | |
| *** open10k8s_ is now known as open10k8s | 13:49 | |
| *** clarkb is now known as Guest26673 | 13:57 | |
| *** acoles_ is now known as acoles | 14:00 | |
| fungi | the mirror.ubuntu volume move has been going for 26 hours at this point | 14:41 |
| fungi | if it only takes as long as mirror.ubuntu-ports then it should finish by the time of our meeting, though i have a feeling it will take longer since it still has bionic packages | 14:42 |
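For reference, the long-running move fungi describes is an OpenAFS `vos move`; a minimal sketch of the sequence, with fileserver and partition names as placeholders (the actual servers are not named in the log):

```bash
# Move the read-write volume to a new fileserver/partition. OpenAFS keeps
# the volume online while copying, which is why this can run for 26+ hours.
vos move mirror.ubuntu -fromserver afs01.example.org -frompartition vicepa \
    -toserver afs02.example.org -topartition vicepa -localauth

# Once the move completes, push the RW contents out to the read-only
# replicas that the mirror web servers actually serve from.
vos release mirror.ubuntu -localauth
```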
| *** Guest26673 is now known as clarkb | 14:46 | |
| mnasiadka | Ah, that's why some Kolla Ubuntu builds are failing ;-) | 15:13 |
| fungi | mnasiadka: which ones? arm64 on bionic? | 15:14 |
| mnasiadka | nope, some stale mirror it seems for Noble | 15:15 |
| mnasiadka | https://zuul.opendev.org/t/openstack/build/30691c49d8cb485093a2958a59ea2b25/log/kolla/build/000_FAILED_manila-share.log | 15:15 |
| fungi | the ubuntu mirror should only be at most 28 hours stale right now | 15:16 |
| mnasiadka | I think the discrepancy is Kolla using ubuntu cloud archive upstream and the standard Noble repos from mirror | 15:18 |
| mnasiadka | Is there a UCA mirror in OpenDev - or should we think of having one after all these moves? | 15:18 |
| fungi | aha, yeah uca may have newer dependencies than are in our mirror if they updated libsqlite3-0 in the past day | 15:19 |
| fungi | packages.ubuntu.com is being really obstinate in returning content for the updated package | 15:22 |
| fungi | looks like it may even be disjoint on their own mirror network | 15:22 |
| clarkb | mnasiadka: we have a uca mirror but it is updated separately from the main repos | 15:22 |
| clarkb | so the same disconnect can occur I think | 15:23 |
| mnasiadka | oh well | 15:23 |
| mnasiadka | even if these ran in the same cron script - I assume disconnects could occur | 15:24 |
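A rough illustration of the skew mnasiadka is describing: two apt sources refreshed on independent schedules. The mirror hostname and the UCA pocket name here are assumptions for the sketch:

```bash
# OpenDev's ubuntu mirror, refreshed by the reprepro cron discussed below:
#   deb http://mirror.dfw.rax.opendev.org/ubuntu noble-updates main universe
# Ubuntu Cloud Archive, fetched directly from upstream by the kolla builds:
#   deb http://ubuntu-cloud.archive.canonical.com/ubuntu noble-updates/dalmatian main

# If UCA ships packages built against a libsqlite3-0 newer than the mirror
# has synced yet, installs fail; the skew is visible with:
apt-cache policy libsqlite3-0
```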
| fungi | well, hopefully the mirror will be up to date by tomorrow | 15:24 |
| fungi | https://changelogs.ubuntu.com/changelogs/pool/main/s/sqlite3/sqlite3_3.45.1-1ubuntu2.5/changelog is persistently returning a 404 response | 15:27 |
| fungi | seems like this may be an update in progress | 15:27 |
| clarkb | https://opendev.org/openstack/kolla-ansible/commit/3be5a3852def1b4580c439809067bda7a2c49730 exists now so something must've triggered broader replication for kolla-ansible | 15:50 |
| fungi | eventually consistent replicas | 15:56 |
| clarkb | fungi: I rechecked https://review.opendev.org/c/opendev/system-config/+/958666 so that we can get that ready for merging when you're ready on the afs side of things | 15:59 |
| fungi | thanks! | 15:59 |
| opendevreview | Merged opendev/system-config master: Expand old chrome UA rules https://review.opendev.org/c/opendev/system-config/+/960399 | 16:04 |
| clarkb | I'm starting to look very briefly at summit scheduling and I think Friday night might be the only night I personally have free to try and do a get-together | 16:08 |
| clarkb | so if there is interest in having an opendev (and probably zuul) dinner type thing try to keep friday night free! | 16:08 |
| clarkb | fungi and corvus are going to be there, not sure who else is ^ but pass that along I guess. Note I don't think I'll be organizing anything super formal, more of just a let's aim for that day and then find something that works | 16:09 |
| *** dan_with__ is now known as dan_with | 16:10 | |
| corvus | "try to coalesce into a social blob and imbibe sustenance" sounds like a plan! :) | 16:12 |
| fungi | infra-root: 960399 just finished deploying (to gitea, mailman, static, and zuul), so keep an eye out for any reports of rejected web requests on those sites/services | 16:21 |
| fungi | i guess we don't update gerrit with those | 16:21 |
| clarkb | we've been selective in deploying that to things that have had struggles with the crawlers I think | 16:22 |
| fungi | i couldn't remember which services used it and whether we remembered to trigger deployment jobs for all of them when the file changes | 16:26 |
| fungi | interesting... there's a lists01.openstack.org/main01 cinder volume in rax-dfw which isn't in use, created_at is the same day as the lists01.opendev.org server instance | 17:49 |
| fungi | aside from having the wrong name, it's also sata not ssd | 17:50 |
| clarkb | it's possible the sata volumes are also speedy? we can probably profile that somewhere. Or just use ssd because we know we're sensitive to iowait and want to avoid the problem as much as possible | 17:50 |
| fungi | i'm checking the current volume of data in /var/lib/mailman to confirm whether the minimum 100gb volume size will still be sufficient or if we need something larger | 17:51 |
| clarkb | don't forget to check the database size too | 17:51 |
| fungi | it's also in there | 17:51 |
| clarkb | ah ok | 17:52 |
| *** Guest26643 is now known as diablo_rojo_phone | 17:52 | |
| fungi | /var/lib/mailman/database on the host is mounted as /var/lib/mysql in the mariadb container | 17:52 |
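A sketch of the mount arrangement fungi describes, written as the equivalent `docker run` (the real deployment uses a compose file in system-config; the container name and image tag here are assumptions):

```bash
# The host directory holding mailman's database files is what the mariadb
# container sees as its datadir, so /var/lib/mailman captures everything.
docker run --name mailman-database \
    -v /var/lib/mailman/database:/var/lib/mysql \
    -d mariadb:lts
```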
| fungi | 41G /var/lib/mailman | 18:13 |
| fungi | so 100gb should be plenty | 18:13 |
| clarkb | might want to give some headroom? | 18:15 |
| clarkb | I think that the indexes in particular may be a concern if xapian is anything like lucene | 18:15 |
| clarkb | I guess with lvm we can add a second volume and expand the fs without too much fuss if that becomes necessary | 18:16 |
| fungi | that seems like ample headroom anyway, but yes we can always use pvmove onto a larger volume if we want with no downtime | 18:16 |
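A sketch of the pvmove-based migration fungi alludes to, assuming the volume group is named `main` (consistent with the `/dev/main/mailman` path below) and the replacement cinder volume attaches as /dev/xvdc:

```bash
# Add the new, larger volume to the existing volume group
pvcreate /dev/xvdc
vgextend main /dev/xvdc

# Migrate all extents off the old volume; the filesystem stays mounted and
# writable the whole time
pvmove /dev/xvdb /dev/xvdc

# Drop the old volume from the VG, then grow the LV and the filesystem
vgreduce main /dev/xvdb
lvextend -l +100%FREE /dev/main/mailman
resize2fs /dev/main/mailman
```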
| fungi | i've created a 100gb ssd volume named lists01.opendev.org/main01 and attached it to the server as /dev/xvdb, put a logical volume on it using all available extents and formatted it ext4 | 18:18 |
| fungi | i've temporarily mounted /dev/main/mailman at /mnt and will get an initial rsync going to it from /var/lib/mailman | 18:19 |
| fungi | `rsync -Sax --delete --info=progress2 /var/lib/mailman/ /mnt/` is running in a root screen session on lists01 | 18:22 |
| fungi | in other news, the mirror.ubuntu volume move is approaching the 30 hour mark, still going from what i can tell | 18:26 |
| frickler | looks like the mirror updates are still running, just the vos release is failing for them? might be better to stop those until the move is finished? | 18:41 |
| fungi | we could temporarily comment out the ubuntu reprepro cronjob and put the mirror-update server in the emergency disable list, though it's probably not got much longer to go at this point | 18:52 |
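Sketched out, the pause fungi describes would look something like this (the crontab match and the exact emergency-list mechanics are assumptions, not copied from system-config):

```bash
# On mirror-update, comment out the ubuntu reprepro cron entry in place:
crontab -u root -l | sed '/reprepro.*ubuntu/ s/^/#/' | crontab -u root -

# ...then add the server to the emergency disable list on the bastion host,
# so the periodic ansible runs don't reinstate the crontab before the
# vos move finishes.
```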
| fungi | omw to an early dinner, bbl | 19:45 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Switch generic container role image builds back to docker https://review.opendev.org/c/opendev/system-config/+/961410 | 19:47 |
| clarkb | this is the change to flip our default image builder back to docker (everything should already be using docker due to explicit overrides, this is just ensuring our default matches our intention going forward) | 19:47 |
| clarkb | fungi: I left some notes on your etherpad. Overall looks good just some minor things to consider | 19:51 |
| fungi | i suppose i could rename to something other than /var/lib/mailman.old that wouldn't risk getting backed up in the few minutes between those steps | 21:25 |
| fungi | it's on the rootfs so a rename to basically anywhere would still be atomic | 21:26 |
| clarkb | ya that would also work | 21:26 |
| clarkb | I just don't want us to accidentally make backups carry data we don't want | 21:26 |
| fungi | i'm not super familiar with what we include/exclude. what path would you recommend? | 21:28 |
| clarkb | fungi: https://opendev.org/opendev/system-config/src/commit/03ba936444f4b2e42b08981b549e53b90b267814/playbooks/roles/borg-backup/defaults/main.yaml this is the default set of rules | 21:29 |
| clarkb | /var/cache might be a good choice | 21:29 |
| clarkb | I think that isn't mounted as tmpfs or anything | 21:29 |
| clarkb | and it is in the exclude list | 21:30 |
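Assembled from the discussion above, a sketch of the eventual cutover (the compose file path is an assumption; the real steps live in the etherpad):

```bash
# 1. Stop the mailman containers so nothing writes during the final pass
docker-compose -f /etc/mailman-compose/docker-compose.yaml down

# 2. Final rsync to pick up changes since the initial pre-sync
rsync -ax --delete /var/lib/mailman/ /mnt/

# 3. Park the original under /var/cache (excluded from borg backups; the
#    rename is atomic since both paths are on the rootfs), then remount the
#    new volume in its place
mv /var/lib/mailman /var/cache/mailman.old
umount /mnt
mkdir /var/lib/mailman
mount /dev/main/mailman /var/lib/mailman   # plus a matching /etc/fstab entry

# 4. Bring services back up; keep /var/cache/mailman.old until all is well
docker-compose -f /etc/mailman-compose/docker-compose.yaml up -d
```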
| fungi | running fio on /mnt now following the same options you last used on the rootfs for comparison | 21:32 |
| clarkb | note it created the four fungi-test.* files. Don't delete them until you're done as it will reuse them if you run multiple passes | 21:33 |
| fungi | READ: bw=111MiB/s (116MB/s), 111MiB/s-111MiB/s (116MB/s-116MB/s), io=3337MiB (3499MB), run=30036-30036msec | 21:33 |
| clarkb | fungi: if you look near the top of the output it gives what I think is a better summary as it includes iops | 21:33 |
| fungi | bw ( KiB/s): min=109654, max=115386, per=100.00%, avg=113857.77, stdev=330.62, samples=239 | 21:34 |
| fungi | iops : min=27412, max=28846, avg=28464.11, stdev=82.68, samples=239 | 21:34 |
| fungi | that? | 21:34 |
| clarkb | no it looks like read: IOPS=814, BW=3260KiB/s (3338kB/s)(95.8MiB/30085msec) | 21:35 |
| clarkb | so it includes IOPS and bw on the same line which I feel is a better summary | 21:35 |
| fungi | read: IOPS=28.4k, BW=111MiB/s (116MB/s)(3337MiB/30036msec) | 21:35 |
| clarkb | fungi: and was that for read or randread? | 21:36 |
| clarkb | (I think they both say read: in the output unfortunately) | 21:36 |
| fungi | i don't see a randread in the output | 21:37 |
| clarkb | fungi: its in the command you run --rw=read or --rw=randread iirc | 21:37 |
| clarkb | it selects which type of test to perform. | 21:38 |
| fungi | oh, i ran it with --rw=read | 21:38 |
| fungi | since that was the last way you ran it | 21:38 |
| clarkb | ya I was collecting both sets of data | 21:38 |
| clarkb | read is sequential and randread is random. I think the randread value is more important here (at least it was the test that did not do great on the existing disk) | 21:39 |
| clarkb | so just rerun the command with the different test selected and it should spit out a similar report | 21:39 |
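The exact fio invocation isn't pasted into the log, but the reported numbers (four fungi-test.* files, ~30 s runs, and 28.4k IOPS at 4 KiB blocks working out to 111 MiB/s) are consistent with something like the following; `--size` is a guess:

```bash
# Sequential read baseline
fio --name=fungi-test --directory=/mnt --rw=read --bs=4k --size=1g \
    --numjobs=4 --time_based --runtime=30 --group_reporting

# Random read, the number that matters more for mailman's workload
fio --name=fungi-test --directory=/mnt --rw=randread --bs=4k --size=1g \
    --numjobs=4 --time_based --runtime=30 --group_reporting
```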
| clarkb | but both pieces of info are helpful | 21:39 |
| fungi | https://paste.opendev.org/show/bwMpEzmUCJPT7TCZTNXR/ | 21:41 |
| fungi | that's for randread | 21:41 |
| clarkb | that looks very similar to mirror randread `read: IOPS=20.0k, BW=78.2MiB/s (82.0MB/s)(2346MiB/30004msec)` | 21:41 |
| clarkb | so I don't think switching to the performance flavor is going to be any better for us | 21:42 |
| fungi | so maybe ssd and sata performance are similar | 21:42 |
| clarkb | fungi: I ran that on the mirrors root disk not its cache volume fwiw | 21:42 |
| clarkb | in any case that is much much better than the current randread ~1k IOPS number so yes this should be an improvement | 21:42 |
| fungi | ah, so could also be that the rootfs on performance is ssd or has similar performance characteristics | 21:42 |
| clarkb | yup | 21:42 |
| fungi | plan in the etherpad is updated based on your feedback | 21:46 |
| clarkb | what does rsync -S get us here? Just smaller transfers? | 21:50 |
| clarkb | my main concern is that some things may not like that. I know we can't sparse out swap files anymore for example. Wonder if mariadb in particular might have issues with that | 21:50 |
| fungi | oh, old habit, i can drop it and rerun the pre-sync | 21:50 |
| fungi | edited the plan | 21:51 |
| clarkb | ya just thinking if the database is preallocating space for $reasons that might not produce a happy result (I have no evidence this is the case but do know databases do wizardly things with disk stuff) | 21:51 |
| clarkb | I think the plan lgtm now | 21:51 |
| fungi | i want to say ~ancient rsync did not preserve sparseness and the meaning of the -S option changed over time | 21:52 |
| clarkb | my local manpage says `turn sequences of nulls into sparse blocks` | 21:54 |
| clarkb | oh one other thought: Maybe we shut down everything, then do a mysqldump backup, then shut down mariadb | 21:54 |
| clarkb | *shut down everything but mariadb | 21:54 |
| clarkb | that way we have a database backup that doesn't rely on the underlying backing files (in theory, since we're moving around on the same filesystem with no applications running, that is safe anyway, but belts and suspenders if rsync does something we don't expect) | 21:55 |
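The belt-and-suspenders dump clarkb suggests would be roughly the following; the container name and the root-password environment variable (from the upstream mariadb image) are assumptions:

```bash
# With everything but mariadb stopped, take a consistent logical dump that
# doesn't depend on the InnoDB files being copied correctly:
docker exec mailman-database sh -c \
    'mysqldump -uroot -p"$MARIADB_ROOT_PASSWORD" --all-databases --single-transaction' \
    > /root/mailman-pre-move-dump.sql

# ...then stop mariadb itself before the final rsync.
```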
| clarkb | I don't know how extra careful we want to be | 21:56 |
| fungi | or maybe we just don't do the final rm of the original directory we moved until we're sure all is well? | 21:59 |
| clarkb | ya that is another good option | 21:59 |
| clarkb | since a mv should be largely transparent if we need to shift things back | 22:00 |
| fungi | yes, the files are being left entirely untouched, as long as we assume rsync doesn't modify the source side in any way, which has always been my understanding | 22:01 |
| clarkb | ya I think we can probably trust rsync there | 22:03 |
| fungi | i made a note | 22:05 |
| fungi | napkin math, taking the slow rsync reads from the rootfs into account, puts the outage at ~30 minutes | 22:06 |
| fungi | inbound deliveries should get temporarily queued by exim while mailman is offline, so posts will only be delayed in theory | 22:07 |
| clarkb | and in theory they should retry for like a day or two right? | 22:07 |
| clarkb | but I think if we announce it even if some deliveries fail thats ok | 22:08 |
| fungi | maybe end of this week, friday afternoon my time? say... 20:00 utc? | 22:13 |
| fungi | mail volume tends to be really low then | 22:14 |
| clarkb | I should be around then | 22:14 |
| fungi | i'll go ahead and send something to service announce for that time | 22:18 |
| fungi | sent | 22:30 |
| clarkb | fungi: any idea if I need to moderate it through? I don't see it in my inbox but I also don't see the moderation request either | 22:33 |
| clarkb | maybe I just need to be patient | 22:33 |
| fungi | i received it right away | 22:33 |
| clarkb | I just got it | 22:34 |
| clarkb | so yes patience was required | 22:34 |
| fungi | i blocked out 19:30-21:30 utc on my calendar for it, just in case | 22:36 |