opendevreview | Merged opendev/system-config master: Explicitly disable offline reindexing during project renames https://review.opendev.org/c/opendev/system-config/+/880692 | 00:14 |
frickler | clarkb: fungi: I like the idea of bumping rebuild-age, start doing it for non-latest images? what value would you suggest? 2d? 7d? something in between? | 10:01 |
fungi | i'd be fine with 7 for older images (centos 7/8, ubuntu bionic/focal, debian buster/bullseye...) | 11:36 |
Clark[m] | ++ | 13:15 |
fungi | and maybe even scaling back to 48 or 72 hours for current versions of our non-default images, as those are also not used quite as much | 13:16 |
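For reference, nodepool's per-diskimage rebuild-age is given in seconds (it defaults to 86400, i.e. daily rebuilds), so the values being floated above translate roughly as in this quick sketch:

```shell
# Converting the discussed rebuild intervals into rebuild-age seconds:
echo $((2 * 24 * 3600))   # 2d  -> 172800
echo $((3 * 24 * 3600))   # 72h -> 259200
echo $((7 * 24 * 3600))   # 7d  -> 604800
```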
slittle | please set me up as first core for starlingx-app-intel-device-plugins | 16:09 |
fungi | slittle: starlingx-app-intel-device-plugins doesn't seem to exist. let me see if that's a typo or if something went sideways configuring the project | 16:16 |
fungi | starlingx-app-intel-device-plugins-core i guess you meant | 16:16 |
fungi | slittle: done! | 16:17 |
clarkb | my laptop has started producing display artifacts on the built in and external displays :/ sorry, been poking at that a bit today | 16:17 |
clarkb | But I've given up for now and have swapped over to the desktop. I'll probably need to RMA for a new shell | 16:17 |
fungi | my isp has decided ipv6 should be a giant black hole, which has been frustrating me this morning | 16:17 |
clarkb | it seems to do a fairly consistent transformation that shifts and tilts things over. I should probably try to do memcheck before starting any rma stuff. But my hunch is gpu problems | 16:18 |
fungi | emergency kernel or firmware update overnight? | 16:19 |
clarkb | no, I rebooted after the last updates and things were working. Though I suppose it could be some latent bug too | 16:20 |
fungi | but yeah, that does sound like something is degraded in video memory | 16:20 |
slittle | thanks | 16:20 |
fungi | or maybe triggered by suspend/resume or dpms blanking and waking back up | 16:20 |
fungi | i've definitely seen my share of weird display driver bugs which only appeared after the display went to sleep and then turned back on | 16:21 |
clarkb | ya, it's fully powered off now to cool off. I'll see if a cold boot produces different behavior later. Where later might even be "problem for tuesday" | 16:21 |
fungi | oh, yep heat can do it too in some cases | 16:22 |
clarkb | none of the reported temps were above 55C at least | 16:22 |
fungi | clarkb: on 893073 what do you think about putting gitea10-14 in emergency disable, taking gitea09 temporarily out of the haproxy pool, then merging the change and checking up on replication once it deploys? | 16:31 |
clarkb | fungi: I think that could work | 16:31 |
fungi | or we could just approve it, cross our fingers and try to be on top of whatever unexpected problems might arise. it is a friday | 16:31 |
clarkb | the main thing would be rerunning the job or playbook to ensure the others get updated if all is well | 16:31 |
fungi | right, which is why i'm not opposed to just merging it. the risks seem minimal and there's not a ton of activity right now anyway (it's already the weekend for half the globe, and in the usa lots of people are checking out early for the long holiday weekend) | 16:36 |
clarkb | ya, and if we do send it in and something goes wrong with replication, the most likely fix is probably replacing the gerrit key? | 16:36 |
fungi | yes, i think so | 16:37 |
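A rough sketch of what temporarily pulling gitea09 out of rotation could look like via the haproxy admin socket; the backend name, server name, and socket path here are illustrative assumptions, not taken from the log:

```shell
# Drain the backend server (names and socket path are assumptions):
echo "disable server balance_git_https/gitea09.opendev.org" | \
  sudo socat stdio /var/haproxy/run/stats
# ...merge the change, confirm replication looks good, then re-enable it:
echo "enable server balance_git_https/gitea09.opendev.org" | \
  sudo socat stdio /var/haproxy/run/stats
```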
clarkb | Actually we can probably safely revert gitea too since the gitea version is fixed | 16:37 |
clarkb | it's the underlying OS that is changing, so a revert should be fine | 16:37 |
fungi | true | 16:37 |
clarkb | I'm coming around to just going for it given ^ | 16:37 |
fungi | fairly straightforward to just fip back to the prior image | 16:37 |
fungi | er, flip back | 16:37 |
clarkb | fungi: well the prior image will be gone I think | 16:37 |
clarkb | we'd need to do an actual revert and rebuild the old state | 16:37 |
clarkb | because we use :latest | 16:37 |
clarkb | I'm coming around to just sending it | 16:38 |
fungi | and we prune the old images on the servers? or you mean avoiding locally changing the compose files | 16:38 |
clarkb | fungi: we prune the images `docker image prune -f --filter "until=72h"` <- I think that 72h is from when the image is built not when it was deployed | 16:39 |
fungi | but yeah, i don't anticipate any serious disruption, and if there is some impact to replication then slightly stale repos while we work through that are likely to go entirely unnoticed by !us | 16:39 |
clarkb | so one thing we could do is a noop rebuild of gitea on bullseye then immediately do the bookworm update so the bullseye images are less than 72 hours old | 16:39 |
clarkb | I think if we want to be careful that seems like the easiest safe option | 16:39 |
fungi | wfm | 16:40 |
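Since `until=72h` is evaluated against each image's build time rather than when it was deployed, a quick way to check whether a fallback image would survive the prune (the image name here is illustrative):

```shell
# List local image ages; anything built more than 72h ago is fair game
# for the prune filter quoted above:
docker image ls --format '{{.Repository}}:{{.Tag}}  {{.CreatedAt}}' | grep gitea
docker image inspect -f '{{.Created}}' opendevorg/gitea:latest
```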
clarkb | ok give me a few and I can stack a new change under the bookworm change that does a noop rebuild | 16:40 |
clarkb | one thing we need to be careful of is doing them in sequence such that the deploy for the first image doesn't run after the promote for the second image | 16:42 |
clarkb | because in that case we'll just pull the latest bookworm image and not step through | 16:42 |
clarkb | basically that means approve the first one. Let it deploy then approve the second one | 16:43 |
fungi | yep | 16:44 |
fungi | makes sense | 16:44 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update Gitea images to bookworm https://review.opendev.org/c/opendev/system-config/+/893073 | 16:45 |
opendevreview | Clark Boylan proposed opendev/system-config master: Rebuild gitea on bullseye https://review.opendev.org/c/opendev/system-config/+/893539 | 16:45 |
clarkb | I think that will do it | 16:45 |
TheJulia | Who wants my pubkey for autohold request 0000000249 :) | 17:44 |
fungi | TheJulia: i can take it | 17:45 |
fungi | TheJulia: ssh root@104.239.174.94 | 17:48 |
TheJulia | perfect, thanks! | 17:48 |
fungi | fun fact, my isp's ipv6 routing actually seems fine. it's just that i can't reach review.opendev.org over ipv6. i can reach our other servers in that same network over ipv6 just fine though | 17:49 |
fungi | anyone else having trouble getting to 2604:e100:1:0:f816:3eff:fe52:22de by any chance? | 17:49 |
TheJulia | works for me | 17:54 |
fungi | oh, or maybe i can't reach other stuff there. that's vexxhost ca-ymq-1 | 17:54 |
fungi | yeah, looks like my traceroute from home is dying somewhere after equinix ashburn/iad | 17:55 |
fungi | probably asymmetric return path dying elsewhere | 17:56 |
fungi | yeah, looks like something's up with the gateway in vexxhost not being able to reach my isp | 17:58 |
clarkb | fungi: does the mirror in ca-ymq-1 have similar problems for you? | 19:23 |
fungi | yeah, it's that whole network from what i can tell | 19:25 |
fungi | looks like the routers at least don't want to forward packets back to my 2600:6c60:5300:: isp | 19:26 |
fungi | or 2600:6c60:7003:: | 19:27 |
fungi | so probably anything in 2600:6c60:: | 19:27 |
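The kind of checks being described, using the address quoted earlier in the log (mtr shown as an alternative, assuming it is installed):

```shell
ping -6 -c 3 2604:e100:1:0:f816:3eff:fe52:22de   # review.opendev.org
traceroute -6 review.opendev.org                  # see where the path dies
mtr -6 --report review.opendev.org                # or watch loss over time
```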
opendevreview | Merged opendev/system-config master: Rebuild gitea on bullseye https://review.opendev.org/c/opendev/system-config/+/893539 | 19:35 |
clarkb | fungi: all of the giteas have restarted on the new images (b7cda0718262 and 474f230c47d5) | 19:50 |
clarkb | I think we can approve the bookworm update now | 19:50 |
clarkb | and since those two images are only an hour or so old they shouldn't get pruned when bookworm deploys | 19:51 |
fungi | done | 19:52 |
fungi | i concur | 19:52 |
clarkb | excellent | 19:52 |
opendevreview | Merged opendev/system-config master: Update Gitea images to bookworm https://review.opendev.org/c/opendev/system-config/+/893073 | 21:29 |
clarkb | still waiting for gitea09 to upgrade | 21:31 |
clarkb | I expect that will happen at any moment though | 21:31 |
clarkb | gitea09 is running the bookworm image now | 21:32 |
clarkb | I think the gerrit replication log shows replication completing to gitea09 after gitea09 was updated | 21:35 |
clarkb | but it's close, so it would be good to get more data | 21:35 |
clarkb | ok just now tempest had a replication event trigger (I think due to a review being posted, not a change being pushed). I could see all but 13 succeed. 13 refused the connection, I believe its sshd was down as part of the upgrade. Then after the retry backoff it tried again and succeeded to gitea13. | 21:39 |
clarkb | that is looking good, but harder to confirm it worked since it wasn't a regular change ref I can fetch but a review instead | 21:39 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM Forced fail on Gerrit to check bookworm/java 17 update https://review.opendev.org/c/opendev/system-config/+/893571 | 21:45 |
clarkb | I'll put a hold in place for ^ and then we can see if that ref is fetchable from the giteas | 21:45 |
clarkb | the deployment job did end up succeeding | 21:46 |
clarkb | I was able to fetch ^ using `git fetch origin refs/changes/71/893571/1` where origin is https://opendev.org/opendev/system-config (fetch), so I think this is all happy | 21:48 |
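Spelled out, the verification against the gitea-backed remote looks like this, with the change ref taken from the DNM change above:

```shell
git remote -v                      # origin  https://opendev.org/opendev/system-config (fetch)
git fetch origin refs/changes/71/893571/1
git log -1 --oneline FETCH_HEAD    # should show the replicated patchset
```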
fungi | yep | 21:56 |
fungi | sorry, stepped away at the wrong moment to eat | 21:57 |
fungi | but it's all testing clean for me | 21:57 |
fungi | replication looks good | 21:58 |
fungi | deploy job completed successfully so all the servers are updated | 21:59 |
clarkb | yup. This is what we expected but given the trouble MINA had previously it was worth being cautious | 22:11 |
clarkb | it was also nice to see that the logs show replication working across a gitea restart (when done properly at least) | 22:14 |
clarkb | builds more confidence in the system | 22:14 |
fungi | agreed | 22:15 |
clarkb | heads up there is a jitsi release that just got made | 22:16 |
clarkb | we'll auto upgrade during the daily runs. They've all been happy so far though since you modernized the config | 22:16 |
fungi | yeah, we should probably still test next week just to be sure | 22:29 |
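The daily auto-upgrade amounts to roughly the following on the meetpad host (a sketch of the effect, not the actual playbook tasks):

```shell
# Pull whatever the :latest tags now point at and recreate the containers:
docker-compose pull
docker-compose up -d
```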