opendevreview | Merged opendev/system-config master: Explicitly disable offline reindexing during project renames https://review.opendev.org/c/opendev/system-config/+/880692 | 00:14 |
frickler | clarkb: fungi: I like the idea of bumping rebuild-age, start doing it for non-latest images? what value would you suggest? 2d? 7d? something in between? | 10:01 |
fungi | i'd be fine with 7 for older images (centos 7/8, ubuntu bionic/focal, debian buster/bullseye...) | 11:36 |
Clark[m] | ++ | 13:15 |
fungi | and maybe even scaling back to 48 or 72 hours for current versions of our non-default images, as those are also not used quite as much | 13:16 |
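For reference, nodepool's per-diskimage rebuild-age is given in seconds (it defaults to 86400, i.e. daily rebuilds), so the values being floated above translate roughly as in this quick sketch:

```shell
# Converting the discussed rebuild intervals into rebuild-age seconds:
echo $((2 * 24 * 3600))   # 2d  -> 172800
echo $((3 * 24 * 3600))   # 72h -> 259200
echo $((7 * 24 * 3600))   # 7d  -> 604800
```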
slittle | please set me up as first core for starlingx-app-intel-device-plugins | 16:09 |
fungi | slittle: starlingx-app-intel-device-plugins doesn't seem to exist. let me see if that's a typo or if something went sideways configuring the project | 16:16 |
fungi | starlingx-app-intel-device-plugins-core i guess you meant | 16:16 |
fungi | slittle: done! | 16:17 |
clarkb | my laptop has started producing display artifacts on the built in and external displays :/ sorry, been poking at that a bit today | 16:17 |
clarkb | But I've given up for now and have swapped over to the desktop. I'll probably need to RMA for a new shell | 16:17 |
fungi | my isp has decided ipv6 should be a giant black hole, which has been frustrating me this morning | 16:17 |
clarkb | it seems to do a fairly consistent transformation that shifts and tilts things over. I should probably try to do memcheck before starting any rma stuff. But my hunch is gpu problems | 16:18 |
fungi | emergency kernel or firmware update overnight? | 16:19 |
clarkb | no, I rebooted after the last updates and things were working. Though I suppose it could be some latent bug too | 16:20 |
fungi | but yeah, that does sound like something is degraded in video memory | 16:20 |
slittle | thanks | 16:20 |
fungi | or maybe triggered by suspend/resume or dpms blanking and waking back up | 16:20 |
fungi | i've definitely seen my share of weird display driver bugs which only appeared after the display went to sleep and then turned back on | 16:21 |
clarkb | ya, it's fully powered off now to cool off. I'll see if a cold boot produces different behavior later. Where later might even be "problem for tuesday" | 16:21 |
fungi | oh, yep heat can do it too in some cases | 16:22 |
clarkb | none of the reported temps were above 55C at least | 16:22 |
fungi | clarkb: on 893073 what do you think about putting gitea10-14 in emergency disable, taking gitea09 temporarily out of the haproxy pool, then merging the change and checking up on replication once it deploys? | 16:31 |
clarkb | fungi: I think that could work | 16:31 |
fungi | or we could just approve it, cross our fingers and try to be on top of whatever unexpected problems might arise. it is a friday | 16:31 |
clarkb | the main thing would be rerunning the job or playbook to ensure the others get updated if all is well | 16:31 |
fungi | right, which is why i'm not opposed to just merging it. the risks seem minimal and there's not a ton of activity right now anyway (it's already the weekend for half the globe, and in the usa lots of people are checking out early for the long holiday weekend) | 16:36 |
clarkb | ya, and if we do send it in and something goes wrong with replication, the most likely fix is probably replacing the gerrit key? | 16:36 |
fungi | yes, i think so | 16:37 |
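A rough sketch of what temporarily pulling gitea09 out of rotation could look like via the haproxy admin socket; the backend name, server name, and socket path here are illustrative assumptions, not taken from the log:

```shell
# Drain the backend server (names and socket path are assumptions):
echo "disable server balance_git_https/gitea09.opendev.org" | \
  sudo socat stdio /var/haproxy/run/stats
# ...merge the change, confirm replication looks good, then re-enable it:
echo "enable server balance_git_https/gitea09.opendev.org" | \
  sudo socat stdio /var/haproxy/run/stats
```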
clarkb | Actually we can probably safely revert gitea too since the gitea version is fixed | 16:37 |
clarkb | it's the underlying OS that is changing, so a revert should be fine | 16:37 |
fungi | true | 16:37 |
clarkb | I'm coming around to just going for it given ^ | 16:37 |
fungi | fairly straightforward to just fip back to the prior image | 16:37 |
fungi | er, flip back | 16:37 |
clarkb | fungi: well the prior image will be gone I think | 16:37 |
clarkb | we'd need to do an actual revert and rebuild the old state | 16:37 |
clarkb | because we use :latest | 16:37 |
clarkb | I'm coming around to just sending it | 16:38 |
fungi | and we prune the old images on the servers? or you mean avoiding locally changing the compose files | 16:38 |
clarkb | fungi: we prune the images `docker image prune -f --filter "until=72h"` <- I think that 72h is from when the image is built not when it was deployed | 16:39 |
fungi | but yeah, i don't anticipate any serious disruption, and if there is some impact to replication then slightly stale repos while we work through that are likely to go entirely unnoticed by !us | 16:39 |
clarkb | so one thing we could do is a noop rebuild of gitea on bullseye then immediately do the bookworm update so the bullseye images are less than 72 hours old | 16:39 |
clarkb | I think if we want to be careful that seems like the easiest safe option | 16:39 |
fungi | wfm | 16:40 |
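Since `until=72h` is evaluated against each image's build time rather than when it was deployed, a quick way to check whether a fallback image would survive the prune (the image name here is illustrative):

```shell
# List local image ages; anything built more than 72h ago is fair game
# for the prune filter quoted above:
docker image ls --format '{{.Repository}}:{{.Tag}}  {{.CreatedAt}}' | grep gitea
docker image inspect -f '{{.Created}}' opendevorg/gitea:latest
```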
clarkb | ok give me a few and I can stack a new change under the bookworm change that does a noop rebuild | 16:40 |
clarkb | one thing we need to be careful of is doing them in sequence such that the deploy for the first image doesn't run after the promote for the second image | 16:42 |
clarkb | because in that case we'll just pull the latest bookworm image and not step through | 16:42 |
clarkb | basically that means approve the first one. Let it deploy then approve the second one | 16:43 |
fungi | yep | 16:44 |
fungi | makes sense | 16:44 |
opendevreview | Clark Boylan proposed opendev/system-config master: Update Gitea images to bookworm https://review.opendev.org/c/opendev/system-config/+/893073 | 16:45 |
opendevreview | Clark Boylan proposed opendev/system-config master: Rebuild gitea on bullseye https://review.opendev.org/c/opendev/system-config/+/893539 | 16:45 |
clarkb | I think that will do it | 16:45 |
TheJulia | Who wants my pubkey for autohold request 0000000249 :) | 17:44 |
fungi | TheJulia: i can take it | 17:45 |
fungi | TheJulia: ssh root@104.239.174.94 | 17:48 |
TheJulia | perfect, thanks! | 17:48 |
fungi | fun fact, my isp's ipv6 routing actually seems fine. it's just that i can't reach review.opendev.org over ipv6. i can reach our other servers in that same network over ipv6 just fine though | 17:49 |
fungi | anyone else having trouble getting to 2604:e100:1:0:f816:3eff:fe52:22de by any chance? | 17:49 |
TheJulia | works for me | 17:54 |
fungi | oh, or maybe i can't reach other stuff there. that's vexxhost ca-ymq-1 | 17:54 |
fungi | yeah, looks like my traceroute from home is dying somewhere after equinix ashburn/iad | 17:55 |
fungi | probably asymmetric return path dying elsewhere | 17:56 |
fungi | yeah, looks like something's up with the gateway in vexxhost not being able to reach my isp | 17:58 |
clarkb | fungi: does the mirror in ca-ymq-1 have similar problems for you? | 19:23 |
fungi | yeah, it's that whole network from what i can tell | 19:25 |
fungi | looks like the routers at least don't want to forward packets back to my 2600:6c60:5300:: isp | 19:26 |
fungi | or 2600:6c60:7003:: | 19:27 |
fungi | so probably anything in 2600:6c60:: | 19:27 |
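The kind of checks being described, using the address quoted earlier in the log (mtr shown as an alternative, assuming it is installed):

```shell
ping -6 -c 3 2604:e100:1:0:f816:3eff:fe52:22de   # review.opendev.org
traceroute -6 review.opendev.org                  # see where the path dies
mtr -6 --report review.opendev.org                # or watch loss over time
```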
opendevreview | Merged opendev/system-config master: Rebuild gitea on bullseye https://review.opendev.org/c/opendev/system-config/+/893539 | 19:35 |
clarkb | fungi: all of the giteas have restarted on the new images (b7cda0718262 and 474f230c47d5) | 19:50 |
clarkb | I think we can approve the bookworm update now | 19:50 |
clarkb | and since those two images are only an hour or so old they shouldn't get pruned when bookworm deploys | 19:51 |
fungi | done | 19:52 |
fungi | i concur | 19:52 |
clarkb | excellent | 19:52 |
opendevreview | Merged opendev/system-config master: Update Gitea images to bookworm https://review.opendev.org/c/opendev/system-config/+/893073 | 21:29 |
clarkb | still waiting for gitea09 to upgrade | 21:31 |
clarkb | I expect that will happen at any moment though | 21:31 |
clarkb | gitea09 is running the bookworm image now | 21:32 |
clarkb | I think the gerrit replication log shows replication completing to gitea09 after gitea09 was updated | 21:35 |
clarkb | but it's close, so it would be good to get more data | 21:35 |
clarkb | ok just now tempest had a replication event trigger (I think due to a review being posted, not a change being pushed). I could see all but 13 succeed. 13 refused the connection, I believe its sshd was down as part of the upgrade. Then after the retry backoff it tried again and succeeded to gitea13. | 21:39 |
clarkb | that is looking good, but harder to confirm it worked since it wasn't a regular change ref I can fetch but a review instead | 21:39 |
opendevreview | Clark Boylan proposed opendev/system-config master: DNM Forced fail on Gerrit to check bookworm/java 17 update https://review.opendev.org/c/opendev/system-config/+/893571 | 21:45 |
clarkb | I'll put a hold in place for ^ and then we can see if that ref is fetchable from the giteas | 21:45 |
clarkb | the deployment job did end up succeeding | 21:46 |
clarkb | I was able to fetch ^ using `git fetch origin refs/changes/71/893571/1` where origin is https://opendev.org/opendev/system-config (fetch), so I think this is all happy | 21:48 |
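Spelled out, the verification against the gitea-backed remote looks like this, with the change ref taken from the DNM change above:

```shell
git remote -v                      # origin  https://opendev.org/opendev/system-config (fetch)
git fetch origin refs/changes/71/893571/1
git log -1 --oneline FETCH_HEAD    # should show the replicated patchset
```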
fungi | yep | 21:56 |
fungi | sorry, stepped away at the wrong moment to eat | 21:57 |
fungi | but it's all testing clean for me | 21:57 |
fungi | replication looks good | 21:58 |
fungi | deploy job completed successfully so all the servers are updated | 21:59 |
clarkb | yup. This is what we expected but given the trouble MINA had previously it was worth being cautious | 22:11 |
clarkb | it was also nice to see that the logs show replication working across a gitea restart (when done properly at least) | 22:14 |
clarkb | builds more confidence in the system | 22:14 |
fungi | agreed | 22:15 |
clarkb | heads up there is a jitsi release that just got made | 22:16 |
clarkb | we'll auto upgrade during the daily runs. They've all been happy so far though since you modernized the config | 22:16 |
fungi | yeah, we should probably still test next week just to be sure | 22:29 |
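The daily auto-upgrade amounts to roughly the following on the meetpad host (a sketch of the effect, not the actual playbook tasks):

```shell
# Pull whatever the :latest tags now point at and recreate the containers:
docker-compose pull
docker-compose up -d
```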