Thursday, 2025-08-07

*** janders8 is now known as janders01:23
opendevreviewMichal Nasiadka proposed opendev/system-config master: Add opendev-mirror-container-image-debian job to be run  https://review.opendev.org/c/opendev/system-config/+/95676009:19
*** darmach48 is now known as darmach412:15
*** ykarel_ is now known as ykarel15:02
clarkbfungi: looking at https://meetings.opendev.org/ I think we can actually land the eavesdrop container updates in system-config after 1700 UTC today15:03
opendevreviewClark Boylan proposed opendev/irc-meetings master: Drop storyboard meeting  https://review.opendev.org/c/opendev/irc-meetings/+/95679615:17
clarkbfungi: I also noticed ^ when looking at the thursday meeting list15:17
fungithanks! yeah i'm good with approving the changes around 17:00 then, so about 1.5 hours from now15:20
clarkbstill no update on the etherpad issue I filed. Did anyone test the colibris theme that is on the held node? I'm mostly wondering whether we think that's a viable alternative if it comes to that; I'm not necessarily advocating for it15:38
opendevreviewMerged opendev/irc-meetings master: Drop storyboard meeting  https://review.opendev.org/c/opendev/irc-meetings/+/95679615:41
fungii have not yet, no15:44
clarkbfungi: now still a good time to approve that change? I'm around too17:02
fungidoing so17:03
fungiaha, that's just a single change, not a pair of them17:04
clarkbyup since both the images and the config management are all in system-config17:05
clarkbthe others were split up across repos17:05
fungiokay, 956706,1 is in the gate now17:06
fungiunrelated, we may want to keep an eye out for impact from https://blog.pypi.org/posts/2025-08-07-wheel-archive-confusion-attacks/ though with everything generating their wheels using pyproject-build i doubt we'll see problems17:11
fungiprobably it'll hit bespoke wheel generators17:12
clarkbfungi: ya it seems like you'd either need to be bespoke or nefarious to get hit17:13
clarkbI doubt any of our packages have problems, but if we consume wheels as deps built with bespoke generation processes we could potentially see problems there17:14
fungii think it's only going to affect new uploads?17:14
clarkbah yup and even then it's just a warning for 6 months17:15
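(For context, a minimal sketch of the "pyproject-build" wheel generation fungi mentions above, assuming the PyPA "build" package is installed; this is illustrative and not tied to any opendev tooling:)

    # Minimal sketch: build a wheel with the standard PEP 517 frontend
    # ("pyproject-build"), assuming "pip install build" has been run first.
    # Wheels built this way get consistent metadata, which is why bespoke
    # wheel-generation scripts are the more likely victims of the advisory above.
    import subprocess
    import sys

    subprocess.run(
        [sys.executable, "-m", "build", "--wheel", "--outdir", "dist/", "."],
        check=True,
    )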
clarkbthe change should land in about 10-15 minutes17:36
clarkb*the eavesdrop quay.io image move change17:36
opendevreviewMerged opendev/system-config master: Reapply "Move system-config irc bots into quay.io"  https://review.opendev.org/c/opendev/system-config/+/95670617:47
clarkbhrm that doesn't seem to be promoting the images?17:48
clarkbwe may end up reverting to old versions as a result17:48
fungii thought we promoted those in deploy17:49
clarkbI thought so too but that didn't happen17:49
clarkbI think I should put eavesdrop in emergency17:49
clarkbgah the job is already started it may be too late17:49
clarkbya I was too late. The limnoria and matrix eavesdrop bots have restarted on the quay versions which I think are old from when we last attempted this17:52
clarkbI'm working on a fix on the system-config side. Not sure if we want to do anything on the production side in the interim17:52
clarkbfungi: maybe you can check if these messages are getting logged? If so then we're probably good enough for now17:53
corvushttps://meetings.opendev.org/irclogs/%23opendev/%23opendev.2025-08-07.log has that last message17:56
opendevreviewClark Boylan proposed opendev/system-config master: Trigger rebuilds of our irc bots so that the promote new images to quay  https://review.opendev.org/c/opendev/system-config/+/95682317:56
clarkbcorvus: cool I suspect the old versions are largely functional just running older code than we'd like. And ^ should get us back on the right track17:57
clarkbI'll leave eavesdrop02 in the emergency file for now since we run hourlies. That way we can manually switch back to the docker hosted images in the interim if we decide that is necessary17:58
corvusclarkb: so iiuc 956706 built images but did not promote them?17:58
clarkbcorvus: correct. This has to do with zuul not triggering the promotion jobs if they would only trigger due to being updated by the change17:58
clarkbcorvus: I think it has to do with the triggers on the pipeline? We're no longer in a change context but a ref context so the trigger updated jobs behavior doesn't fire?17:59
corvusclarkb: promote is a change context, but there's no delta between the current running state and the future state of the change because the reconfiguration has taken effect18:00
clarkbI think channel logging is working for both matrix and irc18:02
clarkbaccessbot is the other bot but I did get the emergency file update in place before it would have been triggered18:02
clarkbI think18:02
corvusclarkb: if this is anything more than a minor annoyance, we could switch to using a dispatch job in promote like we do for zuul-providers.  then the promote job would just depend on which artifacts happened to show up during the gate run.18:02
corvus(also, we could probably even just switch to running a single promote job that just promotes whatever artifacts there are, all in one job?)18:03
fungiclarkb: i'm looking now18:03
clarkbcorvus: ya that may be worth investigating. I think 90% of the time it's not a big deal because you're not rolling back in time18:04
clarkbcorvus: but due to us having had the images on quay in the past we have content to pull from there rather than erroring due to no content18:04
clarkbor succeeding in pulling a recent but slightly older version18:04
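(A hedged sketch of the "single promote job that promotes whatever artifacts there are" idea corvus floats above; this is not the real Zuul promote machinery, and the registry namespace, per-change tag format, and use of skopeo are assumptions for illustration:)

    # Sketch only, not opendev's actual promote job: retag whatever gate-built
    # images exist to :latest using skopeo. Credentials/auth are omitted and
    # the per-change tag scheme is an assumption.
    import subprocess

    def promote_images(images, change_tag):
        for image in images:
            src = f"docker://quay.io/opendevorg/{image}:{change_tag}"
            dst = f"docker://quay.io/opendevorg/{image}:latest"
            subprocess.run(["skopeo", "copy", src, dst], check=True)

    promote_images(["accessbot", "limnoria", "matrix-eavesdrop"], "change_956706_latest")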
fungi/var/lib/limnoria/opendev/logs/ChannelLogger/oftc/#opendev/ has current content18:04
fungi/var/lib/limnoria/opendev/logs/ChannelLogger/oftc/#opendev/#opendev.2025-08-07.log specifically18:04
clarkbfungi: yup I think the old versions of matrix-eavesdrop and limnoria are working for channel logging18:07
clarkbI'm still trying to track down if accessbot ran. It seems we pulled the new old image but the run-accessbot playbook did skip as no hosts matched after I put the host in the emergency file. So I'm trying to figure that out18:07
clarkbI don't think accessbot ran18:08
clarkbbased on /var/log/accessbot/accessbot.log contents18:08
fungiagreed, last line was 02:48:40 utc18:09
clarkbso ya I think 956823 should get us back to where we want to be. I can drop eavesdrop02.opendev.org from the emergency file once that is going to land (to avoid hourlies from doing anything unexpected in the meantime)18:10
clarkband it doesn't seem like we need to shutdown the logging bots (due to file format incompatibilities or anything like that)18:11
clarkbI've noticed that tristanC[m]'s matrix gerritbot related images don't have proper timestamps set on them. (they think they were built on the unix epoch 55 years ago)18:15
corvusi feel like the epoch can't have been that long ago.  ;)18:18
clarkbfungi: corvus: just to be sure, you don't see any reason we need to shut down those bots and/or manually switch them back over to the docker images while we wait on the rebuilds and promotion in quay, do you?18:20
mnasiadkaHello, any chance to get a second core review on https://review.opendev.org/c/opendev/system-config/+/956760 ?18:20
corvusclarkb:  can't think of any but haven't been paying close attention to what's been going on with them during that time18:20
clarkbcorvus: I think it's mostly just been maintenance efforts to keep them up to date with python and the underlying OS18:21
fungiclarkb: doesn't seem necessary, no, we can just wait for 956823 to upgrade them18:29
clarkbok thanks for confirming18:29
opendevreviewMerged opendev/system-config master: Add opendev-mirror-container-image-debian job to be run  https://review.opendev.org/c/opendev/system-config/+/95676018:30
clarkbthinking out loud here: Another tool we could employ to avoid problems like this is using explicit tag versions rather than :latest18:38
clarkbbut that adds more overhead to writing changes, and it's probably better to just automate around the problem if we want to invest in fixing this18:39
clarkbthe change to get us back to the future is in the gate now18:39
corvusthat complicates speculative testing; i'd rather stick with latest and either accept the gotcha or update the promote jobs18:45
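(For reference, a hedged sketch of what pinning an explicit version instead of :latest could look like: resolve the current digest with skopeo so a deployment could reference the image by digest. The image name is only an example, and as corvus notes this is not the approach being taken:)

    # Sketch only: look up an image's digest so it could be pinned explicitly
    # rather than tracking :latest. Example image; not part of any opendev change.
    import json
    import subprocess

    result = subprocess.run(
        ["skopeo", "inspect", "docker://quay.io/opendevorg/limnoria:latest"],
        check=True, capture_output=True, text=True,
    )
    digest = json.loads(result.stdout)["Digest"]
    print(f"quay.io/opendevorg/limnoria@{digest}")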
clarkba simple way to avoid this problem would be to split the changes even though we don't always need to. Then we can ensure new images get to quay.io before we pull from quay.io18:51
fungiwhich is effectively what we did with the others by necessity, as they were from other repositories18:56
clarkbyup18:58
clarkbI'm just thinking there are more of these to do so what is the best approach. We could do what corvus suggests and automate it fully. But I'm wondering if there are simpler options and splitting the changes up might be a good simple alternative18:58
clarkbhttps://zuul.opendev.org/t/openstack/build/e1ad36ef66fb45aa8b94bcb3db931255 I don't believe this is related to the change at all but interesting to see that check catch a problem19:02
clarkbopendev.org is up for me too so probably something in that cloud19:03
clarkbhourly jobs just completed and the fix should land in about 10 minutes. I'll remove eavesdrop02 from the emergency file now19:12
clarkbI noticed the mirror for openmetal in the emergency file and checked if it is up now (it is). We are still waiting for one more control plane node to migrate though. Not sure if we want to reenable things and see how they do or just wait a week then reenable after the migration is fully complete19:15
opendevreviewMerged opendev/system-config master: Trigger rebuilds of our irc bots so that the promote new images to quay  https://review.opendev.org/c/opendev/system-config/+/95682319:22
clarkbthe images promoted and quay.io/opendevorg seems to reflect that with new timestamps19:25
fungiit's running and doing stuff according to its log19:31
fungidoing *expected* stuff, from the looks of it19:33
fungii'll note that its current design is to just blindly reapply settings and membership and then ignore errors from the server if those things are already set, rather than looking first and deciding if something needs changing. maybe not the most polite method, but i guess we haven't gotten complaints19:35
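(A hedged illustration of the "blindly reapply and ignore errors" pattern fungi describes; the class and method names below are invented for illustration and are not accessbot's real code or API:)

    # Illustrative only: apply the desired access entry unconditionally and
    # treat the server's "already set" complaint as success, instead of
    # querying current state and computing a diff first.
    class AlreadySetError(Exception):
        """Hypothetical error raised when the server reports nothing changed."""

    def ensure_access(client, channel, nick, flags):
        try:
            client.set_access(channel, nick, flags)
        except AlreadySetError:
            pass  # already in the desired state; ignore the complaint

    def ensure_channel_access(client, channel, desired):
        for nick, flags in desired.items():
            ensure_access(client, channel, nick, flags)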
fungilooks like it finished at 19:35:1319:36
fungialso the deploy buildset reported success for 95682319:38
opendevreviewClark Boylan proposed opendev/system-config master: Reapply "Migrate statsd sidecar container images to quay.io"  https://review.opendev.org/c/opendev/system-config/+/95682819:38
opendevreviewClark Boylan proposed opendev/system-config master: Pull the haproxy and zookeeper statsd sidecars from quay  https://review.opendev.org/c/opendev/system-config/+/95682919:38
clarkbfungi: it also runs relatively infrequently ~once a day on average I think19:38
clarkbthose two changes migrate two more images I found that should be safe as their hosts are all Noble now19:38
clarkbI've written it out as two separate changes so we can confirm things publish properly first as a way to exercise that approach and see what others think about it19:39
fungisgtm, thanks!19:39
clarkbhttps://meetings.opendev.org/irclogs/%23opendev/latest.log.html hasn't updated with the content before 19:30UTC (I had two messages) and there is no logs2html running on eavesdrop19:43
clarkbI wonder if it got caught out by the restarting containers (the timing is about right for that maybe?) We should check if it updates after 19:4519:43
clarkb(the raw logs are updating, it's just the html conversion I'm not seeing yet)19:43
clarkbhttps://meetings.opendev.org/irclogs/%23opendev/latest.log.html has updated now so ya must've been due to restarts occurring around when we'd run that19:46
fungithat's batched up every... 20 minutes?19:46
fungiand takes a few minutes to complete19:47
clarkbevery 15 minutes. It's a cron job on eavesdrop0219:47
fungiso it'll catch up nowish hopefully19:47
clarkbyup I think it's all sorted now19:49
clarkbhttps://meetings.opendev.org/irclogs/%23zuul/%23zuul.2025-08-07.log matrix logging seems to work too19:52
clarkbcorvus: did you see these errors with the swift image upload switch change: https://zuul.opendev.org/t/opendev/build/0ef10e60a34a4dedaba7ca09c492241d/log/job-output.txt#9820-983119:55
clarkbI think it's image_upload_swift. I'll push a fix after I double check the related roles19:58
clarkbcorvus: or do you think we should rename the underlying library file for consistency? Any concern with people using that module with the existing name?19:58
opendevreviewJeremy Stanley proposed opendev/zone-opendev.org master: Clean up old eavesdrop01 records  https://review.opendev.org/c/opendev/zone-opendev.org/+/95683219:59
corvusclarkb: no, i think it's just us, i think we can waive the normal zuul-jobs policies until niz is done19:59
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix upload-image-swift role module usage  https://review.opendev.org/c/zuul/zuul-jobs/+/95683320:00
clarkbcorvus: ^ that's the trivial fix. Would you prefer I rename the file instead?20:01
clarkblooks like s3 uses the format that is broken in swift so ya renaming the file for consistency is probably best20:01
corvusyeah, consistency between the two is probably best20:02
opendevreviewClark Boylan proposed zuul/zuul-jobs master: Fix upload-image-swift role module usage  https://review.opendev.org/c/zuul/zuul-jobs/+/95683320:03
corvusplus, it was broken before, and it's broken now... so... can't make it worse. :)20:05
clarkbI don't think that impacts any no_log stuff as that is an attribute at invocation or in the module itself20:06
clarkbbut double check that assertion in case there is some secret masking based on the name itself20:06
corvusi agree20:06
opendevreviewMerged zuul/zuul-jobs master: Fix upload-image-swift role module usage  https://review.opendev.org/c/zuul/zuul-jobs/+/95683320:18
corvusi rechecked the switcheroo change20:19
clarkbI'm going to take advantage of some oddly cool summer weather and go for a bike ride shortly. I think eavesdrop is happy now and hopefully so are image uploads20:19
fungistatic.o.o is struggling20:49
fungioh, or maybe it's my local internet connection20:50
fungiyes, that's it20:50
fungiload average on static is around a third to one20:50
fungiseems completely fine20:51
clarkbinfra-root while on my bike ride it occurred to me that we may be close to being able to switch the python base images over to quay.io as well. Looking at https://codesearch.opendev.org/?q=opendevorg%2Fpython-&i=nope&literal=nope&files=&excludeFiles=&repos= I think this is largely the case. Within opendev grafyaml and the statsd containers are the main consumers that haven't moved to23:02
clarkbquay yet. I'll get a change up today to move grafyaml and already have the statsd changes up23:02
clarkbthen things like elastic-recheck, storyboard, and gear are less important things on our side that we can probably leave as is for now? Maybe I'll update gear. Then vexxhost and zuul are the other main consumers23:03
clarkbzuul within opendev should be a non issue. We just have to update the consumer side of things within zuul and I can work on that too post move23:03
clarkbfor vexxhost images I'll ping guilhermesp mnaser and ricolin here now in case they see this. tl;dr is that the opendevorg/python- images are likely moving to quay soon. Let us know if you have questions or concerns23:04
opendevreviewClark Boylan proposed opendev/grafyaml master: Reapply "Migrate grafyaml container images to quay.io"  https://review.opendev.org/c/opendev/grafyaml/+/95683923:29
opendevreviewClark Boylan proposed openstack/project-config master: Pull grafyaml from quay.io  https://review.opendev.org/c/openstack/project-config/+/95684023:32
opendevreviewClark Boylan proposed opendev/system-config master: Pull grafyaml from quay.io  https://review.opendev.org/c/opendev/system-config/+/95684223:36
clarkbok I think once that set of changes and the statsd changes land we can update the python-base and python-builder images to publish to quay, then update all the images all over again to fetch from there. A bit backwards, but this allowed us to do it piecemeal and take our time. It's only now that eavesdrop has updated that all the python containers are running on noble23:37
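(A small hedged helper mirroring the codesearch step above: scan a checkout for Dockerfiles whose FROM lines still reference the opendevorg/python- base images somewhere other than quay.io, so they can be repointed. Paths and matching are illustrative:)

    # Sketch only: list FROM lines still pulling opendevorg/python-* images
    # from anywhere other than quay.io.
    from pathlib import Path

    for dockerfile in Path(".").rglob("Dockerfile*"):
        for lineno, line in enumerate(dockerfile.read_text().splitlines(), start=1):
            if (line.upper().startswith("FROM")
                    and "opendevorg/python-" in line
                    and "quay.io" not in line):
                print(f"{dockerfile}:{lineno}: {line.strip()}")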
