*** janders8 is now known as janders | 01:23 | |
opendevreview | Michal Nasiadka proposed opendev/system-config master: Add opendev-mirror-container-image-debian job to be run https://review.opendev.org/c/opendev/system-config/+/956760 | 09:19 |
*** darmach48 is now known as darmach4 | 12:15 | |
*** ykarel_ is now known as ykarel | 15:02 | |
clarkb | fungi: looking at https://meetings.opendev.org/ I think we can actually land the eavesdrop container updates in system-config after 1700 UTC today | 15:03 |
opendevreview | Clark Boylan proposed opendev/irc-meetings master: Drop storyboard meeting https://review.opendev.org/c/opendev/irc-meetings/+/956796 | 15:17 |
clarkb | fungi: I also noticed ^ when looking at the thursday meeting list | 15:17 |
fungi | thanks! yeah i'm good with approving the changes around 17:00 then, so about 1.5 hours from now | 15:20 |
clarkb | still no update on the etherpad issue I filed. Did anyone test the colibris theme that is on the held node? I'm mostly wondering if we think that is a viable alternative if it comes to that and not advocating for it necessarily | 15:38 |
opendevreview | Merged opendev/irc-meetings master: Drop storyboard meeting https://review.opendev.org/c/opendev/irc-meetings/+/956796 | 15:41 |
fungi | i have not yet, no | 15:44 |
clarkb | fungi: now still a good time to approve that change? I'm around too | 17:02 |
fungi | doing so | 17:03 |
fungi | aha, that's just a single change, not a pair of them | 17:04 |
clarkb | yup since both the images and the config management are all in system-config | 17:05 |
clarkb | the others were split up across repos | 17:05 |
fungi | okay, 956706,1 is in the gate now | 17:06 |
fungi | unrelated, we may want to keep an eye out for impact from https://blog.pypi.org/posts/2025-08-07-wheel-archive-confusion-attacks/ though with everything generating their wheels using pyproject-build i doubt we'll see problems | 17:11 |
fungi | probably it'll hit bespoke wheel generators | 17:12 |
clarkb | fungi: ya it seems like you'd either need to be bespoke or nefarious to get hit | 17:13 |
clarkb | I doubt any of our packages have problems but potentially if we consume wheels as deps using bespoke generation processes we could see problems there | 17:14 |
fungi | i think it's only going to affect new uploads? | 17:14 |
clarkb | ah yup and even then its just a warning for 6 months | 17:15 |
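As a hedged aside (not the actual zuul-jobs role content), the "standard tooling" path being referred to looks roughly like the task below: wheels produced by the pyproject-build frontend rather than a bespoke script that assembles the archive by hand, which is the case the advisory is least likely to affect. The playbook layout and the use of Zuul's zuul.project.src_dir variable are illustrative assumptions.

```yaml
# Illustrative sketch only: build a wheel with the standard build
# frontend (pypa/build) instead of hand-assembling the archive.
- hosts: all
  tasks:
    - name: Build a wheel with python -m build
      command: python3 -m build --wheel --outdir dist .
      args:
        # zuul.project.src_dir is Zuul's checkout of the project under test
        chdir: "{{ zuul.project.src_dir }}"
```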
clarkb | the change should land in about 10-15 minutes | 17:36 |
clarkb | *the eavesdrop quay.io image move change | 17:36 |
opendevreview | Merged opendev/system-config master: Reapply "Move system-config irc bots into quay.io" https://review.opendev.org/c/opendev/system-config/+/956706 | 17:47 |
clarkb | hrm that doesn't seem to be promoting the images? | 17:48 |
clarkb | we may end up reverting to old versions as a result | 17:48 |
fungi | i thought we promoted those in deploy | 17:49 |
clarkb | I thought so too but that didn't happen | 17:49 |
clarkb | I think I should put eavesdrop in emergency | 17:49 |
clarkb | gah, the job has already started, it may be too late | 17:49 |
clarkb | ya I was too late. The limnoria and matrix eavesdrop bots have restarted on the quay versions which I think are old from when we last attempted this | 17:52 |
clarkb | I'm working on a fix on the system-config side. Not sure if we want to do anything on the production side in the interim | 17:52 |
clarkb | fungi: maybe you can check if these messages are getting logged? If so then we're probably good enough for now | 17:53 |
corvus | https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2025-08-07.log has that last message | 17:56 |
opendevreview | Clark Boylan proposed opendev/system-config master: Trigger rebuilds of our irc bots so that they promote new images to quay https://review.opendev.org/c/opendev/system-config/+/956823 | 17:56 |
clarkb | corvus: cool I suspect the old versions are largely functional just running older code than we'd like. And ^ should get us back on the right track | 17:57 |
clarkb | I'll leave eavesdrop02 in the emergency file for now since we run hourlies. That way we can manually switch back to the docker hosted images in the interim if we decide that is necessary | 17:58 |
corvus | clarkb: so iiuc 956706 built images but did not promote them? | 17:58 |
clarkb | corvus: correct. This has to do with zuul not triggering the promotion jobs if they would only trigger due to being updated by the change | 17:58 |
clarkb | corvus: I think it has to do with the triggers on the pipeline? We're no longer in a change context but a ref context so the trigger-updated-jobs behavior doesn't fire? | 17:59 |
corvus | clarkb: promote is a change context, but there's no delta between the current running state and future state of the change because the reconfiguration has taken effect | 18:00 |
clarkb | I think channel logging is working for both matrix and irc | 18:02 |
clarkb | accessbot is the other bot but I did get the emergency file update in place before it would have been triggered | 18:02 |
clarkb | I think | 18:02 |
corvus | clarkb: if this is anything more than a minor annoyance, we could switch to using a dispatch job in promote like we do for zuul-providers. then the promote job would just depend on which artifacts happened to show up during the gate run. | 18:02 |
corvus | (also, we could probably even just switch to running a single promote job that just promotes whatever artifacts there are, all in one job?) | 18:03 |
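A minimal sketch of that "single promote job" idea, assuming the artifact shape (metadata.type, metadata.repository, metadata.tag) returned by the zuul-jobs image build roles; the debug task stands in for whatever promote role would actually retag the images, so this is not the zuul-providers dispatch job itself.

```yaml
# Illustrative sketch: one promote playbook that handles whatever image
# artifacts the gate buildset happened to produce.
- hosts: localhost
  tasks:
    - name: Promote every container image artifact from the buildset
      debug:
        msg: "Would promote {{ item.metadata.repository }}:{{ item.metadata.tag | default('latest') }}"
      loop: "{{ zuul.artifacts | default([]) }}"
      when:
        - item.metadata is defined
        - item.metadata.type | default('') == 'container_image'
```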
fungi | clarkb: i'm looking now | 18:03 |
clarkb | corvus: ya that may be worth investigating. I think 90% of the time it's not a big deal because you're not rolling back in time | 18:04 |
clarkb | corvus: but due to us having had the images on quay in the past we have content to pull from there rather than erroring due to no content | 18:04 |
clarkb | or succeeding in pulling a recent but slightly older version | 18:04 |
fungi | /var/lib/limnoria/opendev/logs/ChannelLogger/oftc/#opendev/ has current content | 18:04 |
fungi | /var/lib/limnoria/opendev/logs/ChannelLogger/oftc/#opendev/#opendev.2025-08-07.log specifically | 18:04 |
clarkb | fungi: yup I think the old versions of matrix-eavesdrop and limnoria are working for channel logging | 18:07 |
clarkb | I'm still trying to track down if accessbot ran. It seems we pulled the new (actually old) image, but the run-accessbot playbook did skip as no hosts matched after I put the host in the emergency file. So I'm trying to figure that out | 18:07 |
clarkb | I don't think accessbot ran | 18:08 |
clarkb | based on /var/log/accessbot/accessbot.log contents | 18:08 |
fungi | agreed, last line was 02:48:40 utc | 18:09 |
clarkb | so ya I think 956823 should get us back to where we want to be. I can drop eavesdrop02.opendev.org from the emergency file once that is going to land (to avoid hourlies from doing anything unexpected in the meantime) | 18:10 |
clarkb | and it doesn't seem like we need to shutdown the logging bots (due to file format incompatibilities or anything like that) | 18:11 |
clarkb | I've noticed that tristanC[m]'s matrix gerritbot related images don't have proper timestamps set on them. (they think they were built on the unix epoch 55 years ago) | 18:15 |
corvus | i feel like the epoch can't have been that long ago. ;) | 18:18 |
clarkb | fungi: corvus: just to be sure you don't see any reason we need to shutdown those bots and/or manually switch them back over to the docker images while we wait on the rebuilds and promotion in quay do you? | 18:20 |
mnasiadka | Hello, any chance to get a second core review on https://review.opendev.org/c/opendev/system-config/+/956760 ? | 18:20 |
corvus | clarkb: can't think of any but haven't been paying close attention to what's been going on with them during that time | 18:20 |
clarkb | corvus: I think its mostly just been maintenance efforts to keep them up to date with python and the underlying OS | 18:21 |
fungi | clarkb: doesn't seem necessary, no, we can just wait for 956823 to upgrade them | 18:29 |
clarkb | ok thanks for confirming | 18:29 |
opendevreview | Merged opendev/system-config master: Add opendev-mirror-container-image-debian job to be run https://review.opendev.org/c/opendev/system-config/+/956760 | 18:30 |
clarkb | thinking out loud here: Another tool we could employ to avoid problems like this is using explicit tag versions rather than :latest | 18:38 |
clarkb | but that adds more overhead to writing changes and it's probably better to just automate around the problem if we want to invest in fixing this | 18:39 |
clarkb | the change to get us back to the future is in the gate now | 18:39 |
corvus | that complicates speculative testing; i'd rather stick with latest and either accept the gotcha or update the promote jobs | 18:45 |
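To make that trade-off concrete, a hedged docker-compose sketch (the service name, image path, and pinned tag are placeholders, not the real eavesdrop configuration):

```yaml
services:
  limnoria:
    # current approach: always track the most recently promoted image;
    # a skipped or stale promote means the next pull can move backwards
    image: quay.io/opendevorg/limnoria:latest
    # pinned alternative: explicit and reproducible, but every upgrade
    # needs a change and speculative image testing gets more complicated
    # image: quay.io/opendevorg/limnoria:20250807
```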
clarkb | a simple way to avoid this problem would be to split the changes even though we don't always need to. Then we can ensure new images get to quay.io before we pull from quay.io | 18:51 |
fungi | which is effectively what we did with the others by necessity, as they were from other repositories | 18:56 |
clarkb | yup | 18:58 |
clarkb | I'm just thinking there are more of these to do, so what is the best approach? We could do what corvus suggests and automate it fully. But I'm wondering if there are simpler options and splitting the changes up might be a good simple alternative | 18:58 |
clarkb | https://zuul.opendev.org/t/openstack/build/e1ad36ef66fb45aa8b94bcb3db931255 I don't believe this is related to the change at all but interesting to see that check catch a problem | 19:02 |
clarkb | opendev.org is up for me too so probably something in that cloud | 19:03 |
clarkb | hourly jobs just completed and the fix should land in about 10 minutes. I'll remove eavesdrop02 from the emergency file now | 19:12 |
clarkb | I noticed the mirror for openmetal in the emergency file and checked if it is up now (it is). We are still waiting for one more control plane node to migrate though. Not sure if we want to reenable things and see how they do or just wait a week then reenable after the migration is fully complete | 19:15 |
opendevreview | Merged opendev/system-config master: Trigger rebuilds of our irc bots so that they promote new images to quay https://review.opendev.org/c/opendev/system-config/+/956823 | 19:22 |
clarkb | the images promoted and quay.io/opendevorg seems to reflect that with new timestamps | 19:25 |
fungi | it's running and doing stuff according to its log | 19:31 |
fungi | doing *expected* stuff, from the looks of it | 19:33 |
fungi | i'll note that its current design is to just blindly reapply settings and membership and then ignore errors from the server if those things are already set, rather than looking first and deciding if something needs changing. maybe not the most polite method, but i guess we haven't gotten complaints | 19:35 |
fungi | looks like it finished at 19:35:13 | 19:36 |
fungi | also the deploy buildset reported success for 956823 | 19:38 |
opendevreview | Clark Boylan proposed opendev/system-config master: Reapply "Migrate statsd sidecar container images to quay.io" https://review.opendev.org/c/opendev/system-config/+/956828 | 19:38 |
opendevreview | Clark Boylan proposed opendev/system-config master: Pull the haproxy and zookeeper statsd sidecars from quay https://review.opendev.org/c/opendev/system-config/+/956829 | 19:38 |
clarkb | fungi: it also runs relatively infrequently ~once a day on average I think | 19:38 |
clarkb | those two changes migrate two more images I found that should be safe as their hosts are all Noble now | 19:38 |
clarkb | I've written it out as two separate changes so we can confirm things publish properly first as a way to exercise that approach and see what others think about it | 19:39 |
fungi | sgtm, thanks! | 19:39 |
clarkb | https://meetings.opendev.org/irclogs/%23opendev/latest.log.html hasn't updated with the content from just before 19:30 UTC (I had two messages) and there is no logs2html running on eavesdrop | 19:43 |
clarkb | I wonder if it got caught out by the restarting containers (the timing is about right for that, maybe?) We should check if it updates after 19:45 | 19:43 |
clarkb | (the raw logs are updating, it's just the html conversion I'm not seeing yet) | 19:43 |
clarkb | https://meetings.opendev.org/irclogs/%23opendev/latest.log.html has updated now so ya must've been due to restarts occurring around when we'd run that | 19:46 |
fungi | that's batched up every... 20 minutes? | 19:46 |
fungi | and takes a few minutes to complete | 19:47 |
clarkb | every 15 minutes. It's a cronjob on eavesdrop02 | 19:47 |
fungi | so it'll catch up nowish hopefully | 19:47 |
clarkb | yup I think it's all sorted now | 19:49 |
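For reference, a 15-minute conversion cron like the one described above could be expressed with Ansible's cron module roughly as follows; the job name and script path are placeholders rather than the actual system-config task.

```yaml
- hosts: eavesdrop
  tasks:
    - name: Convert raw channel logs to HTML every 15 minutes
      cron:
        name: irclog2html
        minute: "*/15"
        # placeholder path; the real command lives in system-config
        job: /usr/local/bin/convert-irc-logs.sh
```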
clarkb | https://meetings.opendev.org/irclogs/%23zuul/%23zuul.2025-08-07.log matrix logging seems to work too | 19:52 |
clarkb | corvus: did you see these errors with the swift image upload switch change: https://zuul.opendev.org/t/opendev/build/0ef10e60a34a4dedaba7ca09c492241d/log/job-output.txt#9820-9831 | 19:55 |
clarkb | I think it's image_upload_swift. I'll push a fix after I double check the related roles | 19:58 |
clarkb | corvus: or do you think we should rename the underlying library file for consistency. Any concern with people using that module with the existing name? | 19:58 |
opendevreview | Jeremy Stanley proposed opendev/zone-opendev.org master: Clean up old eavesdrop01 records https://review.opendev.org/c/opendev/zone-opendev.org/+/956832 | 19:59 |
corvus | clarkb: no, i think it's just us, i think we can waive the normal zuul-jobs policies until niz is done | 19:59 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Fix upload-image-swift role module usage https://review.opendev.org/c/zuul/zuul-jobs/+/956833 | 20:00 |
clarkb | corvus: ^ that's the trivial fix. Would you prefer I rename the file instead? | 20:01 |
clarkb | looks like s3 uses the format that is broken in swift so ya renaming the file for consistency is probably best | 20:01 |
corvus | yeah, consistency between the two is probably best | 20:02 |
opendevreview | Clark Boylan proposed zuul/zuul-jobs master: Fix upload-image-swift role module usage https://review.opendev.org/c/zuul/zuul-jobs/+/956833 | 20:03 |
corvus | plus, it was broken before, and it's broken now... so... can't make it worse. :) | 20:05 |
clarkb | I don't think that impacts any no_log stuff as that is an attribute at invocation or in the module itself | 20:06 |
clarkb | but double check that assertion if there is some secret masking based on the name itself | 20:06 |
corvus | i agree | 20:06 |
opendevreview | Merged zuul/zuul-jobs master: Fix upload-image-swift role module usage https://review.opendev.org/c/zuul/zuul-jobs/+/956833 | 20:18 |
corvus | i rechecked the switcheroo change | 20:19 |
clarkb | I'm going to take advantage of some oddly cool summer weather and go for a bike ride shortly. I think eavesdrop is happy now and hopefully so are image uploads | 20:19 |
fungi | static.o.o is struggling | 20:49 |
fungi | oh, or maybe it's my local internet connection | 20:50 |
fungi | yes, that's it | 20:50 |
fungi | load average on static is around a third to one | 20:50 |
fungi | seems completely fine | 20:51 |
clarkb | infra-root while on my bike ride it occurred to me that we may be close to being able to switch the python base images over to quay.io as well. Looking at https://codesearch.opendev.org/?q=opendevorg%2Fpython-&i=nope&literal=nope&files=&excludeFiles=&repos= I think this is largely the case. Within opendev, grafyaml and the statsd containers are the main consumers that haven't moved to | 23:02 |
clarkb | quay yet. I'll get a change up today to move grafyaml and already have the statsd changes up | 23:02 |
clarkb | then things like elastic-recheck, storyboard, and gear are less important things on our side that we can probably leave as is for now? Maybe I'll update gear. Then vexxhost and zuul are the other main consumers | 23:03 |
clarkb | zuul within opendev should be a non issue. We just have to update the consumer side of things within zuul and I can work on that too post move | 23:03 |
clarkb | for vexxhost images I'll ping guilhermesp mnaser and ricolin here now in case they see this. tl;dr is that the opendevorg/python- images are likely moving to quay soon. Let us know if you have questions or concerns | 23:04 |
opendevreview | Clark Boylan proposed opendev/grafyaml master: Reapply "Migrate grafyaml container images to quay.io" https://review.opendev.org/c/opendev/grafyaml/+/956839 | 23:29 |
opendevreview | Clark Boylan proposed openstack/project-config master: Pull grafyaml from quay.io https://review.opendev.org/c/openstack/project-config/+/956840 | 23:32 |
opendevreview | Clark Boylan proposed opendev/system-config master: Pull grafyaml from quay.io https://review.opendev.org/c/opendev/system-config/+/956842 | 23:36 |
clarkb | ok I think once that set of changes and the statsd changes land then we can update the python-base and python-builder images to publish to quay, then update all the images all over again to fetch from there. A bit backwards, but this allowed us to do it piecemeal and take our time. It's just now that eavesdrop has updated that all the python containers are now running on noble | 23:37 |