Monday, 2025-12-01

*** liuxie is now known as liushy03:29
fricklerinfra-root: I'm getting a proxy error from https://paste.opendev.org/show/beYQ1FZ7T0c87tSCMsxC/ , can't check myself right now15:45
corvuswfm15:46
fricklerok, reload helped, weird15:47
fungifrickler: we proxy from apache to a loopback listener, so if the lodgeit process didn't respond or responded with an http error then i would expect apache to serve a proxy error15:54
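A minimal sketch of the check fungi describes, assuming a backend listening on a loopback port: it sends one GET directly to the listener and reports whether the backend answered, answered with a server error, or did not answer at all. The last two are the cases where a fronting apache would be expected to serve a proxy error. The function name `probe_backend` and the default port 5000 are hypothetical, not taken from the paste.opendev.org deployment.

```python
import http.client


def probe_backend(host="127.0.0.1", port=5000, path="/", timeout=5.0):
    """Classify how a loopback backend answers a single GET request.

    The default port is a placeholder; substitute the port the
    proxied service actually listens on.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException) as exc:
        # Connection refused, timeout, or a malformed reply: the proxy
        # in front would have nothing valid to relay.
        return f"no response ({exc.__class__.__name__})"
    finally:
        conn.close()
    return "ok" if status < 500 else f"backend error (HTTP {status})"
```

Running this on the server itself, bypassing apache, would help tell a backend hiccup apart from a problem in the proxy layer.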
clarkbthere is an alert email from rackspace for the tracing server too. I can't seem to get to the server via https15:58
clarkbI haven't tried ssh yet as I need to do local updates before loading ssh keys15:58
fungilooks like ssh to tracing02 is timing out for me16:00
clarkbok may be that things haven't recovered on their side yet16:02
fungiyeah, the ticket so far just said something happened to the host and they're looking into it16:03
clarkbhttps://review.opendev.org/c/opendev/system-config/+/968270 hasn't merged yet but the zuul components list indicates this weekend's restarts and upgrades succeeded anyway16:06
clarkbinfra-root ^ that would be a good one to land before holidays where we pay less attention I suspect.16:07
clarkbthen also I'm interested in upgrading gitea soon: https://review.opendev.org/c/opendev/system-config/+/96824516:07
clarkbI've gone ahead and approved https://review.opendev.org/c/opendev/system-config/+/967619 this adds the extra checks for eavesdrop room collisions16:34
clarkbit doesn't change the eavesdrop config for irc or matrix16:34
clarkbfungi: did you want to review https://review.opendev.org/c/opendev/system-config/+/967608 to add gerritbot to our new matrix room?16:35
clarkbfungi: also https://review.opendev.org/c/opendev/statusbot/+/967252 for statusbot matrix support though that has two +2's so we can probably approve it if you'd prefer16:35
clarkbthat second change doesn't affect configuration either just adds support for things. We'll have to follow up after the container image is updated with configuration updates16:36
fungiah yes, just a sec, thanks for the reminder16:37
fungii added a comment to 967608 that it's missing some of the stuff we currently notify this channel about, but we can always add to the list of projects later16:41
fungiso approved it as-is16:41
opendevreviewMerged opendev/system-config master: Add checks to avoid irc and matrix log collisions  https://review.opendev.org/c/opendev/system-config/+/96761916:44
clarkbfungi: it does have zuul/zuul-jobs but you're right that diskimage builder is missing16:45
fungier, yeah i meant one of the others i guess. anyway there are a bunch that were left out but most of them were probably fine to leave out regardless16:46
opendevreviewMerged opendev/statusbot master: Add matrix support  https://review.opendev.org/c/opendev/statusbot/+/96725217:04
opendevreviewMerged opendev/system-config master: Add matrix gerritbot to the new opendev matrix room  https://review.opendev.org/c/opendev/system-config/+/96760817:06
clarkbthe deployment job for 967608 is running now. I wonder if we can come up with something to test the status notice to both matrix and irc without being too spammy (it will hit all channels so don't want to do it just to test)17:10
clarkb(I think the deployment for gerritbot will side effect an update to statusbot)17:11
clarkbwhich ya it just dropped and rejoined so I think that happened17:12
clarkbdeployment is done but didn't restart the matrix gerrit bot17:15
clarkbI'll manually restart the bot now17:16
clarkbok that is done. It occurs to me that the bot may reread its config without restarting and maybe it doesn't join rooms until it needs to (based on the fact that it hasn't joined the room since the restart)17:19
clarkbeither way I suspect that if we push code to one of those repos it should generate a message in matrix now17:20
fungior merge some17:21
clarkbI think for statusbot we need to create a new user now and update its config to include a matrix token and some rooms?17:25
clarkbfor gitea https://review.opendev.org/c/opendev/system-config/+/968245 I'm happy to babysit that today if someone is able to look at it and just sanity check there isn't anything obviously wrong there17:26
fungisure, looking17:28
fungilooks like tracing02 is still down18:10
fungieven though openstack server list still indicates it's in an active state, i'm unable to reach it via ssh18:11
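The gap between the API's view (server reported active) and actual reachability can be checked independently of the cloud API. The sketch below is one hedged way to do that: it only tests whether a TCP connection to the ssh port succeeds within a timeout. The function name `port_reachable` is hypothetical, and a successful connect shows only that something is listening, not that login would work.

```python
import socket


def port_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A call such as `port_reachable("tracing02.example.org")` (placeholder hostname) returning False while the API still lists the instance as active would match the situation described here.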
fungisystem-config-run-gitea finally got nodes assigned18:19
fungizuul's projecting about an hour to merge18:19
clarkbI expect that the change should merge soon. I have my socks proxy running19:26
fungiyeah, system-config-run-gitea is wrapping up now19:28
fungiand opendev-buildset-registry has resumed19:34
fungitime to see if gerritbot is working...19:35
opendevreviewMerged opendev/system-config master: Update gitea to 1.25.2  https://review.opendev.org/c/opendev/system-config/+/96824519:35
fungisaw it in both irc and matrix, so seems to have worked19:35
fungiand now infra-prod-service-gitea is starting19:38
fungigitea09 has restarted according to the log19:39
clarkbit is loading for me through the socks proxy19:41
clarkbI'm going to now do a port forward and test cloning19:41
clarkbclone works for me19:43
clarkb09-11 are done at this point19:44
clarkball servers should be updated now19:50
clarkbweb ui checks look good to me19:50
clarkbclone worked on gitea09. I think the last thing to check is replication. I'll wait patiently for a new patchset/change to check on19:50
fungiyeah, everything's working for me through haproxy and looks fine19:51
clarkbhttps://opendev.org/airship/deckhand/commit/bf362c48ba0cbc19d11cb29a531dc343ee702e4e was pushed with at least 4 of 6 upgraded. I am talking to gitea10 so I think replication is working but I'll wait for another patchset/change after all 6 updated just to be sure19:51
clarkbhttps://opendev.org/openstack/governance/commit/df430585dbafb5a3d0889b369de7488955590c61 arrived at 19:51 ish so I think it was after all 6 completed19:53
clarkbfrom https://review.opendev.org/c/openstack/governance/+/96914519:53
clarkbnext on my todo list is our meeting agenda. Let me know if there are updates you want to see. I will add infos to the gerrit upgrade topic, the move to matrix topic, and drop the gitea 1.25.2 upgrade since it is now done19:56
clarkbtracing02's uptime is reported to be 18 minutes and the web service responds to me now21:02
clarkbfungi: I'm guessing you didn't do anything to make that happen and this is just a case of needing to wait for the provider to address it?21:02
fungiclarkb: i did not, no21:10
clarkbdmsimard[m]: do we want to discuss ara things more tomorrow? I think we addressed some of the immediate questions. Not sure if there is new info we want to discuss since then21:25
dmsimard[m]I am working on some of the ara backlog and this is coming up soon enough now but I don't have progress to report21:31
dmsimard[m]<this> as in testing around ways to retrieve the databases from zuul artifacts21:31
clarkbdmsimard[m]: ack no rush. I'm just putting together an agenda for tomorrow and didn't want to miss it like last time if there was something to note21:31
dmsimard[m]sure21:31
dmsimard[m]I did talk to Arnaud about the flavors and upgrading the aggregate though, it is not as simple as I thought it would be21:33
dmsimard[m]we have to put it on our roadmap and get some other folks involved21:33
dmsimard[m]it can be done, just not as easily as I believed, like a lot of things in life21:33
clarkbdmsimard[m]: yes, he indicated that it was complicated for historical reasons. We're happy to work with you all on that but realize it's largely on you to do the hard part21:34
dmsimard[m]I think on your end we might just need to disable CI on the ovh regions temporarily while we swap out things underneath when we are ready to proceed21:35
clarkbyup and then maybe update configuration to point at new flavor names etc21:35
dmsimard[m]right21:36
clarkbI have to do a school run in a few minutes so will be popping out21:55
fungirackspace notified us 5 minutes ago that tracing02 should be back online (which of course we already confirmed some time ago)22:35
fungiso, like, 1.5 hours later22:35
clarkbinfra-root any thoughts on https://review.opendev.org/c/opendev/system-config/+/968270 to make zuul upgrades and reboots more reliable? fungi in particular may have thoughts on not checking for apt failures specific to the lock file being held (maybe there is some failure case where retrying would be dangerous? But I don't think so as apt should avoid making changes in an error state22:57
clarkbwithout intervention)22:57
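The trade-off clarkb raises, retrying on any apt failure versus retrying only when the lock file is held, can be sketched as follows. This is an illustration under stated assumptions, not the content of change 968270: the function names, the `only_lock_errors` switch, and the retry counts are hypothetical, and the two marker strings are examples of apt's lock-related error text that may vary between apt versions.

```python
import subprocess
import time

# Example substrings of apt's lock-related errors (assumed, may vary).
LOCK_MARKERS = (
    "Could not get lock",
    "Unable to acquire the dpkg frontend lock",
)


def is_lock_error(stderr):
    """Return True if stderr looks like an apt/dpkg lock failure."""
    return any(marker in stderr for marker in LOCK_MARKERS)


def run_with_retry(cmd, attempts=5, delay=10.0, only_lock_errors=False,
                   runner=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying on failure up to `attempts` times.

    With only_lock_errors=True, any failure that is not a lock error
    stops the loop immediately instead of being retried.
    """
    for attempt in range(1, attempts + 1):
        result = runner(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result
        if only_lock_errors and not is_lock_error(result.stderr):
            break
        if attempt < attempts:
            sleep(delay)
    return result
```

With `only_lock_errors=False` the loop retries every failure, which is the simpler behaviour being asked about; the stricter mode gives up early on errors that waiting is unlikely to fix.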
clarkbok last call on meeting agenda items. I'll get that sent out shortly if there are no suggestions for further edits23:09
