| Nick | Message | Time |
|---|---|---|
| | *** liuxie is now known as liushy | 03:29 |
| frickler | infra-root: I'm getting a proxy error from https://paste.opendev.org/show/beYQ1FZ7T0c87tSCMsxC/ , can't check myself right now | 15:45 |
| corvus | wfm | 15:46 |
| frickler | ok, reload helped, weird | 15:47 |
| fungi | frickler: we proxy from apache to a loopback listener, so if the lodgeit process didn't respond or responded with an http error then i would expect apache to serve a proxy error | 15:54 |
| clarkb | there is an alert email from rackspace for the tracing server too. I can't seem to get to the server via https | 15:58 |
| clarkb | I haven't tried ssh yet as I need to do local updates before loading ssh keys | 15:58 |
| fungi | looks like ssh to tracing02 is timing out for me | 16:00 |
| clarkb | ok may be that things haven't recovered on their side yet | 16:02 |
| fungi | yeah, the ticket so far just said something happened to the host and they're looking into it | 16:03 |
| clarkb | https://review.opendev.org/c/opendev/system-config/+/968270 hasn't merged yet but the zuul components list indicates this weekend's restarts and upgrades succeeded anyway | 16:06 |
| clarkb | infra-root ^ that would be a good one to land before holidays where we pay less attention I suspect. | 16:07 |
| clarkb | then also I'm interested in upgrading gitea soon: https://review.opendev.org/c/opendev/system-config/+/968245 | 16:07 |
| clarkb | I've gone ahead and approved https://review.opendev.org/c/opendev/system-config/+/967619 this adds the extra checks for eavesdrop room collisions | 16:34 |
| clarkb | it doesn't change the eavesdrop config for irc or matrix | 16:34 |
| clarkb | fungi: did you want to review https://review.opendev.org/c/opendev/system-config/+/967608 to add gerritbot to our new matrix room? | 16:35 |
| clarkb | fungi: also https://review.opendev.org/c/opendev/statusbot/+/967252 for statusbot matrix support though that has two +2's so we can probably approve it if you'd prefer | 16:35 |
| clarkb | that second change doesn't affect configuration either, it just adds support for things. We'll have to follow up with configuration updates after the container image is updated | 16:36 |
| fungi | ah yes, just a sec, thanks for the reminder | 16:37 |
| fungi | i added a comment to 967608 that it's missing some of the stuff we currently notify this channel about, but we can always add to the list of projects later | 16:41 |
| fungi | so approved it as-is | 16:41 |
| opendevreview | Merged opendev/system-config master: Add checks to avoid irc and matrix log collisions https://review.opendev.org/c/opendev/system-config/+/967619 | 16:44 |
| clarkb | fungi: it does have zuul/zuul-jobs but you're right that diskimage builder is missing | 16:45 |
| fungi | er, yeah i meant one of the others i guess. anyway there are a bunch that were left out but most of them were probably fine to leave out regardless | 16:46 |
| opendevreview | Merged opendev/statusbot master: Add matrix support https://review.opendev.org/c/opendev/statusbot/+/967252 | 17:04 |
| opendevreview | Merged opendev/system-config master: Add matrix gerritbot to the new opendev matrix room https://review.opendev.org/c/opendev/system-config/+/967608 | 17:06 |
| clarkb | the deployment job for 967608 is running now. I wonder if we can come up with something to test the status notice to both matrix and irc without being too spammy (it will hit all channels so don't want to do it just to test) | 17:10 |
| clarkb | (I think the deployment for gerritbot will side effect an update to statusbot) | 17:11 |
| clarkb | which ya it just dropped and rejoined so I think that happened | 17:12 |
| clarkb | deployment is done but didn't restart the matrix gerrit bot | 17:15 |
| clarkb | I'll manually restart the bot now | 17:16 |
| clarkb | ok that is done. It occurs to me that the bot may reread its config without restarting and maybe it doesn't join rooms until it needs to (based on the fact that it hasn't joined the room since the restart) | 17:19 |
| clarkb | either way I suspect that if we push code to one of those repos it should generate a message in matrix now | 17:20 |
| fungi | or merge some | 17:21 |
| clarkb | I think for statusbot we need to create a new user now and update its config to include a matrix token and some rooms? | 17:25 |
| clarkb | for gitea https://review.opendev.org/c/opendev/system-config/+/968245 I'm happy to babysit that today if someone is able to look at it and just sanity check there isn't anything obviously wrong there | 17:26 |
| fungi | sure, looking | 17:28 |
| fungi | looks like tracing02 is still down | 18:10 |
| fungi | even though openstack server list still indicates it's in an active state, i'm unable to reach it via ssh | 18:11 |
| fungi | system-config-run-gitea finally got nodes assigned | 18:19 |
| fungi | zuul's projecting about an hour to merge | 18:19 |
| clarkb | I expect that the change should merge soon. I have my socks proxy running | 19:26 |
| fungi | yeah, system-config-run-gitea is wrapping up now | 19:28 |
| fungi | and opendev-buildset-registry has resumed | 19:34 |
| fungi | time to see if gerritbot is working... | 19:35 |
| opendevreview | Merged opendev/system-config master: Update gitea to 1.25.2 https://review.opendev.org/c/opendev/system-config/+/968245 | 19:35 |
| fungi | saw it in both irc and matrix, so seems to have worked | 19:35 |
| fungi | and now infra-prod-service-gitea is starting | 19:38 |
| fungi | gitea09 has restarted according to the log | 19:39 |
| clarkb | it is loading for me through the socks proxy | 19:41 |
| clarkb | I'm going to now do a port forward and test cloning | 19:41 |
| clarkb | clone works for me | 19:43 |
| clarkb | 09-11 are done at this point | 19:44 |
| clarkb | all servers should be updated now | 19:50 |
| clarkb | web ui checks look good to me | 19:50 |
| clarkb | clone worked on gitea09. I think the last thing to check is replication. I'll wait patiently for a new patchset/change to check on | 19:50 |
| fungi | yeah, everything's working for me through haproxy and looks fine | 19:51 |
| clarkb | https://opendev.org/airship/deckhand/commit/bf362c48ba0cbc19d11cb29a531dc343ee702e4e was pushed with at least 4 of 6 upgraded. I am talking to gitea10 so I think replication is working but I'll wait for another patchset/change after all 6 updated just to be sure | 19:51 |
| clarkb | https://opendev.org/openstack/governance/commit/df430585dbafb5a3d0889b369de7488955590c61 arrived at 19:51 ish so I think it was after all 6 completed | 19:53 |
| clarkb | from https://review.opendev.org/c/openstack/governance/+/969145 | 19:53 |
| clarkb | next on my todo list is our meeting agenda. Let me know if there are updates you want to see. I will add infos to the gerrit upgrade topic, the move to matrix topic, and drop the gitea 1.25.2 upgrade since it is now done | 19:56 |
| clarkb | tracing02's uptime is reported to be 18 minutes and the web service responds to me now | 21:02 |
| clarkb | fungi: I'm guessing you didn't do anything to make that happen and this is just a case of needing to wait for the provider to address it? | 21:02 |
| fungi | clarkb: i did not, no | 21:10 |
| clarkb | dmsimard[m]: do we want to discuss ara things more tomorrow? I think we addressed some of the immediate questions. Not sure if there is new info we want to discuss since then | 21:25 |
| dmsimard[m] | I am working on some of the ara backlog and this is coming up soon enough now but I don't have progress to report | 21:31 |
| dmsimard[m] | <this> as in testing around ways to retrieve the databases from zuul artifacts | 21:31 |
| clarkb | dmsimard[m]: ack no rush. I'm just putting together an agenda for tomorrow and didn't want to miss it like last time if there was something to note | 21:31 |
| dmsimard[m] | sure | 21:31 |
| dmsimard[m] | I did talk to Arnaud about the flavors and upgrading the aggregate though, it is not as simple as I thought it would be | 21:33 |
| dmsimard[m] | we have to put it on our roadmap and get some other folks involved | 21:33 |
| dmsimard[m] | it can be done, just not as easily as I believed, like a lot of things in life | 21:33 |
| clarkb | dmsimard[m]: yes, he indicated that it was complicated for historical reasons. We're happy to work with you all on that but realize it's largely on you to do the hard part | 21:34 |
| dmsimard[m] | I think on your end we might just need to disable CI on the ovh regions temporarily while we swap out things underneath when we are ready to proceed | 21:35 |
| clarkb | yup and then maybe update configuration to point at new flavor names etc | 21:35 |
| dmsimard[m] | right | 21:36 |
| clarkb | I have to do a school run in a few minutes so will be popping out | 21:55 |
| fungi | rackspace notified us 5 minutes ago that tracing02 should be back online (which of course we already confirmed some time ago) | 22:35 |
| fungi | so, like, 1.5 hours later | 22:35 |
| clarkb | infra-root any thoughts on https://review.opendev.org/c/opendev/system-config/+/968270 to make zuul upgrades and reboots more reliable? fungi in particular may have thoughts on not checking for apt failures specific to the lock file being held (maybe there is some failure case where retrying would be dangerous? But I don't think so as apt should avoid making changes in an error state | 22:57 |
| clarkb | without intervention) | 22:57 |
| clarkb | ok last call on meeting agenda items. I'll get that sent out shortly if there are no suggestions for further edits | 23:09 |
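
A minimal sketch of the narrower retry policy raised at 22:57, where apt is retried only when the failure looks like a held lock file. It is not taken from change 968270. The marker strings, `is_lock_failure`, and `run_with_lock_retry` are hypothetical names chosen for illustration, and the injected `run` callable stands in for whatever actually invokes apt.

```python
import time
from typing import Callable, Tuple

# Assumption: substrings apt/dpkg commonly print when another process holds
# a lock. Adjust to the messages actually seen on the target hosts.
LOCK_MARKERS = (
    "Could not get lock",
    "Unable to acquire the dpkg frontend lock",
    "Unable to lock",
)


def is_lock_failure(returncode: int, stderr: str) -> bool:
    """Return True if a failed command looks like it hit a held lock file."""
    return returncode != 0 and any(marker in stderr for marker in LOCK_MARKERS)


def run_with_lock_retry(
    run: Callable[[], Tuple[int, str]],
    retries: int = 3,
    delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str, int]:
    """Call run() until it succeeds, fails for a non-lock reason, or the
    retry budget is used up.

    run() must return (returncode, stderr). The result is the final
    (returncode, stderr, attempts), where attempts counts every call made.
    """
    attempts = 0
    while True:
        attempts += 1
        returncode, stderr = run()
        if returncode == 0:
            return returncode, stderr, attempts
        # Any failure that is not lock contention is returned immediately,
        # so a genuinely broken package state is never retried blindly.
        if not is_lock_failure(returncode, stderr) or attempts > retries:
            return returncode, stderr, attempts
        sleep(delay)
```

In practice `run` could wrap `subprocess.run([...], capture_output=True, text=True)` and return its `returncode` and `stderr`. Whether the extra check is worth it over retrying on any apt failure is the open question in the log.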