| | | |
|---|---|---|
| -@gerrit:opendev.org- Michal Nasiadka proposed: [openstack/project-config] 978566: propose-updates: Add pcu target https://review.opendev.org/c/openstack/project-config/+/978566 | 07:58 | |
| @vurmil:matrix.org | today again: ...kayobe.git': gnutls_handshake() failed: The TLS connection was non-properly terminated. | 10:48 |
| @vurmil:matrix.org | the second time it worked | 10:48 |
| @vurmil:matrix.org | now: https://opendev.org/openstack - ERR_CONNECTION_TIMED_OUT (tested 2 different networks) | 10:59 |
| -@gerrit:opendev.org- Zuul merged on behalf of Clark Boylan: [opendev/system-config] 983221: Drop Bionic package mirrors for Ubuntu, UCA, and Puppetlabs https://review.opendev.org/c/opendev/system-config/+/983221 | 12:09 | |
| @clarkb:matrix.org | the irc side seems to be logging still | 16:35 |
| @clarkb:matrix.org | but looking at the matrix-eavesdrop service log it seems to completely miss all of the content from the last few hours | 16:37 |
| @clarkb:matrix.org | fungi: statusbot has a traceback in the .1 logfile from the 4th of march and hasn't recorded anything since. But it did successfully send your notice yesterday for the gerrit update | 16:38 |
| @fungicide:matrix.org | this happened the other day as well, then out of the blue they woke up and processed all the channel backlog and statusbot acted on the command | 16:39 |
| @clarkb:matrix.org | huh | 16:39 |
| @clarkb:matrix.org | corvus: ^ I think you understand the matrix protocol better than we do. Any hunches for what might be going on here or things to check? | 16:39 |
| @fungicide:matrix.org | i wonder if the client connection times out, then gets the backlog when it finally reconnects, or something like that | 16:39 |
| @clarkb:matrix.org | ya I wonder if matrix scrollback history support means if we restart these bots they'll come up and start processing the backlog immediately | 16:40 |
| @fungicide:matrix.org | i worry that restarting them will lose the backlog because they'll only process things that arrive after the process starts, but that if they reconnect on their own eventually they'll recover the history | 16:42 |
| @fungicide:matrix.org | like happened the other day | 16:42 |
| @jim:acmegating.com | looking | 16:44 |
| @jim:acmegating.com | well, eavesdrop looks like it recovered as soon as you started talking about it: https://meetings.opendev.org/irclogs/%23opendev/%23opendev.2026-04-07.log | 16:49 |
| @jim:acmegating.com | if there was a network issue, i would expect the messages to eventually arrive, potentially out of order | 16:49 |
| @clarkb:matrix.org | huh with a gap. I wonder if it will add the missing content later | 16:49 |
| @clarkb:matrix.org | ya the timestamps will likely be correct but out of order on the page. I think that is fine for eavesdrop | 16:50 |
| @jim:acmegating.com | yeah, i don't know if the issue is related to the homeserver or the client library | 16:50 |
| @jim:acmegating.com | the bots are on a homeserver that none of us are on, so it's conceivable that homeserver hasn't gotten the older messages | 16:51 |
| @jim:acmegating.com | (while we have) | 16:51 |
| @jim:acmegating.com | but i also wonder if the way the client library is written, it ignores older messages | 16:51 |
| @jim:acmegating.com | so, perhaps when the homeserver re-syncs, we just get the one newest message and don't backfill the older ones | 16:52 |
| @jim:acmegating.com | #status log statusbot check | 16:52 |
| @status:opendev.org | @jim:acmegating.com: finished logging | 16:52 |
| @clarkb:matrix.org | ah that would explain the catch up and then statusbot not doing anything | 16:52 |
| @clarkb:matrix.org | corvus: I think that confirms your theory | 16:53 |
| @jim:acmegating.com | yeah... i think maybe let's go with that for a working hypothesis for now; and if we get a giant dump of messages in the logs in a few hours, that will invalidate it. if not, maybe we can look deeper into the framework and see if there's something we can do about it | 16:53 |
| @jim:acmegating.com | (like, set a flag to tell it to give us old messages, or maybe implement a callback that handles old messages) | 16:54 |
| @clarkb:matrix.org | fungi: do you want to reissue your status message now? | 16:55 |
| @clarkb:matrix.org | corvus: ^ I think under the current theory that may be safe? | 16:56 |
| @fungicide:matrix.org | so we don't think it's probably going to eventually see the earlier one and process it after all? | 16:58 |
| @clarkb:matrix.org | fungi: that seems to be the theory right now. I suppose that message isn't super critical where we can't wait longer to see if it does post | 17:00 |
| @clarkb:matrix.org | basically I'm ok waiting if we think observing that is useful | 17:01 |
| @fungicide:matrix.org | #status notice Load on the opendev.org Gitea backends is under control again for now, if any Zuul jobs failed with SSL errors or disconnects reaching the service prior to 16:15 UTC they can be safely rechecked | 17:02 |
| @status:opendev.org | @fungicide:matrix.org: sending notice | 17:02 |
| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/zone-opendev.org] 983600: Remove DNS entries for mirror02.bhs1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983600 | 17:02 | |
| @fungicide:matrix.org | seems to have spotted it now, yeah | 17:02 |
| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 983602: Remove mirror02.bhs1.ovh https://review.opendev.org/c/opendev/system-config/+/983602 | 17:03 | |
| -@status:opendev.org- NOTICE: Load on the opendev.org Gitea backends is under control again for now, if any Zuul jobs failed with SSL errors or disconnects reaching the service prior to 16:15 UTC they can be safely rechecked | 17:05 | |
| @status:opendev.org | @fungicide:matrix.org: finished sending notice | 17:05 |
| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/zone-opendev.org] 983609: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983609 | 17:47 | |
| -@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 983609: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983609 | 17:48 | |
| -@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 983609: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983609 | 17:49 | |
| -@gerrit:opendev.org- Michal Nasiadka proposed wip: [opendev/zone-opendev.org] 983609: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983609 | 17:50 | |
| -@gerrit:opendev.org- Michal Nasiadka marked as active: [opendev/zone-opendev.org] 983609: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/zone-opendev.org/+/983609 | 17:50 | |
| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 983610: Add mirror04.gra1.ovh https://review.opendev.org/c/opendev/system-config/+/983610 | 17:51 | |
| @clarkb:matrix.org | I've just run through the Gerrit 3.11 -> 3.12 upgrade then the 3.12 -> 3.11 downgrade on my held test server and updated the document in https://etherpad.opendev.org/p/gerrit-upgrade-3.12 as expected nothing really changed other than version numbers | 18:24 |
| @fungicide:matrix.org | somebody reported a ua filter false-positive in #openstack-dev if anyone has time to look into it | 18:24 |
| @clarkb:matrix.org | I'm not even in that channel, but I'll join and see if I can followup | 18:25 |
| @clarkb:matrix.org | re Gerrit upgrade I think things continue to look good as there are no unexpected side effects from the new bugfix images | 18:25 |
| @clarkb:matrix.org | we figured out the problem in #openstack-dev. fungi's user agent exclusions from earlier today included current chrome browsers on os x and windows. I've removed those rules and reloaded apache on all the gitea backends | 18:59 |
| @fungicide:matrix.org | yep, thanks for digging into that! i'm also distracted by other concurrent mini-emergencies and also trying to catch up from an entirely derailed morning | 19:00 |
| -@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: | 19:34 | |
| - [opendev/system-config] 983630: Increase Apache worker capacity on Gitea backends https://review.opendev.org/c/opendev/system-config/+/983630 | ||
| - [opendev/system-config] 983631: Add new bots to our Apache user agent blocklist https://review.opendev.org/c/opendev/system-config/+/983631 | ||
| @clarkb:matrix.org | fungi: 983630 lgtm but I didn't approve it in case we want to squash changes. I left some notes on https://review.opendev.org/c/opendev/system-config/+/983631 about things we may want to cleanup before applying everywhere | 20:00 |
| @clarkb:matrix.org | mnasiadka: your changes lgtm. I did have a question about whether or not the ipv6 address was handled by launch node for the gra1 node. But otherwise +2 from me on both sets of changes | 20:04 |
| @clarkb:matrix.org | and now I'm going to eat lunch | 20:04 |
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 983641: Record filesystem free space in dstat https://review.opendev.org/c/zuul/zuul-jobs/+/983641 | 20:06 | |
| -@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [opendev/zuul-providers] 982182: Add Ubuntu resolute image build job https://review.opendev.org/c/opendev/zuul-providers/+/982182 | 20:07 | |
| @jim:acmegating.com | Clark: mnasiadka ^ maybe that will tell us more info about the resolute build | 20:07 |
| @clarkb:matrix.org | Ack thanks | 20:10 |
| @fungicide:matrix.org | Clark: does google really identify its search engine crawler as android on a nexus 5x phone? | 20:15 |
| @fungicide:matrix.org | i guess that shouldn't surprise me | 20:16 |
| @fungicide:matrix.org | similarly applebot claiming to be mac os x | 20:20 |
| @fungicide:matrix.org | rather than a server operating system like darwin | 20:21 |
| @fungicide:matrix.org | i assumed those were other bots trying to exploit allowances some sites make for search engine crawlers | 20:22 |
| @clarkb:matrix.org | fungi: yes I think those are legit. I didn't triple check but I remember running into similar (if not specific details) previously | 20:24 |
| @fungicide:matrix.org | so very, very weird | 20:24 |
| @clarkb:matrix.org | fungi: I think this is what to double check against for google https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers | 20:24 |
| @fungicide:matrix.org | once upon a time, search engine crawlers just identified as themselves without including a litany of browser and platform names | 20:25 |
| @fungicide:matrix.org | so very suspicious to my eyes | 20:25 |
| @clarkb:matrix.org | do we want to go ahead and approve the first change and then hope for the best after we reset the ua filter rules? or do we want to squash the changes? or maybe some usage of the emergency file to avoid early application? | 20:44 |
| @clarkb:matrix.org | Mostly want to figure out a plan so that we can move forward with minimal disruption | 20:45 |
| @clarkb:matrix.org | fungi: also I'm looking at the gitea role and we notify the `gitea Reload apache2` handler when we update the apache vhost. But we don't seem to restart Gitea when updating its app.ini file. So I think we do have the problem of restarting apache expecting things to be http now instead of https but gitea itself will be running in https mode and that will fail to proxy | 20:48 |
| @clarkb:matrix.org | I think we can mitigate that by updating the when condition that restarts gitea when container images update: `when: pre_pull_image_ids.stdout_lines|sort != post_pull_image_ids.stdout_lines|sort` to include a condition for when app.ini has updated? | 20:49 |
| @clarkb:matrix.org | when images are not the same or when app.ini has updated | 20:49 |
| @fungicide:matrix.org | so we may need to split that change into gitea configuration vs apache configuration, with an explicit restart step in between | 20:50 |
| @fungicide:matrix.org | oh, or add more restart logic, sure | 20:50 |
| @clarkb:matrix.org | ya I think both halves need to be updated relatively close to one another temporally. Otherwise we'll be half broken either way | 20:50 |
| @clarkb:matrix.org | the playbook runs against each backend one at a time. So if we want to be extra careful we can put all backends in the emergency file, approve it and let it apply to one node. Check that it is working then proceed with the others | 20:51 |
| @clarkb:matrix.org | The change shouldn't impact comms with haproxy so I think this approach would be safe | 20:51 |
| @clarkb:matrix.org | and if it looks good remove the others from the emergency file and reenqueue the deployment to run it against them all | 20:51 |
| @fungicide:matrix.org | yeah, i tried to leave it entirely transparent to the load balancer | 20:52 |
| @clarkb:matrix.org | so ya I think if we update the change to restart gitea when app.ini updates via the existing restart gitea block that should cover this. And if we want to be extra careful we can apply it to one server at first and check it is working properly before applying it to all of them via emergency file manipulation | 20:53 |
| @clarkb:matrix.org | but I feel like that is a tomorrow problem. Today's problem is getting the giteas synced up with the manual intervention we did earlier today. Do we want to proceed with the connection tuning change without the UA filter change? or do we want to try to land them together (either as separate changes or squashed)? or do we want to put things in the emergency file and make that a tomorrow problem too? | 20:57 |
| @fungicide:matrix.org | my biggest concern is that landing the connection tuning change will undo the ua additions, in which case squashing or putting them all in emergency so we can deploy both together would make sense | 20:59 |
| @clarkb:matrix.org | yup I think that is the primary concern. I'm happy to approve a squashed change or put servers in the emergency file so that we can land them separately and have only the second one apply | 21:00 |
| @fungicide:matrix.org | since they're already running what we have in those changes, i'm not too concerned about needing to have a staged deployment to a smaller subset of backends | 21:01 |
| @clarkb:matrix.org | yes for this change I don't think we need that | 21:02 |
| @clarkb:matrix.org | I did update my review of https://review.opendev.org/c/opendev/system-config/+/983134 suggesting that as an option for when we go to http gitea and termination in apache | 21:03 |
| -@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [opendev/system-config] 983630: Increase Apache worker capacity on Gitea backends https://review.opendev.org/c/opendev/system-config/+/983630 | 21:09 | |
| @clarkb:matrix.org | thanks +2 from me | 21:10 |
| @clarkb:matrix.org | fungi: though maybe we should go ahead and approve it? Hrm it is running static jobs now so I wonder if those will fail on lp? We shall soon find out I guess | 21:13 |
| @clarkb:matrix.org | if it does fail maybe we mark that job nonvoting? I hate to do it but the ua filter gets plenty of coverage from the other jobs it triggers so this is probably fine. Or maybe we just drop ua filters from the static job file list | 21:15 |
| @clarkb:matrix.org | yup it is failing. I'll push a quick update that drops the static job from matching on ua filter updates | 21:34 |
| @clarkb:matrix.org | hrm zuul is failing too | 21:34 |
| @fungicide:matrix.org | standing by to approve | 21:34 |
| @clarkb:matrix.org | that was not expected. Oh except that zuul executor also installs openafs | 21:34 |
| @clarkb:matrix.org | let me just double check openafs is the problem in both cases and I can drop both from the job list. Lists and gitea should be sufficient coverage for now | 21:34 |
| @clarkb:matrix.org | fungi: I was just going to add it onto your change. or do you think it should be a parent change? | 21:35 |
| @clarkb:matrix.org | except do we run the job if we edit its file matchers anyway because the job changes? | 21:35 |
| @clarkb:matrix.org | yes it was openafs in both cases | 21:37 |
| -@gerrit:opendev.org- Clark Boylan proposed on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [opendev/system-config] 983630: Increase Apache worker capacity on Gitea backends https://review.opendev.org/c/opendev/system-config/+/983630 | 21:39 | |
| @clarkb:matrix.org | here's hoping zuul doesn't run those jobs anyway | 21:39 |
| @clarkb:matrix.org | it doesn't! So I think this is hopefully in a mergeable state now | 21:40 |
| @clarkb:matrix.org | fungi: I +2'd; you have plenty of time before check tests are done to approve it | 21:40 |
| @clarkb:matrix.org | While we wait for that I'll try to pop out for a bike ride in the next 15-30 minutes or so. I figure this is my opportunity. When I get back I can stick hosts in the emergency file before the 02:00 UTC periodic runs if we're unable to make progress towards reconciling the configs before then | 21:45 |
| @fungicide:matrix.org | sounds good | 21:49 |
| @clarkb:matrix.org | fungi: maybe you want to approve 983630 so that it goes straight to the gate if the lists and gitea check jobs pass? | 21:51 |
| @clarkb:matrix.org | anyway I'm going to go fill water bottles and things now | 21:51 |
| @fungicide:matrix.org | done | 21:52 |
| @fungicide:matrix.org | it's finally in the gate | 23:01 |
| -@gerrit:opendev.org- Zuul merged on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [opendev/system-config] 983630: Increase Apache worker capacity on Gitea backends https://review.opendev.org/c/opendev/system-config/+/983630 | 23:34 | |
| @fungicide:matrix.org | and deploying | 23:38 |
| @fungicide:matrix.org | succeeded | 23:47 |
| @clarkb:matrix.org | I'm just getting back and ya it reports success | 23:47 |
| @clarkb:matrix.org | I can still reach the service | 23:47 |
| @fungicide:matrix.org | i can browse | 23:47 |
| @fungicide:matrix.org | git fetches still work for me | 23:48 |
| @fungicide:matrix.org | seems all is well | 23:48 |
| @fungicide:matrix.org | i think we're probably set, i'm going into evening mode unless there are immediate concerns | 23:50 |
| @clarkb:matrix.org | cool. I can keep my eyes and ears open for issues for a bit longer but I too need to figure out dinner soon | 23:52 |
| @fungicide:matrix.org | thanks! i'll check it all over again in the morning and pick the ssl/anubis changes back up hopefully | 23:58 |
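
The working hypothesis from the 16:51 to 16:54 discussion (the client side ignores messages older than some cutoff, so a late re-sync delivers only the newest one) can be illustrated with a small sketch. This is not the code of the bots or of any Matrix client library: `Event`, `drop_older_than_start`, and `handle_with_backfill` are hypothetical names invented here, and the cutoff rule is an assumption made only to show the difference between the two behaviours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for a room message; not a real library type."""
    event_id: str
    timestamp_ms: int
    body: str


def drop_older_than_start(events, started_at_ms):
    """Assumed current behaviour: ignore anything sent before the cutoff."""
    return [e for e in events if e.timestamp_ms >= started_at_ms]


def handle_with_backfill(events, seen_ids):
    """Alternative: accept late arrivals, skip already-seen IDs, and
    return the new events ordered by their original timestamps."""
    fresh = [e for e in events if e.event_id not in seen_ids]
    seen_ids.update(e.event_id for e in fresh)
    return sorted(fresh, key=lambda e: e.timestamp_ms)


if __name__ == "__main__":
    # A re-sync that delivers the newest message first, then an older one.
    batch = [Event("$b", 200, "#status log check"), Event("$a", 100, "#status notice ...")]
    print([e.event_id for e in drop_older_than_start(batch, 150)])
    print([e.event_id for e in handle_with_backfill(batch, set())])
```

Under the first function the older `#status notice` is silently lost, which matches what was observed. The second is roughly what "implement a callback that handles old messages" might amount to, with de-duplication so a reissued command is not acted on twice.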
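
The restart rule proposed at 20:49 ("when images are not the same or when app.ini has updated") is a two-input decision that is easy to check in isolation. The sketch below restates it in Python purely as an illustration: `gitea_needs_restart` and the `app_ini_changed` flag are hypothetical names, and how the real role would detect an app.ini change is not shown in the log.

```python
def gitea_needs_restart(pre_pull_image_ids, post_pull_image_ids, app_ini_changed):
    """Return True when the Gitea containers should be restarted.

    Mirrors the existing comparison of sorted image ID lists and adds the
    proposed extra trigger for an updated app.ini (assumed to be a boolean).
    """
    images_changed = sorted(pre_pull_image_ids) != sorted(post_pull_image_ids)
    return images_changed or app_ini_changed


if __name__ == "__main__":
    same = ["sha256:aaa", "sha256:bbb"]
    # Same images, but app.ini was rewritten: restart is needed.
    print(gitea_needs_restart(same, list(reversed(same)), app_ini_changed=True))
    # Same images, untouched app.ini: nothing to do.
    print(gitea_needs_restart(same, list(reversed(same)), app_ini_changed=False))
```

Sorting both lists keeps the comparison independent of the order in which image IDs are reported, which is what the `|sort` filters in the quoted condition do.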