Thursday, 2026-03-05

-@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 978980: Add Michal Nasiadka to base_users on all hosts https://review.opendev.org/c/opendev/system-config/+/97898006:33
@scott.little:matrix.orgseeing an odd inconsistency in gerrit this morning.   https://review.opendev.org/q/topic:%22concurrent-build%22  shows two code reviews with CR-1, but when I click on them there are no Code reviews on the current iteration.15:02
@scott.little:matrix.orgreplicated in a different machine/browser.  So it's not a caching effect15:04
@scott.little:matrix.org* replicated in a different machine/browser.  So it's not a caching effect within the browser15:04
-@gerrit:opendev.org- Pierre Riteau proposed: [opendev/irc-meetings] 979022: Add Matt Crees as Blazar meeting chair https://review.opendev.org/c/opendev/irc-meetings/+/97902215:05
@fungicide:matrix.orgwe did just upgrade gerrit yesterday, around the time that patchset was pushed... i bet one of gerrit's server-side caches is inconsistent (i did force a reindex after the upgrade but maybe it didn't cover the necessary index for that data?)15:07
-@gerrit:opendev.org- Pierre Riteau proposed: [opendev/irc-meetings] 979022: Add Matt Crees as Blazar meeting chair https://review.opendev.org/c/opendev/irc-meetings/+/97902215:09
@fungicide:matrix.orgwe can try rerunning the reindex too, but i'll wait for some other gerrit admins to be around first since this isn't a critical issue and likely only affects things pushed right when we were restarting the service15:09
@mnasiadka:matrix.orgscott.little: seems to work for me (there's one with CR-1 right now)15:21
@fungicide:matrix.orgmnasiadka: yeah, when i initially looked at that query result, 977874 was showing a code review -1 while clicking into the change that seemed to be a stale vote from the prior patchset15:42
@fungicide:matrix.orgi suspect that when scott.little added several requested reviewers to the change, that caused whatever was cached for the query result to finally refresh15:43
@fungicide:matrix.orgbecause it looks correct to me too now15:43
@fungicide:matrix.orgpopping out to lunch, back shortly16:20
@clarkb:matrix.orgyes if the issue was stale index data then actions on the change that would trigger updates may bring it back into sync16:25
@clarkb:matrix.orgfungi: and other admins: if we see things like this in the future related to specific changes I believe we can reindex the change data for a particular change. https://gerrit-review.googlesource.com/Documentation/cmd-index-changes.html this is the command (it is slightly different from the rebuild the whole index command)16:27
@clarkb:matrix.orgI think for cases like this we would want to start with ^ doing a specific change reindex request16:27
@fungicide:matrix.orggood call, that would help us narrow down what index is at fault17:26
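The per-change reindex discussed above goes through Gerrit's SSH command `gerrit index changes <CHANGE> [<CHANGE> ...]`, as described in the linked documentation. A minimal sketch of assembling that invocation follows. The helper name `build_reindex_command` and the `admin` account are hypothetical, and port 29418 is assumed because it is Gerrit's default SSH port; the sketch only builds and prints the command, it does not run it.

```python
import shlex


def build_reindex_command(change_numbers, host="review.opendev.org",
                          port=29418, user="admin"):
    """Return the argv list for reindexing specific changes over Gerrit SSH.

    `user` is a placeholder account name and `port` is Gerrit's default
    SSH port; both are assumptions, not values taken from the discussion.
    """
    if not change_numbers:
        raise ValueError("at least one change number is required")
    # int() rejects anything that is not a plain change number.
    changes = [str(int(n)) for n in change_numbers]
    return ["ssh", "-p", str(port), f"{user}@{host}",
            "gerrit", "index", "changes", *changes]


# 977874 is the change whose stale vote was mentioned earlier in the channel.
print(shlex.join(build_reindex_command([977874])))
```

Targeting a single change keeps the operation cheap and, as noted above, helps narrow down which index held the stale data.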
-@gerrit:opendev.org- Zuul merged on behalf of Pierre Riteau: [opendev/irc-meetings] 979022: Add Matt Crees as Blazar meeting chair https://review.opendev.org/c/opendev/irc-meetings/+/97902217:52
@fungicide:matrix.orgi'll note that i forgot to perform my followup test to see if my client's forbidden entry for docs.opendev.org lasted past an hour yesterday, so i did it just now almost 20 hours later and a wget of https://docs.opendev.org/ still returns `ERROR 403: Forbidden.`17:53
@fungicide:matrix.orgseems like the `SecCollectionTimeout 86400` in a new /etc/modsecurity/collection-timeout.conf file did the trick17:54
@fungicide:matrix.orgi'll get a change going to encode it in ansible17:54
@fungicide:matrix.orgi'll try again in a few hours after the one-day mark and see if it's back to working again17:55
@clarkb:matrix.orgAny change in conntrack counts?18:01
@fungicide:matrix.orgnot really, 381508 right now18:02
@fungicide:matrix.orgi expect most of them are from when things were going sideways and sessions never got cleanly shut down18:02
@fungicide:matrix.orgmy guess is that we have a floor of ~380k stale connections being tracked18:03
@fungicide:matrix.orgserver status suggests that only ~25% of the apache worker slots are reading requests or logging, and the other ~75% are waiting for connections or open slots with no worker process running18:04
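A breakdown like the one above can be read off the mod_status scoreboard, where each character is one worker slot (`R` reading request, `L` logging, `_` waiting for connection, `.` open slot with no current process). A rough sketch of that tally follows; the function name and the sample scoreboard string are invented for illustration and are not data from the server.

```python
from collections import Counter

# Scoreboard characters as documented for Apache's mod_status.
READING_OR_LOGGING = {"R", "L"}
WAITING_OR_OPEN = {"_", "."}


def summarize_scoreboard(scoreboard):
    """Return the fraction of worker slots in each group of states."""
    total = len(scoreboard)
    if total == 0:
        raise ValueError("empty scoreboard")
    counts = Counter(scoreboard)
    busy = sum(counts[key] for key in READING_OR_LOGGING)
    idle = sum(counts[key] for key in WAITING_OR_OPEN)
    return {
        "reading_or_logging": busy / total,
        "waiting_or_open": idle / total,
        "other": (total - busy - idle) / total,
    }


# Made-up scoreboard: 3 slots reading/logging, 9 waiting or open.
sample = "RRL____....."
print(summarize_scoreboard(sample))
```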
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed wip: [opendev/system-config] 978956: Expire mod_security collection entries in one day https://review.opendev.org/c/opendev/system-config/+/97895621:47
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org marked as active: [opendev/system-config] 978956: Expire mod_security collection entries in one day https://review.opendev.org/c/opendev/system-config/+/97895621:48
@fungicide:matrix.orgtested again ~25 hours after getting my client blocked, and i was back to a normal http/200 ok from https://docs.opendev.org/22:46
@jim:acmegating.comlast weekend, an error in a zuul zk schema upgrade broke opendev's zuul, and to correct that for other zuul users, i made another change to zuul that will also break opendev (basically, it's a partial revert, so it breaks it backwards).  i'm wondering if now would be a good time to do a zuul scheduler/launcher restart to force that breakage now.  the symptom is just that node requests will hang until i do a full reconfiguration.  zuul is fairly busy now, but slowing down, and i think this might be a good time to do that in a controlled manner.  thoughts?22:53
@fungicide:matrix.orgseems to me like it should be fine22:54
@clarkb:matrix.orgyes usually pacific time afternoons things tend to quiet22:54
@jim:acmegating.comstill some hours in my day so if it goes wrong i'm not dumping it on anyone else :)22:54
@jim:acmegating.comcool, i'll get started on that then22:54
@clarkb:matrix.orgcorvus: this will only impact jobs that are waiting for nodes or will it impact running jobs too?22:54
@jim:acmegating.comi don't think i'll announce anything since it should just manifest as a delay22:54
@jim:acmegating.comwaiting for nodes22:55
@clarkb:matrix.orgthen ya impact should be minimal and as you say just a longer delay I think that is fine22:55
@jim:acmegating.comwe didn't even lose any builds/results during the unexpected breakage over the weekend22:55
@jim:acmegating.comif google's gerrit is slow again, we're going to find out about it again though22:56
@jim:acmegating.com(the reconfigs over the weekend were how i noticed that)22:56
@clarkb:matrix.orgyour curl test is currently quick and gerrit.googlesource.com loaded for me in a reasonable amount of time22:57
@fungicide:matrix.orgthe more i look at the bogus urls that are getting denied on docs.openstack.org, the more i'm starting to think that this is actually a brute-force scan looking for unlinked pages by taking the path components from related urls and randomly rearranging/recombining/duplicating them23:08
@clarkb:matrix.orgoohhhhh that would certainly explain the url path construction23:08
@clarkb:matrix.organy similar insights with the lists crawler behavior?23:09
@fungicide:matrix.orge.g. one of the most requested url base paths is /developer/diskimage-builder/user_guide/elements which obviously does not exist but /developer/diskimage-builder/user_guide and /developer/diskimage-builder/elements both exist23:09
@fungicide:matrix.orgi haven't even started delving into lists yet23:09
@fungicide:matrix.orgright now i'm analyzing the most frequent requests on docs.openstack.org to put together a short hitlist of tripwires23:10
@fungicide:matrix.org(by "exist" i mean redirect to something similar, e.g. /diskimage-builder/latest/user_guide and /diskimage-builder/latest/elements)23:11
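The recombination pattern described above can be illustrated with a small heuristic: a requested path is suspicious when it is not itself a known path but every one of its components appears in some known path. This is only a sketch of the hypothesis, not what the proposed WAF rules do, and the function names are hypothetical.

```python
def components(path):
    """Split a URL path into its non-empty components."""
    return [part for part in path.split("/") if part]


def looks_recombined(requested, known_paths):
    """True if `requested` is unknown but built only from known components."""
    known = {tuple(components(p)) for p in known_paths}
    vocabulary = {part for parts in known for part in parts}
    parts = tuple(components(requested))
    if not parts or parts in known:
        return False
    return all(part in vocabulary for part in parts)


# The two paths that exist (as redirects), taken from the example above.
known = [
    "/developer/diskimage-builder/user_guide",
    "/developer/diskimage-builder/elements",
]
print(looks_recombined("/developer/diskimage-builder/user_guide/elements", known))
print(looks_recombined("/developer/diskimage-builder/user_guide", known))
```

The first call reports the nonexistent combined path as recombined; the second does not flag a path that really exists.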
@jim:acmegating.comdoing the full reconfig now23:15
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [opendev/system-config] 979089: Add WAF rules for docs.openstack.org https://review.opendev.org/c/opendev/system-config/+/97908923:16
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [opendev/system-config] 979090: Add our tripwire SecRule to docs.openstack.org https://review.opendev.org/c/opendev/system-config/+/97909023:19
-@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [opendev/system-config] 978118: Add WAF rule for developer.openstack.org https://review.opendev.org/c/opendev/system-config/+/97811823:21
@fungicide:matrix.orgokay, those are split up the way we want now, i think23:22
@clarkb:matrix.orgfungi: it's like they are doing a full matrix combo on all of the path components23:22
@fungicide:matrix.orgyes, some of them get crazy long as a result23:23
@fungicide:matrix.organd if you don't truncate the requested paths, there are comparatively few repeats23:23
@fungicide:matrix.orgi'll try to work on related bits for lists.opendev.org tomorrow23:24
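The point about truncation can be sketched as follows: counting full request paths yields few repeats, while counting only the leading components surfaces the heavily requested base paths. The depth of 4 components and the sample paths are assumptions made up for illustration, not values from the real access logs.

```python
from collections import Counter


def base_path(path, depth=4):
    """Keep only the first `depth` components of a URL path."""
    parts = [part for part in path.split("/") if part]
    return "/" + "/".join(parts[:depth])


def most_requested(paths, depth=4, top=3):
    """Return the most common truncated base paths with their counts."""
    return Counter(base_path(p, depth) for p in paths).most_common(top)


# Invented request paths in the style of the recombined URLs.
sample = [
    "/developer/diskimage-builder/user_guide/elements/elements/user_guide",
    "/developer/diskimage-builder/user_guide/elements/developer",
    "/developer/diskimage-builder/user_guide/elements/user_guide/elements/elements",
    "/developer/diskimage-builder/elements/user_guide/developer",
]
print(len(set(sample)))        # every full path is distinct
print(most_requested(sample))  # truncated paths repeat
```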
@clarkb:matrix.orgfungi: I noted one thing about the id: values in the rules. I was under the impression they need to be unique. I didn't -1 because if this deploys then maybe it is fine? But also maybe we should double check the docs23:25
@clarkb:matrix.orghttps://github.com/owasp-modsecurity/ModSecurity/wiki/Reference-Manual-(v2.x)#id is the documentation for that I think23:26
@clarkb:matrix.orgwhat isn't clear to me is if those values need to be unique or if they can be for better logging etc23:27
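Whether or not ModSecurity enforces uniqueness, repeated `id:` values can be spotted before deployment with a simple scan of the rule text. The sketch below is a rough check under stated assumptions: the regular expression is a simplification that only looks for `id:<digits>` actions on non-comment lines, the function name is hypothetical, and the sample rules are invented rather than copied from the proposed changes.

```python
import re
from collections import Counter

# Simplified match for an `id:NNN` action, optionally quoted.
ID_ACTION = re.compile(r"\bid\s*:\s*'?(\d+)'?")


def duplicate_rule_ids(config_text):
    """Return rule ids that appear more than once, ignoring comment lines."""
    ids = []
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        ids.extend(ID_ACTION.findall(stripped))
    counts = Counter(ids)
    return sorted((i for i, n in counts.items() if n > 1), key=int)


# Invented example rules; the ids and patterns are placeholders.
sample_rules = "\n".join([
    'SecRule REQUEST_URI "@beginsWith /a" "id:1001,phase:1,deny,status:403"',
    'SecRule REQUEST_URI "@beginsWith /b" "id:1001,phase:1,deny,status:403"',
    'SecRule REQUEST_URI "@beginsWith /c" "id:1002,phase:1,deny,status:403"',
    '# SecRule REQUEST_URI "@beginsWith /d" "id:1002,phase:1,deny"',
])
print(duplicate_rule_ids(sample_rules))
```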
@jim:acmegating.comall the required zuul components should be up and running with a valid config now, so the delay window should be over (i'm still finishing up restarting some redundant components)23:38
@jim:acmegating.comand i see previously queued jobs now running on newly-built nodes23:42
@jim:acmegating.com#status log restarted zuul web, scheduler, launcher components and performed a full-reconfiguration23:53
@status:opendev.org@jim:acmegating.com: finished logging23:53
@jim:acmegating.comthat should be all done now, status page lgtm23:54
@clarkb:matrix.orgthanks!23:56
