| *** mtreinish_ is now known as mtreinish | 00:22 | |
| clarkb | anyone else seeing high packet loss in cogent on the way to review via ipv4? I don't think it is server specific as the mirror in that region seems to have the same problem for me but bridge has no issues back and forth | 00:22 |
|---|---|---|
| clarkb | also pretty sure this isn't my local internet connection acting up as this irc connection is fine as are others | 00:23 |
| clarkb | heh I can't get cogent's status page to load | 00:26 |
| tonyb | I can't even get to review from here ATM | 01:05 |
| tonyb | Oh ping gets 80% packet loss | 01:06 |
| Clark[m] | Ya that's what I was getting but via bridge or my personal host in ovh it seems ok. The cogent status page is unreachable so I don't think it is vexxhost specific | 01:12 |
| Clark[m] | I decided to eat dinner and not worry about it for a bit | 01:13 |
| opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/962557 | 02:12 |
| tonyb | still seeing 14% packet loss to review.o.o, but at least it's functional now | 05:22 |
| mnasiadka | I wonder if there’s some easy way I could debug the “Response is empty” errors when clicking in Gerrit, are the Gerrit server logs available somewhere | 10:02 |
| mnasiadka | (e.g. OpenSearch)? | 10:02 |
| mnasiadka | It might be an OpenJDK TLS handshake bug, or something similar (seen these somewhere else) | 10:02 |
| tonyb | those logs aren't in opensearch. I can take a look in a bit | 10:05 |
| tonyb | how is your connectivity to Gerrit. | 10:06 |
| tonyb | as you probably noticed we've been having problems | 10:06 |
| opendevreview | Ivan Anfimov proposed openstack/project-config master: wip https://review.opendev.org/c/openstack/project-config/+/962925 | 12:04 |
| fungi | yeah, my guess is that the gerrit webclient js in the browser is failing to pull content from the rest api when that happens | 13:22 |
| fungi | probably wouldn't see anything in the server-side logs about that if it's the case, but browser devtools might tell you what's happened to those background requests | 13:23 |
| mnasiadka | fungi: thanks, I can try on Monday - because I think me (and my colleagues) have had that problem for a while | 14:08 |
| clarkb | fungi: I see note of potential replication errors to gitea from gerrit possibly due to the cogent issues (seems likely). You suggested we trigger a full run of gerrit replication once things are stable. They currently look good between me and the mirror node in ca-ymq-1 which was not the case last night and I can reach gerrit. Do we want to find an all clear from cogent or consider | 14:31 |
| clarkb | that good enough and trigger the replication now? | 14:31 |
| fungi | probably good enough | 14:31 |
| fungi | if there turn out to still be problems later we can always replicate again anyway | 14:32 |
| fungi | i also observed serious slowness getting http(s) responses from sites on static02 this morning, even though pings were 100% clean and system load exceptionally (unrealistically?) low, haven't checked the apache scorecard yet but guessing something is tying up worker slots | 14:33 |
| clarkb | do you want to trigger that or should I? I should be able to dig out the incantation in a bit (I need some tea first) | 14:33 |
| fungi | i can, just a sec | 14:33 |
| fungi | reindexer started | 14:35 |
| clarkb | fungi: not reindexer | 14:37 |
| clarkb | we need replication | 14:37 |
| clarkb | (reindexing should be fine, but won't help the giteas) | 14:37 |
| clarkb | though I'm not sure how the two will interact, might be worth waiting for indexing to complete before replicating to ensure that we don't miss any refs in the great git push | 14:39 |
| clarkb | reindexing is about 1/3 of the way through now | 14:45 |
| mnasiadka | static02 seems unresponsive - at least for tarballs.opendev.org | 14:47 |
| fungi | d'oh, sorry clarkb yes. that was my bad | 14:48 |
| fungi | mnasiadka: yes, i think it's overrun with too many parallel downloads, could be a rush for pulling openstack release artifacts now that flamingo is out, now that i think about it | 14:49 |
| clarkb | it did load for me just a bit slow | 14:50 |
| fungi | clarkb: replication started, i guess those tasks will queue up behind the reindex | 14:50 |
| clarkb | it being tarballs | 14:50 |
| clarkb | fungi: I think they use separate threads so should run in parallel | 14:51 |
| clarkb | ya both have tasks that are not labeled 'waiting...' in the show-queue output so I think both are running at the same time | 14:51 |
| fungi | pulling server-status locally on static02 is taking some time | 14:51 |
| mnasiadka | clarkb: kolla-ansible and kayobe ironic CI jobs are failing (time out) when downloading tinyipa image, so I think httpd there is probably having hard time | 14:52 |
| mnasiadka | ICMP looks good at the same time | 14:52 |
| clarkb | wow looks like that is a half gig image | 14:54 |
| fungi | apache scorecard says 149 busy threads, 1 idle at the moment i pulled it | 14:55 |
| fungi | almost all are in the "R" (reading a request) state | 14:55 |
| clarkb | fungi: is that scorecard going to be per vhost or will it cover all vhosts? Just wondering if it shows a complete view or partial view when we compare against the mpm config | 14:55 |
| fungi | it's for the full apache service, all vhosts | 14:56 |
| clarkb | we configure a maximum of 8k connections for the lifetime of a child worker to ensure they get recycled and age out old certs that have been reissued. But I'm not seeing us configure extra mpm workers/threads/slots | 14:57 |
| fungi | the bulk of connections are requesting docs.openstack.org urls | 14:57 |
| clarkb | mods-enabled/mpm_event.conf says max is 150 so ya we're at the limit | 14:58 |
| fungi | the bulk of the connections are from ipv4 addresses belonging to china mobile, chinanet and china unicom | 14:58 |
| clarkb | considering system load maybe we should consider updating the connection tuning config to bump up total connections | 14:58 |
| corvus | the cpu usage is unusually low | 14:59 |
| corvus | i don't think those slots are doing any work, i think they're in a slow read situation | 15:00 |
| fungi | yeah, i suspect system load is sub 0.1 because apache is mostly stuffed up | 15:00 |
| clarkb | ah ya that could explain it | 15:00 |
| fungi | [Fri Oct 3 12:45:04 2025] afs: Waiting for busy volume 536870992 () in cell openstack.org | 15:00 |
| corvus | oh hrm, do we think afs is contributing? | 15:01 |
| fungi | that was the only line in dmesg from today | 15:01 |
| fungi | so... maybe? | 15:01 |
| corvus | oh that might be ignorable though | 15:01 |
| clarkb | if it happened once then ya probably not a persistent issue | 15:01 |
| corvus | doing some basic navigating around the docs volume seems fine | 15:01 |
| corvus | looking at the cacti graphs, i think we have lots of room to increase apache workers (even under normal load) | 15:02 |
| clarkb | maybe we start there and see if that gives us enough headroom to get past the slowread situation and let clients who can read quicker connect? | 15:02 |
| fungi | i'm getting fast responses cat'ing random files from /afs/openstack.org/docs/... on static02 | 15:03 |
| corvus | so i like the idea of doing that, and also, just restarting apache now. the restart might clear out some slow connections and provide immediate relief. | 15:03 |
| clarkb | we already have a stub connection tuning config in place. I can work on a patch to expand that | 15:03 |
| fungi | i can restart apache now, sure | 15:03 |
| fungi | okay, it's restarted | 15:03 |
| fungi | all workers are accepting connections at the moment, according to server-status | 15:05 |
| corvus | some graphs: https://imgur.com/a/th9lGlm | 15:05 |
| fungi | page content is returning quickly for me at the moment too | 15:06 |
| fungi | yeah, i agree we have plenty of room to increase the worker max | 15:06 |
| clarkb | just working on the math now | 15:09 |
| clarkb | there are like 8 tunables that all play off of one another | 15:09 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Increase static.o.o apache thread limits https://review.opendev.org/c/opendev/system-config/+/962973 | 15:16 |
| clarkb | something like that maybe? It is based on the etherpad connection tuning but I tuned it down a bit | 15:16 |
| clarkb | I figured a ~8x increase was probably a good start | 15:16 |
| fungi | lgtm, thanks! | 15:17 |
| clarkb | gerrit reindexing completed with the expected error count of 3 | 15:17 |
| clarkb | looks like replication queues are empty too | 15:18 |
| clarkb | fungi: do we want to put that in place manually and speed up the deployment here? | 15:23 |
| fungi | clarkb: maybe? response times are already slow again | 15:33 |
| fungi | even pulling server-status is taking a while | 15:35 |
| fungi | like 30 seconds | 15:35 |
| fungi | i've manually edited the config for 962973 and restarted apache | 15:37 |
| fungi | i have an appointment i need to get to, but hopefully that'll hold it until the change deploys | 15:38 |
| fungi | bbiaw | 15:38 |
| clarkb | thanks | 15:38 |
| clarkb | the apache process counts are definitely increasing | 15:43 |
| clarkb | I have observed the process count fall from 13 to 12 (thats one parent and 12 or 11 child workers aiui) so I think maybe we've found the equilibrium point? | 15:58 |
| clarkb | static tuning failed on a testinfra test that I think is due to a change in redirects for docs. Looking into fixing that as part of the same change | 16:02 |
| clarkb | https://review.opendev.org/c/openstack/openstack-manuals/+/962684 this change updated the redirect from 301 to 302 and we are looking for 301 | 16:03 |
| clarkb | but we have a lot of 301 checks so I need to make sure this is the only test case affected | 16:03 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Increase static.o.o apache thread limits https://review.opendev.org/c/opendev/system-config/+/962973 | 16:11 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Update docs.openstack.org redirect test https://review.opendev.org/c/opendev/system-config/+/962983 | 16:11 |
| fungi | clarkb: yep, that was me, sorry i didn't realize we were also testing that in system-config | 17:11 |
| clarkb | apache processes have shrunk further | 17:38 |
| fungi | oh good | 17:39 |
| fungi | page content is still loading quickly for me too | 17:40 |
| clarkb | ya I think the high water mark was something like 360 concurrent requests | 17:40 |
| clarkb | which was well above 150 but now that we've served those requests things are falling back down again. We caught up essentially | 17:40 |
| opendevreview | Merged zuul/zuul-jobs master: Fix up some EL10 compatibility https://review.opendev.org/c/zuul/zuul-jobs/+/962194 | 17:43 |
| opendevreview | Merged opendev/system-config master: Update docs.openstack.org redirect test https://review.opendev.org/c/opendev/system-config/+/962983 | 17:44 |
| opendevreview | Merged opendev/system-config master: Increase static.o.o apache thread limits https://review.opendev.org/c/opendev/system-config/+/962973 | 17:45 |
| clarkb | fungi: it does look like ^ restarted apache at 17:49 UTC | 18:05 |
| fungi | agreed | 18:05 |
| clarkb | and we're down to 8 total processes so this seems to be in a happy steady state for now | 18:06 |
| clarkb | fungi: https://review.opendev.org/c/opendev/system-config/+/962826 may or may not do what we want with canonical links for gitea | 18:08 |
| clarkb | fungi: in particular my main concerns are whether or not git is impacted negatively (we do have minimal git clone testing in testinfra for gitea so maybe not?) and whether or not the query parameters have a ? prefixed on the string there | 18:08 |
| fungi | oh, right i was about to look at that this morning before i ended up digging into gerrit and static content issues | 18:09 |
| clarkb | but I think it is to a point where careful review is helpful and we can dig more into those concerns if we need to | 18:09 |
| clarkb | one idea I had is maybe we hold a node and test it with query parameters that way? | 18:09 |
| clarkb | or we can poke around the gitea ui and see if any requests use query parameters and just add that to testinfra test cases | 18:10 |
| clarkb | I think git uses query parameters actually so maybe that is a good test case | 18:10 |
| fungi | i guess $QUERY_STRING includes the leading "?" | 18:13 |
| fungi | but yeah it doesn't look like we test it | 18:14 |
| fungi | unless as you say, git is actually exercising that anyway | 18:14 |
| clarkb | fungi: I just don't know if it does and I think git exercises it in that it will return the header but I suspect git ignores the header too | 18:16 |
| clarkb | fungi: but maybe making a git like request with curl is a good test to add and verify | 18:16 |
| clarkb | I'm going to poke around the web ui first to see if there is a query string used there (maybe via search) | 18:16 |
| clarkb | yup https://opendev.org/opendev/system-config/search?q=foo | 18:17 |
| clarkb | let me update the change to include a test that exercises ^ and confirms the Link header is correct for that path | 18:17 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Set canonical Link paths for gitea resources https://review.opendev.org/c/opendev/system-config/+/962826 | 18:23 |
| clarkb | I'm really hoping we don't need some complicated rule to include the ? optionally | 18:23 |
| clarkb | but that latest patchset should check for us | 18:23 |
| clarkb | Ramereth[m]: Ramereth (not sure if these go to the same place or not) Ironic is asking if anyone is using Power + Ironic + PReP Partitioning support in Ironic. I think you may have power gear but don't know if ironic is involved and figured I would point it out | 18:41 |
| clarkb | Ramereth[m]: Ramereth https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/5RLZTJ2ESGFRVFIS7OUNIE2VOAGHAWZE/ is the email thread | 18:41 |
| Clark[m] | fungi: I'm eating lunch but that test shows it does not add the ? prefix. Do you know how to add that conditionally? | 19:32 |
| Clark[m] | Maybe we can add it in the rewrite rule if the match is non empty? | 19:33 |
| opendevreview | Mathieu Parent proposed opendev/glean master: Use systemd-networkd automatically when enabled https://review.opendev.org/c/opendev/glean/+/963010 | 19:33 |
| fungi | Clark[m]: that's what i was afraid of... yeah maybe a rewritecond on qs being nonempty | 20:05 |
| fungi | then we can add "?$QS" if and only if $QS | 20:05 |
| clarkb | looks like Header can have a conditional expression that might be the clearest understandable way to express this | 20:08 |
| fungi | oh, even better | 20:09 |
| clarkb | ya working on this now | 20:09 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Set canonical Link paths for gitea resources https://review.opendev.org/c/opendev/system-config/+/962826 | 20:14 |
| clarkb | something like that maybe. It isn't clear to me if I need to use %{QS}e in the expr block. I didn't because none of the examples do but we'll find out | 20:15 |
| clarkb | I'm going to pop out for a bike ride while I wait for results on that. Feel free to push an update if it fails on something silly before I get back | 20:21 |
| fungi | will do | 20:26 |
| fungi | have fun! | 20:26 |
| Ramereth[m] | <clarkb> "Ramereth: Ramereth (not sure..." <- We don't use Ironic so we don't use that | 20:26 |
| fungi | we figured you probably didn't, but it was worth double-checking | 20:27 |
| fungi | system-config-run-gitea for 962826,8 failed... | 20:59 |
| fungi | apache-ua-filter : Reload apache2 FAILED | 21:00 |
| fungi | AH00526: Syntax error on line 55 of /etc/apache2/sites-enabled/000-default.conf: Can't parse envclause/expression: Variable 'QS' does not exist | 21:01 |
| fungi | oh, if %{QUERY_STRING} is empty on line 44 we never set QS i guess? | 21:02 |
| fungi | mmm, though how is it going to know that at config parsing time? there's got to be another explanation | 21:03 |
| fungi | though i guess RU doesn't get used in a expr like QS does | 21:05 |
| fungi | maybe we need to preset QS to the empty string? | 21:06 |
| fungi | `Define QS ""` maybe? | 21:10 |
| fungi | https://httpd.apache.org/docs/2.4/mod/core.html#define | 21:10 |
| fungi | er, actually i guess those aren't envvars | 21:17 |
| fungi | maybe what we need to do is "expr=-z %{QUERY_STRING}" | 21:18 |
| fungi | or should we be using `env=[!]varname` instead of `expr=...` | 21:20 |
| opendevreview | Jeremy Stanley proposed opendev/system-config master: Set canonical Link paths for gitea resources https://review.opendev.org/c/opendev/system-config/+/962826 | 21:22 |
| fungi | that ^ switches from expr to env | 21:22 |
| fungi | seems like the cleanest solution, reading https://httpd.apache.org/docs/current/mod/mod_headers.html#header | 21:22 |
| clarkb | fungi: I think I skipped over that because the docs said that is for defined and undefined vars and I think we always define QS? But maybe the rewrite cond for query string matching .* won't match empty string (it should) | 22:17 |
| clarkb | we will find out experimentally I guess and can also change it to .+ potentially so that it doesn't match empty string and is unset if undefined? | 22:18 |
| fungi | it looked to me like we only defined it when QUERY_STRING was nonempty, but alternatively we could go back to `expr=-z %{QUERY_STRING}` | 22:18 |
| clarkb | ya I could be misunderstanding `RewriteCond %{QUERY_STRING} (.*)` | 22:18 |
| clarkb | I would expect that to always match due to the * not + or whatever | 22:19 |
| clarkb | but maybe rewrite conds always fail if the var is empty | 22:19 |
| clarkb | we should know soon enough | 22:19 |
| clarkb | I guess the error message you got indicates QS doesn't exist always | 22:20 |
| clarkb | so ya it mustn't be set in all conditions. | 22:20 |
| fungi | that's the theory i was working on, but it's basically me fumbling around in the dark | 22:27 |
| fungi | the job's close to wrapping up now so we'll know in a few | 22:28 |
| fungi | test_matrix_server and test_matrix_client failed | 22:29 |
| fungi | yeah, i think we're getting a trailing ? when there's no query string, so you're probably right | 22:33 |
| fungi | so do you want to try a stricter RewriteCond or switch to using QUERY_STRING in an expr? | 22:34 |
| fungi | seems like the problem is that QS isn't defined at config parsing time but is always defined during runtime | 22:35 |
| fungi | i'm about due to knock off and have my friday evening | 22:36 |
| Clark[m] | Enjoy! I have some pre travel errands to run too (turns out I really need some shoes for walking around in the rain potentially) | 22:39 |
| Clark[m] | We can pick this up Monday. I doubt we're going to deploy this over the weekend anyway. And ya maybe we try .+ Instead to see if that causes the matcher to fail | 22:39 |
| fungi | yeah, sounds fine. good luck shoe shopping! | 22:41 |
| clarkb | I'll get a new patchset up to use .+ before I go | 22:42 |
| clarkb | oh heh my ssh keys unloaded already. Maybe that is an indication I should just go shopping | 22:42 |
| fungi | cool, i'll try to take a peek at the job results later | 22:42 |
| fungi | oh, i can push it | 22:42 |
| clarkb | no I can use the web ui | 22:42 |
| clarkb | its fine | 22:42 |
| opendevreview | Clark Boylan proposed opendev/system-config master: Set canonical Link paths for gitea resources https://review.opendev.org/c/opendev/system-config/+/962826 | 22:43 |
| clarkb | in theory .+ means we only match if the query string is non empty and that means the rewrite rule only runs when set to set QS which means we can use env check to see if the var is set or not | 22:44 |
| clarkb | fingers crossed | 22:44 |
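
The Apache saturation check from the 14:55 to 14:58 discussion (149 busy threads against a limit of 150, mostly in the "R" state) can be sketched in Python. This is only a minimal illustration, assuming the machine-readable `server-status?auto` output from mod_status. The sample text is made up to mirror the numbers quoted in the log, and `tally_scoreboard` and `summarize` are hypothetical helper names, not anything from system-config.

```python
from collections import Counter

# Scoreboard states that mod_status counts as busy workers.
# "_" is an idle worker waiting for a connection and "." is an open slot.
BUSY_STATES = frozenset("RWKDCLG")


def tally_scoreboard(status_text):
    """Count worker states from the Scoreboard line of server-status?auto."""
    for line in status_text.splitlines():
        if line.startswith("Scoreboard:"):
            board = line.split(":", 1)[1].strip()
            return Counter(board)
    raise ValueError("no Scoreboard line found")


def summarize(counts, max_request_workers):
    """Report busy/idle totals and how close we are to the configured limit."""
    busy = sum(n for state, n in counts.items() if state in BUSY_STATES)
    return {
        "busy": busy,
        "idle": counts.get("_", 0),
        "reading": counts.get("R", 0),
        "utilization": busy / max_request_workers,
    }


# Fabricated sample: 140 workers reading a request, 9 sending a reply, 1 idle.
SAMPLE_STATUS = "\n".join([
    "BusyWorkers: 149",
    "IdleWorkers: 1",
    "Scoreboard: " + "R" * 140 + "W" * 9 + "_",
])

if __name__ == "__main__":
    summary = summarize(tally_scoreboard(SAMPLE_STATUS), max_request_workers=150)
    print(summary)
```

A high share of "R" entries combined with very low CPU load is consistent with the slow-read theory raised in the channel: the slots are occupied but not doing work.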
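
The behaviour the gitea canonical Link change is aiming for (append `?` plus the query string only when the query string is non-empty) can be stated as a small Python sketch. This is not the Apache configuration from change 962826, which is not shown in the log; `canonical_link` is a hypothetical helper, and the `<url>; rel="canonical"` value format is assumed from the standard Link header syntax.

```python
def canonical_link(base_url, path, query_string=""):
    """Build a Link header value pointing at the canonical URL.

    The "?" separator is added only when there is a query string, which
    avoids the trailing "?" seen in the failing test run.
    """
    target = base_url.rstrip("/") + path
    if query_string:
        target += "?" + query_string
    return f'<{target}>; rel="canonical"'


if __name__ == "__main__":
    # With a query string, as in the gitea search URL from the log.
    print(canonical_link("https://opendev.org",
                         "/opendev/system-config/search", "q=foo"))
    # Without a query string: no trailing "?".
    print(canonical_link("https://opendev.org", "/opendev/system-config"))
```

A testinfra-style check could compare a response's Link header against this function's output for one path with a query string and one without.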