Nick | Message | Time |
---|---|---|
opendevreview | Jeremy Stanley proposed opendev/system-config master: Downgrade haproxy image from latest to lts https://review.opendev.org/c/opendev/system-config/+/903805 | 13:45 |
frickler | fungi: are we fine to merge ^^? I would think re-applying the previous +2's should be fine | 14:09 |
frickler | also, do we know which version of haproxy we were running on Saturday? 2.9.0 or 2.9.1? | 14:09 |
frickler | ah, the haproxy:latest image on lb02 says HAPROXY_VERSION=2.9.1 | 14:11 |
*** | d34dh0r5- is now known as d34dh0r53 | 14:48 |
fungi | yes i think we can merge that (the servers involved are still in the disable list for now anyway) | 14:49 |
fungi | and per the commit message it was 2.9.1 we were running when we observed the issue | 14:49 |
fungi | i found an upstream bug report which looks more likely to be what we encountered than the previously suspected one, and it seems to have a fix merged upstream so probably the next lts version (whatever it ends up being) won't have the regression. at least here's hoping | 14:50 |
fungi | the latest commit message has been updated with a link to the newer bug | 14:51 |
frickler | fungi: I was watching that issue, but from the latest comments and the submitted fix I'm not sure how that would match our issue. I think we will need to try and reproduce this and submit our own issue, I'll see if I can get to that next week | 16:00 |
fungi | agreed, on the surface it could be what we observed (pretty sure our services are all http/1.1 not 2.0? but i could be wrong) | 16:01 |
Clark[m] | I think we can use a testinfra test case on the gitea deployment job to test it | 16:22 |
Clark[m] | We have all the components there | 16:22 |
fungi | the bigger challenge will be if it turns out to need a high volume of browsing activity to exhibit issues. for example it took more than a day for things to get bad enough for the zuul-lb to start exhibiting user-facing issues (though maybe we can infer the symptoms by analyzing open sessions tracked in haproxy?) | 16:28 |
fungi | popping out to lunch with friends, but will return in an hour or so | 16:30 |
Clark[m] | I think we can start with something as simple as "for x in range(5000): make web request" and we'd expect all to complete successfully | 16:34 |
Clark[m] | If the issue is still present it will fail on the 4001st request | 16:34 |
Ramereth | fungi: FYI I think the issue we were having was related to not enough memory available. I just added a new arm64 node a few minutes ago which should help with that. Also it's a different model and is our first node using AlmaLinux 8. Let me know if you run into issues | 17:31 |
Ramereth | I booted up a test vm and it seems to be working fine | 17:31 |
Ramereth | looks like we already have some of your vms spinning up on it \o/ | 17:34 |
fungi | Ramereth: thanks for the update! i'll go ahead and revert the temporary cap i put in place last week | 18:26 |
fungi | Ramereth: oh, actually i didn't lower it for osuosl since it was intermittent, but i'll keep an eye on it. thanks again! | 18:27 |
fungi | odd, grafana says "no data" for every graph in every dashboard we have, but i'm still able to query graphite directly and it has current data... i wonder if we broke the grafana configs somehow | 18:33 |
fungi | the good news is we do still have data, it's just not showing up | 18:34 |
frickler | hmm, that must be a very new regression then, I looked at the AFS dashboard during the meeting yesterday and it was still fine, now I can confirm the issue you're seeing | 19:29 |
frickler | c1cb8b9db100 grafana/grafana-oss:latest "/run.sh" 16 hours ago Up 16 hours grafana-docker_grafana_1 | 19:30 |
frickler | docker log seems to show this for every request: logger=live t=2023-12-20T19:30:51.840943044Z level=warn msg="Request Origin is not authorized" origin=https://grafana.opendev.org host=localhost:3000 appUrl=http://grafana.opendev.org:3000/ allowedOrigins= | 19:31 |
* | frickler is close to eoding, will check closer tomorrow if nobody beats me to it | 19:32 |
fungi | aha, sounds like maybe grafana got more picky about cors headers? | 19:50 |
fungi | great find! | 19:50 |
fungi | https://hub.docker.com/r/grafana/grafana-oss/tags implies "latest" is now 10.2.3 and was probably previously 10.2.2 | 19:53 |
fungi | nothing obvious in the changelog for 10.2.3 related to cors | 19:55 |
fungi | though several things related to authentication | 19:56 |
fungi | https://github.com/grafana/grafana/blob/main/CHANGELOG.md | 19:56 |
fungi | or maybe we upgraded from 9.x to 10.x and this is now relevant: https://grafana.com/docs/grafana/latest/setup-grafana/configure-security/configure-security-hardening/#add-a-samesite-attribute-to-cookies | 20:02 |
fungi | docker image list mentions 9.0.6 as the next most recent image tag we have cached on the server | 20:03 |
fungi | mmm, hunting around, that warning seems to be non-fatal for at least some users, looks like setting live.allowed_origins = "https://grafana.opendev.org" would probably silence it if i could figure out where our grafana config lives | 20:30 |
fungi | https://github.com/grafana/grafana/issues/36443 | 20:30 |
fungi | so i'm back to having no idea what's breaking it yet | 20:38 |
frickler | oh, right, that's only a warning. looking at the browser console is more helpful I think: Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://graphite.opendev.org/render. (Reason: header ‘x-grafana-device-id’ is not allowed according to header ‘Access-Control-Allow-Headers’ from CORS preflight response). | 20:41 |
frickler | and that is a header that was added recentish https://github.com/grafana/grafana/commit/1a281ac49dd6b9ee6964badb832918507bf8ba97#diff-70030faa245250908d55db47258ce505d474db1559995683f8df1a951504236fR24 | 20:42 |
fungi | oh neat | 20:44 |
fungi | seems we add Access-Control-Allow-Headers in playbooks/roles/graphite/templates/graphite-statsd.conf.j2 | 20:45 |
frickler | actually this commit enabled it for OSS builds, so that matches better with the 10.2.3 timeframe https://github.com/grafana/grafana/commit/59bdff0280d52ca5d8918157d7697b9279b25501 | 20:46 |
fungi | okay, so maybe we did upgrade from 10.2.2 | 20:47 |
fungi | rather than 9.something | 20:47 |
fungi | i saw that commit in the changelog but in skimming it seemed to just be adding info about anonymous user connections | 20:48 |
fungi | "remove check for enterprise for `Device-Id` header in request" i guess that could be related to the now disallowed x-grafana-device-id | 20:49 |
fungi | bingo: https://github.com/grafana/grafana/issues/79692 | 20:50 |
opendevreview | Jeremy Stanley proposed opendev/system-config master: Temporarily pin Grafana to 10.2.2 https://review.opendev.org/c/opendev/system-config/+/904151 | 20:58 |
fungi | pretty sure that issue is the same one we're seeing since we include templated variables and "special characters" like parentheses in basically all our queries | 20:59 |
frickler | I'm not sure about that, we might need to report an issue ourselves or try to fix the issue on the graphite side in the long run. I have no idea why ianw added that header initially and whether it would be safe to simply add the new header grafana sets | 21:04 |
frickler | also displayAnonymousStats only seems to disable displaying the stats, it will not disable sending the new header if my reading of the patch is correct | 21:04 |
frickler | but pinning to 10.2.2 and verifying that the errors don't appear in the browser console there is a good first step | 21:05 |
ianw | (i have no idea why i added that header ... :) | 21:06 |
ianw | i imagine i copied from some nginx setup/graphite setup guide ... if it didn't come from puppet | 21:07 |
tonyb | fwiw I approved the haproxy change | 21:07 |
tonyb | now back to the grafana issue | 21:07 |
* | frickler is really off now | 21:09 |
fungi | thanks frickler! | 21:10 |
ianw | so we just need 'x-grafana-device-id' in the cors response right? | 21:21 |
JayF | fungi: I assume \join_subline is not an authorized advertiser in #opendev? | 21:33 |
JayF | fungi: they are using their irc profile to advertise in a way that is fairly obvious to irccloud users, less-so to other users, and using potentially an opendev etherpad to do so as well | 21:33 |
JayF | fungi: DM'd you that suspect link | 21:33 |
JayF | fungi: just validating since this isn't an openstack channel before using my newly-minted irc hammer | 21:34 |
opendevreview | Merged opendev/system-config master: Downgrade haproxy image from latest to lts https://review.opendev.org/c/opendev/system-config/+/903805 | 22:16 |
ianw | i added that manually just to test, restarted graphite and it seems to work now | 22:19 |
fungi | ianw: 903805 or the cors response addition? | 22:20 |
ianw | the cors header in the nginx config | 22:20 |
fungi | er, right 903805 is the haproxy downgrade not the grafana downgrade | 22:21 |
fungi | i'll wip 904151 in favor of the cors update | 22:21 |
ildikov | Hi All, I have a quick question, in case anyone has experience with this. Is there a way to change the email address in an UbuntuOne account if the person forgot their password and doesn't have access to their email anymore? | 22:23 |
opendevreview | Ian Wienand proposed opendev/system-config master: graphite: add grafana header to CORS allowed list https://review.opendev.org/c/opendev/system-config/+/904154 | 22:24 |
ianw | i actually think the other cors headers there don't need to be listed because they're on the always allow list | 22:24 |
ianw | but on Dec 21 the smallest change that gets it working is probably the best :) | 22:25 |
opendevreview | Ian Wienand proposed opendev/system-config master: graphite: add grafana header to CORS allowed list https://review.opendev.org/c/opendev/system-config/+/904154 | 22:30 |
fungi | ildikov: i think it requires contacting the ubuntuone admins. i'm not sure they would have any way to verify it's the same person though, and so quite likely they'll be told to just create a new account with their updated address and then contact the admins of all the systems they logged into with the old id to associate the new id with their accounts | 22:34 |
fungi | (or just create new accounts everywhere with the new id) | 22:35 |
Clark[m] | https://github.com/haproxy/haproxy/issues/2395 this looks like our haproxy issue | 23:09 |
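
The request-loop check floated at 16:34 ("for x in range(5000): make web request") could be sketched roughly as follows. This is only an illustration under assumptions: `first_failure` and `http_ok` are hypothetical helper names, not anything in system-config, and whether plain sequential requests are enough to trigger the haproxy regression is not established in the discussion.

```python
import urllib.request
from typing import Callable, Optional


def first_failure(fetch: Callable[[int], bool], total: int = 5000) -> Optional[int]:
    """Call fetch(n) for n = 1..total; return the first n that fails, else None."""
    for n in range(1, total + 1):
        try:
            ok = fetch(n)
        except Exception:
            # Any exception (timeout, connection reset, HTTP error) counts as a failure.
            ok = False
        if not ok:
            return n
    return None


def http_ok(url: str, timeout: float = 10.0) -> Callable[[int], bool]:
    """Build a fetch function that issues a GET and reports whether it returned 200."""
    def fetch(_n: int) -> bool:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    return fetch
```

A test case might then assert that `first_failure(http_ok(url))` is `None`, where `url` points at the load balancer under test (the address is deployment-specific and not given in the log). Returning the index of the first failed request makes it easy to see whether a failure lands near the 4001st request mentioned above.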
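
The CORS failure discussed from 20:41 onward comes down to whether the preflight response from graphite lists `x-grafana-device-id` in `Access-Control-Allow-Headers`. A minimal sketch of that check is below. The helper names are hypothetical, the preflight is assumed to be for a `GET`, and the example header lists in the usage note are made up for illustration, not copied from `graphite-statsd.conf.j2`.

```python
import urllib.request


def header_allowed(allow_headers: str, requested: str) -> bool:
    """Return True if `requested` appears in an Access-Control-Allow-Headers value."""
    # Header names are case-insensitive and the value is a comma-separated list.
    allowed = {h.strip().lower() for h in allow_headers.split(",") if h.strip()}
    # "*" acts as a wildcard for requests made without credentials.
    return "*" in allowed or requested.strip().lower() in allowed


def preflight_allow_headers(url: str, origin: str, requested: str) -> str:
    """Send a CORS preflight (OPTIONS) and return the Access-Control-Allow-Headers value."""
    req = urllib.request.Request(
        url,
        method="OPTIONS",
        headers={
            "Origin": origin,
            "Access-Control-Request-Method": "GET",  # assumed request method
            "Access-Control-Request-Headers": requested,
        },
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.headers.get("Access-Control-Allow-Headers", "")
```

For example, `header_allowed("origin, authorization, accept", "x-grafana-device-id")` is `False`, while adding `x-grafana-device-id` to that list makes it `True`. Against a live server, the value returned by `preflight_allow_headers("https://graphite.opendev.org/render", "https://grafana.opendev.org", "x-grafana-device-id")` could be passed to `header_allowed` to confirm the fix took effect.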