Thursday, 2023-12-14

opendevreview James Page proposed openstack/project-config master: sunbeam: retire all single charm repositories  https://review.opendev.org/c/openstack/project-config/+/903666 11:04
opendevreview James Page proposed openstack/project-config master: Fix the ACL associated with charm-keystone-ldap-k8s  https://review.opendev.org/c/openstack/project-config/+/903667 11:14
*** haleyb|out is now known as haleyb 14:24
*** jamesdenton_ is now known as jamesdenton 15:37
opendevreview Jeremy Stanley proposed openstack/project-config master: Temporarily lower max-servers for linaro  https://review.opendev.org/c/openstack/project-config/+/903708 16:36
ttx dpawlik fungi I'm looking at AWS dashboards and I can tell that Fargate is used for logstash. Trying to see if I can find something to explain the doubling in resources... 16:41
ttx OK I think there is a thing that doubled on Oct 18. There are two loadbalancers defined in the Fargate logstash thing, and one seems to constantly point to "unhealthy" targets. It used to have an average of 7 unhealthy targets (and 0 healthy ones) until Oct 18 14utc. Starting Oct 18 15utc the average is 18 unhealthy targets. So my read is that it was dysfunctional before but since Oct 18 it spends twice as much resources to be dysfunctional 16:52
fungi ttx: thanks, that's an interesting piece of information. dpawlik: maybe we can temporarily down the half of the lb that's always pointing to unhealthy targets? do you have access to try that? 16:54
ttx the one on port 9600 looks ok, but the one on 9999 is clearly broken and useless. The increase in resources consumed might just be that Fargate got more efficient at respawning failed things. 17:00
fungi i wonder if the two load balancers are for two different parts of the service, and shutting down either of them will lead to disabling all log ingestion... but without a better understanding of the design, turning the seemingly broken one off is probably the easiest thing to try 17:04
ttx I mean... It's clearly not doing anything since the target hosts are all unhealthy 17:07
fungi yeah, i guess if it's that broken then disabling it probably won't make things any worse 17:07
ttx I could reduce the number of target hosts. That way it will still be unhealthy, but use less resources being unhealthy 17:08
ttx that sounds safe enough... 17:08
fungi sounds worth trying, i agree 17:09
ttx I keep the LB and the target group, I just reduce the number of targets from 30 to, say, 10 17:10
ttx hmm deregistering them is not enough, it still keeps the 30 objective 17:12
opendevreview Merged openstack/project-config master: Temporarily lower max-servers for linaro  https://review.opendev.org/c/openstack/project-config/+/903708 17:12
ttx hmm it peaks at 28 unhealthy after I removed two, so it has an effect 17:23
ttx hmm, not really 17:27
ttx it seems to have some kind of an effect, now the number of unhealthy targets is down, so I think it does spend less energy to be dysfunctional 17:55
ttx I'll look at it again tomorrow to see if it sticks, then observe if it results in a drop in resources. Would still be good to be able to ask the person who set it up whether that load balancer connected to unhealthy targets on port 9999 actually serves any purpose. 17:57
pmatulis anyone else having trouble connecting to https://opendev.org/ ? it's touch and go for me 18:55
fungi pmatulis: no problems here... are you getting to it over ipv4 or ipv6? 19:08
JayF WFM over my ipv4 19:08
fungi a traceroute might help 19:08
pmatulis ipv4 ... digging 19:10
pmatulis yeah i think it's a regional carrier issue. i also had trouble accessing another site earlier 19:15
pmatulis https://imgur.com/a/0wXvLFq 19:17
fungi pmatulis: likely an asymmetric route if the failure is on the last hop, i'll try tracing back to one of the early addresses in your trace for comparison 20:52
fungi pmatulis: confirmed, return path goes through cogent, who doesn't know how to reach you: https://paste.opendev.org/show/bbb1ztgC80lzTV6zfQwz/ 20:54
fungi cogent has a lookingglass that confirms it too: https://www.cogentco.com/en/looking-glass 20:56
fungi plug in ipv4 trace, us - san jose (the pop indicated in our trace), and 64.230.11.206 (the earliest hop in your trace) 20:57
fungi their query form doesn't have deep-linking that i can find, sorry 20:57
fungi if you're not familiar with traceroute's annotations, !N is short for "network unreachable" 20:59
* fungi goes back to pretending he wasn't a network engineer in a former life 21:00
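
Editor's note: the !N annotation mentioned at 20:59 is one of several flags that traceroute can print after a probe time. Below is a minimal Python sketch for spotting them in a line of output. It assumes the annotation meanings given in the Linux traceroute(8) man page. The explain_hop helper and the sample hop line (which uses a documentation-range address) are hypothetical and are not taken from the trace discussed in this log.

```python
import re

# Meanings as listed in the Linux traceroute(8) man page (assumed here;
# other traceroute implementations may differ).
ANNOTATIONS = {
    "!H": "host unreachable",
    "!N": "network unreachable",
    "!P": "protocol unreachable",
    "!S": "source route failed",
    "!F": "fragmentation needed",
    "!X": "communication administratively prohibited",
}

# A flag is "!" plus one capital letter, standing alone between whitespace.
_FLAG = re.compile(r"(?<!\S)![A-Z](?!\S)")


def explain_hop(line):
    """Return the distinct known annotation meanings found in one hop line."""
    flags = dict.fromkeys(_FLAG.findall(line))  # de-duplicate, keep order
    return [ANNOTATIONS[f] for f in flags if f in ANNOTATIONS]


# Hypothetical hop line, for illustration only.
sample = " 9  192.0.2.1  41.2 ms !N  40.8 ms !N  *"
print(explain_hop(sample))
```

A hop with no annotations yields an empty list, so the helper can be run over every line of a trace to find the first hop that reports an ICMP unreachable condition.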
