opendevreview | James Page proposed openstack/project-config master: sunbeam: retire all single charm repositories https://review.opendev.org/c/openstack/project-config/+/903666 | 11:04 |
opendevreview | James Page proposed openstack/project-config master: Fix the ACL associated with charm-keystone-ldap-k8s https://review.opendev.org/c/openstack/project-config/+/903667 | 11:14 |
*** haleyb|out is now known as haleyb | 14:24 | |
*** jamesdenton_ is now known as jamesdenton | 15:37 | |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Temporarily lower max-servers for linaro https://review.opendev.org/c/openstack/project-config/+/903708 | 16:36 |
ttx | dpawlik fungi I'm looking at AWS dashboards and I can tell that Fargate is used for logstash. Trying to see if I can find something to explain the doubling in resources... | 16:41 |
ttx | OK I think there is a thing that doubled on Oct 18. There are two load balancers defined in the Fargate logstash thing, and one seems to constantly point to "unhealthy" targets. It used to have an average of 7 unhealthy targets (and 0 healthy ones) until Oct 18 14utc. Starting Oct 18 15utc the average is 18 unhealthy targets. So my read is that it was dysfunctional before, but since Oct 18 it spends twice as many resources to be dysfunctional | 16:52 |
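(For reference: the target health ttx is reading off the dashboards can also be pulled with the AWS SDK instead of the console. A minimal sketch with boto3, assuming access to the account; the target group ARN below is a placeholder, the real one would belong to the Fargate logstash stack:)

    import boto3  # AWS SDK for Python

    elbv2 = boto3.client('elbv2')

    # Placeholder ARN for the port-9999 target group discussed above.
    TARGET_GROUP_ARN = 'arn:aws:elasticloadbalancing:...:targetgroup/logstash-9999/...'

    # Lists each registered target with its health state and the reason
    # the health check reports (e.g. Target.Timeout, Target.FailedHealthChecks).
    for desc in elbv2.describe_target_health(
            TargetGroupArn=TARGET_GROUP_ARN)['TargetHealthDescriptions']:
        target, health = desc['Target'], desc['TargetHealth']
        print(target['Id'], target.get('Port'), health['State'],
              health.get('Reason', ''))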
fungi | ttx: thanks, that's an interesting piece of information. dpawlik: maybe we can temporarily down the half of the lb that's always pointing to unhealthy targets? do you have access to try that? | 16:54 |
ttx | the one on port 9600 looks ok, but the one on 9999 is clearly broken and useless. The increase in resources consumed might just be that Fargate got more efficient at respawning failed things. | 17:00 |
fungi | i wonder if the two load balancers are for two different parts of the service, and shutting down either of them will lead to disabling all log ingestion... but without a better understanding of the design, turning the seemingly broken one off is probably the easiest thing to try | 17:04 |
ttx | I mean... It's clearly not doing anything since the target hosts are all unhealthy | 17:07 |
fungi | yeah, i guess if it's that broken then disabling it probably won't make things any worse | 17:07 |
ttx | I could reduce the number of target hosts. That way it will still be unhealthy, but use fewer resources being unhealthy | 17:08 |
ttx | that sounds safe enough... | 17:08 |
fungi | sounds worth trying, i agree | 17:09 |
ttx | I keep the LB and the target group, I just reduce the number of targets from 30 to, say, 10 | 17:10 |
ttx | hmm, deregistering them is not enough, it still keeps the objective of 30 | 17:12 |
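(The "keeps the objective of 30" behavior fits the targets being tasks of an ECS service: deregistering a target only removes it from the load balancer, and the service then starts a replacement to satisfy its desired count, so the count springs back. Lowering the desired count itself is what would stick. A sketch, assuming the targets really are ECS tasks; the cluster and service names are placeholders:)

    import boto3

    ecs = boto3.client('ecs')

    # Placeholder names; the real ones belong to the logstash stack.
    # ECS stops the surplus tasks and the load balancer deregisters
    # them automatically, so no separate deregister call is needed.
    ecs.update_service(
        cluster='logstash-cluster',
        service='logstash-ingest-9999',
        desiredCount=10,  # down from 30
    )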
opendevreview | Merged openstack/project-config master: Temporarily lower max-servers for linaro https://review.opendev.org/c/openstack/project-config/+/903708 | 17:12 |
ttx | hmm it peaks at 28 unhealthy after I removed two, so it has an effect | 17:23 |
ttx | hmm, not really | 17:27 |
ttx | it seems to have some kind of effect, now the number of unhealthy targets is down, so I think it does spend less energy to be dysfunctional | 17:55 |
ttx | I'll look at it again tomorrow to see if it sticks, then observe if it results in a drop in resources. Would still be good to be able to ask the person who set it up whether that load balancer connected to unhealthy targets on port 9999 actually serves any purpose. | 17:57 |
pmatulis | anyone else having trouble connecting to https://opendev.org/ ? it's touch and go for me | 18:55 |
fungi | pmatulis: no problems here... are you getting to it over ipv4 or ipv6? | 19:08 |
JayF | WFM over my ipv4 | 19:08 |
fungi | a traceroute might help | 19:08 |
pmatulis | ipv4 ... digging | 19:10 |
pmatulis | yeah i think it's a regional carrier issue. i also had trouble accessing another site earlier | 19:15 |
pmatulis | https://imgur.com/a/0wXvLFq | 19:17 |
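(A quick way to answer fungi's IPv4-vs-IPv6 question from the affected machine, without traceroute: attempt a TCP connection to opendev.org over each address family separately. A Python sketch using only the standard library:)

    import socket

    HOST, PORT = 'opendev.org', 443

    for family, label in ((socket.AF_INET, 'IPv4'), (socket.AF_INET6, 'IPv6')):
        try:
            # Resolve over one address family only, then try a TCP handshake.
            addr = socket.getaddrinfo(HOST, PORT, family,
                                      socket.SOCK_STREAM)[0][4][0]
            with socket.create_connection((addr, PORT), timeout=5):
                print(f'{label} {addr}: reachable')
        except OSError as exc:
            print(f'{label}: failed ({exc})')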
fungi | pmatulis: likely an asymmetric route if the failure is on the last hop, i'll try tracing back to one of the early addresses in your trace for comparison | 20:52 |
fungi | pmatulis: confirmed, return path goes through cogent, who doesn't know how to reach you: https://paste.opendev.org/show/bbb1ztgC80lzTV6zfQwz/ | 20:54 |
fungi | cogent has a looking glass that confirms it too: https://www.cogentco.com/en/looking-glass | 20:56 |
fungi | plug in ipv4 trace, us - san jose (the pop indicated in our trace), and 64.230.11.206 (the earliest hop in your trace) | 20:57 |
fungi | their query form doesn't have deep-linking that i can find, sorry | 20:57 |
fungi | if you're not familiar with traceroute's annotations, !N is short for "network unreachable" | 20:59 |
* fungi goes back to pretending he wasn't a network engineer in a former life | 21:00 |