openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 00:34 |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add an lxd octavia scenario job https://review.openstack.org/636069 | 00:35 |
*** sapd1 has joined #openstack-lbaas | 01:38 | |
dayou | johnsom, yes, is there anything I can help with? | 01:38 |
johnsom | dayou Hi Mr. Dashboard wizard. | 01:44 |
dayou | :-P | 01:45 |
johnsom | dayou I was wondering if there was a chance you could add a flavor selector to the LB create page in the octavia-dashboard? It would be great if we had that for Stein. | 01:45 |
johnsom | It is the flavor ID here: https://developer.openstack.org/api-ref/load-balancer/v2/index.html?expanded=create-a-load-balancer-detail#id2 | 01:46 |
*** yamamoto has joined #openstack-lbaas | 01:46 | |
johnsom | I fixed the SDK to have it in this patch: https://review.openstack.org/#/c/633849/ | 01:46 |
johnsom | You could present the user a list using this API: https://developer.openstack.org/api-ref/load-balancer/v2/index.html?expanded=list-flavors-detail#list-flavors | 01:47 |
johnsom | Which hasn't yet merged in SDK: https://review.openstack.org/#/c/634532/ | 01:47 |
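For reference, a sketch of what that flow looks like from the CLI once the client patches above merge (the flavor and subnet names here are made up for illustration):

```shell
# list the flavors the operator has defined
openstack loadbalancer flavor list

# create a load balancer using one of them
openstack loadbalancer create --name lb1 \
    --vip-subnet-id public-subnet \
    --flavor standard-lb
```

The dashboard selector would essentially populate a dropdown from the first call and pass the chosen flavor ID into the second.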
dayou | Cool, I'll start working on it this week | 01:47 |
johnsom | If you don't have time, that is ok too. I can attempt to fumble my way to making it happen. | 01:48 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 01:49 |
johnsom | Thank you sir. Let me know if I can help in any way. | 01:50 |
dayou | Yes, sir, it's my pleasure to help with it. I'll see whether I can come up with something in the next two weeks for review. | 01:52 |
johnsom | dayou: Also, let me know if there are any priority dashboard patches we need to review. Feel free to ping me with those at any time. | 01:54 |
dayou | johnsom, got you | 01:55 |
sapd1 | johnsom: Hi | 02:02 |
sapd1 | you are working on LXD for Octavia? | 02:02 |
sapd1 | We will run containers instead of VMs | 02:03 |
johnsom | I am. I have it working locally, but need to get the gate working. Though I have to say, I am not sure I would use it for production. | 02:09 |
johnsom | Nova-lxd seems to need some work. | 02:10 |
sapd1 | johnsom: why don't you use zun? | 02:24 |
johnsom | I don’t think it supports LXC. | 02:44 |
*** yamamoto has quit IRC | 02:48 | |
sapd1 | johnsom: Oh, why LXD? Why not Docker? | 02:52 |
*** psachin has joined #openstack-lbaas | 03:05 | |
*** ramishra has joined #openstack-lbaas | 03:27 | |
sapd1 | seems like Octavia does not support the OVN provider yet | 03:57 |
johnsom | Yes, there is an OVN driver | 04:04 |
sapd1 | johnsom: I'm trying to use the OVN driver | 04:24 |
sapd1 | But I don't know how to configure it | 04:24 |
*** hongbin has joined #openstack-lbaas | 04:25 | |
sapd1 | https://github.com/openstack/networking-ovn/blob/a4e69319ad/devstack/lib/networking-ovn#L595 | 04:25 |
sapd1 | I followed this script but it does not work | 04:25 |
*** hongbin has quit IRC | 04:53 | |
*** AlexStaf has quit IRC | 05:33 | |
*** gcheresh has joined #openstack-lbaas | 06:12 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 06:20 |
*** gcheresh has quit IRC | 06:31 | |
*** yamamoto has joined #openstack-lbaas | 06:46 | |
*** yamamoto has quit IRC | 06:51 | |
*** velizarx has joined #openstack-lbaas | 07:17 | |
*** gcheresh has joined #openstack-lbaas | 07:20 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 07:25 |
*** yboaron has quit IRC | 07:35 | |
*** velizarx has quit IRC | 07:43 | |
*** psachin has quit IRC | 07:45 | |
*** velizarx has joined #openstack-lbaas | 07:56 | |
*** AlexStaf has joined #openstack-lbaas | 08:02 | |
*** rpittau has joined #openstack-lbaas | 08:07 | |
*** Emine has joined #openstack-lbaas | 08:18 | |
cgoncalves | sapd1, try this http://paste.openstack.org/show/744822/ | 08:25 |
cgoncalves | it works for me | 08:25 |
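The paste itself isn't reproduced in the log; a minimal sketch of the kind of setup it likely covers, assuming the OVN provider driver that shipped with networking-ovn at the time (registered under the name `ovn`; the description strings are illustrative):

```ini
# octavia.conf: register the OVN provider next to the default amphora driver
[api_settings]
enabled_provider_drivers = amphora:The Octavia Amphora driver,ovn:Octavia OVN driver
```

A load balancer is then created against that provider with `openstack loadbalancer create --provider ovn ...`.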
sapd1 | cgoncalves: thank you | 08:34 |
sapd1 | I have run it successfully. :D | 08:35 |
*** Emine has quit IRC | 08:37 | |
*** Emine has joined #openstack-lbaas | 08:42 | |
*** yboaron has joined #openstack-lbaas | 08:48 | |
*** yboaron_ has joined #openstack-lbaas | 08:53 | |
cgoncalves | great | 08:54 |
cgoncalves | sapd1, ah, make sure you have LIBS_FROM_GIT=python-octaviaclient | 08:54 |
cgoncalves | so that you get https://review.openstack.org/#/c/633562/ | 08:54 |
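In devstack terms, that is one line in `local.conf` (a sketch; the `+=` form assumes other libs may already be listed):

```ini
[[local|localrc]]
LIBS_FROM_GIT+=,python-octaviaclient
```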
sapd1 | cgoncalves: I'm trying a manual install | 08:54 |
sapd1 | yeah | 08:55 |
*** yboaron has quit IRC | 08:55 | |
*** ccamposr has joined #openstack-lbaas | 09:06 | |
*** Emine has quit IRC | 09:29 | |
*** Emine has joined #openstack-lbaas | 09:36 | |
*** kobis1 has joined #openstack-lbaas | 09:43 | |
*** kobis1 has left #openstack-lbaas | 09:44 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: Encrypt certs and keys https://review.openstack.org/627064 | 10:06 |
*** salmankhan has joined #openstack-lbaas | 10:21 | |
*** salmankhan1 has joined #openstack-lbaas | 10:31 | |
*** salmankhan has quit IRC | 10:32 | |
*** salmankhan1 is now known as salmankhan | 10:32 | |
*** mkuf_ has joined #openstack-lbaas | 10:48 | |
*** mkuf has quit IRC | 10:52 | |
*** mkuf_ has quit IRC | 11:24 | |
*** sapd1 has quit IRC | 11:37 | |
*** Emine has quit IRC | 11:46 | |
*** velizarx has quit IRC | 11:56 | |
*** velizarx has joined #openstack-lbaas | 11:58 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: Encrypt certs and keys https://review.openstack.org/627064 | 12:09 |
*** mkuf has joined #openstack-lbaas | 12:13 | |
*** Emine has joined #openstack-lbaas | 12:35 | |
*** yamamoto has joined #openstack-lbaas | 13:00 | |
*** gcheresh has quit IRC | 13:01 | |
*** gcheresh_ has joined #openstack-lbaas | 13:01 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: WIP: CentOS with multiple fixed ips https://review.openstack.org/636065 | 13:01 |
*** velizarx has quit IRC | 13:04 | |
*** yamamoto has quit IRC | 13:05 | |
*** velizarx has joined #openstack-lbaas | 13:10 | |
*** trown|outtypewww is now known as trown | 13:13 | |
*** sapd1 has joined #openstack-lbaas | 13:34 | |
*** yamamoto has joined #openstack-lbaas | 13:40 | |
*** gcheresh has joined #openstack-lbaas | 13:43 | |
*** gcheresh_ has quit IRC | 13:44 | |
*** yamamoto has quit IRC | 13:44 | |
nmagnezi | cgoncalves, are you able to locally build a centos based image? | 13:54 |
nmagnezi | cgoncalves, asking because I get the following: Cannot uninstall 'virtualenv'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall. | 13:55 |
cgoncalves | nmagnezi, yes. make sure you use DIB from master | 13:58 |
nmagnezi | cgoncalves, aye.. should we bump our deps or something? (Assuming there's a newer release there..) | 13:58 |
cgoncalves | we need a new release of DIB | 13:59 |
cgoncalves | CI is okay because it pulls from master | 13:59 |
nmagnezi | cgoncalves, aye. Added to LIBS_FROM_GIT locally for now | 14:00 |
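A sketch of that workaround, assuming the fix only exists in DIB master (the git host is as it was at the time):

```shell
# build images with diskimage-builder from master instead of the last release
pip install -U "git+https://git.openstack.org/openstack/diskimage-builder"

# or let devstack handle it via local.conf:
#   LIBS_FROM_GIT+=,diskimage-builder
```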
*** yboaron_ has quit IRC | 14:03 | |
*** yboaron_ has joined #openstack-lbaas | 14:03 | |
cgoncalves | thought of the day: running grenade locally is a PITA. countless issues I've run into | 14:48 |
openstackgerrit | boden proposed openstack/neutron-lbaas master: stop using common db mixin https://review.openstack.org/635570 | 14:56 |
cgoncalves | kernel panic on cirros, great | 14:57 |
*** AlexStaf has quit IRC | 15:08 | |
*** yboaron_ has quit IRC | 15:21 | |
openstackgerrit | Margarita Shakhova proposed openstack/octavia master: Support create amphora instance from volume based. https://review.openstack.org/570505 | 15:21 |
*** yboaron_ has joined #openstack-lbaas | 15:21 | |
*** fnaval has joined #openstack-lbaas | 15:38 | |
*** yboaron_ has quit IRC | 15:42 | |
*** yboaron_ has joined #openstack-lbaas | 15:42 | |
*** hebul_ has joined #openstack-lbaas | 15:51 | |
*** hebul_ has quit IRC | 15:54 | |
*** hebul has joined #openstack-lbaas | 15:55 | |
hebul | Hello all, who could help me with some octavia questions? | 15:57 |
johnsom | hebul: you are in the right place. What can we help with? | 15:58 |
hebul | Thank you ! | 15:59 |
*** sapd1 has quit IRC | 16:01 | |
hebul | Question 1. After the amphora VM was created (with a net id provided) I found out that it consumes 2 IPs in that network | 16:01 |
hebul | one consumed IP is shown when we ask for the lb list through the CLI. But we can also see one more IP consumed when asking for the vm list in openstack | 16:04 |
johnsom | Correct, each has a base port with IP and a secondary IP (allowed address pairs in neutron) that is used for the VIP and HA. | 16:05 |
*** ramishra has quit IRC | 16:09 | |
hebul | @johnsom Why don't we use a single one for all purposes? | 16:10 |
johnsom | The VIP is a special port, called an allowed address pair port in neutron. It allows us to move the VIP address between VMs in the case of a failure. When running in Active/Standby topology, this can happen in around a second. In single mode, it takes a bit longer | 16:12 |
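Roughly what that looks like at the neutron level (network name and VIP address are hypothetical):

```shell
# each amphora has a normal base port with its own fixed IP...
openstack port create --network tenant-net amphora-base-port

# ...and the shared VIP is allowed on that port as an extra address,
# so keepalived can move the VIP between amphorae on failover
openstack port set --allowed-address ip-address=203.0.113.10 amphora-base-port
```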
colin- | not crazy about the extra IPs either but the resilience he's describing made it a no-brainer for me | 16:13 |
*** AlexStaf has joined #openstack-lbaas | 16:17 | |
*** fnaval has quit IRC | 16:17 | |
johnsom | Yeah, in theory a single topology amp could use just one IP, but it would require a new network driver be written and no one has been motivated to do it. Pretty much all of us always run Active/Standby. | 16:17 |
colin- | have found that works very well, fwiw | 16:18 |
colin- | (to hebul primarily hope that helps!) | 16:18 |
johnsom | colin- Did you tune it or are you running with the defaults? | 16:18 |
colin- | the threads, similar to hm? default for now | 16:19 |
johnsom | colin- It can be tuned to failover much faster than the defaults if you want it to. | 16:19 |
colin- | oh i see, the freshness | 16:19 |
colin- | gotcha | 16:19 |
colin- | also default for now | 16:19 |
johnsom | I mean the Active/Standby settings. Just curious. I run defaults, but have demoed with it tuned | 16:19 |
hebul | Thank you !!! So as I see it, that is a true limitation for now (we can't move the VIP from one VM to another unless the VM has that extra port, or we'd have to change the driver) | 16:20 |
colin- | like heartbeat_timeout? | 16:20 |
*** gcheresh has quit IRC | 16:20 | |
johnsom | https://docs.openstack.org/octavia/latest/configuration/configref.html#keepalived-vrrp | 16:20 |
johnsom | Those first four settings | 16:21 |
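For reference, those options with what I believe were the defaults of that era (double-check against the config reference above):

```ini
[keepalived_vrrp]
vrrp_advert_int = 1        # seconds between VRRP advertisements
vrrp_check_interval = 5    # seconds between VRRP health check runs
vrrp_fail_count = 2        # failed checks before marking the peer down
vrrp_success_count = 2     # successful checks before marking it up again
```

Lowering these trades failover latency against chattiness on the VRRP network.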
*** Emine has quit IRC | 16:21 | |
colin- | oh no, wasn't familiar with this thanks | 16:22 |
johnsom | Hmm, forgot we dropped the vrrp_advert_int to 1, so that is good already. We can go lower with the newer versions of keepalived, but I'm not sure it's really needed. | 16:22 |
colin- | oh i have been meaning to ask this but never think of it when we are chatting, is there any frequency adjustment on health check failure by default? | 16:22 |
colin- | old hw lbs i used to manage would sometimes increase the frequency of their checking after a failure and i always found that bothersome | 16:22 |
johnsom | Ah, so if it sees a failure it polls more often? | 16:23 |
colin- | right | 16:23 |
colin- | i'm assuming that is not true in our case | 16:23 |
johnsom | Hmm, I don't think we have that today. If HAproxy can do it, you would have to do a custom template for it. | 16:23 |
johnsom | Yeah, I don't think we do that. | 16:24 |
colin- | no i actually prefer not to have it because it changes the expected behavior with all these parameters imo, just double checking | 16:24 |
johnsom | I could see that being a bit annoying with the logs scrolling more. | 16:24 |
johnsom | hebul Any other questions we can help with? | 16:25 |
hebul | Question 2 is coming. How do we sort out the management network for octavia and the amphorae? What happens in a private network environment (e.g. VXLAN)? | 16:28 |
*** yboaron_ has quit IRC | 16:28 | |
hebul | A private network won't have connectivity to the external world (including the octavia mgmt net) by default | 16:29 |
johnsom | So, let me clarify and then answer. | 16:29 |
johnsom | There is the lb-mgmt-net, which is used for the controllers to talk to the amphora and the amphora to talk back to the controllers. It is typically a private network setup for this purpose, but it can be shared and/or routed. It is a TLS TCP connection to the amps, and UDP back to the controllers. No tenant traffic crosses this network as it is isolated from the tenant traffic inside a network namespace in the amp. | 16:31 |
johnsom | VIP and member networks, isolated inside the network namespace in the amphora, are hot-plugged into the amphora instance as the user configures their load balancer. We support any network that neutron supports for this, could be tenant private, could be a public external network. | 16:33 |
johnsom | Fundamentally the lb-mgmt-net is just a neutron network. The harder part is how to make it available for the controllers. There are many ways to do this. Provider networks, bridging it out to the controllers, setting up routes, etc. | 16:34 |
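Concretely, the neutron side is just a couple of commands (names and the CIDR are illustrative), plus telling the controllers which network to boot amphorae on:

```shell
openstack network create lb-mgmt-net
openstack subnet create --network lb-mgmt-net \
    --subnet-range 172.31.0.0/16 lb-mgmt-subnet

# octavia.conf then references the network ID, e.g.:
#   [controller_worker]
#   amp_boot_network_list = <lb-mgmt-net id>
```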
johnsom | Did that help answer the question? | 16:34 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 16:35 |
hebul | Yes, it helped me get closer, but it's not fully clear yet :) | 16:40 |
hebul | If I understood correctly, we have to provide both a provider network for the amphorae and VXLAN for the private tenant networks on the hypervisors | 16:44 |
johnsom | provider network is optional, that is just one way to setup the lb-mgmt-net. | 16:45 |
johnsom | Your tenant networks would use whatever mechanism you use today for neutron networks on your compute hosts. VXLAN is fine, as Octavia only talks to neutron and nova APIs for it. How it works behind nova and neutron, we don't need to know. | 16:47 |
johnsom | So for OpenStack Ansible, they chose to use a provider network for the lb-mgmt-net. For Redhat OSP 13, they chose to not use provider networks but to bridge it out of neutron. | 16:48 |
johnsom | If your neutron is all VXLAN, you could probably even have the Octavia controllers participate in the VXLAN overlay directly if you wish. | 16:49 |
cgoncalves | for OSP that is the default, yes, although one can create a neutron network and pass that in to the installer. the installer will see the network already exists and use it instead | 16:51 |
hebul | "The harder part is how to make it available for the controllers. There are many ways to do this. Provider networks, bridging it out to the controllers, setting up routes, etc." -- bridging it out to the controllers - what does that mean? | 16:53 |
hebul | Does it mean that I have to connect the Octavia services to the OVS integration bridge, on the same internal OVS VLAN ID that is used for the amphorae's internal project network? | 16:55 |
johnsom | Well the lb-mgmt-net is a neutron network. You create it with "openstack network create". At that point it lives in neutron and is connected to the amphora as needed. However, you need to also have a way for your controller processes (worker, health manager, housekeeping) to be able to access that network. | 16:55 |
johnsom | That is one option yes. That is how we do it in devstack: https://github.com/openstack/octavia/blob/master/devstack/plugin.sh#L358 | 16:56 |
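From memory, the devstack plumbing linked above boils down to something like this (paraphrased rather than copied; `o-hm0` is the interface name devstack uses, and the MAC/ID variables come from a neutron port created for the health manager):

```shell
# bind a local OVS internal interface to the identity of the neutron port,
# so the controller host gets a leg directly on lb-mgmt-net
ovs-vsctl -- --may-exist add-port br-int o-hm0 \
    -- set Interface o-hm0 type=internal \
    -- set Interface o-hm0 external-ids:iface-status=active \
    -- set Interface o-hm0 external-ids:attached-mac=$MGMT_PORT_MAC \
    -- set Interface o-hm0 external-ids:iface-id=$MGMT_PORT_ID
ip link set dev o-hm0 address $MGMT_PORT_MAC
dhclient o-hm0   # pick up an address on lb-mgmt-subnet
```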
*** ccamposr has quit IRC | 16:57 | |
*** fnaval has joined #openstack-lbaas | 16:59 | |
*** velizarx has quit IRC | 17:01 | |
hebul | Ok, johnsom, thanks a lot. I have a more or less basic understanding of the second question now. Are there any additional resources about octavia setup in different network scenarios to read more carefully and think about? | 17:03 |
johnsom | Not really. It's a TODO item. | 17:06 |
hebul | Ok, question number 3 (short one): multiple lb-mgmt-nets - is that supported? | 17:11 |
hebul | or is it intended to create a very large network at the beginning? | 17:13 |
johnsom | Technically it is supported, but currently there is no way to have the controllers select different networks when booting the amphora. Most of us use large subnets. | 17:15 |
hebul | ok, I see. | 17:18 |
hebul | Thank you, <johnsom> | 17:21 |
johnsom | Sure, let us know if we can help more. | 17:21 |
hebul | Do you think the 2-3 hours weekly I have on weekends would help you to start on that TODO item? Or is it too little? :) | 17:23 |
johnsom | Any help is welcomed. | 17:24 |
*** fnaval has quit IRC | 17:25 | |
*** fnaval has joined #openstack-lbaas | 17:28 | |
*** hebul has quit IRC | 17:33 | |
*** rpittau has quit IRC | 17:44 | |
*** hebul has joined #openstack-lbaas | 17:44 | |
*** hebul has left #openstack-lbaas | 17:44 | |
*** hebul has joined #openstack-lbaas | 17:45 | |
*** hebul has left #openstack-lbaas | 17:45 | |
*** rpittau has joined #openstack-lbaas | 17:45 | |
*** AlexStaf has quit IRC | 17:58 | |
*** trown is now known as trown|lunch | 18:03 | |
*** salmankhan has quit IRC | 18:10 | |
*** rpittau has quit IRC | 18:12 | |
*** openstackgerrit has quit IRC | 18:51 | |
colin- | xgerman: are you using the octavia ingress controller in your clusters atm? | 18:56 |
colin- | can't recall if we've spoken about this | 18:56 |
xgerman | nope | 18:56 |
colin- | ok | 18:57 |
xgerman | yeah, see http://blog.eichberger.de/posts/yolo_cloud/ | 19:00 |
colin- | haha, you had me at yolo | 19:01 |
colin- | will give that a read at lunch thx for sharing :) | 19:01 |
johnsom | Well, there you go: http://logs.openstack.org/69/636069/5/check/octavia-v2-dsvm-scenario-lxd/8c1b5e8/testr_results.html.gz | 19:07 |
cgoncalves | you just painted it all green, confess! | 19:08 |
johnsom | Disable most of the security protections, ignore all of the errors being thrown, ignore the fact that the kernel tuning doesn't work. | 19:08 |
cgoncalves | great job! | 19:08 |
johnsom | And you can have LXD amps | 19:08 |
johnsom | cgoncalves What is up with the centos gate? These 2h30m timeouts are getting.... old. | 19:09 |
johnsom | Tempted to pull centos out of the check gate altogether until it can be shown to be functional | 19:09 |
cgoncalves | johnsom, systemd patch merged upstream today IIRC | 19:09 |
johnsom | Look in zuul for that patch.... | 19:10 |
cgoncalves | https://bugzilla.redhat.com/show_bug.cgi?id=1666612 | 19:10 |
openstack | bugzilla.redhat.com bug 1666612 in systemd "Rules "uname -p" and "systemd-detect-virt" kill the system boot time on large systems" [High,Post] - Assigned to jsynacek | 19:10 |
johnsom | So how long do we have to wait until it gets in centos? | 19:10 |
cgoncalves | you guys complain of EL too much. either because it ships old versions or, now, because it ships latest versions. pick one, but just one! :) | 19:11 |
cgoncalves | dunno | 19:11 |
cgoncalves | RHEL/CentOS 7.7? | 19:11 |
johnsom | I just like things to work.... | 19:11 |
cgoncalves | lol | 19:12 |
cgoncalves | I will ask around | 19:12 |
johnsom | So, that lxd run was tempest in serial mode as there was a strange nova error about things "in use" that turned out to be apparmor. After centos times out there I will push a patch that puts it back to tempest concurrency 2. So we will have an apples to apples time comparison. | 19:13 |
johnsom | Oh, and I'm not sure the UDP stuff works. That was another whole set of errors about conntrack modules | 19:14 |
*** trown|lunch is now known as trown | 19:20 | |
*** openstackgerrit has joined #openstack-lbaas | 19:21 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add an lxd octavia scenario job https://review.openstack.org/636069 | 19:21 |
johnsom | octavia-v2-dsvm-scenario-centos-7 TIMED_OUT in 2h 33m 20s | 19:23 |
eandersson | Have you guys seen octavia-api VIRT memory growing to crazy amounts? Shouldn't be an issue, but we are seeing crazy cpu usage associated with that | 19:52 |
colin- | still the same figures (~650 amps) we were discussing last week | 19:53 |
johnsom | There really shouldn't be much load on the API side... It's all event driven. What release are you running? | 19:54 |
eandersson | Yea - we restarted the api and load dropped by 20 | 19:57 |
eandersson | Rocky | 19:57 |
eandersson | I can't explain why VIRT would be at 26GB | 19:57 |
johnsom | I have seen uwsgi go out to lunch and eat CPU, but typically when that happens nova is the first one that goes down | 19:57 |
eandersson | We are using uwsgi, but so is everything else | 19:59 |
johnsom | Yeah, we are too. I have just had times where I found multiple of the uwsgi processes spinning for no apparent reason. | 20:00 |
eandersson | The odd thing is that cpu usage is not even that high | 20:00 |
cgoncalves | I think I have seen that happening, yes. load increases with the number of amps created, never drops. not sure I still have the figure from grafana | 20:00 |
eandersson | but for some reason restarting the octavia processes drops the load by 20 | 20:00 |
*** jlaffaye has quit IRC | 20:01 | |
eandersson | The only thing that is odd that I can see is VIRT is at 26GB | 20:01 |
*** jlaffaye has joined #openstack-lbaas | 20:01 | |
cgoncalves | there is a known issue for the house keeping | 20:01 |
johnsom | That is crazy high for our API process. I mean, it doesn't do that much.... | 20:01 |
cgoncalves | https://review.openstack.org/#/c/627058/ | 20:01 |
eandersson | It feels like IO / memory pressure, but not sure how or why the api could cause that | 20:02 |
johnsom | Yeah, it pretty much is just sqlalchemy and rabbit | 20:02 |
cgoncalves | exactly, sqlalchemy... | 20:03 |
johnsom | Maybe if you have one in that state, find the thread and connect a debugger to it and see what it's up to.... | 20:03 |
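A sketch of that kind of triage (tool choice is mine; `py-spy` is an assumption, any Python stack sampler works):

```shell
# spot the octavia-api workers and their CPU/RSS
ps -eo pid,pcpu,rss,cmd | grep '[o]ctavia'

# total mapped (virtual) memory of a suspect worker
pmap -x <pid> | tail -1

# sample its Python stacks without stopping the process
py-spy dump --pid <pid>
```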
cgoncalves | not sqlalchemy's fault though but how we do db querying | 20:03 |
johnsom | 26GB though? Even if it was caching the whole octavia DB you would have a massive deployment for that | 20:04 |
cgoncalves | I've seen neutron-lbaas going waaaay above that | 20:05 |
johnsom | Plus it shouldn't be burning CPU if it's not handling API calls | 20:05 |
colin- | exactly | 20:06 |
cgoncalves | https://cgoncalves.pt/trash/openstack/octavia/HSZxPMn.png | 20:06 |
colin- | and the journal output suggests that transaction times are not abnormally long for the tasks it is performing | 20:06 |
colin- | all 0.1 0.2s | 20:06 |
cgoncalves | the load increase steps there are rally runs | 20:07 |
cgoncalves | 100 LBs IIRC | 20:07 |
eandersson | Can you check system load as well | 20:13 |
eandersson | Load Avg | 20:13 |
cgoncalves | I don't have access to the system any longer, sorry. it was from mid December | 20:14 |
eandersson | pmap on an octavia-api process shows 2477 pages :D | 20:24 |
eandersson | a busy nova process has 400 | 20:24 |
eandersson | busy neutron (with lbaas) has less than 400 | 20:25 |
eandersson | Not sure why octavia would need to allocate so many pages | 20:25 |
johnsom | I could see it if it was active, but idle no. There is still a stupid join in there, but that should purge after the request is done | 20:26 |
eandersson | Yea - we do have a memory leak in neutron-lbaas as well, but it's different | 20:27 |
johnsom | Yeah, that is pretty much known | 20:28 |
eandersson | Honestly think my latest lbaas patch will fix that (or at the very least improve it a lot) | 20:28 |
eandersson | since it does not have to do those crazy sql queries | 20:29 |
johnsom | Yeah, some of that craziness leaked over here with some patches attempting to reduce the number of round trips to mysql as it was emptying the connection pools/slots. Or something like that. Either way, they were bad patches | 20:31 |
johnsom | We have been working through fixing them. | 20:31 |
*** salmankhan has joined #openstack-lbaas | 20:32 | |
*** salmankhan has quit IRC | 20:36 | |
*** dmellado has quit IRC | 20:39 | |
*** salmankhan has joined #openstack-lbaas | 20:53 | |
nmagnezi | o/ | 21:17 |
johnsom | Hi there | 21:18 |
nmagnezi | johnsom, when you have a moment, please check my comment in https://storyboard.openstack.org/#!/story/2004112 so I can test my assumptions :) | 21:19 |
johnsom | Looking | 21:20 |
*** salmankhan has quit IRC | 21:20 | |
nmagnezi | johnsom, thank you! | 21:21 |
johnsom | Commented | 21:28 |
johnsom | nmagnezi I get your point, but I think the code is broken for IPv6 members. I think it writes out the first subnet and not the one specified. | 21:29 |
nmagnezi | johnsom, so basically creating a LB with one member in subnet_a and another member in subnet_b will result in the member in subnet_b being unreachable? (Saying that so I can test my fix attempts) | 21:32 |
nmagnezi | Does it happen only with IPv6? Or only with a mix of IPv4 and IPv6? | 21:32 |
johnsom | The test case I hit for that bug (sorry I didn't put it in the story) is boot up an LB, create tenant network, add IPv4 subnet, add IPv6 subnet, add an IPv6 member to the network. The interface file written out will be for the IPv4 subnet. | 21:34 |
johnsom | The IPv6 member will be unreachable | 21:34 |
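That repro as a hedged CLI sketch (all names, CIDRs and addresses invented; `pool1` is assumed to already exist on the LB):

```shell
openstack network create member-net
openstack subnet create --network member-net \
    --subnet-range 10.0.10.0/24 member-v4
openstack subnet create --network member-net --ip-version 6 \
    --subnet-range fd00:10::/64 member-v6

# bug: the interface file written inside the amphora describes
# member-v4 (the network's first subnet) instead of member-v6
openstack loadbalancer member create --subnet-id member-v6 \
    --address fd00:10::5 --protocol-port 80 pool1
```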
nmagnezi | Ack | 21:36 |
nmagnezi | Will try it out | 21:36 |
nmagnezi | That's all I needed to know | 21:36 |
nmagnezi | Calling it a day.. | 21:36 |
johnsom | Yep, sorry for the poor story quality. | 21:36 |
nmagnezi | One last thing is that I responded to most of your comments here, and followed up with some questions: https://review.openstack.org/#/c/627064/ | 21:37 |
nmagnezi | No worries | 21:37 |
johnsom | Yeah, saw that. Will reply today | 21:37 |
*** dmellado has joined #openstack-lbaas | 21:43 | |
*** trown is now known as trown|outtypewww | 22:00 | |
rm_work | nmagnezi / johnsom re: mixed-members subnet issues -- i definitely ran into that recently, thought there was already movement somewhere on fixing it? | 22:39 |
rm_work | i forget if i had a patch or someone else did | 22:39 |
rm_work | colin-: do you see anything like this in your API logs? `2019-02-06 20:30:25.839 2364 WARNING oslo.messaging._drivers.impl_rabbit [-] Unexpected error during heartbeart thread processing, retrying...: error: [Errno 110] Connection timed out` octavia-api.log | 22:39 |
colin- | good question, let me see if i can spot that line. not immediately familiar | 22:40 |
johnsom | I fixed it for ubuntu, but there is still an open bug for redhat. Really that whole chain needs to be re-worked however.... | 22:40 |
rm_work | just look for "error during heartbeart" | 22:40 |
rm_work | ah maybe all that happened was I filed a story for it <_< | 22:41 |
colin- | don't see that from either octavia-api output over the past three hours, looking back farther | 22:42 |
rm_work | hmm k | 22:43 |
rm_work | it'd be before a restart | 22:43 |
rm_work | it takes a while for that line to start showing up IME | 22:43 |
johnsom | Does it really say "hearbeart"? | 22:44 |
rm_work | yes | 22:45 |
rm_work | it's an oslo error | 22:45 |
rm_work | from rabbit | 22:45 |
rm_work | unrelated to our heartbeats | 22:45 |
johnsom | Right, I get that it's a rabbit thing and either oslo or rabbit code | 22:46 |
rm_work | I'm just wondering whether it's a bug in how we use the client (missing a close somewhere?) or in the Oslo side | 22:47 |
johnsom | https://bugzilla.redhat.com/show_bug.cgi?id=1542100 | 22:47 |
openstack | bugzilla.redhat.com bug 1542100 in python-oslo-messaging "Can't failover when rabbit_hosts is configured as 3 hosts" [High,Closed: wontfix] - Assigned to jeckersb | 22:47 |
rm_work | Were we supposed to be closing connections somehow and we never got the memo? lol | 22:47 |
johnsom | Where here is the code logging that: https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L897 | 22:51 |
rm_work | That bug was supposedly fixed in pike | 22:52 |
johnsom | Yeah, it seems like it throws that if a rabbit node goes down or the network drops | 22:52 |
johnsom | What is the exception logged right after that? | 22:53 |
rm_work | Not sure | 22:53 |
johnsom | Ah, I guess it needs debug logging.... sigh | 22:53 |
colin- | broadened the scope to 12h and still haven't found that error so far rm_work | 22:53 |
rm_work | Hmm ok | 22:53 |
rm_work | Well, thanks | 22:53 |
johnsom | Ah, it's the connection timeout message | 22:55 |
rm_work | ok so... in my deployment, those started happening more and more frequently, and my digging showed that was because there were more and more of those threads, and they basically weren't dying | 23:06 |
rm_work | so they were building up | 23:06 |
rm_work | until eventually there were so many the API process was no longer responsive | 23:06 |
colin- | interesting | 23:07 |
colin- | any chance that is happening and not logging that message? | 23:08 |
rm_work | hmmm, what is your log level | 23:08 |
colin- | default_log_levels is default values | 23:12 |
colin- | if you're referring to that list? | 23:12 |
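If anyone needs the extra detail johnsom mentioned earlier, a hedged octavia.conf sketch (these are standard oslo.log options; the exact module list is illustrative):

```ini
[DEFAULT]
debug = True
# or, more surgically, raise only the messaging layer:
default_log_levels = amqp=DEBUG,amqplib=DEBUG,oslo.messaging=DEBUG,sqlalchemy=WARN
```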
*** icey has quit IRC | 23:46 | |
*** yetiszaf has quit IRC | 23:46 | |
*** fyx has quit IRC | 23:46 | |
*** coreycb has quit IRC | 23:46 | |
*** fnaval has quit IRC | 23:48 |