openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 00:34 |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add an lxd octavia scenario job https://review.openstack.org/636069 | 00:35 |
*** sapd1 has joined #openstack-lbaas | 01:38 | |
dayou | johnsom, yes, is there anything I can help with? | 01:38 |
johnsom | dayou Hi Mr. Dashboard wizard. | 01:44 |
dayou | :-P | 01:45 |
johnsom | dayou I was wondering if there was a chance you could add a flavor selector to the LB create page in the octavia-dashboard? It would be great if we had that for Stein. | 01:45 |
johnsom | It is the flavor ID here: https://developer.openstack.org/api-ref/load-balancer/v2/index.html?expanded=create-a-load-balancer-detail#id2 | 01:46 |
*** yamamoto has joined #openstack-lbaas | 01:46 | |
johnsom | I fixed the SDK to have it in this patch: https://review.openstack.org/#/c/633849/ | 01:46 |
johnsom | You could present the user a list using this API: https://developer.openstack.org/api-ref/load-balancer/v2/index.html?expanded=list-flavors-detail#list-flavors | 01:47 |
johnsom | Which hasn't yet merged in SDK: https://review.openstack.org/#/c/634532/ | 01:47 |
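For reference, a sketch of what that flow looks like from the CLI once the client patches above merge (the flavor and subnet names here are made up for illustration):

```shell
# list the flavors the operator has defined
openstack loadbalancer flavor list

# create a load balancer using one of them
openstack loadbalancer create --name lb1 \
    --vip-subnet-id public-subnet \
    --flavor standard-lb
```

The dashboard selector would essentially populate a dropdown from the first call and pass the chosen flavor ID into the second.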
dayou | Cool, I'll start working on it this week | 01:47 |
johnsom | If you don't have time, that is ok too. I can attempt to fumble my way to making it happen. | 01:48 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 01:49 |
johnsom | Thank you sir. Let me know if I can help in any way. | 01:50 |
dayou | Yes, sir, it's my pleasure to help with it. I'll see whether I can come up with something in the next two weeks for review. | 01:52 |
johnsom | dayou: Also, let me know if there are any priority dashboard patches we need to review. Feel free to ping me with those at any time. | 01:54 |
dayou | johnsom, got you | 01:55 |
sapd1 | johnsom: Hi | 02:02 |
sapd1 | you are working on LXD for Octavia? | 02:02 |
sapd1 | We will run containers instead of VMs | 02:03 |
johnsom | I am. I have it working locally, but need to get the gate working. Though I have to say, I am not sure I would use it for production. | 02:09 |
johnsom | Nova-lxd seems to need some work. | 02:10 |
sapd1 | johnsom: why don't you use zun? | 02:24 |
johnsom | I don’t think it supports LXC. | 02:44 |
*** yamamoto has quit IRC | 02:48 | |
sapd1 | johnsom: Oh, why LXD? Why not Docker? | 02:52 |
*** psachin has joined #openstack-lbaas | 03:05 | |
*** ramishra has joined #openstack-lbaas | 03:27 | |
sapd1 | seems like Octavia does not support the OVN provider yet | 03:57 |
johnsom | Yes, there is an OVN driver | 04:04 |
sapd1 | johnsom: I'm trying to use the OVN driver | 04:24 |
sapd1 | But I don't know how to configure it | 04:24 |
*** hongbin has joined #openstack-lbaas | 04:25 | |
sapd1 | https://github.com/openstack/networking-ovn/blob/a4e69319ad/devstack/lib/networking-ovn#L595 | 04:25 |
sapd1 | I followed this script but it does not work | 04:25 |
*** hongbin has quit IRC | 04:53 | |
*** AlexStaf has quit IRC | 05:33 | |
*** gcheresh has joined #openstack-lbaas | 06:12 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 06:20 |
*** gcheresh has quit IRC | 06:31 | |
*** yamamoto has joined #openstack-lbaas | 06:46 | |
*** yamamoto has quit IRC | 06:51 | |
*** velizarx has joined #openstack-lbaas | 07:17 | |
*** gcheresh has joined #openstack-lbaas | 07:20 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 07:25 |
*** yboaron has quit IRC | 07:35 | |
*** velizarx has quit IRC | 07:43 | |
*** psachin has quit IRC | 07:45 | |
*** velizarx has joined #openstack-lbaas | 07:56 | |
*** AlexStaf has joined #openstack-lbaas | 08:02 | |
*** rpittau has joined #openstack-lbaas | 08:07 | |
*** Emine has joined #openstack-lbaas | 08:18 | |
cgoncalves | sapd1, try this http://paste.openstack.org/show/744822/ | 08:25 |
cgoncalves | it works for me | 08:25 |
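The paste itself isn't reproduced in the log; a minimal sketch of the kind of setup it likely covers, assuming the OVN provider driver that shipped with networking-ovn at the time (registered under the name `ovn`; the description strings are illustrative):

```ini
# octavia.conf: register the OVN provider next to the default amphora driver
[api_settings]
enabled_provider_drivers = amphora:The Octavia Amphora driver,ovn:Octavia OVN driver
```

A load balancer is then created against that provider with `openstack loadbalancer create --provider ovn ...`.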
sapd1 | cgoncalves: thank you | 08:34 |
sapd1 | I have run it successfully. :D | 08:35 |
*** Emine has quit IRC | 08:37 | |
*** Emine has joined #openstack-lbaas | 08:42 | |
*** yboaron has joined #openstack-lbaas | 08:48 | |
*** yboaron_ has joined #openstack-lbaas | 08:53 | |
cgoncalves | great | 08:54 |
cgoncalves | sapd1, ah, make sure you have LIBS_FROM_GIT=python-octaviaclient | 08:54 |
cgoncalves | so that you get https://review.openstack.org/#/c/633562/ | 08:54 |
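In devstack terms, that is one line in `local.conf` (a sketch; the `+=` form assumes other libs may already be listed):

```ini
[[local|localrc]]
LIBS_FROM_GIT+=,python-octaviaclient
```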
sapd1 | cgoncalves: I'm trying a manual install | 08:54 |
sapd1 | yeah | 08:55 |
*** yboaron has quit IRC | 08:55 | |
*** ccamposr has joined #openstack-lbaas | 09:06 | |
*** Emine has quit IRC | 09:29 | |
*** Emine has joined #openstack-lbaas | 09:36 | |
*** kobis1 has joined #openstack-lbaas | 09:43 | |
*** kobis1 has left #openstack-lbaas | 09:44 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: Encrypt certs and keys https://review.openstack.org/627064 | 10:06 |
*** salmankhan has joined #openstack-lbaas | 10:21 | |
*** salmankhan1 has joined #openstack-lbaas | 10:31 | |
*** salmankhan has quit IRC | 10:32 | |
*** salmankhan1 is now known as salmankhan | 10:32 | |
*** mkuf_ has joined #openstack-lbaas | 10:48 | |
*** mkuf has quit IRC | 10:52 | |
*** mkuf_ has quit IRC | 11:24 | |
*** sapd1 has quit IRC | 11:37 | |
*** Emine has quit IRC | 11:46 | |
*** velizarx has quit IRC | 11:56 | |
*** velizarx has joined #openstack-lbaas | 11:58 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: Encrypt certs and keys https://review.openstack.org/627064 | 12:09 |
*** mkuf has joined #openstack-lbaas | 12:13 | |
*** Emine has joined #openstack-lbaas | 12:35 | |
*** yamamoto has joined #openstack-lbaas | 13:00 | |
*** gcheresh has quit IRC | 13:01 | |
*** gcheresh_ has joined #openstack-lbaas | 13:01 | |
openstackgerrit | Nir Magnezi proposed openstack/octavia master: WIP: CentOS with multiple fixed ips https://review.openstack.org/636065 | 13:01 |
*** velizarx has quit IRC | 13:04 | |
*** yamamoto has quit IRC | 13:05 | |
*** velizarx has joined #openstack-lbaas | 13:10 | |
*** trown|outtypewww is now known as trown | 13:13 | |
*** sapd1 has joined #openstack-lbaas | 13:34 | |
*** yamamoto has joined #openstack-lbaas | 13:40 | |
*** gcheresh has joined #openstack-lbaas | 13:43 | |
*** gcheresh_ has quit IRC | 13:44 | |
*** yamamoto has quit IRC | 13:44 | |
nmagnezi | cgoncalves, are you able to locally build a centos based image? | 13:54 |
nmagnezi | cgoncalves, asking because I get the following: Cannot uninstall 'virtualenv'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall. | 13:55 |
cgoncalves | nmagnezi, yes. make sure you use DIB from master | 13:58 |
nmagnezi | cgoncalves, aye.. should we bump our deps or something? (Assuming there's a newer release there..) | 13:58 |
cgoncalves | we need a new release of DIB | 13:59 |
cgoncalves | CI is okay because it pulls from master | 13:59 |
nmagnezi | cgoncalves, aye. Added to LIBS_FROM_GIT locally for now | 14:00 |
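A sketch of that workaround, assuming the fix only exists in DIB master (the git host is as it was at the time):

```shell
# build images with diskimage-builder from master instead of the last release
pip install -U "git+https://git.openstack.org/openstack/diskimage-builder"

# or let devstack handle it via local.conf:
#   LIBS_FROM_GIT+=,diskimage-builder
```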
*** yboaron_ has quit IRC | 14:03 | |
*** yboaron_ has joined #openstack-lbaas | 14:03 | |
cgoncalves | thought of the day: running grenade locally is a PITA. countless issues I've run into | 14:48 |
openstackgerrit | boden proposed openstack/neutron-lbaas master: stop using common db mixin https://review.openstack.org/635570 | 14:56 |
cgoncalves | kernel panic on cirros, great | 14:57 |
*** AlexStaf has quit IRC | 15:08 | |
*** yboaron_ has quit IRC | 15:21 | |
openstackgerrit | Margarita Shakhova proposed openstack/octavia master: Support create amphora instance from volume based. https://review.openstack.org/570505 | 15:21 |
*** yboaron_ has joined #openstack-lbaas | 15:21 | |
*** fnaval has joined #openstack-lbaas | 15:38 | |
*** yboaron_ has quit IRC | 15:42 | |
*** yboaron_ has joined #openstack-lbaas | 15:42 | |
*** hebul_ has joined #openstack-lbaas | 15:51 | |
*** hebul_ has quit IRC | 15:54 | |
*** hebul has joined #openstack-lbaas | 15:55 | |
hebul | Hello all, who could help me with some octavia questions? | 15:57 |
johnsom | hebul: you are in the right place. What can we help with? | 15:58 |
hebul | Thank you ! | 15:59 |
*** sapd1 has quit IRC | 16:01 | |
hebul | Question 1. After the amphora VM was created (with a net id provided) I found out that it consumes 2 IPs in that network | 16:01 |
hebul | one consumed IP is shown when we ask for the lb list through the CLI. But we can also see one more IP consumed when asking for the vm list in openstack | 16:04 |
johnsom | Correct, each has a base port with IP and a secondary IP (allowed address pairs in neutron) that is used for the VIP and HA. | 16:05 |
*** ramishra has quit IRC | 16:09 | |
hebul | @johnsom Why don't we use a single one for all purposes? | 16:10 |
johnsom | The VIP is a special port, called an allowed address pair port in neutron. It allows us to move the VIP address between VMs in the case of a failure. When running in Active/Standby topology, this can happen in around a second. In single mode, it takes a bit longer | 16:12 |
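Roughly what that looks like at the neutron level (network name and VIP address are hypothetical):

```shell
# each amphora has a normal base port with its own fixed IP...
openstack port create --network tenant-net amphora-base-port

# ...and the shared VIP is allowed on that port as an extra address,
# so keepalived can move the VIP between amphorae on failover
openstack port set --allowed-address ip-address=203.0.113.10 amphora-base-port
```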
colin- | not crazy about the extra IPs either but the resilience he's describing made it a no-brainer for me | 16:13 |
*** AlexStaf has joined #openstack-lbaas | 16:17 | |
*** fnaval has quit IRC | 16:17 | |
johnsom | Yeah, in theory a single topology amp could use just one IP, but it would require a new network driver be written and no one has been motivated to do it. Pretty much all of us always run Active/Standby. | 16:17 |
colin- | have found that works very well, fwiw | 16:18 |
colin- | (to hebul primarily hope that helps!) | 16:18 |
johnsom | colin- Did you tune it or are you running with the defaults? | 16:18 |
colin- | the threads, similar to hm? default for now | 16:19 |
johnsom | colin- It can be tuned to failover much faster than the defaults if you want it to. | 16:19 |
colin- | oh i see, the freshness | 16:19 |
colin- | gotcha | 16:19 |
colin- | also default for now | 16:19 |
johnsom | I mean the Active/Standby settings. Just curious. I run defaults, but have demoed with it tuned | 16:19 |
hebul | Thank you !!! So as I see it, that is a true limitation for now (we can't move the VIP from one VM to another unless the VM has that extra port, or we'd have to change the driver) | 16:20 |
colin- | like heartbeat_timeout? | 16:20 |
*** gcheresh has quit IRC | 16:20 | |
johnsom | https://docs.openstack.org/octavia/latest/configuration/configref.html#keepalived-vrrp | 16:20 |
johnsom | Those first four settings | 16:21 |
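For reference, those options with what I believe were the defaults of that era (double-check against the config reference above):

```ini
[keepalived_vrrp]
vrrp_advert_int = 1        # seconds between VRRP advertisements
vrrp_check_interval = 5    # seconds between VRRP health check runs
vrrp_fail_count = 2        # failed checks before marking the peer down
vrrp_success_count = 2     # successful checks before marking it up again
```

Lowering these trades failover latency against chattiness on the VRRP network.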
*** Emine has quit IRC | 16:21 | |
colin- | oh no, wasn't familiar with this thanks | 16:22 |
johnsom | Hmm, forgot we dropped the vrrp_advert_int to 1, so that is good already. We can go lower with the newer versions of keepalived, but I'm not sure it's really needed. | 16:22 |
colin- | oh i have been meaning to ask this but never think of it when we are chatting, is there any frequency adjustment on health check failure by default? | 16:22 |
colin- | old hw lbs i used to manage would sometimes increase the frequency of their checking after a failure and i always found that bothersome | 16:22 |
johnsom | Ah, so if it sees a failure it polls more often? | 16:23 |
colin- | right | 16:23 |
colin- | i'm assuming that is not true in our case | 16:23 |
johnsom | Hmm, I don't think we have that today. If HAproxy can do it, you would have to do a custom template for it. | 16:23 |
johnsom | Yeah, I don't think we do that. | 16:24 |
colin- | no i actually prefer not to have it because it changes the expected behavior with all these parameters imo, just double checking | 16:24 |
johnsom | I could see that being a bit annoying with the logs scrolling more. | 16:24 |
johnsom | hebul Any other questions we can help with? | 16:25 |
hebul | Question 2 is coming. How do we sort out the management network for octavia and the amphorae? What happens in a private network environment (e.g. VXLAN)? | 16:28 |
*** yboaron_ has quit IRC | 16:28 | |
hebul | A private network won't have connectivity to the external world (including the octavia mgmt net) by default | 16:29 |
johnsom | So, let me clarify and then answer. | 16:29 |
johnsom | There is the lb-mgmt-net, which is used for the controllers to talk to the amphora and the amphora to talk back to the controllers. It is typically a private network setup for this purpose, but it can be shared and/or routed. It is a TLS TCP connection to the amps, and UDP back to the controllers. No tenant traffic crosses this network as it is isolated from the tenant traffic inside a network namespace in the amp. | 16:31 |
johnsom | VIP and member networks, isolated inside the network namespace in the amphora, are hot-plugged into the amphora instance as the user configures their load balancer. We support any network that neutron supports for this, could be tenant private, could be a public external network. | 16:33 |
johnsom | Fundamentally the lb-mgmt-net is just a neutron network. The harder part is how to make it available for the controllers. There are many ways to do this. Provider networks, bridging it out to the controllers, setting up routes, etc. | 16:34 |
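Concretely, the neutron side is just a couple of commands (names and the CIDR are illustrative), plus telling the controllers which network to boot amphorae on:

```shell
openstack network create lb-mgmt-net
openstack subnet create --network lb-mgmt-net \
    --subnet-range 172.31.0.0/16 lb-mgmt-subnet

# octavia.conf then references the network ID, e.g.:
#   [controller_worker]
#   amp_boot_network_list = <lb-mgmt-net id>
```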
johnsom | Did that help answer the question? | 16:34 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Add LXD support to Octavia devstack https://review.openstack.org/636066 | 16:35 |
hebul | Yes, it helped me get closer, but it's not fully clear yet :) | 16:40 |
hebul | If I understood correctly, we have to provide both a provider network for the amphorae and VXLAN for the private tenant networks on the hypervisors | 16:44 |
johnsom | provider network is optional, that is just one way to setup the lb-mgmt-net. | 16:45 |
johnsom | Your tenant networks would use whatever mechanism you use today for neutron networks on your compute hosts. VXLAN is fine, as Octavia only talks to neutron and nova APIs for it. How it works behind nova and neutron, we don't need to know. | 16:47 |
johnsom | So for OpenStack Ansible, they chose to use a provider network for the lb-mgmt-net. For Redhat OSP 13, they chose to not use provider networks but to bridge it out of neutron. | 16:48 |
johnsom | If your neutron is all VXLAN, you could probably even have the Octavia controllers participate in the VXLAN overlay directly if you wish. | 16:49 |
cgoncalves | for OSP that is the default, yes, although one can create a neutron network and pass that in to the installer. the installer will see the network already exists and use it instead | 16:51 |
hebul | "The harder part is how to make it available for the controllers. There are many ways to do this. Provider networks, bridging it out to the controllers, setting up routes, etc." -- bridging it out to the controllers - what does that mean? | 16:53 |
hebul | Does it mean that I have to connect the Octavia services to the OVS integration bridge, on the same internal OVS VLAN ID that is used for the amphorae's internal project network? | 16:55 |
johnsom | Well the lb-mgmt-net is a neutron network. You create it with "openstack network create". At that point it lives in neutron and is connected to the amphora as needed. However, you need to also have a way for your controller processes (worker, health manager, housekeeping) to be able to access that network. | 16:55 |
johnsom | That is one option yes. That is how we do it in devstack: https://github.com/openstack/octavia/blob/master/devstack/plugin.sh#L358 | 16:56 |
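From memory, the devstack plumbing linked above boils down to something like this (paraphrased rather than copied; `o-hm0` is the interface name devstack uses, and the MAC/ID variables come from a neutron port created for the health manager):

```shell
# bind a local OVS internal interface to the identity of the neutron port,
# so the controller host gets a leg directly on lb-mgmt-net
ovs-vsctl -- --may-exist add-port br-int o-hm0 \
    -- set Interface o-hm0 type=internal \
    -- set Interface o-hm0 external-ids:iface-status=active \
    -- set Interface o-hm0 external-ids:attached-mac=$MGMT_PORT_MAC \
    -- set Interface o-hm0 external-ids:iface-id=$MGMT_PORT_ID
ip link set dev o-hm0 address $MGMT_PORT_MAC
dhclient o-hm0   # pick up an address on lb-mgmt-subnet
```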
*** ccamposr has quit IRC | 16:57 | |
*** fnaval has joined #openstack-lbaas | 16:59 | |
*** velizarx has quit IRC | 17:01 | |
hebul | Ok, johnsom, thanks a lot. I have a more or less basic understanding of the second question now. Are there any additional resources about octavia setup in different network scenarios to read more carefully and think about? | 17:03 |
johnsom | Not really. It's a TODO item. | 17:06 |
hebul | Ok, question number 3 (short one): multiple lb-mgmt-nets - is that supported? | 17:11 |
hebul | or is it intended to create a very large network at the beginning? | 17:13 |
johnsom | Technically it is supported, but currently there is no way to have the controllers select different networks when booting the amphora. Most of us use large subnets. | 17:15 |
hebul | ok, I see. | 17:18 |
hebul | Thank you, <johnsom> | 17:21 |
johnsom | Sure, let us know if we can help more. | 17:21 |
hebul | Do you think the 2-3 hours weekly I have on weekends would help you to start on that TODO item? Or is it too little? :) | 17:23 |
johnsom | Any help is welcomed. | 17:24 |
*** fnaval has quit IRC | 17:25 | |
*** fnaval has joined #openstack-lbaas | 17:28 | |
*** hebul has quit IRC | 17:33 | |
*** rpittau has quit IRC | 17:44 | |
*** hebul has joined #openstack-lbaas | 17:44 | |
*** hebul has left #openstack-lbaas | 17:44 | |
*** hebul has joined #openstack-lbaas | 17:45 | |
*** hebul has left #openstack-lbaas | 17:45 | |
*** rpittau has joined #openstack-lbaas | 17:45 | |
*** AlexStaf has quit IRC | 17:58 | |
*** trown is now known as trown|lunch | 18:03 | |
*** salmankhan has quit IRC | 18:10 | |
*** rpittau has quit IRC | 18:12 | |
*** openstackgerrit has quit IRC | 18:51 | |
colin- | xgerman: are you using the octavia ingress controller in your clusters atm? | 18:56 |
colin- | can't recall if we've spoken about this | 18:56 |
xgerman | nope | 18:56 |
colin- | ok | 18:57 |
xgerman | yeah, see http://blog.eichberger.de/posts/yolo_cloud/ | 19:00 |
colin- | haha, you had me at yolo | 19:01 |
colin- | will give that a read at lunch thx for sharing :) | 19:01 |
johnsom | Well, there you go: http://logs.openstack.org/69/636069/5/check/octavia-v2-dsvm-scenario-lxd/8c1b5e8/testr_results.html.gz | 19:07 |
cgoncalves | you just painted it all green, confess! | 19:08 |
johnsom | Disable most of the security protections, ignore all of the errors being thrown, ignore the fact that the kernel tuning doesn't work. | 19:08 |
cgoncalves | great job! | 19:08 |
johnsom | And you can have LXD amps | 19:08 |
johnsom | cgoncalves What is up with the centos gate? These 2h30m timeouts are getting.... old. | 19:09 |
johnsom | Tempted to pull centos out of the check gate altogether until it can be shown to be functional | 19:09 |
cgoncalves | johnsom, systemd patch merged upstream today IIRC | 19:09 |
johnsom | Look in zuul for that patch.... | 19:10 |
cgoncalves | https://bugzilla.redhat.com/show_bug.cgi?id=1666612 | 19:10 |
openstack | bugzilla.redhat.com bug 1666612 in systemd "Rules "uname -p" and "systemd-detect-virt" kill the system boot time on large systems" [High,Post] - Assigned to jsynacek | 19:10 |
johnsom | So how long do we have to wait until it gets in centos? | 19:10 |
cgoncalves | you guys complain of EL too much. either because it ships old versions or, now, because it ships latest versions. pick one, but just one! :) | 19:11 |
cgoncalves | dunno | 19:11 |
cgoncalves | RHEL/CentOS 7.7? | 19:11 |
johnsom | I just like things to work.... | 19:11 |
cgoncalves | lol | 19:12 |
cgoncalves | I will ask around | 19:12 |
johnsom | So, that lxd run was tempest in serial mode as there was a strange nova error about things "in use" that turned out to be apparmor. After centos times out there I will push a patch that puts it back to tempest concurrency 2. So we will have an apples to apples time comparison. | 19:13 |
johnsom | Oh, and I'm not sure the UDP stuff works. That was another whole set of errors about conntrack modules | 19:14 |
*** trown|lunch is now known as trown | 19:20 | |
*** openstackgerrit has joined #openstack-lbaas | 19:21 | |
openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add an lxd octavia scenario job https://review.openstack.org/636069 | 19:21 |
johnsom | octavia-v2-dsvm-scenario-centos-7 TIMED_OUT in 2h 33m 20s | 19:23 |
eandersson | Have you guys seen octavia-api VIRT memory growing to crazy amounts? Shouldn't be an issue, but we are seeing crazy cpu usage associated with that | 19:52 |
colin- | still the same figures (~650 amps) we were discussing last week | 19:53 |
johnsom | There really shouldn't be much load on the API side... It's all event driven. What release are you running? | 19:54 |
eandersson | Yea - we restarted the api and load dropped by 20 | 19:57 |
eandersson | Rocky | 19:57 |
eandersson | I can't explain why VIRT would be at 26GB | 19:57 |
johnsom | I have seen uwsgi go out to lunch and eat CPU, but typically when that happens nova is the first one that goes down | 19:57 |
eandersson | We are using uwsgi, but so is everything else | 19:59 |
johnsom | Yeah, we are too. I have just had times where I found multiple of the uwsgi processes spinning for no apparent reason. | 20:00 |
eandersson | The odd thing is that cpu usage is not even that high | 20:00 |
cgoncalves | I think I have seen that happening, yes. load increases with the number of amps created, never drops. not sure I still have the figure from grafana | 20:00 |
eandersson | but for some reason restarting the octavia processes drops the load by 20 | 20:00 |
*** jlaffaye has quit IRC | 20:01 | |
eandersson | The only thing that is odd that I can see is VIRT is at 26GB | 20:01 |
*** jlaffaye has joined #openstack-lbaas | 20:01 | |
cgoncalves | there is a known issue for the house keeping | 20:01 |
johnsom | That is crazy high for our API process. I mean, it doesn't do that much.... | 20:01 |
cgoncalves | https://review.openstack.org/#/c/627058/ | 20:01 |
eandersson | It feels like IO / memory pressure, but not sure how or why the api could cause that | 20:02 |
johnsom | Yeah, it pretty much is just sqlalchemy and rabbit | 20:02 |
cgoncalves | exactly, sqlalchemy... | 20:03 |
johnsom | Maybe if you have one in that state, find the thread and connect a debugger to it and see what it's up to.... | 20:03 |
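A sketch of that kind of triage (tool choice is mine; `py-spy` is an assumption, any Python stack sampler works):

```shell
# spot the octavia-api workers and their CPU/RSS
ps -eo pid,pcpu,rss,cmd | grep '[o]ctavia'

# total mapped (virtual) memory of a suspect worker
pmap -x <pid> | tail -1

# sample its Python stacks without stopping the process
py-spy dump --pid <pid>
```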
cgoncalves | not sqlalchemy's fault though but how we do db querying | 20:03 |
johnsom | 26GB though? Even if it was caching the whole octavia DB you would have a massive deployment for that | 20:04 |
cgoncalves | I've seen neutron-lbaas going waaaay above that | 20:05 |
johnsom | Plus it shouldn't be burning CPU if it's not handling API calls | 20:05 |
colin- | exactly | 20:06 |
cgoncalves | https://cgoncalves.pt/trash/openstack/octavia/HSZxPMn.png | 20:06 |
colin- | and the journal output suggests that transaction times are not abnormally long for the tasks it is performing | 20:06 |
colin- | all 0.1 0.2s | 20:06 |
cgoncalves | the load increase steps there are rally runs | 20:07 |
cgoncalves | 100 LBs IIRC | 20:07 |
eandersson | Can you check system load as well | 20:13 |
eandersson | Load Avg | 20:13 |
cgoncalves | I don't have access to the system any longer, sorry. it was from mid December | 20:14 |
eandersson | pmap on an octavia-api process shows 2477 pages :D | 20:24 |
eandersson | a busy nova process has 400 | 20:24 |
eandersson | busy neutron (with lbaas) has less than 400 | 20:25 |
eandersson | Not sure why octavia would need to allocate so many pages | 20:25 |
johnsom | I could see it if it was active, but idle no. There is still a stupid join in there, but that should purge after the request is done | 20:26 |
eandersson | Yea - we do have a memory leak in neutron-lbaas as well, but it's different | 20:27 |
johnsom | Yeah, that is pretty much known | 20:28 |
eandersson | Honestly think my latest lbaas patch will fix that (or at the very least improve it a lot) | 20:28 |
eandersson | since it does not have to do those crazy sql queries | 20:29 |
johnsom | Yeah, some of that craziness leaked over here with some patches attempting to reduce the number of round trips to mysql as it was emptying the connection pools/slots. Or something like that. Either way, they were bad patches | 20:31 |
johnsom | We have been working through fixing them. | 20:31 |
*** salmankhan has joined #openstack-lbaas | 20:32 | |
*** salmankhan has quit IRC | 20:36 | |
*** dmellado has quit IRC | 20:39 | |
*** salmankhan has joined #openstack-lbaas | 20:53 | |
nmagnezi | o/ | 21:17 |
johnsom | Hi there | 21:18 |
nmagnezi | johnsom, when you have a moment, please check my comment in https://storyboard.openstack.org/#!/story/2004112 so I can test my assumptions :) | 21:19 |
johnsom | Looking | 21:20 |
*** salmankhan has quit IRC | 21:20 | |
nmagnezi | johnsom, thank you! | 21:21 |
johnsom | Commented | 21:28 |
johnsom | nmagnezi I get your point, but I think the code is broken for IPv6 members. I think it writes out the first subnet and not the one specified. | 21:29 |
nmagnezi | johnsom, so basically creating a LB with one member in subnet_a and another member in subnet_b will result in the member in subnet_b being unreachable? (Saying that so I can test my fix attempts) | 21:32 |
nmagnezi | Does it happen only with IPv6? Or only with a mix of IPv4 and IPv6? | 21:32 |
johnsom | The test case I hit for that bug (sorry I didn't put it in the story) is boot up an LB, create tenant network, add IPv4 subnet, add IPv6 subnet, add an IPv6 member to the network. The interface file written out will be for the IPv4 subnet. | 21:34 |
johnsom | The IPv6 member will be unreachable | 21:34 |
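That repro as a hedged CLI sketch (all names, CIDRs and addresses invented; `pool1` is assumed to already exist on the LB):

```shell
openstack network create member-net
openstack subnet create --network member-net \
    --subnet-range 10.0.10.0/24 member-v4
openstack subnet create --network member-net --ip-version 6 \
    --subnet-range fd00:10::/64 member-v6

# bug: the interface file written inside the amphora describes
# member-v4 (the network's first subnet) instead of member-v6
openstack loadbalancer member create --subnet-id member-v6 \
    --address fd00:10::5 --protocol-port 80 pool1
```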
nmagnezi | Ack | 21:36 |
nmagnezi | Will try it out | 21:36 |
nmagnezi | That's all I needed to know | 21:36 |
nmagnezi | Calling it a day.. | 21:36 |
johnsom | Yep, sorry for the poor story quality. | 21:36 |
nmagnezi | One last thing is that I responded to most of your comments here, and followed up with some questions: https://review.openstack.org/#/c/627064/ | 21:37 |
nmagnezi | No worries | 21:37 |
johnsom | Yeah, saw that. Will reply today | 21:37 |
*** dmellado has joined #openstack-lbaas | 21:43 | |
*** trown is now known as trown|outtypewww | 22:00 | |
rm_work | nmagnezi / johnsom re: mixed-members subnet issues -- i definitely ran into that recently, thought there was already movement somewhere on fixing it? | 22:39 |
rm_work | i forget if i had a patch or someone else did | 22:39 |
rm_work | colin-: do you see anything like this in your API logs? `2019-02-06 20:30:25.839 2364 WARNING oslo.messaging._drivers.impl_rabbit [-] Unexpected error during heartbeart thread processing, retrying...: error: [Errno 110] Connection timed out` octavia-api.log | 22:39 |
colin- | good question, let me see if i can spot that line. not immediately familiar | 22:40 |
johnsom | I fixed it for ubuntu, but there is still an open bug for redhat. Really that whole chain needs to be re-worked however.... | 22:40 |
rm_work | just look for "error during heartbeart" | 22:40 |
rm_work | ah maybe all that happened was I filed a story for it <_< | 22:41 |
colin- | don't see that from either octavia-api output over the past three hours, looking back farther | 22:42 |
rm_work | hmm k | 22:43 |
rm_work | it'd be before a restart | 22:43 |
rm_work | it takes a while for that line to start showing up IME | 22:43 |
johnsom | Does it really say "hearbeart"? | 22:44 |
rm_work | yes | 22:45 |
rm_work | it's an oslo error | 22:45 |
rm_work | from rabbit | 22:45 |
rm_work | unrelated to our heartbeats | 22:45 |
johnsom | Right, I get that it's a rabbit thing and either oslo or rabbit code | 22:46 |
rm_work | I'm just wondering whether it's a bug in how we use the client (missing a close somewhere?) or in the Oslo side | 22:47 |
johnsom | https://bugzilla.redhat.com/show_bug.cgi?id=1542100 | 22:47 |
openstack | bugzilla.redhat.com bug 1542100 in python-oslo-messaging "Can't failover when rabbit_hosts is configured as 3 hosts" [High,Closed: wontfix] - Assigned to jeckersb | 22:47 |
rm_work | Were we supposed to be closing connections somehow and we never got the memo? lol | 22:47 |
johnsom | Where here is the code logging that: https://github.com/openstack/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_rabbit.py#L897 | 22:51 |
rm_work | That bug was supposedly fixed in pike | 22:52 |
johnsom | Yeah, it seems like it throws that if a rabbit node goes down or the network drops | 22:52 |
johnsom | What is the exception logged right after that? | 22:53 |
rm_work | Not sure | 22:53 |
johnsom | Ah, I guess it needs debug logging.... sigh | 22:53 |
colin- | broadened the scope to 12h and still haven't found that error so far rm_work | 22:53 |
rm_work | Hmm ok | 22:53 |
rm_work | Well, thanks | 22:53 |
johnsom | Ah, it's the connection timeout message | 22:55 |
rm_work | ok so... in my deployment, those started happening more and more frequently, and my digging showed that was because there were more and more of those threads, and they basically weren't dying | 23:06 |
rm_work | so they were building up | 23:06 |
rm_work | until eventually there were so many the API process was no longer responsive | 23:06 |
colin- | interesting | 23:07 |
colin- | any chance that is happening and not logging that message? | 23:08 |
rm_work | hmmm, what is your log level | 23:08 |
colin- | default_log_levels is default values | 23:12 |
colin- | if you're referring to that list? | 23:12 |
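If anyone needs the extra detail johnsom mentioned earlier, a hedged octavia.conf sketch (these are standard oslo.log options; the exact module list is illustrative):

```ini
[DEFAULT]
debug = True
# or, more surgically, raise only the messaging layer:
default_log_levels = amqp=DEBUG,amqplib=DEBUG,oslo.messaging=DEBUG,sqlalchemy=WARN
```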
*** icey has quit IRC | 23:46 | |
*** yetiszaf has quit IRC | 23:46 | |
*** fyx has quit IRC | 23:46 | |
*** coreycb has quit IRC | 23:46 | |
*** fnaval has quit IRC | 23:48 |