*** mithilarun has quit IRC | 00:58 | |
*** mithilarun has joined #openstack-lbaas | 01:03 | |
*** yamamoto has joined #openstack-lbaas | 01:15 | |
*** mithilarun has quit IRC | 01:19 | |
*** yamamoto has quit IRC | 02:19 | |
*** goldyfruit has quit IRC | 02:30 | |
*** yamamoto has joined #openstack-lbaas | 03:14 | |
*** hongbin has joined #openstack-lbaas | 04:03 | |
*** goldyfruit has joined #openstack-lbaas | 04:34 | |
*** hongbin has quit IRC | 04:39 | |
openstackgerrit | Adit Sarfaty proposed openstack/neutron-lbaas stable/stein: Prevent deletion of a listener attached to a pool https://review.opendev.org/677659 | 05:31 |
---|---|---|
*** gcheresh has joined #openstack-lbaas | 05:57 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia-tempest-plugin master: Enable KVM libvirt type on all scenario jobs https://review.opendev.org/702921 | 06:05 |
*** gcheresh has quit IRC | 07:10 | |
*** gcheresh has joined #openstack-lbaas | 07:35 | |
*** maciejjozefczyk_ has joined #openstack-lbaas | 07:48 | |
openstackgerrit | Merged openstack/octavia master: Allow the Octavia wsgi to accept argv parameters https://review.opendev.org/701485 | 08:07 |
*** tkajinam has quit IRC | 08:10 | |
*** tesseract has joined #openstack-lbaas | 08:20 | |
*** rpittau|afk is now known as rpittau | 08:48 | |
*** yamamoto has quit IRC | 08:53 | |
*** goldyfruit has quit IRC | 09:11 | |
*** yamamoto has joined #openstack-lbaas | 09:14 | |
*** yamamoto has quit IRC | 09:18 | |
*** sapd1_x has joined #openstack-lbaas | 09:59 | |
*** ivve has joined #openstack-lbaas | 10:07 | |
*** openstackgerrit has quit IRC | 10:12 | |
*** sapd1_x has quit IRC | 10:29 | |
*** goldyfruit has joined #openstack-lbaas | 10:47 | |
*** goldyfruit has quit IRC | 11:10 | |
*** rpittau is now known as rpittau|bbl | 11:15 | |
*** maciejjozefczyk_ has quit IRC | 11:45 | |
*** gcheresh has quit IRC | 11:58 | |
*** maciejjozefczyk_ has joined #openstack-lbaas | 12:04 | |
*** tkajinam has joined #openstack-lbaas | 12:14 | |
*** rcernin has quit IRC | 12:28 | |
*** gcheresh has joined #openstack-lbaas | 12:42 | |
*** nicolasbock has joined #openstack-lbaas | 12:49 | |
*** rpittau|bbl is now known as rpittau | 13:05 | |
*** servagem has joined #openstack-lbaas | 13:30 | |
*** AlexStaf has joined #openstack-lbaas | 13:37 | |
*** servagem has quit IRC | 14:14 | |
*** yamamoto has joined #openstack-lbaas | 14:21 | |
*** openstackgerrit has joined #openstack-lbaas | 14:36 | |
openstackgerrit | Brian Haley proposed openstack/octavia master: Remove all usage of six library https://review.opendev.org/701290 | 14:36 |
*** yamamoto has quit IRC | 14:36 | |
*** gcheresh has quit IRC | 14:53 | |
*** gcheresh has joined #openstack-lbaas | 14:54 | |
*** AlexStaf has quit IRC | 14:55 | |
*** yamamoto has joined #openstack-lbaas | 14:59 | |
*** yamamoto has quit IRC | 14:59 | |
*** yamamoto has joined #openstack-lbaas | 14:59 | |
*** yamamoto has quit IRC | 15:04 | |
*** TrevorV has joined #openstack-lbaas | 15:14 | |
*** tkajinam has quit IRC | 15:16 | |
*** gcheresh has quit IRC | 15:25 | |
*** gcheresh has joined #openstack-lbaas | 15:59 | |
*** Trevor_V has joined #openstack-lbaas | 16:00 | |
*** ccamposr__ has quit IRC | 16:01 | |
*** ccamposr__ has joined #openstack-lbaas | 16:01 | |
*** TrevorV has quit IRC | 16:04 | |
*** maciejjozefczyk_ has quit IRC | 16:04 | |
*** openstackgerrit has quit IRC | 16:13 | |
*** gcheresh has quit IRC | 16:17 | |
*** gcheresh has joined #openstack-lbaas | 16:17 | |
*** gcheresh has quit IRC | 16:27 | |
*** mithilarun has joined #openstack-lbaas | 16:31 | |
*** servagem has joined #openstack-lbaas | 16:31 | |
*** servagem has quit IRC | 16:37 | |
*** servagem has joined #openstack-lbaas | 16:37 | |
*** mithilarun has quit IRC | 16:54 | |
*** tesseract has quit IRC | 17:01 | |
*** rpittau is now known as rpittau|afk | 17:04 | |
*** mithilarun has joined #openstack-lbaas | 18:11 | |
*** ramishra has quit IRC | 18:41 | |
johnsom | Finally: Flow 'octavia-failover-loadbalancer-flow' (0a5f299e-ba0b-4433-b172-84421b130e3d) transitioned into state 'SUCCESS' from state 'RUNNING' | 18:43 |
cgoncalves | \o/ | 18:43 |
johnsom | It actually works too: <grin> | 18:44 |
johnsom | Flow 'octavia-failover-loadbalancer-flow' (0a5f299e-ba0b-4433-b172-84421b130e3d) transitioned into state 'SUCCESS' from state 'RUNNING' | 18:44 |
johnsom | Wrong paste | 18:44 |
johnsom | stack@devstack:~/octavia$ curl 10.0.0.45 | 18:44 |
johnsom | Welcome to 172.21.1.29 connection 110942 | 18:44 |
johnsom | I really hate gnome's cut/paste... | 18:44 |
cgoncalves | mnaser, hey! I've been trying to hit a nodepool instance on vexxhost to specifically test a patch there, with no luck thus far. all I get is rax, ovh and fortnebula. any chance you could help somehow? | 18:47 |
cgoncalves | patch: https://review.opendev.org/#/c/702921/ | 18:47 |
cgoncalves | johnsom, it will be fun merging that in the amphorav2 driver. not to mention backporting | 18:48 |
johnsom | failover? | 18:49 |
cgoncalves | can't get worse than the single-process patch | 18:49 |
cgoncalves | flows refactor | 18:49 |
johnsom | ummm, well, .... Yeah, it will be a bunch of work for sure. | 18:50 |
*** mithilarun has quit IRC | 18:50 | |
johnsom | I guess I have found the longest time fedora 31 can deal with my system load, 49 days. Windows are not all painting, terminal character movement is stuttering, etc.... Time for a reboot. | 18:58 |
cgoncalves | I had the same issue until some weeks (months?) ago. very sporadic and only affecting when on multi-screen | 19:02 |
johnsom | I have seen this before. It seems to be swap related. Once it decides it needs to start swapping, it gets unhappy. | 19:03 |
*** gcheresh has joined #openstack-lbaas | 19:10 | |
*** mithilarun has joined #openstack-lbaas | 19:12 | |
*** born2bake has joined #openstack-lbaas | 19:51 | |
*** nicolasbock has quit IRC | 19:57 | |
cgoncalves | mnaser, never mind. I got one job running on vexxhost. KVM libvirt type + cpu mode host-passthrough work. expected to see better performance, though, but this is good progress :) | 20:09 |
rm_work | are there any known issues currently with UDP-CONNECT healthchecks? | 20:11 |
*** yamamoto has joined #openstack-lbaas | 20:11 | |
rm_work | Seeing ERROR for all member operating_status, but traffic is still passing to those members <_< | 20:11 |
rm_work | and apparently even if the members are taken completely down, the LB still tries to pass traffic to them | 20:12 |
rm_work | and they're still just "ERROR" | 20:12 |
rm_work | not DOWN | 20:12 |
johnsom | rm_work: works for me, but you have to have your security groups set up right for the ICMP traffic. | 20:12 |
rm_work | yeah i'll make sure | 20:12 |
johnsom | We don't have a "DOWN" status BTW | 20:13 |
rm_work | but why would the HM allow traffic to an ERROR member? | 20:13 |
johnsom | https://docs.openstack.org/api-ref/load-balancer/v2/index.html#status-codes | 20:13 |
rm_work | ah | 20:14 |
rm_work | could have sworn we did, ok welp | 20:14 |
rm_work | but shouldn't it take ERROR members out of rotation? | 20:14 |
johnsom | yes, in theory | 20:15 |
johnsom | I would hop on the amp and look at the ipvsadm status | 20:15 |
johnsom | then back track | 20:15 |
rm_work | AH ok that's how you do that? I was looking for logs but there didn't seem to be any | 20:15 |
johnsom | Yeah, the LVS-based UDP functionality is all in the kernel and does not log flows. | 20:16 |
*** yamamoto has quit IRC | 20:16 | |
cgoncalves | FYI, gthiemonge's UDP scenario patch includes a health monitor test for UDP members: https://review.opendev.org/#/c/656515/ | 20:18 |
rm_work | i think it's not listing anything O_o | 20:18 |
johnsom | Are you in the netns? | 20:18 |
johnsom | Yeah, pretty sure I sent you all this once before, including the test patch | 20:18 |
rm_work | ah ok it's netns relevant too | 20:19 |
rm_work | yeah, so it's listing the one member that's in ERROR according to the HM | 20:19 |
rm_work | cgoncalves: is that run on both centos and ubuntu amps? could be an OS difference? | 20:20 |
cgoncalves | rm_work, Cirros in upstream CI and I'm +50% certain gthiemonge also tested on RHEL/CentOS too ;) | 20:24 |
gthiemonge | o/ | 20:24 |
gthiemonge | yep, ubuntu and centos amps | 20:24 |
cgoncalves | oh, there's the man! | 20:24 |
rm_work | hmm | 20:26 |
rm_work | yeah if i'm on the Amp, how can i get some insight into what the HM is doing? | 20:27 |
rm_work | I don't know how the UDP HMs work | 20:27 |
gthiemonge | I'd use tcpdump (in the amp or on the host) to see if HM is working correctly | 20:28 |
johnsom | All of the info is in ipvsadm | 20:28 |
johnsom | I am going to grab lunch, but can maybe answer questions after lunch | 20:29 |
cgoncalves | rm_work, https://docs.openstack.org/octavia/latest/user/guides/basic-cookbook.html#other-heath-monitors | 20:29 |
rm_work | hmm k | 20:29 |
cgoncalves | see "UDP-CONNECT" | 20:29 |
rm_work | yeah | 20:29 |
rm_work | so we're not getting "ONLINE" for a down member | 20:30 |
rm_work | we're getting ERROR for up OR down members, and the members are remaining in the rotation either way | 20:30 |
rm_work | if status is ERROR (correctly, OR not) shouldn't it be removed from the rotation? | 20:30 |
gthiemonge | rm_work: yes it should | 20:32 |
rm_work | ok, that's what I'm not seeing | 20:33 |
rm_work | so trying to figure out what the HM is actually seeing, not what our agent is forwarding | 20:33 |
rm_work | because the agent is obviously forwarding "DOWN" | 20:34 |
* rm_work https://github.com/openstack/octavia/blob/master/octavia/controller/healthmanager/health_drivers/update_db.py#L421-L422 | 20:35 | |
rm_work | but ipvsadm shows the down members still in rotation | 20:36 |
rm_work | (which I guess is where I was thinking of "DOWN" existing) | 20:40 |
*** rcernin has joined #openstack-lbaas | 20:42 | |
*** gcheresh has quit IRC | 20:45 | |
*** mithilarun has quit IRC | 20:46 | |
*** armax has joined #openstack-lbaas | 21:22 | |
*** maciejjozefczyk_ has joined #openstack-lbaas | 21:28 | |
johnsom | So, rm_work, in the amp netns, if you run "ipvsadm --list", the output shows all of the healthy members on the VIP (first line). | 21:31 |
johnsom | Any members detected as DOWN would not be listed in that list. | 21:31 |
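
The check johnsom describes (a member is in rotation only if it is listed under the VIP) can be sketched in Python. This is a minimal illustration, not Octavia code: it assumes numeric output as produced by `ipvsadm -L -n`, and the sample table below is made up for the example (the addresses are borrowed from this log; the VIP port and scheduler are placeholders). On an amphora the command would need to run inside the amphora network namespace.

```python
import re

# Sample text in the general layout printed by `ipvsadm -L -n`.
# The table is invented for illustration; only the addresses come from the chat.
SAMPLE = """\
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
UDP  10.0.0.45:80 rr
  -> 172.21.1.119:55555           Masq    1      0          0
  -> 10.21.21.1:55555             Masq    1      0          0
"""


def members_in_rotation(ipvsadm_output):
    """Map each virtual service to the real servers listed under it."""
    services = {}
    current = None
    for line in ipvsadm_output.splitlines():
        # A virtual service line starts with the protocol in column 0.
        service = re.match(r"^(TCP|UDP|SCTP|FWM)\s+(\S+)", line)
        # A real server line starts with "->" followed by a numeric address.
        member = re.match(r"^\s*->\s+(\d\S*:\d+)\s", line)
        if service:
            current = f"{service.group(1)} {service.group(2)}"
            services[current] = []
        elif member and current is not None:
            services[current].append(member.group(1))
    return services


if __name__ == "__main__":
    for vip, members in members_in_rotation(SAMPLE).items():
        print(vip, "->", members)
```

Any member address missing from the returned list for a VIP has been pulled out of rotation by the kernel-side configuration.
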
rm_work | right | 21:31 |
rm_work | it's listing the member | 21:32 |
johnsom | I have this config: | 21:32 |
johnsom | https://www.irccloud.com/pastebin/LTsJgsta/ | 21:32 |
johnsom | but once the health check kicks in: | 21:32 |
johnsom | https://www.irccloud.com/pastebin/fIeTC4PN/ | 21:32 |
rm_work | hmm | 21:32 |
johnsom | instance 172.21.1.119 has the ICMP SG rules in place to allow the "port unreachable" | 21:33 |
johnsom | Instance 10.21.21.1 does not allow sending ICMP so it is falsely detecting port 55555 as open | 21:33 |
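
Why a blocked ICMP reply makes a closed UDP port look open can be shown with a small Python sketch. This is an illustration of the general principle only, not the contents of Octavia's `udp_check.sh`; the function name and the probe payload are hypothetical. A connected UDP socket reports an ICMP "port unreachable" as `ConnectionRefusedError`, while a host that stays silent (because nothing answers, or because ICMP is filtered) cannot be told apart from an open port.

```python
import socket


def udp_port_looks_open(host, port, timeout=1.0):
    """Return False only if the host actively refuses the datagram.

    Silence (a timeout) is indistinguishable from an open port, so it is
    reported as open. That is the false positive discussed in the chat.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            sock.send(b"ping")  # arbitrary probe payload
            sock.recv(1024)  # any reply at all means something is listening
        except ConnectionRefusedError:
            return False  # an ICMP "port unreachable" came back
        except socket.timeout:
            return True  # no reply and no ICMP error: assumed open
        return True
```

With security group rules that drop outbound ICMP on the member, the `ConnectionRefusedError` branch is never reached, so a probe like this keeps reporting the port as open.
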
*** Trevor_V has quit IRC | 21:33 | |
rm_work | yeah, again, assuming it's totally unreachable | 21:35 |
rm_work | it'd be marked DOWN by amp agent (which translates to ERROR in our DB/API) | 21:35 |
johnsom | If the ICMP is not being sent, the operating status will never update | 21:35 |
rm_work | which is what i'm seeing | 21:35 |
rm_work | hmmm | 21:35 |
*** maciejjozefczyk_ has quit IRC | 21:35 | |
rm_work | so it'd not be UP *or* DOWN (ONLINE/OFFLINE) | 21:36 |
rm_work | and the status just wouldn't ... change? | 21:36 |
rm_work | if the HM was never able to reach it from the initial add | 21:36 |
johnsom | I think it sticks at the last state it had | 21:36 |
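
The behaviour being guessed at here can be written down as a toy model in Python. It is not Octavia's code: it only encodes the two translations mentioned in this conversation (agent `UP` becomes `ONLINE`, agent `DOWN` becomes `ERROR`) plus johnsom's tentative "sticks at the last state" rule, and the names are hypothetical.

```python
# Only the two translations mentioned in the chat are modelled here.
AGENT_TO_OPERATING_STATUS = {"UP": "ONLINE", "DOWN": "ERROR"}


def next_operating_status(previous, agent_report):
    """Return a member's operating status after an optional agent report.

    `agent_report` is None when the health check produced no result, in
    which case the previous status is kept unchanged.
    """
    if agent_report is None:
        return previous
    # Unknown report values also leave the status as it was.
    return AGENT_TO_OPERATING_STATUS.get(agent_report, previous)
```

Under this model a member that was last recorded as `ERROR` and never gets a fresh report stays in `ERROR` indefinitely, which would match what rm_work is seeing.
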
rm_work | ok | 21:37 |
rm_work | so maybe that's the case, and the last state was ERROR for some reason? O_o | 21:37 |
rm_work | so the HM actually isn't getting any status, and therefore wouldn't remove the node from rotation | 21:37 |
johnsom | Hmm, something is fishy though. My "good" one is still showing "NO_MONITOR" | 21:37 |
*** mithilarun has joined #openstack-lbaas | 21:37 | |
johnsom | No, no, no, the removal of rotation is all in the kernel, HM has nothing to do with it | 21:37 |
johnsom | That is the ipvsadm --list output | 21:38 |
johnsom | If it's listed, it is in rotation, if not, it is out of the pool | 21:38 |
johnsom | When it is pulled out of the pool, because it is failed, it will also log: | 21:39 |
johnsom | Jan 21 21:35:10 amphora-6d2886c5-b068-49c8-a510-d4b987a97d9d Keepalived_healthcheckers_amphora-haproxy[5639]: Misc check to [172.21.1.119] for [/var/lib/octavia/lvs/check/udp_check.sh 172.21.1.119 55555] failed. | 21:39 |
johnsom | in the syslog/messages | 21:39 |
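
Since the LVS path logs no flows, the keepalived line above is the main trace of a member being pulled. A short Python sketch for picking failed members out of syslog follows. The regular expression is derived only from the single log line pasted above, so treat the pattern as an assumption about the format; it also assumes the port is the last argument of the check command, as it is in that line.

```python
import re

# The log line pasted in the chat, reproduced verbatim.
LOG_LINE = (
    "Jan 21 21:35:10 amphora-6d2886c5-b068-49c8-a510-d4b987a97d9d "
    "Keepalived_healthcheckers_amphora-haproxy[5639]: "
    "Misc check to [172.21.1.119] for "
    "[/var/lib/octavia/lvs/check/udp_check.sh 172.21.1.119 55555] failed."
)

FAILED_CHECK = re.compile(
    r"Misc check to \[(?P<ip>[^\]]+)\] for \[(?P<command>[^\]]+)\] failed\."
)


def failed_member(line):
    """Return (ip, port) for a failed misc check line, else None."""
    match = FAILED_CHECK.search(line)
    if match is None:
        return None
    # Assumes the port is the last argument of the check command.
    port = match.group("command").split()[-1]
    return match.group("ip"), int(port)


if __name__ == "__main__":
    print(failed_member(LOG_LINE))
```
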
rm_work | ahh ok so the HM is still keepalived? | 21:41 |
johnsom | The health monitor for UDP is part keepalived, part kernel. keepalived will monitor | 21:42 |
rm_work | ok and then it just does commands to do the updating of the rotation | 21:44 |
johnsom | So, on mine, the agent is sending the right stuff, but an "UP" isn't making the member online. That said, this devstack is fairly old code | 21:46 |
*** mithilarun has quit IRC | 21:50 | |
*** mithilarun has joined #openstack-lbaas | 22:11 | |
*** tkajinam has joined #openstack-lbaas | 22:57 | |
*** openstackgerrit has joined #openstack-lbaas | 23:00 | |
openstackgerrit | Brian Haley proposed openstack/octavia master: Make octavia-grenade job use python3 https://review.opendev.org/693486 | 23:00 |
johnsom | 71 seconds for an Active/Standby load balancer full failover. | 23:02 |
johnsom | <note: no downtime of course> | 23:04 |
rm_work | \o/ | 23:08 |
johnsom | Well, ok, minimal downtime. lol | 23:27 |
*** armax has quit IRC | 23:41 |