Nick | Message | Time |
---|---|---|
opendevreview | Gregory Thiemonge proposed openstack/octavia stable/wallaby: Fix failover of az-specific loadbalancers https://review.opendev.org/c/openstack/octavia/+/815585 | 06:22 |
opendevreview | Gregory Thiemonge proposed openstack/octavia stable/victoria: Fix failover of az-specific loadbalancers https://review.opendev.org/c/openstack/octavia/+/815586 | 06:22 |
opendevreview | Gregory Thiemonge proposed openstack/octavia stable/ussuri: Fix failover of az-specific loadbalancers https://review.opendev.org/c/openstack/octavia/+/815587 | 06:23 |
opendevreview | Gregory Thiemonge proposed openstack/octavia stable/xena: Fix failover of az-specific loadbalancers https://review.opendev.org/c/openstack/octavia/+/815588 | 06:24 |
opendevreview | Gregory Thiemonge proposed openstack/octavia master: Reconfigure amphora network interfaces seamlessly https://review.opendev.org/c/openstack/octavia/+/812368 | 08:39 |
opendevreview | Gregory Thiemonge proposed openstack/octavia master: Fix plugging member subnets on existing networks https://review.opendev.org/c/openstack/octavia/+/665402 | 08:39 |
opendevreview | Gregory Thiemonge proposed openstack/octavia master: Allow multiple VIPs per LB https://review.opendev.org/c/openstack/octavia/+/660239 | 08:39 |
dulek | Hi folks! In Kuryr we see an elevated rate of SCTP test failures when using Amphora based on Ubuntu Focal. | 13:56 |
dulek | The symptoms are mostly that 0 or only a single backend responds when it should be 2. | 13:57 |
dulek | https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8bd/812588/12/check/kuryr-kubernetes-tempest-defaults/8bdf367/controller/index.html - example run. | 13:58 |
dulek | Hm, interestingly I see the LB moving from PENDING_UPDATE to PENDING_CREATE which we don't really expect. | 14:08 |
dulek | Ah no, it's not this thing. :/ | 14:15 |
johnsom | dulek By that message I am assuming you figured out what it was and it's not amphora related? | 14:20 |
dulek | johnsom: Nah, so looks like Kuryr correctly waits for Amphora to become ACTIVE and then allows running the SCTP connectivity tests. | 14:28 |
dulek | And then the test tries running connections to the LB assuming it'll reach all the backends. | 14:28 |
dulek | I think ~100 connections are tried. | 14:29 |
dulek | Yet it only reaches one despite ROUND_ROBIN algorithm. | 14:29 |
dulek | And we do see that only on Amphora gates. | 14:29 |
johnsom | Hmm, does it wait for operating status ONLINE or just prov status ACTIVE? Not that 100 connections shouldn't allow all to come up in time and start the RR. | 14:30 |
dulek | johnsom: I think we wait for the provisioning_status only. Is that incorrect? Could explain quite a bit - those amps on gates are super slow. | 14:30 |
dulek | And maybe even slower with Focal? | 14:31 |
johnsom | I did a quick scan of the logs, there aren't any errors there that I saw. Not even a member dropping offline (though it won't log if they never come online) | 14:31 |
johnsom | Well, adding a member should only take a few seconds. Actually it's centos that is having major performance issues at the moment. | 14:32 |
dulek | johnsom: Those VMs often run on software virtualization, I'm fairly sure few seconds isn't what we see. | 14:33 |
johnsom | So, provisioning status is when Octavia is done configuring things. Operating status is the "observed" status, i.e. the pod is responding, etc. | 14:33 |
johnsom | Yeah, software qemu is rough, very slow in the IO subsystem | 14:34 |
dulek | johnsom: Do I need to check the operating_status only on the LB or also the member? | 14:35 |
johnsom | Since you have more than one member, I would check the pool. ONLINE==all members healthy, DEGRADED==one or more members not responding as expected, ERROR==all members are down | 14:36 |
johnsom | Let me look through your test and the logs deeper to see if I can see what is up. | 14:36 |
johnsom | dulek So, it looks like the traffic test is starting before the second member is marked as provisioning status ACTIVE. | 15:23 |
johnsom | I see traffic through the LB at 10:57:00, but the second member isn't finished creating until 10:57:33 | 15:24 |
johnsom | It looks like only about 19 of the connections could have been in the RR and hit member 2 | 15:25 |
johnsom | Actually, no, none would have looking at the timing | 15:25 |
johnsom | It looks like the LB would have been fully provisioned at Oct 27 10:57:33.617640 | 15:27 |
johnsom | So, yeah, something isn't waiting for the LB create/configuration to finish before starting the test. That is why the traffic is only hitting one member. | 15:29 |
gthiemonge | #startmeeting Octavia | 16:00 |
opendevmeet | Meeting started Wed Oct 27 16:00:41 2021 UTC and is due to finish in 60 minutes. The chair is gthiemonge. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'octavia' | 16:00 |
gthiemonge | Hi | 16:00 |
johnsom | o/ | 16:00 |
gthiemonge | #topic Announcements | 16:02 |
gthiemonge | Yoga PTG | 16:02 |
gthiemonge | The Yoga PTG for Octavia was last week | 16:02 |
gthiemonge | We had some good discussions there | 16:02 |
gthiemonge | I would like to highlight some features/fixes we want to get in the Y release | 16:03 |
gthiemonge | - amphorav2+persistence by default | 16:03 |
gthiemonge | - Fix plugging member subnets on existing networks | 16:03 |
gthiemonge | - Allow multiple VIPs per LB | 16:03 |
gthiemonge | and also improvements for AZ support would be great | 16:03 |
gthiemonge | anything else? any other announcements? | 16:05 |
gthiemonge | #topic Brief progress reports / bugs needing review | 16:07 |
gthiemonge | FYI I created one (last?) backport for stable/ussuri | 16:07 |
gthiemonge | #link https://review.opendev.org/c/openstack/octavia/+/815587 | 16:07 |
gthiemonge | we need to merge it before Nov 12th if we want to include it in the final release for Ussuri | 16:08 |
johnsom | I think I reviewed the master patch for that yesterday. I will take a pass on the backports | 16:08 |
gthiemonge | I updated the two patches for the multi-subnet issue on member ports and for the multi-vip support (mentioned in the announcements) | 16:08 |
gthiemonge | johnsom: thanks | 16:08 |
gthiemonge | #link https://review.opendev.org/c/openstack/octavia/+/665402 | 16:08 |
gthiemonge | #link https://review.opendev.org/c/openstack/octavia/+/660239 | 16:08 |
gthiemonge | (last one is still WIP) | 16:09 |
gthiemonge | I also tried to update the octavia-grenade-ffu job | 16:10 |
gthiemonge | we want to update octavia from n-3 to n, without updating the other services (they probably don't support that jump) | 16:10 |
gthiemonge | I'm still failing to configure grenade to update only octavia | 16:10 |
njohnston | gthiemonge: I wonder if tosky can help, he has experience with grenade | 16:11 |
gthiemonge | njohnston: ok I can ask him | 16:12 |
gthiemonge | njohnston: thanks ;-) | 16:13 |
gthiemonge | #topic Open Discussion | 16:15 |
gthiemonge | I don't have any other topics | 16:15 |
gthiemonge | Ok Folks, thanks everyone! | 16:18 |
gthiemonge | #endmeeting | 16:18 |
opendevmeet | Meeting ended Wed Oct 27 16:18:28 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:18 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.html | 16:18 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.txt | 16:18 |
opendevmeet | Log: https://meetings.opendev.org/meetings/octavia/2021/octavia.2021-10-27-16.00.log.html | 16:18 |
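
The status discussion between dulek and johnsom in the log above (provisioning status ACTIVE means Octavia has finished configuring, operating status ONLINE on the pool means all members are healthy) suggests a wait loop that checks both before traffic tests start. The following is only a minimal sketch of that idea, not Kuryr's or Octavia's actual code. The function name `wait_for_pool_ready` and the `get_statuses` callable are hypothetical; the callable is assumed to return the pool's current `(provisioning_status, operating_status)` pair from whatever client the test suite already uses.

```python
import time
from typing import Callable, Tuple


def wait_for_pool_ready(
    get_statuses: Callable[[], Tuple[str, str]],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Block until the pool is both ACTIVE and ONLINE.

    get_statuses is a hypothetical callable returning the pool's
    (provisioning_status, operating_status) as strings.
    """
    deadline = time.monotonic() + timeout
    while True:
        provisioning, operating = get_statuses()

        # A provisioning ERROR will not recover by waiting longer.
        if provisioning == "ERROR":
            raise RuntimeError("pool provisioning_status is ERROR")

        # ACTIVE: Octavia is done configuring.
        # ONLINE: all members are observed healthy, as described in the log.
        if provisioning == "ACTIVE" and operating == "ONLINE":
            return

        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"pool not ready: provisioning_status={provisioning}, "
                f"operating_status={operating}"
            )
        sleep(interval)
```

On slow gates such as the software-virtualized hosts mentioned in the log, the timeout would likely need to be generous. Note that the operating status check assumes the members are actually being health-checked; whether that holds for a given deployment is an assumption of this sketch.
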