Wednesday, 2021-01-13

openstackgerritLingxian Kong proposed openstack/octavia stable/victoria: Use 'bash' in the keepalived check script
openstackgerritArieh Maron proposed openstack/octavia-tempest-plugin master: Updating _test_pool_CRUD to enable testing of updates to the  load balancer algorithm:
openstackgerritArieh Maron proposed openstack/octavia-tempest-plugin master: Updating _test_pool_CRUD to enable testing of updates to the  load balancer algorithm:
openstackgerritArkady Shtempler proposed openstack/octavia-tempest-plugin master: Adding failover test. Send HTTP traffic while MASTER Amphorae is rebooted. BACKUP Amphorae should serve the traffic.
openstackgerritArieh Maron proposed openstack/octavia-tempest-plugin master: Updating _test_pool_CRUD to enable testing of updates to the  load balancer algorithm:
johnsomhaleyb FYI, I put you on the agenda:
haleybjohnsom: thanks15:48
johnsom#startmeeting Octavia16:00
johnsomHi everyone16:01
johnsom#topic Announcements16:01
*** openstack changes topic to "Announcements (Meeting topic: Octavia)"16:01
johnsomAs contributing members of OpenStack, you should have received an e-mail for:16:02
johnsom2021 Individual Director Election and Bylaws Amendments16:02
johnsomPlease vote. They need 10% of the eligible people to vote to pass the amendments.16:02
johnsomAlso, this week is milestone 2 for Wallaby.16:04
johnsomOh, next week.16:04
johnsomAnyway, it is close.16:04
johnsomRemember that MS3 is feature freeze, so if you have any features you want to get into Octavia, you need those patches posted super soon.16:05
johnsomAny other announcements today?16:05
johnsom#topic Brief progress reports / bugs needing review16:06
*** openstack changes topic to "Brief progress reports / bugs needing review (Meeting topic: Octavia)"16:06
johnsomI have been doing a few reviews and answering questions via the discuss e-mail list and in the channel. Otherwise my focus is on Designate things.16:06
johnsomNo other updates this week?16:08
gthiemongeWe still have 2 open reviews for the gate issues:16:08
*** ccamposr__ has joined #openstack-lbaas16:09
johnsomJust so it highlights in the meeting minutes summary16:10
johnsom#topic Removing lower constraints jobs (haleyb)16:10
*** openstack changes topic to "Removing lower constraints jobs (haleyb) (Meeting topic: Octavia)"16:10
johnsomhaleyb You have the floor16:11
haleybother projects have been proposing changes to their stable branches to remove the lower-constraints jobs16:11
haleybespecially after the fiasco of the past couple weeks16:12
*** ccamposr has quit IRC16:12
haleybi know that we have found the job useful, but given the amount of time we've spent fixing stable, should we disable it there?16:12
haleybor on just really old stable branches?  looking for opinions16:13
haleyb#link - a victoria WIP16:13
johnsomYeah, I know we put a lot of effort into setting those up and getting them working correctly. It was triggered by issues with CentOS 7 shipping OLD python packages.16:14
johnsomSo, IMO they have value.... That said, no one is managing that list effectively.16:14
johnsomAs for stable branches. I agree that the requirements should be fixed on stable, so they should not be needed in theory.16:15
haleybright.  and i don't think we should remove on master, as it could find an issue16:15
johnsomThough recent events have messed with some stable branch requirements.16:16
haleybjohnsom: the alternative would have been to disable the l-c job i guess16:17
haleybyeah, could do that too, and after your earlier comment we can leave the files and tox.ini changes if you like, although most repos i looked at removed everything16:19
johnsomLol, most patches I saw left the l-c.txt file, which didn't make any sense to me.16:20
johnsomWell, I think the risk is low on stable branch, so I guess I am ok with that.16:20
johnsomReally, this just shifts the burden to downstream testing.16:21
* haleyb wonders if there are other opinions16:22
* johnsom wonders as well16:22
johnsomWell, I guess patches talk louder than IRC meeting chat....16:25
johnsom#topic Open Discussion16:25
haleybthe crickets win :)16:25
*** openstack changes topic to "Open Discussion (Meeting topic: Octavia)"16:25
johnsomAny other topics today?16:25
haleybi had one16:26
haleybbackport chain of an empty UDP pool fix16:26
johnsomZuul does not like you16:27
*** ccamposr has joined #openstack-lbaas16:27
haleybi have a dependent patch that can't merge until that gets to stein16:27
haleybthe grenade failure there should be fixed soon i hope with greg's change16:27
haleyband a somewhat related change is the one enabling the stable/victoria tempest tests16:28
johnsomOk, sounds good. Thanks for working on backports.16:28
johnsomIt would be good if someone can drive some stable branch releases as well. I think we are way behind on some of those.16:29
haleyb^^ that one will add more jobs but not having stable/victoria tested seems bad16:29
*** ccamposr__ has quit IRC16:29
johnsomWe were trying to get a bunch of stuff merged for the releases. I'm just not sure who is driving / tracking that now16:29
gthiemongeI'll try to rebase/recheck the backports when the gate fixes are merged16:30
haleybyes, i think we were waiting for the gates to get green16:30
johnsomOk, any other topics today?16:32
johnsomOk then, thanks everyone!16:34
gthiemongethanks johnsom16:34
openstackgerritMerged openstack/octavia stable/ussuri: Fix Ussuri requirements for Victoria grenade
*** gcheresh has quit IRC18:56
openstackgerritLingxian Kong proposed openstack/octavia stable/victoria: Use 'bash' in the keepalived check script
openstackgerritMerged openstack/octavia stable/train: Fix lower-constraints & requirements
*** vishalmanchanda has quit IRC21:41
openstackgerritBrian Haley proposed openstack/octavia master: Add SCTP support in Amphora
*** gcheresh has quit IRC21:57
spateljohnsom: hey!22:02
johnsomspatel Hi22:02
spatelmy LB is in PENDING_CREATE (for a long time)22:02
spateltrying to delete it but it's not letting me do that22:03
spatelany workaround ?22:03
johnsomYeah, that means a controller has it locked and owns it.22:03
johnsomDid you kill -9 a controller while it was working on that load balancer?22:03
spatelkill -9 a controller?22:04
spateli am not following you22:04
johnsomYeah, like kill the Octavia worker process, or non-gracefully shut down the container, or pull the power on the compute host?22:04
spateli have 3 controller nodes, do you think restarting the service would be fine?22:05
johnsomAlso, that could mean an issue with your rabbitmq22:05
spateli can build VM etc.. fine..22:05
johnsomGraceful restarts are fine, non-graceful can lead to things in PENDING_*22:05
spateli think it's a networking issue22:05
spatelwhy doesn't octavia have some kind of timeout, so if it stalls it ERRORs out instead of trying so hard22:06
johnsomOk, so first thing to do is tail all three Octavia worker logs. If it's retrying actions on that load balancer, it should be writing out to the log.22:06
spatellet me find out22:06
johnsomspatel It does have that timeout. The only way it doesn't finish is a kill -9 of the process22:07
spateljohnsom: one of the controller i am seeing -
spateljohnsom: finally it errored out by itself and i am able to delete it22:11
johnsomYeah, so that is probably it. Look at the openstack loadbalancer amphora list output, find the load balancer, see if one of the amphora associated with your PENDING_CREATE loadbalancer is using as the lb-mgmt-net IP22:11
johnsomYeah, ok, so a controller was still retrying to create the load balancer, thus it was in PENDING_CREATE.22:12
johnsomThere are configuration settings to tune how long it retries before giving up and marking it ERROR, should you want it to not try so hard.22:12
spateljohnsom: i'd like to have a short time :) let me try to see what went wrong22:13
spatelthank you22:13
johnsomSome people argue it should try forever, so... that is why it's configurable. grin22:14
mchlumskyOur users complain that they can't do anything with the LBs while they are PENDING_* while they can at least delete and re-create them when they are in ERROR. You have to find the right timeout balance between trying hard enough and empowering users.22:46
johnsomPersonally, I lean towards shorter timeouts. Typically if you get into one of these retry loops, nova or neutron aren't going to get fixed any time soon.22:54
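The retry-then-ERROR behaviour discussed above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not Octavia's implementation: `connect_with_retries`, `ConnectFailed`, `max_retries`, and `retry_interval` are hypothetical names chosen for the example, and the real option names and defaults should be taken from the Octavia configuration reference.

```python
import time


class ConnectFailed(Exception):
    """Hypothetical stand-in for a failed connection to an amphora."""


def connect_with_retries(connect, max_retries, retry_interval, sleep=time.sleep):
    """Call connect() until it succeeds or the retry budget is spent.

    Returns "ACTIVE" on success and "ERROR" once max_retries attempts
    have failed. While this loop runs, the load balancer would stay in
    PENDING_CREATE from the user's point of view.
    """
    for attempt in range(1, max_retries + 1):
        try:
            connect()
            return "ACTIVE"
        except ConnectFailed:
            # Only wait if another attempt is still allowed.
            if attempt < max_retries:
                sleep(retry_interval)
    return "ERROR"


def worst_case_wait_seconds(max_retries, retry_interval):
    """Upper bound on time spent sleeping between failed attempts."""
    return max(max_retries - 1, 0) * retry_interval
```

With these hypothetical parameters, the time a user waits in PENDING_* before the resource goes to ERROR grows with the product of the retry count and the interval, which is the trade-off between "try forever" and "give up early" raised in the conversation.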
openstackgerritBrian Haley proposed openstack/octavia stable/train: Fix operating status for empty UDP pools
openstackgerritBrian Haley proposed openstack/octavia stable/stein: Fix operating status for empty UDP pools
rm_workYeah, unfortunately letting users do Delete or Update operations on a LB that's in PENDING is a recipe for getting orphaned resources... It's hard to say what point in the workflow the controller-worker process has gotten to, and I'm not sure if there's a reasonable way to cancel a workflow from the outside when it's midway through (or if that would actually help the situation).23:01
johnsomI don't think they are asking for a way to cancel the workflow, just how soon we should give up and go to ERROR state23:02
johnsomThe above issue was the controller retrying to connect to the amp via neutron, but getting "no route to host".23:03
rm_worksounded like mchlumsky was advocating for possibly allowing changes in PENDING, and I was speculating about what would be required to make that feasible (since we would need to interrupt existing workflows if we were going to do further changes).23:03
rm_workIt was more of a response to him than to the specific issue23:03
johnsomWhile it was in PENDING_* it was retrying, over and over in hopes neutron would get fixed.23:03
johnsomAh, yeah, overriding the resource lock of PENDING is a bad thing23:04
rm_workwas just expanding on why the choice was made to have PENDING be immutable23:04
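The idea of PENDING_* acting as a resource lock can be illustrated with a small guard function. This is a simplified sketch, not Octavia's API code: `is_allowed`, `check_mutable`, and `ImmutableResource` are hypothetical names, and the rule that ERROR permits only delete is an assumption based on the comment above that users can delete and re-create load balancers in ERROR.

```python
class ImmutableResource(Exception):
    """Hypothetical error for a request against a locked resource."""


# Assumed, simplified policy: which operations each status permits.
ALLOWED_OPERATIONS = {
    "ACTIVE": {"update", "delete"},
    "ERROR": {"delete"},
}


def is_allowed(provisioning_status, operation):
    """Return True if the operation may start in the given status.

    Any PENDING_* status is absent from the table, so every operation
    is refused while a controller still owns the resource.
    """
    return operation in ALLOWED_OPERATIONS.get(provisioning_status, set())


def check_mutable(provisioning_status, operation):
    """Raise ImmutableResource instead of starting a second workflow."""
    if not is_allowed(provisioning_status, operation):
        raise ImmutableResource(
            f"cannot {operation} while status is {provisioning_status}"
        )
```

Refusing the request up front avoids having to interrupt a workflow that is midway through, which is the orphaned-resource risk described above.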
lxkongrm_work, johnsom, can I have your review for
johnsomlxkong, yep, that is an easy one. +223:19
lxkongthanks, we are waiting for this one to support UDP23:20
mchlumskyrm_work Sorry, I should have been more clear. I was talking about how soon to give up and go to ERROR. We can totally live with immutable LBs in PENDING_*. Octavia appears to be a complex state machine as it is without adding arbitrary cancellation points in the workflow.23:20
rm_worklet me look lxkong23:31
