| johnsom | sorrison Left some comments on the SDK patch. | 00:06 |
|---|---|---|
| sorrison | thanks | 00:17 |
| *** hongbin has quit IRC | 00:37 | |
| *** ramishra has quit IRC | 00:40 | |
| *** hongbin has joined #openstack-lbaas | 00:41 | |
| *** gthiemonge has quit IRC | 00:58 | |
| *** gthiemonge has joined #openstack-lbaas | 00:58 | |
| *** spatel has joined #openstack-lbaas | 01:39 | |
| openstackgerrit | Noah Mickus proposed openstack/octavia-lib master: Adding cipher list Support for provider drivers https://review.opendev.org/714558 | 01:44 |
| *** spatel has quit IRC | 01:44 | |
| *** ramishra has joined #openstack-lbaas | 02:07 | |
| *** gthiemonge has quit IRC | 02:46 | |
| *** gthiemonge has joined #openstack-lbaas | 02:46 | |
| *** hongbin has quit IRC | 03:29 | |
| openstackgerrit | Sam Morrison proposed openstack/octavia-dashboard master: Availability zone support https://review.opendev.org/714563 | 03:35 |
| *** jamesdenton has quit IRC | 04:30 | |
| *** jamesdenton has joined #openstack-lbaas | 05:03 | |
| *** dulek has quit IRC | 05:11 | |
| *** spatel has joined #openstack-lbaas | 05:40 | |
| *** spatel has quit IRC | 05:46 | |
| *** vishalmanchanda has joined #openstack-lbaas | 06:32 | |
| *** ataraday_ has joined #openstack-lbaas | 06:56 | |
| *** gcheresh has joined #openstack-lbaas | 07:17 | |
| *** gcheresh has quit IRC | 07:50 | |
| *** gcheresh has joined #openstack-lbaas | 07:56 | |
| *** tkajinam has quit IRC | 08:05 | |
| *** dulek has joined #openstack-lbaas | 08:12 | |
| *** rpittau|afk is now known as rpittau | 08:26 | |
| *** ccamposr__ has joined #openstack-lbaas | 08:36 | |
| *** ccamposr has quit IRC | 08:39 | |
| *** maciejjozefczyk has joined #openstack-lbaas | 08:56 | |
| *** TMM has quit IRC | 09:14 | |
| *** TMM has joined #openstack-lbaas | 09:14 | |
| *** gcheresh has quit IRC | 09:21 | |
| *** ccamposr has joined #openstack-lbaas | 09:27 | |
| *** ccamposr__ has quit IRC | 09:29 | |
| *** spatel has joined #openstack-lbaas | 09:42 | |
| *** spatel has quit IRC | 09:47 | |
| *** gcheresh has joined #openstack-lbaas | 09:55 | |
| *** ccamposr__ has joined #openstack-lbaas | 11:34 | |
| *** ccamposr has quit IRC | 11:37 | |
| *** psachin has joined #openstack-lbaas | 11:44 | |
| *** sapd1_x has joined #openstack-lbaas | 12:07 | |
| *** ccamposr has joined #openstack-lbaas | 12:40 | |
| *** ccamposr__ has quit IRC | 12:42 | |
| *** psachin has quit IRC | 13:05 | |
| *** ataraday_ has quit IRC | 13:36 | |
| *** tkajinam has joined #openstack-lbaas | 13:57 | |
| *** TrevorV has joined #openstack-lbaas | 14:09 | |
| cgoncalves | FYI, Analysis of 2019 User Survey Feedback: https://governance.openstack.org/tc/user_survey/analysis-12-2019.html | 14:27 |
| johnsom | That looks like only the feedback for the TC questions. I wonder if we got access to the Octavia question results. | 15:01 |
| cgoncalves | right. I could also not find them in the survey report at https://www.openstack.org/analytics | 15:03 |
| *** tkajinam has quit IRC | 15:04 | |
| johnsom | Yeah, not surprised really | 15:05 |
| openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:41 |
| openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:45 |
| *** dtruong has quit IRC | 15:47 | |
| *** dtruong has joined #openstack-lbaas | 15:48 | |
| openstackgerrit | Carlos Goncalves proposed openstack/octavia master: WIP: Fix amphora image build jobs https://review.opendev.org/714680 | 15:48 |
| *** gthiemonge has quit IRC | 16:07 | |
| *** gthiemonge has joined #openstack-lbaas | 16:07 | |
| *** sapd1_x has quit IRC | 16:08 | |
| *** gcheresh has quit IRC | 16:34 | |
| rm_work | would be cool if they'd email that stuff to the PTL or the liaison or something | 16:54 |
| rm_work | <_< | 16:54 |
| openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix amphora image build jobs https://review.opendev.org/714680 | 17:04 |
| *** rpittau is now known as rpittau|afk | 17:18 | |
| *** maciejjozefczyk has quit IRC | 17:21 | |
| *** maciejjozefczyk has joined #openstack-lbaas | 17:21 | |
| *** maciejjozefczyk has quit IRC | 17:26 | |
| *** ianychoi has quit IRC | 17:45 | |
| *** vesper11 has quit IRC | 17:45 | |
| *** openstackstatus has quit IRC | 17:45 | |
| *** ianychoi has joined #openstack-lbaas | 17:46 | |
| *** vesper11 has joined #openstack-lbaas | 17:47 | |
| *** openstackstatus has joined #openstack-lbaas | 17:47 | |
| *** ChanServ sets mode: +v openstackstatus | 17:47 | |
| *** irclogbot_1 has quit IRC | 17:47 | |
| *** irclogbot_1 has joined #openstack-lbaas | 17:48 | |
| *** ataraday_ has joined #openstack-lbaas | 18:14 | |
| ataraday_ | rm_work, are you around? | 18:15 |
| rm_work | yes | 18:15 |
| ataraday_ | I added some details https://review.opendev.org/#/c/647406/96/octavia/controller/worker/v2/taskflow_jobboard_driver.py@47 - if you have time we can discuss this | 18:20 |
| rm_work | kk, in a moment, finishing something up | 18:42 |
| *** laerlingSAP has quit IRC | 19:04 | |
| *** laerlingSAP has joined #openstack-lbaas | 19:06 | |
| *** ataraday_ has quit IRC | 19:09 | |
| *** gcheresh has joined #openstack-lbaas | 19:11 | |
| *** maciejjozefczyk has joined #openstack-lbaas | 19:32 | |
| *** rcernin|brb has quit IRC | 19:57 | |
| *** vishalmanchanda has quit IRC | 19:59 | |
| openstackgerrit | Adam Harwell proposed openstack/octavia master: Support HTTP and TCP checks in UDP healthmonitor https://review.opendev.org/589180 | 19:59 |
| *** maciejjozefczyk has quit IRC | 20:20 | |
| *** gcheresh has quit IRC | 20:39 | |
| *** TrevorV has quit IRC | 20:47 | |
| openstackgerrit | Michael Johnson proposed openstack/octavia-tempest-plugin master: Add skip_if_not_implemented to the service client https://review.opendev.org/714003 | 21:03 |
| *** cjloader has quit IRC | 21:18 | |
| sorrison | johnsom: Left some replies for https://review.opendev.org/#/c/714345/ I think we do need the id attribute for these resources | 22:14 |
| johnsom | sorrison Yeah the "don't use \ for line wrap" is an Octavia team thing. I know other teams use it (mostly legacy). So, having it in SDK is fine per SDK hacking rules, but we don't typically allow it in Octavia repos. rm_work may have more comment. grin | 22:17 |
| johnsom | sorrison lol, hmmm, yeah, I did add it in flavor. hmm, maybe we should leave those. Let me refresh my brain. | 22:19 |
| sorrison | Can't figure out how you say "this is part of the resource but not used when creating one" | 22:20 |
| *** gregwork has quit IRC | 22:21 | |
| rm_work | sorrison: oh how do you actually ENABLE that healthcheck btw | 22:21 |
| rm_work | ah nm i bet it's in the doc you committed :D | 22:21 |
| johnsom | Yeah, in the past we have not included ID in the properties list, I thought because that implied it was settable. Most of the proxy methods will already take an id/object | 22:23 |
| johnsom | If something is in the properties list, it ends up in the json body going over the wire. | 22:24 |
| johnsom | Yeah, flavor might be wrong in having the ID there | 22:26 |
| johnsom | It might also be needed for the query map though | 22:26 |
| *** rcernin has joined #openstack-lbaas | 22:45 | |
| lxkong | hi guys, I am wondering if it's possible that some amphorae will never be failed over, given that octavia always picks up the first unhealthy one from the db? | 22:52 |
| lxkong | https://www.irccloud.com/pastebin/vaRMVfZv/ | 22:52 |
| lxkong | or am I missing something elsewhere? | 22:53 |
| lxkong | stable/train | 22:53 |
| *** tkajinam has joined #openstack-lbaas | 22:53 | |
| johnsom | lxkong No, you will note the "busy" flag. This is set once an amphora is selected for failover, thus will not be in the results for the next health manager | 22:53 |
| lxkong | johnsom: but what if the lb is in pending_update? the amphora will be skipped | 22:54 |
| lxkong | and next time, it will still be picked? | 22:55 |
| lxkong | the following amphora will never get a chance? | 22:55 |
| johnsom | Yes, lb in pending_update means one of the controllers has ownership of the LB and all of its parts. No other controller will act on it. | 22:55 |
| lxkong | so what about the following unhealthy amphorae? | 22:56 |
| johnsom | If the LB is in pending_update due to a failover, the busy flag will be set | 22:56 |
| lxkong | i mean, e.g. I have two unhealthy amphorae (am1 and am2), the lb of am1 is in pending_update, am2 will never be failed over, right? | 22:57 |
| johnsom | Basically that busy flag is there to make sure it walks the list of amphora | 22:57 |
| lxkong | am1 is picked first and set busy=1, but the lb is in pending_update, so the session is rolled back, busy=0, and the loop breaks | 22:59 |
| lxkong | in the next loop, am1 is checked again | 22:59 |
| johnsom | lxkong am2 will not get failed over until the pending_update has been removed. We intentionally only failover one amphora of a load balancer at a time to make sure there is at least one serving traffic and should the initial failover blow up for some reason, there is still a chance of a functioning load balancer. | 22:59 |
| lxkong | hmm...some lbs in our cloud are stuck in pending_xxx, so that affects the octavia-healthmonitor service, right? | 23:00 |
| lxkong | especially when the amphorae for those lbs are unhealthy | 23:01 |
| johnsom | Yeah, pending_XXX means one of the controllers is already acting on it. | 23:01 |
| lxkong | yeah, that's gonna be problematic | 23:01 |
| johnsom | One of the controllers already has ownership of the object and no other should act on it | 23:01 |
| lxkong | the point is, no controllers are actually working on those lbs | 23:02 |
| johnsom | So, likely what happened is one of your controllers had a non-graceful shutdown where it didn't get a chance to release the "pending_*" lock. | 23:02 |
| lxkong | yes, they are in pending_xxx forever | 23:02 |
| johnsom | This is what we are working to fix now, sub-flow controller failures | 23:02 |
| johnsom | Yeah, so someone did a kill -9 or powered off a controller without shutting it down. | 23:03 |
| lxkong | or when the octavia service is up and running, maybe scan the pending lbs first and do something? | 23:03 |
| johnsom | Also check your systemd scripts to make sure they aren't configured to timeout and kill -9 | 23:03 |
| lxkong | we are using `start-stop-daemon` | 23:04 |
| johnsom | lxkong I think what you are reporting is similar to this: https://storyboard.openstack.org/#!/story/2007340 | 23:05 |
| johnsom | Where you have accumulated PENDING_ that are no longer owned by the controller that locked it. | 23:05 |
| lxkong | yeah, the same | 23:05 |
| lxkong | From http://man7.org/linux/man-pages/man8/start-stop-daemon.8.html, I can see `All matching processes will be sent the TERM signal (or the one specified via --signal or --retry) if --stop is specified.` | 23:08 |
| lxkong | probably that's the reason | 23:08 |
| johnsom | Also check that your start-stop-daemon is using TERM/15 and not 9 | 23:08 |
| johnsom | Just like systemd it will escalate to sending a KILL, so you need to make sure it gives those processes time to shut down before it escalates to a KILL. | 23:09 |
| johnsom | You will want a --retry config | 23:11 |
| *** gthiemonge has quit IRC | 23:13 | |
| *** gthiemonge has joined #openstack-lbaas | 23:13 | |
| lxkong | thanks johnsom, i will deal with the pending LBs first. | 23:16 |
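
The scenario lxkong describes between 22:52 and 22:59 can be illustrated with a toy model. This is only a sketch built from the description in the conversation, not Octavia's health manager code: the `Amphora` class, the `health_pass` function, and the field names are all hypothetical. It assumes each pass picks the first stale, non-busy amphora, sets the busy flag, and rolls that flag back if the owning load balancer is in a `PENDING_*` state.

```python
from dataclasses import dataclass


@dataclass
class Amphora:
    # Hypothetical stand-in for an amphora health record.
    name: str
    lb_status: str      # provisioning status of the owning load balancer
    stale: bool = True  # True when the amphora is considered unhealthy
    busy: bool = False  # set while a failover is being handled


def health_pass(amphorae):
    """Run one pass; return the name of the amphora failed over, or None."""
    # Pick the first stale amphora that is not already marked busy.
    candidate = next((a for a in amphorae if a.stale and not a.busy), None)
    if candidate is None:
        return None
    candidate.busy = True
    if candidate.lb_status.startswith("PENDING_"):
        # Another controller is assumed to own the LB: roll back and stop.
        candidate.busy = False
        return None
    # Pretend the failover succeeded and the amphora is healthy again.
    candidate.stale = False
    candidate.busy = False
    return candidate.name


amphorae = [Amphora("am1", "PENDING_UPDATE"), Amphora("am2", "ACTIVE")]
stuck = [health_pass(amphorae) for _ in range(3)]

# Once the stuck PENDING_UPDATE is cleared, both amphorae are processed.
amphorae[0].lb_status = "ACTIVE"
recovered = [health_pass(amphorae) for _ in range(2)]
print(stuck, recovered)
```

In this model am2 is never reached while am1's load balancer stays in `PENDING_UPDATE`, which matches the stuck state discussed in the linked story. Whether the real selection query behaves this way on stable/train would need to be checked against the Octavia source.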
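
johnsom's advice about `--retry` can be sketched as a small helper that assembles the stop command. The `--stop`, `--pidfile`, and `--retry` options and the `signal/timeout/signal/timeout` schedule form come from the start-stop-daemon man page; the helper name, the pidfile path, and the timeout values are assumptions for illustration and should be sized to how long the controller needs to finish its in-flight work.

```python
import shlex


def build_stop_command(pidfile, term_timeout=60, kill_timeout=5):
    """Build a start-stop-daemon argv that sends TERM first, then KILL.

    The timeouts (in seconds) are illustrative defaults, not recommended
    values for any particular Octavia deployment.
    """
    # Schedule: send TERM, wait term_timeout, then KILL, wait kill_timeout.
    schedule = f"TERM/{term_timeout}/KILL/{kill_timeout}"
    return [
        "start-stop-daemon",
        "--stop",
        "--pidfile", pidfile,
        "--retry", schedule,
    ]


# Hypothetical pidfile path; adjust to the local packaging layout.
cmd = build_stop_command("/run/octavia-worker.pid", term_timeout=120)
print(shlex.join(cmd))
```

The point of the longer TERM timeout is to give the process a chance to release its `PENDING_*` states before the KILL escalation johnsom mentions.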