*** openstack has joined #openstack-lbaas | 13:14 | |
*** ChanServ sets mode: +o openstack | 13:14 | |
*** rcernin has quit IRC | 13:31 | |
*** boden has joined #openstack-lbaas | 13:34 | |
*** pcaruana has quit IRC | 13:46 | |
*** pcaruana has joined #openstack-lbaas | 13:47 | |
*** gcheresh has quit IRC | 13:48 | |
*** gcheresh has joined #openstack-lbaas | 13:56 | |
*** spatel has joined #openstack-lbaas | 14:01 | |
*** ricolin_ has joined #openstack-lbaas | 14:20 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix listener stats and amp health DB deadlock https://review.opendev.org/667940 | 14:29 |
---|---|---|
johnsom | cgoncalves ^^^^ Hmmm, not sure I agree with that. Those retry wrappers should be used with caution. | 14:34 |
cgoncalves | johnsom, totally open for comments. I don't fully grasp DB locks, etc | 14:37 |
cgoncalves | dulek should be testing it now | 14:38 |
cgoncalves | he encountered that today twice on kuryr CI jobs | 14:38 |
dulek | cgoncalves: It's a transient issue, so it may happen, or not. | 14:38 |
johnsom | Yeah, however I don't think it's really an issue. | 14:38 |
johnsom | There is some built-in "throw this away because it's old" logic | 14:39 |
johnsom | I'm not sure why this is escalating though, we had some exception handling wrappers around this stuff. | 14:39 |
squarebracket | in the health manager, i'm seeing "UnboundLocalError: local variable 'lock_session' referenced before assignment". that's definitely a bug and not a config issue, right? this is from the stein release of octavia | 14:41 |
cgoncalves | all exceptions are caught, db rolled back, exception reraised | 14:41 |
johnsom | squarebracket That is not good. Can you paste the traceback? | 14:42 |
johnsom | cgoncalves, yeah, ok, so we are handling that internally. This is likely a log level bug | 14:46 |
*** fnaval has joined #openstack-lbaas | 14:46 | |
squarebracket | ah wait, it seems i missed an error: "Original exception being dropped .... " which says there's an auth exception. i guess the auth error is triggering the UnboundLocalError. | 14:53 |
cgoncalves | johnsom, instead of changing log level, maybe explicitly catch DBDeadlock exception and log a warn that update was dropped but error to all other exceptions? | 14:56 |
cgoncalves | https://github.com/openstack/octavia/blob/c4faac25de85ca3d8b4f12964589a36b2bcd3b57/octavia/controller/healthmanager/health_drivers/update_db.py#L210 | 14:57 |
johnsom | cgoncalves lol, that is effectively lowering the log level for the DBDeadlock exception. grin | 14:57 |
cgoncalves | hmm. that is what you were suggesting, no? | 14:58 |
cgoncalves | lower for DBDeadlock but keep for all others | 14:58 |
johnsom | Right | 14:58 |
cgoncalves | ok. I understood you were saying lowering for all. alright, will update patch later | 14:59 |
johnsom | I would do it here however: https://github.com/openstack/octavia/blob/master/octavia/controller/healthmanager/health_drivers/update_db.py#L66 | 14:59 |
johnsom | And here: https://github.com/openstack/octavia/blob/master/octavia/controller/healthmanager/health_drivers/update_db.py#L402 | 15:00 |
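The approach agreed on above (log a dropped update at warning level when the failure is a DBDeadlock, keep error-level logging and the re-raise for everything else) might look roughly like the sketch below. It is not the merged Octavia patch: `apply_update`, `update_fn`, and the local `DBDeadlock` class are hypothetical stand-ins so the example runs on its own. Real code would import `DBDeadlock` from `oslo_db.exception`.

```python
import logging

LOG = logging.getLogger("health_update_sketch")


class DBDeadlock(Exception):
    """Stand-in for oslo_db.exception.DBDeadlock (assumption: keeps the sketch standalone)."""


def apply_update(update_fn, health):
    """Run update_fn(health). Return True if applied, False if dropped."""
    try:
        update_fn(health)
        return True
    except DBDeadlock:
        # Transient failure: a later heartbeat carries fresher data,
        # so note the dropped update at warning level and move on.
        LOG.warning("Health update dropped after a DB deadlock: %s",
                    health.get("id"))
        return False
    except Exception:
        # Anything else is unexpected: log with traceback and re-raise.
        LOG.exception("Health update failed: %s", health.get("id"))
        raise
```

Catching the deadlock separately keeps the noisy error-level traceback for failures that actually need attention.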
*** ivve has quit IRC | 15:00 | |
squarebracket | here's the traceback fwiw: https://pastebin.com/DFpv9bux | 15:00 |
squarebracket | i imagine that probably lock_session should be init'd to None and then checked to see if it exists before doing the rollback() | 15:01 |
squarebracket | obviously if the db connection can't be init'd, you can't roll it back | 15:01 |
squarebracket | init'ing right before the `try` here: https://github.com/openstack/octavia/blob/master/octavia/controller/healthmanager/health_manager.py#L88 | 15:02 |
squarebracket | at least in that case the auth error should be the one bubbling up | 15:05 |
johnsom | Yes, looks like it. It's a harmless issue as it will automatically retry on the next cycle, but could be cleaner if we can't get a DB session. | 15:05 |
johnsom | The reraise did bubble up the sql exception, so at least that was a win. | 15:07 |
johnsom | OperationalError: (pymysql.err.OperationalError) (1045, u"Access denied for user \'octavia\'@\'lovelace.qa.vantrix.com\ | 15:07 |
squarebracket | mmm, yeah, in the json | 15:07 |
johnsom | Are you up for opening a story and/or fixing this? | 15:08 |
squarebracket | i'm used to seeing the tracebacks like from futurist. | 15:08 |
squarebracket | i could submit a patch if you'd like | 15:08 |
johnsom | Cool, thank you | 15:08 |
squarebracket | would this be something for which i should create/amend a test? i haven't looked into how you're doing testing. | 15:09 |
johnsom | Yes, there should be a test update for this. Likely a new test here: https://github.com/openstack/octavia/blob/master/octavia/tests/unit/controller/healthmanager/test_health_manager.py | 15:13 |
squarebracket | sorry yes, i just found that, should have rtfm before asking :) | 15:14 |
*** pcaruana has quit IRC | 15:14 | |
johnsom | It in fact is a bit light on the test coverage there: http://logs.openstack.org/61/665861/7/check/openstack-tox-cover/0dc0b6d/cover/octavia_controller_healthmanager_health_manager_py.html | 15:15 |
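The fix squarebracket proposes at 15:01 (bind `lock_session` to None before the `try`, and only roll back when a session was actually obtained) can be sketched as follows. This is an illustration under assumptions, not the code from the eventual patch: `check_stale`, `get_session`, `find_stale`, and `FakeSession` are hypothetical names standing in for the health manager's real session handling.

```python
class FakeSession:
    """Hypothetical stand-in for a DB session; records whether rollback ran."""

    def __init__(self):
        self.rolled_back = False

    def rollback(self):
        self.rolled_back = True


def check_stale(get_session, find_stale):
    """Open a session, run the query, and roll back only if a session exists."""
    lock_session = None  # bound before the try so the except block can test it
    try:
        lock_session = get_session()
        return find_stale(lock_session)
    except Exception:
        # If get_session() itself failed (for example an auth error),
        # there is nothing to roll back; skip it so the original
        # exception is the one that bubbles up.
        if lock_session is not None:
            lock_session.rollback()
        raise
```

A unit test along the lines johnsom suggests could make `get_session` raise and assert that the caller sees that exception rather than an UnboundLocalError.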
*** Vorrtex has joined #openstack-lbaas | 15:17 | |
*** luksky has quit IRC | 15:34 | |
*** gcheresh has quit IRC | 15:56 | |
*** yboaron_ has quit IRC | 15:59 | |
*** luksky has joined #openstack-lbaas | 16:30 | |
*** ramishra has quit IRC | 17:03 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia-tempest-plugin master: DNM: Create two listeners in scenario test https://review.opendev.org/667908 | 17:07 |
*** ricolin_ has quit IRC | 17:12 | |
*** ricolin has joined #openstack-lbaas | 17:12 | |
*** pcaruana has joined #openstack-lbaas | 17:54 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: DNM CentOS7 gate test https://review.opendev.org/665464 | 18:06 |
*** ricolin has quit IRC | 18:09 | |
*** boden has quit IRC | 18:46 | |
*** boden has joined #openstack-lbaas | 18:47 | |
spatel | johnsom: question, does the amphora flavor have any specific properties like large hugepage, cpu pinning etc..? | 19:05 |
spatel | that is the problem, i have all my compute nodes configured for hugepage and now amphora failed to launch instance because it doesn't have hugepage setting in flavor | 19:08 |
johnsom | spatel The nova flavor used by Octavia can have those settings. | 19:32 |
johnsom | https://docs.openstack.org/octavia/latest/configuration/configref.html#controller_worker.amp_flavor_id | 19:32 |
openstackgerrit | Chuck Wilson proposed openstack/octavia master: only rollback DB when we have a connection to the DB https://review.opendev.org/668032 | 19:34 |
squarebracket | ^ checked to make sure the test failed without changes, and it did | 19:35 |
johnsom | Cool, thanks! | 19:35 |
johnsom | spatel You can also override that default setting with an Octavia flavor that defines the compute_flavor. | 19:36 |
*** spatel has quit IRC | 19:37 | |
johnsom | squarebracket Do you mind if I make a few updates to your patch or would you like to do the updates based on comments? | 19:37 |
openstackgerrit | Chuck Wilson proposed openstack/octavia master: only rollback DB when we have a connection to the DB https://review.opendev.org/668032 | 19:48 |
squarebracket | johnsom: whatever you prefer | 19:50 |
squarebracket | and just to be clear -- are you asking me to open an equivalent bug? | 19:50 |
johnsom | Yeah, part of the criteria we use for backporting a patch is if there is a filed bug (stories in our case). | 19:51 |
johnsom | https://storyboard.openstack.org/#!/dashboard/stories | 19:51 |
squarebracket | ok, i will create one | 19:54 |
squarebracket | created https://storyboard.openstack.org/#!/story/2006062 | 20:03 |
openstackgerrit | Michael Johnson proposed openstack/octavia master: only rollback DB when we have a connection to the DB https://review.opendev.org/668032 | 20:06 |
johnsom | Cool, I linked the story to the patch (the two lines in the commit message) and added the release note. | 20:07 |
johnsom | That should be good-to-go | 20:07 |
*** gcheresh has joined #openstack-lbaas | 20:08 | |
*** ivve has joined #openstack-lbaas | 20:12 | |
squarebracket | much obliged, i was trying to figure out how to link the two | 20:12 |
*** spatel has joined #openstack-lbaas | 20:13 | |
spatel | johnsom: thanks i am doing it now | 20:13 |
*** gcheresh has quit IRC | 20:29 | |
*** spatel has quit IRC | 20:37 | |
*** Vorrtex has quit IRC | 20:56 | |
squarebracket | i'm getting a MissingAuthPlugin from the octavia api, but i've configured the keystone_auth section of the conf. is there something i might be missing? i've set most of the values listed in the install guide | 20:57 |
johnsom | MissingAuthPlugin, hmm, definitely a keystone thing. Let me look up a few things | 21:02 |
johnsom | keystone_authtoken and service_auth come to mind immediately. I would check the "auth_type" setting in the service_auth section, or both really. | 21:04 |
johnsom | You can also compare your octavia.conf to one of the test jobs: http://logs.openstack.org/61/665861/7/check/octavia-v2-dsvm-scenario/63357b7/controller/logs/etc/octavia/octavia_conf.txt.gz | 21:04 |
johnsom | If that doesn't help, can you https://paste.openstack.org the error output? | 21:05 |
*** pcaruana has quit IRC | 21:05 | |
*** fnaval has quit IRC | 21:10 | |
*** fnaval has joined #openstack-lbaas | 21:14 | |
*** fnaval has quit IRC | 21:14 | |
*** tesseract has quit IRC | 21:19 | |
squarebracket | ah! i was missing the service_auth section | 21:20 |
squarebracket | thanks | 21:20 |
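A quick way to catch the problem above before starting the API is to check that octavia.conf has both auth sections johnsom names. The sketch below uses only the standard-library `configparser`. The `REQUIRED` mapping is an assumption based on his suggestion to check `auth_type` in `service_auth` and `keystone_authtoken`; it is not a complete list of what Octavia needs, and `missing_auth_settings` is a hypothetical helper.

```python
import configparser

# Assumption: only the sections and option mentioned in the discussion.
REQUIRED = {
    "keystone_authtoken": ["auth_type"],
    "service_auth": ["auth_type"],
}


def missing_auth_settings(conf_text):
    """Return a list of human-readable problems found in the config text."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    problems = []
    for section, options in REQUIRED.items():
        if not parser.has_section(section):
            problems.append(f"[{section}] section is missing")
            continue
        for option in options:
            if not parser.get(section, option, fallback="").strip():
                problems.append(f"[{section}] {option} is not set")
    return problems


sample = """
[keystone_authtoken]
auth_type = password
"""
print(missing_auth_settings(sample))
```

With the sample text, which mirrors the situation in the log, the helper reports that the `service_auth` section is missing.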
*** rcernin has joined #openstack-lbaas | 21:24 | |
*** boden has quit IRC | 21:55 | |
johnsom | Ok, first light: | 22:43 |
johnsom | https://www.irccloud.com/pastebin/1vYEOJl2/ | 22:43 |
*** luksky has quit IRC | 23:00 | |
openstackgerrit | Michael Johnson proposed openstack/octavia master: Make amphora use a single HAProxy instance https://review.opendev.org/668068 | 23:03 |
xgerman | ^^ undoing the damage Steven has done? | 23:03 |
johnsom | Yes! | 23:04 |
johnsom | It's causing critical failures now so must go! | 23:04 |
johnsom | Do you remember how hard I argued against the proc-per-listener? sigh | 23:05 |
*** spatel has joined #openstack-lbaas | 23:13 | |
*** spatel has quit IRC | 23:14 | |
xgerman | yep, I remember :-) But his argument was that if you have more than one listener and change stuff on one the others all stop transporting traffic | 23:16 |
johnsom | zero hit reloads.... | 23:20 |
johnsom | Wasn't a true issue then, really isn't now. | 23:23 |
xgerman | Yeah, always sounded fishy… | 23:43 |