openstackgerrit | German Eichberger proposed openstack/octavia master: [WIP] Switch amphora agent to use privsep https://review.openstack.org/549295 | 00:10 |
---|---|---|
*** longkb has joined #openstack-lbaas | 00:36 | |
*** hongbin has joined #openstack-lbaas | 00:46 | |
*** yamamoto has joined #openstack-lbaas | 00:49 | |
*** yamamoto has quit IRC | 00:54 | |
*** yamamoto has joined #openstack-lbaas | 01:50 | |
*** yamamoto has quit IRC | 01:55 | |
*** yamamoto has joined #openstack-lbaas | 02:03 | |
*** hongbin has quit IRC | 02:38 | |
*** hongbin has joined #openstack-lbaas | 02:38 | |
*** fnaval has joined #openstack-lbaas | 02:57 | |
*** hongbin has quit IRC | 03:08 | |
*** hongbin has joined #openstack-lbaas | 03:12 | |
*** ramishra has joined #openstack-lbaas | 04:05 | |
*** gans has joined #openstack-lbaas | 04:18 | |
*** hongbin has quit IRC | 04:19 | |
*** gans has quit IRC | 04:23 | |
*** links has joined #openstack-lbaas | 05:03 | |
*** nmanos has joined #openstack-lbaas | 05:03 | |
*** strigazi_ has joined #openstack-lbaas | 05:29 | |
*** strigazi has quit IRC | 05:32 | |
*** strigazi has joined #openstack-lbaas | 05:34 | |
*** strigazi_ has quit IRC | 05:36 | |
*** yboaron has joined #openstack-lbaas | 06:05 | |
*** nmanos has quit IRC | 06:12 | |
*** nmanos has joined #openstack-lbaas | 06:18 | |
*** nmanos has left #openstack-lbaas | 06:18 | |
openstackgerrit | ZhaoBo proposed openstack/octavia master: UDP jinja template https://review.openstack.org/525420 | 06:30 |
openstackgerrit | ZhaoBo proposed openstack/octavia master: UDP for [2] https://review.openstack.org/529651 | 06:30 |
openstackgerrit | ZhaoBo proposed openstack/octavia master: UDP for [3][5][6] https://review.openstack.org/539391 | 06:30 |
*** ispp has joined #openstack-lbaas | 06:32 | |
*** velizarx has joined #openstack-lbaas | 06:56 | |
*** velizarx has quit IRC | 07:13 | |
*** rcernin has quit IRC | 07:20 | |
*** ispp has quit IRC | 07:21 | |
*** velizarx has joined #openstack-lbaas | 07:22 | |
*** peereb has joined #openstack-lbaas | 07:25 | |
*** ispp has joined #openstack-lbaas | 07:30 | |
*** kobis has joined #openstack-lbaas | 07:39 | |
*** kobis has quit IRC | 07:44 | |
*** yamamoto has quit IRC | 07:48 | |
*** ispp has quit IRC | 07:50 | |
*** kobis has joined #openstack-lbaas | 07:51 | |
*** ispp has joined #openstack-lbaas | 07:51 | |
*** rraja has joined #openstack-lbaas | 08:06 | |
*** ispp has quit IRC | 08:08 | |
*** links has quit IRC | 08:24 | |
*** links has joined #openstack-lbaas | 08:26 | |
*** ktibi has joined #openstack-lbaas | 08:31 | |
*** ispp has joined #openstack-lbaas | 08:31 | |
openstackgerrit | Tuan Do Anh proposed openstack/octavia master: Update pypi url to new url https://review.openstack.org/582094 | 08:33 |
*** sapd has quit IRC | 08:35 | |
*** sapd has joined #openstack-lbaas | 08:35 | |
*** tesseract has joined #openstack-lbaas | 08:37 | |
ktibi | Hi octavia, I saw in a video from the last summit that neutron-lbaas will be deprecated soon. Does Octavia support plugins, like the one for F5? | 08:42 |
*** yamamoto has joined #openstack-lbaas | 08:44 | |
*** ispp has quit IRC | 08:48 | |
*** yamamoto has quit IRC | 08:50 | |
*** ispp has joined #openstack-lbaas | 08:53 | |
*** yboaron has quit IRC | 09:03 | |
*** links has quit IRC | 09:10 | |
*** links has joined #openstack-lbaas | 09:12 | |
*** kobis has quit IRC | 09:38 | |
*** yamamoto has joined #openstack-lbaas | 09:46 | |
*** yamamoto has quit IRC | 09:51 | |
*** yamamoto has joined #openstack-lbaas | 10:03 | |
*** kobis has joined #openstack-lbaas | 10:06 | |
*** kobis has quit IRC | 10:41 | |
*** kobis has joined #openstack-lbaas | 10:41 | |
*** yboaron has joined #openstack-lbaas | 10:48 | |
*** velizarx has quit IRC | 10:53 | |
*** velizarx has joined #openstack-lbaas | 11:09 | |
*** atoth has joined #openstack-lbaas | 11:17 | |
*** longkb has quit IRC | 11:34 | |
*** phuoc has quit IRC | 11:53 | |
*** phuoc has joined #openstack-lbaas | 11:53 | |
*** amuller has joined #openstack-lbaas | 12:02 | |
*** hvhaugwitz has quit IRC | 12:10 | |
*** hvhaugwitz has joined #openstack-lbaas | 12:10 | |
*** ispp has quit IRC | 12:16 | |
*** ispp has joined #openstack-lbaas | 12:22 | |
*** atoth has quit IRC | 12:22 | |
*** kobis has quit IRC | 12:29 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Translate expected package names to installed ones https://review.openstack.org/582170 | 12:36 |
cgoncalves | johnsom, ^ should resolve the issue I found while reviewing https://review.openstack.org/#/c/577344/. if you agree, rebase yours on top of that | 12:41 |
*** atoth has joined #openstack-lbaas | 12:52 | |
*** yamamoto has quit IRC | 13:01 | |
*** velizarx has quit IRC | 13:12 | |
*** KeithMnemonic has joined #openstack-lbaas | 13:14 | |
*** velizarx has joined #openstack-lbaas | 13:16 | |
*** fnaval has quit IRC | 13:25 | |
*** yamamoto has joined #openstack-lbaas | 13:28 | |
*** yamamoto has quit IRC | 13:29 | |
*** yamamoto has joined #openstack-lbaas | 13:34 | |
*** fnaval has joined #openstack-lbaas | 13:35 | |
*** fnaval has quit IRC | 13:39 | |
*** yamamoto_ has joined #openstack-lbaas | 13:41 | |
*** yamamoto has quit IRC | 13:41 | |
*** fnaval has joined #openstack-lbaas | 13:45 | |
*** kobis has joined #openstack-lbaas | 13:47 | |
*** links has quit IRC | 13:50 | |
johnsom | ktibi Yes, Octavia supports provider drivers. Contact F5 for information on when their driver will be ready. | 14:02 |
ktibi | johnsom, thx ;) | 14:02 |
*** kobis has quit IRC | 14:05 | |
*** velizarx has quit IRC | 14:14 | |
jiteka | Hello I've faced an issue today in my lab, trying to rotate amphora image | 14:20 |
jiteka | I added the new image in glance and initiated a failover, but the new amphora VM never became healthy due to: | 14:20 |
jiteka | Failover exception: Waiting for compute to go active timeout.: ComputeWaitTimeoutException: Waiting for compute to go active timeout. | 14:20 |
jiteka | Looking at the nova logs it appears that the amp build failed due to: | 14:20 |
jiteka | NeutronAdminCredentialConfigurationInvalid: Networking client is experiencing an unauthorized exception. | 14:20 |
jiteka | I ended up with my loadbalancer in ERROR running on only 1 amp MASTER instead of 2 in ACTIVE/STANDBY | 14:20 |
jiteka | I forced a failover after updating the "provisioning_status" to ACTIVE and I ended up again with only 1 amp (and as expected experienced some downtime on my lb) | 14:20 |
jiteka | Then I tried another approach by deleting the last amp to see what the health-manager would do and I ended up again with only 1 amp | 14:20 |
jiteka | What could I do in that case to come back to 1 Master and 1 Backup when I only have 1 Master ? | 14:20 |
jiteka | https://pastebin.com/2tLLrKaz | 14:21 |
*** kobis has joined #openstack-lbaas | 14:21 | |
jiteka | I'm running devstack stable/queens on Ubuntu 16.04.4 LTS | 14:21 |
johnsom | jiteka Did you resolve the nova issue that caused the failure? | 14:23 |
johnsom | jiteka We have a few patches up for review that address a few of those cases where a failover fails itself due to nova/neutron outages. Let me find a few links. | 14:24 |
johnsom | This is probably what you are hitting: https://review.openstack.org/#/c/577344/ We haven't merged this on master yet, so not backported to queens yet. | 14:25 |
johnsom | And this one https://review.openstack.org/548989 | 14:27 |
johnsom | It doesn't look like that one got backported yet either, though it is merged | 14:27 |
jiteka | johnsom: I was able to create a new VM after that failure | 14:31 |
jiteka | johnsom: I have the feeling that it's only on the first build using a new image | 14:31 |
*** kobis has quit IRC | 14:34 | |
jiteka | johnsom: will the health manager ensure that if an lb runs on only 1 amp with role MASTER, when configured for ACTIVE_STANDBY, it triggers creation of a new BACKUP ? | 14:35 |
jiteka | johnsom: to recover from such situation | 14:35 |
*** mugsie has quit IRC | 14:36 | |
*** mugsie has joined #openstack-lbaas | 14:36 | |
*** mugsie has quit IRC | 14:36 | |
*** mugsie has joined #openstack-lbaas | 14:36 | |
jiteka | johnsom: worst case scenario, if that patch https://review.openstack.org/#/c/577344/ allows the health-manager to get back to a stable state when all amps are unreachable/deleted, it could work too even if it causes downtime | 14:41 |
cgoncalves | johnsom, how is one supposed to test amphora agent code changes if DIB pulls code from git.o.o no matter what? | 14:48 |
johnsom | jiteka Yes, under normal situations, the HM will rebuild either amphora should it be in failure. It will stop if the failover fails and mark it in error though. | 14:48 |
cgoncalves | and FWIW always from master | 14:49 |
johnsom | jiteka If you are running Active/Backup there should be less than a second of downtime with the right tuning | 14:49 |
jiteka | johnsom: yes downtime wasn't an issue with 2 amp, 1 Master and 1 backup | 14:50 |
jiteka | johnsom: but here I have only 1 amp Master and I don't know how to recover from that to get a backup as the health manager is not re-creating it | 14:50 |
johnsom | If you have the right patches (see above) you can mark the failed backup as "ACTIVE" and it will try again | 14:51 |
johnsom | cgoncalves There are a few answers to that. | 14:51 |
jiteka | johnsom: when I delete the MASTER, the health manager takes care of re-creating it, standalone first, then MASTER, but BACKUP creation is never re-triggered | 14:51 |
jiteka | johnsom: that's the thing, I don't have any backup unfortunately, just 1 amp associated to that LB | 14:52 |
johnsom | Right, if the BACKUP amp failed during a failover, because of nova/neutron failure, it will be marked as ERROR and tagged to not attempt to failover it again until an operator has fixed nova/neutron and tagged it back. | 14:52 |
johnsom | We don't want to make the nova/neutron failure worse by hitting it 10000s of times trying to restore an amp | 14:53 |
johnsom | cgoncalves So, there are environment variables for DIB that override where it gets the amp agent. | 14:54 |
johnsom | cgoncalves Default is pull from git master. | 14:55 |
johnsom | cgoncalves In devstack we override those to a local location: https://github.com/openstack/octavia/blob/master/devstack/plugin.sh#L58 | 14:55 |
johnsom | through line 65 | 14:55 |
jiteka | johnsom: thanks for the explanation :) | 14:58 |
jiteka | johnsom: I guess my mistake here was to delete the amp while the lb was in an inconsistent state and hack into the db to force it to active (it was in error), as I didn't see any other way to recover and get back to ACTIVE to perform administrative tasks on it | 15:00 |
cgoncalves | johnsom, ah, I see! I was not getting my changes in when building the image outside devstack. so DIB_REPOLOCATION_amphora_agent=/local/path/to/octavia ./diskimage-create.sh [...] | 15:00 |
cgoncalves | thanks! | 15:00 |
johnsom | jiteka I think that should have worked, but the best option would have been to just mark it active again. However, you probably need those two patches | 15:01 |
johnsom | cgoncalves FYI, https://docs.openstack.org/diskimage-builder/latest/elements/source-repositories/README.html#override-per-source | 15:02 |
cgoncalves | appreciated! | 15:04 |
*** yboaron has quit IRC | 15:04 | |
*** ispp has quit IRC | 15:12 | |
*** ispp has joined #openstack-lbaas | 15:13 | |
*** kobis has joined #openstack-lbaas | 15:19 | |
openstackgerrit | Murali Annamneni proposed openstack/neutron-lbaas master: Hardcode foreignkey constraint name for lbaas_listeners https://review.openstack.org/557797 | 15:23 |
*** yamamoto_ has quit IRC | 15:24 | |
*** rraja has quit IRC | 15:30 | |
*** ispp has quit IRC | 15:32 | |
*** kobis has quit IRC | 15:39 | |
*** kobis has joined #openstack-lbaas | 15:44 | |
*** peereb has quit IRC | 15:45 | |
*** ktibi has quit IRC | 15:58 | |
*** kobis has quit IRC | 15:59 | |
*** yamamoto has joined #openstack-lbaas | 16:12 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix package version querying on non-dpkg distros https://review.openstack.org/582293 | 16:23 |
*** ramishra has quit IRC | 16:24 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Translate expected package names to installed ones https://review.openstack.org/582170 | 16:33 |
*** yamamoto has quit IRC | 16:42 | |
*** kobis has joined #openstack-lbaas | 16:42 | |
*** kobis has quit IRC | 16:47 | |
cgoncalves | with my latest 2 patches on top of fix failover patch, I can now create CentOS-based HA loadbalancers | 17:05 |
cgoncalves | johnsom, trying to failover the LB by deleting both amps, I've got only 1 amp VM up | 17:06 |
johnsom | Yeah, one amp in error the other not. That is the bug | 17:06 |
cgoncalves | seems that nova did not receive any request to create the second one, even though it shows up on the amp list | 17:06 |
cgoncalves | the worker keeps trying to connect to the agent but there's no VM in the first place | 17:07 |
johnsom | hmm, ok, that doesn't quite make sense. Check your HM log, it should show the failover aborted when it couldn't rebuild the first amp properly | 17:07 |
cgoncalves | it rebuilt the first amp properly. the second is missing | 17:08 |
johnsom | Did it really finish the flow? Check the log, it should have failed and reverted the failover flow | 17:09 |
cgoncalves | I'm checking. it hasn't reverted to anything (yet) looking at the logs | 17:10 |
cgoncalves | http://paste.openstack.org/show/725742/ | 17:14 |
cgoncalves | I did not run the lb failover explicitly. just deleted nova instances and observed | 17:14 |
johnsom | correct | 17:15 |
*** tesseract has quit IRC | 17:16 | |
*** ramishra has joined #openstack-lbaas | 17:23 | |
*** ramishra has quit IRC | 17:37 | |
*** yamamoto has joined #openstack-lbaas | 17:43 | |
*** yamamoto has quit IRC | 18:00 | |
*** kobis has joined #openstack-lbaas | 18:18 | |
*** rraja has joined #openstack-lbaas | 18:19 | |
*** harlowja has joined #openstack-lbaas | 18:24 | |
cgoncalves | johnsom, reached connection time out. as a result, I now have only one amp listed and LB reports operating_status ONLINE | 19:08 |
cgoncalves | http://paste.openstack.org/show/725754/ | 19:09 |
*** kobis has quit IRC | 19:17 | |
*** Deknos has joined #openstack-lbaas | 19:21 | |
*** kobis has joined #openstack-lbaas | 19:29 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix package version querying on non-dpkg distros https://review.openstack.org/582293 | 19:31 |
johnsom | cgoncalves I am not sure what is going on there. I would have to see the full HM log to figure it out. | 19:33 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Fix package version querying on non-dpkg distros https://review.openstack.org/582293 | 19:41 |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Translate expected package names to installed ones https://review.openstack.org/582170 | 19:41 |
*** Deknos has left #openstack-lbaas | 19:42 | |
*** kobis has quit IRC | 19:47 | |
cgoncalves | johnsom, http://cgoncalves.pt/trash/openstack/journalctl-o-hm-2018-07-12.log | 19:56 |
*** amuller has quit IRC | 19:59 | |
johnsom | cgoncalves Is this with the patch or without? Are you running multiple hm? | 20:03 |
cgoncalves | johnsom, with your patch + my 2 patches of today. 1 HM | 20:05 |
johnsom | cgoncalves and that was the start of the hm log, no other failures before that right? | 20:06 |
johnsom | The interesting thing for me is that it did get past the point my code directly addresses... it is failing much later in the flow. | 20:06 |
cgoncalves | with so much testing I can no longer tell when the last try started. let me re-try, sorry about that | 20:09 |
johnsom | cgoncalves hold up | 20:09 |
cgoncalves | I can get older log msgs. I truncated till 3 hours ago | 20:10 |
johnsom | cgoncalves when you try again, when it gets into that second retry loop for the connection, can you do a "openstack server list" and "openstack amphora list"? It almost looks like in the first part it didn't mark the second failed amp as ERROR for some reason | 20:10 |
cgoncalves | ok | 20:11 |
johnsom | That second connection looks like it's trying to connect to the second failed amp, which it should not be, the second failed amp should be in "ERROR" at that point and bypassed in the VRRP code | 20:11 |
cgoncalves | johnsom, http://paste.openstack.org/show/725764/ and http://cgoncalves.pt/trash/openstack/journalctl-o-hm-2018-07-12.log | 20:51 |
*** rraja has quit IRC | 20:56 | |
*** rraja has joined #openstack-lbaas | 20:56 | |
johnsom | cgoncalves is that the right log link? it's the same as the last one | 21:16 |
cgoncalves | johnsom, it is | 21:16 |
johnsom | ok | 21:17 |
johnsom | cgoncalves How are you failing these? deleting both with nova? | 21:25 |
cgoncalves | johnsom, deleting both with nova, yes. see http://paste.openstack.org/show/725764/ | 21:26 |
johnsom | Something is really wrong, I should see https://review.openstack.org/#/c/577344/2/octavia/controller/worker/tasks/amphora_driver_tasks.py in the log, but it's not there | 21:26 |
johnsom | line 62 | 21:26 |
cgoncalves | one thing I can confirm: I'm running with your patch | 21:29 |
cgoncalves | [centos@rdocloud-devstack2 octavia]$ pwd | 21:29 |
cgoncalves | [centos@rdocloud-devstack2 octavia]$ grep "Failed to update listeners on amphora" octavia/controller/worker/tasks/amphora_driver_tasks.py | 21:29 |
cgoncalves | LOG.error('Failed to update listeners on amphora %s. Skipping ' | 21:29 |
johnsom | Yeah, I just don't get why that would not fire.... | 21:29 |
cgoncalves | note that I only created the LB. no listeners nor pools where created | 21:31 |
johnsom | Oh, hmmmm, that might be the key, there might be a bug in this if there are no listeners | 21:32 |
cgoncalves | *were | 21:33 |
johnsom | Darn, I am so buried with internal work right now, not sure when I can get back to that | 21:33 |
cgoncalves | OSP12? xD | 21:33 |
cgoncalves | no worries. I just wanted to help testing | 21:34 |
johnsom | Yeah, if I knew a week ago I would have been able to get it | 21:34 |
johnsom | sigh | 21:34 |
cgoncalves | good that I did. found other issues when on centos-based amps | 21:34 |
*** rcernin has joined #openstack-lbaas | 21:58 | |
*** yboaron has joined #openstack-lbaas | 22:05 | |
cgoncalves | johnsom, creating listener before forcing failover did the trick | 22:11 |
johnsom | Yeah, it's a bug for scenarios with no listener. not sure if the vrrp subflow should not be running or some other issue.... | 22:12 |
cgoncalves | although the second amp only started being created after first amp had been recovered | 22:12 |
cgoncalves | I'd have expected, I guess, both to be rebuilt simultaneously | 22:12 |
johnsom | That is correct behavior | 22:13 |
johnsom | Well, there are some sequencing issues there. notably to configure the other peer we need its ip info. | 22:13 |
johnsom | It could be done in the future with some fancy sequencing, etc. but... fix the bug before optimizing the whole flow | 22:14 |
*** fnaval has quit IRC | 22:14 | |
*** yboaron has quit IRC | 22:14 | |
cgoncalves | ok, fair enough :) | 22:14 |
*** rraja has quit IRC | 22:17 | |
*** rraja has joined #openstack-lbaas | 22:21 | |
*** fnaval has joined #openstack-lbaas | 22:30 | |
*** rraja has quit IRC | 22:33 | |
*** fnaval has quit IRC | 22:39 | |
openstackgerrit | Carlos Goncalves proposed openstack/octavia master: Translate expected package names to installed ones https://review.openstack.org/582170 | 22:41 |
*** KeithMnemonic has quit IRC | 22:48 | |
*** fnaval has joined #openstack-lbaas | 22:52 | |
*** fnaval has quit IRC | 22:56 | |
*** fnaval has joined #openstack-lbaas | 23:49 | |
openstackgerrit | German Eichberger proposed openstack/octavia master: [WIP] Switch amphora agent to use privsep https://review.openstack.org/549295 | 23:58 |
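
Note on the DIB override discussed between 14:54 and 15:02: the sketch below shows one way to assemble the environment overrides before running `diskimage-create.sh` outside devstack. The helper name `amp_agent_overrides` and the example path are hypothetical. `DIB_REPOLOCATION_amphora_agent` is the variable quoted in the log; `DIB_REPOREF_amphora_agent` is assumed here from the source-repositories documentation linked at 15:02, so check that page before relying on it.

```shell
# Hypothetical helper: print the diskimage-builder overrides that point the
# "amphora_agent" source repository at a local Octavia checkout.
#   $1 = path to the local checkout (required)
#   $2 = git ref to build from (optional; assumption, see note above)
amp_agent_overrides() {
    repo_dir="$1"
    repo_ref="${2:-}"
    # Variable named in the log at 15:00.
    printf 'DIB_REPOLOCATION_amphora_agent=%s\n' "$repo_dir"
    if [ -n "$repo_ref" ]; then
        # Only emitted when a ref is given.
        printf 'DIB_REPOREF_amphora_agent=%s\n' "$repo_ref"
    fi
}

# Possible usage (not executed here; assumes the path has no spaces):
#   env $(amp_agent_overrides /local/path/to/octavia) ./diskimage-create.sh
```

Printing the assignments instead of exporting them keeps the calling shell's environment untouched and makes it easy to inspect what will be passed to the image build.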