opendevreview | OpenStack Proposal Bot proposed openstack/octavia-dashboard master: Imported Translations from Zanata https://review.opendev.org/c/openstack/octavia-dashboard/+/882660 | 04:34 |
---|---|---|
opendevreview | Omer Schwartz proposed openstack/octavia master: Provide Amphora stats for Octavia no-op drivers https://review.opendev.org/c/openstack/octavia/+/890814 | 12:54 |
opendevreview | Omer Schwartz proposed openstack/octavia master: Provide Amphora stats for Octavia no-op drivers https://review.opendev.org/c/openstack/octavia/+/890814 | 12:59 |
QG | Hello, since we upgraded Octavia to Zed we have had a strange issue with some resources staying in PENDING_CREATE. Have you ever seen this behavior? | 13:08 |
QG | In the logs we only see for example : INFO octavia.controller.queue.v2.endpoints [-] Creating member 'ca977118-94ad-4314-9ca5-89baa017771c'... | 13:10 |
QG | and when we check in the octavia-persistence databases there are no tasks in progress | 13:11 |
tweining | Hi QG, I assume this happens only when you create or failover resources? | 13:11 |
QG | Yes exactly ! | 13:12 |
tweining | AFAIR there were issues with resources stuck in PENDING_*, but I am not aware of any that are related to Zed. I'll have to search. | 13:15 |
tweining | also, that can happen for a lot of reasons | 13:15 |
tweining | the first thing I would probably do is to check if the amphora instance came up without errors and then whether the management network works. | 13:22 |
tweining | maybe it is related to https://storyboard.openstack.org/#!/story/2010426 | 13:24 |
QG | Checking amphora and network thanks ! | 13:26 |
QG | I can see in the amphora agent logs some configuration reloads from previous configuration changes (like a couple of seconds before) | 13:29 |
QG | [2023-08-09 12:44:23 +0000] [650] [DEBUG] PUT /1.0/loadbalancer/ff5783e5-15a5-49f7-932a-cadbce463810/reload | 13:29 |
QG | and | 13:29 |
QG | worker 2023-08-09 12:44:29.110 11 INFO octavia.controller.queue.v2.endpoints [-] Creating member 'ca977118-94ad-4314-9ca5-89baa017771c'... | 13:29 |
QG | we have backported https://storyboard.openstack.org/#!/story/2010426 | 13:34 |
johnsom | QG Hi, PENDING_CREATE is an odd one. To clarify, you do see one of the worker processes pick it up off the rabbit queue and start working on it? | 15:09 |
johnsom | You might check your OctaviaConnectionMaxRetries and OctaviaBuildActiveRetries settings, the upstream default is VERY long (hours) due to the slow test gate hosts. We usually set those to lower numbers that are more user friendly. Basically those set how long we keep retrying nova/neutron failures. | 15:10 |
QG | Hi johnsom, yes, I see the worker pick it up and start working on it, and I even see the action on the amphora side, but at the end of the task, when it needs to put the LB back in ACTIVE, it doesn't do it | 15:10 |
johnsom | Do you see retry warning statements in that worker log? | 15:11 |
QG | oh checking thanks | 15:11 |
johnsom | PENDING_* states mean one of the controllers has ownership and is provisioning or retrying failure conditions | 15:11 |
QG | no i don't see any retry warning | 15:13 |
QG | there is no task in octavia-persistence | 15:13 |
johnsom | So you have enabled jobboard? | 15:14 |
QG | yes we have | 15:14 |
johnsom | Hmm, I guess my next step would be to go through the worker log and follow its progression until it stops. If you want to share the worker log, I am happy to take a look. | 15:15 |
QG | yes sure i will try to compile the logs and share them | 15:39 |
opendevreview | Merged openstack/octavia-dashboard master: Imported Translations from Zanata https://review.opendev.org/c/openstack/octavia-dashboard/+/882660 | 16:30 |
opendevreview | Michael Johnson proposed openstack/octavia master: Remove unused wait_for_port_detach code https://review.opendev.org/c/openstack/octavia/+/890958 | 19:51 |
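
johnsom's suggestion to go through the worker log and follow the task until it stops can be sketched roughly as below. This is only an illustration, not an Octavia tool: the helper names `trace_resource` and `last_seen` are hypothetical, and the two sample lines are the ones QG quoted in the channel. It assumes the resource ID appears verbatim in every log line that concerns it.

```python
def trace_resource(lines, resource_id):
    """Return the log lines that mention resource_id, in original order.

    Hypothetical helper: assumes the ID appears verbatim in relevant lines.
    """
    return [line.rstrip("\n") for line in lines if resource_id in line]


def last_seen(lines, resource_id):
    """Return the last line mentioning resource_id, or None if there is none.

    The last matching line is where the progression stops in this log.
    """
    hits = trace_resource(lines, resource_id)
    return hits[-1] if hits else None


# Sample lines copied from the conversation above.
sample_log = [
    "[2023-08-09 12:44:23 +0000] [650] [DEBUG] PUT "
    "/1.0/loadbalancer/ff5783e5-15a5-49f7-932a-cadbce463810/reload\n",
    "2023-08-09 12:44:29.110 11 INFO octavia.controller.queue.v2.endpoints "
    "[-] Creating member 'ca977118-94ad-4314-9ca5-89baa017771c'...\n",
]

member_id = "ca977118-94ad-4314-9ca5-89baa017771c"
for entry in trace_resource(sample_log, member_id):
    print(entry)
```

On a real worker log, the same functions could be fed an open file object instead of the sample list; the last matching line then shows how far the worker got before the resource stayed in PENDING_CREATE.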