Thursday, 2026-02-19

opendevreview	Gregory Thiemonge proposed openstack/octavia stable/2025.1: Fix issues related to pkg_resources module https://review.opendev.org/c/openstack/octavia/+/977293	08:55
opendevreview	Gregory Thiemonge proposed openstack/octavia stable/2024.2: Fix issues related to pkg_resources module https://review.opendev.org/c/openstack/octavia/+/977294	08:56
*** tkajinam_ is now known as tkajinam		10:13
opendevreview	Gregory Thiemonge proposed openstack/octavia stable/2025.1: Fix issues related to pkg_resources module https://review.opendev.org/c/openstack/octavia/+/977293	10:57
gthiemonge	^ fixes additional issues in 2025.1 :/	11:01
*** beagles__ is now known as beagles		13:07
servagem	Hi all. We had some Octavia LBs failover due to a temporary Nova issue. The octavia-failover-amphora-flow failed with ComputeBuildException, leaving the LBs in ERROR and requiring manual intervention. Because these LBs support critical workloads, we need automatic retries for failover when Nova has transient failures. I checked [compute] max_retries option, but it doesn't	13:29
servagem	seem to apply to amphora build/create (only some compute ops like delete). Is that intentional? Any recommended way to enable retries for amphora create during failover?	13:29
gthiemonge	servagem: hey, yes it is intentional, but the code may have changed a lot since it was added. Basically, the octavia-worker sends only one create server request to nova, then waits for a certain amount of time until the VM is marked as active by nova, and this amount of time is derived from max_retries and the interval. So this max_retries is not used to retry to create a VM, only to retry to get a positive	13:49
gthiemonge	status.	13:49
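
[Editor's sketch] The behaviour gthiemonge describes (a single create request, followed by repeated status checks bounded by max_retries and an interval) could look roughly like the Python below. This is not Octavia's code: build_and_wait, ComputeBuildError, create_server and get_status are hypothetical names, and treating an ERROR status as an immediate failure is an assumption.

```python
import time
from typing import Callable


class ComputeBuildError(Exception):
    """Hypothetical stand-in for a build failure such as ComputeBuildException."""


def build_and_wait(
    create_server: Callable[[], str],
    get_status: Callable[[str], str],
    max_retries: int,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Send ONE create request, then poll the status up to max_retries times."""
    server_id = create_server()  # the create call is never repeated
    for _ in range(max_retries):
        status = get_status(server_id)
        if status == "ACTIVE":
            return server_id
        if status == "ERROR":
            # Assumption: a failed build is reported at once, with no re-create.
            raise ComputeBuildError(f"server {server_id} went to ERROR")
        sleep(interval)
    raise ComputeBuildError(f"server {server_id} not ACTIVE after {max_retries} checks")
```

In this sketch, raising max_retries only lengthens the wait for a positive status; it never causes a second create request, which matches the explanation above.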
gthiemonge	there are some proposed patches that enable automatic failover of amphora in ERROR: https://review.opendev.org/c/openstack/octavia/+/934638 but we never agreed on such a feature	13:51
servagem	That would be an excellent feature. These LBs support critical workloads, and the time to detect the issue, engage the ops team, and perform a manual failover isn't acceptable for us. How can we help move this feature forward?	13:57
gthiemonge	reviews or tests would be appreciated, I haven't reviewed the code yet, I think the main question was: could it do more harm than good? in case of major outage, it may be stuck in a loop of VM recreation and push a lot of load on nova	14:05
servagem	It seems this patch doesn't use Tenacity for retries, unlike other parts of the Octavia code.	14:06
servagem	yes, we can help testing. I think that concern could be addressed by using options like the ones in [compute] retry settings (max_retries, retry_interval, retry_backoff, retry_max)	14:10
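
[Editor's sketch] To illustrate how settings like the ones servagem lists could bound retries, the Python below computes a capped exponential backoff schedule. The option names come from the message above; the semantics (first wait is retry_interval, each later wait is multiplied by retry_backoff, no wait exceeds retry_max) are an assumption, not a statement of how Octavia interprets them, and retry_delays is a hypothetical helper.

```python
from typing import List


def retry_delays(
    max_retries: int,
    retry_interval: float,
    retry_backoff: float,
    retry_max: float,
) -> List[float]:
    """Return the wait before each retry, growing geometrically up to a cap."""
    delays = []
    delay = retry_interval
    for _ in range(max_retries):
        delays.append(min(delay, retry_max))  # never wait longer than retry_max
        delay *= retry_backoff
    return delays


schedule = retry_delays(max_retries=5, retry_interval=1, retry_backoff=2, retry_max=10)
print(schedule, sum(schedule))
```

Under these assumptions the number of attempts and the total time spent retrying are both finite, which is the property that would limit extra load on nova during a major outage.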
gthiemonge	servagem: I think it doesn't retry when the creation fails, but it allows the health-manager to trigger a new failover after the vm creation fails (which is currently blocked in Octavia)	14:12
servagem	yep, got it	14:17
*** croeland1 is now known as croelandt		14:18
