Thursday, 2026-02-19

<opendevreview> Gregory Thiemonge proposed openstack/octavia stable/2025.1: Fix issues related to pkg_resources module  https://review.opendev.org/c/openstack/octavia/+/977293  [08:55]
<opendevreview> Gregory Thiemonge proposed openstack/octavia stable/2024.2: Fix issues related to pkg_resources module  https://review.opendev.org/c/openstack/octavia/+/977294  [08:56]
*** tkajinam_ is now known as tkajinam  [10:13]
<opendevreview> Gregory Thiemonge proposed openstack/octavia stable/2025.1: Fix issues related to pkg_resources module  https://review.opendev.org/c/openstack/octavia/+/977293  [10:57]
<gthiemonge> ^ fixes additional issues in 2025.1 :/  [11:01]
*** beagles__ is now known as beagles  [13:07]
<servagem> Hi all. We had some Octavia LBs fail over due to a temporary Nova issue. The octavia-failover-amphora-flow failed with ComputeBuildException, leaving the LBs in ERROR and requiring manual intervention. Because these LBs support critical workloads, we need automatic retries for failover when Nova has transient failures. I checked the [compute] max_retries option, but it doesn't seem to apply to amphora build/create (only to some compute ops like delete). Is that intentional? Is there a recommended way to enable retries for amphora create during failover?  [13:29]
<gthiemonge> servagem: hey, yes, it is intentional, but the code may have changed a lot since it was added. Basically, the octavia-worker sends only one create-server request to Nova, then waits for a certain amount of time until Nova marks the VM as active, and that wait uses max_retries and the interval. So max_retries is not used to retry creating a VM, only to retry getting a positive status.  [13:49]
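The behaviour gthiemonge describes (one create request, then repeated status checks) can be sketched roughly as follows. This is a toy model, not Octavia's code: `build_amphora`, `create_server` and `get_status` are hypothetical names, the status strings are assumed, and only `ComputeBuildException` and `max_retries` are taken from the discussion.

```python
import time


class ComputeBuildException(Exception):
    """Stand-in for the exception named in the discussion."""


def build_amphora(create_server, get_status, max_retries=3,
                  retry_interval=0.0, sleep=time.sleep):
    """Issue ONE create request, then poll its status up to max_retries times.

    create_server and get_status are hypothetical callables standing in
    for the compute API; they are not real Octavia or Nova functions.
    """
    server_id = create_server()  # sent exactly once, never repeated
    for _ in range(max_retries):
        status = get_status(server_id)
        if status == "ACTIVE":
            return server_id
        if status == "ERROR":
            # Assumption: a failed build stops the wait early.
            break
        sleep(retry_interval)
    # The retries are exhausted (or the build failed); nothing re-creates the VM.
    raise ComputeBuildException(f"server {server_id} did not become ACTIVE")
```

In this model, raising `max_retries` only lengthens the wait for the same VM; it never triggers a second create request, which matches the behaviour servagem observed.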
<gthiemonge> there are some proposed patches that enable automatic failover of amphora in ERROR: https://review.opendev.org/c/openstack/octavia/+/934638 but we never agreed on such a feature  [13:51]
<servagem> That would be an excellent feature. These LBs support critical workloads, and the time to detect the issue, engage the ops team, and perform a manual failover isn't acceptable for us. How can we help move this feature forward?  [13:57]
<gthiemonge> reviews or tests would be appreciated, I haven't reviewed the code yet, I think the main question was: could it do more harm than good? in case of a major outage, it may be stuck in a loop of VM recreation and push a lot of load on nova  [14:05]
<servagem> It seems this patch doesn't use Tenacity for retries, unlike other parts of the Octavia code.  [14:06]
<servagem> yes, we can help with testing. I think that concern could be addressed by using options like the ones in the [compute] retry settings (max_retries, retry_interval, retry_backoff, retry_max)  [14:10]
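A bounded retry with a growing, capped delay, as servagem suggests, might look like the sketch below. It uses plain Python rather than Tenacity. The parameter names come from the message above, but their meaning here is an assumption (first wait, amount added after each failure, and cap on the wait), and `retry_with_backoff` is a hypothetical helper, not an Octavia function.

```python
import time


def retry_with_backoff(func, max_retries=3, retry_interval=1.0,
                       retry_backoff=1.0, retry_max=10.0,
                       sleep=time.sleep, retry_on=(Exception,)):
    """Call func up to max_retries times, waiting longer after each failure.

    Assumed semantics: wait retry_interval after the first failure, add
    retry_backoff after every failure, and never wait more than retry_max.
    """
    delay = retry_interval
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_retries:
                raise  # give up: the number of attempts is bounded
            sleep(min(delay, retry_max))
            delay += retry_backoff
```

Because both the attempt count and the wait are bounded, a prolonged outage produces a fixed, small number of extra create requests instead of an open-ended recreation loop.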
<gthiemonge> servagem: I think it doesn't retry when the creation fails, but it allows the health-manager to trigger a new failover after the VM creation fails (which is currently blocked in Octavia)  [14:12]
<servagem> yep, got it  [14:17]
*** croeland1 is now known as croelandt  [14:18]
