Friday, 2025-01-31

sahidralonsoh: morning07:38
sahidsorry to jump on you like that, but you may have a hint, can you have a look at the unit tests? https://24e2a8e2d1a77fa9bdf5-a6c8485b6a6aeffda6bfb9fca2c239bc.ssl.cf2.rackcdn.com/939321/23/check/openstack-tox-py39/7b25c30/testr_results.html07:38
sahida lot are failing but I'm not able at this point to identify the reason07:38
sahidlocally they work07:39
sahidi'm sure it's a tiny thing...07:39
sahidlocally: tox -epy3 -- neutron.tests.unit.plugins.ml2.drivers.openvswitch.agent.test_ovs_neutron_agent.TestOvsNeutronAgentOSKen07:42
sahidRan: 171 tests in 4.4855 sec. - Passed: 171 - Skipped: 0 - Expected Fail: 0 - Unexpected Success: 0 - Failed: 007:43
sahidSum of execute time for each test: 40.2115 sec.07:43
sahidhum perhaps i should rm my py3 env07:43
sahidthat works, even after having reset the env...07:44
ralonsohsahid, so the errors in the test suite are restricted to the worker {3}07:51
ralonsohhttps://24e2a8e2d1a77fa9bdf5-a6c8485b6a6aeffda6bfb9fca2c239bc.ssl.cf2.rackcdn.com/939321/23/check/openstack-tox-py39/7b25c30/job-output.txt07:51
ralonsohand they start after executing one of your new tests07:51
ralonsoh2025-01-30 15:58:24.643291 | ubuntu-focal | {3} neutron.tests.unit.plugins.ml2.drivers.openvswitch.agent.openflow.native.test_ovs_oskenapp.TestSignalHandling.test_signal_initialization [0.001730s] ... ok07:52
ralonsohthis test is doing something wrong07:52
ralonsohtry sending this patch without this test (commenting it)07:52
ralonsohthe UTs will pass07:52
ralonsohI've reproduced that locally, but by executing the whole test suite07:54
ralonsohthat means, executing test_signal_initialization too07:54
ralonsohI'm trying now without this test07:54
ralonsohsahid, to be honest, I don't see any benefit from this test. This is just testing that some methods are executed. But these methods are not in a branch or condition07:58
ralonsohthis test is not useful07:58
sahidyes... not useful and it looks like it breaks something07:59
sahidi'm trying to see how i can articulate the change08:00
ralonsohthis test can check that the signal_handler is called08:00
ralonsohthis is passed to ovs_agent main08:00
ralonsohand you spawn it in the hub08:00
ralonsohthen the daemon_loop must be called08:01
ralonsohit is important, and I think that's what is breaking the test suite08:01
ralonsohto stop the app08:01
ralonsohthis must be defined in a cleanUp method08:01
sahidthanks ralonsoh let me see if i can try to follow your reco08:03
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/93932108:26
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776508:26
sahidok I have removed the uninteresting tests and added a cleanup to stop the hub. let's see if that fixes our problem; in the meantime I will work on a test that goes through ovs-agent to call daemon_loop08:27
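[Editor's note] The fix discussed above (a test that spawns a background loop must stop it in a cleanup, otherwise later tests on the same worker can be affected) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `FakeHub`, `daemon_loop` and `TestWorkerCleanup` are hypothetical stand-ins built on the standard library, not the os-ken hub or the actual Neutron test code.

```python
import threading
import unittest


class FakeHub:
    """Hypothetical stand-in for the hub that runs the agent loop."""

    def __init__(self):
        self._stop = threading.Event()
        self._thread = None

    def spawn(self, target):
        # Run the target in a background thread, handing it the stop event.
        self._thread = threading.Thread(
            target=target, args=(self._stop,), daemon=True)
        self._thread.start()

    def stop(self):
        # Ask the loop to exit and wait for the thread to finish.
        self._stop.set()
        if self._thread is not None:
            self._thread.join(timeout=5)

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()


def daemon_loop(stop_event):
    # Placeholder loop: idles until the stop event is set.
    while not stop_event.wait(0.01):
        pass


class TestWorkerCleanup(unittest.TestCase):

    def setUp(self):
        super().setUp()
        self.hub = FakeHub()
        # Registered cleanups run even when the test fails, so the
        # background loop never outlives the test that started it.
        self.addCleanup(self.hub.stop)

    def test_loop_is_spawned(self):
        self.hub.spawn(daemon_loop)
        self.assertTrue(self.hub.is_running())
```

Registering the stop call with `addCleanup` in `setUp`, rather than calling it at the end of the test body, is the design choice that matters: the loop is stopped whether the assertions pass or not.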
opendevreviewRodolfo Alonso proposed openstack/neutron master: [OVN] Refresh host nodes before notifying  https://review.opendev.org/c/openstack/neutron/+/94025609:28
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2024.1: [OVN] Add the network type to the ``Logical_Switch`` register  https://review.opendev.org/c/openstack/neutron/+/94045309:43
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2024.1: [OVN] Set reside-on-chassis-redirect also for FLAT networks  https://review.opendev.org/c/openstack/neutron/+/94045410:01
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2023.2: [OVN] Add the network type to the ``Logical_Switch`` register  https://review.opendev.org/c/openstack/neutron/+/94050210:13
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2023.2: [OVN] Set reside-on-chassis-redirect also for FLAT networks  https://review.opendev.org/c/openstack/neutron/+/94050310:14
*** bpetermann is now known as Guest757110:18
opendevreviewLajos Katona proposed openstack/neutron-tempest-plugin master: Tap Mirror API and scenario tests  https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/88600410:22
opendevreviewLajos Katona proposed openstack/networking-bagpipe master: CFG: add help text for OVS dataplane driver cfg options  https://review.opendev.org/c/openstack/networking-bagpipe/+/93373510:22
opendevreviewLajos Katona proposed openstack/tap-as-a-service master: Do not set ageing in case of system datapath type  https://review.opendev.org/c/openstack/tap-as-a-service/+/92240010:23
opendevreviewMerged openstack/neutron stable/2023.2: Clean up state VRRP PID file  https://review.opendev.org/c/openstack/neutron/+/94044610:42
opendevreviewRodolfo Alonso proposed openstack/neutron master: [OVN] Refactor ``HashRingManager`` sync method  https://review.opendev.org/c/openstack/neutron/+/94034210:47
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2023.2: [OVN] Add the network type to the ``Logical_Switch`` register  https://review.opendev.org/c/openstack/neutron/+/94050210:58
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2023.2: [OVN] Set reside-on-chassis-redirect also for FLAT networks  https://review.opendev.org/c/openstack/neutron/+/94050310:58
opendevreviewMerged openstack/neutron master: Check existence of GW port before trying to delete it  https://review.opendev.org/c/openstack/neutron/+/93945112:47
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/93932113:02
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776513:02
haleyb#startmeeting neutron_drivers14:00
opendevmeetMeeting started Fri Jan 31 14:00:17 2025 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.14:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.14:00
opendevmeetThe meeting name has been set to 'neutron_drivers'14:00
haleybPing list: ykarel, mlavalle, mtomaska, slaweq, obondarev, tobias-urdin, lajoskatona, amotoki, haleyb, ralonsoh14:00
mlavalle\o14:00
ralonsohhello14:00
obondarevo/14:00
haleybwill wait another minute for quorum14:01
lajoskatonao/14:01
haleybi had pinged daniel about the one item on our agenda, just noticed he said he might be late14:02
f0oo/ I'm here14:02
haleyboh hi14:02
f0oHi! literally just got my laptop open in time hehe14:02
haleybwe can get started then14:02
haleyb#link https://bugs.launchpad.net/neutron/+bug/209670414:02
haleyb[RFE] Allow rebalancing of OVN LRPs14:02
f0oShoot, how can I help clarify the RFE :)14:03
haleybso i think you had talked to ralonsoh last week about this? it does seem like a gap to me14:03
ralonsohwell, not a gap14:04
ralonsohbut a useful tool when using OVN L314:04
mlavalleseems a sensible tool14:04
f0o:)14:04
lajoskatonaTuesday ralonsoh said it would be a tool, not an API, am I right about it?14:05
ralonsohyes, that should be a tool14:05
ralonsohbut f0o can explain how it should be and the goal14:05
f0oFor us it would scratch the itch that Nova already has covered with migrations, we can always rebalance our compute nodes with some maintenance window/s but as for the network nodes they seem destined to just overflow and give in - having little space for horizontal scaling as we cannot reassign the LRP priorities to the new nodes14:06
ralonsohyes, but this is the use case14:08
ralonsohhow should this tool work?14:08
ralonsohwhy is it needed?14:08
haleybare there other options? etc14:09
haleybf0o: are you still here?14:10
f0oI can only speak for us, we are happy with a one-shot command that we can Cron or issue on-demand when we see capacity imbalances. But if there's a better option akin to feedback-loops where neutron would check on chassis-events if it needs to rebalance or not, that can also work. However I'm unsure how invasive the changing of LRP priorities is14:10
f0oI dug a bit and saw that I could just YOLO change the priorities of the lrp<>chassis but I have no idea how that affects neutron if I were to do that with a bash-loop14:11
f0oThe situation we are in now is a deadlocked one. The only rectifying way I can see to get the priorities shifted is to remove all routers and readd them which breaks FIPs and causes big outages14:11
ralonsohhow did you end in this situation?14:12
f0oI'm not even sure how one GN ended up with being the prio for all lrps - but that's out of scope because even if thats fixed in $future, the bad situation wont be fixed retrospectively14:12
ralonsohdid you remove GW chassis?14:12
f0owe had no notable downtime of GW chassis, we had a few maintenances that made a few failovers but nothing longer than an hour14:12
f0oduring which we didnt accept public neutron endpoint requests to avoid customers from creating routers14:13
ralonsohbut you removed GW chassis from the deployment14:13
f0owe did not, we just quit ovs and performed the maintenance and rebooted14:13
ralonsohdid you stop GW chassis?14:14
f0oyeah via systemctl stop ovs-vswitchd14:14
ralonsohso this is why OVN rebalanced the LRP assigned chassis14:15
ralonsohso you manually made an action that triggered the rebalance of the assigned chassis14:15
f0olikely; it was our understanding that the LRP priorities would go back to the previously assigned chassis14:15
ralonsoheventually, when the high prio chassis is restored, the assigned chassis should be changed back, yes14:16
ralonsohbut if that didn't happen, then you need to check your OVN chassis list14:16
ralonsohin any case, about this tool, it should be an admin only tool, of course14:16
f0owhat I saw when we spoke is that the rt1 chassis has prio 2 and the rt1 chassis has prio 1 on all lrps14:17
f0oso the chassis is there, but its low prio on all lrps, instead of regaining prio 2 on some of them14:17
ralonsohyes but you didn't tell me that you rebooted systems14:17
ralonsohthat was missing in this conversation14:17
ralonsohin any case14:17
ralonsohfirst of all, you should also open a bug related to the OVN L3 scheduler14:18
f0ohow should the maintenance have been performed to avoid this in the future? because some actions require reboots14:18
ralonsohif with 2 GW chassis the balance is not correctly done, this could be an algorithm issue14:18
ralonsohf0o, as commented, when you restart a GW chassis, OVN restores the high prio chassis assigned14:18
f0oif I were to add a 3rd GW chassis now, how would the existing LRPs be redistributed so that the new GW wouldnt be empty until a new LRP is created?14:18
ralonsohf0o, by default, the underlying architecture is usually static in a deployment14:19
ralonsohyou are talking about upgrade/maintenance operations14:19
ralonsohand this is fine but this is not normal operation14:19
ralonsohdo not expect the OVN L3 scheduler to re-balance14:20
f0oif the L3 Scheduler won't rebalance then I need to be able to do so, no?14:20
f0oor at least have the option to do so if I wanted to. Otherwise I will have a dying node permanently and no way to avoid it since it just happens to have more LRPs historically14:21
ralonsohagain, I think this tool is something nice to have and useful **after cluster modifications**14:22
f0o+114:22
mlavallef0o: would you implement it?14:22
f0omlavalle: I have no domain knowledge how external changes to the OVN LRP prios would affect neutron internals. But I can create a tool that just scans all LRPs and all chassis and just redistributes the prios to have a balanced sheet. If neutron needs to be informed of this and somebody can point me to _how_ I could inform neutron of this change so the DB (I guess?) is in14:24
f0osync then I can do that as well14:24
f0othis is where we are now halted as well, if we are able to freely reassign prios without neutron going haywire from it (AKA no notifications needed) then I can make a PoC very quickly14:25
ralonsohI would need to check but this is only OVN NB config: there is a list of gateway_chassis registers per LRP and they are assigned to the LRP.gateway_chassis list14:27
ralonsohNeutron DB doesn't play a role here14:27
f0oso then issuing ovn-nbctl lrp-set-gateway-chassis lrp-123 chassis-with-prio-1 900 on the northd hosts should cause no problems, right?14:28
ralonsohhold on, if you are creating a neutron script, you should use the mechanisms provided by Neutron14:29
ralonsohwe are not talking about a bash script here14:29
ralonsohcheck any other Neutron script14:29
lajoskatonayeah like remove_duplicated_port_bindings.py 14:30
lajoskatonahttps://opendev.org/openstack/neutron/src/branch/master/neutron/cmd/remove_duplicated_port_bindings.py14:30
ralonsohthat's a good example14:30
f0ook, I'll have a look at the OVN NB codebase in neutron and see what I can access/use there14:31
mlavalle+114:32
ralonsohmaybe neutron_ovn_db_sync_util, that has access to the OVN DB could be more useful14:32
ralonsohdo we need a spec for this RFE?14:32
ralonsohI don't think so14:32
ralonsohbut should be properly documented14:32
haleybif it's small in scope i would agree we don't need a spec14:33
haleybi think it should be separate from neutron-ovn-db-sync to be clear14:34
ralonsohyes, this is just an example14:34
ralonsohso should we vote then?14:34
mlavalle+114:35
ralonsoh+114:35
obondarev+114:35
lajoskatona+1 with good documentation what the user can expect using the tool14:35
haleyb+114:35
f0o+1 (Not even sure if mine counts ;)14:35
mlavallelol, it doesn't but it is welcome14:36
lajoskatonayour development will count more :-)14:36
haleybf0o: ok, can you write-up the summary here and add to the bug?14:36
haleybi will approve it14:36
f0ohaleyb: will do14:37
haleybwe can then just comment in any patch14:37
haleybf0o: thanks for bringing up the issue and thanks for working on it14:37
f0oHappy to help :)14:38
haleybis there anything else people want to discuss?14:38
haleybok, then thanks for attending, and have a good weekend!14:39
haleyb#endmeeting14:39
opendevmeetMeeting ended Fri Jan 31 14:39:19 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)14:39
opendevmeetMinutes:        https://meetings.opendev.org/meetings/neutron_drivers/2025/neutron_drivers.2025-01-31-14.00.html14:39
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/neutron_drivers/2025/neutron_drivers.2025-01-31-14.00.txt14:39
opendevmeetLog:            https://meetings.opendev.org/meetings/neutron_drivers/2025/neutron_drivers.2025-01-31-14.00.log.html14:39
ralonsohbye14:39
mlavalle\o14:39
f0oo/14:39
* f0o puts on the secretary hat and types away14:40
lajoskatonaBye14:40
f0otook a look at neutron.common.ovn and I think it has mostly everything. Might need to reimplement something akin to _sync_ha_chassis_group just for non chassis groups I guess.14:48
f0owill do more digging over the weekend hopefully14:48
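[Editor's note] The rebalancing idea from the meeting (scan all LRPs and all gateway chassis, then redistribute the priorities so no single chassis is the top choice everywhere) can be sketched as a pure planning function. This is only an illustration under stated assumptions: the round-robin rotation is one possible policy, not the approach agreed in the meeting, and `plan_lrp_priorities` plus the LRP and chassis names are hypothetical. The sketch only computes a plan; how it would be applied to the OVN NB database through Neutron's own mechanisms is left open, as it was in the discussion.

```python
from collections import Counter


def plan_lrp_priorities(lrp_names, chassis_names):
    """Return {lrp: [(chassis, priority), ...]} using round-robin rotation.

    Assumes a higher number means a higher gateway chassis priority,
    and that every chassis is a candidate for every LRP.
    """
    chassis = sorted(chassis_names)
    if not chassis:
        raise ValueError("at least one gateway chassis is required")
    count = len(chassis)
    plan = {}
    for index, lrp in enumerate(sorted(lrp_names)):
        # Rotate the chassis list so consecutive LRPs get a different
        # highest-priority chassis.
        shift = index % count
        order = chassis[shift:] + chassis[:shift]
        # First chassis in the rotated order gets the highest priority.
        plan[lrp] = [(name, count - rank) for rank, name in enumerate(order)]
    return plan


# Hypothetical input: four LRPs and two gateway chassis.
plan = plan_lrp_priorities(
    ["lrp-a", "lrp-b", "lrp-c", "lrp-d"], ["gw-1", "gw-2"])

# Count how many LRPs each chassis serves as the top-priority gateway.
top_choice = Counter(entries[0][0] for entries in plan.values())
```

With the hypothetical input above, each of the two chassis ends up as the top-priority gateway for two of the four LRPs. A real tool would also need to decide how to limit churn, since every changed top priority moves traffic.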
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2024.1: [OVN] Set reside-on-chassis-redirect also for FLAT networks  https://review.opendev.org/c/openstack/neutron/+/94045415:04
opendevreviewRodolfo Alonso proposed openstack/neutron stable/2023.2: [OVN] Set reside-on-chassis-redirect also for FLAT networks  https://review.opendev.org/c/openstack/neutron/+/94050315:07
opendevreviewRodolfo Alonso proposed openstack/neutron master: [OVN] Refresh host nodes before notifying  https://review.opendev.org/c/openstack/neutron/+/94025615:15
opendevreviewRodolfo Alonso proposed openstack/neutron master: [OVN] Refactor ``HashRingManager`` sync method  https://review.opendev.org/c/openstack/neutron/+/94034215:37
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776516:13
sahido/ if you have a moment, how do you feel about this one?16:16
sahidhttps://review.opendev.org/c/openstack/neutron/+/93932116:16
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Add option to avoid VRF removal  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94053716:32
priteauHello. Could we get another +2 for this backport? Thanks! https://review.opendev.org/c/openstack/neutron/+/94046316:37
bcafarelpriteau: let me check that16:39
opendevreviewMerged openstack/neutron stable/2024.1: [OVN] Add the network type to the ``Logical_Switch`` register  https://review.opendev.org/c/openstack/neutron/+/94045317:48
opendevreviewMerged openstack/neutron stable/2023.2: [OVN] Add the network type to the ``Logical_Switch`` register  https://review.opendev.org/c/openstack/neutron/+/94050217:49
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add LB sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/92532418:15
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Listener sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93125018:15
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Pool sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93126618:15
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Member sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93126718:15
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Health monitor sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93128818:15
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add sync floating IP support  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/92903918:15
opendevreviewMerged openstack/neutron unmaintained/2023.1: Clean up state VRRP PID file  https://review.opendev.org/c/openstack/neutron/+/94046318:37
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Listener sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93125018:39
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Pool sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93126618:39
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Member sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93126718:39
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add Health monitor sync logic  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/93128818:40
opendevreviewFernando Royo proposed openstack/ovn-octavia-provider master: Add sync floating IP support  https://review.opendev.org/c/openstack/ovn-octavia-provider/+/92903918:40
opendevreviewMerged openstack/neutron master: [eventlet-removal] Use non-eventlet metadata proxy in OVN metadata agent  https://review.opendev.org/c/openstack/neutron/+/93839319:34
