Friday, 2025-01-24

opendevreviewLajos Katona proposed openstack/neutron master: If OVS Manager creation failes retry to set values  https://review.opendev.org/c/openstack/neutron/+/93911708:14
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: fix potential race condition with respawn  https://review.opendev.org/c/openstack/neutron/+/93962708:20
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: remove usage of eventlet for AsyncProcess  https://review.opendev.org/c/openstack/neutron/+/93934808:20
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: common: fix wait_until_true to support native thread  https://review.opendev.org/c/openstack/neutron/+/93784308:20
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/93932108:20
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776508:20
slaweqralonsoh ykarel or lajoskatona hi, can one of you check and approve https://review.opendev.org/c/openstack/neutron/+/938135/? It already has 2 x +2 and the parent patch was just merged tonight08:27
ralonsohlet me check08:27
ralonsohsure08:27
slaweqthx, and also https://review.opendev.org/c/openstack/neutron/+/937887 if you have time08:27
ralonsohone sec08:27
ralonsohahh you implemented one_or_none()08:28
slaweqyes, I used it as you advised me :)08:29
ralonsohif you have time, these short 3 patches08:29
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/939095/408:29
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/939097/508:29
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/939210/508:29
slaweqralonsoh also those backports https://review.opendev.org/c/openstack/neutron/+/937887 :)08:29
slaweqand sure, I will look at your patches right now :)08:30
ralonsohsure08:30
lajoskatonaslaweq, ralonsoh: good morning, I'll check it08:30
ralonsohslaweq, what backports??08:30
slaweqthx lajoskatona 08:30
slaweqhttps://review.opendev.org/q/Ic11992ba3ed91980189efbacdc2a54fba64fcf7c08:30
ralonsohahhh yes08:30
slaweqsorry, I copied wrong link before :)08:30
ralonsohbtw, yesterday I was talking to Terry and I think we know how to fix the hash ring manager instability with wsgi. I'm pushing a patch today (that will be better than describing it here)08:32
slaweqgreat, I hope it will work and our gate will be more stable :)08:33
slaweqI +2'ed your patches already08:35
ralonsohthanks08:35
slaweqand approved 2 of them which had other +208:35
slaweqone of them you just rechecked and it needs another +2 still08:35
slaweqso maybe lajoskatona or ykarel will look into it too08:36
ralonsohthat would be perfect08:36
ralonsohclosing some eventlet-removal bits08:36
opendevreviewVasyl Saienko proposed openstack/neutron master: Add link to Octavia and SRIOV limitations from generic OVN Gaps page  https://review.opendev.org/c/openstack/neutron/+/93994608:39
ralonsohhaleyb, hello! Please check https://review.opendev.org/c/zuul/zuul-jobs/+/940074 comments. As you know, because of the comment in twine, they jumped from 20.04 to 24.04, with the issues during the installation09:15
ralonsohand the job is still failing across all of openstack...09:16
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Ensure that ARP/NDP is enabled for vlan devices  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/93580110:36
opendevreviewMerged openstack/neutron master: Make API policies for tags to be working with resource attributes  https://review.opendev.org/c/openstack/neutron/+/93813511:14
opendevreviewAnton Kurbatov proposed openstack/neutron master: Fix DHCP agent events throttling malfunction  https://review.opendev.org/c/openstack/neutron/+/93997011:20
opendevreviewRodolfo Alonso proposed openstack/neutron master: [eventlet-removal][OVN] Require wsgi start-time in the config  https://review.opendev.org/c/openstack/neutron/+/94012312:05
f0oralonsoh: at risk of asking outdated information; I just came across your proposal to implement a new l3 scheduler for OVNGatewayHAChassisGroup while investigating why all HA routers are located on one network node instead of being split among both (least loaded does suggest this split but I guess not..?) - I saw your WIP Patch was abandoned, did it get replaced by12:25
f0osomething similar? Am I barking up the wrong tree?12:25
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Fix cleanup of rules per evpn device  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/92781612:33
ralonsohf0o, the patch I started 1 year ago to implement a new L3 scheduler using HA_Chassis_Group makes no sense with this new requirement if we are going to migrate the current schedulers to HA_Chassis_Group12:33
ralonsohright now, the HA_Chassis_Group is used for external ports only12:34
ralonsohwhen the HA_Chassis list is built, no scheduling method is considered; it just retrieves the list and that's all12:34
ralonsohthus, it could be possible to also include some kind of scheduling in the external ports' HA_Chassis_Group creation, in order to balance the GW chassis12:36
opendevreviewyatin proposed openstack/neutron master: [DNM] Check functional failure  https://review.opendev.org/c/openstack/neutron/+/94012812:37
f0oI feel like either I misunderstood how L3HA is supposed to work on OVN or my setup is not working correctly. What I see is that all routers are shifted to the other node on BFD failure, which is great. But all routers always remain on one chassis, which is a shame and makes it a bit difficult to scale since we can't just add more nodes but need to add _bigger_ nodes to scale12:37
ralonsohf0o, the L3 scheduler default algorithm is leastloaded, that should "share" the GW ports across all GW nodes12:41
ralonsohhow many GW chassis do you have? do you have availability zones?12:41
f0o2 GW Nodes (https://paste.opendev.org/show/bQ7LLYGm7OEyEskI9vJv/) - no AZs12:43
ralonsohf0o, yes but are you sure the chassis are GW nodes? please check the ovs local config12:44
ralonsohone sec12:44
ralonsohroot@u22ovn:~# ovs-vsctl list open . | grep enable-chassis-as-gw12:44
ralonsohexternal_ids        : {hostname=u22ovn, ovn-bridge=br-int, ovn-bridge-mappings="public:br-ex", ovn-cms-options=enable-chassis-as-gw, ovn-encap-ip="192.168.10.100", ovn-encap-type=geneve, ovn-remote="tcp:192.168.10.100:6642", system-id="3c4a168c-ac2a-496d-9c04-1a1ced01052a"}12:44
ralonsohcheck this in each host12:44
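The check above can be scripted. The following is a minimal sketch, assuming the `external_ids` line has the same shape as the output pasted above; the helper names `parse_external_ids` and `is_gateway_chassis` are hypothetical and are not part of OVS, OVN, or Neutron.

```python
import re


def parse_external_ids(line: str) -> dict:
    """Parse an 'external_ids : {k=v, ...}' line into a dict.

    Assumes the format printed by `ovs-vsctl list open .` in the log above.
    """
    # Keep only the text between the outermost braces.
    body = line.split("{", 1)[1].rsplit("}", 1)[0]
    # A value is either a double-quoted string or a run of non-comma characters.
    pairs = re.findall(r'([\w-]+)=("[^"]*"|[^,]*)', body)
    return {key: value.strip().strip('"') for key, value in pairs}


def is_gateway_chassis(line: str) -> bool:
    """Return True if ovn-cms-options contains enable-chassis-as-gw."""
    options = parse_external_ids(line).get("ovn-cms-options", "")
    return "enable-chassis-as-gw" in options.split(",")


sample = (
    'external_ids        : {hostname=u22ovn, ovn-bridge=br-int, '
    'ovn-bridge-mappings="public:br-ex", ovn-cms-options=enable-chassis-as-gw, '
    'ovn-encap-ip="192.168.10.100", ovn-encap-type=geneve, '
    'ovn-remote="tcp:192.168.10.100:6642", '
    'system-id="3c4a168c-ac2a-496d-9c04-1a1ced01052a"}'
)
gateway = is_gateway_chassis(sample)
```

Running this against the line from each host would show which chassis advertise themselves as gateway nodes.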
f0oI got that set on both rt1 and rt212:45
f0oor does it need to be set on the computehosts too? I don't run distributed FIPs because we don't want to drag the public vlan into all hypervisors12:45
ralonsohno if you don't want dvr, that's ok12:46
ralonsohbut there is something weird in this distribution12:46
ralonsoh1 lrp in rt2 only12:46
f0o(https://paste.opendev.org/show/bzhH9TyDBSMAqy65juDJ/ if you want to see the output)12:46
ralonsohno, that's fine12:47
f0oyeah that's what confused me too, it's not 100% on rt1 and 0 on rt2, there's one router that has no special config or range or setup that is on rt212:47
ralonsohso what are you doing? just create a router and assign the public network, right?12:47
f0oI triple checked that it is set to leastloaded, and there was no notable downtime of rt2 that could've skewed the scheduling12:47
f0ocorrect, just create router and add to network12:47
opendevreviewMerged openstack/neutron stable/2023.2: Make sure that policy enforcer is initialized before use  https://review.opendev.org/c/openstack/neutron/+/93891312:48
f0oIf there is a way to rebalance these bindings retroactively I could just make a daily cronjob and call it a day haha12:48
ralonsohok, there could be something broken in the algorithm with 2 nodes. I think the tests are now using 3 or more GW chassis12:48
ralonsohbut there could be some weird corner case with 2 GW nodes only12:48
ralonsohno, right now there is no way to do this (in an automated way)12:49
ralonsohyou can:12:49
ralonsoh1) propose the implementation of this tool (this is something that has been discussed before). Please open a LP bug12:49
ralonsoh2) open another LP bug for the L3 issue with 2 GW nodes12:50
ralonsohthis is something that can be tested with UTs 12:50
f0oMy memory is a bit weak here but I recall an LP bug/feature-request about this tool a while back... need to comb my history12:51
ralonsohf0o, what version are you running? there were many new features/changes during the last year12:51
f0obut from my very limited understanding; the tool only needs to change the chassis priorities in OVN right?12:51
f0orunning openstack-ansible 2024.212:51
f0olet me dig out the tooling versions12:52
ralonsohf0o, yes, you need to change the HA_Chassis priority in sync with the other records associated with this HA_Chassis_Group12:52
ralonsohbut this tool is dangerous because that implies a change in the LRP binding12:52
ralonsohand that implies breaking the active traffic12:53
ralonsohso this is not a trivial tool for a live env12:53
f0oso when you mention HA_Chassis_Group; None of my LRPs have that populated; it's all `[]` but the rt* ids show in status field as hosting-chassis12:54
ralonsohsorry12:55
ralonsohnot hcg, this is still NOT implemented12:55
ralonsohI mean gateway_chassis12:55
f0ohttps://paste.opendev.org/show/bXH1riYAo1Tzuxw4LqcF/ << gateway chassis seems to only include the Hypervisors with active VMs for me12:56
ralonsohthe second one is incorrect12:56
ralonsohby default the gateway_chassis list can have up to 5 elements (that is hardcoded)12:57
ralonsohso if you have 2 GW chassis, that should have 2 elements12:57
ralonsohsomething didn't work well with the scheduler and 1 of the GW chassis got rejected12:57
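The rule stated above (a `gateway_chassis` list holds up to 5 entries, so 2 GW chassis should yield 2 entries) can be turned into a quick audit. This is only a sketch: the constant name, the function names, and the sample data are illustrative assumptions, and the LRP-to-chassis mapping would have to be collected from the OVN northbound database by other means.

```python
# Cap mentioned in the discussion; the constant name here is illustrative.
MAX_GATEWAY_CHASSIS = 5


def expected_entries(gateway_nodes: int) -> int:
    """Number of gateway_chassis entries each LRP is expected to have."""
    return min(gateway_nodes, MAX_GATEWAY_CHASSIS)


def underscheduled(lrp_to_chassis: dict, gateway_nodes: int) -> list:
    """Return the LRPs that have fewer entries than expected."""
    expected = expected_entries(gateway_nodes)
    return sorted(
        lrp for lrp, chassis in lrp_to_chassis.items() if len(chassis) < expected
    )


# Hypothetical data: two LRPs scheduled on both nodes, one on rt2 only.
sample = {
    "lrp-a": ["rt1", "rt2"],
    "lrp-b": ["rt1", "rt2"],
    "lrp-c": ["rt2"],
}
suspicious = underscheduled(sample, gateway_nodes=2)
```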
f0ooh let me go through the other lrp's then, maybe the rest also only has 1 entry12:58
f0oalso disregard that statement on hypervisor uuids; I grepped all chassis and the UUIDs are the GW nodes just with different Ids over and over again12:59
f0oralonsoh: https://paste.opendev.org/show/b8BqVZnyXqzhFEaYB6NM/ All have 2 entries apart from that one that is on rt2. So somehow rt1 has full priority over rt2 and the only router that's on rt2 is because it wasn't scheduled on rt1 due to a bug? :D13:04
ralonsohf0o, I would need to test the scheduler algorithm with 2 nodes, that could be done with UTs I think (this is just python code)13:06
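To illustrate what such a two-node test could look like, here is a toy least-loaded scheduler. It is a sketch under stated assumptions and is not Neutron's scheduler code: it assumes that "load" means the number of LRPs for which a chassis holds the highest priority, and every name in it is hypothetical.

```python
from collections import Counter

# Cap mentioned in the discussion; the constant name here is illustrative.
MAX_GATEWAY_CHASSIS = 5


def schedule_least_loaded(candidates, bindings):
    """Return candidates ordered from highest to lowest priority.

    `bindings` maps an LRP name to its chassis list, highest priority first.
    The chassis that is currently active for the fewest LRPs goes first;
    ties are broken by name so the result is deterministic.
    """
    active = Counter(chassis[0] for chassis in bindings.values() if chassis)
    ordered = sorted(candidates, key=lambda name: (active[name], name))
    return ordered[:MAX_GATEWAY_CHASSIS]


def simulate(candidates, count):
    """Schedule `count` LRPs one after another and return all bindings."""
    bindings = {}
    for index in range(count):
        bindings[f"lrp-{index}"] = schedule_least_loaded(candidates, bindings)
    return bindings


bindings = simulate(["rt1", "rt2"], 4)
active_per_chassis = Counter(chassis[0] for chassis in bindings.values())
```

With this toy model, two gateway nodes end up active for the same number of routers, which is the balance f0o expected to see. A unit test against the real scheduler could assert the same property.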
f0oInterestingly enough, the lrp that's only on rt2 is just a standard router no different than the rest13:07
f0ofun :)13:07
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Fix running sync method for every external_ids update.  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94012913:10
f0oso if I wanted to create a tool to migrate the LRPs priorities around I would just issue ovn-nbctl lrp-set-gateway-chassis lrp-123 chassis-with-prio-1 3 && ovn-nbctl lrp-set-gateway-chassis lrp-123 chassis-with-prio-2 1 && ovn-nbctl lrp-set-gateway-chassis lrp-123 chassis-with-prio-1 2 - which would move the chassis with prio 1 to prio 3 (making it active), demote the13:11
f0opreviously active chassis to prio 1 and ultimately set the new primary chassis to prio 2 to keep the numbering consistent13:11
f0oobviously this would not be 1,2,3 but something a bit bigger like 1,2,...n,n+1 to avoid collision13:12
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Fix running sync method for every external_ids update.  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94012913:12
f0oI understand that this is quite the jackhammer method, I'd just like to know if this would even work since a fix in the assignment algo is not going to propagate back to these existing lrp's13:13
f0odo I need to inform/mess with neutron in any way or is this entire thing confined to OVN? (that would be very nice tbh)13:14
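The three-step swap f0o describes can be written down as a command generator. This sketch only builds the `ovn-nbctl lrp-set-gateway-chassis` argument lists quoted in the chat and does not execute anything; the function name and parameters are hypothetical. As noted earlier in the conversation, changing these priorities changes the LRP binding and interrupts active traffic, so this is not something to run blindly on a live environment.

```python
def swap_priority_commands(lrp, low_chassis, high_chassis, low_prio, high_prio):
    """Build the ovn-nbctl calls that swap the priorities of two chassis.

    Mirrors the sequence from the chat: use a temporary priority above the
    current maximum so the two chassis never share a priority value.
    """
    temp_prio = high_prio + 1
    base = ["ovn-nbctl", "lrp-set-gateway-chassis", lrp]
    return [
        base + [low_chassis, str(temp_prio)],  # promote above the active chassis
        base + [high_chassis, str(low_prio)],  # demote the previously active one
        base + [low_chassis, str(high_prio)],  # settle on the original top value
    ]


commands = swap_priority_commands(
    "lrp-123", "chassis-with-prio-1", "chassis-with-prio-2", low_prio=1, high_prio=2
)
```

Each inner list could be passed to `subprocess.run` once the operational impact has been accepted.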
zigoralonsoh: We still continue to have openvswitch-agent turning in loops. The type of logs we're getting:13:17
zigohttps://paste.opendev.org/show/b6rGYiuIj0QqyzvSAi5X/13:17
zigoIt also happens on some compute nodes that are running with no VM, that's weird...13:17
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Expose floating ips attached to virtual ports  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94013213:31
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Expose floating ips attached to virtual ports  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94013213:36
opendevreviewMichel Nederlof proposed openstack/ovn-bgp-agent master: Fix running sync method for every external_ids update.  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94012913:41
ralonsohf0o, this kind of tool affects Neutron only. We are OVN users and OVN is not responsible for creating the gateway_chassis priorities13:48
ralonsohif you are going to create something like this, I would suggest first creating an LP bug with [RFE] in the title13:48
ralonsohand present it in the Neutron drivers meeting (Fridays at 14UTC)13:49
ralonsohtoday there is no meeting because there is no agenda13:49
ralonsohthe agenda link (to add the topic)13:49
ralonsohhttps://wiki.openstack.org/wiki/Meetings/NeutronDrivers13:49
ralonsohzigo, let me check13:50
ralonsohzigo, you receive the RPC call, then the agent may or may not act on it13:51
zigoThanks.13:51
ralonsohif there are no ports affected, then no action is done13:51
ralonsohbut you'll see the RPC event received13:51
ralonsohl2pop is a bit chatty, but that makes sense if any agent needs to receive all events from all VMs13:52
ralonsohso anytime you create a new VM and spawn a port (or many), you'll receive that in every OVS agent13:52
zigoYeah, but there's some req-IDs that have been ongoing for more than a day.13:53
zigoThat feels weird ...13:54
zigoI now believe we had this before, though what changed with my upgrade is that I'm now in debug True.13:54
zigoSo maybe I'm just seeing things I didn't before.13:54
f0oralonsoh: thanks gotcha - will probably do so in the near future. unfortunately I'm swamped today (family keeping me on my toes)13:55
haleybralonsoh: yes, will try to get that updated today14:00
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Ensure that ARP/NDP is enabled for vlan devices  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/93580114:01
ralonsohhaleyb, thanks!14:01
ralonsohbtw, please check this patch14:01
ralonsohhttps://review.opendev.org/c/openstack/neutron/+/93909514:01
opendevreviewBence Romsics proposed openstack/neutron master: Do not assume the existence of a trunk bridge since os-vif may have deleted it  https://review.opendev.org/c/openstack/neutron/+/93978614:01
ykarelfroyo_, when you get a chance please check https://bugs.launchpad.net/neutron/+bug/2095807, maybe that's already known and a duplicate bug14:02
zigoralonsoh: It really is the case that it's just because we switched to debug=True, so we didn't know what was going on. Though I wonder why there is so much continuous traffic for L2 stuff, it still doesn't feel right.14:02
ralonsohthe point is that you need all the events and then filter them depending on the local ports14:03
ralonsohbut you need to receive all of them in the first place14:03
zigoEven in a compute with no VMs ?14:04
ralonsohyes, you are subscribed to all events, then the OVS agent will filter the ones needed14:05
ralonsohyou can check that code14:05
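The receive-everything-then-filter pattern described above can be modelled in a few lines. This is a simplified sketch and not the OVS agent's code: the class, the event shape, and the port IDs are all hypothetical. It shows why an agent on a compute node with no VMs still logs every event even though it acts on none of them.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSketch:
    """Toy agent: subscribed to every event, acts only on local ports."""

    local_ports: set
    received: int = 0
    handled: list = field(default_factory=list)

    def on_event(self, event: dict) -> None:
        # Every broadcast event reaches the agent (and shows up in debug logs).
        self.received += 1
        # Only events for ports bound on this host trigger any work.
        if event["port_id"] in self.local_ports:
            self.handled.append(event)


events = [{"port_id": "p1"}, {"port_id": "p2"}, {"port_id": "p3"}]
busy = AgentSketch(local_ports={"p2"})
empty = AgentSketch(local_ports=set())
for event in events:
    busy.on_event(event)
    empty.on_event(event)
```

Both agents receive all three events; the one with no local ports handles none of them.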
zigoOk, thanks, understood.14:05
ralonsohthat said, OVS agent is not the most scalable architecture14:05
zigoI really see in the rabbit monitoring page that it's a broadcast thingy.14:05
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Ensure that ARP/NDP is enabled for vlan devices  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/93580114:07
zigoralonsoh: Thanks for the details, this really helps. We were kind of scared, I'm not anymore, and I can continue my upgrades! :)14:10
ralonsohyw14:11
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Ensure that ARP/NDP is enabled for vlan devices  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/93580114:33
opendevreviewSlawek Kaplonski proposed openstack/neutron master: Add limit of tags for every resource  https://review.opendev.org/c/openstack/neutron/+/93788714:34
opendevreviewRodolfo Alonso proposed openstack/neutron master: [eventlet-removal][OVN] Require wsgi start-time in the config  https://review.opendev.org/c/openstack/neutron/+/94012314:41
opendevreviewDmitriy Rabotyagov proposed openstack/ovn-bgp-agent master: Ensure that ARP/NDP is enabled for vlan devices  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/93580114:48
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: fix potential race condition with respawn  https://review.opendev.org/c/openstack/neutron/+/93962715:07
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: async_process: remove usage of eventlet for AsyncProcess  https://review.opendev.org/c/openstack/neutron/+/93934815:07
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: common: fix wait_until_true to support native thread  https://review.opendev.org/c/openstack/neutron/+/93784315:07
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/93932115:07
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776515:07
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: polling: remove usage of eventlet.sleep()  https://review.opendev.org/c/openstack/neutron/+/94013615:07
opendevreviewRodolfo Alonso proposed openstack/neutron master: WIP == [eventlet-removal] OVN hash ring manager reimplementation  https://review.opendev.org/c/openstack/neutron/+/94014015:20
opendevreviewRodolfo Alonso proposed openstack/neutron master: WIP == [eventlet-removal] OVN hash ring manager reimplementation  https://review.opendev.org/c/openstack/neutron/+/94014016:03
opendevreviewRodolfo Alonso proposed openstack/neutron master: DNM - Test "neutron-ovn-tempest-ipv6-only-ovs*" with WSGI  https://review.opendev.org/c/openstack/neutron/+/93997716:06
ralonsohotherwiseguy, ^^ this patch is on top of the re-implementation of the hash ring. Now we just initialize the hash ring but we don't refresh it16:06
* otherwiseguy looks16:07
ralonsohI need to leave now but I'll reconnect in 3 hours16:07
opendevreviewBrian Haley proposed openstack/neutron master: Optionally configure IPv6 metadata address  https://review.opendev.org/c/openstack/neutron/+/92649716:12
opendevreviewBrian Haley proposed openstack/neutron master: Optionally configure IPv6 metadata address  https://review.opendev.org/c/openstack/neutron/+/92649716:29
opendevreviewMerged openstack/neutron master: [eventlet-removal] Reimplement ``common.utils.spawn_n``  https://review.opendev.org/c/openstack/neutron/+/93909517:23
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: reimplement signals handling  https://review.opendev.org/c/openstack/neutron/+/93932118:16
opendevreviewSahid Orentino Ferdjaoui proposed openstack/neutron master: ovs: remove the usage of eventlet in the OVS agent  https://review.opendev.org/c/openstack/neutron/+/93776518:16
opendevreviewRodolfo Alonso proposed openstack/neutron master: DNM - Test "neutron-ovn-tempest-ipv6-only-ovs*" with WSGI  https://review.opendev.org/c/openstack/neutron/+/93997718:31
opendevreviewBernard Cafarelli proposed openstack/neutron master: DNM - Test "neutron-ovn-tempest-ipv6-only-ovs*" with WSGI  https://review.opendev.org/c/openstack/neutron/+/93997719:30
ralonsohbcafarel, thanks!19:30
bcafarelnp, I can still do 1-char typo fix :)19:31
opendevreviewJakub Libosvar proposed openstack/neutron master: Update NAT entry on FIP update  https://review.opendev.org/c/openstack/neutron/+/93991821:47
opendevreviewJakub Libosvar proposed openstack/ovn-bgp-agent master: Change DVR FIP events to monitor the NAT table  https://review.opendev.org/c/openstack/ovn-bgp-agent/+/94017422:46
opendevreviewMerged openstack/neutron master: Use is_cidr_host utils to detect if AAP ip is host in l3_dvr_db  https://review.opendev.org/c/openstack/neutron/+/93907523:50
