Tuesday, 2025-03-25

f0oHi, does openstack-ansible handle migration of network_interface_mappings? I would like to test out distributed FIPs but for that I need to move the Gateway Hosts' ports from the internal cross connect to the switch MLAG interface... I know this will cause downtime but wonder what possible manual steps are required to move the OVN ports around without neutron going nuts08:13
noonedeadpunkhey08:13
noonedeadpunkok, so you're using OVN? As I think in this case there might even be no downtime08:15
noonedeadpunkSo what would happen, is that the new mapped interface will be added to the bridge, but the old one will not be removed08:16
noonedeadpunkso pretty much you'd mainly need to worry about not making a loop with that08:16
f0ooh thats neat08:17
f0oit *shouldnt* loop but I can prepare for that by using packet filters on the crossconnect08:17
f0oguess I'll cause some havoc today08:18
noonedeadpunkthat's the task which ensures that behaviour: https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master/tasks/providers/setup_ovs_ovn.yml#L79-L9208:19
noonedeadpunkfrankly I don't remember any precautionary measures or logic before this task.... But I think it should always be executed...08:20
noonedeadpunkexcept might be broken with some tags, but not sure08:20
f0oI'll just limit the playbook to the standby gateway node and make sure they have packet filters on the crossconnect; then just perform a failover and do the same on the other side and finally run distributed fip on all compute nodes08:21
f0othat sounds rather safe08:22
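The rollout order f0o describes can be written down as a small sketch. This is only an illustration under stated assumptions: the host names (`gw1`, `gw2`, `c1`, `c2`) are hypothetical, the playbook name is the os-neutron-install playbook mentioned later in this log, and the packet-filter and failover steps are left as plain text because the log does not say how they are performed.

```python
def rolling_plan(standby_gw, active_gw, computes, playbook="os-neutron-install.yml"):
    """Return the ordered steps of the rollout described in the chat.

    Host names and the playbook file name are illustrative assumptions.
    """
    steps = [
        # Guard against loops before any bridge gains a second uplink.
        f"add packet filters on the cross connect of {standby_gw} and {active_gw}",
        # Touch only the standby gateway first.
        f"openstack-ansible {playbook} --limit {standby_gw}",
        # Move traffic to the already migrated node.
        f"fail over from {active_gw} to {standby_gw}",
        # Repeat on the former active gateway.
        f"openstack-ansible {playbook} --limit {active_gw}",
    ]
    if computes:
        # Finally roll distributed FIP out to the compute nodes.
        steps.append(f"openstack-ansible {playbook} --limit {','.join(computes)}")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(rolling_plan("gw2", "gw1", ["c1", "c2"]), start=1):
        print(f"{number}. {step}")
```

Keeping the plan as data makes it easy to review the order before anything is run; it does not execute any of the commands.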
noonedeadpunkso you have standalone gateway nodes which are not computes?08:22
f0oyep08:22
f0oand now I'm hitting conntrack issues haha08:23
noonedeadpunkare you planning to leave them or for cost efficiency get rid of them?08:23
noonedeadpunkoh?08:23
f0oI will leave them as they run edge-bgp as well08:23
f0othe architecture didnt account for OVN using kernel conntrack08:23
noonedeadpunkah, so you use ovn-bgp-agent? or old dragent?08:23
f0oneither08:23
noonedeadpunkaha08:23
noonedeadpunkok then :D08:23
noonedeadpunkas yeah, was surprised about conntrack before you got bgp involved08:24
f0ogateway nodes are the actual core routers that sit on all transit/ix'es/pnis/...08:24
noonedeadpunkaha08:24
f0owe assumed OVN would just do p2p vteps and then offload the packets on the gateway nodes, which was sort of correct but it also utilizes conntrack for it which now blows up at scale08:24
f0oso we're going to see if distributed fip solves it. if not, we get to figure out how to configure ovn-bgp08:25
noonedeadpunkok, that is super interesting actually08:26
noonedeadpunk'08:26
noonedeadpunkyeah, from my experience ovn-bgp is a little bit too edgy right now in terms of code maturity08:27
noonedeadpunkbut we made it work after all08:27
f0oconntrack on primary is at 3368855 and on the secondary 82416708:27
noonedeadpunkthough we're actually using gateway nodes as just "gateways" - which have connectivity to public vlans08:28
noonedeadpunkso it seems that what you see with conntrack is extremely relevant for our setup08:28
f0oif you use distributed fip our understanding is that each compute node will handle their own conntrack since you span the public vlan to all nodes08:29
f0othe drawback is that you are spanning the whole vlan across all nodes... which is really not ideal at scale either08:29
noonedeadpunkyes, that was exactly the issue that we don't want to stretch vlan08:30
f0ohonestly same here08:30
noonedeadpunkas public network is shared between multiple datacenters08:30
noonedeadpunkalso, I was able to convert an environment from normal ovn to ovn-bgp with almost no downtime08:31
noonedeadpunkthough keep in mind, it was only a "test" one, with limited amount of workloads08:31
noonedeadpunkso no idea how it will run at scale08:32
noonedeadpunkso such migration is totally doable fwiw08:32
noonedeadpunkwas a matter of getting an absolutely working configuration and running an os-neutron-install playbook08:33
f0ohow does that work, do the gateway nodes talk to the compute nodes? and the gateway nodes then dump it on the wire?08:34
noonedeadpunkthe biggest hassle was to get a "correct" version of agent/neutron...08:34
f0oor do all nodes (gateway+compute) talk to a core-router?08:34
f0oreason I ask is the gateway and my core-router are one and the same. Running two bgpd processes on the same box will likely conflict08:34
f0olet alone figuring out how ovn-bgp handles VRFs in case they do need to run on the same box08:35
noonedeadpunkso you get a new neutron-ovn-bgp-agent service. it listens to events on the NB DB and executes "actions" based on the event08:35
noonedeadpunkit also leverages/requires FRR 08:35
f0ois this inside an lxc or on the host?08:35
noonedeadpunkWhere you run it - depends on the setup. So in our case with no distributed FIPs and standalone gateway nodes - we need to run it only on gateway nodes08:36
noonedeadpunkif you do distributed fips - also on computes08:36
noonedeadpunkit is on the host, as the agent needs to mess around and inject flows into OVS to eject traffic from it to kernel networking08:37
f0ohow does this work traffic flow wise? because it sounds almost identical to how non-bgp non-distributed-fips work08:37
noonedeadpunkso it depends on exposure_method. 08:38
f0oif only the gateway nodes speak BGP to presumably the core-router, that only eliminates the need to span the vlan to all gateway nodes. But the gateway node <> compute would still use conntrack then just like now, no?08:38
noonedeadpunkin case of "ovn" - you need an extra ovn "cluster" per node08:39
noonedeadpunkin case of "underlay" - yes, it will be pretty much the same08:39
f0oI guess to fully remove conntrack here, or at very least offload as much as possible, I'd need the compute nodes to speak BGP to the core-routers for their VM's FIPs.08:39
noonedeadpunkin case of "vrf" - agent will maintain and split networks into multiple VRFs as well08:40
noonedeadpunkyeah, I think so. But then probably it makes sense also to use computes as gateway hosts?08:40
f0oprobably!08:41
noonedeadpunkas why I started asking, I was trying to understand if there's a good use-case to have distributed fips, but also standalone gateway nodes08:41
noonedeadpunkas if you have to add vlan to all computes....08:42
f0oI guess the use-case is to stop fires from spreading haha08:42
noonedeadpunkthen why waste rack space for gateway nodes kinda08:42
f0othat is correct08:42
noonedeadpunkbut when they act as leaves or core routers - then yeah08:42
f0oI will look into how to hook up the compute nodes (and/or collapsed gateway nodes) to the core-routers with BGP so each node just announces their respective FIPs/NATs08:43
noonedeadpunkso that is what ovn-bgp-agent does kinda08:43
noonedeadpunkso it takes a port binding from ovn nb and adds a respective route to the vrf where it ejects traffic from ovs08:43
noonedeadpunkand this route is being picked up by frr08:44
noonedeadpunk(unless it's "ovn" exposure method I guess - I just never looked into it in details)08:44
noonedeadpunkand then a bgp session is established for each compute with announcements of own fips08:45
noonedeadpunkbut it does work pretty much same way without BGP tbh :D08:45
noonedeadpunkonly you need vlans08:45
noonedeadpunkso it's kinda about - where you want to build a complexity08:46
f0owell for BGP sessions I can reuse the management vlan08:46
f0othat's stretched everywhere anyways08:46
f0osame as the vxlan vlan08:46
f0oso there are ways to aggregate/reuse things there08:46
noonedeadpunkone big downside of ovn-bgp-agent so far, is that ovn knows nothing about its existence. so in case frr or the agent dies for some reason - networks/fips just become unreachable08:47
f0oI wonder08:47
f0ocan I run a half ovn-bgp-agent?08:47
noonedeadpunkwhich half?:)08:47
f0obecause what you say and what the docs suggest is that you have ovn create route entries into the kernel for the FIPs08:47
f0othis would entirely bypass conntrack08:47
f0omy current gateway nodes can pick up those entries already08:48
f0oso can I skip FRR?08:48
noonedeadpunkI don't think you can08:48
noonedeadpunkAs it also tries to inject things into frr from time to time08:48
noonedeadpunkso in case vtysh is not there - it will just crash08:49
f0oshame08:49
noonedeadpunkyou can probably have some kind of "fake" frr running...08:49
noonedeadpunkwhich does have only noop neighbour or none at all08:50
f0othe loops and hoops one must go through to remove conntrack08:50
noonedeadpunkas in case of underlay exposure method - I don't think it injects there smth very meaningful08:50
f0oand this doesnt even guarantee that conntrack is actually removed, I'm sure the computenodes just have them then08:51
noonedeadpunkI don't think it's removed tbh08:51
f0oso ultimately this is just to remove the L2 for a L308:51
noonedeadpunkas I was able to break networks with missing iptables rules for the FORWARD chain08:51
noonedeadpunkyeah08:51
*** kleini_ is now known as kleini08:53
noonedeadpunkand tbh, I'd prefer having just l2 setup... but it's me :)08:54
noonedeadpunkas I don't need to deal with network side of things with it - it's on other teams then :D08:55
f0oyeah I think I'm doing a full 180 to 360 here because I'm back at distributed fip and biting the bullet of stretching vlans... However I can use the gateway nodes to segment it a bit... give each rack a /24 or so and only care for a vlan per rack and then the gateway nodes which already speak bgp just do their thing as they are now08:56
f0oin an ideal world, I could dump routes from OVN NB like ovn-bgp-agent does onto the vxlan bridge of the gateway node and use distribute connected in my gateway node's bgpd. This would be the half-ovn-bgp-agent haha08:57
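The "give each rack a /24" idea is easy to sketch with Python's standard `ipaddress` module. This is a minimal illustration, not anything from the deployment discussed here: the supernet `10.20.0.0/22` and the rack names are placeholders, and a real FIP pool would use the operator's own public range.

```python
import ipaddress


def per_rack_subnets(supernet, racks, prefixlen=24):
    """Carve one subnet of the given prefix length per rack out of a supernet.

    The supernet and rack names are illustrative placeholders.
    """
    network = ipaddress.ip_network(supernet)
    subnets = list(network.subnets(new_prefix=prefixlen))
    if len(racks) > len(subnets):
        raise ValueError(
            f"{supernet} only holds {len(subnets)} /{prefixlen} subnets, "
            f"but {len(racks)} racks were given"
        )
    # Racks are paired with subnets in order; unused subnets stay free.
    return dict(zip(racks, subnets))


if __name__ == "__main__":
    for rack, subnet in per_rack_subnets("10.20.0.0/22", ["rack-a", "rack-b", "rack-c"]).items():
        print(rack, subnet)
```

With one subnet per rack, each public vlan only has to span a single rack, while the gateway nodes keep announcing the ranges over their existing BGP sessions.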
f0onoonedeadpunk: https://github.com/search?q=repo%3Aopenstack%2Fovn-bgp-agent%20run_vtysh_command&type=code do I read this right that vtysh is solely used to obtain the router_id?09:01
f0oI can 100% mock this09:01
f0ooh I see it also changes FRR's config every so often09:03
f0ohrm maybe I just fork this and make it vty-less09:03
f0oHereBeDragons09:03
noonedeadpunkYeah, it uses frr for more than fetching the id for sure....09:16
noonedeadpunkbut again, I guess you could just have frr that does mainly nothing?09:16
f0ohttps://github.com/openstack/ovn-bgp-agent/blob/master/ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py only utilizes vtysh for a quick config sanity check which I can just patch out with a config flag for instance09:17
noonedeadpunkas if you have no peers - I think it will still likely start and run09:17
f0oI need to draw this up and see what I'm actually trying to fix and which driver is the one I need to meddle with09:18
noonedeadpunkbut it would very depend on exposure method tbh. as in case of vrf it's also being used to handle bgp evpn between computes09:19
noonedeadpunkso that traffic would flow directly if it's "inside" of the public network09:19
f0oin an ideal world, I'd have my ToR-routers listen for any bgp peers on the vxlan subnet. The compute nodes would then peer with them and announce their respective FIPs. From what I understand this is the stretched-l2 bgp driver?09:22
noonedeadpunkum, probably?:)09:22
f0oI'm going back and forth between the docs on the drivers and they all start to sound the same xD09:23
noonedeadpunkJust skip the SB DB driver09:23
noonedeadpunkand all related to it09:23
noonedeadpunkI think mainly maintained is NB DB, but also keep in mind there's another level of complexity in each driver, which is "exposure_method"09:24
noonedeadpunkand it changes the driver behaviour really dramatically09:24
noonedeadpunkbut keeping the purpose of the driver the same09:24
f0owhat a rabbit hole09:25
jrosserit does surprise me just how complicated this ovn stuff all is11:33
noonedeadpunkwell... I think it depends. If you take a more or less regular ovs setup - ovn seems like a way better choice in general. And somehow a simpler one. Until it works11:49
noonedeadpunkIf it doesn't and you need to dig in flows - that's /o\11:50
noonedeadpunkand comparing to lxb that all is a mess11:50
noonedeadpunkI still don't get why 80% of deployments need anything but just lxb (if it was supported)11:51
mgariepyflows are a lot harder to parse by dumping them compared to plain iptables stuff ;)12:27
f0o^ 100%13:44
f0oI've managed to drop my conntrack from 4.5M to 800K by making all timeouts 3s and removing retrans. This seemingly had no effect on the network stability - but it ofc now requires a much more stable upstream network13:46
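To put numbers like these in context, a small helper can read the kernel's conntrack counters and compute the relative drop. This is a sketch under assumptions: it relies on the standard `nf_conntrack_count` and `nf_conntrack_max` entries under `/proc/sys/net/netfilter`, which only exist when the conntrack module is loaded, and the function names are made up for illustration. It does not change any timeouts.

```python
from pathlib import Path


def conntrack_usage(base="/proc/sys/net/netfilter"):
    """Return (entries, limit, fill ratio) of the kernel conntrack table."""
    base = Path(base)
    count = int((base / "nf_conntrack_count").read_text().strip())
    limit = int((base / "nf_conntrack_max").read_text().strip())
    return count, limit, count / limit


def reduction(before, after):
    """Relative reduction between two table sizes, as a fraction of 'before'."""
    return (before - after) / before


if __name__ == "__main__":
    count, limit, ratio = conntrack_usage()
    print(f"conntrack: {count}/{limit} entries ({ratio:.1%} full)")
    # Figures quoted in the chat: 4.5M entries before tuning, 800K after.
    print(f"drop after tuning: {reduction(4_500_000, 800_000):.0%}")
```

Going from 4.5M to 800K entries is a reduction of roughly 82%, which fits the observation that the table size fell sharply while throughput did not improve.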
f0ohowever, while conntrack is now lower it did not ultimately increase network performance by any notable margin. I think the main reason is still the immense amount of context switching ovs-vswitchd performs here to move packets around and 32 dedicated cores just arent enough to surpass 12G13:53
f0oI hope moving forward with distributed fips would offload some of that load onto the compute nodes resulting in a higher BW capacity13:54
f0oeven if it seems less efficient at a glance and having to deal with the stretched l2 pains13:55
f0oI honestly wonder how operators at scale solved this congestion13:56
f0oovn-bgp-agent seems to just solve the L2 part by swapping to L3 but the packet pushing load of ovs-vswitchd and conntrack needs are still there, just at a different level13:57
f0ohow do "they" cope with the resource steal on the compute nodes then?13:57
f0os/cope/deal/g13:58
noonedeadpunkyeah, that is actually a great question indeed14:06
noonedeadpunkfrom what I got talking to RH folks lately about ovn - they very rarely do have standalone net nodes14:06
noonedeadpunkmostly it's spread between all computes14:06
noonedeadpunkso I guess that distributed fip is likely an answer indeed14:07
noonedeadpunkand also all computes serving as gateway hosts14:07
noonedeadpunk#startmeeting openstack_ansible_meeting15:01
opendevmeetMeeting started Tue Mar 25 15:01:48 2025 UTC and is due to finish in 60 minutes.  The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot.15:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.15:01
opendevmeetThe meeting name has been set to 'openstack_ansible_meeting'15:01
noonedeadpunk#topic rollcall15:01
noonedeadpunko/15:01
NeilHanlono/ was just about to poke you! :D 15:01
jrossero/ hello15:01
noonedeadpunksorry for being a bit late :)15:02
NeilHanloni only just got here 3 seconds before, so... ;) 15:02
noonedeadpunk#topic office hours15:04
noonedeadpunkso, I think main question for today is PTG and its planned time?15:05
noonedeadpunkgiven that we're still lacking a good agenda for it... and last year's participation was not high...15:05
noonedeadpunkI'd guess we should be fine with 2h timeslot?15:05
jrosserwe will be missing andrew this time i think15:06
NeilHanlon2h sounds good, i'll make it :) 15:06
noonedeadpunk15:00 - 17:00 UTC?15:07
noonedeadpunkTuesday?15:07
noonedeadpunkor 16-17 to reduce conflicts a little bit?15:08
* jrosser tries to work out even which dates this is15:08
noonedeadpunkApril 7-1115:09
noonedeadpunkSo April 815:09
noonedeadpunkor well. we can do this April 7 on Monday - these slots are very empty there15:09
noonedeadpunk#link https://ptg.opendev.org/ptg.html15:10
noonedeadpunkI'm just thinking that eventlet one might be an interesting one for sure15:10
jrosser8th is difficult for me15:10
noonedeadpunkshould I create some kind of poll and ask for some votes through ML?15:11
noonedeadpunk(ie doodle or smth)15:12
noonedeadpunkor what would fit you best ?15:16
jrosser9th is pretty clear15:17
NeilHanlonwhatever works for yall i can make work15:17
noonedeadpunk15 - 17?15:17
NeilHanlon+1 here15:18
jrosseryeah thats ok15:18
noonedeadpunkI'm fine with 9th15:18
noonedeadpunkok, agreed then15:18
noonedeadpunkI will try to come up with some kind of agenda this week15:19
noonedeadpunkand book a timeslot :)15:19
noonedeadpunkanything else to discuss now?15:20
jrosseronly that we need extra attention on code review for the next few months15:20
noonedeadpunkyes15:20
jrosserandrewbonney is on paternity leave for ~4mo and has been very active reviewing15:21
noonedeadpunkand some reviews are highly appreciated15:21
noonedeadpunkespecially we're closer and closer to the release date15:21
jrosserso we have to make up the effort from elsewhere during his absence15:21
noonedeadpunkand our review dashboard can help to figure out outstanding things quite quickly15:22
noonedeadpunk#link https://openinfra.org/cla/15:22
noonedeadpunk#link http://bit.ly/osa-review-board-v515:22
noonedeadpunkthat is the correct one ^15:22
noonedeadpunkignore CLA :D15:22
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-ops master: Add a collection for managing encryption of secret data  https://review.opendev.org/c/openstack/openstack-ansible-ops/+/94386615:24
noonedeadpunkalso - once https://review.opendev.org/c/openstack/openstack-ansible/+/945115 lands - I want to issue a beta release15:25
NeilHanlonawesome :) 15:25
NeilHanlonI will pick up some slack for Andrew15:25
noonedeadpunk(or well, I'd need to push another patch for beta to freeze roles)15:26
noonedeadpunk(and make beta out of it)15:26
noonedeadpunkbut https://review.opendev.org/c/openstack/openstack-ansible-plugins/+/945089 needs to land to create such freeze15:27
noonedeadpunkgiven that we had quite slow-paced cycle - hopefully we'd be able to release early at least this time...15:27
noonedeadpunkbut yes - please, go through the review dashboard once in a while :)15:30
noonedeadpunkthere are couple of nice changes lying around, fwiw15:32
noonedeadpunkand https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/942581 is a one with potential side-effects15:33
noonedeadpunk(or well, there's a whole topic around this one)15:36
NeilHanlon:) 15:39
noonedeadpunkI guess I'm not sure why https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/942783/ is failing though...15:41
noonedeadpunkoh, well15:42
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_nova master: Switch volume catalog_type to block-storage  https://review.opendev.org/c/openstack/openstack-ansible-os_nova/+/94547615:45
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder master: Disable v3 endpoints by default  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/94278315:46
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Align on cinder service naming  https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/94258215:47
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_trove master: Switch volume catalog_type to block-storage  https://review.opendev.org/c/openstack/openstack-ansible-os_trove/+/94547715:49
opendevreviewDmitriy Rabotyagov proposed openstack/openstack-ansible-os_cinder master: Disable v3 endpoints by default  https://review.opendev.org/c/openstack/openstack-ansible-os_cinder/+/94278315:50
noonedeadpunk#endmeeting15:53
opendevmeetMeeting ended Tue Mar 25 15:53:59 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)15:53
opendevmeetMinutes:        https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-03-25-15.01.html15:53
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-03-25-15.01.txt15:53
opendevmeetLog:            https://meetings.opendev.org/meetings/openstack_ansible_meeting/2025/openstack_ansible_meeting.2025-03-25-15.01.log.html15:53
noonedeadpunkfwiw: https://openinfra.org/blog/openinfra-summit-202516:24
noonedeadpunknext openinfra summit EU is in Paris in October16:24
admin1will be there :) 18:12
noonedeadpunkI plan to go there as well18:24
noonedeadpunkbut will see18:24
