14:00:17 <ralonsoh> #startmeeting neutron_drivers
14:00:17 <opendevmeet> Meeting started Fri Dec 16 14:00:17 2022 UTC and is due to finish in 60 minutes.  The chair is ralonsoh. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:17 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:17 <opendevmeet> The meeting name has been set to 'neutron_drivers'
14:00:18 <mlavalle> o/
14:00:34 <ralonsoh> hello all
14:01:04 <lajoskatona> o/
14:01:53 <slaweq> o/
14:02:04 <obondarev> o/
14:02:14 <ralonsoh> Brian cannot attend today
14:02:24 <ralonsoh> So we are 5, I think we have quorum
14:02:30 <mlavalle> yeap
14:02:33 <ralonsoh> we have two topics today
14:02:42 <ralonsoh> first one
14:02:47 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1998609
14:02:57 <ralonsoh> racosta, please, you are welcome to present it
14:03:28 <racosta> ok, thanks. This RFE intends to implement distributed routing support for IPv6-only or dual-stack usage scenarios.
14:04:04 <racosta> The proposal is to introduce a new NAT rule for the IPv6 GUA addresses that are allocated to VMs.
14:04:54 <lajoskatona> is it only for OVN?
14:04:55 <racosta> The ovn-controller running on the chassis needs this rule to start responding to Neighbor Advertisements for IPv6
14:05:14 <racosta> Yes, only for OVN driver
14:05:55 <lajoskatona> thanks
14:06:09 <ralonsoh> why core OVN is not implementing this?
14:06:40 * mlavalle likes the pretty drawings
14:06:56 <racosta> On the OVN side, the NAT rule is the same for IPv4 or IPv6
14:07:51 <racosta> It would be more of a CMS task to program how OVN handles flows on the chassis
14:08:15 <ralonsoh> yeah but not the SB database
14:08:31 <ralonsoh> we should be modifying only the NB registers
14:08:42 <mlavalle> yeap
14:08:44 <racosta> no no, just in the NB database.
14:09:29 <racosta> The NAT rule is associated with the router at the northbound database level.
14:10:01 <ralonsoh> ok, is this doing any IPv6 NATing?
14:10:58 <racosta> there is no NAT for IPv6, so this special rule that translates the GUA address to itself is only used to create the flows on the chassis where the VM resides
14:11:46 <racosta> Without this rule, the compute node does not know how to respond for that IPv6 GUA, and the communication is centralized through the router's GW port.
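[Editor's note: for illustration, the self-referential dnat_and_snat entry racosta describes could be created at the NB level with ovn-nbctl roughly as below. The router name and addresses are made-up examples, and the exact rule the RFE programs via the OVN mech driver may differ; this requires a live OVN deployment and is not runnable standalone.]

```shell
# Sketch only: add a NAT rule on logical router "router1" that maps a VM's
# IPv6 GUA to itself, so the chassis hosting the VM answers NA for it.
# Syntax: ovn-nbctl lr-nat-add ROUTER TYPE EXTERNAL_IP LOGICAL_IP
ovn-nbctl lr-nat-add router1 dnat_and_snat 2001:db8::10 2001:db8::10
```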
14:12:00 <ralonsoh> ok, is it compatible with IPv4 FIP DVR?
14:12:47 <racosta> yes, absolutely. It does not change anything for IPv4 FIP DVR.
14:13:26 <ralonsoh> one question (for drivers): should this be configurable?
14:13:33 <slaweq> IIUC from the LP description, it requires NAT like IPv6 GUA to the same IPv6 GUA address, will it require any changes in Neutron API, or do You want to make such nat entry for each IPv6 address automatically?
14:14:00 <slaweq> or enabled/disabled by config option for all ports/IPs
14:14:01 <slaweq> ?
14:14:10 <ralonsoh> right, similar question
14:14:45 <racosta> I believe it must be configurable, because the user may be using n-d-r, for example.
14:15:02 <mlavalle> so a config knob?
14:15:54 <lajoskatona> but no API change or extension for it?
14:16:29 <ralonsoh> this is backend specific; actually, apart from n-d-r, other traffic should work the same in both scenarios
14:16:36 <racosta> The proposal is to enable it via the enable_distributed_ipv6 flag in the ml2_conf.ini file.
14:16:38 <ralonsoh> (that's what I think)
14:17:40 <racosta> Yes, other backends may need the current centralized behavior.
14:17:58 <slaweq> ok, thx for explanation
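[Editor's note: the knob racosta describes (enable_distributed_ipv6 in ml2_conf.ini) might be read as sketched below. This is a minimal illustration using stdlib configparser rather than Neutron's actual oslo.config machinery; the [ovn] section name and the False default are assumptions, not the final implementation.]

```python
import configparser

# Example ml2_conf.ini snippet; section name "[ovn]" is an assumption.
SAMPLE_ML2_CONF = """
[ovn]
enable_distributed_ipv6 = True
"""

def is_ipv6_dvr_enabled(conf_text: str) -> bool:
    """Return the proposed flag's value, defaulting to False when unset."""
    cfg = configparser.ConfigParser()
    cfg.read_string(conf_text)
    return cfg.getboolean("ovn", "enable_distributed_ipv6", fallback=False)

print(is_ipv6_dvr_enabled(SAMPLE_ML2_CONF))  # → True
print(is_ipv6_dvr_enabled("[ovn]\n"))        # → False (flag unset)
```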
14:18:20 <mlavalle> racosta: I assume this comes from a well known use case to you. How much of a gain / benefit does this improvement represent?
14:18:34 <mlavalle> have you tested it?
14:19:34 <racosta> Yes, I already tested it in a deployment with openstack yoga.
14:20:33 <mlavalle> and what was gained out of it?
14:20:43 <racosta> The benefits of implementing DVR for IPv6 are the same as for IPv4 FIP DVR, it is a more generalized case for dual stack.
14:21:58 <racosta> Performance gain by distributing north/south traffic per chassis, and configuration gain by reducing dynamic routing complexity in the fabric.
14:22:37 <mlavalle> no side effects so far? for how long?
14:24:34 <racosta> No side effects for now; I'm testing dataplane performance with the automated Shaker tool.
14:25:31 <lajoskatona> sounds great
14:25:48 <obondarev> no questions from me, looks reasonable and rather safe
14:25:53 <ralonsoh> do you think we can vote now?
14:25:55 <mlavalle> I'm ok with this.
14:25:56 <lajoskatona> Would be good to have some tests finally for it in upstream CI also
14:26:04 <lajoskatona> +1 from me
14:26:09 <ralonsoh> +1
14:26:11 <obondarev> +1
14:26:20 <mlavalle> racosta: if you can, please share your shaker results with us
14:26:36 <mlavalle> +1
14:26:45 * mlavalle formalizes his vote
14:26:50 <slaweq> +1
14:26:57 <ralonsoh> thanks! approved then
14:26:58 <ralonsoh> about the spec, do you think we need a spec for this RFE?
14:27:12 <mlavalle> no, it's pretty localized
14:27:19 <ralonsoh> I think it should be documented, that's all
14:27:26 <mlavalle> yeap
14:27:39 <obondarev> I think there is one already https://launchpadlibrarian.net/639474950/rfe_ovn_ipv6_dvr.rst
14:27:41 <mlavalle> and hopefully include the testing results in that document
14:28:13 <ralonsoh> racosta, why don't you present it?
14:28:20 <ralonsoh> obondarev, whose spec is this?
14:28:33 <obondarev> it's linked in the bug
14:28:46 <obondarev> I believe it's from racosta
14:28:56 <mlavalle> yeap
14:28:58 <obondarev> in the RFE*
14:29:10 <ralonsoh> ahh I didn't see c#3
14:29:41 <ralonsoh> racosta, ok, please, propose this spec formally (now you have it written)
14:29:51 <ralonsoh> I'll write in the bug how to do it
14:30:12 <mlavalle> and if you could add some testing results to it, it would be great
14:30:31 <mlavalle> racosta: thanks for this proposal. Nice!
14:31:06 <racosta> Ok, thank you very much. I will add the test results ;)
14:31:21 <ralonsoh> ok, thanks for your proposal. I'll update the bug
14:31:33 <ralonsoh> the next topic is
14:31:37 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1998608
14:31:45 <ralonsoh> quick summary
14:31:46 <mlavalle> racosta: to be clear, it is not a precondition. I just think it would be nice to share it with the community (and I'm curious)
14:32:32 <ralonsoh> we are "playing" with hardware offload cards with ML2/OVN. The problem we found is that QoS is still not working
14:32:57 <ralonsoh> HWOL drivers do not translate the OVS QoS rules to the interface
14:33:18 <ralonsoh> this is why I'm trying to create an OVN monitor
14:33:24 <ralonsoh> that will run on the compute node
14:33:38 <ralonsoh> this monitor will be generic (new features could be added)
14:33:49 <ralonsoh> by default, there will be NO need to spawn it
14:33:51 <slaweq> so You want to have yet another neutron-ovn-agent on each compute node
14:34:00 <slaweq> to do things like that on nodes, right?
14:34:03 <ralonsoh> yes
14:34:12 <ralonsoh> neutron-ovn-monitor-agent
14:34:22 <obondarev> and new RPC interface, correct?
14:34:25 <ralonsoh> no
14:34:35 <ralonsoh> no RPC, this is something we won't do
14:34:35 <slaweq> ok, dummy question - what about neutron-ovn-metadata-agent then? Can it be combined with this new "generic" agent maybe?
14:34:50 <ralonsoh> obondarev, anything we need should be stored in the OVN/OVS database
14:34:53 <ralonsoh> same as metadata
14:35:05 <mlavalle> reacting to ovsdb events
14:35:31 <obondarev> ok, so the new agent will talk only to local ovsdb?
14:35:36 <ralonsoh> slaweq, no, metadata is specific for metadata only. Mixing features in this agent (that is also mandatory) is not a good idea
14:35:48 <ralonsoh> obondarev, not local ovsdb only, also remote OVN database
14:35:53 <ralonsoh> same as metadata agent
14:35:57 <obondarev> ah, got it
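[Editor's note: the event-driven design discussed here (an agent reacting to OVSDB row events, as mlavalle notes) can be sketched in plain Python. This only mimics the shape of ovsdbapp-style row-event matching; all class and field names below are hypothetical, not the real ovsdbapp API.]

```python
# Hypothetical sketch of an agent dispatching OVSDB-style row events.
class RowEvent:
    """Match a (event, table) pair and run a handler, ovsdbapp-style."""
    def __init__(self, table, events):
        self.table = table
        self.events = set(events)

    def matches(self, event, table, row):
        return event in self.events and table == self.table

    def run(self, event, row):
        raise NotImplementedError

class QosBandwidthEvent(RowEvent):
    """React when a QoS row appears, e.g. to program a HWOL interface."""
    def __init__(self):
        super().__init__(table="QoS", events=("create", "update"))
        self.seen = []

    def run(self, event, row):
        self.seen.append((event, row["bandwidth"]))

def dispatch(handlers, event, table, row):
    """Fan a row change out to every handler whose filter matches."""
    for h in handlers:
        if h.matches(event, table, row):
            h.run(event, row)

handler = QosBandwidthEvent()
dispatch([handler], "create", "QoS", {"bandwidth": 10000})
dispatch([handler], "create", "Port_Binding", {"bandwidth": 0})  # ignored
print(handler.seen)  # → [('create', 10000)]
```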
14:36:15 <slaweq> I have one small concern about scaling of this
14:36:27 <ralonsoh> yes, I have this concern too
14:36:28 <slaweq> we know that many connections to the ovn dbs can cause issues
14:36:38 <mlavalle> I think it's a big concern
14:36:44 <slaweq> and this will be at least yet another connection from each compute node
14:36:52 <ralonsoh> yes, for sure. This is why this agent won't be necessary in all deployments
14:37:11 <ralonsoh> this monitor is necessary only, for now, for this specific feature
14:37:16 <ralonsoh> QoS with HWOL
14:37:23 <lajoskatona> but this is only necessary on hosts where hw offload cards are located, no?
14:37:27 <ralonsoh> yes
14:37:35 <ralonsoh> and if you want qos
14:37:46 <slaweq> for now, but who knows what else we will implement there :p
14:37:58 <ralonsoh> yes, of course
14:37:59 <lajoskatona> good comment
14:38:10 <ralonsoh> POC: (3 patches) https://review.opendev.org/c/openstack/neutron/+/866480
14:38:12 <slaweq> I'm not against but I just wanted to raise concern which I have :)
14:38:44 <mlavalle> as long as the trade offs are well documented so deployers can understand the potential cost, I think it would be ok
14:38:53 <obondarev> +1
14:38:54 <ralonsoh> yes, but overloading a necessary agent (metadata) that also has a specific task is not a good idea
14:39:37 <ralonsoh> and, btw, this is a kind of workaround until driver manufacturers fix the drivers
14:40:21 <slaweq> +1 from me but with proper documentation as mlavalle mentioned :)
14:40:40 <mlavalle> so: 1) it is optional 2) the performance trade offs are well documented. with this I'm +1
14:40:58 <ralonsoh> of course, it will be documented
14:41:59 <ralonsoh> any other question?
14:42:01 <lajoskatona> agree with the above limitations
14:42:46 <mlavalle> none from me
14:43:18 <slaweq> I'm good
14:43:19 <ralonsoh> so just to make it explicit, can you vote?
14:43:24 <mlavalle> +1
14:43:26 <slaweq> +1
14:43:49 <lajoskatona> +1
14:43:56 <obondarev> +1
14:44:01 <ralonsoh> (nothing from me, I proposed it)
14:44:08 <ralonsoh> so approved
14:44:20 <ralonsoh> do I need a spec?
14:45:07 <lajoskatona> good question
14:45:24 <obondarev> New agent seems a pretty big change, I believe some sort of design doc would be useful
14:45:30 <ralonsoh> I agree
14:45:37 <lajoskatona> as it introduces a new agent-like thing, it would be good to write one
14:45:48 <ralonsoh> I'll push a spec next week
14:45:52 <lajoskatona> thanks
14:45:57 <slaweq> thx
14:46:00 <mlavalle> I'd say given the potential performance concerns at scale, let's give ourselves and the community some time to think about it, and the spec process gives us that
14:46:17 <ralonsoh> thank you folks! I have nothing else in the agenda
14:46:23 <ralonsoh> so have a nice weekend!
14:46:29 <slaweq> thx, You too
14:46:31 <lajoskatona> Bye
14:46:32 <ralonsoh> #endmeeting