14:00:12 <mlavalle> #startmeeting neutron_drivers
14:00:12 <opendevmeet> Meeting started Fri Jul  9 14:00:12 2021 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:12 <opendevmeet> The meeting name has been set to 'neutron_drivers'
14:00:18 <ralonsoh> hi
14:00:20 <fnordahl> o/
14:00:51 <mlavalle> Good morning / afternoon / evening
14:00:52 <haleyb> hi
14:00:56 <amotoki> hi
14:01:16 <rubasov> o/
14:01:52 <lajoskatona> Hi
14:02:14 <manub> hi
14:02:29 <dmitriis> o/
14:03:35 <obondarev> hi
14:03:48 <mlavalle> let's see if yamamoto or njohnston show up. We need one of them to discuss the first RFE on our agenda. Let's give them a few minutes
14:04:03 <mlavalle> https://review.opendev.org/admin/groups/5b063c96511f090638652067cf0939da1cb6efa7,members
14:04:07 <ralonsoh> nate is on PTO
14:05:00 <mlavalle> let's wait a few minutes. If we don't get any of them, we can discuss the other two RFEs in our agenda since we have quorum for those
14:08:32 <mlavalle> fnordahl, dmitriis: are you here to discuss a RFE? I don't want to have you waiting until the end
14:08:46 <dmitriis> yes
14:08:58 <fnordahl> yes
14:09:16 <mlavalle> ok, I see fnordahl filed https://bugs.launchpad.net/neutron/+bug/1932154
14:09:23 <mlavalle> so let's start there
14:09:44 <mlavalle> #topic RFEs
14:10:08 <mlavalle> https://bugs.launchpad.net/neutron/+bug/1932154 is up for discussion now
14:11:03 <ralonsoh> I have one question about the spec (about the architecture)
14:11:16 <dmitriis> sure
14:11:26 <ralonsoh> in the schema, the "SmartNIC DPU Board" is hosting the OVN processes (controller, vswitchd, etc)
14:11:36 <ralonsoh> but is it hosting any Neutron process?
14:12:00 <opendevreview> Lajos Katona proposed openstack/networking-bagpipe master: Follow pyroute2 changes  https://review.opendev.org/c/openstack/networking-bagpipe/+/800062
14:12:09 <dmitriis> ralonsoh: We aim to avoid it in the OVN case. At least the goal is to not introduce any Neutron agents
14:12:30 <dmitriis> at the moment there is a discussion upstream around representor plugging at the SmartNIC DPU side
14:12:38 <ralonsoh> ok so it is running the backend, not the orchestrator processes
14:12:44 <ralonsoh> perfect then
14:13:10 <dmitriis> yes, extra comms between Neutron and every SmartNIC DPU wouldn't be good
14:13:42 <ralonsoh> so far, I'm ok with the RFE (and the spec although I need to review it again)
14:14:35 <dmitriis> we are actively discussing the OVN bits upstream since a lot depends on them
14:14:47 <amotoki> if it is only related to the backend, why do you need to use "binding:profile" (i.e. API visible field)? Can't you pass information from neutron (driver) to the backend?
14:15:07 <amotoki> I understand you need to pass information on host to the backend (ovn).
14:15:31 <ralonsoh> (I think this is for Nova)
14:15:47 <fnordahl> as I understand it this is part of the current Nova -> Neutron communication which we continue to use here
14:16:12 <dmitriis> the board serial number is needed to get an OVN chassis hostname
14:16:26 <dmitriis> in case of port deletion that info needs to be present in the port as well
14:16:53 <dmitriis> otherwise we don't know which SmartNIC DPU host should handle representor unplugging
14:18:31 <dmitriis> (if that makes sense)
14:19:13 <fnordahl> The OVN mechanism driver would most likely need it throughout the lifetime of the port, to know what requested-chassis to put on the LSP
14:19:44 <amotoki> I see. no neutron agent like sriov agent, so you need to pass host information to OVN via the  ovn mech driver running in the API server.
14:19:55 <fnordahl> yes
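The lookup described above (board serial number in "binding:profile" -> OVN chassis hostname -> requested-chassis on the LSP) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the OVN mechanism driver's actual code: the profile key "board_serial", the function name and the chassis_by_serial mapping are all hypothetical; only "binding:profile" itself comes from the discussion.

```python
from typing import Mapping, Optional


def requested_chassis(port: Mapping[str, object],
                      chassis_by_serial: Mapping[str, str]) -> Optional[str]:
    """Return the chassis hostname that should handle this port, if known.

    Assumption: the board serial number is stored in the port's
    "binding:profile" under the hypothetical key "board_serial".
    """
    profile = port.get("binding:profile") or {}
    serial = profile.get("board_serial")  # hypothetical key name
    if serial is None:
        # Not a SmartNIC DPU port: nothing to request.
        return None
    # Hypothetical mapping from board serial number to OVN chassis hostname.
    return chassis_by_serial.get(serial)


# Because the serial stays in the port, the same lookup works at unplug time.
chassis = {"SN-0001": "dpu-host-a"}
port = {"id": "p1", "binding:profile": {"board_serial": "SN-0001"}}
print(requested_chassis(port, chassis))
```

Keeping the serial number in the port for its whole lifetime is what lets the same function answer "which SmartNIC DPU host unplugs the representor" on port deletion.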
14:20:38 <mlavalle> and that is why you have a conversation upstream with ovn, to be able to coordinate that, correct?
14:21:08 <dmitriis> yes, we need an entity that would do the job similar to os-vif
14:21:25 <mlavalle> how is that conversation going?
14:21:44 <fnordahl> Our conversation with OVN is to get code to look up representor ports running alongside / as part of the ovn-controller and extending the CMS API to specify how to find it.
14:21:51 <mlavalle> do you think you will get the necessary support implemented?
14:22:48 <fnordahl> Our target is to get it into OVN 21.09, it has been a conversation since May, and recently we have made progress with a few compromises from both sides suggested. We don't have a +2/-2 as of now, but I feel confident we will be able to move forward.
14:23:38 * mlavalle shudders at the thought of having to get approvals from ovn upstream, nova and worst of all, neutron ;-)
14:23:51 <fnordahl> add libvirt to the mix
14:23:53 <fnordahl> ;)
14:23:58 <mlavalle> LOL
14:24:01 <dmitriis> mlavalle: yes, that's a tricky feature to coordinate
14:24:36 <dmitriis> the libvirt part is just about retrieving PCIe VPD and exposing it
14:24:44 <mlavalle> I'm fine with this RFE. I propose that we continue the detailed conversation in the spec
14:24:49 <ralonsoh> right
14:24:51 <dmitriis> we had a conversation about it already in the ML and it was approved
14:25:03 <amotoki> which ML?
14:25:16 <fnordahl> ^ I think dmitriis is talking about libvirt?
14:25:18 <dmitriis> https://listman.redhat.com/archives/libvir-list/2021-May/msg00873.html
14:25:33 <dmitriis> let me find a reply as well
14:25:44 <amotoki> thanks
14:26:26 <dmitriis> https://listman.redhat.com/archives/libvir-list/2021-June/msg00037.html
14:26:48 <amotoki> there are several components involved. where can we get the whole picture on this? I think it is an important thing for people to understand how it works.
14:27:55 <dmitriis> amotoki: the Nova spec contains a lot of information https://review.opendev.org/c/openstack/nova-specs/+/787458. It originally contained more Neutron and OVN pieces but we removed them during the review to keep it Nova-related
14:28:09 <mlavalle> amotoki: I agree with you. Should we strive to get that whole picture in the spec phase?
14:28:39 <fnordahl> The nova spec still has schematics (ascii) of the whole thing if I don't misremember dmitriis?
14:28:56 <fnordahl> https://review.opendev.org/c/openstack/nova-specs/+/787458/9/specs/xena/approved/integration-with-off-path-network-backends.rst line 532
14:29:14 <dmitriis> fnordahl: yes there's a bit more information in it regarding certain pieces
14:29:29 <fnordahl> amotoki: mlavalle: would that schema help?
14:31:35 <mlavalle> I might ask for more in the spec, but at first glance it looks like a good start to me
14:31:43 <amotoki> I cannot read through it right now, but it is okay as it looks like it covers the main picture. if we find something missing, we can cover it in the neutron spec (or add smth to the nova spec if needed)
14:31:56 <amotoki> we can look thru the nova spec when reviewing the neutron spec.
14:31:58 <fnordahl> ack
14:32:10 <dmitriis> mlavalle, amotoki: ack
14:32:42 <amotoki> I am feeling okay with this rfe.
14:33:51 <mlavalle> yeah, we will have to read also at least the nova spec
14:34:06 <mlavalle> haleyb: what are your thoughts?
14:34:30 <haleyb> i'm +1 on it, just trying to read through the spec too
14:34:45 <haleyb> complicated with all the dependencies
14:35:36 <mlavalle> ok, I think we have the votes necessary to approve this RFE today. Thanks fnordahl and dmitriis for this proposal. We'll see you in gerrit in the spec conversation
14:35:46 <amotoki> fnordahl, dmitriis: the subject of the rfe says Off-path SmartNIC Port Binding, but can we add "with OVN"?
14:36:02 <dmitriis> sure
14:36:28 <dmitriis> done
14:36:36 <fnordahl> thank you all for taking the time to review and discuss it with us
14:36:38 <amotoki> dmitriis: thanks. it highlights the scope more :)
14:37:24 <dmitriis> Yes, thanks a lot for considering it. The input is much appreciated as there are many pieces to this effort.
14:37:45 <mlavalle> good luck herding all these cats!
14:38:04 <dmitriis> 👍
14:38:11 * mlavalle shudders again
14:38:22 <fnordahl> :) I might just add professional cat herder to my CV after this
14:38:52 <mlavalle> ok, next up for discussion is https://bugs.launchpad.net/neutron/+bug/1933222
14:40:02 <ralonsoh> I have a couple of questions (that I'll add to the LP bug)
14:40:23 <ralonsoh> 1) shouldn't it be called metadata aggregation? this is what he is proposing
14:40:37 <ralonsoh> 2) what is the resource (memory) saving we gain?
14:40:58 <ralonsoh> I saw that an haproxy is using around 2MB
14:41:17 <ralonsoh> we can have tens of them in one host without any problem
14:41:32 <ralonsoh> that's all from my side
14:42:36 <mlavalle> he also raises reliability concerns in his rfe
14:43:13 <obondarev> RPC chattiness is another concern as I see
14:43:50 <obondarev> if not the main
14:44:54 <ralonsoh> because if I'm not wrong, what he is proposing is NOT having a single metadata agent in one single place
14:45:11 <ralonsoh> but one agent with one haproxy per host, right?
14:45:32 <mlavalle> correct.... there will be an ovs bridge in each compute with flows properly set up
14:45:33 <ralonsoh> (from c#3)
14:45:57 <ralonsoh> so I don't see too much benefit on this (at least in memory or CPU usage)
14:46:06 <haleyb> is there a spec?  i guess i would like to see a diagram better describing what this META_IP range is and how the mapping is going to work
14:46:09 <ralonsoh> no
14:47:16 <obondarev> so there is meta agent and meta proxy now, the RFE is going to replace meta agent only IIUC
14:47:41 <mlavalle> same understanding
14:47:45 <lajoskatona> I think this proposal is in line with the distributed dhcp ( https://bugs.launchpad.net/neutron/+bug/1900934 )
14:48:20 <lajoskatona> to have as few agents as possible, and that is ovs-agent
14:48:21 <mlavalle> yeah, it is really an overall effort on yulong's part to distribute a lot of the functions
14:49:45 <ralonsoh> ok this is what I thought initially but I didn't read that in the RFE
14:50:12 <mlavalle> it is part of a broader picture
14:50:55 <amotoki> my understanding is the same as yours. it tries to replace the metadata agent with a per-host agent-like feature. I am not sure who plays the role of the metadata agent, i.e. who launches the local haproxy.
14:51:04 <amotoki> but perhaps it can be covered by a spec.
14:51:09 <liuyulong_> Hi guys,
14:51:29 <ralonsoh> hi, we are discussing https://bugs.launchpad.net/neutron/+bug/1933222
14:51:41 <liuyulong_> amotoki, it is ovs-agent
14:52:04 <liuyulong_> The datapath is VM -> br-int -> br-meta -> tap-meta -> host haproxy.
14:52:24 <mlavalle> and all that path is in the same host
14:52:37 <liuyulong_> Yep
14:52:46 <ralonsoh> so you want OVS agent to control the metadata proxy (I didn't read that in the RFE)
14:53:07 <obondarev> so may it result in more time needed to handle each new port?
14:53:08 <liuyulong_> ralonsoh, "So, I'd like to request for implementing an agent extension for Neutron
14:53:08 <liuyulong_> openvswitch agent to make the metadata datapath distributed."
14:53:10 <obondarev> I mean overall
14:53:48 <obondarev> currently the load is kind of shared between metadata/dhcp/l3 agents and ovs agent
14:53:49 <liuyulong_> obondarev, No, at least for our cloud.
14:54:17 <obondarev> if ovs agent becomes responsible for all this..
14:54:20 <liuyulong_> We implemented this about 2 years ago, like the distributed DHCP extension for ovs-agent.
14:55:01 <obondarev> so did you measure port provisioning time?
14:55:45 <liuyulong_> compared to the metadata agent, this is more reliable and faster.
14:55:46 <mlavalle> if you did, it would be useful data in the rfe / spec
14:56:03 <liuyulong_> The original metadata workflow is:
14:56:41 <liuyulong_> VM -> router namespace -> haproxy -> metadata agent
14:57:15 <mlavalle> with node hops in the middle
14:57:16 <liuyulong_> #link https://review.opendev.org/c/openstack/neutron/+/633871
14:58:10 <liuyulong_> This is the fix for the race condition between VM booting and L3 agent processing the router ( creating "router namespace + haproxy" ).
14:59:23 <ralonsoh> one question: that feature will be independent of the L3 deployment (legacy, DVR, HA), since now you don't need the router
14:59:26 <ralonsoh> right?
14:59:52 <liuyulong_> "router namespace -> haproxy -> metadata agent": if any of these points goes down, it affects VMs on all hosts.
15:00:16 <ralonsoh> but with your feature that won't be needed
15:00:19 <ralonsoh> right?
15:00:24 <liuyulong_> Yes
15:00:28 <ralonsoh> perfect
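The reliability argument made above can be modelled with a small sketch. It only encodes the claims from this discussion (every hop of the proposed path is on the VM's own host, while a failure in the original path affects VMs on all hosts); the "shared"/"local" labels, the function name and the host names are assumptions made for illustration, not Neutron code.

```python
from typing import Dict, Iterable, Set

SHARED, LOCAL = "shared", "local"

# Original path: VM -> router namespace -> haproxy -> metadata agent
ORIGINAL_PATH: Dict[str, str] = {
    "router namespace": SHARED,
    "haproxy": SHARED,
    "metadata agent": SHARED,
}

# Proposed path: VM -> br-int -> br-meta -> tap-meta -> host haproxy
PROPOSED_PATH: Dict[str, str] = {
    "br-int": LOCAL,
    "br-meta": LOCAL,
    "tap-meta": LOCAL,
    "host haproxy": LOCAL,
}


def affected_hosts(path: Dict[str, str], failed_hop: str,
                   failed_on: str, all_hosts: Iterable[str]) -> Set[str]:
    """Return the compute hosts whose VMs lose metadata when a hop fails."""
    if failed_hop not in path:
        raise KeyError(failed_hop)
    if path[failed_hop] == SHARED:
        # A shared hop serves every host, so every host is affected.
        return set(all_hosts)
    # A local hop only serves the host it runs on.
    return {failed_on}


hosts = ["compute-1", "compute-2", "compute-3"]
print(affected_hosts(ORIGINAL_PATH, "metadata agent", "network-1", hosts))
print(affected_hosts(PROPOSED_PATH, "host haproxy", "compute-2", hosts))
```

Under this simplified model a failure in the proposed path stays confined to one compute host, which is the blast-radius reduction the RFE is arguing for.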
15:00:30 <mlavalle> ok, we are at the top of the hour. We will start next meeting with this RFE, which will take place in two weeks
15:00:47 <mlavalle> Have a nice weekend
15:00:51 <ralonsoh> you too
15:00:54 <liuyulong_> OK
15:00:56 <amotoki> this proposal replaces existing metadata-agent along with l3-agent and dhcp-agent with a per-host metadata-agent as part of ovs-agent.
15:00:57 <mlavalle> #endmeeting