14:00:11 <lajoskatona> #startmeeting neutron_drivers 14:00:12 <opendevmeet> Meeting started Fri Aug 26 14:00:11 2022 UTC and is due to finish in 60 minutes. The chair is lajoskatona. Information about MeetBot at http://wiki.debian.org/MeetBot. 14:00:12 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 14:00:12 <opendevmeet> The meeting name has been set to 'neutron_drivers' 14:00:13 <lajoskatona> o/ 14:00:16 <ralonsoh> hello 14:00:20 <amorin_> hello 14:00:40 <slaweq> hi 14:01:23 <njohnston_> o/ 14:02:50 <lajoskatona> mlavalle can't join today 14:03:03 <mlavalle> o/ 14:03:13 * mlavalle in another meeting 14:04:03 <lajoskatona> we have four drivers today, but we need at least 5 as I count 14:04:13 <obondarev_> hi 14:04:18 <lajoskatona> 5 :-) 14:04:29 <haleyb> hi 14:04:52 <lajoskatona> Let's start :-) 14:05:13 <lajoskatona> We have 2 RFEs which are related to each other as I see: 14:05:19 <lajoskatona> [RFE] Add DSCP mark 44 (#link https://bugs.launchpad.net/neutron/+bug/1987378 ) 14:05:40 <ralonsoh> This one is easy, I think 14:05:41 <lajoskatona> In comment slaweq already added that perhaps no need for a formal RFE for this 14:05:47 <slaweq> TBH, for me it doesn't seems like real RFE even 14:05:58 <lajoskatona> +1 14:05:58 <ralonsoh> (I added the RFE just in case) 14:06:06 <slaweq> this seems like minor and obvious change (addition) 14:06:12 <njohnston> +1 14:06:22 <ralonsoh> I have just one question related, now you are here 14:06:34 <lajoskatona> ralonsoh: please 14:06:47 <ralonsoh> this feature implies to add a new DSCP mark, that implies changing the API 14:06:48 <ralonsoh> but 14:06:59 <ralonsoh> the API values can be consulted via API too 14:07:02 <ralonsoh> example 14:07:20 <ralonsoh> https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/854369/1/neutron_tempest_plugin/api/test_qos.py 14:07:57 <ralonsoh> so I don't think we need an extension, because those values can be consulted via API (these values = DSCP accepted values, per driver) 14:08:01 <ralonsoh> right? 14:08:16 <slaweq> for me this sounds good 14:08:36 <slaweq> we can discover values acceptable in the cloud using that existing API 14:08:42 <ralonsoh> exactly 14:08:47 <obondarev> makes sense 14:08:51 <njohnston> agreed 14:09:11 <ralonsoh> thanks a lot, that's all from me related to this feature 14:09:21 <lajoskatona> +1 14:09:40 <haleyb> +1 from me 14:10:04 <njohnston> I would just say that we should make sure that discoverabiluty method is in the api ref - I don't see it when I look at the DSCP api 14:10:30 <ralonsoh> it is in the rule-type API 14:10:37 <ralonsoh> I'll send you the link 14:11:24 <ralonsoh> https://docs.openstack.org/api-ref/network/v2/index.html?expanded=list-qos-rule-types-detail,show-qos-rule-type-details-detail#show-qos-rule-type-details 14:11:48 <njohnston> perhaps we add an example there that shows the DSCP details, none of the examples demonstrate that 14:12:05 <ralonsoh> njohnston, I'll add it 14:12:11 <njohnston> +1 thanks! 14:12:17 <lajoskatona> good idea 14:12:44 <lajoskatona> ok, I think we can move on 14:12:48 <lajoskatona> [RFE] Add a port extension to set/define the switchdev capabilities (#link https://bugs.launchpad.net/neutron/+bug/1987093 ) 14:13:25 <ralonsoh> the goal of this RFE is to avoid exposing/modifying the port bindings from Neutron (or a Neutron user) 14:13:33 <ralonsoh> port binding should be read-only always 14:13:41 <ralonsoh> and only Nova should change that 14:14:04 <ralonsoh> so, in order to create a HW offloaded port, I'm proposing this new extension 14:14:31 <ralonsoh> why? a non-admin user, with the needed policies, can change a port binding PCI address 14:14:45 <ralonsoh> pointing, for example, to the compute node hard drive PCI address 14:14:56 <ralonsoh> that means when you boot the VM, this compute nodes dies 14:15:04 <ralonsoh> that's a possible security issue 14:15:08 <slaweq> for backward compatybility will this value be still exposed in binding profile, as it is now? 14:15:13 <ralonsoh> yes 14:15:21 <slaweq> so there will be no changes on nova required 14:15:28 <ralonsoh> it will 14:15:34 <slaweq> ok, sounds good for me 14:16:14 <lajoskatona> agree, as I see it is nova who has all the knowledge for the profile, let it do its best :-) 14:16:30 <ralonsoh> (and btw, that was something proposed by a Nova folk, sean-k-mooney ) 14:16:50 <ralonsoh> so we'll have full support from Nova side 14:16:50 <lajoskatona> :-) 14:16:58 <obondarev> should we try make it consistent with direct ports (SRIOV)? 14:17:11 <ralonsoh> yes, that's the goal 14:17:24 <obondarev> ok cool 14:18:22 <lajoskatona> Ok, it was also easy 14:18:59 <lajoskatona> We have no more RFEs, but: 14:19:02 <lajoskatona> #topic On Demand Agenda 14:19:08 <lajoskatona> (amorin): Add new novaplug mechanism driver 14:19:14 <amorin> Hey 14:19:15 <lajoskatona> https://bugs.launchpad.net/neutron/+bug/1986969 14:19:21 <lajoskatona> https://review.opendev.org/c/openstack/neutron/+/854553 14:19:37 <amorin> I can explain a little bit the context 14:19:46 <sean-k-mooney> o/ 14:19:58 <amorin> hey sean-k-mooney 14:20:03 <sean-k-mooney> this the trusted vf changes 14:20:08 <sean-k-mooney> or hardware offloaed ovs 14:20:33 <ralonsoh> sean-k-mooney, it is approved, yes 14:20:35 <sean-k-mooney> for the swtichdev capablity nova should set that on our side 14:20:44 <ralonsoh> we are talking about other bug now 14:20:49 <sean-k-mooney> fro trustedVF a new extstion on the neutron side 14:20:52 <sean-k-mooney> woudl be ideal 14:21:16 <lajoskatona> sean-k-mooney: for this one: https://bugs.launchpad.net/neutron/+bug/1987093 ? 14:21:31 <sean-k-mooney> we should not do that 14:21:46 <sean-k-mooney> at least not as the tital states 14:22:01 <sean-k-mooney> nova can detect if the VF has that capablity and set it 14:22:17 <sean-k-mooney> no if we want to allow user to schdule to a shost that has hardawr offloaed ovs 14:22:28 <sean-k-mooney> having an extion to request that via a triat or somthing would be nice 14:22:39 <sean-k-mooney> but setting the capablit is somethign we shoudl fix in nova 14:22:58 <sean-k-mooney> the trusted VF feature ahs a simiar problem however 14:23:04 <sean-k-mooney> that can only be fixed in neutorn 14:23:14 <sean-k-mooney> it also requires a vaule to be set in the profile to use 14:23:23 <sean-k-mooney> that shoudl be done differntly 14:23:37 <ralonsoh> but this is other feature, not related 14:23:43 <ralonsoh> we can address it in other bug 14:23:59 <sean-k-mooney> yes but i don tthink you shoudl try to adress https://bugs.launchpad.net/neutron/+bug/1987093 in nova 14:24:03 <sean-k-mooney> * in neutron 14:24:24 <ralonsoh> ok, now I'm lost 14:24:32 <lajoskatona> me too :-) 14:24:46 <sean-k-mooney> the binding profie is ment to be used to pass info form nova to the network backend 14:24:58 <lajoskatona> ok, let's go back to https://bugs.launchpad.net/neutron/+bug/1987093 ([RFE] Add a port extension to set/define the switchdev capabilities ) 14:25:22 <sean-k-mooney> right so i am not conviced we shoudl do ^ 14:25:49 <sean-k-mooney> nova does not use the "{"capabilities": ["switchdev"]}" for anything today 14:26:00 <lajoskatona> do we need a spec on nova side also to be in sync? 14:26:02 <sean-k-mooney> and the enduer has no idea if the VF supports it 14:26:26 <ralonsoh> and how should we ask for a HWOL port? 14:26:37 <sean-k-mooney> just vic type = direct 14:26:43 <sean-k-mooney> no special request 14:27:00 <ralonsoh> and if we have SRIOV + OVS? 14:27:10 <ralonsoh> what driver should attend this request? 14:27:12 <sean-k-mooney> if we want to suppot that we shoudl have a custom trait 14:27:21 <sean-k-mooney> but just adding "{"capabilities": ["switchdev"]}" 14:27:29 <sean-k-mooney> will not allow you to chosee today 14:27:43 <sean-k-mooney> the nova spec to implemnt the checking based on that was not finished 14:27:59 <sean-k-mooney> ralonsoh: you wote the code may years ago but we never merged it 14:28:03 <ralonsoh> this is the way we have now to discriminate if this port is for OVS (HW offload) or SRIOV (not HWOL) 14:28:11 <sean-k-mooney> nope 14:28:16 <ralonsoh> in Neutron yes 14:28:27 <sean-k-mooney> the bug fix that enforect that requriemtn was wrong 14:28:40 <sean-k-mooney> its check by ovs but that check is useless 14:28:46 <ralonsoh> if (vnic_type == portbindings.VNIC_DIRECT and 14:28:46 <ralonsoh> 'switchdev' in capabilities): 14:28:46 <ralonsoh> LOG.debug("Refusing to bind due to unsupported vnic_type: %s " 14:28:46 <ralonsoh> "with switchdev capability", portbindings.VNIC_DIRECT) 14:28:49 <ralonsoh> ^^ SRIOV 14:28:54 <ralonsoh> and the opposite in OVS 14:28:56 <sean-k-mooney> yep 14:29:01 <sean-k-mooney> that check is pointless 14:29:22 <sean-k-mooney> you have no idea if that port is conencted to a VF that is in swtichdev mode 14:29:24 <ralonsoh> Ok, let's do this 14:29:35 <ralonsoh> let's move this discussion out of this meeting 14:29:40 <sean-k-mooney> +1 14:29:42 <ralonsoh> between you and me 14:29:51 <ralonsoh> and we can continue with amorin bug 14:29:54 <sean-k-mooney> well with anyone that is interested 14:29:58 <amorin> ok 14:29:59 <sean-k-mooney> but sure 14:29:59 <ralonsoh> (I'll update the bug in LP) 14:30:02 <lajoskatona> thanks, let's do that 14:30:13 <lajoskatona> ralonsoh, sean-k-mooney: thanks 14:30:20 <amorin> we have customers using neutron api from terraform, they create port with a device_id, which endup NOT beeing attached to the instance 14:30:35 <amorin> so, to solve this on our side, we will add a custom mech-driver (novaplug) 14:30:38 <sean-k-mooney> ah this im glad im here :) 14:31:10 <amorin> after a discussion with sean-k-mooney 14:31:29 <amorin> it seems that the best would be to set device_id to be an admin only thing 14:31:49 <ralonsoh> (BTW, the goal of https://bugs.launchpad.net/neutron/+bug/1986969 was not to create a new mech driver, but to document the current behaviour) 14:31:55 <amorin> because nova will never approve plugging a port outside of a regular nova attache 14:32:08 <amorin> ralonsoh, yes true 14:32:19 <amorin> I just wanted to share what we will do downstream 14:32:38 <amorin> I dont want to bring our solution upstream, I just want to fix the issue at the end 14:32:57 <amorin> I created this in nova: https://review.opendev.org/c/openstack/nova/+/854627 14:33:06 <sean-k-mooney> yep so form a nova point of view detaching or attaching a port via neuton is not supported 14:33:15 <sean-k-mooney> and we dont want to supprot that in the future 14:33:26 <ralonsoh> right, you shouldn't 14:33:48 <sean-k-mooney> so we would prefer i fthe device_id and devie_oweer filed were admin/service user ownly for post/put 14:34:02 <sean-k-mooney> allowing normaly users readonly access is fine 14:34:14 <sean-k-mooney> but only nova/ironic/zun shoudl set them 14:34:26 <sean-k-mooney> as part of there internal interface attach workflows 14:34:52 <sean-k-mooney> well or neuton in the case of l3 router ports ecrta 14:35:06 <lajoskatona> sounds logical 14:35:12 <amorin> +1 14:35:42 <slaweq> ok, but if we will change that API and allow it only to the admin, we may break someone's usecase potentially if e.g. someone uses this field now for some custom ports 14:36:01 <sean-k-mooney> yes that is possible 14:36:07 <obondarev> for device_owner I think many users might use it 14:36:08 <amorin> maybe we can add an extra option in neutron.conf to change this behavior> 14:36:11 <amorin> ? 14:36:13 <slaweq> (I don't know about such uses cases in real world, just thinking loud now) 14:36:28 <sean-k-mooney> well it can be contoled by custom policy 14:36:34 <ralonsoh> this is part of the port binding 14:36:40 <sean-k-mooney> i know of one incorrect use today 14:37:08 <amorin> custom policy is not an option here, if it's available, our customer expects to use it 14:37:12 <sean-k-mooney> the shfit on stack team were considering using the id to reserve ip for contianer 14:37:44 <ralonsoh> so another reason to make the port binding read only 14:38:08 <slaweq> ralonsoh port binding? 14:38:23 <slaweq> don't we talk about device_id and device_owner fields? 14:38:24 <obondarev> these are separate fields 14:38:33 <ralonsoh> sorry, no, this is not part of the port binding info 14:38:34 <sean-k-mooney> yes are seperate yes 14:38:35 <ralonsoh> my bad 14:38:36 <lajoskatona> slaweq: I thought the same 14:38:43 <slaweq> ok 14:38:51 <sean-k-mooney> nova sets them during port attch at the same time it binds the port 14:38:56 <sean-k-mooney> but they are seperate 14:39:21 <sean-k-mooney> but that is just use minimisng the api calls not 14:39:27 <slaweq> ok, so I would like to clarify one thing as I'm not sure I'm following all the discussion here 14:40:10 <slaweq> what is the proposal to discuss: document current behaviour on Neutron side or add new mechanism driver in neutron as amorin proposed already in gerrit or maybe something else? 14:40:23 <slaweq> like changing current default API policies? 14:40:27 <opendevreview> sean mooney proposed openstack/neutron master: Revert "ovs mech: bind only if user request switchdev" https://review.opendev.org/c/openstack/neutron/+/854796 14:40:52 <sean-k-mooney> i was suggesting changing the default api policy 14:40:55 <lajoskatona> To document is the minimum as I see 14:41:03 <amorin> document +1 14:41:07 <sean-k-mooney> documenting is alwasy good 14:41:15 <amorin> changing it to admin only 14:41:23 <lajoskatona> but changing the policies that can be risky as you said with current unknown users 14:41:30 <sean-k-mooney> the issue is right now if you set the device-id to a nova server uuid you will start currptiong the nova db 14:41:55 <amorin> moreover, you can set it to a device_id which does not belong to the same tenant 14:41:59 <amorin> nova does nothing about it 14:42:12 <amorin> but openstack port list --server uuid 14:42:15 <amorin> as admin 14:42:19 <amorin> will print it 14:42:22 <sean-k-mooney> really thats not nice 14:42:22 <obondarev> should nova distinguish real interfaces from such manually created ones? 14:42:29 <amorin> very confusing for admins 14:43:01 <sean-k-mooney> the curent api contarct is that the device-id should be set to the entity that consumes the port 14:43:16 <sean-k-mooney> entiy being nova instnace, ironic server or zun contianer 14:43:49 <sean-k-mooney> the device-owner field is not used for filtering 14:43:50 <slaweq> so as sean-k-mooney said, we can change default API policies, document it and write "big" release note that this behaviour changed and if You want to use it in old way, You need to change Your API policies 14:44:06 <amorin> +1 14:44:11 <sean-k-mooney> yep 14:44:27 <lajoskatona> yeah, as it wa never an official "feature" 14:44:29 <lajoskatona> +1 14:44:30 <sean-k-mooney> by the way the usecase the opensfhit had was creating port to reservice ips for there use 14:44:48 <sean-k-mooney> so i also feel like that could be adress as its won feature another way 14:45:10 <sean-k-mooney> i.e. provide a way to reserve ips without having to create unattached ports 14:45:46 <amorin> but you can create ports without device_id, why do they need a device_id? 14:45:49 <sean-k-mooney> baicaly like creating flotating ip but form a normal subnet 14:45:56 <slaweq> ^^ that would be another, bigger RFE IMO 14:46:04 <sean-k-mooney> amorin: they wanted to track that the ip is used by that vm 14:46:16 <sean-k-mooney> or rahter a contianer on that vm 14:46:27 <amorin> ack 14:46:37 <sean-k-mooney> slaweq: yes it would 14:46:41 <amorin> were they aware that a reboot hard of the VM would actually attached it? 14:46:53 <sean-k-mooney> nope 14:47:00 <sean-k-mooney> and we were not either 14:47:06 <sean-k-mooney> (nova team) 14:47:09 <amorin> ok 14:47:14 <sean-k-mooney> i dont thin that used to happen 14:47:22 <ralonsoh> it works, yes 14:47:30 <sean-k-mooney> i think its a side effect of the network info cahce healing i added a few cycle ago 14:47:33 <lajoskatona> do we have to handle that case? I mean hard reboot ? 14:47:37 <amorin> true 14:47:42 <sean-k-mooney> its not ment ot work because it does not actully set thing up in our db properly 14:47:46 <amorin> (for the healing side effect) 14:47:47 <obondarev> so should we disallow device_id only or device_owner is also required? 14:48:03 <amorin> both I think 14:48:20 <sean-k-mooney> nova likely need to have its own bug for this and figure out hwo we want to move forward 14:48:59 <lajoskatona> perhaps a good discussion for the PTG :-) 14:49:09 <sean-k-mooney> yep perhaps 14:49:37 <sean-k-mooney> likely we shoudl jsut reject the external event and not populate it in the netowrk info cache and complain loadly 14:49:54 <sean-k-mooney> the probalem is to who 14:50:08 <lajoskatona> ok, so today and from Neutron perspective we agree that we have to document this behaviour and change the default policies for dev_id and dev_owner, am I right? 14:50:30 <ralonsoh> I think so, this should be the first step 14:50:43 <obondarev> well, the issue described is only related to device_id, so not sure device_owner is required here also 14:50:54 <lajoskatona> ok, thanks 14:51:49 <lajoskatona> no it mentions owner also, let's see it in the review, I would say :-) 14:52:08 <obondarev> less potential "broken" users is better I think 14:52:15 <ralonsoh> right 14:52:25 <slaweq> ++ 14:52:26 <lajoskatona> that is a good point 14:53:04 <obondarev> + for continue in review 14:53:04 <lajoskatona> Ok, so documentation and policy change for device_id for now 14:53:08 <amorin> ok 14:53:33 <lajoskatona> Do you have anything more to discuss today? 14:54:35 <lajoskatona> If nothing we can close the meeting for today 14:54:43 <slaweq> nothing from me 14:55:24 <lajoskatona> #endmeeting