14:00:59 #startmeeting neutron_drivers
14:00:59 Meeting started Fri Jul 2 14:00:59 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:59 The meeting name has been set to 'neutron_drivers'
14:01:02 o/
14:01:03 Rodolfo Alonso proposed openstack/neutron stable/victoria: [OVN] Do not fail when processing SG rule deletion https://review.opendev.org/c/openstack/neutron/+/799210
14:01:06 hi
14:01:07 hi
14:01:30 hi
14:01:39 let's wait a few more minutes for people to join
14:01:50 I know that haleyb and njohnston are on PTO today
14:01:57 hi
14:02:01 but amotoki and yamamoto may join
14:02:03 hi
14:02:15 Pedro Henrique Pereira Martins proposed openstack/neutron master: Extend database to support portforwardings with port range https://review.opendev.org/c/openstack/neutron/+/798961
14:02:43 hi
14:03:55 ok, let's start
14:04:06 the agenda for today is at https://wiki.openstack.org/wiki/Meetings/NeutronDrivers
14:04:10 do we have quorum?
14:04:18 mlavalle: I think so
14:04:25 there are you, ralonsoh, amotoki and me
14:04:33 ok
14:04:34 so the minimum, but a quorum, right?
14:04:44 yeah, I think so
14:04:50 #topic RFEs
14:04:50 actually I'm presenting an RFE, so I should not vote
14:05:13 ralonsoh: sure, so your RFE can wait until the next meeting
14:05:19 perfect
14:05:38 then we have one RFE for today :)
14:05:40 https://bugs.launchpad.net/neutron/+bug/1930866
14:07:04 who is presenting it?
14:07:29 doesn't matter
14:07:33 we can discuss it
14:07:37 ok, perfect
14:07:44 that's the usual approach
14:07:59 yeah, personally I think it is a totally valid issue
14:08:11 I didn't know that nova has something like "lock server"
14:08:36 it is. we should worry about the complete end user experience across all projects, not only Neutron
14:08:58 mlavalle: yes, exactly :)
14:09:00 end users don't use Neutron. They use OpenStack
14:09:54 since this is part of the Nova API, can we ask them to modify the VM ports, as obondarev suggested?
14:10:23 or should we be responsible for checking this state?
14:10:27 ralonsoh: yes, but looking from the neutron PoV only, we should provide some way to "lock" a port in such a case
14:10:31 the bug is reported about locked instances. What I am not sure about is whether we need to handle ports used by locked instances specially.
14:10:40 similar to what we do with the dns_name attribute when Nova creates a port for an instance
14:10:41 then nova could "lock" the port as part of the server lock
14:11:31 potentially end users can hit similar issues even for non-locked instances.
14:11:43 yeap
14:12:07 I don't see in what scenario, sorry
14:12:10 amotoki: IMHO we should; I'm not sure if forbidding deletion of any port which is attached to an instance would be a good idea, as that would be a pretty big change in the API
14:12:20 but in the case of locked instances we really deliver an awful end user experience, because OpenStack made a promise that gets broken
14:12:21 looking from the Cinder perspective, if a volume is attached, you cannot just delete it. Would the same make sense for a port?
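As a rough illustration of the guard being floated here (refusing to delete a port while it is still attached to a server), a minimal sketch in plain Python; the names (PortInUse, ensure_port_deletable) are invented for the example and are not Neutron code:

    COMPUTE_OWNER_PREFIX = 'compute:'

    class PortInUse(Exception):
        """Hypothetical error for a port that is still attached to a server."""

    def ensure_port_deletable(port):
        """Refuse deletion while the port is bound to a Nova instance."""
        owner = port.get('device_owner') or ''
        if owner.startswith(COMPUTE_OWNER_PREFIX) and port.get('device_id'):
            raise PortInUse('port %s is attached to instance %s; detach it '
                            'first' % (port['id'], port['device_id']))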
14:12:51 neutron already forbids deleting ports of certain types, right?
14:13:25 if we change neutron so it forbids deleting all ports which are attached to VMs, we will break nova, I think
14:13:46 and nova will need to adjust its own code to first detach the port and then delete it
14:13:51 yes, we already forbid deleting ports used by router interfaces (and others maybe)
14:13:57 and that will cause problems during e.g. upgrades
14:14:07 or am I missing something?
14:14:45 slaweq, right, we need to mark those ports somehow
14:14:46 no
14:14:53 slaweq: I haven't checked the whole procedure in server delete. It may affect nova procedures in deleting ports attached to instances.
14:15:01 no, you are not missing anything
14:15:26 maybe we want to discuss this with Nova folks
14:15:33 is gibi around?
14:15:41 mlavalle: hi
14:15:47 hi gibi :)
14:16:01 tbh, I always found it confusing that there are 2 interfaces to attach/detach a port/network to/from an instance - Nova and Neutron directly
14:17:00 jkulik: precisely speaking, there are not two ways to attach ports. Neutron port deletion is not visible to nova, so it confuses users.
14:17:11 for what it's worth, we have suggested blocking port deletion of in-use ports in the past
14:17:25 amotoki: actually it is
14:17:43 neutron sends a network-vif-delete event to nova when the neutron port is deleted
14:18:16 sean-k-mooney: ah, good point. I totally forgot it.
14:18:46 amotoki: from a nova point of view we have never really supported this use case, though we would really prefer if you detached it first and then deleted it if you needed to
14:19:30 sorry, but I don't think we should go this way, making this change in Neutron/Nova
14:19:34 I agree with sean-k-mooney; while deleting a bound port is possible today and there is some level of support for it in nova, this is something that complicates things
14:19:48 regarding https://bugs.launchpad.net/neutron/+bug/1930866 is there an objection to just blocking port delete while it has the device_owner and device_id set?
14:20:17 gibi: sean-k-mooney: but today nova, when e.g. a VM is deleted, will just call neutron once to delete the port, right?
14:20:26 or will it first detach the port and then delete it?
14:20:41 slaweq: that is a good question
14:21:03 slaweq: nova will unbind the port during VM delete, and if the port was actually created by nova during the boot with a network, then nova will delete the port too
14:21:11 we probably don't do a port update and then a delete, but we could
14:21:30 gibi: oh, we do unbind, I was just going to check that
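A minimal sketch of the detach-then-delete sequence described above, assuming openstacksdk (update_port/delete_port) and credentials allowed to clear the binding; this is only illustrative, not nova's actual code path:

    import openstack

    conn = openstack.connect(cloud='mycloud')  # hypothetical cloud entry

    def detach_then_delete(port_id):
        # Clear the binding and device fields first so the backend can tear
        # down the VIF, then delete the now-unbound port.
        conn.network.update_port(port_id, binding_host_id='',
                                 device_id='', device_owner='')
        conn.network.delete_port(port_id, ignore_missing=True)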
14:21:46 disallowing port delete for ports with device_owner/device_id set would mean that a user could not remove dangling ports anymore without having write access to those fields
14:22:28 seba: those fields, I believe, are writable by the user. And they still could, by doing a nova server delete or a port detach via nova
14:22:44 seba: no, what we are discussing is just about port deletion.
14:23:03 the above RFE talks about locked instances specifically. Nova could reject the network-vif-delete event if the instance is locked. It could be a solution
14:23:17 device_owner is writable by NET_OWNER and ADMINs https://github.com/openstack/neutron/blob/master/neutron/conf/policies/port.py#L380
14:23:28 gibi, but at this point the port is already gone
14:23:34 gibi: I think neutron sends that async from the deletion of the port
14:23:35 so in the typical use case, the user will be able to clean it
14:23:46 I agree with gibi ... let's reduce the scope of this to locked instances
14:23:47 ralonsoh: ahh, then never mind, we cannot do that
14:23:52 gibi: but neutron can still delete a port, though?
14:24:38 mlavalle: well, neutron really should not care or know if an instance is locked
14:24:42 ralonsoh commented the same thing already :)
14:24:43 my suggestion can only be implemented if nova can prevent the port deletion by rejecting the network-vif-delete event
14:25:17 gibi: that would require the neutron server to send that event first and check the return before proceeding with the DB deletion
14:25:29 sean-k-mooney: yeah, I realize that
14:25:42 I don't think neutron currently checks the status code of those notifications
14:25:49 personally I like the idea of not allowing port deletion if it is attached to a VM, but to avoid problems during e.g. upgrades we could add a temporary config knob to allow the old behaviour
14:26:18 if we forbade deletion of such ports it would be more consistent with what cinder does
14:26:31 slaweq: if we go that way then I think Octavia folks should be involved, I think they also depend on deleting a bound port
14:26:31 so IMHO a more consistent UX in general :)
14:26:49 gibi, does Octavia use the Nova or the Neutron API?
14:26:50 gibi: ouch, I didn't know that
14:27:02 so I wonder who else we may break :)
14:27:16 I have a faint recollection from a PTG where they approached us with this port delete case as nova had some issue with it
14:27:25 https://github.com/sapcc/nova/blob/cd084aeeb8a2110759912c1b529917a9d3aac555/nova/network/neutron.py#L1683-L1686 looks like nova unbinds pre-existing ports, but directly deletes those it created without unbinding. Looks like an easy change, though.
14:27:29 I have to dig to see if I can find a recording of it
14:27:48 jkulik: that's what I thought :)
14:27:49 jkulik: good reference, and I agree we can change that sequence
14:27:54 so nova would need changes too
14:29:12 yes, it looks like it would; we could backport that, however
14:29:29 maybe you could control this temporarily on the neutron side with a workaround config option
14:29:47 I hate config-driven API behaviour, but since neutron does not use microversions
14:30:08 the only other way to do this would be with a new extension, but that is not backportable
14:30:12 but we have extensions
14:30:41 if it's not a bugfix, but a change to not allow deletion of ports with device_owner and device_id set, can this even be supported by a new API version? Or would all old API versions also behave differently, then?
14:30:46 ralonsoh, yes, but imagine the upgrade case:
14:30:53 older nova, new neutron
14:30:54 so what are the downsides of: "nova sets a 'locked' flag in the port binding dict for locked instances, neutron checks that flag on port delete"?
14:31:07 old nova wants to delete a bound port and it fails
14:31:17 and old nova doesn't know about the extension at all :)
14:31:17 slaweq: so the extension would have to be configurable
14:31:40 slaweq, right. So let's make this configurable for the next release
14:31:44 sean-k-mooney: yes, that's what I wrote a few minutes ago also :) I think that we could add a temporary config option for that
14:32:12 slaweq: yep, for Xena, and then make it mandatory for Y
14:32:14 I know we did that already with some other things between nova and neutron
14:32:24 that would allow nova to always unbind before deleting
14:32:24 sean-k-mooney++ for me that would be ok
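A sketch of the kind of temporary transition knob being agreed on here, using oslo.config; the option and group names are made up for illustration:

    from oslo_config import cfg

    workaround_opts = [
        cfg.BoolOpt('forbid_bound_port_deletion',
                    default=False,
                    help='Reject deletion of a port that is still bound to a '
                         'Nova instance; the port has to be detached first. '
                         'Temporary option to ease the Nova/Neutron upgrade.'),
    ]

    cfg.CONF.register_opts(workaround_opts, group='workarounds')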
14:32:32 iirc, locked instances in Nova can still be changed by an admin
14:32:47 would Nova then be able to delete the locked port, too?
14:32:58 jkulik: well, instances are locked, not ports, right?
14:33:07 exactly
14:33:11 sean-k-mooney: if we go for "nova locks the port in neutron"
14:33:40 ok, so a new neutron extension for locked ports
14:33:53 if nova detects it, we lock them automatically when you lock the VM?
14:34:07 that seems reasonable
14:34:25 nova already detects other neutron extensions, like extended port binding
14:34:33 and then neutron just prevents updating locked ports
14:34:49 mlavalle: yep, that should not be hard to add on the nova side
14:34:54 sean-k-mooney: does it mean nova needs an upgrade step where it updates the ports of already locked instances?
14:35:19 like syncing this state for existing instances
14:35:24 good question
14:35:31 we could do that in init_host, I guess
14:36:03 or we could just not do it and document that you will need to lock them again
14:36:10 I think locking / unlocking happens in the API, so it would be strange to do the state sync on the compute side
14:36:35 anyhow, this needs a nova spec
14:36:49 I don't want to solve all the open questions on a Friday afternoon :D
14:36:49 and I suggest a Neutron spec as well
14:37:22 well, I was going to say technically it's not an API change on the nova side, so it could be a specless blueprint, but yeah, for the upgrade question a spec is needed
14:37:32 so do we want to add a "lock port" extension to neutron, or forbid deletion of in-use ports?
14:37:41 IIUC we have those 2 alternatives now, right?
14:37:45 add a lock port extension
14:37:45 yes
14:37:51 yes
14:38:16 either seems valid, but lock-port is probably a better mapping to the RFE
14:38:37 if I'm an admin, I can delete a locked instance. Neutron needs to take this into account for the "locked" port
14:38:45 I would still personally like to forbid deleting in-use ports
14:39:09 jkulik: well, no, nova can unlock them in that case
14:39:13 the bug was reported for locked instances, but I personally prefer to "block deletion of ports used by nova".
14:39:23 but what would "locked port" mean - it can't be deleted only? Can't be updated at all?
14:39:37 sean-k-mooney: yeah, makes sense
14:39:52 slaweq: I would assume it can't be updated at all, but we could detail that in the spec
14:40:19 IMHO blocking deletion of in-use ports is the more straightforward solution, but on the other hand it may break more people, so it's more risky :)
14:40:47 sean-k-mooney: yeah, we could try to align with nova's behaviour for locked instances
14:40:54 neutron already blocks direct port deletion of router interfaces. In this case we disallow deleting ports used as router interfaces, but we still allow updating device_owner/device_id of such ports. If users would like to delete such ports explicitly, they first need to clear device_owner/device_id and then they can delete those ports.
14:40:55 and that can be clarified in the spec
14:41:13 I know that doing a delete this way used to leak some resources on the nova side in the past, like SR-IOV resources
14:41:26 slaweq: unfortunately I don't know offhand what the nova behaviour actually is
14:41:40 but yes, it would be nice to keep them consistent
14:41:44 sean-k-mooney: np, it can be discussed in the spec as you said :)
14:44:42 so, do we want to vote for the preferred option?
14:45:10 if we have to vote, I lean towards the lock port extension
14:45:23 if we use the existing port's binding_profile dict - do we need a neutron API extension at all?
14:45:50 obondarev: I would say yes, to make the new behaviour in neutron discoverable
14:45:52 another key-value pair there?
14:45:54 it's still an API change
14:46:07 ok, makes sense
14:46:09 slaweq: +1
14:46:10 so you need to somehow tell users that neutron supports that
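To illustrate the discoverability point, a sketch of how a client such as nova could check for a hypothetical 'port-lock' API extension before relying on the new behaviour; the alias is invented, and openstacksdk's extensions() listing is used for brevity:

    import openstack

    conn = openstack.connect(cloud='mycloud')  # hypothetical cloud entry

    def neutron_supports(alias):
        """Return True if the Neutron server advertises the given extension."""
        return any(ext.alias == alias for ext in conn.network.extensions())

    # Lock ports together with the instance only when the server supports it.
    locking_supported = neutron_supports('port-lock')  # hypothetical alias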
14:46:24 IMO this conversation has been a bit chaotic: we started with the "lock port" idea, then we moved to blocking bound port deletion, and now we are voting for a "lock port" extension
14:46:37 I really don't understand what happened in the middle
14:46:52 ralonsoh: :)
14:46:57 then let's not vote and decide the entire thing in the spec
14:47:03 we were going to implement this RFE by blocking the deletion of a bound port
14:47:17 yeah, we discussed two approaches
14:47:21 I know
14:47:26 but mixing both
14:47:50 so the point is to provide a transition knob from neutron to Nova
14:47:53 or an extension
14:48:02 to know if this is actually supported in Neutron
14:48:10 and then implement the port deletion block
14:48:19 (that will also comply with the RFE)
14:48:42 obondarev: the content of the binding_profile is owned by nova and is one-way
14:48:54 so I propose this:
14:48:57 it provides info from nova to the neutron backend
14:49:17 1. we will approve the RFE and will continue work on it in the spec - I think we all agree that this is a valid RFE
14:49:46 2. I will summarize this discussion in an LP comment and will describe both potential solutions
14:50:07 +1, yes, it's a valid RFE
14:50:16 +1, the RFE is legit
14:50:22 if there is anyone who wants to work on it and propose a spec for it, that's great, but if not, I will propose something
14:50:32 *by something I mean an RFE :)
14:50:41 sorry, a spec :)
14:50:45 and I can work on it
14:50:53 mlavalle: great, thx
14:50:53 thanks
14:51:19 I totally agree with what is proposed.
14:51:41 thx, so I think we have agreement about the next steps for that RFE :)
14:52:13 as for the second RFE from ralonsoh, we will discuss it at the next meeting
14:52:20 thanks
14:52:25 I'll update the spec
14:52:44 #topic On Demand agenda
14:52:55 seba: you wanted to discuss https://review.opendev.org/c/openstack/neutron/+/788714
14:53:01 yes!
14:53:04 so you have a few minutes now :)
14:53:21 okay, so just so you understand where I come from: I maintain a neutron driver using hierarchical port binding (HPB), which allocates second-level VLAN segments. If I end up with a (network, physnet) combination existing with different segmentation_ids, my network breaks.
14:53:41 This can happen when using allocate_dynamic_segment(), so my goal would be to either make allocate_dynamic_segment() safe or find another way to do safe segment allocation in neutron.
14:54:33 We discussed https://bugs.launchpad.net/neutron/+bug/1791233 at another drivers meeting, and the idea to solve this was to employ a constraint on the network segments table to make (network_type, network, physical_network) unique.
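A sketch of the proposed constraint as an alembic migration; the table and column names follow neutron's networksegments schema from memory, so treat them as assumptions to verify:

    from alembic import op

    def upgrade():
        # One segment per (network, type, physnet) combination.
        op.create_unique_constraint(
            'uniq_networksegments0network_id0network_type0physical_network',
            'networksegments',
            ['network_id', 'network_type', 'physical_network'])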
14:55:37 that could be a solution, but it doesn't work for tunneled networks
14:55:55 well, in that case the physical network would be None
14:55:55 you'll have (vxlan, net_1, None)
14:56:11 yes, and repeated several times
14:56:20 ralonsoh, that should not be a problem, two "None"s are never the same
14:56:24 we don't currently support having 2 vxlan segments for one neutron network
14:57:31 ralonsoh, jkulik wrote something in the bug report about the NULL values and how they're not considered the same by most if not all major databases
14:57:53 so that use case would not be hindered by the UniqueConstraint
14:57:54 there is no valid configuration where we can have 2 (vxlan, net_1, None) segments with different vxlan VIDs, right?
14:58:07 sean-k-mooney, I have one top-level vxlan segment and then a vlan segment below it for handoff to the next driver. I don't see, though, what would stop me from having a second-level vxlan segment
14:58:41 well, I was thinking about what the segments extension allows
14:58:58 when doing hierarchical port binding that is slightly different
14:59:01 ah, so you're thinking about multiple vxlan segments without specifying a physnet?
14:59:22 seba: yes, since tunnels do not have a physnet
14:59:48 we need to finish the meeting now, but please continue the discussion in the channel :) I have to leave now because I have another meeting. Have a great weekend!
14:59:52 #endmeeting
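A small self-contained check of the NULL point raised above (rows with a NULL physical_network do not conflict under a unique constraint in most databases), using sqlite3; the table and column names are illustrative:

    import sqlite3

    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE segments ('
                 ' network_id TEXT, network_type TEXT, physical_network TEXT,'
                 ' UNIQUE (network_id, network_type, physical_network))')
    # Two tunneled (physnet-less) segments for the same network: both inserts
    # succeed because the NULLs are not considered equal to each other.
    conn.execute("INSERT INTO segments VALUES ('net_1', 'vxlan', NULL)")
    conn.execute("INSERT INTO segments VALUES ('net_1', 'vxlan', NULL)")
    print(conn.execute('SELECT COUNT(*) FROM segments').fetchone()[0])  # -> 2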