13:59:59 #startmeeting neutron_l3
14:00:00 Meeting started Wed Nov 20 13:59:59 2019 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:01 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:03 The meeting name has been set to 'neutron_l3'
14:00:06 #chair liuyulong_
14:00:07 Current chairs: liuyulong liuyulong_
14:01:41 #topic Announcements
14:02:32 Let's recall the announcements yesterday
14:02:39 #link http://eavesdrop.openstack.org/meetings/networking/2019/networking.2019-11-19-14.00.log.html#l-10
14:02:59 Then no more from me.
14:05:06 #topic Bugs
14:06:42 No bug deputy email received this week, so let's directly search the bug list.
14:09:13 First one
14:09:18 #link https://bugs.launchpad.net/neutron/+bug/1852777
14:09:18 Launchpad bug 1852777 in neutron "Neutron allows to create two subnets with same CIDR in a network through heat" [High,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:09:51 hi yes
14:09:56 I'm still testing this
14:09:56 The distributed lock should be introduced for this, IMO
14:10:04 no, is not working
14:10:16 A local file or memory lock does not work in multiple physical hosts.
14:10:27 even with the threading lock I have both subnets created
14:10:46 distributed lock?
14:10:54 sorry can you point me to this?
14:11:00 I mean we should use tooz.
14:11:06 ah ok
14:11:11 hi, sorry for being late
14:11:30 but this request will be done to a single server only
14:11:37 or am I wrong?
14:11:52 so a distributed lock won't be necessary here
14:12:13 create subnet with same CIDR should be two different API calls.
14:12:22 ralonsoh: I think that each create-subnet request can go to different host with neutron-api process
14:12:24 no?
14:12:27 So it will spread to different hosts.
14:12:34 ok
14:13:05 btw, this is going to add an extra delay in subnet creation
14:13:27 just a heads-up to everybody complaining about the time consumption in neutron API
14:13:32 BTW, the UNIQUE CONSTRAINT is also worthy to add.
14:13:48 where?
14:14:07 because two different cidrs can overlap being different
14:14:11 just different masks
14:14:22 10.0.0.0/24 and 10.0.0.0/25
14:14:25 Yes, the distributed lock will make the API workers from different hosts run linearly.
14:14:28 exactly, constraint on db level will not help here
14:14:45 (I tried to do something with IPSets)
14:14:53 by adding a new table
14:14:58 and adding a register per network
14:15:10 containing the IPSets of the CIDRs
14:15:15 ralonsoh, slaweq, but at least it can cover one case.
14:15:56 liuyulong, hmmm I don't think this is enough
14:16:01 #chair slaweq
14:16:02 Current chairs: liuyulong liuyulong_ slaweq
14:16:11 ^ just in case
14:17:05 ralonsoh, yes, I'm not saying it will cover all cases.
14:17:06 unique constraint can at least fix problem when 2 workers will try to create exactly same subnets
14:17:20 and as it's probably easy to add, it makes sense for me
14:17:27 but this will not solve the problem for sure
14:17:38 ok
14:17:46 I'll add a partial-bug patch for this
14:18:23 ralonsoh++
14:18:39 liuyulong++ for the idea about unique constraint too
14:18:49 I have another bad idea based on such unique constraint, : )
14:19:32 Store each IP of the CIDR and add unique constraint between IP and network_id, hahaha
14:19:44 not all the IPs
14:19:48 but the IPSet
14:19:52 as I proposed before
14:20:13 one IPset per network, in one single register
14:20:40 this will force, any time we want to update the network subnets, to update this single register
14:21:11 this will use the DB to enforce the logic
14:21:37 Database has such IPset data type?
14:21:55 no, json.dumps(netaddr.IPSet())
14:22:02 and then json.loads()
14:22:03 ralonsoh: how one IPSet will work for network if I will have e.g. 2 subnets 1.0.0.0/24 and 2.0.0.0/24 ?
14:22:30 one sec
14:22:52 ralonsoh, a json list can be used for unique constraint?
14:22:59 >>> n10=netaddr.IPNetwork('1.0.0.0/24')
14:22:59 >>> n11=netaddr.IPNetwork('2.0.0.0/24')
14:22:59 >>> ips=netaddr.IPSet([n10,n11])
14:22:59 >>> ips
14:22:59 IPSet(['1.0.0.0/24', '2.0.0.0/24'])
14:23:00 and also You will still need to have logic in python to validate this IPset each time e.g. new subnet is added
14:23:08 slaweq, yes
14:23:23 so it still will not be atomic
14:23:25 right?
14:23:33 but the point is you can have only one writer context to one DB register
14:23:59 as I said, this is not easy and I'm trying to find the way
14:24:00 ahh, ok
14:24:09 so this would be locked by one api worker
14:24:17 it should
14:24:23 and other would need to wait to read from it, correct?
14:24:32 (of course, this will break the DB normal forms)
14:24:40 exactly, it should wait
14:25:11 then this may work
14:25:33 +1, make sense
14:26:29 but one more thing
14:26:31 It looks like a distributed lock implemented by neutron itself for each network during create subnet.
14:26:43 do You want to store in db list(ips)? or what exactly?
14:27:03 store str(IPSet)
14:27:17 this is way shorter than the IP list
14:27:35 TypeError: Object of type IPSet is not JSON serializable
14:27:47 I have such error when I'm trying to do this
14:27:57 I know, we need to create a serializer
14:28:18 but ok, we can even store there list of cidrs, and then create IPSet object in flight during the validation
14:28:19 this could be done just with the ranges list
14:28:26 e.g.: ['1.0.0.0/24', '2.0.0.0/24']
14:28:31 This is my understanding: one API try to add 'ip_set' to the 'new table' and it should have uniq constraint for network; while another worker try to write this table will meet uniq constraint error.
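Editor's note: the idea above of storing a plain list of CIDRs and rebuilding the set during validation can be sketched roughly as follows. This is only an illustration, not the patch discussed in the meeting. It swaps in the standard-library ipaddress module in place of netaddr.IPSet, and the helper names dump_cidrs and overlaps_existing are hypothetical. It also shows why a string-level unique constraint alone misses the 10.0.0.0/24 vs 10.0.0.0/25 style of overlap: the CIDR strings differ, but the ranges intersect.

```python
import ipaddress
import json


def dump_cidrs(cidrs):
    """Serialize a list of CIDR strings to JSON (hypothetical storage format)."""
    return json.dumps(sorted(cidrs))


def overlaps_existing(stored_json, new_cidr):
    """Return True if new_cidr overlaps any CIDR in the stored JSON list."""
    new_net = ipaddress.ip_network(new_cidr)
    for cidr in json.loads(stored_json):
        existing = ipaddress.ip_network(cidr)
        # Only compare networks of the same IP version.
        if existing.version == new_net.version and existing.overlaps(new_net):
            return True
    return False


# A plain list of strings is JSON serializable, unlike an IPSet object.
stored = dump_cidrs(["1.0.0.0/24", "2.0.0.0/24"])

# A disjoint range is accepted; a different mask inside an existing
# range is detected even though the CIDR strings are not equal.
disjoint = overlaps_existing(stored, "3.0.0.0/24")
nested = overlaps_existing(stored, "1.0.0.0/25")
```

The check itself is still plain Python, so, as noted in the discussion, it is only safe if something else serializes the writers for a given network.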
14:29:01 slaweq, but we should not store the CIDR list
14:29:15 slaweq, we already have this information from the DB
14:29:26 this is bad DB design
14:29:49 liuyulong, yes, that's the point
14:29:53 So after the first one creation done, and another will start another retry to write this table, and go to the IPAM check again.
14:29:54 to use the DB as a lock
14:30:16 ralonsoh: sure, I was thinking about list as You mentioned: ['1.0.0.0/24', '2.0.0.0/24']
14:30:44 slaweq, yes another example
14:31:00 >>> n11=netaddr.IPNetwork('1.0.1.0/24')
14:31:00 >>> ips=netaddr.IPSet([n10,n11])
14:31:00 >>> ips
14:31:00 IPSet(['1.0.0.0/23'])
14:31:10 n10=netaddr.IPNetwork('1.0.0.0/24')
14:31:30 one network range for two cidrs
14:32:54 yes, that makes sense
14:35:03 If neutron is willing to introduce tooz, the subnet creation can also apply lock on the network only, basically the logic is the same.
14:35:56 OK, we have good ideas here, thanks.
14:36:06 Next one
14:36:10 #link https://bugs.launchpad.net/neutron/+bug/1852760
14:36:10 Launchpad bug 1852760 in neutron "When running 'openstack floating ip list' on undercloud, client cannot handle NotFoundException" [Low,Invalid] - Assigned to Nate Johnston (nate-johnston)
14:36:58 yes, I moved that to storyboard as it's a client issue https://storyboard.openstack.org/#!/story/2006863
14:37:28 It is a client error. We need user friendly outputs for it. Right?
14:37:51 correct. neutron api is returning the correct thing.
14:38:21 And the Neutron API response should also add the resource type in the message, IMO.
14:38:31 Now it is just "The resource could not be found"
14:38:53 that would be nice, but not absolutely required
14:39:34 Yes, user should remember the resource type they are just trying to find.
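Editor's note: the "use the DB as a lock" reading above (one row per network, guarded by a uniqueness constraint, with the losing worker retrying later) might look roughly like the sketch below. It is an assumption-laden illustration only: it uses an in-memory sqlite3 database rather than Neutron's real database layer, and the table name network_cidr_lock and the helpers try_acquire and release are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: at most one row per network, enforced by the primary key.
conn.execute(
    "CREATE TABLE network_cidr_lock ("
    "network_id TEXT PRIMARY KEY, owner TEXT NOT NULL)"
)


def try_acquire(conn, network_id, owner):
    """Try to claim the per-network row; return False if another owner holds it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO network_cidr_lock (network_id, owner) VALUES (?, ?)",
                (network_id, owner),
            )
        return True
    except sqlite3.IntegrityError:
        # The uniqueness violation is the "lock is taken" signal.
        return False


def release(conn, network_id, owner):
    """Drop the row so that a waiting worker can retry."""
    with conn:
        conn.execute(
            "DELETE FROM network_cidr_lock WHERE network_id = ? AND owner = ?",
            (network_id, owner),
        )


first = try_acquire(conn, "net-1", "worker-a")   # worker-a claims the network
second = try_acquire(conn, "net-1", "worker-b")  # worker-b must wait and retry
release(conn, "net-1", "worker-a")
third = try_acquire(conn, "net-1", "worker-b")   # the retry now succeeds
```

After a successful retry the second worker would run the IPAM overlap check again, this time seeing the subnet created by the first worker. A tooz lock keyed on the network ID would serve the same purpose without the extra table.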
14:39:41 indeed
14:39:54 Next
14:39:57 #link https://bugs.launchpad.net/neutron/+bug/1852680
14:39:58 Launchpad bug 1852680 in neutron "floatingip can not access after associate to instance" [Undecided,Incomplete]
14:40:32 I highly doubt the VM was not set the security group rule correctly.
14:41:13 Since there is no more information attached to this now, let's leave it as Incomplete.
14:42:22 I agree
14:42:24 Next
14:42:28 #link https://bugs.launchpad.net/neutron/+bug/1852504
14:42:28 Launchpad bug 1852504 in neutron "DHCP reserved ports that were unscheduled are advertised as DNS servers" [Medium,In progress] - Assigned to Mithil Arun (arun-mithil)
14:43:58 Alright, another DHCP bug, this was seen in our cloud.
14:44:38 It is mainly because of the auto_schedule mechanism of DHCP.
14:45:24 the issue here is that we are not removing reserved_dhcp_ports but left them unbound, right?
14:46:18 We have no fix of this, but as a workaround, I just suggest to disable the auto_schedule of the DHCP, and increase the dhcp_agents_per_network to 3 or more. In such way, it can cover most failure case.
14:46:44 slaweq, yes
14:46:59 maybe we should remove such ports?
14:47:18 slaweq, but you can see that the bug description has more than 2 ACTIVE DHCP ports...
14:48:02 liuyulong: yes, but how it's related?
14:48:08 active ports are ok, right?
14:49:00 I have no idea because we have used the distributed DHCP based on the openflow and ovs local controller.
14:49:25 Which I proposed during the PTG, : )
14:49:54 yes, I remember that one :)
14:50:37 Anyway, it has a fix there: https://review.opendev.org/694859
14:50:50 We can test that.
14:51:03 yes, let's review this
14:51:57 Next: https://bugs.launchpad.net/neutron/+bug/1852468
14:51:57 Launchpad bug 1852468 in neutron "network router:external value is non-boolean (Internal) which causes server create failure" [Undecided,Invalid]
14:52:00 It is invalid now.
14:52:24 And next:
14:52:27 https://bugs.launchpad.net/neutron/+bug/1852447
14:52:27 Launchpad bug 1852447 in neutron "FWaaS: adding a router port to fwg and removing it leaves the fwg active" [Medium,Triaged]
14:52:49 Who is the new daddy of this project now? haha
14:53:23 there is no new daddy for fwaas (yet)
14:54:21 Alright, time is running out. Let's move on.
14:54:32 #topic On demand agenda
14:55:02 I have one update of IPv6.
14:55:38 We finally move back to dhcpv6-stateful for both address and other option with prefix len of 64.
14:56:48 Everything works fine for now, the instance image does not change for the IPv6 and the NetworkManager also works fine.
14:59:05 * haleyb completely forgot about the time change for this meeting, sorry :( updated...
14:59:07 Windows have a very magic behavior, when you add an IPv6 address for a port (once with IPv4 only), the NIC of the windows will automatically set the IPv6 address to it. But for Linux, the user has to ifdown/up the network interface to dhcp the IPv6 address.
14:59:26 OK, let's end here.
14:59:29 #endmeeting