14:00:53 #startmeeting neutron_l3
14:00:57 Meeting started Wed Aug 7 14:00:53 2019 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:58 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:00 The meeting name has been set to 'neutron_l3'
14:01:04 hi
14:01:10 #chair haleyb
14:01:11 Current chairs: haleyb liuyulong
14:01:13 hi
14:01:13 hi
14:01:29 #topic Announcements
14:03:17 Unfortunately, the port forwarding topic was not accepted by the Summit Team.
14:04:01 So mlavalle and I will not share this in the Summit. : )
14:04:45 Any other announcements?
14:05:33 OK, let's move on.
14:05:41 #topic Bugs
14:07:40 there were some new bugs filed last week in the l3 space
14:07:51 Bad network connection...
14:08:01 #link https://wiki.openstack.org/wiki/Network/Meetings#Bug_deputy
14:08:04 https://bugs.launchpad.net/neutron/+bug/1838699
14:08:05 Launchpad bug 1838699 in neutron "Removing a subnet from DVR router also removes DVR MAC flows for other router on network" [High,Confirmed]
14:08:51 slaweq confirmed it, but it will need an owner
14:08:58 My comment yesterday was not registered...
14:09:00 in this bug
14:09:09 #link https://bugs.launchpad.net/neutron/+bug/1838697
14:09:10 Launchpad bug 1838697 in neutron "DVR Mac conversion rules are only added for the first router a network is attached to" [Undecided,Incomplete]
14:09:14 this is also related.
14:09:33 if you have more than one DVR, the match flows will be the same
14:09:50 that means, when you delete the flows, you'll delete all of them
14:09:53 liuyulong_: ack, makes sense, i think they were both filed by the same person
14:10:05 (I'll write this comment again in the bug)
14:10:24 ralonsoh: thanks
14:10:46 ralonsoh, so it is designed like that, it is a feature?
14:10:58 I don't think so
14:11:11 that means we can't have more than one DVR per host
14:11:17 but I need confirmation
14:11:23 I'll write the comment again in the bug
14:12:19 OK, thank you, I read some code earlier, the ovs-agent will query the related flows by the port's subnet.
14:12:56 If there is more than one DVR port in the same subnet, it will indeed delete them all at once.
14:15:05 haleyb, I have a bad network connection now, please take over the meeting chair.
14:15:13 ack
14:15:32 next bug, https://bugs.launchpad.net/neutron/+bug/1838793
14:15:33 Launchpad bug 1838793 in neutron ""KeepalivedManagerTestCase" tests failing during namespace deletion" [High,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:15:46 https://review.opendev.org/#/c/674820/ was created - thanks ralonsoh
14:16:09 I need to check the CI again
14:16:51 ralonsoh: i've added myself to review so will look at next update
14:16:58 thanks!
14:18:11 next bug, https://bugs.launchpad.net/neutron/+bug/1838403
14:18:12 Launchpad bug 1838403 in neutron "Asymmetric floating IP notifications" [Medium,New]
14:19:10 i had triaged this last week and couldn't reproduce part of it. see now it was on queens, so perhaps part was fixed
14:20:11 still needs an owner to track down the other possible issues with notifications, if no one wants it i can take a look
14:20:20 How to "delete a router that still has fip"?
14:20:54 liuyulong_: right, it didn't work for me on master, but can't imagine it works on queens either
14:21:48 liuyulong_: the part we need to investigate is what happens when a VM is destroyed - is the floating IP in one of the messages? his trace showed it wasn't
14:22:52 If the VM port is deleted, l3_db will catch the port_delete notification and release the FIP.
14:24:20 liuyulong_: yes, but is that sending a notification?
14:25:49 * haleyb wonders how fast liuyulong_'s modem is :) (if anyone remembers what a modem is)
14:26:10 28.8 bauds per sec
14:26:20 zoom zoom
14:26:34 Subscrition is more accurate.
14:26:47 Subscription
14:27:11 liuyulong_: oh, maybe they didn't subscribe to all the events?
14:27:52 https://github.com/openstack/neutron/blob/master/neutron/db/l3_db.py#L1848
14:28:04 This is the line.
14:30:04 liuyulong_: there is nothing there regarding floating IP though, maybe i'm misunderstanding
14:30:30 https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/plugin.py#L1941-L1943
14:31:11 liuyulong_: if i'm listening for FLOATINGIP events shouldn't i get one when a port with an associated floating IP is deleted?
14:32:58 either way, please add a comment to the bug so maybe the submitter can track things down
14:33:30 I have no details about the FIP events now, but in my experience, the floating IP will finally get disassociated.
14:33:37 there was one more new bug
14:33:39 https://bugs.launchpad.net/neutron/+bug/1839004
14:33:40 Launchpad bug 1839004 in neutron "Rocky DVR-SNAT seems missing entries for conntrack marking" [Undecided,Incomplete]
14:34:30 don't know if tidwellr is here, but this looked like maybe a misconfiguration
14:34:48 * tidwellr is lurking
14:35:48 tidwellr: hi, and this involved dynamic-routing too, any thoughts based on last update?
14:36:33 there's still something to look into in that bug, supposedly there was an address scope mismatch and yet the API was reporting that it was finding a next-hop for the tenant subnet
14:36:43 that doesn't seem right
14:37:57 tidwellr: it could be correct if snat was enabled though i think, but he had it disabled
14:40:16 you can see that the uplink subnet is in the null address scope
14:40:57 then, he shows neutron-dynamic-routing returning a next-hop for his tenant subnet
14:41:23 that should only happen if the inside and outside subnets are both in the same address scope
14:41:25 so
14:41:48 either the info in the bug report is inaccurate, or there is still a bug
14:42:30 so is that a bug in neutron or dynamic-routing?
14:42:50 assuming the steps in the bug report are accurate, yes
14:43:42 not sure how that would slip through the tests though, I would have to really look closely at that
14:43:48 yes a bug in dynamic-routing?
14:44:05 yes
14:44:36 he ran "openstack bgp speaker list advertised routes" and it returned routes it shouldn't have had
14:44:58 tidwellr: should we re-assign? as the scoping issue looks like user error
14:46:02 I'll leave a comment, then try to reproduce this myself. It seems pretty straightforward given his instructions in the bug report
14:46:25 tidwellr: thanks
14:46:54 liuyulong_: i didn't have any other new bugs, did you have old ones you wanted to talk about? or anyone else?
14:47:20 Yes, I have
14:48:05 For the fix https://review.opendev.org/#/c/673557/ and the bug https://bugs.launchpad.net/neutron/+bug/1834308
14:48:06 Launchpad bug 1834308 in neutron "[DVR][DB] too many slow query during agent restart" [Medium,In progress] - Assigned to LIU Yulong (dragon889)
14:49:49 It is well tested locally. It does not break DVR functions. But I still hope to see more test results from our community.
14:50:27 The next is: https://bugs.launchpad.net/neutron/+bug/1828494
14:50:28 Launchpad bug 1828494 in neutron "[RFE][L3] l3-agent should have its capacity" [Wishlist,In progress] - Assigned to LIU Yulong (dragon889)
14:51:02 I have one question, how many routers do you guys think a network node can host? 100? 200? 1000?
14:51:32 no idea
14:51:41 when I ran network nodes in production, we capped it at 250
14:51:41 liuyulong_: there is no easy answer for that
14:52:02 i think the limiting factor always seems to be how long it takes to restart the agents
14:52:13 there are a lot of unique factors that go into that number we came up with though
14:52:53 and when I say we capped it, I mean we would add network nodes and rebalance routers to spread the load
14:53:15 I have one result, when the routers reach 300+, the ovs-agent will never restart successfully.
14:54:10 that's consistent with what I've observed (anecdotally)
14:54:43 My env is 17 physical hosts for dvr_snat nodes, with 2700+ routers, DHCP disabled.
14:54:58 Every ovs-agent will host about 1700+ ports!
14:55:30 Yes, I've tested 400+ ports for an ovs-agent once, it took about 40+ mins to restart.
14:57:04 The ovs-agent seems to get stuck easily in many code paths....
14:58:48 that is too long of course, should it be on the performance sub-team's list?
15:00:05 Should be, makes sense
15:01:02 liuyulong_: we're at time
15:01:17 OK
15:01:22 Let's end here.
15:01:33 bye
15:01:40 This nick does not have that right.
15:01:51 haleyb, please end our meeting, thank you.
15:02:00 #endmeeting
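
Note on the DVR MAC flow discussion (bugs 1838699 and 1838697): the point made at 14:09:33 and 14:09:50, that two DVR routers on the same network end up with the same match flows so deleting the flows for one router removes them for both, can be sketched with a toy model. This is only an illustration under stated assumptions and is not neutron or ovs-agent code: the class names, the `("subnet", "10.0.0.0/24")` match tuple, and the router names are all hypothetical. The second class, which tracks which routers still need a flow, is shown purely for contrast; it is not a fix that was proposed in the meeting.

```python
from collections import defaultdict


class NaiveFlowTable:
    """Toy flow table keyed only by match (hypothetical, not neutron code)."""

    def __init__(self):
        self.flows = set()

    def add(self, match, owner):
        # The owner is ignored: routers with an identical match share one flow.
        self.flows.add(match)

    def delete(self, match, owner):
        # Deleting by match removes the flow for every router that relied on it.
        self.flows.discard(match)

    def has(self, match):
        return match in self.flows


class OwnerTrackingFlowTable:
    """Contrast case: remember which routers still need each match."""

    def __init__(self):
        self.owners = defaultdict(set)

    def add(self, match, owner):
        self.owners[match].add(owner)

    def delete(self, match, owner):
        self.owners[match].discard(owner)
        # Drop the flow only when no router needs it any more.
        if not self.owners[match]:
            del self.owners[match]

    def has(self, match):
        return match in self.owners


def remove_one_router(table):
    """Attach two routers to one subnet, detach one, report if the flow survives."""
    match = ("subnet", "10.0.0.0/24")  # hypothetical stand-in for an OpenFlow match
    table.add(match, "router-a")
    table.add(match, "router-b")
    table.delete(match, "router-a")
    return table.has(match)


if __name__ == "__main__":
    print("naive table keeps flow for router-b:",
          remove_one_router(NaiveFlowTable()))
    print("owner-tracking table keeps flow for router-b:",
          remove_one_router(OwnerTrackingFlowTable()))
```

In the naive table the flow is gone after the first router is detached, which mirrors the behaviour described in the bug title; whether the real agent should track owners, use distinct matches per router, or do something else was left open in the meeting.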