14:00:14 #startmeeting neutron_l3
14:00:15 Meeting started Wed Mar 11 14:00:14 2020 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 The meeting name has been set to 'neutron_l3'
14:00:24 #topic Announcements
14:00:25 hi
14:00:33 ralonsoh, hi
14:01:11 #link http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-03-10-14.00.log.html#l-10
14:01:45 hi
14:02:02 I guess we can skip this topic today, everything was discussed yesterday.
14:02:20 slaweq, hi
14:02:32 OK, let's move on
14:02:36 #topic Bugs
14:02:50 #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013177.html
14:03:09 Bence was our bug deputy last week.
14:03:29 So the first one:
14:03:44 #link https://bugs.launchpad.net/neutron/+bug/1866336
14:03:46 Launchpad bug 1866336 in neutron "Binding of floating ip agent gateway port and agent_id isn't removed " [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:03:55 slaweq is working on this now.
14:04:17 Code looks good to me, just one small concern about a DB query. : )
14:04:41 I have left a comment on the gerrit:
14:05:06 I agree with your comment, but I'm still reviewing the patch
14:05:08 #link https://review.opendev.org/#/c/711623/3/neutron/db/l3_dvr_db.py@371
14:05:09 liuyulong: thx, I will check it after today's meetings
14:05:48 slaweq, yw
14:05:52 OK, next one
14:06:08 #link https://bugs.launchpad.net/neutron/+bug/1866560
14:06:09 Launchpad bug 1866560 in neutron "FIP Port forwarding description API extension don't work" [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:06:29 Also assigned to slaweq
14:07:13 yes, this one requires a new neutron-lib version
14:07:14 One small thing: we have 2 BPs for this.
14:07:54 the correct one related to this RFE is https://blueprints.launchpad.net/neutron/+spec/portforwarding-description
14:08:50 So since we are still in the development cycle, I'd like to say we should add this BP link to the commit message to track all the code.
14:09:13 liuyulong: ok, I will add a link to the BPs to this patch
14:10:20 slaweq, sure, both links are fine with me, you can choose either one. : )
14:10:44 I'm using https://blueprints.launchpad.net/neutron/+spec/fip-pf-description to track this feature
14:10:53 actually I wasn't aware of the other one :)
14:12:01 Next one, we discussed it last week.
14:12:16 during the LP bug scanning
14:12:18 #link https://bugs.launchpad.net/neutron/+bug/1865891
14:12:20 Launchpad bug 1865891 in neutron "Race condition during removal of subnet from the router and removal of subnet" [Medium,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
14:13:04 ralonsoh had left a potential solution
14:13:08 it's exactly like ralonsoh described
14:13:21 I tested this possible solution and it works
14:13:33 I like to use the DB as a lock
14:13:47 ralonsoh: do You have a patch for that already?
14:13:50 but, of course, this will add some extra time in the process
14:13:56 slaweq, I don't but I can propose it
14:14:08 do you want me to do this?
14:14:21 ralonsoh: if You can, that would be great
14:14:28 perfect
14:14:35 I did the debugging but I didn't propose any patch yet
14:15:04 and btw. it's IMO even a Low instead of Medium bug
14:15:25 as it is very unlikely to happen in a real world scenario
14:15:26 ralonsoh, so every port creation will have to check this DB lock?
14:15:52 liuyulong, yes, the transaction will lock out any other one modifying/deleting this record
14:16:04 ralonsoh: there is one thing which bothers me with this
14:16:13 is this only related to router's ports?
14:16:27 or can it be the same when we are creating "regular" ports
14:16:46 (I didn't try to reproduce it with the port-create command)
14:17:01 yes, if it is every port creation, nova may not be happy when booting large sets of VMs.
14:17:03 let me check.... but this will work for any port requesting an IP from IPAM
14:18:02 yes, that will work for any port creation
14:18:10 the subnets will be locked during this process
14:18:58 ralonsoh: if You can push some PS with a fix for it, that would be great
14:19:11 slaweq, sure, tomorrow
14:19:11 we can then maybe compare its performance impact on the rally job
14:19:25 and see how much it costs us :)
14:19:32 what do You think?
14:19:43 rally is the best tool for this
14:20:14 OK, but again, we should find the balance between fixing that and warning users not to do that.
14:20:23 : )
14:20:32 well, that's difficult
14:20:41 you can always send a batch of commands
14:20:52 of course, this should be executed at the same time
14:21:07 but the MUST in our code is reliability
14:22:02 ralonsoh++
14:22:37 Makes sense
14:22:48 OK, next one:
14:22:49 #link https://bugs.launchpad.net/neutron/+bug/1866077
14:22:50 Launchpad bug 1866077 in neutron "[L3][IPv6] IPv6 traffic with DVR in compute host" [Undecided,Incomplete]
14:23:00 #link https://bugs.launchpad.net/neutron/+bug/1774463
14:23:00 Launchpad bug 1774463 in neutron "RFE: Add support for IPv6 on DVR Routers for the Fast-path exit " [High,In progress] - Assigned to LIU Yulong (dragon889)
14:23:18 It is related to 1774463, as haleyb|away pointed out.
14:24:44 I filed bug/1866077 because we found that IPv6 in DVR mode is missing some address scope marks, which blocks the traffic from the FIP namespace.
14:26:00 We have a fix for that, and the interesting thing is Swami's patch also does that:
14:26:03 #link https://review.opendev.org/#/c/662111/33/neutron/agent/l3/dvr_local_router.py@628
14:26:32 So, I may want to take his patch and continue the work.
14:26:43 liuyulong: that's a very old patch :)
14:27:03 Yes, I've rebased that on the master branch.
14:27:47 It is a really nice feature for IPv6 with DVR.
14:28:12 IPv6 traffic can be distributed if users want to.
14:29:30 And IPv6 could be the future, maybe someday IPv4 support will be dropped like py27. : )
14:29:49 liuyulong: You wish :P
14:30:24 OK, last one
14:30:25 #link https://bugs.launchpad.net/neutron/+bug/1866445
14:30:26 Launchpad bug 1732067 in OpenStack Security Notes "duplicate for #1866445 openvswitch firewall flows cause flooding on integration bridge" [Undecided,New]
14:30:42 It is more like an L2 bug.
14:30:58 But the reporter left something related to the L3 router.
14:30:59 https://bugs.launchpad.net/neutron/+bug/1866445/comments/3
14:32:49 I don't have the full picture of this bug. Will need more information about it.
14:33:07 I need to check why Yi is claiming about the MAC learning
14:37:37 OK, that's all bugs from me.
14:37:46 I want to talk again about one
14:38:02 https://bugs.launchpad.net/neutron/+bug/1859832
14:38:04 Launchpad bug 1859832 in neutron "L3 HA connectivity to GW port can be broken after reboot of backup node" [Medium,In progress] - Assigned to LIU Yulong (dragon889)
14:38:04 :)
14:38:26 liuyulong: I was again looking at Your patch and testing it
14:39:08 and I still have worries that it may introduce new race conditions and a regression during the failover
14:39:43 I wrote it in my reply to Your comment https://review.opendev.org/#/c/707406/9/neutron/agent/l3/keepalived_state_change.py@98
14:40:54 The first one has a config option for keepalived, 'vrrp_garp_interval'.
14:41:10 We talked about that last week.
14:41:48 yes, but it's the interval between garp packets sent by keepalived, right?
14:41:54 But I'm not going to add it to this patch, it will be a follow-up one.
14:42:25 I still don't see how it may solve the potential race condition between sending garps and setting the interface to be up
14:44:48 GARP is sent 3 times for an HA router.
14:45:21 yes, but as I said in the reply to Your comment: the first one will always fail
14:46:00 the second one is after 10 seconds (so 10 seconds of datapath break, right?) and You have no guarantee that the interfaces will already be up then
14:46:18 and the third one should IMO be removed from the neutron-keepalived-state-change code :)
14:48:52 The link up action for a linux device is not time-consuming, so for the first one, configuring vrrp_garp_interval=1 could be enough.
14:49:14 keepalived will send GARP 5 times when it hits the state chagne.
14:49:19 s/change
14:49:47 and what about a longer failover time?
14:49:48 So the first one fails due to the link not being up, wait one second, and then the next one.
14:50:04 ralonsoh: what do You think?
14:50:34 I still think that we should not rely on the garp
14:50:58 and using slaweq's patch but implementing liuyulong's linuxinterface change
14:51:02 And one more thing: the outside world has ARP.
14:51:16 we can have an easy architecture and we handle all backends
14:52:18 ralonsoh: now I'm a bit lost, I don't understand why we would need my patch with part of liuyulong's change
14:52:35 https://review.opendev.org/#/c/707406/9/neutron/agent/linux/interface.py
14:52:46 For the 3rd GARP, IMO, removing it is not so good for neutron. But we could move it to the L3-agent state change code path, not the state change daemon.
14:52:57 you should implement this change in your patch, slaweq
14:53:12 ralonsoh: but why?
14:53:34 to handle all possible interfaces
14:53:39 ovs, lb, etc
14:53:48 this is the main problem you see in your patch
14:54:46 ralonsoh: ok, I think I need to talk offline with You about it later :)
14:54:51 sure
14:55:14 I don't want to steal all the remaining time so please continue with other topics now
14:58:14 OK, we could continue the code discussion on the patch set.
14:58:30 Actually we don't have much time now.
14:58:34 liuyulong: yes :)
14:58:40 #topic On demand agenda
14:59:20 Alright, let's end here.
14:59:27 Thank you guys.
14:59:32 Bye
14:59:37 #endmeeting
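
Note on the GARP timing discussed between 14:44 and 14:49 (bug 1859832): the disagreement comes down to how long the datapath stays broken when the first GARP is sent before the interface is up. The sketch below is only a toy model of that argument, not neutron or keepalived code. The function names are hypothetical, and it assumes that a GARP sent while the link is still down is simply lost. The numbers (5 GARPs one second apart, or a second GARP after 10 seconds) are the ones mentioned in the meeting.

```python
from typing import List, Optional, Sequence


def garp_send_times(count: int, interval: float, start: float = 0.0) -> List[float]:
    """Hypothetical helper: seconds after the state change at which each GARP is sent."""
    return [start + i * interval for i in range(count)]


def first_effective_garp(send_times: Sequence[float], link_up_at: float) -> Optional[float]:
    """Return the send time of the first GARP that leaves after the link is up.

    Assumption (not from neutron/keepalived code): a GARP sent while the
    interface is still down is lost, so only later sends can restore traffic.
    Returns None when every GARP is sent before the link comes up.
    """
    for t in sorted(send_times):
        if t >= link_up_at:
            return t
    return None


# liuyulong's case: 5 GARPs, 1 second apart, link up shortly after the first one.
print(first_effective_garp(garp_send_times(5, 1.0), link_up_at=0.2))  # 1.0

# slaweq's case: the second GARP only comes 10 seconds after the first.
print(first_effective_garp([0.0, 10.0], link_up_at=0.2))  # 10.0
```

In this model the outage lasts until the returned time, which is why the interval matters so much, and a `None` result corresponds to slaweq's concern that nothing guarantees the interfaces are up before the last GARP goes out.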