14:00:14 <liuyulong> #startmeeting neutron_l3
14:00:15 <openstack> Meeting started Wed Mar 11 14:00:14 2020 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:16 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:18 <openstack> The meeting name has been set to 'neutron_l3'
14:00:24 <liuyulong> #topic Announcements
14:00:25 <ralonsoh> hi
14:00:33 <liuyulong> ralonsoh, hi
14:01:11 <liuyulong> #link http://eavesdrop.openstack.org/meetings/networking/2020/networking.2020-03-10-14.00.log.html#l-10
14:01:45 <slaweq> hi
14:02:02 <liuyulong> I guess we can skip this topic today, everything was discussed yesterday.
14:02:20 <liuyulong> slaweq, hi
14:02:32 <liuyulong> OK, let's move on
14:02:36 <liuyulong> #topic Bugs
14:02:50 <liuyulong> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-March/013177.html
14:03:09 <liuyulong> Bence was our bug deputy last week.
14:03:29 <liuyulong> So the first one:
14:03:44 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1866336
14:03:46 <openstack> Launchpad bug 1866336 in neutron "Binding of floating ip agent gateway port and agent_id isn't removed " [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:03:55 <liuyulong> slaweq is working on this now.
14:04:17 <liuyulong> Code looks good to me, just one small concern about a DB query. : )
14:04:41 <liuyulong> I have left a comment on the gerrit:
14:05:06 <ralonsoh> I agree with your comment, but I'm still reviewing the patch
14:05:08 <liuyulong> #link https://review.opendev.org/#/c/711623/3/neutron/db/l3_dvr_db.py@371
14:05:09 <slaweq> liuyulong: thx, I will check it after today's meetings
14:05:48 <liuyulong> slaweq, yw
14:05:52 <liuyulong> OK, next one
14:06:08 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1866560
14:06:09 <openstack> Launchpad bug 1866560 in neutron "FIP Port forwarding description API extension don't work" [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:06:29 <liuyulong> Also assigned to slaweq
14:07:13 <slaweq> yes, this one requires new neutron-lib version
14:07:14 <liuyulong> One small thing is we have 2 bps for this.
14:07:54 <ralonsoh> the correct one related to this RFE is https://blueprints.launchpad.net/neutron/+spec/portforwarding-description
14:08:50 <liuyulong> So since we are still in the development cycle, I'd like to say we should add this bp link to the commit message to trace all the code.
14:09:13 <slaweq> liuyulong: ok, I will add link to BPs to this patch
14:10:20 <liuyulong> slaweq, sure, so both links are fine with me, you could choose either one you want. : )
14:10:44 <slaweq> I'm using https://blueprints.launchpad.net/neutron/+spec/fip-pf-description to track this feature
14:10:53 <slaweq> actually I wasn't aware of the other one :)
14:12:01 <liuyulong> Next one, we discussed it last week.
14:12:16 <liuyulong> during the LP bug scanning
14:12:18 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1865891
14:12:20 <openstack> Launchpad bug 1865891 in neutron "Race condition during removal of subnet from the router and removal of subnet" [Medium,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
14:13:04 <liuyulong> ralonsoh had left a potential solution
14:13:08 <slaweq> it's exactly like ralonsoh described
14:13:21 <ralonsoh> I tested this possible solution and it works
14:13:33 <ralonsoh> I like to use the DB as a lock
14:13:47 <slaweq> ralonsoh: do You have patch for that already?
14:13:50 <ralonsoh> but, of course, this will add some extra time in the process
14:13:56 <ralonsoh> slaweq, I don't but I can propose it
14:14:08 <ralonsoh> do you want me to do this?
14:14:21 <slaweq> ralonsoh: if You can, that would be great
14:14:28 <ralonsoh> perfect
14:14:35 <slaweq> I did debugging process but I didn't propose any patch yet
14:15:04 <slaweq> and btw. it's IMO even Low instead of Medium bug
14:15:25 <slaweq> as it is very unlikely to happen in real world scenario
14:15:26 <liuyulong> ralonsoh, so every port creation will have to check this DB lock?
14:15:52 <ralonsoh> liuyulong, yes, the transaction will lock any other one modifying/deleting this register
14:16:04 <slaweq> ralonsoh: there is one thing which bothers me with this
14:16:13 <slaweq> is this only related to router's ports?
14:16:27 <slaweq> or can it be the same when we are creating "regular" ports
14:16:46 <slaweq> (I didn't try to reproduce it with port-create command)
14:17:01 <liuyulong> yes, if it is every port creation, nova may not be happy when booting large sets of VMs.
14:17:03 <ralonsoh> let me check.... but this will work for any port requesting an IP from IPAM
14:18:02 <ralonsoh> yes, that will work for any port creation
14:18:10 <ralonsoh> the subnets will be locked during this process
14:18:58 <slaweq> ralonsoh: if You can push some PS with fix for it, that would be great
14:19:11 <ralonsoh> slaweq, sure, tomorrow
14:19:11 <slaweq> we can then maybe compare its performance impact on rally job
14:19:25 <slaweq> and see how much cost it has for us :)
14:19:32 <slaweq> what do You think?
14:19:43 <ralonsoh> rally is the best tool for this
14:20:14 <liuyulong> OK, but again, we should find the balance between fixing that and warning the users not to do that.
14:20:23 <liuyulong> : )
14:20:32 <ralonsoh> well, that's difficult
14:20:41 <ralonsoh> you can always send a batch of commands
14:20:52 <ralonsoh> of course, this should be executed at the same time
14:21:07 <ralonsoh> but the MUST in our code is reliability
14:22:02 <slaweq> ralonsoh++
14:22:37 <liuyulong> Makes sense
14:22:48 <liuyulong> OK, next one:
14:22:49 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1866077
14:22:50 <openstack> Launchpad bug 1866077 in neutron "[L3][IPv6] IPv6 traffic with DVR in compute host" [Undecided,Incomplete]
14:23:00 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1774463
14:23:00 <openstack> Launchpad bug 1774463 in neutron "RFE: Add support for IPv6 on DVR Routers for the Fast-path exit " [High,In progress] - Assigned to LIU Yulong (dragon889)
14:23:18 <liuyulong> It is related to 1774463, as haleyb|away pointed out.
14:24:44 <liuyulong> I filed bug/1866077 because we found that IPv6 in DVR mode is missing some address scope marks, which blocks the traffic from the FIP-namespace.
14:26:00 <liuyulong> We have a fix for that, and the interesting thing is that Swami’s patch does the same:
14:26:03 <liuyulong> #link https://review.opendev.org/#/c/662111/33/neutron/agent/l3/dvr_local_router.py@628
14:26:32 <liuyulong> So, I may want to take his patch and continue the work.
14:26:43 <slaweq> liuyulong: that's very old patch :)
14:27:03 <liuyulong> Yes, I've rebased that on the master branch.
14:27:47 <liuyulong> It is a really nice feature for IPv6 with DVR.
14:28:12 <liuyulong> IPv6 traffic can be distributed if users want to.
14:29:30 <liuyulong> And IPv6 could be the future, maybe someday IPv4 support will be dropped like py27. : )
14:29:49 <slaweq> liuyulong: You wish :P
14:30:24 <liuyulong> OK, last one
14:30:25 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1866445
14:30:26 <openstack> Launchpad bug 1732067 in OpenStack Security Notes "duplicate for #1866445 openvswitch firewall flows cause flooding on integration bridge" [Undecided,New]
14:30:42 <liuyulong> It is more like an L2 bug.
14:30:58 <liuyulong> But the reporter left something related to the L3 router.
14:30:59 <liuyulong> https://bugs.launchpad.net/neutron/+bug/1866445/comments/3
14:32:49 <liuyulong> I don't have the full picture of this bug. Will need more information about it.
14:33:07 <ralonsoh> I need to check why Yi is claiming about the MAC learning
14:37:37 <liuyulong> OK, that's all bugs from me.
14:37:46 <slaweq> I want to talk again about one
14:38:02 <slaweq> https://bugs.launchpad.net/neutron/+bug/1859832
14:38:04 <openstack> Launchpad bug 1859832 in neutron "L3 HA connectivity to GW port can be broken after reboot of backup node" [Medium,In progress] - Assigned to LIU Yulong (dragon889)
14:38:04 <slaweq> :)
14:38:26 <slaweq> liuyulong: I was again looking at Your patch and testing it
14:39:08 <slaweq> and I still have worries that it may introduce new race conditions and regression during the failover
14:39:43 <slaweq> I wrote it in my reply to Your comment https://review.opendev.org/#/c/707406/9/neutron/agent/l3/keepalived_state_change.py@98
14:40:54 <liuyulong> The first one has a config option for keepalived, 'vrrp_garp_interval'.
14:41:10 <liuyulong> We talked about that last week.
14:41:48 <slaweq> yes, but it's interval between garp packets sent by keepalived, right?
14:41:54 <liuyulong> But I'm not going to add it to this patch, it will be a follow-up one.
14:42:25 <slaweq> I still don't see how it may solve potential race condition between sending garps and setting interface to be up
14:44:48 <liuyulong> There are 3 GARP sends for an HA router.
14:45:21 <slaweq> yes, but as I said in the reply to Your comment: first one will always fail
14:46:00 <slaweq> second one is after 10 seconds (so 10 seconds of datapath break, right?) and You have no guarantee that interfaces will be already up then
14:46:18 <slaweq> and third one should be IMO removed from neutron-keepalived-state-change code :)
14:48:52 <liuyulong> The link up action for a linux device is not time-consuming, so for the first one, setting vrrp_garp_interval=1 could be enough.
14:49:14 <liuyulong> Keepalived will send GARP 5 times when it hits the state chagne.
14:49:19 <liuyulong> s/change
14:49:47 <slaweq> and what about longer time of failover?
14:49:48 <liuyulong> So the first one fails due to link not up, wait one second, and next one.
14:50:04 <slaweq> ralonsoh: what do You think?
14:50:34 <ralonsoh> I still think that we should not rely on the garp
14:50:58 <ralonsoh> and using slaweq's patch but implementing liuyulong's linuxinterface change
14:51:02 <liuyulong> And one more thing about this: the outside world has ARP.
14:51:16 <ralonsoh> we can have an easy architecture and we handle all backends
14:52:18 <slaweq> ralonsoh: now I'm a bit lost, I don't understand why we would need my patch with part of the liuyulong's change
14:52:35 <ralonsoh> https://review.opendev.org/#/c/707406/9/neutron/agent/linux/interface.py
14:52:46 <liuyulong> For the 3rd GARP, IMO, removing it is not so good for neutron. But we could move it to the L3-agent state change code path, not the state change daemon.
14:52:57 <ralonsoh> you should implement this change in your patch, slaweq
14:53:12 <slaweq> ralonsoh: but why?
14:53:34 <ralonsoh> to handle all possible interfaces
14:53:39 <ralonsoh> ovs, lb, etc
14:53:48 <ralonsoh> this is the main problem you see in your patch
14:54:46 <slaweq> ralonsoh: ok, I think I need to talk offline with You about it later :)
14:54:51 <ralonsoh> sure
14:55:14 <slaweq> I don't want to steal all remaining time so please continue with other topics now
14:58:14 <liuyulong> OK, we could continue the code discussion on the patch set.
14:58:30 <liuyulong> Actually we don't have much time now.
14:58:34 <slaweq> liuyulong: yes :)
14:58:40 <liuyulong> #topic On demand agenda
14:59:20 <liuyulong> Alright, let's end here.
14:59:27 <liuyulong> Thank you guys.
14:59:32 <liuyulong> Bye
14:59:37 <liuyulong> #endmeeting
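
Note on the GARP timing argument in the bug 1859832 discussion: the exchange between liuyulong and slaweq can be sketched as a toy model. It assumes GARPs go out at a fixed interval starting at the state change (time 0) and that a GARP only succeeds if the link is already up when it is sent. The function name, its parameters, and the scheduling model itself are hypothetical illustrations; this is not how keepalived or neutron-keepalived-state-change actually schedule packets.

```python
from typing import Optional


def first_garp_after_link_up(link_up_at: float, repeat: int = 5,
                             interval: float = 1.0) -> Optional[float]:
    """Toy model (assumption, not keepalived's real scheduler).

    GARPs are sent at 0, interval, 2 * interval, ... for `repeat` sends.
    Returns the send time, in seconds after the state change, of the first
    GARP sent once the link is up, or None if every GARP is sent while
    the link is still down.
    """
    for attempt in range(repeat):
        sent_at = attempt * interval
        if sent_at >= link_up_at:
            return sent_at
    return None


# liuyulong's case: link comes up quickly, 5 sends one second apart.
print(first_garp_after_link_up(link_up_at=0.2))
# slaweq's case: the next send is only 10 seconds later.
print(first_garp_after_link_up(link_up_at=0.2, interval=10.0))
# slaweq's worry: the link comes up after the last send.
print(first_garp_after_link_up(link_up_at=6.0))
```

Under these assumptions a short interval bounds the datapath break to roughly one interval, while a link that comes up after the last send is never announced. That last case is the race slaweq describes, and it is why ralonsoh prefers not to rely on GARP alone.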