14:02:14 #startmeeting neutron_l3
14:02:14 Meeting started Wed Feb 24 14:02:14 2021 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:15 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:18 The meeting name has been set to 'neutron_l3'
14:02:27 Long time no see. :)
14:02:50 hi
14:03:26 OK, let's start.
14:04:04 #topic Bugs
14:04:17 #link https://bugs.launchpad.net/neutron/+bug/1913621
14:04:19 Launchpad bug 1913646 in neutron "duplicate for #1913621 DVR router ARP traffic broken for networks containing multiple subnets" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:04:21 #link https://bugs.launchpad.net/neutron/+bug/1913646
14:04:22 Launchpad bug 1913646 in neutron "DVR router ARP traffic broken for networks containing multiple subnets" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:04:47 This was fixed by changing the ARP reply destination MAC to the router gateway.
14:05:12 But the bug reporter said that 1913646 is a bit different.
14:05:38 Sorry, it's 1913621
14:06:12 The main issue of bug/1913621 is why the permanent ARP entry was not added.
14:06:48 Hi
14:07:10 Hi, lajoskatona
14:07:19 So I just removed the duplicate mark.
14:07:43 I agree with that point, the main problem of 1913621 may exist in DVR related code.
14:10:22 I will revisit this bug 1913621 and try to reproduce it.
14:10:38 next
14:10:39 #link https://bugs.launchpad.net/neutron/+bug/1916022
14:10:40 Launchpad bug 1916022 in neutron "L3HA Race condition during startup of the agent may cause inconsistent router's states" [Low,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:11:11 #link https://review.opendev.org/c/openstack/neutron/+/776423
14:11:14 The patch is here.
14:11:39 I've read the code, but have not tested it yet.
14:12:27 A "race condition" bug is sometimes not so easy to test, IMO.
14:13:22 yes we found it with tobiko tests
14:14:59 those tobiko tests can be very useful for us in some cases :)
14:15:41 maybe run that tobiko job a few times to verify the fix
14:16:00 liuyulong: I did
14:16:02 https://review.opendev.org/c/openstack/neutron/+/776284/5
14:16:11 this is "test patch" which runs tobiko jobs
14:16:19 and it passed many times already
14:16:31 so for me it's clearly solving the issue
14:16:54 Cool
14:17:12 Yep, Great work!
14:17:23 thx
14:18:30 Just posted my +2
14:18:39 thx
14:18:40 OK, next one
14:18:43 #link https://bugs.launchpad.net/neutron/+bug/1916024
14:18:45 Launchpad bug 1916024 in neutron "HA router master instance in error state because qg-xx interface is down" [High,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:19:23 #link https://review.opendev.org/c/openstack/neutron/+/776427
14:19:34 The fix ^
14:20:32 The fix is simple, just use the normal workaround method "retry".
14:20:49 #link https://review.opendev.org/c/openstack/neutron/+/776427/4//COMMIT_MSG
14:20:56 I have left some comments here.
14:21:14 #link http://paste.openstack.org/show/802779/
14:21:55 There are some actions " DelPortCommand(port=qg-3e872c7f-68" and "AddPortCommand(bridge=br-int, port=qg-3e872c7f-68"
14:22:30 I don't know if these methods can be the root cause, but it is really close to the behavior we found.
14:23:53 While ovsdbapp is doing the "delete and add" work, the privsep daemon is trying to run the "ip link" related command.
14:28:30 So, maybe we can refactor that replace_port method in a more graceful way: not delete the port, but clear its attributes and reset them.
14:28:49 Just some thoughts, not sure if it really works.
14:29:03 OK, no more bugs from me then.
14:29:11 Any updates?
14:31:10 OK, let's move on
14:31:19 #topic distributed_dhcp
14:31:39 I have uploaded all code.
14:31:41 #link https://review.opendev.org/q/topic:%22bp%252Fdistributed-dhcp-for-ml2-ovs%22+(status:open%20OR%20status:merged)
14:32:10 #link https://review.opendev.org/c/openstack/neutron/+/776568
14:32:34 The fullstack test case passes locally in my devstack environment.
14:32:51 I'm not quite sure why the upstream is failing.
14:33:07 One issue may be the DHCP client version.
14:35:19 Since we use namespace as the fake VM, the dhcp client should be from "Linux ubuntu-focal-ovh-gra1-0023152345 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux"
14:35:54 Maybe I should run a devstack deployment in ubuntu to verify this case.
14:37:16 Another concern is the protocol coverage of the DHCPv4/v6 responder.
14:38:21 All these options and replies are based on the dnsmasq example of dhcp-instance from DHCP-agent.
14:39:00 We use Wireshark to verify and compare all related options from distributed_dhcp and dnsmasq
14:39:14 Maybe it's not enough.
14:40:18 So, any comments/problems/testing/issues you have about the DHCPv4/v6 responder are welcome.
14:40:38 i wonder if looking at things in an OVN environment would help? at least the flows?
14:40:54 This is the main part of the distributed DHCP.
14:41:24 haleyb, I did that, seems negative.
14:41:47 The flows from OVN and OVS are totally different.
14:42:17 Actually for the implementation here for ovs agent, we only add one flow for DHCP request.
14:42:28 "submit to controller, aka ovs-agent"
14:42:56 OVN's flow has some user data, then upload to ovn-controller.
14:43:43 ack, just thinking out loud
14:43:50 ovn-controller can read that userdata, but ovs-agent with ryu app does not.
14:45:33 Last thing about this bp is that config option...
14:45:41 "disable_traditional_dhcp" or "enable_traditional_dhcp"
14:46:36 I'm still thinking that we should not add a config option to "enable neutron's default behavior by default".
14:46:39 I would say keep the "legacy" dhcp as the default anyway, otherwise I am fine with either
14:47:34 The original purpose and main aim is to "disable" something.
14:49:27 lajoskatona, yes, we will make sure of that.
14:49:27 liuyulong: yes, it seems a little backwards, but to me it looks similar to other config options we've added where the default is True
14:50:33 for example, we've done that when we wanted to add a new thing then backport the config option, not that this is the same case
14:51:00 i.e. set the option to what the current default is - enabling dhcp-agent
14:51:27 i don't think the "votes" agreed on either direction in the review
14:53:09 This option should not be backported to stable branch. : )
14:53:45 Maybe someday this "disable_traditional_dhcp = True" will be default value.
14:55:12 OK, time is running out.
14:55:20 or the enable default is False :) maybe i should ask someone that works on tripleo what they think, since they're maybe more in-tune with exposing config options to customers
14:56:05 haleyb, great, thank you.
14:56:10 We can continue the discussion on the gerrit.
14:56:23 sure
14:56:44 #topic On demand agenda
14:56:50 I have one update here:
14:57:19 The spec for "elastic snat" https://review.opendev.org/c/openstack/neutron-specs/+/770540
14:57:59 Reviews are welcome.
14:59:45 OK, thank you guys
14:59:51 Let's end here.
14:59:53 Bye
14:59:59 Bye
15:00:05 #endmeeting
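
The "retry" workaround mentioned at 14:20:32 for bug 1916024 can be sketched roughly as follows. This is only an illustration of the general pattern, not the code in review 776427: `retry_on`, `DeviceNotReady`, `set_link_up` and the device name `qg-example` are all hypothetical, and the stand-in function merely simulates an interface that is briefly missing while a port is deleted and re-added.

```python
import functools
import time


class DeviceNotReady(Exception):
    """Hypothetical error: the interface is briefly missing."""


def retry_on(exc_type, attempts=3, delay=0.0):
    """Retry the wrapped call when exc_type is raised, up to `attempts` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == attempts:
                        raise  # out of attempts: surface the error
                    time.sleep(delay)
        return wrapper
    return decorator


# Counts how often the stand-in below is invoked (illustration only).
calls = {"count": 0}


@retry_on(DeviceNotReady, attempts=3)
def set_link_up(device):
    """Stand-in for an "ip link set <device> up" call; fails on the first try."""
    calls["count"] += 1
    if calls["count"] < 2:
        raise DeviceNotReady(device)
    return f"{device} is up"


result = set_link_up("qg-example")
```

A retry like this only hides the window in which the port does not exist, which is why the discussion at 14:28:30 also considers avoiding the delete-and-add step altogether.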