14:02:14 <liuyulong> #startmeeting neutron_l3
14:02:14 <openstack> Meeting started Wed Feb 24 14:02:14 2021 UTC and is due to finish in 60 minutes. The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:18 <openstack> The meeting name has been set to 'neutron_l3'
14:02:27 <liuyulong> Long time no see. :)
14:02:50 <slaweq> hi
14:03:26 <liuyulong> OK, let's start.
14:04:04 <liuyulong> #topic Bugs
14:04:17 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1913621
14:04:19 <openstack> Launchpad bug 1913646 in neutron "duplicate for #1913621 DVR router ARP traffic broken for networks containing multiple subnets" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:04:21 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1913646
14:04:22 <openstack> Launchpad bug 1913646 in neutron "DVR router ARP traffic broken for networks containing multiple subnets" [Medium,Confirmed] - Assigned to LIU Yulong (dragon889)
14:04:47 <liuyulong> This was fixed by changing the ARP reply's destination MAC to the router gateway MAC.
14:05:12 <liuyulong> But the bug reporter said that 1913646 is a bit different.
14:05:38 <liuyulong> Sorry, it's 1913621
14:06:12 <liuyulong> The main issue of bug 1913621 is why the permanent ARP entry was not added.
14:06:48 <lajoskatona> Hi
14:07:10 <liuyulong> Hi, lajoskatona
14:07:19 <liuyulong> So I just removed the duplicate mark.
14:07:43 <liuyulong> I agree with that point; the main problem of 1913621 may be in the DVR-related code.
14:10:22 <liuyulong> I will revisit bug 1913621 and try to reproduce it.
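[Editor's note] The fix discussed above, rewriting the ARP reply's destination MAC to the router gateway MAC, is done with OpenFlow rules in Neutron. Purely as an illustration of the idea, here is a stdlib-only sketch that rewrites the destination MAC of a raw Ethernet frame; the MAC addresses and frame contents are made up:

```python
# Ethernet header: 6-byte dst MAC, 6-byte src MAC, 2-byte ethertype.
def set_dst_mac(frame: bytes, gateway_mac: str) -> bytes:
    """Return a copy of the frame with its destination MAC replaced."""
    new_dst = bytes(int(octet, 16) for octet in gateway_mac.split(":"))
    assert len(new_dst) == 6
    return new_dst + frame[6:]

# Hypothetical ARP reply (ethertype 0x0806) headed for a VM's MAC;
# the DVR fix redirects it to the router gateway MAC instead.
frame = bytes.fromhex("fa163e000001" "fa163e000002" "0806") + b"\x00" * 28
fixed = set_dst_mac(frame, "fa:16:3e:00:00:99")
print(fixed[:6].hex(":"))  # -> fa:16:3e:00:00:99
```

The rest of the frame (source MAC, ethertype, ARP payload) is left untouched; only the L2 destination is steered to the gateway.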
14:10:38 <liuyulong> next
14:10:39 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1916022
14:10:40 <openstack> Launchpad bug 1916022 in neutron "L3HA Race condition during startup of the agent may cause inconsistent router's states" [Low,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:11:11 <liuyulong> #link https://review.opendev.org/c/openstack/neutron/+/776423
14:11:14 <liuyulong> The patch is here.
14:11:39 <liuyulong> I've read the code, but not tested it yet.
14:12:27 <liuyulong> A race condition bug is sometimes not so easy to test, IMO.
14:13:22 <slaweq> yes, we found it with tobiko tests
14:14:59 <slaweq> those tobiko tests can be very useful for us in some cases :)
14:15:41 <liuyulong> maybe run the tobiko job a few times to verify the fix
14:16:00 <slaweq> liuyulong: I did
14:16:02 <slaweq> https://review.opendev.org/c/openstack/neutron/+/776284/5
14:16:11 <slaweq> this is a "test patch" which runs the tobiko jobs
14:16:19 <slaweq> and it has passed many times already
14:16:31 <slaweq> so for me it's clearly solving the issue
14:16:54 <liuyulong> Cool
14:17:12 <liuyulong> Yep, great work!
14:17:23 <slaweq> thx
14:18:30 <liuyulong> Just posted my +2
14:18:39 <slaweq> thx
14:18:40 <liuyulong> OK, next one
14:18:43 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1916024
14:18:45 <openstack> Launchpad bug 1916024 in neutron "HA router master instance in error state because qg-xx interface is down" [High,In progress] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:19:23 <liuyulong> #link https://review.opendev.org/c/openstack/neutron/+/776427
14:19:34 <liuyulong> The fix ^
14:20:32 <liuyulong> The fix is simple, just the usual workaround method: "retry".
14:20:49 <liuyulong> #link https://review.opendev.org/c/openstack/neutron/+/776427/4//COMMIT_MSG
14:20:56 <liuyulong> I have left some comments here.
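[Editor's note] The "retry" workaround mentioned above is a generic pattern; Neutron typically implements it with the tenacity library. The helper below is a simplified stdlib-only sketch of the idea with hypothetical names, not the actual patch:

```python
import time

def retry(func, attempts=3, delay=0.1, exc=(RuntimeError,)):
    """Call func until it succeeds or the attempts are exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exc:
            if attempt == attempts:
                raise  # give up after the last attempt
            time.sleep(delay)  # back off briefly before retrying

# Hypothetical flaky operation: fails twice, then succeeds
# (mimics a device that is not ready yet right after creation).
calls = {"n": 0}

def set_link_up():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("device not ready yet")
    return "UP"

print(retry(set_link_up))  # -> UP, after two failed attempts
```

tenacity expresses the same thing declaratively with decorators (stop/wait strategies), which is why it is the usual choice in OpenStack code.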
14:21:14 <liuyulong> #link http://paste.openstack.org/show/802779/
14:21:55 <liuyulong> There are some actions like "DelPortCommand(port=qg-3e872c7f-68" and "AddPortCommand(bridge=br-int, port=qg-3e872c7f-68"
14:22:30 <liuyulong> I don't know if these methods are the root cause, but it is really close to the behavior we found.
14:23:53 <liuyulong> While ovsdbapp is doing the "delete and add" work, the privsep daemon is trying to run the "ip link" related commands.
14:28:30 <liuyulong> So maybe we can refactor that replace_port method in a more graceful way: not delete the port, but clear its attributes and reset them.
14:28:49 <liuyulong> Just some thoughts, I'm not sure if it really works.
14:29:03 <liuyulong> OK, no more bugs from me then.
14:29:11 <liuyulong> Any updates?
14:31:10 <liuyulong> OK, let's move on
14:31:19 <liuyulong> #topic distributed_dhcp
14:31:39 <liuyulong> I have uploaded all the code.
14:31:41 <liuyulong> #link https://review.opendev.org/q/topic:%22bp%252Fdistributed-dhcp-for-ml2-ovs%22+(status:open%20OR%20status:merged)
14:32:10 <liuyulong> #link https://review.opendev.org/c/openstack/neutron/+/776568
14:32:34 <liuyulong> The fullstack test case passes locally in my devstack environment.
14:32:51 <liuyulong> I'm not quite sure why it is failing upstream.
14:33:07 <liuyulong> One issue may be the DHCP client version.
14:35:19 <liuyulong> Since we use a namespace as the fake VM, the DHCP client should be from "Linux ubuntu-focal-ovh-gra1-0023152345 5.4.0-65-generic #73-Ubuntu SMP Mon Jan 18 17:25:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux"
14:35:54 <liuyulong> Maybe I should run a devstack deployment on Ubuntu to verify this case.
14:37:16 <liuyulong> Another concern is the protocol coverage of the DHCPv4/v6 responder.
14:38:21 <liuyulong> All these options and replies are based on the dnsmasq behavior for DHCP instances run by the DHCP agent.
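[Editor's note] Background on the option coverage concern above: DHCPv4 options are type-length-value encoded per RFC 2132, so a responder has to emit every option a client would otherwise get from dnsmasq. A minimal stdlib-only decoder for the options field, for illustration only (not the proposed responder code):

```python
def parse_dhcp_options(data: bytes) -> dict:
    """Decode DHCPv4 options (RFC 2132 type-length-value encoding)."""
    options = {}
    i = 0
    while i < len(data):
        tag = data[i]
        if tag == 0:        # pad option: a single zero byte, no length
            i += 1
            continue
        if tag == 255:      # end option: stop parsing
            break
        length = data[i + 1]
        options[tag] = data[i + 2:i + 2 + length]
        i += 2 + length
    return options

# Example: message type (option 53) = DHCPOFFER (2),
# lease time (option 51) = 86400 seconds, then the end option.
raw = bytes([53, 1, 2, 51, 4, 0, 1, 81, 128, 255])
opts = parse_dhcp_options(raw)
print(opts[53])                          # -> b'\x02' (DHCPOFFER)
print(int.from_bytes(opts[51], "big"))   # -> 86400
```

Comparing the decoded tag set of a distributed-DHCP reply against a dnsmasq reply (as the Wireshark comparison mentioned later does by hand) is one way to check the coverage automatically.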
14:39:00 <liuyulong> We used Wireshark to verify and compare all related options from distributed DHCP and dnsmasq
14:39:14 <liuyulong> Maybe it's not enough.
14:40:18 <liuyulong> So, any comments/problems/testing/issues you have about these DHCPv4/v6 responders are welcome.
14:40:38 <haleyb> i wonder if looking at things in an OVN environment would help? at least the flows?
14:40:54 <liuyulong> This is the main part of the distributed DHCP.
14:41:24 <liuyulong> haleyb, I did that, it seems negative.
14:41:47 <liuyulong> The flows from OVN and OVS are totally different.
14:42:17 <liuyulong> Actually, in the implementation here for the OVS agent, we only add one flow for the DHCP request.
14:42:28 <liuyulong> "submit to controller", aka the ovs-agent
14:42:56 <liuyulong> OVN's flow has some user data, then uploads to ovn-controller.
14:43:43 <haleyb> ack, just thinking out loud
14:43:50 <liuyulong> ovn-controller can read that userdata, but the ovs-agent with its ryu app does not.
14:45:33 <liuyulong> The last thing about this bp is the config option...
14:45:41 <liuyulong> "disable_traditional_dhcp" or "enable_traditional_dhcp"
14:46:36 <liuyulong> I'm still thinking that we should not add a config option to "enable neutron's default behavior by default".
14:46:39 <lajoskatona> I would say to keep the "legacy" dhcp as the default anyway, otherwise I am fine with either
14:47:34 <liuyulong> The original purpose and main aim is to "disable" something.
14:49:27 <liuyulong> lajoskatona, yes, we will make sure of that.
14:49:27 <haleyb> liuyulong: yes, it seems a little backwards, but to me it looks similar to other config options we've added where the default is True
14:50:33 <haleyb> for example, we've done that when we wanted to add a new thing and then backport the config option, not that this is the same case
14:51:00 <haleyb> i.e. set the option to what the current default is - enabling dhcp-agent
14:51:27 <haleyb> i don't think the "votes" agreed on either direction in the review
14:53:09 <liuyulong> This option should not be backported to a stable branch. :)
14:53:45 <liuyulong> Maybe someday "disable_traditional_dhcp = True" will be the default value.
14:55:12 <liuyulong> OK, time is running out.
14:55:20 <haleyb> or the enable default is False :) maybe i should ask someone that works on tripleo what they think, since they're maybe more in-tune with exposing config options to customers
14:56:05 <liuyulong> haleyb, great, thank you.
14:56:10 <liuyulong> We can continue the discussion on gerrit.
14:56:23 <haleyb> sure
14:56:44 <liuyulong> #topic On demand agenda
14:56:50 <liuyulong> I have one update here:
14:57:19 <liuyulong> The spec for "elastic snat" https://review.opendev.org/c/openstack/neutron-specs/+/770540
14:57:59 <liuyulong> Reviews are welcome.
14:59:45 <liuyulong> OK, thank you guys
14:59:51 <liuyulong> Let's end here.
14:59:53 <liuyulong> Bye
14:59:59 <lajoskatona> Bye
15:00:05 <liuyulong> #endmeeting