14:02:54 <liuyulong> #startmeeting neutron_l3
14:02:55 <openstack> Meeting started Wed Jul  1 14:02:54 2020 UTC and is due to finish in 60 minutes.  The chair is liuyulong. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:02:56 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:02:59 <openstack> The meeting name has been set to 'neutron_l3'
14:03:14 <liuyulong> #topic Announcements
14:03:16 <slaweq> hi
14:03:19 <liuyulong> hi
14:04:48 <liuyulong> The L3 meeting has been losing followers and attendees.
14:05:13 <liuyulong> Maybe we should change the schedule to every 2 weeks.
14:06:05 <liuyulong> Or merge the meeting with some others, or simply go back to being a section of the team meeting.
14:07:37 <liuyulong> The L3 meeting mainly focuses on bugs, and most of them have already been mentioned in the team meeting before the L3 meeting.
14:07:49 <haleyb> hi, sorry, i always have a conflict with a downstream meeting, sometimes forget to join
14:07:57 <slaweq> I think we can do it biweekly
14:08:21 <slaweq> but I don't think we will be able to merge it to team meeting
14:08:35 <slaweq> as there may not be enough time there to discuss e.g. all L3 bugs
14:10:37 <liuyulong> Sure, those are some opinions of mine based on my statistics from the last few L3 meetings.
14:10:44 <liuyulong> So biweekly is OK with me.
14:11:26 <liuyulong> Alright, thanks, no more announcements from me.
14:11:41 <liuyulong> Next
14:11:54 <liuyulong> #topic Bugs
14:12:18 <liuyulong> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015707.html
14:12:35 <liuyulong> Slawek Kaplonski (slaweq) was our bug deputy last week, thank you.
14:12:51 <liuyulong> First one:
14:12:54 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1884906
14:12:54 <openstack> Launchpad bug 1884906 in neutron "L3 agent cannot be manually scheduled" [High,Fix released] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:13:13 <liuyulong> We talked about this last week during the L3 meeting.
14:14:21 <liuyulong> The bug https://bugs.launchpad.net/neutron/+bug/1786272 has the long history of this.
14:14:21 <openstack> Launchpad bug 1786272 in neutron "Connection between two virtual routers does not work with DVR" [Medium,Fix released] - Assigned to Slawek Kaplonski (slaweq)
14:15:00 <liuyulong> Sorry wrong link...
14:15:20 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1884527
14:15:24 <openstack> Launchpad bug 1884527 in neutron "Related dvr routers aren't created on compute nodes" [Medium,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:15:54 <slaweq> I need to check why some functional test is failing there and update this patch
14:16:08 <liuyulong> #link https://review.opendev.org/#/c/737286/
14:16:41 <liuyulong> slaweq, yes, this is the link to the patch. And unit test cases too. : )
14:17:23 <liuyulong> I will try the patch after Zuul passes.
14:19:12 <liuyulong> Allow me to share some experience from our cloud: we have no such "complex" user-defined topology.
14:21:27 <liuyulong> From the production perspective, we define a virtual private cloud (aka VPC) as a router, an external gateway, some networks/subnets and the connections between them.
14:23:37 <liuyulong> Such a hierarchical topology can add a very heavy troubleshooting load. And in some cases, like VIP, load balancing and bare metal, it is not really usable.
14:23:47 <liuyulong> So we directly disable that.
14:24:34 <slaweq> liuyulong: yes, but we have customers who are using it for some reason
14:24:41 <liuyulong> But it is a nice feature of neutron, it is the kind of thing we call "SDN", software-defined networking!
14:24:41 <slaweq> and in fact this is perfectly valid use case
14:24:55 <slaweq> which works fine for topologies other than dvr
14:26:26 <liuyulong> slaweq, Definitely, it is a cool point that neutron can define such a topology.
14:27:48 <liuyulong> OK, next
14:27:52 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1883321
14:27:52 <openstack> Launchpad bug 1883321 in neutron "Neutron OpenvSwitch DVR - connection problem" [High,New]
14:28:23 <liuyulong> The bug reporter did really nice work.
14:28:42 <liuyulong> We almost have everything we need.
14:29:11 <liuyulong> And after reading the bug, I noticed that this may be a bug in floating IP migration.
14:29:42 <liuyulong> That is, changing the compute node L3 agent mode from dvr to dvr_no_external, or the reverse.
14:29:55 <liuyulong> The floating IPs will lose connectivity.
14:30:28 <liuyulong> It is not related to the config option "explicitly_egress_direct".
14:31:10 <liuyulong> I will take this one, and test the floating IP migration.
14:32:03 <liuyulong> For now, I can say, a valid procedure for this is:
14:32:30 <liuyulong> 1. disassociate the floating IPs
14:32:37 <liuyulong> 2. disable the routers before changing the agent mode
14:32:50 <liuyulong> 3. change the agent mode
14:32:55 <liuyulong> 4. enable the routers
14:33:02 <liuyulong> 5. associate the floating IPs back
14:33:17 <liuyulong> OK, I will paste this to the LP for the reporter.
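
[Editor's sketch] The five steps listed above can be laid out as an ordered command plan. This is only an illustration under assumptions, not a tested procedure from the meeting: the function name, the router, floating IP and port names are hypothetical, and step 3 stays a manual action (editing agent_mode in the L3 agent configuration and restarting the agent). The function only builds strings and does not run anything.

```python
from typing import Dict, List

# Agent modes accepted by the neutron L3 agent "agent_mode" option.
VALID_MODES = {"legacy", "dvr", "dvr_snat", "dvr_no_external"}


def agent_mode_change_plan(
    routers: List[str], fip_to_port: Dict[str, str], new_mode: str
) -> List[str]:
    """Return the commands for the 5 steps, in the order given in the meeting.

    routers and fip_to_port hold hypothetical names or IDs supplied by the
    operator; nothing is executed here.
    """
    if new_mode not in VALID_MODES:
        raise ValueError(f"unknown agent mode: {new_mode}")

    plan: List[str] = []
    # 1. disassociate the floating IPs
    for fip in fip_to_port:
        plan.append(f"openstack floating ip unset --port {fip}")
    # 2. disable the routers before changing the agent mode
    for router in routers:
        plan.append(f"openstack router set --disable {router}")
    # 3. change the agent mode (manual step on the compute node)
    plan.append(
        f"# set agent_mode = {new_mode} in the L3 agent config, "
        "then restart the L3 agent"
    )
    # 4. enable the routers
    for router in routers:
        plan.append(f"openstack router set --enable {router}")
    # 5. associate the floating IPs back to their original ports
    for fip, port in fip_to_port.items():
        plan.append(f"openstack floating ip set --port {port} {fip}")
    return plan


if __name__ == "__main__":
    for line in agent_mode_change_plan(
        ["router1"], {"fip1": "port1"}, "dvr_no_external"
    ):
        print(line)
```

Keeping the plan as plain strings makes the ordering easy to review before anything touches a production cloud.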
14:33:31 <slaweq> k
14:34:59 <liuyulong> OK, no more L3 bugs from the deputy list.
14:35:41 <liuyulong> Let's scan the LP list.
14:36:05 <liuyulong> Alright, a fresh one:
14:36:08 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1885921
14:36:08 <openstack> Launchpad bug 1885921 in neutron "[RFE][floatingip port_forwarding] Add port ranges" [Undecided,New] - Assigned to Pedro Henrique Pereira Martins (pedrohpmartins)
14:37:00 <liuyulong> Looks like a nice request for L3 port forwarding, something like a bulk creation.
14:38:46 <liuyulong> It seems there is a large-scale use case behind this request.
14:38:58 <liuyulong> Let's continue the discussion on the LP bug.
14:39:13 <liuyulong> Next one:
14:39:16 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1885898
14:39:16 <openstack> Launchpad bug 1885898 in neutron "test connectivity through 2 routers fails in neutron-ovn-tempest-full-multinode-ovs-master job" [High,Confirmed]
14:40:00 <liuyulong> This is related to OVN L3, IMO.
14:40:25 <slaweq> yes
14:40:30 <slaweq> I opened it today
14:40:39 <slaweq> and it's only in ovn based scenarios
14:42:18 <liuyulong> If this test_connectivity_through_2_routers case blocks the gate/CI a lot, maybe we should mark it as unstable.
14:42:59 <slaweq> liuyulong: no, I saw it failing only in non-voting jobs
14:43:14 <slaweq> that's why it's "high" and not "critical"
14:43:33 <liuyulong> At least bug 1884527 and 1885898 are both failing on it.
14:43:33 <openstack> bug 1885898 in neutron "test connectivity through 2 routers fails in neutron-ovn-tempest-full-multinode-ovs-master job" [High,Confirmed] https://launchpad.net/bugs/1885898
14:43:34 <openstack> bug 1884527 in neutron "Related dvr routers aren't created on compute nodes" [Medium,In progress] https://launchpad.net/bugs/1884527 - Assigned to Slawek Kaplonski (slaweq)
14:45:20 <liuyulong> slaweq, OK
14:45:38 <liuyulong> Alright, no more bugs from LP list then.
14:45:43 <liuyulong> Any updates?
14:45:47 <slaweq> I have one old one to talk about
14:45:54 <liuyulong> Sure
14:45:58 <slaweq> https://bugs.launchpad.net/neutron/+bug/1774459
14:45:58 <openstack> Launchpad bug 1774459 in neutron "Update permanent ARP entries for allowed_address_pair IPs in DVR Routers" [High,In progress] - Assigned to Brian Haley (brian-haley)
14:46:22 <slaweq> there is a patch https://review.opendev.org/#/c/601336/ which I think is the last missing bit to solve that
14:47:04 <slaweq> liuyulong but as You had many comments there, I wanted to ask if You think we should/can move on with this one
14:47:16 <slaweq> or maybe do You have some other idea about how to solve this issue
14:47:33 * liuyulong opening the link...
14:48:15 <liuyulong> OK, I see that.
14:48:33 <liuyulong> This is really a tough one.
14:49:07 <liuyulong> The current fix relies on the GARP being sent out from the VM.
14:49:23 <slaweq> yes
14:49:38 <slaweq> when additional IP is configured there
14:50:22 <liuyulong> This could be the main risk of the fix, neutron relies on something that is out of its control.
14:51:27 <slaweq> yes but do You have any other idea how to solve this?
14:51:56 <slaweq> and also main use case for that is using e.g. keepalived and it sends GARPs when it configures the IP on the host
14:52:02 <liuyulong> What if the GARP is not sent? Or what if some tools do not send GARP by default?
14:52:03 <slaweq> so in this use case it would be fine
14:52:24 <liuyulong> yep, for keepalived for now.
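
[Editor's sketch] For readers unfamiliar with the GARP mentioned above: a gratuitous ARP is an ARP packet in which the sender IP and the target IP are the same address, broadcast so that neighbours refresh their ARP entry for that IP. The snippet below only builds the bytes of such a frame (request form) with the standard library; it sends nothing. The function name, MAC and IP are hypothetical, and this is not code from neutron or keepalived.

```python
import socket
import struct


def build_garp_frame(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request.

    mac is the sender MAC as "aa:bb:cc:dd:ee:ff", ip is the announced IPv4
    address (for example a VIP that has just been configured on the host).
    """
    hw_addr = bytes.fromhex(mac.replace(":", ""))
    ip_addr = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, sender source, EtherType ARP.
    ethernet = b"\xff" * 6 + hw_addr + struct.pack("!H", 0x0806)

    # ARP header: hardware type 1 (Ethernet), protocol 0x0800 (IPv4),
    # hardware length 6, protocol length 4, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += hw_addr + ip_addr      # sender MAC and sender IP
    arp += b"\x00" * 6 + ip_addr  # target MAC unset, target IP == sender IP

    return ethernet + arp


if __name__ == "__main__":
    frame = build_garp_frame("fa:16:3e:00:00:01", "10.0.0.100")
    print(len(frame), frame.hex())
```

The point for the discussion is that only the guest (or a tool inside it) produces this packet, which is why the fix depends on something outside neutron.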
14:52:58 <slaweq> I see Your point but if tool/vm is not informing us that it is using this IP now actually, how can we know that?
14:53:53 <liuyulong> I have a potential alternative fix which is based on ARP proxy.
14:53:57 <liuyulong> It may work.
14:54:33 <liuyulong> I have not tested it yet.
14:54:52 <haleyb> i don't think there's a perfect solution, this area has always been "should be good enough for most" unfortunately
14:55:43 <liuyulong> haleyb, agreed, so maybe each solution can be configurable.
14:56:00 <liuyulong> then the end user can choose their best way.
14:56:38 <liuyulong> slaweq, ARP proxy on the qr-device for those VIPs across the subnets.
14:56:58 <slaweq> IMHO for now we have use cases with keepalived for which GARPs should be ok and we should focus on addressing that
14:57:01 <haleyb> that might lead to confusion having two ways with a config option
14:57:32 <slaweq> if we have another valid use case which isn't addressed by such a solution, we can think about another one and about some config switch
14:57:37 <haleyb> slaweq: my next question is how would OVN do this? :)
14:58:00 <slaweq> haleyb: I'm not even sure if that is the issue in case of ovn
14:58:14 <haleyb> that would be good
14:58:18 <slaweq> AFAIU this issue is strictly related to how dvr works
14:58:38 <slaweq> e.g. in L3ha it should probably work fine
14:58:54 <liuyulong> maybe we should test the VIP for OVN in east-west connections.
14:58:55 <slaweq> but that would need to be tested probably
14:59:09 <liuyulong> #link https://bugs.launchpad.net/neutron/+bug/1859638
14:59:09 <openstack> Launchpad bug 1774459 in neutron "duplicate for #1859638 Update permanent ARP entries for allowed_address_pair IPs in DVR Routers" [High,In progress] - Assigned to Brian Haley (brian-haley)
14:59:30 <liuyulong> It is marked as a duplicate.
14:59:33 <liuyulong> The title is "VIP between dvr east-west networks does not work at all".
15:00:44 <slaweq> I think we are over time today
15:00:55 <liuyulong> OK, time is up.
15:01:03 <liuyulong> Thank you guys for the discussion.
15:01:24 <liuyulong> #endmeeting