15:00:36 <mlavalle> #startmeeting neutron_l3
15:00:37 <openstack> Meeting started Thu Oct 11 15:00:36 2018 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:40 <openstack> The meeting name has been set to 'neutron_l3'
15:00:49 <Swami> hi
15:00:54 <mlavalle> hi Swami
15:01:03 <haleyb> hi
15:01:09 <liuyulong> hello
15:01:18 <mlavalle> hi haleyb
15:01:24 <mlavalle> welcome liuyulong
15:01:42 <tidwellr> hi
15:02:20 <mlavalle> #topic Announcements
15:02:39 <mlavalle> Stein-1 is one week and a half away
15:03:00 <mlavalle> The week of October 22 - 26
15:03:51 <mlavalle> and the Summit in Berlin is not that far in the future, November 13 - 15
15:04:02 <mlavalle> anybody here attending?
15:05:16 <haleyb> not me
15:05:21 <mlavalle> I take that as a no
15:05:28 <mlavalle> so let's move on
15:05:34 <mlavalle> #topic Bugs
15:05:43 <mlavalle> Swami: please go ahead
15:05:58 <Swami> mlavalle: thanks
15:06:26 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1796491
15:06:26 <openstack> Launchpad bug 1796491 in neutron "DVR Floating IP setup in the SNAT namespace of the network node and also in the qrouter namespace in the compute node" [Undecided,New]
15:07:42 <Swami> I am currently looking into this bug. I was not able to reproduce this issue, and the code is in place to remove the fip cidr from the snat namespace after the migration. I have updated the bug report with my findings, and in the meantime I will see if I can reproduce it. I suspect that there may be some timing issue or a race that might cause this issue.
15:07:54 <Swami> The problem is reported in Pike and Queens.
15:08:12 <Swami> Have you guys seen this in master?
15:09:17 <haleyb> i haven't
15:09:39 <Swami> haleyb: ok thanks, I will see if I can reproduce it in the master, if not I will check on the Pike.
15:09:55 <Swami> The next one in the list is.
15:09:59 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1794991
15:09:59 <openstack> Launchpad bug 1794991 in neutron "Inconsistent flows with DVR l2pop VxLAN on br-tun" [Undecided,New]
15:10:36 <mlavalle> that reporter was involved in the previous bug as well
15:10:51 <Swami> It seems the l2pop vxlan flows are not getting populated properly in a multinode scenario. So they are having issues with DHCP connection and also VM to VM connection.
15:11:31 <mlavalle> also Pike
15:11:48 <Swami> Again they say it is not consistently reproducible, but it happens. They also mentioned that no rpc timeouts are seen.
15:12:17 <Swami> I have seen such an issue when there is an rpc_timeout where l2pop tries to fetch all the fdb_entries.
15:13:06 <Swami> Also I have requested their ovs_vswitchd logs to see if the vxlan interfaces were created properly. But let us see.
15:13:44 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1786272
15:13:44 <openstack> Launchpad bug 1786272 in neutron "Connection between two virtual routers does not work with DVR" [Medium,In progress] - Assigned to Slawek Kaplonski (slaweq)
15:13:58 <slaweq> hi, sorry for being late
15:14:28 <Swami> Patch is in review: #link https://review.openstack.org/#/c/597567/
15:14:55 <Swami> This patch is almost complete and I did see slaweq was addressing some last issues with the agent handling multiple routers.
15:15:21 <slaweq> yes, please take a look if that makes sense for You :)
15:15:45 <Swami> So nothing else to discuss in this bug. Ok will take a look at it again today. Also this bug brought out another bug with HA-DVR.
15:16:07 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1797037
15:16:07 <openstack> Launchpad bug 1797037 in neutron "Extra routes configured on routers are not set in the router namespace and snat namespace with DVR-HA routers" [Medium,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:16:41 <mlavalle> Swami: you already pushed a fix for that one, right?
15:17:04 <Swami> The extra routes are not configured in the router namespace when DVR routers are configured. I did check that the routes are only configured in the 'vrrp_conf' and the snat_namespace when HA is configured.
15:17:11 <Swami> Yes, I have a patch up for review.
15:17:33 <Swami> #link https://review.openstack.org/#/c/609273/
15:17:41 <Swami> It needs another +2.
15:18:03 <mlavalle> I looked at it last night
15:18:11 <slaweq> I can only say that I tested it together with this patch for 2 routers connected together and it worked fine :)
15:18:18 <mlavalle> and had the same question as haleyb: https://review.openstack.org/#/c/609273/3/neutron/agent/l3/dvr_edge_ha_router.py@123
15:19:21 <Swami> mlavalle: Yes, it seems that because of the inheritance from different classes we have to override certain functions to call them properly.
15:19:32 <mlavalle> I even played with a unit test of DvrEdgeHaRouter
15:19:38 <Swami> mlavalle: Let me comment it in the patch.
15:19:42 <haleyb> yes, i don't quite understand that super(), but it's just a python thing...
15:20:14 <mlavalle> in my experimenting with the unit tests, the object seems to have the correct method
15:20:31 <mlavalle> update_routing_table
15:20:37 <Swami> mlavalle: ok
15:21:08 <mlavalle> it should come from DvrEdgeRouter, right?
15:21:23 <Swami> mlavalle: Yes it should come from DvrEdgeRouter
15:21:37 <haleyb> yes, HaRouter doesn't have it
15:21:44 <mlavalle> I can tell you that in the unit tests, it is getting the method from there
15:22:09 <mlavalle> but maybe in a deployment, an override is happening somehow
15:22:32 <Swami> mlavalle: Let me check 'DvrEdgeHaRouter' and see whether the 'update_routing_table' override is required or not. I will update you on this.
15:22:36 <mlavalle> I was planning to add some debug calls to the code in master in my deployment
15:22:58 <mlavalle> and let it run and see if we are hitting the correct method
15:23:04 <mlavalle> would that be helpful?
15:23:06 <Swami> mlavalle: I already have some debug calls added, I will test it again.
15:23:20 <mlavalle> ok, then I'll let you test
15:23:33 <mlavalle> I was only trying to help
15:23:40 <mlavalle> because it seems odd
15:23:54 <Swami> mlavalle: sure,
15:24:06 <mlavalle> and if we have an inheritance problem, it might be worth it finding out why
15:24:23 <Swami> mlavalle: yes, you are right.
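The question above (which parent supplies update_routing_table on a class that inherits from both a DVR edge router and an HA router) can be checked by walking the method resolution order. The following is a minimal sketch only: the classes are empty stand-ins that borrow the names used in the discussion, the shared RouterInfo base is an assumption, and defining_class is a hypothetical helper, not part of neutron.

```python
# Stand-in classes only; they mirror the names from the discussion,
# not the real neutron implementations.
class RouterInfo:
    def update_routing_table(self):
        return "RouterInfo"


class DvrEdgeRouter(RouterInfo):
    def update_routing_table(self):
        return "DvrEdgeRouter"


class HaRouter(RouterInfo):
    # Deliberately does not define update_routing_table.
    pass


class DvrEdgeHaRouter(DvrEdgeRouter, HaRouter):
    # No override here: lookup falls through the MRO.
    pass


def defining_class(cls, name):
    """Hypothetical helper: return the name of the first class in the
    MRO of `cls` whose own namespace defines attribute `name`."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass.__name__
    return None


mro_names = [k.__name__ for k in DvrEdgeHaRouter.__mro__]
provider = defining_class(DvrEdgeHaRouter, "update_routing_table")
print(mro_names)
print(provider)
```

Running the same kind of check against the real class in a deployment (for example, logging the class's __mro__ at agent start) would show whether something other than the expected parent is supplying the method.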
15:24:39 <Swami> The next one is #link https://bugs.launchpad.net/neutron/+bug/1774459
15:24:39 <openstack> Launchpad bug 1774459 in neutron "Update permanent ARP entries for allowed_address_pair IPs in DVR Routers" [High,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:24:57 * mlavalle was going to leave a comment last night in the patch after dinner. but was tired and figured could be discussed with Swami during today's meeting
15:25:20 <Swami> mlavalle: no problem.
15:25:22 <Swami> #link https://review.openstack.org/601336
15:25:46 <mlavalle> ahh nice!
15:26:09 <Swami> Here is the patch. I have a question on the Ryu controller for adding in the IN_Packet handler. I left a question in there, if anyone can take a look and comment on it, I can proceed with my work.
15:26:39 <mlavalle> I'll take a look later
15:26:39 <Swami> I had a question related to 'registration' of the in_packet handler call.
15:26:43 <Swami> mlavalle: thanks
15:26:59 <mlavalle> if I can't help, I'll ping ajo
15:27:45 <Swami> mlavalle: Sure thanks that would help.
15:27:50 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1795222
15:27:50 <openstack> Launchpad bug 1795222 in neutron "[l3] router agent side gateway IP not changed if directly change IP address" [Medium,In progress] - Assigned to LIU Yulong (dragon889)
15:28:08 <Swami> #link https://review.openstack.org/606876
15:28:13 <Swami> There is a patch up for review.
15:29:10 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1785227
15:29:10 <openstack> Launchpad bug 1785227 in neutron "Router port: no dataplane update on change" [Medium,Confirmed]
15:29:39 <Swami> I think these two bugs are kind of related. But I need to recheck once again before marking it as duplicate. I will check it out today.
15:30:10 * mlavalle will review the patch^^^^ today
15:30:18 <Swami> mlavalle: that's all I had for bugs today. Back to you.
15:30:29 <slaweq> I have one more thing
15:30:38 <slaweq> there is bug https://bugs.launchpad.net/neutron/+bug/1796703
15:30:38 <openstack> Launchpad bug 1796703 in neutron "HA router interfaces in standby state" [Undecided,New]
15:30:56 <slaweq> which I think should be checked by some L3 experts
15:31:07 <slaweq> so please take a look at this one if You can :)
15:31:28 <liuyulong> Pike again
15:31:48 <Swami> slaweq: ok I will take a look at it.
15:31:53 <slaweq> thx Swami
15:32:27 <mlavalle> Next bug I have is https://bugs.launchpad.net/neutron/+bug/1789434
15:32:27 <openstack> Launchpad bug 1789434 in neutron "neutron_tempest_plugin.scenario.test_migration.NetworkMigrationFromHA failing 100% times" [High,Confirmed] - Assigned to Manjeet Singh Bhatia (manjeet-s-bhatia)
15:32:45 <mlavalle> This one was assigned to manjeets.... any progress?
15:33:04 <manjeets> mlavalle, I've sent update email.
15:33:13 <manjeets> my finding is it even exists in rocky
15:33:48 <manjeets> now looking into dvr_sync and sync_ha_state to compare if the ha case is missing any notification for port updates
15:34:51 <mlavalle> thanks for the update :-)
15:35:30 <mlavalle> Next one is https://bugs.launchpad.net/neutron/+bug/1791989
15:35:30 <openstack> Launchpad bug 1791989 in neutron "grenade-dvr-multinode job fails" [High,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:35:59 <mlavalle> slaweq found the cause for this one
15:36:03 <mlavalle> \o/
15:36:20 <mlavalle> and we have a patch to make the job voting again: https://review.openstack.org/#/c/609437/
15:36:48 <slaweq> yes :) the patch description explains what fixed that
15:37:10 <mlavalle> haleyb: take a look when you have some time^^^^
15:37:18 <haleyb> and i had created a related patch as well to add permanent ARP entries for the veth pair IPs, just an optimization, https://review.openstack.org/#/c/607685/
15:37:50 <haleyb> will look, see zuul is now happy
15:38:08 <mlavalle> haleyb: the patch you pushed is good for review?
15:39:00 <haleyb> mlavalle: yes, should be good.  was just something we noticed during debug
15:39:51 <mlavalle> cool, I'll take a look later today
15:40:01 <Swami> haleyb: is that permanent arp entry there a requirement?
15:40:02 <mlavalle> Next one is https://bugs.launchpad.net/neutron/+bug/1787919
15:40:02 <openstack> Launchpad bug 1787919 in neutron "Upgrade router to L3 HA broke IPv6" [High,Confirmed] - Assigned to Miguel Lavalle (minsel)
15:40:14 <mlavalle> #undo
15:40:15 <openstack> Removing item from minutes: #link https://bugs.launchpad.net/neutron/+bug/1785227
15:41:16 <haleyb> Swami: no, we had just seen STALE arp entries, and since we know the IP/MAC just felt we could add the corresponding entries
15:41:19 <Swami> haleyb: My question is: is it failing just because of the missing ARP entries, or are the ARP entries themselves failing?
15:41:48 <Swami> haleyb: Ok thanks.
15:41:51 <haleyb> it wound up being unrelated, we thought it was initially.
15:42:12 <mlavalle> ok next one is https://bugs.launchpad.net/neutron/+bug/1787919
15:42:12 <openstack> Launchpad bug 1787919 in neutron "Upgrade router to L3 HA broke IPv6" [High,Confirmed] - Assigned to Miguel Lavalle (minsel)
15:42:33 <mlavalle> For this one I have the environment to reproduce it
15:42:50 <mlavalle> I have also set up similar conditions as the ones reported in the bug
15:42:56 <mlavalle> debugging at this moment
15:43:34 <mlavalle> and that's all I have for today
15:43:42 <mlavalle> #topic Open Agenda
15:43:51 <mlavalle> anything we should discuss today?
15:44:00 <xubozhang> hi
15:44:11 <xubozhang> i got some errors in unit testing
15:44:11 <mlavalle> hi xubozhang
15:44:23 <xubozhang> not sure how to debug them
15:44:36 <Swami> mlavalle: haleyb: I had one question.
15:44:50 <mlavalle> xubozhang: have the patch url handy?
15:45:11 <xubozhang> the patch still has too many issues
15:45:35 <xubozhang> i got some assertions failed
15:45:55 <mlavalle> xubozhang: do you know how to run the unit tests with debugger enabled?
15:46:05 <xubozhang> nope
15:46:28 <mlavalle> give me the url of the patch and I'll leave you a comment with instructions there
15:46:36 <xubozhang> i tried tox -v -e py35
15:47:26 <mlavalle> Swami: go ahead
15:47:48 <xubozhang> https://review.openstack.org/#/c/528336/
15:47:59 <mlavalle> xubozhang: ack. will leave a comment there
15:48:05 <Swami> haleyb: I posted a question in your IRC room. If the external_process monitor is monitoring a process (in this case ip_monitor), how do we stop that monitoring without rebooting the l3_agent?
15:48:05 <xubozhang> thanks!
15:49:05 <Swami> Somehow in one of our setups the PID of the ip_monitor is removed and also the snat_namespace is removed. But the external_process monitor constantly tries to restart the ip_monitor.
15:49:15 <haleyb> Swami: i hadn't seen it, sometimes irc doesn't create a chat tab and i miss those things
15:49:36 <Swami> haleyb: no problem. I posted the same question in the neutron channel on Thursday.
15:49:42 * mlavalle thought haleyb was ignoring his comments about the Red Sox
15:50:00 <haleyb> "Is there a way to stop the External Process monitor from monitoring a service with rebooting the l3-agent. In this case it is the ip-monitor process running in snat-namespace for HA."
15:50:07 <haleyb> Swami: that was it, right?
15:50:12 <Swami> haleyb: yes
15:50:31 <Swami> haleyb: there is a typo there, without rebooting the l3-agent.
15:51:18 <haleyb> ack, let me copy into the chat
15:51:45 <haleyb> mlavalle: and i wouldn't ignore Red Sox banter either :)
15:51:52 <Swami> You can reproduce this when you try to delete the snat-namespace and PID of ip_monitor when HA is configured. The logs are filled with ip_monitor trying to restart.
15:52:27 <Swami> haleyb: you can check it and let me know.
15:52:29 <Swami> Thanks
15:52:37 <Swami> mlavalle: that's all I had.
15:52:38 <haleyb> Swami: we should file a bug too
15:52:56 * mlavalle feels relieved haleyb is not ignoring him
15:53:23 <haleyb> moving from office to home always messes with my irc :(
15:53:57 <mlavalle> ok team, thanks for attending
15:54:01 <tidwellr> I don't want to take much time, but wanted to give a quick update on DVR-aware neutron-dynamic-routing and point folks at https://review.openstack.org/#/c/581098/
15:54:03 <Swami> haleyb: Sure I will file a bug.
15:54:14 <mlavalle> tidwellr: go ahead
15:54:31 <mlavalle> tidwellr: asking for reviews?
15:55:03 <tidwellr> yes, particularly with the advent of multiple port bindings
15:55:17 <mlavalle> ok added to my pile
15:55:21 <tidwellr> not sure that is being handled correctly
15:55:33 <tidwellr> so your review would be helpful
15:56:14 <mlavalle> I should also mention that tidwellr has volunteered to add the ability to remove ip ranges from subnet pools
15:56:51 <mlavalle> so that revert patch in openstack client should probably be stopped
15:57:05 <tidwellr> well, maybe not
15:57:07 <Swami> tidwellr: will take a look.
15:57:27 <tidwellr> I'm envisioning some API changes that the client will need to be aware of
15:57:50 <mlavalle> so changes will probably still be needed in the client
15:57:55 <tidwellr> yes
15:58:07 <mlavalle> let's just hold off on the revert
15:58:11 <mlavalle> for the time being
15:58:48 <mlavalle> thanks for the update tidwellr !
15:59:06 <mlavalle> have a great weekend y'all!
15:59:13 <mlavalle> #endmeeting