15:00:20 <mlavalle> #startmeeting neutron_l3
15:00:21 <openstack> Meeting started Thu Aug 31 15:00:20 2017 UTC and is due to finish in 60 minutes. The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:25 <openstack> The meeting name has been set to 'neutron_l3'
15:00:32 <Swami> hi
15:00:38 <haleyb> hi
15:01:14 <mlavalle> #chair haleyb Swami
15:01:14 <openstack> Current chairs: Swami haleyb mlavalle
15:01:31 <mlavalle> #topic Announcements
15:01:53 <mlavalle> Pike is being released this week
15:02:31 <mlavalle> so we will start the whole thing again and move to Queens now :-)
15:03:32 <mlavalle> We are a little more than one week away from the PTG
15:04:41 <mlavalle> I arrive Sunday 10th at around 2:30pm. Staying at the event hotel (Renaissance) until Saturday morning
15:05:12 <Swami> mlavalle: I am arriving late in the evening on Sunday and will be flying back on Friday evening.
15:05:25 <haleyb> i'll be there Sunday as well
15:05:42 <mlavalle> Any other announcements from the team?
15:06:30 <Swami> nothing from me.
15:07:10 <mlavalle> I had a quick chat a couple of weeks ago with carl_baldwin. He is going to join us for dinner Monday or Tuesday
15:07:26 <Swami> mlavalle: great!
15:07:42 <mlavalle> He'll drive all the way from Fort Collins
15:07:55 <mlavalle> ok, moving on
15:08:00 <mlavalle> #topic Bugs
15:08:14 <mlavalle> Swami: go ahead, as usual
15:08:17 <Swami> mlavalle: thanks
15:08:34 <Swami> We had a critical bug in the grenade job that was filed yesterday.
15:08:54 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1713927
15:08:55 <openstack> Launchpad bug 1713927 in neutron "gate-grenade-dsvm-neutron-dvr-multinode-ubuntu-xenial fails constantly" [Critical,Confirmed]
15:09:39 <Swami> In case anyone is not aware of the problem: it seems the server is somehow providing a floatingip that is not bound to a host.
15:10:03 <Swami> The agent configures the floatingip without checking for the host.
15:10:22 <Swami> So the floatingip ends up residing on two different nodes.
15:11:00 <Swami> This happens only when a neutron-server is restarted after some time interval. When a full sync happens at that point, this duplicate floatingip sneaks in.
15:11:20 <haleyb> Swami: kevin did put a reproducer in the bug this morning, and i had a discussion with kuba earlier, think we might have a workaround
15:11:51 <Swami> haleyb: I did see that jakub had posted a patch.
15:12:11 <haleyb> yes, but we also think we need this revert https://review.openstack.org/#/c/499263/
15:12:11 <Swami> haleyb: Yes, I also read kevin's reproduction steps.
15:12:32 <Swami> haleyb: yesterday I was trying to reproduce by restarting the agent. I could not reproduce it.
15:13:23 <Swami> haleyb: As I mentioned, there are two problems.
15:13:45 <Swami> haleyb: mlavalle: On the server side, the notification should only be sent to the host that is hosting the floatingip.
15:14:13 <Swami> haleyb: mlavalle: On the client side, it should check for host or dest_host before configuring the floatingip.
15:14:53 <Swami> haleyb: mlavalle: While I have not figured out what is happening on the server side yet, on the client side we may have a better solution.
15:15:00 <haleyb> Swami: i think with the revert and jakub's change the agent side would be fixed, fixing the check queue
15:15:12 <Swami> haleyb: mlavalle: Yes, I agree.
15:15:32 <mlavalle> so the host fix we merged last week was really a symptom?
15:15:59 <Swami> haleyb: mlavalle: But the check that we have under 'get_external_device_interface_name'
15:16:11 <Swami> haleyb: mlavalle: may not be in the right place.
15:16:25 <Swami> haleyb: mlavalle: we should have this check before configuring the floatingip.
15:17:12 <Swami> #link https://github.com/openstack/neutron/blob/master/neutron/agent/l3/dvr_local_router.py#L103
15:17:31 <Swami> we should have this check at this line, so that no unwanted floatingips are configured
15:17:45 <Swami> and only floatingips that are associated with that host will be configured.
15:18:31 <Swami> mlavalle: To your question: yes, what we saw last week was a symptom, the server sending floatingips without a host.
15:18:44 <haleyb> but the agent shouldn't have sent, right?
15:18:47 <mlavalle> makes sense, thanks
15:19:20 <Swami> haleyb: Yes, you mean the 'server' shouldn't have sent it in the first place.
15:20:11 <haleyb> Swami: right, sent by accident, or because 'host' was not set
15:20:40 <Swami> haleyb: Yes.
15:21:27 <Swami> haleyb: mlavalle: The place where we check for the 'host' condition in the server has three different conditions, so we might have a leak in one of those checks.
15:22:23 <haleyb> Swami: so let's do the revert in stable/pike and/or master until we can fix it correctly
15:22:37 <Swami> haleyb: mlavalle: Probably the fast approach is to fix the agent side to handle the fips based on the host, and then we can take a look at the server side.
15:23:27 <Swami> haleyb: mlavalle: I think we should fix the agent first; no need to revert at this point.
15:24:18 <haleyb> Swami: i think without the pike revert we can't land anything, as it's now broken, so grenade in master can't pass (from what I understood)
15:25:22 <haleyb> we can ask jakub in #neutron after the meeting, but that's what i understood
15:25:38 <Swami> haleyb: Ok, if that is the case then we can decide on reverting.
15:27:17 <Swami> haleyb: mlavalle: So the plan is: let us check with jakub and see how his patch fares.
15:28:03 <Swami> haleyb: mlavalle: If it works and the grenade job lets it merge, then we can go ahead and push this patch without a revert. Otherwise we should revert.
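[Editor's note: the agent-side fix discussed above — configure a floating IP only when its 'host' or 'dest_host' matches the agent's host — can be illustrated with a minimal standalone sketch. This is not the actual neutron code; the function name and the dict-based floating IP representation are hypothetical, with only the 'host' and 'dest_host' fields taken from the discussion.]

```python
def filter_floating_ips(floating_ips, agent_host):
    """Keep only the floating IPs that belong on agent_host.

    A floating IP is kept if its 'host' matches the agent's host, or
    if it is mid-migration and its 'dest_host' matches. Entries with
    no host bound are dropped rather than being configured on every
    node, which is the duplicate-FIP symptom described in the bug.
    """
    kept = []
    for fip in floating_ips:
        if fip.get('host') == agent_host or fip.get('dest_host') == agent_host:
            kept.append(fip)
    return kept


# Example: only 'a' (host matches) and 'c' (dest_host matches) survive;
# 'b' has no host bound and is skipped instead of being configured.
fips = [
    {'id': 'a', 'host': 'node-1'},
    {'id': 'b', 'host': None},
    {'id': 'c', 'host': 'node-2', 'dest_host': 'node-1'},
]
print([f['id'] for f in filter_floating_ips(fips, 'node-1')])
```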
15:28:05 <haleyb> grenade job is now doing pike->queens as of yesterday, so that changed things
15:28:31 <Swami> haleyb: what is that? I don't get it.
15:28:50 <mlavalle> yeah, the grenade equation changed
15:28:55 <haleyb> grenade configures the "old" version, then upgrades to the "new" version
15:29:06 <haleyb> so if pike is broken, then we can't upgrade
15:29:40 <haleyb> i think the revert fixes pike, then jakub's change fixes master enough to make progress
15:30:08 <Swami> haleyb: ok, makes sense.
15:31:22 <Swami> haleyb: meanwhile I will try to see what is happening in the server-side logic.
15:32:08 <Swami> Let us move on.
15:32:46 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1712913
15:32:48 <openstack> Launchpad bug 1712913 in neutron "Update DVR router port cause error with QoS rules" [Undecided,In progress] - Assigned to Slawek Kaplonski (slaweq)
15:33:00 <Swami> This is another bug that was filed against DVR.
15:33:40 <Swami> But looking at the bug description and the logs, it seems that it might also happen with legacy routers.
15:34:16 <Swami> The issue is with the ovs_agent. There is a patch currently up for review.
15:34:39 <Swami> #link https://review.openstack.org/#/c/498598/
15:35:12 <Swami> Please take a look at the patch after we fix this critical issue and the gate is happy.
15:35:32 <mlavalle> ok
15:35:48 <Swami> The next bug in the list is
15:35:50 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1712795
15:35:52 <openstack> Launchpad bug 1712795 in neutron "Fail to startup neutron-l3-agent" [Undecided,New]
15:36:49 <Swami> This bug report seems to be incomplete. I don't see any issue with neutron-l3-agent processing routers. I have asked a couple of questions about the branch and how it can be reproduced, but have not received any information yet.
15:36:57 <Swami> Until then we can mark it as incomplete.
15:37:34 <Swami> There are two other bugs that were filed against the multinode failure.
15:37:56 <Swami> Since we have already discussed the multinode failure, we can discuss where we stand next week.
15:38:05 <Swami> mlavalle: back to you.
15:38:18 <mlavalle> Swami: Thanks
15:38:26 <mlavalle> Good discussion
15:38:48 <mlavalle> I don't have any major bugs to discuss
15:38:56 <mlavalle> #topic Open Agenda
15:39:33 <mlavalle> Since we have that critical DVR bug breaking the gate, let's move to Open Agenda
15:40:07 <mlavalle> Any other topics we should discuss today?
15:40:34 <Swami> mlavalle: No, I will keep working on the fix.
15:40:58 <haleyb> https://www.eventbrite.com/e/drinks-with-rdo-at-the-ptg-tickets-37396477872
15:41:05 <haleyb> RDO release party at PTG :)
15:41:23 <haleyb> i think that's open to all
15:41:51 <mlavalle> But I think you need to register. I will right now
15:42:13 <mlavalle> ok guys, thanks for attending
15:42:21 <mlavalle> #endmeeting