15:00:34 <haleyb> #startmeeting neutron_dvr
15:00:34 <openstack> Meeting started Wed Aug 10 15:00:34 2016 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:38 <openstack> The meeting name has been set to 'neutron_dvr'
15:00:52 <haleyb> #chair Swami Swami_
15:00:52 <openstack> Warning: Nick not in channel: Swami
15:00:54 <openstack> Current chairs: Swami Swami_ haleyb
15:01:27 <haleyb> #topic Announcements
15:02:03 <haleyb> midcycle is next week, https://etherpad.openstack.org/p/newton-neutron-midcycle
15:02:49 <Swami_> we are already there at the mid-cycle for newton.
15:02:54 <Swami_> time runs fast
15:02:57 <haleyb> doesn't look like any of the normal participants here will be there
15:03:13 <haleyb> it's actually late being in N-3
15:03:26 <Swami_> haleyb: what do you think should be our priority for the mid-cycle?
15:03:32 <Swami_> Cleaning up the bug log.
15:04:08 <Swami_> Probably we should clean up all the 'ha' related bugs.
15:04:35 <haleyb> Swami_: yes, we need to get some of the bugs closed.  Tracking down the multinode failures as well, i saw a trace today on one
15:04:57 <Swami_> haleyb: anything interesting on the multi-node failures.
15:06:10 <haleyb> Swami_: well, just a failure that i hadn't seen before, can talk about it in bugs or open discussion
15:06:16 <haleyb> #topic Bugs
15:06:32 <Swami_> haleyb: ok thanks
15:06:50 <Swami_> This week we had this gate failure bug
15:07:11 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1609540
15:07:11 <openstack> Launchpad bug 1609540 in neutron "Deleting csnat port fails due to no fixed ips" [Critical,In progress] - Assigned to Carl Baldwin (carl-baldwin)
15:07:40 <Swami_> A patch has been proposed as a work around and I think still we have not fixed the root issue why the fixed_ips are none.
15:07:53 <Swami_> #link https://review.openstack.org/350783
15:08:13 <haleyb> that patch merged
15:08:23 <Swami_> haleyb: Yes it merged.
15:09:04 <Swami_> The next one high in the list is #link https://bugs.launchpad.net/neutron/+bug/1597461
15:09:04 <openstack> Swami_: Error: Could not gather data from Launchpad for bug #1597461 (https://launchpad.net/bugs/1597461). The error has been logged
15:09:38 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1597461
15:09:45 <Swami_> reposting the link
15:09:48 <haleyb> yes, that is easy to reproduce
15:09:59 <Swami_> haleyb: did you find out the root cause.
15:10:37 <haleyb> jschwarz: ^^ can i drag you in here to talk about this, don't know if you had time yet
15:11:02 <haleyb> Swami_: i do not have a root cause yet
15:11:16 <Swami_> haleyb: ok, no problem.
15:12:32 <Swami_> The next one in the list is
15:12:37 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1606741
15:12:37 <openstack> Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [High,New] - Assigned to Zhixin Li (lizhixin)
15:13:07 <Swami_> This bug has a patch and I did see that you have reviewed this patch already.
15:13:12 <Swami_> Here is the patch link
15:13:26 <Swami_> #link https://review.openstack.org/352686
15:13:59 <haleyb> yes, that seems fixable, i had posted comments yesterday
15:14:37 <Swami_> I did see that the changes made in this patch are related to /l3/ha, so does this problem occur only when you have dvr_snat and ha enabled, or does it happen irrespective of ha?
15:15:21 <haleyb> i think you need ha to hit that code
15:15:45 <Swami_> haleyb: Ok, then probably the bug description should be changed.
15:16:13 <Swami_> haleyb: Yes that patch seemed to be a simple fix.
15:16:30 <Swami_> haleyb: hopefully we should see a revision quick.
15:17:02 <Swami_> The next one is #link https://bugs.launchpad.net/neutron/+bug/1595043
15:17:02 <openstack> Launchpad bug 1595043 in neutron "Make DVR portbinding implementation useful for HA ports" [Medium,In progress] - Assigned to venkata anil (anil-venkata)
15:17:02 <haleyb> hope so
15:17:18 <Swami_> I think anilvenkata had a new patch.
15:17:22 <anilvenkata> Swami_, yes
15:17:30 <Swami_> #link https://review.openstack.org/324302
15:17:52 <anilvenkata> Swami_, I have abandoned this patch
15:18:04 <Swami_> anilvenkata: thanks for considering the backport options and abandoning the old ones.
15:18:32 <anilvenkata> Swami_, need reviewers for my l2pop ha patch
15:18:35 <Swami_> anilvenkata: I hope this patch will not have any issues with backport.
15:18:40 <haleyb> https://review.openstack.org/#/c/255237 is the new patch
15:19:00 <haleyb> https://review.openstack.org/#/c/255237
15:19:09 <anilvenkata> Swami_, haleyb https://review.openstack.org/#/c/255237 yes this patch is there for a long time
15:19:21 <anilvenkata> Swami_, haleyb need reviewers for this patch
15:19:24 <Swami_> anilvenkata: yes got it.
15:19:53 <anilvenkata> Swami_, haleyb this patch also solves https://bugs.launchpad.net/neutron/+bug/1602614
15:19:53 <openstack> Launchpad bug 1602614 in neutron "DVR + L3 HA loss during failover is higher that it is expected" [Undecided,In progress] - Assigned to venkata anil (anil-venkata)
15:19:53 <Swami_> anilvenkata: will review it.
15:20:15 <anilvenkata> Swami_, haleyb thanks
15:20:56 <Swami_> anilvenkata: That was my next bug to discuss. Since you have already posted it here, it saves my time.
15:21:19 <anilvenkata> yes, that patch solves this bug also
15:21:39 <Swami_> There is another bug related to ha and vrrp.
15:21:43 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1602320
15:21:43 <openstack> Launchpad bug 1602320 in neutron "ha + distributed router: keepalived process kill vrrp child process" [Undecided,In progress] - Assigned to Dongcan Ye (hellochosen)
15:22:25 <Swami_> This has not been triaged yet and I did see jschwarz comment in there, that it is expected behavior, but we need to close the loop on this.
15:22:58 <haleyb> https://review.openstack.org/#/c/342730/ was sent out a couple of weeks ago
15:24:32 <Swami_> haleyb: thanks for the link
15:24:48 <haleyb> Swami_: i'll update the meeting wiki afterwards
15:24:53 <Swami_> ok.
15:25:23 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1596473
15:25:23 <openstack> Launchpad bug 1596473 in neutron "Packet loss with DVR and IPv6" [Undecided,Incomplete]
15:26:26 <Swami_> haleyb: I think this is incomplete, maybe there is nothing to discuss here.
15:27:03 <haleyb> Right, submitter has not responded, and there's only so many things we can try and reproduce
15:27:17 <haleyb> i will close and hopefully get their attention
15:27:31 <haleyb> or at least poke them again
15:27:31 <Swami_> haleyb: ok
15:27:40 <Swami_> The next one in the list is
15:27:43 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1506567
15:27:44 <openstack> Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [Undecided,New]
15:28:56 <Swami_> It seems there is a workaround posted there, maybe we should look into it.
15:29:01 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1506567/comments/5
15:29:01 <openstack> Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [Undecided,New]
15:29:26 <haleyb> I think we talked about this last week too.  It's a known issue that some of the agents don't know what namespace and/or interface to use when on a DVR compute node
15:29:38 <haleyb> RA has the same issue
15:30:03 <Swami_> haleyb: yes I remember talking about it.
15:30:33 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1599287
15:30:33 <openstack> Swami_: Error: Could not gather data from Launchpad for bug #1599287 (https://launchpad.net/bugs/1599287). The error has been logged
15:30:44 <Swami_> There is a patch under review
15:30:47 <Swami_> #link https://review.openstack.org/337855
15:32:06 <Swami_> obondarev has some comments on this patch.
15:32:19 <haleyb> yes, but it is getting close
15:32:23 <Swami_> I will take a look at it and respond to his comments.
15:33:05 <Swami_> haleyb: obondarev's comment rings a bell, I need to check one more case, before I respond to his comments.
15:33:38 <Swami_> I will recheck it today and will repost a patch or will respond.
15:34:00 <haleyb> sounds good
15:34:08 <Swami_> On the fast-path-exit RFE patch, I do have the agent patch in good shape.
15:34:14 <Swami_> haleyb: can you take a look at it.
15:34:33 <Swami_> #link https://review.openstack.org/#/c/283757/
15:34:47 <haleyb> i'll take a look
15:34:47 <Swami_> This would also help the service_type networks
15:35:09 <Swami_> This creates the fip namespace on all nodes, irrespective of the fip.
15:35:52 <Swami_> I think that's all I had for the bugs this week.
15:36:15 <haleyb> anyone else have bugs to discuss ?
15:37:00 <haleyb> #topic Gate failures
15:37:50 <haleyb> So the gate has been a mess overall, not exactly dvr's fault
15:37:52 <Swami_> haleyb: Is it getting better.
15:38:48 <haleyb> the dvr just started spiking again, about 5% failure now
15:38:54 <haleyb> http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=5&fullscreen
15:39:25 <Swami_> looking at the graph
15:39:31 <haleyb> that just started earlier today, don't know what the issue is
15:40:20 <haleyb> The check queue has gotten better, but still showing increases - http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=8&fullscreen
15:40:26 <Swami_> haleyb: ok
15:40:38 <haleyb> of course that assumes every patch is perfect since a bug in a patch reflects in that
15:40:39 <Swami_> haleyb: this is going to be a never ending story.
15:40:54 <haleyb> groundhog day
15:41:35 <haleyb> http://logs.openstack.org/51/337851/19/check/gate-tempest-dsvm-neutron-dvr-multinode-full/c944b3d/logs/screen-q-dhcp.txt.gz#_2016-08-10_08_43_58_552 is something i noticed today in one of my patches, seems interesting
15:41:40 <Swami_> haleyb: agreed.
15:41:59 <haleyb> multinode dvr test, one VM failed dhcp, but it was due to agent not starting
15:42:30 <Swami_> haleyb: that is good.
15:43:00 <haleyb> if good is bad :)
15:43:24 <haleyb> it seems we should be able to debug it from the log (i hope)
15:43:51 <Swami_> haleyb: sure if it is obvious.
15:44:03 <haleyb> it never is, but i will look and see
15:44:38 <Swami_> haleyb: ping
15:45:10 <haleyb> sorry, had to talk in that other meeting, but failed
15:46:35 <Swami_> haleyb: yes I realized
15:46:55 <haleyb> i had nothing more on the gate today
15:47:09 <Swami_> haleyb: thanks
15:47:12 <haleyb> #topic Stable backports
15:47:45 <Swami_> #link https://review.openstack.org/#/c/351923/
15:47:51 <haleyb> nothing in particular for stable, just keep doing backports
15:48:02 <Swami_> #link https://review.openstack.org/#/c/351947/
15:48:06 <haleyb> I already +2'd that :)
15:48:17 <haleyb> any other stable backports that need attention
15:48:26 <Swami_> haleyb: I need another +2 for these patches. Can you ping ihar.
15:49:00 <Swami_> ok.
15:49:21 <Swami_> I need to backport this to liberty.
15:49:25 <Swami_> #link https://review.openstack.org/#/c/348372/6
15:49:49 <Swami_> but we have a dependency on #link https://review.openstack.org/#/c/351923/
15:51:30 <haleyb> https://review.openstack.org/#/c/351947/1 first, then that, but yes, those need to go back
15:51:52 <Swami_> haleyb: yes
15:52:13 <haleyb> any others
15:52:29 <Swami_> haleyb: that's it.
15:52:51 <haleyb> #topic Open Discussion
15:53:13 <haleyb> Ok, let the tomatoes fly! :)
15:53:54 <Swami_> haleyb: I might need some help/guidance from you on creating the iproute chains for the floatingip namespace for fast path exit.
15:54:19 <haleyb> iptables ?
15:54:20 <Swami_> This might also help for the floatingip namespace static routes for nexthop.
15:54:54 <Swami_> Basically we have to add static routes for every tenant owned cidr in the fipnamespace.
15:55:14 <Swami_> We should figure out what is the best way to do this without affecting what we have today.
15:56:35 <haleyb> ok, i can help with that
15:56:40 <Swami_> I do have a patch right now that adds the static route, I will try to polish it a bit and will pull you in for review and you can provide your feedback.
15:56:56 <Swami_> #link https://review.openstack.org/#/c/297468/
15:58:02 <haleyb> i'll take a look
15:58:02 <Swami_> haleyb: I wanted to have it working before the mid-cycle so that we can churn it out. But will see where it goes.
15:58:56 <Swami_> That's all I had for today.
15:59:27 <haleyb> Swami_: ok.  i know you won't be there, but https://etherpad.openstack.org/p/newton-neutron-midcycle-workitems had a list of things to discuss at midcycle if you want to add it, maybe irc discussion
15:59:52 <Swami_> haleyb: sure will add it to the list.
16:00:07 <haleyb> we are out of time, keep fixing those bugs! :)
16:00:10 <haleyb> #endmeeting