15:00:34 #startmeeting neutron_dvr
15:00:34 Meeting started Wed Aug 10 15:00:34 2016 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:35 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:38 The meeting name has been set to 'neutron_dvr'
15:00:52 #chair Swami Swami_
15:00:52 Warning: Nick not in channel: Swami
15:00:54 Current chairs: Swami Swami_ haleyb
15:01:27 #topic Announcements
15:02:03 midcycle is next week, https://etherpad.openstack.org/p/newton-neutron-midcycle
15:02:49 we are already there at the mid-cycle for newton.
15:02:54 time runs fast
15:02:57 doesn't look like any of the normal participants here will be there
15:03:13 it's actually late being in N-3
15:03:26 haleyb: what do you think should be our priority for the mid-cycle.
15:03:32 Cleaning up the bug log.
15:04:08 Probably we should clean up all the 'ha' related bugs.
15:04:35 Swami_: yes, we need to get some of the bugs closed. Tracking down the multinode failures as well, i saw a trace today on one
15:04:57 haleyb: anything interesting on the multi-node failures.
15:06:10 Swami_: well, just a failure that i hadn't seen before, can talk about it in bugs or open discussion
15:06:16 #topic Bugs
15:06:32 haleyb: ok thanks
15:06:50 This week we had this gate failure bug
15:07:11 #link https://bugs.launchpad.net/neutron/+bug/1609540
15:07:11 Launchpad bug 1609540 in neutron "Deleting csnat port fails due to no fixed ips" [Critical,In progress] - Assigned to Carl Baldwin (carl-baldwin)
15:07:40 A patch has been proposed as a work around and I think still we have not fixed the root issue why the fixed_ips are none.
15:07:53 #link https://review.openstack.org/350783
15:08:13 that patch merged
15:08:23 haleyb: Yes it merged.
15:09:04 The next one high in the list is #link https://bugs.launchpad.net/neutron/+bug/1597461
15:09:04 Swami_: Error: Could not gather data from Launchpad for bug #1597461 (https://launchpad.net/bugs/1597461). The error has been logged
15:09:38 #link https://bugs.launchpad.net/neutron/+bug/1597461
15:09:45 reposting the link
15:09:48 yes, that is easy to reproduce
15:09:59 haleyb: did you find out the root cause.
15:10:37 jschwarz: ^^ can i drag you in here to talk about this, don't know if you had time yet
15:11:02 Swami_: i do not have a root cause yet
15:11:16 haleyb: ok, no problem.
15:12:32 The next one in the list is
15:12:37 #link https://bugs.launchpad.net/neutron/+bug/1606741
15:12:37 Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [High,New] - Assigned to Zhixin Li (lizhixin)
15:13:07 This bug has a patch and I did see that you have reviewed this patch already.
15:13:12 Here is the patch link
15:13:26 #link https://review.openstack.org/352686
15:13:59 yes, that seems fixable, i had posted comments yesterday
15:14:37 I did see that the changes made in this patch is related to /l3/ha, so does this problem persist only when you have dvr_snat and ha enabled or irrespective of ha, it happens.
15:15:21 i think you need ha to hit that code
15:15:45 haleyb: Ok, then probably the bug description should be changed.
15:16:13 haleyb: Yes that patch seemed to be a simple fix.
15:16:30 haleyb: hopefully we should see a revision quick.
15:17:02 The next one is #link https://bugs.launchpad.net/neutron/+bug/1595043
15:17:02 Launchpad bug 1595043 in neutron "Make DVR portbinding implementation useful for HA ports" [Medium,In progress] - Assigned to venkata anil (anil-venkata)
15:17:02 hope so
15:17:18 I think anilvenkata had a new patch.
15:17:22 Swami_, yes
15:17:30 #link https://review.openstack.org/324302
15:17:52 Swami_, I have abandoned this patch
15:18:04 anilvenkata: thanks for considering the backport options and abandoning the old ones.
15:18:32 Swami_, need reviewers for my l2pop ha patch
15:18:35 anilvenkata: I hope this patch will not have any issues with backport.
15:18:40 https://review.openstack.org/#/c/255237 is new patch
15:19:00 https://review.openstack.org/#/c/255237
15:19:09 Swami_, haleyb https://review.openstack.org/#/c/255237 yes this patch is there for a long time
15:19:21 Swami_, haleyb need reviewers for this patch
15:19:24 anilvenkata: yes got it.
15:19:53 Swami_, haleyb this patch also solves https://bugs.launchpad.net/neutron/+bug/1602614
15:19:53 Launchpad bug 1602614 in neutron "DVR + L3 HA loss during failover is higher that it is expected" [Undecided,In progress] - Assigned to venkata anil (anil-venkata)
15:19:53 anilvenkata: will review it.
15:20:15 Swami_, haleyb thanks
15:20:56 anilvenkata: That was my next bug to discuss. Since you have already posted it here, it saves my time.
15:21:19 yes, that patch solves this bug also
15:21:39 There is another bug related to ha and vrrp.
15:21:43 #link https://bugs.launchpad.net/neutron/+bug/1602320
15:21:43 Launchpad bug 1602320 in neutron "ha + distributed router: keepalived process kill vrrp child process" [Undecided,In progress] - Assigned to Dongcan Ye (hellochosen)
15:22:25 This has not been triaged yet and I did see jschwarz comment in there, that it is expected behavior, but we need to close the loop on this.
15:22:58 https://review.openstack.org/#/c/342730/ was sent out a couple of weeks ago
15:24:32 haleyb: thanks for the link
15:24:48 Swami_: i'll update the meeting wiki afterwards
15:24:53 ok.
15:25:23 #link https://bugs.launchpad.net/neutron/+bug/1596473
15:25:23 Launchpad bug 1596473 in neutron "Packet loss with DVR and IPv6" [Undecided,Incomplete]
15:26:26 haleyb: I think this is incomplete, may be there is nothing to discuss ehre.
15:26:30 s/ehre/here
15:27:03 Right, submitter has not responded, and there's only so many things we can try and reproduce
15:27:17 i will close and hopefully get their attention
15:27:31 or at least poke them again
15:27:31 haleyb: ok
15:27:40 The next one in the list is
15:27:43 #link https://bugs.launchpad.net/neutron/+bug/1506567
15:27:44 Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [Undecided,New]
15:28:56 It seems there is a workaround posted there, may be we should look into it.
15:29:01 #link https://bugs.launchpad.net/neutron/+bug/1506567/comments/5
15:29:01 Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [Undecided,New]
15:29:26 I think we talked about this last week too. It's a known issue that some of the agents don't know what namespace and/or interface to use when on a DVR compute node
15:29:38 RA has the same issue
15:30:03 haleyb: yes I remember talking about it.
15:30:33 #link https://bugs.launchpad.net/neutron/+bug/1599287
15:30:33 Swami_: Error: Could not gather data from Launchpad for bug #1599287 (https://launchpad.net/bugs/1599287). The error has been logged
15:30:44 There is patch under review
15:30:47 #link https://review.openstack.org/337855
15:32:06 obondarev has some comments on this patch.
15:32:19 yes, but it is getting close
15:32:23 I will take a look at it and respond to his comments.
15:33:05 haleyb: obondarev's comment rings a bell, I need to check one more case, before I respond to his comments.
15:33:38 I will recheck it today and will repost a patch or will respond.
15:34:00 sounds good
15:34:08 On the fast-path-exit RFE patch, I do have the agent patch in good shape.
15:34:14 haleyb: can you take a look at it.
15:34:33 #link https://review.openstack.org/#/c/283757/
15:34:47 i'll take a look
15:34:47 This would also help the service_type networks
15:35:09 This creates the fip namespace on all nodes, irrespective of the fip.
15:35:52 I think that's all I had for the bugs this week.
15:36:15 anyone else have bugs to discuss ?
15:37:00 #topic Gate failures
15:37:50 So the gate has been a mess overall, not exactly dvr's fault
15:37:52 haleyb: Is it getting better.
15:38:48 the dvr just started spiking again, about 5% failure now
15:38:54 http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=5&fullscreen
15:39:25 looking at the graph
15:39:31 that just started earlier today, don't know what the issue is
15:40:20 The check queue has gotten better, but still showing increases - http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=8&fullscreen
15:40:26 haleyb: ok
15:40:38 of course that assumes every patch is perfect since a bug in a patch reflects in that
15:40:39 haleyb: this is going to be a never ending story.
15:40:54 groundhog day
15:41:35 http://logs.openstack.org/51/337851/19/check/gate-tempest-dsvm-neutron-dvr-multinode-full/c944b3d/logs/screen-q-dhcp.txt.gz#_2016-08-10_08_43_58_552 is something i noticed today in one of my patches, seems interesting
15:41:40 haleyb: agreed.
15:41:59 multinode dvr test, one VM failed dhcp, but it was due to agent not starting
15:42:30 haleyb: that is good.
15:43:00 if good is bad :)
15:43:24 it seems we should be able to debug it from the log (i hope)
15:43:51 haleyb: sure if it is obvious.
15:44:03 it never is, but i will look and see
15:44:38 haleyb: ping
15:45:10 sorry, had to talk in that other meeting, but failed
15:46:35 haleyb: yes I realized
15:46:55 i had nothing more on the gate today
15:47:09 haleyb: thanks
15:47:12 #topic Stable backports
15:47:45 #link https://review.openstack.org/#/c/351923/
15:47:51 nothing in particular for stable, just keep doing backports
15:48:02 #link https://review.openstack.org/#/c/351947/
15:48:06 I already +2'd that :)
15:48:17 any other stable backports that need attention
15:48:26 haleyb: I need another +2 for these patches. Can you ping ihar.
15:49:00 ok.
15:49:21 I need to backport this to liberty.
15:49:25 #link https://review.openstack.org/#/c/348372/6
15:49:49 but we have a dependency on #link https://review.openstack.org/#/c/351923/
15:51:30 https://review.openstack.org/#/c/351947/1 first, then that, but yes, those need to go back
15:51:52 haleyb: yes
15:52:13 any others
15:52:29 haleyb: that's it.
15:52:51 #topic Open Discussion
15:53:13 Ok, let the tomatoes fly! :)
15:53:54 haleyb: I might need some help/guidance from you on creating the iproute chains for the floatingip namespace for fast path exit.
15:54:19 iptables ?
15:54:20 This might also help for the floatingip namespace static routes for nexthop.
15:54:54 Basically we have to add static routes for every tenant owned cidr in the fipnamespace.
15:55:14 We should figure out what is the best way to do this without affecting what we have today.
15:56:35 ok, i can help with that
15:56:40 I do have a patch right now that adds the static route, I will try to polish it a bit and will pull you in for review and you can provide your feedback.
15:56:56 #link https://review.openstack.org/#/c/297468/
15:58:02 i'll take a look
15:58:02 haleyb: I wanted to have it working before the mid-cycle so that we can churn it out. But will see where it goes.
15:58:56 That's all I had for today.
15:59:27 Swami_: ok. i know you won't be there, but https://etherpad.openstack.org/p/newton-neutron-midcycle-workitems had a list of things to discuss at midcycle if you want to add it, maybe irc discussion
15:59:52 haleyb: sure will add it to the list.
16:00:07 we are out of time, keep fixing those bugs! :)
16:00:10 #endmeeting