15:01:04 #startmeeting neutron_dvr
15:01:05 Meeting started Wed Sep 28 15:01:04 2016 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:06 I greet you!
15:01:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:09 The meeting name has been set to 'neutron_dvr'
15:01:10 #chair Swami
15:01:12 Current chairs: Swami haleyb
15:01:18 #topic Announcements
15:01:20 jschwarz: hi
15:01:55 RC2 was cut, that's hopefully the end, no new dvr issues that we needed to fix
15:02:11 haleyb: old issues are enough
15:02:53 yes, there is enough already :)
15:03:08 #topic Bugs
15:03:25 hi
15:03:34 No new bugs this week. :)
15:03:37 btw, i'm in another meeting if i flake out
15:03:50 haleyb: no problem.
15:04:19 https://bugs.launchpad.net/neutron/+bug/1476469
15:04:20 Launchpad bug 1476469 in neutron "with DVR, a VM can't use floatingIP and VPN at the same time" [Medium,Opinion]
15:05:07 After looking at the bug again it seems that it is a design limitation where the VPN service is centralized and running only on the SNAT Namespace.
15:06:49 Let me know if anyone else has a different opinion.
15:07:20 The next in the list is.
15:07:23 #link https://bugs.launchpad.net/neutron/+bug/1625333
15:07:24 Launchpad bug 1625333 in neutron "Booting VM with a Floating IP and pinging it via that takes a long time with errors in L3-Agent logs when using DVR" [Undecided,Invalid]
15:08:12 We have not got any reply from the person who filed the bug to make sure that is only seen in their environment with the custom l2pop setting they have.
15:08:24 So until then we don't have anything to discuss in this bug.
15:08:24 We still have no response on what exactly the kernel issue was, or a bug link
15:08:29 I know otherwiseguy is actively working on this, but I'm not aware of any progress made on this yet.
15:08:33 haleyb yes.
15:08:42 jschwarz: thanks
15:09:04 Swami_: There is an open Red Hat Bugzilla on it, but it is private due to being posted by a customer.
15:09:18 The kernel team is aware and is working on it.
15:09:19 otherwiseguy: got it.
15:09:44 otherwiseguy: So let us keep it and watch it.
15:11:29 The next in the list is
15:11:31 #link https://bugs.launchpad.net/neutron/+bug/1612192
15:11:33 Launchpad bug 1612192 in neutron "L3 DVR: Unable to complete operation on subnet" [High,Confirmed]
15:13:03 haleyb: any update on the gate issues with these two bugs.
15:13:14 I have done nothing on this save for looking at logstash, and do not see anything in the past 7 days
15:13:17 #link https://bugs.launchpad.net/neutron/+bug/1612804
15:13:18 Launchpad bug 1612804 in neutron "test_shelve_instance fails with sshtimeout" [High,Confirmed]
15:13:31 haleyb: ok thanks.
15:13:58 The next is
15:14:00 #link https://bugs.launchpad.net/neutron/+bug/1593354
15:14:02 Launchpad bug 1593354 in neutron "SNAT HA failed because of missing nat rule in snat namespace iptable" [Undecided,New]
15:14:24 jschwarz: did you get a chance to check this out in mitaka. I know you have verified it in the newton branch.
15:14:45 Swami_, I didn't get a chance to look at it at all yet :(
15:15:15 I can re-confirm it's not happening on newton though
15:15:34 close it! :)
15:15:49 haleyb: yes we can close it for now.
15:16:00 haleyb: I have a couple of other bugs that need to be closed.
15:16:29 haleyb: I have added a section in the Wiki for bugs that need to be closed. So you can take a look at it and close it.
15:16:50 Swami_: yes, saw that, will look
15:17:07 jschwarz: can you confirm if this bug is still valid or can we close it.
15:17:12 #link https://bugs.launchpad.net/neutron/+bug/1595043
15:17:13 Launchpad bug 1595043 in neutron "Make DVR portbinding implementation useful for HA ports" [Medium,In progress] - Assigned to venkata anil (anil-venkata)
15:17:37 Swami_, I think this one can be closed - it was dealt with in the l2pop patch by anilvenkata iirc
15:17:39 Anil?
15:17:40 I knew anilvenkata had an alternate patch and merged that patch. Do we still need this bug.
15:17:49 jschwarz: yes that's what I thought.
15:17:55 jschwarz: thanks for the confirmation.
15:18:19 Swami_, give me some time for that
15:18:32 Swami_, before we close it
15:18:39 anilvenkata: ok think through that, and i will remove it from the bugs to be closed list then.
15:18:48 Swami_, thanks Swami
15:18:49 anilvenkata: thanks
15:19:01 The next in the list is
15:19:03 #link https://bugs.launchpad.net/neutron/+bug/1606741
15:19:04 Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [High,In progress] - Assigned to Zhixin Li (lizhixin)
15:19:24 Here is the patch for it.
15:19:27 #link https://review.openstack.org/352686
15:19:30 There is a discussion going on right now on what is the correct fix
15:20:05 jschwarz: so needs review on this patch.
15:20:11 and there was no movement on getting nginx or another wsgi that has less memory overhead
15:20:26 Swami_, mostly just needing to find the correct implementation
15:21:00 haleyb: is this related to the above patch.
15:21:25 Swami_: well, if the memory used was less i don't think running the proxy everywhere would be as big a problem
15:22:02 haleyb, Swami_ did the bug say that already many metadata services running on that node?
15:22:20 anilvenkata: I don't think so.
15:22:33 Swami_, then it won't be that issue
15:23:05 I think this should get more visibility by the L3 guys
15:23:20 anilvenkata: memory consumption was raised by carl on PS8
15:23:22 the discussion has been going on for a few weeks now
15:23:54 haleyb, jschwarz ok
15:23:55 anilvenkata: The only thing mentioned here in the bug is he is running all nodes in dvr_snat mode and the metadata agent is not running on the node with dvr_snat agent mode.
15:23:59 haleyb, anilvenkata, I remember for some reason that each metadata proxy process = 80MB
15:24:03 but my memory might be wrong
15:24:26 jschwarz: I think we internally also heard that metadata proxy consumes too much memory
15:24:41 yes
15:24:57 Swami_, I'm still not convinced this can't go on only the master node and adjust the routing rules for the other nodes
15:24:57 jschwarz: yes, it's more a related bug, but came up in the context of running on the backup l3-agent in HA
15:25:07 logically this should be a good solution
15:25:59 jschwarz: seems possible, I have not investigated on the metadata agent a lot.
15:26:27 Is there anything else to discuss on this bug.
15:26:39 I would gladly dedicate some time for that, but I'm already overbooked and it's holiday season in Israel so I'm gonna get even less work cycles in the coming month
15:26:58 I think this should be proposed for a Friday session for the summit
15:27:02 jschwarz: enjoy your festival.
15:27:25 we really should find a proper solution to this and if one is not achieved by this, talking about this in person seems like the best solution
15:27:27 thoughts?
15:27:32 Swami_, thanks :)
15:27:44 s/by this/by then/
15:27:47 The next in the list is
15:27:49 #link https://bugs.launchpad.net/neutron/+bug/1506567
15:27:51 Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [Undecided,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:28:07 #link https://review.openstack.org/#/c/377108/
15:28:10 Link to the patch.
15:28:28 Needs some review. haleyb I got your review comments.
15:28:35 haleyb: thanks
15:28:42 Swami_, can we set the importance of the launchpad bug please?
15:29:06 Swami_: yes, i think doing some cleanup like that would make it easier to review, let me know if you need my help
15:29:46 haleyb: Sure, I don't have the rights to set the priority of the bug.
15:30:08 haleyb: maybe you can give me the permission to do it.
15:30:09 Swami_: i do, what do you want it
15:31:02 Swami_, haleyb I added a comment for https://bugs.launchpad.net/neutron/+bug/1606741/comments/5
15:31:04 Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [High,In progress] - Assigned to Zhixin Li (lizhixin)
15:31:04 Swami_: i think you get those perms if added to the right group, links in the neutron page for bug czar
15:31:37 those patch will improve HA
15:31:40 haleyb: ok will do.
15:31:55 Swami_: move to medium or high?
15:32:09 haleyb: move it to high
15:32:21 done
15:32:41 The next in the list is
15:32:44 #link https://bugs.launchpad.net/neutron/+bug/1580648
15:32:45 Launchpad bug 1580648 in neutron "Two HA routers in master state during functional test" [High,Confirmed] - Assigned to John Schwarz (jschwarz)
15:32:51 jschwarz: has this bug been resolved.
15:33:02 Swami_, nope
15:33:14 jschwarz, https://review.openstack.org/#/c/357458/ is not helping
15:33:16 ?
15:33:17 Swami_, not only that, it stopped reproducing for me on a live setup, and for Ann
15:33:36 jschwarz: so can we close this as well.
15:33:41 anilvenkata, the bug was re-opened after that patch was merged I think.
15:33:57 Swami_, anyway, it looks a bit dead in the water
15:34:03 ok
15:34:06 jschwarz: I did see a message in there by Ann that he can still see this problem in the functional tests.
15:34:17 on the other hand, we do have a bunch of "master-master" occurrences lately
15:34:26 looking
15:34:49 jschwarz: ok are you going to file a bug, after triaging.
15:34:56 Swami_, Ann wrote "And now it also does not reproduce for me as well."
15:35:06 Swami_, if there is a bug to report, I will
15:35:12 Swami_, no such luck as of now though
15:35:14 jschwarz: ok thanks
15:35:28 regarding 1580648 I think we should close it atm
15:35:33 jschwarz: So for now I will leave this bug untouched and jschwarz you can recommend either to close it or not.
15:35:35 if this pops up again we can reopen
15:35:45 jschwarz: ok thanks for the confirmation.
15:35:51 Swami_, I'll close it now
15:36:04 The next one in the list is
15:36:06 #link https://bugs.launchpad.net/neutron/+bug/1571676
15:36:08 Launchpad bug 1571676 in neutron "After binding a floating IP to VM, the static route can't work in DVR." [Undecided,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:36:30 Link to the patch #link https://review.openstack.org/#/c/308068/
15:36:35 Needs review.
15:37:32 The next is the RFE
15:37:35 #link https://bugs.launchpad.net/neutron/+bug/1577488
15:37:37 Launchpad bug 1577488 in neutron "[RFE]"Fast exit" for compute node egress flows when using DVR" [Wishlist,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:37:48 This RFE has two patches and needs review as well.
15:38:00 #link https://review.openstack.org/#/c/283757/
15:38:13 #link https://review.openstack.org/#/c/355062/
15:38:22 Please review these patches.
15:38:37 Swami_, will do! great RFE IMO
15:38:38 jschwarz: anilvenkata: Is there anything else from your side for the bugs, that I missed.
15:38:55 nothing I'm aware of
15:39:02 nothin, thanks Swami_
15:39:10 ok, thanks.
15:39:16 That's all I had for bugs today.
15:39:24 haleyb: back to you.
15:39:30 thanks
15:39:45 #topic Gate failures
15:40:24 I don't think dvr has been an issue in the gate lately
15:40:49 haleyb: good news
15:41:32 Swami_, haleyb jschwarz good news
15:41:35 there are check queue failures, but those are sometimes false positives
15:41:54 i.e. a bug in a review
15:42:11 haleyb: agreed
15:44:11 Swami_: the dvr-multinode in the gate isn't voting, right?
15:44:34 haleyb: yes
15:45:03 i guess it wouldn't be in the gate if it wasn't voting
15:46:04 haleyb: yes you are right it is voting.
15:46:17 i guess i'm still confused by the extra jobs grafana lists with the same name
15:46:48 it's the xenial ones that are non-voting in the check queue
15:47:23 anyways, not much else here
15:47:29 haleyb: ok
15:47:29 #topic Stable backports
15:47:54 I don't have any pending backports at this point.
15:48:04 Swami_: oleg posted this today - https://review.openstack.org/#/c/378374/
15:48:28 that's the only active i know of
15:48:45 haleyb: ok.
15:48:57 #topic Open Discussion
15:49:04 hey yo
15:49:08 free for all
15:49:15 Re: the DVR+HA job we discussed last week
15:49:25 jschwarz: yes
15:49:32 I didn't get any cycles there, but Swami_ did send me a mail with details on how to accomplish this
15:49:47 thankfully, anilvenkata has stepped up and he'll take over for this
15:49:49 jschwarz: was that patch useful
15:49:57 so he's the point of contact for this now
15:50:19 Swami_, I didn't look into it that much - anilvenkata will have more details i believe
15:50:24 jschwarz, Swami_ :) yes, wanted to have HA+DVR job on CI
15:50:41 anilvenkata: great
15:50:50 jschwarz: good to know. if we can get that working would be great
15:51:01 jschwarz: as I mentioned first discuss with clarkb on the extra node requirement.
15:51:02 haleyb, we are in agreement
15:51:07 i trust in anil
15:51:10 Swami_, haleyb sure, thanks Swamy and Brian
15:51:21 I will ping you if I need any help for that
15:51:30 anilvenkata: no problem
15:51:31 jschwarz, :)
15:52:33 anything else to discuss?
15:52:46 haleyb: nothing from me
15:53:40 nope
15:54:39 ok, then i'll let you get to your patches :)
15:54:49 bye guys :)
15:54:51 see you next week
15:54:55 bye
15:54:58 bye
15:55:00 #endmeeting