15:00:08 <haleyb> #startmeeting neutron_dvr
15:00:08 <openstack> Meeting started Wed Dec 9 15:00:08 2015 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:13 <openstack> The meeting name has been set to 'neutron_dvr'
15:00:13 <haleyb> hi swami
15:00:21 <haleyb> #chair Swami
15:00:24 <openstack> Current chairs: Swami haleyb
15:00:56 <obondarev> o/
15:01:04 <Swami> obondarev:hi
15:01:07 <haleyb> Swami: i had a complete system failure earlier this morning, so if I disappear it's all yours...
15:01:18 <Swami> haleyb: no problem
15:01:20 * regXboi wanders in late
15:01:21 <regXboi> hi
15:01:45 <Swami> I have to leave early today, so may be we can wind up the meeting early
15:01:54 <regXboi> I'm in favor of that
15:02:02 <haleyb> sure
15:02:19 <haleyb> #topic Announcements
15:02:28 <Swami> haleyb: I have edited the meeting wiki with bugs and then sorted the bugs based on category
15:03:03 <haleyb> ok, we might have stepped on each other as oleg and myself also edited today, i'll look
15:03:27 <obondarev> yeah
15:03:29 <Swami> haleyb: yes my changes overlapped yours and then I fixed it.
15:03:36 <haleyb> I didn't have any particular announcements, just that reviews have continued to merge, which is goodness
15:03:50 <haleyb> #topic Bugs
15:03:53 <Swami> haleyb: is the gate, zuul happy
15:03:54 * carl_baldwin wanders in
15:04:12 <fitoduarte> .
15:04:14 <Swami> Ok we have at least 4 new bugs for this week that was filed.
15:04:29 <haleyb> Swami: get is getting better
15:04:34 <haleyb> s/get/gate
15:04:36 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1524291
15:04:36 <openstack> Launchpad bug 1524291 in neutron "check_ports_on_host_and_subnet() duplicates check_ports_exist_on_l3agent()" [Low,In progress] - Assigned to Oleg Bondarev (obondarev)
15:04:55 <obondarev> preparing fix for it
15:04:58 <Swami> This is a low category and probably a cleanup and I don't think we need more discussion on this.
15:05:06 <Swami> obondarev: thanks
15:05:14 <obondarev> just noticed while working on dvr sheduling refactoring
15:05:19 <Swami> next one
15:05:23 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1524020
15:05:23 <openstack> Launchpad bug 1524020 in neutron "DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur" [Medium,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:05:52 <Swami> This bug was filed by me, again this is related to reducing the arp calls that are initiated from the server for all updates.
15:05:58 <Swami> I have a patch up for review.
15:06:01 <obondarev> I've left a couple og suggestions for that on in review
15:06:04 <Swami> please feel free to review it.
15:06:12 <obondarev> of*
15:06:19 <Swami> obondarev: I will take a look at it. Thanks for your ongoing review comments.
15:06:34 <Swami> just for the benefit of the audience here is the patch details.
15:06:37 <obondarev> Swami: thanks for being patient
15:06:45 <haleyb> i didn't see a review link in the bug
15:06:56 <Swami> #link https://review.openstack.org/#/c/253685/
15:07:26 <Swami> haleyb: sometimes I have seen if you push the patch first and then assign the bug id, it is not getting populated in the launchpad.
15:07:45 <Swami> The next one is
15:07:49 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1522824
15:07:49 <openstack> Launchpad bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] - Assigned to Oleg Bondarev (obondarev)
15:08:11 <obondarev> this one is one of the culprits for multinode job failures
15:08:11 <Swami> There is patch that oleg pushed in for this issue.
15:08:27 <Swami> obondarev: yes I have seen this failure more
15:08:43 <obondarev> there is one with 'resize' test as well
15:08:51 <Swami> obondarev: also the other failure i have noticed is the "volume-boot-pattern". Just to consider after.
15:08:54 <obondarev> let's see if it's the same root cause
15:09:02 <Swami> obondarev: thanks
15:09:07 <haleyb> we need to ping an ML2 core on that review
15:09:22 <Swami> #link https://review.openstack.org/#/c/253569/
15:09:50 <obondarev> just added kevinbenton there
15:09:51 <Swami> obondarev: you have mentioned in your comment, that we don't need any event update from nova and neutron should handle itself.
15:10:42 <obondarev> Swami: correct
15:10:52 <Swami> obondarev: going forward if we need any nova handshake is that possible with neutron or neutron should just work on the basis of the port status. The reason I ask this question is for the live migration.
15:11:25 <obondarev> Swami: live migration is a bit harder
15:11:39 <obondarev> Swami: for this bug we don't need anything on nova side
15:11:39 <Swami> obondarev: ok we can discuss about it later.
15:12:04 <Swami> That's all for the new bugs this week.
15:12:13 <Swami> But i wanted to discuss about a new feature patch.
15:12:16 <Swami> #link https://review.openstack.org/#/c/143169/
15:12:25 <Swami> This is the DVR SNAT HA patch.
15:13:03 <Swami> This has been pending for a while and fitoduarte has addressed all merge conflicts on this patch.
15:13:19 <Swami> Can we have the cores attention on this patch.
15:13:54 <Swami> Also a couple of days back amuller mentioned about this patch and other dependent patches for the DVR, L3 HA to be working smoother.
15:13:57 <regXboi> IIRC, this had +2s on an earlier revision
15:14:04 <regXboi> the issue was amuller wanted to review
15:14:06 <Swami> regXboi: yes
15:14:08 <haleyb> I will review again, but assuming we need amuller has final say
15:14:13 <obondarev> I'll review it, it seems to be affecting sheduling refactoring as well
15:14:38 <Swami> maintaining that patch is too hard right now and we should probably close the loop on it.
15:14:40 * carl_baldwin will look again
15:14:52 <Swami> carl_baldwin: obondarev: haleyb: thanks
15:15:56 <Swami> I have a patch for addressing the allowed address pair with FIP on DVR.
15:16:00 <Swami> #link https://review.openstack.org/#/c/254439/
15:16:14 <Swami> This is currently WIP and I would like to get some early feedback on this.
15:16:39 <Swami> here is the bug details on this.
15:16:43 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1445255
15:16:43 <openstack> Launchpad bug 1445255 in neutron "DVR FloatingIP to unbound port does not work" [Low,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:17:39 <Swami> am I still visible.
15:17:52 <regXboi> yes on the visibility
15:17:55 * regXboi reading
15:18:51 <Swami> next one in the list
15:18:55 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1521815
15:18:55 <openstack> Launchpad bug 1521815 in neutron "DVR functional tests failing intermittently" [Low,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:19:18 <Swami> This bug can't be reproduced and it kind of settled itself in the gate.
15:19:56 <Swami> We are not sure what caused this issue in the gate. So let us keep an eye on this and move forward. regXboi has lowered the serverity on this bug.
15:20:37 <Swami> I think that's all we have for bugs today.
15:20:38 <regXboi> I've got nothing to add that isn't in the bug discussion
15:20:55 <Swami> If you guys have anything else to discuss we can discuss, else we can move one.
15:21:00 <regXboi> let's move on
15:21:10 <Swami> ok, that's all I have for bugs.
15:21:21 <Swami> #topic gate-failures
15:21:42 <Swami> How is gate failure looking with respect to dvr
15:22:19 <regXboi> so the check pipelines have settled out after the metering hangup
15:22:29 <Swami> regXboi: yes true
15:22:49 <obondarev> I'm concerned by a new dvr multinode failure here https://review.openstack.org/#/c/250075/
15:22:51 <regXboi> single node DVR can probably go back to voting - it is tracking neutron-full pretty well
15:23:00 <obondarev> which is probably related to the patch
15:23:00 * regXboi looks
15:23:27 <obondarev> need to investigate before merging
15:23:28 <Swami> obondarev: is it causing failures in gate.
15:23:40 <regXboi> obondarev: I'm -1 that patch because of that
15:23:43 <Swami> obondarev: I think we agreed that if it breaks the gate we can revert it.
15:24:08 <obondarev> We've not merged it yet, so we have time to investigate
15:24:11 <Swami> at this point we don't want to introduce more issues in gate.
15:24:17 <regXboi> I've thrown a -1 on that patch
15:24:19 <obondarev> Swami: right
15:24:37 <regXboi> *because* that dvr-multinode failure isn't one I can just say "oh that's independent"
15:24:40 <obondarev> regXboi: fair enough
15:24:48 <Swami> obondarev: if you have more data points please update it.
15:24:51 <obondarev> regXboi: right
15:25:05 <obondarev> Swami: I don't have any ATM
15:25:11 <Swami> obondarev: ok
15:25:18 <obondarev> will try to find time to investigate
15:25:30 <regXboi> so right now I'm thinking we can ask for dvr to be voting again
15:25:36 <regXboi> and maybe multinode-full
15:25:39 <Swami> ok I do have a debug patch to address the fip failures in the gate, but I am seeing another issue where the pings are not getting response.
15:25:46 <regXboi> but multinode-dvr is still too high in its failure rate
15:25:48 <carl_baldwin> regXboi: Let's get a patch up to make single node dvr voting. It might take a few more days of stability but at least we'll have it queued up.
15:25:48 <haleyb> regXboi: so the DVR multinode still seems double the regular multinode - 25%
15:26:09 <regXboi> haleyb: yes - that's what I mean by "still too high"
15:26:22 <Swami> If everyone is with an agreement then I can push a voting job patch just for the single node.
15:26:45 <regXboi> Swami: if you have time today, please spin the patch - I'm stuck in meetings all day :(
15:26:53 <Swami> In the case of multinode we should first fix the live migration problem otherwise the tests will fail.
15:27:06 <regXboi> multinode-dvr still needs to be nv
15:27:15 <Swami> regXboi: I will be in a meeting as well the entire day.
15:27:24 <haleyb> Swami: +1 on the singlenode, let me know if i can help
15:27:29 <regXboi> Swami, carl_baldwin: ok, I'll spin a patch here
15:27:30 <Swami> regXboi: but i will try to push one.
15:27:46 <regXboi> heh - ok - you spin the singlenode dvr
15:27:51 <regXboi> I'll spin the multinode-full
15:28:00 <regXboi> because I'd like to have that queued up as well
15:28:01 <Swami> regXboi: ok fine, I will spin it up.
15:28:10 <Swami> ok, let us move on to the next item.
15:28:11 <obondarev> Swami: which live migration problem are you reffering?
15:28:36 <Swami> obondarev: the block migration issue that is breaking with ssh.
15:29:01 <obondarev> Swami: ok
15:29:02 <Swami> #topic performance-scalability
15:29:27 <Swami> obondarev: anything to add in here.
15:29:33 <obondarev> I've put an initial WIP patch for dvr sheduling refactoring
15:29:46 <obondarev> #link https://review.openstack.org/#/c/254837/
15:30:01 <obondarev> continue working on it
15:30:15 <Swami> obondarev: thanks
15:30:18 <carl_baldwin> ++
15:30:30 <Swami> #topic open-discussion
15:30:41 <Swami> anything else we need to discuss here.
15:30:42 <obondarev> it is sad that https://review.openstack.org/#/c/143169 had to deal with all complexity of dvr scheduling
15:30:57 <obondarev> which should be cleaned up soon
15:31:07 <Swami> obondarev: yes I agree.
15:31:18 <Swami> obondarev: that is the reason it took a while.
15:31:36 <obondarev> one of the reasons I guess :)
15:31:47 <Swami> obondarev: agreed
15:31:58 <Swami> ok, that's all I have today.
15:32:02 <haleyb> Swami: i found the infra change that disabled the dvr job, https://review.openstack.org/#/c/223173/ i can send a change to un-do that
15:32:13 <Swami> haleyb: sure, that would work.
15:32:52 <haleyb> #action haleyb will re-enable dvr job
15:33:03 <Swami> haleyb: Thanks
15:33:07 <regXboi> ok, multinode-full voting patch is up for review: https://review.openstack.org/#/c/212058/
15:33:27 <haleyb> you're faster than me
15:33:28 <regXboi> I admit to having an old patch set to rebase :)
15:33:40 <Swami> regXboi: you might have had it all prepared.
15:33:45 <Swami> regXboi: good work.
15:34:00 <regXboi> look at the number, and then look at the patch history
15:34:06 <regXboi> it's been hanging around for months
15:34:19 <Swami> do we have anything else to discuss here.
15:34:27 <Swami> if not we can end this meeting.
15:34:31 <regXboi> no, I'm good
15:34:43 <Swami> Thanks for your attendance and see you all next week.
15:34:57 <haleyb> ok, thanks everyone, keep up the great work :)
15:35:02 <haleyb> #endmeeting