15:00:08 <haleyb> #startmeeting neutron_dvr
15:00:08 <openstack> Meeting started Wed Dec  9 15:00:08 2015 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:13 <openstack> The meeting name has been set to 'neutron_dvr'
15:00:13 <haleyb> hi swami
15:00:21 <haleyb> #chair Swami
15:00:24 <openstack> Current chairs: Swami haleyb
15:00:56 <obondarev> o/
15:01:04 <Swami> obondarev:hi
15:01:07 <haleyb> Swami: i had a complete system failure earlier this morning, so if I disappear it's all yours...
15:01:18 <Swami> haleyb: no problem
15:01:20 * regXboi wanders in late
15:01:21 <regXboi> hi
15:01:45 <Swami> I have to leave early today, so maybe we can wind up the meeting early
15:01:54 <regXboi> I'm in favor of that
15:02:02 <haleyb> sure
15:02:19 <haleyb> #topic Announcements
15:02:28 <Swami> haleyb: I have edited the meeting wiki with bugs and then sorted the bugs based on category
15:03:03 <haleyb> ok, we might have stepped on each other as oleg and myself also edited today, i'll look
15:03:27 <obondarev> yeah
15:03:29 <Swami> haleyb: yes my changes overlapped yours and then I fixed it.
15:03:36 <haleyb> I didn't have any particular announcements, just that reviews have continued to merge, which is goodness
15:03:50 <haleyb> #topic Bugs
15:03:53 <Swami> haleyb: is the gate (zuul) happy?
15:03:54 * carl_baldwin wanders in
15:04:12 <fitoduarte> .
15:04:14 <Swami> OK, we have at least 4 new bugs that were filed this week.
15:04:29 <haleyb> Swami: get is getting better
15:04:34 <haleyb> s/get/gate
15:04:36 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1524291
15:04:36 <openstack> Launchpad bug 1524291 in neutron "check_ports_on_host_and_subnet() duplicates check_ports_exist_on_l3agent()" [Low,In progress] - Assigned to Oleg Bondarev (obondarev)
15:04:55 <obondarev> preparing fix for it
15:04:58 <Swami> This is a low category and probably a cleanup and I don't think we need more discussion on this.
15:05:06 <Swami> obondarev: thanks
15:05:14 <obondarev> just noticed while working on dvr scheduling refactoring
15:05:19 <Swami> next one
15:05:23 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1524020
15:05:23 <openstack> Launchpad bug 1524020 in neutron "DVRImpact: dvr_vmarp_table_update and dvr_update_router_add_vm is called for every port update instead of only when host binding or mac-address changes occur" [Medium,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:05:52 <Swami> This bug was filed by me, again this is related to reducing the arp calls that are initiated from the server for all updates.
15:05:58 <Swami> I have a patch up for review.
15:06:01 <obondarev> I've left a couple og suggestions for that on in review
15:06:04 <Swami> please feel free to review it.
15:06:12 <obondarev> of*
15:06:19 <Swami> obondarev: I will take a look at it. Thanks for your ongoing review comments.
15:06:34 <Swami> just for the benefit of the audience here is the patch details.
15:06:37 <obondarev> Swami: thanks for being patient
15:06:45 <haleyb> i didn't see a review link in the bug
15:06:56 <Swami> #link https://review.openstack.org/#/c/253685/
15:07:26 <Swami> haleyb: sometimes I have seen that if you push the patch first and then assign the bug id, the review link does not get populated in Launchpad.
15:07:45 <Swami> The next one is
15:07:49 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1522824
15:07:49 <openstack> Launchpad bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] - Assigned to Oleg Bondarev (obondarev)
15:08:11 <obondarev> this one is one of the culprits for multinode job failures
15:08:11 <Swami> There is a patch that oleg pushed for this issue.
15:08:27 <Swami> obondarev: yes I have seen this failure more
15:08:43 <obondarev> there is one with 'resize' test as well
15:08:51 <Swami> obondarev: also the other failure i have noticed is the "volume-boot-pattern". Just to consider after.
15:08:54 <obondarev> let's see if it's the same root cause
15:09:02 <Swami> obondarev: thanks
15:09:07 <haleyb> we need to ping an ML2 core on that review
15:09:22 <Swami> #link https://review.openstack.org/#/c/253569/
15:09:50 <obondarev> just added kevinbenton there
15:09:51 <Swami> obondarev: you have mentioned in your comment, that we don't need any event update from nova and neutron should handle itself.
15:10:42 <obondarev> Swami: correct
15:10:52 <Swami> obondarev: going forward, if we need any nova handshake, is that possible with neutron, or should neutron just work on the basis of the port status? The reason I ask is live migration.
15:11:25 <obondarev> Swami: live migration is a bit harder
15:11:39 <obondarev> Swami: for this bug we don't need anything on nova side
15:11:39 <Swami> obondarev: ok, we can discuss it later.
15:12:04 <Swami> That's all for the new bugs this week.
15:12:13 <Swami> But I wanted to discuss a new feature patch.
15:12:16 <Swami> #link https://review.openstack.org/#/c/143169/
15:12:25 <Swami> This is the DVR SNAT HA patch.
15:13:03 <Swami> This has been pending for a while and fitoduarte has addressed all merge conflicts on this patch.
15:13:19 <Swami> Can we have the cores' attention on this patch?
15:13:54 <Swami> Also, a couple of days back amuller mentioned this patch and other dependent patches needed for DVR and L3 HA to work more smoothly.
15:13:57 <regXboi> IIRC, this had +2s on an earlier revision
15:14:04 <regXboi> the issue was amuller wanted to review
15:14:06 <Swami> regXboi: yes
15:14:08 <haleyb> I will review again, but I'm assuming amuller has final say
15:14:13 <obondarev> I'll review it, it seems to be affecting scheduling refactoring as well
15:14:38 <Swami> maintaining that patch is too hard right now and we should probably close the loop on it.
15:14:40 * carl_baldwin will look again
15:14:52 <Swami> carl_baldwin: obondarev: haleyb: thanks
15:15:56 <Swami> I have a patch for addressing the allowed address pair with FIP on DVR.
15:16:00 <Swami> #link https://review.openstack.org/#/c/254439/
15:16:14 <Swami> This is currently WIP and I would like to get some early feedback on this.
15:16:39 <Swami> here are the bug details for this.
15:16:43 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1445255
15:16:43 <openstack> Launchpad bug 1445255 in neutron "DVR FloatingIP to unbound port does not work" [Low,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:17:39 <Swami> am I still visible?
15:17:52 <regXboi> yes on the visibility
15:17:55 * regXboi reading
15:18:51 <Swami> next one in the list
15:18:55 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1521815
15:18:55 <openstack> Launchpad bug 1521815 in neutron "DVR functional tests failing intermittently" [Low,Confirmed] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:19:18 <Swami> This bug can't be reproduced and it kind of settled itself in the gate.
15:19:56 <Swami> We are not sure what caused this issue in the gate. So let us keep an eye on this and move forward. regXboi has lowered the severity on this bug.
15:20:37 <Swami> I think that's all we have for bugs today.
15:20:38 <regXboi> I've got nothing to add that isn't in the bug discussion
15:20:55 <Swami> If you guys have anything else to discuss we can discuss, else we can move on.
15:21:00 <regXboi> let's move on
15:21:10 <Swami> ok, that's all I have for bugs.
15:21:21 <Swami> #topic gate-failures
15:21:42 <Swami> How are gate failures looking with respect to dvr?
15:22:19 <regXboi> so the check pipelines have settled out after the metering hangup
15:22:29 <Swami> regXboi: yes true
15:22:49 <obondarev> I'm concerned by a new dvr multinode failure here https://review.openstack.org/#/c/250075/
15:22:51 <regXboi> single node DVR can probably go back to voting - it is tracking neutron-full pretty well
15:23:00 <obondarev> which is probably related to the patch
15:23:00 * regXboi looks
15:23:27 <obondarev> need to investigate before merging
15:23:28 <Swami> obondarev: is it causing failures in the gate?
15:23:40 <regXboi> obondarev: I'm -1 that patch because of that
15:23:43 <Swami> obondarev: I think we agreed that if it breaks the gate we can revert it.
15:24:08 <obondarev> We've not merged it yet, so we have time to investigate
15:24:11 <Swami> at this point we don't want to introduce more issues in gate.
15:24:17 <regXboi> I've thrown a -1 on that patch
15:24:19 <obondarev> Swami: right
15:24:37 <regXboi> *because* that dvr-multinode failure isn't one I can just say "oh that's independent"
15:24:40 <obondarev> regXboi: fair enough
15:24:48 <Swami> obondarev: if you have more data points please update it.
15:24:51 <obondarev> regXboi: right
15:25:05 <obondarev> Swami: I don't have any ATM
15:25:11 <Swami> obondarev: ok
15:25:18 <obondarev> will try to find time to investigate
15:25:30 <regXboi> so right now I'm thinking we can ask for dvr to be voting again
15:25:36 <regXboi> and maybe multinode-full
15:25:39 <Swami> ok, I do have a debug patch to address the fip failures in the gate, but I am seeing another issue where the pings are not getting a response.
15:25:46 <regXboi> but multinode-dvr is still too high in its failure rate
15:25:48 <carl_baldwin> regXboi: Let's get a patch up to make single node dvr voting.  It might take a few more days of stability but at least we'll have it queued up.
15:25:48 <haleyb> regXboi: so the DVR multinode still seems double the regular multinode - 25%
15:26:09 <regXboi> haleyb: yes - that's what I mean by "still too high"
15:26:22 <Swami> If everyone is in agreement then I can push a voting job patch just for the single node.
15:26:45 <regXboi> Swami: if you have time today, please spin the patch - I'm stuck in meetings all day :(
15:26:53 <Swami> In the case of multinode we should first fix the live migration problem otherwise the tests will fail.
15:27:06 <regXboi> multinode-dvr still needs to be nv
15:27:15 <Swami> regXboi: I will be in a meeting as well the entire day.
15:27:24 <haleyb> Swami: +1 on the singlenode, let me know if i can help
15:27:29 <regXboi> Swami, carl_baldwin: ok, I'll spin a patch here
15:27:30 <Swami> regXboi: but i will try to push one.
15:27:46 <regXboi> heh - ok - you spin the singlenode dvr
15:27:51 <regXboi> I'll spin the multinode-full
15:28:00 <regXboi> because I'd like to have that queued up as well
15:28:01 <Swami> regXboi: ok fine, I will spin it up.
15:28:10 <Swami> ok, let us move on to the next item.
15:28:11 <obondarev> Swami: which live migration problem are you referring to?
15:28:36 <Swami> obondarev: the block migration issue that is breaking with ssh.
15:29:01 <obondarev> Swami: ok
15:29:02 <Swami> #topic performance-scalability
15:29:27 <Swami> obondarev: anything to add in here.
15:29:33 <obondarev> I've put up an initial WIP patch for dvr scheduling refactoring
15:29:46 <obondarev> #link https://review.openstack.org/#/c/254837/
15:30:01 <obondarev> continue working on it
15:30:15 <Swami> obondarev: thanks
15:30:18 <carl_baldwin> ++
15:30:30 <Swami> #topic open-discussion
15:30:41 <Swami> anything else we need to discuss here.
15:30:42 <obondarev> it is sad that https://review.openstack.org/#/c/143169 had to deal with all the complexity of dvr scheduling
15:30:57 <obondarev> which should be cleaned up soon
15:31:07 <Swami> obondarev: yes I agree.
15:31:18 <Swami> obondarev: that is the reason it took a while.
15:31:36 <obondarev> one of the reasons I guess :)
15:31:47 <Swami> obondarev: agreed
15:31:58 <Swami> ok, that's all I have today.
15:32:02 <haleyb> Swami: i found the infra change that disabled the dvr job, https://review.openstack.org/#/c/223173/ i can send a change to un-do that
15:32:13 <Swami> haleyb: sure, that would work.
15:32:52 <haleyb> #action haleyb will re-enable dvr job
15:33:03 <Swami> haleyb: Thanks
15:33:07 <regXboi> ok, multinode-full voting patch is up for review: https://review.openstack.org/#/c/212058/
15:33:27 <haleyb> you're faster than me
15:33:28 <regXboi> I admit to having an old patch set to rebase :)
15:33:40 <Swami> regXboi: you might have had it all prepared.
15:33:45 <Swami> regXboi: good work.
15:34:00 <regXboi> look at the number, and then look at the patch history
15:34:06 <regXboi> it's been hanging around for months
15:34:19 <Swami> do we have anything else to discuss here.
15:34:27 <Swami> if not we can end this meeting.
15:34:31 <regXboi> no, I'm good
15:34:43 <Swami> Thanks for your attendance and see you all next week.
15:34:57 <haleyb> ok, thanks everyone, keep up the great work :)
15:35:02 <haleyb> #endmeeting