15:00:26 <haleyb> #startmeeting neutron_dvr
15:00:27 <openstack> Meeting started Wed Dec 16 15:00:26 2015 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:28 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:31 <openstack> The meeting name has been set to 'neutron_dvr'
15:00:32 <Swami_> obondarev:hi
15:00:35 <Swami_> haleyb: hi
15:00:42 <haleyb> #chair Swami
15:00:43 <openstack> Warning: Nick not in channel: Swami
15:00:45 <openstack> Current chairs: Swami haleyb
15:00:51 <haleyb> hi there
15:00:57 <haleyb> #chair Swami_
15:00:58 <openstack> Current chairs: Swami Swami_ haleyb
15:01:09 * regXboi watches the musical chairs
15:01:17 <Swami_> do we have any announcements today
15:01:20 <obondarev> :)
15:01:34 <haleyb> #topic Announcements
15:02:14 <carl_baldwin> o/
15:02:28 <Swami_> carl_baldwin:hi
15:02:31 <haleyb> So this might be the last meeting this year for me as I'm on vacation, don't know about others
15:02:42 <regXboi> ditto for me
15:02:46 <Swami_> haleyb: makes sense
15:03:04 <carl_baldwin> ++
15:03:34 <Swami_> if we all decide then we can resume the meeting in 2016
15:04:00 <regXboi> I'd ask it the other way
15:04:07 <haleyb> Right.  I might be watching reviews here and there but not much else
15:04:16 <regXboi> "does anybody see a reason to meet with reduced groups"
15:04:41 <Swami_> I don't see any value with reduced groups.
15:04:48 <obondarev> we can continue in 2016 I think
15:05:06 <regXboi> ok, so somebody want to do the #agree?
15:05:19 <haleyb> #agree with that
15:05:24 <Swami_> +1
15:05:37 <regXboi> #agreed DVR meetings will pick up on 1/6/2016
15:05:42 <regXboi> whatever - I'll do it :)
15:05:44 <lizk> +1
15:06:09 <haleyb> I'll update the wiki page after the meeting
15:06:15 <Swami_> #action I will send a message out in the channel.
15:06:30 <regXboi> Swami_: you want to #undo and s/I/Swami_/ in that?
15:06:32 <haleyb> Swami_: dev list?
15:06:38 <regXboi> otherwise the notes have an action item for "I"
15:06:39 <hichihara> We need ML for info
15:06:56 <Swami_> #undo s/I/Swami
15:06:57 <openstack> Removing item from minutes: <ircmeeting.items.Action object at 0xa684150>
15:07:18 <regXboi> (I can't believe I'm asking this) haleyb: hand me a chair, please
15:07:37 <haleyb> #chair regXboi
15:07:38 <openstack> Current chairs: Swami Swami_ haleyb regXboi
15:07:45 <haleyb> hope you can reach the top shelf now
15:08:00 <regXboi> #action Swami_ to send a message about meetings continuing on 1/6/2016 to both channel and -dev ML
15:08:09 * regXboi breathes easier
15:08:30 <regXboi> any other announcements?
15:08:32 <Swami_> regXboi: did we give you enough breathing problems.
15:09:05 <regXboi> Swami_: since I've read minutes from these things in the past, I've become sensitive to the messages making sense to people who weren't here
15:09:06 <haleyb> i wish the bot would acknowledge all the #'s
15:09:07 <Swami_> nothing else I hope let us get going.
15:09:22 <regXboi> and note: in three days I will be a "person who weren't here" :)
15:09:31 <haleyb> #topic Bugs
15:09:39 <Swami_> haleyb: hi
15:09:56 <Swami_> I have created the bug sections in the Wiki page to categorize the bugs.
15:10:12 <Swami_> Please let me know if that helps out for the people who review and read the wiki.
15:10:23 <obondarev> Swami_: nice
15:10:29 <Swami_> still minor clean up is required, I will do it in the coming week.
15:10:51 <Swami_> This week we have not seen many bugs but a couple
15:10:53 <haleyb> Swami_: thanks, it helps, should cut/paste a small info line for each
15:11:01 <Swami_> Let us go over it.
15:11:08 <Swami_> haleyb: yes will work on it.
15:11:28 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1524908
15:11:28 <openstack> Launchpad bug 1524908 in neutron "Router may be removed from dvr_snat agent by accident" [Undecided,In progress] - Assigned to Oleg Bondarev (obondarev)
15:11:47 <Swami_> There is a patch for this bug right now.
15:12:11 <Swami_> It should be straightforward; this was found while addressing another patch by oleg, so oleg has pushed this patch.
15:12:15 <Swami_> Please review it.
15:12:29 <Swami_> Nothing to discuss more about this bug.
15:13:00 <Swami_> I think this bug has two patches: one to add the "admin_context" to delete and the other one to handle delete of router_namespaces in snat.
15:13:07 <Swami_> review both the patches.
15:13:29 <Swami_> The next one in the list is
15:13:34 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1526175
15:13:34 <openstack> Launchpad bug 1526175 in neutron "ha router schedule to dvr agent in compute node" [Medium,In progress] - Assigned to zhang sheng (langyxxl)
15:14:01 <regXboi> did we drop a bug reference there?
15:14:12 <Swami_> This one was filed recently, the bug states that somehow when ha is configured and dvr agent is running, the ha routers end up in the dvr node.
15:14:35 <Swami_> regXboi: which bug reference?
15:15:06 <regXboi> Swami_: you said Nothing to discuss more about 1524908 and then said "I think this bug has two patches..."
15:15:28 <regXboi> and I'm confused if the I think statement still refers to 1524908
15:15:42 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1424096
15:15:42 <openstack> Launchpad bug 1424096 in neutron "DVR routers attached to shared networks aren't being unscheduled from a compute node after deleting the VMs using the shared net" [Undecided,In progress] - Assigned to Oleg Bondarev (obondarev)
15:16:03 <Swami_> regXboi: this is the bug associated with the other dependent patch.
15:16:26 <regXboi> ah
15:16:31 <regXboi> ok, now the loop is closed - thanks
15:16:33 <Swami_> regXboi: is that clear now.
15:16:52 <obondarev> I reopened that one
15:17:06 <Swami_> obondarev: thanks
15:17:08 <obondarev> because faced it while reworking unit tests
15:17:20 <obondarev> so decided to go with a separate patch
15:17:58 <Swami_> obondarev: thanks for the update
15:18:03 <Swami_> next one.
15:18:07 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1522824
15:18:07 <openstack> Launchpad bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] - Assigned to Oleg Bondarev (obondarev)
15:18:30 <Swami_> This is an old bug and there is a patch out there for review. Please review it if not reviewed.
15:18:54 <Swami_> #link https://review.openstack.org/#/c/253569/ - This is the patch.
15:19:01 <haleyb> obondarev: were you discussing that one with kevin ?
15:19:28 <obondarev> it had +2 from carl_baldwin once but then kevinbenton suggested to use BUILD instead of new PENDING_BUILD
15:19:34 <Swami_> haleyb: no the one that i was discussing with kevin is the other one, this is related, but not the same.
15:19:57 <obondarev> haleyb: yes, we discussed with kevinbenton
15:19:58 <Swami_> obondarev: I did see that you pushed in a new version.
15:20:30 <obondarev> so the suggestion didn't work so I returned to initial version
15:20:44 <obondarev> reviews needed
15:21:01 <Swami_> obondarev: ok, thanks will review it.
15:21:10 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1456073
15:21:10 <openstack> Launchpad bug 1456073 in neutron "Connection to an instance with floating IP breaks during block migration when using DVR" [High,Confirmed]
15:21:30 <obondarev> Swami_: thanks, I think you did already
15:21:58 <Swami_> This bug is related to live migration on DVR and FIP.
15:22:12 <Swami_> haleyb: This is the one that I had discussion with kevin yesterday.
15:22:58 <Swami_> obondarev has a couple of patches to address the live migration but that is not directly related to this bug
15:23:18 <obondarev> right
15:24:16 <Swami_> obondarev: I will try to test with obondarev patch the live migration issue with fip and see if there is any improvement on this.
15:24:49 <obondarev> Swami_: my patches will hardly resolve the issue, I think more work is needed
15:24:50 <Swami_> obondarev: I have also added some comments in your patch on passing some information as "kwargs" to the registered parties.
15:24:58 <Swami_> obondarev: yes I agree.
15:25:11 <obondarev> Swami_: saw that, prefer a separate patch for this
15:25:15 <Swami_> obondarev: I will work on it and see what is more required.
15:25:23 <Swami_> obondarev: ok will add one.
15:25:49 <Swami_> #link https://review.openstack.org/#/c/246898/
15:25:56 <Swami_> This is the patch that we are discussing.
15:26:32 <carl_baldwin> Thanks for the link
15:26:49 <Swami_> The next high priority bug in the list is
15:26:53 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1462154
15:26:53 <openstack> Launchpad bug 1462154 in neutron "With DVR Pings to floating IPs replied with fixed-ips" [High,In progress] - Assigned to ZongKai LI (lzklibj)
15:27:36 <Swami_> #link https://review.openstack.org/#/c/246855/
15:27:55 <Swami_> This patch is under review, please review if not reviewed.
15:28:52 <Swami_> #link https://bugs.launchpad.net/neutron/+bug/1522824
15:28:52 <openstack> Launchpad bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] - Assigned to Oleg Bondarev (obondarev)
15:29:01 <Swami_> This is related to the gate test failure.
15:29:01 <carl_baldwin> I need to catch up on this.  Looks like still a wip
15:29:16 <carl_baldwin> ... About the previous bug
15:29:21 <Swami_> carl_baldwin: yes seems like it.
15:29:34 <obondarev> 1522824 was already discussed, wasn't it?
15:29:41 <Swami_> lizk: are you still here.
15:29:41 <haleyb> yes, it was first
15:29:50 <lizk> yes, I'm here
15:30:25 <Swami_> lizk: are you still working on 1522824
15:30:38 <lizk> it should be under review now, but failed for gate-grenade-dsvm-neutron test
15:30:43 <Swami_> #undo
15:30:45 <openstack> Removing item from minutes: <ircmeeting.items.Link object at 0x9f90d90>
15:30:49 <lizk> yes, I'm working on that
15:31:07 <Swami_> lizk: ok, just ping us in the channel once you are ready for review.
15:31:22 <lizk> ok
15:31:41 <Swami_> obondarev: yes we have already reviewed the bug 1522824, my mistake.
15:31:41 <openstack> bug 1522824 in neutron "DVR multinode job: test_shelve_instance failure due to SSHTimeout" [High,In progress] https://launchpad.net/bugs/1522824 - Assigned to Oleg Bondarev (obondarev)
15:32:58 <Swami_> #topic Gate-Test-Failures
15:33:19 <Swami_> Are there any new failures seen in the gate recently with respect to DVR?
15:33:39 <haleyb> Swami_: we should discuss the two infra reviews as well
15:33:47 <regXboi> I've not seen anything new
15:33:57 <Swami_> regXboi: thanks
15:33:58 <regXboi> and +1 on the infra discussion
15:34:08 <Swami_> haleyb: what are the two infra ones.
15:34:16 <Swami_> haleyb: do you have the links
15:34:24 <haleyb> the dvr job does seem higher than neutron-full fwiw
15:34:53 <haleyb> https://review.openstack.org/#/c/255325/ -make dvr job voting again
15:35:05 <haleyb> which had some good comments from kyle, doug and swami
15:35:19 <Swami_> haleyb: the single voting job was pushed last week based on our agreement.
15:35:41 <Swami_> But there were a couple of comments on that patch.
15:35:54 <Swami_> if we need to go for voting on single node job or multinode job?
15:36:15 <Swami_> What is the reason that we need to go in for single node job?
15:36:15 <regXboi> well, ideally, we want multinode to be voting and single node to go away
15:36:37 <haleyb> The comment on the single-node was to change it to be a check only job
15:36:40 <regXboi> but multinode (of both types) are rather ill
15:36:57 <regXboi> and I agree to that idea - single node can be check only
15:37:00 <Swami_> regXboi: yes, but we don't want to wait until the multinode job gets stable; if the single node job is stable, then we should vote and then start working on multinode to prevent further regression.
15:37:19 <regXboi> honestly, having single node vote in the gate isn't really useful, is it?
15:37:22 <obondarev> both multinode jobs are affected by some block live migration failure
15:37:31 <regXboi> obondarev: ack
15:37:33 <Swami_> haleyb: regXboi: what is the difference between having it in check versus the other.
15:37:38 <obondarev> not related to dvr I guess
15:37:50 <obondarev> and not sure if related to neutron
15:37:58 <regXboi> so, we *had* single node voting before, remember?
15:38:01 <Swami_> regXboi: why do you think so that voting single node is not useful.
15:38:11 <regXboi> and it led to regressions because it doesn't really *test* anything
15:38:20 <regXboi> DVR without multinode isn't really DVR
15:39:01 <obondarev> +1
15:39:06 <regXboi> and dougwig + mestery (and armax I expect) are pointing that out
15:39:11 <regXboi> and after thinking about it, they are correct
15:39:37 <Swami_> regXboi: but do we need to delay the single node voting because of multinode that is my point.
15:40:02 <Swami_> once multinode gets voting we can remove the single node job.
15:40:08 <regXboi> the push back is "making single node voting doesn't mean anything because it isn't really testing DVR"
15:40:18 <regXboi> and I can't argue with that
15:40:31 <regXboi> in fact, I'd argue that we might as well just remove singlenode DVR completely
15:40:42 <haleyb> right, and we want to reduce the number of rechecks in the gate
15:40:46 <regXboi> as a non voting job it doesn't do anything
15:40:47 <Swami_> obondarev: as you pointed out in multinode the live migration is a blocker right now, until we fix that bug.
15:40:53 <regXboi> as a voting job it doesn't do anything
15:40:58 <regXboi> so why is it there?
15:41:10 <Swami_> regXboi: were things different when we discussed this last week.
15:41:36 <regXboi> Swami_: yes, I had forgotten a bit of history that I went and boned up on after the comment stream
15:41:56 <Swami_> regXboi: thanks
15:42:28 <Swami_> regXboi: first I would say that the tempest tests were not specifically written for multinode jobs.
15:43:02 <haleyb> Swami_: and as you noticed, the dvr job is already in the check queue, it's just not voting
15:43:21 <Swami_> haleyb: yes I see that.
15:43:26 <regXboi> Swami_: doesn't that sorta translate into "the tempest tests were not specifically written for DVR?"
15:43:57 <Swami_> regXboi: not only DVR any multinode scenario is not well handled for the multinode case.
15:44:26 <Swami_> regXboi: yes we need to fix all the tests before attempting to make the multinode voting, that is my argument.
15:44:34 <regXboi> Swami_: granted, but I don't see why that's a reason to (a) make the single node dvr job voting or (b) keep the single node job around
15:44:35 <regXboi> whoa
15:44:47 <regXboi> I'm not talking about making the multinode job voting now
15:44:53 <Swami_> regXboi: haleyb: agreed
15:45:01 <regXboi> that's the end goal, yes, but we are nowhere *close* to that one
15:45:20 <Swami_> So the agreement here by the team is not to make single node job vote and move forward with voting the multinode job.
15:45:33 <carl_baldwin> If the single now job isn't testing anything more than the non dvr one, why has it been falling more?
15:45:35 <Swami_> #agree
15:45:40 <regXboi> uh... not exactly
15:46:00 <regXboi> carl_baldwin: I didn't quite parse that
15:46:01 <haleyb> right, has the single-node job caught anything?
15:46:01 <Swami_> carl_baldwin: I like your question.
15:46:18 <regXboi> I'll ask the question this way
15:46:28 <haleyb> although i guess since it's not voting the net is always empty
15:46:37 <regXboi> (1) why should the single node DVR job be voting?
15:46:54 <regXboi> (2) if there is no good answer to #1, why does the single node DVR job exist?
15:47:16 <Swami_> regXboi: I think you have not answered for carl_baldwin question above.
15:47:18 <regXboi> and I've not heard a reason for #1 that holds up yet
15:47:31 <regXboi> Swami_: I didn't *parse* carl_baldwin's question above
15:47:42 <carl_baldwin> Yes, the single now job has caught problems with dvr.
15:48:01 <Swami_> s/now/node
15:48:07 <carl_baldwin> Swami_ knows this
15:48:22 <regXboi> carl_baldwin: recently?
15:48:38 <Swami_> carl_baldwin: I agree, the only one thing that the single node did not catch is the live migration that involves two or more nodes.
15:49:02 <carl_baldwin> regXboi: does it matter when?
15:49:11 <regXboi> carl_baldwin: yes actually it does
15:49:34 <Swami_> regXboi: if the answer is Yes, then why should we remove the job
15:49:46 <carl_baldwin> regXboi: why?
15:49:58 <regXboi> if the answer is "yes, recently" than keeping the job makes sense
15:50:12 <regXboi> if the answer is "yes, but not recently" then having the multi-node job covers it
15:50:36 <regXboi> in other words: "what is the single node job testing that the multi node job isn't"
15:51:14 <Swami_> regXboi: as I mentioned above, the additional tests that the multinode job is testing is the nova live migration that is only turned on when there are more than one nodes.
15:51:36 <regXboi> Swami_: that's not the answer I think you want to give me
15:51:51 <Swami_> regXboi: what do you expect?
15:51:54 <regXboi> If the set of tests from multinode is a superset of the tests of single node then why does single node exist
15:52:12 <Swami_> regXboi: it is a stepping stone.
15:52:26 <haleyb> regXboi: well, it is testing the namespace manager code, as that aspect differs
15:52:56 <carl_baldwin> regXboi: I haven't seen that the multinode job is anywhere near working well.  Why not have the single node, which is close, in the mean time?
15:53:28 <regXboi> carl_baldwin: I believe these are the arguments that will need to be made to the infra folks
15:53:41 <Swami_> carl_baldwin: that is exactly what I am thinking. Until the multinode job is stable and ready let us depend on the single node.
15:53:51 <carl_baldwin> Someone has said that the single node test doesn't test *anything*.  That is bogus.
15:54:02 <regXboi> I said that
15:54:14 <carl_baldwin> What they should be saying is that the single node test doesn't test all that might be testable in a multinode.
15:54:14 <regXboi> and except for haleyb's statement, I rather stand by it
15:54:17 <haleyb> carl_baldwin: fyi, see the discussion in https://review.openstack.org/#/c/255325/ from doug and kyle (and swami) if you haven't already
15:54:38 <carl_baldwin> But, in its current form, the multi-node test barely tests anything more and it is a lot more broken.
15:54:39 <haleyb> regXboi: right, packet flow via namespaces and OVS rules is different
15:54:49 <regXboi> haleyb's namespace manager statement is the first thing I've seen that I think single node can hang its hat on
15:55:42 <Swami_> my thoughts.
15:55:58 <carl_baldwin> When multi-node is serving its purpose well, there is no need for single node.  I agree with you there.  But, for now, no one will pay attention to it for anything because it is broken.
15:56:15 <regXboi> carl_baldwin: I'll give you a +1 on that
15:56:17 <Swami_> If we don't make the single node job voting we will be constantly fixing bugs which are due to new updates.
15:56:34 <carl_baldwin> So, suggesting we remove single node, that it has no purpose, is premature.
15:57:23 <regXboi> carl_baldwin: ok, that's self consistent reasoning  for keeping it
15:57:23 <Swami_> carl_baldwin: agreed
15:57:31 <regXboi> but that's not enough to make it voting
15:58:06 <regXboi> or at least, I don't think it is
15:58:09 <regXboi> yet
15:58:36 <regXboi> now - haleyb's comments about namespace and packet flow I think *might be*
15:58:37 <haleyb> regXboi: this gets at "no one will notice" if a non-voting job fails, we don't want to slide backwards
15:58:38 <Swami_> regXboi: may be thinks will change next week or by 2016, if you think through.
15:59:00 <Swami_> s/thinks/things
15:59:00 <carl_baldwin> regXboi: That was only arguing to not get rid of the single node job not to be conflated with any arguments to make it voting.
15:59:00 <haleyb> we are almost out of time
15:59:09 <hichihara> Can I ask the dvr folks one last thing?
15:59:11 <regXboi> carl_baldwin: ack
15:59:17 <hichihara> I wonder if DVR folks will review a linuxbridge DVR spec https://review.openstack.org/#/c/255174/
15:59:23 <regXboi> carl_baldwin: apologies for conflation
15:59:30 <Swami_> hichihara: will od
15:59:36 <haleyb> carl_baldwin: so you don't think it should vote in the check job?
15:59:37 <Swami_> s/od/do
15:59:48 <hichihara> Swami_: Thanks :)
16:00:06 <Swami_> we are at the top of the hour
16:00:13 <haleyb> we will have to keep talking in #neutron as out of time
16:00:19 <Swami_> we can discuss it in IRC or on the voting job patch.
16:00:30 <haleyb> have a happy new year everyone, and thanks for the great work
16:00:32 <haleyb> #endmeeting