15:00:51 <haleyb> #startmeeting neutron_dvr
15:00:52 <openstack> Meeting started Wed Mar 30 15:00:51 2016 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:54 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:56 <openstack> The meeting name has been set to 'neutron_dvr'
15:01:07 <haleyb> #chair Swami
15:01:07 <openstack> Current chairs: Swami haleyb
15:01:52 <haleyb> #topic Announcements
15:02:09 <haleyb> Mitaka milestone RC2 released, should be final
15:02:19 <Swami> Good
15:02:42 <haleyb> so unless there's something critical we're looking ahead, or to stable
15:03:13 <haleyb> #topic Bugs
15:03:22 <Swami> haleyb: thanks
15:03:32 <Swami> This week we have a couple of new bugs
15:03:54 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1526855
15:03:54 <openstack> Launchpad bug 1526855 in neutron "VMs fail to get metadata in large scale environments" [Undecided,Confirmed]
15:04:27 <Swami> This one was filed already but has been re-opened since it was not fixed.
15:05:02 <Swami> At present there is no patch to address this issue, since it is very difficult right now to identify where the problem is with respect to metadata.
15:05:38 <haleyb> is some of it still that we're not starting the proxy quickly enough?
15:06:21 <Swami> haleyb: I think that is the symptom; it is delayed for a while even at larger scale when we increase the wait time, and we are still seeing a bunch of failures.
15:07:32 <haleyb> any thoughts on new solutions?
15:07:55 <Swami> We need to see why the metadata proxy is getting delayed. Maybe for dvr routers we should be starting it earlier, or we should add some delay in the code so we don't proceed before the metadata is received.
15:08:41 <Swami> It would be great if there were some status update about whether the metadata fetch happened or not.
15:09:11 <haleyb> part of the problem is that nova will proceed regardless of our state.  That patch was rejected
15:09:25 <haleyb> https://review.openstack.org/#/c/181674/
15:10:16 <Swami> haleyb: yes we should have some sort of mechanism to co-ordinate, otherwise it would be very difficult to fix these issues.
15:11:02 <Swami> probably in newton we should work on these nova/neutron interactions.
15:11:04 <haleyb> Swami: guess we can only focus on the l3-agent for now
15:12:26 <Swami> haleyb: is there any flag or info that we can set in the router_namespace for metadata proxy enablement, and can the l3 agent use it for further configuration?
15:13:28 <haleyb> i don't exactly understand
15:13:38 <Swami> We can discuss the options offline.
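A minimal guest-side sketch of the symptom under discussion (illustrative only, not neutron code; the retry and timeout values are arbitrary assumptions, and only the 169.254.169.254 endpoint and the /openstack/latest/meta_data.json path are standard): the instance cannot fetch metadata until the proxy is listening in its router's namespace.

    import time
    import urllib.request

    # Well-known metadata endpoint and standard OpenStack metadata path;
    # everything else in this probe is an assumption made for illustration.
    METADATA_URL = 'http://169.254.169.254/openstack/latest/meta_data.json'

    def wait_for_metadata(retries=10, delay=5, timeout=3):
        """Poll the metadata service until it answers or we give up."""
        for attempt in range(1, retries + 1):
            try:
                with urllib.request.urlopen(METADATA_URL, timeout=timeout) as resp:
                    return resp.read()
            except OSError as exc:
                # Until the proxy is up in the router namespace this raises
                # (connection refused or timeout), which is the failure mode
                # bug 1526855 reports at scale.
                print('attempt %d failed: %s' % (attempt, exc))
                time.sleep(delay)
        raise RuntimeError('metadata not reachable after %d attempts' % retries)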
15:13:55 <haleyb> ok, continue with the new bugs
15:14:06 <obondarev> hi, sorry for being late
15:14:18 <Swami> obondarev: hi
15:14:26 <Swami> The next one in the list is
15:14:32 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1562110
15:14:32 <openstack> Launchpad bug 1562110 in neutron "link-local-address allocator for DVR has a limit of 256 address pairs per node" [Undecided,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:14:58 <Swami> here is the patch #link https://review.openstack.org/#/c/297839/
15:15:48 <Swami> Just a quick question on this patch: I got a comment from Assaf that he is not comfortable with providing a config option for configuring this link_local_ip address cidr.
15:16:08 <Swami> What do you guys think? Should we just hardcode it and document the impact?
15:16:21 <haleyb> Swami: i know you got a comment on a new config option, is it useful to explore making this a DB entry?  some new extension?
15:16:30 <haleyb> or we just increase as large as possible
15:17:10 <Swami> haleyb: The reason I went for the config option is that it is similar to the one used in HA routers for the internal network.
15:17:35 <Swami> haleyb: at present it seems that the best option is to increase it as large as possible and document it.
15:18:00 <Swami> haleyb: maybe a /16 range would work for now.
15:18:43 <haleyb> so 169.254.0.0/16 ?
15:18:53 <Swami> haleyb: yes
15:19:54 <haleyb> that doesn't conflict with any metadata setting?
15:20:12 <Swami> haleyb: I think so.
15:20:36 <haleyb> keepalived might not like it
15:20:58 <haleyb> we can discuss in the patch, might be a /17
15:21:15 <Swami> yes keepalived has 169.254.192.0/18
15:21:37 <Swami> haleyb: let us take the discussion in the patch.
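For context on the ranges mentioned above, a short back-of-the-envelope sketch using only the standard ipaddress module; the prefixes are the ones quoted in the discussion, not values read out of neutron's code:

    import ipaddress

    dvr_range = ipaddress.ip_network('169.254.0.0/16')    # range floated above
    ha_range = ipaddress.ip_network('169.254.192.0/18')   # used by keepalived for HA routers

    # A /16 holds 2**(31-16) = 32768 /31 address pairs, so the current
    # 256-pair-per-node limit would effectively disappear.
    print(dvr_range.num_addresses // 2)                   # 32768

    # The full /16 overlaps keepalived's block, which is why a smaller
    # prefix such as a /17 came up in the discussion.
    print(dvr_range.overlaps(ha_range))                   # True
    print(sorted(dvr_range.address_exclude(ha_range)))
    # [IPv4Network('169.254.0.0/17'), IPv4Network('169.254.128.0/18')]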
15:22:07 <Swami> The next one in the list is
15:22:11 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1563879
15:22:11 <openstack> Launchpad bug 1563879 in neutron "[RFE] DVR should route packets to Instances behind the L2 Gateway" [Undecided,New]
15:22:33 <Swami> This is an RFE that we should look into for the newton release.
15:22:53 <Swami> Right now I have posted it as an RFE, but if we need a blueprint we can add one.
15:23:38 <Swami> Do you have any questions on this RFE?
15:24:02 <haleyb> no questions, agree it should be done
15:24:16 <Swami> The next one in the list is also an RFE.
15:24:36 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1557290
15:24:36 <openstack> Launchpad bug 1557290 in neutron "[RFE]DVR FIP agent gateway does not pass traffic directed at fixed IP" [Wishlist,Confirmed]
15:25:22 <Swami> This is again about providing north-south connectivity with DVR and BGP.
15:26:04 <Swami> haleyb: Do you think we also need an RFE to support IPv6 north-south, or shall we include that as part of this RFE?
15:27:03 <haleyb> i think we can include it, even though it is different wrt prefix delegation, etc
15:27:37 <Swami> haleyb: ok
15:28:23 <Swami> That's all on the new bugs list
15:29:23 <Swami> haleyb: This bug has been around for a while,
15:29:27 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1414559
15:29:27 <openstack> Launchpad bug 1414559 in neutron "OVS drops RARP packets by QEMU upon live-migration - VM temporarily disconnected" [Medium,In progress] - Assigned to Oleg Bondarev (obondarev)
15:29:42 <Swami> obondarev has patches for it, but we should review them.
15:30:12 <Swami> #link https://review.openstack.org/#/c/246898/ - patch 1
15:30:30 <Swami> #link https://review.openstack.org/#/c/246910/ - patch 2
15:31:25 <haleyb> i had reviewed the neutron one, but not nova.  it actually needs a rebase now
15:31:37 <Swami> haleyb: ok
15:31:43 <obondarev> just rebased
15:31:50 <Swami> obondarev: thanks
15:31:59 <haleyb> and https://review.openstack.org/#/c/281137/ as well for review?
15:32:26 <obondarev> yep
15:32:36 <obondarev> I was asked to split it into 2 patches
15:32:40 <Swami> obondarev: will review it.
15:32:47 <obondarev> Swami: thanks
15:32:52 <haleyb> me too
15:33:23 <haleyb> obondarev: one nova patch needs rebase too
15:33:31 <Swami> We have also pushed in a doc patch for the DVR SNAT HA
15:33:33 <obondarev> haleyb: yeah
15:33:37 <Swami> #link https://review.openstack.org/#/c/296836/
15:34:19 <Swami> #link https://review.openstack.org/#/c/296711/
15:34:48 <haleyb> will review
15:34:48 <Swami> These two document patches are up for review, so if you get a chance please review them.
15:35:08 <Swami> That's all I had for bugs.
15:35:30 <Swami> I had one question; maybe we can discuss it in Open Discussion
15:35:50 <haleyb> sure
15:36:28 <haleyb> #topic Gate Failures
15:36:57 <Swami> The multinode failure rate has gone down since last week.
15:37:10 <Swami> But I did see that it is climbing back up right now.
15:37:39 <haleyb> The check queue failures have gone down, but I still need to go through things to determine cause
15:38:19 <Swami> haleyb: we should probably bring down the multinode failures to be on par with the single node.
15:38:29 <haleyb> Could just be bad patches, right :)
15:38:43 <Swami> haleyb: that's my assumption at this point.
15:39:13 <haleyb> Swami: yes, it will need dedicated resource(s) to work on it
15:39:30 <obondarev> I saw a couple of failures caused by some infra issues (tests didn't even run)
15:40:03 <Swami> obondarev: thanks for the information.
15:40:10 <Swami> haleyb: yes I agree with you on this.
15:40:33 <Swami> haleyb: maybe the time between now and the summit would be the right time to focus on getting multinode stabilized.
15:40:37 <haleyb> yeah, and i've seen repo clone failures, but there's still some difference
15:41:25 <haleyb> Swami: but i added a topic this morning, which has been consuming my time at least...
15:42:06 <Swami> haleyb: I will try to spend some time on gate failures for multinode.
15:43:00 <haleyb> I think if we just focus on one failure mode at a time (for now) we can make progress, as these are not always easy to diagnose
15:43:54 <Swami> haleyb: makes sense.
15:44:05 <haleyb> #topic Stable backports
15:44:24 <haleyb> Ihar created https://etherpad.openstack.org/p/stable-bug-candidates-from-master to track Mitaka changes for backport
15:44:49 <haleyb> i've been going through it and manually cherry-picking things, along with help from Swami
15:45:45 <haleyb> I need to update with more review links, but it's getting there
15:46:03 <Swami> haleyb: thanks for the link
15:47:29 <haleyb> there are both l3 and dvr sections, hopefully we can get all reviews up and passing
15:47:30 <obondarev> haleyb: nice
15:47:50 <Swami> haleyb: It is a good idea to maintain an etherpad to track these.
15:48:54 <haleyb> Swami: and merging those fip race patches will help since it makes cherry-picking easier, and means fewer rechecks since it's a common failure
15:49:33 <Swami> haleyb: yes, I will take a look at it
15:50:08 <haleyb> obondarev: btw, i couldn't add you to one of the stable reviews today; there are 3 of you in gerrit and they all failed with some strange error
15:50:35 <obondarev> haleyb: yeah, just try "obondarev"
15:51:14 <obondarev> haleyb: I'm not sure what's wrong, my personal data in gerrit looks ok
15:51:34 <Swami> obondarev: I have had the same problem adding you with your email address.
15:52:18 <obondarev> right, I know, it is just "obondarev" not email
15:52:27 <haleyb> unless it's my browser, "Oleg Bondarev <obondarev@mirantis.com> does not identify a registered user or group"
15:52:35 <Swami> obondarev: got it.
15:53:00 <carl_baldwin> Have you asked infra about that?  I had the same problem a while back.
15:53:28 <obondarev> I should write to infra
15:53:37 <haleyb> no, hadn't yet.  firefox is always auto-completing so i can't just use your name
15:53:54 <haleyb> another reason to revert gerrit :)
15:54:13 <obondarev> the issue was with old gerrit as well :(
15:54:39 <haleyb> obondarev: https://review.openstack.org/#/c/295987/ is the review
15:55:32 <haleyb> #topic Open Discussion
15:55:38 <haleyb> Swami: you mentioned something
15:55:51 <Swami> haleyb: carl_baldwin: I have a question with respect to address-scopes and DVR
15:56:02 <obondarev> haleyb: what's the review you wanted to add me?
15:56:40 <Swami> If we try to add two subnets belonging to two different address scopes to a router, the router does not route between them (east-west). Is that valid behavior or a known limitation?
15:56:48 <haleyb> obondarev:  https://review.openstack.org/#/c/295987/ it's just a stable backport you had +2'd in master
15:57:46 <carl_baldwin> Swami: known valid behavior.
15:57:54 <obondarev> haleyb: thanks, going to ask infra for help
15:58:28 <Swami> carl_baldwin: Do we need to document this or throw an error message when we try to add them, since it does not route?
15:58:44 <carl_baldwin> Documentation is up for review.
15:58:45 <Swami> carl_baldwin: Or is it well documented already?
15:59:00 <Swami> carl_baldwin: thanks that answers my question.
15:59:09 <carl_baldwin> Swami: You're welcome.
15:59:24 <carl_baldwin> Swami: https://review.openstack.org/286294
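To make the address-scope rule concrete, a conceptual sketch of the behavior confirmed above; this is not the neutron implementation, just the routing rule it enforces, with scope identifiers represented as plain strings and None standing for a subnet whose pool is not associated with any address scope:

    def routes_east_west(scope_a, scope_b):
        # A router only forwards east-west traffic between two of its
        # interfaces when both subnets resolve to the same address scope;
        # unscoped subnets (both None) keep the traditional behavior.
        return scope_a == scope_b

    assert routes_east_west(None, None)                # classic routing, no scopes
    assert routes_east_west('scope-1', 'scope-1')      # same scope: routed
    assert not routes_east_west('scope-1', 'scope-2')  # the case Swami describes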
15:59:49 <haleyb> anything else in the last seconds?
15:59:58 <Swami> that's all I had.
16:00:15 <haleyb> thanks everyone, time is up
16:00:17 <haleyb> #endmeeting