15:03:03 <mlavalle> #startmeeting neutron_l3
15:03:03 <openstack> Meeting started Thu Sep 24 15:03:03 2015 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:04 <carl_baldwin> SergeyLukjanov: Thanks!
15:03:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:07 <openstack> The meeting name has been set to 'neutron_l3'
15:03:14 <mlavalle> SergeyLukjanov: thanks!
15:03:18 <mlavalle> hi
15:03:21 * regXboi still notes having multiple chairs is a good idea - cough, cough
15:03:28 <regXboi> morning all
15:03:40 <mlavalle> #topic Announcements
15:04:05 <mlavalle> so, we cut RC1 yesterday.... lots of last-minute activity, at least for me
15:04:13 <regXboi> mlavalle: yay!
15:04:16 <adduarte> hi
15:04:27 <carl_baldwin> That also means that Mitaka is open!
15:04:39 <mlavalle> ++
15:04:40 <regXboi> carl_baldwin: +1
15:04:58 <mlavalle> also we should know who is the new PTL soon
15:05:05 <regXboi> that will be tomorrow
15:05:15 <regXboi> if you haven't voted - go do so :)
15:05:25 <mlavalle> yes, please vote.....
15:05:38 <regXboi> note: I don't care who you vote for :) just vote
15:06:00 <mlavalle> any other announcements?
15:06:09 <carl_baldwin> Remember the last vote was won my one single vote.
15:06:35 <carl_baldwin> s/my/by/
15:07:03 <mlavalle> that's true, so if you haven't voted, you might elect the ptl
15:07:41 <mlavalle> ok, moving along.....
15:07:48 <mlavalle> #topic Bugs
15:08:01 <mlavalle> so pinging myself....
15:08:21 <mlavalle> the good news today is that we worked through our critical bugs last week
15:08:36 <mlavalle> haven't shown up in at least 7 days
15:09:04 <mlavalle> we also worked through a lot of high importance bugs, so today I have two to highlight
15:09:12 <carl_baldwin> Yay!
15:09:24 <mlavalle> https://bugs.launchpad.net/neutron/+bug/1365473
15:09:25 <openstack> Launchpad bug 1365473 in neutron "Unable to create a router that's both HA and distributed" [High,In progress] - Assigned to Assaf Muller (amuller)
15:09:48 <mlavalle> I noticed adduarte is in the meeting and he has been involved with this one.... any comments?
15:10:08 <adduarte> assaf was working on it; it is in review
15:10:44 <adduarte> testing shows us in good shape. but it does need code review
15:10:45 <carl_baldwin> I talked with amuller yesterday about this.  We’re still hoping to merge this before Liberty final.  I’m going to mark it an RC2 candidate.
15:11:05 <regXboi> carl_baldwin: sounds good
15:11:12 <mlavalle> yeah, it got assigned to amuller 17 hours ago.....
15:11:14 <Swami> carl_baldwin:+1
15:11:19 <adduarte> and retesting after any new changes from assaf
15:11:37 <adduarte> about an hour of testing
15:12:51 <carl_baldwin> adduarte: Thanks for all of your work here.
15:13:08 <mlavalle> ok, next up is https://bugs.launchpad.net/neutron/+bug/1494351
15:13:09 <openstack> Launchpad bug 1494351 in neutron "Observed StaleDataError in gate-neutron-dsvm-api tests if reference IPAM driver is used" [High,In progress] - Assigned to Pavel Bondar (pasha117)
15:13:31 <mlavalle> pavel_bondar was working on it.... any updates?
15:13:36 <regXboi> mlavalle: fyi - I have two others to throw in the hopper when you are done
15:13:38 <pavel_bondar> unfortunately I did not have free cycles to work on that during the last week
15:13:54 <pavel_bondar> it is pretty clear now what the fix should be
15:14:06 <pavel_bondar> but I need about 2 free days to implement and test
15:14:17 * mlavalle put the bugs regXboi is talking about in the agenda....
15:14:27 * regXboi goes and looks :)
15:14:27 * mlavalle maybe?
15:15:07 <mlavalle> pavel_bondar: that's cool, thanks for the update. Any help needed?
15:15:22 * regXboi no - these are ones I'm just seeing now :( - I think carl_baldwin bumped them up and I forgot to add them - apologies
15:15:53 <carl_baldwin> regXboi: Sorry
15:16:07 <pavel_bondar> mlavalle: it is clear what to do, so I think I am ok here
15:16:09 <regXboi> carl_baldwin: my fault, I should have added them
15:16:28 <mlavalle> pavel_bondar: thanks... moving on
15:16:39 <regXboi> mlavalle: the first is https://bugs.launchpad.net/neutron/+bug/1486795
15:16:40 <openstack> Launchpad bug 1486795 in neutron "DVR: create or update port by using notify specific host rather than fanout" [High,In progress] - Assigned to shihanzhang (shihanzhang)
15:17:07 <regXboi> this got bumped up by carl_baldwin to High on 9/18 - there is a request in the bug for a patch set review
15:17:08 <carl_baldwin> pavel_bondar: I’ll get to that one today.  It had dropped off my radar.  :(
15:17:32 <regXboi> (  https://review.openstack.org/221209 ) so maybe we can make this a RC-2 target as well?
15:17:33 * regXboi hopes
15:17:41 <mlavalle> carl_baldwin: you reviewed it on 9/21. we are actually waiting for the next revision
15:18:05 <carl_baldwin> regXboi: Possibly.  It needs work and I wouldn’t call it a release blocker.
15:18:19 <regXboi> carl_baldwin: looking at that patch set, it needs more work :(
15:18:32 <regXboi> carl_baldwin, so I don't think it's RC-2 ready now
15:18:40 <mlavalle> regXboi: yes, carl_baldwin indicated that the code needs cleanup
15:19:23 <regXboi> the last one is https://bugs.launchpad.net/neutron/+bug/1486828 and I don't see a patch set (which means I think launchpad missed it)
15:19:24 <openstack> Launchpad bug 1486828 in neutron "L3: Notify specific agent rather than fanout when associating floatingip" [High,In progress] - Assigned to changzhi (changzhi)
15:19:50 <mlavalle> regXboi: we were actually talking about the same 2 bugs. both in the agenda
15:20:04 <regXboi> mlavalle: ok - cool :)
15:20:13 <mlavalle> I got your back man :-)
15:20:29 <regXboi> good, because I'd have lost my head this week if it wasn't attached to my shoulders :)
15:20:51 <carl_baldwin> regXboi: There is a patch.  Let me find...
15:21:15 <regXboi> carl_baldwin: yes, I remember seeing it - didn't realize that launchpad had missed it
15:21:16 <carl_baldwin> #link https://review.openstack.org/#/c/215136/
15:21:45 <regXboi> thanks - I've updated launchpad
15:21:49 <mlavalle> carl_baldwin: I'll update the bug with the patchset
15:21:56 <carl_baldwin> mlavalle: Thanks
15:21:57 <regXboi> mlavalle: I got your back this time :)
15:22:02 <mlavalle> :-)
15:22:31 <regXboi> yeah this one had me worried
15:22:44 <regXboi> the test failures didn't look like I could just call them "unrelated"
15:23:23 <pavel_bondar> one question: is https://bugs.launchpad.net/neutron/+bug/1494351 required for Liberty?
15:23:24 <openstack> Launchpad bug 1494351 in neutron "Observed StaleDataError in gate-neutron-dsvm-api tests if reference IPAM driver is used" [High,In progress] - Assigned to Pavel Bondar (pasha117)
15:24:10 <carl_baldwin> pavel_bondar: It’d be nice but I wouldn’t hold up the release for it.
15:24:18 <neiljerram_bb> Hi, sorry to come in late.
15:24:48 <pavel_bondar> carl_baldwin: ok, then I'll try to get some time to fix it sooner
15:25:31 <mlavalle> the other good news is that we worked through 2 medium importance bugs that were marked for RC-1 by regXboi.... so, all in all, great teamwork
15:25:42 <carl_baldwin> +1
15:25:52 <regXboi> yes - thanks to everybody that pitched in once they were found :)
15:26:19 <mlavalle> any other bugs to discuss?
15:26:47 <mlavalle> ok, moving on......
15:26:48 <regXboi> mlavalle: I think we are good for now - the backlog is up a bit, and I need to spend some time with it for next week
15:27:09 <mlavalle> #topic Router Networks
15:27:40 <mlavalle> carl_baldwin: I think this is the spec you want to discuss https://review.openstack.org/#/c/225384/
15:27:59 <mlavalle> I put it in the agenda
15:28:11 <carl_baldwin> mlavalle: Yes.  I wanted to point it out.  It is collecting feedback.
15:28:58 <carl_baldwin> It introduces two very poorly named new entities to help model L3 networking.
15:29:31 <carl_baldwin> I think it is probably best to just take the discussion to the review.  Maybe we’ll have more to discuss next week.
15:29:47 <neiljerram_bb> I will review soon, hopefully tomorrow.
15:30:10 <mlavalle> Please take a look and pitch in with feedback
15:30:24 <mlavalle> moving on.....
15:30:32 <mlavalle> #topic DVR
15:30:42 <Swami> mlavalle: hi
15:30:47 <mlavalle> Swami: the floor is all yours
15:31:04 <Swami> I am working on fixing some error logs that I have been seeing in the l3-agent logs.
15:31:11 <carl_baldwin> regXboi: and haleyb: too
15:31:23 <Swami> I have also added a couple of bugs related to those fixes.
15:31:40 <mlavalle> carl_baldwin: thanks for reminding me
15:31:46 <Swami> #link https://review.openstack.org/#/c/225319/
15:32:06 <Swami> #link https://review.openstack.org/#/c/225514/
15:32:27 <Swami> #link https://review.openstack.org/#/c/225523/
15:32:46 <Swami> #link https://review.openstack.org/#/c/227008/
15:33:18 <haleyb> I have questions about that last one, but will put them in the review
15:33:27 <Swami> There was one other patch I have submitted to revert it. This one I had a discussion with carl_baldwin yesterday regarding the static routes being added to snat_namespace instead of the qr-namespace
15:33:32 <regXboi> Swami: can you speculate how many of those might be causing DVR jobs failures in the pipelines?
15:33:44 <Swami> #link https://review.openstack.org/#/c/227045/
15:34:18 <Swami> regXboi: As of today they are causing the DVR jobs to fail, but the intermittent failures may be due to these issues.
15:34:49 <Swami> s/they are/they are not
15:35:13 <regXboi> Swami: are there logstash queries that we can use to check the pipelines?
15:35:41 <Swami> regXboi: I looked at the logtrace from my patches for any traces that have happened.
15:36:03 <Swami> regXboi: I don't have a logstash query at this time.
15:36:23 <Swami> haleyb: thanks
15:36:31 <regXboi> ok, I ask because the dvr jobs are running at between 25 and 50% failure rates for the last week :(
15:36:43 <regXboi> and that's not good
15:36:54 <Swami> regXboi: last two days on 20 and 21 the DVR job failure was high.
15:37:28 <Swami> regXboi: I was not sure what was causing that spike for the last two days, it came down yesterday.
15:37:32 <haleyb> Swami: I guess my main concern with some of the log message "squashing" is: is there a bug underneath?  Hiding the message just makes it worse
15:37:37 <Swami> Does anyone know what caused that spike?
15:38:06 <regXboi> haleyb: +1 squashing messages without fixing bugs is not the way to go
15:38:23 <Swami> haleyb: The issue that I am seeing with these log messages is that either the port or the namespace is concurrently deleted while another operation is trying to add the data.
15:38:27 <regXboi> Swami: I do not know - that's why I was looking for logstash queries to check to see if they were related
15:38:29 <haleyb> for example, if we get a "namespace does not exist" error was it supposed to be there?
15:38:42 <Swami> regXboi: haleyb: I agree.
15:39:01 <Swami> haleyb: This is the reason for the patch revert that I included.
15:39:48 <haleyb> Swami: yes, concurrency with DVR is a bit tricky
15:39:48 <Swami> haleyb: it is trying to add a static route to the snat-namespace, when the snat-namespace has already been deleted or does not exist.
15:39:54 <regXboi> on the DVR job failures, I need to finish this last O(n) perf issue that I'm chasing and then I was going to go at the problem from the FIP side
15:40:30 <Swami> regXboi: Most of the inconsistent DVR job failures are due to the "ssh timeout" or not being able to reach the instance via the FloatingIP.
15:40:31 <regXboi> to see *why* the FIPs aren't available as those are *most* of what I've seen when DVR fails
15:40:40 <regXboi> Swami: exactly
15:40:59 <haleyb> Swami: on the SNAT namespace, that gets back at the conversation you and carl_baldwin had yesterday regarding do we need to add routes there, what was the conclusion there?
15:41:06 <Swami> regXboi: The best option is that we might have to add some test cases to see if the packet can reach from one namespace to another namespace.
15:41:36 <regXboi> Swami: I think we may have some additional options to follow
15:42:51 <carl_baldwin> I still need to come to my own conclusion.  At first I thought it makes sense for all of the dvr router namespaces to have essentially the same routing tables.  But, the static routes may not be used in the snat namespace.
15:43:33 <Swami_> sorry got disconnected.
15:44:14 <Swami_> regXboi: haleyb: what would be the best option to handle the concurrency problem with the DVR?
15:45:08 <mlavalle> regXboi, haleyb maybe we need to continue this discussion in the Neutron channel?
15:45:18 <regXboi> mlavalle: +1 - let's move on
15:45:28 <mlavalle> ok.....
15:45:33 <haleyb> yes....
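
The concern raised by haleyb and regXboi above (do not squash a "namespace does not exist" message unless the deletion was expected) can be sketched as follows. This is only an illustration under stated assumptions: FakeNamespaces, NamespaceGone and add_static_route are hypothetical names, the backend is an in-memory stand-in, and none of this is the Neutron l3-agent code or the content of the patches linked in this topic.

```python
import logging

LOG = logging.getLogger(__name__)


class NamespaceGone(Exception):
    """Raised by the hypothetical backend when a namespace does not exist."""


class FakeNamespaces:
    """In-memory stand-in for network namespaces (not a Neutron API)."""

    def __init__(self):
        # namespace name -> list of (destination, next_hop) routes
        self.routes = {}

    def create(self, name):
        self.routes[name] = []

    def delete(self, name):
        self.routes.pop(name, None)

    def add_route(self, name, destination, next_hop):
        if name not in self.routes:
            raise NamespaceGone(name)
        self.routes[name].append((destination, next_hop))


def add_static_route(backend, name, destination, next_hop, being_deleted):
    """Add a route; return True if added, False if skipped.

    A missing namespace is tolerated only when the caller already knows the
    router is being torn down. Otherwise the error is logged and re-raised,
    so an underlying bug is not hidden.
    """
    try:
        backend.add_route(name, destination, next_hop)
        return True
    except NamespaceGone:
        if being_deleted:
            LOG.debug("namespace %s already removed, skipping route", name)
            return False
        LOG.error("namespace %s is missing unexpectedly", name)
        raise
```

The design choice here is that the quiet path requires explicit knowledge of the deletion (the being_deleted flag, an assumption of this sketch); any other missing namespace still surfaces as an error.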
15:45:43 <mlavalle> #topic BGP dynamic routing
15:45:52 <tidwellr> hi
15:45:55 <Swami_> sorry got disconnected.
15:45:57 <mlavalle> hi
15:46:13 <mlavalle> Swami_: let's continue that conversation in the Neutron channel
15:46:24 <Swami_> mlavalle: sure
15:47:01 <mlavalle> tidwellr: floor is all yours.....
15:47:13 <tidwellr> just plugging along, the dynamic routing agent is in pretty solid shape
15:47:38 <tidwellr> feel free to review (let me get the links)
15:48:06 <tidwellr> https://review.openstack.org/#/c/207607/
15:48:11 <tidwellr> https://review.openstack.org/#/c/207625/
15:48:23 <tidwellr> https://review.openstack.org/#/c/207635/
15:48:34 <tidwellr> these are in good shape
15:49:07 <tidwellr> we still have the service plugin and API code to polish up https://review.openstack.org/#/c/201621/
15:49:14 <carl_baldwin> tidwellr: great!  Let’s get them back in to our merge queues.
15:49:39 <tidwellr> since Mitaka is now open, I need to do a rebase
15:49:58 <tidwellr> https://review.openstack.org/#/c/201621/ is still WIP
15:50:10 <carl_baldwin> tidwellr: I’d say only rebase if you hit a merge conflict.
15:50:30 <tidwellr> well, we've got migrations in the wrong folders and stuff like that
15:51:23 <tidwellr> I'm looking to break down https://review.openstack.org/#/c/201621/, it's rather large
15:51:48 <carl_baldwin> tidwellr: ok
15:51:49 <tidwellr> anyway, that's where we're at
15:52:04 <mlavalle> tidwellr: thanks for the update.... let's move on
15:52:12 <mlavalle> #topic DNS
15:52:57 <mlavalle> so regXboi filed a bug last Friday, where internal dns queries were causing poor performance in ports gets
15:53:21 <mlavalle> we were able to propose a fix for it, that made it in RC-1: https://review.openstack.org/#/c/226581/
15:53:26 <regXboi> :)
15:53:40 <mlavalle> essentially eliminated all the overhead db queries for subnets
15:53:54 <mlavalle> thanks regXboi, great catch
15:54:01 <regXboi> you are welcome
15:54:07 <carl_baldwin> mlavalle: Thanks for the quick response on that.
15:54:58 <mlavalle> as far as external DNS (https://review.openstack.org/#/c/212213/) we are at a point where the functionality for floating ips is working fine
15:55:53 <mlavalle> Here http://paste.openstack.org/show/471874/ I walk through the creation of a fip with dns_name and dns_domain and show the impact on the designate database and I can dig the A and PTR records
15:56:44 <carl_baldwin> mlavalle: nice!
15:56:44 <carl_baldwin> We’ve got a talk we need to get ready for
15:56:48 <mlavalle> I reviewed this with Kiall, mugsie and the rest of the Designate team, and I am going to polish it over the next 2 days. after that, we are good to show it in Tokyo
15:57:37 <mlavalle> and next week i'll move on to vm ports on external networks and will start bugging the infra team to get neutron + designate in the gate
15:57:56 <mlavalle> that's my update for today
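
For readers following the floating IP walkthrough above, the owner names one would expect to dig for the A and PTR records can be derived from dns_name, dns_domain and the address. A minimal sketch, assuming IPv4 or IPv6 addresses and a hypothetical helper name (expected_dns_records); it uses only the Python standard library and does not reflect how the Neutron or Designate code builds these names.

```python
import ipaddress


def expected_dns_records(dns_name, dns_domain, floating_ip):
    """Return (A record owner name, PTR record owner name), both fully qualified."""
    # Normalise so the result always ends in exactly one trailing dot.
    fqdn = "{}.{}.".format(dns_name.strip("."), dns_domain.strip("."))
    # reverse_pointer gives e.g. '5.2.0.192.in-addr.arpa' with no trailing dot.
    ptr_name = ipaddress.ip_address(floating_ip).reverse_pointer + "."
    return fqdn, ptr_name


# Example values are made up (192.0.2.0/24 is a documentation range).
a_name, ptr_name = expected_dns_records("my-vm", "example.org.", "192.0.2.5")
print(a_name, ptr_name)
```

The two names can then be checked by hand with dig against the DNS server in use.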
15:58:32 <mlavalle> #topic Open Discussion
15:58:45 <mlavalle> any other topics to bring up?
15:58:45 <regXboi> under the O(n) heading - I've got one more I'm chasing for router scheduling - ovs_add_port
16:00:01 <mlavalle> ok, guys, thanks for attending, time is over. And keep up the great work!
16:00:09 <mlavalle> #endmeeting