15:00:20 #startmeeting neutron_l3
15:00:21 o/
15:00:24 hi
15:00:25 Meeting started Thu Aug 4 15:00:20 2016 UTC and is due to finish in 60 minutes. The chair is tidwellr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:26 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:28 The meeting name has been set to 'neutron_l3'
15:00:42 #chair mlavalle carl_baldwin
15:00:42 Current chairs: carl_baldwin mlavalle tidwellr
15:01:19 o/
15:01:44 carl_baldwin is here, we do announcements! j/k
15:01:46 #topic Announcements
15:03:34 mid-cycle is coming 17th-19th
15:03:41 #link https://etherpad.openstack.org/p/newton-neutron-midcycle
15:04:32 anybody have a sense for what some of the hot topics might be?
15:05:12 There have been some ML posts. There are lots of topics.
15:05:39 Etherpad has a lot of info on it.
15:05:59 https://etherpad.openstack.org/p/newton-neutron-midcycle-workitems
15:06:00 I asked because I didn't see much other than travel info in the etherpad
15:06:10 tidwellr: ^^
15:06:11 ah, that's the one I was looking for
15:06:32 You're right. I might have been thinking of another page.
15:06:34 ah, haleyb beat me to it
15:06:47 and seeing the inside of pubs is a work item for some
15:06:49 Yep, that page.
15:07:22 The two should be cross-linked.
15:07:51 haleyb: you not going?
15:08:48 mlavalle: no, on vacation with family that week, although i'll try to be online for some
15:09:13 :-(
15:09:34 Enjoy the vacation though!
15:09:48 yes, i tried to have the vacation moved to IE
15:10:39 any more announcements?
15:11:35 alright, moving on
15:11:41 #topic Bugs
15:12:20 looking at the agenda https://etherpad.openstack.org/p/neutron-l3-subteam, it looks like we should go over potential backports
15:12:27 I would like to highlight one bug
15:12:42 Bug 1562878
15:12:42 bug 1562878 in neutron "L3 HA: Unable to complete operation on subnet" [High,Confirmed] https://launchpad.net/bugs/1562878 - Assigned to Ann Taraday (akamyshnikova)
15:13:23 * carl_baldwin looks...
15:14:03 It has been occurring regularly in the check/gate since July 24.
15:15:02 HenryG: Should it be critical?
15:15:27 Most gate failures are treated as critical. Especially if they're occurring regularly.
15:15:50 carl_baldwin: the recurrence is low enough that a recheck usually passes
15:16:06 But I am leaning towards critical
15:16:48 HenryG: Do you have a logstash query URL you've been using?
15:17:15 I put it in comment #5
15:17:49 Remove the build_queue:"gate" option to see all the occurrences
15:17:58 jschwarz: is ^^ on your radar? I know you've been looking at HA issues
15:19:03 HenryG: Ah, I see it.
15:21:08 HenryG: That looks like a lot of occurrences. But, many are in the same run.
15:21:15 reading
15:21:23 haleyb, yes, it's on my radar
15:21:45 haleyb, it's in my queue (which is a bit full atm), but Ann should be back next week and I hope to cooperate with her on this
15:21:45 I'm scratching my head over how retrying DBConnectionError will cause it.
15:21:54 carl_baldwin: I suck at logstash queries
15:22:34 HenryG: Me too.
15:22:51 also, I got this reproduced locally on a simple 2-node devstack deployment
15:23:07 so that should give us a better understanding on why it's happening
15:23:13 carl_baldwin: I doubt the DBConnectionError patch causes it, I just located the last merge that went in before the bug started showing up.
15:24:05 HenryG: Do you think maybe you just hit the limit of what logstash keeps around?
15:24:57 I think it only keeps about 7 days of data.
15:25:20 carl_baldwin: It allows me to select 30 days from the drop-down.
15:26:34 Anyway, I think the jschwarz option is the better way to track this down.
15:26:52 * jschwarz lols @ "the jschwarz option"
15:26:57 jschwarz: Is it reliably reproducible?
15:27:30 carl_baldwin, I remember running a bunch of rally tests (5ish?) and it happened a few times (2-3)
15:27:48 HenryG: I think all my queries get clamped at around 7 days even when selecting the 30 day option.
15:28:08 jschwarz: Sounds good.
15:28:15 * mlavalle has never been able to get a 30 days query
15:28:26 Should we keep it assigned to Ann?
15:28:31 carl_baldwin: then I have been on many wild goose chases :(
15:28:51 HenryG: I've been on those.
15:28:56 carl_baldwin, I think so - she'll come back next week and I'll discuss this bug with her and see if I should take it or not
15:29:10 ok
15:29:33 Let's move on.
15:29:59 alright
15:30:16 https://bugs.launchpad.net/neutron/+bug/1604370
15:30:16 Launchpad bug 1604370 in neutron "functional: test_legacy_router_ns_rebuild is unstable" [High,Fix released] - Assigned to Terry Wilson (otherwiseguy)
15:30:55 looks like we can pull this off the agenda
15:31:01 it seems the fix is released
15:31:32 moving on
15:31:35 https://bugs.launchpad.net/neutron/+bug/1596075
15:31:35 Launchpad bug 1596075 in neutron "Neutron confused about overlapping subnet creation" [High,In progress] - Assigned to Kevin Benton (kevinbenton)
15:31:51 I spent some time this morning tracking this one
15:32:16 It is a complicated affair, involving the quota engine, db retries and Galera
15:32:36 2 patchsets have been merged in relation to it
15:32:49 and kevinbenton is working on another 2 fixes:
15:33:08 #link https://review.openstack.org/#/c/339226/
15:33:32 #link https://review.openstack.org/#/c/346289/
15:34:22 I'll keep tracking it
15:34:37 mlavalle: thanks for staying on top of this
15:35:06 https://bugs.launchpad.net/neutron/+bug/1599329
15:35:06 Launchpad bug 1599329 in neutron
"Potential regression on handing over DHCP addresses to VMs" [High,In progress]
15:36:08 In watch mode.
15:36:15 yeah, looks like we're just waiting to see if this strikes again
15:36:47 can we take https://bugs.launchpad.net/neutron/+bug/1605277 off the agenda?
15:36:47 Launchpad bug 1605277 in neutron "[IPAM] 'Internal' ipam driver does not allow to delete all pools on subnet update" [High,Fix released] - Assigned to Carl Baldwin (carl-baldwin)
15:37:11 Yes
15:37:28 cool, done
15:37:45 https://bugs.launchpad.net/neutron/+bug/1603162
15:37:45 Launchpad bug 1603162 in neutron "Pluggable IPAM rollback fails with reference driver" [High,In progress] - Assigned to Carl Baldwin (carl-baldwin)
15:38:04 carl_baldwin: any luck with this one?
15:38:31 I've had some discussion on the ML with kevinbenton. We have some ideas.
15:38:40 I've got to get on this one quickly.
15:39:02 I'll be working on it today.
15:39:04 this is a blocker for the cutover to pluggable IPAM, right?
15:40:06 Yes.
15:40:10 ok
15:40:18 https://bugs.launchpad.net/neutron/+bug/1608406
15:40:18 Launchpad bug 1608406 in neutron "BGP: DVR fip host routes query including legacy/HA fip routes" [Undecided,In progress] - Assigned to LIU Yulong (dragon889)
15:40:31 just wanted to call this out as it has backport potential
15:40:53 this is my worst BGP nightmare come true
15:41:20 :(
15:42:00 we're sending the wrong next-hop for a FIP
15:42:42 tidwellr: just for mitaka, right? just updating bug tags
15:42:57 both newton and mitaka
15:43:26 for backport i meant :)
15:43:32 lol
15:43:44 oh, right
15:43:53 yes, mitaka for backport :)
15:44:03 ugh, it's been a morning......
15:44:20 I was able to reproduce locally, attaching a legacy router and a distributed router to the same external network results in a FIP on the legacy router also being announced as accessible via one of the FIP gateways
15:44:40 I'm helping chase this down
15:45:07 * mlavalle appreciates that it is early in tidwellr's time zone and he still shows up
15:46:10 any more bugs to discuss?
15:46:33 I have one but that can be left for the Open Discussion if there's time
15:46:34 We have https://bugs.launchpad.net/neutron/+bug/1609540, filed by a certain carl_baldwin
15:46:34 Launchpad bug 1609540 in neutron "Deleting csnat port fails due to no fixed ips" [Critical,In progress] - Assigned to Kevin Benton (kevinbenton)
15:46:49 It is in watch mode.
15:47:09 I'm going to keep an eye on it and hopefully reduce the severity soon.
15:47:36 Thanks!
15:48:42 I'll put it in the etherpad anyway
15:48:55 alright, we don't have much time to dive in to routed networks, FWaaS, RFE's, etc.
15:49:03 hello, i'm filling in for njohnston again this week. just 3 things:
15:49:09 the handle_router method was split into add_router and update_router in the L3 Agent Extension Manager patch (https://review.openstack.org/#/c/339246/10..11/neutron/agent/l3/l3_agent_extension.py)
15:49:18 I'm thinking we move to open discussion
15:49:20 so sorry
15:49:48 mfranc213: no worries, go for it
15:50:01 next patchset for the FWaaS L3 agent extension was issued yesterday (Refactor FWaaS' L3 agent extension)
15:50:06 work on the FWaaS plugin is proceeding and i believe we will get a patchset pushed in the next couple of days.
15:50:08 #topic Open Discussion
15:50:09 that's it!
15:50:33 https://review.openstack.org/#/c/337662/ needs 1 more "+2"
15:50:37 mfranc213: thanks for the update
15:50:45 anyone can help?
15:51:24 haleyb: could you take a look ^
15:51:45 mfranc213: I'll take a look at that change.
15:51:48 A couple of updates on service subnets.
Brian noticed a possible problem with the deletion logic for https://review.openstack.org/#/c/337851/ and pushed a fix yesterday. It should be good to go. And a WIP for the follow-up patch is here https://review.openstack.org/#/c/350613/ - It's almost ready for review.
15:51:49 i'm on it
15:51:49 steve_ruan: thanks for making some noise about that one
15:51:54 thank you carl_baldwin
15:52:26 I filed https://bugs.launchpad.net/neutron/+bug/1609738 which deals with a weird state HA routers can get into while creating/updating them; the solution I have in mind involves refactoring update_router_db in l3_hamode_db.py
15:52:26 Launchpad bug 1609738 in neutron "l3-ha: a router can be stuck in the ALLOCATING state" [Undecided,New] - Assigned to John Schwarz (jschwarz)
15:52:52 john-davidge: and the OSC patch https://review.openstack.org/#/c/342976/ just got a +2
15:53:05 such that modifying admin_state_up will unschedule/schedule the router (as opposed to the current ha attribute change which does this one)
15:53:16 haleyb: woop \o/
15:53:16 I need more opinions though on this matter
15:53:39 john-davidge: I took a look at https://review.openstack.org/#/c/350613
15:54:30 mlavalle: Yes, thanks for the review. I've already incorporated it into the next patch. Should help with performance at scale
15:56:55 jschwarz: is there a patchset up for review or should we comment in the bug?
15:57:08 mlavalle, comments on the bug will be much appreciated
15:57:27 it's quite a refactor and I started working on it today and it broke a few things :<
15:58:02 yeap, that's what big refactors do
15:58:07 XD
15:59:46 it seems bugs left us exhausted today :-)
16:00:13 mlavalle: indeed
16:00:25 thanks everyone!
16:00:31 #endmeeting