15:00:20 <tidwellr> #startmeeting neutron_l3
15:00:21 <mlavalle> o/
15:00:24 <haleyb> hi
15:00:25 <openstack> Meeting started Thu Aug 4 15:00:20 2016 UTC and is due to finish in 60 minutes. The chair is tidwellr. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:28 <openstack> The meeting name has been set to 'neutron_l3'
15:00:42 <tidwellr> #chair mlavalle carl_baldwin
15:00:42 <openstack> Current chairs: carl_baldwin mlavalle tidwellr
15:01:19 <carl_baldwin> o/
15:01:44 <tidwellr> carl_baldwin is here, we do announcements! j/k
15:01:46 <tidwellr> #topic Announcements
15:03:34 <tidwellr> mid-cycle is coming 17th-19th
15:03:41 <tidwellr> #link https://etherpad.openstack.org/p/newton-neutron-midcycle
15:04:32 <tidwellr> anybody have a sense for what some of the hot topics might be?
15:05:12 <carl_baldwin> There have been some ML posts. There are lots of topics.
15:05:39 <carl_baldwin> Etherpad has a lot of info on it.
15:05:59 <haleyb> https://etherpad.openstack.org/p/newton-neutron-midcycle-workitems
15:06:00 <tidwellr> I asked because I didn't see much other than travel info in the etherpad
15:06:10 <haleyb> tidwellr: ^^
15:06:11 <tidwellr> ah, that's the one I was looking for
15:06:32 <carl_baldwin> You're right. I might have been thinking of another page.
15:06:34 <john-davidge> ah, haleyb beat me to it
15:06:47 <haleyb> and seeing the inside of pubs is a work item for some
15:06:49 <carl_baldwin> Yep, that page.
15:07:22 <carl_baldwin> The two should be cross-linked.
15:07:51 <mlavalle> haleyb: you not going?
15:08:48 <haleyb> mlavalle: no, on vacation with family that week, although i'll try to be online for some
15:09:13 <mlavalle> :-(
15:09:34 <mlavalle> Enjoy the vacation though!
15:09:48 <haleyb> yes, i tried to have the vacation moved to IE
15:10:39 <tidwellr> any more announcements?
15:11:35 <tidwellr> alright, moving on
15:11:41 <tidwellr> #topic Bugs
15:12:20 <tidwellr> looking at the agenda https://etherpad.openstack.org/p/neutron-l3-subteam, it looks like we should go over potential backports
15:12:27 <HenryG> I would like to highlight one bug
15:12:42 <HenryG> Bug 1562878
15:12:42 <openstack> bug 1562878 in neutron "L3 HA: Unable to complete operation on subnet" [High,Confirmed] https://launchpad.net/bugs/1562878 - Assigned to Ann Taraday (akamyshnikova)
15:13:23 * carl_baldwin looks...
15:14:03 <HenryG> It has been occurring regularly in the check/gate since July 24.
15:15:02 <carl_baldwin> HenryG: Should it be critical?
15:15:27 <carl_baldwin> Most gate failures are treated as critical. Especially if they're occurring regularly.
15:15:50 <HenryG> carl_baldwin: the recurrence is low enough that a recheck usually passes
15:16:06 <HenryG> But I am leaning towards critical
15:16:48 <carl_baldwin> HenryG: Do you have a logstash query URL you've been using?
15:17:15 <HenryG> I put it in comment #5
15:17:49 <HenryG> Remove the build_queue:"gate" option to see all the occurrences
15:17:58 <haleyb> jschwarz: is ^^ on your radar? I know you've been looking at HA issues
15:19:03 <carl_baldwin> HenryG: Ah, I see it.
15:21:08 <carl_baldwin> HenryG: That looks like a lot of occurrences. But, many are in the same run.
15:21:15 <jschwarz> reading
15:21:23 <jschwarz> haleyb, yes, it's on my radar
15:21:45 <jschwarz> haleyb, it's in my queue (which is a bit full atm), but Ann should be back next week and I hope to cooperate with her on this
15:21:45 <carl_baldwin> I'm scratching my head over how retrying DBConnectionError will cause it.
15:21:54 <HenryG> carl_baldwin: I suck at logstash queries
15:22:34 <carl_baldwin> HenryG: Me too.
15:22:51 <jschwarz> also, I got this reproduced locally on a simple 2-node devstack deployment
15:23:07 <jschwarz> so that should give us a better understanding on why it's happening
15:23:13 <HenryG> carl_baldwin: I doubt the DBConnectionError patch causes it, I just located the last merge that went in before the bug started showing up.
15:24:05 <carl_baldwin> HenryG: Do you think maybe you just hit the limit of what logstash keeps around?
15:24:57 <carl_baldwin> I think it only keeps about 7 days of data.
15:25:20 <HenryG> carl_baldwin: It allows me to select 30 days from the drop-down.
15:26:34 <HenryG> Anyway, I think the jschwarz option is the better way to track this down.
15:26:52 * jschwarz lols @ "the jschwarz option"
15:26:57 <carl_baldwin> jschwarz: Is it reliably reproducible?
15:27:30 <jschwarz> carl_baldwin, I remember running a bunch of rally tests (5ish?) and it happened a few times (2-3)
15:27:48 <carl_baldwin> HenryG: I think all my queries get clamped at around 7 days even when selecting the 30 day option.
15:28:08 <carl_baldwin> jschwarz: Sounds good.
15:28:15 * mlavalle has never been able to get a 30 days query
15:28:26 <carl_baldwin> Should we keep it assigned to Ann?
15:28:31 <HenryG> carl_baldwin: then I have been on many wild goose chases :(
15:28:51 <carl_baldwin> HenryG: I've been on those.
15:28:56 <jschwarz> carl_baldwin, I think so - she'll come back next week and I'll discuss this bug with her and see if I should take it or not
15:29:10 <carl_baldwin> ok
15:29:33 <carl_baldwin> Let's move on.
15:29:59 <tidwellr> alright
15:30:16 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1604370
15:30:16 <openstack> Launchpad bug 1604370 in neutron "functional: test_legacy_router_ns_rebuild is unstable" [High,Fix released] - Assigned to Terry Wilson (otherwiseguy)
15:30:55 <tidwellr> looks like we can pull this off the agenda
15:31:01 <mlavalle> it seems the fix is released
15:31:32 <tidwellr> moving on
15:31:35 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1596075
15:31:35 <openstack> Launchpad bug 1596075 in neutron "Neutron confused about overlapping subnet creation" [High,In progress] - Assigned to Kevin Benton (kevinbenton)
15:31:51 <mlavalle> I spent some time this morning tracking this one
15:32:16 <mlavalle> It is a complicated affair, involving the quota engine, db retries and Galera
15:32:36 <mlavalle> 2 patchsets have been merged in relationship to it
15:32:49 <mlavalle> and kevinbenton is working on another 2 fixes:
15:33:08 <mlavalle> #link https://review.openstack.org/#/c/339226/
15:33:32 <mlavalle> #link https://review.openstack.org/#/c/346289/
15:34:22 <mlavalle> I'll keep tracking it
15:34:37 <tidwellr> mlavalle: thanks for staying on top of this
15:35:06 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1599329
15:35:06 <openstack> Launchpad bug 1599329 in neutron "Potential regression on handing over DHCP addresses to VMs" [High,In progress]
15:36:08 <carl_baldwin> In watch mode.
15:36:15 <tidwellr> yeah, looks like we're just waiting to see if this strikes again
15:36:47 <tidwellr> can we take https://bugs.launchpad.net/neutron/+bug/1605277 off the agenda?
15:36:47 <openstack> Launchpad bug 1605277 in neutron "[IPAM] 'Internal' ipam driver does not allow to delete all pools on subnet update" [High,Fix released] - Assigned to Carl Baldwin (carl-baldwin)
15:37:11 <carl_baldwin> Yes
15:37:28 <tidwellr> cool, done
15:37:45 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1603162
15:37:45 <openstack> Launchpad bug 1603162 in neutron "Pluggable IPAM rollback fails with reference driver" [High,In progress] - Assigned to Carl Baldwin (carl-baldwin)
15:38:04 <tidwellr> carl_baldwin: any luck with this one?
15:38:31 <carl_baldwin> I've had some discussion on the ML with kevinbenton . We have some ideas.
15:38:40 <carl_baldwin> I've got to get on this one quickly.
15:39:02 <carl_baldwin> I'll be working on it today.
15:39:04 <tidwellr> this is a blocker for the cutover to pluggable IPAM, right?
15:40:06 <carl_baldwin> Yes.
15:40:10 <tidwellr> ok
15:40:18 <tidwellr> https://bugs.launchpad.net/neutron/+bug/1608406
15:40:18 <openstack> Launchpad bug 1608406 in neutron "BGP: DVR fip host routes query including legacy/HA fip routes" [Undecided,In progress] - Assigned to LIU Yulong (dragon889)
15:40:31 <tidwellr> just wanted to call this out as it has backport potential
15:40:53 <tidwellr> this is my worst BGP nightmare come true
15:41:20 <carl_baldwin> :(
15:42:00 <tidwellr> we're sending the wrong next-hop for a FIP
15:42:42 <haleyb> tidwellr: just for mitaka, right? just updating bug tags
15:42:57 <tidwellr> both newton and mitaka
15:43:26 <haleyb> for backport i meant :)
15:43:32 <mlavalle> lol
15:43:44 <tidwellr> oh, right
15:43:53 <tidwellr> yes, mitaka for backport :)
15:44:03 <tidwellr> ugh, it's been a morning......
15:44:20 <tidwellr> I was able to reproduce locally, attaching a legacy router and a distributed router to the same external network results in a FIP on the legacy router also being announced as accessible via one of the FIP gateways
15:44:40 <tidwellr> I'm helping chase this down
15:45:07 * mlavalle appreciates that it is early in tidwellr time zone and he still shows up
15:46:10 <tidwellr> any more bugs to discuss?
15:46:33 <jschwarz> I have one but that can be left for the Open Discussion if there's time
15:46:34 <mlavalle> We have https://bugs.launchpad.net/neutron/+bug/1609540, filed by a certain carl_baldwin
15:46:34 <openstack> Launchpad bug 1609540 in neutron "Deleting csnat port fails due to no fixed ips" [Critical,In progress] - Assigned to Kevin Benton (kevinbenton)
15:46:49 <carl_baldwin> It is in watch mode.
15:47:09 <carl_baldwin> I'm going to keep an eye on it and hopefully reduce the severity soon.
15:47:36 <mlavalle> Thanks!
15:48:42 <mlavalle> I'll put in the etherpad anyway
15:48:55 <tidwellr> alright, we don't have much time to dive in to routed networks, FWaaS, RFE's, etc.
15:49:03 <mfranc213> hello, i'm filling in for njohnston again this week. just 3 things:
15:49:09 <mfranc213> the handle_router method was split into add_router and update_router in the L3 Agent Extension Manager patch (https://review.openstack.org/#/c/339246/10..11/neutron/agent/l3/l3_agent_extension.py)
15:49:18 <tidwellr> I'm thinking we move to open discussion
15:49:20 <mfranc213> so sorry
15:49:48 <tidwellr> mfranc213: no worries, go for it
15:50:01 <mfranc213> next patchset for the FWaaS L3 agent extension was issued yesterday (Refactor FWaaS' L3 agent extension)
15:50:06 <mfranc213> work on the FWaaS plugin is proceeding and i believe we will get a patchset pushed in the next couple of days.
15:50:08 <tidwellr> #topic Open Discussion
15:50:09 <mfranc213> that's it!
15:50:33 <steve_ruan> https://review.openstack.org/#/c/337662/ need 1 more "+2"
15:50:37 <tidwellr> mfranc213: thanks for the update
15:50:45 <steve_ruan> anyone can help?
15:51:24 <carl_baldwin> haleyb: could you take a look ^
15:51:45 <carl_baldwin> mfranc213: I'll take a look at that change.
15:51:48 <john-davidge> A couple of updates on service subnets. Brian noticed a possible problem with the deletion logic for https://review.openstack.org/#/c/337851/ and pushed a fix yesterday. It should be good to go. And a WIP for the follow-up patch is here https://review.openstack.org/#/c/350613/ - It's almost ready for review.
15:51:49 <haleyb> i'm on it
15:51:49 <tidwellr> steve_ruan: thanks for making some noise about that one
15:51:54 <mfranc213> thank you carl_baldwin
15:52:26 <jschwarz> I filed https://bugs.launchpad.net/neutron/+bug/1609738 which deals with a weird state HA routers can get into while creating/updating it.. the solution I have in mind involves refactoring update_router_db for the l3_hamode_db.py
15:52:26 <openstack> Launchpad bug 1609738 in neutron "l3-ha: a router can be stuck in the ALLOCATING state" [Undecided,New] - Assigned to John Schwarz (jschwarz)
15:52:52 <haleyb> john-davidge: and the OSC patch https://review.openstack.org/#/c/342976/ just got a +2
15:53:05 <jschwarz> such that modifying admin_state_up will unschedule/schedule the router (as opposed for the current ha attribute change which does this one)
15:53:16 <john-davidge> haleyb: woop \o/
15:53:16 <jschwarz> I need more opinions though on this matter
15:53:39 <mlavalle> john-davidge: I took a look at https://review.openstack.org/#/c/350613
15:54:30 <john-davidge> mlavalle: Yes, thanks for the review. I've already incorporated it into the next patch. Should help with performance at scale
15:56:55 <mlavalle> jschwarz: is there a patchset up for review or should we comment in the bug?
15:57:08 <jschwarz> mlavalle, comments on the bug will be much appreciated
15:57:27 <jschwarz> it's quite a refactor and I started working on it today and it broke a few things :<
15:58:02 <mlavalle> yeap, that's what big refactors do
15:58:07 <jschwarz> XD
15:59:46 <mlavalle> it seems bugs left us exhausted today :-)
16:00:13 <tidwellr> mlavalle: indeed
16:00:25 <tidwellr> thanks everyone!
16:00:31 <tidwellr> #endmeeting