15:01:18 #startmeeting neutron_dvr
15:01:19 Meeting started Wed Jan 11 15:01:18 2017 UTC and is due to finish in 60 minutes. The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:24 The meeting name has been set to 'neutron_dvr'
15:01:29 #chair Swami
15:01:30 Current chairs: Swami haleyb
15:01:54 #topic Announcements
15:02:44 ocata-3 will sneak up quickly, so try and finish up work
15:03:19 https://launchpad.net/neutron/+milestone/ocata-3 was created to track blueprints/bugs targeted for O-3, ping me if you need something added
15:04:02 haleyb: is the fast-exit rfe tagged for ocata-3? if not, can we tag it for ocata-3, otherwise it will not go anywhere.
15:04:04 Also, if anyone is planning on running for PTL, get your paperwork in, think that's in two weeks or so
15:04:35 Swami: i don't see it there, i can add after meeting
15:04:41 haleyb: ok
15:05:40 #topic Bugs
15:05:49 haleyb: thanks
15:06:03 This week I did not see any new bugs.
15:06:43 haleyb: thanks for triaging a couple of old bugs.
15:07:04 I still got no response on bug 1653633
15:07:04 bug 1653633 in neutron "fwaas v1 with DVR: l3 agent can't restore the NAT rules for floatingIP" [Undecided,Incomplete] https://launchpad.net/bugs/1653633
15:07:26 haleyb: Yes I saw your response on it.
15:07:42 let us go through the high priority ones that are in review
15:07:48 #link https://bugs.launchpad.net/neutron/+bug/1647432
15:07:48 Launchpad bug 1647432 in neutron "Multiple SIGHUPs to keepalived might trigger re-election" [High,In progress] - Assigned to John Schwarz (jschwarz)
15:08:10 I've gotten some review comments and I'm going through them as we speak
15:08:18 hopefully I'll get a new patchset out tomorrow
15:08:21 #link https://review.openstack.org/#/c/407099/ (patch in review)
15:08:33 jschwarz: thanks
15:09:30 #link https://bugs.launchpad.net/neutron/+bug/1403455
15:09:30 Launchpad bug 1403455 in neutron "neutron-netns-cleanup doesn't clean up all L3 agent spawned processes" [High,In progress] - Assigned to Daniel Alvarez (dalvarezs)
15:09:47 a patch for that was merged I believe
15:10:03 yes, think even to stable
15:10:08 https://review.openstack.org/#/c/411968/
15:10:14 #link https://review.openstack.org/#/c/417957/ (This patch has merged). Let me update it in the launchpad.
15:10:18 there's a patch to backport it to stable/mitaka now
15:10:24 dalvarez is doing excellent work on that
15:10:42 stable/newton patch merged, not sure there's anything further
15:11:20 haleyb: yes I was looking at the launchpad and just updated the status to fix committed.
15:11:26 #link https://bugs.launchpad.net/neutron/+bug/1644231
15:11:26 Launchpad bug 1644231 in neutron "fip router config is not created if the vm ports attached to FIPs have no device_owner" [Low,Triaged]
15:11:42 This has been triaged and might be a documentation update.
15:12:23 Based on what we decide. We also have a similar problem for the allowed-address-pair port used for assigning the VIPs without the binding
15:13:56 haleyb: should this be update in the admin guide or in the developers guide
15:14:02 Swami: i think we were just going to document that.
I need to think where though - could be port-create or other page
15:14:03 s/update/updated
15:14:44 haleyb: ok thanks, I will look at the doc and see where we can fit it in.
15:14:58 Swami: thanks
15:15:11 #link https://bugs.launchpad.net/neutron/+bug/1644415
15:15:11 Launchpad bug 1644415 in neutron "dvr_edge_ha_router disassociates floatingip incompletely" [High,In progress] - Assigned to Zhixin Li (lizhixin)
15:15:27 #link https://review.openstack.org/#/c/404571/ - Patch needs review
15:16:16 The patch is pretty obvious but i have requested a functional test to make sure the rules are getting cleared
15:16:22 Swami: thanks for the -1 on that, does need a test
15:16:49 #link https://bugs.launchpad.net/neutron/+bug/1632540
15:16:49 Launchpad bug 1632540 in neutron "l3-agent print the ERROR log in l3 log file continuously ,finally fill file space,leading to crash the l3-agent service" [Undecided,Incomplete] - Assigned to zhichao zhu (rtmdk)
15:18:03 The only patches have been to remove the error, no root cause
15:18:03 I did see that there was some more update on the bug in launchpad on how to reproduce it. This is a known issue: when a router update fails, the agent will continuously try to update the router until it succeeds, and the log will quickly fill.
15:18:49 Probably the agent should have a retry count on the router update as a config option that would reduce the amount of logs.
15:19:57 jschwarz: have you seen this issue with HA? Because it is a combination of DVR+HA on SNAT nodes.
15:20:20 looking
15:20:57 Swami: i wonder if the dvr race we just fixed would also help - "failed to process compatible router" was one of the messages
15:21:26 tbh it looks like we need a lot more logs to figure out what's wrong
15:21:37 haleyb: but it depends on what is causing the failure, the race that we fixed may not solve this problem.
15:21:40 specifically the neutron-server logs
15:21:45 but that bug is from 2 months ago..
15:22:37 jschwarz: is it possible for you to triage this bug based on the steps and see if you can reproduce the problem?
15:23:01 Swami, can try
15:23:12 at the end of the day, the bug submitters need to help out when we can't reproduce it
15:23:16 Swami, but since then we've merged quite a few patches in that area
15:23:23 Swami, it might have been already fixed
15:23:42 jschwarz: yes a lot of patch fixes went into that area.
15:23:45 i'll add a note for them to re-test with ToT
15:25:25 Maybe we should also see how we can contain the logs in the agent log, this is not directly related to this bug, but for any agent related router updates.
15:25:50 #link https://bugs.launchpad.net/neutron/+bug/1629539
15:25:50 Launchpad bug 1629539 in neutron "Broken distributed virtual router w/ lbaas v1" [Undecided,Incomplete]
15:26:25 This has been marked as incomplete so don't need any further discussion on this.
15:26:40 #link https://bugs.launchpad.net/neutron/+bug/1612804
15:26:40 Launchpad bug 1612804 in neutron "test_shelve_instance fails with sshtimeout" [High,Confirmed]
15:27:12 * haleyb looks at kibana again
15:27:13 haleyb: Is this one still considered as High, since you did see only one or two occurrences in the gate?
15:27:36 * jschwarz has a meeting in 3 minutes
15:28:04 jschwarz: before you leave, do you have any other bugs to discuss?
15:28:13 Swami, nope, i'm good
15:28:21 jschwarz: thanks, no problem.
15:28:29 there was a gate breakage but me and kevinbenton fixed it earlier this week
15:28:37 Swami: i'll have to look at that again, i see failures in kibana, but not all are in neutron jobs
15:28:53 haleyb: ok no problem.
15:29:01 The next in the list is.
15:29:05 #link https://bugs.launchpad.net/neutron/+bug/1506567
15:29:05 Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [High,In progress] - Assigned to Brian Haley (brian-haley)
15:29:20 #link https://review.openstack.org/377108 - patch in review
15:29:54 Need reviews on this patch.
15:30:23 i will review again, must have fallen through the cracks over the break since i forgot about it
15:30:32 haleyb: thanks
15:30:37 #link https://bugs.launchpad.net/neutron/+bug/1571676
15:30:37 Launchpad bug 1571676 in neutron "After binding a floating IP to VM, the static route can't work in DVR." [Undecided,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:31:02 #link https://review.openstack.org/#/c/308068/ - Patch needs review
15:31:44 The next in the list is
15:31:47 #link https://review.openstack.org/#/c/352686/
15:32:31 #link https://bugs.launchpad.net/neutron/+bug/1606741
15:32:32 Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [Medium,In progress]
15:33:10 #link https://review.openstack.org/#/c/352686/ - Patch in review. It has been abandoned, probably the owner has not rebased it for quite a while.
15:33:31 and the bug is unassigned
15:34:21 haleyb: ok
15:34:37 running dvr_snat on a compute node is an edge case imo
15:34:57 haleyb: yes, let me at least try to rebase the patch and push it for review
15:35:12 Swami: then you will become the bug owner :)
15:35:18 automatically
15:35:26 haleyb: ok will take it.
15:35:44 Swami: i just mean the tools update the bug to the last committer
15:36:14 haleyb: yes I got it.
15:36:26 That's all I had for the bugs.
15:37:46 #topic Gate
15:38:21 I need to update my patch to get dvr-multinode job voting, had some feedback i need to incorporate
15:38:35 Do you have the patch link?
15:39:17 https://review.openstack.org/#/c/410973/
15:39:26 i'll put that in the wiki
15:39:48 haleyb: thanks
15:40:00 for some reason grafana page shows the -trusty job failing often, but only xenial appears on any patches
15:40:48 #topic Stable backports
15:40:59 haleyb: I have seen some random failures on xenial in a couple of my patches. But the test failures are not consistent. I am not sure if something else is wrong with the gate.
15:41:35 Swami: which jobs? i see the linuxbridge job fail a lot :(
15:42:10 gate-neutron-dsvm-functional-ubuntu-xenial
15:42:59 something related to ovs failure in the test.
15:43:41 i haven't seen that failure myself
15:44:17 haleyb: I saw that in one of my patches. https://review.openstack.org/#/c/283757/
15:45:25 haleyb: I don't have any backports at this time. But I will keep an eye on other patches and will cherry-pick if needed.
15:45:58 so backports
15:46:03 sorry, concurrent meetings :P
15:46:19 Swami: thanks, and if you can gather any more info on that failure from the logs open a bug
15:46:20 we have the l3 scheduler patches we pushed that we can certainly backport to newton
15:47:05 jschwarz: yes, i was going to ask if we had any open changes or candidates
15:47:19 we do
15:47:25 I'm getting a list, hold on
15:47:36 https://review.openstack.org/#/c/317949/
15:47:41 * haleyb plays jeopardy song
15:47:42 https://review.openstack.org/#/c/417089/
15:47:46 https://review.openstack.org/#/c/417854/
15:47:51 sorry I got disconnected.
15:47:52 https://review.openstack.org/#/c/357966/
15:47:56 https://review.openstack.org/#/c/418777/
15:47:58 these 5 :P
15:48:33 they are all pretty safe as there wasn't a lot of code change between newton and master in the l3 scheduling department
15:48:46 jschwarz: so just to newton?
15:48:47 and they all depend on a db change that was made pre-newton-rc
15:48:50 yes
15:49:13 jschwarz: so they can be only backported to newton.
15:49:23 Swami_, that's correct
15:50:05 anyways, those could be nice to have
15:50:13 the first one is a pretty good one I must admit
15:50:18 and the rest are kinda "free"
15:50:45 jschwarz: i'll add them to the wiki
15:50:50 haleyb++
15:51:57 and add me to the reviews if i don't do it myself
15:53:00 anything else for stable we're missing?
15:53:05 not from me
15:53:38 on the open discussion though I have a topic :P
15:53:40 haleyb: nothing else.
15:53:48 #topic Open Discussion
15:54:08 haleyb: I also have one
15:54:10 haleyb, thoughts on the general direction of https://review.openstack.org/#/c/376550/
15:54:11 ?
15:54:50 jschwarz: it's passing jenkins :)
15:54:58 haleyb, lol :)
15:55:11 haleyb, a few months back you guys were against it because it might break vpnaas
15:55:25 I was interested in knowing of that objection is now off the table and it might actually merge
15:55:37 s/of/off/
15:55:42 s/of/if/ gahhhh
15:56:16 jschwarz: I need to recall what the issue with the vpnaas was.
15:56:26 * haleyb doesn't remember the issue either
15:56:44 Swami_, the issue was that when setting admin_state_up=False for DVR routers, vpnaas doesn't do anything
15:56:47 jschwarz: do you have any context on why it should break the vpnaas?
15:56:49 (doesn't turn it off)
15:57:24 but since vpnaas is no longer maintained by anyone (pcm doesn't even know any known cores for it anymore), I don't see why we can't maybe-break it
15:58:04 jschwarz: I thought the vpnaas does similar things to the l3 agent, so when admin_state_up=False, the l3 agent should bring down the router, and so the router-delete or router-update information should be handled by vpnaas.
15:58:08 Am I wrong?
15:58:25 Swami_, the vpnaas agent inherits from l3-agent
15:58:50 Swami_, but it still has its own resources so in addition to calling super(), it should also do stuff on its own (like bring the vpn tunnels down, etc)
15:58:53 jschwarz: yes that's what i meant.
15:59:37 one minute left
15:59:49 can we take this to #openstack-neutron then?
16:00:01 sure
16:00:05 :)
16:00:05 jschwarz: we had a patch initially for migration that recommends that the admin bring down the service before proceeding with the migration.
16:00:21 Swami_, yes - that was merged for the HA part
16:00:21 we are at the top of the hour.
16:00:23 we can continue in the neutron channel
16:00:26 #endmeeting