15:01:18 <haleyb> #startmeeting neutron_dvr
15:01:19 <openstack> Meeting started Wed Jan 11 15:01:18 2017 UTC and is due to finish in 60 minutes.  The chair is haleyb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:24 <openstack> The meeting name has been set to 'neutron_dvr'
15:01:29 <haleyb> #chair Swami
15:01:30 <openstack> Current chairs: Swami haleyb
15:01:54 <haleyb> #topic Announcements
15:02:44 <haleyb> ocata-3 will sneak up quickly, so try and finish up work
15:03:19 <haleyb> https://launchpad.net/neutron/+milestone/ocata-3 was created to track blueprints/bugs targeted for O-3, ping me if you need something added
15:04:02 <Swami> haleyb: is the fast-exit RFE tagged for ocata-3? If not, can we tag it for ocata-3? Otherwise it will not go anywhere.
15:04:04 <haleyb> Also, if anyone is planning on running for PTL, get your paperwork in, think that's in two weeks or so
15:04:35 <haleyb> Swami: i don't see it there, i can add after meeting
15:04:41 <Swami> haleyb: ok
15:05:40 <haleyb> #topic Bugs
15:05:49 <Swami> haleyb: thanks
15:06:03 <Swami> This week I did not see any new bugs.
15:06:43 <Swami> haleyb: thanks for triaging a couple of old bugs.
15:07:04 <haleyb> I still got no response on bug 1653633
15:07:04 <openstack> bug 1653633 in neutron "fwaas v1 with DVR: l3 agent can't restore the NAT rules for floatingIP" [Undecided,Incomplete] https://launchpad.net/bugs/1653633
15:07:26 <Swami> haleyb: Yes I saw your response on it.
15:07:42 <Swami> let us go through the high priority ones that are in review
15:07:48 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1647432
15:07:48 <openstack> Launchpad bug 1647432 in neutron "Multiple SIGHUPs to keepalived might trigger re-election" [High,In progress] - Assigned to John Schwarz (jschwarz)
15:08:10 <jschwarz> I've gotten some review comments and I'm going through them as we speak
15:08:18 <jschwarz> hopefully I'll get a new patchset out tomorrow
15:08:21 <Swami> #link https://review.openstack.org/#/c/407099/ ( patch in review)
15:08:33 <Swami> jschwarz: thanks
15:09:30 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1403455
15:09:30 <openstack> Launchpad bug 1403455 in neutron "neutron-netns-cleanup doesn't clean up all L3 agent spawned processes" [High,In progress] - Assigned to Daniel Alvarez (dalvarezs)
15:09:47 <jschwarz> a patch for that was merged I believe
15:10:03 <haleyb> yes, think even to stable
15:10:08 <jschwarz> https://review.openstack.org/#/c/411968/
15:10:14 <Swami> #link https://review.openstack.org/#/c/417957/ ( This patch has merged). Let me update it in the launchpad.
15:10:18 <jschwarz> there's a patch to backport it to stable/mitaka now
15:10:24 <jschwarz> dalvarez is doing excellent work on that
15:10:42 <haleyb> stable/newton patch merged, not sure there's anything further
15:11:20 <Swami> haleyb: yes I was looking at the launchpad and just updated the status to fix committed.
15:11:26 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1644231
15:11:26 <openstack> Launchpad bug 1644231 in neutron "fip router config is not created if the vm ports attached to FIPs have no device_owner" [Low,Triaged]
15:11:42 <Swami> This has been triaged and might be a documentation update.
15:12:23 <Swami> Based on what we decide. We also have a similar problem for the allowed-address-pair port used for assigning the VIPs without the binding
15:13:56 <Swami> haleyb: should this be update in the admin guide or in the developers guide
15:14:02 <haleyb> Swami: i think we were just going to document that.  I need to think where though - could be port-create or other page
15:14:03 <Swami> s/update/updated
15:14:44 <Swami> haleyb: ok thanks, I will look at the doc and see where we can fit it in.
15:14:58 <haleyb> Swami: thanks
15:15:11 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1644415
15:15:11 <openstack> Launchpad bug 1644415 in neutron "dvr_edge_ha_router disassociates floatingip incompletely" [High,In progress] - Assigned to Zhixin Li (lizhixin)
15:15:27 <Swami> #link https://review.openstack.org/#/c/404571/ - Patch needs review
15:16:16 <Swami> The patch is pretty obvious, but I have requested a functional test to make sure the rules are getting cleared
15:16:22 <haleyb> Swami: thanks for the -1 on that, does need a test
15:16:49 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1632540
15:16:49 <openstack> Launchpad bug 1632540 in neutron "l3-agent print the ERROR log in l3 log file continuously ,finally fill file space,leading to crash the l3-agent service" [Undecided,Incomplete] - Assigned to zhichao zhu (rtmdk)
15:18:03 <haleyb> The only patches have been to remove the error, no root cause
15:18:03 <Swami> I did see that there was a further update on the bug in launchpad on how to reproduce it. This is a known issue: when a router update fails, the agent will continuously retry the update until it succeeds, and the log will quickly fill up.
15:18:49 <Swami> Probably the agent should have a retry count on the router update, as a config option, which would reduce the amount of logs.
15:19:57 <Swami> jschwarz: have you seen this issue with HA? It is a combination of DVR+HA on SNAT nodes.
15:20:20 <jschwarz> looking
15:20:57 <haleyb> Swami: i wonder if the dvr race we just fixed would also help - "failed to process compatible router" was one of the messages
15:21:26 <jschwarz> tbh it looks like we need a lot more logs to figure out what's wrong
15:21:37 <Swami> haleyb: but it depends on what is causing the failure, the race that we fixed may not solve this problem.
15:21:40 <jschwarz> specifically the neutron-server logs
15:21:45 <jschwarz> but that bug is from 2 months ago..
15:22:37 <Swami> jschwarz: is it possible for you to triage this bug based on the steps and see if you can reproduce the problem.
15:23:01 <jschwarz> Swami, can try
15:23:12 <haleyb> at the end of the day, the bug submitters need to help out when we can't reproduce it
15:23:16 <jschwarz> Swami, but since then we've merged quite a few patches in that area
15:23:23 <jschwarz> Swami, it might have been already fixed
15:23:42 <Swami> jschwarz: yes, a lot of fixes went into that area.
15:23:45 <haleyb> i'll add a note for them to re-test with ToT
15:25:25 <Swami> Maybe we should also see how we can contain the logging in the agent log; this is not directly related to this bug, but applies to any agent-related router updates.
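
The retry-count idea raised above can be sketched roughly as follows. This is a minimal illustration, not the l3-agent's actual code: `RouterUpdateError`, `process_with_retry_limit`, and the `max_retries` parameter are all hypothetical names, and the real agent would read the limit from its own config. Only the standard-library `logging` module is used.

```python
import logging

LOG = logging.getLogger(__name__)


class RouterUpdateError(Exception):
    """Hypothetical error raised when processing a router update fails."""


def process_with_retry_limit(process_router, router_id, max_retries=5):
    """Call process_router(router_id) until it succeeds or retries run out.

    Returns the number of attempts used on success, or None after giving up.
    max_retries stands in for the config option suggested in the meeting.
    """
    for attempt in range(1, max_retries + 1):
        try:
            process_router(router_id)
            return attempt
        except RouterUpdateError as exc:
            # One short line per failed attempt instead of a full traceback,
            # so a persistently failing router cannot fill the log file.
            LOG.warning("router %s update failed (attempt %d/%d): %s",
                        router_id, attempt, max_retries, exc)
    LOG.error("giving up on router %s after %d attempts",
              router_id, max_retries)
    return None
```

With a bound like this, a router that never processes cleanly produces at most `max_retries` warnings plus one error, rather than an unbounded stream.
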
15:25:50 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1629539
15:25:50 <openstack> Launchpad bug 1629539 in neutron "Broken distributed virtual router w/ lbaas v1" [Undecided,Incomplete]
15:26:25 <Swami> This has been marked as incomplete, so we don't need any further discussion on it.
15:26:40 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1612804
15:26:40 <openstack> Launchpad bug 1612804 in neutron "test_shelve_instance fails with sshtimeout" [High,Confirmed]
15:27:12 * haleyb looks at kibana again
15:27:13 <Swami> haleyb: Is this one still considered High, since you only saw one or two occurrences in the gate?
15:27:36 * jschwarz has a meeting in 3 minutes
15:28:04 <Swami> jschwarz: before you leave, do you have any other bugs to discuss.
15:28:13 <jschwarz> Swami, nope, i'm good
15:28:21 <Swami> jschwarz: thanks, no problem.
15:28:29 <jschwarz> there was a gate breakage but me and kevinbenton fixed it earlier this week
15:28:37 <haleyb> Swami: i'll have to look at that again, i see failures in kibana, but not all are in neutron jobs
15:28:53 <Swami> haleyb: ok no problem.
15:29:01 <Swami> The next in the list is.
15:29:05 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1506567
15:29:05 <openstack> Launchpad bug 1506567 in neutron "No information from Neutron Metering agent" [High,In progress] - Assigned to Brian Haley (brian-haley)
15:29:20 <Swami> #link https://review.openstack.org/377108 - patch in review
15:29:54 <Swami> Need reviews on this patch.
15:30:23 <haleyb> i will review again, must have fallen through the cracks over the break since i forgot about it
15:30:32 <Swami> haleyb: thanks
15:30:37 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1571676
15:30:37 <openstack> Launchpad bug 1571676 in neutron "After binding a floating IP to VM, the static route can't work in DVR." [Undecided,In progress] - Assigned to Swaminathan Vasudevan (swaminathan-vasudevan)
15:31:02 <Swami> #link https://review.openstack.org/#/c/308068/ - Patch needs review
15:31:44 <Swami> The next in the list is
15:31:47 <Swami> #link https://review.openstack.org/#/c/352686/
15:32:31 <Swami> #link https://bugs.launchpad.net/neutron/+bug/1606741
15:32:32 <openstack> Launchpad bug 1606741 in neutron "Metadata service for instances is unavailable when the l3-agent on the compute host is dvr_snat mode" [Medium,In progress]
15:33:10 <Swami> #link https://review.openstack.org/#/c/352686/ - Patch in review. It has been abandoned, probably because the owner has not rebased it for quite a while.
15:33:31 <haleyb> and the bug is unassigned
15:34:21 <Swami> haleyb: ok
15:34:37 <haleyb> running dvr_snat on a compute node is an edge case imo
15:34:57 <Swami> haleyb: yes, let me at least try to rebase the patch and push it for review
15:35:12 <haleyb> Swami: then you will become the bug owner :)
15:35:18 <haleyb> automatically
15:35:26 <Swami> haleyb: ok will take it.
15:35:44 <haleyb> Swami: i just mean the tools update the bug to the last committer
15:36:14 <Swami> haleyb: yes I got it.
15:36:26 <Swami> That's all I had for the bugs.
15:37:46 <haleyb> #topic Gate
15:38:21 <haleyb> I need to update my patch to get dvr-multinode job voting, had some feedback i need to incorporate
15:38:35 <Swami> Do you have the patch link
15:39:17 <haleyb> https://review.openstack.org/#/c/410973/
15:39:26 <haleyb> i'll put that in the wiki
15:39:48 <Swami> haleyb: thanks
15:40:00 <haleyb> for some reason grafana page shows the -trusty job failing often, but only xenial appears on any patches
15:40:48 <haleyb> #topic Stable backports
15:40:59 <Swami> haleyb: I have seen some random failures on xenial in a couple of my patches. But the test failures are not consistent. I am not sure if something else is wrong with the gate.
15:41:35 <haleyb> Swami: which jobs?  i see the linuxbridge job fail a lot :(
15:42:10 <Swami> gate-neutron-dsvm-functional-ubuntu-xenial
15:42:59 <Swami> something related to ovs failure in the test.
15:43:41 <haleyb> i haven't seen that failure myself
15:44:17 <Swami> haleyb: I saw that in one of my patches. https://review.openstack.org/#/c/283757/
15:45:25 <Swami> haleyb: I don't have any backports at this time. But I will keep an eye on other patches and will cherry-pick if needed.
15:45:58 <jschwarz> so backports
15:46:03 <jschwarz> sorry, concurrent meetings :P
15:46:19 <haleyb> Swami: thanks, and if you can gather any more info on that failure from the logs open a bug
15:46:20 <jschwarz> we have the l3 scheduler patches we pushed that we can certainly backport to newton
15:47:05 <haleyb> jschwarz: yes, i was going to ask if we had any open changes or candidates
15:47:19 <jschwarz> we do
15:47:25 <jschwarz> I'm getting a list, hold on
15:47:36 <jschwarz> https://review.openstack.org/#/c/317949/
15:47:41 * haleyb plays jeopardy song
15:47:42 <jschwarz> https://review.openstack.org/#/c/417089/
15:47:46 <jschwarz> https://review.openstack.org/#/c/417854/
15:47:51 <Swami_> sorry I got disconnected.
15:47:52 <jschwarz> https://review.openstack.org/#/c/357966/
15:47:56 <jschwarz> https://review.openstack.org/#/c/418777/
15:47:58 <jschwarz> these 5 :P
15:48:33 <jschwarz> they are all pretty safe as there wasn't a lot of code change between newton and master in the l3 scheduling department
15:48:46 <haleyb> jschwarz: so just to newton?
15:48:47 <jschwarz> and they all depend on a db change that was made pre-newton-rc
15:48:50 <jschwarz> yes
15:49:13 <Swami_> jschwarz: so they can be only backported to newton.
15:49:23 <jschwarz> Swami_, that's correct
15:50:05 <jschwarz> anyways, those could be nice to have
15:50:13 <jschwarz> the first one is a pretty good one I must admit
15:50:18 <jschwarz> and the rest are kinda "free"
15:50:45 <haleyb> jschwarz: i'll add them to the wiki
15:50:50 <jschwarz> haleyb++
15:51:57 <haleyb> and add me to the reviews if i don't do it myself
15:53:00 <haleyb> anything else for stable we're missing?
15:53:05 <jschwarz> not from me
15:53:38 <jschwarz> on the open discussion though I have a topic :P
15:53:40 <Swami_> haleyb: nothing else.
15:53:48 <haleyb> #topic Open Discussion
15:54:08 <Swami_> haleyb: I also have one
15:54:10 <jschwarz> haleyb, thoughts on the general direction of https://review.openstack.org/#/c/376550/
15:54:11 <jschwarz> ?
15:54:50 <haleyb> jschwarz: it's passing jenkins :)
15:54:58 <jschwarz> haleyb, lol :)
15:55:11 <jschwarz> haleyb, a few months back you guys were against it because it might break vpnaas
15:55:25 <jschwarz> I was interested in knowing of that objection is now off the table and it might actually merge
15:55:37 <jschwarz> s/of/off/
15:55:42 <jschwarz> s/of/if/ gahhhh
15:56:16 <Swami_> jschwarz: I need to recall what was the issue with the vpnaas.
15:56:26 * haleyb doesn't remember the issue either
15:56:44 <jschwarz> Swami_, the issue was that when setting admin_state_up=False for DVR routers, vpnaas doesn't do anything
15:56:47 <Swami_> jschwarz: do you have a context on why it should break the vpnaas
15:56:49 <jschwarz> (doesn't turn it off)
15:57:24 <jschwarz> but since vpnaas is no longer maintained by anyone (pcm doesn't even know any known cores for it anymore), I don't see why we can't maybe-break it
15:58:04 <Swami_> jschwarz: I thought the vpnaas agent does the same as the l3 agent, so when admin_state_up=False, the l3 agent should bring down the router, and so the router-delete or router-update information should be handled by vpnaas.
15:58:08 <Swami_> Am I wrong?
15:58:25 <jschwarz> Swami_, the vpnaas agent inherits from l3-agent
15:58:50 <jschwarz> Swami_, but it still has its own resources so in addition to calling super(), it should also do stuff on its own (like bring the vpn tunnels down, etc)
15:58:53 <Swami_> jschwarz: yes that's what i meant.
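
The inheritance pattern jschwarz describes (the subclass calls super() and then tears down its own resources) can be sketched as below. Both classes and the `router_admin_state_changed` method are hypothetical stand-ins for illustration; they are not the real neutron or neutron-vpnaas classes or hooks.

```python
class L3AgentSketch:
    """Hypothetical stand-in for the L3 agent; not the real neutron class."""

    def __init__(self):
        self.active_routers = set()

    def router_admin_state_changed(self, router_id, admin_state_up):
        # Base behaviour: track whether the router is up or down.
        if admin_state_up:
            self.active_routers.add(router_id)
        else:
            self.active_routers.discard(router_id)


class VpnAgentSketch(L3AgentSketch):
    """Hypothetical VPN agent that owns extra per-router resources."""

    def __init__(self):
        super().__init__()
        self.tunnels = {}  # router_id -> list of tunnel names

    def router_admin_state_changed(self, router_id, admin_state_up):
        # Let the base agent handle the router itself first...
        super().router_admin_state_changed(router_id, admin_state_up)
        # ...then clean up the resources only this subclass knows about.
        if not admin_state_up:
            self.tunnels.pop(router_id, None)
```

The concern raised in the meeting corresponds to the second half of the override being missing: the base agent would bring the router down while the tunnels stay up.
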
15:59:37 <haleyb> one minute left
15:59:49 <jschwarz> can we take this to #openstack-neutron then?
16:00:01 <haleyb> sure
16:00:05 <jschwarz> :)
16:00:05 <Swami_> jschwarz: we had a patch initially for migration that recommends the admin bring down the service before proceeding with the migration.
16:00:21 <jschwarz> Swami_, yes - that was merged for the HA part
16:00:21 <Swami_> we are at the top of the hour.
16:00:23 <haleyb> we can continue in the neutron channel
16:00:26 <haleyb> #endmeeting