14:00:11 <mlavalle> #startmeeting neutron_drivers
14:00:12 <openstack> Meeting started Fri May 17 14:00:11 2019 UTC and is due to finish in 60 minutes.  The chair is mlavalle. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 <openstack> The meeting name has been set to 'neutron_drivers'
14:00:18 <njohnston> o/
14:00:22 <Tengu> \o/
14:00:25 <davidsha> o/
14:00:30 <tidwellr> o/
14:00:41 <doreilly> o/
14:00:49 * Zara lurks
14:00:58 <slaweq> hi
14:01:22 <haleyb> o/
14:01:25 <Tengu> aaand I don't remember who told me to be here for validation stuff. #smallBrain
14:02:11 <ralonsoh> hi
14:02:21 <mlavalle> doreilly: I wasn't expecting you today here
14:02:32 <mlavalle> we will try to get to you at the end
14:02:37 <mlavalle> ok?
14:02:41 <doreilly> sure - thanks
14:04:06 <slaweq> Tengu: aren't You here to talk about https://bugs.launchpad.net/neutron/+bug/1825943 ?
14:04:07 <openstack> Launchpad bug 1825943 in neutron "[RFE] Implement "stop agent" hook script support in Neutron" [Wishlist,New]
14:04:08 <slaweq> :)
14:04:41 <Tengu> slaweq: ohh. yeah, probably, indeed :). Sorry, I'm in validations things up to the neck.
14:04:50 <slaweq> Tengu: np :)
14:05:14 <mlavalle> LOL
14:05:26 <mlavalle> #topic RFEs
14:05:55 <mlavalle> Actually, our first RFE of the day is indeed https://bugs.launchpad.net/neutron/+bug/1825943
14:05:56 <openstack> Launchpad bug 1825943 in neutron "[RFE] Implement "stop agent" hook script support in Neutron" [Wishlist,New]
14:06:09 <Tengu> woohoo that's mine :)
14:06:35 <Tengu> so. do you want a quick summary maybe?
14:06:56 <mlavalle> sure, go ahead
14:07:03 <Tengu> so.
14:07:44 <Tengu> we discovered that neutron launches some agents as containers, and when it stops them it kills the running service instead of the container.
14:08:14 <mlavalle> because Neutron doesn't know about containers
14:08:19 <Tengu> yep.
14:08:37 <Tengu> so the current situation keeps dangling containers in a failed state, and this is pretty bad
14:09:09 <Tengu> regarding disk usage, well, it's not a big deal since those containers are small, but when you get the container ps output, seeing those dead things is pretty bad.
14:09:24 <Tengu> the RFE is about making neutron aware of "kill hooks" more than "container".
14:09:57 <Tengu> basically, there's already something like this for starting the agent (wrapper scripts are being used), so it would be great to get the same kind of thing for the agent death management
14:10:14 <Tengu> (sorry, I'm multi-tasking meetings - bear with me if I'm slow)
14:10:24 <njohnston> it's a common enough idea; at a previous employer we used to call these 'dirge scripts'
14:10:47 <njohnston> in order to handle shutdowns that required some level of orchestration or multi-step cleanup
14:11:07 <slaweq> and in fact our process_monitor is ready for it https://github.com/openstack/neutron/blob/master/neutron/agent/linux/external_process.py#L98
14:11:29 <Tengu> in "our" case, a simple $container_cli stop -t 10 <container-agent> would be enough - but if we can get some cleanup ($container_cli rm <container-agent>) that would be even better.
14:11:37 <Tengu> hence the notion of "hook"
14:11:51 <slaweq> so we can pass there some get_stop_command() function which will return such 'dirge script'
14:11:51 <Tengu> not something container-centric since it might be useful for other things.
14:12:11 <Tengu> slaweq: that would be great :)
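The stop-hook idea discussed above can be sketched as follows. This is a minimal illustration, not Neutron's implementation: the function names build_stop_command and stop_process, the hook_dir and hook_name parameters, and the convention that the hook receives the signal number and PID as arguments are all assumptions made for the example.

```python
import os
import signal
import subprocess


def build_stop_command(pid, sig, hook_dir, hook_name):
    """Return the argv list used to stop the process with the given PID.

    If an executable hook script (hypothetical convention) exists at
    hook_dir/hook_name, delegate to it; otherwise fall back to plain kill.
    """
    hook = os.path.join(hook_dir, hook_name) if hook_dir else None
    if hook and os.path.isfile(hook) and os.access(hook, os.X_OK):
        # The hook decides how to stop the service, e.g. by stopping
        # and removing a container rather than signalling a process.
        return [hook, str(sig), str(pid)]
    return ["kill", "-%d" % sig, str(pid)]


def stop_process(pid, sig=signal.SIGKILL, hook_dir=None, hook_name="agent-kill"):
    """Stop a process, preferring the operator-provided hook if present."""
    cmd = build_stop_command(pid, int(sig), hook_dir, hook_name)
    return subprocess.call(cmd)
```

In the container case Tengu describes, such a hook script could run `$container_cli stop -t 10 <container-agent>` rather than signalling the process directly, while deployments without a hook keep the existing kill behaviour.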
14:15:12 <mlavalle> when Neutron was originally implemented, we certainly didn't think of containers
14:15:17 <njohnston> and those "other things" are a very operator-friendly concept
14:15:52 <slaweq> njohnston++
14:16:08 <mlavalle> but as container-based deployments become more prevalent, a mechanism like this becomes more and more important
14:16:21 <mlavalle> I am fine with approving this
14:16:28 <slaweq> me too
14:16:47 <slaweq> and as I said, it doesn't look very hard to implement
14:16:47 <Tengu> great :)
14:16:49 <njohnston> what I like about this is we provide the places to plug in so that neutron can work with containers... but it also helps if you want to run neutron under k8s, or in mesos, or in whatever the next framework to come down the line is
14:17:05 <mlavalle> yes
14:17:29 <mlavalle> what does haleyb think?
14:18:32 <haleyb> sorry, i wasn't paying attention
14:19:38 <haleyb> reading scrollback, i think it's a good idea
14:19:52 <Tengu> :)
14:20:24 <mlavalle> any thoughts from others attending the meeting?
14:21:42 <mlavalle> Tengu: are you planning to implement it?
14:22:13 <Tengu> mlavalle: hm, not really in my skillset unfortunately
14:22:22 <slaweq> mlavalle: I have it added to my todo list
14:22:31 <mlavalle> ok
14:22:37 <Tengu> I'm part of the deployment framework team, so.... we just discovered this "issue" :)
14:22:40 <Tengu> slaweq: thanks!
14:22:56 <Tengu> slaweq: if you need help, I can run tests and so on, maybe review if needed.
14:23:11 <mlavalle> slaweq: I'll assign it to you. I can also help if you need
14:23:12 <slaweq> Tengu: sure, I will ping You if I need help
14:23:17 <slaweq> mlavalle: thx
14:23:24 <Tengu> slaweq: +1
14:25:37 <mlavalle> Next one is https://bugs.launchpad.net/neutron/+bug/1825345
14:25:38 <openstack> Launchpad bug 1825345 in neutron "[RFE] admin-state-down doesn't evacuate bindings in the dhcp_agent_id column" [Wishlist,Confirmed]
14:29:03 <mlavalle> zigo: if you are around, we are taking a look at ^^^^
14:29:23 <njohnston> zigo: Can you shed some light on how the results of 'openstack network agent remove' differ from 'neutron dhcp-agent-network-remove' - might be a bug
14:29:28 <slaweq> so personally I'm not sure if setting agent's admin_state_up to False should move all networks/routers from agent immediately
14:29:52 <slaweq> IMO we should debug and fix bug in openstack client if it doesn't work as expected
14:30:02 <njohnston> slaweq++
14:30:16 <slaweq> and then the operator would be able to move all networks/routers from the agent when needed
14:30:44 <mlavalle> and even create some automation based on commands
14:31:16 <slaweq> mlavalle: exactly but that's something on operator's plate - we should give proper API calls to allow this
14:31:33 <slaweq> and OSC should make it easy to do :)
14:32:46 <haleyb> this gets back at another rfe from intel (?) at doing automatic dhcp failovers, and we found it's maybe better to do outside the server code, right?
14:33:12 <slaweq> haleyb: yes, I remember that there was something like that in the past :)
14:33:15 <mlavalle> haleyb: yes, in Denver, Fall 2018
14:33:35 <haleyb> and there was an operators repo that ovh contributed to
14:34:08 <slaweq> haleyb: I think it was https://github.com/openstack/osops-tools-contrib
14:34:35 <haleyb> yes, that was it
14:35:57 <mlavalle> ok, I'll leave a comment in the rfe and point submitter to the scripts repo ^^^^
14:36:15 <haleyb> yes, would be good to get his feedback
14:36:40 <slaweq> mlavalle++
14:37:04 <mlavalle> doreilly: did you want to revisit https://bugs.launchpad.net/neutron/+bug/1817022?
14:37:05 <openstack> Launchpad bug 1817022 in neutron "[RFE] set inactivity_probe and max_backoff for OVS bridge controller" [Wishlist,In progress] - Assigned to Darragh O'Reilly (darragh-oreilly)
14:37:17 <doreilly> yes please
14:37:22 <mlavalle> go ahead
14:37:45 <doreilly> not really a whole lot to add since the last meeting
14:38:04 <doreilly> I retested with the patch that breaks up the dump_flows() and it helps
14:39:26 <doreilly> My only test case to reproduce a problem is to restart the agent when having a lot of flows
14:40:10 <doreilly> but it seems others have seen problems in places other than restart
14:40:27 <doreilly> https://bugs.launchpad.net/neutron/+bug/1821753
14:40:29 <openstack> Launchpad bug 1817022 in neutron "duplicate for #1821753 [RFE] set inactivity_probe and max_backoff for OVS bridge controller" [Wishlist,In progress] - Assigned to Darragh O'Reilly (darragh-oreilly)
14:42:16 <mlavalle> so the above fix mitigates but doesn't fix the problem completely?
14:42:26 <doreilly> correct
14:42:52 <slaweq> IMHO we should go with this "partial fix" if we don't have anything better for now
14:43:03 <slaweq> it's better than nothing
14:43:39 <tidwellr> slaweq++
14:44:47 <mlavalle> yeah. obondarev seems to be thinking along the same lines, although he is hesitant about the specific mix of values to tweak
14:45:02 <mlavalle> but at the very least we should experiment with it
14:45:10 <mlavalle> especially this early in the cycle
14:46:03 <slaweq> mlavalle++
14:46:39 <mlavalle> so I am for approving the RFE
14:46:44 <slaweq> me too
14:47:00 <doreilly> thanks guys
14:47:09 <Zara> :D
14:47:13 <mlavalle> haleyb: any thoughts?
14:47:51 <haleyb> i agree, get something in to experiment with at least
14:48:06 <mlavalle> ok, approved it is
14:48:30 <mlavalle> doreilly: thanks for your patience and perseverance. Much appreciated!
14:48:39 <doreilly> thx, I will add the docstring and repush
14:50:04 <mlavalle> ok, anything else we should discuss today?
14:50:48 <ralonsoh> only a heads-up
14:50:48 <ralonsoh> https://bugs.launchpad.net/neutron/+bug/1821058
14:50:50 <openstack> Launchpad bug 1821058 in neutron "[RFE] Port binding event extended information for Nova" [Wishlist,Confirmed] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:51:10 <ralonsoh> Feel free to review https://review.opendev.org/#/c/645173/
14:51:11 <ralonsoh> thanks!
14:51:20 <mlavalle> thanks ralonsoh
14:51:28 <mlavalle> ok
14:51:28 <slaweq> ralonsoh: I will try to review it next week
14:51:59 <njohnston> ralonsoh: same, will try to review next week
14:52:06 <mlavalle> don't forget that this Sunday is the last episode of Game of Thrones
14:52:14 <mlavalle> Enjoy!
14:52:17 <ralonsoh> hahahaha
14:52:36 <mlavalle> Have a nice weekend
14:52:41 <mlavalle> #endmeeting