16:00:37 <mestery> #startmeeting networking_ml2
16:00:39 <openstack> Meeting started Wed Jan 15 16:00:37 2014 UTC and is due to finish in 60 minutes.  The chair is mestery. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:43 <openstack> The meeting name has been set to 'networking_ml2'
16:00:52 <mestery> #link https://wiki.openstack.org/wiki/Meetings/ML2 Agenda
16:01:15 <mestery> We've got an agenda which will either go at least an hour, or we'll be done in 20 minutes. Let's see what happens. :)
16:01:23 <mestery> #topic Action Items
16:01:33 <mestery> The first action item to cover is for rcurran.
16:01:43 <mestery> rcurran: Did you verify if a bug is needed for Cisco UT coverage in ML2?
16:01:58 <rcurran> yes, do you want the link
16:02:02 <rcurran> bug#
16:02:09 <mestery> rcurran: Yes please, I'll add it in the meeting minutes.
16:02:18 <rcurran> https://bugs.launchpad.net/neutron/+bug/1267481
16:02:26 <mestery> rcurran: thanks!
16:02:41 <mestery> #link https://bugs.launchpad.net/neutron/+bug/1267481 Bug for Cisco ML2 UT coverage
16:03:06 <mestery> asadoughi: Thank you for filing bugs for the ML2 UT coverage gaps!
16:03:21 <mestery> #info ML2 UT coverage bugs filed and tagged: https://bugs.launchpad.net/neutron/+bugs?field.tag=ml2-unit-test-coverage
16:03:26 <asadoughi> mestery: np
16:03:58 <mestery> Now that these bugs are filed, if you are planning to work on one, please assign it to yourself. Anyone can grab one of these.
16:04:13 <mestery> Would be good to get this coverage added during Icehouse if we can spread the load.
16:04:32 <mestery> So, jumping around a little (we'll come back to you rkukura):
16:04:42 <mestery> I pinged asomya this morning around his RPC google document
16:04:46 <mestery> #link https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit RPC Google Document
16:04:56 <mestery> I have not heard back from asomya yet, but once I do, we'll send email to the mailing list.
16:05:12 <mestery> #topic Port Binding
16:05:21 <mestery> So now, the meat of the meeting: port binding :)
16:05:28 <mestery> #undo
16:05:29 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x39171d0>
16:05:32 <mestery> Wait, asomya joined!
16:05:35 <mestery> asomya, welcome!
16:05:41 <asomya> Hello
16:05:46 <mestery> I just mentioned your RPC document (https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit)
16:06:01 <matrohon> hi
16:06:03 <mestery> Last week, there was a question about whether you were still targeting this for Icehouse.
16:06:10 <mestery> I wasn't sure, thus needed your input :)
16:06:31 <asomya> That's not being worked on actively at the moment. It needs the other patch that I proposed at the summit to make type drivers more self-sufficient
16:06:49 <asomya> I was waiting for Zhang's patch to go in before I posted mine, but now I see that one is abandoned
16:06:53 <mestery> asomya: Thanks. Could we expand Zhang's typedriver patch to make that work with this RPC work?
16:06:57 <mestery> asomya: :)
16:07:13 <asomya> Yes I'll post a patch for that work and then work on this
16:07:15 <rkukura> Phasing smaller patches is better
16:07:37 <rkukura> Let's get the type driver refactor active again, whether via the original patch or a new one
16:07:39 <mestery> Agreed. So asomya, do you want to reach out to Zhang and coordinate things there?
16:07:45 <mestery> rkukura: Agreed.
16:07:50 <asomya> rkukura: agreed
16:07:53 <mestery> asomya: You ok reaching out for this, or do you want me to take an action?
16:08:10 <asomya> I'll reach out to Zhang if i need his counsel on the patch
16:08:16 <mestery> asomya: Thanks!
16:08:26 <mestery> #action asomya to work on new typedriver refactoring patch with zhang
16:08:39 <mestery> OK, anything else on RPC or TypeDriver here before we move on to port binding discussions?
16:09:09 <matrohon> asomya: let's keep in touch, we may need your functionality
16:09:22 <asomya> matrohon: sure
16:09:38 <mestery> #topic Port Binding
16:09:49 <mestery> rkukura: You're up!
16:10:19 <rkukura> OK, I have not yet emailed a concrete proposal on port binding improvements
16:10:36 <rkukura> but can discuss the ideas/conclusions here
16:11:26 <rkukura> 1st, because MDs may need to make remote calls to controllers/devices, we need ml2 to call bind_port() on them outside of any transaction rather than inside, as is currently done
16:12:36 <rkukura> So I was looking into whether we could do the port binding before starting the transaction. But if binding:host_id is supplied in port_create, this would mean trying to bind before we've stored anything about the port in the DB
16:13:11 <rkukura> It seems some MDs might need to maintain their own table mapping neutron's port ID to a controller's port ID
16:14:03 <rkukura> But the MDs could not reference the neutron port as a foreign key with cascading delete and all that if the transaction creating this hasn't even started.
16:14:26 <rkukura> So that leads to a two transaction approach.
16:15:17 <rkukura> The port_create or port_update that sets the binding:host_id would 1st be processed normally without trying to bind the port, and the MDs' precommit and postcommit calls would be made
16:15:36 <rkukura> Then port binding would occur outside any transaction
16:15:57 <mestery> rkukura: Per our discussion last night, this all makes sense to me. I'm curious what others think.
16:16:08 <rkukura> in bind_port, MDs could do their own transactions if needed and/or could talk to controllers/devices
16:16:38 <rkukura> Then a 2nd transaction would update the ml2_port_binding table with the result
16:16:50 <matrohon> makes sense to me too
16:16:52 <rkukura> And I think the MDs would see this as a port_update
16:17:09 <rkukura> with its own precommit and postcommit calls
16:17:22 <mestery> Yes. rkukura, I think this also makes the interactions with nova pretty concise as well, since nova would first create the port, then update the port.
16:17:54 <rkukura> mestery: Actually, nova could still just create the port, and expect the returned port dict to have the binding result
16:18:04 <matrohon> mestery : why nova should update the port?
16:18:32 <mestery> rkukura matrohon: Nevermind, I was confusing this with some other nova thing. Carry on. :)
16:18:44 <rkukura> when the user passes in a port-id, then nova updates to set binding:host_id, otherwise it can just create it with binding:host_id
16:19:15 <rkukura> either way, nova needs the binding result (vif_type, capabilities/vif_security, etc.)
16:19:41 <rkukura> So from the API, it's a single operation, but internally it's two transactions
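The two-transaction flow rkukura describes above can be sketched roughly as follows. This is a hypothetical illustration only; `Ml2PluginSketch`, its method names, and the in-memory dict standing in for the DB are all made up for this sketch and are not the actual Neutron ML2 code:

```python
# Sketch of the two-transaction port binding flow: transaction 1 persists
# the port and runs the MDs' pre/postcommit hooks, bind_port() runs with
# no transaction held, and transaction 2 stores the binding result.
# All names here are illustrative, not the real ML2 plugin API.

class Ml2PluginSketch:
    def __init__(self):
        self.db = {}    # stands in for the neutron DB
        self.log = []   # records the order of operations for clarity

    def create_port(self, port_id, host_id=None):
        # Transaction 1: store the port without attempting to bind it,
        # invoking mechanism driver precommit/postcommit as usual.
        self.log.append("txn1:create_port_precommit")
        self.db[port_id] = {"binding:host_id": host_id, "bound": False}
        self.log.append("txn1:create_port_postcommit")

        if host_id is not None:
            # Outside any transaction: bind_port() is free to make
            # remote calls to controllers/devices without holding locks.
            vif_type = self._bind_port(port_id, host_id)
            # Transaction 2: record the binding result; the MDs would
            # see this as a port update with its own pre/postcommits.
            self.log.append("txn2:update_port_precommit")
            self.db[port_id].update({"bound": True, "vif_type": vif_type})
            self.log.append("txn2:update_port_postcommit")
        # From the API's perspective this was one create_port() call,
        # and the returned dict already carries the binding result.
        return self.db[port_id]

    def _bind_port(self, port_id, host_id):
        self.log.append("bind_port (no transaction held)")
        return "ovs"  # placeholder binding result
```

This also matches rkukura's point that nova can still just create the port and find the binding result in the returned port dict.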
16:20:11 <rkukura> Which then brings up the question of concurrency, since all kinds of things can happen between those two transactions
16:21:03 <mestery> rkukura: :)
16:21:06 <matrohon> maybe out of scope, but don't you think this is a scheduler issue? The scheduler should ask for a host capable of binding the port first
16:21:11 <rkukura> I was thinking a state machine would be needed, where the thread doing the binding sets a state showing that a binding is in progress in the 1st transaction, and then changes to complete/failed after
16:21:35 <rkukura> matrohon: Agreed, but we still need to eventually do the binding
16:22:07 <rkukura> matrohon: We could have the scheduler set binding:host_id and nova just do a port_get, but that's later
16:22:22 <matrohon> rkukura : ok
16:22:56 <rkukura> Anyway, the state machine gets complex when you consider the thread/process doing the binding dying, or other threads needing the result of the binding
16:23:58 <rkukura> So I was chatting with mestery and describing this, and we started thinking maybe we can just allow different threads/processes to attempt to bind concurrently, and not need a state machine
16:24:12 <rkukura> This would work as follows:
16:25:02 <rkukura> Any thread executing port_create, port_update, or port_get may see that it is possible to bind (binding:host_id is set) but the port is not yet bound.
16:25:41 <rkukura> It will then attempt to bind the port (after committing the 1st transaction in the case of port_create or port_update)
16:26:06 <rkukura> When its binding attempt is complete, it will start a new transaction
16:26:41 <rkukura> In that new transaction, it will look to see if the port has already been bound concurrently by some other thread/process
16:27:28 <rkukura> if so, it will use the stored binding to return the right port dict info from the operation, and if not, it will store its own binding result and return it
16:27:42 <rkukura> does this make any sense, and seem workable?
16:28:07 <matrohon> looks great to me :)
16:28:11 <mestery> +1
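The concurrent-binding reconciliation described above (attempt the bind with no transaction held, then start a new transaction and keep whichever binding was committed first) can be sketched like this. `ConcurrentBinder` and `ensure_bound` are hypothetical names, and a simple lock stands in for the DB transaction:

```python
# Sketch of "bind concurrently, reconcile in a second transaction":
# any thread may attempt a binding, but the first result committed
# wins, and later threads return the stored binding instead of theirs.
# Names are illustrative, not the actual ML2 implementation.

import threading

class ConcurrentBinder:
    def __init__(self):
        self.lock = threading.Lock()  # stands in for a DB transaction
        self.binding = None           # committed binding, if any

    def ensure_bound(self, port_id, attempt_result):
        # The caller has already attempted its own binding outside any
        # transaction; attempt_result is that binding's outcome.
        my_binding = attempt_result

        # New transaction: check whether some other thread/process
        # bound the port concurrently while we were working.
        with self.lock:
            if self.binding is not None:
                # Already bound: use the stored binding for the
                # returned port dict, discarding our own attempt.
                return self.binding
            # Not bound yet: commit our own result and return it.
            self.binding = my_binding
            return my_binding
```

Either way the caller gets a consistent binding back, which is what lets this scheme avoid the in-progress/complete/failed state machine.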
16:28:58 <rcurran> looks good but waiting for the unbind/delete logic :-)
16:29:54 <rkukura> rcurran: good point - haven't worked out details on that yet
16:31:43 <matrohon> the port delete should ask the MD in charge of the binding first, to unbind the port
16:31:46 <rkukura> So if the basic idea/approach discussed here seems workable, I'll flesh out the remaining details, including unbind/delete and send an email to openstack-dev for discussion
16:32:07 <mestery> rkukura: Go for it! Thanks for all your work on this!
16:32:16 <mestery> #action rkukura to flesh out details of bind/unbind and send email to openstack-dev ML
16:32:19 <matrohon> rkukura : thanks
16:32:36 <rcurran> yes, thanks
16:32:38 <rkukura> matrohon: I think you are right - want to make sure the bound MD knows it has been unbound, but also that all MDs know the port itself has been unbound
16:33:19 <rkukura> And in both cases, should know the details of that previous binding
16:33:22 <matrohon> other MDs will be aware of that via port_update pre/postcommit
16:33:36 <rkukura> matrohon: right
16:33:39 <rcurran> and the (now) unbound information is available to md's (bob beat me to it)
16:34:50 <rkukura> one other port binding change comes from nati_uen_'s vif_security patch
16:35:36 <rkukura> that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:36:10 <rkukura> I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for sr-iov
16:36:33 <mestery> Makes sense to me rkukura.
16:37:06 <rkukura> nati_ueno: did you see last couple lines?
16:37:12 <matrohon> makes sense to me too
16:37:17 <nati_ueno> rkukura: yes
16:38:31 <nati_ueno> rkukura: Ah, maybe I'm missing a few lines.. it looks like I was disconnected
16:38:44 <rkukura> I'm willing to take this subset of nati_uen's patch, generalize it a bit, make sure it works for vif_type, and propose it for review separately from the capabilities->vif_security change
16:38:58 <rkukura> nati_ueno: one other port binding change comes from nati_uen_'s vif_security patch
16:39:08 <rkukura> nati_ueno: that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:39:17 <nati_ueno> rkukura: OK I saw the line. Thanks
16:39:26 <rkukura> nati_ueno: I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for sr-iov
16:40:11 <matrohon> vif_security and vif_type are returned to the agent through get_device_details?
16:40:32 <nati_ueno> matrohon: vif_security won't go to the agent
16:40:44 <nati_ueno> it is needed by nova
16:40:45 <rkukura> matrohon: These are REST API extensions to port, not RPC
16:41:26 <rkukura> nati_ueno's patch updates the portbinding extension, replacing capabilities with vif_security
16:41:52 <nati_ueno> rkukura: so you wanna have the generalized version before my patch?
16:42:05 <rkukura> nati_ueno: If that helps get things moving, sure
16:42:29 <nati_ueno> rkukura: if it is faster, ok please
16:42:39 <rkukura> I think that patch would be a small change completely localized to ml2
16:43:27 <rkukura> A separate issue with vif_security is how ml2 gets the right info for the firewall_driver in the L2 agent on the node where the port gets bound
16:44:11 <nati_ueno> yes, but we need some workaround fix for this, since security groups are broken
16:44:17 <rkukura> I've suggested that the L2 agent could get the vif_security info from its firewall_driver, and include this in its agents_db info
16:44:39 <rkukura> then the bound MD would return this as the vif_security for the port
16:45:47 <rkukura> existing agents_db RPC would send it from agent to server and store it in the agents_db table
16:46:04 <mestery> makes sense to me rkukura.
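The agents_db-based flow rkukura suggests above could look roughly like this. The function names, the dict standing in for the agents_db table, and the conservative default are all assumptions made for this sketch, not the real agent RPC code:

```python
# Sketch of the suggested flow: the L2 agent gets the vif_security info
# from its firewall_driver and includes it in its agents_db state
# report; the bound MD then looks up the binding host's agent entry
# and returns that value for the port. All names are illustrative.

agents_db = {}  # host -> agent report; stands in for the agents_db table

def agent_report_state(host, firewall_driver_vif_security):
    # The existing agents_db RPC carries this from agent to server,
    # where it is stored in the agents_db table.
    agents_db[host] = {
        "configurations": {"vif_security": firewall_driver_vif_security},
    }

def bound_md_get_vif_security(host):
    # The bound MD returns the agent-reported value as the port's
    # vif_security, instead of a value stored at bind time.
    agent = agents_db.get(host)
    if agent is None:
        # Assumed conservative default when no agent has reported.
        return {"port_filter": False}
    return agent["configurations"]["vif_security"]
```

This mirrors how the agent-based MDs already consult agents_db (e.g. bridge_mappings) to decide which segments they can bind.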
16:46:14 <rkukura> Maybe this should be on agenda for next week's ml2 meeting if not wrapped up by then?
16:47:00 <mestery> rkukura: I agree.
16:47:24 <nati_ueno> rkukura: so you wanna do this from first step?
16:47:28 <rkukura> nati_ueno: Sound reasonable?
16:48:20 <nati_ueno> rkukura: hmm, I need time to think about the architecture.. depending on the agent's configuration sounds like the wrong direction
16:48:43 <rkukura> nati_ueno: I should be able to post an initial patch today for calling extend_port_dict() on the bound MD instead of storing vif_type and capabilities in the ml2_port_binding table
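The extend_port_dict() pattern rkukura mentions could be illustrated as below. `BoundDriver`, `make_port_dict`, and the attribute values are hypothetical stand-ins, not the actual patch:

```python
# Sketch of the pattern: instead of reading vif_type/capabilities out
# of the ml2_port_binding table, the plugin calls into the bound
# mechanism driver to fill in the binding attributes on the port dict.
# Names and values here are illustrative only.

class BoundDriver:
    def extend_port_dict(self, port_dict):
        # The bound MD supplies binding attributes on demand, so the
        # same pattern can be extended later for things like the
        # extra attributes sr-iov needs.
        port_dict["binding:vif_type"] = "ovs"
        port_dict["binding:vif_security"] = {"port_filter": True}

def make_port_dict(bound_driver):
    # The plugin builds the base dict, then lets the bound MD extend it.
    port = {"id": "port-1"}
    bound_driver.extend_port_dict(port)
    return port
```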
16:49:23 <nati_ueno> rkukura: OK, how about removing ml2 from https://review.openstack.org/#/c/21946/
16:49:24 <rkukura> nati_ueno: Is that concern around getting the vif_security from the firewall driver in the agent?
16:49:33 <nati_ueno> rkukura: yes
16:49:50 <nati_ueno> rkukura: then you will have a patch on ml2 localized
16:50:17 <rkukura> nati_ueno: As I recall, your patch has the firewall_driver supply the vif_security value, right?
16:50:39 <nati_ueno> rkukura: yes
16:51:00 <rkukura> If that's the case, then I think the issue is whether this call on the firewall_driver gets made in the L2 agent or in the server
16:51:35 <rkukura> Right now, in Havana, ml2 users need to supply a dummy value for firewall_driver so that the SG API extension is enabled
16:52:06 <nati_ueno> so my opinion is the current firewall driver model is broken
16:52:18 <nati_ueno> some functionalities are mixed
16:52:27 <nati_ueno> enable sec group or not
16:52:37 <nati_ueno> select driver for each agent implementation
16:52:42 <rkukura> Agreed enabling the API should be a separate config in the server
16:53:03 <rkukura> But with ml2, different L2 agents might use different firewall drivers
16:53:05 <nati_ueno> select driver for an agent that supports more than one implementation
16:53:31 <rkukura> Or a MD for a controller might not use any L2 agent, and instead do SG's some other way
16:53:33 <nati_ueno> Most plugins (or MDs) support only one driver
16:53:41 <nati_ueno> so it should be defined automatically
16:54:09 <rkukura> ml2 works concurrently with openvswitch-agent, linuxbridge-agent,  hyperv-agent, and soon with controllers like ODL
16:54:19 <nati_ueno> currently there is no plugin (or MD) that supports multiple drivers
16:54:29 <rkukura> each of these may have different ways to enforce SGs
16:54:39 <matrohon> so MD should be in charge of defining the firewall_driver?
16:54:41 <nati_ueno> rkukura: I agree. so ML2 can support multiple MD
16:54:55 <nati_ueno> and driver will be decided by MD
16:55:01 <rkukura> So your approach of having the bound MD return vif_security is correct
16:55:12 <nati_ueno> because there is no MD which supports more than one firewall driver
16:55:21 <nati_ueno> except Noop
16:55:32 <rkukura> but we need to resolve how the different MDs in the server get the correct vif_security to return
16:55:41 <nati_ueno> so we should be able to decide the vif_security value based on vif_type
16:56:02 <rkukura> when these MDs are supporting an L2 agent, the agents_db seems like a good solution to me
16:56:42 <nati_ueno> rkukura: OK so we should remove NoopDriver.
16:56:52 <nati_ueno> so let's say we could remove Noop
16:57:02 <nati_ueno> we should be able to decide the vif_security value based on vif_type
16:57:16 <rkukura> right now, the bound MD supplies the vif_type and the capabilities - just want it to be able to supply the vif_security from the firewall_driver in the agent
16:57:45 <rkukura> the agent-based MDs already use agents_db to see what network segment they can bind to, based on bridge_mappings info
16:57:48 <nati_ueno> rkukura: yes, but it is not needed because we can map vif_type to vif_security value
16:57:52 <mestery> Just a note: We have 3 minutes left here folks.
16:57:58 <mestery> Best to continue this on ML discussion perhaps?
16:58:03 <rkukura> sure
16:58:07 <nati_ueno> gotcha
16:58:13 <mestery> I just know there is another meeting right after this one. :)
16:58:22 <rkukura> I'm glad we got a chance to get this conversation with nati_ueno going
16:58:37 <nati_ueno> rkukura: I'll start thread in the openstack-dev
16:58:41 * mestery nods in agreement
16:58:46 <mestery> nati_ueno: Thanks!
16:58:53 <nati_ueno> mestery: sure!
16:59:04 <mestery> OK, I think that's it for this week folks!
16:59:09 <mestery> Let's continue with these threads on the ML.
16:59:12 <matrohon> very interesting discussion
16:59:18 <mestery> Thanks for joining the discussions this week everyone!
16:59:22 <mestery> #endmeeting