16:00:37 <mestery> #startmeeting networking_ml2
16:00:39 <openstack> Meeting started Wed Jan 15 16:00:37 2014 UTC and is due to finish in 60 minutes. The chair is mestery. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:43 <openstack> The meeting name has been set to 'networking_ml2'
16:00:52 <mestery> #link https://wiki.openstack.org/wiki/Meetings/ML2 Agenda
16:01:15 <mestery> We've got an agenda which will either go at least an hour, or we'll be done in 20 minutes. Let's see what happens. :)
16:01:23 <mestery> #topic Action Items
16:01:33 <mestery> The first action item to cover is for rcurran.
16:01:43 <mestery> rcurran: Did you verify if a bug is needed for Cisco UT coverage in ML2?
16:01:58 <rcurran> yes, do you want the link
16:02:02 <rcurran> bug#
16:02:09 <mestery> rcurran: Yes please, I'll add it in the meeting minutes.
16:02:18 <rcurran> https://bugs.launchpad.net/neutron/+bug/1267481
16:02:26 <mestery> rcurran: thanks!
16:02:41 <mestery> #link https://bugs.launchpad.net/neutron/+bug/1267481 Bug for Cisco ML2 UT coverage
16:03:06 <mestery> asadoughi: Thank you for filing bugs for the ML2 UT coverage gaps!
16:03:21 <mestery> #info ML2 UT coverage bugs filed and tagged: https://bugs.launchpad.net/neutron/+bugs?field.tag=ml2-unit-test-coverage
16:03:26 <asadoughi> mestery: np
16:03:58 <mestery> Now that these bugs are filed, if you are planning to work on one, please assign it to yourself. Anyone can grab one of these.
16:04:13 <mestery> Would be good to get this coverage added during Icehouse if we can spread the load.
16:04:32 <mestery> So, jumping around a little (we'll come back to you rkukura):
16:04:42 <mestery> I pinged asomya this morning around his RPC google document
16:04:46 <mestery> #link https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit RPC Google Document
16:04:56 <mestery> I have not heard back from asomya yet, but once I do, we'll send email to the mailing list.
16:05:12 <mestery> #topic Port Binding
16:05:21 <mestery> So now, the meat of the meeting: port binding :)
16:05:28 <mestery> #undo
16:05:29 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x39171d0>
16:05:32 <mestery> Wait, asomya joined!
16:05:35 <mestery> asomya, welcome!
16:05:41 <asomya> Hello
16:05:46 <mestery> I just mentioned your RPC document (https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit)
16:06:01 <matrohon> hi
16:06:03 <mestery> Last week, there was a question on whether you were still targeting this for Icehouse or not.
16:06:10 <mestery> I wasn't sure, thus needed your input :)
16:06:31 <asomya> That's not being worked on actively at the moment. It needs the other patch that I proposed at the summit to make type drivers more self-sufficient.
16:06:49 <asomya> I was waiting for Zhang's patch to go in before I posted mine, but now I see that one is abandoned.
16:06:53 <mestery> asomya: Thanks. Could we expand Zhang's type driver patch to make that work with this RPC work?
16:06:57 <mestery> asomya: :)
16:07:13 <asomya> Yes, I'll post a patch for that work and then work on this.
16:07:15 <rkukura> Phasing smaller patches is better.
16:07:37 <rkukura> Let's get the type driver refactor active again, whether the original patch or a new one.
16:07:39 <mestery> Agreed. So asomya, do you want to reach out to Zhang and coordinate things there?
16:07:45 <mestery> rkukura: Agreed.
16:07:50 <asomya> rkukura: agreed
16:07:53 <mestery> asomya: You ok reaching out for this, or do you want me to take an action?
16:08:10 <asomya> I'll reach out to Zhang if I need his counsel on the patch.
16:08:16 <mestery> asomya: Thanks!
16:08:26 <mestery> #action asomya to work on new typedriver refactoring patch with zhang
16:08:39 <mestery> OK, anything else on RPC or TypeDriver here before we move on to port binding discussions?
16:09:09 <matrohon> asomya: let's keep in touch, we may need your functionality
16:09:22 <asomya> matrohon: sure
16:09:38 <mestery> #topic Port Binding
16:09:49 <mestery> rkukura: You're up!
16:10:19 <rkukura> OK, I have not yet emailed a concrete proposal on port binding improvements,
16:10:36 <rkukura> but can discuss the ideas/conclusions here.
16:11:26 <rkukura> First, because MDs may need to make remote calls to controllers/devices, we need ml2 to call bind_port() on them outside of any transaction rather than inside, as is currently done.
16:12:36 <rkukura> So I was looking into whether we could do the port binding before starting the transaction. But if binding:host_id is supplied in port_create, this would mean trying to bind before we've stored anything about the port in the DB.
16:13:11 <rkukura> It seems some MDs might need to maintain their own table mapping neutron's port ID to a controller's port ID.
16:14:03 <rkukura> But the MDs could not reference the neutron port as a foreign key with cascading delete and all that if the transaction creating it hasn't even started.
16:14:26 <rkukura> So that leads to a two-transaction approach.
16:15:17 <rkukura> The port_create or port_update that sets binding:host_id would first be processed normally without trying to bind the port, and the MDs' precommit and postcommit calls would be made.
16:15:36 <rkukura> Then port binding would occur outside any transaction.
16:15:57 <mestery> rkukura: Per our discussion last night, this all makes sense to me. I'm curious what others think.
16:16:08 <rkukura> In bind_port, MDs could do their own transactions if needed and/or could talk to controllers/devices.
16:16:38 <rkukura> Then a 2nd transaction would update the ml2_port_binding table with the result.
16:16:50 <matrohon> makes sense to me too
16:16:52 <rkukura> And I think the MDs would see this as a port_update
16:17:09 <rkukura> with its own precommit and postcommit calls.
16:17:22 <mestery> Yes. rkukura, I think this also makes the interactions with nova pretty concise as well, since nova would first create the port, then update the port.
16:17:54 <rkukura> mestery: Actually, nova could still just create the port, and expect the returned port dict to have the binding result.
16:18:04 <matrohon> mestery: why should nova update the port?
16:18:32 <mestery> rkukura matrohon: Nevermind, I was confusing this with some other nova thing. Carry on. :)
16:18:44 <rkukura> When the user passes in a port-id, then nova updates it to set binding:host_id; otherwise it can just create it with binding:host_id.
16:19:15 <rkukura> Either way, nova needs the binding result (vif_type, capabilities/vif_security, etc.)
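[The two-transaction flow rkukura describes above can be sketched roughly as follows. This is a hypothetical illustration, not the actual Neutron/ML2 code: the FakeDB class, create_port() helper, and FakeOVSDriver are all invented names, and the in-memory dicts stand in for real SQLAlchemy transactions.]

```python
# Sketch of the two-transaction port binding flow under discussion.
# All names here are illustrative assumptions, not Neutron's real API.

class FakeDB:
    """Stands in for the Neutron DB; real code uses SQLAlchemy sessions."""
    def __init__(self):
        self.ports = {}     # port_id -> port dict
        self.bindings = {}  # port_id -> committed binding result

def create_port(db, port_id, host_id, mech_driver):
    # Transaction 1: store the port unbound. The MDs' precommit calls
    # would run inside this transaction, postcommit calls after it.
    db.ports[port_id] = {"id": port_id, "binding:host_id": host_id,
                         "binding:vif_type": "unbound"}

    # Outside any transaction: the MD may make remote calls to a
    # controller/device, or open its own transactions.
    vif_type = mech_driver.bind_port(port_id, host_id)

    # Transaction 2: persist the binding result in ml2_port_binding;
    # MDs would see this as a port_update with its own pre/postcommit.
    db.bindings[port_id] = {"vif_type": vif_type}
    port = dict(db.ports[port_id])
    port["binding:vif_type"] = vif_type
    db.ports[port_id] = port
    return port  # nova reads the binding result from the returned dict

class FakeOVSDriver:
    def bind_port(self, port_id, host_id):
        # A real MD would check segments, agent liveness, etc.
        return "ovs"

db = FakeDB()
port = create_port(db, "port-1", "compute-1", FakeOVSDriver())
print(port["binding:vif_type"])  # -> ovs
```

From the API caller's point of view this is still a single port_create; the split into two transactions is purely internal.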
16:19:41 <rkukura> So from the API it's a single operation, but internally it's two transactions.
16:20:11 <rkukura> Which then brings up the question of concurrency, since all kinds of things can happen between those two transactions.
16:21:03 <mestery> rkukura: :)
16:21:06 <matrohon> maybe out of scope, but don't you think this is a scheduler issue? the scheduler should ask for a host capable of binding the port first
16:21:11 <rkukura> I was thinking a state machine would be needed, where the thread doing the binding sets a state showing that a binding is in progress in the 1st transaction, and then changes it to complete/failed after.
16:21:35 <rkukura> matrohon: Agreed, but we still need to eventually do the binding.
16:22:07 <rkukura> matrohon: We could have the scheduler set binding:host_id and nova just do port_get, but that's later.
16:22:22 <matrohon> rkukura: ok
16:22:56 <rkukura> Anyway, the state machine gets complex when you consider the thread/process doing the binding dying, or other threads needing the result of the binding.
16:23:58 <rkukura> So I was chatting with mestery and describing this, and we started thinking maybe we can just allow different threads/processes to attempt to bind concurrently, and not need a state machine.
16:24:12 <rkukura> This would work as follows:
16:25:02 <rkukura> Any thread executing port_create, port_update, or port_get may see that it is possible to bind (binding:host_id is set) but the port is not yet bound.
16:25:41 <rkukura> It will then attempt to bind the port (after committing the 1st transaction in the case of port_create or port_update).
16:26:06 <rkukura> When its binding attempt is complete, it will start a new transaction.
16:26:41 <rkukura> In that new transaction, it will look to see if the port has already been bound concurrently by some other thread/process.
16:27:28 <rkukura> If so, it will use the stored binding to return the right port dict info from the operation, and if not it will store its own binding result and return it.
16:27:42 <rkukura> does this make any sense, and seem workable?
16:28:07 <matrohon> looks great to me :)
16:28:11 <mestery> +1
16:28:58 <rcurran> looks good, but waiting for the unbind/delete logic :-)
16:29:54 <rkukura> rcurran: good point - haven't worked out details on that yet
16:31:43 <matrohon> the port delete should ask the MD in charge of the binding first, to unbind the port
16:31:46 <rkukura> So if the basic idea/approach discussed here seems workable, I'll flesh out the remaining details, including unbind/delete, and send an email to openstack-dev for discussion.
16:32:07 <mestery> rkukura: Go for it! Thanks for all your work on this!
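[The "no state machine" concurrency idea rkukura lays out above, where multiple threads may attempt to bind and the second transaction keeps whichever result committed first, can be sketched like this. The BindingTable class and commit_if_unbound() are hypothetical; a lock stands in for the database transaction's isolation.]

```python
# Sketch: optimistic concurrent binding with first-writer-wins, instead
# of an in-progress/complete/failed state machine. Illustrative only.
import threading

class BindingTable:
    def __init__(self):
        self._lock = threading.Lock()  # stands in for DB transaction isolation
        self._bound = {}               # port_id -> committed binding

    def commit_if_unbound(self, port_id, my_result):
        """Second transaction: if another thread/process already bound the
        port, return that stored binding; otherwise store and return ours."""
        with self._lock:
            existing = self._bound.get(port_id)
            if existing is not None:
                return existing  # lost the race: use the concurrent binding
            self._bound[port_id] = my_result
            return my_result

table = BindingTable()
results = []

def worker(name):
    # Each thread attempts to bind (possibly slow, outside any transaction),
    # then resolves the race in a new "transaction".
    attempt = {"vif_type": "ovs", "bound_by": name}
    results.append(table.commit_if_unbound("port-1", attempt))

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every caller returns the same committed binding, whoever produced it.
winners = {r["bound_by"] for r in results}
print(len(winners))  # -> 1
```

The appeal over the state-machine approach is that a dying binder leaves nothing stuck "in progress": some other thread simply binds the port itself.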
16:32:16 <mestery> #action rkukura to flesh out details of bind/unbind and send email to openstack-dev ML
16:32:19 <matrohon> rkukura: thanks
16:32:36 <rcurran> yes, thanks
16:32:38 <rkukura> matrohon: I think you are right - want to make sure the bound MD knows it has been unbound, but also that all MDs know the port itself has been unbound
16:33:19 <rkukura> And in both cases, they should know the details of that previous binding.
16:33:22 <matrohon> other MDs will be aware of that with port_update_pre/post commit
16:33:36 <rkukura> matrohon: right
16:33:39 <rcurran> and the (now) unbound information is available to MDs (bob beat me to it)
16:34:50 <rkukura> one other port binding change comes from nati_uen_'s vif_security patch
16:35:36 <rkukura> that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:36:10 <rkukura> I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for sr-iov
16:36:33 <mestery> Makes sense to me rkukura.
16:37:06 <rkukura> nati_ueno: did you see the last couple of lines?
16:37:12 <matrohon> makes sense to me too
16:37:17 <nati_ueno> rkukura: yes
16:38:31 <nati_ueno> rkukura: Ah, maybe I'm missing a few lines.. It looks like I've disconnected.
16:38:44 <rkukura> I'm willing to take this subset of nati_ueno's patch, generalize it a bit, make sure it works for vif_type, and propose it for review separately from the capabilities->vif_security change.
16:38:58 <rkukura> nati_ueno: one other port binding change comes from nati_uen_'s vif_security patch
16:39:08 <rkukura> nati_ueno: that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:39:17 <nati_ueno> rkukura: OK, I saw the line.
Thanks.
16:39:26 <rkukura> nati_ueno: I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for sr-iov.
16:40:11 <matrohon> vif_security and vif_type are returned back to the agent through get_device_details?
16:40:32 <nati_ueno> matrohon: vif_security won't go to the agent
16:40:44 <nati_ueno> it is needed by nova
16:40:45 <rkukura> matrohon: These are REST API extensions to port, not RPC.
16:41:26 <rkukura> nati_ueno's patch updates the portbinding extension, replacing capabilities with vif_security.
16:41:52 <nati_ueno> rkukura: so you want to have the generalized version before my patch?
16:42:05 <rkukura> nati_ueno: If that helps get things moving, sure.
16:42:29 <nati_ueno> rkukura: if it is faster, ok please
16:42:39 <rkukura> I think that patch would be a small change completely localized to ml2.
16:43:27 <rkukura> A separate issue with vif_security is how ml2 gets the right info for the firewall_driver in the L2 agent on the node where the port gets bound.
16:44:11 <nati_ueno> yes. but we need some workaround fix for this since security group is broken
16:44:17 <rkukura> I've suggested that the L2 agent could get the vif_security info from its firewall_driver, and include this in its agents_db info.
16:44:39 <rkukura> then the bound MD would return this as the vif_security for the port
16:45:47 <rkukura> existing agents_db RPC would send it from agent to server and store it in the agents_db table
16:46:04 <mestery> makes sense to me rkukura.
16:46:14 <rkukura> Maybe this should be on the agenda for next week's ml2 meeting if not wrapped up by then?
16:47:00 <mestery> rkukura: I agree.
16:47:24 <nati_ueno> rkukura: so you want to do this from the first step?
16:47:28 <rkukura> nati_ueno: Sound reasonable?
16:48:20 <nati_ueno> rkukura: hmm, I need time to think about the architecture..
depending on the agent's configuration sounds like the wrong direction
16:48:43 <rkukura> nati_ueno: I should be able to post an initial patch today for calling extend_port_dict() on the bound MD instead of storing vif_type and capabilities in the ml2_port_binding table.
16:49:23 <nati_ueno> rkukura: Ok, how about removing ml2 from https://review.openstack.org/#/c/21946/
16:49:24 <rkukura> nati_ueno: Is that concern around getting the vif_security from the firewall driver in the agent?
16:49:33 <nati_ueno> rkukura: yes
16:49:50 <nati_ueno> rkukura: then you will have a patch localized to ml2
16:50:17 <rkukura> nati_ueno: As I recall, your patch has the firewall_driver supply the vif_security value, right?
16:50:39 <nati_ueno> rkukura: yes
16:51:00 <rkukura> If that's the case, then I think the issue is whether this call on the firewall_driver gets made in the L2 agent or in the server.
16:51:35 <rkukura> Right now, ml2 users need to supply a dummy value for firewall_driver so that the SG API extension is enabled
16:51:46 <rkukura> in havana
16:52:06 <nati_ueno> so my opinion is the current firewall driver model is broken
16:52:18 <nati_ueno> some functionalities are mixed:
16:52:27 <nati_ueno> enable sec group or not
16:52:37 <nati_ueno> select driver for each agent implementation
16:52:42 <rkukura> Agreed, enabling the API should be a separate config in the server.
16:53:03 <rkukura> But with ml2, different L2 agents might use different firewall drivers.
16:53:05 <nati_ueno> select driver for agents which support more than one implementation
16:53:31 <rkukura> Or an MD for a controller might not use any L2 agent, and instead do SGs some other way.
16:53:33 <nati_ueno> Most plugins (or MDs) support only one driver
16:53:41 <nati_ueno> so it should be defined automatically
16:54:09 <rkukura> ml2 works concurrently with openvswitch-agent, linuxbridge-agent, hyperv-agent, and soon with controllers like ODL.
16:54:19 <nati_ueno> currently there is no plugin (or MD) supporting
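[The pattern rkukura describes above, calling into the bound MD at read time instead of storing vif_type and capabilities in the ml2_port_binding table, could look roughly like this. The class names and the extend_port_dict() signature are assumptions for illustration, not the exact Neutron API, and the attribute values are made up.]

```python
# Sketch: the bound MD extends the port dict dynamically, so binding
# attributes (including future ones, e.g. for SR-IOV) need no new DB
# columns. Hypothetical names throughout.

class MechanismDriver:
    """Base class for the illustration; real ML2 MDs have a richer API."""
    def extend_port_dict(self, port_dict):
        raise NotImplementedError

class OVSMechanismDriver(MechanismDriver):
    vif_type = "ovs"
    vif_security = {"enable_security_group": True}

    def extend_port_dict(self, port_dict):
        # The bound MD supplies the binding attributes on demand,
        # replacing the stored vif_type/capabilities columns.
        port_dict["binding:vif_type"] = self.vif_type
        port_dict["binding:vif_security"] = self.vif_security

def get_port(bound_driver, stored_port):
    # Server-side port_get: merge stored DB fields with the bound
    # MD's dynamically supplied binding attributes.
    port = dict(stored_port)
    bound_driver.extend_port_dict(port)
    return port

port = get_port(OVSMechanismDriver(), {"id": "port-1"})
print(port["binding:vif_type"])  # -> ovs
```

An SR-IOV-capable driver could then add its own attributes in its extend_port_dict() without any schema change, which is the extensibility rkukura is after.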
multiple drivers
16:54:29 <rkukura> each of these may have different ways to enforce SGs
16:54:39 <matrohon> so the MD should be in charge of defining the firewall_driver?
16:54:41 <nati_ueno> rkukura: I agree. so ML2 can support multiple MDs
16:54:55 <nati_ueno> and the driver will be decided by the MD
16:55:01 <rkukura> So your approach of having the bound MD return vif_security is correct,
16:55:12 <nati_ueno> because there is no MD which supports more than one firewall driver
16:55:21 <nati_ueno> except Noop
16:55:32 <rkukura> but we need to resolve how the different MDs in the server get the correct vif_security to return.
16:55:41 <nati_ueno> so we should be able to decide the vif_security value based on vif_type
16:56:02 <rkukura> when these MDs are supporting an L2 agent, the agents_db seems like a good solution to me
16:56:42 <nati_ueno> rkukura: OK, so we should remove NoopDriver.
16:56:52 <nati_ueno> so let's say we could remove Noop
16:57:02 <nati_ueno> we should be able to decide the vif_security value based on vif_type
16:57:16 <rkukura> right now, the bound MD supplies the vif_type and the capabilities - I just want it to be able to supply the vif_security from the firewall_driver in the agent
16:57:45 <rkukura> the agent-based MDs already use agents_db to see what network segment they can bind to, based on bridge_mappings info
16:57:48 <nati_ueno> rkukura: yes, but it is not needed because we can map vif_type to vif_security values
16:57:52 <mestery> Just a note: We have 3 minutes left here, folks.
16:57:58 <mestery> Best to continue this on an ML discussion, perhaps?
16:58:03 <rkukura> sure
16:58:07 <nati_ueno> gotcha
16:58:13 <mestery> I just know there is another meeting right after this one. :)
16:58:22 <rkukura> I'm glad we got a chance to get this conversation with nati_ueno going.
16:58:37 <nati_ueno> rkukura: I'll start a thread on openstack-dev
16:58:41 * mestery nods in agreement
16:58:46 <mestery> nati_ueno: Thanks!
16:58:53 <nati_ueno> mestery: sure!
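[nati_ueno's alternative above, deriving vif_security directly from vif_type since each MD in practice backs one firewall driver, amounts to a static lookup table in the server rather than shipping the value through agents_db. The table contents and function name below are invented for illustration; they are not values from the actual patch.]

```python
# Sketch of nati_ueno's suggestion: a server-side vif_type -> vif_security
# map, removing the need for the agent to report its firewall driver.
# The entries here are illustrative assumptions.

VIF_SECURITY_MAP = {
    "ovs":     {"enable_security_group": True,  "firewall": "iptables_hybrid"},
    "bridge":  {"enable_security_group": True,  "firewall": "iptables"},
    "unbound": {"enable_security_group": False, "firewall": None},
}

def vif_security_for(vif_type):
    # Fall back to "no enforcement" for unknown or unbound vif_types.
    return VIF_SECURITY_MAP.get(vif_type, VIF_SECURITY_MAP["unbound"])

sec = vif_security_for("ovs")
print(sec["firewall"])  # -> iptables_hybrid
```

The trade-off the discussion leaves open: rkukura's agents_db approach lets two agents with the same vif_type use different firewall drivers, while this map assumes the mapping is one-to-one.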
16:59:04 <mestery> OK, I think that's it for this week, folks!
16:59:09 <mestery> Let's continue with these threads on the ML.
16:59:12 <matrohon> very interesting discussion
16:59:18 <mestery> Thanks for joining the discussions this week everyone!
16:59:22 <mestery> #endmeeting