16:00:37 #startmeeting networking_ml2
16:00:39 Meeting started Wed Jan 15 16:00:37 2014 UTC and is due to finish in 60 minutes. The chair is mestery. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:43 The meeting name has been set to 'networking_ml2'
16:00:52 #link https://wiki.openstack.org/wiki/Meetings/ML2 Agenda
16:01:15 We've got an agenda which will either go at least an hour, or we'll be done in 20 minutes. Let's see what happens. :)
16:01:23 #topic Action Items
16:01:33 The first action item to cover is for rcurran.
16:01:43 rcurran: Did you verify if a bug is needed for Cisco UT coverage in ML2?
16:01:58 yes, do you want the link
16:02:02 bug#
16:02:09 rcurran: Yes please, I'll add it in the meeting minutes.
16:02:18 https://bugs.launchpad.net/neutron/+bug/1267481
16:02:26 rcurran: thanks!
16:02:41 #link https://bugs.launchpad.net/neutron/+bug/1267481 Bug for Cisco ML2 UT coverage
16:03:06 asadoughi: Thank you for filing bugs for the ML2 UT coverage gaps!
16:03:21 #info ML2 UT coverage bugs filed and tagged: https://bugs.launchpad.net/neutron/+bugs?field.tag=ml2-unit-test-coverage
16:03:26 mestery: np
16:03:58 Now that these bugs are filed, if you are planning to work on one, please assign it to yourself. Anyone can grab one of these.
16:04:13 Would be good to get this coverage added during Icehouse if we can spread the load.
16:04:32 So, jumping around a little (we'll come back to you rkukura):
16:04:42 I pinged asomya this morning about his RPC google document
16:04:46 #link https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit RPC Google Document
16:04:56 I have not heard back from asomya yet, but once I do, we'll send email to the mailing list.
16:05:12 #topic Port Binding
16:05:21 So now, the meat of the meeting: port binding :)
16:05:28 #undo
16:05:29 Removing item from minutes:
16:05:32 Wait, asomya joined!
16:05:35 asomya, welcome!
16:05:41 Hello
16:05:46 I just mentioned your RPC document (https://docs.google.com/document/d/1ZHb2zzPmkSOpM6PR8M9sx2SJOJPHblaP5eVXHr5zOFg/edit)
16:06:01 hi
16:06:03 Last week, there was a question about whether you were still targeting this for Icehouse or not.
16:06:10 I wasn't sure, thus needed your input :)
16:06:31 That's not being worked on actively at the moment. It needs the other patch that I proposed at the summit to make type drivers more self-sufficient.
16:06:49 I was waiting for Zhang's patch to go in before I posted mine, but now I see that one is abandoned.
16:06:53 asomya: Thanks. Could we expand Zhang's typedriver patch to make that work with this RPC work?
16:06:57 asomya: :)
16:07:13 Yes, I'll post a patch for that work and then work on this.
16:07:15 Phasing smaller patches is better
16:07:37 Let's get the type driver refactor active again, whether the original patch or a new one
16:07:39 Agreed. So asomya, do you want to reach out to Zhang and coordinate things there?
16:07:45 rkukura: Agreed.
16:07:50 rkukura: agreed
16:07:53 asomya: You ok reaching out for this or do you want me to take an action?
16:08:10 I'll reach out to Zhang if I need his counsel on the patch
16:08:16 asomya: Thanks!
16:08:26 #action asomya to work on new typedriver refactoring patch with zhang
16:08:39 OK, anything else on RPC or TypeDriver here before we move on to port binding discussions?
16:09:09 asomya: let's keep in touch, we may need your functionality
16:09:22 matrohon: sure
16:09:38 #topic Port Binding
16:09:49 rkukura: You're up!
16:10:19 OK, I have not yet emailed a concrete proposal on port binding improvements
16:10:36 but can discuss the ideas/conclusions here
16:11:26 1st, because MDs may need to make remote calls to controllers/devices, we need ml2 to call bind_port() on them outside of any transaction rather than inside, as is currently done
16:12:36 So I was looking into whether we could do the port binding before starting the transaction. But if binding:host_id is supplied in port_create, this would mean trying to bind before we've stored anything about the port in the DB
16:13:11 It seems some MDs might need to maintain their own table mapping neutron's port ID to a controller's port ID
16:14:03 But the MDs could not reference the neutron port as a foreign key with cascading delete and all that if the transaction creating it hasn't even started.
16:14:26 So that leads to a two-transaction approach.
16:15:17 The port_create or port_update that sets the binding:host_id would 1st be processed normally without trying to bind the port, and the MDs' precommit and postcommit calls would be made
16:15:36 Then port binding would occur outside any transaction
16:15:57 rkukura: Per our discussion last night, this all makes sense to me. I'm curious what others think.
16:16:08 in bind_port, MDs could do their own transactions if needed and/or could talk to controllers/devices
16:16:38 Then a 2nd transaction would update the ml2_port_binding table with the result
16:16:50 makes sense to me too
16:16:52 And I think the MDs would see this as a port_update
16:17:09 with its own precommit and postcommit calls
16:17:22 Yes. rkukura, I think this also makes the interactions with nova pretty concise as well, since nova would first create the port, then update the port.
16:17:54 mestery: Actually, nova could still just create the port, and expect the returned port dict to have the binding result
16:18:04 mestery: why should nova update the port?
16:18:32 rkukura matrohon: Never mind, I was confusing this with some other nova thing. Carry on. :)
16:18:44 when the user passes in a port-id, nova updates it to set binding:host_id; otherwise it can just create the port with binding:host_id
16:19:15 either way, nova needs the binding result (vif_type, capabilities/vif_security, etc.)
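A minimal, runnable Python toy of the two-transaction ordering rkukura describes above: the port is stored and the MD precommit runs inside the first transaction, postcommit runs after it, bind_port() is called outside any transaction, and a second transaction persists the binding result. FakeSession, FakeMechanismDriver, and all printed messages are illustrative stand-ins, not actual Neutron ML2 code.

    import contextlib

    class FakeSession(object):
        """Stand-in for a DB session; begin() plays the role of a transaction."""
        @contextlib.contextmanager
        def begin(self):
            print('-- begin transaction')
            yield
            print('-- commit')

    class FakeMechanismDriver(object):
        def create_port_precommit(self, port):
            print('MD precommit for %s (inside transaction 1)' % port['id'])

        def create_port_postcommit(self, port):
            print('MD postcommit for %s (after transaction 1 commits)' % port['id'])

        def bind_port(self, port):
            # No ml2 transaction is held here, so the MD is free to make
            # remote calls to a controller/device or open its own transactions.
            print('MD binding %s on host %s' % (port['id'], port['binding:host_id']))
            port['binding:vif_type'] = 'ovs'

    def create_port(session, md, port):
        # Transaction 1: store the port and run the MD precommit inside it.
        with session.begin():
            print('store port %s in DB' % port['id'])
            md.create_port_precommit(port)
        md.create_port_postcommit(port)

        # Binding is attempted outside any transaction when binding:host_id
        # was supplied with the create.
        if port.get('binding:host_id'):
            md.bind_port(port)
            # Transaction 2: persist the binding result (seen by MDs as a
            # port update with its own precommit/postcommit in the real flow).
            with session.begin():
                print('store binding result vif_type=%s' % port['binding:vif_type'])
        return port

    create_port(FakeSession(), FakeMechanismDriver(),
                {'id': 'port-1', 'binding:host_id': 'compute-1'})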
16:19:41 So from the API it's a single operation, but internally it's two transactions
16:20:11 Which then brings up the question of concurrency, since all kinds of things can happen between those two transactions
16:21:03 rkukura: :)
16:21:06 maybe out of scope, but don't you think this is a scheduler issue? The scheduler should ask for a host capable of binding the port first.
16:21:11 I was thinking a state machine would be needed, where the thread doing the binding sets a state showing that a binding is in progress in the 1st transaction, and then changes it to complete/failed after
16:21:35 matrohon: Agreed, but we still need to eventually do the binding
16:22:07 matrohon: We could have the scheduler set binding:host_id and nova just do port_get, but that's later
16:22:22 rkukura: ok
16:22:56 Anyway, the state machine gets complex when you consider the thread/process doing the binding dying, or other threads needing the result of the binding
16:23:58 So I was chatting with mestery and describing this, and we started thinking maybe we can just allow different threads/processes to attempt to bind concurrently, and not need a state machine
16:24:12 This would work as follows:
16:25:02 Any thread executing port_create, port_update, or port_get may see that it is possible to bind (binding:host_id is set) but the port is not yet bound.
16:25:41 It will then attempt to bind the port (after committing the 1st transaction in the case of port_create or port_update)
16:26:06 When its binding attempt is complete, it will start a new transaction
16:26:41 In that new transaction, it will look to see if the port has already been bound concurrently by some other thread/process
16:27:28 if so, it will use the stored binding to return the right port dict info from the operation, and if not, it will store its own binding result and return it
16:27:42 does this make any sense, and seem workable?
16:28:07 looks great to me :)
16:28:11 +1
16:28:58 looks good but waiting for the unbind/delete logic :-)
16:29:54 rcurran: good point - haven't worked out details on that yet
16:31:43 the port delete should ask the MD in charge of the binding first, to unbind the port
16:31:46 So if the basic idea/approach discussed here seems workable, I'll flesh out the remaining details, including unbind/delete, and send an email to openstack-dev for discussion
16:32:07 rkukura: Go for it! Thanks for all your work on this!
16:32:16 #action rkukura to flesh out details of bind/unbind and send email to openstack-dev ML
16:32:19 rkukura: thanks
16:32:36 yes, thanks
16:32:38 matrohon: I think you are right - want to make sure the bound MD knows it has been unbound, but also that all MDs know the port itself has been unbound
16:33:19 And in both cases, they should know the details of that previous binding
16:33:22 other MDs will be aware of that with port_update pre/postcommit
16:33:36 matrohon: right
16:33:39 and the (now) unbound information is available to MDs (bob beat me to it)
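A runnable toy simulation of the state-machine-free concurrent binding idea rkukura walked through above: several workers race to bind the same port, each does its (slow) binding work outside any "transaction", then opens a short critical section to check whether another worker already committed a result. All class and variable names are illustrative assumptions, not ML2 code.

    import threading
    import time

    class PortBindingStore(object):
        """Stands in for the ml2_port_binding table plus a DB transaction."""
        def __init__(self):
            self._lock = threading.Lock()
            self.binding = None

        def commit_if_unbound(self, result):
            # "Second transaction": keep the first committed binding and
            # return it; later workers see the concurrently-stored result.
            with self._lock:
                if self.binding is None:
                    self.binding = result
                return self.binding

    def try_to_bind(store, worker_name):
        # The binding attempt itself happens outside any transaction, so
        # it may be slow (remote calls to controllers/devices).
        time.sleep(0.1)
        result = {'vif_type': 'ovs', 'bound_by': worker_name}
        final = store.commit_if_unbound(result)
        print('%s returns binding committed by %s' % (worker_name, final['bound_by']))

    store = PortBindingStore()
    workers = [threading.Thread(target=try_to_bind, args=(store, 'worker-%d' % i))
               for i in range(3)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    # Exactly one worker's result is stored; the others simply return the
    # concurrently committed binding instead of needing a state machine.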
16:34:50 one other port binding change comes from nati_uen_'s vif_security patch
16:35:36 that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:36:10 I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for SR-IOV
16:36:33 Makes sense to me, rkukura.
16:37:06 nati_ueno: did you see the last couple of lines?
16:37:12 makes sense to me too
16:37:17 rkukura: yes
16:38:31 rkukura: Ah, maybe I'm missing a few lines.. It looks like I've disconnected
16:38:44 I'm willing to take this subset of nati_uen's patch, generalize it a bit, make sure it works for vif_type, and propose it for review separately from the capabilities->vif_security change
16:38:58 nati_ueno: one other port binding change comes from nati_uen_'s vif_security patch
16:39:08 nati_ueno: that basically replaces storing the capabilities supplied by the bound MD with calling into the bound MD to get the vif_security attribute
16:39:17 rkukura: OK, I saw the line. Thanks
16:39:26 nati_ueno: I'd like to apply that same pattern to the vif_type, and allow it to be extended for things like attributes needed for SR-IOV
16:40:11 vif_security and vif_type are returned back to the agent through get_device_details?
16:40:32 matrohon: vif_security won't got to the agent
16:40:40 s/got/go/
16:40:44 it is needed by nova
16:40:45 matrohon: These are REST API extensions to port, not RPC
16:41:26 nati_ueno's patch updates the portbinding extension, replacing capabilities with vif_security
16:41:52 rkukura: so you wanna have the generalized version before my patch?
16:42:05 nati_ueno: If that helps get things moving, sure
16:42:29 rkukura: if it is faster, ok please
16:42:39 I think that patch would be a small change completely localized to ml2
16:43:27 A separate issue with vif_security is how ml2 gets the right info for the firewall_driver in the L2 agent on the node where the port gets bound
16:44:11 yes. but we need some workaround fix for this since security group is broken
16:44:17 I've suggested that the L2 agent could get the vif_security info from its firewall_driver, and include this in its agents_db info
16:44:39 then the bound MD would return this as the vif_security for the port
16:45:47 existing agents_db RPC would send it from agent to server and store it in the agents_db table
16:46:04 makes sense to me rkukura.
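A small hedged sketch of the agents_db idea rkukura suggests just above: the L2 agent would report the vif_security it obtains from its firewall_driver in its agents_db configuration, and the bound MD would return that value when building the port dict. The dictionary layout and the extend_port_dict() signature below are assumptions for illustration, not the real agents_db schema or ML2 driver API.

    # Agent row as the existing agents_db RPC might store it after the L2
    # agent adds a vif_security entry (obtained from its firewall_driver)
    # to its reported configuration.
    agents_db = {
        'compute-1': {
            'agent_type': 'Open vSwitch agent',
            'configurations': {
                'bridge_mappings': {'physnet1': 'br-eth1'},
                'vif_security': {'port_filter': True},
            },
        },
    }

    def extend_port_dict(port_dict, host):
        """Bound MD fills in binding:vif_security from the agent's report."""
        agent = agents_db.get(host)
        if agent:
            port_dict['binding:vif_security'] = (
                agent['configurations'].get('vif_security', {}))
        return port_dict

    print(extend_port_dict({'id': 'port-1', 'binding:host_id': 'compute-1'},
                           'compute-1'))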
16:46:14 Maybe this should be on the agenda for next week's ml2 meeting if not wrapped up by then?
16:47:00 rkukura: I agree.
16:47:24 rkukura: so you wanna do this from the first step?
16:47:28 nati_ueno: Sound reasonable?
16:48:20 rkukura: hmm, I need time to think about the architecture.. depending on the agent's configuration sounds like the wrong direction
16:48:43 nati_ueno: I should be able to post an initial patch for calling extend_port_dict() on the bound MD instead of storing vif_type and capabilities in the ml2_port_binding table today
16:49:23 rkukura: Ok, how about removing ml2 from https://review.openstack.org/#/c/21946/
16:49:24 nati_ueno: Is that concern around getting the vif_security from the firewall driver in the agent?
16:49:33 rkukura: yes
16:49:50 rkukura: then you will have a patch localized to ml2
16:50:17 nati_ueno: As I recall, your patch has the firewall_driver supply the vif_security value, right?
16:50:39 rkukura: yes
16:51:00 If that's the case, then I think the issue is whether this call on the firewall_driver gets made in the L2 agent or in the server
16:51:35 Right now, ml2 users need to supply a dummy value for firewall_driver so that the SG API extension is enabled
16:51:43 in havanan
16:51:46 havana
16:52:06 so my opinion is the current firewall driver model is broken
16:52:18 some functionalities are mixed
16:52:27 enable sec group or not
16:52:37 select driver for each agent implementation
16:52:42 Agreed, enabling the API should be a separate config in the server
16:53:03 But with ml2, different L2 agents might use different firewall drivers
16:53:05 select driver for an agent who supports more than one implementation
16:53:31 Or an MD for a controller might not use any L2 agent, and instead do SGs some other way
16:53:33 Most plugins (or MDs) support only one driver
16:53:41 so it should be defined automatically
16:54:09 ml2 works concurrently with openvswitch-agent, linuxbridge-agent, hyperv-agent, and soon with controllers like ODL
16:54:19 currently there is no plugin (or MD) that supports multiple drivers
16:54:29 each of these may have different ways to enforce SGs
16:54:39 so the MD should be in charge of defining the firewall_driver?
16:54:41 rkukura: I agree. so ML2 can support multiple MDs
16:54:55 and the driver will be decided by the MD
16:55:01 So your approach of having the bound MD return vif_security is correct
16:55:12 because there is no MD which supports more than one firewall driver
16:55:21 except Noop
16:55:32 but we need to resolve how the different MDs in the server get the correct vif_security to return
16:55:41 so we should be able to decide the vif_security value based on vif_type
16:56:02 when these MDs are supporting an L2 agent, the agents_db seems like a good solution to me
16:56:42 rkukura: OK, so we should remove NoopDriver.
16:56:52 so let's say we could remove Noop
16:57:02 we should be able to decide the vif_security value based on vif_type
16:57:16 right now, the bound MD supplies the vif_type and the capabilities - just want it to be able to supply the vif_security from the firewall_driver in the agent
16:57:45 the agent-based MDs already use agents_db to see what network segment they can bind to, based on bridge_mappings info
16:57:48 rkukura: yes, but it is not needed because we can map vif_type to a vif_security value
16:57:52 Just a note: We have 3 minutes left here folks.
16:57:58 Best to continue this discussion on the ML perhaps?
16:58:03 sure
16:58:07 gotcha
16:58:13 I just know there is another meeting right after this one. :)
16:58:22 I'm glad we got a chance to get this conversation with nati_ueno going
16:58:37 rkukura: I'll start a thread on openstack-dev
16:58:41 * mestery nods in agreement
16:58:46 nati_ueno: Thanks!
16:58:53 mestery: sure!
16:59:04 OK, I think that's it for this week folks!
16:59:09 Let's continue with these threads on the ML.
16:59:12 very interesting discussion
16:59:18 Thanks for joining the discussions this week everyone!
16:59:22 #endmeeting