16:01:59 <Sukhdev> #startmeeting networking_ml2
16:02:00 <openstack> Meeting started Wed May 13 16:01:59 2015 UTC and is due to finish in 60 minutes.  The chair is Sukhdev. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:02:04 <openstack> The meeting name has been set to 'networking_ml2'
16:02:08 <Sukhdev> #topic: Agenda
16:02:27 <Sukhdev> #link: https://wiki.openstack.org/wiki/Meetings/ML2#Agenda
16:02:41 <Sukhdev> Welcome folks to ML2 meeting
16:02:56 <Sukhdev> #topic: Announcements:
16:03:13 <Sukhdev> I am proposing that we cancel next two meetings
16:03:36 <Sukhdev> we will be at summit next week and tired the following week recovering from travels
16:03:44 <shivharis> duh..ok
16:03:46 <yamahata> +1
16:03:48 <Sukhdev> Is everybody OK with this?
16:03:53 <yamamoto> +1
16:03:59 <rkukura> +1
16:04:00 <yalie> +1
16:04:01 <Sukhdev> cool
16:04:14 <Sukhdev> Let's dive into the Agenda then
16:04:26 <Sukhdev> #topic: Task Flow discussion
16:04:38 <Sukhdev> #link: https://docs.google.com/document/d/1aSgTVB7nW_v7lHH0Z0DUgfymEsx0O16k1Jgu7QFXkFA/edit?usp=sharing
16:04:51 <Sukhdev> manishg: want to lead the discussion?
16:05:13 <Sukhdev> Are folks reviewing the document?
16:05:14 <manishg> sure.  would like to first know if folks have read the document :)
16:05:34 <manishg> I updated the doc based on the discussion from last week.
16:05:43 <shivharis> btw: i am ok with not going through all the state changes - as discussed last meeting
16:05:54 <rkukura> I see there have been significant updates since I last looked!
16:06:19 <manishg> rkukura: yeah, the updates are based on last meeting
16:06:25 <Sukhdev> shivharis: you mean intermediate states, right?
16:06:35 <shivharis> Sukhdev: yes
16:06:41 <manishg> rkukura: one of the items was to update the document with the discussion.
16:07:11 <rkukura> manishg: Much appreciated!
16:07:26 <Sukhdev> manishg: the doc is looking good -
16:07:49 <Sukhdev> But, I did not see any additional comments on it - I saw one and answered it
16:07:50 <manishg> based on comments I've tried to put the conclusion in proposal #1 - which also echoes what you and Sukhdev suggested.
16:09:03 <manishg> The idea is to let the updates continue to happen, and intermediate updates won't count (assuming the original operation is still in-progress).  Once the operation on back-end is done, it moves on to the latest db state and tries to execute that.
16:09:20 <manishg> I have noted the issues as well.
16:10:00 <manishg> One thing not mentioned in the document is that we should move in lock-step with all the mechanism drivers registered for the resource.  This way the backend state is consistent.
16:10:47 <rkukura> manishg: What do you mean by “move in lock-step with all the MDs …”?
16:10:59 <manishg> Let's say we have operations O1, O2, O3, O4.  O2-4 happen while all mech-drivers are still on O1.
16:11:15 <manishg> if some drivers finish O1, then we still wait for others to finish O1.
16:11:31 <manishg> before moving to realizing latest state O4.
16:12:14 <manishg> This way if O4 fails, we can move back to known good state O1.  If all drivers (registered for the resource) are in different states then the resource is not considered to be in a consistent state.
16:12:20 <manishg> rkukura: makes sense?
16:12:23 <rkukura> manishg: Wouldn’t that mean one MD having problems would block all other MDs? Are we sure this is what we want?
16:12:53 <rkukura> I’m thinking a more optimistic strategy might be better.
16:12:54 <manishg> If an MD is having issues with a resource then do we actually want to move ahead with others?
16:13:11 <Sukhdev> manishg: I think so
16:13:12 <rkukura> We basically would move forward assuming the MDs will eventually catch up.
16:13:21 <manishg> is the resource considered realized if one of the MDs that is registered doesn't complete the operation?
16:13:27 <Sukhdev> manishg: let the sync process take care of dealing with the failures
16:13:35 <rkukura> And make it visible to clients via the state whether MDs are caught up.
16:13:56 <rkukura> But not try to roll-back operations that have appeared to complete from the client’s perspective.
16:14:15 <manishg> rkukura: hmm... so really an operation is N independent operations (one per registered driver)?
16:14:30 <rkukura> Here, complete means the REST operation returned, but MDs may not have synced yet.
16:14:40 <manishg> if we do that, what would really a rollback mean?
16:14:59 <rkukura> I’m thinking the operation is really the DB state change. The drivers are always chasing the DB state.
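A minimal sketch of the "drivers chase the DB state" idea discussed above, where updates that arrive while a backend call is in progress are skipped and the driver moves straight to the latest DB state. The `ResourceRecord` and `DriverSync` classes and the revision counter are hypothetical names for illustration; they are not part of ML2 or of the proposal document.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    """Hypothetical DB row: the desired state plus a revision counter."""
    desired: dict
    revision: int = 0

    def update(self, **changes):
        # Each REST update commits a new desired state to the DB.
        self.desired = {**self.desired, **changes}
        self.revision += 1


@dataclass
class DriverSync:
    """Hypothetical per-driver tracker that chases the DB state."""
    name: str
    applied_revision: int = -1
    applied: dict = field(default_factory=dict)
    history: list = field(default_factory=list)

    def catch_up(self, record):
        # Jump straight to the latest DB state; revisions committed while
        # the previous backend call was in progress are never replayed.
        if self.applied_revision == record.revision:
            return False
        self.applied = dict(record.desired)
        self.applied_revision = record.revision
        self.history.append(record.revision)
        return True

    def in_sync(self, record):
        # Clients could be shown this per-driver flag via the resource status.
        return self.applied_revision == record.revision


net = ResourceRecord(desired={"name": "net1", "admin_state_up": True})
md = DriverSync("md_a")
md.catch_up(net)                  # O1 realized on the backend (revision 0)
net.update(name="net2")           # O2
net.update(admin_state_up=False)  # O3
net.update(name="net3")           # O4
md.catch_up(net)                  # driver goes directly to O4
print(md.history)
```

Each driver keeps its own tracker, so one slow driver does not block the others, and no rollback path is needed because the DB is treated as the source of truth.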
16:15:20 <Sukhdev> manishg: rollback becomes less interesting - I mentioned this in one of my comments
16:15:28 <shivharis> rkukura: is this a security hole?
16:15:31 <manishg> so then there is really no rollback.
16:15:57 <Sukhdev> shivharis: security hole - you lost me there
16:16:07 <shivharis> let me explain:
16:16:13 <Sukhdev> manishg: I think so
16:16:19 <manishg> rkukura, Sukhdev: so if an MD fails, then the state of that resource is really unknown and the operator must figure it out.
16:16:32 <shivharis> a network gets deleted, some MDs do not delete the network (i.e. vlan)
16:16:55 <manishg> is everyone else okay with no rollback?
16:17:07 <shivharis> new network gets created for a new vm - but the vlan is not released by some backend
16:17:23 <rkukura> I think I’m OK with no rollback of updates that have been committed to the DB.
16:17:49 <Sukhdev> shivharis: This case can not happen if the VLAN is still in the DB
16:18:00 <manishg> rkukura: so in case of error, the resource is sort of abandoned / in error state for the operator to intervene?
16:18:06 <rkukura> Maybe shivharis’s scenario could be avoided by not releasing a network’s resources (i.e. VLANs) until the delete has really completed for all MDs.
16:18:41 <Sukhdev> rkukura: correct - long time ago we discussed this in ML2 meeting
16:18:51 <shivharis> so that is a hole that needs special attention - are there any other such?
16:18:53 <manishg> shivharis: in this case, the db delete will be marked 'DELETING'
16:19:04 <manishg> shivharis: so cannot use the resource until it's gone from db.
16:19:36 <shivharis> so this logic should also be taken care for in delete port
16:19:36 <Sukhdev> manishg: simply reverse the calls in ML2 plugin - i.e. post_commit_delete() ahead of pre_commit_delete()
16:19:50 <shivharis> or in general all objects
16:20:10 <manishg> Sukhdev: there is no concept of pre- post- anymore.  why do you need it?
16:20:33 <manishg> the idea is to update the state, then after it's realized set it to final state.
16:20:40 <Sukhdev> manishg: I know, I used it to elaborate the idea - that it should be in the reverse direction
16:20:42 <manishg> shivharis: this applies to all resources.
16:20:42 <rkukura> manishg: I think the client API should be designed such that normally, if the operation completes and the REST call returns, the expectation is that the things are working. If this isn’t the case, it should be visible to the client via the status of the resource, and the operator should be able to get more details (i.e. from admin ops and/or logs).
16:21:14 <manishg> Sukhdev: it doesn't need to be reversed.  you mark it 'DELETING' in db.  then call drivers async.  when they are done, delete from db.
16:21:41 <Sukhdev> manishg: that means the same - we are on the same page
16:21:57 <Sukhdev> rkukura: +1
16:21:59 <shivharis> only if all backends say deleted - only then the resource gets deleted
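A minimal sketch of the delete flow described above: the REST call only marks the row 'DELETING', the drivers are called outside the REST call, and the row (with its VLAN) is removed only once every backend has reported the delete. `FakeDriver`, the dict-based `db`, and the `pending` set are hypothetical stand-ins for illustration, not the ML2 driver API.

```python
class FakeDriver:
    """Hypothetical mechanism driver; `fail` simulates a backend error."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def delete_network(self, net_id):
        if self.fail:
            raise RuntimeError(f"{self.name}: backend unreachable")


def request_delete(db, net_id, drivers):
    # REST call: flip the status only; the row and its VLAN stay in the DB.
    row = db[net_id]
    row["status"] = "DELETING"
    row["pending"] = {d.name for d in drivers}


def sync_delete(db, net_id, drivers):
    """Runs outside the REST call; can be retried until all drivers finish."""
    row = db[net_id]
    for d in drivers:
        if d.name not in row["pending"]:
            continue  # this backend already confirmed the delete
        try:
            d.delete_network(net_id)
        except RuntimeError:
            continue  # leave it pending; a later sync pass retries
        row["pending"].discard(d.name)
    if not row["pending"]:
        del db[net_id]  # the VLAN becomes reusable only at this point
        return True
    return False


db = {"net1": {"vlan": 100, "status": "ACTIVE"}}
drivers = [FakeDriver("md_a"), FakeDriver("md_b", fail=True)]

request_delete(db, "net1", drivers)
first_pass = sync_delete(db, "net1", drivers)   # md_b fails, row is kept
still_present = "net1" in db

drivers[1].fail = False                          # backend recovers
second_pass = sync_delete(db, "net1", drivers)  # row is removed now
```

Because the VLAN stays allocated while the row is in 'DELETING', a new network cannot pick it up before every backend has released it.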
16:22:07 <manishg> Sukhdev: if we are doing async, it is not the same thing
16:22:15 <Sukhdev> rkukura: Nova API, I believe, works like that
16:22:25 <manishg> rkukura: are we saying the calls are 'sync' (= blocking)?
16:22:49 <manishg> because otherwise when the call returns, things aren't working (because they are being set in parallel)
16:23:19 <rkukura> manishg: I think we are saying that clients use the API in a synchronous way - i.e. they continue with the next step when the call returns. But if they need to, they can look to see if the backends are in sync.
16:24:19 <rkukura> The point where we really need to make sure the backends are in sync is when we notify nova that a port is plugged and the VM can continue booting.
16:24:19 <Sukhdev> manishg: no sync - all is async, the burden is shifted to client side
16:24:23 <manishg> rkukura: ok.  then we are on same page.  call returns and clients can proceed with next operation but the operation in the backend may be still going on.  The clients would need to know that much.
16:24:53 <manishg> Sukhdev: if all is async, calling "post-" first will not change anything.  By the time it's actually executed the pre- would have deleted from db.
16:25:02 <manishg> so, the intermediate state is the way to go.
16:25:12 <manishg> and that doesn't have any issues that shivharis mentioned.
16:26:05 <manishg> I think we are all on the same page more or less that in the end backend should sync with db (outside of rest call)
16:26:15 <Sukhdev> manishg: I am not a huge fan of intermediate state
16:26:37 <Sukhdev> manishg: agree with your last statement
16:26:39 <rkukura> By “intermediate state”, do you mean CREATING, UPDATING, and DELETING?
16:26:47 <manishg> rkukura: yes
16:27:08 <manishg> I think for async that will be sort of a basic requirement.  agree?
16:27:11 <Sukhdev> manishg: Oh I misunderstood - then I am OK as well with intermediate state
16:27:31 <manishg> ok.  there are two main questions I have.
16:27:44 <manishg> One was about the lock-step issue (error in one driver).
16:27:54 <rkukura> Are we agreeing that if you create a network and it returns in CREATING state (MDs not yet synced), it's OK to immediately create a port on that network?
16:27:55 <manishg> How do we deal with that.  I think we didn't close that.
16:28:42 <rkukura> Specifically, that the client doesn’t have to way for a READY state before creating the port?
16:28:42 <manishg> rkukura: I think we are agreeing that we can perform other operations on that resource.  But in this example,
16:29:04 <manishg> I think the resource would have to be realized, correct?
16:29:07 <rkukura> s/way/wait/
16:29:20 <rkukura> The resource would be realized in the DB.
16:29:33 <Sukhdev> rkukura: the desired state and actual state are not same (based upon my model), hence, the resource is not available yet
16:29:40 <manishg> if the backend can deal with this then it shouldn't be an issue.
16:29:41 <rkukura> But the backend work would be async.
16:30:41 <manishg> so in this case, the port-creation would need to wait for the network-creation.
16:31:08 <rkukura> I’m arguing for an “optimistic” approach where the client proceeds creating and using resources without normally worrying about whether the backends are in sync.
16:31:13 <manishg> inter-resource dependency.  I need to think more about this.  I was mainly focusing on operations on the same resource.  any ideas?
16:31:20 <Sukhdev> rkukura: one should not be allowed to act on that resource until desired state == actual state - they could create port on different network, but, not on this one
16:31:48 <manishg> rkukura: yes, the client could move ahead.  but the backend may not be able to.
16:32:01 <rkukura> Sukhdev: I think trying to reject API calls related to the resource would break existing clients badly.
16:32:02 <manishg> Sukhdev: what does the client see?  desired state?
16:32:22 <Sukhdev> manishg: correct
16:32:23 <manishg> I think rkukura is trying to preserve the existing model where client doesn't change.
16:32:47 <manishg> Sukhdev: in that case, the client might as well see if the state is READY or not (ACTIVE or not), etc.
16:32:56 <rkukura> Why not just let the API proceed from transaction to transaction, and only really wait for the backend to catch up when absolutely necessary (like when notifying nova the port is up)?
16:32:57 <Sukhdev> rkukura: actually - more I think about it, I think, it would be OK to allow subsequent operations - that would be OK
16:32:59 <manishg> but rkukura is saying the client shouldn't have to.
16:33:48 <Sukhdev> I think I am agreeing with rkukura as I am thinking more about it -
16:33:54 <manishg> rkukura: yes, that's what I understood from the last discussion.  Let me address this case in the doc also.
16:34:11 <rkukura> Sure, a client could create a network, wait to make sure the backends sync, then create the port, but they shouldn’t need to wait.
16:34:21 <manishg> so the MDs may block on port-create until network is available to them.
16:34:42 <shivharis> here is what i think will work (i need to sit peacefully to account for all repercussions though)
16:34:44 <Sukhdev> the back-ends should be able to handle this - the task flow engine has to ensure that if desired state is not equal to actual state, it keeps retrying the operation
16:34:53 <manishg> rkukura: yes, the client doesn't need to.  but the port won't be active until the network + port is created in the backend.
16:35:15 <shivharis> operations on a specific object may be fast forwarded
16:35:27 <shivharis> but inter-object dependencies cannot
16:35:58 <manishg> shivharis: that would be up to the implementation of the MD for that resource.  It will know the dependency.
16:36:30 <shivharis> manishg: i am looking from the client side
16:36:49 <manishg> so port-create for a yet-to-be-realized network will succeed at REST call level.  port will be in CREATING state. and the MDs could block on port-create till network is actually available.
16:37:07 <manishg> shivharis: rkukura doesn't want the client to change
16:37:08 <Sukhdev> shivharis manishg : If you have access to Arista's sync implementation - we deal with these interdependencies based upon the DB state only
16:37:40 <shivharis> manishg: i need time to think this through..
16:37:48 <manishg> Sukhdev: how do you create a port in your implementation if the network in the backend doesn't exist but exists in the db?
16:38:10 <Sukhdev> manishg: We have precedence on the resources -
16:38:17 <rkukura> It sounds like we are close to consensus on what this looks like from the client API perspective, but I think we all need to think about how the backend sync would work with that.
16:38:35 <shivharis> rkukura: +1
16:38:53 <Sukhdev> manishg: Higher precedence resources are ensured to sync first and then lower - this works out OK
16:39:07 <manishg> rkukura: +1
16:39:57 <Sukhdev> So, I see we have converged - manishg can you please capture this in your doc?
16:39:57 <manishg> Sukhdev: yes, if there is some inter-resource dependency defined then this works fine I think.
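A minimal sketch of precedence-ordered sync as described above, where higher-precedence resources (networks) are pushed to the backend before the resources that depend on them (ports). The `PRECEDENCE` table, the record layout, and the list standing in for a backend are assumptions for illustration; this is not Arista's actual implementation.

```python
# Assumed ordering: a lower number means the resource is synced earlier.
PRECEDENCE = {"network": 0, "subnet": 1, "port": 2}


def sync_order(pending):
    """Return pending DB records sorted so dependencies come first."""
    # sorted() is stable, so records of the same type keep their DB order.
    return sorted(pending, key=lambda rec: PRECEDENCE[rec["type"]])


def sync_pass(pending, backend_calls):
    """Replay pending records against a backend (here just a list)."""
    for rec in sync_order(pending):
        backend_calls.append((rec["type"], rec["id"]))


# The port row was committed before its network reached the backend.
pending = [
    {"type": "port", "id": "p1", "network_id": "n1"},
    {"type": "network", "id": "n1"},
    {"type": "port", "id": "p2", "network_id": "n1"},
]
calls = []
sync_pass(pending, calls)
print(calls)
```

With this ordering the REST call for a port can succeed while its network is still in CREATING state, since the sync pass realizes the network on the backend before it attempts the port.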
16:40:37 <Sukhdev> Folks, Please review the document and post the comments there
16:40:39 <manishg> I'll note this in the doc as well.  But doesn't look like many people have read the doc.
16:41:09 <Sukhdev> manishg: too many things on plates, perhaps :)
16:41:11 <yamamoto> i'm not sure if ml2 meeting is an appropriate place for these rest api discussion
16:41:18 <yamamoto> it needs wider audience
16:41:30 <Sukhdev> Summit will give us opportunity to sit down face-to-face and discuss this all
16:41:58 <Sukhdev> Shall we move on?
16:42:04 <yamamoto> i have a question
16:42:13 <Sukhdev> yamamoto: please ask
16:43:21 <Sukhdev> In the meantime, let's keep marching forward
16:43:22 <manishg> yamamoto: +1 (re discussion)
16:43:41 <Sukhdev> #topic: Design Summit Discussion
16:43:45 <Sukhdev> #link: https://etherpad.openstack.org/p/YVR-neutron-contributor-meetup
16:43:54 <Sukhdev> shivharis: added ML2 topics
16:43:54 <yamamoto> please move on.  i realized my question was invalid while typing.
16:44:00 <rkukura> yamahata: I’d agree that if we were changing the API or the way the client uses it, we’d need wider discussion. But I think if we solidify how ML2 MDs sync, but preserve the current client API and interactions, we can proceed.
16:44:13 <Sukhdev> yamamoto:-) it happens :-)
16:44:29 <rkukura> shivharis: thanks for updating the etherpad!
16:44:48 <shivharis> I have added ml2 topics  - we need to sort this however - now or at summit?
16:45:01 <Sukhdev> Are we OK with what shivharis added?
16:45:14 <Sukhdev> Please review it and feel free to add/update
16:45:20 <shivharis> have i missed anything dear to you?
16:45:52 <rkukura> I think we should identify 3 or 4 top-level topics, and maybe move some things under these.
16:45:55 <Sukhdev> Let's all review and update with your name next to any updates
16:46:08 <Sukhdev> rkukura+1
16:46:21 <shivharis> i can do it offline before the summit
16:46:25 <rkukura> This will also likely evolve during the summit based on the other sessions.
16:46:39 <Sukhdev> I will be very actively driving Bare Metal integration
16:46:58 <rkukura> Sukhdev: There is a design session for bare metal, right?
16:47:32 <Sukhdev> rkukura: There are two - one on Neutron side, and other on Ironic side - I am going to request all ML2 folks to attend both
16:47:52 <Sukhdev> neutron side is on Wednesday PM and Ironic side is on Thursday PM
16:48:01 <rkukura> Sukhdev: Good - we can use some of Friday to followup if needed, but should identify other topics for Friday.
16:48:20 <Sukhdev> There is a huge interest in this topic, hence, both sides gave me the sessions
16:48:23 <rkukura> Is there much interest in more flexibility for security groups within ML2? I know ODL people are interested in this.
16:48:59 <shivharis> there is a topic in design session wrt SG and FW and the future of SG?
16:50:16 * Sukhdev moving on - running out of time
16:50:25 <Sukhdev> #topic: Liberty BP
16:50:52 <Sukhdev> #link: https://review.openstack.org/#/c/169223/3/specs/liberty/ml2-extension-driver-convert.rst
16:51:06 <yalie> Hi all, I submitted a spec for converting address-pairs and sec-group into extension drivers
16:51:22 <yalie> but don't know if this is the direction of the ML2
16:51:49 <Sukhdev> yalie: I saw this today - have not had time to digest
16:52:00 <yalie> rkukura mentioned maybe we can do something for agentless MD
16:52:18 <yalie> Sukhdev: right, maybe we can discuss it later
16:52:31 <yalie> in the neutron room
16:52:32 <shivharis> rkukura: we should address security group, since we really do not know what the design session has to offer
16:52:33 <rkukura> In principle, I like the idea of turning as many extensions as possible into optional pluggable extension drivers, simplifying the “core” of ML2.
16:53:42 <rkukura> seems we’ve got plenty to discuss for ML2 Friday at the summit!
16:53:43 <yalie> I don't quite get that "the server-side implementation of security groups in terms of RPCs to L2 agents on the compute nodes is hardwired into ML2."
16:53:44 <Sukhdev> yalie: next week Summit may be a good place to discuss this - consider adding to etherpad
16:54:12 <yalie> rkukura: thanks
16:54:21 <rkukura> yalie: yes, let's add this if it's not already there
16:54:35 * Sukhdev moving on
16:54:36 <yalie> rkukura: OK
16:54:44 <Sukhdev> #topic Open Discussion
16:54:54 <Sukhdev> I have a few questions listed
16:55:14 <Sukhdev> Do we support N:M relationship between ports and networks
16:55:30 <Sukhdev> i.e. one port on multiple networks and vice versa?
16:55:36 <rkukura> I don’t think we do currently.
16:55:52 <rkukura> I think this is the “trunk port” idea that has been kicked around.
16:56:11 <Sukhdev> rkukura: I tested one thing - Launch a VM from horizon and add it to two networks - it works
16:56:44 <rkukura> Sukhdev: Is this with two separate neutron ports, one for each network, as two separate vNICs on the VM?
16:56:45 <Sukhdev> rkukura: do not know what goes on under the covers - anybody knows?
16:57:00 <Sukhdev> rkukura: yes
16:57:05 <yamamoto> doesn't the VM have 2 ports?
16:57:29 <Sukhdev> yamamoto: I think yes
16:57:40 <Sukhdev> So, this will imply one port to one network -
16:57:57 <Sukhdev> however a given instance can connect to multiple networks
16:58:07 <rkukura> Sukhdev: Do you want to add “trunk ports” to the etherpad to look at extending this to multiple networks per port/vNIC?
16:58:49 <yamahata> vlan-aware-vm session is allocated for this topic.
16:59:00 <Sukhdev> rkukura: good idea - let me add to the etherpad - I will bring it up in the main design session
16:59:16 <rkukura> before we finish - we skipped bugs, but I’d appreciate if people would review https://review.openstack.org/#/c/182134/, which is the 1st step on completing the DVR code clenaup.
16:59:19 <rkukura> cleanup
16:59:23 <Sukhdev> yamahata: can you elaborate?
16:59:34 <yamahata> I mean https://etherpad.openstack.org/p/YVR-neutron-vlan-trunk
16:59:35 <rkukura> yamahata: right
16:59:49 <Sukhdev> yamahata: Thanks for the link
16:59:49 <yalie> vlan-transparent seems to have been implemented
17:00:00 <yalie> vlan-aware-vm has not
17:00:12 <yalie> I am not sure
17:00:36 <yamamoto> we've run out of time
17:00:56 <Sukhdev> anybody want to discuss anything? 1 min left
17:01:00 <rkukura> I think vlan-transparent means one neutron network is a trunk of VLANs, where the newer idea is more flexible, composing a neutron port from multiple networks.
17:01:39 <yalie> rkukura: thanks for clarifying!
17:02:10 <rkukura> we need to wrap up
17:02:11 <Sukhdev> Folks, this was a very good meeting - good discussion
17:02:21 <rkukura> thanks Sukhdev!
17:02:22 <Sukhdev> Thanks for attending
17:02:34 <Sukhdev> will see you in Vancouver next week
17:03:10 <Sukhdev> #endmeeting