16:01:54 #startmeeting networking_ml2
16:01:54 Meeting started Wed May 6 16:01:54 2015 UTC and is due to finish in 60 minutes. The chair is rkukura. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:01:56 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:58 The meeting name has been set to 'networking_ml2'
16:02:12 #topic Agenda
16:02:35 #link https://wiki.openstack.org/wiki/Meetings/ML2#Meeting_May_6.2C_2015
16:03:06 Any questions about the agenda, or would anyone like to add anything?
16:03:46 #topic Announcements
16:04:14 kilo has been released - congrats everyone!
16:04:21 any other announcements?
16:05:21 Many ML2 drivers were decomposed in this release - so great effort by everybody
16:05:22 #topic Task Flow Discussion
16:05:57 manishg has posted a working doc, and people are starting to provide feedback
16:06:06 #link https://docs.google.com/document/d/1aSgTVB7nW_v7lHH0Z0DUgfymEsx0O16k1Jgu7QFXkFA/edit?usp=sharing
16:06:19 only see Sukhdev's comments there I think.
16:06:20 manishg: Did you have time to review my comments?
16:07:15 Sukhdev, I noticed them yesterday. looked at them and we can discuss. depending on how many people have reviewed.
16:07:57 manishg: I was hoping we discuss some of this today - if folks are ready
16:08:03 Have others had a chance to start reviewing the doc yet?
16:08:12 yep, we can.
16:08:27 I added a few comments this morning.
16:08:32 I see your comments from yesterday and rkukura's. shivharis?
16:09:12 For those who just joined, we are discussing manishg’s working doc on TaskFlow
16:10:15 Sukhdev, of your comments the one I note that we may want to discuss is the one where you are commenting on queueing, right?
16:10:40 yes i have read the doc
16:10:54 manishg: I am suggesting an alternate which avoids dealing with queueing issues
16:11:10 manishg: Meant to mention that I really appreciate you putting this together!
16:11:58 thanks rkukura.
Sukhdev: I didn't get what alternative you are proposing? I don't think queuing is orthogonal to db being source of truth
16:12:01 manishg: As I highlighted - there are basically two design models - one is queuing based (where each and every transition is dealt with)
16:12:12 and the second which is state based -
16:12:25 rkukura +1
16:12:35 my concern is that what happens when an object is still updating in the back end and a new operation for the same object is modified
16:12:45 by the front en
16:12:47 end
16:13:25 shivharis: Well, in a state driven model, it is assumed that the desired state will be programmed eventually
16:13:55 shivharis: For example, if your network name is blue, then red, then green and then yellow -
16:14:00 how many can you queue up?
16:14:29 shivharis: do you care to preserve all the intermediate states or you ensure that when system settles down the network name is yellow
16:14:35 Sukhdev: queueing is needed only if you want to "hide" the underlying async nature from the caller.
16:14:38 In a queuing model where each transition is guaranteed to be processed by each MD, wouldn’t we need to store each entire state in the queue until all MD’s are done processing it?
16:14:48 i.e. what is the queue length of the ops for a specific object
16:14:48 manishg: Not really
16:15:17 Sukhdev: alternatives?
16:15:34 Should we try to agree on how this looks to the client first?
16:15:36 If you read my proposal, it says the goal of the back-end is to achieve the desired state -
16:15:50 rkukura: I think that will help
16:16:03 shivharis: "how many can you queue? " -- maybe we can discuss this at implementation time. not sure if this impacts the design much (i.e. if we do queueing). agree?
16:16:05 rkukura: we are kind of jumping into the implementation details here - :-)
16:16:18 if you queue up, you will have to process the queue one by one
16:16:27 manishg: agree
16:16:48 here is the rub:
16:17:03 rkukura: Go ahead start with the client's view and then we can dive into the next level
16:17:19 Sukhdev: addressing your statement - where is the state maintained? in db right? (nothing else)
16:17:33 manishg: correct
16:17:55 manishg: basically DB size doubles - but, it cures most of issues
16:17:59 Sukhdev: now, if these operations are done in sequence O1, O2, O3. and backend is still working on O1
16:18:11 is the final state that is needed or all the state transitions necessary
16:18:15 then where is O2 O3 end state maintained?
16:18:30 manishg: you will never have that issue - let me give an example
16:18:55 I think I agree with manishg’s statement that “ML2 should allow operations regardless of current state”. Do others have any concerns about this?
16:19:17 rkukura: i agree with that statement
16:19:18 rkukura: manishg +1
16:19:24 Sukhdev: ok. let's take this one step at a time and understand your statement(s)/ proposal.
16:19:32 say the front-end started with O1, the back end was getting ready to process the O1, but, in the mean time front-end has reached to O3 - the back-end processes the O3 directly and ignores everything in between
16:19:46 hang on.
16:19:52 the premise here is the desired state is true source - which is assumed in ML2
16:19:53 what do you mean frontend is at O3 ?
16:20:00 what is front end here? db ?
16:20:40 yes- front end is client, back-end is drivers
16:20:43 you are assuming we do either transition table or we just wipe out previous with latest. and backend sync with the db. is that right?
16:20:47 both work with the DB
16:21:15 front-end works/updates with desired state and back-end works/updates actual state
16:21:31 isn't that same as 'state transition' ?
16:21:37 manishg: there is a problem, let me explain
16:21:39 that is what happens in that mode.
16:21:48 The task flow manager/ML2 Plugin only sees if both states are in same state or not - if not, action is taken
16:22:06 if O? operations are for create port
16:22:19 Sukhdev: can you explain what you do mean by both states in same state ?
16:22:32 and O1 and O5 are for create port (but for different networks)
16:22:46 what do you do?
16:23:04 Could we move away from passing previous state to the MDs in postcommit?
16:23:05 manishg: If you look at DB record for a network or port in DB (the way we have it today) - this is the desired state
16:23:09 shivharis: the dependency need to be identified correctly. what if O2 is DELETE ?
16:23:39 Sukhdev: today it is sync !
16:23:47 manishg: exactly, you need the intermediate transitions
16:23:52 so there is no question of queueing or state transition.
16:24:06 manishg: If you created a replica of this record (with all values as NONE) and asked the driver to act on that record and update the state of the actual state
16:24:15 why do we need the intermediate transitions?
16:24:33 you can then look at both records and say which resource is out-of-sync -
16:25:00 Sukhdev: your proposal seems -- "implement state table OR wipe out state based on latest operation" and "sync backend with this state", correct?
16:25:16 manishg: yes
16:25:46 Sukhdev: so this is same as "state transition" proposed in document, right? All you are saying is that the transition you want
16:25:59 is in any state the latest is allowed.
16:26:11 which is fine if everyone agrees.
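
The state-based model described above (a desired-state record written by the front end, a replica holding the actual state written by the drivers, and a comparison that finds what is out of sync) can be sketched roughly as follows. This is only an illustration under stated assumptions: `ResourceRecord`, `out_of_sync` and `reconcile` are hypothetical names, not part of ML2 or TaskFlow, and a real driver would program a backend rather than copy values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ResourceRecord:
    """Hypothetical pair of records for one network or port."""
    # Desired state: what the front end (API client) last asked for.
    desired: Dict[str, Any]
    # Actual state: what the back end (drivers) has programmed so far.
    # Missing keys are treated as None, like the "all values as NONE" replica.
    actual: Dict[str, Any] = field(default_factory=dict)


def out_of_sync(record: ResourceRecord) -> List[str]:
    """Return the sorted attribute names whose actual value differs from desired."""
    return sorted(
        key for key, value in record.desired.items()
        if record.actual.get(key) != value
    )


def reconcile(record: ResourceRecord) -> List[str]:
    """Bring actual state in line with desired state; return what changed.

    Only the latest desired value is applied, so intermediate values that
    were overwritten before reconciliation ran are never seen.
    """
    changed = out_of_sync(record)
    for key in changed:
        # Stand-in for the driver programming the backend (assumption).
        record.actual[key] = record.desired[key]
    return changed
```

With the network-name example from earlier in the discussion, renaming a network from blue to red to green to yellow before the back end runs leaves a single out-of-sync attribute, and one `reconcile` call sets the actual name to yellow without touching the intermediate names.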
16:26:58 manishg: I am more or less in agreement with what you have written - just trying to give a model that will eliminate the need of complex queuing models and baby-sitting intermediate transaction - therefore, simplifying the design
16:27:16 If we can get away with it and it simplifies the implementation, I’d think ignoring intermediate transitions would be OK
16:27:43 Give me sec - let me make a point here
16:28:04 Sukhdev: consider this . CREATE, DELETE, UPDATE - what should happen? also consider the fact
16:28:05 Has anyone considered the transition serial number approach from my comment?
16:28:19 that there may be multiple drivers - some faster than others.
16:28:36 i think the first implementation will put this to bed, details matter here
16:28:36 rkukura: comment in doc?
16:28:48 manishg: The update should be rejected in precommit once delete has been processed in precommit
16:28:57 If you really look at the ML2/Neutron DB - we are saying this is the state of the system
16:29:34 manishg: yes, in the doc I suggested that ML2 keep a counter of transitions for each resource, and use this to determine for each MD if there are outstanding transitions to process.
16:29:35 rkukura: so there is some state transition table (in some form). Ok, that works.
16:29:47 If anybody is not in sync with this, needs to bring into sync - and this engine (task flow) is simply looking at the DB and triggering notifications/operations
16:29:52 rkukura: didn't look at comments this morning. I'll take a look.
16:30:38 This really requires a white board discussion with pictures :-):-) manishg wish you were going to Vancouver
16:30:40 Sukhdev: yes, if we implement simple transition states like DELETING -> * : not allowed. *->UPDATE : ok. etc. then we get the desired goal of
16:30:40 (hard to follow :p)
16:30:42 Lets spend a couple more minutes on this topic today, then proceed with the rest of the agenda.
16:30:52 db being latest.
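
The two ideas raised above, a per-resource transition counter that tells each MD whether it has outstanding work, and a precommit rule that rejects operations once a delete is in progress, can be sketched together as below. This is a rough illustration only: `TransitionTracker`, its methods and the status strings are hypothetical and are not an existing ML2 or TaskFlow API.

```python
from typing import Dict, Iterable


class TransitionTracker:
    """Hypothetical per-resource bookkeeping for async postcommit work."""

    def __init__(self, drivers: Iterable[str]) -> None:
        self.serial = 0           # bumped once per accepted transition
        self.status = "READY"     # illustrative status names (assumption)
        # Highest serial each mechanism driver has finished processing.
        self.processed: Dict[str, int] = {name: 0 for name in drivers}

    def request(self, operation: str) -> int:
        """Precommit check: accept or reject an operation, return its serial."""
        if self.status == "DELETING":
            # DELETING -> * is not allowed.
            raise ValueError("resource is being deleted; %s rejected" % operation)
        self.status = "DELETING" if operation == "delete" else "UPDATING"
        self.serial += 1
        return self.serial

    def outstanding(self, driver: str) -> bool:
        """True if this driver has not yet caught up with the latest serial."""
        return self.processed[driver] < self.serial

    def mark_processed(self, driver: str, serial: int) -> None:
        """Record that a driver has applied state up to the given serial."""
        self.processed[driver] = max(self.processed[driver], serial)
```

Because each driver records only the highest serial it has applied, a slow driver can skip straight to the latest state while a fast one is already idle, which matches the "multiple drivers, some faster than others" concern without storing every intermediate state.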
16:31:20 GLaupre: correct - agree with you - requires a white board with pictures to explain :-)
16:31:21 GLaupre: discussing based on document and previous discussions. there are a couple of key issues which need to be nailed down further.
16:31:35 manishg: I think those constraints are already implemented
16:31:47 rkukura: yes.
16:31:52 but not async.
16:32:25 some of these problems won't be problems when operating in sync model.
16:32:31 They are implemented in the precommit part. I guess we need to think about whether postcommit processing could happen in a different order.
16:32:34 May I propose that we schedule a google hangout meeting and discuss it there?
16:32:39 Sukhdev: re whiteboard - whiteboarding with hangout works?
16:32:44 In fact, that’s possible right now with multiple server threads.
16:34:10 rkukura: if we have multiple server threads and if each one doesn't lock db for long (only precommit) then are you
16:34:19 saying we perhaps don't need async model?
16:34:53 manishg: Is there any chance you could participate remotely in a discussion of this Friday morning of the summit?
16:35:01 or are we saying we use TF only for proper rollback/ recovery?
16:35:12 rkukura: yes sure. I'd like to.
16:35:24 manishg: I’m thinking we should use TF to make the postcommit processing async, like you propose
16:35:26 and I'll request Josh to be there. He'll be there.
16:36:20 rkukura: should I put how things are today (in the doc) - and address the multiple threads in there too?
16:36:21 TF?
16:36:25 TaskFlow
16:36:37 manishg: +1
16:36:45 Seems we need to agree on how this looks to the client (i.e. is there a new CREATING/READY/UPDATING/DELETING state visible?) and work through the implementation options.
16:36:48 rkukura: any more changes needed in doc? layout? details? etc. ?
16:36:55 I'll go through the comments and address them.
16:37:03 we should prepare for this with a state diagram and a few scenarios
16:37:16 maybe add some example flows also, to highlight the issues we're discussing.
16:37:30 manishg: I think we should continue adding ideas and feedback to the document, either inline or as comments.
16:37:44 shivharis: state diagrams in our case will be simple I think.
16:37:47 manishg: you were going to look at how midonet uses it, that information will also be helpful
16:37:48 manishg: yes, I’d really like to identify failure use cases to handle
16:37:51 rkukura: agree.
16:38:03 like servers dying at various points, multiple servers, ...
16:38:18 and restarts
16:38:19 rkukura: ok, will do.
16:38:39 lets not worry about format while we are brainstorming
16:38:43 rkukura: Do you want to take an action to add to the fishbowl topics to discuss this?
16:38:57 Sukhdev: midonet - I started looking at it. at least for network create etc. it seems to be straightforward and doesn't have the issues with multiple drivers and such. but will look at other calls
16:39:08 to see if some of it give us any ideas.
16:39:24 #action rkukura to add TaskFlow discussion to Friday summit fishbowl session etherpad
16:39:59 manishg, Sukhdev: What is the relevance of midonet to the TF discussion?
16:40:00 Sukhdev: I'll continue to explore midonet's other API calls.
16:40:14 manishg: Is Josh going to Vancouver?
16:40:28 rkukura: someone had suggested midonet plugin may have some async stuff or similar cases to handle.
16:40:31 manishg: if he is, can you have him participate in this discussion?
16:40:40 manishg: OK, thanks
16:40:50 Sukhdev: yes, Josh will be at the summit. I'll request him to get in touch with you to join the discussion.
16:41:09 Lets continue this discussion in the doc, and move on with the agenda
16:41:09 I already talked to him earlier last week and he had agreed.
16:41:15 anything else on sync/TF now?
16:42:01 #topic Liberty Design Summit Discussion
16:43:08 So my understanding is that no ML2-specific sessions are scheduled for Monday thru Thursday, but we can have an “ML2 corner” at the Friday morning session.
16:43:31 shivharis: had added some items on etherpad - those need to be moved to the fishbowl topics under ML2
16:43:39 rkukura: correct
16:44:12 i am not familiar with fishbowl .. somebody bring me to speed?
16:44:17 link if handy? don’t seem to find it
16:44:21 I believe this is the etherpad for this Friday session:
16:44:24 #link https://etherpad.openstack.org/p/YVR-neutron-contributor-meetup
16:44:30 rkukura: thanks
16:44:44 shivharis: I think it is similar to PODs in Paris
16:44:45 So I’ve got the action to add something on sync/TF
16:44:52 got it
16:45:20 shivharis: I think fishbowl just means its a big room (~200 people)
16:45:40 What other ML2 items do we want to cover?
16:45:47 with a few fish in it
16:45:51 Sukhdev, rkukura: thx
16:46:34 banix: ha ha - it should actually be whalePond, if it is big enough to hold 200 people :-)
16:47:01 some of the items from the original neutron topics etherpad are candidates
16:47:17 i'll sort it and put there is the fishbowl
16:47:27 s/there/these/
16:47:38 We could certainly discuss the state of the DVR cleanup I’ve been doing, and generalizing distribute port support
16:48:02 And there are several ideas related to extension drivers.
16:48:09 rkukura: distribute port?
16:48:25 distributed port, like DVR, but for DHCP, …
16:48:31 I had added only one item on the original etherpad - and I got a session for this (Ironic Integration with neutron) - who else added topics to original etherpad - we need to bring those into this new fishbowl etherpad
16:48:34 i see
16:48:39 So the same port can be bound on multiple hosts.
16:48:50 enhancement to extension driver?
16:48:53 yes yes
16:49:02 rkukura: distributed dhcp?
16:49:11 Sukhdev: I think ironic is covered in other sessions, right?
16:49:23 rkukura: yes
16:49:26 shivharis: maybe, or any service that wants to be replicated
16:49:48 rkukura: for HA?
16:50:06 shivharis: possibly
16:50:13 rkukura: ok
16:50:19 I’ll add it to the list
16:50:42 Is there any interest in making security group enforcement pluggable and/or MD-specific in ML2?
16:50:50 all, please feel free to update the fishbowl
16:51:27 Right, if there is anything ML2-specific you’d like to discuss at the summit add it to the fishbowl etherpad
16:51:28 shivharis rkukura : I was hoping we nail down the topics for fishbowl - so that we can plan/prepare for it
16:52:08 Sukhdev: Can we pare down / organize the topics next week?
16:52:13 Sukhdev: ideally yes.
16:52:44 rkukura: ah ha - I was going to ask if we want to meet next week - you just answered my question :-):-)
16:52:52 If you care about something and are willing to lead a discussion, please add it to the list and we can go over it next Wednesday at this meeting.
16:53:05 lets move on with the agenda
16:53:23 #topic Mid-Cycle Sprint
16:53:58 Assuming we nail down/agree on the design of Task Flow, this is best place to get it done
16:54:07 main thing is that ML2 sync/TF is on the agenda
16:54:13 Sukhdev: right
16:54:29 also “reference plugin decomp”
16:55:02 I think this means moving the OVS and LB agents and MDs to a different repo. Is that right?
16:55:12 rkukura: what do you have on mind regarding plugin decomp?
16:55:39 Sukhdev: Wasn’t my agenda item, just noticed it
16:55:49 Five minutes left
16:55:55 i believe that had to do with the inconsistency in the way folks do decomp.
16:55:58 rkukura: Oh I see
16:56:33 some folks remove most of the driver, some have minimal stuff in it
16:56:36 shivharis: could be, but it does say “reference plugin” and there has been talk of moving things to other repos. Lets see what happens at the summit on this.
16:56:54 #topic Bugs
16:56:58 ah.. plugin
16:56:59 shivharis: Any update?
16:57:20 shivharis: “plugin” seems to be used generically to mean plugin or ML2 driver these days.
16:57:23 not much to report on bugs things seem stable
16:57:44 shivharis: thanks
16:57:47 i did have issues with the master branch not coming up yesterday
16:58:03 You should see a patch for the DVR MD issues and UTs in the next day or so.
16:58:09 there was a bug fixed in kilo - needs ported to master
16:58:15 but has nothing to do with ML2
16:58:41 anything else on ML2 bugs?
16:58:47 anyone has anything to add to the bugs topic
16:58:53 #topic Open Discussion
16:59:01 We’ve got just one minute
16:59:06 Hi rkukura, I want to do some work on 'convert the sec-group/address-pair into extension driver', would that be one of ML2's object.
16:59:26 ?
16:59:34 yalie: probably - any details available?
16:59:51 I am writing a bp
17:00:08 I was also suggesting we make SGs more MD-specific so options other than L2 agent are possible
17:00:15 https://review.openstack.org/#/c/169223/ but in draft
17:00:33 OK, we can see if we want to cover this during the fishbowl
17:00:45 maybe put the link in the etherpad
17:00:49 thanks
17:00:49 I need directions for SR-IOV support with ML2 plugin. With whom can I chat on that topic offline someday?
17:01:23 GLaupre: I’d just ask on #openstack-neutron
17:01:32 a number of people were involved
17:01:34 ouki
17:01:36 we are out of time
17:01:43 Thanks everyone!
17:01:47 #endmeeting