15:03:24 <garyk> #startmeeting scheduling
15:03:25 <openstack> Meeting started Tue Sep 24 15:03:24 2013 UTC and is due to finish in 60 minutes. The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:30 <openstack> The meeting name has been set to 'scheduling'
15:03:58 <garyk> Last week we did not have much chance to discuss Mike's and Tahi's ideas.
15:04:04 <garyk> Sorry, Yathi's
15:04:16 <garyk> MikeSpreitzer: do you want to start?
15:04:23 <MikeSpreitzer> OK
15:04:24 <Subbu> #info
15:04:49 <garyk> Subbu: please see https://docs.google.com/document/d/1hQQGHId-z1A5LOipnBXFhsU3VAMQdSe-UXvL4VPY4ps/edit
15:04:57 <MikeSpreitzer> Should I start with responding to the latest thing on the ML (from Zane), or start from scratch?
15:05:19 <Subbu> thanks garyk
15:05:24 <garyk> MikeSpreitzer: I am fine with that. Not sure if others are up to speed or have been following the list.
15:05:31 <garyk> Maybe it is best to start from the beginning
15:05:37 <MikeSpreitzer> OK, I'll start from the beginning.
15:06:12 <garyk> Great
15:06:20 <MikeSpreitzer> I am interested in holistic scheduling. By that I mean the idea of a scheduler that can look at a whole template/pattern/topology and make a joint decision about all the resources in it.
15:06:40 <MikeSpreitzer> I do not mean that this thing *has* to make all the decisions, but it should have the opportunity.
15:06:55 <MikeSpreitzer> I mean a richer notion of pattern than CFN has today.
15:07:16 <MikeSpreitzer> A pattern should have internal grouping, with various sorts of policy and relationship statements attached.
15:07:42 <garyk> I agree that the scheduler should have a complete picture of all of the resources
15:07:50 <MikeSpreitzer> (that's the response to Zane's main complaint. This richer information gives the holistic scheduler information to use, rather than requiring mind reading)
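(For illustration: the kind of pattern MikeSpreitzer describes might look like the sketch below. The grouping/policy schema is hypothetical, not CFN, not Heat, and not his group's actual language; it only shows what "internal grouping with policy and relationship statements" could carry.)

    # Hypothetical pattern with internal grouping and attached policies,
    # expressed as plain Python data. Every key name here is invented.
    pattern = {
        "groups": {
            "web_tier": {
                "members": ["web1", "web2", "web3"],
                # spread members across racks for availability
                "policies": [{"type": "anti-collocation", "level": "rack"}],
            },
            "db_tier": {
                "members": ["db1", "db2"],
                "policies": [{"type": "anti-collocation", "level": "host"}],
            },
        },
        "relationships": [
            # keep the web tier close to the database tier to bound latency
            {"type": "network-proximity", "from": "web_tier", "to": "db_tier"},
        ],
    }

A holistic scheduler handed the whole structure can decide all placements jointly; nothing here forces it to, which matches the "has the opportunity" caveat above.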
15:08:29 <MikeSpreitzer> Not sure how much you want to hear about what my group has working, so I'll go on for now.
15:08:29 <garyk> MikeSpreitzer: Gilad and I tried to broach this with the VM ensembles.
15:08:48 <garyk> MikeSpreitzer: all ears
15:09:41 <MikeSpreitzer> My group is doing stuff like this, but not integrated with heat; our current holistic controller is a client of Nova, Cinder, etc., but of slightly extended versions of them, to give us the visibility and control we need.
15:10:12 <MikeSpreitzer> We have worked out an example of a template for IBM Connections, which is a complicated set of apps based on our appserver products, and are working on examples based on Hadoop.
15:10:48 <MikeSpreitzer> Anyway, once the holistic scheduler has made the decisions it is going to make, the next step is infrastructure orchestration.
15:11:18 <MikeSpreitzer> That is the business of invoking the specific resource services to put the decisions already made into effect and pass the remaining bits of the problem along.
15:11:54 <MikeSpreitzer> This is the main job of today's heat engine, and I see no reason to use something else for this part.
15:12:50 <garyk> Can you please explain why it does the orchestration part?
15:12:52 <MikeSpreitzer> I am also interested in other ways of doing software orchestration. I have colleagues who want to promote a technique that has a non-trivial preparatory stage, and then at runtime in the VMs etc. all the dependencies are handled primarily by software running there.
15:13:09 <MikeSpreitzer> garyk: which "it"?
15:13:17 <garyk> I can understand that the scheduler should see the entire 'application' that is going to be deployed, but that can be done without integration with heat
15:13:48 <MikeSpreitzer> I think it is awkward to have today's heat engine upstream of holistic infrastructure scheduling
15:14:05 <MikeSpreitzer> today's heat engine breaks a whole template up into individual resource calls and makes those…
15:14:17 <MikeSpreitzer> Not very natural to pass a whole template downstream from that.
15:14:41 <garyk> if I understand correctly, heat has a template. If the template can create logical 'links' between the entities, and these are passed to the scheduler, then it can be separate
15:14:43 <MikeSpreitzer> Also, kind of pointless to break up the template before the holistic scheduling.
15:15:20 <garyk> I think that at the moment heat does things sequentially (I may be wrong here)
15:15:37 <MikeSpreitzer> Sure, if the scheduler sees all the resources and links, that's what is needed.
15:16:26 <MikeSpreitzer> There is no infrastructure orchestration to be done before holistic scheduling, and there *is* infrastructure orchestration to be done after holistic scheduling.
15:16:41 <garyk> That was our initial goal with the VM ensembles. We failed to convince people that it was the right way. The piecemeal approach was to use the instance groups
15:17:04 <MikeSpreitzer> What was the sticking point?
15:17:30 <garyk> I think that we did not manage to define the API well enough.
15:17:54 <garyk> In addition to this, there were schedulers being developed for all of the different projects
15:18:01 <MikeSpreitzer> My group has been using a pretty simple API, with a rich template/pattern/topology language
15:18:42 <MikeSpreitzer> Sure, there should be schedulers for smaller scopes.
15:18:49 <Yathi> Currently the resources are limited to the individual services (projects); there is definitely a need for some kind of global state repository (which I can explain later with my high-level vision)
15:19:26 <Yathi> this global state repository can feed any of the services
15:19:29 <MikeSpreitzer> Yathi: yes, a repo as well as decision making. The repo raises Boris' issues, which are relevant to all schedulers.
15:19:42 <garyk> Yathi: that is very interesting. How would this get the information from the various sources?
15:19:53 <garyk> MikeSpreitzer: how did you guys address that?
15:20:11 <MikeSpreitzer> I can tell you how we do it now, but like you guys, we are not satisfied...
15:20:23 <MikeSpreitzer> I think there is room for improvement here, but it does not change the overall picture.
15:20:39 <Yathi> An attempt to get there has been started by the blueprint proposed by Boris
15:20:43 <garyk> hopefully with the community we can improve things :)
15:20:49 <Yathi> In-memory state
15:21:32 <Yathi> https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
15:21:33 <MikeSpreitzer> Our current approach is anchored in a database, and not fully built out the way we have already decided we want. We are also interested in moving to something that is based in memory. This stuff is all a cache; the hard state is in lower layers.
15:21:38 <garyk> #info https://review.openstack.org/#/c/45867 (this is the review mentioned ^)
15:22:12 <Yathi> yeah, that is something that can eventually address getting a global state repository
15:22:29 <garyk> So we all seem to be aligned on the fact that we need to cache the information locally (in memory)
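(For illustration: the actual design lives in the no-db-scheduler blueprint and the review linked above; the sketch below is not that code. It only shows the general shape of the idea, an in-memory cache of host state that a scheduler consults, with the hard state staying in the services underneath.)

    # Rough sketch of an in-memory host-state cache. Names and fields are
    # invented for illustration; the authoritative state stays in the
    # services below, so stale entries can simply be dropped and rebuilt.
    import threading
    import time

    class HostStateCache:
        def __init__(self, ttl_seconds=60):
            self._ttl = ttl_seconds
            self._lock = threading.Lock()
            self._hosts = {}  # host name -> (last update time, state dict)

        def update(self, host, free_ram_mb, free_disk_gb, vcpus_free):
            # Called whenever a compute node reports fresh capacity data.
            with self._lock:
                self._hosts[host] = (time.time(), {
                    "free_ram_mb": free_ram_mb,
                    "free_disk_gb": free_disk_gb,
                    "vcpus_free": vcpus_free,
                })

        def snapshot(self):
            # Return only entries still fresh enough to schedule against.
            now = time.time()
            with self._lock:
                return {h: s for h, (ts, s) in self._hosts.items()
                        if now - ts < self._ttl}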
15:22:54 <MikeSpreitzer> OK, so let's get back to the reasons for rejection earlier.
15:23:03 <garyk> I think that one of the major challenges is how this information is shared between the hosts, services and scheduler
15:23:11 <MikeSpreitzer> I do not see a conflict with the fact that individual services have their own schedulers. What was the problem?
15:23:18 <Yathi> I can explain later.. but in my document #info https://docs.google.com/document/d/1IiPI0sfaWb1bdYiMWzAAx0HYR6UqzOan_Utgml5W1HI/edit?pli=1 I try to put together the bits required for smart resource placement
15:23:51 <MikeSpreitzer> OTOH, I do see a conflict...
15:24:13 <MikeSpreitzer> If you think the only place to put smarts is in an individual service, then that's a conflict.
15:25:03 <garyk> MikeSpreitzer: true
15:25:03 <MikeSpreitzer> But I think we are liking the idea of enabling joint decision-making.
15:25:31 <garyk> MikeSpreitzer: I think that there are a number of people who like and support that idea
15:25:35 <MikeSpreitzer> garyk: enough of my conjecture, can you elaborate on the objection wrt service schedulers?
15:26:45 <alaski> I'm late to the meeting and just catching up, but I'm very interested in what extra data needs to be exposed from nova/cinder/etc for holistic scheduling and what resource placement control is needed
15:26:52 <garyk> MikeSpreitzer: it was a tough one. There were people who felt like it was part of heat
15:27:27 <garyk> our point was that the scheduler needed to see all of the information, and heat was not able to do that
15:27:29 <MikeSpreitzer> alaski: For compute, we added visibility into the physical hierarchy, so we can make rack-aware decisions.
15:27:36 <alaski> I think the sticking points for earlier efforts were partly due to the focus on holistic scheduling before discussing what each service needs to provide and accept
15:28:33 <MikeSpreitzer> I think that, at least for private clouds, there is a simple general rule: you may think you are at the top of the heap, but you are not. Enable a smarter client with a bigger view to make decisions.
15:29:18 <garyk> it is a chance to provide preferential services for applications
15:29:19 <alaski> right. I've heard very little objection to that, the devil is in the details
15:29:24 <MikeSpreitzer> Nova today allows its client to direct placement. We added visibility of the physical hierarchy, so a smarter client can decide where VMs go.
15:29:49 <MikeSpreitzer> We also worked out a way to abuse Cinder volume types to direct placement.
15:30:05 <MikeSpreitzer> We are currently cheating on the visibility for Cinder; we would prefer that Cinder have a real solution.
15:30:28 <MikeSpreitzer> For network, we are moving from something more proprietary to something OpenDaylight based.
15:30:40 <alaski> so as far as Nova direct placement goes, I'm very much in favor of removing scheduler hints. I want a placement api, but I think it needs to be redone with an idea of what we want from it
15:30:59 <MikeSpreitzer> We currently use a tree-shaped abstraction for network. That is admittedly a serious abstraction; it is an open question how well it will work.
15:32:00 <garyk> alaski: the placement api is a good start. Could the instance groups be an option?
15:32:16 <MikeSpreitzer> So the kind of visibility that I think is needed is an abstract statement of the topology and capacity of the physical resources (we tend to use the word "containers", but not to mean LXC, rather as a general term for things that can host virtual resources).
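(For illustration: the "containers" abstraction belongs to MikeSpreitzer's group's internal system, so the sketch below is only a guess at its general shape, a tree of things that can host virtual resources, each with a kind and a capacity, which is enough for a client to make rack-aware decisions. All names and fields are invented.)

    # Speculative sketch of a tree-shaped "containers" abstraction.
    class Container:
        def __init__(self, name, kind, capacity=None):
            self.name = name          # e.g. "rack-1" or "rack-1-host-2"
            self.kind = kind          # "datacenter" | "rack" | "host"
            self.capacity = capacity  # e.g. {"ram_mb": 131072}; None for inner nodes
            self.children = []

        def add(self, child):
            self.children.append(child)
            return child

        def hosts(self):
            # The leaves are the things VMs can actually land on.
            if not self.children:
                return [self]
            return [h for c in self.children for h in c.hosts()]

    # Two racks with two hosts each; a client that sees this tree can pick
    # hosts from different racks to satisfy a rack-level anti-collocation policy.
    dc = Container("dc-east", "datacenter")
    for r in ("rack-1", "rack-2"):
        rack = dc.add(Container(r, "rack"))
        for i in (1, 2):
            rack.add(Container("%s-host-%d" % (r, i), "host",
                               capacity={"ram_mb": 131072}))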
15:33:01 <garyk> MikeSpreitzer: in a public cloud, how much information do you want to provide to the end users? At the end of the day they just want to be guaranteed the service that they are paying for
15:33:24 <MikeSpreitzer> Yes, it's pretty different in a public cloud.
15:33:33 <alaski> garyk: I think instance groups is a good start, but I don't know if it's rich enough to stop there
15:33:38 <Yathi> I think we are on a common theme of smarter resource placement, and I would like to present the high-level vision document that I shared.. and connect it to some efforts being undertaken
15:33:50 <Yathi> including Instance groups
15:33:53 <garyk> alaski: agreed. It is very primitive at the moment
15:34:13 <garyk> Yathi: agreed. Can you elaborate?
15:34:18 <MikeSpreitzer> Yes, I think instance groups falls short of what we need.
15:34:51 <MikeSpreitzer> There are things to say about public cloud, but I will listen to Yathi first.
15:34:52 <Yathi> Ok.. the idea is to start with business rules/policies as stated by tenants - leading to smart resource placement
15:35:11 <Yathi> with a view of all the datacenter resources, via a global state repository
15:35:34 <Yathi> but with the main decision making of resource placement handled by a smart constraint-based resource placement engine
15:35:55 <Yathi> this ties together several proposed efforts and blueprints
15:36:12 <Yathi> the instance groups effort - should evolve to support policies / business rules
15:36:32 <Yathi> and these should transform into some form of constraints to be used by the decision engine
15:37:04 <Yathi> Boris's in-memory efforts should evolve to provide a global state repository giving a view of all the resources
15:37:35 <Yathi> and the new work (for which I added POC code) - an LP-based solver scheduler - should handle the decision making
15:37:54 <garyk> Yathi: can you please post the link to the code you posted?
15:38:00 <Yathi> the actual orchestration or the placement of the VMs can be done using existing mechanisms
15:38:24 <Yathi> #link https://review.openstack.org/#/c/46588/
15:38:45 <Yathi> so the general idea is presented in this doc - #link https://docs.google.com/document/d/1IiPI0sfaWb1bdYiMWzAAx0HYR6UqzOan_Utgml5W1HI/edit?pli=1
15:39:06 <Yathi> the idea is that this should be backward compatible and hence non-disruptive
15:39:10 <Yathi> works with the current Nova
15:39:49 <garyk> my concern is that we all seem to have great ideas about how to do the backend implementations, but the user- and admin-facing APIs are our Achilles heel
15:39:53 <Yathi> using a PULP solver module that I added code for in Nova, and that I ran instead of the FilterScheduler
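(For illustration: Yathi's actual POC is in the review linked above; the sketch below is not that code. It only demonstrates the technique he names, an LP/MIP placement model written with the PULP library, here assigning VMs to hosts under RAM constraints while minimizing the number of hosts used.)

    # Tiny PULP placement model, illustrative only. Requires the "pulp"
    # package (pip install pulp), which bundles the CBC solver.
    import pulp

    vms = {"vm1": 2048, "vm2": 4096, "vm3": 2048}   # required RAM (MB)
    hosts = {"hostA": 8192, "hostB": 4096}          # free RAM (MB)

    prob = pulp.LpProblem("vm_placement", pulp.LpMinimize)

    # place[v][h] == 1 iff VM v lands on host h; used[h] == 1 iff h is used.
    place = pulp.LpVariable.dicts("place", (vms, hosts), cat="Binary")
    used = pulp.LpVariable.dicts("used", hosts, cat="Binary")

    # Objective: consolidate onto as few hosts as possible.
    prob += pulp.lpSum(used[h] for h in hosts)

    # Every VM is placed exactly once.
    for v in vms:
        prob += pulp.lpSum(place[v][h] for h in hosts) == 1

    # Respect each host's RAM capacity, and mark used hosts.
    for h in hosts:
        prob += pulp.lpSum(vms[v] * place[v][h] for v in vms) <= hosts[h] * used[h]

    prob.solve()
    for v in vms:
        for h in hosts:
            if place[v][h].varValue == 1:
                print("%s -> %s" % (v, h))

A filter-scheduler-style driver scores hosts one request at a time; the point of the solver approach is that constraints spanning many VMs are decided jointly.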
15:40:24 <Yathi> user-facing APIs are the ones we should reach a common agreement on
15:40:46 <alaski> garyk: agreed. In order to get something in, there will need to be consensus among the projects, who don't care what makes the placement decisions. They care what they need to expose
15:41:02 <Yathi> The instance group blueprint brought in the concept of policies
15:41:02 <alaski> so that needs to be figured out and added to the various projects
15:41:06 <MikeSpreitzer1> I think the user-facing API can be pretty simple; I think the pattern/template/topology language is where the action is
15:41:18 <garyk> alaski: agreed
15:41:27 <Yathi> that is something we want to evolve to transform into constraints to be used by a solver engine
15:41:27 <MikeSpreitzer1> Well, there are APIs at various levels
15:42:00 <MikeSpreitzer1> I think the infrastructure-level APIs need to expose sufficient visibility and control; the whole-pattern layers can have simple APIs but need a rich pattern language.
15:42:22 <garyk> what about the idea that we divide it up into 3 parts:
15:42:29 <garyk> 1. the user-facing APIs
15:42:39 <garyk> 2. the information required from all of the services
15:42:48 <garyk> 3. backend scheduling
15:43:03 <Yathi> garyk: if you read my document - these are exactly the three points :)
15:43:11 <garyk> if we can define the relationships between these (I guess with APIs)
15:43:12 <Yathi> you read my mind!
15:43:13 <MikeSpreitzer1> garyk: which are the "user facing" APIs? Which scheduling is the "backend"?
15:43:18 <garyk> Yathi: I have yet to read it
15:43:30 <Yathi> ok
15:43:42 <MikeSpreitzer1> Yathi: I have skimmed it, will review more carefully
15:44:39 <Yathi> okay, it presents a high-level vision of the necessary efforts, the relationship to some of the existing proposed blueprints, and then some additional details on the actual DECISION engine - which makes resource placement decisions
15:45:06 <Yathi> this is work-in-progress, design-in-progress, and something we want to discuss in detail at the summit, also face-to-face
15:45:32 <Yathi> but it is dependent on other blueprints and hence a big collaborative effort
15:45:32 <MikeSpreitzer1> sounds good. BTW, will anybody be around before/after the official summit for informal discussions?
15:45:50 <garyk> MikeSpreitzer1: hopefully
15:45:58 <PaulMurray> how much before or after?
15:46:06 <PaulMurray> but yes, a bit
15:46:08 <MikeSpreitzer1> Not much.
15:46:18 <Yathi> Please do review and post your feedback, and we can continue the discussion again
15:46:23 <garyk> Maybe we could all meet for lunch or breakfast one day to discuss
15:46:34 <Yathi> that would be a good idea
15:46:47 <PaulMurray> I would like to be there for that
15:46:54 <MikeSpreitzer1> I'm not sure how far this goes before it becomes a process error.
15:47:17 <garyk> MikeSpreitzer1: not sure I understand
15:47:36 <MikeSpreitzer1> Is there any problem with organizing an extra summit?
15:47:56 <MikeSpreitzer1> I'm still new here, learning the rules
15:48:03 <MikeSpreitzer1> already made some mistakes, sorry about that
15:48:18 <PaulMurray> Mike
15:48:22 <PaulMurray> oops
15:48:30 <garyk> At the summit hopefully we'll have a few slots to discuss the scheduling. The PTL will need to allocate time
15:48:56 <Yathi> unconference sessions?
15:49:05 <garyk> At the last summit russellb gave us a lot of sessions. We met before and aligned the presentations
15:49:26 <garyk> I think that this time we should also meet before. Syncing it all will be a challenge
15:49:34 <PaulMurray> If that is the objective, it is a good idea
15:50:21 <garyk> I just think that it is very important for us to try and make the most of the time that we get.
15:50:38 <garyk> Conveying the ideas and getting the community's support is a challenge
15:50:50 <MikeSpreitzer1> Yes. But at some point we have to dig into details; that is sometimes necessary to get real agreement.
15:51:19 <garyk> MikeSpreitzer1: true.
15:51:40 <MikeSpreitzer1> I am a big fan of reading and writing. But time for discussion is needed too.
15:51:53 <garyk> In the last two summits with Neutron there was great collaboration on LBaaS and FWaaS. Maybe we need to follow the same model
15:52:11 <MikeSpreitzer1> Can you elaborate?
15:52:12 <garyk> That is, set up a few meetings and get all of our information into a google doc (or etherpad)
15:52:31 <MikeSpreitzer1> yes, that sounds good. Good writeup and reading beforehand, detailed discussion.
15:52:34 <garyk> Then when we come to the summit we can present the details and get input from the community
15:52:43 <garyk> MikeSpreitzer1: exactly
15:53:06 <garyk> boris-42: and Yathi: have two implementations
15:53:19 <Yathi> Okay, I think we have already started some of this process in our etherpad
15:53:29 <garyk> I still think that we need the documentation to have the idea from A to Z. Then we can slice up the cake/pie
15:53:30 <Yathi> and we have added some POC code to demo and discuss
15:53:35 <boris-42> garyk I will try to find some time
15:53:41 <boris-42> garyk to update our docs and etherpad
15:53:43 <garyk> http://9gag.com/gag/adN9Mp9 (sorry, I could not resist)
15:54:26 <MikeSpreitzer1> the cake is a lie
15:54:26 <Yathi> funny!
15:54:38 <garyk> Does someone want to take the initiative and start to prepare a document for the APIs?
15:55:11 <MikeSpreitzer1> garyk: which APIs? (which level?)
15:55:47 <garyk> I think the 3 parts - user/admin; information required from the services; and the scheduling engine
15:56:06 <MikeSpreitzer1> I am interested in working on that.
15:56:32 <MikeSpreitzer1> Not sure what I can promise, but I realize it has to be done long enough before the summit to allow careful reading.
15:56:53 <garyk> MikeSpreitzer1: great. I'd be happy to work with you on that too. I am a bit pressed for time in the coming two weeks, but after that I will have some free cycles
15:56:59 <Yathi> for the scheduling engine API part, my code relied upon something existing
15:57:07 <Yathi> my POC code, I mean
15:57:21 <Yathi> something based on what the FilterScheduler uses
15:57:28 <garyk> Yathi: cool.
15:57:43 <garyk> which is good for backward compatibility (and very important)
15:57:53 <Yathi> but this should interface with the new ideas of a "global state repo", and the tenant-facing APIs
15:57:57 <garyk> How about we decide next week on how we want to proceed?
15:58:09 <MikeSpreitzer1> OK
15:58:14 <Yathi> sure
15:58:33 <Yathi> please review the POC code and the doc I shared links to on the etherpad
15:58:44 <MikeSpreitzer1> yep
15:58:45 <garyk> I'll try.
15:58:53 <Yathi> the POC code doesn't pass unit tests because of the dependency on PULP
15:59:05 <Yathi> I will need to figure out how to make them pass
15:59:12 <Yathi> not for discussion in this forum, sorry
15:59:34 <garyk> so I guess that we'll meet next week.
15:59:38 <garyk> thanks guys
15:59:50 <garyk> #endmeeting