14:03:11 <n0ano> #startmeeting nova-scheduler
14:03:12 <openstack> Log:            http://eavesdrop.openstack.org/meetings/nova_meeting/2015/nova_meeting.2015-10-05-14.01.log.html
14:03:14 <openstack> Meeting started Mon Oct  5 14:03:11 2015 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:18 <openstack> The meeting name has been set to 'nova_scheduler'
14:03:26 <edleafe> third time's a charm
14:03:39 <n0ano> OK, what else can I screw up this morning
14:03:40 * bauzas waves again
14:03:49 <n0ano> bauzas, waves back
14:04:29 <n0ano> #topic Mitaka planning
14:04:52 <n0ano> So, I read the two specs pointed out last week and they're a good start
14:05:30 <n0ano> the one, https://review.openstack.org/#/c/192260/8 , scheduler plans, mainly talks about stuff we know and are doing
14:05:58 <bauzas> that's technically a backlog spec and a devref change, but anyway :)
14:06:09 <n0ano> the other, https://review.openstack.org/#/c/191914/6 , parallel scheduler for V2, I'm concerned might be overkill
14:07:04 <n0ano> to me, we've all said all along that DB access is the problem, but I don't know that we've measured exactly how bad it is, especially with the caching scheduler that we have
14:07:25 <n0ano> I'd like to see some real performance numbers before we try to do major changes
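For illustration, a minimal sketch of the kind of measurement n0ano is asking for: comparing a per-request DB fetch against a caching-scheduler-style in-memory view. All names, numbers, and delays here are hypothetical stand-ins, not Nova's actual interfaces.

```python
import random
import time

HOSTS = [{"name": "host%d" % i, "free_ram_mb": random.randint(2048, 65536)}
         for i in range(1000)]

def get_host_states_from_db():
    # Stand-in for a per-request DB query; the sleep simulates round-trip cost.
    time.sleep(0.002)
    return HOSTS

def get_host_states_cached(cache=[]):
    # Stand-in for the caching scheduler's periodically refreshed view;
    # the mutable default argument is a deliberate memo shared across calls.
    if not cache:
        cache.extend(HOSTS)
    return cache

def schedule(get_states, ram_mb=2048):
    # Pick the host with the most free RAM that fits the request.
    hosts = [h for h in get_states() if h["free_ram_mb"] >= ram_mb]
    return max(hosts, key=lambda h: h["free_ram_mb"])

for fetch in (get_host_states_from_db, get_host_states_cached):
    start = time.perf_counter()
    for _ in range(100):
        schedule(fetch)
    print(fetch.__name__, time.perf_counter() - start)
```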
14:07:41 <bauzas> n0ano: the parallel scheduler is actually needed for the cells effort
14:08:09 <bauzas> n0ano: just from a design-tenets PoV, it makes sense even without figures
14:08:25 <n0ano> bauzas, so it's a functional need with maybe a performance benefit
14:08:56 <bauzas> n0ano: it's more a scalability feature than a performance feature if you prefer
14:09:35 <bauzas> I'm a little bit concerned by the word 'parallel', I would have preferred 'distributed', but that's fine
14:10:04 <bauzas> the thing is, we need to document what our christmas list is
14:10:18 <n0ano> a closely related thing in this environment: if the scheduler were fast enough you wouldn't need to distribute it, but that's kind of nitpicky
14:10:24 <bauzas> not exactly discussing how we'll build the super train that daddy bought us
14:10:53 <edleafe> christmas shopping in october? bleh
14:10:59 <bauzas> n0ano: that's debatable
14:11:11 <n0ano> I agree, let's get the current stuff completed before we get distracted by the new shiny
14:11:22 <bauzas> edleafe: speaking of that, our malls are now full of advent calendars
14:11:40 <n0ano> bauzas, I agree and I'm willing to not debate it right now
14:12:11 <n0ano> bauzas, they mail you those, we have to buy them at the grocery store :-)
14:12:15 <bauzas> n0ano: sure, it's just about snapshotting a necessary move and the ideas behind it
14:12:40 <bauzas> not yet discussing which one to pick
14:13:23 <bauzas> n0ano: heh, our calendars are lego-ones, so I guess it's why - the chicken stopped producing chocolate ones like years ago
14:13:42 <bauzas> anyway, we're digressing
14:14:08 <n0ano> bauzas, heresy, chocolate is a requirement :-) but ignoring that
14:15:03 <n0ano> I'd like to kind of keep us focused, to me the progression is:
14:15:14 <n0ano> 1) finish the API clean up...
14:15:21 <n0ano> 2) split out the scheduler...
14:15:38 <n0ano> 3) consider performance/scalability
14:15:53 <n0ano> if we try and do too many of those at the same time nothing will happen
14:16:13 <bauzas> the #2 is still debatable :)
14:16:20 <johnthetubaguy> the problem is claims I think, did we work out how that impacts the scheduler API yet?
14:16:29 <bauzas> while the #3 is a benefit anyway :)
14:16:48 <bauzas> johnthetubaguy: that's a good point
14:17:07 <bauzas> johnthetubaguy: a distributed scheduler needs to address how to claim properly
14:17:09 <johnthetubaguy> do we have a concrete plan there yet? claims wise I mean
14:17:20 <n0ano> I think 3) will be much more doable after 2) (ignoring the cross-project benefits of 2)), so it's still a priority to me
14:17:22 <bauzas> johnthetubaguy: nothing we agreed
14:17:33 <johnthetubaguy> there was some talk of moving that into the scheduler, mostly for the parallel bit
14:17:48 <bauzas> johnthetubaguy: right, that's why I'd like to address #3 before #2
14:17:54 <johnthetubaguy> n0ano: my worry is getting (1) completed, it's hard to evolve that API once it's split out
14:18:24 <edleafe> johnthetubaguy: +1
14:18:30 <johnthetubaguy> the parallel bit is more about availability than speed, if we get it working well enough, FWIW
14:18:36 <n0ano> johnthetubaguy, APIs can change, that's not impossible, and it's probably better that it requires thought to make a change
14:18:45 <edleafe> I would like an API that isn't nova-specific
14:18:53 <edleafe> otherwise, what's the point of a split?
14:19:18 * bauzas feels we discussed that a couple of times before :)
14:19:32 <johnthetubaguy> so there is a chicken-and-egg thing here, honestly, both *could* be made to work, it's a case of working out the trade-offs
14:19:37 <n0ano> I'm with bauzas, I thought it was pretty generic
14:19:53 <bauzas> don't get me wrong, here is my take
14:20:16 <bauzas> #1 we know that we should address the distributed thing, just because we're blocking cells v2 at least
14:20:43 <bauzas> #2 we know we should consider heterogeneous resources provided to the scheduler
14:20:57 <bauzas> #3 we haven't yet agreed on whether to split and how
14:21:11 <bauzas> that's what I considered the consensus
14:21:46 <n0ano> in re #3 - my understanding is we did agree: clean up the APIs through the current effort and then do the mechanics of a split
14:22:49 <n0ano> in re #2 - are heterogeneous resources a problem with the current design?
14:22:50 <bauzas> so the deal was to fix the API and discuss whether we split and how
14:23:19 <bauzas> n0ano: http://lists.openstack.org/pipermail/openstack-dev/2015-September/075403.html
14:23:24 <bauzas> to answer your question
14:23:57 <johnthetubaguy> so I thought we agreed: get the APIs sorted, then look again at the split, but I don't think it's worth fixating on the difference, or lack thereof, between those two positions
14:24:25 <n0ano> johnthetubaguy, +1
14:25:38 <johnthetubaguy> volume capacity, IP capacity and how they relate to compute capacity is an age-old issue here really, it would be good to get that fixed up
14:26:07 <johnthetubaguy> availability zones and relating different pools of resources is certainly a common requirement
14:27:01 <bauzas> johnthetubaguy: so I guess you better explain my position, because I +1
14:27:04 <n0ano> johnthetubaguy, to me those capacities are just metrics (e.g. numbers), from a scheduler perspective that's pretty simple - how you measure them is not so simple
14:27:31 <bauzas> what I'm trying to explain is that we agreed on refactoring the APIs and reconsidering, once that's done, whether it's necessary to split
14:28:02 <bauzas> but in the meantime, there are many other topics coming in, and I'm really not convinced that splitting could just solve all our problems
14:28:34 <edleafe> bauzas: splitting by itself doesn't get us any improvement
14:28:39 <n0ano> it won't solve our problems but I do believe it will make working on a lot of them easier
14:28:40 <bauzas> edleafe: exactly
14:28:47 <edleafe> cleaning up so that a split *could* happen does
14:28:51 <johnthetubaguy> n0ano: so this is more about error handling
14:29:18 <n0ano> johnthetubaguy, not following you
14:29:18 <johnthetubaguy> say you pick where the volume goes, or where the compute goes, and that means you can only get some of your resources, you need to pick something else
14:29:54 <johnthetubaguy> it's logically separate pools of resources you need to claim, that have a dependency relationship described in their metrics
14:30:18 <johnthetubaguy> so the request spec would be for compute, volume and networking resources all together, in an extreme case
14:30:40 <n0ano> the way we currently work you wouldn't pick a host unless it satisfied all of the resource requirements, the scheduler just has to know about all of those resources
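As a rough illustration of the rule n0ano describes, a host is a candidate only if it satisfies every requested resource; the resource keys below are illustrative stand-ins, not Nova's real metric names.

```python
def host_satisfies(host_free, requested):
    """Return True only if every requested resource fits on the host."""
    return all(host_free.get(res, 0) >= amount
               for res, amount in requested.items())

host = {"vcpus": 8, "ram_mb": 16384, "disk_gb": 200}
request = {"vcpus": 2, "ram_mb": 4096, "disk_gb": 40}
print(host_satisfies(host, request))  # True: every requirement fits
```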
14:30:51 <bauzas> that's why the plan was to clean up our APIs first, then identify what could be needed for cross-project scheduling, then identify how to provide those and only by then, decide whether we split or just add another endpoint
14:31:21 <johnthetubaguy> it's about picking a compute host, and a volume AZ, and a neutron network segment that are all able to give you a claim, and retrying if not, right?
14:31:37 <johnthetubaguy> it's multiple related items, it's not just picking a compute node at this point
14:31:44 <edleafe> johnthetubaguy: and doing it in a non-racy way
14:31:56 <bauzas> edleafe: that's where I'm cautious
14:32:04 <bauzas> edleafe: I mean, we need retries
14:32:19 <edleafe> bauzas: yes, we will always need them with the current approach
14:32:28 <johnthetubaguy> edleafe: well, optionally, yes, claims would help move the retries inside the scheduler, at the expense of a quick choice
14:32:36 <edleafe> but we should improve things so that they are kept to a minimum
14:33:14 <johnthetubaguy> we need to be more prepared to offer choice here, being less racy will be crazy important for some users, and a big slowdown for other users, it depends on your needs and resource usage patterns really
14:33:15 <bauzas> are we looping back?
14:33:17 <n0ano> I think we're in violent agreement, retries are necessary but if we do too many of them we have a problem.
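A minimal sketch of the claim-and-retry pattern under discussion, assuming a hypothetical in-memory host state rather than Nova's actual claim code: each attempt picks the best candidates from the current view and falls back if a claim races with another scheduler.

```python
import threading

class HostState:
    def __init__(self, name, free_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb
        self._lock = threading.Lock()

    def try_claim(self, ram_mb):
        # Atomic test-and-decrement; returns False if a concurrent
        # claim consumed the capacity first.
        with self._lock:
            if self.free_ram_mb < ram_mb:
                return False
            self.free_ram_mb -= ram_mb
            return True

def schedule_with_retries(hosts, ram_mb, max_retries=3):
    for attempt in range(max_retries):
        # In a real system the host view would be refreshed between
        # attempts; here we just re-sort the shared in-memory state.
        candidates = sorted((h for h in hosts if h.free_ram_mb >= ram_mb),
                            key=lambda h: -h.free_ram_mb)
        for host in candidates:
            # A claim can fail under concurrent schedulers; this toy
            # single-threaded run will succeed on the first candidate.
            if host.try_claim(ram_mb):
                return host
    raise RuntimeError("no valid host after %d retries" % max_retries)

hosts = [HostState("host1", 4096), HostState("host2", 8192)]
print(schedule_with_retries(hosts, 2048).name)  # host2
```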
14:33:40 <johnthetubaguy> I am just trying to define the problem for the multi resource pool scheduling here
14:34:10 <johnthetubaguy> it's been a long-standing requirement that seems to be getting more important, rather than less important
14:34:39 <bauzas> johnthetubaguy: that's why your backlog spec is worthwhile
14:34:52 <bauzas> and that's why I'd like to consider it before splitting
14:34:54 <n0ano> johnthetubaguy, do you know if anyone has written up anything about this (multi-resource pools)?
14:34:55 <johnthetubaguy> I probably should create a different one for this issue
14:35:11 <johnthetubaguy> n0ano: there have been a few ML posts and things, not seen anything written up
14:35:28 <bauzas> there is a spec
14:35:31 <bauzas> from jay
14:35:32 <bauzas> sec
14:35:34 <johnthetubaguy> n0ano: basically volume must be local to compute AZ, IP capacity must be local to compute AZ
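An illustrative take on that locality constraint, with made-up pool layouts and field names: once a compute AZ is chosen, only volume and network pools in the same AZ remain candidates.

```python
def pools_for_host(host_az, volume_pools, network_pools):
    """Return the volume and network pools co-located with the compute AZ."""
    volumes = [p for p in volume_pools if p["az"] == host_az]
    networks = [p for p in network_pools if p["az"] == host_az]
    return volumes, networks

volume_pools = [{"name": "vol-a", "az": "az1"}, {"name": "vol-b", "az": "az2"}]
network_pools = [{"name": "net-a", "az": "az1"}]
print(pools_for_host("az1", volume_pools, network_pools))
```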
14:35:46 <johnthetubaguy> bauzas: ah, cool, do you have a link for that one?
14:35:49 <edleafe> n0ano: I'm almost finished with my radical rewrite proposal, if that helps :)
14:35:57 <bauzas> https://review.openstack.org/#/c/225546/
14:36:02 <bauzas> johnthetubaguy: n0ano^
14:36:28 <johnthetubaguy> edleafe: please present that as an alternative scheduler that could be in tree, no throwing away what we have, for now
14:36:51 <johnthetubaguy> bauzas: ah, sweet
14:36:54 <edleafe> johnthetubaguy: understood, but no, there's no way it could be in tree
14:37:10 <edleafe> johnthetubaguy: I realize that it won't ever happen
14:37:22 <edleafe> I just want people to think about the root causes of our issues
14:37:41 <bauzas> jaypipes: around ?
14:38:28 <n0ano> bauzas, probably not, he would have jumped in by now
14:38:53 <bauzas> let's call him 5 times in front of a mirror
14:39:03 <bauzas> jaypipes jaypipes jaypipes jaypipes jaypipes
14:39:08 <johnthetubaguy> edleafe: so I have done a lot of experiments with a decent rack of servers, with belliott, that led to the caching scheduler and tuning the greenlet workers down; that's mostly what I am basing that parallel work on, but that idea still needs testing out, anyways
14:39:11 <n0ano> looks like we have a lot to discuss in Tokyo
14:39:20 <johnthetubaguy> so the key bit here is we have a spec up, so let's get it reviewed
14:39:26 <johnthetubaguy> #link https://review.openstack.org/#/c/225546/
14:39:38 <bauzas> johnthetubaguy: +1
14:39:43 <bauzas> that's an iterative process
14:39:59 <bauzas> we have johnthetubaguy's spec and devref, we also have jaypipes spec
14:42:02 <johnthetubaguy> so the parallel one is not very actionable right now, I think resource pools are more important at the moment
14:42:22 <johnthetubaguy> the claims piece still needs some specific solutions to be discussed, I feel
14:42:33 <bauzas> johnthetubaguy: that's a good question
14:42:45 <johnthetubaguy> honestly, once we have those two pieces, I will feel better about our API
14:42:58 <bauzas> johnthetubaguy: we haven't yet finished implementing the resource-object BP
14:43:01 <johnthetubaguy> by API I mean the scheduler interface
14:43:13 <johnthetubaguy> bauzas: very true, that needs review now, it's up for review, right?
14:43:22 <bauzas> johnthetubaguy: and that spec is actually an extension to the resource-objects BP
14:43:38 <johnthetubaguy> bauzas: totally agreed
14:43:52 <bauzas> johnthetubaguy: so yes, we can somehow identify that what we agreed (i.e. refactor our APIs) is still valid for Mitaka
14:44:34 <bauzas> johnthetubaguy: re: the resource-objects, I saw some patches from jaypipes but I guess he didn't have all the work ready
14:44:52 <bauzas> johnthetubaguy: at least, I can find the objects' creation, but not their usage
14:45:07 <n0ano> well, I think I need to distill all the different issues from this thread and propose a scheduler session in Tokyo to discuss them
14:45:25 <bauzas> well
14:45:32 <johnthetubaguy> so the deadline for session proposals is tomorrow I think
14:45:38 <johnthetubaguy> if I remember correctly
14:45:44 * bauzas looking
14:45:44 <n0ano> NP, I'll get it done today
14:45:56 <johnthetubaguy> we need something concrete to discuss ideally
14:46:49 <n0ano> we have some specific specs to discuss plus some more speculative stuff
14:46:54 <johnthetubaguy> I think we have some agreement on the list of issues; for me the list is: solid API, inc resource pools, inc claims in scheduler
14:47:24 <bauzas> quoting johnthetubaguy "The deadline for proposals will likely be Tuesday 6th October, 23.59 UTC,"
14:47:36 <bauzas> today EOB :)
14:48:07 <bauzas> johnthetubaguy: scheduler claims are part of the parallel (i.e. distributed) scheduler discussion I feel
14:48:17 <n0ano> bauzas, today's the 5th, that should be EOB tomorrow
14:48:28 <bauzas> right, we're on Monday
14:48:32 * bauzas facepalm
14:48:37 <bauzas> I thought we were Tuesday
14:48:40 <bauzas> anyway
14:48:44 <bauzas> so I was saying
14:49:01 <johnthetubaguy> bauzas: that's true, I am thinking we call out claims since they impact parallel and resource pools really
14:49:10 <bauzas> solid APIs is surely one thing to address (at least the missing bits considering that reqspec-obj is on its way)
14:49:41 <bauzas> distributed scheduling (incl. sched claims) is IMHO a second part to address
14:49:53 <bauzas> johnthetubaguy: but I see your point
14:50:41 <bauzas> what I'm a bit worried about is that deferring the necessary talk about a scaling-out scheduler would mean we'd defer cells v2
14:51:17 <bauzas> because we can hardly assume that one single scheduler could boil the ocean, er, the whole cloud
14:51:46 <bauzas> 8 mins to the end of that meeting also
14:52:03 <johnthetubaguy> so it only affects multi-cell v2
14:52:14 <johnthetubaguy> and that feels like it's release + 1 still
14:52:28 <johnthetubaguy> but ideally we would have a prototype ready during mitaka
14:52:33 <n0ano> bauzas, yeah, those concerns should be addressed at a session and no, we're running out of time
14:52:39 <bauzas> johnthetubaguy: erm, the idea is that the n-api would have one scheduler to address all cells
14:53:26 <bauzas> johnthetubaguy: but that's certainly debatable
14:53:41 <johnthetubaguy> bauzas: well it just works for the single cell case, it's just the same API as today
14:53:58 <n0ano> getting late guys, let's move on
14:54:04 <n0ano> #topic opens
14:54:34 <n0ano> so, only two weeks until Tokyo, do we want to meet next week & after or should we just re-convene at the summit?
14:54:57 <bauzas> johnthetubaguy: http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/cells-scheduling-interaction.html was what I was thinking about
14:55:03 <bauzas> but sure, we can move on
14:55:11 <johnthetubaguy> yeah, lets move on
14:55:33 <bauzas> n0ano: I can attend the next one, not the one before the Summit
14:55:38 <johnthetubaguy> bauzas: sharding doesn't affect the API really
14:55:56 <bauzas> (enjoying Tokyo with family, eh)
14:56:26 <bauzas> johnthetubaguy: sure
14:56:28 <n0ano> I'm willing to talk IRC next week and then defer to the summit, that's doable
14:57:03 <bauzas> I'd be interested in gathering feedback from cinder and neutron folks about what they'd like to send to us
14:57:26 <bauzas> given we're discussing https://review.openstack.org/#/c/225546/1/specs/mitaka/approved/resource-providers.rst,cm
14:57:38 <bauzas> but that's a bit premature
14:57:38 <n0ano> bauzas, me too, we asked about 2 summits ago and haven't gotten much back
14:58:31 <n0ano> well, I have to run (next meeting), tnx everyone, talk next week
14:58:35 <n0ano> #endmeeting