14:03:11 <n0ano> #startmeeting nova-scheduler
14:03:12 <openstack> Log: http://eavesdrop.openstack.org/meetings/nova_meeting/2015/nova_meeting.2015-10-05-14.01.log.html
14:03:14 <openstack> Meeting started Mon Oct 5 14:03:11 2015 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:03:18 <openstack> The meeting name has been set to 'nova_scheduler'
14:03:26 <edleafe> third time's a charm
14:03:39 <n0ano> OK, what else can I screw up this morning
14:03:40 * bauzas waves again
14:03:49 <n0ano> bauzas, waves back
14:04:29 <n0ano> #topic Mitaka planning
14:04:52 <n0ano> So, I read the two specs pointed out last week and they're a good start
14:05:30 <n0ano> the one, https://review.openstack.org/#/c/192260/8 , scheduler plans, mainly talks about stuff we know and are doing
14:05:58 <bauzas> that's technically a backlog spec and a devref change, but anyway :)
14:06:09 <n0ano> the other, https://review.openstack.org/#/c/191914/6 , parallel scheduler for V2, I'
14:06:23 <n0ano> s/I'/I'm concerned might be overkill
14:07:04 <n0ano> to me, we've all said all along that DB access is the problem, but I don't know that we've measured exactly how bad it is, especially with the caching scheduler that we have
14:07:25 <n0ano> I'd like to see some real performance numbers before we try to do major changes
14:07:41 <bauzas> n0ano: the parallel scheduler is actually needed for the cells effort
14:08:09 <bauzas> n0ano: just from a design-tenets PoV, it makes sense without figures
14:08:25 <n0ano> bauzas, so it's a functional need with maybe a performance benefit
14:08:56 <bauzas> n0ano: it's more a scalability feature than a performance feature, if you prefer
14:09:35 <bauzas> I'm a little bit concerned by the word 'parallel', I would have preferred 'distributed', but that's fine
14:10:04 <bauzas> the thing is, we need to document what our christmas list is
14:10:18 <n0ano> a closely related thing in this environment: if the scheduler was fast enough you wouldn't need to distribute it, but that's kind of nit-picky
14:10:24 <bauzas> not exactly discussing how we'll build the super train that daddy bought us
14:10:53 <edleafe> christmas shopping in october? bleh
14:10:59 <bauzas> n0ano: that's debatable
14:11:11 <n0ano> I agree, let's get the current stuff completed before we get distracted by the new shiny
14:11:22 <bauzas> edleafe: speaking of that, our malls are now full of advent calendars
14:11:40 <n0ano> bauzas, I agree and I'm willing to not debate it right now
14:12:11 <n0ano> bauzas, they mail you those, we have to buy them at the grocery store :-)
14:12:15 <bauzas> n0ano: sure, it's just about snapshotting a necessary move and the ideas behind it
14:12:40 <bauzas> not yet discussing which one to pick
14:13:23 <bauzas> n0ano: heh, our calendars are lego ones, so I guess that's why - the chicken stopped producing chocolate ones years ago
14:13:42 <bauzas> anyway, diverting
14:14:08 <n0ano> bauzas, heresy, chocolate is a requirement :-) but ignoring that
14:15:03 <n0ano> I'd like to kind of keep us focused; to me the progression is:
14:15:14 <n0ano> 1) finish the API clean up...
14:15:21 <n0ano> 2) split out the scheduler...
14:15:38 <n0ano> 3) consider performance/scalability
14:15:53 <n0ano> if we try to do too many of those at the same time, nothing will happen
14:16:13 <bauzas> the #2 is still debatable :)
14:16:20 <johnthetubaguy> the problem is claims I think, did we work out how that impacts the scheduler API yet?
14:16:29 <bauzas> while the #3 is a benefit anyway :)
14:16:48 <bauzas> johnthetubaguy: that's a good point
14:17:07 <bauzas> johnthetubaguy: a distributed scheduler needs to address how to claim properly
14:17:09 <johnthetubaguy> do we have a concrete plan there yet? claims-wise, I mean
14:17:20 <n0ano> I think 3) will be much more doable after 2) (ignoring the cross-project benefits of 2), so it's still a priority to me
14:17:22 <bauzas> johnthetubaguy: nothing we agreed on
14:17:33 <johnthetubaguy> there was some talk of moving that into the scheduler, mostly for the parallel bit
14:17:48 <bauzas> johnthetubaguy: right, that's why I'd like to address #3 before #2
14:17:54 <johnthetubaguy> n0ano: my worry is getting (1) completed, it's hard to evolve that API once it's split out
14:18:24 <edleafe> johnthetubaguy: +1
14:18:30 <johnthetubaguy> the parallel bit is more about availability than speed, if we get it working well enough, FWIW
14:18:36 <n0ano> johnthetubaguy, APIs can change, that's not impossible, and it's probably better that it requires thought to make a change
14:18:45 <edleafe> I would like an API that isn't nova-specific
14:18:53 <edleafe> otherwise, what's the point of a split?
14:19:18 * bauzas feels we discussed that a couple of times before :)
14:19:32 <johnthetubaguy> so there is a chicken-and-egg thing here; honestly, both *could* be made to work, it's a case of working out the trade-offs
14:19:37 <n0ano> I'm with bauzas, I thought it was pretty generic
14:19:53 <bauzas> don't get me wrong, here is my take
14:20:16 <bauzas> #1 we know that we should address the distributed thing, just because we're blocking cells v2 at least
14:20:43 <bauzas> #2 we know we should consider heterogeneous resources provided to the scheduler
14:20:57 <bauzas> #3 we never yet agreed on a split and how
14:21:11 <bauzas> that's what I considered the consensus
14:21:46 <n0ano> in re #3 - my understanding is we did agree: clean up the APIs through the current effort and then do the mechanics of a split
14:22:49 <n0ano> in re #2 - are heterogeneous resources a problem with the current design?
14:22:50 <bauzas> so the deal was to fix the API and discuss whether we split and how
14:23:19 <bauzas> n0ano: http://lists.openstack.org/pipermail/openstack-dev/2015-September/075403.html
14:23:24 <bauzas> to answer your question
14:23:57 <johnthetubaguy> so I thought we agreed: get the APIs sorted, then look again at the split, but I don't think it's worth fixating on the difference, or lack thereof, between those two positions
14:24:25 <n0ano> johnthetubaguy, +1
14:25:38 <johnthetubaguy> volume capacity, IP capacity and how it relates to compute capacity is an age-old issue here really, it would be good to get that fixed up
14:26:07 <johnthetubaguy> availability zones and relating different pools of resources is certainly a common requirement
14:27:01 <bauzas> johnthetubaguy: so I guess you explain my position better than I do, because I +1
14:27:04 <n0ano> johnthetubaguy, to me those capacities are just metrics (e.g. numbers); from a scheduler perspective that's pretty simple - how you measure them is not so simple
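(A minimal sketch of n0ano's "capacities are just metrics" point above: once volume or IP capacity is exposed as a plain number on the host state, filtering on it is a simple comparison. All names here are hypothetical illustrations, not Nova's actual filter API; measuring the metrics is the hard part the sketch ignores.)

```python
class HostState:
    """Hypothetical snapshot of one host's numeric capacity metrics."""
    def __init__(self, host, metrics):
        self.host = host
        self.metrics = metrics  # e.g. {"disk_gb": 500, "free_ips": 3}


class CapacityFilter:
    """Pass hosts whose metrics cover every requested amount."""
    def host_passes(self, host_state, requested):
        return all(
            host_state.metrics.get(name, 0) >= amount
            for name, amount in requested.items()
        )


hosts = [
    HostState("node1", {"disk_gb": 500, "free_ips": 3}),
    HostState("node2", {"disk_gb": 100, "free_ips": 40}),
]
request = {"disk_gb": 200, "free_ips": 2}
survivors = [h.host for h in hosts
             if CapacityFilter().host_passes(h, request)]
print(survivors)  # ['node1'] - node2 fails on disk, not on IPs
```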
14:27:31 <bauzas> what I'm trying to explain is that we agreed on refactoring the APIs and reconsidering, once that's done, whether it was necessary to split
14:28:02 <bauzas> but in the meantime, there are many other topics coming in, and I'm really not convinced that the idea of splitting could just solve all our problems
14:28:34 <edleafe> bauzas: splitting by itself doesn't get us any improvement
14:28:39 <n0ano> it won't solve our problems but I do believe it will make working on a lot of them easier
14:28:40 <bauzas> edleafe: exactly
14:28:47 <edleafe> cleaning up so that a split *could* happen does
14:28:51 <johnthetubaguy> n0ano: so this is more about error handling
14:29:18 <n0ano> johnthetubaguy, not following you
14:29:18 <johnthetubaguy> say you pick where the volume goes, or where the compute goes, and that means you can only get some of your resources; you need to pick something else
14:29:54 <johnthetubaguy> it's logically separate pools of resources you need to claim, that have a dependency relationship described in their metrics
14:30:18 <johnthetubaguy> so the request spec would be for compute, volume and networking resources all together, in an extreme case
14:30:40 <n0ano> the way we currently work, you wouldn't pick a host unless it satisfied all of the resource requirements; the scheduler just has to know about all of those resources
14:30:51 <bauzas> that's why the plan was to clean up our APIs first, then identify what could be needed for cross-project scheduling, then identify how to provide those, and only by then decide whether we split or just add another endpoint
14:31:21 <johnthetubaguy> it's about picking a compute host, and a volume AZ, and a neutron network segment that are all able to give you a claim, and retrying if not, right?
14:31:37 <johnthetubaguy> it's multiple related items, it's not just picking a compute node at this point
14:31:44 <edleafe> johnthetubaguy: and doing it in a non-racy way
14:31:56 <bauzas> edleafe: on that I'm cautious
14:32:04 <bauzas> edleafe: I mean, we need retries
14:32:19 <edleafe> bauzas: yes, we will always need them with the current approach
14:32:28 <johnthetubaguy> edleafe: well, optionally, yes, claims would help move the retries inside the scheduler, at the expense of a quick choice
14:32:36 <edleafe> but we should improve things so that they are kept to a minimum
14:33:14 <johnthetubaguy> we need to be more prepared to offer choice here; being less racy will be crazy important for some users, and a big slowdown for other users - it depends on your needs and resource usage patterns really
14:33:15 <bauzas> are we looping back ?
14:33:17 <n0ano> I think we're in violent agreement: retries are necessary, but if we do too many of them we have a problem.
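(A rough sketch of the multi-pool claim-and-retry problem johnthetubaguy describes: claim from compute, volume and network together, and roll back partial claims before trying the next candidate tuple. Everything here - try_claim, schedule, the pool dicts - is a hypothetical illustration, not Nova code, and the AZ-locality dependency between pools he mentions is omitted for brevity.)

```python
import itertools


class ClaimFailed(Exception):
    pass


def try_claim(pool, name, amount):
    # Reserve `amount` from resource `name` in `pool` (a dict of
    # name -> free units); raising here is the racy window retries cover.
    if pool.get(name, 0) < amount:
        raise ClaimFailed(name)
    pool[name] -= amount
    return (pool, name, amount)


def release(claim):
    pool, name, amount = claim
    pool[name] += amount


def schedule(compute, volume, network, req):
    # Walk candidate (compute host, volume pool, network segment) tuples;
    # either claim from all three pools, or roll back and try the next.
    for host, vol, net in itertools.product(compute, volume, network):
        claims = []
        try:
            claims.append(try_claim(compute, host, req["vcpus"]))
            claims.append(try_claim(volume, vol, req["disk_gb"]))
            claims.append(try_claim(network, net, req["ips"]))
            return host, vol, net  # all three claimed; build can proceed
        except ClaimFailed:
            for claim in claims:  # partial claims must be released
                release(claim)
    raise ClaimFailed("no feasible (compute, volume, network) tuple")


compute = {"cn1": 8, "cn2": 2}
volume = {"az1-pool": 100}
network = {"seg1": 5}
print(schedule(compute, volume, network,
               {"vcpus": 4, "disk_gb": 50, "ips": 1}))
# ('cn1', 'az1-pool', 'seg1')
```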
14:33:40 <johnthetubaguy> I am just trying to define the problem for multi-resource-pool scheduling here
14:34:10 <johnthetubaguy> it's been a long-standing requirement that seems to be getting more important, rather than less important
14:34:39 <bauzas> johnthetubaguy: that's why your backlog spec is worthwhile
14:34:52 <bauzas> and that's why I'd like to consider it before splitting
14:34:54 <n0ano> johnthetubaguy, do you know if anyone has written up anything about this (multi resource pools)?
14:34:55 <johnthetubaguy> I probably should create a different one for this issue
14:35:11 <johnthetubaguy> n0ano: there have been a few ML posts and things, not seen anything written up
14:35:28 <bauzas> there is a spec
14:35:31 <bauzas> from jay
14:35:32 <bauzas> sec
14:35:34 <johnthetubaguy> n0ano: basically volume must be local to the compute AZ, IP capacity must be local to the compute AZ
14:35:46 <johnthetubaguy> bauzas: ah, cool, do you have a link for that one?
14:35:49 <edleafe> n0ano: I'm almost finished with my radical rewrite proposal, if that helps :)
14:35:57 <bauzas> https://review.openstack.org/#/c/225546/
14:36:02 <bauzas> johnthetubaguy: n0ano ^
14:36:28 <johnthetubaguy> edleafe: please present that as an alternative scheduler that could be in tree; no throwing away what we have, for now
14:36:51 <johnthetubaguy> bauzas: ah, sweet
14:36:54 <edleafe> johnthetubaguy: understood, but no, there's no way it could be in tree
14:37:10 <edleafe> johnthetubaguy: I realize that it won't ever happen
14:37:22 <edleafe> I just want people to think about the root causes of our issues
14:37:41 <bauzas> jaypipes: around ?
14:38:28 <n0ano> bauzas, probably not, he would have jumped in by now
14:38:53 <bauzas> let's call him 5 times in front of a mirror
14:39:03 <bauzas> jaypipes jaypipes jaypipes jaypipes jaypipes
14:39:08 <johnthetubaguy> edleafe: so I have done a lot of experiments with a decent rack of servers, with belliott, that led to the caching scheduler and tuning the greenlet workers down; that's mostly what I am basing the parallel work on, but that idea still needs testing out, anyways
14:39:11 <n0ano> looks like we have a lot to discuss in Tokyo
14:39:20 <johnthetubaguy> so the key bit here is we have a spec up, so let's get it reviewed
14:39:26 <johnthetubaguy> #link https://review.openstack.org/#/c/225546/
14:39:38 <bauzas> johnthetubaguy: +1
14:39:43 <bauzas> that's an iterative process
14:39:59 <bauzas> we have johnthetubaguy's spec and devref, we also have jaypipes' spec
14:42:02 <johnthetubaguy> so the parallel one is not very actionable right now; I think resource pools are more important right now
14:42:22 <johnthetubaguy> the claims piece still needs some specific solutions discussed, I feel
14:42:33 <bauzas> johnthetubaguy: that's a good question
14:42:45 <johnthetubaguy> honestly, once we have those two pieces, I will feel better about our API
14:42:58 <bauzas> johnthetubaguy: we haven't yet finished implementing the resource-objects BP
14:43:01 <johnthetubaguy> by API I mean the scheduler interface
14:43:13 <johnthetubaguy> bauzas: very true, that needs review now, it's up for review, right?
14:43:22 <bauzas> johnthetubaguy: and that spec is actually an extension of the resource-objects BP
14:43:38 <johnthetubaguy> bauzas: totally agreed
14:43:52 <bauzas> johnthetubaguy: so yes, we can somehow confirm that what we agreed (i.e. refactor our APIs) is still valid for Mitaka
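(One reading of the direction the resource-providers spec linked above, https://review.openstack.org/#/c/225546/ , points at: generic providers exposing inventories of typed resources, with allocations recorded against them, so compute hosts, volume pools and network segments all look alike to the scheduler. This is a toy sketch; the class names and resource-class strings are illustrative assumptions, not the spec's actual models.)

```python
from collections import defaultdict


class ResourceProvider:
    """Hypothetical generic provider: inventory minus allocations."""
    def __init__(self, uuid):
        self.uuid = uuid
        self.inventory = {}                  # resource class -> total
        self.allocations = defaultdict(int)  # resource class -> used

    def set_inventory(self, resource_class, total):
        self.inventory[resource_class] = total

    def capacity(self, resource_class):
        return (self.inventory.get(resource_class, 0)
                - self.allocations[resource_class])

    def allocate(self, requested):
        # Record usage for all requested classes, or none of them.
        if any(self.capacity(rc) < amount
               for rc, amount in requested.items()):
            return False
        for rc, amount in requested.items():
            self.allocations[rc] += amount
        return True


# A compute node and a shared disk pool are both just providers.
node = ResourceProvider("compute-1")
node.set_inventory("VCPU", 16)
node.set_inventory("MEMORY_MB", 32768)
print(node.allocate({"VCPU": 4, "MEMORY_MB": 8192}))  # True
print(node.capacity("VCPU"))                          # 12
```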
14:44:34 <bauzas> johnthetubaguy: re: the resource-objects, I saw some patches from jaypipes but I guess he didn't have all the work ready
14:44:52 <bauzas> johnthetubaguy: at least, I can find the objects' creation, not their usage
14:45:07 <n0ano> well, I think I need to distill all the different issues from this thread and propose a scheduler session in Tokyo to discuss them
14:45:25 <bauzas> well
14:45:32 <johnthetubaguy> so the deadline for session proposals is tomorrow, I think
14:45:38 <johnthetubaguy> if I remember correctly
14:45:44 * bauzas looking
14:45:44 <n0ano> NP, I'll get it done today
14:45:56 <johnthetubaguy> we need something concrete to discuss, ideally
14:46:49 <n0ano> we have some specific specs to discuss plus some more speculative stuff
14:46:54 <johnthetubaguy> I think we have some agreement on the list of issues
14:47:07 <johnthetubaguy> for me the list is: solid API, incl. resource pools, incl. claims in the scheduler
14:47:24 <bauzas> quoting johnthetubaguy: "The deadline for proposals will likely be Tuesday 6th October, 23.59 UTC"
14:47:36 <bauzas> today EOB :)
14:48:07 <bauzas> johnthetubaguy: scheduler claims are part of the parallel (i.e. distributed) scheduler discussion, I feel
14:48:17 <n0ano> bauzas, today's the 5th, that should be EOB tomorrow
14:48:28 <bauzas> right, we're on Monday
14:48:32 * bauzas facepalm
14:48:37 <bauzas> I thought we were on Tuesday
14:48:40 <bauzas> anyway
14:48:44 <bauzas> so I was saying
14:49:01 <johnthetubaguy> bauzas: that's true, I am thinking we call out claims as they impact parallel and resource pools really
14:49:10 <bauzas> solid APIs are surely one thing to address (at least the missing bits, considering that the reqspec-obj is on its way)
14:49:41 <bauzas> distributed scheduling (incl. sched claims) is IMHO a second part to address
14:49:53 <bauzas> johnthetubaguy: but I see your point
14:50:41 <bauzas> what worries me a bit is that deferring the necessary talk about a scaling-out scheduler would mean we'd defer cells v2
14:51:17 <bauzas> because we can hardly assume that one single scheduler could boil the ocean, er, the whole cloud
14:51:46 <bauzas> 8 mins to the end of the meeting also
14:52:03 <johnthetubaguy> so it only affects multi-cell v2
14:52:14 <johnthetubaguy> and that feels like it's release + 1 still
14:52:28 <johnthetubaguy> but ideally we would have a prototype ready during mitaka
14:52:33 <n0ano> bauzas, yeah, those concerns should be addressed at a session and no, we're running out of time
14:52:39 <bauzas> johnthetubaguy: erm, the idea is that the n-api would have one scheduler to address all cells
14:53:26 <bauzas> johnthetubaguy: but that's certainly debatable
14:53:41 <johnthetubaguy> bauzas: well, it just works for the single-cell case, it's just the same API as today
14:53:58 <n0ano> getting late guys, let's move on
14:54:04 <n0ano> #topic opens
14:54:34 <n0ano> so, only two weeks until Tokyo; do we want to meet next week & after, or should we just re-convene at the summit?
14:54:57 <bauzas> johnthetubaguy: http://specs.openstack.org/openstack/nova-specs/specs/liberty/approved/cells-scheduling-interaction.html is what I was thinking about
14:55:03 <bauzas> but sure, we can move on
14:55:11 <johnthetubaguy> yeah, let's move on
14:55:33 <bauzas> n0ano: I can attend the next one, not the one before the Summit
14:55:38 <johnthetubaguy> bauzas: sharding doesn't affect the API really
14:55:56 <bauzas> (enjoying Tokyo with family, eh)
14:56:26 <bauzas> johnthetubaguy: sure
14:56:28 <n0ano> I'm willing to talk on IRC next week and then defer to the summit, that's doable
14:57:03 <bauzas> I'd be interested in gathering feedback from cinder and neutron folks about what they'd like to send to us
14:57:26 <bauzas> given we're discussing https://review.openstack.org/#/c/225546/1/specs/mitaka/approved/resource-providers.rst,cm
14:57:38 <bauzas> but that's a bit premature
14:57:38 <n0ano> bauzas, me too, we asked about 2 summits ago and haven't gotten much back
14:58:31 <n0ano> well, I have to run (next meeting), tnx everyone, talk next week
14:58:35 <n0ano> #endmeeting