15:02:59 <garyk> #startmeeting scheduler
15:03:01 <openstack> Meeting started Tue Oct 22 15:02:59 2013 UTC and is due to finish in 60 minutes. The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:04 <openstack> The meeting name has been set to 'scheduler'
15:03:26 <garyk> sorry about not making it last week
15:03:46 <garyk> do we have any open issues regarding the summit sessions?
15:03:54 <Yathi> What is the final list?
15:04:06 <alaski> The list isn't final yet
15:04:10 <Yathi> Last week we decided to merge API + smart resource placement sessions into one
15:04:15 <alaski> but we're looking at 4 slots right now
15:04:25 <Yathi> after some discussion other pieces were taken out
15:04:30 <alaski> so API and smart resource placement should get their own sessions
15:04:34 <alaski> instance group api
15:04:38 <Yathi> oh that is sweet
15:04:55 <mspreitz> So we got 4 slots?
15:05:14 <garyk> is there an updated list of the latest session topics
15:05:51 <alaski> mspreitz: yes
15:06:03 <mspreitz> great
15:06:31 <Yathi> are we documenting this final list somewhere?
15:06:35 <alaski> garyk: not yet, some have been refused and preapproved but a lot aren't touched yet
15:06:52 <garyk> alaski: ok, thanks
15:07:10 <alaski> but the ones that were discussed last week are looking good for approval
15:07:26 <garyk> great. thanks!
15:07:31 <alaski> rethinking scheduler design, extensible metrics, instance group api, amrter resource placement
15:07:44 <alaski> smarter
15:07:58 <garyk> ok cool.
15:08:11 <garyk> i have concerns with the metrics
15:08:24 <alaski> rethinking design is taking over for performance since boris didn't propose a session
15:08:29 <alaski> and they seem to have some overlap
15:08:30 <garyk> not really regarding the session but issues that i stumbled on a few days ago
15:09:09 <garyk> i think that is logical. i guess that the design considerations should take the performance and scale into account
15:09:24 <alaski> garyk: are you going to be at the summit?
15:09:31 <garyk> alaski: are you familiar with the
15:09:41 <garyk> alaski: yes, i will be at the summit
15:10:01 <mspreitz> BTW, what is the format of the design summit sessions going to be? I heard a suggestion of text chat only.
15:10:30 <alaski> garyk: okay, it will be good if you can voice the issue in person
15:10:45 <garyk> alaski: understood.
15:10:48 <alaski> mspreitz: it's a discussion, with notetaking in etherpads
15:11:01 <mspreitz> thanks
15:11:09 <garyk> just wanted to ask about the resource tracking - it just seems to ignore all used statistics from the hypervisor
15:11:28 <mspreitz> garyk: examples?
15:11:44 <garyk> that is, the hypervisor returns used disk and used memory
15:12:00 <mspreitz> garyk: no good to ignore, IMHO
15:12:04 <garyk> the scheduler is not aware of these as the resource tracker calculates the used memory and disk by itself
15:12:25 <garyk> mspreitz: that is my concern too
15:12:46 <mspreitz> Do we have evidence of whether or not that dead reckoning falls short in practice?
15:12:58 <Yathi> don't we update the host metrics after a scheduling is done
15:13:04 <Yathi> I am surprised
15:13:04 <mspreitz> (I have evidence from other systems that it will)
15:13:13 <garyk> mspreitz: i am not sure.
15:14:03 <garyk> my concern is that the scheduler may think that there is enough disk space but a cinder volume may take up space and the resource tracker may not be aware of this
15:14:20 <garyk> that is just one case.
15:14:30 <mspreitz> Yow, that's a really simple example.
15:14:44 <mspreitz> If it can fail that way, won't we already have reports of problems?
15:14:48 <garyk> i guess that i need to go back and do my homework
15:15:12 <Yathi> well scheduler has this notion of retries..
15:15:19 <toan-tran> I have a problem of updating too
15:15:24 <Yathi> I am guessing that is how it works now.. if something is not really available
15:15:29 <toan-tran> when I create multiple VMs
15:15:45 <toan-tran> I got that they are not registered immediately
15:15:46 <garyk> please look at the comment - https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L576
15:16:17 <garyk> toan-tran: not sure i understand your comment. what do you mean by registered
15:16:26 <toan-tran> meaning the scheduler scheduling the second VM does not see the first in DB
15:16:39 <toan-tran> sorry, register = update DB
15:17:15 <toan-tran> I made a simple weigher that looks for number of VMs in a host instead of available RAM
15:17:17 <Yathi> garyk: now it makes sense, probably it is just taking into consideration the current instance
15:17:27 <alaski> toan-tran: yeah, there's definitely a race condition with multiple instances being scheduled in quick succession
15:17:29 <Yathi> and not the actual hypervisor's state
15:17:43 <toan-tran> then I found out that DB is not updated among multiple VMs
15:18:09 <garyk> Yathi: but the hypervisor has the true picture of the actual state of the host - that is, the actual amount of free memory and disk space.
15:18:23 <mspreitz> This is one reason I keep talking about scheduling against the union of observed and target state.
15:19:47 <alaski> So there was a session proposed about cleaning up the resource tracker. We're passing it over based on there being no contention about cleaning it up.
15:19:54 <Yathi> something still needs to be done for race conditions I guess, when multiple scheduling calls in parallel or quick succession
15:20:11 <alaski> There are likely to be issues with it but I don't think there's resistance to fixing it up
15:20:19 <mspreitz> Yeah. Take the union of plans and effects as the current usage.
15:20:19 <toan-tran> Yathi: +1
15:21:06 <garyk> alaski: i am in favor of fixing it up.
15:21:15 <Yathi> which of our planned sessions covers the resource tracking topic - enhanced metrics ?
15:21:33 <mspreitz> I think the other one... that's Boris' topic, right?
15:21:36 <garyk> alaski: i just think that it would be nice if there were considerations to the actual usage on the hypervisor
15:21:54 <alaski> garyk: agreed
15:21:58 <Yathi> garyk: +1
15:22:13 <mspreitz> alaski, garyk: agree. Union the actual usage and the planned usage.
15:22:36 <Yathi> how often do we update the db to get the latest hypervisor states.. that matters I guess here
15:23:10 <alaski> Yathi: enhanced metrics has some overlap with resource tracking concerns
15:23:14 <mspreitz> If you use the union, latency only affects speed with which you can reclaim freed space
15:23:26 <toan-tran> garyk: +1
15:23:30 <alaski> but overall resource tracking issues are non contentious. the work just needs to be done
15:24:47 <garyk> resource tracker - http://docs.openstack.org/developer/nova/api/nova.compute.resource_tracker.html
15:25:02 <toan-tran> Yathi: if I'm not wrong, once per several seconds at best
15:25:21 <mspreitz> Maybe somebody could spell out those two current session proposals in a bit of detail, so we know what goes in which?
15:25:59 <garyk> mspreitz: as far as i recall the one was about ceilometer/accessing the resources directly
15:26:03 <Yathi> what is part of the "rethinking design" session
15:26:32 <garyk> i think line 63 - https://etherpad.openstack.org/p/IceHouse-Nova-Scheduler-Sessions
15:27:03 <toan-tran> there is a blueprint on real resource usage: https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling
15:28:16 <garyk> alaski: Yathi: who will be leading the "rethinking" session?
15:28:53 <Yathi> where is the "rethinking" session in our etherpad?
15:29:19 <garyk> Yathi: I do not think that it appears.
15:29:45 <alaski> garyk: I believe it's Mike Wilson(?)
15:29:59 <alaski> goes by geekinutah in irc, but doesn't appear to be on
15:30:34 <alaski> Yathi: it's not in the etherpad, but it took the place of scheduler performance
15:30:42 <alaski> since there's no proposed session there
15:30:49 <garyk> alaski: thanks.
15:31:08 <mspreitz> I thought Boris was going to propose a session?
15:31:13 <alaski> and it's looking to address similar concerns
15:31:23 <alaski> mspreitz: I thought so too, but he didn't
15:31:48 <garyk> i guess that we should try and sync on this topic so that we can be most effective when we meet up
15:32:52 <mspreitz> garyk: sounds good, but what does that mean?
15:32:58 <Yathi> Can someone please point me to any written description of this "rethinking" session?
15:33:12 <alaski> Yathi: http://summit.openstack.org/cfp/details/34
15:33:22 <Yathi> alaski: Thanks
15:33:40 <garyk> mspreitz: i am not sure. i think that we need to have boris and theguyfromutah talk
15:33:53 <mspreitz> Session 34 is different from Boris' topic
15:34:03 <mspreitz> garyk: if we can do that, it would be great
15:34:59 <Yathi> There are some overlaps in the "rethinking" session to our "smart resource placement" ideas
15:35:13 <garyk> #action try and get some talk about ideas of the rethinking prior to the summit
15:35:14 <mspreitz> Yathi: yes. Smart has to be "good enough"
15:35:39 <Yathi> mspreitz: yeah the smart need to always wait for the most optimal solution
15:35:47 <Yathi> but it need not
15:35:53 <mspreitz> You mean NOT always wait
15:36:05 <Yathi> yeah NOT always wait.. sorry
15:36:25 <mspreitz> Optimization problems are usually NP hard, you never expect to find the true optimum
15:37:32 <Yathi> there has to be a cut off as to when to stop the minimization or maximization, as long as the constraints are satisfied, you are good..
15:37:35 <mspreitz> So, yeah, I think going smart implies doing what session 34 asks for.
15:37:46 <mspreitz> yathi: exactly
15:37:59 <garyk> at the moment i feel that people are dealing with a lot of issues: placement, processing, interactions with databases etc.
15:38:14 <garyk> i am not sure that we have one topic or idea that covers it all.
15:38:37 <Yathi> I think our idea for the smart placement involves this one piece of a smart resource placement - constraint solver, along with the other aspects
15:38:42 <alaski> session 34 is also dealing with performance. geekinutah is dealing with a >1000 node cluster iirc and they've had performance issues they want to address
15:39:06 <mspreitz> I did not think smart was only for small systems
15:39:54 <Yathi> other aspects I mean - common db that covers cross services, suitable for high scale, improved performance over filter scheduling,
15:40:13 <garyk> alaski: is there any mention of the number of schedulers they are using?
15:40:17 <Yathi> well we have a bunch of sessions with overlapping concerns
15:40:46 <mspreitz> yathi: yes
15:40:55 <alaski> garyk: it may have come up before but I don't recall, might be 1 though
15:41:03 <garyk> ok, thanks
15:41:37 <mspreitz> His etherpad explicitly suggests parallel schedulers
15:41:56 <toan-tran> is SolverScheduler in smarter placement ?
15:43:09 <Yathi> toan-tran - SolverScheduler is one aspect of the smarter placement, but involves other aspects too
15:43:27 <toan-tran> Yathi: thanks
15:44:00 <Yathi> toan-tran: See line 53 in https://etherpad.openstack.org/p/IceHouse-Nova-Scheduler-Sessions
15:44:12 <toan-tran> I'm just curious about the choice of LinearProgram
15:44:25 <toan-tran> is it a little time-consuming?
15:45:25 <garyk> the idea proposed is interesting
15:45:39 <Yathi> The idea is a pluggable constraints-based solver framework.. so any pluggable solvers can be included
15:45:45 <garyk> i think that the pain points will arise when it comes to the messaging
15:46:05 <garyk> that is, we need some kind of p2p messaging.
15:46:15 <mspreitz> garyk: for what?
15:46:16 <toan-tran> Yathi: ok, so not necessarily LP, thanks
15:46:32 <garyk> for the "rethinking"
15:46:42 <mspreitz> I'm lost.
15:46:48 <mspreitz> p2p = peer to peer
15:46:49 <mspreitz> ?
15:46:56 <Yathi> are we talking some kind of mapreduce kind of scheduling ?
15:47:00 <garyk> mspreitz: i am going over what is written in https://etherpad.openstack.org/p/RethinkingSchedulerDesign
15:47:01 <Yathi> distributed
15:47:01 <mspreitz> You mean offline one-on-one discussions?
15:47:34 <mspreitz> that etherpad ends with a long list of alternatives
15:47:45 <mspreitz> one of which is optimization orientation
15:49:00 <garyk> i have to go over it in more detail. i am just concerned that the current infrastructure that we have may not be suited for something like this. i guess that when we discuss it we can see what is required, what is missing and then address.
15:49:39 <Yathi> garyk: mspreitz: if that etherpad has a bunch of alternatives, what is it mainly trying to achieve ? - performance ?
15:49:41 <mspreitz> garyk: there are many "it" there. My group has done some investigation of some of them.
15:50:13 <mspreitz> First two bullets say "scalability" to me
15:50:23 <garyk> but maybe i am being a little conservative - that is, if we are unable to get very simple things in then how can we do something that is non trivial
15:50:27 <mspreitz> scalability in cloud size, request rate
15:50:30 <Yathi> not very clear - there could be several alternatives possible that way
15:50:47 <mspreitz> I would also say we should be explicit about request size
15:51:03 <mspreitz> when request is for a whole pattern, not a single resource
15:51:59 <garyk> yes, a request should be a whole pattern, only the scheduler can know how to place a collection of resources most optimally
15:52:14 <mspreitz> We had a summer student with an economics background investigate a bidding approach that can solve joint problems with things like bidding. Takes several rounds of bidding to sort of converge.
15:52:33 <mspreitz> I mean problems with affinity
15:53:00 <mspreitz> Result was not strong enough to make us take that approach.
15:53:09 <Yathi> garyk: is it now related to instance group apis + the smarter placement taking the whole picture into consideration
15:53:19 <garyk> i guess that we can all agree - it will be challenging and interesting :)
15:53:56 <Yathi> cross-services scheduling is key
15:54:13 <garyk> agreed
15:54:17 <Yathi> we made some progress - combining cinder into nova to schedule based on volume affinity
15:54:28 <mspreitz> Agreed too. but it also has the problems in "rethinking"
15:54:52 <garyk> yup, i do not think that it was even addressed in the etherpad (but may be wrong here)
15:55:45 <garyk> are there any additional issues that we would like to address?
15:55:55 <mspreitz> right now or at the summit?
15:56:12 <garyk> now - we have ~4 min left
15:56:34 <mspreitz> I'd like to plead for progress on the API issues before the summit.
15:56:40 <mspreitz> No time to do anything now,
15:56:44 <garyk> mspreitz: +1
15:56:48 <mspreitz> but maybe we can agree to do something in ML?
15:57:07 <garyk> mspreitz: that would be great.
15:57:33 <garyk> maybe if the sessions are closed then next week we can start with discussing the APIs
15:57:40 <Yathi> garyk: mspreitz: I guess the API work we have made significant progress already
15:57:52 <Yathi> we agreed on the model
15:58:05 <Yathi> leaving certain minor implementation specifics aside
15:58:32 <Yathi> now it is about the list of APIs to support..
15:58:43 <Yathi> and what the payload will be like
15:58:54 <mspreitz> right. My group is implementing right now, I am hoping for convergence
15:59:32 <Yathi> mspreitz: Good, Debo and I are planning to push updates for the already committed instance group API code
15:59:56 <Yathi> but this is planned for Icehouse, and not planned to complete before the summit
16:00:25 <garyk> ok. i really hope we can get this in in Icehouse and do not miss this opportunity
16:00:31 <garyk> i guess time is up.
16:00:31 <mspreitz> It's all provisional, which is why I am concerned about convergence
16:00:40 <garyk> chat to you guys next week
16:00:44 <mspreitz> thanks
16:00:46 <Yathi> ok thanks
16:00:50 <garyk> #endmeeting
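
Note on the "union of observed and target state" idea raised by mspreitz (15:18:23, 15:20:19, 15:23:14): a minimal sketch of one possible reading, not Nova's resource tracker and not code discussed in the meeting. The function names, the per-consumer dictionaries, and the sample numbers are all hypothetical. It assumes usage can be attributed to named consumers, which a hypervisor's aggregate statistics may not allow. Each consumer is charged the larger of what the scheduler planned for it and what the hypervisor last reported.

```python
from typing import Mapping


def effective_usage(planned: Mapping[str, int], observed: Mapping[str, int]) -> int:
    """Combine planned and observed usage (MB) per consumer.

    Hypothetical helper: a consumer known only to the plan (just scheduled,
    not yet reported) and a consumer known only to the hypervisor (for
    example space taken by something the tracker did not plan) both count.
    """
    consumers = planned.keys() | observed.keys()
    return sum(max(planned.get(c, 0), observed.get(c, 0)) for c in consumers)


def fits(total_mb: int, planned: Mapping[str, int],
         observed: Mapping[str, int], request_mb: int) -> bool:
    """Return True if the request fits alongside the combined usage."""
    return effective_usage(planned, observed) + request_mb <= total_mb


# Illustrative numbers only.
planned = {"vm-1": 2048, "vm-2": 2048}             # vm-2 was just scheduled
observed = {"vm-1": 2100, "untracked-volume": 1024}  # last hypervisor report

print(effective_usage(planned, observed))
print(fits(8192, planned, observed, 3072))
```

Under this reading, a stale hypervisor report can only make the scheduler more conservative: a deleted instance keeps being charged until the next report drops it, which matches the remark that latency only affects how quickly freed space is reclaimed.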