#openstack-meeting log

15:02:59 <garyk> #startmeeting scheduler
15:03:01 <openstack> Meeting started Tue Oct 22 15:02:59 2013 UTC and is due to finish in 60 minutes.  The chair is garyk. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:03:02 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:03:04 <openstack> The meeting name has been set to 'scheduler'
15:03:26 <garyk> sorry about not making it last week
15:03:46 <garyk> do we have any open issues regarding the summit sessions?
15:03:54 <Yathi> What is the final list?
15:04:06 <alaski> The list isn't final yet
15:04:10 <Yathi> Last week we decided to merge API + smart resource placement sessions into one
15:04:15 <alaski> but we're looking at 4 slots right now
15:04:25 <Yathi> after some discussion other pieces were taken out
15:04:30 <alaski> so API and smart resource placement should get their own sessions
15:04:34 <alaski> instance group api
15:04:38 <Yathi> oh that is sweet
15:04:55 <mspreitz> So we got 4 slots?
15:05:14 <garyk> is there an updated list of the latest session topics
15:05:51 <alaski> mspreitz: yes
15:06:03 <mspreitz> great
15:06:31 <Yathi> are we documenting this final list somewhere?
15:06:35 <alaski> garyk: not yet, some have been refused and preapproved but a lot aren't touched yet
15:06:52 <garyk> alaski: ok, thanks
15:07:10 <alaski> but the ones that were discussed last week are looking good for approval
15:07:26 <garyk> great. thanks!
15:07:31 <alaski> rethinking scheduler design, extensible metrics, instance group api, amrter resource placement
15:07:44 <alaski> smarter
15:07:58 <garyk> ok cool.
15:08:11 <garyk> i have concerns with the metrics
15:08:24 <alaski> rethinking design is taking over for performance since boris didn't propose a session
15:08:29 <alaski> and they seem to have some overlap
15:08:30 <garyk> not really regarding the session but issues that i stumbled on a few days ago
15:09:09 <garyk> i think that is logical. i guess that the design considerations should take the performance and scale into account
15:09:24 <alaski> garyk: are you going to be at the summit?
15:09:31 <garyk> alaski: are you familiar with the
15:09:41 <garyk> alaski: yes, i will be at the summit
15:10:01 <mspreitz> BTW, what is the format of the design summit sessions going to be?  I heard a suggestion of text chat only.
15:10:30 <alaski> garyk: okay, it will be good if you can voice the issue in person
15:10:45 <garyk> alaski: understood.
15:10:48 <alaski> mspreitz: it's a discussion, with notetaking in etherpads
15:11:01 <mspreitz> thanks
15:11:09 <garyk> just wanted to ask about the resource tracking - it just seems to ignore all used statistics from the hypervisor
15:11:28 <mspreitz> garyk: examples?
15:11:44 <garyk> that is, the hypervisor returns used disk and used memory
15:12:00 <mspreitz> garyk: no good to ignore, IMHO
15:12:04 <garyk> the scheduler is not aware of these as the resource tracker calculates the used memory and disk by itself
15:12:25 <garyk> mspreitz: that is my concern too
15:12:46 <mspreitz> Do we have evidence of whether or not that dead reckoning falls short in practice?
15:12:58 <Yathi> don't we update the host metrics after a scheduling is done
15:13:04 <Yathi> I am surprised
15:13:04 <mspreitz> (I have evidence from other systems that it will)
15:13:13 <garyk> mspreitz: i am not sure.
15:14:03 <garyk> my concern is that the scheduler may think that there is enough disk space but a cinder volume may take up space and the resource tracker may not be aware of this
15:14:20 <garyk> that is just one case.
15:14:30 <mspreitz> Yow, that's a really simple example.
15:14:44 <mspreitz> If it can fail that way, won't we already have reports of problems?
15:14:48 <garyk> i guess that i need to go back and do my homework
15:15:12 <Yathi> well scheduler has this notion of retries..
15:15:19 <toan-tran> I have a problem of updating too
15:15:24 <Yathi> I am guessing that is how it works now.. if something is not really available
15:15:29 <toan-tran> when I create multiple VMs
15:15:45 <toan-tran> I got that they are not registered immediately
15:15:46 <garyk> please look at the comment - https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L576
15:16:17 <garyk> toan-tran: not sure i understand your comment. what do you mean by registered
15:16:26 <toan-tran> meaning the scheduler scheduling the second VM does not see the first in DB
15:16:39 <toan-tran> sorry, register = update DB
15:17:15 <toan-tran> I made a simple weigher that looks for number of VMs in a host instead of available Ram
15:17:17 <Yathi> garyk: now it makes sense, probably it is just taking into consideration the current instance
15:17:27 <alaski> toan-tran: yeah, there's definitely a race condition with multiple instances being scheduled in quick succession
15:17:29 <Yathi> and not the actual hypervisor's state
15:17:43 <toan-tran> then I found out that DB is not updated among multiple VMs
15:18:09 <garyk> Yathi: but the hypervisor has the true picture of the actual state of the host - that is, the actual amount of free memory and disk space.
15:18:23 <mspreitz> This is one reason I keep talking about scheduling against the union of observed and target state.
15:19:47 <alaski> So there was a session proposed about cleaning up the resource tracker.  We're passing it over based on there being no contention about cleaning it up.
15:19:54 <Yathi> something still needs to be done for race conditions I guess, when multiple scheduling calls in parallel or quick succession
15:20:11 <alaski> There are likely to be issues with it but I don't think there's resistance to fixing it up
15:20:19 <mspreitz> Yeah.  Take the union of plans and effects as the current usage.
15:20:19 <toan-tran> Yathi: +1
15:21:06 <garyk> alaski: i am in favor of fixing it up.
15:21:15 <Yathi> which of our planned sessions covers the resource tracking topic - enhanced metrics ?
15:21:33 <mspreitz> I think the other one... that's Boris' topic, right?
15:21:36 <garyk> alaski: i just think that it would be nice if there were considerations to the actual usage on the hypervisor
15:21:54 <alaski> garyk: agreed
15:21:58 <Yathi> garyk: +1
15:22:13 <mspreitz> alaski, garyk: agree.  Union the actual usage and the planned usage.
15:22:36 <Yathi> how often do we update the db to get the latest hypervisor states.. that matters I guess here
15:23:10 <alaski> Yathi: enhanced metrics has some overlap with resource tracking concerns
15:23:14 <mspreitz> If you use the union, latency only affects speed with which you can reclaim freed space
15:23:26 <toan-tran> garyk: +1
15:23:30 <alaski> but overall resource tracking issues are non contentious.  the work just needs to be done
15:24:47 <garyk> resource tracker - http://docs.openstack.org/developer/nova/api/nova.compute.resource_tracker.html
15:25:02 <toan-tran> Yathi: if I'm not wrong, once per several seconds at best
15:25:21 <mspreitz> Maybe somebody could spell out those two current session proposals in a bit of detail, so we know what goes in which?
15:25:59 <garyk> mspreitz: as far as i recall the one was about ceilometer/accessing the resources directly
15:26:03 <Yathi> what is part of the "rethinking design" session
15:26:32 <garyk> i think line 63 - https://etherpad.openstack.org/p/IceHouse-Nova-Scheduler-Sessions
15:27:03 <toan-tran> there is a blueprint on real resource usage: https://blueprints.launchpad.net/nova/+spec/utilization-aware-scheduling
15:28:16 <garyk> alaski: Yathi: who will be leading the "rethinking" session?
15:28:53 <Yathi> where is the "rethinking" session in our etherpad?
15:29:19 <garyk> Yathi: I do not think that it appears.
15:29:45 <alaski> garyk: I believe it's Mike Wilson(?)
15:29:59 <alaski> goes by geekinutah in irc, but doesn't appear to be on
15:30:34 <alaski> Yathi: it's not in the etherpad, but it took the place of scheduler performance
15:30:42 <alaski> since there's no proposed session there
15:30:49 <garyk> alaski: thanks.
15:31:08 <mspreitz> I thought Boris was going to propose a session?
15:31:13 <alaski> and it's looking to address similar concerns
15:31:23 <alaski> mspreitz: I thought so too, but he didn't
15:31:48 <garyk> i guess that we should try and sync on this topic so that we can be most effective when we meet up
15:32:52 <mspreitz> garyk: sounds good, but what does that mean?
15:32:58 <Yathi> Can someone please point me to any written description of this "rethinking" session?
15:33:12 <alaski> Yathi: http://summit.openstack.org/cfp/details/34
15:33:22 <Yathi> alaski: Thanks
15:33:40 <garyk> mspreitz: i am not sure. i think that we need to have boris and theguyfromutah talk
15:33:53 <mspreitz> Session 34 is different from Boris' topic
15:34:03 <mspreitz> garyk: if we can do that, it would be great
15:34:59 <Yathi> There are some overlaps in the "rethinking" session to our "smart resource placement" ideas
15:35:13 <garyk> #action try and get some talk about ideas of the rethinking prior to the summit
15:35:14 <mspreitz> Yathi: yes.  Smart has to be "good enough"
15:35:39 <Yathi> mspreitz: yeah the smart need to always wait for the most optimal solution
15:35:47 <Yathi> but it need not
15:35:53 <mspreitz> You mean NOT always wait
15:36:05 <Yathi> yeah NOT always wait.. sorry
15:36:25 <mspreitz> Optimization problems are usually NP hard, you never expect to find the true optimum
15:37:32 <Yathi> there has to be a cut off as to when to stop the minimization or maximization,  as long as the constraints are satisfied,  you are good..
15:37:35 <mspreitz> So, yeah, I think going smart implies doing what session 34 asks for.
15:37:46 <mspreitz> yathi: exactly
15:37:59 <garyk> at the moment i feel that people are dealing with a lot of issues: placement, processing, interactions with databases etc.
15:38:14 <garyk> i am not sure that we have one topic or idea that covers it all.
15:38:37 <Yathi> I think our idea for the smart placement involves this one piece of a smart resource placement - constraint solver, along with the other aspects
15:38:42 <alaski> session 34 is also dealing with performance.  geekinutah is dealing with a >1000 node cluster iirc and they've had performance issues they want to address
15:39:06 <mspreitz> I did not think smart was only for small systems
15:39:54 <Yathi> other aspects I mean - common db that covers cross services, suitable for high scale, improved performance over filter scheduling,
15:40:13 <garyk> alaski: is there any mention of the number of schedulers they are using?
15:40:17 <Yathi> well we have a bunch of sessions with overlapping concerns
15:40:46 <mspreitz> yathi: yes
15:40:55 <alaski> garyk: it may have come up before but I don't recall, might be 1 though
15:41:03 <garyk> ok, thanks
15:41:37 <mspreitz> His etherpad explicitly suggests parallel schedulers
15:41:56 <toan-tran> is SolverScheduler in smarter placement ?
15:43:09 <Yathi> toan-tran - SolverScheduler is one aspect of the smarter placement, but involves other aspects too
15:43:27 <toan-tran> Yathi: thanks
15:44:00 <Yathi> toan-tran: See line 53 in https://etherpad.openstack.org/p/IceHouse-Nova-Scheduler-Sessions
15:44:12 <toan-tran> I'm just curious about the choice of LinearProgram
15:44:25 <toan-tran> is it a little time-consuming?
15:45:25 <garyk> the idea proposed is interesting
15:45:39 <Yathi> The idea is a pluggable constraints-based solver framework.. so any pluggable solvers can be included
15:45:45 <garyk> i think that the pain points will arise when it comes to the messaging
15:46:05 <garyk> that is, we need some kind of p2p messaging.
15:46:15 <mspreitz> garyk: for what?
15:46:16 <toan-tran> Yathi: ok, so not necessarily LP, thanks
15:46:32 <garyk> for the "rethinking"
15:46:42 <mspreitz> I'm lost.
15:46:48 <mspreitz> p2p = peer to peer
15:46:49 <mspreitz> ?
15:46:56 <Yathi> are we talking some kind of mapreduce kind of scheduling ?
15:47:00 <garyk> mspreitz: i am going over what is written in https://etherpad.openstack.org/p/RethinkingSchedulerDesign
15:47:01 <Yathi> distributed
15:47:01 <mspreitz> You mean offline one-on-one discussions?
15:47:34 <mspreitz> that etherpad ends with a long list of alternatives
15:47:45 <mspreitz> one of which is optimization orientation
15:49:00 <garyk> i have to go over it in more detail. i am just concerned that the current infrastructure that we have may not be suited for something like this. i guess that when we discuss it we can see what is required, what is missing and then address.
15:49:39 <Yathi> garyk: mspreitz: if that etherpad has a bunch of alternatives, what is it mainly trying to achieve ? - performance ?
15:49:41 <mspreitz> garyk: there are many "it" there.  My group has done some investigation of some of them.
15:50:13 <mspreitz> First two bullets say "scalability" to me
15:50:23 <garyk> but maybe i am being a little conservative - that is, if we are unable to get very simple things in then how can we do something that is non trivial
15:50:27 <mspreitz> scalability in cloud size, request rate
15:50:30 <Yathi> not very clear - there could be several alternatives possible that way
15:50:47 <mspreitz> I would also say we should be explicit about request size
15:51:03 <mspreitz> when request is for a whole pattern, not a single resource
15:51:59 <garyk> yes, a request should be a whole pattern, only the scheduler can know how to place a collection of resources most optimally
15:52:14 <mspreitz> We had a summer student with an economics background investigate a bidding approach that can solve joint problems with things like bidding.  Takes several rounds of bidding to sort of converge.
15:52:33 <mspreitz> I mean problems with affinity
15:53:00 <mspreitz> Result was not strong enough to make us take that approach.
15:53:09 <Yathi> garyk:  is it now related to instance group apis + the smarter placement taking the whole picture into consideration
15:53:19 <garyk> i guess that we can all agree - it will be challenging and interesting :)
15:53:56 <Yathi> cross-services scheduling is key
15:54:13 <garyk> agreed
15:54:17 <Yathi> we made some progress - combining cinder into nova to schedule based on volume affinity
15:54:28 <mspreitz> Agreed too.  but it also has the problems in "rethinking"
15:54:52 <garyk> yup, i do not think that it was even addressed in the etherpad (but may be wrong here)
15:55:45 <garyk> are there any additional issues that we would like to address?
15:55:55 <mspreitz> right now or at the summit?
15:56:12 <garyk> now - we have ~4 min left
15:56:34 <mspreitz> I'd like to plead for progress on the API issues before the summit.
15:56:40 <mspreitz> No time to do anything now,
15:56:44 <garyk> mspreitz: +1
15:56:48 <mspreitz> but maybe we can agree to do something in ML?
15:57:07 <garyk> mspreitz: that would be great.
15:57:33 <garyk> maybe if the sessions are closed then next week we can start with discussing the API's
15:57:40 <Yathi> garyk: mspreitz:  I guess on the API work we have made significant progress already
15:57:52 <Yathi> we agreed on the model
15:58:05 <Yathi> leaving certain minor implementation specifics aside
15:58:32 <Yathi> now it is about the list of APIs to support..
15:58:43 <Yathi> and what the payload will be like
15:58:54 <mspreitz> right.  My group is implementing right now, I am hoping for convergence
15:59:32 <Yathi> mspreitz:  Good,  Debo and I are planning to push updates for the already committed instance group API code
15:59:56 <Yathi> but this is planned for Icehouse, and not planned to complete before the summit
16:00:25 <garyk> ok. i really hope we can get this in in Icehouse and do not miss this opportunity
16:00:31 <garyk> i guess time is up.
16:00:31 <mspreitz> It's all provisional, which is why I am concerned about convergence
16:00:40 <garyk> chat to you guys next week
16:00:44 <mspreitz> thanks
16:00:46 <Yathi> ok thanks
16:00:50 <garyk> #endmeeting