15:01:00 <n0ano> #startmeeting scheduler
15:01:01 <openstack> Meeting started Tue Nov 26 15:01:00 2013 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:05 <openstack> The meeting name has been set to 'scheduler'
15:01:38 <alaski> hi
15:01:43 <n0ano> bauzas, welcome (we don't bite - much :-)
15:01:56 <bauzas> n0ano: thanks :)
15:02:51 <n0ano> I sent out an agenda but most of the people that are concerned with those items aren't here yet
15:03:51 <n0ano> Given the US holiday this week this meeting might be a bust
15:04:04 <toan-tran> I see Boris with memcached based
15:04:09 <toan-tran> Yathi with instance group
15:04:09 <jgallard> hi all
15:04:19 <toan-tran> collins cannot join
15:04:29 <n0ano> boris doesn't appear to be online and yathi hasn't said anything
15:04:38 <toan-tran> what is "black box scheduler" ?
15:04:44 <MikeSpreitzer> hi
15:04:50 <Yathi> Hi
15:05:29 <n0ano> a session from the summit, basically allow the system to use a `black box' scheduler, put in the data and the black box gives the scheduling answer
15:05:56 <jgallard> this is the same thing as "scheduling as a service" ?
15:06:04 <MikeSpreitzer> How is BB sched different from plugging in a custom scheduler?
15:06:07 <bauzas> n0ano: is it related to the scheduling-as-a-service thing ?
15:06:07 <Yathi> is it the session we proposed ? - smart resource placement ?
15:06:14 <bauzas> jgallard: :)
15:06:17 <jgallard> :)
15:06:38 <n0ano> jgallard, I don't think so, saas moves the scheduler into a separately addressable service, black box is changing the internals of the scheduler
15:06:40 <Yathi> garyk are you on?
15:07:23 <alaski> I think a "black box" scheduler would have to be a new scheduler that's plugged in rather than filter_scheduler.  There would have to be a compelling reason for a deployer to use it
15:07:25 <garyk> hi, sorry, was on a call
15:07:28 <n0ano> Yathi, I believe the BB was from the rethinking scheduler design session
15:07:33 <jgallard> n0ano: ok, thanks for the clarification
15:07:53 <toan-tran> do we have an etherpad on BB?
15:08:16 <n0ano> one was started at the summit, it should still be there
15:08:36 <n0ano> #topic black box scheduler
15:08:49 <garyk> toan-tran: let me try and look up lifeless's etherpad on the scheduling
15:09:06 <toan-tran> garyk: thanks
15:09:24 <n0ano> alaski, yes, I was worried about throwing out the baby with the bath water with this proposal
15:09:29 <Yathi> It will be good to have the link... for all the session etherpads.. I seem to have lost it
15:09:55 <MikeSpreitzer> https://wiki.openstack.org/wiki/Summit/Icehouse/Etherpads#Nova
15:10:01 <n0ano> the current filter scheduler has some scaling concerns, I don't know that we have to throw it away completely to address them.
15:10:10 <MikeSpreitzer> I do not see the words "black box" on that index
15:10:19 <jgallard> MikeSpreitzer: thanks
15:10:25 <garyk> here is the proposal - https://etherpad.openstack.org/p/icehouse-external-scheduler
15:10:57 <n0ano> MikeSpreitzer, that's my interpretation, that's probably not the exact words from the session but I think it describes it better
15:11:22 <MikeSpreitzer> What garyk posted is Robert Collins' proposal
15:11:38 <MikeSpreitzer> that's not "black box", that's code refactoring
15:11:44 <bauzas> n0ano: was it about extending the resource tracker ?
15:11:46 <garyk> MikeSpreitzer: yes, that is correct. it seems to be gaining momentum
15:12:01 <garyk> My understanding is that the first step will be code moving
15:12:12 <garyk> Then there will be discussion how to make it into a service
15:12:33 <bauzas> garyk: agreed, that's the saas goal
15:12:34 <MikeSpreitzer> garyk: neither of those is "black box", at least as the words are usually construed
15:12:44 <garyk> :)
15:12:44 <n0ano> bauzas, it was to create a set of constraints that could be fed to an industry standard scheduler code
15:12:51 <Yathi> black box, if I remember about the rethinking scheduler design proposal - it is about the multiple scheduler threads
15:13:19 <Yathi> but can't recollect this being called as black box
15:13:23 <bauzas> n0ano: ah, so you talk about this one ? https://etherpad.openstack.org/p/IcehouseNovaExtensibleSchedulerMetrics
15:13:35 <MikeSpreitzer> OK, maybe the problem is just bad wording on today's agenda
15:13:38 <alaski> I think there was concern that a solver scheduler would be a black box
15:14:10 <Yathi> as part of the smart resource placement design session - we talked about Solver Scheduler - a constraint based solver
15:14:25 <n0ano> bauzas, no, that's not it either, let me look
15:14:28 <Yathi> a pluggable "black box" so to say!
15:14:43 <MikeSpreitzer> Any replaceable module in a system is a black box in that sense, alaski, right?  We define its interface, internals are private.
15:15:09 <toan-tran> pluggable? plugged to what?
15:15:13 <toan-tran> nova?
15:15:14 <n0ano> found it - https://etherpad.openstack.org/p/RethinkingSchedulerDesign
15:15:20 <toan-tran> or openstack in general?
15:15:52 <alaski> MikeSpreitzer: in a sense yes.  But with the filter_scheduler it's easy to trace how it made its decision, a solver scheduler was considered a potential black box because there's not that same traceability
15:16:23 <alaski> it's more about debugging issues when the scheduler doesn't return an answer you expect
15:16:30 <MikeSpreitzer> ah yes, I remember that remark
15:16:37 <Yathi> exactly.. this issue was raised at the session
15:17:39 <Yathi> traceability may have to be introduced, probably with some logging if it is possible
15:17:50 <MikeSpreitzer> But I'm not sure what to do with it.  Are we to shy away from every computation that is not easy to reproduce in someone's head?
15:18:43 <alaski> In my opinion, no.  But it can't be the only, or default, option for Nova
15:19:00 <MikeSpreitzer> because...?
15:19:20 <alaski> because the default is used for gating code changes and traceability is a necessity
15:19:42 <MikeSpreitzer> what exactly do you mean by "traceability"?
15:20:06 <Yathi> I believe even with filter scheduler - it is a list of filters
15:20:16 <Yathi> so some filters will fail and log ?
15:20:30 <alaski> understanding why a scheduling decision was made.  If Jenkins fails a gate check because it couldn't schedule an instance, I want to know why
15:20:39 <MikeSpreitzer> thanks
15:20:52 <toan-tran> alaski: we have logs on every filter
15:20:59 <toan-tran> cant that help?
15:21:41 <MikeSpreitzer> In my group's previous work, we developed a replay framework.  Problem instances can be logged completely, and replayed into a test harness for debugging purposes.
15:21:57 <MikeSpreitzer> Essentially, a formalized kind of log that can be replayed.
15:22:14 <n0ano> MikeSpreitzer, but how do you know the exact state that the system was in in order to replay things
15:22:34 <MikeSpreitzer> The log contains all the relevant information.
15:22:35 <alaski> toan-tran: I'm not sure if there are logs on every filter, but they can be added.  And there is a blueprint for additional logging in the scheduler being worked on
15:23:07 <toan-tran> alaski: at least the filter_scheduler says which filter returns which hosts
15:23:34 <toan-tran> of course, it's inside the filter that we have to add logging if we need more details
15:23:44 <n0ano> always remembering that logging adds overhead, we're already concerned about scheduler efficiency
15:23:50 <toan-tran> we also have Error codes, although not very detailed
15:24:03 <alaski> toan-tran: right.  The filter_scheduler is fine, the concern is regarding a potential new scheduler which is based on more complicated solving methods, or possibly even heuristics
15:24:34 <alaski> and by fine I mean not too bad, it could certainly be better
15:24:41 <toan-tran> alaski: agreed
15:24:54 <MikeSpreitzer> Regardless of decision method, same inputs apply, right?
15:25:25 <MikeSpreitzer> Would it be OK to have a variable level of logging?  Full in the gate, production might be less?
15:25:34 <Yathi> just to be clear.. the idea is not yet to replace Filter scheduler.. provide an additional option for a scheduler driver
15:25:50 <n0ano> MikeSpreitzer, I think that's an absolute requirement
15:25:52 <cfriesen> is logging really expensive?  I thought the issue was mostly the time to pull the data out of the database?
15:26:06 <toan-tran> so basically we need a framework to write the new scheduler, some steps that it must voice the state?
15:26:17 <alaski> Yathi: yes.
15:26:38 <n0ano> cfriesen, the logs have to be stored somewhere, we're already concerned about DB access, this would just make it worse
15:26:47 <alaski> MikeSpreitzer: variable logging would be great
15:27:10 <cfriesen> why not just stream the logs via syslog?
15:27:20 <Yathi> I think it is about enhancing a decision making engine to be able to clearly log which of the constraints did not satisfy
15:27:50 <MikeSpreitzer> Yathi: getting a log of complete input is non-trivial
15:27:59 <MikeSpreitzer> but necessary to replay and explain.
15:28:18 <MikeSpreitzer> However, note that some serious guys do very extensive logging all the time
15:28:29 <n0ano> cfriesen, possible but one of the ideas is creating multiple schedulers, with multiples a single log point would be helpful although maybe I'm overthinking things
15:28:36 <MikeSpreitzer> Do I recall correctly that Google logs a lot all the time?
15:29:16 <Yathi> I guess I don't have anything else to add here at this point on the logging aspect
15:30:00 <toan-tran> log is not good, we should think about creating info objects
15:30:11 <MikeSpreitzer> I have experience with IBM products that offer variable level of logging.  Our product guys love it.  I hate it when called in to debug a customer problem, they always logged too little, so it always starts with "turn up the logging to XXX and then reproduce the problem"
15:30:12 <toan-tran> I think we have a blueprint for that
15:30:25 <n0ano> toan-tran, not sure I understand what you mean about objects
15:30:48 <alaski> toan-tran: https://blueprints.launchpad.net/nova/+spec/record-scheduler-information though it's still under discussion
15:31:08 <n0ano> MikeSpreitzer, but at least that's an option vs. no or minimal logging
15:31:44 <MikeSpreitzer> yes
15:32:12 <MikeSpreitzer> What we did at first is to make some of our optional logging have a very precise and parseable format, put all information on scheduler problems in there.
15:32:30 <n0ano> well, one take away from this seems to be a consensus that we need to consider logging, especially variable level
15:32:39 <MikeSpreitzer> Later the product guys got interested in non-optional binary logging of structured data, but I'm not sure how far they have taken it thus far.
15:32:50 <n0ano> I don't know if there is any kind of logging standard in OpenStack, anybody know?
15:33:28 <russellb> openstack/common/log.py is what everything uses
15:33:29 <toan-tran> alaski: this is what I'm talking about: https://blueprints.launchpad.net/nova/+spec/add-missing-notifications
15:34:02 <toan-tran> I remember it has had more information than current version
15:34:22 <n0ano> russellb, which I believe puts everything in files on the local machine with no level capability
15:34:54 <alaski> n0ano: there are level capabilities
15:35:20 <toan-tran> and this one: https://blueprints.launchpad.net/nova/+spec/notification-compute-scheduler
15:35:20 <n0ano> alaski, which are settable from configuration files/run time?
15:35:27 <russellb> and can use syslog
15:35:41 <russellb> yes, you configure what levels you want logged
15:35:44 <russellb> and where you want the logs to go
15:35:48 <bauzas> n0ano: you just set it explicitly
15:36:23 <n0ano> sounds like the infrastructure is there then, we just need to make sure all the filters use the logging services properly
15:37:05 <MikeSpreitzer> And if we want to be able to debug scheduler decision making, "properly" means log all the relevant information at the chosen log level.
15:37:20 <n0ano> MikeSpreitzer, +2
15:37:28 <n0ano> s/+2/+1
15:38:00 <toan-tran> Mike: +1
15:38:07 <toan-tran> the question is, how do we find "relevant"?
15:38:27 <bauzas> I only played with a global logger for the whole project, don't know if we can have a special logger for scheduling things
15:38:48 <bauzas> afaik, the logger is global to nova
15:38:59 <n0ano> bauzas, I would hope we don't need anything special, standard logging services should be fine
15:39:06 <toan-tran> the problem of text logging is that the developer of a scheduler can write anything in the log
15:39:19 <toan-tran> which is not necessarily meaningful to others
15:39:22 <bauzas> n0ano: then you're fine
15:39:35 <MikeSpreitzer> toan-tran: that's why I talk about a precisely defined format for the scheduler problem info
15:39:47 <toan-tran> Mike: agreed!
15:40:03 <toan-tran> should we create a log class for that?
15:40:11 <toan-tran> put some structure into what is logged
15:40:56 <alaski> That's probably a good idea, but I think there's more immediate work before that becomes a concern
15:41:02 <MikeSpreitzer> I agree
15:41:15 <n0ano> some structure is good as long as there is the freedom to add other things that aren't part of the structure
15:41:43 <bauzas> toan-tran: there is no need for a log class
15:42:08 <bauzas> you just have to explicitly define which logger name you want
15:42:15 <n0ano> I'm feeling that someone needs to create a BP to propose some standardized logging for the current scheduler filters
15:42:16 <MikeSpreitzer> In my group's work, we have an internal API to the solver, and it has simple style: input is a whole problem, output is a whole answer.  It is pretty easy to do complete logging in that case.
15:43:06 <MikeSpreitzer> We have not had to worry about alternate solvers or alternate schedulers.
15:43:09 <toan-tran> Mike: is it possible to record the state of the system in the log?
15:43:39 <n0ano> MikeSpreitzer, the filter scheduler is kind of like that, input is the set of possible nodes and output is the set of acceptable nodes
15:43:39 <MikeSpreitzer> Currently we log snapshots of the relevant state info.  Alternatively the log could stream updates.
15:44:22 <MikeSpreitzer> As alaski said, I think we have beat this horse enough for now.
15:44:55 <Yathi> Mike +1
15:45:06 <n0ano> agreed, since Yathi is here let's switch to
15:45:15 <n0ano> #topic instance groups
15:45:38 <n0ano> Yathi, do you have an update on this
15:45:58 <Yathi> garyk you want to say something
15:47:07 <toan-tran> well, since no one says a word, I have a question :)
15:47:12 <Yathi> no major update as of now.  But the plan after the summit was to continue the implementation on a simpler instance group model
15:47:15 <n0ano> looks like garyk got called away
15:47:25 <toan-tran> if we intend to make it into nova
15:47:35 <toan-tran> do we keep edge & policy?
15:47:41 <Yathi> a flat group model
15:47:55 <Yathi> we do not keep the edge
15:48:04 <toan-tran> Yathi: +1
15:48:12 <n0ano> Yathi, I thought there was work needed on the V3 API, is that ongoing
15:48:19 <toan-tran> what about policy, we don't have policy manager either
15:48:24 <toan-tran> ?
15:48:38 <Yathi> yeah I believe it is part of the plan.. to complete what was pending from Havana time..
15:49:16 <Yathi> work is needed for V3 API
15:49:44 <n0ano> do you think that will be controversial or should it be straight forward
15:49:52 * n0ano always worries about API changes
15:50:52 <cfriesen> I've been playing with the current instance groups CLI and have some comments on usability--where do I send feedback?
15:51:01 <Yathi> we will sync up again with others - garyk, debo and discuss on the remaining tasks
15:51:42 <MikeSpreitzer> cfriesen: I'm just a newbie here, my guess is the mailing list
15:51:57 <Yathi> please send it to the dev mailing list - that is the best
15:52:02 <MikeSpreitzer> but you can talk to us now too!
15:52:36 <cfriesen> there's a bunch of stuff I ran into...like it would be nice to accept human-readable group names in the commands rather than only the full group UUID
15:52:48 <MikeSpreitzer> +1 in general on that
15:52:53 <cfriesen> and to me it doesn't make sense to have an "instance-group-add-members" command where the member argument is optional
15:53:05 <cfriesen> what does that even mean?  :)
15:53:40 <MikeSpreitzer> me steps back, waiting for someone who designed that API to answer
15:54:07 * MikeSpreitzer will eventually remember to type a slash before a command
15:54:20 <n0ano> sounds like no one wants to admit ownership, might need to ask that one on the dev mailing list
15:54:34 <Yathi> I think it is best to compile an email
15:54:56 <cfriesen> okay, will do.
15:55:05 <n0ano> OK, time running down
15:55:09 <n0ano> #topic opens
15:55:21 <garyk> sorry, i had internet problems.
15:55:31 <n0ano> anybody have any opens they want to raise in the few minutes we have available
15:55:39 <garyk> instance group updates: have posted scheduler changes. pending api changes - debo will work on these next week
15:55:45 <garyk> sorry for late update
15:55:50 <n0ano> garyk, NP, we didn't say too many bad things about you :-)
15:55:55 <garyk> :)
15:56:09 <n0ano> garyk, yeah, that's what we got, pretty much WIP
15:56:40 <n0ano> any other opens
15:56:43 <toan-tran> I'd like to discuss on SaaS
15:56:58 <toan-tran> well, discuss on SaaS's discussion :)
15:57:03 <Yathi> you mean the external scheduler ?
15:57:09 <toan-tran> yeah
15:57:17 <bauzas> we're running out of time
15:57:19 <n0ano> toan-tran, I would like to discuss it also but we'll need a full session for that
15:57:20 <Yathi> that might need a lot of time..
15:57:32 <n0ano> Yathi, no might, it will take a lot of time.
15:57:42 <Yathi> :)
15:57:44 <toan-tran> that's what I'm saying :)
15:57:56 <toan-tran> how we organise discussion on SaaS
15:58:02 <jgallard> can we add this item in 1st for next week?
15:58:04 <jgallard> :)
15:58:11 <alaski> +1
15:58:21 <bauzas> I was thinking there was a separate meeting on that point, non ?
15:58:23 <bauzas> no ?
15:58:36 <toan-tran> I don't know where Collins lives
15:58:37 <n0ano> next week, if possible, I'd like to get Boris on board to talk about memcached, that's the most important immediate topic, we can put SaaS as the 2nd priority
15:58:46 <bauzas> toan-tran: he lives in NZ
15:58:49 <toan-tran> if he lives in UTC+13
15:59:02 <toan-tran> ...
15:59:12 <alaski> n0ano: sounds good
15:59:14 <toan-tran> ok so we need another slot
15:59:19 <jgallard> n0ano: ok, great :)
15:59:25 <bauzas> that's what lifeless proposed
15:59:30 <toan-tran> not scheduler meeting
15:59:36 <n0ano> we'll discuss further next week
15:59:40 <n0ano> tnx everyone
15:59:45 <n0ano> #endmeeting