14:00:08 <cdent> #startmeeting nova_scheduler
14:00:09 <openstack> Meeting started Mon Apr 4 14:00:08 2016 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:13 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:25 <bauzas> aloha
14:00:28 <mriedem> o/
14:00:28 <mlavalle> hello
14:00:31 <cdent> #link agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler
14:01:05 <alaski> o/
14:01:36 <_gryf> \o
14:02:03 <cdent> The first thing on the agenda is this spec: https://review.openstack.org/#/c/297898/ but jay's not going to be here until a bit and I'm not sure tim is here so perhaps we should shuffle things around a bit. I'm not sure this policy based scheduler maps to resource providers well so thought jay's input might be good.
14:02:18 <cdent> Agree to punt that discussion until after bugs?
14:03:18 <cdent> Yeah, I see no thinrichs
14:03:27 <bauzas> agreed
14:03:28 <cdent> Anybody have other specs they'd like to discuss?
14:03:50 <bauzas> I'm working on the check-destination-on-migrations spec
14:04:20 <bauzas> for providing a migration call to create a RequestSpec for the existing instances
14:04:28 <mlavalle> cdent: yeah, I want to bring up https://review.openstack.org/#/c/263898/
14:05:08 <mlavalle> cdent: I added in the comments a second approach to achieve synchronization between Nova and Neutron
14:05:09 <cdent> bauzas: do you have a link yet or is it not pushed yet?
14:05:15 <bauzas> not yet
14:05:23 <ruby_> 297898: If Tim (thinrichs) does not join, I can take any questions
14:05:41 <cdent> mlavalle: that's your comment on patchset 5, eventual consistency?
14:05:55 <mlavalle> cdent: correct
14:06:22 <cdent> bauzas: are you up to date on 263898 (I'm not)?
14:06:30 <mlavalle> cdent: I would like feedback from the team and also guidance from jaypipes as to how Neutron can release resources back to Nova
14:06:52 <johnthetubaguy> mlavalle: I should re-read that, I was worried about us losing the nice properties of the resource reservation
14:07:04 <cdent> jaypipes said he might show up about 20 minutes into the meeting.
14:07:24 <johnthetubaguy> mlavalle: I quite like the idea of neutron delegating the resource claims to this "external" scheduler (that currently lives in nova)
14:07:25 <mlavalle> johnthetubaguy: please take a look and let us know what you think
14:07:32 <bauzas> cdent: no, sorry, I have to review this one
14:07:54 <johnthetubaguy> cdent: I had a spec if we have time, just an awareness thing really
14:07:58 <bauzas> mlavalle: tbh, I'm trying to review the already approved specs before the summit
14:08:12 <cdent> johnthetubaguy: I think we do
14:08:14 <mlavalle> bauzas: :-)
14:08:54 <johnthetubaguy> so I have this idea around the scheduler and making it easier to configure: https://review.openstack.org/#/c/256323/
14:08:58 <jaypipes> heya, I'm here now, sorry for being late.
14:09:04 <johnthetubaguy> I hope to re-work this so it's a regular spec soon
14:09:17 <johnthetubaguy> and I have a prototype up, if folks are interested in what it might look like
14:09:36 <cdent> johnthetubaguy: that's the one with (actual math)++?
14:09:44 <johnthetubaguy> cdent: yeah
14:09:56 <bauzas> johnthetubaguy: ++
14:10:02 <jaypipes> johnthetubaguy: with a move of the scheduler filters to be simply DB queries, that wouldn't be necessary at all.
14:10:27 <jaypipes> johnthetubaguy: i.e. the filters and weighers become just WHERE and ORDER BY expressions in a single SQL statement. no need for any of it any more.
14:10:32 <johnthetubaguy> jaypipes: well it would I think, it replaces all the weights
14:10:38 <cdent> (db queries also equal actual math)
14:10:48 <johnthetubaguy> so the prototype is here: https://review.openstack.org/#/c/286023/
14:10:59 <johnthetubaguy> jaypipes: well I should clarify...
14:11:21 <johnthetubaguy> jaypipes: we could implement it in SQL, but I still like the different approach for the configuration
14:11:53 <jaypipes> johnthetubaguy: what do you mean by the different approach to configuration?
14:12:09 <jaypipes> I see just a single new option for listing of sched wgt classes
14:12:14 <johnthetubaguy> jaypipes: I have soft filters and hard filters
14:12:23 <johnthetubaguy> jaypipes: the spec goes through why that's useful
14:13:00 * jaypipes very skeptical of any solution that adds more configuration options for the scheduler with its existing design...
14:13:16 <jaypipes> johnthetubaguy: but, I will add a full review on the spec.
14:13:38 <johnthetubaguy> jaypipes: it's trying to reduce the options, but I admit it fails
14:13:44 <bauzas> jaypipes: why should we use SQL for existing filters?
14:14:05 <bauzas> jaypipes: we know that the performance is not really bad for python modules
14:14:14 <jaypipes> bauzas: we've discussed this before... see https://review.openstack.org/300178
14:14:57 <jaypipes> bauzas: you have seen my performance benchmarks, yes? the more compute nodes in the system, the worse the performance of the python-side filters becomes and the bigger a bottleneck it becomes.
14:14:58 <bauzas> jaypipes: wait
14:15:12 <bauzas> jaypipes: there are multiple filters
14:15:31 <jaypipes> bauzas: a 38% performance difference at only 800 nodes isn't "not really bad" IMHO.
14:15:33 <bauzas> jaypipes: I understand your reasons for CPU or memory filters, but you know that we have others
14:16:06 <jaypipes> bauzas: I am talking about the resource (quantitative) ones right now.
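[Editor's note: a minimal sketch of the "hard filters vs soft filters" split johnthetubaguy describes above — hard filters exclude hosts outright, soft filters only influence the ordering of survivors. All names and dict shapes here are illustrative assumptions, not Nova's actual classes.]

```python
# Hypothetical illustration of hard vs soft filters; not Nova code.

def ram_hard_filter(host, request):
    """Hard filter: exclude hosts that cannot fit the requested RAM."""
    return host["free_ram_mb"] >= request["ram_mb"]

def ram_soft_filter(host, request):
    """Soft filter (weigher): more free RAM after placement scores higher."""
    return host["free_ram_mb"] - request["ram_mb"]

def schedule(hosts, request, hard_filters, soft_filters):
    # Hard filters prune; soft filters rank what survives.
    candidates = [h for h in hosts if all(f(h, request) for f in hard_filters)]
    return sorted(candidates,
                  key=lambda h: sum(f(h, request) for f in soft_filters),
                  reverse=True)

hosts = [
    {"name": "node1", "free_ram_mb": 2048},
    {"name": "node2", "free_ram_mb": 8192},
    {"name": "node3", "free_ram_mb": 512},
]
ranked = schedule(hosts, {"ram_mb": 1024}, [ram_hard_filter], [ram_soft_filter])
```

Here node3 is pruned by the hard filter and node2 ranks first on the soft score; the point of the split is that a soft filter can never empty the candidate list.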
14:16:35 <bauzas> jaypipes: okay, I thought you were talking about *all* the filters
14:16:35 <cdent> bauzas: see my comment on https://review.openstack.org/#/c/300178/ for a useful way to think about it
14:17:01 <jaypipes> bauzas: my goal is to eventually get rid of them all, but right now that spec only discusses quantitative filters.
14:17:43 <mriedem> how do you get rid of the TrustedFilter which relies on an external service? if we're playing devil's advocate.
14:17:50 <bauzas> jaypipes: tbh, you know my opinion, I prefer to have multiple schedulers running, rather than just one calling for 800 computes
14:18:18 <jaypipes> bauzas: I have no idea why that statement is in opposition to my spec.
14:18:35 <jaypipes> mriedem: for the -3 users that use that filter, ok.
14:18:44 <johnthetubaguy> the external service is meant to report into the scheduler, via the compute node info I think?
14:18:44 <mriedem> jaypipes: :)
14:19:00 <jaypipes> johnthetubaguy: some bits of data, yes; others are real-time, IIRC.
14:19:40 <cdent> Can I make a suggestion? I've seen this discussion come up pretty much every scheduler meeting yet we never seem to resolve it. Can we put it on the agenda for the _next_ meeting so as to actually approach it with some thought and not derail this meeting with an item that wasn't on the agenda?
14:19:52 <mriedem> anywho, yeah,
14:19:54 <mriedem> or the summit
14:19:56 <johnthetubaguy> jaypipes: can do the real-time check after picking, but yeah, there are issues there
14:19:59 <mriedem> there are 2 slots for scheduler at the summit
14:20:10 <mriedem> jaypipes said he'd review johnthetubaguy's backlog spec
14:20:11 <johnthetubaguy> cdent: the use-SQL thing?
14:20:13 <jaypipes> anyway, what we really need is a functional scheduler-testing framework. Yingxin has a start of that proposed, but it's got a number of issues that I've commented on. Would appreciate more reviews on his patch, please.
14:20:21 <cdent> johnthetubaguy: yeah, the extent thereof
14:20:36 <cdent> and summit will be too late if we want to get it done for this cycle, I reckon
14:20:38 <johnthetubaguy> mriedem: so my spec should be independent of the SQL thing really
14:20:46 <bauzas> agreed with johnthetubaguy
14:21:06 <johnthetubaguy> cdent: I don't think it's our biggest problem right now, if we can get the initial resource filters done in SQL
14:21:08 <mriedem> johnthetubaguy: i'm just basing that on jay saying it might not be necessary given sql filtering
14:21:10 <bauzas> and using SQL filters would mean that it would be a big bang for users
14:21:24 <jaypipes> johnthetubaguy: but it will add another configuration option that frankly won't be applicable after the filters/weighers are moved to SQL clauses... but ok.
14:21:25 <bauzas> so, I prefer to see some stable situation with the existing system
14:21:36 <johnthetubaguy> jaypipes: it should be applicable after the move
14:21:56 * cdent drums his fingers on the table
14:22:03 <jaypipes> johnthetubaguy: applicable how? you mean we would interpret class name strings differently?
14:22:50 <johnthetubaguy> jaypipes: so, we shouldn't be referencing class names, it should just be named filters, but I guess I need to understand your proposal more to be sure
14:23:02 <jaypipes> k
14:23:26 <bauzas> TBH, I think we're discussing a new scheduler driver
14:23:33 <bauzas> even for the SQL one
14:24:00 <bauzas> I'm fine with having both, just thinking we should possibly create a very different driver
14:24:11 <jaypipes> bauzas: that's certainly a possibility.
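[Editor's note: a sketch of jaypipes' point above that quantitative filters and weighers can collapse into WHERE and ORDER BY clauses of a single SQL statement. The table name, columns, and data are invented for illustration (using an in-memory SQLite database), not Nova's actual schema.]

```python
import sqlite3

# Hypothetical compute_nodes table standing in for the scheduler's view
# of host state; schema and values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE compute_nodes (
    name TEXT, free_ram_mb INTEGER, free_disk_gb INTEGER)""")
conn.executemany("INSERT INTO compute_nodes VALUES (?, ?, ?)", [
    ("node1", 2048, 100),
    ("node2", 8192, 10),
    ("node3", 512, 200),
])

# RAM and disk filters become WHERE predicates; the "most free RAM"
# weigher becomes the ORDER BY. One query replaces the python-side loop.
rows = conn.execute("""
    SELECT name FROM compute_nodes
    WHERE free_ram_mb >= ? AND free_disk_gb >= ?
    ORDER BY free_ram_mb DESC
""", (1024, 20)).fetchall()
```

For a request of 1024 MB RAM and 20 GB disk, only node1 survives both predicates, which is the performance argument: the database prunes 800 nodes in one pass instead of Python iterating over all of them per request.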
14:24:16 <cdent> I think the clear thing we can take away from the last 23 minutes is that there are several specs pending, related to the scheduler, which need deeper review: johnthetubaguy's backlog idea, thinrichs' policy-based one, the neutron routed networks, and the entire resource-* stack
14:24:22 <cdent> #link https://review.openstack.org/#/c/263898/
14:24:28 <cdent> #link https://review.openstack.org/#/c/256323/
14:24:45 <cdent> #link https://review.openstack.org/#/q/owner:jaypipes%2540gmail.com+status:open
14:24:57 <cdent> #link https://review.openstack.org/#/c/297898/
14:25:20 <bauzas> heh, is jaypipes@gmail.com a new patch series? :p
14:25:21 <mriedem> i think the priority for this week is getting the compute node inventory spec approved, https://review.openstack.org/#/c/300175/
14:25:34 <cdent> with the caveat that mriedem wants us to not look at those yet, can we look at those soon please?
14:25:36 <mriedem> dan has a +2 on it, i went through it on friday and posted questions
14:25:56 <mriedem> we agreed before to get compute node inventory in prior to the summit
14:25:57 <bauzas> mriedem: I agree
14:26:03 <cdent> mriedem++
14:26:12 <mriedem> we have a xp session with neutron to discuss routed networks
14:26:16 <mriedem> basically,
14:26:26 <mriedem> wed morning the first 3 sessions are scheduler (2) + neutron xp (1)
14:26:43 <mriedem> so sql filters + john's spec + neutron routed networks can be discussed wed morning of the summit,
14:26:52 <mriedem> with any high level stuff sorted out before then
14:27:01 <cdent> #action dogpile on https://review.openstack.org/#/c/300175/ by everyone to get it approved this week
14:27:17 <mriedem> but let's focus on the things we've actually agreed on already
14:27:38 <mriedem> and hash the other stuff out in person
14:27:42 <mriedem> or in reviews
14:28:05 <cdent> moving on then?
14:28:05 <johnthetubaguy> mriedem: looking at your comments on the spec, are there things that would be an issue just fixing in a follow-up?
14:28:27 <johnthetubaguy> (in the interests of making more progress on that effort)
14:28:38 <johnthetubaguy> i should reword that
14:28:39 <mriedem> johnthetubaguy: i don't care about the nits, but i had some actual questions
14:28:59 <johnthetubaguy> OK, I will keep reading
14:29:28 <cdent> mriedem: can you remind the log please of which etherpad is holding pending priorities?
14:29:38 <mriedem> https://etherpad.openstack.org/p/newton-nova-priorities-tracking
14:29:48 <cdent> thanks
14:30:22 <cdent> #topic reviews
14:30:26 <cdent> any other reviews?
14:31:06 <cdent> #topic open
14:31:13 <cdent> anything else from anyone?
14:31:31 <_gryf> yup
14:31:31 <bauzas> thinrichs seems not to be here
14:31:38 <_gryf> I have a question/thoughts about enabling FPGA as a resource
14:31:43 <cdent> jaypipes: you were invoked several times before you showed up, with regard to a couple specs people wanted your specific input on; they are in the #links above
14:31:47 <_gryf> I've tried to gather the information the other day (spoke briefly with cdent and johnthetubaguy)
14:31:57 <_gryf> however it seems
14:32:01 <_gryf> that FPGA itself is not as trivial a resource as CPU or memory, while it is kind of a quantitative type
14:32:07 <jaypipes> cdent: yeah, I owe mlavalle and carl_baldwin a re-review on the routed networks spec.
14:32:10 <_gryf> although it requires some preparation before a vm could be deployed on such an fpga-enabled compute
14:32:17 <_gryf> and might require some action after vm deletion
14:32:21 <mlavalle> jaypipes: thanks :-)
14:32:29 <jaypipes> mlavalle: sorry for delay :(
14:32:29 <_gryf> and those are just for the start…
14:32:41 <mlavalle> jaypipes: np
14:32:59 <bauzas> _gryf: possibly a good point for the resource-* stuff
14:33:07 <_gryf> bauzas, yes
14:33:22 <bauzas> what I wonder is how we can extend that model easily
14:33:27 <jaypipes> _gryf: so, I view FPGA algorithms as very similar to SR-IOV VFs. they are assigned to a particular guest after some amount of setup on the host.
14:33:30 <bauzas> and yours is a nice usecase
14:33:51 <_gryf> that's why i've chimed in
14:33:54 <_gryf> jaypipes, actually
14:34:33 <bauzas> jaypipes: well, I think FPGA uses a different consumption algorithm than SR-IOV, right?
14:34:33 <_gryf> fpga can be treated both ways - as fixed accelerators - so those can be treated as sr-iov virtual functions
14:34:35 <_gryf> or
14:35:02 <_gryf> as accelerators with specific algorithms, for custom programs
14:35:22 <_gryf> so it is going to be a bit complicated
14:35:58 <bauzas> yeah, I think it's like vGPUs
14:36:01 <johnthetubaguy> _gryf: is there a way that one is images and the other is image_caching?
14:36:38 <johnthetubaguy> bauzas: except we have to "provision" it from an "image"
14:36:58 <_gryf> it is also possible that fpga can be divided into regions, where one of them accelerates ovs (and is exposed via vf and sr-iov) and the other slots can be used for custom bitstreams
14:37:08 <jaypipes> johnthetubaguy: not really...
14:37:38 <_gryf> johnthetubaguy, I was thinking about images
14:38:24 <_gryf> as artifacts in the glance glare project rather
14:38:29 <bauzas> _gryf: do you have pointers for me? FPGA takes me back to my school days
14:38:58 <_gryf> so that we don't have to spoil the image catalog in glance
14:39:37 <_gryf> bauzas, I can prepare some ml thread with links to the stuff, if you're interested :)
14:39:37 <johnthetubaguy> _gryf: we already have many types of image in there, in some deployments; we probably shouldn't overthink that
14:39:45 <bauzas> _gryf: yeah :)
14:39:48 <bauzas> ++
14:40:02 <johnthetubaguy> yeah, ML thread sounds good
14:40:04 <_gryf> johnthetubaguy, okay, that was just a thought :)
14:40:20 <bauzas> _gryf: I can find some OpenPower docs, is that related?
14:40:44 <johnthetubaguy> I guess the simple case: deploy image X onto FPGA, and use it?
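[Editor's note: a sketch of what bauzas gestures at above — extending the quantitative resource model so an FPGA region can be claimed alongside CPU and memory. The resource class name CUSTOM_FPGA_REGION and the dict shapes are illustrative assumptions, not the resource-providers API.]

```python
# Hypothetical host inventory tracking a custom FPGA resource class
# next to the familiar quantitative ones; illustration only.
inventory = {
    "VCPU": {"total": 32, "used": 0},
    "MEMORY_MB": {"total": 65536, "used": 0},
    "CUSTOM_FPGA_REGION": {"total": 4, "used": 0},
}

def can_fit(inventory, request):
    """True if every requested resource class has enough free capacity."""
    return all(inventory[rc]["total"] - inventory[rc]["used"] >= amount
               for rc, amount in request.items())

def claim(inventory, request):
    """Atomically consume the requested amounts, or refuse entirely."""
    if not can_fit(inventory, request):
        raise ValueError("insufficient capacity")
    for rc, amount in request.items():
        inventory[rc]["used"] += amount

# A guest taking one of the four FPGA regions along with CPU and RAM.
claim(inventory, {"VCPU": 4, "MEMORY_MB": 8192, "CUSTOM_FPGA_REGION": 1})
```

The quantitative part of FPGA fits this model cleanly; what the model does not capture is the qualitative side discussed above (provisioning a bitstream "image" before the region is usable, and cleanup after deletion).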
14:40:45 <cdent> #action _gryf to make a mailing list posting about the issues with FPGA, scheduling, images
14:40:46 <_gryf> bauzas, it might be - I've already bumped into some white papers
14:41:09 <bauzas> _gryf: okay, will continue digging
14:41:11 <johnthetubaguy> anyways, listing out the possible use cases seems good
14:41:38 <_gryf> ok, cool. I'll prepare something for tomorrow :)
14:41:45 <cdent> thanks _gryf
14:41:56 <cdent> anybody or anything else?
14:41:59 <ruby_> bauzas: thinrichs may not join (297898), I'll rep. Could we do this spec?
14:42:03 <bauzas> _gryf: now, my point is, apart from the usecases, are you already thinking of the implementation using the resource-classes, I guess?
14:42:06 <johnthetubaguy> _gryf: I would worry more about the usage patterns that the solutions, to start with
14:42:21 <_gryf> bauzas, that's right
14:42:22 <johnthetubaguy> s/that the/rather than the/
14:42:42 <bauzas> _gryf: okay, I think it would be nice to see some proposal in your ML thread :)
14:42:57 <_gryf> ok.
14:43:43 <cdent> anybody or anything else?
14:43:45 <_gryf> johnthetubaguy, I'll place the use cases as well.
14:44:20 <cdent> because if not we can give 15 minutes back to reviewing all that stuff that mriedem is cracking the whip about
14:45:03 <cdent> once
14:45:27 <cdent> twice
14:45:29 <mriedem> wait
14:45:39 <mriedem> ruby_ was asking to go over that policy driven scheduler spec twice now
14:46:02 <ruby_> yes please
14:46:08 <mriedem> #link policy-based scheduler spec https://review.openstack.org/#/c/297898/
14:46:15 <cdent> cool
14:46:46 <mriedem> ruby_: go ahead
14:46:49 <cdent> gist seems to be that nobody has read it yet; other than reading it, what is important to highlight?
14:46:59 <jaypipes> I read parts of it...
14:47:05 <ruby_> and?
14:47:48 <jaypipes> looks like the authors haven't been following along with what has been happening in the nova scheduler for the last few cycles and have just been trying to respin the solver scheduler into a new incarnation.
14:47:59 <ruby_> no, it is not the solver scheduler
14:48:27 <jaypipes> ruby_: are you familiar with what has happened in the nova scheduler and resource tracker since the Icehouse release?
14:48:40 <ruby_> Somewhat
14:48:51 <ruby_> We studied the nova scheduler.
14:48:53 <johnthetubaguy> ruby_: did you see my suggested simplification of the filters and weights idea, that makes the configuration easier to understand, with the idea being all the filters you need come in the box
14:49:28 <ruby_> We read the filters+weights idea as well. We think policies may be more generic
14:50:07 <ruby_> Ease writing new "filters" (aka constraints for placement decisions)
14:50:37 <jaypipes> I am really not a fan of adding a YAML syntax to the scheduler.
14:51:12 <ruby_> YAML was to configure policies in the system, which is what the admin is expected to do.
14:51:24 <cdent> I was under the impression, as well, that custom and/or new filters was not something we wanted to be in the business of encouraging?
14:51:31 <ruby_> Do you have another proposal (if not YAML)?
14:51:32 <cdent> Is that true or false?
14:51:46 <jaypipes> ruby_: yes, my proposal is not to reinvent the scheduler using Congress.
14:51:50 <johnthetubaguy> so we don't expect users to write code today; generally we expect what comes in the box to be expressive enough for 80% of users. I am curious what scenarios you see as not possible today, given the current plans
14:52:07 <jaypipes> johnthetubaguy: NFV. of course.
14:52:10 <ruby_> jaypipes: don't understand
14:52:11 <alaski> cdent: the scheduler should be a somewhat open framework, within some constraints
14:52:38 <jaypipes> ruby_: the work items:
14:52:38 <jaypipes> 1. Extract Datalog engine from OpenStack Congress into library
14:52:38 <jaypipes> 2. Provide YAML front-end to the Datalog engine, and incorporate into library
14:52:38 <jaypipes> 3. Ingest Nova database data into policy engine
14:52:45 <alaski> cdent: so custom filters/weighers are not considered bad
14:52:52 <ruby_> johnthetubaguy: even some of the examples you cited in filters+weights are not offered as filters today
14:53:59 <ruby_> jaypipes: the idea was to pull in the policy "core" engine. We are open to suggestions.
14:54:26 <jaypipes> ruby_: basically, when I see proposed specs like this (and the solver scheduler spec had virtually the exact same problem) that list "extensibility" and "monitoring" as use cases for something, I just think that the spec is not specific enough about the problem it is trying to solve and does not clearly explain why the proposal would address some problem.
14:54:50 <jaypipes> ruby_: "extensibility" and "monitoring" are not use cases.
14:55:01 <cdent> five minute warning
14:55:19 <ruby_> e.g. would we want weights to be configurable or context aware? when selecting from the filtered list?
14:55:56 <johnthetubaguy> I really think we need a specific set of use cases, so we can compare the approaches
14:56:16 <bauzas> agreed
14:56:22 <jaypipes> ruby_: "Understandability" is also not a use case.
14:56:24 <ruby_> ok, we will write up some use cases (policies) and give our reasoning.
14:56:30 <bauzas> what I actually wonder is how we can integrate that with the existing
14:56:36 <jaypipes> ruby_: and frankly, Datalog plus YAML does not understandable make.
14:56:48 <bauzas> and what is really needed for achieving policy-based scheduling
14:57:22 <ruby_> Datalog is the form the policies will be stored in. But admins do not need to use datalog
14:57:23 <jaypipes> ruby_: sorry to be harsh on this, but I've seen this academic approach to the scheduler proposed a number of times and I've yet to see any concrete use cases that it fulfills.
14:57:46 <ruby_> would a proof of concept implem help?
14:58:01 <ruby_> bauzas: the idea was to maintain the same scheduler API
14:58:10 <jaypipes> ruby_: it would be great to see something that definitively shows that the existing scheduler makes the wrong decisions for scenario X and this proposed scheduler makes the correct decision.
14:58:25 <cdent> two minutes
14:58:29 <bauzas> ruby_: so, your proposal is to have a separate scheduler driver that would use Congress as a backend?
14:58:35 <ruby_> ok: yes
14:59:07 <ruby_> is adjusting weights (ram/iops) etc. based on context a useful case for you?
14:59:13 <bauzas> ruby_: what I actually think - before talking of the implementation - is what kind of feature you would like to have
14:59:27 <johnthetubaguy> I think jaypipes covered it well; it's easier to review this when there are specific use cases where it demonstrates why the suggested approach beats the existing suggested approach
14:59:34 <ruby_> ok, we will improve the writing on this.
14:59:43 <ruby_> we will add some examples to the spec.
14:59:56 <jaypipes> ruby_: that is really vague (about the context in a weight). you will need to be much more specific.
14:59:57 <johnthetubaguy> a slight nit: Nova doesn't have a solver scheduler in tree, that is out of tree, AFAIK
15:00:07 <ruby_> thanks for all the input.
15:00:07 <cdent> times up
15:00:10 <cdent> #endmeeting
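[Editor's note: a minimal sketch of the "context-aware weights" question ruby_ raises near the end (adjusting ram/iops weights based on request context). jaypipes' point that this needs specifics stands; the multipliers, host attributes, and workload labels below are invented purely to make the idea concrete.]

```python
# Hypothetical context-aware weigher; names and values are assumptions.

def weigh(host, multipliers):
    """Combined score: weighted sum of free RAM and IOPS capacity."""
    return (multipliers["ram"] * host["free_ram_mb"]
            + multipliers["io"] * host["iops_capacity"])

def multipliers_for(context):
    """Pick weight multipliers from the request context rather than
    from fixed configuration: an I/O-intensive workload weighs IOPS
    capacity more heavily than free RAM."""
    if context.get("workload") == "io_intensive":
        return {"ram": 0.1, "io": 1.0}
    return {"ram": 1.0, "io": 0.1}

hosts = [
    {"name": "ram_heavy", "free_ram_mb": 8192, "iops_capacity": 100},
    {"name": "io_heavy", "free_ram_mb": 1024, "iops_capacity": 5000},
]

def best(hosts, context):
    m = multipliers_for(context)
    return max(hosts, key=lambda h: weigh(h, m))["name"]
```

The same filtered host list yields a different winner depending on context, which is the behavior ruby_ describes; whether that belongs in a policy engine or in the existing weigher classes is exactly the open question from the discussion above.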