14:00:08 <cdent> #startmeeting nova_scheduler
14:00:09 <openstack> Meeting started Mon Apr 4 14:00:08 2016 UTC and is due to finish in 60 minutes. The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:13 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:25 <bauzas> aloha
14:00:28 <mriedem> o/
14:00:28 <mlavalle> hello
14:00:31 <cdent> #link agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler
14:01:05 <alaski> o/
14:01:36 <_gryf> \o
14:02:03 <cdent> The first thing on the agenda is this spec: https://review.openstack.org/#/c/297898/ but jay's not going to be here until a bit and I'm not sure tim is here so perhaps we should shuffle things around a bit. I'm not sure this policy based scheduler maps to resource providers well so thought jay's input might be good.
14:02:18 <cdent> Agree to punt that discussion until after bugs?
14:03:18 <cdent> Yeah, I see no thinrichs
14:03:27 <bauzas> agreed
14:03:28 <cdent> Anybody have other specs they'd like to discuss?
14:03:50 <bauzas> I'm working on the check-destination-on-migrations spec
14:04:20 <bauzas> for providing a migration call to create a RequestSpec for the existing instances
14:04:28 <mlavalle> cdent: yeah, I want to bring up https://review.openstack.org/#/c/263898/
14:05:08 <mlavalle> cdent: I added in the comments a second approach to achieve synchronization between Nova and Neutron
14:05:09 <cdent> bauzas: do you have a link yet or is it not pushed yet?
14:05:15 <bauzas> not yet
14:05:23 <ruby_> 297898: If Tim (thinrichs) does not join, I can take any questions
14:05:41 <cdent> mlavalle: that's your comment on patchset 5, eventual consistency?
14:05:55 <mlavalle> cdent: correct
14:06:22 <cdent> bauzas: are you up to date on 263898 (I'm not)?
14:06:30 <mlavalle> cdent: I would like feedback from the team and also guidance from jaypipes as to how Neutron can release resources back to Nova
14:06:52 <johnthetubaguy> mlavalle: I should re-read that, I was worried about us losing the nice properties of the resource reservation
14:07:04 <cdent> jaypipes said he might show up about 20 minutes into the meeting.
14:07:24 <johnthetubaguy> mlavalle: I quite like the idea of neutron delegating the resource claims to this "external" scheduler (that currently lives in nova)
14:07:25 <mlavalle> johnthetubaguy: please take a look and let us know what you think
14:07:32 <bauzas> cdent: no, sorry, I have to review this one
14:07:54 <johnthetubaguy> cdent: I had a spec if we have time, just an awareness thing really
14:07:58 <bauzas> mlavalle: tbh, I'm trying to review the already approved specs before the summit
14:08:12 <cdent> johnthetubaguy: I think we do
14:08:14 <mlavalle> bauzas: :-)
14:08:54 <johnthetubaguy> so I have this idea around the scheduler and making it easier to configure: https://review.openstack.org/#/c/256323/
14:08:58 <jaypipes> heya, I'm here now, sorry for being late.
14:09:04 <johnthetubaguy> I hope to re-work this so it's a regular spec soon
14:09:17 <johnthetubaguy> and I have a prototype up, if folks are interested in what it might look like
14:09:36 <cdent> johnthetubaguy: that's the one with (actual math)++?
14:09:44 <johnthetubaguy> cdent: yeah
14:09:56 <bauzas> johnthetubaguy: ++
14:10:02 <jaypipes> johnthetubaguy: with a move of the scheduler filters to be simply DB queries, that wouldn't be necessary at all.
14:10:27 <jaypipes> johnthetubaguy: i.e. the filters and weighers become just WHERE and ORDER BY expressions in a single SQL statement. no need for any of it any more.
14:10:32 <johnthetubaguy> jaypipes: well it would I think, it replaces all the weights
14:10:38 <cdent> (db queries also equal actual math)
14:10:48 <johnthetubaguy> so the prototype is here: https://review.openstack.org/#/c/286023/
14:10:59 <johnthetubaguy> jaypipes: well I should clarify...
14:11:21 <johnthetubaguy> jaypipes: we could implement it in SQL, but I still like the different approach for the configuration
14:11:53 <jaypipes> johnthetubaguy: what do you mean by the different approach to configuration?
14:12:09 <jaypipes> I see just a single new option for listing of sched wgt classes
14:12:14 <johnthetubaguy> jaypipes: I have soft filters and hard filters
14:12:23 <johnthetubaguy> jaypipes: the spec goes through why that's useful
14:13:00 * jaypipes very skeptical of any solution that adds more configuration options for the scheduler with its existing design...
14:13:16 <jaypipes> johnthetubaguy: but, I will add a full review on the spec.
14:13:38 <johnthetubaguy> jaypipes: it's trying to reduce the options, but I admit it fails
14:13:44 <bauzas> jaypipes: why should we use SQL for existing filters?
14:14:05 <bauzas> jaypipes: we know that the performance is not really bad for python modules
14:14:14 <jaypipes> bauzas: we've discussed this before... see https://review.openstack.org/300178
14:14:57 <jaypipes> bauzas: you have seen my performance benchmarks, yes? the more compute nodes in the system, the worse the performance of the python-side filters becomes and the bigger a bottleneck it becomes.
14:14:58 <bauzas> jaypipes: wait
14:15:12 <bauzas> jaypipes: there are multiple filters
14:15:31 <jaypipes> bauzas: a 38% performance difference at only 800 nodes isn't "not really bad" IMHO.
14:15:33 <bauzas> jaypipes: I understand your reasons for CPU or memory filters, but you know that we have others
14:16:06 <jaypipes> bauzas: I am talking about the resource (quantitative) ones right now.
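[Editor's note: a minimal sketch of the "hard filters vs soft filters" split johnthetubaguy describes above — hard filters exclude hosts outright, soft filters only influence the ordering of survivors. All names and dict shapes here are illustrative assumptions, not Nova's actual classes.]

```python
# Hypothetical illustration of hard vs soft filters; not Nova code.

def ram_hard_filter(host, request):
    """Hard filter: exclude hosts that cannot fit the requested RAM."""
    return host["free_ram_mb"] >= request["ram_mb"]

def ram_soft_filter(host, request):
    """Soft filter (weigher): more free RAM after placement scores higher."""
    return host["free_ram_mb"] - request["ram_mb"]

def schedule(hosts, request, hard_filters, soft_filters):
    # Hard filters prune; soft filters rank what survives.
    candidates = [h for h in hosts if all(f(h, request) for f in hard_filters)]
    return sorted(candidates,
                  key=lambda h: sum(f(h, request) for f in soft_filters),
                  reverse=True)

hosts = [
    {"name": "node1", "free_ram_mb": 2048},
    {"name": "node2", "free_ram_mb": 8192},
    {"name": "node3", "free_ram_mb": 512},
]
ranked = schedule(hosts, {"ram_mb": 1024}, [ram_hard_filter], [ram_soft_filter])
```

Here node3 is pruned by the hard filter and node2 ranks first on the soft score; the point of the split is that a soft filter can never empty the candidate list.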
14:16:35 <bauzas> jaypipes: okay, I thought you were talking about *all* the filters
14:16:35 <cdent> bauzas: see my comment on https://review.openstack.org/#/c/300178/ for a useful way to think about it
14:17:01 <jaypipes> bauzas: my goal is to eventually get rid of them all, but right now that spec only discusses quantitative filters.
14:17:43 <mriedem> how do you get rid of the TrustedFilter which relies on an external service? if we're playing devil's advocate.
14:17:50 <bauzas> jaypipes: tbh, you know my opinion, I prefer to have multiple schedulers running, rather than just one calling for 800 computes
14:18:18 <jaypipes> bauzas: I have no idea why that statement is in opposition to my spec.
14:18:35 <jaypipes> mriedem: for the -3 users that use that filter, ok.
14:18:44 <johnthetubaguy> the external service is meant to report into the scheduler, via the compute node info I think?
14:18:44 <mriedem> jaypipes: :)
14:19:00 <jaypipes> johnthetubaguy: some bits of data, yes; others are real-time, IIRC.
14:19:40 <cdent> Can I make a suggestion? I've seen this discussion come up pretty much every scheduler meeting yet we never seem to resolve it. Can we put it on the agenda for the _next_ meeting so as to actually approach it with some thought and not derail this meeting with an item that wasn't on the agenda?
14:19:52 <mriedem> anywho, yeah,
14:19:54 <mriedem> or the summit
14:19:56 <johnthetubaguy> jaypipes: can do the real-time check after picking, but yeah, there are issues there
14:19:59 <mriedem> there are 2 slots for scheduler at the summit
14:20:10 <mriedem> jaypipes said he'd review johnthetubaguy's backlog spec
14:20:11 <johnthetubaguy> cdent: the use-SQL thing?
14:20:13 <jaypipes> anyway, what we really need is a functional scheduler-testing framework. Yingxin has a start of that proposed, but it's got a number of issues that I've commented on. Would appreciate more reviews on his patch, please.
14:20:21 <cdent> johnthetubaguy: yeah, the extent thereof
14:20:36 <cdent> and summit will be too late if we want to get it done for this cycle, I reckon
14:20:38 <johnthetubaguy> mriedem: so my spec should be independent of the SQL thing really
14:20:46 <bauzas> agreed with johnthetubaguy
14:21:06 <johnthetubaguy> cdent: I don't think it's our biggest problem right now, if we can get the initial resource filters done in SQL
14:21:08 <mriedem> johnthetubaguy: i'm just basing that on jay saying it might not be necessary given sql filtering
14:21:10 <bauzas> and using SQL filters would mean that it would be a big bang for users
14:21:24 <jaypipes> johnthetubaguy: but it will add another configuration option that frankly won't be applicable after the filters/weighers are moved to SQL clauses... but ok.
14:21:25 <bauzas> so, I prefer to see some stable situation with the existing system
14:21:36 <johnthetubaguy> jaypipes: it should be applicable after the move
14:21:56 * cdent drums his fingers on the table
14:22:03 <jaypipes> johnthetubaguy: applicable how? you mean we would interpret class name strings differently?
14:22:50 <johnthetubaguy> jaypipes: so, we shouldn't be referencing class names, it should just be named filters, but I guess I need to understand your proposal more to be sure
14:23:02 <jaypipes> k
14:23:26 <bauzas> TBH, I think we're discussing a new scheduler driver
14:23:33 <bauzas> even for the SQL one
14:24:00 <bauzas> I'm fine with having both, just thinking we should possibly create a very different driver
14:24:11 <jaypipes> bauzas: that's certainly a possibility.
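[Editor's note: a sketch of jaypipes' point above that quantitative filters and weighers can collapse into WHERE and ORDER BY clauses of a single SQL statement. The table name, columns, and data are invented for illustration (using an in-memory SQLite database), not Nova's actual schema.]

```python
import sqlite3

# Hypothetical compute_nodes table standing in for the scheduler's view
# of host state; schema and values are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE compute_nodes (
    name TEXT, free_ram_mb INTEGER, free_disk_gb INTEGER)""")
conn.executemany("INSERT INTO compute_nodes VALUES (?, ?, ?)", [
    ("node1", 2048, 100),
    ("node2", 8192, 10),
    ("node3", 512, 200),
])

# RAM and disk filters become WHERE predicates; the "most free RAM"
# weigher becomes the ORDER BY. One query replaces the python-side loop.
rows = conn.execute("""
    SELECT name FROM compute_nodes
    WHERE free_ram_mb >= ? AND free_disk_gb >= ?
    ORDER BY free_ram_mb DESC
""", (1024, 20)).fetchall()
```

For a request of 1024 MB RAM and 20 GB disk, only node1 survives both predicates, which is the performance argument: the database prunes 800 nodes in one pass instead of Python iterating over all of them per request.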
14:24:16 <cdent> I think the clear thing we can take away from the last 23 minutes is that there are several specs pending, related to the scheduler, which need deeper review: johnthetubaguy's backlog idea, thinrichs' policy-based one, the neutron routed networks, and the entire resource-* stack
14:24:22 <cdent> #link https://review.openstack.org/#/c/263898/
14:24:28 <cdent> #link https://review.openstack.org/#/c/256323/
14:24:45 <cdent> #link https://review.openstack.org/#/q/owner:jaypipes%2540gmail.com+status:open
14:24:57 <cdent> #link https://review.openstack.org/#/c/297898/
14:25:20 <bauzas> heh, is jaypipes@gmail.com a new patch series? :p
14:25:21 <mriedem> i think the priority for this week is getting the compute node inventory spec approved, https://review.openstack.org/#/c/300175/
14:25:34 <cdent> with the caveat that mriedem wants us to not look at those yet, can we look at those soon please?
14:25:36 <mriedem> dan has a +2 on it, i went through it on friday and posted questions
14:25:56 <mriedem> we agreed before to get compute node inventory in prior to the summit
14:25:57 <bauzas> mriedem: I agree
14:26:03 <cdent> mriedem++
14:26:12 <mriedem> we have a xp session with neutron to discuss routed networks
14:26:16 <mriedem> basically,
14:26:26 <mriedem> wed morning the first 3 sessions are scheduler (2) + neutron xp (1)
14:26:43 <mriedem> so sql filters + john's spec + neutron routed networks can be discussed wed morning of the summit,
14:26:52 <mriedem> with any high level stuff sorted out before then
14:27:01 <cdent> #action dogpile on https://review.openstack.org/#/c/300175/ by everyone to get it approved this week
14:27:17 <mriedem> but let's focus on the things we've actually agreed on already
14:27:38 <mriedem> and hash the other stuff out in person
14:27:42 <mriedem> or in reviews
14:28:05 <cdent> moving on then?
14:28:05 <johnthetubaguy> mriedem: looking at your comments on the spec, are there things that would be an issue just fixing in a follow-up?
14:28:27 <johnthetubaguy> (in the interests of making more progress on that effort)
14:28:38 <johnthetubaguy> i should reword that
14:28:39 <mriedem> johnthetubaguy: i don't care about the nits, but i had some actual questions
14:28:59 <johnthetubaguy> OK, I will keep reading
14:29:28 <cdent> mriedem: can you remind the log please of which etherpad is holding pending priorities?
14:29:38 <mriedem> https://etherpad.openstack.org/p/newton-nova-priorities-tracking
14:29:48 <cdent> thanks
14:30:22 <cdent> #topic reviews
14:30:26 <cdent> any other reviews?
14:31:06 <cdent> #topic open
14:31:13 <cdent> anything else from anyone?
14:31:31 <_gryf> yup
14:31:31 <bauzas> thinrichs seems not to be here
14:31:38 <_gryf> I have a question/thoughts about enabling FPGA as a resource
14:31:43 <cdent> jaypipes: you were invoked several times before you showed up, with regard to a couple specs people wanted your specific input on; they are in the #links above
14:31:47 <_gryf> I've tried to gather the information the other day (spoke briefly with cdent and johnthetubaguy)
14:31:57 <_gryf> however it seems
14:32:01 <_gryf> that FPGA itself is not as trivial a resource as CPU or memory, while it is kind of a quantitative type
14:32:07 <jaypipes> cdent: yeah, I owe mlavalle and carl_baldwin a re-review on the routed networks spec.
14:32:10 <_gryf> although it requires some preparation before a vm could be deployed on such an fpga-enabled compute
14:32:17 <_gryf> and might require some action after vm deletion
14:32:21 <mlavalle> jaypipes: thanks :-)
14:32:29 <jaypipes> mlavalle: sorry for delay :(
14:32:29 <_gryf> and those are just for the start…
14:32:41 <mlavalle> jaypipes: np
14:32:59 <bauzas> _gryf: possibly a good point for the resource-* stuff
14:33:07 <_gryf> bauzas, yes
14:33:22 <bauzas> what I wonder is how we can extend that model easily
14:33:27 <jaypipes> _gryf: so, I view FPGA algorithms as very similar to SR-IOV VFs. they are assigned to a particular guest after some amount of setup on the host.
14:33:30 <bauzas> and yours is a nice usecase
14:33:51 <_gryf> that's why i've chimed in
14:33:54 <_gryf> jaypipes, actually
14:34:33 <bauzas> jaypipes: well, I think FPGA uses a different consumption algorithm than SR-IOV, right?
14:34:33 <_gryf> fpga can be treated both ways - as fixed accelerators - so those can be treated as sr-iov virtual functions
14:34:35 <_gryf> or
14:35:02 <_gryf> as accelerators with specific algorithms, for custom programs
14:35:22 <_gryf> so it is going to be a bit complicated
14:35:58 <bauzas> yeah, I think it's like vGPUs
14:36:01 <johnthetubaguy> _gryf: is there a way that one is images and the other is image_caching?
14:36:38 <johnthetubaguy> bauzas: except we have to "provision" it from an "image"
14:36:58 <_gryf> it is also possible that fpga can be divided into regions, where one of them accelerates ovs (and is exposed via vf and sr-iov) and the other slots can be used for custom bitstreams
14:37:08 <jaypipes> johnthetubaguy: not really...
14:37:38 <_gryf> johnthetubaguy, I was thinking about images
14:38:24 <_gryf> as artifacts in the glance glare project rather
14:38:29 <bauzas> _gryf: do you have pointers for me? FPGA takes me back to my school days
14:38:58 <_gryf> so that we don't have to spoil the image catalog in glance
14:39:37 <_gryf> bauzas, I can prepare some ml thread with links to the stuff, if you're interested :)
14:39:37 <johnthetubaguy> _gryf: we already have many types of image in there, in some deployments; we probably shouldn't overthink that
14:39:45 <bauzas> _gryf: yeah :)
14:39:48 <bauzas> ++
14:40:02 <johnthetubaguy> yeah, ML thread sounds good
14:40:04 <_gryf> johnthetubaguy, okay, that was just a thought :)
14:40:20 <bauzas> _gryf: I can find some OpenPower docs, is that related?
14:40:44 <johnthetubaguy> I guess the simple case: deploy image X onto FPGA, and use it?
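[Editor's note: a sketch of what bauzas gestures at above — extending the quantitative resource model so an FPGA region can be claimed alongside CPU and memory. The resource class name CUSTOM_FPGA_REGION and the dict shapes are illustrative assumptions, not the resource-providers API.]

```python
# Hypothetical host inventory tracking a custom FPGA resource class
# next to the familiar quantitative ones; illustration only.
inventory = {
    "VCPU": {"total": 32, "used": 0},
    "MEMORY_MB": {"total": 65536, "used": 0},
    "CUSTOM_FPGA_REGION": {"total": 4, "used": 0},
}

def can_fit(inventory, request):
    """True if every requested resource class has enough free capacity."""
    return all(inventory[rc]["total"] - inventory[rc]["used"] >= amount
               for rc, amount in request.items())

def claim(inventory, request):
    """Atomically consume the requested amounts, or refuse entirely."""
    if not can_fit(inventory, request):
        raise ValueError("insufficient capacity")
    for rc, amount in request.items():
        inventory[rc]["used"] += amount

# A guest taking one of the four FPGA regions along with CPU and RAM.
claim(inventory, {"VCPU": 4, "MEMORY_MB": 8192, "CUSTOM_FPGA_REGION": 1})
```

The quantitative part of FPGA fits this model cleanly; what the model does not capture is the qualitative side discussed above (provisioning a bitstream "image" before the region is usable, and cleanup after deletion).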
14:40:45 <cdent> #action _gryf to make a mailing list posting about the issues with FPGA, scheduling, images
14:40:46 <_gryf> bauzas, it might be - I've already bumped into some white papers
14:41:09 <bauzas> _gryf: okay, will continue digging
14:41:11 <johnthetubaguy> anyways, listing out the possible use cases seems good
14:41:38 <_gryf> ok, cool. I'll prepare something for tomorrow :)
14:41:45 <cdent> thanks _gryf
14:41:56 <cdent> anybody or anything else?
14:41:59 <ruby_> bauzas: thinrichs may not join (297898), I'll rep. Could we do this spec?
14:42:03 <bauzas> _gryf: now, my point is, apart from the usecases, are you already thinking of the implementation using the resource-classes, I guess?
14:42:06 <johnthetubaguy> _gryf: I would worry more about the usage patterns that the solutions, to start with
14:42:21 <_gryf> bauzas, that's right
14:42:22 <johnthetubaguy> s/that the/rather than the/
14:42:42 <bauzas> _gryf: okay, I think it would be nice to see some proposal in your ML thread :)
14:42:57 <_gryf> ok.
14:43:43 <cdent> anybody or anything else?
14:43:45 <_gryf> johnthetubaguy, I'll place the use cases as well.
14:44:20 <cdent> because if not we can give 15 minutes back to reviewing all that stuff that mriedem is cracking the whip about
14:45:03 <cdent> once
14:45:27 <cdent> twice
14:45:29 <mriedem> wait
14:45:39 <mriedem> ruby_ was asking to go over that policy driven scheduler spec twice now
14:46:02 <ruby_> yes please
14:46:08 <mriedem> #link policy-based scheduler spec https://review.openstack.org/#/c/297898/
14:46:15 <cdent> cool
14:46:46 <mriedem> ruby_: go ahead
14:46:49 <cdent> gist seems to be that nobody has read it yet; other than reading it, what is important to highlight?
14:46:59 <jaypipes> I read parts of it...
14:47:05 <ruby_> and?
14:47:48 <jaypipes> looks like the authors haven't been following along with what has been happening in the nova scheduler for the last few cycles and have just been trying to respin the solver scheduler into a new incarnation.
14:47:59 <ruby_> no, it is not the solver scheduler
14:48:27 <jaypipes> ruby_: are you familiar with what has happened in the nova scheduler and resource tracker since the Icehouse release?
14:48:40 <ruby_> Somewhat
14:48:51 <ruby_> We studied the nova scheduler.
14:48:53 <johnthetubaguy> ruby_: did you see my suggested simplification of the filters and weights idea, that makes the configuration easier to understand, with the idea being all the filters you need come in the box
14:49:28 <ruby_> We read the filters+weights idea as well. We think policies may be more generic
14:50:07 <ruby_> Ease writing new "filters" (aka constraints for placement decisions)
14:50:37 <jaypipes> I am really not a fan of adding a YAML syntax to the scheduler.
14:51:12 <ruby_> YAML was to configure policies in the system, which is what the admin is expected to do.
14:51:24 <cdent> I was under the impression, as well, that custom and/or new filters was not something we wanted to be in the business of encouraging?
14:51:31 <ruby_> Do you have another proposal (if not YAML)?
14:51:32 <cdent> Is that true or false?
14:51:46 <jaypipes> ruby_: yes, my proposal is not to reinvent the scheduler using Congress.
14:51:50 <johnthetubaguy> so we don't expect users to write code today; generally we expect what comes in the box to be expressive enough for 80% of users. I am curious what scenarios you see as not possible today, given the current plans
14:52:07 <jaypipes> johnthetubaguy: NFV. of course.
14:52:10 <ruby_> jaypipes: don't understand
14:52:11 <alaski> cdent: the scheduler should be a somewhat open framework, within some constraints
14:52:38 <jaypipes> ruby_: the work items:
14:52:38 <jaypipes> 1. Extract Datalog engine from OpenStack Congress into library
14:52:38 <jaypipes> 2. Provide YAML front-end to the Datalog engine, and incorporate into library
14:52:38 <jaypipes> 3. Ingest Nova database data into policy engine
14:52:45 <alaski> cdent: so custom filters/weighers are not considered bad
14:52:52 <ruby_> johnthetubaguy: even some of the examples you cited in filters+weights are not offered as filters today
14:53:59 <ruby_> jaypipes: the idea was to pull in the policy "core" engine. We are open to suggestions.
14:54:26 <jaypipes> ruby_: basically, when I see proposed specs like this (and the solver scheduler spec had virtually the exact same problem) that list "extensibility" and "monitoring" as use cases for something, I just think that the spec is not specific enough about the problem it is trying to solve and does not clearly explain why the proposal would address some problem.
14:54:50 <jaypipes> ruby_: "extensibility" and "monitoring" are not use cases.
14:55:01 <cdent> five minute warning
14:55:19 <ruby_> e.g. would we want weights to be configurable or context aware? when selecting from the filtered list?
14:55:56 <johnthetubaguy> I really think we need a specific set of use cases, so we can compare the approaches
14:56:16 <bauzas> agreed
14:56:22 <jaypipes> ruby_: "Understandability" is also not a use case.
14:56:24 <ruby_> ok, we will write up some use cases (policies) and give our reasoning.
14:56:30 <bauzas> what I actually wonder is how we can integrate that with the existing
14:56:36 <jaypipes> ruby_: and frankly, Datalog plus YAML does not understandable make.
14:56:48 <bauzas> and what is really needed for achieving policy-based scheduling
14:57:22 <ruby_> Datalog is the form the policies will be stored in. But admins do not need to use datalog
14:57:23 <jaypipes> ruby_: sorry to be harsh on this, but I've seen this academic approach to the scheduler proposed a number of times and I've yet to see any concrete use cases that it fulfills.
14:57:46 <ruby_> would a proof of concept implem help?
14:58:01 <ruby_> bauzas: the idea was to maintain the same scheduler API
14:58:10 <jaypipes> ruby_: it would be great to see something that definitively shows that the existing scheduler makes the wrong decisions for scenario X and this proposed scheduler makes the correct decision.
14:58:25 <cdent> two minutes
14:58:29 <bauzas> ruby_: so, your proposal is to have a separate scheduler driver that would use Congress as a backend?
14:58:35 <ruby_> ok: yes
14:59:07 <ruby_> is adjusting weights (ram/iops) etc. based on context a useful case for you?
14:59:13 <bauzas> ruby_: what I actually think - before talking of the implementation - is what kind of feature you would like to have
14:59:27 <johnthetubaguy> I think jaypipes covered it well; it's easier to review this when there are specific use cases where it demonstrates why the suggested approach beats the existing suggested approach
14:59:34 <ruby_> ok, we will improve the writing on this.
14:59:43 <ruby_> we will add some examples to the spec.
14:59:56 <jaypipes> ruby_: that is really vague (about the context in a weight). you will need to be much more specific.
14:59:57 <johnthetubaguy> a slight nit: Nova doesn't have a solver scheduler in tree, that is out of tree, AFAIK
15:00:07 <ruby_> thanks for all the input.
15:00:07 <cdent> times up
15:00:10 <cdent> #endmeeting
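[Editor's note: a minimal sketch of the "context-aware weights" question ruby_ raises near the end (adjusting ram/iops weights based on request context). jaypipes' point that this needs specifics stands; the multipliers, host attributes, and workload labels below are invented purely to make the idea concrete.]

```python
# Hypothetical context-aware weigher; names and values are assumptions.

def weigh(host, multipliers):
    """Combined score: weighted sum of free RAM and IOPS capacity."""
    return (multipliers["ram"] * host["free_ram_mb"]
            + multipliers["io"] * host["iops_capacity"])

def multipliers_for(context):
    """Pick weight multipliers from the request context rather than
    from fixed configuration: an I/O-intensive workload weighs IOPS
    capacity more heavily than free RAM."""
    if context.get("workload") == "io_intensive":
        return {"ram": 0.1, "io": 1.0}
    return {"ram": 1.0, "io": 0.1}

hosts = [
    {"name": "ram_heavy", "free_ram_mb": 8192, "iops_capacity": 100},
    {"name": "io_heavy", "free_ram_mb": 1024, "iops_capacity": 5000},
]

def best(hosts, context):
    m = multipliers_for(context)
    return max(hosts, key=lambda h: weigh(h, m))["name"]
```

The same filtered host list yields a different winner depending on context, which is the behavior ruby_ describes; whether that belongs in a policy engine or in the existing weigher classes is exactly the open question from the discussion above.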