14:00:06 <edleafe> #startmeeting nova_scheduler
Meeting started Mon Jun 27 14:00:06 2016 UTC
14:00:11 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:17 <edleafe> Anyone here for fun and games?
14:01:17 <edleafe> Let's wait another minute for the people running late...
14:02:29 <bauzas> \....p
14:03:14 <edleafe> OK, let's start
14:03:18 <bauzas> that's just meaning "long arm but tired" :)
14:03:24 <edleafe> #topic Specs and reviews
14:03:33 <edleafe> Nothing was added to the agenda
14:03:52 <edleafe> But that doesn't mean you can't bring it up now
14:04:03 <edleafe> So... anyone have something to discuss?
14:04:38 * edleafe hears nothing but crickets
14:04:54 <edleafe> OK, moving on
14:05:04 <edleafe> #topic Mid-cycle discussion topics
14:05:13 <jaypipes> edleafe: I will have the new resource-providers-allocations spec pushed within an hour before I leave for the airport. It contains the strategy for populating the allocations table in the API database instead of migrating data from the existing cell DB locations.
14:05:34 <edleafe> jaypipes: thanks
14:05:44 <edleafe> jaypipes: will review when it's pushed
14:05:48 <jaypipes> danke
14:05:55 <bauzas> jaypipes: not sure I got the change but okay :)
14:06:08 <jaypipes> bauzas: sorry, what do you mean?
14:06:23 <bauzas> jaypipes: I don't remember what was about to change :)
14:06:30 <bauzas> but I'll review the spec
14:06:31 <edleafe> #undo
14:06:31 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x7f2b35ec8b90>
14:07:03 <edleafe> #topic Mid-cycle discussion topics
14:07:14 <edleafe> First up: Accepting multiple hosts for live migration
14:07:25 <edleafe> #link https://review.openstack.org/276840
14:07:36 <edleafe> I'd like to get consensus on this approach, so I created a POC patch:
14:07:41 <jaypipes> bauzas: ah, sorry. so originally we planned to *migrate* data from the instance_extra table to the new allocations table. Now we're not going to migrate anything. We're going to populate the allocations table records in the API database with *additional* calls in the resource tracker to ResourceProvider.add_allocations() and ResourceProvider.delete_allocations(). Basically, for a short
14:07:46 <edleafe> #link https://review.openstack.org/#/c/327809/
14:07:47 <jaypipes> while, the resource tracker will write inventory and usage data to *both* the cell DB *and* the API DB.
14:08:38 <bauzas> jaypipes: oh ok, sure, drop me a ping when the spec change is up and I'll pick it shortly
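The dual-write approach jaypipes describes above could be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeStore` and `DualWriteTracker` are hypothetical names, the in-memory dicts stand in for the cell DB and the API DB, and nothing here is the actual resource tracker or ResourceProvider code.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for a database, keyed by instance UUID."""

    def __init__(self):
        self.rows = {}


class DualWriteTracker:
    """Hypothetical tracker that writes usage to both stores during a transition."""

    def __init__(self, cell_db, api_db):
        self.cell_db = cell_db
        self.api_db = api_db

    def claim(self, instance_uuid, resources):
        # Existing write path: usage recorded in the cell DB.
        self.cell_db.rows[instance_uuid] = dict(resources)
        # Additional write: the same usage recorded as an allocation in the
        # API DB (the role the log assigns to add_allocations()).
        self.api_db.rows[instance_uuid] = dict(resources)

    def release(self, instance_uuid):
        # Remove from both stores (the role the log assigns to
        # delete_allocations() on the API DB side).
        self.cell_db.rows.pop(instance_uuid, None)
        self.api_db.rows.pop(instance_uuid, None)


cell_db, api_db = FakeStore(), FakeStore()
tracker = DualWriteTracker(cell_db, api_db)
tracker.claim("inst-1", {"VCPU": 2, "MEMORY_MB": 2048})
print(cell_db.rows == api_db.rows)
```

Because nothing is migrated, the two stores only agree for instances claimed after the additional write is in place; the sketch does not model backfilling older records.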
14:09:03 <edleafe> OK, glad that's settled
14:09:08 <edleafe> Back to the topic
14:09:24 <edleafe> Has anyone looked over the multiple host spec and/or POC code?
14:10:18 * edleafe takes that as a 'No'
14:11:00 <bauzas> edleafe: well, I reviewed the spec a bit
14:11:34 <bauzas> edleafe: I'm a bit torn by the UX (i.e. using a long list of comma-separated hosts), but it seemed OK to me
14:11:53 <edleafe> bauzas: that was an implementation detail
14:11:58 <edleafe> for the CLI
14:12:03 <bauzas> edleafe: yup, I know
14:12:05 <edleafe> not really part of the spec
14:12:36 <edleafe> The CLI can be done in whatever way is desired to pass a JSON list to the API
14:12:44 <bauzas> edleafe: but since it's not really trivial on how you figure out which hosts to pass to the scheduler, I thought it was a bit important, also given it was a REST API modification
14:13:19 <bauzas> edleafe: well, my point was not about the CLI, rather about the JSON list for the request body argument
14:13:44 <edleafe> bauzas: so what alternative format would be better?
14:13:57 <bauzas> edleafe: I was originally thinking of aggregates
14:14:24 <edleafe> bauzas: don't you think that will be messy?
14:15:02 <bauzas> you can have multiple aggregates per host, and I was thinking that it was something easy for operators creating an aggregate and setting the list of hosts in there
14:15:08 <edleafe> IOW, you have to a) define the hosts you want b) create an agg c) pass the agg to the request d) monitor the success of the request, then e) delete the agg
14:15:25 <bauzas> so that the UX was then "live-migrate inst1 <my_agg>"
14:15:43 <edleafe> bauzas: you do realize that these aggs would be ephemeral?
14:15:56 <edleafe> IOW, they would exist for a single request?
14:15:57 <bauzas> yup, and I'm fine with that
14:16:25 <Yingxin> or both is ok? to an aggregate or a list of hosts?
14:16:34 <bauzas> I'm fine too
14:16:36 <bauzas> anyway
14:16:42 <bauzas> I don't want to nitpick
14:16:55 <edleafe> Yingxin: it's passing a list that bauzas doesn't like
14:17:09 <edleafe> Yingxin: so you have to accept a list in the first place
14:17:09 <bauzas> just explaining that I'm pretty sure that in 1 or 2 cycles, some people will complain about not being able to use aggregates for that :)
14:17:37 <bauzas> Yingxin: the problem is that we're pretty well formalized on the API side
14:17:49 <edleafe> bauzas: so you want to see both?
14:17:56 <johnthetubaguy> bauzas: I get your point, you want a "scheduler" hint to say, pick any hosts in this aggregate, and that's where you put all your updated hosts?
14:18:04 <bauzas> edleafe: I dunno, I'm not violently opposed to a list of hosts, TBC
14:18:21 <bauzas> edleafe: I'm just saying that I'm betting we'll see a feature request about that in the next cycles
14:18:28 <johnthetubaguy> I think there are two separate use cases here
14:18:55 <johnthetubaguy> my main objection was having a structure inside a string field, if we dropped that, I am less opposed to the idea
14:19:25 <edleafe> johnthetubaguy: the spec now proposes a list, not a string
14:19:58 <johnthetubaguy> coolness
14:20:00 <bauzas> johnthetubaguy: well, my concern was not exactly about the length of the list, just about the fact that sometimes operators don't like explicit lists of things, and prefer to use implicit concepts
14:20:18 <bauzas> but I agree, that's a second part of the problem
14:20:25 <johnthetubaguy> bauzas: agreed with you, I think there are two use cases
14:20:29 <edleafe> bauzas: don't forget that humans are only one use case
14:20:44 <edleafe> humans usually know where they want it to go
14:20:46 <bauzas> edleafe: I know, and I know why you're proposing that spec
14:20:54 <edleafe> whereas programs don't have that insight
14:21:26 <edleafe> OK, I don't think we need to settle this here
14:21:42 <edleafe> This will be a good topic for the midcycle
14:21:53 <bauzas> anyway, I've spent about 5 mins discussing an extra feature request that nobody is asking for yet, so I suggest we go for a list of hosts, and ask those guys - if they exist - to write another story for using other concepts if they don't like the list of hosts
14:21:58 <edleafe> I just want to be sure that everyone is familiar with the issues
14:22:28 <edleafe> bauzas: ok, maybe make a note of that on the spec
14:22:29 <bauzas> edleafe: either way, I'll look at the spec and vote, given the above ^
14:22:54 <edleafe> bauzas: thanks
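The distinction johnthetubaguy and edleafe draw above, a real JSON list in the request body rather than a structure packed into a string field, can be illustrated with a small sketch. The `live_migrate` and `hosts` keys are hypothetical placeholders chosen for this example; they are not taken from the spec or from the Nova API.

```python
import json


def build_body(hosts):
    """Serialize a request body; the key names are hypothetical placeholders."""
    return json.dumps({"live_migrate": {"hosts": hosts}})


def parse_hosts(body):
    """Return the candidate hosts, rejecting anything that is not a list of strings."""
    hosts = json.loads(body)["live_migrate"]["hosts"]
    if not isinstance(hosts, list) or not all(isinstance(h, str) for h in hosts):
        raise ValueError("hosts must be a JSON list of host names")
    return hosts


# A JSON list round-trips without any string splitting on the server side.
print(parse_hosts(build_body(["compute1", "compute2"])))

# A comma-separated string is a structure hidden inside a string field,
# so this parser refuses it instead of guessing at a delimiter.
try:
    parse_hosts(build_body("compute1,compute2"))
except ValueError as exc:
    print("rejected:", exc)
```

With a typed list the CLI remains free to accept whatever input syntax it likes, as long as it sends a list to the API.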
14:23:10 <edleafe> Note: jaypipes just posted his updated spec
14:23:13 <edleafe> #link https://review.openstack.org/300177
14:23:30 <jaypipes> danke edleafe
14:23:31 <edleafe> Please review that, too.
14:23:50 <edleafe> Next up: Separating qualitative requirements from quantitative in a request
14:24:10 <edleafe> #link https://review.openstack.org/313784
14:24:33 <edleafe> This has gotten some positive initial review, although nothing much on the implementation details
14:25:10 <bauzas> edleafe: I remember a long list of specs about that, is the above a consensus of what jaypipes proposed at the summit session ?
14:25:13 <edleafe> Since this is a much more disruptive change, I want to be sure that people have given it some thought before the midcycle
14:25:24 <edleafe> bauzas: not exactly
14:25:28 <bauzas> s/proposed/exposed rather
14:25:30 <edleafe> bauzas: it's based on that
14:25:56 <edleafe> bauzas: but probably more of a long-term view than what we discussed
14:26:01 <bauzas> okay, because I recently reviewed some change that I felt was about adding a new capability to a host, and I'd love to give some pointers
14:26:16 <bauzas> about which things we're currently looking for
14:26:39 <bauzas> jaypipes: you pretty aware of https://review.openstack.org/#/c/313784/ ?
14:26:41 <edleafe> bauzas: I'd also prefer at all costs to not fall into the "specify/hardcode every feature that a host could offer" trap
14:26:58 <bauzas> edleafe: k
14:27:09 <jaypipes> bauzas: reviewing that spec now actually...
14:27:11 <edleafe> Enumerating some for x-provider is one thing
14:27:34 <edleafe> but requiring a code change to add a new capability is quite another
14:28:05 <edleafe> In any case, please review so that we can discuss at the midcycle
14:28:30 <edleafe> Any other topics that anyone thinks would be good midcycle discussion material?
14:28:46 <jaypipes> edleafe: nested resource providers.
14:28:55 <edleafe> link?
14:28:56 <jaypipes> edleafe: https://etherpad.openstack.org/p/nested-resource-providers
14:29:04 <jaypipes> edleafe: just my thinking to date...
14:29:08 <edleafe> #link https://etherpad.openstack.org/p/nested-resource-providers
14:29:31 <edleafe> wow! jaypipes really likes pink!
14:29:47 <jaypipes> edleafe: heh
14:29:48 <edleafe> :)
14:29:52 <Yingxin> is it worthwhile to go on the work of host-state-level-locking?
14:30:18 <edleafe> Yingxin: what do you want to discuss about it?
14:30:29 <Yingxin> it still needs a patch to add claim logic to scheduler
14:30:42 <Yingxin> and a patch to refactor claim logic
14:31:15 <Yingxin> https://review.openstack.org/#/c/262938/
14:31:15 <edleafe> Yingxin: ok - is there any issue about it that isn't clear?
14:31:41 <Yingxin> the issue is that no one has reviewed it..
14:31:54 <edleafe> Yingxin: :)
14:32:12 <bauzas> Yingxin: are those up to date ?
14:32:23 <bauzas> Yingxin: I remember looking at them in the past and they were having a merge conflict
14:32:30 <Yingxin> yes, that patch is up to date
14:32:37 <Yingxin> but the next one is not
14:33:07 <bauzas> Yingxin: okay, IIRC, that's all about defining a new scheduler claim object, so that's a pretty huge change
14:33:17 <edleafe> #action All to review https://review.openstack.org/#/c/262938/
14:33:25 <Yingxin> bauzas: right
14:33:41 <bauzas> Yingxin: but sure, it's in my list, ping me during that week to make sure I'm not forgetting it
14:33:42 <Yingxin> bauzas: but the scheduler seems to work very well without claims
14:33:57 <Yingxin> bauzas: thanks
14:34:01 <bauzas> Yingxin: yup, I did read your (long) email
14:34:09 <Yingxin> edleafe: thanks too
14:34:22 <bauzas> Yingxin: but I have points and questions, and that requires time to ask for :)
14:34:43 <Yingxin> bauzas: np
14:35:13 <edleafe> There will be one more opportunity to discuss this (July 11) before the midcycle, so if anyone has questions, leave them on the patch and we can address them then
14:35:25 <bauzas> Yingxin: basically, your test scenario is stressing the same environment of 1000 computes with 50 concurrent requests, I'm unclear whether those 1000 computes are leaving enough free space for those 50 requests
14:35:41 <bauzas> in what you call 'preloaded'
14:36:09 <Yingxin> preloaded only allows exactly 49 new VMs to boot
14:36:18 <Yingxin> in the 1000-node environment
14:36:43 <Yingxin> so if there are 50 requests, there should be one and only one failure.
14:36:46 <bauzas> Yingxin: okay, so that means those 1000 hosts are fully stacked, with only room for 49 instances to be created ?
14:36:50 <bauzas> okaaaaay
14:36:55 <bauzas> that wasn't clear
14:37:05 <Yingxin> bauzas: right, sorry about that
14:37:18 <bauzas> okay, I'll reply to that one with interest
14:37:40 <Yingxin> bauzas: :)
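The "preloaded" scenario Yingxin clarifies above (room for exactly 49 new VMs, 50 concurrent requests, exactly one expected failure) can be modelled with a toy example. This is a minimal sketch, assuming a single lock-protected counter stands in for all free capacity; `SlotPool` and `run_requests` are hypothetical names and have nothing to do with the real scheduler, the claim patches, or the nova-scheduler-bench tool.

```python
import threading


class SlotPool:
    """Hypothetical pool of free VM slots with an atomic claim operation."""

    def __init__(self, free_slots):
        self._free = free_slots
        self._lock = threading.Lock()

    def try_claim(self):
        # Check and decrement under one lock, so two requests can never
        # both take the last free slot.
        with self._lock:
            if self._free > 0:
                self._free -= 1
                return True
            return False


def run_requests(pool, n_requests):
    """Fire n_requests concurrent claims and return (succeeded, failed)."""
    results = []
    results_lock = threading.Lock()

    def worker():
        ok = pool.try_claim()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.count(True), results.count(False)


succeeded, failed = run_requests(SlotPool(49), 50)
print(succeeded, failed)  # prints "49 1"
```

Without the lock around the check and decrement, more than 49 requests could appear to succeed; that over-commit is the kind of race a claim step is meant to prevent.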
14:37:42 <edleafe> Anything else for midcycle topics?
14:38:05 <bauzas> edleafe: I think Yingxin's thread is worth of interest for reading *before* the midcycle
14:38:44 <bauzas> #link http://lists.openstack.org/pipermail/openstack-dev/2016-June/098202.html
14:39:02 <edleafe> bauzas: Agreed
14:39:12 <edleafe> that's what I hope to get from this meeting:
14:39:33 <edleafe> topics to brush up on so that we can make progress on them at the midcycle
14:39:55 <Yingxin> ^ that's a performance profiling of three types of schedulers
14:39:57 <edleafe> bauzas: Thanks for the link to the list
14:40:22 <diga> edleafe: if I am using devstack, how can I simulate a 1000-compute-node environment to test this stuff?
14:40:49 <Yingxin> diga: run them with fake virt drivers
14:41:13 <diga> Yingxin: okay
14:41:25 <edleafe> Yingxin: do you have a reference for how to do that? Link?
14:41:57 <Yingxin> I have a tool actually
14:41:58 <bauzas> edleafe: it's in the dev docs :)
14:41:58 <Yingxin> https://github.com/cyx1231st/nova-scheduler-bench
14:42:16 <edleafe> bauzas: that's the easy answer :)
14:42:21 <bauzas> edleafe: http://docs.openstack.org/developer/nova/development.environment.html#using-fake-computes-for-tests
14:42:27 <edleafe> bauzas: thanks!
14:42:41 <diga> Yingxin: got it
14:42:44 <edleafe> diga: that should get you started
14:42:55 <diga> edleafe: Sure.
14:43:00 <edleafe> Anyway, we still have a lot to discuss today
14:43:09 <edleafe> #topic Opens
14:43:13 <edleafe> Other services directly accessing the Scheduler and/or request database
14:43:24 <edleafe> Watcher wants to directly access the scheduler
14:43:36 <edleafe> #link https://review.openstack.org/#/c/329873
14:43:47 <edleafe> I have been advising against it
14:43:58 <diga> Yingxin: thanks :)
14:44:06 <edleafe> I'd like some of the Watcher people here today to explain what they want to do
14:44:16 <bauzas> yeah, we should be having a placement API sooner or later :)
14:44:20 <edleafe> And for the Nova people to add their advice
14:44:51 <edleafe> jed56: acabot: The floor is yours :)
14:44:59 <jed56> Does everybody know the Watcher project? :)
14:45:29 <raj_singh> jed56: Link to project doc will help
14:45:29 <diga> jed56: can you share the link ?
14:45:41 <jed56> #link https://github.com/openstack/watcher
14:45:49 <acabot> #link https://wiki.openstack.org/wiki/Watcher
14:46:11 <jed56> #link http://docs.openstack.org/developer/watcher/
14:46:37 <acabot> Watcher is an infrastructure optimization service
14:47:23 <acabot> we have built a framework that allows ops to build optimization strategies
14:47:58 <acabot> we don't touch the initial placement
14:48:25 <acabot> the idea is to collect metrics and define a better placement for VMs in the long run
14:48:32 <acabot> is that clear ?
14:49:11 <edleafe> The part that concerns us is that to optimize the DC, Watcher needs to make live migration calls
14:49:47 <edleafe> Since the scheduler now checks the destination by default, the selected host may fail
14:49:52 <bauzas> that, I know already :)
14:49:53 <acabot> edleafe : yes, we need to move VMs and so we need to know what are the constraints between them
14:50:27 <edleafe> The multi-host change to live mig is one approach to minimizing failures and repeated API calls
14:50:40 <acabot> this spec is the first step to open discussion with the Nova team on how to do it in a clever way
14:51:09 <edleafe> The link above is a spec that basically runs select_destinations inside of Watcher to verify that the host is acceptable before calling live mig
14:51:32 <acabot> edleafe : this solution is clearly not acceptable today
14:51:45 <edleafe> acabot: understood
14:55:20 <acabot> edleafe : and we would like to find the best path to do it with the current scheduler implem
14:55:20 <edleafe> acabot: One thing I'd like to get a better idea on is the timeframe for a fully public RESTful placement API
14:55:21 <edleafe> and whether that should be relied upon now
14:55:21 <edleafe> is jaypipes still here?
14:55:21 <bauzas> well, the real problem is that a cross-project dependency usually needs a client package
14:55:21 <bauzas> for providing a stable contract that the consumer can use
14:55:21 <acabot> edleafe : yes, I think we need to find different solutions over a longer timeframe (what can we do in 6 months / 1 year / 2 years)
14:55:21 <jed56> bauzas: +1
14:55:23 <edleafe> bauzas: yes, exactly
14:55:23 <edleafe> it needs to be a public contract
14:55:23 <bauzas> using RPC calls is by definition something we don't guarantee as defined by http://docs.openstack.org/developer/nova/policies.html#public-contractual-apis
14:55:23 <jed56> bauzas: this is something we also agree on regarding 'internal rpc'
14:55:23 <bauzas> so, the problem is then "where the scheduler provides those stable APIs we can consume"
14:55:23 <acabot> the main idea today is to be able to get your feedback on this idea and then improve our spec
14:55:23 <bauzas> and the answer is "we don't have those yet"
14:55:36 <jed56> bauzas: yes : )
14:55:37 <acabot> bauzas : :-)
14:55:41 <bauzas> but
14:55:53 <bauzas> we notify on any instance action
14:56:05 <jed56> bauzas: did you have the chance to read my spec ?
14:56:08 <bauzas> and those notification messages are *now* (yippee) versioned
14:56:17 <bauzas> jed56: nope sorry :(
14:56:49 <jed56> The main problem that we have is that watcher is blind
14:57:39 <edleafe> About 3 minutes left
14:57:46 <edleafe> So let's move on
14:57:52 <edleafe> Please comment on the spec!!
14:57:56 <acabot> thx
14:57:56 <bauzas> jed56: okay, raise your spec here, so we can discuss that offline
14:59:11 <edleafe> Last open topic: FPGA as a resource
14:59:12 <bauzas> oh this is
14:59:21 <jed56> #link https://review.openstack.org/#/c/329873
14:59:28 <edleafe> #link https://review.openstack.org/#/c/323130/ - IBM attempt
14:59:28 <edleafe> #link https://review.openstack.org/318047 - High level use cases
14:59:28 <edleafe> #link https://review.openstack.org/#/c/312696 - Current approach for dynamic resources proposal
14:59:28 <_gryf> right
14:59:29 <edleafe> We obviously can't get through them all in 1 minute
14:59:29 <_gryf> only 1 min left
14:59:30 <edleafe> But to be clear: we agreed that programming FPGA is out of scope for nova
14:59:30 <_gryf> so at least please read the specs
14:59:31 <_gryf> edleafe, for newton cycle
14:59:36 <edleafe> _gryf: No, it was stronger than that
15:00:09 <edleafe> Nova will not be an FPGA utility
15:00:21 <_gryf> edleafe, I mean for the newton cycle we agree, all the hassle will be handled by the admin
15:00:48 <edleafe> yes
15:00:48 <_gryf> and i agree
15:01:12 <_gryf> there would be no programming done by nova
15:01:12 <edleafe> in any case, we're at time. Thanks everyone! Continue in -nova
15:01:12 <edleafe> #endmeeting