#openstack-meeting-alt log

14:00:19 <edleafe> #startmeeting nova_scheduler
14:00:20 <openstack> Meeting started Mon Apr  3 14:00:19 2017 UTC and is due to finish in 60 minutes.  The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:23 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:27 <cdent> o/
14:00:32 <edleafe> Good UGT morning!
14:00:35 <macsz> \o
14:00:41 <jimbaker> o/
14:01:27 <jaypipes> o/
14:01:31 <thomasem> o/
14:01:31 * macsz ashamed that every single time has to check what UGT means
14:01:49 <bauzas> \o
14:01:56 <edleafe> macsz: :)
14:02:21 * johnthetubaguy lurks
14:02:23 <bauzas> 4pm is like an excellent morning for some real people
14:02:23 <macsz> and it isn't general union of workers in Portugal!
14:02:37 <jimbaker> "Now, instead of spending time figuring out what time of day is it for every member of the channel, we spend time explaining newcomers benefits of UGT." is the relevant quote...
14:02:52 <bauzas> mostly jetsettes
14:02:56 <bauzas> jetsetters even
14:03:07 <edleafe> jimbaker: 'zactly
14:03:31 <edleafe> Anyways, let's get started
14:03:37 <edleafe> #topic Specs and Reviews
14:03:52 <edleafe> First up:
14:03:54 <edleafe> #link Traits series https://review.openstack.org/#/c/376201/
14:04:14 <edleafe> Seems to be some gate issues, but it looks like that's pretty much done
14:05:09 <jaypipes> ya
14:05:12 <edleafe> Along the same lines...
14:05:13 <edleafe> #link os-traits reorg https://review.openstack.org/#/c/448282/
14:06:18 <edleafe> Two approaches for automating/simplifying the import of the directories have been proposed
14:06:21 <edleafe> #link pony https://github.com/cdent/pony
14:06:23 <edleafe> #link autoimport https://github.com/EdLeafe/autoimport
14:07:06 <edleafe> jaypipes: have you had a chance to look over those?
14:07:30 <jaypipes> edleafe: only yours. which I supported and said "go for it."
14:07:37 <jaypipes> edleafe: looking at the pony now..
14:08:02 <jaypipes> oh wait, yeah, I did look at the pony.
14:08:06 <cdent> I think a hybrid of autoimport and pony is probably "right" (not that any of it is wrong)
14:08:20 <jaypipes> I didn't like it having the 'NAMES' attribute in each module.
14:08:35 <cdent> what's there already is basically pony, but with c-macros style thing
14:08:55 <cdent> I think the repeat of register is _really_ noisy, but I think that's a question of preference/taste/whatever and not really a technical concern
14:09:08 <cdent> adding the autoimport would be nice
14:09:18 * jroll strolls in late, looks around, whispers good morning
14:09:25 <jaypipes> mornin
14:09:42 <edleafe> jaypipes: So do you want me to take a whack and push an update that incorporates these?
14:09:53 <bauzas> mmm
14:09:55 * edleafe tips his hat to jroll
14:09:59 <jaypipes> edleafe: yes, on top of my existing reorg patch.
14:10:12 <jaypipes> edleafe: just the autoimport pkgutils foo.
14:10:24 <bauzas> why should we import some minor package, for getting an autoimport ? :)
14:10:26 <jaypipes> edleafe: not the "put everything in a NAMES attribute in each module": thing
14:10:44 <bauzas> tbh, I'm not liking adding some minor packages in the requirements repo
14:10:55 <cdent> edleafe: make sure you get rid of that register noise ;)
14:10:55 <jaypipes> bauzas: importlib you mean?
14:10:57 <bauzas> at least in case the only packager stops providing new versions....
14:11:52 <edleafe> bauzas: pkgutil is part of the standard library
14:11:56 <jaypipes> bauzas: pkgutil is a stdlib: https://docs.python.org/2/library/pkgutil.html
14:12:06 <jaypipes> bauzas: and importlib is an oslo library.
14:12:13 <jaypipes> I thought?
14:12:15 <bauzas> oh sorry, I misunderstood
14:12:29 <edleafe> jaypipes: importlib is stdlib too
14:12:30 <bauzas> those 2 above repos are not a new package, right?
14:12:31 <cdent> importlib is stdlib
14:12:36 <bauzas> it's just a discussion repo ?
14:12:39 <cdent> edleafe: you are still a faster typist
14:13:02 <edleafe> bauzas: those were POCs to show to jay what we had in mind
14:13:25 <bauzas> ah ok
14:13:53 <bauzas> fair enough then, I need to look at those
14:14:15 <edleafe> #action edleafe to push a patch on top of jaypipes's os-traits reorg patch to automate import of submodules
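[editor's note] The "autoimport pkgutil" idea discussed above can be sketched roughly as follows. This is only an illustration using the stdlib pkgutil and importlib modules mentioned in the meeting; it is not the code from either POC repo. The package name demo_traits, its layout, and the helpers import_submodules and build_demo_package are hypothetical and exist only to make the example runnable.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def import_submodules(package_name):
    """Import every submodule under a package; return them keyed by dotted name."""
    package = importlib.import_module(package_name)
    found = {}
    prefix = package.__name__ + "."
    # walk_packages recurses into subpackages found under package.__path__
    for _finder, name, _ispkg in pkgutil.walk_packages(package.__path__, prefix):
        found[name] = importlib.import_module(name)
    return found


def build_demo_package(root):
    """Write a tiny throwaway package tree (hypothetical layout) under root."""
    pkg = Path(root) / "demo_traits"
    (pkg / "hw").mkdir(parents=True)
    (pkg / "__init__.py").write_text("")
    (pkg / "hw" / "__init__.py").write_text("")
    (pkg / "hw" / "cpu.py").write_text("AVX = 'HW_CPU_AVX'\n")
    (pkg / "storage.py").write_text("SSD = 'STORAGE_SSD'\n")


tmp = tempfile.mkdtemp()
build_demo_package(tmp)
sys.path.insert(0, tmp)
importlib.invalidate_caches()  # make sure the freshly written files are seen

modules = import_submodules("demo_traits")
print(sorted(modules))
```

No new dependency is needed for this approach, since both modules ship with Python.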
14:14:38 <edleafe> Moving on...
14:14:38 <edleafe> #link Claims in placement https://review.openstack.org/#/c/437424/
14:14:41 <edleafe> #link Forum topic on placement claims http://forumtopics.openstack.org/cfp/details/63
14:15:14 <edleafe> mriedem wants to have a Forum session to lay out the path for claims in placement
14:15:33 <edleafe> So we can get all the groundwork done in Pike, and implement in Queens
14:15:54 <bauzas> yup, and I need to provide a new PS...
14:16:20 <edleafe> jaypipes: You had some concerns about having the scheduler make a second call to claim the resources for its selected host
14:16:29 <edleafe> Care to explain the issue?
14:18:37 <bauzas> edleafe: cdent: I think we have some kind of misunderstanding between us
14:18:56 <bauzas> edleafe: cdent: long-term, we could still have filters running *after* placement is called
14:19:15 <bauzas> that could be done in the conductor if we merge with the scheduler, but we could still have that
14:19:26 <bauzas> at least, it's a consensus got by PTG
14:19:54 <bauzas> and placement couldn't possibly have a feature-parity with all the filters
14:19:58 <edleafe> bauzas: the question is how placement could ever make a claim if it doesn't know the eventual chosen resource provider
14:20:12 <edleafe> jaypipes seemed to think it could
14:20:21 <bauzas> edleafe: that's why I think the scheduler (or conductor) needs to set which host it's accepting
14:20:35 <bauzas> edleafe: hence me providing using the placement API for that
14:20:50 <edleafe> My point was that *unless* placement is a complete replacement for scheduler filters/weighers, it cannot claim directly
14:21:02 <bauzas> like : we get a list of RP, the scheduler runs the filters, finds a destination, sets the destination by claiming it
14:21:09 <cdent> bauzas: I don't think edleafe and I are disagreeing with you, jaypipes is.
14:21:18 <cdent> ed and I are trying to find the middle ground between you two
14:21:23 <bauzas> edleafe: and *my* point is that placement can't be feature-parity with all the filters we have
14:21:35 <jaypipes> edleafe: the whole *point* of doing claims in the placement API is to have the placement choose the resource provider.
14:21:36 <edleafe> bauzas: yes, I agree with that
14:22:09 <edleafe> jaypipes: and I say that until the placement engine can do that, we need the additional step
14:22:09 <bauzas> jaypipes: if we run filters after a list of RPs, how placement could know which one is accepted ?
14:22:31 <edleafe> jaypipes: right now placement could return hundreds of RPs for a request
14:22:33 <jaypipes> edleafe: we already *have* the extra step. It's how the system currently works.
14:22:49 <edleafe> jaypipes: not exactly
14:22:56 <jaypipes> yes, exactly.
14:23:07 <bauzas> jaypipes: the extra step of claiming the resource is not done currently where I'm planning to add it
14:23:17 <edleafe> in this change, the scheduler selects a host, and would claim the resources before it returns that host to the conductor
14:23:25 <bauzas> yup
14:23:28 <cdent> jaypipes: are you up to date on the comments on the spec since you made your comment?
14:23:39 <edleafe> it doesn't eliminate the race, but it sure cuts it down a lot
14:24:10 <bauzas> the Placement API is here for good reasons
14:24:10 <jaypipes> bauzas: right, it's done on the compute node, but I'm meh on putting that step into the scheduler after running weighers and filters.
14:24:30 <jaypipes> cdent: yes, I'm up to date on them.
14:24:31 <bauzas> jaypipes: you admit it would fix a couple of races ?
14:24:52 <bauzas> jaypipes: I get you'd like to get the placement returning a token
14:25:03 <cdent> j✔
14:25:04 <bauzas> jaypipes: instead of a list of RPs
14:25:24 <bauzas> jaypipes: but there are reasons why we can't leave placement only providing a destination
14:25:26 <jaypipes> bauzas: I absolutely *don't* want placement returning a "token". I've said that numerous times.
14:25:34 <bauzas> jaypipes: s/token/claim
14:25:41 <jaypipes> yeah, I *don't* want that.
14:26:10 <jaypipes> bauzas: I see no reason to invent some new "claim" resource.
14:26:16 <johnthetubaguy> the sooner placement knows about which host Nova picks, the better right?
14:26:19 <jaypipes> bauzas: "claim" isn't a resource. it's an action.
14:26:38 <bauzas> jaypipes: I agree with your last sentence
14:26:44 <edleafe> jaypipes: totally agree on not creating a new Claim resource
14:26:49 <johnthetubaguy> (I am checking the discussion is about the how, not the why/what)
14:27:13 <bauzas> okay, I think I understand your concerns both of you
14:27:25 <bauzas> I added a claims REST resource, it should be a verb
14:27:30 <bauzas> lemme clarify that
14:27:49 <bauzas> what if the scheduler *claims* to placement that allocation ?
14:28:02 <jaypipes> bauzas: I don't know what you mean by that.
14:28:16 <johnthetubaguy> jaypipes: I am curious how you would like us to let placement know sooner what resource providers we picked? I can see a few possible ways, they all seem to interact with the resource tracker in nasty ways.
14:28:31 <bauzas> claiming is an action, you said
14:28:40 <bauzas> I'm translating that into a verb
14:28:54 <bauzas> the object model we have in placement is "allocations"
14:29:08 <jaypipes> bauzas: basically, what I'm saying is that I see no reason whatsoever why there should be *any* change in the placement REST APIs for what you're suggesting to do (the multi-step request providers, filter/sort, and choose provider).
14:29:11 <edleafe> jaypipes: do I understand your proposal in that spec correctly? Have placement claim the resources on *all* the RPs it returns, and then have scheduler call DELETE on all the allocations for RPs it *doesn't* select?
14:30:29 <edleafe> jaypipes: I think we pretty much agree that the existing REST API can handle claims
14:30:48 <bauzas> jaypipes: if the scheduler sets a pre-existing allocation to the host it picked, it would help placement with concurrent requests, right?
14:31:15 <jaypipes> edleafe: no. I was suggesting just have the placement select a provider (or set of providers like a compute host and a storage pool, etc) and actually allocate against those providers. The scheduler would check that the provider met any additional wacky filters that could not be met in the placement API (yet) and if the providers did not meet those filters, would DELETE the allocations from the placement API.
14:31:23 <bauzas> ie. say the scheduler picks A, it sets an allocation against A for that RC of X
14:31:28 <jaypipes> edleafe: that said, my approach wouldn't allow sorting/weighing.
14:31:38 <johnthetubaguy> jaypipes: ah, yeah, thats the problem bit
14:32:02 <jaypipes> edleafe: so I'd be fine just having the placement API return the providers as it does now, then have the scheduler run those providers through its filters and weighers and then call POST /allocations.
14:32:21 <bauzas> jaypipes: that's what I just proposed
14:32:35 <jaypipes> edleafe: what I don't want is a new /claims resource or a "ticket system" or anything like that added to the placement REST API.
14:32:52 <bauzas> jaypipes: like I said 5 mins ago, I got your concerns and I will update the PS
14:32:56 <edleafe> jaypipes: yeah, there's no need to add some new concept here
14:33:03 <bauzas> jaypipes: claims shouldn't be a REST resource
14:33:08 <bauzas> and that, I agree
14:33:15 <jaypipes> ok, cool with me then
14:33:16 <johnthetubaguy> jaypipes: +1 thats what I tried to suggest on the spec too, the problem was the resource tracker deleting allocation it doesn't recognise I think?
14:33:34 <edleafe> jaypipes: but the idea that placement will have sufficient "knowledge" to make a choice is not in the foreseeable future
14:33:49 <bauzas> okay, here is what I propose
14:34:06 <bauzas> I think we're basically agreeing on a direction with lots of corner cases still to discuss
14:34:08 <jaypipes> edleafe: understood.
14:34:18 <edleafe> johnthetubaguy: the idea would be if the build fails, RT would delete the allocations for that instance UUID
14:34:24 <bauzas> so I'm just editing the spec now, so you could just bite it
14:34:39 <johnthetubaguy> edleafe: I am fine with all that, I am on about the current logic of refreshing allocations in the resource tracker
14:35:05 <edleafe> johnthetubaguy: OIC
14:35:23 <johnthetubaguy> you might have to update instance.host before you claim from placement, or something
14:35:26 <edleafe> johnthetubaguy: yes, that will have to be accounted for in bauzas's spec
14:35:46 <johnthetubaguy> that's my main worry anyways, interactions with the resource tracker
14:35:59 <johnthetubaguy> overall, seems like an approach worth digging deeper into
14:36:13 <edleafe> #action bauzas to revise claims spec to remove API changes
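[editor's note] The flow agreed on above (placement returns providers, the scheduler filters and weighs them, then the scheduler records an allocation against the chosen host, with allocations deleted if the build fails) can be sketched as below. Everything here is an assumption made for illustration: FakePlacement is an in-memory stand-in and not the real placement REST API, and the method names, provider names, and unit counts are hypothetical.

```python
class FakePlacement:
    """In-memory stand-in for placement (hypothetical; not the real REST API)."""

    def __init__(self, inventory):
        self.inventory = dict(inventory)  # provider -> total units
        self.allocations = {}             # consumer -> (provider, units)

    def free(self, provider):
        used = sum(u for p, u in self.allocations.values() if p == provider)
        return self.inventory[provider] - used

    def get_providers(self, units):
        """Return providers that can currently fit the request, with free units."""
        return {p: self.free(p) for p in self.inventory if self.free(p) >= units}

    def put_allocation(self, consumer, provider, units):
        """The 'claim': fails if capacity vanished since get_providers()."""
        if self.free(provider) < units:
            return False
        self.allocations[consumer] = (provider, units)
        return True

    def delete_allocations(self, consumer):
        """What would happen for an instance whose build failed."""
        self.allocations.pop(consumer, None)


def schedule(placement, consumer, units, filters, weigher):
    """Filter and weigh candidates, then claim the best one that still fits."""
    candidates = placement.get_providers(units)
    passing = [p for p in candidates if all(f(p) for f in filters)]
    ranked = sorted(passing, key=lambda p: weigher(p, candidates[p]), reverse=True)
    for provider in ranked:
        if placement.put_allocation(consumer, provider, units):
            return provider
    return None


placement = FakePlacement({"A": 8, "B": 16, "C": 32})
filters = [lambda p: p != "C"]   # a filter placement itself cannot express
weigher = lambda p, free: free   # prefer the provider with the most free units

picks = [schedule(placement, f"inst-{i}", 6, filters, weigher) for i in range(1, 5)]
print(picks)

placement.delete_allocations("inst-1")  # pretend inst-1 failed to build
retry = schedule(placement, "inst-4", 6, filters, weigher)
print(retry)
```

The claim happens before a host is handed back, so the window for two requests to pick the same capacity is narrowed but, as noted in the discussion, not eliminated.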
14:36:42 <edleafe> Let's continue the discussion on the revised spec
14:36:45 <johnthetubaguy> was this still planned for early next cycle?
14:36:45 <edleafe> #link Show scheduler hints in server details https://review.openstack.org/#/c/440580/
14:37:14 <edleafe> Lots of negative response to that one
14:37:55 <johnthetubaguy> the main issue is being clear why that API is needed
14:38:18 <bauzas> it's... difficult
14:38:25 <bauzas> given custom hints
14:38:41 <bauzas> I think the consensus was to admit custom hints should never be returned by that API
14:39:08 <johnthetubaguy> big picture wise: is it build another instance that is similar as a user, make good decisions when you pick a specific host as an operator, etc
14:39:10 <bauzas> but the real problem provided by sdague is that we make a clear contract on what is a hint
14:39:25 <bauzas> and we forcely name it a thing
14:39:55 <johnthetubaguy> we had a alaski spec about discoverable hints via a registration system at some point
14:40:02 <bauzas> johnthetubaguy: I think we should be very explicit that getting the list of hints you provided is just meaningless
14:40:18 <bauzas> it's just for very specific purposes
14:40:35 <bauzas> like, I know what I do, but I forgot what I typed
14:41:25 <bauzas> johnthetubaguy: discovering the hints could be easier if we merge scheduler with conductor
14:41:33 <bauzas> conceptually I mean
14:41:43 <bauzas> but that's mid-term
14:42:01 <bauzas> I'm not super happy with us exposing the available hints thru the API
14:42:12 <cdent> bauzas: what does the merge change that would make things easier?
14:42:25 <johnthetubaguy> bauzas: not sure the merge makes any different to how it was originally proposed
14:42:28 <bauzas> cdent: the fact that hints are scheduler related
14:43:02 <johnthetubaguy> the proposal was just make a better, proper, extension point for filters and weighers
14:43:06 <bauzas> cdent: if the conductor owns the filters, it makes one less RPC roundtrip
14:43:27 <cdent> okay, but how does that impact hints?
14:43:45 <johnthetubaguy> bauzas: I think the API would have to be configured with the list of scheduler plugins, for this to all work, else it was really suck
14:43:57 <johnthetubaguy> s/was/would/
14:43:59 <bauzas> cdent: hints are defined by filters
14:44:27 <bauzas> cdent: it's just some semantics that is defined in there
14:44:41 * johnthetubaguy is confused
14:44:53 * cdent is with johnthetubaguy
14:45:04 <bauzas> well
14:45:40 <bauzas> I can't be more explicit when I'm saying that a hint is passed down from the API to the scheduler, and that only the filter which uses it really looks at it
14:45:56 <bauzas> if you disable the filter, then the hint is meaningless
14:46:24 <bauzas> the semantics of the string, and how you need to write the hint is also defined by the filter
14:46:46 <johnthetubaguy> I was suggesting the API makes use of the list of configured filter and weight "plugins" basically, but that kinda makes them hard to split out
14:47:02 <johnthetubaguy> the old proposal basically allowed json schema for the possible hints each thing supported, I think
14:47:26 <johnthetubaguy> with human readable descriptions for each api hint that can be output by the scheduler
14:47:50 <bauzas> johnthetubaguy: a JSON schema is worth it, given some filter can change what it expects from the hint
14:47:52 <johnthetubaguy> there was some talk of namespacing, so its clear what is built in vs custom, etc
14:48:07 <johnthetubaguy> but we seem distracted from the original spec I guess
14:48:14 <bauzas> eg. a filter can wait for a string in a hint, but the next release of that filter can accept both a string or a list
14:48:38 <johnthetubaguy> the problem is it's not clear how folks want to use that API that lists the scheduler hints used at boot, so it's not clear what they want to know about what they mean
14:48:46 <bauzas> :)
14:49:31 <edleafe> johnthetubaguy: yeah, even if they got the list of hints, they wouldn't know if they were ignored or not, right?
14:49:56 <johnthetubaguy> edleafe: yes, but its unclear to me if that matters to the users
14:50:04 <bauzas> it does
14:50:09 <johnthetubaguy> on move we can "check" the destination is "good"
14:50:20 <bauzas> if affinity filter is disabled, then the same_host hint is meaningless
14:50:23 <johnthetubaguy> but that fails with affinity related operators
14:50:47 <johnthetubaguy> now does that mean the real thing folks want is to move server groups of servers to different hosts
14:50:56 <edleafe> bauzas: exactly. So how would an enduser benefit from seeing that hint?
14:51:01 <johnthetubaguy> or do people want to just create a VM thats just like their old VM
14:51:06 <johnthetubaguy> or is it both
14:51:17 <johnthetubaguy> seems like different solutions are best for each of those cases
14:51:41 <johnthetubaguy> none appear to be the proposed spec, but the proposed spec is easiest to implement to hack in said features
14:51:50 <bauzas> yeah
14:52:03 <edleafe> Let's continue discussion on the spec - only 8 minutes left
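[editor's note] Two points from the hints discussion above lend themselves to a small sketch: a hint only means something if the filter that reads it is enabled, and a filter may accept more than one shape for a hint (a string in one release, a string or a list in the next). The code below is a hypothetical illustration in plain Python, not nova code; HINT_PARSERS, parse_same_host and split_hints are invented names, and the filter-to-hint mapping is an assumption.

```python
def parse_same_host(value):
    """Accept either a single string or a non-empty list of strings."""
    if isinstance(value, str):
        return [value]
    if isinstance(value, list) and value and all(isinstance(v, str) for v in value):
        return list(value)
    raise ValueError("same_host must be a string or a non-empty list of strings")


# Hypothetical registry: which filter understands which hint, and how to parse it.
HINT_PARSERS = {
    "SameHostFilter": {"same_host": parse_same_host},
}


def split_hints(hints, enabled_filters):
    """Separate hints an enabled filter will read from hints that are ignored."""
    used, ignored = {}, {}
    for name, value in hints.items():
        parser = None
        for flt in enabled_filters:
            parser = HINT_PARSERS.get(flt, {}).get(name)
            if parser is not None:
                break
        if parser is None:
            ignored[name] = value
        else:
            used[name] = parser(value)
    return used, ignored


hints = {"same_host": "uuid-1", "my_custom": 42}

used_on, ignored_on = split_hints(hints, ["SameHostFilter"])
used_off, ignored_off = split_hints(hints, [])
print(used_on, ignored_on)
print(used_off, ignored_off)
```

With the filter disabled, the same request body yields no effective hints, which is the situation raised above where a user could see a hint in the API that never influenced placement.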
14:52:08 <edleafe> #link ProviderTree series https://review.openstack.org/#/c/415920/
14:52:21 <bauzas> I think the usecase is just for people wanting to remember what they typed initially
14:52:23 <edleafe> jaypipes: anything to say about that one?
14:52:30 <bauzas> here, for example "same_host"
14:52:59 <johnthetubaguy> bauzas: thats not clear in the spec yet
14:53:42 <bauzas> agreed
14:53:46 <jaypipes> edleafe: will leave my comments on the review.
14:53:52 <jaypipes> edleafe: need to digest it, sorry.
14:54:00 <edleafe> jaypipes: thanks
14:54:10 <edleafe> Finally:
14:54:11 <edleafe> #link Add use-local-scheduler https://review.openstack.org/#/c/438936/
14:54:17 <edleafe> johnthetubaguy: comments?
14:54:33 <macsz> i am still digging through code to answer bauzas comments
14:54:47 <johnthetubaguy> I think we agreed to can this, but the main thing is what work can we do to make this possible next cycle
14:54:54 <johnthetubaguy> i.e. lets have a list of all the technical blockers
14:55:07 <johnthetubaguy> like the scheduler host state cache stuff
14:55:11 <cdent> I'm still not understanding why it is a good idea.
14:55:14 <edleafe> johnthetubaguy: sounds good. Any blockers?
14:55:26 <johnthetubaguy> it might be once we have "allocations quicker" a lot of the blockers go away
14:55:38 <johnthetubaguy> lots of blockers, but they are all with macsz I think
14:56:15 <johnthetubaguy> do people agree with the general direction?
14:56:29 <johnthetubaguy> i.e. less processes for operators to run == easier life for operators
14:56:35 <cdent> no
14:56:55 <cdent> (see my comment on patchset 9)
14:56:58 <macsz> i moved the spec to backlog, adding list of stuff that would need to be resolved is a good point
14:57:05 <jaypipes> johnthetubaguy: we live in a microservices world, and you are a microservices girl.
14:57:29 * edleafe applauds the Madonna reference
14:57:51 <edleafe> Time is short, so...
14:57:52 <edleafe> #topic Bugs
14:57:52 <edleafe> Any new bugs to discuss?
14:58:15 <macsz> nothing from last week, i think
14:58:30 <edleafe> Then let's jump to:
14:58:31 <edleafe> #topic Open Discussion
14:58:47 <edleafe> 90 seconds to spill whatever's on your mind :)
14:58:51 <johnthetubaguy> cdent: agreed with the concern on debug-ability, honestly, thats why I proposed this
14:59:01 <johnthetubaguy> jaypipes: I am all for it when they have a good clear reason to exist
14:59:25 <jaypipes> johnthetubaguy: I was mostly just joking with ya :)
14:59:32 <jaypipes> johnthetubaguy: haven't read the spec yet :)
14:59:45 <johnthetubaguy> jaypipes: heh, well you got that song in my head, lol
14:59:55 <jaypipes> Mission Accomplished.
15:00:03 <edleafe> Looks like we're out of time, so let's continue any discussions in #openstack-nova
15:00:05 <edleafe> #endmeeting