14:00:19 #startmeeting nova_scheduler
14:00:20 Meeting started Mon Apr 3 14:00:19 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:21 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:23 The meeting name has been set to 'nova_scheduler'
14:00:27 o/
14:00:32 Good UGT morning!
14:00:35 \o
14:00:41 o/
14:01:27 o/
14:01:31 o/
14:01:31 * macsz ashamed that every single time has to check what UGT means
14:01:49 \o
14:01:56 macsz: :)
14:02:21 * johnthetubaguy lurks
14:02:23 4pm is like an excellent morning for some real people
14:02:23 and it isn't general union of workers in Portugal!
14:02:37 "Now, instead of spending time figuring out what time of day is it for every member of the channel, we spend time explaining newcomers benefits of UGT." is the relevant quote...
14:02:52 mostly jetsettes
14:02:56 jetsetters even
14:03:07 jimbaker: 'zactly
14:03:31 Anyways, let's get started
14:03:37 #topic Specs and Reviews
14:03:52 First up:
14:03:54 #link Traits series https://review.openstack.org/#/c/376201/
14:04:14 Seems to be some gate issues, but it looks like that's pretty much done
14:05:09 ya
14:05:12 Along the same lines...
14:05:13 #link os-traits reorg https://review.openstack.org/#/c/448282/
14:06:18 Two approaches for automating/simplifying the import of the directories have been proposed
14:06:21 #link pony https://github.com/cdent/pony
14:06:23 #link autoimport https://github.com/EdLeafe/autoimport
14:07:06 jaypipes: have you had a chance to look over those?
14:07:30 edleafe: only yours. which I supported and said "go for it."
14:07:37 edleafe: looking at the pony now..
14:08:02 oh wait, yeah, I did look at the pony.
14:08:06 I think a hybrid of autoimport and pony is probably "right" (not that any of it is wrong)
14:08:20 I didn't like it having the 'NAMES' attribute in each module.
14:08:35 what's there already is basically pony, but with a c-macros style thing
14:08:55 I think the repeat of register is _really_ noisy, but I think that's a question of preference/taste/whatever and not really a technical concern
14:09:08 adding the autoimport would be nice
14:09:18 * jroll strolls in late, looks around, whispers good morning
14:09:25 mornin
14:09:42 jaypipes: So do you want me to take a whack and push an update that incorporates these?
14:09:53 mmm
14:09:55 * edleafe tips his hat to jroll
14:09:59 edleafe: yes, on top of my existing reorg patch.
14:10:12 edleafe: just the autoimport pkgutils foo.
14:10:24 why should we import some minor package, for getting an autoimport? :)
14:10:26 edleafe: not the "put everything in a NAMES attribute in each module" thing
14:10:44 tbh, I'm not liking adding some minor packages in the requirements repo
14:10:55 edleafe: make sure you get rid of that register noise ;)
14:10:55 bauzas: importlib you mean?
14:10:57 at least in case the only packager stops providing new versions....
14:11:52 bauzas: pkgutil is part of the standard library
14:11:56 bauzas: pkgutil is stdlib: https://docs.python.org/2/library/pkgutil.html
14:12:06 bauzas: and importlib is an oslo library.
14:12:13 I thought?
14:12:15 oh sorry, I misunderstood
14:12:29 jaypipes: importlib is stdlib too
14:12:30 those 2 above repos are not a new package, right?
14:12:31 importlib is stdlib
14:12:36 it's just a discussion repo?
14:12:39 edleafe: you are still a faster typist
14:13:02 bauzas: those were POCs to show jay what we had in mind
14:13:25 ah ok
14:13:53 fair enough then, I need to look at those
14:14:15 #action edleafe to do a patch on top of jaypipes's os-traits reorg patch to automate import of submodules
14:14:38 Moving on...
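(The pkgutil approach discussed above could be sketched roughly as follows. This is a minimal illustration of the pattern, not what either POC or the os-traits patch actually does; the helper name and the uppercase-constant convention are assumptions.)

```python
import importlib
import pkgutil


def import_submodules(package):
    """Import every direct submodule of *package* and hoist each
    module's public UPPERCASE constants into the package namespace,
    so modules don't need to maintain a NAMES attribute by hand.

    Returns the list of hoisted attribute names.
    """
    hoisted = []
    for _, modname, _ in pkgutil.iter_modules(package.__path__):
        module = importlib.import_module(
            '%s.%s' % (package.__name__, modname))
        for attr in dir(module):
            # Treat all-caps, non-private attributes as trait constants.
            if attr.isupper() and not attr.startswith('_'):
                setattr(package, attr, getattr(module, attr))
                hoisted.append(attr)
    return hoisted
```

With a layout like `os_traits/hw/cpu.py`, calling `import_submodules()` on the subpackage would pull each module's constants up one level automatically, which is the "autoimport pkgutils foo" being referred to.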
14:14:38 #link Claims in placement https://review.openstack.org/#/c/437424/
14:14:41 #link Forum topic on placement claims http://forumtopics.openstack.org/cfp/details/63
14:15:14 mriedem wants to have a Forum session to lay out the path for claims in placement
14:15:33 So we can get all the groundwork done in Pike, and implement in Queens
14:15:54 yup, and I need to provide a new PS...
14:16:20 jaypipes: You had some concerns about having the scheduler make a second call to claim the resources for its selected host
14:16:29 Care to explain the issue?
14:18:37 edleafe: cdent: I think we have some kind of misunderstanding between us
14:18:56 edleafe: cdent: long-term, we could still have filters running *after* placement is called
14:19:15 that could be done in the conductor if we merge it with the scheduler, but we could still have that
14:19:26 at least, that's the consensus we reached at the PTG
14:19:54 and placement couldn't possibly have feature parity with all the filters
14:19:58 bauzas: the question is how placement could ever make a claim if it doesn't know the eventual chosen resource provider
14:20:12 jaypipes seemed to think it could
14:20:21 edleafe: that's why I think the scheduler (or conductor) needs to set which host it's accepting
14:20:35 edleafe: hence my proposal to use the placement API for that
14:20:50 My point was that *unless* placement is a complete replacement for scheduler filters/weighers, it cannot claim directly
14:21:02 like: we get a list of RPs, the scheduler runs the filters, finds a destination, sets the destination by claiming it
14:21:09 bauzas: I don't think edleafe and I are disagreeing with you, jaypipes is.
14:21:18 ed and I are trying to find the middle ground between you two
14:21:23 edleafe: and *my* point is that placement can't have feature parity with all the filters we have
14:21:35 edleafe: the whole *point* of doing claims in the placement API is to have placement choose the resource provider.
14:21:36 bauzas: yes, I agree with that
14:22:09 jaypipes: and I say that until the placement engine can do that, we need the additional step
14:22:09 jaypipes: if we run filters after getting a list of RPs, how could placement know which one is accepted?
14:22:31 jaypipes: right now placement could return hundreds of RPs for a request
14:22:33 edleafe: we already *have* the extra step. It's how the system currently works.
14:22:49 jaypipes: not exactly
14:22:56 yes, exactly.
14:23:07 jaypipes: the extra step of claiming the resource is not currently done where I'm planning to add it
14:23:17 in this change, the scheduler selects a host, and would claim the resources before it returns that host to the conductor
14:23:25 yup
14:23:28 jaypipes: are you up to date on the comments on the spec since you made your comment?
14:23:39 it doesn't eliminate the race, but it sure cuts it down a lot
14:24:10 the Placement API is here for good reasons
14:24:10 bauzas: right, it's done on the compute node, but I'm meh on putting that step into the scheduler after running weighers and filters.
14:24:30 cdent: yes, I'm up to date on them.
14:24:31 jaypipes: you admit it would fix a couple of races?
14:24:52 jaypipes: I get that you'd like to have placement return a token
14:25:03 j✔
14:25:04 jaypipes: instead of a list of RPs
14:25:24 jaypipes: but there are reasons why we can't leave placement only providing a destination
14:25:26 bauzas: I absolutely *don't* want placement returning a "token". I've said that numerous times.
14:25:34 jaypipes: s/token/claim
14:25:41 yeah, I *don't* want that.
14:26:10 bauzas: I see no reason to invent some new "claim" resource.
14:26:16 the sooner placement knows about which host Nova picks, the better, right?
14:26:19 bauzas: "claim" isn't a resource. it's an action.
14:26:38 jaypipes: I agree with your last sentence
14:26:44 jaypipes: totally agree on not creating a new Claim resource
14:26:49 (I am checking the discussion is about the how, not the why/what)
14:27:13 okay, I think I understand both of your concerns
14:27:25 I added a claims REST resource, it should be a verb
14:27:30 lemme clarify that
14:27:49 what if the scheduler *claims* that allocation to placement?
14:28:02 bauzas: I don't know what you mean by that.
14:28:16 jaypipes: I am curious how you would like us to let placement know sooner what resource providers we picked? I can see a few possible ways, they all seem to interact with the resource tracker in nasty ways.
14:28:31 claiming is an action, you said
14:28:40 I'm translating that into a verb
14:28:54 the object model we have in placement is "allocations"
14:29:08 bauzas: basically, what I'm saying is that I see no reason whatsoever why there should be *any* change in the placement REST API for what you're suggesting to do (the multi-step request providers, filter/sort, and choose provider).
14:29:11 jaypipes: do I understand your proposal in that spec correctly? Have placement claim the resources on *all* the RPs it returns, and then have the scheduler call DELETE on the allocations for all the RPs it *doesn't* select?
14:30:29 jaypipes: I think we pretty much agree that the existing REST API can handle claims
14:30:48 jaypipes: if the scheduler sets a pre-existing allocation to the host it picked, it would help placement with concurrent requests, right?
14:31:15 edleafe: no. I was suggesting just having placement select a provider (or set of providers, like a compute host and a storage pool, etc.) and actually allocate against those providers. The scheduler would check that the provider met any additional wacky filters that could not be met in the placement API (yet) and, if the providers did not meet those filters, would DELETE the allocations from the placement API.
14:31:23 ie.
say the scheduler picks A, it sets an allocation against A for that RC of X
14:31:28 edleafe: that said, my approach wouldn't allow sorting/weighing.
14:31:38 jaypipes: ah, yeah, that's the problem bit
14:32:02 edleafe: so I'd be fine just having the placement API return the providers as it does now, then have the scheduler run those providers through its filters and weighers and then call POST /allocations.
14:32:21 jaypipes: that's what I just proposed
14:32:35 edleafe: what I don't want is a new /claims resource or a "ticket system" or anything like that added to the placement REST API.
14:32:52 jaypipes: like I said 5 mins ago, I got your concerns and I will update the PS
14:32:56 jaypipes: yeah, there's no need to add some new concept here
14:33:03 jaypipes: claims shouldn't be a REST resource
14:33:08 and with that, I agree
14:33:15 ok, cool with me then
14:33:16 jaypipes: +1 that's what I tried to suggest on the spec too; the problem was the resource tracker deleting allocations it doesn't recognise, I think?
14:33:34 jaypipes: but the idea that placement will have sufficient "knowledge" to make a choice is not in the foreseeable future
14:33:49 okay, here is what I propose
14:34:06 I think we're basically agreeing on a direction with lots of corner cases still to discuss
14:34:08 edleafe: understood.
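(The flow agreed above — placement returns providers, the scheduler filters and weighs, then writes an allocation against the chosen host through the existing allocations API — might look something like this sketch. The payload shape approximates the early placement allocations format; the function names, the session object, and the flavor keys are illustrative assumptions, not Nova code.)

```python
def build_claim(rp_uuid, vcpus, ram_mb, disk_gb):
    """Build the body the scheduler would PUT to
    /allocations/{consumer_uuid} to claim its chosen provider.
    Shape approximates the early placement allocations API."""
    return {
        'allocations': [
            {
                'resource_provider': {'uuid': rp_uuid},
                'resources': {
                    'VCPU': vcpus,
                    'MEMORY_MB': ram_mb,
                    'DISK_GB': disk_gb,
                },
            },
        ],
    }


def claim_host(session, instance_uuid, chosen_rp, flavor):
    """Hypothetical scheduler step: after filters/weighers pick
    *chosen_rp*, claim it against the instance UUID as consumer.
    On a later build failure, the resource tracker would issue
    DELETE /allocations/{instance_uuid} to release the claim."""
    body = build_claim(chosen_rp, flavor['vcpus'],
                       flavor['ram'], flavor['disk'])
    return session.put('/allocations/%s' % instance_uuid, json=body)
```

Claiming this early narrows, but does not eliminate, the race window between scheduling and the compute node's own resource accounting, which is the resource tracker interaction johnthetubaguy raises next.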
14:34:18 johnthetubaguy: the idea would be that if the build fails, the RT would delete the allocations for that instance UUID
14:34:24 so I'm just editing the spec now, so you could just bite it
14:34:39 edleafe: I am fine with all that, I am on about the current logic of refreshing allocations in the resource tracker
14:35:05 johnthetubaguy: OIC
14:35:23 you might have to update instance.host before you claim from placement, or something
14:35:26 johnthetubaguy: yes, that will have to be accounted for in bauzas's spec
14:35:46 that's my main worry anyways, interactions with the resource tracker
14:35:59 overall, seems like an approach worth digging deeper into
14:36:13 #action bauzas to revise claims spec to remove API changes
14:36:42 Let's continue the discussion on the revised spec
14:36:45 was this still planned for early next cycle?
14:36:45 #link Show scheduler hints in server details https://review.openstack.org/#/c/440580/
14:37:14 Lots of negative response to that one
14:37:55 the main issue is being clear about why that API is needed
14:38:18 it's...
difficult
14:38:25 given custom hints
14:38:41 I think the consensus was to admit that custom hints should never be returned by that API
14:39:08 big picture wise: is it to build another instance that is similar, as a user, or to make good decisions when you pick a specific host, as an operator, etc.
14:39:10 but the real problem raised by sdague is that we'd be making a clear contract on what a hint is
14:39:25 and we'd forcibly name it a thing
14:39:55 we had an alaski spec about discoverable hints via a registration system at some point
14:40:02 johnthetubaguy: I think we should be very explicit that getting the list of hints you provided is just meaningless
14:40:18 it's just for very specific purposes
14:40:35 like, I know what I do, but I forgot what I typed
14:41:25 johnthetubaguy: discovering the hints could be easier if we merge the scheduler with the conductor
14:41:33 conceptually I mean
14:41:43 but that's mid-term
14:42:01 I'm not super happy with us exposing the available hints thru the API
14:42:12 bauzas: what does the merge change that would make things easier?
14:42:25 bauzas: not sure the merge makes any difference to how it was originally proposed
14:42:28 cdent: the fact that hints are scheduler related
14:43:02 the proposal was just to make a better, proper extension point for filters and weighers
14:43:06 cdent: if the conductor owns the filters, it makes one less RPC roundtrip
14:43:27 okay, but how does that impact hints?
14:43:45 bauzas: I think the API would have to be configured with the list of scheduler plugins, for this to all work, else it was really suck
14:43:57 s/was/would/
14:43:59 cdent: hints are defined by filters
14:44:27 cdent: it's just some semantics that are defined in there
14:44:41 * johnthetubaguy is confused
14:44:53 * cdent is with johnthetubaguy
14:45:04 well
14:45:40 I can't be more explicit when I'm saying that a hint is passed down from the API to the scheduler, and that only the filter which uses it really looks at it
14:45:56 if you disable the filter, then the hint is meaningless
14:46:24 the semantics of the string, and how you need to write the hint, are also defined by the filter
14:46:46 I was suggesting the API make use of the list of configured filter and weight "plugins" basically, but that kinda makes them hard to split out
14:47:02 the old proposal basically allowed a JSON schema for the possible hints each thing supported, I think
14:47:26 with human-readable descriptions for each API hint that can be output by the scheduler
14:47:50 johnthetubaguy: a JSON schema is worth it, given some filters can change what they expect from the hint
14:47:52 there was some talk of namespacing, so it's clear what is built-in vs custom, etc.
14:48:07 but we seem distracted from the original spec I guess
14:48:14 e.g. a filter can expect a string in a hint, but the next release of that filter can accept both a string and a list
14:48:38 the problem is it's not clear how folks want to use that API that lists the scheduler hints used at boot, so it's not clear what they want to know about what they mean
14:48:46 :)
14:49:31 johnthetubaguy: yeah, even if they got the list of hints, they wouldn't know if they were ignored or not, right?
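(The JSON-schema idea for discoverable hints sketched above could look roughly like this for a same_host-style hint that once took a single instance UUID and now also accepts a list. This is entirely illustrative — no such hint registry or schema format exists in Nova; only the hint name and the filter-dependence are from the discussion.)

```json
{
    "same_host": {
        "description": "Schedule the instance on the same host as the given instance(s). Ignored unless SameHostFilter is enabled.",
        "namespace": "builtin",
        "anyOf": [
            {"type": "string", "format": "uuid"},
            {"type": "array", "items": {"type": "string", "format": "uuid"}}
        ]
    }
}
```

A registry like this would give the API a contract for validating and describing hints per configured filter, which is exactly the coupling to the deployed filter list that makes the feature hard to split out.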
14:49:56 edleafe: yes, but it's unclear to me if that matters to the users
14:50:04 it does
14:50:09 on move, we can "check" that the destination is "good"
14:50:20 if the affinity filter is disabled, then the same_host hint is meaningless
14:50:23 but that fails with affinity-related operations
14:50:47 now does that mean the real thing folks want is to move server groups of servers to different hosts
14:50:56 bauzas: exactly. So how would an end user benefit from seeing that hint?
14:51:01 or do people just want to create a VM that's just like their old VM
14:51:06 or is it both
14:51:17 seems like different solutions are best for each of those cases
14:51:41 none appear to be the proposed spec, but the proposed spec is the easiest to implement to hack in said features
14:51:50 yeah
14:52:03 Let's continue discussion on the spec - only 8 minutes left
14:52:08 #link ProviderTree series https://review.openstack.org/#/c/415920/
14:52:21 I think the use case is just people wanting to remember what they typed initially
14:52:23 jaypipes: anything to say about that one?
14:52:30 here, for example "same_hint"
14:52:59 bauzas: that's not clear in the spec yet
14:53:42 agreed
14:53:46 edleafe: will leave my comments on the review.
14:53:52 edleafe: need to digest it, sorry.
14:54:00 jaypipes: thanks
14:54:10 Finally:
14:54:11 #link Add use-local-scheduler https://review.openstack.org/#/c/438936/
14:54:17 johnthetubaguy: comments?
14:54:33 i am still digging through code to answer bauzas comments
14:54:47 I think we agreed to can this, but the main thing is what work can we do to make this possible next cycle
14:54:54 i.e. let's have a list of all the technical blockers
14:55:07 like the scheduler host state cache stuff
14:55:11 I'm still not understanding why it is a good idea.
14:55:14 johnthetubaguy: sounds good. Any blockers?
14:55:26 it might be that once we have "allocations quicker" a lot of the blockers go away
14:55:38 lots of blockers, but they are all with macsz I think
14:56:15 do people agree with the general direction?
14:56:29 i.e. fewer processes for operators to run == easier life for operators
14:56:35 no
14:56:55 (see my comment on patchset 9)
14:56:58 i moved the spec to backlog; adding a list of stuff that would need to be resolved is a good point
14:57:05 johnthetubaguy: we live in a microservices world, and you are a microservices girl.
14:57:29 * edleafe applauds the Madonna reference
14:57:51 Time is short, so...
14:57:52 #topic Bugs
14:57:52 Any new bugs to discuss?
14:58:15 nothing from last week, i think
14:58:30 Then let's jump to:
14:58:31 #topic Open Discussion
14:58:47 90 seconds to spill whatever's on your mind :)
14:58:51 cdent: agreed with the concern on debug-ability; honestly, that's why I proposed this
14:59:01 jaypipes: I am all for it when they have a good clear reason to exist
14:59:25 johnthetubaguy: I was mostly just joking with ya :)
14:59:32 johnthetubaguy: haven't read the spec yet :)
14:59:45 jaypipes: heh, well you got that song in my head, lol
14:59:55 Mission Accomplished.
15:00:03 Looks like we're out of time, so let's continue any discussions in #openstack-nova
15:00:05 #endmeeting