14:00:17 #startmeeting nova_scheduler
14:00:17 Meeting started Mon Aug 28 14:00:17 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:21 The meeting name has been set to 'nova_scheduler'
14:00:26 * edleafe thinks cdent is antsy
14:00:37 Anyone here today?
14:01:05 Does antsy mean “the sun is shining outside”?
14:01:23 o/
14:01:37 o/
14:01:48 I'm feeling guilty having a bit of sun this morning while nearby Houston drowns
14:02:52 I know bauzas is still on holiday. jaypipes - around?
14:03:06 he's out until mid week i think
14:03:20 well still traveling but might be around right now
14:03:42 edleafe: yup
14:03:48 kewl
14:03:52 let's start
14:03:56 #link Saner RT agg map updates https://review.openstack.org/#/c/489633/
14:03:59 doh!
14:04:12 * edleafe can't copy/paste
14:04:15 #undo
14:04:16 Removing item from minutes: #link https://review.openstack.org/#/c/489633/
14:04:21 #topic Specs & Reviews
14:04:31 #link Spec for returning allocation requests to the scheduler https://review.openstack.org/#/c/471927/
14:04:49 This is an internal spec that we've already pretty much implemented
14:04:57 just leftover from pike
14:05:28 needs some love from nova-specs cores
14:05:36 #link Add alternate hosts https://review.openstack.org/486215/
14:06:03 This is marked WIP because I don't really like how this works
14:06:40 I think we really need to spend some design time before we merge something like this
14:06:49 is it on the ptg etherpad?
14:06:49 #link https://blog.leafe.com/handling-unstructured-data/
14:07:13 mriedem: not yet - I wanted to have discussions sooner
14:07:20 and then continue at PTG
14:07:27 if needed
14:08:27 We made these choices in haste last cycle
14:08:55 And while we have the time I want to make sure we use it to not add more technical debt
14:09:41 +1
14:10:05 jaypipes doesn't feel that this is going to be a problem; he says it makes things simpler
14:10:12 I'd like to hear from others
14:10:20 edleafe: I don't necessarily agree that this is unstructured data, but I agree with you that just returning lists of lists of unnamed tuples or a tuple of lists of lists of HostState objects is not good.
14:10:26 Look at the code for that series, and make sure we want to live with that
14:11:06 edleafe: what are you talking about, I don't think this is going to be a problem? :) I specifically say in the patch that I don't like returning lists of lists of tuples.
14:11:31 maybe we should just stick the data in dogpile, and send around a reference uuid instead
14:11:33 jaypipes: I was basing that on your comment on the blog post
14:11:35 edleafe: and gave a suggestion of solving with a namedtuple.
14:11:40 ;)/2
14:12:04 cdent: I was thinking of oslo.cache this weekend
14:12:11 which is a wrapper around dogpile
14:12:47 the key could be something like request_id + root_provider
14:13:37 Passing around huge globs of data never feels right to me, whether they are in named or unnamed tuples
14:13:38 ew
14:14:02 huge globs of data, like the RequestSpec?
14:14:08 this is quite a bit simpler though, right?
14:14:25 why don't we just abstract this into some versioned object?
14:14:29 mriedem: no, like the dicts of junk we passed before RequestSpec was made into an object
14:14:43 I fail to see how this is a huge set of data.
14:14:44 filter_properties
14:15:20 so why not just an AlternativeHosts object or something?
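As a reference point for the namedtuple suggestion and the "AlternativeHosts object" question above, a minimal sketch of what a named, per-instance structure could look like instead of bare nested tuples. All class, field, host, and provider names here are hypothetical, not the actual Nova code:

```python
import collections

# Hypothetical shape only -- a named entry makes each field self-documenting,
# unlike an anonymous (host, node, allocation) tuple whose positions have to
# be memorized.
AlternateHost = collections.namedtuple(
    'AlternateHost', ['host', 'nodename', 'allocation_request'])

# One list per requested instance: the first entry is the selected host,
# the rest are alternates to fall back to on a reschedule.
alternates_for_instance = [
    AlternateHost('compute1', 'compute1',
                  {'allocations': [{'resource_provider': {'uuid': 'rp-1'},
                                    'resources': {'VCPU': 1, 'MEMORY_MB': 512}}]}),
    AlternateHost('compute2', 'compute2',
                  {'allocations': [{'resource_provider': {'uuid': 'rp-2'},
                                    'resources': {'VCPU': 1, 'MEMORY_MB': 512}}]}),
]

print(alternates_for_instance[0].host)  # 'compute1' -- no index guessing
```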
14:15:31 edleafe: isn't this just zero or more allocation request blobs (one per max_attempts per instance in num_instances)?
14:15:33 mriedem: that would be an improvement
14:15:50 jaypipes: one or more
14:15:56 for each instance
14:15:56 I don’t think the issue is that there’s some catastrophe afoot
14:16:08 but rather that we can do this more cleanly, so may as well get it right, now
14:16:21 and the way to get something right is to have a chat about it
14:16:27 so the cache idea is like how reservation_id works in the compute API?
14:16:39 create >1 instances and get a reservation id back, so you can query on that later when listing servers?
14:17:07 mriedem: the exact key isn't important. It could be a UUID
14:17:15 -1 on using a cache. I see no reason to do that for this amount of data.
14:17:31 * jaypipes surprised cdent hasn't said the same.
14:17:45 i also don't see the need to use a cache for this
14:17:48 jaypipes: once again, it's not the *volume* of data
14:18:00 mriedem: I'm throwing out ideas
14:18:06 I'm not married to any
14:18:06 if it's the structure that's a problem, just create a versioned object
14:18:06 the cache idea was a lark, a spitball
14:18:25 mriedem: ya
14:18:25 mriedem: I proposed an object for this in the blog post
14:18:29 ok
14:18:32 and document the fields
14:18:37 ^ something i wish the reqspec had
14:18:39 documentation
14:18:43 per my recent ML thread
14:19:24 ++
14:19:29 it’s a derivation of how I think we should be managing passing data: not in objects that we put over RPC, but in retrievable cacheable data, only identifiers over the rpc wire. But that’s not really germane for right now. What’s germane for right now is: hey wouldn’t it be great if we had something tidy. It sounds like some kind of object is the current def of “tidy”
14:19:56 cdent: you mean how k8s works?
14:20:30 I don’t watch that show
14:20:53 or we could just go back to having all the computes read from the DB.
14:20:58 I just happen to like global ram
14:21:00 yeah, but at some point the client side has to deal with the structure of the thing it pulls out of the cache, or rpc response
14:21:11 mriedem: zactly.
14:21:12 and it sounds like that is the main concern
14:21:34 mriedem: not necessarily
14:21:37 (for the record, I'm totally not serious about having computes read from the DB again)
14:21:57 the allocations are there so that we can unambiguously claim complex resources
14:22:20 whether we post the details, or a link to the details, is not important
14:23:09 right now we're passing around a bunch of these details, most of which will never be needed
14:24:06 edleafe: well, right now we're not passing around anything :)
14:24:21 jaypipes: right now our current design is
14:24:32 and we are passing between placement and scheduler
14:24:51 edleafe: right, but that's not going to change w.r.t. the alternate hosts stuff...
14:24:53 the design calls for then passing to super conductor and cell conductor
14:25:12 alternate hosts just multiplies the complexity
14:25:23 I fail to see that.
14:25:26 making every X a list of X
14:25:49 instead of returning a host per instance, we return a list of hosts per instance
14:26:18 it *reduces* the complexity of the retry operation as a whole because no longer does the request_spec.filter_properties['retry'] stuff need to be adjusted nor does the scheduler need to be re-consulted on each retry iteration
14:26:23 instead of returning an allocation dict per instance, we return a list of them per instance
14:26:46 jaypipes: of course. I've not said a thing about retry
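To make the "stick the data in dogpile and send around a reference" spitball above concrete (it got a -1 in the meeting; this is shown only to illustrate the identifiers-over-RPC pattern), a bare-bones sketch using dogpile.cache directly. The key layout and function names are hypothetical:

```python
import uuid
from dogpile.cache import make_region

# Hypothetical: a shared cache region. In practice this would be a
# distributed backend (e.g. memcached via oslo.cache), not in-process memory.
region = make_region().configure('dogpile.cache.memory')

def stash_allocation_requests(request_id, root_provider, alloc_requests):
    """Store the blob and return only a small key to pass over RPC."""
    key = '%s:%s' % (request_id, root_provider)
    region.set(key, alloc_requests)
    return key

def fetch_allocation_requests(key):
    """Retrieve the blob on the other side of the RPC call."""
    return region.get(key)

key = stash_allocation_requests(str(uuid.uuid4()), 'root-rp-uuid',
                                [{'resources': {'VCPU': 1}}])
print(fetch_allocation_requests(key))
```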
14:26:48 edleafe: yep, that is totes true.
14:26:59 edleafe: this whole thing is about the retry operation.
14:27:19 what I'm saying is that it would be cleaner to return a list of objects instead of this complex 2-tuple
14:27:24 edleafe: attempting to a) reduce the complexity of that operation and b) allow it to work in a cell-no-upcall situation
14:27:40 edleafe: no disagreement from me on that.
14:27:45 are there two different types of complexity being discussed here?
14:28:11 cdent: yes, and it's the multiplication effect that is my concern
14:28:32 The added complexity of passing a list of hosts is necessary
14:28:44 It's much cleaner than going through the retry cycle
14:29:12 edleafe: I'm really not following your multiplication effect concern.
14:29:17 But passing the corresponding allocation dicts along with that is messy
14:29:39 edleafe: that's where I'm not following you. why is that messy?
14:29:48 so i think we can agree we don't want the 2-tuple
14:29:57 and use an object
14:30:37 jaypipes: because it's relying on positional matches
14:31:07 the allocation dict for a given host is referenced by having the same nested list position
14:31:41 edleafe: ok
14:31:48 IOW, if we are using host 2 of instance 3, we get its allocation through allocation_lists[2, 3]
14:31:48 key off the host,
14:31:50 don't use tuples
14:31:52 use an object with a dict
14:32:05 mriedem: that would be much, much better
14:32:09 so do it
14:32:22 Sure
14:32:34 I wanted agreement before I did
14:32:42 in case someone had an even better idea
14:32:46 or a reason not to
14:33:09 if we have to identify something in a structure, let's use keys in a dict rather than indexes in a tuple/list/whatever
14:33:13 in general, always
14:33:27 #agreed select_destinations will return a list of objects for each requested instance
14:33:34 otherwise i'll always have to re-learn what the items in the tuple are
14:33:39 mriedem: ++
14:33:46 mriedem: yeah, that was my fear
14:34:09 mriedem: *I* know what those indexes are, but someone coming in new to the code would be completely confused
14:34:40 i can assure you i'd have to relearn it every time i debug that code
14:35:03 like everything in the scheduler
14:35:06 mriedem: 'zactly
14:35:18 I'll start working on that today
14:35:23 Let's move on
14:35:26 #link Saner RT agg map updates https://review.openstack.org/#/c/489633/
14:35:48 cdent: comments?
14:36:04 nope, there it is, have at
14:36:40 ok then
14:36:47 and for completeness:
14:36:47 #link Nested RP series https://review.openstack.org/#/c/470575/
14:37:02 That is still a ways off before it is resumed
14:37:12 let's re-propose the spec for nested RPs for queens
14:37:20 are there changes in design that need to be updated in the spec?
14:38:02 mriedem: none that I know of. jaypipes?
14:38:39 i just wondered since you said it's a ways off
14:38:44 not sure what it's a ways off from
14:39:28 the integration of traits, shared, and nested all in the same place is major hairy
14:39:31 mriedem: well, that was jay's comment. The stuff that has to be done first is mostly in his head
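For illustration of the "use an object with a dict, keyed off the host" agreement above, a sketch of looking up a host's allocation by name instead of by matching nested list positions. The class and attribute names are made up for this example, not the eventual versioned object:

```python
class InstanceSelection(object):
    """Hypothetical stand-in for a per-instance object from select_destinations."""

    def __init__(self, hosts, allocations_by_host):
        # Ordered list: the chosen host first, then the alternates.
        self.hosts = hosts
        # Allocation request for each host, keyed by host name -- no
        # allocation_lists[i][j] positional bookkeeping to remember.
        self.allocations_by_host = allocations_by_host

    def allocation_for(self, host):
        return self.allocations_by_host[host]

selection = InstanceSelection(
    hosts=['compute1', 'compute2'],
    allocations_by_host={
        'compute1': {'resources': {'VCPU': 1, 'MEMORY_MB': 512}},
        'compute2': {'resources': {'VCPU': 1, 'MEMORY_MB': 512}},
    })
print(selection.allocation_for('compute2'))
```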
14:39:53 I’d love to see us make that more comprehensible and composable before adding more
14:39:57 cdent: yeah i wanted to know if we need to do shared first, plus do moves with a migration uuid
14:40:01 cdent: agree
14:40:13 so, dansmith started the migration uuid stuff here https://review.openstack.org/#/c/496933/
14:40:17 at least the data model and object changes
14:40:31 I figure we want a spec there
14:40:36 I just wanted to work on code instead
14:40:37 dansmith: agree
14:40:42 I think that nested is getting pressure from the NFV folks, since it's needed for that kind of scheduling
14:41:02 edleafe: sure,
14:41:06 sorry, pulled away
14:41:10 but as cdent noted we need to clean up some of the pike mess first
14:41:14 no, no changes to nested stuff.
14:41:22 just need to rebase and fix up conflicts
14:41:25 still...
14:41:30 so i think "migrating" the move operations to use migration uuid is job 1
14:42:00 because the move operation stuff was a real bear that came in way too late in pike
14:42:03 I don't disagree, but I think reschedules/alternatives need to be high on the list
14:42:08 and was the cause of most of our bugs in rc
14:42:19 mriedem: and will be the cause of most of the bugs not filed yet :)
14:42:29 migration uuid and alternates can happen concurrently, yeah?
14:42:35 dansmith: yeah reschedules are important to cells v2 adoption too
14:42:38 cdent: yeah
14:42:40 cdent: yeah
14:42:44 cdent: thx for taking on the agg update thing. will review that later.
14:42:52 * cdent bows
14:43:04 the traits stuff in the api is also concurrent while it's just work in the placement api
14:43:11 i.e. alex_xu's changes
14:43:19 that’s the stuff that is likely to impact nested
14:43:31 as the query complexity goes exponential
14:44:09 dansmith: so are you going to write a spec for the migration stuff?
14:44:14 at least some high level spec for the idea
14:44:19 yeah I guess so
14:44:23 that would be tops
14:44:25 I won't enjoy it though, just FYI
14:44:27 i know
14:44:37 i didn't want to ask, just FYI :)
14:44:37 we'll enjoy you not enjoying it
14:44:39 but you brought it up
14:44:41 hah
14:44:42 I'd like to prioritize shared resources over nested actually
14:44:51 jaypipes: agree
14:44:55 agree too
14:45:05 the migration uuid should help that
14:45:08 ok, cool. sorry if I missed that as an earlier agreement.
14:45:36 so i think, migration uuid for move consumer -> shared providers -> nested|traits?
14:45:43 plus alternatives happening concurrently
14:45:57 yes
14:46:12 can't traits also happen concurrently?
14:46:17 #action dansmith to enjoy writing spec for using migration uuid as move operation consumer
14:46:23 edleafe: yeah, I think they can.
14:46:24 #undo
14:46:31 lol
14:46:44 * dansmith knows his undo has no power
14:46:46 i don't think we can actually do meeting stuff, edleafe is the chair
14:46:57 yeah
14:47:32 ok then
14:47:33 #action dansmith to enjoy writing spec for using migration uuid as move operation consumer
14:47:36 hehehe
14:47:39 gah
14:47:54 anything else for specs / reviews
14:47:56 ?
14:47:59 yeah.
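For context on the "migration uuid as move operation consumer" item above, a conceptual sketch of how allocations could be split between the migration record and the instance during a move. The provider names and data shapes are illustrative only, not a real placement payload:

```python
import uuid

instance_uuid = str(uuid.uuid4())
migration_uuid = str(uuid.uuid4())

# Conceptual view of allocations during a move, keyed by consumer UUID:
# the migration record holds the source host's resources while the instance
# holds the destination host's, so neither consumer is doubled up.
allocations_by_consumer = {
    migration_uuid: {
        'source-node-rp': {'resources': {'VCPU': 2, 'MEMORY_MB': 2048}},
    },
    instance_uuid: {
        'dest-node-rp': {'resources': {'VCPU': 2, 'MEMORY_MB': 2048}},
    },
}

# When the move completes, the migration's allocation is dropped; if it is
# reverted, the two can swap -- the bookkeeping this approach aims to simplify.
print(sorted(allocations_by_consumer))
```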
14:48:12 jaypipes: go for it
14:49:07 #link traits in allocation candidates https://review.openstack.org/497713
14:49:13 ^ the spec added
14:49:20 * alex_xu is faster than jaypipes
14:49:42 alex_xu's patch here https://review.openstack.org/#/c/480379/
14:49:44 alex_xu: everyone is faster than jaypipes
14:49:44 lol
14:50:17 heh
14:50:22 yep, was going to bring up that I asked alex_xu to split out the test in that patch in the same manner that gibi did for other bugs
14:50:37 #link ensure RP maps to those RPs that share with it https://review.openstack.org/#/c/480379/
14:50:57 yea, I will do that, probably tomorrow morning
14:51:29 no worries alex_xu
14:51:58 thanks alex_xu
14:52:07 Let's move ahead
14:52:07 #topic Bugs
14:52:07 Placement bugs
14:52:08 https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:52:32 One new bug this week - migration related (surprise!)
14:52:37 yeah so https://bugs.launchpad.net/nova/+bug/1712411 is one that didn't get fixed for rc2
14:52:37 Launchpad bug 1712411 in OpenStack Compute (nova) pike "Allocations may not be removed from dest node during failed migrations" [High,Triaged]
14:53:02 i know of at least one place in conductor where that could be addressed, but the failed migration bug fixes are getting to be whack-a-mole
14:53:18 and i'm on the fence about whether we should have something general like a periodic in the computes to also remove failed stuff
14:53:32 mriedem: are we waiting on migration uuid for those?
14:53:40 to fix them?
14:53:40 no
14:53:47 ah, ok
14:53:58 i opened ^ after fixing the bug for force live migration not creating allocations
14:54:31 the problem is when you specify a host for live migration, the scheduler will allocate but then we do some other pre-checks which could fail, and we don't delete the allocations on the dest host if those fail
14:55:03 mriedem: ew.
14:55:06 we could do that cleanup right at the point of failure in the conductor live migration task, and/or with a periodic in the compute
14:55:16 meaning we’re in a known state within the conductor, yeah, so seems like we should just fix it there
14:55:25 that's the easiest fix
14:55:25 agreed
14:55:30 but like i said, whack-a-mole
14:55:41 it is whack-a-mole, but it is explicit
14:55:46 * edleafe inserts his quarter to play
14:55:54 like this guy https://review.openstack.org/#/c/497606/
14:56:05 having a clean-up job is actually whack-a-mole: randomly stamping on the playing field, hoping the mole shows up somewhere
14:56:27 yes, and the periodic cleanup in the compute is exactly how the overwrite was happening
14:56:32 which we disabled late in pike
14:56:47 i just,
14:56:49 you know,
14:56:51 :(
14:57:03 cleaning up allocations is now like cleaning up volumes and ports,
14:57:07 it's sprinkled everywhere
14:57:27 sure but that says more about how we allocate them in the first place, not about how we clean them up?
14:57:30 but i digress
14:57:43 digression is the finest form of progression
14:58:01 So we have two minutes left for
14:58:03 #topic Open discussion
14:58:14 i think how/where we allocate them is fine, it's in the controller services, which is much better than doing it in the claim in the compute
14:58:15 anything we *haven't* covered yet?
14:58:24 cleanup on failure is just always going to be messy
14:58:40 because we have 100 APIs that can fail randomly anywhere :)
14:59:00 right: how we X in the first place…
14:59:28 https://www.youtube.com/watch?v=fGx6K90TmCI ?
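A rough sketch of the "clean up right at the point of failure" option discussed above for the conductor live migration task. Every name below (the exception, the callables, the placement helper) is a placeholder, not the real conductor or report client code:

```python
class DestCheckFailed(Exception):
    """Placeholder for whatever the destination pre-checks raise."""

def live_migrate(instance_uuid, dest_rp_uuid, placement, pre_checks, do_migrate):
    """Illustrative shape only: by this point the scheduler has already
    written allocations against dest_rp_uuid for this instance.
    """
    try:
        pre_checks()
    except DestCheckFailed:
        # Clean up explicitly where the failure happens instead of relying
        # on a periodic sweep: drop only the dest-host portion of the
        # instance's allocations (hypothetical helper).
        placement.remove_allocations(instance_uuid, dest_rp_uuid)
        raise
    do_migrate()
```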
14:59:38 Before we get to the PTG I wanted to ask: is extracting placement ever going to be possible, or should I just stop worrying about it? If it is possible, I can write something up (a spec? an etherpad).
15:00:08 Let's continue in -nova
15:00:12 #endmeeting