14:00:12 #startmeeting nova_scheduler
14:00:12 Meeting started Mon Oct 23 14:00:12 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 The meeting name has been set to 'nova_scheduler'
14:00:24 Who's here?
14:00:28 o/
14:00:31 o/
14:00:32 \o
14:00:38 \o
14:01:26 o/ for fifteen minutes
14:01:56 jaypipes: Do you have anything you need to discuss? We can do that first
14:02:03 o/
14:02:49 edleafe: just to say I will be pretty useless the next couple days. :( I'm mostly in meetings in Buffalo and doing new-hire things.
14:03:04 ah, ok
14:03:05 edleafe: and learning how to hate^Wuse this fucking Mac.
14:03:38 * bauzas waves super-busy
14:03:44 "Why won't it work like it isn't a Mac????"
14:04:04 Let's get started then
14:04:05 edleafe: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
14:04:05 #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:16 oops, bad copy/paste
14:04:28 jaypipes: did you end up with one with a touchbar? worst mac yet, I reckon.
14:04:34 #undo
14:04:35 Removing item from minutes: #link https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:38 #topic Specs
14:04:39 Only one remaining spec - is this pushed to Rocky?
14:04:39 #link Add spec for symmetric GET and PUT of allocations https://review.openstack.org/#/c/508164/
14:04:52 on ^,
14:04:56 talked about that last week, it's got a this-cycle dependency
14:04:57 cdent asked about that last week,
14:04:59 jinx
14:05:11 yeah, the POST allocations one, right?
14:05:16 yeah
14:05:18 and the migration uuid stuff depends on that
14:05:30 s/depends on/desires/
14:06:04 Is there a blocker?
14:06:06 ok - yeah, but i think that one gets a pass
14:06:36 edleafe: just lack of reviews as far as I can tell
14:07:04 Anything else for Specs?
14:07:21 #topic Reviews
14:07:30 #link Nested RP series starting with: https://review.openstack.org/#/c/415921/
14:07:41 Saw that one merged last week
14:07:55 Looks like this series is moving ahead
14:08:08 jaypipes: any issues for that?
14:08:33 edleafe: nothing other than to say I will not be getting to any new revisions until Wednesday
14:08:43 edleafe: well, mostly the de-orm stuff merged
14:08:46 roger that
14:09:10 next up
14:09:10 #link Add traits to GET /allocation_candidates https://review.openstack.org/479776
14:09:29 alex_xu: is that on hold pending the other traits patches?
14:10:01 no
14:10:02 Sorry, back to the NRP series, jaypipes: do you want me to marshal any minor issues until you're back in play?
14:10:11 to keep the series moving?
14:10:34 efried: I'll let you know later. Need to read any reviews from the latest revisions.
14:10:39 jaypipes: ack
14:11:00 I rewrote all the patches again, trying to refactor AllocationCandidates.get_by_filters
14:11:07 alex_xu: ok
14:11:21 next
14:11:21 #link Allow _set_allocations to delete allocations https://review.openstack.org/#/c/501051/
14:11:44 That had a +2 before rebase. Just needs some more reviews
14:12:33 does ^ change api behavior at all?
14:13:04 not yet, no, that's higher in the stack
14:13:08 mriedem: it adds a new microversion
14:13:21 oops
14:13:25 that's the next patch
14:13:30 ok, because i remember we put a minLength of 1 on PUT allocations
14:13:32 * edleafe is getting ahead of himself
14:13:37 because of some bug at one point
14:13:53 the functionality is for POST, and _not_ for PUT
14:13:54 bug 1673227
14:13:55 bug 1673227 in OpenStack Compute (nova) ocata "placement apis for inventory and allocations use insufficiently robust jsonschema" [Medium,Confirmed] https://launchpad.net/bugs/1673227
14:14:18 ah ok
14:14:21 but changes are present in that stack for GET, PUT, POST, because the representations are all intertwined
14:14:52 so at some point later in the stack a POST schema is made which makes minLength 0
14:15:00 This is the one that adds a microversion:
14:15:00 #link POST /allocations for >1 consumer https://review.openstack.org/#/c/500073/
14:15:31 ^ that's important for migrations, right?
14:15:54 yes,
14:15:57 yes, that's the top of the stack we were just talking about, and is the thing "desired" by migrations
14:16:09 because we move the allocation from the source node provider and the instance to the migration uuid
14:16:28 and put the instance uuid allocation on the dest node provider
14:18:07 I'll re-review it later today
14:18:17 next up
14:18:18 #link Use ksa adapter for placement https://review.openstack.org/#/c/492247/
14:18:39 no controversy there
14:19:00 next up:
14:19:00 #link Alternate hosts: series starting with https://review.openstack.org/#/c/486215/
14:19:14 Thanks to mriedem for updating that late on a Friday
14:19:31 i think that's ready to go,
14:19:35 dan was going to look it over again
14:20:08 i put a comment for discussion on the selections object patch,
14:20:16 about the limits field assuming only integer values,
14:20:27 but then kind of doubted myself in my same comment, so didn't -1
14:20:46 not sure if you wanted to discuss that now or not
14:20:47 Yeah, I'm unclear on how that is (mis)used in the field
14:20:58 well, it's assuming in-tree filters that put stuff in the limits dict
14:21:11 which is NUMATopologyFilter (which you have a separate field for that limit),
14:21:19 and the ram/core/disk filters, which do just set an int value for limits,
14:21:23 my concern was out of tree filters,
14:21:38 however, any out of tree filter putting stuff in limits must also have out of tree code in the claim/resource tracker code to deal with that limit,
14:21:41 jaypipes: option down arrow
14:21:54 and while we support out of tree filters, we don't support extending the resource tracker / claims code
14:21:57 a la ERT
14:22:04 mriedem: out of tree filters are fundamentally broken with the new resource claiming in placement/scheduler
14:22:13 jaypipes: not really
14:22:27 there are still things we don't claim in the scheduler
14:22:28 like numa
14:22:38 mriedem: well, they essentially mean we can never get rid of claiming stuff on the compute node.
14:22:53 and not all out of tree filters would put stuff into the limits dict,
14:22:57 i'm hoping that most don't
14:23:11 limits are useless for out-of-tree filters
14:23:11 mriedem: didn't you say huawei did custom out of tree filters?
14:23:13 and my point is, like i said above, we support out of tree filters, but not extending the RT
14:23:20 because we don't have a custom RT :)
14:23:25 lots of people use out of tree filters
14:23:32 and yes, huawei has some too
14:23:45 i don't think they have any that put stuff in the limits dict though
14:23:51 but it got me thinking about that
14:23:58 again - we don't have an extendable RT,
14:24:01 so i think it's a moot point
14:24:05 like bauzas reiterated
14:24:14 mriedem: all depends on whether you are essentially asking us to make the limits dict-packing thing part of the "API" for custom filters.
14:24:29 mriedem: if you are, then we will forever have to live with claims in the compute node.
14:24:29 i don't think i am now
14:24:32 in general, people use out-of-tree filters for things they don't *consume*
14:24:54 but rather just checking whether the host supports *this* or *that*
14:24:56 i just wanted to make sure i brought it up
14:25:02 bauzas: example pls
14:25:36 jaypipes: just a boolean condition whether HostState has this or that
14:25:51 not something saying "I want to consume X foos"
14:25:57 since we don't have a custom RT
14:26:11 bauzas: gotcha. yes, I'm not concerned about those types of filters. I'm concerned about filters that store stuff in the limits dict
14:26:30 jaypipes: like I said, I never saw that because we don't have modular resource tracking
14:26:40 people use some other way for tracking
14:26:46 like an external agent
14:26:56 and last I read, mriedem was concerned about the caching scheduler using the core filters which store stuff in the limits dict for things like cpu, ram, etc
14:27:14 in that case, they don't care about limits - and tbh, only a very few of them know what those "limits" are
14:27:44 jaypipes: yeah, we're good for the caching scheduler now,
14:27:54 the limits dict in the selection object will handle the basic in-tree filters we have that put stuff in the limits dict
14:27:57 honestly, I just feel we should just wait for some use case if one presents itself
14:28:09 and just consider that we only support in-tree limits
14:28:24 which is, tbh, only the NUMA limits IMHO
14:28:35 because CPU, RAM and disk are now accounted differently
14:28:39 mriedem: ok, I didn't realize we were doing the limits int dict thing in the Selection object. thought we were only doing the numa_limits specialized field. but ok
14:28:41 bauzas: not if you're using the caching scheduler
14:28:50 jaypipes: new as of last week at some point
14:28:54 ah, meh, good point
14:29:02 we can move on now i think
14:29:03 so, 4 limits, that's it
14:29:05 yeah
14:29:07 jaypipes: while you were gone they had me add the non-numa limits back in
14:29:41 OK, moving on.
14:29:41 magic of the children's vacations: I don't have to leave now and can still hassle you guys for a few more mins :p
14:29:41 #link Numbered grouping of traits https://review.openstack.org/#/c/514091/
14:30:01 That's still WIP, so please chime in with comments
14:30:11 * bauzas should give a medal to the grandparents
14:30:54 was talking to alex_xu about this earlier
14:31:17 It's going to completely change the format of the 'filters' param we're currently using in AllocationCandidates.get_by_filters.
14:31:30 So, for microversion purposes, I was going to write a get_by_filters_grouped (next patch in the series)
14:31:44 Welcoming comments on that approach.
14:31:46 That's it.
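
A purely illustrative sketch of the numbered-grouping idea described above: the dict shape and group semantics are assumptions drawn from the discussion (only the get_by_filters_grouped name comes from it), not the interface proposed in the WIP spec. The point is that resources and required traits would be requested per numbered group instead of in one flat filters dict.

    # Hypothetical grouped-filters structure (illustrative only).
    # Assumption: each numbered group must be satisfied together by a single
    # resource provider, while the un-numbered (None) group can be spread
    # across the provider tree.
    grouped_filters = {
        None: {
            "resources": {"VCPU": 2, "MEMORY_MB": 2048},
        },
        1: {
            "resources": {"SRIOV_NET_VF": 1},
            "required_traits": {"CUSTOM_PHYSNET_PUBLIC"},
        },
        2: {
            "resources": {"SRIOV_NET_VF": 1},
            "required_traits": {"CUSTOM_PHYSNET_PRIVATE"},
        },
    }

    # The follow-up patch mentioned above would then be called along the lines of:
    # candidates = AllocationCandidates.get_by_filters_grouped(ctx, grouped_filters)
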
14:31:58 will try to review that today, efried
14:32:07 I don't know if I'll have time today (meetings, meetings, etc) but will dig into that patch ASAP
14:32:33 Any other reviews to discuss?
14:32:53 yeah
14:33:08 https://review.openstack.org/#/c/513526/ is Enable limiting GET /allocation_candidates
14:33:19 it's mostly ready for review, except that efried had one of his usual insights
14:33:41 suggesting a change, in the comments, that should probably be done, but worth getting some additional input
14:34:00 which is: if we are randomizing when the limit is set, we should also randomize when it is not
14:34:10 (see the review for details, just want to draw your attention)
14:34:10 #link Enable limiting GET /allocation_candidates https://review.openstack.org/#/c/513526/
14:34:20 Thanks cdent
14:34:38 #topic Bugs
14:34:40 #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:34:47 A few new ones this week
14:35:19 the ironic CUSTOM_CUSTOM_FOO one surprised me
14:35:43 I thought we had normalized the name, instead of always adding CUSTOM_ to what was sent
14:36:23 i seem to remember bringing this up during impl in the driver
14:37:13 yeah, the assumption was always that you wouldn't use the CUSTOM_ prefix in ironic,
14:37:17 which was probably the wrong assumption
14:37:27 and nova would just blindly slap it on
14:37:43 We didn't want to make operators add CUSTOM_ to their classes, IIRC
14:38:00 we don't have to make them,
14:38:07 but we can check to see if they did anyway
14:38:31 Looks like johnthetubaguy is on it
14:38:35 i could see someone thinking things should be consistent in the 3 places
14:38:37 yeah, reviewing the fix
14:38:48 Anything else for bugs?
14:38:52 yeah, it confused me that they're not the same in the three places
14:39:10 We could talk about splitting RCs across RPs in an aggregate. Or, you know, not.
14:39:40 efried: is there a bug for that?
14:39:46 bug #1724613 and bug #1724633
14:39:48 bug 1724613 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters ignores shared RPs when the RC exists in both places" [Undecided,New] https://launchpad.net/bugs/1724613
14:39:49 bug 1724633 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters hits incorrectly when traits are split across the main RP and aggregates" [Undecided,New] https://launchpad.net/bugs/1724633
14:40:44 Well, discuss away!
14:40:55 If someone can tell me that there's no actual code anywhere in existence today that makes *use* of the implementation for placement aggregates, I'll be happy with deferring the discussion.
14:41:53 But the existing code in AllocationCandidates.get_by_filters is subject to that first bug right now; and will be subject to the second once alex_xu's traits series is done.
14:42:14 dansmith probably remembers the CUSTOM_CUSTOM_FOO discussion
14:42:23 * cdent contacts his omniscience as a service service
14:42:24 yes
14:42:50 TL;DR: if my compute has an RC that is also in an aggregate, things go awry. E.g. my compute has local disk and is also associated with a shared storage pool.
14:43:23 efried: I think the thing you can be confident of is that spawning on a compute node does not use aggregates yet
14:43:36 Good. Does anything else?
14:43:40 there may be unknown uses of aggregates out there, but not any that involve booting an instance
14:43:54 cdent is correct 100%
14:44:15 It's a matter of getting allocation candidates, not just booting instances. But that makes me feel a lot better.
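
For reference, a sketch of the provider topology behind bug 1724613; the names, UUIDs, and amounts below are made up. A compute node has local DISK_GB inventory and is also associated, through an aggregate, with a shared storage provider offering DISK_GB (MISC_SHARES_VIA_AGGREGATE is the standard trait that marks the sharing provider).

    # Illustrative only -- names, UUIDs, and amounts are invented.
    AGG_A = "11111111-1111-1111-1111-111111111111"

    compute_node_rp = {
        "name": "cn1",
        "inventories": {"VCPU": 8, "MEMORY_MB": 16384, "DISK_GB": 100},
        "aggregates": [AGG_A],
        "traits": [],
    }

    shared_storage_rp = {
        "name": "ss1",
        "inventories": {"DISK_GB": 2000},
        "aggregates": [AGG_A],
        # Standard trait marking a provider that shares its inventory with
        # other members of its aggregates.
        "traits": ["MISC_SHARES_VIA_AGGREGATE"],
    }

    # For a request of {"VCPU": 1, "MEMORY_MB": 512, "DISK_GB": 10}, the expected
    # allocation candidates are (a) everything from cn1, and (b) VCPU and MEMORY_MB
    # from cn1 with DISK_GB from ss1. Per the bug report, candidate (b) is dropped
    # when the resource class exists in both places.
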
14:44:41 So I'm happy to leave those bugs out there and defer the discussion. They have test case code proposed that demonstrates them, which we can merge any time too.
14:44:42 efried: yes, that's why I qualified myself. GET /allocation_candidates has the bug.
14:45:08 https://review.openstack.org/513149
14:45:12 whether it will be exercised is a different question
14:45:12 * alex_xu just found his refactor doesn't fix that bug either
14:45:37 alex_xu: Thanks for testing that. /me not surprised :)
14:46:33 We can move on
14:46:36 not surprised about the allocation bug, that is the right thing :)
14:46:50 #topic Open discussion
14:47:11 One item on the agenda:
14:47:13 #link Should we forbid removing traits from an RP with allocations? http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-20.log.html#t2017-10-20T21:34:44
14:47:47 johnthetubaguy ^
14:48:13 So my vote is no. This goes along with what we talked about in Denver wrt inventories being modified downward to the point where they're less than what's already been allocated.
14:48:22 It's allowed, but you won't get any more allocations until you're back out of the red.
14:49:06 efried: how does changing an RP's trait affect that?
14:49:07 Same thing with the traits. You can remove a trait from an RP that's already part of an allocation that may have required that trait, but you won't get scheduled to that RP hereafter if you require that trait (unless it gets re-added)
14:49:09 efried: I'd like to hear from operators on that question.
14:49:33 edleafe: Not a direct relationship, just a similar example.
14:49:37 efried: it's a similar conundrum to the instance group member modifications/delete
14:49:43 comparing the resource class changes and traits changes in ironic pushes me down that line
14:49:45 yeah, I think "no" is the right thing as well. From the change on, no new placements will happen and we don't want to remove existing ones, so...
14:50:24 efried: since changing the member set of an instance group affects whether a new or existing instance still is constrained by the placement policy in use at the time of initial scheduling
14:50:53 jaypipes: Does the instance grouping affect anything after the deploy, though?
14:50:58 In the case of traits, that answer is no.
14:51:07 It's a one-and-done kind of thing.
14:51:15 the next live-migration could be odd, no possible hosts as the trait has gone?
14:51:17 fwiw, k8s's attitude towards this problem is "oh well, shit changed. sorry, go ahead and reschedule if you want"
14:52:07 efried: if I add or remove a member from the affinity group, we do a check on the compute node to ensure that the constraints around affinity are kept for existing members of the group, yes.
14:52:09 johnthetubaguy: How about this as a more concrete example: if you want to shuffle around which nodes are in CUSTOM_GENERAL_USE vs. CUSTOM_PROJECT_B. Should you have to clear out existing deployments before you can do that, or is it okay if they just take effect "the next time"?
14:53:14 i've heard people talk about using traits from external systems to dynamically flag if something is available, or not if something failed,
14:53:32 like something in the hardware failed and that component is no longer available, remove the trait so things that require that trait don't get scheduled there
14:53:43 mriedem: pls no... traits are not for state/status information
14:53:54 That's still a capability statement.
14:54:01 no it's not.
14:54:35 jaypipes: traits are not for that, but it works
14:54:38 traits are going to be used for dynamic scheduling
14:54:50 whether we intended them to be or not
14:54:50 efried: that is a different thing. whether or not I am capable of chewing gum and patting my head simultaneously is not the same thing as whether or not I am chewing gum and patting my head currently.
14:55:26 jaypipes: Totally agree - been discussing the same on the ML.
14:55:32 mriedem: if that's the case, we might as well give up and just reinstate the ERT, undo all of the placement API and just use kubernetes.
14:55:46 jaypipes: What we're talking about here is, if I cut off both your hands, you're no longer capable of doing that, so that trait should be removed from you.
14:56:34 efried: sure, but what mriedem is talking about is some external agent that is constantly cutting off my hands and reattaching them when, say, a NIC link state goes up or down.
14:56:48 efried: I want to be able to fix things right away, myself
14:57:08 johnthetubaguy: what do you mean by "fix things right away"?
14:57:34 I don't want to have to wait for a node to be free before I define how/who/what it gets used for next time
14:57:49 jaypipes: Which seems reasonable to me, tbh. I don't know about link state specifically, how fast that can flip on and off, but if someone trips over a cable, I don't want to deploy new instances there until they plug it back in.
14:58:03 Two minutes left
14:58:09 johnthetubaguy: I don't see how that is relevant here? if we only use traits for capabilities (and not for state information) then you don't have that problem.
14:58:25 efried: ugh.
14:58:36 efried: that is NOT a capability. that is link state/status.
14:58:59 efried: see the nic-state-aware-scheduling spec/blueprint for a lengthy discussion of where this leads...
14:59:04 whatever we want people to do with these things, we have to be aware of what people _will_ do
14:59:09 jaypipes: so it's the only way we have to do non-capability things right now, so that's what is getting used
14:59:28 Sounds like this can be summed up as: can the capabilities of a "thing" change?
15:00:07 OK, thanks everyone!
15:00:07 Continue this in -nova
15:00:08 #endmeeting
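
As a postscript to the CUSTOM_GENERAL_USE / CUSTOM_PROJECT_B retagging example from the open discussion: a minimal sketch of how an operator could swap those traits on a provider through the placement traits API (available since microversion 1.6). The endpoint URL, token handling, and provider UUID are placeholders; consistent with the position taken above, replacing the trait list leaves existing allocations untouched and only changes which providers future GET /allocation_candidates requests that require the removed trait will return.

    # Minimal sketch, assuming a reachable placement endpoint and a valid token.
    import requests

    PLACEMENT = "http://placement.example.com/placement"   # assumed endpoint
    HEADERS = {
        "X-Auth-Token": "<token>",                          # assumed auth
        "OpenStack-API-Version": "placement 1.6",           # traits API microversion
    }
    rp_uuid = "00000000-0000-0000-0000-000000000000"        # hypothetical provider

    # Fetch the current trait list; the provider generation is needed for the PUT.
    resp = requests.get(
        f"{PLACEMENT}/resource_providers/{rp_uuid}/traits", headers=HEADERS)
    body = resp.json()

    traits = set(body["traits"])
    traits.discard("CUSTOM_PROJECT_B")
    traits.add("CUSTOM_GENERAL_USE")

    # PUT replaces the provider's entire trait list.
    requests.put(
        f"{PLACEMENT}/resource_providers/{rp_uuid}/traits",
        headers=HEADERS,
        json={
            "traits": sorted(traits),
            "resource_provider_generation": body["resource_provider_generation"],
        },
    )
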