14:00:12 <edleafe> #startmeeting nova_scheduler
14:00:12 <openstack> Meeting started Mon Oct 23 14:00:12 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <edleafe> #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:24 <edleafe> Who's here?
14:00:28 <alex_xu> o/
14:00:31 <mriedem> o/
14:00:32 <takashin> \o
14:00:38 <efried> \o
14:01:26 <jaypipes> o/ for fifteen minutes
14:01:56 <edleafe> jaypipes: Do you have anything you need to discuss? We can do that first
14:02:03 <cdent_> o/
14:02:49 <jaypipes> edleafe: just to say I will be pretty useless the next couple of days. :( I'm mostly in meetings in Buffalo and doing new-hire things.
14:03:04 <edleafe> ah, ok
14:03:05 <jaypipes> edleafe: and learning how to hate^Wuse this fucking Mac.
14:03:38 * bauzas waves super-busy
14:03:44 <edleafe> "Why won't it work like it isn't a Mac????"
14:04:04 <edleafe> Let's get started then
14:04:05 <openstack> edleafe: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
14:04:05 <edleafe> #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:16 <edleafe> oops, bad copy/paste
14:04:28 <cdent_> jaypipes: did you end up with one with a touchbar? worst mac yet, I reckon.
14:04:34 <edleafe> #undo
14:04:35 <openstack> Removing item from minutes: #link https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:38 <edleafe> #topic Specs
14:04:39 <edleafe> Only one remaining spec - is this pushed to Rocky?
14:04:39 <edleafe> #link Add spec for symmetric GET and PUT of allocations https://review.openstack.org/#/c/508164/
14:04:52 <mriedem> on ^,
14:04:56 <cdent_> talked about that last week, it’s got a this-cycle dependency
14:04:57 <mriedem> cdent asked about that last week,
14:04:59 <cdent_> jinx
14:05:11 <mriedem> yeah, the POST allocations one, right?
14:05:16 <cdent> yeah
14:05:18 <mriedem> and the migration uuid stuff depends on that
14:05:30 <cdent> s/depends on/desires/
14:06:04 <edleafe> Is there a blocker?
14:06:06 <mriedem> ok - yeah, but i think that one gets a pass
14:06:36 <cdent> edleafe: just lack of reviews as far as I can tell
14:07:04 <edleafe> Anything else for Specs?
14:07:21 <edleafe> #topic Reviews
14:07:30 <edleafe> #link Nested RP series starting with: https://review.openstack.org/#/c/415921/
14:07:41 <edleafe> Saw that one merged last week
14:07:55 <edleafe> Looks like this series is moving ahead
14:08:08 <edleafe> jaypipes: any issues for that?
14:08:33 <jaypipes> edleafe: nothing other than to say I will not be getting to any new revisions until Wednesday
14:08:43 <bauzas> edleafe: well, mostly the de-orm stuff merged
14:08:46 <edleafe> roger that
14:09:10 <edleafe> next up
14:09:10 <edleafe> #link Add traits to GET /allocation_candidates https://review.openstack.org/479776
14:09:29 <edleafe> alex_xu: is that on hold pending the other traits patches?
14:10:01 <alex_xu> no
14:10:02 <efried> Sorry, back to the NRP series: jaypipes, do you want me to marshal any minor issues until you're back in play?
14:10:11 <efried> to keep the series moving?
14:10:34 <jaypipes> efried: I'll let you know later. Need to read any reviews from the latest revisions.
14:10:39 <efried> jaypipes ack
14:11:00 <alex_xu> I rewrote all the patches again, to try to refactor AllocationCandidates.get_by_filters
14:11:07 <edleafe> alex_xu: ok
14:11:21 <edleafe> next
14:11:21 <edleafe> #link Allow _set_allocations to delete allocations https://review.openstack.org/#/c/501051/
14:11:44 <edleafe> That had a +2 before rebase. Just needs some more reviews
14:12:33 <mriedem> does ^ change api behavior at all?
14:13:04 <cdent> not yet, no, that’s higher in the stack
14:13:08 <edleafe> mriedem: it adds a new microversion
14:13:21 <edleafe> oops
14:13:25 <edleafe> that's the next patch
14:13:30 <mriedem> ok, because i remember we put a minLength of 1 on PUT allocations
14:13:32 * edleafe is getting ahead of himself
14:13:37 <mriedem> because of some bug at one point
14:13:53 <cdent> the functionality is for POST, and _not_ for PUT
14:13:54 <mriedem> bug 1673227
14:13:55 <openstack> bug 1673227 in OpenStack Compute (nova) ocata "placement apis for inventory and allocations use insufficiently robust jsonschema" [Medium,Confirmed] https://launchpad.net/bugs/1673227
14:14:18 <mriedem> ah ok
14:14:21 <cdent> but changes are present in that stack for GET, PUT, POST, because the representations are all intertwined
14:14:52 <cdent> so at some point later in the stack a POST schema is added which makes minLength 0
14:15:00 <edleafe> This is the one that adds a microversion:
14:15:00 <edleafe> #link POST /allocations for >1 consumer https://review.openstack.org/#/c/500073/
14:15:31 <edleafe> ^ that's important for migrations, right?
14:15:54 <mriedem> yes,
14:15:57 <cdent> yes, that’s the top of the stack we were just talking about, and is the thing “desired” by migrations
14:16:09 <mriedem> because we move the allocation against the source node provider from the instance to the migration uuid
14:16:28 <mriedem> and put the instance uuid allocation on the dest node provider
14:18:07 <edleafe> I'll re-review it later today
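
For context, a rough sketch of the kind of multi-consumer request body the proposed POST /allocations change is aiming at for the migration flow mriedem describes above: the migration uuid holds the source-node allocation while the instance uuid holds the dest-node allocation, written in one request. The payload shape is still under review in https://review.openstack.org/#/c/500073/ and may change; all identifiers and amounts below are made up.

    import uuid

    # Made-up identifiers, purely for illustration.
    migration_uuid = str(uuid.uuid4())
    instance_uuid = str(uuid.uuid4())
    source_rp_uuid = str(uuid.uuid4())  # source compute node resource provider
    dest_rp_uuid = str(uuid.uuid4())    # destination compute node resource provider

    # Sketch of a multi-consumer POST /allocations body (format subject to
    # change while the review is open): the migration record consumes the
    # source node, the instance consumes the destination node.
    post_allocations_body = {
        migration_uuid: {
            "project_id": "fake-project",
            "user_id": "fake-user",
            "allocations": {
                source_rp_uuid: {"resources": {"VCPU": 2, "MEMORY_MB": 2048, "DISK_GB": 20}},
            },
        },
        instance_uuid: {
            "project_id": "fake-project",
            "user_id": "fake-user",
            "allocations": {
                dest_rp_uuid: {"resources": {"VCPU": 2, "MEMORY_MB": 2048, "DISK_GB": 20}},
            },
        },
    }
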
14:18:17 <edleafe> next up
14:18:18 <edleafe> #link Use ksa adapter for placement https://review.openstack.org/#/c/492247/
14:18:39 <edleafe> no controversy there
14:19:00 <edleafe> next up:
14:19:00 <edleafe> #link Alternate hosts: series starting with https://review.openstack.org/#/c/486215/
14:19:14 <edleafe> Thanks to mriedem for updating that late on a Friday
14:19:31 <mriedem> i think that's ready to go,
14:19:35 <mriedem> dan was going to look it over again
14:20:08 <mriedem> i put a comment for discussion on the selections object patch,
14:20:16 <mriedem> about the limits field assuming only integer values,
14:20:27 <mriedem> but then kind of doubted myself in my same comment, so didn't -1
14:20:46 <mriedem> not sure if you wanted to discuss that now or not
14:20:47 <edleafe> Yeah, I'm unclear on how that is (mis)used in the field
14:20:58 <mriedem> well, it's assuming in-tree filters that put stuff in the limits dict
14:21:11 <mriedem> which is the NUMATopologyFilter (for which you have a separate field),
14:21:19 <mriedem> and the ram/core/disk filters, which do just set an int value for limits,
14:21:23 <mriedem> my concern was out-of-tree filters,
14:21:38 <mriedem> however, any out-of-tree filter putting stuff in limits must also have out-of-tree code in the claim/resource tracker code to deal with that limit,
14:21:41 <cdent> jaypipes: option down arrow
14:21:54 <mriedem> and while we support out-of-tree filters, we don't support extending the resource tracker / claims code
14:21:57 <mriedem> a la ERT
14:22:04 <jaypipes> mriedem: out-of-tree filters are fundamentally broken with the new resource claiming in placement/scheduler
14:22:13 <mriedem> jaypipes: not really
14:22:27 <mriedem> there are still things we don't claim in the scheduler
14:22:28 <mriedem> like numa
14:22:38 <jaypipes> mriedem: well, they essentially mean we can never get rid of claiming stuff on the compute node.
14:22:53 <mriedem> and not all out-of-tree filters would put stuff into the limits dict,
14:22:57 <mriedem> i'm hoping that most don't
14:23:11 <bauzas> limits are useless for out-of-tree filters
14:23:11 <jaypipes> mriedem: didn't you say huawei did custom out-of-tree filters?
14:23:13 <mriedem> and my point is, like i said above, we support out-of-tree filters, but not extending the RT
14:23:20 <bauzas> because we don't have a custom RT :)
14:23:25 <mriedem> lots of people use out-of-tree filters
14:23:32 <mriedem> and yes, huawei has some too
14:23:45 <mriedem> i don't think they have any that put stuff in the limits dict though
14:23:51 <mriedem> but it got me thinking about that
14:23:58 <mriedem> again - we don't have an extendable RT,
14:24:01 <mriedem> so i think it's a moot point
14:24:05 <mriedem> like bauzas reiterated
14:24:14 <jaypipes> mriedem: all depends on whether you are essentially asking us to make the limits dict-packing thing part of the "API" for custom filters.
14:24:29 <jaypipes> mriedem: if you are, then we will forever have to live with claims in the compute node.
14:24:29 <mriedem> i don't think i am now
14:24:32 <bauzas> in general, people use out-of-tree filters for things they don't *consume*
14:24:54 <bauzas> but rather just to check whether the host supports *this* or *that*
14:24:56 <mriedem> i just wanted to make sure i brought it up
14:25:02 <jaypipes> bauzas: example pls
14:25:36 <bauzas> jaypipes: just a boolean condition on whether HostState has this or that
14:25:51 <bauzas> not something saying "I want to consume X foos"
14:25:57 <bauzas> since we don't have a custom RT
14:26:11 <jaypipes> bauzas: gotcha. yes, I'm not concerned about those types of filters. I'm concerned about filters that store stuff in the limits dict
14:26:30 <bauzas> jaypipes: like I said, I never saw that, because we don't have modular resource tracking
14:26:40 <bauzas> people use some other way for tracking
14:26:46 <bauzas> like an external agent
14:26:56 <jaypipes> and last I read, mriedem was concerned about the caching scheduler using the core filters which store stuff in the limits dict for things like cpu, ram, etc
14:27:14 <bauzas> in that case, they don't care about limits - and tbh, only a very few of them know what those "limits" are
14:27:44 <mriedem> jaypipes: yeah, we're good for the caching scheduler now,
14:27:54 <mriedem> the limits dict in the selection object will handle the basic in-tree filters we have that put stuff in the limits dict
14:27:57 <bauzas> honestly, I just feel we should wait for a use case, if one presents itself
14:28:09 <bauzas> and just consider that we only support in-tree limits
14:28:24 <bauzas> which is, tbh, only the NUMA limits IMHO
14:28:35 <bauzas> because CPU, RAM and disk are now accounted differently
14:28:39 <jaypipes> mriedem: ok, I didn't realize we were doing the limits int dict thing in the Selection object. thought we were only doing the numa_limits specialized field. but ok
14:28:41 <mriedem> bauzas: not if you're using the caching scheduler
14:28:50 <mriedem> jaypipes: new as of last week at some point
14:28:54 <bauzas> ah meh, good point
14:29:02 <mriedem> we can move on now i think
14:29:03 <bauzas> so, 4 limits, that's it
14:29:05 <bauzas> yeah
14:29:07 <edleafe> jaypipes: while you were gone they had me add the non-numa limits back in
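
To make the above concrete, here is roughly the shape of the limits dict the in-tree filters leave on the HostState, which is what the Selection object's limits field is meant to carry; the values are made up, and the NUMA entry is an object rather than a number, which is why it gets its own specialized field on the Selection object.

    # Roughly the shape of the limits dict populated by the in-tree core,
    # ram, and disk filters (values made up for illustration).
    limits = {
        "vcpu": 256,         # set by the CoreFilter
        "memory_mb": 49152,  # set by the RamFilter
        "disk_gb": 1000,     # set by the DiskFilter
        # The NUMATopologyFilter stores a NUMATopologyLimits object here,
        # not a number, hence the separate specialized field on Selection:
        # "numa_topology": <NUMATopologyLimits>,
    }
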
14:29:41 <edleafe> OK, moving on.
14:29:41 <bauzas> magic of the children's vacations, I don't have to leave now and can still hassle you guys for a few more mins :p
14:29:41 <edleafe> #link Numbered grouping of traits https://review.openstack.org/#/c/514091/
14:30:01 <edleafe> That's still WIP, so please chime in with comments
14:30:11 * bauzas should give a medal to the grandparents
14:30:54 <efried> was talking to alex_xu about this earlier
14:31:17 <efried> It's going to completely change the format of the 'filters' param we're currently using in AllocationCandidates.get_by_filters.
14:31:30 <efried> So, for microversion purposes, I was going to write a get_by_filters_grouped (next patch in the series)
14:31:44 <efried> Welcoming comments on that approach.
14:31:46 <efried> That's it.
14:31:58 <jaypipes> will try to review that today, efried
14:32:07 <edleafe> I don't know if I'll have time today (meetings, meetings, etc.) but will dig into that patch ASAP
14:32:33 <edleafe> Any other reviews to discuss?
14:32:53 <cdent> yeah
14:33:08 <cdent> https://review.openstack.org/#/c/513526/ is Enable limiting GET /allocation_candidates
14:33:19 <cdent> it’s mostly ready for review, except that efried had one of his usual insights
14:33:41 <cdent> suggesting a change, in the comments, that should probably be done, but worth getting some additional input
14:34:00 <cdent> which is: if we are randomizing when limit is set, we should also randomize when not
14:34:10 <cdent> (see the review for details, just want to draw your attention)
14:34:10 <edleafe> #link Enable limiting GET /allocation_candidates https://review.openstack.org/#/c/513526/
14:34:20 <edleafe> Thanks cdent
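
For context, a minimal sketch of the behavior efried is suggesting in that review: shuffle the candidate list whether or not a limit is requested, so that a limited result is just a truncation of the same randomized ordering. This is illustrative only, not the code under review, and the function name is made up.

    import random

    def limit_candidates(candidates, limit=None):
        # Shuffle regardless of whether a limit was requested, so limited
        # and unlimited requests behave consistently (sketch only).
        shuffled = list(candidates)
        random.shuffle(shuffled)
        if limit is not None:
            return shuffled[:limit]
        return shuffled
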
14:34:38 <edleafe> #topic Bugs
14:34:40 <edleafe> #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:34:47 <edleafe> A few new ones this week
14:35:19 <edleafe> the ironic CUSTOM_CUSTOM_FOO one surprised me
14:35:43 <edleafe> I thought we had normalized the name, instead of always adding CUSTOM_ to what was sent
14:36:23 <mriedem> i seem to remember bringing this up during impl in the driver
14:37:13 <mriedem> yeah, the assumption was always that you wouldn't use the CUSTOM_ prefix in ironic,
14:37:17 <mriedem> which was probably the wrong assumption
14:37:27 <mriedem> and nova would just blindly slap it on
14:37:43 <edleafe> We didn't want to make operators add CUSTOM_ to their classes, IIRC
14:38:00 <mriedem> we don't have to make them,
14:38:07 <mriedem> but we can check to see if they did anyway
14:38:31 <edleafe> Looks like johnthetubaguy is on it
14:38:35 <mriedem> i could see someone thinking things should be consistent in the 3 places
14:38:37 <mriedem> yeah, reviewing the fix
14:38:48 <edleafe> Anything else for bugs?
14:38:52 <johnthetubaguy> yeah, it confused me that they weren't the same in the three places
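
For context, a hedged sketch of the kind of normalization being discussed (not necessarily the fix under review): only prepend CUSTOM_ to the ironic node's resource_class if the operator didn't already include it, instead of blindly slapping it on. The function name is made up.

    import re

    def normalize_resource_class(name):
        # Upper-case and sanitize the ironic node's resource_class, then
        # add the CUSTOM_ prefix only if it isn't already there (sketch).
        norm = re.sub(r'[^A-Z0-9_]', '_', name.upper())
        if not norm.startswith('CUSTOM_'):
            norm = 'CUSTOM_' + norm
        return norm

    # Both 'baremetal.gold' and 'CUSTOM_BAREMETAL_GOLD' normalize to
    # 'CUSTOM_BAREMETAL_GOLD' rather than 'CUSTOM_CUSTOM_BAREMETAL_GOLD'.
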
14:39:10 <efried> We could talk about splitting RCs across RPs in an aggregate. Or, you know, not.
14:39:40 <edleafe> efried: is there a bug for that?
14:39:46 <efried> bug #1724613 and bug #1724633
14:39:48 <openstack> bug 1724613 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters ignores shared RPs when the RC exists in both places" [Undecided,New] https://launchpad.net/bugs/1724613
14:39:49 <openstack> bug 1724633 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters hits incorrectly when traits are split across the main RP and aggregates" [Undecided,New] https://launchpad.net/bugs/1724633
14:40:44 <edleafe> Well, discuss away!
14:40:55 <efried> If someone can tell me that there's no actual code anywhere in existence today that makes *use* of the implementation for placement aggregates, I'll be happy with deferring the discussion.
14:41:53 <efried> But the existing code in AllocationCandidates.get_by_filters is subject to that first bug right now, and will be subject to the second once alex_xu's traits series is done.
14:42:14 <jaypipes> dansmith probably remembers the CUSTOM_CUSTOM_FOO discussion
14:42:23 * cdent contacts his omniscience-as-a-service service
14:42:24 <dansmith> yes
14:42:50 <efried> TL;DR: if my compute has an RC that is also in an aggregate, things go awry. E.g. my compute has local disk and is also associated with a shared storage pool.
14:43:23 <cdent> efried: I think the thing you can be confident of is that spawning a compute node does not use aggregates yet
14:43:36 <efried> Good. Does anything else?
14:43:40 <cdent> there may be unknown uses of aggregates out there, but not any that involve booting an instance
14:43:54 <jaypipes> cdent is correct 100%
14:44:15 <efried> It's a matter of getting allocation candidates, not just booting instances. But that makes me feel a lot better.
14:44:41 <efried> So I'm happy to leave those bugs out there and defer the discussion. They have test case code proposed that demonstrates them, which we can merge any time too.
14:44:42 <cdent> efried: yes, that’s why I qualified myself. GET /allocation_candidates has the bug.
14:45:08 <efried> https://review.openstack.org/513149
14:45:12 <cdent> whether it will be exercised is a different question
14:45:12 * alex_xu just found his refactor doesn't fix that bug either
14:45:37 <efried> alex_xu: Thanks for testing that. /me not surprised :)
14:46:33 <efried> We can move on
14:46:36 <alex_xu> not being surprised about the allocation bug is the right thing :)
14:46:50 <edleafe> #topic Open discussion
14:47:11 <edleafe> One item on the agenda:
14:47:13 <edleafe> #link Should we forbid removing traits from an RP with allocations? http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-20.log.html#t2017-10-20T21:34:44
14:47:47 <efried> johnthetubaguy ^
14:48:13 <efried> So my vote is no. This goes along with what we talked about in Denver wrt inventories being modified downward to the point where they're less than what's already been allocated.
14:48:22 <efried> It's allowed, but you won't get any more allocations until you're back out of the red.
14:49:06 <edleafe> efried: how does changing an RP's trait affect that?
14:49:07 <efried> Same thing with the traits. You can remove a trait from an RP that's already part of an allocation that may have required that trait, but you won't get scheduled to that RP hereafter if you require that trait (unless it gets re-added)
14:49:09 <jaypipes> efried: I'd like to hear from operators on that question.
14:49:33 <efried> edleafe: Not a direct relationship, just a similar example.
14:49:37 <jaypipes> efried: it's a similar conundrum to the instance group member modifications/deletes
14:49:43 <johnthetubaguy> comparing the resource class changes and traits changes in ironic pushes me down that line
14:49:45 <cdent> yeah, I think “no” is the right thing as well. From the change on, no new placements will happen, and we don't want to remove existing ones, so...
14:50:24 <jaypipes> efried: since changing the member set of an instance group affects whether a new or existing instance is still constrained by the placement policy in use at the time of initial scheduling
14:50:53 <efried> jaypipes: Does the instance grouping affect anything after the deploy, though?
14:50:58 <efried> In the case of traits, that answer is no.
14:51:07 <efried> It's a one-and-done kind of thing.
14:51:15 <johnthetubaguy> the next live-migration could be odd, no possible hosts as the trait has gone?
14:51:17 <jaypipes> fwiw, the k8s attitude towards this problem is "oh well, shit changed. sorry, go ahead and reschedule if you want"
14:52:07 <jaypipes> efried: if I add or remove a member from the affinity group, we do a check on the compute node to ensure that the constraints around affinity are kept for existing members of the group, yes.
14:52:09 <efried> johnthetubaguy: How about this as a more concrete example: if you want to shuffle around which nodes are in CUSTOM_GENERAL_USE vs. CUSTOM_PROJECT_B, should you have to clear out existing deployments before you can do that, or is it okay if it just takes effect "the next time"?
14:53:14 <mriedem> i've heard people talk about using traits from external systems to dynamically flag if something is available, or not if something failed,
14:53:32 <mriedem> like something in the hardware failed and that component is no longer available, so remove the trait so things that require that trait don't get scheduled there
14:53:43 <jaypipes> mriedem: pls no... traits are not for state/status information
14:53:54 <efried> That's still a capability statement.
14:54:01 <jaypipes> no it's not.
14:54:35 <alex_xu> jaypipes: traits are not for that, but it works
14:54:38 <mriedem> traits are going to be used for dynamic scheduling
14:54:50 <mriedem> whether we intended them to be or not
14:54:50 <jaypipes> efried: that is a different thing. whether or not I am capable of chewing gum and patting my head simultaneously is not the same thing as whether or not I am chewing gum and patting my head currently.
14:55:26 <efried> jaypipes: Totally agree - been discussing the same on the ML.
14:55:32 <jaypipes> mriedem: if that's the case, we might as well give up, reinstate the ERT, undo all of the placement API, and just use kubernetes.
14:55:46 <efried> jaypipes: What we're talking about here is, if I cut off both your hands, you're no longer capable of doing that, so that trait should be removed from you.
14:56:34 <jaypipes> efried: sure, but what mriedem is talking about is some external agent that is constantly cutting off my hands and reattaching them when, say, a NIC link goes up or down.
14:56:48 <johnthetubaguy> efried: I want to be able to fix things right away, myself
14:57:08 <jaypipes> johnthetubaguy: what do you mean by "fix things right away"?
14:57:34 <johnthetubaguy> I don't want to have to wait for a node to be free before I define how/who/what it gets used for next time
14:57:49 <efried> jaypipes: Which seems reasonable to me, tbh. I don't know about link state specifically, how fast that can flip on and off, but if someone trips over a cable, I don't want to deploy new instances there until they plug it back in.
14:58:03 <edleafe> Two minutes left
14:58:09 <jaypipes> johnthetubaguy: I don't see how that is relevant here? if we only use traits for capabilities (and not for state information) then you don't have that problem.
14:58:25 <jaypipes> efried: ugh.
14:58:36 <jaypipes> efried: that is NOT a capability. that is link state/status.
14:58:59 <jaypipes> efried: see the nic-state-aware-scheduling spec/blueprint for a lengthy discussion of where this leads...
14:59:04 <cdent> whatever we want people to do with these things, we have to be aware of what people _will_ do
14:59:09 <johnthetubaguy> jaypipes: so it's the only way we have to do non-capability things right now, so that's what is getting used
14:59:28 <edleafe> Sounds like this can be summed up as: can the capabilities of a "thing" change?
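
For context, the mechanics behind efried's "allowed, but it only affects future scheduling" position, expressed against the existing resource provider traits API (the generation and trait names below are made up): replacing an RP's trait set does not check or touch existing allocations, it only changes which future requests the provider can match.

    # PUT /resource_providers/{rp_uuid}/traits
    # Replacing the provider's trait set; nothing about current allocations
    # is checked or changed (identifiers and generation are illustrative).
    put_traits_body = {
        "resource_provider_generation": 7,
        "traits": [
            "HW_CPU_X86_AVX2",
            # "CUSTOM_PROJECT_B" dropped: instances already allocated here
            # stay put, but this RP no longer matches future requests that
            # require that trait.
        ],
    }
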
15:00:07 <edleafe> OK, thanks everyone!
15:00:07 <edleafe> Continue this in -nova
15:00:08 <edleafe> #endmeeting