14:00:12 <edleafe> #startmeeting nova_scheduler
14:00:12 <openstack> Meeting started Mon Oct 23 14:00:12 2017 UTC and is due to finish in 60 minutes.  The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:12 <edleafe> #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:24 <edleafe> Who's here?
14:00:28 <alex_xu> o/
14:00:31 <mriedem> o/
14:00:32 <takashin> \o
14:00:38 <efried> \o
14:01:26 <jaypipes> o/ for fifteen minutes
14:01:56 <edleafe> jaypipes: Do you have anything you need to discuss? We can do that first
14:02:03 <cdent_> o/
14:02:49 <jaypipes> edleafe: just to say I will be pretty useless the next couple days. :( I'm mostly in meetings in Buffalo and doing new-hire things.
14:03:04 <edleafe> ah, ok
14:03:05 <jaypipes> edleafe: and learning how to hate^Wuse this fucking Mac.
14:03:38 * bauzas waves super-busy
14:03:44 <edleafe> "Why won't it work like it isn't a Mac????"
14:04:04 <edleafe> Let's get started then
14:04:05 <openstack> edleafe: Error: Can't start another meeting, one is in progress.  Use #endmeeting first.
14:04:05 <edleafe> #link Agenda for this meeting https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:16 <edleafe> oops bad copy/paste
14:04:28 <cdent_> jaypipes: did you end up with one with a touchbar? worst mac yet, I reckon.
14:04:34 <edleafe> #undo
14:04:35 <openstack> Removing item from minutes: #link https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:04:38 <edleafe> #topic Specs
14:04:39 <edleafe> Only one remaining spec - is this pushed to Rocky?
14:04:39 <edleafe> #link Add spec for symmetric GET and PUT of allocations https://review.openstack.org/#/c/508164/
14:04:52 <mriedem> on ^,
14:04:56 <cdent_> talked about that last week, it’s got a this-cycle dependency
14:04:57 <mriedem> cdent asked about that last week,
14:04:59 <cdent_> jinx
14:05:11 <mriedem> yeah the POST allocations one right?
14:05:16 <cdent> yeah
14:05:18 <mriedem> and the migration uuid stuff depends on that
14:05:30 <cdent> s/depends on/desires/
14:06:04 <edleafe> Is there a blocker?
14:06:06 <mriedem> ok - yeah, but i think that one gets a pass
14:06:36 <cdent> edleafe: just lack of reviews as far as I can tell
14:07:04 <edleafe> Anything else for Specs?
14:07:21 <edleafe> #topic Reviews
14:07:30 <edleafe> #link Nested RP series starting with: https://review.openstack.org/#/c/415921/
14:07:41 <edleafe> Saw that one merged last week
14:07:55 <edleafe> Looks like this series is moving ahead
14:08:08 <edleafe> jaypipes: any issues for that?
14:08:33 <jaypipes> edleafe: nothing other than to say I will not be getting to any new revisions until Wednesday
14:08:43 <bauzas> edleafe: well, mostly the de-orm stuff merged
14:08:46 <edleafe> roger that
14:09:10 <edleafe> next up
14:09:10 <edleafe> #link Add traits to GET /allocation_candidates https://review.openstack.org/479776
14:09:29 <edleafe> alex_xu: is that on hold pending the other traits patches?
14:10:01 <alex_xu> no
14:10:02 <efried> Sorry, back to the NRP series, jaypipes do you want me to marshal any minor issues until you're back in play?
14:10:11 <efried> to keep the series moving?
14:10:34 <jaypipes> efried: I'll let you know later. Need to read any reviews from the latest revisions.
14:10:39 <efried> jaypipes ack
14:11:00 <alex_xu> I rewrote all the patches again, trying to refactor AllocationCandidates.get_by_filters
14:11:07 <edleafe> alex_xu: ok
14:11:21 <edleafe> next
14:11:21 <edleafe> #link Allow _set_allocations to delete allocations https://review.openstack.org/#/c/501051/
14:11:44 <edleafe> That had a +2 before rebase. Just needs some more reviews
14:12:33 <mriedem> does ^ change api behavior at all?
14:13:04 <cdent> not yet, no, that’s higher in the stack
14:13:08 <edleafe> mriedem: it adds a new microversion
14:13:21 <edleafe> oops
14:13:25 <edleafe> that's the next patch
14:13:30 <mriedem> ok because i remember we put a minLength of 1 on PUT allocations
14:13:32 * edleafe is getting ahead of himself
14:13:37 <mriedem> because of some bug at one point
14:13:53 <cdent> the functionality is for POST, and _not_ for PUT
14:13:54 <mriedem> bug 1673227
14:13:55 <openstack> bug 1673227 in OpenStack Compute (nova) ocata "placement apis for inventory and allocations use insufficiently robust jsonschema" [Medium,Confirmed] https://launchpad.net/bugs/1673227
14:14:18 <mriedem> ah ok
14:14:21 <cdent> but changes are present in that stack for GET, PUT, POST, because the representations are all intertwined
14:14:52 <cdent> so at some point later in the stack a POST schema is added which makes minLength 0
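A rough sketch of the schema difference under discussion, assuming jsonschema-style definitions; the property shapes here are illustrative rather than copied from the placement source:

    # PUT /allocations/{consumer_uuid} keeps a minimum of one allocation so an
    # empty body can't silently wipe a consumer's allocations (bug 1673227).
    PUT_ALLOCATIONS_SCHEMA = {
        "type": "object",
        "properties": {
            "allocations": {"type": "array", "minItems": 1},
        },
        "required": ["allocations"],
        "additionalProperties": False,
    }

    # The POST /allocations schema later in the stack relaxes that minimum to
    # zero so a consumer's allocations can be cleared in the same request that
    # writes allocations for other consumers.
    POST_ALLOCATIONS_SCHEMA = {
        "type": "object",
        "properties": {
            "allocations": {"type": "array", "minItems": 0},
        },
        "required": ["allocations"],
        "additionalProperties": False,
    }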
14:15:00 <edleafe> This is the one that adds a microversion:
14:15:00 <edleafe> #link POST /allocations for >1 consumer https://review.openstack.org/#/c/500073/
14:15:31 <edleafe> ^ that's important for migrations, right?
14:15:54 <mriedem> yes,
14:15:57 <cdent> yes, that’s the top of the stack we were just talking about, and is the thing “desired” by migrations
14:16:09 <mriedem> because we move the allocation from the source node provider and the instance to the migration uuid
14:16:28 <mriedem> and put the instance uuid allocation on the dest node provider
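For illustration, a loose sketch of the allocation shuffle mriedem describes, written against a hypothetical placement client; the helper names and call signatures are invented, not nova's code:

    def swap_allocations_for_migration(placement, instance_uuid, migration_uuid,
                                       source_rp_uuid, dest_rp_uuid, resources):
        # the migration record "holds" the resources on the source node provider
        placement.put_allocations(migration_uuid, {source_rp_uuid: resources})
        # the instance is then allocated against the destination node provider
        placement.put_allocations(instance_uuid, {dest_rp_uuid: resources})
        # POST /allocations for >1 consumer would let both consumers be written
        # in a single request instead of separate PUTs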
14:18:07 <edleafe> I'll re-review it later today
14:18:17 <edleafe> next up
14:18:18 <edleafe> #link Use ksa adapter for placement https://review.openstack.org/#/c/492247/
14:18:39 <edleafe> no controversy there
14:19:00 <edleafe> next up:
14:19:00 <edleafe> #link Alternate hosts: series starting with https://review.openstack.org/#/c/486215/
14:19:14 <edleafe> Thanks to mriedem for updating that late on a Friday
14:19:31 <mriedem> i think that's ready to go,
14:19:35 <mriedem> dan was going to look it over again
14:20:08 <mriedem> i put a comment for discussion on the selections object patch,
14:20:16 <mriedem> about the limits field assuming only integer values,
14:20:27 <mriedem> but then kind of doubted myself in my same comment, so didn't -1
14:20:46 <mriedem> not sure if you wanted to discuss that now or not
14:20:47 <edleafe> Yeah, I'm unclear on how that is (mis)used in the field
14:20:58 <mriedem> well, it's assuming in-tree filters that put stuff in the limits dict
14:21:11 <mriedem> which is the NUMATopologyFilter (for which you have a separate limit field),
14:21:19 <mriedem> and ram/core/disk filters which do just set an int value for limits,
14:21:23 <mriedem> my concern was out of tree filters,
14:21:38 <mriedem> however, any out of tree filter putting stuff in limits must also have out of tree code in the claim/resource tracker code to deal with that limit,
14:21:41 <cdent> jaypipes: option down arrow
14:21:54 <mriedem> and while we support out of tree filters, we don't support extending the resource tracker / claims code
14:21:57 <mriedem> ala ERT
14:22:04 <jaypipes> mriedem: out of tree filters are fundamentally broken with the new resource claiming in placement/scheduler
14:22:13 <mriedem> jaypipes: not really
14:22:27 <mriedem> there are still things we don't claim in the scheduler
14:22:28 <mriedem> like numa
14:22:38 <jaypipes> mriedem: well, they essentially mean we can never get rid of claiming stuff on the compute node.
14:22:53 <mriedem> and not all out of tree filters would put stuff into the limits dict,
14:22:57 <mriedem> i'm hoping that most don't
14:23:11 <bauzas> limits are useless for out-of-tree filters
14:23:11 <jaypipes> mriedem: didn't you say huawei did custom out of tree filters?
14:23:13 <mriedem> and my point is, like i said above, we support out of tree filters, but not extending the RT
14:23:20 <bauzas> because we don't have a custom RT :)
14:23:25 <mriedem> lots of people use out of tree filters
14:23:32 <mriedem> and yes huawei has some too
14:23:45 <mriedem> i don't think they have any that put stuff in the limits dict though
14:23:51 <mriedem> but it got me thinking about that
14:23:58 <mriedem> again - we don't have an extendable RT,
14:24:01 <mriedem> so i think it's a moot point
14:24:05 <mriedem> like bauzas reiterated
14:24:14 <jaypipes> mriedem: all depends on whether you are essentially asking us to make the limits dict-packing thing part of the "API" for custom filters.
14:24:29 <jaypipes> mriedem: if you are, then we will forever have to live with claims in the compute node.
14:24:29 <mriedem> i don't think i am now
14:24:32 <bauzas> in general, people use out-of-tree filters for things they don't *consume*
14:24:54 <bauzas> but rather just checking whether the host supports *this* or *that*
14:24:56 <mriedem> i just wanted to make sure i brought it up
14:25:02 <jaypipes> bauzas: example pls
14:25:36 <bauzas> jaypipes: just a boolean condition whether HostState has this or that
14:25:51 <bauzas> not something saying "I want to consume X foos"
14:25:57 <bauzas> since we don't have a custom RT
14:26:11 <jaypipes> bauzas: gotcha. yes, I'm not concerned about those types of filters. I'm concerned about filters that store stuff in limits dict
14:26:30 <bauzas> jaypipes: like I said, I never saw that because we don't have modular resource tracking
14:26:40 <bauzas> people use some other way for tracking
14:26:46 <bauzas> like an external agent
14:26:56 <jaypipes> and last I read, mriedem was concerned about the caching scheduler using the core filters which store stuff in the limits dict for things like cpu, ram, etc
14:27:14 <bauzas> in that case, they don't care about limits - and tbh, only a few of them know what those "limits" are
14:27:44 <mriedem> jaypipes: yeah we're good for the caching scheduler now,
14:27:54 <mriedem> the limits dict in the selection object will handle the basic in-tree filters we have that put stuff in the limits dict
14:27:57 <bauzas> honestly, I just feel we should wait for a use case if one presents itself
14:28:09 <bauzas> and just consider that we only support in-tree limits
14:28:24 <bauzas> which is, tbh, only the NUMA limits IMHO
14:28:35 <bauzas> because CPU, RAM and disk are now accounted differently
14:28:39 <jaypipes> mriedem: ok, I didn't realize we were doing the limits int dict thing in the Selection object. thought we were only doing the numa_limits specialized field. but ok
14:28:41 <mriedem> bauzas: not if you're using the caching scheduler
14:28:50 <mriedem> jaypipes: new as of last week at some point
14:28:54 <bauzas> ah meh good point
14:29:02 <mriedem> we can move on now i think
14:29:03 <bauzas> so, 4 limits, that's it
14:29:05 <bauzas> yeah
14:29:07 <edleafe> jaypipes: while you were gone they had me add the non-numa limits back in
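For context, a rough picture of the limits data being discussed; the key names are illustrative, not the actual Selection object fields:

    # Integer limits set by the in-tree Ram/Core/Disk filters ride in a plain
    # dict on the Selection object...
    selection_limits = {
        "memory_mb": 16384,  # RamFilter
        "vcpu": 16,          # CoreFilter
        "disk_gb": 200,      # DiskFilter
    }
    # ...while the NUMATopologyFilter limit travels in its own specialized
    # field since it isn't a simple integer. Out-of-tree filters stuffing other
    # values into limits are effectively unsupported because the resource
    # tracker / claims code isn't extensible.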
14:29:41 <edleafe> OK, moving on.
14:29:41 <bauzas> magic of the children's vacations: I don't have to leave now and can still hassle you guys for a few more mins :p
14:29:41 <edleafe> #link Numbered grouping of traits https://review.openstack.org/#/c/514091/
14:30:01 <edleafe> That's still WIP, so please chime in with comments
14:30:11 * bauzas should give a medal to the grandparents
14:30:54 <efried> was talking to alex_xu about this earlier
14:31:17 <efried> It's going to completely change the format of the 'filters' param we're currently using in AllocationCandidates.get_by_filters.
14:31:30 <efried> So, for microversion purposes, I was going to write a get_by_filters_grouped (next patch in the series)
14:31:44 <efried> Welcoming comments on that approach.
14:31:46 <efried> That's it.
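A hedged sketch of what the grouped filters parameter could look like; the key names and the get_by_filters_grouped signature below are assumptions for illustration, not the proposed code:

    # Un-suffixed keys apply to the candidate as a whole; a numeric suffix ties
    # resources and traits to a single resource provider within the candidate.
    grouped_filters = {
        "resources": {"VCPU": 2, "MEMORY_MB": 2048},
        "resources1": {"SRIOV_NET_VF": 1},
        "traits1": ["CUSTOM_PHYSNET_PUBLIC"],
    }
    # A new AllocationCandidates.get_by_filters_grouped(grouped_filters) would
    # accept this format, leaving get_by_filters untouched for older callers.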
14:31:58 <jaypipes> will try to review that today, efried
14:32:07 <edleafe> I don't know if I'll have time today (meetings, meetings, etc) but will dig into that patch ASAP
14:32:33 <edleafe> Any other reviews to discuss?
14:32:53 <cdent> yeah
14:33:08 <cdent> https://review.openstack.org/#/c/513526/ is Enable limiting GET /allocation_candidates
14:33:19 <cdent> it’s mostly ready for review, except that efried had one of his usual insights
14:33:41 <cdent> suggesting a change, in the comments, that should probably be done, but worth getting some additional input
14:34:00 <cdent> which is: if we are randomizing when a limit is set, we should also randomize when it's not
14:34:10 <cdent> (see the review for details, just want to draw your attention)
14:34:10 <edleafe> #link Enable limiting GET /allocation_candidates https://review.openstack.org/#/c/513526/
14:34:20 <edleafe> Thanks cdent
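A minimal sketch of the suggestion in that review, assuming a plain list of candidates; this is not the actual patch:

    import random

    def limit_candidates(candidates, limit=None):
        # shuffle whether or not a limit was requested, so behaviour doesn't
        # change just because a limit appears (the suggestion in the review)
        shuffled = random.sample(candidates, len(candidates))
        return shuffled if limit is None else shuffled[:limit]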
14:34:38 <edleafe> #topic Bugs
14:34:40 <edleafe> #link Placement bugs https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:34:47 <edleafe> A few new ones this week
14:35:19 <edleafe> the ironic CUSTOM_CUSTOM_FOO one surprised me
14:35:43 <edleafe> I thought we had normalized the name, instead of always adding CUSTOM_ to what was sent
14:36:23 <mriedem> i seem to remember bringing this up during impl in the driver
14:37:13 <mriedem> yeah the assumption was always that you wouldn't use the CUSTOM_ prefix in ironic,
14:37:17 <mriedem> which was probably the wrong assumption
14:37:27 <mriedem> and nova would just blindly slap it on
14:37:43 <edleafe> We didn't want to make operators add CUSTOM_ to their classes, IIRC
14:38:00 <mriedem> we don't have to make them,
14:38:07 <mriedem> but we can check to see if they did anyway
14:38:31 <edleafe> Looks like johnthetubaguy is on it
14:38:35 <mriedem> i could see someone thinking things should be consistent in the 3 places
14:38:37 <mriedem> yeah, reviewing the fix
14:38:48 <edleafe> Anything else for bugs?
14:38:52 <johnthetubaguy> yeah, it confused me that they weren't the same in the three places
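A minimal sketch, not the fix under review, of normalizing an ironic node's resource_class so a value that already carries the CUSTOM_ prefix isn't prefixed a second time:

    import re

    def normalize_resource_class(name):
        # upper-case and replace anything outside [A-Z0-9_] with underscores
        norm = re.sub(r'[^A-Z0-9_]', '_', name.upper())
        # only prepend the prefix if the operator didn't already supply it
        if not norm.startswith('CUSTOM_'):
            norm = 'CUSTOM_' + norm
        return norm

    # normalize_resource_class('baremetal-gold')   -> 'CUSTOM_BAREMETAL_GOLD'
    # normalize_resource_class('CUSTOM_BAREMETAL') -> 'CUSTOM_BAREMETAL'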
14:39:10 <efried> We could talk about splitting RCs across RPs in an aggregate.  Or, you know, not.
14:39:40 <edleafe> efried: is there a bug for that?
14:39:46 <efried> bug #1724613 and bug #1724633
14:39:48 <openstack> bug 1724613 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters ignores shared RPs when the RC exists in both places" [Undecided,New] https://launchpad.net/bugs/1724613
14:39:49 <openstack> bug 1724633 in OpenStack Compute (nova) "AllocationCandidates.get_by_filters hits incorrectly when traits are split across the main RP and aggregates" [Undecided,New] https://launchpad.net/bugs/1724633
14:40:44 <edleafe> Well, discuss away!
14:40:55 <efried> If someone can tell me that there's no actual code anywhere in existence today that makes *use* of the implementation for placement aggregates, I'll be happy with deferring the discussion.
14:41:53 <efried> But the existing code in AllocationCandidates.get_by_filters is subject to that first bug right now; and will be subject to the second once alex_xu's traits series is done.
14:42:14 <jaypipes> dansmith probably remembers the CUSTOM_CUSTOM_FOO discussion
14:42:23 * cdent contacts his omniscience as a service service
14:42:24 <dansmith> yes
14:42:50 <efried> TL;DR: if my compute has an RC that is also in an aggregate, things go awry.  E.g. my compute has local disk and is also associated with a shared storage pool.
14:43:23 <cdent> efried: I think the thing you can be confident of is that spawning a compute node does not use aggregates yet
14:43:36 <efried> Good.  Does anything else?
14:43:40 <cdent> there may be unknown uses of aggregates out there, but not any that involve booting an instance
14:43:54 <jaypipes> cdent is correct 100%
14:44:15 <efried> It's a matter of getting allocation candidates, not just booting instances.  But that makes me feel a lot better.
14:44:41 <efried> So I'm happy to leave those bugs out there and defer the discussion.  They have test case code proposed that demonstrates them, which we can merge any time too.
14:44:42 <cdent> efried: yes, that’s why I qualified myself. GET /allocation_candidates has the bug.
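To make the first bug concrete, a sketch of the setup with made-up identifiers: the compute node provider has local DISK_GB inventory and also shares an aggregate with a storage provider exposing DISK_GB, and get_by_filters currently only returns the local-disk candidate:

    compute_rp = {
        "uuid": "cn1",
        "inventory": {"VCPU": 8, "MEMORY_MB": 32768, "DISK_GB": 100},
        "aggregates": ["agg1"],
    }
    shared_storage_rp = {
        "uuid": "ss1",
        "inventory": {"DISK_GB": 1000},
        "traits": ["MISC_SHARES_VIA_AGGREGATE"],
        "aggregates": ["agg1"],
    }
    # A request for VCPU, MEMORY_MB and DISK_GB should yield candidates that
    # draw disk from either cn1 or ss1; today the shared provider is ignored.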
14:45:08 <efried> https://review.openstack.org/513149
14:45:12 <cdent> whether it will be exercised is a different question
14:45:12 * alex_xu just found his refactor doesn't fix that bug either
14:45:37 <efried> alex_xu Thanks for testing that.  /me not surprised :)
14:46:33 <efried> We can move on
14:46:36 <alex_xu> not surprised about the allocation bug, that is the right thing :)
14:46:50 <edleafe> #topic Open discussion
14:47:11 <edleafe> One item on the agenda:
14:47:13 <edleafe> #link Should we forbid removing traits from an RP with allocations?  http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-20.log.html#t2017-10-20T21:34:44
14:47:47 <efried> johnthetubaguy ^
14:48:13 <efried> So my vote is no.  This goes along with what we talked about in Denver wrt inventories being modified downward to the point where they're less than what's already been allocated.
14:48:22 <efried> It's allowed, but you won't get any more allocations until you're back out of the red.
14:49:06 <edleafe> efried: how does changing an RP's trait affect that?
14:49:07 <efried> Same thing with the traits.  You can remove a trait from an RP that's already part of an allocation that may have required that trait, but you won't get scheduled to that RP hereafter if you require that trait (unless it gets re-added)
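A tiny illustration of those semantics with entirely made-up structures: trait checks gate only new candidate selection, and nothing re-validates existing allocations when a trait is removed:

    def eligible_providers(providers, required_traits):
        # only future scheduling decisions consult traits
        return [rp for rp in providers
                if required_traits.issubset(rp["traits"])]

    rp1 = {"uuid": "rp1", "traits": {"CUSTOM_FAST_DISK"}, "allocations": ["inst1"]}
    # If CUSTOM_FAST_DISK is later removed from rp1, inst1's allocation stays
    # in place; rp1 simply stops showing up for requests that require the trait
    # until it is re-added.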
14:49:09 <jaypipes> efried: I'd like to hear from operators on that question.
14:49:33 <efried> edleafe Not a direct relationship, just a similar example.
14:49:37 <jaypipes> efried: it's a similar conundrum to the instance group member modifications/delete
14:49:43 <johnthetubaguy> comparing the resource class changes and traits changes in ironic pushes me down that line
14:49:45 <cdent> yeah, I think “no” is the right thing as well. From the change on, no new placements will happen and we don’t want to remove existing, so...
14:50:24 <jaypipes> efried: since changing the member set of an instance group affects whether a new or existing instance still is constrained by the placement policy in use at the time of initial scheduling
14:50:53 <efried> jaypipes Does the instance grouping affect anything after the deploy, though?
14:50:58 <efried> In the case of traits, that answer is no.
14:51:07 <efried> It's a one-and-done kind of thing.
14:51:15 <johnthetubaguy> the next live-migration could be odd, no possible hosts as the trait has gone?
14:51:17 <jaypipes> fwiw, k8s attitude towards this problem is "oh well, shit changed. sorry, go ahead and reschedule if you want"
14:52:07 <jaypipes> efried: if I add or remove a member from the affinity group, we do a check on the compute node to ensure that the constraints around affinity are kept for existing members of the group, yes.
14:52:09 <efried> johnthetubaguy How about this as a more concrete example: if you want to shuffle around which nodes are in CUSTOM_GENERAL_USE vs. CUSTOM_PROJECT_B.  Should you have to clear out existing deployments before you can do that, or is it okay if they just take effect "the next time"?
14:53:14 <mriedem> i've heard people talk about using traits from external systems to dynamically flag if something is available, or not if something failed,
14:53:32 <mriedem> like something in the hardware failed and that component is no longer available, remove the trait so things that require that trait don't get scheduled there
14:53:43 <jaypipes> mriedem: pls no... traits are not for state/status information
14:53:54 <efried> That's still a capability statement.
14:54:01 <jaypipes> no it's not.
14:54:35 <alex_xu> jaypipes: traits are not for that, but it works
14:54:38 <mriedem> traits are going to be used for dynamic scheduling
14:54:50 <mriedem> whether we intended them to be or not
14:54:50 <jaypipes> efried: that is a different thing. whether or not I am capable of chewing gum and patting my head simultaneously is not the same thing as whether or not I am chewing gum and patting my head currently.
14:55:26 <efried> jaypipes Totally agree - been discussing same on the ML.
14:55:32 <jaypipes> mriedem: if that's the case, we might as well give up and just reinstate the ERT, undo all of placement API and just use kubernetes.
14:55:46 <efried> jaypipes What we're talking about here is, if I cut off both your hands, you're no longer capable of doing that, so that trait should be removed from you.
14:56:34 <jaypipes> efried: sure, but what mriedem is talking about is some external agent that is constantly cutting off my hands and reattaching them when, say, a NIC link state goes up or down.
14:56:48 <johnthetubaguy> efried: I want to be able to fix things right away, myself
14:57:08 <jaypipes> johnthetubaguy: what do you mean by "fix things right away:"?
14:57:34 <johnthetubaguy> I don't want to have to wait for a node to be free before I define how/who/what it gets used for next time
14:57:49 <efried> jaypipes Which seems reasonable to me, tbh.  I don't know about link state specifically, how fast that can flip on and off, but if someone trips over a cable, I don't want to deploy new instances there until they plug it back in.
14:58:03 <edleafe> Two minutes left
14:58:09 <jaypipes> johnthetubaguy: I don't see how that is relevant here? if we only use traits for capabilities (and not for state information) then you don't have that problem.
14:58:25 <jaypipes> efried: ugh.
14:58:36 <jaypipes> efried: that is NOT a capability. that is a state/status link.
14:58:59 <jaypipes> efried: see the nic-state-aware-scheduling spec/blueprint for a lengthy discussion of where this leads to...
14:59:04 <cdent> whatever we want people to do with these things, we have to be aware of what people _will_ do
14:59:09 <johnthetubaguy> jaypipes: so it's the only way we have to do non-capability things right now, so that's what is getting used
14:59:28 <edleafe> Sounds like this can be summed up as: can the capabilities of a "thing" change?
15:00:07 <edleafe> OK, thanks everyone!
15:00:07 <edleafe> Continue this in -nova
15:00:08 <edleafe> #endmeeting