14:00:11 <edleafe> #startmeeting nova_scheduler
14:00:12 <openstack> Meeting started Mon Oct 9 14:00:11 2017 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:15 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:20 <efried> \o
14:00:22 <cdent> o/
14:00:22 <edleafe> #link Meeting Agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler
14:00:23 <alex_xu> o/
14:01:01 <edleafe> Let's wait a minute for more people to arrive
14:01:03 <mriedem> o/
14:02:21 <jaypipes> hola
14:02:30 * jaypipes caffeinated
14:03:01 <edleafe> #topic Specs
14:03:33 <edleafe> Lots merged last week
14:03:42 <edleafe> 3 left that I know of:
14:03:43 <edleafe> #link Granular Resource Request Syntax https://review.openstack.org/510244
14:03:46 <edleafe> #link Add spec for symmetric GET and PUT of allocations https://review.openstack.org/#/c/508164/
14:03:49 <edleafe> #link Support traits in the Ironic driver https://review.openstack.org/#/c/507052/
14:03:58 <edleafe> Any comments on these?
14:04:17 <efried> That first one I just added on Friday. It's introducing the numbered-group syntax in GET /allocation_candidates
14:04:27 <jaypipes> I'll review each this morning. the symmetric one is a no brainer I think...
14:04:41 <mriedem> https://review.openstack.org/#/c/507052/ is approved, just won't merge because of the depends-on
14:04:56 <jaypipes> the GRanular one was just submitted by efried on Friday. it's the one about requesting multiple distinct "subrequests" of resource/traits
14:04:59 <cdent> there’s a mild issue in the symmetric one that I’ve just discovered while doing the implementation:
14:05:14 * bauzas waves
14:05:19 <cdent> when we GET there’s no project_id and user_id in the response, but we require that on the PUT. Do we care?
14:05:39 <cdent> I’ll highlight it when I commit, and we can discuss it on the review.
14:05:41 <jaypipes> cdent: probably should be made consistent...
14:06:56 <cdent> an aspect of making it consistent is that it kind of assumes that they might stay the same, which may be too big of an assumption
14:07:12 <cdent> it’s easy to adjust whatever we decide
14:07:16 <jaypipes> cdent: agreed
14:07:24 <mriedem> getting the info about the current project/user is fine,
14:07:35 <mriedem> doesn't mean the PUT has to be the same, but i don't know of case where they wouldn't be the same
14:08:47 <edleafe> #topic Reviews
14:08:56 <edleafe> #link Nested RP series starting with: https://review.openstack.org/#/c/470575/
14:08:59 <edleafe> There was one question attached to this in the agenda:
14:09:01 <edleafe> Debate: should the root_provider_uuid be reported in the GET /resource_providers response?
14:09:16 <efried> So my vote is yes.
14:09:21 <edleafe> Someone had concerns about this a while back - anyone remember why?
14:09:31 <efried> edleafe Heh, jaypipes said it was you :)
14:09:44 <edleafe> efried: yeah, I think he's mis-remembering
14:10:07 <jaypipes> very possible
14:10:18 <efried> Okay. My take is that I want to be able to look at a given RP and get the whole tree for that RP in one step.
14:10:31 <efried> With parent but not root, I have to walk the whole tree up.
14:10:42 <jaypipes> edleafe, efried: I can certainly add it back in if the group votes for that. not a difficult change at all.
14:10:47 <efried> With the root ID, I can just call with ?tree=root and I'm done.
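A rough sketch of the difference efried is describing, assuming the ?tree= filter and an exposed root_provider_uuid land as proposed; the provider names and record shapes below are illustrative only, not the final API:

```python
# Illustrative only: provider records as a client might see them from
# GET /resource_providers, assuming root_provider_uuid is exposed.
providers = [
    {"uuid": "cn1", "parent_provider_uuid": None, "root_provider_uuid": "cn1"},
    {"uuid": "numa0", "parent_provider_uuid": "cn1", "root_provider_uuid": "cn1"},
    {"uuid": "pf1", "parent_provider_uuid": "numa0", "root_provider_uuid": "cn1"},
]

by_uuid = {p["uuid"]: p for p in providers}


def root_by_walking(uuid):
    """Without root_provider_uuid: walk parent links up to the root."""
    p = by_uuid[uuid]
    while p["parent_provider_uuid"] is not None:
        p = by_uuid[p["parent_provider_uuid"]]
    return p["uuid"]


def root_directly(uuid):
    """With root_provider_uuid exposed: one lookup, then e.g. ?tree=<root>."""
    return by_uuid[uuid]["root_provider_uuid"]


assert root_by_walking("pf1") == root_directly("pf1") == "cn1"
```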
14:11:08 <edleafe> makes sense to me
14:11:14 <jaypipes> efried: well, you could also just do ?tree=<some_provider_uuid> and the backend can query on root.
14:11:25 <jaypipes> efried: meaning there's no reason to expose the attribute.
14:12:01 <efried> if it works that way, I'd be cool with that. But I still think there's reason to have the root.
14:12:16 <jaypipes> like I said, I'm cool putting it in
14:12:19 <efried> Thinking about the scheduler using it to figure out e.g. the compute host.
14:12:28 <jaypipes> ack
14:12:32 <jaypipes> ok, let's vote...
14:12:43 <jaypipes> #vote expose the root
14:12:47 <jaypipes> sounds kinky.
14:12:55 <bauzas> startvote FTW
14:13:16 <edleafe> Simpler: anyone opposed?
14:13:20 <jaypipes> well, let's do it this way... does anyone NOT want to expose the root?
14:13:21 <cdent> It was likely me that was opposed originally because it seemed an unecessary detail and I was trying to limit the growth of atributes in the representation
14:13:25 <jaypipes> edleafe: heh, jinks
14:13:26 <edleafe> jinx
14:13:29 <jaypipes> lol
14:13:44 <bauzas> seriously? I dunno
14:13:54 <cdent> but at this stage, given the extent of hairiness that nested is looking like it is going to become, I don’t reckon it matters
14:13:57 <bauzas> use a coin ?
14:14:00 <cdent> there’s going to be a lot of hair
14:14:03 <cdent> so I’d say go for it
14:14:18 <jaypipes> bauzas: what say you?
14:14:20 <bauzas> I don't think it hurts
14:14:24 <edleafe> I don't hear anyone saying no, so...
14:14:28 <edleafe> #agreed Add root provider uuid to GET /resource_providers
14:14:33 <jaypipes> dansmith, mriedem: any thoughts?
14:14:45 <bauzas> jaypipes: I meant we should flip a coin
14:14:49 <bauzas> for deciding
14:14:53 <bauzas> but meh
14:15:08 <dansmith> I'd have to read back
14:15:15 <bauzas> just a stupid untranslatable and unbearable French try of joke
14:15:23 <jaypipes> bauzas: :)
14:15:32 <edleafe> jaypipes: anything else on the nested RP series to discuss now?
14:15:36 <mriedem> so we're talking about exposing something when we don't have a use case to use it?
14:15:39 <mriedem> or a need to use it yet?
14:15:44 <bauzas> I think the spec is pretty rock solid
14:16:07 <bauzas> mriedem: we have one approved spec that would use nested RPs
14:16:09 <jaypipes> mriedem: no, there's definitely a use case for it.
14:16:33 <bauzas> oh, the root UUID ?
14:16:35 <bauzas> well, meh
14:16:41 <jaypipes> mriedem: it's something that *could* be derived by the caller though. in other words, it just makes life a little easier for the scheduler code.
14:16:52 <bauzas> lemme say something terrible
14:17:07 <bauzas> just pass a parameter for telling whether we should return it
14:17:09 <bauzas> tadaaaaaaa
14:17:40 <dansmith> um
14:17:41 <bauzas> so, honestly, I don't care and like I said, it doesn't hurt
14:17:49 <mriedem> given i don't have context on how the scheduler code is going to look with or without it, i can't really say
14:17:56 <mriedem> if it makes the scheduler client code better, then sure, throw it in
14:18:04 <bauzas> it's not a performance problem, right?
14:18:10 <dansmith> I don't understand why we wouldn't if we have the data
14:18:10 <bauzas> so, should we really care of that?
14:18:37 <mriedem> yeah, the less rebuilding of the tree client-side is the way to go
14:18:41 <jaypipes> bauzas: no, nothing perf related
14:19:05 <jaypipes> ok, it's settled then, let's move on.
14:19:09 <efried> I'll update the review.
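Circling back to the symmetry issue cdent raised under Specs: a minimal sketch of the GET/PUT asymmetry, with purely illustrative payloads (placeholder UUIDs, approximate shapes); the exact behavior is what the review will settle:

```python
# Illustrative payloads only; shapes approximate the placement API of the
# time, and the placeholder values are assumptions.

# What GET /allocations/{consumer_uuid} returns: no project/user info.
get_response = {
    "allocations": {
        "rp-uuid-1": {"resources": {"VCPU": 1, "MEMORY_MB": 512}},
    },
}

# What PUT /allocations/{consumer_uuid} requires in recent microversions:
# the same resources plus project_id and user_id.
put_body = {
    "allocations": [
        {
            "resource_provider": {"uuid": "rp-uuid-1"},
            "resources": {"VCPU": 1, "MEMORY_MB": 512},
        },
    ],
    "project_id": "a-project-uuid",
    "user_id": "a-user-uuid",
}

# The open question: should GET also report project_id/user_id so a client
# can round-trip the body unchanged?
```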
14:19:15 <jaypipes> danke
14:19:15 <edleafe> jaypipes: again, anything else on the nested RP series to discuss now?
14:19:24 <bauzas> jaypipes: yeah I know, so honestly not a big deal if we leak it
14:19:31 <jaypipes> edleafe: just to note that I'm rebasing the n-r-p series on the no-orm-resource-providers HEAD
14:19:47 <edleafe> jaypipes: got it
14:19:53 <edleafe> Next up:
14:19:57 <edleafe> #link Add traits to GET /allocation_candidates https://review.openstack.org/479776
14:20:23 <edleafe> alex_xu is back this week, so we should see some activity there
14:20:36 <alex_xu> yea, i
14:20:42 <alex_xu> 'm working on it
14:20:56 * alex_xu isn
14:21:00 <alex_xu> ...
14:21:26 <alex_xu> new keyboard layout...
14:21:31 <cdent> :)
14:21:32 <edleafe> alex_xu: same thing with
14:21:33 <edleafe> #link Add traits to get RPs with shared https://review.openstack.org/478464/
14:21:35 <efried> Use a Dvorak keyboard. The ' is nowhere near the <Enter> key.
14:21:36 <edleafe> ?
14:22:50 <mriedem> i thought we were deferring shared support from queens?
14:22:56 <mriedem> why bother with api changes?
14:23:20 <mriedem> because when we start working on what the client needs for that support, we might need to change the api
14:23:36 <mriedem> or, is this totally not that and i should shut up?
14:24:03 * bauzas bbiab (kids)
14:24:12 <mriedem> yeah nevermind, this isn't what i thought it was
14:24:53 <edleafe> moving on
14:24:55 <edleafe> #link Allow _set_allocations to delete allocations https://review.openstack.org/#/c/501051/
14:25:05 <edleafe> cdent: anything going on with that?
14:25:19 <cdent> it’s just waiting for people to review it pretty much
14:25:53 <cdent> it’s a precursor to doing POST /allocations
14:26:05 <edleafe> Good segueway
14:26:06 <edleafe> #link WIP - POST /allocations for >1 consumer https://review.openstack.org/#/c/500073/
14:26:46 <edleafe> next up
14:26:47 <edleafe> #link Use ksa adapter for placement https://review.openstack.org/#/c/492247/
14:27:10 <edleafe> efried: any comments on these? They look pretty straightforward to me
14:27:39 <efried> The base of that series is getting final reviews from mriedem at this point.
14:27:55 <efried> That patch itself should indeed be pretty straightforward.
14:28:21 <efried> And the rest of the stuff in that series doesn't have anything to do with placement/scheduler.
14:28:29 <mriedem> got the tab open
14:28:44 <edleafe> next up
14:28:45 <edleafe> #link Migration allocation fixes: series starting with https://review.openstack.org/#/c/498950/
14:29:01 <edleafe> That series is moving along
14:29:54 <edleafe> Final review on the agenda:
14:29:56 <edleafe> #link Alternate hosts: series starting with https://review.openstack.org/#/c/486215/
14:30:17 <edleafe> I have to add versioning to the allocation_request in the Selection object
14:30:20 <edleafe> :(
14:30:31 <mriedem> jesus does that bottom change still have the s/failure/error/ comment?!
14:31:02 <edleafe> mriedem: what comment?
14:31:21 <mriedem> nvm
14:31:42 <edleafe> ok
14:32:05 <edleafe> I also need suggestions for naming the parameter added to the select_destinations() RPC call
14:32:27 <edleafe> This tells the scheduler to return the selection objects and alternates
14:32:39 <edleafe> I called it 'modern_flag' as a placeholder
14:32:46 <edleafe> let the bikeshedding begin!
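One possible shape for the parameter edleafe is asking about, offered purely as a strawman for the bikeshedding; the name "return_alternates" and the signature details are hypothetical suggestions, not anything decided in the meeting:

```python
# Strawman only: "return_alternates" is a hypothetical name for the
# placeholder edleafe calls 'modern_flag'.
def select_destinations(context, spec_obj, instance_uuids,
                        return_alternates=False):
    """Scheduler RPC entry point (sketch).

    With return_alternates=False, keep returning the legacy list of
    selected hosts. With return_alternates=True, return one list per
    instance: the chosen Selection object (carrying its versioned,
    serialized allocation_request) followed by alternate Selections
    for use on reschedule.
    """
```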
14:33:18 <edleafe> Please add your thoughts to the review
14:33:22 <edleafe> Moving on
14:33:23 <edleafe> #topic Bugs
14:33:28 <edleafe> 2 new ones:
14:33:36 <edleafe> #link https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:33:44 <edleafe> #link placement server needs to retry allocations, server-side https://bugs.launchpad.net/nova/+bug/1719933
14:33:45 <openstack> Launchpad bug 1719933 in OpenStack Compute (nova) "placement server needs to retry allocations, server-side" [Medium,In progress] - Assigned to Jay Pipes (jaypipes)
14:34:05 <edleafe> This was uncovered by mriedem trying to start 1000 servers at once
14:34:24 <mriedem> which wouldn't have fixed the ultimately reason why i was hitting that, but yeah
14:34:25 <jaypipes> edleafe: yeah, I'm on that
14:34:31 <mriedem> *ultimate
14:34:40 <edleafe> cool
14:34:44 <edleafe> The other is:
14:34:45 <edleafe> #link Evacuate cleanup fails at _delete_allocation_for_moved_instance https://bugs.launchpad.net/nova/+bug/1721652
14:34:46 <openstack> Launchpad bug 1721652 in OpenStack Compute (nova) pike "Evacuate cleanup fails at _delete_allocation_for_moved_instance" [High,Confirmed]
14:34:59 <mriedem> gibi has started a recreate for ^
14:35:18 <mriedem> https://review.openstack.org/#/c/510176/
14:36:32 <edleafe> #link Functional test for bug 1721652https://review.openstack.org/#/c/510176/
14:36:32 <openstack> bug 1721652 in OpenStack Compute (nova) pike "Evacuate cleanup fails at _delete_allocation_for_moved_instance" [High,Confirmed] https://launchpad.net/bugs/1721652
14:36:41 <edleafe> #undo
14:36:42 <openstack> Removing item from minutes: #link https://review.openstack.org/#/c/510176/
14:36:50 <edleafe> #link Functional test for bug 1721652 https://review.openstack.org/#/c/510176/
14:37:22 <edleafe> Anything else for bugs?
14:38:23 * cdent watches the pretty tumbleweeds
14:38:30 <mriedem> mr gorbachev, tear down this meeting
14:38:37 <edleafe> nope
14:38:39 <edleafe> #topic Open Discussion
14:38:55 <edleafe> Getting allocations into virt (e.g. new param to spawn). Some discussion here:
14:38:58 <edleafe> #link Getting allocations into virt http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-04.log.html#t2017-10-04T13:49:18-2
14:39:38 <edleafe> efried: wanna lead this?
14:40:05 <efried> Some alternatives that were (briefly) discussed:
14:40:17 <efried> Adding allocations to the request spec
14:40:24 <efried> Or elsewhere in the instance object.
14:40:47 <efried> IIRC, those were rejected because of general resistance to glomming more gorp onto those things.
14:41:04 <edleafe> Yeah, glomming gorp is bad
14:41:36 <efried> The drive for this is for virt to be able to understand comprehensively what has been requested of it.
14:41:58 <mriedem> right now it's just going to be passing shit through flavor extra specs isn't it?
14:42:00 <mriedem> unless we change that?
14:42:33 <efried> Which is limited
14:42:33 <mriedem> doesn't alex_xu have a spec for specifying traits in a flavor
14:42:38 <efried> yes
14:43:12 <edleafe> efried: so you want something more spectific, no?
14:43:18 <mriedem> we agreed to limited support for stuff like vgpus in queens
14:43:23 <alex_xu> I guess efried is talking about specific resource allocated to the instance?
14:43:28 <mriedem> what are you needing? like a complex data structure?
14:43:35 <edleafe> E.g., not just a VF, but a VF on a particular PF?
14:43:38 <efried> Right; flavor extra specs tells us what was requested generically; the allocations will tell us specific RPs etc.
14:43:50 <efried> edleafe just so.
14:44:02 <efried> mriedem Not any more complex than the allocations object :)
14:44:19 <mriedem> so, as a user, i want not only a VF, but the 3rd VF on 4th PF?
14:44:23 * edleafe remembers when we were making clouds...
14:44:41 <efried> mriedem Or possibly just "a VF on the 4th PF". But yeah, that's the general idea.
14:44:49 <mriedem> ew
14:44:51 <efried> Because placement is going to have allocated inventory out of a specific RP.
14:44:58 <mriedem> do we need this for queens?
14:45:06 <efried> If spawn doesn't have any way to know which one, how does it know where to take the VF from?
14:45:19 <mriedem> it's random isn't it?
14:45:27 <efried> What's random?
14:45:38 <mriedem> the PF
14:45:39 <efried> Certainly not which PF the VF comes from.
14:45:43 <efried> No, not at all.
14:45:46 <mriedem> or is that all whitelist magic?
14:45:50 <efried> Could be based on traits, inventory, etc.
14:45:54 <efried> Not even thinking about whitelist.
14:46:29 <efried> Placement narrowed it down, scheduler picked one. Virt needs to know which one.
14:46:48 <mriedem> can't virt ask placement for the one that was picked?
14:46:59 <efried> Yes, could.
14:47:06 <efried> But I got the impression we didn't want virt to talk to placement.
14:47:18 <mriedem> it already does for reporting inventory
14:47:27 <efried> bauzas and dansmith both expressed that
14:47:31 <efried> Not directly
14:47:49 <mriedem> virt == compute service == contains the rt that reports inventory
14:47:55 <mriedem> in my head anyway
14:48:00 <dansmith> virt != compute service:)
14:48:06 <efried> "virt driver" then.
14:48:06 <dansmith> virt should not talk directly to placement, IMHO
14:48:09 <dansmith> compute should
14:48:15 <mriedem> ok,
14:48:22 <mriedem> so compute manager asks placement for the allocations for a given request,
14:48:28 <mriedem> builds those into some fancy pants object,
14:48:32 <mriedem> and passes that to the virt drive
14:48:33 <mriedem> *driver
14:48:44 <mriedem> ?
14:49:07 <efried> Didn't scheduler already give that allocations object to compute manager?
14:49:09 <dansmith> compute should provide the allocations to virt when needed, yeah
14:49:15 <mriedem> just like the neutron network API asks neutron for ports, builds network_info and passes that to spawn
14:49:30 <mriedem> efried: no
14:49:53 <efried> So it'll have to ask placement for that allocations object. Okay.
14:49:56 <mriedem> so this essentially sounds like the same thing we do for bdms and ports
14:50:08 <mriedem> so in _build_resources you yield another new thing
14:50:13 <mriedem> and pass that to driver.spawn
14:50:17 <efried> And yeah, I guess we could funnel it into a pythonic nova object (which may eventually be an os-placement object)
14:50:29 <efried> right
14:50:32 <mriedem> oo we're already talking about new libraries?!
14:50:40 <mriedem> :P
14:50:41 <efried> When we split placement out into its own thing?
14:50:49 <efried> Sorry, don't mean to muddy the waters.
14:51:13 <mriedem> ok so in the Slime release...
14:51:36 <mriedem> anyway, i think you get the general idea of what the compute would do yeah/
14:51:37 <mriedem> ?
14:52:00 <efried> You're saying this isn't something we want to do in Queens?
14:52:00 <mriedem> is there a specific bp that is going to need this?
14:52:15 <mriedem> there are things we can want to do, and things we can actually get done
14:52:34 <efried> Well, I don't see how e.g. the vGPU thing is going to work without it.
14:52:37 <mriedem> i'm trying to figure out what we actually need to get done so we can focus on those first
14:52:52 <efried> Unless we bridge the gap by having the virt driver ask placement for the allocations.
14:53:08 <mriedem> is there any poc up yet for that?
14:53:19 <mriedem> maybe the xen team hasn't gotten that far?
14:53:21 <efried> For vGPU?
14:53:23 <mriedem> yeah
14:53:41 <dansmith> bauzas was going to be working on this
14:53:42 <mriedem> anyway, maybe it will be needed, but i'd check with the other people working on this too
14:53:43 <efried> Wasn't there a big stack with mdev in libvirt?
14:53:52 <dansmith> providing the allocation to virt so we could do that
14:53:53 <dansmith> however,
14:53:54 <mriedem> the totally separate effort?
14:54:00 <dansmith> we can use the flavor for right now and move on
14:54:33 <efried> dansmith And accept that the virt driver may pick a different PF than that from which placement allocated the inventory?
14:54:50 <efried> And have the virt driver duplicate the logic to check for traits?
14:54:51 <dansmith> efried: placement isn't picking PFs right now
14:54:51 <mriedem> efried: so how about you follow up with the xen team and see what they had in mind for this,
14:55:25 <efried> placement is picking specific RPs. Depending how the RPs are modeled, those could be PFs. Just using PFs as a general example.
14:55:30 <dansmith> efried: it's just picking "has a vgpu" which means virt can easily grab the first free one and do that thing
14:55:40 <efried> Unless traits.
14:55:55 <dansmith> efried: we don't have nrps, which means it's not picking traits
14:56:06 <dansmith> er, picking PFs,
14:56:08 <efried> All of that is landing in Queens, early.
14:56:13 <efried> at least in theory.
14:56:15 <dansmith> but also means no multiples, so traits are irrelevant
14:56:28 <efried> Also hopefully landing in Queens.
14:56:35 * bauzas is back
14:56:41 <dansmith> efried: yeah, in theory and we're working on it, but we can easily land a flavor-based thing right now and have that as a backup if we don't get NRPs or something else blocks us
14:56:43 <dansmith> it's trivial
14:57:07 <edleafe> 3 minutes to go
14:57:17 <efried> Let me ask it this way: does putting allocations in a spawn param need a blueprint?
14:57:20 <dansmith> if we linearize everything, something is definitely going to miss queens
14:57:26 <dansmith> efried: not IMHO
14:58:08 <efried> Cool. Then if someone gets the bandwidth to propose a patch, and it doesn't seem too heinous, it could happen.
14:58:09 <dansmith> efried: the thing I'm worried about is that if we go the allocation route,
14:58:27 <dansmith> you have to build a big matrix of rp_uuids to actual devices and figure out how to do all that accounting before you can do the basic thing
14:58:39 <dansmith> however, if we just assume one set of identical gpus per node with flavor right now,
14:58:45 <dansmith> you can get basic support in place
14:59:08 <dansmith> if we rabbit-hole on this after NRPs are done, we could likely miss queens and bauzas will be taken to the gallows
14:59:10 <efried> dansmith Sure, fair point. That matrix of RP UUIDs to devices is something that's going to have to happen.
14:59:17 <dansmith> efried: totes
14:59:35 <dansmith> efried: but let's not hamstring any sort of support on that when we can do the easy thing right now
14:59:54 <efried> Sure
15:00:01 <edleafe> OK, thanks everyone! Continue the discussion in -nova
15:00:01 <edleafe> #endmeeting
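For reference, a minimal sketch of the flow mriedem and dansmith describe in Open Discussion: the compute manager (not the virt driver) asks placement for the instance's allocations and hands them to spawn, the same way ports and bdms are gathered today. All class, method, and parameter names here are assumptions for illustration; the real shape would be settled in the patch efried mentions.

```python
# Hypothetical sketch, not nova code. The report-client method name is an
# assumption; the point is only where the placement call happens.
class ComputeManagerSketch:
    def __init__(self, reportclient, driver):
        self.reportclient = reportclient   # talks to placement
        self.driver = driver               # virt driver

    def build_and_run_instance(self, context, instance, *, network_info,
                               block_device_info):
        # Analogous to gathering ports/bdms in _build_resources: fetch the
        # allocations placement recorded for this consumer.
        allocations = self.reportclient.get_allocations_for_consumer(
            context, instance.uuid)
        # Pass the per-resource-provider allocations down so the driver
        # knows, e.g., which PF the VF (or which RP the vGPU) came from.
        self.driver.spawn(context, instance,
                          network_info=network_info,
                          block_device_info=block_device_info,
                          allocations=allocations)
```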