14:00:19 <edleafe> #startmeeting nova_scheduler
14:00:20 <openstack> Meeting started Mon Jan 29 14:00:19 2018 UTC and is due to finish in 60 minutes. The chair is edleafe. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:24 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:46 <efried> @/
14:00:49 <alex_xu> o/
14:00:53 <ttsiouts> o/
14:00:56 <takashin> o/
14:01:04 <edleafe> Looks like efried didn't comb his hair
14:01:24 * efried is a conditioner commercial
14:01:26 <bauzas> \o
14:01:28 <cdent> i'm only vaguely here
14:02:03 <bauzas> can we please try to make that meeting short ? :)
14:02:09 <bauzas> I have a ton of things to catch up :)
14:02:28 <edleafe> bauzas: sounds good
14:02:31 <edleafe> let's start then
14:02:38 <edleafe> #topic Reviews
14:02:53 <edleafe> #link Provider Tree series, starting with: https://review.openstack.org/#/c/533808/
14:02:56 <edleafe> Most of the bottom patches are +W'd
14:02:59 <edleafe> #link First provider tree patch in progress is https://review.openstack.org/#/c/537648/
14:03:02 <edleafe> Anything to add, efried?
14:03:24 <mriedem> o/
14:03:26 <efried> The weekend saw two of eight merge there. The remaining six are In The Gate.
14:03:36 <efried> <ominous cadence>
14:03:58 <mriedem> are a bunch failing on functional job timeouts?
14:03:58 <edleafe> ok
14:04:00 <efried> I'll be cleaning up the rest of the series over the next couple of days, but of course those pieces are Rocky-bound.
14:04:14 <efried> mriedem Infra has declared no rechecks until they've finished sleuthing.
14:04:21 <mriedem> ooo...
14:04:27 <bauzas> it should no longer be called "the gate", rather "the wall"
14:04:33 <mriedem> was there something in the ML?
14:04:39 <mriedem> or just a channel message?
14:04:39 <edleafe> "the pit"
14:04:40 <efried> mriedem IRC broadcast
14:04:42 <bauzas> mriedem: see IRC topics
14:04:54 <efried> "...of despair"
14:04:54 <mriedem> ah
14:05:08 <edleafe> next up:
14:05:09 <edleafe> #link Nested RP traits selection: https://review.openstack.org/#/c/531899/
14:05:12 <edleafe> Not much recent action on this, as the series is postponed until Rocky
14:05:14 <bauzas> and https://wiki.openstack.org/wiki/Infrastructure_Status
14:05:31 <bauzas> says that we had an infra cloud provider outage, hence the very large delay
14:05:48 <edleafe> #link Singular request group traits, starting with: https://review.openstack.org/#/c/536085/
14:05:51 <edleafe> This series is also +W'd, although a cleanup patch at the end needs work
14:05:55 <bauzas> the traits API thing has been merged, right?
14:06:17 <edleafe> bauzas: which thing?
14:06:32 <alex_xu> yes
14:06:45 <bauzas> the API side
14:06:51 <alex_xu> bauzas: the traits support in allocation candidates API is merged
14:07:24 <alex_xu> the patch for support traits in flavor extra spec is still in the gate pipeline
14:07:24 <mriedem> https://review.openstack.org/#/c/536085/ is left
14:07:29 <mriedem> to tie things together
14:07:40 <alex_xu> yes
14:07:43 <bauzas> yeah ok, we're on the same page
14:07:46 <mriedem> alex_xu: did you see my request for a functional test?
14:07:48 <bauzas> I was meaning that
14:08:05 <alex_xu> mriedem: the patch for bump the timeout?
14:08:14 <mriedem> alex_xu: no,
14:08:17 <mriedem> on the traits stuff,
14:08:34 <mriedem> i think we should have a functional test with 2 computes and one decorated with a trait that we use via flavor extra spec to see everything works as expected
14:08:35 <mriedem> end to end
14:08:50 <bauzas> +1
14:09:03 <alex_xu> mriedem: yea, I will add one, gibi_ wants that also
14:09:16 <mriedem> thanks. i can't remember which patch i mentioned that in.
14:09:27 <edleafe> Moving on...
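The end-to-end flow alex_xu and mriedem discuss above — a trait set on a compute node resource provider, required via a flavor extra spec, and forwarded to placement as a `required=` query parameter on `GET /allocation_candidates` — can be sketched in Python. This is a simplified illustration, not Nova's actual code; the `trait:<NAME>=required` extra-spec key format matches the Queens-era syntax, but the function names here are hypothetical.

```python
# Hypothetical sketch: required traits flowing from a flavor's extra
# specs into a GET /allocation_candidates query string. Not Nova's
# actual implementation.
from urllib.parse import urlencode


def required_traits_from_extra_specs(extra_specs):
    """Collect trait names marked 'required' in flavor extra specs."""
    traits = []
    for key, value in extra_specs.items():
        if key.startswith('trait:') and value == 'required':
            traits.append(key[len('trait:'):])
    return sorted(traits)


def allocation_candidates_query(resources, extra_specs):
    """Build the query string for GET /allocation_candidates."""
    params = {'resources': ','.join(
        '%s:%d' % (rc, amount) for rc, amount in sorted(resources.items()))}
    required = required_traits_from_extra_specs(extra_specs)
    if required:
        params['required'] = ','.join(required)
    return urlencode(params)


extra_specs = {'trait:HW_CPU_X86_AVX2': 'required', 'hw:cpu_policy': 'shared'}
query = allocation_candidates_query({'VCPU': 2, 'MEMORY_MB': 2048}, extra_specs)
# -> resources=MEMORY_MB%3A2048%2CVCPU%3A2&required=HW_CPU_X86_AVX2
```

The functional test mriedem asks for would exercise exactly this path end to end: two computes, one decorated with the trait, and a boot request whose flavor carries the `trait:` extra spec.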
14:09:28 <edleafe> #link Granular resource requests, starting with: https://review.openstack.org/#/c/517757/
14:09:31 <edleafe> Still WIP; to be completed in Rocky
14:09:52 <edleafe> Next:
14:09:53 <edleafe> #link Remove microversion fallback: https://review.openstack.org/#/c/528794/
14:09:56 <edleafe> Simple enough; caught in Zuul hell
14:10:03 <edleafe> Needs some +2s, also
14:10:30 <edleafe> Last on the agenda:
14:10:31 <edleafe> #link Use alternate hosts for resize: https://review.openstack.org/#/c/537614/
14:10:34 <edleafe> The main patch finally merged! All that's left is this follow-up unit test
14:10:39 <bauzas> honestly, I think we need to identify what we can still land for Rocky and what can be postponed
14:10:50 <bauzas> I know mriedem made an etherpad for that
14:10:56 <edleafe> s/Rocky/Queens
14:11:05 <bauzas> because I don't know when the gate problems will be fixed
14:11:10 <bauzas> heh, oops, yes
14:11:29 <bauzas> we can't just recheck the entire world
14:11:40 <mriedem> https://etherpad.openstack.org/p/nova-queens-blueprint-status is what i'm still tracking for queens
14:11:40 <edleafe> We're OpenStack - we can do anything!
14:11:52 <bauzas> (and I tell that as I also have one important change blocked in the gate)
14:12:30 <bauzas> mriedem: yup, I was talking of that etherpad
14:12:55 <edleafe> Is there anything on that etherpad we need to discuss? Seems straightforward to me
14:13:05 <mriedem> no i don't think so,
14:13:13 <mriedem> i think all the approved NRP stuff is merged,
14:13:16 <mriedem> so the rest goes to rocky
14:13:24 <mriedem> i'll probably put a procedural -2 on the bottom change in the series
14:13:44 <mriedem> oh nvm i see a bunch is approved now
14:13:45 <efried> The bottom unapproved one, you mean
14:14:16 <efried> https://review.openstack.org/#/c/537648/ is the first unapproved in the series.
14:14:20 <mriedem> yeah
14:14:53 <mriedem> question
14:14:59 <mriedem> is there an 'end' patch in this series yet
14:14:59 <mriedem> ?
14:15:09 <mriedem> like, when do we say a user can use this stuff
14:15:23 <efried> https://review.openstack.org/#/c/520246/
14:15:31 <efried> if by "user" you mean "virt driver"
14:15:50 <efried> And can use in the sense that they can use update_provider_tree, but they can't make use of nested therein.
14:15:58 <mriedem> i mean,
14:16:04 <mriedem> an API user or operator,
14:16:12 <mriedem> modeling server create requests that rely on NRP
14:17:15 <efried> Actual hierarchical provider models will require 1) the above (https://review.openstack.org/#/c/520246/), 2) Jay's series on NRP in alloc cands, and 3) a virt driver impl of update_provider_tree.
14:17:16 <bauzas> mriedem: the theory is that a good customer usecase would be the VGPU stuff
14:17:20 <efried> None of those things are making Q.
14:17:31 <efried> But are very close.
14:17:48 <efried> Well, somewhat close. At least already started.
14:17:49 <bauzas> mriedem: ie. xen folks providing a nested inventory so that a server boot asking for VGPU in the flavor would use that
14:18:13 <bauzas> that's the only usecase I'm aware of in the foreseeable future
14:18:23 <mriedem> vgpu and xen reminds me, there are open xen vgpu patches, so what's the support statement for the xen driver wrt gpu in queens?
14:18:25 <bauzas> because PCI things are way behind that in terms of schedule
14:18:33 <mriedem> https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu+status:open
14:19:03 <mriedem> does the xen driver have the same basic support for vgpu as the libvirt driver? and the remaining patches are future work for using NRP?
14:19:12 <bauzas> mriedem: AFAICT, they started to play with nested RPs
14:19:15 <efried> 1) is a few lines of test code from being ready. 2) is fairly far along. 3) gpu stuff has been written based on WIPs of #1, but is obviously blocked by that, and is still pretty nascent afaik
14:19:26 <efried> yes
14:19:40 <bauzas> mriedem: yes, basic feature parity, except they haven't tested all the server actions AFAICS
14:19:58 <mriedem> ok thanks both; bauzas we should have a patch for a feature support matrix entry for vgpu i think
14:20:11 <bauzas> mriedem: yup, I saw your comment and I agree
14:20:12 <mriedem> but now i'm derailing this meeting and will stop
14:20:25 <bauzas> +1
14:20:43 <bauzas> (and I asked for a quick meeting - pretty unfair if I chat too much)
14:21:15 <edleafe> bauzas: heh
14:21:30 <edleafe> Any other reviews to discuss?
14:22:06 <edleafe> ok then
14:22:06 <edleafe> #topic Bugs
14:22:07 <edleafe> #link Placement bugs: https://bugs.launchpad.net/nova/+bugs?field.tag=placement
14:22:11 <edleafe> Nothing new this week; anyone have anything to say about bugs?
14:22:38 <mriedem> we got'em
14:22:49 <mriedem> just keep an eye out for new bugs each day until RC1
14:22:51 <bauzas> just that I'll do some triage
14:23:09 <edleafe> Sounds good
14:23:11 <edleafe> #topic Open Discussion
14:23:16 <mriedem> ooo i have something
14:23:21 <edleafe> go for it
14:23:23 <mriedem> http://lists.openstack.org/pipermail/openstack-dev/2018-January/126653.html
14:23:25 <mriedem> #link http://lists.openstack.org/pipermail/openstack-dev/2018-January/126653.html
14:23:33 <mriedem> just a simple thing i POC'ed over the weekend,
14:23:40 <mriedem> to put the driver.capabilities as traits on the compute node RP
14:24:04 <mriedem> i think this would be useful for scheduling things that depend on driver capabilities, like multiattach and tagged attach,
14:24:14 <mriedem> which are things today that if we pick the wrong compute, we fail and don't reschedule
14:24:31 <mriedem> and also doesn't require a full blown capabilities API
14:24:33 <bauzas> mriedem: I saw your email but I was waiting for you to be awake
14:24:52 <bauzas> mriedem: because it looks like I wasn't thinking of the same thing as you when I was looking at a "capabilities API"
14:25:08 <efried> First blush I like the idea.
14:25:12 <bauzas> mriedem: in your email, you mention exposing the CPU features and other nasty bits, right?
14:25:35 <efried> IMO that should be done anyway, but is unrelated to driver capabilities.
14:25:46 <bauzas> oh, sorry
14:25:49 <bauzas> misunderstood
14:25:54 <cdent> efried: yeah, my thought too
14:26:02 <mriedem> bauzas: no
14:26:03 <bauzas> the proposal is to use traits for exposing what you can do for a compute
14:26:13 <bauzas> correct?
14:26:21 <mriedem> i've always wanted a way to take what's in the driver.capabilities dict,
14:26:25 <mriedem> and expose that out of the rest api
14:26:30 <mriedem> this is an easy way to do that
14:26:55 <efried> In the sense of "has image cache" or "supports quiesce". Not in the sense of "HW_CPU_X86_AVX".
14:27:01 <mriedem> correct
14:27:09 <bauzas> gotcha
14:27:29 <mriedem> we could have stored these along with the compute_nodes table and written a capabilities subresource API on os-hypervisors,
14:27:43 <mriedem> but a new API just seems like it would get stuck in committee
14:28:09 <mriedem> anyway, it was just an idea, thrown out there for discussion
14:28:10 <efried> I mean, I get that there could be some confusion, because we're mixing virt driver traits with hardware traits on the same resource provider.
14:28:28 <mriedem> not all traits on a RP are going to be hw traits, are they?
14:28:33 <mriedem> that's a bit limited in scope
14:28:50 <mriedem> ages ago we talked about hypervisor version as a trait
14:29:26 <cdent> presumably an rp should expose any traits which are useful in selection
14:29:32 <efried> Yeah, I'm not saying we shouldn't do it. I'm just saying muddling the distinction between "the box" and "the virt driver"...
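mriedem's proposal above — turning the boolean `driver.capabilities` dict into traits on the compute node resource provider — amounts to a simple name mapping. The sketch below is a hypothetical illustration of the idea, not the actual POC patch; the `COMPUTE_` prefix, the derived trait names, and the sample capability keys are all assumptions made for this example.

```python
# Hypothetical sketch of mapping a virt driver's boolean capabilities
# dict onto placement trait names. The "COMPUTE_" prefix and the
# capability keys below are illustrative assumptions, not Nova's
# real mapping.


def capabilities_to_traits(capabilities):
    """Return trait names for each capability the driver reports True."""
    traits = set()
    for cap, supported in capabilities.items():
        if supported:
            # e.g. "supports_multiattach" -> "COMPUTE_SUPPORTS_MULTIATTACH"
            traits.add('COMPUTE_' + cap.upper())
    return traits


caps = {'supports_multiattach': True,
        'supports_device_tagging': True,
        'supports_recreate': False}
traits = capabilities_to_traits(caps)
# The compute manager would then report these traits on the compute
# node resource provider, so a flavor extra spec like
# "trait:COMPUTE_SUPPORTS_MULTIATTACH=required" could steer scheduling
# away from computes that would otherwise fail without a reschedule.
```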
14:29:37 <cdent> traits are cheap and designed to be cheap
14:30:11 <efried> We *could* make a dummy RP with no inventory that represents the virt driver. Put it... really anywhere in the tree.
14:30:21 <efried> But to cdent's point, RPs are more expensive than traits.
14:30:30 <mriedem> yeah that seems overly complicated to me
14:30:34 <efried> Agree.
14:30:35 <edleafe> efried: yeah, I was just going to say that
14:30:54 <efried> reductio ad absurdum and all that.
14:31:06 <mriedem> soon we'll have an SDD for modeling this all
14:31:07 <edleafe> and "supports foo" is better than "version 1.23"
14:31:14 <efried> ++
14:31:39 <bauzas> sorry, was on another discussion
14:31:42 * mriedem gets lost down memory lane http://docs.oasis-open.org/sdd/v1.0/os/sdd-spec-v1.0-os.html
14:31:48 <bauzas> but I think it's a doable way
14:32:05 * cdent demerits mriedem for making a reference to oasis
14:32:06 <mriedem> i'll throw it in the PTG etherpad and we can discuss there, and move on here
14:32:23 <bauzas> what's next to talk ? TOSCA support ?
14:32:32 <mriedem> mmm https://www.dmtf.org/standards/cim
14:32:46 <edleafe> OK, anything else for opens?
14:33:26 <edleafe> OK, everyone - back to whatever it was you were doing before.
14:33:29 <edleafe> #endmeeting
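The nested-provider use case bauzas describes earlier in the meeting — a virt driver reporting VGPU inventory on a child provider via update_provider_tree — might look roughly like the following. The ProviderTree class here is a minimal stand-in invented for this sketch; Nova's real nova.compute.provider_tree API differs, and the provider name, inventory figures, and driver hook shown are assumptions.

```python
# Toy illustration of a virt driver hanging VGPU inventory on a child
# resource provider, in the spirit of update_provider_tree. The
# ProviderTree class is a minimal stand-in for this sketch only, not
# Nova's nova.compute.provider_tree implementation.


class ProviderTree:
    def __init__(self, root_name):
        self.inventories = {root_name: {}}
        self.parents = {root_name: None}

    def new_child(self, name, parent):
        self.inventories[name] = {}
        self.parents[name] = parent

    def update_inventory(self, name, inventory):
        self.inventories[name] = inventory


def update_provider_tree(provider_tree, nodename):
    """Driver hook: report a pGPU as a child provider of the compute node."""
    gpu_rp = '%s_pGPU_0' % nodename  # hypothetical child-provider name
    provider_tree.new_child(gpu_rp, parent=nodename)
    provider_tree.update_inventory(
        gpu_rp, {'VGPU': {'total': 8, 'reserved': 0,
                          'min_unit': 1, 'max_unit': 1,
                          'step_size': 1, 'allocation_ratio': 1.0}})


tree = ProviderTree('compute1')
update_provider_tree(tree, 'compute1')
# tree now has a child provider 'compute1_pGPU_0' carrying VGPU
# inventory; a flavor requesting resources:VGPU=1 could land on it
# once nested providers are handled in allocation candidates (item 2
# in efried's list above).
```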