14:00:27 <cdent> #startmeeting nova_scheduler
14:00:28 <openstack> Meeting started Mon Jan 16 14:00:27 2017 UTC and is due to finish in 60 minutes.  The chair is cdent. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:29 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:32 <openstack> The meeting name has been set to 'nova_scheduler'
14:00:56 <cdent> say hi if you're here for the nova scheduler team meeting
14:01:05 <cdent> i'll be running the show today because edleafe is on PTO
14:01:31 <cdent> #link agenda https://wiki.openstack.org/wiki/Meetings/NovaScheduler#Agenda_for_next_meeting
14:01:33 <_gryf> o/
14:01:52 <jaypipes> o/
14:01:59 <diga> o/
14:02:17 <cdent> #topic specs & reviews
14:02:32 <cdent> #link inventory tree https://review.openstack.org/#/c/415920
14:02:59 <cdent> jaypipes: I think you and ed negotiated a new approach on this? do things in report client instead of virt manager?
14:03:13 <cdent> also: thank you for choosing to change the name to ProviderTree
14:03:24 <cdent> makes my brain feel smoother
14:03:31 <jaypipes> cdent: yup, and I have made all those changes.
14:03:48 <jaypipes> cdent: I still need to work NUMA cells into the patch series, but it's getting there.
14:03:51 <bauzas> \o
14:03:58 <cdent> is that link above still good as the entry point to review stuff?
14:04:21 <jaypipes> cdent: yes sir
14:04:29 <cdent> cool
14:04:55 <cdent> the agenda says (on that topic): "Is it possible to fix libvirt instead?"
14:05:04 <cdent> I assume ed probably wrote that.
14:05:24 <jaypipes> yeah. never mind that, I pulled all of it out of the virt layer.
14:05:29 <bauzas> could someone please give me more context about the change we're discussing?
14:05:42 <bauzas> I did read the commit msg of course :)
14:06:15 <cdent> bauzas: you mean "why do we need this thing?"
14:06:16 <jaypipes> bauzas: I had proposed a change to the virt driver API that would pass a new InventoryTree object to an update_inventory() method of the virt driver, where the driver would be responsible for updating the state of the inventory.
14:06:49 <bauzas> cdent: not the reasoning, just the problem statement :)
14:06:54 <jaypipes> bauzas: edleafe rightly said it was too invasive and cdent said it felt yucky, so I reworked that to instead be self-contained within the scheduler reporting client.
14:07:12 <cdent> "felt yucky" was my professional evaluation
14:07:18 <bauzas> mmm
14:07:42 <bauzas> because we don't yet have a hierarchical way to represent the inventories?
14:08:13 <jaypipes> bauzas: well, simpler problem than that :)
14:08:20 <bauzas> actually, I was a bit out when you discussed the custom rps
14:08:31 <bauzas> oops
14:08:34 <bauzas> s/custom/nested
14:08:42 <jaypipes> bauzas: we didn't yet have a way of identifying resource providers by name and not uuid, and the way the NUMA stuff was written made it impossible to add a uuid field to NUMACell.
14:09:09 <bauzas> ah I see
14:09:12 <jaypipes> bauzas: in addition to the problem of nesting levels.
14:09:40 <bauzas> it was impossible because of the limbo dance in the virt hardware helper module?
14:09:49 <jaypipes> bauzas: add to that the ongoing mutex on API microversion and it made for a fun Sunday of rebasing :)
14:10:26 <bauzas> okay, I don't want to eat too much time about that problem statement
14:10:26 <jaypipes> bauzas: yes, if by limbo dance you mean the functional programming style that module was written in, with read-only parameters and functions that return copies/mutations of the supplied parameters.
14:10:44 <bauzas> we could discuss that offline
14:10:49 <jaypipes> :)
14:10:53 <jaypipes> anyway, yeah, moving on.
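
A minimal sketch of the provider-tree idea discussed above, with hypothetical names (not the actual nova ProviderTree API): the point is that providers become addressable by name as well as uuid, and children such as NUMA cells can nest under a root compute-node provider.

    import uuid


    class _Provider(object):
        def __init__(self, name, parent=None):
            self.uuid = str(uuid.uuid4())
            self.name = name
            self.parent = parent
            self.children = []


    class SimpleProviderTree(object):
        """Toy stand-in for a tree of resource providers."""

        def __init__(self):
            self.roots = []
            self._by_name = {}

        def new_root(self, name):
            root = _Provider(name)
            self.roots.append(root)
            self._by_name[name] = root
            return root

        def new_child(self, name, parent_name):
            parent = self._by_name[parent_name]
            child = _Provider(name, parent=parent)
            parent.children.append(child)
            self._by_name[name] = child
            return child

        def find(self, name):
            return self._by_name.get(name)


    # e.g. a compute node with two NUMA cell child providers
    tree = SimpleProviderTree()
    tree.new_root('compute1')
    tree.new_child('compute1_numa0', 'compute1')
    tree.new_child('compute1_numa1', 'compute1')
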
14:11:04 <jaypipes> so we got bauzas patch merged. \o/
14:11:06 <bauzas> jaypipes: yup, that and all the conditionals in there
14:11:10 <cdent> anyone have any other specs and reviews they'd like to mention?
14:11:16 <_gryf> yeah
14:11:26 <_gryf> I didn't put that on the agenda
14:11:32 <_gryf> #link https://review.openstack.org/#/c/418393/
14:11:41 <_gryf> just a heads up
14:12:05 <cdent> that will be a great thing to have once it happens
14:12:06 <jaypipes> _gryf: ooh, nice. thanks for hopping on that!
14:12:11 <_gryf> this is a BP for providing detailed error info for placement api
14:12:16 <jaypipes> ++
14:12:23 <_gryf> cdent, thanks for the comments
14:12:57 <cdent> you're welcome, more to come soon I'm sure
14:13:01 <_gryf> I feel that BP will take a while to get into a satisfying shape
14:13:07 <cdent> :)
14:13:13 <_gryf> esp on messages
14:13:56 <diga> cdent: jaypipes: about notification BP - https://blueprints.launchpad.net/nova/+spec/placement-notifications
14:13:59 <cdent> several bikesheds will be needed
14:14:04 <diga> I forgot to add it to the agenda
14:14:29 <cdent> #link notification bp: https://blueprints.launchpad.net/nova/+spec/placement-notifications
14:14:31 <diga> I have gone through the versioned objects & now have a good understanding of how they work
14:15:03 <diga> I will start writing spec on it, if I need some help, will ping you on nova IRC
14:15:16 <cdent> awesome
14:15:45 <cdent> ready to move on to bugs?
14:15:52 <diga> :)
14:16:03 <cdent> #topic bugs
14:16:09 <cdent> #link placement tagged bugs: https://bugs.launchpad.net/nova/+bugs?field.tag=placement&orderby=-id&start=0
14:16:12 <cdent> there are a couple of new ones
14:16:40 <jaypipes> diga: rock on.
14:16:43 <cdent> both from mattr
14:16:54 <diga> jaypipes: :)
14:16:55 <jaypipes> mattr doesn't matter.
14:17:42 <jaypipes> cdent: cool on bugs. do you want to assign those or are you looking for volunteers?
14:17:43 <cdent> dark mattr
14:18:03 <bauzas> heh
14:18:08 <cdent> only one needs a volunteer: https://bugs.launchpad.net/nova/+bug/1656075
14:18:08 <openstack> Launchpad bug 1656075 in OpenStack Compute (nova) "DiscoveryFailure when trying to get resource providers from the scheduler report client" [Low,Confirmed]
14:18:14 <cdent> the other is already started
14:18:37 <jaypipes> k
14:19:03 <_gryf> i can take a look at that
14:19:05 <jaypipes> cdent: how about writing a note to the ML asking for a contributor on that bug?
14:19:13 <jaypipes> cdent: or _gryf can take a look :)
14:19:24 <cdent> #action _gryf to work on https://bugs.launchpad.net/nova/+bug/1656075
14:19:24 <openstack> Launchpad bug 1656075 in OpenStack Compute (nova) "DiscoveryFailure when trying to get resource providers from the scheduler report client" [Low,Confirmed]
14:19:57 <rfolco> cdent, jaypipes: what level is that one ? :)
14:20:32 <cdent> the actual code would not be complicated, but there's some concern about how we keep finding new and different keystoneauth1 exceptions being raised
14:20:33 <_gryf> btw, how easy is it to set up the placement service with devstack?
14:20:36 <rfolco> oh _gryf is assigned already... nm
14:20:50 <bauzas> _gryf: it's already by default
14:20:51 <cdent> so there needs to be some inspection to find out what the real ones are
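
One possible shape for the fix to bug 1656075 discussed here, assuming the report client calls placement through a keystoneauth1 session: wrap the calls so the keystoneauth1 failures seen so far (DiscoveryFailure included) degrade to a logged warning instead of propagating. The decorator name and the exact exception list below are illustrative assumptions, not necessarily what the eventual patch will catch.

    import functools
    import logging

    from keystoneauth1 import exceptions as ks_exc

    LOG = logging.getLogger(__name__)


    def safe_placement_call(func):
        """Swallow keystoneauth1 errors raised by a placement call."""
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except (ks_exc.EndpointNotFound,
                    ks_exc.MissingAuthPlugin,
                    ks_exc.Unauthorized,
                    ks_exc.DiscoveryFailure,
                    ks_exc.ConnectFailure):
                # Placement is unreachable or misconfigured right now;
                # callers treat None as "no data" and retry on the next
                # periodic task run.
                LOG.warning('Placement API unavailable, skipping update')
                return None
        return wrapper
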
14:21:06 <_gryf> bauzas, no need for plugin setup or smth?
14:21:11 <bauzas> _gryf: no longer
14:21:16 <_gryf> cool
14:21:30 <bauzas> we defaulted devstack to include the placement service
14:21:36 <_gryf> +1
14:22:01 <bauzas> _gryf: ping me if you experience problems tho
14:22:07 <cdent> any other bugs?
14:22:09 <_gryf> bauzas, sure
14:22:48 <cdent> #topic open discussion
14:23:33 <cdent> edleafe wrote on the agenda asking about: are we going to allow the can_host parameter to be passed in queries to /resource_providers? is it going to be a part of the ResourceProvider object model?
14:24:03 <cdent> that's somewhat related to a topic for a hangout happening later
14:24:13 <jaypipes> indeed.
14:24:39 <jaypipes> we also had discussed changing that field to "shared" or something like that.
14:26:40 <cdent> I can't recall a specific explanation for why a particular field is needed
14:27:35 <_gryf> IIRC can_host indicates that this particular node can roll out VMs
14:27:56 <cdent> _gryf: sure, but doesn't having inventory of VCPU mean the same thing?
14:28:13 <_gryf> well, implicitly
14:28:26 <jaypipes> cdent: it's mostly for the querying of providers of shared resources. we need a way to filter a set of providers that have shared resources from providers that are associated with providers that share resources.
14:28:55 <jaypipes> cdent: the placement API is intended to be used by more than just Nova eventually.
14:29:01 <cdent> of course
14:29:11 <jaypipes> cdent: but yeah, there are other ways to query to get this information.
14:30:04 <_gryf> I can think of a situation where a host which has CPU shouldn't spin out vms
14:30:09 <jaypipes> cdent: the long-term solution for the problem of "where do I send this placement decision?" is to use a mapping table on the Nova side between resource provider and the "processing worker" (ala the service record in Nova)
14:30:17 <_gryf> like specific hosts for HPC
14:31:45 <jaypipes> cdent: to be specific about the querying problem...
14:31:51 <jaypipes> cdent: here is the issue:
14:31:56 <cdent> I suspect what's driving ed's question is avoiding a late stage microversion: "Currently you cannot pass 'can_host=1' as a query parameter, since we only support 'name', 'uuid', 'member_of', and 'resources' as valid params. "
14:32:05 <jaypipes> cdent: if I have a request for DISK_GB of 100
14:32:28 <jaypipes> cdent: and I have 3 hosts, all of which are associated to aggregate A.
14:32:46 <jaypipes> cdent: 2 of the hosts have no local disk. one has local disk (and thus an inventory record of DISK_GB)
14:33:12 <jaypipes> cdent: aggregate A is associated with provider D, which is a shared storage pool, obviously having an inventory record of DISK_GB
14:34:15 <jaypipes> cdent: now, to find the providers that have VCPU, MEMORY_MB and DISK_GB, I would get all the provider records that have VCPU, MEMORY_MB and DISK_GB, along with the provider record for provider D which has the DISK_GB pool.
14:35:08 <cdent> yeah, what you're describing here is pretty much the center of the question in that email thread
14:35:13 <jaypipes> cdent: the potential is there to return the two compute hosts that have no local disk, with the compute host that has local disk listed as the DISK_GB provider for those non-local-disk compute hosts.
14:35:28 <cdent> and it is centered around what 'local_gb' currently means in the resource tracker
14:35:41 <jaypipes> cdent: eh, partly, yes.
14:37:27 <jaypipes> cdent: the way I'm currently doing the query in SQL -- you can do it by doing an intersection of providers that have *some* of the requested resources minus the set that has *all* of the resources, combined with the providers of only shared resources... but it's messy :)
14:37:45 <jaypipes> cdent: having that can_host field simplifies things on the querying front.
14:38:01 <jaypipes> cdent: but... I don't want to expose that out of the main API if I don't have to.
14:38:29 <jaypipes> cdent: basically, I'd like to be able to deduce "can_host" from looking at an inventory collection and the set of aggregates a provider is associated with.
14:38:30 <cdent> if can_host matters, we need to start actually using it then, because last I checked the resource tracker doesn't?
14:38:34 <cdent> ah
14:38:37 <cdent> i see
14:38:38 <jaypipes> right.
14:39:10 <jaypipes> the resource tracker doesn't yet because the resource tracker hasn't yet been responsible for adding shared provider inventory.
14:39:57 <jaypipes> and it may not be -- an external script, for instance, might be what does that. but the RT will need to at least *know* which shared providers that compute node is associated with in order to set allocation records against it.
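
To make the shared-storage scenario above concrete, here is a toy sketch (invented data shapes, not placement's schema or the actual SQL) of deducing "can_host" from a provider's inventory plus its aggregates, and of using that to keep the local-disk compute from being handed out as the DISK_GB provider for its diskless neighbours.

    # request: VCPU, MEMORY_MB and DISK_GB, as in jaypipes' example
    COMPUTE_CLASSES = {'VCPU', 'MEMORY_MB'}

    providers = {
        # name: (inventory resource classes, aggregates)
        'compute1': ({'VCPU', 'MEMORY_MB'}, {'aggA'}),             # no local disk
        'compute2': ({'VCPU', 'MEMORY_MB'}, {'aggA'}),             # no local disk
        'compute3': ({'VCPU', 'MEMORY_MB', 'DISK_GB'}, {'aggA'}),  # local disk
        'shared_D': ({'DISK_GB'}, {'aggA'}),                       # shared pool
    }


    def can_host(inv_classes):
        # Deduced rather than stored: a provider exposing compute-style
        # inventory is a host; one that only shares DISK_GB is not.
        return bool(inv_classes & COMPUTE_CLASSES)


    def candidates(request):
        """Pair each host with a provider for every requested class."""
        results = []
        for name, (inv, aggs) in providers.items():
            if not can_host(inv):
                continue
            allocs = {}
            for rc in request:
                if rc in inv:
                    allocs[rc] = name
                    continue
                # otherwise look for a sharing provider in a common aggregate
                for oname, (oinv, oaggs) in providers.items():
                    if not can_host(oinv) and rc in oinv and aggs & oaggs:
                        allocs[rc] = oname
                        break
            if len(allocs) == len(request):
                results.append((name, allocs))
        return results


    print(candidates({'VCPU', 'MEMORY_MB', 'DISK_GB'}))
    # compute1/compute2 get DISK_GB from shared_D; compute3 uses its own
    # local disk and is never returned as the disk provider for the others.
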
14:41:48 <cdent> to clarify: how much of this matters before the end of ocata?
14:41:48 <jaypipes> anyway, seems I've lost everyone :)
14:42:46 <cdent> Well, what I think needs to happen (because I always think this needs to happen) is more writing in email (perhaps in response to the update # messages).
14:43:07 <jaypipes> cdent: I would like to see shared resource inventory records being written in Ocata, but it's a stretch. I think if Ocata can focus on scheduler integration for VCPU, MEMORY_MB and DISK_GB as assumed local disk, I'd be happy.
14:43:27 <jaypipes> cdent: and then in Pike, integrate scheduler decisions for shared resources (and nested stuff, etc)
14:43:45 <jaypipes> cdent: agreed on communication to ML.
14:43:48 <cdent> so, for the non-ironic case, my email assumptions are effectively correct?
14:43:53 <jaypipes> cdent: will try to dump brain on that this week.
14:44:07 <jaypipes> cdent: need to read your email assumptions :(
14:44:09 <cdent> (e.g., the meaning of local_gb)
14:44:16 * cdent shakes his tiny fist ;)
14:44:31 <cdent> you'll probably want to read that before the hangout
14:44:35 * jaypipes hammers cdent with his YUGE palm.
14:44:45 * cdent has already built a wall that jaypipes paid for
14:44:59 * jaypipes has made bauzas pay for it (he doesn't know yet)
14:45:00 <cdent> your illegal palm cannot reach my pristine homeland face
14:45:16 * cdent sighs
14:45:22 <jaypipes> ok, enough fun.
14:45:28 <cdent> anyone have any other open topics?
14:45:31 <jaypipes> another thing I wanted to bring up...
14:45:32 <jaypipes> yeah
14:45:59 <jaypipes> I wanted to see if there's any interest from our more UX-centric friends in getting started on a Horizon placement plugin.
14:46:14 <jaypipes> readonly for now.
14:46:47 <_gryf> does anyone use horizon?
14:46:48 <cdent> as in for resource viewing?
14:46:51 * _gryf runz
14:47:18 <mriedem> listing providers and their inventories/allocations could be nice
14:47:40 <mriedem> or an admin being able to filter those by his list of hosts
14:48:00 <jaypipes> oh, look, it's a mriedem.
14:48:05 <jaypipes> welcome, friend.
14:48:16 <mriedem> jaypipes: i assume that anything for that will need to be owned by our team in a plugin outside of horizon
14:48:20 <jaypipes> _gryf: yes, many folks use it :)
14:48:24 <mriedem> and i don't know anyone that works on horizon
14:48:31 * _gryf was just kidding
14:48:40 <jaypipes> mriedem: right, me neither, that's why I was asking :)
14:48:50 <mriedem> we can throw it into the ptg etherpad of doom
14:48:56 <jaypipes> hehe
14:49:16 <_gryf> at least a mail on the ML would be helpful
14:49:27 <jaypipes> _gryf: k, will send.
14:49:36 * jaypipes has many communication action items..
14:49:51 <cdent> #action jaypipes read and write a fair bit of email
14:50:11 <cdent> anyone or anything else?
14:50:58 <cdent> guess not
14:51:22 <cdent> thanks everyone for showing up and watching the cdent and jaypipes show, next week we'll have punch and judy
14:51:38 <jaypipes> ciao
14:51:38 <_gryf> :D
14:51:49 <cdent> #endmeeting