#openstack-meeting log

15:00:35 <n0ano> #startmeeting gantt
15:00:37 <openstack> Meeting started Tue Oct 14 15:00:35 2014 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:40 <openstack> The meeting name has been set to 'gantt'
15:00:47 <n0ano> anyone here to talk about the scheduler?
15:00:49 <bauzas> \o
15:00:49 <mspreitz> yes
15:01:16 <yjiang51> yes
15:01:33 <jgallard> hi
15:01:49 <edleafe> o/
15:02:33 <bauzas> roll call done ? :)
15:02:38 <n0ano> #topic forklift status
15:02:52 <n0ano> bauzas, this is mainly you, anything to report?
15:03:04 <bauzas> well
15:03:18 <bauzas> there are many things to discuss but about BPs
15:03:32 <bauzas> I can just cover my current work
15:04:02 <yjiang51> bauzas: is https://etherpad.openstack.org/p/nova-scheduler-refactoring still maintained, or should we go to BPs?
15:04:16 <bauzas> #link https://review.openstack.org/126895 is the spec about splitting ComputeNode
15:04:25 <bauzas> yjiang51: right, this one is a good one
15:04:55 <PaulMurray> hi -sorry I'm late
15:04:57 <yjiang51> bauzas: thanks.
15:05:13 <bauzas> PaulMurray also has some work related to ComputeNode too
15:05:37 <bauzas> #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/detach-service-from-computenode,n,z Implementation of the spec above
15:05:49 <bauzas> that's it mainly for me about the current work
15:06:06 <PaulMurray> bauzas, did you cover pci_stats?
15:06:10 <bauzas> we have many things to discuss yet about BPs again
15:06:27 <bauzas> PaulMurray: nope, yjiang51 do you plan to fix that ?
15:06:48 <bauzas> yjiang51: PaulMurray was speaking about pci_devices field missing in ComputeNode object
15:07:09 <yjiang51> bauzas: I had patch for PCIstats, but since it missed the J release, I didn't continue that. I will revoke it.
15:07:25 <bauzas> yjiang51: k
15:07:28 <n0ano> bauzas, my real question is do you need any help on the current work, otherwise we can go into the BPs
15:07:43 <bauzas> n0ano: we can go to the BPs, far more important
15:07:51 <PaulMurray> yjiang51, I can help you if you need?
15:08:01 <yjiang51> PaulMurray: Cool.
15:08:02 <n0ano> in that case...
15:08:06 <yjiang51> PaulMurray: thanks.
15:08:08 <n0ano> #topic Kilo BPs
15:08:29 <PaulMurray> yjiang51, I'll speak to you outside this meeting
15:08:56 <bauzas> #link https://etherpad.openstack.org/p/nova-scheduler-refactoring Here is the basic discussion about Scheduler refactoring
15:08:58 <yjiang51> PaulMurray: sure.
15:09:22 <bauzas> there are BPs attached to that created by jaypipes
15:09:56 <bauzas> #link https://review.openstack.org/#/c/127609/ Add resource objects model
15:10:21 <bauzas> #link https://review.openstack.org/#/c/127610/5 Add request_spec object model
15:10:49 <bauzas> #link https://review.openstack.org/#/c/127612 Change select_destinations to use request_spec object model
15:11:21 <bauzas> all of them have previously been discussed here, so I expect no clear disagreement on the problem description
15:11:33 <bauzas> I tho have concerns on the resource models themselves
15:11:46 <bauzas> jaypipes: around ?
15:12:37 <n0ano> I come back to, if we implement these 3 BPs do we think we'll be ready to split out the scheduler?
15:12:39 <PaulMurray> bauzas, I have not talked much about extensible resource tracking recently
15:13:10 <PaulMurray> bauzas, I need to fit that into this plan
15:13:31 <bauzas> n0ano: that's my main worry, it will still need to change update_resource_stats to make use of these new resource models
15:13:33 <PaulMurray> bauzas, I think it fits ok - but do need to make sure I understand
15:13:35 <bauzas> so that's a 4th BP
15:13:51 <bauzas> PaulMurray: I'm a bit concerned about the resource modeling proposed by jaypipes
15:14:00 <jaypipes> hey guys, sorry, here now...
15:14:08 <PaulMurray> hi jaypipes
15:14:24 <n0ano> I want to try and focus on `clean up then split', we've been constantly changing the goal posts up to now so I want to focus
15:14:51 <PaulMurray> jaypipes, bauzas my main problem is I think I need a white board and a locked room to drive to conclusions
15:14:53 <jaypipes> bauzas: I don't think you read my blueprints all that carefully :)
15:14:53 <bauzas> n0ano: +1000
15:15:13 <n0ano> jaypipes, can you explain?
15:15:19 <bauzas> jaypipes: I mostly agree with the last 2 of the series :)
15:15:19 <jaypipes> bauzas: those are the only blueprints that would (IMHO) need to be done before a split is possible.
15:15:49 <n0ano> jaypipes, which do you mean by `those`?
15:15:58 <jaypipes> bauzas: the resource object models blueprint specifically does *not* call for any change to the RPC API layer or the call structure of update_resource_stats
15:16:15 <jaypipes> n0ano: the three above, and bauzas's compute node detach from service BP
15:16:18 <bauzas> jaypipes: a split is only possible if filters are looking at those resource models instead of querying other Nova components
15:16:33 <jaypipes> bauzas: no, that can all be done after a split.
15:17:07 <jaypipes> bauzas: the blueprint specifically states that the resource models can be constructed on the nova-scheduler side by looking at the existing data model in the compute node.
15:17:15 <bauzas> jaypipes: I disagree, we regularly enforced the need of having something extensible enough for fitting the isolate-sched-db BP
15:17:51 <bauzas> jaypipes: ok, I take your point, I will cover this 1st BP very carefully
15:17:52 <n0ano> compute node detach from service BP = https://review.openstack.org/#/c/89893/
15:18:01 <bauzas> n0ano: nope
15:18:35 <jaypipes> bauzas: resources don't need to be extensible. amounts of resources and usages need to be able to be represented using a singular interface, but the way that data is stored does not *need* to change.
15:18:45 <bauzas> n0ano: https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
15:18:59 <jaypipes> bauzas: it would be nice if the way the data was stored was cleaned up, but it's not necessary before a split of the scheduler, IMO
15:19:15 <bauzas> jaypipes: ok, will read carefully
15:20:03 <jaypipes> bauzas: https://review.openstack.org/#/c/127609/4/specs/kilo/approved/resource-objects.rst <-- line 176
15:20:25 <n0ano> OK, just to be specific, we need these 3 patches and 1 BP before we can do the split:
15:20:29 <n0ano> #link https://review.openstack.org/#/c/127609/
15:20:29 <n0ano> #link https://review.openstack.org/#/c/127610/5
15:20:29 <n0ano> #link https://review.openstack.org/#/c/127612
15:20:29 <n0ano> #link https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
15:21:03 <bauzas> jaypipes: yeah saw that, just wonders how we can safely store information about aggregates for example
15:21:56 <jaypipes> bauzas: aggregates are not a resource.
15:22:01 <bauzas> jaypipes: but I need to further dig into your proposal and test it against our usecases
15:22:35 <bauzas> jaypipes: but we need to find some way for filters to look into HostState in order to filter on aggregates or AZs for example
15:22:54 <jaypipes> bauzas: that has nothing to do with resource object models, though. none of that changes.
15:22:57 <bauzas> jaypipes: whatever an AZ is
15:22:59 <yjiang51> bauzas: jaypipes: possibly host aggregate can be solely in scheduler, not in nova , in future.
15:23:27 <bauzas> jaypipes: agreed, hence isolate-scheduler-db BP
15:23:53 <n0ano> HostState should just be part of the scheduler, unrelated to the resource tracker (restating what jaypipes said)
15:24:04 <bauzas> jaypipes: the former was using ERT for achieving this, we need to see how resource model BP can fit this
15:24:24 <bauzas> n0ano: HostState is not related to RT already
15:24:43 <jaypipes> bauzas: if ERT was reasonable, then the numa_topology field would not have been needed to be added to the compute_nodes table.
15:24:48 <n0ano> I think we're in violent agreement then
15:25:29 <jaypipes> bauzas: but ERT didn't solve any problem, so the folks that added the NUMA topology resources created a separate field numa_topology in the compute_nodes table to store information about the NUMA cells used on the node.
15:25:37 <bauzas> jaypipes: I'm just saying that the resource-model BP is necessary for updating HostState with Aggregate related info :)
15:25:38 <PaulMurray> jaypipes, ERT did not do the resource object - and it should
15:26:07 <jaypipes> bauzas: with the resource object models, we are simply standardizing the way that resource amounts are *compared*, and punting on the storage of those data points.
15:26:17 <jaypipes> bauzas: as that can be cleaned up later...
15:26:20 <PaulMurray> jaypipes, I am behind the resource models
15:26:36 <jaypipes> PaulMurray: cool, thx :)
15:26:40 <PaulMurray> jaypipes, I think you know I was pushed in all sorts of directions
15:26:47 <jaypipes> PaulMurray: yes, I do.
15:27:09 <PaulMurray> jaypipes, and just had to end up without structuring the model while getting in a way to be extensible
15:27:12 <jaypipes> PaulMurray: and I'm not blaming anyone or anything. I'm just presenting a view of how to iterate towards a more consistent way of comparing resource amounts/usages
15:27:55 <PaulMurray> jaypipes, right - I'm basically on board with this - its all about details
15:27:58 <bauzas> jaypipes: I think we all think it is required
15:28:06 <jaypipes> bauzas:  I think I may understand a problem you have...
15:28:15 <PaulMurray> and I think we are now able to get beyond the problems I couldn't a year ago
15:28:20 <bauzas> jaypipes: because we decided to work on that first before splitting
15:28:26 <jaypipes> bauzas: so, because I edited all these blueprints in the same topic branch in git, it looks like they are dependent on each other.
15:28:40 <jaypipes> bauzas: but the second two are not dependent on resource-object-models BP
15:28:46 <bauzas> jaypipes: yeah, that's helpful, thanks for this
15:28:52 <bauzas> jaypipes: agreed
15:28:55 <jaypipes> bauzas: i.e. work on request-spec-object can and should begin now
15:28:58 <bauzas> jaypipes: I can cover those
15:29:05 <jaypipes> bauzas: w/o waiting for the resource-object-models work.
15:29:14 <jaypipes> those two BPs are entirely independent of each other.
15:29:37 <jaypipes> bauzas: but of course, the sched-select-destinations-use-request-spec-object depends on the completion of request-spec-object :)
15:29:42 <bauzas> jaypipes: I just want to explain to you our main concern about these heterogenous resources that we need to store and filter upon these
15:29:56 <jaypipes> bauzas: I'm listening.
15:29:58 <bauzas> jaypipes: agreed, can I borrow these two ?
15:30:08 <jaypipes> bauzas: please do!
15:30:11 <bauzas> jaypipes: cool
15:30:14 <bauzas> jaypipes: so
15:30:27 <bauzas> jaypipes: back to the problem I mentioned
15:30:27 <jaypipes> bauzas: did you notice I put you as a contributor on them?
15:30:31 <bauzas> jaypipes: yup
15:30:34 <jaypipes> k
15:30:38 <bauzas> jaypipes: at least for one
15:30:40 <bauzas> anyway
15:30:57 <bauzas> jaypipes: so, let me explain to you the problem we have with the current filter design
15:31:08 <jaypipes> bauzas: you are listed as a contributor on the second two patches.
15:31:18 <bauzas> k
15:31:37 <bauzas> so, filters are looking at HostState for deciding by comparing with request_spec
15:32:22 <bauzas> jaypipes: but for some specific filters, they directly call other Nova objects (like aggregates) for comparing with request_spec
15:32:31 <bauzas> jaypipes: ie. AvailabilityZoneFilter
15:32:36 <bauzas> eg. sorry
15:32:49 <jaypipes> bauzas: the only one that does that does it to get the allocation overcommit ratios for the aggregate object.
15:33:18 <jaypipes> bauzas: which was the purpose of the allocation-ratios-to-resource-tracker blueprint from last cycle...
15:33:22 <bauzas> jaypipes: nope, that's not about the ratios
15:33:35 <jaypipes> bauzas: well, capabilities plus overcommit ratios.
15:33:48 <bauzas> jaypipes: AZFilter is fetching the metadata associated with the aggregates the host belongs to
15:33:55 <jaypipes> right. capabilities.
15:33:59 <bauzas> jaypipes: and compares it with the AZ hint passed
15:34:24 <bauzas> jaypipes: so, here, that's an explicit call to the Aggregate DB (or Object)
15:34:29 <jaypipes> yes.
15:34:49 <bauzas> jaypipes: again, it should look into HostState instead of querying Aggregate
15:35:07 <jaypipes> bauzas: HostState still needs to get the information about aggregates from somewhere.
15:35:14 <bauzas> jaypipes: exactly !
15:35:28 <bauzas> jaypipes: that's the purpose of isolate-scheduler-db BP
15:35:29 <PaulMurray> bauzas, jaypipes have you looked at the service groups example
15:35:29 <jaypipes> but what does this have to do with any of the blueprints I've proposed?
15:35:37 <jaypipes> PaulMurray: yes.
15:35:49 <jaypipes> PaulMurray: I think you meant server-groups, not service-groups, right?
15:36:01 <PaulMurray> jaypipes, yes, sorry
15:36:05 <jaypipes> no worries. :)
15:36:06 <bauzas> jaypipes: it comes to your BPs because we need to find some way to update the scheduler with info about aggregates
15:36:38 <bauzas> jaypipes: the former proposal was using ERT for updating stats field with Aggregates info
15:36:39 <jaypipes> bauzas: but that doesn't have anything to do with any of the blueprints I have proposed...
15:36:46 <yjiang51> bauzas: is there anyone other than the scheduler/filters using the aggregate information?
15:36:53 <PaulMurray> bauzas, it looks up the server information and adds it to the request spec before calling the filters
15:37:06 <jaypipes> bauzas: Aggregates are not resources. They are collections of providers of resources.
15:37:39 <n0ano> jaypipes, indeed, they are closer to a host state than they are to a resource
15:38:09 <jaypipes> n0ano: correct.
15:38:13 <bauzas> jaypipes: still, how do you see how the Scheduler should be notified about these informations ?
15:38:38 <bauzas> jaypipes: so that the filters could get that info ?
15:38:43 <yjiang51> bauzas: why not keep these information on scheduler?
15:38:53 <jaypipes> bauzas: I think the scheduler should *own* the information about aggregates. not "be notified about them"
15:39:03 <n0ano> bauzas, good question, I would say similar to the way host state is updated but that will expand that particular API
15:39:06 <yjiang51> jaypipes: +1
15:39:34 <jaypipes> bauzas: but in the meantime, just add a new scheduler RPC API method update_aggregate()
15:39:45 <jaypipes> bauzas: and notify the scheduler about changes by calling that RPC API.
15:39:47 <bauzas> yjiang51: jaypipes: what about if we need to store information about a Neutron router's list of ports ?
15:39:50 <jaypipes> bauzas: problem solved :P
15:39:54 <PaulMurray> bauzas, what is the problem you address by having the compute node send aggregate info to the scheduler
15:40:26 <bauzas> PaulMurray: my concern is how to store it
15:40:32 <yjiang51> bauzas: is the list of ports also aggregate ?
15:40:46 <bauzas> yjiang51: nope, that's another example of a possible Gantt filter
15:40:53 <jaypipes> bauzas: provide that information in the request_spec, or better yet, have the scheduler request that information as-needed from Neutron. That said, I have no idea what use the scheduler has with a list of ports from a Neutron router.
15:40:54 <n0ano> PaulMurray, once we split the scheduler out to a separate service the aggregate info won't be available if it's not stored in the scheduler
15:41:24 <yjiang51> bauzas: port is resource, right?
15:41:28 <n0ano> or at least not easily available
15:41:29 <jaypipes> no.
15:41:34 <PaulMurray> n0ano, bauzas I see, but it doesn't belong to the compute node either
15:41:48 <jaypipes> yjiang51: a VF on an SR-IOV PF is a resource, but a port is not.
15:42:22 <jaypipes> yjiang51: unless you are referring to a physical port, which has some limited number on a device, in which case, yes, a port is a resource ;)
15:42:25 <n0ano> jaypipes, I would have to disagree, a port and a VF are pretty much the same, if one is a resource then the other is
15:42:45 <jaypipes> n0ano: see above :) depends on whether there is a total capacity of the "thing" :)
15:42:45 <n0ano> yes, I was equating a port to a physical device
15:42:56 <jaypipes> n0ano: port in neutron, though... that's not a physical device.
15:43:08 <jaypipes> n0ano: which is why I like to be specific about these things :)
15:43:32 <n0ano> jaypipes, hence the confusion, do we care about Neutron ports for the moment?
15:43:52 <bauzas> n0ano: I was mentioning this for explaining that we need to be generic
15:43:58 <yjiang51> jaypipes: you mean an unlimited resource is not a resource? Hmm, that makes sense also, since it means no need to be managed at all :)
15:44:04 <bauzas> and not rely on Nova existing DB or info
15:44:17 <jaypipes> n0ano: we only care about the PCI ports, but those are handled by the resource_tracker separately.
15:44:31 <jaypipes> yjiang51: right, zactly.
15:44:38 <bauzas> anyway, it seems that time is running on
15:44:41 <n0ano> jaypipes, I'm pretty sure I'm +1 on that
15:45:19 <n0ano> bauzas, yeah, let's try and squeeze some other items in today
15:45:26 <jaypipes> bauzas: so... may I work on the resource-object-model, you work on the request-spec-object, and we continue to discuss ways in which we communicate aggregate changes to the scheduler? Perhaps PaulMurray can work on that last one?
15:45:39 <bauzas> jaypipes: sounds a good plan to me
15:45:41 <n0ano> jaypipes, WFM
15:45:48 <jaypipes> PaulMurray: work for you?
15:46:05 <PaulMurray> oh right, it works for everyone else eh? Yes, fine by me :)
15:46:08 <bauzas> jaypipes: PaulMurray: https://review.openstack.org/89893
15:46:19 <bauzas> ^ above is the spec about the problem I mentioned
15:46:27 <n0ano> anyway...
15:46:33 <n0ano> #topic Kilo sessions
15:46:41 <bauzas> so
15:46:52 <bauzas> 2 possible sessions
15:47:04 <n0ano> I think we will have a cross project session, the open question is whether we'll have a separate session just on the current Gantt split
15:47:05 <bauzas> 1/ in cross-project track on Tuesday
15:47:31 <bauzas> https://etherpad.openstack.org/p/kilo-crossproject-summit-topics L93 (comments welcome)
15:48:20 <bauzas> another one is expected to come in the nova summit track
15:48:41 <bauzas> https://etherpad.openstack.org/p/kilo-nova-summit-topics
15:48:58 <n0ano> I'm still a little unclear on what we can do to push for these 2 sessions
15:49:40 <doron> just a comment from my side. The cross project is needed as other projects want to integrate and possibly help with Gantt. So this will be more of an invitation and asking for general requirements.
15:50:13 <doron> The nova one is obviously more into the inner works of the scheduler split work.
15:50:34 <n0ano> doron, hoping for some specifics from other project on what they want from a scheduler and some API guidance
15:51:21 <doron> n0ano: I'm getting feedback from folks in cinder and other projects.
15:51:27 <bauzas> mikal seems to have already triaged some sessions in the Nova etherpad
15:51:53 <doron> n0ano: they would like to know what to expect and what's needed from their side.
15:52:33 <n0ano> doron, what I said, what capabilities are they looking for and what kind of APIs do they want
15:53:15 <bauzas> doron: n0ano: hence for example a discussion about whether we accept other projects' polling like jaypipes suggested or only stats notifications
15:53:32 <doron> n0ano: basically scheduling services. I'd assume they'll write filter(s) and weight modules
15:53:33 <bauzas> the latter having my preference
15:53:44 <doron> n0ano: so for example,
15:54:05 <doron> neutron may be able to schedule a service VM with a mitation of X hops from a network device.
15:54:24 <doron> (limitation)
15:54:38 <doron> The same use case is valid for cinder.
15:54:45 <doron> from a storage backend.
15:54:47 <n0ano> doron, the specifics on what `scheduling services' they need is what we're looking for, your Neutron example is perfect
15:55:13 <doron> n0ano: this is exactly why we should have this all-hands session.
15:55:23 <n0ano> doron, +1
15:55:56 <bauzas> back to the Nova session, I don't know if mikal or johnthetubaguy placed the good candidates but this person asked for blueprints
15:56:20 <johnthetubaguy> bauzas: that's my fault, I was asking for specs
15:56:32 <n0ano> well, we have BPs, does johnthetubaguy need pointers?
15:56:41 <bauzas> so, I will rebase my isolate-sched-db BP on Kilo and jaypipes's BPs will be discussed too
15:56:43 <johnthetubaguy> bauzas: we mentioned this on the ML before a few times, we need a spec for summit sessions
15:56:51 <bauzas> johnthetubaguy: agreed
15:57:09 <johnthetubaguy> n0ano: ideally the specs would be listed in the etherpad
15:57:11 <bauzas> johnthetubaguy: jaypipes covers most of the work, and isolate-scheduler-db BP will be reproposed for Kilo
15:57:15 <bauzas> johnthetubaguy: so will do
15:57:30 <johnthetubaguy> bauzas: hoping to get most of those approved before the summit
15:57:42 <n0ano> johnthetubaguy, tnx, that would be great
15:57:44 <johnthetubaguy> bauzas: seems like you have a good direction that just needs executing
15:57:52 <bauzas> johnthetubaguy: hope so too
15:57:56 <johnthetubaguy> what bits is there no agreement on?
15:58:13 <johnthetubaguy> I noticed the extensible resource tracker is getting a bit of a battering, but other than that
15:58:36 <n0ano> johnthetubaguy, among this group on IRC we're pretty much in agreement
15:58:47 <bauzas> johnthetubaguy: well, I don't see real problems except that isolate-sched-db implementation we need to cover in between jaypipes, PaulMurray and me
15:59:13 <bauzas> johnthetubaguy: I would really appreciate if Nova team could help us on guidance there
15:59:17 <johnthetubaguy> bauzas: honestly, after the resource tracker work, it should be clear what is needed
15:59:23 <n0ano> approaching the top of the hour, is there any open item anyone wants to raise in the last few minutes?
15:59:25 <bauzas> johnthetubaguy: we need feedback mostly
15:59:39 <johnthetubaguy> bauzas: cool, lets get the specs up, and see what we can do
16:00:03 <bauzas> on a side note, will be on PTO next week and beginning of the week after
16:00:10 <johnthetubaguy> PS, you folks are part of the nova team, so do review all the specs, seeing +1s from you folks really helps drive it forward
16:00:19 <bauzas> so, won't be present for the next 2 Gantt meetings
16:00:38 <bauzas> johnthetubaguy: that's my duty :)
16:00:55 <n0ano> bauzas, me too (business trip next week and vacation in Normandy the week after) we should cancel the next 3 meetings and do things via email
16:01:11 <n0ano> tnx everyone
16:01:14 <n0ano> #endmeeting