15:00:35 #startmeeting gantt
15:00:37 Meeting started Tue Oct 14 15:00:35 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:38 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:40 The meeting name has been set to 'gantt'
15:00:47 anyone here to talk about the scheduler?
15:00:49 \o
15:00:49 yes
15:01:16 yes
15:01:33 hi
15:01:49 o/
15:02:33 roll call done? :)
15:02:38 #topic forklift status
15:02:52 bauzas, this is mainly you, anything to report?
15:03:04 well
15:03:18 there are many things to discuss, but about BPs
15:03:32 I can just cover my current work
15:04:02 bauzas: is https://etherpad.openstack.org/p/nova-scheduler-refactoring still maintained, or should we go to BPs?
15:04:16 #link https://review.openstack.org/126895 is the spec about splitting ComputeNode
15:04:25 yjiang51: right, this one is a good one
15:04:55 hi - sorry I'm late
15:04:57 bauzas: thanks.
15:05:13 PaulMurray also has some work related to ComputeNode too
15:05:37 #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/detach-service-from-computenode,n,z Implementation of the spec above
15:05:49 that's mainly it for me about the current work
15:06:06 bauzas, did you cover pci_stats?
15:06:10 we still have many things to discuss about BPs
15:06:27 PaulMurray: nope, yjiang51 do you plan to fix that?
15:06:48 yjiang51: PaulMurray was speaking about the pci_devices field missing in the ComputeNode object
15:07:09 bauzas: I had a patch for PCI stats, but since it missed the J release, I didn't continue it. I will revoke it.
15:07:25 yjiang51: k
15:07:28 bauzas, my real question is: do you need any help on the current work? Otherwise we can go into the BPs
15:07:43 n0ano: we can go to the BPs, far more importantly
15:07:51 yjiang51, I can help you if you need?
15:08:01 PaulMurray: Cool.
15:08:02 in that case...
15:08:06 PaulMurray: thanks.
15:08:08 #topic Kilo BPs
15:08:29 yjiang51, I'll speak to you outside this meeting
15:08:56 #link https://etherpad.openstack.org/p/nova-scheduler-refactoring Here is the basic discussion about the scheduler refactoring
15:08:58 PaulMurray: sure.
15:09:22 there are BPs attached to that, created by jaypipes
15:09:56 #link https://review.openstack.org/#/c/127609/ Add resource objects model
15:10:21 #link https://review.openstack.org/#/c/127610/5 Add request_spec object model
15:10:49 #link https://review.openstack.org/#/c/127612 Change select_destinations to use request_spec object model
15:11:21 all of them have previously been discussed here, so I expect no clear disagreement on the problem description
15:11:33 I do though have concerns about the resource models themselves
15:11:46 jaypipes: around?
15:12:37 I come back to: if we implement these 3 BPs, do we think we'll be ready to split out the scheduler?
15:12:39 bauzas, I have not talked much about extensible resource tracking recently
15:13:10 bauzas, I need to fit that into this plan
15:13:31 n0ano: that's my main worry, it will still need a change to update_resource_stats to make use of these new resource models
15:13:33 bauzas, I think it fits OK - but I do need to make sure I understand
15:13:35 so that's a 4th BP
15:13:51 PaulMurray: I'm a bit concerned about the resource modeling proposed by jaypipes
15:14:00 hey guys, sorry, here now...
15:14:08 hi jaypipes
15:14:24 I want to try and focus on `clean up then split'; we've been constantly moving the goal posts up to now, so I want to focus
15:14:51 jaypipes, bauzas my main problem is I think I need a whiteboard and a locked room to drive to conclusions
15:14:53 bauzas: I don't think you read my blueprints all that carefully :)
15:14:53 n0ano: +1000
15:15:13 jaypipes, can you explain?
15:15:19 jaypipes: I mostly agree with the last 2 of the series :)
15:15:19 bauzas: those are the only blueprints that would (IMHO) need to be done before a split is possible.
15:15:49 jaypipes, which do you mean by `those'?
15:15:58 bauzas: the resource object models blueprint specifically does *not* call for any change to the RPC API layer or the call structure of update_resource_stats
15:16:15 n0ano: the three above, and bauzas's compute node detach from service BP
15:16:18 jaypipes: a split is only possible if filters are looking at those resource models instead of querying other Nova components
15:16:33 bauzas: no, that can all be done after a split.
15:17:07 bauzas: the blueprint specifically states that the resource models can be constructed on the nova-scheduler side by looking at the existing data model in the compute node.
15:17:15 jaypipes: I disagree, we regularly stressed the need for something extensible enough to fit the isolate-sched-db BP
15:17:51 jaypipes: ok, I take your point, I will cover this 1st BP very carefully
15:17:52 compute node detach from service BP = https://review.openstack.org/#/c/89893/
15:18:01 n0ano: nope
15:18:35 bauzas: resources don't need to be extensible. amounts of resources and usages need to be able to be represented using a singular interface, but the way that data is stored does not *need* to change.
15:18:45 n0ano: https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
15:18:59 bauzas: it would be nice if the way the data was stored was cleaned up, but it's not necessary before a split of the scheduler, IMO
15:19:15 jaypipes: ok, will read carefully
15:20:03 bauzas: https://review.openstack.org/#/c/127609/4/specs/kilo/approved/resource-objects.rst <-- line 176
15:20:25 OK, just to be specific, we need these 3 patches and 1 BP before we can do the split:
15:20:29 #link https://review.openstack.org/#/c/127609/
15:20:29 #link https://review.openstack.org/#/c/127610/5
15:20:29 #link https://review.openstack.org/#/c/127612
15:20:29 #link https://blueprints.launchpad.net/nova/+spec/detach-service-from-computenode
15:21:03 jaypipes: yeah, saw that, just wondering how we can safely store information about aggregates, for example
15:21:56 bauzas: aggregates are not a resource.
15:22:01 jaypipes: but I need to further dig into your proposal and test it against our use cases
15:22:35 jaypipes: but we need to find some way for filters to look into HostState in order to filter on aggregates or AZs, for example
15:22:54 bauzas: that has nothing to do with resource object models, though. none of that changes.
15:22:57 jaypipes: whatever an AZ is
15:22:59 bauzas: jaypipes: possibly host aggregates can live solely in the scheduler, not in nova, in the future.
15:23:27 jaypipes: agreed, hence the isolate-scheduler-db BP
15:23:53 HostState should just be part of the scheduler, unrelated to the resource tracker (restating what jaypipes said)
15:24:04 jaypipes: the former was using ERT for achieving this, we need to see how the resource model BP can fit this
15:24:24 n0ano: HostState is not related to the RT already
15:24:43 bauzas: if ERT were reasonable, then the numa_topology field would not have needed to be added to the compute_nodes table.
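[Editor's note] jaypipes's point above — that the resource object models standardize how resource amounts are *compared* while punting on how they are stored — can be sketched roughly as follows. This is a hypothetical illustration only; the names (Resource, can_fit) are invented here and do not come from the spec:

```python
# Hypothetical sketch of a uniform interface for comparing resource
# amounts/usages. All names are illustrative, not from the spec.
from dataclasses import dataclass

@dataclass
class Resource:
    name: str           # e.g. 'vcpu', 'memory_mb', 'numa_cells'
    total: float        # capacity reported by the compute node
    used: float         # current usage
    ratio: float = 1.0  # overcommit (allocation) ratio

    @property
    def available(self) -> float:
        return self.total * self.ratio - self.used

    def can_fit(self, requested: float) -> bool:
        # One comparison path for every resource type; how the data is
        # stored (compute_nodes columns, numa_topology blobs, ...) is
        # deliberately out of scope here.
        return requested <= self.available

vcpu = Resource('vcpu', total=16, used=10, ratio=2.0)
print(vcpu.available, vcpu.can_fit(20))
```

The idea is that a filter only ever calls something like can_fit(), whether the resource is vCPUs or NUMA cells, which is the "singular interface" jaypipes describes.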
15:24:48 I think we're in violent agreement then
15:25:29 bauzas: but ERT didn't solve any problem, so the folks that added the NUMA topology resources created a separate field, numa_topology, in the compute_nodes table to store information about the NUMA cells used on the node.
15:25:37 jaypipes: I'm just saying that the resource-model BP is necessary for updating HostState with Aggregate-related info :)
15:25:38 jaypipes, ERT did not do the resource object - and it should
15:26:07 bauzas: with the resource object models, we are simply standardizing the way that resource amounts are *compared*, and punting on the storage of those data points.
15:26:17 bauzas: as that can be cleaned up later...
15:26:20 jaypipes, I am behind the resource models
15:26:36 PaulMurray: cool, thx :)
15:26:40 jaypipes, I think you know I was pushed in all sorts of directions
15:26:47 PaulMurray: yes, I do.
15:27:09 jaypipes, and just had to end up without structuring the model, while finding a way to be extensible
15:27:12 PaulMurray: and I'm not blaming anyone or anything. I'm just presenting a view of how to iterate towards a more consistent way of comparing resource amounts/usages
15:27:55 jaypipes, right - I'm basically on board with this - it's all about details
15:27:58 jaypipes: I think we all think it is required
15:28:06 bauzas: I think I may understand a problem you have...
15:28:15 and I think we are now able to get beyond the problems I couldn't a year ago
15:28:20 jaypipes: because we decided to work on that first before splitting
15:28:26 bauzas: so, because I edited all these blueprints in the same topic branch in git, it looks like they are dependent on each other.
15:28:40 bauzas: but the second two are not dependent on the resource-object-models BP
15:28:46 jaypipes: yeah, that's helpful, thanks for this
15:28:52 jaypipes: agreed
15:28:55 bauzas: i.e. work on request-spec-object can and should begin now
15:28:58 jaypipes: I can cover those
15:29:05 bauzas: w/o waiting for the resource-object-models work.
15:29:14 those two BPs are entirely independent of each other.
15:29:37 bauzas: but of course, sched-select-destinations-use-request-spec-object depends on the completion of request-spec-object :)
15:29:42 jaypipes: I just want to explain to you our main concern about these heterogeneous resources that we need to store and filter upon
15:29:56 bauzas: I'm listening.
15:29:58 jaypipes: agreed, can I borrow these two?
15:30:08 bauzas: please do!
15:30:11 jaypipes: cool
15:30:14 jaypipes: so
15:30:27 jaypipes: back to the problem I mentioned
15:30:27 bauzas: did you notice I put you as a contributor on them?
15:30:31 jaypipes: yup
15:30:34 k
15:30:38 jaypipes: at least for one
15:30:40 anyway
15:30:57 jaypipes: so, let me explain to you the problem we have with the current filter design
15:31:08 bauzas: you are listed as a contributor on the second two patches.
15:31:18 k
15:31:37 so, filters look at HostState and decide by comparing with request_spec
15:32:22 jaypipes: but some specific filters directly call other Nova objects (like aggregates) to compare with request_spec
15:32:31 jaypipes: ie. AvailabilityZoneFilter
15:32:36 eg. sorry
15:32:49 bauzas: the only one that does that does it to get the allocation overcommit ratios for the aggregate object.
15:33:18 bauzas: which was the purpose of the allocation-ratios-to-resource-tracker blueprint from last cycle...
15:33:22 jaypipes: nope, that's not about the ratios
15:33:35 bauzas: well, capabilities plus overcommit ratios.
15:33:48 jaypipes: the AZFilter fetches the metadata associated with the aggregates the host belongs to
15:33:55 right. capabilities.
15:33:59 jaypipes: and compares it with the AZ hint passed
15:34:24 jaypipes: so, here, that's an explicit call to the Aggregate DB (or Object)
15:34:29 yes.
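[Editor's note] To make bauzas's AvailabilityZoneFilter point concrete: today the filter reaches out to the Aggregate DB/object, whereas the goal is for it to compare the AZ hint only against data already carried on HostState. A simplified mock-up of the latter, assuming invented stand-in classes rather than Nova's real filter code:

```python
# Illustrative sketch: an AZ filter that reads only from HostState and
# never calls the Aggregate DB. Classes are simplified stand-ins.

class HostState:
    """Scheduler-side view of a host, including aggregate metadata."""
    def __init__(self, host, aggregate_metadata):
        self.host = host
        # e.g. {'availability_zone': 'az-1'}, already pushed to the
        # scheduler by whatever mechanism owns aggregate data
        self.aggregate_metadata = aggregate_metadata

class AvailabilityZoneFilter:
    def host_passes(self, host_state, request_spec):
        hint = request_spec.get('availability_zone')
        if hint is None:
            return True  # no AZ requested, any host passes
        # Key point: no DB or Aggregate object call here --
        # the data is local to the scheduler.
        return host_state.aggregate_metadata.get('availability_zone') == hint

f = AvailabilityZoneFilter()
h = HostState('node1', {'availability_zone': 'az-1'})
print(f.host_passes(h, {'availability_zone': 'az-1'}))
```

The open question in the discussion is how aggregate_metadata gets onto HostState in the first place, which is what the isolate-scheduler-db BP addresses.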
15:34:49 jaypipes: again, it should look into HostState instead of querying Aggregate
15:35:07 bauzas: HostState still needs to get the information about aggregates from somewhere.
15:35:14 jaypipes: exactly!
15:35:28 jaypipes: that's the purpose of the isolate-scheduler-db BP
15:35:29 bauzas, jaypipes have you looked at the service groups example
15:35:29 but what does this have to do with any of the blueprints I've proposed?
15:35:37 PaulMurray: yes.
15:35:49 PaulMurray: I think you meant server-groups, not service-groups, right?
15:36:01 jaypipes, yes, sorry
15:36:05 no worries. :)
15:36:06 jaypipes: it comes back to your BPs because we need to find some way to update the scheduler with info about aggregates
15:36:38 jaypipes: the former proposal was using ERT to update the stats field with aggregates info
15:36:39 bauzas: but that doesn't have anything to do with any of the blueprints I have proposed...
15:36:46 bauzas: does anyone else other than the scheduler/filters use the aggregate information?
15:36:53 bauzas, it looks up the server information and adds it to the request spec before calling the filters
15:37:06 bauzas: Aggregates are not resources. They are collections of providers of resources.
15:37:39 jaypipes, indeed, they are closer to a host state than they are to a resource
15:38:09 n0ano: correct.
15:38:13 jaypipes: still, how do you see the scheduler being notified about this information?
15:38:38 jaypipes: so that the filters could get that info?
15:38:43 bauzas: why not keep this information in the scheduler?
15:38:53 bauzas: I think the scheduler should *own* the information about aggregates. not "be notified about them"
15:39:03 bauzas, good question, I would say similar to the way host state is updated, but that will expand that particular API
15:39:06 jaypipes: +1
15:39:34 bauzas: but in the meantime, just add a new scheduler RPC API method update_aggregate()
15:39:45 bauzas: and notify the scheduler about changes by calling that RPC API.
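[Editor's note] jaypipes's interim suggestion amounts to the scheduler keeping its own copy of aggregate data, fed by a new update_aggregate() RPC method. Only that method name comes from the discussion; the cache layout and helper below are invented for illustration:

```python
# Sketch of a scheduler-side aggregate cache fed by an update_aggregate()
# RPC call, per jaypipes's suggestion. Layout is hypothetical.

class SchedulerAggregateCache:
    def __init__(self):
        # aggregate id -> {'hosts': [...], 'metadata': {...}}
        self._aggregates = {}

    def update_aggregate(self, agg_id, hosts, metadata):
        """RPC entry point: Nova notifies us when an aggregate changes."""
        self._aggregates[agg_id] = {'hosts': list(hosts),
                                    'metadata': dict(metadata)}

    def metadata_for_host(self, host):
        """Merge the metadata of every aggregate the host belongs to,
        so filters never have to query the Nova DB."""
        merged = {}
        for agg in self._aggregates.values():
            if host in agg['hosts']:
                merged.update(agg['metadata'])
        return merged

cache = SchedulerAggregateCache()
cache.update_aggregate(1, ['node1', 'node2'], {'availability_zone': 'az-1'})
print(cache.metadata_for_host('node1'))
```

With something like this in place, the scheduler *owns* the aggregate data, which is the distinction jaypipes draws between owning information and merely being notified about it.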
15:39:47 yjiang51: jaypipes: what about if we need to store information about a Neutron router's list of ports?
15:39:50 bauzas: problem solved :P
15:39:54 bauzas, what is the problem you address by having the compute node send aggregate info to the scheduler
15:40:26 PaulMurray: my concern is how to store it
15:40:32 bauzas: is the list of ports also an aggregate?
15:40:46 yjiang51: nope, that's another example of a possible Gantt filter
15:40:53 bauzas: provide that information in the request_spec, or better yet, have the scheduler request that information as-needed from Neutron. That said, I have no idea what use the scheduler has for a list of ports from a Neutron router.
15:40:54 PaulMurray, once we split the scheduler out to a separate service the aggregate info won't be available if it's not stored in the scheduler
15:41:24 bauzas: a port is a resource, right?
15:41:28 or at least not easily available
15:41:29 no.
15:41:34 n0ano, bauzas I see, but it doesn't belong to the compute node either
15:41:48 yjiang51: a VF on an SR-IOV PF is a resource, but a port is not.
15:42:22 yjiang51: unless you are referring to a physical port, which has some limited number on a device, in which case, yes, a port is a resource ;)
15:42:25 jaypipes, I would have to disagree, a port and a VF are pretty much the same, if one is a resource then the other is
15:42:45 n0ano: see above :) depends on whether there is a total capacity of the "thing" :)
15:42:45 yes, I was equating a port to a physical device
15:42:56 n0ano: a port in neutron, though... that's not a physical device.
15:43:08 n0ano: which is why I like to be specific about these things :)
15:43:32 jaypipes, hence the confusion, do we care about Neutron ports for the moment?
15:43:52 n0ano: I was mentioning this to explain that we need to be generic
15:43:58 jaypipes: you mean an un-limited resource is not a resource? Hmm, that makes sense also, since it means no need to be managed at all :)
15:44:04 and not rely on Nova's existing DB or info
15:44:17 n0ano: we only care about the PCI ports, but those are handled by the resource_tracker separately.
15:44:31 yjiang51: right, zactly.
15:44:38 anyway, it seems that time is running out
15:44:41 jaypipes, I'm pretty sure I'm +1 on that
15:45:19 bauzas, yeah, let's try and squeeze some other items in today
15:45:26 bauzas: so... may I work on the resource-object-model, you work on the request-spec-object, and we continue to discuss ways in which we communicate aggregate changes to the scheduler? Perhaps PaulMurray can work on that last one?
15:45:39 jaypipes: sounds like a good plan to me
15:45:41 jaypipes, WFM
15:45:48 PaulMurray: ? work for you?
15:46:05 oh right, it works for everyone else, eh? Yes, fine by me :)
15:46:08 jaypipes: PaulMurray: https://review.openstack.org/89893
15:46:19 ^ above is the spec about the problem I mentioned
15:46:27 anyway...
15:46:33 #topic Kilo sessions
15:46:41 so
15:46:52 2 possible sessions
15:47:04 I think we will have a cross-project session; the open question is whether we'll have a separate session just on the current Gantt split
15:47:05 1/ in the cross-project track on Tuesday
15:47:31 https://etherpad.openstack.org/p/kilo-crossproject-summit-topics L93 (comments welcome)
15:48:20 another one is expected to come in the nova summit track
15:48:41 https://etherpad.openstack.org/p/kilo-nova-summit-topics
15:48:58 I'm still a little unclear on what we can do to push for these 2 sessions
15:49:40 just a comment from my side. The cross-project one is needed as other projects want to integrate and possibly help with Gantt. So this will be more of an invitation and asking for general requirements.
15:50:13 The nova one is obviously more into the inner workings of the scheduler split work.
15:50:34 doron, hoping for some specifics from other projects on what they want from a scheduler and some API guidance
15:51:21 n0ano: I'm getting feedback from folks in cinder and other projects.
15:51:27 mikal seems to have already triaged some sessions in the Nova etherpad
15:51:53 n0ano: they would like to know what to expect and what's needed from their side.
15:52:33 doron, what I said, what capabilities are they looking for and what kind of APIs do they want
15:53:15 doron: n0ano: hence, for example, a discussion about whether we accept other projects' polling, like jaypipes suggested, or only stats notifications
15:53:32 n0ano: basically scheduling services. I'd assume they'll write filter(s) and weight modules
15:53:33 the latter having my preference
15:53:44 n0ano: so for example,
15:54:05 neutron may be able to schedule a service VM with a mitation of X hops from a network device.
15:54:24 (limitation)
15:54:38 The same use case is valid for cinder.
15:54:45 from a storage backend.
15:54:47 doron, the specifics on what `scheduling services' they need is what we're looking for; your Neutron example is perfect
15:55:13 n0ano: this is exactly why we should have this all-hands session.
15:55:23 doron, +1
15:55:56 back to the Nova session, I don't know if mikal or johnthetubaguy picked the good candidates, but this person asked for blueprints
15:56:20 bauzas: that's my fault, I was asking for specs
15:56:32 well, we have BPs, does johnthetubaguy need pointers?
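[Editor's note] doron's Neutron use case — a service VM at most X hops from a network device — is the kind of filter an external project could plug in once Gantt exposes scheduling as a service. A hypothetical sketch; in reality the hop counts would come from Neutron rather than a plain dict, and all names here are invented:

```python
# Hypothetical cross-project filter: accept only hosts within a maximum
# number of network hops from a device of interest. A real filter would
# query Neutron for the hop data; here it's just a dict.

class MaxHopsFilter:
    def __init__(self, hops_from_device):
        # host -> hop count to the network device of interest
        self.hops = hops_from_device

    def host_passes(self, host, request_spec):
        limit = request_spec.get('max_hops')
        if limit is None:
            return True  # no hop constraint requested
        # Unknown hosts are treated as infinitely far away.
        return self.hops.get(host, float('inf')) <= limit

f = MaxHopsFilter({'node1': 1, 'node2': 4})
hosts = [h for h in ('node1', 'node2') if f.host_passes(h, {'max_hops': 2})]
print(hosts)
```

The analogous cinder case would weight or filter hosts by distance from a storage backend, using the same filter/weigher plug-in shape.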
15:56:41 so, I will rebase my isolate-sched-db BP on Kilo, and jaypipes's BPs will be discussed too
15:56:43 bauzas: we mentioned this on the ML a few times before, we need a spec for summit sessions
15:56:51 johnthetubaguy: agreed
15:57:09 n0ano: ideally the specs would be listed in the etherpad
15:57:11 johnthetubaguy: jaypipes covers most of the work, and the isolate-scheduler-db BP will be reproposed for Kilo
15:57:15 johnthetubaguy: so will do
15:57:30 bauzas: hoping to get most of those approved before the summit
15:57:42 johnthetubaguy, tnx, that would be great
15:57:44 bauzas: seems like you have a good direction that just needs executing
15:57:52 johnthetubaguy: hope so too
15:57:56 what bits is there no agreement on?
15:58:13 I noticed the extensible resource tracker is getting a bit of a battering, but other than that
15:58:36 johnthetubaguy, among this group on IRC we're pretty much in agreement
15:58:47 johnthetubaguy: well, I don't see real problems except the isolate-sched-db implementation we need to cover between jaypipes, PaulMurray and me
15:59:13 johnthetubaguy: I would really appreciate it if the Nova team could help us with guidance there
15:59:17 bauzas: honestly, after the resource tracker work, it should be clear what is needed
15:59:23 approaching the top of the hour, is there any open item anyone wants to raise in the last few minutes?
15:59:25 johnthetubaguy: we need feedback mostly
15:59:39 bauzas: cool, let's get the specs up, and see what we can do
16:00:03 on a side note, I will be on PTO next week and the beginning of the week after
16:00:10 PS, you folks are part of the nova team, so do review all the specs, seeing +1s from you folks really helps drive it forward
16:00:19 so I won't be present for the next 2 Gantt meetings
16:00:38 johnthetubaguy: that's my duty :)
16:00:55 bauzas, me too (business trip next week and vacation in Normandy the week after), we should cancel the next 3 meetings and do things via email
16:01:11 tnx everyone
16:01:14 #endmeeting