15:00:39 #startmeeting gantt
15:00:40 Meeting started Tue Sep 9 15:00:39 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:41 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:44 The meeting name has been set to 'gantt'
15:00:50 anyone here to talk about the scheduler?
15:01:34 \o
15:01:54 bauzas_, I was beginning to think you were on baby duty today :-)
15:01:58 o/
15:02:29 #help
15:02:32 n0ano: naaaah
15:03:20 o/
15:03:30 well, it's approaching 5 after, let's begin
15:03:36 #topic next steps
15:03:49 is jaypipes around ?
15:04:03 hi all - I'm here
15:04:10 odd, my topic didn't work, wonder why, whatever
15:04:35 bauzas_: ya!
15:04:47 hopefully you all saw the email about `what we agreed to', I'd like to expand upon that
15:05:06 to me there are 3 APIs that need to be cleaned up...
15:05:12 1) client interface
15:05:18 2) DB access
15:05:29 3) resource tracker
15:05:30 n0ano: not exactly that
15:05:41 bauzas_, feel free to clarify
15:05:46 n0ano: by speaking about APIs, we're discussing about
15:06:00 1/ select_destinations (incl. request_spec data)
15:06:16 2/ update_resource_stats (ie. how the Scheduler is getting updates)
15:06:27 bauzas_: correct. it's the request_spec that needs to be versioned and "objectified".
15:06:42 bauzas_: I prefer "update_resources" vs. update_resource_stats, but yes.
15:06:45 we also need to objectify the stats dict
15:07:01 bauzas_: we need to get rid of the stats dict entirely...
15:07:16 jaypipes: +1
15:07:17 jaypipes, bauzas_ I agree incidentally
15:07:19 I was folding the client interface as the implementation of the select destination API
15:07:30 n0ano: client interface is providing both
15:07:36 bauzas_: and just send a versioned object that has a set of resource usage objects in it.
15:07:46 jaypipes: violent agreement here
15:07:49 jaypipes, +1
15:08:12 bauzas_: similar to what the NUMATopology classes added by ndipanov and danpb in nova.virt.hardware look like
15:08:13 there is one last thing to consider : claims
15:08:23 bauzas_: yes, that's the biggie.
15:08:34 or the baddie, depending on how you look at it ;)
15:09:25 the problem that ndipanov faced is that he had to hack the claim methods for the NUMATopology objects
15:09:28 so my thinking was that claims need to live in the scheduler, and that whether they are done in the scheduler service or on nodes is an implementation detail
15:09:42 bauzas_, that is mostly because we don't have the data model
15:09:49 bauzas_: exactly.
15:10:05 bauzas_: please see my comments about the claim interface in those patch reviews :)
15:10:06 ndipanov: I'm just thinking we should attach a claim() method to the objects we provide to the Scheduler with API endpoint #2 (update_resource_stats or whatever)
15:10:43 ndipanov: claims should be *returned* from the scheduler, yes.
15:10:47 bauzas_, that does not sound insane to me, but is really up to how we do it
15:10:48 jaypipes: I'm just thinking that scheduler could just call this method when select_destinations
15:10:50 ndipanov: or rather, a set of claims..
15:11:34 jaypipes, you may be right - if you are certain that we want to claim in the scheduler (with a cheap almost lock free db hit) then we may want to do that from the start
15:11:35 to me, I'm thinking of :
15:11:46 bauzas_: so, before 1/ and 2/ above, we need to complete the work to un-kludge the ComputeNode -> Service relation.
15:11:54 jaypipes: \o/
15:11:57 but if not and we stick to "optimistic" claim if fail retry system we have now
15:12:00 can't agree more
15:12:04 ndipanov: yes. I am certain :)
15:12:07 that FK is insane
15:12:10 we still want the code to live in the sched
15:12:20 I'm a little confused, the claiming should be part of the select destination or part of the update resources or should be a separate interface
15:12:27 jaypipes, in that case yes - claims HAVE to be in the scheduler :D
15:12:42 n0ano: objects are passed to the Scheduler using update_resources
15:12:49 n0ano: and stored
15:13:03 ndipanov: well, putting the construction of the claim set in the scheduler means the "retry on resource starvation" loop becomes much tighter, meaning we don't need to go all the way to the compute node to figure out we're starved and initiate a retry..
15:13:09 n0ano: that's when select_destinations is coming that these objects are claimed
15:13:54 jaypipes: I stated the proposal in the n0ano's reply that we could live with current RT claims for one cycle
15:14:01 jaypipes: in parallel with objects claims
15:14:02 my issue is that updating resources doesn't necessarily imply claiming anything, you're just informing the scheduler what's available
15:14:12 jaypipes, right - of course - (your retry is got 0 rows back)
15:14:22 ndipanov: correct :)
15:14:25 n0ano: you inform scheduler with objects
15:14:33 this does sound like we will need to change a lot about the RT as well
15:14:43 though I may be wrong
15:14:47 but overall
15:14:48 n0ano: those objects are claimed when select_destinations
15:15:16 if this can be done and is not ridiculously more difficult than keeping the current RT + claims only changing the data model
15:15:23 then by all means do it!
15:15:26 so, in my thinking, the claim happens in the select destination, that makes sense
15:15:35 n0ano: yup
15:15:38 ++
15:15:39 correct.
15:15:57 lemme provide the next steps I'm seeing
15:16:14 so folks, you can debate
15:16:28 jaypipes, who was the ++ to? me or n0ano ?
15:16:44 0/ Remove FK on ComputeNode
15:17:06 1/ update_resource_stats is providing ComputeNode objects instead of stats dict
15:17:24 2/ build_request_spec builds a Request object
15:17:30 ndipanov: to n0ano and bauza, who were saying claims are done in select_destinations() (which I think should be renamed get_placement_claims())
15:17:52 3/ add claim() to ComputeNode
15:18:05 4/ call CN.claim() in select_destinations
15:18:17 jaypipes, ack
15:18:21 bauzas_: well, not necessarily...
15:18:29 ok, time for debating :)
15:18:36 jaypipes: your thoughts ?
15:18:41 bauzas_, was that your complete list?
15:18:48 jaypipes, and then the RT on compute nodes still goes and does its updates of available resources as they are now?
15:18:50 * n0ano hates IRC at times
15:19:03 bauzas_: you wouldn't want to call claim() on the nova.objects.ComputeNode itself, but rather a wrapper (I called it ComputeNodeState object) that has a ComputeNode DB object inside it.
15:19:04 jaypipes, or how will that work?
15:19:15 yeah, steps 0 to 4
15:19:37 jaypipes: interesting
15:20:03 jaypipes: I just need to care about nested objects
15:20:06 ndipanov: the RT on the compute node would still do a check, yes, but it would be the exception to the rule that a resource starvation would occur, vs. the current implementation which has the retry logic actually *be* the Claim object abort in the resource tracker.
15:20:25 jaypipes: +1
15:20:41 jaypipes, ok - I have a picture now...
15:20:43 jaypipes: I'm seeing the RT Claims as an emergency check
15:20:57 bauzas_: see the ComputeNodeState object here: https://review.openstack.org/#/c/103598/4/nova/placement/resource_tracker.py
15:20:58 is there any way you guys could do a write up
15:21:04 so that I am sure I am not misunderstanding
15:21:12 ndipanov: I began step 0
15:21:15 bauzas_: for an explanation as to why the ComputeNode DB object should be an attribute of the maintained state.
15:21:28 jaypipes: will read further
15:21:54 we need a short writeup of steps 3 & 4, the others are pretty obvious
15:22:12 n0ano: we need a *spec* for step 3 and 4 :)
15:22:21 bauzas_: yes, the RT claim on the compute node would be the emergency check for any externally-initiated change to the virt state (for instance somebody doing a virsh destroy out of band of Nova)
15:22:40 jaypipes: +1 again
15:22:58 jaypipes: the last controller, but not the main one
15:23:04 ndipanov: yes, I am working on a writeup, will share with bauzas_, you, and n0ano today. (etherpad)
15:23:12 jaypipes: awesome
15:23:18 jaypipes, awesome!
15:23:18 jaypipes, that would be great
15:23:20 and sorry n0ano for not responding sooner to your ML post :(
15:23:23 jaypipes, me too please
15:23:31 PaulMurray: of course, will do.
15:23:32 jaypipes: interested too, hard to follow the irc conversation
15:23:36 jaypipes, NP
15:23:42 will just send the link to the ML thread and ping you all
15:23:48 cools
15:23:57 johnthetubaguy: yeah, agreed :) tough to follow sometimes :)
15:24:08 jaypipes: yeah, just pile up an empty etherpad, so we can attach it to the logs
15:24:33 (mostly my fault for not ever being able to make the first half of this meeting)
15:24:34 well, start with an etherpad, we'll need to turn it into a true BP soon
15:24:40 n0ano: in the meantime, I want to make it clear that I fully support the split of the scheduler code. I just want to make that process have a good chance of succeeding by doing the above few steps before the split happens.
15:25:26 jaypipes, I've always said we all agree on the goal, it's the path that is the issue
15:25:31 ++
15:26:05 one thought, we're redoing `how` we call into the scheduler, do we think this provides the clean, separable interface we are looking for?
15:26:14 I just want to restate that we agreed during last midcycle meetup that the split will consist in a python lib
15:26:23 not atm a new project
15:26:46 bauzas_: yes, the first step is a python lib.
15:27:09 yeah, just emphasizing this, that's it :)
15:27:12 n0ano: I personally do think that it would provide such a clean, separable interface.
15:27:40 jaypipes: +1 it's a good first step, let's get an interface, before anyone mentions REST
15:27:51 oh yeah. totes.
15:27:59 johnthetubaguy, go away, too early for that :-)
15:28:06 johnthetubaguy: especially since the RPC API is already versioned.
15:28:14 n0ano, I think we need to get this step right to determine exactly what is in the interface
15:28:19 yeah, it's about getting the data versioned too
15:28:21 update_resource_stats is not RPC versioned
15:28:24 n0ano, the retry loop is killing us
15:28:54 PaulMurray, what I'm hearing is that getting the claims right will ameliorate the retry loop
15:29:24 n0ano: it will
15:29:30 n0ano, I think so, I think (not used to long words)
15:29:53 n0ano: because we're moving from a 2-step scheduling to what the community calls an "optimistic scheduler"
15:30:05 * n0ano likes to expand everyone's thinking (and show off - sorry about that :-)
15:30:53 bauzas_: no, I meant the existing scheduler RPC API is versioned.
15:31:05 anyway, the critical thing is the write up so...
15:31:06 yeah I know what you meant
15:31:11 jaypipes: ^
15:31:18 n0ano: I will have it done by EOD today.
15:31:24 jaypipes: I mean, select_destinations is actually an RPC versioned call
15:31:35 #action jaypipes to writeup the claims process
15:31:37 jaypipes: while update_resource_stats is just a DB call
15:31:38 might even have pretty pictures and diagrams :)
15:31:49 bauzas_: yes, understood.
15:32:21 jaypipes: well, to be precise, a conductor call, so versioned anyway
15:32:28 anyway
15:32:42 n0ano: jaypipes: want me to restate the next steps ?
15:32:53 is that clear to everybody ?
15:32:57 bauzas_, can't hurt, to ahead
15:32:57 I still think we need all that DB split work we spoke about before though, unless I am missing something?
15:33:12 bauzas_: sure, go for it.
15:33:22 johnthetubaguy: indeed, that's the last step
15:33:25 s/to ahead/go ahead
15:33:26 I guess maybe the solution looks slightly different
15:33:37 johnthetubaguy: it will
15:33:42 ok, let me restate
15:33:54 step #0 : Remove FK on ComputeNodes/Service
15:34:11 step #1 : update_resource_stats provides ComputeNode object
15:34:31 step #2: build_request_spec provides Request object
15:34:46 step #3: add claims to objects
15:34:58 step #4 : call objects claims in select_destinations
15:35:03 end.
15:35:19 so, johnthetubaguy, isolate-sched-db is based on these steps
15:35:28 we should provide objects to the Scheduler
15:35:31 hmm, feels odd said like that, but anyways
15:35:37 bauzas_: that is my understanding
15:35:42 and jaypipes will provide a writeup on how steps 3&4 work
15:36:00 I think object vs db object might need clarifying, vs dict vs object
15:36:11 but anyway, I will read the write up
15:36:13 so, by saying that, update_resource_stats(Aggregate)
15:36:49 so the Aggregate object is versioned
15:37:12 that's step 6
15:37:39 I would have said step 1.5 but either way
15:38:22 n0ano: step 3 and step 4 are mandatory
15:38:33 n0ano: because you have to claim to aggregates then
15:38:42 bauzas_, no argument
15:38:58 * PaulMurray has to leave - will catch up later
15:39:08 PaulMurray: o/
15:39:09 PaulMurray, tnx for coming
15:39:32 I'd prefer to continue this after we have the writeup so...
15:39:37 #topic opens
15:39:45 any new items for today?
15:40:17 * n0ano listening to the sound of crickets
15:40:54 OK, let's close, good talking to everyone, we'll meet again next week
15:40:59 #endmeeting
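
Editor's note (not part of the log): steps 3 and 4 from the discussion, adding claim() to a scheduler-side object and calling it from select_destinations, might look roughly like the sketch below. This is only an illustration under stated assumptions. ComputeNodeState, claim() and select_destinations are names taken from the conversation, but every field, the ResourcesExhausted exception, and the simplified ComputeNode and Request classes are hypothetical stand-ins, not the real Nova objects or the code in the linked review.

```python
from dataclasses import dataclass


@dataclass
class ComputeNode:
    """Hypothetical stand-in for the versioned ComputeNode object."""
    host: str
    vcpus: int
    memory_mb: int


@dataclass
class Request:
    """Hypothetical stand-in for the versioned request spec object."""
    vcpus: int
    memory_mb: int


class ResourcesExhausted(Exception):
    """Hypothetical error raised when a claim cannot be satisfied."""


@dataclass
class ComputeNodeState:
    """Scheduler-side wrapper: a ComputeNode plus the usage claimed so far."""
    node: ComputeNode
    used_vcpus: int = 0
    used_memory_mb: int = 0

    def claim(self, request):
        """Reserve resources in scheduler memory, or raise if starved."""
        free_vcpus = self.node.vcpus - self.used_vcpus
        free_memory_mb = self.node.memory_mb - self.used_memory_mb
        if request.vcpus > free_vcpus or request.memory_mb > free_memory_mb:
            raise ResourcesExhausted(self.node.host)
        self.used_vcpus += request.vcpus
        self.used_memory_mb += request.memory_mb
        return (self.node.host, request)


def select_destinations(states, request):
    """Try each candidate in order; the retry happens inside the scheduler."""
    for state in states:
        try:
            return state.claim(request)
        except ResourcesExhausted:
            continue  # starved here, try the next host without leaving the scheduler
    raise ResourcesExhausted("no host can satisfy the request")
```

In this sketch the retry on resource starvation never leaves the scheduler process, which is the "tighter loop" argued for at 15:13:03; the resource tracker check on the compute node would remain only as the emergency check described at 15:22:21.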