15:00:46 #startmeeting gantt
15:00:47 Meeting started Tue Jul  1 15:00:46 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:48 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:51 The meeting name has been set to 'gantt'
15:00:56 o/
15:01:00 anyone here to talk about the scheduler?
15:01:17 o/
15:02:21 hmm, small group today (maybe we can get lots done :-)
15:02:35 * n0ano watches the solar panels being installed on my roof
15:02:39 Hi all
15:02:45 #topic code forklift
15:02:46 Hello
15:02:49 Hi
15:02:55 * bauzas also writes slides at the same time :)
15:02:58 I'm new here.
15:03:12 ericfriz, LisaZangrando nice you can make it, we'll get to you soon
15:03:21 schwicke, NP, I promise we don't bite :-)
15:03:38 n0ano: don't we ? :)
15:03:49 n0ano: hoping for the best :)
15:03:52 ok thanks
15:04:03 bauzas, I thought we were good on the client and then john had some issues, do they look doable?
15:04:19 n0ano: well, we discussed today with johnthetubaguy
15:04:31 n0ano: about how we should do the steps to Gantt
15:04:48 n0ano: because he thought about possible issues and how we should do that
15:04:59 n0ano: long story short, there is an etherpad
15:04:59 bauzas: I need to check the IRC history to see your discussion, right?
15:05:14 n0ano: hey, in my team planning meeting, but do shout at me if you want some answers
15:05:28 #link https://etherpad.openstack.org/p/gantt-nova-compute_nodes
15:05:46 so the main problem is:
15:05:56 what should we do with the ComputeNode table ?
15:06:11 should it be a Scheduler table or a Nova table ?
15:06:31 as per the last findings, johnthetubaguy is thinking to leave ComputeNode in Nova
15:06:44 and only do updates in the client
15:06:50 bauzas: how often will we access the compute table?
15:06:58 for compatibility reasons I think it should probably stay with nova for now, maybe in the future it can be moved into gantt
15:07:18 n0ano: so that means we're going to keep the compute_nodes table
15:07:40 n0ano: gantt will need its own table, with a different structure, and that seems fine
15:07:57 ok, please all review https://etherpad.openstack.org/p/gantt-nova-compute_nodes and make comments if any
15:08:16 PCI stats need the ComputeNode table at the moment, for the PCI devices stuff, and I reckon that means it has to stay in Nova for the medium term
15:08:18 bauzas: n0ano: Shouldn't this be within the compute_node object's scope? If everything is kept in the compute_node object, then no matter how we do the implementation, we will simply change the compute_node object?
15:08:28 if we all agree to keep compute_nodes, I'll backport johnthetubaguy's change into the 82778 patch
15:08:56 johnthetubaguy: you mean PCI stats or the PCI dev tracker?
15:09:04 well, to be precise, I already made that, I need to restore a previous patchset
15:09:10 unfortunately, I think johnthetubaguy is right and we should stay that way for now
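
As an aside on the boundary being settled here: below is a minimal sketch, with illustrative names rather than the actual sched-lib API from review 82778, of the idea that the scheduler side only ever touches compute-node state through a client. The ComputeNode table can then stay in Nova today and move later without the scheduler noticing.

    class SchedulerClient(object):
        """Facade the scheduler (and later Gantt) calls instead of reading
        or writing the ComputeNode table directly."""

        def __init__(self, backend):
            # today: a thin wrapper over Nova's DB API; later: Gantt's own store
            self._backend = backend

        def update_resource_stats(self, context, host, stats):
            # updates are the only writes that cross the boundary
            self._backend.compute_node_update(context, host, stats)

        def get_host_states(self, context):
            # reads come back as plain dicts, hiding the table's schema
            return [dict(node) for node in
                    self._backend.compute_node_get_all(context)]

The design choice is simply that every read and write crosses one facade, which is what makes the eventual forklift a backend swap for the scheduler rather than a schema migration.
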
15:09:51 johnthetubaguy: why don't we hide all these things behind the compute node object?
15:09:56 bauzas, then why did you change originally, won't the same objections apply?
15:10:04 ok, my main concern is keeping the roadmap, so I'll go with these changes for https://review.openstack.org/#/c/82778
15:10:32 n0ano: the problem is that there was no clear consensus
15:10:45 n0ano: so I made lots of proposals over here
15:10:54 n0ano: now, I'll stick with the proposal
15:11:02 n0ano: there is another side effect to it
15:11:25 johnthetubaguy and I agreed that we should possibly do the fork once that patch gets merged
15:11:32 the PCI issue is a strong argument (to me anyway) so I'd just say nova owning the table is the new consensus and we try and make it work
15:11:43 and then work on Gantt directly
15:11:53 but that requires some code freeze in Nova
15:12:00 johnthetubaguy: you agree ?
15:12:13 n0ano: if we remove the compute table from nova, we can change PCI for it also.
15:12:35 johnthetubaguy: I mean, we take your scenario, go ahead, do the split, work on Gantt for feature parity
15:12:45 yjiang5, I'd say that's something we do later, after we do the split into gantt
15:12:55 bauzas, +1
15:13:00 johnthetubaguy: that means that Nova will possibly have some code freeze
15:13:10 johnthetubaguy: and *that* is a big turn in mind
15:13:46 bauzas: +1, who knows how long it takes to finish it
15:14:00 bauzas, not necessarily a code freeze, we just have to backport changes from nova to gantt after the split
15:14:01 because the idea was to do some pre-work on sched-db https://review.openstack.org/89893
15:14:21 but johnthetubaguy gave it a -1
15:14:43 n0ano: my only worries are about the level of backports needed
15:15:02 n0ano: and the idea was to do the steps *before*, to prevent these backports
15:15:36 bauzas, a concern but I think it's doable, the steps we are doing reduce the number of backports needed rather than eliminate them
15:15:39 the main problem is about filtering on aggregates and instances
15:16:04 ok so https://review.openstack.org/82778 is the top prio and then we split
15:16:14 +1
15:16:30 n0ano: we need to think about all the steps for setting up a CI, etc.
15:16:35 n0ano: some of the mechanisms will need to change, like aggregates, and these will keep some nova patches from being backported into gantt
15:16:36 an API and a client :)
15:17:03 toan-tran: there are some blueprints for porting the aggregate stats to the scheduler using the extensible RT
15:17:20 toan-tran: that would avoid the scheduler having to call the Nova API for it
15:17:30 toan-tran: which is good
15:17:43 bauzas: just an example, but thanks for the info :)
15:17:44 toan-tran: but until then, Gantt won't support aggregates filtering
15:18:12 n0ano: still happy with that ?
15:18:30 my point is, since we don't even have all the changes figured out, there are risks that some new patches cannot be backported to gantt
15:18:31 bauzas, works for me, just means we'll have some feature parity work still for gantt
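
For the aggregates point above, here is a hypothetical illustration of the extensible-RT blueprints bauzas mentions, not their real plugin interface: the compute side publishes aggregate stats alongside the rest of its host state, so the scheduler never has to call back into the Nova API at filter time.

    class ResourcePlugin(object):
        """Assumed plugin contract: return extra stats for one host."""

        def report(self, context, host):
            raise NotImplementedError

    class AggregateStatsPlugin(ResourcePlugin):
        def __init__(self, aggregate_api):
            self._aggregate_api = aggregate_api  # hypothetical accessor for host aggregates

        def report(self, context, host):
            aggregates = self._aggregate_api.get_by_host(context, host)
            # Flatten names and metadata so filters can match on them directly
            # from the host state the scheduler already receives.
            metadata = {}
            for agg in aggregates:
                metadata.update(agg.get('metadata', {}))
            return {'aggregates': [agg['name'] for agg in aggregates],
                    'aggregate_metadata': metadata}
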
15:18:40 we can possibly vote on it ?
15:18:54 give me the chair on the meeting, I'll arrange a vote
15:19:03 #chair bauzas
15:19:12 #chair bauzas
15:19:13 Current chairs: bauzas n0ano
15:19:18 #help vote
15:20:18 do you need anything from me, you have the chair
15:20:20 #vote Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity with nova-scheduler ?
15:20:35 strange
15:20:41 the bot is unhappy
15:21:35 well, +1 from me, no matter what the bot is doing
15:21:59 +1 for me too
15:22:04 #startvote Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity with nova-scheduler ?
15:22:05 Begin voting on: Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity with nova-scheduler ? Valid vote options are Yes, No.
15:22:06 Vote using '#vote OPTION'. Only your last vote counts.
15:22:15 bauzas: you mean Gantt not feature parity at first, but will be later?
15:22:16 dammit, forgot the good tag :)
15:22:20 #undo
15:22:21 Removing item from minutes:
15:22:31 #vote Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity at first with nova-scheduler ?
15:22:34 mspreitz, yes, that is the plan
15:22:48 #startvote Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity at first with nova-scheduler ?
15:22:49 Already voting on 'Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity with nova-scheduler'
15:23:00 #endvote
15:23:01 Voted on "Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity with nova-scheduler ?" Results are
15:23:12 #startvote Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity at first with nova-scheduler ?
15:23:14 Begin voting on: Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity at first with nova-scheduler ? Valid vote options are Yes, No.
15:23:15 Vote using '#vote OPTION'. Only your last vote counts.
15:23:19 #vote yes
15:23:22 #vote No
15:23:24 #vote yes
15:23:24 #vote yes
15:23:30 #vote Yes
15:23:57 * bauzas eventually found out how to set up a vote...
15:24:03 mspreitz: ?
15:24:14 not sure, I came in late, I think I will abstain
15:24:17 ok
15:24:20 #endvote
15:24:21 Voted on "Do we agree to split the scheduler code without waiting for the work on isolate-sched-db, which implies Gantt will not have feature parity at first with nova-scheduler ?" Results are
15:24:39 awesome...
15:24:54 anyway, we have a majority over here
15:24:58 n0ano: bauzas: need to leave now. talk to you guys later.
15:25:07 sure, thanks yjiang5
15:25:15 the bot is weird but my count was 3-1 so yes wins
15:25:25 bauzas: bye.
15:25:27 yjiang5, later
15:25:42 bauzas, so, you have a clear direction for now?
15:25:50 #action bauzas to deliver a new patchset for sched-lib based on keeping ComputeNode in Nova
15:26:12 n0ano: yup
15:26:23 cool, let's move on then
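
For reference, since the exchange above cost several retries: the MeetBot vote flow that finally worked, as echoed by the bot itself in the log, is

    #startvote <question>?    (opens the poll; options default to Yes, No, or can be listed after the question mark)
    #vote <option>            (one per participant; only the last vote counts)
    #endvote                  (closes the poll and prints the results)

A bare #vote <question>, as in the 15:20:20 attempt, does not open a poll, and a second #startvote is rejected until the running one is closed with #endvote.
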
15:26:31 #topic Fair Share scheduler
15:26:31 n0ano: we need to sync up next week to see what to do with the split itself
15:26:41 ericfriz, you still here?
15:26:55 yes, I'm here!
15:27:11 me too
15:27:15 so, I hope everyone read ericfriz's email about his fair share scheduler idea
15:27:26 yep
15:27:38 me too
15:27:46 the idea looks interesting, I'm curious, is this just a new filter or are you changing the scheduler itself?
15:29:04 ericfriz: could you just summarize your idea ?
15:29:28 It's not a filter, but it's a change to the scheduler algorithm
15:30:03 please take a look at slide #12
15:30:21 LisaZangrando: can you provide the link here?
15:30:29 ericfriz: if I got it right, you are using the scheduler from SLURM, correct ?
15:30:37 the schema shows the new architecture
15:30:56 #link https://github.com/CloudPadovana/openstack-fairshare-scheduler
15:31:10 yes, SLURM's Priority MultiFactor
15:31:26 it appears to me that your proposal is really close to what Blazar already does :)
15:31:34 no, the scheduler implements the same scheduling algorithm as SLURM
15:31:53 i.e. you have a reservation and the system will handle it
15:33:10 #link https://wiki.openstack.org/wiki/Blazar
15:33:21 which kind of reservation?
15:33:41 ericfriz: does a user request in your design have a start time and/or end time or duration?
15:33:47 virtual instances or physical compute nodes
15:34:25 LisaZangrando: you mean slide #12 of https://agenda.infn.it/getFile.py/access?contribId=17&sessionId=3&resId=0&materialId=slides&confId=7915 ?
15:35:12 LisaZangrando: I quickly scanned your document
15:35:32 and got the feeling that you want to sort users' requests based on priority
15:35:40 mspreitz: a user request has no duration when it's queued
15:35:54 but users' requests are asynchronous
15:35:55 #link https://review.openstack.org/#/c/103598/ <-- totally different way of approaching the scheduler. Just for kicks and giggles.
15:36:15 you're assuming that there are not enough resources for current requests? so that they have to wait in a queue?
15:36:47 toan-tran: yes, it is.
15:37:18 ericfriz: as you said, the current nova scheduler handles requests FIFO
15:37:51 so I think your patch targets the nova-scheduler Manager rather than the nova-scheduler Scheduler :)
15:39:00 toan-tran: I just thought about Blazar because it seems the whole idea is to say "as a user, I want to start an instance but I want a guarantee that I'll have enough resources for it"
15:39:18 so I'll wait until all the conditions are met
15:39:34 that's what I call a lease
15:39:42 bauzas: yes, but Blazar focuses on time conditions
15:39:51 i.e. a strong contract between the user and the system
15:40:01 toan-tran: not exactly
15:40:03 here they're talking about priority, who gets the resources first
15:40:31 * johnthetubaguy is free to talk if it's useful later in the meeting
15:40:35 toan-tran: the Blazar lease is about granting resources for a certain amount of time
15:40:49 johnthetubaguy: nah, we agreed on your approach
15:41:03 johnthetubaguy: I'll do a new patchset tomorrow so you can review it
15:41:06 bauzas: OK
15:41:09 thanks
15:41:27 briefly: all user requests will be assigned a priority value calculated by considering the share allocated to the user by the administrator and the effective resource usage consumed in the recent past. All requests will be inserted in a priority queue, and processed in parallel by a configurable pool of workers without interfering with the priority order.
15:41:39 I think the proposal is useful in situations where resources are limited and where the provider has an interest in getting its resources used all the time.
15:42:11 not being an expert on blazar, but to me it seems to address a different use case
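
To make the 15:41:27 summary concrete, here is a minimal sketch in the spirit of SLURM's classic multifactor priority: a fair-share factor of the form 2^(-usage/share) combined with an age factor. The weights, the decay half-life, and the factor names are illustrative assumptions, not taken from the FairShareScheduler repository linked above.

    import time

    def decayed_usage(samples, half_life=7 * 86400.0, now=None):
        """Sum past usage samples [(timestamp, core_hours)], halving their
        weight every half_life seconds so only the recent past matters."""
        now = time.time() if now is None else now
        return sum(used * 0.5 ** ((now - ts) / half_life) for ts, used in samples)

    def fairshare_factor(usage, share):
        """1.0 for an idle user, decaying toward 0 as usage exceeds the share."""
        return 2.0 ** (-usage / share) if share > 0 else 0.0

    def priority(samples, share, submit_time,
                 w_fairshare=1000.0, w_age=100.0, max_age=86400.0, now=None):
        now = time.time() if now is None else now
        age = min((now - submit_time) / max_age, 1.0)  # queued requests gain priority as they wait
        usage = decayed_usage(samples, now=now)
        return w_fairshare * fairshare_factor(usage, share) + w_age * age

A user who has consumed exactly their share gets a fair-share factor of 0.5; heavy recent consumers sink toward 0, and the age term keeps long-queued requests from starving.
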
15:43:16 LisaZangrando: by saying "shares", you mean quotas ?
15:43:19 schwicke: well, if the user does not care much about time constraints, then yes, you're right
15:43:48 bauzas: yes
15:43:50 toan-tran: certainly correct
15:44:11 * n0ano wishes bauzas would quit stealing my questions :-)
15:44:22 bauzas: yes, share in batch system terminology
15:44:36 LisaZangrando, then what is a quota, CPU usage, mem usage, disk usage?
15:44:58 n0ano: quicker next time! otherwise you'll lose
15:45:23 toan-tran, age is slowing down my fingers
15:45:24 n0ano: quotas on those are ceilings.
15:45:43 They define the maximum of what a user can have, I'd say, right, Lisa ?
15:46:03 schwicke: that's what we call quotas in OpenStack :)
15:46:09 :)
15:46:15 I'm a little confused here, it looks like the FairShare design is about a priority queue to hand out things sooner or later, not limit usage
15:46:24 mspreitz: +1
15:46:27 mspreitz: +1
15:46:37 it is a % of resources assigned to a project/user
15:46:40 and the "sooner or later" sounds familiar to me...
15:46:42 so it's about "share", not "quota"
15:46:59 yes
15:47:03 please correct me if I'm wrong
15:47:23 FairShare targets a situation in which there are not enough resources for everybody
15:47:27 LisaZangrando, the implication is that your changes will affect more than just the scheduler, you have to set up a mechanism for specifying and allocating these shares
15:47:38 so the scheduler has to decide who to give resources to, and how much
15:47:59 nothing to do with quota, right ?
15:48:15 is there a good description of the use cases for the fairshare scheduler anywhere?
15:48:34 johnthetubaguy: https://agenda.infn.it/getFile.py/access?contribId=17&sessionId=3&resId=0&materialId=slides&confId=7915
15:48:41 toan-tran: correct
15:49:11 bauzas: those slides do not have use cases in them
15:49:38 bauzas: my mistake
15:49:48 there is text about the use case, it is pretty generic
15:50:00 pages 16 and 17
15:50:02 mspreitz: +1
15:50:10 I don't see what problem it is trying to solve
15:50:25 I am sure there is one, I just don't see it right now
15:50:51 pages 16 and 17 are really statements of technology goals, not illustrations of usage
15:51:17 The FairShareScheduler is used in our OpenStack installation, named "Cloud Padovana"
15:51:31 it talks about queuing the requests of users, but there is generally never a backlog, so it doesn't matter about the order, you just place things when you get the request, so there must be something bigger that is required here
15:51:41 so, could we consider asking you to provide some information for next week ?
15:52:02 ericfriz: so are you trying to share resources between users, and evict people who are using too many resources?
15:52:16 looks like a clear definition of the use cases/problems you are solving would be nice to have
15:53:04 johnthetubaguy: yes, that is a use case
15:53:14 ericfriz: tell us about the people using Cloud Padovana and what problems they would have if your solution were not in place
15:53:40 ericfriz: OK, it's quite a different "contract" with users to what nova offers, so we need a nice description of that ideally
15:53:45 ericfriz: I did not notice anything about eviction
15:54:04 mspreitz, +1
15:54:58 mspreitz: scientific teams. when there are no more resources, the user requests fail.
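
That last line is the behaviour the design replaces: rather than failing a request when resources run out, it pends the request. A minimal sketch of that queuing behaviour, assuming a single in-memory queue and reusing the hypothetical priority() function sketched earlier; this is not the FairShareScheduler's actual code.

    import heapq
    import itertools
    import threading

    class FairShareQueue(object):
        def __init__(self, schedule, workers=4):
            self._schedule = schedule        # callback into the existing scheduler driver
            self._heap = []
            self._order = itertools.count()  # tie-breaker: FIFO among equal priorities
            self._lock = threading.Lock()
            self._ready = threading.Semaphore(0)
            for _ in range(workers):         # configurable pool of workers
                threading.Thread(target=self._worker, daemon=True).start()

        def submit(self, request, priority_value):
            # Never reject: every request is queued, ordered by priority.
            with self._lock:
                # heapq is a min-heap, so negate to pop the highest priority first
                heapq.heappush(self._heap, (-priority_value, next(self._order), request))
            self._ready.release()

        def _worker(self):
            while True:
                self._ready.acquire()        # block until something is queued
                with self._lock:
                    _, _, request = heapq.heappop(self._heap)
                self._schedule(request)

Workers pop concurrently but always take the current head of the heap, so parallel processing does not disturb the priority order, matching the 15:41:27 description.
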
15:55:02 I am assuming you want someone to have 10% of all resources, should they request them, and others can use that space if they are not using it, and so it boils down to "spot instance" style things, but I don't really see that described anywhere
15:55:50 I'm also thinking about something mentioned in the thread, deferred booting
15:56:30 ericfriz: what is the nature of these scientific jobs? Can they tolerate allocation of some VMs now, a few more later, and a few more later?
15:56:31 ericfriz: it probably seems obvious with a grid computing hat on, but it doesn't fit too well into nova right now
15:56:45 you would probably require the Keystone trusts mechanism, hence my idea about blazar
15:57:05 and what johnthetubaguy said still makes me think about Blazar...
15:57:25 The "illusion" of having unlimited resources ready to be used and always available is one of the key concepts underlying the Cloud paradigm. OpenStack refuses further requests if the resources are not available.
15:57:25 Blazar == Climate, for the record
15:58:18 LisaZangrando: so you want to give them a guarantee on a best-effort basis ?
15:58:49 we want to guarantee all requests are processed
15:58:49 LisaZangrando: Does this illusion include maybe being spontaneously evicted?
15:58:55 LisaZangrando: well, sure, but we could make things a bit more "griddy" for highly utilised clouds, it's just going to involve introducing new types of flavors, like "spot instances", extra instances above your quota, but they could get killed at any point
15:59:13 bauzas, more like don't ever fail a request, just pend it until it can be satisfied
15:59:17 we're running out of time, we need to conclude
15:59:27 Blazar uses time conditions, the FairshareScheduler has no time condition for extracting the user requests from the queue.
15:59:42 * n0ano refers back to his last comment about stealing questions
15:59:57 but Blazar implements a best-effort mode where you define your contract :)
15:59:59 indeed, it's the top of the hour so we'll have to conclude
16:00:03 LisaZangrando: as a public cloud provider ( <== Cloudwatt ), I can tell you that we're trying our best to not be in the situation of needing FS
16:00:15 but I can see its value
16:00:31 maybe as an action item a compilation of use cases would be useful I think, circulate that and then review later ?
16:00:39 I want to thank everyone, good discussion, I'd like to continue the fair share discussion next week, hopefully we've given you guys stuff to think about
16:00:41 schwicke: +1
16:00:58 can that be action-itemed?
16:01:04 #action schwicke to come up with clear use cases for next week's meeting
16:01:11 so you should come up with a good use case, and be careful not to lean too much on grid philosophy
16:01:11 awesome
16:01:16 tnx everyone
16:01:20 #endmeeting