15:00:37 <n0ano> #startmeeting gantt
15:00:38 <openstack> Meeting started Tue Feb 4 15:00:37 2014 UTC and is due to finish in 60 minutes. The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <openstack> The meeting name has been set to 'gantt'
15:00:50 <n0ano> anyone here to talk about the scheduler?
15:01:05 <alexey_o> me here
15:01:16 <alaski> o/
15:01:21 <gilr> hi
15:01:27 <PaulMurray> hi
15:02:10 <toan-tran> hi
15:02:23 <n0ano> then let's get started
15:02:32 <n0ano> #topic no db scheduler
15:02:55 <n0ano> hmm, boris doesn't seem to be on, might not be able to talk about this one much
15:03:38 <alexey_o> well, I can substitute for boris to some extent
15:03:58 <n0ano> alexey_o, I thought you might but I didn't want to presume :-), what have you got?
15:04:49 <alexey_o> I am working on it and I think we are close now
15:05:10 <alexey_o> it appeared to be not a straightforward task to get rid of compute_nodes table
15:06:29 <n0ano> the devil is always in the details, can you give a short summary of the problem (emphasis on short)
15:07:34 <alexey_o> well, right now I am hunting a heisenbug that thwarts tempest tests at the gate
15:08:16 <alexey_o> also one can't just drop compute_nodes table and expect everything to keep working
15:08:51 <alexey_o> it is related to services and appears to be joinedloaded quite frequently
15:09:04 <n0ano> so nothing obvious, just finding the implicit assumptions that are there.
15:09:29 <n0ano> good luck finding the bugs
15:09:35 <toan-tran> alexey_o: pls remind me again, how does nova conductor work with compute_nodes forked out?
15:09:58 <toan-tran> and how do computes do their frequent update: to nova-conductor, or scheduler?
15:10:02 <alexey_o> the last problem can be overcome with some effort
15:12:03 <alexey_o> toan-tran: the idea was to get rid of writing state changes to compute_nodes table and keep host-related data right in places where it will be needed
15:12:21 <alexey_o> for example in nova-scheduler
15:12:56 <n0ano> e.g. state the scheduler needs is periodically sent to the scheduler, not the DB
15:13:09 <alexey_o> yes
15:13:23 <alexey_o> ultimately that was supposed to be sent to schedulers
15:13:24 <toan-tran> so nova-compute will send their updates to scheduler? that's quite a quantity
15:13:48 <toan-tran> and also there would be a question of the existence of nova-conductor at this point
15:14:24 <alexey_o> right now the change does not remove the conductor
15:14:37 <alexey_o> it still listens to state updates
15:14:51 * mspreitz is still unclear on whether we are talking about one scheduler or multiple
15:15:11 <alexey_o> when it receives an update it propagates it to all other entities which are interested in host states
15:15:22 <alexey_o> that would be multiple schedulers
15:15:24 <mspreitz> how many entities is that? Which are they?
15:15:48 <alexey_o> at least one more -- a scheduler
15:16:23 <mspreitz> And there is just one conductor?
15:16:29 <alexey_o> it is quite possible to have multiple schedulers so this change was made keeping this in mind
15:17:13 <alexey_o> I am unsure whether or not current implementation allows for multiple conductors, but state synchronization allows for multiple conductors as well
15:18:18 <mspreitz> And how is an update propagated from original receiver (a conductor) to the other interested parties? And why is this better than just pub/sub to all of the interested parties directly in the first place?
15:20:09 <alexey_o> as soon as one of the interested parties decides it wants to do sth with state data stored locally it checks a backend to see if there are any fresh updates
15:21:18 <mspreitz> So it's polled/pulled, not pushed, to other parties
15:21:22 <alexey_o> as for pub/sub approach if I get you right our technique does not load the queue as much as state broadcasting
15:21:26 <coolsvap> n0ano mspreitz alexey_o can this discussion continue on ML instead of meeting?
15:21:53 <mspreitz> OK with me
15:22:08 <n0ano> if it needs to go too much deeper we can move to ML, I'm OK with a little more discussion.
15:22:15 <alexey_o> ok, feel free to ask any questions
15:22:40 <n0ano> mspreitz, maybe you can start a ML thread on your concerns, they certainly seem valid
15:22:52 <coolsvap> n0ano: I agree, but we are getting stuck at places with no inputs, so just wanted to put a thought
15:22:55 <mspreitz> Maybe I got the answer, let me see if I got it right..
15:23:20 <alexey_o> yes, it is polled by those who care
15:23:36 <mspreitz> The revised technique wins when updates are more frequent than need to know. In this situation, the revised technique does a query when there is a need to know
15:24:27 <alexey_o> yes, it was intended to be used in very large scale clouds
15:24:39 <alexey_o> by very large I mean thousands of nodes
15:24:57 <mspreitz> It's a question not only of scale but how often there is a need to know (scheduling task).
15:25:13 <mspreitz> I presume the eval done is accurate. I think I got my answer. thanks
15:25:24 <n0ano> alexey_o, one quick question, last week boris mentioned the demo showed bottlenecks not in the scheduler, do you know if these bottlenecks have been identified?
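The pull-on-demand idea described above (an interested party checks a backend for fresh updates only when it needs the state) can be sketched roughly as follows. This is a minimal illustration, not the code under review: `StateBackend`, `SchedulerCache`, `publish`, `updates_since`, and `pick_host` are hypothetical names, and the in-memory list stands in for whatever shared backend the real change uses.

```python
from dataclasses import dataclass, field


@dataclass
class StateBackend:
    """Hypothetical shared store of host-state updates (not a real Nova class)."""
    _log: list = field(default_factory=list)  # append-only (host, state) entries

    def publish(self, host, state):
        # Called whenever a compute node reports new state.
        self._log.append((host, dict(state)))

    def updates_since(self, version):
        # Return entries appended after `version`, plus the new version marker.
        return self._log[version:], len(self._log)


@dataclass
class SchedulerCache:
    """Local host-state view, refreshed only when a decision is needed."""
    backend: StateBackend
    hosts: dict = field(default_factory=dict)
    version: int = 0
    pulls: int = 0

    def sync(self):
        # Pull only the updates this scheduler has not seen yet.
        updates, self.version = self.backend.updates_since(self.version)
        self.pulls += 1
        for host, state in updates:
            self.hosts[host] = state

    def pick_host(self, ram_mb):
        self.sync()  # one pull right before scheduling, not one per update
        fits = [h for h, s in self.hosts.items() if s["free_ram_mb"] >= ram_mb]
        return max(fits, key=lambda h: self.hosts[h]["free_ram_mb"], default=None)


backend = StateBackend()
cache = SchedulerCache(backend)

# Many node updates arrive between scheduling requests...
for i in range(1000):
    backend.publish("node-%d" % (i % 10), {"free_ram_mb": 1024 + i})

# ...but the scheduler touches the backend only when it has to place a guest.
chosen = cache.pick_host(ram_mb=2000)
```

In this toy run a thousand updates cost the scheduler a single pull, which is the case mspreitz summarizes as "updates are more frequent than need to know". If scheduling requests outnumbered updates, the same design would pull more often than a push model would deliver.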
15:25:25 <alexey_o> at such scale updates will be very frequent, especially compared to requests for scheduling
15:25:37 <toan-tran> alexey_o: I remark that there is not a design on how all these things interact with other components
15:25:46 <toan-tran> on your blueprint/doc page
15:25:52 * mspreitz is not sure why we would expect scheduling rate to not increase with cloud size
15:25:58 <toan-tran> could you elaborate in a wiki/doc
15:26:16 <toan-tran> so that we can understand how no-db scheduler would interact with all other nova components
15:26:19 <alexey_o> n0ano: unfortunately I don't know
15:26:32 <toan-tran> and how current message flows would go
15:26:36 <alexey_o> it is best to ask boris
15:26:48 <n0ano> mspreitz, the scheduling rate increases but the node updates increase (potentially even more)
15:27:03 <n0ano> alexey_o, tnx
15:27:16 <mspreitz> n0ano: is that an observation based on data I can see?
15:27:17 <alexey_o> toan-tran: yes, there is still a lot to document carefully
15:27:55 <n0ano> mspreitz, strictly guess, hard data would be good to have but `I` think the guess is good
15:28:24 <mspreitz> Can you elaborate on why you expect scheduling rate to increase more slowly than cloud size?
15:28:43 <alexey_o> mspreitz: some time ago I've measured delay in mysql response to compute node get as a function of the number of compute node records
15:29:06 <johnthetubaguy> as long as it's optional, then I am good with having it as an option for people to try, and we can decide to change the default, once we "are happy" with it
15:29:34 <n0ano> as cloud size increases the node updates increase, if guests are long lived then the scheduling rate should not increase at the same rate
15:29:45 <alexey_o> as scheduler issues a request for all compute nodes to be grabbed from db it takes more time to schedule a single instance
15:30:17 <johnthetubaguy> number of nodes vs rate of builds vs rate of deletes is going to change what is best, so trying this out seems to make sense to me
15:31:01 <mspreitz> alexey_o: yes, that makes sense. I think n0ano is making a different assertion: the (number of calls to scheduler)/time is less than proportional to cloud size
15:31:21 <alexey_o> yes, I got it wrong the first time
15:31:52 <mspreitz> n0ano: I do not follow your reply...
15:32:09 <n0ano> mspreitz, no, I think I'm saying (number of calls to sched)/time is less than (node updates)/time, especially as the cloud gets larger
15:32:11 <mspreitz> Do you expect guest lifetime to increase with cloud size?
15:32:49 <mspreitz> node = host, I take it
15:32:57 <n0ano> mspreitz, yes
15:33:24 <mspreitz> node update = guest start, guest stop, or stats, or what?
15:33:50 <n0ano> stats is the real concern, start/stop should be directly proportional to scheduler calls
15:34:12 <toan-tran> mspreitz: I think node update = periodic ms
15:34:21 <mspreitz> n0ano: your previous "yes" was response to terminology question, or guest lifetime?
15:34:48 <n0ano> I think terminology confusions, I'm not really concerned about lifetime
15:34:51 <toan-tran> = number of nodes (per minute)
15:35:21 <mspreitz> stats update / time should be proportional to cloud size
15:35:23 <toan-tran> while (call to scheduler) = guests requesting compute nodes
15:35:34 <toan-tran> mspreitz: yes
15:36:01 <toan-tran> so I do not see clearly the relation between the two
15:36:31 <toan-tran> Big servers ==> less update & more call to scheduler
15:36:40 <mspreitz> Assuming overall utilization stays constant as cloud size grows, and that guest lifetime stays constant as cloud size grows, this implies that guest arrivals/time stays constant as cloud size grows
15:37:19 <n0ano> but updates/time should increase as the cloud size increases
15:37:32 <toan-tran> mspreitz: I don't think so
15:37:35 <mspreitz> sorry, I stated my conclusion wrong
15:37:49 <mspreitz> the implication is that guest arrivals / time will be proportional to cloud size
15:37:55 <toan-tran> because cloud providers will not increase their size if there is a constant guest arrival
15:38:33 <toan-tran> remember that a scheduler call is when there is a change in a number of instances of clients' applications
15:38:35 <mspreitz> Assuming overall utilization stays constant as cloud size grows, and that guest lifetime stays constant as cloud size grows, this implies that guest arrivals/time will be proportional to cloud size
15:39:12 <toan-tran> so if guest arrival is constant ==> there is almost no activities on clients' applications
15:39:18 <toan-tran> it's hard to fathom
15:39:31 <mspreitz> toan-tran: right. Read my corrected statement
15:39:35 <n0ano> well, my concern is the ratio between (update requests)/(scheduler requests), I'm thinking that ratio should be >1
15:40:18 <n0ano> guys, I hate to cut this short but I think we have to move on, one other topic I'd like to cover today
15:40:20 <mspreitz> I gave an argument why instance creations / time will be proportional to cloud size. I think we agree that stats / time will be proportional to cloud size
15:40:40 <toan-tran> mspreitz +1
15:40:50 <mspreitz> OK with me
15:41:04 <Yathi> It will be good to see some real experiments and numbers
15:41:06 <mspreitz> I think Boris showed that the revision is a win at the size he studied
15:41:22 <mspreitz> and if my expectation is right, that winnage extends to other size
15:41:22 <Yathi> mspreitz, if you have experiments done in your team, please share the results or any paper you may have written
15:42:04 <mspreitz> My group is not studying that question, we are pretty convinced that Boris has a win
15:42:15 <mspreitz> we are working on scaling study
15:42:33 <n0ano> moving on...
15:42:39 <Yathi> +1
15:42:44 <n0ano> #scheduler code forklift
15:42:52 <n0ano> good news/bad news
15:43:17 <n0ano> good news - the changes to devstack have been merged so you can now request the gantt scheduler service
15:43:59 <n0ano> bad news - still working on pushing the change to get gantt to pass the unit tests, every time I think I've got it another (mostly minor) issue crops up
15:44:21 <n0ano> getting there, it's just a matter of resolving the review issues
15:45:07 <coolsvap> n0ano: that's great!
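The scaling argument from the no-db discussion (15:36 to 15:40) can be written out explicitly. The symbols are assumptions introduced here and do not appear in the meeting: H is the number of hosts, c the guest slots per host, u the overall utilization, T the mean guest lifetime, and τ the period between stats updates from one host. The first step is Little's law applied to running guests.

```latex
\begin{align*}
  N_{\text{guests}} &= u\,c\,H
      && \text{(running guests at utilization } u\text{)} \\
  \lambda_{\text{sched}} &= \frac{N_{\text{guests}}}{T} = \frac{u\,c\,H}{T}
      && \text{(Little's law: arrivals per unit time)} \\
  \lambda_{\text{stats}} &= \frac{H}{\tau}
      && \text{(one periodic update per host per } \tau\text{)} \\
  \frac{\lambda_{\text{stats}}}{\lambda_{\text{sched}}} &= \frac{T}{u\,c\,\tau}
      && \text{(independent of } H\text{)}
\end{align*}
```

Under these assumptions both rates grow in proportion to H, which matches mspreitz's corrected statement, and the ratio n0ano cares about exceeds 1 whenever T > u c τ, regardless of cloud size. Whether real deployments keep u and T constant as they grow is exactly the open question raised in the discussion.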
15:45:11 <ddutta> +1
15:46:07 <n0ano> actually, the other good news is that gantt passes most of the tempest tests locally (I fail about 100 tests out of 2000 but I get the same failures on a clean devstack build with the nova scheduler so the failures are probably my local setup)
15:46:43 <n0ano> anyway, if anyone wants to review the changes at https://review.openstack.org/#/c/68521/ go for it
15:46:50 <ddutta> n0ano: does the scheduler code work standalone right now? without OS dependencies
15:47:24 <n0ano> ddutta, no, that's the next step, to cut all ties to nova, not there yet
15:47:49 <ddutta> coool!
15:48:31 <n0ano> I like stepwise progression - first tree -> unit tests -> tempest tests -> independent tree
15:48:51 <ddutta> agreed!
15:49:16 <coolsvap> n0ano: I will give a try to gantt tree tempest tests tonight
15:49:20 <toan-tran> n0ano +1
15:50:14 <n0ano> coolsvap, one warning, you have to change the SCHEDULER environment variable from its default, set it to gantt.scheduler.filter_scheduler.FilterScheduler in your localrc
15:50:38 <n0ano> other than that, it should work (I want to know if it doesn't)
15:51:04 <toan-tran> n0ano: is there a requirement for nova version?
15:51:39 <coolsvap> n0ano: point noted! thx!
15:51:44 <n0ano> for the unit tests yes, my local tempest tests have just been against to of tree
15:51:51 <n0ano> s/to of/top of
15:52:06 <Yathi> n0ano, do you have any documentation of how we can start trying out gantt now ?
15:52:32 <n0ano> depending upon how far the top of tree changes this could be a problem for tempest, that's why we need to cut the cord as soon as possible
15:53:17 <n0ano> Yathi, no, it's pretty simple, I can send an email to the ML to tell how to do it (enable gantt, disable n-sch, set SCHEDULER)
15:53:48 <Yathi> ok cool.. will look for it
15:54:18 <n0ano> approaching the top of the hour
15:54:21 <n0ano> #topic opens
15:54:30 <n0ano> anyone have anything else?
15:54:39 <mspreitz> ML about gantt will be good
15:54:48 <Yathi> Request to review two patches for Solver scheduler
15:54:56 <mspreitz> pointers please
15:54:57 <n0ano> Yathi, links?
15:55:00 <Yathi> #link https://review.openstack.org/#/c/46588/
15:55:11 <Yathi> #link https://review.openstack.org/#/c/70654/
15:55:19 <mspreitz> thanks
15:55:21 <n0ano> Yathi, tnx
15:55:21 <ddutta> and the API patches https://review.openstack.org/#/c/62557/
15:55:37 <ddutta> sorry I meant API for group instance
15:55:48 <toan-tran> and the Policy Based Scheduler patch: https://review.openstack.org/#/c/61386
15:55:50 <toan-tran> :)
15:55:54 <ddutta> :)
15:55:58 <Yathi> :)
15:56:15 <n0ano> mspreitz, there have been multiple ML threads on gantt and there's a launchpad about it at https://etherpad.openstack.org/p/icehouse-external-scheduler
15:56:29 <mspreitz> I meant the instr on how to test
15:56:44 <n0ano> mspreitz, ah, sure NP
15:56:56 * n0ano I guess I really have to write today :-)
15:57:39 <Yathi> n0ano, I may have missed this from the previous meetings, how will the new patches to scheduler, like the Solver scheduler, policy-based scheduler be merged with gantt ?
15:57:52 <Yathi> after gantt merging, before, and re-merge with gantt ?
15:58:44 <n0ano> the current gantt tree is a preview, after we get it working we'll re-create the tree, apply the changes we now know get it working (lots of work for me) and then cut to gantt
15:59:17 <Yathi> ok cool Thanks
15:59:31 <n0ano> OK guys, tnx and we'll talk again
15:59:37 <n0ano> #endmeeting