15:00:37 <n0ano> #startmeeting gantt
15:00:38 <openstack> Meeting started Tue Feb  4 15:00:37 2014 UTC and is due to finish in 60 minutes.  The chair is n0ano. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <openstack> The meeting name has been set to 'gantt'
15:00:50 <n0ano> anyone here to talk about the scheduler?
15:01:05 <alexey_o> me here
15:01:16 <alaski> o/
15:01:21 <gilr> hi
15:01:27 <PaulMurray> hi
15:02:10 <toan-tran> hi
15:02:23 <n0ano> then let's get started
15:02:32 <n0ano> #topic no db scheduler
15:02:55 <n0ano> hmm, boris doesn't seem to be on, might not be able to talk about this one much
15:03:38 <alexey_o> well, I can substitute for boris to some extent
15:03:58 <n0ano> alexey_o, I thought you might but I didn't want to presume :-), what have you got?
15:04:49 <alexey_o> I am working on it and I think we are close now
15:05:10 <alexey_o> it turned out not to be a straightforward task to get rid of the compute_nodes table
15:06:29 <n0ano> the devil is always in the details, can you give a short summary of the problem (emphasis on short)
15:07:34 <alexey_o> well, right now I am hunting a heisenbug that thwarts tempest tests at the gate
15:08:16 <alexey_o> also, one can't just drop the compute_nodes table and expect everything to keep working
15:08:51 <alexey_o> it is related to services and appears to be joinedload'ed quite frequently
15:09:04 <n0ano> so nothing obvious, just finding the implicit assumptions that are there.
15:09:29 <n0ano> good luck finding the bugs
15:09:35 <toan-tran> alexey_o: please remind me again, how does nova-conductor work with compute_nodes forked out?
15:09:58 <toan-tran> and where do the computes send their frequent updates: to nova-conductor, or to the scheduler?
15:10:02 <alexey_o> the last problem can be overcome with some effort
15:12:03 <alexey_o> toan-tran: the idea was to get rid of writing state changes to compute_nodes table and keep host-related data right in places where it will be needed
15:12:21 <alexey_o> for example in nova-scheduler
15:12:56 <n0ano> e.g. state the scheduler needs is periodically sent to the scheduler, not the DB
15:13:09 <alexey_o> yes
15:13:23 <alexey_o> ultimately that was supposed to be sent to schedulers
15:13:24 <toan-tran> so the nova-computes will send their updates to the scheduler? that's quite a volume
15:13:48 <toan-tran> and also there would be a question about the existence of nova-conductor at this point
15:14:24 <alexey_o> right now the change does not remove the conductor
15:14:37 <alexey_o> it still listens to state updates
15:14:51 * mspreitz is still unclear on whether we are talking about one scheduler or multiple
15:15:11 <alexey_o> when it receives an update it propagates it to all other entities which are interested in host states
15:15:22 <alexey_o> that would  be multiple schedulers
15:15:24 <mspreitz> how many entities is that?  Which are they?
15:15:48 <alexey_o> at least one more -- a scheduler
15:16:23 <mspreitz> And there is just one conductor?
15:16:29 <alexey_o> it is quite possible to have multiple schedulers, so this change was made keeping that in mind
15:17:13 <alexey_o> I am unsure whether or not the current implementation allows for multiple conductors, but the state synchronization allows for multiple conductors as well
15:18:18 <mspreitz> And how is an update propagated from the original receiver (a conductor) to the other interested parties?  And why is this better than just pub/sub to all of the interested parties directly in the first place?
15:20:09 <alexey_o> as soon as one of the interested parties decides it wants to do something with the locally stored state data, it checks a backend to see if there are any fresh updates
15:21:18 <mspreitz> So it's polled/pulled, not pushed, to other parties
15:21:22 <alexey_o> as for the pub/sub approach, if I understand you correctly, our technique does not load the queue as much as state broadcasting would
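A minimal sketch of the polling scheme alexey_o describes, under stated assumptions: InMemoryBackend, HostStateCache, and updates_since are hypothetical names invented for illustration, not the actual no-db-scheduler code. Computes push their periodic reports into a shared backend, and each interested party pulls only the updates newer than what it last saw, and only when it has work to do.

```python
import time


class InMemoryBackend(object):
    """Stand-in for the shared state store -- an assumption for this
    sketch, not the backend the actual no-db patches use."""

    def __init__(self):
        self._updates = []  # list of (host, state_dict, timestamp)

    def push(self, host, state):
        # Called from the compute side on each periodic report.
        self._updates.append((host, state, time.time()))

    def updates_since(self, ts):
        return [u for u in self._updates if u[2] > ts]


class HostStateCache(object):
    """Host states cached locally by an interested party (a scheduler,
    a conductor, ...) and refreshed by polling rather than broadcast."""

    def __init__(self, backend):
        self.backend = backend
        self.states = {}      # host -> last known state dict
        self.last_sync = 0.0  # newest update timestamp seen so far

    def get_all(self):
        # Pull only the updates newer than what we already hold; an idle
        # scheduler costs nothing no matter how often computes report.
        for host, state, ts in self.backend.updates_since(self.last_sync):
            self.states[host] = state
            self.last_sync = max(self.last_sync, ts)
        return self.states


# Usage: computes push, a scheduler polls only when it has work to place.
backend = InMemoryBackend()
backend.push('node-1', {'free_ram_mb': 2048})
backend.push('node-2', {'free_ram_mb': 8192})
cache = HostStateCache(backend)
states = cache.get_all()
print(max(states, key=lambda h: states[h]['free_ram_mb']))  # node-2
```

The point of the design is visible in get_all(): per-scheduler cost tracks the scheduling rate, not the broadcast rate of updates, which is what the queue-load comparison above turns on.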
15:21:26 <coolsvap> n0ano mspreitz alexey_o can this discussion continue on ML instead of meeting?
15:21:53 <mspreitz> OK with me
15:22:08 <n0ano> if it needs to go too much deeper we can move to ML, I'm OK with a little more discussion.
15:22:15 <alexey_o> ok, feel free to ask any questions
15:22:40 <n0ano> mspreitz, maybe you can start a ML thread on your concerns, they certainly seem valid
15:22:52 <coolsvap> n0ano: I agree, but we are getting stuck at places with no inputs, so just wanted to put a thought
15:22:55 <mspreitz> Maybe I got the answer, let me see if I got it right..
15:23:20 <alexey_o> yes, it is polled by those who care
15:23:36 <mspreitz> The revised technique wins when updates are more frequent than the need to know.  In this situation, the revised technique does a query when there is a need to know
15:24:27 <alexey_o> yes, it was intended to be used in very large scale clouds
15:24:39 <alexey_o> by very large I mean thousands of nodes
15:24:57 <mspreitz> It's a question not only of scale but how often there is a need to know (scheduling task).
15:25:13 <mspreitz> I presume the eval done is accurate.  I think I got my answer.  thanks
15:25:24 <n0ano> alexey_o, one quick question, last week boris mentioned the demo showed bottlenecks not in the scheduler, do you know if these bottlenecks have been identified?
15:25:25 <alexey_o> at such scale updates will be very frequent, especially compared to requests for scheduling
15:25:37 <toan-tran> alexey_o: I notice that there is no design for how all these things interact with other components
15:25:46 <toan-tran> on your blueprint/doc page
15:25:52 * mspreitz is not sure why we would expect scheduling rate to not increase with cloud size
15:25:58 <toan-tran> could you elaborate in a wiki/doc
15:26:16 <toan-tran> so that we can understand how no-db scheduler would interact with all other nova components
15:26:19 <alexey_o> n0ano: unfortunately I don't know
15:26:32 <toan-tran> and how current message flows would go
15:26:36 <alexey_o> it is best to ask boris
15:26:48 <n0ano> mspreitz, the scheduling rate increases, but the node updates increase too (potentially even more)
15:27:03 <n0ano> alexey_o, tnx
15:27:16 <mspreitz> n0ano: is that an observation based on data I can see?
15:27:17 <alexey_o> toan-tran: yes, there is still a lot to document carefully
15:27:55 <n0ano> mspreitz, strictly a guess, hard data would be good to have but *I* think the guess is good
15:28:24 <mspreitz> Can you elaborate on why you expect scheduling rate to increase more slowly than cloud size?
15:28:43 <alexey_o> mspreitz: some time ago I measured the delay in MySQL's response to a compute node get as a function of the number of compute_nodes records
15:29:06 <johnthetubaguy> as long as its optional, then I am good with having it as an option for people to try, and we can decide to change the default, once we "are happy" with it
15:29:34 <n0ano> as cloud size increases the node updates increase, if guests are long lived then the scheduling rate should not increase at the same rate
15:29:45 <alexey_o> since the scheduler issues a request to grab all compute nodes from the db, it takes more time to schedule a single instance
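As an illustration of that measurement, a hedged sketch using SQLAlchemy: the connection URL and the harness itself are assumptions, while compute_nodes is the table the scheduler reads in full per scheduling request.

```python
import time

import sqlalchemy

# Hypothetical harness -- URL, credentials, and run count are assumptions.
engine = sqlalchemy.create_engine('mysql://nova:password@localhost/nova')


def time_compute_node_get(runs=10):
    """Average wall-clock time to fetch every compute_nodes row."""
    with engine.connect() as conn:
        start = time.time()
        for _ in range(runs):
            conn.execute(sqlalchemy.text('SELECT * FROM compute_nodes')).fetchall()
        elapsed = time.time() - start
    return elapsed / runs


if __name__ == '__main__':
    # Repeat against databases populated with, say, 100, 1000, and 10000
    # node records to plot delay vs. table size, as alexey_o describes.
    print('%.4f s per full compute_nodes fetch' % time_compute_node_get())
```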
15:30:17 <johnthetubaguy> number of nodes vs rate of builds vs rate of deletes is going to change what is best, so trying this out seems to make sense to me
15:31:01 <mspreitz> alexey_o: yes, that makes sense.  I think n0ano is making a different assertion: the (number of calls to scheduler)/time is less than proportional to cloud size
15:31:21 <alexey_o> yes, I got it wrong the first time
15:31:52 <mspreitz> n0ano: I do not follow your reply...
15:32:09 <n0ano> mspreitz, no, I think I'm saying (number of calls to sched)/time is less than (node updates)/time, especially as the cloud gets larger
15:32:11 <mspreitz> Do you expect guest lifetime to increase with cloud size?
15:32:49 <mspreitz> node = host, I take it
15:32:57 <n0ano> mspreitz, yes
15:33:24 <mspreitz> node update  = guest start, guest stop, or stats, or what?
15:33:50 <n0ano> stats is the real concern, start/stop should be directly proportional to scheduler calls
15:34:12 <toan-tran> mspreitz: I think node update = periodic messages
15:34:21 <mspreitz> n0ano: your previus "yes" was response to terminology question, or guest lifetime?
15:34:48 <n0ano> I think terminology confusions, I'm not really concerned about lifetime
15:34:51 <toan-tran> = number of nodes (per minute)
15:35:21 <mspreitz> stats update / time should be proportional to cloud size
15:35:23 <toan-tran> while (call to scheduler) = guests requesting compute nodes
15:35:34 <toan-tran> mspreitz: yes
15:36:01 <toan-tran> so I do not see clearly the relation between the two
15:36:31 <toan-tran> Big servers ==> fewer updates & more calls to scheduler
15:36:40 <mspreitz> Assuming overall utilization stays constant as cloud size grows, and that guest lifetime stays constant as cloud size grows, this implies that guest arrivals/time stays constant as cloud size grows
15:37:19 <n0ano> but updates/time should increase as the cloud size increases
15:37:32 <toan-tran> mspreitz: I don't think so
15:37:35 <mspreitz> sorry, I stated my conclusion wrong
15:37:49 <mspreitz> the implication is that guest arrivals / time will be proportional to cloud size
15:37:55 <toan-tran> because cloud providers will not increase their size if guest arrivals stay constant
15:38:33 <toan-tran> remember that a scheduler call happens when there is a change in the number of instances of clients' applications
15:38:35 <mspreitz> Assuming overall utilization stays constant as cloud size grows, and that guest lifetime stays constant as cloud size grows, this implies that guest arrivals/time will be proportional to cloud size
15:39:12 <toan-tran> so if guest arrival is constant ==> there is almost no activity on clients' applications
15:39:18 <toan-tran> it's hard to fathom
15:39:31 <mspreitz> toan-tran: right.  Read my corrected statement
15:39:35 <n0ano> well, my concern is the ratio between (update requests)/(scheduler requests), I'm thinking that ratio should be >1
15:40:18 <n0ano> guys, I hate to cut this short but I think we have to move on, one other topic I'd like to cover today
15:40:20 <mspreitz> I gave an argument why instance creations / time will be proportional to cloud size.  I think we agree that stats / time will be proportional to cloud size
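mspreitz's argument in symbols (a sketch; $N$, $c$, $U$, $T$, $\tau$ are notation introduced here, not from the meeting): take $N$ hosts holding $c$ guests each at constant utilization $U$ and constant mean guest lifetime $T$. Little's law gives

$$UcN = \lambda T \quad\Rightarrow\quad \lambda = \frac{UcN}{T} \propto N,$$

where $\lambda$ is the instance arrival (scheduler call) rate. With each host reporting stats every $\tau$ seconds, updates arrive at $N/\tau \propto N$, so both rates grow linearly with cloud size, and their ratio

$$\frac{N/\tau}{UcN/T} = \frac{T}{Uc\tau}$$

is independent of $N$. It exceeds 1 (n0ano's expectation) exactly when the mean guest lifetime exceeds $Uc\tau$: for example, with $\tau = 60$ s, $c = 20$, $U = 0.5$, any lifetime over 10 minutes gives more updates than scheduler calls.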
15:40:40 <toan-tran> mspreitz +1
15:40:50 <mspreitz> OK with me
15:41:04 <Yathi> It will be good to see some real experiments and numbers
15:41:06 <mspreitz> I think Boris showed that the revision is a win at the size he studied
15:41:22 <mspreitz> and if my expectation is right, that win extends to other sizes
15:41:22 <Yathi> mspreitz, if you have experiments done in your team, please share the results or any paper you may have written
15:42:04 <mspreitz> My group is not studying that question, we are pretty convinced that Boris has a win
15:42:15 <mspreitz> we are working on a scaling study
15:42:33 <n0ano> moving on...
15:42:39 <Yathi> +1
15:42:44 <n0ano> #topic scheduler code forklift
15:42:52 <n0ano> good news/bad news
15:43:17 <n0ano> good news - the changes to devstack have been merged so you can now request the gantt scheduler service
15:43:59 <n0ano> bad news - still working on pushing the change to get gantt to pass the unit tests, every time I think I've got it another (mostly minor) issue crops up
15:44:21 <n0ano> getting there, it's just a matter of resolving the review issues
15:45:07 <coolsvap> n0ano: thats great!
15:45:11 <ddutta> +1
15:46:07 <n0ano> actually, the other good news is that gantt passes most of the tempest tests locally (I fail about 100 tests out of 2000, but I get the same failures on a clean devstack build with the nova scheduler, so the failures are probably my local setup)
15:46:43 <n0ano> anyway, if anyone wants to review the changes at https://review.openstack.org/#/c/68521/ go for it
15:46:50 <ddutta> n0ano: does the scheduler code work standalone right now? without OpenStack dependencies
15:47:24 <n0ano> ddutta, no, that's the next step, to cut all ties to nova, not there yet
15:47:49 <ddutta> coool!
15:48:31 <n0ano> I like stepwise progression - first tree -> unit tests -> tempest tests -> independent tree
15:48:51 <ddutta> agreed!
15:49:16 <coolsvap> n0ano: I will give the gantt tree tempest tests a try tonight
15:49:20 <toan-tran> n0ano +1
15:50:14 <n0ano> coolsvap, one warning, you have to change the SCHEDULER environment variable from its default, set it to gantt.scheduler.filter_scheduler.FilterScheduler in your localrc
15:50:38 <n0ano> other than that, it should work (I want to know if it doesn't)
15:51:04 <toan-tran> n0ano: is there a requirement for nova version?
15:51:39 <coolsvap> n0ano: point noted! thx!
15:51:44 <n0ano> for the unit tests yes, my local tempest tests have just been against top of tree
15:52:06 <Yathi> n0ano, do you have any documentation of how we can start trying out gantt now ?
15:52:32 <n0ano> depending upon how far the top of tree changes this could be a problem for tempest, that's why we need to cut the cord as soon as possible
15:53:17 <n0ano> Yathi, no, it's pretty simple, I can send an email to the ML to tell how to do it (enable gantt, disable n-sch, set SCHEDULER)
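Pending that ML post, the recipe n0ano outlines would land in localrc roughly as below; enable_service/disable_service are standard devstack directives and the SCHEDULER value is the one given above, but treat the exact lines as a sketch until the email appears:

```
# localrc sketch: run gantt in place of the in-tree nova scheduler
enable_service gantt
disable_service n-sch
SCHEDULER=gantt.scheduler.filter_scheduler.FilterScheduler
```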
15:53:48 <Yathi> ok cool.. will look for it
15:54:18 <n0ano> approaching the top of the hour
15:54:21 <n0ano> #topic opens
15:54:30 <n0ano> anyone have anything else?
15:54:39 <mspreitz> ML about gantt will be good
15:54:48 <Yathi> Request to review two patches for Solver scheduler
15:54:56 <mspreitz> pointers please
15:54:57 <n0ano> Yathi, links?
15:55:00 <Yathi> #link https://review.openstack.org/#/c/46588/
15:55:11 <Yathi> #link https://review.openstack.org/#/c/70654/
15:55:19 <mspreitz> thanks
15:55:21 <n0ano> Yathi, tnx
15:55:21 <ddutta> and the API patches https://review.openstack.org/#/c/62557/
15:55:37 <ddutta> sorry, I meant the API for instance groups
15:55:48 <toan-tran> and the Policy Based Scheduler patch: https://review.openstack.org/#/c/61386
15:55:50 <toan-tran> :)
15:55:54 <ddutta> :)
15:55:58 <Yathi> :)
15:56:15 <n0ano> mspreitz, there have been multiple ML threads on gantt and there's an etherpad about it at https://etherpad.openstack.org/p/icehouse-external-scheduler
15:56:29 <mspreitz> I meant the instructions on how to test
15:56:44 <n0ano> mspreitz, ah, sure NP
15:56:56 * n0ano I guess I really have to write today :-)
15:57:39 <Yathi> n0ano, I may have missed this from previous meetings, but how will the new scheduler patches, like the Solver scheduler and policy-based scheduler, be merged with gantt?
15:57:52 <Yathi> after the gantt fork, before it, and then re-merged with gantt?
15:58:44 <n0ano> the current gantt tree is a preview, after we get it working we'll re-create the tree, apply the changes we now know get it working (lots of work for me) and then cut to gantt
15:59:17 <Yathi> ok cool Thanks
15:59:31 <n0ano> OK guys, tnx and we'll talk again
15:59:37 <n0ano> #endmeeting