15:00:03 <bauzas> #startmeeting gantt
15:00:04 <openstack> Meeting started Tue Mar 18 15:00:03 2014 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:07 <openstack> The meeting name has been set to 'gantt'
15:00:28 <bauzas> hi all, anyone here to discuss the scheduler?
15:00:38 <mspreitz> o/
15:00:40 <PaulMurray> hi
15:00:53 <bauzas> hi
15:01:08 <bauzas> let's wait a few minutes
15:01:10 <lcostantino> hi
15:01:24 <bauzas> is boris-42 there?
15:01:49 <bauzas> the first topic is about discussing the no-db scheduler blueprint
15:02:26 <bauzas> but we can move forward and go to the 2nd topic
15:02:47 <bauzas> #topic scheduler forklift
15:03:03 <bauzas> #link https://etherpad.openstack.org/p/icehouse-external-scheduler
15:03:20 * johnthetubaguy waves, but is a bit distracted
15:03:44 <bauzas> so, there are 2 blueprints for the forklift
15:04:02 <bauzas> #link https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
15:04:12 <bauzas> #link https://blueprints.launchpad.net/nova/+spec/scheduler-lib
15:04:29 <johnthetubaguy> yeah, just looking at the draft change there
15:04:37 <bauzas> johnthetubaguy: thanks
15:04:49 <bauzas> maybe could you please make me owner of https://blueprints.launchpad.net/nova/+spec/scheduler-lib ?
15:04:53 <johnthetubaguy> doesn't seem quite right yet, but I see the idea
15:05:04 <bauzas> johnthetubaguy: that's the goal of FF
15:05:04 <johnthetubaguy> bauzas: what is your lp nick?
15:05:05 <bauzas> :)
15:05:09 <bauzas> sylvain-bauza
15:05:47 <johnthetubaguy> you are going to need a better spec for this I think, would be good to rough that out
15:05:51 <bauzas> johnthetubaguy: I'm taking FF as an opportunity for drafting the patch
15:06:12 <bauzas> johnthetubaguy: we can open a distinct etherpad or create a wiki page
15:06:19 <bauzas> johnthetubaguy: both are fine with me
15:06:29 <johnthetubaguy> sure, wiki could work
15:07:12 <bauzas> ok, taking an action to create it
15:07:30 <johnthetubaguy> need to be clearer about the split
15:07:32 <bauzas> #action bauzas Create a wiki page for spec'ing bp scheduler-lib
15:07:38 <johnthetubaguy> nova db stuff stays in nova db
15:07:43 <johnthetubaguy> think about the information flow
15:08:29 <bauzas> johnthetubaguy: ComputeNode should be thought of as stored in Gantt
15:08:41 <bauzas> johnthetubaguy: especially if we consider moving to memcached
15:08:53 <bauzas> johnthetubaguy: for storing host states
15:09:22 <bauzas> anyone else want to see the draft?
15:09:35 <bauzas> #link https://review.openstack.org/80113
15:10:09 <mspreitz> yes, I would like to see it
15:10:24 <bauzas> ok, will add you as reviewer
15:10:39 <mspreitz> thanks
15:10:47 <bauzas> mspreitz: done
15:10:51 <PaulMurray> bauzas I would like to look
15:11:03 <johnthetubaguy> bauzas: I think you are hooking up too high
15:11:03 <PaulMurray> please
15:11:05 <bauzas> PaulMurray: done
15:11:10 <PaulMurray> thx
15:11:14 <johnthetubaguy> nova service status will stay in nova after the split I feel
15:11:35 <bauzas> johnthetubaguy: oh, seems there was some confusion
15:11:44 <bauzas> johnthetubaguy: service status should stay in Nova, right
15:11:54 <bauzas> johnthetubaguy: compute node status should move to Gantt
15:12:00 <johnthetubaguy> yep, the scheduler probably has to have its own copy of that state
15:12:12 <bauzas> johnthetubaguy: ok, I see my mistake
15:12:20 <johnthetubaguy> self.conductor_api.compute_node_update
15:12:42 <bauzas> johnthetubaguy: that one should be replaced by a call to memcached (at the end of the story of course)
15:12:52 <bauzas> johnthetubaguy: see the discussion around no-db-scheduler
15:13:03 <johnthetubaguy> I don't agree with that, but it doesn't really matter right now
15:13:12 <johnthetubaguy> it's certainly a valid option
15:13:15 <bauzas> johnthetubaguy: yey, let's discuss this at the summit
15:13:26 <bauzas> that's not that important for now
15:13:31 <johnthetubaguy> agreed
15:13:39 <johnthetubaguy> so if scheduler lib is just:
15:13:51 <johnthetubaguy> conductor_api.compute_node_update
15:13:59 <johnthetubaguy> and select_destinations
15:14:01 <johnthetubaguy> would that work?
15:14:16 <johnthetubaguy> so the scheduler lib just sends it to the conductor today, later it sends it to the scheduler
15:14:30 <johnthetubaguy> then that's about it I think…?
15:14:38 <johnthetubaguy> nice simple client
15:15:00 <bauzas> johnthetubaguy: I was thinking of replacing _sync_compute_node()
15:15:10 <bauzas> of the RT
15:15:12 <johnthetubaguy> yeah, that's way too big I feel
15:15:30 <johnthetubaguy> it's really just a compute node plus a dict sent to the scheduler right?
15:15:33 <PaulMurray> bauzas conductor_api.compute_node_update is going away because of objects, but
15:15:39 <johnthetubaguy> we need to agree the format mind
15:15:43 <PaulMurray> there will be something similar that can be done
15:15:47 <johnthetubaguy> PaulMurray: agreed, it's just that operation
15:15:55 <bauzas> johnthetubaguy: PaulMurray: ok, let's discuss this on the wiki page
15:16:10 <PaulMurray> ok
15:16:21 <bauzas> johnthetubaguy: I see your idea
15:16:36 <johnthetubaguy> basically the client should be a single line seam in nova
15:16:45 <bauzas> johnthetubaguy: should we consider that ComputeNode should stay in Nova ?
15:16:49 <johnthetubaguy> I think you got select_destinations spot on
15:17:02 <bauzas> yey, that's the issue with the RT
15:17:13 <bauzas> because select_destinations() is already decoupled
15:17:39 <johnthetubaguy> yeah, ComputeNode will stay in nova, I think you need a nova method call, and a scheduler call at the moment, but not totally sure about it, I think I see your idea more now
15:17:42 <bauzas> but the RT has a tight dependency on the scheduler
15:18:23 <johnthetubaguy> well it could all go into oslo, but yes, need to deal with these things
15:18:27 <bauzas> johnthetubaguy: ok, let's all of us take some time discussing the split itself
15:18:47 <bauzas> johnthetubaguy: and see what we shall do step-by-step
15:19:38 <bauzas> PaulMurray: I still have to review https://blueprints.launchpad.net/nova/+spec/make-resource-tracker-use-objects
15:20:08 <bauzas> ok, I'm done with this topic
15:20:16 <bauzas> anyone else have something to add?
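The "nice simple client" sketched in this exchange boils down to two entry points behind one seam: the compute-node stats update (conductor_api.compute_node_update today) and select_destinations. A minimal illustrative sketch follows; all class and method names here are hypothetical stand-ins, not the actual Nova API:

```python
# Hypothetical sketch of the thin scheduler-lib "seam" discussed above.
# Today the client forwards to the conductor; after the forklift, the
# same two calls would go to Gantt. Names are illustrative only.

class FakeConductorAPI:
    """Stand-in for nova's conductor API, for demonstration only."""
    def __init__(self):
        self.nodes = {}

    def compute_node_update(self, node_id, values):
        # Merge the new stats dict into the stored compute-node record.
        self.nodes.setdefault(node_id, {}).update(values)
        return self.nodes[node_id]


class SchedulerClient:
    """Just two entry points: push node stats, ask for a destination.

    Swapping self.backend from the conductor to a Gantt endpoint is the
    whole forklift, as far as callers are concerned.
    """
    def __init__(self, backend):
        self.backend = backend

    def update_resource_stats(self, node_id, stats):
        return self.backend.compute_node_update(node_id, stats)

    def select_destinations(self, request_spec):
        # Toy placement policy: pick the node with the most free RAM.
        nodes = self.backend.nodes
        return max(nodes, key=lambda n: nodes[n].get("free_ram_mb", 0))


client = SchedulerClient(FakeConductorAPI())
client.update_resource_stats("node1", {"free_ram_mb": 2048})
client.update_resource_stats("node2", {"free_ram_mb": 4096})
print(client.select_destinations({"ram_mb": 512}))  # node2
```

The point of the single seam is that the resource tracker never needs to know which side of the split it is talking to.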
15:20:28 <PaulMurray> just a thought
15:20:34 <bauzas> sure
15:20:47 <PaulMurray> there is some code on both the scheduler and compute node side
15:20:53 <PaulMurray> to do with claims
15:21:01 <PaulMurray> on the compute node and
15:21:12 <PaulMurray> consuming from host state on the scheduler
15:21:18 <digambar> Hi
15:21:20 <digambar> https://review.openstack.org/80113
15:21:24 <PaulMurray> would be good to think whether it needs to be that way
15:21:29 <digambar> this url is not opening up
15:21:30 <bauzas> claims are only located in the resource tracker, right?
15:21:35 <PaulMurray> right
15:21:41 <PaulMurray> but the effect of a claim is
15:21:49 <PaulMurray> repeated in host state
15:21:57 <PaulMurray> on the scheduler
15:22:01 <bauzas> digambar: please give me your Gerrit username so I can add you as reviewer
15:22:15 <PaulMurray> it's just something to think about
15:22:18 <bauzas> PaulMurray: see your point, and already thought about it
15:22:38 <bauzas> PaulMurray: the idea is that RT objects should place a call to scheduler-lib for updating states
15:22:45 <johnthetubaguy> PaulMurray: yeah, it's getting the right slice, I am thinking duplicate state for the first cut, but not sure
15:22:46 <digambar> ok
15:23:02 <bauzas> either in claims or through the update_available_resources() method
15:23:05 <digambar> username: digambar
15:23:14 <digambar> email: digambarpatil15@yahoo.co.in
15:23:32 <PaulMurray> bauzas one of the problems
15:23:40 <PaulMurray> I had to deal with in extensible RT
15:23:44 <bauzas> digambar: I already added you, please check that you are logged in
15:23:55 <digambar> ok
15:23:56 <PaulMurray> is that a new resource has to be implemented on both the compute node and the scheduler
15:24:13 <bauzas> PaulMurray: I see your problem
15:24:16 <PaulMurray> so to allow the split I did a plugin for both, but it could have been one plugin
15:24:23 <PaulMurray> shared
15:24:42 <PaulMurray> it may be a limitation to live with, but there might be another way to factor the code long term
15:24:54 <bauzas> johnthetubaguy: that's one of the reasons IMHO I think ComputeNode should get out of Nova and be stored only in Gantt
15:24:57 <johnthetubaguy> PaulMurray: objects should be a good plugin point… good point
15:25:22 <johnthetubaguy> bauzas: yeah, ComputeNode probably should go, but service stays, my bad
15:25:52 <bauzas> ok, let's write the rationale and see how it integrates with the extensible RT
15:25:59 <johnthetubaguy> bauzas: a new resource tracker that is gantt specific could be the other way, but that seems too broad
15:26:08 <bauzas> +1
15:26:29 <johnthetubaguy> PaulMurray: does the object work make this clearer?
15:26:35 <bauzas> hence the hook on RT.update_available_resource()
15:26:36 <PaulMurray> not really
15:26:58 <bauzas> the RT should stay in Nova
15:26:58 <johnthetubaguy> hmm, just thinking, could compute node be like instance and cells, and report to the scheduler vs nova
15:27:08 <johnthetubaguy> yeah the RT will be in Nova
15:27:18 <digambar> how to configure the https://github.com/openstack/gantt repo with openstack
15:27:36 <digambar> to replace the existing scheduler with this new one?
15:27:53 <bauzas> digambar: gantt is currently not ready to be integrated
15:28:01 <digambar> okay
15:28:17 <digambar> then we have to test it another way
15:28:19 <digambar> ?
15:28:26 <johnthetubaguy> bauzas: I think I like what you have done now, thinking about this more...
15:28:59 <bauzas> johnthetubaguy: provided I find some way to query the nova service either way
15:29:23 <bauzas> johnthetubaguy: that's a big showstopper
15:29:36 <johnthetubaguy> bauzas: I don't understand you
15:29:56 <johnthetubaguy> bauzas: the blocker for me is removing the old code from resource_tracker.py
15:30:01 <bauzas> johnthetubaguy: the service table will stay in Nova
15:30:30 <johnthetubaguy> bauzas: yeah, the service table will be in Nova, gantt will need its own copy of that
15:30:32 <bauzas> johnthetubaguy: so the call to _get_service() requires an external call
15:30:44 <bauzas> johnthetubaguy: a copy or an API call ?
15:30:45 <bauzas> :)
15:31:04 <bauzas> johnthetubaguy: I don't much like caching objects :)
15:31:20 <johnthetubaguy> bauzas: I think it's about getting the correct data split
15:31:31 <johnthetubaguy> gantt should have its own view of what services it knows about
15:31:38 <johnthetubaguy> nova-computes and nova-volumes
15:31:46 <johnthetubaguy> Nova needs to maintain its own list
15:31:56 <johnthetubaguy> and we need to plug in correctly
15:32:20 <bauzas> johnthetubaguy: well, that means that creating a Nova service must also end up creating it on Gantt
15:32:43 <bauzas> johnthetubaguy: why can't we consider that Gantt discovers services using a REST call ?
15:32:55 <bauzas> thanks to python-novaclient
15:33:20 <johnthetubaguy> bauzas: not sure I understand what you are trying to say, but I think we are agreeing, just perhaps not on the implementation
15:33:28 <bauzas> agreed
15:33:35 <bauzas> taking the point
15:33:48 <johnthetubaguy> there are two things
15:34:00 <bauzas> yey
15:34:03 <johnthetubaguy> is the service alive, just like nova-network or whatever
15:34:11 <johnthetubaguy> what stats does the service have
15:34:15 <bauzas> +1
15:34:17 <johnthetubaguy> for the compute node
15:34:22 <johnthetubaguy> we need the first in nova
15:34:22 <bauzas> exactly
15:34:28 <johnthetubaguy> the second goes in gantt
15:34:32 <bauzas> +1
15:34:37 <PaulMurray> good
15:34:53 <johnthetubaguy> OK, so I think we agree
15:35:01 <johnthetubaguy> my plan for this is thus….
15:35:04 <bauzas> but the question is: how can gantt discover nova data?
15:35:13 <bauzas> for services
15:35:24 <johnthetubaguy> make nova-scheduler not make any DB calls, except for the above compute node stats
15:35:37 <johnthetubaguy> make no other service access the compute node stats
15:35:48 <bauzas> that's right
15:35:51 <johnthetubaguy> using coding similar to how nova-compute is unable to directly contact the db
15:36:01 <johnthetubaguy> then we can start to prove the split inside the Nova tree
15:36:09 <johnthetubaguy> I think that's the goal here
15:36:11 <bauzas> #agreed make nova-scheduler not make any DB calls, except for the above compute node stats
15:36:16 <bauzas> #agreed make no other service access the compute node stats
15:36:18 <PaulMurray> +1
15:36:47 <bauzas> ok, let's move on and discuss the implementation on the wiki page
15:36:59 <bauzas> anything else to mention on that topic?
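The two-part data split agreed here — service liveness stays in Nova's service table, compute-node stats live only in the scheduler's own store — can be illustrated with a toy sketch. All names below are made up for illustration and do not correspond to real Nova classes:

```python
# Illustrative sketch (hypothetical names) of the agreed data split:
# "is the service alive" stays in Nova; compute-node stats are written
# only through the scheduler's update call, never read from the Nova DB.

import time

class NovaServiceTable:
    """Nova's side: liveness of nova-compute, nova-network, etc."""
    def __init__(self):
        self._heartbeats = {}

    def report(self, service):
        self._heartbeats[service] = time.time()

    def is_up(self, service, timeout=60.0):
        last = self._heartbeats.get(service)
        return last is not None and (time.time() - last) < timeout


class SchedulerHostState:
    """The scheduler's private copy of compute-node stats."""
    def __init__(self):
        self._stats = {}

    def update(self, host, stats):
        # The only write path, mirroring the single scheduler-lib seam.
        self._stats.setdefault(host, {}).update(stats)

    def select(self, ram_mb):
        # Reads happen only inside the scheduler itself.
        return [h for h, s in self._stats.items()
                if s.get("free_ram_mb", 0) >= ram_mb]


services = NovaServiceTable()
hosts = SchedulerHostState()
services.report("nova-compute@node1")
hosts.update("node1", {"free_ram_mb": 1024})
print(services.is_up("nova-compute@node1"), hosts.select(512))
```

Keeping the two stores behind separate interfaces is what lets "no other service access the compute node stats" be enforced mechanically, the same way nova-compute is kept away from direct DB access.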
15:37:22 <johnthetubaguy> I think we can do this in the code review now, but let's see how it goes
15:37:31 <bauzas> ok
15:37:43 <bauzas> #topic no-db-scheduler
15:38:21 <bauzas> I have nothing to say here, but maybe there is a discussion on -dev about a scalable scheduler that could make use of it
15:38:54 <johnthetubaguy> oh, have you got a link?
15:39:09 <bauzas> #link http://lists.openstack.org/pipermail/openstack-dev/2014-March/030084.html
15:39:43 <johnthetubaguy> oh right
15:39:47 <bauzas> johnthetubaguy: the problem is about instance groups and a race condition happening
15:40:06 <johnthetubaguy> yep, I responded to that saying I agree with Russell's fix, not read the response to that
15:40:09 <johnthetubaguy> (yet)
15:40:14 <bauzas> so I would be interested in knowing the status of the no-db-scheduler bp
15:40:32 <johnthetubaguy> it got deferred for being too risky in Juno
15:40:43 <bauzas> yey, of course
15:40:49 <johnthetubaguy> if it's not optional I will −2 it because it requires a new dependency
15:41:00 <bauzas> johnthetubaguy: we just need to make sure it can handle this issue
15:41:16 <johnthetubaguy> I like the general ideas as an option
15:41:32 <bauzas> agreed
15:41:46 <bauzas> there is one stackforge project about distributed locking, called tooz
15:42:09 <bauzas> it would be nice to see if there are mutual concerns here
15:42:21 <johnthetubaguy> hmm, I don't mind that being later
15:42:21 <bauzas> and I personally do think so
15:42:35 <johnthetubaguy> don't want too many dependencies, but also want to share code eventually
15:43:09 <mspreitz> bauzas: what do you mean by mutual concerns for tooz?
15:43:34 <bauzas> mspreitz: I mean that a distributed locking mechanism is one of the goals for memcached scheduler states
15:44:23 <mspreitz> bauzas: you mean no-db-scheduler should use tooz?
15:44:35 <bauzas> that could be one option yes
15:45:22 <bauzas> and tooz could also add a memcached backend plugin
15:45:33 <bauzas> in order to minimize the dependencies
15:45:42 <johnthetubaguy> yeah, that's something for after the current code merges
15:45:52 <bauzas> of course :-)
15:45:56 <johnthetubaguy> good good
15:46:09 <PaulMurray> bauzas I have a question
15:46:14 <bauzas> ok, that's it for me on that topic
15:46:17 <bauzas> PaulMurray: sure
15:46:29 <PaulMurray> don't want to open it up too much
15:46:42 <PaulMurray> but it seems that this discussion is going towards a general
15:46:52 <PaulMurray> configuration service
15:47:00 <PaulMurray> you know, like zookeeper
15:47:04 <PaulMurray> I can see why
15:47:13 <PaulMurray> do you see the scheduler
15:47:14 <bauzas> tooz is based on zk :-)
15:47:27 <PaulMurray> taking that role or do you think something else should
15:47:28 <bauzas> that's the current default driver
15:47:33 <PaulMurray> and the scheduler should use that service
15:47:46 <bauzas> PaulMurray: it's too early to answer that :)
15:48:08 <PaulMurray> but thanks anyway - didn't know about tooz
15:48:09 <bauzas> for Juno, we should just make sure the paths won't diverge too much
15:48:31 <bauzas> but that could be an opportunity for a later cycle
15:48:36 <bauzas> IMHO
15:48:36 <PaulMurray> BTW I hate zookeeper
15:48:47 <bauzas> hence the plugin mechanism :D
15:48:49 <bauzas> write your own ;)
15:48:53 <PaulMurray> :)
15:49:08 <johnthetubaguy> yeah, that's an interesting one
15:49:30 <bauzas> I agree that now we should focus on the memcached scheduler
15:49:51 <bauzas> but just make sure that the interfaces are flexible enough to accept a new backend
15:50:02 <johnthetubaguy> erm, well I think we need to focus on the split, which gives the memcached scheduler a good plugging-in point
15:50:05 <bauzas> the little I read of the reviews makes me think that's the case
15:50:26 <bauzas> johnthetubaguy: yey, both efforts are separate
15:50:45 <bauzas> johnthetubaguy: but both can profit
15:50:49 <bauzas> anyway
15:50:58 <bauzas> let's discuss this at the summit
15:51:07 <bauzas> and that leads me to the next topic
15:51:07 <johnthetubaguy> bauzas: not so sure anymore, I have a feeling the no-db-scheduler needs the split to plug in in an optional way
15:51:15 <johnthetubaguy> sure, do continue...
15:51:24 <bauzas> #topic open discussion
15:51:39 <bauzas> a quick FYI: http://summit.openstack.org/cfp/details/80
15:52:15 <bauzas> at the moment, I don't have a need for another discussion
15:52:41 <johnthetubaguy> do you want to cover no-db-scheduler?
15:52:55 <bauzas> I'm not the owner
15:53:04 <bauzas> I will send an email to boris-42
15:53:11 <johnthetubaguy> sounds good
15:53:18 <bauzas> and see if he plans to promote it
15:53:38 <bauzas> that's it for me
15:53:43 <bauzas> we have 5 mins left
15:53:52 <bauzas> any other subject to discuss?
15:55:02 <PaulMurray> what's the abbreviation for "I hear the tumbleweeds blowing in the wind"?
15:55:27 <johnthetubaguy> d.o.n.e ?
15:55:38 <bauzas> no shout, no doubt :)
15:55:42 <bauzas> #endmeeting
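The no-db-scheduler direction raised in the meeting — host states in a memcached-like store, with the race conditions around claims and instance groups handled atomically rather than through DB writes — can be sketched with a compare-and-swap retry loop. The snippet below mimics memcached's gets/cas pair in-process; real code would use a memcached client or a tooz coordinator, and none of these names come from the actual blueprint:

```python
# Minimal in-memory stand-in for memcached's gets/cas, to illustrate how
# the scheduler could claim resources without a DB or a global lock.
# All names are hypothetical; this is a sketch, not the proposed design.

class CasStore:
    def __init__(self):
        self._data = {}   # key -> (version, value)

    def gets(self, key):
        # Return the current version token alongside the value.
        return self._data.get(key, (0, None))

    def cas(self, key, value, version):
        # Store only if nobody has written since we read `version`.
        cur_version, _ = self._data.get(key, (0, None))
        if cur_version != version:
            return False              # someone else won the race
        self._data[key] = (version + 1, value)
        return True


def consume_ram(store, host, ram_mb, retries=5):
    """Atomically claim RAM from a host state, retrying on CAS conflicts."""
    for _ in range(retries):
        version, state = store.gets(host)
        state = dict(state or {"free_ram_mb": 0})
        if state["free_ram_mb"] < ram_mb:
            return False              # not enough room, pick another host
        state["free_ram_mb"] -= ram_mb
        if store.cas(host, state, version):
            return True               # claim landed without a lock
    return False


store = CasStore()
store.cas("node1", {"free_ram_mb": 2048}, 0)
print(consume_ram(store, "node1", 512), store.gets("node1")[1])
# True {'free_ram_mb': 1536}
```

Two schedulers racing to consume from the same host state would each read a version token; the loser's cas fails and it retries against fresh data, which is one way the instance-group race discussed on the list could be closed.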