15:00:03 <bauzas> #startmeeting gantt
15:00:04 <openstack> Meeting started Tue Mar 18 15:00:03 2014 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:07 <openstack> The meeting name has been set to 'gantt'
15:00:28 <bauzas> hi all, anyone here to discuss the scheduler ?
15:00:38 <mspreitz> o/
15:00:40 <PaulMurray> hi
15:00:53 <bauzas> hi
15:01:08 <bauzas> let's wait a few minutes
15:01:10 <lcostantino> hi
15:01:24 <bauzas> is boris-42 there ?
15:01:49 <bauzas> the first topic is about discussing no-db scheduler blueprint
15:02:26 <bauzas> but we can move forward and go to the 2nd topic
15:02:47 <bauzas> #topic scheduler forklift
15:03:03 <bauzas> #link https://etherpad.openstack.org/p/icehouse-external-scheduler
15:03:20 * johnthetubaguy waves, but is a bit distracted
15:03:44 <bauzas> so, there are 2 blueprints for the forklift
15:04:02 <bauzas> #link https://blueprints.launchpad.net/nova/+spec/remove-cast-to-schedule-run-instance
15:04:12 <bauzas> #link https://blueprints.launchpad.net/nova/+spec/scheduler-lib
15:04:29 <johnthetubaguy> yeah, just looking at the draft change there
15:04:37 <bauzas> johnthetubaguy: thanks
15:04:49 <bauzas> could you please make me the owner of https://blueprints.launchpad.net/nova/+spec/scheduler-lib ?
15:04:53 <johnthetubaguy> doesn't seem quite right yet, but I see the idea
15:05:04 <bauzas> johnthetubaguy: that's the goal of FF
15:05:04 <johnthetubaguy> bauzas: what is your lp nick?
15:05:05 <bauzas> :)
15:05:09 <bauzas> sylvain-bauza
15:05:47 <johnthetubaguy> you are going to need a better spec for this I think, would be good to rough that out
15:05:51 <bauzas> johnthetubaguy: I'm taking FF as an opportunity for drafting the patch
15:06:12 <bauzas> johnthetubaguy: we can open a distinct etherpad or create a wiki page
15:06:19 <bauzas> johnthetubaguy: both are fine with me
15:06:29 <johnthetubaguy> sure, wiki could work
15:07:12 <bauzas> ok, taking action to create it
15:07:30 <johnthetubaguy> Need to be clearer about the split
15:07:32 <bauzas> #action bauzas Create a wiki page for spec'ing bp scheduler-lib
15:07:38 <johnthetubaguy> nova db stuff stays in nova db
15:07:43 <johnthetubaguy> think about the information flow
15:08:29 <bauzas> johnthetubaguy: ComputeNode should be designed to be stored in Gantt
15:08:41 <bauzas> johnthetubaguy: especially if we consider moving to memcached
15:08:53 <bauzas> johnthetubaguy: for storing host states
15:09:22 <bauzas> anyone else want to see the draft ?
15:09:35 <bauzas> #link https://review.openstack.org/80113
15:10:09 <mspreitz> yes, I would like to see it
15:10:24 <bauzas> ok, will add you as reviewer
15:10:39 <mspreitz> thanks
15:10:47 <bauzas> mspreitz: done
15:10:51 <PaulMurray> bauzas I would like to look
15:11:03 <johnthetubaguy> bauzas: I think you are hooking up too high
15:11:03 <PaulMurray> please
15:11:05 <bauzas> PaulMurray: done
15:11:10 <PaulMurray> thx
15:11:14 <johnthetubaguy> nova service status will stay in nova after the split I feel
15:11:35 <bauzas> johnthetubaguy: oh, seems there was a confusion
15:11:44 <bauzas> johnthetubaguy: service status should stay in Nova, right
15:11:54 <bauzas> johnthetubaguy: computenode status should move to Gantt
15:12:00 <johnthetubaguy> yep, scheduler probably has to have its own copy of that state
15:12:12 <bauzas> johnthetubaguy: ok, I see my mistake
15:12:20 <johnthetubaguy> self.conductor_api.compute_node_update
15:12:42 <bauzas> johnthetubaguy: that one should be replaced by a call to memcached (at the end of the story of course)
15:12:52 <bauzas> johnthetubaguy: see the discussion around no-db-scheduler
15:13:03 <johnthetubaguy> I don't agree with that, but it doesn't really matter right now
15:13:12 <johnthetubaguy> its certainly a valid option
15:13:15 <bauzas> johnthetubaguy: yey, let's discuss this at the summit
15:13:26 <bauzas> that's not that important for now
15:13:31 <johnthetubaguy> agreed
15:13:39 <johnthetubaguy> so if scheduler lib is just:
15:13:51 <johnthetubaguy> conductor_api.compute_node_update
15:13:59 <johnthetubaguy> and the select_destination
15:14:01 <johnthetubaguy> would that work?
15:14:16 <johnthetubaguy> so the client just sends it to the conductor today, later it sends it to the scheduler
15:14:30 <johnthetubaguy> then thats about it I think…?
15:14:38 <johnthetubaguy> nice simple client
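The "nice simple client" seam agreed above — just select_destinations plus the compute_node_update call behind one class — could look roughly like the following sketch. All names here (SchedulerClient, the backend protocol) are illustrative, not actual Nova code:

```python
# Hypothetical sketch of the thin scheduler-lib client discussed above.
# Today the backend is the existing conductor/scheduler RPC APIs; after
# the split, only this class needs to learn to talk to Gantt instead.

class SchedulerClient(object):
    """Single seam through which Nova code talks to the scheduler."""

    def __init__(self, backend):
        # backend: anything exposing select_destinations() and
        # compute_node_update() -- conductor today, a Gantt client later.
        self._backend = backend

    def select_destinations(self, context, request_spec, filter_properties):
        # Ask the scheduler to pick hosts for a request.
        return self._backend.select_destinations(
            context, request_spec, filter_properties)

    def update_resource_stats(self, context, node_name, stats):
        # Push compute node stats (the compute_node_update operation).
        return self._backend.compute_node_update(context, node_name, stats)
```

The point of keeping the seam this small is that callers never know whether the scheduler lives in-tree or out.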
15:15:00 <bauzas> johnthetubaguy: I was thinking to replace _sync_compute_node()
15:15:10 <bauzas> of the RT
15:15:12 <johnthetubaguy> yeah, thats way too big I feel
15:15:30 <johnthetubaguy> its really just compute node plus a dict sent to the scheduler right?
15:15:33 <PaulMurray> bauzas conductor_api.compute_node_update is going away because of objects, but
15:15:39 <johnthetubaguy> we need to agree the format mind
15:15:43 <PaulMurray> there will be something similar that can be done
15:15:47 <johnthetubaguy> PaulMurray: agreed, its just that operation
15:15:55 <bauzas> johnthetubaguy: PaulMurray: ok, let's discuss this on the wiki page
15:16:10 <PaulMurray> ok
15:16:21 <bauzas> johnthetubaguy: I see your idea
15:16:36 <johnthetubaguy> basically client should be a single line seam in nova
15:16:45 <bauzas> johnthetubaguy: should we consider that ComputeNode should stay in Nova ?
15:16:49 <johnthetubaguy> I think you got select_destinations spot on
15:17:02 <bauzas> yey, that's the issue with RT
15:17:13 <bauzas> because select_destinations() is already decoupled
15:17:39 <johnthetubaguy> yeah, ComputeNode will stay in nova, I think you need to call nova method, and scheduler call at the moment, but not totally sure about it, I think I see your idea more now
15:17:42 <bauzas> but RT has a tight dependency on the scheduler
15:18:23 <johnthetubaguy> well it could all go into oslo, but yes, need to deal with these things
15:18:27 <bauzas> johnthetubaguy: ok, let's all take some time to discuss the split itself
15:18:47 <bauzas> johnthetubaguy: and see what we must do step-by-step
15:18:56 <bauzas> s/must/shall (better :) )
15:19:38 <bauzas> PaulMurray: I still have to review https://blueprints.launchpad.net/nova/+spec/make-resource-tracker-use-objects
15:20:08 <bauzas> ok, I'm done with this topic
15:20:16 <bauzas> anyone else has to add something ?
15:20:28 <PaulMurray> just a thought
15:20:34 <bauzas> sure
15:20:47 <PaulMurray> there is some code that is on scheduler and compute node side
15:20:53 <PaulMurray> to do with claims
15:21:01 <PaulMurray> on compute node and
15:21:12 <PaulMurray> consuming from host state on scheduler
15:21:18 <digambar> Hi
15:21:20 <digambar> https://review.openstack.org/80113
15:21:24 <PaulMurray> would be good to think if that needs to be the way
15:21:29 <digambar> this url is not opening up
15:21:30 <bauzas> claims are only located on resourcetracker, right?
15:21:35 <PaulMurray> right
15:21:41 <PaulMurray> but the effect of a claim is
15:21:49 <PaulMurray> repeated in host state
15:21:57 <PaulMurray> on scheduler
15:22:01 <bauzas> digambar: please give me your Gerrit username so I could add you as reviewer
15:22:15 <PaulMurray> its just something to think about
15:22:18 <bauzas> PaulMurray: see your point, and already thought about it
15:22:38 <bauzas> PaulMurray: the idea is that RT objects should place a call to scheduler-lib for updating states
15:22:45 <johnthetubaguy> PaulMurray: yeah, its getting the right slice, I am thinking duplicate state for the first cut, but not sure
15:22:46 <digambar> ok
15:23:02 <bauzas> either in claims or thru the update_available_resources() method
15:23:05 <digambar> username-digambar
15:23:14 <digambar> email-digambarpatil15@yahoo.co.in
15:23:32 <PaulMurray> bauzas one of the problems
15:23:40 <PaulMurray> I had to deal with in extensible RT
15:23:44 <bauzas> digambar: I already added you, please check that you are logged in
15:23:55 <digambar> ok
15:23:56 <PaulMurray> is that a new resource has to be implemented at compute node and scheduler
15:24:13 <bauzas> PaulMurray: I see your problem
15:24:16 <PaulMurray> so to allow split I did a plugin for both, but could have been one plugin
15:24:23 <PaulMurray> shared
15:24:42 <PaulMurray> It may be a limitation to live with, but there might be another way to factor the code long term
15:24:54 <bauzas> johnthetubaguy: that's one of the reasons IMHO ComputeNode should move out of Nova and be stored only in Gantt
15:24:57 <johnthetubaguy> PaulMurray: objects should be a good plugin point… good point
15:25:22 <johnthetubaguy> bauzas: yeah, ComputeNode probably should go, but service stays, my bad
15:25:52 <bauzas> ok, let's write the rationale and see how it integrates with extensible RT
15:25:59 <johnthetubaguy> bauzas: a new resource tracker that is gantt-specific could be the other way, but that seems too broad
15:26:08 <bauzas> +1
15:26:29 <johnthetubaguy> PaulMurray: does the object work make this clearer?
15:26:35 <bauzas> hence the hook on RT.update_available_resource()
15:26:36 <PaulMurray> Not really
15:26:58 <bauzas> RT should stay in Nova
15:26:58 <johnthetubaguy> hmm, just thinking, could compute node be like instance and cells and report to scheduler vs nova
15:27:08 <johnthetubaguy> yeah RT will be in Nova
15:27:18 <digambar> how do we configure the https://github.com/openstack/gantt repo with openstack
15:27:36 <digambar> to replace the existing scheduler with this new one ?
15:27:53 <bauzas> digambar: gantt is currently not ready to be integrated
15:28:01 <digambar> okay
15:28:17 <digambar> then we have to test it some other way ?
15:28:26 <johnthetubaguy> bauzas: I think I like what you have done now, thinking about this more...
15:28:59 <bauzas> johnthetubaguy: provided I find some way to request nova service either way
15:29:23 <bauzas> johnthetubaguy: that's a big showstopper
15:29:36 <johnthetubaguy> bauzas: don't understand you
15:29:56 <johnthetubaguy> bauzas: the blocker for me is removing the old code from resource tracker.py
15:30:01 <bauzas> johnthetubaguy: the service table will stay in Nova
15:30:30 <johnthetubaguy> bauzas: yeah, service table will be in Nova, gantt will need its own copy of that
15:30:32 <bauzas> johnthetubaguy: so that the call to _get_service() requires an external call
15:30:44 <bauzas> johnthetubaguy: a copy or an API call ?
15:30:45 <bauzas> :)
15:31:04 <bauzas> johnthetubaguy: I don't much like caching objects :)
15:31:20 <johnthetubaguy> bauzas: I think its getting the correct data split
15:31:31 <johnthetubaguy> gantt should have its own view of what services it knows about
15:31:38 <johnthetubaguy> nova-computes and nova-volumes
15:31:46 <johnthetubaguy> Nova needs to maintain its own list
15:31:56 <johnthetubaguy> and we need to plug in correctly
15:32:20 <bauzas> johnthetubaguy: well, that means that creating a Nova service must also create it on Gantt
15:32:43 <bauzas> johnthetubaguy: why can't we consider that Gantt is discovering services using a REST call ?
15:32:55 <bauzas> thanks to python-novaclient
15:33:20 <johnthetubaguy> bauzas: not sure I understand what you are trying to say, but I think we are agreeing, just perhaps not on the implementation
15:33:28 <bauzas> agreed
15:33:35 <bauzas> taking the point
15:33:48 <johnthetubaguy> there are two things
15:33:58 <bauzas> yey
15:34:03 <johnthetubaguy> is the service alive, just like nova-network or whatever
15:34:11 <johnthetubaguy> what stats does the service have
15:34:15 <bauzas> +1
15:34:17 <johnthetubaguy> for the compute node
15:34:22 <johnthetubaguy> we need the first in nova
15:34:22 <bauzas> exactly
15:34:28 <johnthetubaguy> the second goes in gantt
15:34:32 <bauzas> +1
15:34:37 <PaulMurray> good
15:34:53 <johnthetubaguy> OK, so I think we agree
15:35:01 <johnthetubaguy> my plan for this is thus….
15:35:04 <bauzas> but the question is : how gantt can discover nova data ?
15:35:13 <bauzas> for services
15:35:24 <johnthetubaguy> make nova-scheduler not make any DB calls, except for the above compute node stats
15:35:37 <johnthetubaguy> make no other service access the compute node stats
15:35:48 <bauzas> that's right
15:35:51 <johnthetubaguy> using code similar to how nova-compute is unable to directly contact the db
15:36:01 <johnthetubaguy> then we can start to prove the split inside the Nova tree
15:36:09 <johnthetubaguy> I think thats the goal here
15:36:11 <bauzas> #agreed make nova-scheduler not make any DB calls, except for the above compute node stats
15:36:16 <bauzas> #agreed make no other service access the compute node stats
15:36:18 <PaulMurray> +1
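The two #agreed rules above — the scheduler touches nothing in the DB except compute node stats, enforced the same way nova-compute is blocked from direct DB access — could be proven in-tree with a whitelist proxy along these lines. This is an illustrative sketch, not actual Nova code; the names GuardedDBAPI and ALLOWED_FOR_SCHEDULER are invented for the example:

```python
# Hedged sketch of enforcing the agreed data split: scheduler code goes
# through this proxy, which rejects any DB call outside the whitelist.

ALLOWED_FOR_SCHEDULER = frozenset([
    'compute_node_get_all',
    'compute_node_update',
])

class GuardedDBAPI(object):
    """Proxy that only forwards whitelisted DB API calls."""

    def __init__(self, real_db_api, allowed=ALLOWED_FOR_SCHEDULER):
        self._db = real_db_api
        self._allowed = allowed

    def __getattr__(self, name):
        # Called only for attributes not set in __init__, i.e. DB calls.
        if name not in self._allowed:
            raise RuntimeError(
                'scheduler may not call db.%s directly' % name)
        return getattr(self._db, name)
```

Flipping on a guard like this inside the Nova tree would surface every stray DB call before the code ever moves to a separate repo.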
15:36:47 <bauzas> ok, let's move on and discuss about the implementation on the wiki page
15:36:59 <bauzas> any other things to mention on that topic ?
15:37:22 <johnthetubaguy> I think we can do this in the code review now, but lets see how it goes
15:37:31 <bauzas> ok
15:37:43 <bauzas> #topic no-db-scheduler
15:38:21 <bauzas> I have nothing to say here, but maybe there is a discussion on -dev about a scalable scheduler that could make use of it
15:38:54 <johnthetubaguy> oh, have you got a link?
15:39:09 <bauzas> #link http://lists.openstack.org/pipermail/openstack-dev/2014-March/030084.html
15:39:43 <johnthetubaguy> oh right
15:39:47 <bauzas> johnthetubaguy: the problem is about instance groups and race condition happening
15:40:06 <johnthetubaguy> yep, I responded to that saying I agree with Russells fix, not read the response to that
15:40:09 <johnthetubaguy> (yet)
15:40:14 <bauzas> so I would be interested in knowing what's the status of no-db-scheduler bp
15:40:32 <johnthetubaguy> it got deferred for being too risky in Juno
15:40:43 <bauzas> yey, of course
15:40:49 <johnthetubaguy> if its not optional I will −2 it because it requires a new dependency
15:41:00 <bauzas> johnthetubaguy: we just need to make sure if it can handle this issue
15:41:16 <johnthetubaguy> I like the general ideas as an option
15:41:32 <bauzas> agreed
15:41:46 <bauzas> there is one stackforge project about distributed locking, called tooz
15:42:09 <bauzas> it would be nice to see if there are mutual concerns for this
15:42:21 <johnthetubaguy> hmm, I don't mind that being later
15:42:21 <bauzas> and I personally do think so
15:42:35 <johnthetubaguy> don't want too many dependencies, but also want to share code eventually
15:43:09 <mspreitz> bauzas: what do you mean by mutual concerns for tooz?
15:43:34 <bauzas> mspreitz: I mean that distributed locking mechanism is one of the goals for memcached scheduler states
15:44:23 <mspreitz> bauzas: you mean no-db-scheduler should use tooz?
15:44:35 <bauzas> that could be one option yes
15:45:22 <bauzas> and tooz could also add a backend memcached plugin
15:45:33 <bauzas> in order to minimize the dependencies
15:45:42 <johnthetubaguy> yeah, thats something for after the current code merges
15:45:52 <bauzas> of course :-)
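The race condition from the mailing-list thread, and the memcached-backed host states discussed for no-db-scheduler, come down to a compare-and-swap claim loop like the sketch below. InMemoryStore is an illustrative stand-in for a memcached client's gets/cas pair; nothing here is code from the blueprint:

```python
# Hedged sketch: claiming resources from shared host state without a DB,
# using optimistic versioned writes (the pattern memcached's cas gives you).

class InMemoryStore(object):
    """Minimal stand-in for a memcached client with gets/cas."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def gets(self, key):
        return self._data.get(key, (0, None))

    def cas(self, key, value, version):
        current_version, _ = self._data.get(key, (0, None))
        if current_version != version:
            return False  # another scheduler updated it first: retry
        self._data[key] = (version + 1, value)
        return True

def claim(store, host, vcpus_wanted, retries=3):
    """Atomically deduct vcpus from a host's state, retrying on races."""
    for _ in range(retries):
        version, state = store.gets(host)
        state = dict(state or {'free_vcpus': 0})
        if state['free_vcpus'] < vcpus_wanted:
            return False  # not enough capacity, give up
        state['free_vcpus'] -= vcpus_wanted
        if store.cas(host, state, version):
            return True  # our write won; the claim is recorded
    return False
```

Two schedulers racing on the same host would both read version N, but only one cas succeeds; the loser re-reads and sees the reduced capacity, which is the behaviour the instance-group thread was after.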
15:45:56 <johnthetubaguy> good good
15:46:09 <PaulMurray> bauzas I have a question
15:46:14 <bauzas> ok, that's it for me on that topic
15:46:17 <bauzas> PaulMurray: sure
15:46:29 <PaulMurray> don't want to open it up too much
15:46:42 <PaulMurray> but it seems that this discussion is going towards a general
15:46:52 <PaulMurray> configuration service
15:47:00 <PaulMurray> you know, like zoo keeper
15:47:04 <PaulMurray> I can see why
15:47:13 <PaulMurray> do you see the scheduler
15:47:14 <bauzas> tooz is based on zk :-)
15:47:27 <PaulMurray> taking that role or do you think something else should
15:47:28 <bauzas> that's the current default driver
15:47:33 <PaulMurray> and scheduler should use that service
15:47:46 <bauzas> PaulMurray: that's too early for answering it :)
15:48:08 <PaulMurray> but thanks anyway - didn't know about tooz
15:48:09 <bauzas> for Juno, we should just make sure the paths won't diverge too much
15:48:31 <bauzas> but that could be an opportunity for a later cycle
15:48:36 <bauzas> IMHO
15:48:36 <PaulMurray> BTW I hate zookeeper
15:48:47 <bauzas> hence the plugin mechanism :d
15:48:49 <bauzas> write your own ;)
15:48:53 <PaulMurray> :)
15:49:08 <johnthetubaguy> yeah, that an interesting one
15:49:30 <bauzas> I agree that now we should focus on memcached scheduler
15:49:51 <bauzas> but just make sure that the interfaces are flexible enough for accepting a new backend
15:50:02 <johnthetubaguy> erm, well I think we need to focus on the split, which gives the memcached scheduler a good plugging in point
15:50:05 <bauzas> the few I read about the reviews make me think it's the case
15:50:26 <bauzas> johnthetubaguy: yey, both efforts are separated
15:50:45 <bauzas> johnthetubaguy: but both can profit
15:50:49 <bauzas> anyway
15:50:58 <bauzas> let's discuss this at the summit
15:51:07 <bauzas> and that leads me to the next topic
15:51:07 <johnthetubaguy> bauzas: not so sure anymore, I have a feeling the no-db-scheduler needs the split to plugin in an optional way
15:51:15 <johnthetubaguy> sure, do continue...
15:51:24 <bauzas> #topic open discussion
15:51:39 <bauzas> a quick FYI : http://summit.openstack.org/cfp/details/80
15:52:15 <bauzas> at the moment, I don't have a need for another discussion
15:52:41 <johnthetubaguy> do you want to cover no-db-scheduler?
15:52:55 <bauzas> I'm not the owner
15:53:04 <bauzas> I will send an email to boris-42
15:53:11 <johnthetubaguy> sounds good
15:53:18 <bauzas> and see if he plans to promote it
15:53:38 <bauzas> that's it for me
15:53:43 <bauzas> we have 5 mins left
15:53:52 <bauzas> any other subject to discuss ?
15:55:02 <PaulMurray> what's the abbreviation for "I hear the tumbleweeds blowing in the wind"?
15:55:27 <johnthetubaguy> d.o.n.e ?
15:55:38 <bauzas> no shout, no doubt :)
15:55:42 <bauzas> #endmeeting