13:01:19 <Qiming> #startmeeting senlin
13:01:20 <openstack> Meeting started Tue Oct  6 13:01:19 2015 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:23 <openstack> The meeting name has been set to 'senlin'
13:01:38 <Qiming> hello
13:01:45 <yanyanhu_> hi
13:01:45 <haiwei> hi
13:01:45 <elynn> Hi
13:01:51 <jruano> hello
13:02:02 <Qiming> wow, you are all here, ;)
13:02:12 <yanyanhu_> yep :)
13:02:16 <lixinhui> :)
13:02:20 <yanyanhu_> just came back from hometown
13:02:22 <Qiming> pls check agenda and see if you have things to add
13:02:28 <Qiming> https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:02:39 <haiwei> nice holiday
13:03:01 <Qiming> I was expecting that I would be alone here, since it is a holiday for most of us
13:03:32 <yanyanhu_> ;)
13:03:37 <Qiming> #topic liberty work items
13:04:04 <Qiming> the etherpad page loads pretty slowly
13:04:49 <Qiming> we still have some items left there, have to postpone to next cycle
13:05:12 <Qiming> for the container work, will get an update from SUR team tonight -- 1 hour later
13:05:31 <yanyanhu_> about the unit test in senlinclient, have we done it?
13:05:37 <jruano> yes, i sent out a request qi ming. haven't heard anything back
13:05:44 <Qiming> yanyanhu_, that is something we need to postpone
13:05:50 <yanyanhu_> ok
13:06:11 <haiwei> sorry about that, I should do that
13:06:15 <Qiming> jruano, you are in that meeting, I haven't heard anything last week
13:06:16 <jruano> there is some interesting client code, but all the specs for profile and policy are not checked into the repo
13:06:21 <jruano> yep
13:06:44 <Qiming> I have invited Liam to join that discussion
13:07:06 <jruano> ton and i have defined the use cases, and i want to start understanding how/where senlin fits
13:07:20 <Qiming> haiwei and I should spend some time on client test cases anyway
13:07:37 <Qiming> yep, jruano, need to touch base with the eam
13:07:40 <haiwei> yes, Qiming
13:07:44 <Qiming> s/eam/team
13:07:47 <yanyanhu_> I think we can work on it together in the coming cycle.
13:08:07 <yanyanhu_> I will spend some time on it as well
13:08:21 <Qiming> okay, we have just got a +2 on the patch to propose senlinclient into global requirements
13:08:34 <elynn> good news
13:08:38 <yanyanhu_> cool
13:08:38 <jruano> awesome
13:08:39 <haiwei> saw it
13:08:41 <Qiming> hope it will be approved soon, so it won't block senlin-dashboard progress
13:08:48 <yanyanhu_> this is helpful for elynn I think
13:09:09 <Qiming> yanyanhu_, do we have more to add regarding functional tests?
13:09:11 <yanyanhu_> for the senlin support in heat
13:09:20 <elynn> yes
13:09:24 <yanyanhu_> Qiming, I guess not for now
13:09:37 <Qiming> great
13:09:48 <haiwei> great job yanyanhu
13:09:51 <yanyanhu_> oh, maybe still a little more work on node
13:10:10 <Qiming> elynn, we need to sit together for a discussion on how to get heat resource types to work
13:10:11 <yanyanhu_> but I think it won't take much time
13:10:37 <elynn> Qiming: yes, when I get back from holiday.
13:10:58 <Qiming> yanyanhu_, alright, let's get it done then we switch to mitaka work items
13:11:08 <yanyanhu_> ok
13:11:17 <Qiming> #topic placement policy
13:11:40 <Qiming> lixinhui, anything to share from your side?
13:12:03 <lixinhui> I am working on vSphereDRS_policy
13:12:07 <lixinhui> and unit test
13:12:28 <lixinhui> will submit patch around this Thursday
13:12:35 <Qiming> okay, cool
13:12:53 <lixinhui> Need your help to review then :)
13:12:59 <haiwei> in fact I am not familiar with vSphereDRS_policy
13:13:04 <Qiming> Liuwei's patch regarding az placement policy needs a patch
13:13:13 <lixinhui> oh?
13:13:28 <lixinhui> Maybe I can help
13:13:32 <haiwei> what is the relationship between vSphereDRS_policy and placement policy
13:13:40 <Qiming> seems we need a doc for every policy?
13:13:41 <lixinhui> I think he has done that part
13:13:52 <yanyanhu_> Qiming, agree.
13:14:04 <yanyanhu_> and I noticed you have worked on docstring
13:14:11 <haiwei> it will help me at least
13:14:26 <lixinhui> Okay
13:14:33 <lixinhui> reasonable
13:14:44 <Qiming> em. for all builtin policies, we need some docs explaining how it works
13:14:50 <lixinhui> I could add some when the patch done
13:15:01 <Qiming> sounds great
13:15:06 <yanyanhu_> maybe we should always add docstrings when adding new features
13:15:09 <Qiming> lixinhui, Liuwei's patch: https://review.openstack.org/221684
13:15:17 <haiwei> thanks lixinhui
13:15:29 <lixinhui> Okay, Qiming
13:15:38 <lixinhui> I will read it
13:15:53 <Qiming> yanyanhu_, by docs, I am referring to some design level things, not just function level comments
13:16:08 <Qiming> lixinhui, it is in a pretty good shape now
13:16:09 <jruano> yes i think that will help
13:16:42 <yanyanhu_> yes, this is nice since it can help people understand the implementation better
13:17:02 <Qiming> the placement policy patch can be tweaked a little bit to support cross-region placement
13:17:16 <lixinhui> Okay, Qiming
13:17:49 <Qiming> yanyanhu_, can you add a TODO item in the TODO.rst file? don't want this ball dropped, :)
13:18:17 <yanyanhu_> sure
13:18:33 <Qiming> #topic deletion policy for RESIZE operation
13:18:50 <Qiming> haiwei has helped start this thread
13:19:10 <Qiming> so far the implementation is not correct in my opinion
13:19:19 <haiwei> I thought it was not difficult, but it seems it is
13:19:31 <Qiming> yep
13:20:08 <Qiming> there are many easier paths to get this done, but we have to look at the big picture
13:20:09 <haiwei> I think the problem is that resize action can delete more than one node at a time
13:20:35 <haiwei> yes
13:21:22 <Qiming> maybe we can extract the parsing of RESIZE parameters into a utility function
13:21:52 <Qiming> we then call that function directly if no policy is attached to the cluster
13:22:20 <haiwei> that will be done inside the action execution?
13:22:34 <Qiming> if we do have certain policies that want to handle RESIZE action, we invoke this parser as well
13:23:00 <haiwei> sounds like a good idea
13:23:16 <Qiming> in action execution, we check if there are policy outputs and skip the parsing if it seems unnecessary
13:23:25 <haiwei> I think this should be done in the engine/service layer
13:23:47 <Qiming> there will be concurrency problems if you do maths there
13:23:52 <Qiming> the cluster is not locked
13:24:06 <Qiming> any other actions can change the cluster at the same time
13:24:42 <haiwei> but for deletion policy the pre_op in which candidates are chosen is done before resize action execution
13:24:53 <yanyanhu_> haiwei, I think you can refer to the implementation of do_scale_in/out
13:25:09 <yanyanhu_> they have similar problem
13:25:18 <Qiming> agreed, scale_in/scale_out is a good reference
13:25:30 <haiwei> yanyanhu, the problem is do_scale_in/out only deletes one node at a time
13:25:42 <haiwei> but resize action is different
13:26:01 <Qiming> scale_in/out can carry a 'count' parameter
13:26:27 <haiwei> how to initialize 'count' in pre_op is difficult for resize action
13:26:36 <Qiming> in the case of resize, there are more parameters to handle
13:26:43 <yanyanhu_> yes, haiwei, just as Qiming said, those two actions can also accept 'count' input
13:27:30 <yanyanhu_> haiwei, yes, the logic is more complicated since the constraint might be changed at the same time
13:27:34 <Qiming> haiwei, I was suggesting to extract the "count" computation into a utility function
13:27:34 <haiwei> i know that, 'count' defaults to 1 there, but resize action is different, we have to give it a value
13:27:45 <yanyanhu_> but I think you can split the logic out
13:28:10 <haiwei> ok, I think I got you
13:28:17 <Qiming> great
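[Editor's sketch] The "count" computation discussed above could be pulled into a standalone helper along these lines. This is only a minimal sketch under stated assumptions: the function name, the parameter names (adjustment_type, number, min_size, max_size) and the two adjustment types are hypothetical and are not taken from the Senlin code base or from this log.

```python
def compute_resize_delta(current, adjustment_type, number,
                         min_size=0, max_size=-1):
    """Return the signed change in node count for a RESIZE-like request.

    All names here are hypothetical; max_size < 0 is assumed to mean
    "no upper bound".
    """
    if adjustment_type == "EXACT_CAPACITY":
        desired = number
    elif adjustment_type == "CHANGE_IN_CAPACITY":
        desired = current + number
    else:
        raise ValueError("unsupported adjustment type: %s" % adjustment_type)

    # Clamp the desired capacity to the size constraints.
    desired = max(desired, min_size)
    if max_size >= 0:
        desired = min(desired, max_size)

    return desired - current


# A negative result is the number of deletion candidates a deletion
# policy would need to pick; a positive result is the number of nodes
# to create.
delta = compute_resize_delta(5, "EXACT_CAPACITY", 2)
count = -delta if delta < 0 else 0
```

Both the no-policy path and a policy's pre_op could call such a helper, so the maths lives in one place. Where it is safe to call it (given the locking concern raised above) is left open here.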
13:28:34 <Qiming> #topic policy for node create/delete
13:28:56 <Qiming> so far we have been focusing on CLUSTER_XYZ actions when dealing with policies
13:29:09 <Qiming> however, we do have NODE_CREATE/DELETE/JOIN/LEAVE actions
13:29:43 <Qiming> a NODE_CREATE action, with cluster_id provided, needs to be considered by the LB policy, for example
13:29:53 <yanyanhu_> yes
13:30:19 <yanyanhu_> ideally, those node_xxx actions should also be the target of some policies
13:30:39 <Qiming> I have been looking at this during the past days
13:31:36 <Qiming> I'm hoping this won't be a disruptive change to the current policy implementation
13:32:06 <yanyanhu_> umm, we need to think through this...
13:32:33 <Qiming> anyway, I'll keep working on this
13:32:33 <yanyanhu_> will think about it
13:32:50 <Qiming> okay, feel free to ping me for a discussion
13:33:00 <yanyanhu_> sure :)
13:33:08 <Qiming> #topic batch policy
13:33:13 <haiwei> Qiming, you mean you will focus on the LB policy only?
13:33:21 <Qiming> haiwei, no
13:33:39 <Qiming> it is more about how to weave NODE_xxx actions into policy checking
13:33:49 <Qiming> not just for LB policy
13:34:44 <haiwei> ok
13:35:03 <Qiming> placement policy, for example, is another case where NODE_CREATE action should be checked
13:35:05 <haiwei> and also placement policy
13:35:14 <haiwei> yes
13:35:17 <Qiming> :)
13:36:07 <Qiming> #topic batching policy
13:36:48 <Qiming> batching policy is really about throttling
13:37:06 <Qiming> when creating/updating/deleting objects, senlin makes calls to other services
13:37:49 <Qiming> we have to impose some constraints on the number of requests sent to other services during any given period
13:38:15 <Qiming> this is not as easy a job as it seems to be
13:38:33 <Qiming> take CLUSTER_CREATE as an example
13:39:06 <Qiming> we want to control how many NODE_CREATE (thus nova boot requests after translation) we will trigger
13:39:51 <Qiming> however, the cluster is just being created, no policy has a chance to get attached to it yet
13:40:07 <Qiming> it becomes a chicken-and-egg problem
13:41:00 <Qiming> so ... the question becomes: how can we do throttling without a policy
13:41:40 <elynn> Define it in senlin.conf?
13:41:48 <yanyanhu_> Qiming, I think maybe we can split a cluster creation into multiple action sets, and there are dependencies between them?
13:41:51 <haiwei> maybe we only attach batch policy after the cluster is created?
13:42:18 <Qiming> I just proposed a configuration option, max_actions_per_batch
13:42:34 <yanyanhu_> but the existing dependency logic may not be able to support it
13:42:41 <Qiming> it can be overridden later by a batching policy
13:43:52 <Qiming> right, the current batched creation logic has to be revised to support this option
13:44:04 <yanyanhu_> hmm, my idea is not good. It is too far from our policy framework
13:45:04 <Qiming> I am starting to change my mind now
13:45:34 <Qiming> since we are executing all 'actions' asynchronously
13:46:04 <Qiming> (at least we wanted to do things that way)
13:46:27 <Qiming> all actions are first persisted into database
13:46:41 <Qiming> then retrieved for execution
13:47:11 <Qiming> so that throttling problem seems more like a scheduler problem
13:47:17 <yanyanhu_> yes
13:47:19 <yanyanhu_> it is
13:47:28 <Qiming> say if we create a cluster of 1000 nodes
13:47:32 <yanyanhu_> or it can be
13:47:45 <jruano> sounds like it
13:48:10 <Qiming> the 1000 NODE_CREATE actions are supposed to be executed in a controlled way
13:49:09 <Qiming> so ... I am inclined to look at it from a different angle now
13:49:33 <Qiming> the problem is: how do we define the 'batching' policy then?
13:49:45 <haiwei> by the way, this policy is triggered by hand?
13:50:08 <Qiming> haiwei, all policies are supposed to be triggered by certain actions
13:50:32 <jruano> set of rules to a scheduler
13:50:35 <haiwei> so we need a new action for it?
13:50:42 <yanyanhu_> haiwei, I think we just manually provide the rule
13:50:44 <Qiming> for creation/deletion, it sounds more like a scheduler parameter
13:51:05 <elynn> So is it still needed?
13:51:15 <Qiming> for update, it may carry some other QoS related semantics
13:51:42 <yanyanhu_> yes. maybe we can start to consider the refactoring of engine scheduler
13:51:47 <Qiming> during batched update, users may want to keep a certain number of service nodes running at any time
13:52:40 <Qiming> yanyanhu_, yes, that is why I wasn't proposing a lot of patches recently, :)
13:53:00 <Qiming> really blocked by this problem
13:53:17 <Qiming> maybe we need to refactor the engine scheduler first
13:53:22 <yanyanhu_> ok, let's think about it :)
13:53:30 <Qiming> make it a 'real' scheduler
13:53:40 <yanyanhu_> yep
13:53:55 <Qiming> then we do 'scheduler.reschedule()' whenever necessary, just like a tickless Linux kernel
13:54:38 <Qiming> maybe we need to change 'batching' policy to just an 'update' policy
13:54:39 <yanyanhu_> right
13:55:07 <Qiming> will keep thinking of this
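[Editor's sketch] The throttling idea above (a max_actions_per_batch default that a batching policy may override, plus keeping some nodes in service during a batched update) can be illustrated roughly as follows. The helper names and the min_in_service parameter are hypothetical. Treating a non-positive batch size as "no throttling" is also an assumption, not something decided in the meeting.

```python
def effective_batch_size(config_default, policy_value=None):
    """Use the policy value when one is attached, else the config default."""
    return policy_value if policy_value is not None else config_default


def split_into_batches(actions, batch_size):
    """Split pending actions into consecutive batches of at most batch_size."""
    actions = list(actions)
    if batch_size <= 0:
        # Assumption: a non-positive size disables throttling.
        return [actions] if actions else []
    return [actions[i:i + batch_size]
            for i in range(0, len(actions), batch_size)]


def max_update_batch(total_nodes, min_in_service):
    """Most nodes that can be updated at once while min_in_service keep running."""
    return max(total_nodes - min_in_service, 0)


# Creating a 1000-node cluster with a configured limit of 200 actions
# per batch and no batching policy attached yet.
pending = ["NODE_CREATE-%d" % i for i in range(1000)]
batches = split_into_batches(pending, effective_batch_size(200))
```

A scheduler-based design, as suggested in the discussion, would apply such a limit when it picks persisted actions for execution rather than when the actions are created.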
13:55:13 <Qiming> #topic open discussions
13:56:12 <Qiming> regarding big tent proposal, I'm drafting it in the coming days
13:56:15 <haiwei> I will have a small session about senlin in my company's booth during the summit
13:56:27 <Qiming> will send out to everyone for review
13:56:42 <Qiming> thx, haiwei, ping us if any help needed
13:56:43 <yanyanhu_> haiwei, cool :)
13:56:51 <jruano> nice
13:57:07 <lixinhui> great!
13:57:15 <Qiming> about the meetup during summit, the room allocation is pretty tight
13:57:21 <yanyanhu_> will visit your booth ;p
13:57:22 <haiwei> I will prepare the presentation, and would like your advice
13:57:23 <Qiming> http://lists.openstack.org/pipermail/openstack-dev/2015-October/076054.html
13:57:33 <Qiming> sure
13:57:59 <Qiming> need to find out how to get everyone together for a f2f discussion
13:58:40 <Qiming> anything else
13:58:44 <Qiming> ?
13:58:49 <haiwei> no
13:58:51 <Qiming> hoho, 1 min left
13:58:53 <yanyanhu_> nope
13:58:55 <jruano> nope
13:58:57 <elynn> nope
13:59:04 <Qiming> thanks for joining, during your vacation
13:59:14 <Qiming> talk to you later
13:59:17 <Qiming> #endmeeting