13:02:55 <Qiming> #startmeeting senlin
13:02:56 <openstack> Meeting started Tue Sep 13 13:02:55 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:02:58 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:00 <openstack> The meeting name has been set to 'senlin'
13:03:05 <Qiming> hi, sorry for being late
13:03:15 <lixinhui_> hi
13:03:19 <elynn> Hi
13:03:24 <guoshan> hi, everyone
13:03:32 <Qiming> hi, everyone
13:04:01 <Qiming> please let me know if you have topics to discuss
13:04:11 <ruijie_> Evening, everyone
13:04:22 <Qiming> welcome, guoshan and ruijie_
13:04:57 <Qiming> meeting agenda here
13:04:59 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:05:18 <Qiming> let's start with the newton work items etherpad
13:05:28 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:05:29 <ruijie_> Will we talk about the desired_capacity today?
13:05:39 <Qiming> yes, we can
13:06:20 <Qiming> added to meeting agenda
13:06:44 <Qiming> I'm not aware of any progress in performance test during last week
13:06:50 <Qiming> yanyan is on vacation
13:07:21 <Qiming> he has been pushing the rally side on this
13:07:56 <Qiming> health management, no new patches related to this topic either
13:08:21 <Qiming> lixinhui_, the lb bug closed or not?
13:08:35 <lixinhui_> closed for Octavia
13:08:45 <lixinhui_> I will change the status for that bug
13:08:53 <lixinhui_> but for others not
13:08:57 <lixinhui_> depends on driver
13:09:10 <elynn> So if we use haproxy then we will encounter this bug?
13:09:12 <Qiming> okay, so we still cannot get node status correct?
13:09:15 <lixinhui_> and neutron team is pushing change towards Octavia from lbaas
13:09:33 <lixinhui_> Nowadays
13:09:47 <Qiming> okay, fine, cannot rely on non-stable features there
13:10:07 <Qiming> we may have to postpone this feature to Ocata cycle then
13:10:08 <lixinhui_> once the node status changes, octavia will send an RPC call to lbaas for it to change the da status
13:10:38 <lixinhui_> but the notification is not implemented
13:10:54 <Qiming> it is a very useful feature, really hope that can be landed and get stabilized soon
13:10:58 <lixinhui_> just PPC call
13:11:05 <lixinhui_> RPC call
13:11:17 <Qiming> but can we get pollers to work?
13:11:51 <lixinhui_> oh, I see. will try this in vacation
13:11:58 <lixinhui_> mid-autumn
13:12:03 <Qiming> many thanks
13:12:08 <lixinhui_> np
13:12:31 <Qiming> documentation side, we have merged quite some doc fixes recently, about syntax and grammar
13:12:40 <Qiming> other than that, no new docs added
13:12:58 <Qiming> one thing I just realized is about testing
13:13:09 <Qiming> we do have testing section in the developer's guide
13:13:29 <Qiming> but we failed to let people know we have a cloud_backend = openstack_test option
13:13:34 <Qiming> that should be added
13:14:10 <Qiming> added this item to etherpad for tracking
13:14:21 <Qiming> container profile
13:14:40 <Qiming> haiwei has committed some patches about node/cluster dependencies
13:14:57 <Qiming> our discussion concluded that these dependencies can be generalized
13:15:09 <Qiming> so there have been db level and engine level patches
13:15:26 <Qiming> there are still a few patches about this to be reviewed
13:15:50 <Qiming> pls spend some time on this when you are not so busy
13:16:00 <Qiming> zaqar receiver
13:16:15 <Qiming> the whole invocation flow is working
13:16:27 <Qiming> the most tricky part is about trust building
13:16:44 <Qiming> an end user trusts 'senlin' account to perform cluster operations
13:17:12 <Qiming> and he/she would trust 'zaqar' account to trigger such operations by sending in some messages
13:17:46 <Qiming> the trust between the requesting user and the 'senlin' account is much easier, thus solved a long time ago
13:18:11 <Qiming> the other one means we have to know zaqar user id to build such a trust
13:18:29 <Qiming> yanyan has worked out a solution there, so no worries
13:18:59 <Qiming> the current vision is that zaqar will enable more flexible action triggering if used properly
13:19:17 <Qiming> next topic is event/notification
13:19:22 <Qiming> no progress I'm aware of
13:19:41 <Qiming> I myself have been tweaking nova server profile update recently
13:19:53 <Qiming> building a long chain of patches
13:20:11 <Qiming> the goal is to make profile update more reliable and maintainable
13:20:40 <Qiming> also applied some optimizations related to name update or password update etc.
13:20:58 <Qiming> that's all about newton work items etherpad
13:21:08 <Qiming> questions/comments?
13:21:36 <elynn> I'm a little concerned about the zaqar part
13:21:43 <Qiming> yes
13:21:54 <elynn> since not all users will enable zaqar in their openstack
13:22:07 <Qiming> right
13:22:29 <elynn> Is it better to provide an option to enable/disable it in senlin?
13:22:43 <Qiming> good question
13:23:01 <Qiming> but I hate options, which may get deprecated later
13:23:24 <Qiming> how about we do a check at api layer
13:23:40 <elynn> That sounds great
13:24:20 <Qiming> if you are creating a message type of receiver, and we know zaqar is not installed, we throw some exception?
13:24:59 <Qiming> we cannot do this by simply checking if zaqar is installed on the local machine
13:25:02 <elynn> service unavailable or bad request?
13:25:15 <Qiming> we should check keystone service catalog
13:25:32 <Qiming> should be a bad request or something 4xx
13:25:33 <elynn> Yes
13:25:40 <Qiming> it is definitely not a 5xx error
13:26:38 <Qiming> added an item
13:26:44 <Qiming> won't be a huge task
13:26:46 <elynn> en, bad request is better
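The API-layer check discussed above could be sketched roughly as follows. This is only an illustration, not Senlin's actual code: `BadRequest`, `service_available` and `validate_receiver_type` are hypothetical names, the catalog is assumed to be a list of dicts shaped like the `catalog` section of a Keystone v3 token, and `messaging` is assumed to be the service type under which Zaqar is registered.

```python
class BadRequest(Exception):
    """Hypothetical 4xx-style error; not Senlin's real exception class."""
    code = 400


def service_available(catalog, service_type):
    """Return True if the catalog lists this service type with endpoints."""
    return any(
        entry.get("type") == service_type and bool(entry.get("endpoints"))
        for entry in catalog
    )


def validate_receiver_type(receiver_type, catalog):
    """Reject 'message' receivers when no messaging service is registered.

    The check looks at the service catalog rather than at locally
    installed packages, as suggested in the discussion.
    """
    if receiver_type == "message" and not service_available(catalog, "messaging"):
        raise BadRequest(
            "Receiver type 'message' requires a messaging service "
            "in the service catalog."
        )
    return receiver_type
```

With a catalog that has no `messaging` entry, a `webhook` receiver would pass validation while a `message` receiver would be rejected with a 400-style error.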
13:27:15 <Qiming> anything else?
13:27:48 <elynn> nope from me.
13:28:03 <ruijie_> For the desired_capacity...
13:28:13 <ruijie_> If we do not specify the max and min size
13:28:22 <Qiming> ruijie_, we leave that to the next topic
13:28:33 <ruijie_> Sorry about that.
13:28:38 <Qiming> #topic planning for RC release
13:28:54 <Qiming> any high priority bugs seen recently?
13:29:36 <Qiming> https://bugs.launchpad.net/senlin/+bug/1619842
13:29:37 <openstack> Launchpad bug 1619842 in senlin "after cluster-check, the status of cluster is warning" [Critical,Triaged] - Assigned to miaohb (miao-hongbao)
13:29:45 <Qiming> this one should be already closed ...
13:31:16 <Qiming> anyone looking at this #1546960
13:31:20 <Qiming> bug #1546960
13:31:20 <openstack> bug 1546960 in senlin "node-create's index will be -1 if create more than 1 node,then cluster-node-add will fail as well" [Undecided,New] https://launchpad.net/bugs/1546960
13:31:58 <Qiming> lixinhui_, is the 'idontknow' you?
13:32:32 <lixinhui_> No, Qiming ...
13:32:40 <Qiming> en, youdontknow
13:32:44 <ruijie_> I did not reproduce the situation the bug described
13:33:00 <Qiming> thanks ruijie_, for confirmation
13:33:13 <Qiming> marking as incomplete for now
13:33:35 <Qiming> bug #1609244
13:33:36 <openstack> bug 1609244 in senlin "Getting image authentication failed when use fernet in keystone" [Undecided,New] https://launchpad.net/bugs/1609244
13:33:53 <Qiming> this one seems fixed, it was a keystone configuration problem
13:34:56 <Qiming> I'm gonna cut RC1 this week, probably on Thursday
13:35:23 <Qiming> if you have got any patches you want a review please speak up on #senlin channel
13:35:38 <Qiming> #topic dealing with desired_capacity
13:35:54 <Qiming> ruijie_, room is yours
13:37:08 <Qiming> ruijie_, still awake?
13:37:11 <ruijie_> yea, the eval_status() will check the desired_capacity
13:37:31 <Qiming> yes
13:37:34 <ruijie_> if we do not specify the max and min size of the cluster
13:37:49 <Qiming> you get min_size=0, max_size = -1
13:38:08 <ruijie_> this method will change the reason to ' number of active nodes is above desired_capacity'
13:38:10 <ruijie_> is that ok?
13:38:37 <Qiming> yes, it means the cluster is operational, just may be wasting some additional resources
13:39:53 <Qiming> having the number of active nodes equal to the desired_capacity would be great
13:40:08 <ruijie_> That's true
13:40:29 <ruijie_> The number of active nodes is equal to desired_capacity
13:40:37 <ruijie_> but the max_size is still -1
13:41:03 <Qiming> but if we only set cluster status to 'ACTIVE' when that number is exactly desired_capacity, we may be a little too restrictive
13:41:19 <Qiming> max_size means no upper limit
13:42:07 <Qiming> http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/cluster.py#n551
13:42:11 <ruijie_> I understand that, but I think the reason should be more friendly
13:43:44 <Qiming> that whole condition is about desired_capacity <= current capacity <= max_size
13:44:27 <Qiming> or if max_size is set to -1, there is no limit
13:45:11 <ruijie_> Ok. I get it
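The condition described above (desired_capacity <= current capacity <= max_size, with max_size = -1 meaning no upper limit) might look roughly like the sketch below. It is a simplified illustration and not the code in senlin/engine/cluster.py: the function name and the reason strings other than the one quoted in the meeting are assumptions, and so is the WARNING result when the node count exceeds max_size.

```python
def eval_status_sketch(active_nodes, desired_capacity, max_size=-1):
    """Return a (status, reason) pair for a cluster.

    Simplified: min_size handling is deliberately left out.
    """
    if active_nodes < desired_capacity:
        # e.g. 3 active nodes with desired_capacity 5
        return ("WARNING",
                "number of active nodes is below desired_capacity")
    if max_size == -1 or active_nodes <= max_size:
        # Operational, though possibly using more resources than desired.
        if active_nodes == desired_capacity:
            return ("ACTIVE",
                    "number of active nodes is equal to desired_capacity")
        return ("ACTIVE",
                "number of active nodes is above desired_capacity")
    # Assumption: exceeding an explicit max_size is flagged as a warning.
    return ("WARNING", "number of active nodes is above max_size")
```

Splitting the "equal" and "above" cases into separate reason strings is one possible way to make the message friendlier, along the lines ruijie_ asked for.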
13:45:38 <Qiming> feel free to propose a better description when you get one
13:45:41 <Qiming> :)
13:45:55 <Qiming> when speaking of desired_capacity
13:46:06 <ruijie_> Sure :)
13:46:13 <Qiming> I'm gonna work on that during the coming week
13:46:25 <Qiming> the change would be about all cluster operations
13:46:47 <Qiming> especially those that change the size of a cluster
13:47:16 <Qiming> the goal is to make sure all operations are based on currently observed number of nodes, not desired capacity
13:47:24 <ruijie_> Yes, that will be good for the Health Manager
13:47:35 <Qiming> say if you have a cluster of 3 nodes, and your desired_capacity is 5
13:47:47 <Qiming> fine, cluster is in WARNING status
13:47:55 <Qiming> when I do scale out
13:48:17 <Qiming> I'll use the observed 3 nodes as the basis
13:48:48 <Qiming> we are not basing the resize operation on the desired_capacity
13:49:03 <Qiming> 'desired_capacity' of a cluster is always a "desired", not the reality
13:49:25 <ruijie_> and recalculate the desired_capacity or not?
13:49:34 <Qiming> if not for senlinclient being frozen, I'm even thinking of adding an 'active_nodes' property to a cluster
13:49:50 <Qiming> ruijie_, will recalculate
13:50:00 <Qiming> using the above example
13:50:15 <Qiming> you have 3 nodes, your previous desire was 5 (not realized)
13:50:28 <Qiming> now you say 'cluster-scale-out -c 3'
13:50:48 <Qiming> that means you will want the cluster to have 6 nodes, your new desire
13:50:59 <Qiming> senlin will try its best to achieve that
13:51:26 <Qiming> but ... in real world, you may still have 3 nodes, or maybe just 4, because you have run out of quota or resources ...
13:51:48 <Qiming> so, yes, desired_capacity will always describe your new desire
13:52:06 <Qiming> it will be changed
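The scale-out arithmetic in this example can be sketched as below. This is a minimal illustration of the planned behavior, assuming the new desired_capacity is the observed node count plus the requested count; `scale_out_target` is a hypothetical helper, and rejecting a target above max_size is an assumption not stated in the meeting.

```python
def scale_out_target(observed_nodes, count, max_size=-1):
    """Compute the new desired_capacity for a scale-out request.

    The basis is the number of nodes actually observed, not the
    previous (possibly unrealized) desired_capacity.
    """
    if count <= 0:
        raise ValueError("count must be a positive integer")
    target = observed_nodes + count
    # Assumption: a target above an explicit max_size is rejected.
    if max_size != -1 and target > max_size:
        raise ValueError("target size exceeds max_size")
    return target


# Cluster from the discussion: 3 nodes observed, previous desire was 5.
new_desired = scale_out_target(observed_nodes=3, count=3)
```

Here `new_desired` is 6, and it becomes the recorded desired_capacity whether or not all three new nodes can actually be created.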
13:52:16 <ruijie_> how does senlin treat the ERROR nodes when we do a 'SCALE_OUT' action
13:52:24 <Qiming> no matter whether the operation/action succeeded or not
13:52:32 <Qiming> we leave them there
13:52:34 <ruijie_> I know senlin will delete ERROR nodes first when we do a 'SCALE_IN'
13:53:12 <Qiming> that's a good question, I haven't thought it thru yet
13:53:42 <Qiming> maybe add a cluster-reset operation, delete all inactive nodes?
13:53:44 <Qiming> not sure
13:54:07 <Qiming> cluster-delete-all-inactive-or-error-or-warning-nodes?
13:54:12 <ruijie_> Maybe just let the user decide how to treat them
13:54:48 <Qiming> yup, will keep working on that when I finish the desired_capacity one
13:54:53 <Qiming> #topic open discussion
13:56:09 <Qiming> anything for a quick discussion?
13:56:32 <ruijie_> nope from me
13:57:57 <Qiming> okay, thanks everyone
13:58:07 <Qiming> best wishes to you and your family, ...
13:58:18 <lixinhui_> u2
13:58:21 <Qiming> it's mid-autumn season
13:58:29 <Qiming> thanks for joining, see you
13:58:36 <lixinhui_> cu
13:58:45 <guoshan> good night, and have fun in festival days
13:58:47 <Qiming> #endmeeting