13:02:55 <Qiming> #startmeeting senlin
13:02:56 <openstack> Meeting started Tue Sep 13 13:02:55 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:02:58 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:00 <openstack> The meeting name has been set to 'senlin'
13:03:05 <Qiming> hi, sorry for being late
13:03:15 <lixinhui_> hi
13:03:19 <elynn> Hi
13:03:24 <guoshan> hi, everyone
13:03:32 <Qiming> hi, everyone
13:04:01 <Qiming> please let me know if you have topics to discuss
13:04:11 <ruijie_> Evening, everyone
13:04:22 <Qiming> welcome, guoshan and ruijie_
13:04:57 <Qiming> meeting agenda here
13:04:59 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:05:18 <Qiming> let's start with the newton work items etherpad
13:05:28 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:05:29 <ruijie_> Will we talk about the desired_capacity today?
13:05:39 <Qiming> yes, we can
13:06:20 <Qiming> added to meeting agenda
13:06:44 <Qiming> I'm not aware of any progress in performance test during last week
13:06:50 <Qiming> yanyan is on vacation
13:07:21 <Qiming> he has been pushing the rally side on this
13:07:56 <Qiming> health management, no new patches related to this topic either
13:08:21 <Qiming> lixinhui_, the lb bug closed or not?
13:08:35 <lixinhui_> closed for Octavia
13:08:45 <lixinhui_> I will change the status for that bug
13:08:53 <lixinhui_> but for others not
13:08:57 <lixinhui_> depends on driver
13:09:10 <elynn> So if we use haproxy then we will encounter this bug?
13:09:12 <Qiming> okay, so we still cannot get node status correct?
13:09:15 <lixinhui_> and neutron team is pushing change towards Octavia from lbaas
13:09:33 <lixinhui_> Nowadays
13:09:47 <Qiming> okay, fine, cannot rely on non-stable features there
13:10:07 <Qiming> we may have to postpone this feature to Ocata cycle then
13:10:08 <lixinhui_> once the node status change, octavia will send RPC call to lbaas for it to change the da status
13:10:38 <lixinhui_> but not implement the notify
13:10:54 <Qiming> it is a very useful feature, really hope that can be landed and get stabilized soon
13:10:58 <lixinhui_> just PPC call
13:11:05 <lixinhui_> RPC call
13:11:17 <Qiming> but can we get pollers to work?
13:11:51 <lixinhui_> oh, I see. will try this in vacation
13:11:58 <lixinhui_> mid-autumn
13:12:03 <Qiming> many thanks
13:12:08 <lixinhui_> np
13:12:31 <Qiming> documentation side, we have merged quite some doc fixes recently, about syntax and grammar
13:12:40 <Qiming> other than that, no new docs added
13:12:58 <Qiming> one thing I just realized is about testing
13:13:09 <Qiming> we do have testing section in the developer's guide
13:13:29 <Qiming> but we failed to let people know we have a cloud_backend = openstack_test option
13:13:34 <Qiming> that should be added
13:14:10 <Qiming> added this item to etherpad for tracking
13:14:21 <Qiming> container profile
13:14:40 <Qiming> haiwei has committed some patches about node/cluster dependencies
13:14:57 <Qiming> our discussion concluded that these dependencies can be generalized
13:15:09 <Qiming> so there have been db level and engine level patches
13:15:26 <Qiming> there are still a few patches about this to be reviewed
13:15:50 <Qiming> pls spend some time on this when you are not so busy
13:16:00 <Qiming> zaqar receiver
13:16:15 <Qiming> the whole invocation flow is working
13:16:27 <Qiming> the most tricky part is about trust building
13:16:44 <Qiming> an end user trusts 'senlin' account to perform cluster operations
13:17:12 <Qiming> and he/she would trust 'zaqar' account to trigger such operations by sending in some messages
13:17:46 <Qiming> the trust between the requesting user and the 'senlin' account is much easier, thus solved a long time ago
13:18:11 <Qiming> the other one means we have to know zaqar user id to build such a trust
13:18:29 <Qiming> yanyan has worked out a solution there, so no worries
13:18:59 <Qiming> the current vision is that zaqar will enable more flexible action triggering if used properly
13:19:17 <Qiming> next topic is event/notification
13:19:22 <Qiming> no progress I'm aware of
13:19:41 <Qiming> I myself have been tweaking nova server profile update recently
13:19:53 <Qiming> building a long chain of patches
13:20:11 <Qiming> the goal is to make profile update more reliable and maintainable
13:20:40 <Qiming> also applied some optimizations related to name update or password update etc.
13:20:58 <Qiming> that's all about newton work items etherpad
13:21:08 <Qiming> questions/comments?
13:21:36 <elynn> I'm a little concerned about zaqar part
13:21:43 <Qiming> yes
13:21:54 <elynn> since not all users will enable zaqar in their openstack
13:22:07 <Qiming> right
13:22:29 <elynn> Is it better to provide an option to enable/disable it in senlin?
13:22:43 <Qiming> good question
13:23:01 <Qiming> but I hate options, which may get deprecated later
13:23:24 <Qiming> how about we do a check at api layer
13:23:40 <elynn> That sounds great
13:24:20 <Qiming> if you are creating a message type of receiver, and we know zaqar is not installed, we throw some exception?
13:24:59 <Qiming> we cannot do this by simply checking if zaqar is installed on the local machine
13:25:02 <elynn> service unavailable or bad request?
13:25:15 <Qiming> we should check keystone service catalog
13:25:32 <Qiming> should be a bad request or something 4xx
13:25:33 <elynn> Yes
13:25:40 <Qiming> it is definitely not a 5xx error
13:26:38 <Qiming> added an item
13:26:44 <Qiming> won't be a huge task
13:26:46 <elynn> en, bad request is better
13:27:15 <Qiming> anything else?
13:27:48 <elynn> nope from me.
13:28:03 <ruijie_> For the desired_capacity...
13:28:13 <ruijie_> If we do not specify the max and min size
13:28:22 <Qiming> ruijie_, we leave that to the next topic
13:28:33 <ruijie_> Sorry about that.
13:28:38 <Qiming> #topic planning for RC release
13:28:54 <Qiming> any high priority bugs seen recently?
13:29:36 <Qiming> https://bugs.launchpad.net/senlin/+bug/1619842
13:29:37 <openstack> Launchpad bug 1619842 in senlin "after cluster-check, the status of cluster is warning" [Critical,Triaged] - Assigned to miaohb (miao-hongbao)
13:29:45 <Qiming> this one should be already closed ...
13:31:16 <Qiming> anyone looking at this #1546960
13:31:20 <Qiming> bug #1546960
13:31:20 <openstack> bug 1546960 in senlin "node-create's index will be -1 if create more than 1 node,then cluster-node-add will fail as well" [Undecided,New] https://launchpad.net/bugs/1546960
13:31:58 <Qiming> lixinhui_, is the 'idontknow' you?
13:32:32 <lixinhui_> No, Qiming ...
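The API-layer check agreed on earlier in this meeting (reject a message-type receiver with a 4xx error when Zaqar is not registered in the keystone service catalog) could look roughly like the sketch below. It is only an illustration under stated assumptions: the catalog is taken to be a list of dicts with "type" and "endpoints" keys, as in a keystone v3 token, Zaqar is assumed to be registered under the service type "messaging", and BadRequest, has_service and validate_receiver_type are hypothetical names, not Senlin code.

```python
class BadRequest(Exception):
    """Hypothetical 4xx-style error; not Senlin's real exception class."""
    status_code = 400


def has_service(catalog, service_type):
    """Return True if the catalog lists the service type with an endpoint.

    Assumes `catalog` is a list of dicts shaped like the entries in a
    keystone v3 token catalog: {"type": ..., "endpoints": [...]}.
    """
    return any(
        entry.get("type") == service_type and bool(entry.get("endpoints"))
        for entry in catalog
    )


def validate_receiver_type(receiver_type, catalog):
    """Reject 'message' receivers when no messaging service is registered.

    The check looks at the service catalog rather than at what is
    installed on the local machine, as discussed in the meeting.
    """
    if receiver_type == "message" and not has_service(catalog, "messaging"):
        raise BadRequest(
            "receiver type 'message' needs the messaging service, "
            "which is not in the service catalog"
        )
```

A caller at the API layer would run validate_receiver_type before creating the receiver and map BadRequest to an HTTP 400 response, so the failure is reported as a client error rather than a 5xx.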
13:32:40 <Qiming> en, youdontknow
13:32:44 <ruijie_> I did not reproduce the situation the bug described
13:33:00 <Qiming> thanks ruijie_, for confirmation
13:33:13 <Qiming> marking as incomplete for now
13:33:35 <Qiming> bug #1609244
13:33:36 <openstack> bug 1609244 in senlin "Getting image authentication failed when use fernet in keystone" [Undecided,New] https://launchpad.net/bugs/1609244
13:33:53 <Qiming> this one seems fixed, it was a keystone configuration problem
13:34:56 <Qiming> I'm gonna cut RC1 this week, probably on Thursday
13:35:23 <Qiming> if you have got any patches you want a review please speak up on #senlin channel
13:35:38 <Qiming> #topic dealing with desired_capacity
13:35:54 <Qiming> ruijie_, room is yours
13:37:08 <Qiming> ruijie_, still awake?
13:37:11 <ruijie_> yea, the eval_status() will check the desired_capacity
13:37:31 <Qiming> yes
13:37:34 <ruijie_> if we do not specify the max and min size of the cluster
13:37:49 <Qiming> you get min_size=0, max_size = -1
13:38:08 <ruijie_> this method will change the reason to ' number of active nodes is above desired_capacity'
13:38:10 <ruijie_> is that ok?
13:38:37 <Qiming> yes, it means the cluster is operational, just may be wasting some additional resources
13:39:53 <Qiming> having the number of active nodes equal to the desired_capacity would be great
13:40:08 <ruijie_> That's true
13:40:29 <ruijie_> The number of active nodes is equal to desired_capacity
13:40:37 <ruijie_> but the max_size is still -1
13:41:03 <Qiming> but if we only set cluster status to 'ACTIVE' when that number is exactly desired_capacity, we may be a little too restrictive
13:41:19 <Qiming> max_size of -1 means no upper limit
13:42:07 <Qiming> http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/cluster.py#n551
13:42:11 <ruijie_> I understand that, but I think the reason should be more friendly
13:43:44 <Qiming> that whole condition is about desired_capacity <= current capacity <= max_size
13:44:27 <Qiming> or if max_size is set to -1, there is no limit
13:45:11 <ruijie_> Ok. I get it
13:45:38 <Qiming> feel free to propose a better description when you get one
13:45:41 <Qiming> :)
13:45:55 <Qiming> when speaking of desired_capacity
13:46:06 <ruijie_> Sure :)
13:46:13 <Qiming> I'm gonna work on that during the coming week
13:46:25 <Qiming> the change would be about all cluster operations
13:46:47 <Qiming> especially those that change the size of a cluster
13:47:16 <Qiming> the goal is to make sure all operations are based on currently observed number of nodes, not desired capacity
13:47:24 <ruijie_> Yes, that will be good for the Health Manager
13:47:35 <Qiming> say if you have a cluster of 3 nodes, and your desired_capacity is 5
13:47:47 <Qiming> fine, cluster is in WARNING status
13:47:55 <Qiming> when I do scale out
13:48:17 <Qiming> I'll use the observed 3 nodes as the basis
13:48:48 <Qiming> we are not basing the resize operation on the desired_capacity
13:49:03 <Qiming> 'desired_capacity' of a cluster is always a "desired", not the reality
13:49:25 <ruijie_> and recalculate the desired_capacity or not?
13:49:34 <Qiming> if not for senlinclient being frozen, I'm even thinking of adding an 'active_nodes' property to a cluster
13:49:50 <Qiming> ruijie_, will recalculate
13:50:00 <Qiming> using the above example
13:50:15 <Qiming> you have 3 nodes, your previous desire was 5 (not realized)
13:50:28 <Qiming> now you say 'cluster-scale-out -c 3'
13:50:48 <Qiming> that means you will want the cluster to have 6 nodes, your new desire
13:50:59 <Qiming> senlin will try its best to achieve that
13:51:26 <Qiming> but ... in real world, you may still have 3 nodes, or maybe just 4, because you have run out of quota or resources ...
13:51:48 <Qiming> so, yes, desired_capacity will always describe your new desire
13:52:06 <Qiming> it will be changed
13:52:16 <ruijie_> how does senlin treat the ERROR nodes when we do a 'SCALE_OUT' action
13:52:24 <Qiming> no matter whether the operation/action succeeded or not
13:52:32 <Qiming> we leave them there
13:52:34 <ruijie_> I know senlin will delete ERROR nodes first when we do a 'SCALE_IN'
13:53:12 <Qiming> that's a good question, I haven't thought it thru yet
13:53:42 <Qiming> maybe add a cluster-reset operation, delete all inactive nodes?
13:53:44 <Qiming> not sure
13:54:07 <Qiming> cluster-delete-all-inactive-or-error-or-warning-nodes?
13:54:12 <ruijie_> Maybe just let the user decide how to treat them
13:54:48 <Qiming> yup, will keep working on that when I finish the desired_capacity one
13:54:53 <Qiming> #topic open discussion
13:56:09 <Qiming> anything for a quick discussion?
13:56:32 <ruijie_> nope from me
13:57:57 <Qiming> okay, thanks everyone
13:58:07 <Qiming> best wishes to you and your family, ...
13:58:18 <lixinhui_> u2
13:58:21 <Qiming> it's mid-autumn season
13:58:29 <Qiming> thanks for joining, see you
13:58:36 <lixinhui_> cu
13:58:45 <guoshan> good night, and have fun in festival days
13:58:47 <Qiming> #endmeeting
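The desired_capacity rules discussed in this meeting can be worked through with a small sketch. It is a simplified illustration, not the code in senlin/engine/cluster.py: the function names, the reason strings other than the one quoted in the log, the ERROR status for falling below min_size, the WARNING status for exceeding max_size, and the max_size check in scale_out are all assumptions. What comes from the discussion is the defaults (min_size=0, max_size=-1), the ACTIVE condition desired_capacity <= active <= max_size with -1 meaning no limit, the WARNING status for 3 active nodes against a desire of 5, and the scale-out arithmetic based on the observed node count.

```python
def eval_status(active, desired, min_size=0, max_size=-1):
    """Return (status, reason) from the observed number of active nodes.

    Hypothetical simplification; max_size == -1 means no upper limit.
    """
    if active < min_size:
        # Assumption: falling below min_size is treated as an error.
        return "ERROR", "number of active nodes is below min_size"
    if active < desired:
        return "WARNING", "number of active nodes is below desired_capacity"
    if max_size < 0 or active <= max_size:
        # desired_capacity <= active <= max_size (or no limit at all).
        if active == desired:
            return "ACTIVE", "number of active nodes equals desired_capacity"
        return "ACTIVE", "number of active nodes is above desired_capacity"
    # Assumption: exceeding max_size is reported as a warning.
    return "WARNING", "number of active nodes is above max_size"


def scale_out(active, count, max_size=-1):
    """Return the new desired_capacity for a scale-out by `count` nodes.

    The basis is the observed number of active nodes, not the previous
    desired_capacity. Rejecting a result above max_size is an assumption.
    """
    new_desired = active + count
    if max_size >= 0 and new_desired > max_size:
        raise ValueError("scale-out would exceed max_size")
    return new_desired


# The example from the meeting: 3 active nodes, previous desire of 5.
status, reason = eval_status(active=3, desired=5)
new_desired = scale_out(active=3, count=3)
print(status, new_desired)
```

Separating the two reason strings for the ACTIVE case is one possible answer to the request for a friendlier description when the node count exactly matches desired_capacity.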