13:00:36 <Qiming> #startmeeting senlin
13:00:37 <openstack> Meeting started Tue Aug 2 13:00:36 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:38 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:41 <openstack> The meeting name has been set to 'senlin'
13:00:59 <Qiming> hello
13:01:06 <lixinhui_> hi
13:01:13 <haiwei_> hi
13:01:31 <Qiming> evening, xinhui, haiwei
13:01:35 <yanyanhu> hi
13:01:39 <yanyanhu> sorry I'm late
13:01:41 <Qiming> hi, yanyan
13:01:51 <Qiming> np
13:02:10 <Qiming> pls review agenda and see if you got items to add
13:02:12 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:02:40 <Qiming> first item, newton work items
13:02:48 <Qiming> #topic newton work items
13:02:56 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:37 <Qiming> updates ?
13:03:58 <yanyanhu> rally plugin
13:04:23 <yanyanhu> the patch for rally side is still in progress, will check and fix latest issues tomorrow morning
13:04:32 <yanyanhu> hope can finish it soon
13:04:43 <Qiming> okay, just got some comments from Roman ...
13:04:48 <yanyanhu> for senlin repo, plugin for cluster scaling has been proposed
13:04:50 <yanyanhu> Qiming, yes
13:05:00 <yanyanhu> need quick fix and also some explanation
13:05:08 <Qiming> okay
13:05:29 <yanyanhu> https://review.openstack.org/346656
13:05:32 <Qiming> interesting ... we are still exposing 'parent' to client?
13:05:32 <yanyanhu> the one for cluster scaling
13:05:56 <yanyanhu> I guess there is still some out of date msg in doc?
13:06:07 <Qiming> okay, it has been hanging there for some days
13:06:07 <yanyanhu> will check it and reply to roman
13:06:15 <Qiming> sounds great
13:06:24 <yanyanhu> yes, hope can complete it soon
13:06:38 <Qiming> maybe we can ask some helps from cmcc
13:06:51 <Qiming> don't know eldon can offer a hand
13:07:01 <yanyanhu> Qiming, yes, I think we can ask them for some use case reference
13:07:20 <yanyanhu> for coding, it's ok for me since
13:07:28 <Qiming> okay
13:07:34 <yanyanhu> there is no critical issue left I feel
13:07:45 <yanyanhu> anyway, will keep working on it
13:07:53 <Qiming> anything else in this space?
13:08:02 <yanyanhu> nope I think
13:08:06 <Qiming> moving on
13:08:18 <lixinhui_> About Fencing
13:08:22 <Qiming> health management, no progress from my side last week
13:08:50 <lixinhui_> I add some points on Qiming's HA etherpad
13:09:13 <lixinhui_> First step is to target fencing nova compute service
13:09:35 <lixinhui_> Second step is fencing of vm
13:09:49 <lixinhui_> for compute service fencing
13:10:01 <lixinhui_> which should happen when some host failure happens
13:10:11 <Qiming> that is actually about fencing a nova compute node, correct?
13:10:19 <Qiming> yep
13:10:23 <lixinhui_> Yes, Qiming
13:10:31 <Qiming> I don't have a multi-node setup at hand
13:10:48 <Qiming> cannot produce a compute node failure to observe the host failure events
13:10:53 <lixinhui_> So I wonder if proper to add this into healthmonitor
13:11:06 <lixinhui_> Qiming
13:11:11 <Qiming> have you got any hints on that? either by digging into the source or doc or thru experimentation?
13:11:37 <lixinhui_> compute node failure can only be known by polling service status
13:11:45 <Qiming> observing host failure could only be done thru events
13:12:07 <lixinhui_> Actually
13:12:13 <Qiming> I'm a little reluctant to poll nova compute services
13:12:28 <lixinhui_> Nova today use heartbeat to know if host is alive or not
13:12:37 <Qiming> that is their internals
13:12:55 <Qiming> we are not supposed to peek into that
13:13:01 <lixinhui_> There are only two types of event for nova to notice
13:13:08 <lixinhui_> one is the node.update
13:13:16 <lixinhui_> the other is service.update
13:13:17 <Qiming> IIRC, nova has event reports when a host fails
13:14:01 <Qiming> okay, then we can listen to those events
13:14:07 <lixinhui_> Service.update can be sent only when the change happen on the nova services by nova service*
13:14:28 <Qiming> don't understand
13:15:08 <lixinhui_> you can read the code of nova/objects/service
13:15:16 <lixinhui_> .py
13:15:36 <Qiming> can you pls just explain your last sentence?
13:15:38 <lixinhui_> my experiments prove this
13:16:28 <lixinhui_> that means the up or down of nova services will be changed based on heartbeat without notice
13:16:50 <lixinhui_> but service.update will be sent when I enable or disable some service
13:16:57 <Qiming> okay, those are the nova internal state maintenance, we cannot check it from outside
13:17:15 <lixinhui_> so
13:17:24 <Qiming> if nova-compute is down, no event notification is sent?
13:17:31 <lixinhui_> no
13:17:36 <lixinhui_> after two cycle
13:17:44 <lixinhui_> the service becomes down
13:17:47 <lixinhui_> that is all
13:18:20 <lixinhui_> after detection that, we can fencing the compute
13:18:21 <Qiming> good/bad to know ...
13:18:49 <Qiming> sounds like the only way for failure detection is polling?
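The polling approach raised in this exchange could look roughly like the sketch below. It is only an illustration of the idea, not Nova or Senlin code: the record keys ('host', 'binary', 'last_heartbeat'), the function names, and the 60 second threshold are all assumptions made for the example. A real poller would fill the list from the compute API's service listing.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how long a service may stay silent before it is
# treated as down. The real value is a deployment setting, not given in the log.
DOWN_AFTER = timedelta(seconds=60)


def is_up(last_heartbeat, now, down_after=DOWN_AFTER):
    """Return True if the last heartbeat falls inside the allowed window."""
    return (now - last_heartbeat) <= down_after


def find_down_hosts(services, now, down_after=DOWN_AFTER):
    """Return the sorted hosts whose compute service missed its heartbeat.

    `services` is a list of dicts with the hypothetical keys 'host',
    'binary' and 'last_heartbeat' (a datetime).
    """
    return sorted(
        svc['host'] for svc in services
        if svc['binary'] == 'nova-compute'
        and not is_up(svc['last_heartbeat'], now, down_after)
    )


if __name__ == '__main__':
    now = datetime(2016, 8, 2, 13, 18, 0)
    services = [
        {'host': 'node-1', 'binary': 'nova-compute',
         'last_heartbeat': now - timedelta(seconds=5)},
        {'host': 'node-2', 'binary': 'nova-compute',
         'last_heartbeat': now - timedelta(seconds=120)},
    ]
    # Hosts returned here would be the candidates for fencing.
    print(find_down_hosts(services, now))
```

Keeping the decision in a pure function means the same check could later be fed from notifications instead of polling, if Nova turns out to emit them.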
13:18:59 <Qiming> need to double check that
13:19:04 <Qiming> that was my understanding
13:19:06 <lixinhui_> or read the status of nova service
13:19:28 <Qiming> but last time in a mailinglist discussion, I raised this question
13:19:30 <lixinhui_> that would be good if you can double check
13:19:51 <Qiming> someone told me that nova is already capable of sending out notifications when a compute service is down
13:20:04 <Qiming> I hope I have ten heads
13:20:18 <Qiming> need to dig that email
13:20:25 <Qiming> or the source code
13:21:00 <Qiming> #action Qiming to double check nova's capability of notifying host down
13:21:10 <lixinhui_> they indeed add some notices
13:21:10 <Qiming> moving on
13:21:13 <lixinhui_> nova/nova/objects/service.py
13:21:29 <Qiming> documentation side
13:21:42 <Qiming> added some user references docs last week
13:21:57 <Qiming> mainly a reorg around auto-scaling, receivers ... etc
13:22:44 <Qiming> I was thinking of adding a tutorial about auto-scaling, but later I realized that is a huge topic, not suitable for a tutorial, which is supposed to be pretty short
13:23:22 <Qiming> I have also moved the heat based autoscaling under a scenarios subdirectory
13:23:35 <Qiming> where in future we can add more scenarios for references
13:24:03 <Qiming> will check if tutorial doc can be left there ...
13:24:15 <Qiming> next ...
13:24:34 <Qiming> yanyan just started adding version control to profile and policy specs
13:24:47 <Qiming> this is necessary, pls help review
13:24:58 <yanyanhu> yes, just proposed the first patch https://review.openstack.org/348709
13:25:06 <Qiming> thanks
13:25:06 <yanyanhu> to add version support to schema and spec
13:25:28 <Qiming> in parallel, I'm looking into oslo.versionedobjects for a more holistic solution
13:25:32 <Qiming> will update later
13:25:43 <Qiming> moving on ...
13:25:43 <yanyanhu> my pleasure. Really need some discussion about this topic
13:25:51 <Qiming> container profile support
13:26:24 <Qiming> haiwei just pushed a commit: https://review.openstack.org/#/c/349906
13:27:11 <haiwei_> yes, Qiming
13:27:22 <Qiming> I haven't got time to review
13:27:28 <haiwei_> I only tested it partly
13:27:28 <Qiming> just a quick glance
13:27:59 <Qiming> team, please take a look at it and help polish it when you got cycles
13:28:19 <yanyanhu> sure, will check it
13:28:28 <Qiming> thx
13:28:30 <Qiming> moving on
13:28:32 <haiwei_> I think the point for that patch is where should we store 'host_node' uuid? in that patch I stored it in the metadata of profile
13:29:00 <Qiming> maybe node.data ?
13:29:33 <Qiming> if you check other policy decisions such as zone placement, region placement ...
13:29:54 <haiwei_> ok, I will think about it
13:29:59 <Qiming> we are injecting data into the 'data' field of the node (abstract one)
13:30:23 <Qiming> then when we are about to create the physical resource, we extract those policy decisions
13:30:36 <Qiming> profile metadata was designed for users to use
13:30:49 <haiwei_> in service layer we got host_node, but it is not the server's id, so we need to pass server's id to profile
13:31:08 <Qiming> e.g. {'author': 'haiwei', 'last-updated': '2016-08-02', ... } etc
13:31:35 <Qiming> we can pass those information in node.data field
13:31:50 <haiwei_> I will check it later
13:31:52 <Qiming> the node.data field was designed to carry those data around
13:31:55 <Qiming> great
13:32:33 <Qiming> pls also think if we can move the policy decision out into a policy type
13:33:04 <Qiming> 1. that will make the engine code cleaner; 2. we could later improve/replace that policy type implementation easily
13:33:19 <haiwei_> what policy decision?
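The suggestion to carry the host selection in node.data rather than in profile metadata can be illustrated with a small sketch. Plain dicts stand in for the node and the profile spec, and the key names 'placement' and 'host_node' as well as both function names are hypothetical; this is not taken from the Senlin code base.

```python
import copy


def apply_placement_decision(node, host_node_id):
    """Record a policy decision in the node's 'data' field.

    `node` is a plain dict standing in for the abstract node. The
    'placement' and 'host_node' key names are hypothetical. The input
    is left untouched and a modified copy is returned.
    """
    updated = copy.deepcopy(node)
    data = updated.setdefault('data', {})
    placement = data.setdefault('placement', {})
    placement['host_node'] = host_node_id
    return updated


def build_create_params(node, profile_spec):
    """Merge the profile spec with the decisions carried in node['data']."""
    params = dict(profile_spec)
    placement = node.get('data', {}).get('placement', {})
    if 'host_node' in placement:
        params['host_node'] = placement['host_node']
    return params


if __name__ == '__main__':
    node = {'name': 'container-1',
            'metadata': {'author': 'haiwei'},  # user-owned, never modified
            'data': {}}
    placed = apply_placement_decision(node, 'server-uuid-placeholder')
    print(build_create_params(placed, {'image': 'busybox'}))
```

The user-facing metadata stays exactly as the user wrote it, while the engine-made decision travels separately until the physical resource is created.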
13:33:33 <haiwei_> currently I am thinking about node_create
13:33:37 <Qiming> by "policy decision" I mean the selection of node in a hosting cluster
13:34:26 <Qiming> just something to keep in mind, I'm not sure how feasible it is without digging into the code
13:34:39 <haiwei_> ok
13:34:45 <Qiming> great, moving on
13:34:52 <Qiming> zaqar based receiver
13:35:13 <Qiming> yanyan has been busy working on that ...
13:35:31 <yanyanhu> yes, have confirmed with zaqar team again about the usage of "project_id" and "client_id" today
13:35:53 <yanyanhu> just as you said, we should expose them out for invoker of sdk proxy call
13:36:20 <yanyanhu> have post the latest result in the follow patches: * https://review.openstack.org/349369
13:36:21 <yanyanhu> * https://review.openstack.org/338041
13:36:37 <Qiming> okay, the thing to bear in mind is ...
13:37:03 <Qiming> if you put 'client_id = Header('Client-ID') in that Message class
13:37:21 <Qiming> the header still won't appear in the final request ...
13:37:41 <Qiming> that is something I missed when reviewing your last patch
13:38:23 <yanyanhu> overriding resource calls will make it take effect I think?
13:38:23 <Qiming> so ... your way of overriding those methods are still valid, though there are rooms for improvement
13:38:29 <Qiming> yes
13:38:30 <yanyanhu> like the latest patch does
13:38:36 <yanyanhu> yes
13:38:49 <yanyanhu> really not graceful way
13:38:51 <Qiming> it is ugly, but ... you know, people need time to understand the issue we are facing
13:38:58 <yanyanhu> yea
13:39:17 <Qiming> we should allow custom headers in all those create, get, list calls
13:39:18 <yanyanhu> hope brian can figure it out when building resource2
13:39:24 <yanyanhu> using better way
13:39:29 <yanyanhu> yes
13:39:30 <Qiming> resource2 is already there ...
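The point about the header not reaching the final request can be shown with a small stand-in. Nothing below is the openstacksdk API: RecordingSession, merge_headers, post_message and the URL path are hypothetical names made up for the example, and 'X-PROJECT-ID' is assumed as the project header name. Only 'Client-ID' comes from the discussion itself.

```python
def merge_headers(defaults, client_id=None, project_id=None):
    """Return a new header dict with the caller-supplied IDs added."""
    headers = dict(defaults)
    if client_id is not None:
        headers['Client-ID'] = client_id
    if project_id is not None:
        # Assumed header name for the project ID.
        headers['X-PROJECT-ID'] = project_id
    return headers


class RecordingSession:
    """Stand-in for an HTTP session; it only stores what would be sent."""

    def __init__(self):
        self.requests = []

    def post(self, url, json=None, headers=None):
        self.requests.append(
            {'url': url, 'json': json, 'headers': headers or {}})


def post_message(session, queue, body, client_id, project_id=None):
    """Send a message, passing the custom headers explicitly on the call."""
    headers = merge_headers({'Content-Type': 'application/json'},
                            client_id, project_id)
    # Hypothetical path, used only to make the example concrete.
    session.post('/queues/%s/messages' % queue, json=body, headers=headers)


if __name__ == '__main__':
    session = RecordingSession()
    post_message(session, 'senlin-receiver', {'ttl': 300}, 'client-1', 'proj-1')
    print(session.requests[0]['headers'])
```

Passing the headers as call arguments, rather than declaring them as attributes on a resource class, is what guarantees they end up in the outgoing request.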
13:39:36 <yanyanhu> if so, that will be much better
13:39:49 <yanyanhu> Qiming, it still needs some improvement I think
13:39:50 <Qiming> but he doesn't seem buy in the idea of adding more parameters
13:40:00 <yanyanhu> for those "corner" use cases
13:40:12 <Qiming> thanks for keeping the balls rolling
13:40:24 <Qiming> will review your new patchset tomorrow
13:40:32 <yanyanhu> thanks a lot
13:40:38 <Qiming> moving on
13:40:47 <Qiming> events/notifications
13:40:55 <Qiming> no update from me in this space
13:41:17 <Qiming> actually I was trapped in a more general issue .... versioning of things
13:41:43 <Qiming> okay, next topic
13:41:49 <Qiming> #topic newton deliverables
13:42:29 <Qiming> though I've been digging into the issue of versioning of things, I don't think we can get it done by this cycle
13:43:04 <Qiming> on the other hand, the new features about cluster-collect and cluster-do will have to base on micro-version support
13:43:22 <Qiming> which is also blocked here: https://review.openstack.org/#/c/343992/
13:43:56 <Qiming> still need time to convince brian that the current patch is already okay
13:44:07 <yanyanhu> this part is really complicated...
13:44:37 <Qiming> the overall design and impl is good, there are some trivial coding style things for communication
13:44:49 <Qiming> health policy implementation ...
13:45:08 <Qiming> I do hope we can deliver a basic, working version by this cycle
13:45:17 <yanyanhu> sure
13:45:18 <Qiming> as for container cluster
13:45:33 <yanyanhu> really need to achieve that goal I feel
13:45:35 <Qiming> it would be GREAT we can have a basic, working version
13:45:51 <Qiming> yes, people are asking questions on that
13:46:01 <haiwei_> yes
13:46:18 <Qiming> let's keep working hard on this
13:46:24 <Qiming> s/this/these
13:46:39 <Qiming> next topic I added is about versioned objects
13:46:56 <Qiming> when adding new properties to policy (e.g. the lb policy revision lately)
13:47:03 <Qiming> we need to bump policy version
13:47:19 <Qiming> so ... we have a lot of things to be versioned
13:47:35 <Qiming> 1. API micro-version
13:47:43 <Qiming> 2. API request body version
13:47:51 <Qiming> 3. API response version
13:47:57 <Qiming> 4. RPC version
13:48:07 <Qiming> 5. DB object version
13:48:14 <Qiming> 6. Event/Notification version
13:48:21 <Qiming> 7. Policy type version
13:48:26 <Qiming> 8. Profile type version
13:48:39 <Qiming> 9. Receiver version
13:49:05 <Qiming> without proper versioning infra at hand, we will quickly lose control of things
13:49:16 <yanyanhu> Qiming, yes, almost every elements that could vary over time
13:49:17 <Qiming> and things will break in a thousand ways
13:49:44 <Qiming> so... I'm investigating oslo.versionedobjects, every single line of code there
13:49:54 <Qiming> and also jsonschema doc/implementation
13:50:21 <Qiming> I think I have got a rough idea on how to unify all object versioning into the same framework
13:50:42 <Qiming> but that warrants a lot of experimentation and code churn
13:51:14 <Qiming> will leave that as a long term work, maybe by end of Ocata we will have this framework completely landed
13:51:41 <Qiming> ideally, after that, when you want to add a new property to an existing resource
13:52:09 <Qiming> you won't need to modify a few hundred lines of code while still worrying about breaking existing users
13:52:22 <yanyanhu> great, we can add version support for different elements gradually I think
13:52:42 <Qiming> some preliminary code have proved the feasibility of this
13:53:05 <Qiming> we can even make the api-ref documentation generated out of these objects
13:53:10 <yanyanhu> start from most basic part and keep it in mind when making changes on those "unversioned" stuff
13:53:24 <Qiming> yep
13:53:56 <Qiming> so that is my update in this thread
13:54:15 <Qiming> I didn't leave any time for questions/comments
13:54:25 <Qiming> #topic open discussion
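The idea of bumping a policy or profile version whenever a property is added can be sketched as a version-gated schema check. This is a minimal illustration and not the oslo.versionedobjects API or the patch under review: the Property class, validate_spec, and the property names 'pool' and 'health_monitor' are hypothetical.

```python
def parse_version(text):
    """Turn '1.1' into (1, 1) so that versions compare numerically."""
    return tuple(int(part) for part in text.split('.'))


class Property:
    """Schema entry that records which spec versions support it."""

    def __init__(self, min_version=None, max_version=None):
        self.min_version = min_version
        self.max_version = max_version

    def supported_in(self, version):
        current = parse_version(version)
        if self.min_version and current < parse_version(self.min_version):
            return False
        if self.max_version and current > parse_version(self.max_version):
            return False
        return True


def validate_spec(schema, properties, version):
    """Return a list of error strings; an empty list means the spec is valid."""
    errors = []
    for name in properties:
        prop = schema.get(name)
        if prop is None:
            errors.append("unknown property '%s'" % name)
        elif not prop.supported_in(version):
            errors.append("property '%s' is not supported in version %s"
                          % (name, version))
    return errors


if __name__ == '__main__':
    # Hypothetical schema: 'health_monitor' was added in version 1.1.
    schema = {
        'pool': Property(),
        'health_monitor': Property(min_version='1.1'),
    }
    spec = {'pool': {}, 'health_monitor': {}}
    print(validate_spec(schema, spec, '1.0'))
    print(validate_spec(schema, spec, '1.1'))
```

With the version range stored on each property, adding a new property touches one schema entry, and specs written against an older version keep validating as before.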
13:54:31 <yanyanhu> no problem, will check the code :)
13:54:44 <yanyanhu> voting is open now :)
13:54:50 <yanyanhu> good luck for senlin's topic
13:54:59 <yanyanhu> topic(s)
13:55:07 <Qiming> yup
13:55:09 <Qiming> blessing
13:55:33 <Qiming> this is a very strange document to read ... http://json-schema.org/latest/json-schema-validation.html
13:55:59 <yanyanhu> yes...
13:56:16 <Qiming> and that is their most comprehensive one I guess .... :)
13:56:23 <namnh> Hi
13:56:23 <yanyanhu> really hope we are native eng speaker :)
13:56:38 <Qiming> hi, namnh
13:57:46 <Qiming> anything else?
13:57:55 <yanyanhu> nope from me
13:58:42 <Qiming> seems lixinhui has dropped
13:58:50 <yanyanhu> yea
13:58:57 <yanyanhu> will over time soon
13:58:59 <Qiming> but anyway I will look into the nova code
13:59:15 <Qiming> thanks, guys
13:59:21 <Qiming> let's meet next week
13:59:23 <yanyanhu> thanks, have a good night
13:59:28 <Qiming> #endmeeting