Monday, 2016-11-21

*** ruijie has joined #senlin		00:58
*** yanyanhu has joined #senlin		01:41
yanyanhu	hi, Qiming, just got a response from Heidi last weekend; she said they are still waiting for the illustrator to give them the updated version, since they didn't think the first version was good enough.	01:43
Qiming	ok	01:44
yanyanhu	and she will message me as soon as she gets it	01:44
*** elynn has joined #senlin		01:44
Qiming	as long as they are still working on it, it is fine	01:44
yanyanhu	yep, looks so	01:46
*** elynn has quit IRC		01:47
*** elynn has joined #senlin		01:49
*** XueFeng has joined #senlin		01:56
*** elynn has joined #senlin		02:11
openstackgerrit	Merged openstack/python-senlinclient: Support "global_project" arguments for action-list https://review.openstack.org/397805	02:11
openstackgerrit	Qiming Teng proposed openstack/senlin: Move notifications object down one level https://review.openstack.org/400024	02:15
openstackgerrit	miaohb proposed openstack/python-senlinclient: Cluster collect display error https://review.openstack.org/400026	02:25
*** yuanying has quit IRC		02:50
*** yuanying has joined #senlin		02:51
openstackgerrit	Qiming Teng proposed openstack/senlin: Registry support for notification classes https://review.openstack.org/400033	02:54
Qiming	are we suffering from the same problem? https://review.openstack.org/#/c/378572/	03:19
*** zhurong has joined #senlin		03:27
*** yuanying has quit IRC		03:46
*** elynn has quit IRC		03:48
*** yuanying has joined #senlin		03:48
*** zhurong has quit IRC		03:58
*** zhurong has joined #senlin		03:59
*** gongysh2 has quit IRC		04:01
*** shu-mutou-AWAY is now known as shu-mutou		04:13
*** zhurong has quit IRC		04:26
*** gongysh2 has joined #senlin		04:27
*** rasmus has joined #senlin		04:32
*** rasmus has quit IRC		04:36
*** elynn has joined #senlin		04:52
yanyanhu	hi, Qiming, we are not I think. We inherit from service.Service for three services including engine, dispatcher and health manager and I think all of them are used with threadgroup	04:55
Qiming	okay, good to know	04:55
*** elynn has quit IRC		04:57
*** elynn has joined #senlin		04:58
*** zhurong has joined #senlin		05:11
openstackgerrit	Merged openstack/senlin: Add TODO item about referencing existing pool https://review.openstack.org/398872	05:12
*** zhurong has quit IRC		05:31
*** zhurong has joined #senlin		05:35
openstackgerrit	Merged openstack/senlin: Add request object for event-get https://review.openstack.org/399835	05:51
openstackgerrit	Merged openstack/senlin: Engine support for profile-validate2 https://review.openstack.org/398863	05:53
*** elynn has quit IRC		06:09
*** elynn has joined #senlin		06:31
openstackgerrit	lvdongbing proposed openstack/senlin: API support for profile-validate2 https://review.openstack.org/400062	06:31
*** elynn has quit IRC		06:36
*** elynn has joined #senlin		06:36
Qiming	yanyanhu, free for a quick discussion?	06:46
yanyanhu	Qiming, sure	06:46
Qiming	working on versioned notification ...	06:46
yanyanhu	ok	06:46
Qiming	I don't think we will have a big problem unifying the logging interface among database, message, file, etc	06:47
Qiming	I'm deferring that work till we get a poc implementation for versioned notification	06:47
Qiming	once the interface is proved to be flexible, generic enough for both database and message, we can work on the generalization step	06:48
yanyanhu	yes, that makes sense. Once the poc implementation is ready, it will be easy to add more backend	06:48
Qiming	before we are there, we need to get the versioned notification thing done	06:48
Qiming	I'm somehow blocked by the granularity problem when modeling events	06:49
Qiming	it is not like the LOG.info, LOG.error, ... which we treated as free	06:49
Qiming	for notifications, if no one is receiving and processing them, it seems that they will accumulate in the message queue for a long time	06:50
yanyanhu	you're worried about the overhead	06:50
yanyanhu	oh, I see	06:50
Qiming	I just experienced that when reinstalling devstack with gnocchi and aodh added	06:50
yanyanhu	agree with this. Actually I think amqp is designed for runtime message delivery	06:51
yanyanhu	with an assumption that the consumer is always online to receive and handle message	06:51
Qiming	and ... previous experience debugging some enterprise middleware ... do NOT overly log, do NOT overly notify ...	06:51
yanyanhu	this is different from log type of message, e.g. kafka	06:51
yanyanhu	yes, it is	06:52
Qiming	alright, I did consider this when drafting the spec file, so eventually, we will expose some switches in the config file for users to customize ...	06:52
Qiming	what level of events should be fired, what kind of events should be masked etc ...	06:53
yanyanhu	yep	06:53
Qiming	that would be ... complex to use, but flexible enough to meet requirements we are not anticipating	06:53
Qiming	okay, enough recap	06:53
Qiming	the problem is ... we have too many choices to send event notifications	06:53
Qiming	we have to make some design decisions on this	06:54
Qiming	we are not supposed to emit a notification whenever we just need a debug info	06:54
Qiming	even with user customization, we will only decide whether to emit an event at the last moment	06:55
yanyanhu	last moment, you mean?	06:55
Qiming	we are not supposed to place a lot of 'if ... else ..' calls at the call site	06:55
yanyanhu	Qiming, that's for sure...	06:56
Qiming	take the LOG.info calls as an example	06:56
Qiming	we are calling it everywhere	06:56
Qiming	if we want to do conditional logging, we are not supposed to add 'if (info is allowed for this module, for this action) then LOG.info' everywhere	06:57
Qiming	we will keep the call site simple, just a single line, LOG.info(...)	06:57
yanyanhu	yes	06:57
yanyanhu	so there should be a filter for this purpose?	06:58
Qiming	then in the driver layer, we decide whether we will actually generate an event notification (or db record)	06:58
Qiming	called a filter or a filter chain if you want	06:58
Qiming	but that filtering logic is not supposed to be placed at the call site, instead it should be placed inside the 'info' call	06:59
yanyanhu	yes	06:59
Qiming	in other words, the 'info' call should be smart enough to handle these customizations, correct?	06:59
yanyanhu	right	06:59
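
The idea discussed above (a one-line call site, with the filtering decision made inside the 'info' call at the last moment) can be sketched roughly as follows. This is only an illustration, not Senlin's implementation: the class names `EventFilter` and `EventDispatcher`, the `min_level` and `masked` switches, and the backend callables are all hypothetical stand-ins for whatever the config file and driver layer end up exposing.

```python
import logging


class EventFilter:
    """Hypothetical filter built from user-facing config switches."""

    def __init__(self, min_level=logging.INFO, masked=()):
        self.min_level = min_level   # lowest level that may be emitted
        self.masked = set(masked)    # event types the user switched off

    def allows(self, level, event_type):
        return level >= self.min_level and event_type not in self.masked


class EventDispatcher:
    """Hypothetical driver layer: call sites never test the filter themselves."""

    def __init__(self, event_filter, backends):
        self.event_filter = event_filter
        self.backends = backends     # callables, e.g. a DB writer or a notifier

    def _emit(self, level, event_type, payload):
        # The filtering decision is made here, not at the call site.
        if not self.event_filter.allows(level, event_type):
            return False
        for backend in self.backends:
            backend(level, event_type, payload)
        return True

    def info(self, event_type, payload=None):
        return self._emit(logging.INFO, event_type, payload)

    def warning(self, event_type, payload=None):
        return self._emit(logging.WARNING, event_type, payload)

    def error(self, event_type, payload=None):
        return self._emit(logging.ERROR, event_type, payload)
```

A call site then stays a single line, such as `events.info("example.start", {"id": "c1"})`, whether or not the event is eventually dropped by the filter.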
Qiming	okay, then ... where do we place those 'info/warn/error/' calls?	07:00
Qiming	(suppose we can filter them eventually, efficiently, at the last moment)	07:00
yanyanhu	each key point of workflow I guess?	07:00
yanyanhu	e.g. action starts, succeeds	07:01
Qiming	right, the question lies in the definition of "key point of workflows"	07:01
Qiming	say, cluster-scale-out as a workflow	07:01
yanyanhu	tough question :)	07:01
Qiming	where do we call event generation?	07:01
yanyanhu	inside engine, I think service call, action building, policy taking effect, action scheduling/executing/finishing?	07:02
yanyanhu	and those points inside each sub action	07:03
yanyanhu	It's hard to ask the end user to make the decision, I feel	07:04
Qiming	we can emit event at the following places: 1) rpc request received and validated 2) cluster_scale_out action queued 3) cluster_scale_out action starts execution 4) cluster_scale_out action forks node_create action; 5) node_create action queued; 6) node_create action starts execution; 7) node_create action fails/succeeds 8) cluster_create action fails/succeeds 9) the original request reached a conclusion, i.e. cluster was scaled or not (status changes)	07:04
Qiming	I do see every step a key point in the workflow	07:05
yanyanhu	yes, those events should be emitted	07:05
Qiming	but I don't think we need to log them all	07:05
Qiming	it is too heavy	07:05
Qiming	completely ruining the idea of notification	07:06
yanyanhu	Qiming, that's true	07:06
yanyanhu	especially consider the overhead from interacting with event backend	07:06
*** guoshan has joined #senlin		07:06
Qiming	correct, 5), 6), 7) above are proportional to the scale of a cluster operation	07:07
Qiming	after drawing this on a paper, I'm astonished ...	07:08
Qiming	we cannot afford logging so many events where each event will carry a lot of payload (based on my current design)	07:08
Qiming	oslo versioned objects, when dumped, are already generating a lot of overhead regarding bytes added	07:09
yanyanhu	yes	07:09
yanyanhu	at large scale, that could be very inefficient	07:09
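
To make the scaling concern concrete, here is a back-of-the-envelope count of events for one scale-out, assuming (as stated above) that steps 5), 6) and 7) occur once per created node and the other six steps occur once per request. The function name and its defaults are illustrative only.

```python
def events_per_scale_out(node_count, fixed_steps=6, per_node_steps=3):
    """Rough event count if every step in the nine-item list emits an event.

    fixed_steps: steps 1-4, 8 and 9, assumed to fire once per request.
    per_node_steps: steps 5-7, assumed to fire once per created node.
    """
    return fixed_steps + per_node_steps * node_count


for nodes in (1, 10, 100, 1000):
    print(nodes, events_per_scale_out(nodes))
```

Under these assumptions the count grows linearly with the number of nodes, and each of those events would also carry a serialized payload.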
Qiming	suppose we dump the cluster properties for all the events above, and all the action properties for these events	07:09
Qiming	if we don't dump all the properties, we will be challenged ... why the cluster_scale_out event didn't tell me when it was started and when it was stopped? I want to compute the duration of its execution ...	07:11
Qiming	so ... a tough decision, right?	07:11
yanyanhu	yes	07:11
yanyanhu	if so maybe we start from coarse granularity (e.g. only logging cluster level events) and then try finer granularity and evaluate the increase in overhead?	07:11
Qiming	here is my current proposal	07:11
Qiming	we don't dump action details	07:12
Qiming	we think from end user's perspective	07:12
Qiming	they shouldn't care about the asynchronous/synchronous execution of cluster operations ...	07:13
Qiming	it was ... senlin ... that makes things a "mess"	07:13
Qiming	take node_create as an example	07:14
Qiming	if it is a derived action, not one originated from RPC request, we don't have to expose that detail to users	07:14
Qiming	we instead should strive and focus on exposing information on the cluster operation itself ... even if it fails, we let the users know why it failed ..	07:15
openstackgerrit	xu-haiwei proposed openstack/senlin: Update host node 'dependents' when create/delete container node https://review.openstack.org/396016	07:15
Qiming	that is the 'original' goal of events or notifications	07:15
yanyanhu	Qiming, yep, totally makes sense. They are events not "debug" info	07:16
*** zhurong has quit IRC		07:16
Qiming	back to the list above	07:16
Qiming	I'd like to focus on 3), 9) only	07:16
yanyanhu	8) duplicates action list/get?	07:17
Qiming	in terms of event notification, there will be three types of events for this operation: cluster.scale_out.start, cluster.scale_out.end, cluster.scale_out.error	07:18
Qiming	and .. that is ALL	07:18
yanyanhu	that's reasonable	07:18
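
The three-event scheme above (`cluster.scale_out.start`, then either `cluster.scale_out.end` or `cluster.scale_out.error`) could be sketched along these lines. The helper `run_with_events`, the `emit` callable, and the use of an exception to signal failure are assumptions made for the sake of a small example; the real action code may report failure through result codes and a status reason instead.

```python
def run_with_events(emit, entity, action, operation):
    """Emit exactly two events per operation: start, then end or error.

    emit: hypothetical callable taking (event_type, payload).
    operation: zero-argument callable that performs the actual work.
    """
    prefix = "%s.%s" % (entity, action)
    emit(prefix + ".start", {})
    try:
        result = operation()
    except Exception as exc:
        # Carry the failure reason so users can see why the operation failed.
        emit(prefix + ".error", {"status_reason": str(exc)})
        raise
    emit(prefix + ".end", {})
    return result
```

Derived actions such as node_create would run inside `operation` without emitting anything themselves, so the number of events per request stays constant regardless of cluster size.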
Qiming	oh, 8) is inside the 'do_scale_out' function, and 9) is at the end of the '_execute' function	07:19
yanyanhu	I see	07:19
Qiming	it is gonna complicate the event generation a little bit, regarding the derivation of "status reason", but ...	07:20
Qiming	the simplification of overall infrastructure may justify that effort, I hope	07:20
yanyanhu	it will I think	07:21
Qiming	okay, will proceed on this	07:21
yanyanhu	otherwise, the overhead could be unaffordable	07:21
yanyanhu	great, thanks for those explanations :)	07:21
Qiming	and try to apply the same principle on node operations (those derived from RPC)	07:21
Qiming	em ... actually, it may not be that complex	07:22
Qiming	we have been working very hard to condense error messages into the action.status and even into cluster status	07:22
Qiming	that includes failures of policy checks ...	07:23
Qiming	so we will see if we have to emit something when a policy check has failed	07:23
yanyanhu	ok	07:23
yanyanhu	sounds feasible	07:23
Qiming	it is not that interesting either, if we have recorded the reason why a cluster operation has failed	07:24
Qiming	okay, thx for ur time, :)	07:24
yanyanhu	event can be used together with action get I think	07:24
yanyanhu	my pleasure	07:24
yanyanhu	hope this digging can help the team better understand the design principle	07:25
yanyanhu	:)	07:25
openstackgerrit	lvdongbing proposed openstack/senlin: Engine support for profile-create2 https://review.openstack.org/400075	07:25
Qiming	will try to document these design considerations when creating developer docs	07:25
yanyanhu	great	07:27
openstackgerrit	miaohb proposed openstack/python-senlinclient: The default value of "--list" in cluster-collect's help message displays error https://review.openstack.org/400076	07:30
openstackgerrit	lvdongbing proposed openstack/senlin: API support for profile-create2 https://review.openstack.org/400079	07:50
openstackgerrit	Yanyan Hu proposed openstack/senlin: Fix an error in integration test https://review.openstack.org/400081	07:53
openstackgerrit	miaohb proposed openstack/python-senlinclient: Revise the help message of cluster-collect https://review.openstack.org/400076	08:06
openstackgerrit	miaohb proposed openstack/python-senlinclient: Revise the help message of cluster-collect https://review.openstack.org/400076	08:12
openstackgerrit	miaohb proposed openstack/python-senlinclient: Revise the help info of cluster collect https://review.openstack.org/400026	08:19
openstackgerrit	lvdongbing proposed openstack/senlin: Remove dead code related to profile-get in engine layer https://review.openstack.org/400093	08:22
openstackgerrit	lvdongbing proposed openstack/senlin: Remove dead code related to profile-update in engine layer https://review.openstack.org/400104	08:30
openstackgerrit	Shan Guo proposed openstack/senlin: Modify the cli in doc of policy attach command https://review.openstack.org/400105	08:31
*** gongysh2 has quit IRC		08:43
openstackgerrit	lvdongbing proposed openstack/senlin: Remove dead code related to profile-delete in engine layer https://review.openstack.org/400114	08:44
openstackgerrit	Merged openstack/senlin: Add engine support for event_get2 https://review.openstack.org/399836	08:50
openstackgerrit	Merged openstack/senlin: Api support for event_get2 https://review.openstack.org/399841	08:52
openstackgerrit	RUIJIE YUAN proposed openstack/senlin: prepare for "destory" parameter in cluster-replace-nodes https://review.openstack.org/400129	09:04
openstackgerrit	miaohb proposed openstack/python-senlinclient: Fix error in cluster collect https://review.openstack.org/400133	09:15
openstackgerrit	Yanyan Hu proposed openstack/senlin: Versioned request object for receiver-delete https://review.openstack.org/400135	09:19
openstackgerrit	Yanyan Hu proposed openstack/senlin: Engine support for receiver_delete2 https://review.openstack.org/400136	09:19
*** yanyanhu has quit IRC		09:24
*** shu-mutou is now known as shu-mutou-AWAY		09:25
openstackgerrit	Merged openstack/python-senlinclient: Updated from global requirements https://review.openstack.org/395377	09:30
openstackgerrit	lvdongbing proposed openstack/senlin: Versioned request objects for profile_type https://review.openstack.org/400148	09:39
*** elynn has quit IRC		09:51
*** guoshan has quit IRC		10:40
*** guoshan has joined #senlin		11:41
*** guoshan has quit IRC		11:46
-openstackstatus- NOTICE: We are currently having capacity issues with our ubuntu-xenial nodes. We have addressed the issue but will be another few hours before new images have been uploaded to all cloud providers.		12:20
*** catintheroof has joined #senlin		12:31
openstackgerrit	XueFeng Liu proposed openstack/senlin: Fix nova resource leak https://review.openstack.org/400232	12:40
*** guoshan has joined #senlin		12:42
*** guoshan has quit IRC		12:47
openstackgerrit	Merged openstack/senlin: Update host node 'dependents' when create/delete container node https://review.openstack.org/396016	13:33
*** guoshan has joined #senlin		13:43
*** bran has quit IRC		13:44
*** guoshan has quit IRC		13:47
openstackgerrit	Qiming Teng proposed openstack/senlin: Remove NotificationPayloadBase class https://review.openstack.org/400266	14:20
openstackgerrit	Qiming Teng proposed openstack/senlin: New fields for versioned notification https://review.openstack.org/400267	14:20
openstackgerrit	Qiming Teng proposed openstack/senlin: New fields for versioned notification https://review.openstack.org/400267	14:21
*** guoshan has joined #senlin		14:44
*** guoshan has quit IRC		14:48
*** elynn has joined #senlin		14:50
*** elynn has quit IRC		15:09
*** guoshan has joined #senlin		15:44
*** guoshan has quit IRC		15:49
*** guoshan has joined #senlin		16:45
*** guoshan has quit IRC		16:50
*** guoshan has joined #senlin		17:46
*** guoshan has quit IRC		17:51
*** guoshan has joined #senlin		19:01
*** guoshan has quit IRC		19:05
*** guoshan has joined #senlin		20:02
*** guoshan has quit IRC		20:06
*** guoshan has joined #senlin		21:03
*** guoshan has quit IRC		21:07
*** shu-mutou-AWAY has quit IRC		21:41
*** guoshan has joined #senlin		22:03
*** guoshan has quit IRC		22:08
*** guoshan has joined #senlin		23:04
*** guoshan has quit IRC		23:09
*** openstack has joined #senlin		23:47
