openstackgerrit | Digambar proposed openstack/watcher: Fix field type to audit_type https://review.openstack.org/336390 | 00:38 |
*** Kevin_Zheng has joined #openstack-watcher | 03:34 | |
*** diga has joined #openstack-watcher | 04:16 | |
openstackgerrit | licanwei proposed openstack/watcher: Fix link error in base-setup.rst https://review.openstack.org/337456 | 06:40 |
openstackgerrit | Jinquan Ni proposed openstack/watcher: Remove duplicate unittest https://review.openstack.org/337462 | 06:50 |
*** vincentfrancoise has joined #openstack-watcher | 07:08 | |
*** danpawlik has joined #openstack-watcher | 07:12 | |
diga | vincentfrancoise: Hi - https://review.openstack.org/#/c/336390/ | 07:18 |
diga | Please review above patch | 07:18 |
*** dtardivel has joined #openstack-watcher | 08:03 | |
diga | vincentfrancoise: acabot : Can someone review my patch - https://review.openstack.org/#/c/336390/ | 08:22 |
vincentfrancoise | diga: hi | 08:29 |
vincentfrancoise | FYI I'll probably be doing some reviews this afternoon | 08:30 |
diga | vincentfrancoise: sure | 09:07 |
*** diga has quit IRC | 09:31 | |
dtardivel | diga: review done | 09:39 |
tkaczynski | hi guys. can I ask a few questions about how watcher works internally (communication between watcher modules)? | 10:31 |
tkaczynski | not sure if anybody is around :) | 10:33 |
danpawlik | tkaczynski: I guess not :) | 10:37 |
tkaczynski | I'll wait then, I need some interaction as my questions might explode ;) | 10:38 |
*** vincentfrancoise has quit IRC | 11:17 | |
*** acabot has quit IRC | 11:17 | |
*** jed56 has joined #openstack-watcher | 11:33 | |
*** vincentfrancoise has joined #openstack-watcher | 11:39 | |
vincentfrancoise | hi tkaczynski | 11:39 |
openstackgerrit | Merged openstack/python-watcherclient: Add license file https://review.openstack.org/337347 | 11:39 |
tkaczynski | hi vincentfrancoise | 11:39 |
vincentfrancoise | did your question from earlier explode already? | 11:39 |
tkaczynski | so here's what I want to do: I 'm creating a scoring module for watcher now and I want decision engine to communicate with it. specifically, from some strategy I want to simply call some scoring engine and get the response from it | 11:41 |
tkaczynski | no, the earlier thing is quiet ;) | 11:41 |
vincentfrancoise | but if you want to communicate with it, that means it will be tightly coupled to whichever scoring engine one is using, isn't it? | 11:42 |
tkaczynski | so I looked at how communication between modules is already implemented, and I see that it's using messaging, even though the clients are called "rpcapi" ;) | 11:43 |
tkaczynski | yes. if a strategy wants to use some scoring engine, it will be tightly coupled to it from that point | 11:43 |
vincentfrancoise | at the moment the whole messaging system is not really tidy | 11:43 |
vincentfrancoise | in watcher I mean | 11:44 |
tkaczynski | but I don't think it will work in my case because I need really a synchronous call: send data to scoring engine and get the response | 11:44 |
tkaczynski | whereas the messaging is really async: the job is submitted and some time later a notification comes | 11:45 |
tkaczynski | so implementing the scoring module as, say, the applier is probably not a good idea. do you agree? | 11:46 |
tkaczynski | jed56: I would appreciate your opinion as well | 11:48 |
vincentfrancoise | Wouldn't there be any other way of communicating with it? | 11:48 |
tkaczynski | well, that's the next question :) | 11:49 |
jed56 | Hello | 11:49 |
tkaczynski | it seems that the reasonable choice might be a rest service, similar to the API | 11:50 |
vincentfrancoise | tkaczynski: I'm digging a bit | 11:50 |
jed56 | you can do a sync call | 11:50 |
tkaczynski | sync call using the existing messaging infrastructure? | 11:50 |
jed56 | cast - the method is invoked asynchronously and no result is returned to the caller | 11:51 |
jed56 | call - the method is invoked synchronously and a result is returned to the caller | 11:51 |
jed56 | yes with oslo messaging | 11:51 |
jed56 | However, if we have a huge volume of data it is not a good idea to use oslo messaging | 11:52 |
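The cast/call distinction jed56 describes can be illustrated with a toy sketch. This is not oslo.messaging itself — `ToyRpcServer` and its worker loop are invented stand-ins, with a plain queue playing the role of the broker — but the semantics match: `call` blocks for a result, `cast` is fire-and-forget.

```python
import queue
import threading

class ToyRpcServer:
    """Toy stand-in for an RPC server: one worker thread consuming
    requests from a queue (the queue plays the role of the broker)."""

    def __init__(self):
        self._requests = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            method, args, reply_q = self._requests.get()
            result = method(*args)
            if reply_q is not None:   # 'call': the caller is waiting
                reply_q.put(result)

    def call(self, method, *args):
        """Synchronous: block until the result comes back."""
        reply_q = queue.Queue()
        self._requests.put((method, args, reply_q))
        return reply_q.get(timeout=5)

    def cast(self, method, *args):
        """Asynchronous: fire and forget, no result is returned."""
        self._requests.put((method, args, None))

server = ToyRpcServer()
score = server.call(lambda x: x * 2, 21)  # blocks until the worker replies
print(score)  # 42
```

With real oslo.messaging, the strategy would use an RPC client's `call()` for exactly this blocking request/response pattern, and the broker — not application code — handles concurrent callers.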
tkaczynski | jed56: are you talking about this? | 11:53 |
tkaczynski | https://www.irccloud.com/pastebin/8zRNX9CX/ | 11:53 |
tkaczynski | this is from applier/rpcapi.py | 11:53 |
jed56 | tkaczynski: yes for example | 11:53 |
tkaczynski | so the oslo messaging will take care of everything? even if I have multiple events being sent at the same time? | 11:55 |
jed56 | tkaczynski: yes | 11:56 |
jed56 | you have the notion of context | 11:56 |
tkaczynski | context is kind of another problem - from what I see it comes from pecan (so from the API) | 11:57 |
tkaczynski | strategies have no context inside, so I have nothing to put when calling the scoring engine RPC client | 11:58 |
jed56 | https://github.com/openstack/watcher/blob/master/watcher/applier/messaging/trigger.py#L45 | 11:58 |
jed56 | by context, I mean the context of your call (action_plan_uuid ) | 11:59 |
jed56 | with the parameters | 11:59 |
jed56 | I'm looking for an example for you | 11:59 |
jed56 | https://gist.github.com/begonia/992033fa3f523d8ae596 | 12:00 |
tkaczynski | yes, but with the scoring engine there might be many strategies calling the same scoring engine (same id), and I still want to route the response to the right caller | 12:00 |
jed56 | one call, one response, one context | 12:01 |
jed56 | you don't have to manage the concurrency | 12:01 |
jed56 | this is managed by the broker | 12:01 |
tkaczynski | ok, I can try that. this also means that there is no point of trying to use multithreading in scoring module | 12:03 |
tkaczynski | because then it basically becomes asynchronous | 12:03 |
jed56 | true | 12:03 |
tkaczynski | going forward - do you think it makes sense to have such a module at all (meaning: as a separate service)? | 12:04 |
jed56 | IMHO, it should be a separate service | 12:04 |
tkaczynski | for me it makes perfect sense to introduce the scoring engine concept within the decision engine service; strategies might and will consume it. but I don't see much benefit in having it as a separate service. is it for performance? using the fact that it will run in a separate process, so potentially using another CPU core? | 12:06 |
tkaczynski | I don't have the real data yet, but it's possible that handling all the communication might actually take more CPU cycles than executing the calculate_score on the scoring engine. it kind of defeats the purpose of having it externally | 12:08 |
vincentfrancoise | tkaczynski: feels like this was already debated, although I can't remember what we concluded last time | 12:14 |
tkaczynski | we concluded that we want to have a separate service :) | 12:15 |
jed56 | :p | 12:15 |
jed56 | I'm searching the logs of the debate :p | 12:15 |
tkaczynski | I'm just asking this question one more time as I am closer to implementation now, have more data and can still revert :) and since we can't even use threads, I asked myself: what is it for? | 12:16 |
tkaczynski | and don't get me wrong: I already implemented this service, have it up and running now :) | 12:16 |
tkaczynski | just not communicating ;) | 12:17 |
jed56 | lol | 12:17 |
*** openstackgerrit has quit IRC | 12:19 | |
*** openstackgerrit has joined #openstack-watcher | 12:19 | |
tkaczynski | so it's not that I'm lazy and don't want to do things. I'm just double-checking that we are doing the right thing. but since jed56 helped me with the communication problem (I'm able to make sync calls), I don't need to re-implement it as a rest service or something | 12:19 |
jed56 | tkaczynski: if we are wrong we change that :) | 12:19 |
jed56 | this is an incremental process | 12:20 |
tkaczynski | sure, but there is no guarantee that I will be able to do that once I finish implementation and move to other tasks :) just saying | 12:21 |
tkaczynski | so better get it right now and not delay it for future | 12:22 |
jed56 | I understand, IMHO this question is very dependent on the scope of the scoring engine | 12:23 |
jed56 | do we want the scoring engine to be in charge of prefetching the metrics? | 12:24 |
jed56 | in this case, IMHO a service is better | 12:24 |
jed56 | moreover, we could avoid restarting the decision engine for each new prediction | 12:26 |
jed56 | I agree, for the first version of the scoring engine there is not a huge value in it being a service | 12:27 |
jed56 | the good news is that it is easier to remove a service than to create one | 12:28 |
jed56 | tkaczynski: are you going to portland ? | 12:28 |
tkaczynski | jed56: I'm not sure where the question about "prefetching the metrics" is coming from. the scoring engine will be pretty simple, and it should be, by design. | 12:29 |
jed56 | Querying the input data from any source (ceilometer, monasca, kafka, rabbitmq, etc.) can take a lot of time | 12:30 |
tkaczynski | it's not fetching any metrics by itself. all this work is done in strategy - the strategy needs to understand the input data and manage it. scoring engine is a representation of the machine learning algorithms or frameworks. It should be treated as a mathematical function, nothing else! | 12:31 |
tkaczynski | I fully understand that querying the data might take time, but it is NOT a responsibility of the scoring module | 12:32 |
tkaczynski | on the high level this will be the integration with scoring engine from strategy code: | 12:33 |
tkaczynski | scoring_engine = scoring_engine_factory.get_scoring_engine('model-1') | 12:34 |
tkaczynski | result = scoring_engine.calculate_score('[1, 2, 3, 4, 5, 6]') | 12:34 |
tkaczynski | ... do something with the result (which might be some prediction or something) | 12:35 |
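The strategy-side contract tkaczynski sketches above could look roughly like the following. This is a hypothetical illustration, not Watcher's actual plugin API: `LinearScoringEngine`, `ScoringEngineFactory` and the model weights are invented; only the `get_scoring_engine('model-1')` / `calculate_score('[1, 2, 3, 4, 5, 6]')` shape comes from the discussion.

```python
import json

class LinearScoringEngine:
    """Toy scoring engine: a pre-trained linear model treated as a
    pure mathematical function, as described in the discussion."""

    def __init__(self, weights):
        self._weights = weights

    def calculate_score(self, features_json):
        # Input and output are plain strings, so the engine stays a
        # stateless function that is cheap to expose over RPC.
        features = json.loads(features_json)
        score = sum(w * f for w, f in zip(self._weights, features))
        return json.dumps(score)

class ScoringEngineFactory:
    """Registry mapping engine names to pre-built engine instances."""

    def __init__(self):
        self._engines = {}

    def register(self, name, engine):
        self._engines[name] = engine

    def get_scoring_engine(self, name):
        return self._engines[name]

scoring_engine_factory = ScoringEngineFactory()
scoring_engine_factory.register(
    'model-1', LinearScoringEngine([1, 0, 0, 0, 0, 1]))

engine = scoring_engine_factory.get_scoring_engine('model-1')
result = engine.calculate_score('[1, 2, 3, 4, 5, 6]')
print(result)  # 7  (1*1 + 1*6)
```

The string-in/string-out signature is what keeps the engine a "mathematical function, nothing else": no metric fetching, no cluster state, just a deterministic mapping from features to a score.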
jed56 | yes, but this is not possible to do in the strategy, for scalability | 12:35 |
tkaczynski | what is not possible? | 12:35 |
jed56 | sorry :) | 12:36 |
jed56 | let's try again :p | 12:37 |
tkaczynski | I am all listening :) | 12:37 |
tkaczynski | jed56: I won't be in Portland :( | 12:38 |
jed56 | okay | 12:39 |
tkaczynski | jed56: but Susanne will organize the video conference, so I will kind of participate ;) | 12:39 |
jed56 | cool | 12:39 |
jed56 | I think we agree that it is not a good idea to query the measurements in the strategies and then give them to the scoring engine. | 12:39 |
jed56 | so, who is in charge of that ? | 12:39 |
tkaczynski | well, I agree that querying the measurements is a complex task. but the role of the scoring module is to integrate machine learning with Watcher. single responsibility principle | 12:41 |
tkaczynski | I don't have a good answer at the moment how the problem with querying the data should be solved, but this problem is valid for all strategies, whether they use scoring engines or not | 12:42 |
jed56 | we are not doing complex calculations with the metrics in the strategy; moreover, we will soon have the cluster-model-objects-wrapper blueprint implemented | 12:43 |
tkaczynski | but cluster model is not covering the usage metrics, right? | 12:45 |
jed56 | tkaczynski: yes | 12:45 |
tkaczynski | for me it's a design choice: do we want each strategy requiring the access to the usage metrics to deal with this problem independently or do we want to solve it "globally" on the watcher level, so the code is shared | 12:47 |
tkaczynski | ... and possibly introduce some framework, which can be used by all strategies | 12:48 |
jed56 | IMHO, if we give the measurements to the scoring engine - for example 1 week of data - it will require time to be computed by the engine | 12:49 |
tkaczynski | are you saying that you would like the strategy to react (make optimizations) based on the data which is a week old? | 12:50 |
tkaczynski | the typical scenario for optimization is that the data is analysed online (within seconds or minutes) and then actions are taken, because the cloud needs optimization | 12:52 |
tkaczynski | the past data might be useful for training the machine learning models of course, but not much for online analysis | 12:52 |
jed56 | we would like the strategy to use a prediction which is based on 1 week of data | 12:52 |
*** jed56 has quit IRC | 12:55 | |
tkaczynski | I think I know where you are going: you are mixing training the model with making actual predictions. in machine learning, training the model happens "offline". data scientists collect the data (possibly from a week, a month or even years), they train the model using this data, and then the model becomes a scoring engine and starts doing the analysis of the | 12:56 |
tkaczynski | online data | 12:56 |
tkaczynski | what watcher really needs if it wants to be a serious player in openstack optimization: a workflow for collecting and storing the data for future analysis | 12:57 |
*** jed56 has joined #openstack-watcher | 12:57 | |
tkaczynski | and a separate workflow for online data, which will allow for cloud monitoring and optimization | 12:57 |
jed56 | sorry, I was offline | 12:58 |
tkaczynski | both of these workflows are needed on the watcher level. scoring module needs both of them - one for learning, one for analysing. but it's not managing these workflows in any way | 12:59 |
jed56 | +1 | 12:59 |
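The two-workflow split tkaczynski describes — train offline on historical data, score online within seconds — can be sketched in a few lines. A mean-based threshold stands in here for a real ML framework, and both function names are invented for illustration.

```python
# Offline workflow: train on historical data (e.g. a week of CPU samples).
def train_model(historical_samples):
    """Learn a threshold from past data; done offline, possibly by
    data scientists working from an exported data set."""
    mean = sum(historical_samples) / len(historical_samples)
    return {'threshold': mean * 1.5}   # flag anything 50% above the mean

# Online workflow: the trained model becomes a cheap, fast scoring function.
def calculate_score(model, live_sample):
    """Analyse a single live measurement in milliseconds, not weeks."""
    return 1 if live_sample > model['threshold'] else 0

model = train_model([20, 30, 25, 35, 30])   # historical data -> model
print(calculate_score(model, 80))  # 1: hot spot detected
print(calculate_score(model, 30))  # 0: normal load
```

The expensive part (collecting and crunching a week of measurements) lives entirely in the offline workflow; the online path only evaluates the already-trained model, which is why the scoring engine itself never needs to prefetch metrics.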
*** alexchadin has joined #openstack-watcher | 13:01 | |
tkaczynski | my recommendation would be: the watcher core members should look into this problem and prioritize it. the strategies will need that sooner rather than later. but honestly I don't think that scoring module is part of that problem. it's a consumer and only in the context of the strategy, because the strategy is the ultimate place where everything starts and | 13:03 |
tkaczynski | ends. strategy developer must understand what she/he wants to do, what data is needed and what algorithms are required to perform the optimization | 13:03 |
tkaczynski | scoring module is a helper module which brings machine learning to the table, so the strategy has one (big on its own) problem less to solve | 13:04 |
jed56 | in this case the scoring engine doesn't need to be a service :) | 13:05 |
tkaczynski | I guess a future developer of a scoring engine will come and ask: where is the historical usage data for this data center? watcher should have a way to provide that (possibly offline, even on a CD!), then the developer can create a great model solving all the problems in the world, predicting the next lotto numbers or whatever, and once this is done it | 13:07 |
tkaczynski | can be delivered as a scoring engine plugin, at which point some strategy might start to use that :) | 13:07 |
tkaczynski | and it's quite likely that the online data will be needed as for example the scoring engine might want to detect hot spots in the cloud by analysing the cpu, memory or whatever usage | 13:08 |
jed56 | tkaczynski: IMHO, you could add a point for portland on that subject | 13:09 |
tkaczynski | all I'm trying to say is that this is much bigger problem than the scoring engine itself and needs to be solved! | 13:09 |
jed56 | current scoring engine and beyond | 13:09 |
tkaczynski | ok, I will | 13:09 |
jed56 | thanks | 13:09 |
tkaczynski | so now coming back to original question: did we just agree that scoring module doesn't need to be a service? ;) | 13:10 |
jed56 | +2 :) | 13:11 |
tkaczynski | sounds great, I can throw away my last commit then, only like 30 files ;) | 13:15 |
jed56 | sorry | 13:17 |
tkaczynski | no worries, better this than polluting code base with unnecessary services and abstractions | 13:20 |
tkaczynski | jed56 vincentfrancoise: thanks for your input, really appreciated! | 13:21 |
*** esberglu has joined #openstack-watcher | 13:23 | |
*** diga has joined #openstack-watcher | 14:05 | |
*** diga has quit IRC | 14:31 | |
openstackgerrit | Merged openstack/puppet-watcher: Added watcher package ensure https://review.openstack.org/335072 | 14:40 |
*** alexchadin has quit IRC | 14:55 | |
*** alexchadin has joined #openstack-watcher | 15:00 | |
*** alexchad_ has joined #openstack-watcher | 15:04 | |
*** alexchadin has quit IRC | 15:04 | |
*** vincentfrancoise has quit IRC | 15:04 | |
*** danpawlik has quit IRC | 15:07 | |
*** openstackgerrit has quit IRC | 15:33 | |
*** openstackgerrit has joined #openstack-watcher | 15:33 | |
*** jed56 has quit IRC | 15:35 | |
*** harlowja has joined #openstack-watcher | 15:40 | |
*** alexchad_ has quit IRC | 15:50 | |
openstackgerrit | Daniel Pawlik proposed openstack/puppet-watcher: Implement applier and decision-engine https://review.openstack.org/336935 | 17:37 |
*** wootehfoot has joined #openstack-watcher | 17:40 | |
openstackgerrit | Daniel Pawlik proposed openstack/puppet-watcher: Implement watcher-db-manage commands https://review.openstack.org/336939 | 17:50 |
*** thorst_ has joined #openstack-watcher | 18:41 | |
*** thorst_ has quit IRC | 18:41 | |
*** thorst_ has joined #openstack-watcher | 18:42 | |
openstackgerrit | Daniel Pawlik proposed openstack/puppet-watcher: Implement watcher-db-manage commands https://review.openstack.org/336939 | 18:43 |
openstackgerrit | Daniel Pawlik proposed openstack/puppet-watcher: Implement applier and decision-engine https://review.openstack.org/336935 | 19:02 |
*** dtardivel has quit IRC | 19:27 | |
*** Zucan has joined #openstack-watcher | 19:42 | |
*** Zucan has quit IRC | 19:47 | |
*** Zucan has joined #openstack-watcher | 19:52 | |
*** Zucan has quit IRC | 20:23 | |
*** wootehfoot has quit IRC | 20:39 | |
*** edleafe- is now known as edleafe | 21:36 | |
openstackgerrit | Tin Lam proposed openstack/watcher: Update docs links to docs.openstack.org https://review.openstack.org/333723 | 22:28 |
*** thorst_ has quit IRC | 22:47 | |
*** thorst has joined #openstack-watcher | 22:48 | |
*** thorst has quit IRC | 22:57 | |
*** thorst has joined #openstack-watcher | 23:54 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!