Thursday, 2017-08-24

*** hidekazu has joined #openstack-watcher		00:32
*** thorst_afk has joined #openstack-watcher		00:45
*** thorst_afk has quit IRC		00:49
*** yuanying_ has joined #openstack-watcher		00:58
*** yuanying has quit IRC		01:02
*** yuanying has joined #openstack-watcher		01:02
*** yuanying_ has quit IRC		01:02
*** thorst_afk has joined #openstack-watcher		01:23
*** thorst_afk has quit IRC		01:46
*** yuanying has quit IRC		01:50
*** zhurong has joined #openstack-watcher		01:54
*** yuanying has joined #openstack-watcher		02:01
*** thorst_afk has joined #openstack-watcher		02:19
*** thorst_afk has quit IRC		02:19
*** thorst_afk has joined #openstack-watcher		03:20
*** thorst_afk has quit IRC		03:25
*** nicolasbock has quit IRC		03:35
*** thorst_afk has joined #openstack-watcher		04:21
*** thorst_afk has quit IRC		04:26
*** zhurong has quit IRC		04:58
*** zhurong has joined #openstack-watcher		05:06
*** thorst_afk has joined #openstack-watcher		05:22
*** thorst_afk has quit IRC		05:26
*** thorst_afk has joined #openstack-watcher		06:23
*** thorst_afk has quit IRC		06:27
*** thorst_afk has joined #openstack-watcher		07:24
*** thorst_afk has quit IRC		07:28
*** alexchadin has joined #openstack-watcher		07:31
*** vincentfrancoise has joined #openstack-watcher		07:35
*** thorst_afk has joined #openstack-watcher		08:25
*** thorst_afk has quit IRC		08:29
hidekazu	many bug fixes were cherry-picked to stable/pike.	08:33
*** alexchadin has quit IRC		08:53
*** alexchadin has joined #openstack-watcher		08:54
alexchadin	aspiers: hi	08:55
aspiers	hi alexchadin	08:58
aspiers	I'll be free to talk in about 30 mins, is that OK?	08:58
alexchadin	aspiers: yeap	08:58
aspiers	cool	08:59
openstackgerrit	Hidekazu Nakamura proposed openstack/watcher-specs master: Add cdm-scoping spec https://review.openstack.org/496092	09:19
*** hidekazu has left #openstack-watcher		09:21
*** thorst_afk has joined #openstack-watcher		09:25
*** suzhengwei has joined #openstack-watcher		09:28
*** thorst_afk has quit IRC		09:30
*** vincentfrancoise has quit IRC		09:43
*** vincentfrancoise has joined #openstack-watcher		09:45
aspiers	alexchadin: ok	09:45
*** alexchadin has quit IRC		09:49
*** alexchadin has joined #openstack-watcher		09:54
alexchadin	aspiers: ping	09:54
aspiers	hi	09:54
alexchadin	aspiers: back to your proposal, what strategy do you want to suggest?	09:55
aspiers	so I figured out how to do VM consolidation via linear programming	09:56
aspiers	it's actually pretty easy	09:56
aspiers	the only problem is the algorithmic complexity, which would probably require partitioning the cloud up into chunks (e.g. max 500 servers per chunk)	09:56
aspiers	but I don't think this would be an issue	09:56
alexchadin	you speak about vms or nodes (500 servers)?	09:57
aspiers	compute hosts	09:57
alexchadin	oh, ok	09:57
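
[Editor's note] The partitioning aspiers mentions (at most 500 compute hosts per chunk) can be illustrated with a minimal sketch. The function name `partition_hosts`, the host naming scheme, and the choice of simple consecutive slicing are all assumptions made for illustration; the log does not say how the chunks would actually be chosen.

```python
from typing import List, Sequence


def partition_hosts(hosts: Sequence[str], max_chunk_size: int = 500) -> List[List[str]]:
    """Split compute hosts into consecutive chunks of at most max_chunk_size.

    Hypothetical helper: a real strategy would probably group hosts by
    something meaningful (aggregate, availability zone) rather than by order.
    """
    if max_chunk_size < 1:
        raise ValueError("max_chunk_size must be at least 1")
    return [
        list(hosts[start:start + max_chunk_size])
        for start in range(0, len(hosts), max_chunk_size)
    ]


# Hypothetical cloud with 1200 compute hosts.
hosts = [f"compute-{n:04d}" for n in range(1200)]
chunks = partition_hosts(hosts)
print([len(chunk) for chunk in chunks])  # [500, 500, 200]
```

Each chunk could then be optimized independently, which bounds the size of any single problem at the cost of never consolidating across chunk boundaries.
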
aspiers	and then once the optimal placement is calculated, we can use the software I already wrote for ordering the migration rearrangements	09:57
aspiers	https://blueprints.launchpad.net/watcher/+spec/vm-migration-ordering	09:57
aspiers	and IIUC we also now have a power-off strategy, right?	09:58
aspiers	so they could all be combined to minimise the amount of energy spent on running compute hosts	09:58
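
[Editor's note] aspiers does not give his formulation in this log. As a hedged illustration only, VM consolidation is commonly written as a bin-packing style integer linear program; all symbols below are assumptions introduced here. Let $x_{ij} = 1$ if VM $i$ is placed on host $j$, $y_j = 1$ if host $j$ stays powered on, $r_i^k$ be the demand of VM $i$ for resource $k$ (CPU, RAM, ...), and $C_j^k$ be the capacity of host $j$ for that resource.

```latex
\begin{aligned}
\min_{x,\,y} \quad & \sum_{j} y_j
  && \text{(number of powered-on hosts)} \\
\text{s.t.} \quad & \sum_{j} x_{ij} = 1
  && \forall i \quad \text{(each VM runs on exactly one host)} \\
& \sum_{i} r_i^k \, x_{ij} \le C_j^k \, y_j
  && \forall j,\ \forall k \quad \text{(capacity; powered-off hosts hold no VMs)} \\
& x_{ij} \in \{0, 1\}, \quad y_j \in \{0, 1\}
\end{aligned}
```

Hosts with $y_j = 0$ in the solution are the candidates for power-off, and the difference between the current placement and the solved $x_{ij}$ gives the set of migrations to be ordered.
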
aspiers	is there already a VM consolidation strategy?	09:58
alexchadin	aspiers: some of them are presented already	09:59
alexchadin	let me see	09:59
alexchadin	aspiers: there is one proof of concept: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/workload_balance.py	10:00
alexchadin	oh, it's about balancing, not consolidation	10:01
aspiers	yeah, that's the opposite	10:01
alexchadin	aspiers: consolidation one: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/vm_workload_consolidation.py	10:01
aspiers	OK, looking	10:01
alexchadin	aspiers: we also have a basic consolidation strategy, but it's more about proving that Watcher works	10:02
aspiers	which one is that?	10:02
alexchadin	https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/basic_consolidation.py	10:02
alexchadin	aspiers: I'd like to ask you to create an appropriate blueprint pointing out the linear programming methods you plan to use in your strategy	10:05
aspiers	ok cool	10:05
aspiers	sure	10:05
aspiers	is it already possible to combine the existing consolidation strategies with the power off strategy?	10:06
alexchadin	to be honest, we haven't planned to combine strategies, as they are independent by nature	10:07
alexchadin	aspiers: instead, you are free to use any actions Watcher provides	10:07
aspiers	oh I see, so the solution would include both migrations and power off?	10:08
openstackgerrit	Merged openstack/watcher master: Remove watcher_tempest_plugin https://review.openstack.org/494472	10:09
alexchadin	aspiers: yes	10:09
openstackgerrit	Merged openstack/watcher master: Updated from global requirements https://review.openstack.org/497089	10:09
alexchadin	aspiers: you may use any of these methods: https://github.com/openstack/watcher/tree/master/watcher/applier/actions	10:09
alexchadin	actions, excuse me	10:10
aspiers	cool, that's really helpful thanks!	10:10
alexchadin	a list of these actions is the result of your perfect super best consolidation strategy ;D	10:10
aspiers	so the solution will provide an ordering of the actions, but then the planner can reorder them?	10:11
aspiers	I think any solutions generated in this way would have a very specific order which could not be easily changed	10:11
alexchadin	aspiers: yes, we have two planners to use: https://github.com/openstack/watcher/tree/master/watcher/decision_engine/planner	10:11
alexchadin	aspiers: unfortunately I have to go and take care of some things; we have a pluggable planner mechanism, so you may provide your own if you want	10:13
aspiers	ok	10:15
aspiers	thanks	10:15
aspiers	alexchadin: I have other things to discuss too	10:15
alexchadin	aspiers: I'll ping you once I get back	10:15
aspiers	ok great!	10:15
*** alexchadin has quit IRC		10:16
*** vincentfrancoise has quit IRC		10:22
*** thorst_afk has joined #openstack-watcher		10:26
*** thorst_afk has quit IRC		10:31
*** nicolasbock has joined #openstack-watcher		10:33
*** dpawlik_ is now known as dpawlik		10:51
*** nicolasbock has quit IRC		10:54
*** nicolasbock has joined #openstack-watcher		11:08
*** thorst_afk has joined #openstack-watcher		11:27
*** alexchadin has joined #openstack-watcher		11:31
*** thorst_afk has quit IRC		11:32
*** alexchadin has quit IRC		11:36
*** alexchadin has joined #openstack-watcher		11:37
*** vincentfrancoise has joined #openstack-watcher		12:15
*** thorst_afk has joined #openstack-watcher		12:16
*** Yumeng has quit IRC		12:21
openstackgerrit	Merged openstack/watcher master: Fix KeyError exception https://review.openstack.org/494145	12:40
*** vincentfrancoise has quit IRC		12:43
*** vincentfrancoise has joined #openstack-watcher		12:44
*** alexchadin has quit IRC		12:52
*** alexchadin has joined #openstack-watcher		12:53
alexchadin	aspiers: I'm back	13:01
sballe_	morning	13:09
alexchadin	sballe_: hi	13:11
*** vincentfrancoise has quit IRC		13:13
*** vincentfrancoise has joined #openstack-watcher		13:14
aspiers	alexchadin:	13:15
aspiers	alexchadin: hi	13:15
alexchadin	aspiers: pong	13:15
aspiers	alexchadin: so one question I had is around terminology	13:16
aspiers	https://docs.openstack.org/watcher/latest/glossary.html#cluster-definition	13:16
aspiers	I don't understand this definition	13:16
aspiers	it doesn't seem to relate to any other concept for grouping machines in OpenStack	13:16
aspiers	and it also conflicts with other definitions of "cluster" in OpenStack	13:16
alexchadin	hm	13:17
aspiers	what does "managed by the same controller node" mean?	13:17
alexchadin	typical grouping in OpenStack is done by Availability Zones, Host Aggregates and Cells, right?	13:18
aspiers	and regions	13:18
alexchadin	yes, I forgot about those	13:18
alexchadin	what other definitions of cluster have you found?	13:19
aspiers	the control plane typically contains HA clusters	13:20
aspiers	and also Senlin manages clusters	13:20
aspiers	also the control plane does not typically run on a single controller	13:20
aspiers	it gets split across multiple nodes, sometimes even multiple clusters	13:20
aspiers	so it does not really make sense to say that a machine is managed by one controller node	13:21
aspiers	so I'm trying to understand what that definition really means	13:21
aspiers	is Watcher tracking its own grouping of compute nodes?	13:21
aspiers	where is a Watcher "cluster" defined?	13:22
alexchadin	basically, we tracked all VMs and Compute Nodes we could find	13:22
aspiers	so that is the whole compute plane of the cloud?	13:23
alexchadin	then we implemented Audit Scope, which allows us to restrict the scope of resources that are used while a strategy runs	13:23
aspiers	OK but where is a Watcher "cluster" defined? in the Audit Scope?	13:24
alexchadin	aspiers: I haven't come across it in the code :)	13:24
alexchadin	We have the Cluster Data Model	13:24
aspiers	yes, I saw that	13:25
aspiers	I am trying to understand what a Watcher cluster really is	13:25
alexchadin	It's a graph with nodes and related VMs	13:25
aspiers	is it a strict subset of the compute plane, or just the whole compute plane?	13:25
alexchadin	aspiers: to give you the answer I need to know what the compute plane is	13:26
aspiers	the compute plane is all the compute hosts in the cloud	13:26
alexchadin	so I'll read this article now	13:26
aspiers	some people might also include nova-{scheduler,conductor} etc. in the compute plane, but IMHO they are part of the control plane instead	13:27
alexchadin	aspiers: Nova Cluster Data Model collects all compute nodes and their VMs in the background	13:29
alexchadin	Cinder Cluster Data Model does the same thing for Volume Nodes	13:30
aspiers	alexchadin: I guess you mean https://github.com/openstack/watcher/tree/master/watcher/decision_engine/model/collector	13:35
aspiers	alexchadin: yeah so it seems that it really refers to all compute hosts and instances, i.e. the whole compute plane	13:36
alexchadin	Exactly	13:37
aspiers	in which case I think "cluster" is really not a good term to use	13:37
alexchadin	aspiers: if you want to rephrase the definition, feel free to submit the patch	13:37
aspiers	alexchadin: OK, I'll have a think about the best way to deal with it. Thanks!	13:37
alexchadin	aspiers: thanks for your help!	13:38
aspiers	alexchadin: welcome :) I had one more question	13:38
aspiers	alexchadin: are there plans to extend the scope of Watcher beyond optimization?	13:38
alexchadin	aspiers: We didn't plan to do it. We have some thoughts about including more things to optimize (e.g. containers), but there haven't been discussions about extending the scope	13:40
alexchadin	aspiers: We have a pretty straightforward goal, why should we extend the scope?	13:41
aspiers	alexchadin: I don't think you should :)	13:41
aspiers	and even if you optimize containers, that is still optimization, so it is still inside the current scope of Watcher's mission statement	13:42
aspiers	I am asking because there have been one or two blueprints submitted recently which extend the scope outside optimization	13:42
*** vincentfrancoise has quit IRC		13:42
alexchadin	aspiers: which ones?	13:43
aspiers	into also handling failures	13:43
aspiers	https://blueprints.launchpad.net/watcher/+spec/workload-evacuate-strategy	13:43
*** vincentfrancoise has joined #openstack-watcher		13:44
aspiers	if Watcher starts to tackle auto-healing of failures then it is no longer just optimization, but also high availability	13:44
aspiers	and then it starts to overlap with other OpenStack projects which already exist	13:44
aspiers	this particular spec is targeting the exact same failure scenario which a big project is already addressing	13:45
aspiers	so I don't understand why it is being proposed	13:46
aspiers	I asked on https://review.openstack.org/#/c/495168/2/specs/queens/approved/workload-evacuate-strategy.rst for an explanation but I didn't get one yet	13:47
alexchadin	I haven't approved it yet, it sounds fair enough	13:47
alexchadin	hm	13:48
alexchadin	Watcher was created to reduce the total cost of ownership by using optimization models with different goals	13:49
aspiers	yes, that makes sense to me	13:51
aspiers	and by definition optimization makes a cloud which is already working, better.	13:51
aspiers	fixing a broken cloud is not optimization	13:51
aspiers	I suppose some people might want to argue with the last point	13:52
alexchadin	it's more like a Python script triggered by something like cron than an optimization algorithm	13:52
aspiers	the best definition I could find was https://en.wikipedia.org/wiki/Program_optimization	13:53
aspiers	and that article does not mention anything about optimization fixing failures	13:53
aspiers	it also does not mention HA	13:53
aspiers	but even if hypothetically optimization did include fixing failures, it would still not make sense for Watcher to duplicate the work of existing OpenStack projects dedicated to HA	13:54
alexchadin	aspiers: I think it's time to call suzhengwei :)	13:54
aspiers	sure	13:54
aspiers	the question is simple: if you want a compute HA solution, why not just use Masakari? it is a well-established OpenStack project, which the OpenStack HA community has agreed is the best solution	13:55
aspiers	instead of reinventing the wheel, it would be more effective to collaborate on the existing solution	13:55
alexchadin	I agree	13:56
alexchadin	aspiers: well, it's more about fixing the post-failure state of a cloud than optimising a live one, right?	13:57
aspiers	alexchadin: masakari is about fixing post-failure state, yes. It does not do any optimisation	13:58
aspiers	alexchadin: so currently there is a nice clean divide between the responsibilities of masakari and watcher	13:58
alexchadin	aspiers: I was speaking about proposed strategy	13:58
aspiers	oh yes, the proposed strategy changes that divide so that it is no longer clear	13:58
aspiers	with that strategy, Watcher would be not just handling optimisation, but also post-failure states, so it would overlap with masakari	13:59
openstackgerrit	OpenStack Release Bot proposed openstack/puppet-watcher master: Update reno for stable/pike https://review.openstack.org/497391	14:02
aspiers	alexchadin: another problem with this proposal is that it is impossible to do safe compute HA without fencing, and Watcher has no fencing mechanism	14:04
aspiers	alexchadin: Masakari uses Pacemaker for fencing in order to ensure that VM resurrection via nova-evacuate is safe	14:05
aspiers	alexchadin: so in order to implement this strategy in Watcher, you would need to add a dependency from Watcher on Pacemaker or some other cluster manager which implements quorum/voting/consensus/fencing	14:06
alexchadin	aspiers: I've left my questions here: https://review.openstack.org/#/c/495168/2	14:10
aspiers	alexchadin: OK thanks	14:10
aspiers	I explained the need for fencing in Austin https://youtu.be/lddtWUP_IKQ?t=6m07s	14:10
alexchadin	aspiers: thanks for the link	14:12
aspiers	welcome	14:12
*** alexchadin has quit IRC		14:46
*** ianychoi has joined #openstack-watcher		16:00
*** openstackgerrit has quit IRC		16:04
*** vincentfrancoise has quit IRC		16:13
*** vincentfrancoise has joined #openstack-watcher		16:14
*** efoley has joined #openstack-watcher		16:44
*** efoley has quit IRC		16:54
*** vincentfrancoise has quit IRC		16:57
*** openstackgerrit has joined #openstack-watcher		19:08
openstackgerrit	Alex Schultz proposed openstack/puppet-watcher master: Update versions for Queens cycle https://review.openstack.org/497585	19:08
*** thorst_afk has quit IRC		20:16
*** thorst_afk has joined #openstack-watcher		20:19
*** thorst_afk has quit IRC		20:24
*** thorst_afk has joined #openstack-watcher		20:36
*** yuanying_ has joined #openstack-watcher		23:59