*** hidekazu has joined #openstack-watcher | 00:32 | |
*** thorst_afk has joined #openstack-watcher | 00:45 | |
*** thorst_afk has quit IRC | 00:49 | |
*** yuanying_ has joined #openstack-watcher | 00:58 | |
*** yuanying has quit IRC | 01:02 | |
*** yuanying has joined #openstack-watcher | 01:02 | |
*** yuanying_ has quit IRC | 01:02 | |
*** thorst_afk has joined #openstack-watcher | 01:23 | |
*** thorst_afk has quit IRC | 01:46 | |
*** yuanying has quit IRC | 01:50 | |
*** zhurong has joined #openstack-watcher | 01:54 | |
*** yuanying has joined #openstack-watcher | 02:01 | |
*** thorst_afk has joined #openstack-watcher | 02:19 | |
*** thorst_afk has quit IRC | 02:19 | |
*** thorst_afk has joined #openstack-watcher | 03:20 | |
*** thorst_afk has quit IRC | 03:25 | |
*** nicolasbock has quit IRC | 03:35 | |
*** thorst_afk has joined #openstack-watcher | 04:21 | |
*** thorst_afk has quit IRC | 04:26 | |
*** zhurong has quit IRC | 04:58 | |
*** zhurong has joined #openstack-watcher | 05:06 | |
*** thorst_afk has joined #openstack-watcher | 05:22 | |
*** thorst_afk has quit IRC | 05:26 | |
*** thorst_afk has joined #openstack-watcher | 06:23 | |
*** thorst_afk has quit IRC | 06:27 | |
*** thorst_afk has joined #openstack-watcher | 07:24 | |
*** thorst_afk has quit IRC | 07:28 | |
*** alexchadin has joined #openstack-watcher | 07:31 | |
*** vincentfrancoise has joined #openstack-watcher | 07:35 | |
*** thorst_afk has joined #openstack-watcher | 08:25 | |
*** thorst_afk has quit IRC | 08:29 | |
hidekazu | many fixed bugs were cherry-picked to stable/pike.. | 08:33 |
---|---|---|
*** alexchadin has quit IRC | 08:53 | |
*** alexchadin has joined #openstack-watcher | 08:54 | |
alexchadin | aspiers: hi | 08:55 |
aspiers | hi alexchadin | 08:58 |
aspiers | I'll be free to talk in about 30 mins, is that OK? | 08:58 |
alexchadin | aspiers: yeap | 08:58 |
aspiers | cool | 08:59 |
openstackgerrit | Hidekazu Nakamura proposed openstack/watcher-specs master: Add cdm-scoping spec https://review.openstack.org/496092 | 09:19 |
*** hidekazu has left #openstack-watcher | 09:21 | |
*** thorst_afk has joined #openstack-watcher | 09:25 | |
*** suzhengwei has joined #openstack-watcher | 09:28 | |
*** thorst_afk has quit IRC | 09:30 | |
*** vincentfrancoise has quit IRC | 09:43 | |
*** vincentfrancoise has joined #openstack-watcher | 09:45 | |
aspiers | alexchadin: ok | 09:45 |
*** alexchadin has quit IRC | 09:49 | |
*** alexchadin has joined #openstack-watcher | 09:54 | |
alexchadin | aspiers: ping | 09:54 |
aspiers | hi | 09:54 |
alexchadin | aspiers: back to your proposal, what strategy do you want to suggest? | 09:55 |
aspiers | so I figured out how to do VM consolidation via linear programming | 09:56 |
aspiers | it's actually pretty easy | 09:56 |
aspiers | the only problem is the algorithmic complexity, which would probably require partitioning the cloud up into chunks (e.g. max 500 servers per chunk) | 09:56 |
aspiers | but I don't think this would be an issue | 09:56 |
alexchadin | you speak about vms or nodes (500 servers)? | 09:57 |
aspiers | compute hosts | 09:57 |
alexchadin | oh, ok | 09:57 |
aspiers | and then once the optimal placement is calculated, we can use the software I already wrote for ordering the migration rearrangements | 09:57 |
aspiers | https://blueprints.launchpad.net/watcher/+spec/vm-migration-ordering | 09:57 |
aspiers | and IIUC we also now have a power-off strategy, right? | 09:58 |
aspiers | so they could all be combined to minimise the amount of energy spent on running compute hosts | 09:58 |
aspiers | is there already a VM consolidation strategy? | 09:58 |
alexchadin | aspiers: some of them are present already | 09:59 |
alexchadin | let me see | 09:59 |
alexchadin | aspiers: there is one proof of concept: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/workload_balance.py | 10:00 |
alexchadin | oh, it's about balancing, not consolidation | 10:01 |
aspiers | yeah, that's the opposite | 10:01 |
alexchadin | aspiers: consolidation one: https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/vm_workload_consolidation.py | 10:01 |
aspiers | OK, looking | 10:01 |
alexchadin | aspiers: we also have a basic consolidation strategy, but it's more about proving that Watcher works | 10:02 |
aspiers | which one is that? | 10:02 |
alexchadin | https://github.com/openstack/watcher/blob/master/watcher/decision_engine/strategy/strategies/basic_consolidation.py | 10:02 |
alexchadin | aspiers: I'd like to ask you to create an appropriate blueprint pointing out the linear programming methods you plan to use in your strategy | 10:05 |
aspiers | ok cool | 10:05 |
aspiers | sure | 10:05 |
aspiers | is it already possible to combine the existing consolidation strategies with the power off strategy? | 10:06 |
alexchadin | to be honest, we haven't planned to cross strategies, as they are independent by their nature | 10:07 |
alexchadin | aspiers: instead of it, you are free to use any actions watcher provides | 10:07 |
aspiers | oh I see, so the solution would include both migrations and power off? | 10:08 |
openstackgerrit | Merged openstack/watcher master: Remove watcher_tempest_plugin https://review.openstack.org/494472 | 10:09 |
alexchadin | aspiers: yes | 10:09 |
openstackgerrit | Merged openstack/watcher master: Updated from global requirements https://review.openstack.org/497089 | 10:09 |
alexchadin | aspiers: you may use any of these methods: https://github.com/openstack/watcher/tree/master/watcher/applier/actions | 10:09 |
alexchadin | actions, excuse me | 10:10 |
aspiers | cool, that's really helpful thanks! | 10:10 |
alexchadin | a list of these actions is the result of your perfect super best consolidation strategy ;D | 10:10 |
aspiers | so the solution will provide an ordering of the actions, but then the planner can reorder them? | 10:11 |
aspiers | I think any solutions generated in this way would have a very specific order which could not be easily changed | 10:11 |
alexchadin | aspiers: yes, we have two planners to use: https://github.com/openstack/watcher/tree/master/watcher/decision_engine/planner | 10:11 |
alexchadin | aspiers: unfortunately I have to go and take care of some things; we have a pluggable planner mechanism, so you may provide your own if you want | 10:13 |
aspiers | ok | 10:15 |
aspiers | thanks | 10:15 |
aspiers | alexchadin: I have other things to discuss too | 10:15 |
alexchadin | aspiers: I'll ping you once I get back | 10:15 |
aspiers | ok great! | 10:15 |
*** alexchadin has quit IRC | 10:16 | |
*** vincentfrancoise has quit IRC | 10:22 | |
*** thorst_afk has joined #openstack-watcher | 10:26 | |
*** thorst_afk has quit IRC | 10:31 | |
*** nicolasbock has joined #openstack-watcher | 10:33 | |
*** dpawlik_ is now known as dpawlik | 10:51 | |
*** nicolasbock has quit IRC | 10:54 | |
*** nicolasbock has joined #openstack-watcher | 11:08 | |
*** thorst_afk has joined #openstack-watcher | 11:27 | |
*** alexchadin has joined #openstack-watcher | 11:31 | |
*** thorst_afk has quit IRC | 11:32 | |
*** alexchadin has quit IRC | 11:36 | |
*** alexchadin has joined #openstack-watcher | 11:37 | |
*** vincentfrancoise has joined #openstack-watcher | 12:15 | |
*** thorst_afk has joined #openstack-watcher | 12:16 | |
*** Yumeng has quit IRC | 12:21 | |
openstackgerrit | Merged openstack/watcher master: Fix KeyError exception https://review.openstack.org/494145 | 12:40 |
*** vincentfrancoise has quit IRC | 12:43 | |
*** vincentfrancoise has joined #openstack-watcher | 12:44 | |
*** alexchadin has quit IRC | 12:52 | |
*** alexchadin has joined #openstack-watcher | 12:53 | |
alexchadin | aspiers: I'm back | 13:01 |
sballe_ | morning | 13:09 |
alexchadin | sballe_: hi | 13:11 |
*** vincentfrancoise has quit IRC | 13:13 | |
*** vincentfrancoise has joined #openstack-watcher | 13:14 | |
aspiers | alexchadin: | 13:15 |
aspiers | alexchadin: hi | 13:15 |
alexchadin | aspiers: pong | 13:15 |
aspiers | alexchadin: so one question I had is around terminology | 13:16 |
aspiers | https://docs.openstack.org/watcher/latest/glossary.html#cluster-definition | 13:16 |
aspiers | I don't understand this definition | 13:16 |
aspiers | it doesn't seem to relate to any other concept for grouping machines in OpenStack | 13:16 |
aspiers | and it also conflicts with other definitions of "cluster" in OpenStack | 13:16 |
alexchadin | hm | 13:17 |
aspiers | what does "managed by the same controller node" mean? | 13:17 |
alexchadin | typical grouping in OpenStack is being managed by Availability Zones, Host Aggregates and Cells, right? | 13:18 |
aspiers | and regions | 13:18 |
alexchadin | yes, I forgot about those | 13:18 |
alexchadin | what other definitions of cluster have you found? | 13:19 |
aspiers | the control plane typically contains HA clusters | 13:20 |
aspiers | and also Senlin manages clusters | 13:20 |
aspiers | also the control plane does not typically run on a single controller | 13:20 |
aspiers | it gets split across multiple nodes, sometimes even multiple clusters | 13:20 |
aspiers | so it does not really make sense to say that a machine is managed by one controller node | 13:21 |
aspiers | so I'm trying to understand what that definition really means | 13:21 |
aspiers | is Watcher tracking its own grouping of compute nodes? | 13:21 |
aspiers | where is a Watcher "cluster" defined? | 13:22 |
alexchadin | basically, we tracked all VMs and Compute Nodes we could find | 13:22 |
aspiers | so that is the whole compute plane of the cloud? | 13:23 |
alexchadin | then, we implemented Audit Scope, which allows us to restrict the scope of resources that are used while a strategy is running | 13:23 |
aspiers | OK but where is a Watcher "cluster" defined? in the Audit Scope? | 13:24 |
alexchadin | aspiers: I haven't met it in the code :) | 13:24 |
alexchadin | We have Cluster Data Model | 13:24 |
aspiers | yes, I saw that | 13:25 |
aspiers | I am trying to understand what a Watcher cluster really is | 13:25 |
alexchadin | It's a graph with nodes and related VMs | 13:25 |
aspiers | is it a strict subset of the compute plane, or just the whole compute plane? | 13:25 |
alexchadin | aspiers: to give you the answer I need to know what compute plane is | 13:26 |
aspiers | the compute plane is all the compute hosts in the cloud | 13:26 |
alexchadin | so I'll read this article now | 13:26 |
aspiers | some people *might* also include nova-{scheduler,conductor} etc. in the compute plane, but IMHO they are part of the control plane instead | 13:27 |
alexchadin | aspiers: Nova Cluster Data Model collects all compute nodes and their VMs in background | 13:29 |
alexchadin | Cinder Cluster Data Model does the same thing for Volume Nodes | 13:30 |
aspiers | alexchadin: I guess you mean https://github.com/openstack/watcher/tree/master/watcher/decision_engine/model/collector | 13:35 |
aspiers | alexchadin: yeah so it seems that it really refers to *all* compute hosts and instances, i.e. the whole compute plane | 13:36 |
alexchadin | Exactly | 13:37 |
aspiers | in which case I think "cluster" is really not a good term to use | 13:37 |
alexchadin | aspiers: if you want to rephrase the definition, feel free to submit the patch | 13:37 |
aspiers | alexchadin: OK, I'll have a think about the best way to deal with it. Thanks! | 13:37 |
alexchadin | aspiers: thanks for your help! | 13:38 |
aspiers | alexchadin: welcome :) I had one more question | 13:38 |
aspiers | alexchadin: are there plans to extend the scope of Watcher beyond optimization? | 13:38 |
alexchadin | aspiers: We didn't plan to do it. We have some thoughts about including some things to optimize (i.e. containers), but there weren't discussions about extending the scope | 13:40 |
alexchadin | aspiers: We have pretty straightforward goal, why should we extend the scope? | 13:41 |
aspiers | alexchadin: I don't think you should :) | 13:41 |
aspiers | and even if you optimize containers, that is still optimization, so it is still inside the current scope of Watcher's mission statement | 13:42 |
aspiers | I am asking because there have been one or two blueprints submitted recently which extend the scope *outside* optimization | 13:42 |
*** vincentfrancoise has quit IRC | 13:42 | |
alexchadin | aspiers: which ones? | 13:43 |
aspiers | into also handling failures | 13:43 |
aspiers | https://blueprints.launchpad.net/watcher/+spec/workload-evacuate-strategy | 13:43 |
*** vincentfrancoise has joined #openstack-watcher | 13:44 | |
aspiers | if Watcher starts to tackle auto-healing of failures then it is no longer just optimization, but also high availability | 13:44 |
aspiers | and then it starts to overlap with other OpenStack projects which already exist | 13:44 |
aspiers | this particular spec is targeting the exact same failure scenario which there is already a big project addressing | 13:45 |
aspiers | so I don't understand why it is being proposed | 13:46 |
aspiers | I asked on https://review.openstack.org/#/c/495168/2/specs/queens/approved/workload-evacuate-strategy.rst for an explanation but I didn't get one yet | 13:47 |
alexchadin | I haven't approved it yet, it sounds fair enough | 13:47 |
alexchadin | hm | 13:48 |
alexchadin | Watcher is created to reduce total cost of ownership by using optimization models with different goals | 13:49 |
aspiers | yes, that makes sense to me | 13:51 |
aspiers | and by definition optimization makes a cloud which is already working, better. | 13:51 |
aspiers | fixing a broken cloud is not optimization | 13:51 |
aspiers | I suppose some people might want to argue with the last point | 13:52 |
alexchadin | it's more like a Python script triggered by something like cron than an optimization algorithm | 13:52 |
aspiers | the best definition I could find was https://en.wikipedia.org/wiki/Program_optimization | 13:53 |
aspiers | and that article does not mention anything about optimization fixing failures | 13:53 |
aspiers | it also does not mention HA | 13:53 |
aspiers | but even if hypothetically optimization *did* include fixing failures, it would still not make sense for Watcher to duplicate the work of existing OpenStack projects dedicated to HA | 13:54 |
alexchadin | aspiers: I think it's time to call suzhengwei :) | 13:54 |
aspiers | sure | 13:54 |
aspiers | the question is simple: if you want a compute HA solution, why not just use Masakari? it is a well-established OpenStack project, which the OpenStack HA community has agreed is the best solution | 13:55 |
aspiers | instead of reinventing the wheel, it would be more effective to collaborate on the existing solution | 13:55 |
alexchadin | I agree | 13:56 |
alexchadin | aspiers: well, it's more about fixing the post-failure state of a cloud than optimising a live one, right? | 13:57 |
aspiers | alexchadin: masakari is about fixing post-failure state, yes. It does not do any optimisation | 13:58 |
aspiers | alexchadin: so currently there is a nice clean divide between the responsibilities of masakari and watcher | 13:58 |
alexchadin | aspiers: I was speaking about proposed strategy | 13:58 |
aspiers | oh yes, the proposed strategy changes that divide so that it is no longer clear | 13:58 |
aspiers | with that strategy, Watcher would be not just handling optimisation, but also post-failure states, so it would overlap with masakari | 13:59 |
openstackgerrit | OpenStack Release Bot proposed openstack/puppet-watcher master: Update reno for stable/pike https://review.openstack.org/497391 | 14:02 |
aspiers | alexchadin: another problem with this proposal is that it is impossible to do safe compute HA without fencing, and Watcher has no fencing mechanism | 14:04 |
aspiers | alexchadin: Masakari uses Pacemaker for fencing in order to ensure that VM resurrection via nova-evacuate is safe | 14:05 |
aspiers | alexchadin: so in order to implement this strategy in Watcher, you would need to add a dependency from Watcher on Pacemaker or some other cluster manager which implements quorum/voting/consensus/fencing | 14:06 |
alexchadin | aspiers: I've left my questions here: https://review.openstack.org/#/c/495168/2 | 14:10 |
aspiers | alexchadin: OK thanks | 14:10 |
aspiers | I explained the need for fencing in Austin https://youtu.be/lddtWUP_IKQ?t=6m07s | 14:10 |
alexchadin | aspiers: thanks for the link | 14:12 |
aspiers | welcome | 14:12 |
*** alexchadin has quit IRC | 14:46 | |
*** ianychoi has joined #openstack-watcher | 16:00 | |
*** openstackgerrit has quit IRC | 16:04 | |
*** vincentfrancoise has quit IRC | 16:13 | |
*** vincentfrancoise has joined #openstack-watcher | 16:14 | |
*** efoley has joined #openstack-watcher | 16:44 | |
*** efoley has quit IRC | 16:54 | |
*** vincentfrancoise has quit IRC | 16:57 | |
*** openstackgerrit has joined #openstack-watcher | 19:08 | |
openstackgerrit | Alex Schultz proposed openstack/puppet-watcher master: Update versions for Queens cycle https://review.openstack.org/497585 | 19:08 |
*** thorst_afk has quit IRC | 20:16 | |
*** thorst_afk has joined #openstack-watcher | 20:19 | |
*** thorst_afk has quit IRC | 20:24 | |
*** thorst_afk has joined #openstack-watcher | 20:36 | |
*** yuanying_ has joined #openstack-watcher | 23:59 |
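
The consolidation idea discussed at 09:56 (split the compute hosts into chunks of at most 500, then compute an optimal placement per chunk) can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the function names `chunk_hosts` and `consolidate`, the single "load" and "capacity" number per VM and host, and the solver itself. The log does not say which linear programming method or library was intended, so this sketch states the same objective (use as few hosts as possible without exceeding any host's capacity) and solves it by brute-force enumeration, which only works at toy sizes and is not Watcher code.

```python
from itertools import product


def chunk_hosts(hosts, max_per_chunk=500):
    """Split a list of hosts into consecutive chunks of at most max_per_chunk.

    The 500 default mirrors the figure mentioned in the chat; it is not a
    value taken from any Watcher configuration.
    """
    if max_per_chunk < 1:
        raise ValueError("max_per_chunk must be at least 1")
    return [hosts[i:i + max_per_chunk]
            for i in range(0, len(hosts), max_per_chunk)]


def consolidate(vm_load, host_capacity):
    """Find a placement that uses the fewest hosts within capacity limits.

    vm_load maps a (hypothetical) VM name to its load, host_capacity maps a
    (hypothetical) host name to its capacity. Returns a tuple
    (hosts_used, placement), or None when no feasible placement exists.
    This enumerates every assignment, so it is exponential in the number
    of VMs and stands in for a real LP/ILP solver.
    """
    vms = sorted(vm_load)
    hosts = sorted(host_capacity)
    best = None
    for assignment in product(hosts, repeat=len(vms)):
        # Sum the load each host would carry under this assignment.
        used = dict.fromkeys(hosts, 0)
        for vm, host in zip(vms, assignment):
            used[host] += vm_load[vm]
        # Capacity constraint: skip assignments that overload any host.
        if any(used[h] > host_capacity[h] for h in hosts):
            continue
        # Objective: number of hosts that run at least one VM.
        active = len(set(assignment))
        if best is None or active < best[0]:
            best = (active, dict(zip(vms, assignment)))
    return best


if __name__ == "__main__":
    chunks = chunk_hosts(["host%d" % i for i in range(1200)])
    print([len(c) for c in chunks])
    print(consolidate({"vm1": 2, "vm2": 2, "vm3": 3},
                      {"h1": 4, "h2": 4, "h3": 4}))
```

Hosts left without any VM in the returned placement would be the candidates for a power-off action, and the resulting moves would still need to be ordered, as discussed around 09:57 and 10:11.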