20:02:09 <ttx> #startmeeting tc
20:02:10 <openstack> Meeting started Tue Sep 10 20:02:09 2013 UTC and is due to finish in 60 minutes. The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:02:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:02:12 <markwash> o/
20:02:13 <mordred> o/
20:02:14 <openstack> The meeting name has been set to 'tc'
20:02:17 <annegentle> o/
20:02:17 <ttx> Our agenda:
20:02:25 <ttx> #link https://wiki.openstack.org/wiki/Governance/TechnicalCommittee
20:02:38 <ttx> #topic Savanna incubation request: initial discussion
20:02:44 <ttx> #link http://lists.openstack.org/pipermail/openstack-dev/2013-September/014623.html
20:02:50 <ttx> #link https://wiki.openstack.org/wiki/Savanna/Incubation
20:02:59 <ttx> SergeyLukjanov: hi!
20:03:13 <SergeyLukjanov> ttx, howdy!
20:03:19 <mikal> Hi
20:03:27 <akuznetsov> hi
20:03:30 <ttx> So we usually consider incubation over two meetings. The first week is an initial discussion so that the main issues can emerge
20:03:46 <ttx> and at the second one we usually conclude that discussion and vote
20:04:06 <ttx> So this week is mostly about Q&A
20:04:22 <ttx> Personally I had a question about the scope. Savanna is a single project but proposes two very different services: cluster and data operations
20:04:39 <ttx> That sounds like two very separate use cases to me
20:04:47 <hub_cap> yes the former seems very similar to trove mission
20:04:54 <ttx> So far we had provisioning stuff like Trove
20:05:04 <ttx> and data-oriented stuff like Marconi
20:05:06 <mikal> I would like to hear more about plans for heat as well
20:05:17 <zaneb> mikal: ++
20:05:22 <ttx> but no project that would handle both at the same tiem
20:05:22 <vishy> o/
20:05:33 <SergeyLukjanov> let's start from the question about cluster and data ops
20:05:37 <ttx> could you explain the value of bundling those two activities in the same project ?
20:05:52 <SergeyLukjanov> all data ops are build around the Hadoop eco
20:06:34 <SergeyLukjanov> and the Hadoop cluster provisioning process is very complex, additionally, we need not only Hadoop cluster but a lot of other tools that works on Hadoop
20:06:36 <jd__> i.e. relying on Heat?
20:06:49 <ErikB> Specifically, Savanna is focused around enabling setup, provisioning, configuration and deployment of Hadoop on OpenStack and related data operations that would be executed on Hadoop
20:07:22 <hub_cap> yes but trove is focused on enabling setup, prov, config and deployment of X on OpensSTack
20:07:28 <hub_cap> where X is a datastore
20:07:37 <akuznetsov> Savanna main goal is the elastic data processing, but without cluster operation it is unreachable. So the first step was creation of staff for cluster operation
20:07:43 <ttx> sounds like a product definition more than a project definition... how much code would be shared between those two types of ops ?
20:07:54 <SergeyLukjanov> hub_cap, we're not targeting Savanna as data store provider
20:08:19 <ttx> (not saying you don't need both, just wondering how much sense it makes as a single project)
20:08:34 <SergeyLukjanov> ttx, in fact we want to provide data ops using Hadoop eco, but we need to provision complex cluster to do it
20:08:40 <hub_cap> im not sure what you mean by data store provider? youre targetting a "spin up a hadoop ecosystem", yes?
20:08:41 <russellb> another way to look at it, is that the cluster part could be considered part of the OpenStack Deployment program
20:08:51 <mordred> hub_cap: hadoop isn't storage, it's processing
20:09:02 <SergeyLukjanov> mordred, yep, absolutely
20:09:07 <ttx> russellb: or some kind of trove-like project that would leverage heat
20:09:08 <mattf> SergeyLukjanov, ErikB, a little background of Hadoop might help to level the understanding
20:09:09 <rnirmal> actually it's both..
20:09:11 <russellb> ttx: yes
20:09:23 <hub_cap> im pretty sure its more than just processing
20:09:24 <vishy> ok so the problem here is that cluster management and configuration is a shared problem
20:09:33 <hub_cap> correct vishy well put
20:09:38 <vishy> which is partially solved by trove and heat
20:09:43 <ttx> russellb: I can see a use case for someone to spin up hadoop clusters rather than submit direct jobs... it just sounds like a very different type of user
20:09:53 <ruhe> ElasticDataProcessing allows to execute MapReduce jobs on demand. It means that Hadoop cluster will be provisioned specifically for the job, and destroyed once job is complete.
20:10:03 <hub_cap> fwiw, i dont want to touch a data api w/ a 10 ft pole..., so i see savanna filling that gap
20:11:16 <mikal> So the use case here is that a user sends an api requests saying "map reduce this please" and savnna brings up the jobs and kicks them off?
20:11:29 <ruhe> mikal right
20:11:31 <hub_cap> is there no use case for long running hadoop clusters?
20:11:32 <mikal> I guess I don't understand why Savanna shouldn't be using heat to orchestrate that process
20:11:51 <zaneb> I've heard it said that Hadoop is too complicated for Heat to deploy... I would love to at least find out what we're missing
20:11:54 <SergeyLukjanov> hub_cap, we have such use case
20:11:58 <hub_cap> or even hadoop provisioning for customers
20:11:59 <ruhe> hub_cap, yes there is such use case
20:12:01 <ttx> mikal: but there is also a use case where a user sends api requests to bring up a whole hadoop cluster
20:12:02 <hub_cap> to use on their own
20:12:32 <mikal> ttx: a hadoop cluster that lives longer than a single map reduce you mean?
20:12:38 <rnirmal> mikal: yes
20:12:40 <shardy> ttx: that can still use heat behind the scenes tho, no?
20:12:47 <ttx> shardy: definitely
20:12:50 * gabrielhurley is late to the meeting
20:12:52 <mikal> shardy: agreed
20:12:53 <mattf> shardy, +1
20:13:11 <ttx> mikal: yes, and for which you'd use classic hadoop tools to submit jobs
20:13:26 <mikal> ttx: fair enough
20:13:27 <hub_cap> or some data api savanna provides ttx?
20:13:46 <ErikB> Hadoop can be fairly difficult to configure and deploy. Savanna provides the mechanism to deploy the Hadoop infrastructure (composed of multiple services, configuration, topology) on OpenStack leveraging distribution specific constructs. Each distribution (Apache, MapR, Cloudera, HDP) tends to provide their own mechanism for deployment and management which is what Savanna provides a framework for. Duplicating this in Heat
20:13:53 <SergeyLukjanov> Hadoop cluster provisioning is a very complex process with tons of configs and Savanna provides an ability for users to create templates
20:13:58 <asavu> IMHO Savanna is like EMR + Netflix Genie tightly integrated. I'm not sure Heat can solve the orchestration problem completely but I agree can be part of the solution
20:14:15 <hub_cap> do you feel hadoop is more complicated than setting up a cassandra or mongo cluster?
20:14:30 <hub_cap> because Trove is going to tackle those, as blueprints are sitting in our queue
20:14:32 <ruhe> hub_cap definetely
20:14:34 <rnirmal> I might be wrong but currently heat doesn't have the option to do a post processing operation. which is needed for cluster configuration
20:14:43 <hub_cap> oh oh oh Trove can :)
20:14:48 <demorris> SergeyLukjanov: Trove would need that same capability to configure clusters and templates that describe the complexities of the different cluster deployments
20:14:49 <mikal> I worry that saying "we can model this in heat" indicates a heat bug instead of a need for a new orchestration system
20:14:51 <jd__> enhancing Heat might be a better solution though asavu
20:14:54 <hub_cap> i know heat has deferred things too
20:14:56 <russellb> rnirmal: but part of being an OpenStack projet is to work with other projects to fill gaps :)
20:14:59 <akuznetsov> hub_cap yes it hadoop cluster contains several services with circular dependencies
20:15:00 <shardy> asavu: as mentioned by zaneb, we'd like to understand the parts you think you can't solve with heat atm
20:15:10 <SergeyLukjanov> hub_cap, we're provisioning not only Hadoop, but Hadoop eco using some Hadoop management consoles
20:15:12 <rnirmal> russellb: agree... just wanted to point it out
20:15:32 <aignatov_> yes, hadoop(hdfs, mr service and services for data processing) has more complexity than deployment of cassandra
20:15:35 <hub_cap> sure SergeyLukjanov / akuznetsov and im sure cassandra would be as such too... are there no ecosystem tools wrt it?
20:15:38 <russellb> and i think one of the expectations if you were to be incubated would be to work with projects to fill gaps so that you can build on them as much as possible
20:15:47 <hub_cap> +1 russellb
20:15:52 <shardy> mikal: Agree, probably we just need to better understand what's missing/needed in Heat
20:16:12 <ttx> My point is that while I see a value for the data API, I question the value of a hadoop-specific cluster thing. That would overlap with a lot of Heat/Trove space
20:16:13 <mikal> shardy: yes. I think an analysis of that is something I'd like to see for part two of this discussion.
20:16:20 <akuznetsov> the main issue that Heat is not support a circular decencies
20:16:21 <russellb> so part of the Q&A is trying to establish some vision for where this is headed, and how it might integrate
20:16:26 <shardy> mikal: +1
20:16:30 <mordred> yah
20:16:34 <hub_cap> ttx +1. and id love a cassandra data api too built on top of heat/trove ;)
20:16:39 <ruhe> we actually have a wiki page for such questions about "Why don't you use Heat?" - https://wiki.openstack.org/wiki/Savanna/WhyNotHeat
20:16:44 <mordred> I mean, when trove came to us, it was not using heat, and we said, dude, you should use heat
20:16:45 <SergeyLukjanov> hub_cap, there are a lot of tools that works on Hadoop - Hive, Pig, Oozie
20:16:45 <hub_cap> maybe savanna fits for a "data api"
20:16:50 <mikal> ruhe: /me looks
20:16:58 <hub_cap> SergeyLukjanov: sure, and i assume we could install them
20:17:09 <hub_cap> we have a postprocessing guest that is in charge of this
20:17:13 * ttx looks
20:17:20 <mordred> "Savanna currently maintains Grizzly+ compatibility. " - if you got integrated, would that still be a goal?
20:17:21 <russellb> all of this isn't necessarily arguing where you should be *right now*, just where you should go :)
20:17:22 <russellb> to be clear ...
20:17:25 <hub_cap> and it keeps services online and reports failures
20:17:27 <mordred> russellb: ++
20:17:28 <asavu> shardy afaik aws cloudformation doesn't have all the semantics we need e.g arbitrary script execution, vendor API interaction etc.
20:17:32 <hub_cap> russellb: +1 billino
20:17:40 <hub_cap> *billion
20:18:10 <mordred> "Circular dependencies - we should generate ‘/etc/hosts’ for all instances in provisioned cluster." - I believe os-*-config will be your friends there
20:18:19 <ruhe> agree, that at some point we'll need to use Heat for provisioning. It's just a matter of time
20:18:28 <mordred> ah - good
20:18:30 <mordred> "Once Heat fulfills all these requirements we will be able and should use Heat for VM provisioning. "
20:18:39 <SergeyLukjanov> mordred, we're currently planing to guarantee only H support in 0.3 release (mid October)
20:18:40 <ttx> So... basically if Heat needs to be improved to be used as a basis for Savanna... then maybe it makes sense to wait for that to happen before filing Savanna for incubation. Incubation is about INTEGRATING with existing intergrated projects to form a coherent whole.
20:19:02 <shardy> asavu: we're working on lots of native, non cloudformation-compatible functionality atm, your requirements (and contributions) would be very valuable to that process
20:19:02 <ttx> projects can completely exist outside of incubation
20:19:06 <markmcclain> +1
20:19:14 <shardy> rather than just rolling your own everything
20:19:18 <mordred> ttx: or, perhaps part of integrating is a list of things that need to be done on both sides to come out of integrating
20:19:30 <hub_cap> i woould still see a lot of overlap between the clustering api trove has proposed and the savanna clustering api
20:19:33 <mordred> it's hard to say "hey, heat, we need this feature for X" if X isn't on heat's radar per-se
20:19:38 <hub_cap> rather than saying "just heat"
20:19:43 <SergeyLukjanov> ttx, integration with Heat is one of our goals for I cycle
20:19:52 <zaneb> to be clear, the Heat team is working on a new template format with the explicit goal of being able to describe something like a Hadoop deployment
20:20:07 <russellb> i think these sorts of things are fine to work on during incubation
20:20:09 <zaneb> i.e. this is the canonical example of what we want to be able to support
20:20:11 <ttx> mordred: so you would incubate and let them there until proper integration is achieved ?
20:20:20 <mordred> ttx: that's the point of incubation, no?
20:20:21 <russellb> that's what we asked of trove
20:20:26 <hub_cap> yar
20:20:26 <russellb> ttx: +1
20:20:33 <shardy> mordred: we welcome feedback from users and potential contributors about what features they need
20:20:39 <jd__> ttx: mordred: +1
20:20:42 <hub_cap> so does no one see the overlap between trove/savanna wrt the clustering?
20:20:50 <dolphm> hub_cap: ++
20:20:51 <russellb> i mean, we should have reasonable confidence that they can/will achieve what we're asking, and are off to a good start
20:20:52 <ttx> mordred: sure, as long as the stated goal of savanna is to achieve that integration
20:20:56 <mordred> hub_cap: sure- I'd love to see you guys working together
20:21:18 <hub_cap> +1 mordred it would make trove a better product
20:21:19 <mordred> hub_cap: explicitly working on solving that - either by one of you consuming the other, or by spinning off a third thing that both of you consume
20:21:21 <mordred> or something
20:21:37 <ttx> mordred: i.e. it makes sense to incubate if its' to work on integration. Not so much if it's to explain that they can't use Heat because of A and B
20:21:40 <hub_cap> absolutely mordred... something can be consenus-ified
20:21:48 <mattf> SergeyLukjanov, wouldn't you say that a core goal of savanna is to integrate with other openstack projects?
20:21:53 <rnirmal> also cluster provisioning doesn't just apply to hadoop.. it's hadoop today.. spark tomorrow and a whole lot of possible other tools/projects
20:21:58 <mordred> ttx: I think their 'why not heat' is already actually a 'why not heat right now'
20:21:58 <vishy> hub_cap: what "clustering" do you need that couldn't be provided by heat?
20:22:02 <ttx> but Sergey said it's one of their goals, so I think we are fine
20:22:03 <hub_cap> +1 rnirmal
20:22:05 <mattf> imho it's a core reason savanna is being done in the openstack community instead of outside
20:22:05 <gabrielhurley> sounds like that shared clustering feature might be a good candidate for a shared library
20:22:21 <hub_cap> gabrielhurley: that would work
20:22:26 <SergeyLukjanov> ttx, yep integration with other OS projects is our goal
20:22:41 <hub_cap> vishy: we use heat for clustering, for installiation of a cluster
20:22:49 <hub_cap> doing things like master/slave promotion for mysql
20:22:50 <russellb> i think part of incubation should be viewed as more attentive guidance from us on how/where a project should integrate
20:22:51 <hub_cap> or failover
20:23:07 <shardy> mordred: I'd like to see a roadmap of "$stuff we need to migrate to heat in the medium term"
20:23:08 <hub_cap> complex things require knowledge from "within" so to speak, being a guest
20:23:13 <russellb> i think that works well ... gets it more on everyone's radar
20:23:16 <vishy> ok so promotion and failover
20:23:18 <demorris> there is also benefit to a common API for clustering so we don't end up with two completely distinct clustering API provisioning methods
20:23:35 <vishy> that isn't what I think of when i think clustering
20:23:35 <hub_cap> to start, vishy, thats 2 things i can think of
20:23:37 <demorris> unless the use case is so different that it calls for it, but I don't see it as such just yet for the API
20:23:50 <jgriffith> demorris: +1
20:23:54 <hub_cap> vishy: clustering a data store
20:23:56 <vishy> because if it is just launch a group of vms i think heat handles that just fine
20:23:57 <jmaron> I think you need a clearer definition of "cluster" and or "clustering". In the hadoop world it's more than the provisioning of VMs - it's the provisioning and configuraiton of a slew of data services on top of those hosts (Note that hadoop isn't necessarily cloud/VM aware)
20:24:00 <SergeyLukjanov> hub_cap, I hope that we'll use Heat for cluster provisioning and potentially for [auto]scaling support
20:24:12 <akuznetsov> possible clustering should be done on Heat side, for example cluster for j2ee application
20:24:15 <hub_cap> vishy: totally agree, thats what we are doing in trove :)
20:24:20 <ruhe> cluster deployment is one of the simplest things in Savanna. there are lot's of details related to Hadoop - integration with Swift, HDFS block placement
20:24:21 <vishy> so specifically it is atomically configuring things
20:24:23 <hub_cap> install = heat
20:24:31 <vishy> which is going to require something like zookeeper, no?
20:24:56 <hub_cap> possibly :)
20:25:02 <mordred> can't heat already handle that? or will do soon?
20:25:06 <vishy> that doesn't sound like it specifically belongs to one project so shared library might be the way to go
20:25:22 <hub_cap> im all for shared lib
20:25:25 <hub_cap> shared = better right? :)
20:25:29 <vishy> have to wonder if it actually fits into the the taskflow library
20:25:31 <ruhe> hub_cap, so the idea is to develop clustering support in Heat which then could be used by Trove and Savanna?
20:25:48 <hub_cap> possibly? we are already going ot use heat for clustering support
20:25:57 <vishy> ruhe: +1
20:25:59 <dmakogon_> hub_cap: cluster_provisioning lib is VERY(!!!) good point
20:26:01 <hub_cap> i see it as savanna uses trove for clustering / post processing etc...
20:26:14 <dmakogon_> hub_cap: and it could be like the part of heat
20:26:16 <hub_cap> and if you want a "hadoop prov api only" you can use trove
20:26:23 <vishy> the location of the clustering library is a minor point
20:26:23 <hub_cap> which obviously will use heat to prov
20:26:26 * ttx sees a lot of discussions needed between heat, trove and savanna guys in HK
20:26:33 <hub_cap> yes plz!
20:26:38 <ttx> but I like what I'm seeing
20:26:38 <hub_cap> ttx ^ ^
20:26:49 <hub_cap> i think there is much overlap
20:26:53 <hub_cap> and i dont like duplicating work
20:27:03 <dmakogon_> hub_cap +1
20:27:03 <dolphm> ttx: ++
20:27:08 <SergeyLukjanov> hub_cap, Trove is 'Database as a Service', but Hadoop isn't a DB
20:27:22 <hub_cap> is it not, at the heart of it?
20:27:27 <mordred> ttx: ++
20:27:31 <hub_cap> is there not a (or many) ways to process and retrieve data
20:27:35 <hub_cap> and a storage engine
20:27:42 <zaneb> it is and it isn't ;)
20:27:43 <ruhe> hub_cap, it's main goal is bigdata processing
20:27:46 <hub_cap> and a plethora of tools avail
20:27:46 <dmakogon_> SergeyLukjanov: but Hive/HBase - yes !
20:28:07 <hub_cap> absolutely, and i dont want to touch processing in Trove
20:28:08 <ttx> I think at the very least, even with clustering completely stripped off, savanna would make sense standalone as a data API
20:28:19 <hub_cap> ttx:++
20:28:30 <hub_cap> id love to see savanna as the data api in openstack
20:28:32 <dmakogon_> ttx: good point
20:28:38 <mikal> ttx: I agree. I just think savanna should be as thin as possible to reduce duplication of effort
20:28:48 <hub_cap> mikal: ++
20:28:51 <russellb> the deployment still has to be solved somewhere
20:28:54 <ttx> mikal: sounds like one of my lines
20:28:58 <ruhe> hub_cap, i guess we need a definition of clustering
20:29:02 <hub_cap> id love to see savanna tackle cassandra and mongo in the future
20:29:06 <dmakogon_> ruhe: yes
20:29:08 <hub_cap> in terms of data api
20:29:13 <gabrielhurley> thin++
20:29:19 <russellb> those might be different data APIs though ...
20:29:26 <ttx> russellb: oh sure. But it's aproblem space that several others are exploring..; and all those people need to talk around a beer to solve it
20:29:35 <russellb> fair enough
20:29:39 <hub_cap> ttx beer+whiteboard
20:29:41 <dolphm> mikal: ++
20:29:41 <SergeyLukjanov> hub_cap, we're planning to support cassandra as external data source
20:29:43 <dmakogon_> hub_cap: cassandra/mongo via savanna ??
20:29:53 <hub_cap> dmakogon_: the data api
20:29:58 <ttx> SergeyLukjanov: I also have a slight concern about you being the author of more than half of the commits
20:30:00 <hub_cap> not the prov/clustering
20:30:03 <ttx> It's not as extreme as Designate (59% instead of 84%) but it still looks a bit brittle to me
20:30:06 <markwash> where can I learn more about the savanna data api?
20:30:09 <akuznetsov> hub_cap we will have cassandra and mongo as one of the data source for edp
20:30:17 <ttx> SergeyLukjanov: are you superman ?
20:30:28 <ttx> SergeyLukjanov: are there buses near where your live ?
20:30:34 <hub_cap> akuznetsov: and you will need clusters for those correct?
20:30:56 <hub_cap> and trove is going to solve prov'ing those clusters in its near future (dmakogon_ is foaming at the mouth)
20:30:56 <akuznetsov> markwash https://wiki.openstack.org/wiki/Savanna/EDP
20:31:03 <SergeyLukjanov> ttx, it's mostly related to the initial state of the project, you can take a look for the last 3 months percentage
20:31:16 <dmakogon_> hub_Cap: correct me if i'm wrong, trove could use savanna(in future) for provision clusters of cassandra ?
20:31:19 <zaneb> https://github.com/stackforge/savanna/contributors
20:31:30 <russellb> and I keep typoing Savanna as Savannah because of Savannah, GA
20:31:32 <russellb> :(
20:31:41 <hub_cap> i see it as the opposite dmakogon_, savanna being a data api uses trove to prov the clusters
20:31:45 <hub_cap> and then does magic data stuff w them
20:31:48 <ruhe> hub_cap, i'm not sure if database_as_a_service is a proper tool to provision data_processing_tool
20:31:57 <mattf> ttx, savanna has active development from mirantis, red hat and hortonworks. state is mirantis seeded the project and has a lot of historical commits.
20:31:58 <SergeyLukjanov> ruhe, agreed
20:32:14 <vishy> hub_cap: it sounds like you are starting to see trove as cluster_provisioning_as_a_service
20:32:23 <markwash> akuznetsov: so is it basically "post up a hive/pig script after establishing your data sources" ?
20:32:28 <rnirmal> maybe that needs to be split out into it's own service then
20:32:29 <ttx> SergeyLukjanov: agree that recent data shows a good trends
20:32:29 <dmakogon_> hub_cap: we can do it in any way
20:32:31 <hub_cap> well we are clustering nosql datastors as a service vishy :)
20:32:46 <hub_cap> man i cant type
20:32:46 <vishy> i don't think it clearly "belongs" to any project today
20:32:47 <dmakogon_> hub_cap: savanna via trove, trove via savanna
20:32:57 <mikal> mattf: there are a lot more mirantis people though right? I don't think that's bad (and all credit to mirantis), but do you think the project would survive if for some reason it stopped being a priority for mirantis?
20:32:58 <dmakogon_> hub_cap: multiple dual support
20:33:02 <vishy> i think we all agree that clurster provisioning is important
20:33:04 <akuznetsov> markwash not only
20:33:16 <vishy> * cluster
20:33:16 <hub_cap> +1 for clusters
20:33:19 <vishy> and it goes somewhere
20:33:23 <jmaron> the provisioning of VMs in savanna is already fairly well partitioned as an API/service. trove/heat are not precluded as playing a potential role. However, it is only a portion of what is required to configure a hadoop deployment (hate using cluster - seems to be a term with specific conotation in this crowd ;) )
20:33:23 <ruhe> anyway, Savanna is all around about integration with various Hadoop vendors. I'm worried that splitting development into two separate project will make thing really complex. both will need integration with various Hadoop distros
20:33:30 <hub_cap> vishy: ++
20:33:32 <ttx> vishy: I wouldn't mind trove to expand scope to support generic clustering
20:33:33 <ErikB> mikal - savanna is a priority for Hortonworks
20:33:36 <mattf> mikal, i think the community is growing, will continue to grow and will be sustaining outside of mirantis, yes
20:33:40 <vishy> and trove/heat/savanna/workflow can fight to the death about who gets to own it
20:33:50 <hub_cap> vishy: cage match?
20:33:56 <dmakogon_> mikal: savanna has contributors from RedHat, so it could survive in any way)
20:33:58 <akuznetsov> savanna already has a lot staff for clustering anti affinity group, networks and ect.
20:33:58 <ttx> vishy: otherwise everyone will keep on reinventing it
20:33:59 <vishy> oh and tripleo
20:34:05 <vishy> since they have to provision clusters as well
20:34:11 <vishy> :o
20:34:19 <hub_cap> lol
20:34:25 <ruhe> vishy, it seems like Heat is the right tool to provision cluster, others are for tight integration with software running inside VMs
20:34:25 <russellb> zomg clusters
20:34:26 <mordred> yeah - but that's just heat really :)
20:34:31 * hub_cap hands vishy the wrench for the meeting
20:34:33 * mordred clusters himself
20:34:41 * markwash noms on clusters of nuts
20:34:41 <hub_cap> oh god now were all doomed
20:34:51 <dmakogon_> ruhe: +1 for clusters provisioning
20:35:00 <vishy> ruhe: the issue is provisioning the vms is easy
20:35:04 <ttx> wow 34min in and it's already toasted
20:35:09 <SergeyLukjanov> ruhe, yep and we planning to digg into Heat and try to contributed missed features to it to be able to use it for provisioning in Savanna
20:35:21 <hub_cap> ttx fwiw, i wouldnt mind expandoing scope cuz its going to be cassandra/mysql/mongo/redis/etc/etc/etc in trove
20:35:27 <vishy> it is configuring the services to know about each other, do elections, etc.
20:35:27 <vishy> that is hard
20:35:28 <hub_cap> and savanna will have hadoop
20:35:34 <dmakogon_> SergeyLukjanov: good idea
20:35:38 <vishy> and although the software is different there is definitely a lot of overlap in these things
20:35:42 <demorris> hub_cap: + Vertica CE
20:35:44 <annegentle> I think big data/ map reduce use cases are really valuable, and would like to see heat orchestrating to help other projects laser focus on use cases
20:35:44 <ruhe> vishy, not only provision, but apply configs, for instance host names in /etc/hosts for the whole cluster
20:36:07 <hub_cap> yes that should be the same w/ cassandra right ruhe?
20:36:18 <mikal> ruhe: would wouldn't just bring up an instane running bind and point everyone to that (or something other than losts of copies of /etc/hosts?)
20:36:19 <ruhe> hub_cap, right
20:36:23 <hub_cap> we plan on supporting in a generic way, cuz im sure mongo/cassandra will need
20:36:32 <vishy> ruhe: imo that is part of configuring the software
20:36:32 <ttx> annegentle: yes, and I wouldn't mind Savanna contributors to contribute the missing stuff they need to existing projects :) Yay cross-openstack collaboration
20:36:52 <annegentle> ttx: yep
20:36:55 <mordred> ++
20:37:10 <jgriffith> vishy: ++
20:37:14 <vishy> a lot of the difficulty would be avoided if we had integrated dns and autodiscovery
20:37:21 <jgriffith> I think there needs to be a clearer distinction here
20:37:22 * vishy has a whole bag of wrenches
20:37:33 <hub_cap> geez you sure do
20:37:34 <ttx> so basically I think there is value in incubating savanna, if only to get all those devs to show up at the design summit and see how they can best fit
20:37:43 <hub_cap> +1 ttx
20:37:46 <hub_cap> so um, state of clustering at the summit?
20:37:50 <dmakogon_> hup_cap: SergeyLukjanov: i see the next situation: trove support HBase/Hive - that is means that trove get Hadoop cluster provisioned via Savanna and than install Hive/Hbase on that cluster
20:37:52 <russellb> vishy: huge +1 to auto discovery ...
20:37:56 <dmakogon_> hub_cap:+1
20:38:15 * ttx looks up scedhule to make sure heat/trove and savanna don't run at the same time
20:38:18 <russellb> vishy: we keep inventing new methods to do that by hand, kinda getting silly
20:38:35 <mordred> vishy: dns++
20:38:37 <ruhe> dmakogon_, don't hurry :) HBase provisioning is the most complicated thing i ever seen in my life
20:38:42 <russellb> and yes, dns++ too
20:38:48 <russellb> is anyone help with dns yet?
20:38:51 <ruhe> I mean, getting it done right
20:38:52 <russellb> (sorry, another topoic)
20:38:58 <hub_cap> hah
20:39:00 <vishy> ruhe: no way it is more complicated than configuring openstack!
20:39:05 <ttx> err. trove and heat run at the same time, sigh
20:39:10 <dmakogon_> ruhe: i know, i've deployed it by hands a lot of times
20:39:14 <ruhe> vishy, good catch
20:39:25 <hub_cap> ttx FAIL
20:39:27 <mordred> russellb: no worries - we can talk about infinite number of things in parallel in this meeting
20:39:37 <hub_cap> mordred: hows the weather?
20:39:51 <mordred> hub_cap: great! I got some torchy's tacos yesterday and a bowl of queso
20:39:59 <SergeyLukjanov> agreed with need to discuss where clustering part should be done
20:40:02 <mordred> ttx: rework the whole scedule now!
20:40:08 <ttx> on it
20:40:08 <mattf> mordred, super linear scaling eh?
20:40:12 <dolphm> mordred: /jealous
20:40:34 <hub_cap> yes id love to see 1 project support clustering, there is no need to reinvent
20:40:40 <rnirmal> heat can provision a cluster with some work maybe... but some X needs to configure it and X needs to manage the lifecycle of a cluster... be it hadoop, cassandra or spark.. and I see a split between savanna and trove..... can we work towards solving that first
20:40:43 <hub_cap> and id love to see 1 project support a data api
20:40:49 <dmakogon_> hub_cap: +1 to shared lib
20:40:55 <hub_cap> +1 to single project
20:41:04 <ruhe> rnirmal +1
20:41:10 <hub_cap> rather than shared lib w/ the same api's between 2 diff projects heh
20:41:15 <dmakogon_> i think this is shouldn't be a standalone project
20:41:28 <dmakogon_> just some algorithms
20:41:32 <hub_cap> oh no a data api is quite valid :)
20:42:34 <zaneb> hub_cap: shared lib (vs. shared service) seems quite reasonable to me?
20:42:43 <dmakogon_> or heat should provision clusters for next usage in terms of current project
20:42:45 <hub_cap> yes zaneb
20:43:36 <ttx> mordred: I'll propose to swap trove and ironic on https://docs.google.com/spreadsheet/ccc?key=0AmUn0hzC1InKdDdPRXFrNjV4SW91SWF5N2gwYnRHYWc#gid=1 -- sounds like the most limited change that would solve it
20:43:44 <ErikB> +1 rnirmal - this is the value that Savanna adds.
20:43:47 <dmakogon_> hub_cap: shared lib ease to reuse without any integration
20:43:53 <ruhe> to understand requirements of such shared lib (or service) we'll need to understand requirement from both Trove and Savanna
20:43:56 <russellb> ironic and heat overlap is probably rough too
20:44:16 <russellb> since the folks interested in baremetal, are also interested in tripleo, which are interested in heat
20:44:21 <hub_cap> +1 ruhe
20:44:32 <ttx> russellb: there are more heat slots than ironic slots though, so they can still attend some of it
20:44:41 <russellb> ttx: ah, cool, probably fine then
20:44:50 <SergeyLukjanov> looks like no ideas for clustering discussion now and the best solution is to setup clustering discussion at design summit and apply the decision in I cycle
20:44:51 <hub_cap> hmmm seems like we are starting to see program vs project
20:45:05 <hub_cap> +1 SergeyLukjanov we have submitted one for trove :)
20:45:20 <mordred> hub_cap: ++
20:45:20 <hub_cap> http://summit.openstack.org/cfp/details/54
20:45:29 <dmakogon_> SergeyLukjanov: +1
20:45:34 <russellb> ttx: we should just serialize the whole thing and have the design summit never end
20:45:52 <hub_cap> when it is over is when it begins
20:45:55 <ttx> russellb: sounds like paradise
20:46:51 <ttx> OK, so it would be great to start this discussion a bit this week so that we can see the premises of this collaboration by the time we finally vote on this (next week or the week after)
20:47:11 <russellb> good call
20:47:28 <dmakogon_> hub_cap: SergeyLukjanov: we could discuss clustering together, it term of trove/heat/savana/ironic
20:47:31 <annegentle> ttx: how does the election timing and vote for incubation line up?
20:47:38 <annegentle> ttx: I can't remember when elections are
20:47:40 <ttx> and unless someone has another concern to raise, we can go to open discussion now
20:47:43 <ttx> annegentle: next topic
20:47:45 <hub_cap> yes dmakogon_
20:47:51 <hub_cap> go go go ttx
20:47:58 <mordred> we have another topic? jeez
20:48:05 <ttx> not really
20:48:07 <ttx> #topic Open discussion
20:48:12 <ttx> I set up the pages for the PTL and TC elections in the next weeks:
20:48:13 <hub_cap> we are gonna discuss mordred's hatred for open discussion
20:48:18 <ttx> #link https://wiki.openstack.org/wiki/PTL_Elections_Fall_2013
20:48:22 <dmakogon_> hub_cap: SergeyLukjanov: more than +100500 for shared lib for clustering
20:48:23 <ttx> #link https://wiki.openstack.org/wiki/TC_Elections_Fall_2013
20:48:33 <ttx> I'm looking for volunteers for filling the election official roles, especially for the PTL election
20:48:43 <ttx> Can be difficult since you should not be running for any of the PTL positions to be a PTL election official...
20:49:02 <SergeyLukjanov> are there any other questions about savanna? (we've discussed only one not main feature of savanna…)
20:49:11 <russellb> dang that's a lot of PTL positions :-)
20:49:17 <russellb> elections like woah
20:49:36 <ttx> annegentle: to answer your question... we won't start renewing the TC until October 4 so we still have a few meetings dates possible
20:49:46 <russellb> SergeyLukjanov: i think we need to continue on the ML with what came up so far, and we'll continue the discussion next week
20:49:48 <hub_cap> russellb: openstack is growing up :) or out!
20:49:57 <annegentle> ttx: ok thx
20:50:48 <ttx> the TC elections run after the PTL elections, which increases the odds recently-elected PTLs would get an electoral boost in the TC election (feature, not bug)
20:51:13 <mordred> this may not really be a TC thing - but since you're all here - we're going to make a stronger push to finish moving to testr next cycle - because there are some things we want to do with subunit streaming processing in the gate (which will result in quicker response time to failures and shorter gate resets)
20:51:24 <ttx> In other news, TC members should review the initial governance repo commit at:
20:51:29 <ttx> #link https://review.openstack.org/#/c/44489/
20:51:44 <ttx> once that is set we will use that for voting
20:52:09 <mikal> Yay!
20:52:40 <ttx> Anything else, anyone ?
20:52:55 <markwash> testr should have coverage, and I *want* to help :-)
20:53:14 <ttx> depending on how LinuxCon/CloudOpen gets finally scheduled and how many people are stuck, we may cancel next week's meeting
20:53:48 <markmcclain1> Won't a good portion of us be there?
20:54:00 <russellb> how do you people attend so many conferences and still get work done?
20:54:03 <ttx> I count at least 3
20:54:09 <ttx> russellb: work ?
20:54:12 <markwash> russellb: you answered your own question methinks
20:54:22 <russellb> heh.
20:54:24 <mordred> russellb: I usually put in a full day's work while at conferences - it's just at different/odd times
20:54:28 <mordred> markwash: ++
20:54:59 <clarkb> testr coverage?
20:55:21 <clarkb> I feel like the word coverage is far too overloaded here. What are we talking about?
20:55:24 <markwash> clarkb: coverage measurements, that is
20:55:28 <gabrielhurley> russellb: also relevant are your definitions of "work" and "done"
20:55:34 <markwash> like, using testr tests, I can measure the code coverage of my unit tests
20:55:37 <ttx> so the final discussion/vote on savanna might just wait for the September 24 meeting.
Skip or notskip will be discussed on the TC mailing-list
20:55:42 <clarkb> markwash: we have it doing that today
20:55:50 <markwash> clarkb: say what? sorry I'm out of date
20:56:00 <clarkb> markwash: that was one of the requirements to use testr with nova
20:56:15 <mordred> markwash: yup. we're all fancy like that
20:56:20 <clarkb> markwash: basically we swap in coverage.py for python and run the test runners that way then combine coverage afterwards
20:56:20 <markwash> <3
20:56:22 <clarkb> works great
20:56:27 <mordred> (sorry, I was ++ing "want to help")
20:56:27 <ttx> I even count 4, notmyname will be there
20:56:34 <notmyname> yes
20:56:48 <ttx> like I said on another thread, more PTL/TC members talking there than at an openstack summit :/
20:56:59 <mordred> yah. that makes me sad
20:57:18 <mordred> I spend most of my year talking at conferences, and I've only ever given half of one talk at an openstack one
20:57:28 <mordred> kinda funny that
20:57:43 <gabrielhurley> people only *think* they want to hear from me. I'll show them... I'll show them all!
20:57:49 <notmyname> are there project updates during the conference this time?
20:58:00 <mordred> notmyname: post conference webinar things
20:58:00 <notmyname> if not, that removes all the PTL talks
20:58:04 <ttx> notmyname: I think they want to do the webinar thing again
20:58:07 <notmyname> mordred: should be both
20:58:21 <russellb> it's painful to do it at the conference
20:58:21 <mordred> notmyname: the request came through to not do them in person because of time
20:58:26 <russellb> no time to let the stuff soak
20:58:27 <ttx> notmyname: I placed a "TC panel" proposal so that we appear somewhere
20:58:31 <russellb> haven't even finished talking yet and you have to present it?
20:59:04 <mordred> russellb: I like giving the infra update at the start of the summit - more time for beer that way :)
20:59:07 <russellb> i'm glad we're not doing it there (with the current layout anyway)
20:59:22 <russellb> if we split or offset them, sure :)
20:59:26 <notmyname> it's an opportunity to talk about what's been happening, perhaps a couple of ideas that were talked about, and to brag on contributors.
20:59:28 <ttx> we are looking into a one-day offset for the next one
20:59:39 <ttx> hopefully more for the one after that
20:59:41 <russellb> ttx: that's a start
20:59:43 <mordred> ++
20:59:53 <russellb> whatever we can get is ++ from me
20:59:57 <ttx> like conf mon-Thy and design cummit tue-fri
21:00:06 <ttx> err conf mon-thu
21:00:17 <ttx> and summit*
21:00:21 <ttx> and #endmeeting
21:00:24 <ttx> #endmeeting