18:00:52 <SergeyLukjanov> #startmeeting sahara
18:00:52 <openstack> Meeting started Thu Oct 30 18:00:52 2014 UTC and is due to finish in 60 minutes. The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:00:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
18:00:56 <openstack> The meeting name has been set to 'sahara'
18:00:56 <SergeyLukjanov> ping sahara folks
18:01:14 <elmiko> o/
18:01:19 <tmckay> SergeyLukjanov, crobertsrh has lost power, and mattf is out
18:01:25 <aignatov> o/
18:01:26 <jodah> o/
18:01:29 <SergeyLukjanov> tmckay, ack
18:01:32 <NikitaKonovalov> o/
18:01:52 <sreshetnyak> o/
18:02:36 <SergeyLukjanov> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:02:44 <SergeyLukjanov> #topic sahara@horizon status (croberts, NikitaKonovalov)
18:03:08 <SergeyLukjanov> NikitaKonovalov you're the only presenter for this topic
18:05:04 <SergeyLukjanov> okay, let's move on
18:05:08 <SergeyLukjanov> #undo
18:05:09 <openstack> Removing item from minutes: <ircmeeting.items.Topic object at 0x2f5bdd0>
18:05:16 <SergeyLukjanov> #topic News / updates
18:05:18 <SergeyLukjanov> folks, please
18:05:29 <SergeyLukjanov> I've been mostly working on preparing for summit
18:05:56 <elmiko> i'm making good progress on bug#1306505, i've also been talking with the OSSG folks and the API working group. i've got some ideas about security and api to talk about at summit.
18:06:16 <SergeyLukjanov> #1306505
18:06:22 <elmiko> oh, also trying to work up a spec for api v2
18:06:36 <SergeyLukjanov> elmiko, that's great
18:06:47 <jodah> elmiko i'd love to have a look at the spec when it's ready
18:07:20 <elmiko> SergeyLukjanov: i think there is some interesting activity happening surrounding api standardization
18:07:23 <tmckay> interesting question on openstack-dev about dockerizing Sahara. We should discuss at summit whether to make dockerized Sahara a feature for kilo
18:07:26 <elmiko> jodah: definitely
18:07:30 <tmckay> not sure what the issues may be
18:07:45 <sreshetnyak> I'm working on various bug fixes
18:07:50 <NikitaKonovalov> SergeyLukjanov, my Internet connection is dropping frequently, status for Sahara UI is good, there are a few minor comments
18:07:51 <SergeyLukjanov> tmckay, dockerized sahara itself or dockerized hadoop cluster?
18:08:09 <jodah> SergeyLukjanov tmckay both ideally :)
18:08:17 <tmckay> SergeyLukjanov, unclear from the email, I asked for clarification
18:08:25 <NikitaKonovalov> I'm also trying to keep backports up-to-date for stable/juno
18:08:30 <tmckay> jodah, ++, might be two different features
18:08:34 <elmiko> SergeyLukjanov, tmckay, i think the email was talking about hadoop nodes being containerized
18:09:01 <SergeyLukjanov> elmiko, tmckay, okay, so, anyway both of the dockerizations are interesting
18:09:02 <tmckay> I think so too, the specific question was about changing hostnames
18:09:05 <tmckay> so I think it was nodes
18:09:19 <jodah> dockerizing sahara is perhaps more about dockerizing openstack/devstack entirely
18:09:20 <SergeyLukjanov> tmckay, yeah, sounds like that (just saw an email)
18:09:35 <elmiko> i think we have 2 paths for containers; 1. sahara controller container, should definitely watch the kolla project, 2. node containers, should definitely watch nova-docker
18:10:11 <tmckay> so the issue in question is, how do we get around docker hostname change restrictions in launched nodes
18:10:13 <elmiko> i listened in on the kolla meeting earlier this week, and they are preparing to add a sahara container
18:10:28 <tmckay> could just be a single issue, but I am guessing there might be other docker issues once we get into it
18:10:55 <tmckay> I'd like to learn more about docker anyway :) EDP is so yesterday :)
18:10:59 <elmiko> tmckay: yea, we'll have to visit how we use the node hostname in the controller
18:11:25 <tmckay> elmiko, I'm wondering if we can switch most references to ip address, and defer hostname
18:11:35 <tmckay> maybe that would help
18:11:39 <tmckay> anyway ... summit
18:11:45 <tmckay> we won't solve here
18:11:51 <elmiko> tmckay: yea, that might work, or do something with container names.
18:11:58 <SergeyLukjanov> tmckay, elmiko, not always, some services require dns, as I remember zookeeper requires dns
18:12:41 <elmiko> SergeyLukjanov: dns + docker, i know it's a sticky issue. we've had some folks run into issues with reverse-dns lookup and containers
18:12:55 <tmckay> hey, my service provider just hooked me up for international. So I can text you all in Paris :) Account was too new, didn't show up on the web page ...
18:13:09 <elmiko> tmckay: lol, nice
18:13:36 * SergeyLukjanov planning to buy some local pre-paid sim card
18:13:41 <tmckay> oh, other issue for me ....
18:14:15 <tmckay> playing with Scala class that can be imported by Spark jobs for convenience in setting hadoop configs for swift access (lots of names in that sentence)
18:14:16 <SergeyLukjanov> do we need to discuss something about design summit?
18:14:32 <tmckay> haven't gotten too far yet, beyond learning enough Scala to do it :)
18:15:05 <tmckay> SergeyLukjanov, oh, that's a good idea on the sim card
18:15:27 <SergeyLukjanov> tmckay, lebara was good enough last time I was in paris
18:15:41 <elmiko> SergeyLukjanov: i'm happy with the session text for security stuff, i talked with OSSG and i think we'll have good communication/cooperation from them when needed.
18:15:45 <tmckay> SergeyLukjanov, I missed sign up for the Mirantis party. Can you get me in? :)
18:16:05 <SergeyLukjanov> elmiko, great!
18:16:14 <SergeyLukjanov> tmckay, I think we'll be able to do it
18:16:20 <tmckay> I may try to clean up the EDP pad a little, but the basic topics are there
18:16:41 <SergeyLukjanov> tmckay, okay
18:16:50 <SergeyLukjanov> #topic Design Summit @ Paris
18:16:54 <tmckay> I think maybe there is not too much to do for EDP in Kilo
18:16:58 * SergeyLukjanov need to update integration part
18:17:06 <tmckay> but, I could be wrong
18:17:12 <aignatov> tmckay: thank you for composing awesome summary for me ;)
18:17:16 <SergeyLukjanov> tmckay, it's a good question
18:17:33 <tmckay> aignatov, you're welcome
18:17:35 <SergeyLukjanov> tmckay, probably it's time to expose supported job types in API and remove hardcode from Horizon
18:17:50 <elmiko> SergeyLukjanov: +1
18:18:00 <tmckay> SergeyLukjanov, yes. I think there is cleanup we can do, but I'm not sure there are big, new features
18:18:11 <SergeyLukjanov> tmckay, yeah
18:18:58 <tmckay> alazarev spearheaded very nice refactoring, that makes it easy to add storm, fake plugin, spark, etc etc
18:19:11 <aignatov> but it could all be discussed in your section
18:19:21 <aignatov> I think we can talk about not only the kilo part
18:19:27 <aignatov> but about edp future
18:19:38 <SergeyLukjanov> aignatov, ++
18:19:48 <elmiko> will we have time outside the meetup session to talk about api v2?
18:19:48 <tmckay> agreed
18:19:59 <tmckay> sure, in the pod, at the parties
18:20:01 <aignatov> elmiko: yes
18:20:08 <tmckay> during lunch, breakfast
18:20:11 <elmiko> lol
18:20:24 <elmiko> i remember the parties from icehouse, no work is getting done there ;)
18:20:29 <SergeyLukjanov> elmiko, we could talk about it a lot at the meetup
18:20:52 <SergeyLukjanov> and some folks are unable to chair sessions after such parties ;)
18:21:02 <elmiko> lol!
18:21:16 <tmckay> hmm, I think not too many are leaving on Friday so we can go all night if we have to
18:21:23 <elmiko> i think there are some interesting ideas we should consider implementing, like async object endpoints
18:21:57 <tmckay> we really should add scaling to the CLI
18:22:01 <tmckay> it's still missing, I believe
18:22:14 <tmckay> no technical reason, just not done
18:22:18 <alazarev_> tmckay, +1
18:22:41 <SergeyLukjanov> tmckay, +1
18:22:50 <aignatov> +1
18:22:52 <SergeyLukjanov> and probably clean up the cli implementation
18:22:53 <alazarev_> tmckay, there is plenty of work that can be done in python client
18:23:00 <SergeyLukjanov> yeah
18:23:02 <tmckay> which pad should we add that to? Or just make sure there is still a blueprint?
18:23:15 <tmckay> maybe we should start a client pad
18:23:18 <tosky> migrate to the unified client, if the project is still alive?
18:23:35 <tmckay> tosky, unfamiliar with unified client
18:23:46 <SergeyLukjanov> tmckay, it sounds like client is one more very important area to invest in Kilo
18:24:05 <tosky> tmckay: https://wiki.openstack.org/wiki/OpenStackClient
18:24:06 <elmiko> something that was brought up, that would really help with cli, is having more documented json templates
18:24:21 <jodah> documenting each parameter for each resource
18:24:23 <tmckay> +1 I think we have an "extend and improve" dev cycle, just make everything better, fill in gaps
18:24:36 <tmckay> maybe not too many headline features
18:24:51 <SergeyLukjanov> tmckay, yup, it'll be great to extend and improve all the things ;)
18:24:56 <tmckay> but, we have to have some headlines, or people get bored :)
18:25:30 <SergeyLukjanov> #topic Open discussion
18:26:17 <SergeyLukjanov> we have a question raised by our new contributors from Shanghai - to make meeting time more friendly for Asia tz
18:27:10 <SergeyLukjanov> the easiest solution is to hold every other week's meeting at a different time
18:27:22 <tmckay> good idea
18:27:25 <SergeyLukjanov> I think i'll follow up on it on the mailing list
18:27:35 <tmckay> iirc, Asia is +13 or +14 from EST
18:28:06 <tosky> Asia what timezone?
18:28:29 <tmckay> well, I meant Japan :)
18:28:36 <tmckay> but there is more :)
18:29:01 <alazarev_> the main question I have, will they regularly attend the meeting?
18:29:29 <SergeyLukjanov> it's a good question too
18:29:41 <alazarev_> I have never seen them here
18:29:51 <dmitryme> speaking about shanghai, it has timezone UTC/GMT +8 hours
18:29:52 <jodah> maybe they're asleep? :)
18:30:10 <SergeyLukjanov> yeah, I think so
18:30:13 <alazarev_> if this is not important enough to attend meeting one time at 3AM...
18:30:45 <alazarev_> will they attend _regularly_ in convenient time?
18:31:07 <SergeyLukjanov> who knows
18:32:11 <crobertsrh> What is the proposed convenient time?
18:32:31 <jodah> Perhaps ask them for a proposed convenient time range
18:32:41 <aignatov> crobertsrh: you found your power? ;)
18:32:49 <elmiko> lol
18:32:50 <crobertsrh> Yes :)
18:34:36 <tmckay> crobertsrh, blizzard there?
18:34:43 <tmckay> It is October, after all
18:34:51 <crobertsrh> Nothing but sunshine today. I have no idea what happened
18:35:01 <elmiko> weird...
18:35:17 <elmiko> tmckay: actually, the forecast for halloween is snow
18:35:36 <tmckay> same here I think, but only in the mountains
18:35:50 <jodah> I'm new to the project, trying to get familiar with things :) I was reading the storm blueprint and it got me wondering about what is or isn't in scope for Sahara? What sorts of things are fair game to provision? Because I know Heat sees some of this stuff as potentially its area.
18:36:22 <SergeyLukjanov> jodah, sahara is data processing as a service
18:36:57 <SergeyLukjanov> jodah, so, we're doing not only provisioning but operations like correct scaling/decommissioning, configuration + EDP
18:37:19 <jodah> sure. EDP applies to hadoop though, not really storm
18:37:33 <jodah> ...or stream processing, at least looking at the EDP API
18:37:43 <SergeyLukjanov> we'd like to apply it to Storm too
18:37:54 <jodah> for deploying topologies and such?
18:37:57 <SergeyLukjanov> jodah, currently it's applied to both Hadoop and Spark
18:38:12 <tmckay> jodah, why not storm? You can run jobs in storm, right?
18:38:18 <SergeyLukjanov> jodah, storm topology == hadoop job execution
18:38:44 <jodah> sure
18:39:03 <tmckay> yeah, as long as you have data to process, and you can start it and cancel it and check its progress, it qualifies
18:39:16 <tmckay> EDP has only "run, cancel, status" at a high level
18:39:30 <jodah> the concepts somewhat line up
18:39:32 <tmckay> and "input" and "output", or an argument list
18:39:48 <jodah> aside from storm though, i'm curious just generally - about the scope of sahara
18:39:53 <tmckay> run => "go" or "start"
18:40:04 <jodah> is it to encompass any input->process->output data pipeline?
18:40:35 <SergeyLukjanov> jodah, it's impossible to cover an infinite list of data processing frameworks
18:40:52 <crobertsrh> I think to encompass "many" would be a better way to say it
18:40:53 <SergeyLukjanov> so, for now we're mostly hadoop, hadoop-like and very popular frameworks :)
18:41:02 <SergeyLukjanov> crobertsrh, ++
18:41:03 <jodah> the EDP resources are obviously very hadoop specific - ex: job-binaries, whereas storm has its own specific concepts.
18:41:20 <jodah> i'm sure they can be generalized, i'm just looking at it right now
18:41:29 <tmckay> but, in general, 1) provision a cluster for analytics and 2) simple interface to run analytics
18:42:43 <jodah> fair enough :)
18:42:49 <tmckay> jodah, I'll read up on storm, not too familiar. It would be an interesting look at the Sahara concepts, maybe expose unintended biases
18:43:23 <jodah> i just came from doing a bunch of stream processing / messaging work, including with storm, but hadoop is a newer area to me :)
18:43:32 <jodah> related question, If we're provisioning Hadoop, and Storm, should we be provisioning ZooKeeper clusters separately (since they are often shared across services in production)? Does this get a bit too close to what heat can do?
18:44:04 <tmckay> jodah, I am also thinking about mesos, and a cluster that maybe has multiple applications running on it
18:44:18 <jodah> exactly
18:44:45 <jodah> similar with ZK - we share our ZK clusters for kafka and storm. i've seen deployments that share them for hadoop as well, since the throughput is so low
18:45:38 <tmckay> jodah, will you be at summit?
18:45:44 <jodah> unfortunately i will not
18:45:51 <tmckay> pity
18:45:53 <jodah> i'm sure i will at the following summit though
18:45:55 <tmckay> :)
18:46:05 <tmckay> okay, great
18:46:17 <jodah> i just started working on some of this stuff with HP cloud. i'll be around a lot :)
18:46:17 <elmiko> fwiw i think it's an interesting idea we should investigate
18:47:04 <jodah> with ZK in particular though, does that start to overlap too much with heat?
18:47:30 <jodah> provisioning things that don't really get re-configured much or scaled up/down
18:48:16 <tmckay> well, sahara uses heat under the hood (can use heat)
18:48:16 <SergeyLukjanov> jodah, I think it'll look like we could specify an external ZK cluster for the clusters provisioned by sahara
18:48:31 <jodah> that would make sense
18:48:33 <tmckay> so, if we have things that just map through to heat, that will be fine imho
18:48:43 <SergeyLukjanov> yeah
18:48:53 <SergeyLukjanov> and due to the fact that we use heat
18:49:00 <jodah> or somehow delegate to a heat stack to stand up the ZK cluster that might be needed for hadoop/storm
18:49:01 <SergeyLukjanov> and sahara could be used from heat
18:50:28 <jodah> One more thought on storm - aside from lifecycle management, scaling up/down, etc., has there been any talk about doing more stream processing related abstractions on top of storm/spark-streaming, similar to what other services like kinesis do?
18:51:09 <SergeyLukjanov> jodah, it's a topic to talk about
18:51:11 <jodah> it's hard to generalize storm since the concepts are so specific, but perhaps a few things could be done
18:51:30 <tmckay> jodah, have you talked to tellesnobrega? He is doing the storm plugin
18:51:56 <jodah> no, but i'd like to
18:52:38 <tmckay> jodah, he's been doing most of the work. Maybe you can collaborate
18:52:55 <tmckay> most may even be "all"
18:52:56 <jodah> sounds good
18:53:19 <jodah> since i'm new to the project i'm not sure which areas i'll be asked to focus on, but that is certainly possible :)
18:54:52 <tmckay> 5 minutes folks, anything else?
18:54:59 * tmckay pretends to be Sergey :)
18:55:13 <SergeyLukjanov> if not we could end it 5 mins earlier and have extra coffee
18:55:22 <tmckay> ++
18:55:27 <elmiko> +1
18:55:51 <SergeyLukjanov> #endmeeting