17:59:54 <SergeyLukjanov> #startmeeting sahara
17:59:54 <openstack> Meeting started Thu Jul 9 17:59:54 2015 UTC and is due to finish in 60 minutes. The chair is SergeyLukjanov. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:59:55 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:59:58 <openstack> The meeting name has been set to 'sahara'
18:00:00 <alazarev> o/
18:00:00 <elmiko> heyo/
18:00:15 <SergeyLukjanov> #link https://wiki.openstack.org/wiki/Meetings/SaharaAgenda
18:00:25 <weiting> Hi
18:00:28 <SergeyLukjanov> let's wait for a few mins
18:00:30 <alazarev> SergeyLukjanov, will you chair?
18:00:32 <SergeyLukjanov> for other folks
18:00:56 <SergeyLukjanov> alazarev, yeah, the PTLs overview will be one hour later ;)
18:01:02 <SergeyLukjanov> #chair alazarev
18:01:03 <openstack> Current chairs: SergeyLukjanov alazarev
18:01:08 <tosky> o/
18:01:14 <SergeyLukjanov> to be sure that someone will end the meeting :)
18:01:27 <alazarev> SergeyLukjanov :)
18:01:43 <egafford> \o
18:01:47 <pino|work> o/
18:01:50 <elmiko> updated the agenda to remove last weeks item about spark
18:01:59 <SergeyLukjanov> elmiko, yeah, thx
18:02:20 <SergeyLukjanov> crobertsrh, NikitaKonovalov ping
18:02:24 <SergeyLukjanov> #topic sahara@horizon status (crobertsrh, NikitaKonovalov)
18:02:30 <crobertsrh> howdy
18:02:34 <SergeyLukjanov> #link https://etherpad.openstack.org/p/sahara-reviews-in-horizon
18:02:38 <SergeyLukjanov> hey :)
18:03:03 <crobertsrh> We've had a few changes go through, not a ton of progress
18:03:09 <crobertsrh> The move to contrib patch is up
18:03:30 <crobertsrh> hopefully, once the move is done, we can get patches through a bit quicker (after we rebase them, of course)
18:03:41 <SergeyLukjanov> heh, yeah, hopefully
18:03:53 <SergeyLukjanov> anything else re horizon?
18:04:43 <tosky> crobertsrh: so is the sahara part fully separated, from the source point of view? Interesting
18:05:06 <tosky> it seems that the existing (two) selenium integration tests are still happy
18:05:07 <SergeyLukjanov> tosky, it seems still really mixed
18:05:15 <tosky> oh, I see
18:05:15 <crobertsrh> It's mostly still the same imho
18:05:20 <SergeyLukjanov> crobertsrh, ++
18:05:28 <SergeyLukjanov> but it's logically separated
18:06:21 <crobertsrh> The move to contrib patch is here...
18:06:23 <crobertsrh> #link https://review.openstack.org/#/c/197363/
18:06:35 <SergeyLukjanov> okay, thx
18:06:45 <SergeyLukjanov> #topic News / updates
18:06:52 <sreshetnyak> o/
18:06:57 <egafford> crobertsrh: "blueprint plugin-sanity" is excellent.
18:07:25 <vgridnev> i am working for several bugs and ntp support in plugins
18:07:27 <crobertsrh> how could anyone -1 a bp like that
18:07:34 <elmiko> i'm working on the conversion to use keystone sessions for authentication. also, writing up an abstract for tokyo on using spark with sahara to process streaming logs from openstack services.
18:08:00 <sreshetnyak> no updates from me
18:08:21 <egafford> Working on a few changes in parallel (specs for two stages of manila integration for binary storage, trusts for long-running clusters to permit cleanup.) On the latter, question: is alazarev still on leave?
18:08:27 <esikachev> i am working on the cluster-verification checks
18:08:32 <alazarev> I'm back from parental leave starting from today
18:08:47 <egafford> alazarev: Cool; I have a question for you then.
18:08:54 <egafford> (Will wait for discussion.)
18:09:00 <tosky> working on the small change on scenario tests (configuration files) now the spec is approved, it will be ready soooon
18:09:14 <huichun> working on recurrence schedule edp
18:09:21 <alazarev> egafford, yes, I still remember something about work :)
18:09:24 <SergeyLukjanov> tosky, cool, looking forward for the scenario templates
18:09:32 <huichun> and also with suspend resume edp jobs
18:10:05 <SergeyLukjanov> huichun, great, with this stuff we'll have some kind of additional job lifecycle management
18:10:17 <SergeyLukjanov> #topic Open discussion
18:10:37 <huichun> SergeyLukjanov: Hi Sergey, do we need suspend and resume edp job feature?
18:10:41 <tmckay> so no updates for me, I have been on PTO for a while :)
18:11:01 <NikitaKonovalov> o/ I've been working on HDP 2.2 plugin
18:11:08 <tmckay> time to catch up on reviews
18:11:15 <NikitaKonovalov> mainly on HA and EDP stuff
18:11:16 <SergeyLukjanov> huichun, it's a good q., do you have the use case for it?
18:11:18 <huichun> tmckay: lots of edp enhancement spec needs your review ^_^
18:11:36 <tmckay> huichun, noted, I'll try hard to review them
18:11:48 <tmckay> long vacation
18:11:51 <alazarev> huichun, I'll join to review too
18:12:14 <egafford> alazarev: I note that in your spec for trusts to enable cluster cleanup, you suggest that trust ids should be stored in memory on the context. However, in a distributed Sahara install, this will mean that only one server will be able to cleanup any one cluster. That means, in turn, that we will always need to run that cleanup job on every node, or we'll need to find a solution for distributing the trust ids among engine nodes
18:12:57 <huichun> SergeyLukjanov: for example, if one job has many steps, user want to suspend this job when finish the first step, to check if the log or data is right, then resume this job
18:13:17 <huichun> alazarev: thx ^_^
18:13:49 <alazarev> egafford, it was for "create/scale cluster" operation. The whole task is done by one engine now.
18:15:03 <alazarev> egafford, for clean we need to store it in DB, what spec are you referencing?
18:15:40 <egafford> alazarev: (Aggregating links)
18:16:44 <egafford> alazarev: https://bugs.launchpad.net/sahara/+bug/1468722 covers a bug with periodic cluster cleanup (no trust-clusters cannot be cleaned up.)
18:16:44 <openstack> Launchpad bug 1468722 in Sahara "Periodic cleanup of non-final clusters moves the cluster into Error instead of removing it" [High,New] - Assigned to Ethan Gafford (egafford)
18:17:45 <egafford> alazarev: When we discussed this, you pointed me to the spec: http://specs.openstack.org/openstack/sahara-specs/specs/kilo/cluster-creation-with-trust.html
18:18:20 <egafford> The line "Trust is stored in memory only (probably context is the good place to put it). No serialization to DB." is a bit of a problem for the periodic cluster cleanup job.
18:18:42 <alazarev> egafford, I see, this spec is for long operation inside one engine, it will not work "as is" for clean up
18:19:23 <alazarev> egafford, but they both can use the same mechanism (e.g. with trust stored in DB)
18:20:12 <alazarev> egafford, or can use different (e.g. because long operation doesn't need trust in DB), need to think more
18:20:39 <alazarev> egafford, do you have thoughts about?
18:21:37 <egafford> alazarev: Right; that was my thought. I'll submit a patch to the spec for review, then, and we can talk about it there. I've got an impl nearly complete. Well, I don't actually see a great deal of sec difference between storing a trust for a transient cluster and storing a temporary trust for a long-running cluster. Either could contain extremely sensitive data.
18:22:25 <elmiko> i dont think storing the trust id is necessarily a sec concern
18:22:37 <elmiko> you still need a valid auth token to do anything useful
18:22:57 <egafford> I think if we're okay with one, the other makes sense. I can see an argument the other way (a long-running cluster trust allows a malicious user to create a backdoor for attack for longer,) but. elmiko: That makes sense.
18:23:38 <egafford> Okay, I have enough information to keep working. Thanks alazarev, elmiko.
18:24:55 <elmiko> the other option would be for each server instance to generate a trust with similar permissions
18:25:48 <elmiko> then each instance would have separate permissions to remove the cluster, or whatever operation is needed
18:26:11 <egafford> Sure; just create a trust on demand if you need one but don't have it.
18:26:17 <alazarev> elmiko, what do you mean by "server"?
18:26:29 <elmiko> i meant, -engine server
18:26:58 <elmiko> does that make sense?
18:26:59 <alazarev> to create trust you need valid token
18:27:15 <elmiko> yea
18:27:27 <alazarev> you can't create at "clean up" time
18:27:38 <egafford> Right, and we only have a valid token for the tenant plane while we're creating the nodes.
18:28:00 <elmiko> i meant more that each sahara instance could create a trust, then they each would have permissions on the cluster
18:28:18 <elmiko> that way there is no need to share a trust between sahara instances
18:28:42 <SergeyLukjanov> elmiko, what if the new -engine added after the last op on a cluster?
18:28:44 <elmiko> each could contain a separate trust id in their context
18:29:00 <SergeyLukjanov> (added == reloaded for example)
18:29:04 <elmiko> SergeyLukjanov: good point, that could get tricky coordinating the actions
18:29:17 <elmiko> maybe this is not a good way to approach it
18:29:56 <egafford> elmiko: Yeah, I don't think we have any other fanout-type messaging tricks, and we probably don't want them unless we *really* need them.
18:30:08 <elmiko> right
18:30:30 <elmiko> probably storing to db is the easiest solution
18:30:39 <alazarev> elmiko, api doesn't know how many engines we have, it is not possible to create trusts for all of them
18:30:51 <egafford> elmiko: So it really comes down to "is it safe to store trust_ids in the database?" If the answer is broadly yes, then all is pretty well.
18:31:05 <elmiko> alazarev: ah, interesting. i did not know that
18:31:32 <elmiko> egafford: i think it is, but i can research a little more.
18:31:43 <egafford> alazarev: There are AMQP tricks that could conceivably deal with that (fanout messages) but they're usually troublesome.
18:31:45 <elmiko> iirc we store the proxy domain trusts in the db temporarily
18:32:04 <egafford> alazarev: Haven't looked into oslo_messaging support of that featureset.
18:32:15 <egafford> alazarev: (And I hope not to. :) )
18:32:36 <elmiko> another option would be to use barbican for external secret storage
18:32:52 <elmiko> (if needed)
18:32:57 <egafford> elmiko: That is not a bad idea at all if we need it, yeah.
18:33:21 <alazarev> barbican could be optional only, I think
18:33:25 <elmiko> but i think its generally safe to store a trust id
18:33:35 <alazarev> we don't want to depend on barbican for now
18:33:45 <elmiko> alazarev: yea, we'd have to use the castellan approach
18:33:46 <egafford> alazarev: +1 optional Barbican.
18:34:42 <elmiko> it would be similar to what we're proposing for secret storage now. use the db as a default with the option to improve to barbican
18:34:59 <egafford> elmiko: Right; we could piggyback on the same interface.
18:35:09 <elmiko> yea
18:35:41 <elmiko> speaking of which, #link https://review.openstack.org/#/c/179393/ :cough:
18:35:55 <elmiko> =)
18:36:22 <alazarev> elmiko, will review
18:36:26 <elmiko> thanks
18:36:41 <tmckay> elmiko, you should take something for that cough ;-)
18:36:57 <elmiko> the only cure now is more +1/+2 ;)
18:37:06 <egafford> Okay, so it sounds like the sensible plan is: 1) I propose the spec change to allow trust storage for long-running clusters, 2) elmiko researches whether trust_ids can be stored in the DB, 3) someone writes up a spec to alter our schema to store all trust ids (transient and long-running) through the improved secret storage module, 4) profit.
18:37:21 <egafford> Usually profit wants to be 3, but sometimes it takes a while.
18:37:28 <elmiko> egafford: sounds good to me
18:37:43 <egafford> elmiko: Cool. Thanks again.
18:40:01 <SergeyLukjanov> anything else to chat today?
18:40:07 <alazarev> egafford, sounds good
18:40:18 <egafford> alazarev: Excellent.
18:40:57 <pino|work> SergeyLukjanov: a look at pending reviews? ;)
18:41:16 <huichun> egafford: hi Ethan, recurrence edp spec has been updated according to your last comment,waiting your review^_^
18:41:47 <SergeyLukjanov> pino|work, in my backlog, should actively review tomorrow morning (canceled all mostly all meetings :) )
18:41:53 <egafford> huichun: I saw; I thought about reviewing, but then thought "you know, other people really need to review these too; they're incredibly important."
18:42:38 <egafford> huichun: It sounds like some other folks (tmckay, alazarev) have signed up to review as well; I'll come back to it once it gets some additional eyes (which I hope it does soon; it's incredibly important. :) )
18:43:04 <pino|work> SergeyLukjanov: thanks!
18:43:06 <tmckay> yes, very interested, but I was away!
18:43:08 <huichun> nice^_^
18:43:28 <egafford> tmckay: No judgment at all.
18:43:29 <elmiko> keystone session spec could use more eyes as well ;)
18:43:36 * tmckay makes todo list in email, marks urgent
18:48:40 <huichun> tmckay: oh,one more thing, do we need update current Oozie client call from v1 to V2?
18:48:59 <tmckay> huichun, hmmm, I am unaware of the implications
18:49:50 <alazarev> huichun, why not? I think it should be pretty easy, right?
18:50:01 <huichun> yes,easy
18:50:57 <huichun> and we have new feature can be added into Sahara oozie engine by v2
18:51:12 <tmckay> huichun, okay, sounds good to me
18:51:28 <huichun> alazarev: i will write a spec to do this
18:51:38 <tmckay> huichun, maybe we don't need a blueprint, but a wishlist bug outlining what the new benefits/features are
18:51:45 <tmckay> or a spec :)
18:52:04 <alazarev> +1 on spec, this is not a bug
18:52:15 <tmckay> that's what wishlist is for :)
18:52:24 <huichun> tmckay: agree
18:52:32 * SergeyLukjanov disappearing for the ptl overview recording
18:52:42 <SergeyLukjanov> alazarev, don't forget to end the meeting please
18:52:53 <alazarev> SergeyLukjanov, ok
18:53:07 <alazarev> do we have anything more to discuss?
18:53:27 <huichun> alazarev: waiting for your review comments on recurrence edp and suspend resume edp ^_^
18:54:03 <huichun> so that I can start up with coding work ^_^
18:54:34 <tmckay> huichun, heh
18:54:59 * tmckay sometimes starts coding before spec is done, ssshhh
18:55:34 * egafford is both shocked and appalled at tmckay, and almost always starts coding before spec is done.
18:55:44 <huichun> ^_^
18:56:02 <tmckay> iterative refinement, it's good
18:56:15 <huichun> nice
18:56:45 <egafford> In huichun's case, admittedly, there are more ways he can go than on some specs. Having community signoff first is nice.
18:57:07 <elmiko> yea, this recurrence spec could be implemented a couple different ways
18:57:50 <tmckay> like, set versus list?
18:58:04 <tmckay> or you mean at a higher level? ;-)
18:58:16 <elmiko> yea, higher level is definitely a possibility
18:58:17 <huichun> yes, so any way may rewrote code, not just refinement ^_^
18:58:31 <elmiko> i guess we'll discuss it on the review though
18:58:37 <alazarev> ok, it looks we are done for today
18:58:43 <alazarev> #endmeeting