Wednesday, 2013-09-04

*** IlyaE has joined #savanna00:03
*** IlyaE has quit IRC00:06
*** NikitaKonovalov has joined #savanna00:22
*** NikitaKonovalov has quit IRC00:26
*** IlyaE has joined #savanna00:30
*** sacharya has quit IRC00:44
*** NikitaKonovalov has joined #savanna00:53
*** nos_ has joined #savanna00:54
*** nosnos has joined #savanna00:55
*** NikitaKonovalov has quit IRC00:57
*** sacharya has joined #savanna01:10
*** NikitaKonovalov has joined #savanna01:23
*** NikitaKonovalov has quit IRC01:28
*** NikitaKonovalov has joined #savanna01:54
*** NikitaKonovalov has quit IRC01:59
*** NikitaKonovalov has joined #savanna02:25
*** NikitaKonovalov has quit IRC02:30
*** nosnos_ has joined #savanna02:47
*** nosnos has quit IRC02:49
*** nosnos_ has quit IRC02:55
*** nosnos has joined #savanna02:55
*** NikitaKonovalov has joined #savanna02:56
*** NikitaKonovalov has quit IRC03:01
*** NikitaKonovalov has joined #savanna03:27
*** NikitaKonovalov has quit IRC03:31
*** NikitaKonovalov has joined #savanna03:58
*** NikitaKonovalov has quit IRC04:03
*** IlyaE has quit IRC04:07
*** tmckay has quit IRC04:21
*** IlyaE has joined #savanna04:21
*** SergeyLukjanov has joined #savanna04:25
*** NikitaKonovalov has joined #savanna04:29
*** tmckay has joined #savanna04:31
*** NikitaKonovalov has quit IRC04:33
*** sacharya has quit IRC04:38
*** nadya has joined #savanna04:43
*** akuznetsov has joined #savanna04:57
*** nadya has quit IRC04:58
*** NikitaKonovalov has joined #savanna05:00
*** NikitaKonovalov has quit IRC05:04
*** IlyaE has quit IRC05:05
*** nadya has joined #savanna05:25
*** akuznetsov has quit IRC05:26
*** akuznetsov has joined #savanna05:29
*** NikitaKonovalov has joined #savanna05:30
*** NikitaKonovalov has quit IRC05:36
*** openstack has joined #savanna14:54
*** openstackgerrit has joined #savanna14:55
nprivalovaI think it's ok to have JobOrigin14:56
nprivalovawhat do you think about storing libs separately? not depending on jobs?14:58
*** IlyaE has joined #savanna15:00
nprivalovaakuznetsov, crobertsrh, tmckay, guys, we need to determine further steps15:01
crobertsrhsorry....been in another call...reading back15:02
nprivalovabecause UI depends on this15:02
crobertsrhYes, it certainly does :)15:02
tmckaynprivalova, for the sake of discussion, we need a name for the new "DataSource"-like object you propose in your email.  Maybe  it could even be a JobBinary -- what if a JobBinary can have a data column or a url column with credentials?15:02
tmckayeither could be empty15:03
tmckayor we could have 2 very similar records.   JobBinary and JobBinaryInternal, maybe.  And JobOrigins store lists consisting of those two.15:04
tmckayI think treating libraries as independent makes sense.  If they are really reusable, then they will be referenced by multiple jobs15:05
nprivalovahm…looks like JobBinary is enough15:05
nprivalovaThey are reusable, e.g. UDF in Pig15:06
tmckaySo, just to be clear -- some JobBinary records in sqlalchemy would have an empty data column and a url with credentials, and some would have a populated data column with empty url and no credentials15:06
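The two record shapes described above (internal data with no url, or a url with optional credentials and no data) can be sketched as a small validated class. This is only an illustration in plain Python, not the sqlalchemy model; the field names `data`, `url`, and `credentials` follow the wording in the chat, while the class layout and the `is_internal` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobBinary:
    """Hypothetical shape of a JobBinary record (not the real model)."""
    data: Optional[bytes] = None        # populated for internally stored binaries
    url: Optional[str] = None           # populated for externally stored binaries
    credentials: Optional[dict] = None  # optional, only meaningful with a url

    def __post_init__(self):
        # Exactly one of data / url must be set; the other stays empty.
        if (self.data is None) == (self.url is None):
            raise ValueError("set exactly one of 'data' or 'url'")
        # Credentials only make sense for an external url.
        if self.credentials is not None and self.url is None:
            raise ValueError("'credentials' requires 'url'")

    @property
    def is_internal(self) -> bool:
        # True when the binary content is stored in the record itself.
        return self.data is not None
```

With this shape, `JobBinary(data=b"...")` is an internal binary, and `JobBinary(url="...", credentials={...})` is an external one whose credentials may also be left out.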
akuznetsovAgree with tmckay. In this case there will be logic for how to retrieve the jar and resources, and JobOrigin will contain a description of the resources needed for the Job15:07
nprivalovaSo do we need mockups for JobBinary page?15:08
nprivalovaon UI15:09
nprivalovatmckay, let's discuss your
akuznetsovI think there should be additional fields in the JobSource page for uploading or specifying a list of job binaries15:10
tmckaynprivalova, yes, it relates directly to our discussion above15:11
tmckayI think what we are proposing is that a JobOrigin contains a list of ids for JobBinary objects.  And JobBinary is extended to allow url/credentials.  Is this your understanding?15:12
nprivalovawe may abandon it and begin to work together on this. or I may begin to do smth else15:12
tmckayEither way.  I think a CR that shows changes to the object definitions and REST api in one set would be good, so that it can be reviewed.15:14
nprivalovatmckay, it is not actually a list. it is many-to-many. We may get an example in node-groups in cluster I think15:15
tmckaynprivalova, okay.  if we can agree on the structures of the objects, then maybe I can change the code for storage/retrieval and you can change the job_manager to expect those objects in a separate CR.15:15
nprivalovatmckay, smth like     node_groups = relationship('NodeGroup', cascade="all,delete", backref='cluster', lazy='joined')15:16
crobertsrhI can add a mockup for a JobBinary page.  I should probably update the other mockups too.  I think they might be slightly out of touch with reality right now.15:16
tmckaynprivalova, yes, you're right.  I'm still thinking in terms of URLs but these are all actually sqlalchemy objects so we can have the relationships.15:17
nprivalovacrobertsrh, great! I think this part should be rather stable so it would be really good to make a final decision on this, not a draft15:18
tmckayagreed.  This is fundamental, we should solve it for good :)15:18
tmckaynprivalova, how about doing this through a draft CR, with real code?15:21
*** sacharya has joined #savanna15:23
nprivalovatmckay, doing what? didn't get you. Do you mean creating draft CR at first?15:23
tmckaynprivalova, sorry, I mean for discussion :)  If one of us makes a draft CR then we can comment back and forth on it til we're happy.15:24
tmckayEven if there are a hundred patch sets.15:24
crobertsrhHmm, maybe I'm not 100% clear on where we're going here.  Here is what I think we agreed to....15:25
crobertsrh1)  We still have data sources....just like before15:25
nprivalovaYes, I agree. We have time shifts so you may begin today and I will proceed tomorrow. Because I'm blocked by this15:25
crobertsrh2)  we will now have Job Binaries (including a separate UI page)15:25
tmckaynprivalova, okay.  Can more than one person upload to a CR?  You just need to have the write ID in the commit message, yes?15:26
crobertsrh3)  A Job Origin will now include 1 or more Job Binaries (binaries/libraries)?15:26
tmckay"write" == "right", sorry15:26
nprivalovatmckay, yes15:26
crobertsrh4)  A job will still contain 2 data sources (input and output) and a Job Origin?15:27
tmckaycrobertsrh, yes.  JobOrigins will refer to JobBinaries via database id.  JobBinaries themselves will either refer to internal storage, or an external url.  There will be database constraints on the relationship between JobOrigins and JobBinaries so deletion, etc work correctly.15:28
crobertsrhWill all job binaries have names (something nice that can be displayed in the UI)?  Displaying a list of IDs to choose from is probably not that helpful.15:29
nprivalovacrobertsrh, regarding 3) yes. JobOrigin may have several JobBinaries as libs. And one or none "main" binary15:29
tmckaycrobertsrh, if a JobBinary refers to an external url, it also will have a credentials field (which may or may not be populated)15:29
nprivalovacrobertsrh, regarding 4) yes15:29
crobertsrhnprivalova:  Is there a distinction in the api between "main" and "library"?15:30
tmckaynprivalova, so a JobOrigin has two fields.  Currently we are proposing "url" and "libs" -- maybe we need better field names.15:30
tmckaywe can be literal and just use "main" and "libs"15:30
nprivalovaby the way, it would be great to create JobBinary "on the fly",is it possible?15:30
tmckayYou mean stream application data to savanna?15:31
*** ruhe has joined #savanna15:31
tmckaynprivalova, or submit binary application data with the creation of a job origin?15:32
nprivalovano. I just don't want a separate page for library creations. yes, the last is what I meant15:32
nprivalovaI think crobertsrh will manage this problem. It is not very important and depends on the horizon framework15:33
crobertsrhRight.  It should be possible to do "on the fly" creation via file upload if that's what you mean.15:34
tmckayso, the UI would have an "upload binary here" widget on the form, and when the user presses "submit" the UI would create a JobBinary as an additional step?15:34
crobertsrhtmckay:  Yeah, that's what I'm thinking.  First it would create a job binary...then take the ID that it gets from doing that (assuming something is returned...I'll need to check on that) and store it with the Job Origin.15:35
nprivalovaAs I understand it, the user may have a "collection" of libs and create libs "on the fly"15:35
tmckayyes, it's returned.  All creates via REST return an object with all/most fields populated.15:36
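The "on the fly" flow discussed above (create the JobBinary first, take the id from the returned object, then store it with the Job Origin) might look roughly like this. `FakeApi` is an in-memory stand-in invented for this sketch, not Savanna's REST client, and all method and field names here are assumptions.

```python
import itertools


class FakeApi:
    """In-memory stand-in for the REST api (hypothetical, for illustration)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.job_binaries = {}
        self.job_origins = {}

    def create_job_binary(self, name, data):
        obj = {"id": next(self._ids), "name": name, "data": data}
        self.job_binaries[obj["id"]] = obj
        return obj  # creates return the populated object, including its id

    def create_job_origin(self, name, binary_ids):
        # Reject references to binaries that were never created.
        unknown = [i for i in binary_ids if i not in self.job_binaries]
        if unknown:
            raise KeyError(f"unknown JobBinary ids: {unknown}")
        obj = {"id": next(self._ids), "name": name,
               "binary_ids": list(binary_ids)}
        self.job_origins[obj["id"]] = obj
        return obj


def submit_job_origin_form(api, origin_name, existing_ids, uploads):
    """Handle one form submit.

    existing_ids: ids of binaries picked from the dropdown.
    uploads: list of (name, bytes) pairs from the upload widget.
    """
    # Step 1: create a JobBinary for every fresh upload and keep its id.
    new_ids = [api.create_job_binary(name, data)["id"]
               for name, data in uploads]
    # Step 2: create the JobOrigin referring to old and new binaries.
    return api.create_job_origin(origin_name, list(existing_ids) + new_ids)
```

The user sees a single submit, while the form handler performs the extra JobBinary creation step behind it.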
nprivalovacrobertsrh, is it clear about this part?15:37
crobertsrhnprivalova:  I think I get it.  If my 1,2,3,4 above are correct, I will go in that direction for now.15:38
* tmckay creates new branch off master and gets to work....15:38
nprivalova2 more questions, sorry :)15:38
tmckayfire away15:38
nprivalova1) will we have "collections" of libs? (just to clarify)15:39
tmckaywould that be an extra level of indirection?  JobOrigin -> LibCollection -> JobBinarys?15:40
tmckayinstead of just JobOrigin -> JobBinarys15:40
nprivalovaJobBinary may have no JobOrigin, correct?15:41
tmckayright, JobBinary is an id for a BLOB or a url to a file15:42
tmckay"is" --> "has"15:42
nprivalovaduring job creation we may create a new JobBinary or choose from existing ones15:42
nprivalovaso we need to have a page with a list of existing binaries15:43
nprivalova* as I see it15:44
tmckayah, something that calls "get_all" and lists them for selection15:44
tmckaycrobertsrh, ^^15:45
crobertsrhYep, that's what I was thinking15:45
crobertsrh1 required entry with + signs to add more dropdown fields15:45
nprivalovaok. I think we are done about 1)15:46
crobertsrhEach field can be a choice from the dropdown or a new upload15:46
nprivalova2) I wanted to say about "main" and "libs" because it is a little problem...15:46
* tmckay has been wondering how the job_manager will find an entry point15:47
nprivalovafor pig and hive we need to store scripts in /user/hadoop/job-980472-3824723-8734/ and we should store all libs in /../../.../libs. So we need to know a "status" for a binary15:48
nprivalovafor mapreduce all the files should be in libs. So there is no 'main'  resource15:49
tmckayyou mean /user/hadoop/job-980472-3824723-8734/libs in this example?  What would the value of "status" be?15:50
nprivalova"status" = whether or not we should store a binary in the libs subdir15:50
nprivalovaI meant /user/hadoop/job-980472-3824723-8734/libs, yes15:52
tmckayso if the above job were a mapreduce job, everything would be in /user/hadoop/job-980472-3824723-8734/libs?15:52
nprivalovathe user should specify the mapper and reducer classes in the configuration15:53
nprivalovaLooks like the user should tell us whether it is a lib-file or not15:54
tmckayok.  So, can we still handle that in a JobOrigin with "main" and "libs", and maybe main is null for mapreduce?15:54
nprivalovamain will be null for mapreduce15:55
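The placement rule discussed above (the main script goes in the job directory, libs go in its libs subdirectory, and main is null for mapreduce) can be sketched as a small helper. The function name `hdfs_targets` and the example file names are made up for illustration; only the directory layout comes from the chat.

```python
import posixpath


def hdfs_targets(job_dir, main, libs):
    """Map each file name to the HDFS path where it should be stored.

    main is None for mapreduce jobs, where everything lives in libs.
    """
    targets = {}
    if main is not None:
        # pig/hive: the script sits directly in the job directory
        targets[main] = posixpath.join(job_dir, main)
    for lib in libs:
        # every library goes into the libs subdirectory
        targets[lib] = posixpath.join(job_dir, "libs", lib)
    return targets


job_dir = "/user/hadoop/job-980472-3824723-8734"
pig = hdfs_targets(job_dir, "script.pig", ["udf.jar"])
mapreduce = hdfs_targets(job_dir, None, ["wordcount.jar"])
```

Here `pig` places `script.pig` in the job directory and `udf.jar` under `libs`, while `mapreduce` contains only the `libs` entry.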
tmckayyes :)  Unclear English.  I meant "maybe we can use this as a solution"15:56
nprivalovalooks like we may have only 'resources' field in JobOrigin instead of url and libs15:56
nprivalovaneed to think about it15:57
tmckayand no storage type, and no credentials.  If that is the case, does the "status" value go in the JobBinary object to tell us where to store it15:58
nprivalovaby the way, credentials are not needed in jobOrigin anymore, are they?16:00
*** dina_belova has quit IRC16:00
tmckaycorrect, they move to JobBinary (and may be blank)16:00
nprivalovaI will update etherpad tomorrow.16:00
crobertsrhOk.  Can you ping me when the pad is updated?16:01
tmckayokay.  I'll start a CR.16:01
nprivalovaI think we should ask user to tell us whether a chosen binary is a lib or not16:03
nprivalovaand have a list of 'mains' and a list of 'libs'16:04
tmckayYou're talking about a UI display?  So we potentially need a "main/lib" flag on "get_all" in the api.16:05
nprivalovalet's not store 'status' in JobBinary. I guess it may change. In this case we don't need such a flag in the api16:07
nprivalovasorry for confusion16:08
*** SergeyLukjanov has quit IRC16:08
nprivalovaduring job creation user chooses lib-files and main-files16:09
openstackgerritNikita Konovalov proposed a change to stackforge/savanna: Floating ip assignement support
*** tmckay has quit IRC16:12
nprivalovawill we have pig-page, hive-page and mr page?16:12
*** NikitaKonovalov has quit IRC16:13
*** dmitryme has quit IRC16:14
*** tmckay has joined #savanna16:14
*** IlyaE has quit IRC16:15
tmckaynprivalova, my computer froze :)16:15
nprivalovaI feel alone :(16:16
tmckayso, if we do not have status in JobBinary, then we still have the problem of where to indicate file placement.  Where do we specify that?  (back to multiple fields in JobOrigin)16:16
*** openstackgerrit has quit IRC16:16
*** openstackgerrit has joined #savanna16:17
nprivalovain jobOrigin: "main" will have multiple JobBinaries and "libs" too16:17
tmckayokay, great.16:17
nprivalovaI hope it's ok for sqlAlchemy16:17
nprivalovaHow it will look on the UI we may discuss tomorrow16:18
tmckayhmm, I'm sure there is a way to store a set of ids.  Maybe a special column type, maybe another table.16:19
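The "another table" option could look like the sketch below: an association table linking JobOrigins to JobBinaries many-to-many, with a role column for "main" versus "libs". It uses the standard library's sqlite3 rather than sqlalchemy so it runs on its own; all table and column names are assumptions, not the schema that was eventually written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE job_binary (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE job_origin (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per (origin, binary, role): a binary can be shared by many origins.
CREATE TABLE job_origin_binary (
    job_origin_id INTEGER NOT NULL
        REFERENCES job_origin(id) ON DELETE CASCADE,
    job_binary_id INTEGER NOT NULL
        REFERENCES job_binary(id) ON DELETE RESTRICT,
    role TEXT NOT NULL CHECK (role IN ('main', 'libs')),
    PRIMARY KEY (job_origin_id, job_binary_id, role)
);
""")

conn.execute(
    "INSERT INTO job_binary (id, name) VALUES (1, 'script.pig'), (2, 'udf.jar')")
conn.execute(
    "INSERT INTO job_origin (id, name) VALUES (1, 'pig-job'), (2, 'other-pig-job')")
conn.executemany(
    "INSERT INTO job_origin_binary VALUES (?, ?, ?)",
    [(1, 1, "main"), (1, 2, "libs"), (2, 2, "libs")],
)


def binaries_for(conn, origin_id, role):
    """Return the names of the binaries a JobOrigin uses in the given role."""
    rows = conn.execute(
        "SELECT b.name FROM job_binary b "
        "JOIN job_origin_binary ob ON ob.job_binary_id = b.id "
        "WHERE ob.job_origin_id = ? AND ob.role = ? ORDER BY b.id",
        (origin_id, role),
    )
    return [row[0] for row in rows]
```

Because the role lives on the link and not on the JobBinary, the same binary can be a lib for one JobOrigin and something else for another. The foreign keys also give the deletion behaviour mentioned earlier: deleting a JobOrigin removes its links, while deleting a JobBinary that is still referenced is rejected.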
nprivalovaif we have separate page for each job type it will be sillier to determine 'libs' and 'mains'. But rest should be updated anyway16:19
nprivalova*easier, sorry16:20
*** ruhe has quit IRC16:21
nprivalovaI need to go now. I hope it became clearer what should be done16:21
tmckayI think so, clear enough to put up a draft.  See you tomorrow.16:22
tmckayI'll start with the database type, figure out how to store multiple ids16:23
nprivalovaok, bye!16:24
*** IlyaE has joined #savanna16:34
*** nprivalova has quit IRC16:38
*** IlyaE has quit IRC16:41
*** ruhe has joined #savanna16:54
*** ruhe has quit IRC17:00
*** dmitryme has joined #savanna17:05
*** ruhe has joined #savanna17:08
*** asavu has joined #savanna17:12
*** IlyaE has joined #savanna17:17
*** IlyaE has quit IRC17:18
*** ruhe has quit IRC17:53
*** dmitryme has quit IRC17:58
*** akuznetsov has quit IRC18:04
*** dina_belova has joined #savanna18:11
*** SergeyLukjanov has joined #savanna18:11
*** dina_belova has quit IRC18:16
openstackgerritYaroslav Lobankov proposed a change to stackforge/savanna: Integration test refactoring
*** asavu has quit IRC18:26
*** nadya has joined #savanna18:35
*** tstclair has quit IRC18:52
*** nadya_ has joined #savanna18:54
*** nadya has quit IRC18:57
*** mattf is now known as _mattf18:59
*** _mattf is now known as mattf19:00
*** nadya_ has quit IRC19:07
*** tstclair has joined #savanna19:07
*** SergeyLukjanov has quit IRC19:07
*** NikitaKonovalov has joined #savanna19:10
*** dina_belova has joined #savanna19:11
*** NikitaKonovalov has quit IRC19:14
openstackgerritErik Bergenholtz proposed a change to stackforge/savanna: Documentation about HDP plugin
*** dina_belova has quit IRC19:16
openstackgerritErik Bergenholtz proposed a change to stackforge/savanna: Documentation about HDP plugin
*** IlyaE has joined #savanna19:40
*** asavu has joined #savanna19:41
*** dmitryme has joined #savanna19:50
*** dmitryme has joined #savanna19:51
*** NikitaKonovalov has joined #savanna20:11
*** dina_belova has joined #savanna20:12
*** NikitaKonovalov has quit IRC20:15
*** dina_belova has quit IRC20:17
*** Guest85374 has quit IRC20:31
*** crobertsrh is now known as _crobertsrh20:48
*** NikitaKonovalov has joined #savanna21:01
*** asavu has quit IRC21:01
*** IlyaE has quit IRC21:05
*** NikitaKonovalov has quit IRC21:09
*** dina_belova has joined #savanna21:12
*** dina_belova has quit IRC21:17
*** tstclair is now known as _tstclair21:35
*** NikitaKonovalov has joined #savanna21:35
*** NikitaKonovalov has quit IRC21:40
*** mattf is now known as _mattf21:53
*** sacharya has quit IRC21:54
*** NikitaKonovalov has joined #savanna22:06
*** dmitryme has quit IRC22:07
*** tmckay has quit IRC22:08
*** asavu has joined #savanna22:09
*** NikitaKonovalov has quit IRC22:11
*** dina_belova has joined #savanna22:13
*** dina_belova has quit IRC22:18
*** NikitaKonovalov has joined #savanna22:37
*** NikitaKonovalov has quit IRC22:42
*** asavu has quit IRC22:54
*** NikitaKonovalov has joined #savanna23:08
*** NikitaKonovalov has quit IRC23:13
*** dina_belova has joined #savanna23:13
*** dina_belova has quit IRC23:18
*** IlyaE has joined #savanna23:26
*** NikitaKonovalov has joined #savanna23:39
*** NikitaKonovalov has quit IRC23:43
*** sacharya has joined #savanna23:48
