*** IlyaE has joined #savanna | 00:03 | |
*** IlyaE has quit IRC | 00:06 | |
*** NikitaKonovalov has joined #savanna | 00:22 | |
*** NikitaKonovalov has quit IRC | 00:26 | |
*** IlyaE has joined #savanna | 00:30 | |
*** sacharya has quit IRC | 00:44 | |
*** NikitaKonovalov has joined #savanna | 00:53 | |
*** nos_ has joined #savanna | 00:54 | |
*** nosnos has joined #savanna | 00:55 | |
*** NikitaKonovalov has quit IRC | 00:57 | |
*** sacharya has joined #savanna | 01:10 | |
*** NikitaKonovalov has joined #savanna | 01:23 | |
*** NikitaKonovalov has quit IRC | 01:28 | |
*** NikitaKonovalov has joined #savanna | 01:54 | |
*** NikitaKonovalov has quit IRC | 01:59 | |
*** NikitaKonovalov has joined #savanna | 02:25 | |
*** NikitaKonovalov has quit IRC | 02:30 | |
*** nosnos_ has joined #savanna | 02:47 | |
*** nosnos has quit IRC | 02:49 | |
*** nosnos_ has quit IRC | 02:55 | |
*** nosnos has joined #savanna | 02:55 | |
*** NikitaKonovalov has joined #savanna | 02:56 | |
*** NikitaKonovalov has quit IRC | 03:01 | |
*** NikitaKonovalov has joined #savanna | 03:27 | |
*** NikitaKonovalov has quit IRC | 03:31 | |
*** NikitaKonovalov has joined #savanna | 03:58 | |
*** NikitaKonovalov has quit IRC | 04:03 | |
*** IlyaE has quit IRC | 04:07 | |
*** tmckay has quit IRC | 04:21 | |
*** IlyaE has joined #savanna | 04:21 | |
*** SergeyLukjanov has joined #savanna | 04:25 | |
*** NikitaKonovalov has joined #savanna | 04:29 | |
*** tmckay has joined #savanna | 04:31 | |
*** NikitaKonovalov has quit IRC | 04:33 | |
*** sacharya has quit IRC | 04:38 | |
*** nadya has joined #savanna | 04:43 | |
*** akuznetsov has joined #savanna | 04:57 | |
*** nadya has quit IRC | 04:58 | |
*** NikitaKonovalov has joined #savanna | 05:00 | |
*** NikitaKonovalov has quit IRC | 05:04 | |
*** IlyaE has quit IRC | 05:05 | |
*** nadya has joined #savanna | 05:25 | |
*** akuznetsov has quit IRC | 05:26 | |
*** akuznetsov has joined #savanna | 05:29 | |
*** NikitaKonovalov has joined #savanna | 05:30 | |
*** NikitaKonovalov has quit IRC | 05:36 | |
*** openstack has joined #savanna | 14:54 | |
*** openstackgerrit has joined #savanna | 14:55 | |
nprivalova | I think it's ok to have JobOrigin | 14:56 |
nprivalova | what do you think about storing libs separately? not depending on jobs? | 14:58 |
*** IlyaE has joined #savanna | 15:00 | |
nprivalova | akuznetsov, crobertsrh, tmckay, guys, we need to determine further steps | 15:01 |
crobertsrh | sorry....been in another call...reading back | 15:02 |
nprivalova | because UI depends on thos | 15:02 |
nprivalova | *this | 15:02 |
crobertsrh | Yes, it certainly does :) | 15:02 |
tmckay | nprivalova, for the sake of discussion, we need a name for the new "DataSource"-like object you propose in your email. Maybe it could even be a JobBinary -- what if a JobBinary can have a data column or a url column with credentials? | 15:02 |
tmckay | either could be empty | 15:03 |
tmckay | or we could have 2 very similar records. JobBinary and JobBinaryInternal, maybe. And JobOrigins store lists consisting of those two. | 15:04 |
tmckay | I think treating libraries as independent makes sense. If they are really reusable, then they will be referenced by multiple jobs | 15:05 |
nprivalova | hm…looks like JobBinary is enough | 15:05 |
nprivalova | They are reusable, e.g. UDF in Pig | 15:06 |
tmckay | So, just to be clear -- some JobBinary records in sqlalchemy would have an empty data column and a url with credentials, and some would have a populated data column with empty url and no credentials | 15:06 |
akuznetsov | Agree with tmckay. In this case JobBinary holds the logic for how to retrieve the jar and resources, and JobOrigin will contain a description of the resources needed for the Job | 15:07 |
nprivalova | ok | 15:07 |
nprivalova | So do we need mockups for JobBinary page? | 15:08 |
nprivalova | on UI | 15:09 |
nprivalova | tmckay, let's discuss your https://review.openstack.org/#/c/44526/ | 15:10 |
akuznetsov | I think there should be additional fields on the JobSource page for uploading or specifying a list of job binaries | 15:10 |
tmckay | nprivalova, yes, it relates directly to our discussion above | 15:11 |
tmckay | I think what we are proposing is that a JobOrigin contains a list of ids for JobBinary objects. And JobBinary is extended to allow url/credentials. Is this your understanding? | 15:12 |
nprivalova | we may abandon it and begin to work together on this. or I may begin to do something else | 15:12 |
tmckay | Either way. I think a CR that shows changes to the object definitions and REST api in one set would be good, so that it can be reviewed. | 15:14 |
nprivalova | tmckay, it is not actually a list. it is many-to-many. We can take node groups in a cluster as an example, I think | 15:15 |
tmckay | nprivalova, okay. if we can agree on the structures of the objects, then maybe I can change the code for storage/retrieval and you can change the job_manager to expect those objects in a separate CR. | 15:15 |
nprivalova | tmckay, smth like node_groups = relationship('NodeGroup', cascade="all,delete", backref='cluster', lazy='joined') | 15:16 |
crobertsrh | I can add a mockup for a JobBinary page. I should probably update the other mockups too. I think they might be slightly out of touch with reality right now. | 15:16 |
tmckay | nprivalova, yes, you're right. I'm still thinking in terms of URLs but these are all actually sqlalchemy objects so we can have the relationships. | 15:17 |
nprivalova | crobertsrh, great! I think this part should be rather stable so it would be really good to make a final decision on this, not a draft | 15:18 |
tmckay | agreed. This is fundamental, we should solve it for good :) | 15:18 |
tmckay | nprivalova, how about doing this through a draft CR, with real code? | 15:21 |
*** sacharya has joined #savanna | 15:23 | |
nprivalova | tmckay, doing what? didn't get you. Do you mean creating draft CR at first? | 15:23 |
tmckay | nprivalova, sorry, I mean for discussion :) If one of us makes a draft CR then we can comment back and forth on it til we're happy. | 15:24 |
tmckay | Even if there are a hundred patch sets. | 15:24 |
crobertsrh | Hmm, maybe I'm not 100% clear on where we're going here. Here is what I think we agreed to.... | 15:25 |
crobertsrh | 1) We still have data sources....just like before | 15:25 |
nprivalova | Yes, I agree. We have time shifts so you may begin today and I will proceed tomorrow. Because I'm blocked by this | 15:25 |
crobertsrh | 2) we will now have Job Binaries (including a separate UI page) | 15:25 |
tmckay | nprivalova, okay. Can more than one person upload to a CR? You just need to have the write ID in the commit message, yes? | 15:26 |
crobertsrh | 3) A Job Origin will now include 1 or more Job Binaries (binaries/libraries)? | 15:26 |
tmckay | "write" == "right", sorry | 15:26 |
nprivalova | tmckay, yes | 15:26 |
crobertsrh | 4) A job will still contain 2 data sources (input and output) and a Job Origin? | 15:27 |
tmckay | crobertsrh, yes. JobOrigins will refer to JobBinaries via database id. JobBinaries themselves will either refer to internal storage, or an external url. There will be database constraints on the relationship between JobOrigins and JobBinaries so deletion, etc work correctly. | 15:28 |
crobertsrh | Will all job binaries have names (something nice that can be displayed in the UI)? Displaying a list of IDs to choose from is probably not that helpful. | 15:29 |
nprivalova | crobertsrh, regarding 3) yes. JobOrigin may have several JobBinaries as libs. And one or none "main" binary | 15:29 |
tmckay | crobertsrh, if a JobBinary refers to an external url, it also will have a credentials field (which may or may not be populated) | 15:29 |
nprivalova | crobertsrh, regarding 4) yes | 15:29 |
crobertsrh | Nprivalova: Is there a distinction in the api between "main" and "library"? | 15:30 |
tmckay | nprivalova, so a JobOrigin has two fields. Currently we are proposing "url" and "libs" -- maybe we need better field names. | 15:30 |
tmckay | we can be literal and just use "main" and "libs" | 15:30 |
nprivalova | by the way, it would be great to create JobBinary "on the fly", is it possible? | 15:30 |
tmckay | You mean stream application data to savanna? | 15:31 |
*** ruhe has joined #savanna | 15:31 | |
tmckay | nprivalova, or submit binary application data with the creation of a job origin? | 15:32 |
nprivalova | no. I just don't want a separate page for library creation. yes, the latter is what I meant | 15:32 |
nprivalova | I think crobertsrh will manage this problem. It is not very important and depends on the Horizon framework | 15:33 |
crobertsrh | Right. It should be possible to do "on the fly" creation via file upload if that's what you mean. | 15:34 |
tmckay | so, the UI would have an "upload binary here" widget on the form, and when the user presses "submit" the UI would create a JobBinary as an additional step? | 15:34 |
crobertsrh | tmckay: Yeah, that's what I'm thinking. First it would create a job binary...then take the ID that it gets from doing that (assuming something is returned...I'll need to check on that) and store it with the Job Origin. | 15:35 |
nprivalova | As I understand it, the user may have a "collection" of libs and also create libs "on the fly" | 15:35 |
tmckay | yes, it's returned. All creates via REST return an object with all/most fields populated. | 15:36 |
nprivalova | crobertsrh, is it clear about this part? | 15:37 |
crobertsrh | nprivalova: I think I get it. If my 1,2,3,4 above are correct, I will go in that direction for now. | 15:38 |
* tmckay creates new branch off master and gets to work.... | 15:38 | |
nprivalova | 2 more questions, sorry :) | 15:38 |
tmckay | fire away | 15:38 |
nprivalova | 1) will we have "collections" of libs? (just to clarify) | 15:39 |
tmckay | would that be an extra level of indirection? JobOrigin -> LibCollection -> JobBinarys? | 15:40 |
tmckay | instead of just JobOrigin -> JobBinarys | 15:40 |
nprivalova | no | 15:41 |
nprivalova | A JobBinary may have no JobOrigin, correct? | 15:41 |
tmckay | right, JobBinary is an id for a BLOB or a url to a file | 15:42 |
tmckay | "is" --> "has" | 15:42 |
nprivalova | yes | 15:42 |
nprivalova | during job creation we may create a new JobBinary or choose from the existing ones | 15:42 |
tmckay | agreed | 15:43 |
nprivalova | so we need to have a page with a list of existing binaries | 15:43 |
nprivalova | * as I see it | 15:44 |
tmckay | ah, something that calls "get_all" and lists them for selection | 15:44 |
tmckay | crobertsrh, ^^ | 15:45 |
crobertsrh | Yep, that's what I was thinking | 15:45 |
nprivalova | great! | 15:45 |
crobertsrh | 1 required entry with + signs to add more dropdown fields | 15:45 |
nprivalova | ok. I think we are done about 1) | 15:46 |
crobertsrh | Each field can be a choice from the dropdown or a new upload | 15:46 |
nprivalova | 2) I wanted to say about "main" and "libs" because it is a little problem... | 15:46 |
* tmckay has been wondering how the job_manager will find an entry point | 15:47 | |
nprivalova | for pig and hive we need to store scripts in /user/hadoop/job-980472-3824723-8734/ and all libs in /../../.../libs . So we need to know a "status" for a binary | 15:48 |
SergeyLukjanov | fyi http://lists.openstack.org/pipermail/openstack-dev/2013-September/014623.html | 15:48 |
nprivalova | for mapreduce all the files should be in libs. So there is no 'main' resource | 15:49 |
tmckay | you mean /user/hadoop/job-980472-3824723-8734/libs in this example? What would the value of "status" be? | 15:50 |
nprivalova | "status" = whether or not we should store a binary in the libs subdir | 15:50 |
nprivalova | ping | 15:51 |
nprivalova | I meant /user/hadoop/job-980472-3824723-8734/libs, yes | 15:52 |
tmckay | so if the above job were a mapreduce job, everything would be in /user/hadoop/job-980472-3824723-8734/libs? | 15:52 |
nprivalova | yes | 15:52 |
nprivalova | the user should specify the mapper and reducer classes in the configuration | 15:53 |
nprivalova | Looks like the user should tell us whether it is a lib-file or not | 15:54 |
tmckay | ok. So, can we still handle that in a JobOrigin with "main" and "libs", and maybe main is null for mapreduce? | 15:54 |
nprivalova | main will be null for mapreduce | 15:55 |
tmckay | yes :) Unclear English. I meant "maybe we can use this as a solution" | 15:56 |
nprivalova | looks like we may have only 'resources' field in JobOrigin instead of url and libs | 15:56 |
nprivalova | need to think about it | 15:57 |
tmckay | and no storage type, and no credentials. If that is the case, does the "status" value go in the JobBinary object to tell us where to store it | 15:58 |
nprivalova | by the way, credentials are not needed in jobOrigin anymore, are they? | 16:00 |
*** dina_belova has quit IRC | 16:00 | |
tmckay | correct, they move to JobBinary (and may be blank) | 16:00 |
nprivalova | I will update the etherpad tomorrow. | 16:00 |
crobertsrh | Ok. Can you ping me when the pad is updated? | 16:01 |
tmckay | okay. I'll start a CR. | 16:01 |
nprivalova | ok | 16:01 |
nprivalova | I think we should ask the user to tell us whether a chosen binary is a lib or not | 16:03 |
nprivalova | and have a list of 'mains' and a list of 'libs' | 16:04 |
tmckay | You're talking about a UI display? So we potentially need a "main/lib" flag on "get_all" in the api. | 16:05 |
nprivalova | let's not store 'status' in JobBinary. I guess it may change. In this case we don't need such a flag in the api | 16:07 |
nprivalova | sorry for confusion | 16:08 |
*** SergeyLukjanov has quit IRC | 16:08 | |
nprivalova | during job creation the user chooses lib-files and main-files | 16:09 |
openstackgerrit | Nikita Konovalov proposed a change to stackforge/savanna: Floating ip assignement support https://review.openstack.org/44822 | 16:09 |
*** tmckay has quit IRC | 16:12 | |
nprivalova | will we have pig-page, hive-page and mr page? | 16:12 |
*** NikitaKonovalov has quit IRC | 16:13 | |
*** dmitryme has quit IRC | 16:14 | |
*** tmckay has joined #savanna | 16:14 | |
*** IlyaE has quit IRC | 16:15 | |
tmckay | nprivalova, my computer froze :) | 16:15 |
nprivalova | I feel alone :( | 16:16 |
tmckay | so, if we do not have status in JobBinary, then we still have the problem of where to indicate file placement. Where do we specify that? (back to multiple fields in JobOrigin) | 16:16 |
*** openstackgerrit has quit IRC | 16:16 | |
*** openstackgerrit has joined #savanna | 16:17 | |
nprivalova | in JobOrigin: "main" will have multiple JobBinaries and "libs" too | 16:17 |
tmckay | okay, great. | 16:17 |
nprivalova | I hope it's ok for sqlAlchemy | 16:17 |
nprivalova | How it will look on the UI we may discuss tomorrow | 16:18 |
tmckay | hmm, I'm sure there is a way to store a set of ids. Maybe a special column type, maybe another table. | 16:19 |
nprivalova | if we have separate page for each job type it will be sillier to determine 'libs' and 'mains'. But rest should be updated anyway | 16:19 |
nprivalova | *easier, sorry | 16:20 |
nprivalova | :) | 16:20 |
nprivalova | *REST | 16:20 |
*** ruhe has quit IRC | 16:21 | |
nprivalova | I need to go now. I hope it became clearer what should be done | 16:21 |
tmckay | I think so, clear enough to put up a draft. See you tomorrow. | 16:22 |
tmckay | I'll start with the database type, figure out how to store multiple ids | 16:23 |
nprivalova | ok, bye! | 16:24 |
tmckay | bye | 16:24 |
*** IlyaE has joined #savanna | 16:34 | |
*** nprivalova has quit IRC | 16:38 | |
*** IlyaE has quit IRC | 16:41 | |
*** ruhe has joined #savanna | 16:54 | |
*** ruhe has quit IRC | 17:00 | |
*** dmitryme has joined #savanna | 17:05 | |
*** ruhe has joined #savanna | 17:08 | |
*** asavu has joined #savanna | 17:12 | |
*** IlyaE has joined #savanna | 17:17 | |
*** IlyaE has quit IRC | 17:18 | |
*** ruhe has quit IRC | 17:53 | |
*** dmitryme has quit IRC | 17:58 | |
*** akuznetsov has quit IRC | 18:04 | |
*** dina_belova has joined #savanna | 18:11 | |
*** SergeyLukjanov has joined #savanna | 18:11 | |
*** dina_belova has quit IRC | 18:16 | |
openstackgerrit | Yaroslav Lobankov proposed a change to stackforge/savanna: Integration test refactoring https://review.openstack.org/43925 | 18:21 |
*** asavu has quit IRC | 18:26 | |
*** nadya has joined #savanna | 18:35 | |
*** tstclair has quit IRC | 18:52 | |
*** nadya_ has joined #savanna | 18:54 | |
*** nadya has quit IRC | 18:57 | |
*** mattf is now known as _mattf | 18:59 | |
*** _mattf is now known as mattf | 19:00 | |
*** nadya_ has quit IRC | 19:07 | |
*** tstclair has joined #savanna | 19:07 | |
*** SergeyLukjanov has quit IRC | 19:07 | |
*** NikitaKonovalov has joined #savanna | 19:10 | |
*** dina_belova has joined #savanna | 19:11 | |
*** NikitaKonovalov has quit IRC | 19:14 | |
openstackgerrit | Erik Bergenholtz proposed a change to stackforge/savanna: Documentation about HDP plugin https://review.openstack.org/44928 | 19:16 |
*** dina_belova has quit IRC | 19:16 | |
openstackgerrit | Erik Bergenholtz proposed a change to stackforge/savanna: Documentation about HDP plugin https://review.openstack.org/44928 | 19:34 |
*** IlyaE has joined #savanna | 19:40 | |
*** asavu has joined #savanna | 19:41 | |
*** dmitryme has joined #savanna | 19:50 | |
*** dmitryme has joined #savanna | 19:51 | |
*** NikitaKonovalov has joined #savanna | 20:11 | |
*** dina_belova has joined #savanna | 20:12 | |
*** NikitaKonovalov has quit IRC | 20:15 | |
*** dina_belova has quit IRC | 20:17 | |
*** Guest85374 has quit IRC | 20:31 | |
*** crobertsrh is now known as _crobertsrh | 20:48 | |
*** NikitaKonovalov has joined #savanna | 21:01 | |
*** asavu has quit IRC | 21:01 | |
*** IlyaE has quit IRC | 21:05 | |
*** NikitaKonovalov has quit IRC | 21:09 | |
*** dina_belova has joined #savanna | 21:12 | |
*** dina_belova has quit IRC | 21:17 | |
*** tstclair is now known as _tstclair | 21:35 | |
*** NikitaKonovalov has joined #savanna | 21:35 | |
*** NikitaKonovalov has quit IRC | 21:40 | |
*** mattf is now known as _mattf | 21:53 | |
*** sacharya has quit IRC | 21:54 | |
*** NikitaKonovalov has joined #savanna | 22:06 | |
*** dmitryme has quit IRC | 22:07 | |
*** tmckay has quit IRC | 22:08 | |
*** asavu has joined #savanna | 22:09 | |
*** NikitaKonovalov has quit IRC | 22:11 | |
*** dina_belova has joined #savanna | 22:13 | |
*** dina_belova has quit IRC | 22:18 | |
*** NikitaKonovalov has joined #savanna | 22:37 | |
*** NikitaKonovalov has quit IRC | 22:42 | |
*** asavu has quit IRC | 22:54 | |
*** NikitaKonovalov has joined #savanna | 23:08 | |
*** NikitaKonovalov has quit IRC | 23:13 | |
*** dina_belova has joined #savanna | 23:13 | |
*** dina_belova has quit IRC | 23:18 | |
*** IlyaE has joined #savanna | 23:26 | |
*** NikitaKonovalov has joined #savanna | 23:39 | |
*** NikitaKonovalov has quit IRC | 23:43 | |
*** sacharya has joined #savanna | 23:48 |
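
Editor's sketch of the data model discussed between 15:02 and 16:23, written with Python's standard `sqlite3` module rather than the project's SQLAlchemy code. It is only an illustration under stated assumptions: the table and column names, the `role` column on the association table, the `name` column on binaries, and all helper functions are hypothetical and were not agreed in the channel. It tries to capture three points from the discussion: a JobBinary holds either inline data or a url with optional credentials, JobOrigins relate to JobBinaries many-to-many with the "main"/"lib" choice stored on the relationship rather than on the binary, and libs are placed under a `libs` subdirectory of the job directory while mains go into the job directory itself.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not Savanna's actual tables.
SCHEMA = """
CREATE TABLE job_binary (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    data BLOB,            -- populated for internally stored binaries
    url TEXT,             -- populated for externally stored binaries
    credentials TEXT,     -- optional, may stay empty
    CHECK ((data IS NULL) <> (url IS NULL))  -- exactly one of data/url is set
);
CREATE TABLE job_origin (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Many-to-many link; 'main' vs 'lib' lives here, not on the binary itself.
CREATE TABLE job_origin_binary (
    job_origin_id INTEGER NOT NULL
        REFERENCES job_origin(id) ON DELETE CASCADE,
    job_binary_id INTEGER NOT NULL
        REFERENCES job_binary(id) ON DELETE RESTRICT,
    role TEXT NOT NULL CHECK (role IN ('main', 'lib')),
    PRIMARY KEY (job_origin_id, job_binary_id, role)
);
"""


def open_db():
    """Create an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def create_binary(conn, name, data=None, url=None, credentials=None):
    """Insert a JobBinary and return its id, as a REST create would."""
    cur = conn.execute(
        "INSERT INTO job_binary (name, data, url, credentials) "
        "VALUES (?, ?, ?, ?)",
        (name, data, url, credentials),
    )
    return cur.lastrowid


def create_origin(conn, name, mains=(), libs=()):
    """Insert a JobOrigin that references existing binaries by id."""
    origin_id = conn.execute(
        "INSERT INTO job_origin (name) VALUES (?)", (name,)
    ).lastrowid
    rows = [(origin_id, b, "main") for b in mains]
    rows += [(origin_id, b, "lib") for b in libs]
    conn.executemany("INSERT INTO job_origin_binary VALUES (?, ?, ?)", rows)
    return origin_id


def hdfs_paths(conn, origin_id, job_dir):
    """Return target paths: mains in job_dir, libs in job_dir/libs."""
    query = (
        "SELECT b.name, ob.role FROM job_origin_binary ob "
        "JOIN job_binary b ON b.id = ob.job_binary_id "
        "WHERE ob.job_origin_id = ? ORDER BY ob.role DESC, b.name"
    )
    paths = []
    for name, role in conn.execute(query, (origin_id,)):
        subdir = "/libs" if role == "lib" else ""
        paths.append(f"{job_dir}{subdir}/{name}")
    return paths
```

Under this sketch a MapReduce JobOrigin would simply be created with no mains, so every file lands in the `libs` subdirectory, and the same library row can be shared by several JobOrigins. The `ON DELETE RESTRICT` clause is one possible reading of "database constraints on the relationship ... so deletion, etc work correctly"; the actual constraint choice was left open in the discussion.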