*** openstackgerrit has quit IRC | 02:47 | |
*** openstackgerrit has joined #openstack-storlets | 02:48 | |
kota_ | hello | 03:02 |
---|---|---|
kota_ | sorry, probably I won't be at today's IRC meeting. I'll read the log tomorrow-ish. | 03:03 |
openstackgerrit | Takashi Kajinami proposed openstack/storlets: DO NOT MERGE: PoC code for zero copy https://review.openstack.org/325949 | 03:13 |
*** eranrom has joined #openstack-storlets | 06:02 | |
*** eranrom has quit IRC | 06:12 | |
*** eranrom has joined #openstack-storlets | 06:30 | |
*** openstackgerrit has quit IRC | 06:48 | |
*** openstackgerrit has joined #openstack-storlets | 06:48 | |
*** eranrom_ has joined #openstack-storlets | 07:04 | |
*** eranrom has quit IRC | 07:04 | |
*** eranrom has joined #openstack-storlets | 07:06 | |
*** eranrom_ has quit IRC | 07:06 | |
eranrom | kota_: no problem | 07:20 |
*** openstackgerrit has quit IRC | 07:48 | |
*** openstackgerrit has joined #openstack-storlets | 07:49 | |
*** takashi has joined #openstack-storlets | 08:14 | |
takashi | eranrom: sorry for my long absence from irc. I replied to your comment in patch 314140, so please check it when you have time.. | 08:15 |
patchbot | takashi: https://review.openstack.org/#/c/314140/ - storlets - Refactor Storlet protocol classes | 08:15 |
*** openstackgerrit has quit IRC | 08:48 | |
*** openstackgerrit has joined #openstack-storlets | 08:48 | |
*** takashi has quit IRC | 09:40 | |
*** takashi has joined #openstack-storlets | 11:08 | |
*** eranrom has quit IRC | 11:13 | |
*** takashi has quit IRC | 11:23 | |
*** eranrom has joined #openstack-storlets | 11:42 | |
*** eranrom has quit IRC | 12:18 | |
*** takashi has joined #openstack-storlets | 12:51 | |
*** eranrom has joined #openstack-storlets | 12:57 | |
eranrom | Hi | 12:57 |
takashi | eranrom: Hi | 12:57 |
eranrom | I believe it's only the two of us today | 12:57 |
eranrom | so let's get started | 12:57 |
takashi | eranrom: +1 | 12:58 |
eranrom | I did not update the agenda but we can discuss the protocol merge patch and the spark "echosystem" | 12:58 |
takashi | eranrom: ok | 13:00 |
eranrom | Did you see my response? | 13:00 |
takashi | eranrom: Yes, and had a little thought after that | 13:00 |
takashi | eranrom: So can we start with the discussion about protocol merge? | 13:01 |
eranrom | takashi: sure, lets start with that. | 13:01 |
takashi | eranrom: I checked your reply on patch 314140, and had some thoughts about that | 13:02 |
patchbot | takashi: https://review.openstack.org/#/c/314140/ - storlets - Refactor Storlet protocol classes | 13:02 |
takashi | I still have some concerns about using fd in diskfile | 13:03 |
takashi | My biggest concern is that we are now using an internal detail of swift to pass the fd directly to storlets | 13:04 |
takashi | self.stream = orig_resp.app_iter._fp.fileno() | 13:04 |
takashi | As you can see, _fp should not be accessed outside diskfile, based on the design of diskfile | 13:05 |
eranrom | takashi: I agree this is a concern. I still think that we can do a best-effort thing here. That is, if the fd is there let's use it; if not, then fall back to a copy | 13:06 |
eranrom | takashi: AFAIK, currently hdd is the main media in Swift deployments. | 13:07 |
eranrom | and I would like to make use of it as much as possible. | 13:07 |
takashi | eranrom: ok | 13:08 |
eranrom | You may argue that it might be dangerous to commit on a random access functionality that might disappear one day | 13:08 |
eranrom | but we can (1) document this (2) when this is no longer possible we deprecate | 13:09 |
eranrom | Other than the commitment to random access this is all internal. | 13:09 |
takashi | eranrom: I think I need some more thought about interface, when we accept both of iter and fd | 13:11 |
eranrom | takashi: sure. | 13:11 |
takashi | and some logic to automatically use available fd instead of iter | 13:11 |
takashi | I tried to make all protocols use fd first, but there are some problems because in proxy-server we may get raw data, which has to be parsed as HTTP | 13:12 |
eranrom | absolutely. My first step there is to have a way to pass metadata on the fd/stream so that the Docker side would know how to act. | 13:12 |
eranrom | In fact this is part of 319640 | 13:13 |
eranrom | and I plan to break it into a different patch | 13:14 |
eranrom | This is the patch | 13:14 |
eranrom | https://review.openstack.org/#/c/319640/ | 13:14 |
patchbot | eranrom: patch 319640 - storlets - WIP: Allow to run a single range request on the ob... | 13:14 |
takashi | eranrom: I had a quick look and added some comments, but I still need a deeper look at it | 13:14 |
eranrom | and I still need to look at the comments :-) I am currently focused on the documentation improvements... | 13:15 |
eranrom | BTW - did you see my comment on https://review.openstack.org/#/c/325949/ | 13:15 |
patchbot | eranrom: patch 325949 - storlets - DO NOT MERGE: PoC code for zero copy | 13:15 |
takashi | eranrom: yes | 13:15 |
eranrom | metadata over the stream could be beneficial there as well | 13:16 |
takashi | eranrom: and I noticed that we should deal with parsing of the raw http message, as you pointed out. I think I need some more thinking about that patch | 13:17 |
takashi | eranrom: Can I ask you one thing about random read? | 13:18 |
eranrom | takashi: go ahead | 13:18 |
takashi | eranrom: My question, which is my second concern about passing fd for random read, is that we can only do it on object-server, right? | 13:18 |
takashi | In object server, we can get the FileObject of object file as app_iter._fp to realize random access using it | 13:20 |
eranrom | takashi: right. But note that in many other cases as well we cannot run in the object server anyway. encryption/EC | 13:20 |
takashi | eranrom: yes, We only can't get more that fd for socket, which supports sequential access | 13:21 |
takashi | s/that/than | 13:21 |
eranrom | takashi: In what sense the app_iter._fp gives random access? | 13:21 |
eranrom | I mean doesn't using app_iter._fp mean pass the asspciated fd to the storlet side? | 13:23 |
eranrom | s/asspciated/associated | 13:23 |
takashi | In object-server we get the instance of swift.obj.diskfile.BaseDiskFileReader as app_iter | 13:24 |
takashi | it has FileObject for the object file, and we can get the fd for file, right? | 13:25 |
takashi | So that fd is connected to file, and we can realize random access over object using it | 13:25 |
eranrom | right, but I thought that this is exactly what you were concerned about :-) Note that my zero copy remark was only for the object-server case. I am sorry if this was not clear. | 13:26 |
eranrom | I still think it would be cool to have zero copy on the proxy, but this is more difficult... | 13:26 |
takashi | eranrom: I think having zero copy is good, but we have limitations for that, right? We should run storlets on object server | 13:27 |
takashi | eranrom: and now I'm thinking about a way to prohibit executing storlets that require random access on proxy-server, where we cannot realize random access | 13:28 |
eranrom | right. In fact we cannot ever run random access storlets in the cases of enc./EC and as you mentioned in the future also not over other types of media | 13:29 |
eranrom | might be that this calls for a storlet metadata that testifies the storlet assumes random access | 13:29 |
eranrom | and so we can error out if we are forced to run on the proxy or if the fd is not there on the object... | 13:30 |
takashi | eranrom: yes | 13:30 |
eranrom | takashi: So here is my thinking: | 13:32 |
eranrom | 1. add stream/fd metadata field to the protocol between the middleware and SCommon | 13:32 |
eranrom | 2. add the range reads for object server execution | 13:33 |
eranrom | 3. can be done earlier: merge the existing protocol classes into two (1) those that require copy of the input (2) those that are zero copy | 13:33 |
eranrom | what do you think? | 13:34 |
takashi | eranrom: make sense | 13:34 |
eranrom | Once I am done with the documentation I will go for #1 | 13:34 |
eranrom | It will take me a few more days I believe | 13:35 |
takashi | eranrom: I'll think about 3 first. I'll find the way to realize general interface to accept both of iter and fd | 13:35 |
takashi | if I can :-) | 13:35 |
eranrom | takashi: That would be great. | 13:35 |
eranrom | takashi: If you do that then I could do the range stuff on top of that. so we do 3 and then 2 | 13:35 |
eranrom | takashi: If you do not find a way to have a general interface for both, we can always have two classes. It would still be better than the 4 we have now. | 13:37 |
takashi | eranrom: +1 | 13:37 |
takashi | eranrom: I think I should be consistent with your range change, | 13:38 |
takashi | so I think you don't have to wait for me to finish. I can rebase my idea on your change. | 13:39 |
eranrom | takashi: no worries. Let me do 1 first and see where we stand | 13:39 |
eranrom | we can decide later | 13:39 |
takashi | eranrom: ok | 13:39 |
eranrom | takashi: do you want to talk about Spark or anything else? | 13:40 |
eranrom | or leave it for next time? | 13:40 |
takashi | Can you give me a quick update? I think we had better discuss the details when kota_ is also available | 13:41 |
eranrom | ok. no problem. | 13:41 |
eranrom | a quick update | 13:42 |
eranrom | Currently there is a PoC in IBM which is pretty far from something that can be upstreamed | 13:42 |
eranrom | The PoC makes some changes to the following components: | 13:42 |
eranrom | (1) Stocator - The Spark Swift connector written by Gil Vernik. I think that the changes made there should be somewhere else in the stack | 13:43 |
eranrom | (2) spark csv - This is the Spark library that helps dealing with csv format. There is an inherent problem in doing the change there, which I can explain next time we talk | 13:44 |
eranrom | I think that we may need to supply an alternative library altogether. But I still need to have a deeper look. | 13:45 |
takashi | this one? https://github.com/SparkTC/stocator | 13:45 |
takashi | about Stocator | 13:45 |
eranrom | I think this is a clone. Let me look sec. | 13:45 |
takashi | np. I'll search it later | 13:46 |
eranrom | I guess this is it yes | 13:46 |
takashi | eranrom: thx! | 13:46 |
eranrom | (3) spark rdd - This is a very generic layer in Spark, and I think that the push down could be done without touching it | 13:47 |
eranrom | (4) hadoop rdd - This is a pluggable rdd library for which we may need to supply our own implementation rather than changing this one | 13:48 |
eranrom | At a high level this is where we touched the stack to make it work, but I plan to redo the whole thing | 13:49 |
eranrom | I have some spark learning to do before that | 13:49 |
eranrom | If NTT is interested we can do this together... | 13:49 |
takashi | eranrom: I'll discuss with kota_ about that later | 13:51 |
eranrom | takashi: sure | 13:52 |
eranrom | anything else for today? | 13:52 |
takashi | but honestly speaking we are now putting our focus on existing applications, which don't use spark now. | 13:52 |
takashi | so I'm not sure we can work on spark just now. | 13:52 |
takashi | From me, one small update | 13:53 |
eranrom | takashi: don't worry. I am planning to do this anyway :-) | 13:53 |
eranrom | takashi: go ahead | 13:53 |
takashi | eranrom: I'm looking forward to that work! | 13:54 |
takashi | I started litter about directory refactoring, like patch 320945 | 13:54 |
patchbot | takashi: https://review.openstack.org/#/c/320945/ - storlets - Create common under storlet_gateway | 13:54 |
takashi | s/litter/little/ | 13:54 |
takashi | I started with the common functions, which don't have much impact on on-going changes | 13:55 |
eranrom | takashi: sorry must have missed it wil revuew | 13:55 |
eranrom | review | 13:55 |
takashi | eranrom: np | 13:55 |
eranrom | sure looking good. I will review now | 13:55 |
eranrom | thans | 13:55 |
eranrom | thanks | 13:55 |
takashi | I'm going to work on the handlers in the middleware and gateway modules when I can find a good time, so as not to disrupt on-going work | 13:56 |
takashi | that's all from my side | 13:56 |
eranrom | takashi: alright. | 13:56 |
eranrom | Thanks for joining | 13:56 |
eranrom | talk to you later | 13:56 |
takashi | eranrom: Thank you | 13:56 |
openstackgerrit | Eran Rom proposed openstack/storlets: Create common under storlet_gateway https://review.openstack.org/320945 | 13:59 |
*** eranrom has quit IRC | 14:09 | |
*** eranrom has joined #openstack-storlets | 14:34 | |
*** eranrom has quit IRC | 14:49 | |
*** eranrom has joined #openstack-storlets | 15:01 | |
*** takashi has quit IRC | 16:26 | |
openstackgerrit | Takashi Kajinami proposed openstack/storlets: Create common under storlet_gateway https://review.openstack.org/320945 | 16:33 |
openstackgerrit | Takashi Kajinami proposed openstack/storlets: WIP: Revert zero-copy about GET execution on object-server https://review.openstack.org/327244 | 16:54 |
openstackgerrit | Takashi Kajinami proposed openstack/storlets: WIP: Revive zero-copy about GET execution on object-server https://review.openstack.org/327244 | 16:56 |
*** eranrom has quit IRC | 17:29 | |
*** eranrom has joined #openstack-storlets | 17:38 | |
*** eranrom has quit IRC | 18:07 | |
*** eranrom has joined #openstack-storlets | 18:10 | |
*** eranrom has quit IRC | 20:29 | |
*** openstackgerrit has quit IRC | 20:48 | |
*** openstackgerrit has joined #openstack-storlets | 20:49 | |
openstackgerrit | Merged openstack/storlets: Create common under storlet_gateway https://review.openstack.org/320945 | 20:49 |
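
The best-effort approach discussed between 13:06 and 13:30 (use the fd when the disk file reader exposes one, fall back to copying from the iterator otherwise, and error out when a storlet declares that it needs random access but no fd is available) could be sketched roughly as below. This is only an illustration under stated assumptions: `select_input`, `RandomAccessUnavailable`, and the `requires_random_access` flag are hypothetical names, not part of Swift or Storlets. The only detail taken from the log is the `app_iter._fp.fileno()` access.

```python
class RandomAccessUnavailable(Exception):
    """Hypothetical error: the storlet needs random access but only a stream exists."""


def select_input(app_iter, requires_random_access=False):
    """Pick the input source for a storlet invocation (hypothetical helper).

    Returns ('fd', fd) when a file descriptor can be reached through the
    private ``_fp`` attribute, otherwise ('iter', app_iter) so the caller
    can fall back to copying the data.
    """
    # _fp is private to the disk file reader, so treat it as optional.
    fp = getattr(app_iter, '_fp', None)
    fd = None
    if fp is not None:
        try:
            fd = fp.fileno()
        except (AttributeError, OSError, ValueError):
            # No usable descriptor (e.g. closed file or not a real file).
            fd = None

    if fd is not None:
        return 'fd', fd

    if requires_random_access:
        # E.g. running on the proxy, or the object server exposed no fd.
        raise RandomAccessUnavailable(
            'storlet requires random access but no file descriptor is available')

    return 'iter', app_iter
```

A caller on the object server would pass `orig_resp.app_iter`. On the proxy, where only a sequential stream exists, the same call returns the iterator, or raises if the storlet metadata says random access is required.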