21:01:01 <oneswig> #startmeeting scientific_wg
21:01:02 <openstack> Meeting started Tue Jun 13 21:01:01 2017 UTC and is due to finish in 60 minutes. The chair is oneswig. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:01:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:01:05 <openstack> The meeting name has been set to 'scientific_wg'
21:01:10 <oneswig> Greetings all
21:01:12 <rbudden> hello everyone!
21:01:21 <jmlowe> Hey, Bob made it!
21:01:21 <martial> Hi Stig, everyone
21:01:29 <Xiaoyi-Lu-OSU_> Hi, Stig, everyone
21:01:31 <oneswig> #chair martial
21:01:31 <openstack> Current chairs: martial oneswig
21:01:33 <jmlowe> Martial, Stig
21:01:33 <priteau> Hello
21:01:38 <rbudden> yes, might have to bow out early for daycare pickup though ;)
21:01:39 <oneswig> Hi Xiaoyi-Lu-OSU_, thanks for coming
21:01:42 <trandles> Hi hi everyone
21:01:47 <Xiaoyi-Lu-OSU_> sure you are welcome
21:01:52 <DK_> Hi, Stig
21:01:56 <oneswig> #link Today's agenda https://wiki.openstack.org/wiki/Scientific_working_group#IRC_Meeting_June_13th_2017
21:02:13 <oneswig> OK let's get the show on the road
21:02:17 <oneswig> Blair here yet?
21:02:36 <oneswig> OK, let's get started
21:02:47 <oneswig> #topic RDMA-enabled Big Data
21:03:01 <oneswig> Hello DK_ Xiaoyi-Lu-OSU_ thanks both for coming
21:03:13 <Xiaoyi-Lu-OSU_> Hello everybody
21:03:23 <oneswig> #link presentation for today is http://www.hpcadvisorycouncil.com/events/2017/swiss-workshop/pdf/Tuesday11April/DKPanda_BigDataMeetsHPC_Tue04112017.pdf
21:03:38 <oneswig> We can talk today about OSU's work
21:03:52 <oneswig> on optimising HPDA platforms with RDMA
21:03:55 <verdurin> Evening.
21:04:03 <oneswig> Hi verdurin, welcome
21:04:46 <oneswig> Xiaoyi-Lu-OSU_: DK_: can you talk about how long you've been working on this project and give an overview of what you've done?
21:05:08 <DK_> We have been working on this project for the last four years.
21:05:43 <DK_> The broad idea is to exploit HPC technologies (including RDMA) to accelerate Hadoop, Spark and Memcached.
21:06:16 <DK_> Recently, we have also been exploring virtualization support for these stacks with SR-IOV and OpenStack
21:06:16 <oneswig> Hadoop and Spark - presumably big lumps of java - how do you do that?
21:06:42 <Xiaoyi-Lu-OSU_> For Hadoop, we have designs for different components
21:06:52 <Xiaoyi-Lu-OSU_> say, RPC, MapReduce, and HDFS
21:06:55 <oneswig> I'm looking at the box called "OSU design" in the new network stack (slide 11)
21:07:10 <Xiaoyi-Lu-OSU_> They are designed with Java + JNI + native C libraries
21:07:52 <oneswig> Are there well established precedents for using JNI to do RDMA into a JVM?
21:07:57 <Xiaoyi-Lu-OSU_> For Spark, we currently also bring our RDMA design into the shuffle manager
21:08:49 <b1airo> hi all (bit late sorry, early morning dns issues)
21:08:57 <oneswig> Hi b1airo, good morning
21:09:01 <oneswig> #chair b1airo
21:09:02 <Xiaoyi-Lu-OSU_> Different groups are exploring different solutions. We choose JNI to have better control for the low-level verbs-based designs.
21:09:03 <openstack> Current chairs: b1airo martial oneswig
21:09:05 <martial> #chair b1airo
21:09:58 <oneswig> Xiaoyi-Lu-OSU_: How big were the changes for the Hadoop and Spark components - is this a major change or is it well layered?
21:10:25 <Xiaoyi-Lu-OSU_> First I think it is well layered.
21:10:53 <Xiaoyi-Lu-OSU_> For example, we implement our RDMA designs as plugins for these components.
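
[Aside: the "Java + JNI + native C libraries" layering described above can be sketched roughly as follows. This is a minimal illustration only; all class, method, and library names are hypothetical placeholders, not the actual OSU/HiBD code.]

    // Hypothetical sketch: a Java-side component declares native methods,
    // and the verbs-level work (registration, posting, polling) lives in a
    // C library loaded at startup.
    public class RdmaShuffleTransport {
        static {
            // Native library wrapping the verbs API; the name is made up.
            System.loadLibrary("rdmashuffle");
        }

        // Register a direct ByteBuffer with the RDMA device so the native
        // side can post sends/receives against it without extra copies.
        public native long registerBuffer(java.nio.ByteBuffer directBuffer);

        // Post an RDMA write of 'length' bytes from the registered region
        // to the peer identified by 'peerId'; returns a work request handle.
        public native long postWrite(long regionHandle, int peerId, int length);

        // Drain the completion queue; returns the number of completions seen.
        public native int pollCompletions();
    }
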
21:11:11 <Xiaoyi-Lu-OSU_> We don't want to change too many lines of code inside the original codebase.
21:11:30 <Xiaoyi-Lu-OSU_> That's why we are able to support the Apache distribution of Hadoop, as well as CDH and HDP.
21:11:36 <oneswig> So it's maintainable? Sounds promising.
21:12:11 <Xiaoyi-Lu-OSU_> Yes, it is maintainable. And we don't change any existing APIs for these components
21:12:39 <oneswig> Something I missed was the acronym HHH for Hadoop - what is that? (slide 14)
21:12:40 <Xiaoyi-Lu-OSU_> We also keep the existing architecture intact.
21:13:16 <Xiaoyi-Lu-OSU_> HHH means A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture
21:13:17 <b1airo> oneswig: are there slides somewhere i missed?
21:13:31 <Xiaoyi-Lu-OSU_> Hybrid, HDFS, and Heterogeneous are three key words
21:13:32 <oneswig> these ones b1airo: http://www.hpcadvisorycouncil.com/events/2017/swiss-workshop/pdf/Tuesday11April/DKPanda_BigDataMeetsHPC_Tue04112017.pdf
21:13:52 <oneswig> Hybrid between what and what?
21:14:58 <Xiaoyi-Lu-OSU_> Hybrid means different I/O paths among hard disks, SSD, RAM Disk, and parallel filesystems.
21:15:12 <Xiaoyi-Lu-OSU_> More details can be found in this paper: Triple-H: A Hybrid Approach to Accelerate HDFS on HPC Clusters with Heterogeneous Storage Architecture
21:15:20 <Xiaoyi-Lu-OSU_> which was presented at CCGrid 2015
21:15:26 <oneswig> Is there a URL?
21:15:50 <Xiaoyi-Lu-OSU_> Here it is: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7152476
21:16:05 <Xiaoyi-Lu-OSU_> from the IEEE Digital Library
21:16:48 <oneswig> Thanks Xiaoyi-Lu-OSU_
21:16:56 <Xiaoyi-Lu-OSU_> No problem.
21:16:57 <martial> #link http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=7152476
21:18:20 <oneswig> the benchmark results (slide 26 onwards) look compelling. How does Ethernet (10G or 50G) fit on these graphs - have you tested?
21:18:56 <Xiaoyi-Lu-OSU_> we have some initial evaluations with 10GigE, but we have not done anything for 50G
21:19:14 <oneswig> RoCE 10GE would be interesting if you've done that
21:19:52 <Xiaoyi-Lu-OSU_> We don't have a large testbed with RoCE 10GE
21:20:05 <Xiaoyi-Lu-OSU_> but anybody can do this study with our libraries since we support RoCE also
21:20:21 <Xiaoyi-Lu-OSU_> users can just download them from this link: http://hibd.cse.ohio-state.edu/
21:20:25 <Xiaoyi-Lu-OSU_> as indicated in slide 12
21:21:32 <martial> #link http://hibd.cse.ohio-state.edu/
21:21:36 <oneswig> Is the project open source?
21:21:52 <Xiaoyi-Lu-OSU_> not yet.
21:22:10 <Xiaoyi-Lu-OSU_> The OHB benchmarks are open-sourced already
21:22:26 <oneswig> Do you have plans for the rest?
21:22:36 <Xiaoyi-Lu-OSU_> yes, in future
21:22:47 <oneswig> great.
21:22:52 <b1airo> large scale RoCE workloads may not work well anyway, unless you were using MOFED 4.0, depending on application RDMA QP requirements
21:23:10 <oneswig> b1airo: did you get that bond issue fixed that was blocking you?
21:24:01 <oneswig> Xiaoyi-Lu-OSU_: These results all look great. What are the cases where it doesn't perform well? :-)
21:24:36 <b1airo> oneswig: it was supposed to be fixed in 4.1 but we haven't checked it again yet
21:24:39 <Xiaoyi-Lu-OSU_> for some workloads, if they are not communication intensive, then you may not be able to see obvious benefits
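
[Aside: to illustrate the "plugins, no API changes" point made earlier in Spark terms - a minimal, hypothetical sketch. The spark.shuffle.manager key is standard Spark configuration; the RDMA shuffle manager class name below is a placeholder, not the actual HiBD class.]

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class RdmaSparkExample {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                .setAppName("rdma-shuffle-demo")
                // Swap in an alternative shuffle implementation purely via
                // configuration; the class name here is hypothetical.
                .set("spark.shuffle.manager", "org.example.shuffle.RdmaShuffleManager");
            JavaSparkContext sc = new JavaSparkContext(conf);
            // Application code is unchanged; only the transport underneath differs.
            long count = sc.parallelize(java.util.Arrays.asList(1, 2, 3, 4)).count();
            System.out.println("count = " + count);
            sc.stop();
        }
    }
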
21:26:03 <oneswig> Why did you base your benchmarks on IPoIB? Was that because of what you had available?
21:26:41 <Xiaoyi-Lu-OSU_> No, for IPoIB, we can evaluate our enhanced design against the default design on the same InfiniBand hardware
21:27:01 <Xiaoyi-Lu-OSU_> we believe this is a fair way to compare them.
21:27:11 <b1airo> makes sense
21:27:40 <b1airo> it would be very interesting, and probably give the work a greater audience and applicability, to see more results over an Ethernet fabric
21:27:56 <oneswig> Do you know these people: https://gaia.ac.uk/gaia-uk/ioa-cambridge/dpci - website says 108 nodes but that's a previous generation. IIRC they have 200+ nodes running IPoIB Hadoop
21:28:03 <Xiaoyi-Lu-OSU_> Do you mean RoCE?
21:28:29 <b1airo> Yes, RoCE versus regular TCP over the same high-speed Ethernet
21:28:31 <oneswig> Xiaoyi-Lu-OSU_: I'd say so
21:29:20 <Xiaoyi-Lu-OSU_> Yes, we actually can support that type of comparison as well with our packages
21:29:35 <Xiaoyi-Lu-OSU_> like I said earlier, we support native RoCE or IB
21:29:43 <Xiaoyi-Lu-OSU_> it can be configured with our libraries
21:30:03 <oneswig> So - OpenStack - how do they integrate so far? You build on an OFED image, so that's easy enough.
21:30:06 <b1airo> Xiaoyi-Lu-OSU_: do you have any special relationship with Mellanox, e.g., Centre of Excellence?
21:30:09 <Xiaoyi-Lu-OSU_> we are able to run three modes: default TCP/IP, native IB, native RoCE
21:30:50 <Xiaoyi-Lu-OSU_> you can go to this page: http://hibd.cse.ohio-state.edu/userguide/
21:30:59 <oneswig> Are the last two via MVAPICH2?
21:30:59 <b1airo> they might be able to get you access to a reasonable Ethernet test-bed
21:31:02 <Xiaoyi-Lu-OSU_> to get all the configuration information from our userguides for various components
21:31:21 <Xiaoyi-Lu-OSU_> yes, we have worked closely with Mellanox folks
21:31:43 <Xiaoyi-Lu-OSU_> MVAPICH2 is a separate project
21:31:52 <Xiaoyi-Lu-OSU_> that's great!
21:32:01 <Xiaoyi-Lu-OSU_> we would love to get the access
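
[Aside: the three transport modes mentioned above (default TCP/IP, native IB, native RoCE) are selected through configuration rather than code changes. The sketch below uses the standard Hadoop Configuration API, but the property names are illustrative guesses only - the real keys are documented in the HiBD userguides at http://hibd.cse.ohio-state.edu/userguide/.]

    import org.apache.hadoop.conf.Configuration;

    public class TransportModeExample {
        // Build a Configuration for one of the three modes discussed above.
        // Property names are hypothetical placeholders; check the HiBD userguide.
        public static Configuration forMode(String mode) {
            Configuration conf = new Configuration();
            if ("ib".equals(mode)) {            // native InfiniBand verbs
                conf.setBoolean("hadoop.ib.enabled", true);
                conf.setBoolean("hadoop.roce.enabled", false);
            } else if ("roce".equals(mode)) {   // RDMA over Converged Ethernet
                conf.setBoolean("hadoop.ib.enabled", false);
                conf.setBoolean("hadoop.roce.enabled", true);
            } else {                            // stock TCP/IP (e.g. IPoIB or 10GigE)
                conf.setBoolean("hadoop.ib.enabled", false);
                conf.setBoolean("hadoop.roce.enabled", false);
            }
            return conf;
        }
    }
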
21:32:51 <oneswig> Xiaoyi-Lu-OSU_: OK thanks re: MVAPICH2. How have you integrated with OpenStack to date?
21:33:39 <Xiaoyi-Lu-OSU_> So, for OpenStack integration, we are using Heat to develop a standard deploy template
21:33:54 <oneswig> Is this the appliance for Chameleon?
21:34:03 <Xiaoyi-Lu-OSU_> this template will help users to set up all required dependencies as well as install and configure our libraries automatically
21:34:08 <Xiaoyi-Lu-OSU_> yes
21:34:15 <Xiaoyi-Lu-OSU_> it is available on Chameleon already
21:34:48 <Xiaoyi-Lu-OSU_> here is the information about our appliance
21:34:49 <Xiaoyi-Lu-OSU_> https://www.chameleoncloud.org/appliances/17/docs/
21:35:23 <oneswig> The appliance looks like it deploys a hypervisor first - is that correct?
21:35:39 <Xiaoyi-Lu-OSU_> yes
21:35:55 <Xiaoyi-Lu-OSU_> we will first allocate some bare-metal nodes and then deploy the KVM instances on top of them
21:36:10 <Xiaoyi-Lu-OSU_> then, a layer of RDMA-Hadoop will be set up on these VMs
21:36:28 <oneswig> Are the benchmarks you publish from SR-IOV VMs? Or bare metal?
21:36:59 <Xiaoyi-Lu-OSU_> except slide 48
21:37:07 <Xiaoyi-Lu-OSU_> others are taken from bare-metal nodes
21:37:38 <Xiaoyi-Lu-OSU_> the numbers on slide 48 are taken from SR-IOV VMs
21:37:57 <oneswig> We are looking at integrating HiBD into Sahara - on bare metal
21:38:04 <oneswig> Any advice?
21:38:45 <Xiaoyi-Lu-OSU_> We are still exploring Sahara. At this point, we don't know what kind of issues will be there
21:38:49 <oneswig> What my colleague Mark has found so far is that HiBD uses pretty new versions of everything, and Sahara's pinned on some pretty old versions.
21:38:58 <Xiaoyi-Lu-OSU_> if you find any problems, please feel free to contact us
21:39:04 <Xiaoyi-Lu-OSU_> and we will be happy to help
21:39:13 <oneswig> Thanks Xiaoyi-Lu-OSU_, will do!
21:39:21 <b1airo> Xiaoyi-Lu-OSU_: would the Heat template be easily editable to remove the bare-metal component, assuming we already had RDMA-capable KVM guests via Nova?
21:39:47 <Xiaoyi-Lu-OSU_> We keep upgrading our designs to the newer versions of the codebase
21:39:51 <oneswig> I think generally it would be a great thing to have this project easily integrated into HPDA-on-demand
21:40:10 <b1airo> oneswig: we have Sahara in the Nectar testcloud at the moment, it's behind Trove for prod deployment currently
21:40:11 <Xiaoyi-Lu-OSU_> yes, that will be doable with Heat
21:40:11 <martial> oneswig: great idea
21:40:58 <oneswig> Xiaoyi-Lu-OSU_: Interesting to see how you've worked a software deployment inside a VM that is itself a software deployment
21:41:27 <Xiaoyi-Lu-OSU_> OK
21:41:29 <Xiaoyi-Lu-OSU_> :-)
21:42:10 <oneswig> I'm sure there's wide interest in this, for anyone with RDMA-enabled kit and an interest in data-intensive analytics
21:42:23 <Xiaoyi-Lu-OSU_> I agree
21:42:43 <oneswig> Xiaoyi-Lu-OSU_: do you already have contact with people from the Sahara project?
21:43:29 <Xiaoyi-Lu-OSU_> One person from the Sahara project talked to me earlier when I presented this work at the OpenStack Summit in Boston
21:43:49 <Xiaoyi-Lu-OSU_> they were also very interested in our designs.
21:44:02 <oneswig> If we can, let's find a way to get an OSU recipe into their work.
21:44:07 <b1airo> that's my main concern with Sahara... as I understand it Mirantis built it but then cut back the dev resources, and I'm not sure how much other community there is around it yet
21:44:35 <oneswig> b1airo: don't know either. It's very useful for us.
21:45:39 <oneswig> Xiaoyi-Lu-OSU_: memcached - presumably much simpler. I wonder if this might be usable on the OpenStack control plane - have you ever tested it in a Keystone use case?
21:46:00 <b1airo> i agree it is useful, but i'm still on the fence about whether we need a specific service, i mean you could do something pretty similar with Murano packages
21:46:03 <Xiaoyi-Lu-OSU_> No, we didn't do that
21:46:16 <b1airo> or even just an external orchestration tool - Juju, Ansible, etc
21:46:46 <oneswig> b1airo: We do a fair bit with Heat wrapped up in Ansible. What these services get you is a dashboard panel.
21:46:57 <oneswig> Which helps for user friendliness
21:47:10 <Xiaoyi-Lu-OSU_> we have not explored these
21:48:31 <oneswig> I suppose it doesn't do much for you if Hadoop/Spark is part of a wider application platform - like http://opencb.org
21:48:47 <b1airo> oneswig: yes UI is important, but Murano could give you that too
21:49:10 <oneswig> b1airo: Isn't Murano in a similar boat to Sahara?
21:49:32 <b1airo> have you looked at the Sahara interface? there are a LOT of widgets, confusing even for someone with a vague idea of what they are doing
21:49:52 <b1airo> oneswig: dev/community wise? yes i suppose so
21:50:24 <b1airo> i suppose i'm thinking that not all of these projects will last and wondering what things are best to invest in
21:50:25 <Xiaoyi-Lu-OSU_> If OpenCB can run with default Hadoop/Spark, they should be able to run on our packages directly
21:50:44 <oneswig> b1airo: No I haven't - slightly worried that the interface might not be the panacea we hope for...
21:51:10 <oneswig> We are overrunning on time... any final questions WG?
21:51:13 <b1airo> other option for UI is something like Ambari, but then that's a service atop your cloud, not integrated into it
21:52:05 <oneswig> b1airo: perhaps that's OK, in that the application platform is not locked in to OpenStack...
21:52:22 <oneswig> heresy I know :-)
21:52:42 <oneswig> OK, we should cover other items.
21:52:51 <Xiaoyi-Lu-OSU_> OK
21:52:53 <Xiaoyi-Lu-OSU_> :-)
21:52:54 <oneswig> Thank you Xiaoyi-Lu-OSU_ and DK_ - very helpful
21:53:02 <DK_> Thank you!!
21:53:04 <Xiaoyi-Lu-OSU_> Thanks everyone!
21:53:06 <oneswig> really good to have you come by!
21:53:12 <b1airo> yes thanks a lot!
21:53:19 <martial> really good coverage, truly appreciated
21:53:29 <oneswig> Martial do you have a roundup on ORC?
21:53:36 <martial> yes I do
21:53:39 <martial> #topic ORC roundup
21:53:46 <martial> ORC continues its discussion on the different topics discovered during the original effort
21:53:52 <martial> #link https://drive.google.com/drive/folders/0B4Y7flFgUgf9dElkaFkwbUhKblU
21:53:58 <martial> Conversation is ongoing toward a federation effort for more than just OpenStack
21:54:04 <martial> There is a request to see who from the SWG can bring research user stories / information / effort to the initiative
21:54:12 <martial> The next meeting is going to be at the end of August in Amsterdam (dates to be finalized)
21:54:19 <martial> Stig, are you able/willing to go?
21:54:39 <martial> (physical meeting)
21:54:47 <oneswig> martial: sounds possible.
21:54:55 <martial> there is a weekly telecon on Mondays at 11am EST
21:54:59 <oneswig> I love a good kletskoppen
21:55:24 <martial> oneswig: will mention that to Kazil then, he will reach out to you once the official meeting is set
21:55:36 <martial> and that is it for the ORC roundup :)
21:55:45 <oneswig> OK, thanks martial
21:55:54 <martial> (yes I prepared my text :) )
21:56:10 <oneswig> Anyone going to ISC next week? I believe Xiaoyi-Lu-OSU_ and DK will be there...
21:56:42 <oneswig> Unfortunately I will not but one of my colleagues is heading over
21:56:50 <Xiaoyi-Lu-OSU_> we are there
21:56:56 <oneswig> Xiaoyi-Lu-OSU_: great
21:56:56 <powerd> hey oneswig, i'll be over at ISC!
21:57:09 <b1airo> i was going to but then realised how close it was to all the travel i've just had - need to stick around home for a while!
21:57:19 <oneswig> Hello powerd! You guys should meet up with Xiaoyi-Lu-OSU_?
21:57:58 <Xiaoyi-Lu-OSU_> we are happy to meet with you guys there
21:57:59 <oneswig> powerd: John's going to be there, perhaps he already mentioned that.
21:58:06 <powerd> yea that would be great - i sat in a workshop at SC about HiBD and have had it on my 'to test' list for too long
21:58:15 <b1airo> quick question - anyone got experience with transparent hugepages integration with HPC workloads ?
21:58:21 <powerd> yup will be syncing with john too
21:58:41 <oneswig> b1airo: not that I'm aware of, we've been busy on bare metal these last few months
21:59:06 <oneswig> I had one final question
21:59:11 <b1airo> e.g., how to transparently make allocations use madvise etc, preferably with scheduler switches
21:59:13 <oneswig> #link Charliecloud article https://insidehpc.com/2017/06/charliecloud-simplifies-big-data-supercomputing-lanl/
21:59:26 <oneswig> trandles: Are you holding punched cards in the photo??
21:59:36 <trandles> haha
21:59:38 <rbudden> lol
21:59:40 <oneswig> You guys are old school :-)
21:59:41 <trandles> yeah the caption wasn't correct
22:00:05 <trandles> I'm holding equivalent pages representing the code bases for Docker, Shifter, Singularity
22:00:09 <b1airo> really floppy disks?
22:00:18 <trandles> Reid is holding all of Charliecloud's source code
22:00:35 <b1airo> should the caption have read: "here's some guys we ran into outside the printer room"
22:00:42 <trandles> yes basically
22:00:45 <oneswig> OK - on that happy note, time to take it away for another week...
22:00:56 <oneswig> Thanks all
22:00:59 <oneswig> #endmeeting