15:00:29 #startmeeting openstack-cyborg
15:00:29 Meeting started Wed Nov 15 15:00:29 2017 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:30 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:33 The meeting name has been set to 'openstack_cyborg'
15:01:25 #link https://zoom.us/j/973513277
15:06:03 just had a quick chat with crushil: we will keep the IRC meeting today and move the Zoom meeting to a later time slot for dev-specific issues
15:06:14 will keep everybody posted
15:07:29 so no zoom, just irc for now?
15:08:29 masuberu yep :)
15:08:36 crushil you there ?
15:09:03 ok
15:09:05 zhipeng: what's the zoom conference ID?
15:09:26 shaohe_feng_ will do zoom tomorrow morning
15:09:54 zhipeng: got it.
15:10:01 could everyone log their names ?
15:10:04 #info Howard
15:10:23 Can’t get my pc connected, have to use cellphone
15:10:32 #info Michele
15:10:41 #info Li
15:10:41 zhuli we will just use irc tonight
15:10:42 #info zhuli
15:10:51 Got it
15:11:01 Do we have another meeting tomorrow?
15:11:20 I'm here
15:11:35 okay, seems like almost a full house :)
15:11:36 #info Rushil
15:11:58 #topic GATK4 FPGA requirement discussion
15:12:23 the first order of business today is that we have masuberu
15:12:49 o/
15:12:54 system engineer from one of the largest genome research and testing centers in the world
15:13:05 Welcome masuberu
15:13:19 thank you all for having me here
15:13:33 to talk with us about their upcoming trial with Intel FPGA cards to run the GATK4 application for genome analysis
15:13:42 the floor is yours masuberu :)
15:13:50 thank you
15:14:05 I work for a medical research institute
15:14:16 and we sequence and analyse genomes
15:14:22 #info shaohe
15:14:29 right now we are in a project with intel
15:15:04 they are helping us to build an environment to run GATK4, which is a set of best-practice tools for genome analysis
15:15:25 tomorrow a guy from intel is coming to my office to give me FPGA cards
15:15:50 because GATK4 can work on an HPC cluster, a Spark cluster, or GDC
15:16:16 so my idea is to test GATK4 on FPGAs through openstack and compare performance with apache spark
15:16:36 obviously FPGA should be much faster but we want to see how much faster
15:17:25 and I was looking for documentation about how to integrate FPGAs on openstack and I found your project
15:17:52 shaohe_feng_ is from Intel :)
15:17:53 so I wanted to say hi and see if there is a possibility of collaboration
15:18:02 ok
15:18:10 well intel is quite big
15:18:12 I guess
15:18:16 masuberu: hi
15:18:22 haha yes
15:18:33 Hi Masuberu, this is Li from Huawei, our team has a lot of experience using FPGAs in Openstack
15:18:36 the person coming tomorrow is from either China or Singapore I think
15:18:58 masuberu could you describe a possible setup ?
15:19:20 For example, are you gonna have a VM connected to the FPGA
15:19:38 or just feed data directly to GATK4 on that FPGA ?
15:20:04 whatever is easier, I was thinking of giving it a go to see if it is possible to integrate FPGAs with nova and whether the vms could see them
15:20:13 masuberu: I'm also working on FPGA. And some of us have started to support FPGA on k8s
15:20:14 otherwise I am happy to use different methods
15:20:27 ok
15:20:36 I am totally open to suggestions
15:20:45 the setup is
15:21:13 4 compute nodes with AMD cpus (56 cores), 512 GB RAM and ~4.5 TB of 10k spinning disks
15:21:17 Do you have the developed IPs to use for the FPGA?
15:21:26 #info masuberu from Garvan Institute of Medical Research
15:21:35 and 3 nodes with intel cpus (numa, 2 sockets, 14 cpus each), 512 GB RAM and 20 TB nvmes
15:21:56 #link https://www.youtube.com/watch?v=HSAJpAxORAw
15:21:57 all connected through a 2x25Gb bonded network
15:22:04 network is mellanox ethernet
15:22:11 mellanox switch 100Gb
15:22:44 it is not a big setup but I think it is a good starting point
15:23:39 #info Li
15:23:41 I don't know how many FPGAs I will get, I will know tomorrow
15:23:53 shaohe_feng_ any tips on integrating nova with FPGA ?
15:24:11 so I am not sure if that answers most of the questions, please feel free to ask
15:24:32 masuberu will GATK4 run as an IP for the FPGA ?
15:25:06 tomorrow I have a meeting with Albert from intel and he will brief me about how it works
15:25:16 Question: are there multiple IPs to be used in GATK4 ?
15:25:42 LiLiu_, what do you mean by multiple IPs?
15:26:06 I mean different bitstreams
15:26:50 so GATK4 is a bunch of tools bioinformaticians use to create pipelines
15:27:39 you can run one tool at a time or multiple ones at the same time, depends on what you are doing
15:27:54 #link https://github.com/broadinstitute/gatk
15:28:33 OK, I assume you will have multiple types of IPs in your system. Do you chain the FPGAs together, or does each FPGA only talk to its attached CPU ?
15:29:00 ok, I will ask the intel guy that question tomorrow
15:29:09 I honestly have never used an FPGA before
15:29:45 shaohe_feng__ any tips on integrating nova with FPGA ?
15:31:07 I see, my team works a lot on FPGA heterogeneous systems in Openstack, we can talk more offline if you are interested, masuberu.
15:31:28 sure
15:31:38 liliu1@huawei.com
15:33:04 did some googling
15:33:05 I wanna know more about the scope/size of the system you guys are targeting
15:33:28 it will depend on the performance of these tests
15:33:29 seems like pairHMM will be the part that needs acceleration from FPGA or GPU ?
15:34:09 for instance we have capacity to sequence 1200 genomes per month so it all depends how much hardware we need for that
15:34:31 #link https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/wp/wp-accelerating-genomics-opencl-fpgas.pdf
15:34:47 zhipeng: actually, every accelerator in Intel's FPGA is an SR-IOV device.
15:35:02 zhipeng, yep I got that document yesterday from Albert
15:35:22 shaohe_feng__ how about the altera ones ?
15:35:26 the same ?
15:35:37 zhipeng: altera means intel :)
15:35:54 haha
15:36:16 nallatech 385A is what I am getting
15:36:16 zhipeng: there may be 3 methods to support FPGA.
15:36:30 getting for testing
15:36:54 zhipeng: 1. pre-programming the FPGA on the host.
15:38:02 zhipeng: 2. the orchestration layer programs the FPGA.
15:38:15 zhipeng: 3. the VM programs the FPGA.
15:38:36 zhipeng: I have got an FPGA server, and I'm trying to do some tests on it
15:38:54 shaohe_feng__, the person coming tomorrow is called AlbertZQ Wang, not sure if you know him...
15:39:25 zhipeng: AlbertZQ Wang from intel?
15:39:36 ^ masuberu:
15:39:52 yes
15:40:02 which site? China or US
15:40:17 I guess China but not 100% sure
15:40:22 maybe either of them
15:41:48 any questions?
15:41:53 masuberu: haha, we can easily reach each other through intel's internal contact tool.
15:42:23 yeah they showed me that tool last time they came to my office
15:42:39 any plan?
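The sizing remark at 15:34:09 (capacity to sequence 1200 genomes per month, with the hardware count depending on measured performance) comes down to simple arithmetic once a per-genome runtime is known. A minimal sketch in Python, assuming hypothetical runtimes of 24 hours per genome on a CPU node and 6 hours on an FPGA-accelerated node. Neither runtime was stated in the meeting; they only illustrate the calculation.

```python
import math

HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month of continuous operation


def nodes_needed(genomes_per_month, hours_per_genome, hours_per_month=HOURS_PER_MONTH):
    """Nodes required if every node processes one genome at a time."""
    total_node_hours = genomes_per_month * hours_per_genome
    return math.ceil(total_node_hours / hours_per_month)


# 1200 genomes/month is the figure from the log; both runtimes are hypothetical.
cpu_nodes = nodes_needed(1200, hours_per_genome=24.0)
fpga_nodes = nodes_needed(1200, hours_per_genome=6.0)
print("CPU-only nodes:", cpu_nodes)
print("FPGA-accelerated nodes:", fpga_nodes)
```

Replacing the two hypothetical runtimes with the numbers measured in the GATK4 versus Spark comparison would give the actual hardware estimate.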
15:42:59 i think this is definitely a very interesting use case for us
15:43:22 it would be great if you could come back with more details from the Intel guys tomorrow
15:43:34 and we could hammer out details for future work
15:43:49 i think getting Nova to connect with FPGA is easy
15:44:15 but how to schedule and manage your FPGAs is what we will be working on
15:44:21 zhipeng, you mean because of SR-IOV?
15:44:49 it can be nova or k8s
15:44:55 or another method
15:45:05 yes because the current support in Nova is good enough
15:45:08 to set it up
15:45:11 ok
15:45:18 what about overhead?
15:45:36 what type of overhead are you thinking about ?
15:45:38 is it going to have much overhead due to virtualization?
15:45:50 i think that depends
15:46:05 on how you set up the workload between VM and FPGA
15:46:37 ok, is there anything I can read about that? because I don't know much to be honest
15:46:46 for example if you are gonna have the FPGA do all the GATK heavy lifting and the VM just run a data pipe application
15:46:53 and I want to get the most out of this project
15:47:04 i don't think there will be much overhead
15:47:05 I don't know for how long I can have the FPGAs
15:47:54 let's say we decide to run on the numa nodes, then I can even enable cpu affinity
15:48:10 to make sure processes use local memory
15:48:16 but that is a different story
15:48:31 yes
15:48:48 I am more worried about the lack of documentation about openstack+FPGAs
15:49:16 you can just go through the nova pci support docs
15:49:16 also we have another project coming to run GPUs for machine learning stuff
15:49:39 but that will come in 3-4 months time
15:49:46 ok
15:49:50 cool :)
15:50:00 another thing, I normally deploy openstack using kolla...
15:50:10 kolla-ansible, not sure if that will be an issue
15:50:14 crushil
15:50:29 any thoughts on making cyborg kolla-able ? :P
15:51:01 zhipeng, My coworker who presented for us is kolla-kubernetes core
15:51:12 nice
15:51:13 I'm working with him to get Cyborg kollafied
15:51:24 masuberu there you go :)
15:51:24 that's cool
15:51:32 we have everyone you need in cyborg :P
15:51:36 It's already on my plate for a future sprint
15:51:52 I am quite flexible, we can do whatever we want with these nodes as they are running nothing
15:52:19 masuberu just give a holler on this channel any time you have an issue
15:52:45 core members are always logged in, so as long as any of us are awake, we will respond :)
15:52:47 Btw you should play with this tool https://review.openstack.org/#/c/487972/. I will be integrating Cyborg into this tool
15:53:13 you could also just send email to the openstack-dev mailing list, or openstack-ops
15:55:07 I think I know rwellum
15:55:33 ok
15:56:02 by the way GATK4 is just a jar file
15:56:10 very simple to install and run
15:57:04 is it possible to share an FPGA with multiple instances?
15:57:25 or do I need to assign 1 FPGA card per vm?
15:57:31 if the FPGA has multiple PFs/VFs, then yes
15:58:08 It really depends on how the IP is developed
15:58:45 ok, and do FPGAs have cores?
15:59:29 nope
15:59:44 I mean can I split the resources of an FPGA card and assign them to multiple vms?
15:59:50 FPGA can be virtualized and shared by multiple VMs
16:00:01 you are right
16:00:40 for example with memory and cpus I can "define" flavors, can I do the same with FPGAs?
16:01:08 yup
16:01:31 That's one of the most common ways of managing FPGA in Openstack
16:01:46 like for instance 0.3 Teraflops for one instance and 0.7 Teraflops for another
16:01:49 ok
16:02:08 That's totally doable
16:02:21 do you have experience with openhpc?
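The flavor questions above (sharing one FPGA card through its VFs, defining flavors for FPGAs, and pinning CPUs on the NUMA nodes) can be sketched with Nova flavor extra specs. This is a minimal Python sketch, not Cyborg or Nova code: it only builds the `openstack flavor set` commands. The alias name `fpga_vf` and the flavor names are hypothetical, and the alias is assumed to be already defined in nova.conf on the hosts. A Nova PCI alias counts whole devices or VFs, so a split such as 0.3/0.7 Teraflops is approximated here by the number of VFs per flavor.

```python
def fpga_flavor_extra_specs(alias_name, vf_count, numa_nodes=1, pin_cpus=True):
    """Build Nova flavor extra specs requesting `vf_count` FPGA virtual functions."""
    if vf_count < 1:
        raise ValueError("vf_count must be at least 1")
    specs = {
        # Request `vf_count` devices matching the PCI alias configured in nova.conf.
        "pci_passthrough:alias": "{}:{}".format(alias_name, vf_count),
        # Keep the guest on a fixed number of NUMA nodes so memory stays local.
        "hw:numa_nodes": str(numa_nodes),
    }
    if pin_cpus:
        # Pin guest vCPUs to dedicated host CPUs (CPU affinity).
        specs["hw:cpu_policy"] = "dedicated"
    return specs


# Hypothetical flavors: one VF for small jobs, two VFs for larger ones.
small = fpga_flavor_extra_specs("fpga_vf", 1)
large = fpga_flavor_extra_specs("fpga_vf", 2)

for flavor_name, specs in (("fpga.small", small), ("fpga.large", large)):
    props = " ".join(
        "--property {}={}".format(key, value) for key, value in sorted(specs.items())
    )
    print("openstack flavor set {} {}".format(flavor_name, props))
```

How many VFs a card exposes, and what each VF can do, depends on how the IP is developed, as noted at 15:58:08.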
16:02:40 I don't
16:02:47 we run rocks cluster but the development is quite slow, I am wondering whether openhpc would be a better option
16:02:50 ok
16:02:55 but I know someone on my team who might
16:03:32 ok, that's ok, let's try and focus on these tests comparing GATK4 on FPGAs vs Apache Spark
16:03:49 maybe this type of test doesn't even make sense, we will see
16:04:22 so I will get the cards tomorrow and connect them to the servers
16:04:56 sure
16:05:04 and from there if anyone knows of any document about how to integrate nova with FPGAs, or if there is any document/tool within the cyborg project, just let me know
16:06:44 before cyborg is ready to use, I think the most naive way is to just use PCI passthrough
16:07:17 sure
16:13:31 zhipeng, Don't forget to close the meeting
16:13:50 of course :)
16:14:01 if nothing else, we had a great discussion today
16:14:07 the meeting is adjourned
16:14:22 and I will let everyone know the new ZOOM meeting time
16:14:28 #endmeeting
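The PCI passthrough route suggested at 16:06:44 is configured through the `[pci]` section of nova.conf plus a flavor property. A minimal Python sketch that generates those settings, assuming the option names used by Nova releases of this period (`passthrough_whitelist` and `alias`). The vendor and product IDs below are placeholders, not the real IDs of the Nallatech 385A; the actual values would have to be read from `lspci -nn` on the compute node. The alias name `fpga` is also hypothetical.

```python
import json


def pci_passthrough_settings(vendor_id, product_id, alias_name, count=1):
    """Return nova.conf [pci] values and the matching flavor property."""
    device = {"vendor_id": vendor_id, "product_id": product_id}
    # The alias names the device so that flavors can request it.
    alias = dict(device, device_type="type-PCI", name=alias_name)
    return {
        "passthrough_whitelist": json.dumps(device, sort_keys=True),
        "alias": json.dumps(alias, sort_keys=True),
        "flavor_property": "pci_passthrough:alias={}:{}".format(alias_name, count),
    }


# Placeholder IDs: replace with the values reported by `lspci -nn`.
settings = pci_passthrough_settings("abcd", "1234", "fpga")

print("[pci]")
print("passthrough_whitelist = " + settings["passthrough_whitelist"])
print("alias = " + settings["alias"])
print("# openstack flavor set <flavor> --property " + settings["flavor_property"])
```

The whitelist belongs on the compute nodes and the alias on both the compute and API nodes. The scheduler also needs `PciPassthroughFilter` enabled; the Nova PCI passthrough documentation for the deployed release is the reference for the exact option names.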