15:00:29 <zhipeng> #startmeeting openstack-cyborg
15:00:29 <openstack> Meeting started Wed Nov 15 15:00:29 2017 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:30 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:33 <openstack> The meeting name has been set to 'openstack_cyborg'
15:01:25 <zhipeng> #link https://zoom.us/j/973513277
15:06:03 <zhipeng> just did a little bit chatting with crushil, that we will keep irc meeting today, and move the zoom meeting to a later time slot for dev specific issue
15:06:14 <zhipeng> will keep everybody posted
15:07:29 <masuberu> so no zoom, just irc for now?
15:08:29 <zhipeng> masuberu yep :)
15:08:36 <zhipeng> crushil you there ?
15:09:03 <masuberu> ok
15:09:05 <shaohe_feng_> zhipeng: what's the zoom conference ID?
15:09:26 <zhipeng> shaohe_feng_ will do zoom tomorrow morning
15:09:54 <shaohe_feng_> zhipeng: got it.
15:10:01 <zhipeng> could everyone log their names ?
15:10:04 <zhipeng> #info Howard
15:10:23 <zhuli> Can’t get my pc connected, have to use cellphone
15:10:32 <mpaolino> #info Michele
15:10:41 <Guest69871> #info Li
15:10:41 <zhipeng> zhuli we will just irc tonight
15:10:42 <zhuli> #info zhuli
15:10:51 <zhuli> Got it
15:11:01 <Guest69871> Do we have another meeting tomorrow?
15:11:20 <crushil> I'm here
15:11:35 <zhipeng> okay seems like full house almost :)
15:11:36 <crushil> #info Rushil
15:11:58 <zhipeng> #topic GATK4 FPGA requirement discussion
15:12:23 <zhipeng> the first order of business today is that we have masuberu
15:12:49 <masuberu> o/
15:12:54 <zhipeng> system engineer from one of the largest genome testing research centers in the world
15:13:05 <zhuli> Welcome masuberu
15:13:19 <masuberu> thank you all for having me here
15:13:33 <zhipeng> to talk with us about their upcoming trial with Intel FPGA card to run GATK4 application for genome analysis
15:13:42 <zhipeng> the floor is yours masuberu :)
15:13:50 <masuberu> thank you
15:14:05 <masuberu> I work for a medical research institute
15:14:16 <masuberu> and we sequence and analyse genomes
15:14:22 <shaohe_feng__> #info shaohe
15:14:29 <masuberu> right now we are in a project with intel
15:15:04 <masuberu> they are helping us to build an environment to run GATK4 which is a set of tools for best practice genome analysis
15:15:25 <masuberu> tomorrow a guy from intel is coming to my office to give me FPGA cards
15:15:50 <masuberu> because GATK4 can either work on HPC cluster, Spark cluster and GDC
15:16:16 <masuberu> so my idea is to test GATK4 on FPGAs through openstack and compare performance with apache spark
15:16:36 <masuberu> obviously FPGA should be much faster but we want to see how much faster
15:17:25 <masuberu> and I was looking for documentation about how to integrate FPGAs on openstack and I found your project
15:17:52 <zhipeng> shaohe_feng_ is from Intel :)
15:17:53 <masuberu> so wanted to say hi and see if this would be a possibility of collaboration
15:18:02 <masuberu> ok
15:18:10 <masuberu> well intel is quite big
15:18:12 <masuberu> I guess
15:18:16 <shaohe_feng__> masuberu: hi
15:18:22 <zhipeng> haha yes
15:18:33 <Guest69871> Hi Masuberu, This is Li from Huawei, our team has a lot of experience using FPGAs in Openstack
15:18:36 <masuberu> the person coming tomorrow is from either China or Singapore I think
15:18:58 <zhipeng> masuberu could you describe a possible setup ?
15:19:20 <zhipeng> For example if you are gonna have a VM connected to the FPGA
15:19:38 <zhipeng> or just feeding data directly to GATK4 on that FPGA ?
15:20:04 <masuberu> whatever is easier, I was thinking of giving it a go and see if it could be possible to integrate FPGAs with nova and see if the vms could see it
15:20:13 <shaohe_feng__> masuberu: I'm also working on FPGA. And some of us have started to support FPGA on k8s
15:20:14 <masuberu> otherwise I am happy to use different methods
15:20:27 <masuberu> ok
15:20:36 <masuberu> I am totally open to suggestions
15:20:45 <masuberu> the setup is
15:21:13 <masuberu> 4 compute nodes with AMD cpus (56 cores) 512 RAM and ~4.5TB spinning disks 10k
15:21:17 <Guest69871> Do you have the developed IPs to use for the FPGA?
15:21:26 <zhipeng> #info masuberu from Garvan Institute of Medical Research
15:21:35 <masuberu> and 3 nodes with intel cpus (numa 2 sockets 14 cpus each) 512 GB RAM and 20 TB nvmes
15:21:56 <zhipeng> #link https://www.youtube.com/watch?v=HSAJpAxORAw
15:21:57 <masuberu> all connected through 2x25GB bond network
15:22:04 <masuberu> network is mellanox ethernet
15:22:11 <masuberu> mellanox switch 100GB
15:22:44 <masuberu> it is not a big setup but I think it is a good starting point
15:23:39 <LiLiu_> #info Li
15:23:41 <masuberu> I don't know how many FPGAs I will get, I will know tomorrow
15:23:53 <zhipeng> shaohe_feng_ any tips on nova integrates with FPGA ?
15:24:11 <masuberu> so I am not sure if that explains most of the questions? please feel free to ask
15:24:32 <zhipeng> masuberu will GATK4 run as an IP for the FPGA ?
15:25:06 <masuberu> tomorrow I have a meeting with Albert from intel and he will brief me about how it works
15:25:16 <LiLiu_> Question: are there multiple IPs to be used in the GATK4
15:25:42 <masuberu> LiLiu_, what do you mean by multiple IPs?
15:26:06 <LiLiu_> I mean different bitstreams
15:26:50 <masuberu> so GATK4 is a bunch of tools bioinformaticians use to create pipelines
15:27:39 <masuberu> you can run a tool at a time or multiple ones at the same time, depends on what you are doing
15:27:54 <zhipeng> #link https://github.com/broadinstitute/gatk
15:28:33 <LiLiu_> OK, I assume you will have multiple types of IPs in your system. Do you chain the FPGAs together or does each FPGA only talk to the tied CPU ?
15:29:00 <masuberu> ok, I will ask that question to the intel guy tomorrow
15:29:09 <masuberu> I honestly never used an FPGA before
15:29:45 <zhipeng> shaohe_feng__ any tips on nova integrates with FPGA ?
15:31:07 <LiLiu_> I see, my team works a lot on FPGA heterogeneous system in Openstack, we may talk more offline if you are interested, masuberu.
15:31:28 <masuberu> sure
15:31:38 <LiLiu_> liliu1@huawei.com
15:33:04 <zhipeng> did some googling
15:33:05 <LiLiu_> I wanna know more about the scope/size of the system you guys are targeting
15:33:28 <masuberu> it will depend on the performance of these tests
15:33:29 <zhipeng> seems like the pairHMM will be the part that needs acceleration from FPGA or GPU ?
15:34:09 <masuberu> for instance we have capacity to sequence 1200 genomes per month so it all depends how much hardware we need for that
15:34:31 <zhipeng> #link https://www.altera.com/content/dam/altera-www/global/en_US/pdfs/literature/wp/wp-accelerating-genomics-opencl-fpgas.pdf
15:34:47 <shaohe_feng__> zhipeng: actually, every accelerator in Intel's FPGA is a sriov device.
15:35:02 <masuberu> zhipeng, yep I got that document yesterday from Albert
15:35:22 <zhipeng> shaohe_feng__ how about the altera ones ?
15:35:26 <zhipeng> the same ?
15:35:37 <shaohe_feng__> zhipeng: altera means intel :)
15:35:54 <zhipeng> haha
15:36:16 <masuberu> nallatech 385A is what I am getting
15:36:16 <shaohe_feng__> zhipeng: there may be 3 methods to support FPGA.
15:36:30 <masuberu> getting for testing
15:36:54 <shaohe_feng__> zhipeng: 1. pre-programming FPGA on the host.
15:38:02 <shaohe_feng__> zhipeng: 2. the orchestration programs the FPGA
15:38:15 <shaohe_feng__> zhipeng: 3. the VM programs the FPGA.
15:38:36 <shaohe_feng__> zhipeng: I have got an FPGA server, and I'm trying to do some tests on it
15:38:54 <masuberu> shaohe_feng__, the person coming tomorrow is called AlbertZQ Wang, not sure if you know him...
15:39:25 <shaohe_feng__> zhipeng: AlbertZQ Wang from intel?
15:39:36 <shaohe_feng__> ^ masuberu:
15:39:52 <masuberu> yes
15:40:02 <shaohe_feng__> which site? China or US
15:40:17 <masuberu> I guess China but not 100% sure
15:40:22 <masuberu> maybe either of them
15:41:48 <masuberu> any questions?
15:41:53 <shaohe_feng__> masuberu: haha, we can conveniently connect with each other in intel's internal connection tool.
15:42:23 <masuberu> yeah they showed me that tool last time they came to my office
15:42:39 <masuberu> any plan?
15:42:59 <zhipeng> i think this is definitely a very interesting use case for us
15:43:22 <zhipeng> it would be great if you could come back with more details from the Intel guys tomorrow
15:43:34 <zhipeng> and we could hammer out details for future work
15:43:49 <zhipeng> i think getting Nova to connect with FPGA is easy
15:44:15 <zhipeng> but how to sched and manage your FPGAs is what we will be working on
15:44:21 <masuberu> zhipeng, you mean because of SRIOV?
15:44:49 <masuberu> it can be nova or k8s
15:44:55 <masuberu> or another method
15:45:05 <zhipeng> yes because the current support for Nova is good enough
15:45:08 <zhipeng> to set it up
15:45:11 <masuberu> ok
15:45:18 <masuberu> what about overhead?
15:45:36 <zhipeng> what type of overhead you are thinking about ?
15:45:38 <masuberu> is it going to have much overhead due to virtualization?
15:45:50 <zhipeng> i think that depends
15:46:05 <zhipeng> how you set up the workload between VM and FPGA
15:46:37 <masuberu> ok, is there anything I can read about that? because I don't know much to be honest
15:46:46 <zhipeng> for example if you gonna have FPGA do all the GATK heavy lifting and VM just for running a data pipe application
15:46:53 <masuberu> and I want to get the most out of this project
15:47:04 <zhipeng> i don't think there will be much overhead
15:47:05 <masuberu> I don't know for how long I can have the FPGAs
15:47:54 <masuberu> let's say we decide to run on the numa nodes, then I can even enable cpu affinity
15:48:10 <masuberu> to make sure processes use local memory
15:48:16 <masuberu> but that is a different story
15:48:31 <zhipeng> yes
15:48:48 <masuberu> I am more worried about the lack of documentation about openstack+FPGAs
15:49:16 <zhipeng> you can just go through the nova pci support docs
15:49:16 <masuberu> also we have another project coming to run GPUs for machine learning stuff
15:49:39 <masuberu> but that will come in 3-4 months time
15:49:46 <masuberu> ok
15:49:50 <zhipeng> cool :)
15:50:00 <masuberu> another thing I normally deploy openstack using kolla...
15:50:10 <masuberu> kolla-ansible, not sure if that will be an issue
15:50:14 <zhipeng> crushil
15:50:29 <zhipeng> any thoughts on making cyborg kolla-able ? :P
15:51:01 <crushil> zhipeng, My coworker who presented for us is kolla-kubernetes core
15:51:12 <masuberu> nice
15:51:13 <crushil> I'm working with him to get Cyborg kollafied
15:51:24 <zhipeng> masuberu there you go :)
15:51:24 <masuberu> that's cool
15:51:32 <zhipeng> we have everyone you need in cyborg :P
15:51:36 <crushil> It's already on my plate for a future sprint
15:51:52 <masuberu> I am quite flexible we can do whatever we want with these nodes as they are running nothing
15:52:19 <zhipeng> masuberu just give a holler any time you have an issue on this channel
15:52:45 <zhipeng> core members are always logged in , so as long as any of us are awake, we will respond :)
15:52:47 <crushil> Btw you should play with this tool https://review.openstack.org/#/c/487972/. I will be integrating Cyborg into this tool
15:53:13 <zhipeng> you could also just send email to the openstack-dev mailinglist, or openstack-ops
15:55:07 <masuberu> I think I know rwellum
15:55:33 <masuberu> ok
15:56:02 <masuberu> by the way GATK4 is just a jar file
15:56:10 <masuberu> very simple to install and run
15:57:04 <masuberu> is it possible to share an FPGA with multiple instances?
15:57:25 <masuberu> or do I need to assign 1 FPGA card per vm?
15:57:31 <LiLiu_> if the FPGA has multiple PFs/VFs, then yes
15:58:08 <LiLiu_> It really depends on how the IP is developed
15:58:45 <masuberu> ok, and do the FPGAs have cores?
15:59:29 <LiLiu_> nope
15:59:44 <masuberu> I mean can I break the resources of an FPGA card and assign it to multiple vms?
15:59:50 <LiLiu_> FPGA can be virtualized and shared by multiple VMs
16:00:01 <LiLiu_> you are right
16:00:40 <masuberu> for example with memory and cpus I can "define" flavors, can I do the same with FPGAs?
16:01:08 <LiLiu_> yup
16:01:31 <LiLiu_> That's one of the most common ways of managing FPGA in Openstack
16:01:46 <masuberu> like for instance 0.3Teraflops for one instance and 0.7Teraflops to another
16:01:49 <masuberu> ok
16:02:08 <LiLiu_> That's totally doable
16:02:21 <masuberu> do you have experience with openhpc?
16:02:40 <LiLiu_> I don't
16:02:47 <masuberu> we run rocks cluster but the development is quite slow, I am wondering whether openhpc would be a better option
16:02:50 <masuberu> ok
16:02:55 <LiLiu_> but I know someone in my team might do
16:03:32 <masuberu> ok, that's ok, let's try and focus on these tests comparing GATK4 on FPGAs vs Apache Spark
16:03:49 <masuberu> maybe this type of test doesn't even make sense, we will see
16:04:22 <masuberu> so I will get the cards tomorrow and connect them to the servers
16:04:56 <LiLiu_> sure
16:05:04 <masuberu> and from there if anyone knows any document about how to integrate nova with fpgas or if there is any document/tool within cyborg project just let me know
16:06:44 <LiLiu_> before cyborg is ready to use, I think the most naive way is just to use PCI passthrough
16:07:17 <masuberu> sure
16:13:31 <crushil> zhipeng, Don't forget to close the meeting
16:13:50 <zhipeng> of course :)
16:14:01 <zhipeng> if nothing else, we had a great discussion today
16:14:07 <zhipeng> the meeting is adjourned
16:14:22 <zhipeng> and I will let everyone know the new ZOOM meeting time
16:14:28 <zhipeng> #endmeeting
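
Note on the PCI passthrough suggestion (16:06:44) and the flavor question (16:00:40): a minimal sketch, not something discussed in the meeting, of the values Nova's PCI passthrough support expects, namely a `[pci]` whitelist and alias in nova.conf plus a `pci_passthrough:alias` flavor extra spec. The vendor ID `1234`, product ID `abcd`, and alias name `fpga` are placeholders; the real IDs would have to be read from the card (for example with `lspci -nn`), and the helper function name is hypothetical.

```python
import json


def pci_passthrough_settings(vendor_id, product_id, alias_name, count=1):
    """Build nova.conf [pci] values and a flavor extra spec (illustrative helper)."""
    device = {"vendor_id": vendor_id, "product_id": product_id}
    # The alias gives the device a name that flavors can request.
    alias = dict(device, device_type="type-PCI", name=alias_name)
    return {
        "passthrough_whitelist": json.dumps(device, sort_keys=True),
        "alias": json.dumps(alias, sort_keys=True),
        # Flavor extra spec format: pci_passthrough:alias=<alias name>:<device count>
        "flavor_extra_spec": f"pci_passthrough:alias={alias_name}:{count}",
    }


# Placeholder IDs; replace with the values reported for the actual FPGA card.
settings = pci_passthrough_settings("1234", "abcd", "fpga")

print("[pci]")
print("passthrough_whitelist =", settings["passthrough_whitelist"])
print("alias =", settings["alias"])
print("flavor property:", settings["flavor_extra_spec"])
```

The whitelist would go on the compute nodes and the alias on the API and compute nodes; the scheduler's PCI passthrough filter is assumed to be enabled. Requesting a whole device this way assigns one card per VM; sharing a card depends on the FPGA exposing multiple PFs/VFs, as LiLiu_ notes at 15:57:31.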