14:00:17 #startmeeting openstack-cyborg
14:00:21 Meeting started Wed Mar 21 14:00:17 2018 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 The meeting name has been set to 'openstack_cyborg'
14:00:41 #topic Roll Call
14:00:49 #info Howard
14:01:11 hi zhipeng
14:01:18 hi!
14:01:23 hi guys
14:01:33 \o
14:01:35 please use #info to record your names
14:01:48 #info Konstantinos
14:01:50 #info Sundar
14:02:15 #info Rushil
14:02:57 #info shaohe
14:05:02 Do we have a Zoom link?
14:05:06 let's wait for a few more minutes in case more people join
14:05:14 Sundar no, we only have IRC today
14:05:23 Sure, Zhipeng. Thanks
14:06:18 #info Yumeng__
14:09:45 okey let's start
14:09:57 #topic CERN GPU use case introduction
14:10:08 kosamara plz take us away
14:10:41 I joined you recently, so let me remind you that I'm a technical student at CERN, integrating GPUs into our openstack.
14:10:56 Our use case is computation only at this point.
14:11:31 We have implemented and are currently testing a nova-only pci-passthrough.
14:12:18 We intend to also explore vGPUs, but they don't seem to fit our use case very much: licensing costs, limited CUDA support (both in nvidia's case).
14:12:47 At the moment the big issues are enforcing quotas on GPUs and security concerns.
14:13:07 1. Subsequent users can potentially access data in the GPU memory
14:13:39 2. Low-level access means they could change the firmware or even cause the host to restart
14:14:23 We are looking at a way to mitigate at least the first issue, by performing some kind of cleanup on the GPU after use
14:14:55 But in our current workflow this would require a change in nova.
14:15:13 It looks like something that could be done in cyborg?
14:15:42 i think so, i remember we had a similar convo regarding cleanup on FPGAs at the PTG
14:15:43 are you trying to force a "reset" after usage of the device?
14:16:02 Reset of the device?
14:16:32 According to a research article, in nvidia's case performing a device reset through nvidia-smi is not enough to prevent the data leaks.
14:16:42 something like that. like Zhipeng said, "clean up"
14:16:49 - article: https://www.semanticscholar.org/paper/Confidentiality-Issues-on-a-GPU-in-a-Virtualized-E-Maurice-Neumann/693a8b56a9e961052702ff088131eb553e88d9ae
14:17:19 The additional complexity in pci passthrough is that the host can't access the GPUs (no drivers)
14:17:39 so once the GPU devices are detached, cyborg should do cleanup at once.
14:17:40 What kind of cleanup do you have in mind? Zero out the RAM?
14:18:15 My thinking is to put the relevant resources in a special state after deallocation from their previous VM and use a service VM to perform the cleanup ops themselves.
14:18:21 memset() to zero for all ?
14:18:27 Yes, zero the ram
14:19:00 But ideally we would like to ensure the firmware's state is valid too.
14:19:13 only this one method to clean up?
14:19:48 Sorry, I'm out of context. This method is from where?
14:19:58 zero the ram
14:20:15 Yes for the ram.
14:20:30 OK, got it.
14:20:35 But we would also like to check the firmware's state.
14:20:54 And at least ensure that it hasn't been tampered with.
14:21:22 so no way to power off the GPU device and re-power it?
14:21:23 i think this is doable with cyborg
14:21:43 Not that I know of, apart from rebooting the host.
14:21:44 shaohe_feng_ that will mess up the state ?
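For illustration only, a rough sketch of what the "zero the RAM" step could look like inside the proposed service VM, assuming that VM has the NVIDIA driver and PyCUDA installed; the function name, chunk size and error handling are assumptions made for the sketch, not anything agreed in the meeting:

    # Greedily claim as much device memory as possible and overwrite it with
    # zeros, so data left behind by the previous tenant is no longer readable.
    import pycuda.autoinit  # noqa: F401  creates a CUDA context on the first GPU
    import pycuda.driver as cuda

    def scrub_gpu_memory(chunk_bytes=256 * 1024 * 1024):
        """Allocate and zero device memory in chunks until allocation fails."""
        allocations = []
        try:
            while True:
                free_bytes, _total = cuda.mem_get_info()
                size = min(chunk_bytes, free_bytes)
                if size <= 0:
                    break
                buf = cuda.mem_alloc(size)
                cuda.memset_d8(buf, 0, size)  # overwrite the region with zeros
                allocations.append(buf)
        except cuda.MemoryError:
            # Nothing more can be claimed; everything reachable has been zeroed.
            pass
        finally:
            for buf in allocations:
                buf.free()

As noted in the discussion, this only covers the memory-leak side; verifying the firmware state would need something beyond a sketch like this.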
14:21:59 The service VM needs to run on every compute node, and it needs to have all the right drivers for various GPU devices. We need to see how practical that is.
14:22:14 we might need to get some help from Nvidia
14:22:48 let's say we have the NVIDIA driver support
14:22:50 In a vGPU scenario, how do you power off the device without affecting other users?
14:22:52 for cyborg
14:22:59 Sundar: unless vfio is capable of doing these basic operations?
14:23:39 do we still need the service VM ?
14:24:11 I haven't practically explored nvidia's vGPU scenario. The host is supposed to be able to operate on the GPU in that case.
14:24:32 kosamara: Not sure I understand. vfio is a generic module for allowing device access. How would it know about specific nvidia devices?
14:24:44 kosamara: do you use vfio pci passthrough or pci-stub passthrough?
14:25:08 zhipeng: vfio-pci is the host stub driver when the gpu is passed through. If we could do something through that, then we wouldn't need the service VM.
14:25:32 shaohe_feng_: vfio-pci
14:25:37 kosamara good to know
14:26:16 Sundar: yes, I'm just making a hypothesis. I expect it can't, but perhaps zeroing out the ram is a general enough operation and it can do it...?
14:27:40 hi
14:27:49 kosamara: is zeroing out the ram time-consuming?
14:28:20 kosamara: I am not an Nvidia expert. But presumably, to access the RAM, one would need to look at the PCI BAR space and find the base address or something? If so, that would be device-specific
14:29:49 Sundar: thanks for the input. I can research the feasibility of this option this week.
14:30:15 kosamara I will also talk to the NVIDIA OpenStack team about it
14:30:26 see if we could come up with something for Rocky :)
14:30:45 shaohe_feng_: I don't have a number right now. It should depend on the method: if it happens on the device without PCI transfers it should be quite fast.
14:31:08 zhipeng thanks :)
14:32:01 kosamara is there anything like an architecture diagram for the use case ?
14:32:20 Not really.
14:32:32 okey :)
14:33:00 kosamara: you also mentioned quotas on GPUs?
14:33:03 What exactly do you mean by architecture diagram?
14:33:09 this rings a bell to me about the driver APIs
14:33:20 kosamara like the overall setup
14:33:33 currently we have report() and program() for the vendor driver api
14:33:50 Yes. We currently implement quotas indirectly. We only allow GPU flavors on specific projects and quota them by cpu.
14:33:58 we might want to consider a reset() api for the drivers
14:34:09 Li_Liu makes sense
14:34:21 kosamara we will also have quota support for Rocky
14:34:22 +1
14:35:13 Good to know.
14:35:28 okey folks let's move on to the next topic
14:35:36 thx again kosamara
14:35:40 thanks!
14:35:47 kosamara: so do you not support GPU quota for specific projects?
14:35:56 #topic subteam lead report
14:36:04 No, we do it indirectly, in the way I mentioned.
14:36:24 shaohe_feng_ scrow up :)
14:36:39 scroll
14:37:02 okey yumeng could you introduce the progress on your side ?
14:37:43 Yumeng__
14:39:24 Last week I set up a repository for cyborg-specs
14:40:24 some patches are still waiting for review from the infra team
14:40:49 Hope they can be merged ASAP
14:40:55 Are we not using the git repo's doc/specs/ area anymore?
14:41:14 Sundar we are migrating it out :)
14:41:22 Maybe I am missing the context. What is this repository for specs?
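For illustration only, a minimal sketch of the vendor driver API shape mentioned in the discussion above: report() and program() are the existing calls per the meeting, and reset() is the proposed addition for post-detach cleanup. The signatures, argument names and docstrings below are assumptions for the sketch, not the actual Cyborg code:

    import abc

    class GenericVendorDriver(metaclass=abc.ABCMeta):
        """Hypothetical per-vendor accelerator driver interface."""

        @abc.abstractmethod
        def report(self):
            """Discover devices on this host and report them to the agent."""

        @abc.abstractmethod
        def program(self, device, image):
            """Program the device (e.g. load an FPGA bitstream)."""

        @abc.abstractmethod
        def reset(self, device):
            """Proposed: clean up a device once it is detached from a VM,
            e.g. zero its memory and check the firmware state, so the next
            tenant cannot read or inherit anything from the previous one."""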
14:41:27 but it is fine for now to keep submitting to that folder
14:41:46 we will migrate all the approved specs after MS1 to cyborg-specs
14:42:11 Howard, could you explain why we are doing that?
14:42:36 It would be better for the documentation when we do a release
14:42:47 all the core projects are doing it
14:43:33 i think yumeng also added the gate check on docs for cyborg-specs
14:44:13 Sundar All the core projects split out the specs from the main project. So it makes sense to follow suit
14:44:31 yes exactly
14:44:35 IIUC, in the git repo, approved specs will be doc/specs//approved, but in the new repo, all release specs will be in one place. Is that right?
14:45:03 it will still follow a similar directory structure
14:45:36 OK, so we are just separating code from docs
14:45:41 yes
14:45:48 from specs to be precise
14:45:57 Got it, thanks :)
14:46:08 since general documentation is still in the cyborg repo, if I understand correctly
14:46:09 for now we still check in the docs to the code repo right?
14:46:19 Li_Liu yes, nothing changes
14:46:24 ok
14:46:40 thx to Yumeng__ for the quick progress
14:46:56 In future releases, would we check specs into the code repo, and have them migrated after approval?
14:47:04 zhipeng: :)
14:47:21 in the future we will just submit the spec patch to cyborg-specs
14:47:37 ok
14:47:56 shaohe_feng_ any update on the python-cyborgclient ?
14:48:03 the spec and code will be separated
14:48:41 zhipeng: jinghan has some personal stuff these days. So there is no new update.
14:48:58 zhipeng: I will help him on it.
14:49:18 thx :) was just gonna mention this
14:49:24 plz work with him
14:49:29 hopefully we will make progress next week.
14:49:30 OK
14:49:36 ok thx
14:49:49 I will work with zhuli on the os-acc
14:50:10 that one will most likely involve nova team discussion
14:50:54 Yes. I will be happy to work with zhuli if he needs any help
14:51:14 Sundar gr8t :)
14:51:26 #topic rocky spec/patch discussion
14:51:52 #link https://review.openstack.org/#/q/status:open+project:openstack/cyborg
14:52:37 first up, Sundar's spec patch
14:53:13 Sundar: good work
14:53:28 Shaohe, thanks :)
14:53:28 but I have a question. why do the nova developers think the accelerator weigher call causes performance loss?
14:53:39 #link https://review.openstack.org/#/c/554717/
14:53:59 I think the assumption was that the weigher will call into the Cyborg REST API for each host
14:54:09 If the weigher is in the Nova tree, that is true
14:54:25 But, if Cyborg keeps it, we have other options
14:54:27 Sundar: why for each host?
14:54:51 The typical filter today operates per host
14:54:52 Sundar: I have discussed it before.
14:55:18 Sundar: the cyborg API will run on the controller node.
14:55:32 Sundar: we only call the API on the controller node.
14:55:39 just one API call is OK.
14:56:13 for example, the scheduler filter chooses the suitable hosts
14:56:56 and the scheduler weigher just calls an API to query the accelerator info of these hosts
14:57:05 zhipeng: Li_Liu: right?
14:57:36 that is still per host
14:57:36 you mean the weigher is on the Cyborg controller side?
14:57:56 zhipeng: no, we get a list for the filter API
14:58:32 Shaohe, yes, we could override the BaseWeigher and handle multiple hosts in one call. That call could invoke the Cyborg REST API.
14:58:44 for example: GET /cyborg/v1/accelerators?hosts=cyborg-1,cyborg-2&type=fpga
14:58:45 To me, it is not clear what the performance hit would be.
14:59:26 I suspect any performance hit would not be noticeable until we get to some scale
14:59:28 Does this involve the 2-stage scheduling problem we were trying to avoid?
15:00:10 There is no 2-stage scheduling here: the proposed filter/weigher is a typical one, which just filters hosts based on function calls.
15:00:26 Sundar: yes, the scheduler has to call placement several times, is there a performance issue?
15:00:29 Sundar I think Li_Liu meant the weigher in Nova
15:00:52 shaohe_feng_ it is not the same thing
15:01:13 they are both HTTP requests.
15:01:18 anyways this has been discussed at length with the Nova team, so let's stay with the conclusion
15:01:24 ok
15:01:30 shaohe_feng_ we could discuss more offline with Alex
15:01:31 OK
15:01:38 but let's not dwell on it
15:01:51 Maybe I misunderstood :) We are proposing a weigher maintained in the Cyborg tree, which the operator will configure in nova.conf. Is that a concern?
15:02:34 a weigher maintained in the Cyborg tree still needs one cyborg API request, so is there also a performance issue?
15:03:02 Sundar I don't think that would be a concern
15:03:06 shaohe: I personally don't think so, but we'll check the data to assure everybody
15:03:24 zhipeng: Li_Liu: do you think a weigher maintained in the Cyborg tree is a good idea?
15:03:47 yes, at the moment
15:04:09 Sundar: you still need to tell cyborg which hosts need to be weighed
15:04:22 in the cyborg API call.
15:04:22 shaohe: This weigher is querying the Cyborg DB. It is better to keep it in Cyborg
15:04:39 I agree with zhipeng
15:04:57 the weigher will just talk to the conductor
15:05:02 it is not blocking nova operations
15:05:06 that is the point
15:05:59 anyway, the API call will talk to the conductor to query the Cyborg DB
15:06:06 Shaohe, yes. The weigher gets a list of hosts. We could either introduce a new Cyborg API for that, or just have the weigher query the db directly
15:06:09 no difference.
15:06:23 shaohe_feng_ let's take it offline
15:06:29 zhipeng: OK.
15:06:46 Sundar thx for the spec :)
15:06:57 We will definitely review it more
15:07:51 next up, Li Liu's patch
15:08:05 #info Implemented the Objects and APIs for vf/pf
15:08:19 #link https://review.openstack.org/552734
15:08:59 Sorry, I need to leave for my next call. :( Will catch up from the minutes
15:09:12 later
15:09:23 Sundar no problem
15:09:40 Sundar we need another discussion on k8s actions :)
15:10:11 Li_Liu any update or comment on your patch ?
15:10:22 things we need to be aware of ?
15:10:59 not really. I think shaohe needs to change some code in his resource tracker to adopt the change
15:11:35 other than that, one thing left over is to utilize the attribute table in deployables. This is still a missing piece
15:11:41 Li_Liu: thanks for the reminder.
15:12:12 I will keep working on that as well as 2 other specs
15:12:49 shaohe_feng_ np. let me know if you need any help on using my pf/vfs
15:12:49 Li_Liu: 2 other specs include image management?
15:13:00 Li_Liu: ok, thanks.
15:13:23 programmability and image metadata standardization
15:13:44 yes, a big task on your shoulders :)
15:14:05 :)
15:15:45 okey we've gone through our agenda list today
15:15:54 I think we can end the meeting now :)
15:16:04 and talk to you guys next week
15:18:11 bye
15:19:05 bye
15:19:38 #endmeeting
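For illustration only, a rough sketch of the Cyborg-maintained weigher idea from the Rocky spec discussion: it overrides the base weigher's batch method so all candidate hosts go out in a single Cyborg REST call, along the lines of GET /cyborg/v1/accelerators?hosts=cyborg-1,cyborg-2&type=fpga mentioned above. The import path, endpoint URL, response fields and scoring rule are assumptions for the sketch, not the approved spec:

    import collections

    import requests
    from nova.scheduler import weights

    class AcceleratorWeigher(weights.BaseHostWeigher):
        """Hypothetical weigher kept in the Cyborg tree, enabled via nova.conf."""

        def weigh_objects(self, weighed_obj_list, weight_properties):
            hosts = [w.obj.host for w in weighed_obj_list]
            # One batched call for all candidate hosts instead of one call per host.
            resp = requests.get(
                'http://cyborg-api/cyborg/v1/accelerators',  # placeholder URL
                params={'hosts': ','.join(hosts), 'type': 'fpga'})
            resp.raise_for_status()
            # Response shape is assumed: {"accelerators": [{"host": ..., "assigned": ...}]}
            free = collections.Counter(
                acc['host'] for acc in resp.json().get('accelerators', [])
                if not acc.get('assigned'))
            # Hosts with more free accelerators of the requested type weigh more.
            return [float(free.get(host, 0)) for host in hosts]

        def _weigh_object(self, host_state, weight_properties):
            # Not used: weighing is done in batch above.
            return 0.0

In practice the call would presumably go through a cyborg client or keystone session rather than a hard-coded URL, and the weight multiplier and normalization would be handled by nova's weight handler as for any other weigher.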