14:00:17 <zhipeng> #startmeeting openstack-cyborg
14:00:21 <openstack> Meeting started Wed Mar 21 14:00:17 2018 UTC and is due to finish in 60 minutes.  The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 <openstack> The meeting name has been set to 'openstack_cyborg'
14:00:41 <zhipeng> #topic Roll Call
14:00:49 <zhipeng> #info Howard
14:01:11 <shaohe_feng_> hi zhipeng
14:01:18 <kosamara> hi!
14:01:23 <zhipeng> hi guys
14:01:33 <crushil> \o
14:01:35 <zhipeng> please use #info to record your names
14:01:48 <kosamara> #info Konstantinos
14:01:50 <Sundar> #info Sundar
14:02:15 <crushil> #info Rushil
14:02:57 <shaohe_feng_> #info shaohe
14:05:02 <Sundar> Do we have a Zoom link?
14:05:06 <zhipeng> let's wait for a few more minutes in case more people will join
14:05:14 <zhipeng> Sundar no we only have irc today
14:05:23 <Sundar> Sure, Zhipeng. Thanks
14:06:18 <Yumeng__> #info Yumeng__
14:09:45 <zhipeng> okey let's start
14:09:57 <zhipeng> #topic CERN GPU use case introduction
14:10:08 <zhipeng> kosamara plz take us away
14:10:41 <kosamara> I joined you recently, so let me remind you that I'm a technical student at CERN, integrating GPUs into our OpenStack deployment.
14:10:56 <kosamara> Our use case is computation only at this point.
14:11:31 <kosamara> We have implemented and are currently testing a nova-only pci-passthrough.
14:12:18 <kosamara> We intend to also explore vGPUs, but they don't seem to fit our use case very well: licensing costs and limited CUDA support (both in NVIDIA's case).
14:12:47 <kosamara> At the moment the big issues are enforcing quotas on GPUs and security concerns.
14:13:07 <kosamara> 1. Subsequent users can potentially access data on the GPU memory
14:13:39 <kosamara> 2. Low-level access means they could change the firmware or even cause the host to restart
14:14:23 <kosamara> We are looking at a way to mitigate at least the first issue, by performing some kind of cleanup on the GPU after use
14:14:55 <kosamara> But in our current workflow this would require a change in nova.
14:15:13 <kosamara> It looks like something that could be done in cyborg?
14:15:42 <zhipeng> i think so, i remember we had similar convo regarding clean up on FPGA in the PTG
14:15:42 <Li_Liu> are you trying to force a "reset" after usage of the device?
14:16:02 <kosamara> Reset of the device?
14:16:32 <kosamara> According to a research article, in nvidia's case performing a device reset through nvidia-smi is not enough to prevent the data leaks.
14:16:42 <Li_Liu> something like that. like Zhipeng said, "clean up"
14:16:49 <kosamara> - article: https://www.semanticscholar.org/paper/Confidentiality-Issues-on-a-GPU-in-a-Virtualized-E-Maurice-Neumann/693a8b56a9e961052702ff088131eb553e88d9ae
14:17:19 <kosamara> The additional complexity in pci passthrough is that the host can't access the GPUs (no drivers)
14:17:39 <shaohe_feng_> so once the GPU devices are detached, cyborg should do clean up at once.
14:17:40 <Sundar> What kind of clean up do you have in mind? Zero out the RAM?
14:18:15 <kosamara> My thinking is to put the relevant resources in a special state after deallocation from their previous VM and use a service VM to perform the cleanup ops themselves.
14:18:21 <Li_Liu> memset() to zero for all ?
14:18:27 <kosamara> Yes, zero the ram
14:19:00 <kosamara> But ideally we would like to ensure the firmware's state is valid too.
14:19:13 <shaohe_feng_> is that the only method to clean up?
14:19:48 <kosamara> Sorry, I'm out of context. This method is from where?
14:19:58 <shaohe_feng_> zero the ram
14:20:15 <kosamara> Yes for the ram.
14:20:30 <shaohe_feng_> OK, got it.
14:20:35 <kosamara> But we would also like to check the firmware's state.
14:20:54 <kosamara> And at least ensure that it hasn't been tampered with.
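For context, below is a minimal sketch of the kind of memory-scrubbing pass a service VM (one that has the GPU passed through and the CUDA stack installed) could run after deallocation. It assumes PyCUDA is available; it is purely illustrative, not existing Cyborg or CERN code, and it only addresses the memory-leak concern, not firmware integrity.

```python
# Minimal sketch: scrub GPU device memory from inside a service VM that has
# the GPU passed through and the CUDA stack installed. Assumes PyCUDA is
# installed; purely illustrative, and it only covers the memory-leak concern.
import pycuda.autoinit          # noqa: F401  (creates a context on GPU 0)
import pycuda.driver as cuda

CHUNK = 256 * 1024 * 1024       # allocate in 256 MiB chunks


def scrub_device_memory():
    """Allocate as much device memory as possible and overwrite it with zeros."""
    allocations = []
    try:
        while True:
            free_bytes, _total = cuda.mem_get_info()
            size = min(CHUNK, free_bytes)
            if size < 1024 * 1024:          # stop when almost nothing is left
                break
            try:
                buf = cuda.mem_alloc(size)
            except Exception:               # allocation can fail near the limit
                break
            cuda.memset_d8(buf, 0, size)    # zero the allocated region
            allocations.append(buf)
    finally:
        for buf in allocations:
            buf.free()


if __name__ == "__main__":
    scrub_device_memory()
```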
14:21:22 <shaohe_feng_> so there is no way to power off the GPU device and power it back on?
14:21:23 <zhipeng> i think this is doable with cyborg
14:21:43 <kosamara> Not that I know of, apart from rebooting the host.
14:21:44 <zhipeng> shaohe_feng_ that will mess up the state ?
14:21:59 <Sundar> The service VM needs to run in every compute node, and it needs to have all the right drivers for various GPU devices. We need to see how practical that is.
14:22:14 <Li_Liu> we might need to get some help from Nvidia
14:22:48 <zhipeng> let's say we have the NVIDIA driver support
14:22:50 <Sundar> In a vGPU scenario, how do you power off the device without affecting other users?
14:22:52 <zhipeng> for cyborg
14:22:59 <kosamara> Sundar: unless vfio is capable of doing these basic operations?
14:23:39 <zhipeng> do we still need the service VM ?
14:24:11 <kosamara> I haven't explored practically nvidia's vGPU scenario. The host is supposed to be able to operate on the GPU in that case.
14:24:32 <Sundar> kosamara: Not sure I understand. vfio is a generic module for allowing device access. How would it know about specific nvidia devices?
14:24:44 <shaohe_feng_> kosamara: do you use vfio pci passthrough or pci-stub passthrough?
14:25:08 <kosamara> zhipeng: vfio-pci is the host stub driver when the gpu is passed through. If we could do something through that, then we wouldn't need the service VM.
14:25:32 <kosamara> shaohe_feng_: vfio-pci
14:25:37 <zhipeng> kosamara good to know
14:26:16 <kosamara> Sundar: yes, I'm just making a hypothesis. I expect it can't, but perhaps zeroing-out the ram is a general enough operation and it can do it...?
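As a side note, even without the NVIDIA driver the host can at least confirm which stub driver a passed-through device is bound to via sysfs. A small sketch (the PCI address is a placeholder):

```python
# Sketch: report which kernel driver a PCI device is currently bound to,
# using sysfs only (no vendor driver needed on the host).
import os


def current_driver(pci_addr):
    """Return the driver name bound to a PCI device, or None if unbound."""
    link = "/sys/bus/pci/devices/%s/driver" % pci_addr
    if not os.path.exists(link):
        return None
    return os.path.basename(os.readlink(link))


# The address below is a placeholder; expect 'vfio-pci' while passed through.
print(current_driver("0000:3b:00.0"))
```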
14:27:40 <Vipparthy> hi
14:27:49 <shaohe_feng_> kosamara:  is zeroing-out the ram time consuming?
14:28:20 <Sundar> kosamara: I am not a Nvidia expert. But presumably, to access the RAM, one would need to look at the PCI BAR space and find the base address or something? If so, that would be device-specific
14:29:49 <kosamara> Sundar: thanks for the input. I can research the feasibility of this option this week.
14:30:15 <zhipeng> kosamara I will also talk to the NVIDIA OpenStack team about it
14:30:26 <zhipeng> see if we could come out with something for Rocky :)
14:30:45 <kosamara> shaohe_feng_: I don't have a number right now. It should depend on how it's done: if it happens on the device itself without PCI transfers, it should be quite fast.
14:31:08 <kosamara> zhipeng thanks :)
14:32:01 <zhipeng> kosamara is there anything like an architecture diagram for the use case ?
14:32:20 <kosamara> Not really.
14:32:32 <zhipeng> okey :)
14:33:00 <Sundar> kosamara: you also mentioned quotas on GPUs?
14:33:03 <kosamara> What exactly do you mean by architecture diagram?
14:33:09 <Li_Liu> this rings a bell to me about the driver apis
14:33:20 <zhipeng> kosamara like the overall setup
14:33:33 <Li_Liu> currently we have report() and program() for the vendor driver api
14:33:50 <kosamara> Yes. We currently implement quotas indirectly. We only allow GPU flavors on specific projects and quota them by cpu.
14:33:58 <Li_Liu> we might want to consider a reset() api for the drivers
14:34:09 <zhipeng> Li_Liu makes sense
14:34:21 <zhipeng> kosamara we will also have quota support for Rocky
14:34:22 <shaohe_feng_> +1
14:35:13 <kosamara> Good to know.
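To make the reset() suggestion concrete, here is a rough sketch of a vendor driver interface with a reset() hook alongside the report() and program() calls mentioned above. The signatures and semantics are assumptions made for illustration, not the current Cyborg driver API:

```python
# Rough sketch of a vendor driver interface with a reset() hook alongside
# the report()/program() calls mentioned above. Signatures are assumptions
# made for illustration, not the actual Cyborg driver API.
import abc


class AcceleratorDriver(abc.ABC):

    @abc.abstractmethod
    def report(self):
        """Discover and report accelerator resources on this host."""

    @abc.abstractmethod
    def program(self, device_id, image_ref):
        """Program a device, e.g. load an FPGA bitstream."""

    @abc.abstractmethod
    def reset(self, device_id):
        """Clean up a device after deallocation (e.g. zero device memory,
        verify firmware state) before handing it to the next tenant."""
```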
14:35:28 <zhipeng> okey folks let's move on to the next topic
14:35:36 <zhipeng> thx again kosamara
14:35:40 <kosamara> thanks!
14:35:47 <shaohe_feng_> kosamara: so do you not support GPU quota  for specific projects?
14:35:56 <zhipeng> #topic subteam lead report
14:36:04 <kosamara> No, we do it indirectly, in the way I mentioned.
14:36:24 <zhipeng> shaohe_feng_ scroll up :)
14:37:02 <zhipeng> okey yumeng could you introduce the progress on your side ?
14:37:43 <zhipeng> Yumeng__
14:39:24 <Yumeng__> Last week I set up a repository for cyborg-specs
14:40:24 <Yumeng__> some patches are still waiting for review from the infra team
14:40:49 <Yumeng__> Hope it could be merged ASAP
14:40:55 <Sundar> Are we not using the git repo's doc/specs/ area anymore?
14:41:14 <zhipeng> Sundar we are migrating it out :)
14:41:22 <Sundar> Maybe I am missing the context. What is this repository for specs?
14:41:27 <zhipeng> but for now it is fine to keep submitting to that folder
14:41:46 <zhipeng> we will migrate all the approved specs after MS1 to cyborg-specs
14:42:11 <Sundar> Howard, could you explain why we are doing that?
14:42:36 <zhipeng> It would be better for the documentation when we do a release
14:42:47 <zhipeng> all the core projects are doing it
14:43:33 <zhipeng> i think yumeng also added the gate check on docs for cyborg-specs
14:44:13 <crushil> Sundar All the core projects split the specs out of the main project. So, it makes sense to follow suit
14:44:31 <zhipeng> yes exactly
14:44:35 <Sundar> IIUC, in the git repo, approved specs will be doc/specs/<release>/approved, but in the new repo, all release specs will be in one place. Is that right?
14:45:03 <zhipeng> it will still follow a similar directory structure
14:45:36 <Sundar> OK, so we are just separating code from docs
14:45:41 <zhipeng> yes
14:45:48 <zhipeng> from specs, to be precise
14:45:57 <Sundar> Got it, thanks :)
14:46:08 <zhipeng> since general documentation is still in cyborg repo, if I understand correctly
14:46:09 <Li_Liu> for now we still check in the docs to the code repo right?
14:46:19 <zhipeng> Li_Liu yes, nothing changes
14:46:24 <Li_Liu> ok
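For illustration, the split discussed above might look roughly like this, assuming cyborg-specs mirrors the per-release layout of the in-tree doc/specs directory (the exact paths are an assumption):

```
cyborg/                                   # code repo: general docs stay here
    doc/specs/<release>/approved/         # current location, until migration
cyborg-specs/                             # new specs-only repo
    specs/<release>/approved/<spec>.rst
```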
14:46:40 <zhipeng> thx to Yumeng__ for the quick progress
14:46:56 <Sundar> In future releases, would we check specs into code repo, and have it be migrated after approval?
14:47:04 <Yumeng__> zhipeng: :)
14:47:21 <zhipeng> in the future we will just submit the spec patch to cyborg-specs
14:47:37 <Sundar> ok
14:47:56 <zhipeng> shaohe_feng_ any update on the python-cyborgclient ?
14:48:03 <shaohe_feng_> the spec and code will be separated
14:48:41 <shaohe_feng_> zhipeng:  jinghan has some personal stuff going on these days, so there's no update yet.
14:48:58 <shaohe_feng_> zhipeng: I will help him on it.
14:49:18 <zhipeng> thx :) was just gonna mention this
14:49:24 <zhipeng> plz work with him
14:49:29 <shaohe_feng_> hopefully we will make progress next week.
14:49:30 <shaohe_feng_> OK
14:49:36 <zhipeng> ok thx
14:49:49 <zhipeng> I will work with zhuli on the os-acc
14:50:10 <zhipeng> that one will most likely involve nova team discussion
14:50:54 <Sundar> Yes. I will be happy to work with zhuli if he needs any help
14:51:14 <zhipeng> Sundar gr8t :)
14:51:26 <zhipeng> #topic rocky spec/patch discussion
14:51:52 <zhipeng> #link https://review.openstack.org/#/q/status:open+project:openstack/cyborg
14:52:37 <zhipeng> first up, Sundar's spec patch
14:53:13 <shaohe_feng_> Sundar: good work
14:53:28 <Sundar> Shaohe, thanks :)
14:53:28 <shaohe_feng_> but I have a question: why do the Nova developers think the accelerator weigher call causes a performance loss?
14:53:39 <zhipeng> #link https://review.openstack.org/#/c/554717/
14:53:59 <Sundar> I think the assumption was that the weigher will call into Cyborg REST API for each host
14:54:09 <Sundar> If the weigher is in Nova tree, that is true
14:54:25 <Sundar> But, if Cyborg keeps it, we have other options
14:54:27 <shaohe_feng_> Sundar: why for each host?
14:54:51 <Sundar> The typical filter today operates per host
14:54:52 <shaohe_feng_> Sundar: I have discussed it before.
14:55:18 <shaohe_feng_> Sundar: the cyborg API will run on the controller node.
14:55:32 <shaohe_feng_> Sundar: we only call the api on the controller node.
14:55:39 <shaohe_feng_> just one API call is OK.
14:56:13 <shaohe_feng_> for example, the scheduler filter choose the suitable hosts
14:56:56 <shaohe_feng_> and the scheduler weigher just call a API to query the accelerator infos of these hosts
14:57:05 <shaohe_feng_> zhipeng: Li_Liu: right?
14:57:36 <zhipeng> that is still per host
14:57:36 <Li_Liu> you mean the weigher is on the Cyborg controller side?
14:57:56 <shaohe_feng_> zhipeng: no, we pass the whole list of hosts in one API call
14:58:32 <Sundar> Shaohe, yes, we could override the BaseWeigher and handle multiple hosts in one call. That call could invoke Cyborg REST API.
14:58:44 <shaohe_feng_> for example: GET /cyborg/v1/accelerators?hosts=cyborg-1,cyborg-2&type=fpga
14:58:45 <Sundar> To me, it is not clear what the performance hit would be.
14:59:26 <Sundar> I suspect any performance hit would not be noticeable until we get to some scale
14:59:28 <Li_Liu> Does this involve the 2-stages scheduling problem we were trying to avoid?
15:00:10 <Sundar> There is no 2-stage scheduling here: the proposed filter/weigher is a typical one, which just filters hosts based on function calls.
15:00:26 <shaohe_feng_> Sundar: yes, the scheduler already calls Placement several times; is there a performance issue with that?
15:00:29 <zhipeng> Sundar I think Li_Liu meant for weigher in Nova
15:00:52 <zhipeng> shaohe_feng_ it is not the same thing
15:01:13 <shaohe_feng_> they are both http request.
15:01:18 <zhipeng> anyway this has been discussed at length with the Nova team, so let's stick with that conclusion
15:01:24 <Li_Liu> ok
15:01:30 <zhipeng> shaohe_feng_ we could discuss offline more with Alex
15:01:31 <shaohe_feng_> OK
15:01:38 <zhipeng> but let's not dwell on it
15:01:51 <Sundar> Maybe I misunderstood :) We are proposing a weigher maintained in the Cyborg tree, which the operator will configure in nova.conf. Is that a concern?
15:02:34 <shaohe_feng_> a weigher maintained in the Cyborg tree still needs one cyborg api request, so isn't there also a performance issue?
15:03:02 <zhipeng> Sundar I don't think that would be a concern
15:03:06 <Sundar> shaohe: I personally don't think so, but we'll check the data to assure everybody
15:03:24 <shaohe_feng_> zhipeng: Li_Liu:  do you think  a weigher maintained in Cyborg tree is a good idea?
15:03:47 <zhipeng> yes, at the moment
15:04:09 <shaohe_feng_> Sundar: you still need to tell cyborg which hosts need to be weighed
15:04:22 <shaohe_feng_> in the cyborg api call.
15:04:22 <Sundar> shaohe: This weigher is querying Cyborg DB. It is better to keep it in Cyborg
15:04:39 <Li_Liu> I agree with zhipeng
15:04:57 <zhipeng> the weigher will just talk to the conductor
15:05:02 <zhipeng> it is not blocking nova operations
15:05:06 <zhipeng> that is the point
15:05:59 <shaohe_feng_> either way, the api call will talk to the conductor to query the Cyborg DB
15:06:06 <Sundar> Shaohe, yes. The weigher gets a list of hosts. We could either introduce a new Cyborg API for that, or just have the weigher query the db directly
15:06:09 <shaohe_feng_> no difference.
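For reference, a minimal sketch of the kind of Cyborg-maintained weigher discussed above, batching all candidate hosts into a single query rather than one call per host. The base class and import path are the ones used by in-tree Nova weighers (the class is loaded by nova-scheduler); the Cyborg query itself is left as a hypothetical placeholder based on the example GET above, and none of this is an agreed design:

```python
# Sketch: a Cyborg-maintained weigher that weighs all hosts with one batched
# query instead of one call per host. _query_cyborg() is a hypothetical
# placeholder for the REST call or conductor/DB query discussed above.
from nova.scheduler import weights


class AcceleratorWeigher(weights.BaseHostWeigher):

    def weigh_objects(self, weighed_obj_list, weight_properties):
        hosts = [obj.obj.host for obj in weighed_obj_list]
        # Hypothetical batched call, e.g.
        #   GET /cyborg/v1/accelerators?hosts=cyborg-1,cyborg-2&type=fpga
        # returning {hostname: free_accelerator_count}.
        free = self._query_cyborg(hosts)
        return [float(free.get(obj.obj.host, 0)) for obj in weighed_obj_list]

    def _query_cyborg(self, hosts):
        # Placeholder: implement via the Cyborg REST API or a direct query
        # through the Cyborg conductor/DB.
        return {}
```

An operator would then presumably list this class in the [filter_scheduler] weight_classes option in nova.conf, which is how the deployment-configured weigher described above would be wired in.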
15:06:23 <zhipeng> shaohe_feng_ let's leave it offline
15:06:29 <shaohe_feng_> zhipeng: OK.
15:06:46 <zhipeng> Sundar thx for the spec :)
15:06:57 <zhipeng> We will definitely review it more
15:07:51 <zhipeng> next up, Li Liu's patch
15:08:05 <zhipeng> #info Implemented the Objects and APIs for vf/pf
15:08:19 <zhipeng> #link https://review.openstack.org/552734
15:08:59 <Sundar> Sorry, I need to leave for my next call. :( Will catch up from minutes
15:09:12 <Li_Liu> later
15:09:23 <zhipeng> Sundar no problem
15:09:40 <zhipeng> Sundar we need another discussion on k8s actions :)
15:10:11 <zhipeng> Li_Liu any update or comment on your patch ?
15:10:22 <zhipeng> things we need to be aware of ?
15:10:59 <Li_Liu> not really. I think shaohe needs to change some code in his resource tracker to adapt to the change
15:11:35 <Li_Liu> other than that, one thing left over is to utilize the attribute table in deployables. That is still a missing piece
15:11:41 <shaohe_feng_> Li_Liu: thanks for the reminder.
15:12:12 <Li_Liu> I will keep working on that as well as 2 other specs
15:12:49 <Li_Liu> shaohe_feng_ np. let me know if you need any help on using my pf/vfs
15:12:49 <shaohe_feng_> Li_Liu: 2 other specs include image management?
15:13:00 <shaohe_feng_> Li_Liu: ok, thanks.
15:13:23 <Li_Liu> programmability and image metadata standardization
15:13:44 <zhipeng> yes, big tasks on your shoulders :)
15:14:05 <Li_Liu> :)
15:15:45 <zhipeng> okey we've gone through our agenda list today
15:15:54 <zhipeng> I think we can end the meeting now :)
15:16:04 <zhipeng> and talk to you guys next week
15:18:11 <kosamara> bye
15:19:05 <Yumeng__> bye
15:19:38 <zhipeng> #endmeeting