13:59:21 <zhipeng> #startmeeting openstack-cyborg
13:59:22 <openstack> Meeting started Wed Apr 25 13:59:21 2018 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:23 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:59:25 <openstack> The meeting name has been set to 'openstack_cyborg'
13:59:34 <zhipeng> #topic Roll Call
13:59:41 <zhipeng> #info Howard
13:59:45 <NokMikeR> #info Mike
13:59:48 <Sundar> #info Sundar
14:00:16 <Sundar> Hi Howard and Mike
14:00:27 <NokMikeR> Hello Sundar, Howard et al.
14:00:37 <circ-user-c3hdH> #info Helloway
14:00:40 <zhipeng> Hello everyone :)
14:00:49 <Li_Liu> #info Li Liu
14:01:10 <Li_Liu> Hi guys
14:03:15 <shaohe_feng_> #info shaohe
14:03:27 <shaohe_feng_> hi all
14:03:48 <zhipeng> hi
14:04:45 <zhipeng> let's start then
14:05:01 <zhipeng> #topic KubeCon EU ResMgmt WG preparation
14:05:09 <zhipeng> #link https://docs.google.com/document/d/1j3vrG6BgE0hUDs2e-1ZUegKN4W4Adb1B6oJ6j-4kyPU/edit?usp=sharing
14:05:29 <xinran_> Hi all
14:05:36 <zhipeng> so i think I mentioned that for this year's planning I want to be able to align what we have done here
14:05:41 <zhipeng> with the k8s community
14:06:09 <zhipeng> KubeCon EU is around the corner next week, and it would be a great place to start participating
14:06:27 <edleafe> #info edleafe
14:06:28 <zhipeng> Sundar could you help share some status about the resource mgmt wg in k8s ?
14:06:36 <Sundar> Sure
14:06:49 <Li_Liu> that's great
14:07:05 <Sundar> We started participating last year, with a document describing the FPGA structure and use cases
14:07:42 <Sundar> The main thing to note is that the FPGA structural model -- with regions, accelerators, local memory etc. -- is the same independent of orch framework -- OpenStack, K8s etc
14:08:21 <Sundar> Also, the use cases defined in the Cyborg/Nova spec stay the same -- FPGA as a Service, Accelerated Function as a Service, etc.
14:08:25 <Sundar> The main difference is in the set of mechanisms available
14:09:05 <Sundar> In OpenStack, we have the notion of nested Resource Providers (nRPs), which provides a natural tree structure that matches many device topologies
14:09:44 <Sundar> The data models and resource handling in K8s is still evolving
14:10:46 <Sundar> What we have now is the device plugin mechanism: there is a standard API by which kubelet can invoke a plugin for a category of devices
14:11:41 <Sundar> The plugin advertises a resource name, e.g. intel.com/fpga-a10, and lists the devices corresponding to that. There is also a provision to update that list over time, and to report the health of each device
14:12:09 <Sundar> Based on this information, when a pod spec asks for a resource, the standard K8s scheduler picks a node and informs the kubelet on that node
14:12:35 <Sundar> The kubelet then invokes another API on the device plugin to allocate a device of the requested type and prepare it
14:13:22 <Sundar> After that, the kubelet invokes a container runtime (e.g. Docker) through the CRI with an OCI runtime spec
14:14:05 <Sundar> This basic mechanism does not include the nested structure of FPGAs, and we have been discussing how to fit that in
14:14:35 <Sundar> However, there are many options: we can use Custom Resource Definitions (CRDs) https://kubernetes.io/docs/concepts/api-extension/custom-resources/#customresourcedefinitions
14:14:36 <zhipeng> and vGPU as well I suppose ?
14:15:11 <Sundar> Howard, yes, I think vGPUs, esp. of different types, will also require further consideration
14:15:52 <Sundar> CRDs are essentially custom resource classes: we can instantiate resources for a CRD.
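The device-plugin flow Sundar walks through (advertise a resource name, report device health, let the standard scheduler pick a node, then allocate on that node) can be sketched as a toy model. This is purely illustrative: the real mechanism is a gRPC API between kubelet and the plugin (its ListAndWatch and Allocate calls), and the class, node, and device names here are invented for the sketch.

```python
# Toy model of the K8s device-plugin flow described above. Not the real
# kubelet/plugin gRPC API -- just the advertise / schedule / allocate steps.

class DevicePlugin:
    """Advertises one resource name and the devices behind it."""
    def __init__(self, resource_name, device_ids):
        self.resource_name = resource_name
        self.devices = {d: "Healthy" for d in device_ids}

    def list_devices(self):
        # Analogue of ListAndWatch: report currently healthy devices.
        return [d for d, h in self.devices.items() if h == "Healthy"]

    def allocate(self, count):
        # Analogue of Allocate: reserve devices for a container.
        picked = self.list_devices()[:count]
        for d in picked:
            self.devices[d] = "Allocated"
        return picked

def schedule(pod_request, nodes):
    """Standard-scheduler analogue: pick a node whose plugin advertises
    enough healthy devices of the requested resource name."""
    name, count = pod_request
    for node, plugin in nodes.items():
        if plugin.resource_name == name and len(plugin.list_devices()) >= count:
            return node
    return None

nodes = {
    "node-1": DevicePlugin("intel.com/fpga-a10", ["dev0", "dev1"]),
    "node-2": DevicePlugin("nvidia.com/gpu", ["gpu0"]),
}
chosen = schedule(("intel.com/fpga-a10", 1), nodes)
print(chosen)                      # node-1
print(nodes[chosen].allocate(1))   # ['dev0']
```

As the log notes, this flat resource-name model is exactly what lacks the nested region/function structure of FPGAs.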
14:16:33 <Sundar> There are also two ongoing proposals for including resource classes:
14:16:51 <Sundar> The first one is: https://docs.google.com/document/d/1qKiIVs9AMh2Ua5thhtvWqOqW0MSle_RV3lfriO1Aj6U/edit#
14:17:28 <Sundar> An alternative proposal for resource classes is at: https://docs.google.com/document/d/1666PPUs4Lz56TqKygcy6mXkNazde-vwA7q4e5H92sUc/edit#
14:18:04 <Sundar> Their stated goals and non-goals are not exactly the same.
14:18:05 <Li_Liu> We don't have access to those 2 google docs..
14:18:38 <Sundar> Li_Liu, these are supposed to be public dos -- please ask for access
14:18:41 <Sundar> *docs
14:18:50 <Li_Liu> ok, just did
14:19:26 <zhipeng> Sundar there is a PR on Resource API
14:19:45 <zhipeng> does this correlate to jiaying's or vish's ?
14:19:45 <Sundar> IMHO, a lot of the discussion comes from a GPU background. For FPGAs, we are trying to get alignment within Intel first
14:20:37 <Sundar> Zhipeng, Jiaying's proposal came first -- but I need to look at the PR before confirming
14:21:34 <zhipeng> okey :)
14:21:54 <Sundar> So, in short, our task is to find a way of handling FPGAs, without having the benefit of a tree-structured data model, but handling the same FPGA structures and usage models
14:21:57 <NokMikeR> is the vGPU vendor specific? ie tied to a particular driver implementation?
14:23:03 <Li_Liu> Sundar, does the k8s community have any plans to add the tree-structure data model in the near future?
14:23:05 <Sundar> NokMikeR, the discussions I have seen have been centered on Nvidia's vGPU types, though not necessarily phrased in vendor-specific terms
14:23:15 <NokMikeR> in other words how do you differentiate the features on one vGPU vs another if the underlying features in the real gpu are different - or are they abstracted somehow?
14:23:20 <Sundar> Li_Liu, None that I am aware of.
14:24:14 <NokMikeR> Sundar: ok thought so re: nvid gpus.
14:24:35 <Sundar> NokMikeR, in OpenStack, the answer is clearer: the device itself exposes different vGPU types as traits, and their capacities as units of a generic accelerator RC
14:25:44 <Sundar> Cyborg needs to handle GPUs and FPGAs of course. But, IMHO, there is enough attention on GPUs :) It is FPGAs that need further thought :)
14:25:57 <zhipeng> Sundar Jiaying's proposal, as far as I understand, still tries to modify the k8s core functionality ?
14:26:37 <shaohe_feng_> Sundar: did we decide to only support one vendor GPU or FPGA in this release without nested Resource Providers?
14:26:42 <Sundar> Zhipeng, yes, it requires changes on the controller side and kubelet changes
14:28:15 <Sundar> Shaohe: for Rocky, I was proposing to include only one device of a particular type: one GPU or one FPGA. But, based on feedback, we have to relax it to multiple devices of the same type, i.e.,
14:28:29 <Sundar> you could have 2 GPUs of the same type, 2 FPGAs of the same type etc.
14:28:51 <zhipeng> Sundar if we say, propose a CRD type of kube-cyborg thing, will it make sense to the res mgmt wg people ?
14:29:00 <zhipeng> meaning that similar to OpenStack
14:29:10 <zhipeng> we view accelerators as not part of the general compute infra
14:29:46 <zhipeng> and have its own model and scheduling process if needed
14:30:25 <Sundar> Zhipeng, we can propose CRDs, but the exact workflows will matter.
14:30:49 <Li_Liu> zhipeng, you are saying, similar to what we did to Cyborg, we cut a piece out from K8S?
14:30:51 <zhipeng> I think the main pain point is still at scheduler extension
14:31:07 <zhipeng> which Derek also mentioned at KubeCon last Dec
14:31:31 <zhipeng> Li_Liu essentially an out-of-band controller for accelerators
14:31:43 <shaohe_feng_> Sundar: will cyborg support nested providers in the Rocky release?
14:32:10 <zhipeng> shaohe_feng_ I think Placement won't support it
14:32:29 <shaohe_feng_> zhipeng: Got it.
14:32:40 <zhipeng> but the way we are modeling it is very close to nrp, correct me if i'm wrong Li_Liu
14:32:43 <Sundar> Zhipeng, I was also advocating a scheduler extension. But apparently it is not popular within the community. There is a proposal to revamp the scheduler itself: https://docs.google.com/document/d/1NskpTHpOBWtIa5XsgB4bwPHz4IdRxL1RNvdjI7RVGio/edit#
14:32:58 <Sundar> So, the scheduler, as well as its extension APIs, may change
14:34:22 <zhipeng> well CRDs are generally great for API aggregation, but complex for resource related functionalities
14:34:32 <Sundar> Here is a possible way to get to a few basic cases without anything fancy (this is not fully agreed upon, please take this as an option, not a plan):
14:34:37 <zhipeng> like binding the resource to the pod
14:34:46 <zhipeng> since the process is external via CRD
14:34:46 <Sundar> • Publish each region type as a resource. E.g. intel.com/fpga-dcp, intel.com/fpga-vg. • The pod spec asks for a region type as a resource, and also specifies a bitstream ID. That could be a label. • An admission controller inserts an init container on seeing a FPGA resource. • The scheduler picks a node based on the requested region type (and ignores the bitstream ID). • The init container pulls the bitstream wi
14:35:07 <Sundar> Ah, that didn't come out well in IRC -- let me re-type
14:35:24 <Sundar> • Publish each region type as a resource. E.g. intel.com/fpga-dcp, intel.com/fpga-vg
14:35:34 <Sundar> • The pod spec asks for a region type as a resource, and also specifies a bitstream ID. That could be a label
14:35:45 <Sundar> • An admission controller inserts an init container on seeing a FPGA resource.
14:35:59 <Sundar> • The scheduler picks a node based on the requested region type (and ignores the bitstream ID).
14:36:10 <Sundar> • The init container pulls the bitstream with that ID from a bitstream repository (mechanism TBD) and programs the selected device.
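The five-step option Sundar lists above can be walked through as a toy sequence. Everything here is illustrative: the resource names follow his examples, the function names are invented, and the bitstream-repository mechanism is explicitly TBD in the discussion, so the "programming" step is only a placeholder.

```python
# Toy walk-through of the proposed FPGA workflow: region type as a
# resource, bitstream ID as a label, admission controller injecting an
# init container, scheduler matching on region type only.

def admission_controller(pod_spec):
    # Step 3: inject an init container on seeing an FPGA region resource.
    if any(r.startswith("intel.com/fpga-") for r in pod_spec["resources"]):
        pod_spec.setdefault("init_containers", []).append("fpga-programmer")
    return pod_spec

def schedule(pod_spec, nodes):
    # Step 4: pick a node by region type; the bitstream-id label is ignored.
    region = next(r for r in pod_spec["resources"]
                  if r.startswith("intel.com/fpga-"))
    return next(n for n, caps in nodes.items() if region in caps)

def run_init_containers(pod_spec):
    # Step 5: the init container would pull the bitstream by ID from a
    # repository and program the device (mechanism TBD in the log).
    if "fpga-programmer" in pod_spec.get("init_containers", []):
        return "programmed bitstream " + pod_spec["labels"]["bitstream-id"]

pod = {"resources": ["intel.com/fpga-dcp"],        # step 1/2: region type...
       "labels": {"bitstream-id": "bs-42"}}        # ...plus bitstream label
pod = admission_controller(pod)
node = schedule(pod, {"node-a": ["intel.com/fpga-dcp"],
                      "node-b": ["intel.com/fpga-vg"]})
print(node)                      # node-a
print(run_init_containers(pod))  # programmed bitstream bs-42
```

The security question raised next in the log (whether an elevated init container affects sibling containers in the pod) sits entirely inside the step modeled by `run_init_containers` here.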
14:37:20 <Sundar> I have heard that, if we give a higher security context to the init container for programming, it may affect other containers in the same pod. I am still trying to find evidence for that
14:37:46 <zhipeng> lol this is just too complicated
14:38:07 <zhipeng> thx Sundar I think we have a good understanding of the status qup
14:38:10 <zhipeng> quo
14:38:18 <Sundar> Zhipeng, more complicated than other proposals out there? ;)
14:38:25 <zhipeng> anyone else got questions regarding k8s ?
14:38:54 <zhipeng> if you are attending KubeCon we could meet f2f, and give them hell XD
14:39:09 <Sundar> lol
14:39:28 * NokMikeR braces for impact
14:41:04 <Li_Liu> if you guys have any dial-in-able meeting during kubecon, please loop us in
14:41:36 <shaohe_feng_> yes, loop us in
14:42:02 <zhipengh[m]> Okey I will give a howler if a bridge is available
14:42:33 <zhipengh[m]> Seems like my PC irc client just died
14:43:05 <zhipengh[m]> #topic Sub team arrangements
14:46:08 <zhipeng> phew
14:46:13 <Sundar> :)
14:46:42 <zhipeng> cell phone irc bouncer crashed just now
14:46:57 <zhipeng> moving on
14:47:05 <zhipeng> #topic subteam arrangements
14:47:35 <zhipeng> okey so given recent events, I think it is necessary to reorg the subteams
14:48:23 <zhipeng> and also encourage subteams to organize their specific meetings
14:48:30 <zhipeng> for specific topics
14:48:45 <zhipeng> so I would suggest shaohe to help lead the driver subteam
14:49:10 <zhipeng> work with our Xilinx and Lenovo colleagues on FPGA and GPU drivers in Rocky
14:49:29 <shaohe_feng_> Ok.
14:49:40 <zhipeng> Li Liu help lead the doc team, to work with our CMCC member and others to make documentation as good as your spec :)
14:49:49 <Li_Liu> sure
14:49:59 <zhipeng> I will keep on the release mgmt side
14:50:35 <zhipeng> shaohe_feng_ you can sync up with Chuck_ on a meeting time more suited for US west coast
14:50:55 <zhipeng> mainly China morning times I guess
14:51:13 <shaohe_feng_> zhipeng: who is Chuck_?
14:51:28 <Chuck_> I am Chuck_ :-)
14:51:37 <Chuck_> Hi Shaohe, this is Chuck from Xilinx
14:51:44 <Sundar> lol
14:51:46 <shaohe_feng_> Chuck_: hello
14:51:46 <Chuck_> I work on US west time
14:51:54 <zhipeng> I will add Chuck_ into our wechat group as well
14:52:13 <zhipeng> talk in Chinese :)
14:52:47 <Li_Liu> count me in for those driver meetings shaohe_feng_
14:52:52 <Li_Liu> :)
14:52:53 <Chuck_> yes, look forward to working with you.
14:53:20 <shaohe_feng_> Li_Liu: OK.
14:53:21 <zhipeng> and subteams plz send reports to the mailing list, you can decide whether it is bi-weekly or weekly
14:53:26 <zhipeng> or monthly even
14:53:28 <zhipeng> up to you
14:53:56 <Sundar> Is it all WeChat in Chinese then? ;) I can join if that helps
14:54:10 <zhipeng> it is for the China region devs :P
14:54:20 <zhipeng> all in Chinese and crazy emoticons :P
14:54:24 <Li_Liu> WeChat has a translation feature tho :)
14:54:39 <shaohe_feng_> Sundar: we speak Chinese there. :)
14:54:53 <shaohe_feng_> Sundar: you can learn Chinese.
14:55:24 <Sundar> OK :) My daughters learnt Mandarin. I should have joined them
14:55:55 <zhipeng> haha will learn a lot
14:56:34 <Sundar> :) Do we have alignment on what use cases we will deliver in Rocky?
14:57:11 <zhipeng> CERN HPC PoC could be the one for GPU
14:57:12 <Sundar> Can we say we will deliver AFaaS pre-programmed, and FPGA aaS with request time programming? These are the simplest ones, and many customers want that
14:57:36 <Sundar> Zhipeng, yes, GPU POC too
14:57:41 <zhipeng> Sundar that is something we should be able to deliver
14:57:47 <zhipeng> for FPGA
14:57:47 <shaohe_feng_> Sundar: oh, will we have the same RC name for vGPU and FPGA, and other accelerators in the Rocky release?
14:58:49 <Sundar> Shaohe, yes our agreement with Nova is to use a generic RC for all accelerators
14:59:04 <Sundar> as in the spec
14:59:21 <Sundar> Do we have kosamara here?
14:59:34 <Sundar> Any input on the spec from a GPU perspective?
14:59:58 <shaohe_feng_> Sundar: But I'm worried about being without nested providers.
15:00:28 <Sundar> Shaohe, without nRP, we will apply the traits etc. to the compute node RP
15:00:46 <Sundar> Do you see problems in doing that?
15:01:07 <shaohe_feng_> how do we distinguish vGPU and FPGA in one host?
15:02:02 <Li_Liu> hello?
15:02:08 <NokMikeR> welcome back
15:02:12 <Li_Liu> everyone quit?
15:02:21 <NokMikeR> net split, a glitch in the matrix.
15:02:44 <Sundar> All traits will get mixed in the compute node. So, a flavor asking for resource:CUSTOM_ACCELERATOR=1 and trait:CUSTOM_GPU_<foo> will come to cyborg which needs to choose a GPU based on its Deployables
15:02:54 <Sundar> Li_Liu, I am still here :)
15:03:24 <shaohe_feng_> Sundar: for example, we put FPGA traits and GPU traits on the host provider.
15:03:33 <zhipeng> okey this is a crazy night
15:03:42 <shaohe_feng_> Sundar: and there is one GPU and one FPGA
15:03:48 <Sundar> Sorry, i have another call at this time
15:04:04 <shaohe_feng_> firstly we consume one FPGA
15:04:08 <Sundar> Shaohe, can we continue another time?
15:04:14 <shaohe_feng_> Sundar: OK.
15:04:34 <zhipeng> okey moving on
15:04:45 <zhipeng> #topic critical rocky spec update
15:05:09 <zhipeng> xinran_ are you still around ?
15:05:48 <xinran_> Yes I’m here
15:06:06 <zhipeng> could you provide a brief update on the quota spec ?
15:09:31 <xinran_> Ok. Like we discussed with xiyuan, I think we should do the usage part first, so I want to separate the spec into two parts.
15:09:45 <xinran_> What do you think
15:11:13 <zhipeng> makes sense
15:11:24 <zhipeng> have you already updated it as two parts ?
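The concern shaohe raises here (the scenario he was sketching when the meeting ran out of time) can be modeled in a few lines. This is an illustrative sketch only, with invented trait and device names: without nested RPs, Placement sees one pooled CUSTOM_ACCELERATOR inventory plus a union of traits on the compute-node RP, so it can say "yes" to a request that Cyborg's per-device Deployables view cannot actually satisfy.

```python
# Sketch of the no-nRP trait-mixing problem: one GPU and one FPGA pooled
# into a single inventory on the compute node RP.

compute_node_rp = {
    "total": 2, "used": 0,  # 1 GPU + 1 FPGA pooled as CUSTOM_ACCELERATOR
    "traits": {"CUSTOM_GPU_FOO", "CUSTOM_FPGA_BAR"},  # union of traits
}
deployables = [  # Cyborg's per-device view
    {"name": "gpu0", "trait": "CUSTOM_GPU_FOO", "free": True},
    {"name": "fpga0", "trait": "CUSTOM_FPGA_BAR", "free": True},
]

def placement_match(flavor):
    # Placement only sees pooled inventory and the mixed trait set.
    return (flavor["amount"] <= compute_node_rp["total"] - compute_node_rp["used"]
            and flavor["trait"] in compute_node_rp["traits"])

def cyborg_pick(flavor):
    # Cyborg must choose the concrete device matching the requested trait.
    for d in deployables:
        if d["free"] and d["trait"] == flavor["trait"]:
            d["free"] = False
            return d["name"]

def consume(flavor):
    if not placement_match(flavor):
        return None
    dev = cyborg_pick(flavor)
    if dev:
        compute_node_rp["used"] += 1
    return dev

gpu_flavor = {"amount": 1, "trait": "CUSTOM_GPU_FOO"}
print(consume(gpu_flavor))          # gpu0
print(placement_match(gpu_flavor))  # True -- Placement still says yes...
print(consume(gpu_flavor))          # None -- ...but no free GPU remains
```

The last two lines show the mismatch: after the single GPU is consumed, the pooled inventory still has one unit left and the GPU trait still sits on the node RP, so Placement would admit a second GPU request that can only fail late in Cyborg.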
15:12:20 <xinran_> Not yet, will update the usage part this week
15:12:40 <zhipeng> sounds great :)
15:13:10 <zhipeng> #action xinran to update the quota spec into two parts and complete the usage one first
15:13:15 <xinran_> For the limit part, i think it still needs more discussion with xiyuan
15:13:29 <zhipeng> no problem
15:13:35 <Li_Liu> So that we will have 2 specs on this instead of 1 right?
15:13:42 <zhipeng> yep
15:13:53 <zhipeng> but the limit one should be rather simple
15:14:08 <zhipeng> since we will utilize a lot of things Keystone has already designed
15:14:46 <Li_Liu> I see
15:15:09 <xinran_> Yes I think so
15:15:53 <zhipeng> okey another thing is that the os-acc spec will need more time
15:16:08 <zhipeng> so I suggest we relax the deadline for proposal on that one to the June MS2
15:16:38 <zhipeng> if by then we still could not get it landed, then I will block it for Rocky but it could land first thing in Stein :)
15:16:47 <Li_Liu> zhipeng, let me know if you need some help on that one
15:16:57 <zhipeng> sounds reasonable
15:16:58 <zhipeng> ?
15:17:06 <zhipeng> Li_Liu sure :)
15:17:29 <Li_Liu> since I think I have some thoughts on it
15:17:34 <zhipeng> #agreed os-acc spec extended to MS2 for approval
15:17:53 <zhipeng> Li_Liu no problem, feel free to share it
15:18:04 <zhipeng> okey moving on
15:18:14 <zhipeng> #topic open patches/bugs
15:18:31 <zhipeng> I will push up the fix for mutable-config this week
15:20:16 <zhipeng> maybe combined with the fix the Lenovo folks have provided, which I blocked as a trivial fix
15:21:07 <zhipeng> any other issues on this topic ?
15:22:04 <zhipeng> okey then
15:22:09 <zhipeng> #topic AoB
15:22:32 <zhipeng> any other business
15:23:31 <xinran_> what is AoB......
15:23:45 <xinran_> Ah i know!
15:24:11 <zhipeng> xinran_ bien :)
15:25:07 <xinran_> ;)
15:26:10 <zhipeng> okey if there are no other topics
15:26:17 <zhipeng> let's conclude the meeting today
15:26:28 <zhipeng> thx for the great conversation :)
15:26:31 <zhipeng> #endmeeting