*** username_ has joined #openstack-cyborg | 01:21 | |
*** username_ is now known as username__ | 01:23 | |
*** dolpher has joined #openstack-cyborg | 02:29 | |
*** username__ has quit IRC | 02:35 | |
openstackgerrit | Li Liu proposed openstack/cyborg master: Added attribute object and its unit tests https://review.openstack.org/565382 | 04:30 |
---|---|---|
openstackgerrit | Li Liu proposed openstack/cyborg master: Added attribute object and its unit tests https://review.openstack.org/565382 | 04:40 |
*** evin has quit IRC | 05:47 | |
*** evin has joined #openstack-cyborg | 06:15 | |
*** xinran__ has joined #openstack-cyborg | 06:25 | |
xinran__ | hi, all. does cyborg have api doc? | 06:26 |
*** masber has quit IRC | 07:36 | |
*** mszwed has quit IRC | 10:05 | |
*** mszwed has joined #openstack-cyborg | 10:06 | |
*** xinran__ has quit IRC | 12:05 | |
*** circ-user-pF3Sp has joined #openstack-cyborg | 13:02 | |
*** Li_Liu has joined #openstack-cyborg | 13:51 | |
*** zhipeng has joined #openstack-cyborg | 13:51 | |
*** circ-user-pF3Sp has quit IRC | 13:54 | |
*** circ-user-IlMPE has joined #openstack-cyborg | 13:54 | |
*** evin has quit IRC | 13:55 | |
*** Helloway has joined #openstack-cyborg | 13:56 | |
*** NokMikeR has joined #openstack-cyborg | 13:56 | |
*** Helloway has quit IRC | 13:56 | |
*** Helloway has joined #openstack-cyborg | 13:57 | |
*** xinran__ has joined #openstack-cyborg | 13:59 | |
xinran__ | meeting? | 13:59 |
zhipeng | yes | 14:00 |
xinran__ | :) | 14:00 |
*** shaohe has joined #openstack-cyborg | 14:01 | |
zhipeng | #startmeeting openstack-cyborg | 14:01 |
openstack | Meeting started Wed May 9 14:01:06 2018 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot. | 14:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 14:01 |
*** openstack changes topic to " (Meeting topic: openstack-cyborg)" | 14:01 | |
openstack | The meeting name has been set to 'openstack_cyborg' | 14:01 |
zhipeng | #topic Roll Call | 14:01 |
*** openstack changes topic to "Roll Call (Meeting topic: openstack-cyborg)" | 14:01 | |
zhipeng | #info Howard | 14:01 |
sum12 | #info sum12 | 14:01 |
NokMikeR | #info Mike | 14:02 |
Helloway | #info Helloway | 14:02 |
edleafe | #info Ed | 14:02 |
Li_Liu | #info Li_Liu | 14:02 |
shaohe | #info shaohe | 14:02 |
zhipeng | Hi sum12, could you introduce a bit about yourself ? | 14:03 |
xinran__ | #info xinran__ | 14:03 |
*** sundar has joined #openstack-cyborg | 14:03 | |
sum12 | Hey, I am Sumit. I work for SUSE and have been part of some other projects already. | 14:03 |
sundar | Hi Sumit | 14:03 |
zhipeng | welcome :) | 14:03 |
Li_Liu | Hi Sumit | 14:03 |
sum12 | Thanks everyone :) | 14:04 |
NokMikeR | Welcome | 14:04 |
sundar | #info Sundar | 14:04 |
shaohe | welcome sum12 | 14:04 |
sum12 | glad to be here, thanks sundar zhipeng Li_Liu NokMikeR sundar | 14:05 |
zhipeng | okey let's get into business | 14:06 |
zhipeng | #topic driver subteam meeting time | 14:06 |
*** openstack changes topic to "driver subteam meeting time (Meeting topic: openstack-cyborg)" | 14:06 | |
zhipeng | so shaohe has helped start a poll | 14:07 |
zhipeng | and it seems three options are at the top | 14:07 |
sundar | Could somebody post the poll link here? | 14:08 |
zhipeng | UTC 1:00am Mon, UTC 2:00pm Mon, UTC 8:30am Wed | 14:08 |
zhipengh[m] | #link https://doodle.com/poll/pa3gi78yncsr7qee | 14:08 |
shaohe | sundar: any preference on the time? | 14:10 |
zhipeng | let's pick one :) | 14:10 |
Li_Liu | I am good with anytime at night(which is morning back in China) | 14:11 |
sundar | Yes, Shaohe. I clicked on the doodle now. Sunday eve or Mon 7 am PST are both good | 14:12 |
zhipeng | ok let's do the Mon UTC 2:00pm | 14:13 |
zhipeng | which will be 10:00pm at China, 10:00 am EST, 7:00am PST | 14:13 |
zhipeng | it's the most popular time | 14:15 |
zhipeng | okey ? | 14:15 |
NokMikeR | ok | 14:15 |
sundar | Sounds good to me! | 14:15 |
zhipeng | #agreed driver subteam meeting time every Mon UTC1400 | 14:16 |
Li_Liu | Monday? | 14:16 |
zhipeng | Monday | 14:16 |
Li_Liu | ok | 14:16 |
Li_Liu | sounds good | 14:16 |
zhipeng | #topic new core reviewer promotion | 14:16 |
*** openstack changes topic to "new core reviewer promotion (Meeting topic: openstack-cyborg)" | 14:16 | |
zhipeng | in order to increase our review bandwidth, i hereby promote Sundar to be the new core reviewer | 14:17 |
zhipeng | Sundar has been very active and taking charge of several important specs | 14:17 |
zhipeng | so as usual, we will have one week time for any feedback, and acknowledgment of the promotion next Wed :) | 14:17 |
kosamara | #info kosamara | 14:18 |
sundar | Thanks, zhipeng | 14:18 |
Li_Liu | gratz :) | 14:18 |
shaohe | gratz :) | 14:19 |
zhipeng | well not yet :) | 14:19 |
zhipeng | let's wait for a week for feedback | 14:19 |
sundar | Thanks, Li_Liu and shaohe :). As zhipeng says, one week from now, we will know. | 14:20 |
zhipeng | but I would like to thank Sundar for his great effort so far | 14:20 |
zhipeng | :) | 14:20 |
zhipeng | okey moving on | 14:20 |
zhipeng | #topic KubeCon feedback | 14:20 |
*** openstack changes topic to "KubeCon feedback (Meeting topic: openstack-cyborg)" | 14:20 | |
zhipeng | okey last week i attended KubeCon and the resource mgmt wg deep dive session | 14:20 |
zhipeng | k8s res mgmt wg is the center in k8s which deals with general and acceleration resources | 14:21 |
zhipeng | My takeaway is that the support for a general accelerator mgmt, is still not in any shape in k8s | 14:22 |
zhipeng | Google is interested in GPU passthrough support for ML, mainly | 14:22 |
zhipeng | so if anyone wants to introduce any feature in-tree | 14:23 |
zhipeng | that would require a PoC up-front | 14:24 |
zhipeng | many things we discussed here, like vGPU types, general accelerator support including FPGA and others | 14:24 |
zhipeng | are viewed non-priority | 14:25 |
zhipeng | the resource class/resource api PRs are also long shot | 14:25 |
zhipeng | according to vishnu | 14:25 |
sundar | zhipeng: Agreed with your assessments. If we can get just one feature in -- passing annotations to the device plugin API -- that will help us meet most basic FPGA use cases, IMHO | 14:27 |
zhipeng | Sasha mentioned Intel team has finished a FPGA DPI PoC | 14:27 |
zhipeng | but also just pre-programmed FPGAs | 14:28 |
sundar | We can couple that with a scheduler extension. However, scheduler extensions are not viewed favorably because the scheduler framework itself may change and the APIs may change along with it. | 14:28 |
zhipeng | yes, and also DPI is designed at the node level | 14:29 |
sundar | However, we can do a POC based on it, including programming support, and revamp it when the APIs change. Just my thought. :) | 14:29 |
zhipeng | and a mostly "reschedule" focused mechanism | 14:29 |
sundar | Yes, we do have a POC that does only pre-programmed use case. That does not show the strength of FPGAs, which is reprogramming | 14:29 |
zhipeng | so DPI is designed to mostly work for hot-plug use case, not scheduling upfront | 14:30 |
zhipeng | the scheduling will be retriggered once the node discovers the DPI Plugin | 14:30 |
zhipeng | anyways it is the current lay of land | 14:30 |
zhipeng | so my thinking is, maybe it is reasonable to introduce a CRD framework for cyborg into k8s community | 14:31 |
zhipeng | so that we could have all of our data model preserved, have leeway on the api and scheduling design | 14:31 |
zhipeng | maintain a k8s-ish API interface | 14:31 |
zhipeng | and an out-of-band general accelerator mgmt functionality, not bound to DPI development | 14:32 |
zhipeng | i don't know what other team members' thoughts are on this matter ? | 14:32 |
sundar | The CRD framework does not allow AFAIK for a nested topology, which OpenStack supports. | 14:32 |
zhipeng | CRD is just an API mechanism right ? | 14:33 |
NokMikeR | needs more blinking lights | 14:33 |
zhipeng | not implementation specific | 14:33 |
sundar | Yes. How do we model regions inside FPGAs, accelerators inside regions, local memory inside either ... | 14:33 |
zhipeng | that could all be done in Cyborg | 14:34 |
zhipeng | for example if you look at kubernetes service catalog | 14:35 |
Li_Liu | k8s will do the scheduling for Cyborg then? | 14:35 |
zhipeng | we could use scheduling extension maybe for that | 14:35 |
zhipeng | but I doubt Google wants to have the k8s core doing scheduling that takes accelerators into consideration | 14:36 |
sundar | The Cyborg implementation could relate different resources. Agreed. The CRD discussions also seem to get into resource classes etc., which seem to be a long shot, as you said. Yes, agreed that scheduling core cannot be changed | 14:37 |
Li_Liu | but the scheduling extension still requires some change in k8s main tree right? | 14:37 |
*** evin has joined #openstack-cyborg | 14:37 | |
sundar | Li_Liu: scheduler extension is a standard mechanism in K8s today | 14:38 |
zhipeng | yep what sundar said | 14:38 |
sundar | However, the scheduler framework itself may evolve, and the extension APIs may change along with it | 14:38 |
sundar | Link to proposed K8s scheduler framework: https://docs.google.com/document/d/1NskpTHpOBWtIa5XsgB4bwPHz4IdRxL1RNvdjI7RVGio/edit# | 14:39 |
zhipeng | #link https://medium.com/@trstringer/create-kubernetes-controllers-for-core-and-custom-resources-62fc35ad64a3 | 14:40 |
zhipeng | some crd fundamentals | 14:40 |
shaohe | zhipeng: can not open it. | 14:41 |
zhipeng | so as I understand, crd is basically a way that we write a non-core k8s-ish controller | 14:41 |
zhipeng | shaohe you need vpn | 14:41 |
zhipeng | it listens upon the api-server | 14:42 |
zhipeng | and the keyword will trigger the request going to the crd controller, instead of the core k8s controller | 14:42 |
zhipeng | basically a hat on cyborg, if you will | 14:42 |
Li_Liu | so it's a subscribe/notify model right? | 14:44 |
zhipeng | in essence, as I understand yes | 14:44 |
NokMikeR | in go land | 14:46 |
*** shaohe has quit IRC | 14:46 | |
zhipeng | yep :) | 14:47 |
NokMikeR | is it trivial to use python from go or how does the cyborg api interaction work to k8 and go? | 14:48 |
zhipeng | we could have gRPC clients that abstract away the lang difference | 14:48 |
NokMikeR | ok | 14:48 |
NokMikeR | I need that for English to Finnish to English also :) | 14:49 |
zhipeng | haha | 14:50 |
zhipeng | you need google duplex for that :P | 14:50 |
NokMikeR | :) | 14:51 |
sundar | Exactly, gRPC -- as Zhipeng said. The controller is a separate daemon. Also, kubelet and Cyborg DP will also be separate processes. | 14:51 |
zhipeng | let's keep discussion alive offline :) | 14:52 |
zhipeng | #topic bugs and issues | 14:52 |
*** openstack changes topic to "bugs and issues (Meeting topic: openstack-cyborg)" | 14:52 | |
zhipeng | shaohe a colleague of mine reported that when devstack starts, he could not find cyborg services | 14:53 |
zhipeng | have you encountered similar problem ? | 14:53 |
*** shashaguo has joined #openstack-cyborg | 14:54 | |
zhipeng | shaohe dropped i think | 14:55 |
zhipeng | okey let's move on to the next topic then | 14:55 |
NokMikeR | I reported the same problem many moons ago, but have not tried lately to install it | 14:55 |
zhipeng | NokMikeR i think during some of the past fixes it turns out ok | 14:56 |
zhipeng | i'm not sure if some of the recent patches broke it | 14:56 |
NokMikeR | the mutable config page shows a failure? https://review.openstack.org/#/c/559303/ | 14:57 |
*** shaohe_ has joined #openstack-cyborg | 14:57 | |
shaohe_ | ping | 14:58 |
zhipeng | hey welcome back | 14:58 |
NokMikeR | pong | 14:58 |
zhipeng | NokMikeR that should be already fixed | 14:58 |
NokMikeR | ok | 14:58 |
zhipeng | so the specific problem is c-cond and c-agent are not running | 15:00 |
zhipeng | shaohe_ that is not normal right ? | 15:00 |
shaohe_ | yes. | 15:00 |
zhipeng | if devstack succeed | 15:01 |
zhipeng | c-api, c-cond, c-agent should all be running right ? | 15:01 |
shaohe_ | devstack should report error, if c-cond and c-agent are not running | 15:01 |
zhipeng | ok i will contact the author | 15:02 |
zhipeng | okey moving on | 15:02 |
zhipeng | #topic spec review day | 15:03 |
*** openstack changes topic to "spec review day (Meeting topic: openstack-cyborg)" | 15:03 | |
shaohe_ | it should be cyborg-agent, cyborg-api, cyborg-cond | 15:03 |
zhipeng | nova and other team all have this custom of spec sprint, or runways | 15:03 |
zhipeng | let's have one as well | 15:03 |
zhipeng | shaohe_ ok I will let him know | 15:03 |
zhipeng | #link https://etherpad.openstack.org/p/cyborg-rocky-spec-day | 15:04 |
shaohe_ | is c-xxx cinder process? | 15:04 |
zhipeng | maybe he just referred wrong | 15:04 |
zhipeng | I should double check with him | 15:04 |
shaohe_ | yes | 15:04 |
zhipeng | okey back to topic | 15:05 |
zhipeng | let's start with the "old ones" :) | 15:06 |
zhipeng | we should be ready to land those | 15:06 |
* NokMikeR braces for impact | 15:06 | |
zhipeng | first up | 15:07 |
zhipeng | #info python-cyborgclient framework | 15:07 |
zhipeng | #link https://review.openstack.org/565023 | 15:07 |
zhipeng | shaohe_ mentioned the client code is actually ready, so let's land this one | 15:07 |
zhipeng | any objections ? | 15:07 |
Li_Liu | agree, I think we can merge that one | 15:07 |
sundar | I think the syntax is not quite in line? | 15:07 |
Li_Liu | just finished reviewing | 15:08 |
sundar | Other commands are like 'openstack server ...' where the 2nd argument is the object on which an action is applied | 15:08 |
sundar | Whereas we are proposing 'openstack acceleration ...' I saw Shaohe respond to my comment. | 15:08 |
sundar | It is not clear to me why we cannot have 'openstack accelerator <list/show/...>' | 15:09 |
NokMikeR | wondering why its not openstack cyborg list/show ? | 15:09 |
zhipeng | because accelerator is more like an object that we will choose to act upon | 15:09 |
zhipeng | NokMikeR legal issue | 15:10 |
NokMikeR | ok | 15:10 |
sundar | NokMikeR: Just as we have 'openstack server ..' instead of 'openstack nova ' | 15:10 |
shaohe_ | cyborg is project name, acceleration is service type. | 15:10 |
edleafe | yeah - project names should be avoided | 15:10 |
NokMikeR | clarified thanks. | 15:11 |
Li_Liu | that's good to know :) | 15:11 |
sundar | shaohe: It is not clear to me why we cannot have 'openstack accelerator <list/show/...>' | 15:11 |
zhipeng | sundar see my above comment | 15:12 |
zhipeng | because accelerator is more like an object that we will choose to act upon | 15:12 |
zhipeng | acceleration is a type of service, like server service, volume service | 15:12 |
sundar | zhipeng, yes. Like server/image etc. shouldn't it be the 2nd arg? | 15:12 |
zhipeng | we might have something like openstack acceleration fpga create | 15:12 |
zhipeng | shaohe_ correct me if I'm wrong | 15:13 |
shaohe_ | yes. I have looked at other projects. | 15:13 |
shaohe_ | they do use this syntax | 15:13 |
shaohe_ | some use this syntax | 15:14 |
NokMikeR | is it command(create) then object(fpga) ? | 15:14 |
NokMikeR | openstack [<global-options>] <object-1> <action> [<object-2>] [<command-arguments>] | 15:15 |
shaohe_ | yes as NokMikeR suggestion | 15:15 |
zhipeng | NokMikeR yes the example i mentioned might not be strictly correct :P | 15:15 |
shaohe_ | global-options can be service type | 15:15 |
shaohe_ | for cyborg the service type is acceleration. | 15:16 |
shaohe_ | for cinder the service type is volume | 15:16 |
edleafe | #link openstackclient guidelines: https://docs.openstack.org/python-openstackclient/3.4.1/humaninterfaceguide.html#command-structure | 15:16 |
shaohe_ | for glance the service type is image | 15:16 |
sundar | shaohe: May be I still have a disconnect. :) "openstack [<global-options>] <object-1> <action> [<object-2>] [<command-arguments>]" Where is the service here? | 15:16 |
edleafe | sundar: object-1 | 15:17 |
sundar | So, we will have syntax like 'openstack acceleration create <object-2> ...' | 15:18 |
zhipeng | something like that | 15:18 |
shaohe_ | openstack network --help |less | 15:18 |
sundar | If everybody else is ok with it, I am ok too :) | 15:18 |
edleafe | I would suggest s/acceleration/accelerator | 15:19 |
shaohe_ | openstack network flavor create | 15:19 |
zhipeng | edleafe any reason for that ? | 15:19 |
sundar | IMHO, the term 'accelerator' is more in line with other usages -- but go ahead :) | 15:20 |
edleafe | ^^ what sundar said | 15:20 |
zhipeng | ya I'm just thinking when we are actually using accelerator it will need to be more specific (FPGA, GPU, ...) | 15:21 |
zhipeng | acceleration could just represent a service type offered by cyborg in general | 15:21 |
shaohe_ | sundar: edleafe: $ openstack --help |grep volume | 15:21 |
xinran__ | Yes I think accelerator is more reasonable | 15:21 |
sundar | zhipeng: yes, openstack accelerator create fpga <args> -- here 'fpga' is 'object-2' | 15:21 |
NokMikeR | acceleration implies more than one device? accelerator is singular or one device. If we debate this to the end we end up somewhere in particle physics... :) | 15:21 |
zhipeng | haha NokMikeR | 15:22 |
zhipeng | but I think accelerator has more votes here | 15:22 |
sum12 | i feel accelerator/acceleration are both too complicated as compared to service/image/network/... | 15:22 |
zhipeng | sum12 lol what is your suggestion | 15:22 |
sum12 | I am asking whether we have anything easier in our arsenal ? | 15:23 |
zhipeng | maybe just acc ? | 15:23 |
shaohe_ | sum12: acc for abbreviation? | 15:23 |
zhipeng | shaohe_ man crush | 15:24 |
sundar | sum12: we use the term 'accelerator' or 'device'. But the term device is too general? | 15:24 |
zhipeng | sundar ya that might give people confusion | 15:24 |
sum12 | accel ? | 15:24 |
zhipeng | accel a little bit too Xilinx-ish ? | 15:24 |
sundar | sum12: accel is ok by me. I use that abbrev too | 15:24 |
zhipeng | okey anyone else ? | 15:24 |
zhipeng | if accel could work then accel it is :) | 15:25 |
sundar | If we give bash completions | 15:25 |
sundar | If we provide bash completions, it may not matter :) | 15:25 |
edleafe | 'accelerator' is a known term for computers. | 15:25 |
edleafe | 'accel' not so much | 15:25 |
shaohe_ | openstack command support bash completion | 15:25 |
sundar | sum12: with bash completion, do you still see a problem with 'accelerator'? | 15:26 |
shaohe_ | let see cinder's command name | 15:26 |
shaohe_ | $ openstack --help |grep volume | 15:26 |
shaohe_ | volume type create Create new volume type volume type delete Delete volume type(s) volume type list List volume types volume type set Set volume type properties volume type show Display volume type details volume type unset Unset volume type properties volume unset Unset volume properties | 15:27 |
shaohe_ | oh, sorry | 15:27 |
shaohe_ | volume type create Create new volume type | 15:27 |
shaohe_ | volume snapshot create Create new volume snapshot | 15:27 |
sum12 | bash completion is not a problem, but if I was a devops guy I would like something small and easy to remember (scripting) and not too character-y | 15:27 |
NokMikeR | https://docs.openstack.org/python-openstackclient/latest/ they are listed here | 15:27 |
NokMikeR | https://docs.openstack.org/python-openstackclient/latest/cli/commands.html | 15:28 |
sundar | From the link by NokMikeR: $ volume type list # 'volume type' is a two-word single object | 15:28 |
shaohe_ | $ openstack volume type create --help | 15:28 |
shaohe_ | usage: openstack volume type create [-h] [-f {json,shell,table,value,yaml}] | 15:28 |
edleafe | sum12: at least it isn't as long as 'application credential' :) | 15:29 |
sum12 | edleafe: :) | 15:29 |
zhipeng | do we have a consensus now ? :) | 15:29 |
shaohe_ | sundar: yes, it is two-word single object | 15:29 |
shaohe_ | and volume is the service type. | 15:30 |
sundar | We have a single-word single object :) which is better | 15:30 |
sundar | Anyways, I vote for 'openstack accelerator <command> <object-2> ...' That's my 2 cents | 15:30 |
zhipeng | i vote for that as well | 15:31 |
NokMikeR | same here if Im allowed. | 15:32 |
shaohe_ | OK, remove the service type. | 15:32 |
shaohe_ | but keep in mind | 15:32 |
*** Yumeng__ has joined #openstack-cyborg | 15:33 | |
kosamara | I would prefer accelerator too | 15:33 |
shaohe_ | if cyborg support a flavor api | 15:33 |
shaohe_ | flavor create/list | 15:33 |
shaohe_ | what should it be? | 15:33 |
zhipeng | sum12 that word looks a bit shorter now ? lol | 15:33 |
shaohe_ | ^ sundar: | 15:33 |
sum12 | :) | 15:33 |
shaohe_ | let's show the command: | 15:34 |
shaohe_ | $ openstack --help |grep flavor | 15:34 |
shaohe_ | flavor create Create new flavor | 15:34 |
shaohe_ | network flavor create Create new network flavor | 15:34 |
shaohe_ | there are two flavors, | 15:34 |
sundar | Shaohe: why should Cyborg commands create flavors? Flavors with accelerators should still be under usual command, right? In any case, we can do 'openstack accelerator create flavor ...' | 15:35 |
shaohe_ | first one is nova flavor | 15:35 |
shaohe_ | second one is network flavor. | 15:35 |
sundar | shaohe: ok. we can do 'openstack accelerator create flavor ...' | 15:36 |
shaohe_ | sundar: but openstack accelerator create flavor is not formal | 15:36 |
*** Yumeng__ has left #openstack-cyborg | 15:36 | |
shaohe_ | because accelerator is just a collection of our restful urls. | 15:37 |
shaohe_ | and acceleration is the service type we register in keystone | 15:38 |
sundar | shaohe: Not sure what you mean by 'formal'. If you prefer 'openstack accelerator flavor create', like nova/network, I am fine. | 15:38 |
zhipeng | shaohe_ i think let's just change it to accelerator | 15:38 |
zhipeng | seems like a team consensus at the moment | 15:38 |
zhipeng | after that update the patch should be good to go :) | 15:39 |
shaohe_ | zhipeng: OK, so we register in keystone use accelerator instead of acceleration? | 15:39 |
sundar | There was some deprecated code in that patch | 15:40 |
shaohe_ | sundar: I will remove them | 15:40 |
sundar | Thanks, shaohe | 15:40 |
zhipeng | shaohe_ yes I guess so | 15:40 |
shaohe_ | OK, let me file a patch to correct the service type firstly. | 15:41 |
zhipeng | many thx :) | 15:41 |
zhipeng | ok given the time in China | 15:41 |
zhipeng | xiran_ | 15:42 |
*** evin has quit IRC | 15:42 | |
zhipeng | could you provide an update on the quota spec ? | 15:42 |
zhipeng | xinran_ | 15:42 |
zhipeng | #info Quota spec | 15:42 |
zhipeng | #link https://review.openstack.org/560285 | 15:42 |
zhipeng | #link https://review.openstack.org/564968 | 15:42 |
sundar | Is this the Keystone-based quota that we were recommended to use? | 15:44 |
xinran__ | Yes, I have updated the spec. First we should support quota usage in cyborg, then implement the limit part by invoking oslo.limit once the keystone guys finish that | 15:45 |
xinran__ | I have a doubt about resource type. Should we just count total number of accelerator or should we count like fpga gpu etc | 15:46 |
Li_Liu | We count the number of deployables I think | 15:47 |
zhipeng | Li_Liu shall we count them per type ? | 15:47 |
sundar | Li_Liu: A FPGA, as well as regions with it, will all be Deployables, right? | 15:47 |
shaohe_ | count granularity | 15:47 |
Li_Liu | yes, they should be grouped in types for sure | 15:48 |
xinran__ | zhipeng: seems there is only one resource class(accelerator) for now | 15:48 |
sundar | So, should the quota be based on Deployable type? i.e. you can get X regions | 15:48 |
Li_Liu | the deployable patch has already been merged | 15:49 |
Li_Liu | regions are just a type of deployable | 15:49 |
sundar | I think xinran is right -- quotas are based on resource classes, right? | 15:49 |
xinran__ | sundar: yes I think so :) | 15:50 |
zhipeng | okey then we could settle upon that :) | 15:51 |
shaohe_ | sundar: only one resource type for quota? | 15:51 |
sundar | OK, then there is only one resource class in Cyborg -- CUSTOM_ACCELERATOR -- as we agreed with Nova | 15:51 |
Li_Liu | i see, that's what is exposed to nova. | 15:52 |
shaohe_ | for example, there maybe spdk software accelerators and vgpu accelerators. they share the same quotas? | 15:52 |
sundar | shaohe: I think so, but maybe I need to read more on quotas | 15:52 |
Li_Liu | xinran__ sundar their db existence are deployable just to be clear | 15:53 |
sundar | Li_Liu: when we get to oslo.limit based on Keystone, as Xinran said, that will be based on resource classes, right? | 15:53 |
shaohe_ | for nova, at present, the granularity is cpu, mem... | 15:53 |
edleafe | sundar: why just one CUSTOM_ class? The whole idea behind CUSTOM_ resource classes is that the service can create what it needs. | 15:54 |
shaohe_ | maybe gpu is also one quota | 15:54 |
Li_Liu | sundar right, deployables are just db existences. | 15:54 |
sum12 | need to drop | 15:54 |
sundar | edleafe: this is what Nova folks proposed to us, right? :) Are we ok with CUSTOM_ACCELERATOR_FPGA, CUSTOM_ACCELERATOR_GPU, ...? | 15:55 |
xinran__ | I think quota should depends on resource class | 15:55 |
shaohe_ | we should keep in mind, if we use one CUSTOM_ class, we must use nested resource providers. | 15:55 |
edleafe | sundar: I guess I missed that proposal. Was it to keep the Nova flavors simple? | 15:55 |
shaohe_ | or we can not distinguish the different resource | 15:56 |
sundar | edleafe: Multiple resource classes would actually be better. I didn't see a specific reason for single RC. Maybe the discussion was centered around vGPU types, and one was enough | 15:56 |
shaohe_ | what's the difference on flavors if we use one CUSTOM_ class or multi CUSTOM_ class? | 15:57 |
edleafe | sundar: that sounds more correct. As shaohe_ notes, if I ask for CUSTOM_ACCELERATOR, I might get back an FPGA or a GPU. :) | 15:57 |
sundar | edleafe: we have traits that distinguish FPGAs of different types, GPUs of different types, ... | 15:58 |
sundar | So, the flavor would ask for the traits too, as noted in the spec | 15:58 |
sundar | But, for quotas, it would be better to have distinct RCs based on device type (FPGA, GPU, HPTS, ....) | 15:59 |
xinran__ | I also feel a little bit confused why there is only one resource class but anyway quota should accord to resource class right? | 15:59 |
zhipeng | Li_Liu any input ? | 16:00 |
shaohe_ | so for one CUSTOM_ class, it should be nested RP first, and the flavor should be: resources:CUSTOM_ACCELERATOR:1, traits: FPGA | 16:00 |
shaohe_ | for multi CUSTOM_ class, the flavor should be: resources:CUSTOM_FPGA:1 | 16:00 |
shaohe_ | for multi CUSTOM_ class, the flavor should be: resources:CUSTOM_GPU:1 | 16:00 |
shaohe_ | ^ edleafe: right? | 16:01 |
sundar | shaohe: yes, you are right. We don't need nested RPs for this flavor definition. | 16:01 |
edleafe | It would be better, IMO, to have separate resource classes to distinguish the different devices (GPU vs. FPGA), and use traits to further refine the capabilities of a particular device | 16:01 |
sundar | But that will introduce limitations on combining different device types | 16:02 |
sundar | edleafe: Agreed. Let me amend the spec. Please review it! | 16:02 |
shaohe_ | multi CUSTOM_ class can work without nRP. | 16:02 |
edleafe | sundar: ack | 16:02 |
shaohe_ | one CUSTOM_ class must work with nRP | 16:02 |
sundar | shaohe: multi CUSTOM_ class without nRPs will also have issues if you combine 2 different FPGAs on same host | 16:03 |
Li_Liu | zhipeng as far as I'm concerned, using CUSTOM_ACCELERATOR might be a bit too general, if we decide to use it, additional information will be needed to schedule/allocate the resources | 16:03 |
xinran__ | Li_Liu: like traits? | 16:03 |
zhipeng | will the way shaohe_ just mentioned work ? | 16:04 |
Li_Liu | right. coz essentially, we want to guide nova during their scheduling. | 16:04 |
sundar | We will not create RCs like CUSTOM_ACCELERATOR_FPGA_INTEL_ARRIA10, but only CUSTOM_ACCELERATOR_FPGA, CUSTOM_ACCELERATOR_GPU, etc. | 16:04 |
shaohe_ | xinran__: yes, it need traits on sub resource provider. | 16:04 |
Li_Liu | shaohes's way should work | 16:04 |
sundar | shaohe and all: I have been tracking nRP support in Nova, and apparently we are a week or two away from getting it. edleafe can confirm :) So, maybe we don't have to split hairs over what to do without nRP ;) | 16:05 |
sundar | Even with CUSTOM_ACCELERATOR_FPGA etc., we still need traits | 16:06 |
shaohe_ | but we may not need nRP. | 16:07 |
sundar | Sorry, late for my next meeting. I'll do my reviews offline and catch up here. :) | 16:07 |
edleafe | it does seem that nested RPs may be merged soon | 16:08 |
shaohe_ | edleafe: good. Then one CUSTOM_ class can work. | 16:09 |
zhipeng | sounds very promising :) | 16:09 |
edleafe | shaohe_: sure, it *can* work, but it isn't really a good design | 16:09 |
sundar | shaohe: multiple classes are good for quotas | 16:10 |
NokMikeR | Sorry Howard I have to leave physically but will leave this on monitoring. Best of luck sleeping :) | 16:10 |
zhipeng | NokMikeR :) no problem | 16:11 |
shaohe_ | edleafe: I have no objection on one CUSTOM_ class after we support nPR | 16:11 |
zhipeng | okey so since this is a spec review day, Li_Liu sundar plz continue discuss your specs here and I will leave the meeting recorded throughout the day | 16:11 |
xinran__ | Anyway I think quota should be accord with RCs... if one RC, quota count one... | 16:12 |
Li_Liu | I am gonna grab some lunch, we can discuss this later today | 16:14 |
Li_Liu | I will keep my session open | 16:14 |
Li_Liu | @sundar and all | 16:15 |
sundar | Sorry, guys. I am jumping between my meeting and here. I will keep it open too. | 16:20 |
*** zhipeng has quit IRC | 16:23 | |
edleafe | zhipengh[m]: you forgot to #endmeeting | 16:28 |
zhipengh[m] | That is on purpose :) | 16:29 |
sundar | Li_Liu: I am getting customer feedback that some of them want to use function names in Rocky. The problem is that FPGA hardware and bitstreams may expose function IDs, not names. | 16:34 |
sundar | So, one possibility is to let the operator apply a Glance property for function name when he onboards a bitstream, and reference that in the flavor. For Cyborg, it is just another Glance property to query -- whether it is function ID or name. | 16:35 |
sundar | What do you think? | 16:35 |
*** masber has joined #openstack-cyborg | 16:40 | |
*** evin has joined #openstack-cyborg | 16:58 | |
*** shashaguo has quit IRC | 17:34 | |
*** NokMikeR has quit IRC | 17:35 | |
Li_Liu | sundar, I added function_uuid based on your comments for the metadata spec. That uuid should be mapped to a specific function name | 18:15 |
*** circ-user-IlMPE has quit IRC | 18:22 | |
*** Helloway has quit IRC | 18:23 | |
*** xinran__ has quit IRC | 18:39 | |
sundar | Li_Liu: yes. At least for Rocky, we can leave that mapping to the operator. | 18:56 |
sundar | IOW, the traits we apply are still based only on UUIDs. No further complexity in Rocky | 18:59 |
sundar | Can you add a function name as an optional property in your spec? | 19:00 |
*** sundar has quit IRC | 19:04 | |
Li_Liu | sundar sure will do | 19:11 |
*** circ-user-IlMPE has joined #openstack-cyborg | 19:28 | |
*** circ-user-IlMPE has quit IRC | 19:32 | |
*** sundar has joined #openstack-cyborg | 19:40 | |
sundar | Li_Liu: Thanks! | 19:40 |
*** sundar has quit IRC | 19:56 | |
*** circ-user-IlMPE has joined #openstack-cyborg | 20:57 | |
*** circ-user-IlMPE has quit IRC | 21:01 | |
*** Li_Liu has quit IRC | 21:12 | |
*** circ-user-IlMPE has joined #openstack-cyborg | 22:17 | |
*** circ-user-IlMPE has quit IRC | 22:21 | |
*** circ-user-IlMPE has joined #openstack-cyborg | 23:26 | |
*** circ-user-IlMPE has quit IRC | 23:30 | |
*** shaohe_ has quit IRC | 23:54 |
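
Editor's sketch of the command shape discussed during the python-cyborgclient review. NokMikeR quoted the openstackclient pattern `openstack [<global-options>] <object-1> <action> [<object-2>] [<command-arguments>]`, and the team settled on `accelerator` as object-1. The Python below is only an illustration of that grammar, not openstackclient code: `OscCommand` and `parse_osc_command` are hypothetical names, global options are assumed to be single `--flag` or `--key=value` tokens, and two-word objects such as `volume type` are not handled.

```python
from typing import List, NamedTuple, Optional


class OscCommand(NamedTuple):
    """Hypothetical container for the parts of an openstack CLI line."""
    object_1: str
    action: str
    object_2: Optional[str]
    arguments: List[str]


def parse_osc_command(line: str) -> OscCommand:
    """Split a command line into <object-1> <action> [<object-2>] [<args>].

    Simplification: the first token after the action that does not start
    with '-' is treated as <object-2>.
    """
    tokens = line.split()
    if not tokens or tokens[0] != "openstack":
        raise ValueError("expected a command starting with 'openstack'")
    rest = tokens[1:]
    # Skip global options (assumed to be single tokens such as --debug).
    while rest and rest[0].startswith("-"):
        rest = rest[1:]
    if len(rest) < 2:
        raise ValueError("expected at least <object-1> <action>")
    object_1, action = rest[0], rest[1]
    tail = rest[2:]
    object_2 = None
    if tail and not tail[0].startswith("-"):
        object_2 = tail[0]
        tail = tail[1:]
    return OscCommand(object_1, action, object_2, tail)


if __name__ == "__main__":
    # The form sundar proposed: 'fpga' lands in the <object-2> slot.
    print(parse_osc_command("openstack accelerator create fpga"))
    print(parse_osc_command("openstack accelerator list"))
```

With this reading, `openstack accelerator create fpga` puts `fpga` in the object-2 slot, which is the form sundar proposed, while `openstack accelerator list` leaves object-2 empty.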
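
Editor's sketch of the two flavor styles compared in the quota discussion: one generic resource class narrowed by a trait, versus one resource class per device type. It assumes the Nova flavor extra-spec convention of `resources:<CLASS>=<count>` and `trait:<TRAIT>=required`. The helper names, the `CUSTOM_FPGA` trait, and the sample allocations and limits are hypothetical; the per-class tally only illustrates why distinct classes make quotas easy to count and is not oslo.limit.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def flavor_extra_specs(resource_class: str, count: int = 1,
                       required_traits: Iterable[str] = ()) -> Dict[str, str]:
    """Build flavor extra specs requesting `count` units of a resource class."""
    specs = {f"resources:{resource_class}": str(count)}
    for trait in required_traits:
        specs[f"trait:{trait}"] = "required"
    return specs


def usage_by_resource_class(allocations: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum allocated amounts per resource class."""
    usage: Counter = Counter()
    for resource_class, amount in allocations:
        usage[resource_class] += amount
    return dict(usage)


def over_quota(usage: Dict[str, int], limits: Dict[str, int]) -> List[str]:
    """Return the resource classes whose usage exceeds the limit (default 0)."""
    return sorted(rc for rc, used in usage.items() if used > limits.get(rc, 0))


if __name__ == "__main__":
    # One generic class: the trait has to carry the device type.
    print(flavor_extra_specs("CUSTOM_ACCELERATOR", required_traits=["CUSTOM_FPGA"]))
    # One class per device type: the class itself says what is requested.
    print(flavor_extra_specs("CUSTOM_ACCELERATOR_FPGA"))

    usage = usage_by_resource_class([
        ("CUSTOM_ACCELERATOR_FPGA", 1),
        ("CUSTOM_ACCELERATOR_GPU", 2),
        ("CUSTOM_ACCELERATOR_FPGA", 1),
    ])
    print(over_quota(usage, {"CUSTOM_ACCELERATOR_FPGA": 1,
                             "CUSTOM_ACCELERATOR_GPU": 4}))
```

With a single `CUSTOM_ACCELERATOR` class, the same tally would collapse FPGAs and GPUs into one number, which is the concern shaohe_ and xinran__ raised about spdk and vgpu accelerators sharing one quota.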