14:00:05 <zhipeng> #startmeeting openstack-cyborg
14:00:06 <openstack> Meeting started Wed Jul 25 14:00:05 2018 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:10 <openstack> The meeting name has been set to 'openstack_cyborg'
14:00:31 <zhipeng> #topic Roll Call
14:00:37 <zhipeng> #info Howard
14:00:53 <wangzhh> #info wangzhh
14:01:42 <seungwook> Hi there~^^
14:01:47 <zhipeng> hi rianx_ and seungwook :)
14:01:54 <zhipeng> brianx_
14:02:07 <shaohe_feng> #info shaohe_feng
14:02:09 <Yumeng__> #info Yumeng
14:02:18 <Helloway> #info Helloway
14:02:41 <seungwook> Hi zhipeng, shaohe and yumeng
14:02:54 <seungwook> Hi brianx
14:02:57 <shaohe_feng> hi seungwook, welcome
14:03:27 <Yumeng__> welcome seungwook
14:03:34 <Yumeng__> :)
14:03:47 <seungwook> Thanks all~ : )
14:04:04 <zhipeng> Hi brianx__
14:04:15 <Li_Liu> #info Li_Liu
14:04:34 <seungwook> Hi Li Liu
14:04:56 <Li_Liu> Hey Seungwook! welcome to the party
14:05:29 <seungwook> Hi xinran
14:06:06 <wangzhh> Hey Seungwook! Welcome!
14:06:23 <zhipeng> #topic Rocky Development Update
14:06:26 <seungwook> Thank wangzhh
14:06:36 <zhipeng> so the deadline we are facing this week
14:06:43 <zhipeng> is the client lib deadline
14:06:54 <seungwook> Ho vrv
14:07:05 <zhipeng> shaohe_feng how's the client side prepared ? Could we make it on time ?
14:08:33 <shaohe_feng> zhipeng, which more features should we add for the client?
14:08:38 <shaohe_feng> zhipeng, allocation?
14:08:59 <zhipeng> could we do basic CRUD ?
14:09:00 <shaohe_feng> zhipeng, unallocation?
14:09:04 <Li_Liu> Does fpga program count?
14:09:26 <Li_Liu> the program api should be ready this week
14:09:34 <shaohe_feng> zhipeng, we can only list accelerators
14:10:00 <Li_Liu> but calling the client part I am not sure
14:10:18 <zhipeng> i don't think we could have programming in client on time
14:10:29 <zhipeng> shaohe_feng, once coco and zhenghao's patch in
14:10:34 <Li_Liu> ok, as least the api part
14:10:42 <zhipeng> could we quickly add allocation and unallocation ?
14:10:45 <xinran__> hi seungwook welcome
14:10:53 <zhipeng> in the most simple form, add to the db, remove from db
14:11:09 <shaohe_feng> zhipeng, if we want to let nova call cyborg. we do not need CRUD for accelerators. we need CRUD for deployables.
14:11:52 <Li_Liu> according to the discussion in here: https://etherpad.openstack.org/p/cyborg-rocky-development
14:11:52 <seungwook> Thank xinran
14:12:43 <shaohe_feng> we can use the current PATCH deployable for allocation and unallocation
14:12:56 <shaohe_feng> but I do not think this is a good ideas.
14:13:19 <shaohe_feng> also do we need batch allocation?
14:13:59 <shaohe_feng> for example, a user want 3 deployables, can he call once allocation?
14:13:59 <zhipeng> which patch are you talking about ?
14:14:15 <wangzhh> https://review.openstack.org/#/c/584641/ Maybe this one.
14:14:20 <shaohe_feng> API part
14:14:39 <zhipeng> why is it not a good idea ?
14:14:58 <shaohe_feng> yes.
14:16:09 <shaohe_feng> IMHO, There should better be a explicit API for these two actions
14:16:33 <shaohe_feng> I also use PATCH for allocation and unallocation for my POC.
14:16:52 <shaohe_feng> just for I do not touch more cyborg code
14:17:25 <xinran__> I think there should be an unified api which does allocation and unallocation
14:17:41 <zhipeng> xinran__ what do you mean by unified ?
14:17:50 <xinran__> Instead of implementing in deployable api
14:17:52 <shaohe_feng> the PATCH method, it is for resource, not for collections. So it can not support batch allocation
14:17:53 <Coco> #info Coco
14:18:03 <Li_Liu> 1 api for both alloc and unalloc?
14:18:16 <xinran__> I mean another independent api :)
14:18:27 <Coco> 2 independent api.
14:18:28 <xinran__> #info xinran__
14:18:57 <Li_Liu> shaohe_feng, i think as least there is no harm to have the PATCH method, even for long term
14:19:37 <shaohe_feng> #link https://github.com/shaohef/cyborg/commit/13cb904809376584f3cf78b7d7f120817d83c2ad
14:19:52 <shaohe_feng> this is a poc about half years ago
14:20:07 <Coco> patch method can keep, but patch can do many things, not only allocate or unallocate.
14:20:15 <shaohe_feng> I use it PATCH for unallocation
14:20:15 <zhipeng> I agree with Li_Liu's assessment, should be good at the moment
14:20:35 <shaohe_feng> curl -g -i -X PATCH http://localhost:6666/deployables/$UUID -H "Content-Type: application/json" \
14:20:35 <shaohe_feng> -H "Accept: application/json" -H "X-Auth-Token: $(openstack token issue -f value -c id)" -d '
14:20:35 <shaohe_feng> [
14:20:35 <shaohe_feng> { "op": "replace", "path": "/instance_uuid", "value": null }
14:20:36 <shaohe_feng> ]'
14:21:06 <shaohe_feng> I agree we do not touch two many codes for cyborg at present.
14:22:31 <shaohe_feng> tiny change is OK for me as https://review.openstack.org/#/c/584641/
14:22:51 <zhipeng> okey then let's wrap up zhenghao and coco's patch, and shaohe_feng could update the cyborgclient
14:23:08 <zhipeng> I will need to cut the final release for client on 27th
14:23:09 <wangzhh> So, everyone agree that we should have two another apis for allocate or unallocate?
14:23:20 <shaohe_feng> but we need dolpher/sunder or Li_Liu comments for it.
14:23:24 <wangzhh> and...
14:23:55 <shaohe_feng> we can not get the expect deployables.
14:24:13 <shaohe_feng> for we need more attributes.
14:24:28 <wangzhh> shaohe_feng, we can extend parameters if dolpher/sunder or Li_Liu have other comments.
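For readers following the log: the PATCH-based (un)allocation shaohe_feng pastes above is a plain JSON-PATCH document that sets or nulls a deployable's `instance_uuid`. A minimal sketch of the two payload bodies, assuming only the shape visible in the curl example (the helper names are illustrative, not Cyborg API):

```python
# JSON-PATCH bodies for (un)allocation, matching the curl example in the
# log: allocation binds a deployable to an instance, unallocation nulls
# the binding. Helper names are illustrative, not part of the Cyborg API.
import json


def allocate_patch(instance_uuid):
    """PATCH body that binds a deployable to an instance."""
    return [{"op": "replace", "path": "/instance_uuid", "value": instance_uuid}]


def unallocate_patch():
    """PATCH body that releases a deployable (value becomes null)."""
    return [{"op": "replace", "path": "/instance_uuid", "value": None}]


print(json.dumps(unallocate_patch()))
# [{"op": "replace", "path": "/instance_uuid", "value": null}]
```

This also makes shaohe_feng's objection concrete: each PATCH targets one deployable resource, so a request for several accelerators cannot be expressed in a single call.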
14:24:49 <shaohe_feng> such as function ID/name and more
14:25:23 <Li_Liu> shaohe_feng so you want to add more fields in the table?
14:25:36 <wangzhh> API is stable. Just extend the filter for querying.
14:26:17 <wangzhh> Li_Liu, not db. It's for API.
14:26:45 <shaohe_feng> For example, I want a crypto FPGA(pre-pre-programmed)
14:26:45 <Li_Liu> adding fields should not affect api tho
14:27:17 <shaohe_feng> Yes. so you give the comments ASAP
14:27:37 <shaohe_feng> need just need experts comments
14:27:51 <zhipeng> Li_liu plz take a look at https://review.openstack.org/#/c/584641/
14:28:20 <zhipeng> and also coco's patch
14:28:22 <zhipeng> https://review.openstack.org/584296
14:28:37 <zhipeng> Li_Liu if you are ok with it plz go head w+1
14:28:44 <zhipeng> to land it
14:28:50 <shaohe_feng> As we have talked, it is easy to improve these patches, as your comments arrives. :)
14:29:46 <Li_Liu> I was looking as it
14:30:01 <Li_Liu> will provide comments in a bit
14:30:13 <shaohe_feng> Thanks.
14:30:27 <shaohe_feng> For sample scenarios
14:30:50 <shaohe_feng> There is a QAT and FPGA.
14:31:26 <shaohe_feng> the user want to cypto function.
14:32:11 <shaohe_feng> pre-programmed cypto FPGA
14:32:19 <shaohe_feng> how does he do?
14:33:03 <shaohe_feng> needs host, devices_type, function_type.
14:33:15 <shaohe_feng> also the numbers?
14:33:29 <Li_Liu> one way is to use add_attribute() api in Deployable
14:33:39 <shaohe_feng> maybe he want to 2 pre-programmed cypto FPGAs
14:33:57 <shaohe_feng> Li_Liu, I did it as you said.
14:34:02 <Coco> yes, I think number is necessary.
14:34:17 <Li_Liu> what kind of numbers?
14:34:23 <Li_Liu> performance kpi?
14:34:32 <Li_Liu> like 30 gbps?
14:34:51 <shaohe_feng> Coco, agree. You can also give the comments on that API patches.
14:34:54 <Coco> the number of deployables allocated at once.
14:35:13 <wangzhh> GET will return a list of acc. So we need number in allocate?
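The "extend the filter for querying" idea wangzhh raises above would keep the API stable and express requirements such as host, device type, and function type as GET query parameters. A sketch under that assumption; every parameter name below is a guess, not an agreed Cyborg schema:

```python
# Sketch of filter-based querying for deployables: requirements become
# GET parameters instead of new body fields. All parameter names here
# are assumptions.
from urllib.parse import urlencode


def deployables_query(host=None, device_type=None, function_type=None, limit=None):
    """Build a query string for a filtered deployables GET."""
    filters = {"host": host, "device_type": device_type,
               "function_type": function_type, "limit": limit}
    params = {k: v for k, v in filters.items() if v is not None}
    return "/v1/deployables?" + urlencode(params)


# e.g. shaohe_feng's scenario: "I want 2 pre-programmed crypto FPGAs"
print(deployables_query(device_type="FPGA", function_type="crypto", limit=2))
# /v1/deployables?device_type=FPGA&function_type=crypto&limit=2
```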
14:35:58 <Coco> at least, there are some place in the code where we deal with the number requirement of deployables.
14:36:07 <wangzhh> Batch allocate?
14:36:08 <Li_Liu> if we have a dedicated allocation api, then the number should be included
14:36:50 <Li_Liu> I though for now we are just using the deployable PATCH, which is single allocation
14:37:18 <Coco> OK, than if user ask for 2, we should use the api twice?
14:37:28 <Coco> then
14:37:42 <shaohe_feng> For example, you and I Get 2 accs.
14:37:48 <Li_Liu> that could a bit tricky
14:38:07 <shaohe_feng> Then we need allocated them, then risk
14:39:12 <zhipeng> will user actually needs to simultaneously deploy two kinds of accs ?
14:39:23 <zhipeng> or more, all at once
14:39:39 <shaohe_feng> So For my POC. dolpher proposal one API to get and allocated.
14:40:14 <shaohe_feng> zhipeng, dolper prefer at once.
14:40:23 <zhipeng> i was thinking about batch actions
14:40:43 <zhipeng> it makes sense for VMs and containers, since they are mostly homogeneous
14:40:56 <zhipeng> if fails, rollback is pretty easy
14:41:07 <Li_Liu> I agree with the multi-alloc use case
14:41:09 <shaohe_feng> sure. thanks, you got my points.
14:41:27 <zhipeng> but if we are gonna have batch operation on difference accelerators
14:41:54 <zhipeng> if they are the same type it will be easy, but for different functionalities, it will be tricky
14:42:07 <shaohe_feng> It is hard for difference accelerators.
14:42:17 <shaohe_feng> I have not think a good solutions for it.
14:42:22 <xinran__> Cyborg return a list of accelerators and nova will do a for loop to allocate one by one, this is the current solution, but I think we should pass num to cyborg allocate api in the future
14:42:34 <shaohe_feng> Li_Liu, what's your suggestion on this one?
14:42:49 <Li_Liu> when calling the dedicated alloc api, just provide the list of acc type you need
14:42:52 <zhipeng> yes what xinran__ said is okey
14:43:22 <Li_Liu> for instance, alloc: [aes x 1 + rsa x 2]
14:43:37 <Li_Liu> that gives you 3 different accs
14:47:57 <zhipeng> i think we should follow xinran__'s suggestion
14:48:05 <zhipeng> let's stuck to the current solution for Rocky
14:48:15 <zhipeng> and discuss about batch in Denver ptg
14:48:38 <Coco> what's the deadline?
14:50:16 <zhipeng> client is 27th ...
14:51:42 <Li_Liu> agree, xinran_'s suggestion should solve the problem for us for now
14:51:56 <Coco> 2 days left?
14:52:09 <wangzhh> LIke os-acc before. It is urgent.:)
14:52:15 <zhipeng> yes lol
14:52:20 <Li_Liu> zhipeng, what else are we trying to squeeze in by 27th?
14:52:20 <zhipeng> every release is like this
14:52:33 <zhipeng> just the client lib final cut
14:52:51 <zhipeng> meaning the basic api will not be changed as well, after that, to make sure the client works
14:53:08 <Li_Liu> i see
14:53:29 <zhipeng> shaohe_feng your new patch
14:53:30 <zhipeng> https://review.openstack.org/#/c/585146/
14:53:37 <zhipeng> is about the placement support right ?
14:53:49 <zhipeng> do we need to remove some old implementation ?
14:54:03 <zhipeng> I remember Li Liu coded the report functionality before
14:54:39 <Li_Liu> https://github.com/openstack/cyborg/blob/master/cyborg/services/report.py
14:54:43 <Li_Liu> I did this part
14:55:15 <Li_Liu> shaohe_feng, is it sufficient for you?
14:55:17 <zhipeng> are there any conflict ?
14:55:39 <xinran__> client lib depends on cyborg api we don’t have allocation api right now. How we handle that ?
14:55:43 <zhipeng> I think shaohe added the provider_tree and also make sure the interaction happens on the agent level
14:56:03 <zhipeng> xinran__ will zhenghao's patch handle that ?
14:58:15 <shaohe_feng> Li_Liu, zhipeng I'm back.
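The interim plan agreed earlier in the log (Cyborg returns candidate accelerators and the caller loops, allocating one at a time, as xinran__ describes) together with Li_Liu's batch shorthand `alloc: [aes x 1 + rsa x 2]` can be sketched as follows; function and field names are illustrative only:

```python
# Sketch of the Rocky interim plan: expand a batch request into single
# allocations and loop over them, as xinran__ describes nova doing.
# Rollback on partial failure (the Denver PTG topic) is out of scope.


def expand_batch(request):
    """Expand {"aes": 1, "rsa": 2} into one entry per single allocation."""
    return [func for func, count in request.items() for _ in range(count)]


def allocate_all(request, allocate_one):
    """Call the single-allocation API once per expanded entry."""
    return [allocate_one(func) for func in expand_batch(request)]


print(expand_batch({"aes": 1, "rsa": 2}))  # ['aes', 'rsa', 'rsa']
```

This makes the trade-off visible: the loop is trivial for homogeneous requests, but once the three entries are different function types, a failure partway through leaves mixed state to unwind, which is exactly the batch problem deferred to the PTG.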
14:58:16 <shaohe_feng> sorry
14:58:24 <Li_Liu> I guess shaohe_feng re-implemented my part in his new patch..
14:58:43 <shaohe_feng> It support the sub-provider.
14:58:57 <shaohe_feng> actually, I did not re-implemented it.
14:59:05 <shaohe_feng> I leverage it from nova
14:59:07 <Li_Liu> shaohe_feng, I have implemented the SchedulerReportClient in https://github.com/openstack/cyborg/blob/master/cyborg/services/report.py
14:59:34 <shaohe_feng> Li_Liu, yes, I know your implementations.
14:59:48 <Li_Liu> so what the difference between your SchedulerReportClient and my SchedulerReportClient? just curious
14:59:58 <shaohe_feng> no sup-provider.
15:00:09 <shaohe_feng> it support nest provider.
15:00:24 <shaohe_feng> I think these code should be in a lib
15:00:31 <Li_Liu> i see, but should be put them together?
15:00:32 <shaohe_feng> so nova and cyborg can share.
15:00:50 <Li_Liu> just use your code
15:01:02 <shaohe_feng> yes. Maybe it need to remove the old one.
15:01:03 <Li_Liu> ignore my implementation for now then
15:01:11 <Li_Liu> ok
15:01:19 <zhipeng> okey
15:01:30 <shaohe_feng> And the best solution, move them to a common lib shared by nova and cyborg or other project
15:01:42 <xinran__> zhipeng: yeah but we need modify that in the future
15:01:55 <zhipeng> xinran__ yes for sure
15:02:03 <shaohe_feng> zhipeng, you can talk about with nova's guy, do they have plan to do it?
15:02:03 <wangzhh> xinran__, IMHO, we can use PATCH for allocation as my patch in rocky. Or we can design new API interface and parallel development.
15:02:19 <zhipeng> shaohe_feng I think it is a reasonable request
15:02:33 <zhipeng> since provider tree has also been implemented for nova-compute
15:02:39 <openstackgerrit> Merged openstack/cyborg master: Add "interface_type" field in deployable DB https://review.openstack.org/584296
15:02:57 <xinran__> wangzhh: and deallocate also by patch method ?
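The nested-provider support shaohe_feng describes above boils down to: the compute node is the root resource provider and each device (or FPGA region) is a child provider carrying its own inventory. A toy sketch of that shape; the dict layout and the CUSTOM_FPGA_REGION class are illustrative and assume nothing about the placement wire format:

```python
# Toy sketch of nested resource providers: the compute node is the root
# and an FPGA region is a child provider with its own inventory. The
# dict layout and the CUSTOM_FPGA_REGION class are illustrative only.
import uuid


def make_provider(name, parent_uuid=None):
    """A resource-provider record; children point at their parent's UUID."""
    return {"uuid": str(uuid.uuid4()), "name": name,
            "parent_provider_uuid": parent_uuid, "inventories": {}}


root = make_provider("compute-node-1")
fpga = make_provider("compute-node-1_fpga_0", parent_uuid=root["uuid"])
fpga["inventories"]["CUSTOM_FPGA_REGION"] = {"total": 2, "max_unit": 1}

# The agent would report the root first, then each child, so placement
# can resolve the parent link.
print(fpga["parent_provider_uuid"] == root["uuid"])  # True
```

The parent link is what a flat SchedulerReportClient (one provider per host) has no way to express, which is the difference between the two implementations being compared in the log.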
15:02:59 <zhipeng> hey look who's here
15:03:08 <Li_Liu> a lot of reimplementations cross different project just for this part...
15:03:13 <shaohe_feng> zhipeng, yes. Need need for us to maintain same code with other project. :)
15:03:31 <wangzhh> Yes.
15:03:39 <wangzhh> xinran__
15:03:44 <shaohe_feng> Li_Liu, yes. too many. but we should avoid.
15:03:55 <Li_Liu> agree
15:04:00 <zhipeng> okey next on drivers
15:04:05 <shaohe_feng> And that is cyborg should focus on
15:04:18 <shaohe_feng> that is not
15:04:20 <shaohe_feng> sorry.
15:04:35 <zhipeng> we now have gpu and opae based fpga drivers on the fly
15:05:02 <zhipeng> need brianx__ to start the Xilinx driver ASAP lol
15:05:10 <xinran__> Coco: did you modify the driver discover() method to get interface type?
15:05:11 <zhipeng> let's start with a simple version
15:05:47 <Coco> I think the yes
15:06:14 <Coco> I will check again.
15:07:29 <Coco> I use the "pci" as the "interface_type" value at this moment, since it's fpga driver.
15:08:12 <wangzhh> I think discover should have common data structure.
15:08:22 <shaohe_feng> I have list cyborg tasks in the etherpad. include this one
15:08:25 <Coco> I agree with wangzhh
15:08:26 <shaohe_feng> No one take it at present.
15:09:02 <Coco> 27th also the deadline?
15:09:02 <shaohe_feng> Yes. maybe an object
15:09:17 <shaohe_feng> then it can not need schema verify
15:09:53 <shaohe_feng> anyway, the goal is unify the discover data
15:10:22 <Coco> I can take it, but can't make sure it can be done by 27th.
15:10:41 <zhipeng> does that affect the API behaviour ?
15:10:41 <Li_Liu> when discover() reports data, it should be in Cyborg understandable format
15:11:01 <shaohe_feng> Coco, great, thanks.
15:11:09 <wangzhh> zhipeng, no. Just affect agent.
15:11:28 <wangzhh> To report data.
15:11:29 <Li_Liu> no affect on api I don't think
15:11:32 <zhipeng> okey then there is no hurry
15:11:34 <shaohe_feng> Coco, have you looked at sundar's VAN?
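The "common data structure for discover()" action item above could be a single typed record (shaohe_feng's "maybe an object") that every driver returns, covering the `interface_type` value Coco mentions. The field names below are assumptions, not the agreed structure:

```python
# Sketch of a unified discover() result: every driver returns the same
# typed record, so the agent can report devices without per-driver
# schema checks. Field names are assumptions, not the agreed structure.
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredDevice:
    vendor: str
    device_type: str     # e.g. "FPGA", "GPU"
    interface_type: str  # e.g. "pci", as in Coco's FPGA driver
    address: str         # PCI address when interface_type == "pci"


def discover():
    """Example driver discover(): one FPGA on the PCI bus."""
    return [DiscoveredDevice(vendor="0x8086", device_type="FPGA",
                             interface_type="pci", address="0000:5e:00.0")]


print([asdict(d) for d in discover()])
```

As noted in the log, this only changes what the agent reports; the API behaviour is unaffected.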
15:11:37 <Li_Liu> yup
15:11:59 <shaohe_feng> Coco, can we leverage it?
15:12:07 <shaohe_feng> yes, not hurry.
15:12:13 <Coco> what's the link?
15:12:28 <shaohe_feng> the spec, let me show you.
15:13:15 <Coco> ok, thks.
15:13:35 <shaohe_feng> #link https://review.openstack.org/#/c/577438/
15:13:46 <shaohe_feng> it is used for os-acc
15:14:21 <shaohe_feng> you can have a check, does the drivers can use it?
15:16:02 <Coco> ok, i will check tomorrow.
15:16:08 <shaohe_feng> Thanks.
15:16:42 <zhipeng> Yumeng__ are you still around ?
15:16:54 <zhipeng> the doc work should start now
15:17:52 <Li_Liu> hmm
15:18:16 <Li_Liu> I will focus on that after warpping up the program api
15:18:33 <Li_Liu> will as Yumeng and others for help
15:18:59 <zhipeng> yes will need to work together, to comb through the implementations :)
15:19:23 <Li_Liu> yup
15:20:29 <shaohe_feng> guess she is in sleep. :)
15:20:47 <Coco> OK, we already had doc wechat group.
15:21:57 <shaohe_feng> more sleep can keep a girl beauty. :)
15:22:10 <Coco> ....
15:22:25 <Coco> I need to go sleep too.
15:22:37 <zhipeng> okey
15:22:38 <Li_Liu> have a good sleep guys :)
15:22:40 <seungwook> I think it takes some time for me to catch up the dialog.
15:22:52 <zhipeng> let's try to nail client lib down this week lol
15:23:17 <zhipeng> #endmeeting