14:00:05 <zhipeng> #startmeeting openstack-cyborg
14:00:06 <openstack> Meeting started Wed Jul 25 14:00:05 2018 UTC and is due to finish in 60 minutes. The chair is zhipeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:10 <openstack> The meeting name has been set to 'openstack_cyborg'
14:00:31 <zhipeng> #topic Roll Call
14:00:37 <zhipeng> #info Howard
14:00:53 <wangzhh> #info wangzhh
14:01:42 <seungwook> Hi there~^^
14:01:47 <zhipeng> hi rianx_ and seungwook :)
14:01:54 <zhipeng> brianx_
14:02:07 <shaohe_feng> #info shaohe_feng
14:02:09 <Yumeng__> #info Yumeng
14:02:18 <Helloway> #info Helloway
14:02:41 <seungwook> Hi zhipeng, shaohe and yumeng
14:02:54 <seungwook> Hi brianx
14:02:57 <shaohe_feng> hi seungwook, welcome
14:03:27 <Yumeng__> welcome seungwook
14:03:34 <Yumeng__> :)
14:03:47 <seungwook> Thanks all~ : )
14:04:04 <zhipeng> Hi brianx__
14:04:15 <Li_Liu> #info Li_Liu
14:04:34 <seungwook> Hi Li Liu
14:04:56 <Li_Liu> Hey Seungwook! welcome to the party
14:05:29 <seungwook> Hi xinran
14:06:06 <wangzhh> Hey Seungwook! Welcome!
14:06:23 <zhipeng> #topic Rocky Development Update
14:06:26 <seungwook> Thank wangzhh
14:06:36 <zhipeng> so the deadline we are facing this week
14:06:43 <zhipeng> is the client lib deadline
14:06:54 <seungwook> Ho vrv
14:07:05 <zhipeng> shaohe_feng how's the client side prepared ? Could we make it on time ?
14:08:33 <shaohe_feng> zhipeng, which more features should we add for the client?
14:08:38 <shaohe_feng> zhipeng, allocation?
14:08:59 <zhipeng> could we do basic CRUD ?
14:09:00 <shaohe_feng> zhipeng, unallocation?
14:09:04 <Li_Liu> Does fpga program count?
14:09:26 <Li_Liu> the program api should be ready this week
14:09:34 <shaohe_feng> zhipeng, we can only list accelerators
14:10:00 <Li_Liu> but calling the client part I am not sure
14:10:18 <zhipeng> i don't think we could have programming in client on time
14:10:29 <zhipeng> shaohe_feng, once coco and zhenghao's patch in
14:10:34 <Li_Liu> ok, as least the api part
14:10:42 <zhipeng> could we quickly add allocation and unallocation ?
14:10:45 <xinran__> hi seungwook welcome
14:10:53 <zhipeng> in the most simple form, add to the db, remove from db
14:11:09 <shaohe_feng> zhipeng, if we want to let nova call cyborg. we do not need CRUD for accelerators. we need CRUD for deployables.
14:11:52 <Li_Liu> according to the discussion in here: https://etherpad.openstack.org/p/cyborg-rocky-development
14:11:52 <seungwook> Thank xinran
14:12:43 <shaohe_feng> we can use the current PATCH deployable for allocation and unallocation
14:12:56 <shaohe_feng> but I do not think this is a good ideas.
14:13:19 <shaohe_feng> also do we need batch allocation?
14:13:59 <shaohe_feng> for example, a user want 3 deployables, can he call once allocation?
14:13:59 <zhipeng> which patch are you talking about ?
14:14:15 <wangzhh> https://review.openstack.org/#/c/584641/ Maybe this one.
14:14:20 <shaohe_feng> API part
14:14:39 <zhipeng> why is it not a good idea ?
14:14:58 <shaohe_feng> yes.
14:16:09 <shaohe_feng> IMHO, There should better be a explicit API for these two actions
14:16:33 <shaohe_feng> I also use PATCH for allocation and unallocation for my POC.
14:16:52 <shaohe_feng> just for I do not touch more cyborg code
14:17:25 <xinran__> I think there should be an unified api which does allocation and unallocation
14:17:41 <zhipeng> xinran__ what do you mean by unified ?
14:17:50 <xinran__> Instead of implementing in deployable api
14:17:52 <shaohe_feng> the PATCH method, it is for resource, not for collections. So it can not support batch allocation
14:17:53 <Coco> #info Coco
14:18:03 <Li_Liu> 1 api for both alloc and unalloc?
14:18:16 <xinran__> I mean another independent api :)
14:18:27 <Coco> 2 independent api.
14:18:28 <xinran__> #info xinran__
14:18:57 <Li_Liu> shaohe_feng, i think as least there is no harm to have the PATCH method, even for long term
14:19:37 <shaohe_feng> #link https://github.com/shaohef/cyborg/commit/13cb904809376584f3cf78b7d7f120817d83c2ad
14:19:52 <shaohe_feng> this is a poc about half years ago
14:20:07 <Coco> patch method can keep, but patch can do many things, not only allocate or unallocate.
14:20:15 <shaohe_feng> I use it PATCH for unallocation
14:20:15 <zhipeng> I agree with Li_Liu's assessment, should be good at the moment
14:20:35 <shaohe_feng> curl -g -i -X PATCH http://localhost:6666/deployables/$UUID -H "Content-Type: application/json" \
14:20:35 <shaohe_feng> -H "Accept: application/json" -H "X-Auth-Token: $(openstack token issue -f value -c id)" -d '
14:20:35 <shaohe_feng> [
14:20:35 <shaohe_feng> { "op": "replace", "path": "/instance_uuid", "value": null }
14:20:36 <shaohe_feng> ]'
14:21:06 <shaohe_feng> I agree we do not touch two many codes for cyborg at present.
14:22:31 <shaohe_feng> tiny change is OK for me as https://review.openstack.org/#/c/584641/
14:22:51 <zhipeng> okey then let's wrap up zhenghao and coco's patch, and shaohe_feng could update the cyborgclient
14:23:08 <zhipeng> I will need to cut the final release for client on 27th
14:23:09 <wangzhh> So, everyone agree that we should have two another apis for allocate or unallocate?
14:23:20 <shaohe_feng> but we need dolpher/sunder or Li_Liu comments for it.
14:23:24 <wangzhh> and...
14:23:55 <shaohe_feng> we can not get the expect deployables.
14:24:13 <shaohe_feng> for we need more attributes.
14:24:28 <wangzhh> shaohe_feng, we can extend parameters if dolpher/sunder or Li_Liu have other comments.
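For readers following the log: the PATCH-based (un)allocation shaohe_feng pastes above is a plain JSON-PATCH document that sets or nulls a deployable's `instance_uuid`. A minimal sketch of the two payload bodies, assuming only the shape visible in the curl example (the helper names are illustrative, not Cyborg API):

```python
# JSON-PATCH bodies for (un)allocation, matching the curl example in the
# log: allocation binds a deployable to an instance, unallocation nulls
# the binding. Helper names are illustrative, not part of the Cyborg API.
import json


def allocate_patch(instance_uuid):
    """PATCH body that binds a deployable to an instance."""
    return [{"op": "replace", "path": "/instance_uuid", "value": instance_uuid}]


def unallocate_patch():
    """PATCH body that releases a deployable (value becomes null)."""
    return [{"op": "replace", "path": "/instance_uuid", "value": None}]


print(json.dumps(unallocate_patch()))
# [{"op": "replace", "path": "/instance_uuid", "value": null}]
```

This also makes shaohe_feng's objection concrete: each PATCH targets one deployable resource, so a request for several accelerators cannot be expressed in a single call.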
14:24:49 <shaohe_feng> such as function ID/name and more
14:25:23 <Li_Liu> shaohe_feng so you want to add more fields in the table?
14:25:36 <wangzhh> API is stable. Just extend the filter for querying.
14:26:17 <wangzhh> Li_Liu, not db. It's for API.
14:26:45 <shaohe_feng> For example, I want a crypto FPGA(pre-pre-programmed)
14:26:45 <Li_Liu> adding fields should not affect api tho
14:27:17 <shaohe_feng> Yes. so you give the comments ASAP
14:27:37 <shaohe_feng> need just need experts comments
14:27:51 <zhipeng> Li_liu plz take a look at https://review.openstack.org/#/c/584641/
14:28:20 <zhipeng> and also coco's patch
14:28:22 <zhipeng> https://review.openstack.org/584296
14:28:37 <zhipeng> Li_Liu if you are ok with it plz go head w+1
14:28:44 <zhipeng> to land it
14:28:50 <shaohe_feng> As we have talked, it is easy to improve these patches, as your comments arrives. :)
14:29:46 <Li_Liu> I was looking as it
14:30:01 <Li_Liu> will provide comments in a bit
14:30:13 <shaohe_feng> Thanks.
14:30:27 <shaohe_feng> For sample scenarios
14:30:50 <shaohe_feng> There is a QAT and FPGA.
14:31:26 <shaohe_feng> the user want to cypto function.
14:32:11 <shaohe_feng> pre-programmed cypto FPGA
14:32:19 <shaohe_feng> how does he do?
14:33:03 <shaohe_feng> needs host, devices_type, function_type.
14:33:15 <shaohe_feng> also the numbers?
14:33:29 <Li_Liu> one way is to use add_attribute() api in Deployable
14:33:39 <shaohe_feng> maybe he want to 2 pre-programmed cypto FPGAs
14:33:57 <shaohe_feng> Li_Liu, I did it as you said.
14:34:02 <Coco> yes, I think number is necessary.
14:34:17 <Li_Liu> what kind of numbers?
14:34:23 <Li_Liu> performance kpi?
14:34:32 <Li_Liu> like 30 gbps?
14:34:51 <shaohe_feng> Coco, agree. You can also give the comments on that API patches.
14:34:54 <Coco> the number of deployables allocated at once.
14:35:13 <wangzhh> GET will return a list of acc. So we need number in allocate?
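The "extend the filter for querying" idea wangzhh raises above would keep the API stable and express requirements such as host, device type, and function type as GET query parameters. A sketch under that assumption; every parameter name below is a guess, not an agreed Cyborg schema:

```python
# Sketch of filter-based querying for deployables: requirements become
# GET parameters instead of new body fields. All parameter names here
# are assumptions.
from urllib.parse import urlencode


def deployables_query(host=None, device_type=None, function_type=None, limit=None):
    """Build a query string for a filtered deployables GET."""
    filters = {"host": host, "device_type": device_type,
               "function_type": function_type, "limit": limit}
    params = {k: v for k, v in filters.items() if v is not None}
    return "/v1/deployables?" + urlencode(params)


# e.g. shaohe_feng's scenario: "I want 2 pre-programmed crypto FPGAs"
print(deployables_query(device_type="FPGA", function_type="crypto", limit=2))
# /v1/deployables?device_type=FPGA&function_type=crypto&limit=2
```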
14:35:58 <Coco> at least, there are some place in the code where we deal with the number requirement of deployables.
14:36:07 <wangzhh> Batch allocate?
14:36:08 <Li_Liu> if we have a dedicated allocation api, then the number should be included
14:36:50 <Li_Liu> I though for now we are just using the deployable PATCH, which is single allocation
14:37:18 <Coco> OK, than if user ask for 2, we should use the api twice?
14:37:28 <Coco> then
14:37:42 <shaohe_feng> For example, you and I Get 2 accs.
14:37:48 <Li_Liu> that could a bit tricky
14:38:07 <shaohe_feng> Then we need allocated them, then risk
14:39:12 <zhipeng> will user actually needs to simultaneously deploy two kinds of accs ?
14:39:23 <zhipeng> or more, all at once
14:39:39 <shaohe_feng> So For my POC. dolpher proposal one API to get and allocated.
14:40:14 <shaohe_feng> zhipeng, dolper prefer at once.
14:40:23 <zhipeng> i was thinking about batch actions
14:40:43 <zhipeng> it makes sense for VMs and containers, since they are mostly homogeneous
14:40:56 <zhipeng> if fails, rollback is pretty easy
14:41:07 <Li_Liu> I agree with the multi-alloc use case
14:41:09 <shaohe_feng> sure. thanks, you got my points.
14:41:27 <zhipeng> but if we are gonna have batch operation on difference accelerators
14:41:54 <zhipeng> if they are the same type it will be easy, but for different functionalities, it will be tricky
14:42:07 <shaohe_feng> It is hard for difference accelerators.
14:42:17 <shaohe_feng> I have not think a good solutions for it.
14:42:22 <xinran__> Cyborg return a list of accelerators and nova will do a for loop to allocate one by one, this is the current solution, but I think we should pass num to cyborg allocate api in the future
14:42:34 <shaohe_feng> Li_Liu, what's your suggestion on this one?
14:42:49 <Li_Liu> when calling the dedicated alloc api, just provide the list of acc type you need
14:42:52 <zhipeng> yes what xinran__ said is okey
14:43:22 <Li_Liu> for instance, alloc: [aes x 1 + rsa x 2]
14:43:37 <Li_Liu> that gives you 3 different accs
14:47:57 <zhipeng> i think we should follow xinran__'s suggestion
14:48:05 <zhipeng> let's stuck to the current solution for Rocky
14:48:15 <zhipeng> and discuss about batch in Denver ptg
14:48:38 <Coco> what's the deadline?
14:50:16 <zhipeng> client is 27th ...
14:51:42 <Li_Liu> agree, xinran_'s suggestion should solve the problem for us for now
14:51:56 <Coco> 2 days left?
14:52:09 <wangzhh> LIke os-acc before. It is urgent.:)
14:52:15 <zhipeng> yes lol
14:52:20 <Li_Liu> zhipeng, what else are we trying to squeeze in by 27th?
14:52:20 <zhipeng> every release is like this
14:52:33 <zhipeng> just the client lib final cut
14:52:51 <zhipeng> meaning the basic api will not be changed as well, after that, to make sure the client works
14:53:08 <Li_Liu> i see
14:53:29 <zhipeng> shaohe_feng your new patch
14:53:30 <zhipeng> https://review.openstack.org/#/c/585146/
14:53:37 <zhipeng> is about the placement support right ?
14:53:49 <zhipeng> do we need to remove some old implementation ?
14:54:03 <zhipeng> I remember Li Liu coded the report functionality before
14:54:39 <Li_Liu> https://github.com/openstack/cyborg/blob/master/cyborg/services/report.py
14:54:43 <Li_Liu> I did this part
14:55:15 <Li_Liu> shaohe_feng, is it sufficient for you?
14:55:17 <zhipeng> are there any conflict ?
14:55:39 <xinran__> client lib depends on cyborg api we don’t have allocation api right now. How we handle that ?
14:55:43 <zhipeng> I think shaohe added the provider_tree and also make sure the interaction happens on the agent level
14:56:03 <zhipeng> xinran__ will zhenghao's patch handle that ?
14:58:15 <shaohe_feng> Li_Liu, zhipeng I'm back.
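The interim plan agreed earlier in the log (Cyborg returns candidate accelerators and the caller loops, allocating one at a time, as xinran__ describes) together with Li_Liu's batch shorthand `alloc: [aes x 1 + rsa x 2]` can be sketched as follows; function and field names are illustrative only:

```python
# Sketch of the Rocky interim plan: expand a batch request into single
# allocations and loop over them, as xinran__ describes nova doing.
# Rollback on partial failure (the Denver PTG topic) is out of scope.


def expand_batch(request):
    """Expand {"aes": 1, "rsa": 2} into one entry per single allocation."""
    return [func for func, count in request.items() for _ in range(count)]


def allocate_all(request, allocate_one):
    """Call the single-allocation API once per expanded entry."""
    return [allocate_one(func) for func in expand_batch(request)]


print(expand_batch({"aes": 1, "rsa": 2}))  # ['aes', 'rsa', 'rsa']
```

This makes the trade-off visible: the loop is trivial for homogeneous requests, but once the three entries are different function types, a failure partway through leaves mixed state to unwind, which is exactly the batch problem deferred to the PTG.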
14:58:16 <shaohe_feng> sorry
14:58:24 <Li_Liu> I guess shaohe_feng re-implemented my part in his new patch..
14:58:43 <shaohe_feng> It support the sub-provider.
14:58:57 <shaohe_feng> actually, I did not re-implemented it.
14:59:05 <shaohe_feng> I leverage it from nova
14:59:07 <Li_Liu> shaohe_feng, I have implemented the SchedulerReportClient in https://github.com/openstack/cyborg/blob/master/cyborg/services/report.py
14:59:34 <shaohe_feng> Li_Liu, yes, I know your implementations.
14:59:48 <Li_Liu> so what the difference between your SchedulerReportClient and my SchedulerReportClient? just curious
14:59:58 <shaohe_feng> no sup-provider.
15:00:09 <shaohe_feng> it support nest provider.
15:00:24 <shaohe_feng> I think these code should be in a lib
15:00:31 <Li_Liu> i see, but should be put them together?
15:00:32 <shaohe_feng> so nova and cyborg can share.
15:00:50 <Li_Liu> just use your code
15:01:02 <shaohe_feng> yes. Maybe it need to remove the old one.
15:01:03 <Li_Liu> ignore my implementation for now then
15:01:11 <Li_Liu> ok
15:01:19 <zhipeng> okey
15:01:30 <shaohe_feng> And the best solution, move them to a common lib shared by nova and cyborg or other project
15:01:42 <xinran__> zhipeng: yeah but we need modify that in the future
15:01:55 <zhipeng> xinran__ yes for sure
15:02:03 <shaohe_feng> zhipeng, you can talk about with nova's guy, do they have plan to do it?
15:02:03 <wangzhh> xinran__, IMHO, we can use PATCH for allocation as my patch in rocky. Or we can design new API interface and parallel development.
15:02:19 <zhipeng> shaohe_feng I think it is a reasonable request
15:02:33 <zhipeng> since provider tree has also been implemented for nova-compute
15:02:39 <openstackgerrit> Merged openstack/cyborg master: Add "interface_type" field in deployable DB https://review.openstack.org/584296
15:02:57 <xinran__> wangzhh: and deallocate also by patch method ?
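The nested-provider support shaohe_feng describes above boils down to: the compute node is the root resource provider and each device (or FPGA region) is a child provider carrying its own inventory. A toy sketch of that shape; the dict layout and the CUSTOM_FPGA_REGION class are illustrative and assume nothing about the placement wire format:

```python
# Toy sketch of nested resource providers: the compute node is the root
# and an FPGA region is a child provider with its own inventory. The
# dict layout and the CUSTOM_FPGA_REGION class are illustrative only.
import uuid


def make_provider(name, parent_uuid=None):
    """A resource-provider record; children point at their parent's UUID."""
    return {"uuid": str(uuid.uuid4()), "name": name,
            "parent_provider_uuid": parent_uuid, "inventories": {}}


root = make_provider("compute-node-1")
fpga = make_provider("compute-node-1_fpga_0", parent_uuid=root["uuid"])
fpga["inventories"]["CUSTOM_FPGA_REGION"] = {"total": 2, "max_unit": 1}

# The agent would report the root first, then each child, so placement
# can resolve the parent link.
print(fpga["parent_provider_uuid"] == root["uuid"])  # True
```

The parent link is what a flat SchedulerReportClient (one provider per host) has no way to express, which is the difference between the two implementations being compared in the log.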
15:02:59 <zhipeng> hey look who's here
15:03:08 <Li_Liu> a lot of reimplementations cross different project just for this part...
15:03:13 <shaohe_feng> zhipeng, yes. Need need for us to maintain same code with other project. :)
15:03:31 <wangzhh> Yes.
15:03:39 <wangzhh> xinran__
15:03:44 <shaohe_feng> Li_Liu, yes. too many. but we should avoid.
15:03:55 <Li_Liu> agree
15:04:00 <zhipeng> okey next on drivers
15:04:05 <shaohe_feng> And that is cyborg should focus on
15:04:18 <shaohe_feng> that is not
15:04:20 <shaohe_feng> sorry.
15:04:35 <zhipeng> we now have gpu and opae based fpga drivers on the fly
15:05:02 <zhipeng> need brianx__ to start the Xilinx driver ASAP lol
15:05:10 <xinran__> Coco: did you modify the driver discover() method to get interface type?
15:05:11 <zhipeng> let's start with a simple version
15:05:47 <Coco> I think the yes
15:06:14 <Coco> I will check again.
15:07:29 <Coco> I use the "pci" as the "interface_type" value at this moment, since it's fpga driver.
15:08:12 <wangzhh> I think discover should have common data structure.
15:08:22 <shaohe_feng> I have list cyborg tasks in the etherpad. include this one
15:08:25 <Coco> I agree with wangzhh
15:08:26 <shaohe_feng> No one take it at present.
15:09:02 <Coco> 27th also the deadline?
15:09:02 <shaohe_feng> Yes. maybe an object
15:09:17 <shaohe_feng> then it can not need schema verify
15:09:53 <shaohe_feng> anyway, the goal is unify the discover data
15:10:22 <Coco> I can take it, but can't make sure it can be done by 27th.
15:10:41 <zhipeng> does that affect the API behaviour ?
15:10:41 <Li_Liu> when discover() reports data, it should be in Cyborg understandable format
15:11:01 <shaohe_feng> Coco, great, thanks.
15:11:09 <wangzhh> zhipeng, no. Just affect agent.
15:11:28 <wangzhh> To report data.
15:11:29 <Li_Liu> no affect on api I don't think
15:11:32 <zhipeng> okey then there is no hurry
15:11:34 <shaohe_feng> Coco, have you looked at sundar's VAN?
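The "common data structure for discover()" action item above could be a single typed record (shaohe_feng's "maybe an object") that every driver returns, covering the `interface_type` value Coco mentions. The field names below are assumptions, not the agreed structure:

```python
# Sketch of a unified discover() result: every driver returns the same
# typed record, so the agent can report devices without per-driver
# schema checks. Field names are assumptions, not the agreed structure.
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredDevice:
    vendor: str
    device_type: str     # e.g. "FPGA", "GPU"
    interface_type: str  # e.g. "pci", as in Coco's FPGA driver
    address: str         # PCI address when interface_type == "pci"


def discover():
    """Example driver discover(): one FPGA on the PCI bus."""
    return [DiscoveredDevice(vendor="0x8086", device_type="FPGA",
                             interface_type="pci", address="0000:5e:00.0")]


print([asdict(d) for d in discover()])
```

As noted in the log, this only changes what the agent reports; the API behaviour is unaffected.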
15:11:37 <Li_Liu> yup
15:11:59 <shaohe_feng> Coco, can we leverage it?
15:12:07 <shaohe_feng> yes, not hurry.
15:12:13 <Coco> what's the link?
15:12:28 <shaohe_feng> the spec, let me show you.
15:13:15 <Coco> ok, thks.
15:13:35 <shaohe_feng> #link https://review.openstack.org/#/c/577438/
15:13:46 <shaohe_feng> it is used for os-acc
15:14:21 <shaohe_feng> you can have a check, does the drivers can use it?
15:16:02 <Coco> ok, i will check tomorrow.
15:16:08 <shaohe_feng> Thanks.
15:16:42 <zhipeng> Yumeng__ are you still around ?
15:16:54 <zhipeng> the doc work should start now
15:17:52 <Li_Liu> hmm
15:18:16 <Li_Liu> I will focus on that after warpping up the program api
15:18:33 <Li_Liu> will as Yumeng and others for help
15:18:59 <zhipeng> yes will need to work together, to comb through the implementations :)
15:19:23 <Li_Liu> yup
15:20:29 <shaohe_feng> guess she is in sleep. :)
15:20:47 <Coco> OK, we already had doc wechat group.
15:21:57 <shaohe_feng> more sleep can keep a girl beauty. :)
15:22:10 <Coco> ....
15:22:25 <Coco> I need to go sleep too.
15:22:37 <zhipeng> okey
15:22:38 <Li_Liu> have a good sleep guys :)
15:22:40 <seungwook> I think it takes some time for me to catch up the dialog.
15:22:52 <zhipeng> let's try to nail client lib down this week lol
15:23:17 <zhipeng> #endmeeting