03:09:37 #startmeeting openstack-cyborg
03:09:38 Meeting started Thu Nov 26 03:09:37 2020 UTC and is due to finish in 60 minutes. The chair is Yumeng. Information about MeetBot at http://wiki.debian.org/MeetBot.
03:09:39 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
03:09:41 The meeting name has been set to 'openstack_cyborg'
03:09:46 #topic Roll call
03:09:57 #info Yumeng
03:10:09 #info xinranwang
03:10:10 #info swp20
03:10:30 #topic Agenda
03:11:18 # topic vgpu
03:11:41 #topic vgpu
03:13:33 swp20: pls continue
03:14:00 are you saying the detach failure in hotplug is because of cirros missing?
03:14:36 yeah, the cirros VM process crashes when we detach the GPU device
03:15:06 the un-hotplug is not a real success in fact.
03:15:58 the cirros image is not supported, but centos works well.
03:18:13 so vgpu hotplug is not supported in cirros but supported in centos, right?
03:18:51 i am not sure.
03:18:58 i mean re-hotplug
03:19:27 ok
03:19:43 did you find out why un-hotplug is not successful?
03:19:59 attach, detach and reattach
03:20:19 i searched the vm log
03:21:12 there is a process crash problem.
03:23:01 is it an occasional case or does it crash every time?
03:23:41 it's high probability
03:23:51 ok
03:24:32 Has this crash ever happened in Centos?
03:25:10 haven't met it yet.
03:25:25 ok. got that.
03:25:46 cool
03:26:08 looks like hotplug is image sensitive.
03:26:32 Thanks wenping for the sharing
03:26:33 maybe the driver is important.
03:27:09 do you mean the nvidia virtualization driver?
03:27:37 no, i mean the driver in the image
03:28:03 gpu is not supported well in cirros
03:28:14 including vgpu
03:28:39 you can test vgpu detach with 'virsh detach-device'
03:29:30 yes, the VFIO mdev driver is very important. the nvidia virtualization driver version must match the image version well
03:29:48 ok.
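The `virsh detach-device` test suggested above takes a device XML file describing the attached vGPU. A minimal sketch of what that file would look like for a libvirt mediated device (the UUID below is a made-up example, not one from this meeting):

```xml
<!-- mdev.xml: a vGPU attached as a VFIO mediated device.
     The uuid is a placeholder example; use the uuid of the
     actual mdev attached to the guest. -->
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
  </source>
</hostdev>
```

It would then be detached from a running guest with something like `virsh detach-device <domain> mdev.xml --live`.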
will try when I get time
03:30:01 cool
03:30:13 I also have a vGPU issue to discuss with you
03:30:20 about the vGPU support
03:30:26 yep
03:30:58 i think the time of ARQ bind is better
03:31:03 to create the mdev
03:31:42 attach_handle is too early
03:32:20 and the maintenance task is heavy
03:32:20 yes, I also think so.
03:33:58 so let's confirm this.
03:34:07 xinranwang what do you think?
03:35:29 Sylvain prefers creating the mdev when generating the attach_handle. sean and gibi are fine with either
03:36:14 from my perspective, I also prefer creating the mdev at arq bind
03:37:24 if the gpu's type is determined, the max number of vfs is also determined, right?
03:37:51 yes
03:38:15 but if it is changed, we need to delete all the created ones and create new ones
03:38:19 if we do not create the mdev at the attach_handle generation step, how many vfs should we report?
03:38:26 even if they were never used.
03:40:11 xinranwang: the maximum number
03:40:36 in the inventory, we always report the maximum number
03:40:43 ok, got it.
03:41:39 it seems creating the mdev during binding is more efficient. we just create it when we use it.
03:42:41 yes, that's also what I mentioned in the nova spec.
03:42:42 does mdev creation take much time?
03:42:48 not much.
03:43:12 will it fail in some cases?
03:44:28 I tested it in my env, but with a small number of VMs. mdev creation is very fast
03:44:50 but not sure what the results are when there is a large number of VMs
03:45:16 mdev creation is a serial task, i think.
03:46:04 anyway, i think the binding step is more efficient, if there is no obvious gap.
03:46:06 creation failure is very low frequency.
03:46:30 haven't met it yet
03:47:14 xinranwang: ok. cool
03:47:24 So we agreed on the binding step.
03:47:33 I will go back to sync with the nova guys
03:47:58 ok. nothing else from my side.
03:48:18 Is there anything else you guys want to mention?
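For context on "create mdev at ARQ bind": on Linux, an mdev is created by writing a UUID into the parent device's `mdev_supported_types/<type>/create` node in sysfs. A minimal sketch of that step, with placeholder PCI address and type values (`0000:81:00.0`, `nvidia-35` are examples, not taken from this meeting or from Cyborg's actual driver code):

```python
import uuid
from pathlib import Path

MDEV_BUS = Path("/sys/class/mdev_bus")

def mdev_create_path(parent_addr: str, mdev_type: str) -> Path:
    """Sysfs node that creates an mdev of `mdev_type` under the parent
    physical GPU at PCI address `parent_addr`."""
    return MDEV_BUS / parent_addr / "mdev_supported_types" / mdev_type / "create"

def create_mdev(parent_addr: str, mdev_type: str, dev_uuid: str = None) -> str:
    """Create one mdev by writing a UUID to the type's 'create' node.
    Doing this at ARQ-bind time means the mdev only exists once it is
    actually needed, instead of pre-creating the maximum number."""
    dev_uuid = dev_uuid or str(uuid.uuid4())
    mdev_create_path(parent_addr, mdev_type).write_text(dev_uuid + "\n")
    return dev_uuid
```

The inventory can still report the type's maximum instance count (from the sysfs `available_instances` node) without any mdev existing yet, which is how the "always report the maximum number" point above fits with lazy creation.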
03:49:41 nothing from my side
03:51:23 ok~
03:51:32 lunch time~~
03:51:41 lol
03:51:46 bon appetit
03:52:14 haha
03:52:27 so let's wrap up today's meeting
03:52:41 #endmeeting