13:00:51 <baoli> #startmeeting PCI Passthrough
13:00:52 <openstack> Meeting started Tue Apr 15 13:00:51 2014 UTC and is due to finish in 60 minutes. The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:53 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:56 <openstack> The meeting name has been set to 'pci_passthrough'
13:01:00 <baoli> Hi there
13:01:04 <heyongli> hi
13:01:12 <beagles> hi
13:01:51 <baoli> Irenab is not going to be here today
13:02:20 <baoli> Let's wait to see if John and Rkukura are going to join
13:02:35 <Guest84743> (nick has gone crazy, Guest84743 is beagles)
13:02:53 <rkukura> I’m lurking, but am not caught up on current sr-iov goings on
13:04:48 <baoli> I think that we should get started.
13:05:08 * russellb lurking
13:05:28 <baoli> https://review.openstack.org/#/c/86606/
13:07:16 <baoli> For networking, my simplistic view of how a tenant is going to use it: 1) want an sr-iov port (with macvtap or not), 2) the port is connected to a particular network.
13:08:46 <heyongli> baoli, agree, I also picture it this way
13:09:20 <baoli> So basically a compute host needs to provide the number of sr-iov ports per network it's connected with
13:10:51 <heyongli> don't get your point, but this sounds good,
13:11:58 <baoli> heyongli, I'm talking about the information required from a compute host for sr-iov networking
13:12:44 <heyongli> got it, does this bring something new?
13:13:01 <Guest84743> baoli, to clarify, when you write "(macvtap or not)" above, do you mean the tenant specifies the type of connection (direct/macvtap) or is this implicit in the sr-iov/network spec?
13:13:58 <baoli> guest84743, two variants with sr-iov ports
13:14:03 <Guest84743> right okay
13:14:18 <Guest84743> so they have a degree of control over how it is connected?
13:14:26 <baoli> heyongli, not really. We have gone over it several times.
13:14:37 <heyongli> i suppose the port type could be specified by the user or not, both should be valid use cases
13:15:10 <baoli> Guest84743, yea, with macvtap, less performance, but with live migration support
13:15:16 * Guest84743 nods
13:16:18 <baoli> heyongli, on top of the basic use cases, you brought up that a tenant may want to use an sr-iov port that resides on a particular vendor's card.
13:17:12 <baoli> Would everyone agree that the cloud should provide this support to the tenant?
13:17:18 <Guest84743> has there ever been a discussion about using classifiers instead of direct "specification" that would mean something like 'passthru-to-net-foo-direct' instead of lower level details, or is this type of generalization/indirection seen as being part of a flavor mechanism
13:17:44 <russellb> this feels like flavor driven stuff to me
13:17:46 <heyongli> guest84743, i agree
13:17:52 <russellb> Guest84743: /nick beagles :-)
13:18:05 <baoli> beagles, yes, we talked about that.
13:18:10 <sadasu> Guest84743: good point
13:18:14 <Guest84743> russellb, wouldn't let me change.. or kept changing it back
13:18:34 <russellb> Guest84743: may need to identify with nickserv
13:18:37 <rkukura> I’m no expert on this area, but do we need to think about how the tenant ensures the VM has the right driver for whatever device they are asking for?
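For reference, the "flavor driven" path russellb mentions already existed at the time for generic (non-networking) PCI passthrough, and the spec under review was meant to build on it. A minimal sketch of that existing mechanism, using an illustrative alias name and vendor/product IDs:

    # nova.conf on the API/scheduler node: name a class of assignable devices
    pci_alias = {"vendor_id": "8086", "product_id": "10ca", "name": "fastnic"}

    # request two such devices through a flavor extra spec
    nova flavor-key m1.large set "pci_passthrough:alias"="fastnic:2"

The open question in the meeting is how a per-network SR-IOV port request maps onto, or replaces, this alias mechanism.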
13:18:37 <Guest84743> yeah, I've read the backlog of emails but in the end I couldn't get a strong handle on the general consensus
13:18:51 <heyongli> i also think the vendor and vnic type should be hidden in the flavor, but the user should also be able to choose them if the user really wants to
13:19:35 <sadasu> rkukura: yes...that is very critical for the feature to work
13:19:38 <b3nt_pin> heh, b3nt_pin will have to do for now.. won't let me (b3nt_pin is beagles)
13:20:04 <baoli> rkukura, do you mean the kernel device driver?
13:20:10 <rkukura> baoli: yes
13:20:16 <rkukura> in the VM
13:20:23 <b3nt_pin> rkukura, that wouldn't suck :)
13:20:30 <sadasu> when the tenant picks the vendor and product ids, this indirectly takes care of the tenant driver criteria
13:20:33 <russellb> you can do something for that between flavor metadata and image metadata
13:20:38 <russellb> capabilities / requirements matching
13:20:55 <russellb> if you want
13:21:42 <b3nt_pin> russellb, is there a good example of this in use that you would recommend?
13:21:43 <rkukura> If the tenant needs to ensure the VM has the correct driver, I don’t think we should be obfuscating how the tenant specifies/sees the vNIC type too much
13:22:08 <russellb> rkukura: that's a fair point
13:22:56 <sadasu> rkukura: the same driver usually supports both vnic_types
13:24:05 <heyongli> the driver should depend directly on the device in use, not the nic type, i think
13:24:28 <sadasu> driver decides which host the VM can run on and the vnic_type decides the mode in which the sr-iov port can be used
13:25:03 <rkukura> Again, I’m far from an expert on this area, but does the VM need a driver specific to the SR-IOV card’s vendor/type?
13:25:20 <sadasu> vnic_type is not useful in the placement of a VM on a host but the driver info is
13:25:26 <sadasu> rkukura: correct
13:26:37 <baoli> rkukura, would an image be built with all the device drivers the cloud would support?
13:26:39 <rkukura> So vnic_type is more of a mode of operation than a device type?
13:27:03 <sadasu> rkukura: correct ...again :-)
13:27:20 <rkukura> baoli: I guess if the image is supplied by the cloud provider, that would be reasonable.
13:27:23 <sadasu> and only neutron cares about the vnic_type...nova can be blind to it
13:27:59 <sadasu> we just want to include it in a single API, so the user can launch a VM with all these specifications via one command
13:28:46 <heyongli> to get a device driver properly set up, we may set up the metadata after getting the actual device, but i'm not sure there is a way to do so at that point for a tenant image.
13:29:10 <sadasu> rkukura: by image do you mean kernel driver for the sr-iov ports?
13:29:11 <b3nt_pin> sadasu, rkukura: to clarify, are we saying that a VM needs a device specific driver? If yes, does it still need that if the vnic type is macvtap (doesn't sound right)?
13:29:32 <heyongli> and this is a common issue for all pci passthrough, not only for sriov
13:29:38 <rkukura> b3nt_pin: I don’t know the answers
13:30:09 <b3nt_pin> rkukura, :) okay... I'll poke around and see if I can find out what the deal with libvirt is at least.
13:30:12 <sadasu> b3nt_pin: host needs a device specific driver
13:30:31 <b3nt_pin> sadasu, host, but not VM, right?
13:30:45 <rkukura> sadasu: Yes, I meant the image that is booted, with its kernel, drivers, …
13:30:52 <sadasu> b3nt_pin: correct
13:31:03 <b3nt_pin> sadasu, aahhh okay... that makes sense :)
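For context on the vnic_type exchange above: the proposal under review exposed the mode as an attribute on the Neutron port, so direct vs. macvtap is chosen per port rather than per image or flavor. A rough sketch of how such a request was expected to look (roughly the syntax that later landed in Juno; network, image, and flavor names are illustrative):

    # create an SR-IOV port on a tenant network, fully passed through ...
    neutron port-create private --name sriov-port --binding:vnic_type direct
    # ... or attached via macvtap (lower performance, but live-migration friendly)
    neutron port-create private --name sriov-port --binding:vnic_type macvtap

    # boot the VM against the pre-created port
    nova boot --flavor m1.small --image fedora-20 --nic port-id=<port-uuid> sriov-vm

This keeps Nova largely blind to the mode, as sadasu notes; it is Neutron's ML2 mechanism driver that acts on binding:vnic_type.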
13:32:13 <b3nt_pin> sadasu, I'm curious how VFs are presented to a VM but I suppose that is a virtualization detail.
13:32:18 <sadasu> rkukura: at least with the pci passthru devices that I have been exposed to so far...the driver resides in the kernel of the host OS and not the guest OS
13:32:34 <sadasu> not sure if any devices out there break that model
13:33:17 <b3nt_pin> sadasu, I'm happy to "hear" that as it is consistent with my admittedly "academic" view (awaiting hardware to get hands-on)
13:33:29 <sadasu> but I am guessing there will definitely be a host OS level driver...there might be an additional guest os lever driver
13:33:32 <sadasu> level*
13:33:33 <heyongli> sadasu, any device? don't think so, some drivers must be in the guest os also, for some devices
13:33:50 <b3nt_pin> heyongli, yeah.. that would make sense especially for non-networking SR-IOV
13:34:02 <sadasu> heyongli: yes...mentioned that..
13:34:11 * b3nt_pin wonders if anybody has a crypto board
13:34:31 <sadasu> the point I am trying to make is that ..there is a host kernel dependency also...
13:34:34 <heyongli> b3nt_pin, i had one
13:34:59 <heyongli> sadasu, seems like a deploy problem
13:35:08 <russellb> i think even if we only pulled off sr-iov NICs for Juno, that'd be a huge step forward :)
13:35:09 <b3nt_pin> sadasu, yeah.. so this has to be presented as part of the info, basically. Whether configured or discovered.
13:35:10 <sadasu> if the dependency is only on the VM image, then I think it is pretty simple because the cloud provider can provide the correct image ...end of story
13:35:18 <b3nt_pin> russellb, +100
13:35:26 <baoli> I think that we are talking about a different issue: how to select a host that meets the requirement that comes from a tenant-built image.
13:35:27 <russellb> "only"
13:35:57 * b3nt_pin didn't mean to "dilute"
13:36:10 <russellb> nah i said that
13:36:40 <rkukura> My concern was purely with the tenant VM having the correct driver for the virtual device that shows up in the VM, not with the hosts’ drivers.
13:37:45 <heyongli> rkukura, the metadata service might be an answer for this, fix me
13:37:51 <sadasu> rkukura: in that case, the tenant has to supply nova with the correct image id, correct?
13:38:08 <baoli> Say the tenant-built image supports the mlnx driver only.
13:38:34 <heyongli> baoli, then we need vendor information here, in the flavor
13:39:52 <baoli> How to extract the requirement from the image?
13:39:58 <heyongli> sadasu, for any image, providing some information to the VM is a possible solution
13:40:11 <sadasu> baoli: thought we covered that...
13:40:38 <heyongli> baoli, this is another story, i'm trying to make the image able to carry pci device information, specs or flavor something
13:40:51 <sadasu> heyongli: yes, that's where the vendor_id, product_id come into play
13:41:16 <baoli> sadasu, can you enlighten me again?
13:41:27 <sadasu> baoli: :-)
13:41:54 <sadasu> u are the expert..
13:42:11 <sadasu> ok...so I think this is where we are:
13:42:14 <b3nt_pin> can we not simply assign that info to the image via metadata?
13:42:39 <sadasu> PCI device needs 1. driver in host OS 2. driver in guest OS 3. Both
13:43:00 <heyongli> b3nt_pin, we must have a facility to extract that from the image.
13:43:04 <sadasu> everyone on same page so far?
13:44:14 <heyongli> sure, host driver is deploy scope i think, guest image is a problem, we should maybe split that into another topic, it's common for all pci devices. fix me
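On b3nt_pin's question about assigning the guest-driver requirement to the image via metadata: Glance already allows arbitrary properties on images, so recording the requirement is easy; the missing piece discussed here is something on the scheduling/PCI side that consumes it. A sketch, where the property name is purely hypothetical:

    # record (hypothetically) which guest driver the image ships with
    glance image-update <image-uuid> --property pci_guest_driver=mlx4_en

    # a scheduler filter or the PCI request builder would then have to match
    # this against the vendor_id/product_id of the devices a host exposes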
13:44:23 <sadasu> specifying vendor_id, product_id will help nova place a VM of type 1 on the correct host
13:45:16 <baoli> b3nt_pin: do we have an API to associate the meta-data with the image, which then can be used for host scheduling?
13:45:26 <sadasu> based on the driver inside the VM image, once again giving vendor_id, product_id to nova should let us place it on the host having the correct HW
13:45:37 <heyongli> sadasu, i don't think the vm needs this in a type 1 request, where am i wrong?
13:45:38 <sadasu> case 3: is same as above
13:46:06 <b3nt_pin> baoli, http://docs.openstack.org/grizzly/openstack-compute/admin/content/image-metadata.html
13:46:30 <baoli> b3nt_pin, thanks
13:46:48 <sadasu> for case 1: we will use vendor_id product_id just to determine if pci passthrough devices exist
13:47:25 <sadasu> b3nt_pin: thanks, will take a look
13:47:31 <heyongli> sadasu, no, the whitelist should do this, in the deploy stage.
13:47:57 <sadasu> agreed...whitelist contains the same info..
13:48:29 <sadasu> it would definitely help to associate metadata with the VM instead of expecting the tenant to assign the correct image to the VM
13:48:29 <heyongli> anything related to the host should be a deploy problem, at least deploy is involved
13:49:03 <sadasu> yes...even at deploy you are providing vendor_id, product_id
13:51:21 <heyongli> sadasu, sure, it helps, but current image metadata can not address the vm's driver problem if we don't check it in the PCI pass-through scope
13:52:09 <b3nt_pin> I'm kind of struggling with the "extract from vm" idea... what do we mean by that?
13:53:07 <heyongli> b3nt_pin, where is the extract from vm idea? i'm lost here
13:53:30 * b3nt_pin goes back ...
13:53:44 <b3nt_pin> sorry.. heyongli I misinterpreted your previous statement
13:53:54 <b3nt_pin> you said, "extract from image" not from VM
13:53:58 <b3nt_pin> my bad
13:54:14 <sadasu> I am a little worried that we are making the configuration very difficult
13:54:52 <b3nt_pin> can you elaborate on which points you feel might be increasing the difficulty?
13:55:00 <heyongli> sadasu, not really difficult, which point do you mean is difficult?
13:55:17 * b3nt_pin is telepathically linked to heyongli apparently
13:55:19 <sadasu> whitelist, image meta-data, sr-iov port modes, picking the correct ml2 driver on the neutron side :-)
13:55:46 <sadasu> I get what is going on..I am looking at it from a tenant perspective
13:56:00 * b3nt_pin nods...
13:56:20 <b3nt_pin> are all of these essential for basic SR-IOV usage, or only required for finer and finer control?
13:56:20 <heyongli> all of this could be done automatically, except the image meta-data, that might be confusing now, but it's not very bad
13:56:34 <heyongli> i think it's finer control
13:57:00 <b3nt_pin> yeah
13:57:26 <sadasu> b3nt_pin: hard to say what is basic SR-IOV usage anymore :-)..every device/vendor is doing things slightly differently
13:57:31 <b3nt_pin> lol
13:57:35 <heyongli> because the OS admin could say, we have all these devices, please pre-install the drivers in your image... that is a simple solution here
13:59:04 <baoli> heyongli, thinking about the same.
14:00:16 <baoli> ok, we have to wrap it up today
14:00:18 <heyongli> if we provide an image constraint, it's also a good feature,
14:00:23 <sadasu> ok...so my action item is to understand the image metadata feature better to see if it better suits the VM image driver dependency issue
14:00:30 <baoli> thanks everyone
14:00:34 <b3nt_pin> cheers all!
14:00:37 <baoli> #endmeeting
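For reference on the whitelist point heyongli raises: the deploy-stage configuration that declares which host devices may be assigned already existed in Nova at the time. A minimal sketch, with illustrative IDs:

    # nova.conf on each compute node: which devices (PFs/VFs) may be passed through
    pci_passthrough_whitelist = {"vendor_id": "8086", "product_id": "10ca"}

    # nova.conf on the scheduler: enable PCI-aware placement
    scheduler_default_filters = <existing defaults>,PciPassthroughFilter

Anything tied to a specific host (which devices exist, which physical network they reach) stays in this deploy-time configuration; the tenant-facing pieces are the flavor, image metadata, and port attributes sketched earlier.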