13:00:04 <baoli> #startmeeting PCI passthrough
13:00:05 <openstack> Meeting started Thu Jan 9 13:00:04 2014 UTC and is due to finish in 60 minutes. The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:06 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:08 <openstack> The meeting name has been set to 'pci_passthrough'
13:00:23 <baoli> Hi everyone
13:00:27 <irenab> hi
13:00:33 <johnthetubaguy1> hi
13:00:41 <heyongli> hello
13:00:55 <baoli> John, can you lead the discussion today?
13:01:06 <johnthetubaguy> if you want
13:01:17 <johnthetubaguy> I would love to talk about this:
13:01:27 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:01:40 <johnthetubaguy> what would the nova CLI calls look like
13:01:56 <johnthetubaguy> when requesting GPU passthrough first
13:03:00 <johnthetubaguy> anyone want to suggest a proposal?
13:03:08 <irenab> john, I guess it requires flavor creation with an extra_spec for the GPU device and then a regular 'nova boot'
13:03:26 <johnthetubaguy> +1 on that I think
13:03:43 <heyongli> current is: nova flavor-key m1.large set "pci_passthrough:alias"="a1:2"
13:03:57 <heyongli> nova boot --image new1 --key_name test --flavor m1.large 123
13:03:58 <johnthetubaguy> right
13:04:20 <johnthetubaguy> I will add that in the wiki
13:04:21 <irenab> heyongli: and it's already supported, right?
13:04:31 <heyongli> yes
13:05:18 <johnthetubaguy> so there is a limitation there
13:05:25 <johnthetubaguy> you only get one PCI passthrough device
13:05:39 <johnthetubaguy> do we care about that for GPU etc, I think the answer is yet
13:05:42 <johnthetubaguy> I mean yes
13:06:12 <irenab> john: you can request a number of devices
13:06:33 <johnthetubaguy> irenab: how?
13:06:41 <johnthetubaguy> oh, I see
13:06:45 <irenab> a1:2 => 2 devices
13:06:53 <heyongli> a1:2, a2:3
13:06:54 <johnthetubaguy> but they are all from the same alias
13:06:56 <heyongli> is also ok
13:07:04 <johnthetubaguy> ah, so that's a better example
13:07:20 <johnthetubaguy> we support this today then: a1:2, a2:3
13:07:22 <heyongli> and an alias supports a mixed spec: two types of device
13:07:25 <irenab> I think you can add another alias too
13:08:07 <heyongli> you can define an alias:
13:08:17 <irenab> my feeling is the GPU case is quite solved and we just need to keep it working when adding the networking case, agree?
13:08:22 <heyongli> a1={type1}
13:08:30 <heyongli> then write the same: a1={type2}
13:08:54 <heyongli> then you request a1:2, meaning both type1 and type2 are ok
13:09:00 <johnthetubaguy> heyongli: what are the CLI commands for that, I am a bit confused
13:09:20 <heyongli> in the nova configuration now.
13:09:25 <johnthetubaguy> I added the GPU case here:
13:09:25 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:09:29 <heyongli> no API yet
13:09:32 <johnthetubaguy> I agree we just need to keep it working
13:09:52 <johnthetubaguy> heyongli: I am talking about the flavor, and what we already have today
13:10:15 <sadasu> heyongli: the cli is confusing ... "x:y"="a:b" would be interpreted as x = a and y = b, which is not the case in your CLI
13:11:02 <heyongli> sadasu: where did you see x:y = a:b?
13:11:10 <johnthetubaguy> hang on, hang on
13:11:13 <johnthetubaguy> is this valid today
13:11:14 <johnthetubaguy> nova flavor-key m1.large set "pci_passthrough:alias"="large_GPU:1,small_GPU:1"
13:11:14 <sadasu> "pci_passthrough:alias"="a1:2"
13:11:37 <baoli> pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071"}'
13:11:44 <baoli> this is how it's defined today
13:11:49 <heyongli> sadasu: this is another problem; john: right, it works today
13:11:56 <johnthetubaguy> I don't mind about the alias
13:12:08 <johnthetubaguy> I am trying to ask how the flavor extra specs work today
13:12:11 <johnthetubaguy> is this valid?
13:12:12 <johnthetubaguy> nova flavor-key m1.large set "pci_passthrough:alias"="large_GPU:1,small_GPU:1"
13:12:28 <heyongli> johnthetubaguy, it works
13:12:32 <johnthetubaguy> cool
13:12:47 <johnthetubaguy> so, we have the user request for vGPU
13:12:48 <sadasu> I am sure it works... you can make it work... I am just suggesting that it is not very self-explanatory
13:12:48 <johnthetubaguy> now...
13:12:56 <johnthetubaguy> SRIOV
13:13:30 <johnthetubaguy> sadasu: it could be better, it could be worse, but I vote we try not to worry about that right now
13:13:49 <sadasu> ok... got it... let's move to SRIOV
13:13:58 <heyongli> sadasu: pci_passthrough:alias should be this style because of scheduler history reasons.
13:14:36 <johnthetubaguy> so first, nova boot direct
13:14:38 <irenab> SRIOV + neutron, ok?
13:14:46 <johnthetubaguy> yep
13:15:31 <irenab> john: the suggestion is to add attributes to --nic
13:15:45 <johnthetubaguy> yep, can we give an example
13:15:54 <johnthetubaguy> I am trying to type one up and not liking any of them
13:15:57 <baoli> Can we go over the Jan 8th agenda I posted yesterday?
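[Editor's note: the extra-spec request format discussed above ("alias:count" pairs such as "large_GPU:1,small_GPU:1") can be sketched with a small parser. This is an illustrative helper only, under the assumption that a missing count defaults to 1; the function name `parse_alias_spec` is hypothetical and this is not Nova's actual implementation.]

```python
def parse_alias_spec(spec):
    """Parse a comma-separated 'alias:count' extra spec into
    (alias_name, requested_count) pairs.

    Hypothetical sketch of the request format discussed in the
    meeting, not Nova's real parser.
    """
    requests = []
    for part in spec.split(","):
        # "a1:2" -> ("a1", ":", "2"); a bare "a1" yields an empty count
        name, _, count = part.strip().partition(":")
        requests.append((name, int(count) if count else 1))
    return requests

print(parse_alias_spec("large_GPU:1,small_GPU:1"))
# [('large_GPU', 1), ('small_GPU', 1)]
```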
13:16:14 <baoli> It contains all the details we have been working on so far
13:16:39 <johnthetubaguy> OK, we can reference it for sure, I just would love to agree this user end bit first
13:16:46 <irenab> baoli: let's go to the last use case
13:17:11 <irenab> john: do you agree with the nova boot format?
13:17:29 <johnthetubaguy> I can't easily see an example in that text
13:17:36 <johnthetubaguy> oh wait
13:17:37 <johnthetubaguy> sorry
13:17:39 <baoli> nova boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,vnic-type=macvtap,pci-group=<group-name> <vm-name>
13:17:39 <johnthetubaguy> I am blind
13:17:46 <johnthetubaguy> macvtap?
13:17:57 <johnthetubaguy> vs direct vs vnic
13:17:59 <irenab> or direct or virtio
13:18:02 <johnthetubaguy> why do we need that?
13:18:14 <johnthetubaguy> I mean, why do we have three here?
13:18:32 <baoli> For SRIOV, there is both macvtap and direct
13:18:54 <johnthetubaguy> OK, is that not implied by the device type and vif driver config?
13:19:02 <baoli> With macvtap, it's still pci passthrough but a host macvtap device is involved
13:19:41 <baoli> Well, the device type and vif driver can support both at the same time on the same device
13:20:06 <johnthetubaguy> hmm, OK
13:20:14 <johnthetubaguy> macvtap doesn't look like passthrough
13:20:23 <johnthetubaguy> it looks like an alternative type of vnic
13:20:31 <baoli> John, it's one form of PCI passthrough
13:20:36 <irenab> john, the idea is to work with a neutron ML2 plugin that will enable different types of vnics
13:20:37 <baoli> I mean one type
13:21:16 <johnthetubaguy> OK...
13:21:26 <johnthetubaguy> does the PCI device get attached to the VM?
13:21:39 <baoli> John, yes.
13:21:46 <johnthetubaguy> hmm, OK
13:21:47 <irenab> john: both macvtap and direct are network interfaces on a PCI device
13:21:58 <johnthetubaguy> OK
13:22:11 <johnthetubaguy> seems like we need that then
13:22:15 <irenab> direct requires a vendor driver in the VM and macvtap doesn't
13:22:36 <baoli> irenab, I think it's the opposite
13:22:50 <baoli> irenab, sorry, you are right
13:22:52 <johnthetubaguy> so, as a user, I don't want to type all this stuff in, let me suggest something...
13:23:10 <johnthetubaguy> the user wants a nic-flavor right?
13:23:27 <johnthetubaguy> defaults to whatever makes sense in your cloud setup
13:23:33 <baoli> John, we have a special case in which the user doesn't need to type it
13:23:43 <irenab> john: exactly
13:23:44 <johnthetubaguy> but if there are options, the user picks "slow" or "fast" or something like that
13:24:04 <johnthetubaguy> so I would expect to see...
13:24:49 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:24:49 <johnthetubaguy> --nic net-id=<net-id>,vnic-flavor=<slow | fast | foobar> <vm-name>
13:25:16 <johnthetubaguy> vnic-type is probably better than flavor I guess
13:25:33 <baoli> John, we don't want to add QoS to this yet, which is a separate effort
13:25:47 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:25:47 <johnthetubaguy> --nic net-id=<net-id>,vnic-type=<slow | fast | foobar> <vm-name>
13:26:00 <baoli> But I guess that you can do that
13:26:02 <johnthetubaguy> this isn't QoS...
13:26:09 <johnthetubaguy> slow = virtual
13:26:14 <johnthetubaguy> fast = PCI passthrough
13:26:25 <heyongli> does this mean vnic-type contains vnic-type=macvtap in it?
13:26:29 <irenab> john: agree on this
13:27:04 <johnthetubaguy> heyongli, the concept represented by vnic-type would include such settings, yes
13:27:16 <johnthetubaguy> so do we all agree on this:
13:27:17 <johnthetubaguy> nova boot --flavor m1.large --image <image_id>
13:27:17 <johnthetubaguy> --nic net-id=<net-id>,vnic-type=<slow | fast | foobar> <vm-name>
13:27:34 <heyongli> i'm ok with it.
13:27:44 <irenab> john: missing here is the 'pointer' to the pool of PCI devices
13:28:03 <baoli> John, how do you define vnic-type?
13:28:16 <johnthetubaguy> well, that's the question
13:28:24 <johnthetubaguy> vnic-type is the user concept
13:28:32 <johnthetubaguy> we need to map that to concrete settings
13:28:49 <johnthetubaguy> but before we get there, are we OK with the theory of that user facing command?
13:29:00 <baoli> our original idea was to define a type of vnic that a user would attach its VM to
13:29:29 <johnthetubaguy> right, that's what I am suggesting here I think...
13:30:09 <baoli> Can we classify the VNICs to have types of virtio, pci-passthrough without macvtap, pci-passthrough with macvta[
13:30:15 <baoli> sorry, macvtap
13:30:33 <johnthetubaguy> the user doesn't care about all that, that's an admin thing, I think
13:30:50 <johnthetubaguy> the user cares about the offerings, not the implementation
13:31:03 <johnthetubaguy> at least, that's our general assumption in the current APIs
13:31:20 <irenab> john: I guess the user will be charged differently depending on what vnic he has, so probably he should be aware
13:31:45 <irenab> but logically it should have names meaningful to the user and not technical
13:31:55 <johnthetubaguy> exactly
13:32:05 <baoli> #agreed
13:32:09 <johnthetubaguy> logical names, the users choose which one, but they care about the logical name
13:32:34 <johnthetubaguy> cool… I don't really care what that is, but this works for now I think...
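[Editor's note: the idea agreed above, a user-facing logical vnic-type that the deployer maps to a concrete attach mechanism, can be sketched as a simple lookup. All names here (`NIC_TYPE_MAP`, `resolve_nic_type`, the "slow"/"fast"/"faster" labels) are hypothetical illustrations of the proposal, not an existing Nova or Neutron API.]

```python
# Deployer-defined mapping from logical, user-visible nic-type names
# to implementation details the user never sees. Hypothetical sketch.
NIC_TYPE_MAP = {
    "slow":   {"attach-type": "virtio"},   # plain virtual NIC
    "fast":   {"attach-type": "macvtap"},  # SRIOV via a host macvtap device
    "faster": {"attach-type": "direct"},   # direct passthrough (needs vendor driver in VM)
}

def resolve_nic_type(nic_type):
    """Map a logical nic-type chosen at 'nova boot' to concrete settings."""
    try:
        return NIC_TYPE_MAP[nic_type]
    except KeyError:
        raise ValueError("unknown nic-type: %s" % nic_type)
```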
13:32:44 <johnthetubaguy> boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,nic-type=<slow | fast | foobar> <vm-name>
13:33:04 <johnthetubaguy> I removed the "v" bit, it seems out of place, but we can have that argument later
13:33:22 <irenab> still missing here is the binding to the PCI devices allowed for this nic
13:33:27 <baoli> #agreed
13:33:55 <johnthetubaguy> irenab: yes, let's do that in a second
13:33:59 <johnthetubaguy> now one more question...
13:34:08 <johnthetubaguy> if we have the above, I think we also need this...
13:34:26 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> --nic port-id=<port-id>
13:34:36 <johnthetubaguy> i.e. all the port settings come from neutron
13:34:42 <johnthetubaguy> which means...
13:34:45 <baoli> John, yes.
13:34:48 <johnthetubaguy> we probably need this
13:34:55 <johnthetubaguy> quantum port-create --fixed-ip subnet_id=<subnet-id>,ip_address=192.168.57.101 <net-id> --nic-type=<slow | fast | foobar>
13:34:57 <baoli> we had it described in our doc
13:35:07 <irenab> john: yes, it will be added with the same nic-type attribute
13:35:14 <johnthetubaguy> cool, apologies for repeating the obvious
13:35:20 <johnthetubaguy> just want to get agreement
13:35:33 <irenab> agree
13:35:40 <heyongli> agree
13:35:41 <johnthetubaguy> cool, so does this look correct:
13:35:41 <johnthetubaguy> https://wiki.openstack.org/wiki/Meetings/Passthrough#The_user_view_of_requesting_things
13:36:00 <irenab> john: it's neutron :)
13:36:22 <johnthetubaguy> agree, just want to make sure
13:36:40 <johnthetubaguy> I have little knowledge of neutron these days, but that seems to make sense
13:36:45 <johnthetubaguy> cool
13:37:01 <baoli> overall, it looks good
13:37:07 <johnthetubaguy> so how do we get a mapping from nic-type to macvtap and pci devices
13:37:17 <johnthetubaguy> I vote macvtap goes into the alias
13:37:21 <johnthetubaguy> is that crazy?
13:37:25 <heyongli> +1
13:37:49 <irenab> john: do you suggest it work with the flavor?
13:38:23 <johnthetubaguy> irenab: not really, at least I don't think so
13:38:34 <johnthetubaguy> irenab: sounds like info that the VIF driver needs
13:38:39 <irenab> so what do you mean by goes into the alias?
13:38:52 <johnthetubaguy> a good question...
13:39:22 <heyongli> the alias maps the nic-type to the information needed.
13:39:41 <heyongli> this means the vnic type is one kind of alias
13:39:57 <johnthetubaguy> pci_alias='{"name":"Cisco.VIC","vendor_id":"1137","product_id":"0071", "nic-type":"fast", "attach-type":"macvtap"}'
13:40:08 <heyongli> +1
13:40:22 <johnthetubaguy> with nic-type and attach-type as optional, it's not that sexy, but could work I think...
13:40:23 <irenab> not good...
13:40:31 <baoli> no good
13:40:40 <irenab> this definition makes it static
13:40:52 <johnthetubaguy> pci_alias_2='{"name":"Cisco.VIC.Fast","vendor_id":"1137","product_id":"0071", "nic-type":"faster", "attach-type":"direct"}'
13:40:57 <heyongli> what do you mean static?
13:41:06 <johnthetubaguy> user chooses "fast" or "faster"
13:41:36 <irenab> it comes back to my previous question regarding the pool of available PCI devices for the vnic
13:41:37 <johnthetubaguy> does that work?
13:41:48 <irenab> seems you define all of this as part of the alias
13:41:57 <johnthetubaguy> at the moment, yes
13:42:08 <baoli> John, if we have done the user's point of view, can we go over the original post?
13:42:41 <johnthetubaguy> baoli: sure, that probably makes sense
13:43:00 <baoli> Thanks, john
13:43:32 <irenab> If we want a VM to be connected to 3 networks: one via SRIOV direct, one with SRIOV macvtap and one with virtio, how will it be done?
13:44:20 <johnthetubaguy> this is now
13:44:27 <johnthetubaguy> one sec...
13:44:28 <baoli> Thanks irenab for bringing that up
13:45:09 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> --nic net-id=<net-id>,nic-type=fast --nic net-id=<net-id>,nic-type=faster <vm-name>
13:45:15 <johnthetubaguy> • pci_alias='{"name":"Cisco.VIC", "devices":[{"vendor_id":"1137","product_id":"0071", "address":"*"}], "nic-type":"fast", "attach-type":"macvtap"}'
13:45:16 <johnthetubaguy> • pci_alias_2='{"name":"Cisco.VIC.Fast", "devices":[{"vendor_id":"1137","product_id":"0071", "address":"*"}], "nic-type":"faster", "attach-type":"direct"}'
13:45:37 <heyongli> john, agree
13:45:55 <johnthetubaguy> hang on, we missed regular...
13:46:28 <johnthetubaguy> nova boot --flavor m1.large --image <image_id> --nic net-id=<net-id-1> --nic net-id=<net-id-2>,nic-type=fast --nic net-id=<net-id-3>,nic-type=faster <vm-name>
13:47:11 <irenab> john: we need the devices that provide network connectivity, how is that going to happen?
13:47:33 <johnthetubaguy> right, we haven't covered how we implement it
13:47:38 <johnthetubaguy> just how the user requests it
13:48:14 <johnthetubaguy> irenab: is that OK for the user request?
13:48:17 <heyongli> this will work smoothly for pci, in my opinion, and the connectivity can also be a spec of the alias
13:48:32 <irenab> john: if a user has both cisco and mellanox nics, they will have to define cisco_fast and mellanox_fast ...
13:48:42 <johnthetubaguy> irenab: correct
13:48:46 <johnthetubaguy> unless
13:48:54 <johnthetubaguy> you want them to share...
13:48:55 <baoli> John, when you say user, do you mean the final user or someone providing the service?
13:49:23 <johnthetubaguy> well we are only really doing the end user at the moment
13:49:37 <johnthetubaguy> but we should do the PCI alias stuff for the deployer
13:49:49 <johnthetubaguy> hmm...
13:50:53 <johnthetubaguy> pci_alias_2='{"name":"Fast", "devices":[{"vendor_id":"1137","product_id":"0071", "address":"*", "attach-type":"direct"}, {"vendor_id":"123","product_id":"0081", "address":"*", "attach-type":"macvtap"}], "nic-type":"faster"}'
13:50:58 <johnthetubaguy> does that work better?
13:51:09 <irenab> I am not a fan of these alias definitions, but waiting to see how we resolve the network connectivity before agreeing
13:51:18 <heyongli> john: just defining the pci_alias 2 times works now
13:51:27 <johnthetubaguy> irenab: what do you mean: "resolve network connectivity"?
13:51:40 <baoli> John, we are trying to avoid the alias on the controller node
13:51:49 <baoli> first of all
13:51:55 <sadasu> john: the cisco vic and the mellanox vic could be connected to diff networks
13:52:10 <sadasu> so they cannot be part of the same alias
13:52:21 <johnthetubaguy> right, that's OK still though, I think
13:52:28 <baoli> I don't think anyone using pci cares about the vendor id, whatsoever
13:52:45 <sadasu> then both of them would seem equivalent at the time of nova boot
13:52:46 <johnthetubaguy> baoli: that is a deployer option
13:52:47 <irenab> how do we make the VM land on a node with PCI devices connecting to the correct provider-network, and have the correct PCI device be allocated?
13:53:12 <johnthetubaguy> irenab: OK, that's the scheduling issue then?
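[Editor's note: the multi-device alias format johnthetubaguy proposes above (one "Fast" alias whose devices list mixes vendors and attach-types) could be matched against host devices roughly as follows. The `match_device` helper is purely illustrative; the field names follow the example in the discussion, and this is not Nova's actual matching code.]

```python
import json

# The alias definition from the discussion, parsed from its JSON value
# (keys quoted here so it is valid JSON).
alias = json.loads(
    '{"name": "Fast", "devices": ['
    '{"vendor_id": "1137", "product_id": "0071", "address": "*", "attach-type": "direct"},'
    '{"vendor_id": "123", "product_id": "0081", "address": "*", "attach-type": "macvtap"}],'
    '"nic-type": "faster"}')

def match_device(alias, dev):
    """Return the alias entry matching a host PCI device, or None.

    Hypothetical sketch: a host device matches if its vendor/product
    IDs equal one of the entries in the alias's devices list.
    """
    for entry in alias["devices"]:
        if (entry["vendor_id"] == dev["vendor_id"]
                and entry["product_id"] == dev["product_id"]):
            return entry
    return None
```

One consequence visible in the sketch: the attach-type is decided per matched device, which is what lets one logical "Fast" nic-type cover both a direct-attach NIC and a macvtap-attach NIC, as discussed above.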
13:53:51 <irenab> john: agree, but I think that input comes from the nova boot command
13:54:18 <irenab> either in the flavor or --nic
13:54:25 <johnthetubaguy> right, so let me ramble on about how I see this working… I know it's not ideal
13:54:39 <johnthetubaguy> so, user makes a request
13:54:54 <johnthetubaguy> nova looks for the required pci devices
13:55:10 <johnthetubaguy> flavor extra specs or network_info might have them
13:55:15 <irenab> john: missing the scheduler
13:55:18 <johnthetubaguy> we get a list of required aliases
13:55:55 <johnthetubaguy> the scheduler can look at the required aliases
13:56:09 <johnthetubaguy> and filters out hosts that can't meet the requests for all the requested devices
13:56:25 <johnthetubaguy> i.e. that's some scheduler filter kicking in
13:56:33 <johnthetubaguy> when the request gets to the compute node
13:56:41 <johnthetubaguy> it talks to the resource manager to claim the resource as normal
13:57:19 <johnthetubaguy> the compute node sends updates to the scheduler on what devices are available (we can sort out the format later)
13:57:29 <johnthetubaguy> when the VM is set up...
13:57:32 <johnthetubaguy> create the domain
13:57:41 <johnthetubaguy> add any requested devices in the flavor extra specs
13:57:44 <johnthetubaguy> when plugging vifs
13:57:54 <johnthetubaguy> the vif driver gets extra PCI device info
13:58:27 <johnthetubaguy> it also gets some lib that points back to nova driver specific ways of plugging PCI devices, and does what it wants to do
13:58:29 <johnthetubaguy> maybe
13:58:40 <johnthetubaguy> anyways, that was my general thinking
13:59:13 <heyongli> john: currently pci works this way, almost
13:59:25 <irenab> john: I have to admit I do not see how the networking part is resolved ...
13:59:37 <baoli> Hi John, I think that's how it works today. But we need to resolve the network connectivity issue as irenab has pointed out.
We need a PCI device that connects to a physical net
13:59:45 <johnthetubaguy> well the VIF driver gets its config from neutron in the regular way
14:00:08 <johnthetubaguy> combined with PCI alias info from nova
14:00:13 <johnthetubaguy> it should be able to do what it needs to do
14:00:19 <johnthetubaguy> at least that's my suggestion
14:00:39 <heyongli> john: yep, this can also resolve the connectivity problem
14:01:00 <johnthetubaguy> anyways
14:01:05 <johnthetubaguy> it's probably the nova meeting on here now
14:01:16 <baoli> Time is up. Do you guys want to end the meeting soon?
14:01:20 <irenab> john: the PCI alias does not put any info regarding connectivity, according to what you defined previously
14:01:36 <heyongli> irenab: we can extend it
14:01:41 <johnthetubaguy> irenab: agreed, that has to come from neutron, in my model
14:01:42 <shanewang> yes, nova meeting
14:01:46 <johnthetubaguy> heyongli: it's the nova meeting
14:01:48 <irenab> it can be several mellanox NICs but only one connecting to the physical network
14:01:54 <baoli> #endmeeting