13:30:02 <sgordon> #startmeeting pci-passthrough-network
13:30:03 <openstack> Meeting started Fri Nov 1 13:30:02 2013 UTC and is due to finish in 60 minutes. The chair is sgordon. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:30:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:30:06 <openstack> The meeting name has been set to 'pci_passthrough_network'
13:30:10 <sgordon> might be useful to log this :)
13:30:19 <irenab> alias defined by means of product and vendor id and device type for now
13:30:52 <sgordon> #chair irenab
13:30:53 <openstack> Current chairs: irenab sgordon
13:30:57 <sgordon> #chair baoli_
13:30:58 <openstack> Current chairs: baoli_ irenab sgordon
13:31:20 <irenab> if auto discovery for host PCI devices is used, it will catch all VFs, which may not have the same physical network access
13:31:46 <baoli_> Yes, the current definition of pci alias is not sufficient
13:32:35 <irenab> so one thing that should be added is a physical network "hint" to the PCI alias
13:32:54 <HenryG> I am not sure if the physical network should go in the alias
13:33:18 <HenryG> This is a problem specific to network PCI devices
13:33:27 <irenab> yes
13:33:37 <HenryG> Other PCI devices are not interested in it
13:33:52 <irenab> so you suggest that neutron should be responsible to resolve it?
13:34:01 <sadasu> this info should be hidden inside port profiles
13:34:20 <HenryG> It might have to be a configuration
13:34:39 <irenab> I think this should be known in order to schedule the VM on the correct host
13:35:09 <baoli_> Let me reiterate this: an SR-IOV VF joins a network when it's assigned a port profile.
13:35:12 <itzikb> The thing is that right now nova handles the PCI devices, right?
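For context, the alias irenab describes (defined only by vendor/product id, with no network information) was configured roughly like this in nova.conf at the time. This is a sketch; exact option names and JSON keys varied by release:

```ini
[DEFAULT]
# Alias: any device with these IDs looks interchangeable to the scheduler.
# Note there is no physical-network information here, which is the gap
# being discussed.
pci_alias = {"vendor_id": "8086", "product_id": "10ed", "name": "a1"}
# Devices actually exposed for passthrough on a given compute node.
pci_passthrough_whitelist = [{"vendor_id": "8086", "product_id": "10ed"}]
```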
13:35:41 <baoli_> the reason for nova to do this is that it has to schedule an instance based on resource requirements
13:35:46 <HenryG> itzikb: yes, for scheduling
13:35:48 <mwagner> are we using SR-IOV and PCI passthrough interchangeably here?
13:36:10 <baoli_> @mwagner: for network, we use SR-IOV
13:36:14 <HenryG> mwagner: depends :)
13:36:21 <irenab> we too
13:37:24 <mwagner> ok, in my experience there has been a clear difference between the two
13:37:25 <irenab> @baoli_: not sure I follow how the scheduler can use the port profile
13:38:14 <mwagner> while this discussion is clearly focused on the network side, we should also keep in mind that you can pass through other devices (e.g. a storage controller)
13:38:27 <baoli_> @irenab: for scheduling purposes, the port profile is not involved. The scheduler only needs to know that a node has a PCI device that can satisfy the requirement
13:38:40 <mwagner> maybe it's possible to pick a solution that will support the general case
13:39:14 <rook> @mwagner agreed, Nvidia has come out with technology like SR-IOV, but for vGPUs
13:39:37 <irenab> agree
13:40:06 <baoli_> @mwagner: agreed. we are trying to focus on networking. But some of the nova side work should be generic to support various features
13:40:50 <HenryG> Let me try to summarize a little:
13:40:55 <irenab> @baoli_: how does it take into account the availability of the host's physical connectivity to the required physical network? Do you assume full connectivity?
13:40:56 <mwagner> I am just throwing it out there as we discuss where to do things, to make sure we keep it in mind
13:41:26 <irenab> @mwagner: agree
13:42:12 <sadasu> @baoli_ & @irenab: "the requirement" in your discussion can be very specific to each PCI passthrough device
13:42:38 <HenryG> We have a bunch of network PCI devices that all look the same to nova (same alias). But we want nova to schedule a device that is connected to a particular network, which is only on a subset of the devices.
13:43:07 <irenab> exactly
13:43:41 <baoli_> @irenab: I think that I kind of understand your concern now. Basically, that's another requirement from nova's scheduling point of view
13:43:44 <HenryG> And the question is how do we give nova this hint.
13:44:08 <itzikb> If not the alias, then how?
13:44:32 <baoli_> Now you are saying physical network connectivity should be part of nova's scheduling decision
13:44:34 <yamahata> So neutron should know the network topology and the differences between PCI devices, and tell nova?
13:45:07 <sadasu> I have the same opinion as yamahata.
13:45:07 <HenryG> yes, I think there will need to be some nova <-> neutron interaction
13:45:12 <irenab> another option may be to define dedicated aliases for each group
13:45:23 <sadasu> And it can be scaled to other types of PCI passthrough devices
13:46:41 <sadasu> @irenab can you give some specifics on this?
13:46:57 <baoli_> PCI alias can be enhanced to implicitly indicate a network segment
13:47:49 <HenryG> adding the info in the alias would make it easy, but it feels a little hacky
13:47:50 <irenab> there can be an alias per group of devices that should be used to connect to each physical network
13:48:05 <baoli_> @irenab: I think so
13:48:25 <HenryG> it also brings network topology info into nova, which I am uneasy about
13:48:39 <yamahata> HenryG, I agree. There are other use cases of nova <-> neutron interaction.
13:48:43 <irenab> but it probably will be a lot of pain to manage the lists of devices per host
13:48:51 <baoli_> @HenryG: not really
13:48:51 <sadasu> and who has the alias -> devices -> physical network mapping?
13:49:11 <sadasu> if nova... then I think we are breaking the nova/neutron design
13:49:38 <baoli_> PCI alias is a way to represent PCI devices to nova
13:49:57 <baoli_> Nova doesn't have any idea about the network topology
13:49:58 <sgordon> #info We have a bunch of network PCI devices that all look the same to nova (same alias). But we want nova to schedule a device that is connected to a particular network, which is only on a subset of the devices.
13:50:20 <itzikb> @sadasu: This is because it's both a PCI device and a NIC. I don't see a way around it
13:51:10 <baoli_> @sgordon: one way to do it: nova boot --flavor m1.large --image <image-id> --nic net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
13:51:20 <sadasu> nova should know only the alias... neutron should know the alias -> devices -> physical network mapping
13:52:11 <irenab> so it seems there should be some neutron agent per host to manage SR-IOV NICs, am I right?
13:52:13 <baoli_> the PCI alias will be used to schedule the instance
13:52:14 <sgordon> the nova api extensions for passthrough don't expose the alias atm, do they?
13:52:28 <baoli_> @irenab: why?
13:52:36 <HenryG> How about something like this: Nova builds its list of PCI devices. When it discovers a device that is a NIC, it asks neutron which network the NIC is connected to, and adds this info to the alias?
13:52:40 <sgordon> (talking about the proposed one here, they didn't make havana)
13:53:37 <baoli_> @HenryG: it only needs to know the alias and the number of resources in the alias for scheduling purposes.
13:53:40 <irenab> @baoli_: Neutron should know each host's NICs to be able to answer the nova request, following what was proposed above
13:54:09 <sadasu> @baoli_ how does nova make sure pci-alias and port-profile are correct for the net-id?
13:55:15 <baoli_> @sadasu: a net-id is associated with a VLAN, which is also defined in the port-profile. Neutron will make sure they are the same.
13:55:19 <sadasu> @irenab agreed.
13:57:15 <yamahata> baoli_, if the port is a VF, will Neutron (agent) change the config of the VF to be bound to the VLAN?
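The scheduling constraint being debated above (match the alias *and* a physical-network hint) can be sketched in a few lines. This is purely illustrative, not nova's actual PciPassthroughFilter; the function, data layout, and field names are all invented for the example:

```python
# Sketch: filter hosts by PCI alias (vendor/product id) plus a
# physical-network "hint", as discussed in the meeting. Illustrative
# only; not OpenStack code.

def hosts_for_request(hosts, alias, physnet):
    """Return hosts with at least one free device matching the alias
    that is attached to the requested physical network."""
    matches = []
    for host, devices in hosts.items():
        for dev in devices:
            if (dev["vendor_id"] == alias["vendor_id"]
                    and dev["product_id"] == alias["product_id"]
                    and dev["physical_network"] == physnet
                    and dev["free"]):
                matches.append(host)
                break  # one matching free device is enough for this host
    return matches

hosts = {
    "node-1": [{"vendor_id": "8086", "product_id": "10ed",
                "physical_network": "physnet1", "free": True}],
    "node-2": [{"vendor_id": "8086", "product_id": "10ed",
                "physical_network": "physnet2", "free": True}],
}
alias = {"vendor_id": "8086", "product_id": "10ed"}
print(hosts_for_request(hosts, alias, "physnet1"))  # ['node-1']
```

The open question in the meeting is where `physical_network` comes from: nova config (the "alias per group" option) or a query to neutron.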
13:57:55 <irenab> @yamahata: in the case of a HW VEB, yes
13:58:38 <irenab> I think that we figured out that a neutron agent is also required for NIC PCI device discovery
13:58:50 <sadasu> @baoli_ agreed... just wanted to point out that nova *checking* with neutron is unavoidable
13:58:53 <baoli_> @yamahata: in the case of libvirt, binding a VF to a VLAN means assigning a port profile to the VF
13:59:47 <yamahata> baoli_, got it. thanks
14:00:06 <irenab> in our case we configure the VF locally, so an agent will be a must
14:01:03 <HenryG> irenab: do you plan to use a driver in ML2 for managing your agent?
14:01:19 <baoli_> @irenab: whether or not an agent is needed is specific to the node
14:01:20 <irenab> @HenryG: yes
14:01:20 <sadasu> @irenab agreed!
14:01:29 <HenryG> irenab: excellent!
14:03:16 <irenab> what I am still not getting is how to get the scheduler to consider network connectivity
14:03:30 <baoli_> @irenab: PCI alias
14:03:35 <irenab> currently it uses the pci alias
14:04:44 <HenryG> irenab: it is not obvious yet what is the best or most desirable way
14:04:53 <baoli_> You define a PCI alias so that you can create a nova instance that makes use of the resources defined in that alias
14:05:15 <irenab> @baoli_: so a pci alias is a pool of devices with the same physical network connectivity?
14:05:40 <HenryG> yes, that is the "easy" way that I mentioned
14:06:10 <HenryG> but I will say again, do we want nova to have this network knowledge?
14:06:11 <irenab> the IT guy won't like it
14:06:48 <irenab> @HenryG: I think we don't
14:07:04 <baoli_> @HenryG: nova doesn't have this network knowledge, it only has the PCI alias
14:08:05 <HenryG> but to create the aliases correctly, you need network knowledge
14:08:32 <sadasu> I think we need to involve nova scheduler folks to get a better solution
14:08:37 <baoli_> When you design a cloud, you need to have that in mind
14:09:02 <HenryG> If someone moves cables around, they have to remember to update both the neutron config and the nova alias config
14:10:05 <irenab> @HenryG: I think networking info should be kept in the neutron config
14:10:29 <HenryG> irenab: yes, that is my preference too
14:10:30 <irenab> @sadasu: agree
14:11:02 <sadasu> with the number of pci-passthrough ports per physical port, putting the overloaded pci alias burden on IT is a bit much
14:11:40 <HenryG> we also need to make sure we do not derail the effort to remove nova networking
14:12:15 <sadasu> and how neutron provides this info to the scheduler is what is up for discussion IMHO
14:12:22 <baoli_> @HenryG: what we have discussed so far doesn't involve nova networking
14:12:27 <sgordon> #agree Networking info should be kept in Neutron
14:12:52 <sgordon> #info Need to determine how to impact scheduling
14:13:16 <irenab> I feel we need scheduler devs to progress here
14:13:19 <itzikb> I'm missing something - how will neutron keep the devices -> network mapping?
14:13:33 <sgordon> is there a design session scheduled for passthrough?
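irenab's "dedicated alias per group" option from earlier in the discussion would look roughly like this (hypothetical alias names). The comment spells out why HenryG and sadasu object to it:

```ini
[DEFAULT]
# One alias per physical network (hypothetical names). The device IDs are
# identical, so the operator would also need per-host whitelists tying
# specific VFs to each alias. The grouping itself encodes network
# topology in nova's config: recabling a host means updating both
# neutron config and these aliases, which is the maintenance burden
# discussed above.
pci_alias = {"vendor_id": "8086", "product_id": "10ed", "name": "physnet1-vfs"}
pci_alias = {"vendor_id": "8086", "product_id": "10ed", "name": "physnet2-vfs"}
```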
14:13:49 <sgordon> yes, as it turns out
14:13:52 <irenab> on Thursday morning
14:13:56 <sgordon> yah
14:14:01 <sgordon> #link http://icehousedesignsummit.sched.org/event/5d5b92daa0fff4e1d0a070a5c7397650#.UnO27PEYrTc
14:14:17 <sadasu> @sgordon thanks for the link
14:14:27 <irenab> but it's general passthrough, not specific to networking
14:14:34 <sgordon> yeah
14:14:43 <HenryG> itzikb: such mappings are usually done by config (.ini files)
14:14:48 <baoli_> I ran out of time in putting a session in
14:14:50 <sgordon> networking is heavily involved in the next step here though
14:15:04 <sgordon> and the nova api for pci isn't in havana so needs to be finalized for icehouse
14:15:23 <sgordon> i think this use case will need to be considered in that session
14:15:39 <itzikb> @HenryG: which one?
14:16:00 <sgordon> #chair HenryG
14:16:01 <openstack> Current chairs: HenryG baoli_ irenab sgordon
14:16:10 <irenab> Maybe worth having an unconference discussion during the summit?
14:16:32 <sgordon> irenab, +1 - perhaps after the more general pci discussion though?
14:16:47 <irenab> @sgordon: agree
14:17:26 <irenab> sorry, I have to leave the session in 3 minutes.
14:17:38 <sgordon> yeah i can't stay much longer either
14:18:06 <HenryG> itzikb: in neutron.conf for bridge-physical mappings, and in vendor .ini files for physical switch port mappings
14:18:32 <baoli_> @sgordon: https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base
14:19:27 <baoli_> Ok, let's continue the discussion during the summit
14:19:37 <sgordon> baoli_, yeah this is what i have been working with atm + the api extension that is on github atm
14:19:37 <itzikb> Thanks
14:19:51 <irenab> thank you for the great discussion, see you at HK
14:20:09 <sgordon> #endmeeting pci_passthrough_network
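The bridge-to-physical-network mapping HenryG mentions at 14:18:06 looked roughly like this in the Open vSwitch plugin configuration of the era (a sketch; the exact file, section, and bridge names vary by plugin and release):

```ini
[ovs]
# Maps neutron physical network names to per-host OVS bridges. This is
# the natural home for the device -> physical network knowledge that
# the meeting agreed should stay on the neutron side.
bridge_mappings = physnet1:br-eth1,physnet2:br-eth2
```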