13:30:02 <sgordon> #startmeeting pci-passthrough-network
13:30:03 <openstack> Meeting started Fri Nov  1 13:30:02 2013 UTC and is due to finish in 60 minutes.  The chair is sgordon. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:30:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:30:06 <openstack> The meeting name has been set to 'pci_passthrough_network'
13:30:10 <sgordon> might be useful to log this :)
13:30:19 <irenab> alias defined by means of product and vendor id and device type for now
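[Editor's note: the alias definition irenab describes lives in nova.conf; an illustrative entry (values hypothetical) looks like this:]

```ini
# nova.conf -- illustrative values only
# The alias identifies devices purely by vendor/product ID; nothing
# here says which physical network the VFs are attached to, which is
# the gap discussed below.
pci_alias = {"vendor_id": "8086", "product_id": "10ca", "name": "intel-vf"}
```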
13:30:52 <sgordon> #chair irenab
13:30:53 <openstack> Current chairs: irenab sgordon
13:30:57 <sgordon> #chair baoli_
13:30:58 <openstack> Current chairs: baoli_ irenab sgordon
13:31:20 <irenab> if auto discovery of host PCI devices is used, it will catch all VFs, which may not have the same physical network access
13:31:46 <baoli_> Yes, the current definition of pci alias is not sufficient
13:32:35 <irenab> so one thing that should be added is a physical network "hint" in the PCI alias
13:32:54 <HenryG> I am not sure if the physical network should go in the alias
13:33:18 <HenryG> This is a problem specific to network PCI devices
13:33:27 <irenab> yes
13:33:37 <HenryG> Other PCI devices are not interested in it
13:33:52 <irenab> so you suggest that neutron should be responsible to resolve it?
13:34:01 <sadasu> this info should be hidden inside port profiles
13:34:20 <HenryG> It might have to be a configuration
13:34:39 <irenab> I think this should be known in order to schedule the VM on correct Host
13:35:09 <baoli_> Let me reiterate this: a SRIOV VF joins a network when it's assigned a port profile.
13:35:12 <itzikb> The thing is that right now nova handles the PCI devices,right?
13:35:41 <baoli_> the reason for nova to do this is that it has to schedule an instance based on resource requirement
13:35:46 <HenryG> itzikb: yes, for scheduling
13:35:48 <mwagner> are we using sr-iov and pcipassthrough interchangably here ?
13:36:10 <baoli_> @mwagner: for network, we use SRIOV
13:36:14 <HenryG> mwagner: depends :)
13:36:21 <irenab> we too
13:37:24 <mwagner> ok, in my experience there has been a clear difference between the two
13:37:25 <irenab> @baoli_: not sure I follow how the scheduler can use a port profile
13:38:14 <mwagner> while this discussion is clearly focused on the network side, we should also keep in mind that you can pass through other devices (e.g. a storage controller)
13:38:27 <baoli_> @irenab, for scheduling purpose, port profile is not involved. The scheduler only needs to know that a node has the PCI device that can satisfy the requirement
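[Editor's note: the scheduling check baoli_ describes can be sketched as follows; names and structures are illustrative, not Nova's actual internals:]

```python
# Hypothetical sketch: the scheduler only needs to know whether a host
# has enough free PCI devices matching the requested alias. Port
# profiles play no part at this stage.

def host_satisfies(host_pci_pools, requested_alias, count=1):
    """Return True if the host has `count` free devices for the alias."""
    free = host_pci_pools.get(requested_alias, 0)
    return free >= count

# Example host inventory: alias name -> number of free VFs.
pools = {"intel-82599-vf": 4, "cisco-vic-vf": 0}

print(host_satisfies(pools, "intel-82599-vf", 2))  # True: enough free VFs
print(host_satisfies(pools, "cisco-vic-vf"))       # False: none left
```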
13:38:40 <mwagner> maybe its possible to pick a solution that will support the global case
13:39:14 <rook> @mwagner agreed, Nvidia has come out with SRIOV-like technology, but for vGPUs
13:39:37 <irenab> agree
13:40:06 <baoli_> @mwagner: agreed. we are trying to focus on networking. But some of the nova side work should be generic to support various features
13:40:50 <HenryG> Let me try to summarize a little:
13:40:55 <irenab> @baoli_: how it takes into account the availability of the physical connectivity of the Host to the required physical network? Do you assume full connectivity?
13:40:56 <mwagner> I am just throwing this out there as we discuss where to do things, to make sure we keep it in mind
13:41:26 <irenab> @mwagner: agree
13:42:12 <sadasu> @baoli_ & @irenab - "the requirement" in your discussion can be very specific to each pci passthrough device
13:42:38 <HenryG> We have a bunch of network PCI devices that all look the same to nova (same alias). But we want nova to schedule a device that is connected to a particular network, which is true of only a subset of the devices.
13:43:07 <irenab> exactly
13:43:41 <baoli_> @irenab: I think that I kind of understand your concern now. Basically, that's another requirement from nova's scheduling point of view
13:43:44 <HenryG> And the question is how we give nova this hint.
13:44:08 <itzikb> If not alias then how?
13:44:32 <baoli_> Now you are saying physical network connectivity should be part of Nova's scheduling decision
13:44:34 <yamahata> So neutron should know the network topology and the differences between PCI devices, and tell nova?
13:45:07 <sadasu> I have the same opinion as yamahata.
13:45:07 <HenryG> yes, I think there will need to be some nova <-> neutron interaction
13:45:12 <irenab> another option may be to define dedicated aliases for each group
13:45:23 <sadasu> And can be scaled to other types of pci passthrough devices
13:46:41 <sadasu> @irenab can you give some specifics on this?
13:46:57 <baoli_> PCI alias can be enhanced to implicitly indicate a network segment
13:47:49 <HenryG> adding the info in the alias would make it easy, but it feels a little hacky
13:47:50 <irenab> there can be an alias per group of devices that should be used to connect to each physical network
13:48:05 <baoli_> @irenab: I think so
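[Editor's note: irenab's per-group idea would look roughly like one alias per physical network; this is hypothetical syntax, not an implemented feature:]

```ini
# Hypothetical: one alias per group of VFs wired to the same physical
# network, so a request for "intel-vf-physnet1" can only land on hosts
# with connectivity to physnet1. Since both entries share the same
# vendor/product IDs, each host would still need a per-host device
# list to split the VFs between groups -- the management pain irenab
# notes below.
pci_alias = {"vendor_id": "8086", "product_id": "10ca", "name": "intel-vf-physnet1"}
pci_alias = {"vendor_id": "8086", "product_id": "10ca", "name": "intel-vf-physnet2"}
```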
13:48:25 <HenryG> it also brings network topology info into nova, which I am uneasy about
13:48:39 <yamahata> HenryG, I agree. There are other use cases for nova <-> neutron interaction.
13:48:43 <irenab> but it probably will be a lot of pain to manage the lists of devices per Host
13:48:51 <baoli_> @HenryG: not really
13:48:51 <sadasu> and who has the alias -> devices -> physical network mapping?
13:49:11 <sadasu> if nova...then I think we are breaking nova/neutron design
13:49:38 <baoli_> PCI alias is a way to represent PCI devices to Nova
13:49:57 <baoli_> Nova doesn't have any idea about the network topology
13:49:58 <sgordon> #info We have a bunch of network PCI devices that all look the same to nova (same alias). But we want nova to schedule a device that is connected to a particular network, which is true of only a subset of the devices.
13:50:20 <itzikb> @sadasu: This is because it's both a pci device and a NIC. I don't see a way around it
13:51:10 <baoli_> @sgordon: one way to do it: nova boot --flavor m1.large --image <image-id> --nic net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
13:51:20 <sadasu> nova should know only alias....neutron should know alias->devices->physical network mapping
13:52:11 <irenab> so it seems there should be some neutron agent per host to manage SRIOV NICs, am I right?
13:52:13 <baoli_> the PCI alias will be used to schedule the instance
13:52:14 <sgordon> the nova api extensions for passthrough dont expose the alias atm do they?
13:52:28 <baoli_> @irenab: why?
13:52:36 <HenryG> How about something like this: Nova builds its list of PCI devices. When it discovers a device that is a NIC, it asks neutron which network the NIC is connected to, and adds this info to the alias. ?
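[Editor's note: HenryG's proposal can be sketched as below; the neutron lookup is stubbed out, and all names are hypothetical:]

```python
# Sketch: when nova discovers a PCI device that is a NIC, it asks
# neutron which physical network the NIC is attached to and records
# that on the device record, so the scheduler can use it later.

def annotate_with_physnet(devices, neutron_lookup):
    """Attach a 'physical_network' key to each NIC device record."""
    for dev in devices:
        if dev.get("dev_type") == "NIC":
            dev["physical_network"] = neutron_lookup(dev["address"])
    return devices

# Stub standing in for the nova <-> neutron interaction discussed above.
def fake_neutron_lookup(pci_address):
    mapping = {"0000:06:00.1": "physnet1", "0000:06:00.2": "physnet2"}
    return mapping.get(pci_address)

devs = [
    {"address": "0000:06:00.1", "dev_type": "NIC"},
    {"address": "0000:00:1f.2", "dev_type": "storage"},  # left untouched
]
annotate_with_physnet(devs, fake_neutron_lookup)
print(devs[0]["physical_network"])  # physnet1
```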
13:52:40 <sgordon> (talking about the proposed one here, they didnt make havana)
13:53:37 <baoli_> @HenryG: it only needs to know the alias and the number of resources in the alias for scheduling purposes.
13:53:40 <irenab> @baoli_: Neutron should know each host's NICs to be able to answer nova requests, following the proposal above
13:54:09 <sadasu> @baoli_ how does Nova make sure pci-alias, port-profile are correct for net-id?
13:55:15 <baoli_> @sadasu: a net-id is associated with a VLAN, which is also defined in the port-profile. Neutron will make sure they are the same.
13:55:19 <sadasu> @irenab agreed.
13:57:15 <yamahata> baoli_, if the port is VF, Neutron (agent) will change the config of VF to be bound to the VLAN?
13:57:55 <irenab> @yamahata: in the case of the HW VEB, yes
13:58:38 <irenab> I think that we figured out that a neutron agent is also required for NIC PCI device discovery
13:58:50 <sadasu> @baoli_ agreed...just wanted to point out that nova *checking* with neutron is unavoidable
13:58:53 <baoli_> @yamahata: in the case of libvirt, binding a VF to a VLAN means assigning a port profile to the VF
13:59:47 <yamahata> baoli_, got it. thanks
14:00:06 <irenab> in our case we configure VF locally, so agent will be a must
14:01:03 <HenryG> irenab: do you plan to use a driver in ML2 for managing your agent?
14:01:19 <baoli_> @irenab: whether or not an agent is needed is specific to the node
14:01:20 <irenab> @HenryG: yes
14:01:20 <sadasu> @irenab agreed!
14:01:29 <HenryG> irenab: excellent!
14:03:16 <irenab> what I am still not getting is how to get the scheduler to consider network connectivity
14:03:30 <baoli_> @irenab: PCI alias
14:03:35 <irenab> currently it uses the pci alias
14:04:44 <HenryG> irenab: it is not obvious yet what is the best or most desirable way
14:04:53 <baoli_> You define a PCI alias so that you can create a nova instance that makes use of the resources defined in that alias
14:05:15 <irenab> @baoli_: pci alias is a pool of devices with the same physical network connectivity
14:05:20 <irenab> ?
14:05:40 <HenryG> yes, that is the "easy" way that I mentioned
14:06:10 <HenryG> but I will say again, do we want nova to have this network knowledge?
14:06:11 <irenab> the IT guy won't like it
14:06:48 <irenab> @HenryG: I think we don't
14:07:04 <baoli_> @HenryG: nova doesn't have this network knowledge, it only has the PCI alias
14:08:05 <HenryG> but to create the aliases correctly, you need network knowledge
14:08:32 <sadasu> I think we need to involve nova scheduler folks to get a better solution
14:08:37 <baoli_> When you design a cloud, you need to have that in mind
14:09:02 <HenryG> If someone moves cables around, they have to remember to update both neutron config and nova alias config
14:10:05 <irenab> @HenryG: I think networking info should be kept at neutron config
14:10:29 <HenryG> irenab: yes, that is my preference too
14:10:30 <irenab> @sadasu: agree
14:11:02 <sadasu> with the number of pci-passthrough ports per physical port, putting the overloaded pci alias burden on the IT is a bit much
14:11:40 <HenryG> we also need to make sure we do not derail the effort to remove nova networking
14:12:15 <sadasu> and how neutron provides this info to the scheduler is what is up for discussion IMHO
14:12:22 <baoli_> @HenryG: what we have discussed so far doesn't involve nova networking
14:12:27 <sgordon> #agree Networking info should be kept in Neutron
14:12:52 <sgordon> #info Need to determine how to impact scheduling
14:13:16 <irenab> I feel we need scheduler devs to progress here
14:13:19 <itzikb> I'm missing something - how will neutron keep the devices -> network mapping?
14:13:33 <sgordon> is there a design session scheduled for passthrough?
14:13:49 <sgordon> yes, as it turns out
14:13:52 <irenab> on Thursday morning
14:13:56 <sgordon> yah
14:14:01 <sgordon> #link http://icehousedesignsummit.sched.org/event/5d5b92daa0fff4e1d0a070a5c7397650#.UnO27PEYrTc
14:14:17 <sadasu> @sgordon thanks for the link
14:14:27 <irenab> but its general passthrough not specific networking
14:14:34 <sgordon> yeah
14:14:43 <HenryG> itzikb: such mappings are usually done by config (.ini files)
14:14:48 <baoli_> I ran out of time to put a session in
14:14:50 <sgordon> networking is heavily involved in the next step here though
14:15:04 <sgordon> and the nova api for pci isnt in havana so needs to be finalized for icehouse
14:15:23 <sgordon> i think this use case will need to be considered in that session
14:15:39 <itzikb> @HenryG: which one?
14:16:00 <sgordon> #chair HenryG
14:16:01 <openstack> Current chairs: HenryG baoli_ irenab sgordon
14:16:10 <irenab> Maybe it's worth having an unconference discussion during the summit?
14:16:32 <sgordon> irenab, +1 - perhaps after the more general pci discussion though?
14:16:47 <irenab> @sgordon: agree
14:17:26 <irenab> sorry, I have to leave the session in 3 minutes.
14:17:38 <sgordon> yeah i cant stay much longer either
14:18:06 <HenryG> itzikb: in neutron.conf for bridge-physical mappings, and in vendor .ini files for physical switch port mappings
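[Editor's note: the mappings HenryG refers to look roughly like this; the OVS bridge_mappings key is real ML2 agent config, while the switch-port section is a hypothetical vendor example:]

```ini
# ml2_conf.ini (or the OVS agent config) -- illustrative
[ovs]
bridge_mappings = physnet1:br-eth1,physnet2:br-eth2

# Vendor-specific .ini: which physical switch port each host NIC uses.
# The format here is hypothetical; real files are vendor-defined.
[switch]
host1_eth2 = switchA:port12
```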
14:18:32 <baoli_> @sgordon: https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base
14:19:27 <baoli_> Ok, let's continue the discussion during the summit
14:19:37 <sgordon> baoli_, yeah this is what i have been working with atm, plus the api extension that is on github
14:19:37 <itzikb> Thanks
14:19:51 <irenab> thank you for the great discussion, see you at HK
14:20:09 <sgordon> #endmeeting pci_passthrough_network