14:00:49 <baoli> #startmeeting PCI Passthrough
14:00:50 <openstack> Meeting started Tue Jan  7 14:00:49 2014 UTC and is due to finish in 60 minutes.  The chair is baoli. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:51 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:53 <openstack> The meeting name has been set to 'pci_passthrough'
14:00:57 <heyongli> hello
14:00:58 <ijw> o/
14:01:08 <baoli> hello, everyone
14:01:18 <irenab> hi
14:02:00 <baoli> Let's wait for a couple more minutes
14:02:04 <heyongli> seems John not here,
14:02:25 <baoli> He asked about this week's meeting time
14:02:35 <baoli> I was hoping that he would join
14:03:13 <ijw> Mailed him, anyway
14:03:24 <baoli> thanks, ijw
14:04:02 <baoli> About today's agenda, I sent an email to the mailing list.
14:04:58 <ijw> I wondered if we could do it a slightly different way, work out what we agree on and get those bits moving
14:05:09 <heyongli> i missed that, could you please paste here , sorry
14:05:19 <heyongli> ijw, +1
14:05:29 <ijw> baoli's mail basically said we should try and clear up what we're not settled on
14:05:42 <irenab> baoli: regarding daily meeting, do you want to suggest time?
14:05:49 <baoli> Same time?
14:06:01 <ijw> We probably need our own channel for it
14:06:03 <irenab> baoli: one hour earlier better
14:06:27 <baoli> Irenab, that sounds good to me.
14:06:38 <heyongli> what's your guys' time? i also want one hour earlier, but yunhong will miss this
14:07:03 <baoli> Yes, it's not a good time for Yunhong
14:07:32 <ijw> I'm CET so I'm easy
14:07:38 <ijw> This is midafternoon
14:07:47 <baoli> Regarding the agenda, we should agree on the key design points, which we haven't been able to achieve yet.
14:08:11 <baoli> Let's get started
14:08:17 <irenab> I am at GMT+2
14:08:53 <ijw> I guess we'll want to skip Friday too, irenab?
14:09:07 <irenab> and Saturday :)
14:09:32 <baoli> So agreed, let's do one hour earlier. It will be great if yunhong can join. Otherwise, he can comment on the logs.
14:09:49 <irenab> baoli: agree
14:09:52 <ijw> cool
14:09:54 <heyongli> sure
14:10:04 <baoli> #topic auto discovery
14:10:53 <ijw> Right, what we agree on is that we need to find all the PCI devices, match them by a match expression, and then assign a group/flavor/whatever we like to call it to them
14:11:20 <baoli> two key points: first, discover class of VFs; second, define default pci groups based on their class
14:12:16 <ijw> How would you mark a PCI device as being up for discovery and passthrough without also labelling it a group?  (I know you covered this but remind me)
14:14:06 <irenab> I also want to verify if we consider auto-discovery as a show stopper, since if we can provide a way to use deployment tools to define these groups it can be resolved later.
14:14:32 <baoli> ijw, one simple way is to set a nova config variable, something like sriov_auto = true. Once that's said, all the SRIOV VFs will be used for PCI passthrough.  by doing so, configuration of whitelist is not required
14:14:58 <ijw> OK - but that's a refinement too, really (nice but not necessary)
14:15:32 <ijw> irenab: I don't think it's the end of the world if we don't have it, but heyongli, yunhong and co have already done most of the discovery work in the sense of finding devices, I thought, it's just a matter of how we use the information that's up for debate
14:15:57 <irenab> ijw: OK.
14:16:23 <baoli> ijw, configuration of whitelist with some matching criteria as described in the google doc will be a separate task.
14:16:37 <heyongli> ijw: yeah, with auto discovery of the class we can group by the class's name
14:16:55 <ijw> OK.
14:17:11 <baoli> guys, let's focus on the two points I mentioned earlier.
14:17:46 <heyongli> discover class of VF: sure
14:17:47 <irenab> so for networking , how the auto-discovered VFs will be used with neutron?
14:17:57 <ijw> baoli: your points are great but they wouldn't solve my use cases - we either need the static list that irenab is talking about or the dynamic list based on autodiscovery and both of those fairly promptly, I'm afraid
14:18:04 <irenab> PCI-group = net?
14:18:21 <heyongli> yeah
14:18:34 <ijw> irenab: I think baoli's case is the simplest one for now, where all network devices are equivalent, then we refine the code from there
14:19:03 <baoli> irenab, if a VF has a 'net' class, then it's used for networking. We define a default group called "network"
14:19:38 <baoli> So the second point is that a SRIOV VF belongs to a default (or pre-defined) group
14:19:40 <irenab> baoli: so actually need to make sure that there is only one SRIOV NIC on the Host?
14:19:46 <ijw> And you have to use VFs, and you have to be careful that your Openstack control interfaces are *not* VFs - which is why I think that's quite a limited use case
14:20:04 <baoli> Irenab, it doesn't have to be
14:20:15 <heyongli> irenab: you can group by 'net' and the 'PF'
14:20:17 <baoli> ijw, the rest can be built on top of that
14:20:43 <ijw> baoli: I couldn't test what you're proposing
14:20:49 <irenab> heyongli: understand
14:21:05 <baoli> ijw, can you elaborate?
14:21:29 <ijw> You'd need VF configured NICs which I generally don't have on the machines I have handy
14:22:08 <baoli> ijw, can you give more details?
14:22:12 <ijw> Which is why I'd like to get to the case where we can pick and choose our groups quite quickly.
14:22:41 <baoli> About your machine, what do you mean about VF configured NICs?
14:22:53 <ijw> I'd like to help with the testing and dev, but if we can't pass through an entire PCI device that's not operating in SRIOV then that's not going to be possible
14:23:29 <irenab> ijw: for our case, only VF is relevant as vNICs
14:23:33 <heyongli> ijw: passing through the PF is also possible, why can we not?
14:23:54 <heyongli> ijw, even if not VF enabled
14:24:16 <ijw> We can, absolutely, but I think baoli's proposing that we automatically detect and group NICs that are VFs and call that the starting point.  A better way of phrasing what I'm saying is - that's totally fine, but I think we have to move on from there quickly
14:24:21 <irenab> ijw: are you talking about PCI for networking or the general one?
14:25:43 <ijw> Right, I'm derailing this.  Ignore what I said.  I would sooner see us take the step forward.  baoli, let's do what you suggest.
14:26:27 <heyongli> with autodiscovery of the class used by the whitelist group, we can flexibly define the group in any way you want
14:26:47 <baoli> ijw, thanks. as a matter of fact, if we have enough helping hands, whitelist configuration with match criteria can be worked on at the same time
14:27:19 <baoli> Yongli, let's discuss that shortly
14:27:23 <heyongli> baoli: +1
14:28:20 <baoli> Sounds like we'll be doing auto discovery with predefined pci groups?
14:28:51 <ijw> I would see baoli's case as ultimately being a default rule in the grouping that we override, by the time we've finished - sound about right?
14:29:25 <heyongli> baoli: i don't like predefined group,just let whitelist do that
14:29:47 <irenab> heyongli: can you elaborate?
14:29:49 <baoli> ijw, whitelist configuration can use any pci group that is created
14:30:27 <baoli> yongli, default or predefined pci group is the basis of the discussion
14:30:40 <heyongli> why predefine?  in whitelist you say "class=net group to a" is better
14:30:45 <ijw> So to clarify terms, we're talking about a whitelist being the matching expression and the group as being the tag the matching expression assigns, right?
14:30:59 <baoli> ijw, correct
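The terms agreed on just above can be sketched as follows: a whitelist entry is a match expression over device properties, and the group is the tag that a matching entry assigns. This is only a minimal illustration, not nova's actual implementation. The rule format, the property names (`class`, `vendor_id`), the group names, and the function `assign_group` are all hypothetical, and treating match values as regular expressions is an assumption.

```python
import re

# Hypothetical whitelist: each rule pairs a match expression with a group tag.
# Match values are treated as regular expressions (an assumption of this sketch).
WHITELIST = [
    ({"class": "net", "vendor_id": "8086"}, "intel-net"),
    ({"class": "net"}, "net"),
]


def assign_group(device, whitelist=WHITELIST):
    """Return the group of the first rule whose every key matches, else None."""
    for match, group in whitelist:
        if all(
            re.fullmatch(pattern, str(device.get(key, "")))
            for key, pattern in match.items()
        ):
            return group
    # No rule matched: the device is not whitelisted for passthrough.
    return None


if __name__ == "__main__":
    # Made-up example devices, for illustration only.
    for dev in (
        {"class": "net", "vendor_id": "8086"},
        {"class": "net", "vendor_id": "15b3"},
        {"class": "storage", "vendor_id": "8086"},
    ):
        print(dev, "->", assign_group(dev))
```

In this shape, a "default" rule such as `({"class": "net"}, "net")` is just one more whitelist line that an admin could keep, override with an earlier rule, or remove.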
14:31:41 <heyongli> so, it sounds like the group is defined by a predefined group, and this group is just a class's name; this is not necessary
14:32:02 <heyongli> prefer "class=net group to a" style
14:32:31 <ijw> heyongli: how close is that to working right now?
14:32:31 <baoli> yongli, doing that will allow a user to use SRIOV without configuring any thing
14:32:42 <heyongli> ijw: ready
14:33:18 <ijw> Then baoli can propose a set of default rules and we can debate the wisdom in a separate patch, can't we?
14:33:38 <heyongli> ijw: +1
14:33:49 <ijw> Saves us taking the meeting up
14:34:31 <irenab> heyongli: can you please share how net class is managed now?
14:34:46 <heyongli> net class is not ready
14:34:53 <baoli> Cool. One key point that I want to make is that the entire implementation should be based on PCI group
14:35:19 <heyongli> net class belongs to the auto discovery class we discuss here.
14:35:28 <ijw> baoli: +1
14:35:33 <heyongli> +1
14:35:52 <baoli> Great!
14:36:16 <ijw> One big point of debate was whether the group/whitelist should be in the database or on the compute node.  You know my opinion but where do we stand on that?
14:36:28 <baoli> We'll have predefined pci groups, and user can explicitly define groups
14:36:47 <baoli> ijw, let's come to that
14:36:50 <heyongli> anyway, they should be saved in the db
14:37:14 <heyongli> the only disagreement is whether we allow the API to modify that
14:37:16 <baoli> #topic pci group
14:37:35 <baoli> Agreed on predefined pci groups?
14:37:47 <heyongli> -1
14:37:53 <baoli> we'll have "net" for SRIOV
14:37:56 <irenab> baoli: based on class, yes
14:38:17 <irenab> baoli: any other criteria?
14:38:45 <heyongli> with the autodiscovery, we will have the new pci device k,v: class='net'
14:38:45 <ijw> I don't like them, but as I said, I think they can be implemented after the general case
14:39:02 <baoli> irenab, not really.
14:40:31 <heyongli> with this property, i suggest defining the group in the whitelist
14:41:01 <baoli> ijw, like I said before, it's the basis of the design. Please allow me to finish my thoughts on this
14:41:20 <heyongli> if we use some predefined group, this may need configuration for this: is it enabled
14:41:44 <heyongli> baoli, go on
14:42:32 <baoli> ok. So you have predefined pci groups, and you may or may not have user defined pci groups. A SRIOV device will anyway belong to a pci group.
14:43:09 <baoli> Therefore, you have everything in place to support PCI groups
14:43:13 <ijw> An SRIOV device may not want to be in a PCI group.  I could be using it for things other than passthrough.
14:43:23 <ijw> Which is fine if the default can be disabled.
14:43:25 <baoli> ijw, correct.
14:43:28 <heyongli> ijw: +2
14:43:43 <baoli> that's when the whitelist kicks in
14:43:55 <irenab> so maybe we need an exclude list?
14:44:07 <baoli> irenab, that's a good idea too
14:44:08 <ijw> But that's not the question.  The question is, clearly this default is one line in heyongli's scheme.  And I presumably need one line of config to enable it in the first place.  Why is it worth adding?
14:44:35 <heyongli> exclude: no, we have simple regular expression support for all values,
14:45:09 <heyongli> ijw: good point
14:46:10 <irenab> coming back to the basic requirement, we need to find and bind VF to specific physical network (can be defined by PF on each compute node). What should be defined with regards to VFs suitable for allocation?
14:46:55 <irenab> with as few configuration definitions as possible
14:47:28 <ijw> irenab: I see three possibilities for network binding.  One, all net devices in one group are bound to the same physical network.  Two, Neutron does the binding and programs the PF to do it.  Three, Neutron does the binding and needs some arbitrary information to do it (like switch port).
14:47:37 <heyongli> irenab: if we require more and more predefined things, ... this is not worth it
14:48:05 <irenab> ijw: I am not talking about configuration, but allocation
14:48:16 <ijw> How do you mean, allocation?
14:48:24 <irenab> nova should pick the VF, neutron should be responsible to configure it
14:48:47 <baoli> ijw, yongli, the point is that with predefined pci groups, you don't need to configure whitelist
14:49:09 <heyongli> nova picks the vf just by the requirements, not caring what it is
14:49:13 <heyongli> all defined in whitelist
14:49:41 <heyongli> baoli: i know, that's not worth it, and redundant
14:50:38 <baoli> yongli, if it's a pre existing knowledge, why do you need to add the line you proposed into the whitelist? is it a special syntax?
14:50:42 <ijw> baoli: your proposal is a refinement to save the Openstack configurer from adding a specific whitelist to match all network VFs because it makes the config a bit simpler for a commonly encountered case, right?
14:50:58 <baoli> ijw, not really
14:51:21 <baoli> ijw, it's good for both the coding and the admin
14:51:26 <ijw> irenab: on your allocation point, what precisely were you getting at?
14:51:26 <heyongli> baoli, i think ijw 's description is your idea exactly , to me
14:51:46 <heyongli> not good for coding, definitely
14:52:22 <irenab> ijw: just seems that I am not following what needs to be managed for the simple networking case where I need a VM with a VF NIC
14:52:26 <heyongli> you will need more configuration to define your predefined group
14:53:24 <irenab> ijw: scheduler should be able to choose the suitable Host to bring up the VM, meaning that it has access to this network and has available VF
14:53:43 <baoli> yongli, do you need a special syntax for it?
14:53:48 <irenab> ijw: this part is before neutron invocation
14:54:18 <ijw> OK, so this is just how we specify we need a NIC and then how we find somewhere that's got one for us
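The scheduling requirement irenab describes can be sketched as a host filter: keep only hosts that reach the requested physical network and still have a free VF in the requested group. This is a minimal sketch under stated assumptions. The `Host` structure, its fields, and `hosts_with_free_vf` are hypothetical and do not correspond to nova's scheduler API.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical view of a compute node as the scheduler might see it.
    name: str
    physical_networks: set
    free_vfs: dict = field(default_factory=dict)  # group name -> unallocated VF count


def hosts_with_free_vf(hosts, physical_network, group):
    """Names of hosts on the given physical network with a free VF in the group."""
    return [
        h.name
        for h in hosts
        if physical_network in h.physical_networks and h.free_vfs.get(group, 0) > 0
    ]


if __name__ == "__main__":
    # Made-up inventory, for illustration only.
    inventory = [
        Host("compute1", {"physnet1"}, {"net": 4}),
        Host("compute2", {"physnet1"}, {"net": 0}),
        Host("compute3", {"physnet2"}, {"net": 8}),
    ]
    print(hosts_with_free_vf(inventory, "physnet1", "net"))
```

Only the choice of host and VF happens here; configuring the chosen VF would be left to neutron, as discussed above.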
14:54:50 <heyongli> baoli: this predefined group, if you code it, is not worth it just to avoid a simple configuration in the whitelist
14:55:28 <baoli> yongli, can you explain why?
14:55:55 <irenab> ijw: for this case, we do not want the admin to go to each Host and configure white lists, right?
14:57:28 <irenab> baoli: we are running out of time. Please, send an update on time and place (IRC channel) for next meeting.
14:57:29 <heyongli> baoli: i just think that is not necessary, that would help admin a lot, even a little bit more constrain to be added to the pre defined group, and in this way, whitelist is more mass
14:57:51 <baoli> Well, time is almost up. Let's wrap up for today, and we'll resume UTC 1300 tomorrow.
14:58:03 <baoli> Agreed?
14:58:04 <ijw> irenab: actually, yes, I think we do - that's what an admin would do with provider networks, for instance, and hosts are not all identical
14:58:29 <irenab> ijw: can any tool be used here?
14:58:32 <ijw> It's better that, when you add a new compute node to the cluster, it brings its configuration with it, rather than that configuration being stored in the DB
14:58:50 <irenab> ijw: agree
14:58:55 <baoli> I think that we are confusing ourselves with different use cases
14:59:13 <ijw> irenab: when I do installs I use puppet, which makes it easy to set multiple machines up with the same, correct config, so storing it centrally in Openstack gets you nothing
14:59:38 <irenab> baoli: I guess we better put the list of use cases and go one by one
14:59:41 <ijw> If you're adventurous you can use an autodiscovery tool that audits the machine before it installs things and chooses config accordingly
15:00:00 <baoli> If you think about it, we can cover all the use cases discussed today with: predefined PCI groups, user-defined PCI groups, whitelist configuration.
15:00:03 <baoli> irenab: sure
15:00:05 <sc68cal> might have to take this into neutron - got a meeting at 10AM for the ipv6 team
15:00:06 <ijw> I think that's the time you'd want to get that sort of configuring done, not with an openstack API (and heyongli, this was my problem with a lot of the API you had)
15:00:08 <heyongli> ijw: good to know that
15:00:12 <irenab> ijw: need more clarification :)
15:00:13 <baoli> #endmeeting