13:04:01 <esberglu> #startmeeting powervm_driver_meeting
13:04:02 <openstack> Meeting started Tue Aug 29 13:04:01 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:04:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:04:06 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:04:18 <efried> \o
13:04:31 <efried> edmondsw is at VMWorld.
13:04:41 <efried> thorst_afk - you going to be here?
13:04:48 <mdrabe> o/
13:04:52 <thorst_afk> not really
13:05:02 <thorst_afk> will be catching up on the feed periodically
13:05:16 <esberglu> #link https://etherpad.openstack.org/p/powervm_driver_meeting_agenda
13:05:24 <esberglu> #topic In Tree Driver
13:05:37 <esberglu> #link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:05:40 <efried> Okay. I guess it'll be mostly a status update. esberglu when done, send link to minutes out so others can catch up.
13:06:45 <esberglu> I tested the config drive patch with FORCE_CONFIG_DRIVE=True
13:06:56 <esberglu> Everything seemed to be working as expected
13:07:06 <efried> Sweet. Have you seen my email?
13:07:20 <esberglu> efried: Yep. Ran it last night, 0 failures
13:07:29 <efried> Well, so here's the funky thing...
13:07:29 <esberglu> Need to port that to the IT patch
13:07:46 <efried> https://review.openstack.org/#/c/498614/ <== this passed our CI.
13:07:59 <efried> It oughtn't to have.
13:08:42 <esberglu> efried: Weird...
13:08:56 <efried> We would have been sending down AssociatedLogicalPartition links like http://localhost:12080/rest/api/uom/ManagedSystem/None/LogicalPartition/<real_uuid>
13:09:16 <efried> But... perhaps the REST side is just parsing the end off without looking too closely at the rest of it.
13:09:39 <efried> The only side effect I can think of would have been that we would be ignoring fuse logic in mappings
13:09:48 <efried> which just means every mapping ends up on its own bus.
13:09:57 <efried> which wouldn't manifest any problems in the CI, really.
13:10:20 <esberglu> efried: Want me to post the CI logs from the manual CI run to see if there's anything different going on?
13:10:32 <efried> No, if it passes, we won't be able to see anything useful in there.
13:11:00 <efried> Anyway, yeah, esberglu you want to pick up those two changes and run with 'em?
13:11:06 <efried> finish up UT and whatnot?
13:11:16 <esberglu> efried: Sure
13:11:33 <esberglu> #action esberglu: Port host_uuid change to IT config drive patch
13:11:52 <esberglu> #action esberglu: Finish UT for OOT host_uuid patch
13:12:03 <esberglu> That
13:12:08 <esberglu> that's it for IT?
13:12:39 <efried> For completeness, the pypowervm side is 5818; the nova-powervm change to be finished up and ported to the in-tree cfg drive change is https://review.openstack.org/#/c/498614/
13:13:09 <efried> #action esberglu UT for 5818 too
13:13:24 <efried> It's passing right now, but needs some extra testing for the stuff I changed.
13:13:35 <esberglu> efried: ack
13:13:55 <efried> Oh, and following up on vfc mappings. I don't have anything fibre-channely to test with. Do you?
13:14:38 <esberglu> Don't think so
13:14:57 <efried> Need to make sure the same logic (posting ROOT URIs to create mappings) also works for vfc, then make the same change in the vfc mapping bld methods.
13:15:06 <efried> Can be a followon change, I suppose
13:15:26 <efried> For the moment, we're just using it to trim down the cfg drive stuff, which is always vscsi.
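For anyone reading the minutes later, the two href shapes under discussion look roughly like this; the host, port, and UUID are placeholders, and the ROOT-URI form is a paraphrase of the intent of the fix, not a quote from the patch:

    lpar_uuid = '794654A6-B856-4077-B52C-69A1E6E84718'  # placeholder UUID

    # Shape the buggy path would have sent: a child URI under a bogus
    # ManagedSystem of "None"
    old_href = ('http://localhost:12080/rest/api/uom/ManagedSystem/None/'
                'LogicalPartition/' + lpar_uuid)

    # Shape the host_uuid change sends instead: a ROOT URI for the partition
    new_href = ('http://localhost:12080/rest/api/uom/'
                'LogicalPartition/' + lpar_uuid)

efried's guess above is that the REST server only parses the tail of the link, which would explain why the bogus form still passed CI.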
13:16:00 <esberglu> efried: Ok. We can loop back to that after this first wave is done and either add it in or start a new change
13:16:26 <efried> yuh. Perhaps someone from pvc can lend us a fc-havin system for a day or two.
13:16:36 <efried> mdrabe you got anything like that?
13:16:52 <mdrabe> Yes
13:17:02 <efried> nice
13:17:07 <mdrabe> But as far as lending I'm not sure :/
13:17:28 <efried> I actually would literally need it for half an hour
13:17:44 <mdrabe> All be consumed for pvc stories atm, in a few days they should be free I think
13:17:45 <efried> and would need a free fc disk I could assign to an LPAR (even the nvl)
13:18:11 <efried> Assuming that disk is free, the testing would be nondestructive.
13:18:24 <efried> I just need to create a mapping in a certain way and make sure it works.
13:18:40 <mdrabe> Testing with devstack though right?
13:18:43 <efried> no
13:18:45 <efried> just pypowervm
13:18:49 <efried> don't even care what level
13:18:55 <mdrabe> oh mkay
13:19:09 <mdrabe> That's good then, dm me
13:19:14 <efried> rgr
13:19:46 <efried> #action efried to validate vfc mappings work the same way (with ROOT URIs for AssociatedLogicalPartition) using mdrabe's setup.
13:19:49 <efried> aaaand...
13:20:03 <efried> #action efried to continue thorough review of cfg drive change
13:20:22 <efried> At some point I reckon we're gonna need thorst_afk to review it too.
13:20:54 <efried> All three of us have had hands on it, so approval is going to be like supermajority consensus.
13:21:07 <efried> Oh, wait, this is in tree.
13:21:17 <efried> So we all just get to +1 it anyway.
13:21:25 <efried> And community approval is going to entail...
13:21:48 <efried> #action esberglu to drive pvm in-tree integration bp for q.
13:22:07 <esberglu> efried: Yep
13:22:14 <efried> Not sure if you caught my parting shot yesterday on that, but: may want to ask mriedem in -nova whether he wants a fresh bp or just re-approve existing.
13:22:31 <esberglu> efried: Yep saw that was planning on putting that in motion today
13:22:37 <efried> coo
13:23:11 <esberglu> #topic Out Of Tree Driver
13:23:28 <mdrabe> https://review.openstack.org/#/c/471926 passed functional testing
13:23:49 <mdrabe> There's one issue uncovered from the testing left to be ironed out, but it's unrelated to the change
13:24:34 <efried> What was the issue?
13:24:38 <efried> Love it when test finds bugs.
13:24:49 <efried> It kinda justifies the whole existence of testing as a thing.
13:25:21 <mdrabe> One evacuation failed with the dreaded vscsi rebuild exception...
13:25:45 <efried> got bug?
13:26:09 <mdrabe> Can't link here, dming, sec
13:26:17 <efried> If it's RTC, I don't care.
13:26:28 <mdrabe> Heh k
13:26:28 <efried> Is it not a bug in the community code?
13:26:49 <efried> If so, we ought to have a lp bug for it.
13:26:54 <mdrabe> It's been some time since I've looked at it
13:27:05 <efried> oh, is it an old bug?
13:27:09 * efried confused
13:27:23 <mdrabe> No, I just mean within a week's timeframe
13:27:28 <mdrabe> I forget things quickly, sorry
13:28:18 <efried> mdrabe Okay, well, I'm not in a huge hurry to get a lp bug opened, but if the changes are going into nova-powervm, that should happen eventually (before we merge it).
13:28:47 <mdrabe> efried: The exception that was raised was this one: https://github.com/powervm/pypowervm/blob/develop/pypowervm/tasks/slot_map.py#L665
13:29:00 <mdrabe> For 1 out of 5 evacuations
13:29:32 <efried> As in, we couldn't find one of the devices on the target system?
13:29:37 <mdrabe> Right
13:29:42 <efried> Uhm.
13:29:47 <efried> So first of all, 1/5 ain't good.
13:29:58 <mdrabe> And I _think_ I recall seeing LUA recovery failures in the logs
13:30:02 <efried> Second, upon what are you basing your assertion that this is unrelated to your change?
13:30:16 <mdrabe> Because it's not related to the slot map
13:31:21 <efried> even though ten out of the 13 or so LOC leading up to that exception have 'slot' in 'em?
13:31:49 <mdrabe> but
13:31:52 <mdrabe> ok
13:32:04 <mdrabe> I'll -1 WF until we resolve it
13:32:35 <mdrabe> efried: fair?
13:32:46 <efried> I had put a +2 on it, but yeah, I think we should follow up first.
13:34:21 <esberglu> Reminder that pike official release is tomorrow
13:34:30 <esberglu> That it for OOT?
13:34:48 <efried> other than pci stuff, I think so.
13:35:10 <esberglu> #topic PCI Passthrough
13:35:18 <efried> okay
13:35:28 <efried> Lots to catch up on here since last week.
13:36:11 <efried> First of all, last week I got a prototype successfully claiming *and* assigning PCI passthrough devices during spawn.
13:36:38 <efried> Were any of y'all in the demo on Friday?
13:36:58 <esberglu> Yeah
13:36:58 <mdrabe> nope
13:37:32 <efried> The nova-powervm code is here: https://review.openstack.org/#/c/496434/
13:38:09 <efried> And I'm actually not sure ^ relies on any pypowervm or REST changes, as currently written.
13:38:19 <efried> despite what the commit message says.
13:38:29 <efried> now
13:39:12 <efried> REST has merged the change that lets us assign slots on LPAR PUT. Which means I can remove the hack here: https://review.openstack.org/#/c/496434/3/nova_powervm/virt/powervm/vm.py@573
13:40:37 <efried> Also the much-debated PCI address spoofing I think I'm gonna keep in nova-powervm (abandoned 5755 accordingly) because...
13:40:43 <efried> All of this is going to be temporary
13:40:50 <efried> It may not even survive queens, gods willing.
13:41:02 <mdrabe> efried: I forget, through what API do we assign PCI devices after spawn?
13:41:46 <efried> mdrabe Before that REST fix? IOSlot.bld and append that guy to the LPAR's io_config.io_slots. Then POST the LPAR.
13:42:11 <openstackgerrit> Eric Berglund proposed openstack/nova-powervm master: DNM: ci check https://review.openstack.org/328315
13:42:43 <mdrabe> efried: And that's triggered by an interface attach from an openstack perspective?
13:43:16 <efried> mdrabe No, actually, I'm not sure what happens during interface attach - should probably look into that.
13:43:40 <efried> No, in openstack the instance object we get passed during spawn contains a list of pci_devices that have been claimed for us.
13:43:49 <openstackgerrit> Eric Berglund proposed openstack/nova-powervm master: DNM: CI Check2 https://review.openstack.org/328317
13:44:24 <efried> Via the above change sets, we're culling that info and sending it into LPARBuilder (curse him).
13:44:51 <efried> mdrabe Is that what you were looking for?
13:45:12 <mdrabe> I'm just trying to understand the flows affected
13:45:38 <efried> Sure, definitely worth going over in more detail, let's do that.
13:46:38 <mdrabe> Yea, I've been meaning to take some time to stare at this stuff, I'll probably ask better questions after I do that
13:46:45 <efried> Nova gets PCI dev info from three places:
13:46:52 <efried> => get_available_resource (in the compute driver - code we control) produces a list of pci_passthrough_devices as part of the json object it dumps.
13:47:35 <efried> => The compute process looks in its conf for [pci]passthrough_whitelist, which it intersects with the above to filter down to only devices you're allowed to assign to VMs.
13:48:16 <efried> => The nova API process looks in its conf (which may not be the same .conf as the compute process - took me a while to figure THAT one out) for [pci]alias entries, which it *also* uses to filter the above.
13:48:55 <efried> The operator sets up a flavor. In the flavor extra_specs he sets a field called pci_passthrough:alias whose value is a comma-separated list of <alias>:<count>
13:49:48 <efried> The <alias> names come from the [pci]alias config, and are how the op identifies what kinds of devices he wants on his VM. Those [pci]alias entries just map the alias name to a vendor/product ID pair.
13:49:57 <efried> And the <count> is how many of that kind of dev you want.
13:50:01 <efried> So
13:50:56 <efried> When you do a spawn with that flavor, nova looks at the pci_passthrough:alias in the flavor, maps it to the vendor/product ID, and then goes and looks in the filtered-down pci_passthrough_devices list for devices that match.
13:51:12 <efried> Meanwhile it's keeping track of how many of those kinds of devices it has claimed and whatnot.
13:51:36 <mdrabe> Ok so adding/removing PCI devices is triggered through resize
13:51:38 <efried> So assuming it finds suitable devices, it decrements their available count and assigns 'em to your instance.
13:51:50 <efried> Yes, I believe that's the case, though I haven't explicitly tried it yet.
13:52:17 <mdrabe> That makes me wonder how this works with SR-IOV
13:52:32 <efried> To come full circle: nova puts the specific devices it claimed into your instance object that it passes to spawn, which is where our code again gets control.
13:52:50 <efried> Yeah, SR-IOV is going to be a different story
13:52:59 <efried> Especially since we're not doing the same thing nova does with SR-IOV.
13:53:16 <efried> But much of the flow is the same.
13:53:54 <efried> pci_passthrough_devices is *supposed* to register each VF as a child of its respective PF.
13:54:05 <efried> So you could claim a VF and the matching is done based on the parent.
13:54:32 <efried> But when you're doing that as part of network interface setup, things go off the rails a bit.
13:55:02 <efried> Now it starts looking for a physical_network tag on your device and trying to bind a neutron port with that network and all that jazz.
13:55:40 <efried> In the rest of the world, you have to pre-create VFs, and they're passed through explicitly one by one and assigned directly to the VM.
13:56:01 <efried> In our world... we don't have the VFs until we need 'em, and even then, they're not assigned directly to the VM.
13:56:49 <efried> So we have to fool the pci manager by spoofing "fake" VFs in our pci_passthrough_devices list. We just create however many entries according to the MaxLPs on the PF.
13:57:23 <mdrabe> Right okay, I'm stuck in the PowerVM perspective
13:58:12 <efried> Yeah, so when we do a claim with SR-IOV, nova actually hands us one of those fake VFs, but we ignore it and just create our VNIC on the appropriate PF.
13:58:55 <efried> This is probably enough historical treatise. The aforementioned PoC code gives me confidence that we can make this work in q without community involvement. Which is not bad.
13:59:01 <efried> But it also ain't pretty.
13:59:16 <efried> The main ugliness is that we have to spoof our PCI addresses.
13:59:41 <efried> Because nova refuses to operate without a Linuxy PCI address in <domain>:<bus>:<slot>.<func> format.
13:59:54 <efried> Our devices don't have those. We have DRC index and location code.
14:00:09 <efried> Linuxy PCI addresses are 32-bit. DRC index is 64-bit.
14:00:42 <mdrabe> What determines the DRC index for us?
14:00:46 <efried> PHYP
14:00:46 <mdrabe> phyp?
14:01:25 <efried> So I started down a path of suggesting some changes to nova's pci manager that would allow us to use our DRC index (or location code, or whatever we wanted) to address and identify devices.
14:01:42 <efried> https://review.openstack.org/497965
14:02:38 <efried> It was basically shot down as being an interim hackup that would be superseded by the move to placement and resource providers.
14:03:04 <efried> Which is really what I was going for in the first place. I wanted to garner some attention and discussion that would get us moving in that direction.
14:04:07 <efried> The upshot is that we (I believe Jay is the nova core most invested in this) want to make devices (not just PCI - any devices) managed through the placement and resource provider framework.
14:04:36 <efried> In that nirvana, our compute driver provides a get_inventory method, which replaces get_available_resource.
14:05:24 <efried> The information contained therein is able to represent any resource generically, and the nova code doesn't try to introspect values and do stuff with 'em like it is doing today for PCI addresses and whatnot.
14:05:54 <mdrabe> That sounds like the way to go
14:06:10 <efried> That work is off the ground at this point in nova, for resources like vcpu, mem, and disk.
14:06:18 <efried> There's also some support for custom resource classes.
14:06:23 <efried> So
14:06:57 <efried> Jay and I are working up content for discussion at the PTG toward making devices managed by the same setup.
14:07:18 <mdrabe> Cool
14:08:22 <esberglu> Good discussion. We ready to move on?
14:08:33 <efried> A resource provider would describe the devices it has available; those devices would have qualitative and quantitative properties. Nova would get a spawn request asking for a device with certain qualitative and quantitative properties. Placement and scheduler and claims and family would just match those values (again, blindly, not introspecting the values) and give us the resources.
14:08:51 <efried> And we get the helm back in our driver and do whatever we want with those claimed resources.
14:09:29 <mdrabe> I feel much more informed than I did an hour ago
14:09:53 <esberglu> Same
14:10:02 <efried> So my action this week is going to be collating some of these notes and stuff, creating an etherpad for the PTG, and perhaps putting some of it down in a blueprint https://blueprints.launchpad.net/nova/+spec/devices-as-resources whose spec is here: https://review.openstack.org/#/c/497978/
14:10:51 <mdrabe> efried: Is the resource provider change targeted for q?
14:11:02 <efried> Well, that's what I don't know.
14:11:09 <efried> I'm sure it will be targeted for q.
14:11:15 <efried> Whether it will get done in q is another question.
14:11:19 <efried> So
14:11:31 <efried> We need to be prepared to move forward with our hacked version
14:11:44 <efried> And we can transition over as able.
14:11:48 <efried> It's a big piece of work.
14:12:09 <efried> So I suspect that even if it gets done in q, it'll get done late in the cycle, possibly too late for us to exploit it fully ourselves.
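To make the whitelist/alias/flavor chain efried walked through above concrete, here is roughly what the configuration looks like; the vendor/product IDs and the alias name are invented for illustration:

    # nova.conf on the compute node
    [pci]
    passthrough_whitelist = {"vendor_id": "10df", "product_id": "e228"}

    # nova.conf seen by the nova API process (possibly a different file,
    # per efried's note above)
    [pci]
    alias = {"vendor_id": "10df", "product_id": "e228", "device_type": "type-PCI", "name": "myfc"}

The operator then asks for one such device per instance through the flavor extra spec:

    openstack flavor set my-flavor --property "pci_passthrough:alias"="myfc:1"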
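Pulling together the "fake VF" and address-spoofing points from the discussion above, a hypothetical sketch of what the spoofed pci_passthrough_devices entries might look like; the dict keys follow nova's PCI device reporting format, but the DRC-index-to-address packing and the helper itself are invented for illustration, not the actual nova-powervm PoC code:

    def fake_vf_entries(pf_drc_index, max_lps, vendor_id, product_id):
        # Spoof one "VF" entry per potential logical port on an SR-IOV PF.
        # The addresses are fabricated: nova's pci manager insists on a
        # <domain>:<bus>:<slot>.<func> string, so a wider PowerVM identifier
        # has to be squeezed into that space somehow (collision avoidance is
        # the hard part and is glossed over here).
        parent_addr = '%04x:%02x:%02x.%x' % (
            (pf_drc_index >> 16) & 0xffff, (pf_drc_index >> 8) & 0xff,
            (pf_drc_index >> 3) & 0x1f, pf_drc_index & 0x7)
        entries = []
        for lp in range(max_lps):
            entries.append({
                'address': '%04x:%02x:%02x.%x' % (
                    (pf_drc_index >> 16) & 0xffff,
                    (pf_drc_index >> 8) & 0xff,
                    (lp >> 3) & 0x1f, lp & 0x7),
                'parent_addr': parent_addr,
                'dev_type': 'type-VF',
                'vendor_id': vendor_id,
                'product_id': product_id,
                'label': 'fake_vf_%d' % lp,
            })
        return entries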
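And for the placement direction efried describes, a rough sketch of the get_inventory() shape in the Pike-era virt driver interface; the numbers are made up and the custom resource class at the end is hypothetical, just showing where generically-modeled devices would slot in:

    def get_inventory(self, nodename):
        # Inventory is keyed by resource class; placement matches requests
        # against these values without introspecting them.
        return {
            'VCPU': {'total': 16, 'reserved': 0, 'min_unit': 1,
                     'max_unit': 16, 'step_size': 1, 'allocation_ratio': 16.0},
            'MEMORY_MB': {'total': 65536, 'reserved': 512, 'min_unit': 1,
                          'max_unit': 65536, 'step_size': 1,
                          'allocation_ratio': 1.0},
            'DISK_GB': {'total': 2048, 'reserved': 0, 'min_unit': 1,
                        'max_unit': 2048, 'step_size': 1,
                        'allocation_ratio': 1.0},
            # Hypothetical custom class for a passthrough device type
            'CUSTOM_PHYSICAL_FC_PORT': {'total': 2, 'reserved': 0,
                                        'min_unit': 1, 'max_unit': 1,
                                        'step_size': 1,
                                        'allocation_ratio': 1.0},
        }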
14:12:37 <efried> The really good news here is that Jay is very invested in this, and it fits with the overall direction nova is moving wrt placement and resource providers, so I don't doubt it's going to get done... eventually.
14:12:57 <efried> It's not just us whining "we need this for PowerVM".
14:13:40 <esberglu> Cool
14:13:50 <efried> Okay, I think that's probably enough of that for now. Any further questions, or ready to move on?
14:14:09 <efried> #action efried to write etherpad and/or spec content for nova device management as generic resources.
14:14:09 <esberglu> I might have questions later, I need to look through the code still
14:14:32 <esberglu> #topic PowerVM CI
14:14:37 <esberglu> Not much to report here
14:15:10 <esberglu> Still waiting for the REST change for the serialization issue
14:15:25 <efried> esberglu It's been prototyped, though?
14:15:33 <efried> And run through CI?
14:15:58 <esberglu> efried: Prototyped and run through CI, but not with the latest version of the code
14:16:10 <efried> 5775?
14:17:38 <esberglu> efried: I think it requires the related changes as well. Not 100% sure though, hsien deployed it
14:18:59 <esberglu> Other than that the compute driver was occasionally failing to come up on CI runs. The stacks on the undercloud for a few systems were messed up
14:19:09 <esberglu> I redeployed, haven't seen it since, gonna keep an eye out
14:20:00 <esberglu> Those were the only failures hitting CI consistently, so failure rates should be pretty low now
14:20:16 <esberglu> Well not now, once that rest fix is in
14:20:30 <esberglu> That's all I had for CI
14:20:52 <esberglu> #topic Driver Testing
14:21:05 <esberglu> Jay isn't on. But he was having problems stacking last week
14:21:24 <esberglu> I got his system stacked, not sure if any further testing has been done on it yet
14:22:14 <esberglu> Nothing else to report there
14:22:34 <esberglu> #topic Open Discussion
14:22:45 <esberglu> That's it for me
14:22:52 <efried> nothing else here
14:23:40 <esberglu> Alright. See you here next week
14:23:50 <esberglu> #endmeeting