13:00:09 #startmeeting powervm_driver_meeting
13:00:10 Meeting started Tue Sep 5 13:00:09 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:11 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:13 The meeting name has been set to 'powervm_driver_meeting'
13:00:26 o/
13:00:30 \o
13:00:34 \o
13:00:35 o/
13:01:28 #link https://etherpad.openstack.org/p/powervm_driver_meeting_agenda
13:01:45 #topic In-Tree Driver
13:02:19 Planning on starting the pike spec today in the background
13:02:27 Might have some questions
13:02:33 efried did you mark the pike one implemented?
13:02:38 queens spec
13:02:47 I think mriedem did
13:03:11 cool... I thought he had, but then you were talking about it on Friday and I assumed I'd missed something
13:03:26 oh, I didn't wind up moving it from approved/ to implemented/ because it appears there's a script to do that for all of them, and I don't think that's my responsibility.
13:03:49 well, to rephrase, I'm not sure it would be appreciated if I proposed that.
13:03:50 oh, interesting
13:04:10 as long as we're in sync with everything else
13:04:19 Yeah, nothing has been moved yet.
13:04:28 esberglu how is config drive going?
13:05:11 edmondsw: Good I think I'm ready for the first wave of reviews on the IT patch
13:05:27 what have you done to test it?
13:06:18 I assume there are functional tests that we can start running in the CI related to this... have you tried those with it?
13:06:19 Still need to finish up the UT pypowervm side for removal of the host_uuid which is needed for that
13:06:30 Yeah there is a tempest conf option FORCE_CONFIG_DRIVE
13:06:38 I have done a couple manual runs with that set to true
13:06:58 And have looked through the logs to make sure that it is hitting the right code paths
13:07:07 Everything looked fine for those
13:07:08 cool
13:07:14 Manually you can also do a spawn and then look to make sure the device and its scsi mapping exist.
13:07:25 I have not done any testing of spawns from the cli
13:07:53 Only through tempest so far
13:08:21 k, let's try the CLI as well just to be safe
13:08:48 #action: esberglu: Test config drive patch from CLI
13:08:50 and make sure that what you tried to get config drive to do was actually done on the guest OS
13:09:12 e.g. setting the hostname
13:09:31 Yep will do
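For reference, a minimal sketch of the kind of CLI check being signed up for here, assuming a devstack environment; the image, flavor, and server names are placeholders, not values from the meeting:

```sh
# Hypothetical smoke test; image/flavor/server names are placeholders.
# In devstack, FORCE_CONFIG_DRIVE=True in local.conf typically maps to
# nova.conf's [DEFAULT]/force_config_drive, so every spawn gets a drive.
openstack server create --image cirros --flavor m1.tiny \
    --config-drive True cfgdrive-test

# Then, on the guest OS, the config drive is conventionally labeled
# "config-2"; mount it and check what was actually applied:
mkdir -p /mnt/config
mount /dev/disk/by-label/config-2 /mnt/config
cat /mnt/config/openstack/latest/meta_data.json   # hostname, keys, etc.
```

Comparing meta_data.json against the guest's actual hostname would cover the "actually done on the guest OS" part of the action item.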
13:09:38 efried: you started posting in nova
13:09:40 lol
13:09:50 thorst_afk Eh?
13:09:53 Posting what?
13:09:59 Does IT support LPM yet?
13:10:11 nm...I misread :-)
13:10:13 mdrabe not yet... that's one of our TODOs for queens
13:10:31 K, was just thinking about the problems that vopt causes
13:10:39 LPM might be ambitious for Q
13:11:06 efried oh, you're right... that was NOT a TODO for queens...
13:11:09 mdrabe We have that stuff solved for OOT; do you see any reason the solution would be different IT?
13:11:56 I don't think so
13:13:17 anybody else have something to discuss IT?
13:14:10 #topic Out-of-Tree Driver
13:14:28 Anything to discuss here?
13:14:35 Pursuant to above, the rename from approved/ to implemented/ is already proposed here: https://review.openstack.org/#/c/500369/2
13:14:47 (for the pike spec)
13:15:09 efried cool
13:15:18 mdrabe any updates on what you were working on?
13:16:27 The PPT ratio stuff has been delivered in pypowervm
13:16:59 I've tested the teeny nova-powervm bits, but I wanted to get with Satish on merging that
13:17:02 mdrabe and nova-powervm?
13:17:28 (Also the nova-powervm side has been merged internally)
13:17:30 k
13:17:45 anything else OOT?
13:18:08 I think I went through on Friday and cleaned up some oldies.
13:18:18 there is an iSCSI-related pypowervm change from tjakobs that I need to review
13:18:49 thorst_afk what are we doing with 5531?
13:19:08 edmondsw: looking
13:19:35 been sitting a while
13:19:42 I don't think we need that
13:20:01 and if we want it, we can punt to a later pypowervm
13:20:12 thorst_afk abandon?
13:20:14 but the OVS update proposed doesn't require it
13:20:20 yeah
13:20:22 can do
13:20:24 tx
13:20:29 anything else?
13:21:50 esberglu next...
13:21:53 #topic PCI Passthrough
13:22:04 Which should be renamed "device passthrough"
13:22:21 because?
13:22:30 Because it's not going to be limited to PCI devices.
13:22:38 what else?
13:22:53 No significant update from last week; been doing a brain dump in prep for the PTG here https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management but it's not really ready for anyone else to read yet.
13:23:27 At the end of last week I think I started to get the idea of how it's really going to end up working.
13:24:11 And I think there's only going to be a couple of things that will be device-specific about it (as distinguishable from any other type of resource)
13:24:47 One will be how to transition away from the existing PCI device management setup (see L62 of that etherpad)
13:25:03 The other will be how network attachments will be associated with devices when they're generic resources.
13:25:37 I'm going to spend the rest of this week thinking through various scenarios and populating the section at L47...
13:25:57 ...and possibly putting those into a nice readable RST that we can put up on the screen in Denver.
13:26:01 that sounds great
13:26:30 even just throwing up the etherpad would be great
13:26:38 The premise is that I believe we can handle devices just like any other resource, with some careful (and occasionally creative) modeling of traits etc.
13:26:51 So the goal is to enumerate the scenarios and try to describe how each one would fit into that picture.
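As a rough illustration of that premise (nothing here was settled in the meeting; the resource class and trait names are invented), a passthrough device might be modeled as a nested resource provider carrying an inventory of a custom resource class, with traits describing the device-specific qualities, assuming the placement API's inventory and trait endpoints:

```text
# Hypothetical placement requests; class/trait names are illustrative only.
PUT /resource_providers/{device_rp_uuid}/inventories
{
    "resource_provider_generation": 0,
    "inventories": {
        "CUSTOM_PCI_DEVICE": {"total": 1, "reserved": 0}
    }
}

PUT /resource_providers/{device_rp_uuid}/traits
{
    "resource_provider_generation": 1,
    "traits": ["CUSTOM_GPU", "CUSTOM_VENDOR_FOO"]
}
```

A flavor could then request the custom resource class and require the traits, which is what would let the scheduler treat devices "just like any other resource."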
13:27:11 Now, getting this right relies *completely* on nested resource providers.
13:27:21 Which aren't done yet, but which I think will be a focus for Q.
13:27:42 If they aren't already, the need for device passthrough will be a push in that direction.
13:28:35 so are we being pushed away from doing things first with the current state of things and then again later moving to resource providers?
13:28:37 Once that framework is all in place, the onus will be on individual virt drivers to do most of the work as far as inventory reporting, creation of resource classes and traits, etc.
13:28:55 What do you mean pushed?
13:29:17 is this our choice, or are comments from jaypipes and others making us have to go that way?
13:30:08 So as far as the powervm driver is concerned (both in and out of tree), I believe the appropriate plan is for us to implement our hacked up PCI passthrough using the existing PCI device management subsystem. Basically clean up the PoCs I've got already proposed.
13:30:14 and have that be our baseline for Q.
13:30:37 Then whenever the generic resource provider and placement stuff is ready, we transition. Whether that be Queens or Rocky or whatever.
13:30:50 ok, I misunderstood your intentions then
13:31:01 sounds good
13:31:23 I like doing what we can under the current system in case resource providers are delayed
13:31:28 but moving to that as soon as we can
13:31:39 Right; and to answer the other part of your question: yes, it's Jay et al (Dan, Ed, Chris, etc.) pushing for the way things are going to be.
13:32:11 I'm tracking it very closely, and imposing myself in the process, to make sure our particular axes are appropriately ground.
13:32:29 great, as long as they're not resisting patches that will get things working under the current system
13:32:30 But so far the direction seems sane and correct and generic enough to accommodate everyone.
13:32:40 Oh, I have no idea about that.
13:33:04 We'll have to throw those at the wall and see if they stick.
13:33:08 But we can at least get it done for OOT.
13:33:08 yeah
13:33:13 yep
13:33:28 Which is the important thing for us.
13:33:50 mdrabe need you to look at this resource providers future and assess PowerVC impacts
13:34:01 we can talk more about that offline
13:34:09 aye
13:34:19 Ready to move on?
13:34:27 I think so
13:34:38 yup
13:34:40 #topic PowerVM CI
13:34:59 Noticed things are somewhat unhealthy at the moment.
13:35:02 At least OOT.
13:35:25 https://review.openstack.org/#/c/500099/ https://review.openstack.org/#/c/466425/
13:35:50 neo19 is failing to start the compute service with this error
13:36:19 http://paste.openstack.org/show/620409/
13:36:37 http://ci-watch.tintri.com/project?project=nova says the last 7 OOT have passed
13:36:44 This persisted through an unstack and a stack. Anyone know what that's about? I asked in novalink with no response
13:36:50 As far as actual tempest runs
13:37:13 The fix for the REST serialization stuff doesn't appear to have solved the issue
13:37:22 esberglu anytime we see HTTP 500 we will have to look at pvm-rest logs
13:37:27 Still seeing "The physical location code "U8247.22L.2125D5A-V2-C4" either does not exist on the system or the device associated with it is not in AVAILABLE state."
13:37:37 Need to link up with hsien again for that
13:38:20 There is this other issue that has been popping up lately
13:38:25 http://paste.openstack.org/show/620411/
13:39:08 Other than that I spent some time looking at the networking related tempest failures. It seems to be an issue with multiple tests trying to interact with the same network
13:39:25 interesting...
13:39:27 I haven't wrapped my head around exactly what's going on there
13:39:38 esberglu That rootwrap one - is it occasional or consistent?
13:39:44 efried: Occasional
13:39:59 that's really weird. If there's no filter for `tee`, there's no filter for `tee`.
13:40:14 could be using wildcards that sometimes match and sometimes don't
13:40:23 but that is really weird
13:41:00 maybe part of the rootwrap setup is sometimes failing?
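For context on that guess, a sketch of the filter format only, not the actual file from the failing node: nova's rootwrap filters are INI files conventionally installed under /etc/nova/rootwrap.d/, and a command is only allowed to run as root if some filter entry matches it, e.g.:

```ini
# Hypothetical entry; the real filter file and its contents may differ.
[Filters]
# name: FilterClass, executable, run-as-user
tee: CommandFilter, tee, root
```

If stacking sometimes failed to install or overwrite these files, a "no filter matched" error for tee would come and go, which would fit the occasional failures described above.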
13:42:33 I haven't been keeping this page as up to date as I should, but I started transitioning some local notes to it
13:42:33 esberglu I'll try to help you with that offline
13:42:35 https://etherpad.openstack.org/p/powervm_tempest_failures
13:42:54 ++
13:43:11 esberglu could you stop taking local notes and just work out of that etherpad?
13:44:11 edmondsw: Yeah that's my plan. The formatting options aren't as robust as I would like but I can deal
13:44:40 just as much as possible
13:45:01 Anyone know what that neo19 issue is about? Might just reinstall unless anyone has an idea
13:45:31 (IIRC this error message has been seen before and fixed via reinstall)
13:45:52 Seems like we should open a defect and have the VIOS team look at that.
13:46:09 efried: k
13:46:23 That's all for CI
13:46:24 Can we bump neo19 out of the pool in the meantime?
13:46:31 efried: Yeah it already is
13:46:34 coo
13:46:46 #topic Driver Testing
13:47:00 Haven't heard from jay in a while, anyone know where that testing is at?
13:47:09 yeah, I've got updates here
13:47:14 we have lost Jay
13:47:31 he's been pulled off to other things
13:47:56 We may have someone else that can help here, or we may not... I will be figuring that out this week
13:48:42 longterm we can probably assume that testing will be whatever we can do as a dev team via tempest, with no specific tester assigned
13:49:17 any questions?
13:49:48 Not atm
13:50:50 #topic Open Discussion
13:51:00 Anything else this week?
13:51:01 we should all start thinking about tempest coverage and improving it where we can / should / have time
13:51:34 I think that's it for me
13:51:58 we have the PTG next week, so probably no meeting
13:52:17 yuh
13:52:30 Yep I'll cancel
13:52:51 Have a good week all
13:52:55 #endmeeting