13:01:02 #startmeeting powervm_driver_meeting
13:01:03 Meeting started Tue Aug 8 13:01:02 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:06 The meeting name has been set to 'powervm_driver_meeting'
13:01:17 o/
13:01:46 o/
13:02:54 #topic In Tree Driver
13:02:57 #link link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:03:19 I don't think there is anything new IT
13:03:25 right
13:03:49 #topic Out Of Tree Driver
13:04:58 thorst please check 5645
13:05:29 edmondsw: yes sir.
13:05:40 I think that's all we've got going OOT at the moment
13:06:18 #topic PCI Passthrough
13:06:53 Anything new here?
13:07:06 I don't think we've made any progress here yet. efried is finishing up some auth work and then we can start to make progress
13:07:39 o/
13:07:50 Yeah, what edmondsw said.
13:08:49 #topic PowerVM CI
13:09:22 Tested the devstack gen. tempest.conf one last time for all runs last night, all looked good
13:09:31 great
13:09:36 Got the +2 from edmondsw, anyone else want to look before I merge?
13:10:19 Tempest bugs are getting worked through
13:10:22 do we need to be opening a LP bug about those 2 tests having the same id?
13:10:30 esberglu I don't need to look again.
13:10:38 edmondsw: I think that it is intentional for those 2
13:10:39 If it's tested and edmondsw is happy, I'm happy.
13:10:54 They are the same test, just different microversions
13:11:10 I'd rather we weren't having to skip a couple new tests, but that seems a small price to pay to get this in
13:11:22 I hope there's a todo to figure that out and get those unskipped?
13:11:49 edmondsw: Yeah I was going to add it to the list once I merged
13:11:50 yeah, I know it's kinda the same test... still thought they should probably have different ids but maybe not
13:12:06 esberglu I'd go ahead and add it just to make sure we don't forget :)
13:12:22 I can disable the 2.48 version of the tests by setting the max_microversion
13:12:37 I'd rather not
13:12:38 But I'm not familiar enough with compute microversions to know if that's really what we want
13:12:47 I didn't think so either
13:12:55 Can I get some background here?
13:13:21 Two different tests testing the same thing over different microversions of the API ought to have different UUIDs. I very much doubt that was intentional.
13:13:46 And we should be able to handle both microversions in our env. If we can't, and that's passing in the world at large, it's our bug.
13:14:05 efried check 5598
13:14:15 https://github.com/openstack/tempest/blob/master/tempest/api/compute/admin/test_server_diagnostics.py
13:14:52 I'm guessing whoever made the V248 test there just copied the original test case and didn't change the ID
13:14:54 efried I expect efried is right, but I didn't look at how the test is actually written... is it one method, so one id, but run twice somehow?
13:15:12 esberglu ah in that case it does sound like a bug
13:15:14 esberglu I suspect that's what happened.
13:15:29 Anyways I can look into it
13:15:29 esberglu open the LP bug... worst case they reject it
13:15:38 tx
13:15:38 Yep
13:15:49 Other bugs...
13:16:03 There was a bug in tempest where the REST requests would timeout
13:16:24 efried made a loop to see if it was permanent or temporary
13:16:25 https://review.openstack.org/#/c/491003/
13:16:36 With that getting patched in we no longer are seeing that timeout
13:17:00 But we still need to find out what's causing the timeout and make a long term solution
13:17:12 ++
13:17:29 hsien got to the bottom of the internal server error 500's
13:17:37 oh, do tell
13:18:22 sweet
13:18:35 5657
13:18:49 There was an issue with the vios busy rc not being honored and retrying
13:19:09 btw, that loop fixup should have logged a message when we hit it. We should look for that log message and see how many times it hits per test. I suspect the very next try went through. Which probably means it's a threading problem at the server side of that call.
13:19:49 efried: Will do
13:20:45 esberglu Another experiment that might be worthwhile is knocking our threading level down. It's possible we're just timing out due to load.
13:21:08 Though... it seems like it would always hit on one or more of the same three or four tests, nah?
13:21:30 efried: Yeah same handful of tests
13:22:19 esberglu you also had something about discover_hosts on the agenda?
13:22:29 did we get that all straight?
13:22:37 looks like the CI has been better
13:22:41 edmondsw: Was just going to say that our fix is working there
13:22:52 awesome
13:22:53 Yep with that and efried's retry loop success rates are up
13:23:14 hsien's fix is +2 so should be in soon, then I will update the systems
13:23:34 edmondsw It needs to be noted that the retry loop is in tempest code, not our code.
13:23:58 So it's not a long-term fix (unless we can make the case that it should be submitted to tempest itself).
13:24:22 efried right, we need to figure out what's going on there and how to fix it permanently
13:24:32 Yeah, cause I don't think it's a good idea for us to be running long-term with a tempest patch.
13:24:38 ++
13:24:42 ++
13:24:56 that on the todo list, esberglu?
13:25:08 at the top? :)
13:25:44 edmondsw: I need to do an update of the list after the meeting but yeah it will be
13:25:50 cool
13:25:57 I was going to ask about http://184.172.12.213/92/474892/6/check/nova-in-tree-pvm/2922a78/
13:26:19 I'm pretty sure I've seen that kind of failure before... but can't remember where it ended up
13:26:57 edmondsw: Yeah I saw that. I think when I removed a bunch of tests from the skip list with the networking api extension change some may have introduced new issues
13:27:19 I know we have had those before, can't remember what our solution was
13:27:20 ok, that makes sense. cuz I thought we'd fixed that, but it was probably with a skip
13:28:04 edmondsw: IIRC its an issue with tests interfering with each other
13:29:02 That's all for CI
13:29:07 #topic Driver Testing
13:29:33 Any progress?
13:30:03 I opened RTC stories for testing
13:30:40 I ordered them such that we'd validate vSCSI, FC, and LPM with the OOT driver before coming back to iSCSI
13:30:47 give us some time to do the dev work on iSCSI
13:31:23 don't see jay1_ on to discuss further
13:31:30 chhavi fyi ^
13:33:02 #topic Open Discussion
13:33:11 Any last words?
13:33:30 I finally got devstack working! ;)
13:33:43 so there are a bunch of additions to https://etherpad.openstack.org/p/powervm_stacking_issues
13:33:47 Woohoo!
13:34:26 that last one was really weird... hope that's really the fix, and it wasn't just coincidence that it worked after that
13:34:53 I'm pretty sure it's legit
13:35:09 that's it from me
13:35:29 Thanks for joining
13:35:32 #endmeeting