13:01:02 <esberglu> #startmeeting powervm_driver_meeting
13:01:03 <openstack> Meeting started Tue Aug 8 13:01:02 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:04 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:06 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:01:17 <mdrabe> o/
13:01:46 <edmondsw> o/
13:02:54 <esberglu> #topic In Tree Driver
13:02:57 <esberglu> #link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:03:19 <esberglu> I don't think there is anything new IT
13:03:25 <edmondsw> right
13:03:49 <esberglu> #topic Out Of Tree Driver
13:04:58 <edmondsw> thorst please check 5645
13:05:29 <thorst> edmondsw: yes sir.
13:05:40 <edmondsw> I think that's all we've got going OOT at the moment
13:06:18 <esberglu> #topic PCI Passthrough
13:06:53 <esberglu> Anything new here?
13:07:06 <edmondsw> I don't think we've made any progress here yet. efried is finishing up some auth work and then we can start to make progress
13:07:39 <efried> o/
13:07:50 <efried> Yeah, what edmondsw said.
13:08:49 <esberglu> #topic PowerVM CI
13:09:22 <esberglu> Tested the devstack-generated tempest.conf one last time for all runs last night, all looked good
13:09:31 <edmondsw> great
13:09:36 <esberglu> Got the +2 from edmondsw, anyone else want to look before I merge?
13:10:19 <esberglu> Tempest bugs are getting worked through
13:10:22 <edmondsw> do we need to be opening an LP bug about those 2 tests having the same id?
13:10:30 <efried> esberglu I don't need to look again.
13:10:38 <esberglu> edmondsw: I think that it is intentional for those 2
13:10:39 <efried> If it's tested and edmondsw is happy, I'm happy.
13:10:54 <esberglu> They are the same test, just different microversions
13:11:10 <edmondsw> I'd rather we weren't having to skip a couple new tests, but that seems a small price to pay to get this in
13:11:22 <edmondsw> I hope there's a todo to figure that out and get those unskipped?
13:11:49 <esberglu> edmondsw: Yeah I was going to add it to the list once I merged
13:11:50 <edmondsw> yeah, I know it's kinda the same test... still thought they should probably have different ids but maybe not
13:12:06 <edmondsw> esberglu I'd go ahead and add it just to make sure we don't forget :)
13:12:22 <esberglu> I can disable the 2.48 version of the tests by setting the max_microversion
13:12:37 <edmondsw> I'd rather not
13:12:38 <esberglu> But I'm not familiar enough with compute microversions to know if that's really what we want
13:12:47 <esberglu> I didn't think so either
13:12:55 <efried> Can I get some background here?
13:13:21 <efried> Two different tests testing the same thing over different microversions of the API ought to have different UUIDs. I very much doubt that was intentional.
13:13:46 <efried> And we should be able to handle both microversions in our env. If we can't, and that's passing in the world at large, it's our bug.
13:14:05 <edmondsw> efried check 5598
13:14:15 <esberglu> https://github.com/openstack/tempest/blob/master/tempest/api/compute/admin/test_server_diagnostics.py
13:14:52 <esberglu> I'm guessing whoever made the V248 test there just copied the original test case and didn't change the ID
13:14:54 <edmondsw> efried I expect efried is right, but I didn't look at how the test is actually written... is it one method, so one id, but run twice somehow?
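[For context, the tempest pattern being discussed looks roughly like the sketch below. This is a simplified illustration rather than a verbatim copy of test_server_diagnostics.py: the UUID is a placeholder and the method bodies are elided, so treat the details as assumptions. It shows how a v2.48 variant of a test ends up as a separate class whose copied method can carry the original idempotent_id along with it.]

    # Illustrative sketch only -- placeholder UUID, elided test bodies.
    from tempest.api.compute import base
    from tempest.lib import decorators


    class ServerDiagnosticsTest(base.BaseV2ComputeAdminTest):

        @decorators.idempotent_id('00000000-0000-0000-0000-000000000000')
        def test_get_server_diagnostics(self):
            """GET /servers/{id}/diagnostics, pre-2.48 response format."""


    class ServerDiagnosticsV248Test(base.BaseV2ComputeAdminTest):
        # Tempest skips this class automatically when the configured
        # compute microversion range does not include 2.48.
        min_microversion = '2.48'

        @decorators.idempotent_id('00000000-0000-0000-0000-000000000000')
        def test_get_server_diagnostics(self):
            """Same call, validating the 2.48 response format."""

[If the decorator UUID really is identical in both classes, that is the copy/paste bug being suspected. Setting max_microversion = 2.47 in tempest.conf's [compute] section would skip the V248 class entirely, which is the blunt workaround esberglu mentions, not the preferred fix.]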
13:15:12 <edmondsw> esberglu ah in that case it does sound like a bug
13:15:14 <efried> esberglu I suspect that's what happened.
13:15:29 <esberglu> Anyways I can look into it
13:15:29 <edmondsw> esberglu open the LP bug... worst case they reject it
13:15:38 <edmondsw> tx
13:15:38 <esberglu> Yep
13:15:49 <esberglu> Other bugs...
13:16:03 <esberglu> There was a bug in tempest where the REST requests would timeout
13:16:24 <esberglu> efried made a loop to see if it was permanent or temporary
13:16:25 <esberglu> https://review.openstack.org/#/c/491003/
13:16:36 <esberglu> With that getting patched in we are no longer seeing that timeout
13:17:00 <esberglu> But we still need to find out what's causing the timeout and make a long-term solution
13:17:12 <edmondsw> ++
13:17:29 <esberglu> hsien got to the bottom of the internal server error 500s
13:17:37 <efried> oh, do tell
13:18:22 <edmondsw> sweet
13:18:35 <edmondsw> 5657
13:18:49 <esberglu> There was an issue with the vios busy rc not being honored and retried
13:19:09 <efried> btw, that loop fixup should have logged a message when we hit it. We should look for that log message and see how many times it hits per test. I suspect the very next try went through. Which probably means it's a threading problem at the server side of that call.
13:19:49 <esberglu> efried: Will do
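[The retry loop itself lives in the tempest patch linked above and isn't quoted in the log. As a rough illustration of the idea, and of the per-hit logging efried asks for, a sketch might look like the following; the function name, retry count, and delay are illustrative assumptions, not taken from review 491003.]

    # Minimal sketch of a retry-with-logging wrapper around a REST call.
    # Names and tunables are assumptions, not the actual tempest patch.
    import logging
    import time

    import requests

    LOG = logging.getLogger(__name__)


    def rest_request_with_retry(method, url, retries=3, delay=5, **kwargs):
        for attempt in range(1, retries + 1):
            try:
                return requests.request(method, url, **kwargs)
            except requests.exceptions.Timeout:
                # Log every hit so the CI logs show how often this fires
                # per test; if the very next attempt usually succeeds,
                # that points at a threading problem on the server side
                # of the call, as efried suspects above.
                LOG.warning('%s %s timed out (attempt %d of %d)',
                            method, url, attempt, retries)
                if attempt == retries:
                    raise
                time.sleep(delay)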
13:20:45 <efried> esberglu Another experiment that might be worthwhile is knocking our threading level down. It's possible we're just timing out due to load.
13:21:08 <efried> Though... it seems like it would always hit on one or more of the same three or four tests, nah?
13:21:30 <esberglu> efried: Yeah same handful of tests
13:22:19 <edmondsw> esberglu you also had something about discover_hosts on the agenda?
13:22:29 <edmondsw> did we get that all straight?
13:22:37 <edmondsw> looks like the CI has been better
13:22:41 <esberglu> edmondsw: Was just going to say that our fix is working there
13:22:52 <edmondsw> awesome
13:22:53 <esberglu> Yep with that and efried's retry loop success rates are up
13:23:14 <esberglu> hsien's fix is +2 so should be in soon, then I will update the systems
13:23:34 <efried> edmondsw It needs to be noted that the retry loop is in tempest code, not our code.
13:23:58 <efried> So it's not a long-term fix (unless we can make the case that it should be submitted to tempest itself).
13:24:22 <edmondsw> efried right, we need to figure out what's going on there and how to fix it permanently
13:24:32 <efried> Yeah, cause I don't think it's a good idea for us to be running long-term with a tempest patch.
13:24:38 <edmondsw> ++
13:24:42 <esberglu> ++
13:24:56 <edmondsw> that on the todo list, esberglu?
13:25:08 <edmondsw> at the top? :)
13:25:44 <esberglu> edmondsw: I need to do an update of the list after the meeting but yeah it will be
13:25:50 <edmondsw> cool
13:25:57 <edmondsw> I was going to ask about http://184.172.12.213/92/474892/6/check/nova-in-tree-pvm/2922a78/
13:26:19 <edmondsw> I'm pretty sure I've seen that kind of failure before... but can't remember where it ended up
13:26:57 <esberglu> edmondsw: Yeah I saw that. I think when I removed a bunch of tests from the skip list with the networking api extension change some may have introduced new issues
13:27:19 <esberglu> I know we have had those before, can't remember what our solution was
13:27:20 <edmondsw> ok, that makes sense. cuz I thought we'd fixed that, but it was probably with a skip
13:28:04 <esberglu> edmondsw: IIRC it's an issue with tests interfering with each other
13:29:02 <esberglu> That's all for CI
13:29:07 <esberglu> #topic Driver Testing
13:29:33 <esberglu> Any progress?
13:30:03 <edmondsw> I opened RTC stories for testing
13:30:40 <edmondsw> I ordered them such that we'd validate vSCSI, FC, and LPM with the OOT driver before coming back to iSCSI
13:30:47 <edmondsw> give us some time to do the dev work on iSCSI
13:31:23 <edmondsw> don't see jay1_ on to discuss further
13:31:30 <edmondsw> chhavi fyi ^
13:33:02 <esberglu> #topic Open Discussion
13:33:11 <esberglu> Any last words?
13:33:30 <edmondsw> I finally got devstack working! ;)
13:33:43 <edmondsw> so there are a bunch of additions to https://etherpad.openstack.org/p/powervm_stacking_issues
13:33:47 <esberglu> Woohoo!
13:34:26 <edmondsw> that last one was really weird... hope that's really the fix, and it wasn't just coincidence that it worked after that
13:34:53 <edmondsw> I'm pretty sure it's legit
13:35:09 <edmondsw> that's it from me
13:35:29 <esberglu> Thanks for joining
13:35:32 <esberglu> #endmeeting