14:15:12 #startmeeting powervm_drver_meeting
14:15:13 Meeting started Tue Jan 24 14:15:12 2017 UTC and is due to finish in 60 minutes. The chair is adreznec. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:15:14 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:15:17 The meeting name has been set to 'powervm_drver_meeting'
14:15:27 #topic In-tree driver status
14:15:43 Let's start here. I'll turn things over to you efried
14:17:12 Hey guys sorry I'm late, forgot my badge
14:17:39 I'm slowly working my way through the early in-tree change sets.
14:17:41 np esberglu, just fired up the meeting. talking in-tree driver status first
14:18:03 First one is for sure ready for wider review; we're just waiting for in-tree CI before we can put pressure on the cores to review.
14:18:24 If we can get the CI up today or tomorrow, there's a chance we can get mriedem to do a review before the nova meeting on Thursday.
14:18:33 What's the o-3 date? Is it Thursday or Friday?
14:19:02 Thursday
14:19:20 efried: It's up to projects a bit. Most will be Thursday, but Jan 23-27 is the official range
14:19:40 Well, okay, this is nova we're talking about.
14:20:00 Confirmed Thursday.
14:20:26 whoops
14:20:28 Sooo... we pretty much miss ocata if we don't have the CI up today. Cause I don't think we get the change merged on the first pass.
14:20:30 sorry...
14:20:50 efried: I think we've missed Ocata. :-)
14:20:59 for the intree.
14:21:30 Let's work as if there's still a chance.
14:21:36 agree.
14:21:49 Remainder of driver status is piddly details. I can go more in depth if you want, but we should probably spend the time on more important stuff.
14:22:04 *in-tree driver status
14:22:18 I have oot driver talk, but I suspect that's a different part of meeting
14:22:35 yuh, suggest waiting til after we talk in-tree CI.
14:22:42 Anything else before we move on to that?
14:22:58 Nothing here
14:23:10 #topic In-tree driver CI status
14:23:15 esberglu, the floor is yours
14:23:35 The in-tree driver is failing to stack. "n-cpu service is not running" for some reason
14:23:44 Problem is that the logs are failing to copy
14:23:48 So I'm running one manually
14:23:52 To see what the deal is
14:24:09 esberglu: That's weird. Is it just not getting far enough to transfer somehow?
14:24:15 Or is there actually an scp failure?
14:25:14 Nah it's actually a SCP failure. Which is weird because I didn't change anything for SCP
14:25:21 Trying to create /srv/static/logs/59
14:25:26 It fails trying to create that
14:25:36 the log server isn't full or anything though
14:26:31 I will let you guys know the results of the manual run when it finishes
14:26:38 The other thing I had
14:26:47 Hmm odd. Nothing in that flow should have changed unless something is broken in the build variables somehow
14:27:19 There's a "test connection" thing in the configure system, and it is connecting to the log server fine
14:27:48 3 scripts run as part of the CI setup
14:27:48 1) prepare_node_powervm
14:27:48 2) ready_node_powervm
14:27:50 3) prep_devstack
14:27:58 Previously we would install the patched (develop) pypowervm in prepare_node_powervm
14:27:59 Since we are now using 2 different versions of pypowervm (1.0.0.4 for in tree, develop for oot) I moved the installation to prep_devstack
14:27:59 The problem is that we need the patched pypowervm to be installed for the ready_node script to work
14:28:20 So I was thinking just install the patched develop in prepare_node_powervm
14:28:36 Then if it is in tree, just overwrite with the patched 1.0.0.4
14:28:44 Thoughts?
14:29:06 As long as 1.0.0.4 is in place before the nova compute process starts, it shouldn't matter when we do it.
14:29:22 Heck, I would even be okay skipping that wrinkle for now and just continuing to use develop
14:29:26 efried: I think it does
14:29:29 They're not different enough that it's going to cause failures.
14:29:30 Because I think we use pvmctl before that
14:29:32 To do node setup
14:29:58 and we can focus on getting things working first, then worry about that version switch later.
14:30:07 Yeah that was the problem. We don't know if it's in or out of tree until (3) but we need pvmctl by (2)
14:30:30 esberglu: we could just always install develop for step 2
14:30:38 then switch it out for the "right" version in 3 if needed?
14:30:56 Fragments it a bit, but... meh
14:30:58 Yeah. That's exactly how I have it set up right now. Seems to be working, just wanted to make sure I wasn't missing something
14:31:06 oh, okay.
14:31:42 thorst_ adreznec - can you think of an easy way to have e.g. Adapter() init log the pypowervm version?
14:31:58 There are other options like bundling pypowervm/pvmctl into a venv and shipping that whole thing so pvmctl has its own pypowervm to use always or something
14:32:01 But they're more work
14:32:17 I don't know where the version numbers are stored. I'm sure it involves pbr or something.
14:32:32 efried: no idea...
14:34:01 We'd have to make a call off to pbr's version_string method to get that
14:34:25 In [12]: pbr.version.VersionInfo('pypowervm').release_string()
14:34:25 Out[12]: '1.0.0.dev4'
14:34:28 :)
14:34:34 could we just log the version at the end of the CI job?
14:34:37 Yeah
14:34:38 and call it a day there?
14:34:56 I wonder if we should stop using pbr for pypowervm though
14:35:03 pbr only really works well with semver
14:35:04 Well, I wanted to have a way to be sure the compute process was started with the correct version.
14:35:05 As you can see there
14:35:17 Since the version probably isn't really 1.0.0.dev4
14:35:31 but 1.0.0.4 or 1.0.0.5
14:35:48 We would have to log it before it gets patched
14:36:01 wouldn't think so
14:36:04 Once it's patched the version becomes 1.0.1devxxx
14:36:11 I'm pretty sure
14:38:02 If this is going to be more than a five-minute thing, then never mind; but it would be useful in the long run.
14:38:31 For right now, like I say, I would be okay moving forward even if the compute process is still using develop.
14:39:37 I got lost. What's the next step here? Seeing how the local run goes, and then nailing down the scp thing?
14:40:00 Yep
14:40:18 #action esberglu: Finish manual in tree run and update with results
14:40:30 #action esberglu: Figure out why logs aren't being copied
14:41:01 esberglu: would having another body help move this along any faster, or are we bottlenecked?
14:41:26 I would be volunteering someone like adreznec who knows this stuff ;-)
14:41:27 If someone wants to help with the SCP thing. Nothing to do for the manual run but wait
14:41:44 k. thorst_ is that in your wheelhouse?
14:42:48 huh?
14:43:10 Do you have the expertise and bandwidth to help esberglu figure out this SCP boggle?
14:43:17 ooo, I do not.
14:43:24 adreznec?
14:43:25 survival mode atm
14:43:27 Cause I know I don't.
14:43:58 efried: Not sure yet, still bogged down right now
14:44:14 Okay, if there's nobody with the technical chops, I'd be happy to be a sounding board and additional googler.
14:44:14 Will depend on how things shake out with meetings really
14:44:29 #action efried to help esberglu with SCP boggle, for whatever that's worth.
14:45:21 esberglu: is there anything else you can see on the horizon that will need to be addressed? Something we might be able to get a head start on if we're stuck waiting for whatever?
14:45:57 The big obvious thing is paring down the test list - but we don't really know where to start with that.
However, setting up the infrastructure to use a whitelist?
14:46:39 I already know how we should do that
14:46:46 This is the conf we use for out of tree
14:46:51 https://github.com/powervm/powervm-ci/blob/master/tempest/os_ci_tempest.conf#L26
14:46:58 We need to make a second conf for in tree
14:47:10 well, it's going to be a whitelist
14:47:13 And then we set the BASE_TEST_REGEX to include all the tests we want
14:47:16 so it's only supposed to be the tests we want
14:47:31 ahh, nm...I see
14:47:32 Yep. The BASE_TEST_REGEX for out of tree includes all the tests
14:47:32 oh, so it's already whitelisting. It's just really inclusive.
14:47:35 consensus!
14:47:47 that's rough.
14:48:01 Yeah the "whitelist" for out of tree is all tests, then it gets reduced by the skip_list
14:48:08 I'm going to nope myself out of anything with regex
14:48:11 So the BASE_TEST_REGEX is going to be a regex with (id|id|id|id.....)
14:48:14 I find regex to be an awful creation
14:48:35 ahh, thorst_, you don't understand the beauty of regex.
14:48:40
14:48:51 Sokay, I'm your regex guy.
14:49:04 efried: you are correct, I find it flawed and awful
14:49:11 but that's my definition of 'beauty'
14:49:15 can't be awful
14:49:24 efried: It would be easier to use test names, then we could include groups of tests with one regex. But it was recommended to use IDs before
14:49:53 Yeah, however, I don't know that we really want to handle the whitelist with a regex like that. There's probably another (better) way to do it.
14:50:22 So - let me take another look at the os_ci_tempest.sh and see what I can figure out. Unless esberglu has already done that?
14:50:50 I forget, which project holds the real one of those? neo-os-ci or powervm-ci?
14:50:56 powervm-ci
14:52:16 #action efried to investigate whitelisting
14:53:21 All right
14:53:29 Anything else on in-tree CI?
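The (id|id|id|id.....) idea for BASE_TEST_REGEX could be sketched roughly as below. This is only a minimal sketch, not what powervm-ci does: `build_whitelist_regex` and `is_whitelisted` are hypothetical helpers, the test IDs are made-up placeholders, and it assumes each test name carries its ID somewhere in the name.

```python
import re


def build_whitelist_regex(test_ids):
    """Join test IDs into a single alternation of the form (id1|id2|...)."""
    if not test_ids:
        raise ValueError("whitelist must contain at least one test ID")
    # re.escape keeps any regex metacharacters inside an ID literal.
    return "(" + "|".join(re.escape(test_id) for test_id in test_ids) + ")"


# Placeholder IDs for illustration only; these are not real test IDs.
WHITELIST_IDS = [
    "00000000-0000-0000-0000-000000000001",
    "00000000-0000-0000-0000-000000000002",
]

BASE_TEST_REGEX = build_whitelist_regex(WHITELIST_IDS)


def is_whitelisted(test_name, regex=BASE_TEST_REGEX):
    """Return True if the test name contains one of the whitelisted IDs."""
    # search() rather than match(): the ID may sit anywhere in the name.
    return re.search(regex, test_name) is not None
```

Keeping the IDs in a plain list and generating the regex from it means the whitelist can be reviewed one ID per line, and nobody has to hand-edit the alternation.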
14:53:56 Ok, I know thorst_ had discussion on out-of-tree
14:54:06 #topic Out-of-tree driver discussion
14:54:19 yeah, so our oot CI is kinda flaking out again
14:54:27 I'm seeing several patches failing...
14:54:33 I've at least root caused one of em.
14:54:34 http://pastebin.com/uN5aB1Kk
14:54:44 that's the error. Basically it is a non-force immediate power off.
14:54:48 and it's just hanging
14:55:10 I think we've hit this a few times now...so we should get it fixed.
14:55:19 I opened a bug a long time ago around this
14:55:22 https://bugs.launchpad.net/nova-powervm/+bug/1562117
14:55:22 Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
14:55:35 I think we need some attention put on it now. Anyone have cycles to explore that fix?
14:56:02 I can also check with lmtaylor on it, but she's had it for a while and hasn't updated it recently.
14:56:20 squeaky wheel
14:56:47 alright.
14:56:53 well, I'll work on it as I free up
14:56:58 but it's impacting CI. Sigh
14:57:06 that was about it
14:57:18 I suspect we'll argue in the review
14:57:31 so awareness now, this one will be a weird review...so pay attention to the review.
14:58:06 thorst_: is the problem that you think we should be timing out faster?
15:00:18 OpenStack gives us values for timeout and retry
15:00:24 what those values mean...is open to interpretation
15:00:32 does a 0 mean immediately or wait forevs
15:00:42 I interpret it as 'immediate'
15:00:51 :-)
15:02:16 I see.
15:02:31 Should prolly go look at how libvirt et al interpret those values.
15:03:24 libvirt agrees with thorst_
15:03:37 well then, it's easy
15:03:38 timeout != 0 => "gracefully"
15:03:42 anyway, I'll get on it
15:03:53 soon enough, cause it's blocking my other changes (kinda)
15:04:17 that's all I had for OOT.
Big thanks for reviews on the fileio thing
15:04:25 not sure if we can get that into ocata...would've been nice
15:04:26 k, if you find you don't have time, I can take over.
15:04:45 #action thorst_ https://bugs.launchpad.net/nova-powervm/+bug/1562117 - efried to help if needed.
15:04:45 Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
15:04:55 Ok
15:04:59 So I know we're over
15:05:08 #topic Open floor
15:05:10 Anything else?
15:05:22 * thorst_ dances on open floor
15:05:27 not from me, but esberglu stick around so we can talk about the whitelist.
15:05:34 Okay
15:05:40 #endmeeting
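
On the power-off timeout question from the out-of-tree topic, the reading the meeting landed on (a timeout of 0 means immediate, anything else means try gracefully first) could look roughly like this. It is a minimal sketch under stated assumptions, not the nova-powervm or libvirt code: `soft_shutdown` and `hard_power_off` are hypothetical callables standing in for whatever the driver really invokes.

```python
def power_off(soft_shutdown, hard_power_off, timeout=0, retry_interval=0):
    """Power off an instance; returns which path finished the job.

    soft_shutdown(timeout, retry_interval) -> bool   (hypothetical callable)
        Attempts a graceful shutdown; True if the instance stopped in time.
    hard_power_off() -> None                         (hypothetical callable)
        Forces the instance off immediately.
    """
    # timeout != 0 => "gracefully": give the guest a chance to shut down.
    if timeout != 0:
        if soft_shutdown(timeout, retry_interval):
            return "graceful"
    # timeout == 0, or the graceful attempt ran out of time: do not hang,
    # go straight to the immediate power off.
    hard_power_off()
    return "immediate"
```

The point of the sketch is that neither branch can wait forever: a zero timeout skips the graceful attempt entirely, and a failed graceful attempt falls through to the hard power off.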