14:15:12 <adreznec> #startmeeting powervm_drver_meeting
14:15:13 <openstack> Meeting started Tue Jan 24 14:15:12 2017 UTC and is due to finish in 60 minutes. The chair is adreznec. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:15:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:15:17 <openstack> The meeting name has been set to 'powervm_drver_meeting'
14:15:27 <adreznec> #topic In-tree driver status
14:15:43 <adreznec> Let's start here. I'll turn things over to you efried
14:17:12 <esberglu> Hey guys sorry I'm late, forgot my badge
14:17:39 <efried> I'm slowly working my way through the early in-tree change sets.
14:17:41 <adreznec> np esberglu, just fired up the meeting. talking in-tree driver status first
14:18:03 <efried> First one is for sure ready for wider review; we're just waiting for in-tree CI before we can put pressure on the cores to review.
14:18:24 <efried> If we can get the CI up today or tomorrow, there's a chance we can get mriedem to do a review before the nova meeting on Thursday.
14:18:33 <efried> What's the o-3 date? Is it Thursday or Friday?
14:19:02 <esberglu> Thursday
14:19:20 <adreznec> efried: It's up to projects a bit. Most will be Thursday, but Jan 23-27 is the official range
14:19:40 <efried> Well, okay, this is nova we're talking about.
14:20:00 <efried> Confirmed Thursday.
14:20:26 <thorst_> whoops
14:20:28 <efried> Sooo... we pretty much miss ocata if we don't have the CI up today. Cause I don't think we get the change merged on the first pass.
14:20:30 <thorst_> sorry...
14:20:50 <thorst_> efried: I think we've missed Ocata. :-)
14:20:59 <thorst_> for the intree.
14:21:30 <efried> Let's work as if there's still a chance.
14:21:36 <thorst_> agree.
14:21:49 <efried> Remainder of driver status is piddly details. I can go more in depth if you want, but we should probably spend the time on more important stuff.
14:22:04 <efried> *in-tree driver status
14:22:18 <thorst_> I have oot driver talk, but I suspect that's a different part of meeting
14:22:35 <efried> yuh, suggest waiting til after we talk in-tree CI.
14:22:42 <efried> Anything else before we move on to that?
14:22:58 <adreznec> Nothing here
14:23:10 <adreznec> #topic In-tree driver CI status
14:23:15 <adreznec> esberglu, the floor is yours
14:23:35 <esberglu> The in-tree driver is failing to stack. "n-cpu service is not running" for some reason
14:23:44 <esberglu> Problem is that the logs are failing to copy
14:23:48 <esberglu> So I'm running one manually
14:23:52 <esberglu> To see what the deal is
14:24:09 <adreznec> esberglu: That's weird. Is it just not getting far enough to transfer somehow?
14:24:15 <adreznec> Or is there actually an scp failure?
14:25:14 <esberglu> Nah it's actually a SCP failure. Which is weird because I didn't change anything for SCP
14:25:21 <esberglu> Trying to create /srv/static/logs/59
14:25:26 <esberglu> It fails trying to create that
14:25:36 <esberglu> the log server isn't full or anything though
14:26:31 <esberglu> I will let you guys know the results of the manual run when it finishes
14:26:38 <esberglu> The other thing I had
14:26:47 <adreznec> Hmm odd. Nothing in that flow should have changed unless something is broken in the build variables somehow
14:27:19 <esberglu> There's a "test connection" thing in the configure system, and it is connecting to the log server fine
14:27:48 <esberglu> 3 scripts run as part of the CI setup
14:27:48 <esberglu> 1) prepare_node_powervm
14:27:48 <esberglu> 2) ready_node_powervm
14:27:50 <esberglu> 3) prep_devstack
14:27:58 <esberglu> Previously we would install the patched (develop) pypowervm in prepare_node_powervm
14:27:59 <esberglu> Since we are now using 2 different versions of pypowervm (1.0.0.4 for in tree, develop for oot) I moved the installation to prep_devstack
14:27:59 <esberglu> The problem is that we need the patched pypowervm to be installed for the ready_node script to work
14:28:20 <esberglu> So I was thinking just install the patched develop in prepare_node_powervm
14:28:36 <esberglu> Then if it is in tree, just overwrite with the patched 1.0.0.4
14:28:44 <esberglu> Thoughts?
14:29:06 <efried> As long as 1.0.0.4 is in place before the nova compute process starts, it shouldn't matter when we do it.
14:29:22 <efried> Heck, I would even be okay skipping that wrinkle for now and just continuing to use develop
14:29:26 <adreznec> efried: I think it does
14:29:29 <efried> They're not different enough that it's going to cause failures.
14:29:30 <adreznec> Because I think we use pvmctl before that
14:29:32 <adreznec> To do node setup
14:29:58 <efried> and we can focus on getting things working first, then worry about that version switch later.
14:30:07 <esberglu> Yeah that was the problem. We don't know if it's in or out of tree until (3) but we need pvmctl by (2)
14:30:30 <adreznec> esberglu: we could just always install develop for step 2
14:30:38 <adreznec> then switch it out for the "right" version in 3 if needed?
14:30:56 <adreznec> Fragments it a bit, but... meh
14:30:58 <esberglu> Yeah. That's exactly how I have it set up right now. Seems to be working, just wanted to make sure I wasn't missing something
14:31:06 <efried> oh, okay.
14:31:42 <efried> thorst_ adreznec - can you think of an easy way to have e.g. Adapter() init log the pypowervm version?
14:31:58 <adreznec> There are other options like bundling pypowervm/pvmctl into a venv and shipping that whole thing so pvmctl has its own pypowervm to use always or something
14:32:01 <adreznec> But they're more work
14:32:17 <efried> I don't know where the version numbers are stored. I'm sure it involves pbr or something.
14:32:32 <thorst_> efried: no idea...
14:34:01 <adreznec> We'd have to make a call off to pbr's version_string method to get that
14:34:25 <efried> In [12]: pbr.version.VersionInfo('pypowervm').release_string()
14:34:25 <efried> Out[12]: '1.0.0.dev4'
14:34:28 <efried> :)
14:34:34 <thorst_> could we just log the version at the end of the CI job?
14:34:37 <adreznec> Yeah
14:34:38 <thorst_> and call it a day there?
14:34:56 <adreznec> I wonder if we should stop using pbr for pypowervm though
14:35:03 <adreznec> pbr only really works well with semver
14:35:04 <efried> Well, I wanted to have a way to be sure the compute process was started with the correct version.
14:35:05 <adreznec> As you can see there
14:35:17 <adreznec> Since the version probably isn't really 1.0.0.dev4
14:35:31 <adreznec> but 1.0.0.4 or 1.0.0.5
14:35:48 <esberglu> We would have to log it before it gets patched
14:36:01 <efried> wouldn't think so
14:36:04 <esberglu> Once it's patched the version becomes 1.0.1devxxx
14:36:11 <esberglu> I'm pretty sure
14:38:02 <efried> If this is going to be more than a five-minute thing, then never mind; but it would be useful in the long run.
14:38:31 <efried> For right now, like I say, I would be okay moving forward even if the compute process is still using develop.
14:39:37 <efried> I got lost. What's the next step here? Seeing how the local run goes, and then nailing down the scp thing?
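Editor's note on the version-logging idea raised above: a minimal sketch of recording which pypowervm version a process actually loaded. It uses the standard library's importlib.metadata rather than the pbr call quoted in the log, and the function names (installed_version, log_package_version) are hypothetical, not part of pypowervm or nova-powervm. The reported string is whatever the installed package metadata says, so it may still read like a dev version after patching.

```python
import logging
from importlib import metadata

LOG = logging.getLogger(__name__)


def installed_version(package: str) -> str:
    """Return the installed version of `package`, or 'unknown' if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return "unknown"


def log_package_version(package: str = "pypowervm") -> str:
    """Log the version once, for example when the compute process starts."""
    version = installed_version(package)
    LOG.info("Using %s version %s", package, version)
    return version
```

Calling log_package_version() early in process startup would leave one log line that can be checked after a CI run, which is the "be sure the compute process was started with the correct version" goal efried describes.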
14:40:00 <esberglu> Yep
14:40:18 <esberglu> #action esberglu: Finish manual in tree run and update with results
14:40:30 <esberglu> #action esberglu: Figure out why logs aren't being copied
14:41:01 <efried> esberglu: would having another body help move this along any faster, or are we bottlenecked?
14:41:26 <efried> I would be volunteering someone like adreznec who knows this stuff ;-)
14:41:27 <esberglu> If someone wants to help with the SCP thing. Nothing to do for the manual run but wait
14:41:44 <efried> k. thorst_ is that in your wheelhouse?
14:42:48 <thorst_> huh?
14:43:10 <efried> Do you have the expertise and bandwidth to help esberglu figure out this SCP boggle?
14:43:17 <thorst_> ooo, I do not.
14:43:24 <efried> adreznec?
14:43:25 <thorst_> survival mode atm
14:43:27 <efried> Cause I know I don't.
14:43:58 <adreznec> efried: Not sure yet, still bogged down right now
14:44:14 <efried> Okay, if there's nobody with the technical chops, I'd be happy to be a sounding board and additional googler.
14:44:14 <adreznec> Will depend on how things shake out with meetings really
14:44:29 <efried> #action efried to help esberglu with SCP boggle, for whatever that's worth.
14:45:21 <efried> esberglu: is there anything else you can see on the horizon that will need to be addressed? Something we might be able to get a head start on if we're stuck waiting for whatever?
14:45:57 <efried> The big obvious thing is paring down the test list - but we don't really know where to start with that. However, setting up the infrastructure to use a whitelist?
14:46:39 <esberglu> I already know how we should do that
14:46:46 <esberglu> This is the conf we use for out of tree
14:46:51 <esberglu> https://github.com/powervm/powervm-ci/blob/master/tempest/os_ci_tempest.conf#L26
14:46:58 <esberglu> We need to make a second conf for in tree
14:47:10 <thorst_> well, it's going to be a whitelist
14:47:13 <esberglu> And then we set the BASE_TEST_REGEX to include all the tests we want
14:47:16 <thorst_> so it's only supposed to be the tests we want
14:47:31 <thorst_> ahh, nm...I see
14:47:32 <esberglu> Yep. The BASE_TEST_REGEX for out of tree includes all the tests
14:47:32 <efried> oh, so it's already whitelisting. It's just really inclusive.
14:47:35 <adreznec> consensus!
14:47:47 <thorst_> that's rough.
14:48:01 <esberglu> Yeah the "whitelist" for out of tree is all tests, then it gets reduced by the skip_list
14:48:08 <thorst_> I'm going to nope myself out of anything with regex
14:48:11 <efried> So the BASE_TEST_REGEX is going to be a regex with (id|id|id|id.....)
14:48:14 <thorst_> I find regex to be an awful creation
14:48:35 <efried> ahh, thorst_, you don't understand the beauty of regex.
14:48:40 <efried> <patpat>
14:48:51 <efried> Sokay, I'm your regex guy.
14:49:04 <thorst_> efried: you are correct, I find it flawed and awful
14:49:11 <thorst_> but that's my definition of 'beauty'
14:49:15 <thorst_> can't be awful
14:49:24 <esberglu> efried: It would be easier to use test names, then we could include groups of tests with one regex. But it was recommended to use IDs before
14:49:53 <efried> Yeah, however, I don't know that we really want to handle the whitelist with a regex like that. There's probably another (better) way to do it.
14:50:22 <efried> So - let me take another look at the os_ci_tempest.sh and see what I can figure out. Unless esberglu has already done that?
14:50:50 <efried> I forget, which project holds the real one of those? neo-os-ci or powervm-ci?
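Editor's note on the "(id|id|id|id.....)" whitelist idea above: a minimal sketch of building such an alternation from a list of test IDs. The helper name build_whitelist_regex and the example IDs are hypothetical, and how the resulting string would be fed into BASE_TEST_REGEX in the CI conf is an assumption, not something the log confirms.

```python
import re
from typing import Iterable


def build_whitelist_regex(test_ids: Iterable[str]) -> str:
    """Join test IDs into one alternation, escaping regex metacharacters."""
    escaped = [re.escape(test_id) for test_id in test_ids]
    if not escaped:
        # An empty group "()" would match every test, defeating the whitelist.
        raise ValueError("whitelist must contain at least one test ID")
    return "(" + "|".join(escaped) + ")"


# Hypothetical placeholder IDs, for illustration only.
EXAMPLE_IDS = [
    "id-00000000-0000-0000-0000-000000000001",
    "id-00000000-0000-0000-0000-000000000002",
]
EXAMPLE_PATTERN = re.compile(build_whitelist_regex(EXAMPLE_IDS))
```

A test whose full name contains one of the listed IDs is selected by EXAMPLE_PATTERN.search(name), and anything else is left out. As efried notes, a long alternation may not be the best mechanism, so this only illustrates the shape of the regex being discussed.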
14:50:56 <esberglu> powervm-ci
14:52:16 <efried> #action efried to investigate whitelisting
14:53:21 <adreznec> All right
14:53:29 <adreznec> Anything else on in-tree CI?
14:53:56 <adreznec> Ok, I know thorst_ had discussion on out-of-tree
14:54:06 <adreznec> #topic Out-of-tree driver discussion
14:54:19 <thorst_> yeah, so our oot CI is kinda flaking out again
14:54:27 <thorst_> I'm seeing several patches failing...
14:54:33 <thorst_> I've at least root caused one of em.
14:54:34 <thorst_> http://pastebin.com/uN5aB1Kk
14:54:44 <thorst_> that's the error. Basically it is a non-force immediate power off.
14:54:48 <thorst_> and it's just hanging
14:55:10 <thorst_> I think we've hit this a few times now...so we should get it fixed.
14:55:19 <thorst_> I opened a bug a long time ago around this
14:55:22 <thorst_> https://bugs.launchpad.net/nova-powervm/+bug/1562117
14:55:22 <openstack> Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
14:55:35 <thorst_> I think we need some attention put on it now. Anyone have cycles to explore that fix?
14:56:02 <thorst_> I can also check with lmtaylor on it, but she's had it for a while and hasn't updated it recently.
14:56:20 <efried> squeaky wheel
14:56:47 <thorst_> alright.
14:56:53 <thorst_> well, I'll work on it as I free up
14:56:58 <thorst_> but it's impacting CI. Sigh
14:57:06 <thorst_> that was about it
14:57:18 <thorst_> I suspect we'll argue in the review
14:57:31 <thorst_> so awareness now, this one will be a weird review...so pay attention to the review.
14:58:06 <efried> thorst_: is the problem that you think we should be timing out faster?
15:00:18 <thorst_> OpenStack gives us values for timeout and retry
15:00:24 <thorst_> what those values mean...is open to interpretation
15:00:32 <thorst_> does a 0 mean immediately or wait forevs
15:00:42 <thorst_> I interpret it as 'immediate'
15:00:51 <thorst_> :-)
15:02:16 <efried> I see.
15:02:31 <efried> Should prolly go look at how libvirt et al interpret those values.
15:03:24 <efried> libvirt agrees with thorst_
15:03:37 <thorst_> well then, it's easy
15:03:38 <efried> timeout != 0 => "gracefully"
15:03:42 <thorst_> anyway, I'll get on it
15:03:53 <thorst_> soon enough, cause it's blocking my other changes (kinda)
15:04:17 <thorst_> that's all I had for OOT. Big thanks for reviews on the fileio thing
15:04:25 <thorst_> not sure if we can get that into ocata...would've been nice
15:04:26 <efried> k, if you find you don't have time, I can take over.
15:04:45 <efried> #action thorst_ https://bugs.launchpad.net/nova-powervm/+bug/1562117 - efried to help if needed.
15:04:45 <openstack> Launchpad bug 1562117 in nova-powervm "power-off times are not adhered to" [Low,New] - Assigned to Lauren Taylor (lmtaylor)
15:04:55 <adreznec> Ok
15:04:59 <adreznec> So I know we're over
15:05:08 <adreznec> #topic Open floor
15:05:10 <adreznec> Anything else?
15:05:22 * thorst_ dances on open floor
15:05:27 <efried> not from me, but esberglu stick around so we can talk about the whitelist.
15:05:34 <adreznec> Okay
15:05:40 <adreznec> #endmeeting
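Editor's note on the power-off timeout interpretation settled in the out-of-tree discussion (0 means immediate, non-zero means try gracefully): a minimal sketch of that decision as a pure function. PowerOffPlan and plan_power_off are hypothetical names for illustration; this is not the nova-powervm fix for bug 1562117, and the retry value mentioned in the log is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerOffPlan:
    graceful: bool      # True: ask the guest OS to shut down first
    wait_seconds: int   # how long to wait before giving up on a graceful stop


def plan_power_off(timeout: int) -> PowerOffPlan:
    """Map a timeout value onto the interpretation agreed in the meeting."""
    if timeout < 0:
        raise ValueError("timeout must be zero or positive")
    if timeout == 0:
        # Zero is read as "power off immediately", not "wait forever".
        return PowerOffPlan(graceful=False, wait_seconds=0)
    return PowerOffPlan(graceful=True, wait_seconds=timeout)
```

Keeping the decision separate from the actual power-off call makes the agreed interpretation easy to review on its own, which may help with the review thorst_ expects to be contentious.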