#openstack-powervm log

13:01:45 <esberglu> #startmeeting powervm_driver_meeting
13:01:46 <openstack> Meeting started Tue Mar 21 13:01:45 2017 UTC and is due to finish in 60 minutes.  The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:47 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:49 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:01:51 <efried> o/
13:02:23 <esberglu> #topic In Tree Driver
13:02:35 <efried> Four change sets are rebased and updated.
13:02:57 <esberglu> Yep looking at those is first on my list this morning
13:03:12 <esberglu> Mostly just rebasing or functional changes?
13:03:15 <efried> Bottom two passed CI; top two failed.  I just submitted rechecks for 'em.  #4 failed one test, rebuild in error state or something.  #3 failed that same one and two more.
13:03:43 <esberglu> Yeah will talk about CI failures later in the meeting
13:03:56 <efried> Actually almost no functional changes - a couple of method signatures and the way flavors are passed around.  The vast majority was UT changes.
13:04:26 <efried> So I don't expect CI failures due to my changes.
13:04:59 <efried> Fact that #2 succeeded supports that.
13:05:05 <esberglu> Yeah those failures are unrelated
13:05:05 <efried> Cause the method signature changes started there.
13:05:35 <efried> So today I plan to rebase the SSP change set.  Then jayasankar_ can retest that guy and we can see if there's still that weird glance bug to nail down.
13:05:53 <efried> Say, is jayasankar_ on this meeting notice?
13:06:04 <esberglu> No I will add him
13:06:08 <efried> Thanks.
13:06:19 <efried> And Nilesh too
13:06:23 <esberglu> Yep
13:06:51 <efried> I don't have anything else.
13:07:16 <esberglu> I think jayasankar_ hit some issues testing in-tree. I haven't looked into it yet, but we may have an issue there
13:07:29 <esberglu> Gonna help him with that today if he's online
13:07:38 <esberglu> That's all I have for in-tree
13:07:47 <esberglu> #topic Out of Tree Driver
13:07:58 <esberglu> Guessing there isn't gonna be much to talk about here
13:08:34 <adreznec> We need to get that new pypowervm release out
13:08:35 <efried> I've still been backporting some of the changes from in-tree.
13:08:47 <esberglu> Rechecked all changes to the stable/* branches, so the backlog is cleared there
13:08:57 <adreznec> Currently our reqs issue is blocking an OSA SHA release
13:08:58 <efried> I'll put up a new OOT change set once I'm done with the SSP stuff.
13:09:09 <adreznec> The PTL came to talk to me about it yesterday
13:09:33 <efried> Guess you'll be on jfoliva's ass today.  Sh*t rolls downhill.
13:09:45 <adreznec> I pinged Julio and I believe he and Dom merged the change, but getting that out today is my priority
13:09:55 <adreznec> So we can merge https://review.openstack.org/#/c/440811/
13:10:24 <efried> We'll have to put up a new patch set on that.
13:10:31 <efried> but yeah
13:10:33 <adreznec> Yep
13:10:40 <adreznec> I'll drive that
13:11:04 <efried> #action adreznec to hold the whip on jfoliva to get the new pypowervm release out.
13:11:18 <efried> #action adreznec to close https://review.openstack.org/#/c/440811/
13:13:01 <esberglu> #topic CI
13:14:17 <esberglu> We are getting HTTP 500 errors, and they are hitting multiple tests
13:14:28 <esberglu> I'm guessing that's what hung up your patches efried
13:14:29 <esberglu> http://184.172.12.213/52/445652/3/check/nova-out-of-tree-pvm/ab6525c/powervm_os_ci.html
13:14:32 <esberglu> ^ example
13:14:48 <esberglu> I haven't had a chance to debug yet, been putting out other CI fires for the last week or so
13:15:01 <esberglu> But it's on my radar for today
13:15:17 <esberglu> I know that it was hitting some of the "rebuild" tests as well
13:17:08 <esberglu> The other thing I wanted to talk about is how to determine which tests to run for in-tree and how/when we are going to put those out
13:17:15 <efried> unknown internal error with half a dozen blank lines in the <Message>
13:17:18 <efried> That's gonna be fun to debug.
13:17:37 <efried> We don't save off the REST logs, do we?
13:17:42 <esberglu> No
13:17:47 <efried> whee.
13:17:52 <esberglu> We could
13:17:54 <efried> But I don't think my failures were the same, fwiw.
13:18:11 <efried> The 500 showed up in the html report here, but the same wasn't happening in my failures.
13:18:21 <efried> Anyway, leave it to you to debug.  Let me know if you need help.
13:18:25 <esberglu> Sounds good
13:18:32 <esberglu> Anyway back to the in-tree whitelist
13:18:32 <efried> Check with apearson whether it's okay to publish REST logs publicly.
13:19:09 <esberglu> K. The problem with the whitelist is that there isn't a good way to know which tests will be supported when we add certain functionality
13:19:27 <esberglu> What I have been doing
13:19:28 <efried> esberglu One strategy is: with each new change set, make sure OOT is stable, then run the in-tree change set with the OOT config.
13:19:35 <efried> Whatever passes, makes the whitelist.
13:19:38 <esberglu> Yep that's what I have been doing
13:19:51 <efried> It's simple and easy.  But it's not particularly scientific.
13:19:56 <esberglu> Exactly
13:20:16 <efried> What we really should be doing (ugh) is inspecting each test and seeing if it *should* pass or fail.
13:20:31 <efried> If it should fail and it passes, we should open a bug for that, and potentially fix the test.
13:20:31 <esberglu> Yeah, but that's a HUGE time investment
13:20:40 <efried> I suspect there are a number of tests that fit that category.
13:20:55 <efried> If it should pass and it fails, likewise nail that down.
13:21:23 <efried> Yes.  Possibly something jayasankar_ and nilesh could help out with.
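The two whitelist approaches discussed above (take whatever passes, versus review each test against what *should* pass) could be sketched roughly as below. This is an illustration only: the function names are hypothetical, and the assumed input format (one result per line, `<test id> ... <status>`, with `ok` meaning a pass) may not match the real CI report.

```shell
# Sketch only. Assumes one result per line in the form "<test id> ... <status>",
# where a status of "ok" means the test passed. The real report format may differ.

# Read results on stdin; print the sorted, de-duplicated ids of passing tests.
# This is the "whatever passes, makes the whitelist" strategy.
build_whitelist() {
    awk '$NF == "ok" { print $1 }' | sort -u
}

# Compare a whitelist built from a run ($1) against a hand-reviewed list of
# tests that should pass ($2). Both files must be sorted, one test id per line.
audit_whitelist() {
    echo "passes but is not expected to (possible test bug):"
    comm -23 "$1" "$2"
    echo "expected to pass but does not (possible driver or CI bug):"
    comm -13 "$1" "$2"
}
```

The first function is cheap to run after every stable OOT run; the second only becomes useful once someone has actually produced the reviewed list, which is the large time investment mentioned above.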
13:22:08 <esberglu> The other thing related to this is when we are going to put changes out
13:22:20 <efried> to the in-tree whitelist?
13:22:32 <esberglu> Right now we are pulling in the 1st patch for all in-tree runs
13:22:53 <esberglu> Once that 1st one merges, are we going to start pulling in the 2nd patch and test that for all in-tree?
13:23:35 <efried> Well, I don't think the methodology has to do with change-by-change so much as by when (at which change set boundaries) the whitelist changes.
13:23:52 <efried> The whitelist may actually be the same for the first four or five change sets.
13:24:14 <efried> Is it?
13:24:19 <esberglu> No
13:24:38 <esberglu> power on/off will have an associated whitelist change
13:25:12 <efried> Okay, so #1 and #2 have the same whitelist, then it changes for power on/off, then it stays the same again until SSP, I'm guessing.
13:25:18 <efried> or maybe console.
13:26:03 <esberglu> Yep I don't think any of the 4 spawn/destroy will change anything
13:26:15 <esberglu> But my issue is still present when we get to SSP
13:26:23 <esberglu> We can't change the whitelist until that change is in
13:26:33 <esberglu> Meaning we won't get much CI volume on that change
13:26:49 <esberglu> Unless we do what we are doing with PS1 right now and pull it in for every in-tree run
13:26:50 <efried> Unless we change our setup to pull in that guy
13:26:53 <efried> right.
13:26:59 <efried> But
13:27:18 <efried> Then with the lower change sets we would have to figure out how to allow them to revert down the tree.
13:27:24 <efried> Until they merge.
13:27:30 <efried> Which is gonna be a while, way things are going.
13:27:54 <efried> So maybe we should look into enabling that.
13:28:25 <efried> Then we could essentially move up the whitelist and the baseline change set any time we have a change set we consider stable.
13:29:43 <efried> Right now we're doing something like, "If the path from the current change set back to tip of master contains our baseline change set, don't apply our baseline change set."
13:30:23 <esberglu> Okay I like that going forward. For now I think we are stable through the 4th spawn delete (full flavor). So are we ready to move our baseline to that?
13:30:30 <esberglu> And yeah that's what we are currently doing
13:30:40 <efried> We would have to do that, but also, "If the path from our baseline back to the tip of master contains the current change set, don't apply our baseline change set."
13:31:06 <efried> Yes, I'm happy to move our baseline to that once we have the above logic working.
13:31:13 <efried> But not until then.
13:31:22 <efried> Cause otherwise change sets 1-3 will fail.
13:31:37 <efried> They won't fail the CI run - they'll fail on the git shuffle.
13:32:03 <esberglu> Okay. I can look into that on staging, might bug you later if I have questions
13:32:07 <efried> Cause you'll be trying to cherry-pick, say, #2 onto the tip of a chain that already contains #2.
13:32:30 <esberglu> I think that --allow-empty gets past that, no?
13:32:44 <esberglu> or --keep-redundant-commits
13:32:48 <esberglu> I think
13:32:50 <efried> It may.  But I'm not sure.  There's a more explicit way to do it.
13:34:06 <esberglu> #action esberglu: Add CI logic for applying proper changesets
13:34:15 <efried> if git log --pretty=format:%H origin/${ZUUL_BRANCH}..HEAD | grep -q $commit; then
13:34:21 <efried> That's what we're doing today.
13:34:39 <efried> The new thing is going to be a little more complicated because we don't yet have $commit downloaded.
13:35:27 <efried> oo, it's even more complicated because we're theoretically looping through and cherry-picking multiple commits.
13:35:35 <efried> This is gonna be fun.
13:36:48 <efried> Anyway,
13:36:48 <efried> we don't need to solve it here.
13:36:48 <efried> We can actually do the git fetch and compare origin/${ZUUL_BRANCH} to FETCH_HEAD
13:37:02 <efried> And just skip the cherry-pick.
13:37:07 <efried> So yeah, we don't need to solve it here.
13:37:10 <efried> But that solves it.
13:37:12 <esberglu> lol
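A minimal sketch of the two-way check described above, assuming a single baseline change set and that commit hashes can be compared directly between the checked-out chain and the fetched baseline. The function names and the `baseline_ref` argument are hypothetical. The real CI logic loops over multiple commits, which this sketch does not attempt.

```shell
# Sketch only: function names and the baseline ref argument are hypothetical.
# Relies on ZUUL_BRANCH being set, as in the existing check.

# Succeeds if the hash given as $1 appears as a whole line on stdin.
hash_in_list() {
    grep -qxF -- "$1"
}

# Succeeds if commit $1 lies on the path from origin/$ZUUL_BRANCH to ref $2.
chain_contains() {
    git log --pretty=format:%H "origin/${ZUUL_BRANCH}..$2" | hash_in_list "$1"
}

# Fetch the baseline change set and cherry-pick it only when neither chain
# already contains the other.
maybe_apply_baseline() {
    local baseline_ref="$1"
    local baseline current
    git fetch origin "$baseline_ref" || return 1
    baseline=$(git rev-parse FETCH_HEAD)
    current=$(git rev-parse HEAD)
    if chain_contains "$baseline" HEAD; then
        # Existing rule: the change under test already sits on top of the baseline.
        echo "baseline already in current chain; skipping cherry-pick"
    elif chain_contains "$current" FETCH_HEAD; then
        # New rule: the change under test sits below the baseline.
        echo "current change is below the baseline; skipping cherry-pick"
    else
        git cherry-pick FETCH_HEAD
    fi
}
```

Skipping the cherry-pick explicitly avoids relying on `--allow-empty` or `--keep-redundant-commits` to get past a pick of a commit that is already in the chain. If the change under test is rebased or merged so that its hash differs from the copy in the baseline chain, a hash comparison like this would miss it.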
13:38:15 <esberglu> Okay. The only other thing I have for CI is OSA CI
13:38:24 * efried backs away slowly
13:38:31 <esberglu> Same...
13:39:29 <esberglu> Currently just trying to get run_playbooks script to work
13:40:06 <esberglu> Which is what runs basically the entire set of ansible playbooks
13:40:41 <esberglu> It took forever to get that env. stable, but I think it finally is and I can start grinding through the failures there
13:41:10 <esberglu> That's all for me today. Any other topics/thoughts?
13:41:27 <adreznec> Any specific issues on the OSA CI at this point?
13:41:53 <esberglu> It's failing to install pip
13:42:00 <adreznec> That... seems odd
13:42:31 <esberglu> Yeah. I've only tried the script once and didn't have time to debug yet
13:42:38 <esberglu> So it might be something trivial
13:42:47 <adreznec> Ok
13:43:10 <esberglu> I'll keep you posted
13:44:01 <efried> esberglu #3 and #4 failed in-tree CI again.
13:44:09 <esberglu> Ugh. Same thing?
13:44:32 <efried> no
13:44:40 <efried> test_multiple_create_with_reservation_return
13:45:03 <efried> and test_multiple_create
13:45:07 <efried> respectively.
13:45:20 <esberglu> Http 500 or not?
13:45:59 <efried> not from us.
13:46:09 <efried> Looks like it might be neutron glitches.
13:46:25 <efried> Want me to recheck again, or leave 'em?
13:48:50 <esberglu> Looks like the test_multiple_create is hitting all of the OOT runs since last night
13:49:01 <efried> is it new?
13:49:04 <efried> the test, that is?
13:49:17 <efried> sorry, we shouldn't derail the meeting with this.  Is the meeting over?
13:49:24 <esberglu> Yeah I think so
13:49:30 <esberglu> #endmeeting