13:01:56 <esberglu> #startmeeting powervm_driver_meeting
13:01:57 <openstack> Meeting started Tue Aug 1 13:01:56 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:58 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:02:00 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:02:10 <mdrabe> o/
13:02:40 <esberglu> #link https://etherpad.openstack.org/p/powervm_driver_meeting_agenda
13:03:17 <efried> \o
13:03:36 <thorst_afk> o/
13:03:46 <esberglu> #topic In Tree Driver
13:03:56 <esberglu> #link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:04:21 <esberglu> efried: Any updates here?
13:04:27 <edmondsw> o/
13:04:59 <edmondsw> I think we're pretty much in a holding pattern here until pike goes out and we can start working on queens
13:05:17 <esberglu> Yeah, that's what I thought as well
13:05:53 <esberglu> #topic Out of Tree Driver
13:05:59 <efried> sorry, yes, that's the case.
13:07:19 <edmondsw> what did we decide to do with mdrabe's UUID-instead-of-instance-for-better-performance change? any news there?
13:07:50 <mdrabe> We're gonna cherry-pick that internally for test
13:08:12 <thorst_afk> get burn-in, then once we're clear that it's solid, push it through
13:08:24 <mdrabe> So I'm gonna finish out UT, do the merge, and we're currently getting the test cases reviewed
13:08:39 <edmondsw> by cherry-pick, you mean into pvcos so everyone has it, or just for a select tester to apply to their system?
13:08:51 <mdrabe> The former
13:08:55 <edmondsw> good
13:09:49 <edmondsw> we had a conversation this week about adding support for mover service partitions to NovaLink
13:10:25 <mdrabe> Yeah, that'd be good for queens
13:10:26 <edmondsw> PowerVC already has this for HMC, and we're going to start exposing it to customers via a new CLI command in 1.4.0, but we don't have this for NovaLink
13:11:18 <edmondsw> so we're investigating what it would take to support this for NovaLink as well... yeah, queens
13:11:51 <edmondsw> anything else?
13:12:02 <mdrabe> On that...
13:12:06 <mdrabe> Could we still work it in regardless of platform support?
13:12:39 <edmondsw> not sure I follow...
13:12:50 <efried> "we" who, and what do you mean by "platform"?
13:13:43 <mdrabe> Well, if NL doesn't have the support for specifying MSPs, can we still have all the plumbing in nova-powervm?
13:14:14 <thorst_afk> we need the plumbing in place before we do anything in nova-powervm. We could start the patch, but we would never push it through until the pypowervm/novalink changes are through
13:14:44 <mdrabe> K, that's what I was wondering, thanks
13:15:16 <esberglu> Anything else?
13:17:54 <edmondsw> I may have found someone to help with the iSCSI dev, but not sure there
13:17:59 <esberglu> #topic PCI Passthru
13:18:06 <edmondsw> that's it
13:18:27 <edmondsw> I don't have any news on PCI passthru... efried?
13:18:34 <efried> no
13:18:40 <edmondsw> next topic
13:19:54 <edmondsw> esberglu?
13:20:07 <esberglu> #topic PowerVM CI
13:20:23 <esberglu> Just got some comments back on the devstack patches I submitted, need to address them
13:20:31 <edmondsw> I saw those
13:20:44 <edmondsw> do you know what he's talking about with meta?
13:21:34 <esberglu> Yeah, I think there may be a way you can set tempest.conf options in the local.conf without using devstack options
13:21:53 <esberglu> Like put the actual tempest.conf lines in there instead of using devstack options mapped to tempest options
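
For context: DevStack's local.conf supports meta-sections of the form [[ <phase> | <config-file> ]], whose contents are written into the named config file at the corresponding phase of stack.sh. A minimal sketch of what esberglu describes, assuming the test-config phase and the $TEMPEST_CONFIG variable are available in the DevStack branch in use; the section and value below are illustrative, not taken from the meeting:

    # local.conf (excerpt)
    [[test-config|$TEMPEST_CONFIG]]
    [compute]
    # This line lands in tempest.conf's [compute] section verbatim,
    # rather than going through a DevStack option mapped onto it.
    build_timeout = 300
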
13:22:52 <esberglu> Other than that, I'm testing REST log copying on staging right now; should be able to have that on prod by the end of the day, I think
13:23:04 <efried> Can you add me to those reviews? I may not have any useful feedback, but want to at least glance at 'em.
13:23:12 <esberglu> efried: Yep
13:23:55 <edmondsw> efried they're all linked in 5598's commit message
13:23:57 <esberglu> The relevant REST logs are just the FFDC logs? Or are there other REST logs that we want?
13:24:26 <efried> esberglu Certainly FFDC and Audit.
13:24:36 <efried> Not sure any of the others are relevant, lemme look real quick.
13:25:12 <efried> Yeah, that should be fine, assuming we're not turning on developer debug.
13:25:18 <mdrabe> Aren't there JNI logs? Would we want those?
13:26:23 <efried> Mm, don't know where those are offhand. We seldom need them. But probably not a bad idea.
13:26:39 <efried> Have to ask seroyer or nvcastet where they live.
13:26:56 <esberglu> They're somewhere in /var/log/pvm/wlp; I can find them
13:27:23 <mdrabe> Actually one dir up
13:27:30 <esberglu> Yep
13:28:04 <efried> So esberglu This could wind up being a nontrivial amount of data. Do we have the space?
13:29:26 <esberglu> efried: Let me take a quick look at the size of those files when zipped
13:29:31 <efried> We're talking maybe tens of MB per run.
13:29:56 <esberglu> I'll take a look and do some math after the meeting
13:30:31 <esberglu> If not, we can add space or potentially change how long they stick around
13:30:48 <efried> Oh, hold on
13:31:24 <efried> We're talking about scping the REST (and JNI) logs from a neo that's serving several CI nodes across multiple runs?
13:31:45 <esberglu> efried: Yeah
13:31:54 <efried> Yeeeaaahhh, so that's not gonna work.
13:31:59 <efried> That's gonna be more than tens of megs.
13:32:05 <efried> And we'll be copying the same data over and over again.
13:32:26 <efried> I think we need to be a bit more clever.
13:32:39 <edmondsw> yeah...
13:32:52 <efried> We should make a dir per neo on the log server.
13:32:55 <edmondsw> what were we planning to use as the trigger for this?
13:33:07 <efried> And copy each neo's logs into it.
13:33:21 <edmondsw> and only if we see that the current logs there are not recent
13:33:21 <efried> And refresh (total replace) those periodically (period tbd)
13:33:41 <efried> And then link to the right neo's dir from the CI results of a given run.
13:33:50 <esberglu> efried: Should be able to just add a cron to each neo to scrub and copy
13:33:54 <efried> edmondsw Well, they'll always be out of date.
13:33:56 <esberglu> periodically
13:34:19 <edmondsw> efried what do you mean, always out of date?
13:34:22 <efried> Unless we have a period of time where zero runs are happening against that neo.
13:34:33 <efried> esberglu That's pretty rare, nah?
13:35:01 <esberglu> Eh, it happens decently often
13:35:14 <efried> In any case, perhaps we look into rsync.
13:35:19 <esberglu> 14 neos, and we're often running fewer runs than that
13:36:19 <esberglu> K. I will work on that today
13:36:39 <efried> Honestly don't know how it works trying to copy out a file while it's being written to.
13:36:57 <efried> but I'm sure people smarter than us figured that out decades ago.
13:37:19 <efried> ...which is why we should try to use something like rsync rather than writing the logic ourselves.
13:38:07 <efried> And a trigger to make sure we're synced should be a failing run.
13:38:28 <efried> With appropriate queueing in case a second run fails while we're still copying the logs from the first failing run.
13:38:30 <efried> And all that.
13:39:17 <esberglu> Just trying to figure out how we will handle the scrubbing
13:39:41 <efried> I think aging, not scrubbing.
13:40:15 <efried> The FFDC logs take care of their own rotation
13:40:43 <efried> How old do we let our openstack logs get before we scrub 'em?
13:41:15 <esberglu> Not sure off the top of my head
13:41:17 <esberglu> Looking
13:42:49 <esberglu> Anyway, we can sort out the details post-meeting
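
A rough sketch of the cron-driven sync being discussed, assuming a small script on each neo pushes its logs to a per-host directory on the log server via rsync; the hostname, paths, and destination below are placeholders, not values agreed in the meeting:

    #!/usr/bin/env python3
    """Push this neo's REST/FFDC logs to a per-host dir on the CI log server."""
    import socket
    import subprocess

    LOG_SERVER = "ci-logs.example.com"   # placeholder log server
    LOCAL_LOG_DIR = "/var/log/pvm"       # REST/FFDC (and JNI) logs live under here
    REMOTE_ROOT = "/srv/ci/neo-logs"     # placeholder destination root

    def sync_logs():
        # One directory per neo on the log server, so the CI results for
        # a given run can simply link to the right host's directory.
        dest = "%s:%s/%s/" % (LOG_SERVER, REMOTE_ROOT, socket.gethostname())
        # rsync transfers only what changed since the last sync, so a
        # periodic cron invocation avoids re-copying the same data, and
        # it copes gracefully with files that are still being appended to.
        subprocess.run(["rsync", "-az", "--delete", LOCAL_LOG_DIR, dest],
                       check=True)

    if __name__ == "__main__":
        sync_logs()

Aging could then live on the log server side as its own periodic job (for example, deleting per-neo files older than whatever retention window gets picked), separate from the rotation the FFDC logs already do for themselves.
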
13:43:34 <edmondsw> anything else going on with the CI?
13:43:54 <esberglu> Haven't looked at failures today, but just the timeout thing
13:44:13 <esberglu> Need to touch base to get someone looking at the REST logs
13:44:26 <edmondsw> we still seeing a lot of timeouts?
13:44:49 <esberglu> Excuse me, I was talking about the Internal Server Error 500 for the REST logs
13:44:54 <edmondsw> I thought with the marker LUs and all fixed that would go back to an occasional thing
13:44:55 <esberglu> Yeah, still seeing timeouts as well
13:45:16 <esberglu> edmondsw: The marker LU thing was causing the 3-4+ hour runs
13:45:31 <esberglu> These are timeouts on a specific subset of tests that hit intermittently
13:46:12 <edmondsw> k
13:47:15 <esberglu> #topic Driver Testing
13:47:37 <edmondsw> jay1_ anything here?
13:48:04 <jay1_> I haven't gotten any update from Ravi yet; it seems like he still needs some more time to come back
13:49:49 <jay1_> The present issue is with the iSCSI volume attach.
13:53:18 <edmondsw> jay1_ is the issues etherpad up to date?
13:53:37 <edmondsw> https://etherpad.openstack.org/p/powervm-driver-test-status
13:53:56 <edmondsw> not a lot of information there
13:55:01 <jay1_> Yeah... same issue with the volume attach; will try to add the log error message as well
13:55:38 <edmondsw> tx
13:55:48 <edmondsw> esberglu that's probably all there for today
13:55:51 <edmondsw> next topic
13:55:58 <esberglu> #topic Open Discussion
13:56:25 <esberglu> Any last words?
13:57:10 <esberglu> #endmeeting