*** smatzek has quit IRC | 00:04 | |
*** edmondsw has joined #openstack-powervm | 00:21 | |
*** edmondsw has quit IRC | 00:25 | |
*** thorst_afk has joined #openstack-powervm | 00:46 | |
*** thorst_afk has quit IRC | 00:54 | |
*** thorst_afk has joined #openstack-powervm | 01:29 | |
*** thorst_afk has quit IRC | 01:38 | |
*** thorst_afk has joined #openstack-powervm | 01:38 | |
*** thorst_afk has quit IRC | 01:43 | |
*** thorst_afk has joined #openstack-powervm | 01:55 | |
*** thorst_afk has quit IRC | 01:55 | |
*** thorst_afk has joined #openstack-powervm | 02:11 | |
*** thorst_afk has quit IRC | 02:25 | |
*** esberglu has joined #openstack-powervm | 02:39 | |
*** esberglu has quit IRC | 02:44 | |
*** esberglu has joined #openstack-powervm | 02:49 | |
*** apearson has joined #openstack-powervm | 03:10 | |
*** apearson has quit IRC | 03:19 | |
*** apearson has joined #openstack-powervm | 03:19 | |
*** https_GK1wmSU has joined #openstack-powervm | 03:22 | |
*** https_GK1wmSU has left #openstack-powervm | 03:23 | |
*** apearson has quit IRC | 04:13 | |
*** apearson has joined #openstack-powervm | 04:25 | |
*** thorst_afk has joined #openstack-powervm | 04:26 | |
*** thorst_afk has quit IRC | 04:31 | |
*** esberglu has quit IRC | 06:06 | |
*** esberglu has joined #openstack-powervm | 06:06 | |
*** esberglu has quit IRC | 06:15 | |
*** thorst_afk has joined #openstack-powervm | 06:28 | |
*** thorst_afk has quit IRC | 06:32 | |
*** thorst_afk has joined #openstack-powervm | 08:27 | |
*** thorst_afk has quit IRC | 08:32 | |
*** esberglu has joined #openstack-powervm | 09:48 | |
*** esberglu has quit IRC | 09:53 | |
*** thorst_afk has joined #openstack-powervm | 10:28 | |
*** thorst_afk has quit IRC | 10:32 | |
*** smatzek has joined #openstack-powervm | 11:04 | |
*** edmondsw has joined #openstack-powervm | 11:09 | |
*** edmondsw has quit IRC | 11:14 | |
*** esberglu has joined #openstack-powervm | 11:37 | |
*** thorst_afk has joined #openstack-powervm | 11:40 | |
*** efried has quit IRC | 11:40 | |
*** esberglu has quit IRC | 11:42 | |
*** svenkat has joined #openstack-powervm | 11:49 | |
*** efried has joined #openstack-powervm | 11:52 | |
*** apearson has quit IRC | 11:53 | |
*** svenkat_ has joined #openstack-powervm | 11:55 | |
*** svenkat has quit IRC | 11:57 | |
*** svenkat_ is now known as svenkat | 11:57 | |
*** thorst_afk has quit IRC | 12:22 | |
*** esberglu has joined #openstack-powervm | 12:24 | |
*** openstackgerrit has joined #openstack-powervm | 12:35 | |
openstackgerrit | Eric Berglund proposed openstack/nova-powervm master: DNM: ci check https://review.openstack.org/328315 | 12:35 |
---|---|---|
openstackgerrit | Eric Berglund proposed openstack/nova-powervm master: DNM: CI Check2 https://review.openstack.org/328317 | 12:35 |
*** edmondsw has joined #openstack-powervm | 12:38 | |
*** esberglu has quit IRC | 12:39 | |
*** esberglu has joined #openstack-powervm | 12:55 | |
*** apearson has joined #openstack-powervm | 12:57 | |
*** thorst_afk has joined #openstack-powervm | 12:57 | |
*** kylek3h has joined #openstack-powervm | 13:01 | |
*** jay1_ has joined #openstack-powervm | 13:01 | |
esberglu | #startmeeting powervm_driver_meeting | 13:01 |
openstack | Meeting started Tue Aug 1 13:01:56 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot. | 13:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 13:01 |
openstack | The meeting name has been set to 'powervm_driver_meeting' | 13:02 |
mdrabe | o/ | 13:02 |
esberglu | #link https://etherpad.openstack.org/p/powervm_driver_meeting_agenda | 13:02 |
efried | \o | 13:03 |
thorst_afk | o/ | 13:03 |
esberglu | #topic In Tree Driver | 13:03 |
esberglu | #link https://etherpad.openstack.org/p/powervm-in-tree-todos | 13:03 |
esberglu | efried: Any updates here? | 13:04 |
edmondsw | o/ | 13:04 |
edmondsw | I think we're pretty much in a holding pattern here until pike goes out and we can start working on queens | 13:04 |
esberglu | Yeah that's what I thought as well | 13:05 |
esberglu | #topic Out of Tree Driver | 13:05 |
efried | sorry, yes, that's the case. | 13:05 |
*** dwayne has quit IRC | 13:06 | |
edmondsw | what did we decide to do with mdrabe's UUID-instead-of-instance-for-better-performance change? any news there? | 13:07 |
mdrabe | We're gonna cherry-pick that internally for test | 13:07 |
thorst_afk | get burn in, then once we're clear that it's solid, push through | 13:08 |
mdrabe | So I'm gonna finish out UT, do the merge, and we're currently getting the test cases reviewed | 13:08 |
edmondsw | by cherry-pick, you mean into pvcos so everyone has it, or just for a select tester to apply to their system? | 13:08 |
mdrabe | The former | 13:08 |
edmondsw | good | 13:08 |
edmondsw | we had a conversation this week about adding support for mover service partitions to NovaLink | 13:09 |
mdrabe | Yea that'd be good for queens | 13:10 |
edmondsw | PowerVC already has this for HMC, and we're going to start exposing it to customers via a new CLI command in 1.4.0, but we don't have this for NovaLink | 13:10 |
edmondsw | so we're investigating what it would take to support for NovaLink as well... yeah, queens | 13:11 |
edmondsw | anything else? | 13:11 |
mdrabe | On that... | 13:12 |
mdrabe | Could we still work it in regardless of platform support? | 13:12 |
edmondsw | not sure I follow... | 13:12 |
efried | "we" who, and what do you mean by "platform"? | 13:12 |
mdrabe | Well if NL doesn't have the support for specifying MSPs, can we still have all the plumbing in nova-powervm? | 13:13 |
thorst_afk | we need the plumbing in place before we do anything in nova-powervm. We could start the patch, but we would never push it through until the pypowervm/novalink changes are through | 13:14 |
mdrabe | K that's what I was wondering, thanks | 13:14 |
esberglu | Anything else? | 13:15 |
edmondsw | I may have found someone to help with the iSCSI dev, but not sure there | 13:17 |
esberglu | #topic PCI Passthru | 13:17 |
edmondsw | that's it | 13:18 |
edmondsw | I don't have any news on PCI passthru... efried? | 13:18 |
efried | no | 13:18 |
edmondsw | next topic | 13:18 |
*** cjvolzka has joined #openstack-powervm | 13:19 | |
edmondsw | esberglu? | 13:19 |
esberglu | #topic PowerVM CI | 13:20 |
esberglu | Just got some comments back on the devstack patches I submitted, need to address them | 13:20 |
edmondsw | I saw those | 13:20 |
edmondsw | do you know what he's talking about with meta? | 13:20 |
esberglu | Yeah I think there may be a way you can set tempest.conf options in the local.conf without using devstack options | 13:21 |
esberglu | Like put the actual tempest.conf lines in there instead of using devstack options mapped to tempest options | 13:21 |
esberglu | Other than that I'm testing REST log copying on staging right now, should be able to have that on prod by the end of the day I think | 13:22 |
efried | Can you add me to those reviews? I may not have any useful feedback, but want to at least glance at 'em. | 13:23 |
esberglu | efried: Yep | 13:23 |
edmondsw | efried they're all linked in 5598's commit message | 13:23 |
esberglu | The relevant rest logs are just the FFDC logs? Or are there other rest logs that we want | 13:23 |
efried | esberglu Certainly FFDC and Audit. | 13:24 |
efried | Not sure any of the others are relevant, lemme look real quick. | 13:24 |
efried | Yeah, that should be fine, assuming we're not turning on developer debug. | 13:25 |
mdrabe | Aren't there JNI logs? Would we want those? | 13:25 |
efried | Mm, don't know where those are offhand. We seldom need them. But probably not a bad idea. | 13:26 |
efried | Have to ask seroyer or nvcastet where they live. | 13:26 |
esberglu | They're somewhere in /var/log/pvm/wlp, I can find them | 13:26 |
mdrabe | Actually one dir up | 13:27 |
esberglu | Yep | 13:27 |
efried | So esberglu This could wind up being a nontrivial amount of data. Do we have the space? | 13:28 |
esberglu | efried: Let me look at the size of those files when zipped quick | 13:29 |
efried | talking maybe tens of MB per run. | 13:29 |
esberglu | I'll take a look and do some math after the meeting | 13:29 |
esberglu | If not we can add space or potentially change how long they stick around | 13:30 |
efried | Oh, hold on | 13:30 |
efried | We're talking about scping the REST (and JNI) logs from a neo that's serving several CI nodes across multiple runs? | 13:31 |
esberglu | efried: Yeah | 13:31 |
efried | Yeeeaaahhh, so that's not gonna work. | 13:31 |
efried | That's gonna be more than tens of megs. | 13:31 |
efried | And we'll be copying the same data over and over again. | 13:32 |
efried | I think we need to be a bit more clever. | 13:32 |
edmondsw | yeah... | 13:32 |
efried | We should make a dir per neo on the log server. | 13:32 |
edmondsw | what were we planning to use as the trigger for this? | 13:32 |
efried | And copy each neo's logs into it. | 13:33 |
edmondsw | and only if we see that the current logs there are not recent | 13:33 |
efried | And refresh (total replace) those periodically (period tbd) | 13:33 |
efried | And then link to the right neo's dir from the CI results of a given run. | 13:33 |
esberglu | efried: Should be able to just add a cron to each neo to scrub and copy | 13:33 |
efried | edmondsw Well, they'll always be out of date. | 13:33 |
esberglu | periodically | 13:33 |
edmondsw | efried what do you mean, always out of date? | 13:34 |
efried | Unless we have a period of time where zero runs are happening against that neo. | 13:34 |
efried | esberglu That's pretty rare, nah? | 13:34 |
esberglu | Eh it happens decently often | 13:35 |
efried | In any case, perhaps we look into rsync. | 13:35 |
esberglu | 14 neos, we are often running fewer runs than that | 13:35 |
esberglu | K. I will work on that today | 13:36 |
efried | Honestly don't know how it works trying to copy out a file while it's being written to. | 13:36 |
efried | but I'm sure people smarter than us figured that out decades ago. | 13:36 |
efried | ...which is why we should try to use something like rsync rather than writing the logic ourselves. | 13:37 |
efried | And a trigger to make sure we're synced should be a failing run. | 13:38 |
efried | With appropriate queueing in case a second run fails while we're still copying the logs from the first failing run. | 13:38 |
efried | And all that. | 13:38 |
esberglu | Just trying to figure out how we will handle the scrubbing | 13:39 |
efried | I think aging, not scrubbing. | 13:39 |
efried | The FFDC logs take care of their own rotation | 13:40 |
efried | How old do we let our openstack logs get before we scrub 'em? | 13:40 |
esberglu | Not sure off the top of my head | 13:41 |
esberglu | Looking | 13:41 |
esberglu | Anyway we can sort out the details post meeting | 13:42 |
edmondsw | anything else going on with the CI? | 13:43 |
esberglu | Haven't looked at failures today, but just the timeout thing | 13:43 |
esberglu | Need to touch base to get someone looking at the rest logs | 13:44 |
edmondsw | we still seeing a lot of timeouts? | 13:44 |
esberglu | Excuse me I was talking about the Internal Server Error 500 for rest logs | 13:44 |
edmondsw | I thought with the marker LUs and all fixed that would go back to an occasional thing | 13:44 |
esberglu | Yeah still seeing timeouts as well | 13:44 |
esberglu | edmondsw: The marker LU thing was causing the 3-4+ hour runs | 13:45 |
esberglu | These are timeouts on a specific subset of tests that hit intermittently | 13:45 |
edmondsw | k | 13:46 |
esberglu | #topic Driver Testing | 13:47 |
edmondsw | jay1_ anything here? | 13:47 |
jay1_ | I haven't got any update from Ravi yet, seems like he still needs some more time to come back | 13:48 |
jay1_ | The present issue is with the Iscsi volume attach. | 13:49 |
edmondsw | jay1_ is the issues etherpad up to date? | 13:53 |
edmondsw | https://etherpad.openstack.org/p/powervm-driver-test-status | 13:53 |
edmondsw | not a lot of information there | 13:53 |
jay1_ | Yeah.. same issue with the volume attach, will try to add the log error message as well | 13:55 |
edmondsw | tx | 13:55 |
edmondsw | esberglu that's probably all there for today | 13:55 |
edmondsw | next topic | 13:55 |
esberglu | #topic Open Discussion | 13:55 |
esberglu | Any last words? | 13:56 |
esberglu | #endmeeting | 13:57 |
openstack | Meeting ended Tue Aug 1 13:57:10 2017 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 13:57 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/powervm_driver_meeting/2017/powervm_driver_meeting.2017-08-01-13.01.html | 13:57 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/powervm_driver_meeting/2017/powervm_driver_meeting.2017-08-01-13.01.txt | 13:57 |
openstack | Log: http://eavesdrop.openstack.org/meetings/powervm_driver_meeting/2017/powervm_driver_meeting.2017-08-01-13.01.log.html | 13:57 |
edmondsw | I did have one thing I should have brought up in the OOT driver time... | 13:58 |
edmondsw | this nova-powervm tox issue that pushkaraj found | 13:58 |
edmondsw | I was able to reproduce in a fresh environment | 13:58 |
edmondsw | Maybe there was a recent nova change that started this... I need to look | 13:59 |
efried | edmondsw pvc or community-only? | 13:59 |
edmondsw | I reproduced with community-only | 14:00 |
edmondsw | pushkaraj found with pvcos... so it's both | 14:00 |
efried | And is the req in nova's [test-]requirements.txt? | 14:00 |
edmondsw | yes | 14:00 |
efried | that's weird. That should not happen. | 14:00 |
efried | should be chaining the reqs properly. | 14:00 |
edmondsw | I suspect the way we list nova as a dependency doesn't also pull in nova's dependencies | 14:00 |
edmondsw | at least test dependencies | 14:01 |
efried | ah, yup. | 14:02 |
efried | we do it in tox.ini, not in requirements. | 14:02 |
edmondsw | right | 14:02 |
efried | edmondsw Does nova list it in requirements.txt or test-requirements.txt? | 14:02 |
efried | This is wsgi-intercept? | 14:03 |
efried | That's in test-requirements. | 14:03 |
efried | So yeah, we may need to do an extra -r in there. | 14:04 |
edmondsw | efried I really don't know where to begin here... if you do, can you take this? | 14:04 |
edmondsw | I did confirm that wsgi-intercept is not new for nova | 14:05 |
edmondsw | so I'm not sure why we haven't hit this before | 14:05 |
edmondsw | maybe they started using it in a new place | 14:05 |
efried | edmondsw Is there an email somewhere in my inbox that describes how to reproduce this locally? | 14:07 |
edmondsw | ah, yep, that's exactly it... https://github.com/openstack/nova/commit/fdf27abf7db233ca51f12e2926d78c272b54935b | 14:07 |
edmondsw | efried yes | 14:07 |
edmondsw | it's pretty simple... tox --recreate -epy27,pep8 | 14:07 |
edmondsw | :) | 14:07 |
efried | Okay, I'll see if I can figure it out. I'm not a tox expert. | 14:08 |
edmondsw | yeah, me either... not by a long shot | 14:09 |
edmondsw | tx | 14:09 |
edmondsw | efried just resent the email with the latest info I have | 14:09 |
efried | ack | 14:11 |
*** jay1_ has quit IRC | 14:25 | |
*** dwayne has joined #openstack-powervm | 14:54 | |
esberglu | efried: Sounds like the file may show up as corrupt if you run rsync while it's being written | 15:39 |
efried | boo | 15:42 |
efried | edmondsw esberglu How has this wsgi-intercept thing not shown up in nova-powervm jenkins results? | 15:42 |
efried | edmondsw And do we have a LP bug for it? | 15:43 |
esberglu | efried: It is hitting the jenkins results now | 15:43 |
edmondsw | yeah, I was wondering that too... maybe we have nova installed in the same env, complete with test-requirements? | 15:43 |
esberglu | https://review.openstack.org/#/c/328315/ | 15:43 |
esberglu | http://logs.openstack.org/15/328315/56/check/gate-nova-powervm-python27-ubuntu-xenial/5c0b870/console.html | 15:43 |
edmondsw | efried I don't think pushkaraj opened a LP bug | 15:43 |
edmondsw | efried yeah, I assume we have nova and nova-powervm installed together, including nova's test-requirements, hence no issue with jenkins | 15:44 |
efried | edmondsw In the CI that'd be the case, but the regular jenkins part of a nova-powervm change set should be the same as a local tox -r run, more or less. | 15:45 |
efried | And as esberglu notes, that appears to be the case. | 15:45 |
edmondsw | oh, I see what you mean | 15:45 |
efried | Can someone open a LP bug please? | 15:45 |
efried | edmondsw ? | 15:45 |
edmondsw | sure, I'll open | 15:45 |
efried | I have the fix. | 15:45 |
esberglu | efried: Want to brainstorm other options for REST log copying since it doesn't sound like rsync is going to work? | 15:48 |
efried | esberglu Create a temp dir, do a local copy, scp/rsync the copy, blow away the temp dir? | 15:49 |
edmondsw | efried https://bugs.launchpad.net/nova-powervm/+bug/1707951 | 15:49 |
openstack | Launchpad bug 1707951 in nova-powervm "nova-powervm tox failing with ImportError for wsgi_intercept" [Undecided,New] | 15:49 |
efried | edmondsw Thanks. | 15:50 |
edmondsw | np | 15:50 |
esberglu | efried: local cp wouldn't have any issues with the file still being written right? | 15:50 |
efried | esberglu I believe that to be true. | 15:50 |
efried | We *might* get into trouble if logrotate hits while we're doing the local copy. | 15:51 |
efried | But probably not. | 15:51 |
esberglu | Need to wipe ips as part of it as well | 15:52 |
*** dwayne has quit IRC | 15:53 | |
efried | esberglu Oh, that's gross. Easy enough to do on the .logs, but the .log.gzs will have to be unzipped and rezipped. | 15:53 |
esberglu | Yeah. Annoying but easy | 15:53 |
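
The scrub step discussed above (plain .log files edited directly, .log.gz files unzipped, scrubbed, and rezipped) could look roughly like the sketch below. It is only an illustration, not the scrubbing code the CI actually uses: the IPv4 pattern, the `x.x.x.x` placeholder, and the function names are all assumptions, and it is meant to run on the temporary local copy rather than on the live logs.

```python
import gzip
import re
from pathlib import Path

# Assumed pattern: dotted-quad IPv4 addresses only (no IPv6, no hostnames).
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
PLACEHOLDER = "x.x.x.x"  # hypothetical replacement token


def scrub_text(text: str) -> str:
    """Replace every IPv4-looking token with the placeholder."""
    return IPV4_RE.sub(PLACEHOLDER, text)


def scrub_file(path: Path) -> None:
    """Scrub one log in place; .gz files are unzipped, scrubbed, and rezipped."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as f:
        cleaned = scrub_text(f.read())
    with opener(path, "wt", encoding="utf-8") as f:
        f.write(cleaned)


def scrub_tree(root: Path) -> int:
    """Scrub every *.log and *.log.gz under root; return how many were handled."""
    count = 0
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.name.endswith((".log", ".log.gz")):
            scrub_file(path)
            count += 1
    return count
```

Run against the temp dir before the scp/rsync, e.g. `scrub_tree(Path("/tmp/rest-logs-copy"))` (path is hypothetical), so the copy that leaves the neo never contains raw addresses.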
openstackgerrit | Eric Fried proposed openstack/nova-powervm master: Install nova test requirements for tox https://review.openstack.org/489645 | 15:55 |
efried | edmondsw esberglu ^^ | 15:55 |
mdrabe | efried: Should the topic point to the bug for that review? | 15:56 |
efried | edmondsw Is Pushkaraj on openstack gerrit? | 15:57 |
efried | mdrabe Can do, sec. | 15:57 |
edmondsw | efried not sure I understand the question... he saw with pvcos, which is our own internal gerrit | 15:57 |
efried | edmondsw I mean to add him to the review. | 15:58 |
edmondsw | oh, I don't know... good question | 15:58 |
edmondsw | efried githubusercontent.com? ? | 15:58 |
edmondsw | oh, nm... that's what I get when I go to the raw file as well | 16:00 |
efried | edmondsw Swhat I got when I punched the 'raw' button from https://github.com/openstack/nova/blob/master/test-requirements.txt -- yeah. | 16:00 |
edmondsw | +2 | 16:00 |
efried | thx | 16:00 |
edmondsw | ty | 16:04 |
*** dwayne has joined #openstack-powervm | 16:14 | |
mdrabe | edmondsw: efried: stable nova-powervm pulls nova master? | 16:17 |
efried | mdrabe Shouldn't. Find a bug? | 16:17 |
mdrabe | I'm just wondering about that hard master URL | 16:18 |
efried | Look at tox.ini for e.g. stable/ocata. | 16:18 |
efried | We've been manually editing those links when we cut a stable branch. | 16:18 |
efried | Which is a pain | 16:18 |
efried | But is one of the reasons to get integrated with the official releases process - they do that stuff for ya. | 16:19 |
efried | Though I'm not sure if they would fix the nova dep link too. | 16:19 |
efried | Anyway, yes, we frequently forget to update those things. | 16:19 |
mdrabe | Ok that's my concern | 16:19 |
edmondsw | https://github.com/openstack/nova-powervm/blob/stable/ocata/tox.ini#L14 | 16:20 |
edmondsw | not much we can do about it... if we forget when pike moves to stable, tox will break, and we'll notice and fix it... but we should see it when we're cutting over | 16:20 |
efried | edmondsw tox won't break, though. | 16:20 |
efried | until it does. | 16:20 |
mdrabe | Can tox.ini run a script? | 16:20 |
efried | mdrabe Dunno. That's beyond my ken. | 16:21 |
edmondsw | efried i meant if we started pulling nova from stable/ocata but forgot to make the same change for test-req | 16:21 |
edmondsw | I assumed mdrabe was asking in reference to this fix | 16:22 |
efried | Oh - we don't need to backport this fix, cause the thing that surfaced it was only in pike. | 16:22 |
efried | The test req is in N+, though. | 16:22 |
efried | I mean, technically we could backport the fix, but I say if it ain't broke... | 16:22 |
mdrabe | I'm thinking in the future, if we forget to update for queens | 16:22 |
efried | Well, this one won't break us. But something else might. | 16:23 |
mdrabe | Right some version incompatibility | 16:23 |
efried | Anyway, yeah, it's a hole. We know about it, but we haven't been motivated to fix it f'real yet. We just patch it up once per release whenever we think about it, or if it actually breaks something. | 16:23 |
efried | If you have the time and inclination to figure out the mysterious swirling vortex of tox et al, feel free to make it right. | 16:24 |
efried | But if you have that kind of time, I've got better things for you to do. | 16:24 |
mdrabe | Yea google isn't revealing much on getting a bash context in tox.ini | 16:25 |
efried | mdrabe There's 'commands', but that runs every time, whereas deps only get installed first time or with -r. | 16:26 |
openstackgerrit | Eric Fried proposed openstack/nova-powervm master: Adopt new pypowervm power_off APIs https://review.openstack.org/476274 | 17:09 |
edmondsw | thorst_afk if you'll +2 https://review.openstack.org/#/c/489645/ we can get nova-powervm passing jenkins again | 17:47 |
openstackgerrit | Eric Fried proposed openstack/nova-powervm master: Adopt new pypowervm power_off APIs https://review.openstack.org/476274 | 17:52 |
thorst_afk | edmondsw: looking now | 18:01 |
thorst_afk | looks like a pretty complex change | 18:01 |
edmondsw | oh? | 18:01 |
edmondsw | one line, right? | 18:02 |
thorst_afk | :-) | 18:02 |
thorst_afk | sarcasm is lost in IRC | 18:02 |
edmondsw | :) | 18:02 |
edmondsw | thought maybe you were looking at the wrong thing ;) | 18:02 |
efried | edmondsw I rebased https://review.openstack.org/476274 on it to (further) prove that it works. | 18:04 |
edmondsw | +2 | 18:04 |
thorst_afk | edmondsw: I +2'd the earlier one | 18:20 |
edmondsw | thorst_afk tx | 18:20 |
openstackgerrit | Merged openstack/nova-powervm master: Install nova test requirements for tox https://review.openstack.org/489645 | 18:28 |
*** apearson has quit IRC | 19:00 | |
*** apearson has joined #openstack-powervm | 19:03 | |
*** jay1_ has joined #openstack-powervm | 19:20 | |
edmondsw | esberglu can you check why the CI failed for https://review.openstack.org/#/c/476274/ ? | 19:45 |
edmondsw | timeouts... just recheck? | 19:46 |
esberglu | edmondsw: It also failed a few with this | 19:47 |
esberglu | Failed to power off instance: 'module' object has no attribute 'power_off_progressive' | 19:47 |
edmondsw | esberglu ok yeah, that'd be a problem | 19:48 |
edmondsw | efried ^ | 19:48 |
*** apearson has quit IRC | 20:12 | |
efried | jeez | 20:27 |
efried | looking. | 20:27 |
efried | esberglu Can you give me a pointer to that? | 20:27 |
esberglu | http://184.172.12.213/74/476274/7/check/nova-powervm-out-of-tree-pvm/e829e6b/powervm_os_ci.html | 20:28 |
efried | It worries me that we're getting all these timeouts, need to dig into it a bit more, makes me think we may not be doing forced power-off when we oughtta. | 20:28 |
esberglu | efried: See the ServerDiskConfigTestJSON tests there | 20:28 |
esberglu | I didn't dig into any actual logs | 20:28 |
esberglu | efried: edmondsw: thorst_afk: See 5632 for the first wave of rest log copying | 20:29 |
esberglu | It will keep roughly 24 hours of rest logs and rsync hourly | 20:29 |
*** apearson has joined #openstack-powervm | 20:30 | |
efried | edmondsw Do we log a pip freeze anywhere in our CI runs? | 20:30 |
edmondsw | esberglu ^ | 20:30 |
efried | sorry, yeah | 20:31 |
efried | I've already complained about you two having nicks of the same length starting with e, right? | 20:31 |
edmondsw | be careful, yours starts with e as well ;) | 20:31 |
efried | I mean, at least I have the decency to have two fewer chars in mine. | 20:31 |
*** svenkat has quit IRC | 20:31 | |
edmondsw | so nice of you | 20:31 |
esberglu | efried: Not seeing it anywhere | 20:32 |
edmondsw | esberglu I guess you're wiping ips and domain names because this is stored in public? Why not store it somewhere private and not do that? | 20:33 |
edmondsw | nobody outside IBM is going to be interested in that data, are they? | 20:33 |
esberglu | I thought it would be nice to have it all in one place | 20:33 |
edmondsw | efried, what do you think? Is wiping ips and such going to make it harder to debug things? | 20:34 |
esberglu | But yeah no one outside will likely be interested | 20:34 |
efried | nono, we need to have the stuff public | 20:34 |
efried | why, that's how mriedem found the wait-for-compute bug last week. | 20:35 |
efried | Also for accountability. | 20:35 |
efried | Now, do we really need to bother wiping IPs? | 20:35 |
efried | We should ask someone who knows things about security. | 20:35 |
edmondsw | I'm not saying our logs shouldn't be public... I'm asking why the novalink REST logs need to be public | 20:35 |
edmondsw | specifically | 20:35 |
efried | oh | 20:35 |
edmondsw | the new stuff we're going to start grabbing that will be greek to anyone outside IBM | 20:35 |
efried | well, yeah, they probably don't. | 20:35 |
edmondsw | as for security... yeah, if you're going to make it public you need to wipe IPs | 20:36 |
efried | We're going to need *some* way to walk from a CI result to the appropriate REST logs. | 20:36 |
edmondsw | efried timestamp, no? | 20:36 |
edmondsw | and short hostname | 20:36 |
edmondsw | I think we leave that in the logs, right? | 20:36 |
efried | edmondsw Yeah, it's the hostname. | 20:36 |
efried | So far we've had trouble getting neo's hostname into the CI logs. esberglu Did we solve that yet? | 20:37 |
esberglu | Yeah a while ago | 20:37 |
esberglu | It's at the top of the console log | 20:37 |
esberglu | efried: Do you think it's worth rsyncing on failures? Or just keeping the cron hourly? | 20:38 |
esberglu | Not very often would we be ready to look at the rest logs within an hour of a specific failure | 20:38 |
efried | Be pretty frustrating not to find it. But we can wait an hour. | 20:39 |
efried | So hang back a sec. | 20:39 |
efried | Why are we doing this? | 20:39 |
efried | If we don't need to make the data public, and we know which neo the logs are on, and we know the time stamp of the failure (from the openstack logs), then we can get to the logs we need regardless. | 20:40 |
edmondsw | I thought we were grabbing things before they got overwritten / wrapped | 20:40 |
edmondsw | but that was just my assumption, could be wrong | 20:41 |
efried | esberglu How long before that happens? | 20:41 |
efried | Gimme an example of a neo that's been running a while. | 20:41 |
esberglu | So I was copying the last 20 FFDC logs which is about 24 hours | 20:41 |
esberglu | I thought we were trying to get the rest logs set up somewhere so it was easy to get from a failed run to the rest logs | 20:42 |
esberglu | So we have the last 24 hours or so of rest logs syncing to the logserver | 20:42 |
efried | esberglu If we're not making the logs public, then we can't link 'em from CI results, so we're not getting any ease-of-get-to benefit. | 20:42 |
efried | Is neo7 a CI host? | 20:42 |
esberglu | efried: Yeah that's why I put it on the logserver | 20:42 |
esberglu | Yes it is | 20:43 |
efried | That guy has FFDC logs back to 7/27 a.m. | 20:43 |
efried | How long do we keep CI results? | 20:43 |
esberglu | efried: Quite a while | 20:44 |
esberglu | sec | 20:44 |
esberglu | I thought the point of all of this was to make it easier to go from a failed run to the rest logs | 20:44 |
efried | esberglu If we're not making the logs public, then we can't link 'em from CI results, so we're not getting any ease-of-get-to benefit. | 20:44 |
esberglu | Rather than having to actually go to the neo, you can just click a link in the run logs that will take you to the rest logs | 20:44 |
esberglu | Yeah I'm saying why not make them public | 20:44 |
esberglu | It doesn't hurt anything | 20:44 |
efried | Okay, right. Cause then you gotta scrub IPs, and that's a pain? | 20:45 |
esberglu | efried: Not really, I just copied the log scrubbing that we use for everything else | 20:45 |
efried | And that works for .gzs as well? | 20:45 |
esberglu | Yep. I unzip those, scrub, then rezip | 20:45 |
efried | ight. | 20:45 |
efried | How big is 24h worth of logs? | 20:46 |
efried | Like 220MB? | 20:47 |
esberglu | Nah the neo I was using was 124 MB with everything zipped | 20:47 |
edmondsw | do we need an ease-of-get-to benefit here? Is it hard to just go to the neo yourself? | 20:47 |
efried | So then esberglu what's your strategy for aging these things? Cause any time you recopy, you're going to be carving the window down to 24h. Which is worse than what we've got on the neo. | 20:48 |
efried | I'm leaning towards edmondsw's view here. We're doing a lot of stuff (writing code, consuming disk, chewing up bandwidth) to implement something that's severely limited and of negligible benefit. | 20:50 |
esberglu | efried: Yeah. I thought that you guys just wanted them copied to the logserver for easy access | 20:51 |
efried | esberglu Plonk a paragraph into the README (do we have a README?) that describes how to find the REST logs (i.e. which log to find the neo name in, and what to search for in there). | 20:51 |
efried | And I think that solves it. | 20:52 |
edmondsw | sounds good to me | 20:52 |
efried | Looking back at what we were thinking when we put this on the to-do list, having not dug into it, we were thinking we could somehow have per-run REST logs. | 20:53 |
efried | But having grabbed that tiger by the tail and ridden it, knowing what we know now, it doesn't seem like there's a benefit. | 20:53 |
efried | Now | 20:53 |
efried | If we could filter the logs to get per-run entries only | 20:53 |
efried | Then I would be totally on board. | 20:53 |
efried | That might be *theoretically* doable. | 20:54 |
efried | But probably very tricky. | 20:54 |
*** apearson has quit IRC | 20:55 | |
efried | Cause we know the IP/hostname of the CI node, and that info should be in *some* of the REST requests. Probably every Audit.log entry, in fact. From there to transaction IDs; and from there to FFDC log entries. | 20:55 |
efried | Tricky to pull out multiline entries, though. | 20:55 |
efried | But we could also just base it on the timestamps of the lifespan of the CI run. We'd get entries for all CI nodes running during that time, but that's okay. | 20:56 |
esberglu | efried: Yeah but then we get into the problem of copying a ton of duplicate data | 20:57 |
efried | esberglu But a) we only do it on failures, and b) we're not duplicating *everything* every time - just things in that time window when runs are happening in parallel. | 20:57 |
efried | esberglu Anyway, I think this is a low-priority wishlist item. | 20:58 |
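
The timestamp-window idea above, including the multiline-entry wrinkle, can be sketched as follows. This assumes every log entry starts with a `YYYY-MM-DD HH:MM:SS` header and that lines without one belong to the entry above; the real REST/FFDC layout may well differ, and `entries_in_window` is a made-up name for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, Iterator

# Assumed entry header: a line beginning with "YYYY-MM-DD HH:MM:SS".
# Adjust the pattern and format string to the actual log layout.
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
TS_FMT = "%Y-%m-%d %H:%M:%S"


def entries_in_window(
    lines: Iterable[str], start: datetime, end: datetime
) -> Iterator[str]:
    """Yield the lines of entries whose header timestamp is in [start, end].

    Lines without a timestamp are treated as continuations of the entry
    above them, so multiline entries (e.g. stack traces) stay whole.
    """
    keep = False
    for line in lines:
        match = TS_RE.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), TS_FMT)
            keep = start <= stamp <= end
        if keep:
            yield line
```

Feeding it the start and end times of a failed CI run would give the entries for every CI node active on that neo during the run, which matches the "that's okay" trade-off noted above.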
efried | Right now we need to figure out why tf http://184.172.12.213/74/476274/7/check/nova-powervm-out-of-tree-pvm/e829e6b/logs/ isn't picking up pypowervm 1.1.6 | 20:58 |
efried | esberglu This is a little weird: http://184.172.12.213/74/476274/7/check/nova-powervm-out-of-tree-pvm/e829e6b/console.html#_2017-08-01_18_03_47_656 | 20:59 |
efried | This happens several times during stacking. | 21:00 |
efried | Which makes a guy think we've got a git repo sitting around somewhere that might be (sometimes??) hijacking the 1.1.6. | 21:00 |
efried | esberglu Hm, so the CI node base image probably has a pypowervm sitting on it. | 21:01 |
esberglu | efried: Yep thats where the 1.1.4 is from | 21:01 |
efried | esberglu But it's refusing to uninstall it. | 21:02 |
esberglu | I thought stack would install over it though | 21:02 |
efried | That would also answer why we're getting lots of timeouts - because we're running into the power_off bug from 1.1.4 | 21:02 |
efried | Yeah, so how do you explain "found existing installation"? | 21:02 |
efried | esberglu Rebuilding the base image is a thing that takes a long time and is a pain in the ass, right? | 21:04 |
efried | esberglu Can you spin me up a pseudo-CI node, pre-stack? | 21:04 |
efried | I wanna see where this 1.1.4 is coming from. | 21:05 |
esberglu | efried: Yeah. Would be easier to modify prep_devstack script to install properly | 21:05 |
esberglu | We install it so that the prepare_node_powervm.sh script works | 21:05 |
efried | esberglu Well, remember, we don't do that for a reason. | 21:05 |
efried | esberglu Remind me, is there a reason we need pypowervm before stack? | 21:05 |
esberglu | Yeah we need it for the image template and ready node scripts to work | 21:05 |
efried | oh, certainly now, since we're doing the remote hack before stack. | 21:05 |
efried | So how are we getting it? Based on the requirements.txt in the nova clone we're testing? | 21:06 |
esberglu | efried: It's a variable in neo-os-ci | 21:06 |
esberglu | For the undercloud | 21:06 |
efried | uh | 21:06 |
efried | boo | 21:06 |
esberglu | For the ready nodes scripts I mean | 21:06 |
esberglu | We can just explicitly install the pypowervm version found in the u-c in the prep_devstack script | 21:08 |
esberglu | efried: Just add an else block there that installs pypowervm_version | 21:11 |
esberglu | https://github.com/powervm/powervm-ci/blob/master/devstack/prep_devstack.sh#L157-L182 | 21:12 |
efried | esberglu Dig. Though I would rather get it from the requirements.txt of the repo of the change we're testing. | 21:13 |
efried | We should be able to figure that out, nah? | 21:13 |
esberglu | https://github.com/powervm/powervm-ci/blob/master/devstack/prep_devstack.sh#L147 | 21:14 |
esberglu | Yeah just need to change that | 21:14 |
efried | Reason I say that is cause that allows us to "preview" a g-r bump by proposing a patch that bumps the requirements.txt version. | 21:14 |
esberglu | We can already preview a patch with the patching logic | 21:15 |
esberglu | For any openstack project | 21:15 |
esberglu | efried: Oh I see what you mean | 21:21 |
esberglu | We use the prep_devstack script for projects that don't have pypowervm as a req | 21:22 |
esberglu | neutron, ceilometer silent runs | 21:22 |
esberglu | But I could add logic to check the project | 21:22 |
*** jay1_ has quit IRC | 21:29 | |
*** cjvolzka has quit IRC | 21:29 | |
efried | esberglu Yeah, if the project in question doesn't have a pypowervm req, then get it from global. | 21:30 |
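Editor's sketch of the fallback agreed on above: take the pypowervm version from the requirements of the project under test, and fall back to the global list when the project has no pypowervm requirement. The actual prep_devstack.sh is a shell script; this Python version only illustrates the logic, and the function names and the simple line format it parses are assumptions.

```python
import re

# Matches e.g. "pypowervm>=1.1.6 # Apache-2.0" or "pypowervm===1.1.6".
# Only these three operators are handled; this is an illustrative subset.
REQ_RE = re.compile(
    r"^\s*pypowervm\s*(===|==|>=)\s*([0-9][0-9A-Za-z.]*)", re.IGNORECASE
)


def find_pypowervm_version(requirement_lines):
    """Return the version named on the first pypowervm line, or None."""
    for line in requirement_lines:
        match = REQ_RE.match(line)
        if match:
            return match.group(2)
    return None


def resolve_pypowervm_version(project_lines, global_lines):
    """Prefer the project's own requirement; otherwise use the global one."""
    return (find_pypowervm_version(project_lines)
            or find_pypowervm_version(global_lines))
```

Note that with a `>=` specifier the value returned is a minimum rather than an exact pin, so installing exactly that version is a choice this sketch makes, not something the requirement itself demands.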
*** cjvolzka has joined #openstack-powervm | 21:30 | |
*** cjvolzka has quit IRC | 21:31 | |
esberglu | efried: Hmm same thing happened when I tried installing 1.1.6 pre-stack | 21:32 |
esberglu | And trying to uninstall pypowervm prior to that gives the same message stack is seeing | 21:32 |
esberglu | Can't uninstall 'pypowervm'. No files were found to uninstall. | 21:33 |
efried | esberglu Yeah, so where is that 1.1.4 coming from? | 21:33 |
efried | it's like the pip db is corrupted or something. | 21:33 |
esberglu | /opt/stack/pypowervm | 21:33 |
efried | oh, so it's not deleting files cause that guy was installed with -e. | 21:33 |
efried | Which ought to be just fine. | 21:34 |
efried | I mean, I don't know that this is really the problem. | 21:34 |
efried | But we're certainly getting a pypowervm other than 1.1.6. | 21:34 |
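Editor's sketch of one way to check which pypowervm a node would actually import: compare the version recorded in the package metadata with the path the module resolves to. If the path points at a source checkout such as /opt/stack/pypowervm, an editable (`-e`) install is likely shadowing the released package. `describe_install` is a hypothetical helper, and `importlib.metadata` requires Python 3.8 or later, which is an assumption about the environment.

```python
import importlib.util
from importlib import metadata


def describe_install(dist_name, module_name=None):
    """Report the installed version and import location for a package."""
    module_name = module_name or dist_name
    try:
        # Version as recorded in the installed distribution's metadata.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    try:
        # Where the import system would load the module from.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    origin = spec.origin if spec else None
    return {"version": version, "origin": origin}


if __name__ == "__main__":
    print(describe_install("pypowervm"))
```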
*** smatzek has quit IRC | 21:42 | |
*** apearson has joined #openstack-powervm | 21:47 | |
*** esberglu has quit IRC | 21:58 | |
*** smatzek has joined #openstack-powervm | 22:01 | |
*** smatzek_ has joined #openstack-powervm | 22:02 | |
*** apearson has quit IRC | 22:05 | |
*** thorst_afk has quit IRC | 22:05 | |
*** apearson has joined #openstack-powervm | 22:06 | |
*** smatzek has quit IRC | 22:06 | |
*** apearson has quit IRC | 22:06 | |
*** apearson has joined #openstack-powervm | 22:07 | |
*** apearson has quit IRC | 22:08 | |
*** kylek3h has quit IRC | 22:12 | |
*** esberglu has joined #openstack-powervm | 22:12 | |
*** kylek3h has joined #openstack-powervm | 22:13 | |
*** kylek3h has quit IRC | 22:13 | |
*** esberglu has quit IRC | 22:17 | |
*** thorst_afk has joined #openstack-powervm | 22:19 | |
*** smatzek_ has quit IRC | 22:20 | |
*** esberglu has joined #openstack-powervm | 22:23 | |
*** thorst_afk has quit IRC | 22:24 | |
*** edmondsw has quit IRC | 22:36 | |
*** smatzek_ has joined #openstack-powervm | 22:36 | |
*** svenkat has joined #openstack-powervm | 22:40 | |
*** smatzek_ has quit IRC | 22:49 | |
*** svenkat has quit IRC | 22:53 | |
*** esberglu has quit IRC | 23:06 | |
*** esberglu has joined #openstack-powervm | 23:12 | |
*** cjvolzka has joined #openstack-powervm | 23:38 | |
-openstackstatus- NOTICE: osic nodes have been removed from nodepool due to a problem with the mirror host beginning around 22:20 UTC. please recheck any jobs with failures installing packages. | 23:47 |