Monday, 2017-04-03

*** thorst has joined #openstack-powervm01:37
*** thorst has quit IRC01:42
*** thorst has joined #openstack-powervm02:38
*** thorst has quit IRC02:57
*** apearson has joined #openstack-powervm03:00
*** dwayne__ has quit IRC03:19
*** dwayne__ has joined #openstack-powervm03:22
*** jwcroppe has quit IRC03:31
*** jwcroppe has joined #openstack-powervm03:32
*** apearson has quit IRC03:40
*** apearson has joined #openstack-powervm03:40
*** apearson has quit IRC03:47
*** thorst has joined #openstack-powervm03:54
*** thorst has quit IRC03:58
*** thorst has joined #openstack-powervm04:54
*** thorst has quit IRC04:59
*** shyama has joined #openstack-powervm05:29
shyamathorst: efried please review https://review.openstack.org/#/c/432322/05:37
*** thorst has joined #openstack-powervm05:55
*** thorst has quit IRC05:59
*** jwcroppe has quit IRC06:32
*** jwcroppe has joined #openstack-powervm06:32
*** thorst has joined #openstack-powervm06:56
*** thorst has quit IRC07:00
*** thorst has joined #openstack-powervm07:56
*** openstackgerrit has quit IRC08:03
*** thorst has quit IRC08:16
*** thorst has joined #openstack-powervm09:13
*** thorst has quit IRC09:17
*** k0da has joined #openstack-powervm09:56
*** chas has joined #openstack-powervm10:35
*** shyama has quit IRC11:02
*** shyama has joined #openstack-powervm11:20
*** shyama has quit IRC11:25
*** thorst has joined #openstack-powervm11:34
*** thorst has quit IRC11:36
*** thorst has joined #openstack-powervm12:00
*** jpasqualetto has joined #openstack-powervm12:14
*** jpasqualetto has quit IRC12:25
*** edmondsw has joined #openstack-powervm12:32
*** efried has quit IRC12:38
*** mdrabe has joined #openstack-powervm12:39
*** efried has joined #openstack-powervm12:48
*** shyama has joined #openstack-powervm12:51
*** jwcroppe has quit IRC12:53
*** jwcroppe has joined #openstack-powervm12:54
*** jwcroppe has quit IRC12:58
*** jwcroppe has joined #openstack-powervm13:07
*** esberglu has joined #openstack-powervm13:15
*** jpasqualetto has joined #openstack-powervm13:20
*** dwayne__ has quit IRC13:48
*** mdrabe has quit IRC14:11
*** mdrabe has joined #openstack-powervm14:17
*** tjakobs has joined #openstack-powervm14:23
*** smatzek has joined #openstack-powervm14:25
*** jpasqualetto has quit IRC14:27
*** jpasqualetto has joined #openstack-powervm14:28
*** nbante has joined #openstack-powervm14:39
*** dwayne__ has joined #openstack-powervm14:48
*** shyama has quit IRC14:49
esbergluefried: I'm seeing some issues with power off in the in-tree CI runs14:53
efriedTell me14:53
esbergluIt goes through and tries normal power off, that times out and it tries the VSP hard shutdown. Then that fails as well14:56
esberglu"Partition must be running to shut down"14:56
esbergluA couple things in the logs that concern me14:56
esbergluWhen the first shutdown comes through the instance is in state open firmware, not active14:57
esbergluThen also seeing a message "can't perform OS shutdown because RMC connection is not active"14:58
esbergluI'm wondering if we are trying to shutdown an instance that is not ready yet?14:58
esbergluhttp://184.172.12.213/74/385074/10/check/nova-in-tree-pvm/1427341/14:59
esbergluAnd not handling that scenario14:59
*** jwcroppe has quit IRC15:02
esbergluefried: I can point out timestamps in the logs if you would like15:03
efriedesberglu I got 'em.  Looking now.15:03
esbergluefried: This issue happens 2x in that run15:04
efriedYuh15:04
*** jwcroppe has joined #openstack-powervm15:09
*** jwcroppe has quit IRC15:09
*** jwcroppe has joined #openstack-powervm15:10
efriedesberglu Erm, what code is this using?15:10
esberglupypowervm?15:11
efriedSorry, yeah.  I'm seeing a log message that isn't in the source.15:11
esberglu1.1.015:13
esbergluIt pulls it from upper-constraints15:13
efriedesberglu Well, something ain't right.15:14
efriedCause I can't find the source code that emits this message:15:14
efried2017-04-03 07:11:56.221 WARNING pypowervm.tasks.power [req-23017211-7e7f-4b89-889b-b8e6ed7ddbe5 tempest-ServerDiskConfigTestJSON-561105912 tempest-ServerDiskConfigTestJSON-561105912] Can not perform OS shutdown on Virtual Machine pvm3-tempest-Ser-49a39f18 because its RMC connection is not active.15:14
*** jwcroppe has quit IRC15:15
esbergluefried: exceptions.py15:17
esbergluhttps://github.com/powervm/pypowervm/blob/1.1.0/pypowervm/exceptions.py#L13915:17
efriedGot it, thanks.  Not sure how I missed that on my first pass.15:19
esbergluefried: Had me worried for a second there15:21
efriedMe too.  Brain not yet in gear after weekend.15:21
efriedMore honey-do projects in last two days than, like, last six months combined.15:21
efriedAfter a weekend of installing ceiling fixtures and ripping up carpet, code takes a bit of a gear shift to get back into.15:22
efriedesberglu thorst Okay, I think it's this 60s timeout that's screwing us again.15:25
efriedSo we tried VSP normal, and it timed out - but probably actually succeeded.15:25
esbergluAnd then the VSP hard fails because it's already shut down15:26
efriedThen we moved on to VSP hard, but by the time we issued that guy, the partition was actually already down, so it "failed"15:26
efriedyup.15:26
efriedSo I'm thinking perhaps we want a tweak to the logic here.15:26
thorstefried: so just add a check?15:26
thorstto see if already dead.15:26
thorstwhich I know is silly but...15:26
efriedthorst Well, not to check the state of the partition.15:26
efriedBut a check for this error code.15:26
thorstahh15:27
thorstfair15:27
efried"[PVME01050901-0581] Partition must be running to shut down."15:27
efriedCheck for that PVME code.15:27
efriedWeird thing is, it took three minutes for that hard shutdown to fail.15:27
thorstheh, something to ask hsien15:28
thorstthat is weird....15:28
efriedCould just be bogged system.15:28
thorstshouldn't be that bogged.15:28
efriedAnyway, I think it's reasonable to check for that PVME code and "succeed" at that point.15:29
thorstRaghu's scale tests have been hitting 1000 VMs15:29
efriedesberglu We don't have pvm-rest logs for these, do we?15:29
esbergluNope15:29
efriedI think we should add those.  thorst adreznec Unless there's space considerations?15:29
efriedCause without that, we don't have squat for @changh to look at.15:30
thorstefried: they're just going to be flooded with other requests15:30
thorstbut yeah, I like that idea15:30
adreznecefried: to clarify, you're thinking vsp soft, if fail, vsp hard and if we hit that PVME then at that point assume it succeeded in the interim15:30
efriedadreznec I'm saying if power-off hits that PVME at any point, it just returns success.15:30
efriedNot talking about changing the flow.15:30
adreznecefried: thorst I think the only issue there was log scrubbing at the time15:30
adreznecbut we've kind of given up on there...15:31
adreznec*that15:31
efriedadreznec You mean internal IPs and whatnot?15:31
adreznecSpace shouldn't really be an issue with the log retain period15:31
adreznecI don't think... esberglu have you looked at how much space we're using lately?15:31
adreznecefried: yes15:31
*** apearson has joined #openstack-powervm15:31
esbergluLogserver is currently 88% full. 82 of 100G used15:33
adreznechmm15:33
adreznecSo maybe we would need more storage for that15:33
*** thorst is now known as thorst_afk15:33
esbergluEither more storage or decrease the time until we delete15:33
thorst_afkMOAR STORAGE15:34
thorst_afkit's just SL  :-)15:34
efriedOkay, so we already have logic to succeed the power-off if certain error codes are received; but this one seems to be new.15:34
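[Editor's sketch] The idea efried describes above (return success from power-off whenever the "[PVME01050901-0581] Partition must be running to shut down." error is hit) could look roughly like the following. This is only an illustration under stated assumptions: `PowerOffError`, `ALREADY_OFF_CODES`, and `power_off` are hypothetical names, not pypowervm's actual API, and the real change is whatever review 5079 contains.

```python
# Hypothetical names throughout; this is not pypowervm's API.
# The only value taken from the log is the PVME code itself.
ALREADY_OFF_CODES = ("PVME01050901-0581",)


class PowerOffError(Exception):
    """Stand-in for whatever error a failed power-off job raises."""


def power_off(run_job):
    """Run a power-off job, treating 'already off' error codes as success.

    run_job is any callable that raises PowerOffError on failure.
    Returns True when the partition can be considered powered off.
    """
    try:
        run_job()
    except PowerOffError as exc:
        # If the job failed only because the partition is already down,
        # the caller's goal is met, so report success.
        if any(code in str(exc) for code in ALREADY_OFF_CODES):
            return True
        # Any other failure is a real failure; let it propagate.
        raise
    return True
```

Matching on the error code rather than querying partition state keeps the existing soft-then-hard flow unchanged, which is the point efried makes to adreznec.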
*** jwcroppe has joined #openstack-powervm15:46
*** jwcroppe has quit IRC15:47
efriedthorst_afk esberglu adreznec: 507915:53
efried(UT already covered by existing tests)15:53
*** jwcroppe has joined #openstack-powervm15:54
*** efried has quit IRC15:59
*** k0da has quit IRC16:02
*** apearson has quit IRC16:36
*** jwcroppe_ has joined #openstack-powervm16:37
*** jwcroppe has quit IRC16:40
*** thorst_afk is now known as thorst16:42
*** shyama has joined #openstack-powervm16:57
*** esberglu_ has joined #openstack-powervm17:23
*** tjakobs_ has joined #openstack-powervm17:23
*** tjakobs has quit IRC17:24
*** esberglu has quit IRC17:24
*** shyama has quit IRC17:34
esberglu_FYI: PowerVM CI will be down starting at 6 PM central time today. I am going to be upgrading the CI undercloud from newton to ocata17:37
esberglu_ Timeframe until back up is ~4 hours provided that there are no issues with the upgrade17:37
thorstesberglu_: ack17:37
*** jwcroppe_ has quit IRC17:41
*** apearson has joined #openstack-powervm17:52
*** chas has quit IRC18:04
*** nbante has quit IRC18:08
*** jwcroppe has joined #openstack-powervm18:11
*** jpasqualetto has quit IRC18:14
*** jpasqualetto has joined #openstack-powervm18:26
*** shyama has joined #openstack-powervm18:37
*** shyama has quit IRC18:49
*** efried has joined #openstack-powervm18:52
*** jpasqualetto has quit IRC19:01
*** k0da has joined #openstack-powervm19:09
*** jpasqualetto has joined #openstack-powervm19:16
*** jpasqualetto has quit IRC19:24
*** smatzek has quit IRC19:54
*** apearson has quit IRC19:55
thorstefried: mind taking a look at https://review.openstack.org/#/c/432322/ when you get a chance?20:00
thorstmy +2 is stuck in submit20:01
*** apearson has joined #openstack-powervm20:04
*** k0da has quit IRC20:11
*** k0da has joined #openstack-powervm20:24
efriedthorst Sorry, yeah.20:32
efriedmdrabe and I have been banging our heads against the download hang.20:32
efriedPretty sure I'm going to have to retool pypowervm to use eventlet instead of concurrent.futures20:33
thorstwtf...20:33
thorstthat sux20:33
efriedWell, it would suck less if it didn't mean we were totally broken without a new pypowervm.20:34
thorstwhere are the threads in pypowervm now that we've gotten rid of that coordinated upload crap?20:34
thorstI guess i can do a find quick20:34
thorstefried: so just transaction.py?20:35
efriedthorst thread_utils.py is what's killing us right now.20:35
thorstefried: this is because of the rest_api_pipe function?20:36
efriedWhich is only used from tasks/storage.py for the upload business.20:37
thorstefried: and this is because nova-powervm uses 'func' uploads20:38
efriedWell, no, it's worse than that.20:38
efriedIt's also because pypowervm uses coordinated.20:38
efriedWe have to change _both_ to get rid of threads.20:39
thorstright right.  I'm assuming you're using the develop branch that 'got rid of' coordinated20:39
efriedBecause the API/FUNC upload in pypowervm uses that _rest_api_pipe, which _also_ uses futures.20:39
efriedthorst Yeah, the problem is that API upload _also_ uses futures when the UploadType is FUNC.20:40
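[Editor's sketch] For readers unfamiliar with the pattern under discussion, here is a generic example of feeding a reader from a writer through an OS pipe, with the writer running in a `concurrent.futures` thread. It is not the code of pypowervm's `_rest_api_pipe`; `pipe_through_thread`, `write_func`, and `read_func` are hypothetical names, and the log does not establish why the real code hangs.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def pipe_through_thread(write_func, read_func):
    """Connect a producer and a consumer with an OS pipe.

    write_func(fileobj) writes bytes to the pipe in a worker thread.
    read_func(fileobj) consumes the bytes in the calling thread.
    Returns whatever read_func returns.
    """
    read_fd, write_fd = os.pipe()

    def _writer():
        # Closing the write end signals EOF to the reader.
        with os.fdopen(write_fd, "wb") as w:
            write_func(w)

    with ThreadPoolExecutor(max_workers=1) as pool:
        fut = pool.submit(_writer)
        with os.fdopen(read_fd, "rb") as r:
            result = read_func(r)
        # Re-raise any exception the writer hit.
        fut.result()
    return result
```

The writer blocks whenever the pipe buffer is full, so the design depends on both sides making progress concurrently. That dependency on real threads is what the proposed move away from `concurrent.futures` is meant to remove.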
thorstefried: look at how we used to do it...sans 'FUNC'20:41
thorsthttps://github.com/openstack/nova-powervm/blob/stable/mitaka/nova_powervm/virt/powervm/disk/localdisk.py20:41
efriedYeah, I know, without FUNC we can get rid of threads.20:41
thorstbut I also know that we need to support FUNC.20:41
efriedthorst Only for backward compatibility.20:42
efriedI'm not convinced we need FUNC at all.20:42
thorstwe could make FUNC you know...write to a file and then upload/delete.  Backwards compat wise.20:42
thorstthe only time I think we need FUNC is if we're going to transform the image ahead of time20:42
efriedthorst Sure, guaranteeing we have a temp file system with enough space.20:42
thorstlike what we get from glance may be a tar.gz....FUNC gives us an opportunity to do something to it beforehand.20:42
efriedHow so?20:43
thorstget the stream, do something to said stream, upload20:43
efriedThat's the non-FUNC path.20:43
thorstwell, FUNC defers it until you need to upload20:43
thorstssp case may not actually need to pull from glance at all.20:43
efriedIf we could work with a stream, we just get the chunks from the download function by not specifying a target20:43
thorstbut I suppose you could know that ahead of time anywho20:44
thorstI'm good with deprecating func if we want TBH20:44
thorstchange nova-powervm to be the way it was...now that the speed is good...and then make FUNC write to a temp file (or to a pipe like it is now...openstack won't use it that way)20:44
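[Editor's sketch] thorst's backward-compatibility idea (have FUNC write to a file, then upload and delete it) avoids threads entirely. A minimal illustration, assuming hypothetical callables `write_func` and `upload_stream` and a hypothetical helper name; it is not taken from pypowervm or nova-powervm:

```python
import tempfile


def upload_func_via_temp_file(write_func, upload_stream, spool_dir=None):
    """Run the FUNC into a temp file, then upload from that file.

    write_func(fileobj) writes the image bytes to fileobj.
    upload_stream(fileobj) reads fileobj and performs the upload.
    spool_dir optionally picks the directory holding the temp file.
    Returns whatever upload_stream returns.
    """
    # The file is removed automatically when the with-block exits.
    with tempfile.TemporaryFile(dir=spool_dir) as tmp:
        write_func(tmp)
        tmp.flush()
        tmp.seek(0)  # rewind so the upload reads from the start
        return upload_stream(tmp)
```

As efried notes, the cost is that the temp file system must have room for the whole image, which is why `spool_dir` is exposed here.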
efriedthorst We need to do some investigation to see how far back we're broken, though.20:44
thorsttrue...we can also push this change back to stable/ocata...20:45
thorstif needed of course20:45
efriedI am almost certain master nova-powervm is dead.20:45
efriedOcata might be busted too.20:45
thorstthough esberglu is redeploying the undercloud tonight20:45
thorstso I'm not convinced ocata is dead20:45
efriedI'm not convinced either.  We need to check.20:45
thorstshould find out tonight20:46
thorsteither the undercloud busts or it doesn't20:46
efriedWhatever changed that caused this freeze changed like a week, week and a half ago.20:46
efriedNo20:46
efriedThe CI will not hit this.20:46
thorstthe undercloud CI won't hit this?20:46
thorstremember, he's redeploying the whole thing...not just the workloads themselves.20:46
efriedUhm, maybe actually.20:46
efriedBecause API/FUNC is still busted.20:46
efriedWe verified that (accidentally)20:47
efriedWith master + in-tree SSP change set + changes to make it go API/FUNC.20:47
thorstright...20:47
efriedSoooo... it's possible we'd be okay if we got rid of FUNC in pike nova-powervm and in-tree AND made pike requirements for both of those guys require pypowervm 1.1.1 that gets rid of coordinated.20:48
efriedOlder openstack with newer pypowervm would be aaight - assuming ocata is still okay.20:48
thorstright.20:48
efriedAnd vice versa wouldn't be possible.20:48
thorstnice and clean20:48
efriedYeah, right.  Nice and clean.20:49
thorst:-p20:49
thorstsoftware dependencies are ... fun20:49
efriedOkay, gonna get a water refill and start working on all of that.20:49
thorstgood luck20:49
thorstI'll be heading out to diapers in ten...20:49
efriedCourse, we may still be broken in that transaction.py business you pointed out.20:49
thorstyep...20:49
thorstbut that's not doing i/o waits.20:50
efriedMebbe so.20:50
efriedWouldn't count on it.20:50
efriedEspecially as we move to LIO...20:50
thorstefried: LIO shouldn't matter there...just the stream to the glance is all that I/O waits20:51
thorstthe rest of it is rest calls (which is an I/O call, sure...but different)20:51
*** esberglu_ is now known as esberglu20:51
*** esberglu has quit IRC20:56
*** esberglu has joined #openstack-powervm20:56
*** thorst has quit IRC20:59
*** thorst has joined #openstack-powervm21:00
*** esberglu has quit IRC21:01
*** thorst has quit IRC21:04
*** edmondsw has quit IRC21:12
*** edmondsw has joined #openstack-powervm21:14
*** edmondsw has quit IRC21:19
*** thorst has joined #openstack-powervm21:26
*** thorst has quit IRC21:30
*** jwcroppe has quit IRC21:34
*** jwcroppe has joined #openstack-powervm21:34
*** esberglu has joined #openstack-powervm21:35
*** jwcroppe has quit IRC21:38
*** thorst has joined #openstack-powervm21:47
efriedthorst So we can't actually get rid of all the FUNC code :(21:53
*** mdrabe has quit IRC22:13
*** tjakobs_ has quit IRC22:19
*** apearson has quit IRC22:23
*** efried has quit IRC22:32
*** edmondsw has joined #openstack-powervm22:45
*** edmondsw has quit IRC22:49
*** jwcroppe has joined #openstack-powervm23:01
*** k0da has quit IRC23:18
