Monday, 2017-07-31

*** apearson has joined #openstack-powervm 00:38
*** apearson has quit IRC 00:49
*** svenkat has joined #openstack-powervm 01:21
*** edmondsw has joined #openstack-powervm 01:36
*** thorst has joined #openstack-powervm 01:37
*** edmondsw has quit IRC 01:40
*** thorst has quit IRC 01:41
*** svenkat has quit IRC 02:07
*** svenkat has joined #openstack-powervm 02:08
*** svenkat has quit IRC 02:13
*** deep-book-gk has joined #openstack-powervm 02:34
*** deep-book-gk has left #openstack-powervm 02:36
*** thorst has joined #openstack-powervm 02:42
*** thorst has quit IRC 02:47
*** svenkat has joined #openstack-powervm 02:49
*** svenkat has quit IRC 02:53
*** edmondsw has joined #openstack-powervm 03:24
*** edmondsw has quit IRC 03:28
*** thorst has joined #openstack-powervm 03:43
*** thorst has quit IRC 03:48
*** svenkat has joined #openstack-powervm 04:21
*** svenkat has quit IRC 04:25
*** svenkat has joined #openstack-powervm 05:01
*** svenkat has quit IRC 05:05
*** thorst has joined #openstack-powervm 05:44
*** thorst has quit IRC 05:49
*** svenkat has joined #openstack-powervm 06:23
*** svenkat has quit IRC 06:27
*** edmondsw has joined #openstack-powervm 06:59
*** edmondsw has quit IRC 07:04
*** thorst has joined #openstack-powervm 07:45
*** thorst has quit IRC 07:50
*** edmondsw has joined #openstack-powervm 08:48
*** edmondsw has quit IRC 08:53
*** thorst has joined #openstack-powervm 09:46
*** thorst has quit IRC 09:58
*** edmondsw has joined #openstack-powervm 10:36
*** edmondsw has quit IRC 10:41
*** smatzek has joined #openstack-powervm 10:49
*** smatzek has quit IRC 10:55
*** smatzek has joined #openstack-powervm 10:55
*** thorst has joined #openstack-powervm 11:13
*** miltonm has joined #openstack-powervm 11:39
*** svenkat has joined #openstack-powervm 11:40
*** thorst has quit IRC 11:43
*** edmondsw has joined #openstack-powervm 12:06
*** edmondsw has quit IRC 12:13
*** miltonm has quit IRC 12:26
*** kylek3h has joined #openstack-powervm 12:57
*** thorst has joined #openstack-powervm 13:00
*** dwayne_ has quit IRC 13:02
*** apearson has joined #openstack-powervm 13:04
*** esberglu has joined #openstack-powervm 13:13
*** edmondsw has joined #openstack-powervm 13:13
*** edmondsw has quit IRC 13:22
*** edmondsw has joined #openstack-powervm 13:22
*** apearson has quit IRC 13:59
*** apearson has joined #openstack-powervm 14:00
*** apearson has quit IRC 14:01
*** apearson has joined #openstack-powervm 14:01
*** apearson has quit IRC 14:02
*** dwayne has joined #openstack-powervm 14:10
<esberglu> efried_zzz: thorst: edmondsw: Any idea how to resolve this? The CI cloud thinks all 135 ips are used even though there are only 9 instances 14:46
<esberglu> Is there some way to remove these "used" ips that aren't actually being used? 14:46
<thorst> esberglu: yeah, though concerning that happened 14:47
<thorst> run a 'neutron port-delete' from the CLI on the openstack server 14:47
<esberglu> thorst: What I think happened is that the controller was full on disk and so a TON of spawns were failing over the weekend. Some of them must have not been cleaning up 14:48
<esberglu> Might have to up the controller disk size 14:48
<esberglu> The switch from 15G to 30G images really takes up some space since there are multiple images sitting around 14:49
<thorst> :-( 14:53
<thorst> we could increase it, but you'd need to rebuild the whole system 14:53
<thorst> that's kinda a PITA 14:53
<esberglu> thorst: Yeah. Maybe next time I do a full redeploy I will bite the bullet. I doubled it previously thinking that would be enough, wish I would have done more 14:59
<thorst> never enough storage 15:12
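
[Editor's sketch, not part of the original conversation.] The cleanup thorst suggests above amounts to finding ports that still claim an IP but belong to no live instance, then running 'neutron port-delete' on each one. A minimal illustration in Python, assuming the port list and the instance IDs have already been fetched and that each port is a dict carrying the Neutron attributes `id`, `device_id` and `device_owner`. The helper names and the sample data are hypothetical; the code only prints the commands and does not talk to any cloud.

```python
def find_orphaned_ports(ports, active_instance_ids):
    """Return IDs of compute-owned ports whose device is not a live instance.

    `ports` is assumed to be a list of dicts shaped like Neutron port
    records; only `id`, `device_id` and `device_owner` are consulted.
    """
    active = set(active_instance_ids)
    return [
        port["id"]
        for port in ports
        # Only consider ports created for instances, not DHCP/router ports.
        if (port.get("device_owner") or "").startswith("compute:")
        and port.get("device_id") not in active
    ]


def port_delete_commands(port_ids):
    """Build the CLI commands to run by hand; nothing is executed here."""
    return ["neutron port-delete {}".format(port_id) for port_id in port_ids]


# Made-up sample data, purely for illustration.
sample_ports = [
    {"id": "port-a", "device_id": "vm-1", "device_owner": "compute:nova"},
    {"id": "port-b", "device_id": "vm-gone", "device_owner": "compute:nova"},
    {"id": "port-c", "device_id": "dhcp-1", "device_owner": "network:dhcp"},
]

orphans = find_orphaned_ports(sample_ports, ["vm-1"])
for command in port_delete_commands(orphans):
    print(command)
```

Printing the commands first, rather than deleting directly, leaves room to review the list before anything is removed.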
*** edmondsw has quit IRC 16:37
*** apearson has joined #openstack-powervm 16:40
*** efried_zzz is now known as efried 16:44
*** dwayne has quit IRC 17:18
<esberglu> efried: Those 500 internal server error runs also have this trace in n-cond-cell1.log 17:54
<esberglu> http://paste.openstack.org/show/617053/ 17:54
<esberglu> You know anything about nova conductor? 17:54
<efried> no 17:54
<efried> What makes you think that's related? 17:55
<efried> Unless it has something to do with data that's normally populated being None, and that funneling down to the REST request, and that hitting some unhandled error path in REST/VIOS. 17:55
<esberglu> This is now showing up where we would see the internal error 500 message in powervm_os_ci.html 17:56
<esberglu> Looking in n-cpu the 500 error is still found on the failing test 17:56
<esberglu> And that 500 error is also printed in n-cond-cell1.log 17:57
<esberglu> http://184.172.12.213/55/408955/71/check/nova-in-tree-pvm/e41038c/ 17:58
<esberglu> ^ logs I have been looking at 17:58
*** apearson has quit IRC 17:59
<efried> Maybe we're talking about two different things. 18:02
<efried> The 500 I care about is the one coming from *our* REST API, on a VIOS POST. 18:03
<efried> The odds of that 500 being related in any way to a 500 coming from/through the conductor are virtually nil. 18:04
<efried> The problem I have with the 500 from VIOS POST isn't necessarily that it's happening - it's that we have no idea *why* it's happening, because the message is generic.  Normally when we get a 500 from our REST API, it has some hint as to what went wrong. 18:04
<efried> So that's the first thing we need to nail down and fix.  That's the thrust of the defect we opened. 18:05
<efried> In order to figure that out, they're gonna need the REST logs. 18:05
<efried> Then once we know what the real reason is, we a) get that message fixed, and b) see if it's something we can do anything about, or if it's just a random failure we have to tolerate (like "VIOS busy"). 18:05
<esberglu> Look at the first line of the paste above, it is the 500 from the vios no? 18:06
<esberglu> The sql_connection could just be a result of the 500 though, not sure 18:06
<esberglu> The CantStartEngineError that is 18:07
<efried> ah, okay, I see the confusion.  So yeah, if there's any correlation at all, the conductor error is the result of the REST 500.  Doesn't affect how we follow up. 18:08
<efried> esberglu Scanning the code, I *think* the exception you pasted above is actually ignored.  Our 500 happens in the try: block at https://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L536-L551 -- then the thing that's generating the exception is where it's trying to clean up from that (L561) which generates an exception that gets logged, but ignored (L563) 18:20
<efried> In any case, our job is to figure out why we're getting that 500. 18:21
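
[Editor's sketch, not part of the original conversation.] The control flow efried describes above (a primary failure inside a try: block, followed by a cleanup step that raises its own exception, which is logged and then ignored) can be illustrated generically. This is not the nova conductor code; every name below is hypothetical and the two exception classes only stand in for the REST 500 and the conductor-side trace.

```python
import logging

LOG = logging.getLogger("conductor_sketch")


class RestError(Exception):
    """Hypothetical stand-in for the 500 returned by the REST API."""


class CleanupError(Exception):
    """Hypothetical stand-in for the error raised while cleaning up."""


def build_instance(spawn, cleanup):
    """Run spawn(); on failure attempt cleanup(), ignoring cleanup errors."""
    try:
        spawn()
    except Exception as primary:
        try:
            cleanup()
        except Exception:
            # The secondary trace is written to the log and then dropped,
            # so it shows up next to the primary error without causing it.
            LOG.exception("cleanup failed; ignoring")
        return "failed: {}".format(primary)
    return "ok"


def failing_spawn():
    raise RestError("500 Internal Server Error")


def failing_cleanup():
    raise CleanupError("could not reach database")


print(build_instance(failing_spawn, failing_cleanup))
```

In this shape the logged cleanup trace is a consequence of the first failure, which is why the follow-up stays focused on the original 500.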
<esberglu> efried: I should just be able to scp from the neo onto the AIO vm I think 18:32
*** dwayne has joined #openstack-powervm 18:37
*** dwayne has quit IRC 18:45
<efried> esberglu You mean scp the REST logs?  Yes, if you can figure out which neo it is.  Which should be possible in the same way we're doing the local2remote setup. 18:56
<efried> esberglu But I thought we had already done that (at least once, manually) and supplied that info in the defect 18:56
<efried> Perhaps we just need to put pressure on the defect owner(s) to take a look. 18:57
*** apearson has joined #openstack-powervm 19:07
*** dwayne has joined #openstack-powervm 19:24
*** edmondsw has joined #openstack-powervm 19:30
*** edmondsw has quit IRC 19:32
*** edmondsw has joined #openstack-powervm 19:32
<efried> esberglu Ah, that defect was owned by me.  That's why it wasn't going anywhere.  Needs to be owned by REST.  Moved it to changh. 19:33
<esberglu> efried: Oh I opened it in pypowervm that's why. Thought it would automatically reroute when I changed that 19:43
<esberglu> Makes sense that it doesn't I guess 19:43
*** thorst is now known as thorst_afk 20:25
<efried> Doesn't make sense to me. 20:28
*** apearson has quit IRC 20:45
*** apearson has joined #openstack-powervm 20:47
*** apearson has quit IRC 20:58
*** svenkat has quit IRC 20:59
*** apearson has joined #openstack-powervm 21:00
<edmondsw> esberglu check the note I just sent 21:09
<edmondsw> we must be running tox all the time, and we haven't been seeing this, right? 21:09
<esberglu> I haven't seen it anywhere, tox gets run every patch 21:15
*** smatzek has quit IRC 21:16
<edmondsw> esberglu do we maybe have wsgi_intercept installed? 21:16
<esberglu> edmondsw: Not sure. That's part of the jenkins gating jobs not the PowerVM CI 21:18
<esberglu> edmondsw: Not seeing anything about wsgi_intercept in the logs from those runs 21:24
*** apearson has quit IRC 21:33
*** apearson has joined #openstack-powervm 21:40
*** esberglu has quit IRC 21:51
*** esberglu has joined #openstack-powervm 21:57
*** kylek3h has quit IRC 21:58
*** apearson has quit IRC 22:01
*** esberglu has quit IRC 22:02
*** esberglu has joined #openstack-powervm 22:09
*** esberglu has quit IRC 22:14
*** dwayne has quit IRC 22:18
<edmondsw> esberglu fyi, I was able to reproduce that wsgi_intercept problem on a fresh system 22:27
<edmondsw> so it's not looking like an env issue 22:27
*** thorst_afk has quit IRC 22:38
*** edmondsw has quit IRC 22:52
*** dwayne has joined #openstack-powervm 23:20
*** smatzek has joined #openstack-powervm 23:51
