*** k0da has quit IRC | 00:04 | |
*** dwayne has joined #openstack-powervm | 00:21 | |
*** tblakes has joined #openstack-powervm | 00:22 | |
*** thorst_ has joined #openstack-powervm | 00:45 | |
*** thorst_ has quit IRC | 00:55 | |
*** tblakes has quit IRC | 01:04 | |
*** tblakes has joined #openstack-powervm | 01:12 | |
*** tjakobs has quit IRC | 01:13 | |
*** thorst_ has joined #openstack-powervm | 01:27 | |
*** edmondsw has joined #openstack-powervm | 01:31 | |
*** edmondsw has quit IRC | 01:43 | |
*** edmondsw has joined #openstack-powervm | 01:54 | |
*** thorst_ has quit IRC | 02:18 | |
*** thorst_ has joined #openstack-powervm | 02:20 | |
*** tblakes has quit IRC | 02:29 | |
*** edmondsw has quit IRC | 02:42 | |
*** edmondsw has joined #openstack-powervm | 02:43 | |
*** edmondsw has quit IRC | 02:43 | |
*** edmondsw has joined #openstack-powervm | 02:43 | |
*** edmondsw has quit IRC | 02:52 | |
*** Jay1 has joined #openstack-powervm | 03:00 | |
*** edmondsw has joined #openstack-powervm | 03:01 | |
*** Jay1 has quit IRC | 03:05 | |
*** edmondsw has quit IRC | 03:09 | |
*** Jay1 has joined #openstack-powervm | 03:10 | |
*** edmondsw has joined #openstack-powervm | 03:11 | |
*** edmondsw has quit IRC | 03:14 | |
*** edmondsw has joined #openstack-powervm | 03:14 | |
*** edmondsw has quit IRC | 03:14 | |
*** edmondsw has joined #openstack-powervm | 03:15 | |
*** Jay1 has quit IRC | 03:15 | |
*** thorst_ has quit IRC | 03:40 | |
*** edmondsw has quit IRC | 04:17 | |
*** edmondsw has joined #openstack-powervm | 04:17 | |
*** edmondsw has quit IRC | 04:21 | |
*** Jay1 has joined #openstack-powervm | 05:35 | |
*** thorst_ has joined #openstack-powervm | 05:56 | |
*** thorst_ has quit IRC | 06:01 | |
*** tjakobs has joined #openstack-powervm | 06:18 | |
*** tjakobs has quit IRC | 06:37 | |
*** thorst_ has joined #openstack-powervm | 07:57 | |
*** thorst_ has quit IRC | 08:01 | |
*** k0da has joined #openstack-powervm | 09:18 | |
*** thorst_ has joined #openstack-powervm | 09:58 | |
*** thorst_ has quit IRC | 10:02 | |
*** Jay1 has quit IRC | 11:25 | |
*** Jay1 has joined #openstack-powervm | 11:29 | |
*** Jay1 has quit IRC | 11:33 | |
*** smatzek has joined #openstack-powervm | 11:52 | |
*** thorst_ has joined #openstack-powervm | 11:59 | |
*** thorst_ has quit IRC | 12:03 | |
*** thorst__ has joined #openstack-powervm | 12:43 | |
*** dwayne has quit IRC | 12:46 | |
*** edmondsw has joined #openstack-powervm | 13:21 | |
*** edmondsw_ has joined #openstack-powervm | 13:23 | |
*** edmondsw has quit IRC | 13:25 | |
*** Jay1 has joined #openstack-powervm | 13:35 | |
thorst__ | efried: there? | 13:40 |
*** thorst__ is now known as thorst_ | 13:42 | |
efried | thorst_ sup? | 13:50 |
efried | I've been looking at the code a bit. | 13:50 |
efried | What kind of partition was this? | 13:50 |
thorst_ | type linux | 13:51 |
thorst_ | but it's the blank image | 13:51 |
thorst_ | so no OS | 13:51 |
thorst_ | likely stuck in bootp | 13:51 |
thorst_ | I think the issue is here... | 13:51 |
thorst_ | https://github.com/powervm/pypowervm/blob/master/pypowervm/tasks/power.py#L228 | 13:51 |
thorst_ | I think that we need to increase the timeout | 13:51 |
thorst_ | a force immediate should actually wait for it to finish. | 13:51 |
thorst_ | in all scenarios. | 13:52 |
thorst_ | and it looks like we do that in other places we escalate force_immediate, but not there. | 13:52 |
efried | So a couple of things. | 13:53 |
efried | Lines 248 and 252 are where we set the immediate flag | 13:53 |
efried | and 288 | 13:54 |
efried | which is the one we hit if the request fails (rather than timing out). | 13:54 |
thorst_ | so we've got to be looping through twice | 13:55 |
efried | We're supposed to, yes. | 13:55 |
thorst_ | and I believe that it is line 248 where we set it | 13:55 |
thorst_ | start the second loop | 13:55 |
thorst_ | Force.TRUE sets the operation to shutdown/immediate | 13:56 |
thorst_ | but I think we're just timing out in 60 seconds (which is kinda absurd...) | 13:56 |
thorst_ | but I'd like to try to patch it with a higher value. Like line 251 | 13:56 |
efried | ohhh, you're saying the second loop, with shutdown/immediate, is actually timing out? | 13:56 |
thorst_ | I think so, yes | 13:57 |
thorst_ | well, let me rephrase | 13:57 |
thorst_ | after code inspection of this awful logic, that's the one thing I can think of. | 13:57 |
efried | And of course this is happening intermittently. | 13:57 |
thorst_ | right. | 13:58 |
efried | Well. | 13:58 |
efried | If we are indeed supplying shutdown/immediate, there should be no excuse for that to take 60s. That sounds like a platform bug. | 13:58 |
thorst_ | yeah, I've been talking to seroyer...it's not the hypervisor | 13:59 |
thorst_ | once it gets to the hypervisor it's sub-second | 13:59 |
thorst_ | the question is what's in between | 13:59 |
thorst_ | :-) | 13:59 |
thorst_ | yeah, I am becoming convinced this is a solid thing to try out. | 13:59 |
thorst_ | async eventing and what not | 14:00 |
thorst_ | mind if I just make it consistent between those two paths and we try again? | 14:00 |
efried | Sorry, you lost me. | 14:00 |
efried | above you said, "it looks like we do that in other places we escalate force_immediate, but not there" | 14:01 |
efried | do what? | 14:01 |
thorst_ | let me show with a code example | 14:01 |
thorst_ | it's easier that way | 14:01 |
efried | ight | 14:02 |
efried | Though I'm not convinced anything is easy when it comes to this beast. | 14:02 |
thorst_ | if it were easy, it wouldn't be worth doing ;-) | 14:02 |
*** dwayne has joined #openstack-powervm | 14:04 | |
thorst_ | efried: 4761 | 14:04 |
efried | ack | 14:04 |
efried | thorst_ Yes, I see - we're doing same in the other paths where we set it to True. This makes sense... sort of. | 14:06 |
thorst_ | sort of. | 14:06 |
efried | It still doesn't make sense that we should need to do it. If we say shutdown/immediate, it should be.... immediate. | 14:06 |
thorst_ | mind pushing through and we'll see if sort of fixes things? | 14:06 |
thorst_ | yeah, I can talk to Hsien about that in the scrum today | 14:07 |
thorst_ | it seems like we may have more problems with the async eventing. | 14:07 |
thorst_ | but I also agree that in the case of force immediate, we should wait until it says it's done | 14:07 |
thorst_ | (it just should be done really damn fast) | 14:07 |
efried | That said, if the original timeout was 1s, and we do the force in 1s, we can't reasonably expect even a force shutdown to get all the way through the REST server's Jobs module, to PHYP, and back in 1s, especially on a loaded system. | 14:08 |
thorst_ | right. | 14:08 |
efried | What are you talking about wrt "async eventing"? | 14:08 |
thorst_ | when we shut down a VM, there is an 'async event' sent from the hypervisor to the REST server | 14:08 |
thorst_ | and the rest server will finish the job when that comes in | 14:08 |
efried | btw, I'm not sure how this change is going to pass sonar. Did we disable cyclomatic complexity for this module? | 14:08 |
thorst_ | so if there are a lot of events... | 14:08 |
thorst_ | efried: no idea...I hope we did | 14:09 |
efried | checking... | 14:09 |
efried | nooooo... | 14:09 |
thorst_ | well, we can have a Slack I hate Jenkins fest again | 14:10 |
efried | But I remember we changed this thing recently, to add the Force enum - how tf did we pass sonar then? | 14:11 |
efried | Oh well, guess we'll see. | 14:11 |
efried | We should add UT for this. | 14:11 |
thorst_ | I think we swore at it enough to make it feel bad and let it through | 14:12 |
thorst_ | even the UT for power off is awful | 14:12 |
efried | I can see ways we could refactor this code, but it would be risky because of all the things consuming it. | 14:14 |
*** mdrabe has joined #openstack-powervm | 14:14 | |
thorst_ | yep | 14:14 |
thorst_ | nothing in UTs checks the timeout now | 14:14 |
thorst_ | great. | 14:14 |
*** kriskend has joined #openstack-powervm | 14:15 | |
*** kriskend has quit IRC | 14:16 | |
*** kriskend has joined #openstack-powervm | 14:16 | |
*** tjakobs has joined #openstack-powervm | 14:23 | |
*** tblakes has joined #openstack-powervm | 14:27 | |
*** kriskend has quit IRC | 14:27 | |
*** esberglu has joined #openstack-powervm | 14:27 | |
*** kriskend has joined #openstack-powervm | 14:27 | |
*** kriskend has quit IRC | 14:29 | |
*** kriskend has joined #openstack-powervm | 14:29 | |
*** apearson has joined #openstack-powervm | 14:30 | |
*** esberglu_ has joined #openstack-powervm | 14:30 | |
*** esberglu has quit IRC | 14:33 | |
efried | thorst_ One thing: should a 2G image be taking 470s to "upload" on e.g. neo40 with nothing else going on? | 14:33 |
*** Jay1 has quit IRC | 14:33 | |
thorst_ | efried: depends...comes from NovaLink to local VIOS? | 14:34 |
thorst_ | probably not | 14:34 |
efried | Also regularly seeing lots and lots of the forced pipe close messages. | 14:34 |
efried | This is for the in-tree code. | 14:34 |
thorst_ | uhhh | 14:34 |
thorst_ | that seems wrong | 14:34 |
thorst_ | updated pypowervm? | 14:35 |
efried | 1.0.0.4 | 14:35 |
*** tlian has joined #openstack-powervm | 14:39 | |
efried | thorst_ Jenkins passed | 15:18 |
thorst_ | well, rip it? | 15:18 |
efried | Not sure how, but gift horse, mouth, etc. | 15:18 |
thorst_ | I'm going to rip it | 15:18 |
efried | ight | 15:18 |
thorst_ | esberglu_: that should automatically get picked up being in pypowervm now right? | 15:19 |
thorst_ | or do we need a new base image rebuild? | 15:19 |
openstackgerrit | Merged openstack/networking-powervm: Use neutron-lib portbindings api-def https://review.openstack.org/422759 | 15:24 |
esberglu_ | New base image rebuild | 15:24 |
esberglu_ | You're talking the power off change I'm assuming? | 15:25 |
efried | esberglu_ yes, 4761 | 15:28 |
efried | now merged. | 15:28 |
thorst_ | esberglu_: can we rebuild it now? | 15:29 |
thorst_ | wipe the existing ready nodes and rebuild? | 15:29 |
thorst_ | https://github.com/powervm/pypowervm/tree/develop | 15:30 |
thorst_ | it's the latest commit in there | 15:30 |
esberglu_ | Sure, I can just rebuild the mgmt node. | 15:30 |
esberglu_ | ls | 15:32 |
thorst_ | -la | 15:33 |
*** kriskend has quit IRC | 16:12 | |
esberglu_ | thorst_: efried: adreznec: Yesterday I got a bunch of spawn tests to pass on an in-tree CI run. | 16:20 |
esberglu_ | Instead of defining the networks in prep_devstack, allow the networks to be defined by os_ci_tempest.sh | 16:20 |
esberglu_ | I only have tested this on one manual run, about to do a second and confirm | 16:20 |
esberglu_ | I can pm you guys the test results if you want | 16:20 |
adreznec | esberglu_: How do the generated networks differ? | 16:21 |
*** burgerk has joined #openstack-powervm | 16:21 | |
esberglu_ | os_ci_tempest sets some tempest conf. stuff when creating the networks that wasn't getting set when doing it in prep_devstack | 16:21 |
*** k0da has quit IRC | 16:21 | |
*** kriskend has joined #openstack-powervm | 16:22 | |
efried | Yeah, I remember writing os_ci_tempest.sh to assume certain specific network names etc. | 16:23 |
esberglu_ | I think the initial change was because devstack was creating networks, but not the way we wanted them. So we turned devstack network creation off | 16:24 |
esberglu_ | And then defined them ourselves | 16:24 |
esberglu_ | But I'm not 100% clear on why we did that instead of letting tempest create them | 16:24 |
esberglu_ | os_ci_tempest that is | 16:25 |
esberglu_ | I also let a run go through with that change for OOT | 16:25 |
esberglu_ | It didn't seem to cause any issues | 16:26 |
esberglu_ | But I would like to test it with more runs before putting it in | 16:26 |
esberglu_ | The only difference in the net creates | 16:30 |
*** apearson has quit IRC | 16:30 | |
esberglu_ | There are some differences in the net-creates though | 16:31 |
esberglu_ | The networks in prep_devstack are --shared, but not in os_ci_tempest | 16:33 |
esberglu_ | And the public net in os_ci_tempest is defined with an external router | 16:33 |
*** apearson has joined #openstack-powervm | 16:35 | |
efried | Well I, for one, don't understand how any of that stuff works or how it affects anything or why it should matter. Networking is thorst_'s bailiwick. | 16:40 |
*** mdrabe has quit IRC | 16:41 | |
esberglu_ | For now I say leave OOT as is until we can test it further. And just make the change for IT if it is confirmed to work | 16:44 |
*** apearson has quit IRC | 16:49 | |
*** mdrabe has joined #openstack-powervm | 16:49 | |
*** nbante has joined #openstack-powervm | 16:53 | |
*** apearson has joined #openstack-powervm | 17:00 | |
*** apearson has quit IRC | 17:21 | |
*** nbante has quit IRC | 17:22 | |
*** apearson has joined #openstack-powervm | 17:47 | |
*** k0da has joined #openstack-powervm | 18:55 | |
*** apearson has quit IRC | 19:00 | |
*** apearson has joined #openstack-powervm | 19:05 | |
*** apearson has quit IRC | 19:37 | |
*** apearson has joined #openstack-powervm | 19:47 | |
esberglu_ | thorst_: efried: adreznec: Confirmed that using the networks from os_ci_tempest.sh works for in tree | 20:10 |
thorst_ | for now | 20:10 |
esberglu_ | I created a whitelist based on the results of that run. Gonna redeploy the staging CI and test the whitelist and get a few more runs with this change through | 20:13 |
adreznec | Cool | 20:32 |
*** smatzek has quit IRC | 20:58 | |
*** smatzek has joined #openstack-powervm | 20:59 | |
*** tblakes has quit IRC | 21:12 | |
*** smatzek has quit IRC | 21:22 | |
*** tblakes has joined #openstack-powervm | 21:25 | |
thorst_ | I revote that this security group test should just be skipped. | 22:02 |
esberglu_ | I have a patch up for it, just responded to a question adreznec had on it | 22:09 |
thorst_ | can I just rogue +2 that beast? | 22:10 |
adreznec | lol | 22:10 |
esberglu_ | He was just asking if we want to merge the change to the skip list, or just pick up the change as a patch in production. I think it is better to just merge it | 22:13 |
thorst_ | yeah, as long as we root cause it | 22:13 |
thorst_ | and not just forget it later | 22:13 |
adreznec | Sure | 22:15 |
adreznec | thorst_: feel free to rogue +2 with that caveat | 22:15 |
thorst_ | I did that like 5 minutes ago | 22:15 |
adreznec | lol | 22:15 |
thorst_ | #rogue | 22:15 |
thorst_ | I'm out for the weekend. See ya! | 22:15 |
*** thorst_ has quit IRC | 22:16 | |
*** edmondsw_ has quit IRC | 22:23 | |
*** edmondsw has joined #openstack-powervm | 22:23 | |
*** edmondsw has quit IRC | 22:28 | |
*** kriskend has quit IRC | 22:28 | |
*** esberglu_ has quit IRC | 22:36 | |
*** esberglu has joined #openstack-powervm | 22:36 | |
*** apearson has quit IRC | 22:39 | |
*** esberglu has quit IRC | 22:41 | |
*** burgerk has quit IRC | 22:50 | |
*** esberglu has joined #openstack-powervm | 22:50 | |
*** esberglu has quit IRC | 22:54 | |
*** dwayne has quit IRC | 22:58 | |
*** tblakes has quit IRC | 23:10 | |
*** mdrabe has quit IRC | 23:14 | |
*** tjakobs has quit IRC | 23:32 | |
*** edmondsw has joined #openstack-powervm | 23:43 | |
*** dwayne has joined #openstack-powervm | 23:48 | |
*** smatzek has joined #openstack-powervm | 23:52 | |
*** smatzek_ has joined #openstack-powervm | 23:53 | |
*** smatzek_ has quit IRC | 23:54 | |
*** smatzek_ has joined #openstack-powervm | 23:55 | |
*** smatzek has quit IRC | 23:57 |
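
The power-off discussion between 13:51 and 14:08 can be illustrated with a rough sketch: try a normal shutdown, and if that times out, escalate to an immediate shutdown that is waited on with a longer timeout instead of the caller's short one. This is not the pypowervm code or the contents of change 4761. `run_job`, `JobTimeout`, the `ON_FAILURE` and `NO_RETRY` members, and the 600-second floor are all hypothetical names and values chosen for illustration; only `Force.TRUE` and the idea of a `Force` enum come from the log.

```python
import enum


class Force(enum.Enum):
    """Hypothetical stand-in for the Force enum mentioned in the log."""
    TRUE = "true"              # go straight to the immediate shutdown
    ON_FAILURE = "on_failure"  # escalate only if the normal attempt times out
    NO_RETRY = "no_retry"      # never escalate


class JobTimeout(Exception):
    """Raised by run_job when a job does not finish within its timeout."""


def power_off(run_job, timeout=60, force=Force.ON_FAILURE,
              min_immediate_timeout=600):
    """Try a normal shutdown, then escalate to an immediate one.

    run_job(operation, timeout) is a hypothetical callable that blocks until
    the job finishes or raises JobTimeout. The 600 second floor is an
    arbitrary illustrative value, not one taken from the real patch.
    Returns the list of (operation, timeout) attempts that were made.
    """
    attempts = []
    operation = "immediate" if force is Force.TRUE else "normal"
    while True:
        if operation == "immediate":
            # The point argued in the log: a forced shutdown should be
            # waited on until it reports completion, so a short
            # caller-supplied timeout must not apply to this attempt.
            timeout = max(timeout, min_immediate_timeout)
        attempts.append((operation, timeout))
        try:
            run_job(operation, timeout)
            return attempts
        except JobTimeout:
            if operation == "immediate" or force is Force.NO_RETRY:
                raise
            operation = "immediate"  # second pass through the loop
```

Applying the same floor on every path that escalates keeps the paths consistent, which is what thorst_ asks for at 14:00. It does not explain why an immediate shutdown would need that long in the first place, the question efried raises at 14:06.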