13:00:41 <esberglu> #startmeeting powervm_driver_meeting
13:00:42 <openstack> Meeting started Tue Apr 4 13:00:41 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:44 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:46 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:00:48 <thorst> o/
13:03:11 <esberglu> #topic Out Of Tree Driver
13:03:25 <esberglu> ocata is broken
13:03:34 <thorst> esberglu: the upload thing?
13:03:43 <esberglu> Yep
13:03:54 <thorst> crap...how did that get back ported...
13:04:06 <thorst> I thought we tested it earlier :-/
13:04:10 <thorst> in the staging env...
13:05:15 <esberglu> So CI is still down, I can redeploy again with newton or we can get this figured out and be without CI until then
13:05:44 <thorst> esberglu: well it seems like we're going to need a new pypowervm for it
13:05:44 <esberglu> I think I may have missed some of the convo yesterday
13:05:52 <thorst> and we're going to have to bump that back in ocata...which is awful
13:05:54 <esberglu> But it looked like efried was expecting this
13:06:06 <thorst> well, efried thought this could occur. I didn't think it could
13:06:26 <thorst> it was based off of whether or not whatever broke us was back ported
13:06:36 <thorst> efried will be on in 15 min and we can discuss more then
13:07:09 <esberglu> Do you know what broke us?
13:07:19 <thorst> nope
13:07:25 <thorst> that's the irony...and what I'm frustrated about
13:07:33 <thorst> we're trying to fix for something that we don't know why we broke
13:07:56 <esberglu> Well lets look at what got into ocata in the last week
13:08:10 <thorst> +2
13:08:13 <esberglu> Because I deployed ocata on staging at the end of last week no problem
13:10:04 <esberglu> There's nothing that has gone into ocata since I deployed on staging successfully
13:10:13 <esberglu> in the 3 powervm projects
13:10:18 <thorst> right.
13:10:21 <thorst> it'd be in nova itself
13:10:26 <esberglu> Yep I'm looking there now
13:10:43 <adreznec> It would have to be a bugfix at that point that broke us, right?
13:10:52 <adreznec> I mean Ocata's been cut for some time now
13:10:57 <thorst> I'd assume
13:10:58 <thorst> https://github.com/openstack/nova/commits/stable/ocata
13:11:02 <thorst> nothing much in the past week there tho
13:11:09 <adreznec> Right...
13:11:24 <thorst> thought maybe a global req change
13:11:26 <esberglu> None of that looks suspect
13:11:28 <thorst> but nothing much there
13:11:31 <thorst> pbr updated...
13:11:47 <adreznec> thorst: yeah, once ocata gets cut reqs are pretty much frozen
13:11:52 <adreznec> Unless there's a major breaking issue
13:12:32 <thorst> esberglu: I'm now curious if newton is hosed.
13:13:03 <esberglu> I hope not. That means CI is down for the count until this is resolved
13:13:13 <thorst> right.
13:13:56 <thorst> adreznec: does concurrent.futures use greenlet or eventlet?
13:14:09 <thorst> do you know?
13:15:26 <adreznec> thorst: not sure offhand
13:16:22 <adreznec> wait
13:16:28 <adreznec> isn't concurrent.futures a stdlib
13:17:31 <thorst> adreznec: yeah, but that's where efried sees us hanging
13:17:32 <adreznec> neither eventlet or greenlet are builtins, so it can't use either by default
13:17:48 <thorst> so efried wants to switch to all eventlet I think...
13:18:16 <efried> Howdy.
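
As a rough illustration of the pattern in question (names and data are made up; this is not the pypowervm code): a concurrent.futures thread pool blocks on one end of a pipe while the caller feeds the other end. Under plain CPython native threads this completes fine:

    import concurrent.futures
    import os


    def read_side(fd):
        # Blocks in a read() syscall until the writer closes its end.
        with os.fdopen(fd, 'rb') as reader:
            return reader.read()


    read_fd, write_fd = os.pipe()
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
        future = executor.submit(read_side, read_fd)
        with os.fdopen(write_fd, 'wb') as writer:
            writer.write(b'image bytes')
        print(future.result())  # b'image bytes'
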
13:18:29 <thorst> so the net is...I think that is now our highest priority
13:18:31 <efried> Not sure if I missed anything, but I have a plan for the broken upload in ocata.
13:18:33 <thorst> and we should work it first OOT
13:18:52 <thorst> rather than IT (so we don't create a bunch of misc reviews for the core side)
13:18:57 <thorst> efried: net is, CI is down
13:19:17 <thorst> what I think we're curious about is did something change in OpenStack...or was it perhaps even lower than that
13:19:23 <thorst> like eventlet or somewhere else
13:19:29 <adreznec> thorst: is this futures in py2.7 or py3?
13:19:53 <esberglu> Staging CI is still up. So if we need to run something through we can still do it there
13:19:57 <efried> For OOT ocata, we need a bug opened; then we a) port to ocata OOT the change that moves from FUNC back to IO_STREAM; b) update the pypowervm requirement to 1.1.1. And, of course, we'll need to release 1.1.1.
13:19:59 <adreznec> I'll admit I'm not totally in the loop on the upload issue
13:20:30 <thorst> efried: what is the 'fix'
13:20:34 <thorst> down in pypowervm
13:20:37 <efried> Yeah, so something changed in eventlet recently - I still haven't nailed down exactly what, but sdague gave me some vague pointers last week.
13:20:51 <thorst> well shit
13:20:56 <adreznec> efried: something that would have changed since ocata was released?
13:20:57 <thorst> that'll affect things way back
13:21:10 <efried> The fix is two-sided, unfortunately. In pypowervm, we have to kill coordinated upload. In community, we have to kill FUNC.
13:21:12 <adreznec> otherwise shouldn't it be pinned by version in reqs?
13:21:27 <efried> Because there's no way to do FUNC without threads, and there's no way to do coordinated without threads.
13:21:44 <efried> The alternative is to retool pypowervm to use eventlet instead of futures.
13:21:50 <thorst> efried: so is concurrent.futures just dead?
13:22:00 <thorst> because of a change in eventlet?
13:22:01 <efried> No, it's just incompatible with greenlet.
13:22:18 <efried> Although that might not be entirely true.
13:22:20 <thorst> so FUNC is still viable, just not in an OpenStack env.
13:22:40 <thorst> and it also calls into question if we need a change for your VIOS Task thingy
13:22:46 <thorst> which also uses concurrent.futures
13:22:55 <efried> I don't disagree with that.
13:23:12 <thorst> non-disagreement is as close as we can hope for to a resounding agreement
13:23:21 <thorst> efried: so are you focused on that change today?
13:23:27 <thorst> and we just keep CI down while we fix that?
13:23:47 <efried> Which change? Get rid of FUNC and coordinated in ocata OOT?
13:23:58 <efried> Or convert to greenlet?
13:24:23 <efried> Perhaps we should take a couple of minutes and go over what I (sort of, maybe) know so far about the underlying cause.
13:24:35 <thorst> efried: yes, lets do that
13:24:41 <efried> So from my research with mdrabe yesterday, I *think* it goes like this:
13:24:53 <efried> There's two kinds of multiprocessing models available: threads and greenlets.
13:24:55 <adreznec> So just to clarify
13:25:04 <adreznec> What version of pypowervm is this using
13:25:07 <adreznec> With Ocata
13:25:15 <adreznec> 1.1.0?
13:25:17 <efried> I don't fully understand the difference between them, but they're totally different animals, not just different implementations on top of the same underlying threading model.
13:25:49 <efried> Openstack uses greenlets throughout. They even have a hacking check in place to make sure you're using eventlet through their nova.utils wrapper of it.
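
For reference, a minimal sketch of the green-thread model being described here, assuming eventlet is installed (illustrative names, not nova code): greenlets only make progress when the running one yields cooperatively.

    import eventlet


    def worker(name):
        for step in range(3):
            print('%s step %d' % (name, step))
            eventlet.sleep(0)  # cooperative yield back to the hub


    green_threads = [eventlet.spawn(worker, label) for label in ('a', 'b')]
    for gt in green_threads:
        gt.wait()
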
13:26:20 <efried> (adreznec, not sure, but to fix this we'll need to release 1.1.1 and bump the ocata req to that. Is that even legal?)
13:26:43 <adreznec> Uh
13:26:44 <esberglu> ocata is using 1.0.0.4 I believe
13:26:51 <adreznec> We can technically do that for Ocata
13:26:53 <adreznec> I guess...
13:27:07 <adreznec> in the future... please, let's never have to deal with that
13:27:08 <efried> Yeah, the req bump will need to happen regardless of which way we fix this. Unless we can figure out some as-yet-unknown way to fix it purely in the community code.
13:27:36 <adreznec> I'm just wondering if this is only broken with some combination of versions
13:27:53 <adreznec> e.g. only with pypowervm 1.1.0 because that's where we require futures>=3.0
13:28:08 <adreznec> vs just "futures"
13:28:12 <adreznec> with no version req
13:28:32 <efried> Mm. And presumably openstack doesn't require futures?
13:28:39 <adreznec> nope
13:28:44 <adreznec> sorry, yes, they do
13:28:54 <adreznec> that's where the >3.0 req came from
13:29:06 <adreznec> (walking to a meeting)
13:29:55 <efried> Anyway, threading in python apparently has this GIL (global interpreter lock) which actually makes it so that literally only one thread runs at a time - the others are stopped.
13:30:05 <thorst> right.
13:30:28 <efried> Normally this is okay because threads can yield and allow other threads to run, so as long as your actual programming doesn't have deadlocks in it, you're aaight.
13:30:29 <efried> But
13:30:41 <efried> This sucker is blocking on a syscall.
13:30:47 <efried> Which doesn't yield.
13:31:36 <efried> So all the other threads - including the greenlets, including the one that would kick the REST server to do its open, which would unblock the write side - are frozen.
13:31:58 <efried> Now, there is apparently a way to explicitly release the GIL.
13:32:30 <thorst> ?
13:33:03 <efried> That might be the least disruptive path, if we can figure out how to do it. But a) it's going to be a hack (more on that in a bit), and b) it might not work in the context we would need to do it in - that is, it might only work if we can do it right at that open() call, which is in code we don't own.
13:33:36 <thorst> efried: to me...lets just fix it proper...
13:33:43 <thorst> greenlet (not eventlet)?
13:35:33 <efried> Yeah, so I don't know what the difference is there - those are different libs - but they use greenlets / green threads under the covers.
13:35:57 <efried> Whereas anything that says "thread" - like the native thread library, or concurrent.futures - uses the other kind of threads.
13:36:37 <thorst> I'd assume we can use greenlets for most things...but the pipe may not be able to use greenlet...
13:37:06 <efried> That's an unknown at this point. But I imagine it's gotta be possible.
13:37:17 <thorst> ok...
13:37:42 <thorst> so the net is, due to this, we need to bump pypowervm...get a bug for nova-powervm...and possibly back port this way (enough) back
13:38:00 <efried> However, given that we're going to need to release a new pypowervm and bump the OOT req to it anyway, I would just as soon do the fix that avoids threading altogether.
13:38:09 <thorst> waler is testing on Mitaka, so he could actually probably tell us if Mitaka is impacted :-)
13:38:45 <thorst> efried: for the upload? Sure. VIOS Feed Task stuff...not so sure
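
One way to picture the workaround space for the blocking read described above (an assumption-laden sketch, not the fix the team settles on here): eventlet's tpool helper pushes the blocking syscall out to a native worker thread, so the hub and the other greenlets keep running.

    import os

    import eventlet
    from eventlet import tpool


    def blocking_read(fd):
        # A plain read() syscall; run via tpool so it lands in a native
        # worker thread instead of stalling the greenlet hub.
        return os.read(fd, 1024)


    read_fd, write_fd = os.pipe()
    reader_gt = eventlet.spawn(tpool.execute, blocking_read, read_fd)
    os.write(write_fd, b'payload')
    print(reader_gt.wait())  # b'payload'
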
13:39:36 <efried> thorst Agree, but I think what saves us there is that what's running in that thread is non-blocking.
13:39:36 <thorst> alright...so I guess that's priority 1...
13:39:53 <thorst> yep...I agree that is probably not highly impacted.
13:39:55 <efried> So maybe it hitches the process while it's doing that POST, but as soon as the POST comes back, we keep truckin.
13:40:09 <thorst> yeah, kinda ick, not uber ick
13:40:12 <efried> Not ideal, and perhaps something we should look into for the future, but not first priority, right.
13:40:37 <thorst> alright...so gameplan build out here...
13:40:50 <thorst> 1) esberglu - would you be willing to make the bug and tag at least back to ocata
13:41:05 <esberglu> Sure
13:41:21 <thorst> I honestly suspect that newton / mitaka may still be impacted...would love to know that if you have time to redeploy with newton
13:41:47 <thorst> 2) efried you're updating the pypowervm bits?
13:42:03 <thorst> 3) should I do the nova-powervm bits to swap off FUNC?
13:42:04 <esberglu> thorst: I should be able to do that in the background today
13:43:11 <efried> thorst Actually, let me do it.
13:43:19 <efried> Take a look at this delta: https://review.openstack.org/#/c/443189/15..16/nova/virt/powervm/disk/ssp.py
13:43:32 <thorst> yeah
13:43:38 <efried> It'll be like that, except we won't actually need the IterableToFileAdapter.
13:43:41 <thorst> we would need to basically revert into that change for both localdisk
13:43:44 <thorst> and ssp
13:43:50 <thorst> why not?
13:43:53 <efried> Because I'm gonna make a change to pypowervm ;-)
13:43:59 <efried> Since we're going to need a new version of that anyway.
13:44:02 <efried> Backward compatible.
13:44:06 <thorst> don't make it even more complicated :-)
13:44:07 <efried> But eliminating the need for IterableToFileAdapter.
13:44:17 <efried> It makes it less complicated, really.
13:45:00 <efried> The HTTP request expects an iterable. Glance gives us an iterable. For some reason we had pypowervm expecting a file and converting it to an iterable, so the community had to convert the iterable to a file just so pypowervm could convert it back.
13:45:11 <efried> which is stupid.
13:45:26 <thorst> hmm...ok
13:45:45 <thorst> well, I guess I'll let you do magic and be on point to be a reviewer?
13:46:24 <thorst> how do we make actions in the meeting?
13:46:41 <thorst> (for the meeting minutes)
13:46:53 <esberglu> #action esberglu: Open bug for upload issue
13:48:12 <efried> I'm going to use 5083, which is already most of the way there. Just need to add the iterable killer.
13:48:41 <thorst> #action efried drive pypowervm and nova-powervm fixes for upload issue
13:48:54 <thorst> #action esberglu determine if newton is impacted
13:50:01 <thorst> efried: 5083 - we should tag that with the bug that esberglu is making
13:50:08 <thorst> so we have one bug capturing this whole nightmare...
13:50:27 <thorst> #action adreznec ship out a new pypowervm once this whole fiasco is solved :-)
13:50:38 <thorst> (heh)
13:50:44 <adreznec> lol
13:50:59 <adreznec> might need to sync that with julio
13:53:01 <thorst> what else do we have for the meeting?
13:53:10 <esberglu> Cool sounds like we have a plan. Meeting is almost up, anyone have anything else? I don't have anything for CI
13:53:17 <esberglu> efried: anything in-tree?
13:53:32 <esberglu> Just waiting for reviewers at this point correct?
13:53:54 <thorst> nbante and jay are still doing testing... I know jay is hitting issues, I'm trying to help out once I get in the env. nbante I think is stuck on something with tempest in OSA.
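
To illustrate the iterable point made above: an HTTP library such as requests can stream a generator body directly, so there is no need to wrap the Glance chunk iterator in a file-like adapter just to unwrap it again. The endpoint and chunk source below are hypothetical; this is not the pypowervm API.

    import requests


    def image_chunks():
        # Stand-in for the chunk iterator Glance returns for an image download.
        for chunk in (b'chunk-one', b'chunk-two'):
            yield chunk


    # requests sends a generator body as a chunked upload, so the iterable can
    # be handed straight to the upload call.
    response = requests.post('https://rest.example.com/upload', data=image_chunks())
    print(response.status_code)
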
13:53:56 <efried> I think by the time we hit the SSP change set, we'll need to bump the in-tree reqs to the new pypowervm.
13:55:09 <nbante> correct..adreznec: check once if we have to uncomment anything in user_config.yml to get that to work
13:56:20 <adreznec> nbante: you'll likely have to experiment on your own there today. I'm pretty much swamped in meetings until later this afternoon
13:57:21 <nbante> sure..I already tried most of the parts. But will give it a shot in 2-3 hours. If it doesn't work, will send you a note
13:58:54 <esberglu> nbante: I may have time to take a look today. I will let you know
13:59:16 <thorst> nbante: one thought I had...
13:59:20 <nbante> sure..thanks
13:59:25 <thorst> I don't think we care if tempest is deployed via OSA
13:59:30 <thorst> or run from a separate server...
13:59:39 <thorst> you have a cloud, we just need to run tempest against it
13:59:41 <thorst> :-)
13:59:46 <thorst> so that gives you options to try
14:00:10 <thorst> but esberglu is more familiar than I am...so maybe he'll figure it out in 2 mins
14:00:25 <nbante> I never tried tempest so not sure how it worked. In SVT, we have our own framework
14:00:44 <esberglu> thorst: I haven't got tempest working yet either.... so I'm pretty much in the same boat as nbante right now for OSA CI
14:00:58 <thorst> esberglu: right...but my thought is
14:01:03 <thorst> we have tempest working for IT CI
14:01:06 <thorst> or OOT CI
14:01:13 <thorst> so...uh...how'd we set it up there?
14:01:21 <thorst> and can we do the same here?
14:02:56 <esberglu> At least some tweaks will be needed. We can discuss more when I really dive into it
14:03:01 <thorst> awesome
14:03:10 <thorst> nbante: are you deployed with Cinder?
14:03:16 <nbante> no
14:03:24 <nbante> using local disk only
14:03:38 <thorst> so we're just getting back to where we were?
14:04:55 <nbante> after local disk, I worked on tempest, which is where I got stuck
14:05:16 <nbante> do you want me to work on configuring cinder in parallel?
14:05:16 <thorst> ok
14:05:25 <thorst> ahh, right...so we were getting tempest and then moving to iSCSI cinder
14:05:33 <nbante> correct
14:05:33 <thorst> sorry, I'm getting my wires wrong :-)
14:06:19 <nbante> :)
14:06:40 <thorst> OK - I'll also catch up with Jay...
14:06:51 <thorst> can you check with him to get his IRC working?
14:07:21 <nbante> sure..will check
14:07:50 <thorst> ok - I didn't have anything else.
14:11:13 <esberglu> #endmeeting