#openstack-powervm log

13:02:57 <esberglu> #startmeeting powervm_driver_meeting
13:02:58 <openstack> Meeting started Tue Jun  6 13:02:57 2017 UTC and is due to finish in 60 minutes.  The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:03:00 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:03 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:03:06 <thorst> o/
13:03:38 <esberglu> #topic In Tree Driver
13:03:42 <esberglu> #link https://etherpad.openstack.org/p/powervm-in-tree-todos
13:04:03 <efried> o/
13:04:20 <edmondsw> o/
13:04:33 <efried> "Fixing" the get_info business led all the way down the rabbit hole.
13:04:47 <efried> https://review.openstack.org/471146
13:05:12 <efried> In the end, mriedem said we should just remove all the unused fields from InstanceInfo, everywhere.
13:06:09 <efried> This will impact the OOT driver if/when it merges.
13:06:48 <edmondsw> k
13:07:35 <edmondsw> that's for instances... were we also talking about host stats the other day? Are there unused fields to remove there as well?
13:07:50 <thorst> efried: you should probably run that by mdrabe.
13:07:53 <efried> We weren't talking about host stats.
13:07:59 <edmondsw> k
13:08:07 <thorst> I suspect pvc would be impacted (and I bet other OS products)
13:08:31 <efried> thorst Yeah, I was thinking it will probably be a good idea to blast the dev ML on this one.
13:08:53 <thorst> yep.  But just ping mdrabe on the side too.  I'm not sure how much they view the ML
13:08:58 <thorst> I know I can't (don't) keep up
13:09:05 <edmondsw> +1
13:09:19 <efried> esberglu Depending how mriedem's comment plays out, this might impact your support matrix change.
13:09:30 <efried> https://review.openstack.org/#/c/471146/2/doc/source/support-matrix.ini@249
13:09:38 <efried> Not sure if he's gonna ask to remove that whole section.
13:09:43 <esberglu> efried: ack
13:10:15 <efried> I think that's it for me in tree.
13:10:26 <edmondsw> I wanted to ask an IT question
13:10:56 <efried> floor is yours
13:10:58 <edmondsw> so when we were looking at the support matrix it clicked for me that our SSP support that merged IT is only ephemeral
13:11:30 <edmondsw> when we've talked about 2H17 priorities we've talked about network, config_drive, and vSCSI
13:11:51 <edmondsw> is vSCSI there ephemeral or data or both?
13:12:02 <edmondsw> and is vSCSI the top priority for data disk attach/detach, not SSP?
13:12:05 <efried> I don't remember a vSCSI discussion.  iSCSI maybe?
13:12:31 <thorst> cinder
13:12:31 <edmondsw> thorst had said vSCSI
13:12:37 <mdrabe> Do we want vSCSI IT?
13:12:39 <thorst> thorst said cinder (via vSCSI)
13:12:45 <mdrabe> (o/ btw)
13:12:56 <thorst> vSCSI is simply a way to connect storage to a VM
13:12:59 <edmondsw> mdrabe read up, there was something for you above
13:13:00 <efried> Gotcha.  So the VSCSIVolumeAdapter.
13:13:07 <thorst> when we talk about it in terms of Cinder, we typically mean FC volumes to a VM
13:13:10 <mdrabe> ack
13:13:15 <thorst> in fact in PVC, we simplified vSCSI to just mean that
13:13:24 <thorst> but vSCSI is used for SSP, iSCSI, FC PV, etc...
13:13:45 <thorst> so I probably used the wrong language there
13:13:53 <thorst> I meant cinder support via vSCSI
13:14:05 <efried> Really, for FC?  I thought we had a fibre channel mapping that was different from a VSCSI mapping.
13:14:16 <thorst> FC also has this fancy NPIV support
13:14:16 <efried> Anyway, separate discussion.
13:14:38 <thorst> which is an SR-IOV-like thing for FC...though, yeah, separate discussion
13:14:47 <efried> Point is, we're looking to support the VSCSIVolumeAdapter in tree.
13:14:53 <thorst> +1
13:15:07 <edmondsw> thorst, in terms of the support matrix, what should we be trying to flip to partial/complete among the storage.block items?
13:15:08 <edmondsw> https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L945
13:15:15 <edmondsw> https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L972
13:15:23 <edmondsw> https://github.com/openstack/nova/blob/master/doc/source/support-matrix.ini#L993
13:15:25 <edmondsw> etc.
13:16:18 <thorst> 945 - partial, 972 - complete (though we can add NPIV later), 993 - missing (for now?  If we can tuck in awesome)
13:16:21 <edmondsw> I think you're saying L972 via cinder vSCSI
13:16:30 <edmondsw> k
13:16:38 <thorst> reality is that today, everyone is FC.  So that's the hole we should fill first for IT.
13:16:55 <edmondsw> what about cinder via SSP?
13:17:04 <thorst> no cinder driver for SSP
13:17:11 <edmondsw> oh, really
13:17:13 <mdrabe> everyone is FC? you mean Power folks?
13:17:14 <thorst> we talked about making one...but it never came to fruition
13:17:20 <efried> https://review.openstack.org/#/c/372254/
13:17:21 <thorst> PowerVM - everyone is FC
13:17:23 <efried> Still open ;-)
13:17:25 <thorst> rest of world...not so much
13:17:35 <efried> Last action in January
13:17:40 <thorst> efried: yeah...
13:17:52 <thorst> we were hoping that would then allow us to make a cinder driver
13:17:57 <thorst> I think people got pulled in other directions
13:18:08 <thorst> like iSCSI...and my other crazy volume connectors
13:18:21 <edmondsw> thorst so when PowerVC uses SSP for data volumes... how is it doing that without a cinder driver?
13:18:40 <thorst> edmondsw: they have a cinder driver, but it isn't upstreamed yet
13:18:46 <edmondsw> ah
13:19:34 <esberglu> Anything else IT?
13:19:37 <mdrabe> I've a question
13:19:42 <efried> Real quick, back to the get_info discussion, this also came out of it: https://review.openstack.org/#/c/471106/
13:19:46 <efried> trivial
13:19:49 <mdrabe> Where does os-brick come in to play with volume connectors?
13:20:23 <thorst> mdrabe: good q...shyama was looking into that.  It's a way to replace (I think?) the connection_info object (bdm)
13:20:28 <thorst> not super sure
13:20:32 <edmondsw> I've been working on deactivating the compute service when we can't get a pvm session or there are no VIOS ready, but not ready to put up for review quite yet
13:21:11 <mdrabe> Ok yea can discuss later
13:22:21 <esberglu> Alright moving on
13:22:29 <esberglu> #topic Out Of Tree Driver
13:22:59 <efried> Perf improvement change (https://review.openstack.org/469982) - I owe another patch set.
13:23:17 <efried> But the testing came back good on that, so once those fixups are in, I think we're good to go there.
13:23:35 <efried> Then I plan to look into the "don't need a whole instance for the NVRAM manager" thing.
13:23:48 <efried> which could also yield perf improvements... maybe.
13:24:08 <efried> Gotta do that quick before arnoldja moves on to bluer pastures.
13:24:15 <thorst> efried: I don't disagree with what you did
13:24:20 <thorst> I just feel dirty about it
13:24:24 <efried> heh
13:24:28 <thorst> 'let's just wait 15 seconds for everything'
13:24:34 <thorst> 'because this API vomits up events'
13:24:36 <efried> Well, anything PartitionState
13:24:58 <thorst> so I don't disagree...I just think it's bleh
13:25:04 <efried> It's always that way with perf improvements.
13:25:11 <thorst> yep yep yep
13:25:13 <efried> Most of the time they make the code uglier.
13:25:20 <thorst> just letting my voice be heard.  :-p
13:25:36 <esberglu> This week is the pike 2 milestone (thursday), so I will be tagging the repos accordingly.
13:26:09 <mdrabe> efried is there a LP bug for that?
13:26:23 <efried> for the perf thing?
13:26:33 <mdrabe> yea... I guess it's not technically a bug
13:26:36 <efried> https://launchpad.net/bugs/1694784
13:26:37 <openstack> Launchpad bug 1694784 in nova-powervm "Reduce overhead for redundant PartitionState events" [Undecided,New]
13:27:04 <mdrabe> Ah neat thx
13:29:02 <efried> This might have been said last week, but the get_inventory thing is on hold pending further baking of the infrastructure.
13:29:12 <thorst> yep
13:29:12 <efried> https://review.openstack.org/468560
13:29:28 <efried> They've hit a snag with the design of shared resource providers.
13:29:53 <efried> It's going around the ML at the moment.  Not sure how that's gonna shake out.  An elegant solution is not yet forthcoming.
13:30:17 <efried> Subject line, in case you want to follow along at home: [openstack-dev] [nova][scheduler][placement] Allocating Complex Resources
13:30:25 <efried> I guess that's an IT/OOT thing.
13:31:08 <efried> Oh, I wanted to bring up t9n
13:31:50 <efried> I saw an email a couple days ago that may affect our stance on how aggressive we become about removing translations from various places.
13:32:26 <thorst> t9n?
13:32:30 <thorst> what does that stand for?
13:32:39 <efried> translation
13:32:51 <edmondsw> what was the email?
13:32:54 <efried> Right now the policy we're following in nova-powervm is just not translating any new log messages, and removing from anything we happen to touch while doing mods.
13:33:05 <thorst> so the 9 stands for len('ranslatio')
13:33:11 <efried> networking-powervm is subject to a hacking rule that disallows *any* translation.
13:33:21 <efried> (thorst, yeah, like i18n, etc.)
13:33:26 <efried> and k8s ;-)
13:33:37 <thorst> (got it - finally understand i18n too)
13:33:55 <efried> ...and I'm not sure whether we've even talked about a policy for pypowervm.
13:33:58 * thorst feels dumb
13:34:31 <thorst> efried: well, pypowervm is consumed by more ways than OpenStack...we've got two or three other direct users.
13:34:38 <thorst> I think any change there would need to be run by them
13:35:04 <thorst> we'd probably want to ask clbush from a CLI perspective too.
13:35:20 <edmondsw> I assumed efried was going to talk about nova-powervm, not pypowervm
13:35:39 <edmondsw> oh, missed a line
13:35:56 * edmondsw feels dumb, joining thorst
13:36:15 <efried> So okay, agree that discussion outside this group is needed for pypowervm.
13:36:20 <efried> What about nova-powervm?
13:36:36 <thorst> I dunno, I'm dragging my feet on that
13:36:48 <thorst> and I'll admit, it's really because I know pvc likes those messages translated.
13:36:48 <efried> It's probably not worth going all out and removing everything.
13:37:04 <edmondsw> thorst that's not true... PowerVC doesn't want log messages translated
13:37:13 <efried> thorst That's the email I was referring to, yeah.
13:37:18 <thorst> o, huh
13:37:31 <thorst> well, then yeah.  I'm fine with either being proactive or lazy about it then
13:37:33 <edmondsw> thorst PowerVC wants consistency, it just doesn't want to spend the resources to scrub the translations it already has in place
13:37:43 <thorst> got it.
13:37:47 <edmondsw> but a note was sent just a couple days ago about starting to scrub things if/when you can
13:37:53 <thorst> neat
13:38:07 <thorst> well, then ... same goes for ceilometer-powervm too
13:38:24 <thorst> that one is probably easier to do (and probably could benefit from a patch set done against it)
13:38:42 <edmondsw> I'd probably prioritize ceilometer-powervm above nova-powervm
13:38:43 <efried> Okay, upshot for nova-powervm and ceilometer-powervm is: no need to hold back if you feel like scrubbing out all the log t9n from those guys.
13:38:51 <efried> But it's not a high priority.
13:39:00 <thorst> yep
13:39:10 <edmondsw> +1
13:39:38 <efried> I added it to the etherpad https://etherpad.openstack.org/p/powervm-in-tree-todos line 69
13:40:01 <edmondsw> that it for OOT?
13:40:46 <efried> nothing else from me.
13:41:11 <esberglu> #topic PowerVM CI
13:41:21 <esberglu> https://etherpad.openstack.org/p/powervm_ci_todos
13:41:42 <efried> esberglu Okay, so you moved the CI to-dos out to another etherpad.
13:42:01 <esberglu> efried: Yeah I linked it in the other one
13:42:15 <esberglu> I can move it back if that's what people prefer
13:42:33 <esberglu> But I wanted to track tempest failures there and it was becoming a lot of info
13:42:58 <efried> esberglu I'm fine with it as long as everything's cross-linked.  I added a backpointer from the CI one to the original.
13:43:10 <esberglu> efried: Good call.
13:43:20 <efried> What's the difference between WORKING and CURRENT?
13:43:38 <esberglu> Stuff that I'm actually doing (in staging) vs stuff that's just on the list
13:44:12 <esberglu> We still need to figure out a way to get the VNC tests working
13:44:28 <esberglu> And check what tests (if any) can be enabled with SSP merged
13:44:38 <edmondsw> esberglu change "CURRENT" to "NEXT"?
13:44:44 <edmondsw> efried that clearer?
13:45:35 <efried> Yeah, that would be fine.  Not a big thang.
13:46:11 <esberglu> CI has been looking really good since the last couple fixes last week
13:46:29 <esberglu> Which should open up some time to start knocking this list out
13:46:50 <esberglu> I'm gonna go through and prioritize the list today
13:47:28 <esberglu> Been seeing way fewer of the timeout errors since I upped the time limit. Which to me points to slowness rather than hangs.
13:47:43 <edmondsw> esberglu can you make looking at the tempest failures part of that prioritized list?
13:47:50 <esberglu> edmondsw: Yeah
13:48:11 <edmondsw> esberglu so you did merge that timeout bump?
13:48:53 <esberglu> edmondsw: I thought we were just putting it in temporarily for investigation purposes. But I can
13:49:00 <efried> Hope not.  We need to have a lively discussion first.
13:49:15 <edmondsw> no, I just asked because you're seeing "way less"
13:49:27 <esberglu> edmondsw: I just live patched the jenkins jobs
13:49:34 <edmondsw> if it didn't merge, wouldn't it only be in effect on a one-by-one basis?
13:49:35 <edmondsw> oh
13:50:55 <efried> Basically, my stance on this is that our CI isn't just testing "does it work.... eventually?"  It's also there to alert us to what I'll call "performance problems" for lack of a better term.
13:51:23 <efried> So if stuff is taking a long time, we need to figure out why it's taking a long time, not just increase the timeout.
13:51:56 <edmondsw> efried yep, I think we all agree there
13:51:57 <efried> I would even go so far as to say, if we had the space for it, we should be *decreasing* timeouts to highlight things that are taking longer than they ought.
13:52:22 <edmondsw> I'll even agree with that... once we get these current timeouts figured out / addressed
13:52:28 <efried> yup.
13:53:29 <esberglu> Should I remove the timeout increase now? It will be easier to find failing runs to investigate that way
13:53:48 <efried> esberglu When you've got the space to really start digging into them, yes.
13:54:04 <efried> Not necessary if it's just going to result in more failures but no action.
13:54:21 <edmondsw> +1 or when one of us pings you that we have that time
13:54:35 <esberglu> efried: Ok. I want to do a couple other things first (like get the neo logged) which should help for debugging
13:54:44 <efried> fo sho.
13:55:15 <edmondsw> esberglu I'm not seeing getting the neo logged on your list
13:55:18 <efried> let me know if you need help figuring out how to do that; I have a couple of ideas.
13:55:24 <esberglu> edmondsw: Yeah that list is a WIP
13:55:41 <esberglu> That's all I had for CI
13:55:56 <esberglu> #topic Driver Testing
13:56:02 <esberglu> Any progress here?
13:56:30 <efried> We don't have testers on.  But thorst_afk added https://etherpad.openstack.org/p/powervm-in-tree-todos starting line 92
13:56:46 <efried> ...pursuant to our call the other day.
13:56:49 <thorst_afk> efried: we're lining up the test resources still.  I don't think any tangible change, just formulating plan
13:57:50 <esberglu> Any discussion needed here? Otherwise I'll move on, running close to time
13:58:10 <thorst_afk> don't think so
13:58:14 <esberglu> #topic Open Discussion
13:58:21 <esberglu> Any final thoughts before I call it?
13:58:43 <efried> It's really confusing in my HexChat interface that esberglu and edmondsw both start with 'e' and have the same number of letters.
13:58:56 <efried> My old IRC client had different colors for each user.  Haven't figured out how to do that in HexChat.
13:59:03 <thorst_afk> efried: bringing the real problems to light
13:59:08 <edmondsw> :)
13:59:09 <efried> You can count on me.
13:59:26 <thorst_afk> I'd make a quip...but yeah, we do count on you
13:59:59 <mdrabe> I got a q actually
14:00:09 <thorst_afk> alright, I need to bail.  Need to go spread the gospel of open vswitch
14:00:20 <mdrabe> For test, what's the desired deployment route, devstack or OSA?
14:00:34 <thorst_afk> mdrabe: for now, devstack due to simplicity of setup
14:00:45 <thorst_afk> which is not all that simple, until you compare to OSA.
14:01:02 <efried> Hah, ironic considering OSA is supposed to be the thing that makes it simple.
14:01:17 <thorst_afk> efried: OSA is the thing to make OpenStack production grade
14:01:17 <mdrabe> Would this be an opportunity to iron out the OSA path then?
14:01:22 <efried> Yeah
14:01:34 <efried> Sorry, thorst_afk Yeah.  mdrabe No.
14:01:38 <thorst_afk> mdrabe: kinda.  Let's chat more when I'm off the phone
14:02:10 <mdrabe> I'd say wainot, but sounds like it's complicated
14:05:10 <efried> esberglu Think we're done here.
14:05:31 <esberglu> #endmeeting