13:00:48 <esberglu> #startmeeting powervm_driver_meeting
13:00:49 <openstack> Meeting started Tue Sep 19 13:00:48 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:50 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:52 <openstack> The meeting name has been set to 'powervm_driver_meeting'
13:01:00 <edmondsw> o/
13:01:04 <efried> o/
13:01:10 <mdrabe> o/
13:01:32 <esberglu> #topic In-Tree Driver
13:02:17 <edmondsw> tx to esberglu for getting the queens spec up
13:02:29 <efried> ++
13:02:34 <esberglu> I was planning on starting either the OVS or SEA IT patch this week
13:02:35 <esberglu> np
13:03:03 <edmondsw> we talked over the basic gist of the plan the spec lays out at the PTG, and the only response (from mriedem) was "that seems fine"
13:03:12 <edmondsw> esberglu sounds good
13:04:32 <esberglu> Not sure which yet, I think OVS has a patch started but CI isn't set up for it. SEA is set up for CI but no work has gone into it so far afaik
13:04:41 <esberglu> edmondsw: Cool
13:05:17 <edmondsw> mriedem did ask about whether the CI would use OVS or SEA, and efried said SEA
13:05:29 <edmondsw> makes sense since that's what we do OOT
13:05:50 <efried> It would be neat to figure out OVS for the CI, but they're not going to require it.
13:06:07 <edmondsw> yeah, if we could somehow test both that would be awesome
13:06:11 <efried> Because they recognize it would mean we would need essentially two separate CIs
13:06:16 <efried> and they don't force others to do that.
13:06:20 <esberglu> Okay so SEA first then
13:06:39 <esberglu> And then OVS IT with whatever CI changes are needed if practical
13:06:40 <efried> But yeah, it would be pretty cool if, like, we could have the CI alternate or something.
13:08:11 <edmondsw> config drive patch still needs reviews from our team
13:08:35 <esberglu> efried: Eh I think that would be weird.
13:08:44 <esberglu> Like if you do a recheck and suddenly it's using different networking
13:09:04 <edmondsw> esberglu yeah, it's definitely not optimal... but neither is never testing one of them
13:09:08 <efried> esberglu I don't disagree it would be weird.
13:09:14 <efried> Yeah, what edmondsw said.
13:09:23 <edmondsw> at least we would see intermittent errors if we were alternating
13:09:42 <efried> Like, maybe we test even-numbered change sets with one, odd-numbered with the other.
13:09:45 <edmondsw> not saying we have to do that, just something to think about
13:09:55 <efried> Then at least rechecks would be consistent.
13:10:23 <efried> But need a way to force one or the other, so we can specifically target changes that affect the networking code, or whatever...
13:10:27 <esberglu> edmondsw: efried: We can have this discussion further once we get to OVS. But yeah I agree we should try to get both of them being tested somehow
13:10:33 <esberglu> I'll keep it in the back of my mind
13:10:56 <efried> I think the major blocker there was us running local commands, right? Which we're no longer doing.
13:12:13 <edmondsw> another thing we need to think about IT is the docs.. we never finished our TODOs there
13:12:19 <esberglu> I don't think that was the only blocker. I have some notes somewhere that I need to dig up
13:12:40 <edmondsw> we added ourselves to the support matrix, but didn't add a hypervisors page for powervm to the config ref
13:13:45 <edmondsw> probably a bigger deal for queens than for pike, since the pike IT driver isn't all that usable without networking, but still...
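(The even/odd alternation efried floats above was only an idea, never a decision. A minimal sketch of how a CI job might pick a backend deterministically from the Gerrit change number, with an override for changes that touch networking code — the helper name and override mechanism are hypothetical, not anything the team built:)

```python
def pick_network_backend(change_number, override=None):
    """Choose which network backend a CI run should test.

    Hypothetical sketch of the even/odd idea: even Gerrit change
    numbers test SEA, odd ones test OVS. Because the choice depends
    only on the change number, rechecks of the same change always use
    the same backend, which addresses esberglu's concern above.

    An explicit override lets us force a backend for changes that
    specifically affect one networking path.
    """
    if override in ('sea', 'ovs'):
        return override
    return 'sea' if change_number % 2 == 0 else 'ovs'
```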
13:13:57 <esberglu> Yeah I think they had a freeze on adding docs there when we tried to do that during pike
13:14:38 <edmondsw> if we could get the framework up for those docs, then we could be adding to it in the patches we are submitting for queens instead of after the fact
13:14:52 <esberglu> They were moving around the location of certain docs and didn't want new ones at that point
13:14:58 <esberglu> I can look into it
13:15:00 <edmondsw> yep, understood
13:15:06 <edmondsw> I think the location is locked down now
13:15:23 <edmondsw> I think that's all for IT?
13:15:29 <esberglu> #action esberglu: Finish docs updates
13:15:32 <esberglu> Yep
13:15:47 <esberglu> #topic Out-of-Tree Driver
13:16:43 <esberglu> Anyone have discussion items here?
13:17:31 <edmondsw> we had to spin pypowervm 1.1.8 to fix a max slots bug. The u-c update for that has merged now after some initial gate issues
13:18:00 <efried> Curse LPAR builder.
13:18:18 <edmondsw> tjakobs has a patch up for iSCSI that I need to look at
13:18:29 <edmondsw> if any others have time to check that out...
13:19:32 <edmondsw> we're pulling the never-really-supported Linux Bridge code
13:20:06 <efried> Did we decide to pull the trigger on that?
13:20:09 <edmondsw> tjakobs also has a patch up for ceph volumes to review
13:20:20 <edmondsw> efried yeah, I think so... I checked with thorst and he said pull it
13:20:37 <efried> k, well, that change is ready for review then.
13:21:31 <efried> privsep stuff <waves hands>
13:21:45 <edmondsw> mdrabe there's a merge conflict on https://review.openstack.org/#/c/497977/
13:21:59 <mdrabe> yep
13:22:58 <esberglu> efried: Yeah they renamed dac_admin so I just need to keep an eye out for that merge
13:23:34 <efried> Perhaps we should consider implementing support for Depends-On in our OOT CI.
13:23:44 <esberglu> efried: Yep it's on the TODO list
13:23:52 <efried> okay.
13:24:17 <edmondsw> ++
13:24:47 <esberglu> Ready to move on?
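(The Depends-On support efried raises above starts with scanning commit messages for the footer Gerrit/Zuul use to declare cross-repo dependencies. A rough sketch of that first step — the surrounding CI plumbing, and resolving each value to a change and patch set, is assumed and not shown:)

```python
import re

# Matches "Depends-On: <change-id or review URL>" footer lines in a
# commit message, the convention the community CI uses for cross-repo
# dependencies.
DEPENDS_ON_RE = re.compile(r'^Depends-On:\s*(\S+)\s*$',
                           re.IGNORECASE | re.MULTILINE)

def parse_depends_on(commit_message):
    """Return the list of changes a commit declares it depends on.

    Sketch only: a real CI would still have to look up each value in
    Gerrit and check out the right patch set before running tempest.
    """
    return DEPENDS_ON_RE.findall(commit_message)
```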
13:24:52 <edmondsw> yep, on to PCI
13:25:18 <esberglu> #topic Device Passthrough
13:26:45 <efried> We got buy-in at the PTG that the way Nova will handle device passthrough should be done in ways that won't paint non-libvirt into a corner.
13:27:13 <edmondsw> ++++++++++
13:27:23 <efried> I feel like we need three major pieces of framework before we can work a full solution:
13:27:46 <efried> 1) Nested resource providers (in placement). This got prioritized for early Queens.
13:29:20 <efried> 2) Ability to request allocations with a richer syntax, to support a) multiple allocations of the *same* resource class from *different* resource providers (e.g. VFs on different physnets); and b) ensuring multiple resources can be grabbed from the *same* resource provider (e.g. claim the VF inventory and the bandwidth inventory from the same PF). This is committed for Queens (jaypipes to write spec; I started some content)
13:29:39 <efried> 3) Affinity/anti-affinity via placement aggregates. This is *not* committed for Queens.
13:30:19 <efried> Once all of those things are done, Nova will consider starting on whatever framework is necessary for generic device management, with an eye towards getting rid of the existing PCI management code entirely.
13:31:00 <efried> But once all of those things are done, we may be able to get a jump start on converting over from...
13:31:02 <efried> The Hack
13:31:38 <efried> So for the Q cycle, we're going to move forward with https://review.openstack.org/494733 and https://review.openstack.org/496434
13:32:03 <edmondsw> did they talk about how to get rid of the existing PCI management code without breaking operators who are still using it? backward compat type stuff
13:32:14 <efried> defer, defer, defer
13:32:27 <efried> fingers in ears
13:32:34 <efried> lalalala, not going to think about it yet.
13:32:38 <edmondsw> lol
13:32:47 <efried> But it'll be a deprecation cycle.
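(The "richer syntax" in item 2 was still being specced at this point — see the brainstorm etherpad linked below. One hypothetical flavor of it, numbered request groups where each resources&lt;N&gt; group must be satisfied by a single provider, might be built like this; the parameter names are illustrative, not a final API:)

```python
from urllib.parse import urlencode

def build_allocation_query(groups):
    """Build a GET /allocation_candidates query with numbered groups.

    Hypothetical illustration of the richer-syntax idea: resources
    inside one numbered group come from the *same* resource provider
    (e.g. a VF and its bandwidth from the same PF), while separate
    groups may be satisfied by *different* providers (e.g. VFs on
    different physnets).

    groups: list of dicts mapping resource class name -> amount.
    """
    params = []
    for i, resources in enumerate(groups, start=1):
        value = ','.join('%s:%d' % (rc, amount)
                         for rc, amount in sorted(resources.items()))
        params.append(('resources%d' % i, value))
    return urlencode(params)
```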
13:33:24 <efried> I anticipate individual hypervisors being able to take the lead for themselves.
13:34:00 <efried> to some extent
13:34:03 <edmondsw> FYI all, mdrabe is going to be helping efried on this as able, and driving the corresponding work in PowerVC
13:34:41 <efried> because the ways of specifying resources (in general, not just devices) are going to have to cut over at some point.
13:36:28 <edmondsw> efried do we need a recheck on https://review.openstack.org/#/c/496434/ or are you going to upload a new ps?
13:37:11 <efried> Oh, yeah, it needs UT and stuff. It's not close to done. That and the spec are going to be my main gig for the next couple of weeks.
13:37:47 <efried> Also need to talk to consumer architects (Joe?) to be sure we understand the requirements.
13:38:09 <edmondsw> yeah, I need to set that up
13:38:22 <efried> e.g. do we need to do more hackage to overcome the limitation on grouping by vendor/product ID?
13:38:29 <edmondsw> #action edmondsw: schedule PCI attach mtg with stakeholders
13:39:46 <edmondsw> efried anything else here?
13:39:58 <efried> I don't think so.
13:40:07 <efried> actually
13:40:15 <efried> lemme just drop a couple more links here for reference...
13:40:26 <efried> https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management
13:40:42 <efried> https://etherpad.openstack.org/p/nova-multi-alloc-request-syntax-brainstorm <== this is for #2 mentioned above
13:41:36 <efried> that is all
13:41:56 <edmondsw> #topic PowerVM CI
13:42:23 <esberglu> Slowly knocking out tempest failures and CI todo backlog items
13:43:00 <esberglu> We still need to get console tests working for CI which was having problems due to remote
13:43:21 <esberglu> Haven't looked into that for a while because other things have been breaking
13:43:26 <edmondsw> did you and thorst ever come up with a plan for that?
13:43:45 <esberglu> We talked about it a little bit a couple times
13:43:49 <esberglu> But not a solid plan
13:44:16 <edmondsw> it's not my top priority
13:44:41 <edmondsw> I would work on OVS/SEA IT first, adding Depends-On support, obviously any CI breakages...
13:45:01 <esberglu> edmondsw: Yeah it keeps getting bumped by higher priority stuff. IIRC there are only 2 tempest tests for it
13:45:14 <edmondsw> but good to not forget it
13:45:54 <esberglu> I also am still working on that openstack dashboard. I got the web UI up on a test system but it wasn't populating properly
13:46:43 <edmondsw> what dashboard?
13:46:49 <efried> Yeah, what dashboard?
13:47:17 <esberglu> Openstack health dashboard
13:47:29 <esberglu> It's the tempest failure stats UI that the community uses
13:47:38 <esberglu> http://status.openstack.org/openstack-health/#/
13:48:15 <edmondsw> ok, I think we did talk about that
13:49:17 <esberglu> That's all I had for CI
13:49:22 <efried> *-powervm are on there
13:49:34 <efried> what do we need beyond that?
13:50:26 <esberglu> That's not results for PowerVM CI
13:50:36 <esberglu> That's results on *-powervm for the community CI
13:50:48 <efried> okay.
13:51:08 <edmondsw> oh, interesting...
13:51:29 <edmondsw> so if we get our CI added, how would that look? Munged together with what is already there for *-powervm?
13:51:49 <efried> or a separate dashboard that just shows our runs?
13:52:14 <esberglu> It would be a separate dashboard that just shows our runs
13:53:02 <esberglu> That dashboard only shows the tests that you see under the "Jenkins check" header on reviews. It doesn't have any 3rd party
13:53:14 <edmondsw> looks like you can click one of these, e.g. nova-powervm, and get a new page with more details about what jobs ran there
13:53:55 <edmondsw> esberglu oh, so you'd be setting up something totally separate?
13:54:02 <edmondsw> does any other 3rd party CI do that?
13:54:17 <esberglu> edmondsw: Yeah but that still is only showing the jenkins jobs (py27, py35) 13:54:26 <esberglu> It would be something totally separate 13:54:38 <esberglu> PowerKVM has one, I talked to them a little bit about what they had to do 13:55:00 <edmondsw> k 13:55:08 <edmondsw> sounds cool, but not critical 13:55:14 <efried> agree 13:55:27 <esberglu> edmondsw: efried: The idea of it is that we can get rid of failure emails 13:55:35 <esberglu> And just monitor the dashboard 13:55:55 <edmondsw> sure 13:56:09 <edmondsw> esberglu want to talk about zuulv3 13:56:13 <efried> We would still get emails for changes we're subscribed to, right? 13:56:28 <esberglu> efried: Yeah that's all external to us 13:56:30 <efried> Cause the likelihood of me actively and continuously "monitoring" the dashboard is nil. 13:56:48 <efried> But I still need to know if a change I care about fails. 13:57:03 <esberglu> efried: Yeah you would still get that I believe 13:57:18 <esberglu> But I would make sure before getting rid of the failures 13:57:32 <esberglu> edmondsw: Did you hear much about it at PTG? 
13:57:41 <esberglu> I didn't see a whole lot on the mailing list
13:57:57 <edmondsw> esberglu not much, just that they were trying to switch over, hit an issue early in the week, and were going to come back to it later in the week
13:58:12 <esberglu> edmondsw: Yeah that was my understanding too
13:58:21 <edmondsw> I should probably say "keep working on it" rather than "come back to it"
13:58:34 <esberglu> We aren't necessarily in a hurry to do that
13:58:39 <esberglu> It would just be nice
13:58:53 <edmondsw> and my understanding is that their switch doesn't really affect us, just would be nice if/when we can follow suit
13:58:58 <edmondsw> let them work out the kinks first, though
13:59:19 <esberglu> edmondsw: Yeah I'll just try to stay in the loop on it
13:59:23 <edmondsw> ++
13:59:32 <edmondsw> the other thing I was going to ask about is neutron/ceilometer ci
13:59:44 <esberglu> edmondsw: You saw my latest email on that?
14:00:17 <edmondsw> yeah, your email said that we caught that ceilometer issue in tox rather than in CI... which surprised me
14:00:29 <esberglu> We would need to talk to both neutron and ceilometer separately if we want to publish
14:00:54 <edmondsw> esberglu yeah, and I think that would probably be a good discussion to have... let's add that to the TODO list
14:01:03 <edmondsw> non-voting, of course
14:01:11 <esberglu> Yep
14:01:27 <esberglu> edmondsw: I also will add a todo to look into what we are testing for ceilometer
14:01:34 <edmondsw> tx
14:01:47 <edmondsw> done with CI?
14:01:51 <esberglu> Yep
14:02:05 <edmondsw> #topic Driver Testing
14:02:09 <edmondsw> I just have one quick thing here
14:02:31 <edmondsw> basically, we're just going to have UT and CI at this point
14:03:01 <edmondsw> I thought we might be able to get Ravi to do some tempest work, but we lost him as well as Jay
14:03:23 <edmondsw> so I will remove this from the agenda going forward
14:03:24 <edmondsw> :(
14:03:39 <edmondsw> of course OOT driver is still used/tested by PowerVC
14:04:09 <edmondsw> and I'll be asking for more resources to help here in the next cycle
14:04:40 <esberglu> okay
14:04:50 <edmondsw> but that will probably be dev resources, not test... so get ready to be asked to do more in tempest dev and usage
14:04:59 <edmondsw> that is all
14:05:07 <edmondsw> #topic Open Discussion
14:05:23 <efried> edmondsw Following up from PTG, can you please review this series: https://review.openstack.org/#/q/topic:bp/nova-validate-certificates+status:open
14:05:34 <esberglu> Any last thoughts? Nothing else from me
14:05:36 <edmondsw> efried yep, on my list
14:05:39 <efried> Thanks.
14:06:44 <esberglu> #endmeeting