13:00:48 #startmeeting powervm_driver_meeting
13:00:49 Meeting started Tue Sep 19 13:00:48 2017 UTC and is due to finish in 60 minutes. The chair is esberglu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:52 The meeting name has been set to 'powervm_driver_meeting'
13:01:00 o/
13:01:04 o/
13:01:10 o/
13:01:32 #topic In-Tree Driver
13:02:17 tx to esberglu for getting the queens spec up
13:02:29 ++
13:02:34 I was planning on starting either the OVS or SEA IT patch this week
13:02:35 np
13:03:03 we talked over the basic gist of the plan the spec lays out at the PTG, and the only response (from mriedem) was "that seems fine"
13:03:12 esberglu sounds good
13:04:32 Not sure which yet. I think OVS has a patch started but CI isn't set up for it. SEA is set up for CI but no work has gone into it so far afaik
13:04:41 edmondsw: Cool
13:05:17 mriedem did ask about whether the CI would use OVS or SEA, and efried said SEA
13:05:29 makes sense since that's what we do OOT
13:05:50 It would be neat to figure out OVS for the CI, but they're not going to require it.
13:06:07 yeah, if we could somehow test both that would be awesome
13:06:11 Because they recognize it would mean we would need essentially two separate CIs
13:06:16 and they don't force others to do that.
13:06:20 Okay so SEA first then
13:06:39 And then OVS IT with whatever CI changes are needed if practical
13:06:40 But yeah, it would be pretty cool if, like, we could have the CI alternate or something.
13:08:11 config drive patch still needs reviews from our team
13:08:35 efried: Eh I think that would be weird. Like if you do a recheck and suddenly it's using different networking
13:09:04 esberglu yeah, it's definitely not optimal... but neither is never testing one of them
13:09:08 esberglu I don't disagree it would be weird.
13:09:14 Yeah, what edmondsw said.
13:09:23 at least we would see intermittent errors if we were alternating
13:09:42 Like, maybe we test even-numbered change sets with one, odd-numbered with the other.
13:09:45 not saying we have to do that, just something to think about
13:09:55 Then at least rechecks would be consistent.
13:10:23 But need a way to force one or the other, so we can specifically target changes that affect the networking code, or whatever...
13:10:27 edmondsw: efried: We can have this discussion further once we get to OVS. But yeah I agree we should try to get both of them being tested somehow
13:10:33 I'll keep it in the back of my mind
13:10:56 I think the major blocker there was us running local commands, right? Which we're no longer doing.
13:12:13 another thing we need to think about for IT is the docs... we never finished our TODOs there
13:12:19 I don't think that was the only blocker. I have some notes somewhere that I need to dig up
13:12:40 we added ourselves to the support matrix, but didn't add a hypervisors page for powervm to the config ref
13:13:45 probably a bigger deal for queens than for pike, since the pike IT driver isn't all that usable without networking, but still...
13:13:57 Yeah I think they had a freeze on adding docs there when we tried to do that during pike
13:14:38 if we could get the framework up for those docs, then we could be adding to it in the patches we are submitting for queens instead of after the fact
13:14:52 They were moving around the location of certain docs and didn't want new ones at that point
13:14:58 I can look into it
13:15:00 yep, understood
13:15:06 I think the location is locked down now
13:15:23 I think that's all for IT?
13:15:29 #action esberglu: Finish docs updates
13:15:32 Yep
13:15:47 #topic Out-of-Tree Driver
13:16:43 Anyone have discussion items here?
13:17:31 we had to spin pypowervm 1.1.8 to fix a max slots bug. The u-c update for that has merged now after some initial gate issues
13:18:00 Curse LPAR builder.
13:18:18 tjakobs has a patch up for iSCSI that I need to look at
13:18:29 if any others have time to check that out...
13:19:32 we're pulling the never-really-supported Linux Bridge code
13:20:06 Did we decide to pull the trigger on that?
13:20:09 tjakobs also has a patch up for ceph volumes to review
13:20:20 efried yeah, I think so... I checked with thorst and he said pull it
13:20:37 k, well, that change is ready for review then.
13:21:31 privsep stuff
13:21:45 mdrabe there's a merge conflict on https://review.openstack.org/#/c/497977/
13:21:59 yep
13:22:58 efried: Yeah they renamed dac_admin so I just need to keep an eye out for that merge
13:23:34 Perhaps we should consider implementing support for Depends-On in our OOT CI.
13:23:44 efried: Yep it's on the TODO list
13:23:52 okay.
13:24:17 ++
13:24:47 Ready to move on?
13:24:52 yep, on to PCI
13:25:18 #topic Device Passthrough
13:26:45 We got buy-in at the PTG that the way Nova will handle device passthrough should be done in ways that won't paint non-libvirt into a corner.
13:27:13 ++++++++++
13:27:23 I feel like we need three major pieces of framework before we can work a full solution:
13:27:46 1) Nested resource providers (in placement). This got prioritized for early Queens.
13:29:20 2) Ability to request allocations with a richer syntax, to support a) multiple allocations of the *same* resource class from *different* resource providers (e.g. VFs on different physnets); and b) ensuring multiple resources can be grabbed from the *same* resource provider (e.g. claim the VF inventory and the bandwidth inventory from the same PF). This is committed for Queens (jaypipes to write spec; I started some content)
13:29:39 3) Affinity/anti-affinity via placement aggregates. This is *not* committed for Queens.
13:30:19 Once all of those things are done, Nova will consider starting on whatever framework is necessary for generic device management, with an eye towards getting rid of the existing PCI management code entirely.
13:31:00 But once all of those things are done, we may be able to get a jump start on converting over from...
13:31:02 The Hack
13:31:38 So for the Q cycle, we're going to move forward with https://review.openstack.org/494733 and https://review.openstack.org/496434
13:32:03 did they talk about how to get rid of the existing PCI management code without breaking operators who are still using it? backward compat type stuff
13:32:14 defer, defer, defer
13:32:27 fingers in ears
13:32:34 lalalala, not going to think about it yet.
13:32:38 lol
13:32:47 But it'll be a deprecation cycle.
13:33:24 I anticipate individual hypervisors being able to take the lead for themselves.
13:34:00 to some extent
13:34:03 FYI all, mdrabe is going to be helping efried on this as able, and driving the corresponding work in PowerVC
13:34:41 because the ways of specifying resources (in general, not just devices) are going to have to cut over at some point.
13:36:28 efried do we need a recheck on https://review.openstack.org/#/c/496434/ or are you going to upload a new ps?
13:37:11 Oh, yeah, it needs UT and stuff. It's not close to done. That and the spec are going to be my main gig for the next couple of weeks.
13:37:47 Also need to talk to consumer architects (Joe?) to be sure we understand the requirements.
13:38:09 yeah, I need to set that up
13:38:22 e.g. do we need to do more hackage to overcome the limitation on grouping by vendor/product ID?
13:38:29 #action edmondsw: schedule PCI attach mtg with stakeholders
13:39:46 efried anything else here?
13:39:58 I don't think so.
13:40:07 actually
13:40:15 lemme just drop a couple more links here for reference...
13:40:26 https://etherpad.openstack.org/p/nova-ptg-queens-generic-device-management
13:40:42 https://etherpad.openstack.org/p/nova-multi-alloc-request-syntax-brainstorm <== this is for #2 mentioned above
13:41:36 that is all
13:41:56 #topic PowerVM CI
13:42:23 Slowly knocking out tempest failures and CI todo backlog items
13:43:00 We still need to get the console tests working in CI, which were having problems due to remote
13:43:21 Haven't looked into that for a while because other things have been breaking
13:43:26 did you and thorst ever come up with a plan for that?
13:43:45 We talked about it a little bit a couple times
13:43:49 But not a solid plan
13:44:16 it's not my top priority
13:44:41 I would work on OVS/SEA IT first, adding Depends-On support, obviously any CI breakages...
13:45:01 edmondsw: Yeah it keeps getting bumped by higher priority stuff. IIRC there are only 2 tempest tests for it
13:45:14 but good to not forget it
13:45:54 I'm also still working on that openstack dashboard. I got the web UI up on a test system but it wasn't populating properly
13:46:43 what dashboard?
13:46:49 Yeah, what dashboard?
13:47:17 Openstack health dashboard
13:47:29 It's the tempest failure stats UI that the community uses
13:47:38 http://status.openstack.org/openstack-health/#/
13:48:15 ok, I think we did talk about that
13:49:17 That's all I had for CI
13:49:22 *-powervm are on there
13:49:34 what do we need beyond that?
13:50:26 That's not results for PowerVM CI
13:50:36 That's results on *-powervm for the community CI
13:50:48 okay.
13:51:08 oh, interesting...
13:51:29 so if we get our CI added, how would that look? Munged together with what is already there for *-powervm?
13:51:49 or a separate dashboard that just shows our runs?
13:52:14 It would be a separate dashboard that just shows our runs
13:53:02 That dashboard only shows the tests that you see under the "Jenkins check" header on reviews. It doesn't have any 3rd party results
13:53:14 looks like you can click one of these, e.g. nova-powervm, and get a new page with more details about what jobs ran there
13:53:55 esberglu oh, so you'd be setting up something totally separate?
13:54:02 does any other 3rd party CI do that?
13:54:17 edmondsw: Yeah but that still is only showing the jenkins jobs (py27, py35)
13:54:26 It would be something totally separate
13:54:38 PowerKVM has one, I talked to them a little bit about what they had to do
13:55:00 k
13:55:08 sounds cool, but not critical
13:55:14 agree
13:55:27 edmondsw: efried: The idea of it is that we can get rid of failure emails
13:55:35 And just monitor the dashboard
13:55:55 sure
13:56:09 esberglu want to talk about zuulv3?
13:56:13 We would still get emails for changes we're subscribed to, right?
13:56:28 efried: Yeah that's all external to us
13:56:30 Cause the likelihood of me actively and continuously "monitoring" the dashboard is nil.
13:56:48 But I still need to know if a change I care about fails.
13:57:03 efried: Yeah you would still get that I believe
13:57:18 But I would make sure before getting rid of the failures
13:57:32 edmondsw: Did you hear much about it at PTG?
13:57:41 I didn't see a whole lot on the mailing list
13:57:57 esberglu not much, just that they were trying to switch over, hit an issue early in the week, and were going to come back to it
13:58:08 later in the week
13:58:12 edmondsw: Yeah that was my understanding too
13:58:21 I should probably say "keep working on it" rather than "come back to it"
13:58:34 We aren't necessarily in a hurry to do that
13:58:39 It would just be nice
13:58:53 and my understanding is that their switch doesn't really affect us, just would be nice if/when we can follow suit
13:58:58 let them work out the kinks first, though
13:59:19 edmondsw: Yeah I'll just try to stay in the loop on it
13:59:23 ++
13:59:32 the other thing I was going to ask about is neutron/ceilometer ci
13:59:44 edmondsw: You saw my latest email on that?
14:00:17 yeah, your email said that we caught that ceilometer issue in tox rather than in CI... which surprised me
14:00:29 We would need to talk to both neutron and ceilometer separately if we want to publish
14:00:54 esberglu yeah, and I think that would probably be a good discussion to have... let's add that to the TODO list
14:01:03 non-voting, of course
14:01:11 Yep
14:01:27 edmondsw: I also will add a todo to look into what we are testing for ceilometer
14:01:34 tx
14:01:47 done with CI?
14:01:51 Yep
14:02:05 #topic Driver Testing
14:02:09 I just have one quick thing here
14:02:31 basically, we're just going to have UT and CI at this point
14:03:01 I thought we might be able to get Ravi to do some tempest work, but we lost him as well as Jay
14:03:23 so I will remove this from the agenda going forward
14:03:24 :(
14:03:39 of course the OOT driver is still used/tested by PowerVC
14:04:09 and I'll be asking for more resources to help here in the next cycle
14:04:40 okay
14:04:50 but that will probably be dev resources, not test... so get ready to be asked to do more in tempest dev and usage
14:04:59 that is all
14:05:07 #topic Open Discussion
14:05:23 edmondsw Following up from PTG, can you please review this series: https://review.openstack.org/#/q/topic:bp/nova-validate-certificates+status:open
14:05:34 Any last thoughts? Nothing else from me
14:05:36 efried yep, on my list
14:05:39 Thanks.
14:06:44 #endmeeting