16:00:01 <smcginnis> #startmeeting Cinder
16:00:02 <openstack> Meeting started Wed Sep 28 16:00:01 2016 UTC and is due to finish in 60 minutes. The chair is smcginnis. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:05 <openstack> The meeting name has been set to 'cinder'
16:00:06 <smcginnis> dulek duncant eharney geguileo winston-d e0ne jungleboyj jgriffith thingee smcginnis hemna xyang1 tbarron scottda erlon rhedlind jbernard _alastor_ bluex patrickeast dongwenjuan JaniceLee cFouts Thelo vivekd adrianofr mtanino yuriy_n17 karlamrhein diablo_rojo jay.xu jgregor baumann rajinir wilson-l reduxio wanghao thrawn01 chris_morrell stevemar watanabe.isao,tommylike.hu mdovgal
16:00:06 <flip214> hi
16:00:09 <Swanson> hello
16:00:09 <_alastor_> o/
16:00:12 <e0ne> hi
16:00:13 <rajinir> o/
16:00:14 <hemna> yough
16:00:16 <DuncanT> hi
16:00:16 <erlon> hey!
16:00:16 <dulek> o/
16:00:17 <eharney> hi
16:00:18 <smcginnis> Hey everyone
16:00:18 <xyang1> hi
16:00:20 <jgregor> Hello!
16:00:20 <geguileo> Hi!
16:00:21 <scottda> hola
16:00:25 <smcginnis> Agenda: https://wiki.openstack.org/wiki/CinderMeetings#Next_Cinder_Team_meeting
16:00:25 <jgriffith> goobily hoobily
16:00:30 <baumann> Hi there
16:00:34 <chris_morrell> \o/
16:00:42 <jseiler> hi
16:00:49 <rhe00> hi
16:00:53 <bswartz> .o/
16:01:19 <smcginnis> #topic Announcements
16:01:47 <adrianofr_> hi
16:01:47 <smcginnis> There was a question earlier in channel, so I'll state it here to make it official - I'd like Ocata to be mostly bugfix and stabilization.
16:02:00 <hemna> smcginnis, +1
16:02:05 <smcginnis> There are some features in flight that I consider part of that "stabilization".
16:02:06 <erlon> smcginnis: +1
16:02:14 <_alastor_> smcginnis: +!
16:02:22 <_alastor_> smcginnis: +1
16:02:23 <erlon> smcginnis: WILL IT BE OPEN TO NEW DRIVERS?
16:02:27 <erlon> opss
16:02:28 <smcginnis> I'd love to see some of that get wrapped up and not be "in-progress" for another release.
16:02:30 <dulek> :D
16:02:32 <smcginnis> erlon: YES
16:02:33 <smcginnis> :)
16:02:39 <e0ne> smcginnis: +1
16:02:40 <Swanson> erlon, inside voice.
16:02:42 <DuncanT> can we explicitly add "testing" to that list, please?
16:02:43 * erlon screaams!
16:02:44 <smcginnis> hah!
16:03:12 <smcginnis> DuncanT: Yeah, I think testing will be a very important part of both stabilization and bugfixing.
16:03:19 <DuncanT> (I'm very slow typing at the moment due to injury, sorry)
16:03:31 <smcginnis> But I'll try to remember to call that out explicitly from now on.
16:03:35 <smcginnis> DuncanT: How's the hand?
16:03:57 <DuncanT> Not fallen off yet. Not actually working very well though
16:04:02 <smcginnis> :/
16:04:10 <smcginnis> Hopefully it improves.
16:04:13 <jungleboyj> o/
16:04:18 <smcginnis> #link https://etherpad.openstack.org/p/ocata-cinder-designsummit-planning Summit planning
16:04:28 <diablo_rojo_phon> Hello
16:04:36 <smcginnis> With those goals in mind - we need to plan our summit sessions.
16:04:54 <smcginnis> Testing will be part of it for sure.
16:05:00 <smcginnis> Please add ideas to the etherpad.
16:05:17 <smcginnis> Once we get a little closer we can prioritize and get sessions assigned.
16:05:28 <dulek> smcginnis: I guess we need a session on all the things we're considering "in-progress"?
16:05:48 <smcginnis> dulek: Sure, we could.
16:06:07 <smcginnis> dulek: It needs to be discussed somewhere at least. :)
16:06:39 <smcginnis> One more announcement before moving on with the agenda - I need to cut RC2.
16:06:40 <dulek> And what are we actually considering in-progress besides A/A?
16:07:03 <smcginnis> dulek: Maybe I'll start an etherpad.
16:07:06 <bswartz> dulek: rolling upgrades
16:07:07 <xyang2> smcginnis: you are talking about core features, right? new drivers and driver features are still ok?
16:07:46 <smcginnis> xyang2: Yes, contained to drivers it should be OK. Just no changes to Cinder core that could destabilize things or slow down completion of things like HA A/A.
16:07:51 <scottda> Cinder-Nova API changes and multi-attach are in-progress
16:07:54 <dulek> bswartz: I would love to continue the work on stabilizing that in Ocata, but I don't think it should make a priority per-se.
16:07:56 <DuncanT> Nova-cinder api
16:08:04 <dulek> DuncanT: +1000 :)
16:08:21 <hemna> scottda, I think multi-attach is probably out for O
16:08:24 <smcginnis> DuncanT: That will be a big one. I think we're close there now.
16:08:27 <hemna> scottda, we need to stabilize the new api first
16:08:37 <hemna> my $0.02
16:08:38 <dulek> Okay, these 3 things probably will drain enough time to fill-up Ocata.
16:08:48 <smcginnis> hemna: Yeah, if we can at least get the new APIs in palce, then hopefully we're in a good spot for Pike.
16:08:48 <DuncanT> dulek: getting better validation and testing of rolling upgrade would be great to see
16:08:59 <dulek> DuncanT: Working on it! :)
16:09:16 <smcginnis> So on the subject of me needing to cut the RC2...
16:09:17 <jungleboyj> hemna: Nooooo!!!!
16:09:18 <dulek> hemna, smcginnis: There will be riots, you remember? ;)
16:09:18 <jgriffith> smcginnis: hemna I do plan to have multi-attach shortly after the new api calls
16:09:21 <smcginnis> #topic FFE RBD Replication
16:09:33 <smcginnis> geguileo: You have the floor.
16:09:38 <smcginnis> jgriffith: +1
16:09:44 <geguileo> smcginnis: thanks
16:10:01 <geguileo> Ok, so RBD replication was accepted as a FFE
16:10:20 <geguileo> And all dependent patches have merged in master
16:10:32 <geguileo> Just the RBD driver patch remains
16:10:39 <smcginnis> geguileo: So you're probably not going to like this, but...
16:10:40 <geguileo> And it needs some love
16:10:55 <smcginnis> geguileo: This has gotten late and that's a lot of changes to push in right before cutting a release candidate.
16:11:08 <smcginnis> geguileo: I'd actually feel better leaving that for O at this point.
16:11:11 <geguileo> :''-(
16:11:13 <e0ne> geguileo: I did some testing today, it works for me
16:11:22 <e0ne> I'll review the code tonight
16:11:33 <geguileo> smcginnis: In my defense all those other patches are bugs we had in Cinder
16:11:49 <geguileo> smcginnis: So a backport and merge at this point is no longer an option?
16:11:56 <smcginnis> geguileo: If folks are comfortable with it and feel the risk is low, I can probably be convinced otherwise.
16:12:10 <geguileo> smcginnis: Should we vote or something?
16:12:12 <smcginnis> geguileo: But I'd want to see those go through today if we're going to do it.
16:12:17 <bswartz> I can't believe it's not merged already! I thought that FFE was granted more than a week ago
16:12:26 <smcginnis> Who has tested the patches in a real deployment?
16:12:38 <geguileo> smcginnis: I have ;-)
16:12:50 <smcginnis> geguileo: Well that's good! :)
16:12:51 <geguileo> smcginnis: And apparently e0ne has as wel
16:12:59 <geguileo> s/wel/well
16:13:25 <geguileo> Anybody can easily test it, I provided a script to deploy everything
16:13:32 <geguileo> 2 Ceph clusters properly configured
16:13:36 <e0ne> geguileo, smcginnis: I tested it with multinode devstack with 2 separate ceph clusters
16:13:39 <geguileo> Devstack with the patches, etc
16:13:49 <smcginnis> e0ne: OK, great. That does help.
16:14:02 <bswartz> IMO it's way too late to be merging features into Newton -- I would feel nervous about even merging a bugfix this late
16:14:21 <hemna> bswartz, +1
16:14:30 <smcginnis> bswartz: That's my dilemma.
16:14:35 <smcginnis> It's hopefully isolated to the driver.
16:14:48 <smcginnis> But changes should be very minimal by this point.
16:15:04 <dulek> But it's contained to a single driver and enabled by config only, right?
16:15:14 <bswartz> when the FFE request game through, I thought it was debatable, and I assumed it would merge immedately after it was granted
16:15:26 <geguileo> dulek: The remaining patch to master is
16:15:27 <jgriffith> bswartz: +1
16:15:39 <hemna> I think it's too late now.
16:15:42 <geguileo> dulek: But for the backport it requires the other patches as well
16:16:03 <eharney> how likely are we to end up backporting the bugfix patches regardless of the rbd replication code?
16:16:06 <dulek> geguileo: Oh, sure, I don't find them itrusive.
16:16:36 <e0ne> geguileo: so, we should allow FFE not only for RBD replication patch:(
16:16:37 <geguileo> dulek: If those are not intrusive the RBD part is disabled by default
16:16:56 <smcginnis> String changes as well...
16:17:19 <geguileo> e0ne: Yep
16:17:41 <geguileo> e0ne: At least 1 of the patches already has 2 +2 and +A
16:17:42 <dulek> e0ne: Wait, bugfixes aren't features.
16:17:45 <geguileo> e0ne: And another one 1 +2
16:18:16 <patrickeast> So, my 2c, I think for pretty much every driver that implemented replication there were additional fixes needed after the fact as people started using and testing it in different scenarios
16:18:17 <e0ne> dulek: not critical bugfixes are not allowed for Newton at the moment
16:18:29 <smcginnis> geguileo: Sorry, I think we're going to have to push to O at this point.
16:18:39 <dulek> e0ne: That's a fair point.
16:18:40 <geguileo> smcginnis: OK
16:18:41 <patrickeast> So imo the odds of these not breaking *something* is pretty low
16:19:13 <DuncanT> smcginnis: I'd like to withdraw my agenda item, on consideration. It's way too late.
16:19:14 <geguileo> patrickeast: By that you mean breaking something out of the replication stuff or the replication having bugs?
16:19:23 <jungleboyj> patrickeast: ++
16:19:23 <smcginnis> DuncanT: Was wondering about that. ;)
16:19:37 <patrickeast> geguileo: both I suppose
16:19:41 <geguileo> XD
16:19:45 <DuncanT> smcginnis: I've not been paying enough attention to where we were in the cycle
16:20:30 <smcginnis> geguileo, DuncanT: Both of these I think we can get going in master. But just too late with too much risk with an RC2 imminent.
16:21:04 <geguileo> noted
16:21:30 <smcginnis> geguileo: I'll buy you a (free) beer in Barcelona. :)
16:21:39 <geguileo> XD XD XD
16:21:42 <geguileo> Thanks!
16:21:54 <smcginnis> Moving on then...
16:21:58 <smcginnis> #topic Getting ActiveActive/HA in the O release
16:22:04 <scottda> So, here's a chance to help geguileo mop up some of the tears shed for not getting RBD replication into Newton...
16:22:06 <smcginnis> scottda, geguileo:
16:22:08 <dulek> Buying a "free" beer is a paradox. :P
16:22:12 <smcginnis> :)
16:22:16 <smcginnis> dulek: ;)
16:22:29 <smcginnis> dulek: I'll even buy two.
16:22:31 <scottda> We're into the 3rd? release for AA/Ha
16:22:44 <geguileo> scottda: Yup
16:22:45 <smcginnis> scottda: Yep
16:22:48 <scottda> I think we've all seen the architecture, and merged a bunch of patches in N.
16:23:03 <scottda> IF we're committed to getting this in O, let's get it in soon.
16:23:09 <smcginnis> scottda: +1
16:23:14 <e0ne> +1
16:23:16 <scottda> This will allow focus on testing and finding bugs.
16:23:20 <erlon> scottda: +1
16:23:31 <geguileo> scottda: +1
16:23:43 <e0ne> do we want to move feature freeze to O-1 milestone?
16:23:44 <scottda> I semi-jokingly proposed a review day and Merge Fest. But maybe that's a good idea?
16:24:15 <bswartz> When I look at the Ocata schedule I see 8 weeks that are not holidays between Design summit and Feature Freeze
16:24:17 <Swanson> So with no features going in can we expect every company to pull resources?
16:24:18 <dulek> e0ne: 0-1 is just 2 weeks after the summit. This is a bad idea IMO.
16:24:25 <scottda> WE merged about 12 patches on the last day of the mid-cycle, and really moved this along. Should we try something like that soon? Before the summit?
16:24:39 <smcginnis> e0ne: We'll just be restrictive about what we do allow in.
16:24:46 <e0ne> dulek: I had to agree with you
16:24:55 <smcginnis> bswartz: Yeah, really short span on this one.
16:24:58 <geguileo> scottda: +1
16:25:08 * scottda waits patiently for the conversation to settle down
16:25:30 <smcginnis> scottda: Sounds like a plan.
16:25:50 <jungleboyj> scottda: Good idea.
16:26:40 <scottda> geguileo: Can you go over the patches in the BP and make sure they are ready to go, and status updated as to which are ready for review?
16:26:40 <smcginnis> Schedule reminder for folks: https://releases.openstack.org/ocata/schedule.html
16:26:52 <scottda> #link https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support
16:26:53 <geguileo> scottda: OK
16:26:54 <eharney> is there a clear mark at this point of when we will consider HA "complete"? (i.e. after patchsets X, Y, and Z land?)
16:27:29 <smcginnis> eharney: Good question
16:27:37 <geguileo> scottda: I'll work on the replication to A/A patch and create unit tests for all WIP patches by next week
16:27:42 <geguileo> scottda: And then update the BP
16:27:59 <geguileo> scottda: So on next meeting we should be able to set a day for the merging
16:28:06 <scottda> Maybe geguileo can also indicate the "complete" point, as opposed to any "optional" stuff?
16:28:10 <bswartz> eharney: how about when it's deployed in production?
16:28:10 <smcginnis> geguileo: +1
16:28:41 <eharney> bswartz: well i assume we will be shaking out bugs for a bit, just wondering as far as what it means to "get it in O"
16:28:46 <smcginnis> geguileo: We have other patches we can merge in the meantime, right?
16:28:59 <smcginnis> geguileo: I mean while you're working on the replication piece.
16:29:03 <scottda> #action geguileo Will update AA/HA BP and prepare for code reviews and merging
16:29:20 <smcginnis> The idea at the midcycle was to get things merged so they could get more runtime and shake out bugs.
16:29:20 <geguileo> smcginnis: Yeah, I'll update the BP and remove the -2 I have on the first patch
16:29:31 <smcginnis> I think now's a good time for doing that again for O.
16:29:36 <smcginnis> geguileo: Cool, thanks!
16:29:46 <geguileo> smcginnis: In the BP I'll reflect which patches are ready for merging
16:29:50 <scottda> smcginnis: geguileo I updated that BP a couple weeks ago, just to keep Merged vs. WIp stuff up to date
16:29:54 <xyang2> geguileo: what your blog link on how to test AA again? Is it in one of the etherpad?
16:30:03 <bswartz> eharney: that should be determined by the stakeholders who want this feature so badly
16:30:17 <scottda> #link http://gorka.eguileor.com/manual-validation-of-cinder-aa-patches/
16:30:20 <scottda> xyang2: ^^
16:30:29 <xyang2> scottda: thanks
16:30:37 <geguileo> xyang2: I have to update the post, but it's this one: http://gorka.eguileor.com/manual-validation-of-cinder-aa-patches/
16:30:38 <smcginnis> And automated tests are also being worked on?
16:30:48 <geguileo> scottda: You were faster than me :-)
16:30:51 <xyang2> geguileo: thanks
16:31:35 <hemna> I don't think there is enough detail on that blog page.....
16:31:48 <scottda> smcginnis: We were just discussing last hour in cinder_testing how to do automated tests...
16:31:56 <geguileo> hemna: I'll update it and add a little bit more including the API cleanup stuff ;-)
16:32:00 <hemna> :P
16:32:00 <scottda> smcginnis: Some of it will prove tricky.
16:32:29 <geguileo> Yeah, it's tricky because you can't tell if they are getting evenly distributed and if the DB is being properly cleaned up
16:32:36 <scottda> But once code is in, we get existing Tempest testing for free. So there'll be some indication that things still work and don't get broken.
16:32:44 <geguileo> By evenly I mean round robin
16:33:51 <winston-d_> Is there a way to trigger some issues without having A/A deployment?
16:34:33 <scottda> winston-d_: We had discussed how some type of error-injection would be good. But there's no infra for that ATM
16:34:34 <geguileo> winston-d_: What do you mean? r:-??
16:35:38 <smcginnis> Killing nodes?
16:35:53 <winston-d_> I mean to prove Cinder A/A works, job distribution is one thing, the other is things like DLM is working properly.
16:36:14 <geguileo> winston-d_: Oh, yeah, that's the next step of what I want to manually test
16:36:21 <hemna> probably need some rally runs for that
16:36:49 <geguileo> winston-d_: But I want to get all the cinder stuff in before focusing on the DLM part
16:36:57 <scottda> Yeah, and some manual testing with sleeps would be good for checking the races. But hard to do in an automated infra-approved way.
16:37:31 <geguileo> scottda: I have some ideas on how to do that, I just have to find the time to do a PoC
16:38:50 <scottda> OK, well it looks like some of the Cleanup patches could be reviewed and merged anytime.
16:39:27 <winston-d_> I really want to see how Cinder with A/A is running differently from what it is now, e.g. using the hostname hack and rabbit to do round-robin 'A/A'.
16:39:27 <scottda> and geguileo is going to look at some of the stuff marked WIP at the moment, and clean up the BP before next week.
16:39:45 <scottda> winston-d_: Look at geguileo 's blog for manual testing.
16:40:03 <scottda> winston-d_: It show really good details.
16:40:11 <winston-d_> reading
16:40:42 <smcginnis> Anything else on this we should cover in the meeting? Or just follow up in channel as we go?
16:40:54 <scottda> Nothing more from me.
16:41:03 <geguileo> I'm good
16:41:07 <smcginnis> Thanks guys.
16:41:23 <smcginnis> DuncanT's topic is deferred. Any other topics?
16:41:56 <smcginnis> Going once...
16:42:06 <smcginnis> Going twice...
16:42:14 <smcginnis> OK, thanks everyone!
16:42:18 <geguileo> Thanks!
16:42:21 <winston-d_> thx
16:42:23 <e0ne> see you next week
16:42:28 <smcginnis> #endmeeting