16:00:40 <jgriffith> #startmeeting cinder
16:00:40 <openstack> Meeting started Wed May 27 16:00:40 2015 UTC and is due to finish in 60 minutes. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:43 <DuncanT> hi
16:00:44 <openstack> The meeting name has been set to 'cinder'
16:00:46 <opencompute> hi
16:00:49 <tbarron> hi
16:00:50 <kmartin> o/
16:00:54 <Yogi2> hi
16:00:56 <jgriffith> We've got a pretty full agenda so let's get on with it
16:01:00 <dannywilson> hi
16:01:05 <e0ne> #link https://wiki.openstack.org/wiki/CinderMeetings#Next_meeting
16:01:06 <jgriffith> #link https://goo.gl/XG062E
16:01:19 <jgriffith> e0ne: :) thanks
16:01:36 <jgriffith> reminder: please put your name next to your proposed topic
16:01:56 <jgriffith> #topic Live-Migration changes
16:02:04 <jgriffith> hemna: I'm assumign this is you?
16:02:16 <jgriffith> assuming
16:02:24 <jgriffith> hemna: ?
16:02:37 <jgriffith> giving hemna 30 seconds, then moving on and we'll come back
16:03:00 <jgriffith> 15 more seconds...
16:03:10 <geguileor> Hi
16:03:12 <dulek_home> Can we get back to it at the end? Guys from my team are in a traffic jam and will be able to discuss helping with Nova stuff in 30 minutes.
16:03:18 <jgriffith> Ok, we'll come back *if* we have time
16:03:24 <dulek_home> :)
16:03:25 <kmartin> hemna is presenting to SNIA right now
16:03:26 <jgriffith> #topic Cinder internal tenant
16:03:31 <jgriffith> patrickeast: you're up
16:03:31 <patrickeast> hi
16:03:40 <patrickeast> so this got brought up a bit at the summit
16:03:49 <patrickeast> for fixing the hidden volumes problem
16:04:01 <patrickeast> i have some mention of it in this spec https://review.openstack.org/#/c/182520/
16:04:03 <mtanino> hi
16:04:03 <patrickeast> for the image cache
16:04:29 <patrickeast> after talking with some folks over the last couple days i wanted to bring it up at a meeting and make sure there wasn't any strong resistance
16:04:43 <DuncanT> Seems like a good idea to me
16:04:56 <patrickeast> and it looks like there is a review up proposing a hidden flag on volumes, so i think we need to maybe make a decision on which direction we go
16:05:05 <patrickeast> and unify all the various efforts on that approach
16:05:15 <patrickeast> questions? comments? concerns?
16:05:42 * DuncanT prefers a special tenant to the hidden flag
16:05:51 <jgriffith> patrickeast: so I need to read your spec more carefully, but i mentioned the other day another spin
16:05:51 <jungleboyj> o/ Here now. Sorry.
16:06:01 <tbarron> internal tenant seems potentially useful for lots of stuff
16:06:06 <jgriffith> patrickeast: "special tenant" and public snapshots
16:06:18 <jgriffith> patrickeast: but I know the public snapshots idea is contentious
16:06:34 <e0ne> DuncanT: +1 for special tenant
16:06:36 <jgriffith> partially because of my own statements :)
16:06:46 <patrickeast> jgriffith: hehe yea id rather not tie special tenants to doing public snapshots, more like if we go down that road we could use the special tenant
16:06:50 <jgriffith> Anybody object to internal tenant?
16:06:58 <tbarron> patrickeast: +1
16:07:22 <jgriffith> patrickeast: well... the problem IMO is your spec is actually to "solve" the image-caching issue
16:07:34 <jgriffith> patrickeast: not "should we do special tenant"
16:07:48 <geguileor> +1 to special tenant
16:07:51 <patrickeast> jgriffith: yea i was wondering about that... maybe we should split it out?
16:08:05 <jgriffith> I think we're all agreed on the tenant idea so that's great
16:08:06 <xyang2> special tenant can be used for temporary volume, temp snapshot, and image cache. fine with me
16:08:15 <jgriffith> patrickeast: might be good to split it out
16:08:27 <jgriffith> patrickeast: for the sake of clarity and "argument" :)
16:08:35 <patrickeast> sounds good
16:08:38 <jgriffith> xyang2: +1
16:08:43 <jungleboyj> Sounds like something that can be used by multiple people.
16:08:48 <jungleboyj> xyang2: ++
16:08:49 <e0ne> xyang2: +1
16:08:54 <patrickeast> i'll make a new bp and spec that we can use as a dependency for the other ones that would need it
16:08:56 <jgriffith> We need to be careful though
16:09:01 <rajinir> Like the idea of a special internal tenant. Seems like it can be of multiple uses
16:09:02 <jgriffith> and VERY specific on its usage
16:09:14 <jgriffith> It's easy for something like this to become HORRIBLY abused!!!!
16:09:17 <patrickeast> jgriffith: +1
16:09:25 <cebruns> jgriffith: +1
16:09:27 <jgriffith> including circumventing things like quotas in public clouds etc
16:09:29 <kmartin> agree, the special tenant would be useful
16:09:30 <patrickeast> we are shooting for just the right amount of abuse
16:09:34 <jgriffith> or private for that matter
16:09:42 <jgriffith> so let's clarify...
16:09:58 <jgriffith> I propose we're very specific and it's NOT just a "special" tenant
16:10:02 <ameade> o/
16:10:07 <jgriffith> which can be anything anybody wants it to be
16:10:08 <scottda> swapsudoexit
16:10:23 <jgriffith> in this case I suggest it's something like a "cinder-image" tenant
16:10:43 <DuncanT> +1
16:10:45 <jgriffith> and it's specifically for image caching and management, nothing more
16:11:02 <patrickeast> oh that brings up another thing i wanted some feedback on, should we have multiple of these? one for image caching, one for migration helping, etc?
16:11:06 <e0ne> agree about caching
16:11:06 <jgriffith> if other valid use cases come up we can adjust and deal with them
16:11:17 <e0ne> jgriffith: what type of management do you mean?
16:11:18 <xyang2> jgriffith: I also need it for non-disruptive backup
16:11:23 <jgriffith> patrickeast: so that's the rat-hole I'm hoping to avoid here
16:11:28 <xyang2> jgriffith: vincent needs it for migration
16:11:34 <jgriffith> e0ne: so my use-case is something like this;
16:11:42 <patrickeast> jgriffith: yea, but we already have 3(?) use cases wanting it
16:11:45 <geguileor> If we create specific users we'll end up with a bunch of them
16:11:58 <jgriffith> image-tenant creates bootable volumes from glance images on a periodic basis of some sort
16:12:03 <e0ne> we need to specify all use-cases in the spec
16:12:13 <jgriffith> provides public snapshot or volume to "other" tenants
16:12:16 <DuncanT> Specific users make figuring out WTF is going on easier
16:12:26 <jgriffith> DuncanT: +1
16:12:44 <jgriffith> DuncanT: so this could turn into a SERIOUS mess if we're not very careful and explicit
16:12:59 <jgriffith> start throwing around migration blah blah blah and we're pretty well sunk IMO
16:13:12 <DuncanT> Indeed, it is already starting to feel like a new nail...
16:13:28 <jgriffith> IMO the image-tenant is just a sort of "glance proxy" to start with
16:13:30 <jgriffith> that's all
16:13:42 <cFouts> o/
16:13:54 <patrickeast> i agree 100% that we don't want to mis-use this, but if we don't then we end up with a hidden flag on the volume table *and* special tenants
16:14:06 <jgriffith> patrickeast: ?
16:14:17 <patrickeast> we can't just exclude migrations or whatever else
16:14:26 <jgriffith> patrickeast: sure we can
16:14:39 <jgriffith> patrickeast: I don't see what migrations have to do with the topic?
16:14:39 <jungleboyj> jgriffith: To be clear, you don't want to have a general user, you want a specific user for a specific purpose.
16:14:41 <patrickeast> right, but then we get https://review.openstack.org/#/c/185857/
16:14:52 <jungleboyj> We can start with image-tenant and then expand.
16:14:58 <patrickeast> the whole point of the special tenant is to avoid a hidden flag on the volume table
16:15:02 * flip214 was reminded to act as timekeeper again.... so, ¼ of the time is gone.
16:15:04 <patrickeast> that's why it came up in the first place
16:15:15 <patrickeast> maybe we can discuss more after the meeting
16:15:22 <patrickeast> i don't want to hog all of the time
16:15:22 <jgriffith> patrickeast: there... that's fixed
16:15:36 <e0ne> jungleboyj: looks like we need to start using trusts from the keystone api v3
16:15:53 <xyang2> jgriffith, DuncanT: I do need to create a temp volume and snapshot for the non-disruptive backup case, so either I need a hidden flag or a cinder tenant for it. I thought at the summit, cinder tenant was the preferred approach
16:15:55 <e0ne> to make user management more fine-grained
16:16:32 <jungleboyj> xyang2: ++
16:16:44 <geguileor> Yes cinder tenant was preferred because it didn't mean changes to quota
16:16:48 <geguileor> among other things
16:16:50 <jgriffith> xyang2: So that may in fact be something that this idea expands to, although I don't understand why it has to be "hidden"
16:17:13 <xyang2> jgriffith: it is a temp snapshot, we don't want users to do operations on it
16:17:16 <jgriffith> xyang2: so online backup requires creation of a snapshot... no problem IMO
16:17:31 <jgriffith> xyang2: well, keep in mind users can't really *do* anything with snapshots anyway :)
16:17:38 <DuncanT> jgriffith: It's the vol from snap for legacy drivers that is the issue
16:17:42 <jgriffith> xyang2: and frankly what would they *do*
16:17:45 <DuncanT> jgriffith: The snap is fine
16:17:58 <DuncanT> jgriffith: But we need to create a hidden volume
16:18:04 <jgriffith> DuncanT: so mark it as R/O
16:18:04 <xyang2> jgriffith, DuncanT: they can still list it. is that okay?
16:18:18 <jgriffith> frankly I don't care if they can list it
16:18:20 <xyang2> DuncanT: ya, hidden volume is another issue
16:18:36 <DuncanT> jgriffith: quota? Them deleting it in the middle of the operation?
16:18:45 <jgriffith> DuncanT: ummm
16:18:54 <jgriffith> DuncanT: quota = deal with it
16:19:10 <jgriffith> DuncanT: deleting in the middle of an operation: we check for that sort of thing in the API all the time
16:19:15 <jungleboyj> jgriffith: I think you do care. People get confused if they see volumes show up that they didn't create.
16:19:18 <jgriffith> DuncanT: if !available raise
16:19:20 <DuncanT> jgriffith: Volumes coming out of nowhere was shown to be a very confusing UI for migrations, I don't think it is going to get any less confusing
16:19:26 <jgriffith> Ok
16:19:32 <xyang2> DuncanT, jgriffith: right, quota is the other issue, that's why we thought the cinder tenant was preferred
16:19:36 <jungleboyj> Maybe it was DuncanT that really cared, but I think we should care.
16:19:46 <tbarron> jungleboyj: +1
16:19:49 <jgriffith> I'll leave it to you all to sort out then, but I suspect you're not going to like the end result :(
16:20:02 <e0ne> jungleboyj: good point
16:20:06 <jgriffith> jungleboyj: and no, I don't care on that particular item
16:20:34 <jungleboyj> jgriffith: Ok, must have been DuncanT
16:20:40 <jgriffith> jungleboyj: people get "more" confused when there's invisible things happening and stuff fails and they have zero idea why
16:20:59 <jgriffith> Ok...
16:21:04 <DuncanT> So we'll see what it looks like in code having a backup tenant too... it won't be used for any driver that is updated to be able to backup snaps directly, so it is hopefully temporaty
16:21:08 <DuncanT> temporary
16:21:16 <jgriffith> so it sounds like everybody is on board with "special" tenants
16:21:17 <xyang2> jgriffith: what if it is visible to admin only, just not to regular tenants?
16:21:29 <jgriffith> I'll let everybody else argue about where/how they can be used
16:21:35 <patrickeast> ok so... here's my proposal, i'll write up a spec for the special tenant and put as many of these use cases on there as i can
16:21:43 <jungleboyj> xyang2: ++ We need the tenants for that though, right?
16:21:46 <patrickeast> we can hash out which of them are 'valid' or not in the spec review
16:21:54 <xyang2> jungleboyj: yes
16:21:54 <jungleboyj> patrickeast: __
16:21:55 <geguileor> patrickeast: +1
16:21:58 <jungleboyj> patrickeast: ++
16:22:07 <jgriffith> xyang2: sure, maybe that works
16:22:10 <jgriffith> ok
16:22:23 <jgriffith> so patrickeast did we at least cover the main points for you to move forward?
16:22:36 <patrickeast> jgriffith: haha yea, i think everyone seems to be on board
16:22:43 <jgriffith> cool
16:22:45 <patrickeast> just a matter of figuring out exactly how we use them
16:22:51 <jgriffith> :)
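
A minimal sketch of the internal-tenant idea settled on above. Every name here (the project/user IDs, the helper) is an illustrative assumption for the reader, not the design agreed in the meeting; the point is only that Cinder-owned resources (image-cache volumes, temp backup snapshots) would be created under a dedicated project so they never show up in a real tenant's listings or quota.

    # Sketch only; RequestContext stands in for cinder.context.RequestContext.
    from dataclasses import dataclass

    @dataclass
    class RequestContext:
        user_id: str
        project_id: str
        is_admin: bool = False

    # Hypothetical operator-configured IDs for the internal tenant:
    INTERNAL_PROJECT_ID = 'cinder-internal'  # dedicated Keystone project
    INTERNAL_USER_ID = 'cinder'

    def get_internal_tenant_context():
        """Return a context that owns cache/temp resources instead of
        the calling user, keeping them out of user listings and quota."""
        return RequestContext(user_id=INTERNAL_USER_ID,
                              project_id=INTERNAL_PROJECT_ID,
                              is_admin=True)

    # Usage sketch: create an image-cache volume under this context.
    ctx = get_internal_tenant_context()
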
16:22:55 <jgriffith> #topic hin provisioning volume capacity consideration in filter scheduler
16:23:03 <jgriffith> that's "thin" by the way :)
16:23:05 <jgriffith> not hin
16:23:11 <jgriffith> xyang2: you're up
16:23:14 <xyang2> is winston here?
16:23:15 <xyang2> ok
16:23:27 <xyang2> this was brought up by patrickeast
16:23:45 <xyang2> so currently we deduct the size of the new volume for thin provisioning from free capacity
16:23:50 <tbarron> xyang2: winston said he had a conflict today
16:24:02 <xyang2> this is the conservative approach we started with in the design
16:24:23 <xyang2> the concern is that a thin volume is not consumed yet when it is first provisioned
16:24:47 <xyang2> the proposed patch allows two ways of handling this
16:25:09 <xyang2> a flag was added: if it is true, we deduct the size of the new volume
16:25:19 <xyang2> if it is false, we don't deduct it
16:25:28 <xyang2> does anyone have an opinion on this
16:25:37 <patrickeast> xyang2: so i think there are actually two things here, my bug originally wasn't even for that issue (and now our conversation yesterday makes more sense)
16:25:37 <xyang2> ?
16:25:39 <jgriffith> xyang2: well I do of cource :)
16:25:49 <jgriffith> course
16:25:54 <xyang2> jgriffith: go ahead
16:26:02 <patrickeast> the bug https://bugs.launchpad.net/cinder/+bug/1458976 is that you can create 100 2TB devices on a 100TB backend but not 1 200TB volume
16:26:02 <openstack> Launchpad bug 1458976 in Cinder "cannot create thin provisioned volume larger than free space" [Undecided,In progress] - Assigned to Xing Yang (xing-yang)
16:26:14 <patrickeast> then there is the issue of as you create thin volumes
16:26:18 <patrickeast> it eats up 'free' space
16:26:22 <patrickeast> until the next stats update
16:26:40 <jgriffith> xyang2: well, I've always been of the opinion that we need to quit pontificating and screwing around and just report capacities
16:27:15 <jgriffith> xyang2: that means "available", which uses the over-prov ratio you implemented
16:27:21 <jgriffith> in the case of thin
16:27:34 <jgriffith> and distinguish between allocated and provisioned
16:27:48 <jgriffith> I've proposed this no less than half a dozen times in the last 18 months
16:27:54 <guitarzan> that's a crazy bug report :)
16:28:34 <guitarzan> thin provisioning is a crazy pit I'm glad I'm not jumping into :)
16:28:37 <xyang2> jgriffith: sorry, I don't think I completely follow you :( what is your suggestion? by the way, our definitions may be a little different
16:29:01 <jgriffith> xyang2: I'm sure our definitions are different, which has always been the problem
16:29:14 <jgriffith> xyang2: everybody wants "their" definition and won't compromise on anything
16:29:33 <jgriffith> xyang2: so my proposal is and has been: just report the same way the reference LVM thin driver does
16:29:42 <jgriffith> xyang2: allocated vs actual
16:29:58 <jgriffith> xyang2: and calculate deltas to report and schedule placement
16:30:15 <jgriffith> xyang2: so if you have thin provisioning and a backend with 100G of free space
16:30:37 <jgriffith> it reports free-space + (free-space * over-prov-ratio)
16:31:04 <jgriffith> and free-space = physical - allocated
16:31:14 <jgriffith> allocated is "actual" blocks used
16:31:30 <jgriffith> xyang2: make sense?
16:32:19 <bswartz> jgriffith: the term "allocated" is problematic because it doesn't match the definition of allocated_space in cinder -- I understand what you mean though
16:32:21 <xyang2> jgriffith: so "allocated" means actually used capacity. I think that is how free is calculated currently, just the term is not the same
16:32:53 <patrickeast> so to make sure i understand: if you had 100G of free space, and a 2.0 ratio, you could place a 200G thin volume, right?
16:32:59 <patrickeast> with what you described
16:33:10 <jgriffith> patrickeast: correct
16:33:14 <patrickeast> perfect
16:33:24 <patrickeast> that's what my bug report is for... we can't do that today
16:33:34 <jgriffith> bswartz: yeah, so the other suggestion for the "name" conflict was apparent or effective
16:33:45 <jgriffith> bswartz: specifically for the scheduler
16:34:20 * flip214 mentions that half of the time is gone.
16:34:28 <bswartz> I think using different terms in different places is part of what leads to the madness and misunderstandings
16:34:53 <jgriffith> flip214: thanks for the reminder sir
16:35:09 <flip214> jgriffith: no problem. glad to be of service!
16:35:11 <xyang2> bswartz, jgriffith: do we really want to start discussing the terms now? I added a whole section on terms in the cinder spec
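
A worked version of the capacity math in the exchange above, as a sketch. The exact formula was still in flux here: "free + (free * ratio)" as stated would give 300G for the 100G / 2.0 case, while the 200G answer patrickeast confirms corresponds to free * ratio, so the sketch follows the confirmed example. Variable names are illustrative, not Cinder's actual stat keys.

    # Thin-provisioning capacity math, per the 100G / 2.0 / 200G example.
    physical_total = 120.0   # GB of real disk on the backend
    allocated = 20.0         # GB of blocks actually written (thin usage)
    over_prov_ratio = 2.0    # allowed over-subscription

    free = physical_total - allocated        # 100 GB physically free
    virtual_free = free * over_prov_ratio    # 200 GB schedulable for thin

    # A single 200 GB thin volume now fits, even though only 100 GB
    # is physically free.
    assert virtual_free >= 200
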
16:35:16 <jgriffith> bswartz: well, for an end user allocated should not take into account anything to do with thin
16:35:24 <jgriffith> :)
16:35:32 <jgriffith> xyang2: thanks! You saved me
16:35:34 <tbarron> xyang2: +1
16:35:37 <jgriffith> and my blood pressure
16:35:49 <bswartz> I just wanted to point out that we don't want to argue about what the terms should mean, we want whatever the terms are to be crisply defined so there is no confusion
16:36:05 <xyang2> jgriffith: ok, I don't think I explained the problem clearly.
16:36:07 <patrickeast> so... terms and formulas aside, i think the original topic isn't about *how* we calculate the virtual space
16:36:21 <xyang2> jgriffith: there's definitely a bug that was reported by patrick
16:36:27 <jgriffith> xyang2: agreed
16:36:29 <xyang2> jgriffith: and I want to fix it.
16:36:40 <jgriffith> xyang2: yes, and that's FANTASTIC
16:36:51 <xyang2> jgriffith: the question is whether we want to also preserve the existing behavior
16:36:57 <jgriffith> xyang2: I'm proposing that we fix it by having the drivers report capacities in a way that isn't stupid
16:37:11 <jgriffith> xyang2: which frankly right now they kinda are
16:37:20 <patrickeast> the bug is a flaw in how we do the filter logic, not the virtual space
16:37:28 <patrickeast> or at least i see it as a flaw
16:37:30 <xyang2> jgriffith: it is not the driver here actually. the filter scheduler deducts the volume size
16:37:34 <winston-d> patrickeast: +1
16:37:41 <jgriffith> patrickeast: it can be addressed on either side
16:37:50 <jgriffith> patrickeast: and the scheduler may be the right place
16:38:02 <jgriffith> patrickeast: xyang2 my thing was I didn't like the fix and adding a flag etc
16:38:04 <patrickeast> jgriffith: nono, the problem is that the filter doesn't ever get to the virtual capacity stuff as-is
16:38:11 <tbarron> the issue is about filter scheduler behavior, not driver behavior - given the spec that was approved and implemented in kilo
16:38:12 <winston-d> i don't think we've abused the term 'free space' so far
16:38:18 <jgriffith> patrickeast: xyang2 IMHO the drivers and scheduler should work together to just "do the right thing"
16:38:19 <patrickeast> jgriffith: it fails before then on the "if free < volume_size" check
16:38:27 <xyang2> jgriffith: ok, no one likes the flag so far :)
16:38:29 <jungleboyj> tbarron: ++
16:38:34 <patrickeast> this line needs to be changed
16:38:36 <patrickeast> https://github.com/openstack/cinder/blob/master/cinder/scheduler/filters/capacity_filter.py#L83
16:38:43 <jgriffith> patrickeast: yeah, what I'm trying to say is that our reporting of free is wrong
16:38:45 <patrickeast> or moved *after* we check thin provisioning stuff
16:38:54 <patrickeast> jgriffith: oooo
16:38:57 <patrickeast> ok i see
16:39:04 <winston-d> free space means it's physically available space, without overprovisioning.
16:39:08 <jgriffith> patrickeast: free in the case of thin support should be "free * over-prov-ratio"
16:39:09 <xyang2> anyone who wants to keep the ability to preserve the volume size, please speak up
16:39:31 <patrickeast> jgriffith: gotcha, later on we call that virtual_free or something
16:39:32 <xyang2> otherwise, since no one likes the flag, we'll just not preserve it
16:39:33 <jgriffith> winston-d: so that's the quirk
16:39:53 <tbarron> xyang2: I'm ok with that
16:40:02 <jgriffith> winston-d: which is why I then said "ok... add an apparent/virtual/effective" or whatever "-free"
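
A simplified sketch of the ordering flaw patrickeast points at and the shape of a fix: the raw free-space rejection has to come after (or be skipped for) the thin-provisioning path. This is not the verbatim capacity_filter.py code, just the logic under discussion.

    # Simplified CapacityFilter logic; not verbatim upstream code.
    def host_passes(host_state, volume_size_gb):
        free = host_state.free_capacity_gb

        if host_state.thin_provisioning_support:
            # Compare thin volumes against virtual capacity *before*
            # any "free < volume_size" rejection; otherwise a 200G thin
            # volume on a backend with 100G free is refused even though
            # over-subscription would allow it (bug 1458976).
            virtual_free = free * host_state.max_over_subscription_ratio
            return volume_size_gb <= virtual_free

        # Thick volumes must physically fit.
        return volume_size_gb <= free
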
16:40:07 <jgriffith> and use that instead
16:40:22 <jgriffith> honestly if it's thin I don't necessarily know why the scheduler should care
16:40:44 <xyang2> I agree the flag looks ugly. I just want to see if anyone wants to preserve the existing behavior
16:40:48 <hemna> ok I'm back. sorry guys, I had a preso dry run to do at the same time as our meeting.
16:40:52 <winston-d> jgriffith: you can create a type to explicitly ask for thin or thick
16:41:15 <DuncanT> jgriffith: thick and thin on the same backend
16:41:24 <xyang2> jgriffith: so I was given a comment 6 months back when I first started working on this, that we should be conservative and assume the new volume will be consumed
16:41:26 <jgriffith> winston-d: Oh, the crazy pool nonsense lets you do both
16:41:34 <jungleboyj> hemna is back. Back to the discussion. ;-)
16:41:53 <xyang2> jgriffith: that was why it was deducted in the filter scheduler.
16:42:00 <jgriffith> winston-d: so seems like both numbers are good to have, and you use the one that's applicable based on the volume-type being created/requested?
16:42:03 <jgriffith> xyang2: thoughts?
16:42:04 <DuncanT> jgriffith: You can do both without pools...
16:42:13 <jgriffith> DuncanT: ok
16:42:15 <winston-d> jgriffith: right
16:42:28 <jungleboyj> jgriffith: ++
16:42:31 <jgriffith> xyang2: winston-d so can we just fix it that way rather than flags etc?
16:42:43 <xyang2> yes, some backends can support both thin and thick.
16:42:47 * DuncanT suggests that it looks like we can dump the existing behaviour for thin types... it is broken
16:43:03 <xyang2> I have added an extra spec for thin/thick in the cinder spec
16:43:04 <jgriffith> DuncanT: let's discuss that offline
16:43:12 <jgriffith> DuncanT: oh!
16:43:13 <jgriffith> LOL
16:43:14 <xyang2> so a driver can do it if it wants to
16:43:14 <jgriffith> yes
16:43:19 <jgriffith> DuncanT: I agree with you
16:43:37 <winston-d> i'm all for fixing this bug without the flag
16:43:37 <xyang2> for the particular implementation, this is for thin actually
16:43:46 <jgriffith> xyang2: so the driver reports back to the scheduler free and apparent-free
16:43:47 <tbarron> winston-d: +1
16:43:51 <patrickeast> winston-d: +1
16:43:59 <jungleboyj> winston-d: +1
16:43:59 <jgriffith> scheduler uses apparent-free for thin type scheduling and free for thick
16:44:09 <jgriffith> ok
16:44:18 <patrickeast> sounds reasonable to me
16:44:21 <jgriffith> so we're all on the same page I think
16:44:29 <xyang2> jgriffith: that is almost there, I don't think we need more reporting
16:44:30 <hemna> so every driver that supports both has to now report apparent-free and free?
16:44:31 <jgriffith> xyang2: we can chat more between you and me or you and winston-d
16:44:32 <DuncanT> Looks like we're enthusiastically agreeing here, shall we stamp it and move on?
16:44:47 <winston-d> DuncanT: +1
16:44:51 <jgriffith> DuncanT: yeah, but I don't think xyang2 agrees
16:44:57 <patrickeast> although i'm wondering what the odds are of backporting that type of change vs a bug fix for my original problem
16:44:57 <xyang2> jgriffith: sure, but those are already reported
16:45:00 <jgriffith> we can discuss in channel after the meeting
16:45:10 <jgriffith> think it's just 'details' at this point
16:45:18 <xyang2> jgriffith: that is what the provisioned_capacity is about
16:45:29 <xyang2> jgriffith: we already have that in driver stats
16:45:38 <tbarron> xyang2: ++
16:45:38 <jgriffith> xyang2: ok, let's talk after... but I'd say then "just use that"
16:45:41 <jgriffith> there's your fix
16:45:56 <xyang2> jgriffith: I'm fine with that
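
A sketch of the agreement converged on above: the driver reports both a physical free number and an apparent (virtual) free number, and the scheduler consults whichever matches the provisioning type the volume type asks for. The stat keys and extra-spec name are placeholders, not a settled interface.

    # Placeholder stat keys / extra-spec name; not a settled interface.
    stats = {
        'free_capacity_gb': 100,           # physical free
        'apparent_free_capacity_gb': 200,  # free * over-prov ratio
    }

    def schedulable_free(stats, extra_specs):
        # Backends supporting both thin and thick pick per volume type.
        if extra_specs.get('provisioning:type') == 'thin':
            return stats['apparent_free_capacity_gb']
        return stats['free_capacity_gb']

    # A thin-typed 150G request passes; a thick-typed one would not.
    assert schedulable_free(stats, {'provisioning:type': 'thin'}) >= 150
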
16:45:59 <jgriffith> #topic https://bugs.launchpad.net/cinder/+bug/1458976
16:46:00 <openstack> Launchpad bug 1458976 in Cinder "cannot create thin provisioned volume larger than free space" [Undecided,In progress] - Assigned to Xing Yang (xing-yang)
16:46:01 <jgriffith> GAAAA
16:46:12 <jgriffith> #topic Downgrades in new db migrations
16:46:22 <patrickeast> lol
16:46:22 <jgriffith> DuncanT: you'r one
16:46:24 <jgriffith> on
16:46:27 <jgriffith> patrickeast: :)
16:46:31 <e0ne> i asked this question several weeks ago. DuncanT was against this solution because it makes debugging and testing of versioned objects harder
16:47:02 <e0ne> imo, downgrades make our migrations more complex.
16:47:02 <DuncanT> So hopefully a quick question: Have we come to a conclusion about removing these? None of the new reviews have them
16:47:14 <e0ne> but they are useful for developing new migrations
16:47:32 <jgriffith> DuncanT: I am not aware that we had
16:48:05 <DuncanT> Ok, so nobody is going to scream at me for -1ing changes without them. Excellent
16:48:06 <patrickeast> have new changes merged without them, or is this just reviews?
16:48:09 <e0ne> jgriffith, DuncanT: the cross-project spec about removing downgrade migrations merged
16:48:24 <hemna> e0ne, url ?
16:48:24 <jgriffith> e0ne: so that's the answer then I guess
16:48:28 <DuncanT> e0ne: cross project specs are advisory
16:49:06 <e0ne> #link https://github.com/openstack/openstack-specs/blob/master/specs/no-downward-sql-migration.rst
16:49:17 <winston-d> in other words, we don't have to follow?
16:49:29 <e0ne> DuncanT: agree. fair enough
16:49:52 <hemna> e0ne, thanks, have to read through that to figure out their justification.
16:50:06 <vilobhmm> will go through the spec eone
16:50:09 <bswartz> why don't people want to support downgrades?
16:50:16 <hemna> downgrades are a bit of a PITA to do in some cases, regarding foreign key constraints, etc
16:50:26 <ameade> esp hard with data migrations
16:50:27 <DuncanT> bswartz: They're hard to write and don't get tested
16:50:30 * bswartz mutters under his breath
16:50:30 <e0ne> bswartz: because operators don't use them in prod
16:50:31 <ameade> if not impossible
16:50:34 <hemna> bswartz, the url above seems to explain their justifications.
16:50:41 <e0ne> hemna: +1
16:50:45 * bswartz readin...
16:50:52 * bswartz reading...
16:50:54 <hemna> the problem is the downgrades aren't for operators
16:51:02 <hemna> they are for us to ensure our upgrades work
16:51:07 <e0ne> hemna: that's true
16:51:15 <hemna> fwiw
16:52:13 <hemna> "Update oslo.db to return appropriate errors when trying to perform a schema downgrade"
16:52:25 <hemna> so if that gets implemented, our downgrades might start to puke
16:52:30 <bswartz> this sounds like laziness to me
16:52:37 <hemna> and we won't have a choice but to get rid of them
16:52:38 <DuncanT> The idea of returning to a db dump is just broken in the case of live upgrade though....
16:52:43 * jungleboyj is surprised by it.
16:52:53 <bswartz> downgrades are hard to do right -- so we propose to not do downgrades
16:53:07 <hemna> heh
16:53:08 <hemna> yah
16:53:20 <hemna> I had issues w/ downgrades for the multi-attach patch
16:53:24 <hemna> but worked through it.
16:53:42 <hemna> it seemed like a good exercise to me.
16:53:47 <DuncanT> You didn't want to actually keep the volumes you created recently, right?
16:53:47 <bswartz> nobody cares if downgrades are wildly inefficient, but having them is better than not having them
16:53:57 <jgriffith> bswartz: I think the real point was "they don't really work when there's data" so we shouldn't pretend they do
16:53:58 <jungleboyj> bswartz: ++
16:53:59 <DuncanT> Oh, and the ones you deleted are back
16:54:20 <jgriffith> bswartz: there's also a good question on whether they're actually useful for anything other than debug
16:54:38 <bswartz> useful for debugging seems like a good enough reason to keep them
16:54:51 <bswartz> the real problem is if they're buggy -- the solution is to test them and fix the bugs
16:55:11 <jgriffith> Ok, so I think you all want to do the opposite of all the folks on the ML
16:55:14 <jgriffith> that's ok by me
16:55:27 <DuncanT> If oslo.db is blocking them we're screwed
16:55:29 <e0ne> imo, re-creating the db is faster than creating a downgrade + tests for debugging
16:55:37 <jgriffith> e0ne: +1
16:55:38 <hemna> according to that spec though, oslo.db is going to be updated to not even allow them
16:55:42 <hemna> so I think the point might be moot.
16:55:55 <jgriffith> e0ne: I'm not sure I see why anybody is upset by this but ok
16:56:01 <DuncanT> e0ne: Have you ever tried to recreate a prod db? It's a nightmare
16:56:08 <hemna> DuncanT, +1
16:56:18 <e0ne> DuncanT: i'm talking about debugging
16:56:18 <jgriffith> DuncanT: have you ever tried to run our downgrades on a production DB?
16:56:24 <jgriffith> I can't imagine that it would work
16:56:28 <patrickeast> but i thought the point was that these were not used in production?
16:56:30 <e0ne> jgriffith: :)
16:56:33 <jgriffith> patrickeast: :)
16:56:44 <DuncanT> jgriffith: We used them on a dev cluster, in that case they actually worked
16:56:49 <jgriffith> and thus the circular-reference argument ensues
16:56:54 <hemna> DuncanT, I think there are 2 purposes for db migrations though. My primary is during development to ensure the changes I've made actually make it in place. In this use case, doing a complete db recreate is just as effective as a downgrade.
16:57:09 <jgriffith> hemna: that's what grenade does fyi
16:57:14 <hemna> the other purpose is for live production data, and I'm not sure I ever see a use case where a customer wants to downgrade ?
16:57:20 <jgriffith> hemna: your changes are upgrades, not downgrades
16:57:25 <hemna> yup
16:57:29 <hemna> jgriffith, agreed.
16:57:35 <jgriffith> ok, I think this horse is dead... not sure why we're beating it
16:57:36 <DuncanT> What do you do if your system is busted after an upgrade?
16:57:40 <hemna> so I think I'd be ok with nuking downgrades.
16:57:53 <e0ne> DuncanT: revert from backup?
16:58:00 <DuncanT> e0ne: Live upgrade
16:58:01 <jgriffith> DuncanT: I'm not sure how/why anybody thinks the downgrade scripts are going to help in that situation anyway?
16:58:23 <hemna> jgriffith, +1
16:58:28 <DuncanT> jgriffith: If they work, they do... the one time I've tried them they worked fine
16:58:36 <hemna> if your upgrade is roasted, the downgrade most likely won't even work.
16:58:43 <e0ne> DuncanT: i'm not sure that if a live upgrade fails, the downgrade will work
16:58:44 <jgriffith> DuncanT: so you've used them once in 4 years?
16:58:51 <hemna> jgriffith, :)
16:58:53 <DuncanT> jgriffith: Yes
16:59:06 <jgriffith> DuncanT: don't every become a sales person :)
16:59:13 <jgriffith> s/every/ever/
16:59:19 <hemna> lol
16:59:24 <bswartz> hemna: the DB upgrade may be fine but the new code could have critical bugs making you want to go back to an older version
16:59:28 * jungleboyj is enjoying that mental image
16:59:37 <DuncanT> bswartz: ++
16:59:38 <jgriffith> DuncanT: so honestly I don't care either way, but it sounds as if you need to take it up on the dev ML
16:59:44 <jgriffith> DuncanT: with bswartz
17:00:10 <jgriffith> because it would seem the rest of the OpenStack community has moved on and may be removing the capability from oslo.db anyhow
17:00:19 <jungleboyj> Really seems like the DB should be snapshotted before an upgrade so that it can be rolled back if a disaster occurs.
17:00:21 <jgriffith> and on that note... we're out of time
17:00:22 <hemna> bswartz, we have bugs ?
17:00:22 <DuncanT> jgriffith: Yup, didn't realise that there was an official policy. That answers my question for now
17:00:23 <bswartz> it feels to me like the devs are screwing the users with this change
17:00:26 <hemna> :P
17:00:47 <jgriffith> bswartz: there was never a user that came back and said they had ever used it though
17:00:49 <winston-d> i've never used my
17:00:50 <jgriffith> ok
17:00:55 <DuncanT> jungleboyj: You can't snapshot a live db and expect it to work later... new volumes are lost, deleted volumes are back, it is totally broken
17:01:02 <jgriffith> thanks everyone
17:01:04 <winston-d> fire extinguisher in my car, but i want to make sure it works when i need it
17:01:04 <bswartz> but it's less work for us, so it's all good </sarcasm>
17:01:05 <jgriffith> #endmeeting
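
For reference, the shape of what this decision removes. Cinder's migrations at the time were sqlalchemy-migrate scripts with paired upgrade()/downgrade() functions; the cross-project spec linked above keeps only upgrade(). The table and column below are made-up examples for illustration, not a real Cinder migration.

    # Made-up sqlalchemy-migrate script showing the paired style.
    from sqlalchemy import Column, MetaData, String, Table

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        volumes = Table('volumes', meta, autoload=True)
        volumes.create_column(Column('example_field', String(36)))

    def downgrade(migrate_engine):
        # The part being dropped: trivial here, but hard to get right
        # once new code has written data (FK constraints, populated
        # columns, rows created between upgrade and downgrade).
        meta = MetaData(bind=migrate_engine)
        volumes = Table('volumes', meta, autoload=True)
        volumes.drop_column('example_field')
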