16:00:02 #startmeeting cinder
16:00:03 Meeting started Wed Aug 12 16:00:02 2015 UTC and is due to finish in 60 minutes. The chair is thingee. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:06 The meeting name has been set to 'cinder'
16:00:08 o/
16:00:10 fireball: has multiple personalities.
16:00:11 o/
16:00:13 o/
16:00:13 o/
16:00:14 Hello!
16:00:14 hi
16:00:17 Hi!
16:00:20 hi
16:00:20 o/
16:00:21 hi
16:00:21 hello :)
16:00:21 Hi,
16:00:24 hello
16:00:26 hi
16:00:31 Hi
16:00:42 hi
16:00:45 Hello
16:00:53 #topic how to translate specific exceptions
16:01:22 hello
16:01:35 uhhh mriedem?
16:02:22 ok move on
16:02:24 mriedem: not here?
16:02:30 https://review.openstack.org/#/c/209150/
16:02:34 thingee: I told mriedem the time. I will ping him.
16:02:42 thingee: that is the patch he proposed
16:02:45 #topic modify volume image metadata
16:02:48 morning
16:02:54 #link https://review.openstack.org/#/c/211201/3
16:02:59 Hi,
16:03:16 error handling in the client
16:03:20 So I started looking at some of the work davechen is doing here and it's quite a bit of complexity for us to have role-protected metadata
16:03:32 mriedem1: we'll circle back
16:03:40 ok
16:03:49 thingee: as far as I know, only glance has this feature.
16:04:05 thingee: and we need this for two reasons.
16:04:18 I'm on a call for work, so I'll lurk here as best I can.
16:04:25 hi
16:04:33 o/
16:04:40 the first reason is, the Glance image properties that are copied are a combination of mutable and immutable values, but that distinction is lost when they are copied into Cinder.
16:05:13 the second reason is, Glance has RBAC on properties; it could be the case that we want to update a property in Cinder that is protected in Glance.
16:05:33 have a look at this page: http://docs.openstack.org/developer/glance/property-protections.html
16:05:57 there are some descriptions of the property protections in glance there.
16:06:27 so to put it briefly, this is more complexity than I care to add into cinder
16:07:05 I don't think cinder should even care how this works
16:07:09 thingee: so, this is not needed per your perspective?
16:07:40 based on the reasons you gave in the review, I'm trying to think of a time someone was trying to solve this problem
16:07:52 license key
16:07:59 kernel id
16:08:03 ramdisk id
16:08:03 I think there has been some discussion about this.
16:08:07 where?
16:08:08 yeah...
16:08:21 https://openstack.nimeyo.com/16488/openstack-cinder-glance-update-volume-image-metadata-proposal
16:08:35 mailing list, but i think it's a long discussion.
16:08:42 I brought this to the meeting because I want to know what folks think too
16:08:56 And really I think this is overthought for just modifying metadata.
16:09:30 and I think Duncan was in the glance meeting to talk with the glance team about this.
16:10:03 http://eavesdrop.openstack.org/meetings/glance/2014/glance.2014-06-26-20.03.log.html
16:10:28 so, anyone who is interested could read that.
16:11:00 o/
16:11:07 thingee: I am okay with it, if this is too heavy, we can just drop it.
16:11:42 but not sure what others think?
16:11:59 winston-d^^
16:12:07 DuncanT ^^
16:12:08 davechen: sorry, i joined late, dropping what?
16:12:23 winston-d: props protection.
16:12:26 role/policy based metadata update?
16:12:54 yup, this may be too complex per Mike's perspective.
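
For context on the glance feature davechen links above: glance's property protections are driven by a property-protections.conf file that maps property-name patterns to the roles allowed to create, read, update and delete them. Below is a minimal illustrative sketch only; the patterns and roles are made up, see the linked doc for the real syntax.

    # property-protections.conf (illustrative sketch, not a real deployment file)
    # Section headers are regexes matched against property names;
    # the values are the roles allowed to perform each operation.

    [^x_billing_.*]
    create = admin
    read = admin
    update = admin
    delete = admin

    [.*]
    create = admin,member
    read = admin,member
    update = admin,member
    delete = admin,member

This is the kind of per-property, per-role matrix the discussion refers to when weighing whether Cinder should mirror it for volume image metadata.
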
16:13:29 davechen: this is as opposed to always requiring admin to update image meta?
16:13:30 there was once a discussion around this, and the thought was this was needed to protect some props the way glance does.
16:13:52 Personally I think we might need to do some metadata cleanup/fixing before we tackle this.. but perhaps in M
16:13:54 guitarzan: no, it's based on the user's role.
16:14:09 davechen: so a normal user can update some image meta in glance?
16:14:11 guitarzan: it's not always just admin that can do this stuff.
16:14:14 DuncanT wanted this feature
16:14:38 sure, if the props are not confidential props, why can't we read/update them?
16:14:40 davechen: are kernel and ramdisk the only editable ones?
16:14:41 guitarzan: i guess so, if it's an instance snapshot or private image
16:14:54 FWIW if the numbers in this analysis are accurate: https://bugs.launchpad.net/cinder/+bug/1483165 I'm a bit concerned
16:14:54 Launchpad bug 1483165 in Cinder "The performance of volume index is low" [Undecided,In progress] - Assigned to wangxiyuan (wangxiyuan)
16:15:02 Launchpad bug 1483165 in cinder "The performance of volume index is low" [Undecided,In progress] https://launchpad.net/bugs/1483165
16:15:03 Launchpad bug 1483165 in cinder "The performance of volume index is low" [Undecided,In progress]
16:15:19 New bot?
16:15:34 winston-d: not just DuncanT but the glance team as well.
16:15:41 smcginnis: it's like a begin/end code statement :)
16:16:15 ;)
16:16:56 thingee: what do you think? Mike
16:17:23 thingee: drop that patch?
16:17:35 I'm not seeing a huge reason to bring it in honestly. I think it's crazy all these different things in openstack doing role-based policies though.
16:17:42 I can't even keep track
16:17:50 I'd hate to be a deployer.
16:17:58 thingee: :)
16:18:35 thingee: okay, if we need this, we can do it.
16:19:21 thingee: If this stuff is not that important, we can just let it be.
16:19:25 should 'admin_or_owner' cover 99% of use cases?
16:19:26 thingee: +1
16:19:43 winston-d: that's what I'm getting at.
16:19:46 what's wrong with what we have now
16:19:50 winston-d: I doubt it, because the owner can't update most of them
16:20:09 winston-d: this policy is too simple I think.
16:20:25 guitarzan: then change policy i guess?
16:20:42 winston-d: If admin_or_owner covers 99%, there is no need for oslo policy to exist.
16:21:09 winston-d: I'm not sure what to do about the 4 keys that they can change
16:21:13 or whatever that number is
16:21:14 davechen: we are talking about the cinder context
16:21:25 davechen: not overall openstack use cases
16:21:39 winston-d: okay.
16:21:56 * thingee is confused
16:22:17 is this a blocking issue for certain use cases?
16:22:43 no
16:23:14 alright let's move on
16:23:24 okay, thanks.
16:23:28 #topic how to translate specific exceptions
16:23:30 mriedem1: hi
16:24:12 https://review.openstack.org/#/c/209150/
16:24:21 just wondering what to do there
16:26:57 mriedem1: this might be a silly question, but why do we fail a detach when unattached anyway?
16:27:42 guitarzan: because the attach failed on boot
16:27:49 * jungleboyj can hear everyone's brains processing.
16:28:11 mriedem1: I'm not sure that answers my question
16:28:16 but I might just not understand
16:28:34 detaching after an attach failed is a cleanup operation, no?
16:28:36 mriedem1: and nova doesn't know about the attach failure so it thought the volume is attached?
16:28:53 how does nova not know about an attach failure?
16:30:40 I don't want to get off track, but I don't understand why we couldn't just say "ok, you detached"
16:30:54 anyway, just my 2 cents if we want to continue on the conversation
16:31:03 guitarzan: That makes sense to me.
16:31:05 sorry, multitasking,
16:31:08 so there is the bug report
16:31:17 i'd suggest looking at that for the root issue
16:31:36 my question is how we should handle translating errors from cinder-api to specific exceptions in the client so nova can key off those
16:31:47 rather than nova having to parse the error message from cinder, which could change
16:32:00 sure, I'm just saying that cinder could just not fail the detach...
16:32:00 we have a similar model with neutronclient
16:32:09 and nova would continue happily on with its cleanup
16:32:13 guitarzan: cinder fails the detach b/c the attach didn't happen
16:32:19 https://bugs.launchpad.net/nova/+bug/1476806
16:32:19 Launchpad bug 1476806 in OpenStack Compute (nova) "Unable to delete instance with attached volumes which failed to boot" [High,In progress] - Assigned to Maxim Nestratov (mnestratov)
16:32:21 Launchpad bug 1476806 in nova "Unable to delete instance with attached volumes which failed to boot" [High,In progress]
16:32:22 Launchpad bug 1476806 in nova "Unable to delete instance with attached volumes which failed to boot" [High,In progress] https://launchpad.net/bugs/1476806
16:32:25 there are no attachments so the detach returns a 400
16:32:32 nova wants to handle that and ignore it
16:32:32 mriedem1: that's what I'm saying we could fix
16:32:38 *we* could ignore it
16:32:45 Sub-codes again?
16:32:59 there is an API working group patch up about specific codes for error responses
16:33:04 We talked about this at the kilo midcycle.
16:33:06 i assume that's a ways out from landing
16:33:37 mriedem1: if it gets a detach and the volume is not attached it should just say "ok"
16:33:47 mriedem1: it shouldn't be a failure to begin with in that case
16:34:09 yeah, idk, i heard things about races yesterday where the db and actual block storage backend aren't in synhc
16:34:10 *sync
16:34:14 mriedem1: but FWIW I do like the idea of some standardized responses to nova so it's not messy parsing
16:34:28 jgriffith: ++
16:34:34 mriedem1: meh... I don't know how real that is, but regardless
16:34:39 sure
16:34:47 mriedem1: So, this is moving the message parsing into cinderclient instead.
16:34:51 mriedem1: well, bdms being out of sync does happen, but this just doesn't seem like a failure case
16:34:52 yes
16:34:54 jungleboyj: yes
16:34:58 mriedem1: there's lots of fud around here about races and consistency that isn't as prevalent as one might think
16:35:00 mriedem1: What improvement are we going to get from that?
16:35:08 jungleboyj: that it lives closer to cinder
16:35:08 https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L865
16:35:18 we throw an exception here in the manager
16:35:33 xyang0: that should just return instead of raise :)
16:35:35 jungleboyj: and it's the same model we (nova) have with neutronclient errors that are translated from the neutron api in the client
16:35:36 mriedem1: ++ I thought that was the answer but wanted to make sure.
16:35:37 that nova handles
16:35:46 xyang0: and it should just be a warning
16:35:47 jungleboyj: it's way better than trying to guess/parse strings in exception messages
16:36:09 granted, this is a thing nova will want to backport to kilo,
16:36:18 because with this bug you can't delete an instance w/o removing the bdms in the db
16:36:20 which blows
16:36:31 jgriffith: ++ Agreed.
16:36:38 i don't know how keen cinder people would be on backporting that change from 400 to 200
16:36:39 mriedem1: yeah, hated it
16:37:04 mriedem1: i mean having to manually clean up the bdm
16:37:13 ok back
16:37:26 winston-d: yeah, any surgery directly in the nova db isn't fun
16:37:27 so I agree with jgriffith if a detach call comes in and the volume is already detached, just say ok.
16:38:10 can anyone look at this: https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L865
16:38:17 okay if we just return success?
16:38:35 xyang0: this is actually what nova is hitting in that bug https://github.com/openstack/cinder/blob/master/cinder/volume/manager.py#L885
16:38:54 mriedem1: right. so we can fix it in cinder
16:39:01 xyang0, yah
16:39:09 mriedem1: return success instead of raise
16:39:10 xyang0, want me to put up a patch to change that?
16:39:13 this is an API change, but from error to success which might be ok
16:39:13 +1 to fix in the manager
16:39:28 hemna_: sure
16:39:33 xyang0, ok I'll do it.
16:39:37 +1
16:39:54 I don't like changing it to success
16:40:01 Yeah, we shouldn't raise. There was nothing to detach.
16:40:15 There was nothing and you asked me to detach, so error
16:40:19 geguileo: 400 to 200, badrequest to accepted
16:40:31 geguileo: we also do that in drivers. if you try to detach a volume that is already detached, log a msg and move on
16:40:31 It's like trying to delete a file in your filesystem that doesn't exist
16:40:35 Don't you get an error?
16:40:37 not necessarily success
16:40:43 it's not much different from what we do with deleting a volume that doesn't exist.
16:40:47 drivers just say...ok
16:40:51 hemna_: +1
16:40:54 geguileo: this is no different from other cinder things "I was not able to delete this volume because it didn't exist, so success"
16:41:01 I'm fine with this approach
16:41:07 Truth is, this is an API change and so it might break people's scripts
16:41:07 thingee: I understand it
16:41:15 hemna_: That makes sense.
16:41:17 thingee: But I still believe we should report an error
16:41:22 you can log a warning
16:41:28 thingee: And Nova should look at the type of error
16:41:30 #topic update on volume migration
16:41:34 geguileo: ++ But not raise the exception.
16:41:45 jungleboyj: ^
16:41:50 vincent isn't around
16:41:53 Argh!
16:42:14 do you have an update?
16:42:21 for vincent or should I move on?
16:42:36 jungleboyj: I have one comment about https://review.openstack.org/#/c/186312/
16:42:50 thingee: I don't have anything specific. Sorry, missed that he had put that out there.
16:43:04 If we can get eyes on the patches and work on getting things merged so we get more run time that would be great.
16:43:12 * jungleboyj needs to go review the patches as well.
16:43:14 geguileo: ++
16:43:15 geguileo: why should we put the volume in error state in this case?
16:43:34 geguileo: it doesn't work that way... "nova look at the type of error"
16:43:36 winston-d: Did you put it in the review?
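
A rough sketch of the detach fix hemna_ offered to put up above (warn and return instead of raising when a detach arrives for a volume with no attachment). This is not the actual cinder/volume/manager.py code; the function and parameter shapes are assumed for illustration of the agreed direction.

    import logging

    LOG = logging.getLogger(__name__)


    def detach_volume(volume_id, attachment):
        # Sketch only: 'attachment' stands in for whatever the manager
        # looked up for this volume (None when nothing is attached).
        if attachment is None:
            # Previously this path raised, which surfaced to nova as a 400
            # and blocked instance cleanup; the proposal is to log a
            # warning and report success instead.
            LOG.warning("Volume %s has no attachment; treating detach as "
                        "a no-op.", volume_id)
            return
        # ... the normal detach/terminate-connection work continues here ...

mriedem1's broader point about translating cinder-api errors into specific exceptions in python-cinderclient (so nova catches an exception class rather than parsing messages, as it does with neutronclient) is a separate question and remains open per the discussion above.
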
16:43:43 which is within this change, attach/detach or whatever actions that are not allowed when the volume is marked as 'maintenance',
16:43:44 geguileo: the volume gets put in error-detaching and you're "stuck"
16:43:54 #topic call for reviews on object patches
16:44:02 hi!
16:44:04 this change should stop these operations.
16:44:10 jgriffith: But we can return an error and not set it to error detaching
16:44:25 dulek: thangp hi
16:44:25 jungleboyj: I mentioned this a few times in previous reviews.
16:44:28 hi!
16:44:34 We just want to report that work on RPC compatibility is blocked by object patches.
16:44:35 but it seems vincent hasn't addressed it.
16:44:44 winston-d: Ok, I will look through that and highlight it to Vincent.
16:44:49 winston-d: Thanks.
16:44:49 We recently merged the ConsistencyGroup object.
16:44:51 jungleboyj: thx
16:44:57 geguileo: I suppose I don't care... just don't see any real value in it, because what's Nova going to do? What are your expectations here?
16:45:02 #link https://review.openstack.org/#/c/177054/
16:45:05 It's a warning IMHO
16:45:07 There are only 3 to go: Volume, Service and CGSnapshot
16:45:11 #link https://review.openstack.org/#/c/201404/
16:45:12 jgriffith: ++
16:45:18 #link https://review.openstack.org/#/c/160417/
16:45:18 geguileo: it would just add more code to nova to do the same thing it's already going to do
16:45:23 jgriffith: +1
16:45:30 jgriffith: My expectations are giving the right response
16:45:33 So we want to call for reviews of them.
16:45:37 #link https://review.openstack.org/#/c/195027/
16:45:46 Without them thangp's work on making rolling upgrades *actually* work is blocked.
16:45:47 geguileo: I'm with jgriffith here, it's a warning.
16:45:48 guitarzan: Sure, but it would be the right response
16:45:59 #link https://review.openstack.org/#/c/205622/
16:46:00 geguileo: LOL.. and I'm saying if you asked to detach a volume and it's detached that is the correct response :)
16:46:02 geguileo: nah, the right thing to do is just smile and nod when nova tries to detach something already detached
16:46:03 but anyway
16:46:04 You guys agree, so I'm cool with it :-)
16:46:04 moving on
16:46:09 #link https://review.openstack.org/#/c/209701/
16:46:19 jgriffith: we already moved on :)
16:46:20 (thingee's posting links to reviews)
16:46:24 (and my messages are sinking into the background noise I think)
16:46:32 thingee, jgriffith : https://review.openstack.org/#/c/205369/ - nested quota driver needs review
16:46:52 dulek: Yep, sorry for making that much noise
16:47:06 dulek: we need updated versionedobjects for the cgsnapshot object to work. that is not released yet, right?
16:47:08 it's all +1 since a week or so... if someone from the core team can have a look it would be helpful
16:47:15 thingee, jgriffith : ^^
16:47:15 #info people should review finishing the object patches and rpc versioning patches
16:47:27 xyang0: Correct, but we can still work on other patches.
16:47:31 dulek: no, they are still on the screen
16:47:35 xyang0: yes, but we could put a workaround in place until then?
16:47:53 So again - work on upgrades is going forward, but patches are very big and hard to review.
16:48:10 thangp: that will break all the drivers if that is not merged. when do you expect versionedobjects will be updated?
16:48:20 We need those objects merged to progress.
16:48:28 dulek: can't they be broken down into smaller ones?
16:48:34 xyang0: not sure
16:48:44 xyang0: i can ask doug
16:48:44 thangp: what is the workaround? install versionedobjects manually from git?
16:48:50 thangp: ok
16:49:00 winston-d: It's quite hard to use objects in one place and not in the others.
16:49:01 thanks dulek and thangp
16:49:04 xyang0: no, just unset the consistencygroup
16:49:22 so cgsnapshot.consistencygroup is not populated
16:49:29 so it could be sent over rpc
16:49:41 thangp: can you add the workaround in the review? I couldn't run that patch with this problem
16:49:48 xyang0: sure
16:49:58 xyang0: I'll make sure this is done in the CGSnapshot patch.
16:50:08 dulek: ok, thanks
16:50:23 dulek: that i agree with
16:50:37 That's all from our side I think
16:50:49 #topic open discussion
16:51:23 it would be nice if we had reviews on the nested quota patches as well
16:51:36 thingee : looks like the ping got lost in the objects discussion before but https://review.openstack.org/#/c/205369/ - nested quota driver needs review
16:51:36 https://review.openstack.org/#/c/206171
16:51:39 and https://review.openstack.org/#/c/205369/
16:51:51 thingee, jgriffith : ^^
16:52:30 #endmeeting