15:59:51 <jgriffith> #startmeeting cinder
15:59:52 <openstack> Meeting started Wed Apr 16 15:59:51 2014 UTC and is due to finish in 60 minutes. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:59:54 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:59:57 <openstack> The meeting name has been set to 'cinder'
16:00:15 <jgriffith> Hey everyone
16:00:29 <kmartin> hello
16:00:33 <thingee> o/
16:00:36 <jgriffith> Just wanted to do a quick synch up with folks today
16:00:39 <asselin> hi
16:00:43 <jgriffith> https://wiki.openstack.org/wiki/CinderMeetings
16:00:56 <jgriffith> #topic Release Status
16:01:15 <jgriffith> We cut another RC yesterday morning
16:01:30 <jgriffith> At this point we should be done unless something REALLY critical pops up
16:01:46 <jgriffith> akerr: did note a problem with Glance API V2
16:01:53 <thingee> jgriffith: there was an issue that came up with create from image
16:02:01 <jgriffith> thingee: :)
16:02:03 <thingee> with regards to checksum missing
16:02:05 <jgriffith> yes
16:02:11 <jgriffith> same issue I just mentioned
16:02:16 <jgriffith> It's Glance API V2
16:02:25 <jgriffith> There are two bugs associated with that....
16:02:36 <jgriffith> https://bugs.launchpad.net/cinder/+bug/1308594
16:02:37 <kmartin> should that be fixed, if it's easy?
16:02:37 <uvirtbot> Launchpad bug 1308594 in cinder "upload-to-image fails with size error on glance v2 api" [High,Confirmed]
16:02:46 <jgriffith> https://bugs.launchpad.net/cinder/+bug/1308058
16:02:47 <uvirtbot> Launchpad bug 1308058 in cinder "Cannot create volume from glance image without checksum" [Undecided,New]
16:02:55 <jgriffith> kmartin: the problem is timing
16:03:13 <kmartin> yeah, down to the wire
16:03:28 <jgriffith> So spinning another RC and reseting the package maintainers again this late is not good
16:03:31 <jgriffith> Also....
16:03:36 <thingee> and since we can't guarantee the timing with the state of things, I say we revert and default to None.
16:03:39 <jgriffith> My view is that this is a V2 Glance API thing
16:03:50 <jgriffith> and we default to V1
16:04:08 <jgriffith> My vote/view is document in the Release notes as Known issue (which is done) and roll
16:04:10 <kmartin> agree...should note it
16:04:15 <glenng> But it does have an impact to a new NetApp feature requiring v2.
16:04:22 <jgriffith> glenng: yes, correct
16:04:24 <jgriffith> which sucks
16:04:29 <glenng> agreed
16:05:14 <thingee> it's too bad that this was reported pretty early yesterday too
16:05:18 <jgriffith> I talked to akerr and he stated that a backport could be relatively easy for Netapp customers
16:05:26 <thingee> and not noticed
16:05:37 <glenng> Not the end of the world; documenting would be okay.
16:05:46 <jgriffith> thingee: well since we default to V1 only Netapp uses V2 right now
16:05:59 <thingee> just saying, I'
16:06:05 <jgriffith> thingee: FYI even yesterday morning was a bit late
16:06:07 <thingee> think the cut off wasn't done right
16:06:13 <akerr> jgriffith: our feature is optional as well. As glenng says, not the end of the world
16:06:14 <thingee> if there are unverified issues like this
16:06:18 <jgriffith> thingee: we cut/shipped yesterday AM at about 8:00
16:06:26 <jgriffith> I think we're ok
16:06:28 <thingee> It was earlier than that
16:06:37 <jgriffith> thingee: well ok...
16:06:41 <jgriffith> thingee: and the point is?
16:06:58 <thingee> it was unverified. there was a cut off
16:07:05 <thingee> what if it was critical?
16:07:33 <jgriffith> thingee: I'm not sure what you're looking for here? Is this criticism, or something else?
16:07:50 <thingee> I'm just unhappy with the cut off decision
16:07:58 <thingee> on new unverified issues
16:08:04 <jgriffith> thingee: what specifically do you mean?
16:08:05 <thingee> be present
16:08:11 <jgriffith> thingee: the "cut off" decision?
16:08:12 <thingee> can't make it more clear than that
16:08:27 <jgriffith> thingee: You're unhappy with me cutting the RC yesterday?
16:09:03 <jgriffith> thingee: maybe we should talk after the meeting
16:09:17 <jgriffith> thingee: You seem to be very unhappy with how things have been going lately
16:09:21 <jgriffith> maybe we can fix it
16:09:55 <jgriffith> Ok... so back to our regular scheduled program
16:10:25 <jgriffith> #topic summit sessions update
16:10:35 <glenng> Yahoo!
16:10:39 <jgriffith> I'll make another pass on those shortly
16:10:49 <jgriffith> Have some good proposals
16:11:04 <jgriffith> It's not too late if you have items you want to propose, but need to do it today
16:11:12 <thingee> jgriffith: is the ISCSI and FC clean up work in brick needed for a session?
16:11:19 <thingee> could probably just unconf for interested parties
16:11:32 <thingee> will not just brick
16:11:43 <bswartz> thingee: unconf sessions are no more
16:11:47 <bswartz> let me find the blog
16:11:59 <thingee> bswartz: I can invite anyone to a bar with me to discuss it
16:12:01 <jgriffith> thingee: We have slots
16:12:04 <thingee> and then bring it up in the ML
16:12:09 <thingee> ;)
16:12:17 <jgriffith> thingee: we should propose it
16:12:21 <jgriffith> IMO
16:12:24 <glenng> *is interested*
16:12:30 <bswartz> oh a REAL unconf session! are you buying?
16:12:32 <thingee> ok, I had second thoughts how productive it would be
16:12:46 <thingee> bswartz: HA, I'm on a budget nowadays
16:13:00 <thingee> jgriffith: ok I'll propose it
16:13:05 <jgriffith> thingee: thanks
16:13:24 <jgriffith> any questions/suggestions WRT summit sessions?
16:13:53 <thingee> have we verified the people that have proposed these are going to be present or have someone familiar with the subject to be present?
16:13:59 <thingee> I don't want more david wang situations
16:14:01 <jgriffith> haha
16:14:09 <DuncanT-1> Who/is/ David Wang?
16:14:15 <jgriffith> I've checked with each of them and they've "said" they'll be there
16:14:15 <thingee> DuncanT-1: that's the new shirt
16:14:36 <hemna> mornin
16:15:28 <winston-d> hemna: morning~
16:15:29 <jgriffith> anything else from anyone?
16:15:44 <DuncanT-1> I saw a couple of sessions marked incomplete
16:15:52 <DuncanT-1> Does that mean they're out?
16:16:01 <hemna> thingee, iSCSI/FC cleanup work ?
16:16:01 <jgriffith> DuncanT-1: nahh
16:16:10 <thingee> hemna: yes, it's a mess
16:16:12 <jgriffith> DuncanT-1: means they have a chance to come back with a more detailed focus
16:16:18 <thingee> complicated code
16:16:32 <xyang1> jgriffith: how many slots do we have
16:16:38 <jgriffith> DuncanT-1: but as the proposal was there were concerns or it wasn't clear what the objective was
16:16:47 <hemna> thingee, ok fill me in offline
16:16:47 <jgriffith> xyang1: 12
16:16:56 <jgriffith> errr... 11
16:17:05 <DuncanT-1> jgriffith: Ok, cool
16:17:24 <xyang1> jgriffith: do you know which days yet? Wed, Thu, Fri?
16:17:29 <jgriffith> hemna: if you want to update multi-attach perhaps?
16:17:31 <hemna> thingee, almost all of that is directly from nova. I have plans to refactor some of the initiator side, wrt multi-attach and rediscovery at detach time for iSCSI
16:17:39 <jgriffith> xyang1: Friday last check
16:17:45 <hemna> jgriffith, sure
16:17:46 <jgriffith> xyang1: same as usual
16:18:12 <hemna> multi-attach is coming along.
I have the first set of patches as WIP in gerrit for nova, cinder, cinderclient
16:18:21 <jgriffith> hemna: awesome
16:18:23 <hemna> but I'm working on changes to it as well as getting unit tests working
16:18:29 <jgriffith> #topic open-discussion
16:18:42 * thingee added a topic last minute
16:18:58 <hemna> I had to make a change to the current patches at detach time to pass in the attachment uuid instead of instance uuid, because Cinder can attach to a host (no instance uuid in that case)
16:19:12 <winston-d> jgriffith: was asking about if people are interested in immutable volumt type or in place update for volumes when admin making changes to type definition or type-QoS associations
16:19:16 <jgriffith> thingee: I don't see it?
16:20:02 <hemna> I came across an issue yesterday that I might need some help with
16:20:02 <thingee> weird, I do
16:20:10 <thingee> well it's 'cinder resource status'
16:20:15 * jgriffith refreshes
16:20:18 <thingee> specifically the way we handle a status for an object
16:20:21 <jungleboyj_> I am here now.
16:20:22 <thingee> https://bugs.launchpad.net/cinder/+bug/1305550
16:20:23 <uvirtbot> Launchpad bug 1305550 in cinder "Failed retype with driver raised exception should set volume status to "error"" [Undecided,In progress]
16:20:42 <thingee> this bug raised a thought that we make the status field too complicated
16:20:45 <jgriffith> thingee: oh, topic to the agenda
16:20:49 <jgriffith> not the summit
16:21:13 <winston-d> thingee: or not complicated enough?
16:21:19 <jgriffith> #topic what to do on retype failure
16:21:37 <thingee> I would like folks to think of cinder of trying removing a lot of the intervention by ops and user.
16:21:38 <jgriffith> winston-d: not complicated enough IMO
16:21:53 <hemna> volume/manager.py _migrate_volume_generic has a call into nova to update_server_volume for an instance. since cinder can be attached to multiple instances, which one do I use in the nova_api.update_server_volume() call?
16:21:58 <thingee> to be clear I don't think it should be up to people recover volumes if cinder can do it
16:22:03 <hemna> or do I call it for every instance.
16:22:12 <hemna> thats the only oustanding issue I have wrt multi-attach
16:22:13 <jungleboyj_> I think the problem is bigger than just for three type. It seems like we should be able to provide the user with more information about a failure.
16:22:40 <jgriffith> hemna: can we come back around to that
16:22:45 <thingee> if somethin is in an error, just stop. it's done. and keep the status as *error*.
16:22:45 <jungleboyj_> *retype
16:22:49 <jgriffith> hemna: finish up the talk about retype first
16:22:52 <hemna> so I'm going to try and get a second set of patches up in gerrit this week
16:22:55 <hemna> jgriffith, ok
16:23:01 <thingee> don't try to convey it with 'it-failed-because-of-this-thing' status
16:23:17 <thingee> have a separate field that explains why the status is 'error'
16:23:24 <jgriffith> thingee: so the question however is in that particular case is setting it to error-status appropriate I vote no
16:23:28 <DuncanT-1> Fine with the separate field
16:23:39 <thingee> jgriffith: ok, what do you gain from other statuses?
16:23:39 <akerr> thingee: doesn't nova do something similar to that when an instance goes into error?
16:23:52 <jungleboyj_> Can we do a separate fields and still be able to have the user take actions or will it only be the administrator?
16:23:54 <ameade> akerr: +1 instance faults
16:24:01 <jgriffith> DuncanT-1: I think thingee is saying the opposite
16:24:03 <DuncanT-1> jgriffith: what should it do? Leave the volume at the old type?
16:24:15 <thingee> my point is we should reserve error for, there is nothing cinder can do about it.
nothing the user can do about it
16:24:17 <jgriffith> DuncanT-1: well, I think so yes
16:24:18 <thingee> it's up to ops
16:24:27 <jgriffith> DuncanT-1: and the reason is becuase there's nothing "wrong" with the volume
16:24:35 <DuncanT-1> jgriffith: That seems quite reasonable to me
16:24:41 <jgriffith> and worse the user doesn't have a mechanism to know what retype's are valid
16:24:45 <jgriffith> so it's trial and error
16:24:53 <thingee> jgriffith: so i agree with retype
16:24:57 <jgriffith> and I think it's bad user experience to put it in error
16:25:07 <jgriffith> and say "haha now you can't use your volume"
16:25:08 <thingee> I'm saying in general, better conveying to the user what happened is what I'm adovcating here
16:25:32 <jgriffith> thingee: so that's another topic IMO
16:25:39 <winston-d> jgriffith: well, i think https://bugs.launchpad.net/cinder/+bug/1305550 here is more about something wrong happened when retyping a volume
16:25:39 <jgriffith> thingee: and we should propose sub-states
16:25:41 <uvirtbot> Launchpad bug 1305550 in cinder "Failed retype with driver raised exception should set volume status to "error"" [Undecided,In progress]
16:25:53 <thingee> jgriffith: agreed. and if you look back to my original sentence, this topic brought on a thought for me
16:25:56 <thingee> of this
16:26:10 <jgriffith> winston-d: yeah, that one is different and that's no good
16:26:28 <jungleboyj_> Thingee so you were saying we wouldn't put the volume in error?
16:26:31 <akerr> jgriffith: we already have error_extending, but I'm not sure thats the best way to go
16:26:31 <thingee> I think sub-states is also complicated. Again, error is just nothing can be done about it. Not cinder, not the user. Just ops
16:26:40 <thingee> put that in a status description field
16:26:45 <jungleboyj_> Just leave it is available with more information in the field?
16:26:53 <thingee> make it so it's safe for users' eyes
16:26:58 <jgriffith> akerr: yeah, it still blocks some things that look for "error_"
16:27:02 <winston-d> Nova has both 'task state' and 'instance falut'
16:27:06 <thingee> ops can see a general idea from what the user sees and look at the logs for more information
16:27:11 <jgriffith> winston-d: +1
16:27:13 <winston-d> we can at least have one
16:27:20 <ameade> you need to be able to handle the case of multiple errors on one resource
16:27:21 <winston-d> or both
16:27:29 <ameade> i dont want to only see info about the latest
16:27:54 <thingee> ameade: so I talked about that in #openstack-cinder too
16:27:54 <jgriffith> ameade: I don't agree with that but I think we're rat holing a bit
16:28:07 <winston-d> ameade: use cases like multi-attaching?
16:28:20 <jgriffith> the bottom line is right now we have ONE and only ONE method of conveying status
16:28:26 <jgriffith> it seems that's not enough
16:28:36 <winston-d> jgriffith: +1
16:28:38 <jgriffith> so we should at least start by implementing a task-state
16:28:41 <jungleboyj_> ameade I think that is going further than we need right now.
16:28:44 <jgriffith> and go from there
16:28:52 * thingee is talking to himself when he just brought up a second way of giving status
16:29:22 <jungleboyj_> jgriffith sounds reasonable.
16:29:34 <jgriffith> thingee: what did you want to say
16:29:39 <jgriffith> thingee: floor is all yours
16:29:45 <jgriffith> everybody listen to thingee
16:29:51 <thingee> let me scrollback up and paste what I said earlier
16:30:04 * jungleboyj_ listens
16:30:24 <thingee> This is also explained in the bug https://bugs.launchpad.net/cinder/+bug/1305550
16:30:25 <uvirtbot> Launchpad bug 1305550 in cinder "Failed retype with driver raised exception should set volume status to "error"" [Undecided,In progress]
16:30:46 <jgriffith> thingee: ummm...
sorry that doesn't help me
16:30:50 <jgriffith> thingee: what did YOU say
16:30:51 <hemna> (we should get that bot in openstack-cinder)
16:31:04 <thingee> Reserve 'error' for the resource is not recoverable by user or cinder. it requires manual intervention by ops
16:31:07 <thingee> jgriffith: sorry still typing
16:31:24 <thingee> use a *second* field to give a description of the status
16:31:41 <thingee> instead of 'it-failed-because-of-this-status' like we've been doing
16:32:22 <jgriffith> thingee: sure
16:32:23 <jungleboyj_> jgriffith I think he is pointing out that he wants to keep her for just the worst of situations.
16:32:31 <jgriffith> thingee: as in my proposal in comment #2 of the bug
16:32:38 <jungleboyj_> *error
16:32:52 <winston-d> thingee: so no 'error-extending' but just 'error' with an description field?
16:33:27 <akerr> winston-d: i think not even "error" there because the volume is still usable, just not the new size
16:33:32 <thingee> in order to promote better state setting, I would say instead of using the db api directly, we need some helper for setting state that would require things like the new state e.g. available, error, in-use, and a status description is required if it's something like error state.
16:33:32 <jungleboyj_> winston-d or available with a description field.
16:33:51 <winston-d> akerr: well, that depends.
16:33:58 <winston-d> jungleboyj_: ^^
16:34:01 <jgriffith> thingee: yeah, we've been saying for a year defined and real states
16:34:50 <winston-d> i wish we can have backend driver report some type of failure that actually doesn't hurt/touch the volume.
16:34:59 <jungleboyj_> It seems like we would still have to add a state for the case for your command failed and you need to see the additional information.
16:35:03 <thingee> jgriffith: I guess when I read #2 comment in that bug, I took it as another key being used for the sub-status, not a full description of text.
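[editor's sketch] The state-setting helper thingee describes at 16:33:32 could look roughly like the following: callers go through one function instead of touching the db api, and an error status is refused unless a description comes with it. This is only an illustration under stated assumptions. The names set_status, VALID_STATUSES, DESCRIPTION_REQUIRED and status_description are hypothetical, and a plain dict stands in for a volume row; none of this is Cinder code.

```python
# Hypothetical sketch, not Cinder code: a dict stands in for a volume row.
VALID_STATUSES = {"available", "in-use", "error"}
# Statuses that must be accompanied by a human-readable description.
DESCRIPTION_REQUIRED = {"error"}


def set_status(volume, new_status, description=None):
    """Set volume['status'], insisting on a description for error states."""
    if new_status not in VALID_STATUSES:
        raise ValueError("unknown status: %s" % new_status)
    if new_status in DESCRIPTION_REQUIRED and not description:
        raise ValueError("status %r requires a description" % new_status)
    volume["status"] = new_status
    volume["status_description"] = description
    return volume
```

Funnelling every change through one helper is what would let the set of allowed statuses be trimmed later without hunting down every direct db update.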
16:35:06 <jgriffith> winston-d: I agree
16:35:38 <akerr> you could define something like a 'nonFatalError' exception that drivers could throw
16:35:41 <jungleboyj_> winston-d +1
16:35:49 <hemna> doesn't this fall under general state management of volume transactions. Wasn't taskflow supposed to help with this some?
16:36:07 <winston-d> akerr: yeah, but until we have that, an error could be a unrecoverable error
16:36:42 <jgriffith> Ok... can I say something without hurting any feelings or pissing anybody off?
16:36:48 <jgriffith> let's back up and focus a little
16:37:12 <jgriffith> first; don't bring taskflow into the discussion, it doesn't do what we're talking about regardless of if that was a goal or not
16:37:24 <jgriffith> Let's propose a summit session
16:37:41 <jgriffith> First... let's agree on: Adding a task-status entry
16:37:58 <jgriffith> We can argue about verbosity, what it means etc later
16:38:03 <jungleboyj_> jgr iffith +1
16:38:31 <jgriffith> At the same time, that means we have the opportunity to limit the status field we have today as thingee pointed out
16:38:35 <jungleboyj_> I think getting in a room together and talking about this is a good idea.
16:38:36 <jgriffith> which I think is needed/good
16:38:48 <jgriffith> There are a lot of opportunities here
16:38:55 <jungleboyj_> Ageed.
16:39:15 <jgriffith> but you can't throw in EVERYTHING all at once
16:39:35 <jgriffith> does this sound reasonable to everyone?
16:39:44 <jgriffith> are there any disagreements?
16:39:50 <thingee> question
16:40:01 <hemna> well I just think taskflow is relevant to the discussion of volume state. that's all.
16:40:14 <thingee> what is the task-state accomplishing? what exactly does 'creating' currently mean for example?
16:40:48 <jgriffith> creating today is a status
16:40:54 <thingee> correct
16:40:56 <jungleboyj_> thingee it is telling the user what is happening that they can't see.
16:41:12 <jgriffith> when you say task-state are you referring to the hypothetical yet to exist thing?
16:41:13 <hemna> and it means you can't take other actions on the volume while it's in that status.
16:41:16 <jgriffith> or something else?
16:41:17 <thingee> ok, so again, it's explaining what 'creating' status currently means.
16:41:20 <thingee> more detailed
16:41:26 <thingee> as an example
16:41:49 <thingee> jgriffith: I'm referring to 'task-state' that you just said a few lines up
16:41:57 <jgriffith> thanks
16:41:59 <jgriffith> and NO
16:42:07 <jgriffith> it's not to describe the status
16:42:15 <thingee> I don't know what task-state means and I was just giving an example to understand.
16:42:17 <jgriffith> it's not to describe what "attaching" means
16:42:24 <jungleboyj_> thingee that is what I am thinking. That is what we need to talk about at the summit.
16:42:37 <jgriffith> I'll try my proposal again....
16:42:42 <jgriffith> For example:
16:42:49 <jgriffith> You try to extend a volume
16:42:50 <akerr> do you mean state=creating, task-state=in progress?
16:42:56 <jgriffith> The volume/backend doesn't support extend
16:43:12 <jgriffith> The volume is "fine", just not extended
16:43:23 <jgriffith> DONT put the volume in error status
16:43:39 <jgriffith> Set a taks-status of "extend-failed" or whatever
16:43:52 <jgriffith> leave the volume as 'available' and the original size
16:44:01 <jgriffith> Example 2:
16:44:09 <jgriffith> retype from foo to baz
16:44:18 <jgriffith> backend doens't support baz, and migration is not enabled
16:44:28 <jgriffith> DONT set volume to error status and make it unusable
16:44:41 <jgriffith> Set the task-status to "error-retyp" or whatever
16:44:49 <jgriffith> Leave the status as "avaialble"
16:45:01 <jgriffith> thingee: is that clear?
16:45:05 <thingee> yup
16:45:07 <jgriffith> thingee: do I need another example?
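[editor's sketch] jgriffith's first example (extend on a backend that does not support it) can be written out as a small runnable illustration. It assumes a dict standing in for the volume and a made-up ExtendNotSupported exception; extend_volume, unsupported_backend and the task_status key are hypothetical names chosen for the sketch, not Cinder APIs.

```python
# Hypothetical sketch, not Cinder code: names and the dict layout are
# assumptions made for illustration.
class ExtendNotSupported(Exception):
    """Raised by a (hypothetical) backend that cannot extend volumes."""


def extend_volume(volume, new_size, backend_extend):
    """Try to extend; on failure keep the volume usable and record why."""
    try:
        backend_extend(volume, new_size)
    except ExtendNotSupported:
        # The volume itself is fine, so status and size stay untouched.
        volume["task_status"] = "extend-failed"
        return volume
    volume["size"] = new_size
    volume["task_status"] = None
    return volume


def unsupported_backend(volume, new_size):
    """Stand-in for a driver that does not implement extend."""
    raise ExtendNotSupported()


vol = {"status": "available", "size": 10, "task_status": None}
extend_volume(vol, 20, unsupported_backend)
print(vol["status"], vol["size"], vol["task_status"])
# prints: available 10 extend-failed
```

The retype example would follow the same shape: catch the "not supported" case, leave status as available, and record the failure in the task-status field only.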
16:45:10 <jungleboyj_> jgriffith +2
16:45:10 <hemna> shouldn't we include a tnx history, instead of just the last failure ?
16:45:16 <thingee> hemna: +1
16:45:17 <hemna> txn
16:45:17 <jgriffith> tnx?
16:45:25 <hemna> sorry, I'm lazy....transaction
16:45:29 <glenng> jgriffith + 1
16:45:30 <jgriffith> gotcha
16:45:36 <jgriffith> hemna: maybe...
16:45:40 <jungleboyj_> hemna one thing at a time.
16:45:42 <jgriffith> hemna: 1. What would that be
16:45:48 <jgriffith> hemna: 2. Do you need that first pass
16:45:55 <jgriffith> hemna: 3. How do you manage it
16:45:56 <winston-d> hemna: like instance faluts of Nova?
16:46:04 <jgriffith> winston-d: yes
16:46:08 <thingee> jgriffith: I think it's really important to consider this in the design now. If we change our mind later, it's going to be a pain to change on deployed
16:46:17 <hemna> do we need it right now? I'd argue that yes, we could use it now :)
16:46:18 <jgriffith> thingee: I'm not saying that it isn't
16:46:19 <thingee> once deployed*
16:46:26 <hemna> does it have to be done first pass, probably not.
16:46:29 <jgriffith> hemna: Please answer the first question
16:46:33 <akerr> maybe get getting ahead here again, but would want a 3rd field with a more descriptive explanation of why the task failed?
16:46:34 <jgriffith> hemna: 'what is it'
16:46:52 <jungleboyj_> The last few states of that volume?
16:47:00 <hemna> another table in the db that tracks transactions and their states/steps/failures
16:47:07 <jgriffith> jungleboyj_: that's your interpretation... I want hemna 's
16:47:25 <thingee> hemna: pretty much what I thought too
16:47:35 <jgriffith> hemna: for who's consumption?
16:47:40 <thingee> the user
16:47:44 <ameade> how does the user know there was an error at all (if the status isn't error)?
16:48:00 <hemna> soo......that leads me to bring up taskflow again. Isn't there a built in mechanism to taskflow that tracks the transaction state?
16:48:04 * hemna ducks
16:48:13 <hemna> jgriffith, for admins
16:48:21 <winston-d> ameade: task status
16:48:30 <DuncanT-1> hemna: Not really, no. There ought to be, but isn't
16:48:35 <jgriffith> hemna: so this is why I'm asking "you"
16:48:42 <ameade> winston-d: sure that could make sense maybe
16:48:46 <jgriffith> hemna: you say admins, thingee says users
16:48:49 <thingee> winston-d: I think the problem though is how do you know. say the task status already has a value
16:48:50 <akerr> winston-d: that assumes the task-status would clear up after some time?
16:48:50 <hemna> heh
16:48:52 <thingee> how do you know it's new?
16:48:55 <jgriffith> others may say "ops" etc
16:49:15 <hemna> I dunno, I don't think users should need to see why retype failed, but admins do.
16:49:19 <thingee> what if you get the same status? do you have to keep track of the old status to know a change has happened?
16:49:47 <jgriffith> I really think that this is being made much more complex than it should be
16:49:55 <winston-d> akerr: well, task state/status clear doesn't help if you want to find out why the 'retype' request was failed that you invoke 3 days ago.
16:50:00 <hemna> DuncanT-1, ok sounds like we should ping harlowja about adding it then.
16:50:07 <jgriffith> which is part of the problem I have with existing things (like taskflow)
16:50:10 <jungleboyj_> jgriffith +2
16:50:23 <DuncanT-1> hemna: Not simple, since taskflow currently isn't built in a way it can usefully track it
16:50:25 <winston-d> akerr: and after that you also did a bunch of new operations to the voluem
16:50:34 <thingee> jgriffith: I think the current thought is more simplified than it should be. I'm trying to figure out how people would use it.
16:50:41 <thingee> how it would look in clients like horizon
16:50:54 <jgriffith> thingee: the same as it looks in Nova for example
16:51:10 <jgriffith> |Status|Task|
16:51:27 <jgriffith> avaialble|unable-to-retype|
16:51:39 <ameade> fwiw, i think typically in a RESTful api what is usually done is the user would create a new 'retype' resource and they can poll that to see the status of the task
16:51:55 <ameade> but that of course makes no sense in our current design
16:52:00 <jungleboyj_> add a timestamp perhaps?
16:52:21 <thingee> jgriffith: so I'm totally in agreement with going back to available status. +1000. But if extend fails..the user tries twice...they get the same task state back. I guess that's fine and maybe a timestamp of when that task state was updated?
16:52:25 <thingee> just so you know something finished?
16:52:34 <winston-d> So Nova has |Status|Task|InstanceFaults|
16:52:53 <jungleboyj_> thingee +2
16:53:13 <winston-d> |available|unable-to-retype|backend_not_supported|
16:53:48 <jgriffith> winston-d: sure
16:54:09 <DuncanT-1> backend_not_supported doesn't mean anything to an end users tennant though
16:54:22 <jgriffith> DuncanT-1: yeah, I'd suggest that field be admin
16:54:30 <akerr> thingee: so I suppose a task history would come in handy there — cinder task-history <uuid> -> | Task | Outcome | Timestamp |
16:54:31 <jgriffith> but again I think we're getting ahead of ourselves a bit
16:54:34 <winston-d> DuncanT-1: instance falut is for admins
16:54:34 <DuncanT-1> Ok, that makes sense
16:54:43 <hemna> DuncanT-1, unless you want to portray it as that action is not available
16:54:51 <hemna> since it will always fail
16:55:30 <jgriffith> 5 minute warning
16:55:43 <winston-d> akerr: try logstash with request ID
16:56:04 <DuncanT-1> Or stacky with the same
16:56:13 <winston-d> yeah
16:56:17 <jgriffith> yeah, please don't suggest duplicating the log files in some API call
16:56:30 <thingee> winston-d: I still don't think it helps in knowing if a
task finished when you retry a failed task.
16:56:35 <thingee> from the user's standpoint
16:56:37 <thingee> or client
16:56:59 <jgriffith> My suggestion was that running a new task 'always' clears the previous task-state
16:57:05 <jgriffith> set's it to None at the onset
16:57:10 <jungleboyj_> give the user as much info as possible. Eventually it helps the admin is well.
16:57:39 <hemna> jungleboyj_, hey, user here is a nice fat stacktrace for you. good luck. :P
16:57:48 <DuncanT-1> jungleboyj_: Disagree. Far too easy for the user to start guessing what the problem is and get completely the wrong end of the stick
16:57:54 <jgriffith> hemna: and so much for the abstraction
16:57:57 <glenng> Or confuses them. Seeing old error info may hinder when current operation worked.
16:57:59 <thingee> jgriffith: would that be obvious to someone new? I'm trying to remember if on certain operations we list the volume/snapshot or whatever details before doing certain actions
16:58:12 <jungleboyj_> hemna ... Not that much.
16:58:21 <hemna> :P good.
16:58:42 <thingee> jgriffith: It's not an obvious thing to me that a field would be cleared on a new action.
16:58:45 <jgriffith> thingee: it's a hell of a lot more obvious that silently not extending or setting the volume to error because something isn't supported
16:58:49 <hemna> other than the current volume state, all the admin has now are log stacktraces...if that.
16:59:04 <jgriffith> thingee: when you run an API cmd and see the field change it seems obvious to me
16:59:26 <thingee> jgriffith: I agree it's better. I'm just saying if we're going to revamp this, lets be careful and consider these things so we're not repeating ourselves.
16:59:28 <jungleboyj_> hemna backend_not_supported doesn't seem dangerous though.
16:59:30 <DuncanT-1> hemna: Good drivers log lots of useful info of their own too...
if yours doesn't, talk to your vendor
16:59:37 <jgriffith> thingee: fair enough
16:59:45 <hemna> jungleboyj_, +1
16:59:46 <jgriffith> DuncanT-1: +1
16:59:59 <jgriffith> Ok
16:59:59 <thingee> DuncanT-1: +1
17:00:04 <bswartz> +1 for log spam
17:00:15 <jgriffith> We've succesfully burned our hour
17:00:19 <hemna> DuncanT-1, ours does a good job of logging failures/reasons. I'm just saying in general though that's not overly useful to an admin
17:00:24 <jgriffith> I'll get a session for this proposed
17:00:25 <thingee> also with that, reviewers should be encouraging driver changes to give great logs to cinder users!
17:00:30 <jgriffith> and have some code for ATL
17:00:34 <hemna> because it takes for fricking ever to find the error in the log on a busy system.
17:00:41 <DuncanT-1> ATL?
17:00:43 <jgriffith> thanks everyone
17:00:45 <thingee> atlanta
17:00:49 <hemna> forcing admins to have to look in the log, is the wrong approach IMO
17:00:50 <DuncanT-1> Ah
17:00:52 <jungleboyj_> thingee my favorite thing to do.
17:00:53 <jgriffith> #endmeeting
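[editor's sketch] Two ideas from the last ten minutes combine naturally: jgriffith's rule that a new task clears the previous task-state at the onset (16:56:59), and the timestamp jungleboyj_ and thingee suggest so a client can tell that a retry finished (16:52:00, 16:52:21). The following is a rough illustration only. run_task, task_status and task_updated_at are hypothetical names, and the dict stands in for a volume record; nothing here is an agreed design or Cinder code.

```python
# Hypothetical sketch, not Cinder code.
from datetime import datetime, timezone


def run_task(volume, name, operation):
    """Run operation(volume) and record its outcome in task_status."""
    # A new task always clears whatever the previous task left behind.
    volume["task_status"] = None
    try:
        operation(volume)
    except Exception:
        volume["task_status"] = "%s-failed" % name
    # Refreshed on every run, so a client can tell that a retry finished
    # even when it ends with the same task_status as the earlier attempt.
    volume["task_updated_at"] = datetime.now(timezone.utc)
    return volume
```

This keeps only the latest outcome. The task history that hemna and akerr ask about would need a separate record per run, which the meeting left for the summit session.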