16:03:05 <jgriffith> #startmeeting cinder
16:03:07 <openstack> Meeting started Wed Nov 14 16:03:05 2012 UTC. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:08 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:03:09 <openstack> The meeting name has been set to 'cinder'
16:03:20 <thingee> oh I didn't comprehend your message correctly. It appears durgin just signed on at 7:56
16:03:23 <jdurgin1> hello
16:03:26 <thingee> there!
16:03:28 <jgriffith> thingee: Morning! :)
16:03:52 <jgriffith> thingee: after your all-nighter you don't stand a chance at comprehending me :)
16:04:00 <jgriffith> thingee: I'm confusing enough on a good day
16:04:43 <jgriffith> alright, I want to start with G1 status updates
16:04:47 <jgriffith> #topic G1 status
16:04:56 <jgriffith> #link https://launchpad.net/cinder/+milestone/grizzly-1
16:05:41 <jgriffith> This is what we have slated, and remember G1 is next week
16:06:19 <winston-d> :)
16:06:45 <jgriffith> Anybody have any concerns about what they're signed up for that I should know about?
16:06:50 <bswartz> I see one of the blueprints mentions splitting the drivers into 1 per file
16:06:55 <jgriffith> Need help, blockers, etc?
16:07:01 <bswartz> I would prefer keeping the netapp drivers together
16:07:44 <winston-d> jgriffith, mine are on track.
16:07:57 <jgriffith> winston-d: excellent
16:08:20 <jgriffith> bswartz: the first phases of that change have already landed
16:08:27 <jgriffith> bswartz: over a week ago
16:08:57 <thingee> jgriffith: looking pretty good on apiv2. clearer-error-messages is going to be pretty easy too
16:09:16 <jgriffith> thingee: awesome, so the power of positive thinking worked :)
16:09:27 <jgriffith> thingee: That and no sleep for a night!
16:09:38 <bswartz> jgriffith: I think the change in general is a good idea, but I'm just suggesting that the NetApp drivers be exempted
16:09:55 <jgriffith> bswartz: I'm just pointing out that the change already merged
16:10:09 <bswartz> oh wait, I'm looking at the wrong tree
16:10:10 <jgriffith> bswartz: https://review.openstack.org/#/c/15000/
16:10:53 <bswartz> okay it looks like netapp wasn't affected by that change
16:11:02 <thingee> jgriffith: I thought apiv2 was targeted for g1?
16:11:09 <bswartz> I will keep an eye out for the next phase of the change
16:11:38 <jgriffith> bswartz: ok, we still need to figure out if there needs to be a next phase I suppose, but anyway
16:12:11 <jgriffith> thingee: hmmm.... it was, but it was moved last week when we got Chuck's version
16:12:25 <jgriffith> thingee: My plan/idea was to get the structure in for G1
16:12:25 <thingee> ah right
16:12:42 <jgriffith> thingee: All the other additions/enhancements will come in G2
16:12:54 <jgriffith> thingee: So it's critical that we have everything in place to do that at G1
16:12:56 <thingee> yeah. I'll let you know once I move the other stuff into separate bps and then retarget?
16:13:09 <DuncanT> Sorry, didn't see the time
16:13:09 <jgriffith> perfect
16:13:16 <jgriffith> DuncanT: NP
16:13:38 <jgriffith> So if there are no big concerns about getting these in....
16:13:51 <jgriffith> Are there any big concerns about something that should be there that's not?
16:14:44 <jgriffith> Yay! I'll take silence as a good thing ;)
16:15:09 <jgriffith> #topic service and host info
16:15:30 <jgriffith> So something else that came up this week was cinder-manage service list :)
16:15:48 <jgriffith> Turns out I was in the process of implementing that functionality in the cinderclient
16:16:11 <jgriffith> There may be some concern around doing this and I wanted to get feedback from all of you
16:16:28 <winston-d> i see you are working on a host extension?
16:16:44 <jgriffith> winston-d: Yes, but I am planning to change that around a bit
16:16:56 <jgriffith> It should fit more with what nova does I think
16:17:12 <jgriffith> and then add a separate service extension for the "other stuff"
16:17:24 <winston-d> jgriffith, ok. i'm fine with that.
16:17:33 <jgriffith> TBH I'm not sure of the value in the host extension any longer
16:17:49 <winston-d> hmm
16:17:52 <jgriffith> 90% of it just raises NotImplemented
16:18:19 <jgriffith> winston-d: I'd like to implement the extension and then we can fill in the underlying stuff later
16:18:33 <jgriffith> but I want to make sure it's not something that nobody will ever use :)
16:18:59 <creiht> what was the host extension?
16:19:02 <jgriffith> ie host power-actions (reboot a cinder node), set maintenance-window, etc
16:19:08 <creiht> oh
16:19:12 <creiht> interesting
16:19:39 <jgriffith> creiht: yeah, I started with just a place to put things like "show me all the cinder services, their status and where they're running"
16:19:41 <DuncanT> Is there a detailed blueprint for it? I'm only familiar with a very small subset of it
16:19:52 <jgriffith> Then I noticed nova had this hosts extension, but it's a bit different
16:20:09 <DuncanT> The services/nodes/statuses stuff is definitely useful
16:20:14 <jgriffith> DuncanT: Nope, I didn't go into detail with the BP because it existed in Nova
16:20:32 <jgriffith> DuncanT: Yeah, but I'm thinking that should be separate from the hosts extension
16:20:58 <winston-d> DuncanT, agree. service/node/status are useful.
16:20:59 <jgriffith> Leave the hosts extension in line with what it does in Nova, and add services specifically for checking status, enable/disable etc
16:21:18 <winston-d> i usually treat that as how the scheduler sees the cluster.
16:21:29 <DuncanT> I need to go look at the nova version before I can comment, I guess...
16:21:32 <jgriffith> winston-d: explain?
16:21:57 <jgriffith> DuncanT: It's sorta funky IMO, mostly because a good portion of it is not implemented
16:22:14 <jgriffith> DuncanT: But the idea is that you perform actions on your compute nodes
16:22:41 <winston-d> jgriffith, well, what services are up/down, etc. basically i'd check nova-manage service info when i see something wrong with instance scheduling.
16:22:55 <winston-d> this was a missing part in cinder.
16:22:55 <jgriffith> winston-d: ahh... yes, ok
16:23:11 <jgriffith> winston-d: So that is what I set out to address with this
16:23:29 <jgriffith> winston-d: But I'm thinking now that it might be useful if this had its own extension (service)
16:23:39 <DuncanT> I don't think the cinder API is the right place to be rebooting nodes... (nor the nova API for that matter)
16:23:43 <jgriffith> winston-d: There are a number of things that I think could be added into that in the future
16:24:05 <jgriffith> DuncanT: yeah, that's something I thought would come up :)
16:24:05 <DuncanT> Service info as its own extension sounds like the way to go then...
16:24:08 <winston-d> i agree with DuncanT.
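A rough sketch of the per-service view such an admin-only services extension might expose, mirroring what nova-manage service list reports (binary, host, zone, enabled/disabled, up/down, last heartbeat). The field names, the liveness cutoff, and the row layout below are illustrative assumptions, not the schema of any merged Cinder extension.

    # Illustrative sketch only: field names and the liveness cutoff are
    # assumptions, not Cinder's actual services-extension schema.
    from datetime import datetime, timedelta

    SERVICE_DOWN_TIME = timedelta(seconds=60)  # hypothetical heartbeat cutoff

    def service_to_view(service, now=None):
        """Map one services-table row (given as a dict) to a list-services entry."""
        now = now or datetime.utcnow()
        alive = (now - service['updated_at']) < SERVICE_DOWN_TIME
        return {
            'binary': service['binary'],            # e.g. 'cinder-volume'
            'host': service['host'],
            'zone': service['availability_zone'],
            'status': 'disabled' if service['disabled'] else 'enabled',
            'state': 'up' if alive else 'down',
            'updated_at': service['updated_at'].isoformat(),
        }

An enable/disable action would then just flip the disabled column and return the refreshed view, which leaves the hosts extension free to stay in line with Nova's.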
16:24:28 <jgriffith> #action jgriffith dump hosts extension for now and implement services ext
16:24:45 <jgriffith> Everybody ok with that?
16:24:53 <DuncanT> Yup
16:24:58 <winston-d> i'm good.
16:25:04 <jdurgin1> sounds good
16:25:29 <jgriffith> Just for background... there's a push to get things out of the *-manage utilities and into the clients
16:25:43 <jgriffith> that's why I didn't just pick up that patch and be done last week :)
16:26:24 <winston-d> jgriffith, that's related to the admin API stuff?
16:26:26 <jgriffith> ok... any questions/suggestions on this topic?
16:26:37 <DuncanT> I'd ideally like to keep "cinder-manage service list" working direct from the database too, but I won't cry if it disappears (I'll just carry my own version... I only want it for dev systems)
16:26:41 <jgriffith> winston-d: yes, they would be admin-only extensions
16:26:57 <jgriffith> DuncanT: Well, we can put it in as well.... but it DNE today :)
16:27:38 <DuncanT> I'll send a patch to put it in :-)
16:27:54 <jgriffith> DuncanT: We can just reactivate the one I rejected this week :)
16:28:05 <jgriffith> I think it was from Avishay, but can't remember
16:28:14 <kmartin> I'll let hemna know
16:28:24 <jgriffith> kmartin: Ahhh... thanks!!!!
16:28:31 <jgriffith> kmartin: Yes, it was hemna!
16:28:38 <DuncanT> :-)
16:28:43 <jgriffith> I'm just not sure about the value in having both, but whatevs
16:29:07 <jgriffith> any other thoughts?
16:29:23 <kmartin> jgriffith: he should be in the office shortly, he'll get it done today
16:29:27 <winston-d> i think nova is trying to avoid direct db access.
16:29:44 <jgriffith> winston-d: yes, that was my reason for rejecting it the first time around
16:29:52 <winston-d> having that in cinder-manage means we are adding direct db access?
16:30:00 <jgriffith> winston-d: yup
16:30:14 <DuncanT> I think it is a laudable aim, but not needing the endpoint address is handy on dev systems
16:30:29 * jgriffith is hopeful somebody might agree with him on this
16:30:52 <winston-d> last time, when i proposed to add some feature to nova-manage, it was rejected with the suggestion to do it in novaclient.
16:31:18 <jgriffith> Ok, I'm reverting back to my original stance on this
16:31:33 <jgriffith> kmartin: don't tell hemna to resubmit please :)
16:31:55 <kmartin> no problem
16:32:00 <DuncanT> Fair enough, I'll keep an out-of-tree version for now
16:32:00 <creiht> lol
16:32:01 <jgriffith> DuncanT: we can revisit later, but i hate to put something in there just for dev
16:32:18 <jgriffith> DuncanT: Maybe I can just give you a developer's tool that you can use :)
16:32:24 <creiht> I'm not sure if I am fond of having the manage tools also in the client
16:32:32 <winston-d> i think the rackspace private cloud team has a lot of db access scripts to do management jobs.
16:32:35 <creiht> but I can understand how it makes certain things easier
16:32:44 <winston-d> they even have some project around that.
16:32:59 <jgriffith> creiht: Well the idea is they would be ONLY in the client, if that helps :)
16:33:06 <DuncanT> DB access can be really handy when something in the db is stopping your API server from working :-)
16:33:24 <jgriffith> DuncanT: yeah, in some cases it's the only option :)
16:33:25 <DuncanT> But those types of tools tend to be very site-specific I think
16:33:47 <jgriffith> DuncanT: I think it's fair that those are handled by the provider IMO
16:34:15 <DuncanT> As I said, it is a trivial enough thing to maintain out-of-tree for now... might bring it up again in six months
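The out-of-tree helper DuncanT describes would look roughly like the sketch below: read the services table straight from the database, with no API endpoint required. The connection URL and column names are assumptions about a typical dev install, not a supported Cinder interface.

    # Hypothetical dev-box helper, not a supported interface: the DB URL and
    # column names are assumptions about a typical dev deployment.
    import sqlalchemy as sa

    def list_services(db_url='mysql://cinder:secret@localhost/cinder'):
        engine = sa.create_engine(db_url)
        with engine.connect() as conn:
            rows = conn.execute(sa.text(
                "SELECT host, `binary`, disabled, updated_at "
                "FROM services WHERE deleted = 0"))
            for host, binary, disabled, updated_at in rows:
                status = 'disabled' if disabled else 'enabled'
                print('%-20s %-20s %-9s %s' % (binary, host, status, updated_at))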
16:34:16 <winston-d> DuncanT, yeah, i know. the question is whether we want more of that going into cinder.
16:34:58 <jgriffith> Ok, I'm going to proceed forward.... people can scream and punch me in the face later if they want :)
16:34:59 <winston-d> ops people may already have much more powerful db scripts to do auditing/monitoring/reaping jobs, i guess.
16:35:07 <jgriffith> winston-d: +1
16:35:25 <winston-d> :)
16:35:31 <jgriffith> #topic gate test failures
16:35:41 <DuncanT> I'd like to bring some of that power into the upstream tree to save reinventing the wheel, but I'm happy to accept that cleaning things up needs to happen first
16:35:57 <jgriffith> It just occurred to me this AM that I haven't been updating people on this whole mess
16:36:13 <jgriffith> The Ubuntu dd issue...
16:36:30 <jgriffith> We continue to see intermittent failures in the gate tests due to this
16:36:58 <jgriffith> the kernel folks working on it are making some progress, but it's really becoming a thorn in my side
16:37:01 <jgriffith> Soo.....
16:37:02 <DuncanT> I tried and failed to reproduce
16:37:15 <jgriffith> DuncanT: Yeah, that's what sucks about it
16:37:29 <jgriffith> DuncanT: But if you tear down and rebuild enough times you'll hit it
16:37:42 <jgriffith> physical or virtual
16:37:59 <jgriffith> Anyway, I put in a temporary patch:
16:38:07 <DuncanT> Hmmm, have you got a set of backtraces from when it is happening?
16:38:13 <winston-d> DuncanT, check out https://github.com/JCallicoat/pulsar, that's the *nova swiss army knife*.
16:38:20 <jgriffith> I added a "secure_delete" flag that is checked on the dd command
16:39:10 <DuncanT> winston-d: Cheers
16:39:15 <jgriffith> The default is set to True, but in the gate/tempest jobs we set it to False in the localrc file
16:39:25 <winston-d> that works?
16:39:49 <jgriffith> This will hopefully keep everybody from saying "Cinder failed jenkins again"
16:39:56 <jgriffith> :)
16:40:09 <winston-d> great
16:40:31 <jgriffith> I've been trying some other methods of the secure delete, but they either have the same problem or other severe perf problems
16:41:08 <jgriffith> Anyway, I thought I should start keeping everybody up to speed on what's going on with that
16:41:34 <jgriffith> I'm still hopeful that one morning I'll find the kernel fairy came by while I slept and fixed this
16:42:11 <jgriffith> otherwise during G2 we'll need to focus on a viable alternative
16:42:29 <jgriffith> Any questions/thoughts on this?
16:42:35 <creiht> jgriffith: would out-of-band zeroing make it better?
16:42:47 <jgriffith> creiht: how do you mean?
16:42:50 <winston-d> jgriffith, do you have an ubuntu bug # for this?
16:43:21 <jgriffith> winston-d: https://bugs.launchpad.net/cinder/+bug/1023755
16:43:23 <DuncanT> There's always the option of breaking the LVM and hand-building a couple of linear mappings and writing zeros to them
16:43:23 <uvirtbot> Launchpad bug 1023755 in linux "Precise kernel locks up while dd to /dev/mapper files > 1Gb (was: Unable to delete volume)" [Undecided,Confirmed]
16:43:45 <winston-d> jgriffith, thx
16:43:57 <DuncanT> Should have way better performance that way too
16:44:30 <creiht> jgriffith: we zero in an outside job that runs periodically
16:44:42 <jgriffith> creiht: Ohhh..... got ya
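The temporary patch described above amounts to gating the zero-out step on a config flag. A minimal sketch follows, assuming a boolean secure_delete option as named in the discussion; the helper and its dd invocation are illustrative, not the actual Cinder driver code.

    # Minimal sketch, assuming a boolean 'secure_delete' option as discussed
    # above; not the actual Cinder driver code.
    import subprocess

    def clear_volume(dev_path, size_gb, secure_delete=True):
        """Zero the backing device before deleting it, unless disabled."""
        if not secure_delete:
            # Gate/tempest jobs turn this off (via localrc) to dodge the
            # intermittent Precise kernel hang seen while dd-ing /dev/mapper.
            return
        subprocess.check_call([
            'dd', 'if=/dev/zero', 'of=%s' % dev_path,
            'bs=1M', 'count=%d' % (size_gb * 1024), 'conv=fdatasync',
        ])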
16:44:49 <jgriffith> creiht: I don't think it would, sadly
16:45:10 <jgriffith> creiht: The dd to the /dev/mapper device itself seems to be the issue
16:45:34 <jgriffith> I don't think it would matter when that's done, the failures in the tests are MOSTLY because the kernel locks up
16:46:05 <jgriffith> This would still happen, but "other" tests/operations on the system would fail
16:46:22 <jgriffith> and it would be harder to figure out why.... unless I'm missing something
16:46:44 <jgriffith> creiht: although... if you guys aren't seeing this issue maybe there's something to that idea
16:48:19 <creiht> what version of ubuntu are they seeing it on?
16:48:20 <jgriffith> 12.04
16:48:57 <creiht> yeah it is weird that we haven't seen anything like that
16:49:16 <jgriffith> creiht: That is odd....
16:49:22 <creiht> is it specific to how dd writes, or is it just the sequential writes of data?
16:49:28 <creiht> because I don't think we use dd
16:49:33 <jgriffith> OH!!
16:49:45 <jgriffith> Yeah, it definitely seems dd-related
16:49:47 <creiht> we have a python script that zeros
16:49:56 <creiht> I think
16:49:58 <creiht> :)
16:50:08 <jgriffith> But I tried changing it to something like "cp /dev/zero /dev/mapper/xxxx"
16:50:26 <jgriffith> This eventually failed as well
16:50:47 <creiht> I'm not somewhere that I can look at the code right now, but I will see if I can dig a little deeper and report back to you
16:50:55 <jgriffith> creiht: cool
16:51:30 <eharney> jgriffith: have you tried different dd block sizes? maybe the python script just writes with a different pattern
16:51:31 <jgriffith> creiht: It may be worth testing, as you pointed out, just doing direct writes from python in the delete function as well
16:52:00 <jgriffith> eharney: Haven't messed with block sizes too much
16:52:29 <jgriffith> DuncanT: I'd also like to hear more about your proposal as well
16:52:38 <bswartz> jgriffith: don't do "cp /dev/zero /dev/mapper/xxxx", do "cat < /dev/zero > /dev/mapper/xxxx"
16:53:12 <jgriffith> bswartz: ok, I can try it...
16:53:15 <jgriffith> bswartz: thanks
16:53:21 <DuncanT> jgriffith: I'm trying to code it now :-)
16:53:30 <jgriffith> DuncanT: awesome
16:53:56 <jgriffith> So BTW... anybody interested in this, feel free :) I'm open to ideas
16:54:09 <jgriffith> It's just really tough to repro
16:54:28 <jgriffith> You almost have to tear down and rebuild each time
16:54:37 <DuncanT> I can't reproduce, but I don't need to to send you a patch
16:55:09 <jgriffith> cool
16:55:20 <jgriffith> alright... we've beat that dead horse enough for today
16:55:30 <jgriffith> #topic open discussion
16:55:43 <jgriffith> Anybody have anything they want/need to talk about?
16:56:01 <winston-d> nope
16:56:10 <DuncanT> https://blueprints.launchpad.net/cinder/+spec/add-expect-deleted-flag-in-volume-db
16:57:04 <DuncanT> I've a slightly alternative proposal: set the state to 'deleting' in the API
16:57:15 <DuncanT> Matching the 'attaching' state that we already have
16:57:32 <zykes-> Oh, cinder meeting ?
16:57:38 <zykes-> How goes the FC / SAN stuff ?
16:57:39 <bswartz> I would like to introduce rishuagr
16:57:40 <winston-d> DuncanT, don't we have that?
16:57:47 <bswartz> rushiagr*
16:58:20 <DuncanT> winston-d: I'm not sure if cinder has it, I couldn't see it in the code, but I only spent a few seconds looking. If we do have it, then I can't see what the blueprint is about?
16:58:31 <bswartz> Rushi is a member of the NetApp team who is working on cinder full time
16:58:44 <rushiagr> hi all !
16:59:11 <kmartin> zykes-: The FC blueprint is moving through the HP legal system....slowly
16:59:41 <bswartz> I would like Rushi to be added to the cinder core team soon
16:59:52 <winston-d> DuncanT, i think that bp is mainly for billing. they don't want slow zeroing to mess up billing.
17:00:13 <DuncanT> winston-d: Surely you stop billing once it is in the 'deleting' state?
17:00:18 <DuncanT> (we do)
17:00:31 <jgriffith> DuncanT: I would hope so :)
17:00:32 <winston-d> DuncanT, oh, i see your point.
17:00:43 <winston-d> rongze_, ping
17:00:44 <jgriffith> DuncanT: Otherwise you'd better give an option to NOT secure delete :)
17:00:49 <zykes-> kmartin: :/
17:00:55 <creiht> that's the other nice thing about out-of-band delete
17:01:01 <winston-d> creiht, :)
17:01:02 <zykes-> billing, you mean metering ?
17:01:04 <creiht> erm, out-of-band zeroing
17:01:33 <rongze_> hi
17:01:46 <eharney> i haven't talked to some of you guys much yet, but Cinder is also becoming my primary focus... so.. hi :)
17:02:00 <thingee> DuncanT: the api appears to have a delete method for setting the deleting state.
17:02:00 <winston-d> rongze_, DuncanT was talking about your expected-deleted-flag bp.
17:02:07 <jgriffith> eharney: welcome...
17:02:12 <DuncanT> OoB zeroing is a win we've found too, but that is a different question from this blueprint I think?
17:02:18 <thingee> before it calls volume_delete
17:02:19 <jgriffith> Let's get through DuncanT's topic here...
17:02:44 <winston-d> thingee, yes, there is.
17:03:02 <DuncanT> So what is this blueprint proposing? I can't make sense of it
17:03:09 <winston-d> rongze_, DuncanT suggests you stop billing once a volume is in the 'deleting' state, what do you think?
17:03:17 <rongze_> yes
17:03:29 <rongze_> I agree with DuncanT
17:03:46 <DuncanT> So what is the blueprint suggesting?
17:03:49 <jgriffith> rongze_: I thought when we talked about this though the idea was.....
17:04:23 <jgriffith> rongze_: We have the ability to know when it's safe to remove a volume even if the delete operation never quite finished or errored out
17:05:32 <DuncanT> Isn't that 'still in deleting state'?
17:05:58 <jgriffith> DuncanT: yes, but it's the "hung in deleting state" thing that could be solved
17:06:22 <jgriffith> DuncanT: at least that's what I thought we were aiming for
17:06:53 <jgriffith> DuncanT: as it stands right now you can get into that state and you're there forever unless you go in and manipulate the DB by hand
17:07:11 <DuncanT> I think this is a special case of the problem of needing (non-customer-facing) substates for all sorts of hang/lost-message cases... Would it be worth trying to come up with a proposal that covers all of them?
17:07:25 <jgriffith> DuncanT: Perhaps.... yes
17:07:48 <jgriffith> rongze_: Is my interpretation accurate, or did I misunderstand you on this?
17:08:16 <DuncanT> i.e. have a sub-state field that could go 'delete api request dispatched' -> 'delete started on node XXX' -> 'scrubbing finished' -> 'gone'
17:08:41 <DuncanT> Same field can be used for create subtasks, snapshot subtasks, backup subtasks etc
17:08:46 <jgriffith> DuncanT: Yeah, which brings up the new state implementation stuff clayg teased us with :)
17:08:48 <rongze_> What if the instance is deleted?
17:09:00 <jgriffith> rongze_: on delete we don't care...
17:09:10 <jgriffith> rongze_: we're already detached right?
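DuncanT's sub-state idea above amounts to tracking the progress of a long-running operation alongside the customer-facing status, so a volume hung in 'deleting' can be diagnosed or safely cleaned up. A rough sketch of that progression follows; the names and transitions are illustrative only, not an agreed design.

    # Illustrative only: sub-state names and ordering are not an agreed design.
    DELETE_SUBSTATES = (
        'delete_dispatched',   # API accepted the request and cast it to a node
        'delete_started',      # cinder-volume on that node picked it up
        'scrubbing_finished',  # zeroing/secure delete completed
        'gone',                # backing storage released; row safe to purge
    )

    def advance(current):
        """Move to the next sub-state; refuse to skip steps or go backwards."""
        idx = DELETE_SUBSTATES.index(current)
        if idx == len(DELETE_SUBSTATES) - 1:
            raise ValueError("'%s' is already terminal" % current)
        return DELETE_SUBSTATES[idx + 1]

As DuncanT notes, the same field could carry create, snapshot, and backup sub-states as well.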
17:09:53 <jgriffith> DuncanT: I think what you're proposing is the way we want to go, and I believe it's the sort of thing clayg had in mind
17:10:03 <rongze_> I think we can reference instance delete operation
17:10:22 <jgriffith> rongze_: Oh, I see what you mean.. sorry
17:10:31 <jgriffith> Ok...
17:11:02 <jgriffith> #action discuss/clarify blueprint add-expect-deleted-flag-in-volume-db
17:11:12 <jgriffith> We'll pick this up at the top of G2
17:11:20 <jgriffith> Meanwhile...
17:11:35 <jgriffith> rushiagr: welcome
17:11:55 <jgriffith> eharney: welcome to you as well
17:12:09 <jgriffith> rushiagr: eharney Hang out on IRC in #openstack-cinder
17:12:30 <eharney> will do
17:12:30 <jgriffith> Or PM me and we can sync up later
17:12:50 <jgriffith> I'm headed to the airport here and will be travelling today but otherwise....
17:12:55 <jgriffith> kmartin: any FC updates?
17:13:03 <eharney> ok
17:13:38 <kmartin> jgriffith: legal stuff...but that has not stopped us from starting to code
17:14:31 <jgriffith> kmartin: Ok.... please try and get some details added to the BP next week if you can
17:15:53 <jgriffith> Ok... we're over time
17:15:57 <jgriffith> Thanks everyone
17:16:01 <jgriffith> #endmeeting