16:03:05 #startmeeting cinder
16:03:07 Meeting started Wed Nov 14 16:03:05 2012 UTC. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:03:08 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:03:09 The meeting name has been set to 'cinder'
16:03:20 oh I didn't comprehend your message correctly. It appears durgin just signed on at 7:56
16:03:23 hello
16:03:26 there!
16:03:28 thingee: Morning! :)
16:03:52 thingee: after your all nighter you don't stand a chance at comprehending me :)
16:04:00 thingee: I'm confusing enough on a good day
16:04:43 alright, I want to start with G1 status updates
16:04:47 #topic G1 status
16:04:56 #link https://launchpad.net/cinder/+milestone/grizzly-1
16:05:41 This is what we have slated, and remember G1 is next week
16:06:19 :)
16:06:45 Anybody have any concerns about what they're signed up for that I should know about?
16:06:50 I see one of the blueprints mentions splitting the drivers into 1 per file
16:06:55 Need help, blockers, etc?
16:07:01 I would prefer keeping the NetApp drivers together
16:07:44 jgriffith, mine are on track.
16:07:57 winston-d: excellent
16:08:20 bswartz: the first phases of that change have already landed
16:08:27 bswartz: over a week ago
16:08:57 jgriffith: looking pretty good on apiv2. clearer-error-messages is going to be pretty easy too
16:09:16 thingee: awesome, so the power of positive thinking worked :)
16:09:27 thingee: That and no sleep for a night!
16:09:38 jgriffith: I think the change in general is a good idea, but I'm just suggesting that the NetApp drivers be exempted
16:09:55 bswartz: I'm just pointing out that the change already merged
16:10:09 oh wait, I'm looking at the wrong tree
16:10:10 bswartz: https://review.openstack.org/#/c/15000/
16:10:53 okay, it looks like NetApp wasn't affected by that change
16:11:02 jgriffith: I thought apiv2 was targeted for g1?
16:11:09 I will keep an eye out for the next phase of the change
16:11:38 bswartz: ok, we still need to figure out if there needs to be a next phase I suppose, but anyway
16:12:11 thingee: hmmm.... it was, but it was moved last week when we got Chuck's version
16:12:25 thingee: My plan/idea was to get the structure in for G1
16:12:25 ah right
16:12:42 thingee: All the other additions/enhancements will come in G2
16:12:54 thingee: So it's critical that we have everything in place to do that at G1
16:12:56 yeah. I'll let you know once I move the other stuff into separate bps and then retarget?
16:13:09 Sorry, didn't see the time
16:13:09 perfect
16:13:16 DuncanT: NP
16:13:38 So if there are no big concerns about getting these in....
16:13:51 Are there any big concerns about something that should be there that's not?
16:14:44 Yay! I'll take silence as a good thing ;)
16:15:09 #topic service and host info
16:15:30 So something else that came up this week was cinder-manage service list :)
16:15:48 Turns out I was in the process of implementing that functionality in the cinderclient
16:16:11 There may be some concern around doing this and I wanted to get feedback from all of you
16:16:28 i see you are working on a host extension?
16:16:44 winston-d: Yes, but I am planning to change that around a bit
16:16:56 It should fit more with what nova does, I think
16:17:12 and then add a separate service extension for the "other stuff"
16:17:24 jgriffith, ok. i'm fine with that.
16:17:33 TBH I'm not sure of the value in the host extension any longer
16:17:49 hmm
16:17:52 90% of it just raises NotImplemented
16:18:19 winston-d: I'd like to implement the extension and then we can fill in the underlying stuff later
16:18:33 but I want to make sure it's not something that nobody will ever use :)
16:18:59 what was the host extension?
16:19:02 ie host power-actions (reboot a cinder node), set maintenance-window, etc
16:19:08 oh
16:19:12 interesting
16:19:39 creiht: yeah, I started with just a place to put things like "show me all the cinder services, their status and where they're running"
16:19:41 Is there a detailed blueprint for it? I'm only familiar with a very small subset of it
16:19:52 Then I noticed nova had this hosts extension, but it's a bit different
16:20:09 The services/nodes/statuses stuff is definitely useful
16:20:14 DuncanT: Nope, I didn't go into detail with the BP because it existed in Nova
16:20:32 DuncanT: Yeah, but I'm thinking that should be separate from the hosts extension
16:20:58 DuncanT, agree. service/node/status are useful.
16:20:59 Leave the hosts extension in line with what it does in Nova, and add services specifically for checking status, enable/disable, etc
16:21:18 i usually treat that as how the scheduler sees the cluster.
16:21:29 I need to go look at the nova version before I can comment, I guess...
16:21:32 winston-d: explain?
16:21:57 DuncanT: It's sorta funky IMO, mostly because a good portion of it is not implemented
16:22:14 DuncanT: But the idea is that you perform actions on your compute nodes
16:22:41 jgriffith, well, what services are up/down, etc. basically i'd check nova-manage service info when i see something wrong with instance scheduling.
16:22:55 this was a missing part in cinder.
16:22:55 winston-d: ahh... yes, ok
16:23:11 winston-d: So that is what I set out to address with this
16:23:29 winston-d: But I'm thinking now that it might be useful if this had its own extension (service)
16:23:39 I don't think the cinder API is the right place to be rebooting nodes... (nor the nova API, for that matter)
16:23:43 winston-d: There are a number of things that I think could be added into that in the future
16:24:05 DuncanT: yeah, that's something I thought would come up :)
16:24:05 Service info as its own extension sounds like the way to go then...
16:24:08 i agree with DuncanT.
16:24:28 #action jgriffith dump hosts extension for now and implement services ext
16:24:45 Everybody ok with that?
16:24:53 Yup
16:24:58 i'm good.
16:25:04 sounds good
16:25:29 Just for background... there's a push to get things out of the *-manage utilities and into the clients
16:25:43 that's why I didn't just pick up that patch and be done last week :)
16:26:24 jgriffith, that's related to the admin API stuff?
16:26:26 ok... any questions/suggestions on this topic?
16:26:37 I'd ideally like to keep "cinder-manage service list" working direct from the database too, but I won't cry if it disappears (I'll just carry my own version... I only want it for dev systems)
16:26:41 winston-d: yes, they would be admin-only extensions
16:26:57 DuncanT: Well, we can put it in as well.... but it DNE today :)
16:27:38 I'll send a patch to put it in :-)
16:27:54 DuncanT: We can just reactivate the one I rejected this week :)
16:28:05 I think it was from Avishay, but can't remember
16:28:14 I'll let hemna know
16:28:24 kmartin: Ahhh... thanks!!!!
16:28:31 kmartin: Yes, it was hemna!
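[Editor's note: for context, here is a minimal sketch of the kind of service-status reporting being discussed — what nova-manage service list provides today and what a cinder "services" admin extension could expose (which services exist, where they run, enabled/disabled, up/down based on heartbeat age). The class, function, and threshold names below are illustrative assumptions, not the actual Cinder or Nova code.]

```python
import datetime
from dataclasses import dataclass

# Seconds without a heartbeat before a service is reported as "down"
# (illustrative default; a real deployment would read this from cinder.conf).
SERVICE_DOWN_TIME = 60


@dataclass
class ServiceRecord:
    """One row of service state, roughly what a services table would hold."""
    binary: str                     # e.g. "cinder-volume", "cinder-scheduler"
    host: str                       # node the service runs on
    disabled: bool                  # administratively disabled?
    updated_at: datetime.datetime   # last heartbeat written by the service


def service_status(svc: ServiceRecord, now: datetime.datetime = None):
    """Return a (state, status) pair such as ("up", "enabled") for one service."""
    now = now or datetime.datetime.utcnow()
    alive = (now - svc.updated_at).total_seconds() <= SERVICE_DOWN_TIME
    return ("up" if alive else "down",
            "disabled" if svc.disabled else "enabled")


if __name__ == "__main__":
    svc = ServiceRecord(
        binary="cinder-volume",
        host="node1",
        disabled=False,
        updated_at=datetime.datetime.utcnow() - datetime.timedelta(seconds=10),
    )
    print(svc.binary, svc.host, *service_status(svc))
```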
16:28:38 :-)
16:28:43 I'm just not sure about the value in having both, but whatevs
16:29:07 any other thoughts?
16:29:23 jgriffith: he should be in the office shortly, he'll get it done today
16:29:27 i think nova is trying to avoid direct db access.
16:29:44 winston-d: yes, that was my point of rejecting it the first time around
16:29:52 having that in cinder-manage means we are adding direct db access?
16:30:00 winston-d: yup
16:30:14 I think it is a laudable aim, but not needing the endpoint address is handy on dev systems
16:30:29 * jgriffith is hopeful somebody might agree with him on this
16:30:52 last time, when i proposed to add some feature to nova-manage, it was rejected and suggested to do that in novaclient.
16:31:18 Ok, I'm reverting back to my original stance on this
16:31:33 kmartin: don't tell hemna to resubmit please :)
16:31:55 no problem
16:32:00 Fair enough, I'll keep an out-of-tree version for now
16:32:00 lol
16:32:01 DuncanT: we can revisit later, but i hate to put something in there just for dev
16:32:18 DuncanT: Maybe I can just give you a developer's tool that you can use :)
16:32:24 I'm not sure if I am fond of having the manage tools also in the client
16:32:32 i think the rackspace private cloud team has a lot of db access scripts to do management jobs.
16:32:35 but I can understand how it makes certain things easier
16:32:44 they even have some project around that.
16:32:59 creiht: Well the idea is they would be ONLY in the client, if that helps :)
16:33:06 DB access can be really handy when something in the db is stopping your API server from working :-)
16:33:24 DuncanT: yeah, in some cases it's the only option :)
16:33:25 But those types of tools tend to be very site specific, I think
16:33:47 DuncanT: I think it's fair that those are handled by the provider IMO
16:34:15 As I said, it is a trivial enough thing to maintain out-of-tree for now... might bring it up again in six months
16:34:16 DuncanT, yeah, i know. the question is if we want more of that going into cinder.
16:34:58 Ok, I'm going to proceed forward.... people can scream and punch me in the face later if they want :)
16:34:59 ops people may already have much more powerful db scripts to do auditing/monitoring/reaping jobs, i guess.
16:35:07 winston-d: +1
16:35:25 :)
16:35:31 #topic gate test failures
16:35:41 I'd like to bring some of that power into the upstream tree to save reinventing the wheel, but I'm happy to accept that cleaning things up needs to happen first
16:35:57 It just occurred to me this AM that I haven't been updating people on this whole mess
16:36:13 The Ubuntu dd issue...
16:36:30 We continue to see intermittent failures in the gate tests due to this
16:36:58 the kernel folks working on it are making some progress, but it's really becoming a thorn in my side
16:37:01 Soo.....
16:37:02 I tried and failed to reproduce
16:37:15 DuncanT: Yeah, that's what sucks about it
16:37:29 DuncanT: But if you tear down, rebuild enough times you'll hit it
16:37:42 physical or virtual
16:37:59 Anyway, I put in a temporary patch:
16:38:07 Hmmm, have you got a set of backtraces from when it is happening?
16:38:13 DuncanT, check out https://github.com/JCallicoat/pulsar — that's a *nova swiss army knife*.
16:38:20 I added a "secure_delete" flag that is checked on the dd command
16:39:10 winston-d: Cheers
16:39:15 The default is set to True, but in the gate/tempest jobs we set it to False in the localrc file
16:39:25 that works?
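[Editor's note: for context, a rough sketch of the temporary workaround jgriffith describes above — gating the dd-based zeroing behind a boolean secure-delete flag so gate/tempest jobs can disable it via localrc. The flag name, helper name, and exact dd invocation here are assumptions for illustration, not the merged Cinder patch.]

```python
import subprocess

# Illustrative stand-in for a cinder.conf option; the gate/tempest jobs
# would set the equivalent to False (via localrc) to skip the dd entirely.
SECURE_DELETE = True


def clear_volume(device_path: str, size_in_gb: int, block_size_mb: int = 1):
    """Zero out an LVM-backed volume before removing it.

    When the secure-delete flag is off, skip the overwrite: this avoids the
    intermittent dd-to-/dev/mapper kernel lockup seen on Ubuntu 12.04
    (Launchpad bug 1023755), at the cost of not scrubbing the old data.
    """
    if not SECURE_DELETE:
        return
    count = size_in_gb * 1024 // block_size_mb
    subprocess.check_call([
        "dd", "if=/dev/zero", "of=%s" % device_path,
        "bs=%dM" % block_size_mb, "count=%d" % count,
    ])


# Example (hypothetical device path):
# clear_volume("/dev/mapper/cinder--volumes-volume--0001", 1)
```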
16:39:49 This will hopefully keep everybody from saying "Cinder failed Jenkins again"
16:39:56 :)
16:40:09 great
16:40:31 I've been trying some other methods of the secure delete, but they either have the same problem or other severe perf problems
16:41:08 Anyway, I thought I should start keeping everybody up to speed on what's going on with that
16:41:34 I'm still hopeful that one morning I'll find the kernel fairy came by while I slept and have this fixed
16:42:11 otherwise during G2 we'll need to focus on a viable alternative
16:42:29 Any questions/thoughts on this?
16:42:35 jgriffith: would out-of-band zeroing make it better?
16:42:47 creiht: how do you mean?
16:42:50 jgriffith, do you have the ubuntu bug # on this?
16:43:21 winston-d: https://bugs.launchpad.net/cinder/+bug/1023755
16:43:23 There's always the option of breaking the LVM and hand-building a couple of linear mappings and writing zeros to them
16:43:23 Launchpad bug 1023755 in linux "Precise kernel locks up while dd to /dev/mapper files > 1Gb (was: Unable to delete volume)" [Undecided,Confirmed]
16:43:45 jgriffith, thx
16:43:57 Should have way better performance that way too
16:44:30 jgriffith: we zero in an outside job that runs periodically
16:44:42 creiht: Ohhh..... got ya
16:44:49 creiht: I don't think it would, sadly
16:45:10 creiht: The dd to the /dev/mapper device itself seems to be the issue
16:45:34 I don't think it would matter when that's done, the failures in the tests are MOSTLY because the kernel locks up
16:46:05 This would still happen, but "other" tests/operations on the system would fail
16:46:22 and it would be harder to figure out why.... unless I'm missing something
16:46:44 creiht: although... if you guys aren't seeing this issue maybe there's something to that idea
16:48:19 what version of ubuntu are they seeing it on?
16:48:20 12.04
16:48:57 yeah, it is weird that we haven't seen anything like that
16:49:16 creiht: That is odd....
16:49:22 is it specific to how dd writes, or is it just the sequential writes of data?
16:49:28 because I don't think we use dd
16:49:33 OH!!
16:49:45 Yeah, it definitely seems dd related
16:49:47 we have a python script that zeros
16:49:56 I think
16:49:58 :)
16:50:08 But I tried changing it to something like "cp /dev/zero /dev/mapper/xxxx"
16:50:26 This eventually failed as well
16:50:47 I'm not somewhere that I can look at the code right now, but I will see if I can dig a little deeper and report back to you
16:50:55 creiht: cool
16:51:30 jgriffith: have you tried different dd block sizes? maybe the python script just writes with a different pattern
16:51:31 creiht: It may be worth testing, as you pointed out, just doing direct access writes from python in the delete function as well
16:52:00 eharney: Haven't messed with block sizes too much
16:52:29 DuncanT: I'd also like to hear more about your proposal as well
16:52:38 jgriffith: don't do "cp /dev/zero /dev/mapper/xxxx", do "cat < /dev/zero > /dev/mapper/xxxx"
16:53:12 bswartz: ok, I can try it...
16:53:15 bswartz: thanks
16:53:21 jgriffith: I'm trying to code it now :-)
16:53:30 DuncanT: awesome
16:53:56 So BTW... anybody interested in this, feel free :) I'm open to ideas
16:54:09 It's just really tough to repro
16:54:28 You almost have to tear down and rebuild each time
16:54:37 I can't reproduce, but I don't need to to send you a patch
16:55:09 cool
16:55:20 alright... we've beat that dead horse enough for today
16:55:30 #topic open discussion
16:55:43 Anybody have anything they want/need to talk about?
16:56:01 nope
16:56:10 https://blueprints.launchpad.net/cinder/+spec/add-expect-deleted-flag-in-volume-db
16:57:04 I've a slightly alternative proposal: set the state to 'deleting' in the API
16:57:15 Match the 'attaching' that we already have
16:57:32 Oh, cinder meeting?
16:57:38 How goes the FC / SAN stuff?
16:57:39 I would like to introduce rishuagr
16:57:40 DuncanT, don't we have that?
16:57:47 rushiagr*
16:58:20 winston-d: I'm not sure if cinder has it, I couldn't see it in the code, but I only spent a few seconds looking. If we have it, then I can't see what the blueprint is about?
16:58:31 Rushi is a member of the NetApp team who is working on cinder full time
16:58:44 hi all!
16:59:11 zykes-: The FC blueprint is moving through the HP legal system....slowly
16:59:41 I would like Rushi to be added to the cinder core team soon
16:59:52 DuncanT, i think that bp is mainly for billing. they don't want slow zeroing to mess up billing.
17:00:13 winston-d: Surely you stop billing once it is in the 'deleting' state?
17:00:18 (we do)
17:00:31 DuncanT: I would hope so :)
17:00:32 DuncanT, oh, i see your point.
17:00:43 rongze_, ping
17:00:44 DuncanT: Otherwise you'd better give an option to NOT secure delete :)
17:00:49 kmartin: :/
17:00:55 that's the other nice thing about out-of-band delete
17:01:01 creiht, :)
17:01:02 billing, you mean metering?
17:01:04 erm, out-of-band zeroing
17:01:33 hi
17:01:46 i haven't talked to some of you guys much yet, but Cinder is also becoming my primary focus... so.. hi :)
17:02:00 DuncanT: the api appears to have a delete method for setting the deleting state.
17:02:00 rongze_, DuncanT was talking about your expected-deleted-flag bp.
17:02:07 eharney: welcome...
17:02:12 OoB zeroing is a win we've found too, but that is a different question to this blueprint I think?
17:02:18 before it calls volume_delete
17:02:19 Let's get through DuncanT's topic here...
17:02:44 thingee, yes, there is.
17:03:02 So what is this blueprint proposing? I can't make sense of it
17:03:09 rongze_, and DuncanT suggests you stop billing when there's a 'deleting' state, what do you think?
17:03:17 yes
17:03:29 I agree, DuncanT
17:03:46 So what is the blueprint suggesting?
17:03:49 rongze_: I thought when we talked about this though the idea was.....
17:04:23 rongze_: We have the ability to know when it's safe to remove a volume even if the delete operation never quite finished or errored out
17:05:32 Isn't that 'still in deleting state'?
17:05:58 DuncanT: yes, but it's the "hung in deleting state" thing that could be solved
17:06:22 DuncanT: at least that's what I thought we were aiming for
17:06:53 DuncanT: as it stands right now you can get into that state and you're there forever unless you go in and manipulate the DB by hand
17:07:11 I think this is a special case of the problem of needing (non-customer-facing) substates for all sorts of hang/lost-message cases... Would it be worth trying to come up with a proposal that covers all of them?
17:07:25 DuncanT: Perhaps.... yes
17:07:48 rongze_: Is my interpretation accurate, or did I misunderstand you on this?
17:08:16 i.e. have a sub-state field that could go 'delete api request dispatched' -> 'delete started on node XXX' -> 'scrubbing finished' -> 'gone'
17:08:41 Same field can be used for create subtasks, snapshot subtasks, backup subtasks etc
17:08:46 DuncanT: Yeah, which brings up the new state implementation stuff clayg teased us with :)
17:08:48 What if the instance is deleted?
17:09:00 rongze_: on delete we don't care...
17:09:10 rongze_: we're already detached, right?
17:09:53 DuncanT: I think what you're proposing is the way we want to go, and I believe it's the sort of thing clayg had in mind
17:10:03 I think we can reference the instance delete operation
17:10:22 rongze_: Oh, I see what you mean.. sorry
17:10:31 Ok...
17:11:02 #action discuss/clarify blueprint add-expect-deleted-flag-in-volume-db
17:11:12 We'll pick this up at the top of G2
17:11:20 Meanwhile...
17:11:35 rushiagr: welcome
17:11:55 eharney: welcome to you as well
17:12:09 rushiagr: eharney: Hang out on IRC in #openstack-cinder
17:12:30 will do
17:12:30 Or PM me and we can sync up later
17:12:50 I'm headed to the airport here and will be travelling today, but otherwise....
17:12:55 kmartin: any FC updates?
17:13:03 ok
17:13:38 jgriffith: legal stuff... but that has not stopped us from starting to code
17:14:31 kmartin: Ok.... please try and get some details added to the BP next week if you can
17:15:53 Ok... we're over time
17:15:57 Thanks everyone
17:16:01 #endmeeting
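[Editor's note: for reference after the log, a hedged sketch of the sub-state idea DuncanT floated during open discussion: the user-visible status stays 'deleting' (so billing can stop immediately), while an internal sub-state records how far the delete actually got, making "hung in deleting" volumes identifiable without hand-editing the database. The field and value names below are illustrative only, not an agreed design.]

```python
import enum


class DeleteSubState(enum.Enum):
    """Internal progress markers for a volume whose visible status is 'deleting'."""
    API_DISPATCHED = "delete api request dispatched"
    STARTED_ON_NODE = "delete started on node"
    SCRUB_FINISHED = "scrubbing finished"
    GONE = "gone"


def safe_to_reap(sub_state: DeleteSubState) -> bool:
    """A volume whose scrub finished can be removed even if the final
    bookkeeping step was lost; anything earlier needs investigation first."""
    return sub_state in (DeleteSubState.SCRUB_FINISHED, DeleteSubState.GONE)


# The same kind of field could carry create/snapshot/backup sub-states,
# while the customer-facing status (and the billing cut-off) stays 'deleting'.
```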