16:00:42 #startmeeting cinder
16:00:43 Meeting started Wed Oct 30 16:00:42 2013 UTC and is due to finish in 60 minutes. The chair is jgriffith. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:44 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:46 yeah!!
16:00:46 The meeting name has been set to 'cinder'
16:00:49 o/
16:00:52 dosaboy:
16:00:53 Hey
16:01:07 just on time :)
16:01:08 dosaboy: with us?
16:01:11 rushiagr: :)
16:01:14 oh hey yes
16:01:16 Glad you shouted, I'd totally forgotten the clocks change altered the meeting time
16:01:16 hi
16:01:20 goddam DST ;)
16:01:28 #topic backup support for metadata
16:01:30 dosaboy: LOL
16:01:34 Howdy all!
16:01:38 jungleboyj: yo
16:01:40 ok so
16:01:55 https://wiki.openstack.org/wiki/CinderMeetings
16:02:09 duncanT: it's ok, the rest of us will show up at the wrong time next week
16:02:25 hmm, or the week after I suppose
16:02:31 guitarzan: +2
16:02:32 so i had a few discussions now about backup metadata
16:02:42 a few opinions flying around
16:02:49 but
16:02:59 i think the best way forward is as follows
16:03:16 each backup driver will back up a set of volume metadata
16:03:27 that 'set' of metadata will come from a common api
16:03:34 presented to all drivers
16:03:38 which will be versioned
16:03:56 this will allow for the volume to be recreated from scratch
16:04:02 should the db/cinder cluster get lost
16:04:04 dosaboy: why back up the metadata at all?
16:04:11 (some caveats tbd)
16:04:20 dosaboy: ahh... db recovery
16:04:21 dosaboy: i noted on the BP that you need an import_backup too
16:04:23 well (cue DuncanT)
16:04:33 avishay
16:04:35 yes
16:04:41 I have created a separate BP for that
16:04:50 basically all this back stuff is mushrooming a bit ;)
16:04:52 Oh OK, cool - please link the BPs
16:04:55 back/backup
16:05:09 yeah sorry i am all over the shop this week
16:05:11 The vision I've always tried to keep for backup is that it *is* for disaster recovery
16:05:15 trying to keep up
16:05:19 duncanT: DR of what though?
16:05:28 dosaboy: how would this work if we allowed the Volume Drivers to do backups?
16:05:29 duncanT: it's not an HA implementation of Cinder
16:05:35 duncanT: at least I don't think it should be
16:05:48 jgriffith: Cidner volumes. Even if your cinder cluster caught frie orr got stolen, you can still get you volume(s) back
16:06:03 Gah, typing fail
16:06:06 dosaboy: having the Volume Driver do the backup would be more efficient when the drives are external, but we need a definitive format.
16:06:07 duncanT: if it's cinder volumes I ask again, why even back up metadata
16:06:08 jgriffith: if you backup to a remote site and you lose your entire cinder, your backups should remain usable
16:06:19 caitlin56: not sure what you mean
16:06:20 caitlin56: beside the point right now
16:06:27 duncanT: So, the data about what volumes there were?
16:06:33 avishay: that's Cinder DR not volume backup
16:06:44 jgriffith: because certain volumes are useless without at least some metadata (e.g. the bootable flags and glance metadata for licensing)
16:06:46 what I'm saying is that they're two very different things
16:06:53 jgriffith: everything's connected :)
16:06:57 duncanT: ok.. I'm going to try this one more time
16:07:06 avishay: duncanT
16:07:07 first:
16:07:19 avishay: Oh, ok, backup of the volumes is separate and then this backs up the data for accessing them. Right?
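To make the proposal above concrete (each backup driver storing a common, versioned set of volume metadata supplied by one API), a minimal sketch could look like the following. The function and field names are illustrative assumptions, not the actual Cinder code.

    # A minimal sketch of the "common, versioned metadata set" idea described
    # above. Names here are hypothetical illustrations.

    METADATA_VERSION = 1  # bump whenever the exported field set changes


    def export_volume_metadata(volume):
        """Build the versioned metadata set a backup driver would store
        alongside the backed-up volume data.

        `volume` is assumed to be a dict-like volume record; only fields
        needed to recreate a usable volume (e.g. the bootable flag and the
        glance image metadata mentioned in the discussion) are included.
        """
        return {
            'version': METADATA_VERSION,
            'volume': {
                'display_name': volume.get('display_name'),
                'size': volume.get('size'),
                'bootable': volume.get('bootable', False),
            },
            'volume_metadata': volume.get('metadata', {}),
            'volume_glance_metadata': volume.get('glance_metadata', {}),
        }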
16:07:22 My thought regarding the purpose of the volume backup service is to back up volumes
16:07:41 what you're proposing now is bleeding over into the db contents
16:07:45 however...
16:07:57 if you're going to do that, then I would argue that you have to go all the way
16:08:02 jgriffith: since we need to back up the metadata, we could just shove it in the backend and effectively get DR for free
16:08:11 in other words just backing up the meta is only part of the story
16:08:21 it is not much effort to get that done
16:08:26 dosaboy: ummm... I don't think it's that simple honestly
16:08:29 o/
16:08:34 dosaboy: quotas, limits etc etc
16:08:41 well,
16:08:41 all of those things exist in the DB
16:08:48 dosaboy: it's DR with high RPO and RTO
16:08:48 snapshots
16:08:50 quotas, limits etc I don't think are part of the volume
16:08:54 that is where my versioned api comes in
16:08:57 so
16:09:01 duncanT: neither is metadata
16:09:10 the idea is that we define a sufficient set of metadata
16:09:13 if you want to recover from your whole cinder going up in smoke you need to mirror the whole cinder DB
16:09:15 But the stuff needed to use the volume *is* part of the volume
16:09:18 Backing up the metadata with the data is relatively easy, it's standardizing it and being compatible with existing backups that takes work.
16:09:22 I honestly think this makes things WAY more complicated than it should be
16:09:37 caitlin56: hence the versioning
16:09:38 caitlin56: yes
16:09:40 if you want cinder DR then implement an HA cinder setup
16:09:44 bswartz: The backup API allows you to choose (and pay for in certain cases) a safe, cold storage copy of your volume
16:09:54 if you want to back up databases, back them up using a backup service
16:10:09 I *don't* want to back up the database
16:10:10 caitlin56: but can we 'backup' the metadata in db?
16:10:10 Backing up the CinderDB means that you would be restoring *all* volumes.
16:10:19 duncanT: but that's your argument here
16:10:28 the only reason to back up metadata is if the db is lost
16:10:30 jgriffith: No it isn't
16:10:37 (my argument)
16:10:44 ok... then why back up the metadata at all?
16:10:48 If you want to restore selective volumes then you need selective metadata (or no metadata, which is what jgriffith is arguing).
16:11:07 Right, I want a backup to be a disaster-resistant copy of a volume.
16:11:13 duncanT: what's the logic to backing up the metadata?
16:11:13 if you're not worried about the database going away, then there's no point to making more copies of the metadata
16:11:20 Including everything you need to use *that volume*
16:11:29 Not *all volumes*
16:11:34 Not *cinder config*
16:11:43 Just the volume I've said is important
16:11:53 Otherwise use a snapshot
16:12:00 You're still not really answering my question
16:12:05 bswartz: yes, there is. we 'snapshot' the metadata, in DB
16:12:18 duncanT: can you cite some specific metadata fields that you would not know how to set when restoring just the volume payload?
16:12:19 bswartz: +1
16:12:30 bswartz: which is my whole point
16:12:42 bswartz: and what I'm trying to get duncanT to explain
16:12:43 caitlin56: The bootable flag. The licensing info held in the glance metadata
16:13:00 introducing things like "disaster resistant" isn't very helpful to me :)
16:13:02 winston-d: needs to be consistent
16:13:04 winston-d: you're imagining that the metadata might change and you want to restore it from an old copy?
16:13:07 that's a bit subjective
16:13:16 I'd like to be able to import a backup into a clean cinder install
16:13:18 bswartz: correct
16:13:26 duncanT: ahh... that's VERY different!
16:13:34 duncanT: that's volume import or migration
16:13:42 that's NOT volume backup
16:13:43 winston-d: seems reasonable
16:13:47 jgriffith: No, it's backup and restore
16:13:52 no, it's not migration
16:13:58 jgriffith: Even if cinder dies, catches fire etc
16:14:01 * dosaboy is sitting on the fence whistling
16:14:05 Does anyone have examples of metadata that SHOULD NOT be restored when you migrate a volume from one site to another?
16:14:09 jgriffith: The backup should be enough to get my working volume back
16:14:12 migration is within one openstack install
16:14:41 I put that on my first ever backup slide, and I propose to keep it there
16:14:43 avishay: yeah...
16:14:59 avishay: duncanT alright we're obviously not going to agree here
16:15:03 * jungleboyj is enjoying the show.
16:15:05 avishay: well maybe you and I will
16:15:12 haha
16:15:25 avishay: backup is within one cinder install as well, no?
16:15:25 duncanT: fine, so you want a "cinder-volume service" backup
16:15:30 i'm with duncanT on this one
16:15:46 winston-d: i can back up to wherever i want - think geo-distributed swift even
16:15:46 haha
16:15:53 avishay: +1
16:15:56 WTF?
16:16:01 jgriffith: I just want the volume I back up to come back, even if cinder caught fire in the meantime
16:16:09 right
16:16:10 * bswartz is nervou about conflating backup/restore use cases with DR use cases
16:16:21 nervous*
16:16:22 the key is "cinder caught fire"
16:16:26 if i back up to geo-distributed swift, and a meteor hits my datacenter, i can rebuild and point my new metadata to existing swift objects
16:16:27 What is a backup? If it is not enough to restore a volume to a new context then why not just replicate snapshots?
16:16:34 sorry.. but you're saying "backup as a service" in openstack as a whole IMO
16:16:34 backup has been DR since day one on the design spec
16:16:50 You're not talking about restoring a volume anymore
16:17:09 avishay, duncanT don't forget to back up volume types and qos along with volume and metadata
16:17:09 where is that leap coming from?
16:17:12 you're talking about all the little nooks and crannies that your specific install/implementation may require
16:17:14 duncanT: I don't agree that taking backups is the best way to implement DR -- it's *a* way, but a relatively poor one
16:17:21 backup and replication are on the same scale, with different RPO/RTO and fail-over methods
16:17:25 and what's worse is you're saying "I only care about metadata"
16:17:26 avishay: duncanT 'cos those two are considered to be 'metadata' to the volume as well
16:17:32 The problem is, when I first wrote backup, we didn't have bootable flags, or volume encryption... you got the bits back onto an iscsi target and you were back in business
16:17:33 but somebody else says "but I care about quotas"
16:17:38 winston-d has an interesting point
16:17:40 and somebody else cares about something else
16:17:44 Glance metadata was the first bug
16:17:47 the real value of backups is when you don't have a disaster, but you've corrupted your data somehow
16:17:57 It doesn't end until you back up all of cinder and the db
16:17:59 winston-d: but i can put the data into a volume with different qos and it still works
16:18:01 Types or rate limits don't stop me using a volume
16:18:12 bswartz: isn't that a disaster? :)
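The versioning matters most on the restore side raised above (recreating a volume on a clean cinder install, or after the meteor/datacenter-fire scenario). A hedged counterpart to the export sketch earlier, again with hypothetical names, could be:

    # A restore path that refuses metadata formats it does not understand,
    # which is what the versioning discussed here buys when importing a
    # backup into a clean install. Names are hypothetical.

    def restore_volume_metadata(meta, volume_update_func):
        """Apply a backed-up metadata set to a freshly restored volume.

        `volume_update_func(fields)` is assumed to persist the given fields
        on the new volume record (e.g. via the Cinder DB API).
        """
        if meta.get('version') != 1:
            # Unknown (future) format: fail loudly rather than guess.
            raise ValueError('unsupported backup metadata version: %r'
                             % meta.get('version'))

        fields = dict(meta['volume'])
        fields['metadata'] = meta.get('volume_metadata', {})
        fields['glance_metadata'] = meta.get('volume_glance_metadata', {})
        volume_update_func(fields)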
16:18:15 duncanT: they don't stop YOU
16:18:17 that's the key
16:18:26 they may stop others though depending on IMPL
16:18:40 guitarzan: no -- that's a snafu
16:18:42 jgriffith: They don't stop a customer...
16:18:49 duncanT: they don't stop your customers
16:18:52 duncanT: they stop mine
16:19:09 duncanT: I have specific heat jobs that require volumes of certain types
16:19:10 there's a difference between users screwing up their own data, and a service provider having an outage
16:19:14 jgriffith: The point is, right now, even if you've got your backup of a bootable volume, it is useless if cinder loses stuff out of the DB
16:19:14 avishay but that's not the original volume (not data) any more.
16:19:28 duncanT: perhaps
16:19:32 avishay: i mean data is, volume is not
16:19:39 Would it always make sense when restoring a single volume to a new datacenter to preserve the prior QoS/quotas/etc.?
16:19:39 winston-d: that's philosophical :P
16:19:48 jgriffith: You can't restore it in such a way that you can boot from it. At all.
16:19:53 all this could be accounted for by simply defining a required metadata set
16:19:58 duncanT: but the purpose of the backup IMO is if your backend takes a dump a user can get his data back, or as bswartz pointed out a user does "rm -rf /*"
16:20:06 i don't see why that would be so complex
16:20:07 dosaboy: +1
16:20:16 it's all of these use cases
16:20:19 dosaboy: I don't disagree with that
16:20:36 deliberating whether this or that metadata is required is not really for this conversation
16:20:39 jgriffith: Right now, I CAN'T GET MY BOOTABLE VOLUME TO BOOT
16:20:43 it's rm -rf, it's a fire, it's a meteor
16:20:43 It just can't be done
16:20:45 dosaboy: agreed, if we are going to back up metadata, we need to define filters on the metadata so only things that should be kept are.
16:20:55 duncanT: keep yelling
16:21:04 duncanT: I'll keep ignoring :)
16:21:12 that's the bug, you can't boot a restored backup
16:21:14 jgriffith: accidental caps lock ;)
16:21:26 jgriffith: Sorry, I was out of line a touch there
16:21:28 avishay: haha... I don't think that's the case
16:21:36 jgriffith: be optimistic :)
16:21:45 dosaboy: so like I said in IRC the other day....
16:22:01 yarp
16:22:03 dosaboy: I'm fine with it being implemented, I could care less
16:22:03 i'd like to consider the volume as a virtual hard drive.
16:22:03 jgriffith: But a snapshot covers the rm -rf case
16:22:18 dosaboy: I have the info in my DB so I really don't care
16:22:37 dosaboy: If you're an idiot and you don't back up your db then hey.. this at least will help you
16:22:49 dosaboy: but something else is going to bite you in the ass later
16:22:54 whatever it takes to back up a virtual hard drive, that's what we should do in cinder backup.
16:22:56 Allowing snapshot replication would deal with disaster recovery issues, but not with porting a volume to a new vendor.
16:23:03 damn it, i'm back on the fence again
16:23:28 i kinda think the only way to resolve this is to have a vote
16:23:33 dosaboy: I also think that things like metadata would be good in an "export/import" api
16:23:48 dosaboy: democracy doesn't work in these situations i think :)
16:23:50 I missed the beginning of this convo. Why are people opposed to it restoring metadata?
16:23:52 dosaboy: duncanT but like I said, it doesn't *hurt* me if you put metadata there
16:23:54 lol
16:23:59 I totally agree that they are part of export too
16:24:08 And transfer for that matter
16:24:23 ok so, i have implemented a chunk of this,
16:24:23 what is "volume export"?
16:24:30 Certain volumes are literally useless if you lose certain bits of their metadata
16:24:33 why don't i see if i can knock up the rest
16:24:38 and then if you like...
16:24:42 dosaboy: I do think someone should better define the purpose of cinder-backup though
16:24:50 dosaboy: that's fine by me
16:24:52 thingee: i think we are discussing where to save the copy of metadata for a volume backup, in the DB or in Swift/Ceph/something else
16:24:53 jgriffith: totally agree
16:24:57 winston-d: if we think of a volume as a virtual hard drive, can we export it as a package, like OVF? it contains metadata
16:25:00 dosaboy: like I said, I won't object to backing it up at all
16:25:00 jgriffith: +1
16:25:09 dosaboy: but I don't want to have misleading expectations
16:25:10 i would have asked for a session if i had not confirmed HK so late
16:25:21 dosaboy: this is nowhere near a Cinder DR recovery backup
16:25:25 and I don't want to make it one
16:25:35 errr...
16:25:38 s/recover//
16:25:43 there are many a strong opinion on this one ;)
16:25:43 winston-d: I think we've discussed before to leave it to the object store.
16:25:46 jgriffith: we're debating what a backup is good for.
16:25:51 as object store metadata
16:26:07 jgriffith: what is "volume export"?
16:26:27 avishay: non-existent :)
16:26:44 jgriffith: it looks like i'm leading the "volume import" session, so thought I should know :)
16:26:53 avishay: the idea/proposal was to be able to kick out volumes from Cinder without deleting them off the backend
16:27:00 put it this way, as long as the necessary metadata is backed up (either way)
16:27:03 jgriffith: ah OK
16:27:03 and then obviously an import to pull in existing volumes
16:27:03 no one gets hurt
16:27:08 thingee: yeah, but as i said to dosaboy the other day, i missed that discussion
16:27:11 dosaboy: agreed
16:27:32 so i'll keep going on the track i'm on
16:27:47 and we can take a look at what i get done
16:27:51 jgriffith: quotas are tricky on export
16:27:53 if we don't like it then fine
16:28:02 dosaboy: are you still storing it in object store metadata?
16:28:09 thingee: yes
16:28:12 avishay: indeed
16:28:18 but each driver will have its own way
16:28:20 avishay: I'd suggest that once something is kicked out of cinder, then it can't take up a cinder quota?
16:28:22 avishay: and types, extra specs, qos?
16:28:23 dosaboy: be sure to escape the metadata then.
16:28:32 duncanT: +1
16:28:35 duncanT: but it still takes up space on disk
16:28:54 avishay: yeah so that's the counter, however you kicked it out
16:28:58 avishay: Not cinder's problem once you've explicitly decided it isn't cinder's problem
16:29:06 avishay: that tenant can't 'access' via cinder anymore
16:29:17 avishay: it's troubling....
16:29:27 jgriffith: yes. is export needed?
16:29:29 avishay: I'm having the same dilemma with adding a purge call to the API
16:29:39 Quotas are tricky, because they are implicitly part of the cinder context where the backup was made.
16:29:47 avishay: well that's another good question :)
16:29:48 ooh i had not thought of that
16:30:03 so how do backups count towards quota atm?
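For the Swift-backed approach mentioned above (storing the serialized metadata as object store metadata, with escaping), a rough sketch using python-swiftclient might look like this; the container, object, and header names are assumptions:

    # Serialize the metadata set, escape it (base64 here) so it is
    # header-safe, and attach it as an X-Object-Meta-* header on the
    # backup object. An illustrative sketch, not the actual driver code.

    import base64
    import json

    from swiftclient import client as swift_client


    def put_backup_with_metadata(conn, container, obj_name, data, volume_meta):
        """Upload a backup object and attach the serialized volume metadata.

        `conn` is assumed to be a swiftclient Connection and `volume_meta`
        a dict such as the one built by export_volume_metadata() above.
        """
        blob = base64.b64encode(json.dumps(volume_meta).encode()).decode()
        headers = {'X-Object-Meta-Volume-Meta': blob}
        conn.put_object(container, obj_name, contents=data, headers=headers)


    # Example usage (endpoint and credentials are placeholders):
    # conn = swift_client.Connection(authurl='http://swift:8080/auth/v1.0',
    #                                user='account:user', key='secret')
    # put_backup_with_metadata(conn, 'volumebackups', 'backup_0001', data,
    #                          export_volume_metadata(volume))

Note that Swift caps the size of metadata header values, so a real driver would likely need to split large blobs or store them in a separate object, and other backends (Ceph, TSM) would each need their own equivalent, which is why each driver "will have its own way".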
16:30:06 wait...i'm sorry i forked the conversation
16:30:14 people are getting confused
16:30:14 dosaboy: Currently they don't
16:30:24 * rushiagr definitely is
16:30:27 we moved to talk about "volume export" without finishing backup metadata
16:30:29 and i presume that is bad?
16:30:34 * jgriffith went down the fork nobody else is on
16:30:44 haha
16:30:50 dosaboy: backups should count in your *object* storage quota, not cinder.
16:31:12 i think we can hash out export at the session. i think we should also find time to talk about backup metadata.
16:31:17 caitlin56: but they don't necessarily go to an object store
16:31:33 like if you offload to TSM
16:31:51 dosaboy: when the backup is to an object store, we should let that object store track/report the data consumption.
16:32:07 caitlin56: +1
16:32:09 caitlin56: That makes sense.
16:32:22 Cinder can't run the whole world.
16:32:29 I don't want this problem where multiple projects are tracking the same resource quota again.
16:32:30 ...yet.
16:32:42 anyway i don't wanna hijack this meeting anymore
16:33:29 i think quotas in backup are a discussion to be had though
16:33:30 dosaboy: too bad :)
16:33:37 dosaboy: duncanT: can you come up with a set of metadata to back up and we'll discuss in person next week? we can do an ad-hoc session? jgriffith sound good?
16:33:46 sure
16:33:50 Sure
16:34:04 * jungleboyj is sorry he is going to miss that discussion. The IRC version was fun!
16:34:08 avishay: sure... but like I said, if dosaboy just wants to implement backup of metadata with a volume I have no objection
16:34:19 we can all sit around a nice campfire
16:34:19 so it would be a short conversation on my part :)
16:34:25 ok :)
16:34:25 DuncanT can make the cocoa
16:34:27 no way!!!!
16:34:32 dosaboy: +1
16:34:33 I know duncanT would push me into the fire
16:34:39 haha
16:34:39 that way we know if duncanT is yelling or just hit the caps lock ;)
16:34:42 hahah
16:34:49 :-)
16:35:07 don't worry, i'll bring the whiskey
16:35:16 dosaboy: +2
16:35:19 alright...
16:35:21 well that was fun
16:35:33 Is Ehud around?
16:35:37 yes
16:35:44 EhudTrainin: welcome
16:35:49 hi
16:35:51 #topic fencing
16:36:21 https://blueprints.launchpad.net/cinder/+spec/fencing-and-unfencing
16:36:27 for those that haven't seen it ^^
16:36:33 EhudTrainin: I'll let you kick it off
16:37:07 It is about adding fencing functionality to Cinder
16:37:24 in order to support HA for instances
16:38:08 blueprint explains it pretty well
16:38:15 by ensuring the instances on a failed host would not try to access the storage after they are rebuilt on another host
16:38:35 avishay: +1
16:38:46 My concerns here are the failure cases... partitioned storage etc...
16:38:47 avishay: +1, nice write-up in the bp. haven't seen one like that for quite a while.
16:38:49 +1 for beautifully explaining in the BP
16:38:50 yeah, the bp is well written (nice job on that by the way)
16:39:21 duncanT: please elaborate
16:39:41 It's easy to say it's hard for nova so cinder should do it, but exactly the same problems exist in cinder failure cases... like cinder loses communication with a storage backend
16:40:13 EhudTrainin: i'm thinking about how cinder would identify those attachment sessions for a volume. seems we need to prevent such race condition issues.. IMO
16:40:20 it's not hard for nova - it's impossible for nova. the server is in a bad state and we can't trust it to do the right thing.
16:40:21 I guess my only real question was:
16:40:22 fencing ?
16:40:37 1. why would the compute host try and access the storage?
16:40:44 2. why do we necessarily care?
16:41:03 I should clarify before somebody flips out...
16:41:18 I think that if the storage does not respond then the fence will fail, but there is a lower probability of both the host and the storage failing at the same time
16:41:25 If a compute host *fails* and instances are migrated
16:41:43 it should IMO be up to nova to disable the attachments on the *failed* compute host
16:41:50 jgriffith: it might not fail completely - it might just lose connectivity or go into some other bad state
16:41:58 avishay: sure
16:42:12 avishay: but it migrated instances right?
16:42:13 shouldn't nova deal with detaching and reattaching elsewhere ?
16:42:22 hemna: that's kinda what I was saying
16:42:30 I agree with jgriffith. it should be up to whatever is doing the migration.
16:42:42 wait
16:42:45 since Nova should know that it had to migrate the instance to another host, it has the knowledge of the state
16:42:46 we could really screw some things up if we make incorrect decisions
16:42:53 But currently there is no way of telling cinder 'make sure this is teally detached', I don't think
16:42:58 EhudTrainin: if nova brings up the instance on another VM, does it have the same instance ID?
16:43:02 *totally detached
16:43:18 This is not exactly migration, but a rebuild, since the host from Nova's point of view has failed
16:43:25 jgriffith: I agree, nova knows the client state accurately. It should deal with the results of that changing.
16:43:25 jgriffith: migrating the instance may or may not stop the old one from connecting to cinder
16:43:34 avishay, I'd assume since it's a rebuild, it would be a new instance id
16:43:48 winston-d: yeah... but I'm saying it *should*
16:43:50 sounds like we expect the same piece of code that may fail to migrate an instance to be sane enough to ensure a fence
16:43:51 could be a poor assumption though
16:44:10 dosaboy: that could be double un-good
16:44:31 yeah, i may be missing how this would be done though
16:44:33 the use case is that the nova server is not responsive but the VM continues to run and access the storage
16:44:43 It may rebuild with the same IP and attach it to the same volume
16:44:46 avishay: ahhh
16:44:46 dosaboy: could you propose something where we are protecting the volume rather than doing nova's work for it?
16:45:02 So if a compute node goes wonky, when is it safe to reattach a volume that was attached to that host to another instance?
16:45:03 avishay: so rogue VMs that nova can't get to anymore
16:45:12 jgriffith: yes
16:45:14 duncanT: never probably
16:45:26 avishay: so who does the fencing?
16:45:30 but now i'm thinking that maybe this is only a problem if we have multi-attach?
16:45:30 jgriffith: Indeed. I think the idea of fence is 'make it safe to do that'
16:45:34 jgriffith: i assume the admin
16:45:39 avishay: ie who makes the call
16:45:48 avishay: and why not just send a detach/disconnect
16:45:52 rogue VMs aren't something we should solve - at most we should protect the volume from rogue VMs.
16:46:00 caitlin56: not sure what you mean there
16:46:03 jgriffith, +1
16:46:22 caitlin56: The idea here *is* to protect the volume from a rogue VM
16:46:26 caitlin56: how does cinder know what a rogue vm is though?
16:46:27 jgriffith: detach failed
16:46:34 I guess it is a disconnect on steroids
16:46:48 jgriffith: 'cos nova compute is not reachable
16:46:58 dosaboy: exactly, we don't want cinder falsely deciding a VM is rogue.
16:47:00 dosaboy, cinder doesn't know. only nova does
16:47:06 duncanT: yeah, I'm assuming that's what the implementation would basically be here
16:47:09 caitlin56: ah gotcha
16:47:34 I think nova should drive this and I'm not sure what cinder needs to do during the fencing process. nova should detach from the rogue vm
16:47:36 Ok... so interesting scenarios
16:47:39 EhudTrainin: want to comment?
16:47:40 here's my take...
16:47:57 hemna: +1 nova should be driving this
16:48:01 cinder does not have enough information
16:48:05 hemna: If the compute node stops talking, nova can't do the detach from the VM
16:48:11 if you want to implement a service/admin API call to force disconnect from a node and ban a node I *think* that's ok
16:48:17 hemna: what if nova failed to detach the volume? the only hope is to beg cinder to help
16:48:21 An instance may be rogue when there is no connection to the nova-compute of its host, but in the future further indications may be used to decide a host has failed
16:48:29 quite honestly I'm worried about the bugs that will be logged due to operator/admin error though
16:48:30 hemna: cinder is on the end of the connection
16:48:38 winston-d, cinder failed in that case as well no ?
16:48:53 winston-d: EhudTrainin hemna avishay duncanT caitlin56 thoughts on my comment ^^
16:49:00 jgriffith: +1
16:49:07 jgriffith, +1
16:49:18 jgriffith: I totally agree this is basically a force call
16:49:21 I think we all understand the use case now. nova node is not responsive, rogue vms. Still we keep coming back to cinder not having enough information. I think really though nova should still be driving this in handling this situation.
16:49:31 jgriffith: +1
16:49:38 thingee: what's missing?
16:49:39 +1
16:49:40 if nova can't talk to the n-cpu process on the host, nova can't really detach the volume.
16:49:43 thingee: yeah, but if we step back....
16:49:43 jgriffith: My single concern is how to signal to the caller that the call failed
16:50:02 thingee: allowing an admin to force disconnect and ban a node from connecting I'm ok
16:50:03 The problem is Nova can't take care of this, since a failure indication does not ensure that the instance is not talking to the storage or won't do it after some time
16:50:23 duncanT: can you explain?
16:50:25 sorry
16:50:33 duncanT: signal for what? force detach failed?
16:50:52 jgriffith: If cinder can't talk to the storage backend, it can't force the detach...
16:50:52 EhudTrainin: I apologize, but this is the first time I'm hearing about the bp. I'll check it out to understand more, but as jgriffith mentioned I fear the bugs in automating this.
16:51:00 I like the idea of an admin call to force detach though
16:51:02 winston-d: Yes, forced detach failed
16:51:26 winston-d: It is far from a show-stopper, just want to ensure it is thought of
16:51:31 thingee, but if n-cpu isn't reachable on the host....nova can't detach the volume.
16:51:36 duncanT: if nobody can talk to the VM and nobody can talk to the storage, i guess you need to pull the plug :)
16:51:40 duncanT: async call, no callback. please come back and query the result.
16:51:52 ok
16:52:07 so EhudTrainin I think the take-away (and I can add this to the bp if you like)
16:52:10 is:
16:52:13 winston-d: That's fine, yup, just need to remember to add the queryable status :-)
16:52:31 1. Add an admin API call to attempt force disconnect of an attachment/s to a specified node/IP
16:52:36 the best you can do in that case is ask cinder to disconnect from the backend, that'll eventually leave a broken LUN on the host, which will give i/o errors for the host and the vm.
16:52:38 hemna: cinder still doesn't have the information needed though to act. Maybe this bp explains that..I haven't read it yet.
16:52:53 Ummm... hmm
16:53:05 avishay: if this is done by cinder, what happens to the entries in the nova db
16:53:10 so then what :)
16:53:24 thingee, cinder has the volume and attachment info in its db. it can call the right backend to disconnect
16:53:26 xyang__: good question - EhudTrainin ?
16:53:39 this is a nova node HA problem. I'm not sure why we're trying to solve it with cinder.
16:53:40 xyang__: nova is the caller, it should know what to do about the block-device-mapping
16:53:43 this is just icky
16:54:02 hemna: that's not the problem
16:54:05 winston-d: nova is not working here, right
16:54:09 hemna: the problem is cinder doesn't know to act
16:54:22 winston: this will be a cinder operation completely
16:54:24 xyang__: a particular compute node is trashed
16:54:28 so this should start at n-api and then go to cinder?
16:54:31 thingee, well not yet. :) we were talking about forcing a disconnect from cinder.
16:54:32 thingee: Nova knows it has lost track of a vm and can make the call, yes?
16:54:50 duncanT, yah
16:54:54 when the instance is rebuilt, the volume is detached and then attached. the rebuild would be done only after fencing to avoid possible conflict
16:55:05 duncanT, hemna: as jgriffith mentioned, I think making a call to force detach is good. But nova should make that call
16:55:13 thingee, +1
16:55:14 or an admin
16:55:15 xyang__: nova compute is not working, not the entire nova, e.g. nova-api is still working
16:55:18 EhudTrainin: ok, now you kinda lost me
16:55:20 thingee +1
16:55:36 winston-d: ok.
16:55:45 winston-d, but the host needs n-cpu to be working in order to detach the volume from the VM and the host
16:55:59 hemna: no
16:56:05 hemna: it just needs n-api
16:56:10 hemna: n-api can call cinder
16:56:11 EhudTrainin: i think the question is - why can't this be implemented in nova, where nova-api calls detach/terminate_connection for all volumes attached to the host?
16:56:24 jgriffith: but what tells n-api?
16:56:26 n-cpu does the work to detach the volume from the hypervisor and the host kernel
16:56:30 thingee: LOL
16:56:36 thingee: excellent question :)
16:56:55 thingee: and now we're back to admin, in which case who cares if it's direct to cinder api from admin etc
16:57:02 hemna: in the case when n-cpu is on fire, n-api has to call for help from cinder
16:57:04 again, I really think this is a nova node HA case. I don't see anything right now that cinder can know to act on.
16:57:15 winston-d, yah I think that's the only option.
16:57:18 thingee: I agree
16:57:29 The detach command does not ensure that the instance on the failed host would not try to access the storage
16:57:42 so you all keep saying things like "nova node on fire" "nova node is unreachable" etc etc
16:58:01 if the nova node is so hosed it's probably not going to be making iscsi connections anyway
16:58:09 EhudTrainin, correct, but if the cinder backend driver disconnects from the storage, the host will get i/o errors when the vm/host tries to access the volume.
16:58:14 what about file system mounts, where terminate_connection doesn't do anything?
16:58:17 I have an easy solution...
16:58:25 so we all agree...force detach exposed. Leave it to the people handling the instances. If a nova node catches fire, there better be another nova node available to catch rogue vms and communicate with cinder
16:58:26 ssh root@nova-node; shutdown -h now
16:58:26 EhudTrainin, effectively detaching the volume....but with a dangling LUN
16:58:27 thingee: cinder doesn't know and doesn't have to know. cinder just provides help, in the case when n-cpu is broken and no one can reach n-cpu.
16:58:39 if that doesn't work log in to the pdu and shut off power
16:58:49 winston-d: so when does cinder do the detach to help?
16:58:58 thingee, +1
16:59:00 jgriffith: what if the server's management network is down, but it's still accessing a storage network?
16:59:04 jgriffith: It says in the blueprint you don't always have a PDU
16:59:14 avishay, re: Fibre Channel ? :P
16:59:17 avishay: that's where the unplug came from LOL
16:59:25 hemna: or a separate ethernet network
16:59:25 duncanT: sighh
16:59:29 hemna: if nova is hosed then it shouldn't be surprising that getting everything working again will not be trivial.
16:59:32 call the DC monkey
16:59:43 hah
16:59:56 alright, we're spiraling
16:59:57 time's up.
17:00:06 thingee: n-api finds out n-cpu is on fire, it'd like to re-create another vm on another n-cpu. but if n-api failed to disconnect the vol, it has to call for cinder's help
17:00:07 throw a grenade and run. next!
17:00:14 EhudTrainin: it's an interesting idea but there are some very valid concerns here IMO
17:00:15 how about this case: we have an NFS mount on the host. disconnect today does nothing. how do we stop the VM from accessing it?
17:00:18 The way I'm reading the blueprint here, all it is asking for is a force_disconnect_and_rogue_reconnections
17:00:26 avishay: Kill the export?
17:00:42 avishay, heh, that's why jgriffith and I complained about the NFS unmount code :)
17:00:46 I think we're all fine with an admin extension to force disconnect
17:00:52 let's start with that and go from there
17:00:57 everybody ok with that?
17:01:01 +1
17:01:05 of course with NFS you're just screwed
17:01:19 +1
17:01:30 Ok... we can theorize more in #openstack-cinder if you like
17:01:30 +1
17:01:32 * hartsocks waves
17:01:32 I think beyond force disconnect we would also want to prevent the nova-compute on that host from creating new connections
17:01:35 thanks everybody
17:01:37 yah good luck deploying a cloud w/ NFS :P
17:01:41 #endmeeting
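The take-away agreed above (an admin extension to force disconnect, leaving the fencing decision to Nova or an operator) might boil down, on the cinder-volume side, to something like the following sketch; the helper name and the exact driver/DB call signatures are assumptions rather than the real implementation.

    # A hedged sketch of a force-disconnect path: revoke the backend export
    # for the failed host's initiator and clean up attachment state, without
    # needing the (possibly dead) compute host to cooperate.

    def force_detach(context, db, driver, volume, connector):
        """Best-effort disconnect of a volume from a failed/unreachable host.

        `db` and `driver` stand in for the Cinder DB API and the backend
        driver; `connector` describes the initiator on the failed node.
        """
        # Ask the backend to revoke access for this initiator (iSCSI LUN
        # mapping, FC zone, etc.), so a rogue VM gets I/O errors instead of
        # live access to the data.
        driver.terminate_connection(volume, connector)

        # Tear down the export so no new connections can be made to it.
        driver.remove_export(context, volume)

        # Mark the volume detached/available so it can be re-attached to the
        # instance once it is rebuilt on a healthy compute node.
        db.volume_detached(context, volume['id'])

As the discussion notes, this only helps for backends where terminate_connection actually revokes access; filesystem-backed drivers such as NFS would need a different mechanism (e.g. killing the export), and preventing the failed node from creating new connections would be a further step on top of this.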