18:01:21 #startmeeting
18:01:22 Meeting started Thu Mar 1 18:01:21 2012 UTC. The chair is jdg. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:23 Useful Commands: #action #agreed #help #info #idea #link #topic.
18:01:40 #link http://wiki.openstack.org/NovaVolumeMeetings
18:02:41 Anybody else?
18:02:45 Not much was added to the agenda other than DuncanT's request to talk about boot from volume
18:02:52 o/
18:03:04 This might be a short meeting... :) Maybe we should give folks another minute?
18:03:19 I'm here too
18:03:41 ogelbukh wanted to join, but he might be unavailable right now
18:03:48 alright, if there are no objections let's get started
18:03:54 #topic boot from volume
18:04:09 DuncanT... you had some things you wanted to talk about here?
18:04:31 Yes please
18:04:35 Go for it
18:05:11 Basically I want to get some sort of consensus on where people think boot-from-volume is heading
18:05:52 Anything specific?
18:05:56 I'm not 100% sure what works at the moment, but I'd like some idea of what people think should work...
18:06:00 Boot from ISO
18:06:22 Boot from a volume that I've arranged to have a boot loader on already
18:06:41 Rather than using the intermediate instance etc?
18:06:49 Yes
18:07:24 Personally I agree that this is something that we "need"; how to go about it is another story
18:08:02 DuncanT: Do you have any thoughts on how to implement it?
18:08:28 I don't /think/ it is possible at the moment to run an instance that doesn't have a glance reference?
18:08:59 I'm only just getting familiar with how nova starts instances at the moment
18:10:07 Sorry, I'm not as well prepared here as I'd hoped to be
18:10:37 No worries... Does anybody have any thoughts on this? Or do we not have the right people today?
18:10:43 last I checked that was the case, and I agree the imageref shouldn't be necessary
18:10:51 did you guys see https://blueprints.launchpad.net/nova/+spec/auto-create-boot-volumes?
18:11:37 jdurgin: Somebody here pointed me at that a few minutes ago
18:12:05 We'd like to be able to create the volumes from (volume) snapshots too
18:12:10 Yes, it seemed to be a good idea even back in early Diablo days
18:12:30 it's started to be implemented now though: https://review.openstack.org/#change,4576
18:14:06 So it looks like this makes it a system-wide change to always use persistent volumes?
18:15:07 I ran through this change, it looks pretty good
18:15:29 But shouldn't there be some cleanup after instance shutdown?
18:16:10 I don't see how you start an instance using the same volumes again
18:16:32 i.e. terminate the instance, keep the volumes, then boot a new instance using those volumes
18:16:53 The same as you might shut down a server and have it come back exactly as it was
18:17:48 I'm not entirely sure of the use case for the AutoCreateVolumes feature without this facility
18:17:54 Maybe I'm missing something?
18:18:48 DuncanT: maybe the way to accomplish that is to make creating the volume from an image a parameter of the api request, instead of a global flag
18:19:19 We can minimize usage of local disks on the compute node; that can be necessary sometimes
18:19:49 jdurgin: I think so, yes. I think you can get the behaviour that this new feature gives you using block_device_mapping flags on every instance creation
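A rough sketch of the block_device_mapping idea mentioned above - booting an instance whose root disk is an existing nova-volume, so the volume survives termination. The request fields here follow the EC2-style mapping under discussion and are assumptions for illustration, not a confirmed nova API:

    # Hypothetical server-create request body (Python dict): the root device
    # is mapped to an existing volume instead of a local disk built from glance.
    server_request = {
        "server": {
            "name": "bfv-test",
            "flavorRef": "1",
            # Today an imageRef is still required; the discussion above is about
            # making it optional when a root volume mapping is supplied.
            "block_device_mapping": [{
                "device_name": "/dev/vda",            # root device
                "volume_id": "<existing-volume-id>",  # volume with a boot loader on it
                "delete_on_termination": False,       # keep the volume, like powering off hardware
            }],
        }
    }
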
18:19:54 For example, to minimize VM downtime on compute host failure
18:20:04 YorikSar: Ok, I can see that
18:20:42 But if we are going to use/reuse such volumes, it looks like we should not put this logic into compute
18:20:57 YorikSar: I agree
18:21:26 Maybe we should let nova-volume summon a new volume from an image and then start an instance on it?
18:22:29 Can we find a way to reuse the code in nova-compute that currently creates the ephemeral (local) volumes here, since we know it is good?
18:22:54 YorikSar: I don't think nova-volume should start the instance itself, but adding a VolumeDriver method to create a volume from an image sounds good to me
18:23:45 jdurgin: Of course, instance creation should be a separate API call handled by compute
18:25:04 So the create API call could take an argument to specify that the instance should reside on a volume... create the volume, and launch the instance
18:25:05 DuncanT: I don't see how it can help here
18:25:13 I've no strong feelings on where it should live, but using different code to populate local vs. persistent volumes from glance seems odd
18:25:36 The task is essentially the same, isn't it?
18:26:13 not quite the same - local disks are just files downloaded to the host from glance
18:26:21 afaik, local volumes are not actually volumes
18:26:34 jakedahn: +1
18:26:44 jdurgin: +1
18:26:48 currently nova-volume has no way to actually write to the volumes
18:27:07 DuncanT: quick update... I am sorry I joined late, so I may not have all the context. The way I have created test bfv volumes so far is by attaching a new volume to an existing instance and dd-ing over the contents of /boot
18:27:57 renuka: Exactly; this logic should be separated into a "create_volume_from_image" API call
18:28:10 DuncanT: by new volume, I mean one that nova-volume knows about
18:28:21 Ok, I thought it could inject files into them and such, but I haven't looked at the code in detail. Doesn't it do some magic to expand the filesystem to fill whatever size volume your flavour provides?
18:29:08 Ok, if we need an API call to do it, I'm fine with that
18:29:27 I think we can delegate this call to some place where both Glance and nova-volume are accessible, along with this resizefs functionality
18:29:37 DuncanT: not sure about the details of that... but do we care about filesystem size when we are explicitly saying to boot from *this* volume?
18:30:02 sry, late - what'd I miss :)
18:30:04 renuka: Only if/when initially populating *this* volume with an image from glance
18:30:38 renuka: If the volume gets populated any other way, I agree we don't care
18:30:53 Maybe we should move it to nova.virt.disk and run it on the nova-api node?
18:31:19 DuncanT: my impression was that when people use boot from volume, they will have the exact volume they want to boot from. So you are talking about when we create *this* volume, correct?
18:31:39 YorikSar: Would that mean nova-api nodes would then need to be able to connect to / mount volumes?
18:31:47 nm, found the log http://eavesdrop.openstack.org/meetings/openstack-meeting/2012/openstack-meeting.2012-03-01-18.01.log.txt
18:32:58 why should this be in the nova-volume api? versus something like a utility command
18:32:58 DuncanT: yeah, this is odd. It definitely should be done on the nova-volume node.
18:32:59 renuka: I think there are two stages. You're quite right, the second stage is to say 'boot from this already created volume'. There's also the case in https://blueprints.launchpad.net/nova/+spec/auto-create-boot-volumes of initially creating that volume from a glance image
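A minimal sketch of the VolumeDriver method proposed above, assuming a hypothetical create_volume_from_image name, the existing create_volume/local_path driver entry points, and a glance client that can stream image chunks; nova-volume has no such glance wiring today:

    from glance import client as glance_client  # assumed reachable from the volume node

    class VolumeDriver(object):  # sketch: the method would live on the real base driver
        def create_volume_from_image(self, volume, image_id, glance_host):
            """Create a volume and fill it with the contents of a glance image."""
            self.create_volume(volume)  # existing driver entry point
            meta, chunks = glance_client.Client(glance_host, 9292).get_image(image_id)
            # dd-style population: stream the image onto the device backing the
            # volume. This only works for drivers whose volumes appear as a
            # block device on the nova-volume host.
            with open(self.local_path(volume), 'wb') as dev:
                for chunk in chunks:
                    dev.write(chunk)
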
18:33:36 DuncanT: why should this be in the nova-volume api? versus something like a utility command
18:34:19 renuka: How would it be driven by a user if it isn't in an api somewhere?
18:35:17 renuka: In the case of the iSCSI driver, we can cache frequently used images on the nova-volume node and propagate them locally, with a performance gain
18:36:07 DuncanT: I guess what I am more uncomfortable about is having nova-volume be aware of glance all of a sudden
18:36:08 renuka: It can be used too frequently to be a utility.
18:36:22 renuka: An example use-case might be: create me a server using new persistent volumes for all storage, using the ubuntu glance image... later, terminate that instance... later still, boot a new instance using the volumes I created earlier, exactly as if I had powered off a physical server and then powered it back on again
18:37:05 renuka: If we can get nova-compute to use nova-volume volumes in place of local disk images, then nova-compute's existing code can do the rest
18:37:06 DuncanT: We need to be careful here; all this while, compute has been the glance-aware component
18:37:07 renuka: It will connect to Glance anyway to back up volumes
18:37:18 DuncanT: I guess I don't see why you couldn't use the existing glance and compute relations to do that?
18:37:38 YorikSar: We do backups (or what the euca commands call snapshots) without glance, using copy-on-write
18:37:43 eg: keep volume unaware, and just "use" it
18:38:05 YorikSar: at this point, nova-volume does not connect to glance AFAIK... backups and snapshots are taken on the existing backend
18:38:10 DuncanT: I'm talking about backup to cold storage, e.g. Glance.
18:38:11 there are also possible optimizations if glance and nova-volume are using the same backend storage - new instances could be created that are copy-on-write
18:38:17 jdg: If the API is done right, I think you can keep volume unaware, yes
18:39:08 jdurgin: that cannot be a requirement
18:39:09 jdurgin: COW instances for fast instance creation are definitely on our road-map
18:39:32 renuka: not a requirement, certainly, but an optimization
18:39:36 It seems like adding the functionality to the compute api when creating an instance to "use" a volume gets what everybody wants without causing a bunch of tangles in volume code
18:40:10 jdg: Agreed
18:40:15 +1
18:40:23 +1
18:40:26 +1
18:40:31 jdg: I think this method (create volume from image) would be useful for nova-volume as a separate service.
18:41:04 YorikSar: Maybe, but I like the idea of keeping volumes limited to just being "volumes"
18:41:14 They should not know or care how they are being used, should they?
18:41:43 Having said that, I am not entirely thrilled with the idea of compute suddenly having control of a command that doesn't "feel" like a compute command
18:41:53 Of course not. But what can stop them from using Glance to store and resurrect long-term backups?
18:42:42 backups should be a function of the backend storage system
18:42:58 whether that is to glance, directly to swift, or local snaps, it shouldn't matter
18:42:59 YorikSar: the individual volume drivers should not have to be modified for this functionality... we need volume only to *create* the new volume...
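To pull the agreement above together: a hypothetical compute-side flow where the server-create call takes a flag, the volume service owns creating and populating the boot volume, and compute only boots from it. All names here are illustrative, not real nova code:

    def create_server(volume_api, compute_api, context, flavor,
                      image_id=None, boot_volume_id=None,
                      create_boot_volume=False):
        """Two-stage boot-from-volume: create/populate, then boot."""
        if create_boot_volume:
            # Stage 1: the volume service creates the volume and fills it from
            # the glance image, so compute never touches image data here.
            boot_volume_id = volume_api.create_volume_from_image(
                context, size=flavor['root_gb'], image_id=image_id)
        # Stage 2: boot from the new (or an already existing) volume.
        return compute_api.boot_from_volume(context, flavor, boot_volume_id)
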
18:43:14 as long as we make sure we have a consistent interface for the users to interact with
18:43:23 creiht: we need an API that can support many semantics though
18:43:41 Mmm... I think I should formulate this as a blueprint.
18:43:41 There should be a base amount of functionality for backups
18:43:51 create backup, create volume based on backup, etc.
18:43:58 any extras can be added with extensions
18:45:08 creiht: The problem there is that we consider 'snapshots' and 'backups' to be two separate things, both of which users might want to do
18:45:16 because that extra functionality is going to be different for every implementation
18:45:31 creiht: Which do you map to the standard 'backup'?
18:46:04 renuka: Still, the volume node can be the closest node to the new volume, so doing it elsewhere loses performance on network IO
18:46:20 renuka: how would you know how to write to a volume without an additional volume driver method?
18:46:43 jdurgin: We can mount the volume on the nova-volume node and write to it
18:46:52 So there's a lot of code referencing block_device_mapping, which as I understand it is the EC2 feature that allows you to accomplish boot-from-volume (i.e. the root fs of this instance is an ebs volume) - has anyone used the block_device_mapping feature as currently implemented?
18:46:54 well, but that would mean the volume needs to be mounted somewhere, right?
18:47:00 YorikSar: not all volumes are mountable on the host
18:47:09 DuncanT: that's part of the problem - the terminology makes it difficult to define all of this
18:47:32 I tried at one point to clean it up by using the term backups, but I may have just further confused the situation
18:48:09 All I'm saying is that there should be a simple base functionality (what is currently implemented in the api as snapshots)
18:48:11 creiht: I wrote a blueprint that attempted to define some terminology. One problem is that the ec2 API already owns some of the terms
18:48:17 jdurgin: can you clarify which volumes aren't mountable?
18:48:26 whether we call it backups or snapshots at this point, I've come to not care
18:48:43 but anything on top of that should be an extension
18:48:59 jdg: sheepdog and rbd are written to directly by qemu
18:49:01 clayg: Regarding block_device_mapping, I can't see how to use it to create a new instance that doesn't reference a glance image
18:49:03 creiht: DuncanT: the terms can be overloaded, but it seems ok if they mean different things to different volume types/storage backends as long as the user can keep it straight.
18:49:26 There are volumes; we can create fast snapshots that use less space and can easily be a source for a new volume. And there are backups that take a lot of time to create, are stored in a very reliable place (like Swift), and take a lot of resources to restore.
18:49:36 jdurgin: So what about saying sheepdog and rbd don't support this bfv method?
18:50:04 Or "this method of bfv is not supported"
18:50:07 I thought this terminology was common
18:50:14 clayg: block_device_mapping otherwise gives a lot of the needed functionality, I think, though it is ugly
18:50:17 YorikSar: you can support that, but not every storage system is going to support it
18:50:28 that's why I am arguing for a simple base concept
18:50:29 jdg: they can support it, just not by dd-ing to a block device on the host. They can both be written to with qemu-img
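A concrete sketch of the qemu-img point above: network block stores like rbd and sheepdog can be written to directly, with no dd onto a host-attached device. The pool and volume names are made up:

    import subprocess

    def copy_image_to_rbd(image_path, pool, volume_name):
        """Write a local image file into an rbd volume via qemu-img."""
        # qemu-img understands rbd: (and sheepdog:) target URIs, so the
        # volume never has to be mapped as a block device on the host.
        subprocess.check_call([
            'qemu-img', 'convert', '-O', 'raw',
            image_path, 'rbd:%s/%s' % (pool, volume_name)])
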
18:50:49 it is reasonable to expect every storage system to implement some backup/snapshot system
18:51:06 jdurgin: ahh, ok
18:51:06 creiht: why does this API have to be backend dependent?
18:51:35 creiht: the worst case is to attach the volume to the nova-volume host (just as it can be attached to nova-compute) and dd an image to/from it
18:51:57 creiht: we should not have to rely on additional backend functionality when all we need is the ability to create/attach a volume
18:52:08 renuka: all I'm arguing for is a simple base functionality that all systems can implement
18:52:22 then where systems want to vary / add their own value, they can in extensions
18:52:26 just like the rest of nova
18:52:45 renuka: and I agree totally with that
18:53:01 YorikSar: that sounds so wrong. the volume host is a control plane; we should not have random volumes, whose contents we have no idea about, attached to a privileged host/vm
18:53:36 renuka: volume host function separation ++
18:53:53 ok, here's a suggestion... off the top of my head... can we expect to boot the image we want, attach a new volume to it, and dd (like how I said I was creating volumes)...
18:54:20 booting the image if required, of course - not if it is running already
18:54:25 renuka: ok, then we should have a separate utility host that does it.
18:54:39 YorikSar: why?
18:55:28 YorikSar: I know this is quite hacky... but it shouldn't matter where we booted this image
18:55:45 renuka: to keep Compute unaware of all the backup work. And to not lose performance on virtualization and networking (if volumes are local)
18:56:45 renuka: that's essentially what libguestfs does (and there's a plugin for it in nova.virt.disk for file injection)
18:57:15 YorikSar: so are you saying that the backup functionality should be common across all storage systems?
18:57:29 jdurgin: file injection can be done by the compute host later
18:58:23 creiht: Yes. And if the backend cannot do something faster (like streaming the image directly to storage), we should do it on its behalf, e.g. on a utility host
18:59:20 YorikSar: I would argue that is not possible (at least in the near term)
18:59:22 YorikSar: yeah, file injection is a separate issue
18:59:36 every storage system is going to do backups differently
19:00:00 for example, I imagine netapp will store backups internally as snaps
19:00:35 creiht: But it should be kept in nova-volume, not spread across both volume and compute
19:00:36 lunr is going to back up directly to swift
19:01:23 snapshots should not be considered long-term storage
19:01:38 So what's so wrong with a default backup in the volume driver that does something along those lines, and then folks override it in their drivers where possible? In both cases it's the same volume-api call
19:01:47 YorikSar: again, that depends on the backend storage system
19:01:57 for your storage system that may be the case
19:02:01 for another it may not
19:02:34 creiht: Well, it should be up to the driver
19:02:46 and that is exactly what I am arguing for
19:02:48 creiht: It can alias backup to snapshot
19:02:55 we shouldn't make those decisions for them :)
19:02:59 YorikSar: +1
19:03:12 and I have to run to another meeting
19:03:32 creiht: But we should pass the create_backup request to the driver anyway, so there should be an API call that triggers it
19:03:57 Ok, we're out of time. Sounds like we can pick up on this again next week for sure.
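A sketch of the overridable-default idea above: one volume-api "backup" call, a generic slow path in the base driver, and backends that have something better override it. The method names and the netapp/swift details are illustrative:

    class VolumeDriver(object):
        def backup_volume(self, context, volume):
            """Generic fallback: stream the volume's contents to cold storage.

            Slow, but works for any driver whose volumes can be reached as a
            local block device on the volume host.
            """
            dev = self.local_path(volume)  # existing driver entry point
            self._stream_to_cold_storage(dev, volume['id'])  # e.g. swift; illustrative

    class NetAppStyleDriver(VolumeDriver):
        def backup_volume(self, context, volume):
            # A backend with its own mechanism overrides the default, e.g.
            # storing the backup internally as a snap (the netapp example above).
            self._create_internal_snap(volume)  # illustrative
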
19:04:08 It sounds like we have some consensus on how boot-from-volume could work, even if backup/snapshot is a bit up in the air?
19:04:46 DuncanT: Yes, I think folks agree at the top level. Backup/snapshot details still need some discussion.
19:04:48 Maybe I'll try to summarise my understanding of boot from volume, with notes on the code that seems to be missing, for next week?
19:04:50 DuncanT: can you summarize?
19:05:02 DuncanT: If we support a future with backups in nova-volume, this logic should be moved to nova-volume.
19:05:06 Another question I have is how this impacts the existing blueprint and the work that's been done by samsung
19:05:20 jdg: any progress on uuids?
19:05:28 clayg: :)
19:05:40 jdg: Is http://wiki.openstack.org/AutoCreateBootVolumes the samsung one?
19:05:55 Working on it. My first approach trashed ec2 calls.
19:05:57 Sorry, I don't know who's who
19:05:58 DuncanT: could you put down what we have agreed on... I am still confused about this
19:05:59 DuncanT: Yes
19:06:02 And I've been beaten up for unifying the volume API and extension.
19:06:37 jdg: Ok, I'll make sure any interaction with that is documented
19:06:56 clayg: I'm going to take another look today and maybe send out an email to the volume list about what I'm trying to do.
19:07:26 My thought now is to just modify the existing DB/API methods to check whether they're receiving a UUID versus an int id and behave accordingly.
19:07:28 vishy mentioned some plan for separating nova-volume into a project of its own
19:07:34 But this creates some confusion higher up
19:07:49 ogelbukh: Yes
19:07:52 did anyone see updates on that?
19:07:58 renuka: I'll email out the summary ASAP, then you can comment on it... to be honest, it feels like what you were doing is basically what I'm thinking of, other than using something slightly smarter than dd
19:08:09 or he's going to do it at the summit
19:08:13 ogelbukh: I think that's going to be relegated more towards the summit
19:08:21 We need to use hash-action to record what should be done by next week
19:08:31 jdg: oh, fine
19:08:53 DuncanT: i think this could be a different service altogether, or the closest would be to have it become part of nova-compute
19:08:56 #action DuncanT to send out summary of where we're at on BFV
19:09:21 renuka: It is going to involve nova-compute work, definitely
19:09:30 Anything else real quick?
19:10:02 Ok, thanks everyone.
19:10:06 #endmeeting
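One way the UUID-versus-int-id check described above could look - a hedged sketch, not the actual patch under review:

    import uuid

    def is_uuid_like(val):
        """Return True if val parses as a UUID, False for legacy integer ids."""
        try:
            uuid.UUID(str(val))
            return True
        except ValueError:
            return False

    def volume_get(context, volume_id):
        # Dispatch on id style so both old int ids and new UUIDs keep working,
        # without changing callers higher up. Both helpers are illustrative.
        if is_uuid_like(volume_id):
            return volume_get_by_uuid(context, volume_id)
        return volume_get_by_int_id(context, int(volume_id))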