18:01:21 <jdg> #startmeeting
18:01:22 <openstack> Meeting started Thu Mar  1 18:01:21 2012 UTC.  The chair is jdg. Information about MeetBot at http://wiki.debian.org/MeetBot.
18:01:23 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic.
18:01:40 <jdg> #link http://wiki.openstack.org/NovaVolumeMeetings
18:02:41 <DuncanT> Anybody else?
18:02:45 <jdg> Not much was added to the agenda other than DuncanT's request to talk about boot from volume
18:02:52 <YorikSar> o/
18:03:04 <jdg> This might be a short meeting... :)  Maybe we should give folks another minute?
18:03:19 <jdurgin> I'm here too
18:03:41 <kishanov> ogelbukh wanted to join, but he might be unavailable right now
18:03:48 <jdg> alright, if there's no objections let's get started
18:03:54 <jdg> #topic boot from volume
18:04:09 <jdg> DuncanT... you had some things you wanted to talk about here?
18:04:31 <DuncanT> Yes please
18:04:35 <jdg> Go for it
18:05:11 <DuncanT> Basically I want to get some sort of conscientious as to where people think boot-from-volume is aiming for
18:05:52 <jdg> Anything specific?
18:05:56 <DuncanT> I'm not 100% sure what works at the moment, but I'd like some idea of what people think should work...
18:06:00 <DuncanT> Boot from iso
18:06:22 <DuncanT> Boot from a volume that I've arranged to have a boot loader on it already
18:06:41 <jdg> Rather than using the intermediate instance etc?
18:06:49 <DuncanT> Yes
18:07:24 <jdg> Personally I agree that this is soemthing that we "need", how to go about it is another story
18:08:02 <jdg> DuncanT: Do you have any thoughts on how to implement?
18:08:28 <DuncanT> I don't /think/ it is possible to run an instance at the moment that doesn't have a glace reference?
18:08:59 <DuncanT> I'm only just getting familiar with how nova starts instances at the moment
18:10:07 <DuncanT> Sorry, I'm not as well prepared here as I'd hoped to be
18:10:37 <jdg> No worries...  Does anybody have any thoughts on this?  Or do we not have the right people today?
18:10:43 <jdurgin> last I checked that was the case, and I agree the imageref shouldn't be necessary
18:10:51 <jdurgin> did you guys see https://blueprints.launchpad.net/nova/+spec/auto-create-boot-volumes?
18:11:37 <DuncanT> jdurgin: Somebody here pointed me at that a few minutes ago
18:12:05 <DuncanT> We'd like to be able to create the volumes from (volume) snapshots too
18:12:10 <YorikSar> Yes, it seemed to be a good idea even back in early Diablo days
18:12:30 <jdurgin> it's started to be implemented now though: https://review.openstack.org/#change,4576
18:14:06 <DuncanT> So it looks like this makes it a system-wide change to always use persistent volumes?
18:15:07 <YorikSar> I ran through this change, looks pretty good
18:15:29 <YorikSar> But shouldn't there be some cleanup after instance shutdown?
18:16:10 <DuncanT> I don't see how you start an instance using the same volumes again
18:16:32 <DuncanT> i.e. terminate the instance, keep the volumes then boot a new instance using those volumes
18:16:53 <DuncanT> The same as you might shutdown a server and have it come back exactly as it was
18:17:48 <DuncanT> I'm not entirely sure the usecase of the AutoCreateVolumes feature without this facility
18:17:54 <DuncanT> Maybe I'm missing something?
18:18:48 <jdurgin> DuncanT: maybe the way to accomplish that is to make creating the image from a volume a parameter of the api request, instead of a global flag
18:19:19 <YorikSar> We can minimize usage of local disks on compute node, it can be necessary sometimes
18:19:49 <DuncanT> jdurgin: I think so, yes. I think you can get the behaviour that this new feature gives you using block_device_mapper flags on every instance creation
18:19:54 <YorikSar> For example, to minimize VM downtime on compute host failure
18:20:04 <DuncanT> YorikSar: Ok, I can see that
18:20:42 <YorikSar> But if we are going to use/reuse such volumes, it looks like we should not put this logic into compute
18:20:57 <DuncanT> YorikSar: I agree
18:21:26 <YorikSar> Maybe, we should let nova-volume summon new volume from image and then start an instance on it?
18:22:29 <DuncanT> Can we find a way to reuse the code in nova-compute that currently creates the ephemeral (local) volumes here, since we know it is good?
18:22:54 <jdurgin> YorikSar: I don't think nova-volume should start the instance itself, but adding a VolumeDriver method to create a volume from an image sounds good to me
18:23:45 <YorikSar> jdurgin: Of course, instance creation should be a separate API call handled by compute
18:25:04 <jdg> So the create API call could take an argument to specify the instance should reside on a volume... create the volume, and launch the instance
18:25:05 <YorikSar> DuncanT: I don't see how can it help here
18:25:13 <DuncanT> I've no strong feelings on where it should live, but using different code to populate local .v. persistent volumes from glace seems odd
18:25:36 <DuncanT> The task is essentially the safe, isn't it?
18:26:13 <jdurgin> not quite the same - local disks are just files downloaded to the host from glance
18:26:21 <ogelbukh> afaik, local volumes are not actually volumes
18:26:34 <ogelbukh> jakedahn: +1
18:26:44 <ogelbukh> jdurgin: +1
18:26:48 <jdurgin> currently nova-volume has no way to actually write to the volumes
18:27:07 <renuka> DuncanT: quick update...I am sorry I joined in late, so I may not have all the context. The way I have created test bfv volumes so far is by attaching a new volume to an existing instance and dd-ing over the contents of /boot
18:27:57 <YorikSar> renuka: Exactly this logic should be separated into "create_volume_from_image" API call
18:28:10 <renuka> DuncanT: by new volume, I mean one that nova-volume knows about
18:28:21 <DuncanT> Ok, I thought it could inject files to them and stuff, but I haven't looked at the code in detail. Doesn't it do some magic to expand the filesystem to fill whatever size volume your flavour provides?
18:29:08 <DuncanT> Ok, if we need an API call to do it, I'm fine with that
18:29:27 <YorikSar> I think, we can delegate this call to some place where both Glance and nova-volume are accessible, along with this resizefs funcitonality
18:29:37 <renuka> DuncanT: not sure about the details of that.. but do we care about filesystem size when we are explicitly saying to boot from *this* volume
18:30:02 <clayg> sry, late - what'd I miss :)
18:30:04 <DuncanT> renuka: Only if/when initially populating *this* volume with an image from glace
18:30:18 <DuncanT> s/glace/glance/
18:30:38 <DuncanT> renuka: If the volume gets populated any other way, I agree we don't care
18:30:53 <YorikSar> Maybe we should move it to nova.virt.disk and run on nova-api node?
18:31:19 <renuka> DuncanT: my impression was, when people use boot from volume, they will have the exact volume they want to boot from. So you are talking about when we create *this* volume, correct?
18:31:39 <DuncanT> YorikSar: Would that mean nova-api nodes then need to be able to connect to / mount volumes?
18:31:47 <clayg> nm, found the log http://eavesdrop.openstack.org/meetings/openstack-meeting/2012/openstack-meeting.2012-03-01-18.01.log.txt
18:32:58 <renuka> why should this be in the nova-volume api? versus like a utility command
18:32:58 <YorikSar> DuncanT: yeah, this is odd. It definitely should be done on nova-volume node.
18:32:59 <DuncanT> renuka: I think there are two stages. You're quite right, the second stage is to say 'boot from this already created volume'. There's also the case in https://blueprints.launchpad.net/nova/+spec/auto-create-boot-volumes of initially creating that volume from a glance image
18:33:36 <renuka> DuncanT: why should this be in the nova-volume api? versus like a utility command
18:34:19 <DuncanT> renuka: How would it be driven by a user, if it isn't in an api somewhere?
18:35:17 <YorikSar> renuka: In case of iSCSI driver, we can cache frequently used image on nova-volume node and propagate it locally, with a performance gain
18:36:07 <renuka> DuncanT: I guess what I am more uncomfortable about is having nova-volume be aware of glance all of a sudden
18:36:08 <YorikSar> renuka: It can be used too frequent to be an utility.
18:36:22 <DuncanT> renuka: An example use-case might be: Create me a server using new persistent volumes for all storage, using the ubuntu glace image.... later, terminate that instance... later still boot a new instance using the volumes I created earlier, exactly as if I have powered off a physical server then powered it back on again
18:37:05 <DuncanT> renuka: If we can get nova-compute to use nova-volume volumes in place of local disk images then nova-compute existing code can do the rest
18:37:06 <renuka> DuncanT: We need to be careful that all this while, compute has been the glance-aware component
18:37:07 <YorikSar> renuka: It will connect to Glance anyway to backup volumes
18:37:18 <jdg> DuncanT: I guess I don't see why you couldn't use the existing glance and compute relations to do that?
18:37:38 <DuncanT> YorikSar: We do backups (or what the euca commands call snapshots) without glance usign copy-on-write
18:37:43 <jdg> eg: keep volume unaware, and just "use" it
18:38:05 <renuka> YorikSar: at this point, nova-volume does not connect to glance AFAIK... back ups and snapshots are taken on the existing backend
18:38:10 <YorikSar> DuncanT: I'm talking about backup to cold storage, e.g. Glance.
18:38:11 <jdurgin> there are also possible optimizations if glance and nova-volume are using the same backend storage - new instances could be created that are copy-on-write
18:38:17 <DuncanT> jgd: If the API is gotten correct, I think you can keep volume unaware, yes
18:39:08 <renuka> jdurgin: that cannot be a requirement
18:39:09 <DuncanT> jdurgin: COW instances for fast instance creation is definitely on our road-map
18:39:32 <jdurgin> renuka: not a requirement, certainly, but an optimization
18:39:36 <jdg> It seems like adding the functionality to compute api when creating an instance to "use" a volume gets what everybody wants without causing a bunch of tangles in volume code
18:40:10 <DuncanT> jdg: Agreed
18:40:15 <renuka> +1
18:40:23 <dricco> +1
18:40:26 <jdurgin> +1
18:40:31 <YorikSar> jdg: I think, this method (create volume from image) should be useful for nova-volume as separate service.
18:41:04 <jdg> YorikSar: Maybe, but I like the idea of keeping volumes limited to just being "volumes"
18:41:14 <jdg> They should not know or care how they are being used should they?
18:41:43 <renuka> Having said that I am not entirely thrilled with the idea of compute suddenly having control of a command that does "feel" like a compute command
18:41:53 <YorikSar> Of course, no. But what can stop them from using Glance to store and ressurect long-term backups?
18:42:42 <creiht> backups should be a function of the backend storage system
18:42:58 <creiht> weather that is to glance, directly to swift, or local snaps it shouldn't matter
18:42:59 <renuka> YorikSar: the individual volume drivers should not have to be modified for this functionality.. we need volume only to *create* the new volume...
18:43:14 <creiht> as long as we make sure we have a consistent interface for the users to interact with that
18:43:23 <DuncanT> creiht: we need an API that can support many semantics though
18:43:41 <YorikSar> Mmm.. I think, I should formulate this as some bluebrint.
18:43:41 <creiht> There should be a base amount of functionality for backups
18:43:51 <creiht> create backup, create volume based on backup, etc.
18:43:58 <creiht> any extra can be added with extensions
18:45:08 <DuncanT> creiht: The problem there is that we consider 'snapshots' and 'backups' to be two separate things, both of which users might want to do
18:45:16 <creiht> because that extra functionality is going to be different for every implementation
18:45:31 <DuncanT> creiht: Which do you map to the standard 'backup'?
18:46:04 <YorikSar> renuka: Still volume node can be the closest node to the new volume, so we lose performance on network IO
18:46:20 <jdurgin> renuka: how would you know how to write to a volume without an additional volume driver method?
18:46:43 <YorikSar> jdurgin: We can mount volume to nova-volume node and write to it
18:46:52 <clayg> So there's a lot of code referencing block_device_mapping, which as I understand the EC2 feature allows you to accomplish boot-from-volume (i.e. the root fs of this instance is an ebs volume) - has any used the block_device_mapping feature currently impliemented?
18:46:54 <renuka> well but that would mean the volume needs to be mounted somewhere right
18:47:00 <jdurgin> YorikSar: not all volumes are mountable on the host
18:47:09 <creiht> DuncanT: that's part of the problem is the terminology makes it difficult to define all of this
18:47:32 <creiht> I tried at one point to clean it up with using the term backups, but I may just further confused the situation
18:48:09 <creiht> All I'm saying is that there should be a simple base functionality (what is currently implemented in the api as snapshots)
18:48:11 <DuncanT> creiht: I wrote a blueprint that attempted to define some terminology. One problem is that ec2 API already owns some of the terms
18:48:17 <jdg> jdurgin: can you clarify volumes that aren't mountable?
18:48:26 <creiht> weather we call it backups or snapshots at this point, I've come to not care
18:48:43 <creiht> but anything on top of that should be an extension
18:48:59 <jdurgin> jdg: sheepdog and rbd are written to directly by qemu
18:49:01 <DuncanT> clayg: Regarding block_device_mapping, I can't see how to create a new instance that doesn't reference a glace image using it
18:49:03 <clayg> creiht: DuncanT: the terms can be overloaded, but it seems ok if they means different things to diffent volume types/storage backends as long as the user can keep it stright.
18:49:26 <YorikSar> There are volumes, we can create fast snapshots that use less space and can easily be a source for a new volume. And there are backups that take a lot of time to create, are stored in a very reliable place (like Swift) and take a lot of resources to be restored.
18:49:36 <jdg> jdurgin: So what about saying sheepdog and rdb don't suppor this bfv method?
18:50:04 <jdg> Or "this method of bfv is not supported"
18:50:07 <YorikSar> I though, this terminology is common
18:50:14 <DuncanT> clayg: block_device_mapping otherwise gives a lot of the needed functionality, I think, though it is ugly
18:50:17 <creiht> YorikSar: you can support that, but not every storage system is going to support that
18:50:28 <creiht> that's why I am arguing for a simple base concept
18:50:29 <jdurgin> jdg: they can support it, just not with dding to a block device on the host. they can both be written to with qemu-img
18:50:49 <creiht> it is reasonable to expect every storage system to implement some backup/snapshot system
18:51:06 <jdg> jdurgin: ahh, ok
18:51:06 <renuka> creiht: why does this API have to be backend dependent
18:51:35 <YorikSar> creiht: the worst way is to attach volume to nova-volume host (just as it can be attached to nova-compute) and dd an image to/from it
18:51:57 <renuka> creiht: we should not have to rely on additional backend functionality, when all we need is the ability to create/attach volume
18:52:08 <creiht> renuka: all I'm arguing for is a simple base functionality that all systems can implement
18:52:22 <creiht> then where systems want to vary/ add their own value they can in extensions
18:52:26 <creiht> just like the rest of nova
18:52:45 <creiht> renuka: and I agree totally with that
18:53:01 <renuka> YorikSar: that sounds so wrong. the volume host is a control plane, we should not have random volumes whose contents we have no idea about being attached to a privileged host/vm
18:53:36 <DuncanT> renuka: volume host function separation ++
18:53:53 <renuka> ok here's a suggestion.. off the top of my head... can we expect to boot the image we want, attach a new volume to it, and dd (like how i said i was creating volumes)...
18:54:20 <renuka> boot the image if required of course, not if it is running already
18:54:25 <YorikSar> renuka: ok, then we should have a separate utility host that should do it.
18:54:39 <renuka> YorkSar: why?
18:55:28 <renuka> YorikSar: i know this is quite hacky... but it shouldn't matter where we booted this image
18:55:45 <YorikSar> renuka: to keep Compute unaware of all backuping. And to not lose performance on virtualization and networking (if volumes are local)
18:56:45 <jdurgin> renuka: that's essentially what libguestfs does (and there's a plugin for it in nova.virt.disks for file injection)
18:57:15 <creiht> YorikSar: so are you saying that the backup functionality should be common accross all storage systems?
18:57:29 <YorikSar> jdurgin: file injection can be done by compute host later
18:58:23 <YorikSar> creiht: Yes. And if backend can not do something faster (like stream image directly to storage), we should do in for it, e.g. on utility host
18:59:20 <creiht> YorikSar: I would argue that is not possible (at least in the near term)
18:59:22 <jdurgin> YorikSar: yeah, file injection is a separate issue
18:59:36 <creiht> every storage system is going to do backups differently
19:00:00 <creiht> for example, I imagine netapp will store backups internally as snaps
19:00:35 <YorikSar> creiht: But it should be kept in nova-volume, not spread around both volume and compute
19:00:36 <creiht> lunr is going to backup directly to swift
19:01:23 <YorikSar> snapshots should not be considered as long-term storage
19:01:38 <jdg> So what's so wrong with a default backup in the Volume driver that does something along those lines, and then folks override it in their drivers where possible.  In both cases it's the same volume-api call?
19:01:47 <creiht> YorikSar: again that depends on the backend storage system
19:01:57 <creiht> for your storage system that may be the case
19:02:01 <creiht> for another it may not
19:02:34 <YorikSar> creiht: Well, it should be up to driver
19:02:46 <creiht> and that is exactly what I am arguing for
19:02:48 <YorikSar> creiht: It can alias backup to snapshot
19:02:55 <creiht> we shouldn't make those decisions for them :)
19:02:59 <jdg> YorikSar: +1
19:03:12 <creiht> and I have to run to another meeting
19:03:32 <YorikSar> creiht: But we should pass create_backup request to driver anyway, so that there should be API call that triggers it
19:03:57 <jdg> Ok, we're out of time.  Sounds like we can pick up on this again next week for sure.
19:04:08 <DuncanT> It sounds like we have some conscientious on how boot-from-volume could work, even if backup/snapshot is a bit up-it-the air?
19:04:46 <jdg> DuncanT: Yes, I think folks agree on the top level.  Backup/Snapshot details still needs some discussion.
19:04:48 <DuncanT> Maybe I'll try to summerise my understanding of the boot from volume, with notes on the code that seems to be missing, for next week?
19:04:50 <renuka> DuncanT: can you summarize?
19:05:02 <YorikSar> DuncanT: If we support future with backups in nova-volume, this logic should be moved to nova-volume.
19:05:06 <jdg> Another question I have is how this impacts the existing blueprint and work that's been done by samsung
19:05:20 <clayg> jdg: any progress on uuids?
19:05:28 <jdg> clayg:  :)
19:05:40 <DuncanT> jdg: Is http://wiki.openstack.org/AutoCreateBootVolumes the samsung one?
19:05:55 <jdg> Working on it.  My first approach trashed ec2 calls.
19:05:57 <DuncanT> Sorry, I don't know who's who
19:05:58 <renuka> DuncanT: could you put down what we have agreed on...i am still confused about this
19:05:59 <jdg> DuncanT: Yes
19:06:02 <YorikSar> And I've been beaten for unifying volume API and extension.
19:06:37 <DuncanT> jdg: Ok, I'll make sure any interaction with that is documented
19:06:56 <jdg> clayg: I'm going to take another look today and mabye send out an email to volume list about what I'm trying to do.
19:07:26 <jdg> My thought now is to just modify existing DB/API methods to check if they're recieving a UUID versus int-id and behave accordingly.
19:07:28 <ogelbukh> vishy mentioned some plan for separation nova-volume into project of it's own
19:07:34 <jdg> But this creates some confusion higher up
19:07:49 <jdg> ogelbukh:  Yes
19:07:52 <ogelbukh> did anyone see updates on that?
19:07:58 <DuncanT> renuka: I'll email our the summary ASAP then you can comment on that... to be honest it feels like what you were doing is basically what I'm thinking of, other than using something slightly smarter than DD
19:08:09 <ogelbukh> or he's going to do it at summit
19:08:13 <jdg> ogelbukh: I think that's going to be relegated more towards the summit
19:08:21 <YorikSar> We need to use hash-action to fixate what should be donne by next week
19:08:31 <ogelbukh> jdg: oh, fine
19:08:53 <renuka> DuncanT: i think this could be a different service altogether, or the closest would be to have it become part of nova-compute
19:08:56 <jdg> #action DuncanT to send out summary of where we're at on BFV
19:09:21 <DuncanT> renuka: It is going to involve nova-compute work, definitely
19:09:30 <jdg> Anything else real quick?
19:10:02 <jdg> Ok, thanks everyone.
19:10:06 <jdg> #endmeeting