14:00:40 <domhnallw> #startmeeting freezer
14:00:40 <openstack> Meeting started Thu Jun 30 14:00:40 2016 UTC and is due to finish in 60 minutes. The chair is domhnallw. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:40 <ddieterly> groovy
14:00:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:44 <openstack> The meeting name has been set to 'freezer'
14:00:50 <ddieterly> o/
14:01:01 <domhnallw> Okay, I don't see any topics in the etherpad, has anyone any suggestions?
14:01:53 <m3m0> o/
14:01:55 <slashme> o/
14:01:55 <daemontool> o/
14:02:03 <domhnallw> #topic Using bup for backup engine - Tim Buckley to lead discussion
14:02:45 <domhnallw> timothyb89?
14:02:48 <timothyb89> so, I'm interested in adding a new engine that would use https://github.com/bup/bup
14:03:31 <timothyb89> it would give us solid deduplication for free, in addition to incremental backups and compression
14:03:56 <domhnallw> I do note this in the readme: "Reasons you might want to avoid bup: This is a very early version. Therefore it will most probably not work for you, but we don't know why. It is also missing some probably-critical features."
14:04:04 <ddieterly> it would fix the limitations tar has with incrementals and name changes/moves
14:04:38 <timothyb89> while the github repo does say that (and it is still true, in some areas) it is still a fairly mature project
14:04:41 <timothyb89> it's been around ~6 years now
14:05:07 <timothyb89> and its underlying format (git packfile) is pretty battle-tested by now
14:05:12 <m3m0> how is it doing in the performance area?
14:05:47 <timothyb89> I haven't measured that myself, but I'd imagine pretty decently... it seems that they have implemented all the performance-critical parts in C
14:05:58 <ddieterly> current release 0.28.1 is a red flag
14:06:21 <m3m0> and I wonder how it will behave when working with streams
14:06:22 <domhnallw> ddieterly, red flag?
14:06:35 <domhnallw> The 0.x version number you mean?
14:06:38 <ddieterly> it is not at a 1.0 release yet
14:06:43 <daemontool> timothyb89, can it be used to store data in the currently supported media storage?
14:06:44 <ddieterly> domhnallw yes
14:07:32 <daemontool> how would it work, would we wrap the executable?
14:07:36 <timothyb89> daemontool: to my knowledge it's only designed for filesystem-to-filesystem backups, so there would need to be a small layer on top for, e.g., swift
14:08:15 <domhnallw> There's a related project called 'bupper' that facilitates config file-based profiles: https://github.com/tobru/bupper
14:08:16 <timothyb89> wrapping needs further investigation, but it might not need to be wrapped at all, given that it's mainly written in python
14:08:46 <timothyb89> we may be able to use it directly, or worst case use it like we currently use tar
14:09:20 <daemontool> timothyb89, if we can use the code without wrapping binaries, +1
14:11:18 <jonaspf1> there are other backup tools similar to bup, e.g. borg, attic, ...
14:11:38 <domhnallw> Might also be worth looking at this near the bottom of the readme: https://github.com/bup/bup/tree/0.28.1#things-that-are-stupid-for-now-but-which-well-fix-later
14:11:39 <jonaspf1> has anybody tried to analyse the different options?
14:12:08 <m3m0> i did with borg
14:12:17 <m3m0> python 3 only
14:12:19 <ddieterly> it would be great if someone could make a list of possible options in a bp and pros/cons of each option
14:12:25 <m3m0> works with local storages
14:12:32 <m3m0> but not so sure about swift
14:12:33 <daemontool> and also define some requirements from our side
14:12:38 <daemontool> for the tool
14:12:52 <daemontool> what it needs to support/provide in order to be included
14:13:08 <domhnallw> Absolutely, a concrete set of requirements must come first.
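The "use it like we currently use tar" option discussed above would mean driving the bup CLI as an external process. A minimal sketch of that idea follows; the `bup init`/`bup index`/`bup save` commands and the `BUP_DIR` environment variable are bup's documented interface, while the function name and the `dry_run` switch are invented here purely for illustration:

```python
import os
import subprocess


def bup_backup(path, name, bup_dir, dry_run=False):
    """Sketch of wrapping bup the way freezer currently wraps tar.

    Runs the standard bup sequence: initialise the repository, index
    the tree, then save it under a named branch. With dry_run=True the
    command lists are returned without being executed.
    """
    # BUP_DIR tells bup where its repository lives; keep the rest of
    # the environment intact so bup can still be found on PATH.
    env = {**os.environ, "BUP_DIR": bup_dir}
    commands = [
        ["bup", "init"],
        ["bup", "index", path],
        ["bup", "save", "-n", name, path],
    ]
    if dry_run:
        return commands
    for cmd in commands:
        subprocess.check_call(cmd, env=env)
    return commands
```

As timothyb89 notes, bup is mostly Python, so importing it as a library could avoid the subprocess layer entirely; that option would need the further investigation mentioned above.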
14:13:24 <domhnallw> (a *lot* easier said than done though)
14:13:43 <ddieterly> if everybody contributes what they know, then it would be easy
14:13:55 <ddieterly> step 1, create a bp
14:14:35 <ddieterly> step 3, profit
14:14:44 <jonaspf1> I looked at different backup systems for my private home backup a while ago. I found this overview very useful: https://wiki.archlinux.org/index.php/Synchronization_and_backup_programs
14:14:55 <daemontool> I can write the requirements
14:15:02 <daemontool> in the bp
14:15:57 <ddieterly> #action daemontool to create bp for backup engine options/requirements
14:16:38 <daemontool> we need to be consistent with previous engine conversations
14:16:53 <daemontool> I'll check that with slashme
14:17:02 <ddieterly> are those recorded somewhere?
14:17:07 <domhnallw> I was only aware of tar?
14:17:23 <ddieterly> domhnallw there's dar as well
14:17:24 <m3m0> ddieterly: I see what you did there :P
14:17:54 <ddieterly> m3m0 i didn't just fall off a turnip truck
14:18:28 <daemontool> ok next?
14:18:50 <domhnallw> We're done with this topic then?
14:19:10 <ddieterly> sure
14:19:18 <domhnallw> Okay.
14:19:19 <domhnallw> #topic Policy on breaking backward compatibility
14:19:32 <domhnallw> This should be fun :)
14:19:40 <ddieterly> i know we discussed this before, but i'm not sure if there is a stated policy on this
14:19:50 <daemontool> domhnallw, nice to meet you btw
14:19:54 <domhnallw> Likewise :)
14:20:00 <ddieterly> when daemontool is around, it is always fun
14:20:11 <domhnallw> If anyone has any additional topics, now would be an excellent time to add them to the etherpad :)
14:21:10 <ddieterly> do we have a policy on breaking backward compatibility?
14:21:24 <domhnallw> So, backward compatibility. Are we talking about changing freezer's behaviour, or changing its requirement versions?
14:21:36 <timothyb89> behavior I believe?
14:22:03 <ddieterly> as i understand it, version n does not work with version n+1
14:23:09 <domhnallw> So I guess we need to also think about whether we're talking about the API, the scheduler, or the agent, and if we're talking command-line arguments, configuration settings, etc. etc.
14:23:30 <ddieterly> if you are adding things to the etherpad, could you please pick a color so that we know who is adding what?
14:24:01 <domhnallw> ddieterly, it looks so far that it's you and yangyapeng, is that wrong?
14:24:36 <ddieterly> it looks like someone is adding with purple
14:24:52 <yangyapeng> not me :)
14:24:53 <ddieterly> maybe that's lavender, i'm not a color expert
14:25:05 <ddieterly> 'How can we scale'...
14:25:11 <ddieterly> who dat?
14:25:30 <ddieterly> got to be daemontool
14:25:49 <domhnallw> Whoever it is, please state your name and, erm, colour?
14:25:49 <yangyapeng> i guess daemontool ?
14:25:50 <domhnallw> :p
14:25:57 <daemontool> it's me yes
14:26:04 <slashme> domhnallw: usually, for cli arguments
14:26:06 <daemontool> first of all, you are welcome
14:26:09 <daemontool> :)
14:26:12 <daemontool> ok
14:26:14 <yangyapeng> haha :) I see the patch in gerrit
14:26:16 <slashme> Adding is no problem
14:26:30 <slashme> removing needs to be deprecated for a release cycle
14:26:34 <ddieterly> daemontool just put your name on a color so that we know who it is
14:27:21 <ddieterly> daemontool thank you
14:27:36 <domhnallw> Okay, they're both you :)
14:27:43 <domhnallw> Thanks folks.
14:27:50 <domhnallw> Now, back on topic.
14:27:52 <daemontool> so, getting back to serious things
14:27:59 <domhnallw> CLI arguments for the various elements of freezer I believe?
14:28:03 <timothyb89> one example I was wondering about was fixing the '--no-incremental' argument type (string -> boolean)
14:28:03 <daemontool> how do we back up a large data set,
14:28:13 <daemontool> without re-reading all the data every time
14:28:23 <daemontool> ?
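The string-to-boolean migration timothyb89 raises can be done without breaking existing callers during the deprecation cycle slashme describes: accept the legacy string values and the bare flag form simultaneously. A hypothetical argparse sketch (the real freezer option handling may differ; the helper name is invented):

```python
import argparse


def str2bool(value):
    # Accept legacy string values ("True"/"False" etc.) as well as real
    # booleans, so existing job configs keep working for one release cycle.
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


parser = argparse.ArgumentParser()
# Hypothetical: nargs="?" lets both '--no-incremental' (new boolean style)
# and '--no-incremental True' (legacy string style) parse to a bool.
parser.add_argument("--no-incremental", type=str2bool, nargs="?",
                    const=True, default=False)

args = parser.parse_args(["--no-incremental", "True"])  # legacy form
print(args.no_incremental)  # -> True
```

After a cycle of emitting a deprecation warning for the string form, the option could be switched to a plain `action="store_true"`.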
14:29:48 <ddieterly> daemontool what is the issue with the current implementation?
14:29:59 <daemontool> that you need to re-read all the data every time
14:30:14 <daemontool> I mean not now
14:30:20 <daemontool> because we do not support block-based incrementals
14:30:32 <daemontool> so we check only file inode changes
14:30:59 <ddieterly> would a different engine besides tar solve some of that?
14:31:09 <daemontool> it depends on the engine
14:31:12 <daemontool> dar nope
14:31:27 <ddieterly> ok, so a different engine may help with that
14:31:28 <daemontool> so the problem is that if you have 1TB
14:31:31 <timothyb89> I think it would have to be a different engine, right?
14:31:35 <daemontool> yes
14:31:43 <daemontool> you read it today
14:31:48 <daemontool> compute the block hashes
14:32:03 <daemontool> or anything that keeps track of block state
14:32:04 <ddieterly> ok, dumb idea, why don't users just back up entire volumes of the cloud hosts?
14:32:27 <daemontool> ddieterly, ok, no incremental?
14:32:36 <daemontool> if we have a 1TB
14:32:38 <ddieterly> use a huge net instead of trying to pick up each little fishy
14:32:43 <domhnallw> Incremental backups are designed to save space, right? Just save the changes rather than having multiple copies of near-identical data?
14:33:04 <domhnallw> I can't see when that wouldn't be at least desirable.
14:33:04 <daemontool> so let's say we have a volume of 1 TB today
14:33:26 <daemontool> we execute a volume backup today
14:33:29 <daemontool> what do we do tomorrow?
14:33:37 <daemontool> back up the whole volume again?
14:33:51 <daemontool> if we do incremental
14:33:58 <ddieterly> daemontool incremental?
14:34:03 <daemontool> then we need to check the block differences with yesterday's execution
14:34:09 <daemontool> but every time we do a backup
14:34:14 <daemontool> we need to re-read the 1TB
14:34:28 <daemontool> now think if we have 500 volumes, 1TB each
14:34:33 <daemontool> every time we need to re-read 500TB
14:34:38 <daemontool> that does not scale...
14:34:46 <daemontool> so, how do we scale? :(
14:34:53 <ddieterly> so, you propose to check inodes instead of the data itself?
14:35:02 <daemontool> ddieterly, that's how we do things now with tar
14:35:11 <daemontool> and with that approach every day we back up 1TB
14:35:12 <daemontool> each time
14:35:17 <daemontool> fast but not efficient
14:35:24 <daemontool> this is important
14:35:29 <daemontool> because in enterprise environments
14:35:33 <ddieterly> so, is there an efficient way to do this?
14:35:35 <daemontool> 1PB of storage starts to be common
14:35:49 <domhnallw> Throwing hardware at the problem is only ever a stop-gap solution.
14:35:49 <ddieterly> what do other backup solutions do to handle this?
14:35:56 <daemontool> we need to adopt something that tracks the blocks changed in the fs
14:36:01 <daemontool> when the data is written
14:36:03 <ddieterly> daemontool are you proposing a solution?
14:36:03 <daemontool> by the application
14:36:13 <ddieterly> omg
14:36:19 <daemontool> so for instance
14:36:38 <daemontool> well this is a discussion I had with a customer
14:36:49 <daemontool> not sure about the solution
14:36:58 <ddieterly> does anybody do this? is there a precedent for this?
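The block-hash approach daemontool describes can be sketched as follows: keep a manifest mapping block index to hash, and on the next run only ship the blocks whose hash changed. Everything here is illustrative (function names, block size), not freezer code, and it deliberately exhibits the scaling problem under discussion: computing the new manifest still re-reads the whole volume.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB; the block size is an arbitrary choice


def block_hashes(path):
    """Map block index -> SHA-256 digest for every block of a file/volume."""
    hashes = {}
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            hashes[index] = hashlib.sha256(block).hexdigest()
            index += 1
    return hashes


def changed_blocks(old_manifest, new_manifest):
    """Block indices that must be re-uploaded for an incremental backup."""
    return sorted(i for i, h in new_manifest.items()
                  if old_manifest.get(i) != h)
```

This is why the conversation turns to filesystem-level changed-block tracking: a tracker that records dirty blocks as writes happen (as ZFS does internally) would let the backup skip the full re-read entirely.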
14:37:03 <daemontool> so zfs keeps a hashtable of changed blocks in the fs
14:37:09 <domhnallw> It sounds a lot like Dropbox if I'm honest :)
14:37:11 <ddieterly> sounds like you want to write a logging fs
14:37:34 <ddieterly> domhnallw don't be honest, lie to us, we like that better ;-)
14:37:34 <daemontool> ddieterly, nope, I think we need to support that with at least one FS that provides it
14:37:55 <daemontool> but I don't know about the solution
14:38:01 <domhnallw> It would seem to me that doing filesystem-specific stuff might not be the cleverest solution in an open environment?
14:38:09 <daemontool> if we solve this, we provide enterprise-grade backups
14:38:11 <slashme> That's something we should investigate
14:38:14 <domhnallw> Yep.
14:38:25 <daemontool> ok
14:38:28 <daemontool> :(
14:38:29 <daemontool> :)
14:38:36 <slashme> And I think this is a topic for the midcycle
14:38:36 <daemontool> next
14:38:41 <domhnallw> Okay.
14:38:42 <daemontool> yes definitely
14:38:58 <domhnallw> #topic Tenant resources backup (relates to backup as a service)
14:39:04 <domhnallw> https://blueprints.launchpad.net/freezer/+spec/tenant-backup/
14:39:07 <slashme> We said no to this
14:39:10 <daemontool> ah ok
14:39:14 <daemontool> sorry
14:39:17 <slashme> For two reasons
14:39:52 <slashme> 1. Way too much code to implement and maintain to support this (we would need to map every api call for every OpenStack service)
14:40:08 <daemontool> this was also what people asked for in the summit, during the design session
14:40:15 <daemontool> but if we do not want to provide that
14:40:18 <daemontool> fine for me
14:40:21 <daemontool> :)
14:40:33 <slashme> 2. This is basically what Smaug does and they are a step ahead of us on that topic
14:40:56 <domhnallw> Okay, so moving on?
14:41:03 <domhnallw> We've already covered this topic I think: 'How do we scale? (i.e. use case: backup a data set of 500TB)'
14:41:19 <domhnallw> #topic 'How do we scale? (i.e. use case: backup a data set of 500TB)'
14:41:36 <domhnallw> I'm putting it in for completeness but I think we've just had this discussion.
14:41:54 <domhnallw> Anyone?
14:42:39 <domhnallw> Right. Next topic?
14:42:57 <domhnallw> #topic Add more back-end storages (AWS S3?)
14:43:08 <domhnallw> This is from daemontool again.
14:43:26 <domhnallw> Again, we've been here a bit already.
14:43:32 <domhnallw> Think we can skip, that okay?
14:43:43 <daemontool> what was the outcome in brief, sorry?
14:44:00 <domhnallw> We have an action to generate a requirements document to look at what we'd need here.
14:44:15 <slashme> That will be easy once the agent refactoring is completed
14:44:15 <domhnallw> (for any engines we'd consider)
14:44:34 <daemontool> ok
14:44:46 <domhnallw> Good to move on? Or am I rushing things?
14:45:09 <daemontool> domhnallw, good good
14:45:15 <domhnallw> Okay.
14:45:16 <domhnallw> #topic Add multiple snapshotting technologies, not only LVM or cinder volumes. We need to integrate with 3PP storage snapshotting APIs.
14:46:46 <daemontool> that'd be about interacting directly with the back-end storage to generate snapshots
14:46:59 <daemontool> like VNX, 3PAR, etc.
14:47:30 <daemontool> zfs, btrfs, etc.
14:47:30 <domhnallw> So I guess we'd need to see if anyone has a wishlist of vendor and/or technologies it'd be preferred to support?
14:47:49 <domhnallw> vendors*
14:48:15 <daemontool> let's start with the technologies directly related to the companies we work for, that'd be easier for all of us to justify time
14:48:17 <daemontool> to invest on it
14:48:26 <daemontool> ?
14:49:45 <domhnallw> Seems sensible. Would we need to devise a minimum set of supported actions each technology would need to be able to implement (or work around), or is this already in place?
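slashme's point that new back ends and snapshot technologies become "just a plugin" after the agent refactoring implies a narrow storage interface that each back end implements. A hypothetical sketch of what such an interface could look like (class and method names are invented here, not freezer's actual classes; an S3 or swift back end would be another subclass):

```python
import os
from abc import ABC, abstractmethod


class BackendStorage(ABC):
    """Minimal storage-plugin interface; each back end is one subclass."""

    @abstractmethod
    def put(self, backup_id, data):
        """Store a backup blob under an identifier."""

    @abstractmethod
    def get(self, backup_id):
        """Retrieve a previously stored backup blob."""


class LocalStorage(BackendStorage):
    """Filesystem implementation, standing in for swift/S3 back ends."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, backup_id, data):
        with open(os.path.join(self.root, backup_id), "wb") as f:
            f.write(data)

    def get(self, backup_id):
        with open(os.path.join(self.root, backup_id), "rb") as f:
            return f.read()
```

With the engine code written against `BackendStorage` only, adding S3 would reduce to one new subclass wrapping an S3 client, which matches the "easy once the refactoring is completed" assessment.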
14:49:54 <slashme> daemontool: Same answer
14:50:06 <slashme> Easy once the agent refactoring is completed
14:50:18 * domhnallw thought it was "no for two reasons" :)
14:50:19 <slashme> It will just be adding a new snapshot plugin
14:50:45 <daemontool> ok
14:51:08 <domhnallw> Next?
14:51:10 <slashme> Moreover snapshot plugins will be the simplest
14:51:51 <slashme> domhnallw: yes
14:51:54 <domhnallw> #topic Fix freezer-scheduler use of cron triggers
14:51:59 <domhnallw> From yangyapeng
14:52:08 <yangyapeng> hello
14:52:28 <domhnallw> hi :)
14:52:29 <yangyapeng> has anyone tested schedule_date and start_date in job_schedule?
14:53:10 <yangyapeng> I have done some testing of the cron trigger in freezer-scheduler; cron is unavailable
14:54:02 <domhnallw> So are we saying cron is unavailable/unmocked in the testing environment, or...?
14:54:45 <slashme> I think we are saying it is just not working
14:54:58 <yangyapeng> yeah
14:54:58 <domhnallw> Oh :)
14:55:19 <yangyapeng> it is not working, i will fix it
14:55:33 <domhnallw> As an aside, I was a bit confused looking at the API where some dates/times were expressed as Unix timestamps and others as ISO 8601 for no immediately obvious reason...
14:55:54 <domhnallw> That might be for another time though.
14:56:32 <domhnallw> So yangyapeng is that an action for you?
14:56:52 <yangyapeng> yeah,
14:57:28 <domhnallw> #action yangyapeng Fix cron/scheduling issues in freezer-scheduler
14:57:37 <domhnallw> :)
14:57:43 <yangyapeng> :)
14:57:52 <domhnallw> Next?
14:57:56 <domhnallw> #topic Improve volume backups through the Cinder APIs (i.e. retention, delete, list backups, metadata)
14:58:24 <domhnallw> First of all, what's bad/wrong with the current state of this?
14:58:38 <domhnallw> daemontool ?
14:59:03 <domhnallw> We're running low on time here, I've noticed.
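For context on the cron-trigger bug: a scheduler has to translate a crontab-style expression into "does this job fire now?" decisions. The toy matcher below illustrates that logic only; it is not freezer-scheduler's implementation (which relies on a scheduling library), and it simplifies the day-of-week field to Python's 0=Monday convention rather than cron's 0=Sunday:

```python
def parse_field(field, lo, hi):
    """Expand one crontab field ('*', '*/5', '1,15', '1-5') into a set of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_s = part.split("/")
            step = int(step_s)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_s, end_s = part.split("-")
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr, when):
    """True if datetime `when` matches a 5-field crontab expression.

    Simplification: day-of-week uses Python's weekday() (0=Monday),
    not cron's 0=Sunday convention.
    """
    minute, hour, dom, month, dow = expr.split()
    return (when.minute in parse_field(minute, 0, 59)
            and when.hour in parse_field(hour, 0, 23)
            and when.day in parse_field(dom, 1, 31)
            and when.month in parse_field(month, 1, 12)
            and when.weekday() in parse_field(dow, 0, 6))
```

Real cron semantics have more corner cases (day-of-month vs day-of-week OR logic, names like `sun`), which is exactly why scheduler bugs like the one yangyapeng found tend to hide in the trigger layer.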
14:59:08 <yangyapeng> ping iceyao
14:59:21 <iceyao> here
14:59:21 <daemontool> yes
14:59:24 <daemontool> sorry
14:59:32 <daemontool> so we do not remove old backups currently
14:59:36 <domhnallw> Right folks, we may need to just jump to the reviews that need looking at soon...
14:59:38 <yangyapeng> the topic is yours?
14:59:43 <daemontool> when executing cinder backups using the cinder api
15:00:04 <daemontool> let's go to the chan
15:00:10 <domhnallw> Okay.
15:00:12 <domhnallw> Pending reviews?
15:00:19 <domhnallw> And then we'll wrap up.
15:00:29 <domhnallw> First, a "hot pickle": https://review.openstack.org/#/c/331880
15:00:31 <EinstCrazy> Currently, volume backups cannot be deleted or listed by freezer
15:00:31 <EinstCrazy> Native cinder backup
15:00:31 <EinstCrazy> So, we need to work on it
15:00:37 <bswartz> ¬_¬
15:00:43 <daemontool> I think we need to move, guys
15:00:50 <daemontool> now there's another meeting here
15:00:53 <domhnallw> Yep, will we close this out now?
15:00:57 <daemontool> yes
15:01:00 <domhnallw> #endmeeting freezer