14:00:40 <domhnallw> #startmeeting freezer
14:00:40 <openstack> Meeting started Thu Jun 30 14:00:40 2016 UTC and is due to finish in 60 minutes. The chair is domhnallw. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:40 <ddieterly> groovy
14:00:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:44 <openstack> The meeting name has been set to 'freezer'
14:00:50 <ddieterly> o/
14:01:01 <domhnallw> Okay, I don't see any topics in the etherpad, has anyone any suggestions?
14:01:53 <m3m0> o/
14:01:55 <slashme> o/
14:01:55 <daemontool> o/
14:02:03 <domhnallw> #topic Using bup for backup engine - Tim Buckley to lead discussion
14:02:45 <domhnallw> timothyb89?
14:02:48 <timothyb89> so, I'm interested in adding a new engine that would use https://github.com/bup/bup
14:03:31 <timothyb89> it would give us solid deduplication for free, in addition to incremental backups and compression
14:03:56 <domhnallw> I do note this in the readme: "Reasons you might want to avoid bup: This is a very early version. Therefore it will most probably not work for you, but we don't know why. It is also missing some probably-critical features."
14:04:04 <ddieterly> it would fix the limitations tar has with incrementals and name changes/moves
14:04:38 <timothyb89> while the github repo does say that (and it is still true, in some areas) it is still a fairly mature project
14:04:41 <timothyb89> it's been around ~6 years now
14:05:07 <timothyb89> and its underlying format (git packfile) is pretty battle-tested by now
14:05:12 <m3m0> how is it doing in the performance area?
14:05:47 <timothyb89> I haven't measured that myself, but I'd imagine pretty decently... it seems that they have implemented all the performance-critical parts in C
14:05:58 <ddieterly> current release 0.28.1 is a red flag
14:06:21 <m3m0> and I wonder how it will behave when working with streams
14:06:22 <domhnallw> ddieterly, red flag?
14:06:35 <domhnallw> The 0.x version number you mean?
14:06:38 <ddieterly> it is not at a 1.0 release yet
14:06:43 <daemontool> timothyb89, can it be used to store data in the currently supported media storage?
14:06:44 <ddieterly> domhnallw yes
14:07:32 <daemontool> how would it work, would we wrap the executable?
14:07:36 <timothyb89> daemontool: to my knowledge it's only designed for filesystem-to-filesystem backups, so there would need to be a small layer on top for, e.g., swift
14:08:15 <domhnallw> There's a related project called 'bupper' that facilitates config file-based profiles: https://github.com/tobru/bupper
14:08:16 <timothyb89> wrapping needs further investigation, but it might not need to be wrapped at all, given that it's mainly written in python
14:08:46 <timothyb89> we may be able to use it directly, or worst case use it like we currently use tar
14:09:20 <daemontool> timothyb89, if we can use the code without wrapping binaries, +1
14:11:18 <jonaspf1> there are other backup tools similar to bup, e.g. borg, attic, ...
14:11:38 <domhnallw> Might also be worth looking at this near the bottom of the readme: https://github.com/bup/bup/tree/0.28.1#things-that-are-stupid-for-now-but-which-well-fix-later
14:11:39 <jonaspf1> has anybody tried to analyse the different options?
14:12:08 <m3m0> i did with borg
14:12:17 <m3m0> python 3 only
14:12:19 <ddieterly> it would be great if someone could make a list of possible options in a bp and pros/cons of each option
14:12:25 <m3m0> works with local storages
14:12:32 <m3m0> but not so sure about swift
14:12:33 <daemontool> and also define some requirements from our side
14:12:38 <daemontool> for the tool
14:12:52 <daemontool> what it needs to support/provide in order to be included
14:13:08 <domhnallw> Absolutely, a concrete set of requirements must come first.
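The "use it like we currently use tar" option discussed above would mean driving the bup CLI as an external process. A minimal sketch of that idea follows; the `bup init`/`bup index`/`bup save` commands and the `BUP_DIR` environment variable are bup's documented interface, while the function name and the `dry_run` switch are invented here purely for illustration:

```python
import os
import subprocess


def bup_backup(path, name, bup_dir, dry_run=False):
    """Sketch of wrapping bup the way freezer currently wraps tar.

    Runs the standard bup sequence: initialise the repository, index
    the tree, then save it under a named branch. With dry_run=True the
    command lists are returned without being executed.
    """
    # BUP_DIR tells bup where its repository lives; keep the rest of
    # the environment intact so bup can still be found on PATH.
    env = {**os.environ, "BUP_DIR": bup_dir}
    commands = [
        ["bup", "init"],
        ["bup", "index", path],
        ["bup", "save", "-n", name, path],
    ]
    if dry_run:
        return commands
    for cmd in commands:
        subprocess.check_call(cmd, env=env)
    return commands
```

As timothyb89 notes, bup is mostly Python, so importing it as a library could avoid the subprocess layer entirely; that option would need the further investigation mentioned above.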
14:13:24 <domhnallw> (a *lot* easier said than done though)
14:13:43 <ddieterly> if everybody contributes what they know, then it would be easy
14:13:55 <ddieterly> step 1, create a bp
14:14:35 <ddieterly> step 3, profit
14:14:44 <jonaspf1> I looked at different backup systems for my private home backup a while ago. I found this overview very useful: https://wiki.archlinux.org/index.php/Synchronization_and_backup_programs
14:14:55 <daemontool> I can write the requirements
14:15:02 <daemontool> in the bp
14:15:57 <ddieterly> #action daemontool to create bp for backup engine options/requirements
14:16:38 <daemontool> we need to be consistent with previous engine conversations
14:16:53 <daemontool> I'll check that with slashme
14:17:02 <ddieterly> are those recorded somewhere?
14:17:07 <domhnallw> I was only aware of tar?
14:17:23 <ddieterly> domhnallw there's dar as well
14:17:24 <m3m0> ddieterly: I see what you did there :P
14:17:54 <ddieterly> m3m0 i didn't just fall off a turnip truck
14:18:28 <daemontool> ok next?
14:18:50 <domhnallw> We're done with this topic then?
14:19:10 <ddieterly> sure
14:19:18 <domhnallw> Okay.
14:19:19 <domhnallw> #topic Policy on breaking backward compatibility
14:19:32 <domhnallw> This should be fun :)
14:19:40 <ddieterly> i know we discussed this before, but i'm not sure if there is a stated policy on this
14:19:50 <daemontool> domhnallw, nice to meet you btw
14:19:54 <domhnallw> Likewise :)
14:20:00 <ddieterly> when daemontool is around, it is always fun
14:20:11 <domhnallw> If anyone has any additional topics, now would be an excellent time to add them to the etherpad :)
14:21:10 <ddieterly> do we have a policy on breaking backward compatibility?
14:21:24 <domhnallw> So, backward compatibility. Are we talking about changing freezer's behaviour, or changing its requirement versions?
14:21:36 <timothyb89> behavior I believe?
14:22:03 <ddieterly> as i understand it, version n does not work with version n+1
14:23:09 <domhnallw> So I guess we need to also think about whether we're talking about the API, the scheduler, or the agent, and if we're talking command-line arguments, configuration settings, etc. etc.
14:23:30 <ddieterly> if you are adding things to the etherpad, could you please pick a color so that we know who is adding what?
14:24:01 <domhnallw> ddieterly, it looks so far that it's you and yangyapeng, is that wrong?
14:24:36 <ddieterly> it looks like someone is adding with purple
14:24:52 <yangyapeng> not me :)
14:24:53 <ddieterly> maybe that's lavender, i'm not a color expert
14:25:05 <ddieterly> 'How can we scale'...
14:25:11 <ddieterly> who dat?
14:25:30 <ddieterly> got to be daemontool
14:25:49 <domhnallw> Whoever it is, please state your name and, erm, colour?
14:25:49 <yangyapeng> i guess daemontool ?
14:25:50 <domhnallw> :p
14:25:57 <daemontool> it's me yes
14:26:04 <slashme> domhnallw: usually, for cli arguments
14:26:06 <daemontool> first of all, you are welcome
14:26:09 <daemontool> :)
14:26:12 <daemontool> ok
14:26:14 <yangyapeng> haha :) I see the patch in gerrit
14:26:16 <slashme> Adding is no problem
14:26:30 <slashme> removing needs to be deprecated for a release cycle
14:26:34 <ddieterly> daemontool just put your name on a color so that we know who it is
14:27:21 <ddieterly> daemontool thank you
14:27:36 <domhnallw> Okay, they're both you :)
14:27:43 <domhnallw> Thanks folks.
14:27:50 <domhnallw> Now, back on topic.
14:27:52 <daemontool> so, getting back to serious things
14:27:59 <domhnallw> CLI arguments for the various elements of freezer I believe?
14:28:03 <timothyb89> one example I was wondering about was fixing the '--no-incremental' argument type (string -> boolean)
14:28:03 <daemontool> how do we back up a large data set,
14:28:13 <daemontool> without re-reading all the data every time
14:28:23 <daemontool> ?
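The string-to-boolean migration timothyb89 raises can be done without breaking existing callers during the deprecation cycle slashme describes: accept the legacy string values and the bare flag form simultaneously. A hypothetical argparse sketch (the real freezer option handling may differ; the helper name is invented):

```python
import argparse


def str2bool(value):
    # Accept legacy string values ("True"/"False" etc.) as well as real
    # booleans, so existing job configs keep working for one release cycle.
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


parser = argparse.ArgumentParser()
# Hypothetical: nargs="?" lets both '--no-incremental' (new boolean style)
# and '--no-incremental True' (legacy string style) parse to a bool.
parser.add_argument("--no-incremental", type=str2bool, nargs="?",
                    const=True, default=False)

args = parser.parse_args(["--no-incremental", "True"])  # legacy form
print(args.no_incremental)  # -> True
```

After a cycle of emitting a deprecation warning for the string form, the option could be switched to a plain `action="store_true"`.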
14:29:48 <ddieterly> daemontool what is the issue with the current implementation?
14:29:59 <daemontool> that you need to re-read all the data every time
14:30:14 <daemontool> I mean not now
14:30:20 <daemontool> because we do not support block-based incrementals
14:30:32 <daemontool> so we check only file inode changes
14:30:59 <ddieterly> would a different engine besides tar solve some of that?
14:31:09 <daemontool> it depends on the engine
14:31:12 <daemontool> dar nope
14:31:27 <ddieterly> ok, so a different engine may help with that
14:31:28 <daemontool> so the problem is that if you have 1TB
14:31:31 <timothyb89> I think it would have to be a different engine, right?
14:31:35 <daemontool> yes
14:31:43 <daemontool> you read it today
14:31:48 <daemontool> compute the block hashes
14:32:03 <daemontool> or anything that keeps track of block state
14:32:04 <ddieterly> ok, dumb idea, why don't users just back up entire volumes of the cloud hosts?
14:32:27 <daemontool> ddieterly, ok, no incremental?
14:32:36 <daemontool> if we have a 1TB
14:32:38 <ddieterly> use a huge net instead of trying to pick up each little fishy
14:32:43 <domhnallw> Incremental backups are designed to save space, right? Just save the changes rather than having multiple copies of near-identical data?
14:33:04 <domhnallw> I can't see when that wouldn't be at least desirable.
14:33:04 <daemontool> so let's say we have a volume of 1 TB today
14:33:26 <daemontool> we execute a volume backup today
14:33:29 <daemontool> what do we do tomorrow?
14:33:37 <daemontool> back up the whole volume again?
14:33:51 <daemontool> if we do incremental
14:33:58 <ddieterly> daemontool incremental?
14:34:03 <daemontool> then we need to check the block differences with yesterday's execution
14:34:09 <daemontool> but every time we do a backup
14:34:14 <daemontool> we need to re-read the 1TB
14:34:28 <daemontool> now think if we have 500 volumes, 1TB each
14:34:33 <daemontool> every time we need to re-read 500TB
14:34:38 <daemontool> that does not scale...
14:34:46 <daemontool> so, how do we scale? :(
14:34:53 <ddieterly> so, you propose to check inodes instead of the data itself?
14:35:02 <daemontool> ddieterly, that's how we do things now with tar
14:35:11 <daemontool> and with that approach every day we back up 1TB
14:35:12 <daemontool> each time
14:35:17 <daemontool> fast but not efficient
14:35:24 <daemontool> this is important
14:35:29 <daemontool> because in enterprise environments
14:35:33 <ddieterly> so, is there an efficient way to do this?
14:35:35 <daemontool> 1PB of storage starts to be common
14:35:49 <domhnallw> Throwing hardware at the problem is only ever a stop-gap solution.
14:35:49 <ddieterly> what do other backup solutions do to handle this?
14:35:56 <daemontool> we need to adopt something that tracks the blocks changed in the fs
14:36:01 <daemontool> when the data is written
14:36:03 <ddieterly> daemontool are you proposing a solution?
14:36:03 <daemontool> by the application
14:36:13 <ddieterly> omg
14:36:19 <daemontool> so for instance
14:36:38 <daemontool> well this is a discussion I had with a customer
14:36:49 <daemontool> not sure about the solution
14:36:58 <ddieterly> does anybody do this? is there a precedent for this?
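The block-hash approach daemontool describes can be sketched as follows: keep a manifest mapping block index to hash, and on the next run only ship the blocks whose hash changed. Everything here is illustrative (function names, block size), not freezer code, and it deliberately exhibits the scaling problem under discussion: computing the new manifest still re-reads the whole volume.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB; the block size is an arbitrary choice


def block_hashes(path):
    """Map block index -> SHA-256 digest for every block of a file/volume."""
    hashes = {}
    with open(path, "rb") as f:
        index = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            hashes[index] = hashlib.sha256(block).hexdigest()
            index += 1
    return hashes


def changed_blocks(old_manifest, new_manifest):
    """Block indices that must be re-uploaded for an incremental backup."""
    return sorted(i for i, h in new_manifest.items()
                  if old_manifest.get(i) != h)
```

This is why the conversation turns to filesystem-level changed-block tracking: a tracker that records dirty blocks as writes happen (as ZFS does internally) would let the backup skip the full re-read entirely.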
14:37:03 <daemontool> so zfs keeps a hashtable of changed blocks in the fs
14:37:09 <domhnallw> It sounds a lot like Dropbox if I'm honest :)
14:37:11 <ddieterly> sounds like you want to write a logging fs
14:37:34 <ddieterly> domhnallw don't be honest, lie to us, we like that better ;-)
14:37:34 <daemontool> ddieterly, nope, I think we need to support that with at least one FS that provides it
14:37:55 <daemontool> but I don't know about the solution
14:38:01 <domhnallw> It would seem to me that doing filesystem-specific stuff might not be the cleverest solution in an open environment?
14:38:09 <daemontool> if we solve this, we provide enterprise-grade backups
14:38:11 <slashme> That's something we should investigate
14:38:14 <domhnallw> Yep.
14:38:25 <daemontool> ok
14:38:28 <daemontool> :(
14:38:29 <daemontool> :)
14:38:36 <slashme> And I think this is a topic for the midcycle
14:38:36 <daemontool> next
14:38:41 <domhnallw> Okay.
14:38:42 <daemontool> yes definitely
14:38:58 <domhnallw> #topic Tenant resources backup (relates to backup as a service)
14:39:04 <domhnallw> https://blueprints.launchpad.net/freezer/+spec/tenant-backup/
14:39:07 <slashme> We said no to this
14:39:10 <daemontool> ah ok
14:39:14 <daemontool> sorry
14:39:17 <slashme> For two reasons
14:39:52 <slashme> 1. Way too much code to implement and maintain to support this (we would need to map every api call for every OpenStack service)
14:40:08 <daemontool> this was also what people asked for in the summit, during the design session
14:40:15 <daemontool> but if we do not want to provide that
14:40:18 <daemontool> fine for me
14:40:21 <daemontool> :)
14:40:33 <slashme> 2. This is basically what Smaug does and they are a step ahead of us on that topic
14:40:56 <domhnallw> Okay, so moving on?
14:41:03 <domhnallw> We've already covered this topic I think: 'How do we scale? (i.e. use case: backup a data set of 500TB)'
14:41:19 <domhnallw> #topic 'How do we scale? (i.e. use case: backup a data set of 500TB)'
14:41:36 <domhnallw> I'm putting it in for completeness but I think we've just had this discussion.
14:41:54 <domhnallw> Anyone?
14:42:39 <domhnallw> Right. Next topic?
14:42:57 <domhnallw> #topic Add more back-end storages (AWS S3?)
14:43:08 <domhnallw> This is from daemontool again.
14:43:26 <domhnallw> Again, we've been here a bit already.
14:43:32 <domhnallw> Think we can skip, that okay?
14:43:43 <daemontool> what was the outcome in brief, sorry?
14:44:00 <domhnallw> We have an action to generate a requirements document to look at what we'd need here.
14:44:15 <slashme> That will be easy once the agent refactoring is completed
14:44:15 <domhnallw> (for any engines we'd consider)
14:44:34 <daemontool> ok
14:44:46 <domhnallw> Good to move on? Or am I rushing things?
14:45:09 <daemontool> domhnallw, good good
14:45:15 <domhnallw> Okay.
14:45:16 <domhnallw> #topic Add multiple snapshotting technologies, not only LVM or cinder volumes. We need to integrate with 3PP storage snapshotting APIs.
14:46:46 <daemontool> that'd be about interacting directly with the back-end storage to generate snapshots
14:46:59 <daemontool> like VNX, 3PAR, etc.
14:47:30 <daemontool> zfs, btrfs, etc.
14:47:30 <domhnallw> So I guess we'd need to see if anyone has a wishlist of vendor and/or technologies it'd be preferred to support?
14:47:49 <domhnallw> vendors*
14:48:15 <daemontool> let's start with the technologies directly related to the companies we work for, that'd be easier for all of us to justify time
14:48:17 <daemontool> to invest on it
14:48:26 <daemontool> ?
14:49:45 <domhnallw> Seems sensible. Would we need to devise a minimum set of supported actions each technology would need to be able to implement (or work around), or is this already in place?
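slashme's point that new back ends and snapshot technologies become "just a plugin" after the agent refactoring implies a narrow storage interface that each back end implements. A hypothetical sketch of what such an interface could look like (class and method names are invented here, not freezer's actual classes; an S3 or swift back end would be another subclass):

```python
import os
from abc import ABC, abstractmethod


class BackendStorage(ABC):
    """Minimal storage-plugin interface; each back end is one subclass."""

    @abstractmethod
    def put(self, backup_id, data):
        """Store a backup blob under an identifier."""

    @abstractmethod
    def get(self, backup_id):
        """Retrieve a previously stored backup blob."""


class LocalStorage(BackendStorage):
    """Filesystem implementation, standing in for swift/S3 back ends."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def put(self, backup_id, data):
        with open(os.path.join(self.root, backup_id), "wb") as f:
            f.write(data)

    def get(self, backup_id):
        with open(os.path.join(self.root, backup_id), "rb") as f:
            return f.read()
```

With the engine code written against `BackendStorage` only, adding S3 would reduce to one new subclass wrapping an S3 client, which matches the "easy once the refactoring is completed" assessment.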
14:49:54 <slashme> daemontool: Same answer
14:50:06 <slashme> Easy once the agent refactoring is completed
14:50:18 * domhnallw thought it was "no for two reasons" :)
14:50:19 <slashme> It will just be adding a new snapshot plugin
14:50:45 <daemontool> ok
14:51:08 <domhnallw> Next?
14:51:10 <slashme> Moreover snapshot plugins will be the simplest
14:51:51 <slashme> domhnallw: yes
14:51:54 <domhnallw> #topic Fix freezer-scheduler use of cron triggers
14:51:59 <domhnallw> From yangyapeng
14:52:08 <yangyapeng> hello
14:52:28 <domhnallw> hi :)
14:52:29 <yangyapeng> has anyone tested schedule_date and start_date in job_schedule?
14:53:10 <yangyapeng> I have done some testing of the cron trigger in freezer-scheduler; cron is unavailable
14:54:02 <domhnallw> So are we saying cron is unavailable/unmocked in the testing environment, or...?
14:54:45 <slashme> I think we are saying it is just not working
14:54:58 <yangyapeng> yeah
14:54:58 <domhnallw> Oh :)
14:55:19 <yangyapeng> it is not working, i will fix it
14:55:33 <domhnallw> As an aside, I was a bit confused looking at the API where some dates/times were expressed as Unix timestamps and others as ISO 8601 for no immediately obvious reason...
14:55:54 <domhnallw> That might be for another time though.
14:56:32 <domhnallw> So yangyapeng is that an action for you?
14:56:52 <yangyapeng> yeah,
14:57:28 <domhnallw> #action yangyapeng Fix cron/scheduling issues in freezer-scheduler
14:57:37 <domhnallw> :)
14:57:43 <yangyapeng> :)
14:57:52 <domhnallw> Next?
14:57:56 <domhnallw> #topic Improve volume backups through the Cinder APIs (i.e. retention, delete, list backups, metadata)
14:58:24 <domhnallw> First of all, what's bad/wrong with the current state of this?
14:58:38 <domhnallw> daemontool ?
14:59:03 <domhnallw> We're running low on time here, I've noticed.
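For context on the cron-trigger bug: a scheduler has to translate a crontab-style expression into "does this job fire now?" decisions. The toy matcher below illustrates that logic only; it is not freezer-scheduler's implementation (which relies on a scheduling library), and it simplifies the day-of-week field to Python's 0=Monday convention rather than cron's 0=Sunday:

```python
def parse_field(field, lo, hi):
    """Expand one crontab field ('*', '*/5', '1,15', '1-5') into a set of ints."""
    values = set()
    for part in field.split(","):
        step = 1
        if "/" in part:
            part, step_s = part.split("/")
            step = int(step_s)
        if part == "*":
            start, end = lo, hi
        elif "-" in part:
            start_s, end_s = part.split("-")
            start, end = int(start_s), int(end_s)
        else:
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr, when):
    """True if datetime `when` matches a 5-field crontab expression.

    Simplification: day-of-week uses Python's weekday() (0=Monday),
    not cron's 0=Sunday convention.
    """
    minute, hour, dom, month, dow = expr.split()
    return (when.minute in parse_field(minute, 0, 59)
            and when.hour in parse_field(hour, 0, 23)
            and when.day in parse_field(dom, 1, 31)
            and when.month in parse_field(month, 1, 12)
            and when.weekday() in parse_field(dow, 0, 6))
```

Real cron semantics have more corner cases (day-of-month vs day-of-week OR logic, names like `sun`), which is exactly why scheduler bugs like the one yangyapeng found tend to hide in the trigger layer.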
14:59:08 <yangyapeng> ping iceyao
14:59:21 <iceyao> here
14:59:21 <daemontool> yes
14:59:24 <daemontool> sorry
14:59:32 <daemontool> so we do not remove old backups currently
14:59:36 <domhnallw> Right folks, we may need to just jump to the reviews that need looking at soon...
14:59:38 <yangyapeng> the topic is yours?
14:59:43 <daemontool> when executing cinder backups using the cinder api
15:00:04 <daemontool> let's go to the chan
15:00:10 <domhnallw> Okay.
15:00:12 <domhnallw> Pending reviews?
15:00:19 <domhnallw> And then we'll wrap up.
15:00:29 <domhnallw> First, a "hot pickle": https://review.openstack.org/#/c/331880
15:00:31 <EinstCrazy> Currently, volume backups cannot be deleted or listed by freezer
15:00:31 <EinstCrazy> Native cinder backup
15:00:31 <EinstCrazy> So, we need to work on it
15:00:37 <bswartz> ¬_¬
15:00:43 <daemontool> I think we need to move, guys
15:00:50 <daemontool> now there's another meeting here
15:00:53 <domhnallw> Yep, will we close this out now?
15:00:57 <daemontool> yes
15:01:00 <domhnallw> #endmeeting freezer