13:59:24 #startmeeting freezer
13:59:26 Meeting started Thu Mar 24 13:59:24 2016 UTC and is due to finish in 60 minutes. The chair is m3m0. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:59:27 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:59:30 The meeting name has been set to 'freezer'
13:59:49 hey guys, who's here for the freezer meeting?
13:59:54 o/
14:00:01 o/
14:01:00 0/
14:01:03 o/
14:01:17 0/
14:01:21 0/
14:01:24 :)
14:01:33 there is a situation
14:01:40 we don't have topics for today :P https://etherpad.openstack.org/p/freezer_meetings
14:01:54 but I would like to start with one
14:02:04 #topic pending reviews
14:02:27 does anyone have important commits that need to be reviewed?
14:02:44 I think this is important https://review.openstack.org/#/c/297119/
14:02:45 from infra
14:03:03 From szaher https://review.openstack.org/#/c/295220/
14:03:23 it's not highly urgent and still a work in progress
14:03:30 https://review.openstack.org/#/c/290461/ but if anyone wants to review the code...
14:03:38 thanks ddieterly btw ^^
14:03:54 from my side
14:03:54 https://review.openstack.org/278407
14:04:00 https://review.openstack.org/280811
14:04:07 https://review.openstack.org/291757
14:04:08 daemontool np, just gave it a cursory reading, nothing too deep
14:04:13 https://review.openstack.org/296436
14:04:45 we need to solve an important challenge
14:05:02 which is, how to avoid re-reading all the data every time to compute incrementals
14:05:02 which is?
14:05:25 so we had a quick exchange of ideas with frescof
14:05:30 can we use an external tool like diff?
14:05:43 the problem is
14:05:46 daemontool: like git does - check the modification date
14:06:01 ok
14:06:07 so let's say there's a use case
14:06:12 where the current data
14:06:29 is 300TB (sounds big but is starting to be common)
14:06:54 back up 300TB of data?
14:06:57 if you have volumes of 100GB
14:07:11 you will realise 300TB is not that many volumes
14:07:20 zhangjn, yes
14:07:23 even 100TB
14:07:37 Some devstack patches need to be accepted.
14:07:44 right now devstack does not work.
14:07:47 zhangjn, yes ++
14:07:58 so
14:08:07 I hear a lot of uses
14:08:10 use cases
14:08:37 where a solution is needed
14:08:38 https://review.openstack.org/#/c/265111/
14:08:44 this one
14:08:45 that can do backups of all volumes
14:08:48 like all_tenants
14:08:50 incremental
14:08:55 and that is the challenge we have
14:08:58 like reldan suggested, do like git or tar: first check the modification time and, if it is different from the previous one, read only that file
14:09:09 we cannot re-read the data every time
14:09:18 m3m0, we might have something like
14:09:22 we are only reading the inodes
14:09:23 /dev/sdb2
14:09:38 but let's say
14:09:40 ooooo not from the file system perspective but from the image?
14:09:42 we have 10k volumes
14:09:45 sorry, the volume?
14:09:48 so we have 10k files
14:09:53 and every file is modified
14:10:01 and every file is 100GB
14:10:24 so we need to re-read the data to generate the incrementals each time...
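A minimal sketch of the git/tar-style modification-time check suggested above, assuming a simple JSON manifest kept between backup levels; the function name, manifest format, and paths are hypothetical and are not existing Freezer code.

```python
# Sketch of mtime-based change detection (git/tar style), as discussed above.
# The manifest format and paths are illustrative only.
import json
import os


def find_changed_files(root, manifest_path):
    """Return files under root whose mtime differs from the previous backup.

    The manifest maps relative paths to the mtime recorded at the last
    backup level; only changed files need to be re-read for the increment.
    """
    try:
        with open(manifest_path) as f:
            previous = json.load(f)
    except (IOError, ValueError):
        previous = {}  # no manifest yet: everything is a level 0 backup

    changed, current = [], {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            mtime = os.stat(path).st_mtime
            current[rel] = mtime
            if previous.get(rel) != mtime:
                changed.append(path)

    with open(manifest_path, 'w') as f:
        json.dump(current, f)
    return changed
```

This avoids re-reading unchanged files, but, as the discussion below notes, it does not help in the case where every 100GB file has been touched; the data still has to be read to build the increment.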
14:10:38 daemontool: if we don't know about the structure of these files - yes
14:10:44 there is no better solution
14:10:57 we could think of something like
14:11:01 storage drivers
14:11:11 if we know, for example, that these files only support appends at the end - it's easier
14:11:11 there are filesystems that can do that for us
14:11:17 like zfs
14:11:19 or btrfs
14:11:26 or other proprietary solutions
14:11:44 they can provide to us
14:11:46 beforehand
14:12:12 the blocks that changed
14:12:16 and for example
14:12:18 a differential
14:12:25 of 2 snapshots
14:12:45 you mean use the features the fs provides?
14:12:52 yes
14:12:59 we need a way to know what data was modified
14:13:11 without re-reading all the data each time
14:13:24 how do we do that? this is the challenge
14:13:33 we can do it with tar and rsync
14:13:44 for significant use cases
14:13:49 I actually don't understand the idea of having 10K files of 100 GB each and modifying them
14:14:03 without support from the file system it will be a problem
14:14:11 Do we really have such customers?
14:14:16 reldan, yes
14:14:23 I have at least 3 now
14:14:31 Probably you can share what the files are?
14:14:35 But I studied btrfs' cow 2 years ago; it did not support file cow, only directory
14:14:44 What are they storing in these files?
14:14:55 EinstCrazy, btrfs would be, for example, the file system in the storage
14:15:00 reldan, I don't know
14:15:11 so in the compute nodes
14:15:17 you have /var/lib/nova/instances
14:15:28 and that /var/lib/nova/instances is mounted on a remote storage
14:15:39 but btrfs is only one of the use cases
14:15:40 cause
14:15:48 most of the users for this case
14:15:50 would have
14:15:52 emc
14:15:54 or 3par
14:15:57 storeonce
14:15:58 and so on
14:16:39 Data deduplication?
14:16:45 data deduplication
14:16:50 we can do it
14:17:01 with rsync, as the hashes of blocks
14:17:12 are computed there
14:17:15 but
14:17:21 for instance, data deduplication
14:17:30 is provided by zfs
14:17:31 natively
14:17:33 I think this is most important for instance backup.
14:17:34 and also compression
14:17:43 Yes, but in the common case I suppose we should store more hashes than we have data :)
14:18:03 zfs is good at this scenario.
14:18:04 reldan, rephrase that please?
14:18:08 I didn't get it
14:18:19 Let's say we have file A and file B
14:18:28 file A and file B share 50% of their content
14:18:32 ok
14:18:43 so we have a rolling hash in the first file
14:18:49 and a rolling hash in the second
14:19:02 but I think we'd better do it in freezer.
14:19:03 to be able to find that file A and file B share some chunk
14:19:10 reldan, yes
14:19:21 we should keep our hashes (in this case rolling hashes)
14:19:53 and for a file of size m and a hash chunk of size n
14:20:02 we should have m - n + 1 hashes
14:20:19 O(n) hashes
14:20:37 reldan, exactly
14:20:42 that's how zfs does it
14:20:51 hash table
14:21:03 some more info here v
14:21:05 https://pthree.org/2012/12/18/zfs-administration-part-xi-compression-and-deduplication/
14:21:13 what I'd like to understand is
14:21:23 do you see this as something we need to solve?
14:21:27 all?
14:21:49 seems like a scale issue that has to be solved
14:21:57 yes
14:22:12 we cannot use rsync for large-scale cases
14:22:15 or for cases
14:22:26 where customers want to have the backups executed on all_tenants
14:22:33 all volumes
14:22:37 or all instances
14:22:53 if the customer can use rsync on each vm/volume
14:22:55 then better
14:24:05 so
14:24:14 we can have something like storage drivers
14:24:38 like a zfs driver?
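A toy illustration of the hash-table deduplication being compared to zfs above: each block's digest is looked up in a dedup table and only unseen blocks are stored. The chunk size and the in-memory store are placeholders, not a proposed Freezer design.

```python
# Toy block-level deduplication sketch (zfs-style hash table of block digests).
# Chunk size and the in-memory "store" are illustrative placeholders.
import hashlib

CHUNK = 4 * 1024 * 1024  # 4 MiB blocks, an arbitrary example size


def dedup_stream(fileobj, seen_hashes, store):
    """Split a stream into blocks; store only blocks not seen before.

    `seen_hashes` is the dedup table (digest -> block id); for a file of
    n blocks this keeps n digests, the hashes-versus-data trade-off
    raised in the discussion above.
    """
    block_refs = []
    while True:
        block = fileobj.read(CHUNK)
        if not block:
            break
        digest = hashlib.sha256(block).hexdigest()
        if digest not in seen_hashes:
            seen_hashes[digest] = len(store)
            store.append(block)  # only new data is written to the backend
        block_refs.append(seen_hashes[digest])
    return block_refs  # enough to reassemble the file on restore
```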
14:24:38 and we manage the storage drivers as btrfs, zfs, storeonce, 3par
14:24:40 yes
14:25:07 so if we have a backend that does dedup and can provide hashes of modified blocks
14:25:11 we use that and we scale
14:25:19 otherwise we provide tar and rsync
14:25:21 enabling a driver to do this is easy.
14:25:22 depending on the cases
14:25:49 if we provide this... we'll provide the best open source backup/restore/disaster recovery tool in the world
14:26:07 that's it :)
14:26:39 in the virtualization scenario :)
14:26:39 now, who wants to write a bp for this?
14:26:49 in the cloud business, yes
14:27:10 daemontool: You mean we would have a zfs/btrfs/3par/... engine?
14:27:20 slashme yes
14:27:30 Seems very good to me :)
14:27:34 better definition in the freezer glossary
14:27:36 ty :)
14:27:46 it is highly critical
14:27:55 because it's the competitive advantage
14:28:16 that commercial solutions have
14:28:21 vs us
14:28:25 first support 3par
14:28:29 And it allows us not to introduce too much complexity in freezer if we manage deduplication and incrementals using the storage
14:28:50 slashme, yes. I think we can provide both
14:29:09 but the cases where the features will be used
14:29:12 are very different
14:29:15 for different customers
14:29:15 Yes
14:29:17 etc
14:29:34 if people use ext4 and use freezer to back up from within the vms
14:29:40 with dedup
14:29:43 then we offer that
14:29:52 last week I met with hpe sales in my office.
14:30:03 but if we have all the giants
14:30:06 looking at us
14:30:15 because they want to use freezer
14:30:24 then we have to provide that
14:30:42 and I think it's a reasonable approach to get more people onboard to the project
14:30:57 good idea
14:31:13 we can talk about this
14:31:16 at the Summit
14:31:30 who will be going to the summit?
14:31:38 i will be there
14:31:41 most of us
14:31:54 me too
14:32:01 I will
14:32:06 m3m0 as well
14:32:09 frescof too
14:32:13 I hope szaher too
14:32:14 more resources is good for us.
14:32:19 frescof also
14:32:25 daemontool: Sorry, I won't be able to make it :(
14:32:33 reldan, are you going?
14:32:35 ok, I'm sorry Saad
14:32:59 ddieterly: Nope, my boss said that I'm not going
14:32:59 I'm going to return the free ticket then, is that ok?
14:33:14 Ok
14:33:29 ddieterly, if you could say half a word internally that'd be good, if you can/want
14:33:38 ok, sounds like a good number of people will be there
14:33:39 I can share the room
14:33:46 so no room costs
14:33:56 and provide free access to the summit
14:34:11 daemontool i did not understand; could you please rephrase?
14:34:29 ddieterly, can you have a word with Omead, and see if there's anything he can do to send reldan?
14:34:33 something like that
14:34:49 daemontool i'm sorry, but i just barely made the cut to go
14:34:51 I can share the room
14:34:56 ok
14:34:57 np
14:34:58 hpe is really cutting back on attendees
14:35:01 ok
14:35:03 share man :(
14:35:17 ok np
14:35:43 daemontool omead can only send 2 people
14:35:44 I am so shy
14:35:59 roland and i got to go
14:36:04 ok np
14:36:07 ty :)
14:36:11 anyway
14:36:18 let's get back on track
14:36:20 with the meeting here
14:36:23 so
14:36:28 anyone wants to write the bp
14:36:43 for engines that leverage storage-specific
14:36:46 features?
14:36:52 like dedup and incrementals?
14:37:24 zhangjn, EinstCrazy are you interested guys?
14:38:03 i'd be happy to, but i don't know enough about the project to do it
14:38:09 :-(
14:38:30 ddieterly, do you mean you don't know about freezer?
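A hypothetical sketch of the storage-engine driver idea discussed above, where a backend such as zfs, btrfs, or an array (3PAR, StoreOnce, EMC) reports the extents that changed between two snapshots so the agent only copies those. All class and method names are invented for illustration and are not existing Freezer interfaces.

```python
# Hypothetical storage-engine driver interface; names are invented for
# illustration and do not correspond to existing Freezer code.
import abc


class SnapshotEngine(abc.ABC):
    """Backend engine that can report changed blocks between two snapshots."""

    @abc.abstractmethod
    def create_snapshot(self, volume_id):
        """Create a point-in-time snapshot and return its identifier."""

    @abc.abstractmethod
    def changed_blocks(self, volume_id, old_snapshot, new_snapshot):
        """Yield (offset, length) extents that differ between two snapshots.

        Backends such as zfs/btrfs or array-side tools can answer this
        without the agent re-reading the whole volume.
        """


def backup_incremental(engine, volume_id, old_snap, read_extent, write_extent):
    """Copy only the extents the backend reports as changed."""
    new_snap = engine.create_snapshot(volume_id)
    for offset, length in engine.changed_blocks(volume_id, old_snap, new_snap):
        write_extent(offset, read_extent(offset, length))
    return new_snap
```

When no such backend is available, the agent would fall back to the tar/rsync path mentioned above, which is the trade-off described in this part of the discussion.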
14:38:38 I think this can be a good opportunity
14:38:44 and if you look at the rsync code
14:38:49 yea, i don't know enough about backups and backup technology
14:38:49 it will be somewhat similar
14:38:54 ah ok
14:39:09 well, we can start with 3par, storeonce, emc
14:39:12 I'm interested. But I don't know the scope in detail
14:39:15 or at least the technology
14:39:21 we have access to more
14:39:25 good opportunity to learn freezer and backup.
14:39:47 ok
14:39:49 so let me know
14:39:54 we could even start with
14:39:56 zfs
14:40:07 I don't know
14:40:27 EinstCrazy, go ahead, daemontool will help you.
14:40:38 if you are interested I'm here
14:40:40 frescof too
14:40:48 and reldan too :)
14:40:57 m3m0, we can move forward
14:41:02 to the next topic
14:41:09 en
14:41:15 ok, but still we don't have a list of topics
14:41:28 does anyone have something to share?
14:41:30 m3m0, improvise
14:41:33 lol
14:41:37 so
14:41:44 reldan, metadata
14:41:46 documentation or testing then :)
14:41:54 any news on that?
14:42:22 ok, hpe is talking about multi-region again
14:42:34 Yes, metadata. I am still writing the blueprint. But I actually have one question
14:42:37 we still aren't sure exactly what it is, but it keeps coming up
14:43:17 I am here to observe
14:43:22 one instance of freezer-api/elasticsearch serving multiple regions of nova/neutron/etc
14:44:00 We have discussed that we would like to change the way we keep our backups. Like for swift engine/id/bla-bla. So my question - should we support our previous backups and storages in this case
14:44:49 reldan you mean remain backward compatible?
14:45:37 yes, you are right. Because 1) remaining backward compatible is a really bad solution 2) not keeping previous backups - also not very nice
14:46:11 it is much easier for all if backward compatibility is maintained
14:46:21 otherwise, you must deal with upgrading
14:46:25 How many people are using the freezer project to back up in production?
14:46:50 freezer is shipped with hpe helion
14:47:11 it is also being used in a few projects at Ericsson
14:47:17 so, we need backward compatibility or else an upgrade path
14:47:34 ddieterly, yes, but it's just about doing a new level 0 backup after all
14:47:48 that's the difference I think
14:48:04 ok, as long as existing installations do not break...
14:48:11 In the case of backward compatibility, there is not much reason to support the new container path and metadata. Because I should be able to restore my data without metadata anyway :)
14:48:54 who can write a use case for me? I can push the freezer project in China.
14:50:09 zhangjn, let's have an offline conversation about that
14:50:10 I can help
14:50:29 reldan, I agree
14:51:07 So I need some decision about backward compatibility
14:51:13 I can even write a "migration"
14:51:26 any thoughts?
14:51:27 anyone
14:51:38 but I don't think that for a big amount of data it is a good idea
14:51:45 I agree
14:51:49 migration sucks
14:52:11 :)
14:52:16 ;-)
14:52:29 another feature we need to provide
14:52:33 is rolling upgrades
14:52:40 all other services are providing that
14:52:46 we need to do that too
14:52:53 how to upgrade freezer will be considered later?
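One hedged way to picture the backward-compatibility question raised here: record a layout version in the backup metadata and dispatch restores on it, keeping a legacy path (or pointing the operator at an older freezer-agent) for pre-Newton backups. The metadata key and version values are made up for illustration and are not an agreed design.

```python
# Illustrative backward-compatibility dispatch; the metadata key and
# version numbers are hypothetical, not part of Freezer today.
LEGACY_VERSION = 0      # pre-Newton path/metadata layout (hypothetical tag)
CURRENT_VERSION = 1     # new layout with extra metadata (hypothetical tag)


def restore(backup_metadata, legacy_restore, new_restore):
    """Dispatch a restore based on the backup's recorded layout version."""
    version = backup_metadata.get('freezer_layout_version', LEGACY_VERSION)
    if version == LEGACY_VERSION:
        # Option discussed in the meeting: either keep a legacy code path,
        # or document that an older freezer-agent must restore these backups.
        return legacy_restore(backup_metadata)
    return new_restore(backup_metadata)
```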
14:53:08 Second idea (also very bad): rename swift storage to old-swift and write new-swift with metadata and different paths
14:53:19 local -> old-local
14:53:29 ssh -> old-ssh
14:53:40 reldan, I think, if needed, we can have backward incompatibility
14:53:41 and support both versions :)
14:53:50 in Newton
14:53:55 and if you need to restore data
14:54:00 from exactly that moment
14:54:15 because there's no level 0 yet
14:54:25 then a previous freezer-agent version needs to be used
14:54:30 we can write that in the documentation
14:54:44 if we need to move forward and provide better features and simplify our life
14:55:04 Sounds good to me. But then everybody should agree that at some point we will be unable to restore old backups with new code
14:56:07 I don't see the problem
14:56:12 guys we have 4 min left
14:56:19 if this happens rarely
14:56:24 I don't see the problem either
14:56:30 You just have to use an older version of freezer to restore an older backup
14:56:49 slashme will give you the customer ticket when the problem comes up ;-)
14:57:13 I disagree
14:57:15 ddieterly, it's slashme that gets the ticket first :)
14:57:26 yangyapeng, ok, please extend
14:57:28 :)
14:57:49 yea, i meant that slashme gets the ticket
14:57:49 it is a problem to restore using an old backup
14:58:22 2 minutes left
14:58:24 ddieterly, ok :)
14:58:38 sorry I have to run now, I have a meeting
14:58:44 ciao!
14:58:52 remember that we can take this discussion to the #openstack-freezer channel
14:59:07 thanks all for your time :)
14:59:09 #endmeeting