14:00:21 #startmeeting freezer
14:00:22 Meeting started Thu Apr 7 14:00:21 2016 UTC and is due to finish in 60 minutes. The chair is m3m0. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 The meeting name has been set to 'freezer'
14:00:36 :)
14:01:12 Hi m3m0 our beloved chairman
14:02:08 and even more handsome :)
14:02:25 as always freezer notes https://etherpad.openstack.org/p/freezer_meetings
14:02:48 who's here for the freezer meeting?
14:02:58 :)
14:04:39 :)
14:04:44 ok let's wait for more people to join
14:05:12 EinstCrazy yangyapeng, any updates from your side?
14:05:43 We are testing cindernative backup nowadays
14:06:06 and how is it going?
14:07:37 and we have some bugs to fix, and have a poc
14:08:17 nice, I'm looking forward to seeing the new patches :)
14:08:27 ok let's start with the first topic
14:08:37 #topic summit preparation
14:08:46 slashme any update on this?
14:09:08 hi
14:09:22 sorry I'm late
14:09:33 Yes m3m0
14:09:41 0/
14:09:50 So we have 4 sessions
14:10:05 for the developer summit
14:10:19 1 in a big room: Wed: 9:50 - 10:30
14:10:37 2 in smaller rooms: Wed: 11:00 - 11:40 and 11:50 - 12:30
14:10:54 And one longer on the friday: Fri: 9:00 - 12:30
14:11:00 :)
14:11:13 Thanks to daemontool for that
14:11:38 let's make together the most out of them :)
14:11:38 Now we need to decide what is going to be discussed during the first three sessions
14:12:09 for one, should be backup of your infra
14:12:11 The last big one on friday is supposed to be theme-less in order to prepare everything needed for the cycle
14:12:45 backup as a service
14:12:47 https://etherpad.openstack.org/p/austin_summit_preparation
14:13:22 and we have one left
14:13:38 disaster recovery?
14:14:12 I think so
14:14:25 Let's try to develop a bit
14:15:14 What do we want to fit in : "Backup your infrastructure" ?
14:15:42 mysql backup and restore
14:16:02 yes
14:16:09 maybe cinder and nova
14:16:17 So I guess backup your OpenStack Infrastructure would be better
14:16:50 we can probably summarize the topic in 3 points?
14:16:56 1) infrastructure backup
14:16:59 2) baas
14:17:01 3) DR
14:18:10 daemontool: I think your three points are three different sessions
14:20:42 What do we want to fit in : "Backup as a service" ?
14:20:53 slagle, yes
14:21:00 slashme, yes
14:21:26 backup as a service: tenant resources backups such as: Volumes and VMs
14:21:57 infrastructure: mysql + job sessions
14:22:22 DR: whatever it is
14:22:38 also we need to talk about the scalability
14:22:55 like how do we think to achieve a backup of 10 or 50TB of data
14:22:57 or more
14:23:24 in my opinion, these are the challenging topics we have
14:24:25 I think we need to think about backup of big size of data
14:24:48 EinstCrazy, ++
14:24:55 scalability
14:24:59 yes
14:25:16 for infrastructure backup I think we did good so far
14:25:27 what are the current issues, or the current concerns
14:25:30 Okay. So do you think scalability needs its own session ? In that case, we need to remove something else.
14:25:30 on this?
14:25:49 Or should we fit it in the friday session ?
14:25:54 scalability in place of infra?
14:26:00 I don't know
14:26:13 Friday is good I think
14:26:56 let's add infra to Fri and scalability on its own?
14:29:01 I would keep infra in the main sessions
14:30:40 ok
14:30:48 all good for me
14:30:55 as long as we have these 4
14:31:29 the reason why I'm pushing for baas
14:31:33 is that all the commercial backup solutions
14:31:39 provide infra backup
14:31:43 what is really lacking
14:31:45 is baas
14:32:02 I'm asked for that at least 3 times every week
14:32:13 and we provide infra backup
14:32:28 just explaining my motivation
14:32:33 I'm totally OK with that plan
14:32:37 I agree
14:33:56 hi frescof
14:34:28 I added scalability to backup as a service, because they are related
14:34:51 hi all. I wish I was going to summit
14:35:01 here to observe
14:35:08 ok
14:35:16 also related to infrastructure
14:35:17 and dr
14:35:24 well scalability is related with everything probably
14:35:39 are these topics agreed?
14:35:42 are we good?
14:35:51 Okay for me.
14:36:19 can we move forward?
14:36:29 yes please
14:36:48 #topic How should we deal with authentication when using other clients
14:37:23 This was raised by erno
14:37:35 I think we need an example for that
14:37:49 we are calling private methods when authenticating with cinder and glance clients
14:37:49 this is related to reldan conversation perhaps?
14:37:55 ah ok
14:37:56 daemontool: yes
14:38:16 and is not good
14:38:19 to avoid that
14:38:22 Ideally, we should be able to use keystone sessions in order to authenticate once
14:38:24 we have to rewrite code
14:38:35 and then pass the session to other clients
14:38:37 and that's why reldan did that
14:38:38 ok
14:39:04 But we are not completely sure of how this is supposed to work
14:39:06 we should probably open a bug for that
14:39:06 Yes, we can rewrite that. But actually from my point of view os clients are really a mess
14:39:10 if it is not already opened
14:39:25 I guess we should ask for the keystone team opinion
14:40:23 I don't know, someone should take ownership of that
14:40:27 think about that
14:40:39 szaher reldan ?
14:40:39 have the related conversation with the other services teams
14:40:41 if needed
14:40:49 and make changes if that is the case
14:41:18 I did that in a different project and I used sessions to authenticate with different projects at the same time, check this http://paste.openstack.org/show/493346/
14:41:45 ok
14:42:09 reldan, what's your thought?
14:42:24 hi ddieterly
14:42:39 hello
14:42:55 daemontool: szaher already has such code for another project
14:43:04 So we can copy-paste it to ours
14:43:15 ok
14:43:25 szaher, are you comfortable doing that?
14:43:34 daemontool: Yes, that is fine
14:43:41 can we move forward?
14:44:00 slashme, all good?
14:44:11 yup
14:44:14 Next topic
14:44:16 #topic freezer-agent --exclude
14:44:28 clsacramento ^^
14:44:37 daemontool: ^^
14:44:55 yes
14:44:59 I think we should have a common interface in python that gets passed to the engines
14:45:20 m3m0, for what?
14:45:24 we have that already
14:45:29 Should we separate the exclude mechanism from tar.
14:45:29 only for ta
14:45:31 r
14:45:33 We need this for backup consistency and new engines (rsync)
14:45:34 Idea is:
14:45:35 or is something different?
14:45:42 When I was implementing the checksum for backup consistency I realized that the --exclude uses the tar exclude
14:45:51 yes
14:45:52 If --exclude is passed as a parameter
14:45:53 We walk path and generate a list of files to exclude
14:45:53 That list can be passed to tar with --exclude-from
14:45:54 Question :
14:45:55 Should we do this ?
14:45:56 Acceptable solution ?
14:45:57 format of : regex / shell globbing (tar-like) / ...
14:46:09 I was looking for a way to implement this exclusion that is standard for all engines
14:46:20 clsacramento, ok, good
14:46:57 in rsync
14:47:01 I'm doing something like
14:47:04 Because the solution we thought of was: develop an exclusion on the checksum that behaves the same as the tar
14:47:14 but then for other engines it might not be good
14:47:22 filename = path.split('/')[-1]
14:47:27 if exclude in filename:
14:47:29 next
14:47:31 something like that
14:47:39 then we thought of separating the exclusion from the engine
14:47:44 ok
14:47:52 I'm not sure
14:47:54 you can separate
14:48:09 because if check the exclusion while you walk the file system
14:48:21 and while the agent run the file system
14:48:27 for example, what u are doing for rsync is not really equivalent to the tar pattern exclusion
14:48:33 generate the backup data block that is
14:48:42 uploaded in chunks
14:48:48 clsacramento, ok I agree
14:49:07 I think it is okay to have it separated.
14:49:16 and for the backup consistency we need all exclusions to have the same results
14:49:33 clsacramento, yes indeed
14:49:37 now my question is
14:49:43 how do you move that away from the engine?
14:49:48 The drawback is : if you pass a --exclude it means that freezer will have to walk the backup-path one time more
14:50:30 if you want to move it away
14:50:30 why?
14:50:34 we thought of walking the backup path to determine the list of excluded files and pass this list to the engine
14:50:37 slashme ^^
14:50:39 you need to scan the filesystem before
14:50:42 generate the tree
14:50:50 for tar we have already figured out how to pass the list
14:50:59 and remove the pattern that matches with exclude
14:51:22 guys we have 10 min left
14:51:23 the list of excludes?
14:51:40 daemontool: yes, the list of excluded files
14:51:40 daemontool: yes, like convert the pattern to a list of excludes
14:51:52 and pass this list instead of the pattern to the engine
14:52:24 In that way, we are assured to be consistent regardless of the engine
14:52:34 ok
14:52:39 let's do that :)
14:52:45 If we do that, we need to agree on the pattern mechanism
14:52:56 like if it is regex, shell or anything else
14:53:02 if exclude is contained in filename
14:53:04 mmhhh
14:53:11 nope
14:53:14 more efficient
14:53:25 if filename in exclude: next
14:53:32 exclude is always a list
14:53:52 regex against each file
14:53:57 it's a bit heavy
14:54:06 it is not very usable, imagine u are a user who needs to exclude thousands of files
14:54:26 ok
14:54:31 What is the format of the exclude going to be ?
14:54:35 it's O(n)
14:54:36 also the command line will be unreadable
14:54:43 ddieterly, yes :(
14:54:47 yes :)
14:54:49 clsacramento yes
14:55:00 that is very fast
14:55:05 ddieterly, exactly
14:55:30 so would be like
14:55:40 Shell globbing (like tar) (ie: */plop/*) ? or regex (ie: ^.*/plop/.*) ?
14:55:56 --exclude brown,white,yellow,green
14:56:02 ok, the tar is not exactly like globbing
14:56:04 a list is generated from that
14:56:25 if filename in that list
14:56:26 skip
14:56:31 so a filename like
14:56:32 daemontool: what about files that end with an extension, like *.log is accepted ?
14:56:36 brownsugar
14:56:38 yes
14:56:56 it will be skipped
14:57:01 ah no sorry
14:57:10 clsacramento: should be possible to pass that as argument --exclude *.pyc
14:57:10 that's my bad
14:57:24 but under the hood we convert that to a list
14:57:45 regex is more appropriate imho
14:57:51 it would be an exact match
14:57:54 for that case
14:58:00 if you want that flexibility
14:58:01 I prefer file globbing
14:58:01 converting to a list and then scanning the list for each file will not scale
14:58:15 we need to have one of globbing or regex
14:58:16 if that is what was meant
14:58:16 ddieterly:
14:58:17 but
14:58:17 regex is better
14:58:18 No
14:58:24 ddieterly, agree!
14:58:26 take in consideration
14:58:40 if we have a few million files
14:58:43 You walk path and only add to the list if it matches regex
14:58:50 we need to match against that
14:59:04 slashme, so with that approach
14:59:05 guys we have 2 minutes left
14:59:10 you need to scan the fs twice
14:59:25 if you look at each file name and match with a regex just once, then that would be ok
14:59:27 daemontool: Yes, only if you provide a --exclude
14:59:56 yes ok, but it is still inefficient even only in that case
15:00:02 can we please move this discussion to #openstack-freezer channel?
15:00:06 ok
15:00:12 #endmeeting
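
Editor's note: the engine-independent exclusion proposed in the --exclude topic (walk the backup path, turn the pattern into a list of excluded files, hand that list to the engine) could be sketched roughly as below. This is only an illustration under stated assumptions, not Freezer code: the function names compile_exclude, build_exclude_list and write_exclude_file are hypothetical, the use_regex switch is invented to show both formats debated in the meeting, and the glob is matched against the whole relative path, which is a simplification and not identical to tar's own pattern rules.

```python
import fnmatch
import os
import re


def compile_exclude(pattern, use_regex=False):
    """Compile the pattern once; return a predicate over relative paths."""
    if use_regex:
        # Regex mode: the user controls anchoring, e.g. ^.*/plop/.*
        return re.compile(pattern).search
    # Glob mode: fnmatch.translate yields a regex anchored at the end and
    # .match anchors the start, so "brown" matches only a path that is
    # exactly "brown", not "brownsugar".
    return re.compile(fnmatch.translate(pattern)).match


def build_exclude_list(backup_path, pattern, use_regex=False):
    """Walk backup_path once; return sorted relative paths that match."""
    matches = compile_exclude(pattern, use_regex)
    excluded = []
    for root, _dirs, files in os.walk(backup_path):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), backup_path)
            rel = rel.replace(os.sep, "/")  # same form on every platform
            if matches(rel):
                excluded.append(rel)
    return sorted(excluded)


def write_exclude_file(paths, file_path):
    """Write one path per line, the layout tar reads via --exclude-from."""
    with open(file_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
```

Each file name is tested against the compiled pattern exactly once during the walk, which is the single-pass matching described at 14:59:25. An engine that later needs to test membership could load the result into a set, so that each lookup does not rescan the whole list.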