14:00:21 <m3m0> #startmeeting freezer
14:00:22 <openstack> Meeting started Thu Apr 7 14:00:21 2016 UTC and is due to finish in 60 minutes. The chair is m3m0. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:23 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 <openstack> The meeting name has been set to 'freezer'
14:00:36 <yangyapeng> :)
14:01:12 <slashme> Hi m3m0 our beloved chairman
14:02:08 <m3m0> and even more handsome :)
14:02:25 <m3m0> as always freezer notes https://etherpad.openstack.org/p/freezer_meetings
14:02:48 <m3m0> who's here for the freezer meeting?
14:02:58 <yangyapeng> :)
14:04:39 <EinstCrazy> :)
14:04:44 <m3m0> ok let's wait for more people to join
14:05:12 <m3m0> EinstCrazy yangyapeng, any updates from your side?
14:05:43 <EinstCrazy> We are test cindernative backup nowadays
14:06:06 <m3m0> and how is it going?
14:07:37 <EinstCrazy> and we have some bugs to fix, and have a poc
14:08:17 <m3m0> nice, I'm looking forward to see the new patchs :)
14:08:27 <m3m0> ok let's start with the first topic
14:08:37 <m3m0> #topic summit preparation
14:08:46 <m3m0> slashme any update on this?
14:09:08 <daemontool> hi
14:09:22 <daemontool> sorry I-m late
14:09:33 <slashme> Yes m3m0
14:09:41 <reldan> 0/
14:09:50 <slashme> So we have 4 sessions
14:10:05 <slashme> for the developper summit
14:10:19 <slashme> 1 in a big room: Wed: 9:50 - 10:30
14:10:37 <slashme> 2 in smaller rooms: Wed: 11:00 - 11:40 and 11:50 - 12:30
14:10:54 <slashme> And one longer on the friday: Fri: 9:00 - 12:30
14:11:00 <daemontool> :)
14:11:13 <slashme> Thanks to daemontool for that
14:11:38 <daemontool> let's make together the most out of them :)
14:11:38 <slashme> Now we need to decide what is going to be discussed during the three first sessions
14:12:09 <m3m0> for one, should be backup of your infra
14:12:11 <slashme> The last big one on friday is supposed to be theme-less in order to prepare everything needed for the cycle
14:12:45 <daemontool> backup as a service
14:12:47 <slashme> https://etherpad.openstack.org/p/austin_summit_preparation
14:13:22 <m3m0> and we have one left
14:13:38 <m3m0> disaster recovery?
14:14:12 <slashme> I think so
14:14:25 <slashme> Let's try to develop a bit
14:15:14 <slashme> What do we want to fit in : "Backup your infrastructure" ?
14:15:42 <m3m0> mysql backup and restore
14:16:02 <daemontool> yes
14:16:09 <m3m0> maybe cinder and nova
14:16:17 <slashme> So I guess backup you OpenStack Infrastructure would be better
14:16:50 <daemontool> we can probably resume the topic on 3 points?
14:16:56 <daemontool> 1) infrastructure backup
14:16:59 <daemontool> 2) baas
14:17:01 <daemontool> 3) DR
14:18:10 <slashme> daemontool: I think your three points are three different sessions
14:20:42 <slashme> What do we want to fit in : "Backup as a service" ?
14:20:53 <daemontool> slagle, yes
14:21:00 <daemontool> slashme, yes
14:21:26 <daemontool> backup as a service: tenant resources backups such as: Volumes and VMs
14:21:57 <daemontool> infrastructure: mysql + job sessions
14:22:22 <daemontool> DR: whatever it is
14:22:38 <daemontool> also we need to talk about the scalability
14:22:55 <daemontool> like how do we thing to achieve a backup of 10 or 50TB of data
14:22:57 <daemontool> or more
14:23:24 <daemontool> in my opinion, these are the challenging topics we have
14:24:25 <EinstCrazy> I think we need to think about backup of big size of data
14:24:48 <daemontool> EinstCrazy, ++
14:24:55 <daemontool> scalability
14:24:59 <EinstCrazy> yes
14:25:16 <daemontool> for infrastructure backup I think we did good so far
14:25:27 <daemontool> what are the current issues, or the current concerns
14:25:30 <slashme> Okay. So do you think scalability needs its own session ? In that case, we need to remove something else.
14:25:30 <daemontool> on this?
14:25:49 <slashme> Or should we fit it in the friday session ?
14:25:54 <daemontool> scalability in place of infra?
14:26:00 <daemontool> I don't know
14:26:13 <daemontool> Friday is good I think
14:26:56 <daemontool> let's add infra to Fri and scalability on its own?
14:29:01 <slashme> I would keep infra in the main sessions
14:30:40 <daemontool> ok
14:30:48 <daemontool> all good for me
14:30:55 <daemontool> as long we have this 4
14:31:29 <daemontool> the reason why I'm pushign for baas
14:31:33 <daemontool> is that all the commercial backup solution
14:31:39 <daemontool> provides infra backup
14:31:43 <daemontool> what is really lacking
14:31:45 <daemontool> is baas
14:32:02 <daemontool> I'm asked for that at least 3 times every week
14:32:13 <daemontool> and we provide infra backup
14:32:28 <daemontool> just explaning my motivation
14:32:33 <daemontool> I'm totally OK with that plan
14:32:37 <slashme> I agree
14:33:56 <daemontool> hi frescof
14:34:28 <slashme> I added scalability to backup as a service, because they are related
14:34:51 <kelepirci> hi all. I waish I was going to summit
14:35:01 <kelepirci> here to observ
14:35:08 <daemontool> ok
14:35:16 <daemontool> also related to infrastructure
14:35:17 <daemontool> and dr
14:35:24 <daemontool> well scalability is related with everything probably
14:35:39 <daemontool> are this topic agreed?
14:35:42 <daemontool> are we good?
14:35:51 <slashme> Okay for me.
14:36:19 <m3m0> can we move forward?
14:36:29 <daemontool> yes please
14:36:48 <m3m0> #topic How should we deal with authentication when using other clients
14:37:23 <slashme> This was raised by erno
14:37:35 <daemontool> I think we need an example for that
14:37:49 <slashme> we are calling private methods when authenticating with cinder and glance clients
14:37:49 <daemontool> this is related to reldan conversation perhaps?
14:37:55 <daemontool> ah ok
14:37:56 <slashme> daemontool: yes
14:38:16 <daemontool> and is not good
14:38:19 <daemontool> to void that
14:38:22 <slashme> Idealy, we should be able to use keystone sessions in order to authenticate once
14:38:24 <daemontool> we have to rewrite code
14:38:35 <slashme> and then pass the session to other clients
14:38:37 <daemontool> and that's why reldan did that
14:38:38 <daemontool> ok
14:39:04 <slashme> But we are not completely sure of how this is supposed to work
14:39:06 <daemontool> we should probably open a bug for that
14:39:06 <reldan> Yes, we can rewrite that. But actually from my point of view os clients is really mess
14:39:10 <daemontool> if it is not already opened
14:39:25 <slashme> I guess we should ask for the keystone team opinion
14:40:23 <daemontool> I don't know, someone should take ownership of that
14:40:27 <daemontool> think about that
14:40:39 <slashme> szaher reldan ?
14:40:39 <daemontool> have the related conversation with the other services teams
14:40:41 <daemontool> if needed
14:40:49 <daemontool> and do changes if would be the case
14:41:18 <szaher> I did that in diff project and I used sessions to authenticate with diff projects at the same time check this http://paste.openstack.org/show/493346/
14:41:45 <daemontool> ok
14:42:09 <daemontool> reldan, what's your thought?
14:42:24 <daemontool> hi ddieterly
14:42:39 <ddieterly> hello
14:42:55 <reldan> daemontool: szaher have already such part of code for another project
14:43:04 <reldan> So we can copy-past it to ours
14:43:15 <daemontool> ok
14:43:25 <daemontool> szaher, are you comfortable doing that?
14:43:34 <szaher> daemontool: Yes, that is fine
14:43:41 <m3m0> can we move forward?
14:44:00 <daemontool> slashme, all good?
14:44:11 <slashme> yup
14:44:14 <slashme> Next topic
14:44:16 <m3m0> #topic freezer-agent --exclude
14:44:28 <slashme> clsacramento ^^
14:44:37 <slashme> daemontool: ^^
14:44:55 <daemontool> yes
14:44:59 <m3m0> I think we should have a common interface in python that get passed to the engines
14:45:20 <daemontool> m3m0, for what?
14:45:24 <daemontool> we have that already
14:45:29 <slashme> Should we separate the exclude mechanism from tar.
14:45:29 <m3m0> only for ta
14:45:31 <m3m0> r
14:45:33 <slashme> We need this for backup consistency and ner engines (rsync)
14:45:34 <slashme> Idea is:
14:45:35 <daemontool> or is something different?
14:45:42 <clsacramento> When I was implementing the checksum for backup consistency I realized that the --exclude uses the tar exclude
14:45:51 <daemontool> yes
14:45:52 <slashme> If --exclude <something> is passed as a parameter
14:45:53 <slashme> We walk path and generate a list of files to exclude
14:45:53 <slashme> That list can be passed to tar with --exclude-from
14:45:54 <slashme> Question :
14:45:55 <slashme> Should we do this ?
14:45:56 <slashme> Acceptable solution ?
14:45:57 <slashme> format of <something> : regex / shell globbing (tar-like) / ...
14:46:09 <clsacramento> I was looking for a way to implement this exclusion that is standard for all engines
14:46:20 <daemontool> clsacramento, ok, good
14:46:57 <daemontool> in rsync
14:47:01 <daemontool> I'm doing something like
14:47:04 <clsacramento> Because the solution we thought of was: develop an exclusion on the checksum that behaves the same as the tar
14:47:14 <clsacramento> but then for others engine it could not be good
14:47:22 <daemontool> filename = path.split('/')[-1]
14:47:27 <daemontool> if exclude in filename:
14:47:29 <daemontool> next
14:47:31 <daemontool> something like that
14:47:39 <clsacramento> then we thought of separating the exclusion from the engine
14:47:44 <daemontool> ok
14:47:52 <daemontool> I'm not sure
14:47:54 <daemontool> you can separate
14:48:09 <daemontool> because if check the exclusion while you walk the file system
14:48:21 <daemontool> and while the agent run the file system
14:48:27 <clsacramento> for example, what u are doing for rsync is not really equivalent to the tar pattern exclusion
14:48:33 <daemontool> generate the backup data block that is
14:48:42 <daemontool> uploaded in chunks
14:48:48 <daemontool> clsacramento, ok I agree
14:49:07 <slashme> I think it is okay to have it separated.
14:49:16 <clsacramento> and for the backup consistency we need all exclusion to have the same results
14:49:33 <daemontool> clsacramento, yes indeed
14:49:37 <daemontool> now my question is
14:49:43 <daemontool> how do you move that away from the engine?
14:49:48 <slashme> The drawback is : if you pass a --exclude it means that freezer will have to walt the backup-path one time more
14:50:30 <daemontool> if you want to move it away
14:50:30 <m3m0> why?
14:50:34 <clsacramento> we thought of walking the backup path to determine the list of excluded files and pass this list to the engine
14:50:37 <m3m0> slashme ^^
14:50:39 <daemontool> you need to scan the filesystem before
14:50:42 <daemontool> generate the tree
14:50:50 <clsacramento> for tar we have already figured out how to pass the list
14:50:59 <daemontool> and remove the pattern that match with exclude
14:51:22 <m3m0> guys we have 10 min left
14:51:23 <daemontool> the list of excludes?
14:51:40 <slashme> daemontool: yes, the list of excluded files
14:51:40 <clsacramento> daemontool: yes, like convert the pattern to a list of excludes
14:51:52 <clsacramento> and pass this list instead of the pattern to the engine
14:52:24 <slashme> In that way, we are assured to be consistent regardless of the engine
14:52:34 <daemontool> ok
14:52:39 <daemontool> let's do that :)
14:52:45 <clsacramento> If we do that, we need to agree in the pattern machanism
14:52:56 <clsacramento> like if it is regex, shell or anything else
14:53:02 <daemontool> if exclude is contained in filename
14:53:04 <daemontool> mmhhh
14:53:11 <daemontool> nope
14:53:14 <daemontool> more efficient
14:53:25 <daemontool> if filename in exclude: next
14:53:32 <daemontool> exclude is always a list
14:53:52 <daemontool> regex against each file
14:53:57 <daemontool> it's a bit heavy
14:54:06 <clsacramento> it is not very usable, imagine u are user who needs to exclude thousands of file
14:54:26 <daemontool> ok
14:54:31 <slashme> What the format of the exclude is going to be ?
14:54:35 <ddieterly> it's O(n)
14:54:36 <clsacramento> also the command line will be unreadable
14:54:43 <daemontool> ddieterly, yes :(
14:54:47 <daemontool> yes :)
14:54:49 <yangyapeng> clsacramento yes
14:55:00 <ddieterly> that is very fast
14:55:05 <daemontool> ddieterly, exactly
14:55:30 <daemontool> so would be like
14:55:40 <slashme> Shell globbing (like tar) (ie: */plop/*) ? or regex (ie: ^.*/plop/.*) ?
14:55:56 <daemontool> --exclude brown,white,yellow,green
14:56:02 <clsacramento> ok, the tar is not exactly like globbing
14:56:04 <daemontool> a list is generated from that
14:56:25 <daemontool> if filename in that list
14:56:26 <daemontool> skip
14:56:31 <daemontool> so a filename like
14:56:32 <clsacramento> daemontool: what about files that end with an extension, like *.log is accpeted ?
14:56:36 <daemontool> brownsugar
14:56:38 <daemontool> yes
14:56:56 <daemontool> it will be skipped
14:57:01 <daemontool> ah no sorry
14:57:10 <m3m0> clsacramento: should be possible to pass that as argument --exclude *.pyc
14:57:10 <daemontool> that my bad
14:57:24 <m3m0> but under the hood we convert that to a list
14:57:45 <frescof> regex is more appropriate imho
14:57:51 <daemontool> it would be an exact match
14:57:54 <daemontool> for that case
14:58:00 <daemontool> if you want that flexibility
14:58:01 <slashme> I prefer file globing
14:58:01 <ddieterly> converting to a list and then scanning the list for each file will not scale
14:58:15 <daemontool> we need to have on of globing of regex
14:58:16 <ddieterly> if that is what was meant
14:58:16 <slashme> ddieterly:
14:58:17 <daemontool> but
14:58:17 <yangyapeng> regex is better
14:58:18 <slashme> No
14:58:24 <frescof> ddieterly, agree!
14:58:26 <daemontool> take in consideration
14:58:40 <daemontool> if we have few milions of files
14:58:43 <slashme> You walk path and only add to the list if it match regex
14:58:50 <daemontool> we need to match against that
14:59:04 <daemontool> slashme, so with that approach
14:59:05 <m3m0> guys we have 2 minutes left
14:59:10 <daemontool> you need to scan the fs twice
14:59:25 <ddieterly> if you look at each file name and match with a regex just once, then that would be ok
14:59:27 <slashme> daemontool: Yes, only if you provide a --exclude
14:59:56 <daemontool> yes ok, but it is still inefficient even only on taht case
15:00:02 <m3m0> can we please move this discussion to #openstack-freezer channel?
15:00:06 <daemontool> ok
15:00:12 <m3m0> #endmeeting
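
Note on the --exclude topic: the idea raised in the meeting (walk the backup path, turn the --exclude pattern into a list of excluded files, and hand that list to the engine, e.g. to tar via --exclude-from) could be sketched roughly as below. This is only an illustration and not freezer code. The function names list_excluded and write_exclude_file are hypothetical, shell-style globbing through Python's fnmatch is an assumption (the meeting did not settle globbing versus regex), and how tar interprets the entries it reads from --exclude-from depends on its wildcard and anchoring options, which this sketch does not check.

```python
import fnmatch
import os
import tempfile


def list_excluded(backup_path, patterns):
    """Walk backup_path once and return the sorted relative paths of
    files whose base name or relative path matches any glob pattern.

    Hypothetical helper: the name and the choice of glob matching are
    assumptions made for illustration only.
    """
    excluded = []
    for root, _dirs, files in os.walk(backup_path):
        for name in files:
            full = os.path.join(root, name)
            # Use "/" separators so patterns like */plop/* behave the
            # same regardless of the platform's path separator.
            rel = os.path.relpath(full, backup_path).replace(os.sep, "/")
            # Each file is tested once against each pattern.
            if any(
                fnmatch.fnmatchcase(name, pat) or fnmatch.fnmatchcase(rel, pat)
                for pat in patterns
            ):
                excluded.append(rel)
    return sorted(excluded)


def write_exclude_file(excluded):
    """Write one excluded path per line and return the file name.

    The resulting file is the kind of list that could be handed to an
    engine, for example to tar through its --exclude-from option.
    """
    with tempfile.NamedTemporaryFile(
        "w", suffix=".exclude", delete=False
    ) as handle:
        handle.writelines(path + "\n" for path in excluded)
        return handle.name
```

In this sketch the same list could also feed the checksum code, so every engine and the consistency check would see identical exclusions. The cost is the extra walk of the backup path that slashme and daemontool pointed out, although each file name is only matched once during that walk.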