14:00:21 <m3m0> #startmeeting freezer
14:00:22 <openstack> Meeting started Thu Apr  7 14:00:21 2016 UTC and is due to finish in 60 minutes.  The chair is m3m0. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:23 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 <openstack> The meeting name has been set to 'freezer'
14:00:36 <yangyapeng> :)
14:01:12 <slashme> Hi m3m0 our beloved chairman
14:02:08 <m3m0> and even more handsome :)
14:02:25 <m3m0> as always freezer notes https://etherpad.openstack.org/p/freezer_meetings
14:02:48 <m3m0> who's here for the freezer meeting?
14:02:58 <yangyapeng> :)
14:04:39 <EinstCrazy> :)
14:04:44 <m3m0> ok let's wait for more people to join
14:05:12 <m3m0> EinstCrazy yangyapeng, any updates from your side?
14:05:43 <EinstCrazy> We are testing cindernative backup nowadays
14:06:06 <m3m0> and how is it going?
14:07:37 <EinstCrazy> and we have some bugs to fix, and have a poc
14:08:17 <m3m0> nice, I'm looking forward to seeing the new patches :)
14:08:27 <m3m0> ok let's start with the first topic
14:08:37 <m3m0> #topic summit preparation
14:08:46 <m3m0> slashme any update on this?
14:09:08 <daemontool> hi
14:09:22 <daemontool> sorry I'm late
14:09:33 <slashme> Yes m3m0
14:09:41 <reldan> 0/
14:09:50 <slashme> So we have 4 sessions
14:10:05 <slashme> for the developer summit
14:10:19 <slashme> 1 in a big room: Wed: 9:50 - 10:30
14:10:37 <slashme> 2 in smaller rooms:  Wed: 11:00 - 11:40 and 11:50 - 12:30
14:10:54 <slashme> And one longer session on Friday: Fri: 9:00 - 12:30
14:11:00 <daemontool> :)
14:11:13 <slashme> Thanks to daemontool for that
14:11:38 <daemontool> let's make together the most out of them :)
14:11:38 <slashme> Now we need to decide what is going to be discussed during the three first sessions
14:12:09 <m3m0> for one, should be backup of your infra
14:12:11 <slashme> The last big one on Friday is supposed to be theme-less in order to prepare everything needed for the cycle
14:12:45 <daemontool> backup as a service
14:12:47 <slashme> https://etherpad.openstack.org/p/austin_summit_preparation
14:13:22 <m3m0> and we have one left
14:13:38 <m3m0> disaster recovery?
14:14:12 <slashme> I think so
14:14:25 <slashme> Let's try to develop a bit
14:15:14 <slashme> What do we want to fit in : "Backup your infrastructure" ?
14:15:42 <m3m0> mysql backup and restore
14:16:02 <daemontool> yes
14:16:09 <m3m0> maybe cinder and nova
14:16:17 <slashme> So I guess "backup your OpenStack infrastructure" would be better
14:16:50 <daemontool> we can probably summarize the topic in 3 points?
14:16:56 <daemontool> 1) infrastructure backup
14:16:59 <daemontool> 2) baas
14:17:01 <daemontool> 3) DR
14:18:10 <slashme> daemontool: I think your three points are three different sessions
14:20:42 <slashme> What do we want to fit in : "Backup as a service" ?
14:20:53 <daemontool> slagle,  yes
14:21:00 <daemontool> slashme, yes
14:21:26 <daemontool> backup as a service: tenant resources backups such as: Volumes and VMs
14:21:57 <daemontool> infrastructure: mysql + job sessions
14:22:22 <daemontool> DR: whatever it is
14:22:38 <daemontool> also we need to talk about the scalability
14:22:55 <daemontool> like how do we think we can achieve a backup of 10 or 50TB of data
14:22:57 <daemontool> or more
14:23:24 <daemontool> in my opinion, these are the challenging topics we have
14:24:25 <EinstCrazy> I think we need to think about backups of large amounts of data
14:24:48 <daemontool> EinstCrazy,  ++
14:24:55 <daemontool> scalability
14:24:59 <EinstCrazy> yes
14:25:16 <daemontool> for infrastructure backup I think we did good so far
14:25:27 <daemontool> what are the current issues, or the current concerns
14:25:30 <slashme> Okay. So do you think scalability needs its own session ? In that case, we need to remove something else.
14:25:30 <daemontool> on this?
14:25:49 <slashme> Or should we fit it in the Friday session ?
14:25:54 <daemontool> scalability in place of infra?
14:26:00 <daemontool> I don't know
14:26:13 <daemontool> Friday is good I think
14:26:56 <daemontool> let's add infra to Fri and scalability on its own?
14:29:01 <slashme> I would keep infra in the main sessions
14:30:40 <daemontool> ok
14:30:48 <daemontool> all good for me
14:30:55 <daemontool> as long as we have these 4
14:31:29 <daemontool> the reason why I'm pushing for baas
14:31:33 <daemontool> is that all the commercial backup solutions
14:31:39 <daemontool> provide infra backup
14:31:43 <daemontool> what is really lacking
14:31:45 <daemontool> is baas
14:32:02 <daemontool> I'm asked for that at least 3 times every week
14:32:13 <daemontool> and we provide infra backup
14:32:28 <daemontool> just explaining my motivation
14:32:33 <daemontool> I'm totally OK with that plan
14:32:37 <slashme> I agree
14:33:56 <daemontool> hi frescof
14:34:28 <slashme> I added scalability to backup as a service, because they are related
14:34:51 <kelepirci> hi all. I wish I was going to the summit
14:35:01 <kelepirci> here to observe
14:35:08 <daemontool> ok
14:35:16 <daemontool> also related to infrastructure
14:35:17 <daemontool> and dr
14:35:24 <daemontool> well scalability is related with everything probably
14:35:39 <daemontool> are these topics agreed?
14:35:42 <daemontool> are we good?
14:35:51 <slashme> Okay for me.
14:36:19 <m3m0> can we move forward?
14:36:29 <daemontool> yes please
14:36:48 <m3m0> #topic How should we deal with authentication when using other clients
14:37:23 <slashme> This was raised by erno
14:37:35 <daemontool> I think we need an example for that
14:37:49 <slashme> we are calling private methods when authenticating with cinder and glance clients
14:37:49 <daemontool> this is related to reldan conversation perhaps?
14:37:55 <daemontool> ah ok
14:37:56 <slashme> daemontool: yes
14:38:16 <daemontool> and it is not good
14:38:19 <daemontool> to avoid that
14:38:22 <slashme> Ideally, we should be able to use keystone sessions in order to authenticate once
14:38:24 <daemontool> we have to rewrite code
14:38:35 <slashme> and then pass the session to other clients
14:38:37 <daemontool> and that's why reldan did that
14:38:38 <daemontool> ok
14:39:04 <slashme> But we are not completely sure of how this is supposed to work
14:39:06 <daemontool> we should probably open a bug for that
14:39:06 <reldan> Yes, we can rewrite that. But actually from my point of view the os clients are really a mess
14:39:10 <daemontool> if it is not already opened
14:39:25 <slashme> I guess we should ask for the keystone team opinion
14:40:23 <daemontool> I don't know, someone should take ownership of that
14:40:27 <daemontool> think about that
14:40:39 <slashme> szaher reldan ?
14:40:39 <daemontool> have the related conversation with the other services teams
14:40:41 <daemontool> if needed
14:40:49 <daemontool> and make changes if that is the case
14:41:18 <szaher> I did that in a different project and I used sessions to authenticate with different projects at the same time, check this http://paste.openstack.org/show/493346/
14:41:45 <daemontool> ok
14:42:09 <daemontool> reldan,  what's your thought?
14:42:24 <daemontool> hi ddieterly
14:42:39 <ddieterly> hello
14:42:55 <reldan> daemontool: szaher already has that part of the code for another project
14:43:04 <reldan> So we can copy-paste it into ours
14:43:15 <daemontool> ok
14:43:25 <daemontool> szaher, are you comfortable doing that?
14:43:34 <szaher> daemontool: Yes, that is fine
14:43:41 <m3m0> can we move forward?
14:44:00 <daemontool> slashme, all good?
14:44:11 <slashme> yup
14:44:14 <slashme> Next topic
14:44:16 <m3m0> #topic freezer-agent --exclude
14:44:28 <slashme> clsacramento ^^
14:44:37 <slashme> daemontool: ^^
14:44:55 <daemontool> yes
14:44:59 <m3m0> I think we should have a common interface in Python that gets passed to the engines
14:45:20 <daemontool> m3m0,  for what?
14:45:24 <daemontool> we have that already
14:45:29 <slashme> Should we separate the exclude mechanism from tar.
14:45:29 <m3m0> only for tar
14:45:33 <slashme> We need this for backup consistency and new engines (rsync)
14:45:34 <slashme> Idea is:
14:45:35 <daemontool> or is something different?
14:45:42 <clsacramento> When I was implementing the checksum for backup consistency I realized that the --exclude uses the tar exclude
14:45:51 <daemontool> yes
14:45:52 <slashme> If --exclude <something> is passed as a parameter
14:45:53 <slashme> We walk path and generate a list of files to exclude
14:45:53 <slashme> That list can be passed to tar with --exclude-from
14:45:54 <slashme> Question :
14:45:55 <slashme> Should we do this ?
14:45:56 <slashme> Acceptable solution ?
14:45:57 <slashme> format of <something> : regex / shell globbing (tar-like) / ...
14:46:09 <clsacramento> I was looking for a way to implement this exclusion that is standard for all engines
14:46:20 <daemontool> clsacramento,  ok, good
14:46:57 <daemontool> in rsync
14:47:01 <daemontool> I'm doing something like
14:47:04 <clsacramento> Because the solution we thought of was: develop an exclusion on the checksum that behaves the same as the tar
14:47:14 <clsacramento> but then for other engines it might not be good
14:47:22 <daemontool> filename = path.split('/')[-1]
14:47:27 <daemontool> if exclude in filename:
14:47:29 <daemontool> continue
14:47:31 <daemontool> something like that
14:47:39 <clsacramento> then we thought of separating the exclusion from the engine
14:47:44 <daemontool> ok
14:47:52 <daemontool> I'm not sure
14:47:54 <daemontool> you can separate
14:48:09 <daemontool> because you check the exclusion while you walk the file system
14:48:21 <daemontool> and while the agent walks the file system
14:48:27 <clsacramento> for example, what you are doing for rsync is not really equivalent to the tar pattern exclusion
14:48:33 <daemontool> it generates the backup data block that is
14:48:42 <daemontool> uploaded in chunks
14:48:48 <daemontool> clsacramento, ok I agree
14:49:07 <slashme> I think it is okay to have it separated.
14:49:16 <clsacramento> and for the backup consistency we need all exclusion to have the same results
14:49:33 <daemontool> clsacramento, yes indeed
14:49:37 <daemontool> now my question is
14:49:43 <daemontool> how do you move that away from the engine?
14:49:48 <slashme> The drawback is : if you pass a --exclude it means that freezer will have to walk the backup path one more time
14:50:30 <daemontool> if you want to move it away
14:50:30 <m3m0> why?
14:50:34 <clsacramento> we thought of walking the backup path to determine the list of excluded files and pass this list to the engine
14:50:37 <m3m0> slashme ^^
14:50:39 <daemontool> you need to scan the filesystem before
14:50:42 <daemontool> generate the tree
14:50:50 <clsacramento> for tar we have already figured out how to pass the list
14:50:59 <daemontool> and remove the paths that match the exclude
14:51:22 <m3m0> guys we have 10 min left
14:51:23 <daemontool> the list of excludes?
14:51:40 <slashme> daemontool: yes, the list of excluded files
14:51:40 <clsacramento> daemontool: yes, like convert the pattern to a list of excludes
14:51:52 <clsacramento> and pass this list instead of the pattern to the engine
14:52:24 <slashme> In that way, we are assured to be consistent regardless of the engine
14:52:34 <daemontool> ok
14:52:39 <daemontool> let's do that :)
14:52:45 <clsacramento> If we do that, we need to agree on the pattern mechanism
14:52:56 <clsacramento> like if it is regex, shell or anything else
14:53:02 <daemontool> if exclude is contained in filename
14:53:04 <daemontool> mmhhh
14:53:11 <daemontool> nope
14:53:14 <daemontool> more efficient
14:53:25 <daemontool> if filename in exclude: continue
14:53:32 <daemontool> exclude is always a list
14:53:52 <daemontool> regex against each file
14:53:57 <daemontool> it's a bit heavy
14:54:06 <clsacramento> it is not very usable, imagine you are a user who needs to exclude thousands of files
14:54:26 <daemontool> ok
14:54:31 <slashme> What the format of the exclude is going to be ?
14:54:35 <ddieterly> it's O(n)
14:54:36 <clsacramento> also the command line will be unreadable
14:54:43 <daemontool> ddieterly,  yes :(
14:54:47 <daemontool> yes :)
14:54:49 <yangyapeng> clsacramento yes
14:55:00 <ddieterly> that is very fast
14:55:05 <daemontool> ddieterly,  exactly
14:55:30 <daemontool> so would be like
14:55:40 <slashme> Shell globbing (like tar) (ie: */plop/*) ? or regex (ie: ^.*/plop/.*) ?
14:55:56 <daemontool> --exclude brown,white,yellow,green
14:56:02 <clsacramento> ok, the tar is not exactly like globbing
14:56:04 <daemontool> a list is generated from that
14:56:25 <daemontool> if filename in that list
14:56:26 <daemontool> skip
14:56:31 <daemontool> so a filename like
14:56:32 <clsacramento> daemontool: what about files that end with an extension, like *.log, is that accepted ?
14:56:36 <daemontool> brownsugar
14:56:38 <daemontool> yes
14:56:56 <daemontool> it will be skipped
14:57:01 <daemontool> ah no sorry
14:57:10 <m3m0> clsacramento: should be possible to pass that as argument --exclude *.pyc
14:57:10 <daemontool> that's my bad
14:57:24 <m3m0> but under the hood we convert that to a list
14:57:45 <frescof> regex is more appropriate imho
14:57:51 <daemontool> it would be an exact match
14:57:54 <daemontool> for that case
14:58:00 <daemontool> if you want that flexibility
14:58:01 <slashme> I prefer file globbing
14:58:01 <ddieterly> converting to a list and then scanning the list for each file will not scale
14:58:15 <daemontool> we need to have one of globbing or regex
14:58:16 <ddieterly> if that is what was meant
14:58:16 <slashme> ddieterly:
14:58:17 <daemontool> but
14:58:17 <yangyapeng> regex is better
14:58:18 <slashme> No
14:58:24 <frescof> ddieterly, agree!
14:58:26 <daemontool> take in consideration
14:58:40 <daemontool> if we have a few million files
14:58:43 <slashme> You walk the path and only add to the list if it matches the regex
14:58:50 <daemontool> we need to match against that
14:59:04 <daemontool> slashme, so with that approach
14:59:05 <m3m0> guys we have 2 minutes left
14:59:10 <daemontool> you need to scan the fs twice
14:59:25 <ddieterly> if you look at each file name and match with a regex just once, then that would be ok
14:59:27 <slashme> daemontool: Yes, only if you provide a --exclude
14:59:56 <daemontool> yes ok, but it is still inefficient even only in that case
15:00:02 <m3m0> can we please move this discussion to #openstack-freezer channel?
15:00:06 <daemontool> ok
15:00:12 <m3m0> #endmeeting
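
Illustrative note on the --exclude topic: the approach proposed in the meeting (walk the backup path, turn the pattern into a list of excluded files, hand that list to the engine, e.g. to tar with --exclude-from) could be sketched roughly as below. This is a minimal sketch, not freezer code. The function names build_exclude_list and write_exclude_file are hypothetical, and shell-style globbing is only an assumption here, since the meeting ended without settling on glob versus regex.

```python
import fnmatch
import os
import tempfile


def build_exclude_list(backup_path, pattern):
    """Walk backup_path once and return the relative paths matching pattern.

    Assumption: pattern is a shell-style glob (fnmatch), tested against both
    the relative path and the bare file or directory name.
    """
    excluded = []
    for root, dirs, files in os.walk(backup_path):
        for name in dirs + files:
            rel = os.path.relpath(os.path.join(root, name), backup_path)
            if fnmatch.fnmatch(rel, pattern) or fnmatch.fnmatch(name, pattern):
                excluded.append(rel)
    return sorted(excluded)


def write_exclude_file(excluded):
    """Write one excluded path per line and return the file name.

    The resulting file is the kind of list that could be passed to an
    engine, for instance to tar through --exclude-from.
    """
    with tempfile.NamedTemporaryFile(
            "w", suffix=".exclude", delete=False) as handle:
        for path in excluded:
            handle.write(path + "\n")
    return handle.name


if __name__ == "__main__":
    # Hypothetical usage: list every *.log entry under the current directory.
    for entry in build_exclude_list(".", "*.log"):
        print(entry)
```

Because every engine would receive the same explicit list, the result no longer depends on how each engine interprets a pattern, which is the consistency argument made in the meeting. The cost is the extra walk of the backup path that was raised as a drawback; applying the same match inline during the engine's own walk would be one way to avoid scanning twice.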