Saturday, 2010-11-06

*** ddumitriu has joined #openstack		00:00
*** ddumitriu has quit IRC		00:02
*** khussein has left #openstack		00:12
*** kami__ has joined #openstack		00:25
*** dendrobates is now known as dendro-afk		00:26
*** dendro-afk is now known as dendrobates		00:30
*** metcalfc has quit IRC		00:36
*** dendrobates is now known as dendro-afk		00:52
*** arthurc has quit IRC		01:10
*** rlucio has quit IRC		01:20
*** DubLo7 has quit IRC		01:32
*** dendro-afk is now known as dendrobates		01:47
zaitcev	https://blueprints.launchpad.net/swift/+spec/s3-hail -- ok, coding is basically done. Needs debugging and then deciding how and if we integrate things. Configuring through CLD is too arcane.	01:54
* arcane is too		01:55
*** mischer has quit IRC		01:55
*** Cybodog has joined #openstack		02:11
uvirtbot	New bug: #671704 in swift "Stats collector failing to parse log lines, server_name match fails" [Undecided,New] https://launchpad.net/bugs/671704	02:21
*** Ziad has joined #openstack		02:24
*** Ziad has left #openstack		02:28
*** zaitcev has quit IRC		02:28
*** dysinger has left #openstack		02:28
*** dendrobates is now known as dendro-afk		02:29
*** jakedahn has quit IRC		02:30
*** miclorb has joined #openstack		02:30
*** smithpg1002 has joined #openstack		02:33
*** Cybodog has quit IRC		02:44
*** msinhore has joined #openstack		02:48
*** miclorb has quit IRC		02:49
*** pvo has joined #openstack		02:59
*** ChanServ sets mode: +v pvo		02:59
*** DubLo7 has joined #openstack		02:59
*** miclorb has joined #openstack		03:03
*** smithpg1002 has quit IRC		03:33
*** pvo has quit IRC		03:39
*** schisamo has quit IRC		03:43
*** v has left #openstack		03:54
*** miclorb has quit IRC		03:56
*** msinhore has quit IRC		03:59
*** khussein has joined #openstack		04:20
*** khussein is now known as knighthacker		04:26
*** knighthacker is now known as KnightHacker		04:27
*** KnightHacker has quit IRC		04:30
*** omidhdl has joined #openstack		04:44
*** khussein has joined #openstack		04:45
*** omidhdl1 has joined #openstack		04:46
*** omidhdl has quit IRC		04:46
*** khussein has left #openstack		04:49
*** KnightHacker_ has joined #openstack		04:49
*** KnightHacker_ has quit IRC		04:51
*** KnightHacker has joined #openstack		05:20
*** KnightHacker has left #openstack		05:34
*** kainam has joined #openstack		05:37
*** kainam has quit IRC		05:38
*** kainam has joined #openstack		05:38
*** arcane has quit IRC		05:39
*** kainam has quit IRC		05:39
*** kainam has joined #openstack		05:40
*** kainam has quit IRC		05:40
*** arcane has joined #openstack		05:41
*** KnightHacker has joined #openstack		05:48
*** rsampaio has joined #openstack		05:53
*** ChrisAM has quit IRC		05:59
*** ChrisAM1 has joined #openstack		06:03
*** scaraffe has joined #openstack		06:15
*** KnightHacker has quit IRC		06:28
*** befreax has joined #openstack		07:59
*** miclorb_ has joined #openstack		08:44
*** miclorb_ has quit IRC		08:49
*** alekibango has quit IRC		08:53
*** aimon_ has joined #openstack		09:06
*** vishy has left #openstack		09:07
*** aimon has quit IRC		09:08
*** aimon_ is now known as aimon		09:08
*** miclorb_ has joined #openstack		09:14
*** arthurc has joined #openstack		09:19
*** rsampaio has quit IRC		09:20
*** miclorb_ has quit IRC		09:58
*** chmouel has quit IRC		10:27
*** arthurc has quit IRC		10:38
*** chmouel has joined #openstack		10:45
*** DubLo7 has quit IRC		11:29
*** allsystemsarego has joined #openstack		11:36
*** ucu7 has joined #openstack		11:41
*** ucu7 is now known as ucu7_		11:47
*** dendro-afk is now known as dendrobates		12:01
*** schisamo has joined #openstack		12:01
*** ucu7_ has quit IRC		12:01
*** ucu7_ has joined #openstack		12:01
*** ucu7_ is now known as ucu7		12:02
*** krish has joined #openstack		12:02
*** dendrobates is now known as dendro-afk		12:22
*** wiseneuron has joined #openstack		12:31
*** wiseneuron has left #openstack		12:33
*** schisamo has quit IRC		12:38
*** krish has quit IRC		13:05
*** ArdRigh has quit IRC		13:11
*** krish has joined #openstack		13:18
*** krish has joined #openstack		13:19
*** mischer has joined #openstack		13:19
*** pvo has joined #openstack		13:50
*** ChanServ sets mode: +v pvo		13:50
*** omidhdl1 has quit IRC		14:13
*** zykes- has quit IRC		14:26
*** zykes- has joined #openstack		14:27
*** xfrogman5 has joined #openstack		14:28
*** hazmat has quit IRC		14:29
*** xfrogman5 has quit IRC		14:34
*** arthurc has joined #openstack		14:35
*** xfrogman5 has joined #openstack		14:48
*** jtimberman has quit IRC		14:49
*** dendro-afk is now known as dendrobates		14:52
*** befreax has quit IRC		15:01
*** johnpur has joined #openstack		15:03
*** ChanServ sets mode: +v johnpur		15:03
*** johnpur has quit IRC		15:03
*** localhost has quit IRC		15:33
*** hggdh has quit IRC		15:34
*** localhost has joined #openstack		15:35
*** KnightHacker has joined #openstack		15:57
*** arcane has quit IRC		16:07
*** arcane has joined #openstack		16:07
*** wiseneuron has joined #openstack		16:10
*** wiseneuron has left #openstack		16:10
*** hggdh has joined #openstack		16:13
*** hggdh has quit IRC		16:15
*** cliguori has joined #openstack		16:16
*** scaraffe has left #openstack		16:18
*** hggdh has joined #openstack		16:25
*** dendrobates is now known as dendro-afk		16:28
*** pvo has quit IRC		16:35
*** blakeyeager has joined #openstack		16:35
*** pvo has joined #openstack		16:35
*** ChanServ sets mode: +v pvo		16:35
*** pvo_ has joined #openstack		16:37
*** ChanServ sets mode: +v pvo_		16:37
*** pvo has quit IRC		16:40
*** pvo_ is now known as pvo		16:40
* jeremyb is looking at the Updaters and Auditors sections of the arch overview		16:40
jeremyb	... only as large as the frequency at which the updater runs and may not even be noticed ...	16:41
jeremyb	but it doesn't seem to say what an "updater" is	16:42
jeremyb	(is this channel dead on weekends? i'm new :) )	16:42
jeremyb	also, for auditors and initial writes how is integrity checked? are checksums generated, stored and verified? where are they stored and what algorithm? do clients do any checking or just the server?	16:44
*** blakeyeager has quit IRC		16:47
* jeremyb has 2 different use cases in mind: storing a repo of images (over a million? trying to get a count) and/or thumbs derived from those images (~5 std sizes, plus misc arbitrary sizes) and an unrelated project with ~25 million emails and various other file types related to each of the emails (lists of links extracted from emails, screenshots of emails)		16:49
jeremyb	(this is swift of course)	16:49
notmyname	jeremyb: let me see if I can help	17:06
notmyname	(and, yes, the channel is much less active than weekdays)	17:07
jeremyb	notmyname: hi!	17:07
jeremyb	notmyname: no rush, this is just when i got around to asking	17:07
* jeremyb is screen'd :)		17:07
notmyname	jeremyb: when an object is PUT, swift attempts to write that into the container listing and update the container metadata (object count, container size, etc)	17:08
jeremyb	but first writes the object itself?	17:08
notmyname	but it will fail quickly to ensure good performance and queue it for an async update later	17:08
notmyname	ya. object is written first	17:08
notmyname	so the updater handles the async requests	17:09
notmyname	this is how you can be guaranteed to read your writes, but container listings may be eventually consistent	17:09
notmyname	integrity is checked on writes with the etag header (md5 sum of the object)	17:09
notmyname	the auditors scan the drives and verify that the checksums still match	17:10
notmyname	for objects	17:10
notmyname	and verify that the db isn't corrupted for container and accounts	17:10
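[Editor's sketch, not part of the log] The write-time check notmyname describes (the ETag header carrying the object's MD5 sum) can be illustrated on the client side roughly as below. The function names are hypothetical, and stripping surrounding quotes from the header value is an assumption about how a given server or proxy formats it; this is not Swift's own code.

```python
import hashlib


def md5_etag(data: bytes) -> str:
    """Return the hex MD5 digest a client would send as the ETag of `data`."""
    return hashlib.md5(data).hexdigest()


def verify_upload(data: bytes, etag_header: str) -> bool:
    """Compare a locally computed MD5 with an ETag value returned by the server.

    Assumption: the header may or may not be wrapped in double quotes,
    so quotes are stripped before comparing.
    """
    return md5_etag(data) == etag_header.strip('"').lower()


if __name__ == "__main__":
    payload = b"example object body"
    etag = md5_etag(payload)
    print(etag, verify_upload(payload, etag))
```

A client that gets a mismatch would typically retry the PUT rather than trust the stored copy.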
* jeremyb grumbles @ md5		17:11
*** rsampaio has joined #openstack		17:11
notmyname	it's fast and standard. and we aren't at risk for any preimage attacks	17:11
jeremyb	right :)	17:11
notmyname	your million+ image use case is a great fit for swift	17:11
jeremyb	unless the node is compromised	17:11
jeremyb	does swift do any compression on disk or wire?	17:12
notmyname	no	17:12
notmyname	I would recommend a few things to help your use case get better performance	17:12
jeremyb	does each object require at least 1 file on disk? what if objects are smaller than block size?	17:12
jeremyb	note: not familiar with XFS	17:13
notmyname	I'm by no means the expert on the swift team for XFS (looks at redbo), but it hasn't been an issue that is problematic as far as I know	17:13
* jeremyb listens for notmyname's recommendations		17:13
notmyname	but, yes, one file on disk per object	17:14
jeremyb	i meant if there were 1k files. that's already double file size because of inode size	17:14
notmyname	recommendation 1: use multiple containers. you will be able to use higher concurrency and get better throughput	17:14
notmyname	if there is a logical sharding method for your data, use that (grouped by month or date or resolution)	17:15
notmyname	also, i'd recommend that you not use container listings to determine if your object exists. container listings are going to be eventually consistent and relatively slow (especially for million+ item containers)	17:15
notmyname	so I'd recommend that you keep a local index of your data. this also will let you sort and group better than swift allows	17:16
notmyname	I think using sqlite3 is nice because you can back up the db file itself to swift too	17:16
*** patrick has joined #openstack		17:16
notmyname	simply sync your local index with swift occasionally	17:17
*** patrick is now known as Guest77282		17:17
notmyname	(that was recommendation 2)	17:17
*** Guest77282 is now known as kashif1		17:17
notmyname	beyond that, not much. use concurrency.	17:17
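[Editor's sketch, not part of the log] notmyname's two recommendations (shard objects across several containers by a natural key such as the month, and keep a local sqlite3 index instead of relying on container listings) might look something like this. The container naming scheme, table layout, and function names are all assumptions made for illustration; nothing here calls Swift itself.

```python
import sqlite3
from datetime import date


def container_for(prefix: str, day: date) -> str:
    """Pick a container name by month, e.g. 'images-2010-11' (hypothetical scheme)."""
    return f"{prefix}-{day:%Y-%m}"


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local index of uploaded objects."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects ("
        " container TEXT NOT NULL,"
        " name TEXT NOT NULL,"
        " etag TEXT NOT NULL,"
        " PRIMARY KEY (container, name))"
    )
    return conn


def record_upload(conn: sqlite3.Connection, container: str, name: str, etag: str) -> None:
    """Record an object after its PUT succeeded."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO objects VALUES (?, ?, ?)",
            (container, name, etag),
        )


def exists(conn: sqlite3.Connection, container: str, name: str) -> bool:
    """Answer 'do we have this object?' from the local index, not a container listing."""
    row = conn.execute(
        "SELECT 1 FROM objects WHERE container = ? AND name = ?",
        (container, name),
    ).fetchone()
    return row is not None
```

With an on-disk path instead of `:memory:`, the index file itself can be uploaded as an object, which is the backup trick mentioned above.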
* jeremyb reads		17:18
notmyname	some sort of dynamic compression would be a very interesting feature to add to swift. I'll have to add it to my list of "things I'd like to see in swift"	17:19
jeremyb	also, was thinking about encryption	17:20
jeremyb	one use case was:	17:20
*** kashif1 has joined #openstack		17:20
notmyname	encryption is a whole different ballgame. I'd be really reluctant to add it because it makes swift have to handle encryption keys. I think it's a much better feature for a client	17:20
* notmyname doesn't want to get in the business of key management		17:21
kashif1	Hello there	17:21
notmyname	holoway: kashif1	17:21
notmyname	errr	17:21
notmyname	howdy kashif1	17:21
*** rsampaio has quit IRC		17:21
notmyname	jeremyb: make sense? answer your questions?	17:21
kashif1	could someone please help me, i am setting up openstack and when i issue the command euca-upload-bundle -m /tmp/kernel.manifest.xml -b mybucket	17:22
jeremyb	notmyname: yeah, was writing my fantasy use case	17:22
kashif1	it throws an error saying i dont have permission to mybucket	17:22
jeremyb	notmyname: and i was already planning to do the natural partitioning you mentioned	17:22
notmyname	great	17:22
notmyname	jeremyb: I'd still like to hear your use case	17:24
jeremyb	2-3 nodes per object in semitrusted DC + 2-3 untrusted+distributed nodes (e.g. stick a node in someone's house) (this is currently <10TB total so you could get a big bang for your buck with residential nodes)	17:24
jeremyb	but with data that you don't want leaking if something is stolen from the house	17:25
jeremyb	i was thinking 4-5 sata drives with esata enclosure and a guruplug	17:26
kashif1	anybody can help me on the bucket permissions problem?	17:26
notmyname	jeremyb: interesting. one thing that has been mentioned (and will be talked about at the design summit next week) is having one logical cluster spread over a wide geographic area.	17:27
jeremyb	doesn't require much performance because there's no churn only additions so it's less than 1mbit/s to keep up once up to date	17:27
notmyname	technically, it's possible now, but there are some things that would need to change to keep performance up	17:27
notmyname	the general answer is that if you want your data to be encrypted, write encrypted data	17:28
*** rsampaio has joined #openstack		17:28
jeremyb	well i'd at least want to ensure that the untrusted node can't change the checksum for something on another node	17:28
notmyname	jeremyb: so swift is divided into availability zones (see the Ring docs). these zones could be widely dispersed	17:29
jeremyb	or delete or overwrite	17:29
jeremyb	right, i've read some about those	17:29
notmyname	that's pretty much what the auditors do. the object auditor will scan the drive and compare the checksum to the stored checksum	17:30
notmyname	it will quarantine bad objects and replication (from other zones) will replace the data	17:30
jeremyb	right, but where does the stored checksum come from?	17:31
notmyname	the initial write	17:31
jeremyb	i mean to feed the auditor	17:31
notmyname	ah	17:31
notmyname	the local store (the object metadata, stored in the fs xattrs)	17:32
notmyname	so, yes, it's all local	17:32
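[Editor's sketch, not part of the log] The object auditor behaviour described here (scan the drive, recompute each checksum, compare it with the locally stored one, quarantine mismatches so replication can restore them) can be approximated as follows. Swift keeps the stored checksum in filesystem xattrs; to stay portable this sketch assumes a plain dict of name to MD5 instead, and the directory layout and function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_md5(path: Path, chunk_size: int = 65536) -> str:
    """Compute the MD5 of a file in chunks so large objects are not loaded whole."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(data_dir: Path, quarantine_dir: Path, stored: dict) -> list:
    """Move every file whose MD5 no longer matches `stored` into quarantine.

    `stored` maps file name -> expected hex MD5 (a stand-in for xattr metadata).
    Returns the names that were quarantined, in sorted order.
    """
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    quarantined = []
    for name, expected in sorted(stored.items()):
        path = data_dir / name
        if path.is_file() and file_md5(path) != expected:
            shutil.move(str(path), str(quarantine_dir / name))
            quarantined.append(name)
    return quarantined
```

Because the expected checksum lives next to the data, this only detects accidental corruption; as jeremyb points out, it does not protect against a node that rewrites both.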
jeremyb	right	17:32
notmyname	we didn't design swift to provide perfect security in an untrusted environment. most of your needs could be solved by a client, but there are some things that swift is just not designed to do	17:34
*** Rend has joined #openstack		17:34
*** Rend has left #openstack		17:34
notmyname	in your example, I'd be more concerned about the network security than the disk security, but most things go out the window when you give the attacker physical access to hardware	17:35
notmyname	kashif1: sorry I can't help with your issue. perhaps some nova experts will be on later	17:36
kashif1	thanks man	17:37
notmyname	but I'm happy to help if you have questions about swift :-)	17:37
*** blpiatt has joined #openstack		17:37
*** JordanRinke has joined #openstack		17:39
*** dubsquared has joined #openstack		17:39
notmyname	jeremyb: thoughts?	17:40
jeremyb	notmyname: in a few, getting pinged in 3+ windows	17:40
notmyname	heh	17:41
*** kim0 has joined #openstack		17:55
kim0	Hi folks, I'm installing nova on ubuntu 10.10 based on the wiki guide. All steps are ok, except the final euca-run-instances is hanging for more than 5 minutes	17:57
kim0	any pointers as to what could be wrong	17:58
*** clayg has quit IRC		18:01
*** clayg_ has joined #openstack		18:02
jeremyb	euca-run-instances? based on eucalyptus?	18:05
* jeremyb reads back		18:06
kim0	ah yeah .. nvm .. some services were failing to start	18:06
kim0	namely nova-network .. because "dnsmasq" was already bound to the port	18:06
kim0	problem fixed for me	18:06
kim0	not sure though if it's a problem others would face	18:07
*** clayg_ is now known as clayg		18:07
kim0	sweet instance launched	18:08
jeremyb	notmyname: so basically i wanted to allow for 2 "residential" nodes to fail while still having a trusted copy of everything. so at least metadata (including checksums) would be signed by a private key in a vault that the residential nodes don't have access to	18:09
kim0	euca-describe-instances => Yields 3 results .. the one VM that's up, plus two old VMs that failed to start when nova-network was down	18:09
*** JordanRinke has quit IRC		18:09
*** metcalfc has joined #openstack		18:11
notmyname	jeremyb: that wouldn't be compatible with swift at all :-)	18:14
jeremyb	notmyname: anyway, like i said that was more fantasy	18:14
jeremyb	real life: what about snapshots? i know objects are immutable which makes it easier	18:15
notmyname	storing snapshots of something in swift or taking snapshots of swift?	18:15
jeremyb	can you make a fast and cheap copy of a container?	18:15
jeremyb	snaps of swift	18:15
jeremyb	but could be just a container not the whole cluster	18:15
notmyname	a container is an sqlite3 db file	18:15
jeremyb	i meant including all objects ref'd	18:16
notmyname	it only has an object listing and some metadata	18:16
notmyname	the only thing I know of that could store a backup of swift is swift ;-)	18:16
notmyname	I mean, where do you back up a 10PB cluster to?	18:16
notmyname	S3?	18:16
notmyname	Azure?	18:16
jeremyb	are you familiar with zfs?	18:16
notmyname	quite	18:17
jeremyb	one sec	18:17
notmyname	are you asking about a copy-on-write snapshot type feature?	18:17
jeremyb	yes	18:19
*** Kami_ has left #openstack		18:19
jeremyb	so, both my use cases seem to grow around 20GB / day	18:19
jeremyb	and it's all additions no deletes	18:19
jeremyb	1 is now ~9TB, 1 is ~12TB. so much smaller than 10PB	18:20
notmyname	i suppose object versioning would allow for something similar (versioning is something else on my "cool features to add to swift" list). the question is making it work at scale	18:22
jeremyb	i don't think that's necessary even	18:22
jeremyb	what you'd need is a way to prevent objects from being deleted if they're ref'd by a snap but not a "live" container	18:23
jeremyb	so if i'm partitioning on date then i want to iterate over containers periodically and decide that a given container will never get any more changes (or maybe just do it each time when switching to a new container) and then do a final backup of that container and lock it down by ACL	18:24
jeremyb	would be nice to be able to get atomic periodic backups of entire containers (to swift and then from there to anywhere) while they're still open for writes	18:26
notmyname	for your current use case, or are we still talking "what-ifs"?	18:26
jeremyb	current use cases, both	18:26
jeremyb	i guess another solution would be rotating containers, writing to 1 for a day then another, then switch back. back up each while idle	18:27
jeremyb	instead of just writing to one until full	18:27
notmyname	your containers are locked down pretty tight by default, so there isn't a need to further lock them down after writing to them (IMO)	18:27
*** befreax has joined #openstack		18:27
notmyname	but atomic backups are not something that will ever happen	18:28
jeremyb	even per container?	18:28
notmyname	doubtful	18:28
jeremyb	the "guess another solution" would do it but they wouldn't be entirely up to date	18:28
notmyname	how do you perform an atomic operation over all of the objects in a container when they are dispersed throughout hundreds of servers?	18:29
notmyname	eventually consistent backups, then	18:29
notmyname	and that, IMO, gets back into the realm of the client rather than swift-proper	18:30
jeremyb	so, at least in my cases, we can assume no deletes. so if it's in the container then it's readable. and objects are immutable so we don't have to worry about it changing during the backup	18:30
jeremyb	but the "no deletes" thing doesn't generalize	18:31
* jeremyb goes to read on ACLs		18:31
notmyname	I've got to go do some stuff around the house	18:32
jeremyb	k, thanks	18:32
notmyname	feel free to ask any questions	18:32
*** metcalfc has quit IRC		18:39
kim0	killing a VM thru virsh, nova still thinks it's running	18:41
jeremyb	can containers be moved between accounts?	18:44
jeremyb	also, is there any log of actions so you could replay them if you had a point in time snapshot?	18:44
jeremyb	(would need to have actual data in them)	18:44
*** kashif1 has quit IRC		18:50
*** mischer has quit IRC		18:54
*** krish has quit IRC		19:15
notmyname	jeremyb: containers cannot be moved between accounts. objects can be copied (swift-side) within an account.	19:34
notmyname	jeremyb: everything is logged, but the actual data isn't (or the logs would be a copy of the cluster!)	19:34
jeremyb	hrmm, k	19:34
jeremyb	right	19:34
jeremyb	that's what i wanted :)	19:34
notmyname	jeremyb: well, i suppose that a token with read access could be copied to a different account	19:35
notmyname	essentially, the server-side copy feature does a GET + PUT	19:35
notmyname	so if the GET works, the PUT will work too	19:35
jeremyb	k	19:35
*** blpiatt has quit IRC		19:36
jeremyb	i was just wondering about that. much more interested in logs with data :)	19:36
notmyname	honestly, why? then your log files are as big as the cluster.	19:38
*** dubsquared has quit IRC		19:38
*** zaitcev has joined #openstack		19:53
*** anticw has quit IRC		19:59
*** anticw has joined #openstack		19:59
jeremyb	notmyname: same as mysql. then you can back up the whole cluster and in between full backups back up the logs. then use the latest full + logs since then to recover	20:02
*** rsampaio has quit IRC		20:12
*** xfrogman5 has quit IRC		20:20
*** alekibango has joined #openstack		20:24
*** rsampaio has joined #openstack		20:32
*** pothos_ has joined #openstack		20:54
*** pothos has quit IRC		20:56
*** metcalfc has joined #openstack		20:56
*** pothos_ has quit IRC		20:57
*** pothos has joined #openstack		20:57
*** rlucio has joined #openstack		20:58
*** dendro-afk is now known as dendrobates		21:01
*** joearnol_ has joined #openstack		21:11
*** rsampaio_ has joined #openstack		21:28
*** rsampaio has quit IRC		21:29
*** joearnol_ has quit IRC		21:51
*** joearnold has joined #openstack		21:52
*** joearnold has quit IRC		22:05
*** allsystemsarego has quit IRC		22:08
*** dubsquared has joined #openstack		22:16
*** rsampaio_ has quit IRC		22:16
*** metcalfc has quit IRC		22:29
*** dubsquared1 has joined #openstack		22:45
*** dubsquared1 has quit IRC		22:47
*** dubsquared has quit IRC		22:47
*** befreax has quit IRC		23:03
*** DubLo7 has joined #openstack		23:24
*** DubLo7 has left #openstack		23:26
*** pharkmillups has joined #openstack		23:42
