19:01:35 <notmyname> #startmeeting swift
19:01:36 <openstack> Meeting started Wed Nov 13 19:01:35 2013 UTC and is due to finish in 60 minutes.  The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:40 <openstack> The meeting name has been set to 'swift'
19:01:49 <notmyname> thanks for joining. today's agenda is at https://wiki.openstack.org/wiki/Meetings/Swift
19:01:52 <notmyname> who's here
19:01:54 <notmyname> ?
19:01:56 <peluse> here
19:02:00 <zaitcev> o/
19:02:05 <portante> o/
19:02:05 <torgomatic> o/
19:02:05 <briancline> +1
19:02:06 <koolhead17> o/
19:02:07 * clayg lurks
19:02:18 <notmyname> great
19:02:24 <lincolnt> o/
19:02:47 <notmyname> so first item of business is to figure out the proper way to wrap uuid4(). because reasons
19:02:48 <notmyname> ;-)
19:02:58 <notmyname> (not really)
19:03:11 <zaitcev> I am confused. How hard can it be?
19:03:19 <notmyname> #topic HK summit recap
19:03:22 <briancline> we'll need a plugin manager
19:03:31 <notmyname> the hong kong summit was great
19:03:50 <peluse> agreed!
19:03:58 <notmyname> swift keynote on the first day. kinetic drives. just about every conversation involved storage policies. tons of interest from IBM, HP, and others
19:04:09 <notmyname> thanks peluse for doing the multi-ring demo
19:04:32 <peluse> my pleasure!  I'm just glad it worked :)
19:04:38 <notmyname> heh
19:04:56 <notmyname> overall, I was very happy with the summit. part of that, I think, was because we had the hackathon 2 weeks prior
19:05:12 <notmyname> any other questions or comments about the summit?
19:05:34 <peluse> I missed the mercado libre talk but saw it was posted on IRC - good to watch if anyone missed it
19:05:42 <notmyname> I think koolhead17 has a blog post about it (from a swiftstack perspective) that will go up soon. If needed, I can fill in any other gaps from a general swift perspective
19:05:51 <zaitcev> creiht's RAX preso was V.interesting, although he didn't divulge the number of objects
19:05:52 <notmyname> oh yeah. mercado libre loves swift
19:06:09 <notmyname> ya, RAX talked about 85 PB across their swift clusters (raw)
19:06:22 <notmyname> 20 PB deployed in france by enovance for cloudwatt
19:06:33 <koolhead17> Swift ^^
19:06:36 <notmyname> global clusters from concur
19:06:42 <creiht> zaitcev: :)
19:06:54 <notmyname> zaitcev: billions and billions?
19:07:12 <clayg> I think the most interesting part was creiht's sports coat - very snazzy
19:07:23 <creiht> haha
19:07:29 <notmyname> probably part of his new secret job as a spy
19:07:47 <peluse> don't sell yourself short clayg - the daemon factory stuff was very cool
19:08:03 <clayg> yeah i suppose I need to finish that up and get up a patch
19:08:09 <zaitcev> I understand the danger of racing Azure and S3 but I was thinking about how much our containers can take... S3 has some nifty auto-sharding thing, and I am curious if we're at the point where we need it or not.
19:08:25 <notmyname> the profiling middleware stuff was very cool. along with clayg's "pluggable" things
19:08:32 <clayg> whoa!
19:08:42 <portante> were the unconference sessions recorded as well?
19:08:48 <notmyname> zaitcev: we've "always" needed it. and we've always said "just use ssds"
19:08:54 <notmyname> portante: not AFAIK
19:09:02 <notmyname> portante: not any tech sessions, actually
19:09:09 <portante> oh
19:09:20 <portante> would be interested in blog posts on those sessions
19:09:35 <notmyname> there was also a really interesting tech session where a potential user went down a list of questions, essentially evaluating swift as the storage engine for a massive global messaging app
19:09:36 <portante> clayg: was your "pluggable" things recorded?
19:09:53 <briancline> me as well, I was pulled between the general and design stuff far too much
19:09:55 <notmyname> I want to try to have that in atlanta again in may
19:10:04 <torgomatic> atlanta?
19:10:14 <notmyname> also, next summit is in atlanta in may
19:10:16 <portante> or in the atlantic?
19:10:19 <notmyname> then paris in november
19:10:22 <peluse> have what in atlanta?
19:10:47 <glange> the paris summit is going to be so romantic
19:10:52 <notmyname> peluse: a tech session similar to the one LINE did this time where they are asking real-world use case questions about swift
19:11:02 <peluse> ahh, gotcha
19:11:09 <notmyname> glange: nothing like eating snails under the eiffel tower
19:11:53 <notmyname> anything else on HK or the summit?
19:12:06 <peluse> any next steps on the profiling middleware?
19:12:06 <notmyname> thanks to everyone who participated there
19:12:18 <notmyname> peluse: I think it's up for review
19:12:19 <portante> any design decisions made?
19:12:27 <peluse> I think KT was asking for core comm to pick it up right?
19:12:28 <notmyname> portante: like normal, not really
19:12:29 <portante> was it really a design summit for things?
19:12:34 <portante> k
19:12:41 <notmyname> portante: never really has been for swift
19:12:59 <notmyname> portante: that stuff normally happens in irc and in the day-to-day (and now maybe at hackathons)
19:13:06 <portante> but nobody else in openstack made sweeping design decisions that affect swift either?
19:13:32 <notmyname> portante: not that I know of. I'm sure they'll let us know ;-)
19:13:37 <torgomatic> well, we are going to be switching to Trove-provisioning for our Mongo databases, which are replacing sqlite
19:13:38 <portante> :)
19:13:51 <torgomatic> also, rabbitmq EVERYWHERE
19:13:52 <portante> torgomatic: I hope you are joiking
19:13:57 <notmyname> torgomatic: before or after using a pluggable library to replace uuid4()?
19:13:58 <portante> joking
19:14:26 <notmyname> ok, let's move on :-)
19:14:34 <notmyname> #topic swift-bench separation
19:14:51 <notmyname> this needs to happen. it's half-done now, which is really the worst place to be
19:15:10 <notmyname> any volunteers to lead the -ectomy in swift and cleaning up the swift-bench repo?
19:15:51 <notmyname> *crickets*
19:16:09 * portante whistles quietly to himself
19:16:11 * torgomatic takes one giant step backward
19:16:13 <zaitcev> I thought Sam knew what he was doing there
19:16:33 <torgomatic> I know what needs to happen, but I'm doing other stuff at the moment
19:16:49 <notmyname> as is everyone else it seems ;-)
19:16:58 <clayg> it's just so easy to not care about patches that want to add dependencies to swiftclient for "reasons" and admit that having bench in repo is sorta handy in a super lazy sort of way.
19:17:11 <notmyname> clayg: heh, ya
19:17:27 <notmyname> ok, I want to keep it in everyone's mind. still an outstanding TODO
19:17:35 <clayg> BEFORE ICEHOUSE!
19:17:41 * clayg feels a mantra coming on
19:17:51 <notmyname> I can keep bugging people in day-to-day IRC
19:18:03 <notmyname> #topic test coverage
19:18:05 <clayg> you should tag it to a milestone
19:18:20 <notmyname> clayg: ooohh. that's a good way to get it done ;-)
19:18:34 <zaitcev> I am having a vague feeling that Fedora packaging may win if we separate... I did stuff like that before but I need to see the benefit.
19:18:41 <portante> how 'bout create a test that fails until it is done?
19:18:51 <notmyname> in my catching up after the summit, there were some concerns raised about patches landing that lowered test coverage
19:19:07 <clayg> our coverage went down?
19:19:32 <notmyname> mostly this is a reminder to make sure that test coverage is checked when doing reviews
19:19:49 <notmyname> new patches should have tests for the lines of code they change
19:20:06 <notmyname> but we don't have a formal gate on that, so it's easy to miss it from time to time
19:20:28 <clayg> moar gates!  moar gates!  moar gates!
19:20:34 <notmyname> anyone want to add anything specifically here, or can we leave it at that?
19:20:35 <zaitcev> I don't see a lot of low-hanging fruit...
19:20:43 <zaitcev> Container updater perhaps
19:20:51 <portante> notmyname: I believe there is a POST commit job that runs the coverage, but is not used
19:21:05 <clayg> I'm getting like total 91% on my machine, i thought stuff had been coming up cause people have been adding tests for all that crap we never tested
19:21:27 <zaitcev> I once was dinged for lowering coverage... That was awful.
19:21:30 * clayg glares at db_replicator
19:21:42 <notmyname> ya, and I'm looking forward to the discoverable constraints thingy landing so that we can add similar coverage to functional tests
19:21:57 <notmyname> just saw this in -infra: "turns out swift wasn't designed to run in a single VM "  yay!
19:22:12 <clayg> REWRITE!
19:22:29 <zaitcev> What do you mean wasn't designed? What about SAIO?
19:22:54 <notmyname> zaitcev: as in devstack. one replica on one loopback device with one process in one vm
19:22:59 <notmyname> #topic metadata search API
19:23:19 <notmyname> HP presented on adding a metadata search API into swift
19:23:27 <notmyname> and softlayer has had something like that for a while
19:23:35 <lincolnt> Hi there, I presented.
19:23:39 <notmyname> lincolnt: you led the talk in HK. take it away
19:24:00 <lincolnt> See https://wiki.openstack.org/wiki/MetadataSearch
19:24:17 <lincolnt> The design session went well, thanks everyone for the great feedback!
19:24:38 <lincolnt> Brian Cline (SoftLayer) and I have updated the blueprint and started that Wiki page
19:24:52 <lincolnt> I posted our strawman API spec we wrote at HP
19:24:58 <lincolnt> And the design session slides
19:25:15 <portante> can you provide a sample use case?
19:25:16 <lincolnt> We're developing against that spec for proof-of-concept
19:25:36 <clayg> portante: you have metadata - you want to search it
19:25:38 <lincolnt> treating Swift as a black box, but we (all) want it to be a new standard API and ref impl for Swift.
19:25:59 <portante> just searching x-*-meta-* keys?
19:26:15 <notmyname> portante: my understanding is that softlayer (IBM) has it, HP wants it, and they don't want there to be 50 different APIs to do it
19:26:16 <lincolnt> portante: "Show me all objects in all containers where the object count > 10 and the last modified time is in the last 30 days and > 1 GB"
19:26:19 <clayg> portante: probably more like content-type and size and last modified and stuff
19:26:33 <clayg> oh heh
19:26:36 * clayg backs away
19:26:42 <creiht> searching user meta data would be nice
19:26:52 <lincolnt> ...and BTW where the container-meta-location = "New Zealand" and object-meta-physician = "Smith"
19:27:08 <tomerm> the api supports searching both system AND user metadata
19:27:13 <briancline> precisely, we'd like to adopt whatever becomes the standard
19:27:18 <lincolnt> Yes, searching system and custom metadata are both spec'd in the API
19:27:54 <portante> so we'll need to formally define the system metadata, unless we already do that today?
19:28:02 <lincolnt> The proposed API is broader and (we hope) more flexible/featureful/extensible than SoftLayer's but they have theirs established so they will be critical to the new API's success
19:28:03 <notmyname> lincolnt: so what's the next step?
19:28:08 <clayg> i'm keen on doing interesting indexing tricks with container db's, i don't really want "cluster search" and I'm less concerned about the api as long as it's usable.
19:28:09 <zaitcev> I am wondering if it's a great way to bring the system to its knees by launching queries that chew up CPU
19:28:27 <lincolnt> Asking everyone to review the Wiki, API, slides, and the blueprint pages there...
19:28:42 <lincolnt> Give us your experiences / needs / wishlists for what you'd like to search in Swift metadata...
19:28:48 <lincolnt> Poke holes in the API...
19:28:52 <notmyname> lincolnt: how do you want feedback?
19:29:05 <lincolnt> Suggest implementations e.g. metadata index stores to use.
19:29:13 <notmyname> does this need to be on the openstack-dev mailing list? do we do it on the wiki? in IRC? what's best for you?
19:29:29 <lincolnt> How would be best for the community? Add to the Wiki (I'd like, collects in one place)?
19:29:40 <briancline> preferably mailing list for discussion, with the wiki for the evolving idea on what it should look like
19:29:50 <notmyname> the mailing list is best for async discussion
19:29:54 <notmyname> ya, what briancline said
19:29:59 <lincolnt> We (briancline and I) will be sending an openstack-dev email after this, asking for the same inputs
19:30:07 <notmyname> the risk is too much "wrap uuid4()" style comments
19:30:13 <notmyname> lincolnt: great
19:30:34 <lincolnt> Cool, we can own the Wiki and update it based on email list feedback
19:30:51 <notmyname> sounds good. thanks
19:31:00 <notmyname> anything else on that topic for this meeting?
19:31:01 <briancline> zaitcev: that'll depend largely on what sort of indexing backend one uses, but for as long as we've had it out there I don't think we've had any issues like that
19:31:17 <lincolnt> zaitcev: Yes, good point, query optimization will be crucial
19:31:30 <lincolnt> And swift (ha) responses and ingests into the tables
19:31:35 <gholt> notmyname: We're starting to test ssync in staging now. Hopefully in prod in 2-3 weeks.
19:31:45 <lincolnt> Also let me introduce tomerm, our primary developer of HP's PoC
19:31:48 <notmyname> gholt: cool. I saw that you had packaged it
19:31:51 <tomerm> hi
19:31:57 <notmyname> tomerm: hi. welcome
19:32:01 <lincolnt> You can ping either of us, or briancline, on IRC
19:32:06 <gholt> notmyname: Oh sorry, I read your question as "anything else" without the "on that topic" ah well ;)
19:32:13 <dfg> can I suggest that we fix the whole POST as copy thing if we're going to add metadata searching?
19:32:14 <notmyname> gholt: :-)
19:32:32 <gholt> dfg: How's it broke again?
19:32:34 <notmyname> dfg: ya, that's a good idea
19:32:49 <notmyname> gholt: make POSTs update containers so you get fast posts
19:33:08 <dfg> it's not broken, but make it faster for large objects.
19:33:36 <dfg> if we're going to be searching on metadata being able to do POSTs without COPYing the whole object seems like it'll come in handy
19:33:52 <dfg> but maybe not
19:34:02 <gholt> Gotcha, tbh I can't remember why we have the two POST processes anymore. Lol
19:34:14 <torgomatic> something something container sync something something?
19:34:16 <lincolnt> dfg: Please reply to the upcoming openstack-dev email we send on this topic, I think I know what you're talking about, good idea.
19:34:17 <clayg> i mostly remember blaming container sync
19:34:20 <dfg> i don't think there is a good reason- just time to fix
19:34:36 <notmyname> it's an optimization, but I can see how POST would be more popular if metadata searches are available
19:34:36 <clayg> but I agree the real issue was content-type and container updates
19:34:39 <briancline> quick side note, on the SoftLayer side, sudorandom and CrackerJackMack are also extremely familiar with our implementation if questions come up, however their availability may vary for this topic
19:34:50 <notmyname> kk
19:34:59 <clayg> notmyname: with the x-delete-at stuff it's already sorta getting in style to run fast-post
19:35:14 <dfg> i might have gotten unsubscribed from openstack-dev...
19:35:38 <notmyname> dfg: filtering to /dev/null doesn't count as unsubscribe
19:35:40 <CrackerJackMack> I'm quasi available via IRC
19:35:49 <peluse> is there already a bp to address POST w/o copy?
19:35:53 <gholt> Yeah, every once in a while I think Rackspace marks it as spam and bounces everything and you have to resub
19:36:14 <clayg> gholt: sounds like a feature
19:36:19 <notmyname> gholt: and I'm not sure that's always a wrong choice ;-)
19:37:16 <notmyname> ok, moving on from metadata search
19:37:18 <notmyname> thanks lincolnt
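[Editor's sketch] The example query lincolnt gave above (size, last-modified window, plus container and object user metadata) can be illustrated with a small in-memory filter. This is only a sketch under stated assumptions: the row layout, the field names, the `search` helper, and the choice of 10**9 bytes for "1 GB" are all hypothetical and are not taken from the strawman API spec or from SoftLayer's implementation. A real service would query an index store rather than scan a list.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical index rows; every field name here is an illustration only.
NOW = datetime(2013, 11, 13, tzinfo=timezone.utc)
ROWS = [
    {"name": "scan-001", "bytes": 2 * 10**9,
     "last_modified": NOW - timedelta(days=3),
     "container_meta": {"location": "New Zealand"},
     "object_meta": {"physician": "Smith"}},
    {"name": "scan-002", "bytes": 5 * 10**8,          # too small
     "last_modified": NOW - timedelta(days=3),
     "container_meta": {"location": "New Zealand"},
     "object_meta": {"physician": "Smith"}},
    {"name": "scan-003", "bytes": 3 * 10**9,
     "last_modified": NOW - timedelta(days=90),       # too old
     "container_meta": {"location": "New Zealand"},
     "object_meta": {"physician": "Smith"}},
    {"name": "scan-004", "bytes": 3 * 10**9,
     "last_modified": NOW - timedelta(days=1),
     "container_meta": {"location": "Iceland"},       # wrong location
     "object_meta": {"physician": "Smith"}},
]


def search(rows, min_bytes, modified_after, container_meta, object_meta):
    """Return rows matching system criteria and exact-match user metadata."""
    def matches(row):
        return (
            row["bytes"] > min_bytes
            and row["last_modified"] >= modified_after
            and all(row["container_meta"].get(k) == v
                    for k, v in container_meta.items())
            and all(row["object_meta"].get(k) == v
                    for k, v in object_meta.items())
        )
    return [row for row in rows if matches(row)]


results = search(
    ROWS,
    min_bytes=10**9,
    modified_after=NOW - timedelta(days=30),
    container_meta={"location": "New Zealand"},
    object_meta={"physician": "Smith"},
)
print([row["name"] for row in results])
```

A linear scan like this is exactly the CPU cost zaitcev raises above, which is why the choice of indexing backend matters for any real implementation.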
19:37:31 <notmyname> #topic timestamps + modified-since headers
19:37:36 <notmyname> portante: you're up
19:37:46 <portante> so are folks familiar with the problem?
19:37:55 <notmyname> rfc calls for int timestamps?
19:37:59 <portante> we store timestamps using microsecond resolution
19:38:29 <portante> the http protocol only allows for seconds resolution in last-modified, if-[un]modified-since headers
19:38:41 <portante> so if you HEAD an object after PUTing it
19:39:00 <portante> you get a truncated timestamp: 13.9 ends up being 13
19:39:17 <portante> if you then use that last-modified value with if-unmodified-since, you get a 412
19:39:33 <portante> if you use that with if-modified-since you'll fetch the object, even though it has not changed
19:39:55 <portante> I am not sure what to do about this personally
19:40:09 <notmyname> last-modified with "13" gives an error?
19:40:26 <notmyname> err...if-unmodified-since
19:40:31 <torgomatic> well, if you only get second-level resolution with HTTP, we have to treat all same-second updates as identical
19:40:38 <torgomatic> so, maybe round up to <second>.99999 ?
19:40:49 <notmyname> torgomatic: or floor it
19:41:22 <notmyname> portante: does the normalize timestamp method not account for this?
19:41:34 <portante> it does not have anything to do with this
19:41:45 <portante> the x-timestamp value comes from the object
19:42:17 <portante> the format of the last-modified and if-[un]modified-since headers is http-date format (verbose human readable time string)
19:42:50 <notmyname> normalize what comes in on the request. is the question to count it as .99999 or .0?
19:42:55 <briancline> flooring seems best
19:43:19 <portante> I think we have to ceiling x-timestamp value when returning last-modified, and then do the same when generating datetime objects when comparing the if-* headers
19:43:20 <lincolnt> Could we define a meta HTTP header to hold the microseconds so clients can provide it?
19:43:47 <portante> lincolnt: that is an option, but clients would have to change
19:43:50 <gholt> lincolnt: You could, but that doesn't help with all the existing http-based stuff out there.
19:44:36 <portante> and we tried changing the http-date format to add the trailing microseconds and some proxies just strip it and reformat the date without it
19:44:40 <lincolnt> right, what you're saying about flooring etc would still have to happen, but allows clients to add it in the future
19:45:03 <briancline> if you're performing that in both places, why would it matter whether to floor or ceil it?
19:45:15 <zaitcev> If I understood right, the problem is 412. Just make sure it doesn't happen and return the object.
19:45:23 <portante> yes, if a request had an x-swift-if-[un]modified-since header, we could accept that full format
19:45:29 <notmyname> we return x-timestamp, so if you floor the last-modified value, you satisfy the rfc and give the user the chance to get more accurate
19:45:44 <portante> we don't return the x-timestamp value as is
19:46:01 <portante> that header I don't think is allowed in responses by default, though I could be wrong
19:46:30 <notmyname> hmm
19:46:36 <portante> the last-modified header is represented by the last_modified property on a Response object, which is a _datetime_property() object
19:46:59 <portante> that object converts to the string our x-timestamp float value, dropping the microseconds on the flow
19:47:02 <portante> floor
19:47:30 <notmyname> ok
19:47:44 <zaitcev> Do "swift stat" and get X-Timestamp: 1368649471.47379
19:48:08 <zaitcev> It started happening a while ago. Before 1.4
19:48:18 <portante> so we could change last-modified to ceiling and then change the if-[un]modified-since code to ceiling
19:48:25 <zaitcev> At the time I found it annoying that it leaked.
19:48:31 <torgomatic> I mean, fundamentally the problem is: if an object has Last-Modified: Blah, and I make a GET with If-Modified-Since: Blah, I want a 304 Not Modified response. Right?
19:48:40 <torgomatic> or am I misunderstanding things?
19:48:41 <portante> but then we have to document that there is a possibility of missing sub-second updates
19:48:50 <portante> yes
19:48:52 <portante> torgomatic
19:49:11 <portante> zaitcev: not sure we want to rely on that "leak"
19:49:49 <peluse> not sure I follow the "leak" comment
19:49:51 <notmyname> is -since inclusive or exclusive?
19:50:20 <portante> peluse: I believe zaitcev is saying that we did not return an x-timestamp header in responses before 1.4
19:50:28 <peluse> ahh
19:50:33 <zaitcev> yes
19:50:53 <gholt> Quick test that shows the python libs floor: https://gist.github.com/gholt/6963bffe5a20cbded451
19:52:19 <notmyname> portante: does that ^ give you anything?
19:52:42 <portante> sec
19:53:11 <notmyname> should we punt this to #openstack-swift in the interest of time?
19:53:18 <portante> sure
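[Editor's sketch] The mismatch portante describes can be reproduced with the Python standard library alone. The sketch below is not Swift's code: the helper names are hypothetical, and it picks flooring on both sides only because that is one of the options floated above (ceiling on both sides would be the symmetric alternative). The sample value is the X-Timestamp zaitcev quoted from "swift stat".

```python
import math
from email.utils import formatdate, parsedate_to_datetime


def http_date(ts):
    """Render epoch seconds as an HTTP-date; the sub-second part is dropped."""
    return formatdate(math.floor(ts), usegmt=True)


def parse_http_date(value):
    """Parse an HTTP-date header back into whole epoch seconds."""
    return int(parsedate_to_datetime(value).timestamp())


def unmodified_since_naive(x_timestamp, header):
    # Compares the full-resolution stored value with a whole-second header,
    # so an object stored at 13.9 looks newer than a header saying 13.
    return x_timestamp <= parse_http_date(header)


def unmodified_since_floored(x_timestamp, header):
    # Truncates the stored value first so both sides have 1 s resolution.
    return math.floor(x_timestamp) <= parse_http_date(header)


x_timestamp = 1368649471.47379            # stored with sub-second resolution
last_modified = http_date(x_timestamp)    # what a HEAD would hand back

print(last_modified)
print(unmodified_since_naive(x_timestamp, last_modified))    # False: would 412
print(unmodified_since_floored(x_timestamp, last_modified))  # True
```

The trade-off portante mentions still applies: once both sides are reduced to whole seconds, two updates within the same second cannot be told apart by these headers.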
19:53:26 <notmyname> #topic open discussion
19:53:38 <notmyname> anything else? if not, let's figure out the timestamp stuff
19:53:57 <portante> acoles 's work
19:53:58 <zaitcev> I meant to ask gholt to look out for "DB locked" failures _in servers servicing requests_.
19:54:00 <notmyname> gholt: RAX is now testing basically at master, including ssync and early quorum
19:54:12 <zaitcev> like this https://bugs.launchpad.net/swift/+bug/1224253
19:54:36 <zaitcev> Apparently it started happening in Havana. No clue why or if it really is the case
19:54:44 <portante> notmyname: there has been some discussion about pipelines and adding a mandatory header stripper at the left of the pipeline
19:55:12 <notmyname> portante: ya. alpha_ori has a patch to do nifty things to the pipeline
19:55:28 <portante> yes, he does it only for config dir environments
19:55:29 <notmyname> swifterdarrell skewered it, and alpha_ori will be back in the office late this week or next week
19:55:37 <zaitcev> I see "DB locked" in auditors often but whatever, that may be ok unless it prevents repair.
19:55:39 <portante> so I am looking at how to do that in general
19:55:48 <notmyname> portante: use config dirs! ;-)
19:56:51 <notmyname> acoles: did you have something?
19:56:52 <portante> notmyname: sure!
19:58:30 <notmyname> ok, I don't think acoles is here
19:58:36 <notmyname> thanks everyone for attending
19:58:41 <notmyname> see you in two weeks
19:58:41 <portante> thanks
19:58:43 <notmyname> #endmeeting