19:01:35 <notmyname> #startmeeting swift
19:01:36 <openstack> Meeting started Wed Nov 13 19:01:35 2013 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:40 <openstack> The meeting name has been set to 'swift'
19:01:49 <notmyname> thanks for joining. today's agenda is at https://wiki.openstack.org/wiki/Meetings/Swift
19:01:52 <notmyname> who's here
19:01:54 <notmyname> ?
19:01:56 <peluse> here
19:02:00 <zaitcev> o/
19:02:05 <portante> o/
19:02:05 <torgomatic> o/
19:02:05 <briancline> +1
19:02:06 <koolhead17> o/
19:02:07 * clayg lurks
19:02:18 <notmyname> great
19:02:24 <lincolnt> o/
19:02:47 <notmyname> so first item of business is to figure out the proper way to wrap uuid4(). because reasons
19:02:48 <notmyname> ;-)
19:02:58 <notmyname> (not really)
19:03:11 <zaitcev> I am confused. How hard can it be?
19:03:19 <notmyname> #topic HK summit recap
19:03:22 <briancline> we'll need a plugin manager
19:03:31 <notmyname> the hong kong summit was great
19:03:50 <peluse> agreed!
19:03:58 <notmyname> swift keynote on the first day. kinetic drives. just about every conversation involved storage policies. tons of interest from IBM, HP, and others
19:04:09 <notmyname> thanks peluse for doing the multi-ring demo
19:04:32 <peluse> my pleasure! I'm just glad it worked :)
19:04:38 <notmyname> heh
19:04:56 <notmyname> overall, I was very happy with the summit. part of that, I think, was because we had the hackathon 2 weeks prior
19:05:12 <notmyname> any other questions or comments about the summit?
19:05:34 <peluse> I missed the mercado libre talk but saw it was posted on IRC - good to watch if anyone missed it
19:05:42 <notmyname> I think koolhead17 has a blog post about it (from a swiftstack perspective) that will go up soon. If needed, I can fill in any other gaps from a general swift perspective
19:05:51 <zaitcev> creiht's RAX preso was V.interesting, although he didn't divulge the number of objects
19:05:52 <notmyname> oh yeah. mercado libre loves swift
19:06:09 <notmyname> ya, RAX talked about 85 PB across their swift clusters (raw)
19:06:22 <notmyname> 20 PB deployed in france by enovance for cloudwatt
19:06:33 <koolhead17> Swift ^^
19:06:36 <notmyname> global clusters from concur
19:06:42 <creiht> zaitcev: :)
19:06:54 <notmyname> zaitcev: billions and billions?
19:07:12 <clayg> I think the most interesting part was creiht's sports coat - very snazzy
19:07:23 <creiht> haha
19:07:29 <notmyname> probably part of his new secret job as a spy
19:07:47 <peluse> don't sell yourself short clayg - the daemon factory stuff was very cool
19:08:03 <clayg> yeah i suppose I need to finish that up and get up a patch
19:08:09 <zaitcev> I understand the danger of racing Azure and S3 but I was thinking about how much our containers can take... S3 has some nifty auto-sharding thing, and I am curious if we're at the point where we need it or not.
19:08:25 <notmyname> the profiling middleware stuff was very cool. along with clayg's "pluggable" things
19:08:32 <clayg> whoa!
19:08:42 <portante> were the unconference sessions recorded as well?
19:08:48 <notmyname> zaitcev: we've "always" needed it. and we've always said "just use ssds"
19:08:54 <notmyname> portante: not AFAIK
19:09:02 <notmyname> portante: not any tech sessions, actually
19:09:09 <portante> oh
19:09:20 <portante> would be interested in blog posts on those sessions
19:09:35 <notmyname> there was also a really interesting tech session where a potential user went down a list of questions, essentially evaluating swift as the storage engine for a massive global messaging app
19:09:36 <portante> clayg: was your "pluggable" things recorded?
19:09:53 <briancline> me as well, I was pulled between the general and design stuff far too much
19:09:55 <notmyname> I want to try to have that in atlanta again in may
19:10:04 <torgomatic> atlanta?
19:10:14 <notmyname> also, next summit is in atlanta in may
19:10:16 <portante> or in the atlantic?
19:10:19 <notmyname> then paris in november
19:10:22 <peluse> have what in atlanta?
19:10:47 <glange> the paris summit is going to be so romantic
19:10:52 <notmyname> peluse: a tech session similar to the one LINE did this time where they are asking real-world use case questions about swift
19:11:02 <peluse> ahh, gotcha
19:11:09 <notmyname> glange: nothing like eating snails under the eiffel tower
19:11:53 <notmyname> anything else on HK or the summit?
19:12:06 <peluse> any next steps on the profiling middleware?
19:12:06 <notmyname> thanks to everyone who participated there
19:12:18 <notmyname> peluse: I think it's up for review
19:12:19 <portante> any design decisions made?
19:12:27 <peluse> I think KT was asking for core comm to pick it up right?
19:12:28 <notmyname> portante: like normal, not really
19:12:29 <portante> was it really a design summit for things?
19:12:34 <portante> k
19:12:41 <notmyname> portante: never really has been for swift
19:12:59 <notmyname> portante: that stuff normally happens in irc and in the day-to-day (and now maybe at hackathons)
19:13:06 <portante> but nobody else in openstack made sweeping design decisions that affect swift either?
19:13:32 <notmyname> portante: not that I know of. I'm sure they'll let us know ;-)
19:13:37 <torgomatic> well, we are going to be switching to Trove-provisioning for our Mongo databases, which are replacing sqlite
19:13:38 <portante> :)
19:13:51 <torgomatic> also, rabbitmq EVERYWHERE
19:13:52 <portante> torgomatic: I hope you are joiking
19:13:57 <notmyname> torgomatic: before or after using a pluggable library to replace uuid4()?
19:13:58 <portante> joking
19:14:26 <notmyname> ok, let's move on :-)
19:14:34 <notmyname> #topic swift-bench separation
19:14:51 <notmyname> this needs to happen. it's half-done now, which is really the worst place to be
19:15:10 <notmyname> any volunteers to lead the -ectomy in swift and cleaning up the swift-bench repo?
19:15:51 <notmyname> *crickets*
19:16:09 * portante whistles quietly to himself
19:16:11 * torgomatic takes one giant step backward
19:16:13 <zaitcev> I thought Sam knew what he was doing there
19:16:33 <torgomatic> I know what needs to happen, but I'm doing other stuff at the moment
19:16:49 <notmyname> as is everyone else it seems ;-)
19:16:58 <clayg> it's just so easy to not care about patches that want to add dependencies to swiftclient for "reasons" and admit that having bench in repo is sorta handy in a super lazy sort of way.
19:17:11 <notmyname> clayg: heh, ya
19:17:27 <notmyname> ok, I want to keep it in everyone's mind. still an outstanding TODO
19:17:35 <clayg> BEFORE ICEHOUSE!
19:17:41 * clayg feels a mantra coming on
19:17:51 <notmyname> I can keep bugging people in day-to-day IRC
19:18:03 <notmyname> #topic test coverage
19:18:05 <clayg> you should tag it to a milestone
19:18:20 <notmyname> clayg: ooohh. that's a good way to get it done ;-)
19:18:34 <zaitcev> I am having a vague feeling that Fedora packaging may win if we separate... I did stuff like that before but I need to see the benefit.
19:18:41 <portante> how 'bout create a test that fails until it is done?
19:18:51 <notmyname> in my catching up after the summit, there were some concerns raised about patches landing that lowered test coverage
19:19:07 <clayg> our coverage went down?
19:19:32 <notmyname> mostly this is a reminder to make sure that test coverage is checked when doing reviews
19:19:49 <notmyname> new patches should have tests for the lines of code they change
19:20:06 <notmyname> but we don't have a formal gate on that, so it's easy to miss it from time to time
19:20:28 <clayg> moar gates! moar gates! moar gates!
19:20:34 <notmyname> anyone want to add anything specifically here, or can we leave it at that?
19:20:35 <zaitcev> I don't see a lot of low-hanging fruit...
19:20:43 <zaitcev> Container updater perhaps
19:20:51 <portante> notmyname: I believe there is a POST commit job that runs the coverage, but is not used
19:21:05 <clayg> I'm getting like total 91% on my machine, i thought stuff had been coming up cause people have been adding tests for all that crap we never tested
19:21:27 <zaitcev> I once was dinged for lowering coverage... That was awful.
19:21:30 * clayg glares at db_replicator
19:21:42 <notmyname> ya, and I'm looking forward to the discoverable constraints thingy landing so that we can add similar coverage to functional tests
19:21:57 <notmyname> just saw this in -infra: "turns out swift wasn't designed to run in a single VM " yay!
19:22:12 <clayg> REWRITE!
19:22:29 <zaitcev> What do you mean wasn't designed? What about SAIO?
19:22:54 <notmyname> zaitcev: as in devstack. one replica on one loopback device with one process in one vm
19:22:59 <notmyname> #topic metadata search API
19:23:19 <notmyname> HP presented on adding a metadata search API into swift
19:23:27 <notmyname> and softlayer has had something like that for a while
19:23:35 <lincolnt> Hi there, I presented.
19:23:39 <notmyname> lincolnt: you led the talk in HK. take it away
19:24:00 <lincolnt> See https://wiki.openstack.org/wiki/MetadataSearch
19:24:17 <lincolnt> The design session went well, thanks everyone for the great feedback!
19:24:38 <lincolnt> Brian Cline (SoftLayer) and I have updated the blueprint and started that Wiki page
19:24:52 <lincolnt> I posted our strawman API spec we wrote at HP
19:24:58 <lincolnt> And the design session slides
19:25:15 <portante> can you provide a sample use case?
19:25:16 <lincolnt> We're developing against that spec for proof-of-concept
19:25:36 <clayg> portante: you have metadata - you want to search it
19:25:38 <lincolnt> treating Swift as a black box, but we (all) want it to be a new standard API and ref impl for Swift.
19:25:59 <portante> just searching x-*-meta-* keys?
19:26:15 <notmyname> portante: my understanding is that softlayer (IBM) has it, HP wants it, and they don't want there to be 50 different APIs to do it
19:26:16 <lincolnt> portante: "Show me all objects in all containers where the object count > 10 and the last modified time is in the last 30 days and > 1 GB"
19:26:19 <clayg> portante: probably more like content-type and size and last modified and stuff
19:26:33 <clayg> oh heh
19:26:36 * clayg backs away
19:26:42 <creiht> searching user meta data would be nice
19:26:52 <lincolnt> ...and BTW where the container-meta-location = "New Zealand" and object-meta-physician = "Smith"
19:27:08 <tomerm> the api support searching both system AND user metadata
19:27:13 <briancline> precisely, we'd like to adopt whatever becomes the standard
19:27:18 <lincolnt> Yes, searching system and custom metadata are both spec'd in the API
19:27:54 <portante> so we'll need to formally define the system metadata, unless we already do that today?
19:28:02 <lincolnt> The proposed API is broader and (we hope) more flexible/featureful/extensible than SoftLayer's but they have theirs established so they will be critical to the new API's success
19:28:03 <notmyname> lincolnt: so what's the next step?
19:28:08 <clayg> i'm keen on doing interesting indexing tricks with container db's, i don't really want "cluster search" and I'm less concerned about the api as long as it's usable.
19:28:09 <zaitcev> I am wondering if it's a great way to bring the system to its knees by launching queries that chew up CPU
19:28:27 <lincolnt> Asking everyone to review the Wiki, API, slides, and the blueprint pages there...
19:28:42 <lincolnt> Give us your experiences / needs / wishlists for what you'd like to search in Swift metadata...
19:28:48 <lincolnt> Poke holes in the API...
19:28:52 <notmyname> lincolnt: how do you want feedback?
19:29:05 <lincolnt> Suggest implementations e.g. metadata index stores to use.
19:29:13 <notmyname> does this need to be on the openstack-dev mailing list? do we do it on the wiki? in IRC? what's best for you?
19:29:29 <lincolnt> How would be best for the community? Add to the Wiki (I'd like, collects in one place)?
19:29:40 <briancline> preferably mailing list for discussion, with the wiki for the evolving idea on what it should look like
19:29:50 <notmyname> the mailing list is best for async discussion
19:29:54 <notmyname> ya, what briancline said
19:29:59 <lincolnt> We (briancline and I) will be sending a openstack-dev email after this, asking for the same inputs
19:30:07 <notmyname> the risk is too much "wrap uuid4()" style comments
19:30:13 <notmyname> lincolnt: great
19:30:34 <lincolnt> Cool, we can own the Wiki and update it based on email list feedback
19:30:51 <notmyname> sounds good. thanks
19:31:00 <notmyname> anything else on that topic for this meeting?
19:31:01 <briancline> zaitcev: that'll depend largely on what sort of indexing backend one uses, but for as long as we've had it out there I don't think we've had any issues like that
19:31:17 <lincolnt> zaitcev: Yes, good point, query optimization will be crucial
19:31:30 <lincolnt> And swift (ha) responses and ingests into the tables
19:31:35 <gholt> notmyname: We're starting to test ssync in staging now. Hopefully in prod in 2-3 weeks.
19:31:45 <lincolnt> Also let me introduce tomerm, our primary developer of HP's PoC
19:31:48 <notmyname> gholt: cool. I saw that you had packaged it
19:31:51 <tomerm> hi
19:31:57 <notmyname> tomerm: hi. welcome
19:32:01 <lincolnt> You can ping either of us, or briancline, on IRC
19:32:06 <gholt> notmyname: Oh sorry, I read your question as "anything else" without the "on that topic" ah well ;)
19:32:13 <dfg> can I suggest that we fix the whole POST as copy thing if we're going to add metadata searching?
19:32:14 <notmyname> gholt: :-)
19:32:32 <gholt> dfg: How's it broke again?
19:32:34 <notmyname> dfg: ya, that's a good idea
19:32:49 <notmyname> gholt: make POSTs update containers so you get fast posts
19:33:08 <dfg> it's not broken- but make it faster for large objects.
19:33:36 <dfg> if we're going to be searching on metadata being able to do POSTs without COPYing the whole object seems like it'll come in handy
19:33:52 <dfg> but maybe not
19:34:02 <gholt> Gotcha, tbh I can't remember why we have the two POST processes anymore. Lol
19:34:14 <torgomatic> something something container sync something something?
19:34:16 <lincolnt> dfg: Please reply to the upcoming openstack-dev email we send on this topic, I think I know what you're talking about, good idea.
19:34:17 <clayg> i mostly remember blaming container sync
19:34:20 <dfg> i don't think there is a good reason- just time to fix
19:34:36 <notmyname> it's an optimization, but I can see how POST would be more popular if metadata searches are available
19:34:36 <clayg> but I agree the real issue was content-type and container updates
19:34:39 <briancline> quick side note, on the SoftLayer side, sudorandom and CrackerJackMack are also extremely familiar with our implementation if questions come up, however their availability may vary for this topic
19:34:50 <notmyname> kk
19:34:59 <clayg> notmyname: with the x-delete-at stuff it's already sorta getting in style to run fast-post
19:35:14 <dfg> i might have gotten unsubscribed from openstack-dev...
19:35:38 <notmyname> dfg: filtering to /dev/null doesn't count as unsubscribe
19:35:40 <CrackerJackMack> I'm quasi available via IRC
19:35:49 <peluse> is there already a bp to address POST w/o copy?
19:35:53 <gholt> Yeah, every once in a while I think Rackspace marks it as spam and bounces everything and you have to resub
19:36:14 <clayg> gholt: sounds like a feature
19:36:19 <notmyname> gholt: and I'm not sure that's always a wrong choice ;-)
19:37:16 <notmyname> ok, moving on from metadata search
19:37:18 <notmyname> thanks lincolnt
19:37:31 <notmyname> #topic timestamps + modified-since headers
19:37:36 <notmyname> portante: you're up
19:37:46 <portante> so are folks familiar with the problem?
19:37:55 <notmyname> rfc calls for int timestamps?
19:37:59 <portante> we store timestamps using microsecond resolution
19:38:29 <portante> the http protocol only allows for seconds resolution in last-modified, if-[un]modified-since headers
19:38:41 <portante> so if you HEAD an object after PUTing it
19:39:00 <portante> you get a truncated timestamp: 13.9 ends up being 13
19:39:17 <portante> if you then use that last-modified value with if-unmodified-since, you get a 412
19:39:33 <portante> if you use that with if-modified-since you'll fetch the object, even though it has not changed
19:39:55 <portante> I am not sure what to do about this personally
19:40:09 <notmyname> last-modified with "13" gives an error?
19:40:26 <notmyname> err...if-unmodified-since
19:40:31 <torgomatic> well, if you only get second-level resolution with HTTP, we have to treat all same-second updates as identical
19:40:38 <torgomatic> so, maybe round up to <second>.99999 ?
19:40:49 <notmyname> torgomatic: or floor it
19:41:22 <notmyname> portante: does the normalize timestamp method not account for this?
19:41:34 <portante> it does not have anything to do with this
19:41:45 <portante> the x-timestamp value comes from the object
19:42:17 <portante> the format of the last-modified and if-[un]modified-since headers is http-date format (verbose human readable time string)
19:42:50 <notmyname> normalize what comes in on the request. is the question to count it as .99999 or .0?
19:42:55 <briancline> flooring seems best
19:43:19 <portante> I think we have to ceiling x-timestamp value when returning last-modified, and then do the same when generating datetime objects when comparing the if-* headers
19:43:20 <lincolnt> Could we define a meta HTTP header to hold the microseconds so clients can provide it?
19:43:47 <portante> lincolnt: that is an option, but clients would have to change
19:43:50 <gholt> lincolnt: You could, but that doesn't help with all the existing http-based stuff out there.
19:44:36 <portante> and we tried changing the http-date format to add the trailing microseconds and some proxies just strip it and reformat the date without it
19:44:40 <lincolnt> right, what you're saying about flooring etc would still have to happen, but allows clients to add it in the future
19:45:03 <briancline> if you're performing that in both places, why would it matter whether to floor or ceil it?
19:45:15 <zaitcev> If I understood right, the problem is 412. Just make sure it doesn't happen and return the object.
19:45:23 <portante> yes, if a request had an x-swift-if-[un]modified-since header, we could accept that full format
19:45:29 <notmyname> we return x-timestamp, so if you floor the last-modified value, you satisfy the rfc and give the user the chance to get more accurate
19:45:44 <portante> we don't return the x-timestamp value as is
19:46:01 <portante> that header I don't think is allowed in responses by default, though I could be wrong
19:46:30 <notmyname> hmm
19:46:36 <portante> the last-modified header is represented by the last_modified property on a Response object, which is a _datetime_property() object
19:46:59 <portante> that object converts to the string our x-timestamp float value, dropping the microseconds on the flow
19:47:02 <portante> floor
19:47:30 <notmyname> ok
19:47:44 <zaitcev> Do "swift stat" and get X-Timestamp: 1368649471.47379
19:48:08 <zaitcev> It started happening a while ago. Before 1.4
19:48:18 <portante> so we could change last-modified to ceiling and then change the if-[un]modified-since code to ceiling
19:48:25 <zaitcev> At the time I found it annoying that it leaked.
19:48:31 <torgomatic> I mean, fundamentally the problem is: if an object has Last-Modified: Blah, and I make a GET with If-Unmodified-Since: Blah, I want a 304 Not Modified response. Right?
19:48:40 <torgomatic> or am I misunderstanding things?
19:48:41 <portante> but then we have to document that there is a possibility of missing sub-second updates
19:48:50 <portante> yes
19:48:52 <portante> torgomatic
19:49:11 <portante> zaitcev: not sure we want to rely on that "leak"
19:49:49 <peluse> not sure I follow the "leak" comment
19:49:51 <notmyname> is -since inclusive or exclusive?
19:50:20 <portante> peluse: I believe zaitcev is saying that we did not return an x-timestamp header in responses before 1.4
19:50:28 <peluse> ahh
19:50:33 <zaitcev> yes
19:50:53 <gholt> Quick test that shows the python libs floor: https://gist.github.com/gholt/6963bffe5a20cbded451
19:52:19 <notmyname> portante: does that ^ give you anything?
19:52:42 <portante> sec
19:53:11 <notmyname> should we punt this to #openstack-swift in the interest of time?
19:53:18 <portante> sure
19:53:26 <notmyname> #topic open discussion
19:53:38 <notmyname> anything else? if not, let's figure out the timestamp stuff
19:53:57 <portante> acoles 's work
19:53:58 <zaitcev> I meant to ask gholt to look out for "DB locked" failures _in servers servicing requests_.
19:54:00 <notmyname> gholt: RAX is now testing basically at master, including ssync and early quorum
19:54:12 <zaitcev> like this https://bugs.launchpad.net/swift/+bug/1224253
19:54:36 <zaitcev> Apparently it started happening in Havana. No clue why or if it really is the case
19:54:44 <portante> notmyname: there has been some discussion about pipelines and adding a mandatory header stripper at the left of the pipeline
19:55:12 <notmyname> portante: ya. alpha_ori has a patch to do nifty things to the pipeline
19:55:28 <portante> yes, he does it only for config dir environments
19:55:29 <notmyname> swifterdarrell skewered it, and alpha_ori will be back in the office late this week or next week
19:55:37 <zaitcev> I see "DB locked" in auditors often but whatever, that may be ok unless it prevents repair.
19:55:39 <portante> so I am looking at how to do that in general
19:55:48 <notmyname> portante: use config dirs! ;-)
19:56:51 <notmyname> acoles: did you have something?
19:56:52 <portante> notmyname: sure!
19:58:30 <notmyname> ok, I don't think acoles is here
19:58:36 <notmyname> thanks everyone for attending
19:58:41 <notmyname> see you in two weeks
19:58:41 <portante> thanks
19:58:43 <notmyname> #endmeeting
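
The Last-Modified round trip portante describes under the "timestamps + modified-since headers" topic can be reproduced with a small sketch. This is a minimal illustration in plain Python, not Swift's actual code: the helper names (http_date, parse_http_date, unmodified_since_naive, unmodified_since_floored) are hypothetical, and the sample value is the X-Timestamp zaitcev quoted from "swift stat". It assumes the server floors the stored timestamp when rendering Last-Modified, as discussed in the meeting.

```python
import math
from email.utils import formatdate, parsedate_to_datetime


def http_date(ts: float) -> str:
    """Render an epoch timestamp as an HTTP-date (whole seconds only)."""
    return formatdate(math.floor(ts), usegmt=True)


def parse_http_date(value: str) -> int:
    """Parse an HTTP-date back into whole epoch seconds."""
    return int(parsedate_to_datetime(value).timestamp())


def unmodified_since_naive(x_timestamp: float, header: str) -> bool:
    # Compares the full-resolution stored time with the whole-second
    # header value, so any sub-second remainder looks like a later change.
    return x_timestamp <= parse_http_date(header)


def unmodified_since_floored(x_timestamp: float, header: str) -> bool:
    # Floors the stored time first, so both sides use the same resolution.
    return math.floor(x_timestamp) <= parse_http_date(header)


# Hypothetical round trip: a client reads Last-Modified from a HEAD
# response and sends it back unchanged as If-Unmodified-Since.
x_timestamp = 1368649471.47379
last_modified = http_date(x_timestamp)

naive_ok = unmodified_since_naive(x_timestamp, last_modified)      # False: would yield a 412
floored_ok = unmodified_since_floored(x_timestamp, last_modified)  # True: precondition holds
```

Flooring both sides is only one of the options raised. Applying a ceiling on both sides, as portante suggested, would make the comparison consistent in the same way. Either choice means updates within the same second are indistinguishable to plain HTTP clients.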