21:00:01 <notmyname> #startmeeting swift
21:00:02 <openstack> Meeting started Wed Jul 11 21:00:01 2018 UTC and is due to finish in 60 minutes.  The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> The meeting name has been set to 'swift'
21:00:08 <notmyname> who's here for the swift team meeting?
21:00:09 <timburke> o/
21:00:12 <mattoliverau> o/
21:00:17 <rledisez> hi o/
21:00:22 <kota_> o/
21:01:07 <notmyname> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:01:14 <notmyname> there's the agenda. a few things to cover
21:01:21 <notmyname> where's clayg? is he here?
21:01:29 <clayg> o/
21:01:50 <notmyname> welcome, everyone
21:02:08 <notmyname> #topic container sharding
21:02:26 <notmyname> I wanted to share with [!swiftstack] some of the info on sharding. more just "here's how it's going in prod"
21:02:57 <notmyname> put simply, so far it's going pretty well. it works, it can be scaled to do more at once, and people are pretty happy
21:03:10 <notmyname> that being said, we just recently came across a couple of issues we're looking into
21:03:38 <notmyname> first, object PUTs can get slow because there's a GET to the root container. timburke and I have talked a little about what may be done about that
21:04:06 <notmyname> second, when concurrency was ramped up some object GETs started returning 404s. not sure what's up with that yet
21:04:16 <notmyname> clayg: timburke: is all that correct-ish enough?
21:04:33 <timburke> no... *container* GETs in the object *PUT* path
21:04:39 <clayg> Talked more than just “cache shard ranges somehow”?
21:04:40 <timburke> (was my understanding)
21:04:49 <notmyname> timburke:  thanks :-)
21:05:00 <notmyname> clayg: very slightly more, yes ;-)
21:05:12 <clayg> I know what’s happening with the 404 on PUT, I saw that in the lab. It’s kind of the same problem. Ish.
21:05:19 <timburke> we knew that container request would slow things down. we didn't appreciate that the container would get error limited, leading to the 404s
21:05:21 <notmyname> either we cache it in the proxy (somehow) or we potentially defer it to async later
21:05:37 <notmyname> timburke: ah, that makes sense
21:05:40 <clayg> Ah. Async.
21:06:06 <clayg> Yeah we know why there is 404. We could/should probably try to do better there.
21:06:21 <notmyname> may be an actual async pending. or could be that the object updates the root and then gets redirected to a child. or that the root queues it for async update to the child. so many different ideas
21:06:39 <clayg> The async idea seems like it gets us into trouble later. Dunno.
21:06:56 <notmyname> yeah, it's not great. but neither is an unbounded cache in the proxy
21:07:13 <mattoliverau> Interesting, do we have bugs or an etherpad to track these
21:07:40 <notmyname> I don't think so
21:07:42 <clayg> I’m still a little mad the container dbs can’t do a few 100 connections per second. 🤷‍♂️
21:08:18 <clayg> Yeah so action item is bugs.
21:08:29 <notmyname> yeah. I'll take that
21:08:45 <clayg> I’ll work on that while @timburke is out.
21:08:49 <timburke> one thought i'd had was to see that the root is sharding, direct all updates to a random selection of the root's first 10*replica_count *handoffs* (skipping the primaries entirely), and let them cleave to the shards
21:08:58 <timburke> but i have *no idea* how well that'll work
21:09:03 <mattoliverau> I'd like to recreate them so I can see where the bottleneck is :)
21:09:50 <mattoliverau> Oh interesting, giving the work to handoffs
21:09:57 <clayg> Just get sharded AF then PUT with a lot of concurrency.
21:10:06 <notmyname> timburke: I liked that one for the "it's just crazy enough it may actually work" factor ;-)
21:10:43 <timburke> mattoliverau: importantly, some >1 multiple of the replica count of the handoffs
21:11:11 <notmyname> mattoliverau: if I haven't made upstream bugs by your lunchtime today, then yell at me
21:11:13 <clayg> I think we might could find something clever in the container server code that would get us enough shard range lookups per second.
21:11:21 <timburke> but i'd worry about what will happen with db size as we're streaming records through the misplaced object table...
21:12:05 <clayg> Deferring the work isn’t great. That was the problem we were solving!
21:12:20 <clayg> I’d much rather back pressure the client.
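[editor's sketch] timburke's handoff idea at 21:08:49 could look roughly like the following. This is only an illustration under stated assumptions, not code from any patch: `pick_handoff_targets` is a hypothetical helper, `handoff_iter` stands in for whatever yields a partition's handoff nodes in ring order, and choosing `replica_count` targets is an assumption (the log only says "a random selection").

```python
import itertools
import random


def pick_handoff_targets(handoff_iter, replica_count, multiple=10, rng=random):
    """Choose update targets from the leading handoffs, skipping primaries.

    handoff_iter: iterable of handoff nodes in ring order (assumed input).
    multiple: the ">1 multiple of the replica count" mentioned in the log.
    Returns replica_count nodes (an assumption; the log gives no number).
    """
    # Only the first multiple * replica_count handoffs are candidates.
    window = list(itertools.islice(handoff_iter, multiple * replica_count))
    if len(window) <= replica_count:
        return window
    # Sample without replacement so no node is picked twice.
    return rng.sample(window, replica_count)


if __name__ == "__main__":
    fake_handoffs = ({"id": i} for i in range(100))
    print(pick_handoff_targets(fake_handoffs, replica_count=3))
```

Spreading updates over a window larger than the replica count is what keeps any single handoff from becoming the new hot spot; it does nothing about the db-size concern raised at 21:11:21.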
21:13:04 <notmyname> anyway, I wanted to share with mattoliverau and rledisez and kota_ what was going on there
21:13:17 <notmyname> mostly good. a few issues to work through. nothing that's a showstopper yet
21:13:49 <notmyname> I'll make some upstream bugs so others can try to see it too
21:13:50 <kota_> notmyname: thanks for sharing, I need to catch up more on what's going on with the container sharding...
21:13:53 <clayg> We should be able to make it fast enough. With some sort of caching (maybe closer to the db) - more like the pending files. Or... something. There aren’t that many shard ranges to cache. We have an SSD right there.
21:13:54 <mattoliverau> Thanks I appreciate it, let's put our heads together and figure something out. I'll have a go recreating it
21:14:06 <notmyname> clayg: +1
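[editor's sketch] The caching discussion above turns on keeping the cache bounded (21:06:56) and on there being few shard ranges to hold (21:13:53). A minimal sketch of a size- and age-bounded cache follows. The class name, the `max_entries` and `ttl` parameters, and keying by container path are all hypothetical; the log does not say where such a cache would live or how it would be invalidated.

```python
import time
from collections import OrderedDict


class BoundedShardRangeCache:
    """LRU cache with a per-entry TTL (illustrative only)."""

    def __init__(self, max_entries=1000, ttl=60.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl
        self.clock = clock
        # container path -> (expiry time, shard ranges), oldest use first
        self._entries = OrderedDict()

    def get(self, container):
        item = self._entries.get(container)
        if item is None:
            return None
        expires_at, shard_ranges = item
        if self.clock() >= expires_at:
            # Stale: drop it so the caller refetches from the root container.
            del self._entries[container]
            return None
        self._entries.move_to_end(container)  # mark as recently used
        return shard_ranges

    def put(self, container, shard_ranges):
        self._entries[container] = (self.clock() + self.ttl, shard_ranges)
        self._entries.move_to_end(container)
        # Evict least recently used entries once over the size bound.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
```

A cache miss (`None`) would fall back to the container GET that is slow today, so the TTL trades staleness of shard ranges against load on the root.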
21:14:09 <notmyname> #topic patches
21:14:18 <kota_> repro according to the bug reports will help me to understand.
21:14:36 <notmyname> https://review.openstack.org/#/c/447129 is open, and acoles had asked for an additional +2. his is on it already
21:14:37 <patchbot> patch 447129 - swift - Configure diskfile per storage policy
21:15:14 <clayg> Oh, that’s on my list to review!
21:15:21 <clayg> Hi @rledisez
21:15:22 <notmyname> oh good! :-)
21:15:46 <notmyname> I'd looked at it at a high level, but clayg would do a lot better at a real review than me :-)
21:15:56 <clayg> I mean, it’s been there since last week?  🤷‍♂️
21:16:14 <notmyname> a whole week?! surely we don't let patches get that old, do we? ;-)
21:16:26 <clayg> 😂
21:16:29 <notmyname> speaking of... https://review.openstack.org/#/c/427911/
21:16:30 <patchbot> patch 427911 - swift - Replace MIME with PUT+POST for EC and Encryption
21:16:52 <notmyname> timburke: do you remember where acoles left this with zaitcev?
21:17:13 <timburke> not off hand...
21:17:16 <notmyname> ok
21:17:31 <notmyname> the other patch I wanted to mention in swift is https://review.openstack.org/#/c/337960/
21:17:31 <patchbot> patch 337960 - swift - Include SLO ETag in container updates
21:17:45 <notmyname> timburke: you updated this at clayg's suggestion
21:17:45 <clayg> Zaitcev was gunna flatten some stuff. Then we needed to merge it. Al said “needs tests”
21:17:53 <notmyname> clayg: got it. thanks
21:18:12 <clayg> @notmyname: we talked about that. It’s all brilliant. Tim is a saint.
21:18:44 <timburke> more keys in JSON dicts! what could go wrong, right?
21:18:57 <notmyname> nothing, as long as you don't end with a trailing ,
21:18:58 <clayg> kota_: you might wanna sign off?  The follow on is s3api. Also awesome afaik?
21:19:20 <notmyname> yeah, I'd love to have kota_'s opinion on it, based on his expertise with s3
21:19:45 <kota_> it sounds cool, sorry i didn't see the new version (more keys) yet.
21:20:03 <notmyname> no worries
21:20:24 <notmyname> clayg: weren't we just saying how great timburke is? I've got another reason to pile on
21:21:00 <notmyname> so this week timburke updated https://wiki.openstack.org/wiki/Swift/PriorityReviews with a bunch of stuff that would be good to see land in swiftclient. (thanks timburke!)
21:21:09 <kota_> I'll try to look at it today if possible, but I'm not sure I can finish it; after that I'll be on summer vacation for a week.
21:21:18 <notmyname> it's a long list, but the top ones are all that really matter right now (because they're the only ones ready to land)
21:21:26 <clayg> I say that all the time. It’s part of my morning routine. “Try to be more like Tim, but don’t cry too much when you inevitably fail”
21:21:41 <mattoliverau> +1 :)
21:21:43 <notmyname> and https://review.openstack.org/#/c/498069/ and https://review.openstack.org/#/c/538349/ both already have one +2. so should we go ahead and click +A on them
21:21:44 <patchbot> patch 498069 - python-swiftclient - Stop mutating header dicts
21:21:45 <patchbot> patch 538349 - python-swiftclient - Treat 404 as success when deleting segments
21:21:55 <timburke> clayg: i think you've got my routine, but backwards :-/
21:23:00 <notmyname> so unless someone objects by the end of my workday (ie in about 3-4 hours), then i'll merge these two patches based on existing reviews
21:23:34 <notmyname> https://review.openstack.org/#/c/581374/ and https://review.openstack.org/#/c/581374/ are the other two that could land soon. they add good new functionality
21:23:35 <patchbot> patch 581374 - python-swiftclient - Add ability to generate a temporary URL with an IP...
21:23:36 <patchbot> patch 581374 - python-swiftclient - Add ability to generate a temporary URL with an IP...
21:23:49 <clayg> Those patches already have my opinion on them 😃
21:23:59 <notmyname> importantly, we need swiftclient master to be ready for a release tag by the end of next week (july 20)
21:24:24 <notmyname> towards the end of next week i'll do the authors/changelog patch and we'll tag a release
21:24:41 <notmyname> #topic open discussion
21:24:45 <notmyname> anything else to bring up this week?
21:24:53 <timburke> i'm looking at that tempurl one now -- looks sane, i might have a follow-up to catch some errors earlier
21:24:57 <timburke> another patch worth pointing out: https://review.openstack.org/#/c/578075/ -- i know we said one +2 is enough, but since i had a decent hand in writing the patch, i figure someone else ought to point out where i was stupid
21:24:58 <patchbot> patch 578075 - swift - Add keymaster to fetch root secret from KMIP service
21:26:05 <clayg> mattoliverau: bash completion thing looks cool!?
21:26:13 <notmyname> clayg: i know, right
21:26:26 <kota_> clayg: +1
21:26:45 <mattoliverau> Thanks :)
21:27:04 <mattoliverau> I'm lazy so want things to auto complete
21:27:11 <clayg> I don’t want to sign up for kmips thing.
21:27:20 <notmyname> timburke: is your multi-key thing a dependency of p 578075?
21:27:20 <patchbot> https://review.openstack.org/#/c/578075/ - swift - Add keymaster to fetch root secret from KMIP service
21:27:25 <mattoliverau> If we're happy with how that works, might slowly add it to other tools
21:27:29 <timburke> https://review.openstack.org/#/c/577874/ is cool, too -- will probably want multiple opinions eventually
21:27:29 <patchbot> patch 577874 - swift - Add support for multiple root encryption secrets
21:27:35 <notmyname> right
21:27:50 <timburke> notmyname: the two patches are independent, but once one lands, the other's gonna grow a chain
21:27:56 <notmyname> got it
21:28:02 <clayg> rledisez: weren’t you saying you wanted to do more reviews?
21:28:04 <mattoliverau> Is there a simple way to have a kmip service, if so I'll take a look
21:28:22 <notmyname> mattoliverau: yeah, the pykmip project has a test endpoint afaik
21:28:36 <timburke> acoles is the best. https://gist.github.com/alistairncoles/88044939197875d227556a5c676ad6ac
21:28:48 <rledisez> clayg: yeah, but I say so many things… :/
21:28:50 <timburke> linked in the comments on gerrit, too
21:29:23 <timburke> clayg: i'm pretty sure we *all* say we want to do more reviews :P
21:29:24 <notmyname> the story with these two kmip ones is that (1) we've got people who want to do key rotation (ie start using a new key later for new data, and all the keys still work) and (2) we've got people who want to do that while talking directly to a kmip server instead of going through barbican
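[editor's sketch] The key-rotation requirement in (1) above, new data uses a new key while every older key keeps working, can be pictured as below. This is not the design of patch 577874; `RootSecretSet`, its method names, and the idea of recording a secret id alongside each object are assumptions made only for illustration.

```python
class RootSecretSet:
    """Several root secrets, one of which is active for new writes."""

    def __init__(self, secrets, active_id):
        if active_id not in secrets:
            raise ValueError("active_id must name one of the secrets")
        self._secrets = dict(secrets)  # secret id -> secret bytes
        self.active_id = active_id

    def secret_for_new_data(self):
        # New writes always use the active secret; its id would be stored
        # with the object so the right secret can be found later.
        return self.active_id, self._secrets[self.active_id]

    def secret_for_existing_data(self, secret_id):
        # Reads look up whichever secret the object was written with,
        # so older secrets remain usable after rotation.
        return self._secrets[secret_id]


if __name__ == "__main__":
    keys = RootSecretSet({"2017": b"old-secret", "2018": b"new-secret"},
                         active_id="2018")
    print(keys.secret_for_new_data()[0])
```

Rotation then amounts to adding a secret and changing `active_id`; nothing already written has to be re-encrypted.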
21:29:54 <mattoliverau> Oh sweet, go acoles
21:30:16 <clayg> Lol @ timburke
21:30:21 <notmyname> what else should we talk about this week?
21:30:25 <notmyname> rledisez: anything to report on LOSF?
21:30:29 <notmyname> (or anything else?)
21:30:52 <kota_> timburke: something we should do for proxyfs?
21:31:12 <kota_> i saw an issue you reported around delimiter...
21:31:35 <timburke> kota_: add support for delimiter queries? yeah, that'd probably be nice... it's been a known issue for a while... i just gotta find cycles to fix it
21:31:41 <rledisez> notmyname: not really on LOSF. alecuyer was digging a bug in SSYNC all day. might be related to https://bugs.launchpad.net/swift/+bug/1652323. I expect a patch soon for that
21:31:41 <openstack> Launchpad bug 1652323 in OpenStack Object Storage (swift) "ssync syncs an expired object as a tombstone" [High,Fix released]
21:32:32 <kota_> timburke: ok, thx.
21:32:58 <timburke> kota_: as much as anything, i wrote that up to get our known issues more visible -- "Issues (2)" doesn't *begin* to describe the limitations we know about :P
21:33:11 <timburke> i'll try to write up more of them in the near-ish future
21:33:21 <kota_> great
21:34:15 <clayg> Grand.
21:34:42 <notmyname> kota_: rledisez: mattoliverau: I want to talk to you each during the next week about the denver ptg. I'll try to find you each on irc (at your respective reasonable times)
21:34:54 <notmyname> anything else to bring up from anyone?
21:35:48 <mattoliverau> Nope, breakfast time?
21:35:48 <notmyname> let's put a fork in it and call it done
21:36:04 <notmyname> (yes, I'm using american idioms intentionally to confuse people) ;-)
21:36:16 <notmyname> thanks for coming today. and thank you for your work on swift
21:36:20 <notmyname> #endmeeting