21:00:01 #startmeeting swift
21:00:02 Meeting started Wed Jul 11 21:00:01 2018 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:05 The meeting name has been set to 'swift'
21:00:08 who's here for the swift team meeting?
21:00:09 o/
21:00:12 o/
21:00:17 hi o/
21:00:22 o/
21:01:07 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:01:14 there's the agenda. a few things to cover
21:01:21 where's clayg? is he here?
21:01:29 o/
21:01:50 welcome, everyone
21:02:08 #topic container sharding
21:02:26 I wanted to share with [!swiftstack] some of the info on sharding. more just "here's how it's going in prod"
21:02:57 put simply, so far it's going pretty well. it works, it can be scaled to do more at once, and people are pretty happy
21:03:10 that being said, we just recently came across a couple of issues we're looking into
21:03:38 first, object PUTs can get slow because there's a GET to the root container. timburke and I have talked a little about what may be done about that
21:04:06 second, when concurrency was ramped up some object GETs started returning 404s. not sure what's up with that yet
21:04:16 clayg: timburke: is all that correct-ish enough?
21:04:33 no... *container* GETs in the object *PUT* path
21:04:39 Talked more than just “cache shard ranges somehow”?
21:04:40 (was my understanding)
21:04:49 timburke: thanks :-)
21:05:00 clayg: very slightly more, yes ;-)
21:05:12 I know what’s happening with the 404 on PUT. I saw that in the lab. It’s kind of the same problem. Ish.
21:05:19 we knew that container request would slow things down. we didn't appreciate that the container would get error limited, leading to the 404s
21:05:21 either we cache it in the proxy (somehow) or we potentially defer it to async later
21:05:37 timburke: ah, that makes sense
21:05:40 Ah. Async.
21:06:06 Yeah we know why there is 404. We could/should probably try to do better there.
21:06:21 may be an actual async pending. or could be that the object updates the root and then gets redirected to a child. or that the root queues it for async update to the child. so many different ideas
21:06:39 The async idea seems like it gets us into trouble later. Dunno.
21:06:56 yeah, it's not great. but neither is an unbounded cache in the proxy
21:07:13 Interesting, do we have bugs or an etherpad to track these?
21:07:40 I don't think so
21:07:42 I’m still a little mad the container dbs can’t do a few 100 connections per second. 🤷‍♂️
21:08:18 Yeah so action item is bugs.
21:08:29 yeah. I'll take that
21:08:45 I’ll work on that while @timburke is out.
21:08:49 one thought i'd had was to see that the root is sharding, direct all updates to a random selection of the root's first 10*replica_count *handoffs* (skipping the primaries entirely), and let them cleave to the shards
21:08:58 but i have *no idea* how well that'll work
21:09:03 I'd like to recreate them so I can see where the bottleneck is :)
21:09:50 Oh interesting, giving the work to handoffs
21:09:57 Just get sharded AF then PUT with a lot of concurrency.
21:10:06 timburke: I liked that one for the "it's just crazy enough it may actually work" factor ;-)
21:10:43 mattoliverau: importantly, some >1 multiple of the replica count of the handoffs
21:11:11 mattoliverau: if I haven't made upstream bugs by your lunchtime today, then yell at me
21:11:13 I think we might could find something clever in the container server code that would get us enough shard range lookups per second.
21:11:21 but i'd worry about what will happen with db size as we're streaming records through the misplaced object table...
21:12:05 Deferring the work isn’t great. That was the problem we were solving!
21:12:20 I’d much rather back pressure the client.
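(A rough sketch of the handoff idea floated at 21:08:49 and 21:10:43, for anyone trying to picture it. Everything named here is an assumption for illustration: `pick_update_targets`, the `multiple=10` default, and the choice to send to `replica_count` targets are hypothetical, not Swift code. In a real deployment the handoff iterator would presumably come from the container ring, e.g. something like `Ring.get_more_nodes`.)

```python
import itertools
import random


def pick_update_targets(handoffs, replica_count, multiple=10, rng=random):
    """Pick update targets from the first multiple*replica_count handoffs.

    handoffs: any iterable of nodes in handoff order (hypothetical input;
    primaries are assumed to be excluded already).
    """
    # Only look at a bounded prefix of the (possibly very long) handoff list.
    window = list(itertools.islice(handoffs, multiple * replica_count))
    # Assumption: one update per replica; fewer if the window is short.
    return rng.sample(window, min(replica_count, len(window)))


if __name__ == "__main__":
    fake_handoffs = ("handoff-%d" % i for i in range(1000))
    print(pick_update_targets(fake_handoffs, replica_count=3))
```

This only chooses where updates would go. It does nothing about cleaving to the shards, and it leaves the db-size worry raised at 21:11:21 untouched.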
21:13:04 anyway, I wanted to share with mattoliverau and rledisez and kota_ what was going on there
21:13:17 mostly good. a few issues to work through. nothing that's a showstopper yet
21:13:49 I'll make some upstream bugs so others can try to see it too
21:13:50 notmyname: thanks for sharing, I need to catch up more on what's going on with the container sharding...
21:13:53 We should be able to make it fast enough. With some sort of caching (maybe closer to the db) - more like the pending files. Or... something. There’s not that many shard ranges to cache. We have an SSD right there.
21:13:54 Thanks I appreciate it, let's put our heads together and figure something out. I'll have a go recreating it
21:14:06 clayg: +1
21:14:09 #topic patches
21:14:18 repro according to the bug reports will help me to understand.
21:14:36 https://review.openstack.org/#/c/447129 is open, and acoles had asked for an additional +2. his is on it already
21:14:37 patch 447129 - swift - Configure diskfile per storage policy
21:15:14 Oh, that’s on my list to review!
21:15:21 Hi @rledisez
21:15:22 oh good! :-)
21:15:46 I'd looked at it at a high level, but clayg would do a lot better at a real review than me :-)
21:15:56 I mean, it’s been there since last week? 🤷‍♂️
21:16:14 a whole week?! surely we don't let patches get that old, do we? ;-)
21:16:26 😂
21:16:29 speaking of... https://review.openstack.org/#/c/427911/
21:16:30 patch 427911 - swift - Replace MIME with PUT+POST for EC and Encryption
21:16:52 timburke: do you remember where acoles left this with zaitcev?
21:17:13 not off hand...
21:17:16 ok
21:17:31 the other patch I wanted to mention in swift is https://review.openstack.org/#/c/337960/
21:17:31 patch 337960 - swift - Include SLO ETag in container updates
21:17:45 timburke: you updated this at clayg's suggestion
21:17:45 Zaitcev was gunna flatten some stuff. Then we needed to merge it. Al said “needs tests”
21:17:53 clayg: got it. thanks
21:18:12 @notmyname: we talked about that. It’s all brilliant. Tim is a saint.
21:18:44 more keys in JSON dicts! what could go wrong, right?
21:18:57 nothing, as long as you don't end with a trailing ,
21:18:58 kota_: you might wanna sign off? The follow on is s3api. Also awesome afaik?
21:19:20 yeah, I'd love to have kota_'s opinion on it, based on his expertise with s3
21:19:45 it sounds cool, sorry i didn't see the new version (more keys) yet.
21:20:03 no worries
21:20:24 clayg: weren't we just saying how great timburke is? I've got another reason to pile on
21:21:00 so this week timburke updated https://wiki.openstack.org/wiki/Swift/PriorityReviews with a bunch of stuff that would be good to see land in swiftclient. (thanks timburke!)
21:21:09 I'll try to look at it as much as possible today but not sure I'll finish, then i'll be on summer vacation for a week.
21:21:18 it's a long list, but the top ones are all that really matter right now (because they're the only ones ready to land)
21:21:26 I say that all the time. It’s part of my morning routine. “Try to be more like Tim, but don’t cry too much when you inevitably fail”
21:21:41 +1 :)
21:21:43 and https://review.openstack.org/#/c/498069/ and https://review.openstack.org/#/c/538349/ both already have one +2. so should we go ahead and click +A on them
21:21:44 patch 498069 - python-swiftclient - Stop mutating header dicts
21:21:45 patch 538349 - python-swiftclient - Treat 404 as success when deleting segments
21:21:55 clayg: i think you've got my routine, but backwards :-/
21:23:00 so unless someone objects by the end of my workday (ie in about 3-4 hours), then i'll merge these two patches based on existing reviews
21:23:34 https://review.openstack.org/#/c/581374/ and https://review.openstack.org/#/c/581374/ are the other two that could land soon. they add good new functionality
21:23:35 patch 581374 - python-swiftclient - Add ability to generate a temporary URL with an IP...
21:23:36 patch 581374 - python-swiftclient - Add ability to generate a temporary URL with an IP...
21:23:49 Those patches already have my opinion on them 😃
21:23:59 importantly, we need swiftclient master to be ready for a release tag by the end of next week (july 20)
21:24:24 towards the end of next week i'll do the authors/changelog patch and we'll tag a release
21:24:41 #topic open discussion
21:24:45 anything else to bring up this week?
21:24:53 i'm looking at that tempurl one now -- looks sane, i might have a follow-up to catch some errors earlier
21:24:57 another patch worth pointing out: https://review.openstack.org/#/c/578075/ -- i know we said one +2 is enough, but since i had a decent hand in writing the patch, i figure someone else ought to point out where i was stupid
21:24:58 patch 578075 - swift - Add keymaster to fetch root secret from KMIP service
21:26:05 mattoliverau: bash completion thing looks cool!?
21:26:13 clayg: i know, right
21:26:26 clayg: +1
21:26:45 Thanks :)
21:27:04 I'm lazy so want things to auto complete
21:27:11 I don’t want to sign up for the kmip thing.
21:27:20 timburke: is your multi-key thing a dependency of p 578075?
21:27:20 https://review.openstack.org/#/c/578075/ - swift - Add keymaster to fetch root secret from KMIP service
21:27:25 If we're happy with how that works, might slowly add it to other tools
21:27:29 https://review.openstack.org/#/c/577874/ is cool, too -- will probably want multiple opinions eventually
21:27:29 patch 577874 - swift - Add support for multiple root encryption secrets
21:27:35 right
21:27:50 notmyname: the two patches are independent, but once one lands, the other's gonna grow a chain
21:27:56 got it
21:28:02 rledisez: weren’t you saying you wanted to do more reviews?
21:28:04 Is there a simple way to have a kmip service, if so I'll take a look
21:28:22 mattoliverau: yeah, the pykmip project has a test endpoint afaik
21:28:36 acoles is the best. https://gist.github.com/alistairncoles/88044939197875d227556a5c676ad6ac
21:28:48 clayg: yeah, but I say so many things… :/
21:28:50 linked in the comments on gerrit, too
21:29:23 clayg: i'm pretty sure we *all* say we want to do more reviews :P
21:29:24 the story with these two kmip ones is that (1) we've got people who want to do key rotation (ie start using a new key later for new data, and all the keys still work) and (2) we've got people who want to do that while talking directly to a kmip server instead of going through barbican
21:29:54 Oh sweet, go acoles
21:30:16 Lol @ timburke
21:30:21 what else should we talk about this week?
21:30:25 rledisez: anything to report on LOSF?
21:30:29 (or anything else?)
21:30:52 timburke: something we should do for proxyfs?
21:31:12 i saw an issue you reported around delimiter...
21:31:35 kota_: add support for delimiter queries? yeah, that'd probably be nice... it's been a known issue for a while... i just gotta find cycles to fix it
21:31:41 notmyname: not really on LOSF. alecuyer was digging into a bug in SSYNC all day. might be related to https://bugs.launchpad.net/swift/+bug/1652323. I expect a patch soon for that
21:31:41 Launchpad bug 1652323 in OpenStack Object Storage (swift) "ssync syncs an expired object as a tombstone" [High,Fix released]
21:32:32 timburke: ok, thx.
21:32:58 kota_: as much as anything, i wrote that up to get our known issues more visible -- "Issues (2)" doesn't *begin* to describe the limitations we know about :P
21:33:11 i'll try to write up more of them in the near-ish future
21:33:21 great
21:34:15 Grand.
21:34:42 kota_: rledisez: mattoliverau: I want to talk to you each during the next week about the denver ptg. I'll try to find you each on irc (at your respective reasonable times)
21:34:54 anything else to bring up from anyone?
21:35:48 Nope, breakfast time?
21:35:48 let's put a fork in it and call it done
21:36:04 (yes, I'm using american idioms intentionally to confuse people) ;-)
21:36:16 thanks for coming today. and thank you for your work on swift
21:36:20 #endmeeting