21:00:15 <timburke> #startmeeting swift 21:00:15 <openstack> Meeting started Wed Nov 18 21:00:15 2020 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. 21:00:16 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 21:00:18 <openstack> The meeting name has been set to 'swift' 21:00:26 <timburke> who's here for the swift meeting? 21:00:28 <zaitcev> o/ 21:00:31 <alecuyer> o/ 21:00:41 <rledisez> hi o/ 21:00:54 <acoles> o/ 21:01:26 <mattoliverau> o/ 21:02:09 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift 21:02:24 <timburke> #topic no meeting next week 21:03:11 <timburke> thanksgiving is next week in the US and i know i'm going to be out all week. i expect things will probably be pretty quiet anyway 21:03:21 <mattoliverau> Kk 21:03:28 <clayg> this is where the party is?! 21:03:37 <timburke> so unless someone else would like to chair it, i propose we skip next week's meeting 21:04:05 <clayg> yeah, party somewhere else next week 21:04:13 <mattoliverau> Skipping is fine for me 21:05:45 <timburke> clayg, iirc you'll be around much of the start of the week; would you mind dropping a reminder in -swift on the end of your Tuesday (and maybe again early Wednesday)? 21:06:07 <mattoliverau> I'll do it if you like 21:06:20 <timburke> that works too! thanks mattoliverau 21:06:48 <timburke> (mainly i proposed clayg do it because i know his timezone offset better ;-) 21:07:00 <timburke> #topic RBAC community effort 21:07:20 <timburke> zaitcev, you added this; what should we know/discuss? 21:07:24 <clayg> i set a reminder! (but I mean.. I had a reminder go off fo this meeting this afternoon too) 21:07:37 <mattoliverau> Lol 21:08:20 <zaitcev> timburke: mainly ask if you know about it along the TC/PTL communications and what your thoughts are, and let you know that I am going to do something about it, most likely add a reader role. 21:09:09 <zaitcev> timburke: there's a group called "Pop-Up Something Or Other", officially formed at OpenStack (or Open Infra now?). 21:10:44 <zaitcev> I started looking at keystoneauth.py and found that our unit tests are not water-tight. I was able to add obvious bugs and tests still passed. 21:10:48 <timburke> i've not had much communication about it, but if i do i'll be sure to direct them your way. i love the idea of a dedicated reader role 21:11:17 <zaitcev> What, you do? I thought it was some meaningless make-do invented by Keystone people. 21:11:55 <zaitcev> Okay, tell me one thing. If a token with a reader role tries to GET an object with _no_ ACLs, is it permitted or not? 21:13:59 <timburke> now that i've got an ops hat i occasionally put on, i'd love to have something like a read-only reseller admin role. i don't think *this* is *that*, but it seems like a start 21:14:21 <zaitcev> ok. I thought so. 21:14:23 <zaitcev> Thanks. 21:14:41 <zaitcev> We can move on to the next item, as far as I'm concerned. 21:16:09 <timburke> as to the specific question, i'm inclined to say yeah -- it seems similar to the account acls tempauth allows (if less fine-grained) 21:16:31 <zaitcev> got it 21:16:34 <timburke> #topic async deletes of SLO segments 21:17:15 <timburke> so we merged this recently; i just wanted to give an update now that i've actually seen it in use 21:17:32 <timburke> tl;dr: it looks to be working great! 21:18:10 <zaitcev> Good 21:18:24 <timburke> max delete request time went down from something like 25mins to generally around 10s 21:18:34 <mattoliverau> nice 21:19:00 <mattoliverau> are things getting deleted in a timely manner async wise? hows the general task queue size? 21:19:05 <timburke> the expirer queue would fill and drain in pretty nice waves 21:19:11 <zaitcev> I'm amazed. Often bumping things to be async does nothing because you still need to use the same amount of resources over long term. 21:19:54 <timburke> heh. *client* delete request timing ;-) 21:20:48 <zaitcev> Also, I thought you were looking for a better compatibility with S3. Although maybe I confuse it with some other segment-deletion patch. 21:21:08 <clayg> and it seems like the expirer's ratelimiting tasks_per_second give us all the knob we need - we can stuff tombstones fast enough they go to async_pending, or slow enough to barely keep up with the next wave of DELETEs 21:21:18 <timburke> so the queue depth definitely got larger than i was expecting, but the expirers could generally keep up. there was certainly some fairly linear growth during the 48hrs or so worth of deletes, but once the client requests stopped, it cleared pretty quickly 21:21:48 <zaitcev> Aww 21:22:12 <zaitcev> To me it does not sound good. There's still a chance for clients to kill clusters. Well, it can be disable on public clouds. 21:22:38 <zaitcev> Anyway, seems like a net positive. 21:22:47 <timburke> zaitcev, it's tied to s3api, but somewhat loosely. certainly, i don't expect most s3 clients to have a 25min timeout for deletes, where they likely would at least have a 30s timeout 21:23:03 <clayg> zaitcev: I dont' think that generally s3 clients expect MPU deletes to be pretty fast even with a bunch a bunch of segments in the original MPU - pretty sure AWS is totally async in this regard - and we're close with the +segments container, but not exactly the same in that we still count your segment bytes until we're all cleared out 21:23:04 <zaitcev> I see. 21:23:41 <zaitcev> I'm a little curious what we do if clusters can't keep up. But let's kick this can down the road. 21:24:16 <clayg> zaitcev: you can disable the option and it shows up in /info - that would put backpressure to clients again same as before 21:24:17 <timburke> turn on ratelimit and only let clients send so many deletes at a time ;-) 21:25:40 <timburke> that's all i really had there, just wanted to share ('cause it was pretty cool to see it work so well!) 21:26:07 <timburke> #topic audit watchers 21:26:10 <mattoliverau> thanks for sharing! 21:26:10 <clayg> timburke's next crazy idea is to queue s3api async mpu deletes on overwrite! 21:26:13 <timburke> #link https://review.opendev.org/#/c/706653/ 21:26:14 <patchbot> patch 706653 - swift - Let developers/operators add watchers to object au... - 40 patch sets 21:26:44 <zaitcev> There was no change. I think watchers are ready to go, in the sense that I see no show-stopers. We may want to fine-tune it. 21:26:45 <mattoliverau> I'll be reviewing this patch today. I've been a little distracted as of late 21:27:02 <zaitcev> I've not yet shipped them, but I ran some tests. 21:27:13 <timburke> i know i meant to loop around to doing another review, too. hopefully we'll have it merged by the next meeting :-) 21:27:43 <zaitcev> mattoliverau: many thanks and sharding is of course more important for Swift overall, but this is just so close. I want this monkey off my back. 21:27:58 <timburke> zaitcev, how'd the tests go? anything interesting to fall out of them? 21:28:42 <zaitcev> timburke: I found that my cluster would be absolutely, 100% clean of dark data, if not for garbage that I created when testing broken PUT+POST. 21:28:43 <mattoliverau> totally understand, this is higher priority in my opinion, still a bunch of sharding to do, this is so close, we should land it :) And you've done awesome work here 21:29:12 <timburke> 🎉 21:29:37 <timburke> all right, mattoliverau and i will plan to review it over the next couple weeks 21:29:54 <zaitcev> But I think maybe we can find good uses for the watchers that I am not foreseeing, like checking if SLOs are missing segments or whatnot. 21:30:37 <timburke> that is *definitely* one of the use-cases i want to try out. and even going the other way, at least for s3api segments 21:30:59 <timburke> #topic ssync tracebacks 21:30:59 <zaitcev> oh, right. segments what miss manifests 21:32:25 <timburke> so i noticed today that something like 75% of the tracebacks i'm seeing in prod are ssync receiver bombing out trying to read a chunked request body 21:32:54 <timburke> acoles pushed up https://review.opendev.org/763205 and it's working its way through the gate now 21:32:54 <patchbot> patch 763205 - swift - ssync: don't log tracebacks for client disconnects - 2 patch sets 21:33:21 <zaitcev> Interesting reviews for ssync, I think I can take a look. It's a targeted SsyncClientDisconnected, not "except Exception" :-) 21:33:42 <timburke> i've also (more rarely) seen issues where ssync gets and unexpected blank, and proposed https://review.opendev.org/744270 21:33:42 <patchbot> patch 744270 - swift - ssync: Tolerate more hang-ups - 4 patch sets 21:34:47 <timburke> as much as anything, i just wanted to point the frequent ssync errors and mention that we're working to get the noise down 21:35:08 <timburke> rledisez, alecuyer i'm sure you guys have noticed this problem, too :-) 21:35:39 <rledisez> timburke: right, ssync is by far the biggest logger when it comes to error 21:36:00 <rledisez> that's good you're working on that, thx :) 21:36:04 <alecuyer> yep we see these often 21:36:18 <acoles> reading the comments in ssync code, I inferred that the traceback logging was originally intended to illuminate code/protocol errors, but the disconnects were getting caught and logged in the same way 21:36:43 <clayg> ☝️ 21:37:39 <timburke> #topic s3api, +segments container, and ACLs 21:39:46 <timburke> so i've had s3api users that set up acls on a container, verify that other users can upload ok, but then get very confused when the acl-allowed user can't do multipart uploads 21:40:18 <timburke> i proposed https://review.opendev.org/763106 as a start, but i've still got a couple concerns 21:40:18 <patchbot> patch 763106 - swift - s3api: Clone ACLs when creating +segments container - 1 patch set 21:41:43 <timburke> first, i think the acl-allowed user will still run into trouble if the +segments container doesn't already exist (since they won't have permission to create new containers) 21:41:49 <clayg> oh yeah, i had that checked out - it looked solid 21:42:29 <mattoliverau> would you also need to propergate ACL changes when ACLs are changed on the parent container/bucket? 21:43:03 <timburke> i might be able to work around that by doing a auth-check first then creating the container as a pre-auth'ed request 21:43:41 <timburke> and then, yeah -- keeping the ACLs in sync is a definite problem. i'm still not sure how to approach that one 21:44:45 <timburke> (fwiw, the idea for ALOs would be to have all access to the segment container pre-authed and based on the acls set for the main container) 21:44:55 <clayg> rather that worry about keeping ACLs in sync - can we just work on ... see timburke gets it 21:45:56 <mattoliverau> yeah, another win for ALO 21:46:32 <clayg> timburke: whatever the workflow that triggered this - for writing they must have had perms to create +segments - because the problem was "my ACLs were working then some s3client transparently MPU and on some downloads no worky" 21:47:10 <clayg> i think they have like admin workers doing ingest and then they just want them published for the ML jobs 21:47:11 <timburke> so i guess my questions are, should a user who's allowed to write objects into the container be able to create the segments container? and if i do the auth-check/pre-auth'ed request to make that possible, would that be "good enough" to merge as a band-aid until i can write ALOs? 21:48:05 <clayg> I think the band-aid you have is good enough until we have ALO - I think i'd actually be *more* worried about hacking in a pre-authed request - piggybacking on "don't set the wrong spid" with "don't set the wrong acls" seems totally reasonable 21:48:19 <timburke> clayg, maybe that was it -- for some reason, i could've sworn that i'd seen someone talking about how they were allowed to upload normal objects but not MPUs 21:48:28 <clayg> if there was any authz concerns with the existing code you're not making it worse 21:48:38 <timburke> true 21:49:22 <clayg> ah, you might be right - we've probably had "the wrong ACLs" break both ways - but John said he's happy with "well set it correctly initially" 21:49:24 <mattoliverau> my initial feeling is, well that's what pre-auth would be for, so long as we're sure the user was authed (via ACLs) first. And they can't say delete the container or any objects there. 21:50:17 <mattoliverau> but this is morning Matt who is definitely under caffinated :P 21:50:20 <clayg> honestly I'm amazed the +segments auto-vivify works as well as it does w/o hand-holding; and I guess maybe there's more hand holding going on than I realize (SRE is always dealing with tickets and slack messages to fix ACLs) 21:51:04 <timburke> all right, maybe i'll just leave the patch as-is then, and write a script to go compare acls between containers across the whole cluster (since we don't have *that* many containers) 21:51:04 <clayg> mattoliverau: you might be right - if we say "you can upload objects" you can probably create the +segments container to create the objects 🤔 21:51:54 <clayg> I just don't like to think to hard when it comes to s3api and swift ACLs - I feel like users are mixing metaphors because they can make it work, and not really thinking about security 21:52:19 <clayg> we should work on less swift ACLs and more s3api compatible security policies! 21:52:28 <clayg> zaitcev: amirite!? 21:53:29 <timburke> my biggest worry is how it smells like a privilege escalation 21:54:46 <timburke> all right, that's all i've got 21:54:53 <timburke> #topic open discussion 21:55:04 <timburke> anything else we ought to bring up this week? 21:57:08 <timburke> if anybody feels like reviewing some client code, https://review.opendev.org/#/c/758500/ seems like a nice usability improvement 21:57:09 <patchbot> patch 758500 - python-swiftclient - Allow tempurl times to have units - 1 patch set 21:57:11 <timburke> i don't really feel like memorizing how may seconds are i a day ;-) 21:57:42 <acoles> never enough I feel ;) 21:59:37 <timburke> all right 21:59:54 <timburke> thank you all for coming, and thank you for working on swift! 22:00:03 <timburke> #endmeeting