21:00:13 <timburke> #startmeeting swift 21:00:14 <openstack> Meeting started Wed Apr 29 21:00:13 2020 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. 21:00:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. 21:00:17 <openstack> The meeting name has been set to 'swift' 21:00:21 <timburke> who's here for the swift meeting? 21:00:24 <seongsoocho> o/ 21:00:28 <kota_> hi 21:00:32 <rledisez> hi o/ 21:00:49 <tdasilva> o/ 21:00:57 <clayg> o/ 21:01:24 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift 21:01:36 <timburke> first up 21:01:42 <timburke> #topic swift-get-nodes 21:02:39 <clayg> so i haven't really looked at this - i don't see how it ever worked with paths with spaces? Do you just like wrap it all in "quotes or something" 21:02:39 <timburke> so i realized earlier this week -- we never made it so you could find %00versions containers with swift-get-nodes! 21:02:54 <alecuyer> o/ sorry im late 21:03:18 <mattoliverau> o/ 21:03:24 <timburke> clayg, yeah, with spaces you'd do something like `swift-get-nodes ring.gz 'acct/cont/obj with spaces'` 21:03:53 <clayg> oic, but that strategy won't work with the null character 21:04:02 <timburke> but you can't really do that with null bytes -- pretty sure things break down because of how args get sent to C programs 21:04:15 <timburke> which is a *bit much* to try to hack around ;-) 21:05:01 <timburke> #link https://launchpad.net/bugs/1875734 21:05:01 <openstack> Launchpad bug 1875734 in OpenStack Object Storage (swift) "swift-get-nodes cannot be used with %00versions containers" [High,In progress] 21:05:04 <clayg> yeah, ok so un-url-encoded sort of kind of worked fro some names on accident 21:05:24 <timburke> yup 21:06:05 <timburke> so i wrote a patch to have it accept percent-encoded paths, like acct/cont/obj%20with%20spaces 21:06:08 <timburke> #link https://review.opendev.org/#/c/724141/ 21:06:08 <patchbot> patch 724141 - swift - swift-get-nodes: Allow users to specify either quo... - 1 patch set 21:06:15 <clayg> I think it should try and take quoted names by default - with a `--for-some-reason-i-did-not-quote-this` flag for when you want an object named `%beef` and can't be bothered to type `%25beef` 21:07:00 <clayg> I feel like for most of the cases where it was working it will still work - if you "unquote" "object with spaces" it turns into "object with spaces" doesn't it? what's the big deal? 21:07:08 <timburke> out of an abundance of caution, i kept the old behavior and have you opt-in to quoted paths with --quoted (or just -Q) 21:07:36 <clayg> then we don't need to "deprecate" anything - we just get better (and leave an escape hatch if somehow someone was scripting this and figured out a way to make it work) 21:08:00 <clayg> timburke: I'd advocate for the opposite instead, that's why I wanted to get other opinions 21:08:02 <timburke> clayg, yeah, i was mainly worried about the %beef sort of case 21:08:33 <timburke> i'd be perfectly happy to just work the one way :D 21:08:48 <timburke> what's everybody else think? 21:09:18 <clayg> perhaps I'm being overly optimistic about most of the time unquote of unquoted names is the identity function - obviously in SOME cases that's not true 21:09:53 <rledisez> It seems reasonable to move fast as it's not an API and I doubt everybody did some scripting around that tool. Just displaying a proper warning after the default changed for sometime should be enough 21:10:02 <clayg> 100%beef is the canonical example of the case where the output would unexpectedly change 21:10:11 <timburke> fwiw, i only emit the warning when there is a difference 21:10:26 <rledisez> but the warning shoud be obvious. I can already tell I wouldn't see it because I'm never reading the first lines of the output... 21:11:09 <clayg> hahah 21:11:21 <timburke> heh, fair enough -- i probably ought to move it to the bottom we continue down this route 21:11:44 <rledisez> yeah, I think it gets more chance to be see by operators 21:12:57 <mattoliverau> we need like a cli equiv of <blink> tag :P 21:13:06 <clayg> @mattoliverau hahaha 21:13:43 <timburke> k -- i'll respin with no --quoted option (because it'll be the default behavior) and the warning moved to the bottom 21:14:15 <timburke> and just for mattoliverau, i'll probably spend more time than i should digging through ANSI escape sequence docs ;-) 21:14:22 <clayg> @timburke but what will the warning say exactly? If they're doing the right thing and quoting paths that need it .... won't they get a WARNING: I did exactly what you expected 21:14:43 <mattoliverau> lol 21:14:43 <clayg> oh no mattoliverau 21:14:45 <timburke> hrm. good point :-/ 21:15:03 <clayg> i think just put it in the release notes 21:15:59 <timburke> oh, it's for-sure going in the release notes. and i know rledisez would see it that way; do we think that's true of most operators? 21:16:03 <mattoliverau> snowman says "warning .." :P 21:16:38 <rledisez> I hope so… :) 21:17:15 <timburke> also, i can tell that i'm going to hate poking at old swift clusters after this ;-) 21:17:28 <rledisez> after that, it's just about reminding the change. so many years of passing crazy args to the tool, can't be erased in a blink ;) 21:18:13 <clayg> i feel like I mostly get these paths from loglines - there they're probably "double quoted" 🙄 21:19:12 <timburke> the only consistency is inconsistency! 21:19:23 <clayg> eventaully 21:19:26 <kota_> lol 21:20:18 <mattoliverau> lol 21:20:32 <timburke> well... i'll try something. at least the change is publicized! i *do* want something like this available in the near term, though, 'cause it's not great that there are containers and objects getting created that we can't easily find 21:20:48 <timburke> #topic S3 MPU deletes 21:21:57 <timburke> earlier this week, we had an availability issue because of request amplifications coming from deleting S3 MPUs 21:22:29 <mattoliverau> oh, opps 21:22:49 <timburke> there were a bunch of confounding factors (old-style versioning being enabled on the segments container, and everything being in an EC policy) 21:23:39 <timburke> but even setting those aside, you can have a single client DELETE kick off like 1000 subrequests to clean up the segments :-( 21:23:54 <kota_> :( 21:24:33 <timburke> if the client has a short-ish timeout (after all, AWS responds to deletes real quick), they'll likely retry and make the problem worse 21:25:25 <timburke> i don't really have any action items for this -- mostly just highlighting the problem so people know about it 21:25:33 <rledisez> i guess it can happen too with bulk delete? 21:26:53 <timburke> yeah, though bulk delete has the advantage of being able to dribble out bytes to keep the connection alive 21:28:02 <rledisez> right, so it cannot happen by accident (but still it can be intentional). ratelimit middleware could be a protection for that maybe. I don't remember exactly how it ratelimit (sleeping or returning a 4xx response?) 21:28:10 <tdasilva> async_delete ? 21:28:32 <timburke> rledisez, yeah, i was just thinking about ratelimiting... 21:28:34 <clayg> yeah, it's gunna have to be something like that 21:28:56 <timburke> tdasilva, the async delete stuff is definitely on my mind :-) i think long-term, we want a 21:29:13 <timburke> new large object type that behaves much more like MPUs 21:29:37 <timburke> and async-deletes will likely be part of that solution 21:30:15 <timburke> #topic PTG 21:30:27 <timburke> one more reminder about the call for topics 21:30:29 <timburke> #link https://etherpad.openstack.org/p/swift-ptg-victoria 21:31:33 <timburke> i think i'm also supposed to sign up for some timeslots or something -- i really need to make sure i figure out the plan for this "virtual PTG" thing this coming week 21:31:38 <timburke> we're only like a month away! 21:31:47 <clayg> wow that's nuts 21:32:09 <timburke> ok, on to ongoing-work! 21:32:18 <timburke> #topic lots of small files 21:32:46 <timburke> alecuyer, i saw a patch for an updated key format 21:33:02 <alecuyer> Yes, I pushed a WIP patch for that here : https://review.opendev.org/#/c/723609/ 21:33:03 <patchbot> patch 723609 - swift (feature/losf) - New key format for objects in the index-server - 1 patch set 21:33:30 <alecuyer> So, still need to work on it, but I'm happy that I'm removing more code than adding new code. It makes listdir() functions simpler 21:33:37 <timburke> \o/ 21:33:43 <alecuyer> and I guess that's about it unless you have questions :) 21:33:59 <timburke> i'll try to take a look at it soon :) 21:34:35 <timburke> i should also revisit my attempt to fix the losf gate job... never enough time lately 21:35:09 <timburke> #topic swiftclient socket leak 21:35:54 <timburke> so a week or two ago i saw a message on the ML 21:35:56 <timburke> #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014221.html 21:36:13 <timburke> about swiftclient 3.9.0 on py2 leaking sockets 21:36:30 <timburke> #link https://bugs.launchpad.net/python-swiftclient/+bug/1873435 21:36:30 <openstack> Launchpad bug 1873435 in python-swiftclient "Established connection is never closed on Python 2.7" [High,In progress] - Assigned to Tim Burke (1-tim-z) 21:36:36 <clayg> i was looking at that bug today - the patch was pretty invasive 21:37:59 <timburke> well, part of it was that much of p 674320 probably should have been done like that from the get-go 21:38:00 <patchbot> https://review.opendev.org/#/c/674320/ - python-swiftclient - Cleanup session on delete (MERGED) - 2 patch sets 21:38:44 <timburke> clayg, would you prefer that p 721051 was more limited, just adding the `if not six.PY2:`? 21:38:44 <patchbot> https://review.opendev.org/#/c/721051/ - python-swiftclient - Only add __del__ to HTTPConnection shim on py3 - 2 patch sets 21:39:59 <clayg> no, i mean not if that won't fix the bug 21:40:44 <timburke> i think that'd probably be enough? not entirely sure, honestly. but it'd get us back to the prior behavior on py2, at least 21:41:17 <timburke> i could have the explcit closing as a follow-up that we separately debate -- that'd be fine 21:41:53 <zaitcev> Did anyone manage to reproduce the leak outside of that weird test environment? 21:42:14 <zaitcev> I mean, the original leak that Schulz patched with __del__. 21:43:07 <timburke> zaitcev, i remember being able to get that ResourceWarning just fine when i was reviewing it 21:43:12 <zaitcev> I see. 21:44:21 <timburke> might only get emitted on some versions of py3... i forget when they added that 21:44:48 <zaitcev> You know 21:45:17 <zaitcev> I was reading keystoneclient a bit in the past week. And it has client.sess=sess and sess.client=client 21:45:33 <zaitcev> Not always, but sometimes, depending what options are supplied. 21:46:24 <zaitcev> Using __del__ in that situation is pure madness. I only reviewed our own stuff, so I saw no loops and no excuse to -1 on that patch. 21:47:52 <timburke> hmm... i thought the gc was pretty good about cycle detection these days, though? i'd just assumed there was some sort of situation that caused it to not get run *shrug* 21:49:22 <timburke> anyway, something to be aware of -- i think we ought to decide on a fix and backport it to ussuri 21:49:57 <timburke> #topic s3api + versioning 21:50:44 <timburke> i've seen client trying to request specific versions of object on buckets that have never had versioning enabled 21:51:10 <kota_> interecting 21:51:16 <kota_> interesting 21:51:17 <timburke> i *think* it came down to an AWS version id tagging along during a migration? not entirely sure 21:51:48 <timburke> but s3api didn't handle it well -- object_versioning responds with a 400, then s3api responds 500 21:51:57 <timburke> but https://review.opendev.org/#/c/722552/ should fix it! 21:51:58 <patchbot> patch 722552 - swift - s3api: Check whether versioning is enabled more - 3 patch sets 21:52:13 <timburke> #topic versioning + container acls 21:52:55 <timburke> i've also got a customer that turned on versioning for a container, then was sad that container acls basically stopped working 21:53:40 <timburke> existing data was fine, but once they wrote new data (new name or an overwrite), reads that previously would have been successful all 403 21:54:15 <timburke> clayg mentioned it a bit in -swift before the meeting, but i think the approach i settled on in https://review.opendev.org/#/c/724393/ is reasonable? 21:54:15 <patchbot> patch 724393 - swift - versioning: Have versioning symlinks make pre-auth... - 1 patch set 21:55:09 <timburke> basically, since the user needed to be authorized to read the symlink object, we should grant the same authorization to following the link to the versions container 21:55:15 <clayg> timburke: is there a significant difference between the wsgi utils preauth request helper and installing a noop lambda auth callback? 21:55:15 <zaitcev> I can't find it now, but I seem to recall that there's some semantic thing that prevents interpreter to invoke __del__ on looped object. Or, rather, it will even try to free anything if one of the objects in the loop has __del__. But I don't remember and I can't find it. 21:55:56 <timburke> zaitcev, oh, interesting. so the __del__ may not be a good idea on py3, either... 21:55:57 <clayg> zaitcev: that sounds familiar to me! I always treated __del__ as dark magic! 21:56:31 <zaitcev> I think I read about it in Bezeley's book. 21:56:39 <timburke> clayg, i forget -- i'll look into it. i think there might be some other header-scrubbing that goes on, something like that 21:57:01 <zaitcev> Anyway, sorry. We're on ACLs now. 21:57:35 <timburke> sorry, running low on time :-) 21:59:15 <timburke> does that seem reasonable? it's only in the get path, and this still lets us define another way of saying "hey, allow these users to list and read old versions" 21:59:55 <clayg> @timburke what were you saying about "popping" off that stack? 22:00:18 <clayg> @timburke i'm sure it's reasonable but I need to look at it more - it's on my list for tomorrow 22:01:26 <timburke> well, i kinda push some state (the allow-reserve-names header and the always-authed callback), then immediately after the app call i restore the previous state (so if the version was a symlink to some *other* container, the new request still needs to be authorized) 22:02:07 <timburke> i started out just inserting the no-op callback where we used to add the header, but realized that'd be open to abuse 22:02:23 <timburke> anyway, thanks for thinking about it! 22:02:41 <timburke> thank you all for coming, and thank you for working on swift! 22:02:54 <timburke> #endmeeting