21:00:03 <timburke> #startmeeting swift
21:00:04 <opendevmeet> Meeting started Wed Nov 10 21:00:03 2021 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:04 <opendevmeet> The meeting name has been set to 'swift'
21:00:10 <timburke> who's here for the swift meeting?
21:00:31 <kota> hi
21:02:14 <mattoliver> o/
21:02:29 <timburke> i know acoles won't make it; we'll see if we pick up clayg or zaitcev as we go ;-)
21:02:37 <timburke> as usual, the agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:02:45 <timburke> first up
21:02:52 <timburke> #topic next meeting
21:03:27 <timburke> i had a death in the family, and i'm unlikely to be able to chair next week
21:03:29 <zaitcev> Oh, right. The UTC.
21:03:52 <zaitcev> Sorry for your loss.
21:04:00 <kota> sorry to hear that
21:04:09 <timburke> thanks all
21:04:12 <timburke> given the US holiday the week after, i propose we set the next meeting for the week after
21:04:16 <mattoliver> timburke: sorry to hear that.
21:05:15 <mattoliver> KK sounds good, if something comes up in the meantime, I'm happy to chair if need be.
21:05:33 <mattoliver> but a few weeks off is also ok :)
21:05:35 <timburke> 👍
21:05:42 <timburke> there's no reason i *have* to be there ;-)
21:05:58 <timburke> next up
21:06:08 <timburke> #topic https://bugs.launchpad.net/swift/+bug/1685798
21:06:20 <timburke> i think clayg wanted to bring this up
21:06:47 <timburke> it's an old bug to do with logging tempurl signatures
21:06:53 <timburke> even has a CVE attached
21:07:03 <clayg> so it's legit?
21:07:19 <clayg> I was thinking we could just change the status to resolved or whatever and no one would know
21:07:23 <timburke> i think cschwede even has a patch attached?
21:07:37 <clayg> well now that it's public we can just put that in gerrit then?
21:07:53 <clayg> get it rebased and merged and THEN mark it as resolved or whatever?
21:08:08 <timburke> clayg, idk -- i'd view it as more of a security hardening opportunity *shrug*
21:08:08 <clayg> that seems totally legit 😎
21:08:14 <mattoliver> sounds like a plan (not having looked at the patch)
21:08:42 <timburke> iirc the one complication was that there's presumably a similar issue with s3api presigned requests
21:09:04 <timburke> who wants to take ownership on getting the patch rebased and merged?
21:09:23 <mattoliver> I'll do it.
21:09:36 <mattoliver> unless clayg has a burning desire
21:10:16 <timburke> thanks mattoliver
21:10:17 <opendevreview> Clay Gerrard proposed openstack/swift master: Fix CVE-2017-8761 https://review.opendev.org/c/openstack/swift/+/817476
21:10:32 <timburke> or mattoliver will review it ;-)
21:10:33 <mattoliver> oh even easier :P
21:10:47 <mattoliver> lol yup
21:11:08 <clayg> this was great - can we leave it on the agenda and follow up in two weeks as to why the bug is not closed, or close it
21:11:31 <timburke> 👍
21:11:53 <timburke> #topic storage policies and sharding
21:12:39 <timburke> i just wanted to check in on the current thinking for this issue -- i feel like things have gone in a few different directions
21:13:22 <mattoliver> Good question. I haven't really been following along, but also wanted to get up to speed.
21:13:38 <timburke> i guess it's on clayg then, with acoles not here ;-)
21:13:40 <zaitcev> What's the link between the policies and sharding?
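For context on the tempurl logging bug above (CVE-2017-8761): the risk is that signed query parameters, such as tempurl's `temp_url_sig` and the signature parameters used by s3api presigned requests, end up verbatim in proxy access logs, where anyone who can read the logs can replay the request. A minimal sketch of the redaction idea follows; this is not the patch under review, and the exact parameter names to redact are illustrative.

```python
from urllib.parse import parse_qsl, urlencode

# Illustrative list -- the real fix has to cover every signed
# parameter swift and s3api accept.
SENSITIVE_PARAMS = ('temp_url_sig', 'Signature', 'X-Amz-Signature')

def redact_signatures(query_string):
    """Replace signature values in a query string with a placeholder
    so they never reach the access logs."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    redacted = [(k, '...') if k in SENSITIVE_PARAMS else (k, v)
                for k, v in pairs]
    return urlencode(redacted)
```

The idea is simply that the logging middleware runs the request's query string through something like this before emitting the access-log line, so expiry times and other debugging context survive while the secret does not.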
21:14:07 <timburke> so *that much* i did get loaded into my head
21:14:48 <timburke> the issue is that a sharded container could get deleted then recreated, and the shards won't pick up the right storage policy
21:15:03 <mattoliver> acoles and clayg have been playing with reworking the reconciler cmp_policy_info
21:15:13 <mattoliver> based around the discussions we had at PTG
21:15:31 <timburke> it can lead to some dark data iirc, where pre-sharding the reconciler would eventually move the data to the right policy and it'd show up in listings again
21:15:32 <mattoliver> But getting the policy index from root -> shards is still a thing that needs to happen
21:16:02 <mattoliver> So we can either fix up the reconciler cmp function and then fix shards
21:17:05 <mattoliver> or fix shards and work on the cmp function as separate things.
21:18:44 <mattoliver> I guess if we make decisions on shards as we do a root using cmp_policy_info, then when it itself is fixed we'd still get root -> shards fixed/improved.
21:19:45 <mattoliver> I'll take a look at the current state of the root -> shard policy_index patches today to see what state they're in, report back here in channel, and maybe we can make a decision.
21:20:26 <timburke> sounds good, thanks mattoliver. worst case, we can check in again next meeting
21:20:33 <mattoliver> yup
21:20:36 <timburke> #topic memcache issues
21:21:09 <timburke> i hit an issue in my home cluster that was a little funny to run down, but i kicked up a patch out of it
21:21:17 <timburke> #link https://review.opendev.org/c/openstack/swift/+/817307
21:22:16 <timburke> basically, the memcache pool could be depleted (and never replenished) if a MemcachePoolTimeout gets raised while we're trying to create a new connection
21:22:50 <zaitcev> Oh I see
21:22:54 <mattoliver> oh interesting
21:23:07 <mattoliver> your home cluster is a good source of bugs, nice :)
21:23:47 <timburke> while i was trying to functionally test the fix, i noticed that i wouldn't actually drop to zero open connections to memcache
21:24:07 <timburke> which i suspect was due to *another* issue that kota already had a patch for :-)
21:24:15 <timburke> #link https://review.opendev.org/c/openstack/swift/+/338819
21:24:52 <kota> Oh, i think it's a much older one
21:25:08 <mattoliver> nice one kota
21:25:09 <timburke> only 5 years or so ;-)
21:25:12 <mattoliver> 5 years ago
21:25:35 <timburke> sorry i hadn't reviewed it :-)
21:26:10 <timburke> i changed a few things, so it might be good to have someone else take a look too, but LGTM
21:27:39 <timburke> i mostly just wanted to raise awareness of the issue -- fwiw, it'd manifest (for me) as the proxy returning 401 for all requests, since it couldn't contact memcache at all
21:28:36 <timburke> if you've got keystone or some other system where your tokens are stored in more than just memcache, it would likely manifest as dramatically increased load on your auth system
21:28:53 <timburke> #topic request tracing
21:29:09 <timburke> mattoliver, i think i saw some more movement on these patches, is that right?
21:29:39 <mattoliver> Yup, I've cleaned it up a bunch. So it's much smaller than it was.
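To illustrate the memcache pool bug timburke describes above: if a pool slot is claimed but an exception interrupts creating the connection for it, and nothing returns the slot, the pool shrinks by one forever -- repeat enough times and it drains to zero. A toy sketch of the bug class and the fix (not swift's actual `MemcacheConnPool` code; the names here are invented):

```python
import queue

class ConnPool:
    """Fixed-size pool: each item in the queue is a free slot,
    holding either an open connection or None (no conn made yet)."""

    def __init__(self, size, connect):
        self._slots = queue.Queue()
        for _ in range(size):
            self._slots.put(None)       # slot free, no connection yet
        self._connect = connect

    def get(self, timeout=1):
        conn = self._slots.get(timeout=timeout)   # claim a slot
        if conn is None:
            try:
                conn = self._connect()
            except Exception:
                # The crucial fix: give the slot back on failure.
                # Without this put(), every failed connect attempt
                # permanently shrinks the pool.
                self._slots.put(None)
                raise
        return conn

    def put(self, conn):
        self._slots.put(conn)           # release the slot
```

The same shape applies whether the interruption is a connect error or a timeout raised into the greenthread mid-creation: the slot release has to sit on the exception path, not just the success path.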
21:30:47 <mattoliver> And I've now implemented a new span per proxy request to storage nodes (I think I got them all), and if there is a timeout we log the exception to that span.
21:31:00 <mattoliver> in essence, we can see when it fails to communicate with a node, and which node that was
21:31:29 <mattoliver> Previously it'd only track responses, so failing to talk to nodes wouldn't show up.
21:31:41 <timburke> very cool! is there anything we need to discuss for it, or does it mainly just need review at this point?
21:32:07 <mattoliver> yeah, if you want to review it, great, but it's still in a bit of flux.
21:32:34 <mattoliver> trying to get it working as best it can, because I need to write a talk on it for Linux.conf.au
21:33:54 <timburke> sounds good
21:34:01 <timburke> all right, that's all i've got
21:34:05 <timburke> #topic open discussion
21:34:13 <timburke> anything else we ought to bring up today?
21:34:15 <mattoliver> Here's a link to a random presentation I quickly wrote up, because a Linux Australia council member asked me to submit the tweet I sent about the PTG as a talk: https://lca2022.linux.org.au/schedule/presentation/32/
21:34:48 <mattoliver> and it was approved, so it's motivating me to get tracing working well :P
21:35:11 <timburke> \o/
21:38:16 <timburke> all right, let's call it
21:38:28 <timburke> thank you all for coming, and thank you for working on swift!
21:38:32 <timburke> #endmeeting
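As an aside on the request tracing discussion above: the key point mattoliver makes is that opening a span per proxy-to-storage-node request, and recording any exception on that span, makes *failed* node communication visible in traces, not just completed responses. A minimal hand-rolled sketch of that idea -- this is not the API of the patch under review, just an illustration of the shape:

```python
import time

class Span:
    """Toy tracing span: one per backend request, recording the target
    node, the duration, and any exception raised while talking to it."""

    def __init__(self, operation, node):
        self.operation = operation   # e.g. 'GET /a/c/o'
        self.node = node             # which storage node we tried
        self.events = []

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.duration = time.monotonic() - self.start
        if exc is not None:
            # a timeout or connect failure lands on the span, so the
            # trace shows the node we failed to reach, not silence
            self.events.append(('error', repr(exc)))
        return False  # never swallow the exception
```

Wrapping each backend request in such a context manager is what turns "this node never answered" from an absence in the trace into an explicit error event on a span that names the node.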