21:00:03 <timburke> #startmeeting swift
21:00:04 <opendevmeet> Meeting started Wed Nov 10 21:00:03 2021 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:04 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:04 <opendevmeet> The meeting name has been set to 'swift'
21:00:10 <timburke> who's here for the swift meeting?
21:00:31 <kota> hi
21:02:14 <mattoliver> o/
21:02:29 <timburke> i know acoles won't make it; we'll see if we pick up clayg or zaitcev as we go ;-)
21:02:37 <timburke> as usual, the agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:02:45 <timburke> first up
21:02:52 <timburke> #topic next meeting
21:03:27 <timburke> i had a death in the family, and i'm unlikely to be able to chair next week
21:03:29 <zaitcev> Oh, right. The UTC.
21:03:52 <zaitcev> Sorry for your loss.
21:04:00 <kota> sorry to hear that
21:04:09 <timburke> thanks all
21:04:12 <timburke> given the US holiday the week after, i propose we set the next meeting for the week after
21:04:16 <mattoliver> timburke: sorry to hear that.
21:05:15 <mattoliver> KK sounds good, if something comes up in the meantime, I'm happy to chair if need be.
21:05:33 <mattoliver> but a few weeks off is also ok :)
21:05:35 <timburke> 👍
21:05:42 <timburke> there's no reason i *have* to be there ;-)
21:05:58 <timburke> next up
21:06:08 <timburke> #topic https://bugs.launchpad.net/swift/+bug/1685798
21:06:20 <timburke> i think clayg wanted to bring this up
21:06:47 <timburke> it's an old bug to do with logging tempurl signatures
21:06:53 <timburke> even has a CVE attached
21:07:03 <clayg> so it's legit?
21:07:19 <clayg> I was thinking we could just change the status to resolved or whatever and no one would know
21:07:23 <timburke> i think cschwede even has a patch attached?
21:07:37 <clayg> well now that it's public we can just put that in gerrit then?
21:07:53 <clayg> get it rebased and merged and THEN mark it as resolved or whatever?
21:08:08 <timburke> clayg, idk -- i'd view it as more of a security hardening opportunity *shrug*
21:08:08 <clayg> that seems totally legit 😎
21:08:14 <mattoliver> sounds like a plan (not having looked at the patch)
21:08:42 <timburke> iirc the one complication was that there's presumably a similar issue with s3api presigned requests
21:09:04 <timburke> who wants to take ownership on getting the patch rebased and merged?
21:09:23 <mattoliver> I'll do it.
21:09:36 <mattoliver> unless clayg has a burning desire
21:10:16 <timburke> thanks mattoliver
21:10:17 <opendevreview> Clay Gerrard proposed openstack/swift master: Fix CVE-2017-8761 https://review.opendev.org/c/openstack/swift/+/817476
21:10:32 <timburke> or mattoliver will review it ;-)
21:10:33 <mattoliver> oh even easier :P
21:10:47 <mattoliver> lol yup
21:11:08 <clayg> this was great - can we leave it on the agenda and follow up in two weeks as to why the bug is not closed, or close it
21:11:31 <timburke> 👍
21:11:53 <timburke> #topic storage policies and sharding
21:12:39 <timburke> i just wanted to check in on the current thinking for this issue -- i feel like things have gone in a few different directions
21:13:22 <mattoliver> Good question. I haven't really been following along, but also wanted to get up to speed.
21:13:38 <timburke> i guess it's on clayg then, with acoles not here ;-)
21:13:40 <zaitcev> What's the link between the policies and sharding?
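For context on the tempurl logging bug above (CVE-2017-8761): the risk is that signed query parameters, such as tempurl's `temp_url_sig` and the signature parameters used by s3api presigned requests, end up verbatim in proxy access logs, where anyone who can read the logs can replay the request. A minimal sketch of the redaction idea follows; this is not the patch under review, and the exact parameter names to redact are illustrative.

```python
from urllib.parse import parse_qsl, urlencode

# Illustrative list -- the real fix has to cover every signed
# parameter swift and s3api accept.
SENSITIVE_PARAMS = ('temp_url_sig', 'Signature', 'X-Amz-Signature')

def redact_signatures(query_string):
    """Replace signature values in a query string with a placeholder
    so they never reach the access logs."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    redacted = [(k, '...') if k in SENSITIVE_PARAMS else (k, v)
                for k, v in pairs]
    return urlencode(redacted)
```

The idea is simply that the logging middleware runs the request's query string through something like this before emitting the access-log line, so expiry times and other debugging context survive while the secret does not.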
21:14:07 <timburke> so *that much* i did get loaded into my head
21:14:48 <timburke> the issue is that a sharded container could get deleted then recreated, and the shards won't pick up the right storage policy
21:15:03 <mattoliver> acoles and clayg have been playing with reworking the reconciler cmp_policy_info
21:15:13 <mattoliver> based around the discussions we had at PTG
21:15:31 <timburke> it can lead to some dark data iirc, where pre-sharding the reconciler would eventually move the data to the right policy and it'd show up in listings again
21:15:32 <mattoliver> But getting the policy index from root -> shards is still a thing that needs to happen
21:16:02 <mattoliver> So we can either fix up the reconciler cmp function and then fix shards
21:17:05 <mattoliver> or fix shards and work on the cmp function as separate things.
21:18:44 <mattoliver> I guess if we make decisions on shards as we do a root using cmp_policy_info, then when it itself is fixed we'd still get root -> shards fixed/improved.
21:19:45 <mattoliver> I'll take a look at the current state of the root -> shard policy_index patches today to see what state they're in, report back here in channel, and maybe we can make a decision.
21:20:26 <timburke> sounds good, thanks mattoliver. worst case, we can check in again next meeting
21:20:33 <mattoliver> yup
21:20:36 <timburke> #topic memcache issues
21:21:09 <timburke> i hit an issue in my home cluster that was a little funny to run down, but i kicked up a patch out of it
21:21:17 <timburke> #link https://review.opendev.org/c/openstack/swift/+/817307
21:22:16 <timburke> basically, the memcache pool could be depleted (and never replenished) if a MemcachePoolTimeout gets raised while we're trying to create a new connection
21:22:50 <zaitcev> Oh I see
21:22:54 <mattoliver> oh interesting
21:23:07 <mattoliver> your home cluster is a good source of bugs, nice :)
21:23:47 <timburke> while i was trying to functionally test the fix, i noticed that i wouldn't actually drop to zero open connections to memcache
21:24:07 <timburke> which i suspect was due to *another* issue that kota already had a patch for :-)
21:24:15 <timburke> #link https://review.opendev.org/c/openstack/swift/+/338819
21:24:52 <kota> Oh, i think it's a much older one
21:25:08 <mattoliver> nice one kota
21:25:09 <timburke> only 5 years or so ;-)
21:25:12 <mattoliver> 5 years ago
21:25:35 <timburke> sorry i hadn't reviewed it :-)
21:26:10 <timburke> i changed a few things, so it might be good to have someone else take a look too, but LGTM
21:27:39 <timburke> i mostly just wanted to raise awareness of the issue -- fwiw, it'd manifest (for me) as the proxy returning 401 for all requests, since it couldn't contact memcache at all
21:28:36 <timburke> if you've got keystone or some other system where your tokens are stored in more than just memcache, it would likely manifest as dramatically increased load on your auth system
21:28:53 <timburke> #topic request tracing
21:29:09 <timburke> mattoliver, i think i saw some more movement on these patches, is that right?
21:29:39 <mattoliver> Yup, I've cleaned it up a bunch. So it's much smaller than it was.
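To illustrate the memcache pool bug timburke describes above: if a pool slot is claimed but an exception interrupts creating the connection for it, and nothing returns the slot, the pool shrinks by one forever -- repeat enough times and it drains to zero. A toy sketch of the bug class and the fix (not swift's actual `MemcacheConnPool` code; the names here are invented):

```python
import queue

class ConnPool:
    """Fixed-size pool: each item in the queue is a free slot,
    holding either an open connection or None (no conn made yet)."""

    def __init__(self, size, connect):
        self._slots = queue.Queue()
        for _ in range(size):
            self._slots.put(None)       # slot free, no connection yet
        self._connect = connect

    def get(self, timeout=1):
        conn = self._slots.get(timeout=timeout)   # claim a slot
        if conn is None:
            try:
                conn = self._connect()
            except Exception:
                # The crucial fix: give the slot back on failure.
                # Without this put(), every failed connect attempt
                # permanently shrinks the pool.
                self._slots.put(None)
                raise
        return conn

    def put(self, conn):
        self._slots.put(conn)           # release the slot
```

The same shape applies whether the interruption is a connect error or a timeout raised into the greenthread mid-creation: the slot release has to sit on the exception path, not just the success path.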
21:30:47 <mattoliver> And I've now implemented a new span per proxy request to storage nodes (I think I got them all), and if there is a timeout we log the exception to that span.
21:31:00 <mattoliver> in essence, we can see when it fails to communicate with a node, and which node that was
21:31:29 <mattoliver> Previously it'd only track responses, so failing to talk to nodes wouldn't show up.
21:31:41 <timburke> very cool! is there anything we need to discuss for it, or does it mainly just need review at this point?
21:32:07 <mattoliver> yeah, if you want to review it, great, but it's still in a bit of flux.
21:32:34 <mattoliver> trying to get it working as best it can, because I need to write a talk on it for Linux.conf.au
21:33:54 <timburke> sounds good
21:34:01 <timburke> all right, that's all i've got
21:34:05 <timburke> #topic open discussion
21:34:13 <timburke> anything else we ought to bring up today?
21:34:15 <mattoliver> Here's a link to a random presentation I quickly wrote up, because a Linux Australia council member asked me to submit the tweet I sent about the PTG as a talk: https://lca2022.linux.org.au/schedule/presentation/32/
21:34:48 <mattoliver> and it was approved, so it's motivating me to get tracing working well :P
21:35:11 <timburke> \o/
21:38:16 <timburke> all right, let's call it
21:38:28 <timburke> thank you all for coming, and thank you for working on swift!
21:38:32 <timburke> #endmeeting
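As an aside on the request tracing discussion above: the key point mattoliver makes is that opening a span per proxy-to-storage-node request, and recording any exception on that span, makes *failed* node communication visible in traces, not just completed responses. A minimal hand-rolled sketch of that idea -- this is not the API of the patch under review, just an illustration of the shape:

```python
import time

class Span:
    """Toy tracing span: one per backend request, recording the target
    node, the duration, and any exception raised while talking to it."""

    def __init__(self, operation, node):
        self.operation = operation   # e.g. 'GET /a/c/o'
        self.node = node             # which storage node we tried
        self.events = []

    def __enter__(self):
        self.start = time.monotonic()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.duration = time.monotonic() - self.start
        if exc is not None:
            # a timeout or connect failure lands on the span, so the
            # trace shows the node we failed to reach, not silence
            self.events.append(('error', repr(exc)))
        return False  # never swallow the exception
```

Wrapping each backend request in such a context manager is what turns "this node never answered" from an absence in the trace into an explicit error event on a span that names the node.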