21:00:24 <timburke> #startmeeting swift
21:00:25 <openstack> Meeting started Wed Oct 14 21:00:24 2020 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:26 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:28 <openstack> The meeting name has been set to 'swift'
21:00:31 <timburke> who's here for the swift meeting?
21:00:51 <kota_> hello
21:00:56 <seongsoocho> o/
21:01:07 <rledisez> hi o/
21:01:50 <zaitcev> 07
21:02:27 <timburke> as usual, the agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:02:43 <timburke> first up, though, something i forgot to put on there ;-)
21:02:49 <timburke> #topic releases
21:02:57 <timburke> victoria was released today!
21:03:09 <timburke> thanks everyone for making this another great cycle
21:03:34 <kota_> congrats!
21:03:37 <seongsoocho> yay ~~~ congrats!!! 🎉
21:03:39 <timburke> we also have new stable releases for ussuri (2.25.1) and train (2.23.2)
21:05:35 <mattoliverau> o/ (sorry I'm late)
21:05:36 <timburke> they include a lot of py3 bug fixes, including the recent crypto patch where you'll need to upgrade while still on py2 before transitioning to py3
21:05:57 <timburke> (as much as anything, just making sure people are aware of them)
21:06:56 <timburke> and i'll probably try to get a swiftclient release out in the near future, as i think it's blocking my current attempt at getting py3 probe tests in the gate
21:07:11 <timburke> #topic PTG
21:07:28 <timburke> just a week and a half away!
21:08:00 <mattoliverau> \o/
21:08:01 <timburke> i added a rough design for ALOs to the etherpad
21:08:03 <timburke> #link https://etherpad.opendev.org/p/swift-ptg-wallaby
21:08:13 <mattoliverau> Oh nice
21:08:15 <kota_> actually, Summit will start on the next Monday.
21:08:48 <timburke> also true!
21:09:00 <timburke> i should see what sessions look interesting...
21:09:16 <timburke> if anyone has recommendations, i'd love to hear them (now or in -swift)
21:10:15 <timburke> the ptg schedule is up at https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-assets-prod/Uploads/PTG2-Oct26-30-2020-Schedule-1.pdf and all the meeting times on the etherpad are accurate
21:10:17 * kota_ did NOT catch up on the session schedule yet :P
21:11:54 <timburke> summit schedule's also available, at https://www.openstack.org/summit/2020/summit-schedule/
21:13:17 <timburke> i can't wait to see you all again and do some hacking! :-)
21:13:33 <timburke> speaking of... let's talk patches!
21:13:46 <timburke> #topic audit watchers
21:13:51 <timburke> #link https://review.opendev.org/#/c/706653/
21:13:52 <patchbot> patch 706653 - swift - Let developers/operators add watchers to object au... - 37 patch sets
21:14:30 <timburke> zaitcev, i saw you added a +2 -- does that mean you and david are ready for the rest of us to do some reviews?
21:14:37 <zaitcev> Yes
21:14:58 <timburke> cool
21:15:00 <timburke> !
21:15:22 <zaitcev> dsariel (not in the meeting) was already looking for another project, this may bet float up.
21:15:54 <timburke> i'll be sure to make some time for a look, probably rebase https://review.opendev.org/#/c/744078/1
21:15:55 <patchbot> patch 744078 - swift - watchers: Add EC stat gatherer - 1 patch set
21:16:08 <zaitcev> Oh, nice. I forgot you had that.
21:16:39 <timburke> and see what other fun ideas for watchers i can come up with :-)
21:17:35 <timburke> anything else you need there besides reviews? anything we should know digging into it?
21:17:47 <zaitcev> But I just want a claim post in the ground -- even if the EC plugin ends up requiring changes to the API. I resigned myself to APIs that need to be changed. I think this is what Sam meant when he wrote that all arguments are passed by name. Note that it requires implementations to always add **kwargs, but beyond that, flexibility is built in.
21:18:46 <timburke> yeah, and we could surely have a few releases where we label the feature as being experimental and subject to change if we really need
21:18:53 <zaitcev> I'm glad that I talked you all into the reduced isolation, because separate processes produced nasty problems with not keeping up and logging.
21:19:23 <zaitcev> If you can accept the in-process model, I'm okay with anything. Even "except Exception" which I always -1 elsewhere.
21:20:19 <zaitcev> That is all.
21:20:43 <timburke> #topic replication locking
21:20:46 <zaitcev> If you look at it and like it, we don't really need to waste PTG time on it. Feel free to overstrike after you review.
21:20:58 <timburke> #link https://review.opendev.org/#/c/754242/
21:20:58 <patchbot> patch 754242 - swift - Fix a race condition in case of cross-replication - 5 patch sets
21:21:05 <zaitcev> Tsk
21:21:13 <zaitcev> I promised to review it but didn't.
21:21:27 <timburke> rledisez, sorry, i still haven't gotten to rebuilding an env that would be amenable to trying to repro :-(
21:22:06 <timburke> i don't want to drop it from the agenda, though, because i hope to shame myself into actually doing what i said i'd do ;-)
21:22:12 <rledisez> So patchset 5 passed the production test, I was unable to lose datafiles using this patch. I added a comment on timeout at the end. I'm in favor of hardcoding a small value, but you may have different opinions
21:22:52 <timburke> that seems perfectly reasonable to me. if we can't get the lock quickly, skip it and try again later; makes sense
21:23:44 <rledisez> ok. so I'll update the review with a hardcoded timeout of 0.2; It's working fine in our deployments
21:24:48 <timburke> anything else you need there? would it be useful for us to think some more about how to handle locking for rsync?
21:25:01 <rledisez> I still have to work on the rsync fix, I have to take some time to think about it. I'm afraid it would incur a major change in SSYNC (splitting "negotiation" and "data transfer"). But it may be for the best actually
21:25:52 <rledisez> I still think the best option is to wrap rsync with a ssync call
21:27:35 <rledisez> that's all for me on that topic
21:27:38 <zaitcev> Interesting.
21:28:40 <timburke> #topic async slo cleanup
21:28:44 <timburke> #link https://review.opendev.org/#/c/733026/
21:28:44 <patchbot> patch 733026 - swift - Add a new URL parameter to allow for async cleanup... - 12 patch sets
21:29:07 <timburke> so i think clayg's getting skittish about turning this on by default
21:30:38 <timburke> i might end up having it be opt-in, at least for an initial release. i know the main cluster i care about doesn't make heavy use of the expirer yet, and this would presumably change that
21:31:53 <timburke> meanwhile, i don't have a great handle on how best to monitor my expirers -- i know clayg has a little script that can check how much is in the queue and how much of it is ready for reaping, though
21:32:14 <mattoliverau> I guess opt in initially is probably a safe way to go forward.
21:32:40 <timburke> i'd also had an idea a while back (based on rledisez's https://review.opendev.org/#/c/715580/) to add a "lag" metric for the expirer
21:32:41 <patchbot> patch 715580 - swift - obj-updater: add metric on lag of containers listing - 1 patch set
21:32:52 <clayg> yeah SRE just deployed the expirer monitor on like one random node - so we have stats now (as long as that node is up 🙄 )
21:33:10 <mattoliverau> Should we get expirers to emit something, somehow? Have to think about it.
21:34:09 <timburke> we've got some stats already, at least; iirc, mostly just success/failure counts
21:34:35 <zaitcev> I looked at that patch and it seemed okay, but then Clay came in and he had some fundamental comments
21:35:30 <timburke> (fwiw, the expirer patch is https://review.opendev.org/#/c/735271/)
21:35:30 <patchbot> patch 735271 - swift - metrics: Add lag metric to expirer - 1 patch set
21:35:53 <mattoliverau> Can you pull general task queue account stats to get a rough idea on size? With an account HEAD. I should play around with general task queue some more.
21:36:33 <mattoliverau> Or did we shard the accounts too.
21:36:51 * mattoliverau is just thinking out loud, and isn't at his computer
21:37:47 <timburke> yeah, pretty sure that's the idea with https://gist.github.com/clayg/7f66eab2a61c77869e1e84ac4ed6f1df
21:38:55 <timburke> oh, but with https://review.opendev.org/#/c/517389/ that might get more complicated
21:38:56 <patchbot> patch 517389 - swift - Add object-expirer new mode to execute tasks from ... - 46 patch sets
21:41:17 <timburke> anyway, mainly just wanted to call attention to that default change -- i should have a fresh patchset up soon
21:41:24 <timburke> #topic open discussion
21:41:32 <timburke> what else should we bring up today?
21:42:41 <zaitcev> I recommended dsariel to look into sharding, help Matt along.
21:42:51 <zaitcev> Not sure if it's going to work.
21:43:11 <mattoliverau> Thanks!
21:45:46 <timburke> cool! speaking of, i should get my probe test at https://review.opendev.org/#/c/744256/ to a point that it passes consistently :-/
21:45:46 <patchbot> patch 744256 - swift - sharding: probe test to exercise manual shrinking - 3 patch sets
21:46:53 <mattoliverau> I've been writing some tests for my poc rangescanner that will hopefully fsch root shard ranges
21:47:25 <mattoliverau> *fsck/scan and attempt to fix
21:47:54 <timburke> i'm really coming around on the idea that the way to "solve" autosharding is to get an automated recovery from overlapping shard ranges
21:49:00 <mattoliverau> Yeah, I want to do leader election properly, but in any case being able to fix is more important initially
21:49:37 <mattoliverau> The scanner currently deals with overlaps and can rebuild fragmented paths.
21:50:24 <mattoliverau> It's still a WIP but feel free to have a look. I also have a new doc/braindump
21:50:40 <mattoliverau> Which I hope to go over at the ptg.
21:51:54 <timburke> oh! there's a patch i keep meaning to bring up during meetings but never quite get around to: https://review.opendev.org/#/c/751966/
21:51:55 <patchbot> patch 751966 - swift - replace md5 with swift utils version - 11 patch sets
21:52:14 <zaitcev> wait, what
21:52:28 <timburke> someone's interested in running swift with FIPS mode enabled, which means we'd need to annotate all uses of md5
21:53:14 <timburke> ...which understandably means that there'd be a decent number of conflicts if/when we merge it
21:53:34 <zaitcev> Russian anecdote: One guy said "What do you know about security through obscurity? On my last job, I replaced MD5 with SHA256 only trimmed to fit."
21:54:08 <timburke> heh
21:54:12 <mattoliverau> Lol
21:55:30 <timburke> (honestly, it makes me think a bit too much of https://tools.ietf.org/html/rfc3514, but w/e...)
21:56:11 <zaitcev> 1 April 2003 - nice tray
21:57:37 <timburke> does anyone have an interest in trying to review this? (i'll probably get to it eventually, but it's very much a "when i get around to it" sort of endeavor)
21:58:07 <zaitcev> I could, I suppose. Just after Romain's race condition thing.
21:58:20 <timburke> i like that prioritization :-)
22:00:07 <timburke> all right, we're about at time. thank you all for coming, and thank you for working on swift!
22:00:13 <timburke> #endmeeting