21:00:41 <timburke> #startmeeting swift
21:00:41 <opendevmeet> Meeting started Wed Sep  3 21:00:41 2025 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:41 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:41 <opendevmeet> The meeting name has been set to 'swift'
21:00:51 <timburke> who's here for the swift team meeting?
21:01:07 <mattoliver> o/
21:02:08 <timburke> as usual the agenda's at
21:02:11 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:15 <timburke> first up
21:02:20 <timburke> #topic vPTG
21:02:46 <timburke> as a reminder, it's just under two months out now
21:02:53 <timburke> oct 27-31
21:03:07 <timburke> but there's more than just a reminder this time! there's a call to action!
21:03:24 <timburke> for i've got the skeleton of an etherpad up for topics
21:03:33 <timburke> #link https://etherpad.opendev.org/p/swift-ptg-gazpacho
21:03:41 <mattoliver> Oh nice
21:04:00 <timburke> so please, add topics you'd like to discuss!
21:04:18 <timburke> next up
21:04:23 <timburke> #topic releases
21:04:50 <timburke> it's getting to that point in the cycle where we really need to get code shipped :-)
21:05:17 <timburke> liberasurecode 1.7.1 is now released
21:05:33 <timburke> and the stable branch for swiftclient has been cut
21:05:42 <mattoliver> Time to fix up and promote the priority reviews page?
21:05:45 <timburke> but i still need to do a swift release
21:06:12 <timburke> always a good time for that :-) i really should do it more regularly
21:07:25 <timburke> but i think i like where things are at the moment. we could always do more, and if anyone wants to slip something in this week, i can update notes for it, but i think i'd be content once we get the notes i've written so far merged
21:07:32 <timburke> #link https://review.opendev.org/c/openstack/swift/+/956333
21:08:22 <mattoliver> Kk
21:08:30 <timburke> if you've got a sec to spot-check those (make sure i didn't misspell something, or that i explained things well enough), i'd appreciate it
21:08:47 <mattoliver> Will do!
21:08:58 <timburke> thanks jianjian for taking a look last week!
21:09:13 <timburke> now it's got the reno-ified notes, too, though
21:09:28 <jianjian> no problem, I also have some nits, good time to add since Matt is going to take a look as well :-)
21:09:46 <timburke> #link https://1fa741547dce2186b901-c02cba5f61e61aa034234ed930eebdcc.ssl.cf1.rackcdn.com/openstack/d2a082b2322e47a791f7d98062e50b34/docs/current.html
21:10:57 <timburke> but there's even more! i'd also like to get a pyeclib release out, so it can offer the new backend in liberasurecode 1.7.x
21:11:00 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/958706
21:11:42 <mattoliver> Oh yeah, getting the new backend out would be awesome too
21:11:45 <jianjian> nice!
21:12:24 <timburke> all right, next up
21:12:31 <timburke> #topic eventlet removal
21:13:50 <timburke> a bunch of us (clayg, jianjian, and i) met with cschwede last week to ask about where things stand and how we could help out
21:15:02 <timburke> sounds like things are going fairly well, as POCs go! account server is looking pretty good
21:15:33 <timburke> one of the things to come out of that was to spin up a feature branch where we can all hack on it
21:15:50 <timburke> so we now have a feature/threaded branch!
21:16:15 <timburke> and cschwede has pushed up a chain to get us started
21:16:24 <timburke> #link https://review.opendev.org/q/project:openstack/swift+branch:feature/threaded+is:open
21:17:20 <mattoliver> you know we're getting serious when we now have 2 feature branches running!
21:17:33 <timburke> we're going to have a feature flag for it (currently implemented as an env var), that way we don't have to dive into PUT+POST+POST right away
21:18:08 <mattoliver> oh nice idea
21:18:36 <jianjian> seems like a clean addition, great start from cschwede
21:18:37 <mattoliver> is there like a one pager , overview or doc on the main idea?
21:18:44 <timburke> and i think i'll try to get some of the gate jobs running with gunicorn for the account server this week
21:18:59 <timburke> mattoliver, i don't think we have that at the moment -- good idea
21:19:27 <jianjian> for me, I have been thinking of re-running my previous benchmark tests with newer python versions to check for improvements. I started with Uvicorn (even though we probably won’t adopt it for various reasons) and saw a significant throughput gain when using py3.13 with the uvloop scheduler. I’ll be testing Gunicorn next.
21:19:36 <mattoliver> just helps people collab if there's at least a known rough idea.
21:20:08 <mattoliver> although git diff between branches and gunicorn is a good start
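A minimal sketch of the env-var feature flag idea mentioned above, just to make the shape concrete; the variable name SWIFT_WSGI_SERVER is hypothetical, not necessarily what's on feature/threaded:

```python
import os

def use_threaded_server():
    # Hypothetical flag name; the feature branch may spell this differently.
    # When unset, keep the existing eventlet-based WSGI server; when set to
    # 'threaded', run the (account) server on the threaded stack instead.
    return os.environ.get('SWIFT_WSGI_SERVER', 'eventlet') == 'threaded'
```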
21:21:40 <timburke> all right, one last thing that's been keeping me preoccupied lately
21:21:50 <timburke> #topic potential EC data loss
21:22:04 <timburke> we've actually had this come up in two ways
21:22:38 <timburke> in the first, we've got two clients trying to write to the same object with the same timestamp
21:24:19 <timburke> the bad news is, the writes can interleave in such a way that both writes return 201 to the client yet we don't have enough fragments to reconstruct either of them -- in an 8+4 policy, we might have 7 of one and 5 of the other
21:25:07 <timburke> i went searching for bugs about it and apparently we hadn't written anything up about it -- but it sounds a lot like
21:25:10 <timburke> #link https://bugs.launchpad.net/swift/+bug/1971686
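To make the arithmetic concrete (the 7/5 split is just the example given above, not a measured distribution):

```python
# 8+4 EC policy: 12 primaries, and any 8 fragments are needed to decode.
ndata, nparity = 8, 4
# If the two same-timestamp writes interleave so each primary keeps only one
# of the two colliding frags, the 12 frags can end up split 7/5:
frags_a, frags_b = 7, 5
assert frags_a + frags_b == ndata + nparity
print(frags_a >= ndata, frags_b >= ndata)   # False False -- neither decodable
```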
21:26:01 <mattoliver> I know other distributed eventual systems use a thing called a lamport clock. That basically boils down to a node having a counter or unique number they also send to break collision deadlocks. And swift already allows this with the timestamp offsets.. so I've been having a play: https://review.opendev.org/c/openstack/swift/+/959009
21:26:25 <timburke> the good news is, they were trying to write the same data. but with encryption enabled, they each used different keys/ivs, so the proxy bombs out on read when it notices the mismatched etags
21:26:46 <mattoliver> That's just using randint because I haven't got something more generic like machine ids working properly; they're unique, but too big an int.
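Roughly the shape of the idea, as a sketch rather than the patch itself (Swift's Timestamp really does carry an offset that ends up in the on-disk name; the randint tie-breaker here just mirrors the placeholder approach described above):

```python
import random
import time

from swift.common.utils import Timestamp

def collision_resistant_timestamp():
    # Two proxies landing on the same wall-clock timestamp can still produce
    # distinct on-disk names, because the offset is encoded into
    # Timestamp.internal (e.g. '1756934441.00000_00000000000001a4') and is
    # already plumbed all the way down to diskfile.
    return Timestamp(time.time(), offset=random.randint(1, 2 ** 16))

print(collision_resistant_timestamp().internal)
```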
21:27:19 <timburke> yep, thanks mattoliver! and clayg's been playing with a probe test to prompt the error: https://review.opendev.org/c/openstack/swift/+/927327
21:27:27 <mattoliver> you guys have been amazing at digging into the EC details and getting things fixed at that end!
21:28:12 <timburke> one trouble i see, though, is that there's nothing to stop both clients going through the same proxy (as with the probe test)
21:28:41 <mattoliver> yeah, I got it different per worker
21:28:48 <jianjian> the new improvement from timburke on the repair tool is great 👍
21:28:52 <mattoliver> but I guess need to add some more randomness via eventlet
21:29:19 <timburke> yeah, so you're beating me to the punch a little :-) despite not having any set of frags large enough to rebuild using the normal machinery, we did find a way to fix them
21:30:35 <jianjian> that's very smart
21:30:58 <mattoliver> you guys are bloody geniuses, especially you timburke, for figuring it all out!
21:31:55 <timburke> basically, since they've always been collisions with the same plaintext user data and we've been using the systematic isa-l codes, we could grab just the data fragments, do some math to figure out the right offsets, and decrypt the frags directly without involving pyeclib at all
21:33:08 <mattoliver> :mind-blown:
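For a sense of the kind of offset math involved, here's a standalone sketch of seeking into an AES-CTR stream (assuming Swift's usual AES-256-CTR body encryption; decrypt_at_offset is a hypothetical helper, not the actual repair tool, which also has to map each data frag's bytes back to positions in the encrypted body first):

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def decrypt_at_offset(key, iv, offset, ciphertext):
    # AES-CTR lets you decrypt starting at an arbitrary byte offset: advance
    # the 128-bit counter by the number of whole 16-byte blocks before
    # `offset`, then throw away the partial leading block.
    counter = (int.from_bytes(iv, 'big') + offset // 16) % (1 << 128)
    decryptor = Cipher(algorithms.AES(key),
                       modes.CTR(counter.to_bytes(16, 'big'))).decryptor()
    pad = offset % 16
    return decryptor.update(b'\x00' * pad + ciphertext)[pad:]
```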
21:34:04 <timburke> that worked for most cases, but every now and then we had an object that lost a data frag. *then* we can figure out which key/iv has the most parity, re-encrypt any other data frags to use that key/iv, and then send it all through pyeclib and a decrypter like a proxy would
21:35:02 <timburke> so far i don't think we've had anything that we couldn't fix that way
21:35:34 <jianjian> \o/
21:36:03 <timburke> but it's still treating the symptoms, not the root cause. so expect more work on that front (thanks for thinking about lamport clocks, mattoliver!)
21:36:05 <mattoliver> incredible!
21:37:28 <jianjian> +1, will look into matt's new patches
21:38:08 <timburke> the other case is a little more scary. we've *also* seen the occasional fragment that's just bad. like, the frag-level etag matches, but trying to use it when decoding gives us data that doesn't match the object-level etag
21:39:27 <mattoliver> well, the fact that you got as far as you did is way, way better than I'd ever expect you to get.
21:39:35 <timburke> we're still searching for explanations. the good news is that there are usually enough other frags such that we can find *some* set that still works
21:40:35 <jianjian> probably that was some kind of bit rot or disk corruption
21:40:45 <mattoliver> yeah, I guess if you gather all frags (even on handoffs) then brute force the combinations until it works
21:40:54 <mattoliver> yeah
21:41:26 <timburke> maybe? but then i would've expected our normal machinery to quarantine it, and the frag-level etag to not match
21:41:45 <timburke> (we've even had a case where an object hit *both* of these issues, so clayg enhanced our repair tool to be able to exclude specific frags)
21:43:20 <mattoliver> Re uniqueness in green threads, I wonder if I could just do something simple like crc the txid and add that to the offset to unique-ify between greenthreads :hmm: Need to add something to make it more unique. Does each thread have an id in eventlet land or something?
21:43:41 <mattoliver> Anyway, my patch is just a start at playing with the idea. I don't think it's unique enough yet; well, it can be, but I'm not happy with it yet.
21:44:39 <mattoliver> but we have Timestamp objects with offsets plumbed all the way to diskfile, so it kinda just works. Just weird seeing data files on disk with an offset included.
21:45:17 <timburke> i think our leading suspects are an issue with reconstruction, where one of the frags we reach for *does* get quarantined but we still send data reconstructed from it to the destination with a fresh, good etag; or, some kind of weird bit-flip somewhere in the proxy, where the frag was always corrupted as soon as we got it back from pyeclib (but due to memory problems, not pyeclib)
21:46:11 <timburke> mattoliver, yeah, iirc greenthreads have some kind of id -- if nothing else, you could surely use python's id() on it
21:47:19 <mattoliver> oh yeah!
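One possible shape of that, as a sketch only (the names are made up, and whether a crc of the txid plus the greenthread id is unique enough is exactly the open question):

```python
import zlib

from eventlet import greenthread

def per_request_offset(txid):
    # Mix the transaction id with the identity of the current greenthread, so
    # two concurrent requests in the same worker still get different offsets.
    # id() on the greenthread is enough to tell them apart while both are
    # alive; the crc keeps the offset small enough to fit a Timestamp offset.
    token = '%s/%d' % (txid, id(greenthread.getcurrent()))
    return zlib.crc32(token.encode('utf-8')) or 1
```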
21:47:34 <timburke> another thought i had was to make diskfile more brittle: prevent it from linking over an existing file
21:48:06 <mattoliver> that's not a bad idea either. So basically in this case, both would fail
21:48:21 <mattoliver> unless enough frags for each were written to handoffs
21:48:33 <mattoliver> *both requests
21:48:42 <timburke> yeah -- getting 503s back out to the client and prompting a retry
21:48:46 <timburke> hmm... true...
21:49:01 <timburke> so, might be good to do both :D
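A sketch of what the "more brittle" diskfile behavior could look like (hypothetical helper, not the actual diskfile code):

```python
import os

def link_no_clobber(tmp_path, final_path):
    # os.link() refuses to replace an existing directory entry, so a second
    # write landing on the exact same timestamped filename raises
    # FileExistsError instead of silently replacing the first write's frag.
    # The object server could turn that into a 503 and let the client retry.
    os.link(tmp_path, final_path)
    os.unlink(tmp_path)
```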
21:49:21 <timburke> all right, that's all i've got, and we've only got about 10mins left
21:49:26 <timburke> #topic open discussion
21:49:34 <timburke> anything else we ought to bring up this week?
21:51:08 <mattoliver> I've respun the ring/ringdata patch and am playing with using a reentrant lock so the ring can't update mid get_nodes/get_more_nodes
21:51:09 <mattoliver> #link https://review.opendev.org/c/openstack/swift/+/955263
21:51:23 <mattoliver> oops, wrong one
21:51:36 <mattoliver> #link https://review.opendev.org/c/openstack/swift/+/957291
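Roughly the idea, sketched outside the real Ring class (the actual change is in the review above; the class and attribute names here are illustrative):

```python
import threading

class ReloadSafeRing(object):
    # Illustrative only: serialize ring reloads against node lookups with a
    # reentrant lock, so a reload can't swap the ring data out from under a
    # get_nodes()/get_more_nodes() call that's partway through.
    def __init__(self, ring_data):
        self._lock = threading.RLock()
        # stand-in for the replica-to-partition-to-device table
        self._ring_data = ring_data

    def reload(self, new_ring_data):
        with self._lock:
            self._ring_data = new_ring_data

    def get_nodes(self, part):
        # re-entrant, so helpers that also take the lock still work
        with self._lock:
            return [row[part] for row in self._ring_data]
```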
21:52:34 <timburke> nice -- i'll try to have a look soon
21:53:34 <timburke> the stable gates that were broken should be fixed now. i saw elodilles also backported the fix to the most-recent unmaintained branch
21:54:18 <mattoliver> oh cool
21:55:07 <timburke> oh yeah, and the guy that added the new libec backend is working on another one
21:55:09 <timburke> #link https://review.opendev.org/c/openstack/liberasurecode/+/959280
21:56:27 <timburke> if you're interested, maybe start with https://en.wikipedia.org/wiki/Locally_recoverable_code (at least for an overview of the goals)
21:56:54 <mattoliver> thanks, yeah, I'll need an overview
21:57:02 <timburke> it kinda sounds like a better take on our ec duplication factor -- which did always feel like overkill
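For a rough sense of the trade-off (illustrative numbers in the Azure-style LRC(12, 2, 2) layout, not necessarily what the proposed backend implements):

```python
# LRC(12, 2, 2): 12 data frags in two local groups of 6, one local parity per
# group, plus 2 global parities -- 16 frags total. Losing a single frag only
# needs reads from its local group, not the whole stripe.
k, local_parities, global_parities = 12, 2, 2
total_frags = k + local_parities + global_parities      # 16
single_frag_repair_reads = k // local_parities          # 6 reads
plain_rs_repair_reads = k                               # 12 reads for a 12+4 RS code
print(total_frags, single_frag_repair_reads, plain_rs_repair_reads)
```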
21:57:08 <mattoliver> But nice to see so much liberasure love
21:57:28 <timburke> agree
21:57:36 <timburke> all right, i think i oughta call it
21:57:47 <timburke> thank you all for coming, and thank you for working on swift!
21:57:51 <timburke> #endmeeting