21:00:41 <timburke> #startmeeting swift
21:00:41 <opendevmeet> Meeting started Wed Sep 3 21:00:41 2025 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:41 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:41 <opendevmeet> The meeting name has been set to 'swift'
21:00:51 <timburke> who's here for the swift team meeting?
21:01:07 <mattoliver> o/
21:02:08 <timburke> as usual the agenda's at
21:02:11 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:15 <timburke> first up
21:02:20 <timburke> #topic vPTG
21:02:46 <timburke> as a reminder, it's just under two months out now
21:02:53 <timburke> oct 27-31
21:03:07 <timburke> but there's more than just a reminder this time! there's a call to action!
21:03:24 <timburke> i've got the skeleton of an etherpad up for topics
21:03:33 <timburke> #link https://etherpad.opendev.org/p/swift-ptg-gazpacho
21:03:41 <mattoliver> Oh nice
21:04:00 <timburke> so please, add topics you'd like to discuss!
21:04:18 <timburke> next up
21:04:23 <timburke> #topic releases
21:04:50 <timburke> it's getting to that point in the cycle where we really need to get code shipped :-)
21:05:17 <timburke> liberasurecode 1.7.1 is now released
21:05:33 <timburke> and the stable branch for swiftclient has been cut
21:05:42 <mattoliver> Time to fix up and promote the priority reviews page?
21:05:45 <timburke> but i still need to do a swift release
21:06:12 <timburke> always a good time for that :-) i really should do it more regularly
21:07:25 <timburke> but i think i like where things are at the moment. we could always do more, and if anyone wants to slip something in this week, i can update notes for it, but i think i'd be content once we get the notes i've written so far merged
21:07:32 <timburke> #link https://review.opendev.org/c/openstack/swift/+/956333
21:08:22 <mattoliver> Kk
21:08:30 <timburke> if you've got a sec to spot-check those (make sure i didn't misspell something, or that i explained things well enough), i'd appreciate it
21:08:47 <mattoliver> Will do!
21:08:58 <timburke> thanks jianjian for taking a look last week!
21:09:13 <timburke> now it's got the reno-ified notes, too, though
21:09:28 <jianjian> no problem, I also have some nits -- a good time to add them since Matt is going to take a look as well :-)
21:09:46 <timburke> #link https://1fa741547dce2186b901-c02cba5f61e61aa034234ed930eebdcc.ssl.cf1.rackcdn.com/openstack/d2a082b2322e47a791f7d98062e50b34/docs/current.html
21:10:57 <timburke> but there's even more! i'd also like to get a pyeclib release out, so it can offer the new backend in liberasurecode 1.7.x
21:11:00 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/958706
21:11:42 <mattoliver> Oh yeah, getting the new backend out would be awesome too
21:11:45 <jianjian> nice!
21:12:24 <timburke> all right, next up
21:12:31 <timburke> #topic eventlet removal
21:13:50 <timburke> a bunch of us (clayg, jianjian, and i) met with cschwede last week to ask about where things stand and how we could help out
21:15:02 <timburke> sounds like things are going fairly well, as POCs go! account server is looking pretty good
21:15:33 <timburke> one of the things to come out of that was to spin up a feature branch where we can all hack on it
21:15:50 <timburke> so we now have a feature/threaded branch!
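(For context on the pyeclib release discussed under the releases topic above: the point of the release is to let pyeclib expose backends that liberasurecode 1.7.x provides. A minimal round trip looks roughly like the sketch below; the ec_type string for the new backend isn't named in the meeting, so the long-standing isa_l_rs_vand backend stands in for it.)

```python
# Minimal pyeclib round-trip sketch. The ec_type for the new
# liberasurecode 1.7.x backend isn't named in the log, so the
# existing 'isa_l_rs_vand' backend is used as a stand-in.
from pyeclib.ec_iface import ECDriver

driver = ECDriver(k=8, m=4, ec_type='isa_l_rs_vand')  # 8 data + 4 parity

data = b'some object body' * 1024
frags = driver.encode(data)        # returns k + m fragment payloads
assert len(frags) == 12

# Any k fragments are enough to get the original bytes back.
recovered = driver.decode(frags[:8])
assert recovered == data
```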
21:16:15 <timburke> and cschwede has pushed up a chain to get us started
21:16:24 <timburke> #link https://review.opendev.org/q/project:openstack/swift+branch:feature/threaded+is:open
21:17:20 <mattoliver> you know we're getting serious when we have 2 feature branches running!
21:17:33 <timburke> we're going to have a feature flag for it (currently implemented as an env var), that way we don't have to dive into PUT+POST+POST right away
21:18:08 <mattoliver> oh nice idea
21:18:36 <jianjian> seems like a clean addition, great start from cschwede
21:18:37 <mattoliver> is there like a one-pager, overview, or doc on the main idea?
21:18:44 <timburke> and i think i'll try to get some of the gate jobs running with gunicorn for the account server this week
21:18:59 <timburke> mattoliver, i don't think we have that at the moment -- good idea
21:19:27 <jianjian> for me, I have been thinking of re-running my previous benchmark tests with newer python versions to check for improvements. I started with Uvicorn (even though we probably won't adopt it for various reasons) and saw a significant throughput gain when using py3.13 with the uvloop scheduler. I'll be testing Gunicorn next.
21:19:36 <mattoliver> it just helps people collab if there is at least a known rough idea.
21:20:08 <mattoliver> although a git diff between branches and gunicorn is a good start
21:21:40 <timburke> all right, one last thing that's been keeping me preoccupied lately
21:21:50 <timburke> #topic potential EC data loss
21:22:04 <timburke> we've actually had this come up in two ways
21:22:38 <timburke> in the first, we've got two clients trying to write to the same object with the same timestamp
21:24:19 <timburke> the bad news is, the writes can interleave in such a way that both writes return 201 to the client yet we don't have enough fragments to reconstruct either of them -- in an 8+4 policy, we might have 7 of one and 5 of the other
21:25:07 <timburke> i went searching for bugs about it and apparently we hadn't written anything up about it -- but it sounds a lot like
21:25:10 <timburke> #link https://bugs.launchpad.net/swift/+bug/1971686
21:26:01 <mattoliver> I know other eventually consistent distributed systems use a thing called a Lamport clock. That basically boils down to a node having a counter or unique number they also send to break collision deadlocks. And swift already allows this with the timestamp offsets.. so I've been having a play: https://review.opendev.org/c/openstack/swift/+/959009
21:26:25 <timburke> the good news is, they were trying to write the same data. but with encryption enabled, they each used different keys/ivs, so the proxy bombs out on read when it notices the mismatched etags
21:26:46 <mattoliver> That's just using randint because I haven't got something more generic like machine ids working properly -- they're unique, but too big an int.
21:27:19 <timburke> yep, thanks mattoliver! and clayg's been playing with a probe test to prompt the error: https://review.opendev.org/c/openstack/swift/+/927327
21:27:27 <mattoliver> you guys have been amazing at digging into the EC details and getting things fixed at that end!
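(A rough sketch of the Lamport-clock-style tie-break mattoliver describes, assuming swift.common.utils.Timestamp's existing offset support; randint here mirrors the current patch's stopgap rather than a final design, and the patch under review may pick the offset differently.)

```python
# Sketch of the tie-breaking idea: give each proxy/worker/greenthread a
# distinct offset so two PUTs landing at the same wall-clock time still
# sort deterministically instead of colliding.
import random
import time

from swift.common.utils import Timestamp

tie_break = random.randint(0, 2 ** 16)   # stand-in for a unique node/worker id
ts = Timestamp(time.time(), offset=tie_break)

# The offset rides along in the internal form that ends up on disk,
# e.g. '1756935601.12345_000000000000a3f2', so colliding writers no
# longer produce byte-identical timestamps.
print(ts.internal)
```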
21:28:12 <timburke> one trouble i see, though, is that there's nothing to stop both clients going through the same proxy (as with the probe test)
21:28:41 <mattoliver> yeah, I got it different per worker
21:28:48 <jianjian> the new improvement from timburke on the repair tool is great 👍
21:28:52 <mattoliver> but I guess I need to add some more randomness via eventlet
21:29:19 <timburke> yeah, so you're beating me to the punch a little :-) despite not having any set of frags large enough to rebuild using the normal machinery, we did find a way to fix them
21:30:35 <jianjian> that's very smart
21:30:58 <mattoliver> you guys are bloody geniuses, especially you timburke, for figuring it all out!
21:31:55 <timburke> basically, since they've always been collisions with the same plaintext user data and we've been using the systematic isa-l codes, we could grab just the data fragments, do some math to figure out the right offsets, and decrypt the frags directly without involving pyeclib at all
21:33:08 <mattoliver> :mind-blown:
21:34:04 <timburke> that worked for most cases, but every now and then we had an object that lost a data frag. *then* we could figure out which key/iv has the most parity, re-encrypt any other data frags to use that key/iv, and then send it all through pyeclib and a decrypter like a proxy would
21:35:02 <timburke> so far i don't think we've had anything that we couldn't fix that way
21:35:34 <jianjian> \o/
21:36:03 <timburke> but it's still treating the symptoms, not the root cause. so expect more work on that front (thanks for thinking about lamport clocks, mattoliver!)
21:36:05 <mattoliver> incredible!
21:37:28 <jianjian> +1, will look into matt's new patches
21:38:08 <timburke> the other case is a little more scary. we've *also* seen the occasional fragment that's just bad. like, the frag-level etag matches, but trying to use it when decoding gives us data that doesn't match the object-level etag
21:39:27 <mattoliver> well, the fact that you got as far as you did is way, way better than I'd ever expect.
21:39:35 <timburke> we're still searching for explanations. the good news is that there are usually enough other frags that we can find *some* set that still works
21:40:35 <jianjian> probably that was some kind of bit rot or disk corruption
21:40:45 <mattoliver> yeah, I guess if you gather all the frags (even on handoffs) you can then brute-force the combinations until it works
21:40:54 <mattoliver> yeah
21:41:26 <timburke> maybe? but then i would've expected our normal machinery to quarantine it, and the frag-level etag to not match
21:41:45 <timburke> (we've even had a case where an object hit *both* of these issues, so clayg enhanced our repair tool to be able to exclude specific frags)
21:43:20 <mattoliver> Re uniqueness in green threads, I wonder if I could just do something simple like crc the txid and add that to the offset to unique-ify between greenthreads :hmm: Need to add something to make it more unique. Does each thread have an id in eventlet land or something?
21:43:41 <mattoliver> Anyway, my patch is just a start at playing with the idea. don't think it's unique enough yet -- well, it can be.. but not happy with it yet.
21:44:39 <mattoliver> but we have Timestamp objects with offsets plumbed all the way to diskfile, so it kinda just works. Just weird seeing data files on disk with an offset included.
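(A sketch of the "gather everything and brute-force the combinations" approach mattoliver mentions, assuming an 8+4 policy and pyeclib's ECDriver; the fragment-collection step and the function name are hypothetical and this is not Swift's actual repair tool.)

```python
# Try each k-sized combination of the collected fragments until one
# decodes to data whose MD5 matches the object-level etag.
import hashlib
from itertools import combinations

from pyeclib.ec_iface import ECDriver


def find_good_frag_set(frags, object_etag, k=8, m=4):
    """Return a k-tuple of fragments that reproduces the object, or None."""
    driver = ECDriver(k=k, m=m, ec_type='isa_l_rs_vand')
    for combo in combinations(frags, k):
        try:
            decoded = driver.decode(list(combo))
        except Exception:
            continue  # this subset is unusable; keep looking
        if hashlib.md5(decoded).hexdigest() == object_etag:
            return combo
    return None  # nothing reconstructable from what we gathered
```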
21:45:17 <timburke> i think our leading suspects are an issue with reconstruction, where one of the frags we reach for *does* get quarantined but we still send data reconstructed from it to the destination with a fresh, good etag; or, some kind of weird bit-flip somewhere in the proxy, where the frag was already corrupted as soon as we got it back from pyeclib (but due to memory problems, not pyeclib)
21:46:11 <timburke> mattoliver, yeah, iirc greenthreads have some kind of id -- if nothing else, you could surely use python's id() on it
21:47:19 <mattoliver> oh yeah!
21:47:34 <timburke> another thought i had was to make diskfile more brittle: prevent it from linking over an existing file
21:48:06 <mattoliver> that's not a bad idea either. So basically in this case, both would fail
21:48:21 <mattoliver> unless enough frags for each were written to handoffs
21:48:33 <mattoliver> *both requests
21:48:42 <timburke> yeah -- getting 503s back out to the client and prompting a retry
21:48:46 <timburke> hmm... true...
21:49:01 <timburke> so, might be good to do both :D
21:49:21 <timburke> all right, that's all i've got, and we've only got about 10 mins left
21:49:26 <timburke> #topic open discussion
21:49:34 <timburke> anything else we ought to bring up this week?
21:51:08 <mattoliver> I've respun the ring/ringdata patch and I'm playing with using a reentrant lock so the ring can't update mid get_nodes/get_more_nodes
21:51:09 <mattoliver> #link https://review.opendev.org/c/openstack/swift/+/955263
21:51:23 <mattoliver> oops, wrong one
21:51:36 <mattoliver> #link https://review.opendev.org/c/openstack/swift/+/957291
21:52:34 <timburke> nice -- i'll try to have a look soon
21:53:34 <timburke> the stable gates that were broken should be fixed now. i saw elodilles also backported the fix to the most-recent unmaintained branch
21:54:18 <mattoliver> oh cool
21:55:07 <timburke> oh yeah, and the guy that added the new libec backend is working on another one
21:55:09 <timburke> #link https://review.opendev.org/c/openstack/liberasurecode/+/959280
21:56:27 <timburke> if you're interested, maybe start with https://en.wikipedia.org/wiki/Locally_recoverable_code (at least for an overview of the goals)
21:56:54 <mattoliver> thanks, yeah, I'll need an overview
21:57:02 <timburke> it kinda sounds like a better take on our ec duplication factor -- which did always feel like overkill
21:57:08 <mattoliver> But nice to see so much liberasure love
21:57:28 <timburke> agree
21:57:36 <timburke> all right, i think i oughta call it
21:57:47 <timburke> thank you all for coming, and thank you for working on swift!
21:57:51 <timburke> #endmeeting
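(On timburke's "make diskfile more brittle" idea from earlier in the discussion: the crux is that os.rename() silently replaces an existing target while os.link() refuses with FileExistsError. A standalone sketch of that failure mode follows; it is not Swift's actual diskfile code.)

```python
# Illustration of "prevent linking over an existing file": placing the
# final .data file with os.link() instead of os.rename() makes a second
# writer with a byte-identical timestamp fail loudly (and surface as an
# error/503) instead of silently clobbering the first writer's file.
import os
import tempfile


def place_data_file(tmp_path, final_path):
    try:
        os.link(tmp_path, final_path)   # raises FileExistsError if final_path exists
    finally:
        os.unlink(tmp_path)             # always clean up the temp file


with tempfile.TemporaryDirectory() as d:
    for i, body in enumerate((b'write one', b'write two')):
        tmp = os.path.join(d, 'tmp%d' % i)
        with open(tmp, 'wb') as f:
            f.write(body)
        try:
            place_data_file(tmp, os.path.join(d, '1756935601.12345.data'))
        except FileExistsError:
            print('second writer with the same timestamp fails loudly')
```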