Wednesday, 2025-03-12

<opendevreview> Merged openstack/swift-bench master: Remove nose detritus  https://review.opendev.org/c/openstack/swift-bench/+/943029 [04:59]
<opendevreview> Matthew Oliver proposed openstack/swift master: Use SO_TIMESTAMP to track request times before they hit accept queue  https://review.opendev.org/c/openstack/swift/+/944103 [11:14]
<clayg> timburke_: DHE: I assume we'd still only talk to handoffs when primaries miss.  A POST would still generally (202, 202, 404) => but for failure-during-rebalance I could see a (ConnectionTimeout, 202, 404, 202) being better than (ConnectionTimeout, 202, 404, 404) - For read-stale-after-delete I would hope the x-backend-timestamp from the tombstone on the primary could somehow suppress the 2XX from the stale primary [14:08]
<clayg> inserted into the handoffs list; but maybe we don't do that.  Again, I think there's code we start adding tests to somewhere... [14:08]
<clayg> maybe start with a rebase 834621: ring: Add a rebalance history in the ring | https://review.opendev.org/c/openstack/swift/+/834621 [14:08]
<DHE> most likely, sure, you assume that your 3 good copies are probably fine. but immediately after a rebalance that is false. having the 3rd copy at least in the handoffs list should improve reliability at least a little bit... maybe making it the first handoff isn't the right answer... [14:48]
<timburke_> idk, sticking an old primary in as first handoff seems like the right sort of trade-off to make, especially for an AP system. oh! what about this: stick the old primary in as first handoff on reads, and as an extra primary on (replicated) writes! [15:25]
<timburke_> just need to make sure it doesn't impact the quorum requirement [15:26]
<DHE> I like that.. on reads you're likely to find an expected copy quickly. on writes, it's best to keep handoffs still as spread out as possible across regions/zones which is what get_more_nodes() does.. [15:31]
<timburke_> continuing to write to old primaries should also reduce unnecessary rsync traffic (where today we rsync the old data with everything else and let cleanup_ondisk_files unlink it) [15:47]
<timburke_> would probably want a config option for that over-replicate-on-write behavior, though -- capacity crunches are no fun [17:29]
<opendevreview> Alistair Coles proposed openstack/swift master: sq: move CRCHasher to checksum.py  https://review.opendev.org/c/openstack/swift/+/944146 [18:43]
<opendevreview> Alistair Coles proposed openstack/swift master: sq? S3Request checksumming suggestions  https://review.opendev.org/c/openstack/swift/+/944147 [18:43]
-opendevstatus- NOTICE: One of our Zuul job log storage providers is experiencing errors. We have removed that storage target from base jobs. You should be able to safely recheck changes now. [20:24]
<kota> good morning [20:57]
<timburke_> o/ [21:00]
<timburke_> #startmeeting swift [21:00]
<opendevmeet> Meeting started Wed Mar 12 21:00:33 2025 UTC and is due to finish in 60 minutes.  The chair is timburke_. Information about MeetBot at http://wiki.debian.org/MeetBot. [21:00]
<opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. [21:00]
<opendevmeet> The meeting name has been set to 'swift' [21:00]
<timburke_> who's here for the swift meeting? [21:00]
<kota> o/ [21:00]
<mattoliver> o/ [21:01]
<timburke_> sorry about missing last week; time just got away from me [21:01]
<timburke_> on that note, i also haven't updated the agenda 😬 [21:02]
<timburke_> but we can still have some updates! [21:02]
<timburke_> #topic PTG [21:02]
<acoles> o/ [21:02]
<timburke_> reminder that the ptg is less than a month away [21:02]
<mattoliver> woo! need to get topics filled out then! [21:03]
<timburke_> #link https://etherpad.opendev.org/p/swift-ptg-flamingo [21:03]
<timburke_> we also need to choose meeting times. this time i figured, rather than list each individual hour, let's just list the blocks and we'll pick one per day that seems to work fairly well. if you can't make the whole block, that's fine; but those tables got too wide before [21:05]
<timburke_> https://framadate.org/LBfzMX2I4C1JL9cZ [21:05]
<timburke_> #link https://framadate.org/LBfzMX2I4C1JL9cZ [21:05]
<mattoliver> kk [21:05]
<timburke_> and if you haven't registered already, please do [21:06]
<jianjian> got it [21:06]
<timburke_> #link https://ptg.openinfra.dev/ [21:06]
<timburke_> i look forward to seeing you there! [21:06]
<timburke_> #topic swift release [21:06]
<timburke_> i merged the authors/changelog patch for 2.35.0 [21:07]
<timburke_> #link https://review.opendev.org/c/openstack/swift/+/943073 [21:07]
<timburke_> y'all did a lot of great work this past cycle! [21:07]
<timburke_> i should really try to get more than one release out per cycle :-/ [21:08]
<timburke_> and there's a releases patch up to create the tag, though they're pushing for a major version bump [21:09]
<timburke_> #link https://review.opendev.org/c/openstack/releases/+/943880 [21:09]
<timburke_> we'll see if i can convince them that actually very little has changed, despite our removing py2 gating [21:09]
<jianjian> let us know if you need us to post comments in support 😉 [21:10]
<timburke_> otherwise there'll be another quick patch to update the version in the changelog. i'd be inclined to go to 4.0.0 (to avoid confusion with the swift3 project) [21:10]
<kota> :D [21:11]
<timburke_> next up [21:11]
<timburke_> #topic aws-chunked support [21:11]
<acoles> timburke_: thanks as always for all your work on the release etc [21:11]
<mattoliver> +1 [21:12]
<timburke_> our (nvidia's) users have started to get used to the fact that boto3 doesn't work with default configs -- which isn't really where we want to be [21:12]
<timburke_> makes me think of something like "there's nothing more permanent than a temporary solution" [21:13]
<acoles> lol [21:13]
<timburke_> so we've been working on getting the right balance between adding support for aws-chunked, performing all the validation that clients are requesting, but not necessarily taking on all the work required to add full additional-checksums support [21:15]
<timburke_> latest attempt looks like [21:15]
<timburke_> #link https://review.opendev.org/c/openstack/swift/+/944073 [21:15]
<timburke_> where the idea is just to validate that the bytes received match the checksum provided during PutObject and UploadPart -- but we won't write any of it down, and we won't validate them during CompleteMultipartUpload [21:17]
<timburke_> acoles has already started to look further down the chain (thanks!) and i'll try to get some of his follow-ups squashed in soon [21:18]
<mattoliver> oh that's a good idea, so at least we're doing something with the given crc rather than just ignoring it. [21:19]
<acoles> I was thinking a bit more about that "not writing anything down" part...that will mean that if/when we do support including crcs in the object attributes API thing, then there'd forever be some objects for which we can't report the values. I do NOT want to scope creep what we're doing, but we may want to consider how hard it would be to stuff the value into sysmeta so it could be retrieved by a HEAD in future. [21:21]
<timburke_> acoles, but that's no different than if they'd uploaded the object without any checksums, right? i'd be hesitant to pretend to know what we want to write down until we're actually willing to return it [21:22]
<acoles> I'm not so worried about ListParts because that is transient, once an MPU completes we don't need the crc value [21:22]
<acoles> sure, it's no different than an upload with no crc. But if client gave us a crc and we checked it then we might one day want to return it to the client via some API? or am I misunderstanding? [21:23]
<acoles> I should perhaps go and study the object attributes s3 api docs some more ! [21:24]
<acoles> or even play with it [21:25]
<timburke_> sure -- but as a client i wouldn't expect any of my uploads to get that retroactively -- once i see the feature's in, then i'd want it to work, but not for anything that predates it [21:25]
<acoles> I guess that's a reasonable position. I just thought I'd flag up ...like if it proves to be low hanging fruit (except my alter-ego will scream "even low hanging fruit requires more tests etc" )  ;-) [21:26]
<timburke_> yeah -- and the full-object vs composite checksums thing makes me worry about what all we need to write down [21:27]
<acoles> oh yeah, _that_ [21:28]
<acoles> ok, let's stay focussed! [21:29]
<timburke_> somewhat related, i know clayg has noted some of the strange read-a-byte-then-return-an-error in this stuff -- we might want to revive https://review.opendev.org/c/openstack/swift/+/799866 sometime soon, make it so we can tell eventlet to close a connection without bothering to discard [21:29]
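The validate-but-don't-store idea discussed under this topic can be illustrated with a minimal sketch. It is not the code in the linked patch: `ChecksumMismatch`, `encode_crc32` and `validate_body` are hypothetical names, and the sketch assumes the client supplies a CRC32 as the base64 encoding of the 4-byte big-endian value.

```python
import base64
import zlib


class ChecksumMismatch(Exception):
    """Raised when the received bytes do not match the client's checksum."""


def encode_crc32(value):
    # Assumption: the checksum travels as base64 of the 4-byte
    # big-endian CRC32 integer.
    return base64.b64encode(value.to_bytes(4, "big")).decode("ascii")


def validate_body(chunks, expected_b64):
    """Consume an iterable of byte chunks; return the byte count or raise."""
    crc = 0
    length = 0
    for chunk in chunks:
        # zlib.crc32 accepts a running value, so the body is never buffered.
        crc = zlib.crc32(chunk, crc)
        length += len(chunk)
    actual = encode_crc32(crc)
    if actual != expected_b64:
        raise ChecksumMismatch(f"expected {expected_b64}, computed {actual}")
    # Nothing is persisted: the checksum is checked and then discarded.
    return length
```

Because the checksum is only compared and then dropped, a later HEAD has nothing to return for the object, which is the trade-off acoles raises above.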
<timburke_> #topic ring v2 [21:30]
<timburke_> there's been a bit of back and forth in IRC lately about what we can do with an old-primaries table [21:31]
<timburke_> DHE heard about our plan to insert old primaries at the start of the handoff list for reads [21:32]
<timburke_> and i'd previously been thinking about how we might want it for at least *some* writes as well (POSTs in particular) [21:33]
<timburke_> DHE had a concern about how we might return out-dated data -- which is always a bit of a risk with Swift, but it does seem like the current plan would increase that risk [21:35]
<mattoliver> But also the ability to just grab an old handoff if it exists. To break deadlocks. [21:36]
<timburke_> so now i had a thought about adding the old primary to the start of handoffs for reads (like we've been planning) but *also* add it to the set of *primaries* for writes, causing some over-replication [21:36]
<mattoliver> oh interesting, so we over provision new writes. If there is an old primary. And if there's not, same behaviour as today? [21:38]
<timburke_> yup [21:39]
<timburke_> it should avoid DHE's worry and reduce some unnecessary rsync traffic (where the old primary would sync data just so the remote could delete it), but at a risk of not digging out from full-disks fast enough [21:39]
<timburke_> so we'd probably want to make it configurable [21:39]
<timburke_> all right, that's all i've got on my mind [21:40]
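The node ordering proposed in this topic can be sketched roughly as follows. This is only an illustration of the idea, not Swift's ring API: `read_nodes`, `write_nodes` and the `over_replicate` switch are hypothetical, and the simple-majority quorum is an assumption that may differ from Swift's own formula.

```python
def read_nodes(primaries, handoffs, old_primaries):
    """Order for reads: primaries, then old primaries, then other handoffs."""
    seen = set(primaries)
    extra = []
    for node in old_primaries:
        if node not in seen:
            seen.add(node)
            extra.append(node)
    # Old primaries go at the front of the handoff list; drop duplicates.
    rest = [node for node in handoffs if node not in seen]
    return list(primaries) + extra + rest


def write_nodes(primaries, old_primaries, over_replicate=True):
    """Return (nodes to write to, quorum) for a replicated write."""
    # Assumption: simple majority of the *current* primaries only, so the
    # extra copies never raise the number of successes required.
    quorum = len(primaries) // 2 + 1
    nodes = list(primaries)
    if over_replicate:
        for node in old_primaries:
            if node not in nodes:
                nodes.append(node)
    return nodes, quorum
```

With primaries `a, b, c` and one old primary `d`, reads would try `d` before any other handoff, while writes would go to four nodes and still need only two successes. Setting `over_replicate=False` gives today's behaviour, which matches the suggestion to make it configurable.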
<timburke_> #topic open discussion [21:40]
<timburke_> what else should we talk about this week? [21:40]
<kota> I'll be absent next week because I'm attending GTC. [21:41]
<timburke_> we could try to meet up -- sorry, i think i dropped the ball on that last time you were in town [21:44]
<mattoliver> oh cool, GTC! [21:44]
<kota> np, thx. [21:45]
<mattoliver> I'm playing with adding the SO_TIMESTAMP socket option to swift so we can see how long a request was sitting on the accept queue. https://review.opendev.org/c/openstack/swift/+/944103 [21:45]
<mattoliver> In my basic socket script it works all the time. I've put it into Swift and it works sometimes :( [21:46]
<timburke_> curious that it should only be sometimes :-/ how does it fail? [21:47]
<mattoliver> Other times I get resource not available. Using eventlet / green sockets in my script all work. But still might be something eventlet. Still playing with it to figure out what I'm missing. There are some flags to force it to wait and to PEEK at the data. So trying some of those. [21:47]
<mattoliver> the socket.recvmsg() to get the ancdata just throws a [Errno 11] Resource temporarily unavailable [21:48]
<mattoliver> but other times it works and returns the ancdata. [21:48]
<mattoliver> And I'm just running the same request (swift stat) over and over. [21:49]
<mattoliver> Maybe something something eventlet pools and socket not available. [21:49]
<mattoliver> I thought maybe the socket didn't have data yet and is non-blocking.. but there is a MSG_WAITALL flag I can use but that doesn't seem to fix it. [21:50]
<timburke_> curious. when you're trying it in your script, is it with greened sockets, or stdlib? [21:50]
<mattoliver> I started with stdlib, but moved to using eventlet.listen and greened sockets like we do in Swift. All good [21:51]
<timburke_> huh. strange [21:51]
<mattoliver> But that's still a single "thread" so maybe I need to try firing off a bunch of threads, ie a pool [21:52]
<mattoliver> I did try a sleep(0) just in case I'd lose context (just trying anything atm). [21:52]
<mattoliver> The code I pushed last night just passes on the exception, but just put in a self.log_message(f"=== Exception {str(ex)} ===") and you'll see the error [21:53]
<timburke_> i wonder if with the non-blocking sockets we could get to a point where accept could return, but not all the ancdata is available yet [21:53]
<mattoliver> yeah, true. [21:54]
<mattoliver> either way mostly good news, just need to figure out this. [21:54]
<mattoliver> Also I seemed to have issues using recvmsg after we call socket.makefile... but might recheck that because I did try a billion things last night, so could've been a red herring [21:55]
<mattoliver> Anyway, that's all I got for now. [21:55]
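The intermittent `[Errno 11]` is consistent with timburke_'s guess that `recvmsg()` is being called on a non-blocking socket before anything is queued. Below is a minimal sketch of one way to handle that, using plain stdlib sockets rather than eventlet. `recvmsg_when_ready` and `timeval_to_seconds` are hypothetical helpers, not code from the patch. The sketch assumes the timestamp arrives as a `struct timeval` of two native longs, and leaves enabling SO_TIMESTAMP and matching the control message type to the caller.

```python
import select
import socket
import struct


def recvmsg_when_ready(sock, bufsize=1, ancbufsize=1024, timeout=1.0):
    """Peek at a non-blocking socket, waiting for readability if needed."""
    try:
        return sock.recvmsg(bufsize, ancbufsize, socket.MSG_PEEK)
    except BlockingIOError:
        # Errno 11 (EAGAIN): nothing queued yet. Wait, then retry once.
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            raise
        return sock.recvmsg(bufsize, ancbufsize, socket.MSG_PEEK)


def timeval_to_seconds(cmsg_data):
    """Decode a struct timeval (assumed: two native longs) to float seconds."""
    size = struct.calcsize("ll")
    sec, usec = struct.unpack("ll", cmsg_data[:size])
    return sec + usec / 1e6
```

`MSG_PEEK` leaves the request bytes in the socket buffer for whatever reads them next. As far as I know, `MSG_WAITALL` does not make a non-blocking socket wait, which would explain why it did not help here. Under eventlet the waiting would need to go through the hub rather than a raw `select()`.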
<timburke_> hmm... i wonder if you could use that to back-date some spans for your otel work... would be interesting if we could see how long the request was sitting in the accept queue with that [21:56]
<timburke_> all right, i think we can wrap up then [21:58]
<mattoliver> oh yeah! [21:58]
<mattoliver> well once I get this working, I'll add that to otel tracing ;) [21:58]
<timburke_> 👍 [21:58]
<timburke_> thank you all for coming, and thank you for working on swift! [21:58]
<timburke_> #endmeeting [21:58]
<opendevmeet> Meeting ended Wed Mar 12 21:58:51 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) [21:58]
<opendevmeet> Minutes:        https://meetings.opendev.org/meetings/swift/2025/swift.2025-03-12-21.00.html [21:58]
<opendevmeet> Minutes (text): https://meetings.opendev.org/meetings/swift/2025/swift.2025-03-12-21.00.txt [21:58]
<opendevmeet> Log:            https://meetings.opendev.org/meetings/swift/2025/swift.2025-03-12-21.00.log.html [21:58]