21:00:33 #startmeeting swift
21:00:34 Meeting started Wed Mar 12 21:00:33 2025 UTC and is due to finish in 60 minutes. The chair is timburke_. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:34 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:34 The meeting name has been set to 'swift'
21:00:44 who's here for the swift meeting?
21:00:51 o/
21:01:15 o/
21:01:46 sorry about missing last week; time just got away from me
21:02:06 on that note, i also haven't updated the agenda 😬
21:02:33 but we can still have some updates!
21:02:38 #topic PTG
21:02:45 o/
21:02:58 reminder that the ptg is less than a month away
21:03:21 woo! need to get topics filled out then!
21:03:29 #link https://etherpad.opendev.org/p/swift-ptg-flamingo
21:05:11 we also need to choose meeting times. this time i figured, rather than list each individual hour, let's just list the blocks and we'll pick one per day that seems to work fairly well. if you can't make the whole block, that's fine; but those tables got too wide before
21:05:16 https://framadate.org/LBfzMX2I4C1JL9cZ
21:05:20 #link https://framadate.org/LBfzMX2I4C1JL9cZ
21:05:46 kk
21:06:33 and if you haven't registered already, please do
21:06:35 got it
21:06:35 #link https://ptg.openinfra.dev/
21:06:50 i look forward to seeing you there!
21:06:59 #topic swift release
21:07:21 i merged the authors/changelog patch for 2.35.0
21:07:39 #link https://review.opendev.org/c/openstack/swift/+/943073
21:07:52 y'all did a lot of great work this past cycle!
21:08:11 i should really try to get more than one release out per cycle :-/
21:09:02 and there's a releases patch up to create the tag, though they're pushing for a major version bump
21:09:05 #link https://review.opendev.org/c/openstack/releases/+/943880
21:09:43 we'll see if i can convince them that actually very little has changed, despite our removing py2 gating
21:10:31 let us know if you need support by posting comments from us 😉
21:10:37 otherwise there'll be another quick patch to update the version in the changelog. i'd be inclined to go to 4.0.0 (to avoid confusion with the swift3 project)
21:11:07 :D
21:11:15 next up
21:11:23 #topic aws-chunked support
21:11:33 timburke_: thanks as always for all your work on the release etc
21:12:16 +1
21:12:38 our (nvidia's) users have started to get used to the fact that boto3 doesn't work with default configs -- which isn't really where we want to be
21:13:16 makes me think of something like "there's nothing more permanent than a temporary solution"
21:13:27 lol
21:15:03 so we've been working on getting the right balance between adding support for aws-chunked, performing all the validation that clients are requesting, but not necessarily taking on all the work required to add full additional-checksums support
21:15:34 latest attempt looks like
21:15:41 #link https://review.opendev.org/c/openstack/swift/+/944073
21:17:23 where the idea is just to validate that the bytes received match the checksum provided during PutObject and UploadPart -- but we won't write any of it down, and we won't validate them during CompleteMultipartUpload
21:18:24 acoles has already started to look further down the chain (thanks!) and i'll try to get some of his follow-ups squashed in soon
21:19:01 oh that's a good idea, so at least we're doing something with the given crc rather than just ignoring it.
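A rough illustration of the validate-but-don't-store idea described at 21:17:23. This is not the code in the linked patch. It assumes the client's claim arrives as an S3-style `x-amz-checksum-crc32` value, i.e. the base64 of the big-endian 4-byte CRC32 of the body. The function name `body_matches_crc32` is made up for this sketch, and the aws-chunked framing and trailer parsing are skipped entirely.

```python
import base64
import zlib


def body_matches_crc32(chunks, claimed_b64):
    """Return True if the CRC32 of the streamed body equals the client's claim.

    chunks: iterable of bytes objects, in the order they arrived.
    claimed_b64: base64 text of the 4-byte big-endian CRC32 (assumed format).
    """
    crc = 0
    for chunk in chunks:
        # Keep a running CRC so the whole body never has to be buffered.
        crc = zlib.crc32(chunk, crc)
    computed = base64.b64encode(crc.to_bytes(4, "big")).decode("ascii")
    # Nothing is persisted here: the result only decides accept vs. reject.
    return computed == claimed_b64


# Hypothetical upload: the body arrives in two pieces.
body = [b"hello ", b"world"]
good_claim = base64.b64encode(
    zlib.crc32(b"hello world").to_bytes(4, "big")).decode("ascii")
```

A mismatch would presumably become an error response to the client; what that response looks like is outside this sketch.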
21:21:11 I was thinking a bit more about that "not writing anything down" part... that will mean that if/when we do support including crcs in the object attributes API thing, then there'd forever be some objects that we can't report the values. I do NOT want to scope creep what we're doing, but we may want to consider how hard it would be to stuff the value into sysmeta so it could be retrieved by a HEAD in future.
21:22:18 acoles, but that's no different than if they'd uploaded the object without any checksums, right? i'd be hesitant to pretend to know what we want to write down until we're actually willing to return it
21:22:36 I'm not so worried about ListParts because that is transient, once an MPU completes we don't need the crc value
21:23:48 sure it's no different than an upload with no crc. But if client gave us a crc and we checked it then we might one day want to return it to the client via some API? or am I misunderstanding?
21:24:45 I should perhaps go and study the object attributes s3 api docs some more !
21:25:13 or even play with it
21:25:14 sure -- but as a client i wouldn't expect any of my uploads to get that retroactively -- once i see the feature's in, then i'd want it to work, but not for anything that predates it
21:26:34 I guess that's a reasonable position. I just thought I'd flag up ...like if it proves to be low hanging fruit (except my alter-ego will scream "even low hanging fruit requires more tests etc" ) ;-)
21:27:36 yeah -- and the full-object vs composite checksums thing makes me worry about what all we need to write down
21:28:51 oh yeah, _that_
21:29:12 ok, let's stay focussed!
21:29:18 somewhat related, i know clayg has noted some of the strange read-a-byte-then-return-an-error in this stuff -- we might want to revive https://review.opendev.org/c/openstack/swift/+/799866 sometime soon, make it so we can tell eventlet to close a connection without bothering to discard
21:30:53 #topic ring v2
21:31:41 there's been a bit of back and forth in IRC lately about what we can do with an old-primaries table
21:32:33 DHE heard about our plan to insert old primaries at the start of the handoff list for reads
21:33:15 and i'd previously been thinking about how we might want it for at least *some* writes as well (POSTs in particular)
21:35:41 DHE had a concern about how we might return out-dated data -- which is always a bit of a risk with Swift, but it does seem like the current plan would increase that risk
21:36:00 But also the ability to just grab an old hand off if it exists. To break deadlocks.
21:36:59 so now i had a thought about adding the old primary to the start of handoffs for reads (like we've been planning) but *also* add it to the set of *primaries* for writes, causing some over-replication
21:38:20 oh interesting, so we over provision new writes. If there is an old primary. And if there's not, same behaviour as today?
21:39:11 yup
21:39:23 it should avoid DHE's worry and reduce some unnecessary rsync traffic (where the old primary would sync data just so the remote could delete it), but at a risk of not digging out from full-disks fast enough
21:39:33 so we'd probably want to make it configurable
21:40:43 all right, that's all i've got on my mind
21:40:47 #topic open discussion
21:40:57 what else should we talk about this week?
21:41:27 I'll be absent the next week due to attendance of GTC.
21:44:16 we could try to meet up -- sorry, i think i dropped the ball on that last time you were in town
21:44:17 oh cool, GTC!
21:45:15 np, thx.
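Looking back at the ring v2 idea from 21:36:59 to 21:39:33, the node selection could be pictured roughly as below. This is only a toy model: nodes are plain strings, and `read_order`, `write_targets` and the `over_replicate` switch are hypothetical names, not Swift's ring API or an agreed design.

```python
def read_order(primaries, handoffs, old_primaries):
    """Nodes to try for a read: primaries, then old primaries, then handoffs."""
    extra = [n for n in old_primaries if n not in primaries]
    # Avoid listing an old primary twice if it is also a regular handoff.
    rest = [n for n in handoffs if n not in extra]
    return list(primaries) + extra + rest


def write_targets(primaries, old_primaries, over_replicate=True):
    """Nodes to write to; old primaries are added only when enabled."""
    if not over_replicate:
        # Configurable escape hatch, e.g. while digging out from full disks.
        return list(primaries)
    return list(primaries) + [n for n in old_primaries if n not in primaries]


# Partition currently on a, b, c; it used to live on d before a rebalance.
reads = read_order(["a", "b", "c"], ["x", "d", "y"], ["d"])
writes = write_targets(["a", "b", "c"], ["d"])
```

With no old primaries, both helpers return the same nodes as a plain primaries-then-handoffs scheme, which matches the "same behaviour as today" question in the log.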
21:45:44 I'm playing with adding the SO_TIMESTAMP socket option to swift so we can see how long a request was sitting on the accept queue. https://review.opendev.org/c/openstack/swift/+/944103
21:46:14 In my basic socket script it works all the time. I've put it into Swift and it works sometimes :(
21:47:45 curious that it should only be sometimes :-/ how does it fail?
21:47:55 Other times I get resource not available. Using eventlet / green sockets in my script all work. But still might be something eventlet. Still playing with it to figure out what I'm missing. There are some flags to force it to wait and to PEEK at the data. So trying some of those.
21:48:39 the socket.recvmsg() to get the ancdata just throws a [Errno 11] Resource temporarily unavailable
21:48:59 but other times it works and returns the ancdata.
21:49:21 And I'm just running the same request (swift stat) over and over.
21:49:41 Maybe something something eventlet pools and socket not available.
21:50:32 I thought maybe the socket didn't have data yet and is non-blocking.. but there is a MSG_WAITALL flag I can use but that doesn't seem to fix it.
21:50:39 curious. when you're trying it in your script, is it with greened sockets, or stdlib?
21:51:27 I started with stdlib, but moved to using eventlet.listen and greened sockets like we do in Swift. All good
21:51:42 huh. strange
21:52:03 But that's still a single "thread" so maybe I need to try firing off a bunch of threads, ie a pool
21:52:43 I did try a sleep(0) just in case I'd lose context (just trying anything atm).
21:53:29 The code I pushed last night just passes on the exception, but just put in a self.log_message(f"=== Exception {str(ex)} ===") and you'll see the error
21:53:46 i wonder if with the non-blocking sockets we could get to a point where accept could return, but not all the ancdata is available yet
21:54:06 yeah, true.
21:54:24 either way mostly good news, just need to figure out this.
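The "[Errno 11] Resource temporarily unavailable" above is what a non-blocking socket reports when recvmsg() is called before anything has arrived. The sketch below reproduces just that behaviour with plain stdlib sockets. It does not use eventlet or SO_TIMESTAMP, so it says nothing about whether this is the actual cause in the patch. `recvmsg_when_ready` is a hypothetical helper name.

```python
import errno
import select
import socket


def recvmsg_when_ready(sock, bufsize=1024, ancbufsize=256, timeout=1.0):
    """Wait until sock is readable, then call recvmsg() once."""
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        raise TimeoutError("socket never became readable")
    return sock.recvmsg(bufsize, ancbufsize)


# A connected pair stands in for the client and the accepted connection.
client, server = socket.socketpair()
server.setblocking(False)

# Nothing has been sent yet, so a non-blocking recvmsg() fails immediately.
try:
    server.recvmsg(1024, 256)
    first_errno = None
except BlockingIOError as ex:
    first_errno = ex.errno  # EAGAIN, reported as Errno 11 on Linux

# Once the peer has written something, waiting for readability succeeds.
client.sendall(b"HEAD /v1/AUTH_test HTTP/1.1\r\n")
data, ancdata, flags, addr = recvmsg_when_ready(server)

client.close()
server.close()
```

Under eventlet the equivalent of the select() call would be yielding to the hub until the socket is readable; how that interacts with ancillary data is exactly the open question in the log.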
21:55:10 Also I seemed to have issues using recvmsg after we call socket.makefile... but might recheck that because I did try a billion things last night, so could've been a red herring
21:55:49 Anyway, that's all I got for now.
21:56:11 hmm... i wonder if you could use that to back-date some spans for your otel work... would be interesting if we could see how long the request was sitting in the accept queue with that
21:58:11 all right, i think we can wrap up then
21:58:14 oh yeah!
21:58:28 well once I get this working, I'll add that to otel tracing ;)
21:58:43 👍
21:58:45 thank you all for coming, and thank you for working on swift!
21:58:51 #endmeeting