21:00:08 #startmeeting swift
21:00:08 Meeting started Wed Jul 3 21:00:08 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:08 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:08 The meeting name has been set to 'swift'
21:00:15 who's here for the swift meeting?
21:00:33 i know acoles and mattoliver are out
21:01:01 I'm here
21:01:10 o/
21:01:14 o/
21:03:38 well, it'll do, even if it's just the two of us :-)
21:03:47 as usual, the agenda's at
21:03:49 #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:54 first up
21:04:08 #topic published docker images
21:04:21 they were busted -- now they're fixed!
21:04:33 That's great news!
21:04:34 they're also now py3-only!
21:04:45 #link https://review.opendev.org/c/openstack/swift/+/896450
21:04:45 patch 896450 - swift - Get rid of py2 docker image builds; switch "latest... (MERGED) - 3 patch sets
21:05:40 we probably should have changed the "latest" tag to be py3 a while ago, so i think it's all good
21:07:06 i still need to look into getting a gate job to run functional tests using the built container, though -- but at least the immediate problem of publishing known-broken images is resolved
21:07:41 unfortunately, i realized we'd been doing that for months and months :-(
21:08:02 We've been taking a look at automating some tests, but they're focused on s3api
21:08:24 going back to september at least
21:08:26 Maybe we could start messing around with getting the tests to run in the container as well
21:08:26 #link https://bugs.launchpad.net/swift/+bug/2037268
21:08:27 Bug #2037268 - Docker's SAIO doesn't work (Fix Released)
21:09:35 fulecorafa, that'd be great, thanks for thinking of it! iirc, the docker image includes s3api support, so that could work well -- i'll double check
21:09:56 It does include it, I'm sure of it
21:10:50 But the testing we've been doing with the image has been more of a manual effort, i.e. a cli client which makes a bunch of calls to a running instance
21:11:17 the main thing i need to figure out is how to get the zuul plumbing sorted; i think clarkb has previously given me some pointers to help get me started, just need to go back through some notes
21:11:21 ah, makes sense
21:11:49 I imagine the ideal scenario would be to automatically run a docker instance and the tests together
21:12:02 timburke: ya zuul has a bunch of existing base job stuff you can inherit from and supply secrets to and it does a lot of the container build and publishing work for you
21:12:14 timburke: opendev/system-config has a lot of examples as does zuul/zuul or zuul/nodepool
21:12:15 my thought is basically to have a job that starts up the just-built container, then runs `pytest test/functional` with an appropriate config file to point it at the container
21:12:55 probably the zuul stuff is closer to what you're doing since it's also python and a smaller number of images. Look at the zuul image builds and the zuul quickstart job and how they interact
21:13:36 yeah -- probably what i really ought to do is extend the existing build job to include validation testing
21:13:52 That should be great as well. I did try to run the tests on the image before, but I just couldn't figure out the python dependencies. I'll try again later with this new fix
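A minimal sketch of the job idea discussed above: start the just-built container, wait for the proxy to answer, then point the functional tests at it. The image name openstackswift/saio, the healthcheck URL, and the ./test.conf path are assumptions for illustration rather than settled details from the meeting; in a real gate job the docker run/stop steps would be Zuul playbook tasks rather than a Python wrapper, but the shape is the same.

```python
"""Hypothetical harness: run swift's functional tests against a SAIO container.

Assumptions (not from the meeting): the image is tagged openstackswift/saio,
the proxy listens on 8080 with the healthcheck middleware enabled, and
./test.conf holds credentials matching the container's built-in SAIO users.
"""
import os
import subprocess
import sys
import time
import urllib.request


def wait_for_proxy(url, timeout=120):
    """Poll the proxy healthcheck until it answers or we give up."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass
        time.sleep(2)
    return False


def main():
    subprocess.run(
        ['docker', 'run', '-d', '--rm', '--name', 'swift-func',
         '-p', '8080:8080', 'openstackswift/saio'],
        check=True)
    try:
        if not wait_for_proxy('http://127.0.0.1:8080/healthcheck'):
            print('proxy never came up', file=sys.stderr)
            return 1
        # SWIFT_TEST_CONFIG_FILE is how test/functional finds its config
        return subprocess.run(
            ['pytest', 'test/functional'],
            env=dict(os.environ, SWIFT_TEST_CONFIG_FILE='./test.conf'),
        ).returncode
    finally:
        subprocess.run(['docker', 'stop', 'swift-func'], check=False)


if __name__ == '__main__':
    sys.exit(main())
```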
21:14:22 next up
21:14:28 #topic account-reaper and sharded containers
21:14:50 i still haven't gotten to writing a probe test. but it's still on my list!
21:15:07 hopefully i can get something together for next week
21:15:12 thanks for remembering
21:15:54 next up...
21:16:00 #topic cooperative tokens
21:16:31 ah, I joined at the right moment
21:16:51 you did!
21:17:29 i was hoping jianjian might offer an overview of some of the work here, since i've seen a decent bit of activity/interest lately
21:17:37 #link https://review.opendev.org/c/openstack/swift/+/890174
21:17:38 patch 890174 - swift - common: add memcached based cooperative token mech... - 32 patch sets
21:17:45 I have a fundamental question: do we actually need these
21:17:53 in light of the Fernet token review
21:18:10 and
21:18:14 #link https://review.opendev.org/c/openstack/swift/+/908969
21:18:14 Soon there will be no memcached, no McRouter, no trouble ... right?
21:18:14 patch 908969 - swift - proxy: use cooperative tokens to coalesce updating... - 23 patch sets
21:18:51 zaitcev, other than Fernet tokens, swift has other use cases which rely on memcached
21:19:05 zaitcev, so memcache will still be a thing -- these are mostly around account/container info and (especially) shard range caching
21:19:17 jianjian: yeah. But those aren't cooperative token use cases.
21:19:23 (thanks for reviewing the fernet token patch, though!)
21:19:43 Did I review it? I remember looking at it, but...
21:19:54 (having a biden moment)
21:20:11 p 861271 has your +2 and everything :-)
21:20:12 https://review.opendev.org/c/openstack/swift/+/861271 - swift - tempauth: Support fernet tokens - 6 patch sets
21:20:22 So anyway, I'm not against the cooperative tokens idea, I think they are pretty clever.
21:20:38 from our production experience, shard range caching is our main target use case. we see a lot of shard range cache misses and the associated thundering herd problems.
21:20:52 I see.
21:22:07 and when the shard range cache misses, thousands of requests would go to the same containers (3 replicas) and overload those containers, because a shard range GET is a very expensive and slow operation.
21:23:47 and also some of those requests would get shard ranges and start to write into memcache at the same time, causing memcache to fail as well
21:24:29 but think about it, all of those tens of thousands of requests are asking for the same thing! we only need one of them
21:25:40 that's the basic idea of the cooperative token. on top of that, we allow a few requests to get a token and go to the backend, in case any single one of them fails to do so.
21:26:37 I see what I misunderstood. You aren't talking about authentication tokens at all, but tokens that you circulate in memcached like in a TokenRing.
21:27:37 yeah, something like a semaphore i think
21:27:38 Way to go off half-cocked. But thanks for the additional explanation.
21:27:47 no, it's not for authentication.
21:28:03 timburke, that's right
21:28:18 testing on the staging cluster works well, we are going to enable it in production, let's see.
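Roughly the coalescing mechanism described above, as a sketch rather than the actual code in patch 890174: on a cache miss, a memcache counter hands out a limited number of "tokens"; only token holders hit the container server and repopulate the cache, while everyone else polls the cache briefly. The key names, limits, and timeouts below are illustrative, and the incr() semantics assume a Swift MemcacheRing-style client.

```python
import random
import time

# Sketch of the cooperative-token idea; not the implementation in patch
# 890174. Assumes an incr() that creates the key if missing and expires
# it after `time` seconds, as swift's MemcacheRing does.

TOKEN_TTL = 10        # seconds before the token counter expires
MAX_TOKENS = 3        # how many requests may hit the backend concurrently
WAIT_INTERVAL = 0.1   # how long non-winners sleep before re-checking


def get_shard_ranges(memcache, cache_key, fetch_from_backend, timeout=2.0):
    """Return cached shard ranges, coalescing backend fetches on a miss."""
    cached = memcache.get(cache_key)
    if cached is not None:
        return cached

    token_key = cache_key + '/token'
    # The first MAX_TOKENS callers "win" a token and are allowed to make
    # the expensive shard range GET against the container server.
    slot = memcache.incr(token_key, delta=1, time=TOKEN_TTL)
    if slot <= MAX_TOKENS:
        ranges = fetch_from_backend()
        memcache.set(cache_key, ranges)
        return ranges

    # Everyone else polls the cache instead of stampeding the backend.
    deadline = time.time() + timeout
    while time.time() < deadline:
        time.sleep(WAIT_INTERVAL + random.uniform(0, WAIT_INTERVAL))
        cached = memcache.get(cache_key)
        if cached is not None:
            return cached
    # Fall back to the backend if the winners never populated the cache.
    return fetch_from_backend()
```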
21:29:05 we've got a few processes that might be interested in shard ranges -- is this work only improving proxy-server handling? would mattoliver's p 874721 be able to benefit, too?
21:29:05 https://review.opendev.org/c/openstack/swift/+/874721 - swift - updater: add memcache shard update lookup support - 5 patch sets
21:29:57 how are you looking at measuring the improvement?
21:30:16 just noticed this patch, will take a look at it
21:31:29 my goal is to not see the container server 503 storms any more, not sure if it will improve anything that the front-end user will see
21:32:01 it's more than a year out of date, so i wouldn't worry *too much* about it -- i was just curious if you could see other processes which need shard ranges (such as the object-updater) using the same API
21:32:27 if those container servers aren't overloaded, users would see fewer 503 errors
21:32:49 or rather, that *could benefit from* shard ranges -- the updater doesn't currently fetch shard ranges, it just accepts redirects
21:32:49 good point, I will take a look. thanks! timburke
21:32:51 cccccbkvbdvfndecuccvrufvdvlccbretkdtnufnnrvr
21:33:13 oh, no, sorry, my usb device
21:34:20 My keyboard just repeats a key.
21:34:29 but either way, reducing the load from proxies will be great, and getting some measurable improvements from prod is always a vote of confidence :-) i'll try to review the chain, though it looks like acoles has been helping too
21:35:14 next up
21:35:29 #topic py312 and slow process start-up
21:36:20 this was something i noticed when trying to run tests locally ahead of declaring py312 support in p 917878
21:36:20 https://review.opendev.org/c/openstack/swift/+/917878 - swift - Test under py312 (MERGED) - 5 patch sets
21:36:53 (that only added automated unit testing under py312; func and probe tests i performed locally)
21:37:12 func tests were fine, but probe tests took *forever*
21:38:13 i eventually traced it down to warnings about pkg_resources being deprecated
21:39:03 I think I recall fixing one... Only because watcher used it and I was obligated.
21:39:37 the kind of funny thing is that setuptools (which was what was issuing the warning about pkg_resources being deprecated) was *also* the one writing code that would use pkg_resources!
21:40:02 https://github.com/pypa/setuptools/blob/main/setuptools/script.tmpl
21:40:02 haha
21:40:20 groan
21:41:05 I noticed our pipeline has "swift-tox-py312" now, so that's added by the 917878 patch?
21:41:08 that only really came up with py312 because prior to that, python -m venv would include an older version of setuptools that didn't issue the warning
21:41:16 jianjian, yup!
21:42:04 👍
21:42:24 after a bit more digging, i figured out that we could change how we indicate there's this bin script to get around it
21:43:59 basically, instead of listing it in the "scripts" of the "files" section in setup.cfg, list it in "console_scripts" in "entry_points"
21:44:44 Oh, so that was the reason
21:45:08 Why didn't you write it in the changelog? It's not self-evident. I'd add a +2 right away.
21:45:12 so i started doing the conversion. it's mostly fairly mechanical, but there are a few bin scripts that still have a lot of code in them that needs to get relocated
21:45:44 i thought i did ok on that in p 918365 ! i guess not clear enough :-)
21:45:44 https://review.opendev.org/c/openstack/swift/+/918365 - swift - Use entry_points for server executables - 4 patch sets
21:45:46 nice! this brings us closer to running py312 on prod
21:45:53 Oh, right. That'd add a lot of boilerplate.
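For reference, a sketch of what the conversion means for one bin script. In setup.cfg the executable moves from the scripts list under [files] to a console_scripts entry under [entry_points], so setuptools generates the wrapper itself instead of installing the script through the pkg_resources-importing script.tmpl above. The names swift-foo-tool and swift.cli.foo_tool are made up for illustration and are not from patch 918365; the boilerplate being groaned about is the importable main() the entry point has to target.

```python
# Hypothetical shape of a relocated bin script. Under the old approach the
# code below lived directly in bin/swift-foo-tool; with console_scripts the
# corresponding setup.cfg entry would look something like:
#
#   [entry_points]
#   console_scripts =
#       swift-foo-tool = swift.cli.foo_tool:main
#
# (swift-foo-tool / swift.cli.foo_tool are invented names for this sketch.)
import argparse
import sys


def main(args=None):
    parser = argparse.ArgumentParser(description='do the thing')
    parser.add_argument('target')
    options = parser.parse_args(args)
    print('would operate on %s' % options.target)
    return 0


if __name__ == '__main__':
    sys.exit(main())
```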
21:47:26 anyway, mattoliver at least took a look a couple months ago, recommending that we do it for everybody, so now i've added more patches stacked on top
21:47:46 if anyone has time to review, i'd appreciate it
21:47:54 next up
21:48:06 #topic multi-policy containers
21:48:25 i promised fulecorafa i'd get this on the agenda :-)
21:48:36 Thanks a lot :)
21:49:23 i'm not sure what would be most useful for you, though. more discussion/brainstorming, maybe?
21:49:57 So, we've been working on this and we have a working prototype, based on that idea from last meeting of making a +cold bucket automatically
21:50:23 cool!
21:51:01 Interesting!
21:51:07 We still have to make some adaptations to start thinking about a patch though. We've been working with an old version, which is what we use in production
21:51:54 Some other good news: we got it working with MPUs and versioning as well
21:51:54 always best to start from what you're running in prod :-)
21:51:59 nice!
21:52:56 One thing we are still pondering though
21:53:54 For multipart uploads, it kind of came for free with our modifications in the protocol if we have 4 buckets in the end: bkt; bkt+segments; bkt+cold and bkt+segments+cold
21:54:35 that'd seem to make sense
21:55:33 We came to consider possibly removing this need by adapting the manifest file to skip the linking bkt -> bkt+cold (manifest) -> bkt+cold+segments (parts). So it would be just bkt (manifest) -> bkt+segments+cold (parts)
21:55:48 Is it worth the trouble?
21:57:20 maybe? i think it'd probably come down to what kind of performance tradeoffs you can make -- the extra hop will necessarily increase time-to-first-byte
21:58:26 My personal opinion is that it is not worth it. It would largely complicate the code, and the increase in time would be small compared to the whole process
21:58:48 In our tests, we didn't notice much of a difference
22:01:01 having the four buckets seems unavoidable -- bkt / bkt+cold for normal uploads, bkt+segments / bkt+segments+cold for part data
22:01:04 i'm inclined to agree that having the extra hop is worth it if it means the code is easier to grok and maintain
22:02:06 Then we agree. Maybe this changes when we come around to patching upstream, but we can talk about it then
22:02:17 cool, sounds great!
22:02:23 all right, we're about at time
22:02:38 thank you all for coming, and thank you for working on swift!
22:02:44 #endmeeting