21:00:08 <timburke> #startmeeting swift
21:00:08 <opendevmeet> Meeting started Wed Jul 3 21:00:08 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:08 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:08 <opendevmeet> The meeting name has been set to 'swift'
21:00:15 <timburke> who's here for the swift meeting?
21:00:33 <timburke> i know acoles and mattoliver are out
21:01:01 <fulecorafa> I'm here
21:01:10 <timburke> o/
21:01:14 <fulecorafa> o/
21:03:38 <timburke> well, it'll do, even if it's just the two of us :-)
21:03:47 <timburke> as usual, the agenda's at
21:03:49 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:03:54 <timburke> first up
21:04:08 <timburke> #topic published docker images
21:04:21 <timburke> they were busted -- now they're fixed!
21:04:33 <fulecorafa> That's great news!
21:04:34 <timburke> they're also now py3-only!
21:04:45 <timburke> #link https://review.opendev.org/c/openstack/swift/+/896450
21:04:45 <patch-bot> patch 896450 - swift - Get rid of py2 docker image builds; switch "latest... (MERGED) - 3 patch sets
21:05:40 <timburke> we probably should have changed the "latest" tag to be py3 a while ago, so i think it's all good
21:07:06 <timburke> i still need to look into getting a gate job to run functional tests using the built container, though -- but at least the immediate problem of publishing known-broken images is resolved
21:07:41 <timburke> unfortunately, i realized we'd been doing that for months and months :-(
21:08:02 <fulecorafa> We've been taking a look at automating some tests, but they're focused on s3api
21:08:24 <timburke> going back to september at least
21:08:26 <fulecorafa> Maybe we could start messing around with getting the tests to run in the container as well
21:08:26 <timburke> #link https://bugs.launchpad.net/swift/+bug/2037268
21:08:27 <patch-bot> Bug #2037268 - Docker's SAIO doesn't work (Fix Released)
21:09:35 <timburke> fulecorafa, that'd be great, thanks for thinking of it! iirc, the docker image includes s3api support, so that could work well -- i'll double check
21:09:56 <fulecorafa> It does include it, I'm sure of it
21:10:50 <fulecorafa> But the testing we've been doing with the image has been more of a manual job, i.e. a cli client which makes a bunch of calls to a running instance
21:11:17 <timburke> the main thing i need to figure out is how to get the zuul plumbing sorted; i think clarkb has previously given me some pointers to help get me started, just need to go back through some notes
21:11:21 <timburke> ah, makes sense
21:11:49 <fulecorafa> I imagine the ideal scenario would be to automatically run a docker instance and the tests together
21:12:02 <clarkb> timburke: ya zuul has a bunch of existing base job stuff you can inherit from and supply secrets to, and it does a lot of the container build and publishing work for you
21:12:14 <clarkb> timburke: opendev/system-config has a lot of examples, as does zuul/zuul or zuul/nodepool
21:12:15 <timburke> my thought is basically to have a job that starts up the just-built container, then runs `pytest test/functional` with an appropriate config file to point it at the container
21:12:55 <clarkb> probably the zuul stuff is closer to what you're doing since it's also python and a smaller number of images. Look at the zuul image builds and the zuul quickstart job and how they interact
21:13:36 <timburke> yeah -- probably what i really ought to do is extend the existing build job to include validation testing
21:13:52 <fulecorafa> That should be great as well. I did try to run testing on the image before, but I just couldn't figure out the python dependencies. I'll try again later with this new fix
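The job timburke sketches above is, in outline, "start the image we just built, wait for it to answer, run the functional tests against it". A minimal Python sketch of that shape, assuming the published openstackswift/saio image and that the func tests are pointed at the container via the SWIFT_TEST_CONFIG_FILE environment variable they already honor; the tag, port, and config path are illustrative, not the actual gate job:

```python
import os
import subprocess
import time
import urllib.request

IMAGE = "openstackswift/saio:latest"   # illustrative tag
PROXY_URL = "http://127.0.0.1:8080"

# Start the just-built container, publishing the proxy port.
container_id = subprocess.check_output(
    ["docker", "run", "-d", "-p", "8080:8080", IMAGE], text=True).strip()
try:
    # Wait for the proxy inside the container to start answering.
    for _ in range(60):
        try:
            urllib.request.urlopen(PROXY_URL + "/info")
            break
        except OSError:
            time.sleep(1)
    else:
        raise RuntimeError("proxy never came up")
    # Run the functional tests against the running container, using a
    # test config that points at it (path is hypothetical).
    subprocess.check_call(
        ["pytest", "test/functional"],
        env=dict(os.environ, SWIFT_TEST_CONFIG_FILE="container-func.conf"))
finally:
    subprocess.run(["docker", "rm", "-f", container_id], check=False)
```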
21:14:22 <timburke> next up
21:14:28 <timburke> #topic account-reaper and sharded containers
21:14:50 <timburke> i still haven't gotten to writing a probe test. but it's still on my list!
21:15:07 <timburke> hopefully i can get something together for next week
21:15:12 <zaitcev> thanks for remembering
21:15:54 <timburke> next up...
21:16:00 <timburke> #topic cooperative tokens
21:16:31 <jianjian> ah, I joined at the right moment
21:16:51 <timburke> you did!
21:17:29 <timburke> i was hoping jianjian might offer an overview of some of the work here, since i've seen a decent bit of activity/interest lately
21:17:37 <timburke> #link https://review.opendev.org/c/openstack/swift/+/890174
21:17:38 <patch-bot> patch 890174 - swift - common: add memcached based cooperative token mech... - 32 patch sets
21:17:45 <zaitcev> I have a fundamental question: do we actually need these
21:17:53 <zaitcev> in light of the Fernet token review
21:18:10 <timburke> and
21:18:14 <timburke> #link https://review.opendev.org/c/openstack/swift/+/908969
21:18:14 <zaitcev> Soon there will be no memcached, no McRouter, no trouble ... right?
21:18:14 <patch-bot> patch 908969 - swift - proxy: use cooperative tokens to coalesce updating... - 23 patch sets
21:18:51 <jianjian> zaitcev, other than Fernet tokens, swift has other use cases which rely on memcached
21:19:05 <timburke> zaitcev, so memcache will still be a thing -- these are mostly around account/container info and (especially) shard range caching
21:19:17 <zaitcev> jianjian: yeah. But those aren't cooperative token use cases.
21:19:23 <timburke> (thanks for reviewing the fernet token patch, though!)
21:19:43 <zaitcev> Did I review it? I remember looking at it, but...
21:19:54 <zaitcev> (having a biden moment)
21:20:11 <timburke> p 861271 has your +2 and everything :-)
21:20:12 <patch-bot> https://review.opendev.org/c/openstack/swift/+/861271 - swift - tempauth: Support fernet tokens - 6 patch sets
21:20:22 <zaitcev> So anyway, I'm not against the cooperative tokens idea, I think they are pretty clever.
21:20:38 <jianjian> from our production experience, shard range caching is our main targeted use case. we see a lot of shard range cache misses and the associated thundering herd problems.
21:20:52 <zaitcev> I see.
21:22:07 <jianjian> and when the shard range cache misses, thousands of requests would go to the same containers (3 replicas) and overload those containers, because a shard range GET is a very expensive and slow operation.
21:23:47 <jianjian> and also some of those requests would get the shard ranges and start to write into memcache at the same time, causing memcache to fail as well
21:24:29 <jianjian> but think about it, all of those tens of thousands of requests are asking for the same thing! we only need one of them
21:25:40 <jianjian> that's the basic idea of cooperative tokens. on top of that, we allow a few requests to get a token and go to the backend, in case any single one fails to do so.
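The pattern jianjian describes boils down to letting the first few cache-missers through to the container server and making everyone else wait on the cache. A rough sketch of that shape, assuming a generic memcached client where add() succeeds only when the key is absent and incr() returns the new counter value -- an illustration of the idea, not the code from patch 890174:

```python
import time

TOKEN_LIMIT = 3      # how many requests may hit the backend at once
TOKEN_TTL = 10       # seconds before tokens from a failed winner expire
WAIT_INTERVAL = 0.1

def get_shard_ranges(memcache, cache_key, fetch_from_backend):
    cached = memcache.get(cache_key)
    if cached is not None:
        return cached
    # Cache miss: race for one of TOKEN_LIMIT tokens. add() only
    # succeeds for the very first caller; everyone else bumps a counter.
    token_key = cache_key + '-token'
    if memcache.add(token_key, 1, time=TOKEN_TTL):
        have_token = True
    else:
        have_token = memcache.incr(token_key) <= TOKEN_LIMIT
    if have_token:
        # One of the few winners: do the expensive container GET and
        # populate the cache for everybody else.
        result = fetch_from_backend()
        memcache.set(cache_key, result)
        return result
    # Loser: poll the cache until a winner fills it, falling back to
    # the backend so an unlucky request can't be stranded forever.
    deadline = time.time() + TOKEN_TTL
    while time.time() < deadline:
        time.sleep(WAIT_INTERVAL)
        cached = memcache.get(cache_key)
        if cached is not None:
            return cached
    return fetch_from_backend()
```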
21:26:37 <zaitcev> I see what I misunderstood. You aren't talking about authentication tokens at all, but tokens that you circulate in memcached like in a TokenRing.
21:27:37 <timburke> yeah, something like a semaphore i think
21:27:38 <zaitcev> Way to go off half-cocked. But thanks for the additional explanation.
21:27:47 <jianjian> no, it's not for authentication.
21:28:03 <jianjian> timburke, that's right
21:28:18 <jianjian> testing on the staging cluster works well, we are going to enable it in production, let's see.
21:29:05 <timburke> we've got a few processes that might be interested in shard ranges -- is this work only improving proxy-server handling? would mattoliver's p 874721 be able to benefit, too?
21:29:05 <patch-bot> https://review.opendev.org/c/openstack/swift/+/874721 - swift - updater: add memcache shard update lookup support - 5 patch sets
21:29:57 <timburke> how are you looking at measuring the improvement?
21:30:16 <jianjian> just noticed this patch, will take a look at it
21:31:29 <jianjian> my goal is to not see the container server 503 storms any more, not sure if it will improve anything front-end users will see
21:32:01 <timburke> it's more than a year out of date, so i wouldn't worry *too much* about it -- i was just curious if you could see other processes which need shard ranges (such as the object-updater) using the same API
21:32:27 <jianjian> if those container servers aren't overloaded, users would see fewer 503 errors
21:32:49 <timburke> or rather, that *could benefit from* shard ranges -- updater doesn't currently fetch shard ranges, just accepts redirects
21:32:49 <jianjian> good point, I will take a look. thanks! timburke
21:32:51 <jianjian> cccccbkvbdvfndecuccvrufvdvlccbretkdtnufnnrvr
21:33:13 <jianjian> oh, no, sorry, my usb device
21:34:20 <zaitcev> My keyboard just repeats a key.
21:34:29 <timburke> but either way, reducing the load from proxies will be great, and getting some measurable improvements from prod is always a vote of confidence :-) i'll try to review the chain, though it looks like acoles has been helping too
21:35:14 <timburke> next up
21:35:29 <timburke> #topic py312 and slow process start-up
21:36:20 <timburke> this was something i noticed when trying to run tests locally ahead of declaring py312 support in p 917878
21:36:20 <patch-bot> https://review.opendev.org/c/openstack/swift/+/917878 - swift - Test under py312 (MERGED) - 5 patch sets
21:36:53 <timburke> (that only added automated unit testing under py312; func and probe tests i performed locally)
21:37:12 <timburke> func tests were fine, but probe tests took *forever*
21:38:13 <timburke> i eventually traced it down to warnings about pkg_resources being deprecated
21:39:03 <zaitcev> I think I recall fixing one... Only because watcher used it and I was obligated.
21:39:37 <timburke> the kind of funny thing is that setuptools (which was what was issuing the warning about pkg_resources being deprecated) was *also* the one writing code that would use pkg_resources!
21:40:02 <timburke> https://github.com/pypa/setuptools/blob/main/setuptools/script.tmpl
21:40:02 <jianjian> haha
21:40:20 <zaitcev> groan
21:41:05 <jianjian> I noticed our pipeline has "swift-tox-py312" now, so was that added by the 917878 patch?
21:41:08 <timburke> that only really came up with py312 because prior to that, `python -m venv` would include an older version of setuptools that didn't issue the warning
21:41:16 <timburke> jianjian, yup!
21:42:04 <jianjian> 👍
21:42:24 <timburke> after a bit more digging, i figured out that we could change how we indicate there's this bin script to get around it
21:43:59 <timburke> basically, instead of listing it under "scripts" in the "files" section of setup.cfg, list it under "console_scripts" in "entry_points"
21:44:44 <zaitcev> Oh, so that was the reason
21:45:08 <zaitcev> Why didn't you write it in the changelog? It's not self-evident. I'd add a +2 right away.
21:45:12 <timburke> so i started doing the conversion. it's mostly fairly mechanical, but there are a few bin scripts that still have a lot of code in them that needs to get relocated
21:45:44 <timburke> i thought i did ok on that in p 918365! i guess not clear enough :-)
21:45:44 <patch-bot> https://review.opendev.org/c/openstack/swift/+/918365 - swift - Use entry_points for server executables - 4 patch sets
21:45:46 <jianjian> nice! this brings us closer to running py312 on prod
21:45:53 <zaitcev> Oh, right. That'd add a lot of boilerplate.
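Concretely, the setup.cfg change timburke describes looks something like the following; the module path is illustrative, since the real patches also relocate the bin-script code into the package so there is a main() to point at:

```ini
# Before: a bin script listed under pbr's [files] section. setuptools
# installs these via a wrapper generated from script.tmpl that imports
# pkg_resources, and the newer setuptools that py312 venvs ship emits a
# deprecation warning for that import at every process start.
[files]
scripts =
    bin/swift-object-server

# After: a console_scripts entry point. The generated wrapper is a
# plain import plus a call to main(), with no pkg_resources involved.
[entry_points]
console_scripts =
    swift-object-server = swift.obj.server:main
```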
21:47:26 <timburke> anyway, mattoliver at least took a look a couple of months ago, recommending that we do it for everybody, so now i've added more patches stacked on top
21:47:46 <timburke> if anyone has time to review, i'd appreciate it
21:47:54 <timburke> next up
21:48:06 <timburke> #topic multi-policy containers
21:48:25 <timburke> i promised fulecorafa i'd get this on the agenda :-)
21:48:36 <fulecorafa> Thanks a lot :)
21:49:23 <timburke> i'm not sure what would be most useful for you, though. more discussion/brainstorming, maybe?
21:49:57 <fulecorafa> So, we've been working on this and we have a working prototype, based on that idea from last meeting of making a +cold bucket automatically
21:50:23 <timburke> cool!
21:51:01 <zaitcev> Interesting!
21:51:07 <fulecorafa> We still have to make some adaptations to start thinking about a patch though. We've been working with an old version, which is what we use in production
21:51:54 <fulecorafa> Some other good news: we got it working with MPUs and versioning as well
21:51:54 <timburke> always best to start from what you're running in prod :-)
21:51:59 <timburke> nice!
21:52:56 <fulecorafa> One thing we are still pondering though
21:53:54 <fulecorafa> For multipart uploads, it kind of came for free with our modifications to the protocol if we have 4 buckets in the end: bkt, bkt+segments, bkt+cold, and bkt+segments+cold
21:54:35 <timburke> that'd seem to make sense
21:55:33 <fulecorafa> We came to consider possibly removing this need by adapting the manifest file to skip the linking bkt -> bkt+cold (manifest) -> bkt+segments+cold (parts). So it would be just bkt (manifest) -> bkt+segments+cold (parts)
21:55:48 <fulecorafa> Is it worth the trouble?
21:57:20 <timburke> maybe? i think it'd probably come down to what kind of performance tradeoffs you can make -- the extra hop will necessarily increase time-to-first-byte
21:58:26 <fulecorafa> My personal opinion is that it is not worth it. It would largely complicate the code, and the increase in time would be small compared to the whole process
21:58:48 <fulecorafa> In our tests, we didn't notice much of a difference
22:01:01 <timburke> having the four buckets seems unavoidable -- bkt / bkt+cold for normal uploads, bkt+segments / bkt+segments+cold for part data
22:01:04 <timburke> i'm inclined to agree that having the extra hop is worth it if it means the code is easier to grok and maintain
22:02:06 <fulecorafa> Then we agree. Maybe this changes when we come around to patch upstream, but we can talk about it then
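For readers skimming the log, the two linking schemes under discussion map out roughly as follows -- a sketch of the conversation above using the bucket names from the prototype, not code from the prototype itself:

```python
def containers_for(bucket):
    """The four containers behind one bucket in the prototype."""
    return {
        'hot': bucket,                               # bkt
        'hot_segments': bucket + '+segments',        # bkt+segments
        'cold': bucket + '+cold',                    # bkt+cold
        'cold_segments': bucket + '+segments+cold',  # bkt+segments+cold
    }

# Scheme the prototype keeps (two hops to reach cold MPU data):
#   bkt/obj -> bkt+cold/obj (manifest) -> bkt+segments+cold (parts)
# Scheme considered and dropped (manifest rewritten to skip a hop):
#   bkt/obj (manifest) -> bkt+segments+cold (parts)
# The extra hop costs a little time-to-first-byte but leaves manifests
# untouched and the code simpler, which is where the discussion landed.
```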
22:02:17 <timburke> cool, sounds great!
22:02:23 <timburke> all right, we're about at time
22:02:38 <timburke> thank you all for coming, and thank you for working on swift!
22:02:44 <timburke> #endmeeting