opendevreview | Merged openstack/swift master: common: move memcached exceptions to the base file. https://review.opendev.org/c/openstack/swift/+/923315 | 05:51 |
---|---|---|
acoles | timburke: apologies, I can't make today's meeting | 16:13 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 17:53 |
opendevreview | Jianjian Huo proposed openstack/swift master: proxy: use cooperative tokens to coalesce updating shard range requests into backend https://review.opendev.org/c/openstack/swift/+/908969 | 20:39 |
timburke | almost meeting time, but i'm not sure there's going to be many people around for it | 20:55 |
timburke | #startmeeting swift | 21:00 |
opendevmeet | Meeting started Wed Jul 3 21:00:08 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:00 |
opendevmeet | The meeting name has been set to 'swift' | 21:00 |
timburke | who's here for the swift meeting? | 21:00 |
timburke | i know acoles and mattoliver are out | 21:00 |
fulecorafa | Im here | 21:01 |
timburke | o/ | 21:01 |
fulecorafa | o/ | 21:01 |
timburke | well, it'll do, even if it's just the two of us :-) | 21:03 |
timburke | as usual, the agenda's at | 21:03 |
timburke | #link https://wiki.openstack.org/wiki/Meetings/Swift | 21:03 |
timburke | first up | 21:03 |
timburke | #topic published docker images | 21:04 |
timburke | they were busted -- now they're fixed! | 21:04 |
fulecorafa | That's great news! | 21:04 |
timburke | they're also now py3-only! | 21:04 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/896450 | 21:04 |
patch-bot | patch 896450 - swift - Get rid of py2 docker image builds; switch "latest... (MERGED) - 3 patch sets | 21:04 |
timburke | we probably should have changed the "latest" tag to be py3 a while ago, so i think it's all good | 21:05 |
timburke | i still need to look into getting a gate job to run functional tests using the built container, though -- but at least the immediate problem of publishing known-broken images is resolved | 21:07 |
timburke | unfortunately, i realized we'd been doing that for months and months :-( | 21:07 |
fulecorafa | We've been taking a look at automating some tests, but they're focused on s3api | 21:08 |
timburke | going back to september at least | 21:08 |
fulecorafa | Maybe we could start messing around with getting the tests to run in the container as well | 21:08 |
timburke | #link https://bugs.launchpad.net/swift/+bug/2037268 | 21:08 |
patch-bot | Bug #2037268 - Docker's SAIO doesn't work (Fix Released) | 21:08 |
timburke | fulecorafa, that'd be great, thanks for thinking of it! iirc, the docker image includes s3api support, so that could work well -- i'll double check | 21:09 |
fulecorafa | It does include, I'm sure of it | 21:09 |
fulecorafa | But the testing we've been doing with the image has been more of a manual effort, i.e. a cli client which makes a bunch of calls to a running instance | 21:10 |
timburke | the main thing i need to figure out is how to get the zuul plumbing sorted; i think clarkb has previously given me some pointers to help get me started, just need to go back through some notes | 21:11 |
timburke | ah, makes sense | 21:11 |
fulecorafa | I imagine the ideal scenario would be to automatically spin up a docker instance and run the tests together | 21:11 |
clarkb | timburke: ya zuul has a bunch of existing base job stuff you can inherit from and supply secrets to and it does a lot of the container build and publishing work for you | 21:12 |
clarkb | timburke: opendev/system-config has a lot of examples as does zuul/zuul or zuul/nodepool | 21:12 |
timburke | my thought is basically to have a job that starts up the just-built container, then runs `pytest test/functional` with an appropriate config file to point it at the container | 21:12 |
clarkb | probably the zuul stuff is closer to what you're doing since it's also python and a smaller number of images. Look at the zuul image builds and the zuul quickstart job and how they interact | 21:12 |
timburke | yeah -- probably what i really ought to do is extend the existing build job to include validation testing | 21:13 |
fulecorafa | That would be great as well. I did try to run the tests on the image before, but I just couldn't figure out the python dependencies. I'll try again later with this new fix | 21:13 |
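(For reference, a rough local equivalent of the job being discussed. The image name, port, and credentials below are assumptions based on the usual SAIO defaults, not the eventual Zuul job definition.)

```sh
# Start the just-built SAIO container (image name and port are assumed)
docker run -d --name swift-saio -p 8080:8080 openstackswift/saio

# Swift's functional tests read their target cluster from the file
# named by SWIFT_TEST_CONFIG_FILE
cat > /tmp/test.conf <<'EOF'
[func_test]
auth_host = 127.0.0.1
auth_port = 8080
auth_ssl = no
auth_prefix = /auth/
account = test
username = tester
password = testing
EOF

SWIFT_TEST_CONFIG_FILE=/tmp/test.conf pytest test/functional
```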
timburke | next up | 21:14 |
timburke | #topic account-reaper and sharded containers | 21:14 |
timburke | i still haven't gotten to writing a probe test. but it's still on my list! | 21:14 |
timburke | hopefully i can get something together for next week | 21:15 |
zaitcev | thanks for remembering | 21:15 |
timburke | next up... | 21:15 |
timburke | #topic cooperative tokens | 21:16 |
jianjian | ah, I joined at the right moment | 21:16 |
timburke | you did! | 21:16 |
timburke | i was hoping jianjian might offer an overview of some of the work here, since i've seen a decent bit of activity/interest lately | 21:17 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/890174 | 21:17 |
patch-bot | patch 890174 - swift - common: add memcached based cooperative token mech... - 32 patch sets | 21:17 |
zaitcev | I have a fundamental question: do we actually need these | 21:17 |
zaitcev | in light of the Fernet token review | 21:17 |
timburke | and | 21:18 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/908969 | 21:18 |
zaitcev | Soon there will be no memcached, no McRouter, no trouble ... right? | 21:18 |
patch-bot | patch 908969 - swift - proxy: use cooperative tokens to coalesce updating... - 23 patch sets | 21:18 |
jianjian | zaitcev, other than Fernet tokens, swift has other use cases which rely on memcached | 21:18 |
timburke | zaitcev, so memcache will still be a thing -- these are mostly around account/container info and (especially) shard range caching | 21:19 |
zaitcev | jianjian: yeah. But those aren't cooperative token use cases. | 21:19 |
timburke | (thanks for reviewing the fernet token patch, though!) | 21:19 |
zaitcev | Did I review it? I remember looking at it, but... | 21:19 |
zaitcev | (having a biden moment) | 21:19 |
timburke | p 861271 has your +2 and everything :-) | 21:20 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/861271 - swift - tempauth: Support fernet tokens - 6 patch sets | 21:20 |
zaitcev | So anyway, I'm not against the cooperative tokens idea, I think they are pretty clever. | 21:20 |
jianjian | from our production experience, shard range caching is our main target use case. we see a lot of shard range cache misses and the associated thundering herd problems. | 21:20 |
zaitcev | I see. | 21:20 |
jianjian | and when the shard range cache misses, thousands of requests would go to the same containers (3 replicas) and overload those containers, because a shard range GET is a very expensive and slow operation. | 21:22 |
jianjian | and also some of those requests would eventually get the shard ranges and start to write into memcache at the same time, causing memcache to fail as well | 21:23 |
jianjian | but think about it, all of those tens of thousands of requests are asking for the same thing! we only need one of them | 21:24 |
jianjian | that's the basic idea of the cooperative token. on top of that, we allow a few requests to get a token and go to the backend, in case any single one of them fails to do so. | 21:25 |
zaitcev | I see what I misunderstood. You aren't talking about authentication tokens at all, but tokens that you circulate in memcached like in a TokenRing. | 21:26 |
timburke | yeah, something like a semaphore i think | 21:27 |
zaitcev | Way to go off half-cocked. But thanks for the additional explanation. | 21:27 |
jianjian | no, it's not for authentication. | 21:27 |
jianjian | timburke, that's right | 21:28 |
jianjian | testing on the staging cluster works well; we are going to enable it in production, let's see. | 21:28 |
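(A minimal sketch of the coalescing idea described above, assuming a generic memcache client with an atomic add. The real patch is more elaborate -- for instance it hands out a small number of tokens rather than exactly one -- and all names here are illustrative.)

```python
import time


def get_shard_ranges_cooperatively(memcache, cache_key, fetch_from_backend,
                                   token_ttl=10, wait=0.1, retries=20):
    """Coalesce concurrent cache-miss fetches behind a memcached token.

    Hypothetical helper: only the client that wins the token goes to
    the container server; everyone else polls the cache.
    """
    cached = memcache.get(cache_key)
    if cached is not None:
        return cached
    # memcached's add() is atomic: it only succeeds if the key doesn't
    # exist yet, so exactly one client wins on a cache miss
    token_key = cache_key + '.token'
    if memcache.add(token_key, 'held', time=token_ttl):
        try:
            value = fetch_from_backend()
            memcache.set(cache_key, value)
            return value
        finally:
            memcache.delete(token_key)
    # losers wait for the winner to populate the cache, falling back
    # to the backend if it never shows up
    for _ in range(retries):
        time.sleep(wait)
        cached = memcache.get(cache_key)
        if cached is not None:
            return cached
    return fetch_from_backend()
```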
timburke | we've got a few processes that might be interested in shard ranges -- is this work only improving proxy-server handling? would mattoliver's p 874721 be able to benefit, too? | 21:29 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/874721 - swift - updater: add memcache shard update lookup support - 5 patch sets | 21:29 |
timburke | how are you looking at measuring the improvement? | 21:29 |
jianjian | just noticed this patch, will take a look at it | 21:30 |
jianjian | my goal is to not see the container server 503 storms any more; not sure if it will improve anything that front-end users will see | 21:31 |
timburke | it's more than a year out of date, so i wouldn't worry *too much* about it -- i was just curious if you could see other processes which need shard ranges (such as the object-updater) using the same API | 21:32 |
jianjian | if those container servers aren't overloaded, users would see fewer 503 errors | 21:32 |
timburke | or rather, that *could benefit from* shard ranges -- updater doesn't currently fetch shard ranges, just accepts redirects | 21:32 |
jianjian | good point, I will take a look. thanks! timburke | 21:32 |
jianjian | cccccbkvbdvfndecuccvrufvdvlccbretkdtnufnnrvr | 21:32 |
jianjian | oh, no, sorry my usb device | 21:33 |
zaitcev | My keyboard just repeats a key. | 21:34 |
timburke | but either way, reducing the load from proxies will be great, and getting some measurable improvements from prod is always a vote of confidence :-) i'll try to review the chain, though it looks like acoles has been helping too | 21:34 |
timburke | next up | 21:35 |
timburke | #topic py312 and slow process start-up | 21:35 |
timburke | this was something i noticed when trying to run tests locally ahead of declaring py312 support in p 917878 | 21:36 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/917878 - swift - Test under py312 (MERGED) - 5 patch sets | 21:36 |
timburke | (that only added automated unit testing under py312; func and probe tests i performed locally) | 21:36 |
timburke | func tests were fine, but probe tests took *forever* | 21:37 |
timburke | i eventually traced it down to warnings about pkg_resources being deprecated | 21:38 |
zaitcev | I think I recall fixing one... Only because watcher used it and I was obligated. | 21:39 |
timburke | the kind of funny thing is that setuptools (which was what was issuing the warning about pkg_resources being deprecated) was *also* the one writing code that would use pkg_resources! | 21:39 |
timburke | https://github.com/pypa/setuptools/blob/main/setuptools/script.tmpl | 21:40 |
jianjian | haha | 21:40 |
zaitcev | groan | 21:40 |
jianjian | I noticed our pipeline has "swift-tox-py312" now, so that's added by the 917878 patch? | 21:41 |
timburke | that only really came up with py312 because prior to that, python -m venv would include an older version of setuptools that didn't issue the warning | 21:41 |
timburke | jianjian, yup! | 21:41 |
jianjian | 👍 | 21:42 |
timburke | after a bit more digging, i figured out that we could change how we declare the bin scripts to get around it | 21:42 |
timburke | basically, instead of listing it in the "scripts" of the "files" section in setup.cfg, list it in "console_scripts" in "entry_points" | 21:43 |
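(In pbr setup.cfg terms, the change looks roughly like this; the object-server lines are just an example, and the exact module paths come from the patch chain.)

```ini
# Before: ship a bin/ script; setuptools wraps it with a template
# that imports pkg_resources at startup
[files]
scripts =
    bin/swift-object-server

# After: declare an entry point; pip generates a lightweight wrapper
# that just imports the module and calls main()
[entry_points]
console_scripts =
    swift-object-server = swift.obj.server:main
```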
zaitcev | Oh, so that was the reason | 21:44 |
zaitcev | Why didn't you write it in the changelog? It's not self-evident. I'd add a +2 right away. | 21:45 |
timburke | so i started doing the conversion. it's mostly fairly mechanical, but there are a few bin scripts that still have a lot of code in them that needs to get relocated | 21:45 |
timburke | i thought i did ok on that in p 918365 ! i guess not clear enough :-) | 21:45 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/918365 - swift - Use entry_points for server executables - 4 patch sets | 21:45 |
jianjian | nice! this brings us closer to running py312 in prod | 21:45 |
zaitcev | Oh, right. That'd add a lot of boilerplate. | 21:45 |
timburke | anyway, mattoliver at least took a look a couple months ago, recommending that we do it for everybody, so now i've added more patches stacked on top | 21:47 |
timburke | if anyone has time to review, i'd appreciate it | 21:47 |
timburke | next up | 21:47 |
timburke | #topic multi-policy containers | 21:48 |
timburke | i promised fulecorafa i'd get this on the agenda :-) | 21:48 |
fulecorafa | Thanks a lot :) | 21:48 |
timburke | i'm not sure what would be most useful for you, though. more discussion/brainstorming, maybe? | 21:49 |
fulecorafa | So, we've been working on this and we have a working prototype. Based on that idea from last meeting of making a +cold bucket automatically | 21:49 |
timburke | cool! | 21:50 |
zaitcev | Interesting! | 21:51 |
fulecorafa | We still have to make some adaptations to start thinking about a patch though. We've been working with some old version, which is what we use in production | 21:51 |
fulecorafa | Some other good news: we got it working with MPUs and versioning as well | 21:51 |
timburke | always best to start from what you're running in prod :-) | 21:51 |
timburke | nice! | 21:51 |
fulecorafa | One thing we are still pondering though | 21:52 |
fulecorafa | For multipart uploads, it kind of came for free with our modifications to the protocol if we have 4 buckets in the end: bkt; bkt+segments; bkt+cold and bkt+segments+cold | 21:53 |
timburke | that'd seem to make sense | 21:54 |
fulecorafa | We came to consider removing this need by adapting the manifest file to skip the link bkt -> bkt+cold (manifest) -> bkt+segments+cold (parts). So it would be just bkt (manifest) -> bkt+segments+cold (parts) | 21:55 |
fulecorafa | Is it worth the trouble? | 21:55 |
timburke | maybe? i think it'd probably come down to what kind of performance tradeoffs you can make -- the extra hop will necessarily increase time-to-first-byte | 21:57 |
fulecorafa | My personal opinion is that it is not worth it. It would largely complicate the code, and the increase in time would be small compared to the whole process | 21:58 |
fulecorafa | In our tests, we didn't notice much of a difference | 21:58 |
timburke | having the four buckets seems unavoidable -- bkt / bkt+cold for normal uploads, bkt+segments / bkt+segments+cold for part data | 22:01 |
timburke | i'm inclined to agree that having the extra hop is worth it if it means the code is easier to grok and maintain | 22:01 |
fulecorafa | Then we agree. Maybe this changes when we come around to patch upstream, but we can talk about it then | 22:02 |
timburke | cool, sounds great! | 22:02 |
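(To illustrate the direct-link layout that was agreed on: an SLO-style manifest stored in bkt could reference the cold parts directly. The segment paths, upload-id layout, and sizes here are invented for the example.)

```json
[
    {"path": "/bkt+segments+cold/obj/upload-id/1",
     "etag": "d41d8cd98f00b204e9800998ecf8427e",
     "size_bytes": 5242880},
    {"path": "/bkt+segments+cold/obj/upload-id/2",
     "etag": "d41d8cd98f00b204e9800998ecf8427e",
     "size_bytes": 5242880}
]
```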
timburke | all right, we're about at time | 22:02 |
timburke | thank you all for coming, and thank you for working on swift! | 22:02 |
timburke | #endmeeting | 22:02 |
opendevmeet | Meeting ended Wed Jul 3 22:02:44 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 22:02 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.html | 22:02 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.txt | 22:02 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.log.html | 22:02 |
fulecorafa | Thanks people, have a nice one! | 22:02 |