opendevreview | Merged openstack/swift master: common: move memcached exceptions to the base file. https://review.opendev.org/c/openstack/swift/+/923315 | 05:51 |
---|---|---|
acoles | timburke: apologies, I can't make today's meeting | 16:13 |
opendevreview | Jianjian Huo proposed openstack/swift master: common: add memcached based cooperative token mechanism. https://review.opendev.org/c/openstack/swift/+/890174 | 17:53 |
opendevreview | Jianjian Huo proposed openstack/swift master: proxy: use cooperative tokens to coalesce updating shard range requests into backend https://review.opendev.org/c/openstack/swift/+/908969 | 20:39 |
timburke | almost meeting time, but i'm not sure there's going to be many people around for it | 20:55 |
timburke | #startmeeting swift | 21:00 |
opendevmeet | Meeting started Wed Jul 3 21:00:08 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:00 |
opendevmeet | The meeting name has been set to 'swift' | 21:00 |
timburke | who's here for the swift meeting? | 21:00 |
timburke | i know acoles and mattoliver are out | 21:00 |
fulecorafa | Im here | 21:01 |
timburke | o/ | 21:01 |
fulecorafa | o/ | 21:01 |
timburke | well, it'll do, even if it's just the two of us :-) | 21:03 |
timburke | as usual, the agenda's at | 21:03 |
timburke | #link https://wiki.openstack.org/wiki/Meetings/Swift | 21:03 |
timburke | first up | 21:03 |
timburke | #topic published docker images | 21:04 |
timburke | they were busted -- now they're fixed! | 21:04 |
fulecorafa | That's great news! | 21:04 |
timburke | they're also now py3-only! | 21:04 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/896450 | 21:04 |
patch-bot | patch 896450 - swift - Get rid of py2 docker image builds; switch "latest... (MERGED) - 3 patch sets | 21:04 |
timburke | we probably should have changed the "latest" tag to be py3 a while ago, so i think it's all good | 21:05 |
timburke | i still need to look into getting a gate job to run functional tests using the built container, though -- but at least the immediate problem of publishing known-broken images is resolved | 21:07 |
timburke | unfortunately, i realized we'd been doing that for months and months :-( | 21:07 |
fulecorafa | We've been taking a look at automating some tests, but they're focused on s3api | 21:08 |
timburke | going back to september at least | 21:08 |
fulecorafa | Maybe we could start messing around with getting the tests to run in the container as well | 21:08 |
timburke | #link https://bugs.launchpad.net/swift/+bug/2037268 | 21:08 |
patch-bot | Bug #2037268 - Docker's SAIO doesn't work (Fix Released) | 21:08 |
timburke | fulecorafa, that'd be great, thanks for thinking of it! iirc, the docker image includes s3api support, so that could work well -- i'll double check | 21:09 |
fulecorafa | It does include, I'm sure of it | 21:09 |
fulecorafa | But the testing we've been doing with the image has been more of a manual effort, i.e. a cli client which makes a bunch of calls to a running instance | 21:10 |
timburke | the main thing i need to figure out is how to get the zuul plumbing sorted; i think clarkb has previously given me some pointers to help get me started, just need to go back through some notes | 21:11 |
timburke | ah, makes sense | 21:11 |
fulecorafa | I imagine the ideal scenario would be to automatically spin up a docker instance and run the tests together | 21:11 |
clarkb | timburke: ya zuul has a bunch of existing base job stuff you can inherit from and supply secrets to and it does a lot of the container build and publishing work for you | 21:12 |
clarkb | timburke: opendev/system-config has a lot of examples as does zuul/zuul or zuul/nodepool | 21:12 |
timburke | my thought is basically to have a job that starts up the just-built container, then runs `pytest test/functional` with an appropriate config file to point it at the container | 21:12 |
clarkb | probably the zuul stuff is closer to what you're doing since it's also python and a smaller number of images. Look at the zuul image builds and the zuul quickstart job and how they interact | 21:12 |
timburke | yeah -- probably what i really ought to do is extend the existing build job to include validation testing | 21:13 |
fulecorafa | That would be great as well. I did try to run the tests on the image before, but I just couldn't figure out the python dependencies. I'll try again later with this new fix | 21:13 |
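(For reference, a rough local equivalent of the job being discussed. The image name, port, and credentials below are assumptions based on the usual SAIO defaults, not the eventual Zuul job definition.)

```sh
# Start the just-built SAIO container (image name and port are assumed)
docker run -d --name swift-saio -p 8080:8080 openstackswift/saio

# Swift's functional tests read their target cluster from the file
# named by SWIFT_TEST_CONFIG_FILE
cat > /tmp/test.conf <<'EOF'
[func_test]
auth_host = 127.0.0.1
auth_port = 8080
auth_ssl = no
auth_prefix = /auth/
account = test
username = tester
password = testing
EOF

SWIFT_TEST_CONFIG_FILE=/tmp/test.conf pytest test/functional
```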
timburke | next up | 21:14 |
timburke | #topic account-reaper and sharded containers | 21:14 |
timburke | i still haven't gotten to writing a probe test. but it's still on my list! | 21:14 |
timburke | hopefully i can get something together for next week | 21:15 |
zaitcev | thanks for remembering | 21:15 |
timburke | next up... | 21:15 |
timburke | #topic cooperative tokens | 21:16 |
jianjian | ah, I joined at the right moment | 21:16 |
timburke | you did! | 21:16 |
timburke | i was hoping jianjian might offer an overview of some of the work here, since i've seen a decent bit of activity/interest lately | 21:17 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/890174 | 21:17 |
patch-bot | patch 890174 - swift - common: add memcached based cooperative token mech... - 32 patch sets | 21:17 |
zaitcev | I have a fundamental question: do we actually need these | 21:17 |
zaitcev | in light of the Fernet token review | 21:17 |
timburke | and | 21:18 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/908969 | 21:18 |
zaitcev | Soon there will be no memcached, no McRouter, no trouble ... right? | 21:18 |
patch-bot | patch 908969 - swift - proxy: use cooperative tokens to coalesce updating... - 23 patch sets | 21:18 |
jianjian | zaitcev, other than Fernet tokens, swift has other use cases which rely on memcached | 21:18 |
timburke | zaitcev, so memcache will still be a thing -- these are mostly around account/container info and (especially) shard range caching | 21:19 |
zaitcev | jianjian: yeah. But those aren't cooperative token use cases. | 21:19 |
timburke | (thanks for reviewing the fernet token patch, though!) | 21:19 |
zaitcev | Did I review it? I remember looking at it, but... | 21:19 |
zaitcev | (having a biden moment) | 21:19 |
timburke | p 861271 has your +2 and everything :-) | 21:20 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/861271 - swift - tempauth: Support fernet tokens - 6 patch sets | 21:20 |
zaitcev | So anyway, I'm not against the cooperative tokens idea, I think they are pretty clever. | 21:20 |
jianjian | from our production experience, shard range caching is our main target use case. we see a lot of shard range cache misses and the associated thundering herd problems. | 21:20 |
zaitcev | I see. | 21:20 |
jianjian | and when the shard range cache misses, thousands of requests would go to the same containers (3 replicas) and overload those containers, because a shard range GET is a very expensive and slow operation. | 21:22 |
jianjian | and also some of those requests would eventually get the shard ranges and start to write into memcache at the same time, causing memcache to fail as well | 21:23 |
jianjian | but think about it, all of those tens of thousands of requests are asking for the same thing! we only need one of them | 21:24 |
jianjian | that's the basic idea of the cooperative token. on top of that, we allow a few requests to get a token and go to the backend, in case any single one of them fails to do so. | 21:25 |
zaitcev | I see what I misunderstood. You aren't talking about authentication tokens at all, but tokens that you circulate in memcached like in a TokenRing. | 21:26 |
timburke | yeah, something like a semaphore i think | 21:27 |
zaitcev | Way to go off half-cocked. But thanks for the additional explanation. | 21:27 |
jianjian | no, it's not for authentication. | 21:27 |
jianjian | timburke, that's right | 21:28 |
jianjian | testing on the staging cluster works well; we are going to enable it in production, let's see. | 21:28 |
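(A minimal sketch of the coalescing idea described above, assuming a generic memcache client with an atomic add. The real patch is more elaborate -- for instance it hands out a small number of tokens rather than exactly one -- and all names here are illustrative.)

```python
import time


def get_shard_ranges_cooperatively(memcache, cache_key, fetch_from_backend,
                                   token_ttl=10, wait=0.1, retries=20):
    """Coalesce concurrent cache-miss fetches behind a memcached token.

    Hypothetical helper: only the client that wins the token goes to
    the container server; everyone else polls the cache.
    """
    cached = memcache.get(cache_key)
    if cached is not None:
        return cached
    # memcached's add() is atomic: it only succeeds if the key doesn't
    # exist yet, so exactly one client wins on a cache miss
    token_key = cache_key + '.token'
    if memcache.add(token_key, 'held', time=token_ttl):
        try:
            value = fetch_from_backend()
            memcache.set(cache_key, value)
            return value
        finally:
            memcache.delete(token_key)
    # losers wait for the winner to populate the cache, falling back
    # to the backend if it never shows up
    for _ in range(retries):
        time.sleep(wait)
        cached = memcache.get(cache_key)
        if cached is not None:
            return cached
    return fetch_from_backend()
```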
timburke | we've got a few processes that might be interested in shard ranges -- is this work only improving proxy-server handling? would mattoliver's p 874721 be able to benefit, too? | 21:29 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/874721 - swift - updater: add memcache shard update lookup support - 5 patch sets | 21:29 |
timburke | how are you looking at measuring the improvement? | 21:29 |
jianjian | just noticed this patch, will take a look at it | 21:30 |
jianjian | my goal is to not see the container server 503 storms any more; not sure if it will improve anything that front-end users will see | 21:31 |
timburke | it's more than a year out of date, so i wouldn't worry *too much* about it -- i was just curious if you could see other processes which need shard ranges (such as the object-updater) using the same API | 21:32 |
jianjian | if those container servers aren't overloaded, users would see fewer 503 errors | 21:32 |
timburke | or rather, that *could benefit from* shard ranges -- updater doesn't currently fetch shard ranges, just accepts redirects | 21:32 |
jianjian | good point, I will take a look. thanks! timburke | 21:32 |
jianjian | cccccbkvbdvfndecuccvrufvdvlccbretkdtnufnnrvr | 21:32 |
jianjian | oh, no, sorry my usb device | 21:33 |
zaitcev | My keyboard just repeats a key. | 21:34 |
timburke | but either way, reducing the load from proxies will be great, and getting some measurable improvements from prod is always a vote of confidence :-) i'll try to review the chain, though it looks like acoles has been helping too | 21:34 |
timburke | next up | 21:35 |
timburke | #topic py312 and slow process start-up | 21:35 |
timburke | this was something i noticed when trying to run tests locally ahead of declaring py312 support in p 917878 | 21:36 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/917878 - swift - Test under py312 (MERGED) - 5 patch sets | 21:36 |
timburke | (that only added automated unit testing under py312; func and probe tests i performed locally) | 21:36 |
timburke | func tests were fine, but probe tests took *forever* | 21:37 |
timburke | i eventually traced it down to warnings about pkg_resources being deprecated | 21:38 |
zaitcev | I think I recall fixing one... Only because watcher used it and I was obligated. | 21:39 |
timburke | the kind of funny thing is that setuptools (which was what was issuing the warning about pkg_resources being deprecated) was *also* the one writing code that would use pkg_resources! | 21:39 |
timburke | https://github.com/pypa/setuptools/blob/main/setuptools/script.tmpl | 21:40 |
jianjian | haha | 21:40 |
zaitcev | groan | 21:40 |
jianjian | I noticed our pipeline has "swift-tox-py312" now, so that's added by the 917878 patch? | 21:41 |
timburke | that only really came up with py312 because prior to that, python -m venv would include an older version of setuptools that didn't issue the warning | 21:41 |
timburke | jianjian, yup! | 21:41 |
jianjian | 👍 | 21:42 |
timburke | after a bit more digging, i figured out that we could change how we declare the bin scripts to get around it | 21:42 |
timburke | basically, instead of listing it in the "scripts" of the "files" section in setup.cfg, list it in "console_scripts" in "entry_points" | 21:43 |
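(In pbr setup.cfg terms, the change looks roughly like this; the object-server lines are just an example, and the exact module paths come from the patch chain.)

```ini
# Before: ship a bin/ script; setuptools wraps it with a template
# that imports pkg_resources at startup
[files]
scripts =
    bin/swift-object-server

# After: declare an entry point; pip generates a lightweight wrapper
# that just imports the module and calls main()
[entry_points]
console_scripts =
    swift-object-server = swift.obj.server:main
```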
zaitcev | Oh, so that was the reason | 21:44 |
zaitcev | Why didn't you write it in the changelog? It's not self-evident. I'd add a +2 right away. | 21:45 |
timburke | so i started doing the conversion. it's mostly fairly mechanical, but there are a few bin scripts that still have a lot of code in them that needs to get relocated | 21:45 |
timburke | i thought i did ok on that in p 918365 ! i guess not clear enough :-) | 21:45 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/918365 - swift - Use entry_points for server executables - 4 patch sets | 21:45 |
jianjian | nice! this brings us closer to running py312 in prod | 21:45 |
zaitcev | Oh, right. That'd add a lot of boilerplate. | 21:45 |
timburke | anyway, mattoliver at least took a look a couple months ago, recommending that we do it for everybody, so now i've added more patches stacked on top | 21:47 |
timburke | if anyone has time to review, i'd appreciate it | 21:47 |
timburke | next up | 21:47 |
timburke | #topic multi-policy containers | 21:48 |
timburke | i promised fulecorafa i'd get this on the agenda :-) | 21:48 |
fulecorafa | Thanks a lot :) | 21:48 |
timburke | i'm not sure what would be most useful for you, though. more discussion/brainstorming, maybe? | 21:49 |
fulecorafa | So, we've been working on this and we have a working prototype. Based on that idea from last meeting of making a +cold bucket automatically | 21:49 |
timburke | cool! | 21:50 |
zaitcev | Interesting! | 21:51 |
fulecorafa | We still have to make some adaptations to start thinking about a patch though. We've been working with some old version, which is what we use in production | 21:51 |
fulecorafa | Some other good news: we got it working with MPUs and versioning as well | 21:51 |
timburke | always best to start from what you're running in prod :-) | 21:51 |
timburke | nice! | 21:51 |
fulecorafa | One thing we are still pondering though | 21:52 |
fulecorafa | For multipart uploads, it kind of came for free with our modifications to the protocol if we have 4 buckets in the end: bkt; bkt+segments; bkt+cold and bkt+segments+cold | 21:53 |
timburke | that'd seem to make sense | 21:54 |
fulecorafa | We came to consider removing this need by adapting the manifest file to skip the link bkt -> bkt+cold (manifest) -> bkt+segments+cold (parts). So it would be just bkt (manifest) -> bkt+segments+cold (parts) | 21:55 |
fulecorafa | Is it worth the trouble? | 21:55 |
timburke | maybe? i think it'd probably come down to what kind of performance tradeoffs you can make -- the extra hop will necessarily increase time-to-first-byte | 21:57 |
fulecorafa | My personal opinion is that it is not worth it. It would largely complicate the code, and the increase in time would be small compared to the whole process | 21:58 |
fulecorafa | In our tests, we didn't notice much of a difference | 21:58 |
timburke | having the four buckets seems unavoidable -- bkt / bkt+cold for normal uploads, bkt+segments / bkt+segments+cold for part data | 22:01 |
timburke | i'm inclined to agree that having the extra hop is worth it if it means the code is easier to grok and maintain | 22:01 |
fulecorafa | Then we agree. Maybe this changes when we come around to patch upstream, but we can talk about it then | 22:02 |
timburke | cool, sounds great! | 22:02 |
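(To illustrate the direct-link layout that was agreed on: an SLO-style manifest stored in bkt could reference the cold parts directly. The segment paths, upload-id layout, and sizes here are invented for the example.)

```json
[
    {"path": "/bkt+segments+cold/obj/upload-id/1",
     "etag": "d41d8cd98f00b204e9800998ecf8427e",
     "size_bytes": 5242880},
    {"path": "/bkt+segments+cold/obj/upload-id/2",
     "etag": "d41d8cd98f00b204e9800998ecf8427e",
     "size_bytes": 5242880}
]
```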
timburke | all right, we're about at time | 22:02 |
timburke | thank you all for coming, and thank you for working on swift! | 22:02 |
timburke | #endmeeting | 22:02 |
opendevmeet | Meeting ended Wed Jul 3 22:02:44 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 22:02 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.html | 22:02 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.txt | 22:02 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2024/swift.2024-07-03-21.00.log.html | 22:02 |
fulecorafa | Thanks people, have a nice one! | 22:02 |