21:00:02 <timburke> #startmeeting swift
21:00:02 <opendevmeet> Meeting started Wed May 1 21:00:02 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:02 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:02 <opendevmeet> The meeting name has been set to 'swift'
21:00:11 <timburke> who's here for the swift team meeting?
21:00:50 <mattoliver> o/
21:01:13 <timburke> huzzah! i was worried i'd be left talking to myself ;-)
21:01:25 <mattoliver> not this time :)
21:01:41 <timburke> i tried to do a better job of prepping this week
21:01:57 <timburke> so the agenda's pretty full at
21:02:00 <timburke> #link https://wiki.openstack.org/wiki/Meetings/Swift
21:02:06 <timburke> first up
21:02:14 <timburke> #topic utils refactor
21:02:21 <timburke> #link https://review.opendev.org/c/openstack/swift/+/914029
21:02:22 <patch-bot> patch 914029 - swift - Refactor utils - 20 patch sets
21:02:52 <timburke> clayg, acoles, and i all like where this has landed
21:03:36 <timburke> unfortunately it looks like there was a probe test failure in the gate (test_reconciler_move_object_twice), so it'll need a recheck
21:03:38 <mattoliver> yeah, I love the idea of further refactor, utils is getting big.. but not looking forward to the rebase fallout, esp in tracing :P
21:03:52 <timburke> but it'll be coming in the next day or so
21:04:18 <timburke> and yeah, expect a decent number of merge conflicts to fall out of it (sorry in advance)
21:04:47 <mattoliver> kk
21:05:13 <timburke> i'll try to get a merge down to feature/mpu up asap once it's landed so acoles can have a ready-to-go branch in his morning
21:05:33 <timburke> #topic probe test timeouts
21:05:35 <mattoliver> oh yeah great idea
21:06:02 <timburke> while i was reviewing that patch, i noticed that we get a fair bit of probe test timeouts
21:06:11 <timburke> not a *ton*, but more than i'd like
21:07:10 <timburke> some of them more or less make sense -- a patchset breaks every probe test, then the retry-failed-tests logic kicks in and retries them *all*...
21:07:19 <timburke> yeah, that's reasonably likely to cause a timeout
21:07:45 <timburke> others seem to just hang, though, and that's more worrying
21:08:15 <timburke> #link https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_aeb/913949/3/check/swift-probetests-centos-9-stream/aebbd31/job-output.txt
21:08:55 <timburke> for example, gets 8% of the way through the tests, then hangs until the timeout pops 1h51m later
21:09:21 <mattoliver> wow
21:09:23 <timburke> the test that hangs isn't consistent, fwiw
21:09:26 <timburke> #link https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_67c/909800/7/check/swift-probetests-centos-9-stream/67cfe7b/job-output.txt
21:09:39 <timburke> #link https://9b6014e80e764b848f3d-c29773bdeee4530a738751d9e026e2a7.ssl.cf1.rackcdn.com/874806/23/check/swift-probetests-centos-9-stream/ddc315e/job-output.txt
21:09:53 <mattoliver> been able to reproduce when running probe tests locally?
21:10:08 <timburke> nope -- so honestly i'm not quite sure how to debug it
21:10:32 <timburke> but i figured i'd bring it up in case anyone else had ideas
21:11:01 <timburke> i should probably write up a bug about it, and try to track job failures more closely
21:11:42 <timburke> if anyone else wants to take a look, i found this helpful
21:11:45 <timburke> #link https://zuul.opendev.org/t/openstack/builds?job_name=swift-probetests-centos-9-stream&job_name=swift-probetests-centos-8-stream&project=openstack%2Fswift&result=TIMED_OUT&skip=0&limit=100
21:11:45 <mattoliver> yeah bug might be a good start. I'll run some probe tests locally in the meantime and see what happens
21:12:04 <mattoliver> oh nice
21:12:55 <timburke> it does seem like things got worse around March -- prior to that, it was mostly ~1/month
21:13:28 <timburke> but of course, the older runs don't still have logs attached to verify the hang
21:13:54 <timburke> next up
21:14:03 <timburke> #topic liberasurecode release
21:14:12 <timburke> it's been like a couple years!
21:14:24 <timburke> so i put together authors/changelog
21:14:31 <timburke> #link https://review.opendev.org/c/openstack/liberasurecode/+/917784
21:14:32 <patch-bot> patch 917784 - liberasurecode - Release 1.6.4 - 1 patch set
21:15:06 <mattoliver> yeah probably due for a release :P
21:15:30 <timburke> there's nothing too major -- there's a bounds-check that callers might appreciate, but otherwise it's mostly code cleanup and build fixes
21:15:47 <mattoliver> kk, will review it today
21:15:49 <timburke> probably half the reason is just to make sure i remember how to do one of these ;-)
21:15:51 <timburke> thanks
21:16:16 <timburke> speaking of ec...
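For anyone tracking the probe test timeouts, the filtered Zuul builds link posted above can be rebuilt from its query parameters, and a list of timed-out builds can be tallied per month to check the "~1/month before March" impression. This is only a minimal sketch: the helper names are made up for illustration, the `start_time` field is an assumption about how build records are shaped, and the sample records are hypothetical placeholders rather than real build data.

```python
from collections import Counter
from urllib.parse import urlencode

# Base of the Zuul builds page linked in the meeting.
ZUUL_BUILDS_PAGE = "https://zuul.opendev.org/t/openstack/builds"


def timed_out_builds_url(jobs, project="openstack/swift", skip=0, limit=100):
    """Build the filtered builds URL for the given job names (hypothetical helper)."""
    # job_name may repeat, so use a list of pairs rather than a dict.
    params = [("job_name", job) for job in jobs]
    params += [("project", project), ("result", "TIMED_OUT"),
               ("skip", skip), ("limit", limit)]
    return ZUUL_BUILDS_PAGE + "?" + urlencode(params)


def timeouts_per_month(builds):
    """Count builds per YYYY-MM, assuming each record has an ISO 'start_time'."""
    return Counter(build["start_time"][:7] for build in builds)


url = timed_out_builds_url([
    "swift-probetests-centos-9-stream",
    "swift-probetests-centos-8-stream",
])

# Hypothetical records, only to show the tally; not real build data.
sample_builds = [
    {"start_time": "2024-02-11T08:15:00"},
    {"start_time": "2024-03-02T10:00:00"},
    {"start_time": "2024-03-19T21:30:00"},
]
monthly = timeouts_per_month(sample_builds)
```

Keeping the query as a list of pairs preserves the repeated `job_name` parameter, which a plain dict would collapse into a single entry.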
21:16:30 <timburke> #topic manylinux wheels for pyeclib
21:17:03 <timburke> so i've been playing with this for a bit, and created a Dockerfile to help build these a while back
21:17:09 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/817498
21:17:09 <patch-bot> patch 817498 - pyeclib - Add Dockerfile to build manylinux wheels - 11 patch sets
21:17:31 <timburke> but i finally got around to trying to get them building in CI!
21:17:37 <timburke> #link https://review.opendev.org/c/openstack/pyeclib/+/917857
21:17:37 <patch-bot> patch 917857 - pyeclib - Add job to build wheels - 5 patch sets
21:17:45 <mattoliver> oh yeah, I remember you playing with this
21:17:57 <mattoliver> nice
21:18:24 <timburke> it even has them showing up as artifacts on the zuul build page: https://zuul.opendev.org/t/openstack/build/a8e195bfe57b4d2c928d1a52a0523e4e/artifacts
21:19:36 <timburke> next up i want to beg some help from someone who knows zuul and the release process better than me to figure out how to actually build & upload that when we tag a release
21:20:14 <mattoliver> you might have to visit infra for that
21:20:35 <timburke> i also realize it might be nice to provide a little more context on manylinux wheels and why i want this
21:20:54 <mattoliver> true
21:22:28 <timburke> so any of us can build a binary wheel already -- setup.py bdist_wheel and away you go
21:23:27 <timburke> but that would create a wheel tied to your specific version of system libraries (including not just glibc but also liberasurecode)
21:24:22 <timburke> meaning that you couldn't just publish it and expect other people to be able to use it. pypi will actually reject such a wheel if you even try
21:26:36 <timburke> manylinux wheels are designed so you *can* distribute them, because they target a really old version of glibc and glibc won't break backwards compat
21:27:15 <mattoliver> oh ok, making a lot more sense now
21:28:31 <timburke> that actually only solves half the problem, though -- great, glibc's OK, and we can probably expect other people to have *some* version of that installed
21:28:45 <timburke> but what about liberasurecode? or isa-l?
21:30:01 <timburke> there's a way to have those baked into the wheel, too! and since *those* will only depend on some widely-installed libraries, now you've got a wheel that can actually be used in a lot of places
21:31:12 <timburke> *and* you don't need a C build chain to install pyeclib
21:31:37 <mattoliver> oh wow, ok. I never considered putting more into a wheel. I guess why not. The point is to save compiling etc.
21:31:52 <timburke> my end goal is to be able to run `pip install swift` on a pretty bare-bones system and have it Just Work
21:32:50 <timburke> at least now you can say `pip isntall https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_a8e/917857/5/check/pyeclib-build-wheels/a8e195b/artifacts/pyeclib-1.6.1-cp35-abi3-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl swift` and i *think* that'll work ;-)
21:33:02 <timburke> (until the build results expire)
21:33:31 <timburke> (and assuming you fix my isntall typo :P)
21:33:40 <mattoliver> that would be cool. I actually did just that yesterday (pip install swift) and then needed to get python.py and a compiler installed. So maybe good timing for this discussion :)
21:33:52 <mattoliver> *python.h
21:35:03 <timburke> there's more stuff that could be done (aarch64 wheels, musl wheels) but this seemed like a pretty good starting point
21:35:16 <timburke> next up
21:35:24 <timburke> #topic expirer work
21:35:28 <mattoliver> +1
21:35:42 <timburke> there are a few patches we've been looking at lately
21:36:24 <timburke> one adds some more info to the expirer queue entries -- specifically, the content-length of items that are marked to expire
21:36:28 <timburke> #link https://review.opendev.org/c/openstack/swift/+/912496
21:36:28 <patch-bot> patch 912496 - swift - add bytes of expiring objects to queue entry - 13 patch sets
21:38:12 <timburke> the other body of work is trying to deal with the large number of expirers and large number of queue entries we've got in prod -- every object node is participating, and that can result in a lot of account/container db load when they all restart
21:39:10 <timburke> the fact that we've got a bunch of deferred work in the queue that should be skipped for now just adds to the frustration
21:39:27 <timburke> so clayg has a couple patches
21:39:29 <timburke> #link https://review.opendev.org/c/openstack/swift/+/914713
21:39:30 <patch-bot> patch 914713 - swift - expirer: new options to control task iteration - 14 patch sets
21:39:35 <timburke> #link https://review.opendev.org/c/openstack/swift/+/916026
21:39:35 <patch-bot> patch 916026 - swift - distributed parallel task container iteration - 6 patch sets
21:40:14 <timburke> they were stacked previously, but that second one hasn't been updated in a little bit
21:40:40 <timburke> fwiw, though, i wonder how much we'd need the first one if we had the second one already
21:41:35 <mattoliver> would finally moving to the new task queue (that divides up the queues amongst all the partitions (or whatever)) making it more distributed, be an option?
21:42:21 <mattoliver> I haven't really looked into these patches yet. I'll try and get to that to get a better understanding
21:43:36 <timburke> potentially? p 517389 hasn't seen real activity since 2019, though, and we'll still need to deal with the 1B+ queue entries in the old layout
21:43:36 <patch-bot> https://review.opendev.org/c/openstack/swift/+/517389 - swift - Add object-expirer new mode to execute tasks from ... - 47 patch sets
21:44:50 <timburke> next up...
21:45:02 <mattoliver> oh yeah, just thinking out loud
21:45:07 <timburke> #topic py2/py3 behavior difference in brokers
21:45:30 <timburke> acoles and i noticed a funny thing while reviewing a patch on feature/mpu
21:46:13 <timburke> when we bulk-load all the rows from the pending file into a db, py2 shuffles the rows!
21:46:14 <mattoliver> yeah, I've noticed this. And skipped on py2 tests because the row insert order isn't known between the 2
21:46:22 <timburke> this was a bit of a surprise to both of us
21:47:01 <timburke> oh! which test, do you remember? i want to fix it so py2 behaves like py3
21:47:35 <mattoliver> isn't py2's dict not strictly ordered? maybe it's used as a datatype down in the sqlite module or something
21:48:00 <timburke> it comes down to dict iteration order -- i think we just need to use an OrderedDict around https://github.com/openstack/swift/blob/2.33.0/swift/container/backend.py#L1365
21:48:30 <mattoliver> I'll have to find it.. it was a while ago
21:48:42 <timburke> and maybe https://github.com/openstack/swift/blob/2.33.0/swift/container/backend.py#L341
21:49:21 <mattoliver> where I was working on brokers. maybe in the shard-range sync point patch, or maybe something that's landed. I'll have to go digging. I'll ping you when I find it.
21:49:27 <timburke> that'd be great if you can. i might be able to find it on my own, too, now that i know it's somewhere out there
21:49:38 <timburke> last up
21:49:48 <timburke> #topic unreleased swiftclient bug
21:50:29 <timburke> there are a couple bugs caused by a recent-ish swiftclient patch, but Yan's got a fix up for them!
21:50:32 <timburke> #link https://review.opendev.org/c/openstack/python-swiftclient/+/916135
21:50:32 <patch-bot> patch 916135 - python-swiftclient - Fix swiftclient output regression - 5 patch sets
21:50:41 <mattoliver> oh nice
21:51:02 <timburke> we probably want to get that reviewed & merged fairly soon
21:51:19 <mattoliver> kk, I'll put it on my list
21:51:31 <timburke> all right, that's all i've got
21:51:35 <timburke> #topic open discussion
21:51:42 <timburke> anything else we want to bring up?
21:52:22 <mattoliver> We do have some students from a university in Qatar who want to work on swift as a project at Uni, their teacher/lecturer has reached out.
21:52:46 <mattoliver> I was trying to think of some swift related project for them to work on.
21:53:18 <timburke> oh yeah, i think i saw you forwarded something to me... sorry, i'm bad at keeping up with outreach
21:53:19 <mattoliver> So any thoughts would be greatly appreciated. Not sure on the size or complexity though.
21:53:47 <mattoliver> Looking at our old ideas page maybe one of these?
21:54:00 <mattoliver> account quotas for number of files
21:54:12 <mattoliver> #link https://wiki.openstack.org/wiki/Swift/ideas/account-quota-files
21:54:49 <mattoliver> task queue (though maybe too complex)
21:55:04 <mattoliver> probably same with tiering.
21:55:20 <mattoliver> we could try and give them pipeline automation
21:56:09 <mattoliver> I think the reconciler and sharder daemons need better scaling (ie added concurrency with workers etc).
21:56:23 <timburke> oh yeah, i should revisit p 635040 ...
21:56:23 <patch-bot> https://review.opendev.org/c/openstack/swift/+/635040 - swift - Include some pipeline validation during proxy-serv... - 5 patch sets
21:56:40 <mattoliver> Or maybe just something interesting, like an audit-watcher or custom middleware.
21:57:40 <timburke> i'll have a think on it
21:58:08 <mattoliver> Thanks, me too. And jianjian too now that he's joined the room :P
21:58:37 <mattoliver> I think we're basically out of time.. so that'll do from me :)
21:59:53 <timburke> all right. thank you for coming, and thank you for working on swift!
21:59:57 <timburke> #endmeeting
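
The dict iteration order point from the py2/py3 brokers topic can be illustrated with a small sketch. This is not Swift's broker code: the function name and the `(name, timestamp)` row shape are hypothetical, chosen only to show the idea. A plain `dict` guarantees insertion order only from Python 3.7 onward, so on py2 iterating over one can return rows in a different order than they arrived; `collections.OrderedDict` keeps insertion order on both.

```python
from collections import OrderedDict


def latest_rows_in_arrival_order(pending_rows):
    """Keep the newest row per name, in first-seen order.

    pending_rows: iterable of (name, timestamp) tuples (hypothetical shape).
    """
    newest = OrderedDict()
    for name, timestamp in pending_rows:
        if name not in newest or timestamp > newest[name]:
            # Re-assigning an existing key keeps its original position.
            newest[name] = timestamp
    return list(newest.items())


# "obj-c" shows up twice; the later timestamp wins but its slot is unchanged.
rows = [("obj-c", 1), ("obj-a", 1), ("obj-b", 1), ("obj-c", 2)]
merged = latest_rows_in_arrival_order(rows)
```

With an `OrderedDict` the merged rows come back in the same order on py2 and py3, which is the behavior a test asserting on row order would need.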