Wednesday, 2024-10-30

mcapeHi All!03:18
mcapeI’m investigating connection timeouts to the container-server in my OpenStack Swift cluster. I’m seeing about 2-3k of these daily. The software version is Yoga.03:18
mcapeInterestingly, the main load to the container-servers comes from the user agent "Swift Container Sharder" which is doing about 70 RPS to the container-servers, 03:19
mcapewhile proxy-servers/object-servers are only doing about ~10RPS and "container-sharder <PID>" does about 10RPS. 03:19
mcapeThe requests from "Swift Container Sharder" are mostly GET to the root container.03:19
mcapesome of them are 200, some are 404, and they all look like this:03:19
mcaperoot-container-name?marker=706e5c6b%2F634e%2F4bb6%2Fb6%2Fd3%2F8fdc7e3c6b18&states=auditing&end_marker=72c5cf72%2F63ac%2F4562%2F96%2Fbe%2F6883600d54d&format=json" "tx7da317711c964ccea93b6-0067219ee4" "Swift Container Sharder" 0.0452 "-" 633 003:19
mcapeI have three sharded containers: one large with ~300m objects, the others smaller. Overall I have 586 containers, the majority of them being shards.03:19
mcapeAbout ~1m objects are added every week03:19
mcapeIs that kind of sharding activity normal?  Maybe it could be tuned down somehow? 03:20
mcapeThanks for any input!03:20
timburkemcape, each time the sharder walks a disk and finds a shard db, it'll check in with the root to make sure that it should still be in charge of its portion of the namespace (ie, it hasn't been absorbed by another shard). that's what those &states=auditing queries are about03:55
timburkeyou might try lowering your databases_per_second or increasing your interval. i'd lean more toward the former; maybe start by dropping it down to 10 from the default of 50?03:55
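One way to measure how much of the container-server traffic comes from these audit check-ins is to tally log lines by user agent and query string. The following is a minimal sketch, assuming the log lines carry the quoted user agent and the raw query string as in the sample mcape pasted above; the function name and the "auditing"/"other" labels are made up for illustration and are not part of Swift.

```python
from collections import Counter

# Quoted user agent exactly as it appears in the pasted log line.
SHARDER_AGENT = '"Swift Container Sharder"'


def count_sharder_requests(lines):
    """Tally sharder requests, split by whether they are states=auditing queries.

    Lines from other user agents are ignored. The labels 'auditing' and
    'other' are hypothetical names chosen for this sketch.
    """
    counts = Counter()
    for line in lines:
        if SHARDER_AGENT not in line:
            continue
        if "states=auditing" in line:
            counts["auditing"] += 1
        else:
            counts["other"] += 1
    return counts


# The fragment pasted in the channel, plus one made-up non-sharder line.
SAMPLE_LINES = [
    'root-container-name?marker=706e5c6b%2F634e%2F4bb6%2Fb6%2Fd3%2F8fdc7e3c6b18'
    '&states=auditing&end_marker=72c5cf72%2F63ac%2F4562%2F96%2Fbe%2F6883600d54d'
    '&format=json" "tx7da317711c964ccea93b6-0067219ee4" '
    '"Swift Container Sharder" 0.0452 "-" 633 0',
    'some-container?format=json" "tx-hypothetical" "proxy-server 1234" 0.0100 "-" 10 0',
]

if __name__ == "__main__":
    print(count_sharder_requests(SAMPLE_LINES))
```

Running the same tally over an hour of logs before and after a config change would show whether the states=auditing rate actually moved.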
opendevreviewTim Burke proposed openstack/swift master: Pull out a bunch of liberasurecode requirements  https://review.opendev.org/c/openstack/swift/+/93369704:11
opendevreviewTim Burke proposed openstack/swift master: CI: Drag forward more constraints  https://review.opendev.org/c/openstack/swift/+/93368004:39
opendevreviewTim Burke proposed openstack/swift master: Pull out a bunch of liberasurecode requirements  https://review.opendev.org/c/openstack/swift/+/93369704:47
opendevreviewTim Burke proposed openstack/swift master: CI: Drag forward more constraints  https://review.opendev.org/c/openstack/swift/+/93368004:50
opendevreviewMatthew Oliver proposed openstack/swift master: db_auditor: add vacuum support  https://review.opendev.org/c/openstack/swift/+/91686108:44
mcapeTimburke, thanks a lot for your suggestions!09:46
mcapeI tried decreasing the databases_per_second parameter in the [container-sharder] sections from 50 to 10 and restarted all container-* services on all nodes, and waited a couple of hours. 09:47
mcapeI haven’t seen a substantial decrease in the states=auditing queries. Should I expect these queries to decrease by 5x, or is there another factor at play? 09:47
mcapeI also tried increasing the interval from 30 to 60 seconds, but saw the same results.09:47
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy  https://review.opendev.org/c/openstack/swift/+/93360511:46
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: slo: add more functional tests for copying  https://review.opendev.org/c/openstack/swift/+/93372811:46
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped)  https://review.opendev.org/c/openstack/swift/+/93372911:46
opendevreviewAlistair Coles proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient  https://review.opendev.org/c/openstack/swift/+/93147312:38
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: slo: add more functional tests for copying  https://review.opendev.org/c/openstack/swift/+/93372812:45
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy  https://review.opendev.org/c/openstack/swift/+/93360512:45
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped)  https://review.opendev.org/c/openstack/swift/+/93372912:45
*** avanzaghi16 is now known as avanzaghi113:32
*** edausqu is now known as edausq13:32
timburkemcape, i would've expected a ~5x decrease in queries, like you said. odd.16:00
timburkeinterval can almost certainly go much higher -- it's the desired minimum cycle time. if you see that your current cycle times are around an hour, you might try raising it to 2 or 3 hours16:00
timburkethe downside is that the work will tend to be front-loaded for any particular worker, rather than spread out across the whole cycle. part of why i'd default to adjusting databases_per_second first16:01
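The interplay between databases_per_second and interval described above can be put into a rough back-of-the-envelope model. This is only a sketch and not how the sharder is implemented. It assumes that each worker walks an equal share of the shard DBs, that databases_per_second applies per worker, that every shard DB triggers one root query per cycle, and that a cycle never finishes faster than interval. The function name and all numbers are hypothetical.

```python
def estimate_audit_rps(shard_dbs, workers, databases_per_second, interval):
    """Estimate cluster-wide audit queries per second, averaged over a cycle.

    shard_dbs            -- total shard DB replicas across the cluster (assumed)
    workers              -- number of sharder workers sharing them (assumed)
    databases_per_second -- per-worker rate limit
    interval             -- desired minimum cycle time in seconds
    """
    per_worker = shard_dbs / workers
    # Time a worker needs to walk its share when held to the rate limit.
    walk_time = per_worker / databases_per_second
    # A cycle lasts at least `interval`, longer if the walk takes longer.
    cycle_time = max(walk_time, interval)
    return shard_dbs / cycle_time


if __name__ == "__main__":
    # Hypothetical cluster: 1800 shard DB replicas spread over 6 workers.
    for dps, interval in [(50, 30), (10, 30), (10, 3600)]:
        rps = estimate_audit_rps(1800, 6, dps, interval)
        print(f"databases_per_second={dps} interval={interval}s -> ~{rps:.1f} RPS")
```

In this model, lowering databases_per_second changes the average rate only once the rate-limited walk takes longer than interval; otherwise interval sets the cycle time and the work is simply bunched at the start of each cycle. Whether that is what happened in mcape's cluster is not established in the discussion.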
opendevreviewTim Burke proposed openstack/swift master: CI: Drag forward more constraints  https://review.opendev.org/c/openstack/swift/+/93368016:13
opendevreviewTim Burke proposed openstack/swift master: CI: Move a bunch of func test jobs from py38 to py312  https://review.opendev.org/c/openstack/swift/+/93336916:39
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped)  https://review.opendev.org/c/openstack/swift/+/93372917:30
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy  https://review.opendev.org/c/openstack/swift/+/93360518:16
opendevreviewAlistair Coles proposed openstack/swift feature/mpu: mpu: fix copying from an mpu  https://review.opendev.org/c/openstack/swift/+/93372918:30
opendevreviewASHWIN A NAIR proposed openstack/swift master: refactor test for x-delete-at w/t part_num and x-open-expired  https://review.opendev.org/c/openstack/swift/+/93306119:09
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily at 20:30 utc (half an hour from now) to apply a configuration change20:03
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily to apply a configuration change20:32
opendevreviewShreeya Deshpande proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient  https://review.opendev.org/c/openstack/swift/+/93147320:52
timburke#startmeeting swift21:00
opendevmeetMeeting started Wed Oct 30 21:00:35 2024 UTC and is due to finish in 60 minutes.  The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.21:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.21:00
opendevmeetThe meeting name has been set to 'swift'21:00
timburkewho's here for the swift meeting?21:00
indianwhocodeso/21:02
timburkefirst up21:04
timburke#topic ptg21:04
timburkeit was last week!21:04
mattolivero/21:04
mattoliver(sorry internet issues)21:04
timburkethanks to everybody for coming -- i know it was early/late a lot of days for a lot of people21:04
timburkebut i think we had a lot of good discussions!21:04
timburkeif there are any other notes we should write up about them, please add them to the etherpad while they're fresh in your heads21:05
fulecorafao/21:05
mattoliverit was great, just need to get stuff done while the fire's still lit on some of those discussions :) 21:05
timburke#link https://etherpad.opendev.org/p/swift-ptg-epoxy21:05
timburkespeaking of fires lit by those discussions...21:06
timburke#topic pyeclib release21:06
timburkethere's been one! in fact, there've been 4 (if you count the pre-release)21:06
timburkebig main thing i wanted out of it was to start publishing binary wheels so `pip install pyeclib` won't require a build environment21:07
timburkeand we did it! eventually ;-)21:07
timburkethe first three wound up including some AVX2 instructions that weren't available on all of the various CI nodes21:08
timburkeso sorry for that gate breakage21:08
mattoliverahh, so that's what was happening!21:09
timburkebut it got fixed up by p 93333821:09
patch-bothttps://review.opendev.org/c/openstack/pyeclib/+/933338 - pyeclib - wheels: Disable optimizations for liberasurecode (MERGED) - 1 patch set21:09
timburkei realized that this could also be a good opportunity to slim down some of our requirements in swift, if we could rely on pyeclib providing liberasurecode21:10
timburkeso i proposed p 93369721:10
patch-bothttps://review.opendev.org/c/openstack/swift/+/933697 - swift - Pull out a bunch of liberasurecode requirements - 2 patch sets21:10
timburke...but then realized that i need to wait on that until the new pyeclib release gets into the global upper-constraints file21:11
timburke(i see a couple general "Updated from generate-constraints" patches that include it, but i may write a dedicated patch to just bump pyeclib)21:12
timburkespeaking of gate failures...21:12
timburke#topic py38 gate failures21:13
timburkethis was another thing that came up toward the end of last week21:13
timburkewith global requirements dropping py38 support in p 925201 our py38 jobs stopped being able to find a valid version of some transitive dep to install21:16
patch-bothttps://review.opendev.org/c/openstack/requirements/+/925201 - requirements - Remove Python 3.8 tests and constraints (MERGED) - 5 patch sets21:16
mattoliveroh yay21:16
indianwhocodesdamn21:16
indianwhocodesfrom 2 to 3.921:17
mattoliverso since PTG you've basically been fighting the gate, thanks for all that Tim21:17
timburkei cleared it up for now in p 933363 by doing the same basic thing that we did when py37 got dropped21:17
patch-bothttps://review.opendev.org/c/openstack/swift/+/933363 - swift - CI: use py36 constraints for py38 (MERGED) - 1 patch set21:17
indianwhocodesnicee21:17
timburkebut i'm realizing that the existing py36-constraints.txt file isn't really a good solution21:17
indianwhocodesit isn't21:17
indianwhocodesit was never a good soln lol21:18
clarkbone option that might be worth trying is running without constraints; since many things drop support for older python anyway, you may get implicit constraints due to having an upper bound on the library side21:18
timburkei'm trying to get it to a *slightly* better place with p 933680, but even that's probably still not ideal21:18
patch-bothttps://review.opendev.org/c/openstack/swift/+/933680 - swift - CI: Drag forward more constraints - 4 patch sets21:18
clarkbof course you always run the risk of having things update and break you, but maybe that will be infrequent enough with older python versions due to the python requires usage in pypi packages21:19
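The "implicit constraints" idea can be illustrated with a small sketch: pip skips releases whose Requires-Python metadata excludes the running interpreter, so an old interpreter ends up on the newest release that still admits it. The code below is a simplified illustration, not pip's resolver. It only understands '>=X.Y' specifiers and plain dotted numeric versions, and the release table is entirely hypothetical.

```python
def parse_min_python(spec):
    """Return (major, minor) from a '>=X.Y' specifier; other forms are not handled."""
    if not spec.startswith(">="):
        raise ValueError("only '>=X.Y' specifiers are supported in this sketch")
    major, minor = spec[2:].strip().split(".")[:2]
    return int(major), int(minor)


def version_key(version):
    """Sort key for simple dotted numeric versions such as '2.10.1'."""
    return tuple(int(part) for part in version.split("."))


def newest_installable(releases, python_version):
    """Pick the newest release whose Requires-Python admits python_version."""
    compatible = [
        version
        for version, spec in releases.items()
        if parse_min_python(spec) <= python_version
    ]
    return max(compatible, key=version_key) if compatible else None


# Hypothetical package history: each release raised its minimum Python.
HYPOTHETICAL_RELEASES = {"1.4": ">=3.6", "2.0": ">=3.8", "3.0": ">=3.9"}

if __name__ == "__main__":
    for py in [(3, 6), (3, 8), (3, 12)]:
        print(py, "->", newest_installable(HYPOTHETICAL_RELEASES, py))
```

This only helps for dependencies that declare Requires-Python accurately; a release that drops an old Python without updating its metadata would still be picked up and break the job.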
indianwhocodesp 92788321:19
patch-bothttps://review.opendev.org/c/openstack/swift/+/927883 - swift - manage py36 constraints for swift (MERGED) - 8 patch sets21:19
timburkethis manually-managed constraint file is causing grief in other ways, too -- mainly because we've been using it for our func-py3 tox environment21:20
indianwhocodestrue ^^^^21:20
timburkethis is causing trouble for devstack trying to move to ubuntu noble/py312, because our constraint is pinning greenlet to a pretty ancient version which can't compile for py31221:21
timburkesee p 93169721:21
patch-bothttps://review.opendev.org/c/openstack/devstack/+/931697 - devstack - Switch devstack nodeset to Ubuntu 24.04 (Noble) - 1 patch set21:21
timburkei think the more-general py3-constraints.txt that i've proposed will get things squared -- or there's also the initial patchset of https://review.opendev.org/c/openstack/swift/+/933369/1 that tries to move the func-py3 jobs back to using global constraints21:23
patch-botpatch 933369 - swift - CI: Move a bunch of func test jobs from py38 to py312 - 2 patch sets21:23
timburke...but both of those trip func test failures in https://github.com/openstack/swift/blob/master/test/functional/s3api/test_xxe_injection.py21:24
timburkeso i think i'm going to need to spend some time hunting down what changed there :-/21:25
timburkeit seems likely to be related to our using boto (not boto3) for the test, which in turn expects a *really old* version of requests21:26
timburkesee also https://bugs.launchpad.net/swift/+bug/155726021:26
patch-botBug #1557260 - Swift3 functests should transition to using boto3 (New)21:26
indianwhocodesnoooo21:27
indianwhocodeslol21:28
timburkeanyway, i think that's most of what i've got to say about the gate right now -- there's surely more work that can be done, and there's been a bunch of deferred maintenance that seems to be coming due21:28
timburke#topic https://bugs.launchpad.net/swift/+bug/208110321:29
patch-botBug #2081103 - s3api: Deleting the current version of an object can (sometimes?) 500 (In Progress)21:29
timburkewe still haven't fixed fulecorafa's bug :-(21:30
timburkeif anybody has time to review p 931325 i'd appreciate it21:30
patch-bothttps://review.opendev.org/c/openstack/swift/+/931325 - swift - versioning: 411 PUTs with neither content-length n... - 2 patch sets21:30
fulecorafatimburke that's ok. I saw you opened a PR in the eventlet github, right?21:30
timburkeyes! https://github.com/eventlet/eventlet/pull/98521:31
timburkei should follow up on that; they requested some clarification in the commit message21:31
fulecorafaHow's it going there? I took a look at it, LGTM but I saw no one else reviewed it right?21:32
timburkebut the swift patch should be sufficient21:32
fulecorafaOh ok. Then I will look at it ASAP21:32
timburkeyeah, no one else has looked yet -- if you could try it out in a pre-prod environment, say, and leave a review for how it went, that'd be awesome21:34
mattoliverme too. I'll put it on my list. Should get some swift work done (sick of banging my head against the downstream work wall) :) 21:34
mattoliver+121:34
fulecorafaI will try it, yes21:34
timburkethanks21:34
fulecorafaAbout testing, I will bring a topic at open discussion that may be interesting21:35
timburkenext up21:35
timburke#topic multi-policy containers21:35
timburkefulecorafa, how's it going? just wanted to check in since it's been a bit21:36
fulecorafaOh, didn't I mention it? We got it working in prod right now21:36
timburkeyes! i remember that part :-) wanted to see if we should be trying to block out some review time in the near future21:37
fulecorafaWe're actively working on bringing it up to date now so we can patch it in swift, but there is quite a lot to do in that regard21:37
timburkefair enough -- it's tricky rebasing a large body of work21:38
fulecorafaSorry if we're slow in bringing things here, it's been a pain point to have an outdated version of swift running21:38
timburkeespecially in s3api, which sees no small amount of churn21:38
fulecorafaIn regards to implementation, we did it through a middleware, with few changes to s3api actually21:39
timburkeyeah, totally makes sense to prioritize being able to run newer swift over getting the new feature merged -- in fact, it's probably a necessary pre-req21:40
fulecorafaI think some things we have done are not even patch-able, which is the main holdup21:41
fulecorafaAs in, we need to rewrite it so it is even possible to make it work with newer code21:41
timburkeah, yeah. sorry :-/21:41
fulecorafaBut we're trying, it will just take some time to get it ok21:42
timburkeout of curiosity, where's the friction turning up? we usually try to not be terribly disruptive to third-party middlewares21:42
fulecorafaI would say it comes down to patches in s3api, pro controller and related. These patches are not easily migratable due to changes in function names, code placement, and functionality21:44
timburkeah21:44
timburkeall right, then i think that's all i've got21:44
timburke#topic open discussion21:44
timburkewhat else should we discuss?21:45
fulecorafaWhile the middleware could be done through a simple `git apply`, these patches would need to be rewritten and reconsidered21:45
fulecorafaI've got a topic regarding testing environment21:45
timburkego ahead!21:46
fulecorafaWe've been trying to use both the official SAIO docker image and the NVidia one for testing, but it is very problematic, mostly due to the way we do things, but also because of breakage from errors not related to our active development21:46
fulecorafaSo we've been working with an lxc environment, which has turned out to be pretty cool. Now we're automating everything, trying to take applying patches and starting servers within lxd off developers' hands21:47
fulecorafaWould this be an addition to the project? Would you like to have a look at it?21:48
timburkenice! i remember notmyname tinkering with lxc/lxd ages ago, but can't find it now...21:49
mattoliverI do like a good lxd container, haven't used one lately. I'd love to take a look21:49
mattoliverI tinkered years ago, with notmyname too. So might be cool21:50
fulecorafaNice21:50
timburkemaybe? i'd be happy to look, anyway. fwiw, most of us have tended to settle on https://github.com/NVIDIA/vagrant-swift-all-in-one/ for a dev env21:50
fulecorafaI'll talk with my side to make this code available. Any tips on sharing with you guys?21:50
timburkemattoliver, ah, good! you have a copy! https://github.com/matthewoliver/runway21:51
timburkethough it looks like https://github.com/kevin-wyx/runway might have a slightly more recent snapshot21:52
mattoliverthere you go :P never remove a cloned repo :P 21:52
mattoliveryeah probably21:52
timburkefulecorafa, it's up to you -- if you can and would like to make it public, that's probably easiest, but you could also add us separately to some private repo if you'd prefer21:53
fulecorafaNice, I'll discuss making the repo public then with the other authors. Thanks a lot21:54
timburkesure thing! thanks for wanting to make the swift developer experience better!21:55
mattoliver+10021:55
mattoliverI've finally pushed up a new version of https://review.opendev.org/c/openstack/swift/+/916861 21:55
patch-botpatch 916861 - swift - db_auditor: add vacuum support - 11 patch sets21:55
mattoliverThat one dumps the bucket bloatiness stats to recon and sends them as gauges in statsd at the end of an audit cycle. 21:56
mattoliverso it's the gauge for the last run. which seems to make more sense to me. 21:57
mattoliverIt also adds statsdclient.gauge so we can use gauges now. 21:57
timburkei think that makes sense enough -- it's tricky striking the right balance between emitting stats frequently enough that your graph/timeseries db doesn't forget about the value and slowly enough that you can roll up some useful info21:59
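For readers unfamiliar with statsd gauges: a gauge is sent as a plain-text datagram of the form `name:value|g`, and the server keeps reporting the last value it received. The sketch below shows that wire format only. It is not the statsdclient.gauge added in the patch, and the metric name, host, and port are assumptions chosen for illustration.

```python
import socket


def format_gauge(name, value):
    """Build a statsd gauge datagram: b'<name>:<value>|g'."""
    return f"{name}:{value}|g".encode("utf-8")


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge over UDP to a statsd server (host and port are assumed)."""
    payload = format_gauge(name, value)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
    return payload


if __name__ == "__main__":
    # Hypothetical metric name; emitted once, at the end of an audit cycle.
    print(format_gauge("container-auditor.hypothetical_bloat", 17))
```

Because a gauge holds its last value, emitting it once per audit cycle means graphs show the result of the most recent run, which is the behaviour mattoliver describes.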
mattoliverthat's all I really have to report on my end21:59
timburkei see that it tripped https://bugs.launchpad.net/openstacksdk/+bug/2085654 :-/21:59
patch-botBug #2085654 - Two intermittent functional test failures (New)21:59
timburkeat least i finally wrote up a bug about it, though!21:59
mattoliveryay22:00
timburkeall right, we're at time -- i oughta let mattoliver get on with his morning ;-)22:00
timburkethank you all for coming, and thank you for working on swift!22:00
timburke#endmeeting22:00
opendevmeetMeeting ended Wed Oct 30 22:00:41 2024 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)22:00
opendevmeetMinutes:        https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.html22:00
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.txt22:00
opendevmeetLog:            https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.log.html22:00
opendevreviewShreeya Deshpande proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient  https://review.opendev.org/c/openstack/swift/+/93147323:12