mcape | Hi All! | 03:18 |
---|---|---|
mcape | I’m investigating connection timeouts to the container-server in my OpenStack Swift cluster. I’m seeing about 2-3k of these daily. The software version is Yoga. | 03:18 |
mcape | Interestingly, the main load to the container-servers comes from the user agent "Swift Container Sharder" which is doing about 70 RPS to the container-servers, | 03:19 |
mcape | while proxy-servers/object-servers are only doing about ~10RPS and "container-sharder <PID>" does about 10RPS. | 03:19 |
mcape | The requests from "Swift Container Sharder" are mostly GET to the root container. | 03:19 |
mcape | some of them are 200, some are 404, and they all look like this: | 03:19 |
mcape | root-container-name?marker=706e5c6b%2F634e%2F4bb6%2Fb6%2Fd3%2F8fdc7e3c6b18&states=auditing&end_marker=72c5cf72%2F63ac%2F4562%2F96%2Fbe%2F6883600d54d&format=json" "tx7da317711c964ccea93b6-0067219ee4" "Swift Container Sharder" 0.0452 "-" 633 0 | 03:19 |
mcape | I have three sharded containers: one large with ~300m objects, the others smaller. Overall I have 586 containers, with the majority of them being shards. | 03:19 |
mcape | About ~1m objects are added every week | 03:19 |
mcape | Is that kind of sharding activity normal? Maybe it could be tuned down somehow? | 03:20 |
mcape | Thanks for any input! | 03:20 |
timburke | mcape, each time the sharder walks a disk and finds a shard db, it'll check in with the root to make sure that it should still be in charge of its portion of the namespace (ie, it hasn't been absorbed by another shard). that's what those &states=auditing queries are about | 03:55 |
timburke | you might try lowering your databases_per_second or increasing your interval. i'd lean more toward the former; maybe start by dropping it down to 10 from the default of 50? | 03:55 |
opendevreview | Tim Burke proposed openstack/swift master: Pull out a bunch of liberasurecode requirements https://review.opendev.org/c/openstack/swift/+/933697 | 04:11 |
opendevreview | Tim Burke proposed openstack/swift master: CI: Drag forward more constraints https://review.opendev.org/c/openstack/swift/+/933680 | 04:39 |
opendevreview | Tim Burke proposed openstack/swift master: Pull out a bunch of liberasurecode requirements https://review.opendev.org/c/openstack/swift/+/933697 | 04:47 |
opendevreview | Tim Burke proposed openstack/swift master: CI: Drag forward more constraints https://review.opendev.org/c/openstack/swift/+/933680 | 04:50 |
opendevreview | Matthew Oliver proposed openstack/swift master: db_auditor: add vacuum support https://review.opendev.org/c/openstack/swift/+/916861 | 08:44 |
mcape | Timburke, thanks a lot for your suggestions! | 09:46 |
mcape | I tried decreasing the databases_per_second parameter in the [container-sharder] sections from 50 to 10 and restarted all container-* services on all nodes, and waited a couple of hours. | 09:47 |
mcape | I haven’t seen a substantial decrease in the states=auditing queries. Should I expect these queries to decrease by 5x, or is there another factor at play? | 09:47 |
mcape | I also tried increasing the interval from 30 to 60 seconds, but saw the same results. | 09:47 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy https://review.opendev.org/c/openstack/swift/+/933605 | 11:46 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: slo: add more functional tests for copying https://review.opendev.org/c/openstack/swift/+/933728 | 11:46 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped) https://review.opendev.org/c/openstack/swift/+/933729 | 11:46 |
opendevreview | Alistair Coles proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient https://review.opendev.org/c/openstack/swift/+/931473 | 12:38 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: slo: add more functional tests for copying https://review.opendev.org/c/openstack/swift/+/933728 | 12:45 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy https://review.opendev.org/c/openstack/swift/+/933605 | 12:45 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped) https://review.opendev.org/c/openstack/swift/+/933729 | 12:45 |
*** avanzaghi16 is now known as avanzaghi1 | 13:32 | |
*** edausqu is now known as edausq | 13:32 | |
timburke | mcape, i would've expected a ~5x decrease in queries, like you said. odd. | 16:00 |
timburke | interval can almost certainly go much higher -- it's the desired minimum cycle time. if you see that your current cycle times are around an hour, you might try raising it to 2 or 3 hours | 16:00 |
timburke | the downside is that the work will tend to be front-loaded for any particular worker, rather than spread out across the whole cycle. part of why i'd default to adjusting databases_per_second first | 16:01 |
opendevreview | Tim Burke proposed openstack/swift master: CI: Drag forward more constraints https://review.opendev.org/c/openstack/swift/+/933680 | 16:13 |
opendevreview | Tim Burke proposed openstack/swift master: CI: Move a bunch of func test jobs from py38 to py312 https://review.opendev.org/c/openstack/swift/+/933369 | 16:39 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: func test for copying from an mpu (fails/skipped) https://review.opendev.org/c/openstack/swift/+/933729 | 17:30 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: modify copy middleware to support uploadPartCopy https://review.opendev.org/c/openstack/swift/+/933605 | 18:16 |
opendevreview | Alistair Coles proposed openstack/swift feature/mpu: mpu: fix copying from an mpu https://review.opendev.org/c/openstack/swift/+/933729 | 18:30 |
opendevreview | ASHWIN A NAIR proposed openstack/swift master: refactor test for x-delete-at w/t part_num and x-open-expired https://review.opendev.org/c/openstack/swift/+/933061 | 19:09 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily at 20:30 utc (half an hour from now) to apply a configuration change | 20:03 | |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily to apply a configuration change | 20:32 | |
opendevreview | Shreeya Deshpande proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient https://review.opendev.org/c/openstack/swift/+/931473 | 20:52 |
timburke | #startmeeting swift | 21:00 |
opendevmeet | Meeting started Wed Oct 30 21:00:35 2024 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:00 |
opendevmeet | The meeting name has been set to 'swift' | 21:00 |
timburke | who's here for the swift meeting? | 21:00 |
indianwhocodes | o/ | 21:02 |
timburke | first up | 21:04 |
timburke | #topic ptg | 21:04 |
timburke | it was last week! | 21:04 |
mattoliver | o/ | 21:04 |
mattoliver | (sorry internet issues) | 21:04 |
timburke | thanks to everybody for coming -- i know it was early/late a lot of days for a lot of people | 21:04 |
timburke | but i think we had a lot of good discussions! | 21:04 |
timburke | if there are any other notes we should write up about them, please add them to the etherpad while they're fresh in your heads | 21:05 |
fulecorafa | o/ | 21:05 |
mattoliver | it was great, just need to get stuff done while the fire's still lit on some of those discussions :) | 21:05 |
timburke | #link https://etherpad.opendev.org/p/swift-ptg-epoxy | 21:05 |
timburke | speaking of fires lit by those discussions... | 21:06 |
timburke | #topic pyeclib release | 21:06 |
timburke | there's been one! in fact, there've been 4 (if you count the pre-release) | 21:06 |
timburke | big main thing i wanted out of it was to start publishing binary wheels so `pip install pyeclib` won't require a build environment | 21:07 |
timburke | and we did it! eventually ;-) | 21:07 |
timburke | the first three wound up including some AVX2 instructions that weren't available on all of the various CI nodes | 21:08 |
timburke | so sorry for that gate breakage | 21:08 |
mattoliver | ahh, so that's what was happening! | 21:09 |
timburke | but it got fixed up by p 933338 | 21:09 |
patch-bot | https://review.opendev.org/c/openstack/pyeclib/+/933338 - pyeclib - wheels: Disable optimizations for liberasurecode (MERGED) - 1 patch set | 21:09 |
timburke | i realized that this could also be a good opportunity to slim down some of our requirements in swift, if we could rely on pyeclib providing liberasurecode | 21:10 |
timburke | so i proposed p 933697 | 21:10 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/933697 - swift - Pull out a bunch of liberasurecode requirements - 2 patch sets | 21:10 |
timburke | ...but then realized that i need to wait on that until the new pyeclib release gets into the global upper-constraints file | 21:11 |
timburke | (i see a couple general "Updated from generate-constraints" patches that include it, but i may write a dedicated patch to just bump pyeclib) | 21:12 |
timburke | speaking of gate failures... | 21:12 |
timburke | #topic py38 gate failures | 21:13 |
timburke | this was another thing that came up toward the end of last week | 21:13 |
timburke | with global requirements dropping py38 support in p 925201 our py38 jobs stopped being able to find a valid version of some transitive dep to install | 21:16 |
patch-bot | https://review.opendev.org/c/openstack/requirements/+/925201 - requirements - Remove Python 3.8 tests and constraints (MERGED) - 5 patch sets | 21:16 |
mattoliver | oh yay | 21:16 |
indianwhocodes | damn | 21:16 |
indianwhocodes | from 2 to 3.9 | 21:17 |
mattoliver | so since PTG you've basically been fighting the gate, thanks for all that Tim | 21:17 |
timburke | i cleared it up for now in p 933363 by doing the same basic thing that we did when py37 got dropped | 21:17 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/933363 - swift - CI: use py36 constraints for py38 (MERGED) - 1 patch set | 21:17 |
indianwhocodes | nicee | 21:17 |
timburke | but i'm realizing that the existing py36-constraints.txt file isn't really a good solution | 21:17 |
indianwhocodes | it isn't | 21:17 |
indianwhocodes | it was never a good soln lol | 21:18 |
clarkb | one option that might be worth trying is running without constraints. since many things drop support for older python anyway, you may get implicit constraints due to having an upper bound on the library side | 21:18 |
timburke | i'm trying to get it to a *slightly* better place with p 933680, but even that's probably still not ideal | 21:18 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/933680 - swift - CI: Drag forward more constraints - 4 patch sets | 21:18 |
clarkb | of course you always run the risk of having things update and breaking you, but maybe that will be infrequent enough with older python versions due to the python requires usage in pypi packages | 21:19 |
indianwhocodes | p 927883 | 21:19 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/927883 - swift - manage py36 constraints for swift (MERGED) - 8 patch sets | 21:19 |
timburke | this manually-managed constraint file is causing grief in other ways, too -- mainly because we've been using it for our func-py3 tox environment | 21:20 |
indianwhocodes | true ^^^^ | 21:20 |
timburke | this is causing trouble for devstack trying to move to ubuntu noble/py312, because our constraint is pinning greenlet to a pretty ancient version which can't compile for py312 | 21:21 |
timburke | see p 931697 | 21:21 |
patch-bot | https://review.opendev.org/c/openstack/devstack/+/931697 - devstack - Switch devstack nodeset to Ubuntu 24.04 (Noble) - 1 patch set | 21:21 |
timburke | i think the more-general py3-constraints.txt that i've proposed will get things squared -- or there's also the initial patchset of https://review.opendev.org/c/openstack/swift/+/933369/1 that tries to move the func-py3 jobs back to using global constraints | 21:23 |
patch-bot | patch 933369 - swift - CI: Move a bunch of func test jobs from py38 to py312 - 2 patch sets | 21:23 |
timburke | ...but both of those trip func test failures in https://github.com/openstack/swift/blob/master/test/functional/s3api/test_xxe_injection.py | 21:24 |
timburke | so i think i'm going to need to spend some time hunting down what changed there :-/ | 21:25 |
timburke | it seems likely to be related to our using boto (not boto3) for the test, which in turn expects a *really old* version of requests | 21:26 |
timburke | see also https://bugs.launchpad.net/swift/+bug/1557260 | 21:26 |
patch-bot | Bug #1557260 - Swift3 functests should transition to using boto3 (New) | 21:26 |
indianwhocodes | noooo | 21:27 |
indianwhocodes | lol | 21:28 |
timburke | anyway, i think that's most of what i've got to say about the gate right now -- there's surely more work that can be done, and there's been a bunch of deferred maintenance that seems to be coming due | 21:28 |
timburke | #topic https://bugs.launchpad.net/swift/+bug/2081103 | 21:29 |
patch-bot | Bug #2081103 - s3api: Deleting the current version of an object can (sometimes?) 500 (In Progress) | 21:29 |
timburke | we still haven't fixed fulecorafa's bug :-( | 21:30 |
timburke | if anybody has time to review p 931325 i'd appreciate it | 21:30 |
patch-bot | https://review.opendev.org/c/openstack/swift/+/931325 - swift - versioning: 411 PUTs with neither content-length n... - 2 patch sets | 21:30 |
fulecorafa | timburke that's ok. I saw you opened a PR in eventlet github, right? | 21:30 |
timburke | yes! https://github.com/eventlet/eventlet/pull/985 | 21:31 |
timburke | i should follow up on that; they requested some clarification in the commit message | 21:31 |
fulecorafa | How's it going there? I took a look at it, LGTM but I saw no one else reviewed it right? | 21:32 |
timburke | but the swift patch should be sufficient | 21:32 |
fulecorafa | Oh ok. Then I will look at it ASAP | 21:32 |
timburke | yeah, no one else has looked yet -- if you could try it out in a pre-prod environment, say, and leave a review for how it went, that'd be awesome | 21:34 |
mattoliver | me too. I'll put it on my list. Should get some swift work done (sick of banging my head against the downstream work wall) :) | 21:34 |
mattoliver | +1 | 21:34 |
fulecorafa | I will try it, yes | 21:34 |
timburke | thanks | 21:34 |
fulecorafa | About testing, I will bring a topic at open discussion that may be interesting | 21:35 |
timburke | next up | 21:35 |
timburke | #topic multi-policy containers | 21:35 |
timburke | fulecorafa, how's it going? just wanted to check in since it's been a bit | 21:36 |
fulecorafa | Oh, didn't I mention it? We got it working in prod right now | 21:36 |
timburke | yes! i remember that part :-) wanted to see if we should be trying to block out some review time in the near future | 21:37 |
fulecorafa | We're actively working on bringing it up to date now so we can patch it in swift, but there is quite a lot to do in that regard | 21:37 |
timburke | fair enough -- it's tricky rebasing a large body of work | 21:38 |
fulecorafa | Sorry if we're slow in bringing things here, it's been a pain point to have an outdated version of swift running | 21:38 |
timburke | especially in s3api, which sees no small amount of churn | 21:38 |
fulecorafa | In regards to implementation, we did it through a middleware, with few changes to s3api actually | 21:39 |
timburke | yeah, totally makes sense to prioritize being able to run newer swift over getting the new feature merged -- in fact, it's probably a necessary pre-req | 21:40 |
fulecorafa | I think some things we have done are not even patch-able, which is the biggest holdup | 21:41 |
fulecorafa | As in, we need to rewrite it so it is even possible to make it work with newer code | 21:41 |
timburke | ah, yeah. sorry :-/ | 21:41 |
fulecorafa | But we're trying, it will just take some time to get it ok | 21:42 |
timburke | out of curiosity, where's the friction turning up? we usually try to not be terribly disruptive to third-party middlewares | 21:42 |
fulecorafa | I would say it comes down to patches in s3api, pro controller and related. These patches are not easily migratable due to changes in function name/code placement/functionality changes | 21:44 |
timburke | ah | 21:44 |
timburke | all right, then i think that's all i've got | 21:44 |
timburke | #topic open discussion | 21:44 |
timburke | what else should we discuss? | 21:45 |
fulecorafa | While the middleware could be done through a simple `git apply`, these patches would need to be rewritten and reconsidered | 21:45 |
fulecorafa | I've got a topic regarding testing environment | 21:45 |
timburke | go ahead! | 21:46 |
fulecorafa | We've been trying to use both the official SAIO docker image and the NVidia one for testing, but it is very problematic, mostly due to the way we do things, but also with some breakage from errors not related to our active development | 21:46 |
fulecorafa | So we've been working with an lxc environment, which has come to be pretty cool. Now we're automating everything, trying to take applying patches and starting servers within lxd off developers' hands | 21:47 |
fulecorafa | Would this be an addition to the project? Would you like to have a look at it? | 21:48 |
timburke | nice! i remember notmyname tinkering with lxc/lxd ages ago, but can't find it now... | 21:49 |
mattoliver | I do like a good lxd container, haven't used one lately. I'd love to take a look | 21:49 |
mattoliver | I tinkered years ago, with notmyname too. So might be cool | 21:50 |
fulecorafa | Nice | 21:50 |
timburke | maybe? i'd be happy to look, anyway. fwiw, most of us have tended to settle on https://github.com/NVIDIA/vagrant-swift-all-in-one/ for a dev env | 21:50 |
fulecorafa | I'll talk with my side to make this code available. Any tips on sharing with you guys? | 21:50 |
timburke | mattoliver, ah, good! you have a copy! https://github.com/matthewoliver/runway | 21:51 |
timburke | though it looks like https://github.com/kevin-wyx/runway might have a slightly more recent snapshot | 21:52 |
mattoliver | there you go :P never remove a cloned repo :P | 21:52 |
mattoliver | yeah probably | 21:52 |
timburke | fulecorafa, it's up to you -- if you can and would like to make it public, that's probably easiest, but you could also add us separately to some private repo if you'd prefer | 21:53 |
fulecorafa | Nice, I'll discuss making the repo public then with the other authors. Thanks a lot | 21:54 |
timburke | sure thing! thanks for wanting to make the swift developer experience better! | 21:55 |
mattoliver | +100 | 21:55 |
mattoliver | I've finally pushed up a new version of https://review.opendev.org/c/openstack/swift/+/916861 | 21:55 |
patch-bot | patch 916861 - swift - db_auditor: add vacuum support - 11 patch sets | 21:55 |
mattoliver | That one dumps the bucket bloatiness stats to recon and sends them as gauges in statsd at the end of an audit cycle. | 21:56 |
mattoliver | so it's the gauge for the last run. which seems to make more sense to me. | 21:57 |
mattoliver | It also adds statsdclient.gauge so we can use gauges now. | 21:57 |
timburke | i think that makes sense enough -- it's tricky striking the right balance between emitting stats frequently enough that your graph/timeseries db doesn't forget about the value and slowly enough that you can roll up some useful info | 21:59 |
mattoliver | that's all I really have to report on my end | 21:59 |
timburke | i see that it tripped https://bugs.launchpad.net/openstacksdk/+bug/2085654 :-/ | 21:59 |
patch-bot | Bug #2085654 - Two intermittent functional test failures (New) | 21:59 |
timburke | at least i finally wrote up a bug about it, though! | 21:59 |
mattoliver | yay | 22:00 |
timburke | all right, we're at time -- i oughta let mattoliver get on with his morning ;-) | 22:00 |
timburke | thank you all for coming, and thank you for working on swift! | 22:00 |
timburke | #endmeeting | 22:00 |
opendevmeet | Meeting ended Wed Oct 30 22:00:41 2024 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 22:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.html | 22:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.txt | 22:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/swift/2024/swift.2024-10-30-21.00.log.html | 22:00 |
opendevreview | Shreeya Deshpande proposed openstack/swift master: Use a patcher to make SwiftLogAdapter looks like a StatsdClient https://review.opendev.org/c/openstack/swift/+/931473 | 23:12 |
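
For anyone following up on the sharder question at the top of this log: one way to check whether a `databases_per_second` or `interval` change actually reduced the `states=auditing` traffic is to count those requests per user agent in the container-server logs before and after the change. The sketch below is only an illustration, not a Swift tool. It assumes the user agent is the quoted field that directly follows the quoted transaction id, as in the sample line mcape pasted; the function name `count_auditing_queries` and the `<unparsed>` bucket are made up for this example, and a differently configured log format would need a different pattern.

```python
import re
from collections import Counter

# Assumption: the user agent is the quoted field right after the quoted
# transaction id, as in the sample log line quoted earlier in this log.
TXN_THEN_AGENT = re.compile(r'"(tx[^"\s]+)" "([^"]*)"')


def count_auditing_queries(lines):
    """Count requests that carry states=auditing, grouped by user agent."""
    counts = Counter()
    for line in lines:
        if "states=auditing" not in line:
            continue
        match = TXN_THEN_AGENT.search(line)
        # Lines that do not fit the assumed layout are kept visible
        # under a separate key rather than silently dropped.
        agent = match.group(2) if match else "<unparsed>"
        counts[agent] += 1
    return counts
```

Feeding it an open log file (any iterable of lines works) for equal time windows before and after the config change gives two counters to compare; a roughly 5x drop in the "Swift Container Sharder" count is what timburke said he would have expected.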