*** renich has quit IRC | 00:24 | |
donnyd | what would be the best way to disable public reading of large array of objects | 00:32 |
donnyd | nevermind | 00:39 |
donnyd | that was easier than I thought | 00:39 |
seongsoocho | hi, I'm in an emergency ..... a few hours ago, all of the container servers' disks suddenly filled up. I added more nodes but the disk usage isn't going down.... Is there any solution? | 01:44 |
timburke | *all* of them? what's the ratio like between container disk space vs object disk space? the usual sort of failure we see with containers and full drives is that a handful get *very* full, but most of them are still fairly empty... | 02:12 |
timburke | separately (but maybe kinda related?) i wonder if there should be a skip_commits=True around https://github.com/openstack/swift/blob/2.25.0/swift/container/sharder.py#L1628-L1629 ... | 02:13 |
seongsoocho | timburke: yes, all of them. The container servers run alone on their own nodes. | 02:16 |
seongsoocho | the total container disk size is 800GB and the object disks are 1PB | 02:18 |
timburke | so replication pretty much *always* leads to some increased disk usage during an expansion -- gotta make sure the data's been replicated everywhere before we unlink. there are some emergency replication options like handoffs_only that were born out of some cluster-full incidents and needing a way to quickly process handoffs (and *only* handoffs) so we can unlink faster | 02:21 |
timburke | how full are the new disks? | 02:21 |
seongsoocho | timburke: ok then.. set handoffs_only = true on all container nodes and restart the replicators? | 02:22 |
seongsoocho | handoffs_only = yes | 02:23 |
timburke | yeah -- you'll want to watch your logs for when the replicators start complaining that they didn't find any work to do and reassess the situation | 02:24 |
seongsoocho | ok thanks I will try | 02:24 |
timburke | if disks are still full and even the new guys are looking bad, we've gotta worry. if things are coming down, flip it back off and life goes on :-) | 02:25 |
seongsoocho | timburke: the new nodes' disk usage is increasing now.. but the original ones are still full. I will wait | 02:26 |
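For reference, a minimal sketch of the emergency mode discussed above, in container-server.conf-sample style; the option lives in the replicator section and is meant to be switched back off once the replicators log that there is no handoff work left:

```
[container-replicator]
# Emergency mode: process handoff partitions only, so data that no longer
# belongs on this node can be replicated out and unlinked as quickly as
# possible. Revert to the default (no) and restart the replicators once
# they report no remaining handoff work.
handoffs_only = yes
```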
timburke | fwiw, i've got ~470T in SSDs for ~25P in spinners (both raw, not usable) -- but those SSDs are pretty empty. lemme get an estimate on what the actual used numbers are... | 02:28 |
*** manuvakery has joined #openstack-swift | 02:35 | |
timburke | it's kinda fuzzy -- there's an object policy on most of them that brings the usage up a good bit, like 18T of the 470T is filled probably? 3-4% or so. there are a couple disks *not* in that object policy; they're sitting at like 0.6%, but i'm not sure that's actually representative; feels like it might be low | 02:36 |
timburke | fwiw, i'd treat container disks kinda like object in that you don't really want them to get more than 75-80% full if you can avoid it | 02:54 |
*** psachin has joined #openstack-swift | 03:00 | |
seongsoocho | timburke: the situation is getting better, the disk usage is decreasing.... | 03:07 |
seongsoocho | ;;;;;; what a terrible friday morning.. | 03:07 |
seongsoocho | timburke: thanks for your help. I really appreciate you | 03:08 |
timburke | \o/ glad it's working | 03:55 |
timburke | and hey, could be worse -- could've been a late friday night ;-) | 03:55 |
seongsoocho | :-) hope you have a happy friday night. | 04:17 |
*** evrardjp has quit IRC | 04:36 | |
*** evrardjp has joined #openstack-swift | 04:36 | |
*** psachin has quit IRC | 04:51 | |
timburke | alecuyer, rledisez: i'm about to go to bed, but something to think about with the updaters -- what's your container-replicator cycle time like? we just intentionally *lengthened* ours, which seems to be having good effect on our updaters' success/failure ratio | 05:40 |
timburke | theory is that we used to have it tuned too fast, causing the replicator and/or sharder to flush the db.pending file more often than was helpful | 05:40 |
timburke | (side note: we should probably make https://github.com/openstack/swift/blob/2.25.0/swift/common/db.py#L52-L54 configurable. especially since that value predates encryption and updates have gotten bigger) | 05:42 |
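A hedged sketch of the tuning timburke describes, slowing the container-replicator's cycle so db.pending files are flushed less often; `interval` is the standard pacing option, while the 128 KiB pending cap itself is hard-coded in swift/common/db.py (per the link above) and is not configurable. The value below is illustrative, not a recommendation:

```
[container-replicator]
# Time to wait between replication passes (default 30 seconds). A longer
# interval means the replicator commits each DB's .pending file less often,
# leaving more room for the updaters' batched writes.
interval = 300
```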
seongsoocho | After recovering all the container servers, the container-replicator logs that some DBs fail to replicate.... Should I delete those db files and wait for them to re-replicate? | 06:08 |
seongsoocho | May 8 15:06:43 cnfswf-sto003.svr.toastmaker.net container-replicator: Synchronization for /srv/node/sdb/containers/2124/776/84cb2e263a504ce3f48a81135e53d776/84cb2e263a504ce3f48a81135e53d776.db has fallen more than 100000 rows behind; moving on and will try again next pass | 06:08 |
timburke | no, that's fine -- it should catch up eventually, it just may take some time. probably worth checking to see how many rows are in the DB, and what sync points are recorded | 06:35 |
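A hedged, read-only diagnostic along the lines timburke suggests (not a supported Swift tool): it reads the object row count and the recorded incoming sync points straight out of the container DB, assuming the standard `object` and `incoming_sync` tables:

```python
import sqlite3

def db_progress(db_path):
    """Show how many object rows a container DB holds and where each peer's
    incoming sync point sits, to gauge how far behind replication is."""
    conn = sqlite3.connect(db_path)
    try:
        max_row = conn.execute('SELECT MAX(ROWID) FROM object').fetchone()[0]
        syncs = conn.execute(
            'SELECT remote_id, sync_point FROM incoming_sync').fetchall()
    finally:
        conn.close()
    return max_row, syncs

# e.g. against the DB from the log line above
print(db_progress('/srv/node/sdb/containers/2124/776/'
                  '84cb2e263a504ce3f48a81135e53d776/'
                  '84cb2e263a504ce3f48a81135e53d776.db'))
```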
seongsoocho | timburke: thanks . and good night :-) | 06:36 |
timburke | it made progress on those 100k; it just doesn't want to be stuck on this one db for too long. if it seems like it's really holding you up, you could consider upping max_diffs: https://github.com/openstack/swift/blob/master/etc/container-server.conf-sample#L154-L158 | 06:37 |
timburke | idk that i've ever bothered to do that, though | 06:38 |
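For context, the knobs behind that warning, in the conf-sample style timburke links to: the replicator ships up to per_diff rows per usync round and gives up on a DB after max_diffs rounds in a pass, so the defaults below (1000 x 100) line up with the "100000 rows behind" message in the log. Raising max_diffs trades a longer stretch on one DB for fewer of those warnings:

```
[container-replicator]
# Rows shipped per usync round.
# per_diff = 1000
# Maximum usync rounds per database per pass; once per_diff * max_diffs rows
# have been sent, the replicator logs the "fallen behind" warning, moves on,
# and retries on the next pass.
# max_diffs = 100
```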
timburke | and you're right, i should get to bed ;-) | 06:38 |
timburke | good night! enjoy the rest of your day | 06:38 |
*** dtantsur|afk is now known as dtantsur | 07:39 | |
*** manuvakery has quit IRC | 08:05 | |
*** tkajinam has quit IRC | 08:13 | |
*** takamatsu has joined #openstack-swift | 08:29 | |
*** ccamacho has joined #openstack-swift | 09:31 | |
*** rcernin has quit IRC | 09:39 | |
*** threestrands has quit IRC | 09:41 | |
*** rcernin has joined #openstack-swift | 10:22 | |
viks____ | I see a lot of errors like: | 10:22 |
viks____ | ``` | 10:22 |
viks____ | container-server: ERROR __call__ error with DELETE /disk2/1552/5cc5358c50e34a3196eca65ea9039633/backup/Test/Archive/test/9bf82551-aa58-4efd-b045-263c5bd20f50/01f410c3-25fe-ec74-2491-88db97686035/blocks/42a1b2043d580213243c531152cfe1e0/34457.8b7f06a499d03d21dfe659ccc6d9e75d.028a8da294d30174d43667c98d9586c8.blk : LockTimeout (10s) /srv/node/disk2/containers/1552/e51/610717f4ac1d0c3ca1212deb49299e51/.lock (txn: tx3c75433f0a284425bbb69-005eb52e70) | 10:22 |
viks____ | ``` | 10:22 |
viks____ | when an s3api bulk delete is running against a container with 2 million objects; the container db file is also around 3GB | 10:22 |
viks____ | I also see the below in the container-server configuration sample under `[container-sharder]`: | 10:22 |
viks____ | ``` | 10:22 |
viks____ | # Large databases tend to take a while to work with, but we want to make sure | 10:22 |
viks____ | # we write down our progress. Use a larger-than-normal broker timeout to make | 10:22 |
viks____ | # us less likely to bomb out on a LockTimeout. | 10:22 |
viks____ | # broker_timeout = 60 | 10:22 |
viks____ | ``` | 10:22 |
viks____ | So can I set a similar timeout for the container-server as well? Or how do I overcome this error? | 10:22 |
*** tkajinam has joined #openstack-swift | 12:27 | |
*** mpasserini has joined #openstack-swift | 12:43 | |
mpasserini | Hi there. Is there a way to trace in the logs who changed a Swift ACL on a given container, and how they changed it? I only see logs about "POST" but no details of the contents. | 12:43 |
*** jv_ has quit IRC | 13:02 | |
*** jv_ has joined #openstack-swift | 13:05 | |
DHE | A POST or a PUT are really about the only ways to do it. | 13:30 |
clayg | mpasserini: there's no audit trail of the contents of the requests, there are some options in some of the logging middlewares to capture headers out to logs you could investigate | 15:11 |
clayg | viks____: Anyone that's ever run into contention on container databases has seen the LockTimeout - a longer timeout isn't really a good solution for the container server (10s is already a long time) | 15:12 |
clayg | viks____: we gave the sharder a pass for the longer timeout mainly for legacy support of containers that were 100's of gigabytes before sharding existed and the sharder has some unique long running operations | 15:13 |
clayg | viks____: 3GB for a container database is not that "big" IME - I think it would be difficult to hit the timeout unless... are the containers maybe on spinning disks? | 15:15 |
clayg | I got a DM that looked kinda interesting. Is anyone else working on s3 acl's right now? https://github.com/openstack/swift/compare/master...idrive-online-backup:feature/bucket-policy | 15:18 |
*** tkajinam has quit IRC | 15:22 | |
viks____ | clayg: Thanks for the reply... So what action should I take for such an error? My drives are SAS drives, but I used one NVMe SSD drive to check if this goes away... I see this error on that one as well.. | 15:25 |
clayg | you get a TEN SECOND lock timeout with a 3 GB database on an nvme drive? In 10s an nvme could read the entire DB into memory and write it back out ... 100 times | 15:30 |
viks____ | clayg: Hmmm... i'm not sure what else could be causing this.. 🤔 | 15:32 |
clayg | so it's hard to run dbs on HDDs - keeping them small helps keep them fast | 15:33 |
clayg | what's the io utilization of the disk under the database? | 15:34 |
viks____ | the container db disk utilization is not going beyond 5%... also, if you check the error above, the timeout is basically happening on an object path... | 15:39 |
*** renich has joined #openstack-swift | 15:40 | |
*** mpasserini has quit IRC | 15:40 | |
*** gyee has joined #openstack-swift | 15:41 | |
viks____ | sorry... it's some path on the container disk itself... | 15:42 |
viks____ | what is the file with the `.blk` extension that it is trying to lock? | 15:47 |
timburke | viks____, so the container DB is trying to process an object DELETE -- that is, an object-server or object-updater is trying to tell the DB that a delete occurred so the object stops showing up in listings | 16:10 |
*** jv_ has quit IRC | 16:15 | |
timburke | how big does /srv/node/disk2/containers/1552/e51/610717f4ac1d0c3ca1212deb49299e51/610717f4ac1d0c3ca1212deb49299e51.db.pending get? it should keep growing and shrinking, growing and shrinking as the container server scribbles down updates then batch-loads them | 16:15 |
timburke | there's a cap around 128k, but some operations will cause it to flush early | 16:16 |
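A conceptual sketch of the batching described above, using hypothetical helper names rather than Swift's actual internals: updates are appended cheaply to the .pending file and merged into the sqlite DB once the file approaches the ~128 KiB cap (or when an operation forces an early flush):

```python
import os

PENDING_CAP = 128 * 1024  # rough cap discussed above; hard-coded in Swift

def put_record(pending_file, record, merge_into_db):
    """Append one update to the .pending file, then batch-merge the whole
    file into the sqlite DB once it has grown past the cap."""
    with open(pending_file, 'ab') as fp:
        fp.write(b':' + record)   # scribble the update down quickly
        fp.flush()
        size = os.fstat(fp.fileno()).st_size
    if size > PENDING_CAP:
        merge_into_db(pending_file)  # hypothetical: load the batch, truncate
```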
clayg | viks____: `.blk` is part of the object name , not a fs path | 16:26 |
viks____ | timburke: Thanks... I see it growing and shrinking, not necessarily reaching the 12k limit... | 16:28 |
viks____ | clayg: ok | 16:28 |
viks____ | sorry ... 128K limit.. | 16:31 |
viks____ | can those error messages be safely ignored? | 16:34 |
*** evrardjp has quit IRC | 16:36 | |
*** evrardjp has joined #openstack-swift | 16:36 | |
*** dtantsur is now known as dtantsur|afk | 16:42 | |
timburke | viks____, so if it's never even getting close to the limit, your container replicator/sharder/auditor might be too aggressive. just last night i dropped the dbs_per_second for my replicators and sharders so i could make some headway on async pendings; seems to have greatly reduced the number of updater failures | 17:29 |
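A sketch of the knob timburke mentions, assuming it is exposed as databases_per_second for both daemons as in recent container-server.conf-sample files; check your release's sample config for the exact name and default before relying on it:

```
[container-replicator]
# Throttle how many DBs each replicator worker processes per second, so it
# churns db.pending files less aggressively.
databases_per_second = 25

[container-sharder]
databases_per_second = 25
```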
zaitcev | BTW, I gave up on https://github.com/eventlet/eventlet/issues/526 | 17:42 |
zaitcev | I configured an additional region where identity has no SSL, and told Swift to use that region. That avoids the crash. | 17:43 |
zaitcev | Fortunately, regions mean nothing for tokens. So, the client obtains token using RegionOne, passes it to Swift, keystonemiddleware talks to Keystone using RegionX and passes the token for verification, and it works. No region scoping. | 17:44 |
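A hedged sketch of how the workaround zaitcev describes might be wired up in proxy-server.conf, assuming the standard keystonemiddleware auth_token filter; the hostnames and region names are placeholders, and region_name is one plausible way to steer the middleware at the extra region:

```
[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
# Validate tokens against the extra, non-SSL identity endpoint to dodge the
# eventlet/SSL crash linked above; clients can keep authenticating against
# RegionOne, since the token itself is not region-scoped.
www_authenticate_uri = http://keystone.example.com:5000
auth_url = http://keystone.example.com:5000
region_name = RegionX
```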
*** manuvakery has joined #openstack-swift | 17:56 | |
*** renich has quit IRC | 18:05 | |
timburke | i really ought to learn how to setup/use wireguard -- i feel like that'll be a great idea for all intra-cluster traffic, including swift <-> keystone | 18:57 |
clayg | viks____: I don't think you should ignore it - it's telling you something important: "a DELETE request to this database took >10s so I gave up and returned 500" - realistically there's not a lot of good reasons for anything to take that long | 19:11 |
clayg | viks____: you say the database isn't big, you say the disk isn't busy - those are the reasons WE'VE ALL SEEN for things to slow down (and sharding or more iops solves them) - but if NOT those it's something more sinister which is even MORE interesting | 19:12 |
clayg | viks____: but you know... it could just be this -> https://bugs.launchpad.net/swift/+bug/1877651 | 19:28 |
openstack | Launchpad bug 1877651 in OpenStack Object Storage (swift) "Reclaim of tombstone rows is unbounded and causes LockTimeout (10s)" [Medium,New] - Assigned to clayg (clay-gerrard) | 19:28 |
*** benj_ has quit IRC | 19:55 | |
*** benj_ has joined #openstack-swift | 19:56 | |
*** manuvakery has quit IRC | 20:06 | |
*** renich has joined #openstack-swift | 20:55 | |
*** ccamacho has quit IRC | 21:02 | |
*** mikecmpbll has joined #openstack-swift | 22:46 | |
*** jv_ has joined #openstack-swift | 23:19 | |
*** jv_ has quit IRC | 23:25 |