timburke | stay safe mattoliverau! GL! | 00:18 |
*** gyee has quit IRC | 01:08 | |
openstackgerrit | Xuan Yandong proposed openstack/swift-bench master: Remove six Replace the following items with Python 3 style code. https://review.opendev.org/755151 | 01:21 |
*** clayg has quit IRC | 02:14 | |
*** tkajinam has quit IRC | 02:14 | |
*** StevenK has quit IRC | 02:14 | |
*** mattoliverau has quit IRC | 02:14 | |
*** tkajinam_ has joined #openstack-swift | 02:15 | |
*** StevenK has joined #openstack-swift | 02:15 | |
*** clayg has joined #openstack-swift | 02:15 | |
*** ChanServ sets mode: +v clayg | 02:15 | |
*** mattoliverau has joined #openstack-swift | 02:20 | |
*** tepper.freenode.net sets mode: +v mattoliverau | 02:20 | |
*** viks____ has joined #openstack-swift | 02:28 | |
*** rcernin has quit IRC | 02:55 | |
*** rcernin_ has joined #openstack-swift | 02:56 | |
*** psachin has joined #openstack-swift | 03:31 | |
*** psachin has quit IRC | 03:32 | |
*** psachin has joined #openstack-swift | 03:33 | |
*** m75abrams has joined #openstack-swift | 04:22 | |
*** evrardjp has quit IRC | 04:33 | |
*** evrardjp has joined #openstack-swift | 04:33 | |
*** mikecmpbll has joined #openstack-swift | 04:37 | |
openstackgerrit | Xuan Yandong proposed openstack/swift-bench master: Remove six and py27 tox Replace the following items with Python 3 style code. https://review.opendev.org/755151 | 06:06 |
openstackgerrit | Xuan Yandong proposed openstack/swift-bench master: Remove six and py27 tox https://review.opendev.org/755151 | 06:43 |
*** rcernin_ has quit IRC | 07:06 | |
*** rcernin_ has joined #openstack-swift | 07:17 | |
*** rcernin_ has quit IRC | 07:20 | |
*** rcernin has joined #openstack-swift | 07:20 | |
*** mikecmpbll has joined #openstack-swift | 08:06 | |
*** rcernin has quit IRC | 08:48 | |
openstackgerrit | wu.shiming proposed openstack/swift master: requirements: Drop os-testr https://review.opendev.org/755232 | 08:53 |
*** ab-a has quit IRC | 09:35 | |
*** ab-a has joined #openstack-swift | 09:36 | |
openstackgerrit | Merged openstack/swift stable/train: py3: Fix swift-dispersion-populate https://review.opendev.org/754853 | 10:02 |
*** StevenK has quit IRC | 10:58 | |
*** StevenK has joined #openstack-swift | 10:58 | |
*** rcernin has joined #openstack-swift | 11:50 | |
*** rcernin has quit IRC | 12:16 | |
*** tkajinam_ has quit IRC | 13:12 | |
*** m75abrams has quit IRC | 13:57 | |
*** gyee has joined #openstack-swift | 14:59 | |
*** ozzzo has joined #openstack-swift | 15:19 | |
*** Hamidreza has joined #openstack-swift | 15:24 | |
Hamidreza | Hi | 15:25 |
Hamidreza | I've a question about openstack swift storage | 15:25 |
Hamidreza | I added 20 disks to my cluster nodes and then updated the ring. Now it should rebalance the data, but it didn't do that!!! | 15:25 |
Hamidreza | what should i do? | 15:25 |
timburke | Hamidreza, have you checked that the object-replicator is running on all nodes? | 15:27 |
Hamidreza | I checked the object-replicator and even the rsync process | 15:28 |
Hamidreza | and they were working | 15:28 |
ormandj | we see a lot of intermittent ConnectionTimeouts to backend servers (using servers_per_port of 2) - I would have expected slowness, but not connection timeouts, if disks are saturated. is this expected with ussuri? | 15:28 |
Hamidreza | and I even increased the number of processes | 15:28 |
timburke | i just saw http://lists.openstack.org/pipermail/openstack-discuss/2020-September/017675.html -- note that you *won't* want to keep the fs read-only; the replicator needs to be able to delete data that no longer belongs on that disk | 15:30 |
Hamidreza | ok, what can I do? | 15:32 |
timburke | Hamidreza, you may want to look at the handoffs_first and handoff_delete options: https://github.com/openstack/swift/blob/2.26.0/etc/object-server.conf-sample#L287-L304 | 15:32 |
timburke | if things have been fairly healthy, you should be fine to set handoffs_first=True and handoff_delete=1, restart the replicators, and wait for a replication cycle -- the full drives should start draining fairly quickly at that point | 15:34 |
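For reference, a minimal sketch of those settings in object-server.conf (option names as in the sample config linked above; the values are the ones suggested here, not the defaults):

```ini
[object-replicator]
# process partitions that do NOT belong on this node first,
# so overfull drives start draining immediately
handoffs_first = True
# remove a handoff partition once at least this many replicas
# have confirmed receipt, instead of waiting for all of them
handoff_delete = 1
```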
timburke | odds are, you'll be limited by the iops of the new drives | 15:34 |
Hamidreza | I don't want to start the object-replicator | 15:36 |
Hamidreza | I stopped it before | 15:36 |
Hamidreza | because one day I suddenly saw all of my disks getting broken, one by one | 15:37 |
Hamidreza | I think they were under high pressure | 15:37 |
Hamidreza | so I disabled the object-replicator | 15:38 |
Hamidreza | after that day none of my disks got broken!!! | 15:38 |
timburke | "broken" how? the replicators (and, if using erasure coding, reconstructors) are how swift (1) ensures that data remains durable even in the face of failing drives and (2) moves data as part of expansions so that drives don't fill up. it's a vital part of your swift deployment | 15:42 |
*** openstackgerrit has quit IRC | 15:46 | |
Hamidreza | (2) yeah, this is a vital part of a swift deployment, but it didn't work for me. it should rebalance the disks and move data from the full disks to the empty ones | 15:47 |
*** mikecmpbll has quit IRC | 15:53 | |
*** mikecmpbll has joined #openstack-swift | 15:54 | |
*** Hamidreza has quit IRC | 16:02 | |
*** psachin has quit IRC | 16:34 | |
clayg | Hamidreza: maybe try changing the ionice_priority setting for the object-replicator with rather low concurrency and handoffs_first=True while monitoring your devices with iostat | 16:50 |
clayg | hopefully you can find a balance that allows your disks to service the io needs of your client-facing traffic as well as the consistency engine's io needs for background work | 16:50 |
clayg | in an emergency you can also turn off other processes like the object-auditor | 16:50 |
clayg | if you run container resources on the same disks as your object devices that can put a lot of pressure on those disks as well | 16:51 |
clayg | dedicated ssd's are best | 16:51 |
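A sketch of the throttled-replicator idea clayg describes, again for object-server.conf (the values are illustrative starting points, not recommendations):

```ini
[object-replicator]
# a single worker caps background disk load
concurrency = 1
handoffs_first = True
# run replication io at best-effort's lowest priority so
# client-facing requests win; see `man ionice` for semantics
ionice_class = IOPRIO_CLASS_BE
ionice_priority = 7
```

Watching `iostat -dxm 5` while tuning, as suggested, shows whether the disks still have headroom for client traffic.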
clayg | timburke: so my config tests are having problems with the dlo middleware trying to reparse the config for legacy options... | 16:52 |
clayg | I'm sure I can get the tests passing - but i'm not looking forward to an overhaul of staticweb | 16:52 |
clayg | is there maybe a better idea than message passing via the request environ for how SLO can signal to proxy-logging that an error occurred during the iterator? | 16:54 |
clayg | i feel like catch_errors and proxy-logging are starting to kinda team up or converge when it comes to watchdogging the iterators on content length 🤔 | 16:55 |
timburke | ormandj, connection timeouts aren't so surprising, especially if it's a busy cluster. one of the things that can happen is the object-server gets stuck waiting on disk IO, so incoming connections can't be accepted. kernel will queue some of them, but eventually, the connect will block. you can use something like `lsof -a -u swift -i -s TCP:LISTEN -T q` to check in on your listen queue depths | 16:55 |
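If `ss` is handier than `lsof`, something like this shows the same listen-queue depths (assuming the default object-server base port of 6200; older clusters use 6000):

```sh
# for listening sockets, Recv-Q is the current accept-queue depth
# and Send-Q is the configured backlog limit
ss -ltn '( sport >= :6200 and sport <= :6299 )'
```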
ormandj | timburke: yeah, is there any way around it? tl;dr, we've had to lower replication workers to almost nothing and are only doing about 40MB/s to the 'new' server, and it's still causing every client massive problems with ~8 out of 56 drives per server fairly iops-saturated | 16:56 |
ormandj | customers would be fine with slow, but broken is not so much. we could increase conn_timeout, but historically, that really hasn't helped | 16:57 |
ormandj | we're about to try servers_per_port = 4 instead of 2 in hopes it will help | 16:57 |
ormandj | lesson learned, don't deploy with fewer than 20 nodes in the future, but trying to figure out a way out of this pickle in the meantime hah | 16:58 |
ormandj | i think we're down at 12 replication workers atm | 16:59 |
ormandj | see some read queues of 20-30ish | 17:07 |
ormandj | using ss, read/send queues at 127/128 | 17:08 |
*** openstackgerrit has joined #openstack-swift | 17:41 | |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Test proxy-server.conf-sample https://review.opendev.org/755087 | 17:41 |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Add staticweb to default pipeline https://review.opendev.org/755132 | 17:47 |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Log error processing manifest as ServerError https://review.opendev.org/752770 | 17:47 |
*** recyclehero has joined #openstack-swift | 17:59 | |
recyclehero | hi | 17:59 |
recyclehero | consider horizon, keystone, and the databases gone. swift-proxy and the account/container/object servers are available. | 18:01 |
recyclehero | wondering if there's any hope of recovering the data, or if I should just burn the thing | 18:01 |
DHE | recovery how? | 18:03 |
recyclehero | my queens infra was very unstable, mostly hardware problems. going to deploy a new ussuri with kolla | 18:03 |
recyclehero | what should I do with my swift? it's something good to know for planning DR later | 18:04 |
recyclehero | DHE: getting files | 18:04 |
recyclehero | objects sorry | 18:04 |
DHE | the official thing I could suggest is making a project with the same project ID as the old one. however looking at my (admittedly old) version of the openstack cli tool there isn't a means to select your own uuid so you might have to go into the keystone DB post creation and change it | 18:05 |
DHE | authentication is really by your project membership. other than using read/write ACLs swift doesn't care much about individual users | 18:05 |
DHE | I built my swift cluster to have minimal keystone dependencies, so there are tempurl secrets everywhere and most programs that would authenticate use that instead. really keystone is only needed for deletion (that might be beatable, but I'm using bulk delete so not bothering) and container/object listings | 18:06 |
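For anyone following along, the tempurl flow DHE describes looks roughly like this (hypothetical key and path; `swift tempurl` ships with python-swiftclient):

```sh
# set a secret on the account, then mint a signed URL good for 24 hours
swift post -m 'Temp-URL-Key:mysecret'
swift tempurl GET 86400 /v1/AUTH_bob/photos/vacation.jpg mysecret
```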
recyclehero | DHE: aha, but then I'd have to use swift to get them back. is there a way I can assemble the files from the swift dbs myself? | 18:07 |
DHE | well all the swift dbs under the account and container directories on your disks are sqlite so you can absolutely stick your nose in there with the sqlite tool | 18:07 |
recyclehero | underlying fs + swift dbs? | 18:08 |
DHE | if you know the object URLs you want there's swift-get-nodes which will take the ring file and path name and provide both full server+paths, and CURL commands to get some data | 18:09 |
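For example (hypothetical account/container/object names; the ring path assumes the default /etc/swift):

```sh
swift-get-nodes /etc/swift/object.ring.gz AUTH_bob photos vacation.jpg
```

It prints each primary and handoff node along with the on-disk paths and ready-made curl commands.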
recyclehero | files are made into chunks, if I'm correct. somehow one should assemble them back, right? | 18:09 |
timburke | your swift data is still safe and sound; the difficult part will be finding it. as DHE suggests, if you can get the project IDs to match between old and new, it'll all be there and available. if matching project IDs isn't feasible, you should still be able to create a "reseller admin" user that will have full access to any account; that user could then do server side copies of data to a new location, then clean up the old data | 18:09 |
DHE | if you're using EC, yes they are. if you're using multi-way replication each file is fully intact | 18:09 |
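On the reseller-admin route timburke mentions: with keystoneauth, that boils down to granting a user whatever role is configured as reseller_admin_role (ResellerAdmin by default). A sketch with hypothetical user/project names:

```sh
openstack role create ResellerAdmin
openstack role add --user recovery-admin --project admin ResellerAdmin
```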
recyclehero | timburke: nice. I read that OpenStack was made from nova and swift, by NASA and Rackspace respectively. I like swift | 18:11 |
timburke | us too :-) | 18:12 |
recyclehero | and can I do all of that from the cmd now? | 18:12 |
recyclehero | sure. thanks :-) | 18:13 |
timburke | the cardinality of accounts is usually fairly small -- i'd probably write a little script to just walk the account disks on each node and build a list of all accounts in the cluster. then you'd need to figure out which account was whose and make a mapping from old account to new account, which may be non-trivial. then script a data-mover and wait a while | 18:16 |
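A rough sketch of such a script (assumes the default /srv/node layout and the account_stat table that swift's account DBs carry; run it on each storage node):

```python
#!/usr/bin/env python
# Rough sketch: list every account present on this node's disks.
# Adjust DEVICES if your drives are mounted somewhere other than /srv/node.
import glob
import sqlite3

DEVICES = '/srv/node'

accounts = set()
for db_path in glob.glob('%s/*/accounts/*/*/*/*.db' % DEVICES):
    try:
        conn = sqlite3.connect(db_path)
        row = conn.execute('SELECT account FROM account_stat').fetchone()
        conn.close()
        if row:
            accounts.add(row[0])
    except sqlite3.DatabaseError:
        pass  # skip partial or corrupt DBs

for account in sorted(accounts):
    print(account)
```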
timburke | might be worth doing periodic (encrypted) backups of the keystone db into swift ;-) then i think you could just restore from the backup and be up and running again | 18:17 |
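A sketch of that backup idea (hypothetical names; assumes mysql, gpg, and a python-swiftclient new enough to upload from stdin):

```sh
mysqldump keystone \
    | gpg --encrypt --recipient ops@example.com \
    | swift upload keystone-backups --object-name "keystone-$(date +%F).sql.gpg" -
```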
recyclehero | thanks again, I will start and come back when I have some progress | 18:23 |
openstackgerrit | Romain LE DISEZ proposed openstack/swift master: Fix a race condition in case of cross-replication https://review.opendev.org/754242 | 18:32 |
ormandj | if you have logs, you can probably glean a lot of account info from them | 18:32 |
ormandj | and just mangle the db | 18:32 |
openstackgerrit | Romain LE DISEZ proposed openstack/swift master: Fix a race condition in case of cross-replication https://review.opendev.org/754242 | 18:38 |
timburke | almost meeting time! | 20:51 |
seongsoocho | \o/ | 20:54 |
kota_ | morning | 20:57 |
mattoliverau | Morning | 20:59 |
*** nicolasbock has joined #openstack-swift | 21:11 | |
nicolasbock | Hi! Does swift support rewriting requests to public (staticweb, tempurl) containers? I am looking to be able to point my browser to `www.example.com` and be redirected to `https://swift-cluster.example.com/v1/AUTH_account/container/object?....` | 21:37 |
timburke | nicolasbock, you'll want to look at the cname_lookup and domain_remap middlewares, but the long and short of it is yeah, that's doable | 21:39 |
nicolasbock | Oh cool | 21:39 |
nicolasbock | Thanks for the pointer timburke ! | 21:39 |
timburke | np! idea is to have www.example.com have a cname record pointing to something like container.auth-account.swift-cluster.example.com, then cname_lookup does the translation in the received host header and domain_remap unpacks the account/container pieces | 21:44 |
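A sketch of the proxy-server.conf side (cname_lookup must come before domain_remap; egg names and the storage_domain option are as in swift's sample config, and the domains are from the example above):

```ini
[pipeline:main]
# excerpt -- auth, caching, etc. omitted
pipeline = catch_errors cname_lookup domain_remap proxy-logging cache proxy-server

[filter:cname_lookup]
use = egg:swift#cname_lookup
storage_domain = swift-cluster.example.com

[filter:domain_remap]
use = egg:swift#domain_remap
storage_domain = swift-cluster.example.com
```

paired with a DNS record like `www.example.com. CNAME container.auth-account.swift-cluster.example.com.`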
nicolasbock | Nice, that doesn't sound too bad | 21:46 |
DHE | I'm just using nginx as a proxy to the proxy (HA!) no tempurl though (but it can be done if need be) | 21:48 |
clayg | wsgi is the worst abstraction for web request processing; except for all the others | 22:04 |
clayg | My 12 year old has his very first football scrimmage tonight! Go Rice Ravens! | 22:05 |
timburke | haha nice! have fun! | 22:05 |
timburke | oh! i also kinda wanted to point out https://review.opendev.org/#/c/751966/ to people -- i still haven't gotten a fips-enabled vm to come back up in a usable state, but it looks like the patch might be about ready | 22:06 |
patchbot | patch 751966 - swift - replace md5 with swift utils version - 11 patch sets | 22:06 |
timburke | merging it will result in a bunch of merge conflicts, so it seems worth having on people's radars | 22:06 |
timburke | and if it means i'll be able to get even just one review from cschwede i'm calling it worth it ;-) | 22:07 |
*** rcernin has joined #openstack-swift | 22:12 | |
*** mikecmpbll has quit IRC | 22:23 | |
*** mikecmpbll has joined #openstack-swift | 22:29 | |
*** tkajinam has joined #openstack-swift | 23:00 | |
openstackgerrit | Tim Burke proposed openstack/liberasurecode master: Be willing to write fragments with legacy crc https://review.opendev.org/738959 | 23:45 |
openstackgerrit | Tim Burke proposed openstack/swift master: ec: Add an option to write fragments with legacy crc https://review.opendev.org/739164 | 23:51 |