openstackgerrit | Tim Burke proposed openstack/swift master: Have slo tell the object-server that it wants whole manifests https://review.opendev.org/697739 | 00:09 |
---|---|---|
openstackgerrit | Tim Burke proposed openstack/swift master: symlink: Clean up app iters better https://review.opendev.org/700959 | 00:10 |
*** d34dh0r53 has quit IRC | 00:14 | |
clayg | that sucks, I thought we already did something to assert no unread requests in teardown | 00:19 |
viks___ | hi, i'm testing erasure code policy on my setup.. I notice that it is slower compared to default policy? i.e. for a 500mb file upload i see around 6 sec and with EC i see around 9sec ... is this expected? anyone has any idea?? | 00:23 |
openstackgerrit | Tim Burke proposed openstack/swift master: account-server: Add test for leading delimiter https://review.opendev.org/700964 | 00:27 |
openstackgerrit | Tim Burke proposed openstack/swift stable/train: account-server: Correctly handle containers starting with delimiter https://review.opendev.org/700965 | 00:27 |
timburke | clayg, no *unclosed* requests, which is also good. but i wanted to minimize 499s | 00:28 |
clayg | ohhhhhh | 00:28 |
clayg | brilliant | 00:28 |
clayg | viks___: what's your test setup? | 00:28 |
timburke | viks___, what's your EC algorithm? the rs_vand that ships with liberasurecode is mainly meant for demonstration/testing purposes; if it were me, i'd want to be sure to use isa-l | 00:29 |
clayg | viks___: if you're cpu limited (i.e. all nodes/services on the same SAIO) that would be expected. | 00:29 |
clayg | viks___: normally when people compare single stream upload throughput on testing clusters they're *disk* limited - and when going to EC there's more spindles and fewer bytes so it goes a good bit *faster* than replicated | 00:30 |
clayg | but that assumes plenty of cpu headroom | 00:30 |
timburke | or bandwidth-limited, but there's a similar trick wherein you're only sending 1.5x (say) out your cluster-facing NIC instead of 3x, so you actually see a speedup for large enough objects (and 500MB should definitely be large enough) | 00:32 |
viks___ | No.. all nodes are separate...And my cluster is not loaded at all... when i check the same in my vagrant based setup, i see EC is faster.. so i'm confused why they are behaving differently? My erasure code has 3 data fragments and 2 parity set | 00:34 |
viks___ | in vagrant setup i see around 20-25% better speed with EC.. | 00:37 |
timburke | viks___, what's the ec_type? is it the same on both the test cluster and the vagrant setup? | 00:38 |
viks___ | liberasurecode_rs_vand | 00:39 |
viks___ | yes | 00:39 |
viks___ | both are same.. both have 3 storage nodes.. | 00:40 |
viks___ | ok.. i'll check once with isa-l and report back | 00:42 |
openstackgerrit | Tim Burke proposed openstack/swift master: Use less responses from handoffs https://review.opendev.org/700239 | 00:45 |
viks___ | should the number of fragments be set based on the no. of storage nodes? | 00:45 |
viks___ | because when i tried with 7 fragments and 3 parity, i was getting the below swift proxy error: | 00:47 |
viks___ | ``` | 00:47 |
viks___ | Object PUT returning 503, 6/8 required connections | 00:47 |
viks___ | ``` | 00:47 |
clayg | so writing down "quoted-paths: true" in sysmeta is pretty awful - for one an un-upgraded node can write down an *unquoted* root *after* an upgraded node wrote down "quoted_paths: true" | 00:47 |
clayg | after replication takes the latest value for each key I've got an unquoted path that says "quoted_paths: true" 🙄 | 00:48 |
timburke | eh... i'd say it's more about what kind of durability/storage overhead trade-off you want to make. keeping the numbers small simplifies the math a bit and should reduce CPU overhead... but makes it so you can't reliably withstand as many simultaneous drive failures | 00:48 |
timburke | viks___, how many disks do you have per node? | 00:49 |
timburke | clayg, yeah, that sounds pretty terrible | 00:49 |
viks___ | 9 disks | 00:50 |
timburke | :-/ did the logs tell you much else? it should've tried to connect to every disk... | 00:51 |
viks___ | no other info... i'm yet to try with debug enabled in logs... Also when i tried with 6 fragments and 3 parity, i'm getting the below swift proxy error: | 00:54 |
viks___ | ``` | 00:54 |
viks___ | Object PUT returning 503, 6/7 required connections | 00:54 |
viks___ | ``` | 00:54 |
viks___ | So i could not make out how these required connections are getting calculated... | 00:54 |
viks___ | timburke: Do i need to modify some worker/concurrency tuning for this? or any other parameter? | 00:55 |
clayg | timburke: yeah I also managed to get some lost objects into a shard named with the quoted value when the proxy sent down the target-shard quoted and the object forwarded it | 00:57 |
clayg | i guess since the container was in an autocreate account it made a new db - which strangely itself doesn't think it's a shard | 00:57 |
clayg | i wonder if there's a clue in recon or logs about a db in .shards_X account that's not a shard? | 00:58 |
*** gyee has quit IRC | 00:58 | |
clayg | there's a lot of this: | 01:01 |
clayg | Jan 3 01:00:17 saio container-sharder-6021: Failed to put shard ranges to 127.0.0.1:6041/sdb4: 'utf8' codec can't decode byte 0xbe in position 3: invalid start byte: #012Traceback (most recent call last):#012 File "/vagrant/swift/swift/container/sharder.py", line 596, in _put_container#012 headers=headers, contents=body)#012 File "/vagrant/swift/swift/common/direct_client.py", line 348, in | 01:01 |
clayg | direct_put_container#012 path = _make_path(account, container)#012 File "/vagrant/swift/swift/common/direct_client.py", line 60, in _make_path#012 for x in components)#012 File "/vagrant/swift/swift/common/direct_client.py", line 60, in <genexpr>#012 for x in components)#012 File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode#012 return codecs.utf_8_decode(input, errors, | 01:01 |
clayg | True)#012UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 3: invalid start byte | 01:01 |
clayg | i bet that's because of the quoted-paths + (not-quoted)-root | 01:01 |
clayg | ok, so that was a disaster! 😁 | 01:02 |
timburke | clayg, that seems like a great thing to add around https://github.com/openstack/swift/blob/2.23.0/swift/container/sharder.py#L776-L777 -- if broker.is_root_container() and broker.account.startswith(self.shards_account_prefix): ... | 01:02 |
clayg | i'll make a sticky to write a bug report tomorrow | 01:02 |
clayg | are you going to re-rev the sharder-quoting patch again? | 01:03 |
timburke | ha! you went with my all-beef container name, didn't you 😁 | 01:03 |
clayg | i'm going to abandon my "alternative approach" | 01:03 |
clayg | well, now it's like 25% beef 😬 | 01:03 |
timburke | nah, i don't have anything to push up on that guy at the moment | 01:04 |
timburke | i suppose i could try to get the client-says-it-wants-quoted change in... but i haven't started on it yet | 01:06 |
timburke | and i kinda want to actually try a non-default prefix with https://review.opendev.org/#/c/700818/ | 01:09 |
patchbot | patch 700818 - swift - Deprecate per-service auto_create_account_prefix - 2 patch sets | 01:09 |
timburke | since that seems pretty handy for https://review.opendev.org/#/c/700449/ | 01:09 |
patchbot | patch 700449 - swift - Allow reconciler to handle reserved names - 3 patch sets | 01:09 |
*** f0o has quit IRC | 01:37 | |
*** f0o has joined #openstack-swift | 01:38 | |
*** spsurya has quit IRC | 02:03 | |
*** d34dh0r53 has joined #openstack-swift | 02:56 | |
openstackgerrit | Merged openstack/swift master: Use less responses from handoffs https://review.opendev.org/700239 | 03:13 |
openstackgerrit | Tim Burke proposed openstack/swift master: Deprecate per-service auto_create_account_prefix https://review.opendev.org/700818 | 05:28 |
*** evrardjp has quit IRC | 05:33 | |
*** evrardjp has joined #openstack-swift | 05:33 | |
*** Fengli1 has joined #openstack-swift | 05:57 | |
*** Fengli has quit IRC | 05:59 | |
*** Fengli1 is now known as Fengli | 05:59 | |
openstackgerrit | Tim Burke proposed openstack/swift master: symlink: Clean up app iters better https://review.opendev.org/700959 | 06:06 |
openstackgerrit | Tim Burke proposed openstack/swift master: Middleware that allows a user to have quoted Etags https://review.opendev.org/700056 | 06:16 |
zaitcev | Jan 3 01:54:56 rhev-a24c-01 swift[9927]: - - 03/Jan/2020/06/54/56 GET /v1/.misplaced_objects%3Fformat%3Djson%26marker%3D%26end_marker%3D%26prefix%3D HTTP/1.0 404 - Swift%20Container%20Reconciler - - - - tx7e12fe8b104845d19ce94-005e0ee540 - 0.0358 - - 1578034496.477324247 1578034496.513160467 - | 06:55 |
zaitcev | Jan 3 01:54:56 rhev-a24c-01 container-reconciler[9927]: Reconciler Stats: {} (txn: tx7e12fe8b104845d19ce94-005e0ee540) | 06:55 |
zaitcev | Same PID, 9927. I'm wondering why the reconciler reports itself as "container-reconciler" at some times and as just "swift" at other times. | 06:56 |
*** psachin has joined #openstack-swift | 08:00 | |
*** tesseract has joined #openstack-swift | 08:14 | |
*** Fengli1 has joined #openstack-swift | 08:14 | |
*** Fengli has quit IRC | 08:16 | |
*** Fengli1 is now known as Fengli | 08:16 | |
*** pcaruana has joined #openstack-swift | 08:35 | |
*** rpittau|afk is now known as rpittau | 08:41 | |
*** psachin has quit IRC | 09:18 | |
*** Fengli has quit IRC | 09:53 | |
*** Fengli has joined #openstack-swift | 10:05 | |
*** Fengli has quit IRC | 11:26 | |
*** henriqueof has joined #openstack-swift | 14:57 | |
*** henriqueof has quit IRC | 15:03 | |
*** henriqueof has joined #openstack-swift | 15:03 | |
*** renich has joined #openstack-swift | 15:24 | |
*** renich has quit IRC | 15:33 | |
*** takamatsu has joined #openstack-swift | 16:37 | |
*** henriqueof has quit IRC | 16:44 | |
*** gyee has joined #openstack-swift | 17:03 | |
*** rpittau is now known as rpittau|afk | 17:18 | |
*** evrardjp has quit IRC | 17:33 | |
*** evrardjp has joined #openstack-swift | 17:33 | |
clayg | @zaitcev I bet it's something to do with the internal client config | 17:44 |
clayg | gets its own logger instead of passing it through? I think all the apps in the pipeline call get_logger 🤔 | 17:45 |
zaitcev | right... I'm sure it comes from internal_client | 18:00 |
timburke | lol https://github.com/openstack/swift/blob/2.23.0/doc/manpages/object-server.conf.5#L371-L375 | 18:39 |
timburke | we were so optimistic :P | 18:40 |
zaitcev | more like le sigh | 18:40 |
openstackgerrit | Tim Burke proposed openstack/swift master: Deprecate per-service auto_create_account_prefix https://review.opendev.org/700818 | 18:45 |
*** paladox has quit IRC | 19:10 | |
*** henriqueof has joined #openstack-swift | 19:21 | |
*** paladox has joined #openstack-swift | 19:34 | |
*** paladox has quit IRC | 19:35 | |
*** paladox has joined #openstack-swift | 19:35 | |
openstackgerrit | Clay Gerrard proposed openstack/swift master: wip: move to new quoted-path https://review.opendev.org/701059 | 19:50 |
clayg | timburke: ^ so just having everyone start using quoted path works great! a couple of shards might think they're roots temporarily but it all works out | 19:51 |
clayg | i'm not really sure why just starting to quote location seemed to work - I had some async pending pile up with container servers rejecting 301 (obviously) then 412 (!?) | 19:52 |
clayg | eventually everyone updated and everything worked with no lost or misplaced updates 🥳 | 19:52 |
clayg | I might be happy with something a little more progressive or careful than p 701059 but not significantly so w/o some proven failure mode I can duplicate | 19:53 |
patchbot | https://review.opendev.org/#/c/701059/ - swift - wip: move to new quoted-path - 1 patch set | 19:53 |
clayg | the main advantage of letting the natural failure modes recover on their own when they case is that it proves we're trusting the case when the old path isn't sent | 19:54 |
clayg | i.e. I don't have to think "ok, what happens in rare case when the quoted & unquoted paths are NOT the same; what does that failure look like!?" ... | 19:55 |
clayg | ... because THAT failure mode is the happy path, and if it *works* then we don't need to do anything special for when the unquoted path isn't equivalent - which I think was really the heart of what was bothering me | 19:56 |
timburke | clayg, this is what i've got so far, but i need to fix up some tests http://paste.openstack.org/show/788045/ | 20:42 |
*** takamatsu has quit IRC | 20:51 | |
zaitcev | I added some innocuous imports into my auditor plugin, and it makes the auditor loop hard on start. Must be eventlet, I just know it. | 22:10 |
*** tesseract has quit IRC | 22:54 | |
*** pcaruana has quit IRC | 22:59 | |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Deprecate per-service auto_create_account_prefix https://review.opendev.org/700818 | 23:16 |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Allow reconciler to handle reserved names https://review.opendev.org/700449 | 23:56 |
*** gyee has quit IRC | 23:58 |
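
The "required connections" numbers viks___ asked about at 00:54 were not explained in the channel. A minimal sketch of the arithmetic follows, under a labeled assumption: that the proxy's EC PUT quorum is the number of data fragments plus the minimum parity fragments the backend needs, and that this minimum is 1 for `liberasurecode_rs_vand`. That assumption is at least consistent with the two errors quoted in the log (7 data + 3 parity gave "6/8", 6 data + 3 parity gave "6/7"). The function names are hypothetical and are not Swift APIs. The second helper illustrates timburke's bandwidth point: EC sends `(data + parity) / data` times the object size to the backend instead of 3x for triple replication.

```python
def ec_put_quorum(ndata, min_parity_needed=1):
    """Connections a PUT would need before proceeding.

    Assumption (not confirmed in the log): quorum = data fragments plus
    the backend's minimum required parity fragments, taken to be 1 here.
    """
    return ndata + min_parity_needed


def ec_expansion_factor(ndata, nparity):
    """Bytes written to the backend per byte uploaded by the client."""
    return (ndata + nparity) / ndata


if __name__ == "__main__":
    # The two policies viks___ tried, plus the 3+2 policy from the benchmark.
    for ndata, nparity in [(7, 3), (6, 3), (3, 2)]:
        print(
            f"{ndata}+{nparity}: quorum={ec_put_quorum(ndata)}, "
            f"expansion={ec_expansion_factor(ndata, nparity):.2f}x "
            f"(3-replica: 3.00x)"
        )
```

Under this reading the quorum depends on the policy's fragment counts, not on worker or concurrency tuning. The sketch does not explain why only 6 connections succeeded in either case; the log leaves that open.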