*** camelCaser has quit IRC | 00:11 | |
*** camelCas- has joined #openstack-swift | 00:11 | |
openstackgerrit | Merged openstack/swift master: Use `is` to compare against sentinel object https://review.opendev.org/674883 | 00:32 |
timburke | how crazy would it be to track last object write for containers? we've already got the trigger to update object_count/bytes_used ... seems not *so* terrible... | 01:05 |
timburke | i started poking at https://bugs.launchpad.net/swift/+bug/1834097 | 01:05 |
openstack | Launchpad bug 1834097 in OpenStack Object Storage (swift) "container-sharder has no latch when reporting stats" [Undecided,New] | 01:05 |
timburke | and got pretty far -- added a migration to add a (boolean) `reported` column to the shard table, mark it as such when we send an update to the root, skip doing the update if we're already marked reported... | 01:07 |
timburke | but then i ran into issues where one replica of a shard reports stats and latches, then another one does the same... | 01:08 |
timburke | but because we set the meta_timestamp to the time at which we're doing the reporting (https://github.com/openstack/swift/blob/2.22.0/swift/common/utils.py#L5018-L5021) the second replica updates the timestamp in the root | 01:09 |
timburke | so when the first replica comes 'round again, it checks in with the root, sees there's an update, so merges it and clears its reported flag | 01:11 |
timburke | maybe it's that last part that's really causing my problem? it feels like if i'm doing a merge i oughta clear it... but maybe if stats match i can leave it? | 01:12 |
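A minimal sketch of the latch idea described above: a `reported` flag that skips redundant updates, cleared on merge only when the incoming stats actually differ. The types and helper names here are illustrative, not the actual patch:

    from dataclasses import dataclass

    @dataclass
    class ShardStats:
        object_count: int
        bytes_used: int
        meta_timestamp: float
        reported: bool = False

    def should_send_report(own: ShardStats) -> bool:
        # Skip the update entirely if this replica already reported
        # and nothing has changed locally since.
        return not own.reported

    def merge_from_root(own: ShardStats, incoming: ShardStats) -> None:
        # Another replica may have re-sent identical stats with a newer
        # meta_timestamp; keep the latch unless the stats really differ.
        stats_changed = (incoming.object_count != own.object_count or
                         incoming.bytes_used != own.bytes_used)
        own.object_count = incoming.object_count
        own.bytes_used = incoming.bytes_used
        own.meta_timestamp = max(own.meta_timestamp, incoming.meta_timestamp)
        if stats_changed:
            own.reported = False  # something real changed; report again next cycle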
openstackgerrit | Tim Burke proposed openstack/swift master: WIP: latch shard-stat reporting https://review.opendev.org/675014 | 01:17 |
timburke | well, we'll see what robe tests look like on ^^^ | 01:17 |
timburke | *probe* tests, even | 01:17 |
timburke | can't wait for us to have https://review.opendev.org/#/c/671333/ ;-) it'll be so exciting! | 01:19 |
patchbot | patch 671333 - swift - py3: (mostly) port probe tests - 2 patch sets | 01:19 |
*** pcaruana has quit IRC | 01:26 | |
*** gyee has quit IRC | 01:33 | |
*** zaitcev_ has quit IRC | 01:40 | |
*** zaitcev_ has joined #openstack-swift | 01:54 | |
*** ChanServ sets mode: +v zaitcev_ | 01:54 | |
*** BjoernT has quit IRC | 02:43 | |
*** BjoernT_ has joined #openstack-swift | 02:43 | |
*** BjoernT_ has quit IRC | 02:47 | |
*** BjoernT has joined #openstack-swift | 02:52 | |
*** psachin has joined #openstack-swift | 03:36 | |
*** ianychoi_ has joined #openstack-swift | 03:38 | |
*** ianychoi has quit IRC | 03:42 | |
*** pcaruana has joined #openstack-swift | 03:44 | |
openstackgerrit | Tim Burke proposed openstack/swift master: WIP: latch shard-stat reporting https://review.opendev.org/675014 | 03:45 |
*** diablo_rojo has joined #openstack-swift | 04:04 | |
*** BjoernT has quit IRC | 04:12 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Latch shard-stat reporting https://review.opendev.org/675014 | 05:07 |
*** diablo_rojo has quit IRC | 05:11 | |
*** zaitcev_ has quit IRC | 05:21 | |
*** zaitcev_ has joined #openstack-swift | 05:34 | |
*** ChanServ sets mode: +v zaitcev_ | 05:34 | |
*** tdasilva_ has quit IRC | 05:36 | |
*** patchbot has quit IRC | 05:46 | |
*** patchbot has joined #openstack-swift | 05:48 | |
*** zaitcev_ has quit IRC | 06:27 | |
*** zaitcev_ has joined #openstack-swift | 06:41 | |
*** ChanServ sets mode: +v zaitcev_ | 06:41 | |
*** ccamacho has joined #openstack-swift | 06:45 | |
*** rcernin has quit IRC | 07:03 | |
*** tesseract has joined #openstack-swift | 07:26 | |
*** jistr is now known as jistr|afk | 07:42 | |
*** spsurya has joined #openstack-swift | 07:50 | |
*** mikecmpbll has joined #openstack-swift | 08:05 | |
*** threestrands has quit IRC | 08:06 | |
*** mauro|call has quit IRC | 08:13 | |
*** mauro|call has joined #openstack-swift | 08:14 | |
*** e0ne has joined #openstack-swift | 08:18 | |
*** mikecmpbll has quit IRC | 08:35 | |
*** mikecmpbll has joined #openstack-swift | 08:36 | |
*** zaitcev_ has quit IRC | 08:51 | |
*** mauro|call has quit IRC | 08:58 | |
*** mauro|call has joined #openstack-swift | 09:00 | |
*** tkajinam has quit IRC | 09:02 | |
*** [diablo] has quit IRC | 09:02 | |
*** [diablo]9 has joined #openstack-swift | 09:03 | |
*** zaitcev_ has joined #openstack-swift | 09:04 | |
*** ChanServ sets mode: +v zaitcev_ | 09:04 | |
*** mauro|call is now known as takamatsu | 09:14 | |
*** jistr|afk is now known as jistr | 09:54 | |
*** mvkr has joined #openstack-swift | 10:36 | |
*** jistr is now known as jistr|call | 12:37 | |
*** BjoernT_ has joined #openstack-swift | 13:36 | |
*** tesseract has quit IRC | 13:56 | |
*** tesseract has joined #openstack-swift | 13:57 | |
*** tesseract has quit IRC | 14:00 | |
*** tesseract has joined #openstack-swift | 14:01 | |
*** mvkr has quit IRC | 14:12 | |
*** zaitcev_ has quit IRC | 14:26 | |
*** jistr|call is now known as jistr | 14:36 | |
*** zaitcev_ has joined #openstack-swift | 14:40 | |
*** ChanServ sets mode: +v zaitcev_ | 14:40 | |
*** altlogbot_2 has quit IRC | 14:46 | |
*** altlogbot_2 has joined #openstack-swift | 14:47 | |
*** ianychoi_ is now known as ianychoi | 14:54 | |
openstackgerrit | Tim Burke proposed openstack/swift master: Latch shard-stat reporting https://review.opendev.org/675014 | 15:00 |
*** zaitcev_ has quit IRC | 15:06 | |
*** zaitcev_ has joined #openstack-swift | 15:20 | |
*** ChanServ sets mode: +v zaitcev_ | 15:20 | |
*** diablo_rojo has joined #openstack-swift | 15:48 | |
*** tesseract has quit IRC | 15:58 | |
*** tesseract has joined #openstack-swift | 16:02 | |
*** henriqueof has quit IRC | 16:02 | |
*** gyee has joined #openstack-swift | 16:05 | |
clayg | If a symlink with an ``X-Symlink-Target-Etag`` targets a static large object manifest it will carry forward the SLO size and etag in the container listing via the ``symlink_bytes`` and ``slo_etag`` keys. However, manifests created before swift v2.12.0 (released Dec 2016) do not contain enough metadata to propagate the extra SLO information to the listing. | 16:27 |
clayg | 🤮 | 16:27 |
clayg | timburke: Is there any way I can make this sound like less of a total tire fire? | 16:28 |
*** psachin has quit IRC | 16:30 | |
*** baojg has quit IRC | 16:38 | |
*** tesseract has quit IRC | 16:39 | |
timburke | clayg, all software is terrible -- despite that, some software's useful | 16:40 |
*** mikecmpbll has quit IRC | 16:40 | |
timburke | i guess the good news is that the slo_etag came after the size sysmeta? so at least clients have a way to identify data that should be copied over itself if they really want to fix it? | 16:42 |
timburke | it just involves doing a full listing and then heading every object that *doesn't* have slo_etag to see if it's in fact an slo 🤮 | 16:44 |
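A hedged sketch of that listing-and-HEAD audit with python-swiftclient: do a full listing, HEAD anything that does not already show slo_etag, and re-upload the manifest of anything that turns out to be an SLO so it picks up the newer sysmeta. For simplicity this rewrites manifests in place; in the symlink case the target manifest is what would need re-uploading. Credentials, the container name, and the format=raw re-upload trick are assumptions about the deployment, so try it on a scratch container first:

    from swiftclient.client import Connection

    conn = Connection(authurl='http://saio:8080/auth/v1.0',  # placeholder creds
                      user='test:tester', key='testing')
    container = 'my-container'

    _, listing = conn.get_container(container, full_listing=True)
    for entry in listing:
        if 'slo_etag' in entry:
            continue  # listing already carries the SLO info
        headers = conn.head_object(container, entry['name'])
        if headers.get('x-static-large-object', '').lower() != 'true':
            continue  # plain object, nothing to fix
        # Fetch the manifest in a re-uploadable form and PUT it over itself
        # so the newer size/etag sysmeta gets written.
        _, manifest = conn.get_object(
            container, entry['name'],
            query_string='multipart-manifest=get&format=raw')
        conn.put_object(container, entry['name'], manifest,
                        content_type=headers.get('content-type'),
                        query_string='multipart-manifest=put')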
*** baojg has joined #openstack-swift | 16:44 | |
*** baojg has quit IRC | 16:44 | |
*** baojg has joined #openstack-swift | 16:44 | |
*** baojg has quit IRC | 16:45 | |
*** baojg has joined #openstack-swift | 16:45 | |
timburke | i'm not actually sure i've got the right recommendation on https://bugs.launchpad.net/swift/+bug/1839355 ... | 16:45 |
openstack | Launchpad bug 1839355 in OpenStack Object Storage (swift) "container-sharder should keep cleaving when there are no rows" [Undecided,New] | 16:45 |
*** baojg has quit IRC | 16:45 | |
timburke | maybe we should stop aborting replication on "small" DBs? let any rows land as misplaced objects in the root, get dealt with there... | 16:46 |
*** baojg has joined #openstack-swift | 16:46 | |
*** baojg has quit IRC | 16:46 | |
*** baojg has joined #openstack-swift | 16:47 | |
*** baojg has quit IRC | 16:47 | |
*** baojg has joined #openstack-swift | 16:48 | |
*** baojg has quit IRC | 16:48 | |
*** baojg has joined #openstack-swift | 16:48 | |
*** baojg has quit IRC | 16:48 | |
*** baojg has joined #openstack-swift | 16:50 | |
*** baojg has quit IRC | 16:50 | |
*** baojg has joined #openstack-swift | 16:51 | |
*** baojg has quit IRC | 16:51 | |
*** baojg has joined #openstack-swift | 16:51 | |
*** baojg has quit IRC | 16:51 | |
*** baojg has joined #openstack-swift | 16:52 | |
*** baojg has quit IRC | 16:52 | |
*** baojg has joined #openstack-swift | 16:53 | |
*** baojg has quit IRC | 16:53 | |
*** baojg has joined #openstack-swift | 16:54 | |
*** baojg has quit IRC | 16:54 | |
*** baojg has joined #openstack-swift | 16:54 | |
*** baojg has quit IRC | 16:55 | |
*** baojg has joined #openstack-swift | 16:55 | |
*** baojg has quit IRC | 16:55 | |
*** baojg has joined #openstack-swift | 16:56 | |
*** diablo_rojo__ has joined #openstack-swift | 16:56 | |
*** baojg has quit IRC | 16:56 | |
*** diablo_rojo has quit IRC | 16:56 | |
*** baojg has joined #openstack-swift | 16:57 | |
*** baojg has quit IRC | 16:57 | |
clayg | maybe instead of limiting on cleave_batch_size we could do timeboxed? | 16:57 |
*** baojg has joined #openstack-swift | 16:57 | |
*** baojg has quit IRC | 16:58 | |
clayg | do you estimate a fix for lp bug #1834097 would significantly reduce the 11 day window? | 16:58 |
openstack | Launchpad bug 1834097 in OpenStack Object Storage (swift) "container-sharder has no latch when reporting stats" [Undecided,In progress] https://launchpad.net/bugs/1834097 | 16:58 |
*** baojg has joined #openstack-swift | 16:58 | |
*** baojg has quit IRC | 16:58 | |
timburke | i'm seeing log lines like "Cleaved ... for shard range ... in 0.063s." so i'd guess we could do the whole thing in like 2mins | 17:05 |
timburke | i forget how long it usually takes to cleave a *full* range... | 17:06 |
*** e0ne has quit IRC | 17:06 | |
*** altlogbot_2 has quit IRC | 17:17 | |
*** altlogbot_2 has joined #openstack-swift | 17:24 | |
*** altlogbot_2 has quit IRC | 17:31 | |
*** henriqueof has joined #openstack-swift | 17:33 | |
*** altlogbot_0 has joined #openstack-swift | 17:35 | |
clayg | 10m? 30m? 2hr? it depends on how busy the disks are | 17:58 |
ormandj | heya folks - how are people dealing with large file uploads with segments, where failures happen on upload so there's segments that exist but aren't actually referenced in a manifest | 18:00 |
clayg | if num_batches > cleave_batch_size and time.time() - start > cleave_batch_min_thrash: | 18:01 |
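Spelled out a bit, that timeboxed idea might look like the loop below: keep cleaving until at least a batch's worth is done and a wall-clock budget is used up. The names cleave_time_budget and cleave_one_range are hypothetical, not existing sharder options:

    import time

    def cleave(ranges_todo, cleave_one_range, cleave_batch_size=2,
               cleave_time_budget=300):
        """Cleave shard ranges until both the batch and time limits are hit."""
        start = time.time()
        cleaved = 0
        for shard_range in ranges_todo:
            cleave_one_range(shard_range)  # caller-supplied: the real work
            cleaved += 1
            if (cleaved >= cleave_batch_size
                    and time.time() - start > cleave_time_budget):
                break  # did a full batch and used up the budget; yield
        return cleaved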
ormandj | this happens more than we'd like and it consumes space that end users don't know/understand :) | 18:01 |
clayg | ormandj: what's your operating use-case? I think public operators probably "solve" the issue by charging customers, and private clusters do audits? | 18:03 |
clayg | I'm not aware of any solutions that for example add an x-delete-after 24 hrs when uploading segments and then do a POST to each segment to clear it after writing the manifest... but clients could probably get the cluster to help if they were real explicit about what they want... | 18:04 |
ormandj | clayg: use case is internal or external. let's say, hypothetically, you have a website that people can 'click to upload' files, including large files. user gets tired about 24 hours into an upload on their 2400 baud acoustic coupler, and shuts down their pc. | 18:06 |
ormandj | they fire it up again the next day and see this empty bucket, yet data consumed | 18:06 |
ormandj | they see it's in a segments bucket and go wtf | 18:06 |
ormandj | what's a segment? | 18:06 |
ormandj | obviously we don't want to expose any user to this kind of thing, since they shouldn't have to care about any of it | 18:07 |
ormandj | we were considering writing something that pulls all manifests and does an audit with the segments and if the segments exist but don't have a manifest entry, they are orphaned | 18:07 |
ormandj | otherwise we have a ton of orphaned segments | 18:07 |
ormandj | and especially if users have tons of uploads into the bucket w/ other large objects | 18:08 |
ormandj | it's a nightmare just telling users to 'figure it out' | 18:08 |
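One shape that audit could take, as a rough python-swiftclient sketch: collect every segment referenced by any manifest in the bucket, then flag segments in the segments container that nothing references. The container naming, credentials, and the idea of only acting on old segments are assumptions (an unreferenced segment may just belong to an upload still in progress):

    import json
    from swiftclient.client import Connection

    conn = Connection(authurl='http://saio:8080/auth/v1.0',  # placeholder creds
                      user='test:tester', key='testing')
    container = 'my-bucket'
    segments_container = container + '+segments'  # s3api-style segment bucket

    referenced = set()
    _, listing = conn.get_container(container, full_listing=True)
    for entry in listing:
        headers = conn.head_object(container, entry['name'])
        if headers.get('x-static-large-object', '').lower() != 'true':
            continue  # not a manifest
        _, body = conn.get_object(container, entry['name'],
                                  query_string='multipart-manifest=get')
        for seg in json.loads(body):
            referenced.add(seg['name'].lstrip('/'))  # 'container/object'

    _, seg_listing = conn.get_container(segments_container, full_listing=True)
    orphans = [e['name'] for e in seg_listing
               if '%s/%s' % (segments_container, e['name']) not in referenced]
    print('%d candidate orphaned segments' % len(orphans))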
*** diablo_rojo__ is now known as diablo_rojo | 18:13 | |
ormandj | clayg: i just didn't know if there's a built in way to handle orphaned segments like this, or if it was something anyone had thought about - i would imagine it's a pretty large problem for anyone regardless of usage, if they have large objects that get segmented | 18:16 |
*** spsurya has quit IRC | 18:22 | |
clayg | You're right that it's something people have thought about, and in your situation the conclusion would be like: "the website/application would handle the client error/disconnect by cleaning up segments it uploaded for which it didn't upload a manifest" | 18:25 |
clayg | That fails when the application fails - and it would be an improvement if there was a "built in" way to expire unreferenced segments | 18:26 |
clayg | However currently the API doesn't decorate segment uploads in such a way as to identify them as segments - they're just objects which can be combined later when a client uploads a manifest. | 18:26 |
clayg | I think the object-expiration feature could be useful for application authors that need their segments to expire automatically if the uploading application dies w/o writing a manifest.. but that'd be up to the client since they'd also need to *clear* the expiration/timeout after they upload the manifest | 18:29 |
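As a hedged illustration of that client-side pattern (nothing built in): upload each segment with an X-Delete-After safety net, write the manifest, then POST each segment to drop the expiry once the manifest is durable. The chunk source and bucket names are made up, and the exact header for clearing the expiry (X-Remove-Delete-At here) is an assumption worth checking against the object-expirer docs for your release:

    import json
    from swiftclient.client import Connection

    conn = Connection(authurl='http://saio:8080/auth/v1.0',  # placeholder creds
                      user='test:tester', key='testing')

    segments = []
    for i, chunk in enumerate(chunks_of_my_big_file()):  # hypothetical source
        seg_name = 'big-file/%08d' % i
        etag = conn.put_object('my-bucket+segments', seg_name, chunk,
                               headers={'X-Delete-After': '86400'})  # 24h net
        segments.append({'path': '/my-bucket+segments/' + seg_name,
                         'etag': etag, 'size_bytes': len(chunk)})

    conn.put_object('my-bucket', 'big-file', json.dumps(segments),
                    query_string='multipart-manifest=put')

    # Manifest is safely written; now un-expire the segments it references.
    for seg in segments:
        cont, name = seg['path'].lstrip('/').split('/', 1)
        conn.post_object(cont, name, headers={'X-Remove-Delete-At': ''})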
ormandj | clayg: yeah, unfortunately, that example is just the website, reality is, we have a ton of rando-clients doing things :) | 18:30 |
ormandj | and as you know, success in getting clients to do anything correctly is generally somewhere around 0% | 18:30 |
ormandj | so we have to figure out a server-side way of handling this. would have been nice if we required manifest first, then segments for large object uploads | 18:31 |
ormandj | jimmies-first-javaproject-0.0.1.jar is one of the clients we have to account for, so we're trying to figure out a good way to handle this transparently for end users, since we can't expect them to clean up segments related to bad client behavior | 18:32 |
clayg | ormandj: yeah, i think that would be nice too - I've not heard anyone ever offer to work on that at PTG or anything... you could probably cook up a custom middleware to make a SLO++ | 18:33 |
ormandj | not much value in it now since it's already implemented, nobody is gonna go update clients (same problem) :p | 18:34 |
ormandj | guess we'll just have to figure out a way to handle it on the backend with manifest/segment audits | 18:34 |
clayg | ormandj: maybe... if you have no control of clients... how can you tell if any given object is a SLO segment or just... a regular object? heuristic on the naming conventions? | 18:37 |
ormandj | the clients all seem to drop stuff in a segments bucket | 18:40 |
ormandj | (we are s3 compat facing outwards) | 18:40 |
clayg | timburke: what was the name you came up with for the new kind of *LOs that would have a multi-part upload id with expiration kind of thing? | 18:40 |
clayg | ormandj: oh, s3api is different - are you running the latest version? | 18:41 |
timburke | ALO, i think? like, atomic large object, or something | 18:41 |
ormandj | clayg: stein's built-in, yes | 18:42 |
timburke | then expose something like S3's "clean up segments for incomplete uploads more than X days old" | 18:42 |
ormandj | clayg: you'll probably see a bug report from me regarding stein and keystone w/ py3, i ended up having to patch it internally to even make it work hah :) but yes, we're all stein now | 18:43 |
timburke | ormandj, what version of swift are you running? 2.21.0? | 18:45 |
timburke | there's been a bunch of py3-related fixes since then :-) | 18:45 |
timburke | or was the issue some interaction between swift-on-py2 and keystone-on-py3? | 18:45 |
ormandj | https://bugs.launchpad.net/keystone/+bug/1833739 | 18:47 |
openstack | Launchpad bug 1833739 in OpenStack Identity (keystone) "keystone (stein), python3, and postgresql: hex in database" [High,Triaged] | 18:47 |
*** mvkr has joined #openstack-swift | 18:47 | |
ormandj | i fixed it internally with a patch, but it's not the 'right' solution so i didn't submit it upstream, as you should be handling this for every case, not just this specific one. i just didn't have time to implement the proper fix, but i did suggest it in this channel back when i found the bug | 18:47 |
*** BjoernT has joined #openstack-swift | 18:48 | |
timburke | ah, cool. i think i was looking at that recently, though i don't remember why -- sounds kinda familiar tho :-) | 18:48 |
timburke | "but i did suggest it in this channel back when i found the bug" heh, maybe that's why it sounded familiar | 18:48 |
ormandj | probably :) | 18:49 |
clayg | ormandj: so with s3api you have at least ListMultipartUploads, but currently we don't have anything like their LifecycleManagement that automatically scans and reaps Incomplete Multipart Uploads. | 18:49 |
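Lacking a built-in reaper, something like the boto3 sketch below could stand in for S3 lifecycle cleanup against the s3api endpoint: list incomplete multipart uploads and abort anything older than a cutoff. Endpoint and credentials are placeholders:

    from datetime import datetime, timedelta, timezone
    import boto3

    s3 = boto3.client('s3', endpoint_url='http://saio:8080',
                      aws_access_key_id='test:tester',
                      aws_secret_access_key='testing')
    cutoff = datetime.now(timezone.utc) - timedelta(days=7)

    paginator = s3.get_paginator('list_multipart_uploads')
    for page in paginator.paginate(Bucket='my-bucket'):
        for upload in page.get('Uploads', []):
            if upload['Initiated'] < cutoff:
                s3.abort_multipart_upload(Bucket='my-bucket',
                                          Key=upload['Key'],
                                          UploadId=upload['UploadId'])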
*** e0ne has joined #openstack-swift | 18:49 | |
ormandj | clayg: yeah, that's what i expected, i was just hopeful i missed something | 18:50 |
ormandj | crushing my dreams again clay, crushing my dreams... | 18:50 |
*** BjoernT_ has quit IRC | 18:50 | |
clayg | ormandj: ask timburke to tell you more about how great ALO's will be someday | 18:50 |
clayg | he's a dreamer | 18:50 |
clayg | we'll probably have s3 compatible lifecycle management via 1space before we have ALOs 🤔 | 18:51 |
*** BjoernT_ has joined #openstack-swift | 19:15 | |
*** BjoernT has quit IRC | 19:17 | |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient stable/rocky: Fix up stable gate https://review.opendev.org/675184 | 19:18 |
*** camelCas- has quit IRC | 19:38 | |
*** e0ne has quit IRC | 19:46 | |
*** ndk_ has joined #openstack-swift | 20:07 | |
openstackgerrit | Merged openstack/python-swiftclient stable/stein: Fix up stable gate https://review.opendev.org/674180 | 20:39 |
*** tdasilva has joined #openstack-swift | 20:51 | |
*** ChanServ sets mode: +v tdasilva | 20:51 | |
kota_ | morning | 20:57 |
timburke | o/ | 20:59 |
kota_ | timburke: o/ | 20:59 |
timburke | tdasilva, mattoliverau meeting time | 21:01 |
*** diablo_rojo has quit IRC | 21:07 | |
*** diablo_rojo has joined #openstack-swift | 21:08 | |
*** BjoernT_ has quit IRC | 21:08 | |
*** camelCaser has joined #openstack-swift | 21:09 | |
openstackgerrit | Clay Gerrard proposed openstack/swift master: Allow "harder" symlinks https://review.opendev.org/633094 | 21:20 |
ormandj | clayg: haha, fair enough. | 21:22 |
*** henriqueof has quit IRC | 21:33 | |
clayg | 👍 | 21:41 |
*** zaitcev_ is now known as zaitcev | 21:44 | |
*** diablo_rojo has quit IRC | 21:46 | |
openstackgerrit | Tim Burke proposed openstack/python-swiftclient stable/stein: Fix SLO re-upload https://review.opendev.org/673321 | 21:51 |
*** diablo_rojo has joined #openstack-swift | 22:05 | |
openstackgerrit | Merged openstack/swift master: py3: port RBAC func tests https://review.opendev.org/674703 | 22:10 |
*** diablo_rojo has quit IRC | 22:10 | |
*** rcernin has joined #openstack-swift | 22:15 | |
*** notmyname has quit IRC | 22:45 | |
*** patchbot has quit IRC | 22:59 | |
*** tdasilva has quit IRC | 23:02 | |
*** tdasilva has joined #openstack-swift | 23:02 | |
*** ChanServ sets mode: +v tdasilva | 23:02 | |
*** notmyname has joined #openstack-swift | 23:15 | |
*** ChanServ sets mode: +v notmyname | 23:15 | |
*** patchbot has joined #openstack-swift | 23:15 | |
*** zaitcev_ has joined #openstack-swift | 23:16 | |
*** ChanServ sets mode: +v zaitcev_ | 23:16 | |
openstackgerrit | Tim Burke proposed openstack/swift master: py3: mostly port s3 func tests https://review.opendev.org/674716 | 23:19 |
openstackgerrit | Tim Burke proposed openstack/swift master: py3: Finish porting s3 func tests https://review.opendev.org/675227 | 23:19 |
*** zaitcev has quit IRC | 23:20 | |
openstackgerrit | Tim Burke proposed openstack/swift master: py3: Cover account/container func tests https://review.opendev.org/645388 | 23:23 |
openstackgerrit | Tim Burke proposed openstack/swift master: py3: port dlo func tests https://review.opendev.org/642920 | 23:24 |
*** hoonetorg has quit IRC | 23:26 | |
*** hoonetorg has joined #openstack-swift | 23:39 | |
openstackgerrit | Merged openstack/python-swiftclient stable/rocky: Fix up stable gate https://review.opendev.org/675184 | 23:45 |
openstackgerrit | Merged openstack/swift stable/stein: Imported Translations from Zanata https://review.opendev.org/674477 | 23:51 |
*** tdasilva has quit IRC | 23:51 | |
*** tdasilva has joined #openstack-swift | 23:52 | |
*** ChanServ sets mode: +v tdasilva | 23:52 | |
*** tdasilva has quit IRC | 23:55 | |
*** tdasilva has joined #openstack-swift | 23:56 | |
*** ChanServ sets mode: +v tdasilva | 23:56 |