*** seongsoocho has joined #openstack-swift | 00:29 | |
seongsoocho | Hi ! back from long vacation and just join new irc ! | 00:29 |
mattoliverau | seongsoocho: hey o/ welcome back! | 00:50 |
seongsoocho | yay ! | 00:50 |
*** aolivo1 has quit IRC | 01:23 | |
*** timburke has quit IRC | 01:38 | |
*** timburke has joined #openstack-swift | 01:54 | |
zaitcev | Aren't we having some odd failures with py39 | 02:40 |
zaitcev | https://zuul.opendev.org/t/openstack/build/109346d22ef04da99e0d22c4e10f1595/log/job-output.txt | 02:40 |
zaitcev | - Swift realm="account-name%0A%0A%3Cb%3Efoo%3Cbr%3E" | 02:41 |
zaitcev | + Swift realm="account-name%3Cb%3Efoo%3Cbr%3E" | 02:41 |
zaitcev | The \n\n is added | 02:41 |
zaitcev | And another test: | 02:41 |
zaitcev | - Invalid path: on%20e | 02:41 |
zaitcev | + Invalid path: o%0An%20e | 02:41 |
timburke | zaitcev, should be fixed now that https://review.opendev.org/c/openstack/swift/+/793495 merged | 02:42 |
zaitcev | One \n is added | 02:42 |
timburke | i wouldn't mind having another set of eyes look at it before i backport it to wallaby as https://review.opendev.org/c/openstack/swift/+/793439, though | 02:42 |
zaitcev | timburke: I do not understand why you dropped the test. Sure, the Request.blank now always unquotes. But that is a priori knowledge, isn't it? And you want to make sure that the result is right... I don't see that test as value-less. | 02:49 |
zaitcev | But maybe I'm missing something here... There's no point in having tests that are always guaranteed to pass, we're just wasting CPU time. | 02:50 |
timburke | zaitcev, part of why i made sure to add the assertion that the unquoted string wound up in PATH_INFO at https://review.opendev.org/c/openstack/swift/+/793495/1/test/unit/common/test_swob.py@793 | 02:52 |
zaitcev | I'd use something like self.assertTrue('Swift realm="%s"' % quoted_hacker == resp.headers['Www-Authenticate'] or ('Swift realm="%s"' % quoted_hacker).strip('\n') == resp.headers['Www-Authenticate']) ... maybe. | 02:52 |
zaitcev | Oh, okay. | 02:53 |
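(For reference, the `%0A` in the failing assertions above is just the percent-encoding of `\n`. A quick stdlib round trip illustrates the strings being compared; this is only an illustration, not Swift's swob code:)

```python
# '%0A' is the percent-encoding of '\n'; this shows the round trip for
# the "hacker" account name from the failing test output above.
from urllib.parse import quote, unquote

hacker = 'account-name\n\n<b>foo<br>'
quoted = quote(hacker, safe='')
assert quoted == 'account-name%0A%0A%3Cb%3Efoo%3Cbr%3E'
assert unquote(quoted) == hacker
```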
*** mattoliver has joined #openstack-swift | 03:17 | |
opendevreview | Pete Zaitcev proposed openstack/swift master: Band-aid and test the crash of the account server https://review.opendev.org/c/openstack/swift/+/743797 | 03:22 |
zaitcev | So I don't need to do anything, you're doing all my work for me. | 03:22 |
mattoliver | FYI am testing out the matrix irc bridge, thus mattoliver and mattoliverau at the same time. I figure a move to a new IRC server means a good time to see how matrix is going.. seeing as it means a free bouncer, and if it goes well I can decommission my quassel core + gcloud host :) | 04:07 |
timburke | huh. https://github.com/openstack/liberasurecode/blob/master/include/xor_codes/xor_hd_code_defs.h#L63 seems like something must be wrong. shouldn't having two entries for 56 mean that two parities are identical? | 04:45 |
timburke | with similar issues for 7+6 (hd 4) and 8+6 (hd 4) :-( | 04:51 |
timburke | maybe that's OK for the xor codes? the assumption for the 6+6 (hd 4) is that you'll have *9* frags in hand when you're trying to rebuild, right? hmm... | 04:53 |
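(A toy model of the concern above: in a flat XOR code each parity is the XOR of the data fragments selected by a bitmask, so a duplicated mask necessarily yields byte-identical parities. This is an illustrative Python sketch, not liberasurecode's C implementation:)

```python
# Toy flat-XOR model: each parity is the XOR of the data fragments
# selected by a bitmask. A duplicated mask (like the repeated 56 in the
# xor_hd tables) produces byte-identical parities.
from functools import reduce

def xor_parity(data_frags, mask):
    picked = [f for i, f in enumerate(data_frags) if mask & (1 << i)]
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*picked))

frags = [bytes([i]) * 4 for i in range(6)]
# mask 56 == 0b111000 selects fragments 3, 4 and 5; 3 ^ 4 ^ 5 == 2
assert xor_parity(frags, 56) == bytes([2]) * 4
assert xor_parity(frags, 56) == xor_parity(frags, 56)
```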
opendevreview | Matthew Oliver proposed openstack/swift master: sharder: Track scan progress to fix small tails https://review.opendev.org/c/openstack/swift/+/793543 | 05:25 |
opendevreview | Tim Burke proposed openstack/liberasurecode master: Fix underflow in flat_xor_hd code https://review.opendev.org/c/openstack/liberasurecode/+/794137 | 06:09 |
*** timburke has quit IRC | 06:20 | |
acoles | seongsoocho: welcome back and welcome to oftc :) | 08:10 |
opendevreview | Lin PeiWen proposed openstack/swift master: Delete unavailable py2 package https://review.opendev.org/c/openstack/swift/+/794167 | 08:56 |
opendevreview | Merged openstack/swift master: Switch IRC references from freenode to OFTC https://review.opendev.org/c/openstack/swift/+/793983 | 10:52 |
*** aolivo1 has joined #openstack-swift | 13:51 | |
*** tdasilva_ has joined #openstack-swift | 14:09 | |
*** tdasilva has quit IRC | 14:09 | |
*** opendevreview has quit IRC | 14:38 | |
*** timburke has joined #openstack-swift | 15:32 | |
*** timburke has quit IRC | 16:32 | |
*** thiago__ has joined #openstack-swift | 16:51 | |
*** tdasilva_ has quit IRC | 16:57 | |
*** erbarr has joined #openstack-swift | 17:11 | |
*** edausq has joined #openstack-swift | 17:21 | |
*** timburke has joined #openstack-swift | 17:24 | |
*** erlon has joined #openstack-swift | 17:44 | |
*** opendevreview has joined #openstack-swift | 20:09 | |
opendevreview | Merged openstack/swift master: relinker: Remove replication locks for empty parts https://review.opendev.org/c/openstack/swift/+/790305 | 20:09 |
timburke | almost meeting time! reminder that we're going to try doing it *here* 🤞 | 20:54 |
*** kota_ has joined #openstack-swift | 20:58 | |
kota_ | morning | 20:58 |
timburke | #startmeeting swift | 21:00 |
opendevmeet | Meeting started Wed Jun 2 21:00:08 2021 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot. | 21:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 21:00 |
opendevmeet | The meeting name has been set to 'swift' | 21:00 |
timburke | who's here for the swift meeting? | 21:00 |
*** thiago__ is now known as tdasilva | 21:00 | |
kota_ | hi | 21:00 |
acoles | o/ | 21:01 |
mattoliver | o/ | 21:02 |
timburke | i'm glad to see most everybody's migrated over to OFTC, and that the meeting bot's working well for us here :-) | 21:03 |
timburke | as usual, the agenda's at https://wiki.openstack.org/wiki/Swift | 21:03 |
timburke | er, not that. https://wiki.openstack.org/wiki/Meetings/Swift | 21:03 |
timburke | that's the one | 21:03 |
timburke | first up | 21:03 |
timburke | #topic testing on ARM | 21:04 |
timburke | i wanted to see what opinions we might have about ARM jobs now that we've (1) got more jobs proposed (thanks mattoliver!) and (2) we've had a bit more time to think about it | 21:04 |
timburke | the good news, by the way, is that everything seems to Just Work -- libec, pyeclib, swift all have passing ARM jobs proposed | 21:05 |
timburke | they're taking a bit longer than the other jobs (~2x or so?) but at least for swift, they aren't the limiting factor | 21:06 |
mattoliver | yeah, and I added func, func encryption and a probe. So pretty good coverage I think | 21:06 |
timburke | i've got two main questions, and i'm not sure whether they're connected or not | 21:07 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/793280 | 21:07 |
mattoliver | #link https://review.opendev.org/c/openstack/pyeclib/+/793281 | 21:07 |
timburke | #link https://review.opendev.org/c/openstack/swift/+/792867 | 21:08 |
timburke | #link https://review.opendev.org/c/openstack/liberasurecode/+/793511 | 21:08 |
timburke | first, should we have them in the main check queue or a separate check-arm64 queue? ricolin proposed it as a separate queue, but trying it out on the libec patch, a single queue seems to work fine | 21:09 |
timburke | second, should they be voting or not? they all seem to pass and if i saw one fail (*especially* if it was on a patch touching ctypes or something) i'd be inclined to figure out the failure before approving, personally | 21:10 |
mattoliver | well now that we know they seem to pass, I'm happy to have them voting, we can always turn them off again. | 21:11 |
acoles | +1 | 21:11 |
mattoliver | the extra check pipeline, I'm not sure.. I thought I'd read somewhere it might have something to do with managing the arm64 resources.. but can't seem to figure out where I read it.. so might have been dreaming :P | 21:12 |
timburke | i seem to remember seeing something about that, too -- an ML thread, maybe? | 21:14 |
timburke | that also brings me to why i'm not sure whether the questions are connected or not: with two queues, we get two zuul responses -- if the arm jobs are voting, can the second response change the vote from the first? i can ask in -infra, i suppose... | 21:16 |
zaitcev | If the CI machine set for ARM is reliable enough, then I think we want them voting. We don't want to get stuck just because something keeps crashing. That balances against the upside of guarding against a breakage that is specific to ARM. | 21:17 |
timburke | i'm inclined to merge them non-voting to start, then revisit later (maybe a topic for the next PTG?) | 21:18 |
mattoliver | sure, sounds reasonable. point is we get to test on arm, which is pretty cool. | 21:19 |
timburke | for sure! | 21:20 |
timburke | #topic train-em | 21:20 |
timburke | so at the end of this week, openstack as a whole is moving train to extended maintenance. i'm going to work on getting a release tagged before then. just a heads-up | 21:21 |
zaitcev | So... What is here to discuss? | 21:21 |
timburke | that was all :-) | 21:21 |
timburke | on to updates! | 21:21 |
timburke | #topic sharding and shrinking | 21:22 |
timburke | how's it going? | 21:22 |
timburke | we merged https://review.opendev.org/c/openstack/swift/+/792182 - Add absolute values for shard shrinking config options | 21:23 |
acoles | we noticed some intermittent gappy listings from sharded containers last week, turned out we had some shard range data stuck in memcache | 21:23 |
acoles | the root problem is memcache related, but it caused us to realise that perhaps we should not be so tolerant of bad listing responses from shard containers | 21:24 |
timburke | leading to https://review.opendev.org/c/openstack/swift/+/793492 - Return 503 for container listings when shards are deleted | 21:25 |
acoles | so https://review.opendev.org/c/openstack/swift/+/793492 proposes to 503 if a shard listing does not succeed | 21:25 |
acoles | IIRC we originally thought a gappy listing was equivalent to eventual consistency, but with hindsight they are more like 'something isn't working' | 21:26 |
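(The policy change under discussion can be sketched like this; the helper and data shapes are hypothetical, not Swift's actual proxy code:)

```python
# Toy illustration of the proposed behaviour: when merging listings from
# shard containers, a failed shard sub-listing now yields a 503 to the
# client rather than a silently gappy 200.
def merged_listing(shard_results):
    merged = []
    for result in shard_results:
        if result is None:  # a shard container listing failed
            return 503, []
        merged.extend(result)
    return 200, merged

assert merged_listing([['a', 'b'], ['c']]) == (200, ['a', 'b', 'c'])
assert merged_listing([['a', 'b'], None])[0] == 503
```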
mattoliver | And acoles has a patch for invalidating the shard listing cache which will hopefully make things much better | 21:27 |
acoles | mattoliver: actually I abandoned that :) | 21:27 |
mattoliver | oh, then I take that back.. he hasn't got one :P | 21:27 |
acoles | I decided that if the cause of the bad response was backend server workload then flushing the cache could just escalate the problem | 21:28 |
acoles | so not worth the risk | 21:28 |
mattoliver | oh fair enough, it was hard enough to find as it was | 21:28 |
acoles | given that memcache should expire entries, we just had an anomaly | 21:28 |
timburke | i think we need some more investigation into why the entry didn't expire properly, anyway | 21:29 |
acoles | I prefer the idea of including expiry time with the cached data, but I expect that's a bigger piece of work | 21:29 |
acoles | anyway, that was the background to https://review.opendev.org/c/openstack/swift/+/793492 | 21:30 |
timburke | once landed, do we think it's the sort of thing we ought to backport? | 21:30 |
mattoliver | I've learnt a lot about memcache (and mcrouter, which we use at NVIDIA). memcache should be able to supply the TTL with a 'me <key>' or something like that. I'll investigate that | 21:31 |
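(The "include expiry time with the cached data" idea could look something like this; key layout and function names are illustrative only, not Swift's actual caching code:)

```python
# Sketch: store the expiry alongside the cached shard ranges so readers
# can detect a stale entry even if memcache fails to expire it.
import json
import time

TTL = 600  # illustrative cache lifetime, in seconds

def cache_shard_ranges(client, path, ranges, now=None):
    now = time.time() if now is None else now
    payload = json.dumps({'expires': now + TTL, 'ranges': ranges})
    client.set('shard-ranges/%s' % path, payload, time=TTL)

def get_shard_ranges(client, path, now=None):
    now = time.time() if now is None else now
    raw = client.get('shard-ranges/%s' % path)
    if raw is None:
        return None
    entry = json.loads(raw)
    if entry['expires'] < now:
        return None  # stale: treat as a miss even though memcache kept it
    return entry['ranges']
```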
timburke | (in light of all the sharding backports zaitcev has already done) | 21:31 |
acoles | timburke: maybe. shall we see how it goes in production first (just in case we uncover a can of worms) | 21:32 |
acoles | although, we've no reason to expect a can of worms :) | 21:32 |
mattoliver | We're slowly making progress on small shard tails. Have a new simpler approach where we actually track the scan progress of the scanner to make it more reliable, and from that can make smarter decisions. And not delve into efficient db queries or adding rows_per_shard = auto | 21:32 |
mattoliver | #link https://review.opendev.org/c/openstack/swift/+/793543 | 21:33 |
timburke | nice | 21:33 |
acoles | mattoliver: I like the context idea in https://review.opendev.org/c/openstack/swift/+/793543 | 21:34 |
mattoliver | I see acoles reviewed it, thanks! Will look at that again today.. yeah storing the upper might actually simplify the method.. that and/or the index. | 21:34 |
acoles | I was just a bit unsure about where we do the 'tiny-shard-squashing' | 21:34 |
timburke | anything else we ought to bring up for sharding? i'll be sure to add those two patches to the priority reviews page | 21:36 |
acoles | I also wondered if having per-db-replica context for scanning might help avoid split brain scanning??? but that's *another topic* | 21:36 |
mattoliver | it's the progress + shard_size + minimum > object_count line. Because that returns the end upper. but maybe I misunderstand. | 21:36 |
mattoliver | yeah! interesting, maybe it could.. but yeah, need to think about it more before we discuss that :P | 21:36 |
timburke | all right, i'll assume those are the two main tracks right now :-) | 21:38 |
timburke | #topic dark data watcher | 21:38 |
timburke | zaitcev, i saw some more updates on https://review.opendev.org/c/openstack/swift/+/788398 -- how's it going? | 21:38 |
zaitcev | timburke: I'm addressing comments by acoles | 21:39 |
zaitcev | Give me a day or two | 21:39 |
timburke | 👍 | 21:40 |
zaitcev | Could we get this landed instead? https://review.opendev.org/c/openstack/swift/+/792713 | 21:40 |
zaitcev | I mean in the meanwhile | 21:40 |
zaitcev | Not instead. | 21:40 |
timburke | i'll take a look, see about writing a test for it to demonstrate the difference | 21:41 |
zaitcev | Although ironically enough I was going to slip it through with no change in testing coverage. | 21:41 |
timburke | :P | 21:41 |
timburke | #topic open discussion | 21:41 |
timburke | anything else we ought to bring up this week? | 21:41 |
zaitcev | I was just about to type that the other change has better tests. However, it only emulates listings that miss the objects, but not errors. | 21:42 |
acoles | we successfully quarantined a large number of isolated durable EC fragments in the last week using https://review.opendev.org/c/openstack/swift/+/788833 | 21:44 |
timburke | our log-ingest pipeline seems much happier for it :-) | 21:44 |
acoles | and as a consequence eliminated a large number of error log messages :) | 21:44 |
zaitcev | Note that it's not the dark data plugin but the built-in replicator code that does that. | 21:44 |
timburke | oh -- i noticed that unlike with the object-updater and container-updater (which can use request path), the container-sharder doesn't give any indication what shard an update came from in container server logs -- so i proposed https://review.opendev.org/c/openstack/swift/+/793485 to stick the shard account/container in Referer | 21:45 |
zaitcev | Why do you guys quarantine them instead of deleting? | 21:45 |
mattoliver | timburke: nice | 21:46 |
zaitcev | Is there any doubt about the decision-making in that code? Looked pretty watertight to me. Just a general caution? | 21:46 |
acoles | timburke: I'll review that again | 21:46 |
timburke | thanks | 21:47 |
acoles | zaitcev: yes, caution | 21:47 |
mattoliver | just seemed better to quarantine than to just delete | 21:47 |
acoles | I'm averse to deleting things | 21:47 |
zaitcev | This contrasts with Alistair wanting to run object watcher with action=delete, which clearly has more avenues to fail and start deleting everything. | 21:47 |
timburke | zaitcev, yeah, general caution. our ops team will still need some tooling to wade through quarantines, though :-( | 21:47 |
acoles | zaitcev: I don't want to run dark data watcher ! I'm worried for anyone that does (before these fixes get merged) | 21:48 |
mattoliver | I have been playing with some potential reconstructor improvements; the more interesting chain ends here: https://review.opendev.org/c/openstack/swift/+/793888 which, if it finds the fragment on the last known primary, will leave it for the handoff to push.. kinda a built in handoffs_first if we're talking post rebalance. | 21:49 |
mattoliver | the last patch (that's linked) in the chain skips a partition if a bunch have already been found on said partition. In some basic testing in my SAIO it sped up the post-rebalance reconstructor cycle quite a bit. | 21:51 |
mattoliver | but just playing around, scratching an itch. | 21:51 |
timburke | very cool -- it'd be interesting to play with that in a lab environment (and for that matter, to have some notion of "rebalance scenarios" for labs...) | 21:52 |
timburke | all right, i think we're about done then | 21:53 |
timburke | thank you all for coming, and thank you for working on swift! | 21:54 |
timburke | #endmeeting | 21:54 |
opendevmeet | Meeting ended Wed Jun 2 21:54:07 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 21:54 |
opendevmeet | Minutes: http://eavesdrop.openstack.org/meetings/swift/2021/swift.2021-06-02-21.00.html | 21:54 |
opendevmeet | Minutes (text): http://eavesdrop.openstack.org/meetings/swift/2021/swift.2021-06-02-21.00.txt | 21:54 |
opendevmeet | Log: http://eavesdrop.openstack.org/meetings/swift/2021/swift.2021-06-02-21.00.log.html | 21:54 |
*** kota_ has quit IRC | 21:55 | |
*** kota_ has joined #openstack-swift | 21:56 | |
*** kota_ has quit IRC | 22:04 | |
*** kota_ has joined #openstack-swift | 22:43 | |
*** kota_ has quit IRC | 22:51 |