21:00:01 <timburke> #startmeeting swift
21:00:02 <openstack> Meeting started Wed Aug 19 21:00:01 2020 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:03 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:05 <openstack> The meeting name has been set to 'swift'
21:00:08 <timburke> who's here for the swift meeting?
21:00:18 <seongsoocho> o/
21:00:51 <rledisez> o/
21:01:06 <clayg> o/
21:01:43 <timburke> as always, agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:01:52 <timburke> #topic PTG
21:02:17 <timburke> just wanted a quick reminder to register
21:02:27 <timburke> #link https://www.eventbrite.com/e/project-teams-gathering-october-2020-tickets-116136313841
21:03:12 <timburke> i'll make an etherpad to start gathering topics, and send out a doodle to help us schedule meeting times
21:04:31 <timburke> more details about the scheduling and reservation process at http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016497.html if anyone's curious
21:04:49 <timburke> #topic s3api logging
21:05:28 <timburke> clayg, i carried over the one still-open patch from last week's agenda -- is there much we need to discuss about it?
21:05:42 <clayg> nope, all done - I approved https://review.opendev.org/#/c/735220/
21:05:42 <patchbot> patch 735220 - swift - proxy-logging: Be able to configure log_route - 3 patch sets
21:06:01 <clayg> I think I asked a couple weeks ago if anyone else had noticed the duplication in things like transfer stats
21:06:11 <timburke> excellent! and i see i accidentally took the wrong one off the agenda :-/ oops
21:06:34 <clayg> but it's possible most clusters aren't dealing with such a large percentage of s3api requests 🤷‍♂️
21:07:05 <clayg> anyway, i put "update example configs" on my list - since it's a best practices issue we might discuss that as part of the review
21:07:24 <timburke> given trends, i expect most clusters to see that percentage increasing
21:07:56 <timburke> thanks for, er, following up with the follow-up work!
21:07:58 <clayg> well then folks might want to be thinking about setting up a different route/prefix for their subrequest logging middleware!
21:08:10 <clayg> kill the topic for now; i'll re-raise later
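To make the subrequest-logging idea concrete, here is a rough sketch of the sort of proxy config being floated, not a settled recommendation: with s3api, proxy-logging sits in the pipeline twice (roughly, the instance to the right of s3api sees the translated swift subrequests), and the point of patch 735220 is to let that inner instance log under its own route/prefix so s3api traffic stops being double-counted in transfer stats. The section name and option names below are illustrative assumptions, not the final knobs.

```ini
# proxy-server.conf (abridged, hypothetical)
[pipeline:main]
pipeline = catch_errors proxy-logging cache ... s3api ... subrequest-logging proxy-server

[filter:proxy-logging]
use = egg:swift#proxy_logging

[filter:subrequest-logging]
use = egg:swift#proxy_logging
# illustrative option names -- the idea is just to give the inner instance
# its own log route / statsd prefix so subrequest stats are distinguishable
access_log_route = subrequest
access_log_statsd_metric_prefix = subrequest
```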
21:08:17 <timburke> #topic py3 crypto bug
21:08:29 <timburke> #link https://review.opendev.org/#/c/742033/
21:08:30 <patchbot> patch 742033 - swift - py3: Work with proper native string paths in crypt... - 4 patch sets
21:08:41 <clayg> this one is still very much on my radar after a long delay - did anyone else look at it?
21:09:31 <timburke> that was basically exactly my question :-) that, and "if so, do you have any questions about the approach or code?"
21:11:13 <clayg> I think we discussed it last week and everyone was onboard with the weird glitch of trying to upgrade from py2-old-swift to py3-new-swift in one go, where py2-old-swift couldn't read the py3v2 format
21:11:44 <timburke> i guess that's a no on looking at it. does anyone think they'll have time to look at it soon-ish?
21:12:44 <timburke> clayg, that was my impression as well -- people were generally on board with requiring a py2-new-swift step in between, if only because there isn't really a way to do it safely otherwise
21:12:49 <clayg> timburke: are you asking "besides clayg"? Sorry I haven't gotten back to it yet, hoping to do it this week!
21:13:10 <clayg> or... at least "before next wednesday" (so you can REALLY fuss at me again next week)
21:13:11 <timburke> no worries! we'll dig out from our review backlog some day!
21:13:28 <timburke> all right, moving along then
21:13:37 <timburke> #topic listen socket per worker
21:13:44 <timburke> #link https://review.opendev.org/#/c/745603/
21:13:44 <patchbot> patch 745603 - swift - Bind a new socket per-worker - 3 patch sets
21:13:59 <timburke> so i got around to testing this out in prod today
21:14:08 <timburke> results look promising!
21:15:13 <rledisez> timburke: did you observe an improvement in bandwidth or ttfb, for example? I would expect the 99th-percentile ttfb to decrease
21:15:24 <timburke> went from generally 10-30 req/worker and one poor guy with 170 reqs to 15-35 req/worker
21:15:36 <clayg> oh wow!
21:15:46 <timburke> rledisez, not sure yet -- i'll definitely look for it now that you ask
21:17:06 <rledisez> at some point I was playing with uwsgi to work around that behavior and I was able to increase the bandwidth used by the proxy server (all the cores were involved instead of just a few of them)
21:17:13 <timburke> this one proxy has been running like that for two or three hours now, and it certainly doesn't seem to have made anything *worse*
21:17:51 <rledisez> timburke: that's all I need to read :D
21:19:14 <timburke> as a heads-up, there's now an eventlet issue open about it, too
21:19:17 <timburke> #link https://github.com/eventlet/eventlet/issues/635
21:20:34 <timburke> oh, looks like it's closed now. i'm glad i replied :-)
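As a side note on why per-worker sockets help, here is a minimal sketch, assuming the patch takes the SO_REUSEPORT approach (this is not the patch's actual code): when each forked worker binds its own listen socket on the same port with SO_REUSEPORT, the kernel spreads new connections across the workers, instead of whichever worker wins the accept() race on one shared socket taking a disproportionate share of requests.

```python
import socket

def bind_worker_socket(bind_ip='0.0.0.0', bind_port=8080, backlog=4096):
    """Bind a fresh listen socket for one worker.

    Illustrative only: with SO_REUSEPORT, several processes can each bind
    their own socket to the same ip:port and the kernel will load-balance
    new connections across them, which is the behaviour the
    per-worker-socket patch is after.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)  # Linux >= 3.9
    sock.bind((bind_ip, bind_port))
    sock.listen(backlog)
    return sock

# Called once per forked worker, rather than once in the parent process.
```

That kind of kernel-level balancing is consistent with the per-worker request counts above tightening from a 10-170 spread down to 15-35.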
21:20:56 <timburke> #topic 404s during rebalance
21:21:02 <timburke> #link https://review.opendev.org/#/c/744942/
21:21:02 <patchbot> patch 744942 - swift - Client should retry when there's just one 404 and ... - 5 patch sets
21:21:39 <clayg> wow that's interesting!
21:22:43 <timburke> so we've observed troubles where we'd return a 404 even though data definitely *did* exist in the cluster -- and we're pretty sure we traced it back to primaries generally getting overloaded and timing out, coupled with a rebalance that meant *one* primary could respond quickly with a 404
21:23:33 <clayg> those new primaries just LOVE to hand out 404s - you want data? nope. you want data? nope, none for you either. This is easy 😎
21:24:25 <seongsoocho> that means the first response from an object node is a 404 and the others are timeouts?
21:25:07 <clayg> seongsoocho: yes, [Timeout, Timeout, 404] should be 503
21:25:22 <seongsoocho> oh i see
21:25:45 <clayg> or at least, in our experience the most useful thing a client can do in response to that set of backend primary responses is exponential backoff and then retry
21:26:11 <clayg> it's possible they'll get a 404 on [Timeout, 404, 404] - but that didn't seem to fit the pattern we observed most often
21:27:04 <clayg> I think tim made it configurable - but the default is 1 - which is what we want
21:27:27 <timburke> yeah, i added a new config option, ignore_rebalance_404s, that basically says, "if you get this many or fewer 404s from primaries, assume they're due to a rebalance and ignore them"
21:27:34 <clayg> I want to try to see if ignore_rebalance_404s = 0 mostly still passes existing assertions?!
21:28:17 <clayg> this patch and the worker socket patch are right there on the same list as py3 crypto 😞
21:28:20 <timburke> which isn't great in terms of the logging you get out of best_response, but i'm realizing that logging isn't great regardless (*especially* for ec data...)
21:28:33 <seongsoocho> then, should the operator set that value before a rebalance and remove it after the rebalance?
21:30:01 <timburke> you can -- but i think the default of 1 should be reasonable basically all the time (unless your rebalances are moving faster than replication can keep up)
21:32:06 <clayg> seongsoocho: another neat thing is that [Timeout(), Timeout(), 404-with-tombstone-timestamp] will still return 404 😎
21:32:11 <kota_> sorry i'm late
21:32:22 <seongsoocho> aha. ok. it looks very good. 🙂 I've also seen some 503 responses during rebalances.
21:33:01 <timburke> unless you've got min_part_hours=0 (iirc), the rebalance should only move one replica's worth of data at a time. i suppose you could explicitly set ignore_rebalance_404s to 0 when you know your rings ought to be stable, but even then you may have an empty primary because the PUT went to handoffs or something
21:34:01 <clayg> timburke: yeah I feel like this strategy of "keep responses out of best_response" isn't really extending very well
21:34:51 <timburke> anyway, i mainly just wanted to raise awareness of this issue we've seen, check if anyone else has seen erroneous 404s during rebalances, and feel out whether the config tunable feels like the right sort of knob
21:36:18 <clayg> it feels kinda similar in spirit to something like write_affinity_handoff_delete_count
21:37:00 <timburke> fwiw, there are limits to how far you can push it: negative values get raised to 0, and values at replica_count or higher get reduced to replica_count - 1
21:37:29 <timburke> (so the default is actually "0 if replica_count == 1 else 1")
21:37:40 <clayg> 💡
21:38:00 <clayg> i'll probably review this one first - what a great patch Tim! ❤️
21:38:28 <timburke> that was a bit of an issue for me with tests in the gate, since tempest tests (and i think dsvm?) run with replica_count == 1
21:38:43 <timburke> all right, that's all i've got
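To make the proposed behaviour concrete, here is a toy sketch of the decision described above. The option name ignore_rebalance_404s follows this discussion and may change in review; the Resp type and the function are made up for illustration and are not the proxy's best_response code.

```python
from collections import namedtuple

# Toy stand-in for a backend GET/HEAD result: a Timeout is modelled as
# status=None, and a 404 backed by a tombstone carries a timestamp.
Resp = namedtuple('Resp', ['status', 'timestamp'])


def looks_like_rebalance_404(primary_responses, ignore_rebalance_404s=1):
    """Return True when the 404s seen from primaries are few enough, and
    tombstone-less, to blame on a rebalance -- in which case the proxy
    should answer 503 so the client backs off and retries."""
    tombstone_404s = [r for r in primary_responses
                      if r.status == 404 and r.timestamp]
    empty_404s = [r for r in primary_responses
                  if r.status == 404 and not r.timestamp]
    if tombstone_404s:
        return False  # a tombstone is an authoritative "it was deleted"
    return 0 < len(empty_404s) <= ignore_rebalance_404s


# [Timeout, Timeout, 404] -> blame the rebalance; respond 503
print(looks_like_rebalance_404(
    [Resp(None, None), Resp(None, None), Resp(404, None)]))              # True
# [Timeout, 404, 404] -> two primaries agree; the 404 stands
print(looks_like_rebalance_404(
    [Resp(None, None), Resp(404, None), Resp(404, None)]))               # False
# [Timeout, Timeout, 404-with-tombstone] -> still a 404
print(looks_like_rebalance_404(
    [Resp(None, None), Resp(None, None), Resp(404, '1597871234.12345')]))  # False
```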
21:38:48 <timburke> #topic open discussion
21:38:58 <timburke> what else should we talk about this week?
21:39:14 <clayg> another thing to note is that patch is built on waterfall-ec
21:39:54 <kota_> I added some comments to liberasurecode patch and waiting response from timburke
21:40:06 <kota_> please revisit that.
21:40:13 <timburke> yes! thank you, kota_! sorry, i've been meaning to reply
21:40:18 <kota_> just notification ;)
21:42:08 <clayg> Does anyone have any reservations they've been holding back on waterfall-ec; matt left a few comments I need to clean up and tim found a bug... but you know... otherwise it seems like it's almost ready! we're shipping it anyway 😁
21:42:28 <timburke> the long and short of it was that (1) i was lazy and didn't want to have to do string comparisons in C against "yes", "YES", "1", "true", "t", "y", etc. and (2) i wasn't sure how best to handle something like LIBERASURECODE_WRITE_LEGACY_CRC=asdf
21:43:32 <clayg> LIBERASURECODE_WRITE_LEGACY_CRC=1 seems quite reasonable to me for C code, what does zaitcev think?
21:43:41 <clayg> anyone have a link?
21:43:52 <timburke> present/not present seemed to leave very little room for ambiguity :-) though i should probably at least make sure a blank value still counts as false
21:43:57 <timburke> #link https://review.opendev.org/#/c/738959/
21:43:58 <patchbot> patch 738959 - liberasurecode - Be willing to write fragments with legacy crc - 2 patch sets
21:44:02 * zaitcev thinks just the existence of the variable was a flag to us
21:44:09 <timburke> and
21:44:11 <timburke> #link
21:44:14 <timburke> #link https://review.opendev.org/#/c/739164/
21:44:15 <patchbot> patch 739164 - swift - ec: Add an option to write fragments with legacy crc - 1 patch set
21:45:04 <zaitcev> I'll check what Kota found over my +2, likely I missed something.
21:45:28 <kota_> thx zaitcev
21:45:58 <kota_> i might be worried too much.
21:47:56 <timburke> all right, seems like we're winding down
21:48:08 <timburke> thank you all for coming, and thank you for working on swift!
21:48:13 <timburke> #endmeeting
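As a footnote to the legacy-CRC thread above: a tiny sketch, in Python rather than liberasurecode's C and purely for illustration, of the "presence of the variable is the flag" semantics timburke and zaitcev describe, including the suggestion that an explicitly blank value should still read as off.

```python
import os

def write_legacy_crc_enabled():
    """Mirror the flag semantics discussed for
    LIBERASURECODE_WRITE_LEGACY_CRC: no parsing of "yes"/"1"/"true" --
    the variable merely existing turns the behaviour on, except that an
    explicitly empty value still counts as off.  Illustrative only; not
    the library's code."""
    value = os.environ.get('LIBERASURECODE_WRITE_LEGACY_CRC')
    return value is not None and value != ''
```

Under this reading, LIBERASURECODE_WRITE_LEGACY_CRC=1 and LIBERASURECODE_WRITE_LEGACY_CRC=asdf behave the same (both enable it), and unsetting the variable, or setting it to an empty string, disables it.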