21:00:01 #startmeeting swift
21:00:02 Meeting started Wed Jun 13 21:00:01 2018 UTC and is due to finish in 60 minutes. The chair is notmyname. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:04 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:06 The meeting name has been set to 'swift'
21:00:08 who's here for the swift team meeting?
21:00:40 hi o/
21:01:58 hi
21:02:00 rledisez: hmm... think anyone else is joining us today?
21:02:12 o/
21:02:27 acoles: torgomatic_: hello
21:04:25 clayg is out this week. kota_ is out too
21:05:03 not sure where everyone else is
21:05:34 torgomatic_: do you want to go over your patch question, or should we wait until more people are online?
21:05:43 eh, I can go now if you like
21:05:48 I don't see much else on the agenda this meeting
21:06:09 I've written a change that makes the object updater have both multiple processes and per-process concurrency. The goal is to be faster at handling large numbers of pending updates.
21:06:15 I tested it on one (real-hardware) node with a few million async_pendings; the graph is here: https://storage101.dfw1.clouddrive.com/v1/MossoCloudFS_3b02a08f-d96b-49fe-b8a5-b88551fad855/graphs/parallel-expirer-graph.png?temp_url_sig=4127cb9b472e3222bc865e936c9cb023d12a64a0&temp_url_expires=1615318360&inline
21:06:23 The only weird thing here is the config parameters "concurrency" and "updater_workers". On master, "concurrency" means "number of processes", and each process handles async_pendings serially. With the patch, "updater_workers" means "number of processes", and "concurrency" means "number of concurrent updates in each process".
21:06:28 Most other Swift daemons have similar configs where "(type_)?workers" controls process count and "concurrency" controls greenthread count, like the object reconstructor, object replicator, object server, object expirer, container replicator, container server, account replicator, account server, and account reaper. The only other daemons using "concurrency" to control process count are the object auditor and container updater.
21:06:35 On upgrade, this patch would result in a small change for existing clusters that have set concurrency=N. Before, they'd have N processes spread across (up to) N disks, each processing one async_pending at a time. After, they'd have 1 process operating on each disk in turn, processing N async_pendings at a time.
21:06:41 I think this is fine, but others may disagree, and that's why it's a meeting topic.
21:07:12 * notmyname reading
21:07:56 I am very glad rledisez is here :-)
21:08:07 :)
21:08:16 let me read it all once again ;)
21:09:21 Sorry I'm late o/
21:09:22 I raised this when reviewing the patch. The patch is great, the move towards consistent option naming is laudable, but I felt the implications of the option name changes warranted wider discussion (in our brave new world of single +2)
21:10:04 it's a good question to raise. my gut reaction is the same as acoles's.
21:10:06 torgomatic_: it's not really in the scope of the patch, but i'm curious: how did you end up with millions of async pendings?
21:10:12 torgomatic_: what's the upgrade path?
21:10:51 rledisez: lots and lots of PUTs of small objects to a single container
21:11:26 we need a better solution for those small objects! ;)
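To make the process/greenthread split described at 21:06:23 concrete, here is a minimal sketch, not code from the patch, assuming "updater_workers" forks that many worker processes and "concurrency" sizes an eventlet greenthread pool inside each one. The helpers list_async_pendings and process_update are hypothetical stand-ins, not Swift APIs.

```python
# Minimal sketch, not the actual patch: "updater_workers" processes,
# each draining async_pendings with a pool of "concurrency" greenthreads.
import os
from eventlet import GreenPool

UPDATER_WORKERS = 1  # number of worker processes (new option)
CONCURRENCY = 8      # concurrent updates per process (new meaning)

def list_async_pendings(worker_index):
    # hypothetical stand-in for walking the async_pending directories
    # on the disks assigned to this worker
    return []

def process_update(update):
    # hypothetical stand-in for sending one container update
    pass

def worker(worker_index):
    pool = GreenPool(CONCURRENCY)
    for update in list_async_pendings(worker_index):
        pool.spawn_n(process_update, update)  # at most CONCURRENCY in flight
    pool.waitall()

children = []
for i in range(UPDATER_WORKERS):
    pid = os.fork()
    if pid == 0:  # child: handle this worker's share of the disks
        worker(i)
        os._exit(0)
    children.append(pid)
for pid in children:
    os.waitpid(pid, 0)
```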
21:11:27 notmyname: the upgrade path is pretty much just upgrade Swift as normal
21:11:40 torgomatic_: that's what i thought ;)
21:11:40 i pushed this last week exactly for that situation: https://review.openstack.org/#/c/571917/
21:11:40 patch 571917 - swift - WIP: Manage async_pendings priority per containers
21:11:50 current master only has the "concurrency" setting?
21:11:55 notmyname: correct
21:12:11 related to that patch, i'm a bit afraid of the I/O pressure it would put on disks if you have, let's say, concurrency=60
21:12:25 operators should be warned to take care of that value before upgrading
21:12:30 torgomatic_: and the default value of the new "updater_workers" setting?
21:12:31 so you go from N-at-a-time before to N-at-a-time after, but the distribution is worse
21:12:41 rledisez: agreed; it's definitely worth a mention in the release notes
21:12:52 or a mailing list message to openstack-operators, or both
21:13:12 notmyname: default is 1, I think... let me look
21:13:20 1 or 1 per drive?
21:13:32 so what torgomatic_ says ^^ was my concern: that if all you do is upgrade swift, you might end up with updater performance
21:13:53 defaults are concurrency=128 (which may be a bit high; I just made it up) and updater_workers=1
21:13:54 degradation
21:14:05 acoles: true; if you've got only a few async_pendings, then you're okay with the upgrade
21:14:22 torgomatic_: so that means on upgrade, you would *not* have one per drive. you'd have only one
21:14:25 if you have a lot, then this upgrade could slow you down until you notice the new settings, and then you get to process them faster than before
21:14:26 and if you've got plenty of them, you're already in trouble :D
21:14:41 notmyname: one what, exactly? process? concurrent update?
21:14:45 ah! "one...operating on each disk in turn". got it. /me learns to read
21:15:01 I misread it as "after upgrade...you'd have one per disk"
21:17:05 so what's the alternative? invent new config variables and keep the inconsistency across all settings?
21:17:11 rledisez: can I check, when you say concurrency=60 might be a bad idea, do you mean the new concurrency, i.e. 60 threads, or do you mean 60 processes?
21:18:18 notmyname: yeah, pretty much. We could have "concurrency" for processes and "concurrency_per_process" for greenthreads or something, and then it's a special case forever but there are no upgrade issues
21:18:37 I prefer moving towards consistency; there are too many daemons not to, IMO
21:19:08 so the question is "do we choose new names with inconsistency" or "go for config consistency at the risk of an upgrade slowing down existing clusters until settings are changed"
21:19:53 right, and it would only really harm existing clusters with lots of async_pendings
21:20:05 acoles: if i got it right: now, concurrency=60 means 60 processes (one per disk)
21:20:05 after: concurrency=60 means one process per disk, 60 threads each. i think 60 threads per disk is a bad idea (if the value is just 2, it should be fine for everybody)
21:20:12 I'm going to assume all clusters have times with lots of async pendings
21:20:48 rledisez: fair enough; so one process per disk and 2 greenthreads per process is something you think is reasonable?
21:20:54 rledisez: ok, that makes sense.
21:21:05 for reference, the graph above was done with 1 process per disk and 32 greenthreads per process
21:21:22 IMO, seeing as 100% of clusters have to update configs in order to take advantage of the new feature, going for consistency is worth it
21:21:50 yes. i can't tell the right number per disk without doing some impact testing, but up to 5-10 it should be fine. over 10 is probably too much (think unlinking, inode updates, journal updates: too much I/O)
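To illustrate the upgrade concern raised around 21:13:32, here is a hedged sketch, not code from the patch, of how a pre-upgrade config that only sets "concurrency" would be reinterpreted under the new option names. The conf fragment and fallback handling are assumptions; the defaults shown (updater_workers=1, concurrency=128) are the ones quoted at 21:13:53.

```python
# Hedged illustration of the post-upgrade reinterpretation of an existing
# config. OLD_CONF stands in for a hypothetical object-server.conf that
# was tuned before the patch landed.
from configparser import ConfigParser

OLD_CONF = """
[object-updater]
concurrency = 4
"""

conf = ConfigParser()
conf.read_string(OLD_CONF)
section = conf["object-updater"]

# old meaning: concurrency = 4 processes, each serial
# new meaning: updater_workers processes, each running
# "concurrency" updates at once
workers = section.getint("updater_workers", fallback=1)
per_process = section.getint("concurrency", fallback=128)

print("%d process(es), %d concurrent update(s) each" % (workers, per_process))
# -> 1 process(es), 4 concurrent update(s) each: the same total number of
#    updates in flight, but concentrated on one disk at a time
```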
21:21:51 hmm... and changing the default as rledisez suggests would further minimize impact
21:22:25 rledisez: 8 it is... that's a power of 2, so people are less likely to wonder about it ;)
21:22:30 :D
21:22:36 * torgomatic_ will change the default to 8 after the meeting
21:22:55 Lol
21:23:03 everyone happy with this conclusion?
21:23:16 * rledisez is wondering how far we are from dynamically adapting all of this and not worrying anymore about default values
21:23:17 notmyname: if you have concurrency >1 now then you do not need to change config to take advantage of the new feature, but you do need to change config to retain the old behavior
21:23:59 +1 for consistency, +1 for 8, +1 for upgrade docs. +100 for some good work
21:24:17 rledisez: do you think "auto" as a value for updater_workers would work, where "auto" means "1 process per disk, however many that is"?
21:24:23 (future patch, not this one)
21:24:46 but that'd let you set greenthreads-per-disk and then forget about the rest
21:25:13 torgomatic_: no, i'm just dreaming that one day, processes will know the number of I/Os they can use without impacting the TTFB
21:25:23 rledisez: +1000!!!
21:25:51 yeah, if we have a single process responsible for each disk's IO, that gets easier
21:26:27 kind of tough to do that with all these uncoordinated WSGI servers, replicators, auditors, and other things all running around at the same time
21:26:33 that's definitely a goal that I want us all to work towards over the next year or so
21:27:52 ok, anything else major to discuss this week?
21:27:57 * notmyname has some time pressure today
21:28:09 nothing else here
21:28:14 torgomatic_: thanks for bringing that up
21:28:17 just one (quick) question about "tsync" work
21:28:20 ok
21:28:28 do you think it would be in python or golang?
21:28:32 yes
21:28:33 mattoliverau has been asking about the ring composer CLI, whether we should merge it?
21:28:44 also yes
21:29:11 #link https://review.openstack.org/451507
21:29:12 patch 451507 - swift - Experimental swift-ring-composer CLI to build comp...
21:29:43 rledisez: in reality, I think we will (and should) do tsync before any object server rewrite. which means we should implement tsync such that there aren't any language-specific serialization issues. yes, that means we'll write it twice
21:30:18 ok. because my fear is that a tsync in golang means redoing all the work on LOSF to write the diskfile in golang
21:30:30 but I guess either way, at some point everything will be rewritten
21:31:13 acoles: mattoliverau: having something is better than having nothing. maybe it should have a stderr line on invocation that says "this is experimental" or something like that
21:31:33 notmyname: IIRC it has that, maybe on stdout
21:31:39 oh ok
21:32:31 notmyname: no, stderr https://review.openstack.org/#/c/451507/18/swift/cli/ringcomposer.py@166
21:32:32 patch 451507 - swift - Experimental swift-ring-composer CLI to build comp...
21:32:33 rledisez: I'm not sure why a golang tsync means rewriting the diskfile in golang
21:32:55 I agree that something is better than nothing re: ring-composer. Get *something* in now, then we can iterate later.
21:33:09 torgomatic_: acoles: also, it's got a warning! :-)
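Returning to the "auto" idea floated at 21:24:17 (explicitly a future patch, not this one): below is a sketch, under assumed semantics, of resolving updater_workers to one process per local disk by counting device directories under the node's devices root. The /srv/node path and the function name are illustrative assumptions, not anything from the patch.

```python
# Sketch of the hypothetical "auto" value for updater_workers: one worker
# process per device directory found under the node's devices root.
import os

def resolve_updater_workers(value, devices="/srv/node"):
    if value == "auto":
        try:
            disks = [d for d in os.listdir(devices)
                     if os.path.isdir(os.path.join(devices, d))]
        except OSError:
            disks = []
        return max(1, len(disks))  # never fewer than one worker
    return int(value)

# e.g. a node with /srv/node/sdb1 .. /srv/node/sdm1 resolves to 12 workers
print(resolve_updater_workers("auto"))
```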
21:33:19 I always find it easier to fix something than to start with an empty .py file and come up with something good.
21:33:42 torgomatic_: alternatively, I've never had issues starting with an empty .py file and coming up with something bad
21:33:49 :)
21:34:17 I can start and end with an empty .py file :P
21:34:18 Great, my thoughts exactly.
21:34:33 leave *nowhere* for the bugs to hide!
21:34:46 notmyname: well, I guess it depends a lot on the design. If the server is the normal object-server, no issues. if the server is a tsync server, it needs to use the diskfile. so i'm especially thinking of the LOSF diskfile
21:35:49 rledisez: I guess the good news is that nobody has started writing anything "tsync" yet, so it can be whatever we want it to be
21:35:55 Re: composter. I've been talking to people about a cool feature called composition rings, so it would be nice to have something people could at least play with. I know the classes are there, but a client is needed.
21:36:05 *composite
21:36:12 Don autocomplete
21:36:19 but IMO tsync should definitely account for the LOSF work from the beginning
21:36:46 DHE also bought it up this week, which reminded me that we never landed it
21:37:06 I've got two final thoughts (then I need to close the meeting and go offline for a bit)
21:37:16 *brought. (not to be confused with bribes to get what I want :)
21:37:37 * mattoliverau gives up typing on his phone, autocomplete sucks and y'all just have to learn to guess what I'm saying :p
21:37:43 oh. I'm sorry.
21:37:47 (1) it's really great to see all the review activity and code landing. my subjective feeling is that the single +2 policy is so far successful (time will tell if it's sustainable)
21:38:02 Lol
21:38:36 mattoliverau: I'd love it to be renamed as the 'composter' :)
21:38:49 DHE: that wasn't directed at you, sometimes my spelling is bad.. I am Australian
21:38:56 acoles: exactly!
21:39:09 (2) please remember to be kind on reviews. there was one review comment I saw recently that was perhaps a little more accusatory than it needed to be (not from anyone in here). seemed like a good time to thank everyone for their normally very kind reviews
21:39:22 mattoliverau: I know, I'm just trying to be funny...
21:40:13 DHE: lol, I appreciate that ;) keep it up
21:40:41 thank you, everyone, for your contributions to swift. I'm glad you're part of the community
21:40:44 #endmeeting