21:00:03 <timburke> #startmeeting swift
21:00:04 <openstack> Meeting started Wed Oct 30 21:00:03 2019 UTC and is due to finish in 60 minutes. The chair is timburke. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:00:05 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:00:07 <openstack> The meeting name has been set to 'swift'
21:00:13 <timburke> who's here for the swift meeting?
21:00:19 <kota_> o/
21:00:37 <mattoliverau> o/
21:00:59 <rledisez> o/
21:01:36 <timburke> agenda's at https://wiki.openstack.org/wiki/Meetings/Swift
21:01:47 <timburke> #topic Shanghai
21:01:59 <timburke> it's almost here!
21:02:14 <kota_> yey
21:02:22 <timburke> in two days, i'll be on the plane! i'm excited (and a little freaking out)
21:02:35 <rledisez> H-36 before my flight :)
21:02:55 <timburke> i've been adding what events i know about to the etherpad
21:02:58 <timburke> #link https://etherpad.openstack.org/p/swift-ptg-shanghai
21:03:02 <kota_> it'll logn flight, please safe.
21:03:23 <kota_> be long
21:03:25 <timburke> in particular, i saw there was a game night, like there have been the last few PTGs
21:03:48 <timburke> they tend to be pretty fun, and a good opportunity to get to know some of the other openstackers better
21:04:08 <timburke> i also saw that cschwede is going to be there!
21:04:08 <kota_> good to know
21:04:16 <kota_> when?
21:04:22 <timburke> (at the ptg that is; i don't know about game night ;-)
21:04:57 <kota_> ah, you told about the past ones.
21:05:03 <kota_> got it.
21:05:18 <timburke> game night's thursday, 8:00 PM, City Center Marriott Lobby
21:05:30 <kota_> ah ok. thx.
21:06:00 * kota_ should go to the etherpad link
21:07:05 <timburke> oh, there was also a flyer the foundation put together, where'd i put that...
21:07:18 <timburke> #link https://object-storage-ca-ymq-1.vexxhost.net/swift/v1/6e4619c416ff4bd19e1c087f27a43eea/www-assets-prod/summits/shanghai/Shanghai-Travel-Tips.pdf
21:07:41 <timburke> with some travel tips
21:07:50 <kota_> travel tips!
21:08:37 <timburke> "Some restrooms do not supply toilet paper. Suggested to carry some with you." is a little disconcerting...
21:08:46 <timburke> but good to know!
21:08:51 <kota_> oic. "Download ALL apps needed on your phone / desktop (including the Summit Mobile App!)"
21:09:04 <kota_> `including the Summit App!`
21:09:11 <mattoliverau> oh wow so soon
21:09:40 <timburke> yeah, i still need to get my phone situated... laptop's prepped; phone, not yet
21:09:56 <mattoliverau> at least y'all will be close to my timezone soon (assuming you'll have irc access)
21:10:38 <kota_> strictly speaking, my timezone is closer ;-)
21:10:55 <timburke> mattoliverau, i *think* irc will be ok? ptgbot isn't going to be so useful otherwise, anyway ;-)
21:11:04 <mattoliverau> ahh good point :)
21:11:18 <mattoliverau> kota_: lol
21:11:22 <mattoliverau> that is true
21:12:29 <timburke> all right, that's all i've got for summit/ptg -- i can't wait to see kota_ clayg rledisez alecuyer and cschwede there!
21:12:37 <timburke> on to updates!
21:12:46 <timburke> #topic versioning
21:13:46 <timburke> clayg and tdasilva have got the three patches stacked up now, and they've been iterating on it
21:13:50 <mattoliverau> cschwede there too, that's awesome!
21:14:44 <timburke> i haven't been able to follow along quite as closely as before, as i've gotten a bit distracted with summit/ptg/general-travel prep
21:15:58 <mattoliverau> timburke: you do a great job of following as much as you do.
21:16:25 <mattoliverau> I guess they're not here to do an update. maybe we just link the patches and move on then?
21:16:25 <timburke> but i got the impression they've been adding tests and fixing up rough edges, with the idea that clayg will have a pretty solid picture of what's involved so we can talk about it at the ptg
21:16:35 <mattoliverau> cool
21:16:58 <mattoliverau> so what comes first, the null containers?
21:17:09 <mattoliverau> or namespace
21:17:20 <mattoliverau> whatever terminology I'm supposed to use :P
21:17:29 <timburke> yep, null namespace first -- https://review.opendev.org/#/c/682138/
21:17:48 <timburke> then a new versioning api for swift -- https://review.opendev.org/#/c/682382/
21:18:26 <timburke> and finally hooking up s3api to use the new api -- https://review.opendev.org/#/c/682382/
21:18:49 <mattoliverau> sweet
21:18:54 <timburke> i think that ought to cover it
21:19:00 <timburke> #topic lots of small files
21:19:03 <kota_> those look big changes
21:20:28 <timburke> kota_, fortunately, almost 4k of the 5k lines in that middle patch are just new test files :-)
21:20:39 <kota_> :-)
21:20:54 <rledisez> alecuyer is not here, but I think he told me there is nothing new on losf this week
21:21:29 <timburke> rledisez, is there anything we should be trying to do or look at to be more prepared at the ptg?
21:22:18 <rledisez> I think what we have in mind right now is to stop evolving it for a while until it can be merged (fix bugs, tests, nothing new before merge)
21:22:54 <rledisez> I think alecuyer should be answering your question, i can't answer honestly
21:24:10 <timburke> that's ok, no worries.
something of a freeze sounds reasonable; and i guess the rest of us ought to be thinking about what needs to happen next for us to feel comfortable merging it to master
21:25:04 <timburke> #topic profiling
21:25:13 <timburke> #link https://etherpad.openstack.org/p/swift-profiling
21:25:33 <timburke> rledisez, take it away :-)
21:25:38 <rledisez> thx :)
21:26:13 <rledisez> so, the full story is in the etherpad, but in short, we are CPU bound on proxy-servers, and it does not seem right that a decent proxy-server cannot handle more than 3 or 4 Gbps of traffic
21:26:42 <rledisez> so I did some profiling, I played with conf option timburke suggested last week and I put result there
21:27:11 <rledisez> first of all, I'm interested if you see any issue in the bench I did (wrong methodology etc…)
21:27:33 <rledisez> after that, I propose some ideas at the bottom to improve the situation i'd like to discuss
21:27:52 <rledisez> basically, object-server is fine. proxy/GET is fine, proxy/PUT is damn slow
21:28:22 <timburke> you said it's got 10Gb NICs -- are there two of them? one client-facing, one cluster-facing?
21:29:05 <rledisez> timburke: on our production yes, but for the benchmark, all was local? on production we are far from 10Gbps on either interface
21:29:18 <rledisez> i mean, it was local, for sure :)
21:30:08 <rledisez> note: I still need to bench with EC policy
21:30:37 <timburke> and are we measuring bandwidth on the client-facing traffic, or cluster-facing?
21:30:47 <timburke> (just to sanity check ;-)
21:30:48 <zaitcev> Is it possible to refuse in-kernel MD5 and try some local libraries?
21:30:58 <zaitcev> Maybe the kernel overhead is too great or something.
21:31:18 <rledisez> timburke: client facing (and I understand that cluster facing we are expecting N*bandwidth for a PUT)
21:32:05 <kota_> rledisez: the benchmark ran under py3 or still py2?
21:32:16 <rledisez> zaitcev: are you talking about the splice option?
21:32:41 <rledisez> kota_: I did both for some measure, I didn't see any major difference, but mostly py2
21:32:54 <kota_> ok
21:34:08 <timburke> so with the 1MB chunk size... the client's seeing 5Gbps, so we must be generating 15Gbps on the cluster interface -- which seems about in line with the upper-bounds you were seeing in the object-server...
21:34:09 <zaitcev> No, I am saying that all of our MD5 are calculated by kernel nowadays, right?
21:34:22 <zaitcev> Every time you invoke md5 it's a syscall
21:34:32 <zaitcev> Using the AF_LINK or what's its name
21:34:33 <timburke> zaitcev, nope -- rledisez already pointed out that i had the wrong idea about that ;-)
21:34:59 <zaitcev> ok
21:35:57 <rledisez> timburke: In my bench I had 3 object-server that could reach about 14Gbps, so in the best of best world the proxy should handle 15Gbps (because it's all localhost traffic, writing on /dev/shm)
21:36:19 <timburke> when we're writing we just use the normal python hashlib: https://github.com/openstack/swift/blob/2.23.0/swift/obj/diskfile.py#L1669
21:37:03 <rledisez> and it's quite good given that object-server is "only" 20% slower than simple md5sum
21:37:25 <rledisez> so I'm not expecting major improvement on object-server
21:37:52 <rledisez> (well +17% bw / -17% cpu is still something :))
21:38:43 <rledisez> just to be clear, i'm not suggesting at all to remove md5 calculation :) I just did it to get the best of proxy-server
21:40:11 <timburke> eh, i know notmyname's talked about the idea of using something other than md5 before... that's not *such* a crazy idea...
21:40:46 <timburke> but yeah, i'm not sure how best to further investigate ATM
21:41:09 <mattoliverau> Makes me wonder if some simple perf daily or weekly check in zuul might be useful. Obviously with a grain of salt because of shared tenants. But might catch major degradations.
21:41:42 <rledisez> well, with just the no-timeout/no-queue on proxy we already could get a significant perf improvement
21:42:46 <mattoliverau> I wonder why timeouts cause such a slow down, is it their implementation, because I wonder why a watchdog thread works so well
21:43:08 <mattoliverau> I would've thought a timeout would be a timed thread or something somewhat similar
21:43:22 * mattoliverau has never looked under the hood though
21:43:53 <mattoliverau> is it something eventlet monkey patches (me is just thinking out loud).
21:43:55 <rledisez> mattoliverau: it is quite good, except we call it for each chunk (so each piece of 64KB), so it is called thousands of times for an upload
21:44:05 <mattoliverau> ahh
21:44:08 <mattoliverau> ok
21:44:18 <rledisez> while a watchdog will be initialized once and just a variable is updated then
21:44:25 <timburke> yeah, eventlet basically schedules an event for later to raise the Timeout in the appropriate thread
21:45:19 <timburke> and i think it's also part of why the increased chunk size captures a lot of the no-timeout gain
21:45:48 <rledisez> and the queue, well, the same * N, and it needs a synchronisation each time (so lock etc…)
21:46:06 <rledisez> timburke: right, bigger chunk == less call to Timeout/queue
21:46:23 <timburke> rledisez, did you happen to measure RAM consumption differences between the different chunk sizes?
21:47:26 <timburke> i wonder if we should just up the default chunk size...
21:47:29 <rledisez> nope, I didn't. but it's quite easy to calculate the worst case scenario I think. the max queue size is 10 IIRC, there is N replica, so chink_size * N * 10 ?
21:47:45 <rledisez> *chunk
21:47:58 <rledisez> per PUT
21:48:33 <timburke> and it was always a single PUT at a time, right?
21:49:39 <rledisez> yeah, I'm planning to do more on concurrency later to see if there is something to optimize there
21:49:47 <rledisez> right now it's focus on one connection performance
21:51:51 <mattoliverau> rledisez: great job
21:51:52 <timburke> out of curiosity, what kinds of speeds can you get with netcat? testing locally just now, i can get ~24GB/s with dd piping straight to /dev/null, but only 1.3-1.4GB/s (so ~11Gbps) if i send it through a socket that's piping to /dev/null...
21:52:25 <rledisez> timburke: do you have the exact command so I can copy/paste?
21:53:10 <timburke> in one terminal, `nc -l 8081 > /dev/null` -- in another, `dd if=/dev/zero bs=1M count=10000 | nc localhost 8081`
21:53:43 <rledisez> 15.5 GB/s
21:53:52 <timburke> i tried twiddling bs/count to do even larger chunk sizes, but it didn't have much difference
21:54:08 <timburke> why's my laptop so slow!? boo!
21:54:28 <timburke> good to know though, to keep in mind as an upper bound :-)
21:54:56 <rledisez> I can provide you a server to work, but you're going to have trouble at china customs ;)
21:55:16 <timburke> all right, well... i guess we'll keep thinking about it. willing to bet we'll talk about this more next week
21:55:24 <kota_> lol
21:55:33 <timburke> got just a few more minutes
21:55:38 <timburke> #topic open discussion
21:55:47 <timburke> anything else anyone would like to bring up?
21:56:29 <mattoliverau> I have a mate who might be convincing his work to do some upstream time. They're interested in tiering. So if it goes ahead I might point him at those stalled patches.
21:56:59 <timburke> \o/ i love new contributors!
21:57:05 <mattoliverau> if so, a discussion that should be had (maybe at ptg) is is it still the right design?
21:57:27 <kota_> good
21:57:31 <mattoliverau> or maybe should it use the new null namespace and hide tiering containers?
21:58:11 <timburke> excellent question
21:58:15 <mattoliverau> I had a chat with him about some of it already while giving him a Swift intro the other day online.
21:59:01 <timburke> out of curiosity, who's his employer, if you can say?
21:59:13 <mattoliverau> if you can add that to the list of discussions it would be good. It'll depend on if he can swing it. But maybe as a friday thing after other null namespace discussions
21:59:25 <mattoliverau> Can't say yet
21:59:37 <timburke> 👍
21:59:54 <mattoliverau> timburke: but you might know them because they may or may not use swiftstack ;)
22:00:05 <timburke> all right, we're about at time
22:00:18 <timburke> thank you all for coming, and thank you for working on swift!
22:00:22 <timburke> #endmeeting
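
The arithmetic from the profiling topic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not Swift code: the function names are made up for the example, the 1 GiB upload is an arbitrary object size, 3 replicas matches the three object-servers in rledisez's bench, and the queue depth of 10 is the figure rledisez quoted from memory ("IIRC"), so treat it as an assumption.

```python
KiB = 1024
MiB = 1024 * KiB
GiB = 1024 * MiB


def chunks_per_upload(object_bytes, chunk_bytes):
    """Number of chunks a PUT is split into (integer ceiling division).

    Each chunk costs one Timeout setup and one queue operation per replica,
    so this is roughly how many times those paths run for one upload.
    """
    return -(-object_bytes // chunk_bytes)


def worst_case_buffered_bytes(chunk_bytes, replicas, max_queue_depth=10):
    """Worst-case bytes queued for one PUT: chunk_size * N * queue depth.

    max_queue_depth=10 is the value recalled in the meeting (an assumption).
    """
    return chunk_bytes * replicas * max_queue_depth


def cluster_gbps(client_gbps, replicas):
    """Cluster-facing bandwidth generated by a replicated PUT (N * client)."""
    return client_gbps * replicas


if __name__ == "__main__":
    replicas = 3  # assumption: 3-replica policy, as in the local bench
    for chunk in (64 * KiB, 1 * MiB):
        n = chunks_per_upload(1 * GiB, chunk)
        ram = worst_case_buffered_bytes(chunk, replicas)
        print(f"chunk={chunk // KiB} KiB: {n} chunks per 1 GiB PUT, "
              f"worst case {ram / MiB:.2f} MiB buffered per PUT")
    print(f"5 Gbps client-side -> {cluster_gbps(5, replicas)} Gbps cluster-side")
```

Under these assumptions, moving from 64 KiB to 1 MiB chunks cuts the per-upload Timeout/queue calls by a factor of 16 while raising the worst-case buffering per PUT by the same factor, which is the trade-off behind timburke's question about RAM consumption.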