Wednesday, 2019-04-24

*** auk has joined #kata-dev00:21
*** tmhoang has quit IRC01:55
*** igordc has joined #kata-dev02:43
*** igordc has quit IRC05:49
*** pcaruana has joined #kata-dev06:24
*** auk has quit IRC06:25
*** tmhoang has joined #kata-dev07:47
*** jodh has joined #kata-dev08:06
*** gwhaley has joined #kata-dev08:09
kata-irc-bot2<james.o.hunt> @claire - I think syndicating the weekly emails to the Kata blog is a great idea! :) /cc @eric.ernst.08:23
*** stackedsax has quit IRC09:53
*** stackedsax has joined #kata-dev09:53
brtknrHey all, just reporting on some preliminary fio test results.... why does kata perform so well in the sequential write case compared to sequential read: https://raw.githubusercontent.com/brtknr/kata-vs-crio/master/aggregate-bw-kata.png11:38
brtknrFor comparison, this is the raw baremetal performance: https://raw.githubusercontent.com/brtknr/kata-vs-crio/master/aggregate-bw-bare.png11:39
*** devimc has joined #kata-dev12:42
kata-irc-bot2<graham.whaley> Hi @brtknr - wrt why write might be looking much better than read - my best guess is it might be due to what is being cached and where... you might have to show the exact fio settings you used (and there are sooooo many ;) ).12:52
kata-irc-bot2<graham.whaley> writes could be getting cached in the VM (and not flushed out or sync'd to the host), whereas reads have to come from the host (unless the same item is being read again, and happens to have been cached in the guest)12:52
kata-irc-bot2<graham.whaley> For instance, in our fio metrics tests, we default to using fio direct=true mode to try and avoid such effects12:53
kata-irc-bot2<graham.whaley> https://github.com/kata-containers/tests/blob/master/metrics/storage/fio.sh#L5412:53
kata-irc-bot2<graham.whaley> as we really are interested in measuring the transport mechanism, and not any guest cache effects. Really depends on exactly what you want to measure ;)12:53
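
As a rough illustration of the direct I/O point above (flag values are placeholders, not the settings from the linked fio.sh), a sequential-write job that bypasses the page cache inside the guest might look like:

    # direct=1 opens the file with O_DIRECT, so writes are not absorbed
    # by the guest page cache before they ever reach the host
    fio --name=seqwrite --rw=write --bs=64k --size=2g \
        --ioengine=libaio --direct=1 --numjobs=4 --group_reporting
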
*** irclogbot_3 has quit IRC12:55
*** irclogbot_2 has joined #kata-dev12:55
*** altlogbot_1 has quit IRC12:57
*** altlogbot_0 has joined #kata-dev12:57
*** fuentess has joined #kata-dev12:58
brtknrgwhaley: Thanks for the heads up, I will try again with direct=true :)13:05
brtknrgwhaley: Do you know why sequential read is not getting cached in the same way?13:06
kata-irc-bot2<graham.whaley> brtknr - reads will only be cached once they are read - so, if you read the same thing twice, maybe the 2nd read will hit the cache. sequential will probably never hit the cache, as you only read each item once. random you have some chance (depending on the random algorithm) of reading some items more than once. but13:07
kata-irc-bot2<graham.whaley> this also all depends on how big your test file is, and how big your cache (ram) is... :slightly_smiling_face:13:08
kata-irc-bot2<graham.whaley> it is probably worth you reading the top part (the config settings) of that fio test we have already. fio is very 'flexible' though13:08
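
A quick way to see the read-cache behaviour described above (generic Linux commands, shown only as a sketch; testfile is a hypothetical file name) is to flush the page cache and then read the same file twice:

    # drop the page cache so the first read is genuinely cold
    sync
    echo 3 | sudo tee /proc/sys/vm/drop_caches

    # first pass comes from disk; a second pass over the same data
    # is served from the page cache and reports much higher throughput
    dd if=testfile of=/dev/null bs=1M
    dd if=testfile of=/dev/null bs=1M
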
brtknrso write caching acts more like a buffer whereas read caching is more like a short term memory?13:13
kata-irc-bot2<graham.whaley> yes, I think you could say that. Of course, reality is a bit more complex...13:15
kata-irc-bot2<graham.whaley> in the case of Kata for instance, as it is a VM, you have caches potentially (but not always) both in the VM (guest) and on the host. There are pros and cons to having either both enabled, or only one enabled. And, I think the setups change depending on what sort of backing store/graph driver is being used (like 9p or devicemapper or virtio-fs etc.). It's moderately complex :slightly_smiling_face:13:17
kata-irc-bot2<graham.whaley> I remember now @brtknr - somebody asked about some of this before on an Issue, and I wrote some stuff down - have a look at https://github.com/kata-containers/tests/issues/560 maybe13:19
brtknrgwhaley: that's great, some nice explanations there, thank you :)13:31
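
For background on the host-side half of that caching stack: with plain QEMU, the block device cache mode determines whether the host page cache is involved at all. The commands below are standard QEMU options shown purely as a sketch (disk.img is a placeholder); the mode Kata actually selects depends on its configuration and the storage driver in use.

    # host page cache bypassed: QEMU opens the image with O_DIRECT
    qemu-system-x86_64 -m 2048 -drive file=disk.img,format=raw,cache=none

    # host page cache used: writes complete once they land in host RAM
    qemu-system-x86_64 -m 2048 -drive file=disk.img,format=raw,cache=writeback
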
brtknrgwhaley: Also looking at the graph again, why does a single kata VM have a significantly better randwrite IO compared to 64 running in parallel?14:23
brtknrDo they have a global lock?14:24
gwhaleybrtknr: are you running 64 katas, or 64 fio tests inside a single kata?14:24
gwhaleythat is - maybe a 64 threaded parallel fio test inside a single kata (to be clear)14:24
brtknrgwhaley: I'm running 64 katas across 2 nodes, 32 instances per node14:24
brtknrSingle fio job per kata14:25
brtknr4 threads per job14:25
brtknrbut I think fio aggregates this result14:25
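
For reference on that aggregation (a sketch, not the job actually run): when numjobs is greater than one and group_reporting is set, fio folds all threads of the job into a single reported bandwidth figure.

    # four parallel threads; fio reports one combined bandwidth line for the group
    fio --name=randwrite --rw=randwrite --bs=4k --size=1g \
        --numjobs=4 --group_reporting
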
gwhaleyright. I've not tried that ;-) So, a single instance is out-performing the total of the 64 instances?14:25
brtknrgwhaley: yep, that's the claim14:25
gwhaleyit could be a latency thing, or a bottleneck somewhere. It's an interesting case/finding14:26
brtknrhttps://raw.githubusercontent.com/brtknr/kata-vs-crio/master/aggregate-bw-kata.png14:26
gwhaleydoes the non-kata case scale more linearly?14:26
brtknrgwhaley: it does: https://raw.githubusercontent.com/brtknr/kata-vs-crio/master/aggregate-bw-runc.png14:26
brtknrhttps://raw.githubusercontent.com/brtknr/kata-vs-crio/master/aggregate-bw-bare.png14:26
brtknrbare=bare metal, runc=well, runc14:27
gwhaleyjust to throw one thought out then - in the kata case, each instance will have its own cache inside the VM, and then share the cache on the host as well. in the bare metal case there is just the one cache on the host. It might be14:28
gwhaleythat kata is therefore consuming more RAM (as it has more caches), so the caches are not being as effective individually.14:28
brtknrgwhaley: that's an interesting thought14:28
gwhaleyyou may be able to get clues if that is the case by changing the size of either/both the size of the test file, and the amount of RAM each kata instance has14:29
brtknrhow would you explain runc?14:30
brtknris that also being cached on the host since they share the same kernel?14:30
gwhaleybrtknr: using the same kernel, but runc will be using the host cache - so only one cache for all instances. So, one big cache. For kata, as it can have caches inside the VMs (guests), they may be duplicating data that is also held in the host cache, and thus effectively reducing the host cache size (consuming memory that may have been used by the host for caching). It is just a thought. It might not actually be the bottleneck or reason :-)14:32
brtknrgwhaley: interesting thought!14:35
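
One way to probe the double-caching theory above (the path, sizes and values are illustrative assumptions, not measured settings) is to vary the guest RAM and the fio working set independently and see whether the scaling gap moves:

    # adjust the RAM handed to each Kata VM, e.g. in
    # /usr/share/defaults/kata-containers/configuration.toml:
    #   default_memory = 2048    # MiB per guest

    # then make the working set larger than the guest page cache, so
    # guest-side caching can no longer mask the host transport cost
    fio --name=seqread --rw=read --bs=64k --size=8g --numjobs=1 --direct=0
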
*** igordc has joined #kata-dev14:42
*** pcaruana has quit IRC15:06
*** tmhoang has quit IRC15:13
*** pcaruana has joined #kata-dev15:46
*** altlogbot_0 has quit IRC16:09
*** altlogbot_1 has joined #kata-dev16:11
*** altlogbot_1 has quit IRC16:43
*** altlogbot_3 has joined #kata-dev16:43
*** igordc has quit IRC16:45
*** altlogbot_3 has quit IRC16:53
*** altlogbot_2 has joined #kata-dev16:53
*** devimc has quit IRC16:57
kata-irc-bot2<eric.ernst> Heads up - sweet post from @salvador.fuentes on our CI/CD, and how we leverage service offerings donated by various cloud providers, live on the blog now! https://medium.com/kata-containers/kata-containers-testing-and-packaging-powered-by-the-cloud-b752de2ee47116:57
*** devimc has joined #kata-dev16:57
brtknrhttps://raw.githubusercontent.com/brtknr/kata-vs-bare-vs-runc/master/scenario-cumulative-aggregate-cf-1-clients.png17:06
brtknrhttps://raw.githubusercontent.com/brtknr/kata-vs-bare-vs-runc/master/scenario-cumulative-aggregate-cf-64-clients.png17:06
brtknrHere's some funky graphs which compare kata-vs-bare-vs-runc all side by side17:07
*** igordc has joined #kata-dev17:16
*** jodh has quit IRC17:23
*** gwhaley has quit IRC17:23
*** igordc has quit IRC20:22
*** igordc has joined #kata-dev20:23
*** igordc has quit IRC20:23
*** pcaruana has quit IRC20:39
*** devimc has quit IRC21:09
*** igordc has joined #kata-dev21:53
*** fuentess has quit IRC21:59
*** davidgiluk has joined #kata-dev22:36
*** davidgiluk has quit IRC22:50

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!