Friday, 2019-04-12

kata-irc-bot<archana.m.shinde> @sebastien.boeuf I am debugging the netmon failures for tc now, that I talked about yesterday00:06
kata-irc-bot<archana.m.shinde> question about netmon, before I go dig deeper00:06
*** igordc has quit IRC00:07
kata-irc-bot<archana.m.shinde> it subscribes to all netlink events?00:07
kata-irc-bot<archana.m.shinde> I just started the debug logs for netmon, and for some reason I don't see the debug netlink events when a network is disconnected with docker network disconnect00:08
*** auk has joined #kata-dev00:20
*** auk has quit IRC00:31
*** auk has joined #kata-dev01:05
*** EricRen has joined #kata-dev01:35
*** irclogbot_0 has quit IRC03:01
*** irclogbot_1 has joined #kata-dev03:01
*** changcheng has quit IRC03:36
*** changcheng has joined #kata-dev03:38
*** sameo has joined #kata-dev05:08
*** lpetrut has joined #kata-dev06:02
*** sgarzare has joined #kata-dev06:26
*** jodh has joined #kata-dev07:10
*** davidgiluk has joined #kata-dev07:59
*** gwhaley has joined #kata-dev08:06
gwhaleyhi brtknr: yeah, we know that 'grpc server' error quite well, but sadly, it is a bit of a generic error indicating there was trouble connecting to the agent inside the container. So, it doesn't give us the immediate clue to the solution :-(  I don't suppose you got further overnight did you? :-)08:18
*** sameo_ has joined #kata-dev08:43
*** sameo has quit IRC08:44
*** tmhoang has joined #kata-dev09:28
*** auk has quit IRC09:45
brtknrgwhaley: so the minikube environment on packet seems to work without any issues... I am seeing this problem on another cloud where I was trying to replicate the workshop...10:15
brtknrgwhaley: Perhaps I did something wrong in my configuration...10:15
gwhaleyhi brtknr: as long as you installed the minikube using the command line as detailed in the gist and on the packet motd file, and you checked that you have kvm enabled and nested vm enabled, then I would think it would work.... so...10:18
brtknr:q10:18
gwhaleyif you have done all those (I suspect you have), and you are willing, if you could let us know what sort of setup (or cloud) it is, we could maybe investigate10:18
brtknrDo you still need to enable RuntimeClass feature gate?10:20
brtknrI didnt think I did in 1.1410:20
brtknrk8s that is10:20
*** tmhoang has quit IRC10:25
gwhaleybrtknr: in 1.14, I don't think you need to enable. but, I've not tried myself10:26
gwhaleybe interested in what the other cloud is or the physical node - most other clouds are not bare metal, so are already a nested VM - not all support further nesting - so, we may need to check that for sure10:27
brtknrThe other cloud is Sausage cloud :) it has nested virt enabled10:30
gwhaleymust be Friday! - heh, never heard of that - off I go to look!10:30
gwhaleybrtknr: do you have a link?10:31
* davidgiluk notes the google for that doesn't look promising10:33
brtknrcompute.sausage.cloud... you need to ask Nick Jones (yankcrime) to create an account for you10:34
brtknrYou're presenting at the inaugural Manchester cloud native meetup that he organises, aren't you?10:35
gwhaleybrtknr: indeed I am, and we chatted some at the OpenInfraDays... I'll go peek, and then maybe ask him :-)10:39
brtknrI can add your ssh key to the machine where I am messing around if you wanna have a go?10:39
gwhaleyheh, that homepage for sausage is nicely anonymous...10:39
gwhaleylet me have a chat with yankcrime and see what I can find out - thx!10:40
davidgilukbrtknr: Take care though, things like migration of L1 doesn't work if nesting is enabled (well, in some cases; in other cases, if it's being used)10:41
brtknrkgz: hello there10:41
kgzsup10:42
*** yankcrime has joined #kata-dev10:42
* brtknr waves to yankcrime10:42
yankcrimeyo brtknr10:42
brtknr>  things like migration of L1 doesn't work if nesting is enabled10:44
gwhaleyI'm thinking maybe the hardware is too old for the instruction set used to build the rootfs - the clearlinux rootfs has some not-too-old hardware requirements iirc...10:52
*** EricRen has quit IRC10:52
gwhaleysooo, brtknr yankcrime - there is a ref in the code that we should have 'at least a Westmere', which is a couple of years more recent than the hw I think you have: https://github.com/kata-containers/runtime/blob/master/cli/kata-check_amd64.go#L6410:55
yankcrimeyeah looks like that feature was introduced in the westmere architecture10:57
brtknrgwhaley: That seems like a reasonable guess! I'll try to find peace in this newly acquired knowledge then :)10:58
* yankcrime wonders what would happen if you disabled that check11:00
* brtknr wonders if /usr/bin/kata-qemu kata-check should pick up on this and report it as a blocker11:04
davidgilukare you really running on something older than a westmere? That's pretty old11:05
gwhaleybrtknr: yankcrime - yes, I think if that is the issue then a nicer error message would be great ;-) jodh, do you remember the (long and tortured) history here?11:05
gwhaleyI don't remember if it won't work, or there are features that won't work, or we just never ever test it, so give you no guarantees ;-)11:06
gwhaleyI still have a feeling it might come down to some instruction set minimum we set for the clear linux rootfs build as well11:06
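Whether a host CPU advertises a given feature can be checked by hand. The following is a minimal shell sketch, assuming a Linux x86 host whose /proc/cpuinfo has a "flags" line. The helper name has_cpu_flag is made up for illustration, and vmx and sse4_1 are example flag names only, not a statement of what kata-check requires (the linked source is the reference for that).

```shell
#!/bin/sh
# has_cpu_flag FLAG [FLAGS_LINE]
# Succeeds if FLAG appears as a whole word in FLAGS_LINE.
# When FLAGS_LINE is omitted, the first "flags" line of /proc/cpuinfo is used.
has_cpu_flag() {
    flag=$1
    if [ "$#" -ge 2 ]; then
        flags=$2
    else
        flags=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | cut -d: -f2)
    fi
    # Pad with spaces so that "sse4" does not match "sse4_1".
    case " $flags " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example flag names only (assumption); not the list enforced by kata-check.
for f in vmx sse4_1; do
    if has_cpu_flag "$f"; then
        echo "$f: present"
    else
        echo "$f: missing"
    fi
done
```

Passing the flags line as a second argument makes it possible to test the helper against a string copied from another machine.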
yankcrimedavidgiluk: it's running on old datacentred kit - hardware that was given to us by another isp that was throwing it away, and this was 5 years ago!11:08
yankcrimeso yeah, it's relatively ancient11:08
yankcrimebut i hate to see working hardware go to waste, so....11:08
davidgiluknod11:08
*** tmhoang has joined #kata-dev11:40
*** devimc has joined #kata-dev12:00
*** dhellmann has left #kata-dev12:10
*** altlogbot_1 has joined #kata-dev13:03
*** sgarzare has quit IRC13:09
*** altlogbot_1 has quit IRC13:32
*** altlogbot_3 has joined #kata-dev13:32
*** altlogbot_3 has quit IRC13:38
*** altlogbot_3 has joined #kata-dev13:38
*** altlogbot_3 has quit IRC14:00
*** lpetrut has quit IRC14:11
*** altlogbot_2 has joined #kata-dev14:25
*** altlogbot_2 has quit IRC14:29
*** altlogbot_3 has joined #kata-dev14:29
*** altlogbot_3 has quit IRC14:33
*** altlogbot_2 has joined #kata-dev14:33
*** altlogbot_2 has quit IRC14:33
brtknrgwhaley: 117MB/s inside kata container, 1.5GB/s inside a runc container is the preliminary result... I'm going to try doing the volumeBlock approach that someone recommended earlier and see what happens14:44
*** sgarzare has joined #kata-dev14:44
brtknrthis is using the volumeMount approach14:44
brtknrwhich you already warned me about :)14:45
brtknrI'm running this command: dd if=/dev/zero of=block bs=1G count=114:47
*** altlogbot_2 has joined #kata-dev14:48
davidgilukbrtknr: That doesn't do a sync does it? So it's still writing once the dd completes?14:48
gwhaleywriting ... somewhere... oh, sync and VMs - what an interesting little area that is eh davidgiluk :-)14:50
davidgilukbrtknr: I think you typically add a oflag=dsync or something?14:51
*** altlogbot_2 has quit IRC14:51
*** altlogbot_0 has joined #kata-dev14:52
brtknrdavidgiluk: its about the same: 1073741824 bytes (1.1 GB) copied, 8.91691 s, 120 MB/s14:55
davidgilukok14:55
brtknrwith oflag=dsync14:55
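The sync question can be made concrete by timing the same write three ways. This is a rough sketch, assuming GNU dd (conv=fdatasync and oflag=dsync are GNU operands) and a scratch file from mktemp. The run_dd helper and the SIZE_MB variable are illustrative names, and the size is kept far below the 1 GiB used in the chat so that it finishes quickly.

```shell
#!/bin/sh
# Compare three ways of writing the same data with GNU dd.
SIZE_MB=${SIZE_MB:-8}      # small on purpose; the chat used bs=1G count=1
TARGET=$(mktemp)           # scratch file, removed at the end

# run_dd LABEL [extra dd operands...]
run_dd() {
    label=$1; shift
    # dd prints its summary on stderr; keep only the last line (bytes, time, rate).
    summary=$(dd if=/dev/zero of="$TARGET" bs=1M count="$SIZE_MB" "$@" 2>&1 | tail -n 1)
    echo "$label: $summary"
}

run_dd "buffered"                       # may return while data is still in the page cache
run_dd "conv=fdatasync" conv=fdatasync  # one fdatasync at the end, counted in the timing
run_dd "oflag=dsync" oflag=dsync        # synchronised I/O for every block written

rm -f "$TARGET"
```

If the buffered figure is much higher than the other two, the difference is mostly page cache rather than storage speed.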
*** altlogbot_0 has quit IRC14:55
*** altlogbot_1 has joined #kata-dev14:56
*** altlogbot_1 has quit IRC14:56
*** altlogbot_2 has joined #kata-dev14:58
*** igordc has joined #kata-dev15:04
brtknrdavidgiluk: gwhaley: using the volumeBlock approach: 492765184 bytes (493 MB, 470 MiB) copied, 1.32476 s, 372 MB/s15:24
brtknrBut still nowhere near the raw performance15:24
brtknrAnything else I should try?15:25
brtknr@archana.m.shinde^^15:25
kata-irc-bot<gmmaharaj> brtknr: is there a place where you have documented your setup? it would be good to mimic it locally to see it.15:26
*** devimc has quit IRC15:26
kata-irc-bot<gmmaharaj> if we are using a block based volume, the performance should be good. Won't match what you see in raw, but pretty close.15:26
brtknrI'm running the setup documented here: https://gist.github.com/brtknr/06521748bca81b399152a42bf7cb653815:28
brtknrInitially, I just did a regular hostPath mount15:28
kata-irc-botAction: gmmaharaj goes to see15:28
davidgilukbrtknr: What does the qemu command line part of that look like?15:29
gwhaleybrtknr: you are also using dd with bs=1G still, yes? we should consider if that is anything like a real world case. maybe look at that fio test code I pointed at, that does multiple different block size transfers iirc :-)15:30
* gwhaley suspects davidgiluk has his own preference for what io tests to run, probably also fio based maybe?15:30
davidgilukgwhaley: I'm not really a block wrangler, more of stefanha's department15:30
brtknrtbh, i get the same performance outside of kata container... so this seems like a limitation of using loop block device in general15:37
brtknrgwhaley: I will eventually run fio tests... I am just trying to get a feel for what is achievable.. dd is a pretty good proxy imho15:37
gwhaleybrtknr: ah, yes, loopback will hurt perf - but, it is the easiest way to get a block device iirc. I had to repart/install a machine to get a real devicemapper block device at one point :-(15:37
gwhaleysure, dd will get you a feel, sure15:38
gwhaleygmmaha ^^ fyi, loopback.15:38
* brtknr wonders how to turn a network mounted disk into a block device15:39
gwhaleyheh, when it is not gluster or ceph? :-)15:39
brtknrgwhaley: we're using beegfs15:39
gwhaleyin theory you can mount those net block storage devices inside the container itself.... but, I've not tried that for some time. it might need some kernel module enabling in the vm kernel depending on the fs in use15:40
kata-irc-bot<gmmaharaj> brtknr: i think the block size matters when it comes to io with dd too.15:40
kata-irc-bot<gmmaharaj> ```
ganeshma@ganeshma-lab1:~$ dd if=/dev/zero of=block bs=512M count=1
1+0 records in
1+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.753192 s, 713 MB/s
ganeshma@ganeshma-lab1:~$ dd if=/dev/zero of=block_small bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 0.491864 s, 1.1 GB/s
ganeshma@ganeshma-lab1:~$ dd if=/dev/zero of=block_small bs=4K count=131072
131072+0 records in
131072+0 records out
536870912 bytes (537 MB, 512 MiB) copied, 1.65169 s, 325 MB/s
```15:40
kata-irc-bot<gmmaharaj> i wonder what your throughput would be if you use small block size within kata? any chance you can run that test one more time but drop the block size?15:40
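A block-size comparison like the one requested here can be scripted so that every run writes the same total. The sketch below assumes GNU dd and a scratch file from mktemp; count_for and sweep are illustrative helper names. The total defaults to 16 MiB rather than the 512 MiB used in the chat, and conv=fdatasync is an addition of this sketch (it was not part of the commands in the chat) so that the flush is included in the timing.

```shell
#!/bin/sh
# Write the same total number of bytes with several block sizes.
TOTAL_BYTES=${TOTAL_BYTES:-$((16 * 1024 * 1024))}   # 16 MiB keeps the sketch fast

# count_for TOTAL BS -> number of BS-sized blocks needed to write TOTAL bytes
count_for() {
    echo $(( $1 / $2 ))
}

sweep() {
    target=$(mktemp)
    for bs in 4096 1048576 16777216; do          # 4 KiB, 1 MiB, 16 MiB
        count=$(count_for "$TOTAL_BYTES" "$bs")
        # Keep only dd's final summary line (bytes, time, rate).
        summary=$(dd if=/dev/zero of="$target" bs="$bs" count="$count" conv=fdatasync 2>&1 | tail -n 1)
        echo "bs=$bs count=$count: $summary"
    done
    rm -f "$target"
}

sweep
```

To test a mounted volume instead of the temporary directory, the target path would need to point inside that mount.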
brtknrgmmaharaj: using volumeBlock or volumeMount?15:41
kata-irc-bot<gmmaharaj> brtknr: volumeBlock please if you could.15:41
kata-irc-bot<gmmaharaj> what is your current setup? volumeMount? if yes, then why not both?15:42
gwhaley<cough> https://github.com/kata-containers/tests/blob/master/metrics/storage/fio.sh#L57-L60 ;-)15:42
gmmaha:D15:42
brtknrroot@my-pod:/mnt# dd if=/dev/zero of=block count=1000 oflag=dsync15:42
brtknr1000+0 records in15:42
brtknr1000+0 records out15:42
brtknr512000 bytes (512 kB, 500 KiB) copied, 1.45657 s, 352 kB/s15:43
kata-irc-bot<gmmaharaj> i wonder what dd's default size is. 4K?15:43
kata-irc-bot<gmmaharaj> brtknr: can you specify some size for the block. bs=1M maybe?15:43
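The block size of the run without bs= can be read off its own summary: 512000 bytes in 1000 records. That works out to 512 bytes per record, which is dd's documented default when bs is not given, so with oflag=dsync each 512-byte write was synchronised individually. A one-line check of the arithmetic (bytes_per_record is an illustrative helper name):

```shell
#!/bin/sh
# bytes_per_record BYTES RECORDS -> block size implied by a dd summary
bytes_per_record() {
    echo $(( $1 / $2 ))
}

# Figures from the dd run quoted in the chat: 512000 bytes, 1000 records.
bytes_per_record 512000 1000   # prints 512
```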
brtknrwith 1M, its 198M/s15:44
brtknrwith 1M, its 198MB/s15:44
kata-irc-bot<gmmaharaj> and this is VolumeMount?15:45
brtknrgmmaha: like I said, I get the same perf outside of kata too... if I mount the block device directly on the host15:45
brtknrgmmaha: no, volumeBlock15:45
kata-irc-bot<gmmaharaj> aaah ok15:45
kata-irc-bot<gmmaharaj> so it15:46
kata-irc-bot<gmmaharaj> let me see if i can reproduce this locally with a block on a true volume device instead of loopback15:47
brtknrgmmaha: I get the same perf with volumeMount with 1M blocksize15:47
kata-irc-bot<gmmaharaj> i have tested this using https://github.com/ganeshmaharaj/lvm-snapshotter on a physical device and the performance was comparable to runc.15:48
brtknrbut inside a runc container, it goes up to 1.5GB/s15:48
kata-irc-bot<gmmaharaj> i can collect the numbers again with the new version to see what i get15:48
brtknrgwhaley: I remember you mentioning that things should get better with vsock?15:49
gwhaleybrtknr: vsock in two ways - only one I think storage related....15:50
gwhaleyif vsock is available in the kernel, then we can use it to talk to the container, and drop the proxy process - so a small win on size and complexity for kata....15:50
brtknrgwhaley: is there already a way to enable it?15:51
gwhaleyand then there is the new in-test virtio-fs - which I'm not sure is actually bound to vsock. virtio-fs gives better than 9p storage performance. maybe not as good as block though, but you cannot use block in all situations I think15:51
gwhaleythere is a guide to enabling virtio-fs - I think gmmaha has done it recently. not quite trivial right now though I think - we are working on getting all the bits into kata so it is easier/available before all the bits land in the upstream (kernel, qemu etc.)15:51
gmmahaquite a few moving parts that davidgiluk stefanha have been working on and i have been tracking to make sure we get all the bits landed in kata for it to work out of the box. as gwhaley mentioned, not trivial right now.15:53
brtknrgwhaley: should i go down that path if its already known that its no better than using block?15:53
*** devimc has joined #kata-dev15:53
gwhaleybrtknr - I guess that depends on what your goal is? Right now, it seems you have enough to go at with loop block, so you could run some initial stuff and get a feel for it.15:54
brtknrgwhaley: it seems like I will be testing the limits of using loop block device rather than kata15:55
brtknrI need to see if there is another way to do this15:56
gwhaleyI'll leave gmmaha to discuss that - he knows a lot more about storage stuff than I do :-)15:56
gmmahabrtknr: if you have a spare drive attached to your machine, i can provide you with steps on where you can setup a devicemapper device and use that as a backend snapshot holder using containerd15:56
brtknrgmmaha: that would be great15:57
gmmahadoing it in the kube context might be a bit hard, but you can still do it via the command line if performance is what you are after15:57
kata-irc-bot<archana.m.shinde> catching up..15:57
gmmahacool. will get that your way later today15:57
brtknri've gone down the route of using cri-o, will it work with that too?15:57
kata-irc-bot<archana.m.shinde> brtknr: if you have a spare device, you can pass it using the gist for raw block devices in k8s15:57
kata-irc-bot<archana.m.shinde> or use devicemapper backed by a real device as @gmmaharaj mentioned15:58
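What the chat calls volumeBlock and volumeMount presumably corresponds to the volumeDevices and volumeMounts fields of a Kubernetes pod spec. The sketch below writes an illustrative manifest that consumes a claim as a raw block device. It is not a copy of the gist mentioned above, and every name in it (kata-block-test, block-pvc, /dev/xvda, the runtime class kata, the helper write_block_pod_manifest) is hypothetical.

```shell
#!/bin/sh
# Write an illustrative pod manifest that consumes a PVC as a raw block device.
# All names below are placeholders chosen for this sketch.
write_block_pod_manifest() {
    cat > "$1" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: kata-block-test
spec:
  runtimeClassName: kata
  containers:
    - name: bench
      image: busybox
      command: ["sleep", "3600"]
      volumeDevices:            # raw block device, no filesystem mount
        - name: data
          devicePath: /dev/xvda
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-pvc    # the claim must use volumeMode: Block
EOF
}

manifest=$(mktemp)
write_block_pod_manifest "$manifest"
echo "wrote $manifest"
```

The file can then be passed to kubectl apply -f once a matching claim exists. With volumeDevices the container sees an unformatted device at devicePath, so a benchmark would write to that device (or to a filesystem created on it) rather than to a mounted directory.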
gmmahabrtknr: it seems cri-o also has a devicemapper device. you can definitely work with that15:59
gmmahai haven't tested it though15:59
brtknrlet me explain my use case, I want to mount a parallel file system to kata containers and see if they can read/write lots of things in parallel when there's lots of them15:59
brtknrusing a raw block device limits me to using one host correct?16:00
brtknras I need to specify node affinity16:00
gmmahabrtknr: yes. what is the parallel file system you have in mind?16:01
kata-irc-bot<archana.m.shinde> yes a raw block device will limit you to one16:01
brtknrwith runc, I can mount the parallel file system using hostPath and can get close to raw performance16:01
brtknrgmmaha: We're using BeeGFS16:02
gmmahabrtknr: aah.. i have never worked on that system, but i have worked on ceph a little and know something about it.16:03
gmmahabeegFS lets you do a fuse mount of the block backend?16:03
gmmahaon the host?16:03
brtknrgmmaha: We also have Ceph available for testing but havent tried it yet16:03
brtknrgmmaha: with BeeGFS, it's a kernel mount on the host16:04
brtknrhowever the server runs on user space... dont ask me why :P16:04
gmmahabrtknr: lol.. i won't.16:05
gmmaha@amshinde can correct me. but if it is a kernel mount and it can be recognized as a mount point, kata runtime should automatically pass it as a block volume onto the guest16:05
gmmahawhich should bypass 9p and get you out of those woes.16:06
gmmahaamshinde: ^^16:06
kata-irc-bot<archana.m.shinde> if it's mounted on the host side, then the way it would be passed is through 9p16:08
kata-irc-bot<archana.m.shinde> brtknr: do you have your pod yaml file that you have setup with runc?16:09
kata-irc-bot<archana.m.shinde> wanted to take a look at your exact setup you have now16:09
kata-irc-bot<gmmaharaj> https://gist.github.com/brtknr/06521748bca81b399152a42bf7cb653816:09
kata-irc-bot<gmmaharaj> that's what i got from him @archana.m.shinde16:09
kata-irc-bot<archana.m.shinde> @gmmaharaj doesnt help, thats the gist I provided him for `raw block support` in k8s :slightly_smiling_face:16:11
kata-irc-bot<gmmaharaj> duh! you wanted the final setup. sorry16:12
brtknrhttps://github.com/brtknr/packaging/commit/db356ec01d236021a691d567db0dfe99a4d64d4916:14
brtknrits just a simple hostPath volume mount16:14
brtknrits time for pub here, thanks for the discussion everyone, feels like 2 step forward 1 step but hey, its progress :)16:20
brtknrs/1 step/1 step back16:21
gwhaleyah, Friday night... have a good weekend brtknr!16:21
brtknryou too :)16:21
*** jodh has quit IRC16:24
brtknrLast thought: wondering if I run 10x as many kata containers, whether I’ll be able to saturate the iops16:39
davidgilukbrtknr: It should do, they're all pretty independent - but then it depends how your network filesystem attaches16:55
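Whether many independent writers together reach a storage limit can be explored roughly before involving any containers. The sketch below is only a stand-in: it starts plain background dd processes on one host, assuming GNU dd and a scratch directory from mktemp, and says nothing about Kata itself. run_writers is an illustrative helper name.

```shell
#!/bin/sh
# run_writers N DIR SIZE_MB: start N dd writers in the background, wait for all.
run_writers() {
    n=$1; dir=$2; size_mb=$3
    i=1
    while [ "$i" -le "$n" ]; do
        # Each writer gets its own file; fdatasync makes it flush before exiting.
        dd if=/dev/zero of="$dir/writer-$i" bs=1M count="$size_mb" conv=fdatasync 2>/dev/null &
        i=$((i + 1))
    done
    wait    # returns once every background writer has exited
}

workdir=$(mktemp -d)
run_writers 4 "$workdir" 2
ls "$workdir"
rm -rf "$workdir"
```

Timing the run_writers call for increasing N, with the directory placed on the filesystem under test, would show where the aggregate rate stops growing.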
*** gwhaley has quit IRC16:57
*** sgarzare has quit IRC17:34
*** irclogbot_1 has quit IRC18:08
*** irclogbot_1 has joined #kata-dev18:10
*** davidgiluk has quit IRC19:14
*** eernst has joined #kata-dev19:21
*** eernst has quit IRC19:23
*** eernst has joined #kata-dev19:27
*** eernst has quit IRC19:32
*** eernst has joined #kata-dev19:34
*** eernst has quit IRC19:38
*** eernst has joined #kata-dev19:40
*** eernst has quit IRC19:45
*** eernst has joined #kata-dev19:46
*** eernst has quit IRC19:50
*** eernst has joined #kata-dev19:53
*** eernst has quit IRC19:57
*** eernst has joined #kata-dev19:59
*** eernst has quit IRC20:02
*** eernst has joined #kata-dev20:05
*** eernst has quit IRC20:09
*** eernst has joined #kata-dev20:16
*** eernst has quit IRC20:22
*** eernst has joined #kata-dev20:23
*** eernst has quit IRC20:25
*** devimc has quit IRC20:51
*** igordc has quit IRC23:38