Friday, 2019-05-17

kata-irc-bot2<gmmaharaj> i tried all my ways of testing nemu and now networking doesn't work at all. the only thing i am missing is building qemu locally, install and then try to run nemu.00:20
kata-irc-bot2<gmmaharaj> @eric.ernst once your static binary is in place, i would love to give it a go00:20
kata-irc-bot2<gmmaharaj> i swear all this worked before i pushed the patch.. so i am sort of baffled on why this isn't working now00:20
kata-irc-bot2<eric.ernst> It is already there.00:24
kata-irc-bot2<eric.ernst> Latest image includes nemu.00:24
kata-irc-bot2<eric.ernst> katadocker/kata-deploy:1.7.0 or latest00:24
*** gmmaha has quit IRC01:49
*** gmmaha has joined #kata-dev01:50
*** tmhoang has quit IRC04:53
*** pcaruana has joined #kata-dev05:20
*** lpetrut has joined #kata-dev06:05
*** tmhoang has joined #kata-dev06:32
*** jodh has joined #kata-dev07:26
*** davidgiluk has joined #kata-dev08:05
*** gwhaley has joined #kata-dev08:05
*** lpetrut has quit IRC08:09
*** lpetrut has joined #kata-dev08:29
*** sgarzare has quit IRC09:30
*** sgarzare has joined #kata-dev09:36
*** gwhaley has quit IRC11:05
*** devimc has joined #kata-dev11:52
*** devimc has quit IRC12:16
*** devimc has joined #kata-dev12:20
*** gwhaley has joined #kata-dev12:21
*** fuentess has joined #kata-dev12:33
*** dklyle has joined #kata-dev12:50
*** dims has joined #kata-dev12:52
kata-irc-bot2<graham.whaley> hey kata folks - any idea if there is a cap/limit of 100 containers under a k8s Deployment? I am trying to automate a large scaling k8s kata test, and it stops deploying new containers when it gets to 100 ... and it was meant to get up to 1k on this run. Hmm, I'll go surf.... could be some built in k8s default failsafe limit I guess. /cc @gmmaharaj @krsna172913:39
kata-irc-bot2<eric.ernst> Yes.13:45
kata-irc-bot2<graham.whaley> hah, found it - `kubelet` has a max-pods, default of 11013:45
kata-irc-bot2<graham.whaley> you can set it on the kubelet cmdline...13:45
kata-irc-bot2<graham.whaley> sort of crushes my 1k container test right now ;)13:46
kata-irc-bot2<graham.whaley> I guess, for k8s, I should also rephrase that as s/1k containers/1k pods/ :slightly_smiling_face:13:49
kata-irc-bot2<eric.ernst> You can change that value afaiu?13:58
*** lpetrut has quit IRC14:06
kata-irc-bot2<graham.whaley> I'll see if you can change it anywhere but the `kubelet` cmdline - that'd be nice to not have to edit/restart the node14:07
kata-irc-bot2<graham.whaley> @eric.ernst - ah, maybe if you grab the kubelet config, you can indeed edit/deploy a new kubelet configmap. Heh, 'when k8s goes mad', coz to grab the current config, you 'merely'...14:11
kata-irc-bot2<graham.whaley> ```  NODE_NAME="the-name-of-the-node-you-are-reconfiguring"; curl -sSL "http://localhost:8001/api/v1/nodes/${NODE_NAME}/proxy/configz" | jq '.kubeletconfig|.kind="KubeletConfiguration"|.apiVersion="kubelet.config.k8s.io/v1beta1"' > kubelet_configz_${NODE_NAME}```14:11
*** eernst has joined #kata-dev14:22
*** tmhoang has quit IRC14:27
kata-irc-bot2<krsna1729> You can set it via kubeadm yaml before cluster creation or var lib kubelet config yaml per node15:41
kata-irc-bot2<graham.whaley> sai: yep, thx. finally worked that out. the cluster I have is a clr-k8s, so, I may just refresh the (1 node) cluster and try again, thx!15:55
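
The per-node route mentioned above can be sketched in shell. This is a minimal sketch, not a tested procedure: `set_max_pods` is a hypothetical helper, `maxPods` is the KubeletConfiguration field that corresponds to the `--max-pods` flag, and the `/var/lib/kubelet/config.yaml` path in the usage comment assumes a kubeadm-style node.

```shell
# Hypothetical helper: set maxPods in a kubelet config file, adding the
# key if it is missing. Only handles a top-level "maxPods:" line.
set_max_pods() {
    local file="$1" value="$2"
    if grep -q '^maxPods:' "$file"; then
        # Rewrite via a temp file, then copy back to keep the file's mode.
        sed "s/^maxPods:.*/maxPods: ${value}/" "$file" > "${file}.tmp" &&
            cat "${file}.tmp" > "$file" &&
            rm -f "${file}.tmp"
    else
        echo "maxPods: ${value}" >> "$file"
    fi
}

# Assumed usage on a kubeadm-style node (run as root, then restart kubelet):
#   set_max_pods /var/lib/kubelet/config.yaml 1100
#   systemctl restart kubelet
```

The kubelet still has to be restarted for the new value to take effect, so this does not avoid touching the node.
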
*** sgarzare has quit IRC16:01
kata-irc-bot2<mike> 110 per node is the default max16:30
kata-irc-bot2<graham.whaley> hi @mike - yeah, I found that out whilst trying to run a '1000 container' scaling test ;) Heh. The cluster I am on is set up by a kubeadm init, so I can tweak it in the config yaml there. I will add a check to my test to try and extract the max-pods and check against the test num_pods request, and bail out if I find I cannot run the test on the cluster provided - better than hanging up for some time and then dying when I do bump into the limit16:32
kata-irc-bot2<mike> not sure how you’re doing your tests but most of us modifying that parameter do it in our api-servers yaml (so effectively command line but a bit easier to change up)16:34
kata-irc-bot2<graham.whaley> right. the test itself will not make the modification - you will just say 'test up to n pods'. It will expect the cluster to be set up appropriately (that is also deliberate, as the test can then be run across different cluster configs, like different versions of kata in my case, to compare the results across runs). But, making it do a first pass sanity check to detect early failure cases is just polite :slightly_smiling_face:16:35
kata-irc-bot2<mike> oh yeah definitely, good idea16:36
kata-irc-bot2<mike> i might have a quick jq command you can run to check somewhere in my snippets16:37
kata-irc-bot2<graham.whaley> I think I can get it from the 'describe node -o json', and parse, yeah - unless somebody has overridden the config path I think. But, I'll skip that case.16:37
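
The sanity check discussed above could look roughly like this. It is a sketch under assumptions, not the test that was eventually written: `sum_pod_slots` and `check_pod_capacity` are hypothetical helper names, and the commented usage assumes `kubectl` access and reads the `.status.allocatable.pods` field of each node. It does not subtract pods that are already running, so it is only an upper bound.

```shell
# Sum pod slots across nodes; input is one number per line on stdin.
sum_pod_slots() {
    local total=0 n
    while read -r n; do
        [ -n "$n" ] && total=$((total + n))
    done
    echo "$total"
}

# Return 0 when the requested pod count fits, 1 (with a message) otherwise.
check_pod_capacity() {
    local requested="$1" available="$2"
    if [ "$requested" -gt "$available" ]; then
        echo "need ${requested} pods, cluster allows ${available}" >&2
        return 1
    fi
}

# Assumed usage against a live cluster (NUM_PODS is the test's request):
#   slots=$(kubectl get nodes \
#       -o jsonpath='{range .items[*]}{.status.allocatable.pods}{"\n"}{end}' |
#       sum_pod_slots)
#   check_pod_capacity "$NUM_PODS" "$slots" || exit 1
```

Failing before any pod is created gives a clear message instead of a run that stalls once the limit is reached.
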
*** khyr0n has joined #kata-dev16:48
kata-irc-bot2<krsna1729> @graham.whaley dont forget to give a big enough ip pool per node :P16:50
kata-irc-bot2<krsna1729> something like this https://gist.github.com/krsna1729/e30d825c090243c9cef81c4b9f83ae1616:51
kata-irc-bot2<graham.whaley> ooh, heh, yeah ;)16:51
kata-irc-bot2<archana.m.shinde> @graham.whaley Maybe a quick thing to try for your minikube setup : https://github.com/kata-containers/documentation/pull/445#issuecomment-49352453116:58
<gwhaley> thx amshinde - will look on Monday. Yeah, I would have pinged you harder soon ;-) Having kata minikube on my laptop makes writing tests (initially) easier for me, and I'm missing it ;-)16:59
<gwhaley> thx!17:00
*** sarob has joined #kata-dev17:00
kata-irc-bot2<archana.m.shinde> sure, let me know if that was the issue, we can take a look next week17:00
*** jodh has quit IRC17:00
kata-irc-bot2<archana.m.shinde> @graham.whaley17:00
*** gwhaley has quit IRC17:03
*** igordc has joined #kata-dev17:07
*** khyr0n_ has joined #kata-dev17:12
*** khyr0n_ has quit IRC17:48
*** khyr0n has quit IRC17:48
*** khyr0n has joined #kata-dev17:48
kata-irc-bot2<eric.ernst> @archana.m.shinde @sebastien.boeuf - any ideas why networking would stop working when using virtiofs?18:07
kata-irc-bot2<archana.m.shinde> dont see why18:10
kata-irc-bot2<archana.m.shinde> what are you seeing exactly..any error message?18:11
kata-irc-bot2<gmmaharaj> @eric.ernst it is not working with 9p either for me.18:12
kata-irc-bot2<gmmaharaj> but switching the machine type to `pc` it is all fine.18:12
kata-irc-bot2<eric.ernst> @archana.m.shinde trying to checkout.18:21
kata-irc-bot2<eric.ernst> see: https://hackmd.io/vvnqvmO6TwugoJO5nFwa1g18:27
kata-irc-bot2Action: gmmaharaj goes to follow.18:28
kata-irc-bot2<archana.m.shinde> so looks like the combination of virt and huge pages is failing18:34
kata-irc-bot2<eric.ernst> it boots, just doesn't have network connectivity18:35
kata-irc-bot2<archana.m.shinde> do you see the network interfaces in the container?18:35
kata-irc-bot2<gmmaharaj> @archana.m.shinde yes, i do.18:36
kata-irc-bot2<gmmaharaj> it also has an IP address assigned. but it is unable to reach the gateway.18:36
kata-irc-bot2<archana.m.shinde> hmm18:36
kata-irc-bot2<archana.m.shinde> can you ping another container18:37
kata-irc-bot2<gmmaharaj> let me launch 218:37
kata-irc-bot2<archana.m.shinde> or the docker bridge18:37
kata-irc-bot2<eric.ernst> no18:37
kata-irc-bot2<eric.ernst> cannot18:37
kata-irc-bot2<gmmaharaj> yeah. gateway is the docker bridge right?18:37
kata-irc-bot2<eric.ernst> everything looks the same in the ns18:37
kata-irc-bot2<gmmaharaj> i am not able to reach it.18:37
kata-irc-bot2<eric.ernst> and link looks fine in the guest.18:37
kata-irc-bot2<eric.ernst> *in the container in the guest18:37
kata-irc-bot2<archana.m.shinde> so it's not just dns18:38
kata-irc-bot2<archana.m.shinde> you are using tc I suppose18:39
kata-irc-bot2Action: gmmaharaj goes to check18:39
kata-irc-bot2<archana.m.shinde> lets try with macvtap as well18:39
kata-irc-bot2<gmmaharaj> yup.18:39
kata-irc-bot2Action: gmmaharaj goes to test that18:39
kata-irc-bot2<gmmaharaj> same result.18:40
kata-irc-bot2<gmmaharaj> no network there either.18:40
kata-irc-bot2<archana.m.shinde> thats quite weird18:41
kata-irc-bot2<archana.m.shinde> I dont see why hugepages should interfere with virtio-net18:42
kata-irc-bot2<archana.m.shinde> @manohar.r.castelino do you have any insight ?18:42
kata-irc-bot2<archana.m.shinde> these are virtio-net right..can you check if its virtio-net18:43
kata-irc-bot2<gmmaharaj> yup. `driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,disable-modern=true,mq=on,vectors=4,romfile=`18:44
kata-irc-bot2<archana.m.shinde> hey going for lunch ..will be back in a bit18:44
*** sarob has quit IRC18:47
<davidgiluk> eernst: What about virtiofs+no huge pages?18:49
kata-irc-bot2<gmmaharaj> you mean just use /dev/shm? we can try that18:50
kata-irc-bot2<eric.ernst> That isn't a 'configuration' yet for Kata.18:50
<davidgiluk> well it's just in that chart you have 9p+huge failing, but 9p+normal passing18:51
<davidgiluk> so does the problem follow the huge pages or the fs?18:51
kata-irc-bot2<eric.ernst> Can you test that locally?  I updated the doc -- doesn't matter the network connection in the ns on host (ie, same behavior for  tcfilter, macvtap and bridged)18:51
kata-irc-bot2<eric.ernst> Yes, I suspect huge based on the 9p behavior.18:51
<davidgiluk> it's odd for it to affect the networking18:52
kata-irc-bot2<eric.ernst> ie: https://hackmd.io/vvnqvmO6TwugoJO5nFwa1g?both18:52
kata-irc-bot2<eric.ernst> yes, i agree18:52
kata-irc-bot2<eric.ernst> very odd.18:52
kata-irc-bot2<eric.ernst> its only the case with NEMU.........18:53
kata-irc-bot2<eric.ernst> @archana.m.shinde is @robert.bradford around?18:54
kata-irc-bot2<eric.ernst> or @sebastien.boeuf?18:54
kata-irc-bot2<eric.ernst> i'm curious if this can be reproduced in their release build as well18:55
<davidgiluk> you're not doing dpdk or anything fancy like that?18:56
kata-irc-bot2<gmmaharaj> nope. In my case, I am testing all this inside a VM.18:57
kata-irc-bot2<eric.ernst> wget google.com :slightly_smiling_face:18:58
kata-irc-bot2<eric.ernst> that's all18:58
<davidgiluk> nothing's crashing, you're just not getting any packets?18:59
kata-irc-bot2<gmmaharaj> davidgiluk: @eric.ernst @archana.m.shinde using /dev/shm + virtio  + nemu + machine_type virt works.19:04
kata-irc-bot2<gmmaharaj> 9p also works19:05
kata-irc-bot2<gmmaharaj> so it seems like virt + hugepages is an issue now.19:05
<davidgiluk> gmmaha: Can you show me the commandline with hugepages and with /dev/shm?19:06
<davidgiluk> ^qemu commandline19:06
<davidgiluk> (or in your case nemu I guess, but same mostly)19:06
kata-irc-bot2<eric.ernst> i'm curious what next steps are for this then: https://github.com/kata-containers/runtime/pull/165719:14
kata-irc-bot2<eric.ernst> and i'm curious what the failure is.19:14
kata-irc-bot2<eric.ernst> i don't have the shm cmdline19:14
<davidgiluk> ok, just make sure it's passing the share=on flag to the -object memory-backend-file  otherwise things get confused as hell; but I'd expect only vhost-user things to get confused (i.e. virtio-fs not the networking)19:17
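
One way to check the `share=on` point above is to scan the hypervisor command line for `memory-backend-file` objects. The following is a minimal sketch: `list_backend_sharing` is a hypothetical helper, and it assumes the command line is passed as one space-separated string with no spaces inside individual arguments.

```shell
# Print "<id> shared" or "<id> private" for every memory-backend-file
# object found in a qemu/nemu command line given as a single string.
list_backend_sharing() {
    local cmdline="$1" arg id
    for arg in $cmdline; do
        case "$arg" in
            memory-backend-file,*)
                # Pull the value of the id= property out of the object spec.
                id=$(printf '%s\n' "$arg" | tr ',' '\n' | sed -n 's/^id=//p')
                case ",$arg," in
                    *,share=on,*) echo "$id shared" ;;
                    *) echo "$id private" ;;
                esac
                ;;
        esac
    done
}

# Example with two backends, one of them marked share=on:
#   list_backend_sharing "-object memory-backend-file,id=mem0,mem-path=/img,size=128M \
#       -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on"
```

The helper only reports what it finds; a backend that holds a read-only image rather than guest RAM may legitimately show up as private.
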
* davidgiluk parks his brain for the weekend19:21
*** davidgiluk has quit IRC19:22
kata-irc-bot2<eric.ernst> Thx for the help David19:41
*** pcaruana has quit IRC20:02
kata-irc-bot2<gmmaharaj> davidgiluk: /opt/kata/bin/nemu-system-x86_64 -name sandbox-905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80 -uuid 6b3636a7-0d0e-46ef-a359-dd456bcdfb5d -machine virt,accel=kvm,kernel_irqchip,nvdimm -cpu host,pmu=off -qmp unix:/run/vc/vm/905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80/qmp.sock,server,nowait -m 2048M,slots=10,maxmem=8999M -device pcie-pci-bridge,bus=pcie.0,id=pcie-bridge-0,addr=2,romfile=20:04
kata-irc-bot2-device virtio-serial-pci,disable-modern=true,id=serial0,romfile= -device virtconsole,chardev=charconsole0,id=console0 -chardev socket,id=charconsole0,path=/run/vc/vm/905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80/console.sock,server,nowait -device nvdimm,id=nv0,memdev=mem0 -object memory-backend-file,id=mem0,mem-path=/opt/kata/share/kata-containers/kata-containers-image_clearlinux_1.7.0-rc1_agent_f983b3665f.img,size=13421772820:04
kata-irc-bot2-device virtio-scsi-pci,id=scsi0,disable-modern=true,romfile= -object rng-random,id=rng0,filename=/dev/urandom -device virtio-rng,rng=rng0,romfile= -device virtserialport,chardev=charch0,id=channel0,name=agent.channel.0 -chardev socket,id=charch0,path=/run/vc/vm/905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80/kata.sock,server,nowait -chardev20:04
kata-irc-bot2socket,id=char-59485d3086ebad82,path=/run/vc/vm/905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80/vhost-fs.sock -device vhost-user-fs-pci,chardev=char-59485d3086ebad82,tag=kataShared,cache-size=1024M -netdev tap,id=network-0,vhost=on,vhostfds=3,fds=4 -device driver=virtio-net-pci,netdev=network-0,mac=02:42:ac:11:00:02,disable-modern=true,mq=on,vectors=4,romfile= -global kvm-pit.lost_tick_policy=discard -vga none -no-user-config20:04
kata-irc-bot2-nodefaults -nographic -daemonize -object memory-backend-file,id=dimm1,size=2048M,mem-path=/dev/shm,share=on -numa node,memdev=dimm1 -kernel /opt/kata/share/kata-containers/vmlinuz-4.19.28-39 -append tsc=reliable no_timer_check rcupdate.rcu_expedited=1 i8042.direct=1 i8042.dumbkbd=1 i8042.nopnp=1 i8042.noaux=1 noreplace-smp reboot=k console=hvc0 console=hvc1 iommu=off cryptomgr.notests net.ifnames=0 pci=lastbus=0 root=/dev/pmem0p120:04
kata-irc-bot2rootflags=dax,data=ordered,errors=remount-ro ro rootfstype=ext4 quiet systemd.show_status=false panic=1 nr_cpus=8 agent.use_vsock=false init=/usr/lib/systemd/systemd systemd.unit=kata-containers.target systemd.mask=systemd-networkd.service systemd.mask=systemd-networkd.socket -bios /opt/kata/share/kata-nemu/OVMF.fd -pidfile /run/vc/vm/905681fd1ee2cc55fae240fb33729140a80ad07daa6b19f22b646af0a530dd80/pid -smp 1,cores=1,threads=1,sockets=8,maxcpus=820:04
kata-irc-bot2<gmmaharaj> sorry was out for  lunch. Just got back on.20:04
kata-irc-bot2<gmmaharaj> @eric.ernst that is the patch that i pulled into the tree to test it.20:05
kata-irc-bot2<gmmaharaj> next steps, talk to @manohar.r.castelino in 30 mins to understand how it should go and flesh it out and land it asap20:05
kata-irc-bot2<archana.m.shinde> @gmmaharaj I talked to @sebastien.boeuf about the issue20:20
kata-irc-bot2<archana.m.shinde> lets see if they can figure out whats going on with nemu20:20
kata-irc-bot2<gmmaharaj> davidgiluk: http://paste.openstack.org/show/751552/ has both `/dev/shm` and hugepages.20:20
kata-irc-bot2<gmmaharaj> @archana.m.shinde aah nice.. thanks. Keep me posted?20:20
kata-irc-bot2<sebastien.boeuf> @gmmaharaj I'll call you in 1020:21
*** eernst has quit IRC20:34
*** eernst_ has joined #kata-dev20:41
kata-irc-bot2<eric.ernst> ya'll figuring it out?20:45
kata-irc-bot2<gmmaharaj> @eric.ernst @sebastien.boeuf and Rob are trying to sort it out right now and testing on a non-kata env. should have an update from them soon. ish.20:48
*** devimc has quit IRC20:48
kata-irc-bot2<salvador.fuentes> @gmmaharaj @eric.ernst btw, can you make another quick test with nemu + virtiofs?  I am adding `-m` flag to the docker command and the docker command hangs for me, something like: `sudo docker run -ti --runtime=kata-nemu -m 200M busybox sh`20:51
kata-irc-bot2<gmmaharaj> @salvador.fuentes checking.20:54
kata-irc-bot2<salvador.fuentes> thanks20:55
kata-irc-bot2<eric.ernst> I see the same.20:57
kata-irc-bot2<gmmaharaj> I get this.20:58
```
ganeshma@virtiofs:~/go/src/github.com/kata-containers/runtime$ docker run -it --runtime=kata-nemu-dev --rm -m 1024 ubuntu bash
docker: Error response from daemon: Minimum memory limit allowed is 4MB. See 'docker run --help'.
ganeshma@virtiofs:~/go/src/github.com/kata-containers/runtime$ docker run -it --runtime=kata-nemu-dev --rm -m 100M ubuntu bash
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
docker: Error response from daemon: OCI runtime create failed: rpc error: code = Unavailable desc = transport is closing: unknown.
```
kata-irc-bot2<gmmaharaj> this is the one where i have `/dev/shm` instead of hugepages.20:58
kata-irc-bot2<eric.ernst> mine hangs for...ever..20:59
kata-irc-bot2<salvador.fuentes> @gmmaharaj hmm, ok, thanks, at least it sends you back an error, seems that with hugepages only hangs20:59
kata-irc-bot2<salvador.fuentes> yeah, mine also hangs, u using hugepages @eric.ernst?20:59
kata-irc-bot2<gmmaharaj> hrm.. let me try with hugepages.20:59
kata-irc-bot2<eric.ernst> yeah21:00
kata-irc-bot2<eric.ernst> let me try nemu w/ hugepages, but no virtiofs...21:00
kata-irc-bot2<eric.ernst> only fails w/ virtiofs.21:01
kata-irc-bot2<eric.ernst> I think they don't support memory hotplug, though21:01
kata-irc-bot2<eric.ernst> @sebastien.boeuf can confirm?21:01
kata-irc-bot2<eric.ernst> @manohar.r.castelino?21:01
kata-irc-bot2<eric.ernst> I think its an early limitation.21:01
*** igordc has quit IRC21:02
kata-irc-bot2<salvador.fuentes> oh ok, maybe we would need to skip those tests for the CI21:02
kata-irc-bot2<eric.ernst> this is a pretty sad way to fail, though21:02
kata-irc-bot2<gmmaharaj> the `/dev/shm` patch needs to be modified to handle that scenario.. that is the one that @manohar.r.castelino and @bergwolf were talking about21:02
*** igordc has joined #kata-dev21:03
kata-irc-bot2<salvador.fuentes> @gmmaharaj for this last issue?21:03
kata-irc-bot2<gmmaharaj> @salvador.fuentes right. the shm patch needs to be modded to handle all file backed mem and mem hotplug as well.21:03
kata-irc-bot2<gmmaharaj> once i understand the impact and re-work it.21:04
kata-irc-bot2<gmmaharaj> for now you can go ahead and disable the test.. i can re-enable it as part of that patch21:04
kata-irc-bot2<salvador.fuentes> got it, thanks21:04
*** fuentess has quit IRC21:05
kata-irc-bot2<robert.bradford> @gmmaharaj can you turn of vhost-net in the configuration file21:30
kata-irc-bot2<robert.bradford> @gmmaharaj (for debugging this network failing on hugepages issue.)21:30
kata-irc-bot2<robert.bradford> turn off21:30
kata-irc-bot2<gmmaharaj> got it. testing it now.21:31
kata-irc-bot2<gmmaharaj> uhh.. that works.21:32
kata-irc-bot2<gmmaharaj> let me check the configs once more to make sure i am not blinding myself.21:32
kata-irc-bot2<eric.ernst> no vhost - net is nice for security, and terrible for perf :slightly_smiling_face:21:32
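
To confirm that a configuration change like the one above actually reached the hypervisor, the running command line can be inspected for `vhost=on` on a `-netdev` argument. This is only a sketch: `uses_vhost_net` is a hypothetical helper, and the usage comment assumes the pidfile location seen in the command line pasted earlier in this log.

```shell
# Return 0 if any -netdev argument in the given command line string
# carries vhost=on, 1 otherwise.
uses_vhost_net() {
    local prev="" arg
    for arg in $1; do
        if [ "$prev" = "-netdev" ]; then
            case ",$arg," in
                *,vhost=on,*) return 0 ;;
            esac
        fi
        prev="$arg"
    done
    return 1
}

# Assumed usage against a running sandbox (SANDBOX_ID is a placeholder):
#   pid=$(cat "/run/vc/vm/${SANDBOX_ID}/pid")
#   if uses_vhost_net "$(tr '\0' ' ' < "/proc/${pid}/cmdline")"; then
#       echo "vhost-net still enabled"
#   fi
```
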
*** igordc has quit IRC21:33
*** igordc has joined #kata-dev21:33
kata-irc-bot2<robert.bradford> @gmmaharaj what's your host kernel?21:38
kata-irc-bot2<gmmaharaj> @robert.bradford `Linux virtiofs 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux`21:38
kata-irc-bot2<eric.ernst> it *should* be pretty simple to repro on any other system too.21:38
kata-irc-bot2<eric.ernst> just a couple cmds: https://hackmd.io/vvnqvmO6TwugoJO5nFwa1g?both21:39
*** khyr0n has quit IRC22:01
*** khyr0n has joined #kata-dev22:31
kata-irc-bot2<robert.bradford> @eric.ernst @gmmaharaj https://github.com/intel/nemu/pull/23223:02
kata-irc-bot2<eric.ernst> i'm going to add five GitHub stars to cloud-hypervisor projects as thanks.23:04
kata-irc-bot2<eric.ernst> :)23:04
kata-irc-bot2<eric.ernst> Thanks @robert.bradford!23:04
kata-irc-bot2Action: gmmaharaj gives his salutations to robert.bradford23:05
kata-irc-bot2<robert.bradford> @eric.ernst there is a tag for you to consume: release-2019-05-1723:30
