*** changcheng has quit IRC | 00:59 | |
*** changcheng has joined #kata-dev | 01:00 | |
*** tonyb has quit IRC | 01:07 | |
*** tonyb has joined #kata-dev | 01:09 | |
*** eernst has joined #kata-dev | 01:26 | |
*** tonyb has quit IRC | 01:27 | |
*** eernst has quit IRC | 01:27 | |
*** eernst has joined #kata-dev | 01:28 | |
*** eernst has quit IRC | 01:32 | |
*** zerocoolback has joined #kata-dev | 04:25 | |
*** sjas_ has joined #kata-dev | 04:27 | |
*** sjas has quit IRC | 04:30 | |
*** marst has joined #kata-dev | 04:40 | |
*** jodh has joined #kata-dev | 06:53 | |
*** gwhaley has joined #kata-dev | 08:02 | |
*** davidgiluk has joined #kata-dev | 08:04 | |
*** annabelleB_ has joined #kata-dev | 08:06 | |
*** david-lyle has quit IRC | 08:14 | |
*** dklyle has joined #kata-dev | 08:15 | |
kata-irc-bot | <miao.yanqiang> Hi, ask a question about hugepage usage in kata-containers | 09:01 |
kata-irc-bot | <miao.yanqiang> I set `enable_hugepages = true`, and run a container, but how can I check the container has used the hugepages? | 09:02 |
davidgiluk | miao.yanquiang: You should find that the qemu has been started with an option something like memory-backend-file,....mem-path=/dev/hugepages | 09:04 |
davidgiluk | oops; miao.yanqiang: ^ | 09:05 |
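The host-side check davidgiluk describes can be sketched like this (a sketch, not kata-specific tooling; the qemu option name is from his message, and the `HugePages_*` counters are standard Linux `/proc/meminfo` fields):

```shell
# Look for qemu's hugepage-backed memory object in the host process list
# (illustrative: a kata VM may not be running on the machine you try this on)
ps -ef | grep -o 'memory-backend-file[^ ]*' || true

# Host-side hugepage accounting: HugePages_Free should drop while a
# hugepage-backed VM is running
grep -E 'HugePages_(Total|Free|Rsvd)' /proc/meminfo
```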
kata-irc-bot | <miao.yanqiang> yeah, but I get `4096` by executing `getconf PAGE_SIZE` in container | 09:07 |
gwhaley | @miao.yanqiang, @davidgiluk - I also went to check if our VM kernel has it enabled - there are a bunch of HUGE things on, but also transparent seems to be off https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x#L521 | 09:10 |
gwhaley | probably pull in jcvenegas and julio (devimc) on this (easiest to find on slack prob. - cannot see them on irc right now) | 09:11 |
davidgiluk | miao.yanqiang: hugepages get a bit messy; the page size backing the RAM in qemu can be different from the page size used by the guest (on x86); for example, qemu can be using 2MB hugepages but the guest can still use both 2MB and 4k pages | 09:11 |
davidgiluk | miao.yanqiang: Similarly, the guest can use 2MB hugepages even when qemu isn't using hugepages as backing; the x86 MMU/kvm is very flexible in this | 09:11 |
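This independence is easy to see from inside a guest (or any Linux box): `getconf PAGE_SIZE` reports the base page size, which on x86 stays 4 KiB regardless of how qemu backs the RAM on the host, while the guest's own hugepage size is reported separately:

```shell
# Base page size: 4096 on x86 even when the host backs guest RAM with
# 2MB hugepages - this is why the container above still saw 4096
getconf PAGE_SIZE

# The guest kernel's own hugepage size (typically 2048 kB on x86)
grep -i '^Hugepagesize' /proc/meminfo
```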
davidgiluk | gwhaley: Yeh it's odd to have CONFIG_TRANSPARENT_HUGEPAGE off; I'd turn both that on and also CONFIG_TRANSPARENT_HUGEPAGE_MADVISE | 09:13 |
gwhaley | PRs and Issues most welcome @davidgiluk ;-) (and, your input most welcome :-) ) | 09:14 |
kata-irc-bot | <miao.yanqiang> Thanks @davidgiluk | 09:14 |
kata-irc-bot | <miao.yanqiang> For the kata container, is there a way to set the page size to 2MB now? | 09:16 |
kata-irc-bot | <miao.yanqiang> Or, if it's not set, will it affect the performance of applications in the kata container? | 09:18 |
davidgiluk | miao.yanqiang: I think you should be OK as long as you've got the hugepage option on, then qemu is using it, and the guest does what the guest normally does | 09:21 |
davidgiluk | gwhaley: Hmm; I wonder if there's some argument about how well transparent hp works for short lived VMs/systems - there's some overhead from going around and trying to find hugepages; if the VM only lives for a short while perhaps that doesn't pay off? | 09:22 |
kata-irc-bot | <miao.yanqiang> There is a use case for running VPP in kata's documentation, with only that option set and no other settings: https://github.com/kata-containers/documentation/blob/master/use-cases/using-vpp-and-kata.md | 09:25 |
gwhaley | davidgiluk - heh, yep. This is a perennial issue for us, as different container users have different expectations, from single or few very large long lived containers to trying to use containers for FAAS - quite different patterns and requirements. We've tried to stick with one kernel/rootfs image to 'rule them all', and regularly debate if we have to start generating multiple kernels or rootfs (and agent variants) to satisfy different | 09:28 |
gwhaley | needs. 'debug' is quite often one case. enabling seccomp was the case for discussion this week. | 09:28 |
gwhaley | we have really tried to avoid multiple versions of components as our test matrix will exponentially explode on us :-( | 09:29 |
davidgiluk | yeh | 09:30 |
davidgiluk | gwhaley: the TRANSPARENT_HUGEPAGE_* options make some of this switchable at runtime; TRANSPARENT_HUGEPAGE_MADVISE lets you turn it on and off for individual mappings - I'm not quite sure what the default is | 09:30 |
davidgiluk | miao.yanqiang: I think you should be fine if you just enable the hugepage option in the kata config like that doc says; that tells qemu to use hugepages for its backing; I'd assume VPP explicitly asks for hugepages in the guest | 09:33 |
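For reference, the knob under discussion lives in kata's `configuration.toml` (the exact path varies by install; `/usr/share/defaults/kata-containers/configuration.toml` is a common default). A minimal fragment, matching the option quoted earlier in the channel:

```toml
# Back the VM's RAM with huge pages: qemu is started with a
# memory-backend-file on /dev/hugepages on the host
# (hugepages must already be reserved on the host for this to work)
enable_hugepages = true
```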
*** justJanne has quit IRC | 09:37 | |
kata-irc-bot | <miao.yanqiang> But the `/dev/hugepages` directory of the VM is not visible within the container. | 09:39 |
*** justJanne has joined #kata-dev | 09:39 | |
kata-irc-bot | <miao.yanqiang> So, in principle, applications in the container cannot access the `hugepage` of the VM | 09:40 |
davidgiluk | miao.yanqiang: No, that's not how it works | 09:43 |
davidgiluk | miao.yanqiang: The memory map used by the host kernel and thus the host-qemu is independent of the guest kernel | 09:43 |
davidgiluk | miao.yanqiang: passing the hugepage option to qemu tells qemu to use hugepages in its memory mapping on the host; it doesn't influence the type of mappings the guest does | 09:44 |
davidgiluk | miao.yanqiang: Inside the guest there will be its own /dev/hugepages that corresponds to allocations made by the guest kernel | 09:45 |
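Whether the guest's hugepages are actually usable can be checked the usual Linux way (a sketch; as noted later in the channel, whether kata's guest image mounts hugetlbfs by default was left an open question):

```shell
# Does the kernel support hugetlbfs at all?
grep hugetlbfs /proc/filesystems

# Is it mounted? If not, it can be mounted by hand (needs privileges):
#   mount -t hugetlbfs none /dev/hugepages
mount | grep hugetlbfs || echo "hugetlbfs not mounted"
```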
kata-irc-bot | <miao.yanqiang> Oh, forgive my ignorance, thank you for explaining.:slightly_smiling_face: | 09:47 |
*** annabelleB__ has joined #kata-dev | 09:53 | |
*** gwhaley has quit IRC | 09:56 | |
*** annabelleB__ has quit IRC | 09:57 | |
davidgiluk | miao.yanqiang: no problem | 09:58 |
kata-irc-bot | <niteshkonkar007> So when I create an issue at `https://github.com/kata-containers/kata-containers/issues/new`, I do not get an option to add labels like `limitation` or `feature`, either while creating the issue or after it is created. | 10:06 |
*** annabelleB_ has quit IRC | 10:34 | |
*** marcov has quit IRC | 11:07 | |
kata-irc-bot | <miao.yanqiang> Hi @davidgiluk, are you still here? | 11:15 |
davidgiluk | sure | 11:15 |
kata-irc-bot | <miao.yanqiang> Great! | 11:16 |
*** marcov has joined #kata-dev | 11:17 | |
kata-irc-bot | <miao.yanqiang> > @miao.yanqiang: Inside the guest there will be its own /dev/hugepages that corresponds to allocations made by the guest kernel How does the container access the hugepages of the guest? | 11:17 |
kata-irc-bot | <miao.yanqiang> The container does not seem to be running in privileged mode | 11:19 |
kata-irc-bot | <ydjainopensource> I think only maintainers can add a label | 11:19 |
kata-irc-bot | <miao.yanqiang> When I run a kata | 11:20 |
davidgiluk | miao.yanqiang ah, that I've not checked, /dev/hugepages might need setting up by something in the kata guest | 11:21 |
kata-irc-bot | <niteshkonkar007> Then I will request one of the maintainers to add the label `limitation` to `https://github.com/kata-containers/documentation/pull/242`. | 11:21 |
kata-irc-bot | <miao.yanqiang> OK, Thanks so much! | 11:26 |
*** fuentess has joined #kata-dev | 12:34 | |
*** annabelleB has joined #kata-dev | 12:38 | |
*** devimc has joined #kata-dev | 12:47 | |
kata-irc-bot | <anne> @verytired1 looks like everyone's favorite IRC bug is back (aka the Jess Fraz Alarm) ^^ | 12:57 |
*** marcov has quit IRC | 13:01 | |
*** marcov has joined #kata-dev | 13:02 | |
*** zerocoolback has quit IRC | 13:15 | |
kata-irc-bot | <james.o.hunt> Anyone created a wikipedia page for that yet? :) | 13:31 |
*** annabelleB has quit IRC | 13:37 | |
kata-irc-bot | <verytired1> Ha | 13:50 |
davidgiluk | adding a new config option seems to involve adding it in about 50 places | 14:05 |
*** zerocoolback has joined #kata-dev | 15:33 | |
*** devimc has quit IRC | 15:37 | |
*** devimc has joined #kata-dev | 15:38 | |
*** jodh has quit IRC | 17:15 | |
*** ChanServ sets mode: -rf | 17:18 | |
*** sjas_ is now known as sjas | 17:23 | |
*** zerocoolback has quit IRC | 18:21 | |
*** davidgiluk has quit IRC | 19:32 | |
*** fuentess has quit IRC | 20:42 | |
kata-irc-bot | <sebastien.boeuf> @eguzman hi! | 20:48 |
kata-irc-bot | <sebastien.boeuf> @eguzman just saw your POC PR about coldplugging GPUs | 20:48 |
kata-irc-bot | <sebastien.boeuf> @eguzman before talking about the possibility to coldplug some devices, I'd like to understand why the hotplug does not work in your case | 20:49 |
*** dims has quit IRC | 20:55 | |
kata-irc-bot | <eguzman> The exact details for why GPU hotplugging doesn't work are related to the low-level details of PCI configuration | 21:11 |
kata-irc-bot | <eguzman> On my current machine, the output of `docker run --runtime=kata-runtime --rm -it --device=/dev/vfio/34 centos/tools bash` is: | 21:11 |
kata-irc-bot | <eguzman> `docker: Error response from daemon: OCI runtime create failed: QMP command failed: unknown` | 21:11 |
kata-irc-bot | <eguzman> the logs indicate that the QMP device_add command failed | 21:12 |
kata-irc-bot | <sebastien.boeuf> have you tried to enable the configuration `hotplug_vfio_on_root_bus = true` in the config file `configuration.toml` ? | 21:14 |
kata-irc-bot | <eguzman> On other machines I have tested, the VM boots successfully, but attempts to install drivers result in error messages claiming the BIOS misconfigured the PCI device | 21:15 |
kata-irc-bot | <eguzman> `hotplug_vfio_on_root_bus = true` results in identical errors | 21:16 |
kata-irc-bot | <sebastien.boeuf> ok, it'd be nice to get the logs of what's happening when you run a simple VM, and then hotplug your gpu with QMP | 21:17 |
kata-irc-bot | <sebastien.boeuf> no Kata involved | 21:17 |
kata-irc-bot | <sebastien.boeuf> just to try to understand what's actually not working | 21:18 |
kata-irc-bot | <sebastien.boeuf> do you see any kernel error from the dmesg? | 21:19 |
kata-irc-bot | <sebastien.boeuf> what does the hook that you need inside the guest do? | 21:20 |
kata-irc-bot | <sebastien.boeuf> I'm trying to understand if triggering the actions that you need as part of the hook would actually solve your issue if they were triggered after the device has been hotplugged | 21:21 |
*** dims has joined #kata-dev | 21:23 | |
*** eernst_ has joined #kata-dev | 21:34 | |
*** devimc has quit IRC | 21:41 | |
kata-irc-bot | <eric.ernst> @sebastien.boeuf not all devices support hot plug | 21:42 |
*** eernst_ has quit IRC | 21:50 | |
*** eernst has joined #kata-dev | 21:50 | |
kata-irc-bot | <eguzman> no kernel message in dmesg | 21:51 |
kata-irc-bot | <eguzman> the hooks we want to run are in charge of installing drivers and leveraging them within the container | 21:55 |
kata-irc-bot | <sebastien.boeuf> could you send the error message that you get from QMP when you try to add the device with hotplug ? | 22:03 |
kata-irc-bot | <sebastien.boeuf> also, can you share the PCI details about the card ? | 22:13 |
kata-irc-bot | <eric.ernst> @archana.m.shinde I need another ACK? https://github.com/kata-containers/runtime/pull/703 | 22:14 |
kata-irc-bot | <eric.ernst> here too: https://github.com/kata-containers/proxy/pull/112 | 22:14 |
kata-irc-bot | <eric.ernst> if you are happy with them, I'd like to see all the 1.3.0-rc0 merged.... | 22:14 |
kata-irc-bot | <archana.m.shinde> merged the runtime | 22:16 |
kata-irc-bot | <archana.m.shinde> I see the ARM job failing on the proxy | 22:17 |
kata-irc-bot | <eric.ernst> that's the only one it is running on - was it passing consistently before? | 22:17 |
kata-irc-bot | <eric.ernst> I ask because all I'm doing is changing VERSION file... | 22:17 |
kata-irc-bot | <archana.m.shinde> I think thats been newly introduced | 22:19 |
kata-irc-bot | <archana.m.shinde> @jose.carlos.venegas.m: ^ Has that ever passed? | 22:20 |
kata-irc-bot | <eric.ernst> http://jenkins.katacontainers.io/job/kata-containers-proxy-ARM-18.04-PR/ | 22:21 |
kata-irc-bot | <eric.ernst> sometimes. | 22:21 |
kata-irc-bot | <archana.m.shinde> ok, looks like it's a network issue | 22:22 |
kata-irc-bot | <archana.m.shinde> merging it since it's just a version change :slightly_smiling_face: | 22:23 |
kata-irc-bot | <eric.ernst> ty | 22:23 |
kata-irc-bot | <eguzman> doing the QMP device_add myself, there are no reported errors | 22:40 |
kata-irc-bot | <eguzman> however, trying to install the driver: `[ 24.659038] nvidia 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 [ 24.659046] NVRM: This PCI I/O region assigned to your NVIDIA device is invalid: [ 24.659049] NVRM: BAR1 is 0M @ 0x0 (PCI:0001:00.0) [ 24.659052] NVRM: The system BIOS may have misconfigured your GPU. [ 24.659060] nvidia: probe of 0000:01:00.0 failed with error -1 [ 24.659598] NVRM: The NVIDIA probe routine failed for 1 device(s). [ 24.659603] NVRM: None of the NVIDIA graphics adapters were initialized!` | 22:41 |
kata-irc-bot | <sebastien.boeuf> ok, and now, if you coldplug the device when starting the VM, what logs do you get when you install the driver ? | 23:12 |
kata-irc-bot | <sebastien.boeuf> and please could you share the qemu command line you're using ? | 23:12 |
kata-irc-bot | <sebastien.boeuf> in both cases | 23:12 |
kata-irc-bot | <sebastien.boeuf> also, please provide the output of an lspci in both cases (coldplug/hotplug) | 23:13 |
kata-irc-bot | <eguzman> when cold plugging (adding -device vfio-pci,host=XX:XX.X to the qemu command line) the driver installs properly | 23:20 |
kata-irc-bot | <eguzman> in both cases lspci shows the graphics card | 23:20 |
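The two attachment modes compared above can be sketched side by side. Hotplug sends a `device_add` over qemu's QMP socket (the PCI address and device id here are illustrative):

```json
{ "execute": "qmp_capabilities" }
{ "execute": "device_add",
  "arguments": { "driver": "vfio-pci", "host": "01:00.0", "id": "gpu0" } }
```

The coldplug equivalent is the `-device vfio-pci,host=01:00.0` flag on the qemu command line, as eguzman describes. The NVRM error quoted above (`BAR1 is 0M @ 0x0`) is consistent with the PCI BARs only being sized and assigned correctly in the coldplug case.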
kata-irc-bot | <eguzman> this is a fundamental limitation of the existing hardware/software stack which doesn't support hotplugging | 23:34 |
*** lungaro has joined #kata-dev | 23:44 | |
lungaro | how are clear containers maintained? Are they patches I apply to my kernel? | 23:49 |
lungaro | sorry, not clear containers, kata containers | 23:50 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!