kata-irc-bot | <eric.ernst> :thread: shim size comparison for 2.0 and 1.x | 00:19 |
---|---|---|
kata-irc-bot | <eric.ernst> Was doing some quick testing comparing 2.0 and 1.x for size comparison. | 00:20 |
kata-irc-bot | <eric.ernst> ```
| shim | ps rss | total program size | resident set size | shared pages | text (code) | library | data/stack | dirty pages |
|---|---|---|---|---|---|---|---|---|
| 2.x | 33920 | 250093 | 8480 | 4521 | 4176 | 0 | 43064 | 0 |
| 1.x | 25252 | 248467 | 6313 | 4393 | 4989 | 0 | 43120 | 0 |
``` | 00:25 |
kata-irc-bot | <eric.ernst> So, code is less, which I expect based on some of what we removed for store, etc. | 00:26 |
kata-irc-bot | <eric.ernst> But the memory usage is significantly higher. I would have expected maybe parity, or less, but not an ~8MB increase. | 00:27 |
kata-irc-bot | <eric.ernst> That's eating into the benefits of the reduced guest/agent | 00:27 |
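The columns after `ps rss` in the table above look like the seven fields of `/proc/<pid>/statm`, which proc(5) lists in the order size, resident, shared, text, lib, data, dt, all counted in pages. A minimal Python sketch of the arithmetic, assuming 4 KiB pages (which matches the `ps rss` column) and using only the numbers pasted in the channel; `parse_statm` and the `anon` key are names made up for this illustration:

```python
# Field order of /proc/<pid>/statm as documented in proc(5); values are pages.
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(line, page_kb=4):
    """Convert one statm line to kB per field (page_kb=4 is an assumption)."""
    values = [int(v) for v in line.split()]
    if len(values) != len(STATM_FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(STATM_FIELDS), len(values)))
    stats = {name: pages * page_kb for name, pages in zip(STATM_FIELDS, values)}
    # Resident pages that are not file-backed or shmem, i.e. anonymous memory.
    stats["anon"] = stats["resident"] - stats["shared"]
    return stats


# The two statm rows from the log (2.x shim and 1.x shim).
shim_2x = parse_statm("250093 8480 4521 4176 0 43064 0")
shim_1x = parse_statm("248467 6313 4393 4989 0 43120 0")

rss_delta_kb = shim_2x["resident"] - shim_1x["resident"]
anon_delta_kb = shim_2x["anon"] - shim_1x["anon"]
print("RSS delta (kB):", rss_delta_kb)
print("anonymous delta (kB):", anon_delta_kb)
```

Under these assumptions the resident-set difference works out to 8668 kB, of which 8156 kB is anonymous memory rather than file-backed pages.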
*** bumperSteff has quit IRC | 00:57 | |
*** bumperSteff has joined #kata-dev | 00:57 | |
*** sameo has quit IRC | 00:57 | |
*** auk_ is now known as auk | 01:46 | |
*** auk_ has joined #kata-dev | 03:45 | |
*** auk has quit IRC | 03:46 | |
*** auk_ is now known as auk | 03:55 | |
kata-irc-bot | <eric.ernst> @liubin0329 thanks for adding pprof. I'm going to take a look now. | 04:07 |
kata-irc-bot | <bergwolf> yeah, that's totally unexpected | 05:47 |
kata-irc-bot | <eric.ernst> this is just swapping agent/runtime, rest of it is the same on the system. | 05:48 |
kata-irc-bot | <eric.ernst> https://gist.github.com/egernst/f3f786d38f92c7a7f20baf581f490e6d | 05:50 |
kata-irc-bot | <bergwolf> this one? containerd-shim-kata-v2 threads 3 vs. 5 | 05:53 |
kata-irc-bot | <liubin0329> Seems related to Prometheus: ```
(pprof) top
Showing nodes accounting for 2581.10kB, 100% of 2581.10kB total
Showing top 10 nodes out of 36
      flat  flat%   sum%        cum   cum%
  528.17kB 20.46% 20.46%   528.17kB 20.46%  github.com/kata-containers/kata-containers/src/runtime/containerd-shim-v2.glob..func1
  516.76kB 20.02% 40.48%  1028.79kB 39.86%  encoding/json.typeFields
  512.14kB 19.84% 60.33%  1024.14kB 39.68%  github.com/prometheus/client_golang/prometheus.(*metricMap).getOrCreateMetricWithLabelValues
  512.02kB 19.84% 80.16%  1028.79kB 39.86%  encoding/json.newStructEncoder
     512kB 19.84%   100%      512kB 19.84%  github.com/prometheus/client_golang/prometheus.makeLabelPairs
         0     0%   100%   516.76kB 20.02%  encoding/json.(*Encoder).Encode
         0     0%   100%   512.02kB 19.84%  encoding/json.(*decodeState).object
         0     0%   100%   512.02kB 19.84%  encoding/json.(*decodeState).unmarshal
         0     0%   100%   512.02kB 19.84%  encoding/json.(*decodeState).value
         0     0%   100%   516.76kB 20.02%  encoding/json.(*encodeState).marshal
``` | 05:55 |
kata-irc-bot | <liubin0329> Wait, I'm using FC, let me change to QEMU and check again. | 05:56 |
*** auk has quit IRC | 07:16 | |
*** sgarzare has joined #kata-dev | 07:51 | |
*** pcaruana has joined #kata-dev | 07:52 | |
*** jodh has joined #kata-dev | 07:59 | |
*** sameo has joined #kata-dev | 08:11 | |
*** fgiudici has joined #kata-dev | 08:20 | |
*** david-lyle has joined #kata-dev | 08:47 | |
*** sgarzare_ has joined #kata-dev | 08:47 | |
*** dklyle has quit IRC | 08:47 | |
*** sgarzare has quit IRC | 08:47 | |
*** snir has quit IRC | 08:49 | |
*** snir has joined #kata-dev | 08:50 | |
*** david-lyle has quit IRC | 08:58 | |
*** davidgiluk has joined #kata-dev | 09:07 | |
*** th0din has joined #kata-dev | 10:19 | |
*** devimc has joined #kata-dev | 12:32 | |
*** jodh_ has joined #kata-dev | 13:38 | |
*** jodh has quit IRC | 13:40 | |
*** EricAdamsZNC2 has quit IRC | 13:40 | |
*** EricAdamsZNC has joined #kata-dev | 13:42 | |
*** sameo has quit IRC | 14:03 | |
*** devimc has quit IRC | 14:12 | |
*** devimc has joined #kata-dev | 14:15 | |
*** crobinso has joined #kata-dev | 14:35 | |
kata-irc-bot | <christophe> Could someone help me understand the failure in http://jenkins.katacontainers.io/job/kata-containers-2.0-fedora-PR/560? The change is https://github.com/kata-containers/kata-containers/pull/1114, and the fedora-crio test fails with ```
Failed at 48: chronic sudo -E yum install -y kubelet-"$install_kubernetes_version" kubeadm-"$install_kubernetes_version" kubectl-"$install_kubernetes_version" --disableexcludes=kubernetes
Kubernetes not installed
Openshift not installed
Disable systemd-journald rate limit
Terminated
++ handle_error 69
++ local exit_code=143
++ local line_number=69
++ echo 'Failed at /tmp/jenkins/workspace/kata-containers-2.0-fedora-PR/ci_entry_point.sh +69: .ci/jenkins_job_build.sh "${repo_to_test}"'
``` | 15:05 |
kata-irc-bot | <fidencio> Let me take a look. | 15:08 |
kata-irc-bot | <fidencio> ```
Install Kubernetes components
Build timed out (after 5 minutes). Marking the build as aborted.
Build was aborted
Performing Post build task...
``` | 15:10 |
kata-irc-bot | <fidencio> It does look like a flake, and it does seem to be safe enough to just restart that specific CI. | 15:12 |
kata-irc-bot | <fidencio> @christophe: ^ | 15:12 |
kata-irc-bot | <eric.ernst> @c3d is the workload being constrained? | 15:16 |
kata-irc-bot | <eric.ernst> Adjusting score seems okay in general, but I’m interested in better understanding the failure here. We don’t support “bestEffort” (unconstrained) very well. | 15:16 |
kata-irc-bot | <christophe> @eric.ernst Well, there is a problem here that the OOM killer adjustment applies also to children. Is that what you are referring to? But in the original bug, the OOM killer specifically killed the agent. | 15:18 |
kata-irc-bot | <eric.ernst> Well, before getting into that, I’m also interested in how this can be reproduced in general. I’m guessing there isn’t a memory limit applied to the container's cgroup in the guest. | 15:19 |
kata-irc-bot | <christophe> Ah, this happened while running a fuzzer in a container with vfio, originally. So I'm not sure how easy it is to reproduce | 15:20 |
kata-irc-bot | <christophe> I don't think there was a specific constraint. | 15:20 |
kata-irc-bot | <christophe> I will ask. Unfortunately, it's on a private Bugzilla. | 15:20 |
kata-irc-bot | <eric.ernst> The workload was a fuzzer? What’s the workload spec, or cmdline? | 15:21 |
kata-irc-bot | <eric.ernst> If you don’t constrain a memory greedy workload, bad things happen. | 15:21 |
kata-irc-bot | <christophe> ```
podman --runtime=kata-vfio run --security-opt label=type:container_kvm_t -it --rm --cap-add=CAP_IPC_LOCK --device=/dev/vfio/120 --device=/dev/vfio/vfio fedora sh

Now inside the container:
# git clone https://gitlab.com/cailca/linux-mm
# cd linux-mm; make
# ./random -x 0-100 -f
(which just runs some syscall fuzzing)
``` | 15:21 |
kata-irc-bot | <eric.ernst> --memory=(something sane) | 15:21 |
kata-irc-bot | <christophe> So no, no memory constraint | 15:21 |
kata-irc-bot | <eric.ernst> Maybe we can fail more gracefully .. maybe. | 15:22 |
kata-irc-bot | <christophe> Can we have that discussion on the issue itself, BTW, so that there is a record? | 15:22 |
kata-irc-bot | <eric.ernst> OOM adj as well, but... I can see why this happens. | 15:22 |
*** sameo has joined #kata-dev | 15:28 | |
*** devimc has quit IRC | 15:39 | |
*** devimc has joined #kata-dev | 15:39 | |
*** dklyle has joined #kata-dev | 15:48 | |
*** pcaruana has quit IRC | 16:09 | |
*** sgarzare_ has quit IRC | 17:00 | |
kata-irc-bot | <fidencio> @jose.carlos.venegas.m, @salvador.fuentes, hola! :slightly_smiling_face: | 17:07 |
kata-irc-bot | <fidencio> I'd like to ask what exactly is tested with containerd on the 2.x branch, as I'd like to reach parity between what's tested with containerd and what's tested with CRI-O. In the past few days I was able to enable the `bats` tests that were skipped on CRI-O, meaning that in that part we're fine. But what are the other bits that would need some love? :slightly_smiling_face: | 17:07 |
kata-irc-bot | <fidencio> Adding @wmoschet and @cmeadors to the loop as well. | 17:07 |
kata-irc-bot | <fidencio> And @fgiudici. :slightly_smiling_face: | 17:14 |
kata-irc-bot | <jose.carlos.venegas.m> @fidencio Hey | 17:14 |
kata-irc-bot | <jose.carlos.venegas.m> Let me check whether I can tell you based on the CI files. @salvador.fuentes and @gabriela.cervantes.te, who would know quickly, are offline today. | 17:15 |
kata-irc-bot | <fidencio> Ah, @jose.carlos.venegas.m, don't need to spend time on this now. | 17:16 |
kata-irc-bot | <fidencio> I can poke them again next week, no problem at all. | 17:16 |
kata-irc-bot | <jose.carlos.venegas.m> @fidencio Sure, no problem. They are back next week. | 17:16 |
kata-irc-bot | <fidencio> Thanks! | 17:17 |
fgiudici | root | 17:18 |
fgiudici | ups :-P | 17:18 |
davidgiluk | Password: | 17:34 |
fidencio | hunter2 | 17:37 |
*** jodh_ has quit IRC | 18:02 | |
fgiudici | lol | 18:27 |
*** fgiudici has quit IRC | 18:29 | |
kata-irc-bot | <eric.ernst> @bergwolf -- yeah. I see that in the pmap, but when checking /proc/$PID/status, I see # threads listed as 9 for each. The ~8MB is coming from RssAnon (not terribly useful) | 18:43 |
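The thread count and `RssAnon` mentioned above are lines in `/proc/$PID/status`. A minimal Python sketch of pulling them out, assuming the usual `Key:<tab>value` layout; the helper names are made up for this illustration, and `SAMPLE` is a hypothetical excerpt rather than output captured in this log (only the thread count of 9 is quoted from the channel):

```python
def parse_status(text):
    """Split /proc/<pid>/status style text into a {key: value-string} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def kb(value):
    """Turn a status value such as '15836 kB' into the integer 15836."""
    return int(value.split()[0])


# Hypothetical excerpt; the memory figures are illustrative, not measured here.
SAMPLE = (
    "Name:\tcontainerd-shim\n"
    "Threads:\t9\n"
    "VmRSS:\t   33920 kB\n"
    "RssAnon:\t   15836 kB\n"
    "RssFile:\t   18084 kB\n"
)

fields = parse_status(SAMPLE)
print("threads:", int(fields["Threads"]))
print("anonymous RSS (kB):", kb(fields["RssAnon"]))
```

On a live system the same functions could be fed the contents of `/proc/<pid>/status` for each shim, so the two versions can be compared field by field.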
*** davidgiluk has quit IRC | 20:11 | |
*** auk has joined #kata-dev | 21:19 | |
*** devimc has quit IRC | 22:03 | |
*** th0din has quit IRC | 22:21 | |
*** th0din has joined #kata-dev | 22:24 | |
*** ajin has quit IRC | 22:37 | |
kata-irc-bot | <eric.ernst> So, I still think the shim binary is super large, and I hope we can look into reducing size if feasible, but I think I found some of the issues | 22:39 |
kata-irc-bot | <eric.ernst> 1. 1.x has a bug in the Makefile: buildmode=pie was dropped. Yikes. This accounts for a ~20% reduction in binary size. Opened https://github.com/kata-containers/runtime/issues/3074 and will fix, which will bring 1.x to be almost as large as 2.0.
2. I noticed that on 2.0 we link a couple of libraries now: ```
required from libc.so.6:
  0x0d696914 0x00 06 GLIBC_2.4
  0x09691974 0x00 04 GLIBC_2.3.4
  0x09691a75 0x00 02 GLIBC_2.2.5
``` | 22:41 |
kata-irc-bot | <eric.ernst> @samuel.ortiz - thanks for the soundboard on some of this. | 22:42 |
kata-irc-bot | <eric.ernst> This shim is run per pod, and is just as critical as the agent, the VMM, and the guest kernel for reducing the footprint of Kata. We should monitor this closely, and see what we can do to reduce it (I'm not saying rewrite in Rust!) | 22:50 |
kata-irc-bot | <eric.ernst> devimc @archana.m.shinde @chen.bo PTAL: https://github.com/kata-containers/runtime/pulls?q=is%3Apr+is%3Aopen+shim-v2++buildmode | 22:56 |
*** sameo has quit IRC | 23:52 |