*** sjas has joined #kata-general | 04:11 | |
*** sjas_ has quit IRC | 04:14 | |
*** gwhaley has joined #kata-general | 07:44 | |
*** gwhaley has quit IRC | 08:22 | |
*** annabelleB has joined #kata-general | 09:25 | |
*** annabelleB has quit IRC | 09:32 | |
*** annabelleB has joined #kata-general | 10:01 | |
*** annabelleB has quit IRC | 10:40 | |
*** gwhaley has joined #kata-general | 11:44 | |
*** annabelleB has joined #kata-general | 12:56 | |
*** annabelleB has quit IRC | 13:01 | |
kata-dev-irc-bot | <xu> Check out @gnawux’s Tweet: https://twitter.com/gnawux/status/991679858441969665?s=09 | 14:06 |
---|---|---|
gwhaley | ooh, you had happy hour at the Admiral in Copenhagen - I've stayed there a couple of times.... they have some really bad tasting (in a good way) schnapps!!! | 14:15 |
kata-dev-irc-bot | <xu> :,) | 14:15 |
kata-dev-irc-bot | <raravena80> would be interesting to see gVisor benchmarks | 16:11 |
kata-dev-irc-bot | <xu> The members said they put security first rather than performance. | 16:11 |
kata-dev-irc-bot | <raravena80> yeah, system call overhead they say. | 16:13 |
kata-dev-irc-bot | <eric.ernst> I'm excited about checking out some of the common areas of concern in gvisor. | 16:17 |
kata-dev-irc-bot | <eric.ernst> ie, 9p patches, what their RPC solution looks like (gRPC is large) | 16:18 |
gwhaley | is gvisor using 9p - ooh, that will be interesting to peek at then | 16:22 |
kata-dev-irc-bot | <eric.ernst> yes, they are. | 16:22 |
kata-dev-irc-bot | <eric.ernst> and they have some posix fixes in place, AFAIU. | 16:23 |
kata-dev-irc-bot | <eric.ernst> :slightly_smiling_face: | 16:23 |
gwhaley | ah, hmm, but they have their own 'little OS' don't they inside the container (I only read one v.brief article so far) - so, they will have a 9p client in there - maybe their fixes are specific to their OS, and cannot (trivially) be leveraged over to our linux kernel and the 9p/VFS interfaces. we'll see... | 16:24 |
kata-dev-irc-bot | <anne> @tallclair might cover some of this in his talk Friday I imagine? | 16:35 |
kata-dev-irc-bot | <anne> Maybe it can be part of an arch committee call post-KubeCon for those not here | 16:36 |
kata-dev-irc-bot | <tallclair> My talk will cover gVisor at a pretty high level, most of the focus is on the Kubernetes layer. For a deeper dive on gVisor, go to Dawn Chen & Zhengyu He's talk on Thursday: http://sched.co/Dqv1 | 16:47 |
kata-dev-irc-bot | <eric.ernst> It's a long commute for me, but looking forward to the recap! | 16:49 |
*** gwhaley has quit IRC | 17:40 | |
kata-dev-irc-bot | <jonolson> they are using 9p, yes | 17:44 |
kata-dev-irc-bot | <xu> They implement their own 9p | 17:46 |
kata-dev-irc-bot | <jonolson> Yes, sorry, they use the 9p protocol — there is nothing of Linux 9p or qemu 9p in gVisor | 17:46 |
kata-dev-irc-bot | <jonolson> Their use of KVM is also very limited — they aren’t using any of the device model support present in KVM (no vHost, etc. unless I’m mistaken — I’ll follow-up with them) | 17:47 |
kata-dev-irc-bot | <eric.ernst> that was my understanding as well from initial reading... | 17:47 |
kata-dev-irc-bot | <eric.ernst> I'm hopeful that we can leverage some 9p fixes! | 17:47 |
kata-dev-irc-bot | <eric.ernst> that should be one of the areas we can have some good collaboration. I hope. | 17:48 |
kata-dev-irc-bot | <jonolson> Heh, sadly, probably not, but I’m going to sorta spam folks with a wall of text now :slightly_smiling_face: | 17:48 |
kata-dev-irc-bot | <eric.ernst> :slightly_smiling_face: | 17:48 |
kata-dev-irc-bot | <jonolson> So there are really two aspects to this, there’s the gVisor syscall sandbox — you don’t need to use 9p at all, ordinary syscalls for open(), read(), fstat(), etc. can work directly as their “VM exits” (with the KVM sandbox) are essentially a heavy paravirtualized “syscall ABI” — they have no real hardware model in Shentu other than where required to play nicely with x86. | 17:49 |
kata-dev-irc-bot | <eric.ernst> that's "ptrace" mode? | 17:50 |
kata-dev-irc-bot | <jonolson> either way — ptrace() and KVM provide equivalent functionality (more or less) with different perf characteristics (KVM is, as far as I know, across the board faster) | 17:50 |
kata-dev-irc-bot | <jonolson> Where 9p comes into the mix is if you don’t want to expose host resources at all — say you had a virtual filesystem for container base images (we do) | 17:51 |
kata-dev-irc-bot | <samuel.ortiz> @jonolson Even on nested env? | 17:51 |
kata-dev-irc-bot | <samuel.ortiz> (i.e KVM being faster) | 17:52 |
kata-dev-irc-bot | <jonolson> @samuel.ortiz ah, maybe not nested — Nicolas has benchmarked it I believe, but I don’t actually remember the numbers — *adds a followup* | 17:52 |
kata-dev-irc-bot | <eric.ernst> okay. Either way, I thought there was work done on 9p around posix compliance. Wasn't clear where/how; I haven't done a git pull yet. | 17:52 |
kata-dev-irc-bot | <jonolson> The KVM backend is fairly new — they have another technology they use for the bulk of gVisor’s usage at Google | 17:52 |
kata-dev-irc-bot | <samuel.ortiz> ah that was the missing piece, thanks @jonolson. That other technology uses VT though, iiuc? | 17:53 |
kata-dev-irc-bot | <jonolson> @eric.ernst sort of: they use a modified version of 9P2000.L. afaik they haven’t done any work to make Linux speak the version they use; however, I’ve been fighting with the incompatibility in the last couple weeks, so I may send patches :slightly_smiling_face: | 17:53 |
kata-dev-irc-bot | <eric.ernst> k. An area of pain on our side as well. | 17:55 |
kata-dev-irc-bot | <jonolson> @samuel.ortiz Yes — a good model is their KVM backend, minus a bunch of the things KVM does on a VMexit that gVisor doesn’t actually care about (reading tons of MSRs, etc.) | 17:55 |
kata-dev-irc-bot | <samuel.ortiz> @jonolson Yes, I guess they don't care about the platform state as much as a regular VM. | 17:58 |
kata-dev-irc-bot | <jonolson> As far as _how_ they use 9P — it’s mostly the modified 9P protocol over Unix domain sockets — they have some shared memory bits to accelerate parts of it, but that’s an optimization — the big trick is that since it’s a Unix domain socket (and the sandbox itself is running on the host, and their “machine ABI” includes things like file descriptors as first class primitives) they can “donate” file descriptors fr | 17:58 |
kata-dev-irc-bot | into the sandbox, and have a (relatively) cheap way of working with them — really this would work fairly well over virtio-9p as well (fids are essentially the same thing), but nobody (except the gVisor team) has taken the model that far for things like sharing socket fds, etc. as far as I know | 17:58 |
kata-dev-irc-bot | <jonolson> @samuel.ortiz right — particularly since they control both the “vmm” (runsc) and the “guest kernel” they know exactly which bits of state they care about | 18:00 |
kata-dev-irc-bot | <jonolson> it makes VM exit handling closer to an agreed-upon calling convention than trying to fully emulate whatever an unknown guest kernel is expecting | 18:04 |
kata-dev-irc-bot | <samuel.ortiz> @jonolson I was a little confused about where the VMM actually is in that picture. | 18:05 |
kata-dev-irc-bot | <samuel.ortiz> right. Still more expensive than a syscall, but with a custom KVM you can mitigate that I guess. | 18:05 |
kata-dev-irc-bot | <jonolson> With the KVM backend the “vmm” is in user-mode, qemu-style — exits are fairly expensive, but since “stock” KVM has no knowledge about gVisor’s needs, all the gVisory bits require a round-trip through host userspace | 18:07 |
kata-dev-irc-bot | <jonolson> but if you control all of the players there might be some vm exits where you know that usermode is just going to turn that into, e.g., a read() syscall — if there’s no discretion left for what user-mode will do, may as well not bother dropping to user-mode at all | 18:09 |
kata-dev-irc-bot | <jonolson> fwiw, i’ve been wondering about whether even in the more general case (where you know less about the kernel bits) something similar might be possible leveraging bpf | 18:10 |
kata-dev-irc-bot | <jonolson> (basically add a kvm ioctl() to register bpf programs for exit handling that can in some cases elide the exit to usermode, without having to go full vhost) :slightly_smiling_face: | 18:11 |
kata-dev-irc-bot | <jonolson> I haven’t actually looked to see if anyone has done this yet — but bpf is so in vogue with seccomp, etc. | 18:12 |
*** gwhaley has joined #kata-general | 18:18 | |
*** HL_ has joined #kata-general | 18:21 | |
*** HL_ has quit IRC | 18:26 | |
*** gwhaley has quit IRC | 20:10 | |
*** mylinux has joined #kata-general | 20:23 | |
*** mylinux has quit IRC | 20:49 | |
*** mylinux has joined #kata-general | 21:04 | |
*** mylinux has quit IRC | 21:44 | |
*** justJanne has quit IRC | 23:52 | |
*** justJanne has joined #kata-general | 23:59 |
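
The 17:58 messages describe "donating" file descriptors into the sandbox over a Unix domain socket. As a rough illustration of the underlying OS mechanism only (SCM_RIGHTS ancillary data), and not of gVisor's actual code or its modified 9P protocol, here is a minimal Python sketch. The names `donate_fd`, `accept_fd`, `demo`, `host_side` and `sandbox_side` are hypothetical, and both ends run in one process purely to keep the example self-contained.

```python
import array
import os
import socket
import tempfile


def donate_fd(sock: socket.socket, fd: int) -> None:
    """Send one open file descriptor over a Unix domain socket (SCM_RIGHTS)."""
    # At least one byte of ordinary data has to accompany the ancillary data.
    sock.sendmsg(
        [b"F"],
        [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [fd]))],
    )


def accept_fd(sock: socket.socket) -> int:
    """Receive one donated descriptor; the caller owns it and must close it."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any trailing bytes that do not form a whole int.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def demo() -> bytes:
    # Hypothetical stand-ins for the host end and the sandbox end of the socket.
    host_side, sandbox_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        with tempfile.TemporaryFile() as f:
            f.write(b"hello from the host")
            f.flush()
            donate_fd(host_side, f.fileno())
            received = accept_fd(sandbox_side)
        # The sender's copy is closed now; the donated descriptor stays valid.
        try:
            os.lseek(received, 0, os.SEEK_SET)
            return os.read(received, 64)
        finally:
            os.close(received)
    finally:
        host_side.close()
        sandbox_side.close()


if __name__ == "__main__":
    print(demo())
```

The receiver gets its own descriptor referring to the same open file, so no path on the host ever has to be exposed to it; that property is what makes the "donate an fd" model described in the chat attractive.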
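
The idea floated at 18:09 to 18:11 (handle some VM exits without a round trip through host user mode when user mode has no discretion anyway) can be pictured with a toy dispatcher. This is a pure-Python model under loud assumptions: `ExitDispatcher`, `register_fast_path` and the exit reasons are all hypothetical, nothing here is a KVM or bpf API, and no such KVM ioctl is claimed to exist.

```python
from typing import Callable, Dict, Optional

# A fast-path handler returns a result, or None to defer to "user mode".
FastHandler = Callable[[int], Optional[str]]


class ExitDispatcher:
    """Toy model: registered handlers may elide the user-mode round trip."""

    def __init__(self, userspace_handler: Callable[[str, int], str]) -> None:
        self._fast: Dict[str, FastHandler] = {}
        self._userspace = userspace_handler
        self.userspace_round_trips = 0

    def register_fast_path(self, reason: str, handler: FastHandler) -> None:
        # Stands in for the hypothetical "register a program for this exit" step.
        self._fast[reason] = handler

    def handle_exit(self, reason: str, arg: int) -> str:
        fast = self._fast.get(reason)
        if fast is not None:
            result = fast(arg)
            if result is not None:
                return result  # handled without leaving the "kernel"
        # No fast path, or the fast path declined: pay for the round trip.
        self.userspace_round_trips += 1
        return self._userspace(reason, arg)


def slow_path(reason: str, arg: int) -> str:
    return f"user mode handled {reason}({arg})"


dispatcher = ExitDispatcher(slow_path)
# Only short reads take the fast path; anything else is deferred.
dispatcher.register_fast_path(
    "read", lambda n: f"fast read of {n} bytes" if n <= 4096 else None
)

print(dispatcher.handle_exit("read", 512))
print(dispatcher.handle_exit("open", 3))
print(dispatcher.userspace_round_trips)
```

The only point of the sketch is the control flow: an exit with a registered handler that accepts it never increments the round-trip counter, while everything else falls through to the slow path.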