15:10:50 <danpb> #startmeeting libvirt
15:10:51 <openstack> Meeting started Tue Jul 22 15:10:50 2014 UTC and is due to finish in 60 minutes. The chair is danpb. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:10:52 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:10:54 <openstack> The meeting name has been set to 'libvirt'
15:11:01 <apmelton> o/
15:11:04 <thomasem> o/
15:11:04 <s1rp_> o/
15:11:07 <sew> o/
15:11:15 <dgenin> o/
15:11:38 <danpb> sorry i got sidetracked & forgot ...
15:12:29 <thomasem> No worries, it happens. We just had a topic from our team that we'd really value your input on.
15:12:36 <danpb> ok, go for it
15:13:21 <thomasem> So, we're exploring cpu tuning for containers, but, as I'm sure you've seen, /proc/cpuinfo still shows the host's processor info instead of something that better reflects the tuning for the guest, like /proc/meminfo does.
15:13:54 <danpb> ah, so this is a real can of worms you'll wish you'd not raised :-)
15:14:00 <apmelton> haha
15:14:05 <thomasem> Lol, oh my favorite!
15:14:32 <danpb> so for a start containers don't really have any concept of virtualized CPUs
15:14:47 <apmelton> so with cpu shares/quota you still technically have every cpu, but if you've locked down the cpus with cpusets, I believe you would only have the cpus you've been allocated
15:14:53 <apmelton> so you can simulate vcpus with cpusets
15:15:01 <danpb> eg if you tell libvirt <vcpus>3</vcpus> or <vcpus>8</vcpus> or whatever, it is meaningless
15:15:10 <thomasem> mhmm
15:15:19 <danpb> what containers do give you is the ability to set affinity of the container to the host
15:15:35 <danpb> so you can say only run this container on host CPUs n->m
15:15:43 <danpb> which is done with the cgroups cpuset
15:16:01 <danpb> the /proc/cpuinfo file though is really unrelated to CPU affinity masks
15:16:09 <danpb> eg, consider if you ignore containers for a minute
15:16:24 <danpb> and just have a host OS and put apache inside a cpuset cgroup
15:16:35 <danpb> you then have the exact same scenario wrt /proc/cpuinfo
15:17:13 <danpb> what this all says to me is that applications should basically ignore /proc/cpuinfo as a way to determine how many CPUs they have available
15:17:34 <danpb> they need to look at what they are bound to
15:18:08 <thomasem> How would we inform applications of that? Is it common for applications to inspect /proc/cpuinfo for tuning themselves?
15:18:45 <danpb> i don't know to be honest - I've been told some (to remain unnamed) large enterprise software parses stuff in /proc/cpuinfo
15:19:00 <apmelton> heh
15:19:10 <thomasem> hmmm
15:19:19 <danpb> i kind of see this as somewhat of a gap in the Linux ecosystem API
15:19:46 <danpb> nothing really provides apps a good library API to determine available CPU / RAM
15:20:30 <thomasem> I see where you're coming from
15:20:53 <danpb> i kind of feel the same way about /proc/meminfo - what we hacked up in libvirt is really not container specific - it's the same issue for any app that wants to see "available 'host' memory" while it is confined by the cgroups memory controller
15:21:15 <sew> so wrt vcpu and flavors for containers, we're considering just setting vcpu to zero for our lxc flavors - does that sound reasonable?
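A minimal sketch of the approach danpb describes above: check what the process is actually bound to rather than parsing /proc/cpuinfo. It assumes Linux and Python 3.3+ (os.sched_getaffinity is Linux-only); inside a container confined by a cpuset cgroup the first count reflects the confinement, while the /proc/cpuinfo count is still the full host.

```python
import os

# CPUs this process is actually allowed to run on
# (reflects the cgroup cpuset / affinity mask the container was given)
usable = os.sched_getaffinity(0)
print("schedulable CPUs:", sorted(usable), "->", len(usable), "usable")

# What naive /proc/cpuinfo parsing reports: every host CPU, affinity ignored
with open("/proc/cpuinfo") as f:
    host_count = sum(1 for line in f if line.startswith("processor"))
print("/proc/cpuinfo processors:", host_count)
```

In a container pinned to two host CPUs the first number is 2 while the second is still the full host count, which is exactly the mismatch thomasem raises.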
15:21:20 <danpb> so in that sense I (at least partially) regret that we added overriding of /proc/meminfo into libvirt
15:22:05 <danpb> sew: not sure about that actually
15:22:23 <danpb> sew: it depends how things interact with the NUMA/CPU pinning stuff I'm working on
15:22:37 <thomasem> What would you have in place of the /proc/meminfo solution in libvirt to provide guests a normal way of understanding their capabilities?
15:22:45 <danpb> sew: we're aiming to avoid directly exposing the idea of setting a CPU affinity mask to the user/admin
15:22:53 <apmelton> danpb: working on that in libvirt or nova's libvirt driver?
15:23:00 <danpb> sew: so the flavour would just say "want exclusive CPU pinning"
15:23:18 <danpb> and then the libvirt nova driver would figure out what host CPUs to pin the guest to
15:23:26 <sew> interesting concept danpb
15:23:31 <danpb> so to be able to do that, we need the vcpus number set to a sensible value
15:23:41 <apmelton> ah, danpb, so we could mimic vcpus for containers with that?
15:23:42 <danpb> just so that we can figure out how many host CPUs to pin the guest to
15:24:10 <danpb> even though when we pass this vcpu value on to libvirt it will be ignored
15:24:30 <danpb> IOW from the Nova flavour level, vcpus is still useful even though it isn't useful at the libvirt level
15:24:47 <danpb> apmelton: it is a Juno feature for the Nova libvirt driver
15:25:20 <apmelton> ah really, I wasn't aware of that, how do you use it?
15:25:23 <danpb> thomasem: ultimately i think there needs to be some kind of API to more easily query cgroup confinement / resource availability
15:25:51 <danpb> apmelton: the big picture is outlined here https://wiki.openstack.org/wiki/VirtDriverGuestCPUMemoryPlacement
15:25:56 <thomasem> Ah, so a process, whether in a full container guest or simply under a single cgroup limitation, can find its boundaries?
15:26:15 <danpb> thomasem: yeah, pretty much
15:26:26 <thomasem> Hmmm, I wonder how we could pursue that, tbh.
15:26:37 <thomasem> Start a chat on ze mailing list for LXC?
15:26:48 <thomasem> Or perhaps work like that is already underway?
15:26:49 <danpb> with the way systemd is rising to a standard in Linux, and is the owner of cgroups, it is possible that systemd's DBus APIs might be the way forward
15:27:00 <thomasem> oh okay
15:27:25 <thomasem> interesting
15:27:29 <danpb> but i'm fairly sure there's more that systemd would need to expose in this respect still
15:28:06 <danpb> overall though the current view is that systemd will be the exclusive owner of all things cgroup related - libvirt and other apps need to talk to systemd to make changes to cgroups config
15:28:15 <sew> systemd does seem like the logical place for all that to happen
15:28:21 <thomasem> gotcha
15:28:36 <thomasem> Something to research and pursue, then.
15:31:05 <danpb> that all said, if there's a compelling reason for libvirt to fake /proc/cpuinfo for the sake of compatibility we might be able to explore that upstream
15:31:34 <danpb> just that it would really be a work of pure fiction based solely on the <vcpu> value from the XML that does nothing from a functional POV :-)
15:31:47 <thomasem> Yeah, we'd be lying.
15:31:48 <thomasem> lol
15:32:00 <danpb> for added fun, /proc/cpuinfo is utterly different for each CPU architecture - thanks linux :-(
15:32:02 <apmelton> danpb: wouldn't it be better to base it off cpu pinning?
15:32:05 <thomasem> It's just the question of whether it's better to lie closer to the truth :P
15:32:16 <apmelton> or is that not supported with libvirt-lxc?
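As an illustration of the "query cgroup confinement" idea danpb raises above, here is a rough sketch of reading the limits directly from the cgroup filesystem. The paths are assumptions: it expects a cgroup v1 layout mounted under /sys/fs/cgroup with the memory and cpuset controllers, and that /proc/self/cgroup maps cleanly onto those mounts (inside a container with namespaced cgroup mounts the mapping may differ).

```python
from pathlib import Path

def own_cgroup(controller):
    # Find this process's cgroup path for a given v1 controller, e.g. "memory"
    for line in Path("/proc/self/cgroup").read_text().splitlines():
        _, names, path = line.split(":", 2)
        if controller in names.split(","):
            return path
    return "/"

def memory_limit_bytes():
    # cgroup v1 file; on v2 the equivalent is "memory.max" in the unified hierarchy
    f = Path("/sys/fs/cgroup/memory" + own_cgroup("memory"), "memory.limit_in_bytes")
    return int(f.read_text())

def allowed_cpus():
    f = Path("/sys/fs/cgroup/cpuset" + own_cgroup("cpuset"), "cpuset.cpus")
    return f.read_text().strip()   # e.g. "0-3,8"

print("memory limit:", memory_limit_bytes(), "bytes")
print("cpuset.cpus :", allowed_cpus())
```

This is the kind of plumbing that danpb suggests would ideally sit behind a proper library or systemd DBus API rather than being reimplemented by every application.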
15:32:30 <danpb> you can do guest pinning with libvirt lxc
15:33:35 <apmelton> for instance, instead of ignoring the vcpu value in lxc, could libvirt translate that into cpu pins?
15:33:56 <danpb> if the kernel ever introduced a cgroup tunable "max N number of processes concurrently in running state for the scheduler" that would conceptually work for a vcpu value, but that's probably not likely to happen
15:34:13 <apmelton> heh
15:34:17 <danpb> apmelton: that would mean the guests were always pinned even when pinning is not requested
15:34:38 <danpb> which is something i'd prefer to avoid since, while it works ok as you start up a sequence of guests
15:34:48 <danpb> once you shut down a few & start some new ones, you end up very unbalanced in placement
15:34:59 <apmelton> yea, that gets complex fast
15:35:15 <danpb> you'd have to have libvirt constantly re-pinning containers to balance things out again
15:36:41 <apmelton> ok, that makes sense
15:39:27 <thomasem> regarding faking /proc/cpuinfo for compatibility, I am not immediately aware of an application use-case that would look for that. Can anyone think of an issue with the guest being able to see the host processor info in general (in a multi-tenant env)?
15:40:00 <danpb> i don't think there's any real information leakage problem
15:41:41 <thomasem> Okay
15:41:44 <dgenin> maybe if you knew that the data you were after was on a particular type of node, you could use /proc/cpuinfo to navigate the cloud
15:42:10 <dgenin> just an idea
15:43:41 <danpb> dgenin: the /proc/cpuinfo file is pretty low entropy as far as identifying information is concerned
15:44:05 <danpb> particularly as clouds will involve large pools of identical hardware
15:44:13 <dgenin> true, there are bound to be many nodes with the same cpuinfo
15:44:31 <danpb> there are many other, easier ways to identify hosts
15:44:52 <dgenin> what do you have in mind?
15:46:10 <danpb> sysfs exposes host UUIDs :-)
15:47:08 <dgenin> yeah, but the attacker is not likely to know something so precise
15:47:20 <dgenin> another possibility is that cpuinfo is not sufficient alone, but it could be combined with other identifying information to pigeonhole the node
15:47:47 <danpb> any other agenda items to discuss, or we can call it a wrap?
15:48:28 <thomasem> Not from me. I have stuff to think about now. :)
15:48:45 <thomasem> Not like I didn't before, but more now. hehe.
15:49:04 <sew> thx for the background on cpuinfo danpb
15:49:26 <danpb> ok, till next week....
15:49:30 <danpb> #endmeeting
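A hypothetical sketch of the guest pinning danpb mentions at 15:32:30, using the libvirt-python bindings: choose a set of host CPUs for a container and apply it as an affinity map. The domain name "mycontainer" and the choice of CPUs 2-3 are made up for illustration; whether pinEmulator is wired up for the LXC driver depends on the libvirt version in use, and the equivalent can also be set statically in the domain XML via <cputune><emulatorpin cpuset='2-3'/></cputune>.

```python
import libvirt

conn = libvirt.open("lxc:///")             # libvirt LXC driver
dom = conn.lookupByName("mycontainer")     # hypothetical container name

host_cpus = conn.getInfo()[2]              # getInfo() -> [model, memMB, cpus, ...]
pin_to = {2, 3}                            # e.g. derived from the flavour's vcpus value

# cpumap is one boolean per host CPU; True means the container may run there
cpumap = tuple(i in pin_to for i in range(host_cpus))
dom.pinEmulator(cpumap)
print(dom.emulatorPinInfo())
```

Choosing which host CPUs go into pin_to is exactly the placement problem danpb warns about: a naive static choice drifts out of balance as guests come and go.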