*** mikal5 is now known as mikal | 01:01 | |
masahito | sean-k-mooney: late ack. Thanks. yes, I'm thinking of pushing the PoC code as early as possible and trying to get the features in early next cycle. | 01:39 |
mikal | Oh nice, because rebasing a patch series now involves adding a signed off header, it also invalidates all votes against that review, even if the code hasn't changed. | 02:25 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: Add objects and notifications for sound model. https://review.opendev.org/c/openstack/nova/+/926126 | 04:52 |
opendevreview | Michael Still proposed openstack/nova master: Implement sound model extra spec for libvirt. https://review.opendev.org/c/openstack/nova/+/940770 | 04:52 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: Add objects and notifications for USB controller model. https://review.opendev.org/c/openstack/nova/+/927354 | 04:52 |
opendevreview | Michael Still proposed openstack/nova master: Implement USB controller extra spec for libvirt. https://review.opendev.org/c/openstack/nova/+/950643 | 04:52 |
opendevreview | benlei proposed openstack/nova master: FIX: instance live migration failed after swap volume https://review.opendev.org/c/openstack/nova/+/954210 | 07:53 |
opendevreview | benlei proposed openstack/nova master: FIX: instance live migration failed after swap volume https://review.opendev.org/c/openstack/nova/+/954210 | 11:12 |
opendevreview | Kamil Sambor proposed openstack/nova master: Replace eventlet.event.Event with threading.Event https://review.opendev.org/c/openstack/nova/+/949754 | 14:43 |
*** haleyb|out is now known as haleyb | 14:58 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: [pci]Keep used dev in Placement regardless of dev_spec https://review.opendev.org/c/openstack/nova/+/954149 | 15:11 |
opendevreview | Rajesh Tailor proposed openstack/nova master: Clarify that [pci]device_spec={} matches all devices https://review.opendev.org/c/openstack/nova/+/954274 | 15:50 |
fungi | sean-k-mooney: https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/2069607 was finally switched to public on friday, and contains the rationale behind cloud-init's decision to break that functionality we were discussing | 18:01 |
sean-k-mooney | fungi: ack i'll skim it but the problem is that even on x86_64 using dmi is not a correct way to determine if the openstack data source should be enabled | 19:29 |
mikal | How does AWS handle this given they use exactly the same IP mechanism (we copied from them)? | 19:29 |
sean-k-mooney | for it to be something they could rely on it would have to be something we support for all virt drivers including ironic | 19:30 |
sean-k-mooney | mikal: it's handled by you configuring the backend in the image | 19:30 |
sean-k-mooney | you are meant to have custom images per cloud | 19:30 |
sean-k-mooney | so the uec images are the amazon flavor | 19:30 |
mikal | Also, it's not strictly true that someone can man-in-the-middle that IP address, assuming they're actually on OpenStack. We intercept that traffic and special case it. | 19:30 |
sean-k-mooney | well the 169.254 addresses were chosen because those are not allowed to be routed | 19:31 |
fungi | yeah, where this presents a challenge is e.g. distros producing official bootable cloud images that are intended for use in multiple environments | 19:31 |
fungi | mikal: yes, the concern raised was booting a generic cloud image in a non-openstack environment | 19:31 |
sean-k-mooney | and since clouds are meant to provide things like ip/mac spoofing protection in general | 19:31 |
sean-k-mooney | spoofing that ip should not be possible in general | 19:32 |
mikal | Honestly this seems like a pretty weak security concern to me. | 19:32 |
sean-k-mooney | fungi: so whether something responds on that address is meant to be how you determine if it's ec2/openstack | 19:32 |
sean-k-mooney | you differentiate between the two based on the reported filesystem in the metadata endpoint | 19:33 |
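[Editor's note: a hypothetical sketch of the differentiation sean-k-mooney describes. EC2 and OpenStack both answer on 169.254.169.254, but expose different URL trees (`/latest/meta-data/` vs `/openstack/latest/`); those paths are the documented ones, while the probe logic itself is purely illustrative.]

```shell
# Classify a metadata service by which URL tree answered.  The two path
# prefixes are the real documented layouts; the function itself is an
# illustrative sketch, not code from cloud-init.
datasource_for_path() {
    case "$1" in
        /openstack/*)       echo openstack ;;
        /latest/meta-data*) echo ec2 ;;
        *)                  echo unknown ;;
    esac
}

datasource_for_path /openstack/latest/meta_data.json   # openstack
datasource_for_path /latest/meta-data/instance-id      # ec2
```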
fungi | on non-x86 architectures cloud-init was assuming that the metadata addresses, if they replied, were trustworthy and not an attacker serving up backdoor userdata | 19:33 |
sean-k-mooney | fungi: it does that on x86 as well | 19:33 |
sean-k-mooney | they may have added some extra conditional checks at some point | 19:33 |
fungi | but only if the dmi data indicates it's booted by nova (or ec2 i suppose?) | 19:33 |
mikal | Can't I make the same argument about DHCP? I just trust the first reply, that reply needs to not be malicious. | 19:33 |
sean-k-mooney | but originally it would hit the ec2 endpoint even on openstack | 19:33 |
sean-k-mooney | fungi: not that i'm aware of. i think in older releases if it was enabled in the config it was used unconditionally even on x86 | 19:34 |
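[Editor's note: a rough sketch of the kind of DMI check under discussion. cloud-init reads identification strings from `/sys/class/dmi/id/` (a real sysfs location); the matching below is illustrative, not cloud-init's actual rules, and as noted in the conversation it only helps where the virt driver populates DMI at all.]

```shell
# Decide whether a DMI product/asset-tag string looks like an OpenStack
# guest.  Nova's libvirt driver puts "OpenStack" into these fields;
# other virt drivers (and ironic) may not, which is the gap being
# discussed.  This matching is a sketch, not cloud-init's code.
looks_like_openstack() {
    case "$1" in
        *OpenStack*) echo yes ;;
        *)           echo no ;;
    esac
}

looks_like_openstack "OpenStack Nova"   # yes
looks_like_openstack "Amazon EC2"       # no
```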
fungi | mikal: yes, i think this is the same degree of risk as a rogue dhcp server, assuming the system is configured to also run arbitrary code at boot that the dhcp server points it to? (not sure, that could be a bit of a pathological situation) | 19:34 |
sean-k-mooney | but there are several other first boot systems like ignition and cloudbase-init that use the same 169.254.169.254 (or the ipv6 equivalent) well known addresses | 19:35 |
fungi | right, i have no idea if cloud-init supports getting metadata from those | 19:36 |
mikal | Wouldn't it be better to spend any development effort here on something like TLS for the metadata service with certificate verification turned on, instead of changing a mechanism that no one has complained about in nearly 20 years? | 19:36 |
sean-k-mooney | i think they "fixed" it by just changing the default in the config to not enumerate by default | 19:37 |
fungi | well, technically someone *did* complain about it, hence the bug report | 19:37 |
sean-k-mooney | but i would expect distros to override that | 19:37 |
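[Editor's note: one way a distro or deployer could override the changed default is to pin the datasource list in a `cloud.cfg.d` drop-in so no detection is needed. The `/etc/cloud/cloud.cfg.d` location and `datasource_list` key are standard cloud-init configuration; the file name and list contents here are a sketch.]

```shell
# Pin cloud-init's datasource list so auto-detection is skipped.
# A temp file stands in for /etc/cloud/cloud.cfg.d/90_datasources.cfg.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
datasource_list: [ OpenStack, ConfigDrive, None ]
EOF
grep -c 'datasource_list' "$cfg"   # 1
```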
fungi | yes, distros could override it, or they could just build separate ec2/openstack images and othercloud images | 19:38 |
sean-k-mooney | as it stands they will have broken metadata support for ironic by default. so if you don't enable configdrive with your ironic instance it won't be able to use metadata out of the box | 19:38 |
fungi | i have no idea if e.g. debian's or ubuntu's generic cloud images are used on ironic | 19:38 |
sean-k-mooney | i have used them before | 19:39 |
mikal | Building N cloud images is actually a fairly big hosting problem I imagine. That's a fair bit of repeated data just for a very minor problem. | 19:39 |
sean-k-mooney | but it depends on your hardware | 19:39 |
fungi | seems like you'd want a more full-featured image with broader hardware support? | 19:39 |
fungi | yeah | 19:39 |
sean-k-mooney | sometimes they don't have the drivers required, sometimes they do | 19:39 |
sean-k-mooney | it's been a long time since i tried but i think i used diskimage-builder with the cloud image as a base to build one | 19:39 |
sean-k-mooney | cloud image does not mean qemu/kvm or vm based | 19:40 |
fungi | debian not including ahci support in the generic cloud kernels because "most" cloud providers (amazon/azure/google) don't need it is a good example of why i'd be skeptical of using such images in ironic for bare metal | 19:40 |
mikal | DIB definitely uses the various cloud images as the base for builds. | 19:40 |
sean-k-mooney | mikal: it can/does but it can replace the kernel etc | 19:40 |
mikal | sean-k-mooney: yeah, package installs etc etc. | 19:41 |
fungi | mmm, these days the dib elements we're using start from docker images instead, though previously we used elements that bootstrapped from scratch with tools like debootstrap | 19:41 |
mikal | This is how the SF "official" images are built. DIB with the SF agent sprinkled in. | 19:41 |
sean-k-mooney | fungi: when did we move the ubuntu ones | 19:42 |
fungi | but yes, i think dib does have some non-"minimal" elements that instead modify cloud images with virsh tools | 19:42 |
sean-k-mooney | because they used to use the cloud image until relatively recently | 19:42 |
fungi | sean-k-mooney: move from using debootstrap to container images? not sure if we have done that for the ubuntu images yet, i have to check | 19:42 |
fungi | but at least for the dib-created images we boot in opendev we've never used cloud images as the base | 19:43 |
sean-k-mooney | fungi: last time i looked, and when i was looking to add alpine, most still did not use containers | 19:43 |
sean-k-mooney | fedora i think did | 19:43 |
sean-k-mooney | but most still default to cloud images as a base | 19:43 |
sean-k-mooney | https://github.com/openstack/diskimage-builder/blob/master/diskimage_builder/elements/ubuntu/root.d/10-cache-ubuntu-tarball#L16-L21 | 19:44 |
sean-k-mooney | ok they use the squashfs rather than the qcow | 19:44 |
sean-k-mooney | or rather the filesystem image | 19:44 |
fungi | sean-k-mooney: yeah, looks like we're still using ubuntu-minimal (debootstrap into a chroot) not ubuntu-container | 19:44 |
sean-k-mooney | so i know there is a fedora container element https://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements/fedora-container | 19:45 |
fungi | the rocky images we build are using the rocky-container element by contrast | 19:45 |
sean-k-mooney | right | 19:45 |
sean-k-mooney | but i think that was only done for one or two and most didn't move yet | 19:45 |
fungi | yeah, looking back at it i think you're right, i assumed we were further along on that already | 19:46 |
fungi | i guess it's been on hold while we switch from nodepool to zuul-built images | 19:46 |
sean-k-mooney | i was playing with packer and trying to build an alpine image at the weekend and i was debating if i wanted to revive my dib series for that | 19:47 |
sean-k-mooney | building os images is always a pain regardless of the tooling | 19:48 |
sean-k-mooney | alpine now provides a cloud image with cloud-init however | 19:48 |
sean-k-mooney | it's just larger than i was hoping | 19:48 |
sean-k-mooney | fungi: i still think exploring alpine as a replacement for cirros, or moving to using nested kvm, might be the only way we eliminate the kernel panics we see in our jobs | 19:49 |
fungi | it seems worth exploring, i agree | 19:50 |
sean-k-mooney | it does not have to be alpine but there are very few other distros that are small enough and well supported to consider | 19:50 |
sean-k-mooney | anyway back to the topic at hand if folks are building an ironic image they can still use https://github.com/openstack/diskimage-builder/tree/master/diskimage_builder/elements/cloud-init-datasources#environment-variables | 19:51 |
sean-k-mooney | to list openstack | 19:51 |
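[Editor's note: the cloud-init-datasources element linked above is driven by an environment variable; a build for an ironic-bound image might look like the following. The element and `DIB_CLOUD_INIT_DATASOURCES` variable come from diskimage-builder's documentation, the output image name is made up, and the `disk-image-create` call is left commented since it requires dib to be installed.]

```shell
# Restrict cloud-init's datasource list at image-build time via the
# diskimage-builder cloud-init-datasources element.
export DIB_CLOUD_INIT_DATASOURCES="OpenStack, ConfigDrive"
# disk-image-create -o my-ironic-image ubuntu cloud-init-datasources vm
echo "would bake in: $DIB_CLOUD_INIT_DATASOURCES"
```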
fungi | i want to say someone (doug?) was playing with using distros that target embedded systems (openwrt? emdebian? building something with yocto?) | 19:52 |
fungi | but it's been years ago now | 19:52 |
sean-k-mooney | fungi: i got it partly working, i was just missing growpart :) | 19:52 |
sean-k-mooney | but ya i briefly considered tinycore since ironic used it for the ipa | 19:53 |
sean-k-mooney | the problem is the combination of small disk footprint <100MB and small ram footprint <128MB + cloud-init or glean + busybox/tools needed by tempest | 19:54 |
sean-k-mooney | fungi: now that they officially publish cloud images i should probably just point my testing patch at that and see if it works | 19:55 |
fungi | worth a shot, sure | 19:56 |
sean-k-mooney | fungi: how is the ci since the rollout of the zuul change | 19:57 |
sean-k-mooney | fungi: have cross provider builds been avoided as expected? | 19:57 |
fungi | sean-k-mooney: not 100% solved, we discovered that we also need https://review.opendev.org/954284 | 19:58 |
sean-k-mooney | hmm, the cross tenant sharing of python objects is slightly scary but ok | 19:59 |
sean-k-mooney | i guess that will roll out next weekend | 19:59 |
fungi | we'll probably restart the launchers onto it as soon as the container images promote | 19:59 |
fungi | turns out we had done that when the previous (partial) fix merged but i didn't realize it when discussing later in the week | 20:00 |
sean-k-mooney | oh i see, the sharing is not a generic thing, just in that function | 20:03 |
sean-k-mooney | i was slightly concerned that semaphores or secrets etc could be shared but this is not related to the job objects | 20:03 |
fungi | no, you could look at it as the problem arising from a generally cross-tenant resource not having been correctly shared since zuul normally separates objects in one tenant from the others | 20:04 |
opendevreview | sean mooney proposed openstack/nova master: [WIP] use alpine instead of cirros https://review.opendev.org/c/openstack/nova/+/881912 | 20:06 |
sean-k-mooney | i think i had previously customised the alpine image i built to allow password login with gocubsgo for tempest, although that might work via ssh key login | 20:08 |
fungi | amusing, i never realized that was the default password for cirros images. i guess someone's a baseball fan | 20:11 |
sean-k-mooney | it's hard to tell because i never wrote the nodepool change to build the image in ci, i only built it locally, so i don't recall if i used the dev-user element or not | 20:11 |
mikal | fungi: that would be Scott Moser. | 20:11 |
sean-k-mooney | it used to be "cubswin:)" until they won and now it's "gocubsgo" | 20:11 |
fungi | hah | 20:11 |
clarkb | re dhcp vs metadata service I think a difference is that typically dhcp is a fairly locked down service within hosting providers. The hosting services may not all realize they need to prevent traffic on that special metadata address | 20:12 |
clarkb | basically locking down dhcp is pretty standard stuff. Locking down the metadata service isn't | 20:12 |
mikal | I just think this is a pretty big breaking change for a relatively minor "security vulnerability". | 20:12 |
sean-k-mooney | i'm not sure. i would consider both to be similar levels of threat | 20:12 |
clarkb | sean-k-mooney: they are. But one is super standard that everyone knows about. The other is not | 20:13 |
clarkb | basically the difference is one has an rfc the other doesn't | 20:13 |
mikal | Especially when I can think of use cases which want to get rid of the customized DMI data entirely | 20:13 |
fungi | also i don't know how i would go about using a malicious dhcp server to get a machine to run arbitrary code at boot or install access for my personal ssh key | 20:13 |
sean-k-mooney | mikal: i mentioned this elsewhere but the dmi info is technically configurable by packagers/deployers | 20:13 |
sean-k-mooney | nova reads a vendor.conf file if it exists | 20:14 |
sean-k-mooney | which changes what that info is | 20:14 |
fungi | but yeah, all of this is great feedback to the cloud-init maintainers | 20:14 |
mikal | So now deployers can accidentally break cloud-init with a DMI customization? | 20:14 |
sean-k-mooney | we do not use that downstream in redhat partly because we saw the ability to tell it was openstack, or worse what version of openstack, as a potential security problem | 20:14 |
mikal | That's not great either. | 20:14 |
fungi | and now that the bug is public, i suppose comments could be added there | 20:14 |
fungi | sean-k-mooney: i don't see us relying on the dev-user element anywhere, we have a custom element that we inject ssh keys with | 20:16 |
sean-k-mooney | fungi: that was not upstream, i meant when i built my test alpine image locally | 20:17 |
fungi | ah | 20:17 |
sean-k-mooney | when i was working on the dib support i was using the dev-user image to debug, but it's been 4 years so i have no idea if the image i was testing with still had that or not | 20:17 |
fungi | okay, just re-read your earlier comment about dev-user and now i get what you were saying | 20:17 |
sean-k-mooney | fungi: i got it to boot in the ci and it appeared to be working, but boots hung because i did not expand the disk | 20:18 |
sean-k-mooney | so it would start to boot and then fail | 20:18 |
sean-k-mooney | i actually did 2 implementations, the most recent one was 2 years ago based on the containerfile support https://review.opendev.org/c/openstack/diskimage-builder/+/755410/4 | 20:19 |
sean-k-mooney | mikal: ya you need to have a file like this https://github.com/openstack/nova/blob/master/etc/nova/release.sample | 20:20 |
sean-k-mooney | mikal: https://github.com/openstack/nova/commit/4aa45d2900a3063305a7b2c727abfc2371909b21 | 20:20 |
sean-k-mooney | the support for that still technically exists https://github.com/openstack/nova/blob/master/nova/version.py | 20:21 |
sean-k-mooney | but nothing i know of actually uses that | 20:21 |
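[Editor's note: the release file being linked is a small ini file that nova/version.py reads to override the vendor/product/package strings nova exposes (including via DMI). The `[Nova]` section and key names match the `etc/nova/release.sample` file in the nova tree; the values below are invented, to show the deployer customization mikal asks about, which would indeed stop DMI-based detection from matching.]

```shell
# Sketch of a deployer-customized nova release file.  A temp file stands
# in for the real location; keys follow etc/nova/release.sample, values
# are hypothetical.
rel=$(mktemp)
cat > "$rel" <<'EOF'
[Nova]
vendor = ExampleCorp
product = ExampleCloud Compute
package = 0
EOF
grep '^product' "$rel"
```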
mikal | But that's the key point. OpenStack tries hard not to break existing behaviour. cloud-init apparently does not. | 20:22 |
mikal | I think this is the strongest argument I've seen as to why people should move to glean, especially because glean implies that config drive will not in fact one day be removed. | 20:22 |
sean-k-mooney | there are other lightweight alternatives too. the problem with glean is partly its strength | 20:23 |
sean-k-mooney | it does not support metadata, only config drive | 20:23 |
sean-k-mooney | which is why glean is so reliable | 20:23 |
sean-k-mooney | but since we can't update config drive it's only really good for first boot, but honestly if you're doing day two things use ansible | 20:24 |
fungi | i don't think it's entirely fair to say that cloud-init doesn't try to preserve existing behaviors, just in this particular case they judged the existing behavior to be a greater risk than the impact from breaking it (whether they judged that accurately is another matter) | 20:25 |
sean-k-mooney | i guess we will see if we get a spike in reports of "my ssh key was not injected" or similar | 20:25 |
fungi | they're typically pretty good about not breaking functionality, and in this particular case the bug was open in private for over a year while they debated the risks of doing so | 20:26 |
sean-k-mooney | on a related note, the thing i wish we had time to do was add support for jwt or application credentials via config drive, i.e. provide a way similar to k8s for you to provide an authentication token to the workloads | 20:27 |
fungi | but yes, the complexity and expansive scope of cloud-init, as well as their choice to rely on lots of different non-stdlib python libs, is why we made and use glean for our ci nodes | 20:27 |
sean-k-mooney | it's something that aws and all the other clouds already do, but providing a way to do attestation of the server and auth is a gap we have today. | 20:28 |
fungi | zuul has jwt support and can do oidc claims too | 20:28 |
sean-k-mooney | fungi: we are only using glean for the nodepool images right? cirros has its own mini implementation in bash that it uses instead | 20:29 |
fungi | though i'm not sure if we could leverage that to authenticate metadata server responses | 20:29 |
fungi | sean-k-mooney: correct | 20:29 |
clarkb | fungi: the zuul implementation requires the third party set up zuul as a trusted oidc provider | 20:29 |
fungi | we don't (currently) build cirros images ourselves afaik | 20:29 |
fungi | clarkb: yeah, i was trying to think if there was a way client-side to leverage that, but the booting server instance would be the client in that case and needs to be reachable before zuul can interact with it | 20:30 |
sean-k-mooney | this is their code https://github.com/cirros-dev/cirros/blob/main/src/usr/bin/ec2metadata and no we don't, we use the upstream images and bake them into the ci images to avoid downloading it in the job | 20:30 |
fungi | yeah, we download the upstream cirros images and ship convenience copies in the cache in our test nodes | 20:31 |
clarkb | note that is a cache though. we can and do remove old versions or lag on adding new versions | 20:31 |
sean-k-mooney | clarkb: well new versions very rarely happen | 20:32 |
sean-k-mooney | the last cirros release was 26th of september 2024 | 20:32 |
fungi | though i expect we're overdue for pruning some of the current list of cirros images we include | 20:32 |
sean-k-mooney | i think there were still some jobs using 0.5.x last time i checked | 20:33 |
sean-k-mooney | they should all be on 0.6.3 at this point but ... | 20:33 |
clarkb | sean-k-mooney: I'm calling it out because someone noticed a cirros image wasn't baked in recently and while it was a bug it was also a buggy job not falling back to download it | 20:33 |
sean-k-mooney | i believe there is a 0.7.0 in the works but that has been true for a while now | 20:33 |
clarkb | and yes they were using 0.5.x and I indicated that those old versions should probably be considered removable too | 20:33 |
fungi | if the jobs are run infrequently, e.g. on aging unmaintained branches, we usually consider it safe to drop them from the cache since those jobs can still fetch the files at runtime when they occasionally happen to run | 20:34 |
sean-k-mooney | so the kernel panics we hit rarely on 0.6.3 are much more common on 0.5.x because there is at least 1 additional kernel bug that is not fixed there. | 20:34 |
mikal | sean-k-mooney: a less terrible spiffe attestor would probably meet your needs while pleasing the hipsters. It's on my mental list of things to one day look at, possibly with some TPM sprinkled in. | 20:55 |
mikal | fungi: they had it open for a year and never thought to ask a Nova core? | 20:56 |
sean-k-mooney | i mean it was a private "security issue" so i would not necessarily expect them to | 20:57 |
sean-k-mooney | although if they were unsure they could have. i guess fungi was at least somewhat aware | 20:57 |
sean-k-mooney | so i assume they asked the vmt? | 20:58 |
fungi | jamespage convinced them on adding me to the bug as a heads up, though technically not for a vmt perspective since the vulnerability doesn't exist in openstack clouds, it was more of a question about the potential impact of the breaking change they were considering | 20:58 |
mikal | I don't see that on the bug? I might have missed it because I only skimmed, but they seem to have spent much more time talking to Canonical than the affected platform team. | 20:58 |
mikal | (The VMT being asked that is) | 20:59 |
fungi | and since it wasn't a report for an openstack project, i didn't feel like i had the authority to pull more people into it | 21:01 |
fungi | i was absolutely out of my depth on whether cloud-init's reliance on reading dmi data was a good choice, but also that seemed to already be happening in prior versions | 21:02 |
sean-k-mooney | fungi: ya i don't expect you to know those details | 21:05 |
sean-k-mooney | it's just sad that the choice is to break every nova virt driver that is not libvirt | 21:05 |
sean-k-mooney | well, not break, but the dmi interface was only supported in the libvirt driver so the auto detection won't work generically | 21:06 |
mikal | sean-k-mooney: one issue with machine credentials is that OpenStack lacks an IAM role system, so what credentials would you give the machine and with what permissions? Oslo policy is such a mess Red Hat won't take support calls if you use it, and there really is nothing else. | 21:06 |
mikal | sean-k-mooney: in other clouds that machine credential would be of the form "can read from this one specific object store bucket and this one special user". We can't do that right now. | 21:07 |
sean-k-mooney | mikal: i wanted to support passing an application credential that you precreate, or asking nova to create one on the user's behalf | 21:08 |
sean-k-mooney | and we don't support custom policy because oslo.policy is a mess | 21:08 |
sean-k-mooney | it's not | 21:08 |
sean-k-mooney | we don't support it by default because customers kept breaking things by incorrectly configuring it | 21:09 |
sean-k-mooney | so we say if you want to have custom policy, file a support exception to at least get a cursory ok from someone who should understand it | 21:09 |
fungi | it just seems like a catch-22 since you'd need some way to pass data into a server booted from a generic image so it knows what to validate the signature with (public key for example) | 21:10 |
fungi | and if you already had a secure communication channel, you could just use that for the metadata too | 21:11 |
mikal | Well, qemu supports secure communications channels, we just don't use them. | 21:12 |
sean-k-mooney | well metadata was meant to be the secure channel but ya... | 21:12 |
sean-k-mooney | mikal: we wanted it to be mostly one way, and we intentionally don't allow direct db access from the compute agent | 21:13 |
sean-k-mooney | fungi: the problem is that it does not matter how secure we make this in neutron | 21:13 |
sean-k-mooney | it's every other cloud that would need to be secure for them to feel ok with this being enabled by default | 21:14 |
sean-k-mooney | like on the neutron side we already trap the request based on that ip and then use the source mac/ip to proxy it to nova with the neutron proxy authentication using a shared secret | 21:16 |
sean-k-mooney | neutron also does not allow other vms to respond to the request with its anti-spoofing rules | 21:16 |
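[Editor's note: the shared-secret proxy authentication sean-k-mooney describes is, concretely, an HMAC: neutron forwards the request with an X-Instance-ID header plus an X-Instance-ID-Signature computed as HMAC-SHA256 of the instance id under the configured metadata proxy shared secret, which nova recomputes and compares. The header names and construction match that scheme; the secret and instance id below are made up.]

```shell
# Compute the metadata-proxy signature the way the neutron->nova scheme
# works: HMAC-SHA256 of the instance id keyed by the shared secret.
# Values are placeholders.
secret='not-a-real-secret'
instance_id='00000000-0000-0000-0000-000000000001'
sig=$(printf '%s' "$instance_id" | openssl dgst -sha256 -hmac "$secret" | awk '{print $NF}')
echo "X-Instance-ID: $instance_id"
echo "X-Instance-ID-Signature: $sig"
```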
sean-k-mooney | but unless azure and every other cloud also did that they would consider any well known address to be insecure | 21:16 |
sean-k-mooney | i can kind of get where they are coming from, it's just that this is how cloud images are expected to work | 21:17 |
sean-k-mooney | if they don't support metadata by default it's not really a cloud image anymore, since that was kind of the defining factor, at least in the context of cloud-init | 21:18 |
sean-k-mooney | anyways i better call it a day o/ | 21:18 |
clarkb | mikal: you actually can configure a bucket (swift container) that only a specific one off account can edit | 21:19 |
clarkb | it's a bit underdocumented and the tooling is rough. infra uses it and we end up doing direct curl stuff to make it happen iirc | 21:20 |
fungi | sean-k-mooney: by "secure communication channel" i meant one which the booted operating system can somehow verify is trustworthy. a trusted third party (like a well-known public certificate authority) is one possible solution, but probably a terrible one | 21:33 |
fungi | basically include a reference in the metadata blob to the location of a signature and a certificate corresponding to the key that signed the metadata, where that certificate is issued by a trusted third party that cloud-init or whatever knows and can validate it against | 21:37 |
fungi | as soon as you eliminate the ttp, i think you're stuck having to inject some custom trust information into the server, and the trustworthy injection problem becomes essentially the same as the trustworthy metadata problem | 21:39 |
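[Editor's note: a minimal sketch of the scheme fungi outlines, a metadata blob plus a detached signature the guest verifies against a key it trusts. A freshly generated RSA key stands in for the trusted third party here, which is exactly the part fungi notes is hard to bootstrap; the openssl verbs are standard.]

```shell
# Sign a stand-in metadata blob and verify the detached signature.
key=$(mktemp); pub=$(mktemp); blob=$(mktemp); sig=$(mktemp)
openssl genrsa -out "$key" 2048 2>/dev/null
openssl rsa -in "$key" -pubout -out "$pub" 2>/dev/null
echo '{"uuid": "example-instance"}' > "$blob"   # fake metadata blob
openssl dgst -sha256 -sign "$key" -out "$sig" "$blob"
openssl dgst -sha256 -verify "$pub" -signature "$sig" "$blob"   # Verified OK
```

The guest-side problem is the last line: it only means something if the public key (or a cert chain leading to it) arrived through a channel the guest already trusts.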
fungi | but it's likely there are common solutions i'm just not aware of | 21:40 |
sean-k-mooney | fungi: you're thinking of something like the grub efi shim | 21:41 |
fungi | yes, that's the same sort of implementation | 21:42 |
fungi | and faces effectively the same challenges | 21:43 |
sean-k-mooney | so with something like config drive you can sort of do that, in the sense that you could provide a ca cert to use to authenticate, but at that point you might as well just provide the metadata as well | 21:43 |
fungi | my point exactly | 21:43 |
sean-k-mooney | we also try to avoid injecting things directly into the image now if we can | 21:44 |
fungi | bootstrapping trust is a nontrivial problem | 21:45 |
sean-k-mooney | yep, and the trusted compute folks would go as far as saying providing nvram or other uefi stores is not secure enough | 21:45 |
mikal | clarkb: other clouds let you do it effectively for any API call. | 22:02 |
clarkb | ya I know. It's just that that is the one example where you can do it with openstack | 22:03 |
mikal | fungi: this "credential zero" problem is literally what SPIFFE is aimed at. It's just that the OpenStack attestor is meh. One day someone should fix it. | 22:04 |
mikal | Like the hypervisor could produce a certificate for the instance at boot and ram it into the TPM, to my current understanding of TPMs. | 22:05 |
fungi | yeah, and then the tpm is your implicitly trusted communication channel, albeit one that probably isn't suitable for ramming all the metadata through | 22:09 |
fungi | but can at least bootstrap a more general-purpose source at that point | 22:09 |
sean-k-mooney | mikal: so injecting things into the tpm would generally not be considered secure | 22:13 |
sean-k-mooney | i know you could add a root of trust to the tpm | 22:14 |
sean-k-mooney | but if we support that someone would ask to be able to lock that out so that only their root of trust from barbican could be injected | 22:14 |
sean-k-mooney | that would not be so terrible i guess.. allow them to upload something to barbican and pass a reference to it when booting and have nova inject it into the tpm using the user's token to retrieve it | 22:16 |
mikal | sean-k-mooney: noting that I've never managed to find a good guide to TPM stuff and am therefore a noob, why would the hypervisor generating a unique certificate that identifies the instance and storing it in the TPM, so the instance can sign things but not actually get to the certificate, not be secure? Isn't that almost exactly what the | 22:16 |
mikal | firmware is doing, just with a CA for the cloud instead of a CA for just the TPM? | 22:16 |
mikal | sean-k-mooney: understanding TPMs is also on my todo list of things I'll never get around to. | 22:17 |
sean-k-mooney | the main thing to understand today is we can't live migrate with tpms today because we encrypt them with a secret stored in barbican that only the user that created the vm can access :) so tpms equal pain | 22:18 |
sean-k-mooney | we could likely generate a cert to "sign" or attest the identity of the vm, and maybe put that in the config drive or tpm for you to attest | 22:19 |
sean-k-mooney | but i'm not sure we could use that for auth | 22:19 |
sean-k-mooney | that was one of the jwt proposals that was made not too long ago: inject a jwt token purely for attestation of the identity of the server but not for actual keystone auth usage | 22:20 |
sean-k-mooney | mikal: https://github.com/bbc/nova/commit/382984a3a23032c96089cb5877a55e425db7cee4 | 22:21 |
mikal | sean-k-mooney: so that injected JWT has a name in SPIFFE that I've forgotten, but it's definitely a thing they do. Then again, how do you handle the JWT expiring for long lived VMs? | 22:25 |
sean-k-mooney | mikal: no idea, the bbc implementation was just an experiment and it never landed upstream | 22:27 |
mikal | https://kasm.stillhq.com/#/session/9606cb74-10e3-4e1e-8990-0779495f79dd -- SPIFFE calls it a SVID apparently. | 22:28 |
mikal | Ugh, wrong url. | 22:30 |
mikal | https://github.com/spiffe/spiffe/blob/main/standards/JWT-SVID.md this one. | 22:30 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!