*** elvira1 is now known as elvira | 09:24 | |
*** sfinucan is now known as stephenfin | 10:23 | |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Restore max-servers in rax-dfw https://review.opendev.org/c/openstack/project-config/+/945707 | 13:58 |
dansmith | So, we'd like to test something with viommu turned on, which requires setting a metadata item on the images we upload to our providers | 14:14 |
dansmith | we don't think there's any harm in just setting it globally and only using it when we want | 14:14 |
dansmith | what's the preference here.. would infra prefer we add a new image, nodeset, etc and use that for a while? | 14:15 |
dansmith | I don't know the overhead of doing that (seems like if everyone has their own images and nodesets that gets out of hand quickly) | 14:15 |
dansmith | well, also, it should only be set on kvm instances, so not rax | 14:17 |
fungi | dansmith: it looks like nodepool has that ability, though i don't see that we're taking advantage of it in our current configuration: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/builder.py#L1219 https://opendev.org/openstack/project-config/src/branch/master/nodepool/nodepool.yaml#L191 | 14:43 |
dansmith | fungi: what ability? | 14:43 |
fungi | the ability to set arbitrary image metadata at upload | 14:44 |
dansmith | https://github.com/openstack/project-config/blob/master/nodepool/nb07.opendev.org.yaml#L26-L29 | 14:44 |
dansmith | we are for arm ^ | 14:44 |
fungi | aha! i should have looked there | 14:44 |
dansmith | those are separate by nature though | 14:44 |
fungi | yeah, so it does appear we can set it on specific images for specific providers | 14:44 |
fungi | aha, found the right documentation for it, i was looking in the wrong section earlier: https://zuul-ci.org/docs/nodepool/latest/openstack.html | 14:45 |
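For reference, the nodepool builder mechanism being discussed looks roughly like the sketch below: a per-provider diskimages entry carrying a `meta` block, modeled on the nb07 arm builder example linked above. The `hw_viommu_model` property name is an assumption taken from the nova image-property docs, not from the eventual change itself.

```yaml
# Sketch only: per-provider image upload metadata in a nodepool builder
# config, modeled on the arm builder example referenced above. The property
# name (hw_viommu_model) is assumed from the nova docs; values illustrative.
providers:
  - name: some-provider
    diskimages:
      - name: ubuntu-noble
        meta:
          hw_viommu_model: auto
```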
dansmith | so is it best to just re-upload the noble image to vexx with separate metadata, as a test at least? | 14:45 |
fungi | something like that, though it depends on whether vexxhost is supplying the node labels you expect to test with | 14:46 |
dansmith | well, I am a bit overwhelmed with the complexity here, | 14:46 |
dansmith | but I thought new image with meta, new label, new nodeset(?), change a job to use the new nodeset | 14:47 |
dansmith | or, we just change the vexx base ubuntu-noble image to always set that metadata, | 14:47 |
dansmith | which we're pretty sure is not going to impact anything that doesn't otherwise care | 14:47 |
dansmith | but if we need to be able to target only that image on vexx, maybe the extra node stuff is necessary? | 14:48 |
dansmith | I definitely need guidance | 14:48 |
fungi | the reason i say that is in vexxhost we really only have some less-used labels: https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl07.opendev.org.yaml#L140-L210 | 14:48 |
fungi | mostly due to the absence of flavors which closely match our standard | 14:48 |
fungi | so if you're going to be running a job on nested-virt-ubuntu-noble or ubuntu-noble-32GB nodes then vexxhost could be a fit for that | 14:49 |
fungi | but we don't have the basic ubuntu-noble label there | 14:50 |
dansmith | oh okay, the nested-virt-ubuntu-noble would be a good one to try I think, because viommu is sort of related | 14:50 |
fungi | also doing it there does limit the blast radius even more | 14:50 |
dansmith | I'm not sure I want that ultimately, because nested virt is more unstable, but it would be a good thing to try, | 14:50 |
dansmith | and then if it all looks good we could move it to more default | 14:51 |
fungi | infra-root: ^ opinions on this experiment? | 14:51 |
dansmith | fungi: surely we run regular ubuntu-noble on vexx somewhere, right? I'm not sure what you meant by that | 14:51 |
fungi | i need to prep to jump into a meeting in just a few minutes, but can look closer in an hour-ish | 14:51 |
fungi | dansmith: we do not, no | 14:51 |
dansmith | ...why? | 14:51 |
fungi | repeating: mostly due to the absence of flavors which closely match our standard | 14:52 |
fungi | too much ram for a sufficient processor count/speed | 14:52 |
dansmith | oh, I thought that's why we min-ram them | 14:52 |
dansmith | sorry, I didn't process that when you said it the first time | 14:52 |
fungi | min-ram says "the flavor must have at least this much ram" | 14:52 |
fungi | the v3-standard-8 flavor there has 32gb though | 14:53 |
dansmith | heh, well of course. I said that because I'm looking at min-ram here and my brain is small. I meant.. I thought we capped those high ram ones to be like the others | 14:53 |
fungi | which is why we also label with *-32GB on the same flavor | 14:53 |
fungi | the 8gb ram flavors there have like 1 or 2 vcpus | 14:54 |
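As a concrete illustration of the min-ram point: a nodepool pool label like the hypothetical sketch below sets a floor in MiB, not a cap, so nodepool picks the smallest flavor that satisfies it, and in vexxhost the smallest flavor with a usable vcpu count happens to carry 32GB of RAM.

```yaml
# Hypothetical pool-label sketch: min-ram is a minimum (in MiB), not a cap,
# so a 32GB flavor still satisfies an 8GB request if nothing smaller fits.
labels:
  - name: ubuntu-noble
    diskimage: ubuntu-noble
    min-ram: 8192
```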
dansmith | but we *could* boot the 32gb ones with mem= on the kernel command line to just artificially keep them like the others | 14:55 |
dansmith | anyway, this is unrelated to my actual goal | 14:56 |
dansmith | I'll push something up for discussion after your meeting | 14:56 |
fungi | yes, we used to do that, but it required modifying the images we upload | 14:57 |
dansmith | (at which time I will have my own meeting of course) | 14:57 |
dansmith | yeah | 14:57 |
fungi | which then meant we needed a separate image for each ram limit | 14:57 |
fungi | which is a massive matrix explosion for the images we're managing and transferring | 14:57 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 15:00 |
dansmith | ack | 15:00 |
dansmith | systemd/cgroups at runtime might be close enough and much easier, but fair enough | 15:00 |
*** haleyb_ is now known as haleyb | 15:06 | |
clarkb | what does viommu_model auto affect? I'm wondering what the impact may be on the other labels using the same image upload | 15:13 |
clarkb | looks like you indicate there should be no harm setting it globally | 15:13 |
clarkb | but I don't know what that causes to happen | 15:13 |
clarkb | https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#virtual-iommu-support this says there may be significant impacts | 15:16 |
fungi | also is there a vintage of glance/nova where it is either unavailable or will break entirely? | 15:17 |
fungi | some of our kvm-based providers may still be on comparatively old openstack versions | 15:18 |
opendevreview | Merged openstack/project-config master: Restore max-servers in rax-dfw https://review.opendev.org/c/openstack/project-config/+/945707 | 15:18 |
clarkb | looking at those docs it talks about q35 and arm. I think vexxhost is amd. Not sure if that matters either | 15:19 |
fungi | yeah, i was about to ask whether "intel" literally means intel-brand architecture or x86_64 more generally | 15:20 |
clarkb | in any case I'm happy to experiment particularly with those more specific experimental resources. I just want to understand what this implies/means here | 15:20 |
fungi | a lot of our providers are amd-based, for cost reasons | 15:21 |
clarkb | oh also noble image builds are paused | 15:22 |
dansmith | clarkb: we think the "significant performance impact" means "compared to using the actual hardware IOMMU" | 15:22 |
fungi | the first "note" in that section does talk about amd viommu at least | 15:22 |
clarkb | dansmith: but still at least as good or better than no iommu? | 15:22 |
dansmith | I have it set locally for testing and haven't noticed any real impact (like, I was surprised to read that) | 15:22 |
dansmith | clarkb: again, we believe the viommu is only used for things that are specifically attached to it, not anything else, so it's not like this replaces something that is currently present with something else, | 15:23 |
dansmith | it adds another, which allows for pci passthrough of a device, if you ask for it, but we don't think it has any effect on the regular emulated (i.e. virtio) devices | 15:23 |
clarkb | ok I didn't write the docs and they don't really say anything about that, just that your performance might tank | 15:23 |
fungi | so at worst it's an added capability that some software/drivers may automatically engage when they see it's present | 15:23 |
clarkb | honestly, from what you are saying those docs should be rewritten; they are buggy if what you say is true | 15:24 |
dansmith | clarkb: yeah, neither did I :) | 15:24 |
clarkb | if this requires explicit requests then how are your jobs going to request those things? | 15:24 |
dansmith | clarkb: the reason for doing it on one instead of just slapping it on all of them is because of that statement, just to be cautious | 15:24 |
dansmith | clarkb: if the instance we run devstack in has this enabled, then we'd be able to do everything we need with just devstack post config on the host, a flavor tweak, etc | 15:25 |
clarkb | right so all of our noble instances in vexxhost would be affected? | 15:25 |
clarkb | in that case do we need a new label? | 15:26 |
clarkb | I suppose the new label allows you to avoid nested virt in openmetal and ovh | 15:26 |
dansmith | I dunno why you say that | 15:26 |
clarkb | dansmith: the attribute you are changing affects the ubuntu-noble image that is uploaded to vexxhost. Then every ubuntu-noble label in that region uses that singular image | 15:26 |
dansmith | what I mean is a job that wants to could configure some of the host pci devices to be available for passthrough, and configure the flavor (either in devstack or a tempest test) to request that device | 15:27 |
clarkb | this means that even though you're assigning a specific label to this functionality, every ubuntu-noble instance booted in vexxhost with that attribute set will be affected | 15:27 |
dansmith | but it will not affect jobs that don't do that setup | 15:27 |
dansmith | sorry, I thought you were still talking about the "how would you request" question | 15:27 |
clarkb | wait this is on the devstack cloud side not between vexxhost and the devstack host? | 15:27 |
dansmith | yes, this will mean viommu is enabled on all the vexxhost nobles, as I understand it | 15:27 |
clarkb | I think my confusion is around what this actually changes for an instance that I boot | 15:28 |
clarkb | if we ignore the special testing that you want to do, what changes for everything else? | 15:28 |
dansmith | okay I'm typing as fast as I can man | 15:28 |
dansmith | you asked about how we're going to request the things and I was telling you that devstack config is basically how that happens | 15:29 |
dansmith | in order to be able to configure devstack to enable pci passthrough for the tempest test instances, the "host" has to have an iommu | 15:29 |
clarkb | but how is that possible if you are requesting pci passthrough to the host? you need a flavor from vexxhost with devices that are passed through right? But I think I've caught up and you're saying this extra feature allows you to pci passthrough the virtual devices that devstack presents to its VMs in the nested cloud | 15:29 |
dansmith | in order for the openstack instance we run devstack in to have an iommu, the provider needs to configure us a virtual one, which is what this metadata thing instructs _their_ nova to do | 15:30 |
dansmith | no | 15:30 |
dansmith | we can pass through virtual devices that have nothing to do with the actual host's physical devices | 15:30 |
dansmith | like the vRNG we get, or the virtual QXL display adapter, etc | 15:30 |
clarkb | but doing so requires a virtual iommu device be allocated in the test instance. This is the important detail. And as long as you aren't using that iommu for passthrough, impact is expected to be minimal/nil | 15:31 |
dansmith | those are fully virtual devices we always have and don't really need, which we can use as donor devices for a flavor with a pci device request in it | 15:31 |
dansmith | correct | 15:31 |
clarkb | ok I think I've caught up then. I agree that seems safe. And having a specific label allows you to avoid the lack of functionality in openmetal and ovh etc for now | 15:32 |
dansmith | it doesn't need to be nested kvm either, but nested in general (like all of ours are) | 15:32 |
clarkb | if vexxhost doesn't support booting with viommu, does setting that flag cause all boots to start failing? or will nova just ignore it? | 15:32 |
dansmith | I don't know of any policy we have that lets you stop people from requesting it, so I probably can't answer that if it's something vexxhost has hacked in | 15:33 |
clarkb | I mean more if their nova is too old to support the functionality or if their amd cpus don't support it (the docs imply it might be intel cpu specific but hard to say) | 15:33 |
clarkb | so not a policy thing but a lack of support | 15:33 |
clarkb | if that does break rolling it back is fairly easy though so not a huge deal if the unexpected happens there | 15:34 |
dansmith | I think that the nova required to support this requires a long-since-new-enough libvirt and kvm such that it's not a problem.. if their nova is old, it should just be ignored since it doesn't know about it | 15:35 |
dansmith | this is a smallish subset of the major workload though right, so .. yeah, seems like a reasonable risk to take | 15:35 |
dansmith | if not, can we just create another image to test with? | 15:35 |
clarkb | ya we could try openmetal or similar | 15:36 |
clarkb | raxflex | 15:36 |
dansmith | per usual I'm not trying to cause trouble, I'm just trying to get some test coverage for things people usually say "meh this is too hard to test" ... and thus this has gone untested except by hand for a long time | 15:36 |
clarkb | starting with vexxhost is good though they have historically been best at keeping stuff up to date | 15:36 |
clarkb | dansmith: I have one mechanical fix posted to the change. | 15:37 |
clarkb | The other thing is noble images aren't building right now due to the ipv6 kernel bug | 15:37 |
dansmith | ack | 15:37 |
clarkb | we should check if that has been fixed and unpause if so | 15:37 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 15:38 |
dansmith | there's no major rush to get this done, I'm just trying to get the waterfowl aligned so it can be done | 15:39 |
clarkb | https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2104134 | 15:39 |
clarkb | this is the kernel issue that we filed. Not sure if ubuntu then has a different tracking system for kernel things | 15:39 |
dansmith | week of the 14th of april the comment says | 15:40 |
clarkb | dansmith: noticed one more mechanical thing. Sorry, should've caught that on the first pass | 15:55 |
dansmith | ack, gotta go to a meeting but will work on that after | 15:57 |
dansmith | clarkb: so I was copying the arm image overrides earlier in the file | 17:12 |
dansmith | do those work for osuosl for some reason that makes these not work for vexx? | 17:12 |
dansmith | wait, maybe I'm confusing something | 17:13 |
dansmith | oh, nb vs nl | 17:14 |
dansmith | tbh, I don't even know why these files have hostnames in them so I'm probably missing plenty of context | 17:14 |
fungi | yeah, the builders are the servers building images, they use the generic nodepool.yaml file in that directory, except the arm builder which has its own hostname-specific one | 17:14 |
dansmith | so I can expand the one in nodepool.yaml without breaking it out into its own file? | 17:16 |
fungi | they can and do upload images to all the providers | 17:16 |
fungi | dansmith: there's a vexxhost ca-ymq-1 section | 17:16 |
dansmith | right, I'm asking procedurally if it's allowed | 17:16 |
fungi | oh, i see what you're asking | 17:16 |
fungi | the yaml anchor | 17:16 |
fungi | yeah, copy from the &provider_diskimages anchor version in ovh-bhs1 and replace the *provider_diskimages reference under vexxhost-ca-ymq-1 | 17:18 |
fungi | similar to what's been done in openmetal-iad3 | 17:18 |
dansmith | that's not what I'm asking but you're answering my question by suggesting it | 17:18 |
fungi | oh, now i get your question. no, don't copy the nodepool.yaml file to any host-specific name | 17:19 |
dansmith | thanks :) | 17:19 |
fungi | just differentiate the diskimages list under the intended provider/region section | 17:19 |
clarkb | nodepool.yaml is the default file if we don't have a more specific override. nb05 and nb06 are our x86_64 builders and they don't have hostname-specific files, so they use the default | 17:19 |
clarkb | and yes, just edit the default file (nodepool.yaml), that's fine | 17:20 |
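Putting the advice together, the nodepool.yaml edit being described amounts to expanding the `*provider_diskimages` reference under vexxhost-ca-ymq-1 while other providers keep the shared anchor, roughly as in this abbreviated sketch (the image list is trimmed and the metadata key is assumed from the nova docs):

```yaml
# Abbreviated sketch of the anchor/override pattern described above.
# The real file lists many more images; the meta key is an assumption.
providers:
  - name: ovh-bhs1
    diskimages: &provider_diskimages
      - name: ubuntu-noble
      # ... other shared images ...
  - name: rax-dfw
    diskimages: *provider_diskimages
  - name: vexxhost-ca-ymq-1
    diskimages:
      - name: ubuntu-noble
        meta:
          hw_viommu_model: auto
      # ... remaining images copied from the shared list ...
```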
dansmith | so the label moves as well? | 17:20 |
dansmith | er, no, | 17:20 |
dansmith | looks like that's not in nodepool.yaml? | 17:20 |
fungi | &something is an anchor defining a reusable object, *something is a reference that causes the yaml parser to substitute the content from the &something anchor when parsing the file | 17:21 |
dansmith | fungi: I'm aware thanks :) | 17:21 |
fungi | okay, so what's not in nodepool.yaml? | 17:22 |
dansmith | I'm asking if I need to leave the existing changes to the nl07 file, including the labeling part | 17:22 |
clarkb | dansmith: the label doesn't move, that's all on the "give me a server" side of the house, and nodepool.yaml is only dealing with the "build me an image" portion | 17:22 |
clarkb | dansmith: yes leave your changes in nl07 as is except drop the metadata from the diskimage there | 17:22 |
dansmith | okay | 17:22 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 17:22 |
clarkb | and then we add the metadata to nodepool.yaml | 17:23 |
clarkb | that looks right to me | 17:24 |
dansmith | in order to make sure I run there, I can define my own nodeset looking for that label and make the job use it right? | 17:25 |
fungi | correct, you can even add an anonymous nodeset with that label in your job definition, no need to separately define a nodeset unless you plan to reuse it in multiple jobs | 17:25 |
dansmith | oh okay cool | 17:25 |
clarkb | but yup that is how you get the job to force that location you use the label name in your nodeset (whether an actual nodeset or an anonymous one in the job) | 17:26 |
dansmith | ack | 17:26 |
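For illustration, the anonymous-nodeset approach fungi describes would look something like the hypothetical job sketch below; only the nested-virt-ubuntu-noble label comes from the discussion, and the job and node names are made up.

```yaml
# Hypothetical Zuul job sketch: an inline (anonymous) nodeset pins the job
# to the nested-virt-ubuntu-noble label, which lands it on vexxhost.
- job:
    name: devstack-viommu-experiment
    parent: devstack
    nodeset:
      nodes:
        - name: controller
          label: nested-virt-ubuntu-noble
```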
dansmith | so once this is landed, what does the workflow look like to have it updated? automatic, or on some periodic schedule? | 17:29 |
fungi | lgtm | 17:29 |
fungi | dansmith: the ubuntu-noble images are refreshed ~daily | 17:29 |
clarkb | dansmith: it's periodic, but as mentioned, updates are paused right now | 17:29 |
fungi | oh, right, we need to unpause it | 17:29 |
clarkb | because updating noble breaks a bunch of networking stuff | 17:29 |
dansmith | clarkb: oh yeah I know the pause, I just meant.. what is supposed to happen | 17:29 |
dansmith | I'm not in a huge rush here, so don't unpause it for me if it's not ready (obviously) | 17:30 |
dansmith | the bug mentioned April 14th for the replacement kernel build, so that seems like we aren't ready yet right? | 17:30 |
fungi | once we unpause daily rebuilding of the ubuntu-noble images, the next uploads for them to vexxhost ca-ymq-1 will use that additional metadata | 17:30 |
dansmith | I would make some joke about April 14th being timed well to cause the IRS trouble, but lol, we all know they're using a kernel from the 80s | 17:30 |
fungi | so it'll take effect automatically, probably near-instantly as the images there are increasingly stale so nodepool will resume refreshing them straight away | 17:31 |
fungi | basically an hour-ish after the unpause would be my guess | 17:31 |
clarkb | that sounds about right | 17:32 |
opendevreview | Merged openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 17:38 |