Thursday, 2025-03-27

*** elvira1 is now known as elvira09:24
*** sfinucan is now known as stephenfin10:23
opendevreviewJeremy Stanley proposed openstack/project-config master: Restore max-servers in rax-dfw  https://review.opendev.org/c/openstack/project-config/+/94570713:58
dansmithSo, we'd like to test something with viommu turned on, which requires setting a metadata item on the images we upload to our providers14:14
dansmithwe don't think there's any harm in just setting it globally and only using it when we want14:14
dansmithwhat's the preference here.. would infra prefer we add a new image, nodeset, etc and use that for a while?14:15
dansmithI don't know the overhead of doing that (seems like if everyone has their own images and nodesets that gets out of hand quickly)14:15
dansmithwell, also, it should only be set on kvm instances, so not rax14:17
fungidansmith: it looks like nodepool has that ability, though i don't see that we're taking advantage of it in our current configuration: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/builder.py#L1219 https://opendev.org/openstack/project-config/src/branch/master/nodepool/nodepool.yaml#L19114:43
dansmithfungi: what ability?14:43
fungithe ability to set arbitrary image metadata at upload14:44
dansmithhttps://github.com/openstack/project-config/blob/master/nodepool/nb07.opendev.org.yaml#L26-L2914:44
dansmithwe are for arm ^14:44
fungiaha! i should have looked there14:44
dansmiththose are separate by nature though14:44
fungiyeah, so it does appear we can set it on specific images for specific providers14:44
fungiaha, found the right documentation for it, i was looking in the wrong section earlier: https://zuul-ci.org/docs/nodepool/latest/openstack.html14:45
dansmithso is it best to just re-upload the noble image to vexx with separate metadata, as a test at least?14:45
fungisomething like that, though it depends on whether vexxhost is supplying the node labels you expect to test with14:46
dansmithwell, I14:46
dansmitham a bit overwhelmed with the complexity here, 14:46
dansmithbut I thought new image with meta, new label, new nodeset(?), change a job to use the new nodeset14:47
dansmithor, we just change the vexx base ubuntu-noble image to always set that metadata,14:47
dansmithwhich we're pretty sure is not going to impact anything that doesn't otherwise care14:47
dansmithbut if we need to be able to target only that image on vexx, maybe the extra node stuff is necessary?14:48
dansmithI definitely need guidance14:48
fungithe reason i say that is in vexxhost we really only have some less-used labels: https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl07.opendev.org.yaml#L140-L21014:48
fungimostly due to the absence of flavors which closely match our standard14:48
fungiso if you're going to be running a job on nested-virt-ubuntu-noble or ubuntu-noble-32GB nodes then vexxhost could be a fit for that14:49
fungibut we don't have the basic ubuntu-noble label there14:50
dansmithoh okay, the nested-virt-ubuntu-noble would be a good one to try I think, because viommu is sort of related14:50
fungialso doing it there does limit the blast radius even more14:50
dansmithI'm not sure I want that ultimately, because nested virt is unstable-r, but it would be a good thing to try,14:50
dansmithand then if it all looks good we could move it to more default14:51
fungiinfra-root: ^ opinions on this experiment?14:51
dansmithfungi: surely we run regular ubuntu-noble on vexx somewhere, right? I'm not sure what you meant by that14:51
fungii need to prep to jump into a meeting in just a few minutes, but can look closer in an hour-ish14:51
fungidansmith: we do not, no14:51
dansmith...why?14:51
fungirepeating: mostly due to the absence of flavors which closely match our standard14:52
fungitoo much ram for a sufficient processor count/speed14:52
dansmithoh, I thought that's why we min-ram them14:52
dansmithsorry, I didn't process that when you said it the first time14:52
fungimin-ram says "the flavor must have at least this much ram" 14:52
fungithe v3-standard-8 flavor there has 32gb though14:53
dansmithheh, well of course. I said that because I'm looking at min-ram here and my brain is small. I meant.. I thought we capped those high ram ones to be like the others14:53
fungiwhich is why we also label with *-32GB on the same flavor14:53
fungithe 8gb ram flavors there have like 1 or 2 vcpus14:54
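A minimal Python sketch of the min-ram behaviour fungi describes above, assuming the selection simply picks the smallest flavor whose RAM meets the floor. Only the v3-standard-8 name and its 32 GB of RAM come from the discussion; the `Flavor` class, the `pick_flavor_by_min_ram` helper, the small flavor's name, and all vCPU counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Flavor:
    name: str
    ram_mb: int
    vcpus: int


def pick_flavor_by_min_ram(flavors: list[Flavor], min_ram_mb: int) -> Optional[Flavor]:
    """Return the smallest-RAM flavor with at least min_ram_mb, or None."""
    candidates = [f for f in flavors if f.ram_mb >= min_ram_mb]
    return min(candidates, key=lambda f: f.ram_mb, default=None)


# Hypothetical catalog; vCPU counts are assumptions for illustration.
flavors = [
    Flavor("hypothetical-small", ram_mb=8192, vcpus=2),
    Flavor("v3-standard-8", ram_mb=32768, vcpus=8),
]

# min-ram is only a floor: an 8 GB request lands on the low-vCPU flavor,
# and nothing here caps the RAM of the 32 GB flavor.
chosen = pick_flavor_by_min_ram(flavors, min_ram_mb=8000)
```

Under these assumptions an 8 GB floor selects the flavor with too few vCPUs, which matches the reason given for exposing the 32 GB flavor under separate *-32GB labels instead.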
dansmithbut we *could* boot the 32gb ones with mem= on the kernel command line to just artificially keep them like the others14:55
dansmithanyway, this is unrelated to my actual goal14:56
dansmithI'll push something up for discussion after your meeting14:56
fungiyes, we used to do that, but it required modifying the images we upload14:57
dansmith(at which time I will have my own meeting of course)14:57
dansmithyeah14:57
fungiwhich then meant we needed a separate image for each ram limit14:57
fungiwhich is a massive matrix explosion for the images we're managing and transferring14:57
opendevreviewDan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble  https://review.opendev.org/c/openstack/project-config/+/94571515:00
dansmithack15:00
dansmithsystemd/cgroups at runtime might be close enough and much easier, but fair enough15:00
*** haleyb_ is now known as haleyb15:06
clarkbwhat does viommu_model auto affect? I'm wondering what the impact may be on the other labels using the same image upload15:13
clarkblooks like you indicate there should be no harm setting it globally15:13
clarkbbut I don't know what that causes to happen15:13
clarkbhttps://docs.openstack.org/nova/latest/admin/pci-passthrough.html#virtual-iommu-support this says there may be significant impacts15:16
fungialso is there a vintage of glance/nova where it is either unavailable or will break entirely?15:17
fungisome of our kvm-based providers may still be on comparatively old openstack versions15:18
opendevreviewMerged openstack/project-config master: Restore max-servers in rax-dfw  https://review.opendev.org/c/openstack/project-config/+/94570715:18
clarkblooking at those docs it talks about q35 and arm. I think vexxhost is amd. Not sure if that matters either15:19
fungiyeah, i was about to ask whether "intel" literally means intel-brand architecture or x86_64 more generally15:20
clarkbin any case I'm happy to experiment particularly with those more specific experimental resources. I just want to understand what this implies/means here15:20
fungia lot of our providers are amd-based, for cost reasons15:21
clarkboh also noble image builds are paused15:22
dansmithclarkb: we think the "significant performance impact" means "compared to using the actual hardware IOMMU"15:22
fungithe first "note" in that section does talk about amd viommu at least15:22
clarkbdansmith: but still at least as good or better than no iommu?15:22
dansmithI have it set locally for testing and haven't noticed any real impact (like, I was surprised to read that)15:22
dansmithclarkb: again, we believe the viommu is only used for things that are specifically attached to it, not anything else, so it's not like this replaces something that is currently present with something else,15:23
dansmithit adds another, which allows for pci passthrough of a device, if you ask for it, but we don't think it has any effect on the regular emulated (i.e. virtio) devices15:23
clarkbok I didn't write the docs and they don't really say anything about that, just that your performance might tank15:23
fungiso at worst it's an added capability that some software/drivers may automatically engage when they see it's present15:23
clarkbhonestly, from what you are saying those docs should be rewritten; they are buggy if what you say is true15:24
dansmithclarkb: yeah, neither did I :)15:24
clarkbif this requires explicit requests then how are your jobs going to request those things?15:24
dansmithclarkb: the reason for doing it on one instead of just slapping it on all of them is because of that statement, just to be cautious15:24
dansmithclarkb: if the instance we run devstack in has this enabled, then we'd be able to do everything we need with just devstack post config on the host, a flavor tweak, etc15:25
clarkbright so all of our noble instances in vexxhost would be affected?15:25
clarkbin that case do we need a new label?15:26
clarkbI suppose the new label allows you to avoid nested virt in openmetal and ovh15:26
dansmithI dunno why you say that15:26
clarkbdansmith: the attribute you are changing affects the ubuntu-noble image that is uploaded to vexxhost. Then every ubuntu-noble label in that region uses that singular image15:26
dansmithwhat I mean is a job that wants to could configure some of the host pci devices to be available for passthrough, and configure the flavor (either in devstack or a tempest test) to request that device15:27
clarkbthis means that even though you're assigning a specific label to this functionality, every ubuntu-noble instance booted in vexxhost with that attribute set will be affected15:27
dansmithbut it will not affect jobs that don't do that setup15:27
dansmithsorry, I thought you were still talking about the "how would you request" question15:27
clarkbwait this is on the devstack cloud side not between vexxhost and the devstack host?15:27
dansmithyes, this will mean viommu is enabled on all the vexxhost nobles, as I understand it15:27
clarkbI think my confusion is around what this actually changes for an instance that I boot15:28
clarkbif we ignore the special testing that you want to do. What changes for everything else15:28
dansmithokay I'm typing as fast as I can man15:28
dansmithyou asked about how we're going to request the things and I was telling you that devstack config is basically how that happens15:29
dansmithin order to be able to configure devstack to enable pci passthrough for the tempest test instances, the "host" has to have an iommu15:29
clarkbbut how is that possible if you are requesting pci passthrough to the host? you need a flavor from vexxhost with devices that are passed through right? But I think I've caught up and you're saying this extra feature allows you to pci passthrough the virtual devices that devstack presents to its VMs in the nested cloud15:29
dansmithin order for the openstack instance we run devstack in to have an iommu, the provider needs to configure us a virtual one, which is what this metadata thing instructs _their_ nova to do15:30
dansmithno15:30
dansmithwe can pass through virtual devices that have nothing to do with the actual host's physical devices15:30
dansmithlike the vRNG we get, or the virtual QXL display adapter, etc15:30
clarkbbut doing so requires a virtual iommu device be allocated in the test instance. This is the important detail. And as long as you aren't using that iommu for passthrough, impact is expected to be minimal/nil15:31
dansmiththose are fully virtual devices we always have and don't really need, which we can use as donor devices for a flavor with a pci device request in it15:31
dansmithcorrect15:31
clarkbok I think I've caught up then. I agree that seems safe. And having a specific label allows you to avoid the lack of functionality in openmetal and ovh etc for now15:32
dansmithit doesn't need to be nested kvm either, but nested in general (like all of ours are)15:32
clarkbif vexxhost doesn't support booting with viommu does setting that flag cause all boots to start failing? or will nova just ignore it?15:32
dansmithI don't know of any policy we have that lets you stop people from requesting it, so I probably can't answer that if it's something vexxhost has hacked in15:33
clarkbI mean more if their nova is too old to support the functionality or if their amd cpus don't support it (the docs imply it might be intel cpu specific but hard to say)15:33
clarkbso not a policy thing but a lack of support15:33
clarkbif that does break rolling it back is fairly easy though so not a huge deal if the unexpected happens there15:34
dansmithI think that the nova required to support this requires a long-since-new-enough libvirt and kvm such that it's not a problem.. if their nova is old, it should just be ignored since it doesn't know about it15:35
dansmiththis is a smallish subset of the major workload though right, so .. yeah, seems like a reasonable risk to take15:35
dansmithif not, can we just create another image to test with?15:35
clarkbya we could try openmetal or similar15:36
clarkbraxflex15:36
dansmithper usual I'm not trying to cause trouble, I'm just trying to get some test coverage for things people usually say "meh this is too hard to test" ... and thus this has been uncovered except by hand for a long time15:36
clarkbstarting with vexxhost is good though they have historically been best at keeping stuff up to date15:36
clarkbdansmith: I have one mechanical fix posted to the change.15:37
clarkbThe other thing is noble images aren't building right now due to the ipv6 kernel bug15:37
dansmithack15:37
clarkbwe should check if that has been fixed and unpause if so15:37
opendevreviewDan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble  https://review.opendev.org/c/openstack/project-config/+/94571515:38
dansmiththere's no major rush to get this done, I'm just trying to get the waterfowl aligned so it can be done15:39
clarkbhttps://bugs.launchpad.net/ubuntu/+source/linux/+bug/210413415:39
clarkbthis is the kernel issue that we filed. Not sure if ubuntu then has a different tracking system for kernel things15:39
dansmithweek of the 14th of april the comment says15:40
clarkbdansmith: noticed one more mechanical thing. Sorry, should've caught that on the first pass15:55
dansmithack, gotta go to a meeting but will work on that after15:57
dansmithclarkb: so I was copying the arm image overrides earlier in the file17:12
dansmithdo those work for osuosl for some reason that makes these not work for vexx?17:12
dansmithwait, maybe I'm confusing something17:13
dansmithoh, nb vs nl17:14
dansmithtbh, I don't even know why these files have hostnames in them so I'm probably missing plenty of context17:14
fungiyeah, the builders are the servers building images, they use the generic nodepool.yaml file in that directory, except the arm builder which has its own hostname-specific one17:14
dansmithso I can expand the one in nodepool.yaml without breaking it out into its own file?17:16
fungithey can and do upload images to all the providers17:16
fungidansmith: there's a vexxhost ca-ymq-1 section17:16
dansmithright, I'm asking procedurally if it's allowed17:16
fungioh, i see what you're asking17:16
fungithe yaml anchor17:16
fungiyeah, copy from the &provider_diskimages anchor version in ovh-bhs1 and replace the *provider_diskimages reference under vexxhost-ca-ymq-117:18
fungisimilar to what's been done in openmetal-iad317:18
dansmiththat's not what I'm asking but you're answering my question by suggesting it17:18
fungioh, now i get your question. no, don't copy the nodepool.yaml file to any host-specific name17:19
dansmiththanks :)17:19
fungijust differentiate the diskimages list under the intended provider/region section17:19
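The anchor change fungi describes can be pictured in plain Python. This is only an analogy for the YAML, not the actual project-config content: the second image name is hypothetical, and the `meta` key and the `hw_viommu_model: auto` pair are assumptions inferred from the "viommu_model auto" setting mentioned in the discussion.

```python
import copy

# Stand-in for the list defined once under the &provider_diskimages anchor.
# Image names other than ubuntu-noble are hypothetical.
provider_diskimages = [
    {"name": "ubuntu-noble"},
    {"name": "hypothetical-other-image"},
]

# A *provider_diskimages reference reuses the anchored content as-is,
# so this provider cannot carry its own per-image metadata.
ovh_bhs1 = {"name": "ovh-bhs1", "diskimages": provider_diskimages}

# Differentiating one provider means giving it its own list instead of the reference.
vexxhost_images = copy.deepcopy(provider_diskimages)
for image in vexxhost_images:
    if image["name"] == "ubuntu-noble":
        # Key and value are assumptions based on the discussion.
        image["meta"] = {"hw_viommu_model": "auto"}

vexxhost = {"name": "vexxhost-ca-ymq-1", "diskimages": vexxhost_images}
```

The point of the copy is that only the vexxhost upload of ubuntu-noble gains the extra metadata, while every provider still using the shared list is left unchanged.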
clarkbnodepool.yaml is the default file if we don't have a more specific override. nb05 and nb06 are our x86_64 builders and they don't have hostname specific files so use the default17:19
clarkband yes just edit the default file (nodepool.yaml), that's fine17:20
dansmithso the label moves as well?17:20
dansmither, no, 17:20
dansmithlooks like that's not in nodepool.yaml?17:20
fungi&something is an anchor defining a reusable object, *something is a reference that causes the yaml parser to substitute the content from the &something anchor when parsing the file17:21
dansmithfungi: I'm aware thanks :)17:21
fungiokay, so what's not in nodepool.yaml?17:22
dansmithI'm asking if I need to leave the existing changes to the nl07 file, including the labeling part17:22
clarkbdansmith: the label doesn't move; that's all on the "give me a server" side of the house, and nodepool.yaml only deals with the "build me an image" portion17:22
clarkbdansmith: yes leave your changes in nl07 as is except drop the metadata from the diskimage there17:22
dansmithokay17:22
opendevreviewDan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble  https://review.opendev.org/c/openstack/project-config/+/94571517:22
clarkband then we add the metadata to nodepool.yaml17:23
clarkbthat looks right to me17:24
dansmithin order to make sure I run there, I can define my own nodeset looking for that label and make the job use it right?17:25
fungicorrect, you can even add an anonymous nodeset with that label in your job definition, no need to separately define a nodeset unless you plan to reuse it in multiple jobs17:25
dansmithoh okay cool17:25
clarkbbut yup that is how you get the job to force that location you use the label name in your nodeset (whether an actual nodeset or an anonymous one in the job)17:26
dansmithack17:26
dansmithso once this is landed what does the workflow look like to have it updated? automatic or some periodic?17:29
fungilgtm17:29
fungidansmith: the ubuntu-noble images are refreshed ~daily17:29
clarkbdansmith: it's periodic, but as mentioned updates are paused right now17:29
fungioh, right, we need to unpause it17:29
clarkbbecause updating noble breaks a bunch of networking stuff17:29
dansmithclarkb: oh yeah I know the pause, I just meant.. what is supposed to happen17:29
dansmithI'm not in a huge rush here, so don't unpause it for me if it's not ready (obviously)17:30
dansmiththe bug mentioned April 14th for the replacement kernel build, so that seems like we aren't ready yet right?17:30
fungionce we unpause daily rebuilding of the ubuntu-noble images, the next uploads for them to vexxhost ca-ymq-1 will use that additional metadata17:30
dansmithI would make some joke about April 14th being timed well to cause the IRS trouble, but lol, we all know they're using a kernel from the 80s17:30
fungiso it'll take effect automatically, probably near-instantly as the images there are increasingly stale so nodepool will resume refreshing them straight away17:31
fungibasically an hour-ish after the unpause would be my guess17:31
clarkbthat sounds about right17:32
opendevreviewMerged openstack/project-config master: Add viommu support to vexxhost ubuntu-noble  https://review.opendev.org/c/openstack/project-config/+/94571517:38
