*** elvira1 is now known as elvira | 09:24 | |
*** sfinucan is now known as stephenfin | 10:23 | |
opendevreview | Jeremy Stanley proposed openstack/project-config master: Restore max-servers in rax-dfw https://review.opendev.org/c/openstack/project-config/+/945707 | 13:58 |
dansmith | So, we'd like to test something with viommu turned on, which requires setting a metadata item on the images we upload to our providers | 14:14 |
dansmith | we don't think there's any harm in just setting it globally and only using it when we want | 14:14 |
dansmith | what's the preference here.. would infra prefer we add a new image, nodeset, etc and use that for a while? | 14:15 |
dansmith | I don't know the overhead of doing that (seems like if everyone has their own images and nodesets that gets out of hand quickly) | 14:15 |
dansmith | well, also, it should only be set on kvm instances, so not rax | 14:17 |
fungi | dansmith: it looks like nodepool has that ability, though i don't see that we're taking advantage of it in our current configuration: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/builder.py#L1219 https://opendev.org/openstack/project-config/src/branch/master/nodepool/nodepool.yaml#L191 | 14:43 |
dansmith | fungi: what ability? | 14:43 |
fungi | the ability to set arbitrary image metadata at upload | 14:44 |
dansmith | https://github.com/openstack/project-config/blob/master/nodepool/nb07.opendev.org.yaml#L26-L29 | 14:44 |
dansmith | we are for arm ^ | 14:44 |
fungi | aha! i should have looked there | 14:44 |
dansmith | those are separate by nature though | 14:44 |
fungi | yeah, so it does appear we can set it on specific images for specific providers | 14:44 |
fungi | aha, found the right documentation for it, i was looking in the wrong section earlier: https://zuul-ci.org/docs/nodepool/latest/openstack.html | 14:45 |
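For reference, the nodepool builder mechanism being discussed looks roughly like the sketch below: a per-provider diskimages entry carrying a `meta` block, modeled on the nb07 arm builder example linked above. The `hw_viommu_model` property name is an assumption taken from the nova image-property docs, not from the eventual change itself.

```yaml
# Sketch only: per-provider image upload metadata in a nodepool builder
# config, modeled on the arm builder example referenced above. The property
# name (hw_viommu_model) is assumed from the nova docs; values illustrative.
providers:
  - name: some-provider
    diskimages:
      - name: ubuntu-noble
        meta:
          hw_viommu_model: auto
```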
dansmith | so is it best to just re-upload the noble image to vexx with separate metadata, as a test at least? | 14:45 |
fungi | something like that, though it depends on whether vexxhost is supplying the node labels you expect to test with | 14:46 |
dansmith | well, I am a bit overwhelmed with the complexity here, | 14:46 |
dansmith | but I thought new image with meta, new label, new nodeset(?), change a job to use the new nodeset | 14:47 |
dansmith | or, we just change the vexx base ubuntu-noble image to always set that metadata, | 14:47 |
dansmith | which we're pretty sure is not going to impact anything that doesn't otherwise care | 14:47 |
dansmith | but if we need to be able to target only that image on vexx, maybe the extra node stuff is necessary? | 14:48 |
dansmith | I definitely need guidance | 14:48 |
fungi | the reason i say that is in vexxhost we really only have some less-used labels: https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl07.opendev.org.yaml#L140-L210 | 14:48 |
fungi | mostly due to the absence of flavors which closely match our standard | 14:48 |
fungi | so if you're going to be running a job on nested-virt-ubuntu-noble or ubuntu-noble-32GB nodes then vexxhost could be a fit for that | 14:49 |
fungi | but we don't have the basic ubuntu-noble label there | 14:50 |
dansmith | oh okay, the nested-virt-ubuntu-noble would be a good one to try I think, because viommu is sort of related | 14:50 |
fungi | also doing it there does limit the blast radius even more | 14:50 |
dansmith | I'm not sure I want that ultimately, because nested virt is more unstable, but it would be a good thing to try, | 14:50 |
dansmith | and then if it all looks good we could move it to more default | 14:51 |
fungi | infra-root: ^ opinions on this experiment? | 14:51 |
dansmith | fungi: surely we run regular ubuntu-noble on vexx somewhere, right? I'm not sure what you meant by that | 14:51 |
fungi | i need to prep to jump into a meeting in just a few minutes, but can look closer in an hour-ish | 14:51 |
fungi | dansmith: we do not, no | 14:51 |
dansmith | ...why? | 14:51 |
fungi | repeating: mostly due to the absence of flavors which closely match our standard | 14:52 |
fungi | too much ram for a sufficient processor count/speed | 14:52 |
dansmith | oh, I thought that's why we min-ram them | 14:52 |
dansmith | sorry, I didn't process that when you said it the first time | 14:52 |
fungi | min-ram says "the flavor must have at least this much ram" | 14:52 |
fungi | the v3-standard-8 flavor there has 32gb though | 14:53 |
dansmith | heh, well of course. I said that because I'm looking at min-ram here and my brain is small. I meant.. I thought we capped those high ram ones to be like the others | 14:53 |
fungi | which is why we also label with *-32GB on the same flavor | 14:53 |
fungi | the 8gb ram flavors there have like 1 or 2 vcpus | 14:54 |
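As a concrete illustration of the min-ram point: a nodepool pool label like the hypothetical sketch below sets a floor in MiB, not a cap, so nodepool picks the smallest flavor that satisfies it, and in vexxhost the smallest flavor with a usable vcpu count happens to carry 32GB of RAM.

```yaml
# Hypothetical pool-label sketch: min-ram is a minimum (in MiB), not a cap,
# so a 32GB flavor still satisfies an 8GB request if nothing smaller fits.
labels:
  - name: ubuntu-noble
    diskimage: ubuntu-noble
    min-ram: 8192
```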
dansmith | but we *could* boot the 32gb ones with mem= on the kernel command line to just artificially keep them like the others | 14:55 |
dansmith | anyway, this is unrelated to my actual goal | 14:56 |
dansmith | I'll push something up for discussion after your meeting | 14:56 |
fungi | yes, we used to do that, but it required modifying the images we upload | 14:57 |
dansmith | (at which time I will have my own meeting of course) | 14:57 |
dansmith | yeah | 14:57 |
fungi | which then meant we needed a separate image for each ram limit | 14:57 |
fungi | which is a massive matrix explosion for the images we're managing and transferring | 14:57 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 15:00 |
dansmith | ack | 15:00 |
dansmith | systemd/cgroups at runtime might be close enough and much easier, but fair enough | 15:00 |
*** haleyb_ is now known as haleyb | 15:06 | |
clarkb | what does viommu_model auto affect? I'm wondering what the impact may be on the other labels using the same image upload | 15:13 |
clarkb | looks like you indicate there should be no harm setting it globally | 15:13 |
clarkb | but I don't know what that causes to happen | 15:13 |
clarkb | https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#virtual-iommu-support this says there may be significant impacts | 15:16 |
fungi | also is there a vintage of glance/nova where it is either unavailable or will break entirely? | 15:17 |
fungi | some of our kvm-based providers may still be on comparatively old openstack versions | 15:18 |
opendevreview | Merged openstack/project-config master: Restore max-servers in rax-dfw https://review.opendev.org/c/openstack/project-config/+/945707 | 15:18 |
clarkb | looking at those docs it talks about q35 and arm. I think vexxhost is amd. Not sure if that matters either | 15:19 |
fungi | yeah, i was about to ask whether "intel" literally means intel-brand architecture or x86_64 more generally | 15:20 |
clarkb | in any case I'm happy to experiment particularly with those more specific experimental resources. I just want to understand what this implies/means here | 15:20 |
fungi | a lot of our providers are amd-based, for cost reasons | 15:21 |
clarkb | oh also noble image builds are paused | 15:22 |
dansmith | clarkb: we think the "significant performance impact" means "compared to using the actual hardware IOMMU" | 15:22 |
fungi | the first "note" in that section does talk about amd viommu at least | 15:22 |
clarkb | dansmith: but still at least as good or better than no iommu? | 15:22 |
dansmith | I have it set locally for testing and haven't noticed any real impact (like, I was surprised to read that) | 15:22 |
dansmith | clarkb: again, we believe the viommu is only used for things that are specifically attached to it, not anything else, so it's not like this replaces something that is currently present with something else, | 15:23 |
dansmith | it adds another, which allows for pci passthrough of a device, if you ask for it, but we don't think it has any effect on the regular emulated (i.e. virtio) devices | 15:23 |
clarkb | ok I didn't write the docs and they don't really say anything about that, just that your performance might tank | 15:23 |
fungi | so at worst it's an added capability that some software/drivers may automatically engage when they see it's present | 15:23 |
clarkb | honestly, from what you are saying those docs should be rewritten; they are buggy if what you say is true | 15:24 |
dansmith | clarkb: yeah, neither did I :) | 15:24 |
clarkb | if this requires explicit requests then how are your jobs going to request those things? | 15:24 |
dansmith | clarkb: the reason for doing it on one instead of just slapping it on all of them is because of that statement, just to be cautious | 15:24 |
dansmith | clarkb: if the instance we run devstack in has this enabled, then we'd be able to do everything we need with just devstack post config on the host, a flavor tweak, etc | 15:25 |
clarkb | right so all of our noble instances in vexxhost would be affected? | 15:25 |
clarkb | in that case do we need a new label? | 15:26 |
clarkb | I suppose the new label allows you to avoid nested virt in openmetal and ovh | 15:26 |
dansmith | I dunno why you say that | 15:26 |
clarkb | dansmith: the attribute you are changing affects the ubuntu-noble image that is uploaded to vexxhost. Then every ubuntu-noble label in that region uses that singular image | 15:26 |
dansmith | what I mean is a job that wants to could configure some of the host pci devices to be available for passthrough, and configure the flavor (either in devstack or a tempest test) to request that device | 15:27 |
clarkb | this means that even though you're assigning a specific label to this functionality, every ubuntu-noble instance booted in vexxhost with that attribute set will be affected | 15:27 |
dansmith | but it will not affect jobs that don't do that setup | 15:27 |
dansmith | sorry, I thought you were still talking about the "how would you request" question | 15:27 |
clarkb | wait this is on the devstack cloud side not between vexxhost and the devstack host? | 15:27 |
dansmith | yes, this will mean viommu is enabled on all the vexxhost nobles, as I understand it | 15:27 |
clarkb | I think my confusion is around what this actually changes for an instance that I boot | 15:28 |
clarkb | if we ignore the special testing that you want to do, what changes for everything else? | 15:28 |
dansmith | okay I'm typing as fast as I can man | 15:28 |
dansmith | you asked about how we're going to request the things and I was telling you that devstack config is basically how that happens | 15:29 |
dansmith | in order to be able to configure devstack to enable pci passthrough for the tempest test instances, the "host" has to have an iommu | 15:29 |
clarkb | but how is that possible if you are requesting pci passthrough to the host? you need a flavor from vexxhost with devices that are passed through right? But I think I've caught up and you're saying this extra feature allows you to pci passthrough the virtual devices that devstack presents to its VMs in the nested cloud | 15:29 |
dansmith | in order for the openstack instance we run devstack in to have an iommu, the provider needs to configure us a virtual one, which is what this metadata thing instructs _their_ nova to do | 15:30 |
dansmith | no | 15:30 |
dansmith | we can pass through virtual devices that have nothing to do with the actual host's physical devices | 15:30 |
dansmith | like the vRNG we get, or the virtual QXL display adapter, etc | 15:30 |
clarkb | but doing so requires a virtual iommu device be allocated in the test instance. This is the important detail. And as long as you aren't using that iommu for passthrough, impact is expected to be minimal/nil | 15:31 |
dansmith | those are fully virtual devices we always have and don't really need, which we can use as donor devices for a flavor with a pci device request in it | 15:31 |
dansmith | correct | 15:31 |
clarkb | ok I think I've caught up then. I agree that seems safe. And having a specific label allows you to avoid the lack of functionality in openmetal and ovh etc for now | 15:32 |
dansmith | it doesn't need to be nested kvm either, but nested in general (like all of ours are) | 15:32 |
clarkb | if vexxhost doesn't support booting with viommu, does setting that flag cause all boots to start failing? or will nova just ignore it? | 15:32 |
dansmith | I don't know of any policy we have that lets you stop people from requesting it, so I probably can't answer that if it's something vexxhost has hacked in | 15:33 |
clarkb | I mean more if their nova is too old to support the functionality or if their amd cpus don't support it (the docs imply it might be intel cpu specific but hard to say) | 15:33 |
clarkb | so not a policy thing but a lack of support | 15:33 |
clarkb | if that does break rolling it back is fairly easy though so not a huge deal if the unexpected happens there | 15:34 |
dansmith | I think that the nova required to support this requires a long-since-new-enough libvirt and kvm such that it's not a problem.. if their nova is old, it should just be ignored since it doesn't know about it | 15:35 |
dansmith | this is a smallish subset of the major workload though right, so .. yeah, seems like a reasonable risk to take | 15:35 |
dansmith | if not, can we just create another image to test with? | 15:35 |
clarkb | ya we could try openmetal or similar | 15:36 |
clarkb | raxflex | 15:36 |
dansmith | per usual I'm not trying to cause trouble, I'm just trying to get some test coverage for things people usually say "meh this is too hard to test" ... and thus this has gone untested except by hand for a long time | 15:36 |
clarkb | starting with vexxhost is good though they have historically been best at keeping stuff up to date | 15:36 |
clarkb | dansmith: I have one mechanical fix posted to the change. | 15:37 |
clarkb | The other thing is noble images aren't building right now due to the ipv6 kernel bug | 15:37 |
dansmith | ack | 15:37 |
clarkb | we should check if that has been fixed and unpause if so | 15:37 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 15:38 |
dansmith | there's no major rush to get this done, I'm just trying to get the waterfowl aligned so it can be done | 15:39 |
clarkb | https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2104134 | 15:39 |
clarkb | this is the kernel issue that we filed. Not sure if ubuntu then has a different tracking system for kernel things | 15:39 |
dansmith | week of the 14th of april the comment says | 15:40 |
clarkb | dansmith: noticed one more mechanical thing. Sorry, should've caught that on the first pass | 15:55 |
dansmith | ack, gotta go to a meeting but will work on that after | 15:57 |
dansmith | clarkb: so I was copying the arm image overrides earlier in the file | 17:12 |
dansmith | do those work for osuosl for some reason that makes these not work for vexx? | 17:12 |
dansmith | wait, maybe I'm confusing something | 17:13 |
dansmith | oh, nb vs nl | 17:14 |
dansmith | tbh, I don't even know why these files have hostnames in them so I'm probably missing plenty of context | 17:14 |
fungi | yeah, the builders are the servers building images, they use the generic nodepool.yaml file in that directory, except the arm builder which has its own hostname-specific one | 17:14 |
dansmith | so I can expand the one in nodepool.yaml without breaking it out into its own file? | 17:16 |
fungi | they can and do upload images to all the providers | 17:16 |
fungi | dansmith: there's a vexxhost ca-ymq-1 section | 17:16 |
dansmith | right, I'm asking procedurally if it's allowed | 17:16 |
fungi | oh, i see what you're asking | 17:16 |
fungi | the yaml anchor | 17:16 |
fungi | yeah, copy from the &provider_diskimages anchor version in ovh-bhs1 and replace the *provider_diskimages reference under vexxhost-ca-ymq-1 | 17:18 |
fungi | similar to what's been done in openmetal-iad3 | 17:18 |
dansmith | that's not what I'm asking but you're answering my question by suggesting it | 17:18 |
fungi | oh, now i get your question. no, don't copy the nodepool.yaml file to any host-specific name | 17:19 |
dansmith | thanks :) | 17:19 |
fungi | just differentiate the diskimages list under the intended provider/region section | 17:19 |
clarkb | nodepool.yaml is the default file if we don't have a more specific override. nb05 and nb06 are our x86_64 builders and they don't have hostname-specific files, so they use the default | 17:19 |
clarkb | and yes, just edit the default file (nodepool.yaml), that's fine | 17:20 |
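Putting the advice together, the nodepool.yaml edit being described amounts to expanding the `*provider_diskimages` reference under vexxhost-ca-ymq-1 while other providers keep the shared anchor, roughly as in this abbreviated sketch (the image list is trimmed and the metadata key is assumed from the nova docs):

```yaml
# Abbreviated sketch of the anchor/override pattern described above.
# The real file lists many more images; the meta key is an assumption.
providers:
  - name: ovh-bhs1
    diskimages: &provider_diskimages
      - name: ubuntu-noble
      # ... other shared images ...
  - name: rax-dfw
    diskimages: *provider_diskimages
  - name: vexxhost-ca-ymq-1
    diskimages:
      - name: ubuntu-noble
        meta:
          hw_viommu_model: auto
      # ... remaining images copied from the shared list ...
```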
dansmith | so the label moves as well? | 17:20 |
dansmith | er, no, | 17:20 |
dansmith | looks like that's not in nodepool.yaml? | 17:20 |
fungi | &something is an anchor defining a reusable object, *something is a reference that causes the yaml parser to substitute the content from the &something anchor when parsing the file | 17:21 |
dansmith | fungi: I'm aware thanks :) | 17:21 |
fungi | okay, so what's not in nodepool.yaml? | 17:22 |
dansmith | I'm asking if I need to leave the existing changes to the nl07 file, including the labeling part | 17:22 |
clarkb | dansmith: the label doesn't move, that's all on the "give me a server" side of the house, and nodepool.yaml is only dealing with the "build me an image" portion | 17:22 |
clarkb | dansmith: yes leave your changes in nl07 as is except drop the metadata from the diskimage there | 17:22 |
dansmith | okay | 17:22 |
opendevreview | Dan Smith proposed openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 17:22 |
clarkb | and then we add the metadata to nodepool.yaml | 17:23 |
clarkb | that looks right to me | 17:24 |
dansmith | in order to make sure I run there, I can define my own nodeset looking for that label and make the job use it right? | 17:25 |
fungi | correct, you can even add an anonymous nodeset with that label in your job definition, no need to separately define a nodeset unless you plan to reuse it in multiple jobs | 17:25 |
dansmith | oh okay cool | 17:25 |
clarkb | but yup that is how you get the job to force that location you use the label name in your nodeset (whether an actual nodeset or an anonymous one in the job) | 17:26 |
dansmith | ack | 17:26 |
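For illustration, the anonymous-nodeset approach fungi describes would look something like the hypothetical job sketch below; only the nested-virt-ubuntu-noble label comes from the discussion, and the job and node names are made up.

```yaml
# Hypothetical Zuul job sketch: an inline (anonymous) nodeset pins the job
# to the nested-virt-ubuntu-noble label, which lands it on vexxhost.
- job:
    name: devstack-viommu-experiment
    parent: devstack
    nodeset:
      nodes:
        - name: controller
          label: nested-virt-ubuntu-noble
```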
dansmith | so once this is landed, what does the workflow look like to have it updated? automatic, or on some periodic schedule? | 17:29 |
fungi | lgtm | 17:29 |
fungi | dansmith: the ubuntu-noble images are refreshed ~daily | 17:29 |
clarkb | dansmith: it's periodic, but as mentioned, updates are paused right now | 17:29 |
fungi | oh, right, we need to unpause it | 17:29 |
clarkb | because updating noble breaks a bunch of networking stuff | 17:29 |
dansmith | clarkb: oh yeah I know the pause, I just meant.. what is supposed to happen | 17:29 |
dansmith | I'm not in a huge rush here, so don't unpause it for me if it's not ready (obviously) | 17:30 |
dansmith | the bug mentioned April 14th for the replacement kernel build, so that seems like we aren't ready yet right? | 17:30 |
fungi | once we unpause daily rebuilding of the ubuntu-noble images, the next uploads for them to vexxhost ca-ymq-1 will use that additional metadata | 17:30 |
dansmith | I would make some joke about April 14th being timed well to cause the IRS trouble, but lol, we all know they're using a kernel from the 80s | 17:30 |
fungi | so it'll take effect automatically, probably near-instantly as the images there are increasingly stale so nodepool will resume refreshing them straight away | 17:31 |
fungi | basically an hour-ish after the unpause would be my guess | 17:31 |
clarkb | that sounds about right | 17:32 |
opendevreview | Merged openstack/project-config master: Add viommu support to vexxhost ubuntu-noble https://review.opendev.org/c/openstack/project-config/+/945715 | 17:38 |