Thursday, 2026-02-05

gmaansean-k-mooney: I did not +W this as you might want to review it too, if not let me know and I can apply +W https://review.opendev.org/c/openstack/nova/+/97444501:17
gmaanthe rest of the series is good to go01:18
opendevreviewmelanie witt proposed openstack/nova master: TPM: prepare to bump service version for live migration  https://review.opendev.org/c/openstack/nova/+/96205101:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: support live migration of `host` secret security  https://review.opendev.org/c/openstack/nova/+/94148301:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: support live migration of `deployment` secret security  https://review.opendev.org/c/openstack/nova/+/92577101:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: test live migration between hosts with different security  https://review.opendev.org/c/openstack/nova/+/95262901:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: add late check for supported TPM secret security  https://review.opendev.org/c/openstack/nova/+/95697501:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: opt-in to new TPM secret security via resize  https://review.opendev.org/c/openstack/nova/+/96205201:19
opendevreviewmelanie witt proposed openstack/nova master: DNM vtpm tempest  https://review.opendev.org/c/openstack/nova/+/95747701:19
opendevreviewmelanie witt proposed openstack/nova master: TPM: bump service version to enable live migration  https://review.opendev.org/c/openstack/nova/+/97572401:19
gmaansean-k-mooney: ah, you already checked and left a comment there, got it now. I +W'd it. 01:22
opendevreviewTakashi Natsume proposed openstack/nova master: Update contributor guide for 2026.1 Gazpacho  https://review.opendev.org/c/openstack/nova/+/96189602:32
opendevreviewGhanshyam proposed openstack/nova master: DNM: test oslo.service set_service_opts_defaults  https://review.opendev.org/c/openstack/nova/+/97573902:37
opendevreviewGhanshyam proposed openstack/nova master: DNM: test oslo.service set_service_opts_defaults  https://review.opendev.org/c/openstack/nova/+/97573902:38
opendevreviewMerged openstack/nova master: Libvirt event handling without eventlet  https://review.opendev.org/c/openstack/nova/+/96594903:14
opendevreviewMerged openstack/nova master: SubclassSignatureTestCase to use NoDBTestCase as base  https://review.opendev.org/c/openstack/nova/+/97486103:14
opendevreviewMerged openstack/nova master: Enable mypy on nova/utils.py  https://review.opendev.org/c/openstack/nova/+/96993603:43
opendevreviewGhanshyam proposed openstack/nova master: DNM: test oslo.service set_service_opts_defaults  https://review.opendev.org/c/openstack/nova/+/97573903:56
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Extend functional test coverage of UEFI boot guests  https://review.opendev.org/c/openstack/nova/+/96926306:36
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Add basic xml generation for firmware auto selection  https://review.opendev.org/c/openstack/nova/+/96908506:36
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Add capability to load loader and nvram from xml  https://review.opendev.org/c/openstack/nova/+/96908606:36
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Add capability to load smm feature from existing xml  https://review.opendev.org/c/openstack/nova/+/96913106:36
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Use firmware auto-selection by libvirt  https://review.opendev.org/c/openstack/nova/+/96913206:36
opendevreviewTakashi Kajinami proposed openstack/nova master: AMD SEV: omit iommu='on' for virtio devices  https://review.opendev.org/c/openstack/nova/+/90963507:40
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Remove tpm support detection for libvirt < 8.0.0  https://review.opendev.org/c/openstack/nova/+/95230807:40
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Drop redundant chown of tpm data directory  https://review.opendev.org/c/openstack/nova/+/96244607:41
opendevreviewMerged openstack/nova master: Live migration with iothreads  https://review.opendev.org/c/openstack/nova/+/97500010:56
opendevreviewJohn Garbutt proposed openstack/nova master: Make PCPUs not land on VCPUs by default  https://review.opendev.org/c/openstack/nova/+/97577911:18
opendevreviewMax proposed openstack/nova master: fix: _get_guest_disk_device UnboundLocalError  https://review.opendev.org/c/openstack/nova/+/97578312:18
LarsErikPhi! Struggling a little bit with unified limits and PCI passthrough here. I have servers with multiple non-SR-IOV (type-PCI) GPUs that I want to pass through. I've set them up with a custom resource class, and referenced that class in the flavor together with the pci_passthrough:alias. But then the device is counted double in placement..14:28
LarsErikPbasically this bugs.launchpad.net/nova/+bug/2098496 but for type-PCI as well..14:29
LarsErikPmy goal here is of course to have unified limits on the custom resource class for these GPUs, but that doesn't play very well when the allocations in placement are wrongly counted14:30
LarsErikPam I doing anything wrong here? or is the bug I mentioned also valid for type-PCI hostdevs?14:30
dansmithLarsErikP: I could be wrong, but I don't think you should reference the custom class in placement if you're using PCI-in-placement.. nova will already allocate one for you and I think the flavor reference is adding a second14:37
LarsErikPright, but when the custom resource is not referenced in the flavor, the limit is not enforced14:38
dansmithah, okay14:41
dansmithI would sync with melwitt when she's up later this morning14:42
LarsErikPI guess we can consider her highlighted? :P14:42
dansmithnote that the bug you reference is about leaks in the allocation process not over-allocation in placement (IIRC), which both have been fixed and are not what you're seeing I think14:42
dansmithyup14:42
LarsErikPyeah, I've experienced this as described in the bug with type-VF devices as well. And that is very much fixed with that patch14:43
LarsErikPI got the same kind of symptoms before I applied that patch with type-VF devices. Requesting instances with both the resource class and pci_passthrough resulted in too much usage registered in placement. That is fixed14:45
LarsErikPbut yeah. with my current problem, the allocations are correctly recorded in placement when I remove the custom resource class from the flavor14:46
LarsErikPbut then again, I can't use limits for these resources :-(14:46
LarsErikPI can hack it though.. Using a custom resource for quota counting which I add to the RP for the compute host with the PCI-devices, and reference that in the flavor..14:59
sean-k-mooneyyou should be able to use limits for the resource without doing resources:15:00
gmaanbauzas: I know you added it in your review but gate is green on this and ready for review https://review.opendev.org/c/openstack/nova/+/975242/415:00
sean-k-mooneyLarsErikP: if you have pci in placement the quota check in nova is meant to validate the resource request from the alias as part of unified limits automatically15:00
LarsErikPhmm maybe I have to set pci/report_in_placement on nova-api nodes as well? So far, I've only set that to true on the compute nodes15:06
sean-k-mooneynot the report one, but you need to set the filter scheduler one15:07
LarsErikPI have that on my nodes running nova-scheduler15:07
LarsErikPI have nova-api/apache2 running on separate hosts from the ones running conductor,scheduler etc15:09
sean-k-mooneyLarsErikP: you do need to have it in nova-api yes15:45
sean-k-mooneyLarsErikP: if you do not set https://docs.openstack.org/nova/latest/configuration/config.html#filter_scheduler.pci_in_placement in the nova-api config it will not properly translate the pci aliases to resource classes and unified limits won't work15:47
sean-k-mooneyin https://docs.openstack.org/nova/latest/admin/pci-passthrough.html#pci-tracking-in-placement15:48
sean-k-mooneywe state """Since nova 27.0.0 (2023.1 Antelope) scheduling and allocation of PCI devices in Placement can also be enabled via filter_scheduler.pci_in_placement config option set in the nova-api, nova-scheduler, and nova-conductor configuration. Please note that this should only be enabled after all the computes in the system is configured to report PCI inventory in Placement15:49
sean-k-mooneyvia enabling pci.report_in_placement. In Antelope flavor based PCI requests are support but Neutron port base PCI requests are not handled in Placement."""15:49
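The documentation quoted above amounts to a per-service checklist: `pci.report_in_placement` on the computes, and `filter_scheduler.pci_in_placement` on nova-api, nova-scheduler, and nova-conductor. A minimal sketch of that checklist in Python, assuming each service's config is available as INI-style text. The helper name `missing_pci_settings` and the sample config snippets are hypothetical and not part of nova; nova itself reads its config through oslo.config, not `configparser`.

```python
import configparser

# Option each service type is expected to enable, per the doc text quoted above.
REQUIRED = {
    "nova-compute": ("pci", "report_in_placement"),
    "nova-api": ("filter_scheduler", "pci_in_placement"),
    "nova-scheduler": ("filter_scheduler", "pci_in_placement"),
    "nova-conductor": ("filter_scheduler", "pci_in_placement"),
}


def missing_pci_settings(configs):
    """Return sorted names of services whose required option is not true.

    `configs` maps a service name to the text of its config file.
    (Hypothetical helper for illustration only.)
    """
    missing = []
    for service, (section, option) in REQUIRED.items():
        parser = configparser.ConfigParser()
        parser.read_string(configs.get(service, ""))
        # fallback=False covers both a missing section and a missing option.
        if not parser.getboolean(section, option, fallback=False):
            missing.append(service)
    return sorted(missing)


enabled = "[filter_scheduler]\npci_in_placement = True\n"
configs = {
    "nova-compute": "[pci]\nreport_in_placement = True\n",
    "nova-scheduler": enabled,
    "nova-conductor": enabled,
    # The option was only set on the scheduler hosts, not on the API hosts.
    "nova-api": "[DEFAULT]\ndebug = False\n",
}
print(missing_pci_settings(configs))
```

With the sample input, only the API service is reported, which mirrors the setup described in the discussion where nova-api runs on separate hosts from the scheduler and conductor.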
opendevreviewBalazs Gibizer proposed openstack/nova master: Move the concurrent builds to its own Executor  https://review.opendev.org/c/openstack/nova/+/97569415:49
opendevreviewBalazs Gibizer proposed openstack/nova master: Move the concurrent builds to its own Executor  https://review.opendev.org/c/openstack/nova/+/97569415:51
melwittdansmith, bauzas: I updated https://review.opendev.org/c/openstack/nova/+/962051 to remove a few mock decorators I realized I didn't need since MIN_COMPUTE_VTPM_LIVE_MIGRATION = None. otherwise it is the same as when dan had +216:09
gmaanbauzas: replied to your comment, I am planning to add doc when new RPC server will be used and also RPC versioning https://review.opendev.org/c/openstack/nova/+/97524216:21
opendevreviewBalazs Gibizer proposed openstack/nova master: Move the concurrent builds to its own Executor  https://review.opendev.org/c/openstack/nova/+/97569416:26
opendevreviewLajos Katona proposed openstack/nova master: Add regression test to repoduce bug 2140537  https://review.opendev.org/c/openstack/nova/+/97583217:22
opendevreviewBalazs Gibizer proposed openstack/nova master: Deprecate unlimited compute actions  https://review.opendev.org/c/openstack/nova/+/97583317:28
melwittI just read through the backscroll ... thanks sean-k-mooney for the info about pci_in_placement, LarsErikP: lmk if unified limits still doesn't work after you try the config Sean mentioned. I'm not that familiar with the pci in placement code and it's not immediately clear to me if/how unified limits works with it17:32
opendevreviewBalazs Gibizer proposed openstack/nova master: Move the concurrent builds to its own Executor  https://review.opendev.org/c/openstack/nova/+/97569417:33
sean-k-mooneymelwitt: ack, if it doesn't that's a bug. the pci in placement code translates the pci alias into placement resource groups and resource/traits requests17:34
sean-k-mooneynow i know it does that when calling placement17:34
sean-k-mooneybut it's possible we don't do that before we check unified limits17:35
sean-k-mooneybut there is a generic function for that17:35
melwittsean-k-mooney: agreed it will be a bug if it doesn't work. I'm trying to look at the code and can't tell what's going on haha17:35
sean-k-mooneyi always have to go looking for this as it's not in the file i expect it to be in17:36
melwitton the unified limits side (nova/limit/placement.py) I think i only see use of the flavor and not yet seeing it being connected with anything else17:37
melwitton the pci/resource requests side I don't understand anything :)17:37
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/scheduler/utils.py17:38
sean-k-mooneyso the ResourceRequest stuff is in there17:38
opendevreviewMerged openstack/nova master: Use an executor to delay STOPPED events  https://review.opendev.org/c/openstack/nova/+/97444517:38
melwittyeah I did find that part. it's the from_request_spec() is that where the pci in placement stuff comes from maybe?17:38
sean-k-mooneypotentially yes17:40
sean-k-mooneybut i'm not immediately seeing it17:40
sean-k-mooneyof course it could be in the request spec class17:40
sean-k-mooneyi think from here https://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L224-L23017:41
melwittoh, yeah, I think limits is only taking stuff from flavor. bc it creates a "fake" RequestSpec object using only the flavor 17:42
sean-k-mooneywhich would mean it would need to use the alias to convert the alias to a pci request17:43
melwittohh ok that's what you have been saying is the alias is in the flavor and then from there eventually it will turn into resource classes underneath17:44
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L66417:53
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/objects/request_spec.py#L50317:54
sean-k-mooneyso the request spec object has a generate_request_groups_from_pci_requests function that adds in the resource request from the neutron port but also the request from the pci alias17:54
sean-k-mooneyif you're only looking at the flavor you would also be missing the resources from cyborg17:57
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L677-L679 calls17:59
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L663-L67417:59
sean-k-mooneybut that one calls from_request_spec18:00
sean-k-mooneywith the fake object18:00
sean-k-mooneyand that does not call generate_request_groups_from_pci_requests()18:00
sean-k-mooneyso yes there is a bug in the integration of pci in placement and unified limits18:01
sean-k-mooneygenerate_request_groups_from_pci_requests is only called from RequestSpec.from_components18:01
sean-k-mooneyso https://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L665 needs to be a real request spec for https://github.com/openstack/nova/blob/master/nova/scheduler/utils.py#L672 to be accurate18:03
melwittsean-k-mooney: gotcha, thanks18:04
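The gap traced above can be illustrated with a toy model: a limit check that only looks at the flavor's explicit `resources:` extra specs never sees the resource class implied by `pci_passthrough:alias`. This is a sketch under stated assumptions, not nova's code; the function names, the `CUSTOM_GPU` class, the `gpu` alias, and the limit values are all hypothetical.

```python
def resources_from_flavor_only(vcpus, memory_mb, extra_specs):
    """Collect what a flavor-only (partial) request can see.

    Only explicit "resources:<CLASS>" extra specs are counted; the
    "pci_passthrough:alias" extra spec is ignored, as in the discussion.
    """
    resources = {"VCPU": vcpus, "MEMORY_MB": memory_mb}
    for key, value in extra_specs.items():
        if key.startswith("resources:"):
            rc = key.split(":", 1)[1]
            resources[rc] = resources.get(rc, 0) + int(value)
    return resources


def over_limit(requested, usage, limits):
    """Return the resource classes whose limit the request would exceed."""
    return sorted(
        rc for rc, amount in requested.items()
        if rc in limits and usage.get(rc, 0) + amount > limits[rc]
    )


limits = {"VCPU": 16, "MEMORY_MB": 32768, "CUSTOM_GPU": 1}
usage = {"CUSTOM_GPU": 1}  # the project already holds its single GPU

# Flavor that requests the GPU through the alias only.
alias_only = {"pci_passthrough:alias": "gpu:1"}
print(over_limit(resources_from_flavor_only(4, 8192, alias_only), usage, limits))

# Flavor that also names the custom class explicitly.
with_class = {"pci_passthrough:alias": "gpu:1", "resources:CUSTOM_GPU": "1"}
print(over_limit(resources_from_flavor_only(4, 8192, with_class), usage, limits))
```

In this model the alias-only flavor passes the check even though the project is at its GPU limit, while the flavor that also names the class is rejected. That matches the reported behaviour: the limit is only enforced when the class is referenced in the flavor, which in turn is what led to the double allocation in placement.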
sean-k-mooneythe note from john https://github.com/openstack/nova/blob/master/nova/limit/placement.py#L127-L13118:05
melwitthaha I was literally just typing that18:05
melwittyeah. so he mentions cyborg in there and now since pci in placement there's also pci18:06
sean-k-mooneywe obviously overlooked updating that for pci in placement too18:06
sean-k-mooneyyep and as a result gpu/vgpu18:06
melwittI think this may have landed before pci and placement happened but maybe I'm misremembering18:06
sean-k-mooneyyep it may18:07
sean-k-mooneyi think they were happening around the same time18:07
sean-k-mooneythe pci spec expected unified limits to "just work"18:07
sean-k-mooneyi think because it was expecting the request spec to have the same view18:07
sean-k-mooneymissing the fact that this is only a partial request spec18:07
melwittso LarsErikP you are good to file a bug for the unified limits non integration with pci. it's a known issue/limitation (see linked code comment above) but we could consider it a bug18:08
melwittso I wonder would it be as simple as grabbing the full request spec in the api database if some alias is present in the flavor or something like that? just to avoid making the db query if we know we won't need to18:09
sean-k-mooneyhttps://specs.openstack.org/openstack/nova-specs/specs/zed/approved/pci-device-tracking-in-placement.html#dependencies18:09
melwittI see18:10
sean-k-mooneyso we intended it to work so clearly a bug from my perspective18:10
melwittI see, ok18:11
sean-k-mooneyand ya i was just looking to see. when we are doing the limit check it's really early18:11
sean-k-mooneyso i'm not sure how easy that would be18:11
melwittyeah, mostly all in nova-api but we do have the "recheck" logic in nova-conductor18:12
melwittso maybe we could intercept into there. if that is not also too early18:12
sean-k-mooneywe can check in the conductor yes, after scheduling and before we commit the allocation to placement18:14
sean-k-mooneywe probably could check before calling the scheduler to do select destination in the conductor too but i don't know if that is a good idea or not18:15
sean-k-mooneythe thing is i believe we do generate the pci requests in the api18:16
sean-k-mooneybefore we call the conductor18:16
sean-k-mooneyso i think we can still validate it in the api18:16
melwittthat would be nice if we can18:16
sean-k-mooneywe have _validate_flavor_image_nostatus for example https://github.com/openstack/nova/blob/master/nova/compute/api.py#L74918:18
sean-k-mooneythat's where we validate all the numa stuff18:18
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/compute/api.py#L920-L92118:19
sean-k-mooneyok so https://github.com/openstack/nova/blob/master/nova/pci/request.py#L32418:19
sean-k-mooneywe could just use get_pci_requests_from_flavor18:19
sean-k-mooneyto add the pci request to the fake request spec18:20
sean-k-mooneythat won't cover cyborg or neutron requests but it would make LarsErikP's use case work18:20
melwittif the pci requests are in the flavor then that's ideal :)18:21
sean-k-mooneyyep in this case they are in the form of the pci_passthrough flavor extra spec18:22
melwittI am relieved if this will not be complicated :D18:23
sean-k-mooneywe can use that to create the pci request object and then add those to the request spec and then call generate_request_groups_from_pci_requests on the request spec18:23
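The three steps proposed here (build PCI requests from the flavor's alias extra spec, attach them to the request spec, then generate the resource requests from them) can be sketched with stand-in classes. Everything below is a toy model: `ToyPciRequest`, `ToyRequestSpec`, `toy_pci_requests_from_flavor`, and the alias-to-resource-class table are hypothetical, and the method only borrows the name of the nova function mentioned in the discussion without reproducing its behaviour. The `name:count` comma-separated alias format is the one used by the `pci_passthrough:alias` extra spec.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPciRequest:
    alias_name: str
    count: int


@dataclass
class ToyRequestSpec:
    resources: dict
    pci_requests: list = field(default_factory=list)

    def generate_request_groups_from_pci_requests(self, alias_to_rc):
        """Stand-in: fold each PCI request into the resource totals."""
        for req in self.pci_requests:
            rc = alias_to_rc[req.alias_name]
            self.resources[rc] = self.resources.get(rc, 0) + req.count


def toy_pci_requests_from_flavor(extra_specs):
    """Parse "pci_passthrough:alias" ("name:count[,name:count...]")."""
    spec = extra_specs.get("pci_passthrough:alias", "")
    requests = []
    for item in filter(None, (part.strip() for part in spec.split(","))):
        name, count = item.split(":")
        requests.append(ToyPciRequest(name.strip(), int(count)))
    return requests


# Step 1: build PCI requests from the flavor extra spec.
extra_specs = {"pci_passthrough:alias": "gpu:2"}
pci_requests = toy_pci_requests_from_flavor(extra_specs)

# Step 2: attach them to the (otherwise flavor-only) request spec.
spec = ToyRequestSpec({"VCPU": 4, "MEMORY_MB": 8192}, pci_requests)

# Step 3: translate them into resource amounts the limit check can see.
spec.generate_request_groups_from_pci_requests({"gpu": "CUSTOM_GPU"})
print(spec.resources)
```

As noted in the discussion, this only covers flavor-based PCI requests; Cyborg device profiles and Neutron port requests would need additional inputs.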
sean-k-mooneywell cyborg and neutron port will be harder but we don't need to fix all the edge cases in one go18:23
sean-k-mooneycyborg is doable from the flavor too by the way, it just needs a call to cyborg to get the request from the device profile18:24
sean-k-mooneythe neutron port requests are the only bit that is a little tricky18:24
melwittagree that we need not fix them all at the same time. that's nice about cyborg too, also sounds pretty clean and simple18:25
melwittI wonder though for neutron, would that not be a neutron quota thing? I guess the same could be asked of cyborg18:26
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/accelerator/cyborg.py#L90-L9318:26
sean-k-mooneysince nova is creating the allocation in placement for the packet per second resource class i think we have to enforce the check18:27
melwitthm ok18:28
sean-k-mooneywith that said we do also get the port request in the api18:28
sean-k-mooneywe have to validate that all the uuids exist at a minimum18:28
sean-k-mooneyfor neutron qos to work you have to pre-create the ports18:29
sean-k-mooneyif you don't do that and you just pass --network then we won't create the port until we hit the compute node and we won't allocate any bandwidth or qos resource request to the vm18:30
sean-k-mooneyso i think we can fix that as well but we would need more than just bfv and the flavor to do that18:30
sean-k-mooneywe would need the network request object or the ports18:31
melwittI see18:33
sean-k-mooneyi need to update a patch to fix some unit tests quickly. do you plan to follow up with a bug fix for this? if not i might get claude to create a reproducer and take a crack at fixing it18:34
melwittsean-k-mooney: I will definitely work on it if no one else wants to -- I feel responsible for unified limits. but if you want, feel free to go ahead :)18:36
sean-k-mooneyi'm kind of curious what will happen if i copy-paste the last few minutes into it and ask it to reproduce the bug we were discussing in a regression test18:36
melwittI uploaded this fix for user-scoped quotas a while back because I felt bad https://review.opendev.org/c/openstack/nova/+/967148 haha18:39
sean-k-mooneyfeeling pride in the craftsmanship of the thing you create/maintain is good but you should not feel bad if there are bugs that no one noticed18:40
melwittyeah :) no I felt bad about the idea of knowing about the problem but not fixing it18:41
sean-k-mooneyya i get that, that's half the reason i fix bugs18:42
melwittso even though user-scoped quotas are legacy and "should not be used" I just put up a patch anyway18:42
melwittthe tough part is it's hard to get review on such things bc they are so "low prio"18:43
sean-k-mooneyoh right we are doing 2 level quota now, global and project18:43
melwittyeah. there is also supposed to be a TwoLevelEnforcer but iirc it is not implemented yet. I have seen John WIP patch for it but not sure if it ever got finished18:44
melwittI think it might be complicated18:45
sean-k-mooneywell claude is fixing my unit tests so i'll review your patch while i wait18:46
melwitthaha thanks. lmk if you need any reviews also :)18:47
sean-k-mooneywell i'll need gibi and you or stephen to reapprove the patch that claude is fixing. but that is the last nova patch i'm actively working on at least for this week18:48
melwittah ok18:49
sean-k-mooneyi'm sure i have bug fixes from months? years? ago open but nothing i'm actively looking for review on beyond the live migration patch i'm about to push18:49
sean-k-mooneyhow is the vtpm feature progressing? i keep meaning to try it out but not finding time18:50
melwittgoing well. not sure we will get both the 'host' and 'deployment' modes merged this cycle but we should at least be able to get 'host' I think. and that's the more important one we wthink18:52
melwitt*we think18:53
sean-k-mooneyyou had a few devstack commits in WIP state that i came across before the break18:54
sean-k-mooneyi meant to ask you about them when i got back in january but forgot18:54
melwittoh yeah the swtpm and mdevctl install18:54
sean-k-mooneyyes that was one of them18:54
sean-k-mooneyis there a reason that was in wip state18:54
melwittI don't know if someone else already did the mdevctl by now so maybe only need swtpm. I have been using it for running vtpm live migration in tempest18:55
sean-k-mooneymdevctl is an optional dep of libvirt so maybe its pulled in by default now18:55
opendevreviewsean mooney proposed openstack/nova master: Support os-vif TAP pre-creation for OVS/OVN ports  https://review.opendev.org/c/openstack/nova/+/97314918:55
melwittnot really other than I wasn't sure if the patch is universally wanted. or if vtpm live migration in regular tempest would be accepted upstream. currently there is only whitebox testing but I found there is quite a bit we can do with regular tempest18:56
melwitt*only whitebox testing for vtpm18:56
melwittI wrote a bunch of vtpm live migration tempest tests for regular tempest mostly for myself to more easily test all of the scenarios a ton of times18:57
sean-k-mooneywell we will need it when we ever get around to finishing the mdev ci testing18:57
melwittI hope they will be wanted upstream in general though, that would be nice18:57
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron networks  https://review.opendev.org/c/openstack/nova/+/92802218:58
sean-k-mooneyi dont see why we would not add it19:00
sean-k-mooneyi mean if it works with generic tempest and has no hardware requirement, which vtpm does not, why would we not add them19:00
melwittthey only seem to work with virt_type = kvm, that's the only non optimal thing. I couldn't get vtpm to work with virt_type = qemu in upstream CI19:01
sean-k-mooneyit should work with qemu19:02
melwittso they can only run on some subset of the CI fleet but I have experienced no issues with getting hosts when I run them19:02
sean-k-mooneybut ok we have a nested virt nodeset we can use for the relevant job19:02
melwittI know it should but it would not work when I did it. like if you flip that flag to virt_type = qemu it will fail19:02
melwittthis is my setup https://review.opendev.org/c/openstack/nova/+/957477/46/.zuul.yaml19:03
sean-k-mooneyso question on https://bugs.launchpad.net/nova/+bug/2131272 for one second. in nova the ram quota is 4000 and you set the user quota to 4000 for user a19:04
sean-k-mooneyuser a and b are in the same project19:04
melwittand I had to un-hardcode some things in our evacuate hook https://review.opendev.org/c/openstack/nova/+/957477/46/roles/run-evacuate-hook/tasks/main.yaml19:04
sean-k-mooneyand you're expecting the project quota to still reject it with a 40319:04
sean-k-mooneyinstead of the current 500 correct19:05
sean-k-mooneymelwitt: some projects have jobs that run with nested virt nodes for speed (and to avoid the kernel panic we have). i don't see why nova should not use a nested virt nodeset if the feature needs it to be testable19:06
sean-k-mooneyso if you remove LIBVIRT_TYPE: kvm and virt_type: kvm it fails19:07
melwittyeah so this is kind of confusing and took a long time for me to figure out. but it's that you need a situation where the user has enough user-scoped quota to fulfill the request BUT the project-wide usage is too high to fulfill the request19:07
melwittin order to reproduce the problem19:08
sean-k-mooneyright i just wanted to make sure that that is the edge case you were trying to fix19:08
melwittoh. yes that was 19:09
sean-k-mooneythe user quota is the quota for the instances that that user creates on any project right19:09
sean-k-mooneybut the project itself still needs to have quota19:09
sean-k-mooneywhich is what confused everyone19:09
melwittI think the user-scoped is still nested under the project scope. but regardless of the user-scoped quota for a user, they are not supposed to be able to exceed the project quota with all of the project usage added together among all users in the same project19:10
melwittso if you look at user-scoped quota and usage in isolation it might look like the request should pass but if the project has too much already existing usage due to other users in the project, that will affect the new request19:11
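The edge case described here can be sketched as a two-level check: a request has to fit within the user-scoped limit and, independently, within the project-wide limit counted across all users. This is a simplified illustration under assumptions, not nova's quota code; the function name `check_quota` and the usage figures are hypothetical, while the 4000 limits echo the numbers from the question above.

```python
def check_quota(requested, user_usage, user_limit, project_usage, project_limit):
    """Toy two-level check: user-scoped quota nested under project quota."""
    if user_usage + requested > user_limit:
        return "user quota exceeded"
    # project_usage is the sum of usage by every user in the project.
    if project_usage + requested > project_limit:
        return "project quota exceeded"
    return "ok"


# Project RAM limit is 4000; user a also has a user-scoped limit of 4000.
# User b, in the same project, already uses 3000, and user a uses nothing.
result = check_quota(
    requested=2000,
    user_usage=0,
    user_limit=4000,
    project_usage=3000,
    project_limit=4000,
)
print(result)
```

Looked at in isolation, user a has plenty of headroom, yet the request is rejected because of what the rest of the project already consumes. That is the combination needed to reproduce the problem.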
melwittsean-k-mooney: yes when I did not have LIBVIRT_TYPE: kvm it failed. I don't remember the error bc it was months ago and I'm not sure if I commented it somewhere in the DNM patch. I should have but I might not have19:12
melwittok good I did:19:13
melwittERROR:system/cpus.c:504:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())19:13
melwittBail out! ERROR:system/cpus.c:504:qemu_mutex_lock_iothread_impl: assertion failed: (!qemu_mutex_iothread_locked())19:13
melwitt2025-09-23 21:47:17.508+0000: shutting down, reason=crashed19:13
melwittthere was no other indication of any problem in nova or libvirt logs that I found19:13
melwittand I could not figure out or find the root cause of the guest crashing. bc the error reason just says "crashed" without other detail19:14
sean-k-mooneymelwitt: does https://review.opendev.org/c/openstack/nova/+/967148/comment/bcfcff4e_9dbf054a/ make sense? i might be missing something19:24
sean-k-mooneymelwitt: ok that is clearly a qemu bug that you were triggering19:24
sean-k-mooneythat might be fixed19:25
sean-k-mooneymelwitt: i feel like that is a bug that we may have reported and it may have been fixed19:26
melwittsean-k-mooney: yeah what you said makes sense and I found this is why the message changes, it's re-raised as TooManyInstances https://github.com/openstack/nova/blob/a17b44f3eb16b9284ec8a6292bb942d803688e72/nova/compute/utils.py#L1181-L118419:31
melwittsean-k-mooney: ok, I'll try running without nested virt and see what happens19:31
sean-k-mooneyah https://github.com/openstack/nova/blob/master/nova/exception.py#L1383-L138519:32
sean-k-mooneyok well the message could be slightly better but im ok with the  current patch19:32
sean-k-mooneyi just didn't see where it was being converted19:32
sean-k-mooneyi would have expected TooManyInstances to be just for instance quota19:33
melwittI'm fine with improving the message. I just didn't think about it19:35
melwittyou will find many things in quotas defy expectations19:35
sean-k-mooney:) 19:35
sean-k-mooneywell im +2 on the patch but happy to re review if you end up updating it19:35
melwittthanks 🥹19:38
sean-k-mooneymelwitt: https://gitlab.com/qemu-project/qemu/-/issues/297819:46
sean-k-mooneymy guess is that is the issue you hit in the ci job19:47
melwittsean-k-mooney: thanks, I'll subscribe 19:53
melwittI guess they closed it but someone commented they're still hitting the issue after it was closed19:55
sean-k-mooneyya so if we recreated it in ci we could update it with an assertion that it still happens and provide the relevant libvirt xml19:55
sean-k-mooneyi.e. show that it happens with vtpm for example19:56
sean-k-mooneythe simplest way would be a second patch on your DNM to just go back to qemu and see if it explodes19:56
melwittyeah. might as well19:58
obreHi! There is a BP (add-amx-traits) thats been lying around since 2023, which describes a feature I would like to see in nova, and that I guess Im able to implement. The BP is not approved, and posted by someone else. Should I in some way update the BP and assign it to myself, or should I create a new one to get it approved and then Implemented?20:45
opendevreviewsean mooney proposed openstack/nova master: Add regression test for unified limits PCI bug  https://review.opendev.org/c/openstack/nova/+/97585920:45
sean-k-mooneyobre: there might be a poc of that. you could take over the blueprint but it would have to be for next cycle20:46
sean-k-mooneyobre: https://review.opendev.org/c/openstack/os-traits/+/86814920:46
sean-k-mooneyso there is an os-traits patch but no nova patch.20:47
sean-k-mooneywe would need to update the libvirt driver to report them i think as well20:47
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L13549-L1362520:49
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/utils.py#L608-L62220:49
sean-k-mooneythey would need to be added here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/utils.py#L61-L10720:50
sean-k-mooneyi.e. the mapping from the way the feature flag is reported by libvirt to the trait20:50
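The kind of mapping being described, from a CPU feature flag as the hypervisor reports it to a placement trait, can be sketched as a lookup table. This is an illustration only and not the table in nova: `HW_CPU_X86_AVX2` is an existing standard trait, but the AMX flag spellings and the `HW_CPU_X86_AMX_*` trait names below are assumptions, since the os-traits change that would define them is not confirmed here.

```python
# Hypothetical flag -> trait table. The AMX entries are assumed names.
CPU_FLAG_TO_TRAIT = {
    "avx2": "HW_CPU_X86_AVX2",
    "amx-tile": "HW_CPU_X86_AMX_TILE",  # assumed trait name
    "amx-bf16": "HW_CPU_X86_AMX_BF16",  # assumed trait name
    "amx-int8": "HW_CPU_X86_AMX_INT8",  # assumed trait name
}


def traits_for_flags(flags):
    """Return the traits to report for the given host CPU flags.

    Flags without an entry in the table are silently skipped.
    """
    return {CPU_FLAG_TO_TRAIT[f] for f in flags if f in CPU_FLAG_TO_TRAIT}


host_flags = ["sse4.2", "avx2", "amx-tile"]
print(sorted(traits_for_flags(host_flags)))
```

A flag with no table entry produces no trait, which is why a host can expose AMX to guests while the scheduler still has nothing to match on.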
sean-k-mooneyso that's small enough to be a specless blueprint20:50
sean-k-mooneyobre: but you would have to propose it for 2026.2 and ask for it to be approved in the nova irc meeting 20:51
obreYes; I think it looks simple enough; But Im a bit uncertain for the process.20:51
obreHow/where do I propose it? 20:52
sean-k-mooneyhttps://wiki.openstack.org/wiki/Meetings/Nova20:52
obreAnd are we in a feature-freeze or something for the G-release since that apparently is out of the question?20:52
sean-k-mooneyi responded on the mailing list a few minutes ago for a different feature20:53
sean-k-mooneyyes the blueprint and spec freeze was december 8th this cycle20:53
obreRight.20:53
sean-k-mooneymaster reopens for new feature work in march20:53
sean-k-mooneyit's currently open for approved feature work and bug fixes20:53
sean-k-mooneybut you can propose patches at any time 20:53
sean-k-mooneyso you can get it ready and show it works20:54
sean-k-mooneynon-client lib freeze i think is next week20:54
sean-k-mooneyso it's realistically too late to add a new standard trait to os-traits20:54
obreToo late for H-release? 20:54
sean-k-mooneyfor G20:55
sean-k-mooneyits fine for H20:55
sean-k-mooneyh is the release in september20:55
obreBut Im already too late for G-release in nova? And I guess its fine to get both the Trait in and the nova-use of it for H?20:55
sean-k-mooneywell technically i think october 20:55
sean-k-mooneyyep 20:56
sean-k-mooneyit's fine to rebase the patches and get everything lined up for H20:56
sean-k-mooneyif you ask for an exception in the irc meeting on monday there is a very very small chance that the team will say yes20:56
sean-k-mooneybut it would be much less stressful for you and everyone else to just do it early in H20:56
obreIt feels like such a small change; So it would be interesting to ask nicely :)20:57
sean-k-mooneyin this case i kind of agree. you could argue that it could be a wishlist bug20:58
sean-k-mooneyi.e. just keeping the traits and feature flags in sync20:59
obreBut regardless of G or H; I guess the process then is to add a point to the agenda of a nova meeting; and if the feature is approved based on the current BP I'll create a patch for nova (and rebase the patch for os-traits if that would be needed) and wait for the possibility to merge in March/April if it's for the H-release?20:59
sean-k-mooneybut on the os-traits side we dont backport traits 20:59
sean-k-mooneyyes exactly20:59
sean-k-mooneyyou add the topic to the open discussion section of the etherpad20:59
sean-k-mooney* of the wiki21:00
obreIt's certainly on the wishlist for my part :P I'm managing openstack for a University, and the amx-feature would be nice for some of our researchers :)21:00
sean-k-mooneyyou could also link to the irc logs of this conversation https://meetings.opendev.org/irclogs/%23openstack-nova/latest.log.html#openstack-nova.2026-02-05.log.html#t2026-02-05T20:45:4221:01
sean-k-mooneyto show i support the proposal in general21:01
sean-k-mooneyyou can use amx without the trait, you just can't schedule based on the capability automatically21:01
sean-k-mooneyas a workaround you could add a CUSTOM_AMX trait for now21:02
sean-k-mooneyeither using provider.yaml or the placement api/cli21:02
sean-k-mooneyof couse if your usign the nova feature where you list multiple cpu models in teh config an we give you the first one that matchs based on your trati requests21:03
sean-k-mooneythat wont work21:03
sean-k-mooneybut if you have host-model or host-passthough or a hardcoded cpu model on the host with amx21:04
sean-k-mooneythe vm will get it even if the trait does not exist21:04
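
The provider.yaml workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, not nova code: the helper name `render_provider_yaml` is hypothetical, and the emitted layout (`meta.schema_version`, `providers[].identification.uuid`, `traits.additional`, and the `$COMPUTE_NODE` placeholder) is assumed from nova's provider configuration file format, so check it against the documentation for your release before dropping the result into nova's provider config directory.

```python
import re

# Placement only accepts custom traits that start with CUSTOM_ and contain
# upper-case letters, digits and underscores (max 255 characters).
CUSTOM_TRAIT_RE = re.compile(r"^CUSTOM_[A-Z0-9_]+$")


def render_provider_yaml(traits, provider_uuid="$COMPUTE_NODE"):
    """Render a provider.yaml snippet adding custom traits to a provider.

    Hypothetical helper for illustration only. "$COMPUTE_NODE" is assumed
    to be the placeholder meaning "the compute node this file is read on".
    """
    if not traits:
        raise ValueError("at least one trait is required")
    for trait in traits:
        if not CUSTOM_TRAIT_RE.match(trait) or len(trait) > 255:
            raise ValueError(f"not a valid custom trait name: {trait!r}")
    lines = [
        "meta:",
        "  schema_version: '1.0'",
        "providers:",
        "  - identification:",
        f"      uuid: {provider_uuid}",
        "    traits:",
        "      additional:",
    ]
    lines += [f"        - {trait}" for trait in traits]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_provider_yaml(["CUSTOM_AMX"]), end="")
```

A flavor could then request `trait:CUSTOM_AMX=required` for scheduling, with the caveat raised in the conversation: a custom trait only steers placement and does not take part in the per-flavor CPU model selection.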
obreWe are listing multiple CPU-models, and pick the first one based on Trait.21:04
sean-k-mooneyah :)21:05
sean-k-mooneyi'm glad someone uses that feature21:05
obreWhich allows me to have the bulk of my VMs run on all my compute-nodes; and then have a smaller subset available to people needing more modern IA's.21:05
obreAnd it works great :)21:05
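
The "list multiple CPU models, pick the first one that matches the trait request" behaviour described above can be illustrated with a small sketch. This is not nova's implementation: `MODEL_TRAITS` and `pick_cpu_model` are hypothetical names, and the model-to-trait mapping is a made-up, heavily simplified stand-in for what libvirt reports about each model.

```python
from typing import Iterable, Optional

# Hypothetical, simplified mapping of CPU model -> traits it can satisfy.
MODEL_TRAITS = {
    "Haswell": {"HW_CPU_X86_AVX", "HW_CPU_X86_AVX2"},
    "Skylake-Server": {
        "HW_CPU_X86_AVX",
        "HW_CPU_X86_AVX2",
        "HW_CPU_X86_AVX512F",
    },
}


def pick_cpu_model(
    configured_models: Iterable[str], required_traits: Iterable[str]
) -> Optional[str]:
    """Return the first configured model that covers every required trait.

    The order of configured_models matters: listing the oldest model first
    means a flavor with no trait requirements gets the most portable model.
    """
    required = set(required_traits)
    for model in configured_models:
        if required <= MODEL_TRAITS.get(model, set()):
            return model
    return None


if __name__ == "__main__":
    models = ["Haswell", "Skylake-Server"]
    print(pick_cpu_model(models, []))
    print(pick_cpu_model(models, ["HW_CPU_X86_AVX512F"]))
    print(pick_cpu_model(models, ["CUSTOM_AMX"]))
```

In this toy model a trait with no known mapping, such as `CUSTOM_AMX`, matches nothing. That is only an illustration of why the selection needs a standard trait tied to the CPU flag; how nova actually treats an unmapped trait in this code path is not shown here.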
obreAnd I'm exposing it to my users through "generations" of flavors: https://www.ntnu.no/wiki/spaces/skyhigh/pages/114296806/Flavors+of+instances21:06
sean-k-mooneyah yes that's a good approach, especially if you are also good and do not modify your flavors after they are in use21:07
obreAnd now I'm starting to have quite a few compute-nodes with amx-support; which sparks my interest in this patch :)21:07
obreYeah; we try not to modify flavors. 21:07
sean-k-mooneyi have never run a lab/cloud for more than my team but i just cheated and created an AZ per cpu generation21:08
sean-k-mooneyso if you cared you just specified the az you wanted, otherwise the scheduler just selected a host for you and you got whatever you landed on21:09
obreHaving the possibility to run "old generations" on newer CPUs is to me very valuable when it comes to flexibility during upgrades. And I think I need the "list of multiple cpu models" for that to work.21:10
obreAt least I'm struggling to see how to accomplish it with AZs. 21:10
sean-k-mooneyya that's the only way to support that usecase, at least without leaving performance on the table21:10
sean-k-mooneyyou are doing it correctly. the reason i said i was cheating is that was before the feature existed and we used host-passthrough 21:11
obreMakes sense.21:11
obreI've been running this platform since Icehouse or Juno, so there has been some evolvement over time to end up doing it like this. We started with host-passthrough, but we found the need for the flexibility when the compute-nodes are very heterogeneous. 21:13
obreAnd now we are scaling the platforms up quite significantly; as we are bailing VMware in the org as well :) So lots of fun times ahead.21:14
sean-k-mooneyya vmware bills seem to have that effect now21:14
obreIn the edu-sector the increases are mind-boggling.21:15
obreTo an extent that it's difficult to believe it's true :)21:15
obreBut I have seen the quotes. So I know I'm not dreaming. 21:16
obreAnyways; thanks a lot for your responses. It's been very valuable! But now it's evening here; so I'll go afk. 21:17
opendevreviewsean mooney proposed openstack/nova master: Fix unified limits to include PCI resource classes  https://review.opendev.org/c/openstack/nova/+/97587221:48
sean-k-mooneymelwitt: ^ LarsErikP 21:53
sean-k-mooneywe should file a proper bug and decide if we are going to fix it for cyborg and neutron ports etc. but at least for cyborg i think it should be relatively simple21:54
sean-k-mooneygibi: i had to rebase this and fix some unit test failures fare os-vif promoted. https://review.opendev.org/c/openstack/nova/+/973149 would you mind re-reviewing it tomorrow21:59
opendevreviewsean mooney proposed openstack/nova master: enable tap creation in nova-live-migration  https://review.opendev.org/c/openstack/nova/+/97550021:59
opendevreviewsean mooney proposed openstack/nova master: Fix unified limits to include PCI resource classes  https://review.opendev.org/c/openstack/nova/+/97587222:11
sean-k-mooneyhttps://bugs.launchpad.net/nova/+bug/214063122:20
*** haleyb is now known as haleyb|out22:21
melwittsean-k-mooney: cool thanks, I will look22:23
melwittit would be super if LarsErikP can try out the patch also22:25
opendevreviewmelanie witt proposed openstack/nova master: DNM: vtpm tempest without nested virt  https://review.opendev.org/c/openstack/nova/+/97587422:31
opendevreviewmelanie witt proposed openstack/nova master: DNM: vtpm tempest without nested virt  https://review.opendev.org/c/openstack/nova/+/97587422:33
opendevreviewsean mooney proposed openstack/nova master: Add regression test for unified limits PCI bug  https://review.opendev.org/c/openstack/nova/+/97585922:36
opendevreviewsean mooney proposed openstack/nova master: Fix unified limits to include PCI resource classes  https://review.opendev.org/c/openstack/nova/+/97587222:36
sean-k-mooneymelwitt: yep i have not deployed this on real hardware so it would be nice if it was on LarsErikP's system. i'm going to call it a night but i just updated the patches with the bug links and topic22:42
melwittyeah.. I was imagining he has an env already set up (hence asking in the channel today) so it would be cool if it would be an easy effort thing 22:46
LarsErikPhello! I've been afk since I left work. Scrolled through the backlog now, and I see that I sparked quite a discussion/conversation here :P I'll look into the patches tomorrow and see if I can test something. I have a live non-production env running, so I can probably test the patches quite easily22:55
LarsErikPmelwitt: sean-k-mooney: ^22:55
LarsErikPThanks a lot! And I'm so delighted that this probably just wasn't a pebcak :p22:56
melwittit's great to have people using unified limits so we can find out and fix issues :)22:57
LarsErikPYeah, that has been this week's project for me. Testing out the migration to unified limits. And the killer feature we really want is exactly this - having quotas on e.g. GPUs.23:01
melwitt++23:02
LarsErikPI guess I should do some testing with type-VF devices here as well. But I guess that's going to work better, as they are actually counted correctly in placement when i have the resource class in the flavor23:03
LarsErikPwith the fix for the bug I mentioned earlier today, that is 23:03
LarsErikPI'll look into the details of what you've been doing, and what I should test when I get back to work tomorrow =) Good night Norway time ;) Thanks for all the efforts!23:05
melwittyeah, earlier Sean said after the fix the remaining gaps they see are for cyborg resources and neutron ports. the former seems like a fix would be simple but for neutron it's more complicated23:06
melwittgnite!23:06
melwittsean-k-mooney: if you were curious, still a fail without nested virt  https://review.opendev.org/c/openstack/nova/+/975874 sample instance log https://zuul.opendev.org/t/openstack/build/009d9ddc64354457a0b926adf753899c/log/compute1/logs/libvirt/libvirt/qemu/instance-00000019_log.txt23:55
