Wednesday, 2025-10-22

opendevreviewMerged openstack/nova-specs master: Repropose flavor-search-by-name spec  https://review.opendev.org/c/openstack/nova-specs/+/95886600:30
*** mhen_ is now known as mhen01:46
opendevreviewTaketani Ryo proposed openstack/nova-specs master: Adjust the SEV/SEV-ES code for multiple cc support  https://review.opendev.org/c/openstack/nova-specs/+/96257805:51
opendevreviewTaketani Ryo proposed openstack/nova-specs master: Adjust the SEV/SEV-ES code for multiple cc support  https://review.opendev.org/c/openstack/nova-specs/+/96257805:55
alibasafacan anyone answer my question? I have two aggregates named general and dedicated. The availability zone of general is nova, and the zone of dedicated is another one. I wanted to live migrate an instance from the nova zone to the other zone. To do this, I removed the host from the dedicated aggregate and added it to the general aggregate. After the live migration completed, I moved the compute host back to the dedicated aggregate. After 06:12
alibasafathat, I tried to resize the instance to its own flavor, but I got an error related to FilterScheduler and zone specifications. I found that in the nova_api database, the availability_zone was updated correctly, but in the nova database, the availability_zone was not updated.06:12
alibasafa Why did this happen?06:12
gibiIt is not allowed to move instances between Availability Zones. If adding a host to an aggregate or removing a host from an aggregate would cause an instance to move between Availability Zones, including moving from or moving to the default AZ, then the operation will be rejected. The administrator should drain the instances from the host first then the host can be moved.06:30
gibihttps://docs.openstack.org/nova/latest/admin/availability-zones.html06:30
gibiyou might have an older OpenStack where this is not yet clearly rejected but causes issues down the line06:31
opendevreviewBalazs Gibizer proposed openstack/nova master: Rally job for eventlet-removal  https://review.opendev.org/c/openstack/nova/+/96013007:51
opendevreviewKonrad Gube proposed openstack/nova master: Use Cinder's os-extend_volume_completion volume action.  https://review.opendev.org/c/openstack/nova/+/87356008:37
zigodansmith: Hi there! I'm running a Dalmatian deployment, and when uploading an image, it fails in the image inspector with the MBR:09:07
zigoTraceback (most recent call last):09:07
zigo  File "/usr/lib/python3/dist-packages/nova/virt/images.py", line 162, in do_image_deep_inspection09:07
zigo    inspector.safety_check()09:07
zigo  File "/usr/lib/python3/dist-packages/oslo_utils/imageutils/format_inspector.py", line 430, in safety_check09:07
zigo    raise SafetyCheckFailed(failures)09:07
zigooslo_utils.imageutils.format_inspector.SafetyCheckFailed: Safety checks failed: mbr09:07
zigoIs this known, and should I backport some patches?09:07
zigohttps://review.opendev.org/c/openstack/oslo.utils/+/928448 maybe ?09:08
* zigo tries this patch09:19
zigoThis doesn't fix my issue ... :/09:50
sean-k-mooneyzigo: i think there is a glance config option related to funky mbrs09:51
zigosean-k-mooney: The issue for me is in nova-compute.09:51
zigoWhen it spawns the VM.09:52
sean-k-mooneyoh ok09:52
sean-k-mooneyis that the full traceback09:52
zigoNope.09:52
zigohttps://paste.opendev.org/show/bwgoYQ6LeWlUFYcIZ998/09:53
zigoThat's what I have in "server show"09:53
sean-k-mooneyah, and from that i take it the image is in raw? format based on the call to fetch_to_raw09:55
zigoCorrect. The image is an upload from another cloud's /var/lib/nova/instance/<uuid>/disk09:56
zigoWill it go nicer if I convert to Qcow2 first?09:56
sean-k-mooneyit would work in either case 09:57
sean-k-mooneyi suspect its failing somewhere here https://github.com/openstack/oslo.utils/blob/master/oslo_utils/imageutils/format_inspector.py#L1388-L142109:58
sean-k-mooneyhowever i would expect a more verbose exception message in that case09:58
zigoThe image is, as much as I can tell, made from the Debian one, with grub-cloud, dual boot (ie: UEFI + mbr).09:59
zigoAt least it looks like it, I didn't check much, but it has all of the usual partitioning.09:59
sean-k-mooneyis it publicly available, or would you be able to provide say the first 1mb of the image, which has the format info, so we could inspect it10:03
zigoI'll investigate myself first. It's not a public image, it's customer's data.10:04
sean-k-mooneywe dont actually need the full image, all the gpt info is in the first few KBs10:04
sean-k-mooneyack10:04
sean-k-mooneyon master we provided a cli that you can use to run the inspector https://github.com/openstack/oslo.utils/blob/master/oslo_utils/imageutils/cli.py10:05
sean-k-mooneyso you can do python -m oslo_utils.imageutils -i /path/to/image [-v|--verbose]10:05
zigoAh, nice ! :) I can try it in my Flamingo virtualized test cluster then.10:05
sean-k-mooneyits in epoxy too10:06
sean-k-mooneyjust not 2024.210:06
sean-k-mooneyhttps://github.com/openstack/oslo.utils/blob/stable/2025.1/oslo_utils/imageutils/cli.py10:06
zigoIt's a shame there's no script entry point then.10:07
sean-k-mooneythe master just has type hints10:07
sean-k-mooneywell there is a __main__.py for the module10:08
zigoYeah, but something in /usr/bin would have been much nicer.10:08
sean-k-mooneybut you can basically just copy paste it somewhere and run it with the older version of oslo10:08
zigoI'll see if I can push such patch ! :)10:08
zigoOh, I see. Ok, thanks.10:09
sean-k-mooneyi think we did that to not pollute /usr/bin with a debug helper 10:09
sean-k-mooneybut i dont really have a strong opinion either way10:09
sean-k-mooneyif we wanted it to be a cli in /bin all that is really needed is adding the console_scripts entry point to setup.cfg to point at the cli module10:10
sean-k-mooneyalthough on master you would now do that via pyproject.toml i think10:11
zigopython3 -m oslo_utils.imageutils -i /path/to/disk <--- Shows nothing, just exits. This means validated in Flamingo ?10:12
sean-k-mooneyzigo: anyway my hunch is the mbr is subtly invalid and not fully compliant with the gpt spec10:13
sean-k-mooneyand that means our error handling sucks :)10:13
sean-k-mooneytry adding -v10:13
zigo# python3 -m oslo_utils.imageutils -v -i /path/to/disk10:13
zigoSAFETY_CHECK_PASSED=False10:13
zigoVIRTUAL_SIZE=26214410:13
zigoACTUAL_SIZE=26214410:13
zigoIMAGE_FORMAT="gpt"10:13
zigoOSLO_UTILS_VERSION="9.1.0"10:13
zigoFAILURE_REASONS='mbr: GPT MBR has invalid start CHS'10:14
sean-k-mooneythere you go10:14
zigoGreat, that was very helpful sean!10:14
sean-k-mooneylooking at the code it exits 0 on success and 1 on failure by default10:14
sean-k-mooneyso that is from here https://github.com/openstack/oslo.utils/blob/stable/2025.2/oslo_utils/imageutils/format_inspector.py#L1267-L127010:15
sean-k-mooneyso that path means that the os type is 0xEE, a protective MBR https://uefi.org/specs/UEFI/2.10/05_GUID_Partition_Table_Format.html#os-types10:19
sean-k-mooneythe check that is failing is "(starth, starts, startt) != (0x00, 0x02, 0x00):" thats StartHead, StartSector and StartTrack10:25
sean-k-mooneyhttps://uefi.org/specs/UEFI/2.10/05_GUID_Partition_Table_Format.html#protective-mbr-partition-record-protecting-the-entire-disk10:25
sean-k-mooneyStartingCHS is 3 bytes Set to 0x000200, corresponding to the Starting LBA field.10:26
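A minimal Python sketch of the check discussed above, for illustration only. It is not the oslo.utils implementation, and the names check_protective_mbr and make_sector are hypothetical. It assumes the standard MBR layout: the first 16-byte partition record at offset 446, the 0x55AA signature at offset 510, and a protective record (OS type 0xEE) whose three StartingCHS bytes must be 0x00, 0x02, 0x00.

```python
import struct

MBR_SIZE = 512
FIRST_RECORD_OFFSET = 446          # first of four 16-byte partition records
PROTECTIVE_OS_TYPE = 0xEE          # marks a GPT protective MBR
EXPECTED_START_CHS = (0x00, 0x02, 0x00)


def check_protective_mbr(sector):
    """Return a list of problems found in the first MBR partition record."""
    if len(sector) < MBR_SIZE:
        return ["short read: need 512 bytes"]
    failures = []
    if sector[510:512] != b"\x55\xaa":
        failures.append("missing 0x55AA boot signature")
    record = sector[FIRST_RECORD_OFFSET:FIRST_RECORD_OFFSET + 16]
    # Layout: boot flag, 3 start CHS bytes, OS type, 3 end CHS bytes,
    # starting LBA (uint32), size in LBA (uint32), all little-endian.
    (_boot, starth, starts, startt, os_type,
     _endh, _ends, _endt, _start_lba, _size_lba) = struct.unpack("<8BII", record)
    if os_type != PROTECTIVE_OS_TYPE:
        return failures  # not a protective MBR, so the CHS rule does not apply
    if (starth, starts, startt) != EXPECTED_START_CHS:
        failures.append(
            "invalid start CHS %02x %02x %02x" % (starth, starts, startt))
    return failures


def make_sector(start_chs):
    """Build a synthetic 512-byte sector with one protective record."""
    sector = bytearray(MBR_SIZE)
    sector[FIRST_RECORD_OFFSET:FIRST_RECORD_OFFSET + 16] = struct.pack(
        "<8BII", 0x00, *start_chs, PROTECTIVE_OS_TYPE,
        0xFF, 0xFF, 0xFF, 1, 0xFFFFFFFF)
    sector[510:512] = b"\x55\xaa"
    return bytes(sector)
```

To look at a real image, the first 512 bytes can be read with open(path, "rb").read(512) and passed to check_protective_mbr.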
zigoI probably will propose a patch to accept more valid values.10:42
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove legacy placement tables from API DB  https://review.opendev.org/c/openstack/nova/+/88908610:43
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove legacy tables from main DB  https://review.opendev.org/c/openstack/nova/+/88908710:43
opendevreviewStephen Finucane proposed openstack/nova master: db: Remove legacy columns from main DB  https://review.opendev.org/c/openstack/nova/+/88908810:43
opendevreviewStephen Finucane proposed openstack/nova master: Remove unused 'shutdown_terminate' flag  https://review.opendev.org/c/openstack/nova/+/77315610:49
zigostephenfin: Gosh, thanks for the placement db cleanup. It's not as if it was 5-year-old technical debt ... :/10:49
*** iurygregory_ is now known as iurygregory10:57
opendevreviewStephen Finucane proposed openstack/nova master: objects: Remove custom comparison methods  https://review.opendev.org/c/openstack/nova/+/47228510:58
sean-k-mooneyzigo: that is unlikely to be accepted as there arent more _valid_ values11:05
zigoThere sure is!11:05
zigohttps://uefi.org/specs/UEFI/2.10/05_GUID_Partition_Table_Format.html11:05
sean-k-mooneyif the os type is declared as 0xEE then it must be that value or its not a valid GPT partition table according to the spec11:05
zigoSearch for CHS.11:05
sean-k-mooneyno, the other values are only valid if the mbr is a legacy mbr, not a protective mbr11:06
zigoIn the case of "Protective MBR Partition Record" there are some other valid values.11:06
zigoRight, but then why would OpenStack prevent using "Protective MBR" ?11:07
sean-k-mooneywe dont. protective mbr requires it to be exactly 0x00020011:07
zigoMy image is perfectly valid and has 0x00, 0xee, 0xfe11:08
zigo(starth, starts, startt)11:08
sean-k-mooneylegacy mbr allows other values but you cant mix the two11:08
zigoI believe the Debian image is mixing them to allow both MBR *AND* uefi.11:09
sean-k-mooneythis is something that is often done incorrectly when trying to create an image that is bootable as both uefi and bios11:09
zigoThis is maybe incorrect, but existing in the wild at least.11:09
sean-k-mooneyyep we have encountered that in other cases too. 11:10
sean-k-mooneythose are not technically allowed by the gpt spec11:10
sean-k-mooneythere is a way to do it 11:10
sean-k-mooneybut its often not done correctly11:10
sean-k-mooneywe had this discussion with at least one other distro11:11
zigoOk, but then, what our format_inspector is trying to achieve is *not* checking whether some images respect the GPT spec. We're trying to check if nova/cinder/glance is being hacked...11:11
zigoI wouldn't mind fixing the Debian if it's broken, that's probably the case, but that wont fix the "I cannot migrate my VMs from my old cloud" use case...11:12
zigo(and that's what I'm currently trying to fix)11:12
sean-k-mooneywe are trying to require a "valid" boot image and currently that includes a valid GPT partition table11:12
sean-k-mooneylet me see if i can find the prior bug11:12
sean-k-mooneyi have forgotten much of the context at this point11:13
zigoRight. We aren't trying to prevent borderline images that slightly violate the spec.11:13
zigoJust something that Qemu can boot safely.11:13
zigoIt'd be ok to be stricter about the spec if we were writing a spec validation tool for qemu, but that's not our goal ...11:14
sean-k-mooneyhttps://bugs.launchpad.net/oslo.utils/+bug/209111411:14
zigoThanks, reading.11:15
UgglaDoes anyone know about https://review.opendev.org/c/openstack/nova-specs/+/964444 ? I mean do we have the owners of this patch around ?14:36
sean-k-mooneyi just saw the proposal come in yesterday14:43
sean-k-mooneyso i dont know anything more beyond what is in the review14:43
zigodansmith: I do not agree with your comment in the review. From your review:15:30
zigo"Removing checks from a security layer because Debian's images have been wrong for a long time is not a good plan, IMHO."15:30
zigoIt is wrong to say that these magic byte checks provide any security improvement. Plus I'm not sure where this image is coming from, I'm not sure it's from Debian, but it's been there in our older cloud that I'm trying to migrate.15:30
dansmithzigo: understood :)15:31
zigoIMO, we should re-open the topic during the vPTG.15:31
zigoI do agree some kind of checking is good, and thanks for writing these.15:31
zigoThough blocking existing (customer) VMs that have been there for so long isn't helpful.15:32
zigoIf I'm being ignored, I'll carry this as Debian specific patches ... :P15:32
dansmiththese things are 100% security related btw.. the configuration of the MBR is read and interpreted by the hypervisor, the emulated BIOS/firmware, etc.. so it's entirely possible there's an attack vector there15:33
zigoAt what stage does the hypervisor read it?!?15:33
zigoShouldn't this be done by the virtual BIOS when booting?15:34
zigo(ie: not even in Qemu)15:34
dansmiththe hypervisor is running the virtual bios :)15:34
zigoWell in qemu, no?15:34
dansmithyes.. you know qemu is the thing that was hackable with the previous image attack right?15:35
zigoIf there was a way to escape when Qemu boots an HDD, that would be a serious security issue in Qemu.15:35
dansmithescape is not the only attack mode, of course15:35
zigoWell, the thing that was hackable, was the image conversion.15:35
dansmithno it wasnt.. you might want to re-read the bug :)15:36
zigoThat's very different from executing sea-BIOS instructions.15:36
dansmithit was literally qemu itself15:36
opendevreviewBalazs Gibizer proposed openstack/nova master: Rally job for eventlet-removal  https://review.opendev.org/c/openstack/nova/+/96013015:37
zigoWhen reading the bootloader stuff?!?15:37
dansmithwhen reading the image and executing the guest.. not the bootloader specifically, but again, we're talking about being defensive for the *next* thing here15:38
dansmithanyway, re-open the discussion of supporting spec-violating images in the vPTG if you want, and carry debian patches to remove these matches-the-spec image security checks if you think that's the best plan15:39
zigoIMO, here, we're talking about being defensive against the sea-BIOS being executed and doing weird things, which is *not* our job, and which is hopefully harmless.15:40
zigoBut fine ... :P15:40
dansmithwell, I thought that inspecting the details of even things like qcow2 data structures was completely not our job until qemu told us it was15:41
zigoBTW, my first iteration got things wrong, what I added wasn't the boot signature I've found.15:41
zigoI've found (0x00, 0x01, 0x00)15:41
zigodansmith: Right. It probably deserves a discussion. :)15:42
zigoThanks for this beginning of a chat already.15:42
dansmithpersonally, I think we've already had the discussion, and we've already had a prior art bug that shows another distro realized they were wrong and posted the easy "make the image match the spec" commands, which we could put in the docs15:43
zigoI probably can also patch the MBR of these old VMs...15:44
zigoI'll see if that path works and let you know.15:44
dansmithmaybe glance could have a feature in image conversion to fix these things specifically with parted..  that's something I'd be on board with and probably would address your migration issue15:46
sean-k-mooneyzigo: (0x00, 0x01, 0x00) is because of a bug in parted15:50
sean-k-mooneyaccording to the flatcar image fix pr anyway15:51
zigoAh... thanks a lot! :)15:51
zigoI'll look it up.15:51
zigoFYI, the image I'm talking about is probably running Debian Jessie ... :)15:51
sean-k-mooneyhttps://github.com/flatcar/scripts/pull/2668#issuecomment-266085990115:52
sean-k-mooneyi have no idea how the flatcar image building works by the way15:53
sean-k-mooneythey provided a workaround for old images https://bugs.launchpad.net/oslo.utils/+bug/2091114/comments/1215:54
sean-k-mooneyby the way on the glance side there is https://github.com/openstack/glance/blob/master/glance/common/config.py#L123C1-L138 to allow folks to weaken the security 15:56
sean-k-mooneybut that is not really a good longterm approach and im not convinced that should be in oslo15:56
zigoSurprisingly, Glance isn't complaining, only Nova, not sure why.15:57
dansmithcan't (and shouldn't) be in oslo15:57
kevkohi, i am running nova upgrade check FROM a caracal container when all other containers (kolla-ansible deployment) are still antelope ... this should pass, shouldn't it? as it is slurp ... but the check failed ... is that ok ? https://paste.openstack.org/show/bXbzW9LwWQwQIEu8vEUV/15:58
dansmithglance hasn't had upload stream safety checks for very long, so depends on what you're upgrading to15:58
sean-k-mooneydansmith: i was honestly kind of surprised to see that in glance. i assume this is planned to be deprecated and removed eventually15:58
kevkopoint is that i will upgrade control plane to caracal ...and my 300 compute servers will be upgraded one by one during the weeks 15:58
dansmithsean-k-mooney: yes, I don't like it either15:58
sean-k-mooneykevko: ya you should be able to upgrade the controllers and do the computes later15:59
sean-k-mooneykevko: that looks like you have not run either the db migration or something similar to that16:00
sean-k-mooneyit could also be using the wrong db16:00
sean-k-mooneythe compute_id was added by https://github.com/openstack/nova/commit/a47fdef1bfba8b1e2c3279f2c51eda3d8a985ec116:01
sean-k-mooneyin bobcat16:02
kevkoyeah, but the nova-status upgrade check command sounds like "this script will tell you if you can upgrade, which of course includes the db migration"16:02
sean-k-mooneywell you're using a nova-status from caracal but have not applied a db migration from bobcat16:03
kevkosean-k-mooney: as I said ... antelope is slurp ...so upgrade to caracal and caracal's nova-status upgrade check should work no ? 16:03
sean-k-mooneyi would need to check our docs for whether the status check should be done after db sync or not16:04
kevkosean-k-mooney: yeah, but why should I apply a db migration from bobcat when antelope -> caracal is a supported slurp upgrade ? 16:04
sean-k-mooneyits not that you do them from bobcat, its that you would run all the caracal ones16:04
kevkothat's the point ...16:04
kevkosean-k-mooney: In my understanding of the world, a DB migration is part of the upgrade... which means the upgrade check should take into account that I’m running it before the upgrade, right? At least that makes more sense to me…16:06
sean-k-mooneykevko: in grenade we run nova-status after you have done the db upgrade16:06
sean-k-mooneyhttps://opendev.org/openstack/grenade/src/branch/master/projects/60_nova/upgrade.sh#L10416:06
sean-k-mooneykevko: you should run the antelope version before the db sync and the caracal one after16:07
opendevreviewArnaud Morin proposed openstack/nova master: Fix: add a default dev_type in spec  https://review.opendev.org/c/openstack/nova/+/96455216:07
sean-k-mooneykevko: https://docs.openstack.org/nova/latest/admin/upgrades.html#rolling-upgrade-process16:08
sean-k-mooneythats what we document as well by the way16:08
sean-k-mooneythe db sync sets the schema, which will work with the old code16:09
sean-k-mooneythe upgrade check expects the new db schema 16:09
kevkosean-k-mooney: hmm, thank you.... so    upgrade:    antelope container -> run upgrade check from antelope container -> exec db migrations to caracal -> replace antelope container to caracal -> run upgrade check again from caracal container 16:12
sean-k-mooneybasically. the first execution of the status check tells you if you have any online migrations or other things you need to do before upgrading16:18
sean-k-mooneythe second tells you the same after you have upgraded16:18
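The ordering agreed on above can be sketched as a small Python plan, under stated assumptions. The commands nova-status upgrade check, nova-manage api_db sync and nova-manage db sync are the real CLI entry points; everything else (UPGRADE_PLAN, run_plan, and the execute callback that runs a command inside a container of the given release) is hypothetical and deployment specific. Replacing the service containers and any online data migrations are left out.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, List[str]]  # (release whose tooling runs the command, argv)

# Order taken from the discussion: old-release check, schema sync with the
# new release's tooling, then the new-release check after the upgrade.
UPGRADE_PLAN: List[Step] = [
    ("antelope", ["nova-status", "upgrade", "check"]),
    ("caracal", ["nova-manage", "api_db", "sync"]),
    ("caracal", ["nova-manage", "db", "sync"]),
    ("caracal", ["nova-status", "upgrade", "check"]),
]


def run_plan(
    plan: List[Step],
    execute: Callable[[str, List[str]], int],
) -> Optional[Tuple[str, List[str], int]]:
    """Run each step in order and stop at the first non-zero exit code.

    nova-status uses non-zero exit codes for both warnings and failures;
    this sketch is conservative and stops on any of them.
    Returns None when every step succeeded, else the failing step and code.
    """
    for release, command in plan:
        code = execute(release, command)
        if code != 0:
            return release, command, code
    return None
```

In a kolla-ansible style deployment, execute would typically wrap whatever mechanism starts a one-off container of the requested release; that part is intentionally left to the operator.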
opendevreviewBalazs Gibizer proposed openstack/nova master: Rally job for eventlet-removal  https://review.opendev.org/c/openstack/nova/+/96013016:24
opendevreviewMerged openstack/nova master: pci: Add more detail and examples to pci.alias docs  https://review.opendev.org/c/openstack/nova/+/95065918:54
atmarkHello, I have instances that were automatically shut down on a few computes after restarting nova21:47
atmark`During _sync_instance_power_state the DB power_state (1) does not match the vm_power_state from the hypervisor (4). Updating power_state in the DB to match the hypervisor.`21:47
atmarkFull logs  https://paste.openstack.org/show/bZzH7XnMnZut2zZzTvuk/ 21:47
atmarkOpenStack Version is Yoga 21:51
atmarkWhich table in the db do I need to look at to identify more VMs that have a power state mismatch?21:52
