Tuesday, 2026-01-27

*** mhen_ is now known as mhen02:40
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Use firmware auto-selection by libvirt  https://review.opendev.org/c/openstack/nova/+/96913206:30
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Add capability to load smm feature from existing xml  https://review.opendev.org/c/openstack/nova/+/96913106:33
opendevreviewTakashi Kajinami proposed openstack/nova master: libvirt: Use firmware auto-selection by libvirt  https://review.opendev.org/c/openstack/nova/+/96913206:33
gokhanhello folks, I want to get your thoughts about this patch https://review.opendev.org/c/openstack/nova/+/973750 . It seems https://github.com/openstack/nova/commit/5a55a78d510b86975f0f4f8f43ee1feef7206244#diff-47eb12598e353b9e0689707d7b477353200d0aa3ed13045ffd3d017ee7d9e753 commit resolves live migration pre migrate failure. Regarding the fixes in Bug #1899835 (Avoid volume rollback mismatches), which handles cleanup on destination when 07:39
gokhanpre_live_migration fails: I’ve observed that many live migration failures (like the one where a volume is in 'backing-up' status) could potentially be caught much earlier. Currently, if a volume is busy or in an unstable state, the failure often occurs during pre_live_migration, triggering the complex rollback logic we've been refining. Would it make sense to implement a proactive volume status check at the Nova API level (or early in th07:39
gokhane Conductor) before the RPC call to the destination node?07:39
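A minimal sketch of the early check gokhan describes, not the code in the patch under review: it rejects a live migration up front when any attached volume is in a status other than an allowed set. The helper name `check_volumes_ready`, the exception `VolumeNotReady`, and the choice of `in-use` as the only allowed status are assumptions for illustration; the real allowed set would need to come from nova/cinder semantics.

```python
# Hypothetical pre-flight check; names and the allowed set are assumptions.
ALLOWED_VOLUME_STATUSES = frozenset({"in-use"})


class VolumeNotReady(Exception):
    """Raised when an attached volume is in a state unsafe for migration."""


def check_volumes_ready(volume_statuses):
    """Fail fast if any volume is busy.

    volume_statuses maps a volume id to its current status string,
    for example {"vol-1": "in-use", "vol-2": "backing-up"}.
    """
    busy = {
        volume_id: status
        for volume_id, status in volume_statuses.items()
        if status not in ALLOWED_VOLUME_STATUSES
    }
    if busy:
        details = ", ".join(f"{vid} ({status})" for vid, status in sorted(busy.items()))
        raise VolumeNotReady(f"volumes not ready for live migration: {details}")
```

Running a check like this before the RPC call to the destination means a volume in `backing-up` is reported as a plain request error, and no rollback on the destination is needed.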
gibigokhan: I agree that if there is a way to catch "bad" volume states early in the complex live migration process then that can decrease the likelihood of a problematic rollback. 07:47
gokhanthanks gibi We have submitted a patch (https://review.opendev.org/c/openstack/nova/+/973750) for catching in api side. We would like to confirm if this is the recommended approach. 07:56
gokhanSince melwitt  has previously worked on similar issues—specifically the 'Avoid volume rollback mismatches' patches—I would highly value her insights and comments on this matter to ensure we're aligned with the existing logic.08:02
gibigokhan: yeah, I'm reading your patch now...08:03
gokhanthanks gibi :)08:04
*** LarsErik1 is now known as LarsErikP08:07
gibigokhan: I think the direction is OK. I left comments about the test coverage, and pinged melwitt in the review08:10
gokhanThanks for the review gibi,  we will address your comments and upload a new patch set soon.08:15
opendevreviewSeyeong Kim proposed openstack/nova master: libvirt: Support boot_index for multiple block devices  https://review.opendev.org/c/openstack/nova/+/96366508:17
tobias-urdincan somebody refresh my memory, when doing live migration for an instance with cpu_policy=dedicated and cpu_thread_policy=require and cpu_dedicated_set set, does scheduler take into account remapping PCPU for example on source cores 0,40 to something like 1,41 if 0,40 was not free on destination compute?11:04
tobias-urdingibi: ^ iirc it should work? (caracal)11:05
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron extensions  https://review.opendev.org/c/openstack/nova/+/96227011:12
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron floating IPs  https://review.opendev.org/c/openstack/nova/+/96260411:12
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron networks  https://review.opendev.org/c/openstack/nova/+/92802211:24
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron subnets  https://review.opendev.org/c/openstack/nova/+/96219011:24
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron extensions  https://review.opendev.org/c/openstack/nova/+/96227011:25
opendevreviewEsra Ozkan proposed openstack/nova master: Fix Concurrent VM Live Migrate - Volume Backup Error  https://review.opendev.org/c/openstack/nova/+/97375011:30
opendevreviewMax proposed openstack/nova master: Add regression test for bug #2137366  https://review.opendev.org/c/openstack/nova/+/97483111:32
opendevreviewMax proposed openstack/nova master: fix: delete attachments after rescheduling delete  https://review.opendev.org/c/openstack/nova/+/97483211:32
opendevreviewEsra Ozkan proposed openstack/nova master: Fix Concurrent VM Live Migrate - Volume Backup Error  https://review.opendev.org/c/openstack/nova/+/97375011:36
opendevreviewLajos Katona proposed openstack/nova master: Use SDK for Neutron floating IPs  https://review.opendev.org/c/openstack/nova/+/96260411:36
gibitobias-urdin: yeah I think it should work but obviously we need more data.12:04
gibitobias-urdin: I think this is the functional test that at least partially covers it https://github.com/openstack/nova/blob/d840c63a18ad951382be821f90a6b28c627911fa/nova/tests/functional/libvirt/test_numa_live_migration.py#L19712:09
sean-k-mooneytobias-urdin: it does in any release that properly supports numa live migration, which i think was added in train or wallaby12:09
sean-k-mooneyya it was train12:10
sean-k-mooneyhttps://specs.openstack.org/openstack/nova-specs/specs/train/implemented/numa-aware-live-migration.html12:10
tobias-urdinthanks, not sure why i had a vague memory of it having to be equal on both sides, probably pre-train days something then :)12:33
sean-k-mooneyyep pre-train the live migration would be allowed but the vm's xml would not be updated until it was hard rebooted12:33
sean-k-mooneyassuming the cores existed; if they didn't the live migration would fail, but only in that case12:34
sean-k-mooneywe put a block in place and backported it pre train with a workaround to reenable it as part of supporting it properly12:34
sean-k-mooneybasically pre train we said only do live migration with numa if you're using hugepages without cpu pinning12:35
opendevreviewStephen Finucane proposed openstack/nova master: Follow-up for I85a7729ee65f8987ed0239f80d8d0082a414ab8f  https://review.opendev.org/c/openstack/nova/+/97484312:42
opendevreviewMerged openstack/nova master: api: Simplify servers views (1/3)  https://review.opendev.org/c/openstack/nova/+/95623113:53
opendevreviewBalazs Gibizer proposed openstack/nova master: Use an executor to delay STOPPED events  https://review.opendev.org/c/openstack/nova/+/97444514:00
opendevreviewBalazs Gibizer proposed openstack/nova master: Run nova-compute in native threading mode  https://review.opendev.org/c/openstack/nova/+/96546714:00
opendevreviewBalazs Gibizer proposed openstack/nova master: DNM:Test with oslo.vmware + compute eventlet removal patches  https://review.opendev.org/c/openstack/nova/+/97346814:00
opendevreviewBalazs Gibizer proposed openstack/nova master: SubclassSignatureTestCase to use NoDBTestCase as base  https://review.opendev.org/c/openstack/nova/+/97486114:00
gibisean-k-mooney: bauzas: gmaan: https://review.opendev.org/c/openstack/nova/+/974445 this is ready for serious review now. I have a couple of remaining todos but nothing is expected to change in the core of the patch. On the other hand the core of the patch needs deep review14:11
bauzasack, I definitely need a coffee for reviewing it14:12
gibidon't worry if it takes two :)14:13
opendevreviewMerged openstack/nova master: api: Simplify servers views (2/3)  https://review.opendev.org/c/openstack/nova/+/95623214:13
opendevreviewMerged openstack/nova master: api: Simplify servers views (3/3)  https://review.opendev.org/c/openstack/nova/+/95623314:17
sean-k-mooneygibi: ok i'll try and take a look today or tomorrow14:23
gibithanks both of you :)14:56
MaxLamprecht[m]Hey folks, I'm currently debugging many out of sync cases between nova and cinder BDMs/attachments in our environment. Mostly these are caused by parallel operations/lock-contention or other reasons. I added a comment... (full message at <https://matrix.org/oftc/media/v1/media/download/AYgeZZCT43vzk9WSn7ySGLYl0AqCdodcgtuHH4PEsRyizpNZ9dfQRaY5IGWjRMbtweIwNYxNRndku0zt8aS2xE9CecSB29CAAG1hdHJpeC5vcmcvb0drQk95eXZDTkVTUmphRGJPSk5GYlRZ>)15:13
sean-k-mooney MaxLamprecht[m] multi line messages on matrix only render the first line on irc, with a link to the full message, so it's best to avoid multi line messages 15:17
sean-k-mooneyMaxLamprecht[m]: it would be good to add those to https://etherpad.opendev.org/p/nova-2026.1-status#L7615:19
sean-k-mooneyadd a new line per bug with 2 sub bullet point for the regression and fix reviews15:19
sean-k-mooneyMaxLamprecht[m]: other than that the best thing you can do to get review input is to bring it up here or in the open discussion section of the weekly irc meeting15:20
MaxLamprecht[m]ack. Good to know :)15:20
MaxLamprecht[m]thx, I will add these MRs15:20
sean-k-mooneythe fact you have created the reproducer and fix separately etc. will definitely help with the review15:21
gmaangibi: ack15:22
sean-k-mooneyMaxLamprecht[m]: 2 of the series reference the same bug so you have 3 bug fix/regression series but only 2 bugs15:27
sean-k-mooneyoh actually no you have 315:28
sean-k-mooneyjust similar numbers15:28
MaxLamprecht[m]haha. I was already checking if I messed something up :D15:29
sean-k-mooneyMaxLamprecht[m]: can you make sure you assign https://bugs.launchpad.net/nova/+bug/2088066 https://bugs.launchpad.net/nova/+bug/2137366 and https://bugs.launchpad.net/nova/+bug/2139135 to yourself in launchpad15:29
sean-k-mooneyMaxLamprecht[m]:it was the 66 that was confusing me15:30
MaxLamprecht[m]sean-k-mooney: done15:30
sean-k-mooneyall 3 sound like valid bugs so i quickly triaged all 3 as medium and added the appropriate tags 15:31
MaxLamprecht[m]perfect. I'm currently trying to resolve/find root causes for such bdm/attachment inconsistencies in our environments. Maybe I will find additional bugs while digging into this.15:41
MaxLamprecht[m]I have added another small db index MR to the etherpad that is mostly relevant for large scale deployments15:42
sean-k-mooneyMaxLamprecht[m]: it's still probably a good idea to summarise this in a mailing list post or in the next irc meeting so that folks are aware of the related work you are trying to progress15:43
sean-k-mooneyit's not strictly required but the more visibility your patches have the more likely they will get reviewed15:43
MaxLamprecht[m]sean-k-mooney: ack. I will try to join the next irc meeting15:48
rm_work[m]hopefully quick q about nova vendordata dynamicJSON... I can't find it in the docs anywhere but it seems like I need to configure the `vendordata_dynamic_targets` option with the target named `cloud-init@<address>` or else cloud-init doesn't actually return anything for it. is that expected? and how can I use multiple targets if they need to be named that way? or did I maybe get confused from some other bug I had when I was testing15:50
rm_work[m]this and any name will work?15:50
sean-k-mooneyrm_work[m]: so yes and no15:58
sean-k-mooneyrm_work[m]: we have no upstream support for using it as a data source for cloud-init15:59
sean-k-mooneythey have some documentation that suggests you could use it that way15:59
sean-k-mooneybut it has never been officially supported in nova15:59
sean-k-mooneyrm_work[m]: https://github.com/canonical/cloud-init/issues/522116:00
sean-k-mooneyrm_work[m]: in https://github.com/canonical/cloud-init/issues/5221#issuecomment-2086560281 i provide an example of how to hack it to work by doing16:01
sean-k-mooney[API] 16:01
sean-k-mooneyvendordata_dynamic_targets=['cloud-init:string@http://127.0.0.1:123']16:01
sean-k-mooneythere was a spec for this that was not implemented  https://review.opendev.org/c/openstack/nova-specs/+/917109/6/specs/2024.2/approved/dynamicjson-vendordata-cloud-config.rst  https://blueprints.launchpad.net/nova/+spec/dynamicjson-vendordata-cloud-config16:04
sean-k-mooneyrm_work[m]: so yes if your dynamic backend returns a single string that is the cloud-init info i think it can work, but since it's not officially supported, if it does not work it's not a nova bug16:05
sean-k-mooneyit's a new feature as this was never intended to work with cloud-init automatically like that16:06
rm_work[m]I mean I don't need it to RUN as part of cloud-init, I just need cloud-init to properly fetch the contents into the vendordata2.json file16:36
rm_work[m]like, VM boots, user checks the json file and it contains stuff16:37
rm_work[m]using a colon is a neat trick though, ok will try that16:37
sean-k-mooneyso it will be present in the config drive and via the metadata api16:38
sean-k-mooneyyou just won't be able to execute scripts in the guest vms via vendor data automatically without the hack16:38
sean-k-mooneyso if you just want to make the file available you do not need the hack16:38
sean-k-mooneyif you want to install packages etc. then you do, but as i noted it's not officially supported so just be aware of that16:39
rm_work[m]yeah it looks like I CAN use any name after all, I think I had an encoding bug originally and fixed that at the same time as trying switching to cloud-init as the key, and assumed I needed that name16:59
rm_work[m]nova doesn't like if the target returns anything other than a string with 200 code17:01
rm_work[m]I was trying to return JSON originally... seeing as how it's called DynamicJSON... but nope lol17:01
rm_work[m]THAT seems like a bug to me but i assume there's a reason17:01
sean-k-mooneynova is expecting a json response but a single string is also a valid json response17:10
sean-k-mooneyrm_work[m]: we are doing json.loads on the response17:10
sean-k-mooneyrm_work[m]: https://www.madebymikal.com/nova-vendordata-deployment-an-excessively-detailed-guide/17:11
rm_work[m]hmm17:12
rm_work[m]yeah that was the guide I was originally following17:12
sean-k-mooneyyep so the response format should be the same as the static json file17:13
rm_work[m]ok I'll try again again again17:13
sean-k-mooneythe hack of using a single string with the file contents for cloud-init config is abusing the fact that a string is a valid json document17:13
sean-k-mooneywe expect the response to be a valid json document and specifically a json object, but as long as it's valid json it can be processed by nova17:14
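A minimal Python sketch of the point above, not nova's actual code: `json.loads` accepts any valid JSON document, so a bare JSON string decodes just as a JSON object does, while an HTML body fails to decode. The helper name `decode_vendordata_body` and the sample payloads are hypothetical.

```python
import json


def decode_vendordata_body(body):
    """Decode a response body the way a json.loads-based consumer would."""
    return json.loads(body)


# A JSON object, the shape the static vendordata file uses.
as_object = decode_vendordata_body('{"role": "web"}')

# A bare JSON string is also a valid JSON document; here it wraps
# a cloud-config style payload (hypothetical content).
cloud_config = "#cloud-config\npackages:\n  - qemu-guest-agent\n"
as_string = decode_vendordata_body(json.dumps(cloud_config))

# An HTML error page is not JSON and fails to decode.
try:
    decode_vendordata_body("<html>Internal Server Error</html>")
    html_decoded = True
except json.JSONDecodeError:
    html_decoded = False
```

Here `as_object` is a `dict`, `as_string` is the original `str`, and `html_decoded` ends up `False`.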
rm_work[m]ok, i'm trying again17:21
rm_work[m]I feel like I was dealing with like 4 different issues at the same time, and by the time I solved them all I was in this weird configuration and had made some inaccurate assumptions about what the actual issues were17:22
rm_work[m]yeah ok, that does seem to work T_T17:32
rm_work[m]so it looks like almost everything I had to adjust to get this to work was likely unnecessary17:32
rm_work[m]I BELIEVE there is still an issue with return codes -- nova will not actually include the body of the response if the return code is a 5XX right?17:33
sean-k-mooneyprobably17:33
sean-k-mooneybut your server should be returning a 200 ok for the get request we do17:34
rm_work[m]yeah there are cases where the request result is an error, but we do want that returned to the user, so I'm just blindly doing 200 OK returns and just including an error in the body. that does make some sense at least, the CALL didn't fail, just the internal operation did.17:35
sean-k-mooneyya so if you want to return an error in the body it needs to be a json body not html or it will break nova17:36
sean-k-mooneynova will likely just drop it but it's expecting json17:36
sean-k-mooneyyou could just have a single string as i noted above17:36
rm_work[m]yeah I made a structure like '{"success": False, "error": "blah"}'17:36
sean-k-mooneyack17:37
sean-k-mooneythat should work17:37
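A hedged sketch of the 200-with-error-body approach discussed above (a hypothetical helper, not part of nova): building the body with `json.dumps` turns Python's `False` into JSON `false`, which is what a `json.loads`-based consumer needs. The Python-literal spelling `False` inside the body text would not decode.

```python
import json


def build_vendordata_reply(success, error=None):
    """Return (status_code, body) for a dynamic vendordata target.

    Always answers 200 so the caller keeps the body; failures of the
    internal operation are reported inside the JSON object instead.
    """
    payload = {"success": success}
    if error is not None:
        payload["error"] = error
    return 200, json.dumps(payload)


status, body = build_vendordata_reply(False, "blah")

# A hand-written body using the Python literal is not valid JSON.
try:
    json.loads('{"success": False, "error": "blah"}')
    literal_is_valid = True
except json.JSONDecodeError:
    literal_is_valid = False
```

With this helper `body` is `{"success": false, "error": "blah"}` and `literal_is_valid` is `False`.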
rm_work[m]yeah ok thanks for calling that stuff out, this simplifies things a bit17:39
rm_work[m]sean-k-mooney: ah ok so I see now, I'm relying on it going into /var/lib/cloud/instance/vendor-data2.txt on boot19:26
rm_work[m]when I test just hitting the endpoint, yes it does return properly... but if I boot a new VM, nothing shows up there with this setup19:27
rm_work[m]going back through to test individually to see whether the issue is the naming or the JSON-encoding19:27
rm_work[m]I think maybe this is why I had to name it cloud-init19:27
opendevreviewGhanshyam proposed openstack/nova master: PoC: Graceful shutodwn of nova services  https://review.opendev.org/c/openstack/nova/+/96726119:29
opendevreviewGhanshyam proposed openstack/nova master: PoC: Graceful shutodwn of nova services  https://review.opendev.org/c/openstack/nova/+/96726119:29
rm_work[m]so I guess this is more an issue of how cloud-init is configured to call/parse the vendordata endpoint19:45
sean-k-mooneyrm_work[m]: to have that work you need config drive enabled19:48
sean-k-mooneyit will be placed in the iso19:48
sean-k-mooneyand then it should get mounted with the other info there19:48
sean-k-mooneybut in that case it will only be updated once on first boot19:48
sean-k-mooneyso if it failed when the config drive was first built and you fixed it after, it would not be present19:49
rm_work[m]yeah, so what I'm seeing is that IF I name it cloud-init AND I return a single "string" from a json.dumps(), then /var/lib/cloud/instance/vendor-data2.txt does contain the actual JSON20:08
rm_work[m]which implies that cloud-init hits that endpoint, pulls out the cloud-init key, and json-decodes it O_o20:10
sean-k-mooneyyes it does20:10
sean-k-mooneyit will use it as one of the inputs, like user-data20:11
sean-k-mooneyso it's a way for you as a cloud admin, for example, to auto install the qemu-guest-agent or do similar customisation20:11
sean-k-mooneynow a lot of people don't want the cloud service provider doing that but the functionality exists in cloud-init for it 20:11
sean-k-mooneyif the content of the string is a cloud-config compatible file, for example, you can use the normal cloud-init modules20:12
sean-k-mooneyhttps://cloudinit.readthedocs.io/en/latest/explanation/about-cloud-config.html#example-cloud-config-file20:12
sean-k-mooneycloudinit.readthedocs.io/en/latest/explanation/vendordata.html20:13
sean-k-mooneyVendor-data follows the same rules as user-data, with the following caveats:20:13
sean-k-mooney    Users have ultimate control over vendor-data. They can disable its execution or disable handling of specific parts of multi-part input.20:13
sean-k-mooney    By default it only runs on first boot.20:13
sean-k-mooney    Vendor-data can be disabled by the user. If the use of vendor-data is required for the instance to run, then vendor-data should not be used.20:13
sean-k-mooney    User-supplied cloud-config is merged over cloud-config from vendor-data.20:13
sean-k-mooneyrm_work[m]: while it's technically documented behavior of cloud-init it's not something that operators and users generally expect20:14
opendevreviewMerged openstack/nova master: [hacking]Do not mock threading.Event  https://review.opendev.org/c/openstack/nova/+/97145420:42
