Friday, 2026-05-15

tafkamaxHi I have a question about live-migration. I am getting this error: Operation not permitted: libvirt.libvirtError: internal error: unable to execute QEMU command 'migrate-set-capabilities': Postcopy is not supported: Userfaultfd not available: Operation not permitted08:18
tafkamaxI am using kolla-ansible and set the options under [libvirt]... (full message at <https://matrix.org/oftc/media/v1/media/download/ARBKYWZG4vFezBf8AdfbKO35RwPbQknzRNnZhzshrYdIgP66r5d8eL5kRgH6dYMm6kvXhVs1S3zU0FLXLsvUWH1CeectI0ogAG1hdHJpeC5vcmcvT1NZR0RscVlSZk9kZkhkR2ZOc3VHeHF5>)08:20
tafkamaxrunning 2025.1 release08:20
tafkamaxare these even supported?08:20
tafkamaxor how can I find out then?08:20
tafkamaxor do i need to enable a kernel module?08:21
tafkamax* or do i need to enable the sysctl option "vm.unprivileged_userfaultfd=1"08:22
Mike--I checked nova changelog and version 14.0.0 introduced some of those options and should work I'd say08:25
tafkamaxI think I need to fool around with the sysctl options08:25
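For anyone hitting the same "Userfaultfd not available" error, here is a minimal Python sketch that reports the current value of the sysctl mentioned above. It assumes a Linux host whose kernel exposes vm.unprivileged_userfaultfd under /proc/sys. The function name is hypothetical, and the log does not confirm whether setting the value to 1 was the actual fix.

```python
from pathlib import Path

# Assumed location of the sysctl on kernels that provide it.
SYSCTL_PATH = Path("/proc/sys/vm/unprivileged_userfaultfd")


def read_unprivileged_userfaultfd(path=SYSCTL_PATH):
    """Return the sysctl value as an int, or None if the file is absent."""
    try:
        return int(Path(path).read_text().strip())
    except FileNotFoundError:
        # The running kernel does not expose this sysctl.
        return None


if __name__ == "__main__":
    value = read_unprivileged_userfaultfd()
    if value is None:
        print("vm.unprivileged_userfaultfd is not exposed by this kernel")
    else:
        print(f"vm.unprivileged_userfaultfd = {value}")
```

A value of 0 means only privileged processes may use userfaultfd, which would be consistent with the "Operation not permitted" in the error.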
Mike--https://github.com/openstack/nova/blob/master/releasenotes/source/newton.rst08:26
Mike--also found: https://github.com/openstack/nova/blob/master/doc/source/admin/configuring-migrations.rst08:27
tafkamaxok thx!08:27
tafkamaxthats a good docs08:28
*** elodilles_pto is now known as elodilles08:49
stephenfinsean-k-mooney: Great, thanks09:49
sean-k-mooneyso in the job that's failing most often i think i can actually turn off the neutron-periodics but i'm going to chat about that with the neutron folks after i have coffee09:51
opendevreviewAshish Gupta proposed openstack/nova master: tests: file-backed SQLite with WAL in threading mode for Database and CellDatabases Fixtures  https://review.opendev.org/c/openstack/nova/+/98858310:30
opendevreviewPavlo Shchelokovskyy proposed openstack/nova master: Fix PEP-765 syntax warning  https://review.opendev.org/c/openstack/nova/+/98863610:47
tafkamaxI have a question regarding the live migrations. they take forever to sync the last part of ram. And looking at the logs I don't know if I am looking at the correct thing.10:48
tafkamaxnova.virt.libvirt.migration [req-f5188e51-02b9-4338-b3e7-53e162a3f3c2 req-f369ecbf-a79d-4ddf-b065-eea729d54da1 2337b7a2b5e34752a86ab5d61c525327 0530bca34f3f4d2c92a3495d56a1e065 - - default default] [instance: d5cf4b01-d97d-4395-8dd2-c4930b57e0c9] Increasing downtime to 50 ms after 0 sec elapsed time during the start of the migration10:49
tafkamaxwhat downtime is it? because https://docs.openstack.org/nova/2025.1/configuration/config.html#libvirt.live_migration_downtime <- this downtime starts at 500ms10:49
tafkamaxAnd it is constantly hovering at 1% and 0% memory remaining and takes ~20 minutes to finish10:50
tafkamaxAha it completed faster after changing  live_migration_downtime_delay from default 75 to 2510:55
stephenfintafkamax: That happens if you have a busy VM. If there are a lot of writes to memory in the VM, the writes can happen faster than libvirt can sync things to the destination10:55
stephenfinYou want to look at live_migration_downtime_delay, live_migration_permit_post_copy and live_migration_permit_auto_converge (this one particularly)10:56
* tafkamax sent a code block: https://matrix.org/oftc/media/v1/media/download/AcXSe0uulLyz53TnNkkjWT7nZZIsiyEKA6P9odmnIybwFrUkxqsDakmiHf_wKkU129PnuVjpKmu_goKawiqiJjdCeec2DAgAAG1hdHJpeC5vcmcvVFpITGdxSlpOeXJtTk5XQ3JPYnJlaWl310:56
tafkamaxI have already setup the auto_converge and post_copy10:56
tafkamax^ the logs, the first downtime ~11:43 to 12:03 had the default 75seconds10:56
tafkamaxchanging to 25seconds in ~13:42 made the downtime change faster and then it completed faster10:57
tafkamaxwhat about live_migration_downtime and live_migration_downtime_steps though?10:58
tafkamaxor is 500 the max and it starts from 50ms?10:58
tafkamaxI will now move steps from 10 to 611:06
tafkamaxok seems I've got the hang of how it calculates it now11:18
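The stepping discussed above can be sketched as follows. This is a standalone approximation of how nova's libvirt driver appears to ramp the permitted downtime. It is modelled on the log line "Increasing downtime to 50 ms after 0 sec" and on the documented [libvirt] options. The function name and the data_gb parameter (GiB of guest RAM plus disk to transfer) are assumptions for illustration, not a copy of nova's code.

```python
def downtime_schedule(max_downtime_ms, steps, delay_per_gb_s, data_gb):
    """Yield (seconds_elapsed, allowed_downtime_ms) pairs.

    max_downtime_ms ~ live_migration_downtime
    steps           ~ live_migration_downtime_steps
    delay_per_gb_s  ~ live_migration_downtime_delay
    data_gb         ~ assumed amount of data to transfer, in GiB
    """
    # The wait between steps scales with the amount of data to move.
    delay = int(delay_per_gb_s * data_gb)
    # Start at max/steps and climb linearly until the maximum is reached.
    base = max_downtime_ms / steps
    offset = (max_downtime_ms - base) / steps
    for i in range(steps + 1):
        yield (int(delay * i), int(base + offset * i))


if __name__ == "__main__":
    # Defaults from the log: downtime 500 ms, 10 steps, 75 s delay, 1 GiB.
    for elapsed, downtime in downtime_schedule(500, 10, 75, 1):
        print(f"after {elapsed:4d} s allow {downtime} ms")
```

Under these assumptions the first step allows 50 ms at 0 seconds and the maximum of 500 ms is only reached after steps times the scaled delay. That is why lowering live_migration_downtime_delay from 75 to 25 let the migration finish sooner.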
gibifyi cores, IOThreads in Gazpacho breaks the live migration of pre-existing VMs with pinned CPUs https://bugs.launchpad.net/nova/+bug/2152697 11:53
gibiit is due to a wrong assumption that iothreadpin always exists in the XML11:54
gibihttps://github.com/openstack/nova/commit/53a613d9948826ec9a4cd4a502f7a5d1b2dc87d7#diff-1f1f3f935853a67b1239220cd9f8c28278734732c5c28e8d238f62860391b1ecR27311:54
*** haleyb is now known as haleyb|away12:38
opendevreviewThibaut Démaret proposed openstack/nova master: libvirt: add disk rotation_rate support for local disks  https://review.opendev.org/c/openstack/nova/+/97969312:49
opendevreviewThibaut Démaret proposed openstack/nova master: libvirt: add disk rotation_rate support for local disks  https://review.opendev.org/c/openstack/nova/+/97969313:15
sean-k-mooneygibi: didn't we fix that?13:23
sean-k-mooneyi guess we never backported it?13:23
gibisean-k-mooney: we fixed the pinned cpu case and fixed the live migration case but not the live migration of a pre-existing VM with pinned cpu case :D13:26
gibithe live migration fix introduced the unconditional check for the iothreadpin xml tag13:27
gibiwhich only exists with VMs that are created (or rebooted) since the IOThread feature13:27
sean-k-mooneyi'm surprised our multinode grenade job didn't pick that up13:37
sean-k-mooneybut ok that should not be hard to fix 13:37
sean-k-mooneywe just need to gate that code on the xml having it13:37
gibiI guess our grenade does not have pinned VMs13:43
sean-k-mooneyoh is this only for pinned vms13:43
sean-k-mooneyoh no13:43
sean-k-mooneyit's not pinned exactly13:43
sean-k-mooneyit's that we are not using cpu_shared_set or cpu_dedicated_set13:43
sean-k-mooneyso since we are not using either of them we never generate the element13:43
gibinope13:44
sean-k-mooneycpu_shared_set would be enough to catch this i think13:44
sean-k-mooneygibi: are you planning to work on a patch to make it conditional?13:44
gibiI'm working on a functional reproducer13:44
sean-k-mooneyack cool i'm happy to review both so add me or ping me when it's up13:45
gibiduring live migration we update the src XML, for a pre-existing VM, the src XML does not have any iothreads or iothreadpin field13:45
sean-k-mooneyyep we have had this type of issue before13:45
sean-k-mooneyi think we have existing functional tests that simulate something similar13:46
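The idea of gating the update on the element being present can be sketched as below. This is a minimal illustration using the standard library XML parser, not nova's _update_numa_xml. The function name, the sample XML, and the cpuset values are hypothetical; only the libvirt element name iothreadpin and its cpuset attribute come from the discussion and the libvirt domain format.

```python
import xml.etree.ElementTree as ET


def update_iothreadpin(domain_xml, cpuset):
    """Rewrite the iothreadpin cpuset only when the element exists."""
    root = ET.fromstring(domain_xml)
    pin = root.find("./cputune/iothreadpin")
    if pin is None:
        # Guest defined before the IOThread feature: nothing to update,
        # so return the XML untouched instead of failing.
        return domain_xml
    pin.set("cpuset", cpuset)
    return ET.tostring(root, encoding="unicode")


# Hypothetical source XML for a guest created before the IOThread feature.
OLD_GUEST = "<domain><cputune><vcpupin vcpu='0' cpuset='2'/></cputune></domain>"
# Hypothetical source XML for a guest created (or rebooted) after it.
NEW_GUEST = (
    "<domain><iothreads>1</iothreads><cputune>"
    "<vcpupin vcpu='0' cpuset='2'/>"
    "<iothreadpin iothread='1' cpuset='4-5'/>"
    "</cputune></domain>"
)

if __name__ == "__main__":
    print(update_iothreadpin(OLD_GUEST, "6-7"))
    print(update_iothreadpin(NEW_GUEST, "6-7"))
```

The old guest passes through unchanged, which matches the choice made later in the log not to add an iothread during live migration.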
opendevreviewBalazs Gibizer proposed openstack/nova master: Reproduce bug/2152697  https://review.opendev.org/c/openstack/nova/+/98876913:51
gibisean-k-mooney: this is the reproducer ^^13:52
sean-k-mooneyam `del list(conn._vms.values())[0]._def["iothreads"]`13:53
sean-k-mooneyok i see why that works13:53
sean-k-mooneythe list() call is converting the iterator into a list that contains references to the actual objects in the _vms dict13:54
sean-k-mooneyso we are deleting through the list13:54
sean-k-mooneycould we do this a little differently13:55
sean-k-mooneyvm_domain = next(conn._vms.values())13:55
gibiwe can use next yes13:55
sean-k-mooneydel vm_domain._def["..."]13:56
sean-k-mooneyi can see why what you have works on a second or third reading but it's a little non-obvious13:56
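A small sketch of the two spellings discussed above, using a stand-in class rather than the real fake libvirt connection from nova's functional tests (FakeDomain and the dict contents are hypothetical). One detail worth noting: in Python 3, calling next() directly on dict.values() raises TypeError because a values view is iterable but is not an iterator, so the suggested form needs iter() around it.

```python
class FakeDomain:
    """Hypothetical stand-in for a fake libvirt domain with a _def dict."""

    def __init__(self):
        self._def = {"iothreads": 1, "vcpus": 2}


vms = {"uuid-1": FakeDomain()}

# Original form: list() copies references, not objects, so the delete
# mutates the domain that still lives in the dict.
del list(vms.values())[0]._def["iothreads"]
assert "iothreads" not in vms["uuid-1"]._def

# next() on the view itself fails: dict_values is not an iterator.
try:
    next(vms.values())
    raised = False
except TypeError:
    raised = True

# Wrapping the view in iter() gives the more readable two-step form.
vm_domain = next(iter(vms.values()))
del vm_domain._def["vcpus"]
```

Both forms touch the same underlying object; the second just names it first, which is the readability point being made.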
sean-k-mooneygibi: overall the reproducer is more or less doing what i expect13:57
sean-k-mooneyas noted however this should also manifest if you have just set cpu_shared_set and are not using dedicated13:57
sean-k-mooneyso it might be nice to add a second test for floating vms13:57
sean-k-mooneyotherwise this looks like a reasonable reproducer. it was more or less what i was expecting you to write13:58
gibiI tried an instance with shared CPUs before and it did not hit the same issue 13:59
gibithe problematic code is under a condition 13:59
gibihttps://github.com/openstack/nova/commit/53a613d9948826ec9a4cd4a502f7a5d1b2dc87d7#diff-1f1f3f935853a67b1239220cd9f8c28278734732c5c28e8d238f62860391b1ecR260-R26213:59
gibiabout pinning13:59
sean-k-mooneywe do pin shared cpus14:00
sean-k-mooneyif and only if cpu_shared_set is defined14:00
sean-k-mooney oh14:01
sean-k-mooneyactually in that case we only set vcpu cpuset=...14:01
sean-k-mooneyso we pin it but we don't generate the element14:02
sean-k-mooneyok then ya i see this would only impact pinned vms that were created before iothread14:02
opendevreviewBalazs Gibizer proposed openstack/nova master: Fix live migration with pinned VM and iothreads  https://review.opendev.org/c/openstack/nova/+/98877414:05
sean-k-mooneygibi: +2 on the reproducer +1 on the fix. i'll see if i can try and test this on monday and loop back to the review after the ci has run14:08
sean-k-mooneygibi: we might want to add a release note but that's my only feedback really right now14:08
sean-k-mooneymaybe also a unit test for the branching behavior in the migration module14:08
sean-k-mooneyjust to test _update_numa_xml14:09
gibiyeah I can do that14:09
opendevreviewBalazs Gibizer proposed openstack/nova master: Reproduce bug/2152697  https://review.opendev.org/c/openstack/nova/+/98876914:17
opendevreviewBalazs Gibizer proposed openstack/nova master: Fix live migration with pinned VM and iothreads  https://review.opendev.org/c/openstack/nova/+/98877414:17
opendevreviewBalazs Gibizer proposed openstack/nova master: Fix live migration with pinned VM and iothreads  https://review.opendev.org/c/openstack/nova/+/98877414:20
gibinow with reno and unit test :)14:20
gibi^^ cc dansmith as we chatted about it downstream14:21
sean-k-mooneygibi: so one nuance about your fix14:24
sean-k-mooneyit will work but you intentionally chose not to add an iothread to the running instance14:24
sean-k-mooneyon live migrate14:24
sean-k-mooneywe could also do that perhaps in a followup patch14:25
sean-k-mooneyi'm personally fine with not doing that and just saying you need to reboot to get the iothread, just noting that qemu/libvirt technically support adding and removing them on a running instance14:26
gibiI think it is safer not to try to add it during live migration14:27
sean-k-mooneyya that was my feeling too14:27
sean-k-mooneyjust wanted to raise the possibility14:27
gibiso if somebody explicitly wants this then that is a new small feature to me14:28
sean-k-mooneycool well still +1 lets see what ci says and ill loop back14:29
sean-k-mooneygibi: nice find by the way, did you just come across it or was it something you noticed in a ci failure or something14:31
sean-k-mooneyoh you were backporting this downstream right14:31
gibiyepp I tested my backport downstream 14:31
gibithere it is simpler to test the upgrade 14:31
gibiby just changing the nova-compute container image14:32
gibias we had bugs before with live migration and pinned CPUs, I tested the combination of those :)14:32
sean-k-mooneyack i need to revive and rewrite my ci change14:32
opendevreviewMasanori Ueno proposed openstack/nova master: DNM: Add functional test for NUMA live migration overcommit bug  https://review.opendev.org/c/openstack/nova/+/98877714:33
sean-k-mooneythat enabled cpu pinning upstream14:33
gibiyeah I have a todo to add some iothread tests to whitebox14:33
sean-k-mooneyi never got around to making the calculation of cpu_shared_set and cpu_dedicated_set dynamic14:33
sean-k-mooneywhitebox would be good too ya14:33
*** gibi is now known as gibi_off14:46
dansmithgibi_off: sean-k-mooney: but... people might want to use live migration to get iothreads without a guest restart16:38
dansmithI mean, I know the risk is higher but.. I _know_ people will want that :D16:38
sean-k-mooneydansmith: that's why i brought it up. it would be nice if that worked but that is why i was suggesting perhaps a followup17:10
sean-k-mooneyi know there is an api to add and remove them at runtime17:10
sean-k-mooneybut i don't know if you can do this when live migrating17:10
dansmithack17:11
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org will be offline momentarily while we restart it onto a new patch release18:04
melwittgmaan: sorry to bug you with this but I'm trying to close out a quota bug fix I worked on awhile back that has got one +2 https://review.opendev.org/q/topic:%22bug/2131272%22 the CI results are old but I just rechecked it, if you might have a chance to look at it. it is not urgent19:37
gmaanmelwitt: ack, I will take a look19:45
melwittthanks gmaan 19:46
gmaanmelwitt: one question. do we allow setting user quota > project quota OR setting project quota < user quota? if yes then that is the problem, right? 20:26
gmaanso this is case when both are equal and multi users. got it but just wondering if we handle the above case ^^ (i assume yes)20:30
*** jcosmao is now known as Guest950820:44
melwittgmaan: I think it is not allowed to set an individual user's quota to be larger than the quota of their project. And I don't think it's allowed to set project quota lower than an individual user's quota in the project but I am not as sure about that one. /me checks if we have tests for this and if not, it's probably worth adding a couple20:49
gmaanyeah, I was thinking of having tests for those cases. but anyway that is a separate thing that came to mind. bug#2131272 fix lgtm, waiting for CI results20:53
melwittgmaan: yeah I think it's a good point. I might add the tests to this patch if I find we don't already have them, since we are waiting anyway20:57
melwittI would think they should be easy small func tests that just try to set the quota and assert it's rejected in those scenarios20:58
opendevreviewmelanie witt proposed openstack/nova master: Fix swap disk creation skipped on NFS during cold migration  https://review.opendev.org/c/openstack/nova/+/98854721:23
gmaan++ thanks21:24
opendevreviewmelanie witt proposed openstack/nova master: Use tempest_concurrency=1 for nova-vtpm job  https://review.opendev.org/c/openstack/nova/+/98486421:27
opendevreviewMerged openstack/nova master: Reproducer for bug 2131272  https://review.opendev.org/c/openstack/nova/+/96714721:42
melwittI have a crazy old 4 character bug fix with one +2 on it if anyone wants an easy review https://review.opendev.org/c/openstack/nova/+/90415521:49
opendevreviewmelanie witt proposed openstack/nova master: Fix usage count when user-scoped quota is set  https://review.opendev.org/c/openstack/nova/+/96714822:25
gmaan+w22:31
melwittthanks gmaan !23:07
opendevreviewMerged openstack/nova master: Implementing get_num_instances for ironic virt driver  https://review.opendev.org/c/openstack/nova/+/95568523:10
