Wednesday, 2026-05-06

<chandankumar> jgilaber_: Hello, I have tried to reproduce this bug https://bugs.launchpad.net/openstack-cyborg/+bug/2017513 But from the bug description, I am not sure whether the user is removing the pci device from the config or from the host directly, like unplugging the device. [08:50]
<chandankumar> In my reproducer, I removed the pci device from the config and then a similar traceback appears in the cyborg-agent log. [08:51]
<chandankumar> Feel free to take a look at the bug and triage accordingly. Thank you! [08:51]
*** jgilaber_ is now known as jgilaber [08:54]
<jgilaber> I understood it originally as unplugging the device [08:54]
<jgilaber> thanks for the reproducer! [08:56]
<jgilaber> I'm not sure what would be the best way to handle such cases [08:57]
<chandankumar> one more thing, we delete the device profile; you can see the error message, which is also not clear for the end user [08:57]
<jgilaber> ideally we would probably unbind the arq and send a new bind request to nova if an equivalent free device exists [08:57]
<jgilaber> the agent automatically deletes the device profile? [08:58]
<chandankumar> let me try with unbind and see what happens [09:00]
<chandankumar> jgilaber: I unbound the device https://paste.openstack.org/raw/bTWU8NVSsBOdJjRNbym3/ but in cyborg, hotplug of a new pci device into a running instance is not supported [10:19]
<chandankumar> we can create the bind but it will not have the device inside the vm [10:19]
<chandankumar> or I misunderstood the "send a new bind request to nova if an equivalent free device exists" part [10:20]
<jgilaber> hmm I guess that with the current integration any change requires a resize [10:25]
<sean-k-mooney> correct, it does [10:26]
<jgilaber> to be clear my previous comment was me speculating how we could address the bug, I did not expect that was currently supported [10:26]
<sean-k-mooney> device profiles should not be modified if they are referenced by a flavor [10:26]
<sean-k-mooney> and the only way to change the allocation of a vm is via resize [10:26]
<sean-k-mooney> i am still catching up on email from being away for a few days so we can chat more about this tomorrow [10:27]
<chandankumar> sure [10:27]
<jgilaber> sure no problem sean-k-mooney. For context, we're talking about an old bug we discussed briefly in yesterday's meeting [10:28]
<jgilaber> https://bugs.launchpad.net/openstack-cyborg/+bug/2017513 [10:28]
<sean-k-mooney> python 3.6 :) [10:28]
<sean-k-mooney> ya that's been a while [10:29]
<sean-k-mooney> this looks similar to a very old nova bug [10:29]
<sean-k-mooney> i think the pci driver is likely missing the protection we have on the nova side but ya we can revisit this once im back up to speed [10:30]
<sean-k-mooney> https://github.com/openstack/nova/commit/26c41eccade6412f61f9a8721d853b545061adcc https://github.com/openstack/nova/commit/284ea72e96604bdf16d1c5c4db47247334841b2f https://github.com/openstack/nova/commit/0208be629c3853863bcd49b8bdbe2b9889b85012 https://github.com/openstack/nova/commit/f37cdf0c4182103ad81dbf39188ff39955da3850 [14:04]
<sean-k-mooney> those are the nova patches related to the issue reported in https://bugs.launchpad.net/openstack-cyborg/+bug/2017513 [14:04]
<sean-k-mooney> https://bugs.launchpad.net/nova/+bug/1633120 https://bugs.launchpad.net/nova/+bug/1969496 and https://bugs.launchpad.net/nova/+bug/2115905 are the related nova bugs we had [14:05]
<sean-k-mooney> the tl;dr is if a device is referenced by an ARQ and that device is not in the whitelist or available on the host anymore we cannot remove the device from the db or placement until that ARQ is deleted and we should not do that automatically [14:07]
<sean-k-mooney> the admin needs to move or delete the vm or re-add the device [14:07]
<sean-k-mooney> we should complain very very loudly in the logs when the compute agent starts up in a misconfigured state but we dont generally want to make that an agent startup failure as that's a potential dos vector if we do [14:08]
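
The policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Cyborg's actual code: the Device class, its fields and the reconcile() helper are hypothetical names invented for this example. Devices that have disappeared from the config or host are removable only when no ARQ references them; otherwise they are retained and an error is logged, without raising, so the agent still starts.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple

LOG = logging.getLogger(__name__)


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a device record in the database."""
    address: str              # PCI address, e.g. "0000:3b:00.0"
    bound_arq: Optional[str]  # UUID of the ARQ referencing the device, if any


def reconcile(db_devices: Iterable[Device],
              discovered: Set[str]) -> Tuple[List[Device], List[Device]]:
    """Split devices that are no longer discovered into (removable, retained).

    A missing device that is still referenced by an ARQ is never removable:
    it is retained and reported loudly, but no exception is raised.
    """
    removable: List[Device] = []
    retained: List[Device] = []
    for dev in db_devices:
        if dev.address in discovered:
            continue  # still present on the host and allowed by the config
        if dev.bound_arq is None:
            removable.append(dev)
        else:
            LOG.error(
                "Device %s is no longer configured or present but is still "
                "referenced by ARQ %s; keeping it. Move or delete the "
                "instance, or re-add the device.",
                dev.address, dev.bound_arq)
            retained.append(dev)
    return removable, retained
```

Returning the retained list instead of raising keeps the decision about what to do next with the caller, which matches the point that a misconfigured host should not become an agent startup failure.
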
<jgilaber> I'm not sure from the bug report if it actually crashes the agent or it just logs the traceback in an ugly way [14:23]
<jgilaber> Chandan reproduced the error, we can ask him later/tomorrow [14:23]
<sean-k-mooney> in either case this should be handled gracefully and we should not attempt to delete the resource provider [14:30]
<sean-k-mooney> trying to delete it if it has allocations against it is a bug in our internal logic [14:30]
<sean-k-mooney> the protections in placement are really the last line of defence [14:30]
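
A minimal sketch of that idea, assuming a hypothetical placement client: FakePlacement, ProviderInUse and safe_delete_provider() are made up for this example and are not Cyborg or placement APIs. The caller checks for allocations before attempting the delete, and a refusal from the service is treated as a logged error rather than a crash.

```python
import logging
from typing import Dict

LOG = logging.getLogger(__name__)


class ProviderInUse(Exception):
    """Hypothetical error for a delete refused because allocations exist."""


class FakePlacement:
    """In-memory stand-in for a placement client (illustrative only)."""

    def __init__(self, allocations: Dict[str, Dict[str, dict]]):
        # resource provider UUID -> {consumer UUID: allocated resources}
        self.allocations = allocations

    def get_allocations(self, rp_uuid: str) -> Dict[str, dict]:
        return self.allocations.get(rp_uuid, {})

    def delete_provider(self, rp_uuid: str) -> None:
        if self.allocations.get(rp_uuid):
            raise ProviderInUse(rp_uuid)  # the service-side last line of defence
        self.allocations.pop(rp_uuid, None)


def safe_delete_provider(client: FakePlacement, rp_uuid: str) -> bool:
    """Delete a provider only if it has no allocations; never raise.

    Returns True if the provider was deleted, False if it was kept.
    """
    if client.get_allocations(rp_uuid):
        LOG.error("Resource provider %s still has allocations; not deleting",
                  rp_uuid)
        return False
    try:
        client.delete_provider(rp_uuid)
    except ProviderInUse:
        # An allocation appeared between the check and the delete.
        LOG.error("Resource provider %s is in use; delete was refused",
                  rp_uuid)
        return False
    return True
```

The explicit check is the internal logic; the except branch only covers the race where the service-side protection still has to step in.
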
