Tuesday, 2025-03-11

opendevreviewsuiong ng proposed openstack/nova stable/2024.2: Fix parameter order in add_instance_info_to_node  https://review.opendev.org/c/openstack/nova/+/94399401:12
opendevreviewAmit Uniyal proposed openstack/nova master: DNM: enable NFS backend job  https://review.opendev.org/c/openstack/nova/+/94400806:19
opendevreviewAmit Uniyal proposed openstack/nova master: DNM: enable NFS backend job  https://review.opendev.org/c/openstack/nova/+/94400806:25
opendevreviewAmit Uniyal proposed openstack/nova master: DNM: enable NFS backend job  https://review.opendev.org/c/openstack/nova/+/94400808:02
opendevreviewAmit Uniyal proposed openstack/nova master: DNM: enable NFS backend job  https://review.opendev.org/c/openstack/nova/+/94400808:15
opendevreviewAmit Uniyal proposed openstack/nova master: DNM: enable NFS backend job  https://review.opendev.org/c/openstack/nova/+/94400808:28
ralonsohhello folks, where exactly in Nova compute the VF is created and the VLAN ID defined?08:32
ralonsohcan you point me to the code?08:32
ralonsohI'm talking about SR-IOV ports08:32
ralonsohbauzas, sean-k-mooney ^08:40
sean-k-mooney[m]the vlan id is compute by neturon08:41
sean-k-mooney[m]we just apply the value ye provide to the vf08:41
ralonsohsean-k-mooney[m], the VLAN ID is set by Neutron?08:41
sean-k-mooney[m]no its set by nova08:42
ralonsohno no, I know the VLAN info is provided by Nova08:42
ralonsohbut where this VLAN ID is set in the VF?08:42
sean-k-mooney[m]but we use the value form the neutron network accosated with the neutron port08:42
ralonsohby Neutron* (not Nova)08:42
sean-k-mooney[m]its set by nova in the libvirt xml08:42
ralonsohok, this is what I was looking for08:43
sean-k-mooney[m]ill find the code one sec08:43
ralonsohthanks!08:43
sean-k-mooneyso our LibvirtConfigGuestInterface has a vlan property https://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L1912 which is rendered into the xml for type direct and hostdeve  vnic_type=direct is hostdev in this case and vnic_type=macvtap is direct...08:46
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/config.py#L2015-L201908:46
ralonsohright, so the VLAN is set in the interface definition08:46
ralonsohthanks a lot!08:47
sean-k-mooneythat invloked form here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/vif.py#L630-L65708:48
ralonsohyes, this vif plug method I was aware of it08:49
sean-k-mooneyralonsoh: so yes it defiend in the libvirt defintion and it will program the VLAN filter via the PF before it assigned the VF to qemu08:49
ralonsohbut I didn't see where the VLAN was defined08:49
ralonsohbut is in the XML 08:49
ralonsohsean-k-mooney, ups, now you are here08:50
ralonsohqq about resource providers08:50
sean-k-mooneysure08:50
ralonsohin Neutron, if we add BW info for an agent (SRIOV, OVN, etc) and a network08:50
ralonsohNeutron will use the API to create the RP, the inventories, etc08:50
ralonsohwhat if we remove the RP?08:51
ralonsohI don't think we have considered that08:51
ralonsohat least in OVN, that reads the RP info not from the configuration but from the chassis (so there is no API restart)08:51
sean-k-mooneyso the design at the time considered removing an RP that has allcoation ot be operator error08:51
ralonsohso if there is an error in the configuration (for example, the physnet name is incorrect)08:52
ralonsohdo we need to manually remove the RP ?08:52
opendevreviewMaksim Malchuk proposed openstack/nova stable/2024.2: Allow hyphen in cinder catalog_info service-type  https://review.opendev.org/c/openstack/nova/+/94400908:52
ralonsohah ah, so if there is allocation, it will fail08:52
sean-k-mooneyso it depends. we proposed but did not implemtne the idea of owner triats08:53
sean-k-mooneythe idea was that neutron or nvoa could add an owner trait to all RPs it owns08:53
ralonsohwe use custom names for the physnets08:54
sean-k-mooneythat way it could check fi ther ewere any that were created that should not exist08:54
ralonsohhmmm I think this "deletion" operations are going to be more difficult than expected08:54
opendevreviewMaksim Malchuk proposed openstack/nova stable/2024.1: Allow hyphen in cinder catalog_info service-type  https://review.opendev.org/c/openstack/nova/+/94401008:54
sean-k-mooneyralonsoh: right so you use thing slik same subtree 08:54
ralonsohyes, the host->agent subtree08:55
ralonsohso we "own" this RP08:55
sean-k-mooneyya so you can get all of the sub RPs in the subtree that match your nameign convention08:55
opendevreviewMaksim Malchuk proposed openstack/nova stable/2023.2: Allow hyphen in cinder catalog_info service-type  https://review.opendev.org/c/openstack/nova/+/94401108:55
sean-k-mooneyand delete any that shoudl not exist08:55
sean-k-mooneyif you have a OWNER_NEUTRON trait on them its safer08:55
sean-k-mooneyas even if there are two service usign the same naming convetion you can look at only the ones you created08:56
sean-k-mooneythat makes deleting it simper08:56
ralonsohyeah, this could be a good improvement for sure08:56
sean-k-mooneyfor now however you know they starte with or contain phsenet right08:56
ralonsohthat will mark our RP with no doubt08:56
ralonsohsean-k-mooney, and the agent RP, that is handled by Neutron08:57
sean-k-mooneyso you can must look at the RPs in the subtree for the agent08:57
sean-k-mooneyand remove any that are not in you current config.08:57
ralonsohok, I think I know (maybe) how to do this08:57
ralonsohreference: https://issues.redhat.com/browse/OSPRH-1459708:57
ralonsohI'll open a LP bug too now08:57
sean-k-mooneyif they have allcoation it indicates that the removal migt be in error08:57
ralonsohyes, this is important: we can't remove used RP08:58
sean-k-mooneyi trhink placement blocks it by default but i dont recall 08:58
sean-k-mooneywhen we are removign compute nodes we loop overa ll the allcoation on the RP and delete them first08:58
ralonsohyes, you are right08:58
sean-k-mooneythen delete the rp08:58
ralonsohin any case, I'll try this first08:59
sean-k-mooneyralonsoh: if you can tell there is a rename09:00
sean-k-mooneyyou can also do what is called a reshape09:00
sean-k-mooneywhish is the operation of moving allcoation between RPs09:00
sean-k-mooneythey are non tivial but its how we evovled the stucture over time09:01
sean-k-mooneyhttps://docs.openstack.org/api-ref/placement/#id8609:01
sean-k-mooneyso for a rename you would add the new rp, reshape the inventories/allocation from the old then delete the old RP09:02
ralonsohI don't think we need to implement this reshape (I think)09:03
ralonsohjust remove the RP that is un-configured09:03
sean-k-mooneyya in this case you dont09:03
sean-k-mooneyat least not for that jira09:03
sean-k-mooneyjust letting you know that api exists if you need it in the future09:03
ralonsohfor sure, good to know this!09:04
sahido/09:51
sahidquick question guys, is there a way to stop provision on a specific AZ, or at least a way to do it09:52
sahidI'm sure there is something but I can't remember how09:52
sean-k-mooneysahid: no09:54
sean-k-mooneysahid: not in general at least09:54
sean-k-mooneyyou could proably hack it usign some of the filters09:54
sahidI think it should be an interesting feature, don't you think?09:56
sean-k-mooneyin general no :)09:56
sean-k-mooneywe could add someign like disablign an az09:57
sean-k-mooneyit has a staus but it does nothing09:57
sahidahah ...I hould have bet for your response :-)09:57
sean-k-mooneytoday the status fo an az is just if any compute service is up and not disabled09:57
sean-k-mooneysahid: why do you want to disable it?09:58
sahidsean-k-mooney: it's to validate the resilience if we lose an AZ10:00
sahidthey do such check for everything10:00
sean-k-mooneyAZ are not for fault tollerance10:01
sean-k-mooneyso if your usign them for that you are starting form a broken foundation10:01
sahidit's not the point10:01
sean-k-mooneyallowing an az to be disabled externally woudl be an api change and need a spec. im not entirly sure that a good thing but its not terible10:05
sean-k-mooneytoday you could jsut disbale the computes theat are in an az to get the same effect10:05
sahidsean-k-mooney: it's basically what we are going to do, but they are lot of computes to disable I was trying to think about something else10:07
sahidi don't say it's something nova wants, perhpas it's something useful only for us10:08
sean-k-mooneythe problem is really what are the semantics10:08
sean-k-mooneydo we filter it form the list of AZs10:08
sean-k-mooneydo we impact any lifecycle operation on hosts assocated with the az10:09
sean-k-mooneyi.e. disabel reboot? start?10:09
sahidfrom our use case it is just about to stop provisoning on a specific az10:10
sahidwich may beuseful for upgrade or deployment if we can do that easily10:10
sean-k-mooneyi feel like this would be a new trait added to all host of a given az and an prefilter to forbid that trait10:10
sahidsean-k-mooney: in that way we don't need an API change, right?10:16
sean-k-mooneyno you still do if you want a api call to be able to disable the az10:17
sean-k-mooneyif you just want to add the trait manually you would not need an api change10:17
sean-k-mooneyyou can do that with the aggreate set commands i belive10:17
sahidyes, probably having an API call is better, your right10:18
sahidthan doing some tweaks on resources10:18
sahidshould I start to think about it, it is something that nova would be interested?10:19
sean-k-mooneyit looks like we never merged the helper command to set a trait on all RPs in an aggreate10:19
sean-k-mooneyso you would have ot loop over all of them and set it one by one without an api call on the nova side10:20
sean-k-mooneywhich is the same or more work then disabling all the computes10:21
sahidyes that is why having an api would make more sense if we go in that direction, but there is perhaps a different solution, to indicate to the scheduler that this az is disabled10:23
sahidas it has to go throught all the aggregates10:23
sean-k-mooneyso the schduler does not do anything with AZs anymore directly, the az filter was remvoed 3 releases ago10:30
sean-k-mooneywe have had the abity to do az enforcement in placment for 3-4 years at this point10:30
sahidoh i did not noticed that10:31
sean-k-mooneythe prefilter was added in rocky, became the default in xena and the filter was removed in bobcat10:32
sean-k-mooneysahid: that why im saying this woudl jsut be a trait on the RPs10:32
sean-k-mooneyand we woudl usdate the pre-filter to add it as a forbiden trait10:32
sean-k-mooneyit would work the same way the compute status pre-filter does https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter.py#L242-L25510:34
sahidyes.. that does not really make sense to loop over all the RPs, if we really have to do that I guess we can just create an API to disable all host per AZ10:36
sahidbut this is more something like orchestration10:36
sahidso that does not make sens to have it in nova aswell i guess10:36
opendevreviewSylvain Bauza proposed openstack/nova master: Add service version for Epoxy  https://review.opendev.org/c/openstack/nova/+/94401710:40
opendevreviewSylvain Bauza proposed openstack/nova master: Update min support for Flamingo  https://review.opendev.org/c/openstack/nova/+/94401810:40
opendevreviewSylvain Bauza proposed openstack/nova-specs master: Move Epoxy implemented specs  https://review.opendev.org/c/openstack/nova-specs/+/94401910:49
opendevreviewsean mooney proposed openstack/nova master: only show standard image properties in server show.  https://review.opendev.org/c/openstack/nova/+/94241311:29
sean-k-mooneygibi: ^ fixed the remaining sample tests for 2.10011:30
andrewbonneyHi all. If anyone has chance I'd appreciate a follow up review on a long-standing small bug fix: https://review.opendev.org/c/openstack/nova/+/91996111:37
gibisean-k-mooney: cool, thanks, +211:45
sean-k-mooneyandrewbonney: the core of the change looks ok. i dont think there is really enough tetst coverage to prevent regressing this, would you mind add a functional test that assert the correct behavior when you shleve?11:51
andrewbonneyI'll take a look, thanks11:52
opendevreviewTakashi Kajinami proposed openstack/nova master: doc: Drop deprecated [api] auth_strategy  https://review.opendev.org/c/openstack/nova/+/94402812:09
opendevreviewTakashi Kajinami proposed openstack/nova master: doc: Remove non-existent [service_user] auth_strategy  https://review.opendev.org/c/openstack/nova/+/94402912:09
opendevreviewTobias Urdin proposed openstack/nova master: Add minute to instance_usage_audit_period  https://review.opendev.org/c/openstack/nova/+/94148813:10
*** haleyb|out is now known as haleyb13:30
opendevreviewAndrew Bonney proposed openstack/nova master: Don't reset port dns_name when shelving instances  https://review.opendev.org/c/openstack/nova/+/91996113:56
andrewbonneysean-k-mooney: ^ hopefully that's better. Will see what CI thinks, but tests work locally13:58
sean-k-mooneyandrewbonney: ya assoming that funtional test passes it actully assert the behavior you wanted to change14:00
sean-k-mooneylets see if ci is happy14:00
sean-k-mooneyandrewbonney: thanks for the update14:00
sean-k-mooneyandrewbonney: im a little conferned with how you implemeneted the list_exteions overried by the way14:01
andrewbonneyHappy to take advice - that's a copy from something similar in a regression test14:02
sean-k-mooneyi think you are only modifyign the neutron fixture that is local to this test14:02
sean-k-mooneybut its very close to a patten that can leak changes between tests14:02
sean-k-mooneyandrewbonney: https://github.com/openstack/nova/blob/a329c103cbc864591f232b195016407e407bdb5d/nova/tests/functional/regressions/test_bug_1888395.py#L32-L49 ya its probaly fine14:03
andrewbonneyThat's the one14:03
sean-k-mooneyi wrote that so if its wrong its on me anyway :)14:05
andrewbonney:)14:05
opendevreviewMerged openstack/python-novaclient stable/2025.1: Update .gitreview for stable/2025.1  https://review.opendev.org/c/openstack/python-novaclient/+/94376214:09
opendevreviewDan Smith proposed openstack/os-traits master: Add HW_PCI_ONE_TIME_USE trait  https://review.opendev.org/c/openstack/os-traits/+/94404914:09
opendevreviewMerged openstack/python-novaclient stable/2025.1: Update TOX_CONSTRAINTS_FILE for stable/2025.1  https://review.opendev.org/c/openstack/python-novaclient/+/94376314:11
opendevreviewMerged openstack/nova master: Add service version for Epoxy  https://review.opendev.org/c/openstack/nova/+/94401714:23
opendevreviewDmitriy Chubinidze proposed openstack/nova master: doc: nova service_user wrong auth_url  https://review.opendev.org/c/openstack/nova/+/93868014:34
dansmithsean-k-mooney: gibi: do we allow passing through a PF like a whole device? it seems like that's probably a bad idea, unless we check that a PF with VFs has none of those VFs available for use by nova, right?14:35
sean-k-mooneyyes we do14:35
sean-k-mooneywithout pci in placement we only allow pfs if non of the VFs are allocated14:35
sean-k-mooneywith pci in placment you can only list the PF or the VFs14:36
sean-k-mooneybut not both for the same device14:36
dansmithallocated...currently? or available for allocation at all?14:36
sean-k-mooneyallcoated currently14:36
dansmithwow, okay14:36
sean-k-mooneyit was a usecase that telcos really wanted14:36
dansmitheither way, if pci in placement has a strict either-or rule, that helps14:36
sean-k-mooneyya to make the accounting work we could not have it chaging dynmicaly14:37
dansmithcan I tell that something is a virtual function just because it has the physfunc (or whatever) in its /sys directory?14:37
sean-k-mooneyyes14:37
dansmithcool14:37
sean-k-mooneywe look at the pci capablites which are reported in the nodedevxml14:37
gibidansmith: yep PCI in Placement has a strict either or check14:37
dansmithcool14:37
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L1373-L143014:38
sean-k-mooneythat is how th elibvirt driver decied what type of device it is14:38
gibihttps://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/tests/functional/libvirt/test_pci_in_placement.py#L206-L239 this is the functional test proving that nova rejectes dependent devices if pci in placement is enabled 14:39
opendevreviewsean mooney proposed openstack/nova master: only show standard image properties in server show.  https://review.opendev.org/c/openstack/nova/+/94241314:59
opendevreviewsean mooney proposed openstack/nova master: [tests] add priting of sample and templete paths  https://review.opendev.org/c/openstack/nova/+/94405414:59
sean-k-mooneydansmith: ^ i think thats what you wanted14:59
opendevreviewDan Smith proposed openstack/nova master: [tests] Add printing of sample and template paths  https://review.opendev.org/c/openstack/nova/+/94405415:05
dansmithsean-k-mooney: got 'em thanks15:05
dansmithbauzas: thanks, I need to quickly tweak that work item title if you can re-apply your +2 after I do, that would be appreciate15:28
dansmithd15:28
bauzasdansmith: sure15:28
opendevreviewDan Smith proposed openstack/nova-specs master: Add one-time-use-devices spec  https://review.opendev.org/c/openstack/nova-specs/+/94348615:29
dansmiththat ^ just makes the work item reflect the discussion and the pre-review on the PoC15:30
sean-k-mooneydansmith: just an fyi i have been putting off doing some ramp on horizon stuff for like 2 weeks so im tryign to spend the rest of the week doing that. ill loop back to the spec firday or early next week15:32
dansmithsean-k-mooney: no worries, we've made a lot of quick progress and I have lots of smaller items to do (like all the test coverage and startup checks) so I'm good for a while for sure15:32
dansmithsurely appreciate sean-k-mooney gibi and bauzas helping sanity check everything early so I can work on implementing the things15:33
bauzasdansmith: I'll try to do code review15:33
bauzas#startmeeting nova16:02
opendevmeetMeeting started Tue Mar 11 16:02:21 2025 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:02
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:02
opendevmeetThe meeting name has been set to 'nova'16:02
elodilleso/16:03
bauzas#link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting16:03
dansmitho/16:03
bauzasI guess we can start16:05
* gibi arrives lae16:06
gibi*late16:06
bauzas#topic Bugs (stuck/critical) 16:07
bauzas#info No Critical bug16:07
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:07
bauzasanything to discuss about bugs ?16:08
bauzaslooks not16:10
bauzas#topic Gate status 16:10
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:12
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:12
bauzasall periodics are green16:13
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:13
bauzas#info Please try to provide meaningful comment when you recheck16:13
bauzasanything about bugs ?16:14
Ugglao/16:17
bauzas#topic Release Planning 16:17
bauzas#link https://releases.openstack.org/epoxy/schedule.html16:17
bauzas#info Nova deadlines are set in the above schedule 16:18
bauzas#link https://etherpad.opendev.org/p/nova-epoxy-rc-potential 16:18
elodillesfyi the RC1 patches are already proposed:16:18
elodilleshttps://review.opendev.org/c/openstack/releases/+/94394116:18
elodillesand16:18
elodilleshttps://review.opendev.org/c/openstack/releases/+/94394716:18
elodilles(please add a -1 if something is in the queue to merge o:))16:20
bauzasnova-cores, please look at this etherpad, we have a number of patches to merge before we can propose RC1 16:20
bauzaselodilles: will do 16:20
bauzasand saw those16:20
elodillesthanks o/16:20
* bauzas running another meeting at the same time :(16:20
* bauzas definitely needs to pass the PTL baton to Uggla soon16:20
bauzasas a reminder, there is a section in the etherpad for adding bugfixes worth to be delivered 16:23
dansmitheven without the upcoming baton change, probably best to let someone else run the meeting if you have a conflict16:23
bauzasyeah it just went over for 24 mins :(16:24
bauzasanyway, moving on16:24
bauzasyou'll see myself speaking in the channel asking cores to review thinfs16:24
bauzas#topic PTG planning 16:25
bauzas#info Next PTG will be held on Apr 7-1116:25
bauzas#link https://etherpad.opendev.org/p/nova-2025.2-ptg16:25
bauzasand that's it16:25
bauzas#topic Stable Branches 16:25
bauzaselodilles: please 16:25
elodillesyepp16:25
elodillesnothing special to report:16:26
elodilles#info stable gates seem to be healthy16:26
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:26
bauzascool16:26
elodilles(and we have some stable/2025.1 already)16:26
bauzas#topic vmwareapi 3rd-party CI efforts Highlights 16:26
bauzasfwiesel seems not to be present16:26
bauzasmoving on then16:26
bauzas#topic Open discussion 16:26
bauzasnothing in the agenda16:27
bauzasanything anyone ?16:27
opendevreviewmitya-eremeev-2 proposed openstack/nova master: Add more thorough handling of evacuated instances  https://review.opendev.org/c/openstack/nova/+/94405816:30
bauzaslooks we can close the meeting16:30
bauzasthanks all16:30
bauzas#endmeeting16:30
opendevmeetMeeting ended Tue Mar 11 16:30:48 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:30
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-03-11-16.02.html16:30
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-03-11-16.02.txt16:30
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-03-11-16.02.log.html16:30
dansmithbauzas: ack on the previous going over, sorry I just assumed you meant you intended to be superman and run two meetings at once :P16:31
elodillesthanks o/16:31
bauzasdansmith: well, I really have no choices16:31
bauzashence myself leaving the hot seat, or I was about to burn 16:31
ratailormelwitt, could you please review https://review.opendev.org/c/openstack/nova/+/873901 and provide your feedback ?16:38
ratailorgibi, could you please remove -1 here https://review.opendev.org/c/openstack/nova/+/933107 and provide your feedback ?16:38
gibiratailor: done16:45
ratailorgibi, Thanks!16:45
dansmithgibi: if you're still around.. it looks to me like we don't do all the "is this a physical function?" checking if the devspec is a pattern instead of a single device17:13
dansmithI assume because there could be multiple things matched that have different characteristics17:14
dansmithso you think we need to examine all the matching ones so we can abort if your spec matches a VF?17:14
gibiauch. Yeah you are right, at that point we only have the dev_spec that can be wildcard and match to multiple devices. I'm not sure how the remote managed check below works then as that does some physical function check. Anyhow that means we cannot validate that input in at the device_spec parsing code and need to add it later in the pci_placemenet_translator17:16
dansmithgibi: it just punts for patterns17:16
gibiOK17:16
dansmithgibi: well, my translator code already skips VFs17:16
gibinot nice but meh17:16
dansmithmeaning, it won't reserve for VFs.. I could log something there though17:16
sean-k-mooneywe determin if its a phsyica function by looking at the libvirt nodedev xml when generiging the set of pci dicts and then we filter those by the devspec regexes ectra17:17
gibidansmith: yeah I think we can check in the translator that if the OTU is requested for a VF as that is clearly invalid. Currently your code ignoring it but that is missleading as the user expects that the OTU marked device will be reseved but it won't be17:17
gibiso I suggest to raise there and let the deployer know that what is requested not supported17:18
dansmithgibi: you mean the deployer.. the user can't see any of this17:18
dansmithbut okay17:18
gibiyeah the deployer17:18
gibisorry17:18
gibiI think we already have late input validation for the dependend device handling in the translator (when both the PF and its VFs are matched)17:18
dansmithokay17:19
gibihttps://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/compute/pci_placement_translator.py#L146-L15117:19
dansmithgibi: okay so that's where we try to add children, and explode if we already have the parent? so I can just check OTU there and refuse similarly I guess?17:21
sean-k-mooneyadd child is passed a pcidevce object so it could have also checked dev.dev_type17:21
dansmithbut no need to even check because if we're calling add_child and one_time_use=true, then fail right?17:22
sean-k-mooneyi mean you can fail there that pretty late17:24
dansmithright but, see above.. when we're checking the devspec early, we don't have enough into to make a determination for patterns17:24
sean-k-mooneydansmith: the devspec cant be used to tell if its a pf17:25
dansmith...right, I understand17:25
sean-k-mooneybut the device we get back form the driver alredy know if its a pf or not17:25
sean-k-mooneyi think you cna check in here https://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/pci/manager.py#L11117:26
gibiyeah you can check it in the pci manager as well. There we also have the both the dev and it's matching dev_spec17:28
sean-k-mooneyright roughtly here https://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/pci/manager.py#L13517:28
sean-k-mooneydev is the list of dicts form the virt driver and pci_dev_spec is the matched spec17:29
dansmithokay17:29
sean-k-mooneyso here the dict has been encreced by the tags form the devspec https://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/pci/manager.py#L14017:30
sean-k-mooneyand you can add a check before we add it to the devices list17:30
sean-k-mooneyyou will need to add onetime use here https://github.com/openstack/nova/blob/master/nova/pci/devspec.py#L419 as well17:31
gibiI would do it in the translator where you filter the VFs out that is the closest to the actual logic assuming no VFs are OTU17:31
sean-k-mooneythe translator is not the only place we do filtering but that an option yes17:32
gibithe OTU feature hard depends on the translator 17:32
gibiit acutally implementes the OTU reservation17:32
sean-k-mooneyya which is weiered to me because i expect this to really be done in the pci manager 17:33
gibias we cannot put it early in the dev_spec parsing, the second best thing for me would be to put the check to the same place where the reserveation logic is17:33
sean-k-mooneybut i get why its doen that way 17:33
dansmithgibi: yeah that feels a lot cleaner to me, just because it's so contextually-relevant there17:33
gibiif we implement OTU with the PCI tracker then I would agree with you, but we are going with implementing it in Placement17:33
gibidansmith: yepp17:34
sean-k-mooneyi kind of hate that we have any pci logic out soid eof the pci and hardware.py modules but that ship has sailed...17:34
gibisean-k-mooney: it is not hard to move the pci_placement_translator under the pci module17:38
opendevreviewBalazs Gibizer proposed openstack/nova master: Add functional reproducer for bug 2102038  https://review.opendev.org/c/openstack/nova/+/94406117:39
opendevreviewBalazs Gibizer proposed openstack/nova master: Multiple spec per PCI alias limitation  https://review.opendev.org/c/openstack/nova/+/94406217:39
sean-k-mooney^ is not valid config17:55
sean-k-mooneygibi: we can have two differntly named aliases refer to the same vedor_id and product_id17:56
sean-k-mooneybut you cant have 2 alises with the same name and differnt contett17:57
gibisean-k-mooney: you can :)18:00
gibinot with PCI in Placement but without it18:00
sean-k-mooneyit was a bug if it was ever there18:00
sean-k-mooneyit was never inteded to be supproted18:00
gibithe spec in the alias is a list 18:00
sean-k-mooneyyes18:00
sean-k-mooneybut that is so you can define multiple18:01
gibinope a single named alias can have a list of attached spec dicts18:01
sean-k-mooneyit was always ment to map to exactly one value18:01
sean-k-mooneythat not how it was intended to work18:01
gibiI have to drop now but we can discuss it tomorrow18:02
gibihttps://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/pci/request.py#L17218:03
gibithis is very nova appends more spec to the same alias18:03
gibis/very/where/18:03
* gibi disappears18:03
sean-k-mooneyya i just found that18:04
sean-k-mooneyi can tell you tha tthat is very problematic18:04
sean-k-mooneysriov live migraiton and the gpu live migration depend on that never happeinging18:05
sean-k-mooneyit looks like that was added before i started workign on nova in the very early days of pci support18:06
sean-k-mooneybut that breaks other assumtions made over the years18:06
sean-k-mooneyi guess it does not break sriov live migration because of the hotplug18:09
sean-k-mooneythe vgpu live migraiton does not expect this nor does pci in placment18:09
gibiyeah pci in placement explicitly forbid this, just not with the proper response hence the bug and reproduce above18:12
sean-k-mooney we need to add a check to prevent merging aliais with diffent resouce classes18:12
sean-k-mooneylike https://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/pci/request.py#L168-L17018:13
gibiwe have a clear reject in the pci in placement code already https://github.com/openstack/nova/blob/0d484ce37d86e989c8abdf57aec5e334f68206ef/nova/objects/request_spec.py#L504-L52818:13
gibijust the exception is not translated properly hence the 500 return from nova-api18:14
sean-k-mooneythat failign for a diffent reason18:14
sean-k-mooneythe check is vlaid there18:14
sean-k-mooneybut this really should be a config error in other case18:14
sean-k-mooneybut ya lets pick this up again tomorrow18:14
sean-k-mooneythere are several thinkg our config parsing has allwoed over the years that i woudl condier invliad an dthis is right at the top of the list for me18:15
sean-k-mooneyspecificly merging 2 alisie with a diffent product_id18:16
sean-k-mooneymerging none conflictign values is kind of ok18:17
sean-k-mooneybut the orign behavior is not in my opipion18:17
opendevreviewDan Smith proposed openstack/nova master: Support "one-time-use" PCI devices  https://review.opendev.org/c/openstack/nova/+/94381618:48
opendevreviewmitya-eremeev-2 proposed openstack/nova master: Add more thorough handling of evacuated instances  https://review.opendev.org/c/openstack/nova/+/94405820:46

Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!