gibi | sean-k-mooney: when you are up, we can go through the open questions in the PCI tracking spec | 08:24 |
---|---|---|
ralonsoh | gibi, qq if you know this. For live-migration, should all libvirt hosts have the same "virsh secret-list"? | 09:10 |
gibi | ralonsoh: o/ good question. I guess you are looking at a situation with encrypted volumes | 09:14 |
ralonsoh | gibi, yes, I've deployed two nodes with devstack and ceph | 09:15 |
gibi | hm, I dont seem to find any code in nova that would move the secret | 09:18 |
gibi | I'm not sure how this works but seems like barbican is involved too | 09:27 |
gibi | ralonsoh: as melwitt took over the ephemeral encryption work from lyarwood I expect that she might know more about the volume encryption case as well | 09:31 |
ralonsoh | gibi, thanks, I'm going to try to deploy this env without encryption, if possible | 09:32 |
opendevreview | Jorhson Deng proposed openstack/nova master: Clear the ignore_hosts before starting evacuate https://review.opendev.org/c/openstack/nova/+/841089 | 09:42 |
sean-k-mooney | do you mean the ceph secret | 09:43 |
sean-k-mooney | if so we expect the operator to distribute that | 09:43 |
ralonsoh | sean-k-mooney, this is in a devstack installation | 09:43 |
sean-k-mooney | yep | 09:44 |
ralonsoh | so one compute is requesting its own secret id | 09:44 |
ralonsoh | and the other compute is asking for other | 09:45 |
ralonsoh | and I don't know if that should match the ceph.conf fsid | 09:45 |
sean-k-mooney | hum i could try and deploy this and see. but the devstack roles just copy the keyfiles and config to the compute https://github.com/openstack/devstack/blob/master/roles/sync-controller-ceph-conf-and-keys/tasks/main.yaml | 09:50 |
sean-k-mooney | then on the compute you set REMOTE_CEPH=True | 09:50 |
ralonsoh | sean-k-mooney, yes, that's set | 09:51 |
sean-k-mooney | https://github.com/openstack/devstack-plugin-ceph/blob/master/devstack/lib/ceph#L834-L841= | 09:51 |
sean-k-mooney | the secret config seems to be the same | 09:51 |
sean-k-mooney | https://github.com/openstack/devstack-plugin-ceph/blob/master/devstack/lib/ceph#L244-L259= | 09:52 |
ralonsoh | sean-k-mooney, yeah, it is. In any case, I'll deploy the second compute node again | 09:52 |
sean-k-mooney | ack | 09:53 |
sean-k-mooney | i think from looking at that that the secret and keyfiles should be the same on all hosts | 09:53 |
ralonsoh | sean-k-mooney, one question | 09:53 |
ralonsoh | in the first node, the fsid is 37... | 09:54 |
ralonsoh | and the "sudo virsh secret-list " is d5... | 09:54 |
ralonsoh | this is wrong... | 09:54 |
ralonsoh | pffff | 09:54 |
ralonsoh | sean-k-mooney, sorry, I've re-deployed the first controller | 10:17 |
ralonsoh | and CEPH has changed the /etc/ceph/ceph.conf fsid number | 10:18 |
ralonsoh | that means the virsh secret and the fsid are different now | 10:18 |
sean-k-mooney | yes, it will change every time you stack | 10:21 |
ralonsoh | sean-k-mooney, but what ID should I use now? | 10:22 |
sean-k-mooney | ralonsoh: on a different topic has anyone raised support virtio-failover in neutron | 10:23 |
sean-k-mooney | ralonsoh: likely the one for the controller | 10:23 |
ralonsoh | sorry what? | 10:23 |
sean-k-mooney | that basically answers my question :) virtio-failover is like a form of automatic bonding | 10:23 |
ralonsoh | ah, not that I'm aware of | 10:24 |
sean-k-mooney | you can declare one virtio device as a failover device if the primary loses connectivity | 10:24 |
ralonsoh | in any case, I'll check it later | 10:24 |
ralonsoh | nope, that is usually done inside the VM | 10:24 |
sean-k-mooney | no worries im 99% sure that its not supported | 10:24 |
ralonsoh | sean-k-mooney, last q | 10:24 |
ralonsoh | about this fsid | 10:25 |
sean-k-mooney | ralonsoh: virtio failover allows qemu to ask the guest virtio driver to do the failover automatically in the guest | 10:25 |
sean-k-mooney | ralonsoh: sure | 10:25 |
ralonsoh | if "virsh secret-list" is one number | 10:25 |
ralonsoh | and in ceph.conf I have another one | 10:25 |
ralonsoh | then which one should I use in the second compute? | 10:25 |
sean-k-mooney | the secret uuid is the cinder_ceph_uuid | 10:28 |
ralonsoh | yes, and it's generated by devstack-ceph | 10:28 |
ralonsoh | CEPH_FSID=$(uuidgen) | 10:28 |
sean-k-mooney | yep so you could double check the cinder config | 10:28 |
sean-k-mooney | and determine which is the correct uuid value | 10:28 |
sean-k-mooney | ill quickly deploy with ceph and see if we can compare | 10:29 |
gibi | Uggla: responded in the manila spec | 10:33 |
sean-k-mooney | gibi: im just going to grab coffee but we can chat about the pci spec when ever suits | 10:38 |
sean-k-mooney | ill be back in 10 mins | 10:39 |
gibi | I will grab lunch so I will ping you later | 10:45 |
sean-k-mooney | cool no rush. | 10:47 |
gibi | sean-k-mooney: OK, I'm back | 11:33 |
gibi | sean-k-mooney: so the first question is simple. Will nova create the custom resource class mentioned in the [pci]device_list or we expect the deployer to pre-create that | 11:38 |
gibi | https://review.opendev.org/c/openstack/nova-specs/+/791047/4/specs/zed/approved/pci-device-tracking-in-placement.rst#156 | 11:38 |
gibi | I think in the vgpu case nova creates the custom RC | 11:39 |
gibi | as the config does not have a full RC but just some typename | 11:40 |
gibi | and nova generates the RC name from it | 11:40 |
opendevreview | Balazs Gibizer proposed openstack/nova master: DNM: log number of green(thread|let)s periodically https://review.opendev.org/c/openstack/nova/+/841040 | 11:48 |
sean-k-mooney | gibi: o/ am yes i think nova should create the custom resource classes in placement | 11:50 |
sean-k-mooney | the reason for this is we want to use CUSTOM_<VENDOR_ID>_<PRODUCT_ID> | 11:51 |
sean-k-mooney | when no RC has been specified in the device list | 11:51 |
sean-k-mooney | so i think it would be a better end user experience if those custom resource classes were created automatically | 11:52 |
gibi | so resource_class=foobar is OK and nova will create CUSTOM_FOOBAR in placement | 11:52 |
sean-k-mooney | ah are you asking if nova should normalise and prepend the CUSTOM_ | 11:52 |
gibi | yep, as a follow up :) | 11:52 |
gibi | follow up question | 11:53 |
sean-k-mooney | i think that would be workable. i would prefer to encourage them to set CUSTOM_<whatever> | 11:53 |
sean-k-mooney | but i think its fine to prepend and normalise automatically | 11:54 |
gibi | a bit more user friendly if we normalize and prepend | 11:54 |
gibi | so I will go with that | 11:54 |
sean-k-mooney | yep just so long as we are smart and only prepend when needed | 11:54 |
gibi | OK, I can make it smart to avoid double custom | 11:55 |
gibi | OK | 11:55 |
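The "smart" normalisation agreed on above (prepend and normalise, but avoid a double CUSTOM_) could be sketched roughly like this; `normalize_rc_name` is a hypothetical helper for illustration, not the actual nova code:

```python
import re

def normalize_rc_name(name: str) -> str:
    """Turn an operator-supplied resource_class value into a valid
    placement custom resource class name: uppercase, only [A-Z0-9_],
    and prepend CUSTOM_ only when it is not already there."""
    rc = re.sub(r"[^A-Z0-9_]", "_", name.upper())
    if not rc.startswith("CUSTOM_"):
        rc = "CUSTOM_" + rc
    return rc
```

So `resource_class=foobar` would become `CUSTOM_FOOBAR`, while an already well-formed `CUSTOM_GPU_GOLD` passes through unchanged.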
gibi | next one is future proofing the RC | 11:55 |
gibi | https://review.opendev.org/c/openstack/nova-specs/+/791047/4/specs/zed/approved/pci-device-tracking-in-placement.rst#169 | 11:55 |
gibi | I think we support the case today when pci alias and neutron sriov are configured in the same deployment | 11:56 |
gibi | also I think it is possible to consume the same type-VF either from alias or from port | 11:57 |
gibi | so for this case we need an RC name for the type-VF RP that is known before the scheduling for the sriov case | 11:57 |
gibi | SRIOV_NET_VF could be used for that | 11:57 |
gibi | as that already exists in os-traits | 11:58 |
sean-k-mooney | yes you can consume type-vf via alias | 11:58 |
sean-k-mooney | because VF and sriov have nothing to do with networking | 11:58 |
sean-k-mooney | you can have VFs for gpus or ssds | 11:58 |
gibi | true | 11:58 |
sean-k-mooney | the physical_network tag in the device list | 11:59 |
sean-k-mooney | is what marks it as a nic and is required for neutron consumption | 11:59 |
sean-k-mooney | im not sure if we explicitly prevent aliases from consuming devices with physical_network set | 11:59 |
sean-k-mooney | but we could do that going forward i guess | 11:59 |
gibi | I think we did not prevent it today | 11:59 |
sean-k-mooney | ack | 12:00 |
sean-k-mooney | so for device with physical_network set | 12:00 |
gibi | we can simply say that resource_class will not be applicable if physical_network tag is present, and nova will use standard RC for these devices | 12:00 |
sean-k-mooney | i think its fine to mark them as SRIOV_NET_VF or SRIOV_NET_PF | 12:00 |
gibi | cool | 12:00 |
sean-k-mooney | we would likely track vdpa devices as SRIOV_NET_VF too | 12:01 |
sean-k-mooney | although i guess that could be SRIOV_NET_VDPA | 12:01 |
sean-k-mooney | that actually might make more sense now that i think of it | 12:01 |
gibi | yeah. but we don't have to decide it now, I just needed to make sure the that the current spec is future proof | 12:02 |
sean-k-mooney | so yes making RC and physnet mutually exclusive i think is correct | 12:02 |
sean-k-mooney | yep | 12:02 |
gibi | OK | 12:02 |
gibi | next one | 12:02 |
sean-k-mooney | so you descoped the current spec to just the alias based passthrough case yes | 12:02 |
gibi | yes | 12:02 |
sean-k-mooney | cool im fine with that by the way | 12:02 |
gibi | it is still complex enough | 12:02 |
sean-k-mooney | i want the neutron way to work too but it does not need to be in the initial mvp | 12:03 |
gibi | I just keep an eye on things not to create a dead end with the current spec | 12:03 |
sean-k-mooney | yep | 12:03 |
sean-k-mooney | ok so back to your next question :) | 12:03 |
gibi | OK, the next one is simple. With the current proposal the RP is named <hostname>_pci_0000_84_00_0 | 12:03 |
gibi | if we follow the PGPU naming | 12:04 |
gibi | is that OK? | 12:04 |
gibi | we could have a full normal PCI address if we want as the RP name charset is not restricted | 12:04 |
sean-k-mooney | am so i was not planning to use the label from libvirt | 12:04 |
sean-k-mooney | the nodedev name pci_0000_84_00_0 | 12:05 |
sean-k-mooney | is not considered stable by them | 12:05 |
gibi | ahh, good to know | 12:05 |
gibi | then I think it is better not to rely on it | 12:05 |
sean-k-mooney | so i was thinking it would be <hostname>_<pci address in linux format> | 12:05 |
gibi | so like in DDDD:BB:AA.FF format? | 12:05 |
sean-k-mooney | yes | 12:05 |
gibi | OK, we can do that | 12:05 |
sean-k-mooney | is : allowed | 12:06 |
gibi | yes it is | 12:06 |
gibi | the RP name is free text | 12:06 |
sean-k-mooney | ok we could normalise | 12:06 |
sean-k-mooney | if not | 12:06 |
gibi | the traits and RCs are restricted | 12:06 |
sean-k-mooney | if we wanted to have pci_0000_84_00_0 by the way i would prefer that nova generated that | 12:06 |
sean-k-mooney | rather than using the value directly from libvirt | 12:06 |
sean-k-mooney | but im ok with using the bdf format above DDDD:BB:AA.FF | 12:07 |
gibi | ack | 12:08 |
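The RP naming convention settled on here (`<hostname>_<pci address in linux BDF format>`) could be sketched like this; `pci_rp_name` and its validation are a hypothetical illustration, not the spec's actual code:

```python
import re

# Linux PCI address: domain:bus:device.function, e.g. 0000:84:00.0
BDF_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")

def pci_rp_name(hostname: str, pci_address: str) -> str:
    """Build the per-device resource provider name from the compute
    hostname and the device's Linux BDF address. RP names are free
    text in placement, so ':' and '.' are allowed."""
    if not BDF_RE.match(pci_address):
        raise ValueError(f"not a Linux PCI address: {pci_address}")
    return f"{hostname}_{pci_address}"
```

Since nova holds the canonical address in its own PCI tracking data, it can generate this name itself rather than taking the libvirt nodedev name, which is not considered stable.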
gibi | next one | 12:08 |
sean-k-mooney | so one thing related to this | 12:08 |
gibi | go | 12:08 |
sean-k-mooney | even if you add the device to the device list using devname instead of the address we would still use the pci address in the RP name yes | 12:08 |
gibi | yes | 12:09 |
sean-k-mooney | i just want to make sure that detail is hidden from placement | 12:09 |
sean-k-mooney | cool | 12:09 |
gibi | if we need the devname for any reason during the scheduling we can add that as a trait | 12:09 |
sean-k-mooney | yes we could. currently its not used in the alias or pci request object | 12:09 |
sean-k-mooney | so we should not | 12:09 |
sean-k-mooney | but if we did a trait would be workable | 12:10 |
gibi | if not needed then we wont add it :) | 12:10 |
sean-k-mooney | :) | 12:11 |
gibi | so the next question is related | 12:11 |
gibi | what traits nova needs to add automatically? | 12:11 |
gibi | only the ones mentioned in the device_list | 12:11 |
gibi | ? | 12:11 |
gibi | or we want to automate things like adding capabilities as traits | 12:11 |
sean-k-mooney | ah so yes the capability traits should be added in my view | 12:12 |
sean-k-mooney | they were intended to be reported to placement originally | 12:12 |
gibi | OK, that make sense | 12:12 |
gibi | do we have in the code somewhere listed what are the capabilities we parse? or do we parse everything? | 12:13 |
sean-k-mooney | so if i remember correctly you were suggesting allowing additive-only traits to be listed in the device_list | 12:13 |
sean-k-mooney | gibi: im looking for the code now | 12:13 |
sean-k-mooney | but we had code to normalise the capabilities and report them to placement that ralonsoh wrote in the past | 12:13 |
gibi | sean-k-mooney: yepp the current spec only supports additive traits | 12:13 |
sean-k-mooney | https://review.opendev.org/q/topic:bp%252Fenable-sriov-nic-features | 12:14 |
sean-k-mooney | https://review.opendev.org/c/openstack/nova/+/466051 specifically | 12:15 |
sean-k-mooney | but you might be able to reuse some of the other work | 12:15 |
sean-k-mooney | gibi: the traits have already been added to os-traits https://github.com/openstack/os-traits/blob/master/os_traits/hw/nic/offload.py | 12:16 |
gibi | thanks, this helps | 12:17 |
gibi | so lets report capabilities as traits | 12:17 |
gibi | and then I will amend the spec to allow removing traits via device_list | 12:17 |
gibi | to disable capability | 12:18 |
sean-k-mooney | yep. right now this only makes sense for neutron nics really | 12:18 |
sean-k-mooney | since we dont really gather capabilities for other devices | 12:18 |
gibi | ahh | 12:18 |
gibi | so no generic PCI caps | 12:18 |
sean-k-mooney | although remote_managed might be the exception | 12:18 |
sean-k-mooney | well for the remote managed devices we now have the vpd | 12:18 |
sean-k-mooney | capability | 12:18 |
sean-k-mooney | that is not yet a trait so maybe we would want to report that | 12:19 |
sean-k-mooney | gibi: i think for the initial version we could keep it to just operator provided traits | 12:19 |
sean-k-mooney | if we want to keep it simple | 12:19 |
sean-k-mooney | then in the future we can auto discover device capabilities and report them if that makes sense | 12:20 |
gibi | yeah that make sense, it is easy to add later | 12:20 |
gibi | so keeping traits just additive now as well | 12:20 |
gibi | OK | 12:21 |
gibi | next one | 12:22 |
gibi | https://review.opendev.org/c/openstack/nova-specs/+/791047/4/specs/zed/approved/pci-device-tracking-in-placement.rst#288 | 12:22 |
gibi | What to do if both ``resource_class`` and ``vendor_id`` and ``product_id`` are provided in the alias? | 12:22 |
sean-k-mooney | good question. i guess we have two options | 12:23 |
sean-k-mooney | first is consider that an error | 12:23 |
sean-k-mooney | second is use the resource_class for placement queries | 12:23 |
sean-k-mooney | and the rest for the pci/numa filter | 12:23 |
gibi | the only use case I can think of is that a deployer uses a generic RC but then later wants to refine the alias via product id | 12:23 |
sean-k-mooney | my secret plan is to eventually remove the need for the alias | 12:24 |
sean-k-mooney | so over time it would be nice if we could move to just having the resource class in the alias | 12:24 |
gibi | do you want to go with flavor extra_spec based resource? | 12:24 |
sean-k-mooney | yes and no | 12:25 |
sean-k-mooney | i currently hate that we allow grouping in the extra_specs | 12:25 |
sean-k-mooney | so part of me wants to keep the alias as we can insulate operators from that | 12:25 |
sean-k-mooney | but i do kind of like the idea of having just resources:CUSTOM_<whatever>=1 | 12:26 |
gibi | OK, I get the goal that we want to move deployers to RC based alias in the future and if the deployer still want product id based filtering then the deployer can use traits for that | 12:26 |
sean-k-mooney | so i think having the RC take precedence for the placement query and allowing vendor_id and product_id makes sense | 12:26 |
gibi | OK | 12:27 |
sean-k-mooney | so really what i would like is for operators to tag devices with the custom resource class and use the RC name as the "alias" | 12:28 |
gibi | so we keep the PCIFilter to keep filtering for vendor / product | 12:28 |
gibi | at least for now | 12:28 |
gibi | the RC name as alias make sense | 12:28 |
sean-k-mooney | ya for now although in principle i think you could turn it off if we did this right | 12:28 |
sean-k-mooney | my idea is | 12:29 |
gibi | yes if we let the product id filtering case go and there is no SRIOV in the deployment then we can turn off the PCIFilter | 12:29 |
gibi | neutron based SRIOV | 12:30 |
sean-k-mooney | device_list = some device -> RC gpu_gold | 12:30 |
sean-k-mooney | and then you could just ask for RC gpu_gold in the alias | 12:30 |
sean-k-mooney | right now the reason i dont want to go directly to resources:gpu_gold=1 | 12:31 |
sean-k-mooney | is that it introduces problems with vgpu and generic mdev usage | 12:31 |
sean-k-mooney | it should be resolvable | 12:31 |
sean-k-mooney | i.e. if we see that the resource does not match any of the generic_mdev types listed in the config we know its a pci passthrough request | 12:31 |
sean-k-mooney | but i thought that would complicate the spec more than needed initially | 12:32 |
gibi | ahh yeah | 12:33 |
gibi | thanks for the background | 12:33 |
gibi | next | 12:33 |
gibi | I feel that both you and I want to keep the dependent device handling supported. But as stephenfin said, it is a lot of complexity | 12:34 |
gibi | so just double checking that you still think this is needed | 12:34 |
gibi | as per https://review.opendev.org/c/openstack/nova-specs/+/791047/4/specs/zed/approved/pci-device-tracking-in-placement.rst#320 | 12:34 |
sean-k-mooney | honestly i would like to be able to remove it. but im concerned by the upgrade impact | 12:34 |
sean-k-mooney | i do know we have customers that want to dynamically choose if they consume a device as a PF or VF when they boot the workload | 12:35 |
sean-k-mooney | but its not very reliable today | 12:35 |
sean-k-mooney | as in its easy for vms to consume 1 vf on all the devices | 12:36 |
gibi | I also think that there are many deployments out there where this was used without knowing it. I mean if somebody whitelisted a PF that had VFs then the VFs became schedulable automatically | 12:36 |
sean-k-mooney | basically meaning you cannot allocate the PF even though you could have allocated the vfs differently | 12:36 |
sean-k-mooney | right so i think what stephenfin had in mind was if you whitelist the PF and it has VFs we would only expose the VFs | 12:37 |
sean-k-mooney | where as today unless you use the product_id to filter | 12:37 |
sean-k-mooney | we expose both the VFs and PFs | 12:37 |
sean-k-mooney | if we maintain the current behavior we obviously need to dynamically adjust the reserved value | 12:37 |
gibi | yes, that is the complexity | 12:38 |
sean-k-mooney | to emulate the unclaimable state | 12:38 |
gibi | but it is solveable | 12:38 |
gibi | I think I will keep this open for bauzas or other reviews to chime in | 12:38 |
sean-k-mooney | sure | 12:38 |
sean-k-mooney | we decided to reduce flexibility for cpu pinning | 12:39 |
sean-k-mooney | with isolate | 12:39 |
sean-k-mooney | and we know that not everyone was happy with that | 12:39 |
sean-k-mooney | we can elect to do the same here but we need to be deliberate about it and communicate it well if we want to force this change | 12:40 |
ralonsoh | sean-k-mooney, sorry, I was having lunch | 12:40 |
ralonsoh | what do you need? | 12:40 |
sean-k-mooney | if we can live with the complexity then we probably should keep it | 12:40 |
sean-k-mooney | ralonsoh: i found it | 12:40 |
gibi | OK, I will plan with the complexity | 12:40 |
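The dynamic `reserved` adjustment sean-k-mooney mentions above could look roughly like this. This is a simplified sketch with hypothetical names, not the spec's actual design: consuming any VF makes the parent PF unschedulable, and consuming the PF makes all of its remaining VFs unschedulable:

```python
def dependent_reserved(pf_in_use: bool, vfs_in_use: int, total_vfs: int):
    """Compute the placement 'reserved' values that emulate dependent
    device handling for a PF (inventory of 1) and its child VFs
    (inventory of total_vfs).

    Returns (pf_reserved, vf_reserved)."""
    # any allocated VF makes the whole PF unclaimable
    pf_reserved = 1 if vfs_in_use > 0 else 0
    # an allocated PF makes every not-yet-allocated VF unclaimable
    vf_reserved = total_vfs - vfs_in_use if pf_in_use else 0
    return pf_reserved, vf_reserved
```

The complexity is that these values must be re-computed and pushed to placement every time a claim changes, rather than being static inventory.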
ralonsoh | ah perfect | 12:40 |
sean-k-mooney | ralonsoh: https://review.opendev.org/q/topic:bp%252Fenable-sriov-nic-features | 12:40 |
sean-k-mooney | ralonsoh: your code for tracking pci devices in placement | 12:40 |
sean-k-mooney | ralonsoh: we are just discussing the spec to enable it. gibi will be taking on that feature | 12:41 |
ralonsoh | perfect | 12:41 |
gibi | ralonsoh: https://review.opendev.org/c/openstack/nova-specs/+/791047/4/specs/zed/approved/pci-device-tracking-in-placement.rst this is the spec if you are interested :) | 12:41 |
ralonsoh | sure | 12:41 |
gibi | sean-k-mooney: so I have one more open question | 12:42 |
sean-k-mooney | gibi: go for it | 12:42 |
gibi | upgrade | 12:42 |
gibi | obviously rolling upgrade is a pain | 12:42 |
sean-k-mooney | ah well you burn down the datacenter and build a new one in the ashes | 12:42 |
sean-k-mooney | obviously the least painful approach | 12:42 |
gibi | in the PCPU case we did a fallback query | 12:42 |
gibi | to allow scheduling to not-yet upgraded computes | 12:43 |
sean-k-mooney | yep | 12:43 |
sean-k-mooney | we could make the prefilter configurable | 12:43 |
gibi | I'm not sure how we did the allocation in that case | 12:43 |
gibi | but | 12:43 |
gibi | in the PCI case if we select a host based on the fallback query then on that host the scheduler will not allocate PCI devices in placement | 12:43 |
sean-k-mooney | right so i would not use a fallback | 12:44 |
sean-k-mooney | by default we should report the inventories to placement | 12:44 |
sean-k-mooney | and have a prefilter | 12:44 |
sean-k-mooney | the prefilter would add the pci device request to the query | 12:44 |
sean-k-mooney | and we disable it by default in zed | 12:44 |
sean-k-mooney | then enable it by default in AA | 12:44 |
sean-k-mooney | so you would rolling upgrade to Zed | 12:45 |
sean-k-mooney | then enable the prefilter once all host are upgraded | 12:45 |
sean-k-mooney | you likely would have to then do a heal-allocations like command to update the allocations of existing instances | 12:45 |
gibi | ahh i see | 12:46 |
sean-k-mooney | we could also have a nova-status check | 12:46 |
gibi | so we report devices but we don't allocate yet | 12:46 |
sean-k-mooney | yep | 12:46 |
gibi | then when every compute is ready we do a heal and then start allocating | 12:46 |
sean-k-mooney | yes using the claim in the pci_devices table | 12:47 |
gibi | yepp we keep using the claim and the pci_device table | 12:47 |
gibi | anyhow | 12:47 |
gibi | as that tracks exact VF PCI addresses | 12:47 |
gibi | Placement wont | 12:47 |
sean-k-mooney | yep so from the controllers | 12:47 |
sean-k-mooney | we will have all the info in the pci_devices table to heal the allocations | 12:48 |
sean-k-mooney | since we also have the parent addresses | 12:48 |
sean-k-mooney | we can construct the RP names | 12:48 |
gibi | yepp | 12:49 |
sean-k-mooney | we could consider | 12:49 |
sean-k-mooney | if we can activate the filter based on min compute service version | 12:49 |
sean-k-mooney | if we were to do that we would likely need the compute agent to heal the allocations automatically | 12:50 |
sean-k-mooney | perhaps on startup or in the update_available_resources periodic task | 12:50 |
sean-k-mooney | im not sure if we want that level of complexity but we already do reshapes in init_host | 12:51 |
sean-k-mooney | do you think that is too much "magic"/complexity | 12:52 |
sean-k-mooney | it would make the operator experience much nicer as it would just start working once everything was upgraded | 12:52 |
gibi | this reshape will be just RP creation, we won't move things | 12:52 |
gibi | so calling that code from periodic feels OK | 12:52 |
gibi | if we have that then it is safe to enable the prefilter automatically | 12:53 |
gibi | by the compute min version | 12:53 |
sean-k-mooney | right so when the compute-agent starts with the new code, the first time it creates the inventories it would also update the allocations | 12:53 |
sean-k-mooney | and then the prefilter would activate once all computes are upgraded | 12:53 |
sean-k-mooney | based on min version check | 12:54 |
sean-k-mooney | the problem i see with this would be move operations before the prefilter is enabled | 12:54 |
sean-k-mooney | unless we have it continue to heal | 12:54 |
sean-k-mooney | until the min version reaches the required version | 12:54 |
sean-k-mooney | there would be some inconsistency for a time but the pci_tracker would enforce the correct behavior with regards to not over subscribing | 12:55 |
gibi | yeah we have the pci tracker and pci claim as a fallback | 12:55 |
gibi | so we can move even if the prefilter is disabled | 12:56 |
gibi | just have to have a way to heal the placement allocation | 12:56 |
gibi | eventually | 12:56 |
sean-k-mooney | yep | 12:56 |
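The healing step discussed in this exchange, rebuilding placement allocations from the pci_devices table, could look roughly like this. The field and function names are hypothetical for illustration; the key point from the conversation is that for a type-VF the RP name is derived from the parent PF address:

```python
def heal_pci_allocations(hostname, pci_rows):
    """Build a per-instance placement allocations structure from
    pci_devices-style rows (dicts with instance_uuid, address,
    parent_addr, dev_type, and rc, the resource class the device is
    reported under)."""
    allocations = {}
    for row in pci_rows:
        if row["instance_uuid"] is None:
            continue  # device not claimed by any instance, nothing to heal
        # type-VF inventories live on the parent PF's resource provider
        addr = row["parent_addr"] if row["dev_type"] == "type-VF" else row["address"]
        rp_name = f"{hostname}_{addr}"
        inst = allocations.setdefault(row["instance_uuid"], {})
        rp = inst.setdefault(rp_name, {})
        rp[row["rc"]] = rp.get(row["rc"], 0) + 1
    return allocations
```

Placement only tracks counts per RP and resource class; the exact VF addresses remain in the pci_devices table, as gibi notes above.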
gibi | OK, I think I got my answers | 12:56 |
gibi | thank you for your time | 12:56 |
gibi | I really appreciate it | 12:56 |
sean-k-mooney | which again can use current and min service version to disable the healing when its not needed | 12:57 |
sean-k-mooney | no worries | 12:57 |
sean-k-mooney | im excited to see this moving forward | 12:57 |
sean-k-mooney | will you summarise this in the spec | 12:57 |
sean-k-mooney | perhaps link to the irc logs | 12:57 |
gibi | I will do the summary | 12:57 |
gibi | and linking to the log | 12:57 |
gibi | then I will respin the spec | 12:57 |
gibi | and trim the questions | 12:58 |
gibi | I'm excited to start coding up some of these in nova and watch them fail in the func env :) | 12:58 |
gibi | it will be fun | 12:58 |
sean-k-mooney | gibi: on a related note you reviewed Uggla's spec. there was kind of an open question regarding updating the AZ when you specify a host, did you weigh in on that? | 12:59 |
gibi | I saw it and I think it was settled, I had no objection. But then I will doublecheck | 12:59 |
sean-k-mooney | gibi: ack ill try and review it again shortly so | 12:59 |
sean-k-mooney | gibi: on a more selfish note i could also use your input on something else but its not super urgent https://review.opendev.org/c/openstack/nova/+/841017/1/nova/virt/libvirt/driver.py | 13:01 |
sean-k-mooney | i dont think that is 100% correct but it works for vdpa, i need to test it with VFs and other vnic-types | 13:02 |
* gibi clicks | 13:02 | |
sean-k-mooney | basically we are currently unplugging neutron interfaces using _detach_pci_dev for suspend | 13:02 |
sean-k-mooney | that does not work for vdpa and im pretty sure it does not work in general | 13:03 |
sean-k-mooney | so i need to verify that and file a bug | 13:03 |
gibi | I never tried suspend with PCI / neutron SRIOV. So I can neither confirm nor deny that it works | 13:07 |
sean-k-mooney | it used to but its been a very long time since i checked it. | 13:07 |
sean-k-mooney | so ya i need to test it with different backends | 13:08 |
gibi | but your comment seems valid that if something is an interface then it cannot be detached as a hostdev | 13:08 |
sean-k-mooney | i have the ability to test hardware offloaded ovs and sriov at home and i still have the servers i was using for vdpa although ill be giving those back today | 13:09 |
sean-k-mooney | so i can see if i can test the differnt combinations | 13:10 |
sean-k-mooney | i think self.detach_interface(context, instance, vif) should work however in all cases | 13:10 |
sean-k-mooney | i dont really know why we have special handling for the host dev elements | 13:10 |
sean-k-mooney | detach_interface | 13:11 |
sean-k-mooney | is meant to be the abstraction here and its what is called when we call detach from the api | 13:11 |
gibi | yepp detach_interface dynamically uses hostdev or interface config object | 13:12 |
sean-k-mooney | so i think i can just factor out the common code from https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L9811-L9829= | 13:12 |
sean-k-mooney | making the migrate_data optional effectively | 13:13 |
gibi | yepp | 13:13 |
gibi | that seems doable | 13:13 |
sean-k-mooney | basically i would just take in a list of vifs | 13:13 |
sean-k-mooney | so what im wondering is whether it is better to adapt the current function as i did in the wip patch | 13:14 |
sean-k-mooney | or jsut do the refactor | 13:15 |
sean-k-mooney | and call detach_interface | 13:15 |
sean-k-mooney | via _detach_direct_passthrough_vifs | 13:15 |
gibi | I would do the refactor and call detach_interface but I'm biased with the detach_interface code :D | 13:15 |
sean-k-mooney | well see i trust the detach_interface code more | 13:16 |
sean-k-mooney | its better tested | 13:16 |
sean-k-mooney | ok thanks ill try and confirm my speculation that suspend was broken and file a bug | 13:17 |
gibi | cool | 13:17 |
sean-k-mooney | one thing i need to bring up in the team meeting tomorrow is how to track the vdpa work | 13:18 |
sean-k-mooney | https://review.opendev.org/q/topic:bug%252F1970467 the non WIP patch is the bug fix | 13:18 |
sean-k-mooney | for the move operations that actually work | 13:18 |
sean-k-mooney | the next 3 add attach/detach, suspend and hotplug live migration | 13:19 |
sean-k-mooney | i feel like the last 3 should be a specless blueprint or maybe a small spec | 13:19 |
*** dasm|off is now known as dasm | 13:27 | |
gibi | I'm OK with both directions. If there is no API change then I'm fine with specless but if you have open questions then those are easy to discuss via a spec | 13:27 |
sean-k-mooney | there are no api changes other than me removing the api block on the operation, however i think i should be adding a compute service version bump for live migration | 13:29 |
sean-k-mooney | to support rolling upgrade | 13:29 |
sean-k-mooney | i dont have that in the wip code | 13:29 |
gibi | I think this still can fly as specless | 13:31 |
sean-k-mooney | ack that is what i was hoping but if others felt differently i just wanted to get the spec up quickly | 13:32 |
gibi | yeah it is worth asking | 13:32 |
opendevreview | Balazs Gibizer proposed openstack/nova-specs master: PCI device tracking in Placement https://review.opendev.org/c/openstack/nova-specs/+/791047 | 14:36 |
gibi | sean-k-mooney: updated according to our discussion ^^ | 14:36 |
* Uggla plays with tempest today. Unshelve to host should have a tempest test. | 16:36 | |
opendevreview | ribaudr proposed openstack/python-novaclient master: Microversion 2.91: Support specifying destination host to unshelve https://review.opendev.org/c/openstack/python-novaclient/+/831651 | 16:55 |
opendevreview | Merged openstack/nova stable/xena: Test aborting queued live migration https://review.opendev.org/c/openstack/nova/+/836145 | 17:24 |
opendevreview | Merged openstack/nova stable/xena: Add functional tests to reproduce bug #1960412 https://review.opendev.org/c/openstack/nova/+/836146 | 17:24 |
opendevreview | Merged openstack/nova stable/xena: Clean up when queued live migration aborted https://review.opendev.org/c/openstack/nova/+/836147 | 17:24 |
opendevreview | Merged openstack/nova stable/yoga: Retry in CellDatabases fixture when global DB state changes https://review.opendev.org/c/openstack/nova/+/840734 | 19:05 |
opendevreview | Artom Lifshitz proposed openstack/nova master: Reproduce bug 1952745 https://review.opendev.org/c/openstack/nova/+/841170 | 21:26 |
artom | I'm kinda proud of ^^ | 21:26 |
opendevreview | Artom Lifshitz proposed openstack/nova master: Reproduce bug 1952745 https://review.opendev.org/c/openstack/nova/+/841170 | 21:36 |
opendevreview | Artom Lifshitz proposed openstack/nova master: Reproduce bug 1952745 https://review.opendev.org/c/openstack/nova/+/841170 | 21:41 |
*** dasm is now known as dasm|off | 22:16 | |
opendevreview | melanie witt proposed openstack/nova stable/ussuri: Define new functional test tox env for placement gate to run https://review.opendev.org/c/openstack/nova/+/840771 | 23:55 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!