opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console object changes https://review.opendev.org/c/openstack/nova/+/926876 | 00:13 |
---|---|---|
opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console database changes https://review.opendev.org/c/openstack/nova/+/926877 | 00:13 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu https://review.opendev.org/c/openstack/nova/+/924844 | 00:13 |
mikal | ^--- the third patch will fail because there's some subtle API testing thing I need to figure out, but I don't have time right now. | 00:13 |
sean-k-mooney | i think its https://review.opendev.org/c/openstack/nova/+/924844/37/nova/api/openstack/compute/console_auth_tokens.py#93 | 00:23 |
sean-k-mooney | that should be 2.98, not 2.97 | 00:23 |
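The microversion fix sean-k-mooney points at works because microversions order numerically, not lexically. As a rough illustration in plain Python (this is not nova's actual `APIVersionRequest` helper, just the underlying idea):

```python
# Toy illustration of API microversion comparison (not nova's real helper):
# microversions compare as (major, minor) integer tuples, so "2.100" > "2.98"
# even though the strings would compare the other way.

def parse_microversion(version: str) -> tuple[int, int]:
    """Split 'X.Y' into an integer pair for ordering."""
    major, minor = version.split(".")
    return int(major), int(minor)

def supports(requested: str, minimum: str) -> bool:
    """True if the requested microversion is at least the minimum."""
    return parse_microversion(requested) >= parse_microversion(minimum)

print(supports("2.98", "2.98"))   # True: exactly the minimum
print(supports("2.97", "2.98"))   # False: one version too old
print(supports("2.100", "2.98"))  # True: numeric, not string, comparison
```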
sean-k-mooney | I'm commenting on a few more | 00:26 |
sean-k-mooney | mikal: ok, comments left on the things I think need to be updated, but I'm not 100% sure | 00:28 |
opendevreview | Amit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info https://review.opendev.org/c/openstack/nova/+/939643 | 04:28 |
opendevreview | Amit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info https://review.opendev.org/c/openstack/nova/+/939643 | 04:39 |
opendevreview | Merged openstack/nova master: allow discover host to be enabled in multiple schedulers https://review.opendev.org/c/openstack/nova/+/938523 | 09:38 |
tkajinam | sean-k-mooney gibi, could one of you nudge https://review.opendev.org/c/openstack/nova/+/936287 to the gate? its parent was already merged | 10:18 |
tkajinam | (it already has two +2s but no +A) | 10:18 |
tkajinam | I had to rebase it, and that wiped out the +A :-( | 10:18 |
sean-k-mooney[m] | done | 10:19 |
tkajinam | sean-k-mooney[m], thx ! | 10:20 |
mikal | sean-k-mooney: thanks for the pointer on the review, I'm uploading a new set of patches nowish that should pass tests. | 10:43 |
mikal | I am not sure what timezone gmann is in, but we might want to make sure they're aware of the tempest SNAFU. | 10:43 |
opendevreview | Merged openstack/nova master: Add unit test coverage of get_machine_ips https://review.opendev.org/c/openstack/nova/+/936287 | 10:48 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console object changes https://review.opendev.org/c/openstack/nova/+/926876 | 10:49 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console database changes https://review.opendev.org/c/openstack/nova/+/926877 | 10:49 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu https://review.opendev.org/c/openstack/nova/+/924844 | 10:49 |
bauzas | mikal: I need to leave now but I left some comments on your series | 10:51 |
mikal | bauzas: the spice-direct or the USB / sound one? | 10:52 |
bauzas | mikal: nothing critical as most of them were related to the 2.99 fixes you need to do, but there is also another question about the returned value of tls_port in case of a rolling upgrade | 10:52 |
bauzas | mikal: the former | 10:52 |
bauzas | tl;dr: if the API calls some old compute for getting the ConsoleAuthToken object, it will obtain a None value for the tls_port field | 10:53 |
bauzas | which basically works but returns 'null' (IIRC) as an API returned value | 10:54 |
mikal | I think that is ok. That would also happen if you had spice.require_secure set to false. | 10:54 |
mikal | i.e. a user of the API already has to handle that case. | 10:54 |
bauzas | mikal: cool, I just wanted you to be aware of that value; feel free to leave a comment replying that you're good with it | 10:54 |
mikal | Ok, will do. | 10:54 |
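The rolling-upgrade case bauzas describes means API consumers may see `tls_port: null` in the console connection details, exactly as they would with `spice.require_secure=false`. A minimal client-side sketch (the payload dict here is hypothetical; a real client would get it from the os-console-auth-tokens API response) just treats a null or absent TLS port as "no TLS endpoint":

```python
# Hypothetical console-connection payload as a plain dict; field names
# tls_port/port follow the discussion above, the rest is illustrative.
def pick_spice_port(connection: dict) -> tuple[int, bool]:
    """Return (port, uses_tls); fall back to the plain port when
    tls_port is null/absent (old compute, or require_secure=false)."""
    tls_port = connection.get("tls_port")
    if tls_port is not None:
        return tls_port, True
    return connection["port"], False

print(pick_spice_port({"port": 5900, "tls_port": 5901}))  # (5901, True)
print(pick_spice_port({"port": 5900, "tls_port": None}))  # (5900, False)
```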
opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console object changes https://review.opendev.org/c/openstack/nova/+/926876 | 11:09 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: direct SPICE console database changes https://review.opendev.org/c/openstack/nova/+/926877 | 11:09 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu https://review.opendev.org/c/openstack/nova/+/924844 | 11:09 |
sean-k-mooney | bauzas: can you review https://review.opendev.org/c/openstack/nova/+/940835 and https://review.opendev.org/c/openstack/nova/+/940873/6 in preparation for the spice series? that prepares the jobs to use spice. the devstack change to make multi-node spice work properly is already merged, so I could technically squash those, but I think it's simpler just to merge them as is | 11:48 |
sean-k-mooney | I'm likely going to rename that job in a followup but I won't get to that for a while | 11:49 |
sean-k-mooney | I'm thinking nova-alt (i.e. for testing non-default config that we do not plan to make the default; like nova-next but subtly different) | 11:50 |
sean-k-mooney | I plan to eventually re-enable the NUMA testing in that job and I'll also try to add nova on NFS to it too | 11:51 |
sean-k-mooney | but I need to do some homework on exactly how to do that properly | 11:52 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.2: Reproduce bug/2097359 https://review.opendev.org/c/openstack/nova/+/942085 | 12:33 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.2: Update InstanceNUMACell version after data migration https://review.opendev.org/c/openstack/nova/+/942086 | 12:33 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.2: Update InstanceNUMACell version in more cases https://review.opendev.org/c/openstack/nova/+/942087 | 12:33 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.1: Reproduce bug/2097359 https://review.opendev.org/c/openstack/nova/+/942088 | 12:47 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.1: Update InstanceNUMACell version after data migration https://review.opendev.org/c/openstack/nova/+/942089 | 12:47 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2024.1: Update InstanceNUMACell version in more cases https://review.opendev.org/c/openstack/nova/+/942090 | 12:47 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.2: repoduce post liberty pre vicoria instance numa db issue https://review.opendev.org/c/openstack/nova/+/942091 | 12:49 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.2: allow upgrade of pre-victoria InstanceNUMACells https://review.opendev.org/c/openstack/nova/+/942092 | 12:50 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.2: Reproduce bug/2097359 https://review.opendev.org/c/openstack/nova/+/942093 | 12:50 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.2: Update InstanceNUMACell version after data migration https://review.opendev.org/c/openstack/nova/+/942094 | 12:50 |
opendevreview | Balazs Gibizer proposed openstack/nova stable/2023.2: Update InstanceNUMACell version in more cases https://review.opendev.org/c/openstack/nova/+/942095 | 12:50 |
*** tkajinam is now known as Guest9449 | 13:10 | |
opendevreview | Amit Uniyal proposed openstack/nova master: Reproducer for cold migration on shared storage https://review.opendev.org/c/openstack/nova/+/940304 | 13:23 |
opendevreview | Amit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info https://review.opendev.org/c/openstack/nova/+/939643 | 13:23 |
opendevreview | Amit Uniyal proposed openstack/nova master: Reproducer for cold migration on shared storage https://review.opendev.org/c/openstack/nova/+/940304 | 13:35 |
opendevreview | Amit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info https://review.opendev.org/c/openstack/nova/+/939643 | 13:35 |
*** whoami-rajat_ is now known as whoami-rajat | 14:04 | |
danfai | hi nova, I'm currently trying to understand live migration in a bit more detail, especially implications on the network model. My current use case is moving from Linuxbridge to OVN, and initial tests were promising to be able to do this with live migration in our context. When doing so nova seems to convert the vif correctly but in our case uses the bridge name from the source | 14:35 |
danfai | hypervisor, which then breaks. If I hardcode/override the vif_type/bridge name on the destination hypervisor, most things work (in our case). I see that part of it comes from neutron when asking about the vif type and it would get bridge. Another one I'm not sure where it comes from, but I guess from the rpc call from the source hypervisor. What I was wondering is, should | 14:35 |
danfai | bridge names not be independent of the source hypervisor and rather be calculated based on the destination hypervisors information? (our case libvirt/kvm, public network only, still yoga) | 14:35 |
danfai | (best part of it is, that if you migrate twice, it will use the correct bridge name as the vif_type would change) | 14:39 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Support multiple allocations for vGPUs https://review.opendev.org/c/openstack/nova/+/845757 | 15:18 |
opendevreview | Sylvain Bauza proposed openstack/nova master: document how to ask for more a single vGPU https://review.opendev.org/c/openstack/nova/+/906151 | 15:18 |
bauzas | dansmith: sean-k-mooney: thanks for the reviews on the weigher, I got an interesting race condition that I need to fix fwiw https://review.opendev.org/c/openstack/nova/+/940642/9 | 15:33 |
dansmith | I saw | 15:33 |
bauzas | I'll probably try/catch the call | 15:33 |
bauzas | any other opinion ? | 15:34 |
sean-k-mooney | huh, I see | 15:34 |
sean-k-mooney | that's from looking up the instances on the host, right | 15:34 |
sean-k-mooney | or rather their image properties | 15:34 |
sean-k-mooney | so yeah, wrapping it in a try and ignoring it if the lookup fails makes sense | 15:35 |
sean-k-mooney | I'm not sure if you want to explicitly handle the delete case or also skip if there is a general exception | 15:35 |
sean-k-mooney | like the cell DB was not accessible at the time | 15:36 |
sean-k-mooney | you should handle nova.exception.InstanceNotFound for sure | 15:37 |
bauzas | sean-k-mooney: yeah, I'll go this way | 15:48 |
bauzas | because users can delete their instances anyway | 15:48 |
bauzas | even when scheduling | 15:48 |
sean-k-mooney | yep, and in this case the instance that could raise this might belong to another tenant | 15:55 |
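The fix agreed above, wrapping the per-instance lookup and skipping instances deleted mid-scheduling, can be sketched like this (the `InstanceNotFound` class and `lookup` callable here are stand-ins; the real weigher would catch `nova.exception.InstanceNotFound` around the cell DB query):

```python
class InstanceNotFound(Exception):
    """Stand-in for nova.exception.InstanceNotFound."""

def collect_image_props(instance_uuids, lookup):
    """Gather image properties per instance, ignoring instances that
    vanish (deleted by their owner) between listing and lookup."""
    props = []
    for uuid in instance_uuids:
        try:
            props.append(lookup(uuid))
        except InstanceNotFound:
            # The instance was deleted while we were scheduling;
            # it simply no longer contributes to the weight.
            continue
    return props

def fake_lookup(uuid):
    """Illustrative lookup: one instance has been deleted under us."""
    if uuid == "deleted":
        raise InstanceNotFound(uuid)
    return {"os_type": "linux"}

print(collect_image_props(["a", "deleted", "b"], fake_lookup))
# [{'os_type': 'linux'}, {'os_type': 'linux'}]
```

Whether to also swallow a general exception (e.g. an unreachable cell DB, as discussed) is a separate policy choice; this sketch only handles the not-found case.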
-opendevstatus- NOTICE: nominations for the OpenStack PTL and TC positions are closing soon, for details see https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/7DKEV7IEHOTHED7RVEFG7WIDVUC4MY3Z/ | 15:59 | |
bauzas | #startmeeting nova | 16:01 |
opendevmeet | Meeting started Tue Feb 18 16:01:06 2025 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:01 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:01 |
opendevmeet | The meeting name has been set to 'nova' | 16:01 |
bauzas | howdy folks | 16:01 |
fwiesel | o/ | 16:01 |
r-taketn | o/ | 16:01 |
bauzas | wiki again is super slow | 16:01 |
bauzas | I updated it in advance but the update failed | 16:02 |
elodilles | o/ | 16:02 |
gibi | o/ | 16:02 |
bauzas | yeah it failed | 16:03 |
elodilles | did my update interfere with yours? :-o | 16:03 |
bauzas | okay, lemme bring you my notes directly | 16:03 |
bauzas | Error: 1205 Lock wait timeout exceeded; try restarting transaction (dd34c169f1ba97e28fc00622156bd8b2e9de60ef.rackspaceclouddb.com) | 16:03 |
bauzas | that's what I get | 16:03 |
elodilles | hmmm | 16:03 |
bauzas | #topic Bugs (stuck/critical) | 16:03 |
bauzas | #info No critical bug | 16:04 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:04 |
bauzas | any bugs to raise ? | 16:04 |
bauzas | looks not, moving on | 16:04 |
bauzas | agenda for today will be https://paste.opendev.org/show/bHNrlGFsGF7V7TpuLwGY/ | 16:05 |
bauzas | #topic Gate status | 16:05 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:05 |
bauzas | #link https://etherpad.opendev.org/p/nova-ci-failures-minimal | 16:05 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status | 16:06 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:06 |
bauzas | #info Please try to provide meaningful comment when you recheck | 16:07 |
bauzas | that's it for gate | 16:07 |
bauzas | anything to talk about our gate ? | 16:07 |
bauzas | fwiw, I was more present upstream these days and found no real problem | 16:08 |
bauzas | moving on | 16:08 |
bauzas | #topic Release Planning | 16:08 |
bauzas | #link https://releases.openstack.org/epoxy/schedule.html | 16:08 |
bauzas | #info Nova deadlines are set in the above schedule | 16:09 |
bauzas | #info 1.5 week before Feature Freeze | 16:09 |
Uggla | o/ | 16:09 |
bauzas | the clock is ticking | 16:09 |
bauzas | #topic Review priorities | 16:09 |
bauzas | #link https://etherpad.opendev.org/p/nova-2025.1-status | 16:09 |
bauzas | I eventually updated this etherpad \o/ | 16:10 |
bauzas | and I've started to do a round of reviews | 16:10 |
bauzas | anything we should discuss ? | 16:11 |
bauzas | if not, moving on in a sec | 16:12 |
bauzas | #topic PTG planning | 16:13 |
bauzas | as a reminder, | 16:13 |
bauzas | #info Next PTG will be held on Apr 7-11 | 16:13 |
bauzas | #link https://etherpad.opendev.org/p/nova-2025.2-ptg | 16:13 |
bauzas | please add your items to discuss at that PTG into that etherpad | 16:13 |
bauzas | #topic Stable Branches | 16:14 |
bauzas | elodilles: take the mic | 16:14 |
elodilles | yepp | 16:14 |
elodilles | #info stable gates seem to be healthy | 16:15 |
elodilles | #info stable/2023.2 release patch: https://review.opendev.org/c/openstack/releases/+/941420 (stable/2023.2 (bobcat) is going to transition to EOL in ~2 months) | 16:15 |
elodilles | note that it will directly go to End of Life as it is not a SLURP release | 16:15 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:15 |
elodilles | and that's all from me | 16:15 |
fwiesel | I suspect it is my turn. | 16:18 |
fwiesel | #topic vmwareapi 3rd-party CI efforts Highlights | 16:18 |
fwiesel | No updates from my side. | 16:18 |
bauzas | thanks fwiesel | 16:18 |
bauzas | cool thanks | 16:18 |
bauzas | still fighting the dragons ? | 16:18 |
fwiesel | Yes, I didn't quite get around to it; that is the main reason for the lack of progress. | 16:19 |
bauzas | no worries, we all have urgent business to deal with | 16:19 |
bauzas | moving on then | 16:19 |
bauzas | #topic Open discussion | 16:19 |
bauzas | two items today AFAICS | 16:20 |
bauzas | (sean-k-mooney) neutron removal of linux bridge | 16:20 |
bauzas | sean-k-mooney: around? | 16:20 |
sean-k-mooney | yep | 16:20 |
sean-k-mooney | so... neutron removed linux bridge | 16:20 |
sean-k-mooney | they did not deprecate it in the last SLURP | 16:20 |
sean-k-mooney | so we never did | 16:20 |
sean-k-mooney | so my proposal is we deprecate it this release | 16:20 |
sean-k-mooney | disable the linux bridge job | 16:21 |
sean-k-mooney | and remove support next cycle | 16:21 |
sean-k-mooney | we need to do this in os-vif as well | 16:21 |
sean-k-mooney | I have a patch up for that | 16:21 |
sean-k-mooney | os-vif is subject to the non-client lib freeze and is currently blocked | 16:21 |
sean-k-mooney | from merging code because of this | 16:21 |
sean-k-mooney | we have some patches we want to merge before then | 16:21 |
sean-k-mooney | so the topic is, are we ok with that plan | 16:21 |
sean-k-mooney | deprecate and disable testing this cycle | 16:22 |
sean-k-mooney | then remove next cycle | 16:22 |
gibi | I'm OK to deprecate and disable. We cannot really do anything else if neutron pulled the plug | 16:22 |
sean-k-mooney | I'll submit the patch for that for nova later today | 16:22 |
sean-k-mooney | I just need to write it | 16:22 |
bauzas | yeah I think we can agree | 16:23 |
sean-k-mooney | we are not fully blocked, as the linux bridge job only triggers on a subset of files | 16:23 |
bauzas | send the deprecation signal and remove the job | 16:23 |
bauzas | anyone having concerns by the deprecation ? | 16:24 |
sean-k-mooney | nova in 2025.1 would be able to run with older neutron, or with linux bridge if you wrote an out-of-tree ml2 driver | 16:25 |
sean-k-mooney | so I will note that in the release note | 16:25 |
sean-k-mooney | but I don't expect that we will need/want to support that going forward. we can have that discussion before doing the removal from nova itself | 16:25 |
sean-k-mooney | the thing is, os-vif does not host plugins for out-of-tree neutron backends | 16:26 |
sean-k-mooney | so we should remove the linux bridge support in os-vif next cycle in either case | 16:26 |
sean-k-mooney | anyway, that's all from me | 16:26 |
bauzas | cool thanks | 16:27 |
bauzas | #agreed linuxbridge will be deprecated by 2025.1 | 16:28 |
bauzas | next time | 16:28 |
bauzas | gmann (I will not be able to attend the meeting due to a time conflict, but most of you know the context and proposal), Spec freeze exception for RBAC service/manager role | 16:28 |
bauzas | #link https://review.opendev.org/c/openstack/nova-specs/+/937650 | 16:28 |
bauzas | gmann requested for a late specless approval | 16:29 |
bauzas | spec approval (my bad) | 16:29 |
bauzas | Spec has one +2 already and code changes are in progress: https://review.opendev.org/q/topic:%22bp/policy-service-and-manager-role-default%22 | 16:30 |
bauzas | usually we would not approve a spec so late in the cycle, but there are some criteria here to discuss | 16:30 |
bauzas | 1/ the spec is quite simple and only relates to policy changes | 16:31 |
bauzas | 2/ we already merged some efforts related to policy | 16:31 |
bauzas | 3/ sean-k-mooney already accepted the spec | 16:31 |
bauzas | would there be any concern with the approval? | 16:32 |
sean-k-mooney | yes, although that was with the comment that it needs an exception, and the understanding that we would have had this conversation about 4 weeks ago | 16:32 |
sean-k-mooney | from me, just review bandwidth | 16:32 |
bauzas | my personal opinion on that is the risk about the policy changes | 16:32 |
gibi | yeah, if there are two cores signing up for reviewing it then I have no objection | 16:33 |
gibi | but the time is pretty short | 16:33 |
bauzas | we're talking of modifying the inter-service communication | 16:33 |
sean-k-mooney | we are but | 16:33 |
bauzas | so the patches would require extra caution | 16:33 |
sean-k-mooney | by default, in this cycle we will go from admin to admin or service | 16:33 |
bauzas | I'm not opposed to give that series a late approval, but I wouldn't rush on reviewing the patches to be clear | 16:34 |
sean-k-mooney | so the real upgrade impact would be next cycle, or whenever we drop admin | 16:34 |
sean-k-mooney | the manager changes would also be additive | 16:34 |
sean-k-mooney | we obviously need to validate that our existing jobs do not require modification to pass | 16:35 |
sean-k-mooney | to ensure there is no upgrade impact | 16:35 |
sean-k-mooney | we may also want to explicitly have the new default job enable extra testing of the manager policy | 16:36 |
sean-k-mooney | gmann: what's the state of the tempest testing for this | 16:36 |
sean-k-mooney | my irc client freaked out so I'm not sure if I missed messages | 16:39 |
bauzas | sean-k-mooney: gmann was unable to attend the meeting | 16:40 |
bauzas | so he won't be able to reply to you I guess | 16:40 |
sean-k-mooney | oh ok. | 16:40 |
bauzas | fwiw, I don't really know what to do with that spec | 16:40 |
bauzas | I don't have the context yet and I need some time to reload it | 16:41 |
bauzas | so I won't be able to +2 it shortly | 16:41 |
sean-k-mooney | looking at the tempest patches | 16:41 |
bauzas | if any other nova-core is able to review that spec, I'm not opposed to a late approval but those two cores shall honestly sign off for the implementation review too | 16:42 |
sean-k-mooney | so I am not sure I'll have time to review it either, by the way | 16:42 |
sean-k-mooney | I think this should be moved to next cycle, honestly | 16:42 |
sean-k-mooney | I would prefer to focus on merging the other approved specs | 16:43 |
gibi | I don't see cores signing up, so punt it | 16:45 |
bauzas | that's sad but reasonable | 16:45 |
bauzas | we shouldn't offer help if we can't review the patches | 16:45 |
sean-k-mooney | to be clear, from my perspective this was punted when it missed the spec freeze | 16:45 |
sean-k-mooney | and it was not approved as an exception in the team meeting after | 16:46 |
bauzas | #agreed unfortunately given the review bandwidth and the time it takes to gather context for approving the spec, we can't late approve the spec as we can't also guarantee that we could review the implementation patches in the reasonable timeline that we have | 16:46 |
bauzas | okay, that was it for today's items in the agenda | 16:47 |
bauzas | anything anyone ? | 16:47 |
r-taketn | bauzas: I have one request. (My entry might have been deleted, Sorry for the confusion) | 16:47 |
MengyangZhang[m] | I also have a topic. https://blueprints.launchpad.net/nova/+spec/enhanced-granularity-and-live-application-of-qos Would like to hear how the community feels about this before moving on looking into the code and writing a spec. Already proposed this idea to cinder and they liked the idea of per-project control of qos and live updating the qos limits. | 16:48 |
r-taketn | #link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/244U52T3HINVWUFSMPMA45A67BUPAQGK/ | 16:49 |
sean-k-mooney | MengyangZhang[m]: we will not provide an interface to modify cpu qos via cinder frontend qos | 16:50 |
bauzas | r-taketn: ok, you're the next | 16:50 |
sean-k-mooney | so if that's what you planned, -2 | 16:50 |
sean-k-mooney | but yeah, let's discuss r-taketn's topic first | 16:50 |
r-taketn | Could you please check the last message in the above mailing list thread and give your comments? | 16:51 |
sean-k-mooney | r-taketn: sure, but just to clarify something we said previously | 16:51 |
sean-k-mooney | nova will not merge any code for unreleased hardware | 16:51 |
bauzas | r-taketn: oh, OK, that's a follow-up question on Arm-CCA | 16:51 |
bauzas | yeah, what sean-k-mooney said | 16:52 |
bauzas | we can't just support something that our operators can't use | 16:52 |
r-taketn | Oh, ok. I might misunderstand | 16:53 |
sean-k-mooney | if there are some features that will eventually be used by CCA but are useful on their own, we can enable them in advance | 16:53 |
bauzas | for the sake of the meeting time, could we then move to the second item ? | 16:54 |
r-taketn | yes | 16:55 |
bauzas | thanks | 16:55 |
bauzas | MengyangZhang[m]: looking at your blueprint | 16:56 |
sean-k-mooney | MengyangZhang[m]: to be clear, "live QoS updates to CPU and memory limits, including configurations in cputune, cachetune, and memorytune libvirt XML elements to enable seamless performance adjustments for running instances." is not something we can or should support via cinder frontend qos policies | 16:56 |
sean-k-mooney | if we wanted to do that, we would need to add a new QoS API to nova to allow admins to define qos policies | 16:56 |
sean-k-mooney | and an instance action to apply it | 16:56 |
bauzas | that sounds a very large topic to discuss | 16:56 |
bauzas | haven't we already proposed to discuss those kind of topics at the PTG ? | 16:57 |
sean-k-mooney | I thought so | 16:57 |
MengyangZhang[m] | <sean-k-mooney> "Mengyang Zhang: we will not..." <- It's not via cinder qos. What we want to achieve is to be able to live update all QoS limits on existing VMs on nova side. For disk qos, it's a bit special as it requires cinder side changes. | 16:57 |
bauzas | I actually see it in https://etherpad.opendev.org/p/nova-2025.2-ptg#L47 | 16:57 |
bauzas | so I'd recommend to defer the discussion at the PTG | 16:57 |
sean-k-mooney | MengyangZhang[m]: so the only qos you can configure today in nova | 16:57 |
sean-k-mooney | is via flavor extra specs | 16:58 |
sean-k-mooney | so what you are effectively asking for is a limited form of live resize | 16:58 |
MengyangZhang[m] | we can set cpu and memory qos via flavor extra specs. But cinder is a bit different. | 16:59 |
sean-k-mooney | you can set disk qos via flavor extra specs too. | 17:00 |
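The flavor extra specs sean-k-mooney refers to are the `quota:*` keys the libvirt driver maps onto disk front-end throttling; a sketch of what a disk-QoS flavor might carry (the keys are real nova extra specs, the values are made-up examples):

```python
# Example disk front-end QoS via flavor extra specs (libvirt driver).
# The quota:disk_* keys are real nova extra specs; the values here are
# arbitrary illustrative numbers.
disk_qos_extra_specs = {
    "quota:disk_read_bytes_sec": "104857600",   # 100 MiB/s reads
    "quota:disk_write_bytes_sec": "104857600",  # 100 MiB/s writes
    "quota:disk_read_iops_sec": "1000",
    "quota:disk_write_iops_sec": "1000",
}

# An operator would set these with, e.g.:
#   openstack flavor set FLAVOR --property quota:disk_read_bytes_sec=104857600
for key, value in disk_qos_extra_specs.items():
    print(f"{key}={value}")
```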
bauzas | we're over time | 17:00 |
sean-k-mooney | cinder qos today could be supported via swap volume (volume retype in cinder) | 17:00 |
bauzas | thanks folks | 17:00 |
bauzas | I'll end the meeting but I also recommend talking about that usecase at the PTG | 17:01 |
bauzas | thanks | 17:01 |
bauzas | #endmeeting | 17:01 |
opendevmeet | Meeting ended Tue Feb 18 17:01:15 2025 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:01 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.html | 17:01 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.txt | 17:01 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.log.html | 17:01 |
sean-k-mooney | MengyangZhang[m]: I think we would need a nova-cinder cross-project session on this topic for the cinder case | 17:01 |
cardoe | https://review.opendev.org/c/openstack/nova/+/942019 is just a simple message fix that I was gonna see if I could also backport to 2024.2 (and technically affects back to 2024.1) Not sure if there's some test or release note you guys would like me to add. | 17:02 |
MengyangZhang[m] | Yes, makes sense. So we discuss it on Thursday? | 17:02 |
r-taketn | thanks | 17:02 |
MengyangZhang[m] | <sean-k-mooney> "Mengyang Zhang: i think we woudl..." <- Yes, we can discuss it there. I already proposed this idea to them. Here's their comments https://meetings.opendev.org/meetings/cinder/2025/cinder.2025-01-15-14.00.log.html#:~:text=to%20implement%20it-,14%3A17%3A29,-%3Csimondodsley%3E%20i | 17:09 |
sean-k-mooney | MengyangZhang[m]: likely how we would need to do this is to add a new external event that cinder could use to call back when qos is modified on a volume | 17:09 |
sean-k-mooney | we would also need to call them back to either complete the update or fail it | 17:10 |
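Nova's existing os-server-external-events API takes a list of named events; the design sean-k-mooney sketches would presumably add a new event name for cinder to signal a QoS change. The `volume-qos-changed` name and the helper below are entirely hypothetical, modeled on the shape of the real request body:

```python
# Hypothetical external-event payload, modeled on nova's existing
# os-server-external-events request body. The event name
# "volume-qos-changed" does NOT exist today; it is the kind of event
# the discussed design would have to add.
def make_qos_changed_event(server_uuid: str, volume_id: str) -> dict:
    """Build an external-events request body for one server/volume pair."""
    return {
        "events": [
            {
                "name": "volume-qos-changed",  # hypothetical event name
                "server_uuid": server_uuid,
                # In the real API, "tag" identifies the device/attachment;
                # here we reuse it to carry the volume id.
                "tag": volume_id,
            }
        ]
    }

body = make_qos_changed_event("9f3c1a2e-0000-4000-8000-123456789abc", "vol-1234")
print(body["events"][0]["name"])  # volume-qos-changed
```

The completion/failure callback in the other direction (nova telling cinder the update was applied) would need a matching cinder-side API, as the discussion notes.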
MengyangZhang[m] | makes sense. I can look into this a bit more closely, but since this may be a large project, I just want to make sure the community is interested in the idea so we can start working on it in my company | 17:12 |
sean-k-mooney | MengyangZhang[m]: have you written a cinder spec for this yet? I suspect we will need one for cinder and another for nova, as it would need API changes on both sides | 17:12 |
MengyangZhang[m] | not yet, yes spec makes complete sense. will work on it next | 17:13 |
sean-k-mooney | MengyangZhang[m]: this is of similar complexity to https://specs.openstack.org/openstack/nova-specs/specs/2024.2/approved/assisted-volume-extend.html | 17:13 |
sean-k-mooney | https://review.opendev.org/c/openstack/cinder-specs/+/866718 | 17:14 |
sean-k-mooney | MengyangZhang[m]: while POCs are useful, in general it's expected to agree on the design in a spec before working on the code | 17:15 |
sean-k-mooney | if we focus on the cinder use case, what you're really trying to address is the gap for frontend qos when you do "openstack volume qos associate QOS_ID VOLUME_TYPE_ID" | 17:16 |
sean-k-mooney | ... why are they using a GET request to make a change... | 17:19 |
sean-k-mooney | https://docs.openstack.org/api-ref/block-storage/v3/index.html?expanded=detach-volume-from-server-detail#associate-qos-specification-with-a-volume-type | 17:19 |
MengyangZhang[m] | For cinder, there are two requests: one is per-project qos, not the per-volume-type qos. Secondly, once the qos policy is changed, we want to let the existing vm automatically adopt the new limits instead of requiring a live-migration to adopt it | 17:19 |
sean-k-mooney | anyway, that returns a 202 | 17:19 |
sean-k-mooney | today it should be returning a 409 conflict if the volume is attached to an instance and the qos policy has frontend qos | 17:19 |
sean-k-mooney | MengyangZhang[m]: live migration should not cause it to update today | 17:20 |
sean-k-mooney | if it does, that's probably a bug | 17:20 |
sean-k-mooney | I strongly suspect the cinder API is not properly rejecting associating a qos policy with a volume that is in-use when the policy has frontend qos | 17:21 |
sean-k-mooney | if it allows that today, that is wrong | 17:21 |
MengyangZhang[m] | Let me double check it | 17:25 |
sean-k-mooney | it would be a pretty serious API bug on their side | 17:26 |
sean-k-mooney | because they would allow the requested state and the actual state to get out of sync | 17:26 |
sean-k-mooney | which could lead to a late failure the next time a user hard reboots an instance | 17:27 |
MengyangZhang[m] | I will perform some testing locally and get back to you on this | 17:30 |
sean-k-mooney | if you can apply a qos policy to a volume that is in use, and that policy has a frontend qos rule, then it's a cinder API bug from my perspective | 17:30 |
sean-k-mooney | it's not ok to make a change in the cinder API and only have it take effect potentially months later, the next time the vm is moved or the xml is otherwise regenerated | 17:31 |
sean-k-mooney | the current api is written in terms of the volume type | 17:32 |
sean-k-mooney | not in terms of a volume | 17:32 |
sean-k-mooney | cinder should not apply the qos update to existing instances of the updated type until a retype or similar API call is done | 17:33 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Add a new ImagePropertiesWeigher https://review.opendev.org/c/openstack/nova/+/940642 | 17:41 |
opendevreview | Sylvain Bauza proposed openstack/nova master: per-property ImageMetaPropsWeigher https://review.opendev.org/c/openstack/nova/+/941601 | 17:41 |
MengyangZhang[m] | <sean-k-mooney> "its not ok to make a change i..." <- unfortunately, that's the case now. If a qos policy is created and associated with a volume type, then all in-use volumes of that type will get qos limits when the xml is regenerated, like during live migration or resize. | 17:48 |
MengyangZhang[m] | <sean-k-mooney> "cinder shoudl not apply the..." <- what do you mean by retype? | 17:49 |
gmann | bauzas: sean-k-mooney: RE on RBAC spec, I read the meeting log. I am ok with decision and understand if no core has bandwidth to review it. I will see what best can be done in next cycle. | 17:54 |
sean-k-mooney | gmann: if you have the patches in a mergeable state with testing, I don't see why we would really wait once we are past the RC period | 17:55 |
gmann | mikal: I am in PST TZ, I will check tempest change after TC meeting (~ 1hr from now) | 17:55 |
sean-k-mooney | we can approve the spec for next cycle | 17:55 |
gmann | ack | 17:56 |
sean-k-mooney | I don't know how practical it is, but it would be nice to complete it before the PTG for example | 17:56 |
sean-k-mooney | certainly it's a good candidate to complete before m1 | 17:58 |
sean-k-mooney | I just don't know how much time we will all have with release activities etc. | 17:58 |
gmann | yeah, let me finish the code change first before re-proposing the spec. manager role is all good but I need to finish the service role things | 18:01 |
sean-k-mooney | does one need the other? both are covered by the same spec, but we do not need to complete both to merge it, right | 18:02 |
sean-k-mooney | if the manager role is working, that can be merged first and then the service user role change can come after? | 18:02 |
gmann | both are separate, I was a little lazy to put both in one effort. this is the manager role change and its parent adds manager role context in unit tests so that we can see how the manager role addition looks https://review.opendev.org/c/openstack/nova/+/941347/2 | 18:04 |
sean-k-mooney | well i dont really mind them both being in the same spec | 18:05 |
gmann | and the manager role is the one I am more interested in doing as that is a user-usable thing, and the service-role is just (but still important) to improve the internal communication | 18:05 |
sean-k-mooney | right | 18:05 |
sean-k-mooney | i would probably do them in that order too | 18:06 |
gmann | sean-k-mooney: what do you suggest now? do you want me to split out a separate spec for the manager role if that can be done in this cycle? you can see the code change for the manager role to answer it. | 18:07 |
gmann | https://review.opendev.org/c/openstack/nova/+/941056/2 and https://review.opendev.org/c/openstack/nova/+/941347/2 | 18:08 |
sean-k-mooney | if we wanted to try and get some progress this cycle that might be achievable given its only 2 relatively small patches | 18:09 |
sean-k-mooney | my concern would still be with testing | 18:09 |
sean-k-mooney | have you any tempest coverage for that | 18:09 |
sean-k-mooney | oh https://review.opendev.org/c/openstack/tempest/+/941594 | 18:09 |
sean-k-mooney | and thats merged already | 18:09 |
sean-k-mooney | although thats not really testing it | 18:11 |
gmann | the policy changes are from ADMIN -> manager-or-ADMIN and the tempest tests exercise those with admin so no impact, but there are two policy changes from member-or-admin to manager-or-admin which do have an impact, and the tempest tests test them with admin only. | 18:24 |
sean-k-mooney | im almost finished quickly reviewing the manager patches but there are issues in the second one | 18:24 |
gmann | sean-k-mooney: by tempest testing you mean testing those policies with the manager role instead of admin? manager role access is asserted in unit tests | 18:24 |
sean-k-mooney | correct, to assert that with the new default policy a client with the manager role could cold migrate for example | 18:25 |
gmann | then we lose coverage of doing it with the manager role | 18:26 |
gmann | unless we run the same tests with two roles, admin and manager | 18:26 |
sean-k-mooney | yes i was suggesting doing it with both admin and manager | 18:26 |
gmann | * then we lose coverage of doing it with the ADMIN role | 18:26 |
sean-k-mooney | although that is really just testing the implied roles feature right | 18:27 |
sean-k-mooney | admin would imply manager | 18:27 |
gmann | that might add more time to the tempest jobs if we test APIs with all possible policy defaults. bcz it is the same for member-or-reader | 18:28 |
sean-k-mooney | unlike service i dont see a reason for it to sit outside the default hierarchy | 18:28 |
gmann | sean-k-mooney: yeah, that I was about to write. I think testing with manager is the least-permissive default role testing we need | 18:28 |
gmann | and admin anyways should have access to anything manager can do | 18:28 |
gmann | sean-k-mooney: let me modify the tempest test and see | 18:28 |
sean-k-mooney | i guess there is one difference | 18:30 |
sean-k-mooney | we cant just use project manager in the roles because when you have the admin role we do not need to compare the project id | 18:31 |
sean-k-mooney | so we do actually need to have PROJECT_MANAGER_OR_ADMIN for apis that are valid to call from both | 18:32 |
sean-k-mooney | although on the tempest side i dont think that changes things | 18:32 |
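The distinction being discussed here can be sketched as a plain-Python predicate. This is illustrative only: nova's real defaults are oslo.policy check strings, and the function and role names below are assumptions, not nova's actual code.

```python
def project_manager_or_admin(roles, token_project_id, target_project_id):
    """Hypothetical sketch of a PROJECT_MANAGER_OR_ADMIN style check.

    The admin role is not scoped to a single project, so no project_id
    comparison is done for it; the manager role only passes when the
    token's project matches the target resource's project.
    """
    if "admin" in roles:
        return True
    return "manager" in roles and token_project_id == target_project_id
```

With keystone implied roles, a token holding admin would typically also carry manager, but the project scope comparison is what keeps a manager from acting outside their own project.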
gmann | yeah | 18:34 |
priteau | Hello. Could a Nova core approve this backport please? https://review.opendev.org/c/openstack/nova/+/940846 I have just tested it and it resolves our baremetal node deletion issues. | 18:35 |
sean-k-mooney | ah the 2024.1 backport | 18:36 |
sean-k-mooney | sure | 18:36 |
gmann | sean-k-mooney: anyways let me see if I can finish the testing, otherwise I am ok not to be in a hurry this cycle. The main reason is that I am struggling with my bandwidth for upstream activities, at least for the next few months. | 18:36 |
sean-k-mooney | gmann: i finished reviewing the manager changes and i dont think you updated them with the discussion from the spec | 18:37 |
sean-k-mooney | i was also expecting you to modify instance lock but maybe that was not covered | 18:37 |
sean-k-mooney | i havent looked at the spec in a while | 18:37 |
gmann | In the spec, we left out lock and the other server actions. I explicitly mentioned those as not being changed | 18:38 |
sean-k-mooney | i was expecting you to allow the manager to unlock an admin-locked instance | 18:40 |
sean-k-mooney | but not restrict lock to manager | 18:40 |
sean-k-mooney | but that is fine | 18:40 |
sean-k-mooney | priteau: approved :) | 18:41 |
dansmith | bauzas: still around? | 19:16 |
opendevreview | Michael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu https://review.opendev.org/c/openstack/nova/+/924844 | 19:31 |
priteau | sean-k-mooney: thanks! | 19:42 |
MengyangZhang[m] | <sean-k-mooney> "cinder shoudl not apply the..." <- unfortunately this is the case now. How should I proceed then? | 20:29 |
sean-k-mooney | it depends | 20:31 |
opendevreview | Merged openstack/nova stable/2024.1: ironic: Fix ConflictException when deleting server https://review.opendev.org/c/openstack/nova/+/940846 | 20:31 |
sean-k-mooney | MengyangZhang[m]: if it allows the volume type to be updated after its in use does it also propagate those changes to the instance when a new attachment is created | 20:32 |
sean-k-mooney | or does it update the content for existing attachments | 20:32 |
MengyangZhang[m] | to new attachments; existing volumes must be reattached to have the update applied | 20:33 |
sean-k-mooney | ack so thats not great as it really should only update it if you do a volume retype | 20:34 |
sean-k-mooney | but its less bad than otherwise | 20:34 |
sean-k-mooney | what we really need is a per-volume api call to trigger the attachment update. the worst case scenario is you create a new qos policy, you apply it to a volume type and cinder triggers all volumes of that type to be updated | 20:37 |
sean-k-mooney | by that i mean | 20:38 |
sean-k-mooney | it would be a problem if cinder triggered nova to call libvirt to update the frontend qos on all affected volumes | 20:38 |
sean-k-mooney | its tricky but how nova would expect cinder to work is that when a volume is created with a given volume type, the definition of that volume type should be snapshotted and saved per volume and used for the lifetime of the volume | 20:41 |
sean-k-mooney | meaning the only way to update a volume to get the new qos would be via a volume retype to the same or a different volume type | 20:42 |
sean-k-mooney | that is how we handle flavors and images | 20:42 |
sean-k-mooney | it sounds like cinder is not doing this | 20:43 |
MengyangZhang[m] | No, not for live migration/resize because qos limits are saved in the cinder.quality_of_service_specs db table and are only associated with the volume type. I don't see qos values saved on a per-volume basis. Thus, when a volume attachment event happens, they just fetch the qos limits associated with the volume type and populate the connection_info field, which is then passed back to nova to generate the xml | 20:48 |
MengyangZhang[m] | The reboot however works a bit differently. I believe it reads the connection_info from the nova.block_device_mapping db, which saves a snapshot of the volume qos and uses it for the lifetime of the instance. | 20:50 |
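The two code paths described above can be sketched as a toy model. This is not cinder's or nova's actual code; the dict layout, class, and names are made up to show why a fresh attachment picks up an edited qos policy while a reboot keeps the attach-time snapshot.

```python
import copy

# stands in for cinder.quality_of_service_specs: qos keyed by volume
# type and mutable at any time
TYPE_QOS = {"gold": {"total_iops_sec": 1000}}

def new_attachment_qos(volume_type):
    # attach-time path: re-reads whatever the type currently says, so a
    # qos policy edited after creation leaks into the next attachment
    return copy.deepcopy(TYPE_QOS[volume_type])

class BlockDeviceMapping:
    # reboot path: nova keeps the connection_info captured at attach
    # time and reuses it for the lifetime of the instance
    def __init__(self, volume_type):
        self.connection_info = {"qos": new_attachment_qos(volume_type)}

bdm = BlockDeviceMapping("gold")           # snapshot taken at attach
TYPE_QOS["gold"]["total_iops_sec"] = 500   # operator edits the type's qos

# live migration / resize regenerate the xml from a fresh attachment:
migrated_qos = bdm and new_attachment_qos("gold")  # sees the new value
reboot_qos = bdm.connection_info["qos"]            # still the old snapshot
```

The retype-only behaviour nova would prefer corresponds to always reading from the snapshot and rewriting it only on an explicit volume retype.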
sean-k-mooney | right so thats problematic because it should not change on a live migration | 20:51 |
sean-k-mooney | the reboot behavior in nova is how it should work | 20:51 |
sean-k-mooney | the qos policy applied to an instance volume should not change as a side effect of changing the host | 20:52 |
MengyangZhang[m] | understood, then what should we do? discuss this in the next cinder-nova meeting? | 20:53 |
sean-k-mooney | i think so yes | 20:53 |
sean-k-mooney | i do not know if modifying the QOS could cause the live migration to fail for example | 20:53 |
sean-k-mooney | or cause issues with reverting a cold migrate or resize | 20:53 |
MengyangZhang[m] | i don't think it would cause live-migration to fail. We have deployed qos to our private cloud for some time and we don't see increased live-migration failures when existing VMs pick up the qos | 20:55 |
sean-k-mooney | it would fail if the destination host did not support the qos config | 20:56 |
sean-k-mooney | i.e. if the source and destination host had different versions of libvirt and the qos element was supported on the source but not the destination | 20:57 |
sean-k-mooney | we should be validating that in pre live migration | 20:57 |
opendevreview | ribaudr proposed openstack/nova master: Augment the LiveMigrateData object https://review.opendev.org/c/openstack/nova/+/942143 | 20:58 |
opendevreview | ribaudr proposed openstack/nova master: Add live_migratable flag to PCI device specification https://review.opendev.org/c/openstack/nova/+/942144 | 20:58 |
opendevreview | ribaudr proposed openstack/nova master: Update manager to allow vfio pci device live migration https://review.opendev.org/c/openstack/nova/+/942145 | 20:58 |
opendevreview | ribaudr proposed openstack/nova master: Update conductor and filters allowing migration with SR-IOV devices https://review.opendev.org/c/openstack/nova/+/942146 | 20:58 |
opendevreview | ribaudr proposed openstack/nova master: WIP: Update driver to map the targeted address for SR-IOV PCI devices https://review.opendev.org/c/openstack/nova/+/942147 | 20:58 |
sean-k-mooney | downstream we support deploying a single version of nova across both rhel 8 and rhel 9 hosts for an extended period of time | 20:58 |
sean-k-mooney | we have at least one customer that told us that for regulatory reasons they can only update every 6-12 months | 21:00 |
sean-k-mooney | so they would realistically have some deployments with a mix of libvirt versions for up to a year | 21:00 |
MengyangZhang[m] | I think we discussed this at some point, the pre live migration step wouldn't fail, plus openstack has supported front-side disk qos for 13 years | 21:01 |
sean-k-mooney | right, for the current set we have it wont fail unless libvirt removes support for a field in a new version | 21:02 |
sean-k-mooney | but im making the point that nova does not really expect this to change on live migration | 21:02 |
MengyangZhang[m] | get it | 21:02 |
sean-k-mooney | a concrete example of this is the weight element https://libvirt.org/formatdomain.html#block-i-o-tuning | 21:04 |
sean-k-mooney | the range [100, 1000]. After kernel 2.6.39, the value could be in the range [10, 1000] | 21:04 |
sean-k-mooney | so they changed the range that was allowed | 21:04 |
sean-k-mooney | again its old enough that its ok | 21:04 |
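The kind of pre-live-migration validation suggested above could look roughly like the sketch below. These helpers are hypothetical (nova does not ship them); only the weight ranges come from the libvirt docs quoted in the discussion: [100, 1000] before kernel 2.6.39, [10, 1000] afterwards.

```python
def valid_weight_range(kernel_version):
    """Allowed blkio weight range for a host, per the libvirt docs
    quoted above; kernel_version is a (major, minor, patch) tuple."""
    return (10, 1000) if kernel_version >= (2, 6, 39) else (100, 1000)

def weight_ok_on_destination(weight, dest_kernel):
    # a pre-live-migration check would reject a migration whose disk
    # qos weight the destination host cannot express
    lo, hi = valid_weight_range(dest_kernel)
    return lo <= weight <= hi

# e.g. weight 50 is valid on a newer source kernel but would be out of
# range on a pre-2.6.39 destination
```

This is exactly the mixed-version window described for long-lived rhel 8 / rhel 9 deployments: a weight accepted on the source can fall outside the destination's allowed range.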
MengyangZhang[m] | make sense | 21:05 |
sean-k-mooney | im going to call it a day but we should see what the cinder folks think about how to proceed | 21:05 |
MengyangZhang[m] | make sense thanks for the help! | 21:05 |
opendevreview | Merged openstack/os-vif master: Remove os-vif-linuxbridge https://review.opendev.org/c/openstack/os-vif/+/941567 | 23:23 |
opendevreview | sean mooney proposed openstack/os-vif master: Update gate jobs as per the 2025.1 cycle testing runtime https://review.opendev.org/c/openstack/os-vif/+/935580 | 23:38 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!