Tuesday, 2025-02-18

opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console object changes  https://review.opendev.org/c/openstack/nova/+/92687600:13
opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console database changes  https://review.opendev.org/c/openstack/nova/+/92687700:13
opendevreviewMichael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu  https://review.opendev.org/c/openstack/nova/+/92484400:13
mikal^--- the third patch will fail because there's some subtle API testing thing I need to figure out, but I don't have time right now.00:13
sean-k-mooneyi think it's https://review.opendev.org/c/openstack/nova/+/924844/37/nova/api/openstack/compute/console_auth_tokens.py#9300:23
sean-k-mooneythat should be 2.98 not 2.9700:23
sean-k-mooneyi'm commenting on a few more00:26
sean-k-mooneymikal: ok, comments left on the things i think need to be updated, but i'm not 100% sure00:28
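(A minimal sketch of the microversion guard under discussion, assuming nova's APIVersionRequest comparison style; the exact code in console_auth_tokens.py may differ.)

    # hypothetical sketch: expose the new field only at microversion 2.98+
    from nova.api.openstack import api_version_request as avr

    def _build_connect_info(req, connect_info):
        resp = {'host': connect_info['host'], 'port': connect_info['port']}
        # the off-by-one under review: comparing against 2.97 would expose
        # the field one microversion too early
        if req.api_version_request >= avr.APIVersionRequest('2.98'):
            resp['tls_port'] = connect_info.get('tls_port')
        return resp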
opendevreviewAmit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info  https://review.opendev.org/c/openstack/nova/+/93964304:28
opendevreviewAmit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info  https://review.opendev.org/c/openstack/nova/+/93964304:39
opendevreviewMerged openstack/nova master: allow discover host to be enabled in multiple schedulers  https://review.opendev.org/c/openstack/nova/+/93852309:38
tkajinamsean-k-mooney gibi, could one of you nudge https://review.opendev.org/c/openstack/nova/+/936287 to gate? its parent was already merged10:18
tkajinam(it already has two +2s but no +A)10:18
tkajinamI had to rebase it, and that wiped out +A :-(10:18
sean-k-mooney[m]done10:19
tkajinamsean-k-mooney[m], thx !10:20
mikalsean-k-mooney: thanks for the pointer on the review, I'm uploading a new set of patches nowish that should pass tests.10:43
mikalI am not sure what timezone gmann is in, but we might want to make sure they're aware of the tempest SNAFU.10:43
opendevreviewMerged openstack/nova master: Add unit test coverage of get_machine_ips  https://review.opendev.org/c/openstack/nova/+/93628710:48
opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console object changes  https://review.opendev.org/c/openstack/nova/+/92687610:49
opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console database changes  https://review.opendev.org/c/openstack/nova/+/92687710:49
opendevreviewMichael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu  https://review.opendev.org/c/openstack/nova/+/92484410:49
bauzasmikal: I need to leave now but I left some comments on your series10:51
mikalbauzas: the spice-direct or the USB / sound one?10:52
bauzasmikal: nothing critical as most of them were related to the 2.99 fixes you need to do, but there is also another question about the returned value of tls_port in case of a rolling upgrade10:52
bauzasmikal: the former10:52
bauzastl;dr: if the API calls some old compute for getting the ConsoleAuthToken object, it will obtain a None value for the tls_port field10:53
bauzaswhich basically works but returns 'null' (IIRC) as the API return value10:54
mikalI think that is ok. That would also happen if you had spice.require_secure set to false.10:54
mikali.e. a user of the API already has to handle that case.10:54
bauzasmikal: cool, I just wanted you to be aware of that value; feel free to leave a comment replying that you're good with it10:54
mikalOk, will do.10:54
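(An illustration of the rolling-upgrade case just described: an old compute returns no tls_port, so the API response would carry a null, roughly like this; field names are assumed for illustration.)

    {
        "console": {
            "instance_uuid": "<uuid>",
            "host": "192.0.2.10",
            "port": 5900,
            "tls_port": null
        }
    }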
opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console object changes  https://review.opendev.org/c/openstack/nova/+/92687611:09
opendevreviewMichael Still proposed openstack/nova master: libvirt: direct SPICE console database changes  https://review.opendev.org/c/openstack/nova/+/92687711:09
opendevreviewMichael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu  https://review.opendev.org/c/openstack/nova/+/92484411:09
sean-k-mooneybauzas: can you review https://review.opendev.org/c/openstack/nova/+/940835 and https://review.opendev.org/c/openstack/nova/+/940873/6 in preparation for the spice series? those prepare the jobs to use spice. the devstack change to make multi-node spice work properly is already merged, so i could technically squash those, but i think it's simpler just to merge them as is11:48
sean-k-mooneyi'm likely going to rename that job in a followup but i won't get to that for a while11:49
sean-k-mooneyi'm thinking nova-alt (i.e. for testing non-default config that we do not plan to make the default, like nova-next but subtly different11:50
sean-k-mooney)11:50
sean-k-mooneyi plan to eventually re-enable the numa testing in that job and i'll also try to add nova on nfs to it too11:51
sean-k-mooneybut i need to do some homework on exactly how to do that properly11:52
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.2: Reproduce bug/2097359  https://review.opendev.org/c/openstack/nova/+/94208512:33
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.2: Update InstanceNUMACell version after data migration  https://review.opendev.org/c/openstack/nova/+/94208612:33
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.2: Update InstanceNUMACell version in more cases  https://review.opendev.org/c/openstack/nova/+/94208712:33
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.1: Reproduce bug/2097359  https://review.opendev.org/c/openstack/nova/+/94208812:47
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.1: Update InstanceNUMACell version after data migration  https://review.opendev.org/c/openstack/nova/+/94208912:47
opendevreviewBalazs Gibizer proposed openstack/nova stable/2024.1: Update InstanceNUMACell version in more cases  https://review.opendev.org/c/openstack/nova/+/94209012:47
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: repoduce post liberty pre vicoria instance numa db issue  https://review.opendev.org/c/openstack/nova/+/94209112:49
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: allow upgrade of pre-victoria InstanceNUMACells  https://review.opendev.org/c/openstack/nova/+/94209212:50
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: Reproduce bug/2097359  https://review.opendev.org/c/openstack/nova/+/94209312:50
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: Update InstanceNUMACell version after data migration  https://review.opendev.org/c/openstack/nova/+/94209412:50
opendevreviewBalazs Gibizer proposed openstack/nova stable/2023.2: Update InstanceNUMACell version in more cases  https://review.opendev.org/c/openstack/nova/+/94209512:50
*** tkajinam is now known as Guest944913:10
opendevreviewAmit Uniyal proposed openstack/nova master: Reproducer for cold migration on shared storage  https://review.opendev.org/c/openstack/nova/+/94030413:23
opendevreviewAmit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info  https://review.opendev.org/c/openstack/nova/+/93964313:23
opendevreviewAmit Uniyal proposed openstack/nova master: Reproducer for cold migration on shared storage  https://review.opendev.org/c/openstack/nova/+/94030413:35
opendevreviewAmit Uniyal proposed openstack/nova master: WIP: remove swap.disk from disk.info  https://review.opendev.org/c/openstack/nova/+/93964313:35
*** whoami-rajat_ is now known as whoami-rajat14:04
danfaihi nova, I'm currently trying to understand live migration in a bit more detail, especially its implications on the network model. My current use case is moving from Linuxbridge to OVN, and initial tests suggested we could do this with live migration in our context. When doing so, nova seems to convert the vif correctly, but in our case uses the bridge name from the source14:35
danfaihypervisor, which then breaks. If I hardcode/override the vif_type/bridge name on the destination hypervisor, most things work (in our case). I see that part of it comes from neutron when asked about the vif type, and it would get bridge. Another part I'm not sure where it comes from, but I guess from the rpc call from the source hypervisor. What I was wondering is, should14:35
danfaibridge names not be independent of the source hypervisor and rather be calculated based on the destination hypervisor's information? (our case libvirt/kvm, public network only, still yoga)14:35
danfai(best part is that if you migrate twice, it will use the correct bridge name, as the vif_type would change)14:39
opendevreviewSylvain Bauza proposed openstack/nova master: Support multiple allocations for vGPUs  https://review.opendev.org/c/openstack/nova/+/84575715:18
opendevreviewSylvain Bauza proposed openstack/nova master: document how to ask for more a single vGPU  https://review.opendev.org/c/openstack/nova/+/90615115:18
bauzasdansmith: sean-k-mooney: thanks for the reviews on the weigher, I got an interesting race condition that I need to fix fwiw https://review.opendev.org/c/openstack/nova/+/940642/915:33
dansmithI saw15:33
bauzasI'll probably try/catch the call 15:33
bauzasany other opinion ?15:34
sean-k-mooneyhuh i see15:34
sean-k-mooneythat's from looking up the instances on the host, right15:34
sean-k-mooneyor rather their image properties15:34
sean-k-mooneyso yeah, wrapping it in a try and ignoring it if the lookup fails makes sense15:35
sean-k-mooneyi'm not sure if you want to explicitly handle the delete case or also skip if there is a general exception15:35
sean-k-mooneylike the cell db was not accessible at the time15:36
sean-k-mooneyyou should handle nova.exception.InstanceNotFound for sure15:37
bauzassean-k-mooney: yeah, I'll go this way15:48
bauzasbecause users can delete their instances anyway 15:48
bauzaseven when scheduling15:48
sean-k-mooneyyep, and in this case the instance that could raise this might belong to another tenant15:55
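(A minimal sketch of the guarded lookup agreed on above, assuming the weigher iterates the instances on each host and lazy-loads their image metadata; the real code is in the ImagePropertiesWeigher review linked earlier.)

    from nova import exception

    # hypothetical weigher fragment: an instance (possibly another tenant's)
    # can be deleted while scheduling is still running, so the lazy-load of
    # its image metadata may raise InstanceNotFound
    def _weigh_object(self, host_state, request_spec):
        score = 0.0
        for instance in host_state.instances.values():
            try:
                props = instance.image_meta.properties
            except exception.InstanceNotFound:
                continue  # deleted mid-scheduling; just skip it
            score += self._score(props, request_spec)  # hypothetical helper
        return score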
-opendevstatus- NOTICE: nominations for the OpenStack PTL and TC positions are closing soon, for details see https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/message/7DKEV7IEHOTHED7RVEFG7WIDVUC4MY3Z/15:59
bauzas#startmeeting nova16:01
opendevmeetMeeting started Tue Feb 18 16:01:06 2025 UTC and is due to finish in 60 minutes.  The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.16:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:01
opendevmeetThe meeting name has been set to 'nova'16:01
bauzashowdy folks16:01
fwieselo/16:01
r-taketno/16:01
bauzaswiki again is super slow16:01
bauzasI updated it in advance but the update failed16:02
elodilleso/16:02
gibio/16:02
bauzasyeah it failed16:03
elodillesdid my update interfere with yours? :-o16:03
bauzasokay, lemme bring you my notes directly16:03
bauzas Error: 1205 Lock wait timeout exceeded; try restarting transaction (dd34c169f1ba97e28fc00622156bd8b2e9de60ef.rackspaceclouddb.com)16:03
bauzasthat's what I get16:03
elodilleshmmm16:03
bauzas#topic Bugs (stuck/critical)16:03
bauzas#info No critical bug16:04
bauzas#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:04
bauzasany bugs to raise?16:04
bauzaslooks like no, moving on16:04
bauzasagenda for today will be https://paste.opendev.org/show/bHNrlGFsGF7V7TpuLwGY/16:05
bauzas#topic Gate status16:05
bauzas#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:05
bauzas#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:05
bauzas#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:06
bauzas#info Please look at the gate failures and file a bug report with the gate-failure tag.16:06
bauzas#info Please try to provide meaningful comment when you recheck16:07
bauzasthat's it for gate16:07
bauzasanything to talk about regarding our gate?16:07
bauzasfwiw, I was more present upstream these days and found no real problem16:08
bauzasmoving on16:08
bauzas#topic Release Planning 16:08
bauzas#link https://releases.openstack.org/epoxy/schedule.html16:08
bauzas#info Nova deadlines are set in the above schedule16:09
bauzas#info 1.5 week before Feature Freeze16:09
Ugglao/16:09
bauzasthe clock is ticking16:09
bauzas#topic Review priorities 16:09
bauzas#link https://etherpad.opendev.org/p/nova-2025.1-status16:09
bauzasI eventually updated this etherpad \o/16:10
bauzasand I've started to do a round of reviews16:10
bauzasanything we should discuss ?16:11
bauzasif not, moving on in a sec16:12
bauzas#topic PTG planning 16:13
bauzasas a reminder,16:13
bauzas#info Next PTG will be held on Apr 7-1116:13
bauzas#link https://etherpad.opendev.org/p/nova-2025.2-ptg16:13
bauzasplease add your items to discuss at that PTG into that etherpad16:13
bauzas#topic Stable Branches 16:14
bauzaselodilles: take the mic16:14
elodillesyepp16:14
elodilles#info stable gates seem to be healthy16:15
elodilles#info stable/2023.2 release patch: https://review.opendev.org/c/openstack/releases/+/941420 (stable/2023.2 (bobcat) is going to transition to EOL in ~2 months)16:15
elodillesnote that it will directly go to End of Life as it is not a SLURP release16:15
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:15
elodillesand that's all from me16:15
fwieselI suspect it is my turn.16:18
fwiesel#topic vmwareapi 3rd-party CI efforts Highlights 16:18
fwieselNo updates from my side.16:18
bauzasthanks fwiesel 16:18
bauzascool thanks16:18
bauzasstill fighting the dragons?16:18
fwieselYes, I didn't quite get around to it; that is the main reason for the lack of progress.16:19
bauzasno worries, we all have urgent business to deal with16:19
bauzasmoving on then16:19
bauzas#topic Open discussion 16:19
bauzastwo items today AFAICS16:20
bauzas (sean-k-mooney) neutron removal of linux bridge 16:20
bauzassean-k-mooney: around?16:20
sean-k-mooneyyep16:20
sean-k-mooneyso... neutron removed linux bridge16:20
sean-k-mooneythey did not deprecate it in the last slurp16:20
sean-k-mooneyso we never did16:20
sean-k-mooneyso my proposal is we deprecate it this release16:20
sean-k-mooneydisable the linux bridge job16:21
sean-k-mooneyand remove support next cycle16:21
sean-k-mooneywe need to do this in os-vif as well16:21
sean-k-mooneyi have a patch up for that16:21
sean-k-mooneyos-vif is subject to the non-client lib freeze and is currently blocked16:21
sean-k-mooneyfrom merging code because of this16:21
sean-k-mooneywe have some patches we want to merge before then16:21
sean-k-mooneyso the topic is, are we ok with that plan16:21
sean-k-mooneydeprecate and disable testing this cycle16:22
sean-k-mooneythen remove next cycle 16:22
gibiI'm OK to deprecate and disable. We cannot really do anything else if neutron pulled the plug16:22
sean-k-mooneyi'll submit the patch for that for nova later today16:22
sean-k-mooneyi just need to write it16:22
bauzasyeah I think we can agree 16:23
sean-k-mooneywe are not fully blocked as the linux bridge job only triggers on a subset of files16:23
bauzassend the deprecation signal and remove the job16:23
bauzasanyone having concerns by the deprecation ?16:24
sean-k-mooneynova in 2025.1 would be able to run with older neutron or with linux bridge if you wrote an out-of-tree ml2 driver16:25
sean-k-mooneyso i will note that in the release note16:25
sean-k-mooneybut i don't expect that we will need/want to support that going forward. we can have that discussion before doing the removal from nova itself16:25
sean-k-mooneythe thing is os-vif does not host plugins for out-of-tree neutron backends16:26
sean-k-mooneyso we should remove the linux bridge support in os-vif next cycle in either case16:26
sean-k-mooneyanyway that's all from me16:26
bauzascool thanks16:27
bauzas#agreed linuxbridge will be deprecated by 2025.1 16:28
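(A hedged sketch of how the agreed deprecation could be expressed as a reno release note; the filename and wording are assumed.)

    # releasenotes/notes/deprecate-linuxbridge-<hash>.yaml (hypothetical)
    deprecations:
      - |
        Nova support for the Neutron linuxbridge backend (the ``bridge``
        VIF type) is deprecated, following its removal from Neutron, and
        the linux bridge CI job is disabled. The remaining support is
        expected to be removed from nova and os-vif in the next cycle.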
bauzasnext time16:28
bauzasgmann (I will not be able to attend the meeting due to a time conflict, but most of you know the context and proposal), Spec freeze exception for RBAC service/manager role 16:28
bauzas#link  https://review.opendev.org/c/openstack/nova-specs/+/93765016:28
bauzasgmann requested a late specless approval16:29
bauzasspec approval (my bad)16:29
bauzasSpec has one +2 already and code changes are in progress: https://review.opendev.org/q/topic:%22bp/policy-service-and-manager-role-default%2216:30
bauzasusually we would not approve a spec so late in the cycle, but there are some criteria here to discuss16:30
bauzas1/ the spec is quite simple and only relates to policy changes16:31
bauzas2/ we already merged some efforts related to policy16:31
bauzas3/ sean-k-mooney already accepted the spec16:31
bauzaswould there be any concern with approving it?16:32
sean-k-mooneyyes, although that was with the comment that it needs an exception and the understanding that we would have this conversation about 4 weeks ago16:32
sean-k-mooneyfrom me, just review bandwidth16:32
bauzasmy personal opinion on that is the risk about the policy changes16:32
gibiyeah, if there are two cores signing up for reviewing it then I have no objection16:33
gibibut the time is pretty short16:33
bauzaswe're talking about modifying the inter-service communication16:33
sean-k-mooneywe are but16:33
bauzasso the patches would require extra caution16:33
sean-k-mooneyby default in this cycle we will go from admin to admin-or-service16:33
bauzasI'm not opposed to giving that series a late approval, but I wouldn't rush on reviewing the patches, to be clear16:34
sean-k-mooneyso the real upgrade impact would be next cycle or whenever we drop admin16:34
sean-k-mooneythe manager changes would also be additive16:34
sean-k-mooneywe obviously need to validate that our existing jobs do not require modification to pass16:35
sean-k-mooneyto ensure there is no upgrade impact16:35
sean-k-mooneywe may also want to explicitly have the new default job enable extra testing of the manager policy16:36
sean-k-mooneygmann: what's the state of the tempest testing for this16:36
sean-k-mooney my irc client freaked out so not sure if i missed messages16:39
bauzassean-k-mooney: gmann was unable to attend the meeting16:40
bauzasso he won't be able to reply to you I guess16:40
sean-k-mooneyoh ok.16:40
bauzasfwiw, I don't really know what to do with that spec16:40
bauzasI don't have the context yet and I need some time to reload it16:41
bauzasso I won't be able to +2 it shortly16:41
sean-k-mooneylooking at the tempest patches16:41
bauzasif any other nova-core is able to review that spec, I'm not opposed to a late approval, but those two cores should honestly sign up for the implementation review too16:42
sean-k-mooneyso i am not sure i'll have time to review it either, by the way16:42
sean-k-mooneyi think this should be moved to next cycle honestly16:42
sean-k-mooneyi would prefer to focus on merging the other approved specs16:43
gibiI don't see cores signing up, so punt it16:45
bauzasthat's sad but reasonable16:45
bauzaswe shouldn't offer help if we can't review the patches16:45
sean-k-mooneyto be clear, from my perspective this was punted when it missed the spec freeze16:45
sean-k-mooneyand it was not approved as an exception in the team meeting after 16:46
bauzas#agreed unfortunately, given the review bandwidth and the time it takes to gather context for approving the spec, we can't late-approve it, as we also can't guarantee that we could review the implementation patches in the timeline that remains16:46
bauzasokay, that was it for today's items in the agenda16:47
bauzasanything anyone ?16:47
r-taketnbauzas: I have one request. (My entry might have been deleted, sorry for the confusion)16:47
MengyangZhang[m]I also have a topic. https://blueprints.launchpad.net/nova/+spec/enhanced-granularity-and-live-application-of-qos Would like to hear how the community feels about this before moving on to looking into the code and writing a spec. Already proposed this idea to cinder and they liked the idea of per-project control of qos and live updating of the qos limits. 16:48
r-taketn#link https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/244U52T3HINVWUFSMPMA45A67BUPAQGK/ 16:49
sean-k-mooneyMengyangZhang[m]: we will not provide an interface to modify cpu qos via cinder frontend qos16:50
bauzasr-taketn: ok, you're next16:50
sean-k-mooneyso if that's what you planned, -216:50
sean-k-mooneybut yeah, let's discuss r-taketn's topic first16:50
r-taketnCould you please check the last message in the above mailing list thread and give your comments? 16:51
sean-k-mooneyr-taketn: sure, but just to clarify something we said previously16:51
sean-k-mooneynova will not merge any code for unreleased hardware16:51
bauzasr-taketn: oh, OK, that's a follow-up question on Arm-CCA16:51
bauzasyeah, what sean-k-mooney said16:52
bauzaswe can't just support something that our operators can't use16:52
r-taketnOh, ok. I might have misunderstood16:53
sean-k-mooneyif there are some features that will eventually be used by CCA but are useful on their own, we can enable those in advance16:53
bauzasfor the sake of the meeting time, could we then move to the second item?16:54
r-taketnyes16:55
bauzasthanks16:55
bauzasMengyangZhang[m]: looking at your blueprint16:56
sean-k-mooneyMengyangZhang[m]: to be clear, "live QoS updates to CPU and memory limits, including configurations in cputune, cachetune, and memorytune libvirt XML elements to enable seamless performance adjustments for running instances." is not something we can or should support via cinder frontend qos policies16:56
sean-k-mooneyif we wanted to do that we would need to add a new QoS api to nova to allow admins to define qos policies16:56
sean-k-mooneyand an instance action to apply it16:56
bauzasthat sounds like a very large topic to discuss16:56
bauzashaven't we already proposed to discuss those kinds of topics at the PTG?16:57
sean-k-mooneyi thought so16:57
MengyangZhang[m]<sean-k-mooney> "Mengyang Zhang: we will not..." <- It's not via cinder qos. What we want to achieve is to be able to live update all QoS limits on existing VMs on nova side. For disk qos, it's a bit special as it requires cinder side changes. 16:57
bauzasI actually see it in https://etherpad.opendev.org/p/nova-2025.2-ptg#L4716:57
bauzasso I'd recommend deferring the discussion to the PTG16:57
sean-k-mooneyMengyangZhang[m]: so the only qos you can configure today in nova16:57
sean-k-mooneyis via flavor extra specs16:58
sean-k-mooneyso what you are effectively asking for is a limited form of live resize16:58
MengyangZhang[m]we can set cpu and memory qos via flavor extra specs. But cinder is a bit different. 16:59
sean-k-mooneyyou can set disk qos via flavor extra specs too.17:00
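(For reference, front-end disk qos is set today through flavor extra specs; a usage sketch with made-up values, using the documented quota:disk_* extra specs.)

    openstack flavor set m1.small \
      --property quota:disk_read_bytes_sec=10485760 \
      --property quota:disk_write_bytes_sec=10485760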
bauzaswe're over time17:00
sean-k-mooneycinder qos today could be supported via swap volume (volume retype in cinder)17:00
bauzasthanks folks17:00
bauzasI'll end the meeting but I also recommend talking about that use case at the PTG17:01
bauzasthanks17:01
bauzas#endmeeting17:01
opendevmeetMeeting ended Tue Feb 18 17:01:15 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.html17:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.txt17:01
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-02-18-16.01.log.html17:01
sean-k-mooneyMengyangZhang[m]: i think we would need a nova-cinder cross-project session on this topic for the cinder case17:01
cardoehttps://review.opendev.org/c/openstack/nova/+/942019 is just a simple message fix that I was gonna see if I could also backport to 2024.2 (and technically it affects back to 2024.1). Not sure if there's some test or release note you guys would like me to add.17:02
MengyangZhang[m]Yes, makes sense. So we discuss it on Thursday?17:02
r-taketnthanks17:02
MengyangZhang[m]<sean-k-mooney> "Mengyang Zhang: i think we woudl..." <- Yes, we can discuss it there. I already proposed this idea to them. Here's their comments https://meetings.opendev.org/meetings/cinder/2025/cinder.2025-01-15-14.00.log.html#:~:text=to%20implement%20it-,14%3A17%3A29,-%3Csimondodsley%3E%20i17:09
sean-k-mooneyMengyangZhang[m]: likely how we would need to do this is to add a new external event that cinder could use to call back when qos is modified on a volume17:09
sean-k-mooney we would also need to call them back to either complete the update or fail it17:10
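(A hedged sketch of such a callback, modeled on the existing os-server-external-events API that cinder already uses for events like volume-extended; the volume-qos-changed event name is purely hypothetical.)

    POST /v2.1/os-server-external-events
    {
        "events": [{
            "name": "volume-qos-changed",
            "server_uuid": "<server-uuid>",
            "tag": "<volume-id>"
        }]
    }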
MengyangZhang[m]makes sense. I can look into this a bit more closely, but since this may be a large project, I just want to make sure the community is interested in the idea so we can start working on it in my company17:12
sean-k-mooneyMengyangZhang[m]: have you written a cinder spec for this yet? i suspect we will need one for cinder and another for nova as it would need api changes on both sides17:12
MengyangZhang[m]not yet, yes spec makes complete sense. will work on it next17:13
sean-k-mooneyMengyangZhang[m]: this is similar in complexity to https://specs.openstack.org/openstack/nova-specs/specs/2024.2/approved/assisted-volume-extend.html17:13
sean-k-mooneyhttps://review.opendev.org/c/openstack/cinder-specs/+/86671817:14
sean-k-mooneyMengyangZhang[m]: while POCs are useful, in general it's expected to agree on the design in a spec before working on the code17:15
sean-k-mooneyif we focus on the cinder use case, what you're really trying to address is the gap for frontend qos when you do "openstack volume qos associate QOS_ID VOLUME_TYPE_ID"17:16
sean-k-mooney... why are they using a get request to make a change... 17:19
sean-k-mooneyhttps://docs.openstack.org/api-ref/block-storage/v3/index.html?expanded=detach-volume-from-server-detail#associate-qos-specification-with-a-volume-type17:19
MengyangZhang[m]For cinder, there are two requests: one is per-project qos, not per-volume-type qos. Secondly, once the qos policy is changed, we want to let existing vms automatically adopt the new limits instead of requiring a live migration to pick them up17:19
sean-k-mooneyanyway that returns a 20217:19
sean-k-mooneytoday it should be returning a 409 conflict if the volume is attached to an instance and the qos policy has frontend qos17:19
sean-k-mooneyMengyangZhang[m]: live migration should not cause it to update today17:20
sean-k-mooneyif it does, that's probably a bug17:20
sean-k-mooneyi strongly suspect that the cinder api is not properly rejecting association of a qos policy with a volume that is in use when the policy has frontend qos17:21
sean-k-mooneyif it allows that today, that is wrong17:21
MengyangZhang[m]Let me double check it17:25
sean-k-mooneyit would be a pretty serious api bug on their side17:26
sean-k-mooneybecause they would allow the requested state and the actual state to get out of sync17:26
sean-k-mooneywhich could lead to late failures the next time a user hard reboots an instance17:27
MengyangZhang[m]I will perform some testing locally and get back to you on this17:30
sean-k-mooneyif you can apply a qos policy to a volume that is in use and that policy has a frontend qos rule, then it's a cinder api bug from my perspective17:30
sean-k-mooneyit's not ok to make a change in the cinder api and only have it take effect potentially months later, the next time the vm is moved or the xml is otherwise regenerated17:31
sean-k-mooneythe current api is written in terms of the volume type17:32
sean-k-mooneynot in terms of a volume17:32
sean-k-mooneycinder should not apply the qos update to existing instances of the updated type until a retype or similar api call is done17:33
opendevreviewSylvain Bauza proposed openstack/nova master: Add a new ImagePropertiesWeigher  https://review.opendev.org/c/openstack/nova/+/94064217:41
opendevreviewSylvain Bauza proposed openstack/nova master: per-property ImageMetaPropsWeigher  https://review.opendev.org/c/openstack/nova/+/94160117:41
MengyangZhang[m]<sean-k-mooney> "its not ok to make a change i..." <- unfortunately, that's the case now. If a qos policy is created and associated with a volume type, then all in-use volumes of that type will get qos limits when the xml is regenerated, like during live migration or resize.17:48
MengyangZhang[m]<sean-k-mooney> "cinder shoudl not apply the..." <- what do you mean by retype?17:49
gmannbauzas: sean-k-mooney: RE on the RBAC spec, I read the meeting log. I am ok with the decision and understand if no core has bandwidth to review it. I will see what can best be done next cycle. 17:54
sean-k-mooneygmann: if you have the patches in a mergeable state with testing, i don't see why we would really wait once we are past the RC period17:55
gmannmikal: I am in PST TZ, I will check the tempest change after the TC meeting (~ 1hr from now)17:55
sean-k-mooneywe can approve the spec for next cycle 17:55
gmannack17:56
sean-k-mooneyi don't know how practical it is, but it would be nice to complete it before the PTG for example17:56
sean-k-mooneycertainly it's a good candidate to complete before m117:58
sean-k-mooneyi just don't know how much time we will all have with release activities etc17:58
gmannyeah, let me finish the code change first before re-proposing the spec. manager role is all good but i need to finish the service role things18:01
sean-k-mooneydoes one need the other? both are covered by the same spec but we do not need to complete both to merge it, right18:02
sean-k-mooneyif the manager role is working that can be merged first and then the service user role change can come after?18:02
gmannboth are separate, I was a little lazy and put both in one effort. this is the manager role change and its parent adds manager role context in unit tests so that we can see what the manager role addition looks like https://review.opendev.org/c/openstack/nova/+/941347/218:04
sean-k-mooneywell i don't really mind them both being in the same spec18:05
gmannand the manager role is what I am more interested in doing, as that is a user-usable thing, and the service role is just (but still important) to improve the internal communication 18:05
sean-k-mooneyright18:05
sean-k-mooneyi would probably do them in that order too18:06
gmannsean-k-mooney: what do you suggest now? do you want me to split out a spec for the manager role if that can be done in this cycle? you can see the code change for the manager role to answer that.18:07
gmannhttps://review.opendev.org/c/openstack/nova/+/941056/2 and https://review.opendev.org/c/openstack/nova/+/941347/218:08
sean-k-mooneyif we wanted to try and get some progress this cycle, that might be achievable given it's only 2 relatively small patches18:09
sean-k-mooneymy concern would still be with testing18:09
sean-k-mooneyhave you any tempest coverage for that18:09
sean-k-mooneyoh https://review.opendev.org/c/openstack/tempest/+/94159418:09
sean-k-mooneyand that's merged already 18:09
sean-k-mooneyalthough that's not really testing it18:11
gmannpolicy changes are from ADMIN -> manager-or-ADMIN and tempest tests those with admin, so no impact; but there are two policy changes from member-or-admin to manager-or-admin which do have impact, and tempest tests them with admin only.18:24
sean-k-mooneyi'm almost finished quickly reviewing the manager patches but there are issues in the second one18:24
gmannsean-k-mooney: by tempest testing you mean to test those policies with the manager role instead of admin? manager role access is asserted in unit tests18:24
sean-k-mooneycorrect, to assert that with the new default policy a client with the manager role could cold migrate, for example18:25
gmannthen we lose coverage of doing it with the manager role18:26
gmannunless we run the same tests with two roles, admin and manager 18:26
sean-k-mooneyyes i was suggesting doing it with both admin and manager18:26
gmann* then we lose coverage of doing it with the ADMIN role18:26
sean-k-mooneyalthough that is really just testing the implied roles feature, right18:27
sean-k-mooneyadmin would imply manager18:27
gmannthat might add more time to tempest jobs if we test APIs with all possible policy defaults, because it is the same for member-or-reader 18:28
sean-k-mooneyunlike service, i don't see a reason for it to sit outside the default hierarchy18:28
gmannsean-k-mooney: yeah, that is what I was about to write. I think testing with manager is the least-permissive default role testing we need18:28
gmannand admin anyway should have access to anything manager can do18:28
gmannsean-k-mooney: let me modify the tempest test and see18:28
sean-k-mooneyi guess there is one difference18:30
sean-k-mooneywe can't just use project manager in the rules, because when you have the admin role we do not need to compare the project id18:31
sean-k-mooneyso we do actually need to have PROJECT_MANAGER_OR_ADMIN for apis that are valid to call from both 18:32
sean-k-mooneyalthough on the tempest side i don't think that changes things18:32
gmannyeah18:34
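(A hedged sketch of the rule shape being discussed: PROJECT_MANAGER_OR_ADMIN is what the proposed patches would add, modeled on the existing PROJECT_MEMBER_OR_ADMIN convention in nova/policies/base.py; the constants below are illustrative, not nova's actual definitions.)

    from oslo_policy import policy

    # admin passes without a project_id check; manager must match the project
    PROJECT_MANAGER = 'role:manager and project_id:%(project_id)s'
    PROJECT_MANAGER_OR_ADMIN = 'rule:context_is_admin or (' + PROJECT_MANAGER + ')'

    migrate = policy.DocumentedRuleDefault(
        name='os_compute_api:os-migrate-server:migrate',
        check_str=PROJECT_MANAGER_OR_ADMIN,
        description='Cold migrate a server to a host',
        operations=[{'method': 'POST',
                     'path': '/servers/{server_id}/action (migrate)'}],
        scope_types=['project'])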
priteauHello. Could a Nova core approve this backport please? https://review.opendev.org/c/openstack/nova/+/940846 I have just tested it and it resolves our baremetal node deletion issues.18:35
sean-k-mooneyah the 2024.1 backport18:36
sean-k-mooneysure 18:36
gmannsean-k-mooney: anyways let me see if I can finish the testing, otherwise I am ok with not being in a hurry this cycle. The main reason is that I am struggling with my bandwidth for upstream activities, at least for the next few months.18:36
sean-k-mooneygmann: i finished reviewing the manager changes and i don't think you updated them with the discussion from the spec18:37
sean-k-mooneyi was also expecting you to modify instance lock but maybe that was not covered18:37
sean-k-mooneyi haven't looked at the spec in a while18:37
gmannIn the spec, we left out lock and the other server actions. I explicitly mentioned those as not changing18:38
sean-k-mooneyi was expecting you to allow the manager to unlock an admin-locked instance18:40
sean-k-mooneybut not restrict lock to manager18:40
sean-k-mooneybut that is fine18:40
sean-k-mooneypriteau: approved :)18:41
dansmithbauzas: still around?19:16
opendevreviewMichael Still proposed openstack/nova master: libvirt: allow direct SPICE connections to qemu  https://review.opendev.org/c/openstack/nova/+/92484419:31
priteausean-k-mooney: thanks!19:42
MengyangZhang[m]<sean-k-mooney> "cinder shoudl not apply the..." <- unfortunately this is the case now. How should I proceed then?  20:29
sean-k-mooneyit depends 20:31
opendevreviewMerged openstack/nova stable/2024.1: ironic: Fix ConflictException when deleting server  https://review.opendev.org/c/openstack/nova/+/94084620:31
sean-k-mooneyMengyangZhang[m]: if it allows the volume type to be updated after it's in use, does it also propagate those changes to instances when a new attachment is created20:32
sean-k-mooneyor does it update the content for existing attachments20:32
MengyangZhang[m]to new attachments; existing volumes must be reattached to have the update applied20:33
sean-k-mooneyack, so that's not great, as it really should only update it if you do a volume retype20:34
sean-k-mooneybut it's less bad than otherwise20:34
sean-k-mooneywhat we really need is a per-volume api call to trigger the attachment update. the worst case scenario is you create a new qos policy, you apply it to a volume type and cinder triggers all volumes of that type to be updated20:37
sean-k-mooneyby that i mean20:38
sean-k-mooneyit would be a problem if cinder triggered nova to call libvirt to update frontend qos on all affected volumes20:38
sean-k-mooneyit's tricky, but how nova would expect cinder to work is that when a volume is created with a given volume type, the definition of that volume type should be snapshotted and saved per volume and used for the lifetime of the volume20:41
sean-k-mooneymeaning the only way to update a volume to get the new qos would be via a volume retype to the same or a different volume type20:42
sean-k-mooneythat is how we handle flavors and images20:42
sean-k-mooneyit sounds like cinder is not doing this20:43
MengyangZhang[m]No, not for live migration/resize, because qos limits are saved in the cinder.quality_of_service_specs db table and are only associated with the volume type. I don't see qos values saved on a per-volume basis. Thus, when a volume attachment event happens, they just fetch the qos limits associated with the volume type and populate the connection_info field, which is then passed back to nova to generate the xml 20:48
MengyangZhang[m]The reboot however works a bit differently. I believe it is reading the connection_info from the nova.block_device_mapping db, which saves a snapshot of the volume qos and uses it for the lifetime of the instance. 20:50
sean-k-mooneyright, so that's problematic because it should not change on a live migration20:51
sean-k-mooneythe reboot behavior in nova is how it should work20:51
sean-k-mooneythe qos policy applied to an instance volume should not change as a side effect of changing the host20:52
MengyangZhang[m]understood, then what should we do? discuss this in the next cinder-nova meeting?20:53
sean-k-mooneyi think so yes20:53
sean-k-mooneyi do not know if modifying the QOS could cause the live migration to fail for example20:53
sean-k-mooneyor cause issues with reverting a cold migrate or resize20:53
MengyangZhang[m]i don't think it would cause live migration to fail. We have deployed qos to our private cloud for some time and we don't see increased live-migration failures when existing VMs are adopting the qos20:55
sean-k-mooneyit would fail if the destination host did not support the qos config20:56
sean-k-mooneyi.e. if the source and destination hosts had different versions of libvirt and the qos element was supported on the source but not the destination20:57
sean-k-mooneywe should be validating that in pre-live-migration20:57
opendevreviewribaudr proposed openstack/nova master: Augment the LiveMigrateData object  https://review.opendev.org/c/openstack/nova/+/94214320:58
opendevreviewribaudr proposed openstack/nova master: Add live_migratable flag to PCI device specification  https://review.opendev.org/c/openstack/nova/+/94214420:58
opendevreviewribaudr proposed openstack/nova master: Update manager to allow vfio pci device live migration  https://review.opendev.org/c/openstack/nova/+/94214520:58
opendevreviewribaudr proposed openstack/nova master: Update conductor and filters allowing migration with SR-IOV devices  https://review.opendev.org/c/openstack/nova/+/94214620:58
opendevreviewribaudr proposed openstack/nova master: WIP: Update driver to map the targeted address for SR-IOV PCI devices  https://review.opendev.org/c/openstack/nova/+/94214720:58
sean-k-mooneydownstream we support deploying a single version of nova across both rhel 8 and rhel 9 hosts for an extended period of time20:58
sean-k-mooneywe have at least one customer that told us that for regulatory reasons they can only update every 6-12 months21:00
sean-k-mooneyso they would realistically have some deployments with a mix of libvirt versions for up to a year21:00
MengyangZhang[m]I think we discussed this at some point; the pre-live-migration step wouldn't fail, plus openstack has supported front-end disk qos for 13 years21:01
sean-k-mooneyright, for the current set we have it won't fail unless libvirt removes support for a field in a new version21:02
sean-k-mooneybut i'm making the point that nova does not really expect this to change on live migration21:02
MengyangZhang[m]got it21:02
sean-k-mooneya concrete example of this is the weight element https://libvirt.org/formatdomain.html#block-i-o-tuning21:04
sean-k-mooneythe range [100, 1000]. After kernel 2.6.39, the value could be in the range [10, 1000]21:04
sean-k-mooneyso they changed the range that was allowed21:04
sean-k-mooneyagain it's old enough that it's ok21:04
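(The element in question, from the libvirt domain XML documentation linked above; the value is illustrative.)

    <domain>
      ...
      <blkiotune>
        <!-- allowed range was [100, 1000]; widened to [10, 1000]
             after kernel 2.6.39 -->
        <weight>500</weight>
      </blkiotune>
      ...
    </domain>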
MengyangZhang[m]makes sense21:05
sean-k-mooneyi'm going to call it a day but we should see what the cinder folks think about how to proceed21:05
MengyangZhang[m]makes sense, thanks for the help!21:05
opendevreviewMerged openstack/os-vif master: Remove os-vif-linuxbridge  https://review.opendev.org/c/openstack/os-vif/+/94156723:23
opendevreviewsean mooney proposed openstack/os-vif master: Update gate jobs as per the 2025.1 cycle testing runtime  https://review.opendev.org/c/openstack/os-vif/+/93558023:38
