Tuesday, 2025-09-02

*** mhen_ is now known as mhen01:10
opendevreviewRajesh Tailor proposed openstack/nova master: Fix 'nova-manage image_property set' command  https://review.opendev.org/c/openstack/nova/+/94673306:53
*** elodilles_pto is now known as elodilles08:19
gibisean-k-mooney: Uggla: I've added the eventlet stuff to the PTG etherpad https://etherpad.opendev.org/p/nova-2026.1-ptg08:39
Ugglagibi 👍09:34
sean-k-mooneyelodilles: thank you for approving https://review.opendev.org/c/openstack/requirements/+/959051 in my head i was willing to grant the watcher-dashboard review until our next watcher irc meeting to merge but that was the last of the 3 that needed that release11:02
sean-k-mooneyit's one of those things where it's nice to complete the feature by including the horizon support this cycle but i also am not willing to keep merging things too much past FF even if we are just waiting on releases etc.11:04
opendevreviewMerged openstack/nova master: Reproduce that only half of the PCI devs are removed  https://review.opendev.org/c/openstack/nova/+/95397111:08
elodillessean-k-mooney: no problem. the python-watcherclient release happened yesterday, and new-release patches (upper constraint version bumps) should land as soon as possible anyway11:55
elodillessean-k-mooney: in the future: you could propose releases anytime if that is needed in other repository, no need to wait for the generated milestone release patches :)11:57
sean-k-mooneygibi: i finished reviewing https://review.opendev.org/q/topic:%22bug/2115905%22 the first 2 i'm +2 on although there are some nits related to the commit message that would be nice to fix. https://review.opendev.org/c/openstack/nova/+/954149 i have a question/suggestion on how to close the scheduler race for you to consider but i'm +1 overall on that11:58
sean-k-mooneyelodilles: actually that's not what happened in this case, we only merged the api change on FF11:58
elodilles(on the other hand watcher-dashboard is cycle-with-rc release typed project, so it still has time until the release, and FF is mostly discussed within the given project teams -> if the team decides that it's not risky for the release, then things can be merged after FF)11:59
sean-k-mooneyso we had to wait for that to merge before merging the client change and doing the release11:59
sean-k-mooneyelodilles: the main issue with watcher-dashboard is there is almost no testing or test infra in that and we need to figure out how to resolve that going forward.12:00
elodilles:S12:00
sean-k-mooneyelodilles: that means we need to carefully manually test all changes so that takes time12:00
elodillesi see12:00
sean-k-mooneyi want to eventually explore using playwright (instead of creating selenium tests) to see if we can get some end to end test coverage but we also could use the django unit test framework more than we do12:01
sean-k-mooneyhorizon has historically never really facilitated testing plugins well12:02
sean-k-mooneyso as a result most plugins were not properly tested12:02
sean-k-mooneyfixing that is on our todo list just not as high as it probably should be12:02
elodillesmaybe next cycle ;)12:06
grami[m]Has anyone run into the problem with Nvidia GPU and requiring pci-expander-bus and vswitch-pcis?12:11
sean-k-mooneygrami[m]: it should not require them however they recommend them12:19
sean-k-mooneyfor performance reasons12:20
sean-k-mooneyvswitch-pcis is not something i'm familiar with12:20
sean-k-mooneynvidia recommend in one of their kvm docs that you attach the gpus and nics to a virtual pcie switch in the vm12:21
sean-k-mooneynova can't do that but it's to help with device to device communication within the vm12:21
sean-k-mooneyinstead of connecting the gpu to a pcie-root-port you add a pcie-switch-upstream-port controller instead and you can then group multiple gpus or other devices as downstream ports on that switch.12:26
sean-k-mooneythere is very little upstream libvirt documentation on how this all works https://libvirt.org/formatdomain.html#controllers12:27
sean-k-mooneyand the nvidia docs on this topic are relatively new.12:27
sean-k-mooneybut equally hard to actually find12:27
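The switch layout described above can be sketched as a small generator for the libvirt `<controller>` elements. This is only an illustration, not something Nova does today: the helper names `build_pcie_switch` and `render` are hypothetical, the index values are arbitrary, and `<address>` elements are deliberately omitted, so how the ports end up attached to each other is not covered here.

```python
import xml.etree.ElementTree as ET


def build_pcie_switch(first_index, downstream_ports):
    """Build <controller> elements for one virtual PCIe switch.

    Hypothetical helper: emits one pcie-root-port, one
    pcie-switch-upstream-port, and N pcie-switch-downstream-port
    controllers, numbered consecutively from first_index.
    """
    models = ["pcie-root-port", "pcie-switch-upstream-port"]
    models += ["pcie-switch-downstream-port"] * downstream_ports
    controllers = []
    for offset, model in enumerate(models):
        controllers.append(ET.Element("controller", {
            "type": "pci",
            "index": str(first_index + offset),
            "model": model,
        }))
    return controllers


def render(controllers):
    """Serialize the controller elements, one per line."""
    return "\n".join(ET.tostring(c, encoding="unicode") for c in controllers)


# Example: a switch with two downstream ports, e.g. for two GPUs.
print(render(build_pcie_switch(first_index=1, downstream_ports=2)))
```

Each passthrough device would then be placed on one of the downstream ports instead of on its own root port.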
grami[m]sean-k-mooney: thanks, yeah I'm in the process of passing through an h200; with the normal pci-root-port performance is worse than the CPU. I guess at some point I will have to write a patch but the nova code is confusing to me and I need a starting point. Do you know what line in nova or the file to start in?12:31
sean-k-mooneyhonestly this sounds like an nvidia hardware bug as this really should not have any impact if it's only for 1 h200 device12:35
sean-k-mooneygrami[m]: changing this is not going to be easy in nova.  there is an upgrade impact to this change and it will likely require a spec.12:35
sean-k-mooneyit may be possible to just always use a pcie-switch-upstream-port instead of pcie-root-port but i suspect that will increase the qemu memory usage12:36
opendevreviewRajesh Tailor proposed openstack/nova master: Fix duplicate words  https://review.opendev.org/c/openstack/nova/+/95913612:36
sean-k-mooneyso that may not be acceptable to do in general12:36
grami[m]sean-k-mooney: thanks for the info will do some digging on the spec and look to do one. 12:37
sean-k-mooneybefore getting that far have you done any basic performance tuning like ensuring numa affinity between the vm and pci device12:38
sean-k-mooneyby default nova only does that if the guest has a numa topology 12:38
sean-k-mooneyi.e. adding hw:mem_page_size=small (explicit small 4k numa local pages)12:39
sean-k-mooneywill significantly boost your vm performance if it was previously going across numa12:39
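A minimal sketch of applying that extra spec with the `openstack flavor set --property` CLI. The flavor name `gpu.h200` and the helper `flavor_set_command` are placeholders for illustration; only the `hw:mem_page_size=small` property comes from the discussion above.

```python
import shlex


def flavor_set_command(flavor, properties):
    """Build the argv for `openstack flavor set` with the given extra specs."""
    argv = ["openstack", "flavor", "set"]
    for key, value in properties.items():
        argv += ["--property", f"{key}={value}"]
    argv.append(flavor)
    return argv


# "gpu.h200" is a hypothetical flavor name.
cmd = flavor_set_command("gpu.h200", {"hw:mem_page_size": "small"})
print(shlex.join(cmd))
```

Existing instances keep the extra specs they were booted with, so the change would typically be picked up on new instances or after a resize to the updated flavor.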
grami[m]sean-k-mooney: The memory issue is already a problem with 256gb as the BAR doesn't have enough so you have to increase this just to get the GPU to boot. 12:39
*** haleyb|out is now known as haleyb12:39
grami[m]Already have numa dedicated CPU and huge page 12:40
sean-k-mooneythat's not what i was referring to. so each pci_root_port adds a couple of megabytes of overhead even if unused12:41
sean-k-mooneyso when you enable the q35 machine type with the max number of pcie-root-ports we support today it has double the qemu overhead of the pc machine type12:41
sean-k-mooneyusing a pcie-switch might actually help with that but that's one of the things we would need to consider12:42
sean-k-mooneya long time ago i wanted to allocate 1 pci switch per guest numa node by default12:42
sean-k-mooneyand then attach the passthrough device to the relevant virtual numa node pcie switch12:42
sean-k-mooneyso that the guest could make proper scheduling decisions12:43
sean-k-mooneytoday all devices in the guest have numa node -1 i.e. no numa info available12:43
sean-k-mooneyso the guest kernel assumes it's on virtual numa node 0 effectively12:43
sean-k-mooneywhich can degrade performance so if you have not already tried explicitly having only 1 numa node in the guest that is worth testing12:44
grami[m]Ah yeah right now I'm going to see if I can get the instance to work by editing the xml then will look to see what needs to change. 12:44
grami[m]Yeah made sure it's only one numa zone for testing and will go from that. 12:45
sean-k-mooneyright now i don't know of a way to change the bar space allocated to the virtualised device. i assume that needs qemu support and i'm not sure if libvirt and qemu are hooked up such that it can be adjusted12:46
sean-k-mooneythere is memReserve12:47
sean-k-mooney```memReserve12:47
sean-k-mooney    Some PCI devices have non-prefetchable memory bar larger than 2MiB. Use this attribute to override value computed by firmware and thus make controller reserve more memory (in KiB) so that such PCI device can be hot plugged. For cold plugged PCI devices, the firmware will automatically reserve the correct amount of memory```12:47
sean-k-mooneybut i don't think that is correct12:48
sean-k-mooneygrami[m]: you're not trying to use resizable bar are you?12:59
sean-k-mooneyjust doing a little quick research: qemu/libvirt won't do the resize for you, you have to do that manually today before binding the device to vfio-pci. second there is a limitation in ovmf by default where the address space reserved seems to default to 32GB for mmio mappings https://edk2.groups.io/g/discuss/topic/5934071113:04
sean-k-mooneywe do not allow any use of <qemu:commandline> in nova so you would need that to be supported in libvirt natively before tuning the mmio aperture via "fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=65536" could be supported13:05
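For reference, the value in that option can be derived from a target aperture size. The sketch below assumes the `Mb` suffix means the value is expressed in MiB, in which case 65536 corresponds to 64 GiB; the helper `ovmf_mmio64_fw_cfg` is hypothetical and only formats the string quoted above.

```python
def ovmf_mmio64_fw_cfg(aperture_gib):
    """Format the OVMF 64-bit MMIO aperture option for a size in GiB.

    Assumption: X-PciMmio64Mb takes its value in MiB.
    """
    if aperture_gib <= 0:
        raise ValueError("aperture size must be positive")
    aperture_mib = aperture_gib * 1024
    return f"fw_cfg name=opt/ovmf/X-PciMmio64Mb,string={aperture_mib}"


print(ovmf_mmio64_fw_cfg(64))
```

As noted in the discussion, Nova will not pass this through `<qemu:commandline>`, so this is only useful for manual experiments with the domain XML.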
gibisean-k-mooney: 13:05
gibithanks for the review in https://review.opendev.org/c/openstack/nova/+/95414913:06
gibiI responded inline. bottom line your reserved idea is probably correct, but I'm afraid of the complexity of the patch already13:06
sean-k-mooneygibi: that's not a blocker13:07
sean-k-mooneywe can consider it in a followup13:07
sean-k-mooneyi was just trying to think if there is a way to close that gap13:07
grami[m]sean-k-mooney: Yeah had to add that Nvidia says it's a requirement for the larger gpus13:09
sean-k-mooneygrami[m]: based on https://bugzilla.redhat.com/show_bug.cgi?id=1578278 it looks like https://specs.openstack.org/openstack/nova-specs/specs/2023.1/approved/libvirt-maxphysaddr-support.html may also impact that13:09
sean-k-mooneybut i don't think the two are the same13:09
grami[m]Already have a patch for this but it's medieval in its method but works for now.13:10
sean-k-mooneywe can't accept any change that uses raw <qemu:commandline> elements in nova13:10
sean-k-mooneyso if we need to tweak the ovmf parameter separately from the cpu address space13:10
sean-k-mooneywe will need to modify libvirt first13:10
grami[m]Ah thanks for that 13:11
*** srelf_ is now known as Continuity13:21
opendevreviewribaudr proposed openstack/nova master: doc: mark the maximum microversion for 2025.2 Flamingo  https://review.opendev.org/c/openstack/nova/+/95918013:37
opendevreviewribaudr proposed openstack/nova master: Update compute rpc alias for epoxy  https://review.opendev.org/c/openstack/nova/+/95918113:40
gibisean-k-mooney: it is the frequency and the impact of the gap vs. my fears of the complexity to close and test that gap that makes me reluctant13:41
gibibauzas: re13:45
gibi10:32 < bauzas> honestly, I don't have any number, if the job works up to tomorrow, then if so, let's leave it to be voting :)13:45
gibiIt works https://zuul.opendev.org/t/openstack/builds?job_name=nova-tox-py312-threading&project=openstack/nova13:45
bauzasperfect, then let me hit the button13:48
gibithanks!13:49
opendevreviewribaudr proposed openstack/nova master: Add service version for Falmingo  https://review.opendev.org/c/openstack/nova/+/95918413:52
opendevreviewribaudr proposed openstack/nova master: Add Flamingo prelude section  https://review.opendev.org/c/openstack/nova/+/95918814:15
UgglaNova meeting in ~1h15:00
opendevreviewribaudr proposed openstack/nova-specs master: Move Flamingo implemented specs  https://review.opendev.org/c/openstack/nova-specs/+/95920215:22
opendevreviewMerged openstack/nova stable/2025.1: api: Fix validators for hw:cpu_max_* extra specs  https://review.opendev.org/c/openstack/nova/+/95773915:34
opendevreviewMerged openstack/nova-specs master: Move Flamingo implemented specs  https://review.opendev.org/c/openstack/nova-specs/+/95920215:47
opendevreviewMerged openstack/nova master: doc: mark the maximum microversion for 2025.2 Flamingo  https://review.opendev.org/c/openstack/nova/+/95918015:49
Uggla#startmeeting nova16:00
opendevmeetMeeting started Tue Sep  2 16:00:21 2025 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
UgglaHello everyone16:00
gibio/16:01
jayaanando/16:01
gmaano/16:01
dansmitho.16:01
bauzaso/16:02
UgglaLet's start.16:02
elodilleso/16:02
Uggla#topic Bugs (stuck/critical) 16:02
Uggla#info No Critical bug16:03
Uggla#topic Gate status 16:03
Uggla#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:03
Uggla#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:03
Uggla#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:03
Uggla#info Please look at the gate failures and file a bug report with the gate-failure tag.16:03
Uggla#info Please try to provide a meaningful comment when you recheck16:03
UgglaAnythink special with the gates ?16:04
UgglaAnything special with the gates ?16:04
gibijust a heads up16:04
gibithat I see multiple cases when the ceph multistore job failed due to volume build issues and mostly in the encrypted volume tempest tests16:05
gibiI did not dig deeper but I feel it is a pattern16:05
Ugglaan idea of the success/failure rate ?16:07
gibihttps://zuul.opendev.org/t/openstack/builds?job_name=nova-ceph-multistore&project=openstack/nova16:07
gibinot terrible16:07
gibiI will try to dig into the failure tomorrow16:08
Ugglaok thanks gibi16:08
Ugglaalthough that looks not easy.16:09
Ugglamoving on to next topic16:09
Uggla#topic tempest-with-latest-microversion job status 16:09
Uggla#link https://zuul.opendev.org/t/openstack/builds?job_name=tempest-with-latest-microversion&skip=016:09
Ugglagmaan something you would like to share on this topic ?16:10
gmaanno progress, I am going to spend time on it this week.16:10
Uggla👍16:10
Uggla#topic Release Planning 16:10
Uggla#link https://releases.openstack.org/flamingo/schedule.html16:11
Uggla#info Nova deadlines are set in the above schedule16:11
Uggla#info RC1 target is next week.16:11
Uggla#info I have created a draft 959067: Nova 2025.2 Flamingo cycle highlights | https://review.opendev.org/c/openstack/releases/+/95906716:11
UgglaThanks for reviewing the highlights, I know some of you have already done it.16:12
Uggla#info PTG etherpad for 2026.1 is available: https://etherpad.opendev.org/p/nova-2026.1-ptg16:12
Uggla^ thanks sean-k-mooney for creating it, gibi already put eventlet stuff in it.16:12
Ugglaso you can start adding topics you would like to discuss at the next ptg16:13
UgglaPTG will be October 27-31, 2025, you can already register  (https://openinfra.org/ptg/)16:15
UgglaBtw I have opened most of the RC1 patches. Feel free to ping me if you spot something wrong. 16:16
Uggla#topic Review priorities 16:16
Uggla#link https://etherpad.opendev.org/p/nova-2025.2-status16:17
UgglaBugs to review are in the doc, but we may focus on the following ones:16:18
Uggla#link 699176: Faults from cell DB missing in GET /servers/detail | https://review.opendev.org/c/openstack/nova/+/69917616:18
Uggla#link 955657: Preserve vTPM state between power off and power on | https://review.opendev.org/c/openstack/nova/+/95565716:18
UgglaThe latest one was really wanted by the author.16:19
Ugglaand gmaan if you can have a look at these 2: #link 952894: Reproducer for bug 2114951 | https://review.opendev.org/c/openstack/nova/+/952894 & 952895: Fix bug 2114951 | https://review.opendev.org/c/openstack/nova/+/95289516:20
jayaanandi am from NetApp. Multiple customers asking for extending NFS volumes attached to instances16:21
* Uggla shamelessly proposing mine.16:21
gmaanright, I opened those when you added me in review and then got distracted, will check those16:21
Ugglagmaan 👍16:21
jayaanandIs it possible in G release 16:22
Ugglajayaanand, it is a feature if I'm not wrong.16:22
jayaanandRelated to spec https://review.opendev.org/c/openstack/nova-specs/+/94950416:23
Ugglaok I'll put that in the PTG topics, it has been in the pipe for quite a long time. So I'll try to set it as a "priority".16:26
UgglaAny draft patches already attached to it ?16:26
jayaanandThank you! We don't have any draft patch 16:28
dansmithjayaanand: you don't have draft patches because... you expect someone else to work on it?16:29
jayaanandWe have partial fix proposed https://review.opendev.org/c/openstack/nova/+/680648 for a Bug related to NFS extension. Is it possible to merge 16:29
Ugglajayaanand, there are 2 x -1 so it looks difficult.16:30
jayaanandReviewers pointed to this feature 16:30
dansmithwell, some discussion is definitely required.. and that discussion should happen before we try to give you any sort of idea about when it could land16:31
jayaanandok16:32
bauzasare we already in the open discussion ?16:33
Ugglabauzas not yet.16:33
bauzasif not, please await until we have this topic as we are on our nova meeting, please :)16:33
bauzas(fwiw, I have an item for our open discussion as well :-) )16:33
Ugglajayaanand, let's discuss that at PTG.16:34
Ugglamoving on16:34
Uggla#topic OpenAPI 16:34
Uggla#link: https://review.opendev.org/q/topic:%22openapi%22+(project:openstack/nova+OR+project:openstack/placement)+-status:merged+-status:abandoned16:34
Uggla#info still 28 remaining atm.16:34
Uggla#topic Stable Branches 16:35
Ugglaelodilles, please go ahead.16:35
elodilles#info stable branches (stable/2025.1 and stable/2024.*) seem to be in OK state16:35
jayaanandThank you! 👍16:35
elodilles#info stable/2025.2 branch cut patch for nova libraries (needs release liaison approve): https://review.opendev.org/c/openstack/releases/+/95913716:35
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:35
elodillesthat's all from me, back to you Uggla 16:36
UgglaThx elodilles16:36
UgglaSkipping next topic because I guess Fabian is still on pto16:36
Uggla#topic Gibi's news about eventlet removal. 16:37
Uggla#link Blog: https://gibizer.github.io/categories/eventlet/16:37
gibio/16:37
Uggla#link nova-scheduler series is ready for core review, starting at https://review.opendev.org/c/openstack/nova/+/94796616:37
Ugglagibi, the mic is yours if you'd like to share something.16:37
gibiso n-sch, n-api, n-metadata are done and tested in nova-next. 16:37
gibithe n-conductor patch missed the FF due to a late futurist bug that is since fixed but we will only try n-cond in G16:38
gibithe py312-threading job is the last piece for F; making it voting is on the gate as we speak. This runs the majority of our unit tests without eventlet16:38
gibiplans for G are in the PTG etherpad16:39
gibiexpect a tight schedule :)16:39
Uggla👍16:39
gibithat is it16:39
gibiback to you Uggla16:39
Ugglathx gibi, it's great you managed to do that. I think it will give confidence for the next steps.16:40
JayFgibi: I'll note that futurist bug was fixed and released and added to requirements :)16:40
gibiJayF: correct16:41
JayFMay not change your internal status for conductor, but wanted to make sure folks knew16:41
gibithanks JayF 16:41
sean-k-mooneyo/ sorry was in an internal meeting16:42
Uggla#topic Open discussion 16:42
UgglaSean would like to discuss https://bugs.launchpad.net/nova/+bug/206091616:42
UgglaSo please go ahead.16:43
sean-k-mooneythis is short16:43
sean-k-mooneybasically the trusted_vif feature for years required custom policy to use16:43
sean-k-mooneyin 2025.1 neutron finally fixed that with a proper api extension16:43
sean-k-mooneynova now just needs to use it16:44
sean-k-mooneymy question is16:44
sean-k-mooneyshould we treat this as a security hardening opportunity/bug16:44
sean-k-mooneyor a specless blueprint16:44
sean-k-mooneyi.e. would we consider backporting it or master only16:44
sean-k-mooneywe could also defer that to the ptg but to me this is either a wishlist rfe bug16:45
sean-k-mooneyor a very small specless blueprint but i don't know what people would prefer16:45
gibiI cannot really tell without checking how this will look like16:46
gmaanin that case, shouldn't the neutron policy default for "binding:profile" be changed to admin-or-service? 16:47
gibiI guess we need to depend on the new neutron API16:47
gmaanthat is what they have for many other APIs16:47
bauzasthat's a new extension ?16:47
sean-k-mooneygmaan: it's currently admin only and it should eventually be service only yes16:47
bauzasif so, I'm afraid of any backport16:47
gmaansean-k-mooney: service only or admin-or-service?16:47
sean-k-mooneywhat i can do is make time to do a poc16:47
sean-k-mooneygmaan: it should be service only eventually16:48
sean-k-mooneyhumans should never modify its content16:48
sean-k-mooneyonly nova ironic or zun should ever modify binding:profile16:48
sean-k-mooneyand that was the security issue. you had to relax access to let people modify its content to use trusted_vif16:48
gmaani see16:49
UgglaI have the feeling that a specless BP looks more adequate.16:49
sean-k-mooneywhat i can do is see if i can find time to do a poc before the ptg or a future irc meeting16:49
sean-k-mooneyand we can revisit then16:50
sean-k-mooneynote we treated a similar issue for hardware offloaded ovs as a security hardening bug16:50
sean-k-mooneythat's why i asked16:50
gmaanneutron API extension is done this cycle? 16:50
sean-k-mooneyhttps://bugs.launchpad.net/nova/+bug/202081316:51
sean-k-mooneygmaan: it was done in 2025.1 i believe so epoxy16:51
gmaanyeah, since 2025.1 https://github.com/openstack/neutron-lib/blob/stable/2025.1/neutron_lib/api/definitions/port_trusted_vif.py16:52
sean-k-mooneyhttps://review.opendev.org/c/openstack/neutron/+/92606816:52
sean-k-mooneythe neutron change is older? not sure but i see 2024.216:53
sean-k-mooneyanyway i just wanted to get an initial read from folks16:54
sean-k-mooneyto me it was the same as https://bugs.launchpad.net/nova/+bug/202081316:54
gmaanI think till we have neutron API available, we can backport it but not beyond that16:54
sean-k-mooneywe can also defer the backporting question until we see what the patch looks like16:55
gmaan++16:55
sean-k-mooneyit should be small and non-invasive in general but i have not written it yet16:55
Ugglaok, bauzas, you would like to discuss something. I hope it will be ok in the remaining time.16:57
bauzasjust a quick one16:57
bauzasbut let's discuss this in the next meeting16:57
bauzas(about the pavilion on the openinfra summit, whether we would want to be there)16:57
sean-k-mooneywhat is that?16:59
sean-k-mooneydo you mean the forum sessions that used to be part of the summit?17:00
bauzashttps://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/MM4OJO5CCC4GHMYUN3OJBDHTSZNB32H2/17:00
bauzasbut let's discuss this next week17:00
bauzasI have another TC meeting now17:00
sean-k-mooney oh it's effectively lightning talks or showcases17:00
sean-k-mooneyfor projects to have time in a shared presenting space17:01
sean-k-mooneysure we can think on that and revisit next week17:01
sean-k-mooney`The deadline to sign up is 26 September!` ok so we also have time17:01
sean-k-mooneyto respond to this ask17:01
Ugglayep.17:01
UgglaWe’ve reached the top of the hour, so thanks to everyone attending this meeting.17:03
gibithanks Uggla 17:03
Uggla#endmeeting17:04
opendevmeetMeeting ended Tue Sep  2 17:04:00 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:04
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-09-02-16.00.html17:04
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-09-02-16.00.txt17:04
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-09-02-16.00.log.html17:04
elodillesthanks o/17:04
opendevreviewMerged openstack/nova master: Add service version for Falmingo  https://review.opendev.org/c/openstack/nova/+/95918420:20
opendevreviewMerged openstack/nova master: Fix pci_tracker.save to delete all removed devs  https://review.opendev.org/c/openstack/nova/+/95397220:20
gmaanUggla: commented in this, I am ok to fix this as bug and no need to bump microversion. I think you are planning to backport also https://review.opendev.org/c/openstack/nova/+/95289522:12
gmaansome suggestion to fix comment and explain detail in commit msg otherwise fix lgtm22:13
