Monday, 2026-04-13

opendevreviewArtem Vasilyev proposed openstack/nova master: Add reproducer for bug #2147776  https://review.opendev.org/c/openstack/nova/+/98400708:27
opendevreviewArtem Vasilyev proposed openstack/nova master: Fix race between resize and update_available_resource for dedicated CPUs  https://review.opendev.org/c/openstack/nova/+/98404808:40
opendevreviewMichel Nederlof proposed openstack/nova master: Add RBD XML update functionality for migration process  https://review.opendev.org/c/openstack/nova/+/97403208:53
opendevreviewMichel Nederlof proposed openstack/nova master: Add RBD XML update functionality for migration process  https://review.opendev.org/c/openstack/nova/+/97403209:04
opendevreviewBalazs Gibizer proposed openstack/nova master: [py313-threading]Reenable last scatter-gather unit test  https://review.opendev.org/c/openstack/nova/+/98434909:31
gibifyi from #openstack-eventlet-removal09:45
gibi09:30 < amorin> hey team, I am looking to start an openstack deployment without eventlet,09:45
amorinyup :) for now it's mostly dev purposes09:52
sean-k-mooneyamorin: you can mostly do that with devstack today but i don't know if any of the other installers have an easy button to do it10:08
amorinI have my own way to install anyway :) for nova it's pretty straightforward with this env var, that's perfect10:09
amorinI think I already know the answer, but placement is also eventlet free, right?10:12
DominikDanelski[m]sean-k-mooney: Hello, did you have time recently to look at proposed nova-spec (https://review.opendev.org/c/openstack/nova-specs/+/978570?usp=search), the related code changes, or the smaller bugfix related to allocations (https://review.opendev.org/c/openstack/nova/+/968446?usp=search)?10:16
sean-k-mooneyamorin: placement never used eventlet10:17
sean-k-mooneyamorin: so yes10:17
amorinmakes sense10:18
sean-k-mooneywatcher uses the same env var or a very similar one and cyborg is eventlet free now too, just an fyi10:18
amorinperfect, thanks10:18
lajoskatonaUggla: Hi, welcome back! I have a patch in Neutron for sending notification (/os-server-external-events: https://docs.openstack.org/api-ref/compute/#create-external-events-os-server-external-events ) in case a new subnet is created on a network on which there are already VMs sitting.11:07
lajoskatonaUggla: from user perspective this will mean (most visible at least) that after creating the subnet the server list output will add the extra IP for the given VM.11:08
lajoskatonaUggla: This is a bulk POST from Neutron side, but as I am not familiar with the depths of Nova would be good to check this from a Nova perspective also before I overload some network cache mechanism :-)11:09
sean-k-mooneylajoskatona: i didn't think ports received ips from subnets automatically11:20
sean-k-mooneylajoskatona: my understanding is if you add a subnet to a network that has existing ports, and you want to add an ip from that subnet to a port, you have to do it manually11:20
sean-k-mooneyso the server output should not change without adding an additional fixed ip11:20
lajoskatonasean-k-mooney: good point, I have to double check that11:21
sean-k-mooneyso the act of adding a subnet to a network should not require a network-vif-changed event11:21
sean-k-mooneylajoskatona: if you auto added ips to ports by adding a subnet that would be a breaking api change from my perspective, as historically you did that when you exhausted ips and needed more11:22
sean-k-mooneyso you would not want to auto allocate ips from the new subnet to existing ports11:22
lajoskatonasean-k-mooney: I played only with sending out the network-changed event 11:24
sean-k-mooneysorry network-changed is what i meant11:24
sean-k-mooneythe only place where we cache/use the subnet info is for the metadata api11:25
sean-k-mooneybut even so adding a subnet to a network11:25
sean-k-mooneyshould not change the subnets associated with a port11:25
sean-k-mooneythat is determined based on its fixed ips11:25
sean-k-mooneyso i think it's only correct to send a network-changed event to nova if you added a fixed ip from the new subnet to a port but not for the creation of the subnet itself11:27
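The rule sean-k-mooney describes above (notify Nova only when a port's fixed IPs change, not when a subnet is merely created) can be sketched roughly as follows. This is a minimal illustration, not Neutron's code: the `Port` class and `build_network_changed_events` helper are hypothetical names, and the only parts taken from the discussion are the `network-changed` event name and the os-server-external-events API it is posted to.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Port:
    """Hypothetical, simplified view of a port bound to a server."""
    server_uuid: str
    fixed_ip_subnets: Set[str]  # subnet IDs the port has fixed IPs on


def build_network_changed_events(before: Dict[str, Port],
                                 after: Dict[str, Port]) -> Dict[str, List[dict]]:
    """Build a bulk os-server-external-events body.

    Only ports whose set of fixed-IP subnets changed produce an event;
    creating a subnet without touching any port produces none.
    """
    events = []
    for port_id, new in after.items():
        old = before.get(port_id)
        if old is None:
            continue  # this sketch only compares ports that already existed
        if new.fixed_ip_subnets != old.fixed_ip_subnets:
            events.append({"name": "network-changed",
                           "server_uuid": new.server_uuid})
    return {"events": events}
```

With this shape, adding a subnet to the network leaves `before` and `after` identical for every port, so the body carries an empty event list and nothing needs to be sent.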
lajoskatonasean-k-mooney: ack, I will check whether that can be caught and managed from the Neutron side, and whether it is something to fix or just to document somewhere.11:30
lajoskatonasean-k-mooney: did it work ever? Perhaps when nova-network was the networking?11:30
opendevreviewKoya Watanabe proposed openstack/nova-specs master: Repropose newly instance metadata/tag protection feature  https://review.opendev.org/c/openstack/nova-specs/+/97733911:33
sean-k-mooneylajoskatona: did what ever work?11:41
sean-k-mooneyadding a subnet to a network is not meant to add it to existing ports11:42
sean-k-mooneyso i don't think that ever "worked" if that is what you mean11:42
sean-k-mooneybut that would be a breaking api change to add now11:42
sean-k-mooneynova-network only ever supported 1 subnet per tenant and it was not really exposed as a first class thing11:43
sean-k-mooneyat least not in the way that neutron does11:43
sean-k-mooneyauto adding subnet ips to ports would also likely break l3 routed networks for what it's worth11:43
opendevreviewBalazs Gibizer proposed openstack/nova master: [py313-threading]Reenable the last vmware unit test  https://review.opendev.org/c/openstack/nova/+/98436911:44
sean-k-mooneygiven you can't assume a subnet is routable to all hosts in that case11:44
opendevreviewBalazs Gibizer proposed openstack/nova master: [py313-threading]Reenable the last vmware unit test  https://review.opendev.org/c/openstack/nova/+/98436911:45
opendevreviewBalazs Gibizer proposed openstack/nova master: Fix oslo_svc_fixture.SleepFixture usage  https://review.opendev.org/c/openstack/nova/+/98437311:56
Ugglasean-k-mooney, we have this bug (more an RFE).  https://bugs.launchpad.net/nova/+bug/2060916 do you know the status? Do we have a specless blueprint for it ?12:46
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438313:01
opendevreviewPhilipp Dreesens proposed openstack/nova-specs master: Add spec for bidirectional RPC liveness handshake  https://review.opendev.org/c/openstack/nova-specs/+/98438413:02
sean-k-mooneyUggla: so the nova part of this13:16
sean-k-mooneyUggla: is that nova should adopt the use of the new more secure way of enabling trusted_vf13:17
Ugglasean-k-mooney something to discuss at the PTF ?13:18
sean-k-mooneyhaving an admin modify the binding profile was a security risk, especially if you used custom policy to allow end users13:18
Uggla*PTG13:18
sean-k-mooneyUggla: this came out of a prior ptg discussion from a year or two ago13:18
sean-k-mooneyi asked neutron to develop this13:18
sean-k-mooneyso i would prefer to handle this as a wishlist bug or at most a specless blueprint ideally13:19
sean-k-mooneyUggla: basically this just requires us to check 1 additional location with a fallback to the old way13:19
Ugglayes I'd like a SLBP too, and closing this bug.13:19
sean-k-mooneyack then let's create the blueprint first then mark it closed after with a link to the blueprint13:20
Ugglaok do you want me to open the BP ?13:21
sean-k-mooneyUggla: originally this was a bug prly because we might want to backport it to the release where neutron added the feature13:21
Ugglaoh I guess we can note that in the BP13:21
sean-k-mooneyif you have time that would be great, if not i can try and loop back to it13:21
sean-k-mooneyUggla: the backporting is really just a nice to have13:22
UgglaI'm gonna open it, I will ping you for a "review".13:22
sean-k-mooneyif we fix it on master i would be happy with that13:22
Ugglaack13:22
opendevreviewKoya Watanabe proposed openstack/nova-specs master: Repropose newly instance metadata/tag protection feature  https://review.opendev.org/c/openstack/nova-specs/+/97733913:22
sean-k-mooneyUggla: i'm not sure if you saw my message from last week about the cyborg cross project session13:23
sean-k-mooneywould 13:00 utc tuesday work for you in the cyborg room13:23
Ugglasean-k-mooney, yep I saw it this morning. I propose to discuss during the upstream meeting.13:23
sean-k-mooneyack that works too13:24
Ugglais 1h ok for you ?13:24
sean-k-mooneyyes 13:24
sean-k-mooneywe might be able to do less time but it should not take more13:24
sean-k-mooneymy plan was to then use the next hour in the cyborg room to wrap up the cyborg ptg13:24
sean-k-mooneyso we can take any feedback back and resolve it 13:25
sean-k-mooneyi'll be able to join the nova session then after that wraps up, at least on tuesday13:25
phildree[m]Hi Nova team! I’ve proposed a new spec for improving compute node healthchecks: https://review.opendev.org/c/openstack/nova-specs/+/984384 13:26
phildree[m]This is my first time working on OpenStack/Nova, so I’d love any feedback or corrections on the approach. Thanks for the help in advance!13:26
UgglaI have just seen the TC meetings shrink to 16h and 17h.13:27
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438313:29
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438313:32
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438314:24
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438314:29
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438314:34
opendevreviewJohn Garbutt proposed openstack/nova master: scheduler: support shared tenant placement aggregates  https://review.opendev.org/c/openstack/nova/+/98438314:37
bauzasUggla: I just noticed that the nova sessions are 1 hour later than usual PTGs (2pm-6pm UTC). Was it intentional ?15:02
Ugglabauzas, I have just booked the room before my PTO. So the slots are not well defined. Please do not take them into account yet.15:16
bauzasUggla: ack, can you then check with sean-k-mooney about the exact timings because he mentioned they scheduled their Watcher sessions based on our own timings15:17
Ugglabauzas, yes we will discuss that at the upstream meeting. Then I will refine the agenda/tools15:17
bauzascool15:17
sean-k-mooneyi have tried to arrange the cyborg and watcher sessions to minimise overlap with nova and the tc sessions15:18
sean-k-mooneybasically i'm trying to maximise the time i can attend the nova sessions based on how they were originally planned15:18
opendevreviewKamil Sambor proposed openstack/nova master: Replace eventlet with threading in novncproxy  https://review.opendev.org/c/openstack/nova/+/97608915:19
sean-k-mooneymy intent is if there is not a tc, cyborg or watcher session at the time of a nova session i'll try and be in the nova sessions15:19
opendevreviewKamil Sambor proposed openstack/nova master: Enable native threading mode for console proxy services  https://review.opendev.org/c/openstack/nova/+/97608915:20
opendevreviewKamil Sambor proposed openstack/nova master: Enable threading mode for proxy services  https://review.opendev.org/c/openstack/nova/+/97608915:21
UgglaNova upstream meeting in ~20mn15:40
opendevreviewMichel Nederlof proposed openstack/nova master: Add RBD XML update functionality for migration process  https://review.opendev.org/c/openstack/nova/+/97403215:42
Uggla#startmeeting nova16:02
opendevmeetMeeting started Mon Apr 13 16:02:24 2026 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.16:02
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:02
opendevmeetThe meeting name has been set to 'nova'16:02
UgglaHello everyone16:02
sean-k-mooneyo/16:02
fwieselo/16:02
bauzaso/ but tired16:02
phildree[m]o/16:03
gibio/16:03
tkajinamo/16:03
elodilleso/16:03
UgglaLet's start16:04
Uggla#topic Bugs (stuck/critical)16:04
Uggla#info No Critical bug16:04
Uggla#topic Gate status16:04
Uggla#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:04
opendevreviewsean mooney proposed openstack/nova master: enable tap creation in nova-live-migration  https://review.opendev.org/c/openstack/nova/+/97550016:04
Uggla#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:05
Ugglalink https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:05
Uggla#info Please look at the gate failures and file a bug report with the gate-failure tag.16:05
Uggla#info Please try to provide a meaningful comment when you recheck16:05
UgglaTBH, I have not checked the status. Anything to report about the gate ?16:05
gibiUggla: I saw some tox-cover job timeouts recently but I haven't looked deeper16:06
Ugglagibi ok16:07
gibihttps://zuul.opendev.org/t/openstack/builds?job_name=openstack-tox-cover&project=openstack/nova it is not super frequent but it exists16:07
gibiI vaguely remember we changed something there not too long ago...16:07
gibiahh we added the func test16:08
gibihttps://review.opendev.org/c/openstack/nova/+/97865216:09
gibiwe already bumped the timeout once16:09
sean-k-mooneyyep which slows things down16:09
sean-k-mooneyi think stephen had an idea for making it faster but it would have lost some coverage initially16:09
sean-k-mooneybumping it slightly more might be ok as i think it now only fails on slow rax hosts16:09
gibiyeah that is easy to do16:10
gibiI will put up a patch...16:10
Ugglacool16:11
sean-k-mooneyi will say there is like a 30min spread between the faster and slower hosts so it's quite sensitive to the provider16:11
Ugglasomething else to add ?16:12
sean-k-mooneynot really16:12
Ugglaok so moving on16:12
Uggla#topic Release Planning 16:12
Uggla#link https://releases.openstack.org/hibiscus/schedule.html16:12
Uggla#info we will discuss Hibiscus release planning at the PTG.16:12
Uggla#info Nova deadlines are set in the above schedule16:12
Uggla#info PTG etherpad for 2026.2 is available: https://etherpad.opendev.org/p/nova-2026.2-ptg16:13
Uggla#info This is a "work in progress document", but you can enter your topics at the bottom of the document16:13
* Uggla currently working to build the agenda. I'll ping when it will be ready.16:13
opendevreviewBalazs Gibizer proposed openstack/nova master: [openstack-tox-cover]Increase timeout further  https://review.opendev.org/c/openstack/nova/+/98441716:14
UgglaI'll try to pack the X session on Mon / Tue.16:14
UgglaMon will be shorter to let people go to the tc sessions.16:15
Ugglasean-k-mooney suggested a cross session: would tuesday 13:00 utc work for the nova/cyborg cross project time?16:16
Uggladoes someone have an objection to that ^ ?16:16
sean-k-mooneythe intent was to have it before the nova sessions started, since when i proposed that nova didn't have monday sessions16:17
sean-k-mooneythat leaves 1 hour left in the cyborg session to reflect on any feedback16:17
sean-k-mooneyi have 2 cyborg slots booked on monday which we could also use as an alternative16:18
sean-k-mooneyso i'm pretty flexible on what works for the wider group16:18
gibimy only concern is overlapping with the tc meetings16:19
gibi(but I don't know the exact timings)16:19
sean-k-mooneyyep i have made sure there is none between cyborg and tc16:19
sean-k-mooneyhttps://ptg.opendev.org/ptg.html16:19
sean-k-mooneywe could put up a poll or i can just sync with rene and we can create an etherpad16:21
bauzasyeah I will need to attend the TC sessions16:21
bauzasusually we also had sessions one hour earlier, that's probably why now we see more conflicts16:21
sean-k-mooneywe will also need to squeeze in the other nova cross project session ideally in the first 2 days16:22
Ugglalet's start like this, we will adjust during the week if needed. Because atm not all cross session are defined yet.16:22
bauzas(we were usually meeting in between 1-5pm UTC)16:22
sean-k-mooneybauzas: the time change does not impact the tc session16:22
sean-k-mooneyoriginally nova was not having sessions at all on monday and it finished before them on monday16:22
sean-k-mooneyin the current ptg agenda there is also no conflict16:23
Ugglabauzas again, do not take into account what was defined earlier.16:23
bauzasthere is a conflict on FRiday16:23
sean-k-mooneyactually no, there is now on friday16:23
sean-k-mooneyright, there wasn't before the extra sessions were added the week before last16:23
bauzasand yeah, originally, TC sessions were on Monday with no Nova sessions, and on Friday it was overlapping for one hour16:24
Ugglabauzas, I will try to avoid overlapping for sure.16:24
sean-k-mooneyi think one of the tc sessions might have moved from monday to friday16:25
sean-k-mooneyas there are now only 2 hours on monday16:25
Ugglaok I propose to move on to next topic, I will ping when the things will be more accurate/settled.16:26
bauzasthanks16:26
bauzas(on Wed, I will have to skip two hours due to personal concerns but I'll speak out once timings settled)16:27
tkajinam(my only hope is that cc-related topics are scheduled in earlier slots16:27
tkajinam(but do not intend to be demanding :-P16:27
Ugglabut basically my goal is to pack X session on Mon / Tue avoid overlapping and try to spread the topic in the most convenient hours.16:28
Ugglatkajinam, sure I'll keep that in mind.16:28
tkajinamUggla, makes sense16:28
tkajinamUggla, ;-)16:28
Ugglaok next topic16:29
Uggla#topic Review priorities 16:29
Uggla#link New file for Hibiscus https://etherpad.opendev.org/p/nova-2026.2-status16:29
Uggla#info I have updated Launchpad and the above doc. Please ping me if you spot something missing.16:29
Uggla#info Starting: https://etherpad.opendev.org/p/nova-2026.2-status#L16 interesting bugs to review.16:29
Uggla#info Noticed https://bugs.launchpad.net/nova/+bug/2147554 bug entered by Gibi : graceful shutdown fails if requested too early in nova-compute startup sequence. Might need follow up by gmaan.16:30
Ugglagibi, do you want to say something about ^16:30
gibinot much. I noticed some race condition related to graceful shutdown and notified gmaan 16:30
gibinothing serious the shutdown still happens just a bit less gracefully :)16:31
Uggla:)16:31
UgglaAlso gouthamr created nice specs for storage. https://review.opendev.org/c/openstack/nova-specs/+/983801 16:32
Ugglaand https://review.opendev.org/c/openstack/nova-specs/+/983816 16:33
UgglaWe will have a X session with manila to discuss them. So cool if you can look at them upfront.16:33
Uggla#topic Stable Branches16:34
* Uggla giving the mic to elodilles16:34
sean-k-mooneygibi: why do you say fails, does it exit early16:34
sean-k-mooneyi.e. not wait16:34
sean-k-mooneyelodilles: Uggla  would we want to implement the memfd part as well for virtio-fs quality of life16:35
gibisean-k-mooney: fails to be graceful I guess :)16:36
sean-k-mooneyUggla: that was for you i guess, to include that topic in the manila session16:36
Ugglayes we discuss to have a topic about it. 16:36
opendevreviewMerged openstack/nova stable/2026.1: Follow ironic job rename  https://review.opendev.org/c/openstack/nova/+/98316616:36
Ugglahowever, from internal perspective, all of these are considered low priorities.16:36
Uggla:(16:37
sean-k-mooneywell if someone does the work we can try and review but ya...16:37
sean-k-mooneyi'm not sure if elodilles is around at the moment or stable branches are all good ?16:37
elodillesyepp16:37
elodillesi'm around16:37
Ugglahowever I said it does not hurt to discuss at the PTG and progress at least on the spec16:37
elodillesjust wanted to wait for the prev topic to end o:)16:38
sean-k-mooney:)16:38
Ugglayep sorry elodilles please go ahead16:38
elodillesthx :)16:38
elodilles#info nova stable gates should be OK16:38
elodilles#info thanks for the reviews of placement and osc-placement gate fixes, all but one merged: placement's 2025.1 branch needs a fix for grenade-skip-level job16:38
elodillesi'll try to figure out the issue there, but if that doesn't succeed soon, then we can set it to non-voting (as it is already the case on nova's stable/2025.1)16:38
elodilles(unmaintained/2024.1 -> stable/2025.1)16:39
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:39
elodillesbasically that's it from me16:39
* elodilles passes the mic back16:40
sean-k-mooneywell we are not strictly meant to do upgrade testing from unmaintained16:40
sean-k-mooneybut if you do get it working then cool16:40
elodillessean-k-mooney: yes, that's why i mentioned16:40
sean-k-mooneyis this pkg_resources related out of interest16:40
elodillessean-k-mooney: though if i can get it working... o:)16:40
sean-k-mooneyi know we haven't fixed those issues on 2024.2 in the requirements repo16:40
sean-k-mooneyso i expect 2024.1 is broken too16:40
elodillessean-k-mooney: yes, it is somewhat related, but afaiu it shouldn't be :/ so still checking why we end up with an error like that :/16:41
sean-k-mooneyack16:41
elodillessean-k-mooney: no, that works afaik16:41
elodillessean-k-mooney: the error is when nova is upgraded from 2024.1 to 2025.1, but i still don't have the full root cause16:42
sean-k-mooneythe requirements check job fails because the requirements repo depends on pkg_resources and we have not pinned setuptools on 2024.2 or bumped pbr, but we can chat about that later if you like16:42
elodillessean-k-mooney: sure16:43
elodillesthanks in advance o/16:43
* Uggla just noticed you passed the mic back16:44
Ugglathanks elodilles16:44
elodillesthanks too o:)16:44
Uggla#topic vmwareapi 3rd-party CI efforts Highlights16:44
Ugglafwiesel anything for us ?16:44
fwieselHi, so I can confirm that the bugfix in oslo.vmware would fix the image upload timeout observed by gibi16:45
fwiesel#link  https://review.opendev.org/c/openstack/oslo.vmware/+/98261416:45
Uggla\o/ thanks16:45
fwieselThat's from my side for now... I spent the calmer days looking into some random errors, but nothing to report now.16:45
gibiyeah I'm +1 on the fix16:46
Ugglafwiesel thank you.16:46
Uggla#topic Gibi's news about eventlet removal16:46
* Uggla giving the mic to gibi16:46
gibio/ 16:46
gibino much to report16:48
gibiwork ongoing on the console proxies and testing16:48
gibithe PTG etherpad is up to date16:48
gibiwe will discuss status and plans there16:49
gibiEOM16:49
sean-k-mooneyhttps://review.opendev.org/c/openstack/oslo.vmware/+/982614 looks independent of the executor issue we saw in the vmware job, so this is just a general urllib3 issue not an eventlet one, right16:49
sean-k-mooneyi'm asking because it does not have a bug filed for it and i'm wondering if eventlet was masking it16:50
gibithe executor issue was the deadlock on sending dependent tasks to the parent's executor we fixed that in general16:51
gibithis bug appeared after16:51
gibiand was visible as hanging in image tasks in the vmware driver16:51
sean-k-mooneyack so it may have been there but not visible before16:52
gibiyeah16:52
sean-k-mooneyi know eventlet also monkey patches urllib316:52
sean-k-mooneyso that also complicates things16:52
sean-k-mooneyok we can likely move on16:52
Uggla#topic Bug scrubbing 16:52
Uggla#info up to 197 (-2)16:52
Uggla#link: https://etherpad.opendev.org/p/nova-bug-triage-roster16:53
Uggla#link: https://truc.uggla.fr/ to follow the trend.16:53
UgglaUggla: This week [public] Upstream bug triage. Wednesday, April 15 · 15:30 – 16:00 UTC. Video call link: https://meet.google.com/kmq-eedg-auj16:53
Ugglaoops forgot to say : Skipping Layos topic as he does not seem to be around and we are a bit late on schedule16:53
Uggla#topic Open discussion 16:54
UgglaUggla: Suggest no meeting next week. 20th April vPTG week.16:54
gibi+116:54
UgglaI will notify on the ML too16:54
elodilles+116:54
Ugglaany last 5 minutes topic to discuss ?16:55
phildree[m]Hello guys, I am not sure if this is the right place and time to bring this up because i am new to OpenStack, but I have created a spec and am looking for some feedback: https://review.opendev.org/c/openstack/nova-specs/+/98438416:57
Ugglaphildree[m], can you add a topic in the ptg etherpad ? then I'll define a slot for you next week.16:58
phildree[m]Ok, thank you :)16:58
Ugglaany "preferred" timing  ?16:58
phildree[m]Nope16:59
Ugglaok17:00
Ugglaso I think we are good.17:00
Ugglatime to close, thanks for joining this meeting. Have a nice day/evening and see you next week for the vPTG.17:01
Uggla#endmeeting17:01
opendevmeetMeeting ended Mon Apr 13 17:01:07 2026 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:01
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2026/nova.2026-04-13-16.02.html17:01
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2026/nova.2026-04-13-16.02.txt17:01
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2026/nova.2026-04-13-16.02.log.html17:01
fwieselo/ thanks everyone17:01
elodillessee you next week o/17:01
sean-k-mooneyphildree[m]: have you talked to oslo about this17:01
phildree[m]No, this IRC is my first point of contact for now17:01
sean-k-mooneyphildree[m]: in general i would prefer if we didn't have to have any application level handshakes17:01
gibithanks17:02
sean-k-mooneywe have some existing rpc pings but anything that makes the conductor compute messaging noisier by adding background messages directly hurts the scalability of nova17:03
phildree[m]sean-k-mooney: Ok, I understand. Maybe handshake is a bit much for what i have proposed, its more like an echo17:03
sean-k-mooneyright i'm saying even adding an echo/ping is potentially a scaling issue17:04
phildree[m]sean-k-mooney: I have seen the existing ping but it is not applicable to our usecase (I explained why in the spec)17:04
phildree[m]sean-k-mooney: yes you are totally right, I get that. Any other ideas on how to solve the problem we have been facing?17:05
opendevreviewMerged openstack/nova stable/2025.1: Fix fill_metadata usage for the ImagePropertiesWeigher  https://review.opendev.org/c/openstack/nova/+/96429117:05
sean-k-mooneyi have not read your spec yet but we generally prefer to start with the usecase/problem17:05
sean-k-mooneybefore going to a feature/design change17:05
sean-k-mooneyphildree[m]: the previous solution which we never implemented was per binary healthcheck endpoints on each nova service17:07
sean-k-mooneyphildree[m]: i think the issue you're having is that sometimes the outbound rabbit connection is functional17:08
sean-k-mooneybut the inbound one is not correct17:08
sean-k-mooneyi.e. the compute can call the conductor but the conductor cannot successfully call the computes17:08
phildree[m]Yes, we faced this exact issue.17:09
sean-k-mooneydo you have the guaranteed message delivery flag enabled in oslo17:09
phildree[m]I will think about the other approach with individual per binary healthcheck endpoints 17:09
sean-k-mooneywell we didn't implement that or at least we didn't merge it17:10
phildree[m]That is a good question, which I am not able to answer right now17:10
melwittbauzas: hi, there are two vtpm related patches that are simple and need a second review if you could get a chance plz https://review.opendev.org/c/openstack/nova/+/978836 and https://review.opendev.org/c/openstack/nova/+/98350417:11
sean-k-mooneyphildree[m]: https://docs.openstack.org/nova/latest/configuration/config.html#oslo_messaging_rabbit.direct_mandatory_flag17:11
sean-k-mooneyi would check you have that set to true17:11
phildree[m]My issues with the per binary healthcheck is that I still need a way to ensure that full communication from conductor to compute (and back) is up and running, which i cannot check without any actual communication, I think17:11
sean-k-mooneyphildree[m]: what that will detect is if the per compute queue is present17:11
phildree[m]Ok very nice, I will look into it, thank you for the infos :)17:12
sean-k-mooneyphildree[m]: well the whole point of the per binary healthcheck is to not use rabbit or the conductor17:12
sean-k-mooneyit's to move the health monitoring out of that code path to an external http endpoint17:12
sean-k-mooneythat you can have prometheus or whatever scrape instead17:12
sean-k-mooneyphildree[m]: the direct_mandatory_flag turns a silent failure (the message is enqueued to a new queue that nothing is listening to) into a direct rpc call failure and a failure in the api 17:14
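A quick way to confirm the option sean-k-mooney links is explicitly set is to parse the service config. The sketch below uses only the Python standard library and takes the section and option name (`[oslo_messaging_rabbit]` / `direct_mandatory_flag`) from the linked documentation URL; the sample file contents and the helper name are illustrative assumptions, and the `False` fallback only means "not set in this file", not the library's own default.

```python
import configparser

# Illustrative nova.conf fragment, not a complete or recommended config.
SAMPLE_NOVA_CONF = """
[oslo_messaging_rabbit]
direct_mandatory_flag = true
"""


def direct_mandatory_explicitly_enabled(conf_text: str) -> bool:
    """Return True only if the option is present in the text and truthy."""
    # interpolation=None so '%' characters elsewhere in a real file are harmless
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser.getboolean(
        "oslo_messaging_rabbit", "direct_mandatory_flag", fallback=False
    )
```

Calling `direct_mandatory_explicitly_enabled(SAMPLE_NOVA_CONF)` returns `True`; a file that omits the section returns `False`, which is a prompt to check what the deployed oslo.messaging version defaults to.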
phildree[m]I get it, and it makes sense, but then there is no way to check the health of the communication (and the underlying infrastructure) between conductor and compute (only the individual binaries)17:14
sean-k-mooneyright, today we do that via the resource update which is a push model17:15
sean-k-mooneyit's worth a discussion i guess but my main concern is we don't really run periodics from the conductor to monitor the status of other agents17:16
sean-k-mooneyand we try not to do monitoring in general in nova. so this would likely need to be configurable and off by default17:16
sean-k-mooneyat least until we know the overhead of it17:16
phildree[m]I have touched many points of what you have brought up, but as I said, I am not yet very familiar with OpenStack so maybe there are some issues I did not anticipate.17:17
phildree[m]I think I have to read up on what you have sent first before I may rethink the idea. Thanks a lot for the infos so far, that was very helpful and a lot less complicated than I thought :)17:17
sean-k-mooneyphildree[m]: one thing we are currently changing17:19
sean-k-mooneyis we will no longer have a single rpc listener per compute agent17:19
sean-k-mooneywe will now have two because of the graceful shutdown work17:19
sean-k-mooneyso gmaan would likely need to review your proposal in the context of that work17:20
phildree[m]Ok, that is something very important as this may have implications on my idea17:20
sean-k-mooneywith graceful shutdown we will be shutting down 1 of the 2 listeners until the in-progress operations complete17:20
sean-k-mooneyso any kind of monitoring would likely have to be on the one we don't shut down17:21
gmaansean-k-mooney: reading chat17:22
sean-k-mooneygmaan: context is https://review.opendev.org/c/openstack/nova-specs/+/984384/1/specs/2026.2/approved/bidirectional-rpc-liveness-handshake-for-compute-nodes.rst17:22
phildree[m]At least for the shutdown period, yes17:22
sean-k-mooneyphildree[m]: are you going to present this at the ptg for discussion?17:22
phildree[m]I have to be honest, I first have to read up on what the PTG exactly is and how it works17:23
sean-k-mooneyack, it's basically a working session where users/operators/developers talk about our plans for the next release and pain points17:24
sean-k-mooneyit operates now as a virtual event with an etherpad for notes and agenda and jitsi meet as the video conference tool17:24
sean-k-mooneyit's free to attend. they used to be in-person events17:25
phildree[m]Then this would probably make sense from my point of view (even if its just to gain a better understanding of the whole project)17:25
sean-k-mooneyif you cant attend thats fine too.17:25
gmaanphildree[m]: I agree with sean-k-mooney  that periodic handshake over RPC is expensive especially in scale env17:25
sean-k-mooneygmaan: i'm kind of wondering if we can maybe mark the node as down sooner by monitoring failed/timed out rpc calls or something like that17:26
gmaanI also have not read the spec but does compute service up check not enough? that is kind of periodic check if service is down or not17:26
sean-k-mooneyi.e. only do a conductor to compute ping in that case to confirm the state17:26
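The idea floated here (watch for failed or timed out RPC calls, and only then send a conductor to compute ping to confirm) could look roughly like the sketch below. Nothing in it is Nova code: `RpcFailureTracker`, the threshold, and the `ping` callable are all hypothetical, and the sketch ignores the false-positive concern raised in the discussion beyond requiring the confirmation ping to fail as well.

```python
from collections import defaultdict
from typing import Callable, Dict, Set


class RpcFailureTracker:
    """Hypothetical helper counting consecutive failed RPC calls per host."""

    def __init__(self, threshold: int, ping: Callable[[str], bool]):
        self.threshold = threshold
        self.ping = ping  # confirmation probe, used only after repeated failures
        self.failures: Dict[str, int] = defaultdict(int)
        self.down: Set[str] = set()

    def record(self, host: str, ok: bool) -> None:
        """Record the outcome of a normal RPC call to a compute host."""
        if ok:
            # Any success resets the counter and clears the down mark.
            self.failures[host] = 0
            self.down.discard(host)
            return
        self.failures[host] += 1
        # No background traffic: the ping is sent only once the threshold is
        # reached, and the host is marked down only if that ping fails too.
        if self.failures[host] >= self.threshold and not self.ping(host):
            self.down.add(host)
```

The design point is that a healthy cloud sends no extra messages at all; the extra ping is paid only for hosts that are already failing calls.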
sean-k-mooneygmaan: there was at least one case in the past where the compute service is up but the rabbit queue was deleted17:27
sean-k-mooneyso it could send rpc calls like the status update but not receive them17:27
gmaan" failed/timedout rpc calls" might not give actual result and can end up in false result. 17:27
sean-k-mooneyhttps://docs.openstack.org/nova/latest/configuration/config.html#oslo_messaging_rabbit.direct_mandatory_flag was partly added to prevent that17:27
phildree[m]gmaan: I think if you do the service up check it relies on the DB entries based on heartbeats, which is exactly what our issue was, because the heartbeat may keep working while the requests from conductor to compute fail17:28
gmaanyeah not RPC17:28
phildree[m]Just like sean-k-mooney just described17:28
gmaanyeah, that is true and an existing challenge 17:29
sean-k-mooneyphildree[m]: the real issue is, take an env with 500 nodes and 3 conductors. we technically have a leader election feature in the conductor but if we didn't use that, every 300 seconds (our default periodic interval) we would send 1500 messages: one from each of the 3 conductors x 500 compute nodes17:30
gmaanso what is the actual issue? the conductor to compute RPC call fails, then it checks the status to see if the compute service is up or not, and if yes then RPC is down, right? 17:30
sean-k-mooneyin rare cases the compute can be up enough for the status to be up but a call to the compute from the api will fail to boot a vm17:31
phildree[m]In my proposal the handshake is compute initiated in order to avoid leader stuff and states in the conductor 17:31
sean-k-mooneyphildree[m]: well we typically have 3-5 conductors per cell and can have 300-1500 computes per cell in large clouds17:32
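The scale concern above can be checked with a small back-of-the-envelope sketch. It assumes the worst case described in the discussion, where every conductor pings every compute once per periodic interval with no leader election; the function names are hypothetical and not part of nova.

```python
def pings_per_interval(conductors: int, computes: int) -> int:
    """Messages per periodic interval if every conductor pings every compute."""
    return conductors * computes


def pings_per_second(conductors: int, computes: int, interval_s: float = 300.0) -> float:
    """Average message rate, assuming the 300 second interval quoted in the chat."""
    return pings_per_interval(conductors, computes) / interval_s


# The example from the discussion: 3 conductors and 500 compute nodes.
small_cell = pings_per_interval(3, 500)
# Upper end of the "large cloud" range mentioned: 5 conductors, 1500 computes.
large_cell = pings_per_interval(5, 1500)
```

With these assumptions the 3 x 500 case gives the 1500 messages per interval mentioned above, and the count grows with the product of conductors and computes per cell.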
gmaanI am not saying 'service up' is reliable source but my question is do we want to automate it in conductor to check compute before call otherwise asl sch  to select other dest? or just for clear troubleshooting and error message that what exactly is down?17:33
gmaans/asl/ask17:33
sean-k-mooneygmaan: ya i'm not saying we should do this, i'm just playing devil's advocate17:34
gmaansean-k-mooney: or the healthcheck proposal is a good fit here, but does the latest proposal check RPC health also?17:35
phildree[m]it's like a tcp Syn (compute sends a request to any conductor), Syn-Ack (conductor replies to that tiny request), Ack (compute initiates the regular heartbeat as is)17:35
phildree[m]This way when any part of the communication is broken the node does not appear as up17:35
phildree[m]But in order to reduce RPCs I think also just a periodic ping without compute-based initiation might be feasible but will have the issue of coordination etc.17:35
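A minimal sketch of the compute-initiated handshake described above, modelled in memory with no real RPC. The `Link` class and both functions are hypothetical illustrations, not nova or oslo.messaging APIs; the two flags stand for the two directions of the message bus.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical model of the two RPC directions between compute and conductor."""
    compute_to_conductor: bool = True
    conductor_to_compute: bool = True


def handshake_reports_up(link: Link) -> bool:
    """Compute only starts its heartbeat after a full round trip succeeds."""
    # Syn: compute sends a small request to any conductor.
    if not link.compute_to_conductor:
        return False
    # Syn-Ack: the reply must reach the compute's own listener.
    if not link.conductor_to_compute:
        return False
    # Ack: compute starts the regular heartbeat, so the node appears as up.
    return True


def heartbeat_only_reports_up(link: Link) -> bool:
    """Status based only on heartbeats sent by the compute (one direction)."""
    return link.compute_to_conductor


# The failure case from the chat: compute can send but cannot receive.
deleted_queue = Link(conductor_to_compute=False)
```

In the `deleted_queue` case the heartbeat-only check still reports the node as up, while the handshake does not, which is the difference the proposal is after. The sketch ignores timeouts, retries and the cost of the extra messages.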
sean-k-mooneygmaan: it was going to catch connection closed events and similar on the compute and report if the rpc bus was working17:36
sean-k-mooneybut not by active probing17:36
sean-k-mooneyonly by recording if we got an exception17:36
gmaank17:36
sean-k-mooneyso it would not really help in this case17:36
gmaan++17:36
gmaanyeah17:36
gmaando not have any perfect solution as of now but let's discuss it in PTG17:38
phildree[m]Ok, thanks anyway for the engagement, I really appreciate you guys taking your time :)17:39
*** dwi_ is now known as dwi18:48
opendevreviewMerged openstack/nova stable/2026.1: fix: device_by_alias should respect config type  https://review.opendev.org/c/openstack/nova/+/98398919:20
opendevreviewJohn Garbutt proposed openstack/nova master: Make PCPUs not land on VCPUs by default  https://review.opendev.org/c/openstack/nova/+/97577919:58
opendevreviewJohn Garbutt proposed openstack/nova master: Fix ironic instance rebuild with API >=2.93  https://review.opendev.org/c/openstack/nova/+/96329819:59
melwittdansmith: there is one more small patch that was split out from the 'user' live migration patch that is ready for review at your convenience https://review.opendev.org/c/openstack/nova/+/98350520:24
dansmithah yep, the other one, sorry I guess I missed that when hitting the first one21:00
dansmithdone now21:00
melwittdansmith: thanks!!21:16
*** bauzas1 is now known as bauzas22:40
opendevreviewmelanie witt proposed openstack/nova master: [WIP] how borken is ipv6 only ceph.  https://review.opendev.org/c/openstack/nova/+/98230222:46
opendevreviewmelanie witt proposed openstack/nova master: [WIP] how borken is ipv6 only ceph.  https://review.opendev.org/c/openstack/nova/+/98230222:51