16:00:40 <Uggla> #startmeeting nova
16:00:40 <opendevmeet> Meeting started Tue May 27 16:00:40 2025 UTC and is due to finish in 60 minutes. The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:40 <opendevmeet> The meeting name has been set to 'nova'
16:00:50 <Uggla> Hello everyone
16:01:00 <fwiesel> o/
16:01:08 <Callum027> o/
16:01:08 <gibi> o/
16:01:09 <Uggla> waiting a moment for people to join.
16:01:15 <dpascual> o/
16:02:52 <Uggla> #topic Bugs (stuck/critical)
16:03:00 <Uggla> #info No critical bugs
16:03:15 <elodilles> o/
16:03:37 <Uggla> bauzas is still working on https://review.opendev.org/c/openstack/nova/+/922140, a patch to enable nova-next.
16:03:45 <bauzas> o/
16:04:41 <Uggla> bauzas, anything you want to say?
16:05:13 <Uggla> about ^
16:05:35 <gmaan> o/
16:06:02 <Uggla> #topic Gate status
16:06:10 <Uggla> #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs
16:06:19 <Uggla> #link https://etherpad.opendev.org/p/nova-ci-failures-minimal
16:06:25 <Uggla> #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status
16:06:32 <Uggla> #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:07:21 <Uggla> It seems we are hitting https://bugs.launchpad.net/glance/+bug/2109428 a lot.
16:07:48 <Uggla> gibi, do you want to add something about it?
16:07:55 <gibi> no
16:08:03 <gibi> it is not nova
16:08:19 <gibi> and this week I did not check if we hit it again
16:08:44 <Uggla> yep, I hit it too last week.
16:09:28 <Uggla> maybe dansmith knows more about this issue?
16:09:43 <bauzas> I've seen a revert proposal
16:09:47 <bauzas> from Glance IIRC
16:09:49 <dansmith> is this pre-revert?
16:10:03 <dansmith> https://review.opendev.org/c/openstack/nova/+/950336
16:10:11 <dansmith> merged May 20
16:10:44 <gibi> I last checked on the 19th
16:10:50 <gibi> I can check for hits since then...
16:10:59 <dansmith> tl;dr the new glance location API is fundamentally broken for ceph and cinder backends; we reverted nova's use of it while working on a solution for those backends in glance
16:12:19 <Uggla> thx for the info dan
16:12:23 <gibi> I see two hits after the 20th
16:12:24 <gibi> | 455491157c0f44cebf2cfcfe9f1c7b5c | 2025-05-22T07:20:41 | check | https://review.opendev.org/950623 | master |
16:12:27 <gibi> | f61be3ef2b9a442d8843f65152343018 | 2025-05-20T12:55:29 | check | https://review.opendev.org/948450 | master |
16:12:48 <dansmith> one is the revert of the revert, so expected
16:13:01 <dansmith> the other is too close to the merge, so it probably used old code
16:13:09 <gibi> OK, then maybe we are moving in the right direction
16:13:14 <gibi> dansmith: thanks for the info
16:13:17 <dansmith> np
16:13:36 <Uggla> ok, moving on
16:13:37 <Uggla> #topic tempest-with-latest-microversion job status
16:13:47 <Uggla> #link https://zuul.opendev.org/t/openstack/builds?job_name=tempest-with-latest-microversion&skip=0
16:13:59 <gmaan> no new update this week
16:14:11 <Uggla> ok, so let's continue
16:14:27 <Uggla> #topic Release Planning
16:14:34 <Uggla> #link https://releases.openstack.org/flamingo/schedule.html
16:14:44 <Uggla> #info Nova deadlines are set in the above schedule
16:15:01 <Uggla> #info Only 1 week before the Nova Spec Soft Freeze, do not forget to submit your specs.
16:15:35 <Uggla> #topic Review priorities
16:15:45 <Uggla> #link https://etherpad.opendev.org/p/nova-2025.2-status
16:17:41 <Uggla> FYI, I have just asked for the first patches for Cloud Hypervisor, as we discussed last week.
16:18:31 <Uggla> but tbh I have not looked closely yet to know if we have progressed.
16:18:54 <Uggla> #topic OpenAPI
16:19:04 <Uggla> #link: https://review.opendev.org/q/topic:%22openapi%22+(project:openstack/nova+OR+project:openstack/placement)+-status:merged+-status:abandoned
16:19:20 <Uggla> 22 patches remaining. (+1)
16:19:36 <Uggla> #topic Stable Branches
16:19:50 <Uggla> elodilles, your turn.
16:20:07 <elodilles> thanks, I'll be short, as there's nothing to report
16:20:10 <elodilles> #info stable branches (stable/2025.1 and stable/2024.*) seem to be in OK state
16:20:16 <elodilles> Uggla: back to you
16:20:24 <Uggla> cool, thx elodilles
16:20:39 <Uggla> #topic vmwareapi 3rd-party CI efforts Highlights
16:20:56 <Uggla> fwiesel, something to share?
16:21:20 <fwiesel> So, from my side I want to introduce dpascual. He will be my stand-in in case I am unable to attend.
16:21:36 <Uggla> dpascual, welcome
16:21:37 <dpascual> hi everyone
16:22:01 <fwiesel> He is operating the vmwareapi CI, and should keep you in the loop in case I can't
16:22:51 <fwiesel> Otherwise, since last Tuesday we have had a problem with some change in neutron. I am debugging it; the CI failures come from there.
16:22:58 <fwiesel> That's it from my side.
16:23:09 <fwiesel> Uggla: back to you
16:23:41 <Uggla> fwiesel, thx, good to know.
16:24:26 <Uggla> #topic Gibi's news about eventlet removal
16:24:31 <Uggla> #link Series: https://gibizer.github.io/categories/eventlet/
16:24:51 <Uggla> #link nova-scheduler series is ready for core review, starting at https://review.opendev.org/c/openstack/nova/+/947966
16:24:52 <gibi> no new blog post, as my priorities have been mostly elsewhere these weeks
16:25:10 <gibi> sean-k-mooney gave some reviews and I respun the series to address them
16:25:37 <gibi> the oslo.service threading backend has been released and the global requirement bumped, so we are now using the official oslo.service version in the series
16:25:48 <gibi> that is all
16:25:52 <Uggla> \o/
16:26:11 <Uggla> thx gibi
16:26:20 <Uggla> #topic Open discussion
16:26:30 <Uggla> #link Callum027: Spec-less blueprint for adding new attributes to libvirt domain XML metadata: https://blueprints.launchpad.net/nova/+spec/xml-image-meta
16:26:49 <Callum027> Hi guys, I am proposing a spec-less blueprint to add new metadata to Nova's libvirt domain metadata for Ceilometer and other telemetry-related OpenStack services.
16:26:52 <Uggla> Hi Callum027
16:27:05 <Callum027> To give some context: a long time ago Ceilometer used to perform API queries for polling instances. Because this was not scalable, it was changed to read the XML metadata Nova adds to the libvirt domain metadata on hypervisors. My understanding is that this was not the original intention for how this metadata would be used, but Ceilometer now relies on it, has done so for a long time, and changing this at this point would be a big project.
16:27:22 <Callum027> The problem Ceilometer faces is that this metadata does not provide enough information for Ceilometer to do what it needs, so it still has to perform Nova API queries for some things, which scales badly with the number of running instances in a cloud and increases API load. What I'm proposing are additions to Nova's libvirt domain metadata that not only make it unnecessary for Ceilometer to make API queries for compute polling in most cases, but also add metadata that makes it easier to use services like CloudKitty to implement OpenStack billing.
16:27:40 <Callum027> I'd appreciate feedback if anyone has any (sean-k-mooney has +1'd the spec and changes as an initial pass), and, if everything looks good, approval to proceed with the patches created for this.
16:27:58 <gibi> btw this metadata in the libvirt XML helps with troubleshooting, as it tells the details of the VM from the XML instead of looking them up in the logs or from a DB dump
16:28:22 <sean-k-mooney> yep
16:28:27 <gibi> so I'm supporting
16:28:29 <sean-k-mooney> this is why I want to add more to it
16:28:29 <gibi> this
16:28:44 <sean-k-mooney> I mean, it helps with billing and other Ceilometer use cases too
16:29:02 <sean-k-mooney> but if I'm being selfish, having the info in the xml/logs means I do not have to ask support to ask a customer for it
16:29:17 <sean-k-mooney> so shortening that round-trip time is a massive win IMO
16:29:26 <Callum027> In our experience, having the flavour metadata was useful for helping determine how an instance ended up on a hypervisor
16:30:35 <Uggla> I understand we agree with this BP. Everybody ok with a specless one?
16:31:10 <gibi> yeah, I'm OK with it being a specless one
16:31:15 <sean-k-mooney> I'm supportive of that and of just proceeding with the review and ensuring we have a good release note
16:32:19 <sean-k-mooney> there are some asks that would be nice to have beyond what Callum027 would like to do, but I don't necessarily want to expand the scope of their request to include them
16:32:37 <sean-k-mooney> so if we decide to work on those we can file a follow-up blueprint or revisit the discussion
16:33:00 <gibi> +1 on the follow-up; keep the scope small and land it quickly
16:33:37 <Callum027> Sounds great, thanks guys; we've actually been using this in our production envs for a little while and it's been working great, so I'm confident it's ready to go
16:33:59 <gibi> Callum027 thanks for working on it
16:34:05 <Uggla> Callum027, cool
16:34:38 <Callum027> That's it from me Uggla, thanks
16:35:32 <Callum027> Actually, do I need to add this to the Etherpad?
16:36:10 <Uggla> Callum027, it is ok, I have just updated it
16:36:24 <Callum027> Awesome, thank you!
16:36:50 <Uggla> btw, thanks for joining this call.
16:37:21 <Uggla> I guess sean-k-mooney wanted to discuss this too: https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/NSW3OG5ME5RDPQYLHF4T4RCWPQYG57PK/
16:40:00 <gibi> it seems they have an RFE, but they don't have a person volunteering to implement it
16:40:29 <gibi> I guess we can wait until we have somebody who at least can propose a spec
16:40:33 <sean-k-mooney> /me clicks
16:40:47 <gibi> sean-k-mooney: this is the memballoon RFE
16:40:47 <sean-k-mooney> oh, that
16:41:07 <sean-k-mooney> ya, I kind of felt like maybe I should just spend an afternoon and write it
16:41:15 <sean-k-mooney> basically, the short version is:
16:41:33 <sean-k-mooney> nova originally did not enable the memballoon at all unless you enabled memory stat reporting
16:41:45 <sean-k-mooney> at some point libvirt started adding it by default
16:41:59 <sean-k-mooney> but we have never actually used it to release/reclaim unused memory from the guest
16:42:07 <sean-k-mooney> because that used to require a special agent on the host
16:42:17 <sean-k-mooney> proxmox and ovirt provided those
16:42:23 <sean-k-mooney> that's now built into libvirt
16:42:36 <sean-k-mooney> so we could just hardcode 2 xml attributes
16:42:49 <sean-k-mooney> which would allow memory to be released in 2 cases
16:43:14 <sean-k-mooney> one: when the OOM reaper is about to kill the VM, if we have autodeflate on, the VM gets a chance to free memory
16:43:24 <sean-k-mooney> the other one, "freePageReporting",
16:43:40 <sean-k-mooney> allows the memory to be returned to the host any time the guest frees memory
16:43:52 <sean-k-mooney> I think it would be nice to support both in nova
16:44:08 <sean-k-mooney> but ya, someone just needs to write it
16:44:32 <sean-k-mooney> I kind of wondered how others felt
16:44:40 <gibi> if you have time to write it I'm not against it, but it is just a nice-to-have
16:45:09 <sean-k-mooney> track this as a wishlist RFE bug?
16:45:12 <sean-k-mooney> spec?
16:45:38 <sean-k-mooney> I feel like this is one of those things where it's quicker to write than to discuss :)
16:46:19 <Uggla> Yep, that sounds quite straightforward, so specless BP, I think.
16:46:23 <gibi> if this is hardcoded as enabled, then specless
16:46:25 <sean-k-mooney> all of our qemu and libvirt version minimums are above where this was added, so there is no migration or upgrade aspect either
16:46:48 <gibi> does it affect realtime / NFV guest performance?
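[Editor's note: the two memballoon attributes discussed above, together with the Nova-maintained metadata block from the earlier blueprint discussion, sit in the libvirt domain XML roughly as sketched below. The memballoon attribute names are libvirt's documented ones; the nova:* element names, namespace version, and values are abbreviated and illustrative, not the exact schema Nova emits.]

```xml
<domain type="kvm">
  <!-- Nova-maintained instance metadata that Ceilometer reads today;
       the xml-image-meta blueprint would add further attributes here. -->
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.1">
      <nova:name>my-instance</nova:name>
      <nova:flavor name="m1.small"/>
    </nova:instance>
  </metadata>
  <devices>
    <!-- The two attributes discussed for the memballoon RFE: autodeflate
         lets the balloon deflate under host memory pressure (OOM case),
         freePageReporting returns freed guest pages to the host. -->
    <memballoon model="virtio" autodeflate="on" freePageReporting="on"/>
  </devices>
</domain>
```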
16:46:54 <sean-k-mooney> :)
16:47:13 <sean-k-mooney> well, yes, but they normally turn off the memballoon entirely
16:47:22 <sean-k-mooney> because it adds overhead that they don't tolerate
16:47:38 <sean-k-mooney> which is why we should fix the bug that stops you from being able to do that
16:47:43 <gibi> ahh
16:47:50 <sean-k-mooney> https://bugs.launchpad.net/nova/+bug/2069192
16:47:57 <gibi> so we should not make that situation worse :) Let's fix the bug first then
16:48:13 <sean-k-mooney> so this would be hardcoded on if you have the memballoon enabled
16:48:37 <sean-k-mooney> gibi: rajesh has a patch up to fix that bug https://review.opendev.org/c/openstack/nova/+/945621
16:48:44 <gibi> cool
16:49:06 <gibi> so I'm OK with a specless
16:49:14 <gibi> and I will look at that bugfix
16:49:30 <sean-k-mooney> Uggla: gibi: ok, let's do this. If I find time to work on this before m2, I'll file a specless blueprint and we can approve it then
16:49:51 <sean-k-mooney> if not, I can file an RFE bug or similar tracker, and if someone has time we can work on it later
16:50:01 <gibi> sounds good to me
16:50:53 <Uggla> +1
16:51:20 <Uggla> I have tracked that in the etherpad so as not to forget.
16:52:30 <Uggla> Something else to discuss?
16:52:48 <gibi> -
16:53:11 <Uggla> Do you still have energy left to triage 1 or 2 bugs?
16:53:26 <gibi> -
16:53:49 <sean-k-mooney> we can probably do one
16:54:17 <sean-k-mooney> although, if you have not noticed, I'm a little distracted today. But we can try to do one
16:54:51 <Uggla> #topic Bug scrubbing
16:54:56 <Uggla> #link: https://etherpad.opendev.org/p/nova-bug-selection-for-triaging#L4
16:55:02 <Uggla> Just one
16:55:17 <Uggla> https://bugs.launchpad.net/nova/+bug/2103500
16:56:21 <Uggla> To my mind that sounds like a valid one. At least regarding the comment from Brian.
16:58:29 <gibi> db error in the neutron code, this is a neutron bug
16:58:49 <gibi> at least neutron should not return 500
16:59:14 <sean-k-mooney> so we had a VM create request with --network
16:59:28 <sean-k-mooney> it existed at the API, but by the time we got to the compute it was deleted
16:59:48 <sean-k-mooney> I guess it could happen when we do the quota check as well
17:00:01 <gmaan> and nova does capture the NetworkNotFound exception if a requested network is not found https://github.com/openstack/nova/blob/221a3e89e8988bc664298106ee691a4e41ca71f9/nova/api/openstack/compute/servers.py#L842
17:00:02 <sean-k-mooney> in any case, ya, we should get back a 400, not a 500
17:00:43 <gmaan> or somewhere the neutron exception is lost and not raised to the controller?
17:00:47 <sean-k-mooney> the expected behavior here should be that nova puts the VM in ERROR
17:01:03 <sean-k-mooney> and when the VM is deleted we would clean up any ports that were created then
17:01:09 <sean-k-mooney> based on the delete-on-terminate behavior
17:01:20 <gmaan> yeah, the bug says to ignore the deleted network port and continue VM creation, which does not seem right
17:01:42 <sean-k-mooney> it's not, from a nova point of view, because we could not honor the original request
17:02:33 <sean-k-mooney> so depending on where this failed, it should be buried in cell0, rejected at the quota check with a 400, or in ERROR in the cell DB if it got all the way to the compute
17:04:05 <sean-k-mooney> reading it more carefully, I'm tempted to mark it invalid in nova
17:04:18 * gibi bows to Uggla and disappears
17:04:35 <sean-k-mooney> in their case it's the quota check that is failing, when we list all ports for a tenant
17:05:05 <Uggla> sean-k-mooney, if you are unsure we can keep it open for next week.
17:05:26 <sean-k-mooney> although, if we are missing the try/except on the NetworkNotFound, then that's a valid bug, I guess
17:07:18 <Uggla> sold on valid?
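[Editor's note: the error translation discussed above, i.e. a "network not found" error from Neutron surfacing as a 400 to the API caller rather than leaking as a 500, can be sketched as below. This is a self-contained illustration, not Nova's actual code: `NetworkNotFound`, `HTTPBadRequest`, `validate_requested_network`, and `create_server` are hypothetical stand-ins for the real types in the nova code linked in this discussion.]

```python
# Sketch of translating a client-side "not found" error into a 400
# at the API controller, so the caller never sees a 500 for a network
# that was deleted between request validation and use.

class NetworkNotFound(Exception):
    """Stand-in for the exception the Neutron client raises."""

class HTTPBadRequest(Exception):
    """Stand-in for the 400 response the API controller returns."""
    code = 400

def validate_requested_network(network_id, existing_networks):
    # Stand-in for the validation/quota step that looks the network up.
    if network_id not in existing_networks:
        raise NetworkNotFound(f"Network {network_id} could not be found.")
    return network_id

def create_server(network_id, existing_networks):
    # The controller catches the not-found error and re-raises it as a
    # 400; an uncaught NetworkNotFound is what would surface as a 500.
    try:
        validate_requested_network(network_id, existing_networks)
    except NetworkNotFound as exc:
        raise HTTPBadRequest(str(exc)) from exc
    return "ACTIVE"
```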
17:07:55 <gmaan> I have not tracked the exact call path of validate_networks(), but nova handles NetworkNotFound in _validate_requested_network_ids
17:07:55 <gmaan> https://github.com/openstack/nova/blob/221a3e89e8988bc664298106ee691a4e41ca71f9/nova/network/neutron.py#L988
17:09:06 <gmaan> does the bug mention a 500 being returned? or a 400?
17:09:14 <sean-k-mooney> gmaan: I think neutron is returning a 500
17:09:17 <sean-k-mooney> https://paste.openstack.org/show/bt3BWnyrBbQspiFx8KPZ/
17:09:39 <gmaan> ohk
17:09:51 <sean-k-mooney> I think we need more info to be sure
17:10:13 <sean-k-mooney> they do have a NetworkNotFound exception https://paste.openstack.org/show/bJL9b5aY99XpZlLZdzgy/
17:10:15 <gmaan> I think nova is handling the error correctly if neutron returns NetworkNotFound
17:10:35 <gmaan> and the controller also handles NetworkNotFound, so a 400 will be returned
17:11:21 <sean-k-mooney> can you comment to that effect on the bug?
17:11:47 <gmaan> sure, will do after the TC meeting
17:11:58 <sean-k-mooney> the first step to fixing this, if we were to fix it in nova, would be a reproducer test anyway
17:12:03 <sean-k-mooney> ideally a functional one
17:13:45 <Uggla> sean-k-mooney, does that mean we can set it as valid?
17:15:43 <Uggla> I need to close the meeting, so I propose to look at that bug offline later or next week.
17:15:47 <Uggla> Thanks all, and thanks for the extended time for bug triage.
17:15:54 <gmaan> Uggla: I will check the exact trace of validate_networks and see if nova is missing NetworkNotFound handling anywhere
17:15:54 <Uggla> #endmeeting