16:00:40 <Uggla> #startmeeting nova
16:00:40 <opendevmeet> Meeting started Tue May 27 16:00:40 2025 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:40 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:40 <opendevmeet> The meeting name has been set to 'nova'
16:00:50 <Uggla> Hello everyone
16:01:00 <fwiesel> o/
16:01:08 <Callum027> o/
16:01:08 <gibi> o/
16:01:09 <Uggla> awaiting a moment for people to join.
16:01:15 <dpascual> o/
16:02:52 <Uggla> #topic Bugs (stuck/critical)
16:03:00 <Uggla> #info No Critical bug
16:03:15 <elodilles> o/
16:03:37 <Uggla> bauzas is still working on https://review.opendev.org/c/openstack/nova/+/922140, a patch to enable nova-next.
16:03:45 <bauzas> o/
16:04:41 <Uggla> bauzas, something you want to say ?
16:05:13 <Uggla> about ^
16:05:35 <gmaan> o/
16:06:02 <Uggla> #topic Gate status
16:06:10 <Uggla> #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs
16:06:19 <Uggla> #link https://etherpad.opendev.org/p/nova-ci-failures-minimal
16:06:25 <Uggla> #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status
16:06:32 <Uggla> #info Please look at the gate failures and file a bug report with the gate-failure tag.
16:07:21 <Uggla> Seems that we are hitting a lot https://bugs.launchpad.net/glance/+bug/2109428
16:07:48 <Uggla> gibi, do you want to add something about it ?
16:07:55 <gibi> no
16:08:03 <gibi> it is not nova
16:08:19 <gibi> and this week I did not check if we hit it again
16:08:44 <Uggla> yep I hit it too last week.
16:09:28 <Uggla> maybe dansmith knows more about this issue ?
16:09:43 <bauzas> I've seen a revert proposal
16:09:47 <bauzas> from Glance IIRC
16:09:49 <dansmith> is this pre-revert?
16:10:03 <dansmith> https://review.opendev.org/c/openstack/nova/+/950336
16:10:11 <dansmith> merged may 20
16:10:44 <gibi> I last checked on the 19th
16:10:50 <gibi> I can check for hits since then...
16:10:59 <dansmith> tl;dr the new glance location API is fundamentally broken for ceph and cinder backends, we reverted nova's use of it while working on a solution for those backends in glance
16:12:19 <Uggla> thx for the info dan
16:12:23 <gibi> I see two hits after 20th
16:12:24 <gibi> | 455491157c0f44cebf2cfcfe9f1c7b5c | 2025-05-22T07:20:41 | check    | https://review.opendev.org/950623 | master |
16:12:27 <gibi> | f61be3ef2b9a442d8843f65152343018 | 2025-05-20T12:55:29 | check    | https://review.opendev.org/948450 | master |
16:12:48 <dansmith> one is the revert of revert, so expected
16:13:01 <dansmith> the other is too close to merge, probably used old code
16:13:09 <gibi> OK then maybe we are moving in the right direction
16:13:14 <gibi> dansmith: thanks for the info
16:13:17 <dansmith> np
16:13:36 <Uggla> ok moving on
16:13:37 <Uggla> #topic tempest-with-latest-microversion job status
16:13:47 <Uggla> #link https://zuul.opendev.org/t/openstack/builds?job_name=tempest-with-latest-microversion&skip=0
16:13:59 <gmaan> no new update this week
16:14:11 <Uggla> ok so let's continue
16:14:27 <Uggla> #topic Release Planning
16:14:34 <Uggla> #link https://releases.openstack.org/flamingo/schedule.html
16:14:44 <Uggla> #info Nova deadlines are set in the above schedule
16:15:01 <Uggla> #info Only 1 week before Nova Spec Soft Freeze, do not forget to submit your specs.
16:15:35 <Uggla> #topic Review priorities
16:15:45 <Uggla> #link https://etherpad.opendev.org/p/nova-2025.2-status
16:17:41 <Uggla> fyi, as we discussed last week, I have just asked for the first Cloud Hypervisor patches.
16:18:31 <Uggla> but tbh I have not looked closely yet to know if we progressed.
16:18:54 <Uggla> #topic OpenAPI
16:19:04 <Uggla> #link: https://review.opendev.org/q/topic:%22openapi%22+(project:openstack/nova+OR+project:openstack/placement)+-status:merged+-status:abandoned
16:19:20 <Uggla> 22 patches remaining. (+1)
16:19:36 <Uggla> #topic Stable Branches
16:19:50 <Uggla> elodilles, your turn.
16:20:07 <elodilles> thanks, i'll be short, as there's nothing to report
16:20:10 <elodilles> #info stable branches (stable/2025.1 and stable/2024.*) seem to be in OK state
16:20:16 <elodilles> Uggla: back to you
16:20:24 <Uggla> cool thx elodilles
16:20:39 <Uggla> #topic vmwareapi 3rd-party CI efforts Highlights
16:20:56 <Uggla> fwiesel, something to share ?
16:21:20 <fwiesel> So, from my side I want to introduce dpascual. He will be my stand-in, in case I should be unable to attend.
16:21:36 <Uggla> dpascual, welcome
16:21:37 <dpascual> hi everyone
16:22:01 <fwiesel> He is operating the vmwareapi CI, and hopefully should keep you in the loop in case I can't
16:22:51 <fwiesel> Otherwise, since last Tuesday, we have a problem with some change in neutron. I am debugging this, so the CI failures come from there.
16:22:58 <fwiesel> That's from my side.
16:23:09 <fwiesel> Uggla: back to you
16:23:41 <Uggla> fwiesel, thx good to know.
16:24:26 <Uggla> #topic Gibi's news about eventlet removal.
16:24:31 <Uggla> #link Series: https://gibizer.github.io/categories/eventlet/
16:24:51 <Uggla> #link nova-scheduler series is ready for core review, starting at https://review.opendev.org/c/openstack/nova/+/947966
16:24:52 <gibi> no new blogpost as my prios mostly elsewhere these weeks
16:25:10 <gibi> sean-k-mooney gave some reviews and I respun the series to address them
16:25:37 <gibi> oslo.service threading backend released and the global req bumped so we are now using the official oslo.service version in the series
16:25:48 <gibi> that is all
16:25:52 <Uggla> \o/
16:26:11 <Uggla> thx gibi
16:26:20 <Uggla> #topic Open discussion
16:26:30 <Uggla> #link Callum027: Spec-less blueprint for adding new attributes to libvirt domain XML metadata: https://blueprints.launchpad.net/nova/+spec/xml-image-meta
16:26:49 <Callum027> Hi guys, I am proposing a spec-less blueprint to add new metadata to Nova's libvirt domain metadata for Ceilometer and other telemetry-related OpenStack services.
16:26:52 <Uggla> Hi Callum027
16:27:05 <Callum027> To give some context, a long time ago Ceilometer used to perform API queries for polling instances. Because this was not scalable it was changed to reading the XML metadata Nova adds to the libvirt domain metadata on hypervisors. My understanding is that this was not the original intention for how this metadata is used, but Ceilometer now relies on it and has done for a long time, and changing this at this point would be a big project.
16:27:22 <Callum027> The problem that Ceilometer faces is that this metadata does not provide enough information for Ceilometer to do what it needs, so Ceilometer still needs to perform Nova API queries for some things, which scales badly with the number of running instances in a cloud and increases API load. What I'm proposing are additions to Nova's libvirt domain metadata to not only make it unnecessary for Ceilometer to make API queries for compute polling in most cases, but also add additional metadata that make it easier to use services like CloudKitty to implement OpenStack billing.
16:27:40 <Callum027> I'd appreciate feedback if anyone has any (sean-k-mooney has +1'd the spec and changes as an initial pass), and if everything looks good, approval to proceed with the patches created for this.
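[Editor's note: for readers unfamiliar with the metadata being discussed, a rough sketch of the XML Nova's libvirt driver embeds in the domain definition is below. This is illustrative only; exact elements, the namespace version, and values vary by Nova release, and the placeholder values are not from the meeting.]

```xml
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.1">
    <nova:name>example-server</nova:name>
    <nova:creationTime>2025-05-27 16:00:00</nova:creationTime>
    <!-- flavor details let telemetry tools poll without Nova API calls -->
    <nova:flavor name="m1.small">
      <nova:memory>2048</nova:memory>
      <nova:disk>20</nova:disk>
      <nova:swap>0</nova:swap>
      <nova:ephemeral>0</nova:ephemeral>
      <nova:vcpus>1</nova:vcpus>
    </nova:flavor>
    <nova:owner>
      <nova:user uuid="USER_UUID">demo-user</nova:user>
      <nova:project uuid="PROJECT_UUID">demo-project</nova:project>
    </nova:owner>
  </nova:instance>
</metadata>
```

Callum027's proposal would extend this block with additional attributes so Ceilometer and billing tools can read everything they need locally on the hypervisor.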
16:27:58 <gibi> btw this metadata in the libvirt xml helps with troubleshooting as it tells the details of the VM from the xml instead of looking it up form the logs of from a DB dump
16:28:22 <sean-k-mooney> yep
16:28:27 <gibi> so I'm supporting
16:28:29 <sean-k-mooney> this is why i want to add more to it
16:28:29 <gibi> this
16:28:44 <sean-k-mooney> i mean it helps with billing and other ceilometer use cases too
16:29:02 <sean-k-mooney> but if i'm being selfish, having the info in the xml/logs means i do not have to ask support to ask a customer for it
16:29:17 <sean-k-mooney> so shortening that round trip time is a massive win IMO
16:29:26 <Callum027> In our experience, having the flavour metadata was useful for helping determine how an instance ended up on a hypervisor
16:30:35 <Uggla> I understand we agree with this BP. Everybody ok with a specless one ?
16:31:10 <gibi> yeah I'm OK with it being a specless one
16:31:15 <sean-k-mooney> i'm supportive of that and just proceeding with the review and ensuring we have a good release note
16:32:19 <sean-k-mooney> there are some aspects that would be nice to have beyond what Callum027 would like to do, but i don't necessarily want to expand the scope of their request to include them
16:32:37 <sean-k-mooney> so if we decide to work on those we can file a follow-up blueprint or revisit the discussion
16:33:00 <gibi> +1 on follow up and keep the scope small and land it quickly
16:33:37 <Callum027> Sounds great, thanks guys, we've actually been using this in our production envs for a little while and it's been working great so I'm confident it's ready to go
16:33:59 <gibi> Callum027 thanks for working on it
16:34:05 <Uggla> Callum027, cool
16:34:38 <Callum027> That's it from me Uggla, thanks
16:35:32 <Callum027> Actually, do I need to add this to the Etherpad?
16:36:10 <Uggla> Callum027, it is ok, I have just updated it
16:36:24 <Callum027> Awesome, thank you!
16:36:50 <Uggla> btw, thanks for joining this call.
16:37:21 <Uggla> I guess sean-k-mooney wanted to discuss this https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/NSW3OG5ME5RDPQYLHF4T4RCWPQYG57PK/ too
16:40:00 <gibi> it seems they have an RFE but they don't have a person volunteering to implement
16:40:29 <gibi> I guess we can wait until we have somebody who at least can propose a spec
16:40:33 <sean-k-mooney> /me clicks
16:40:47 <gibi> sean-k-mooney: this is the memballon RFE
16:40:47 <sean-k-mooney> oh that
16:41:07 <sean-k-mooney> ya i kind of felt like maybe i should just spend an afternoon and write it
16:41:15 <sean-k-mooney> basically the short version is
16:41:33 <sean-k-mooney> nova originally did not enable the mem balloon at all unless you enabled memory stat reporting
16:41:45 <sean-k-mooney> at some point libvirt started adding it by default
16:41:59 <sean-k-mooney> but we have never actually used it to release/reclaim unused memory from the guest
16:42:07 <sean-k-mooney> because that used to require a special agent on the host
16:42:17 <sean-k-mooney> proxmox and ovirt provided those
16:42:23 <sean-k-mooney> that is now built into libvirt
16:42:36 <sean-k-mooney> so we could just hardcode 2 xml attributes
16:42:49 <sean-k-mooney> which would allow memory to be released in 2 cases
16:43:14 <sean-k-mooney> one, when the OOM reaper is about to kill the vm: if we have autodeflate on, the vm gets a chance to free memory
16:43:24 <sean-k-mooney> the other one "freePageReporting"
16:43:40 <sean-k-mooney> allows the memory to be returned to the host any time the guest frees memory
16:43:52 <sean-k-mooney> i think it would be nice to support both in nova
16:44:08 <sean-k-mooney> but ya someone just needs to write it
16:44:32 <sean-k-mooney> i kind of wondered how others felt
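[Editor's note: the two attributes discussed map onto libvirt's `<memballoon>` device element. A rough sketch of what the hardcoded result could look like is below; this is illustrative of the idea, not a committed design, and the `<stats>` period value is a placeholder.]

```xml
<!-- illustrative sketch: memballoon with the two attributes discussed -->
<memballoon model="virtio" autodeflate="on" freePageReporting="on">
  <!-- stats polling is what Nova already enables for memory stat reporting -->
  <stats period="10"/>
</memballoon>
```

With `autodeflate="on"` the balloon deflates automatically under host memory pressure (the OOM-reaper case), and `freePageReporting="on"` lets the guest hand freed pages back to the host continuously.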
16:44:40 <gibi> if you have time to write it I'm not against it, but it is just a nice to have
16:45:09 <sean-k-mooney> track this as a wishlist RFE bug?
16:45:12 <sean-k-mooney> spec?
16:45:38 <sean-k-mooney> i feel like this is one of those things where it's quicker to write than discuss :)
16:46:19 <Uggla> Yep that sounds quite straightforward, so specless BP, I think.
16:46:23 <gibi> if this is hardcoded as enabled then specless
16:46:25 <sean-k-mooney> all of our qemu and libvirt version minimums are above where this was added so there is no migration or upgrade aspect either
16:46:48 <gibi> does it affect realtime / nfv guest performance?
16:46:54 <sean-k-mooney> :)
16:47:13 <sean-k-mooney> well yes but they normally turn off the mem balloon entirely
16:47:22 <sean-k-mooney> because it adds overhead that they don't tolerate
16:47:38 <sean-k-mooney> which is why we should fix the bug that stops you from being able to do that
16:47:43 <gibi> ahh
16:47:50 <sean-k-mooney> https://bugs.launchpad.net/nova/+bug/2069192
16:47:57 <gibi> so we should not make that situation worse :) Let's fix the bug first then
16:48:13 <sean-k-mooney> so this would be hardcoded on if you have the mem balloon enabled
16:48:37 <sean-k-mooney> gibi: rajesh has a patch up to fix that bug https://review.opendev.org/c/openstack/nova/+/945621
16:48:44 <gibi> cool
16:49:06 <gibi> so I'm OK with a specless
16:49:14 <gibi> and I will look at that bugfix
16:49:30 <sean-k-mooney> Uggla: gibi ok let's do this. if i find time to work on this before m2 i'll file a specless blueprint and we can approve it then
16:49:51 <sean-k-mooney> if not i can file a RFE bug or similar tracker and if someone has time we can work on it later
16:50:01 <gibi> sounds good to me
16:50:53 <Uggla> +1
16:51:20 <Uggla> I have tracked that in the etherpad so as not to forget.
16:52:30 <Uggla> Something else to discuss ?
16:52:48 <gibi> -
16:53:11 <Uggla> Do you have still energy left to triage 1 or 2 bugs ?
16:53:26 <gibi> -
16:53:49 <sean-k-mooney> we can probably do one
16:54:17 <sean-k-mooney> although if you have not noticed i'm a little distracted today. but we can try to do one
16:54:51 <Uggla> #topic Bug scrubbing
16:54:56 <Uggla> #link: https://etherpad.opendev.org/p/nova-bug-selection-for-triaging#L4
16:55:02 <Uggla> Just one
16:55:17 <Uggla> https://bugs.launchpad.net/nova/+bug/2103500
16:56:21 <Uggla> To my mind that sounds like a valid one. At least regarding the comment from Brian.
16:58:29 <gibi> db error in the neutron code, this is a neutron bug
16:58:49 <gibi> at least neutron should not return 500
16:59:14 <sean-k-mooney> so we had a vm create request with --network
16:59:28 <sean-k-mooney> it existed at the api but by the time we got to the compute it was deleted
16:59:48 <sean-k-mooney> i guess it could happen when we do the quota check as well
17:00:01 <gmaan> and nova does capture the NetworkNotFound exception if the network is not found and was requested https://github.com/openstack/nova/blob/221a3e89e8988bc664298106ee691a4e41ca71f9/nova/api/openstack/compute/servers.py#L842
17:00:02 <sean-k-mooney> in any case ya we should get back a 400 not a 500
17:00:43 <gmaan> or is the neutron exception lost somewhere and not raised to the controller?
17:00:47 <sean-k-mooney> the expected behavior here should be that nova puts the vm in error
17:01:03 <sean-k-mooney> and when the vm is deleted we would clean up any ports that were created then
17:01:09 <sean-k-mooney> based on the delete-on-terminate behavior
17:01:20 <gmaan> yeah, the bug says to ignore the deleted network port and continue VM creation, which does not seem right
17:01:42 <sean-k-mooney> it's not, from a nova point of view, because we could not honor the original request
17:02:33 <sean-k-mooney> so depending on where this failed it should be buried in cell 0, rejected at the quota check with a 400, or in error in the cell db if it got all the way to the compute
17:04:05 <sean-k-mooney> reading it more carefully i'm tempted to mark it invalid in nova
17:04:18 * gibi bows to Uggla and disappears
17:04:35 <sean-k-mooney> in their case it's the quota check that is failing when we list all ports for a tenant
17:05:05 <Uggla> sean-k-mooney, if you are unsure we can keep it open for next week.
17:05:26 <sean-k-mooney> although if we are missing the try/except on the network not found then that's a valid bug i guess
17:07:18 <Uggla> so we settle for valid ?
17:07:55 <gmaan> I have not traced the exact call path of validate_networks() but nova handles NetworkNotFound in _validate_requested_network_ids
17:07:55 <gmaan> https://github.com/openstack/nova/blob/221a3e89e8988bc664298106ee691a4e41ca71f9/nova/network/neutron.py#L988
17:09:06 <gmaan> does bug mention 500 is returned? or 400 ?
17:09:14 <sean-k-mooney> gmaan: i think neutron is returning a 500
17:09:17 <sean-k-mooney> https://paste.openstack.org/show/bt3BWnyrBbQspiFx8KPZ/
17:09:39 <gmaan> ohk
17:09:51 <sean-k-mooney> i think we need more info to be sure
17:10:13 <sean-k-mooney> they do have a NetworkNotFound exception https://paste.openstack.org/show/bJL9b5aY99XpZlLZdzgy/
17:10:15 <gmaan> I think nova is handling the error correctly if neutron returns NetworkNotFound
17:10:35 <gmaan> and the controller also handles NetworkNotFound so a 400 will be returned
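[Editor's note: the pattern gmaan describes, i.e. translating a not-found exception into a 400 client error instead of letting a 500 propagate, can be sketched as below. The names `NetworkNotFound`, `validate_networks`, and `create_server` mirror the Nova/Neutron identifiers from the discussion, but this is a self-contained illustration, not Nova's actual code.]

```python
class NetworkNotFound(Exception):
    """Illustrative stand-in for the neutron-side not-found exception."""
    def __init__(self, network_id):
        super().__init__(f"Network {network_id} could not be found.")
        self.network_id = network_id


def validate_networks(requested_ids, existing_ids):
    # Raise if any requested network no longer exists (e.g. it was
    # deleted between the API-level check and a later validation step).
    for net_id in requested_ids:
        if net_id not in existing_ids:
            raise NetworkNotFound(net_id)


def create_server(requested_ids, existing_ids):
    try:
        validate_networks(requested_ids, existing_ids)
    except NetworkNotFound as exc:
        # Translate the not-found condition into a client error (400);
        # it must never bubble up as an unhandled 500.
        return 400, str(exc)
    return 202, "accepted"
```

The bug under triage is then a question of whether this try/except is actually reached on the failing code path, or whether neutron itself raises a 500 before Nova can translate it.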
17:11:21 <sean-k-mooney> can you comment to that effect on the bug
17:11:47 <gmaan> sure, will do after TC meeting
17:11:58 <sean-k-mooney> the first step to fixing this, if we were to fix it in nova, would be a reproducer test anyway
17:12:03 <sean-k-mooney> ideally a functional one
17:13:45 <Uggla> sean-k-mooney, does that mean we can set it as valid ?
17:15:43 <Uggla> I need to close the meeting, so I propose to look at that bug offline later or next week.
17:15:47 <Uggla> Thanks all and thanks for the extended time for bug triage.
17:15:54 <gmaan> Uggla: I will check exact trace of validate_network and see if nova missing NetworkNotFound  handling anywhere
17:15:54 <Uggla> #endmeeting