Tuesday, 2025-08-19

opendevreviewMerged openstack/nova master: restrict swap volume to cinder  https://review.opendev.org/c/openstack/nova/+/95775701:00
*** mhen_ is now known as mhen01:25
gibielodilles: when you are up could you look at https://review.opendev.org/q/topic:%22bug/2112187%22+project:openstack/nova backports. It is important as it is a security hardening bug05:38
sean-k-mooney[m]gibi thanks for updating tempest, is that third test new? i thought i ran tempest full - the encryption and cinder backup tests on friday so im not sure why i didnt see it fail07:02
sean-k-mooney[m]hum no i must not have set the appropriate tempest config locally and it must have skipped07:07
sean-k-mooney[m]i probably didnt explicitly define the multiattach volume type or something along those lines07:07
gibisean-k-mooney[m]: yeah it is not new07:17
opendevreviewMerged openstack/nova stable/2025.1: restrict swap volume to cinder  https://review.opendev.org/c/openstack/nova/+/95775909:08
opendevreviewMax proposed openstack/nova master: fix: leftover volume_attachment on instance delete  https://review.opendev.org/c/openstack/nova/+/93498409:16
sean-k-mooneygibi: did you see my comment last week about using RSS instead of address_space for the prlimit for the qemu-img commands by the way?12:02
sean-k-mooneypriteau: approved12:04
priteauThank you sean-k-mooney :)12:05
opendevreviewKamil Sambor proposed openstack/nova master: Switch nova-conductor to use ThreadPoolExecutor  https://review.opendev.org/c/openstack/nova/+/95708812:41
gibisean-k-mooney: I saw and made a note about it13:18
gibiI have to get back to it eventually13:18
sean-k-mooneygibi: i assume you have not pushed a poc patch to bump the limit yet? i could try and push one to swap it to rss and depend on the DNM ci job revert patch you have13:27
gibisean-k-mooney: I only pinpointed the address space limit increase we need on master +5MB. And I tried to see how much address space we used before the new ceph but I cannot get that data as i) I cannot roll back the ceph version in ubuntu as the package repo was already cleaned ii) the older ubuntu based jobs do not want to run the specific test for me even though I think I set everything. But I had 13:32
gibino time to dig deep into RSS at all13:32
gibithe problem with RSS is similar, we don't know the baseline usage. 13:33
gibiso we cannot judge whether the increase is just small and OK, or it is a huge bump from the previous version13:34
gibiif you have time, sure, you can play with it to get an RSS number we need on master13:34
sean-k-mooneywell i was originally thinking of using the same 1G limit to start with but on RSS not address space13:35
sean-k-mooneyand then considering if we need to factor it out to a tunable config option later13:35
sean-k-mooneythe original intent of the limit was to work around some pathological behavior in old versions of qemu-img with untrusted input13:36
gibiI suggest to at least try locally with different volume sizes to see if the RSS is depending on the size of the volume13:36
gibisean-k-mooney: I am afraid we might have a new pathological behavior :)13:36
gibitoday the test resizes an 1G volume to a 2G volume13:37
sean-k-mooneywell i think since they are now refactoring things to use stackless coroutines in the newer version of c++13:37
gibiso the test is not really realistic. Out there the volumes are a lot bigger13:37
sean-k-mooneythat the amount of address space has increased but not necessarily the amount of committed memory13:37
gibiyeah I agree about that ^^13:37
gibiso RSS is a good direction13:37
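
The distinction drawn above (reserved address space versus memory that is actually committed) can be illustrated with a minimal sketch. It uses only the Python standard library and a stand-in child process rather than qemu-img or nova's own limit code; the helper name `run_with_as_limit` and the 1 GiB figures are assumptions for illustration. The child only reserves address space and never touches the pages, yet an address-space cap still makes it fail.

```python
import resource
import subprocess
import sys

GIB = 1024 ** 3


def run_with_as_limit(argv, limit_bytes=None):
    """Run argv and return its exit code, optionally capping RLIMIT_AS."""

    def cap_address_space():
        # Executed in the child after fork() and before exec().
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    result = subprocess.run(
        argv,
        preexec_fn=cap_address_space if limit_bytes is not None else None,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


# Stand-in workload (hypothetical): reserve 1 GiB of address space without
# touching it, so almost none of it becomes resident memory.
RESERVE_ONLY = [sys.executable, "-c", "import mmap; mmap.mmap(-1, 1024 ** 3)"]

if __name__ == "__main__":
    print("no limit :", run_with_as_limit(RESERVE_ONLY))
    print("1 GiB cap:", run_with_as_limit(RESERVE_ONLY, GIB))
```

One thing that may be worth checking before switching limits: the Linux setrlimit(2) manual page documents RLIMIT_RSS as having no effect on current kernels, so a cap on resident memory might need a different mechanism than a plain rlimit.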
sean-k-mooneygibi: in principle this does not depend on volumes, correct?13:37
sean-k-mooneywe could do some local testing with files of different sizes13:37
sean-k-mooneyor do you think the volumes actually play an important role13:38
sean-k-mooneyoh actually this is in the rbd code path13:38
sean-k-mooneyso we likely do need ceph right?13:38
gibieither the volume, or the fact that it is luksv1 encrypted, or the fact that the test case is resizing plays a role as other test cases do not fail13:38
sean-k-mooneyack so ya we would need to test a few different ways.13:39
gibi  File "/opt/stack/tempest/tempest/api/volume/admin/test_encrypted_volumes_extend.py", line 41, in test_extend_attached_encrypted_volume_luksv113:39
sean-k-mooneyi wonder if there is a way to log this info13:39
gibithis is the test case that triggers it13:39
sean-k-mooneywe probably should park this for now but since the ceph folks are interested i also dont want to lose too much momentum13:39
gibiwe call a bunch of qemu-img info during tempest, but only this test triggers the out of memory for qemu-img info13:40
sean-k-mooneyya so dans comment that the encryption may impact it seems valid13:40
gibiyeah I have a bz open on ceph and they are responsive but we don't have a clear non tempest repro for them13:40
sean-k-mooneyi was wondering if we could create a smaller non openstack reproducer too13:41
sean-k-mooneythats why i was wondering about a file13:41
gibiI copied the volume out of ceph as a file and ran the same command, and it did not fail13:41
sean-k-mooneybut we could likely just see if we can manually reproduce this with ceph and qemu directly13:41
sean-k-mooneyi just dont happen to have a ceph install to hand currently13:41
gibiso rbd is in the picture13:41
sean-k-mooneyack13:42
sean-k-mooneyso im wondering is it as simple as create a raw image with luks, upload that to ceph as a volume13:42
sean-k-mooneyextend the volume and measure the address space and rss13:42
sean-k-mooneyof qemu-img as you grow it13:42
gibiyeah that can be tried13:43
gibirealistically for me it is not earlier than next week when I can try13:43
sean-k-mooneyya i was going to say the same13:43
sean-k-mooneypost FF13:43
sean-k-mooneyi can do the quick RSS hack and push but digging deeper will take more time than ill have before FF or perhaps RC1 is done13:44
gibiack13:45
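
For the measurement step discussed above (run the command at several volume sizes and record its memory use), one possible harness is sketched below. It is an assumption-laden illustration, not nova or tempest code: the helper name `peak_rss_kib` is hypothetical, it covers only peak RSS (not address space), and the workload shown is a placeholder for the real qemu-img invocation.

```python
import os
import subprocess
import sys


def peak_rss_kib(argv):
    """Run argv to completion; return (exit_code, peak RSS in KiB on Linux)."""
    proc = subprocess.Popen(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    # wait4() reaps this specific child and returns its resource usage.
    _pid, status, usage = os.wait4(proc.pid, 0)
    # Tell the Popen object that the child has already been reaped.
    proc.returncode = os.waitstatus_to_exitcode(status)
    # ru_maxrss is reported in KiB on Linux (bytes on macOS).
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Placeholder workload; swap in the qemu-img command under test.
    code, rss = peak_rss_kib([sys.executable, "-c", "pass"])
    print(f"exit={code} peak_rss={rss} KiB")
```

A caveat on reading the numbers: on Linux the value reported for a forked child may be no lower than the launching process's own RSS at fork time, so a small wrapper like this one is preferable to measuring from inside a large service.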
dansmithgibi: I wonder if you could look at this conversation: https://review.opendev.org/c/openstack/nova/+/952630/12/nova/conductor/manager.py#32214:07
dansmithI really hate this idea that we have to do a mini data migration (from instance to reqspec) every time we do a move operation14:07
dansmithit feels fragile to me, like we could end up missing that step in some cases, and/or get other weird rebuild/evac/whatever behavior later, and after hours of debugging,14:08
dansmithrealize that one instance worked because it had been migrated once (and thus got the reqspec change) and another failed because it had never been migrated14:08
gibidansmith: OK, added to my list along with the other place https://review.opendev.org/c/openstack/nova/+/942502/13/nova/scheduler/request_filter.py#477 melwitt pinged me about yesterday14:09
dansmith(in this case I mean s/migrated/moved in some way/)14:09
dansmithgibi: ack, thanks14:09
dansmithgibi: that's a similar theme to the question above.. all about how we define, choose, and expose the default policy we choose for new instances when they don't otherwise request one14:14
UgglaHello, nova meeting in ~1h.15:01
jssfrUggla, I might not make it, but I'd like to bring https://review.opendev.org/c/openstack/nova/+/955657 to the table -- I managed to (hopefully) bring the patch into shape as discussed two weeks ago and I'd appreciate any reviews or feedback :)15:05
jssfrI'll read up on IRC discussion if I'm pinged here, too (but probably I'm going to be preparing dinner at the meeting time :-X) :)15:06
Uggla#startmeeting nova16:00
opendevmeetMeeting started Tue Aug 19 16:00:25 2025 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
opendevmeetThe meeting name has been set to 'nova'16:00
gibio/16:00
tkajinamo/16:00
UgglaHello everyone16:00
gmaano/16:00
UgglaWaiting a couple of minutes so people can join.16:01
Uggla#topic Bugs (stuck/critical) 16:02
elodilleso/16:02
Uggla#info No Critical bug16:02
Uggla#topic Gate status16:02
Uggla#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:02
Uggla#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:02
Uggla#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:02
Uggla#info Please look at the gate failures and file a bug report with the gate-failure tag.16:02
Uggla#info Please try to provide a meaningful comment when you recheck16:02
UgglaI quickly checked the report and don't see anything scary.16:03
Uggla#topic tempest-with-latest-microversion job status16:04
gibiwe have a security adjacent bug switched to public yesterday. Fix landed on master and stable backports are progressing. https://bugs.launchpad.net/nova/+bug/211218716:04
Ugglagmaan, do you want to say anything about it ?16:04
gmaansorry, nothing from me.16:04
Ugglathanks gmaan16:05
Ugglagibi, sorry I was a bit fast.16:05
gibiI just wanted to raise attention to that bug and the related fixes16:05
gibiwe need to do a tempest dance to skip some tests, land the nova fix, then re-enable tests with fixes applied16:06
elodillesregarding the sec bug fix: do we want to do stable releases once those fixes landed?16:06
gibielodilles: yes we want16:06
elodillesgibi: ACK, then I'll prepare some release patches16:06
gibithanks16:07
elodillesnp16:07
Ugglathx elodilles16:07
gibiUggla: back to you16:07
Uggla#topic Release Planning 16:08
Uggla#link https://releases.openstack.org/flamingo/schedule.html16:08
Uggla#info Nova deadlines are set in the above schedule16:09
Uggla#info Feature freeze is next week.16:09
Uggla#topic Review priorities16:09
Uggla#link https://etherpad.opendev.org/p/nova-2025.2-status16:09
Uggla#link AMD SEV series (tkajinam) needs review https://review.opendev.org/q/topic:%22bp/amd-sev-es-libvirt-support%2216:10
Uggla#link 955657: Preserve vTPM state between power off and power on | https://review.opendev.org/c/openstack/nova/+/95565716:10
UgglaConcerning the latter one, jssfr would like reviews or feedback.16:11
tkajinamregarding the first one, I've addressed initial feedback by Uggla and appreciate reviews16:12
Ugglatkajinam, I did not manage to spend a lot of time on it. But I'll try to do it before the end of the week.16:13
tkajinamthx16:13
tkajinamI'm concerned that it might be too tricky to be merged in a few days, but any opinion about possibility to get FFE would be also appreciated.16:13
UgglaThere are not a lot of patches so maybe it will be possible.16:16
UgglaAnyway we will see.16:16
tkajinamyup16:16
Ugglamoving on to next topic16:16
Uggla#topic OpenAPI 16:16
Uggla#link: https://review.opendev.org/q/topic:%22openapi%22+(project:openstack/nova+OR+project:openstack/placement)+-status:merged+-status:abandoned16:17
Uggla#info still 32 remaining atm.16:17
Uggla#topic Stable Branches 16:17
Ugglaelodilles, I give you the mic.16:18
elodillesthanks16:18
elodilles#info stable branches (stable/2025.1 and stable/2024.*) seem to be in OK state16:18
elodilles#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:18
elodillesand that's all (plus i'm going to prepare the new stable release patches)16:19
elodillesUggla: back to you16:19
Ugglathx elodilles16:19
elodillesnp16:20
UgglaI'm going to skip next topic due to pto.16:22
Uggla#topic Gibi's news about eventlet removal. 16:22
gibio/16:22
Uggla#link Blog: https://gibizer.github.io/categories/eventlet/16:22
gibiBoth n-sch and n-api and n-metadata API patches landed so these services can be switched to native threading and nova-next is running them that way already on all patches. My large series is now for the unit test fixes but that can wait.16:22
gibiThere is a bunch of small patches that I want to promote to land:16:22
gibihttps://review.opendev.org/c/openstack/nova/+/957424 Ask for pre-prod testing for native threading16:22
gibihttps://review.opendev.org/c/openstack/nova/+/947260 Remove nova.service.process_launcher16:23
gibihttps://review.opendev.org/c/openstack/nova/+/950991 Centralize cooperative yield16:23
gibihttps://review.opendev.org/c/openstack/nova/+/950992 [hacking] N374 do not use time.sleep(0) to yield16:23
gibihttps://review.opendev.org/c/openstack/nova/+/954990 Remove eventlet timer from multi_cell_list16:23
Uggla\o/16:23
gibiand tomorrow is a public holiday here so I will not be able to run the weekly eventlet sync call 16:23
gibiso feel free to skip it or do it without me16:23
gibiI know Kamil updated the nova-conductor series after my feedback but I had no time to go back to it yet16:24
gibithat is all from me16:24
gibiback to you Uggla 16:24
UgglaCool thanks Gibi !16:24
Uggla#topic Open discussion16:25
Uggla#info FYI, I'm a volunteer for the next cycle. 957240: Adding René Ribaud candidacy for Nova 2026.1 PTL | https://review.opendev.org/c/openstack/election/+/95724016:26
gibithanks Uggla 16:26
tkajinamThanks for stepping up, Uggla++ :-)16:26
UgglaThat's all I have16:26
UgglaDoes someone have a topic ?16:27
tkajinamno additional topic from me16:28
Ugglaok moving on.16:29
Uggla#topic Bug scrubbing 16:29
Uggla#info down to 172, I managed to answer reporters about some invalid or incomplete ones. 16:30
UgglaIf you don't mind I will skip the scrubbing today, because we are only a few today, and my eyes hurt.16:32
sean-k-mooneyo/ sorry was distracted downstream16:32
UgglaHi sean-k-mooney, any last topic you want to bring ? Otherwise I think I will close for today.16:33
sean-k-mooneyno didnt really have anything16:34
sean-k-mooneywas just reading back16:34
sean-k-mooneythe main thing is obviously the bug we are currently backporting but there is nothing actionable to do on that16:34
UgglaThe security one ?16:35
sean-k-mooneyya we are just making our way through the stable branches16:35
sean-k-mooneythanks elodilles  for taking a look by the way16:35
elodillessean-k-mooney: no problem, thanks for fixing it o:)16:35
sean-k-mooneyelodilles: on the watcher side we will probably do a release for this and the other pending patches at the end of the week or next week16:36
sean-k-mooneyill prepare patches for those16:36
elodillessean-k-mooney: ACK, thanks for the notice, i'll try to keep an eye on it16:36
elodilles1st release patch for nova is up btw: https://review.opendev.org/c/openstack/releases/+/95795116:37
elodillesUggla: ^^^ fyi o:)16:37
sean-k-mooneyUggla: so ya thats all i had so we can wrap if there are no other topics16:37
elodilles(i'll update 2024.2 dalmatian release patch when the patch gets merged and prepare 2024.1 caracal one too)16:38
Ugglaelodilles cool, I'll look at it tomorrow.16:39
elodillesthanks in advance o/16:39
opendevreviewMerged openstack/nova master: only show standard image properties in server show.  https://review.opendev.org/c/openstack/nova/+/94241316:39
UgglaSo I'm gonna close, thanks all.16:40
Uggla#endmeeting16:40
opendevmeetMeeting ended Tue Aug 19 16:40:08 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:40
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-08-19-16.00.html16:40
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-08-19-16.00.txt16:40
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-08-19-16.00.log.html16:40
elodillesthanks Uggla o/16:40
tkajinamthanks16:40
opendevreviewsean mooney proposed openstack/nova stable/2025.1: only show standard image properties in server show.  https://review.opendev.org/c/openstack/nova/+/95795816:44
sean-k-mooney... that has conflicts let me do that from the cli instead...16:44
opendevreviewMerged openstack/nova stable/2024.2: restrict swap volume to cinder  https://review.opendev.org/c/openstack/nova/+/95776217:01
*** sfinucan is now known as stephenfin17:02
gibimelwitt: dansmith: responded in https://review.opendev.org/c/openstack/nova/+/942502/13#message-cf59c0d4f875b19c85c544c81fa76210786ca184 19:28
gibiunfortunately I had less time today for this than I wanted and I will be away until next Monday.19:28
gibianyhow I lean towards using the host level default temporarily if possible. I offered some suggestion how. 19:35
melwittthank you gibi. I will read through19:36
dansmithmelwitt: I don't have the brain power to think this through completely right now, so let me muse here in semi-persistent land19:41
dansmithsomething gibi mentioned makes me think...19:41
dansmithwe're kinda depending on the scheduler doing the right thing for much of the enforcement of the policy (maybe more than I'd like even, as I noted, but alas)19:42
dansmithmaybe the better thing here is to think about the reqspec as the authoritative thing instead of sysmeta19:42
dansmithlike maybe we have a default (configurable or just default to host) or we require a policy on the flavor/image for new instances19:43
dansmithand we just stamp it into the sysmeta on the way down19:44
dansmithand handle the migration of existing instances differently.. either:19:44
dansmith1.  stamp them as user and make them resize if they want "out". That's somewhat fitting in that right now the flavor they booted with doesn't have migrate ability, resize to a flavor that does fixes the problem19:44
dansmith2. make hard reboot pass down a default *or* have some other way for them to stamp a policy (either choice or agree to the default)19:45
dansmithand then just focus on reqspec being the authoritative thing and sysmeta just becomes basically the write-once-read-many record of what was chosen so we know where to find a secret, but never change it (at the compute level)19:46
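
The "choose once, stamp it on the way down, never change it at the compute level" idea above can be sketched with plain dictionaries. This is not nova code: the helper names are hypothetical, the flavor extra spec and image property names are assumptions, the flavor-before-image precedence is an assumption, and only the sysmeta key name is taken from this discussion. The "host" default follows the "default to host" option mentioned above.

```python
SYSMETA_KEY = "image_hw_tpm_secret_security"  # key name quoted in this log
IMAGE_PROP = "hw_tpm_secret_security"         # assumed image property name
FLAVOR_SPEC = "hw:tpm_secret_security"        # assumed flavor extra spec name


def resolve_policy(flavor_extra_specs, image_props, default="host"):
    """Pick the policy for a new instance: flavor, then image, then default."""
    return (
        flavor_extra_specs.get(FLAVOR_SPEC)
        or image_props.get(IMAGE_PROP)
        or default
    )


def stamp_sysmeta(sysmeta, policy):
    """Record the chosen policy once; refuse to change it afterwards."""
    existing = sysmeta.get(SYSMETA_KEY)
    if existing is not None and existing != policy:
        raise ValueError(f"policy already stamped as {existing!r}")
    sysmeta[SYSMETA_KEY] = policy
    return sysmeta


if __name__ == "__main__":
    chosen = resolve_policy({FLAVOR_SPEC: "user"}, {})
    print(stamp_sysmeta({}, chosen))
```

In this shape the decision is made before the request reaches the compute host, so nothing has to be copied back into the request spec on later move operations; the stamped value only serves as a record of what was chosen.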
melwittI will need to think through these ... and I note that it feels like most of the spec is being rewritten (and the implementation thusly rewritten). not sure if this can realistically make FF :/19:50
dansmithyeah, I know, I was thinking about this earlier too and wondering if we were to ignore the existing instances if we could even just like merge host+user, no hard reboot fix and push some of the details to the end19:52
dansmiththe above where we just require resize if you want out of the current limitation might also mean we need to merge less to hit our goal19:52
dansmithand we could follow up with the hard reboot approve-in-place as an optimization19:53
melwittthat's an interesting idea19:53
gibi1. feels very tempting at first read20:02
gibiresize being the opt in is acceptable to me20:02
gibiand for the cloud admin asking for a reboot or a resize is not a big difference, it is an ask anyhow20:04
gibialso level=user becomes the default which I like :) 20:04
melwittI agree resize opt in sounds reasonable for the first phase in any case20:05
opendevreviewMerged openstack/nova master: [tests] Add printing of sample and template paths  https://review.opendev.org/c/openstack/nova/+/94405420:15
dansmithresize as the opt-in also lets the admin force them to be moved if they don't want to mix user and host, for example. hard reboot does not give that option20:15
sean-k-mooneydoes resizing in this context mean selecting a new flavor with a different secret policy to opt in?20:44
sean-k-mooneyim just wrapping for the day but i will try to read back tomorrow if i remember20:45
sean-k-mooneyoh thats option 1 above20:45
sean-k-mooneyto do 1 we would have to update the instance copy of the flavor as if the extra spec was set similar to how we often do that to fake the image property being set20:47
sean-k-mooneywhich is probably ok but i think thats a little trickier just because we store it differently20:47
opendevreviewMerged openstack/python-novaclient master: tests: Stop "testing" against Identity v2 API  https://review.opendev.org/c/openstack/python-novaclient/+/95298420:47
sean-k-mooneytrivially so but it would be the first time to do that with the embedded flavor that i am aware of20:48
melwittyeah.. I need a moment to regain my bearings 😝 I have spent so much time working on this and implementing the spec. I need some time to re-adjust my mind to this new thing20:51
sean-k-mooneyya i dont have enough spare brain power to load context and think about this with any depth tonight to say anything more than "Maybe?"20:55
melwittheh20:56
sean-k-mooneyi vaguely got that dan wants to avoid updating the request spec on all move operations which i agree should not be required but i dont know why the series currently does so i cant really comment20:56
melwittit's because the original design involves setting a default inside sysmeta down in the libvirt driver. so that has to make it to the request spec somehow in order to be honored during scheduling20:57
sean-k-mooneyas in faking that it was selected in the image metadata 20:58
melwittthis issue goes away if we require image/flavor for the policy, which is why that's being talked about above20:58
sean-k-mooneyso youre saying we need to sync the image metadata from the cell db to the request spec 20:58
sean-k-mooneyso that move ops find the right host20:58
melwittyes pretty much. the entire design in the spec is to put an image_hw_tpm_secret_security into the instance sysmeta regardless of whether it came from an image or not20:58
sean-k-mooneythat sounds valid and unfortunate because the compute should not need to touch the api db20:59
sean-k-mooneybut how do we handle this today?20:59
sean-k-mooneywe set a bunch of things like that already20:59
melwittyou mean like for AZs? that got brought up in the gerrit review21:00
sean-k-mooneylike we set the request for a config drive or the machine type when the instance lands on a compute node21:00
sean-k-mooneyand record them in instance_system_metadata during spawn21:00
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L7190-L720021:02
sean-k-mooneywe do it in other places too thats just the first place i found21:03
melwittyeah that's very similar thing. but I don't find that ever makes it to the request spec21:03
sean-k-mooneythere are a bunch of debug logs when we boot a vm where we record specific values to make sure they dont change over the lifetime of the vm21:04
melwittnot sure if any of those cases are things that are required for scheduling21:04
sean-k-mooneyya nor am i21:06
sean-k-mooneyoh i set config_drive on the instance object not the image meta https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2255-L226821:10
sean-k-mooneyand we do not schedule on that either21:10
sean-k-mooneymelwitt: https://paste.opendev.org/show/bGufCv4fBSCZwAP3m6qT/ those are the log messages i meant21:19
sean-k-mooneywhich is https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2255-L226821:20
sean-k-mooneysorry wrong link21:20
sean-k-mooneyhttps://github.com/openstack/nova/blob/d5cfdfd16d7573a83b05dd6c0656b7610e77c4c4/nova/virt/libvirt/driver.py#L993-L104321:21
sean-k-mooneywe can schedule on 'hw_video_model', 'hw_vif_model'21:21
sean-k-mooneyvia a placement prefilter21:22
sean-k-mooneybut perhaps that only works if you have it set on the image properly21:22
melwitthm yeah maybe21:22
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2255-L226821:23
sean-k-mooneywhen i wrote that i was not thinking of this other way of populating it 21:23
sean-k-mooney...21:23
sean-k-mooneywhy is that link stuck in my copy paste buffer21:23
sean-k-mooneyhttps://github.com/openstack/nova/blob/d5cfdfd16d7573a83b05dd6c0656b7610e77c4c4/nova/scheduler/request_filter.py#L19921:23
melwittit really wants you to talk about that link I guess21:24
sean-k-mooneyoh its because the url was not updating in my firefox21:24
sean-k-mooneyyou know the way if you edit the url but then navigate it does not update properly i think that is what happened21:24
melwittyeah that request filter is assuming the image properties are already in the request spec21:25
sean-k-mooneyso lee added this registration logic so that we could freely change the defaults without affecting new instances21:25
sean-k-mooneybut it was never intended for scheduling really21:26
sean-k-mooneybut i guess that was an oversight in the original feature21:26
sean-k-mooneywe never provided a way to sync the request spec with those if it was ever needed21:26
sean-k-mooneymelwitt: this is where your nova audit proposal would have been useful21:27
sean-k-mooneyi.e. to do a periodic sync of this info from the cell db to the api db21:27
melwittnova-janitor21:29
opendevreviewMerged openstack/python-novaclient master: Replace keystoneclient with openstacksdk  https://review.opendev.org/c/openstack/python-novaclient/+/95298621:31
opendevreviewMerged openstack/nova stable/2024.1: restrict swap volume to cinder  https://review.opendev.org/c/openstack/nova/+/95776421:37
opendevreviewMerged openstack/nova stable/2025.1: Ignore metadata tags in pci/stats _find_pool logic  https://review.opendev.org/c/openstack/nova/+/94509623:54
