Tuesday, 2025-04-01

opendevreviewYaguang Tang proposed openstack/nova stable/2023.2: Fix device type when booting from ISO image  https://review.opendev.org/c/openstack/nova/+/94590301:56
opendevreviewYaguang Tang proposed openstack/nova stable/2024.1: Fix device type when booting from ISO image  https://review.opendev.org/c/openstack/nova/+/94581606:52
sahido/09:28
sahidany chance to have a review of that one, and the series behind it?09:29
sahidhttps://review.opendev.org/c/openstack/neutron/+/940983/609:29
sahidit's waiting for a while now :-)09:29
sean-k-mooneyi assume you meant to put that in the neutron channel09:31
stephenfinsean-k-mooney: trivial README change here if you've 3.5 seconds https://review.opendev.org/c/openstack/nova/+/94499409:32
sean-k-mooneysahid: but i would not recommend creating a thread pool dynamically in a plugin09:32
sean-k-mooneysahid: if you're going to use thread pools you should create one globally for neutron to use and reuse it 09:32
sean-k-mooneylike we do in nova09:32
sahidsean-k-mooney: oh yes... my mistake09:34
sahidsean-k-mooney: that is an interesting point yes09:34
sean-k-mooneysahid: https://review.opendev.org/c/openstack/nova/+/92249709:34
sahidi will discuss that with neutron, i think ralonsoh may also need it from it's change to the l3 agent09:35
sahidits09:35
sean-k-mooneyactually that's not quite the patch i wanted09:35
sean-k-mooneywell no that uses futurist09:35
sean-k-mooneyoh im not adding eventlet tpool there im just moving the import09:36
sean-k-mooneythats what was confusing me09:36
sahidin all cases I think it makes sense to have one global thread pool that everything picks from, so each service will use its own thread pool09:40
sahidbut sometimes we may need one context of threads so we can play with them, join them or kill them09:41
sean-k-mooneysahid: you should never kill a thread in python09:44
sean-k-mooneyyou should be using futures for the most part when you need to wait on results09:44
sean-k-mooneyif you need to kill something you should be using a process pool09:45
sean-k-mooneyotherwise you need to dispatch cancelable tasks09:45
sean-k-mooneykilling a thread via python is generally unsafe, there is no public api to do that and you have to fall back to posix hacks, which means you can leak locks and file handles09:46
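The advice above (one shared pool, wait on futures, cancel cooperatively instead of killing threads) can be sketched roughly as follows. This is only an illustration, not code from nova or neutron: the pool name `_POOL`, the `cancellable_sum` task and the use of a `threading.Event` as the stop signal are all assumptions made for the example.

```python
import concurrent.futures
import threading

# Hypothetical module-level pool: created once and reused by every caller,
# rather than being built on the fly inside a plugin.
_POOL = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def cancellable_sum(values, stop_event):
    """Add up values, checking stop_event between items."""
    total = 0
    for value in values:
        if stop_event.is_set():
            # Cooperative cancellation: the task returns on its own,
            # so locks and file handles are released normally.
            return None
        total += value
    return total


stop = threading.Event()

# Wait on a future instead of joining (or killing) a thread.
result = _POOL.submit(cancellable_sum, range(10), stop).result(timeout=5)

# Ask tasks to stop; nothing is ever killed from the outside.
stop.set()
cancelled = _POOL.submit(cancellable_sum, range(10), stop).result(timeout=5)

_POOL.shutdown(wait=True)
```

If work genuinely has to be terminated from outside, the same shape works with a process pool, since a process can be killed without leaking state into the caller.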
opendevreviewsean mooney proposed openstack/osc-placement master: Add bindep.txt for ubunutu 24.04 support  https://review.opendev.org/c/openstack/osc-placement/+/94603112:14
sean-k-mooneyhi folks https://review.opendev.org/c/openstack/osc-placement/+/946031 seems to fix the osc-placement docs jobs. we will need to backport that to stable 2025.1 as well13:07
sean-k-mooneywe can do that after the release. tl;dr we need a bindep file to install pcre because the header is not transitively installed on ubuntu 24.0413:08
cardoeHoping to get https://review.opendev.org/c/openstack/nova/+/942019 on your folks radar too. Really helps in figuring out why something failed to get the error message instead of the hex memory address of the error message.13:23
gibicardoe: I've approved the patch. Thanks for pushing a fix13:25
cardoeThank you. :)13:25
gibisean-k-mooney: bauzas: could one of you take a look at this simple os_traits addition https://review.opendev.org/c/openstack/os-traits/+/944049 for OTU?14:08
sean-k-mooneysure but if we plan to do a release for that we should include https://review.opendev.org/c/openstack/os-traits/+/94041814:09
bauzasdone14:09
gibisean-k-mooney: sure14:12
dansmithgibi: my test should have caught that I was iterating RCs for no reason.. do you mind if I put a slightly-unrealistic allocation in for the PCI device in the test where there are two RCs to make sure we only call invalidate once?14:14
dansmithPCI in placement would only have one CUSTOM_PCI_X_Y:1 thing in the allocation, but if I put two I can make sure I'm only doing it once14:15
gibidansmith: I don't mind at all14:16
dansmithcool14:16
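The test idea discussed above (an allocation with two resource classes on one provider, asserting a single invalidate call) might look roughly like this. It is a sketch under assumptions: `invalidate_for_allocation`, the `rp-1` provider name and the second resource class name are hypothetical and are not taken from the nova patch. Only the general shape of a placement allocation, keyed by resource provider with a `resources` dict, is assumed.

```python
from unittest import mock


def invalidate_for_allocation(allocation, invalidate):
    """Hypothetical helper: invalidate each resource provider once.

    Iterating the providers (not the resource classes inside them) is
    what keeps the call count at one per provider.
    """
    for rp_uuid in allocation:
        invalidate(rp_uuid)


# Slightly unrealistic allocation: two resource classes on the same
# provider, so a per-resource-class loop would call invalidate twice.
allocation = {
    "rp-1": {"resources": {"CUSTOM_PCI_X_Y": 1, "CUSTOM_PCI_X_Z": 1}},
}

invalidate = mock.Mock()
invalidate_for_allocation(allocation, invalidate)
invalidate.assert_called_once_with("rp-1")
```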
dansmithgibi: are you walking up the series again? if so, I'll hold off pushing this fix14:16
opendevreviewRajesh Tailor proposed openstack/nova master: Add upgrade status check for duplicate cell names  https://review.opendev.org/c/openstack/nova/+/90181014:17
opendevreviewMerged openstack/os-traits master: Add HW_PCI_ONE_TIME_USE trait  https://review.opendev.org/c/openstack/os-traits/+/94404914:18
gibidansmith: yeah I'm reviewing at the moment14:27
dansmithack, i'll hold off14:27
gibithanks14:27
artomUggla, heads up, new security/nova x-project topic for PTL at https://etherpad.opendev.org/p/nova-2025.2-ptg#L7714:40
gibidansmith: I finished reviewing the stack14:54
dansmithgibi: cool, thanks, I'm just running tests after removing the spec copy14:54
gibiOK14:54
opendevreviewRajesh Tailor proposed openstack/nova master: Update the api-ref for unshelve  https://review.opendev.org/c/openstack/nova/+/93805414:57
gibidansmith: do you plan to push some functional test coverage top of the series?14:58
dansmithI haven't gone deeply into the functional tests for pci yet, because I've been sort of working with my devstack (and scheming about getting it tested in a job)14:59
dansmithif you have a pointer to a good place to look at something that would be applicable for inspiration, I can do that15:00
gibisure: https://github.com/openstack/nova/blob/master/nova/tests/functional/libvirt/test_pci_in_placement.py this has good examples15:00
gibiactually VM boots with PCI in Placement starts here https://github.com/openstack/nova/blob/master/nova/tests/functional/libvirt/test_pci_in_placement.py#L161815:01
dansmithcool, I'll get to a stopping point in the other thing I'm doing soon and try to add some more of that15:02
bauzasoh doh15:02
bauzasforgot we got daylight savings here15:02
bauzasI was about to yell to Uggla he forgot the meeting :D15:03
* bauzas goes back into his cave for 57 mins15:03
gibidansmith: cool15:03
gibidansmith: you probably need to extend our assert to check for the reserved value on the inventory here https://github.com/openstack/nova/blob/98226b60f3fe7b20e8d7f208c12f8d0086cd83d0/nova/tests/functional/libvirt/test_pci_sriov_servers.py#L182-L18815:05
gibias today only total and max_unit is asserted15:06
UgglaMeeting in around 1h15:07
dansmithack15:08
* gibi now hates DST15:08
Uggla@artom, ok I have seen the cross session needs.15:09
dansmithgibi: I'm not sure I can write those functional tests15:18
dansmithwell, the delay has messed up the joke.. I was going to say "I can't bring myself to waste enough vertical space to fit the style in that file"15:27
dansmithUggla: can we do an os-traits release to get the new one that merged this morning? we can't really test functionally until that's available in our reqs/venvs15:27
Uggla@dansmith, yes I think so. I'll propose a patch for it right after the Epoxy release. Will it be ok for you ?15:29
* gibi looks at golang built in formatter15:30
dansmithgibi: don't get me started15:30
* gibi stops looking15:30
gibi:)15:30
dansmithUggla: I guess I'll put these in a separate patch since they won't pass until then. I already have a local patch to add just the trait reporting thing to a functional test, which can't work in CI until the os-traits release15:30
dansmithso I'll just add some more in there and push that up with the caveat that it won't work until that releases15:31
dansmithgibi: ^ re: functionals in a separate patch, at least for the moment15:32
Uggla@dansmith, ok lgtm15:32
gibidansmith: works for me too15:32
* gibi hands his spare vertical space to dansmith to put it in good use15:33
dansmithI had stopped looking at the functional stuff since I had to hack my local env with the trait to get even basic stuff working, but I'll just do that so I can make some progress there15:33
dansmithgibi: unfortunately, the '\n' bytes are free, it's the screen real estate and the burning feeling in my eyeballs that is not15:34
gibitrue15:36
gibiwhat if you add a new test_ file for otu and use your own style? I would totally accept that15:37
gibithe current file is already 2000 LOC so it is reasonable to split15:38
dansmithI already hate that nova has grown this very-different style from its reasonably-consistent-just-not-machine-formatted style before this madness15:40
dansmithso further splitting and being different to all the other PCI-in-placement tests here doesn't seem beneficial, except for my eyeballs15:41
dansmithI will attempt to hold my tongue and just do it, but if I can't stand it I'll split15:42
dansmithI don't *think* there will be a lot of OTU tests here15:42
dansmiththere's just not that much to do I think15:42
gibidansmith: understood15:47
gibiI'm not sure how nova kept the reasonably consistent style in the past15:47
gibiwere there stricter reviewers back then?15:48
gibiless hippy devs?15:49
dansmitha lot of existing code and people copying the style that was there15:49
dansmithwe also had some guidelines like breaking lines with parens instead of backslashes, and several of the pep8 style guidelines which were not machine enforced15:50
dansmitha lot of people wrote a lot of reasonably-consistent code before machine formatting made everything reliably ugly :)15:50
UgglaMeeting in 5mn.15:55
Uggla#startmeeting nova16:01
opendevmeetMeeting started Tue Apr  1 16:01:42 2025 UTC and is due to finish in 60 minutes.  The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot.16:01
opendevmeetUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:01
opendevmeetThe meeting name has been set to 'nova'16:01
UgglaHello everyone16:01
masahitoo/16:02
dansmitho/16:02
gibio/16:02
sean-k-mooneyo/16:02
elodilleso/16:03
bauzaso/16:03
Uggla#topic Bugs (stuck/critical) 16:03
Uggla#info No Critical bug16:03
Uggla#info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster16:04
Ugglaanything about bugs ?16:04
Uggla#topic Gate status 16:05
Uggla#link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs 16:05
Uggla#link https://etherpad.opendev.org/p/nova-ci-failures-minimal16:05
Uggla#link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status16:05
Uggla#info Please look at the gate failures and file a bug report with the gate-failure tag.16:05
sean-k-mooneyso technically osc-placement gate is blocked without https://review.opendev.org/c/openstack/osc-placement/+/94603116:06
Uggla#info Please try to provide meaningful comment when you recheck16:06
sean-k-mooneywe need that on master and stable/2025.116:06
sean-k-mooneytldr without a bindep file the tox-docs job fails on noble16:06
sean-k-mooneyspecifically because libpcre3-dev is not installed16:07
sean-k-mooneyit only affects the jobs so not blocking the release 16:07
elodillessean-k-mooney: add me as reviewer to the stable patch and I'll +2 it o:)16:08
elodillesand thanks for the fix!16:08
sean-k-mooneysure. ill propose it once its on master. it came up when reviewing https://review.opendev.org/c/openstack/osc-placement/+/94375916:08
elodilles+116:08
Ugglathx for the fix sean-k-mooney !16:08
sean-k-mooneywe didnt have any osc-placement change this cycle that im aware of16:09
sean-k-mooneyso not having release notes for a few days is pretty low impact16:09
elodillessean-k-mooney: ACK16:09
sean-k-mooneyso no need to rush but good to do before we forget about it16:09
Ugglasure, anything else about the gate topic ?16:10
sean-k-mooneynot from me, we can move on i think16:11
Ugglayep16:11
bauzasI'll try to review the bindep patch16:11
bauzaswe did this before iirc16:11
sean-k-mooneywe had to fix a different bindep issue for nova16:11
sean-k-mooneyit was not including the test profile16:11
sean-k-mooneythat got fixed a few weeks ago16:12
Uggla#topic Release Planning 16:12
Uggla#link https://releases.openstack.org/epoxy/schedule.html16:12
Uggla#info Nova deadlines are set in the above schedule16:13
Uggla#link 945904: 2025.1 Epoxy final releases for cycle-with-rc projects | https://review.opendev.org/c/openstack/releases/+/94590416:13
Uggla#link https://releases.openstack.org/flamingo/schedule.html16:13
Uggla#info Remaining post RC1: update min service version for a SLURP or non-SLURP release : https://review.opendev.org/c/openstack/nova/+/944018/16:14
Uggla#info Nova Flamingo deadlines will be discussed at the PTG.16:14
UgglaEpoxy should be released tomorrow if I'm not wrong.16:15
Uggla\o/16:15
elodilles~o~16:15
gibinice16:15
elodillesyepp, we start the machinery tomorrow16:16
Ugglabtw @elodilles I have +1 the patch above.16:16
elodillesthanks Uggla o/16:16
Uggla#topic Review priorities16:18
Uggla#info Flamingo priorities will be discussed at the PTG.16:18
Uggla#topic PTG planning 16:18
Uggla#info Next PTG will be held on Apr 7-1116:18
Uggla#link https://etherpad.opendev.org/p/nova-2025.2-ptg16:19
UgglaI think we have collected all the topics, the agenda is a draft for the moment. I think I will finalize it tomorrow.16:19
UgglaToday, there's a new and rather unexpected topic on the table for the PTG.16:20
UgglaRumor has it that it originates from a secret TC meeting focused on improving our resilience against supply chain attacks.16:20
dansmithUggla: the glance ptl was just pinging me about a session with glance about location api and a couple other things16:20
UgglaAs a result, we have a new top priority for the next development cycle — replacing the eventlet removal initiative.16:20
UgglaIt has been officially decided to begin rewriting Nova, starting with the scheduler component, in Brainfuck (https://en.wikipedia.org/wiki/Brainfuck).16:21
UgglaThe minimalist nature of the language, combined with its near-total unreadability, provides an unparalleled level of protection against malicious code injections.16:21
UgglaA working PoC is already available here: https://www.jdoodle.com/ia/1FbD16:21
UgglaYep, @dansmith he contacted me for a new cross team meeting.16:22
dansmithokay good16:22
dansmithI put it on the etherpad at the bottom16:22
* gibi hopes there is a mandatory formatter in the brainfuck compiler16:22
gibiBtw, will we have a full nova PTG day on Monday to front load stuff while we have?16:23
* dansmith scowls at gibi16:23
Ugglayes we have booked the full week Monday to Friday16:23
gibiack16:24
UgglaI'll try to clarify the full agenda by tomorrow.16:24
UgglaMonday we may start later and cover only the retro.16:25
bauzashmmm, April gotcha16:25
UgglaWe have a question from mikal 16:25
opendevreviewsean mooney proposed openstack/osc-placement master: Add bindep.txt for ubunutu 24.04 support  https://review.opendev.org/c/openstack/osc-placement/+/94603116:25
Ugglamikal: Sean suggested on IRC at https://meetings.opendev.org/irclogs/%23openstack-nova/latest.log.html#t2025-03-25T19:01:27 that a "specless blueprint" might be sufficient to finish off the SPICE VDI work. I've therefore created https://blueprints.launchpad.net/nova/+spec/libvirt-spice-vdi, but wont be able to attend the meeting due to timezones. Could y'all discuss and decide if I need to propose something for the PTG / require a spec / or if this 16:25
Ugglais sufficient to finish off this work? Thanks!16:25
sean-k-mooneyya so basically we did approve the usb and sound device changes as part of spice direct16:26
sean-k-mooneybut we didnt have time to land it16:26
sean-k-mooneyso since the code is written if there are no objections i was suggesting we proceed with a specless blueprint unless people had design concerns16:27
gibiif there are no changes compared to the approved spec then I'm fine with it16:27
sean-k-mooneymikal said they can attend part of the ptg16:27
sean-k-mooneyso we could cover it at the start of one of the sessions if needed16:27
Ugglasean-k-mooney, it is unclear in the message he said he won't be able to attend.16:28
sean-k-mooneyi think they can attend if its first thing but it will be midnight for them16:28
sean-k-mooneyso if we can handle this async over email or approve and just move to gerrit16:29
sean-k-mooneyit will work a lot better for them16:29
bauzaslast time, he was able to join on a late evening for him16:29
bauzasbut we need to give him a specific time16:29
bauzaslike Friday 1pm if that works for him16:29
sean-k-mooneyso do we want to defer approving https://blueprints.launchpad.net/nova/+spec/libvirt-spice-vdi until we chat with them in the ptg?16:30
sean-k-mooneyill note that while that has spice in the name16:30
gibiI have nothing against approving it16:30
sean-k-mooneythis applies to vnc also16:30
Ugglayep I will set it as early as possible. sean-k-mooney are you ok to discuss this topic if Mikal won't join16:31
sean-k-mooneysure16:31
Ugglacool16:31
Ugglaany concerns about the specless BP for Mikal ?16:32
sean-k-mooneyill note that we only need a ptg session if we have open design questions.16:32
bauzasI don't have any concerns 16:33
Ugglaso i guess we can go for it.16:33
gibigo for it16:34
Ugglasean-k-mooney, I will set the topic for the agenda, maybe it will be quick if it is crystal clear.16:34
sean-k-mooneyack16:34
Ugglasomething else you want to discuss for the PTG prep ?16:35
sean-k-mooneyam just an fyi im double booked with watcher 16:35
sean-k-mooneyso ill attend where i can 16:36
sean-k-mooneyping if im not there and you would like my input16:36
sean-k-mooneyi tried to leave comment in the doc already16:36
Ugglasean-k-mooney, ok sure thx.16:36
UgglaI close this topic by saying we will not have this meeting next week due to the vPTG.16:36
Uggla#topic Stable Branches16:36
masahitoi'm not sure i mentioned it before. if my topic is selected for the ptg topics, i prefer 14utc or later.16:36
Ugglamasahito, yes, this is somewhere in my mind.16:37
masahitothanks.16:37
Uggla#info stable/2024.* gates broken with nova-ceph-multistore job failure (test case test_volume_upload fails - No image found with ID...)16:37
Uggla#info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci16:37
Ugglaelodilles, the floor is yours16:38
elodillesyeah, i couldn't identify the root cause for that job failure yet ^^^ :(16:38
elodillesso if anyone has any idea, it is appreciated o:)16:38
elodillesstable/2023.2 gate is fine, and afaik stable/2025.1, too, but not 100% on that16:39
elodillesand that is all i can say now16:39
sean-k-mooneythe iso patch has been rechecked again...16:41
sean-k-mooneystill hitting that volume bug16:41
sean-k-mooneyi havent looked at it yet either ill admit16:41
sean-k-mooneyi assume cinder or glance should be seeing the same failure?16:42
gmannI think I saw it on stable/2025.1 also  (have not checked if it is same or different?) https://review.opendev.org/c/openstack/devstack/+/945239/comments/231d655b_52020b9416:42
* elodilles clicks but review.o.o is slow nowadays here :/16:43
sean-k-mooneythat's still the nova job16:43
sean-k-mooneyi was hoping that was a non nova one showing the same issue16:44
elodillesgmann: looks like same16:44
Ugglalooks the same yes.16:44
sean-k-mooneydansmith: do you know off the top of your head if there is anything special about the nova-ceph-multistore job that would cause no image found on volume upload?16:45
dansmithvolume upload to glance?16:45
sean-k-mooneypresumably yes16:45
dansmithnot really, that seems sort of impossible,16:46
dansmithsince you create first and then upload16:46
elodilles(and the devstack patch merged successfully on 27th March, so it either not fully blocking or it was fixed on stable/2025.1 somehow)16:46
gmannseems like glance fail to import image? 16:46
gmann#link https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#1280716:46
sean-k-mooney  953cc449-8f49-4d56-ab3e-49b54f49f937 failed to import image 833ccb36-83b8-41a7-b932-86884e4cdfc0 to the filesystem.: NoneType: None16:47
dansmithis this a multinode setup?16:47
dansmithwith glance-remote configured?16:47
dansmiththat sort of looks like it can't do import because it can't look up the configured self-reference url of a node that did the stage, maybe?16:48
sean-k-mooneyits multinode and multi store but im not sure if that also means multiple glance apis16:48
sean-k-mooneyoh no16:48
sean-k-mooneyits single node16:48
sean-k-mooneybut multiple backends16:48
dansmithyeah, single node and no g-api-r service16:49
dansmiththat must be the import-from-http test and not volume upload then16:50
sean-k-mooneyits tempest.api.volume.test_volumes_actions.VolumesActionsTest.test_volume_upload[id-d8f1ca95-3d5b-44a3-b8ca-909691c9532d,image]16:50
sean-k-mooneyhttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/testr_results.html16:50
sean-k-mooneyits a POST to https://10.0.18.121/volume/v3/volumes/5726aafd-b7f7-4dbd-be64-265076f0efbb/action  so asking cinder to upload the volume to glance16:51
sean-k-mooneyBody: {"os-volume_upload_image": {"image_name": "tempest-VolumesActionsTest-Image-2109450379", "disk_format": "raw"}}16:51
dansmithI also don't expect volume upload to use import16:52
dansmithsean-k-mooney: I meant the failed task that gmann linked to16:53
dansmithlet's not debug it here16:53
sean-k-mooneyoh ok16:53
gibiUggla: I think we can close the meeting and let the folks continue troubleshooting16:53
sean-k-mooneyya16:53
gmannI think there are some image delete request in log 16:53
gmann#link https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#1906216:53
gmannyeah let's debug later16:53
Ugglayep can we look at that after the meeting ?16:54
sean-k-mooneysure16:54
Uggla I skip the vmwareapi 3rd-party CI efforts Highlights as fwiesel is ooo this week.16:54
UgglaLatest topic16:54
Uggla#topic Open discussion16:54
UgglaIf there is not, I'll close the meeting.16:55
Ugglaanything more to discuss ?16:55
Uggla3...16:55
Uggla2...16:56
Uggla1...16:56
bauzas.16:56
Ugglathanks all16:57
Uggla#endmeeting16:57
opendevmeetMeeting ended Tue Apr  1 16:57:16 2025 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)16:57
opendevmeetMinutes:        https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.html16:57
opendevmeetMinutes (text): https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.txt16:57
opendevmeetLog:            https://meetings.opendev.org/meetings/nova/2025/nova.2025-04-01-16.01.log.html16:57
elodillesthanks Uggla o/16:57
bauzasthanks Uggla16:57
masahitothank you 16:57
gibithanks16:57
UgglaI thought @dansmith would like to discuss brainfuck formatting. :)16:58
sean-k-mooneydansmith: gmann: i have a theory but im not sure if it makes sense. looking at https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/controller/logs/etc/glance/glance-api_conf.txt the default backend is file(cheap), this is a volume upload test and the job has ceph. so i would assume that16:58
sean-k-mooneycinder is using ceph. could the issue be that cinder is expecting to upload to the robust(ceph) store but we are trying to upload to the default file store16:58
sean-k-mooneyits called black16:58
sean-k-mooneywith its one line per function argument16:59
sean-k-mooneydansmith: what im wondering is did cinder create a volume snapshot on ceph and then try and use the interoperable import flow to import from the ceph store to file store or something odd like that17:00
dansmithyou can't use import like that17:00
dansmithso I don't think so17:00
dansmithyou're thinking of adding the location directly17:00
sean-k-mooneyperhaps 17:01
sean-k-mooneyhttps://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_54c/openstack/54c93553fc6c4797b25077fb29a89338/controller/logs/etc/cinder/cinder_conf.txt17:01
sean-k-mooneycinder is using ceph17:01
sean-k-mooneyim just wondering if cinder is getting confused about how to upload the image17:01
dansmithmaybe best to ask a cinder person to look17:02
sean-k-mooneyya im going to very quickly try and find the change id in their logs17:03
sean-k-mooney*request id17:03
elodillesgmann sean-k-mooney : it looks like this patch landed around the time when things started to break: https://review.opendev.org/c/openstack/tempest/+/93859217:04
gmannelodilles: yeah, I was about to debug that and someone rechecked and it merged and then I forgot17:05
gmannI am checking some tempest logs to see if some test is making requests that race GET and DELETE17:05
gmannI will do after TC meeting17:05
sean-k-mooneythere is a traceback related to that i think in cinder17:06
sean-k-mooneyit thinks it uploaded the image  DEBUG cinder.volume.manager [None req-ca38b299-90c1-489f-93c7-3fd8b32b5e69 tempest-VolumesActionsTest-2112541612 None] Uploaded volume to glance image-id: 439cbb04-d5ef-480f-89c9-84bf2d4c749b17:06
elodillesthanks in advance gmann o/17:06
gmannbecause this is the image id which is 404 in the failing test and there is a successful delete request https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-g-api.txt#1906917:06
gmannand after this DELETE request there are GET and 40417:07
opendevreviewDan Smith proposed openstack/nova master: Invalidate PCI-in-placement cached RPs during claim  https://review.opendev.org/c/openstack/nova/+/94414917:07
opendevreviewDan Smith proposed openstack/nova master: Support "one-time-use" PCI devices  https://review.opendev.org/c/openstack/nova/+/94381617:07
opendevreviewDan Smith proposed openstack/nova master: Add one-time-use devices docs and reno  https://review.opendev.org/c/openstack/nova/+/94426217:07
opendevreviewDan Smith proposed openstack/nova master: WIP: Functional tests for one-time-use devices  https://review.opendev.org/c/openstack/nova/+/94606517:07
gmannsean-k-mooney: I am searching via image id from the tempest test failing traceback: 17:07
gmannDetails: {'message': 'No image found with ID 3eb80d77-c55b-40d1-bf5d-1e22607350e8<br /><br />\n\n\n', 'code': '404 Not Found', 'title': 'Not Found'}17:07
sean-k-mooneyhttps://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-c-vol.txt#4972-503717:07
dansmithgibi: that ^ tests the full workflow in a single test.. there's a lot of setup so I'm not sure if it's really useful to split that into multiple cases17:08
dansmithgibi: also not sure its feasible to test anything other than the happy path there (the cache issue is covered in unit tests)17:08
sean-k-mooneygmann: do you know where that 3eb80d77-c55b-40d1-bf5d-1e22607350e8 is coming from 17:09
sean-k-mooneywe can see at the start of the test logs that the id of the created image is 439cbb04-d5ef-480f-89c9-84bf2d4c749b 17:09
dansmithI assume testing things like startup failures due to improper config of a VF or something are not really needed to be covered in functional since they're covered in unit17:10
sean-k-mooneygmann: oh this is doing multiple operations in one test17:10
gmannsean-k-mooney: it is from cinder upload_volume https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/tempest/api/volume/test_volumes_actions.py#L12317:10
sean-k-mooneyso what's happening is the raw image is uploading fine17:12
sean-k-mooneyand the qcow is failing17:12
sean-k-mooneygmann: the qcow fails because of https://zuul.opendev.org/t/openstack/build/54c93553fc6c4797b25077fb29a89338/log/controller/logs/screen-c-vol.txt#4972-503717:13
sean-k-mooneyits failing in the format inspector code17:13
dansmithon the cinder side17:14
sean-k-mooneyyep17:14
dansmithand that's their oooold integrated format inspector code17:14
gmannbut in that case, I expect upload_volume to fail instead of giving image_id 17:14
sean-k-mooneyso its a ceph backed cinder volume. and we are asking cinder to upload the volume as an image in qcow2 format17:14
dansmithand apparently because filename is none17:14
sean-k-mooneygmann: i have not looked at the cinder code but i believe the workflow is create the glance image then upload the data17:15
sean-k-mooneyso they are probably failing in between those two steps17:15
sean-k-mooneygmann: from the looks of that trace the ceph cinder volume driver does not support converting the format when uploading a volume to an image17:17
sean-k-mooneygmann: so in the near term we probably should tweak that job to only use raw17:17
sean-k-mooneyand if cinder ever supports that with ceph backed volumes we can turn it back on17:17
sean-k-mooneylooking at the trace they are either expecting a host mounted volume or a qemu rbd path to be passed as the path to qemu image. but its obviously not passing either given its None17:19
sean-k-mooneydansmith: gmann  i think the short term fix would be to add disk_format=raw here https://github.com/openstack/nova/blob/master/.zuul.yaml#L71217:23
sean-k-mooneybased on the default change in https://review.opendev.org/c/openstack/tempest/+/938592/4/tempest/config.py17:23
sean-k-mooneyqcow makes sense for lvm jobs perhaps since the volume will be a local block device and they can call qemu image on it17:25
dansmithwhy not have the cinder team confirm and do that if so?17:25
dansmithisn't this on a stable job?17:25
sean-k-mooneyits likely on master too17:25
sean-k-mooneybut yes17:25
dansmithhow would that not be 100% fail?17:26
gmannbut this is not 100% failure in stable and master pass correctly ?17:26
dansmithmy point is, let's not just go papering over some failure with a devstack change as this seems like it should be a cinder bug no?17:26
sean-k-mooneyright i think it is a cinder bug17:26
gmanndansmith: yeah, that what I was wondering, its not 100%17:26
dansmithright, so I say kick it over to them17:27
sean-k-mooneyso i dont see it as papering over17:27
sean-k-mooneywe need to report it to them as a gate blocker17:27
sean-k-mooneyand then either revert the tempest default change until its fixed or just skip it in our job until fixed17:27
dansmithif glance is not using ceph as well, then asking them to upload a qcow2 image is totally legit17:27
sean-k-mooneybut i agree the first step is get cinder involved17:27
sean-k-mooneyglance is using file and ceph17:28
sean-k-mooneyso yes uploading qcow is legit in the job17:28
sean-k-mooneygiven file is the default17:28
sean-k-mooneyeven if it was not the default it would be legit but slow17:28
dansmithyeah17:28
sean-k-mooneywe also have force image conversion to raw in this job17:29
sean-k-mooneywhich could be a different issue17:29
sean-k-mooneyits not failing in glance however so its not the current issue17:29
supamatt:wave: hi folks this change broke security groups https://review.opendev.org/c/openstack/nova/+/811521 , if any security groups with the same name exist in the project (whether the VM is using that sec group or not). Results in VMs not being able to be built with an error saying "Multiple security groups found matching <name>. Use an ID to be more specific"17:32
sean-k-mooneyif you're relying on passing the name instead of the uuid17:33
sean-k-mooneyit would also only fail if that conflicting named security group was shared with you17:33
sean-k-mooneysupamatt: im not really sure how to address the conflicting requirements17:34
sean-k-mooneypreferring the local one seems incorrect17:34
sean-k-mooneyif there is a name conflict it should be raised as an error by nova17:35
masahitogmann: gibi: hi please review this api bug fix https://review.opendev.org/c/openstack/nova/+/939658  The v2.100 follows the same fix, so it might be ok for W+1.17:36
supamattsean-k-mooney: that's the thing, I am passing the uuid to nova. But nova errors if any security group has the same name in a project, regardless of whether I use that sec group to provision a vm. 17:36
sean-k-mooneysupamatt: on ok17:37
sean-k-mooney*oh17:37
sean-k-mooneythat's simpler to fix17:37
sean-k-mooneysupamatt: have you filed a bug for this yet17:37
supamattI was checking lp now to see if there was an existing one17:38
supamattbut it seems I am the first to see this, so I will file one17:38
sean-k-mooneyif you pass a name then the current behavior is arguably correct. we could prefer the one from the current project but that might be surprising. if you pass a uuid we should use the one you said even if there are duplicate names, so that's a legit bug17:39
sean-k-mooneysupamatt: so that's a new feature in epoxy, which releases tomorrow17:40
sean-k-mooneywe wont have a fix for it in the release but we likely will do another release after the ptg17:41
sean-k-mooneysupamatt: https://review.opendev.org/c/openstack/nova/+/811521/13/nova/network/neutron.py#851 that's where the bug is17:51
sean-k-mooneysupamatt: the current algorithm is raising based on the security groups available to the user, not the requested ones17:51
sean-k-mooneywe need to change the data structures and the algorithm slightly17:52
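A minimal sketch of the behaviour described above, not nova's actual code: duplicate names only raise when a request really refers to a group by an ambiguous name, and a request by UUID is always honoured. The function name, the exception classes, and the dict shape with "id" and "name" keys are all hypothetical, chosen for illustration.

```python
class AmbiguousSecurityGroup(Exception):
    """Hypothetical error: a requested name matches several groups."""


class SecurityGroupNotFound(Exception):
    """Hypothetical error: a request matches no available group."""


def resolve_security_groups(requested, available):
    """Map requested names or UUIDs to security group IDs.

    `requested` is a list of strings (names or UUIDs); `available` is a
    list of dicts with "id" and "name" keys. Both shapes are assumptions.
    """
    by_id = {sg["id"]: sg for sg in available}
    ids_by_name = {}
    for sg in available:
        ids_by_name.setdefault(sg["name"], []).append(sg["id"])

    resolved = []
    for ref in requested:
        # A UUID match wins outright, even if other groups share its name.
        if ref in by_id:
            resolved.append(ref)
            continue
        matches = ids_by_name.get(ref, [])
        # Only a name that was actually requested can be ambiguous.
        if len(matches) > 1:
            raise AmbiguousSecurityGroup(
                "Multiple security groups found matching %r. "
                "Use an ID to be more specific." % ref)
        if not matches:
            raise SecurityGroupNotFound(ref)
        resolved.append(matches[0])
    return resolved
```

The point of the sketch is that the ambiguity check is driven by the requested references, so unrelated duplicate names elsewhere in the project never cause a failure.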
gmannmasahito: ack, will check soon17:53
gmanndansmith: sean-k-mooney: elodilles: on test_volume_upload ceph failure, I found the cinder fix which is being backported to stable branches one by one. https://review.opendev.org/q/I32b505aa69c71b62e7e3a52d65d38165d34e97d817:57
sean-k-mooney gmann  ack17:58
gmannwe just need to wait for that to be merged on all stable 17:58
dansmithgotcha17:58
dansmithI'm also a bit surprised (not really) they have not converted to oslo format inspector17:58
dansmithI even prototyped the patch for them17:59
sean-k-mooneyhum ya the failure was on stable/2025.117:59
sean-k-mooneythat means they are also missing some of the multiple format things18:00
sean-k-mooneyfor iso support and vfat support18:00
dansmithyeah18:00
dansmithnot sure if that's an issue for them or not18:00
sean-k-mooneywell it means you won't be able to create volumes from some isos presumably18:00
sean-k-mooneythat's a pretty niche usecase given you can only use them properly with hacks18:01
dansmithI'm still a bit surprised that fix that gmann found is not deterministic in its fail pattern18:01
elodillesgmann: ah, cool, sounds great! thanks!18:01
sean-k-mooneydansmith: ya that odd to me too18:01
sean-k-mooneydansmith: that unlink https://review.opendev.org/c/openstack/cinder/+/945616/2/cinder/volume/drivers/rbd.py#2126 bug18:04
sean-k-mooneyupload can raise so there should at least be a try/finally block here18:05
dansmithunlink you mean?18:06
dansmithit also looks like they're using two strategies to unlink the temp file no?18:06
sean-k-mooneyya the os.unlink of the temp export of the volume 18:06
dansmithoh, I guess the context only unlinks on error18:06
dansmithbut yeah, that could definitely fail18:06
sean-k-mooneyoh they are using fileutils.remove_path_on_error(tmp_file):18:07
sean-k-mooneyto handle the exception case18:07
sean-k-mooneyand then the explicit unlink for the happy path18:07
dansmithright, seems like they should use try..finally: unlink()18:07
sean-k-mooneythat works... but not how i would have done that18:07
dansmithyeah18:07
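A minimal sketch of the try..finally approach suggested above, assuming a temporary export file that must be removed whether or not the upload succeeds. The function name and the `export` and `upload` callables are hypothetical stand-ins, not the cinder driver's real interface.

```python
import contextlib
import os
import tempfile


def export_and_upload(export, upload):
    """Export to a temp file, upload it, and always remove the temp file.

    `export` and `upload` are hypothetical callables that each take the
    temporary file path.
    """
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        export(tmp_path)
        upload(tmp_path)  # may raise; cleanup below still runs
    finally:
        # One cleanup path for both success and failure. Tolerate the
        # file having already been removed by a callee.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_path)
```

Compared with pairing an on-error context manager with an explicit unlink on the happy path, this keeps the cleanup in one place.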
sean-k-mooneydon't we use contextlib.ExitStack or something like that to handle it in nova/oslo18:08
dansmithin some places yeah but I don't think it's really necessary here18:08
sean-k-mooneywe might not even need that18:08
sean-k-mooneyright that's really only needed if we have multiple resources we need to clean up18:08
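For the multiple-resource case mentioned above, a minimal sketch using the standard library's `contextlib.ExitStack`, assuming several temporary files that all need removing. The `with_temp_files` helper and the `work` callable are hypothetical.

```python
import contextlib
import os
import tempfile


def with_temp_files(count, work):
    """Create `count` temp files, run `work(paths)`, then remove them all.

    `work` is a hypothetical callable that receives the list of paths.
    """
    with contextlib.ExitStack() as stack:
        paths = []
        for _ in range(count):
            fd, path = tempfile.mkstemp()
            os.close(fd)
            # Register cleanup as soon as the resource exists; callbacks
            # run in reverse order when the stack exits, even on error.
            stack.callback(os.unlink, path)
            paths.append(path)
        return work(paths)
```

With a single temporary file, as in the code under discussion, a plain try..finally does the same job with less machinery.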
sean-k-mooneygmann: are we sure this is not just hard failing, and it only looks like it's failing sometimes due to the backport of that patch18:10
sean-k-mooneygmann: given the "fix" i don't see how this could sometimes work and sometimes not18:10
sean-k-mooneyenabling the qcow image type and this backport have all been happening over the last week18:11
gmannsean-k-mooney: after this fix merged in master and stable/2025.1, I have not seen failure. in my devstack change on stable/2025.1 it started passing once this fix backport merged on stable/2025.1. I checked the timing to make sure that18:11
sean-k-mooneyack18:12
sean-k-mooneywell let's just wait for the final patch to land i guess18:12
gmannstable/2024.2 backport still not merged so there we have 100%18:12
gmann100% failing18:12
gmannyeah18:12
sean-k-mooneyack that makes sense18:13
sean-k-mooneygmann: the default change of image formats was in tempest yes?18:16
gmannyes18:16
sean-k-mooneyi wonder if it would make sense to replace https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L113-L115 with the nova ceph multistore job18:17
sean-k-mooneythat would provide the same level of ceph coverage but also test this configuration, where uploads go to a glance backed by file in this case.18:18
sean-k-mooneythat might be an overreaction but that would have caught this18:20
sean-k-mooneywe already have quite a lot in check18:20
gmannsean-k-mooney: +1, that makes sense. we do have nova-ceph-multistore in tempest but as experimental which is what we forget to run on related changes18:20
sean-k-mooneyso i would be hesitant to suggest adding more jobs18:20
sean-k-mooneyits in devstack too18:20
sean-k-mooneyhttps://github.com/openstack/devstack/blob/master/.zuul.yaml#L933-L93418:20
sean-k-mooneythat's how we are testing ceph integration on devstack changes18:21
gmannyes, I think replacing devstack-plugin-ceph-tempest-py3->nova-ceph-multistore in tempest will cover more cases18:21
gmanncatching in devstack is late where  tempest things get merged18:22
gmannsean-k-mooney: might be late for you but tomorrow or later if you can propose that in tempest I can merge. if not then I will do sometime later this week18:23
sean-k-mooneyi can do it now. its 19:23 but ill be around until at least the top of the hour18:24
gmannthanks18:24
sean-k-mooneygmann: am what will i do about https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L159-L16018:27
sean-k-mooneyit was temporarily disabled in gate 3 years ago18:27
sean-k-mooneywhen i replace it will i add nova-ceph-multinode or just remove that comment18:28
sean-k-mooneyhttps://github.com/openstack/tempest/commit/b1ea4327108cbbd518dfc75482dff79493b4edc918:28
gmannsean-k-mooney: yeah, mostly due to stability. but you can keep nova ceph as voting and remove from experimental queue https://github.com/openstack/tempest/blob/80c0477f78c71a2bd2e1a324c41cd2f50329b200/zuul.d/project.yaml#L17618:28
gmannyeah18:28
sean-k-mooneyok ill add it to both18:28
sean-k-mooneyits voting on nova and devstack and stable so i dont see a reason to skip in gate18:29
gmannyeah18:29
sean-k-mooneygmann: last question, will i put the current devstack-ceph job in experimental. i think no but i can replace nova-ceph-multistore with it if it's useful to have there18:31
sean-k-mooneyexperimental has quite a lot in it too and we can easily add a dnm to run it if needed18:31
supamattsean-k-mooney: here's the lp bug for the security group problem, https://bugs.launchpad.net/nova/+bug/210589618:31
sean-k-mooneythanks, i think i figured out how to fix it and commented on the original patch18:32
sean-k-mooneygmann: i think https://review.opendev.org/c/openstack/tempest/+/946076 should be good. i will respin it tomorrow if you have any issues18:40
gmannsean-k-mooney: looks good, thanks18:40
opendevreviewMatthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups  https://review.opendev.org/c/openstack/nova/+/94607919:28
opendevreviewGoutham Pacha Ravi proposed openstack/nova stable/2025.1: DNM: test dependency on devstack-plugin-ceph changes  https://review.opendev.org/c/openstack/nova/+/94608219:42
sean-k-mooneysupamatt: we will need some test coverage but https://review.opendev.org/c/openstack/nova/+/946079/1/nova/network/neutron.py looks like it implements the changes i suggested.20:14
sean-k-mooneysupamatt: im not sure if you have had a chance to test it or not20:14
sean-k-mooneyi can help you write a functional reproducer or perhaps write one for you if you don't have time, but let's see how ci goes overnight20:15
sean-k-mooneyim going to drop for today but ill check back tomorrow20:15
supamattI tested the patch in lab, seems to have worked and the vm built. When previously it did not.20:38
opendevreviewMatthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups  https://review.opendev.org/c/openstack/nova/+/94607922:54
opendevreviewMatthew Heler proposed openstack/nova master: Fix creating virtual servers with multiple security groups  https://review.opendev.org/c/openstack/nova/+/94607923:38
