| opendevreview | Stephen Finucane proposed openstack/nova master: tests: Filter out (more) eventlet deprecation warnings https://review.opendev.org/c/openstack/nova/+/974686 | 11:26 |
|---|---|---|
| opendevreview | Stephen Finucane proposed openstack/nova master: console: Fix type error https://review.opendev.org/c/openstack/nova/+/974687 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: privsep: Remove dead code https://review.opendev.org/c/openstack/nova/+/977111 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Add ruff-check https://review.opendev.org/c/openstack/nova/+/974441 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.cmd https://review.opendev.org/c/openstack/nova/+/705657 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.conf https://review.opendev.org/c/openstack/nova/+/974688 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.console https://review.opendev.org/c/openstack/nova/+/974689 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to remaining top-level modules https://review.opendev.org/c/openstack/nova/+/705658 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.virt, nova.virt.libvirt https://review.opendev.org/c/openstack/nova/+/974220 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Correct import issues https://review.opendev.org/c/openstack/nova/+/974221 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: mypy: Enable incremental checks https://review.opendev.org/c/openstack/nova/+/974222 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: mypy: Disallow incomplete defs https://review.opendev.org/c/openstack/nova/+/974690 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: mypy: Disallow untyped defs (where possible) https://review.opendev.org/c/openstack/nova/+/974725 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.privsep https://review.opendev.org/c/openstack/nova/+/977112 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: typing: Add hints to nova.policy https://review.opendev.org/c/openstack/nova/+/977153 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: libvirt: Fix logging type error https://review.opendev.org/c/openstack/nova/+/978432 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: utils: Pass string to lock https://review.opendev.org/c/openstack/nova/+/978433 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: privsep: Pass strings, not ints https://review.opendev.org/c/openstack/nova/+/978434 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: privsep: Correct arg order https://review.opendev.org/c/openstack/nova/+/978435 | 11:26 |
| opendevreview | Stephen Finucane proposed openstack/nova master: Remove more dead code https://review.opendev.org/c/openstack/nova/+/978436 | 11:26 |
| elodilles | sean-k-mooney: about the novaclient release model change you raised on friday evening: yeah, so a patch should be created in the openstack/releases repository where we move novaclient from deliverables/hibiscus -> deliverables/_independent (note that python-novaclient was released for gazpacho already) | 11:55 |
| elodilles | sean-k-mooney: and we can do this after 2026.1 Gazpacho release, after we have created deliverables/hibiscus o:) | 11:56 |
| elodilles | sean-k-mooney: i forgot about this one after the Antelope PTG :/ | 11:57 |
| elodilles | sean-k-mooney: one comment though: once we have moved python-novaclient to 'independent' we won't be able to do stable releases. in general this would not be a problem. BUT.... if there is any important bug fix and the master branch has dropped support for an old py3 version (or similar) then we won't be able to get that fixed on stable branches. so this is something to keep in mind if we want to move a deliverable to the cycle independent model | 12:00 |
| elodilles | sean-k-mooney: i'm saying this as we had in the past some issues when multiple oslo project moved to independent model and caused issues when fixes could not be added and released. so oslo projects moved back from cycle independent to cycle based release model because of this. | 12:02 |
| sean-k-mooney | elodilles: so we have discussed entirely freezing novaclient and discontinuing even bug fix releases | 12:03 |
| sean-k-mooney | the only reason it still exists is to provide a bridge for services like heat or neutron to move to the sdk | 12:03 |
| sean-k-mooney | it might make sense for us to just go do that for those services next release | 12:03 |
| sean-k-mooney | the same way neutron is moving nova to the sdk this cycle | 12:04 |
| elodilles | sean-k-mooney: can we set novaclient as retired maybe next cycle? | 12:05 |
| sean-k-mooney | i was thinking more for 2027.1 | 12:05 |
| elodilles | maybe that would be the most straightforward way | 12:05 |
| elodilles | sean-k-mooney: i see, 2027.1 also sounds OK to me | 12:06 |
| sean-k-mooney | we still have services using it so if we set that as a goal for 2027.1 that gives 2 releases to complete the move but we should discuss that with the wider team | 12:06 |
| sean-k-mooney | watcher had it as a priority this cycle | 12:07 |
| elodilles | sean-k-mooney: this cycle the clients are already released, so i think we cannot do anything about it | 12:07 |
| sean-k-mooney | ya we cant | 12:07 |
| sean-k-mooney | which is fine | 12:07 |
| elodilles | ACK | 12:08 |
| sean-k-mooney | i actually need to add "supporting watcher in the sdk" to the ptg agenda there and speak to stephen et al about the right way to do that so that we can also deprecate and/or remove the watcherclient python binding in a similar time frame | 12:09 |
| gokhan | hi folks, I am currently testing OpenStack Epoxy with Masakari for instance high availability. In my lab environment, I am using Ceph RBD as the backend for Nova ephemeral disks and for cinder disks. I am performing "hard failure" tests by physically pulling the power plug of a compute node to trigger Masakari’s evacuation process. While Masakari successfully detects the failure and starts the instances on another node, I am facing severe filesystem corruption on almost all evacuated VMs. Upon recovery on the new node, VMs drop to emergency mode with blk_update_request: I/O error and journal corruption. I suspect network=writeback is the primary culprit during a hard power-off because volatile buffers are lost before being flushed to the Ceph OSDs. Is it a common consensus to avoid network=writeback when using Masakari or any HA driver that expects survival after a hard node crash? | 12:21 |
| gokhan | What is the recommended disk_cachemodes for Ceph RBD in production environments where power-loss-induced HA is a requirement? Is network=none the only safe path? Would switching to virtio-scsi (with hw_scsi_model and hw_disk_bus) significantly improve the "flush" command reliability during sudden failures compared to standard virtio-blk? | 12:21 |
| sean-k-mooney | gokhan: so i assume masakari is using evacuate. it's a hard requirement of using the evacuate api that the vm must be stopped on the original host and the admin or masakari in this case is required to verify that before calling evacuate. by fencing i assume you mean that the physical host running the vm is powered off with ipmi/redfish/pdu | 12:23 |
| sean-k-mooney | network=writeback still commits all guest flushes to the backing store i.e. ceph | 12:24 |
| sean-k-mooney | so if your guest os is properly flushing writes then there should not be any corruption | 12:25 |
| sean-k-mooney | or at least changing the cache mode to none would not alter the flush behavior | 12:25 |
| gokhan | thanks sean-k-mooney , Regarding your point on fencing: Yes, Masakari is indeed using the Evacuate API, and I am simulating a hard failure by physically pulling the power plug of the compute node (simulating a sudden power outage or a motherboard failure where no graceful shutdown or IPMI action is possible from the host side). | 12:27 |
| sean-k-mooney | the production recommendation for ceph is hw_disk_bus=scsi hw_scsi_model=virtio-scsi and cache mode set to writeback | 12:27 |
| sean-k-mooney | that is the recommendation from the ceph community | 12:27 |
| sean-k-mooney | https://docs.ceph.com/en/latest/rbd/rbd-openstack/#image-properties https://docs.ceph.com/en/latest/rbd/rbd-openstack/#configuring-nova | 12:33 |
| sean-k-mooney | by the way there is also ceph level rbd caching | 12:33 |
| sean-k-mooney | i.e. at the ceph.conf level | 12:34 |
| sean-k-mooney | i think the current recommendations are to use writeback at the qemu level and to enable rbd caching with `rbd cache writethrough until flush = true` at the rbd/ceph level | 12:36 |
| sean-k-mooney | with that said they removed the explicit specification of the libvirt cache_mode from their doc at some point | 12:36 |
| sean-k-mooney | https://docs.ceph.com/en/mimic/rbd/rbd-openstack/#id2 | 12:37 |
| sean-k-mooney | gokhan: one of the main reasons for using virtio-scsi in the past was for discard/trim support because that was not previously implemented when using virtio-blk | 12:38 |
| sean-k-mooney | that gap has been resolved in the last 2-3 years in modern qemu releases | 12:38 |
| gokhan | sean-k-mooney, there is no cache on ceph environment | 12:41 |
| gokhan | I will try with new configs | 12:41 |
| sean-k-mooney | rbd caching is apparently enabled by default reading the ceph docs | 12:41 |
| gokhan | sean-k-mooney, do we need to disable it or keep as is | 12:52 |
| sean-k-mooney | honestly this is not something we test or can really advise on. you might get more useful feedback from the ops mailing list or operator hour session | 12:53 |
| sean-k-mooney | the best i can do is point you at the official docs from the ceph community above | 12:53 |
| sean-k-mooney | well that and https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L668-L700 | 12:54 |
| sean-k-mooney | a flush from the guest is flushed to ceph in all cases except unsafe | 12:55 |
| sean-k-mooney | you absolutely should not use unsafe unless you hate your guests' data | 12:55 |
| sean-k-mooney | i would read that comment that explains the different modes in detail and decide which fits your performance and correctness needs | 12:57 |
| sean-k-mooney | a guest should never rely on writes being committed without flushing. if they do they have a bug in their software | 12:57 |
| sean-k-mooney | writethrough and directsync will inject an extra fsync on every write with the performance penalty that implies | 12:58 |
| gokhan | disk_cachemodes=file=none,block=none,network=none | 13:07 |
| gokhan | also did not work for this scenario. Now I am trying with the recommended settings | 13:07 |
| gokhan | sean-k-mooney, I found the problem. ceph still has the disk locked by the previous host. | 13:13 |
| gokhan | rbd lock list cinder-volumes/volume-4263698a-ad00-472c-b4c8-62591664a8e1 | 13:16 |
| gokhan | There is 1 exclusive lock on this image. | 13:16 |
| gokhan | Locker ID Address | 13:16 |
| gokhan | client.95989750 auto 125705438402288 10.x.x.x:0/1193021369 | 13:16 |
| gokhan | sean-k-mooney, I also saw your response on the mailing list https://lists.openstack.org/pipermail/openstack-discuss/2023-February/032256.html | 13:18 |
| *** kaisers is now known as Guest4147 | 14:28 | |
| *** Guest4147 is now known as kaisers | 14:28 | |
| opendevreview | ribaudr proposed openstack/nova master: FUP Add HW_PCI_LIVE_MIGRATABLE trait to PCI resource providers https://review.opendev.org/c/openstack/nova/+/977310 | 14:44 |
| opendevreview | ribaudr proposed openstack/nova master: Tidy up pci self.flags() calls in SR-IOV functional tests https://review.opendev.org/c/openstack/nova/+/978479 | 14:44 |
| opendevreview | Max proposed openstack/nova master: fix: delete attachments after rescheduling delete https://review.opendev.org/c/openstack/nova/+/974832 | 14:48 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: Drop wrong strict assertion of quota class set id https://review.opendev.org/c/openstack/nova/+/978494 | 15:10 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: Drop wrong strict assertion of quota class class set id https://review.opendev.org/c/openstack/nova/+/978494 | 15:12 |
| tkajinam | ^^^ this is strange but it's the behavior novaclient has been testing... | 15:12 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: Fix wrong strict assertion of quota class class set id https://review.opendev.org/c/openstack/nova/+/978494 | 15:13 |
| opendevreview | Takashi Kajinami proposed openstack/python-novaclient master: DNM: testing https://review.opendev.org/c/openstack/python-novaclient/+/978532 | 15:13 |
| Uggla | Nova meeting in ~35mn | 15:25 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: Fix wrong strict assertion of quota class class set id https://review.opendev.org/c/openstack/nova/+/978494 | 15:52 |
| tkajinam | hmmm so there is an issue with zuul now and jobs may be delayed for some time. | 15:54 |
| tkajinam | I mean it take some time until zuul actually starts processing notifications from gerrit | 15:54 |
| tkajinam | (seeing the discussion in OpenDev room in matrix | 15:55 |
| Uggla | #startmeeting nova | 16:00 |
| opendevmeet | Meeting started Mon Mar 2 16:00:47 2026 UTC and is due to finish in 60 minutes. The chair is Uggla. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
| opendevmeet | The meeting name has been set to 'nova' | 16:00 |
| Uggla | Hello everyone | 16:00 |
| elodilles | o/ | 16:01 |
| bauzas | o/ | 16:01 |
| kaisers | o/ | 16:01 |
| sean-k-mooney | o/ is here but distracted | 16:01 |
| tkajinam | o/ | 16:01 |
| gmaan | o/ | 16:01 |
| lajoskatona | o/ | 16:02 |
| Uggla | Let's start | 16:03 |
| Uggla | #topic Bugs (stuck/critical) | 16:03 |
| fwiesel | o/ | 16:03 |
| Uggla | #info No Critical bug | 16:03 |
| tkajinam | https://bugs.launchpad.net/nova/+bug/2143057 | 16:03 |
| Uggla | oh | 16:03 |
| dansmith | o/ | 16:03 |
| tkajinam | there is one I reported 1 hour ago (I changed its importance a few minutes ago) | 16:03 |
| tkajinam | which is blocking the novaclient CI | 16:03 |
| tkajinam | I've proposed a potential fix but zuul is now slow and is not starting jobs timely... | 16:04 |
| Uggla | ok I have seen your message earlier about this. | 16:04 |
| tkajinam | yeah | 16:05 |
| Uggla | the bug sounds like something introduced with openapi. | 16:05 |
| Uggla | am I right ? | 16:05 |
| tkajinam | yup | 16:06 |
| tkajinam | the error is caused by schema validation of responses | 16:06 |
| sean-k-mooney | well that or it's now showing a bug in the client | 16:06 |
| tkajinam | which was recently enabled at the last phase of openapi series | 16:06 |
| sean-k-mooney | ya so response validation is only enabled in ci | 16:06 |
| gmaan | shouldn't the test be fixed in novaclient? | 16:06 |
| sean-k-mooney | it's not intended for production use | 16:06 |
| sean-k-mooney | and ya i am wondering if it's a novaclient issue or not | 16:07 |
| gmaan | if API doc says, default is the only supported | 16:07 |
| tkajinam | but only API doc says that. that is the problem. | 16:07 |
| gmaan | fake-class-2-1 -> 'default' | 16:07 |
| tkajinam | I mean API doc says only default is supported but it's not actually validated in API | 16:07 |
| sean-k-mooney | tkajinam: that is the actual api contract however | 16:07 |
| tkajinam | I'm ok with "fixing" novaclient tests but in that case we should also fix nova to return an appropriate error to client for non-default id | 16:08 |
| tkajinam | currently it returns 400 with response schema validation error | 16:08 |
| sean-k-mooney | i'm a little confused because i see tempest/lib/cli in that path | 16:09 |
| sean-k-mooney | are these novaclient tempest tests? | 16:09 |
| sean-k-mooney | "/home/zuul/src/opendev.org/openstack/python-novaclient/.tox/functional/lib/python3.12/site-packages/tempest/lib/cli/base.py" | 16:09 |
| gmaan | novaclient but they use tempest base test for cli | 16:10 |
| sean-k-mooney | ack | 16:10 |
| gmaan | anyways, let's continue on gerrit, I will comment there | 16:10 |
| tkajinam | ok | 16:10 |
| Uggla | good thx gmaan | 16:10 |
| opendevreview | Takashi Kajinami proposed openstack/nova master: Fix wrong strict assertion of quota class set id https://review.opendev.org/c/openstack/nova/+/978494 | 16:11 |
| Uggla | #topic Gate status | 16:11 |
| Uggla | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:11 |
| Uggla | #link https://etherpad.opendev.org/p/nova-ci-failures-minimal | 16:11 |
| Uggla | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&branch=stable%2F*&branch=master&pipeline=periodic-weekly&skip=0 Nova&Placement periodic jobs status | 16:11 |
| Uggla | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:11 |
| Uggla | #info Please try to provide a meaningful comment when you recheck | 16:11 |
| Uggla | Anything else except what we discussed before ? | 16:12 |
| Uggla | seems not so moving on. | 16:13 |
| Uggla | #topic Release Planning | 16:13 |
| Uggla | #link https://releases.openstack.org/gazpacho/schedule.html | 16:13 |
| Uggla | #info Nova deadlines are set in the above schedule | 16:13 |
| Uggla | info PTG etherpad for 2026.1 is available: https://etherpad.opendev.org/p/nova-2026.1-ptg | 16:13 |
| Uggla | #info Last week was Feature Freeze. (Thursday 20260226) Thanks for all the reviews and merges to close open features in this cycle. | 16:13 |
| Uggla | #topic Review priorities | 16:15 |
| tkajinam | maybe it's time to prepare etherpad for 2026."2" ? | 16:15 |
| Uggla | tkajinam, yep it is part of post FF tasks. | 16:16 |
| tkajinam | Uggla, ok :-) | 16:16 |
| Uggla | #link https://etherpad.opendev.org/p/nova-2026.1-status | 16:16 |
| Uggla | Starting here: https://etherpad.opendev.org/p/nova-2026.1-status#L70 there are interesting bugs to improve live migration. So I would like cores to review them. | 16:17 |
| Uggla | Can we also look at https://review.opendev.org/c/openstack/nova/+/972601: return error about external network to the user on build failure from Cardoe. It would be great to get special care for that one, because, unless I'm wrong, we missed it in the previous cycle, and it's important for Cardoe. | 16:17 |
| kaisers | Quick question related to reviews: I've a small change up to drop warnings on the libvirt volume Quobyte driver, it was recently set back to supported in Cinder. Does such a change need any additional background? It is not really a bugfix but also not a new feature, so i am not sure if should do anything or just let it sit. | 16:18 |
| cardoe | Thanks Uggla. We just got a number of bug reports about this from users. It was 4 by my last count so clearly folks are using nova+ironic out there. | 16:18 |
| kaisers | sry, didn't want to fire in between | 16:19 |
| Uggla | kaisers can you provide the link of the patch, I will add it to the list | 16:20 |
| kaisers | yes, thnx. https://review.opendev.org/c/openstack/nova/+/977300 | 16:20 |
| Uggla | does anyone want to add something else ? | 16:22 |
| r-taketn | Uggla: I’ve read the irclogs and understood your decision that we have extra weeks regarding sev refactor series. Thank you for your confirmation. | 16:22 |
| Uggla | r-taketn yep. | 16:22 |
| Uggla | r-taketn, please continue the effort on that topic if you can. That will be good to have this for next cycle AMD-SNP and TDX. | 16:23 |
| r-taketn | Thank you. I will continue this series during next two weeks. I’ve updated the code and comments. Please review them if cores have time. | 16:25 |
| Uggla | yep at least gibi told me he will follow up on your series | 16:26 |
| gibi | r-taketn: I'm planning to spend time with your refactor tomorrow or the day after | 16:26 |
| Uggla | gibi 👍 | 16:26 |
| r-taketn | gibi: Thank you! | 16:26 |
| Uggla | can we move on ? | 16:27 |
| r-taketn | That’s all from me. Thanks | 16:27 |
| Uggla | r-taketn you're welcome. | 16:27 |
| Uggla | #topic OpenAPI | 16:28 |
| Uggla | #link: https://review.opendev.org/q/topic:%22openapi%22+(project:openstack/nova+OR+project:openstack/placement)+-status:merged+-status:abandoned | 16:28 |
| Uggla | #info still 0 remaining | 16:28 |
| Uggla | #info This topic is closed. A big thank you to stephenfin, sean-k-mooney, gmaan and all people involved with this. | 16:28 |
| Uggla | \o/ | 16:28 |
| gibi | congrats | 16:29 |
| Uggla | #topic Stable Branches | 16:29 |
| gmaan | tkajinam: on that topic, I commented on gerrit, your change is ok for me but let's fix the test as this bug fix and we can discuss further about how better we can handle the error | 16:30 |
| gmaan | https://review.opendev.org/c/openstack/nova/+/978494 | 16:30 |
| * Uggla passing the mic to elodilles | 16:30 | |
| elodilles | thx Uggla | 16:30 |
| elodilles | #info stable gates should be in good state | 16:30 |
| elodilles | #info stable/2026.1 branch cut for libraries: https://review.opendev.org/c/openstack/releases/+/978397 | 16:30 |
| elodilles | thanks for the review Uggla :) | 16:30 |
| elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:31 |
| * elodilles gives back the mic to Uggla | 16:31 | |
| Uggla | thx elodilles | 16:32 |
| Uggla | #topic vmwareapi 3rd-party CI efforts Highlights | 16:32 |
| * Uggla giving the mic to fwiesel | 16:32 | |
| Uggla | seems fwiesel is not available so moving on. | 16:34 |
| Uggla | #topic Gibi's news about eventlet removal | 16:34 |
| gibi | o/ | 16:34 |
| * Uggla passing the mic to gibi | 16:34 | |
| gibi | nova-compute now can run in native threading mode (default is still eventlet) | 16:35 |
| bauzas | \o/ | 16:35 |
| dansmith | yay | 16:35 |
| gibi | and nova-next runs nova-compute and the rest with native threading | 16:35 |
| Uggla | \o/ | 16:35 |
| gmaan | \o/ | 16:35 |
| dansmith | I apologize for being so absent on those reviews, but is there a reno asking operators to test and provide feedback? | 16:35 |
| fwiesel | Uggla: Sorry. was distracted. no updates | 16:35 |
| Uggla | fwiesel no worries | 16:36 |
| gibi | dansmith: there is a reno about the change and nova logs a message at startup asking for testing first in pre-prod | 16:36 |
| gibi | dansmith: we can emphasize the feedback part in the prelude | 16:36 |
| Uggla | dansmith, gibi, I will try to do it in prelude and highlights | 16:36 |
| dansmith | ack, sounds good.. this, probably more than any previous thing really needs some concerted feedback I think | 16:37 |
| gmaan | ++ for prelude | 16:37 |
| gmaan | gibi: I am changing the graceful shutdown job to run in both modes (eventlet mode as periodic), so that we will have coverage for both https://review.opendev.org/c/openstack/nova/+/978292 | 16:37 |
| dansmith | prelude seems like a good place | 16:37 |
| gibi | gmaan: cool | 16:37 |
| gibi | dansmith: I agree. | 16:37 |
| gibi | we punted novncproxy to H but there is some WIP patch up already from Kamil | 16:38 |
| gibi | I have a set of unit test patches that we can still land before RC1 https://review.opendev.org/c/openstack/nova/+/970069 | 16:38 |
| gibi | and I have a long list of TODOs for cleanups about executors that I will prepare patches for in early H | 16:39 |
| gibi | I still plan to hold the eventlet sync this week but maybe skip some calls during the RC period | 16:40 |
| gibi | just to restart in full force once master is open for H | 16:40 |
| gibi | that is it from me | 16:40 |
| * gibi passes the mic back | 16:40 | |
| gibi | ohh one more thing | 16:41 |
| gibi | now that fwiesel is around | 16:41 |
| gibi | fwiesel: could you help me debug the vmware CI with native threading | 16:41 |
| gibi | https://review.opendev.org/c/openstack/nova/+/973468 | 16:41 |
| opendevreview | Doug Goldstein proposed openstack/nova master: return error about external network to the user on build failure https://review.opendev.org/c/openstack/nova/+/972601 | 16:41 |
| gibi | see the comments from me in ^^ | 16:41 |
| gibi | I mean in https://review.opendev.org/c/openstack/nova/+/973468 | 16:42 |
| gibi | the image upload seems to be problematic | 16:42 |
| gibi | but snapshot works well with the libvirt driver so this is vmware driver specific | 16:42 |
| gibi | so I need your help | 16:42 |
| Uggla | fwiesel ? | 16:44 |
| gibi | fwiesel: it is OK if you just read back and we can discuss outside of the meeting as well | 16:44 |
| Uggla | fwiesel will probably answer later, moving on to the next topic | 16:45 |
| Uggla | #topic Nova using openstack sdk for neutron | 16:46 |
| * Uggla passing the mic to lajoskatona | 16:46 | |
| lajoskatona | o/ thanks | 16:46 |
| lajoskatona | short feedback: no progress since last meeting | 16:47 |
| * Uggla shame on me I have still not provided my comments. | 16:47 | |
| lajoskatona | I plan to update the floating IP and port related changes and clean them up (the one for ports is actually red so I have to work more on that I think) | 16:47 |
| gmaan | what's the plan on this, are we ok to merge those in rc period or wait till H? | 16:47 |
| lajoskatona | no worries, I think everybody is busy now with the release and later with RCs | 16:48 |
| dansmith | I assume these are not without risk right? | 16:48 |
| gmaan | yeah, there is risk | 16:48 |
| Uggla | gmaan, to my mind there is no urgency. And it is more something for H | 16:48 |
| lajoskatona | yes, I would wait till beginning of H to have time for testing in CI | 16:48 |
| lajoskatona | +1 | 16:49 |
| gmaan | k, bcz I planned to review it but missed the FF so I can start but wait for H to open for merging | 16:49 |
| Uggla | lajoskatona I think we are good. | 16:50 |
| Uggla | moving to next topic | 16:50 |
| Uggla | #topic Bug scrubbing | 16:50 |
| Uggla | #info up to 212 (-4) | 16:50 |
| Uggla | I'll continue to focus on that this week. | 16:51 |
| Uggla | Today because we are a bit short of time, I would like to discuss only this bug: https://bugs.launchpad.net/nova/+bug/2141325 opened by gibi. | 16:51 |
| gibi | o/ | 16:51 |
| gibi | so in short | 16:52 |
| gibi | nova assumes during live migration prep that a single VM does not have multiple tap devs with the same MAC | 16:52 |
| gibi | but neutron allows setting MAC on port objects | 16:52 |
| gibi | so one can set up a VM with two ports in different networks with the same MAC | 16:52 |
| gibi | and live migration will fail | 16:53 |
| gibi | question: do we want to fix it, or do we want to state that please don't try to do this | 16:53 |
| dansmith | is the limitation a nova thing or a libvirt/qemu thing? | 16:53 |
| gibi | in nova, it is how we identify tap devs across the two XMLs | 16:54 |
| dansmith | i.e. is it just us using the MAC as a primary key for mapping ports? | 16:54 |
| gibi | just us | 16:54 |
| dansmith | ack | 16:54 |
| gibi | I rather not try to touch that XML gen code | 16:54 |
| gibi | as manually managing MAC in the cloud is sort of an edge case in my head | 16:54 |
| dansmith | meaning you prefer we not fix this? | 16:55 |
| gibi | yepp I prefer documenting this as a known limitation | 16:55 |
| bauzas | could we at least fix it by doc ? | 16:55 |
| bauzas | ah too late :) | 16:55 |
| dansmith | I think documenting it as a limitation now is obviously better than having it fail, | 16:56 |
| dansmith | but it is also probably indicative of us using something as a PK that shouldn't be, and I also think there are probably legit reasons for doing this related to HA | 16:56 |
| dansmith | IIRC, we had a todo to make the NICs identifiable by alias like the disks work I did, | 16:57 |
| gibi | I tried to poke the downstream reporter about their use case doing this but so far I got nothing | 16:57 |
| dansmith | which is probably what we _should_ be using instead of MAC right? | 16:57 |
| bauzas | could we fail early with a good self-explanatory exception ? | 16:58 |
| gibi | yeah libvirt has aliases. I haven't checked if it fits to our live migration XML manipulation code to use that | 16:58 |
| gibi | bauzas: today we fail here which is probably late https://github.com/openstack/nova/blob/264e868d4931595140260c0f655a10b525be38f7/nova/virt/libvirt/migration.py#L496-L500 | 16:59 |
| gibi | sorry not there | 16:59 |
| gibi | that is the place of the root cause | 16:59 |
| gibi | we fail at | 16:59 |
| bauzas | I'm thinking of when creating the instance | 16:59 |
| gibi | File "/usr/lib64/python3.9/site-packages/libvirt.py", line 2225, in migrateToURI3 | 16:59 |
| gibi | bauzas: that would open a can of worms, 1) pre-existing instance with duplicated MAC 2) interface attach | 17:00 |
| bauzas | if you ask for two ports, can we quickly get the MAC addresses already or do we need another roundtrip ? (please, I don't want this if so) | 17:00 |
| gibi | if failing early is the goal then we should fail pre-live migration | 17:00 |
| bauzas | yup, then yes | 17:01 |
| dansmith | well, | 17:01 |
| bauzas | either in the conductor check or later in compute pre-flight | 17:01 |
| dansmith | I think if we're going to say it's not allowed, we should prevent it at instance spawn/interface attach right? | 17:01 |
| dansmith | because in this case, a user can do something that prevents an admin from live migrating an instance | 17:01 |
| dansmith | which is not a cool thing to find out later | 17:01 |
| bauzas | then it should be when we attach a port ? | 17:02 |
| dansmith | so if we say "not allowed" it probably needs to be enforced much earlier | 17:02 |
| dansmith | but again, I feel like this is probably something we can/should fix (even if later) with the alias stuff like we did for disks and planned for nics | 17:02 |
| bauzas | right | 17:03 |
| gibi | OK, if preventing it needs a bunch of code then maybe fixing it is comparable complexity (I was on the side of doccing it) | 17:03 |
| dansmith | the disk stuff has fallen out of my brain a bit, but seems like we could use the port uuid as the alias and then we'd be able to match up the port and interface in that code, right? | 17:03 |
| dansmith | gibi: well, I think documenting it also documents "how to prevent your operator from moving your instance around" :) | 17:04 |
| gibi | dansmith: right port uuid is unique in openstack, nic alias is unique in the VM | 17:04 |
| dansmith | yeah | 17:04 |
| gibi | OK I will summarize this discussion back to the bug | 17:05 |
| gibi | I got enough feedback that a simple doc patch will not be enough :) | 17:05 |
| gibi | thanks | 17:05 |
| dansmith | sorry gibi :/ | 17:06 |
| Uggla | gibi thanks to you explaining this. | 17:06 |
| Uggla | We are already overtime, so time to close for this week. | 17:06 |
| cw0306-lee[m] | Hi! I have a question. | 17:06 |
| gibi | dansmith: no worries :) | 17:07 |
| Uggla | cw0306-lee[m] last minute question, please go ahead | 17:07 |
| cw0306-lee[m] | I tried to fix a bug and added it to the etherpad's proposed bugfixes, will they be reviewed in order? | 17:07 |
| Uggla | cw0306-lee[m] we will try. There is no specific order though, it really depends on priority and reviewer bandwidth. | 17:08 |
| cw0306-lee[m] | Uggla: Thank you! | 17:09 |
| Uggla | readiness too | 17:09 |
| Uggla | ok so now time to close, thanks for joining this meeting. Have a nice day/evening and see you next week. | 17:10 |
| Uggla | #endmeeting | 17:10 |
| opendevmeet | Meeting ended Mon Mar 2 17:10:29 2026 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:10 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2026/nova.2026-03-02-16.00.html | 17:10 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2026/nova.2026-03-02-16.00.txt | 17:10 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2026/nova.2026-03-02-16.00.log.html | 17:10 |
| gibi | thanks Uggla | 17:10 |
| elodilles | thanks o/ | 17:10 |
| lajoskatona | o/ | 17:11 |
| Uggla | gibi thanks to you, we have one more bug triaged! | 17:11 |
| tkajinam | gmaan, regarding that novaclient tests update, do you know any stestr magic to avoid specific tests from being executed concurrently ? | 17:12 |
| tkajinam | https://opendev.org/openstack/python-novaclient/src/branch/master/novaclient/tests/functional/v2/test_quota_classes.py#L79-L107 | 17:12 |
| tkajinam | the problem is that there are a few tests accessing quota-class, and if we use only the default one then we should refactor these to avoid conflicts among them | 17:13 |
| tkajinam | I'll look into that tomorrow but the test update looks tricky for me now | 17:15 |
| sean-k-mooney | tkajinam: for what it's worth, even in queens we have notes stating that only the default class is supported by nova | 17:18 |
| sean-k-mooney | https://docs.openstack.org/nova/queens/admin/quotas.html#to-update-quota-values-for-an-existing-project | 17:18 |
| sean-k-mooney | https://that.guru/blog/quotas-in-openstack/#quota-classes looking at the history, the support for this was ripped out in ocata and there was never really any in-tree support | 17:21 |
| gmaan | tkajinam: for quota concurrent tests, I think you can use LockFixture like tempest is doing https://github.com/search?q=repo%3Aopenstack%2Ftempest%20compute_quotas&type=code | 17:21 |
| gmaan | doing it via stestr is also possible, but that will be more complicated: it means grouping them and running them in serial mode | 17:23 |
| sean-k-mooney | ya porting the serial decorator is one approach | 17:25 |
| *** | mtreinish_ is now known as mtreinish | 17:30 |
| opendevreview | Dominik proposed openstack/nova-specs master: Repurpose NUMA Topology with Resource Providers for 2026.2 https://review.opendev.org/c/openstack/nova-specs/+/978570 | 17:39 |
| *** | mtreinish_ is now known as mtreinish | 17:53 |
| Zhan[m] | Hi friends, I have a quick question. Does Nova allow launching/live-migrating VMs to a host with nova-compute disabled? For example, if I want to do a QC before putting the host back to service, I can launch a VM on there with elevated privileges (e.g., using admin credential), make sure everything works, and then finally enable nova-compute. | 18:29 |
| sean-k-mooney | Zhan[m]: no. disabling a compute node exists to prevent scheduling to the host | 18:31 |
| sean-k-mooney | we enforce this with a forbidden trait via a placement prefilter | 18:31 |
| sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter.py#L243-L255 | 18:33 |
| Zhan[m] | Yeah I saw that earlier too. So the only way is to enable it, launch a VM, disable it, make sure everything's correct, and then re-enable it? | 18:34 |
| sean-k-mooney | normally i would suggest using something else, like a required trait to isolate the host or the tenant affinity filter for example | 18:34 |
| sean-k-mooney | i.e. test it by adding the host to a host aggregate that can only be used by the admin project | 18:35 |
| sean-k-mooney | https://docs.openstack.org/nova/latest/admin/aggregates.html#tenant-isolation-with-placement | 18:35 |
| sean-k-mooney | so restrict it to your admin project, re-enable it, test it, and then either remove it from the aggregate or disable it again to do further maintenance | 18:36 |
| Zhan[m] | ahh I see, something like a "maintenance" aggregate where only admin can access | 18:36 |
| sean-k-mooney | yep exactly | 18:36 |
| Zhan[m] | I see, thanks! | 18:37 |
| sean-k-mooney | having a third state (enabled, disabled and maintenance) would perhaps have been a cleaner way to do that, but you can emulate that with aggregates | 18:38 |
| Zhan[m] | I think having a third state makes sense too, but both approaches solve the same problem ultimately. Wondering if that can be something to discuss and maybe implement, or are aggregates still the recommended way to go? | 18:40 |
| Zhan[m] | It would be good to put this info somewhere, I can imagine others having this question too. | 18:43 |
| dansmith | being able to target disabled hosts seems reasonable to me, without requiring the aggregate thing | 18:47 |
| dansmith | like, if an admin has disabled a host to prevent regular users from scheduling to it, | 18:47 |
| dansmith | but has (of course) the permission to target hosts specifically either with a boot or a migration, | 18:48 |
| dansmith | it seems totally reasonable to ignore the disabled check for that case | 18:48 |
| dansmith | like if I need to disable a failing host, then send a build to it to see if I've fixed it before I let others go there | 18:48 |
| dansmith | or I disable a host to prevent it from filling up so I can emergency migrate instances to it to escape a failing node | 18:48 |
| Zhan[m] | yeah I was actually thinking about replacing the ComputeFilter by a custom filter that will say OK to a disabled host if it's admin and state is up and destination is explicitly specified, until I found that mandatory pre-filter. | 18:49 |
| dansmith | yeah, the pre-filter could do the same | 18:49 |
| Zhan[m] | yeah, I think we can do something in nova that does "if admin (or policy-allowed-role) and if host is specified and if state is up", remove that specific pre-filter? | 18:53 |
| Zhan[m] | imo a third state or this approach both seem fine, just that the former may need more changes | 18:53 |
| sean-k-mooney | so it used to be possible before the pre-filter with --force | 18:58 |
| sean-k-mooney | but that was because that bypassed everything | 18:58 |
| sean-k-mooney | including the status check, provided the compute node existed | 18:59 |
| Zhan[m] | yeah probably not a good idea... | 18:59 |
| Zhan[m] | Any thoughts? Maybe I should do a bug report or a spec? | 19:02 |
| sean-k-mooney | probably a small spec, as it's sort of an api change. it is an api change if we had a new state; it's technically a policy change and/or config option if we take the other approach | 19:03 |
| sean-k-mooney | so not a bug | 19:03 |
| Zhan[m] | okie dokie, I'll work on it and list all approaches | 19:04 |
| sean-k-mooney | but it's not a super invasive feature given it would be highly restricted to admins by default | 19:04 |
| Zhan[m] | definitely, if we add a policy then the role will likely be admin by default too. but it's good to have it configurable as folks may create a dedicated role for maintenance | 19:05 |
| sean-k-mooney | ya i was debating if the manager role would make sense but i don't think so. but if we have a new policy rule and default it to admin then operators can always configure it to the role they prefer | 19:07 |
| sean-k-mooney | one thing that is not clear to me is if admin should just always bypass the check or if this should be opt in, either by updating the role or a config option | 19:08 |
| sean-k-mooney | for cross cell migrate we don't even allow admins to do that by default | 19:08 |
| sean-k-mooney | they have to enable it via custom policy if they want to support that in the cloud, so this could be the same | 19:08 |
| Zhan[m] | I would advocate for including admin in the new policy (default admin too) instead of making it always bypass, as it's just very clear & straightforward, people won't need to guess or look at code to know "ok admin can do this". | 19:15 |
| Zhan[m] | I'll put both options in the spec regardless | 19:15 |
| *** | erlon3 is now known as erlon | 19:51 |
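
The rule floated in the disabled-host discussion above ("if admin (or policy-allowed-role) and if host is specified and if state is up") can be sketched as a small pure function. This is only an illustration of the decision logic under stated assumptions: `HostState`, `Request`, `host_allowed` and `bypass_roles` are hypothetical names invented here, not Nova objects, policy rules or config options, and the real change would live in the placement pre-filter and be defined by the spec.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class HostState:
    """Hypothetical view of a compute host; not a Nova object."""
    name: str
    disabled: bool  # operator disabled the nova-compute service
    up: bool        # the service is reporting in


@dataclass(frozen=True)
class Request:
    """Hypothetical view of a boot or migration request."""
    roles: FrozenSet[str]
    requested_host: Optional[str] = None  # set only when a host is targeted


def host_allowed(request, host, bypass_roles=frozenset({"admin"})):
    """Return True if the request may land on the host.

    bypass_roles stands in for a configurable policy rule (assumed to
    default to admin) that lets a role target a disabled host.
    """
    # A host whose service is down is never a candidate.
    if not host.up:
        return False
    # Enabled hosts follow normal scheduling; nothing to bypass.
    if not host.disabled:
        return True
    # Disabled host: allow only an explicit target by a permitted role.
    targets_host = request.requested_host == host.name
    has_role = bool(request.roles & bypass_roles)
    return targets_host and has_role
```

With these assumptions, `host_allowed(Request(frozenset({"admin"}), "cmp1"), HostState("cmp1", disabled=True, up=True))` is allowed, while the same request without a requested host, or from a role outside `bypass_roles`, is rejected. Passing a different `bypass_roles` set models an operator pointing the policy at a dedicated maintenance role.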