bauzas | happy spec review day, everyone :) | 07:50 |
---|---|---|
gibi | I did my first pass on the spec. I skipped re-proposes as those probably don't need much discussion anyhow | 09:21 |
gibi | s/spec/specs/ | 09:22 |
bauzas | gibi: I still need to do my duty, still working on f... vgpus | 09:28 |
* gibi sending good vibes | 09:29 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: [libvirt]Add migration_inbound_addr https://review.opendev.org/c/openstack/nova/+/900203 | 09:35 |
opendevreview | Balazs Gibizer proposed openstack/nova master: DNM: test hostname based migration URL https://review.opendev.org/c/openstack/nova/+/900275 | 09:38 |
opendevreview | Sylvain Bauza proposed openstack/nova master: WIP: Support to pass PFs in device_addresses https://review.opendev.org/c/openstack/nova/+/900251 | 09:42 |
bauzas | there we go, I'm done | 09:42 |
bauzas | I'll just await sean-k-mooney and dansmith to discuss that | 09:43 |
dvo-plv | bauzas, Hello. I'm currently resolving your comments in the spec file. I have a question regarding dependencies | 10:46 |
bauzas | dvo-plv: I need to go off in a few mins | 10:47 |
dvo-plv | Sure, I will update all that I can, and we can discuss the rest later | 10:47 |
bauzas | dvo-plv: ack, I'll ping you this afternoon so | 10:49 |
opendevreview | Danylo Vodopianov proposed openstack/nova-specs master: Re-propose using VirtIO PackedRing Configuration support for 2024.1 https://review.opendev.org/c/openstack/nova-specs/+/895924 | 10:49 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Add documentation on server groups https://review.opendev.org/c/openstack/nova/+/899979 | 12:18 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova-specs master: [spec] Add Cross-AZ scheduling blueprint https://review.opendev.org/c/openstack/nova-specs/+/900296 | 14:06 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova-specs master: [spec] Add Cross-AZ scheduling blueprint https://review.opendev.org/c/openstack/nova-specs/+/900296 | 14:12 |
noonedeadpunk | I guess I'm a bit late for the spec review day, sorry :( | 14:13 |
noonedeadpunk | had some off time during previous week so wasn't able to work on it then | 14:13 |
bauzas | noonedeadpunk: no worries, we plan to do a second round of specs in a next review day | 14:20 |
bauzas | like mine, which I haven't had time to write yet :) | 14:20 |
dvo-plv | bauzas, Hello, I get your latest comments | 14:24 |
dvo-plv | https://review.opendev.org/c/openstack/nova-specs/+/895924/6/specs/2024.1/approved/virtio_packedring_configuration_support.rst#154 The patches you mentioned are from another blueprint of ours | 14:25 |
dvo-plv | I believe that patches are grouped by topic | 14:25 |
dvo-plv | Should I add links to all patches in the spec file ? | 14:26 |
bauzas | dvo-plv: hey | 14:26 |
bauzas | dvo-plv: I don't particularly want to track the implementation patches in a spec, I'd prefer that you write something in the spec saying that the nova changes depend on neutron and os-vif changes that are in flight | 14:27 |
opendevreview | John Garbutt proposed openstack/nova-specs master: WIP: Add spec for PCI Groups https://review.opendev.org/c/openstack/nova-specs/+/899719 | 14:27 |
dvo-plv | I see, okay I will do it | 14:28 |
bauzas | I see https://review.opendev.org/q/topic:bp%252Fvirtio-packedring-configuration-support | 14:28 |
bauzas | does that mean anything is required on the neutron side ? | 14:28 |
bauzas | nothing* | 14:28 |
bauzas | dvo-plv: ^ | 14:28 |
dvo-plv | no, this feature is related to nova | 14:28 |
bauzas | I'm actually confused | 14:28 |
bauzas | nova will pick the computes that are recent enough, gotcha | 14:29 |
bauzas | but | 14:29 |
dvo-plv | it's about acceleration for qemu | 14:29 |
bauzas | oh I see | 14:30 |
bauzas | so we're just exposing the feature as a trait | 14:30 |
dvo-plv | yes | 14:30 |
bauzas | okay, then nevermind my comment on dependencies, but there are still some things to write about testing :) | 14:30 |
bauzas | I'll update | 14:31 |
bauzas | dvo-plv: I may propose an alternative, I'm not particularly fond of adding another trait for something QEMU-only | 14:32 |
bauzas | we already add a lot of those | 14:33 |
dvo-plv | Maybe we should leave it as is. We already did a lot of work and testing here to implement this | 14:34 |
bauzas | dvo-plv: I still don't get why we need a trait, that's the point | 14:40 |
bauzas | https://docs.openstack.org/nova/latest/reference/libvirt-distro-support-matrix.html#min-libvirt-qemu-version-and-next-min-libvirt-qemu-version-table | 14:40 |
bauzas | it says here that Antelope libvirt min is 6.0 | 14:40 |
bauzas | and Bobcat 7.0 | 14:40 |
bauzas | so basically, the problem we need to solve is only with Antelope computes or I'm stupid | 14:41 |
dvo-plv | For the scheduling process, when the operator upgrades OpenStack | 14:42 |
bauzas | yeah, I got it, but look at that table | 14:42 |
dvo-plv | in the case of live migration, when one node works with packed ring and another does not | 14:42 |
dvo-plv | when the user updates the system | 14:42 |
bauzas | sure, but Bobcat (and Caracal) require libvirt 7.0 as min | 14:43 |
bauzas | so you're guaranteed that libvirt is newer than 6.7, which is what you want | 14:43 |
bauzas | the only case we need to solve is a very particular case, which is when the operator has a rolling upgrade environment on a SLURP cadence, ie. Caracal controllers and computes + a certain number of Antelope computes | 14:44 |
bauzas | in that specific case, we don't wanna land on an Antelope compute, for sure | 14:44 |
dvo-plv | but when a user has antelope and does a system update, they could face a situation where one compute has antelope with an old qemu and another is already on bobcat. to prevent a wrong migration we will filter nodes with the trait | 14:45 |
bauzas | this is why I'm proposing to add a prefilter that would only check the compute version | 14:45 |
bauzas | how would you do that for antelope computes since you can't backport that detection code ? | 14:45 |
bauzas | what I'm suggesting is to ask this feature to be Bobcat min | 14:46 |
bauzas | I know I approved https://review.opendev.org/c/openstack/os-traits/+/876069 but I didn't understand this was just for decorating a libvirt version | 14:47 |
dvo-plv | it would require deleting it from os-traits, rewriting the nova code https://review.opendev.org/c/openstack/nova/+/876075 and glance | 14:49 |
dvo-plv | rewrite spec file | 14:49 |
dvo-plv | or we can move forward with it and not limit ourselves to bobcat | 14:50 |
bauzas | sure, but we'll keep forever some trait support for something we won't need in the very near future, ie. D :) | 14:51 |
bauzas | this is the exact reason why we do specifications, in order to avoid such confusions | 14:51 |
bauzas | and again, I don't get how antelope computes will magically decorate them with a trait, given the code isn't there | 14:52 |
dvo-plv | when user starts to update the system | 14:52 |
bauzas | dvo-plv: let me explain | 14:53 |
dvo-plv | they could have a situation where one compute is old, another has the latest release, and the vm was run on the latest; to avoid issues with qemu we could filter with the trait | 14:53 |
bauzas | dvo-plv: in https://review.opendev.org/c/openstack/nova/+/876075/25/nova/virt/libvirt/driver.py#9046 | 14:53 |
bauzas | you're adding a trait | 14:53 |
bauzas | and in the conditional below, you're decorating the compute with that trait | 14:54 |
bauzas | so, indeed, Caracal computes will be decorated with that trait (since libvirt min is enough), I don't disagre | 14:55 |
bauzas | disagree* | 14:55 |
bauzas | now, Bobcat computes won't contain that code, right? | 14:55 |
bauzas | and Antelope ones, too, right? | 14:55 |
dvo-plv | yes | 14:55 |
bauzas | for Bobcat, this isn't a problem : since the min is recent enough, you're good | 14:58 |
bauzas | for Antelope, indeed, depending on the OS version, you may or may not have the right libvirt version | 14:58 |
bauzas | but unfortunately Antelope computes can't decorate themselves with https://review.opendev.org/c/openstack/nova/+/876075/25/nova/virt/libvirt/driver.py since they don't have that code running | 15:00 |
dvo-plv | yes, and in case an Antelope compute has an old qemu, it will help avoid a wrong qemu command | 15:01 |
bauzas | dvo-plv: so again, why do we need to use a trait if we just say "we will only send that instance to Bobcat and newer computes" ? | 15:02 |
bauzas | we could also do something in the API like https://github.com/openstack/nova/blob/b64ecb0cc776bd3eced674b0f879bb23c8a4b486/nova/compute/api.py#L284-L312 | 15:05 |
bauzas | and return some HTTP40x exception if not all the computes are Bobcat or newer | 15:05 |
dvo-plv | So, you suggest removing the trait and adding a check somewhere in the scheduler so that, if packed ring was requested and the release of the target node is equal to or higher than Bobcat, we approve live migration to this node ? | 15:07 |
bauzas | dvo-plv: that's a good question, we should either restrict the placement query to >=Bobcat computes (but I don't know if we have a construct like this) or we could just have some API check that would ensure that all computes are above or equal Bobcat version, like the link I passed to you | 15:11 |
dvo-plv | I see your point and it sounds correct | 15:12 |
bauzas | dvo-plv: I'm not telling you to drop the libvirt additions you made in https://review.opendev.org/c/openstack/nova/+/876075 | 15:13 |
bauzas | and nothing changes on the UX side (image property or flavor extraspec) | 15:13 |
bauzas | that's just the scheduling part | 15:14 |
bauzas | and we could even lift that restriction in the API once we arrive in D timeframe, since we'll only support N-1 upgrade | 15:15 |
dvo-plv | But maybe we could also invite sean-k-mooney and gibi, who also took part in the spec design, to make a common decision on this topic and choose the new direction, so I can investigate how and where to implement it | 15:15 |
bauzas | sure | 15:15 |
bauzas | dvo-plv: just my guess, was that spec originally proposed in Antelope ? | 15:15 |
bauzas | if so, the libvirt check would have sounded more important to me at that time | 15:16 |
bauzas | but the fact is, we bumped our mins in Bobcat, so that simplifies the logic :) | 15:16 |
dvo-plv | i'm not familiar with release date, I thought that it was before Zed release May 03 2023 https://review.opendev.org/c/openstack/nova-specs/+/868377 | 15:17 |
bauzas | I need to disappear for 20 mins, but I can read | 15:18 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.1: add a regression test for all compute RPCAPI 6.x pinnings for rebuild https://review.opendev.org/c/openstack/nova/+/900306 | 15:28 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/zed: add a regression test for all compute RPCAPI 6.x pinnings for rebuild https://review.opendev.org/c/openstack/nova/+/900307 | 15:28 |
dvo-plv | sean-k-mooney, gibi | 15:29 |
dvo-plv | So, in conclusion, we have two open questions. How should we proceed with scheduling: trait or release filter? | 15:29 |
dvo-plv | Is a tempest test required from us ? | 15:29 |
bauzas | noonedeadpunk: thanks for proposing the backports, hadn't time yet | 15:47 |
bauzas | reminder: nova meeting in 13 mins here. | 15:47 |
noonedeadpunk | no worries. I will push rest on top in couple of mins | 15:47 |
bauzas | dvo-plv: I'm not saying a tempest test is mandatory, just a preference | 15:48 |
bauzas | that said, at least a functional test seems needed | 15:48 |
bauzas | shit, forgot the meeting was in another impromptu call :( | 16:02 |
bauzas | #startmeeting | 16:02 |
opendevmeet | bauzas: Error: A meeting name is required, e.g., '#startmeeting Marketing Committee' | 16:02 |
elodilles | o/ | 16:02 |
bauzas | #startmeeting nova | 16:02 |
opendevmeet | Meeting started Tue Nov 7 16:02:15 2023 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:02 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:02 |
opendevmeet | The meeting name has been set to 'nova' | 16:02 |
elodilles | o/ | 16:02 |
bauzas | #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting | 16:02 |
bauzas | (this will be live, I haven't updated the wiki yet) | 16:02 |
dansmith | o/ | 16:03 |
gibi | o/ | 16:03 |
bauzas | there, let's start | 16:04 |
bauzas | #topic Bugs (stuck/critical) | 16:04 |
bauzas | #info No Critical bug | 16:04 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 32 new untriaged bugs (-4 since the last meeting) | 16:04 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:05 |
bauzas | Uggla_: any bug you wanted to tell us ? | 16:05 |
bauzas | anyway, let's move on | 16:05 |
bauzas | elodilles: fancy taking the baton ? | 16:06 |
elodilles | bauzas: yepp | 16:06 |
elodilles | i can take it :) | 16:06 |
bauzas | cool thanks | 16:06 |
bauzas | #info bug baton is elodilles | 16:07 |
bauzas | elodilles: ++ | 16:07 |
Uggla_ | oh | 16:08 |
Uggla_ | yes | 16:08 |
bauzas | #topic Gate status | 16:08 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:08 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fnova&project=openstack%2Fplacement&pipeline=periodic-weekly Nova&Placement periodic jobs status | 16:08 |
Uggla_ | https://bugs.launchpad.net/nova/+bug/2039381 | 16:08 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:08 |
bauzas | Uggla_: oh sorry we moved to the gate section, can we discuss your bug in the open discussion then ? | 16:09 |
dansmith | melwitt has a fix up for the vnc thing.. I've asked her a question, but assume we'll get that on the way soon once she's around | 16:09 |
bauzas | fwiw, all periodics are in green | 16:09 |
Uggla_ | yep sure | 16:10 |
bauzas | dansmith: oh, nice to hear, any change I could dent ? | 16:10 |
dansmith | sec | 16:10 |
dansmith | https://review.opendev.org/c/openstack/grenade/+/900257 | 16:11 |
bauzas | dansmith: cool, CCing it | 16:11 |
bauzas | oh that's why it's failing intermittently | 16:12 |
bauzas | depending on the host you land | 16:12 |
bauzas | I nacked the idea when gibi suggested it because I was too lazy to look at the job and I just trusted the fact we got green runs | 16:12 |
bauzas | what a shame | 16:12 |
dansmith | bauzas: that's what I'm wondering | 16:13 |
dansmith | seems like we should be failing a lot more though, | 16:13 |
dansmith | and why this has become a thing all of a sudden also seems weird if it's been broken like this for a while | 16:13 |
dansmith | but otherwise yeah, hopefully an easy fix | 16:13 |
bauzas | because yesterday we merged my patch that added the servers actions checks in grenade | 16:13 |
bauzas | this test was flakey but never run, and I just opened the can of worms | 16:14 |
dansmith | but I think it has failed on other patches, not just yours | 16:14 |
bauzas | well, then the vnc check itself could be present on other tempest tests | 16:15 |
bauzas | or we could run the server actions list in other jobs, rather | 16:15 |
bauzas | anyway | 16:15 |
bauzas | moving on | 16:16 |
bauzas | #topic Release Planning | 16:16 |
bauzas | #link https://releases.openstack.org/caracal/schedule.html | 16:16 |
bauzas | #info Nova deadlines will be proposed in the schedule above | 16:16 |
bauzas | I just need to file a release patch with the correct dates :) | 16:17 |
bauzas | #info Caracal-1 milestone in 1 week | 16:17 |
bauzas | #info Spec review day today | 16:17 |
bauzas | I think I made a correct round of reviews but there are still some specs I haven't looked at yet | 16:17 |
bauzas | I'll continue my duty until EOB | 16:18 |
dansmith | any list of specs just waiting for a +W? | 16:18 |
bauzas | sure | 16:19 |
opendevreview | Steven Relf proposed openstack/nova master: Adding basic auth to dynamic vendordata api calls https://review.opendev.org/c/openstack/nova/+/900252 | 16:19 |
gibi | there is a set of reproposals that are easy wins | 16:19 |
bauzas | dansmith: you mean a second +2 or just a plain +W ? :) | 16:19 |
dansmith | bauzas: second review I mean? | 16:19 |
bauzas | https://review.opendev.org/q/(project:openstack/nova-specs)+status:open+NOT+owner:self+NOT+label:Workflow%253C%253D-1+label:Verified%253E%253D1%252Czuul+NOT+reviewedby:self+is:mergeable+NOT+label:Code-Review%253C%253D-1%252Cnova-core+label:Code-Review%253E%253D2 | 16:19 |
bauzas | not sure the link will render correctly | 16:19 |
bauzas | but tl;dr: dansmith you have the sole spec that needs a second +2 :) | 16:20 |
bauzas | and yeah, there are reproposals on the way | 16:20 |
bauzas | btw. I wonder why they don't show up | 16:20 |
dansmith | bauzas: ah bummer.. maybe gibi can circle back on that .. that's the only one I can't finish :) | 16:21 |
bauzas | https://review.opendev.org/q/project:openstack/nova-specs+status:open+label:Code-Review%253E%253D2 | 16:21 |
dansmith | but yeah I'll look at the re-proposals | 16:21 |
bauzas | sorry, my dash link seems to be wrong | 16:21 |
bauzas | fwiw, had no time yet to correctly write a mdev live-migration spec | 16:21 |
auniyal6 | on CI bugs, nova-emulation job is failing too (I wanted to check if it's failing for more patches, but builds page is not opening for me right now ) | 16:22 |
bauzas | I'll bug folks directly :) | 16:22 |
sean-k-mooney | i can also try and loop back | 16:22 |
sean-k-mooney | ill be doing more review later today | 16:22 |
sean-k-mooney | dansmith: on your spec | 16:22 |
* gibi will check back on the device alias | 16:22 | |
johnthetubaguy | (I keep trying to do the ironic re-proposal, but I keep getting distracted, sorry) | 16:22 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.1: Fix rebuild compute RPC API exception for rolling-upgrades https://review.opendev.org/c/openstack/nova/+/900336 | 16:22 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.1: Adding server actions tests to grenade-multinode https://review.opendev.org/c/openstack/nova/+/900337 | 16:22 |
bauzas | gibi: you left a +1 because of Uggla_'s concern but AFAICR, it was resolved | 16:22 |
bauzas | so I just went +2 | 16:23 |
gibi | bauzas: OK, thanks | 16:23 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.2: add a regression test for all compute RPCAPI 6.x pinnings for rebuild https://review.opendev.org/c/openstack/nova/+/900309 | 16:23 |
bauzas | okay, moving on | 16:23 |
bauzas | dvo-plv wanted us to do a group discussion on https://review.opendev.org/c/openstack/nova-specs/+/895924/ since I asked to *not* use a trait but let's not discuss this now | 16:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.2: Fix rebuild compute RPC API exception for rolling-upgrades https://review.opendev.org/c/openstack/nova/+/900338 | 16:24 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/2023.2: Adding server actions tests to grenade-multinode https://review.opendev.org/c/openstack/nova/+/900339 | 16:24 |
bauzas | if people want, we could discuss this during open discussion | 16:24 |
bauzas | (or just read my gerrit comments, you'll get my point, which is please avoid adding traits for just a libvirt version check that's already supported for all computes except Antelope and olders) | 16:25 |
bauzas | anyway, moving on | 16:25 |
dansmith | yeah | 16:26 |
dansmith | needs to be capability-based, IMHO | 16:26 |
dansmith | "supports X" not "is version X" | 16:26 |
sean-k-mooney | yep | 16:27 |
sean-k-mooney | so support for the virtio-packed format | 16:27 |
sean-k-mooney | is a capability | 16:27 |
bauzas | I said we could resolve that with a service version check | 16:27 |
sean-k-mooney | not a version check | 16:27 |
sean-k-mooney | so it should be a trait | 16:27 |
sean-k-mooney | and we should not need a compute service check | 16:27 |
bauzas | sean-k-mooney: that's not exactly what's written in both code and spec | 16:27 |
sean-k-mooney | as this is not a feature that is implemented at the compute manager level | 16:27 |
bauzas | the requirement is "is libvirt >6.7" | 16:27 |
sean-k-mooney | right that is the requirement for the feature to work | 16:28 |
bauzas | since bobcat supports 7.0, we're good | 16:28 |
sean-k-mooney | but it is modeled by reporting a trait for the host where the capability is supported | 16:28 |
sean-k-mooney | yep | 16:28 |
sean-k-mooney | but we now need to support n-2 and different oss | 16:28 |
sean-k-mooney | *operating systems | 16:29 |
bauzas | so it only leaves the rolling upgrade case with a caracal env + an antelope node | 16:29 |
bauzas | which will no longer be a problem with D | 16:29 |
sean-k-mooney | well this feature like many others can be documented as only supported after a full upgrade is complete | 16:29 |
bauzas | so again, I'm very against adding a trait for modeling a libvirt version that's gonna be minimum for all our computes next cycle anyway | 16:29 |
sean-k-mooney | bauzas: that's not what it's modeling | 16:29 |
sean-k-mooney | and it's required in my view for as long as we have any virt driver (other than ironic and libvirt) that supports vms | 16:30 |
bauzas | https://review.opendev.org/c/openstack/nova/+/876075/25/nova/virt/libvirt/driver.py | 16:30 |
bauzas | that exactly models the libvirt version | 16:30 |
dansmith | not really | 16:31 |
sean-k-mooney | it's the only way to detect it because libvirt does not have a way to detect it without a version check | 16:31 |
sean-k-mooney | but | 16:31 |
sean-k-mooney | since we now have the min version requirement | 16:31 |
dansmith | it's exposing whether or not the feature is supported, but the way it does that is by the libvirt version internally | 16:31 |
sean-k-mooney | this would be a static trait that is exposed by computes using the libvirt driver | 16:31 |
dansmith | that's different than exposing a trait of the actual version | 16:31 |
sean-k-mooney | dansmith: exactly | 16:31 |
bauzas | sean-k-mooney: then I'd suggest a prefilter that would only ensure that we land on a libvirt host | 16:31 |
dansmith | the prefilter can be the trait though | 16:32 |
sean-k-mooney | bauzas: i believe the prefilter was already in the spec | 16:32 |
sean-k-mooney | bauzas: it's something i have definitely discussed for this feature previously | 16:32 |
bauzas | yeah, but again, I don't like us adding yet another trait for this | 16:32 |
sean-k-mooney | bauzas: i believe this is exactly what traits should be used for | 16:32 |
dansmith | but this is what we do right? all the other traits are just like this I think | 16:32 |
sean-k-mooney | dansmith: more or less | 16:33 |
bauzas | then, let's just provide this trait without a conditional | 16:33 |
sean-k-mooney | bauzas: yes i also said that in my comments | 16:33 |
sean-k-mooney | and i said this last cycle too when we origianlly proposed it | 16:33 |
bauzas | if it's really about directing to libvirt nodes | 16:33 |
dansmith | we can do that if we hard-require the min version | 16:33 |
sean-k-mooney | bauzas: basically the context you're missing is we had previously agreed that if we enforced the min version we would update the spec depending on which feature landed first | 16:34 |
sean-k-mooney | kashyap's min version bump | 16:34 |
sean-k-mooney | or this feature | 16:34 |
sean-k-mooney | in the reproposal it should be updated to drop the check because the min version bump already happened in bobcat | 16:34 |
dansmith | so to be clear, we're already hard-requiring the min version needed for this, and thus the trait is just for the upgrade case where we might have old computes? | 16:35 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/zed: Fix rebuild compute RPC API exception for rolling-upgrades https://review.opendev.org/c/openstack/nova/+/900341 | 16:35 |
opendevreview | Dmitriy Rabotyagov proposed openstack/nova stable/zed: Adding server actions tests to grenade-multinode https://review.opendev.org/c/openstack/nova/+/900342 | 16:35 |
bauzas | what *you* missed is that I figured this out before the meeting (that spec is older than our min bump) but I missed the fact we want to direct to libvirt-only nodes, hence my mistake | 16:35 |
sean-k-mooney | dansmith: upgrade case or where you're mixing virt drivers | 16:35 |
sean-k-mooney | dansmith: but our current min is 7.0.0 i believe and the feature is in 6.x | 16:35 |
dansmith | sean-k-mooney: okay libvirt and ironic being the only possibilities there :) | 16:35 |
bauzas | dansmith: the former (upgrade case) was my concern | 16:35 |
bauzas | sean-k-mooney: it requires 6.7 | 16:35 |
dansmith | so I mean, I would probably lean towards just a service version check to not let this work until everything is upgraded, but the multi-virt driver thing is a fair point | 16:36 |
sean-k-mooney | bauzas: yep i said 6.x because i know we met the min requirement with our min supported version | 16:36 |
bauzas | I don't want us to buy a new trait for something that's only because we support N-2 this cycle | 16:36 |
dansmith | and even though we're dropping basically all the others, we could have a new one in the future without this support, so... | 16:36 |
sean-k-mooney | dansmith: well we don't need a new compute service version for this feature in general | 16:36 |
sean-k-mooney | although i guess we could do one for the new prefilter | 16:37 |
bauzas | dansmith: so you're on the same page as me, a service version check for upgrades is enough, but if we really want to avoid other hypervisors we could need a trait | 16:37 |
dansmith | sean-k-mooney: we don't need it, but we could use it (cheaper than a trait) for the usual purpose of not exposing features until everything is upgraded | 16:37 |
dansmith | bauzas: no, I said I lean that way, but I'm also fine with a trait because of the virt driver possibility | 16:37 |
sean-k-mooney | dansmith: oh i disagree, i always considered a compute version bump more expensive or at most the same | 16:37 |
dansmith | sean-k-mooney: well, we disagree then.. we bump service versions all the time for stuff like this, where there isn't even a rpc bump to correlate | 16:38 |
dansmith | and it is a single integer in one tree versus a new enum in traits, a package release, a dep update, and then a nova patch | 16:38 |
bauzas | tbh, I'd rather prefer having the prefilter asking 'get me a libvirt compute' rather than 'get me a compute that supports foo' since this feature is very QEMU-centric | 16:38 |
bauzas | sean-k-mooney: the fact is, this service version check can drop next cycle | 16:39 |
sean-k-mooney | bauzas: that would be a misuse of traits | 16:39 |
bauzas | sean-k-mooney: starting with D, all computes will support a libvirt recent enough | 16:39 |
dansmith | sean-k-mooney: how is that a misuse of traits? | 16:39 |
sean-k-mooney | if we want to also have a service version bump and an additional check in the api prior to the call to the scheduler we can | 16:39 |
sean-k-mooney | dansmith: we previously said we didn't want to have a trait for which virt driver is in use | 16:39 |
bauzas | if two virt drivers were proposing the same feature, then yah a trait sounds good to me | 16:39 |
sean-k-mooney | we can revert that if we want but i know there was pushback to that in the past | 16:40 |
bauzas | but here that capability is purely qemu-based | 16:40 |
sean-k-mooney | bauzas: so is the only reason you're taking this stance because we did the min libvirt bump last cycle | 16:40 |
dansmith | sean-k-mooney: we have traits for which type of hardware is on the host, this seems similar, but we can also filter on hypervisor type from host state anyway right? so we can do it without a trait | 16:40 |
bauzas | say some other feature does the same, we gonna add another trait and another prefilter | 16:40 |
sean-k-mooney | bauzas: yes we should anytime there is a capability that is not supported by all supported drivers | 16:41 |
bauzas | and boom, traits explosion, plus the fact we yet again push hypervisor features upfront | 16:41 |
sean-k-mooney | bauzas: traits are cheap and placement was built to deal with many of them | 16:41 |
dansmith | honestly, this particular feature is probably not critical, right? meaning: | 16:41 |
bauzas | service version checks that can be dropped next cycle are cheaper IMHO and, | 16:42 |
bauzas | no traits is cheaper than a single one :)= | 16:42 |
dansmith | if we restrict to libvirt hosts with a filter, then it's easy for us to say in the reno that if you're not upgraded that feature request will not be honored by old computes that don't know about it | 16:42 |
dansmith | after the upgrade it's all good | 16:42 |
dansmith | this is an optimization not like "I *need* 32G of memory else please reject" | 16:42 |
bauzas | I just feel we can say in the notes 'please do what you need in case you have mixed hypervisors' | 16:43 |
sean-k-mooney | bauzas: i really dislike that direction as i feel our scheduler should ensure you land on a host that can support the requested feature without additional admin intervention | 16:43 |
sean-k-mooney | dansmith: in this particular case if we don't have the trait it will be non critical only in that on older hosts it will be ignored | 16:44 |
bauzas | only if you have mixed hypervisors, right? | 16:44 |
sean-k-mooney | bauzas: no even with a single hypervisor | 16:44 |
bauzas | and in that case, you probably already did the setup in order to shard your cloud | 16:44 |
dansmith | sean-k-mooney: right, it just seems like a best-effort sort of thing compared to some others | 16:44 |
bauzas | sean-k-mooney: still talking of the 'I want a compute with libvirt recent enough' then ? | 16:45 |
bauzas | again, doesn't sound to me worth adding a trait for this | 16:45 |
sean-k-mooney | dansmith: if it's a flavor extra spec we should always guarantee it is able to work on the selected host | 16:45 |
sean-k-mooney | so if we are fine with rejecting it on the compute node sure | 16:45 |
bauzas | anyway, I feel we're arguing right, but the time flies and we're on a meeting | 16:45 |
sean-k-mooney | I'm not ok with allowing the vm to boot | 16:45 |
bauzas | can we drop this until the end of this meeting | 16:46 |
bauzas | ? | 16:46 |
dansmith | yep | 16:46 |
sean-k-mooney | sure or to the spec review | 16:46 |
bauzas | and we could try to find a way forward just after | 16:46 |
bauzas | cool, moving on then | 16:46 |
bauzas | #topic Review Priorities | 16:47 |
bauzas | still an action item on me | 16:47 |
bauzas | I have to create an etherpad as we agreed at PTG | 16:47 |
bauzas | so I'll keep this bullet until I'm set | 16:47 |
dvo-plv | I have some issue with electricity, battery dead, so ping me and i will answer later when I will be back online | 16:47 |
bauzas | #action bauzas to create a tracking etherpad for this cycle | 16:48 |
bauzas | dvo-plv: np, we also captured this conversation in the meeting logs that'll show up once we end the meeting | 16:48 |
bauzas | #topic Stable Branches | 16:48 |
bauzas | elodilles: your time | 16:49 |
elodilles | #info stable gates don't seem blocked | 16:49 |
elodilles | (stable/victoria's nova-ceph-multistore looked suspicious as it failed with POST_FAILURE a couple of times, but it has passed now) | 16:49 |
elodilles | #info stable release patches proposed: https://review.opendev.org/q/project:openstack/releases+is:open+intopic:nova | 16:49 |
elodilles | last time we agreed to wait for some patches to merge ^^^ | 16:50 |
elodilles | feel free to update the patch when they are merged | 16:50 |
bauzas | yeah and noonedeadpunk proposed backports | 16:50 |
elodilles | ++ | 16:50 |
bauzas | related to the RPC fixes we wanted to land | 16:50 |
bauzas | so, we need reviews | 16:50 |
bauzas | I surely can review things I wrote | 16:50 |
elodilles | ACK, will review them then :) | 16:51 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:51 |
elodilles | and that's all from my side | 16:51 |
bauzas | elodilles: ping me if you need the change numbers | 16:51 |
bauzas | cool, thanks | 16:51 |
bauzas | rushing, we have little time left | 16:51 |
bauzas | #topic Open discussion | 16:51 |
bauzas | (gibi): Seeking specless approval for https://blueprints.launchpad.net/nova/+spec/libvirt-migrate-with-hostname-instead-of-ip based on the PTG discussion https://etherpad.opendev.org/p/nova-caracal-ptg#L815. I have a proposed impl https://review.opendev.org/c/openstack/nova/+/900203 | 16:52 |
gibi | so | 16:52 |
gibi | as we discussed on the PTG I would like to make it possible for the libvirt driver to use hostnames during cold migration instead of IP addresses | 16:52 |
noonedeadpunk | bauzas: fwiw, backport to Zed is failing unit tests | 16:52 |
noonedeadpunk | so some look there is appreciated | 16:52 |
gibi | the live migration already uses hostnames by default | 16:53 |
dansmith | gibi: ++ | 16:53 |
gibi | this specless bp proposes a new config option to make the new behavior opt-in | 16:53 |
bauzas | noonedeadpunk: I have a clue, but let's not discuss this now (I guess this is the rpc version check unittest that fails) | 16:53 |
dansmith | gibi: no new RPC or object stuff needed, just a change on which thing we advertise to the remote side right? | 16:53 |
sean-k-mooney | dansmith: we might need a db change | 16:53 |
gibi | no rpc, no object, just a config option for the libvirt driver | 16:53 |
dansmith | sean-k-mooney: why? | 16:53 |
gibi | sean-k-mooney: no DB change is needed afaik | 16:53 |
bauzas | and this is per compute? | 16:54 |
dansmith | gibi: cool, ++ for specless BP from me | 16:54 |
gibi | bauzas: this is per compute for the incoming migrations | 16:54 |
sean-k-mooney | don't we look up the remote system "connection" info via fields in the object | 16:54 |
sean-k-mooney | that are stored in the db | 16:54 |
gibi | it is in the migration object | 16:54 |
gibi | we store the IP there today | 16:54 |
gibi | as a string | 16:54 |
dansmith | but it's just a string, AFAIK | 16:54 |
sean-k-mooney | right, so will the content of that change? | 16:54 |
bauzas | yeah | 16:54 |
gibi | now it will be either IP or a hostname | 16:54 |
sean-k-mooney | ok that is what i was wondering about | 16:54 |
bauzas | and we said it was opt-in | 16:55 |
sean-k-mooney | ok, provided we don't assume that it's an ip anywhere today | 16:55 |
bauzas | so this is not a breaking upgrade change | 16:55 |
sean-k-mooney | and given its opt in | 16:55 |
gibi | it is opt-in; the default of the new config is to use my_ip as before | 16:55 |
sean-k-mooney | i think im fine with that as specless | 16:55 |
bauzas | yeah, so I'm in favor of approving it | 16:55 |
bauzas | any concerns ? | 16:55 |
sean-k-mooney | the only thing we need to do is document that you should not set the new config option until the cloud is fully upgraded | 16:56 |
sean-k-mooney | it will likely work with the hostname | 16:56 |
sean-k-mooney | and old nova | 16:56 |
gibi | I can add a reno | 16:56 |
sean-k-mooney | ack, then all good from me | 16:56 |
gibi | and amend the config doc with a warning | 16:56 |
dansmith | I bet it works even with old ones | 16:56 |
gibi | it depends | 16:57 |
sean-k-mooney | provided it's resolvable it likely will | 16:57 |
gibi | yeah | 16:57 |
sean-k-mooney | but it will depend on dns and /etc/hosts | 16:57 |
bauzas | I mean, that will depend on how the cloud is configured, but old computes can work | 16:57 |
bauzas | for sure | 16:57 |
sean-k-mooney | gibi: and just to clarify, it can be an fqdn, right? (ip, hostname or fqdn) | 16:57 |
sean-k-mooney | it being the new config option | 16:57 |
gibi | it can be an IP, it is defaulted to my_ip, it can be a user-defined FQDN or hostname, or it can be "%s" which will be replaced with the hostname of the node | 16:58 |
sean-k-mooney | +1 | 16:58 |
dansmith | yup | 16:58 |
bauzas | okay, if you don't mind, | 16:58 |
gibi | the last is the most useful for me in my deployment work | 16:58 |
bauzas | #agreed https://blueprints.launchpad.net/nova/+spec/libvirt-migrate-with-hostname-instead-of-ip is approved as a specless BP | 16:59 |
bauzas | we're on time | 16:59 |
bauzas | any last min question ? | 16:59 |
Uggla_ | yep | 16:59 |
Uggla_ | looking at https://bugs.launchpad.net/nova/+bug/2039381 | 16:59 |
Uggla_ | do you think it could be linked to a configuration (service token) issue ? | 16:59 |
bauzas | oh damn, forgot | 17:00 |
bauzas | I'll end the meeting now, we'll discuss this right after | 17:00 |
Uggla_ | ok | 17:00 |
bauzas | thanks all | 17:00 |
bauzas | #endmeeting | 17:00 |
opendevmeet | Meeting ended Tue Nov 7 17:00:48 2023 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 17:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-07-16.02.html | 17:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-07-16.02.txt | 17:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2023/nova.2023-11-07-16.02.log.html | 17:00 |
bauzas | dvo-plv: you have the logs there^ | 17:00 |
bauzas | Uggla_: so | 17:01 |
* bauzas looks | 17:01 | |
bauzas | Uggla_: did they do the service token thing ? | 17:02 |
bauzas | now that's mandatory ? | 17:02 |
Uggla_ | I don't know. But do we do it on devstack ? | 17:03 |
*** d34dh0r5- is now known as d34dh0r53 | 17:03 | |
bauzas | I bet we do | 17:04 |
bauzas | https://docs.openstack.org/nova/latest/admin/configuration/service-user-token.html | 17:04 |
bauzas | Uggla_: can you check ? | 17:05 |
gmann | bauzas: FYI, with the daylight saving change, the nova meeting time conflicts with my other appointment so I will not be able to join. If there is anything for me during the meeting, please ping me and I will reply later. | 17:05 |
bauzas | gmann: no worries | 17:05 |
bauzas | DST sucks | 17:05 |
gmann | yeah :) | 17:05 |
Uggla_ | bauzas, yes the section is part of my nova.conf (devstack). | 17:06 |
bauzas | dansmith: sean-k-mooney: very briefly because today is spec review day and you both have other urgencies, please note that none of https://review.opendev.org/q/topic:bug%252F2041519-new actually works since we don't delete existing Resource Providers | 17:07 |
bauzas | so we need some logic like https://review.opendev.org/c/openstack/nova/+/899625 in order to clean up the no-longer needed RPs | 17:07 |
bauzas | (talking of upgrades, for sure a greenfields deployment with no existing RPs will correctly create the only needed RPs) | 17:08 |
dansmith | bauzas: yeah understand we need to clean up the current situation | 17:09 |
bauzas | anyway, let's not discuss this for now, this is more a fyi | 17:09 |
bauzas | and we'll revisit that later | 17:09 |
elodilles | sean-k-mooney: when you'll have time could you please have a quick look at this pip-23.1-support / PBR related patch? https://review.opendev.org/c/openstack/releases/+/900053/7 | 17:11 |
clarkb | elodilles: see scrollback in #openstack-infra | 17:14 |
dvo-plv | bauzas, sean-k-mooney, dansmith: I read the logs and my conclusion is that we do not have a final decision. How should I manage this? Ping you tomorrow or add this topic to another meeting? | 17:33 |
bauzas | usually we let people chime in on Gerrit | 17:34 |
bauzas | IMHO we clarified exactly the usecases so all people should understand others' concerns | 17:35 |
elodilles | clarkb: looking | 17:35 |
dvo-plv | So I just need to wait for the discussion in the spec comments? | 17:36 |
bauzas | dvo-plv: fwiw, sean-k-mooney and I already commented | 17:46 |
bauzas | maybe dansmith could add his thoughts once he has time (later today or tomorrow) | 17:46 |
bauzas | in the meantime, this doesn't really change what you wrote, | 17:46 |
bauzas | only the scheduling part may be reworked | 17:47 |
dansmith | I thought we were going to have a discussion about it? | 17:47 |
bauzas | dansmith: you had other prios | 17:47 |
bauzas | if you have time, sean-k-mooney and I are debating on the need for a trait | 17:47 |
bauzas | I captured my thoughts in gerrit | 17:47 |
bauzas | I just can't wait for the next feature coming by and asking for the same | 17:47 |
bauzas | "hey, I wanna be sure we land on libvirt node" | 17:48 |
bauzas | so my question is, shall we create a trait for every single image property or flavor extra spec that matches a specific libvirt feature ? | 17:49 |
bauzas | (and a prefilter of course) | 17:49 |
dansmith | yeah, honestly for a thing that is just based on libvirt version, and especially a thing in a version from the past, I'd hate to get into the habit of doing a new trait for each of those things because it's such a heavyweight process | 17:50 |
dansmith | so I think I'm leaning more heavily towards a service version and a hypervisor filter for this kind of stuff | 17:50 |
bauzas | in the past, we left people to configure their clouds the way they wanted if they had mixed virt drivers | 17:50 |
dansmith | however, gibi also approved the spec with the trait, so might be good to get him to weigh in there | 17:50 |
bauzas | the spec never got approved, right? | 17:51 |
bauzas | oh my bad, no | 17:51 |
dansmith | bauzas: right the mixed hypervisor thing is definitely a concern, but this is also the sort of thing where a flavor isn't likely to work between ironic and libvirt anyway, | 17:51 |
dansmith | so I guess I'm less concerned about that specifically | 17:51 |
dansmith | bauzas: yeah, it's approved | 17:51 |
bauzas | in the past, we leaned towards saying it's the operator's responsibility to set up matching that works | 17:52 |
bauzas | we had aggregates | 17:52 |
bauzas | so I'm surprised that we now all go full steam on adding traits for such things | 17:52 |
dvo-plv | yeah, unfortunately I did not get a final decision on how I should move forward with my solution according to your thoughts | 17:53 |
dansmith | bauzas: yeah I mean sean-k-mooney's point of *not* making it "some assembly required" definitely resonates with me- I want us to do the right thing when we can | 17:53 |
dansmith | but a trait is pretty heavy for each "just need a new libvirt version" sort of thing, | 17:54 |
bauzas | dvo-plv: I'm sorry I went into trampling your previous approval, but I guess my disapproval is more about the pattern we follow than your own spec itself | 17:54 |
dansmith | and I do not think we assert that all the extra specs are schedulable | 17:54 |
bauzas | dansmith: yeah, so let's do some kind of automatic filter that would say 'I'm libvirt specific, please give me libvirt thingies' | 17:54 |
bauzas | if we reconsider the operator experience and the burden of setting aggregates | 17:55 |
dansmith | I gotta jump to my next thing, but yeah I think I agree | 17:56 |
bauzas | fwiw, we have filters that do mappings between flavors/images and aggregates | 17:57 |
bauzas | operators could just amend their 'libvirt' aggregates by adding an agg property equal to hw:virtio_packed_ring and problem is solved | 17:59 |
bauzas | but that's adding some operator burden, I don't disagree | 17:59 |
dvo-plv | it's okay, never mind | 17:59 |
sean-k-mooney | bauzas: i think a post-scheduler filter is far too heavyweight for this | 19:17 |
sean-k-mooney | we can do a min compute server version check if we really want | 19:17 |
sean-k-mooney | i feel like that is the middle ground we can all live with, but that means making this feature unavailable during the upgrade to Caracal | 19:18 |
sean-k-mooney | i think that is an ok requirement, but it's something we could have supported via the trait if we wanted to | 19:18 |
bauzas | you know what ? I'm tired of this, so let's just use the trait | 19:28 |
bauzas | I just won't +2 the spec | 19:30 |
gmann | sean-k-mooney: just in case you missed these virt driver deprecation backports https://review.opendev.org/q/topic:deprecate-virt-drivers+status:open | 21:11 |
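
The open-discussion item above describes the value space of the proposed `migration_inbound_addr` option: an IP, a user-defined hostname or FQDN, or the literal `"%s"`, which is replaced with the hostname of the node, with `my_ip` as the default. A minimal Python sketch of that resolution logic follows. It is not nova's implementation: the function names, the parameters and the exact-match handling of `"%s"` are assumptions made for illustration only.

```python
import ipaddress
import socket


def resolve_inbound_addr(option_value, my_ip, node_hostname=None):
    """Return the address string a compute might advertise for incoming migrations.

    Hypothetical helper: mirrors the behavior described in the discussion,
    not the code under review.
    """
    if node_hostname is None:
        # Fall back to the local hostname when the caller does not supply one.
        node_hostname = socket.gethostname()
    if not option_value:
        # Option unset: keep the pre-existing behavior and advertise my_ip.
        return my_ip
    if option_value == "%s":
        # Assumption: only the exact literal "%s" triggers hostname substitution.
        return node_hostname
    # Anything else (IP, hostname or FQDN) is passed through unchanged.
    return option_value


def is_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False
```

Because the migration object stores the address as a plain string, the advertised value may fail `is_ip`, which is why any consumer that assumes an IP would need checking before the option is enabled. Whether a hostname works on the remote side depends on DNS or `/etc/hosts`, as noted in the discussion.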