opendevreview | Merged openstack/nova stable/ussuri: Remove broken legacy zuul jobs https://review.opendev.org/c/openstack/nova/+/795374 | 02:14 |
---|---|---|
gibi | ganso: I have no +2 rights on stable branches | 07:03 |
opendevreview | Yongli He proposed openstack/nova master: Smartnic support - cyborg drive https://review.opendev.org/c/openstack/nova/+/771362 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - new vnic type https://review.opendev.org/c/openstack/nova/+/771363 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - create arqs https://review.opendev.org/c/openstack/nova/+/758944 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - build instance with smartnic arqs https://review.opendev.org/c/openstack/nova/+/798249 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - cleanup arqs https://review.opendev.org/c/openstack/nova/+/798054 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - reject server move and suspend https://review.opendev.org/c/openstack/nova/+/779913 | 08:03 |
opendevreview | Yongli He proposed openstack/nova master: smartnic support - functional tests https://review.opendev.org/c/openstack/nova/+/780147 | 08:03 |
lyarwood | sean-k-mooney: https://review.opendev.org/c/openstack/devstack/+/798514 btw | 08:32 |
lyarwood | https://bugs.launchpad.net/devstack/+bug/1933096 moved to devstack | 08:33 |
opendevreview | Lee Yarwood proposed openstack/nova master: zuul: Add CentOS 8 stream integrated compute tempest job to gate https://review.opendev.org/c/openstack/nova/+/797616 | 08:35 |
lyarwood | gibi: nova-live-migration doesn't look happy on master | 08:42 |
* lyarwood creates a gate-failure bug | 08:42 | |
lyarwood | Call _is_port_status_active returns false in 60.000000 seconds | 08:43 |
lyarwood | test_live_migration_with_trunk | 08:43 |
opendevreview | Lee Yarwood proposed openstack/nova master: zuul: Skip test_live_migration_with_trunk until bug #1933954 is fixed https://review.opendev.org/c/openstack/nova/+/798580 | 08:56 |
*** bhagyashris_ is now known as bhagyashris | 09:15 | |
stephenfin | sean-k-mooney: gibi: bauzas: When you're all around, I'd like to pick up discussion on that availability zone issue again (I got stuck in meetings after lunch yesterday) | 09:31 |
bauzas | stephenfin: I'm working on some A100 vGPU test... :( | 09:32 |
bauzas | urgent query from some PM | 09:32 |
stephenfin | ah, no worries, I guess I can keep it to the Gerrit review and let you respond async | 09:32 |
stephenfin | tl;dr: I still have concerns about recording null or the default AZ when the host doesn't belong to the availability zone | 09:33 |
stephenfin | I'd be okay with overwriting the requested AZ with the host's AZ (with a warning) if we insist on not blocking the request | 09:35 |
lyarwood | https://bugs.launchpad.net/nova/+bug/1933954 and https://bugs.launchpad.net/nova/+bug/1933958 smell related to me if anyone with more neutron context has time to review | 09:37 |
bauzas | stephenfin: that's probably why we should only record None | 09:39 |
bauzas | and not the default AZ | 09:39 |
bauzas | stephenfin: so, letting the instance be movable between AZs | 09:40 |
bauzas | stephenfin: if operators want the instance to *not* be movable between AZs, they should use both --az and --host | 09:41 |
bauzas | stephenfin: replied on PS1 https://review.opendev.org/c/openstack/nova/+/798145 | 09:46 |
* bauzas needs to go to the gym | 09:46 | |
rohit02 | hi team, on any OpenStack Ussuri deployment we are getting an error while launching an instance booted from a multiattach volume in Horizon: "Multiattach volumes are only supported starting with compute API version 2.60. (HTTP 400) (Request-ID: req-fd275b1c-c827-460f-9005-ff9f3d02dbb6)" | 09:48 |
rohit02 | is this a known issue with multiattach volume types in Horizon? Is there any fix for it? | 09:50 |
lyarwood | Odd, no idea why Horizon wouldn't be using 2.latest tbh | 09:53 |
lyarwood | is it configurable in Horizon? | 09:54 |
stephenfin | lyarwood: Yeah, they look related. Looking at the logs for the first one (https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_8f7/771362/28/check/nova-live-migration/8f76ccd/compute1/logs/screen-n-cpu.txt) I see: | 09:57 |
stephenfin | "Neutron is not new enough to perform early destination host port binding activation. Port bindings will be updated later." | 09:57 |
stephenfin | which only happens if neutron doesn't support port bindings, apparently https://github.com/openstack/nova/blob/master/nova/network/neutron.py#L2860-L2868 | 09:58 |
lyarwood | cool cool, I don't see anything obvious in nova or neutron that might be causing this | 09:58 |
lyarwood | guess it might be in devstack or tempest (for the job definitions) | 09:58 |
stephenfin | I'm looking at recent devstack changes as we speak | 10:00 |
stephenfin | I asked on #openstack-neutron too | 10:00 |
lyarwood | ta | 10:01 |
stephenfin | Merge "[ML2] Change way how list of supported API extensions is made" | 10:03 |
stephenfin | that looks relevant | 10:03 |
stephenfin | (from neutron) | 10:03 |
lyarwood | gah I was looking at author and not the commit date | 10:05 |
stephenfin | git config :q | 10:07 |
stephenfin | whoops | 10:07 |
stephenfin | git config format.pretty fuller # <-- if you don't have it, very helpful | 10:08 |
lyarwood | wonder if that works with tig | 10:08 |
lyarwood | nope, nvm I'll just engage my brain next time | 10:09 |
stephenfin | <slaweq> stephenfin yes, it seems we are missing binding-extended in the https://github.com/openstack/neutron/blob/master/neutron/common/ovn/extensions.py#L85 | 10:10 |
stephenfin | lyarwood: fyi ^ | 10:10 |
stephenfin | (from #openstack-neutron) | 10:10 |
lyarwood | Coolio | 10:10 |
opendevreview | Stephen Finucane proposed openstack/nova master: DNM: Testing ML2 extension aliases fix https://review.opendev.org/c/openstack/nova/+/798635 | 10:15 |
opendevreview | Stephen Finucane proposed openstack/nova master: DNM: Testing ML2 extension aliases fix https://review.opendev.org/c/openstack/nova/+/798635 | 10:18 |
kashyap | lyarwood: Hi, is this soft-lockup w/ CirrOS consistently reproducible? - https://bugs.launchpad.net/nova/+bug/1931702 | 10:31 |
lyarwood | kashyap: hard to say, I had to hack dumping the console log into a few runs of the job before we disabled certain tests that always failed, I've not had time to rerun things with that change reverted yet | 10:33 |
kashyap | Hm; I saw your comment on the console log. | 10:34 |
lyarwood | kashyap: https://review.opendev.org/c/openstack/tempest/+/794757 and https://review.opendev.org/c/openstack/nova/+/795997 FWIW | 10:34 |
* kashyap clicks | 10:36 | |
kashyap | lyarwood: Thanks for the rework; and the skip makes sense for now. To guess a non-OpenStack reproducer from the test: | 10:38 |
kashyap | lyarwood: Live-migrating a guest, its disk, with an additional file-based disk should do it? | 10:39 |
lyarwood | kashyap: yeah pretty much | 10:39 |
kashyap | lyarwood: Okay; thanks. I can also ask the virt QE to add a test to this effect to this suite. As this is a common path for OpenStack | 10:40 |
gibi | kashyap: do you want to talk about https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device in today's Nova meeting? | 11:13 |
kashyap | gibi: Oh, hi. Yes, that'd be good | 11:23 |
gibi | then I will add it to the agenda | 11:23 |
kashyap | gibi: Thanks for priming my memory | 11:23 |
gibi | no problem | 11:26 |
*** tbhchman is now known as tbachman | 11:29 | |
gibi | stephenfin: I'm here until the top of the hour to talk about the availability zone check. Alternatively I'm here from 15:30 CEST till the nova meeting | 11:43 |
gibi | stephenfin: bauzas replied in the review and I think that was how I understood the original agreement. So --az az:host should be translated to an internal behavior that matches the --host host behavior, and log a warning if that host is not in the az | 11:47 |
kashyap | gibi: Afraid I might not be there for the entire meeting, as I have to run an errand; I see it's at 18:00 CEST (16:00 UTC) | 11:49 |
gibi | kashyap: I can move the topic earlier in the agenda, what is your cut-off time? | 11:50 |
kashyap | I can be around the first 20 mins | 11:50 |
kashyap | Thank you, as usual for adjusting. I feel guilty :D | 11:50 |
gibi | kashyap: OK, I will move it. Don't worry | 11:51 |
kashyap | Thx! | 11:52 |
bauzas | gibi: again, even if we do the same, the main difference between the az hack and the --host value is that for the former, we don't verify the AZ | 12:53 |
gibi | bauzas: but for --host we also don't verify any az, as no az was even provided | 12:53 |
bauzas | that's why I would like to continue to not verify it, but saying that the instance couldn't be stuck to a specific AZ | 12:53 |
bauzas | gibi: ah, correct indeed | 12:54 |
gibi | that works for me | 12:54 |
bauzas | kk | 12:54 |
gibi | so in --az my-az:host we don't enforce my-az, we simply replace it with whatever internal data represents --host host only | 12:54 |
gibi | I don't want to say "ignore az" as I'd like to have a warning that the az is ignored | 12:55 |
gibi | question: do we only ignore the az if the host is not in the az, or we ignore it even if the host is in the az? | 12:55 |
sean-k-mooney | gibi: https://review.opendev.org/c/openstack/nova/+/797428/2 has finally passed check so I'm +1 on it and the preceding patch if we want to move those forward now | 13:06 |
sean-k-mooney | oh you're +2 on that bauzas ^ | 13:06 |
gibi | yepp I'm OK | 13:06 |
sean-k-mooney | bauzas: it's stephen's fixes for my port delegation patch | 13:07 |
bauzas | sean-k-mooney: ok, I can take a look | 13:07 |
bauzas | gibi: I haven't seen your question, looking | 13:07 |
sean-k-mooney | gibi: thanks for reviewing those | 13:07 |
bauzas | gibi: by saying --az foo:host, we make the instance movable between AZs | 13:08 |
bauzas | gibi: so, the instance will land on host1 which can be on az1 | 13:08 |
bauzas | gibi: but eventually, the instance could be moved to another host in az2 as the requested AZ eventually is "None" | 13:08 |
gibi | bauzas: so if --az foo:host and host is in foo, then we still don't pin the instance to the foo az today? | 13:08 |
bauzas | gibi: today, we stick to foo | 13:09 |
bauzas | gibi: even if host isn't in foo | 13:09 |
bauzas | tomorrow, we'll stick to nothing and let the instance be in any AZ | 13:09 |
bauzas | if host is in 'bar' AZ, fine | 13:10 |
gibi | bauzas: ack, this works for me | 13:10 |
sean-k-mooney | the alternative being stick to the az that host is in, reject the request or keep the current behavior | 13:10 |
sean-k-mooney | i think im ok with what bauzas is suggesting too | 13:10 |
bauzas | and like I said, if the operator wants to land on host and stick to foo, they can say --host host and --az foo | 13:11 |
bauzas | then, they'll get NoValidHosts | 13:11 |
bauzas | sean-k-mooney: I don't like the alternative as I don't wanna verify the AZ by the API | 13:11 |
bauzas | since the AZFilter can be disabled | 13:11 |
sean-k-mooney | its not related to the az filter | 13:12 |
bauzas | and I know some ops that use forced_hosts hack but don't run AZFilter | 13:12 |
sean-k-mooney | instances have an az without that | 13:12 |
bauzas | we only enforce AZs by the filter | 13:12 |
bauzas | or by placement | 13:12 |
sean-k-mooney | no we also support ^ | 13:13 |
bauzas | (if the prefilter is enabled) | 13:13 |
sean-k-mooney | yep and you can disable both | 13:13 |
bauzas | correct | 13:13 |
bauzas | so, you can technically use the forced_hosts hack without using AZs | 13:13 |
sean-k-mooney | although the filter should get deprecated this cycle and removed next cycle | 13:13 |
bauzas | whatever | 13:13 |
sean-k-mooney | but the prefilter will still be configurable | 13:13 |
bauzas | we'll continue to verify AZs by the scheduler | 13:13 |
sean-k-mooney | although on by default | 13:13 |
bauzas | like the AZFilter ;) | 13:13 |
sean-k-mooney | right although we could do it in the api instead | 13:14 |
bauzas | no | 13:14 |
bauzas | it's a breaking change | 13:14 |
sean-k-mooney | it's a valid alternative | 13:14 |
sean-k-mooney | bauzas: not if it's configurable, although yes config driven api behavior is bad | 13:14 |
sean-k-mooney | although that is effectively what we do with the filter | 13:14 |
bauzas | we said a couple of times to *not* verify the filters by the api service | 13:14 |
bauzas | the api service needs to be scheduler agnostic | 13:15 |
sean-k-mooney | this is not really verifying the filter | 13:15 |
sean-k-mooney | its verifying if the request is valid | 13:15 |
sean-k-mooney | anyway | 13:15 |
bauzas | it's enforcing AZs on the API side while you could have disabled it | 13:15 |
bauzas | and I have serious concerns about it | 13:15 |
bauzas | but yeah, I think we have a plan | 13:16 |
sean-k-mooney | well long term i would like to remove the ability to disable az filtering | 13:16 |
opendevreview | Merged openstack/nova master: Fix error '404 Not Found' https://review.opendev.org/c/openstack/nova/+/797233 | 13:16 |
sean-k-mooney | and by long term i mean in Z | 13:16 |
bauzas | sean-k-mooney: you'll release the operator's fury | 13:16 |
sean-k-mooney | why? by default everything will be in one az | 13:16 |
bauzas | AZs *have to* be optional | 13:16 |
sean-k-mooney | it won't have any impact on them | 13:16 |
sean-k-mooney | everything is in the nova az by default | 13:16 |
sean-k-mooney | well unless you rename the default az | 13:17 |
bauzas | sean-k-mooney: again, it's not a matter of AZs topology | 13:17 |
bauzas | sean-k-mooney: it's about the contract the user is signing off when booting | 13:17 |
sean-k-mooney | yes | 13:17 |
bauzas | either they decide to stick to an AZ or to be AZ-agnostic | 13:17 |
sean-k-mooney | i did not say you had to request it | 13:17 |
bauzas | having one big AZ won't help | 13:17 |
sean-k-mooney | i just said we should always report and filter based on it if present | 13:17 |
sean-k-mooney | if there is no az request the placement query would be identical | 13:18 |
sean-k-mooney | bauzas: what i would like to see is the request spec az remains None by default | 13:18 |
sean-k-mooney | but that we would always check the az, if requested, with the placement query | 13:19 |
bauzas | sean-k-mooney: some operators make use of schedule_default_az with different values on the API services | 13:19 |
sean-k-mooney | yep that would still work | 13:19 |
bauzas | so they round-robin their instances between AZs by the number of workers | 13:19 |
sean-k-mooney | since that would populate the requestspec | 13:19 |
sean-k-mooney | i would just like to remove the az filter in Y and remove disabling the placement query in Z | 13:20 |
sean-k-mooney | no other changes | 13:20 |
bauzas | sean-k-mooney: that's a breaking change, right? | 13:20 |
sean-k-mooney | bauzas: it should not be no | 13:20 |
bauzas | sean-k-mooney: because atm, users can ask for AZs and silently move between AZs | 13:20 |
sean-k-mooney | unless they are forcing to host with a wrong az | 13:20 |
bauzas | so, at least a microversion | 13:20 |
sean-k-mooney | how can they move between az if they asked for one | 13:21 |
sean-k-mooney | with what a forced migration? | 13:21 |
sean-k-mooney | the request spec az won't get updated so if you asked for one at boot you will always stay inside that az unless the admin forces a migration | 13:22 |
sean-k-mooney | bauzas: anyway i think I'm ok with your proposal to "set request spec to None" when you do the az hack | 13:23 |
sean-k-mooney | but i think we eventually could validate it in the api if we wanted to, and as i said i think by Z we should just always add the az, if present in the request spec, to the placement query | 13:24 |
sean-k-mooney | modulo perhaps if you are using an older microversion for a forced migration | 13:25 |
sean-k-mooney | although I'm not sure about the last part, e.g. should we continue to support forced migrations that break az affinity. that's not today's problem however | 13:26 |
opendevreview | Kashyap Chamarthy proposed openstack/nova master: libvirt: Switch the default video model from 'cirrus' to 'virtio' https://review.opendev.org/c/openstack/nova/+/798680 | 13:32 |
opendevreview | Kashyap Chamarthy proposed openstack/nova master: libvirt: Switch the default video model from 'cirrus' to 'virtio' https://review.opendev.org/c/openstack/nova/+/798680 | 13:42 |
sean-k-mooney | kashyap: you cant do ^ this cycle | 13:46 |
kashyap | sean-k-mooney: Too late? | 13:46 |
kashyap | sean-k-mooney: You mean a deprecation cycle? | 13:46 |
sean-k-mooney | no you need to record the current video model this cycle | 13:46 |
kashyap | Ohh, right; darn | 13:46 |
sean-k-mooney | then and only then can we change the default | 13:46 |
sean-k-mooney | so you can write the patch and we can merge it early Y | 13:47 |
kashyap | There's that part...where are we w.r.t that? I know Lee did the work for machine types | 13:47 |
sean-k-mooney | currently we have not started on it | 13:47 |
sean-k-mooney | it's not really hard to do | 13:47 |
kashyap | sean-k-mooney: Yeah; fair enough, good point. So the blueprint is only for Y? | 13:47 |
sean-k-mooney | yes i think so but you could bring it up in the team meeting later | 13:48 |
kashyap | Nod; I'll mark it as -W for now | 13:48 |
sean-k-mooney | currently the upgrade impact would be that the display would change on hard reboot | 13:48 |
sean-k-mooney | if people were ok with that we could proceed but i expect this would break some people, hence the need to record it | 13:48 |
sean-k-mooney | downstream we could backport the recording patch | 13:49 |
sean-k-mooney | but upstream i don't think that would be accepted | 13:49 |
sean-k-mooney | e.g. downstream we could start recording the video model in 16.2/train if we really wanted or in 17/wallaby | 13:49 |
kashyap | sean-k-mooney: That is record it in system_metadata, yeah? | 13:49 |
sean-k-mooney | yes | 13:50 |
kashyap | Nod. | 13:50 |
sean-k-mooney | you should be able to copy paste part of lee's machine type patch | 13:50 |
sean-k-mooney | if you want to give it a try | 13:50 |
kashyap | sean-k-mooney: On hard reboot - display changing should be acceptable, no? | 13:50 |
kashyap | sean-k-mooney: Yep; noted, I'll give it go | 13:51 |
sean-k-mooney | kashyap: in general no. the vm models should not change on hard reboot | 13:52 |
sean-k-mooney | it may or may not break guests | 13:52 |
sean-k-mooney | it really depends on if they have drivers | 13:52 |
sean-k-mooney | it should fall back to generic vga in this case | 13:53 |
sean-k-mooney | so it should work | 13:53 |
sean-k-mooney | but in general for other models it would not be safe | 13:53 |
kashyap | sean-k-mooney: Right; preserving the guest ABI, etc. That's the reason also why libvirt doesn't gratuitously change things on cold reboot, though | 13:53 |
sean-k-mooney | e.g. disk bus | 13:53 |
kashyap | sean-k-mooney: Yep; the fallback to generic VGA should catch it. So I don't think there'd be any visible breakage | 13:53 |
sean-k-mooney | right so we really just need to see in this case if people are ok with the upgrade impact since it's mitigated by generic vga | 13:54 |
sean-k-mooney | well provided you're not using rhel 6 | 13:54 |
sean-k-mooney | rhel 7 should be fine but rhel 6 predates the introduction of virtio-gpu and i do not believe they have the required kernel support even with the fallback, but i could be misremembering that | 13:55 |
sean-k-mooney | it may have been related to one of the other email threads, i just remember there being an issue with virtio and rhel 6 mentioned recently | 13:56 |
kashyap | sean-k-mooney: No worries about RHEL-6 - RHEL-6 is not supported by OSP | 13:57 |
sean-k-mooney | rhel6 went to els only phase on November 30, 2020 but els exists until June 30, 2024 | 13:57 |
kashyap | Yes; am aware of the RHEL-6/CentOS 6 thing | 13:57 |
sean-k-mooney | kashyap: as a guest it is | 13:57 |
sean-k-mooney | although ya it would need els | 13:58 |
kashyap | Yep; | 13:58 |
sean-k-mooney | anyway that is the only real issue i see | 13:58 |
sean-k-mooney | gibi: not that i want to reopen the owner_trait conversation but part of the reason for introducing them was to give each service at least 1 unique trait that only they will use | 14:00 |
sean-k-mooney | i do agree that that is the main difference between nova<>cinder and nova<>cyborg | 14:00 |
gibi | for me owner means the RP the trait is on cannot be touched by any other placement client, but we simply cannot enforce that semantic in placement | 14:08 |
gibi | and as I said there are many undefined edges of the semantic itself | 14:09 |
viks_ | Hi, is there any way we can provision MAC os? | 14:11 |
viks_ | using openstack nova? | 14:11 |
gmann | lyarwood: gibi: is this a known issue - test_live_migration_with_trunk failing consistently https://zuul.opendev.org/t/openstack/build/94d92ea104734ba49f6c93e17a91e3f7/log/job-output.txt#63963 | 14:12 |
gibi | gmann: we had a port status issue before | 14:15 |
gibi | gmann: let me find it | 14:15 |
gmann | ok | 14:15 |
gibi | gmann: we had this fix https://review.opendev.org/c/openstack/tempest/+/786465 | 14:16 |
gmann | gibi: ohk, seems it's still failing. I will check logs after qa meeting | 14:17 |
gibi | it is still the parent port that remains DOWN | 14:17 |
gmann | ohk | 14:17 |
stephenfin | viks_: Last I checked, Apple's licensing only supports MacOS on Apple hardware. Maybe you could provision Mac Minis or Mac Pros using Ironic but using VMs (via nova) doesn't seem likely | 14:29 |
sean-k-mooney | viks_: no not really, you could try using uefi and q35 | 14:34 |
sean-k-mooney | but basically macos has some specific hardware requirements that i don't think qemu can fully emulate | 14:34 |
sean-k-mooney | i have seen people hack around it in the past by modifying the apple bootloader/kernel but it's not fully supported in qemu/libvirt and it's not allowed by apple's licensing, as stephen points out, i believe | 14:35 |
sean-k-mooney | stephenfin: you could run vms via nova on a mac mini but those vms would have to be windows or linux | 14:36 |
sean-k-mooney | apple has a hypervisor interface in the os which qemu can use | 14:36 |
sean-k-mooney | and libvirt can manage it but i don't think we can use kvm so enable the native support currently via libvirt | 14:37 |
viks_ | stephenfin: sean-k-mooney ok thanks | 14:38 |
kashyap | viks_: sean-k-mooney: Hardware accel with QEMU + MacOS works with "-accel hvf"; I recently checked it w/ QEMU upstream | 14:42 |
kashyap | viks_: I was told these instructions from Brew work in general: https://wiki.qemu.org/Hosts/Mac | 14:42 |
kashyap | viks_: Oh, wait - IIUC, you want to run MacOS as a guest on x86. | 14:43 |
viks_ | kashyap: ok... thanks... basically i wanted to see if i can provision mac os, either on kvm or baremetal ... and if there is any POC/doc related to it in openstack | 14:46 |
kashyap | (Nod) Upstream QEMU has MacOS as part of its CI | 14:46 |
kashyap | But no robust testing, IIUC | 14:47 |
* kashyap --> needs to go into a meeting | 14:47 | |
viks_ | kashyap: ok... thanks | 14:48 |
sean-k-mooney | kashyap: yep and they also recently added apple silicon support | 15:08 |
stephenfin | slaweq: That failure on https://review.opendev.org/c/openstack/nova/+/798635 looks unrelated. Looks like https://review.opendev.org/c/openstack/neutron/+/798634/ did the trick. Thanks! | 15:10 |
slaweq | stephenfin thx for info :) | 15:10 |
sean-k-mooney | slaweq: that should not have broken things by the way but yes https://review.opendev.org/c/openstack/neutron/+/798634/ should fix it | 15:12 |
sean-k-mooney | slaweq: we still technically support live migration without binding-extended since we cannot yet make that a mandatory requirement on the neutron side | 15:13 |
sean-k-mooney | slaweq: it would be really nice to make it mandatory going forward but we need neutron to provide a way to do that | 15:13 |
slaweq | sean-k-mooney You mean to somehow force it on all plugins/drivers to support it, right? | 15:14 |
sean-k-mooney | slaweq: yes we need a way to introduce mandatory extensions or move some of the current ones to be core extensions | 15:15 |
sean-k-mooney | so that we can stop supporting neutron backends that don't support it | 15:15 |
slaweq | yeah, that would be good thing | 15:15 |
sean-k-mooney | this would require contrail to finally add support for it | 15:15 |
slaweq | maybe this could be good topic for cross project session on the next PTG? :) | 15:16 |
sean-k-mooney | i have for the last 3-4 ptgs already | 15:16 |
sean-k-mooney | but i can, i ask for this every time | 15:16 |
sean-k-mooney | we would basically just have to agree a process for graduation of extensions to the core api | 15:17 |
sean-k-mooney | then for Y declare the following set will be added | 15:18 |
sean-k-mooney | basically give one upstream cycle at a minimum to ensure all backends support the new ones | 15:18 |
slaweq | the problem is that we will be loud about it in the community | 15:19 |
sean-k-mooney | then nova could drop support in say Z and issue deprecation warnings in Y when the required extension is not found | 15:19 |
slaweq | and finally someone will still miss announcement and will complain :) | 15:19 |
sean-k-mooney | well we accidentally dropped support for live migration without binding-extended | 15:19 |
slaweq | maybe You could propose some initial draft of spec so we can discuss it there? | 15:19 |
sean-k-mooney | and it took 14 months or so for people to complain | 15:20 |
sean-k-mooney | so yes they will but maybe not promptly | 15:20 |
sean-k-mooney | slaweq: sure i can propose something | 15:20 |
slaweq | ++ thx | 15:20 |
slaweq | I will then ensure that neutron team will review it :) | 15:21 |
sean-k-mooney | do ye have a Y or backlog folder? i can propose it against xena but i don't really expect us to do anything concrete this cycle | 15:21 |
sean-k-mooney | well other than maybe agree the direction | 15:21 |
sean-k-mooney | slaweq: by the way when is the neutron team meeting? have you discussed the os-vif per-port bridge topic? | 15:22 |
sean-k-mooney | slaweq: i'll be bringing that up in the nova meeting later | 15:22 |
slaweq | we had that meeting 1h ago | 15:22 |
slaweq | and ralonsoh raised that issue | 15:22 |
slaweq | he will follow up with spec for that too | 15:22 |
sean-k-mooney | ack ok i can sync with him | 15:23 |
slaweq | ++ | 15:23 |
* bauzas is depressed to see yet another virtual PTG :( | 15:24 | |
sean-k-mooney | has that been announced? | 15:25 |
* gibi is just depressed in general | 15:27 | |
kashyap | bauzas: Yeah; it sucks; but at least we see some "light at the end of the tunnel" and are not stuck in long virus waves. | 15:29 |
bauzas | sean-k-mooney: yup, one hour ago | 15:30 |
bauzas | kashyap: I understand the reasoning behind the decision but this hurts | 15:30 |
bauzas | I can just hope all the contributors could be vaccinated sooner than later so we could just ask them some kind of passport to travel | 15:30 |
bauzas | gibi: another thought, I need to leave by 6:15pm our time | 15:34 |
bauzas | if we can do a quick meeting, that'd be lovely | 15:34 |
gibi | bauzas: I will try to be quick | 15:34 |
gibi | and thanks for the headsup | 15:34 |
sean-k-mooney | i have one networking topic to bring up in open discussion but apparently it's going to be a spec now instead of a bug so i guess it will be an fyi. | 15:36 |
sean-k-mooney | and review the spec | 15:36 |
sean-k-mooney | slaweq: by the way we had planned to treat it as a bug because i did not really want to do a downstream only backport of this, though we can discuss later | 15:37 |
gibi | sean-k-mooney: ack | 15:46 |
gibi | nova meeting starts in 5 minutes here in the channel | 15:54 |
opendevreview | Merged openstack/nova stable/ussuri: Error anti-affinity violation on migrations https://review.opendev.org/c/openstack/nova/+/796719 | 15:55 |
opendevreview | Stephen Finucane proposed openstack/nova master: scheduler: Remove 'USES_ALLOCATION_CANDIDATES' https://review.opendev.org/c/openstack/nova/+/773640 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: scheduler: 'USES_ALLOCATION_CANDIDATES' removal cleanup https://review.opendev.org/c/openstack/nova/+/797513 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: scheduler: Remove 'hosts_up' https://review.opendev.org/c/openstack/nova/+/773641 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: trivial: Remove FakeScheduler (for realz) https://review.opendev.org/c/openstack/nova/+/773642 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: scheduler: Merge 'FilterScheduler' into base class https://review.opendev.org/c/openstack/nova/+/773643 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Drop references to non-filter scheduler drivers https://review.opendev.org/c/openstack/nova/+/773645 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: scheduler: Merge driver into manager https://review.opendev.org/c/openstack/nova/+/773644 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: tests: Merge 'test_utils', 'test_scheduler_utils' https://review.opendev.org/c/openstack/nova/+/773646 | 15:59 |
opendevreview | Stephen Finucane proposed openstack/nova master: conf: Remove deprecated aliases https://review.opendev.org/c/openstack/nova/+/773647 | 15:59 |
kashyap | gibi: BTW, I don't need to rush; so take your time. I cancelled my errand | 15:59 |
gibi | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Jun 29 16:00:06 2021 UTC and is due to finish in 60 minutes. The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
stephenfin | o/ | 16:00 |
* kashyap waves | 16:00 | |
gibi | kashyap: OK, then I will do the normal agenda and your topic will be part of the Open Discussion | 16:00 |
kashyap | Yes, that's fine. | 16:00 |
bauzas | \o | 16:01 |
gmann | o/ | 16:01 |
gibi | bauzas asked for a quick meeting so lets start | 16:01 |
gibi | #topic Bugs (stuck/critical) | 16:01 |
gibi | no critical bug | 16:01 |
gibi | #link 22 new untriaged bugs (+1 since the last meeting): #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New | 16:01 |
gibi | any bug we need to talk about today? | 16:02 |
stephenfin | lyarwood spotted a bug earlier this morning. We think it's basically fixed now. Just waiting for the neutron patch to merge (thanks slaweq) | 16:02 |
stephenfin | until that merges though, the gate is stuck (the live-migration job will fail) | 16:03 |
stephenfin | (https://review.opendev.org/c/openstack/neutron/+/798634/ is the fix, btw) | 16:03 |
gibi | stephenfin: thanks | 16:04 |
gibi | any other bug to mention? | 16:04 |
sean-k-mooney | stephenfin: that job should not fail | 16:05 |
sean-k-mooney | we should fall back to the old way without multiple port bindings | 16:05 |
sean-k-mooney | it might fail on stable branches if we have not backported the fix | 16:05 |
stephenfin | maybe not, but it does and I haven't had time to figure out why 🤷 | 16:05 |
sean-k-mooney | well we should since that means contrail is broken | 16:06 |
gmann | one test failing is this which is port status gibi mentioned before meeting https://be2e92e10ead782aa651-35e07a4cf42cfaed2fcffa4bf0b16f1b.ssl.cf1.rackcdn.com/794757/9/check/tempest-multinode-full-py3/94d92ea/testr_results.html | 16:06 |
sean-k-mooney | they do not support it. | 16:06 |
gmann | is that same? | 16:06 |
stephenfin | yes, I think so | 16:06 |
gmann | ok | 16:06 |
sean-k-mooney | so we probably want to hold the neutron patch till we figure this out or propose a revert so we can debug | 16:07 |
sean-k-mooney | well the neutron patch is correct | 16:07 |
gibi | if the neutron fix is needed anyhow then I vote for merging it and troubleshoot on a revert if needed | 16:07 |
sean-k-mooney | so probably the latter: propose a revert so we can figure out why it failed | 16:07 |
stephenfin | yeah, what gibi said | 16:08 |
gibi | as we anyhow moved to the gate status, let's move there by topic as well | 16:08 |
gibi | #topic Gate status | 16:08 |
gibi | Nova gate bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure | 16:09 |
sean-k-mooney | a regression of this is effectively a critical nova bug, just an fyi | 16:09 |
gibi | please tag bugs with gate-failure so that we can follow them there | 16:09 |
gibi | placement weekly jobs are green #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly | 16:10 |
gibi | anything else on the gate status? | 16:10 |
gibi | #topic Release Planning | 16:11 |
gibi | Milestone 2 is in 3 weeks (15 of July) which is spec freeze | 16:11 |
gibi | Spec review day is 6th of July #link http://lists.openstack.org/pipermail/openstack-discuss/2021-June/023083.html | 16:11 |
gibi | and as you saw on the ML the next PTG time is set to October 18-22 | 16:11 |
gibi | and it will be virtual still | 16:12 |
gibi | any other news on the incoming release? | 16:12 |
sean-k-mooney | :( i dont think so | 16:12 |
dansmith | sean-k-mooney: what is the sadface? virtual ptg? | 16:13 |
sean-k-mooney | yes | 16:13 |
bauzas | yup | 16:13 |
bauzas | :( :( :( even | 16:13 |
dansmith | heh, okay | 16:13 |
gibi | we should have a therapeutic session in the PTG | 16:14 |
gibi | anyhow moving on | 16:14 |
gibi | #topic Stable Branches | 16:14 |
gibi | stable gates should be OK (though 'wait_for_volume_resource_status' intermittently fails) | 16:14 |
gibi | EOM (from elodilles ) | 16:14 |
gibi | any other stable news? | 16:14 |
elodilles | nothing from me for now | 16:15 |
dansmith | same failure on master a bunch too I think | 16:15 |
dansmith | the cinder peeps were working on it a couple weeks ago | 16:15 |
gibi | yeah I think lyarwood is still trying to see what happens in the guest that prevents detaching a volume | 16:15 |
dansmith | well, the cinder peeps were thinking it was an lvm segv or something | 16:16 |
dansmith | (on the host) | 16:16 |
sean-k-mooney | have we checked that the flavor has at least 2 cores? it's really just a workaround but i think that helped downstream at one point with the guest not responding to the detach | 16:16 |
gibi | dansmith: could be multiple independent failures. I only hit the detach one last week but I did not look at CI results recently | 16:17 |
sean-k-mooney | i mean 1 should really be enough but sometimes if the guest has 2 cores it will still be able to respond if it's hung on other things | 16:17 |
dansmith | gibi: yeah, they already fixed one thing that manifested in the same way I think, which was specifically timeout related IIRC | 16:18 |
dansmith | but the latest was lvm crashing I think | 16:18 |
dansmith | anyway | 16:18 |
gibi | dansmith: thanks that is good info | 16:18 |
gibi | sean-k-mooney: good idea | 16:18 |
gibi | moving on | 16:18 |
gibi | #topic Sub/related team Highlights | 16:18 |
gibi | bauzas: are you still with us? | 16:18 |
bauzas | yup | 16:18 |
bauzas | nothing to report, sir. | 16:19 |
gibi | then | 16:19 |
gibi | Libvirt (bauzas) | 16:19 |
gibi | ack | 16:19 |
gibi | thanks | 16:19 |
bauzas | this^ | 16:19 |
gibi | :) | 16:19 |
gibi | moving on | 16:19 |
gibi | #topic Open discussion | 16:19 |
gibi | (kashyap) seeking approval for the specless bp https://blueprints.launchpad.net/nova/+spec/virtio-as-default-display-device | 16:19 |
kashyap | gibi: So on that: | 16:19 |
gibi | I think this was discussed today on the channel | 16:19 |
kashyap | I was reminded that we can't do the switch in the current devel cycle | 16:19 |
kashyap | But we need some preparatory work for Y release | 16:20 |
kashyap | E.g. recording the video model in system_metadata. Get the tests sorted, and then do the switch. | 16:20 |
gibi | OK, so then for X you only aim for the recording and testing then switch in Y | 16:20 |
dansmith | why do we need to record the video model | 16:20 |
dansmith | ? | 16:20 |
sean-k-mooney | to prevent it changing for existing vms after upgrade and hard reboot | 16:21 |
gibi | I think it is to avoid changing the ABI to the guest during hard reboot | 16:21 |
kashyap | dansmith: Upthread, sean-k-mooney was saying it might | 16:21 |
sean-k-mooney | yes that ^ | 16:21 |
kashyap | But: | 16:21 |
dansmith | is that because we're changing our default? | 16:21 |
dansmith | oh | 16:21 |
kashyap | There won't be any _visible_ breakage here: | 16:21 |
kashyap | dansmith: Yep | 16:21 |
dansmith | I thought the spec was adding it as an option, this is for changing the default, I see | 16:21 |
kashyap | Sorry, I should've given a summary here. | 16:21 |
kashyap | dansmith: The first sentence of the BP says: "Change Nova's default video display from 'cirrus' to 'virtio'." :-) | 16:22 |
sean-k-mooney | dansmith: ya we added virtio i think in train | 16:22 |
sean-k-mooney | so in this specific case it actually might be safe to just make the change | 16:22 |
sean-k-mooney | because of vga fallback mode | 16:23 |
kashyap | Yeah | 16:23 |
dansmith | well, I was going to say, i think we've made such changes in the past after some interval | 16:23 |
sean-k-mooney | but in general for device model default changes | 16:23 |
sean-k-mooney | we would have to record then change | 16:23 |
kashyap | dansmith: To summarize the above: | 16:23 |
kashyap | If your guest has the kernel driver, then "virtio" display dev will make use of it; or else, it'll gracefully fall back to VGA | 16:23 |
sean-k-mooney | dansmith: the only one i can think of was enabling the RNG by default | 16:23 |
kashyap | So that's the recommended option from the QEMU graphics maints | 16:23 |
kashyap | sean-k-mooney: Yep | 16:24 |
dansmith | kashyap: fall back to cirrus? | 16:24 |
kashyap | dansmith: No, no; fall back to "VGA compatibility mode", which is still better than "cirrus" | 16:24 |
sean-k-mooney | dansmith: no, the virtio-gpu device supports a vga hardware interface | 16:24 |
dansmith | okay | 16:24 |
sean-k-mooney | dansmith: you just won't get all the features but it should function similar to cirrus in the guest | 16:24 |
kashyap | I.e. standard VGA. | 16:24 |
kashyap | Yes | 16:25 |
bauzas | would operators opt into it? | 16:25 |
dansmith | yeah, but that might freak out a windows machine if your display adapter suddenly changes I guess | 16:25 |
bauzas | or would we need to change the default automatically ? | 16:25 |
kashyap | bauzas: No; they can opt out of it here. | 16:25 |
sean-k-mooney | dansmith: on reboot it should be ok but that was the upgrade concern that would prompt record then change next cycle | 16:25 |
bauzas | I understand dansmith's concern about freaking out if done automatically | 16:25 |
kashyap | bauzas: Yes, we should do the right thing here by changing the default. | 16:25 |
sean-k-mooney | kashyap: not for existing instances, you need to use hw_video_model in the image | 16:26 |
kashyap | bauzas: dansmith's good point is for Windows | 16:26 |
kashyap | sean-k-mooney: Right; obvious the default implies only for the new ones. | 16:26 |
sean-k-mooney | so tl;dr record in X, change in Y? | 16:26 |
kashyap | s/obvious/obviously,/ | 16:26 |
dansmith | so the problem is for people who don't have hw_video_model in their image meta right? | 16:26 |
kashyap | sean-k-mooney: But _do_ we need to record at all? As there's no breakage here | 16:26 |
sean-k-mooney | dansmith: correct | 16:27 |
dansmith | can we just create all new instances with that set to the default if they don't have it in their image? | 16:27 |
dansmith | then we're good for next time too when we switch to whizbang32 video | 16:27 |
bauzas | can't wait for it | 16:27 |
dansmith | compute assumes cirrus if unset forever, otherwise honors what it's set to, and then we can make the switch now for any new instances | 16:27 |
sean-k-mooney | dansmith: not really but we can store our default in the instance_system_metadata | 16:27 |
sean-k-mooney | which is what we are now doing for machine_type as if it was set in the image | 16:28 |
dansmith | sean-k-mooney: not really? we mirror image meta in sysmeta already right? so we'd just be using that instead of a bespoke key? | 16:28 |
sean-k-mooney | dansmith: ya so we can set it in our copy which is what kashyap was going to do | 16:28 |
dansmith | mirror *some* of image_meta I mean | 16:28 |
sean-k-mooney | we just cant set it in glance unless we just document use the glance import plugin to set it on all uploaded images | 16:29 |
dansmith | ack, okay, then we don't need a warning cycle to switch the default if we do it that way | 16:29 |
dansmith | sean-k-mooney: right I'm talking about our local copy (of course) | 16:29 |
sean-k-mooney | dansmith: so unless we backport the recording of the current value we would still need one cycle | 16:29 |
dansmith | why? | 16:29 |
sean-k-mooney | to populate the instance_metadata_table for existing instances | 16:30 |
kashyap | sean-k-mooney: Yeah, why? I still don't see it. | 16:30 |
dansmith | no, we just assume cirrus forever if unset | 16:30 |
sean-k-mooney | oh | 16:30 |
sean-k-mooney | that just means dont change the default | 16:30 |
dansmith | past the virtio default, it'll always be set to something, so if set, honor that, else cirrus (but just on the compute).. new instances always get virtio set explicitly by default on create | 16:30 |
kashyap | dansmith: So we can even directly change w/o even recording it in system_metadata, as we did for virtio-rng (I'll get the commit later for you to read) | 16:30 |
sean-k-mooney | dansmith: that will complicate the initial spawn logic and possibly hard reboot | 16:31 |
dansmith | kashyap: okay not sure how, but happy to look | 16:31 |
sean-k-mooney | it might be doable but we reuse spawn in hard reboot | 16:31 |
* bauzas needs to disappear | 16:31 | |
dansmith | sean-k-mooney: just spawn, AFAIK, which seems fine as we record other such things IIRC, but whatever | 16:31 |
kashyap | dansmith: https://opendev.org/openstack/nova/commit/de512f2c025 | 16:32 |
sean-k-mooney | so we will need to tell the difference between first boot and subsequent ones | 16:32 |
dansmith | just trying to avoid needing a cycle to change *and* annotate all existing instances | 16:32 |
kashyap | (It's slow to load) | 16:32 |
sean-k-mooney | dansmith: i guess we could try and implement that and see what it looks like | 16:32 |
dansmith | we can talk outside the meeting about it | 16:32 |
kashyap | Yeah | 16:32 |
kashyap | Thanks for the design discussion so far! | 16:33 |
kashyap | gibi: Any other topics? We can hash it outside of the meeting | 16:33 |
gibi | OK. then I hold on approving the bp until you agree on the way forward | 16:33 |
gibi | sean-k-mooney has one more headsup I think | 16:33 |
gibi | so moving on to that | 16:33 |
gibi | sean-k-mooney: | 16:33 |
sean-k-mooney | yes so ovn migration... | 16:33 |
sean-k-mooney | the tl;dr is architecturally there is always a race when doing live migration with ovn | 16:34 |
sean-k-mooney | effectively ovn can only start installing rules when the tap is created on the dest | 16:34 |
sean-k-mooney | and at that point we have called libvirt to do the migration and it's in control | 16:35 |
sean-k-mooney | so to avoid that and create the port in pre-live-migration i'm proposing an os-vif change | 16:35 |
sean-k-mooney | basically reintroduce hybrid-plug but with ovs bridges and patch ports instead of linux bridges and veth pairs | 16:35 |
sean-k-mooney | that will not have any performance impact on the vm | 16:36 |
sean-k-mooney | but will allow ovn to install the rules in pre-live-migration | 16:36 |
sean-k-mooney | i was wondering how people felt about that | 16:36 |
gibi | honestly it is too deep networking to me. I assume the impact is mostly in os-vif. Does nova needs to be adapted? | 16:37 |
stephenfin | so previously, we had | 16:37 |
stephenfin | (ovs bridge) veth | <---> | veth (linux bridge) tap | <---> | VM | 16:37 |
stephenfin | and now we'll have | 16:37 |
stephenfin | (ovs bridge) patch | <---> | patch (ovs bridge) tap | <---> | VM | 16:37 |
stephenfin | so everything stays in OVS but there's an additional (on top of br-int) bridge? | 16:37 |
sean-k-mooney | more like (ovs bridge) tap | <---> | VM originally to (ovs bridge) patch | <---> | patch (ovs bridge) tap | <---> | VM | 16:38 |
sean-k-mooney | yes | 16:38 |
sean-k-mooney | this is the poc but it has a bug (ovs bridge) patch | <---> | patch (ovs bridge) tap | <---> | VM | 16:38 |
sean-k-mooney | https://review.opendev.org/c/openstack/os-vif/+/798055 | 16:38 |
sean-k-mooney | currently it's configurable and defaulting to true for development | 16:39 |
stephenfin | do we need to worry about flows getting added for the patch <-> tap in the second (new) bridge? | 16:39 |
stephenfin | or does that happen automatically? | 16:39 |
sean-k-mooney | stephenfin: just the normal action | 16:39 |
sean-k-mooney | so no rules required | 16:39 |
sean-k-mooney | on the neutron side if we wanted to proceed there would need to be some qos changes for ovn | 16:39 |
sean-k-mooney | so that will be covered by a spec | 16:39 |
sean-k-mooney | if we are ok with this on the nova side i would like to track the capability as a bug against os-vif | 16:40 |
sean-k-mooney | so we can backport the ability to opt in to this behavior but not use it by default for stable branches | 16:40 |
stephenfin | excellent, so we'll pre-populate a flow in the br-int for the new patch port, and then the comms from the other side of the patch port to the VM don't need anything explicit bar the normal action | 16:40 |
stephenfin | that wfm, personally | 16:41 |
stephenfin | certainly seems better than re-adding hybrid plug with the OVS -> linux bridge -> VM dance | 16:41 |
sean-k-mooney | i guess my main question is: bug, blueprint or spec for this? | 16:41 |
stephenfin | I would like to see some high level docs on this _somewhere_ | 16:43 |
gibi | hm, if this requires a neutron spec, then why do you need to backport the os-vif change to stable? | 16:43 |
sean-k-mooney | personally i would prefer to leave this bake for a cycle and enable it by default next cycle | 16:43 |
sean-k-mooney | gibi: the neutron spec is to fix QOS support | 16:43 |
stephenfin | it could be a blueprint but docs in the neutron tree might be better | 16:43 |
* gibi is slow | 16:43 | |
gibi | sean-k-mooney: ahh OK I see | 16:43 |
sean-k-mooney | it would be useful for those that dont need qos without that | 16:43 |
gibi | yepp now I got it | 16:43 |
gibi | this is a bugfix for os-vif to support live migration with OVN | 16:44 |
gibi | or more precisely fix a race in live migration | 16:44 |
gibi | I can live with this as a bugfix | 16:44 |
sean-k-mooney | yes basically | 16:44 |
sean-k-mooney | and that's also why we would default this to off initially and then enable it by default in the future | 16:45 |
gibi | any objection? | 16:45 |
sean-k-mooney | operators can opt in early if they want but not change any behavior by default | 16:45 |
stephenfin | I'm good. Can't speak for others tho | 16:46 |
gibi | I don't see any hands raised :) | 16:46 |
sean-k-mooney | we can defer if people want to think about it more | 16:46 |
sean-k-mooney | im still working on the poc | 16:46 |
sean-k-mooney | my main concern is m2 and spec freeze | 16:47 |
gibi | it is accepted as a bug now, here. If somebody later has an objection then we can rediscuss, but until then this is a bug | 16:47 |
gibi | Is there any other topic for today? | 16:47 |
sean-k-mooney | not from me | 16:48 |
gibi | then let's close this | 16:49 |
gibi | thanks for joining | 16:49 |
gibi | #endmeeting | 16:49 |
opendevmeet | Meeting ended Tue Jun 29 16:49:07 2021 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:49 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2021/nova.2021-06-29-16.00.html | 16:49 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2021/nova.2021-06-29-16.00.txt | 16:49 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2021/nova.2021-06-29-16.00.log.html | 16:49 |
elodilles | o/ | 16:49 |
gibi | feel free to continue the video model discussion | 16:49 |
gibi | I will drop now but will read back tomorrow | 16:49 |
sean-k-mooney | well quickly the live migration issue https://review.opendev.org/c/openstack/nova/+/742180 should have fixed that | 16:49 |
sean-k-mooney | this is what previously broke live migration without binding-extended | 16:50 |
sean-k-mooney | but that is still fully supported so dropping binding-extended from ml2/ovn should not result in job failures | 16:50 |
sean-k-mooney | unless we are talking about cross-cell migration | 16:50 |
kashyap | Thanks for running it, gibi. | 16:50 |
sean-k-mooney | *cross-cell resize | 16:50 |
dansmith | sean-k-mooney: so, looking at the build process, it surely seems like the libvirt driver could just look at vm_state==BUILDING in spawn and annotate the desired default going forward | 16:51 |
dansmith | doesn't seem overly complicated to me | 16:52 |
dansmith | unless I'm missing elsewhere that we might be building but not want to do that | 16:52 |
sean-k-mooney | dansmith: ya i was thinking about that later in the meeting | 16:52 |
sean-k-mooney | i think you're right, we can detect initial spawn | 16:52 |
dansmith | yeah, so IMHO that'd be the way to go | 16:53 |
sean-k-mooney | dansmith: the question that i have is whether we can detect it in a place that is within the virt driver that also has the required info | 16:53 |
opendevreview | Rodrigo Barbieri proposed openstack/nova stable/train: Error anti-affinity violation on migrations https://review.opendev.org/c/openstack/nova/+/798717 | 16:54 |
sean-k-mooney | i don't believe it will still be in building when we are generating the xml | 16:54 |
dansmith | well, it's set to building right before we get our spawn called | 16:54 |
sean-k-mooney | yes but we might need to pass down a flag internally in the driver | 16:54 |
dansmith | sean-k-mooney: we don't need to detect it while building the xml do we? we can go ahead and annotate the instance right in spawn() so it's there later for the xml building no? | 16:54 |
sean-k-mooney | dansmith: the default depends on the image and flavor and config values | 16:55 |
dansmith | if instance.vm_state == building: instance.system_metadata['image_hw_whatever'] = $default; instance.save() | 16:55 |
sean-k-mooney | dansmith: e.g. it's different based on architectures and a few other things | 16:55 |
dansmith | sean-k-mooney: it does? | 16:55 |
dansmith | oh sure, okay | 16:55 |
dansmith | but still, I think you have all that in spawn I would guess | 16:55 |
sean-k-mooney | yes probably, let me check quickly | 16:56 |
dansmith | yeah we actually build the xml right in spawn, | 16:56 |
dansmith | so I think we should be fine, even if you want to pass a flag to get_guest_xml() from there instead of having it look or something | 16:56 |
sean-k-mooney | its basically decided here https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5912-L5951 | 16:57 |
sean-k-mooney | in _add_video_driver | 16:57 |
dansmith | sure, so we could break out the "which video" part from the actual xml bit and just call it to get the model name we need to use separate from the xml part | 16:57 |
sean-k-mooney | yep or just call that directly | 16:58 |
dansmith | it returns an xml node or something doesn't it? | 16:58 |
sean-k-mooney | but ok | 16:58 |
dansmith | anyway, regardless.. I think it's not hard | 16:58 |
sean-k-mooney | yes well it returns one of our config objects | 16:58 |
sean-k-mooney | LibvirtConfigGuestVideo i think https://github.com/openstack/nova/blob/25e218484990b41485973fab86adf5afc21dd476/nova/virt/libvirt/config.py#L2052 | 16:59 |
dansmith | yeah | 17:00 |
sean-k-mooney | so we can just get the type field value if we need to | 17:00 |
sean-k-mooney | so as a general pattern you would advise | 17:00 |
sean-k-mooney | update the default on new instance creation and record it in instance_system_metadata | 17:00 |
sean-k-mooney | and then just use that value in all other spawn cases | 17:00 |
dansmith | yup | 17:01 |
sean-k-mooney | in this case though virtio will work in all configurations i believe | 17:01 |
sean-k-mooney | so we really just need to check if it's a build and if it's set in the image | 17:02 |
sean-k-mooney | if not, set it in the image metadata copy we have | 17:02 |
dansmith | well, I'd really say we should avoid breaking sensitive windows vms by changing anything much | 17:03 |
sean-k-mooney | so basiclly image_meta.properties.set('hw_video_model', image_meta.properties.get('hw_video_model')) | 17:04 |
sean-k-mooney | * image_meta.properties.set('hw_video_model', image_meta.properties.get('hw_video_model', 'virtio')) | 17:04 |
dansmith | https://www.howson.pro/content/images/2016/07/sound-popped-up-after-fi.png | 17:04 |
dansmith | don't want that on your cloud instance :P | 17:04 |
sean-k-mooney | hehe no that would be awkward | 17:05 |
sean-k-mooney | i mean you can attach a cinder volume as a driver disk but it sucks | 17:06 |
dansmith | let us not go there :) | 17:06 |
* dansmith goes to request "attach disk as floppy for win95" in cinder lp | 17:07 | |
sean-k-mooney | you can kind of do that today actually but it's really involved and convoluted | 17:08 |
dansmith | lol | 17:09 |
sean-k-mooney | dansmith: so 1.) set new default in image metadata copy on new spawn. 2.) keep current logic in driver which will use the image value preferentially if set. 3.) record value if not in image after it's calculated for existing instances? | 17:10 |
sean-k-mooney | 3.) would only run once per instance the first time they hard reboot | 17:10 |
sean-k-mooney | and after that it just uses what's in the instance_system_metadata | 17:11 |
sean-k-mooney | then rinse and repeat that for any default we want to change like this in the future | 17:11 |
sean-k-mooney | is that about right ^ if so it would be nice to add to the contributor docs | 17:12 |
sean-k-mooney | i can probably submit a patch for that | 17:12 |
dansmith | well, I was going to say just always assume cirrus if it's not set | 17:13 |
dansmith | instead of "fixing" all the existing instances, but either works, as long as we assume cirrus if not set there | 17:13 |
sean-k-mooney | well assume existing behavior | 17:13 |
sean-k-mooney | which will be cirrus on x86 | 17:13 |
opendevreview | Merged openstack/nova master: Add test coverage for API version headers in CORS https://review.opendev.org/c/openstack/nova/+/796580 | 17:14 |
sean-k-mooney | its vga on power and virtio on arm | 17:14 |
dansmith | yeah | 17:14 |
opendevreview | Merged openstack/nova master: Fix typos in minimum version policy docs https://review.opendev.org/c/openstack/nova/+/795575 | 17:14 |
sean-k-mooney | kashyap: does ^ work/make sense to you | 17:15 |
opendevreview | Merged openstack/nova master: Make test_refresh_associations_* deterministic https://review.opendev.org/c/openstack/nova/+/794396 | 17:15 |
sean-k-mooney | dansmith: we may have other defaults like this we want to change but i can't remember them at present, which is why i want to document the workflow for this type of change | 17:16 |
dansmith | ack | 17:16 |
opendevreview | Merged openstack/nova master: Remove PROJECT_ADMIN limitation from zero-disk and external-network policy https://review.opendev.org/c/openstack/nova/+/794360 | 17:49 |
opendevreview | Merged openstack/nova master: Improve policy doc for supported scope info https://review.opendev.org/c/openstack/nova/+/762013 | 17:50 |
slaweq | Hi nova-stable-cores, can You check https://review.opendev.org/c/openstack/nova/+/787253? Thx in advance | 20:57 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!
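
The "record the default at first spawn, assume the old default if unset" pattern that dansmith and sean-k-mooney converge on (16:55 and 17:10 to 17:13) might look roughly like the sketch below. It is a standalone illustration, not nova code: the `Instance` dataclass, `resolve_video_model`, the `record_legacy` flag, and the `image_hw_video_model` key are assumptions made for the example, and the unknown-architecture fallback to cirrus is a guess. The per-architecture legacy defaults are the ones stated in the log (cirrus on x86, vga on power, virtio on arm).

```python
from dataclasses import dataclass, field

# Hypothetical names throughout; none of this is nova's real API.
BUILDING = "building"
SYSMETA_KEY = "image_hw_video_model"  # assumed key for the mirrored image property

NEW_DEFAULT = "virtio"
# Pre-change defaults as described in the log.
LEGACY_DEFAULTS = {"x86_64": "cirrus", "ppc64le": "vga", "aarch64": "virtio"}


@dataclass
class Instance:
    """Minimal stand-in for an instance record."""
    vm_state: str
    architecture: str = "x86_64"
    system_metadata: dict = field(default_factory=dict)


def resolve_video_model(instance, record_legacy=False):
    """Pick the video model without changing it under existing guests."""
    sysmeta = instance.system_metadata
    if SYSMETA_KEY in sysmeta:
        # Set by the image or recorded earlier: always honor it.
        return sysmeta[SYSMETA_KEY]
    if instance.vm_state == BUILDING:
        # First spawn: record the new default so later hard reboots keep it.
        sysmeta[SYSMETA_KEY] = NEW_DEFAULT
        return NEW_DEFAULT
    # Existing instance with nothing recorded: keep the old behavior.
    # Falling back to cirrus for unlisted architectures is an assumption.
    legacy = LEGACY_DEFAULTS.get(instance.architecture, "cirrus")
    if record_legacy:
        # Optional step 3 from the log: annotate on the first hard reboot.
        sysmeta[SYSMETA_KEY] = legacy
    return legacy
```

With `record_legacy=False` this follows dansmith's "assume cirrus forever if unset" variant; passing `record_legacy=True` gives sean-k-mooney's step 3, where an existing instance gets annotated the first time it is hard rebooted. Either way a later default change only affects instances built after it.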