*** tetsuro has joined #openstack-nova | 00:08 | |
*** ociuhandu has quit IRC | 00:13 | |
*** ociuhandu has joined #openstack-nova | 00:14 | |
*** ociuhandu has quit IRC | 00:18 | |
*** tetsuro has quit IRC | 00:19 | |
*** tetsuro has joined #openstack-nova | 00:21 | |
*** bbowen has joined #openstack-nova | 00:27 | |
*** tosky has quit IRC | 00:32 | |
*** lbragstad has quit IRC | 00:37 | |
*** zhanglong has joined #openstack-nova | 00:54 | |
openstackgerrit | Brin Zhang proposed openstack/python-novaclient master: Microversion 2.83 - action event fault details https://review.opendev.org/714561 | 00:54 |
---|---|---|
*** larainema has joined #openstack-nova | 00:56 | |
openstackgerrit | Merged openstack/nova stable/pike: rt: only map compute node if we created it https://review.opendev.org/676463 | 00:58 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Expose instance action event details out of the API https://review.opendev.org/694430 | 00:58 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add instance actions v283 samples test https://review.opendev.org/706251 | 00:58 |
brinzhang_ | stephenfin, gibi: update done for bp/action-event-fault-details https://review.opendev.org/694430 | 00:58 |
openstackgerrit | Merged openstack/nova stable/pike: pike-only: remove broken non-voting ceph jobs https://review.opendev.org/700072 | 01:05 |
brinzhang_ | dansmith: Did you see my reply in https://review.opendev.org/#/c/693828/? | 01:11 |
*** Liang__ has joined #openstack-nova | 01:13 | |
*** zhanglong has quit IRC | 01:16 | |
brinzhang_ | sean-k-mooney: I left some comments in https://review.opendev.org/#/c/631244/69/nova/accelerator/cyborg.py@214, can you check again? If it's true, I think that can be done in a follow-up. | 01:21 |
*** zhanglong has joined #openstack-nova | 01:21 | |
*** macz_ has joined #openstack-nova | 01:24 | |
*** liuyulong has quit IRC | 01:28 | |
*** macz_ has quit IRC | 01:28 | |
gmann | brinzhang_: one comment on policy check for older version - https://review.opendev.org/694430 | 01:34 |
brinzhang_ | gmann: will check | 01:36 |
brinzhang_ | gmann: https://review.opendev.org/#/c/694430/11/nova/api/openstack/compute/instance_actions.py@187 it does not have much impact compared to before, right? | 01:40 |
gmann | brinzhang_: one more comment on policy name | 01:41 |
gmann | brinzhang_: you mean for request with <2.83 ? | 01:42 |
brinzhang_ | but that can avoid having to evaluate the policy, | 01:42 |
gmann | yeah, with <2.83 we anyhow will not show the 'details' field so no need to check policy things also | 01:42 |
brinzhang_ | ok, I will change it to what you wrote | 01:44 |
brinzhang_ | change "The value will be ``null`` for old records." to "The value will be ``null`` for older version.", is that ok? | 01:44 |
openstackgerrit | melanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929 https://review.opendev.org/701478 | 01:45 |
openstack | bug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/1844929 | 01:45 |
gmann | brinzhang_: that is where I am confused: by older version, do you mean API version? | 01:46 |
gmann | because for older microversion we are not showing the field itself. | 01:46 |
brinzhang_ | yes, before microversion 2.83 | 01:46 |
gmann | ok, then you can remove that line as that field is present in the response only from 2.83 onwards. if the request is with <2.83 then 'details' itself is not present | 01:47 |
brinzhang_ | ok | 01:48 |
brinzhang_ | remove this line | 01:48 |
gmann | your api-ref change reflects that this field is new in 2.83 | 01:48 |
gmann | +1 | 01:48 |
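The rule gmann describes above (the 'details' key appears in the response only when the request is at microversion 2.83 or later, and is absent otherwise) can be sketched roughly as follows. This is a minimal illustration, not nova's code: `parse_microversion`, `is_supported` and `format_event` are hypothetical helpers written for this example, and the event dict layout is assumed.

```python
def parse_microversion(value):
    """Turn a string such as "2.83" into a comparable (major, minor) tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def is_supported(requested, minimum):
    """True when the requested microversion is at least `minimum`.

    Comparing integer tuples avoids the string/float trap where
    "2.9" would sort after "2.83".
    """
    return parse_microversion(requested) >= parse_microversion(minimum)


def format_event(event, requested_version):
    """Build the response body for one instance action event (assumed layout)."""
    body = {"event": event["event"], "result": event["result"]}
    # Below 2.83 the key is left out entirely, so there is nothing to
    # document as "null for older versions" and no policy to evaluate.
    if is_supported(requested_version, "2.83"):
        body["details"] = event.get("details")
    return body
```

With this shape, a 2.82 request never sees the key at all, which is why the api-ref only needs to say the field is new in 2.83.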
*** spatel has joined #openstack-nova | 01:56 | |
openstackgerrit | Brin Zhang proposed openstack/nova master: Expose instance action event details out of the API https://review.opendev.org/694430 | 02:05 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add instance actions v283 samples test https://review.opendev.org/706251 | 02:05 |
brinzhang_ | gmann: done, thanks | 02:05 |
brinzhang_ | because of the policy name changes, I have tested it locally, and it works fine. | 02:05 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Expose instance action event details out of the API https://review.opendev.org/694430 | 02:08 |
openstackgerrit | Brin Zhang proposed openstack/nova master: Add instance actions v283 samples test https://review.opendev.org/706251 | 02:08 |
openstackgerrit | melanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929 https://review.opendev.org/701478 | 02:11 |
openstack | bug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/1844929 | 02:11 |
*** macz_ has joined #openstack-nova | 02:17 | |
gmann | brinzhang_: after reading the 2.62 changes, i think we do not need the new policy. we already have existing policy which can control the info from non-admin. commented on spec also | 02:20 |
gmann | let's wait for sean-k-mooney reply. https://review.opendev.org/#/c/694430/13/nova/policies/instance_actions.py@52 | 02:20 |
*** gyee has quit IRC | 02:21 | |
*** macz_ has quit IRC | 02:21 | |
gmann | i think sean-k-mooney's concern was that we should not pass this info to non-admin, which is controlled via the existing policy also, like host info in 'events' - https://github.com/openstack/nova/blob/f454e1dec9580abf4605e071bdd678a40f492a49/nova/api/openstack/compute/instance_actions.py#L170 | 02:22 |
*** ianw has quit IRC | 02:30 | |
brinzhang_ | gmann: I cannot open github, you mean the https://review.opendev.org/#/c/694430/13/nova/api/openstack/compute/instance_actions.py@172 | 02:35 |
brinzhang_ | gmann: the show_host = api_version_request.is_supported(req, '2.62')? | 02:35 |
gmann | brinzhang_: yeah, show_host with 2.62 | 02:35 |
*** ianw has joined #openstack-nova | 02:36 | |
*** ianw has quit IRC | 02:37 | |
gmann | brinzhang_: let's wait for sean-k-mooney's reply before making changes, in case I missed something | 02:37 |
brinzhang_ | gmann: the .BASE_POLICY_NAME % 'events' is a rule_admin policy | 02:37 |
openstackgerrit | Merged openstack/nova stable/train: libvirt: Provide the backing file format when creating qcow2 disks https://review.opendev.org/710788 | 02:37 |
brinzhang_ | gmann: ok, let sean-k-mooney check again | 02:38 |
*** ianw has joined #openstack-nova | 02:40 | |
gmann | yeah event policy is admin by default | 02:40 |
brinzhang_ | so show_host can be shown for an admin user, I think the details policy (SYSTEM_READER) is suitable for now. | 02:42 |
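The combination discussed here (a field must be supported by the requested microversion and also allowed by a policy that is admin-only by default) can be sketched as below. Everything in it is an assumption for illustration: `authorize` is a toy stand-in for a real policy engine, the rule string is shortened, and the event layout is invented.

```python
# Toy stand-in for policy defaults: rules listed here require the admin role.
ADMIN_ONLY_RULES = {"os-instance-actions:events"}


def authorize(roles, rule):
    """Return True if a caller holding `roles` passes `rule` (toy check)."""
    if rule in ADMIN_ONLY_RULES:
        return "admin" in roles
    return True


def format_event(event, roles, supports_2_62):
    """Show sensitive fields only when version and policy both allow them."""
    body = {"event": event["event"], "result": event["result"]}
    show_sensitive = authorize(roles, "os-instance-actions:events")
    if show_sensitive:
        body["traceback"] = event.get("traceback")
    if supports_2_62:
        # The obfuscated host id is shown to everyone at this version...
        body["hostId"] = event.get("hostId")
        # ...while the real hostname stays behind the same policy check.
        if show_sensitive:
            body["host"] = event.get("host")
    return body
```

Reusing one rule for several fields keeps the policy file small, but it also means an operator cannot expose one of those fields to non-admins without exposing the others, which is the trade-off behind the request for a separate 'details' policy.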
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing flavor_manage policies https://review.opendev.org/714814 | 02:44 |
openstackgerrit | Brin Zhang proposed openstack/nova-specs master: [Trivial] Remove note for the implementation https://review.opendev.org/714817 | 02:47 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: bug-fix: set do_cleanup always True for libvirt driver https://review.opendev.org/714593 | 03:00 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: support live migration with vpmems https://review.opendev.org/687856 | 03:00 |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Track orphan instances and error migrations in resource tracker https://review.opendev.org/714653 | 03:00 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-flavor-manage https://review.opendev.org/714818 | 03:02 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-flavor_manage policies https://review.opendev.org/714819 | 03:18 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass the actual target in os-flavor-manage policy https://review.opendev.org/714822 | 03:25 |
*** psachin has joined #openstack-nova | 03:32 | |
*** ociuhandu has joined #openstack-nova | 03:34 | |
*** ociuhandu has quit IRC | 03:38 | |
*** udesale has joined #openstack-nova | 04:51 | |
openstackgerrit | melanie witt proposed openstack/nova master: DNM: try to get some debug info for bug 1844929 https://review.opendev.org/701478 | 04:51 |
openstack | bug 1844929 in OpenStack Compute (nova) "grenade jobs failing due to "Timed out waiting for response from cell" in scheduler" [High,Confirmed] https://launchpad.net/bugs/1844929 | 04:51 |
*** ratailor has joined #openstack-nova | 05:00 | |
*** vishalmanchanda has joined #openstack-nova | 05:03 | |
*** spatel has quit IRC | 05:04 | |
openstackgerrit | Luyao Zhong proposed openstack/nova stable/train: bug-fix: Reject live migration with vpmem https://review.opendev.org/714064 | 05:26 |
*** links has joined #openstack-nova | 05:29 | |
*** evrardjp has quit IRC | 05:36 | |
*** ociuhandu has joined #openstack-nova | 05:36 | |
*** evrardjp has joined #openstack-nova | 05:36 | |
*** TxGirlGeek has quit IRC | 05:40 | |
*** TxGirlGeek has joined #openstack-nova | 05:40 | |
*** ociuhandu has quit IRC | 05:41 | |
*** TxGirlGeek has quit IRC | 05:45 | |
*** dklyle has quit IRC | 05:50 | |
*** macz_ has joined #openstack-nova | 05:53 | |
*** macz_ has quit IRC | 05:58 | |
*** xek_ has joined #openstack-nova | 06:07 | |
openstackgerrit | Qiu Fossen proposed openstack/nova master: The instance is volume backed and power state is PAUSED,shelve the instance failed https://review.opendev.org/711609 | 06:12 |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Mask the token used to allow access to consoles https://review.opendev.org/708876 | 06:40 |
openstackgerrit | Elod Illes proposed openstack/nova stable/pike: Avoid circular reference during serialization https://review.opendev.org/714148 | 06:40 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Delete ARQs for an instance when the instance is deleted. https://review.opendev.org/673735 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Enable hard/soft reboot with accelerators. https://review.opendev.org/697940 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Enable start/stop of instances with accelerators. https://review.opendev.org/699553 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Enable and use COMPUTE_ACCELERATORS trait. https://review.opendev.org/699554 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Bump compute rpcapi version and reduce Cyborg calls. https://review.opendev.org/704227 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Block unsupported instance operations with accelerators. https://review.opendev.org/674726 | 06:48 |
openstackgerrit | Sundar Nadathur proposed openstack/nova master: Add cyborg tempest job. https://review.opendev.org/670999 | 06:48 |
*** tetsuro has quit IRC | 07:13 | |
*** xek_ has quit IRC | 07:14 | |
*** xek_ has joined #openstack-nova | 07:14 | |
*** vesper11 has quit IRC | 07:16 | |
*** vesper has joined #openstack-nova | 07:16 | |
*** belmoreira has joined #openstack-nova | 07:46 | |
*** dpawlik has joined #openstack-nova | 07:47 | |
*** nightmare_unreal has joined #openstack-nova | 07:48 | |
*** maciejjozefczyk has joined #openstack-nova | 07:52 | |
*** tesseract has joined #openstack-nova | 07:58 | |
*** slaweq has joined #openstack-nova | 07:59 | |
*** ralonsoh has joined #openstack-nova | 08:01 | |
openstackgerrit | Luyao Zhong proposed openstack/nova master: Track orphan instances and error migrations in resource tracker https://review.opendev.org/714653 | 08:13 |
*** sapd1_x has joined #openstack-nova | 08:18 | |
*** tkajinam has quit IRC | 08:19 | |
*** stephenfin has quit IRC | 08:19 | |
*** amoralej|off is now known as amoralej | 08:22 | |
*** ileixe has joined #openstack-nova | 08:23 | |
*** tosky has joined #openstack-nova | 08:24 | |
*** tetsuro has joined #openstack-nova | 08:26 | |
nightmare_unreal | openstackgerrit: | 08:26 |
*** dtantsur|afk is now known as dtantsur | 08:29 | |
*** stephenfin has joined #openstack-nova | 08:30 | |
*** rpittau|afk is now known as rpittau | 08:31 | |
*** slaweq has quit IRC | 08:41 | |
luyao | brinzhang_: https://review.opendev.org/#/c/678451/ was moved to https://review.opendev.org/#/c/714653, I added more test cases, thanks for your review | 08:44 |
lyarwood | elod: morning, https://review.opendev.org/#/c/713961/ if you have time, almost finished with these now. | 08:55 |
brinzhang_ | luyao: Thanks, got it, I will check later (I have some docs I need to complete). | 09:00 |
luyao | brinzhang_: thanks | 09:02 |
brinzhang_ | luyao:np ^^ | 09:02 |
*** ociuhandu has joined #openstack-nova | 09:04 | |
luyao | lyarwood: I have a separate patch to address the 'do_cleanup' flag issue, could you look at it again? https://review.opendev.org/#/c/714593 | 09:04 |
lyarwood | yup can try today | 09:05 |
luyao | lyarwood: thanks :) | 09:05 |
*** tetsuro has quit IRC | 09:05 | |
luyao | elod: thanks for review, comments addressed https://review.opendev.org/#/c/714064/ | 09:07 |
*** ociuhandu has quit IRC | 09:08 | |
*** slaweq has joined #openstack-nova | 09:18 | |
brinzhang_ | sean-k-mooney: https://review.opendev.org/#/c/694430/ and https://review.opendev.org/#/c/699669/ need your review. gmann gave a -1 with a question about the new policy for showing events:details; I think it's necessary and it fits my case. I left my comment in the spec. | 09:18 |
* gibi got hit by a list of downstream issues so will mostly be off today | 09:19 | |
openstackgerrit | jayaditya gupta proposed openstack/nova master: Support for nova-manage placement heal_allocations --cell https://review.opendev.org/714459 | 09:28 |
* nightmare_unreal checks if /me works here | 09:36 | |
* nightmare_unreal it works | 09:36 | |
openstackgerrit | Huaqiang Wang proposed openstack/nova master: Refactor the code in checking available host CPUs https://review.opendev.org/714657 | 09:36 |
openstackgerrit | Huaqiang Wang proposed openstack/nova master: Introduce 'MIXED' CPU allocation policy for instance https://review.opendev.org/713354 | 09:36 |
openstackgerrit | Huaqiang Wang proposed openstack/nova master: Introduce the interface of creating 'MIXED' policy instance through 'PCPU' and 'VCPU' https://review.opendev.org/713355 | 09:36 |
openstackgerrit | Huaqiang Wang proposed openstack/nova master: metadata: export the vCPU IDs that are pinning on the host CPUs https://review.opendev.org/688936 | 09:36 |
huaqiang | stephenfin: nice! you quickly fixed so much for mixed-instance bp! | 09:37 |
huaqiang | I'd like to follow your code and also contribute | 09:37 |
stephenfin | huaqiang: Feel free to take ownership of the whole lot of them, if they're helpful | 09:38 |
huaqiang | so I also updated the code to address the comments already made | 09:38 |
huaqiang | You are the guarantee of the code | 09:39 |
huaqiang | If you'd like me to do something, I'd be glad to | 09:39 |
huaqiang | maybe starting from testing your code? | 09:39 |
huaqiang | I see not all tests passed | 09:39 |
stephenfin | Oh, I hadn't checked that yet. Let me respin things to fix those | 09:40 |
stephenfin | Then we can figure out if any of them are useful | 09:40 |
*** martinkennelly has joined #openstack-nova | 09:41 | |
huaqiang | I'll spend about two hours testing your patches that are not marked with 'WIP', if you haven't tested them yourself. or you can tell me which patches need more testing | 09:43 |
stephenfin | I think everything not marked in WIP is potentially useful | 09:44 |
stephenfin | The WIP patches duplicate your work so I'll probably abandon my ones. I wrote those WIP patches last week before you submitted the new revision | 09:45 |
huaqiang | I'd like to know if you will continue these 'WIP' patches? | 09:45 |
huaqiang | ok | 09:45 |
huaqiang | a lot of them are similar | 09:45 |
stephenfin | I won't. You've already done that work | 09:45 |
huaqiang | got it. | 09:45 |
huaqiang | I'd like to take the responsibility | 09:46 |
*** zhanglong has quit IRC | 09:51 | |
*** ivve has joined #openstack-nova | 09:53 | |
*** Liang__ has quit IRC | 09:57 | |
elod | lyarwood: hi, +W'd | 10:00 |
lyarwood | elod: many thanks :) | 10:01 |
elod | luyao: thanks, looks good to me, +2 | 10:01 |
*** factor has joined #openstack-nova | 10:04 | |
elod | lyarwood: thanks, too :) Now there are a bunch of patches in the gate queue in pike, hope we won't hit too many error_extending failures :S | 10:04 |
*** tesseract has quit IRC | 10:05 | |
*** tesseract has joined #openstack-nova | 10:07 | |
lyarwood | elod: I noticed the tempest job had failed a few times, was that the issue? | 10:12 |
lyarwood | elod: I haven't had time to look into it yet but wanted to later today | 10:12 |
lyarwood | elod: really want to flush the stable/pike queue once and for all :) | 10:12 |
elod | lyarwood: yes, most of them are that failure :S yes, stable/pike will look nice if everything will be merged in the queue \o/ | 10:14 |
*** ociuhandu has joined #openstack-nova | 10:14 | |
*** rcernin has quit IRC | 10:17 | |
lyarwood | elod: I'll take a look later today | 10:19 |
elod | lyarwood: thanks! AFAIK that issue is not a new one, but strange that we hit now in that number... maybe because there were not so many check/gate runs towards pike in the last months | 10:26 |
*** maciejjozefczyk_ has joined #openstack-nova | 10:27 | |
*** trident has quit IRC | 10:29 | |
*** maciejjozefczyk has quit IRC | 10:29 | |
*** trident has joined #openstack-nova | 10:31 | |
*** trident has quit IRC | 10:33 | |
*** trident has joined #openstack-nova | 10:37 | |
lyarwood | elod: https://review.opendev.org/#/c/697523/ - I don't think we can workaround the issue on stable/pike within c-vol, thoughts on blacklisting the specific test in our compute jobs? | 10:44 |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 10:44 |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 10:44 |
*** lbragstad has joined #openstack-nova | 10:45 | |
johnthetubaguy | gibi: stephenfin: I noticed we were all talking about this patch, it seem very like one of the big ironic pain points, so I fixed up my worries: https://review.opendev.org/#/c/575034 | 10:47 |
johnthetubaguy | belmoreira: I am wondering if you have seen this patch, and if it would help your powersync issues at all: https://review.opendev.org/#/c/575034 | 10:49 |
belmoreira | johnthetubaguy: no, let me have a look | 10:51 |
johnthetubaguy | its from the vmware folks, which I guess see related issues | 10:52 |
*** ociuhandu has quit IRC | 10:52 | |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/pike: tempest: Avoid bug #1796708 on slower stable/pike CI hosts https://review.opendev.org/714915 | 10:57 |
openstack | bug 1796708 in Cinder "VolumesExtendTest.test_volume_extend_when_volume_has_snapshot intermittently fails with "Extend volume failed.: VolumeNotDeactivated: Volume volume-5514a6ad-abbb-46b3-a464-d73cc67e55af was not deactivated in time."" [Medium,Confirmed] https://launchpad.net/bugs/1796708 | 10:57 |
lyarwood | elod: ^ lets skip that test for now. | 10:57 |
*** ociuhandu has joined #openstack-nova | 11:01 | |
elod | lyarwood: usually i don't like disabling tests, but maybe now that is OK for that test on pike, especially since that bug has been open since 2018 and has no fix. Let's see if the regex works well in your patch :) | 11:03 |
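The idea of skipping one test by regex can be illustrated with a negative lookahead, which rejects any test ID containing the unwanted name while still requiring the usual prefix. This is a sketch only: the pattern is not the one from the patch under review, and the fully qualified test IDs below are assumed for the example (only the class and method names come from the bug title).

```python
import re

# Name of the test to exclude, taken from the bug title.
SKIPPED_TEST = "test_volume_extend_when_volume_has_snapshot"

# (?!...) fails the match if the skipped name appears anywhere in the ID;
# otherwise the ID must start with "tempest.api".
SELECT = re.compile(r"^(?!.*" + re.escape(SKIPPED_TEST) + r")tempest\.api")


def select_tests(test_ids):
    """Return only the test IDs the pattern would run."""
    return [test_id for test_id in test_ids if SELECT.search(test_id)]


# Assumed, illustrative test IDs.
CANDIDATES = [
    "tempest.api.volume.test_volumes_extend.VolumesExtendTest."
    "test_volume_extend_when_volume_has_snapshot",
    "tempest.api.volume.test_volumes_extend.VolumesExtendTest."
    "test_volume_extend",
    "tempest.scenario.test_minimum_basic.TestMinimumBasicScenario."
    "test_minimum_basic_scenario",
]
```

Calling `select_tests(CANDIDATES)` keeps only the plain `test_volume_extend` entry: the snapshot variant is rejected by the lookahead and the scenario test by the prefix.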
*** sapd1 has quit IRC | 11:04 | |
*** sapd1_x has quit IRC | 11:06 | |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 11:12 |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 11:13 |
*** tosky is now known as tosky_ | 11:24 | |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit https://review.opendev.org/615180 | 11:33 |
*** tosky_ is now known as tosky | 11:40 | |
openstackgerrit | Merged openstack/nova stable/pike: Avoid circular reference during serialization https://review.opendev.org/714148 | 11:40 |
openstackgerrit | Merged openstack/nova stable/pike: Mask the token used to allow access to consoles https://review.opendev.org/708876 | 11:40 |
openstackgerrit | Merged openstack/nova stable/pike: Remove exp legacy-tempest-dsvm-full-devstack-plugin-nfs https://review.opendev.org/702061 | 11:40 |
lyarwood | well well well | 11:41 |
lyarwood | looks like we got some faster CI nodes on that run | 11:41 |
*** rpittau is now known as rpittau|bbl | 11:42 | |
*** eharney has quit IRC | 11:44 | |
*** eharney has joined #openstack-nova | 11:49 | |
*** spatel has joined #openstack-nova | 11:53 | |
luyao | lyarwood: Hi, thanks for your quick comments on 'do_cleanup' flag bug-fix https://review.opendev.org/#/c/714593 | 11:55 |
*** spatel has quit IRC | 11:58 | |
lyarwood | luyao: np, I agree this needs cleaning up, I just don't think setting do_cleanup to True is the correct way of doing it for now. | 12:00 |
*** amodi has joined #openstack-nova | 12:00 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: [Community goal] Update contributor documentation https://review.opendev.org/712420 | 12:01 |
luyao | lyarwood: that's what I want to ask, I'm confused about that. do you mean I need another cleanup method to do the cleanup, instead of invoking driver.cleanup directly or rollback_live_migration_at_destination? | 12:02 |
lyarwood | luyao: yes, I think something like live_migration_cleanup_source and live_migration_cleanup_destination would be better instead of overloading cleanup itself | 12:04 |
lyarwood | elod: fun, we can't limit the regex used by the tempest-full jobs as they are using this tox env to run the commands - https://github.com/openstack/tempest/blob/51fe1ae61bed5d62c18864748520db25144f6db9/tox.ini#L103-L116 | 12:05 |
luyao | lyarwood: thinking... what's the difference between the new cleanup method and the existing one | 12:05 |
lyarwood | luyao: the existing one duplicates lots of cleanup already handled during the live migration flow | 12:07 |
lyarwood | luyao: we already unplug VIFs, disconnect volumes etc on success | 12:07 |
lyarwood | luyao: and depending on the failure we also do it there | 12:07 |
lyarwood | luyao: IMHO we should break up the cleanup method into smaller private methods that handle each aspect of this and use them only when required during the LM flows | 12:08 |
elod | lyarwood: :-/ anyway, at least a couple of patches got merged. so maybe recheck is enough for now as we don't have that many pike patches... | 12:09 |
lyarwood | I'll post a change removing it so if it does end up blocking things we can still remove it | 12:10 |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/pike: zuul: Remove tempest-full from the gate due to bug #1796708 https://review.opendev.org/714915 | 12:11 |
openstack | bug 1796708 in Cinder "VolumesExtendTest.test_volume_extend_when_volume_has_snapshot intermittently fails with "Extend volume failed.: VolumeNotDeactivated: Volume volume-5514a6ad-abbb-46b3-a464-d73cc67e55af was not deactivated in time."" [Medium,Confirmed] https://launchpad.net/bugs/1796708 | 12:11 |
elod | lyarwood: sounds like a plan :) | 12:11 |
openstackgerrit | Marcin Juszkiewicz proposed openstack/nova master: fix typo in wrong cpu_model message https://review.opendev.org/714928 | 12:13 |
luyao | lyarwood: OK, got it, I'll look into the code and reply you in more detail on that patch, thanks :) | 12:16 |
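lyarwood's suggestion above (split the monolithic cleanup into small private steps and call only the ones a given live migration flow still needs) might look roughly like this. It is a toy sketch, not the libvirt driver: the class, the private helper names and the keyword flags are hypothetical, and only `live_migration_cleanup_source` and `live_migration_cleanup_destination` are names proposed in the discussion.

```python
class SketchDriver:
    """Toy driver that records which cleanup steps ran."""

    def __init__(self):
        self.calls = []

    # Hypothetical private helpers, one per aspect of cleanup.
    def _unplug_vifs(self, instance):
        self.calls.append("unplug_vifs")

    def _disconnect_volumes(self, instance):
        self.calls.append("disconnect_volumes")

    def _delete_instance_files(self, instance):
        self.calls.append("delete_instance_files")

    def cleanup(self, instance):
        """Monolithic path: always runs every step."""
        self._unplug_vifs(instance)
        self._disconnect_volumes(instance)
        self._delete_instance_files(instance)

    def live_migration_cleanup_source(self, instance, vifs_done=True,
                                      volumes_done=True):
        """Source-side cleanup that skips steps the flow already handled."""
        if not vifs_done:
            self._unplug_vifs(instance)
        if not volumes_done:
            self._disconnect_volumes(instance)
        self._delete_instance_files(instance)

    def live_migration_cleanup_destination(self, instance):
        """Destination-side rollback: undo everything set up there."""
        self._unplug_vifs(instance)
        self._disconnect_volumes(instance)
        self._delete_instance_files(instance)
```

The difference from simply forcing do_cleanup to True is that each caller states which work remains, so steps already performed earlier in the migration are not repeated.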
*** hrw has joined #openstack-nova | 12:20 | |
hrw | morning | 12:20 |
*** links has quit IRC | 12:27 | |
*** links has joined #openstack-nova | 12:28 | |
*** ratailor has quit IRC | 12:32 | |
openstackgerrit | Merged openstack/nova stable/train: nova-live-migration: Ensure subnode is fenced during evacuation testing https://review.opendev.org/713961 | 12:38 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes when using -blockdev https://review.opendev.org/696834 | 12:40 |
lyarwood | ^ gibi / kashyap / stephenfin ; rebased with a bug created and referenced for tracking if you have time to review. | 12:40 |
kashyap | lyarwood: Will look. Trying to investigate a different bug I was thrown at elsewhere | 12:41 |
*** udesale_ has joined #openstack-nova | 12:42 | |
*** udesale has quit IRC | 12:45 | |
hrw | kevinz: thanks for https://review.opendev.org/#/c/709494 - finally booted VM in qemu TCG | 12:58 |
*** spatel has joined #openstack-nova | 12:59 | |
kevinz | hrw: np, good to hear that :-D | 12:59 |
hrw | kevinz: I wonder how many aarch64 changes done in nova should be redone in libvirt ;d | 13:00 |
hrw | but then still would stay due to desync between projects | 13:00 |
*** rpittau|bbl is now known as rpittau | 13:00 | |
*** nweinber has joined #openstack-nova | 13:01 | |
kevinz | hrw: hope not so many, we just tweak tweak and tweak | 13:02 |
hrw | kevinz: -M virt as default feels like something for libvirt ;D | 13:04 |
*** ryneq has joined #openstack-nova | 13:04 | |
hrw | no, it is default there. it's qemu where it is not | 13:04 |
sean-k-mooney | do we still have the cpu feature flag check disabled for aarch64 | 13:04 |
kevinz | yes, there are some CPU-related methods in libvirt that are not implemented on aarch64 | 13:05 |
sean-k-mooney | so its not safe to use max as the default model if we want to support livemigration | 13:05 |
hrw | sean-k-mooney: anything around cpu features/model/passthrough on aarch64 is like walking on minefield | 13:05 |
sean-k-mooney | well in an upgrade case at least | 13:05 |
kevinz | sean-k-mooney: yes, I know that Kevin Zheng from Huawei is working on libvirt side to make that happen | 13:06 |
sean-k-mooney | hrw: well we have the info in /sys | 13:06 |
brinzhang_ | sean-k-mooney: did you look at https://review.opendev.org/#/c/694430/ and https://review.opendev.org/#/c/699669/3? gmann wants you to check that ^^ | 13:06 |
sean-k-mooney | libvirt is just not reading it | 13:06 |
sean-k-mooney | brinzhang_: no but ill look now | 13:06 |
kevinz | so hopefully live migration will work well on arm64 soon | 13:06 |
hrw | sean-k-mooney: can you remind me /sys path? | 13:06 |
brinzhang_ | sean-k-mooney: yeah, thanks. I think we need the new policy | 13:07 |
sean-k-mooney | hrw: actually it's in /proc/cpuinfo | 13:07 |
sean-k-mooney | hrw: the "flags" field is "Flags" on aarch64 | 13:07 |
sean-k-mooney | which prevents libvirt reading it | 13:07 |
sean-k-mooney | the model is also available | 13:08 |
hrw | /proc/cpuinfo... file which should just die | 13:08 |
sean-k-mooney | hrw it really should not | 13:08 |
sean-k-mooney | hrw: its the standard interface to report this info | 13:08 |
hrw | it is far from standard | 13:08 |
hrw | each arch has own way | 13:08 |
sean-k-mooney | yes but at least its a common location | 13:09 |
sean-k-mooney | otherwise you have to use more arcane cpuid checks and model specific registers | 13:09 |
hrw | yep | 13:10 |
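The point made above, that a parser keyed on one exact field name misses the CPU feature list when another architecture labels the line differently, can be sketched with a tolerant reader. This is an illustration, not libvirt's logic: the helper is hypothetical, the accepted key names are assumptions (the label varies by architecture and kernel, with "Features" being another spelling seen on Arm systems), and the sample text is made up.

```python
def parse_cpu_flags(cpuinfo_text, keys=("flags", "features")):
    """Return the CPU feature set from /proc/cpuinfo-style text.

    Field names are compared case-insensitively against `keys`, so
    "flags", "Flags" and "Features" are all accepted.
    """
    wanted = {key.lower() for key in keys}
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() in wanted:
            # Only the first matching line is used; per-CPU stanzas
            # normally repeat the same list.
            return set(value.split())
    return set()


# Made-up sample stanzas for two architectures.
X86_SAMPLE = "processor\t: 0\nflags\t\t: fpu vme sse2\n"
ARM_SAMPLE = "processor\t: 0\nFeatures\t: fp asimd crc32\n"
```

For example, `parse_cpu_flags(ARM_SAMPLE)` yields the set containing `fp`, `asimd` and `crc32`, whereas a reader restricted to the lowercase `flags` key would return nothing for that stanza.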
*** amoralej is now known as amoralej|lunch | 13:12 | |
kashyap | gibi: Yeah, the bitwise OR and logical OR of flags is always a bit confusing for me too; they look reasonable, see my comment: https://review.opendev.org/#/c/696834/12/nova/virt/libvirt/guest.py@773 | 13:17 |
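The confusion kashyap mentions between bitwise OR and logical OR of flags is easy to show in isolation. In this sketch the constants are hypothetical bit flags invented for the example, not the libvirt ones from the patch.

```python
# Hypothetical single-bit flags.
FLAG_SHALLOW = 1 << 0    # 0b01
FLAG_REUSE_EXT = 1 << 1  # 0b10

# Bitwise OR merges the bits, so both flags are set in the result.
combined_bitwise = FLAG_SHALLOW | FLAG_REUSE_EXT

# Logical OR returns its first truthy operand, so the second flag is lost.
combined_logical = FLAG_SHALLOW or FLAG_REUSE_EXT


def has_flag(flags, flag):
    """True when every bit of `flag` is set in `flags`."""
    return (flags & flag) == flag
```

Here `combined_bitwise` is 3 and contains both flags, while `combined_logical` is 1 and silently drops `FLAG_REUSE_EXT`, which is why flag masks passed to an API should be built with `|`.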
*** ociuhandu has quit IRC | 13:19 | |
*** ociuhandu has joined #openstack-nova | 13:20 | |
*** mriedem has joined #openstack-nova | 13:20 | |
*** ociuhandu has quit IRC | 13:25 | |
*** sapd1_x has joined #openstack-nova | 13:38 | |
gibi | kashyap, lyarwood: thanks. I'm +2 | 13:43 |
lyarwood | gibi: many thanks! | 13:44 |
dansmith | brinzhang_: I did, but I didn't understand what any of that had to do with why we need to use patch | 13:48 |
*** martinkennelly has quit IRC | 13:52 | |
*** martinkennelly has joined #openstack-nova | 13:53 | |
*** Liang__ has joined #openstack-nova | 13:55 | |
*** Liang__ is now known as LiangFang | 13:56 | |
*** liuyulong has joined #openstack-nova | 13:59 | |
*** amoralej|lunch is now known as amoralej | 14:00 | |
*** dswebb has joined #openstack-nova | 14:02 | |
*** happyhemant has joined #openstack-nova | 14:06 | |
*** maciejjozefczyk_ is now known as maciejjozefczyk | 14:19 | |
*** prometheanfire has quit IRC | 14:19 | |
*** prometheanfire has joined #openstack-nova | 14:23 | |
openstackgerrit | Maciej Kucia proposed openstack/nova master: SR-IOV passthrough: Check PF only if VF is enabled https://review.opendev.org/476642 | 14:23 |
openstackgerrit | Merged openstack/nova master: ksa auth conf and client for Cyborg access https://review.opendev.org/631242 | 14:25 |
*** psachin has quit IRC | 14:27 | |
openstackgerrit | Lee Yarwood proposed openstack/python-novaclient master: Microversion 2.83 - Stable device boot from volume rescue https://review.opendev.org/714956 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: virt: Provide block_device_info during rescue https://review.opendev.org/700811 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Add support for stable device rescue https://review.opendev.org/700812 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: compute: Report COMPUTE_RESCUE_BFV and check during rescue https://review.opendev.org/701429 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: compute: Extract _get_bdm_image_metadata into nova.utils https://review.opendev.org/705212 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue https://review.opendev.org/701431 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue https://review.opendev.org/701430 | 14:29 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM - Test stable device rescue tests with BFV instances https://review.opendev.org/710050 | 14:29 |
huaqiang | hello. I see many '_from_dict' methods in some NovaObject-based classes, but not in all classes, | 14:33 |
huaqiang | should I make it work for new field? | 14:33 |
openstackgerrit | Luigi Toscano proposed openstack/nova stable/ocata: Remove exp legacy-tempest-dsvm-full-devstack-plugin-nfs https://review.opendev.org/714958 | 14:34 |
nightmare_unreal | hey, how can one overwrite allocation for instance | 14:36 |
*** ociuhandu has joined #openstack-nova | 14:36 | |
huaqiang | nightmare_unreal: cool name :D | 14:38 |
*** tbachman has quit IRC | 14:38 | |
nightmare_unreal | huaqiang: thanks :D , it's just a nick I registered when I was more into gaming haha | 14:39 |
mriedem | nightmare_unreal: why do you want/need to? | 14:42 |
nightmare_unreal | working on this : https://bugs.launchpad.net/nova/+bug/1868997 | 14:42 |
openstack | Launchpad bug 1868997 in OpenStack Compute (nova) "option to overwrite allocations for instances" [Undecided,New] - Assigned to jayaditya gupta (jayssj11) | 14:42 |
mriedem | is that referring to the heal_allocations CLI? https://docs.openstack.org/nova/latest/cli/nova-manage.html#placement | 14:43 |
mriedem | we don't really need to track todos in the code with bug reports...so i'm not sure why someone opened that bug | 14:43 |
mriedem | or is that someone you? :) | 14:43 |
nightmare_unreal | yes that's me :) | 14:45 |
mriedem | ok https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L216 doesn't refer to a todo | 14:46 |
mriedem | oh wrong line, 2126 | 14:46 |
mriedem | https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L2126 | 14:46 |
nightmare_unreal | yup that one | 14:46 |
*** tbachman has joined #openstack-nova | 14:47 | |
nightmare_unreal | I also did the --cell one : https://review.opendev.org/#/c/714459/ | 14:47 |
nightmare_unreal | but it needs review | 14:47 |
mriedem | are you on belmiro's team at cern? | 14:47 |
nightmare_unreal | yup | 14:48 |
nightmare_unreal | new joinee | 14:48 |
gibi | nightmare_unreal: I will get back to https://review.opendev.org/#/c/714459/ hopefully tomorrow | 14:49 |
*** jraju__ has joined #openstack-nova | 14:49 | |
nightmare_unreal | thanks gibi | 14:49 |
mriedem | cool. welcome. i can leave some quick comments on ^ | 14:49 |
nightmare_unreal | sure | 14:49 |
*** links has quit IRC | 14:49 | |
gibi | mriedem: thanks! | 14:50 |
gibi | mriedem: do you miss reviewing nova code ? :) | 14:51 |
*** macz_ has joined #openstack-nova | 14:52 | |
*** dklyle has joined #openstack-nova | 14:52 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: WIP libvirt: Break up get_disk_mapping within blockinfo https://review.opendev.org/714962 | 14:54 |
*** sapd1_x has quit IRC | 14:55 | |
*** ociuhandu has quit IRC | 14:56 | |
mriedem | gibi: heal_allocations started as my baby so i'm partial | 14:57 |
*** udesale_ has quit IRC | 14:57 | |
*** brinzhang has joined #openstack-nova | 14:59 | |
gibi | :) | 15:00 |
mriedem | nightmare_unreal: ok comments inline | 15:00 |
nightmare_unreal | thanks :) | 15:00 |
mriedem | gibi: it's also nice to review something outside of github too | 15:01 |
brinzhang | sean-k-mooney: gmann: If we reuse the os-instance-actions:events policy, we want to expose noValidHost and other information to the non-admin, which cannot be done just by modifying that policy, right? | 15:01 |
mriedem | nightmare_unreal: as to your original question, this put_allocations method is the one that overwrites the allocations for an instance https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L1948 | 15:02 |
brinzhang | we don't want to expose the traceback to the non-admin user | 15:02 |
gibi | mriedem: I don't have too much experience with github but I imagine gerrit is a nicser interface | 15:02 |
gibi | nicer | 15:02 |
mriedem | gibi: just...different | 15:02 |
* nightmare_unreal coming from github world | 15:02 | |
mriedem | unified diff in github reviews isn't terrible | 15:03 |
nightmare_unreal | new to gerrit though | 15:03 |
nightmare_unreal | thanks mriedem , I will work on it and submit again. | 15:03 |
mriedem | err i should say split diff i guess to be like how i used gerrit | 15:03 |
mriedem | nightmare_unreal: so the way heal_allocations works is we determine if an instance needs healing and the conditional for that is here https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L1922 | 15:05 |
gmann | brinzhang: the os-instance-actions:events policy is admin by default | 15:05 |
mriedem | bypassing that is essentially a --force option or something like that | 15:05 |
sean-k-mooney | i prefer gerrit for review also. i'm not really a fan of the pull request workflow but it's still better than doing things by email | 15:05 |
gmann | so it would not expose that info to non-admin unless overridden to do so | 15:05 |
nightmare_unreal | mriedem: ah okay thanks for the guidance . | 15:06 |
mriedem | nightmare_unreal: the thing that gets tricky is probably the network port allocations logic in there since you can tell the command to skip that, so we could have some problems if you skip healing port allocations but then forcefully overwrite the existing allocations to match the current flavor | 15:06 |
mriedem | so that probably means if you add a --force option or something it probably needs to be mutually exclusive with --skip-port-allocations | 15:07 |
brinzhang | We need a policy to expose the details to non-admin (by modifying the default policy); if we just use the admin default, the traceback can already cover everything and we don't need to populate details | 15:07 |
mriedem | this is why functional tests are better for changes to this command because there are a lot of moving parts | 15:07 |
brinzhang | gmann | 15:08 |
nightmare_unreal | okay | 15:08 |
* mriedem goes back to his real job | 15:08 | |
*** ociuhandu has joined #openstack-nova | 15:09 | |
brinzhang | gmann:https://review.opendev.org/#/c/699669/3/specs/ussuri/approved/action-event-fault-details.rst@54 | 15:09 |
brinzhang | gmann: My use case is for non-admin; in the end we need to default to the system_reader role, but we should allow users to change the default policy to show the details to non-admin users | 15:11 |
brinzhang | gmann: That's why I insist on using the new policy for show details | 15:12 |
brinzhang | gmann: as gibi comment in https://review.opendev.org/#/c/699669/2/specs/ussuri/approved/action-event-fault-details.rst@128 | 15:15 |
luyao | lyarwood: Hi, I replied on https://review.opendev.org/#/c/714593/ in detail. As you commented, a new cleanup method might be better, but it's a really huge change and I think we need a bp. I'm willing to do this, but later; my target is vpmem live migration in this release. :) | 15:16 |
luyao | lyarwood: thanks again for your comments. :) | 15:16 |
brinzhang | dansmith: I replied to your comment in https://review.opendev.org/#/c/693828/, we discussed using the PATCH API for destroy-instance-with-datavolume in more detail in the spec | 15:18 |
*** eharney has quit IRC | 15:18 | |
brinzhang | dansmith: The spec https://review.opendev.org/#/c/580336/ | 15:18 |
lyarwood | luyao: if you just need to clean vpmem things up then you can add them to the existing cleanup methods outside of cleanup for live migration | 15:19 |
*** prometheanfire has quit IRC | 15:19 | |
lyarwood | luyao: post_live_migration or rollback_live_migration_at_destination etc | 15:19 |
*** macz_ has quit IRC | 15:20 | |
*** macz_ has joined #openstack-nova | 15:21 | |
*** eharney has joined #openstack-nova | 15:21 | |
luyao | lyarwood: I'd like to set do_cleanup True if there are vpmems, do you think it's OK? | 15:22 |
gmann | brinzhang: gibi so this is the bit we are missing. the os-instance-actions:events policy is admin only, and if the operator wants to show it to non-admin then traceback and host name etc can be seen by non-admin, which is all good because the operator wants to do so | 15:22 |
*** prometheanfire has joined #openstack-nova | 15:23 | |
gmann | and now we are showing the new field 'details' in the 'events' dict, so we will add 'details' only if the os-instance-actions:events policy passes. so the operator has to open os-instance-actions:events to non-admin first, and only then can non-admin see the new field 'details' | 15:24 |
gmann | even if we have a new policy for the 'details' field, the operator has to enable the 'events' policy for non-admin. | 15:24 |
gmann | i mean the operator cannot do 1. keep os-instance-actions:events for admin only and 2. new policy os-instance-actions:events:details for non-admin | 15:25 |
lyarwood | luyao: I'd rather not change the semantics for vpmems at all and just add cleanup for them directly in the required places as we do with vifs and volumes | 15:25 |
lyarwood | luyao: I'll add a comment once I've finished something | 15:25 |
brinzhang | gmann: the traceback records sensitive information, why expose it to the non-admin? | 15:26 |
gmann | brinzhang: we are embedding new field 'details' in 'events' dict which is already policy-configurable for non-admin. | 15:26 |
gmann | brinzhang: yeah, i am saying if event policy is admin how operator make 'details' to show to non-admin | 15:26 |
gmann | it is inside 'events' dict not outside | 15:27 |
gmann | https://review.opendev.org/#/c/694430/13/nova/api/openstack/compute/instance_actions.py@183 | 15:27 |
brinzhang | gmann: no, if the microversion 2.51, we can see the events dict https://opendev.org/openstack/nova/src/branch/master/nova/api/openstack/compute/instance_actions.py#L171 | 15:28 |
brinzhang | s/ if the microversion 2.51/ if the microversion >= 2.51 | 15:29 |
*** mlavalle has joined #openstack-nova | 15:30 | |
*** arxcruz|rover is now known as arxcruz | 15:30 | |
luyao | lyarwood: Thanks. I understand you mean we can add a separate method to clean up vpmem. Actually I had one such solution, but it means we need to add an rpc api to clean up destination vpmem, like rollback_live_migration_at_destination. alex_xu commented that it's very vpmem and libvirt specific; he hoped we could utilize the current cleanup method | 15:30 |
*** LiangFang has quit IRC | 15:32 | |
gmann | brinzhang: you are right on that. i missed that microversion change. | 15:32 |
brinzhang | gmann: so we don't have to pass os-instance-actions:events to show the 'events' dict. | 15:32 |
gmann | brinzhang: and hostId is also shown always to non-admin | 15:32 |
brinzhang | gmann: yes, hostId always shown to non-admin | 15:32 |
gmann | when and how operator will decide that he/she does not want to show traceback to non-admin but want to show 'details' which is nothing but error message for nova exception and exception names for other to non-admin. | 15:34 |
gmann | i mean we want to guard the 'details' field with admin by default but tell operator to enable for non-admin if he want to do with keeping traceback for admin only | 15:35 |
brinzhang | Yes, right | 15:35 |
gmann | actually that use case I am not getting. if new policy is non-admin by default then it make sense | 15:35 |
gmann | but we cannot make it non-admin by default because it may be info leak | 15:36 |
brinzhang | gmann: thanks ^^ | 15:36 |
brinzhang | yes, that why we set system_reader by default | 15:36 |
lyarwood | luyao: I'm confused, we already have these within the libvirt driver? You wouldn't need to add any RPC calls. | 15:36 |
gmann | brinzhang: i mean i cannot get the use case that is why i am finding difficulty to understand the use of new policy | 15:37 |
gmann | or question is like: how operator can decide the that 'details' which has admin related info expose to non-admin but not traceback | 15:38 |
brinzhang | yeah, in the spec if there is not have the policy limit, I think you will get that case firstly | 15:38 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue https://review.opendev.org/701431 | 15:39 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue https://review.opendev.org/701430 | 15:39 |
*** gyee has joined #openstack-nova | 15:40 | |
brinzhang | if the operator want to expose the details to the non-admin, they just need to change the default policy, is it? | 15:40 |
brinzhang | because the BASE_POLICY_NAME % 'events:details' just limit show details in 'events' dict | 15:41 |
luyao | lyarwood: we already have vpmem cleanup logic inside libvirt driver, driver.cleanup will invoke vpmem cleanup. | 15:41 |
*** sapd1 has joined #openstack-nova | 15:41 | |
lyarwood | luyao: so why can't we just call that specific logic from other places instead of calling the entire cleanup method? | 15:42 |
luyao | lyarwood: if I want to cleanup vpmems on destination host but do_cleanup is False, I need a rpc call for vpmem cleanup | 15:43 |
gmann | brinzhang: yeah i get that but my point is it is difficult for the operator to decide that 'details' (which can have a non-nova exception and so does leak infra info) can be shown to non-admin and traceback not. | 15:43 |
luyao | lyarwood: alternatively we can set do_cleanup to True, then rpc call rollback_live_migration_at_destination will be invoked, then vpmem cleanup will be called inside that | 15:44 |
brinzhang | gmann: if the details is non-nova exception, it will be just only show the Exception class name to the details | 15:45 |
gmann | brinzhang: if any operator ask how to use these two policy in different way (one allowed for admin and one for non-admin) then we would not have clear answer right ? | 15:45 |
brinzhang | gmann: pls see https://review.opendev.org/#/c/712697/ | 15:45 |
*** jraju__ has quit IRC | 15:47 | |
*** TxGirlGeek has joined #openstack-nova | 15:47 | |
luyao | lyarwood: so I asked could I set do_cleanup to True if there are vpmems | 15:48 |
brinzhang | gmann: 'traceback' shows the full exception info, including python paths and all the details. but 'details' only shows the formatted message if it's a nova exception; if it is a non-nova exception, we just show the simple info to the non-admin | 15:48 |
brinzhang | I don't think it is unclear | 15:48 |
luyao | lyarwood: Do I make it clear? | 15:49 |
lyarwood | luyao: yeah okay, that might be okay in the short term but I think after this we really need to clean this interface up | 15:49 |
gmann | brinzhang: that is what i was thinking to add in API side but serialize_args does. | 15:49 |
lyarwood | luyao: cleanup within libvirt is actually looking at migrate_data so why we are making the call to cleanup dependent on it is weird | 15:49 |
gmann | to handle the non nova exception details | 15:49 |
luyao | lyarwood: yeah agree | 15:49 |
brinzhang | gmann: you mean, something need I add in os-instance-action API? | 15:52 |
gmann | brinzhang: no i mean hiding detail about non nova exception but exception name itself can leak few info about driver used etc. | 15:54 |
gmann | can non-admin take action based on non-nova exception ? | 15:55 |
brinzhang | maybe try to do something that they can do, nothing else | 15:56 |
gmann | i am thinking if we hide the non- nova exception from 'details' field and only expose the nova exception which is what use case of 'details' is for non-admin | 15:56 |
luyao | lyarwood: previously we only had the instance path files to clean up, but now we have other devices that need cleanup | 15:56 |
gmann | dansmith: ^^ ? any use case of keeping non-nova exception name in action event 'details' field. | 15:56 |
gmann | admin anyways can see all details from traceback | 15:57 |
brinzhang | gmann: thanks, I am sorry it's too late for me, I have to go. | 15:58 |
gmann | so that we can keep new field 'details' usable and no info leak for non-admin | 15:58 |
gmann | brinzhang: ah sorry. yeah. I will reply on review. thanks for discussion and late night. | 15:58 |
luyao | lyarwood: we can also add a flag in the libvirt migrate data to tell whether there are vpmems that need cleanup; I'm not sure if that is necessary? | 15:59 |
brinzhang | gmann:We only show non-nova exception class name to users, I don't think it will cause serious information leakage. | 15:59 |
brinzhang | gmann: this serialize_args change comes from mriedem and dansmith, if they are around, I think you can get more. | 16:01 |
brinzhang | gmann: thanks too, bye | 16:01 |
sean-k-mooney | luyao: we had to do host cleanup before for things other than the instance files | 16:01 |
sean-k-mooney | luyao: like removing mounted volumes, cleaning up ports or other actions | 16:02 |
brinzhang | gmann: this is the original thought https://review.opendev.org/#/c/694428/9/nova/objects/instance_action.py@196 | 16:03 |
openstackgerrit | Lee Yarwood proposed openstack/python-novaclient master: Microversion 2.83 - Stable device boot from volume rescue https://review.opendev.org/714956 | 16:03 |
lyarwood | luyao: possibly, just need to jump on a call and I'll try to update the review again | 16:04 |
sean-k-mooney | brinzhang: for non admin i think they should only see the class name for nova exceptions too | 16:04 |
brinzhang | sean-k-mooney: yeah, agree, make sense to me too. | 16:04 |
sean-k-mooney | non admins usually don't have the access required to fix the cause of most nova exceptions | 16:05 |
luyao | sean-k-mooney: sorry, you mean post live migration? | 16:07 |
sean-k-mooney | luyao: yes we clean up those resources earlier in the function | 16:08 |
sean-k-mooney | luyao: so we unplug the guest interface on the ovs bridge for example | 16:08 |
sean-k-mooney | and we have to unmount any cinder volumes that were mounted on the source node | 16:09 |
luyao | sean-k-mooney: yeah, invoking driver.cleanup will not cleanup them again | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: virt: Provide block_device_info during rescue https://review.opendev.org/700811 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Add support for stable device rescue https://review.opendev.org/700812 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: compute: Report COMPUTE_RESCUE_BFV and check during rescue https://review.opendev.org/701429 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: compute: Extract _get_bdm_image_metadata into nova.utils https://review.opendev.org/705212 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Support boot from volume stable device instance rescue https://review.opendev.org/701431 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: api: Introduce microverion 2.83 allowing boot from volume rescue https://review.opendev.org/701430 | 16:09 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: DNM - Test stable device rescue tests with BFV instances https://review.opendev.org/710050 | 16:09 |
sean-k-mooney | luyao: ya i know | 16:10 |
sean-k-mooney | well with the flags you have set | 16:10 |
gmann | sean-k-mooney: brinzhang and that is what the use case actually is: expose something a non-admin could fix. maybe filtering or whitelisting the non-admin fixable exceptions would be better here ? | 16:10 |
gmann | or at least not expose the non-nova exception at all. | 16:10 |
sean-k-mooney | gmann: well i would guess any 4xx errors should be actionable by them in some way | 16:11 |
sean-k-mooney | if we are identifying them as client issues | 16:11 |
luyao | sean-k-mooney: yeah, and now I need driver.cleanup to cleanup vpmems | 16:11 |
gmann | sean-k-mooney: yeah most of them yes. few 404 might not be but I have not checked all exceptions but overall 4xx is in their range | 16:12 |
*** damien_r has quit IRC | 16:16 | |
luyao | sean-k-mooney, lyarwood: I'll be offline and can't respond promptly, so please leave comments on patch https://review.opendev.org/#/c/687856 if you have any suggestions about vpmem cleanup during live migration. Many Thanks. :) | 16:17 |
*** sapd1 has quit IRC | 16:24 | |
sean-k-mooney | luyao: sure | 16:24 |
sean-k-mooney | luyao: o/ | 16:25 |
*** damien_r has joined #openstack-nova | 16:27 | |
*** damien_r has quit IRC | 16:32 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing flavor_manage policies https://review.opendev.org/714814 | 16:32 |
*** liuyulong has quit IRC | 16:42 | |
*** ociuhandu has quit IRC | 16:47 | |
*** ociuhandu has joined #openstack-nova | 16:48 | |
*** ociuhandu has quit IRC | 16:53 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Reproduce bug 1869050 https://review.opendev.org/714997 | 16:53 |
openstack | bug 1869050 in OpenStack Compute (nova) "migration of anti-affinity server fails due to stale scheduler instance info" [Low,Triaged] https://launchpad.net/bugs/1869050 - Assigned to Balazs Gibizer (balazs-gibizer) | 16:53 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Update scheduler instance info at confirm resize https://review.opendev.org/714998 | 16:53 |
hrw | https://review.opendev.org/#/c/709494 - can someone take a look so aarch64 will be a bit better in nova? | 16:56 |
openstackgerrit | John Garbutt proposed openstack/nova master: Update quota sets APIs https://review.opendev.org/712749 | 16:58 |
openstackgerrit | John Garbutt proposed openstack/nova master: Tell oslo.limit how to count nova resources https://review.opendev.org/713301 | 16:58 |
*** damien_r has joined #openstack-nova | 16:59 | |
*** belmoreira has quit IRC | 17:01 | |
*** dtantsur is now known as dtantsur|afk | 17:06 | |
melwitt | kashyap: would you mind revisiting the aarch64 patch, it's been updated ^ | 17:06 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: WIP libvirt: Break up get_disk_mapping within blockinfo https://review.opendev.org/714962 | 17:09 |
sean-k-mooney | alex_xu: dansmith gibi so just did a evacuate test with the cyborg fake driver. http://paste.openstack.org/show/791153/ | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Update and correct typing information https://review.opendev.org/714694 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Add typing information https://review.opendev.org/714695 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: tests: Split instance NUMA object tests https://review.opendev.org/714696 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Replace 'cpu_pinning_requested' helper https://review.opendev.org/714697 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Don't consider overhead CPUs for unpinned instances https://review.opendev.org/714698 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Remove handling of pre-Train compute nodes https://review.opendev.org/714699 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Add validation for 'cpu_realtime_mask' https://review.opendev.org/468203 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Tweak the 'cpu_realtime_mask' handling slightly https://review.opendev.org/461456 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Rework 'get_realtime_constraint' https://review.opendev.org/714700 | 17:12 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: hardware: Invert order of NUMA topology generation https://review.opendev.org/714701 | 17:12 |
sean-k-mooney | alex_xu: dansmith gibi we can evacuate but it does not create allocation for the fpga | 17:12 |
*** rpittau is now known as rpittau|afk | 17:13 | |
*** derekh has joined #openstack-nova | 17:14 | |
sean-k-mooney | the arqs are also not updated http://paste.openstack.org/show/791154/ | 17:15 |
sean-k-mooney | ill update the block operation patch review with that info but currently we cannot evacuate properly. | 17:15 |
lyarwood | stephenfin: https://review.opendev.org/#/c/696834/ - not sure if you're still here but this should be ready now. | 17:16 |
* stephenfin clicks | 17:17 | |
stephenfin | lyarwood: done | 17:18 |
lyarwood | stephenfin: thanks | 17:19 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit https://review.opendev.org/615180 | 17:20 |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 17:21 |
openstackgerrit | melanie witt proposed openstack/nova stable/train: Add config option for neutron client retries https://review.opendev.org/715010 | 17:27 |
openstackgerrit | John Garbutt proposed openstack/nova master: Prevent compute manager freeze when greenpool is full https://review.opendev.org/575034 | 17:30 |
melwitt | lyarwood: I dunno if you saw my comment on this one https://review.opendev.org/708030 IIUC this is an option you're thinking to keep indefinitely, if so, it shouldn't go under [workarounds] as they're things intended to be temporary and removed | 17:31 |
*** evrardjp has quit IRC | 17:36 | |
*** evrardjp has joined #openstack-nova | 17:36 | |
*** ociuhandu has joined #openstack-nova | 17:39 | |
kashyap | melwitt: Hiya; will look at the AArch64 thing tom. in the AM. (Aside: just to keep you posted, I'm off from tomm. evening until 31) | 17:44 |
kashyap | (s/31/31st-Mar/) | 17:44 |
*** ociuhandu has quit IRC | 17:45 | |
kashyap | Actually, looking now | 17:45 |
melwitt | cool thanks! | 17:46 |
kashyap | melwitt: Okay, they went with the upstream QEMU AArch64 recomm. of model 'max'. Cool | 17:47 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Enforce resource limits using oslo.limit https://review.opendev.org/615180 | 17:52 |
*** tesseract has quit IRC | 18:03 | |
kashyap | stephenfin: melwitt: The release note contains a lot of not useful info, which will only confuse: https://review.opendev.org/#/c/709494/20 | 18:05 |
kashyap | stephenfin: melwitt: I suggested whittling it down to a couple of sentences. Hope that looks okay | 18:05 |
stephenfin | kashyap: yeah, I was iffy on that too but figured it was good enough. Now that there's two of us... | 18:06 |
kashyap | Maybe whoever is merging it can amend it? If it's not urgent, perhaps Kevin could respin | 18:06 |
kashyap | stephenfin: Hehe, much of it is verbatim from a review comment I made; looks odd to have "stream of consciousness" as a release note ;-) | 18:06 |
kashyap | Err, I myself made a grammar error; /me goes to fix | 18:07 |
kashyap | Alright; /me goes to make some dinner | 18:09 |
*** nightmare_unreal has quit IRC | 18:36 | |
*** irclogbot_2 has quit IRC | 18:37 | |
*** lbragstad has quit IRC | 18:57 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing hypervisors policies https://review.opendev.org/715029 | 18:57 |
*** amoralej is now known as amoralej|off | 18:59 | |
*** maciejjozefczyk has quit IRC | 19:01 | |
*** ociuhandu has joined #openstack-nova | 19:01 | |
*** irclogbot_1 has joined #openstack-nova | 19:02 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-hypervisors https://review.opendev.org/715036 | 19:10 |
*** ociuhandu has quit IRC | 19:13 | |
lyarwood | melwitt: yeah sorry was working my way down to these changes this week | 19:14 |
melwitt | dansmith: are you aware that in a vanilla devstack with one cell, we are getting [workarounds]disable_group_policy_check_upcall = True ? this is new to me | 19:14 |
lyarwood | melwitt: I'll respin and/or update in the morning. | 19:14 |
mriedem | it's intentional because of superconductor mode | 19:14 |
mriedem | melwitt: ^ | 19:15 |
dansmith | yeah, what mriedem said | 19:15 |
mriedem | otherwise affinity tests will blow up | 19:15 |
dansmith | so people either have to disable affinity or enable that workaround | 19:15 |
mriedem | there are a few things disabled by default like that in devstack | 19:15 |
melwitt | lyarwood: ok, np at all. just wanted to make sure in case you didn't see | 19:15 |
dansmith | because we still don't have affinity in placement | 19:15 |
mriedem | https://docs.openstack.org/nova/latest/user/cellsv2-layout.html#operations-requiring-upcalls | 19:16 |
mriedem | anything not marked complete in that list probably has some kind of flag to disable it in devstack | 19:16 |
mriedem | and we don't test cross_az_attach=false in the gate anywhere so...that just flies under the radar | 19:17 |
melwitt | mriedem, dansmith: thanks. yeah, I see that the logic is based on whether we have a superconductor going or not. just trying to work out whether we want or if there's way to set it False if we know everything's on the same MQ. context, we got a regression reported https://bugs.launchpad.net/nova/+bug/1863190 that's looking not like a regression since I saw [workarounds]disable_group_policy_check_upcall = True in the config | 19:17 |
openstack | Launchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New] | 19:17 |
*** macz_ has quit IRC | 19:17 | |
mriedem | i think there is at least one known latent multi-cell bug with how (anti-)affinity works, but i'm fuzzy on the details | 19:18 |
melwitt | two parallel anti-affinity requests seeming to violate policy in the single MQ deployment | 19:18 |
mriedem | if you don't have multiple cells then i guess that doesn't apply | 19:18 |
mriedem | if you have a single cell and are supporting anti-affinity then you need the late check enabled in the compute | 19:19 |
melwitt | yeah if you are multi MQ then you can't get anti-affinity if the requests hit at the same time | 19:19 |
melwitt | right | 19:19 |
mriedem | [workarounds]disable_group_policy_check_upcall = True means you're opting into the wildness | 19:19 |
melwitt | I was just thinking you'd think our default devstack with one MQ should set it False | 19:19 |
mriedem | it's false by default | 19:19 |
*** larainema has quit IRC | 19:20 | |
melwitt | it is, but something in the default devstack logic is setting it True | 19:20 |
melwitt | that is, I cloned the devstack repo and brought up a vanilla devstack and I'm getting it set to True | 19:20 |
mriedem | yeah, because superconductor is the default mode | 19:20 |
mriedem | superconductor is the ideal mode for deploying nova, so we test in the gate with that by default | 19:20 |
melwitt | yeah. and I'm thinking this sort of "bug" will keep being reported occasionally bc people not realizing what devstack is doing | 19:21 |
mriedem | it's a known limitation, link them to the docs | 19:21 |
mriedem | tempest has (anti)affinity tests as well but i don't think they make parallel requests | 19:22 |
mriedem | because of this | 19:22 |
melwitt | yeah. they have to be very parallel too. the first time I tried to repro I didn't get it but after a few tries I got it | 19:22 |
mriedem | right, i remember poking around in those tempest tests awhile back related to all of this | 19:24 |
*** ralonsoh has quit IRC | 19:26 | |
mriedem | oh also i think tempest tests the affinity stuff with 2 servers in the same create request - same (anti)affinity group, and that works b/c the scheduler knows about the decisions within the same request | 19:26 |
melwitt | ah, yeah | 19:26 |
mriedem | https://github.com/openstack/tempest/blob/7a588ded216f74ddd0015c3065d4fae10de2161f/tempest/api/compute/admin/test_servers_on_multinodes.py | 19:27 |
mriedem | and https://github.com/openstack/tempest/blob/f419f4d36fd0f99a9c53fe3a984d172b02e828c5/tempest/api/compute/servers/test_server_group.py | 19:27 |
mriedem | but of course you get nfv mano systems in the wild that are robots just firing off rapid requests | 19:28 |
mriedem | but those are likely single cell and shouldn't be disabling that late affinity check upcall :) | 19:28 |
mriedem | starlingx had patches for a lot of this server group stuff | 19:29 |
melwitt | yeah, I see | 19:29 |
mriedem | to try and mitigate some of it, but it wasn't all perfect either, e.g. locks within conductor but that would only lock *that* conductor (or scheduler) worker, not across all - unless you use an external locking mechanism, like a db or etcd or something | 19:29 |
melwitt | yeah, I know of one where they run with a single scheduler and serialize affinity requests | 19:30 |
melwitt | right | 19:30 |
mriedem | trying to make scheduling requests serialized for group based scheduling | 19:30 |
mriedem | for starlingx with like 1 node and 1 worker then it's probably fine | 19:30 |
* melwitt nods | 19:30 | |
melwitt | ok. I'll finish repro'ing the situation with the workaround set to False to make sure the late affinity check triggers, and write up something for the bug. and will link the doc | 19:32 |
*** irclogbot_1 has quit IRC | 19:37 | |
*** irclogbot_2 has joined #openstack-nova | 19:40 | |
*** irclogbot_2 has quit IRC | 19:42 | |
mriedem | melwitt: if you're so inclined and it's something that keeps coming up it might be worth writing something up in the troubleshooting docs, e.g. why are my servers that are in x policy landing on the same/different hosts when they shouldn't? | 19:44 |
mriedem | and explain the parallel issue | 19:44 |
mriedem | i found it nice to write something up once and then just point people to that | 19:45 |
mriedem | https://docs.openstack.org/nova/latest/admin/support-compute.html | 19:45 |
*** irclogbot_2 has joined #openstack-nova | 19:45 | |
*** macz_ has joined #openstack-nova | 19:50 | |
melwitt | mriedem: yup I think that's a good idea. I'll do that | 19:51 |
melwitt | thanks for suggesting | 19:51 |
*** martinkennelly has quit IRC | 19:54 | |
mriedem | my first few weeks on the new job were me in slack being like "why x? why y? where is z documented?" and then taking the answers and trying to document them to feel like i was useful | 19:55 |
melwitt | that's a good investment. every time I don't do that, I regret it later | 19:58 |
melwitt | and I usually forget because I have the memory recall of a hamster | 19:59 |
*** irclogbot_2 has quit IRC | 20:00 | |
*** ociuhandu has joined #openstack-nova | 20:01 | |
*** irclogbot_0 has joined #openstack-nova | 20:04 | |
*** irclogbot_0 has quit IRC | 20:12 | |
*** happyhemant has quit IRC | 20:15 | |
*** irclogbot_2 has joined #openstack-nova | 20:16 | |
*** irclogbot_2 has quit IRC | 20:16 | |
*** irclogbot_0 has joined #openstack-nova | 20:22 | |
*** ociuhandu has quit IRC | 20:27 | |
*** ociuhandu has joined #openstack-nova | 20:28 | |
*** ociuhandu has quit IRC | 20:33 | |
melwitt | johnsom: hey, finally got a chance to dig into the bug report you opened awhile back about anti-affinity, tl;dr is I don't find that there's been a regression. pls see my latest comment explaining https://bugs.launchpad.net/nova/+bug/1863190 | 20:37 |
openstack | Launchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New] | 20:37 |
johnsom | melwitt Ok, thank you for having a look. I got some feedback that it changed around queens, but I didn't go back and confirm either way. | 20:38 |
*** tosky has quit IRC | 20:40 | |
johnsom | melwitt My money is on that setting being the variable. | 20:41 |
melwitt | johnsom: it looks like it was likely a timing difference bc the change that disabled the late affinity upcall was back in pike https://review.opendev.org/477556 | 20:42 |
johnsom | Lol, that is "around" in OpenStack time. | 20:43 |
melwitt | around for certain values of around | 20:43 |
johnsom | Yep | 20:44 |
*** seba has quit IRC | 20:46 | |
*** seba has joined #openstack-nova | 20:46 | |
johnsom | rm_work FYI: https://bugs.launchpad.net/nova/+bug/1863190 comment 7 | 20:46 |
openstack | Launchpad bug 1863190 in OpenStack Compute (nova) "Server group anti-affinity no longer works" [Undecided,New] | 20:46 |
*** ericyoung has quit IRC | 20:47 | |
rm_work | hmm k | 20:47 |
rm_work | we switched to hard-anti-affinity and made sure we have retries enabled | 20:48 |
*** nweinber has quit IRC | 20:48 | |
melwitt | rm_work: you have to have your cell conductors and computes configured a certain way to be able to handle racing affinity requests. if you have one shared MQ the configs can be set to support it. if you have multiple MQs there's no enforcement of affinity for racing requests until affinity support is added to placement | 20:49 |
*** ericyoung has joined #openstack-nova | 20:49 | |
rm_work | k | 20:50 |
rm_work | our symptom was that a late check will catch it, but | 20:50 |
rm_work | for soft-anti-affinity, it doesn't actually BLOCK it | 20:50 |
rm_work | which means soft-anti-affinity is pretty useless | 20:50 |
rm_work | hard-anti-affinity it catches and sends for rescheduling, but our problem was we had reschedules disabled | 20:50 |
rm_work | we have since fixed that | 20:50 |
melwitt | yeah. I'm not sure what soft-anti-affinity is for ... give it a try and if not, meh I guess | 20:51 |
openstackgerrit | Merged openstack/nova master: libvirt: Use virDomainBlockCopy to swap volumes when using -blockdev https://review.opendev.org/696834 | 20:51 |
rm_work | it's supposed to be "best attempt" | 20:51 |
melwitt | yeah | 20:51 |
rm_work | so if you are down to only one schedulable HV, it'll still *work* because that's better than nothing | 20:51 |
rm_work | but in our case, using pack scheduling, it basically never does anything unless it catches up front | 20:51 |
rm_work | the late-catch will do nothing as it's still "valid" | 20:52 |
rm_work | which makes it not so useful | 20:52 |
melwitt | right | 20:52 |
melwitt | so what did you do? enable some retries? | 20:52 |
rm_work | switched to hard-aa and set scheduling retries to 3 (which is the original default, i think -- we had set it specifically to 0) | 20:53 |
*** tosky has joined #openstack-nova | 20:54 | |
melwitt | rm_work: yeah, ok. we do have a gap regarding the default pack scheduling + server group requests as mriedem mentioned earlier, and a way we could deal with that is to do something similar to starlingx where we serialize server group request claims at the scheduler, but we'd need to use a distributed lock since we have multiple scheduler workers. not something we already have in nova so would take more effort to add. would be a spec | 20:58 |
melwitt | and all | 20:58 |
*** mlavalle has quit IRC | 21:09 | |
*** mlavalle has joined #openstack-nova | 21:11 | |
*** xek_ has quit IRC | 21:17 | |
openstackgerrit | Merged openstack/nova stable/pike: Improve metadata server performance with large security groups https://review.opendev.org/697523 | 21:20 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-hypervisors policies https://review.opendev.org/715071 | 21:24 |
mriedem | melwitt: tbc, i think the solution for the affinity problem during scheduling likely involves placement as has been discussed for years, not serializing things like starlingx did as a workaround | 21:32 |
mriedem | but how that would work in placement has always been difficult to model | 21:32 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass the actual target in os-hypervisors policy https://review.opendev.org/715074 | 21:33 |
mriedem | but then placement is your distributed lock :) | 21:33 |
melwitt | mriedem: yeah, sorry, I was missing that the pack pattern problem with anti-affinity would go away with placement affinity | 21:33 |
melwitt | this stuff confuses the hell out of me | 21:33 |
*** derekh has quit IRC | 21:34 | |
melwitt | so, nix what I said earlier rm_work ^ | 21:35 |
*** dpawlik has quit IRC | 21:35 | |
mriedem | "if you have multiple MQs there's no enforcement of affinity for racing requests until affinity support is added to placement" is not quite true, you just need the conductors configured to hit the API DB (compute -> cell conductor -> API DB); devstack doesn't configure the cell conductor with the API DB connection so again, devstack is doing the *ideal* separate setup but not what most (if any) nova deployments are probably d | 21:40 |
mriedem | , | 21:40 |
mriedem | pretty sure most nova deployments just have the api db connection configured everywhere | 21:41 |
openstackgerrit | Merged openstack/nova stable/stein: nova-live-migration: Ensure subnode is fenced during evacuation testing https://review.opendev.org/713962 | 21:41 |
mriedem | and yeah you have to have reschedules enabled to....reschedule :) | 21:41 |
*** derekh has joined #openstack-nova | 21:42 | |
mriedem | the only things that do the late affinity check in the computes are server create and evacuate, so you can still violate affinity policy for other moves (unshelve, cold and live migrate) | 21:42 |
melwitt | yeah ... I was realizing that slowly regarding the difference between database access vs MQ access | 21:42 |
mriedem | and the only flows that reschedule today from the compute are create and cold migrate/resize | 21:42 |
mriedem | i think evacuate just fails the operation if you fail the late affinity check | 21:43 |
melwitt | I can't remember why that "impossible to contact bc MQ" was ever a thing wrt affinity | 21:44 |
*** derekh has quit IRC | 21:44 | |
*** ociuhandu has joined #openstack-nova | 21:45 | |
melwitt | was it before alternate_hosts became a thing maybe? | 21:45 |
melwitt | sigh ... have to correct my comment yet again | 21:46 |
*** ociuhandu has quit IRC | 21:49 | |
*** spatel has quit IRC | 21:52 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add test coverage of existing instance usage log policies https://review.opendev.org/715080 | 21:54 |
mriedem | alternate hosts just solved the problem of the cell conductor needing to go back to the scheduler and API DB | 21:55 |
mriedem | there is actually still an upcall bug there which i don't think got fixed, | 21:55 |
mriedem | during reschedule the conductor will update the instance.availability_zone for the alternate host and to do that it needs to hit the aggregates table which is in the API DB | 21:56 |
melwitt | yeah, I mean I had thought in the past it was said that the late affinity check would be impossible due to MQ isolation. but as you explained that's not true. so if that was ever said, I wondered why. it might have been before alternate hosts was added | 21:57 |
mriedem | err i guess i fixed that https://review.opendev.org/#/q/topic:bug/1781286+(status:open+OR+status:merged)+branch:master | 21:57 |
mriedem | anyway, it's time to social distance myself into the kitchen, o/ | 21:58 |
*** mriedem has left #openstack-nova | 21:58 | |
dansmith | melwitt: I think what you're thinking of is that we don't currently have a way for the child cell to know about the top-level mq, and I don't think we should | 22:01 |
dansmith | we do have a separate api_db connection string, so as a hack, child conductors can use that as if they were a top-level conductor to still hit that database, | 22:02 |
dansmith | which at least reduces the scope of who at the lower level can talk up, | 22:02 |
dansmith | and since we should be solving this in placement, we can just hang onto the status quo instead of further polluting the model by teaching everyone to call up | 22:02 |
dansmith | concerned people could give perms to the child conductors to only view the instance_groups and related tables I think, to further limit the scope of what it can see to just what is needed for that check | 22:03 |
melwitt | dansmith: thanks ... I think I am thinking of that. but I could have sworn that there was some previously discussed impossibility about it regarding separate MQs, I might have been mixing back before we had alternate hosts, how once you're in the cell you can't call the scheduler again to request a reschedule | 22:04 |
dansmith | ...because we don't have a way to tell those services about the upper mq like we do for the upper db | 22:05 |
melwitt | right. yeah, I do understand that. maybe I was thinking back to before we had alternate hosts and became able to reschedule without sharing a MQ | 22:05 |
dansmith | well, yeah, the alternate hosts thing was the only way we could reschedule without adding a similar upcall | 22:06 |
melwitt | and had tied that to the late affinity check in my head. I dunno | 22:06 |
dansmith | for the same reason | 22:06 |
dansmith | well, it's the same thing of course | 22:06 |
dansmith | it was just easier to solve that with pre-populating some alternates to chew through, whereas the affinity check is not so easy | 22:07 |
melwitt | oh ... guh | 22:07 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Introduce scope_types in os-instance-usage-audit-log https://review.opendev.org/715082 | 22:08 |
melwitt | dansmith: I think what confused me was that prior to alternate hosts, you needed to be able to access the scheduler's MQ to reschedule (right?) ... so an upcall to another MQ. and I didn't process that the late affinity check does not involve needing to upcall to another MQ to work, it only needs to upcall to the API DB | 22:20 |
dansmith | melwitt: yes, (re)schedule is an rpc call, whereas the affinity check is just a db operation | 22:21 |
melwitt | right, ok | 22:22 |
*** Jeffrey4l has quit IRC | 22:27 | |
*** Jeffrey4l has joined #openstack-nova | 22:29 | |
*** slaweq has quit IRC | 22:34 | |
*** vishalmanchanda has quit IRC | 22:35 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-instance-usage-audit-log policies https://review.opendev.org/715085 | 22:37 |
*** slaweq has joined #openstack-nova | 22:46 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass the actual target in os-instance-usage-audit-log policy https://review.opendev.org/715089 | 22:48 |
*** slaweq has quit IRC | 22:51 | |
*** brinzhang has quit IRC | 22:51 | |
*** tkajinam has joined #openstack-nova | 22:52 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add new default roles in os-instance-usage-audit-log policies https://review.opendev.org/715085 | 22:55 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Pass the actual target in os-instance-usage-audit-log policy https://review.opendev.org/715089 | 22:56 |
*** rcernin has joined #openstack-nova | 22:56 | |
openstackgerrit | melanie witt proposed openstack/nova master: Add info about affinity requests to the troubleshooting doc https://review.opendev.org/715092 | 23:18 |
*** efried_gone has quit IRC | 23:18 | |
openstackgerrit | Merged openstack/nova master: libvirt: Use oslo.utils >= 4.1.0 to fetch format-specific image data https://review.opendev.org/710785 | 23:18 |
openstackgerrit | Merged openstack/nova master: remove DISTINCT ON SQL instruction that does nothing on MySQL https://review.opendev.org/705850 | 23:19 |
openstackgerrit | melanie witt proposed openstack/nova master: Add info about affinity requests to the troubleshooting doc https://review.opendev.org/715092 | 23:20 |
*** macz_ has quit IRC | 23:24 | |
*** lbragstad has joined #openstack-nova | 23:25 | |
brinzhang_ | gmann: I would like us to catch not only 4xx errors; maybe we also need to get 500, so I would like to keep the exception name to populate the details if it is a non-nova exception | 23:49 |
brinzhang_ | gmann: for example: as a non-admin, if my server create failed because of NoValidHost (500) and I get this message, I know I can try again. Otherwise I cannot get any useful message | 23:51 |