*** brinzhang has joined #openstack-nova | 00:13 | |
*** tetsuro has joined #openstack-nova | 00:16 | |
*** swp20 has joined #openstack-nova | 00:39 | |
*** songwenping_ has joined #openstack-nova | 00:41 | |
*** swp20 has quit IRC | 00:45 | |
*** Liang__ has joined #openstack-nova | 01:11 | |
*** yaawang has quit IRC | 01:35 | |
*** yaawang has joined #openstack-nova | 01:36 | |
*** sapd1 has joined #openstack-nova | 01:48 | |
*** ircuser-1 has quit IRC | 02:19 | |
*** sapd1 has quit IRC | 03:14 | |
*** sapd1_x has joined #openstack-nova | 03:14 | |
*** slaweq has joined #openstack-nova | 03:17 | |
*** slaweq has quit IRC | 03:22 | |
*** tetsuro has quit IRC | 03:28 | |
*** ratailor has joined #openstack-nova | 03:56 | |
*** tetsuro has joined #openstack-nova | 04:16 | |
*** evrardjp has quit IRC | 04:36 | |
*** evrardjp has joined #openstack-nova | 04:36 | |
*** belmoreira has joined #openstack-nova | 04:42 | |
openstackgerrit | Elod Illes proposed openstack/nova stable/rocky: libvirt: check job status for VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED event https://review.opendev.org/711233 | 04:44 |
---|---|---|
*** songwenping_ has quit IRC | 04:46 | |
*** belmoreira has quit IRC | 04:46 | |
*** huaqiang has quit IRC | 04:54 | |
*** udesale has joined #openstack-nova | 05:15 | |
*** vishalmanchanda has joined #openstack-nova | 05:29 | |
*** songwenping_ has joined #openstack-nova | 05:32 | |
*** links has joined #openstack-nova | 05:52 | |
*** dpawlik has joined #openstack-nova | 06:10 | |
*** dpawlik has quit IRC | 06:13 | |
*** slaweq has joined #openstack-nova | 06:17 | |
*** dpawlik has joined #openstack-nova | 06:18 | |
*** alex_xu has joined #openstack-nova | 06:18 | |
openstackgerrit | Xinran WANG proposed openstack/os-resource-classes master: Add new resource class for QAT card. https://review.opendev.org/726314 | 06:40 |
*** lpetrut has joined #openstack-nova | 06:41 | |
*** maciejjozefczyk has joined #openstack-nova | 06:58 | |
*** ttsiouts has joined #openstack-nova | 06:59 | |
*** tinwood is now known as tinwood-afk | 07:06 | |
*** ccamacho has joined #openstack-nova | 07:11 | |
*** ccamacho has quit IRC | 07:12 | |
*** ccamacho has joined #openstack-nova | 07:29 | |
*** tosky has joined #openstack-nova | 07:34 | |
*** rpittau|afk is now known as rpittau | 07:36 | |
*** ralonsoh has joined #openstack-nova | 07:40 | |
*** belmoreira has joined #openstack-nova | 07:46 | |
*** nightmare_unreal has quit IRC | 07:47 | |
*** nightmare_unreal has joined #openstack-nova | 08:01 | |
*** dtantsur|afk is now known as dtantsur | 08:04 | |
gibi | good morning Nova | 08:05 |
*** tinwood-afk is now known as tinwood | 08:05 | |
lyarwood | Morning \o | 08:07 |
*** eandersson7 has joined #openstack-nova | 08:07 | |
*** arxcruz has quit IRC | 08:10 | |
*** arxcruz has joined #openstack-nova | 08:10 | |
*** kukacz_ has joined #openstack-nova | 08:11 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: [WIP] CI: add tempest-integrated-compute-aarch64 job https://review.opendev.org/714439 | 08:12 |
*** kukacz has quit IRC | 08:15 | |
*** eandersson has quit IRC | 08:15 | |
*** eandersson7 is now known as eandersson | 08:15 | |
openstackgerrit | Kevin Zhao proposed openstack/nova master: [WIP] CI: add tempest-integrated-compute-aarch64 job https://review.opendev.org/714439 | 08:24 |
*** ttsiouts has quit IRC | 08:24 | |
*** ttsiouts has joined #openstack-nova | 08:25 | |
*** martinkennelly has joined #openstack-nova | 08:26 | |
*** ttsiouts has quit IRC | 08:28 | |
*** ttsiouts has joined #openstack-nova | 08:28 | |
*** ttsiouts has quit IRC | 08:28 | |
*** ttsiouts has joined #openstack-nova | 08:29 | |
*** ttsiouts has quit IRC | 08:30 | |
*** ttsiouts has joined #openstack-nova | 08:31 | |
*** salmankhan has joined #openstack-nova | 08:35 | |
*** derekh has joined #openstack-nova | 08:37 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Remove 'NovaObjectDictCompat' from 'Migration' https://review.opendev.org/723572 | 08:37 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: objects: Remove 'NovaObjectDictCompat' from 'InstancePCIRequest' https://review.opendev.org/723573 | 08:37 |
*** salmankhan has quit IRC | 08:42 | |
*** salmankhan has joined #openstack-nova | 08:42 | |
*** links has quit IRC | 08:43 | |
*** links has joined #openstack-nova | 08:54 | |
*** xek has joined #openstack-nova | 09:03 | |
*** kevinz has quit IRC | 09:07 | |
brinzhang | gibi, bauzas: While researching how to define the nova-cyborg interaction notification, I found some code logic that is not suitable, so I | 09:07 |
brinzhang | submitted a patch to optimize it. Would you like to review https://review.opendev.org/#/c/726564/ | 09:07 |
*** kevinz has joined #openstack-nova | 09:07 | |
openstackgerrit | Liang Fang proposed openstack/nova master: [WIP] rbd patch for volume local cache https://review.opendev.org/726762 | 09:21 |
*** Liang__ has quit IRC | 09:30 | |
*** ttsiouts has quit IRC | 09:31 | |
*** ttsiouts has joined #openstack-nova | 09:32 | |
*** ttsiouts has quit IRC | 09:36 | |
bauzas | good morning Nova | 09:38 |
*** ttsiouts has joined #openstack-nova | 09:41 | |
*** priteau has joined #openstack-nova | 09:44 | |
*** happyhemant has joined #openstack-nova | 10:15 | |
*** rpittau is now known as rpittau|bbl | 10:15 | |
*** tetsuro has quit IRC | 10:26 | |
stephenfin | lyarwood, bauzas: would you do me the honours? https://review.opendev.org/#/c/710238/ | 10:33 |
* bauzas just has read too fast | 10:34 | |
stephenfin | hahaha | 10:34 |
bauzas | "would you do me the homous" ? | 10:34 |
* bauzas is hungry | 10:34 | |
gibi | I got an interesting support case downstream. Does nova calculate the disk usage of its own image cache on the compute? | 10:36 |
stephenfin | gibi: I don't think nova includes anything except instances in those calculation | 10:37 |
stephenfin | *s | 10:37 |
gibi | all the non-nova disk usage should be configured in reserved_host_disk_mb | 10:37 |
stephenfin | I'd expect image cache to be included in the reserved host config | 10:37 |
stephenfin | yeah | 10:37 |
gibi | for that I would need to know the maximum size of the image cache | 10:38 |
bauzas | stephenfin: I'm not a multi-attach specialist | 10:38 |
bauzas | but I wonder why we were avoiding QEMU>2.10 | 10:38 |
gibi | do we have a way to maximize the size of the nova image cache? | 10:38 |
gibi | I mean limit | 10:38 |
stephenfin | bauzas: I think kashyap explained that to me at some point. Let me look | 10:38 |
stephenfin | it's a weird conditional, for sure | 10:38 |
stephenfin | gibi: I'm not aware of any, but I suspect there must be something. /me looks | 10:39 |
gibi | I tried to find it but I failed | 10:39 |
bauzas | stephenfin: yeah I suspect something was borked | 10:39 |
bauzas | but this whole comment is confusing | 10:40 |
bauzas | it's an "or" clause | 10:40 |
bauzas | so in theory, we should only support multiattach if QEMU<2.10 | 10:40 |
bauzas | but I suspect the wording being incorrect, hence the confusion | 10:40 |
stephenfin | gibi: Yeah, I can't see anything either. Sounds like a gap :-\ | 10:42 |
stephenfin | bauzas: okay, the context is in https://bugzilla.redhat.com/show_bug.cgi?id=1378242 | 10:45 |
openstack | bugzilla.redhat.com bug 1378242 in libvirt "QEMU image file locking (libvirt)" [Unspecified,Closed: errata] - Assigned to pkrempa | 10:45 |
*** links has quit IRC | 10:45 | |
gibi | stephenfin: thanks for confirming. I will do a problem reproduction and file a bug but I feel this will be considered as a feature request | 10:45 |
stephenfin | tl;dr: QEMU added a feature that broke multi-attach, which necessitated a new libvirt feature to fix it again | 10:45 |
gibi | from upstream perspective | 10:45 |
bauzas | stephenfin: cool, and the bug is readable without being internal | 10:46 |
*** links has joined #openstack-nova | 10:46 | |
bauzas | stephenfin: okay, so the libvirt version is superseding the QEMU one | 10:47 |
stephenfin | gibi: Perhaps. A quick look suggests we have documented multiple error codes for the API though, so if it was classified as a bug, it should be a backportable one | 10:47 |
gibi | stephenfin: which API you are referring to? | 10:48 |
bauzas | stephenfin: worth respinning a better commit msg explaining this ? | 10:48 |
stephenfin | bauzas: Yup. You either need to use an older version of QEMU, or you need a newer version of libvirt to workaround the changes in newer QEMU | 10:48 |
stephenfin | bauzas: Good call. Let me do that | 10:48 |
bauzas | and the fact we have a recent QEMU isn't a problem since libvirt fixes this | 10:48 |
bauzas | stephenfin: thanks | 10:49 |
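The "or" clause stephenfin and bauzas settle on above can be sketched as below. This is a minimal illustration, not nova's actual check: QEMU 2.10 is the version named in the discussion, while `ASSUMED_LIBVIRT_FIX` and the function name `supports_multiattach` are hypothetical placeholders, since the log does not state which libvirt release carries the fix.

```python
from typing import Tuple

Version = Tuple[int, int, int]

# QEMU 2.10 is the version named in the discussion above.
QEMU_IMAGE_LOCKING: Version = (2, 10, 0)
# Assumption: placeholder for the libvirt release that works around the
# QEMU change; the log does not give this number.
ASSUMED_LIBVIRT_FIX: Version = (3, 10, 0)


def supports_multiattach(qemu: Version, libvirt: Version,
                         libvirt_fix: Version = ASSUMED_LIBVIRT_FIX) -> bool:
    """Return True if either side of the "or" clause holds."""
    old_qemu = qemu < QEMU_IMAGE_LOCKING   # predates the breaking feature
    new_libvirt = libvirt >= libvirt_fix   # carries the workaround
    return old_qemu or new_libvirt
```

With a recent libvirt the QEMU version no longer matters, which matches bauzas's summary that the libvirt version supersedes the QEMU one.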
stephenfin | gibi: The host aggregate image caching API. Commit 339129870692467b703220dbc3905fd8bffe6a83 | 10:49 |
gibi | stephenfin: ohh. This goes way beyond that. As nova cached images before that API was added. | 10:50 |
gibi | based on images download for new instance boots | 10:51 |
stephenfin | Ah, the old just-in-time caching behavior? | 10:51 |
gibi | jepp | 10:51 |
gibi | as far as I see that is also not limited in size | 10:51 |
lyarwood | gibi: is this with the libvirt virt driver? | 10:52 |
gibi | lyarwood: yes libvirt | 10:52 |
gibi | lyarwood: the instances dir is on local file system | 10:52 |
lyarwood | gibi: right then you're correct that isn't limited AFAIK | 10:52 |
gibi | isn't it even a security concern? Can I will the disk via the cache to prevent nova-compute for booting VMs? | 10:53 |
gibi | s/will/fill/ | 10:53 |
lyarwood | gibi: that should be taken into consideration when attempting to schedule instances to the node | 10:54 |
lyarwood | gibi: the cache is just a way of sharing the base image between instances | 10:54 |
gibi | lyarwood: but it doesn't as far as I see | 10:54 |
lyarwood | gibi: are you providing a unique image with every request? | 10:55 |
gibi | lyarwood: if I have a raw base image and two qcow2 guest image then I potentially use original image.size + instance 1 flavor.disk + instance 2 flavor.disk, but nova only calculate the instance 1 flavor.disk + instance 2 flavor.disk as used | 10:56 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: libvirt: Remove MIN_LIBVIRT_MULTIATTACH https://review.opendev.org/710238 | 10:57 |
stephenfin | bauzas: ^ | 10:57 |
gibi | (I ignore disk_available_least as placement does not use that, just the old removed DiskFilter used that) | 10:57 |
lyarwood | gibi: don't we report the adjusted available space to placement taking that into account? | 10:58 |
lyarwood | gibi: the size of the RAW base file that is | 10:58 |
lyarwood | gibi: and the potential size of the two qcow2 instance disks? | 10:58 |
lyarwood | _get_disk_over_committed_size_total ? | 10:59 |
lyarwood | ah right | 11:00 |
lyarwood | we don't actually report that back up sorry | 11:00 |
gibi | http://paste.openstack.org/show/793375/ I don't see adjustments in placement based on this | 11:00 |
lyarwood | ewww | 11:01 |
gibi | lyarwood: for me either the size of the cache should be configurable (and then we can account for that in reserved_host_disk_mb) or nova needs to report cached disk usage as used in placement | 11:02 |
lyarwood | gibi: yeah I'd say the latter, I honestly thought we did already. | 11:02 |
*** martinkennelly has quit IRC | 11:03 | |
gibi | lyarwood: this is a bug in pike for one of our customers so I also have to think about a backportable solution | 11:04 |
gibi | lyarwood: do you think that determining the size of the image cache is easy? | 11:04 |
gibi | is it just some file system calls in the _base dir, isn't it? | 11:05 |
lyarwood | gibi: yes for file based backends | 11:05 |
gibi | I have close to zero knowledge on non file based backends behavior | 11:06 |
*** jsuchome has joined #openstack-nova | 11:06 | |
lyarwood | gibi: we don't cache in rbd iirc | 11:07 |
* lyarwood spins up an env to play with this | 11:07 | |
gibi | is there a way to turn off the cache? | 11:07 |
gibi | for file based backend? (that would be a workaround in my downstream issue) | 11:07 |
lyarwood | gibi: I don't think so, you can disable the manager but I think we still cache things at creation time | 11:09 |
lyarwood | gibi: the manager just doesn't run to clean things up | 11:10 |
* lyarwood -> lunch back in 20 | 11:10 | |
gibi | lyarwood: thanks | 11:10 |
* gibi follows lyarwood's example and goes for food | 11:10 | |
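The two points raised above (the cache size being "just some file system calls in the _base dir", and the gap between what is counted as used and what is really on disk) can be sketched as follows. This is a rough illustration under stated assumptions, not nova code: `directory_size_bytes` and `free_space_views` are hypothetical helper names, file sizes are apparent sizes (`st_size`), so sparse files are overestimated, and instance disks are assumed to grow to their full flavor size (the worst case gibi describes).

```python
import os
from typing import Iterable, Tuple

GIB = 1024 ** 3


def directory_size_bytes(path: str) -> int:
    """Sum the apparent size of every file below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return total


def free_space_views(total_gb: float, reserved_gb: float,
                     flavor_disks_gb: Iterable[float],
                     cache_bytes: int) -> Tuple[float, float]:
    """Return (free GB counting flavors only, free GB including the cache)."""
    counted = total_gb - reserved_gb - sum(flavor_disks_gb)
    with_cache = counted - cache_bytes / GIB
    return counted, with_cache
```

For example, `free_space_views(100, 10, [20, 20], 5 * GIB)` returns `(50, 45.0)`: the second figure is what is actually left once a 5 GiB cached base image is included.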
*** songwenping_ has quit IRC | 11:14 | |
*** tbachman has joined #openstack-nova | 11:19 | |
*** songwenping_ has joined #openstack-nova | 11:32 | |
*** iurygregory has quit IRC | 11:37 | |
*** toabctl has quit IRC | 11:43 | |
openstackgerrit | Sasha Andonov proposed openstack/nova master: rbd_utils: increase _destroy_volume timeout https://review.opendev.org/705764 | 11:57 |
*** rpittau|bbl is now known as rpittau | 11:58 | |
*** iurygregory has joined #openstack-nova | 11:58 | |
nightmare_unreal | how can I determine default api version used by osc CLI ? suppose if we don't specify --os-compute-api-version , which version will it take and how it's determined ? | 12:07 |
*** raildo has joined #openstack-nova | 12:09 | |
lyarwood | nightmare_unreal: https://docs.openstack.org/api-guide/compute/microversions.html#version-discovery - I think there's also an osc command for that | 12:10 |
nightmare_unreal | Thanks :) | 12:10 |
*** priteau has quit IRC | 12:16 | |
*** tkajinam has quit IRC | 12:31 | |
*** eharney has joined #openstack-nova | 12:31 | |
*** links has quit IRC | 12:38 | |
*** links has joined #openstack-nova | 12:39 | |
*** songwenping_ has quit IRC | 12:51 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: docs: Note the ``hw_numa_nodes`` image property https://review.opendev.org/683849 | 12:52 |
*** sapd1_x has quit IRC | 12:59 | |
*** iurygregory has quit IRC | 13:01 | |
*** nweinber has joined #openstack-nova | 13:02 | |
*** iurygregory has joined #openstack-nova | 13:02 | |
*** lbragstad has joined #openstack-nova | 13:06 | |
*** udesale_ has joined #openstack-nova | 13:06 | |
*** udesale has quit IRC | 13:09 | |
*** lbragstad has quit IRC | 13:21 | |
*** nweinber has quit IRC | 13:22 | |
*** lbragstad has joined #openstack-nova | 13:22 | |
*** nweinber has joined #openstack-nova | 13:22 | |
*** artom has joined #openstack-nova | 13:24 | |
*** zzzeek has quit IRC | 13:25 | |
*** zzzeek has joined #openstack-nova | 13:26 | |
*** zzzeek has quit IRC | 13:26 | |
*** zzzeek has joined #openstack-nova | 13:27 | |
*** owalsh has quit IRC | 13:30 | |
*** owalsh has joined #openstack-nova | 13:31 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: SR-IOV passthrough: Check PF only if VF is enabled https://review.opendev.org/476642 | 13:34 |
jsuchome | dansmith: Hi, I've updated the related patchsets about that direct rbd download (including the spec), could you give it another look? | 13:36 |
*** dpawlik has quit IRC | 13:38 | |
*** dpawlik has joined #openstack-nova | 13:42 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Poison netifaces.interfaces() in tests https://review.opendev.org/671773 | 13:46 |
dansmith | jsuchome: yep will queue for today | 13:46 |
*** beekneemech is now known as bnemec | 13:47 | |
*** ttsiouts has quit IRC | 13:47 | |
*** ttsiouts has joined #openstack-nova | 13:48 | |
*** ratailor has quit IRC | 13:49 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Raise if flavor and image disagree on hide_hypervisor_id https://review.opendev.org/663365 | 13:51 |
*** ttsiouts has quit IRC | 13:52 | |
*** brinzhang_ has joined #openstack-nova | 13:58 | |
*** ttsiouts has joined #openstack-nova | 14:01 | |
*** brinzhang has quit IRC | 14:02 | |
*** brinzhang has joined #openstack-nova | 14:02 | |
*** brinzhang_ has quit IRC | 14:03 | |
*** awalende has joined #openstack-nova | 14:06 | |
*** brinzhang_ has joined #openstack-nova | 14:11 | |
*** brinzhang has quit IRC | 14:14 | |
*** ttsiouts has quit IRC | 14:17 | |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/train: Revert "nova shared storage: rbd is always shared storage" https://review.opendev.org/726861 | 14:17 |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/stein: Revert "nova shared storage: rbd is always shared storage" https://review.opendev.org/726862 | 14:17 |
*** ttsiouts has joined #openstack-nova | 14:17 | |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/rocky: Revert "nova shared storage: rbd is always shared storage" https://review.opendev.org/726863 | 14:17 |
*** dtantsur is now known as dtantsur|brb | 14:17 | |
openstackgerrit | Lee Yarwood proposed openstack/nova stable/queens: Revert "nova shared storage: rbd is always shared storage" https://review.opendev.org/726864 | 14:18 |
*** ttsiouts has quit IRC | 14:21 | |
sean-k-mooney | by the way are we planning to backport https://review.opendev.org/#/c/663365/ upstream | 14:23 |
sean-k-mooney | its the fix for bug@ #1831723 | 14:24 |
sean-k-mooney | bug: #1831723 | 14:24 |
openstack | bug 1831723 in OpenStack Compute (nova) "The flavor hide_hypervisor_id value can be overridden by the image img_hide_hypervisor_id" [Undecided,In progress] https://launchpad.net/bugs/1831723 - Assigned to Stephen Finucane (stephenfinucane) | 14:24 |
sean-k-mooney | downstream i think we would want to backport that as im sure customer will hit it at some point | 14:24 |
sean-k-mooney | but it might be nice to backport upstream too but im not sure its allowed | 14:25 |
sean-k-mooney | lyarwood: ^ any thoughts on the topic | 14:25 |
sean-k-mooney | we are just adding a namespaced version of an existing unnamespaced extra_spec | 14:26 |
*** lpetrut has quit IRC | 14:26 | |
sean-k-mooney | downstream that is not considered an api change since extra_specs are not part of the api | 14:26 |
lyarwood | sean-k-mooney: reading | 14:26 |
openstackgerrit | James Page proposed openstack/nova stable/queens: hardware: fix memory check usage for small/large pages https://review.opendev.org/726867 | 14:27 |
sean-k-mooney | lyarwood: actually that is not the patch i meant to link | 14:28 |
lyarwood | sean-k-mooney: right you had me slightly confused tbh | 14:29 |
sean-k-mooney | that one we might also want to backport but one sec | 14:29 |
lyarwood | yeah that's looks valid to backport | 14:29 |
lyarwood | that* | 14:30 |
sean-k-mooney | https://review.opendev.org/#/c/722187/ | 14:30 |
sean-k-mooney | that is the one i meant | 14:30 |
sean-k-mooney | for https://bugs.launchpad.net/nova/+bug/1841932 | 14:30 |
openstack | Launchpad bug 1841932 in OpenStack Compute (nova) "hide_hypervisor_id extra_specs in nova flavor cannot pass AggregateInstanceExtraSpecsFilter" [Low,In progress] - Assigned to Stephen Finucane (stephenfinucane) | 14:30 |
lyarwood | sean-k-mooney: so we deprecate hide_hypervisor_id in that change but still provide backward compatibility so at first glance I think we can backport this upstream? | 14:33 |
sean-k-mooney | lyarwood: correct deprecated but still supported | 14:33 |
sean-k-mooney | we would have to drop the validation changes | 14:33 |
sean-k-mooney | from the patch | 14:33 |
sean-k-mooney | but the rest of it i think would be fine | 14:34 |
lyarwood | ah right that only just landed | 14:34 |
openstackgerrit | James Page proposed openstack/nova stable/queens: hardware: fix memory check usage for small/large pages https://review.opendev.org/726867 | 14:34 |
openstackgerrit | James Page proposed openstack/nova stable/queens: Fix overcommit for NUMA-based instances https://review.opendev.org/726868 | 14:34 |
*** dklyle has joined #openstack-nova | 14:35 | |
sean-k-mooney | lyarwood: yep but that should be fine and easy to call out on the initial backport patch | 14:35 |
sean-k-mooney | its pretty self contained | 14:35 |
*** jhesketh has quit IRC | 14:35 | |
*** sapd1_x has joined #openstack-nova | 14:40 | |
*** brinzhang_ has quit IRC | 14:40 | |
*** MrWatson has quit IRC | 14:58 | |
*** dtantsur|brb is now known as dtantsur | 14:59 | |
*** mlavalle has joined #openstack-nova | 14:59 | |
*** NostawRm has joined #openstack-nova | 15:00 | |
jsuchome | dansmith: thanks, I'll address the changes in spec | 15:01 |
*** jharris has joined #openstack-nova | 15:02 | |
-openstackstatus- NOTICE: Our CI mirrors in OVH BHS1 and GRA1 regions were offline between 12:55 and 14:35 UTC, any failures there due to unreachable mirrors can safely be rechecked | 15:08 | |
*** klindgren_ has joined #openstack-nova | 15:09 | |
*** klindgren has quit IRC | 15:09 | |
gibi | lyarwood: reported a bug for the image cache issue https://bugs.launchpad.net/nova/+bug/1878024 | 15:12 |
openstack | Launchpad bug 1878024 in OpenStack Compute (nova) "disk usage of the nova image cache is not counted as used disk space" [Undecided,New] | 15:12 |
gibi | lyarwood: could you please check if what I wrote there makes sense? | 15:12 |
gibi | dansmith: you worked with the image cache recently so you might be interested ^^ | 15:13 |
dansmith | yeah, reading now | 15:13 |
gibi | thanks | 15:13 |
dansmith | if the op has a separate mount for the cache, this wouldn't be a problem | 15:13 |
dansmith | so it's not going to affect everyone the sae | 15:14 |
dansmith | *same | 15:14 |
gibi | yeah, what the downstream customer has is a simple disk partition for the nova instances_path | 15:14 |
gibi | and the cache is under the instances_path | 15:14 |
dansmith | right, which I'm sure is common | 15:14 |
lyarwood | gibi: ack thanks just on calls for a while but will look once I'm off | 15:14 |
gibi | lyarwood: thanks | 15:15 |
dansmith | the problem with something like A is that when you're scheduling, | 15:15 |
dansmith | you don't know whether or not the image is already on the remote system, so you don't know whether to look for hosts with 2*$size disk space or not | 15:15 |
gibi | dansmith: true | 15:16 |
dansmith | B doesn't really work either because you can't assume you can purge your way out of the cache limit | 15:16 |
dansmith | if you boot a hundred instances from different images, you can't prevent the image cache from going over the desired size, | 15:17 |
dansmith | unless you refuse to boot instances there, which people will complain about because there is plenty of disk space and not understand | 15:17 |
gibi | could nova-compute periodically update a separate DISK_GB allocation in placement based on the actual size of the image cache | 15:18 |
gibi | ? | 15:18 |
dansmith | not periodically, but synchronously with the decision to cache an image (either during boot or otherwise) | 15:18 |
dansmith | otherwise you have a race | 15:18 |
dansmith | I'd have to think about that, but we'd need to only do that if the images and cache are on the same filesystem, otherwise we'd count against the wrong total | 15:20 |
gibi | it will be racey anyhow due to what you said about the problem of requesting allocation for the cache during scheduling | 15:20 |
dansmith | I mean racing for disk space, which could go badly if you lose, not just racing for image boot | 15:21 |
dansmith | but yes, the scheduler is never going to know whether or not to pick a host based on cache availability, so you always have that | 15:21 |
*** belmoreira has quit IRC | 15:21 | |
dansmith | we could go totally crazy and create an allocation for each image, by image uuid and after selecting a host, the scheduler could check to see if there was an allocation for that image against the host's provider to decide if it thinks it will fit :) | 15:22 |
dansmith | that has some nice benefits, but it's a little crazy and there's still plenty of room for racing of course, | 15:22 |
dansmith | and plenty of room for exhausting all the candidates in a small query set, leading to non-ideal looping of retries | 15:23 |
dansmith | plus we'd have to have a separate provider for the cache disk if they're separate | 15:23 |
gibi | complicated indeed | 15:27 |
gibi | I will pass the workaround of having the cache on a different partition to downstream | 15:27 |
gibi | at least that is something that the downstream project can do | 15:27 |
aarents | gibi: dansmith we have this issue, painful one | 15:28 |
dansmith | gibi: ack | 15:28 |
aarents | gibi yep in some cases we put the cache in another file system to get rid of this | 15:29 |
gibi | dansmith: the old DiskFilter had the disk_available_least info to prevent overallocation but we removed the DiskFilter | 15:29 |
dansmith | yeah, fair point | 15:30 |
gibi | we might want to re-introduce something like disk_available_least as a filter? or a pre-filter with placement support? | 15:30 |
dansmith | I think diskfilter had plenty of other problems, like the other way where the filter behavior conflicted with the hypervisors listing, which definitely causes support cases | 15:31 |
*** belmoreira has joined #openstack-nova | 15:31 | |
dansmith | not that they agree now, but.. | 15:31 |
dansmith | gibi: I don't think we'd want that to be a pre-filter because you'd have to provide either an inclusion or exclusion list of all hosts to placement each time | 15:32 |
dansmith | not like a trait or aggregate, but "any one of these hosts: [... array of 5000 ...] | 15:32 |
gibi | yeah I don't want to bring back the whole DiskFilter, just bring back the extra information to the scheduler / placement about how much actual disk space is free under the instances_path | 15:34 |
gibi | as an idea | 15:34 |
*** belmoreira has quit IRC | 15:34 | |
dansmith | gibi: all I'm saying is doing it as a pre-filter is the wrong place | 15:34 |
gibi | ack | 15:35 |
dansmith | we might still be reporting that value such that a filter can check it | 15:35 |
*** sapd1_x has quit IRC | 15:37 | |
dansmith | one other option is that the cache_images() thing that I added is setup to work as a call, returning information about presence, | 15:38 |
dansmith | so if we were to pass it a "don't download just check" flag, or a "download in background" flag, then we would get back an indication if it's present or not, | 15:38 |
dansmith | which would let the scheduler know whether or not to filter out hosts with 1x or 2x the disk space | 15:39 |
dansmith | that's pretty heavy, and would likely need to be done from conductor, | 15:40 |
dansmith | but it's a little less racy than checking some minutes-old disk free amount and assuming it's going to work | 15:40 |
dansmith | another cloudy way to look at this is to say we should just have people weigh hosts on free space, | 15:40 |
dansmith | in line with our "we don't schedule at capacity" project goal | 15:41 |
gibi | dansmith: so if the rpc call reports that the image is not cached, then we would add 2x disk space to the allocation candidate query, but only allocate 1x disk space on the selected host for the instance, then on the compute side the image cache code would allocate the other 1x disk space in placement for the cache | 15:42 |
dansmith | no | 15:42 |
dansmith | we've already done the a-c query at that point | 15:42 |
dansmith | we'd just use that to advise us which of the a-c are valid | 15:43 |
gibi | ahh yeah, you have to know which host you send the rpc call to | 15:43 |
dansmith | right | 15:43 |
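dansmith's idea of using cache presence to advise which already-returned allocation candidates are valid can be sketched as below. This is a hypothetical illustration, not scheduler code: `Candidate`, `required_gb` and `hosts_that_fit` are made-up names, the cached-image set stands in for whatever a presence check would report, and the "1x or 2x" shorthand is modelled as the root disk alone versus the root disk plus the base image.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Candidate:
    host: str
    free_gb: int
    # Image IDs reported as already cached on this host (assumed input).
    cached_images: Set[str] = field(default_factory=set)


def required_gb(candidate: Candidate, image_id: str,
                image_gb: int, root_gb: int) -> int:
    """Disk needed on this host: root disk, plus the image if not cached."""
    if image_id in candidate.cached_images:
        return root_gb
    return root_gb + image_gb


def hosts_that_fit(candidates: List[Candidate], image_id: str,
                   image_gb: int, root_gb: int) -> List[str]:
    """Keep only the candidates with room for the instance and its image."""
    return [c.host for c in candidates
            if c.free_gb >= required_gb(c, image_id, image_gb, root_gb)]
```

The check is only advisory: as noted in the discussion, free-space figures can be stale by the time the instance lands, so a race for disk space remains.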
gibi | the weigher thing is good for big deployments but will fall short for edge. As far as I understand my downstream report is from a really small edge site close to capacity. :/ | 15:46 |
*** links has quit IRC | 15:47 | |
dansmith | indeed, although I think I'd argue that for highly constrained situations the separate filesystem is the right approach there anyway, given the (a) usually constrain-able image sets for edge and (b) the need to avoid the race that we'll have in some form anyway | 15:47 |
dansmith | but yep, it's not a great answer for someone that just wants it to work ideally | 15:47 |
gibi | I will definitely suggest the separate partition for now as I feel whatever solution we come up with (if any) will not be backportable | 15:48 |
dansmith | for sure | 15:48 |
dansmith | fwiw, making a pre-call to cache_images(background=True) would theoretically give us some lower time-to-boot performance in other cases | 15:49 |
dansmith | I'm really not sure whether that's a terrible idea or not, but it's an interesting thought | 15:50 |
dansmith | similar to the cyborg case of starting the programming at bind time from the conductor in parallel to the rest of the instance bringup | 15:50 |
gibi | it would have a side effect to cache image to a compute that otherwise will not be selected | 15:50 |
gibi | hm or not | 15:51 |
dansmith | in the tight case you mean right? We'd start caching an image on a host that the scheduler was going to exclude anyway, which is true | 15:51 |
*** sapd1_x has joined #openstack-nova | 15:52 | |
gibi | yeah for the tight case when the image would fit into the cache but the instance root disk would not any more | 15:52 |
dansmith | yep, for sure | 15:52 |
gibi | but that is really tight | 15:52 |
gibi | what if for tight cases we allow disabling the cache entirely? it is tight so no space for cache | 15:53 |
dansmith | doing that would require a substantial redesign of the whole image backend for libvirt I think | 15:54 |
*** swp20 has joined #openstack-nova | 15:54 | |
gibi | is it because we assume that there is a backing file for the root fs image which happens to be the cached image? | 15:55 |
*** sapd1_x has quit IRC | 15:56 | |
dansmith | I'm actually not sure what happens if you configure qcow2 and "flatten_images" actually, I'd have to look | 15:56 |
dansmith | that might have the same effect, I'm not sure | 15:56 |
*** markguz_ has joined #openstack-nova | 15:58 | |
gibi | do you mean force_raw_images conf option? | 15:59 |
gibi | or use_cow_images = False? or a linear combination of the two :) | 16:00 |
markguz_ | Hi nova folks. i have an instance that failed that i migrated (not live) to another compute host, but upon restarting it errors with | 16:00 |
markguz_ | Unsupported VIF type binding_failed convert '_nova_to_osvif_vif_binding_failed' | 16:00 |
dansmith | gibi: yeah there's some combination that results in full flattening, but I'm not sure what they are | 16:00 |
dansmith | gibi: not sure if that actually results in the cache image going away, or getting copied or what | 16:01 |
dansmith | actually, it has to be the latter I think since we have to expand the size of it | 16:01 |
*** swp20 has quit IRC | 16:01 | |
markguz_ | i've tried the various solutions found via google but none work | 16:01 |
dansmith | so you probably end up with 2x the space initially at least, and then you'd need to immediately purge the original or something | 16:01 |
gibi | dansmith: ack | 16:02 |
*** gyee has joined #openstack-nova | 16:04 | |
gibi | dansmith: hm even when the flat backend copies the image to raw it does update the cache to keep the base image https://github.com/openstack/nova/blob/d6450879c7f7dd19366b6f002301fbbf87918026/nova/virt/libvirt/imagebackend.py#L585 | 16:05 |
gibi | dansmith: anyhow thanks for your thoughts I have to drop today soon so I will add some summary of this discussion to the bug. | 16:08 |
*** rpittau is now known as rpittau|afk | 16:09 | |
dansmith | gibi: yeah, that's what I was thinking above when I said "but then you'd need to immediately purge" | 16:10 |
dansmith | gibi: the cache is trying to be a cache | 16:10 |
gibi | yeah, I see now | 16:12 |
*** sapd1_x has joined #openstack-nova | 16:13 | |
*** dtantsur is now known as dtantsur|afk | 16:19 | |
openstackgerrit | Takashi Natsume proposed openstack/nova master: Remove six.reraise https://review.opendev.org/726898 | 16:23 |
*** swp20 has joined #openstack-nova | 16:24 | |
*** swp20 has quit IRC | 16:29 | |
*** swp20 has joined #openstack-nova | 16:30 | |
gibi | lyarwood, dansmith: updated the bug 1878024 with what we talked about above. | 16:30 |
openstack | bug 1878024 in OpenStack Compute (nova) "disk usage of the nova image cache is not counted as used disk space" [Undecided,New] https://launchpad.net/bugs/1878024 | 16:30 |
dansmith | cool | 16:30 |
gibi | and now I go and bake some bread for dinner | 16:30 |
gibi | see you tomorrow | 16:31 |
*** hemna_ has joined #openstack-nova | 16:32 | |
*** swp20 has quit IRC | 16:34 | |
*** udesale_ has quit IRC | 16:34 | |
*** swp20 has joined #openstack-nova | 16:34 | |
*** hemna has quit IRC | 16:34 | |
*** evrardjp has quit IRC | 16:36 | |
*** evrardjp has joined #openstack-nova | 16:36 | |
*** brinzhang has joined #openstack-nova | 16:37 | |
*** amodi has quit IRC | 16:39 | |
*** jharris has quit IRC | 16:40 | |
*** mlavalle has quit IRC | 16:44 | |
*** mlavalle has joined #openstack-nova | 16:46 | |
*** ttsiouts has joined #openstack-nova | 16:57 | |
*** derekh has quit IRC | 17:01 | |
*** maciejjozefczyk_ has joined #openstack-nova | 17:06 | |
*** swp20 has quit IRC | 17:07 | |
*** swp20 has joined #openstack-nova | 17:08 | |
*** maciejjozefczyk has quit IRC | 17:09 | |
*** maciejjozefczyk has joined #openstack-nova | 17:10 | |
*** swp20 has quit IRC | 17:12 | |
*** swp20 has joined #openstack-nova | 17:12 | |
*** maciejjozefczyk_ has quit IRC | 17:14 | |
*** sapd1_x has quit IRC | 17:22 | |
*** ttsiouts has quit IRC | 17:31 | |
*** nightmare_unreal has quit IRC | 17:39 | |
*** jsuchome has quit IRC | 17:39 | |
*** swp20 has quit IRC | 17:46 | |
*** swp20 has joined #openstack-nova | 17:56 | |
*** salmankhan has quit IRC | 17:57 | |
*** ttsiouts has joined #openstack-nova | 17:58 | |
*** swp20 has quit IRC | 18:03 | |
*** happyhemant has quit IRC | 18:05 | |
*** ttsiouts has quit IRC | 18:10 | |
*** ttsiouts has joined #openstack-nova | 18:11 | |
*** ralonsoh has quit IRC | 18:11 | |
openstackgerrit | Merged openstack/nova master: Support for --force flag for nova-manage placement heal_allocations command https://review.opendev.org/715395 | 18:15 |
openstackgerrit | Merged openstack/nova stable/queens: Include only required fields in ironic node cache https://review.opendev.org/724862 | 18:15 |
openstackgerrit | Merged openstack/nova stable/queens: Lowercase ironic driver hash ring and ignore case in cache https://review.opendev.org/723054 | 18:16 |
openstackgerrit | Merged openstack/nova stable/rocky: Add config option for neutron client retries https://review.opendev.org/722819 | 18:16 |
*** ttsiouts has quit IRC | 18:16 | |
openstackgerrit | Merged openstack/nova master: Suppress remaining policy warnings in unit tests https://review.opendev.org/726272 | 18:16 |
markguz_ | anyone know how to get out of Unsupported VIF type binding_failed convert '_nova_to_osvif_vif_binding_failed' hell? | 18:22 |
*** ircuser-1 has joined #openstack-nova | 18:25 | |
*** iurygregory has quit IRC | 18:25 | |
*** ttsiouts has joined #openstack-nova | 18:38 | |
*** iurygregory has joined #openstack-nova | 18:38 | |
*** maciejjozefczyk has quit IRC | 18:54 | |
*** amodi has joined #openstack-nova | 18:56 | |
*** toabctl has joined #openstack-nova | 18:57 | |
*** dkehn has joined #openstack-nova | 19:20 | |
*** ccamacho has quit IRC | 19:25 | |
*** brinzhang_ has joined #openstack-nova | 19:32 | |
openstackgerrit | Harshavardhan Metla proposed openstack/nova master: [Nova] Add reference to Placement installation guide https://review.opendev.org/726936 | 19:35 |
*** brinzhang has quit IRC | 19:35 | |
*** factor has joined #openstack-nova | 19:36 | |
*** jmlowe has quit IRC | 19:42 | |
markguz_ | for anyone that's interested setting neutron.ml2_port_bindings.vif_type to "ovs" in the database fixed this for me | 19:42 |
sean-k-mooney | markguz_: if you get that error it's because neutron failed to bind the port | 19:43 |
markguz_ | i know | 19:43 |
sean-k-mooney | which normally means there was an error on the compute node | 19:43 |
markguz_ | but once it happens it seems next to impossible to fix it via the normal methods. | 19:44 |
sean-k-mooney | markguz_: you fix it by setting the host field to "" or "none" then back to the original hostname | 19:44 |
markguz_ | the instance will not boot due to the "binding_failed" being written into the vif_type field | 19:45 |
markguz_ | sean-k-mooney: or by updating that field i just mentioned in the neutron db | 19:45 |
sean-k-mooney | yes but you can do "openstack --os-cloud=admin port set --host none baf2b165-797b-4305-bc6b-5b63250b890d" followed by "openstack --os-cloud=admin port set --host workstation baf2b165-797b-4305-bc6b-5b63250b890d" | 19:46 |
sean-k-mooney | to do it from the api without db hacking | 19:46 |
sean-k-mooney | it will actually cause port binding to happen properly, recalculating the correct values | 19:46 |
*** jmlowe has joined #openstack-nova | 19:46 | |
markguz_ | sean-k-mooney: ok. thanks for that | 19:46 |
sean-k-mooney | so i did that yesterday because i was swapping from the iptables firewall driver to the ovs one | 19:47 |
markguz_ | this happened to me when i did a non-live migrate of a shutdown instance to a new host | 19:48 |
sean-k-mooney | hum it should not happen in that case | 19:48 |
sean-k-mooney | something obviously went wrong, there should be an error in the neutron server log | 19:49 |
markguz_ | yeah. haven't had time to deep dive. Was focussed on getting the instance back online | 19:49 |
sean-k-mooney | is the instance still in resize_verify or did this happen after that point? | 19:50 |
sean-k-mooney | if you had not confirmed the migrate/resize then you could have reverted | 19:50 |
sean-k-mooney | if you had then ya, db edit or unset and reset the host to rebind the port, then hard reboot | 19:51 |
*** ttsiouts has quit IRC | 20:26 | |
*** nweinber has quit IRC | 20:42 | |
*** dpawlik has quit IRC | 20:46 | |
*** xek has quit IRC | 20:59 | |
*** awalende has quit IRC | 21:05 | |
*** awalende has joined #openstack-nova | 21:06 | |
*** jangutter_ has quit IRC | 21:09 | |
*** awalende has quit IRC | 21:10 | |
*** raildo has quit IRC | 21:17 | |
*** raildo has joined #openstack-nova | 21:17 | |
*** raildo has quit IRC | 21:39 | |
*** brinzhang has joined #openstack-nova | 21:51 | |
*** brinzhang_ has quit IRC | 21:54 | |
*** mgariepy has joined #openstack-nova | 22:00 | |
*** slaweq has quit IRC | 22:08 | |
*** slaweq has joined #openstack-nova | 22:09 | |
*** slaweq has quit IRC | 22:13 | |
*** KeithMnemonic has joined #openstack-nova | 22:14 | |
*** tkajinam has joined #openstack-nova | 22:55 | |
*** tosky has quit IRC | 22:57 | |
*** markguz_ has quit IRC | 23:47 | |
*** kevinz has quit IRC | 23:52 |
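
The port rebinding workaround sean-k-mooney describes at 19:46 (clear the port's host, set it back, then hard reboot) can be sketched as a small helper that only builds the CLI calls. This is a minimal sketch, not an official tool: the function name `rebind_port_commands` and its parameters are hypothetical, the `--os-cloud=admin` value and the literal `none` host are copied from the log, and the optional hard reboot step assumes the standard `openstack server reboot --hard` command.

```python
import shlex


def rebind_port_commands(port_id, host, cloud="admin", server=None):
    """Build the CLI calls for the unset-then-reset rebind from the log.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    base = ["openstack", f"--os-cloud={cloud}"]
    commands = [
        # Step 1: clear the binding host ("none" is the value used in the log).
        base + ["port", "set", "--host", "none", port_id],
        # Step 2: set the original host again so neutron rebinds the port.
        base + ["port", "set", "--host", host, port_id],
    ]
    if server is not None:
        # Step 3 (optional): hard reboot the instance, as suggested at 19:51.
        commands.append(base + ["server", "reboot", "--hard", server])
    return commands


if __name__ == "__main__":
    # Port ID and hostname are the ones quoted in the log.
    for argv in rebind_port_commands(
        "baf2b165-797b-4305-bc6b-5b63250b890d", "workstation"
    ):
        print(shlex.join(argv))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; review the output before pasting it into a shell with admin credentials.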