*** tkajinam has quit IRC | 00:12 | |
*** takamatsu has quit IRC | 00:18 | |
*** dklyle has joined #openstack-nova | 00:28 | |
*** brinzhang has joined #openstack-nova | 00:38 | |
*** brinzhang_ has joined #openstack-nova | 00:45 | |
*** brinzhang has quit IRC | 00:47 | |
*** brinzhang_ has quit IRC | 00:52 | |
*** brinzhang has joined #openstack-nova | 00:54 | |
*** tetsuro has joined #openstack-nova | 00:56 | |
*** mgoddard has quit IRC | 01:01 | |
*** mgoddard has joined #openstack-nova | 01:05 | |
*** tetsuro has quit IRC | 01:13 | |
*** spatel has joined #openstack-nova | 01:16 | |
*** imacdonn has quit IRC | 01:17 | |
*** imacdonn has joined #openstack-nova | 01:18 | |
*** tetsuro has joined #openstack-nova | 01:18 | |
*** spatel has quit IRC | 01:20 | |
brinzhang | lyarwood: After your review of this spec [1] on 5/20, it is now waiting for your update. When will you have time to do it? [1] https://review.opendev.org/#/c/580336/ | 01:31 |
---|---|---|
*** zul has quit IRC | 02:09 | |
*** bhagyashris has joined #openstack-nova | 02:26 | |
*** tetsuro has quit IRC | 02:42 | |
*** artom has quit IRC | 03:04 | |
*** tetsuro has joined #openstack-nova | 03:24 | |
*** tetsuro has quit IRC | 03:24 | |
*** psachin has joined #openstack-nova | 03:33 | |
*** udesale has joined #openstack-nova | 04:02 | |
*** BjoernT has joined #openstack-nova | 04:08 | |
*** BjoernT_ has joined #openstack-nova | 04:13 | |
*** BjoernT has quit IRC | 04:14 | |
*** BjoernT has joined #openstack-nova | 04:23 | |
*** BjoernT_ has quit IRC | 04:24 | |
*** tetsuro has joined #openstack-nova | 04:25 | |
*** eandersson has joined #openstack-nova | 04:25 | |
*** etp has joined #openstack-nova | 04:53 | |
*** ircuser-1 has quit IRC | 04:57 | |
*** threestrands has joined #openstack-nova | 05:00 | |
*** Luzi has joined #openstack-nova | 05:03 | |
*** tetsuro has quit IRC | 05:04 | |
*** ratailor has joined #openstack-nova | 05:38 | |
*** tetsuro has joined #openstack-nova | 05:42 | |
*** tetsuro has quit IRC | 05:44 | |
*** tetsuro has joined #openstack-nova | 05:45 | |
alex_xu | efried, bauzas: Luyao found a problem at https://review.opendev.org/#/c/671222/4/nova/virt/libvirt/device.py@119, I guess we should use the migration as the consumer when migrating, just like we did for placement | 05:47 |
*** BjoernT has quit IRC | 06:02 | |
-openstackstatus- NOTICE: Due to a failure on the logs.openstack.org volume, old logs are unavailable while partition is recovered. New logs are being stored. ETA for restoration probably ~Mon Jul 22 12:00 UTC 2019 | 06:05 | |
*** whoami-rajat has joined #openstack-nova | 06:05 | |
*** ChanServ changes topic to "Due to a failure on the logs.openstack.org volume, old logs are unavailable while partition is recovered. New logs are being stored. ETA for restoration probably ~Mon Jul 22 12:00 UTC 2019" | 06:05 | |
*** udesale has quit IRC | 06:16 | |
*** tetsuro has quit IRC | 06:17 | |
*** ricolin__ is now known as ricolin | 06:20 | |
*** jaosorior has joined #openstack-nova | 06:25 | |
*** ChanServ changes topic to "Current runways: https://etherpad.openstack.org/p/nova-runways-train -- This channel is for Nova development. For support of Nova deployments, please use #openstack." | 06:25 | |
-openstackstatus- NOTICE: logs.openstack.org volume has been restored. please report any issues in #openstack-infra | 06:25 | |
*** markvoelker has quit IRC | 06:32 | |
*** pcaruana has joined #openstack-nova | 06:32 | |
*** belmoreira has joined #openstack-nova | 06:36 | |
*** belmoreira has quit IRC | 06:46 | |
*** udesale has joined #openstack-nova | 06:47 | |
*** xek has joined #openstack-nova | 06:56 | |
*** boxiang has joined #openstack-nova | 06:57 | |
*** slaweq has joined #openstack-nova | 06:57 | |
*** tesseract has joined #openstack-nova | 07:01 | |
*** markvoelker has joined #openstack-nova | 07:04 | |
*** rpittau|afk is now known as rpittau | 07:05 | |
*** rcernin has quit IRC | 07:07 | |
*** maciejjozefczyk has joined #openstack-nova | 07:08 | |
*** tkajinam has joined #openstack-nova | 07:08 | |
*** xek has quit IRC | 07:09 | |
*** maciejjozefczyk_ has joined #openstack-nova | 07:09 | |
*** xek has joined #openstack-nova | 07:10 | |
*** maciejjozefczyk_ has quit IRC | 07:10 | |
*** jaosorior has quit IRC | 07:18 | |
*** jhesketh has quit IRC | 07:22 | |
*** ileixe has quit IRC | 07:22 | |
*** jhesketh has joined #openstack-nova | 07:24 | |
*** ileixe has joined #openstack-nova | 07:24 | |
*** ttsiouts has joined #openstack-nova | 07:24 | |
openstackgerrit | wangwei1 proposed openstack/nova master: fix spelling error in nova/api/validation/__init__.py https://review.opendev.org/669244 | 07:32 |
*** tkajinam has quit IRC | 07:36 | |
*** ttsiouts has quit IRC | 07:42 | |
*** ttsiouts has joined #openstack-nova | 07:43 | |
*** ttsiouts has quit IRC | 07:48 | |
*** jangutter has joined #openstack-nova | 07:52 | |
*** ileixe has quit IRC | 07:59 | |
bauzas | alex_xu: ack, I'll need to look | 08:03 |
*** ileixe has joined #openstack-nova | 08:03 | |
*** ralonsoh has joined #openstack-nova | 08:04 | |
*** davidsha has joined #openstack-nova | 08:06 | |
*** threestrands has quit IRC | 08:14 | |
*** zbr|out is now known as zbr | 08:22 | |
*** ivve has joined #openstack-nova | 08:24 | |
*** ttsiouts has joined #openstack-nova | 08:27 | |
*** tssurya has joined #openstack-nova | 08:34 | |
*** ociuhandu has joined #openstack-nova | 08:37 | |
*** ociuhandu has quit IRC | 08:39 | |
*** shilpasd has joined #openstack-nova | 08:39 | |
*** ociuhandu has joined #openstack-nova | 08:39 | |
*** cdent has joined #openstack-nova | 08:49 | |
alex_xu | bauzas: thanks. After taking a look at the details, I don't think the migration uuid works. In the end, I'm thinking of passing the flavor id to claim/unclaim_for_instance, then claiming/unclaiming based on (instance_uuid, flavor_id) | 09:00 |
*** mgoddard has quit IRC | 09:01 | |
*** mgoddard has joined #openstack-nova | 09:04 | |
*** mdbooth has joined #openstack-nova | 09:15 | |
*** cdent has quit IRC | 09:21 | |
*** cdent has joined #openstack-nova | 09:23 | |
openstackgerrit | **** proposed openstack/nova master: Nova: node should be deleted when last service is deleted https://review.opendev.org/671731 | 09:26 |
*** priteau has joined #openstack-nova | 09:29 | |
*** udesale has quit IRC | 09:40 | |
*** udesale has joined #openstack-nova | 09:40 | |
*** mdbooth has quit IRC | 09:41 | |
*** mdbooth has joined #openstack-nova | 09:41 | |
*** udesale has quit IRC | 09:42 | |
*** udesale has joined #openstack-nova | 09:42 | |
*** FlorianFa has joined #openstack-nova | 09:42 | |
*** ociuhandu has quit IRC | 09:45 | |
*** ociuhandu has joined #openstack-nova | 09:48 | |
*** mgoddard has quit IRC | 10:11 | |
*** mgoddard has joined #openstack-nova | 10:18 | |
*** artom has joined #openstack-nova | 10:24 | |
*** ttsiouts has quit IRC | 10:27 | |
*** ttsiouts has joined #openstack-nova | 10:28 | |
*** udesale has quit IRC | 10:28 | |
*** ccamacho has joined #openstack-nova | 10:29 | |
*** ccamacho has quit IRC | 10:29 | |
*** ccamacho has joined #openstack-nova | 10:29 | |
*** ttsiouts has quit IRC | 10:32 | |
*** jaosorior has joined #openstack-nova | 10:41 | |
*** jaosorior has quit IRC | 10:43 | |
*** jaosorior has joined #openstack-nova | 10:44 | |
*** shilpasd has quit IRC | 10:50 | |
*** sean-k-mooney has joined #openstack-nova | 10:52 | |
*** tbachman has joined #openstack-nova | 10:55 | |
*** bhagyashris has quit IRC | 10:59 | |
*** ttsiouts has joined #openstack-nova | 11:10 | |
*** tssurya has quit IRC | 11:11 | |
*** shilpasd has joined #openstack-nova | 11:14 | |
*** betherly has joined #openstack-nova | 11:19 | |
*** betherly has quit IRC | 11:24 | |
*** tssurya has joined #openstack-nova | 11:24 | |
sean-k-mooney | stephenfin: when you get back from lunch: we have killed the caching scheduler, so maybe we should delete the core, ram and disk filters, then nuke the fields for the resource tracker? | 11:29 |
sean-k-mooney | it would be a change to the hypervisors api but you could always proxy the data from placement. | 11:30 |
*** kaisers has quit IRC | 11:35 | |
*** kaisers has joined #openstack-nova | 11:36 | |
openstackgerrit | **** proposed openstack/nova master: Nova: node should be deleted when last service is deleted https://review.opendev.org/671731 | 11:37 |
*** jaosorior has quit IRC | 11:38 | |
*** jaosorior has joined #openstack-nova | 11:39 | |
sean-k-mooney | johnthetubaguy: lyarwood can ye take a look at this stable backport https://review.opendev.org/#/c/671532/ if ye have time | 11:41 |
*** psachin has quit IRC | 11:50 | |
*** betherly has joined #openstack-nova | 11:50 | |
*** jaypipes has joined #openstack-nova | 11:53 | |
*** spatel has joined #openstack-nova | 11:54 | |
*** betherly has quit IRC | 11:55 | |
openstackgerrit | Alex Xu proposed openstack/nova master: Add the virt driver interface for claim and unclaim the devices https://review.opendev.org/670782 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Moves the allocation retrieving early https://review.opendev.org/670783 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Calling the virt driver's claim/unclaim_for_instance in resource tracker https://review.opendev.org/670784 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Add DeviceManager to the libvirt virt driver https://review.opendev.org/671388 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Populates the existing mediated devices in the libvirt device manager https://review.opendev.org/670787 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Using the claim/unclaim_for_instance for mdevs https://review.opendev.org/671222 | 11:57 |
openstackgerrit | Alex Xu proposed openstack/nova master: Adds functional test for creating the instance with vgpus https://review.opendev.org/671398 | 11:57 |
*** ygk_12345 has joined #openstack-nova | 12:01 | |
ygk_12345 | hi all | 12:01 |
ygk_12345 | some of the vms on the compute node are eating too much memory. Can someone point me to any links for finding out what is happening in those vms from the hypervisor perspective? | 12:02 |
*** ratailor has quit IRC | 12:06 | |
*** udesale has joined #openstack-nova | 12:07 | |
*** etp has quit IRC | 12:13 | |
*** spatel has quit IRC | 12:13 | |
TheJulia | sean-k-mooney: I never get enough sleep and if I don't take meds It would be even worse... so I always needssleep :) | 12:18 |
mdbooth | ygk_12345: You're unlikely to find the right expertise here for that. Assuming you're using libvirt/kvm you'll want to talk to a libvirt/kvm forum. Or your vendor... | 12:18 |
ygk_12345 | mdbooth is there any irc channel for kvm in general ? | 12:19 |
*** _erlon_ has joined #openstack-nova | 12:21 | |
openstackgerrit | sean mooney proposed openstack/nova master: Libvirt: add support for vPMU configuration. https://review.opendev.org/671338 | 12:23 |
mdbooth | ygk_12345: Try #virt on irc.oftc.net | 12:23 |
ygk_12345 | mdbooth ok thanks | 12:23 |
sean-k-mooney | ygk_12345: how much memory are they consuming over what you expect? | 12:24 |
ygk_12345 | sean-k-mooney what I observe is that, in general, the cpu usage of those qemu-kvm processes sometimes shoots above 150, so I am suspecting them. If I shut them down it is normal again | 12:25 |
sean-k-mooney | ygk_12345: are you seeing high cpu usage, or high memory usage or both | 12:26 |
ygk_12345 | both | 12:26 |
ygk_12345 | sometimes | 12:26 |
ygk_12345 | how to find out what's exactly happening at those times? | 12:26 |
sean-k-mooney | you would likely need to look at the guest journal/logs to be honest but you could check the host for kernel errors | 12:28 |
ygk_12345 | hmm | 12:28 |
sean-k-mooney | e.g. check if the kernel is soft locking a core | 12:28 |
sean-k-mooney | that could be caused by the guest soft locking up | 12:29 |
ygk_12345 | ok | 12:29 |
sean-k-mooney | are you currently using the q35 machine type or are you still using the pc i440 machine type? | 12:29 |
ygk_12345 | we use dell with latest intel | 12:30 |
sean-k-mooney | we have seen that q35 while more modern tends to use more memory | 12:30 |
ygk_12345 | oh ok | 12:30 |
*** ivve has quit IRC | 12:35 | |
*** dklyle has quit IRC | 12:36 | |
*** david-lyle has joined #openstack-nova | 12:36 | |
*** ygk_12345 has quit IRC | 12:39 | |
*** irclogbot_3 has quit IRC | 12:39 | |
*** irclogbot_2 has joined #openstack-nova | 12:43 | |
*** boxiang_ has joined #openstack-nova | 12:44 | |
*** francoisp has joined #openstack-nova | 12:46 | |
boxiang_ | https://review.opendev.org/#/c/649963/ and https://review.opendev.org/#/c/651969/ Can someone help review these two patches? thanks :) | 12:49 |
*** boxiang has quit IRC | 12:50 | |
*** jaosorior has quit IRC | 12:59 | |
efried | alex_xu: I don't understand. Why would we not use the migration consumer to migrate? | 13:00 |
*** mchlumsky has quit IRC | 13:01 | |
*** beraldo has joined #openstack-nova | 13:01 | |
*** mchlumsky has joined #openstack-nova | 13:01 | |
*** BjoernT has joined #openstack-nova | 13:02 | |
sean-k-mooney | gibi_off: not sure if you are on vacation/PTO this week, but just an fyi that i should have addressed your feedback in https://review.opendev.org/671338 | 13:08 |
*** aojea has joined #openstack-nova | 13:09 | |
*** eharney has joined #openstack-nova | 13:10 | |
*** artom has quit IRC | 13:11 | |
*** mriedem has joined #openstack-nova | 13:11 | |
openstackgerrit | Matt Riedemann proposed openstack/nova stable/stein: Restore RT.old_resources if ComputeNode.save() fails https://review.opendev.org/672038 | 13:14 |
sean-k-mooney | efried: i am also confused. isn't a migration consumer proposed for use when doing cold and live migrations to hold the allocation on the second node? | 13:14 |
efried | sean-k-mooney: He's off for two weeks. It looks like the changes are minor; if so another core will probably proxy his +2. | 13:15 |
efried | sean-k-mooney: that's my confusion as well. | 13:15 |
efried | perhaps we only use the migration consumer for live migration? | 13:15 |
efried | I don't know. | 13:15 |
sean-k-mooney | efried: ah ok good to know. | 13:15 |
mriedem | they're used for both cold and live | 13:15 |
mriedem | just not evacuate | 13:16 |
sean-k-mooney | mriedem: o/ welcome back, have a good vacation? | 13:16 |
mriedem | thanks, yeah | 13:16 |
sean-k-mooney | we don't use them for evac because we don't use them in rebuild | 13:16 |
*** beraldo has left #openstack-nova | 13:16 | |
mriedem | um | 13:17 |
mriedem | that's not really the reason | 13:17 |
mriedem | it's because for evac the source compute service is down, so it's not really the same flow for how the migration-based allocations are handled at the end | 13:17 |
mriedem | we have migration records for evac | 13:18 |
alex_xu | efried: if we use migration as consumer for the dst, when we confirm the resize, there isn't a call to dest host to change migration consumer back to instance consumer | 13:18 |
sean-k-mooney | right, however even with a down host we could still free the allocation on the source node in placement | 13:18 |
mriedem | sean-k-mooney: we do that when the source compute service comes back up | 13:18 |
efried | alex_xu: oh, you're talking about swapping which side gets which consumer? | 13:19 |
alex_xu | efried: if we use migration as consumer for the src, we will call the dest host to claim the device, so we have no way to change the src host's claim consumer to migration. | 13:19 |
sean-k-mooney | yes that is true. | 13:19 |
alex_xu | efried: yes | 13:19 |
mriedem | we *could* use migration-based consumers for evac, i just don't think it was really a priority when dansmith implemented that blueprint | 13:19 |
mriedem | i'm obviously walking into something late here though | 13:20 |
mriedem | who is out for 2 weeks? | 13:20 |
efried | gibi_off: | 13:20 |
efried | mriedem: but that's a separate topic | 13:20 |
mriedem | ok, which series are you talking about? | 13:20 |
efried | mriedem: we're talking about alex_xu's work to make hypervisor-specific claiming (atm for VGPUs and VPMEMs) happen in the virt driver. | 13:21 |
alex_xu | mriedem: https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:claim_for_instance | 13:21 |
mriedem | ok and this is prep work for vpmem i guess | 13:22 |
* sean-k-mooney clicks, i thought they happened in the RT | 13:22 | |
mriedem | claims do happen in the rt today | 13:22 |
mriedem | the only hypervisor-specific thing though is the overhead stuff | 13:22 |
*** sean-k-mooney has quit IRC | 13:22 | |
alex_xu | mriedem: vgpu allocation also has a race problem: in libvirt, we allocate the vgpu outside of the rt, which means no lock | 13:22 |
efried | also, it just makes sense for hypervisor-specific claiming to happen in the virt driver. | 13:23 |
mriedem | alex_xu: it happens during driver.spawn correct? | 13:24 |
efried | So tldr we're adding a hook from the various *_claimZ to call into a new ComputeDriver interface which interprets allocations etc to earmark the real devices corresponding to the allocations | 13:24 |
alex_xu | mriedem: yes, https://bugs.launchpad.net/nova/+bug/1836204 | 13:24 |
openstack | Launchpad bug 1836204 in OpenStack Compute (nova) "The allocation of VGPU has race problem" [High,Triaged] - Assigned to Alex Xu (xuhj) | 13:24 |
*** BjoernT has quit IRC | 13:24 | |
*** BjoernT has joined #openstack-nova | 13:25 | |
mriedem | i guess we're not modeling each specific device in placement right? just the total number of available in inventory, and then it's up to the virt driver to pick one, which could conflict. | 13:25 |
*** BjoernT has quit IRC | 13:25 | |
efried | exactly | 13:26 |
mriedem | weee, so much for ever getting rid of the rt claims code... | 13:26 |
*** BjoernT has joined #openstack-nova | 13:26 | |
*** BjoernT has quit IRC | 13:26 | |
efried | The RT claims *framework* will need to stay, but IMO eventually "all" of the guts should wind up moving into this ComputeDriver.claim_for_instance | 13:26 |
*** BjoernT has joined #openstack-nova | 13:27 | |
efried | especially for things like PCI devices | 13:27 |
*** BjoernT has quit IRC | 13:27 | |
efried | and NUMA | 13:27 |
*** sean-k-mooney has joined #openstack-nova | 13:27 | |
efried | the line where it makes sense to cut things over is when we move to tracking a resource in placement. | 13:27 |
efried | the messy ones obviously being VCPU, MEMORY_MB, DISK_GB | 13:28 |
*** BjoernT has joined #openstack-nova | 13:28 | |
*** BjoernT has joined #openstack-nova | 13:28 | |
*** BjoernT has quit IRC | 13:28 | |
sean-k-mooney | efried: well placement just keeps a tally count. if we need to do device assignment we need to continue to do tracking in the RT | 13:28 |
efried | yup | 13:29 |
efried | where "device" will eventually encompass NUMA-split proc/mem | 13:29 |
*** BjoernT has joined #openstack-nova | 13:29 | |
sean-k-mooney | ram and cpus are just devices in disguise | 13:29 |
*** BjoernT has joined #openstack-nova | 13:30 | |
*** BjoernT has quit IRC | 13:30 | |
alex_xu | I have too much fun with same host resize today | 13:30 |
efried | bbiab | 13:30 |
*** BjoernT has joined #openstack-nova | 13:31 | |
*** BjoernT has joined #openstack-nova | 13:32 | |
*** BjoernT has quit IRC | 13:32 | |
*** BjoernT has joined #openstack-nova | 13:33 | |
*** BjoernT has quit IRC | 13:33 | |
*** BjoernT has joined #openstack-nova | 13:34 | |
*** BjoernT has quit IRC | 13:34 | |
*** BjoernT has joined #openstack-nova | 13:35 | |
*** BjoernT has quit IRC | 13:35 | |
*** lbragstad has joined #openstack-nova | 13:35 | |
*** BjoernT has joined #openstack-nova | 13:35 | |
*** BjoernT has joined #openstack-nova | 13:36 | |
*** BjoernT has quit IRC | 13:36 | |
*** needscoffee is now known as kmalloc | 13:36 | |
stephenfin | mriedem, efried: I'm hitting the boundaries of my resource tracking know how and could do with some help | 13:36 |
stephenfin | I've got this patch to start tracking pcpus as a ComputeNode/HostState field https://review.opendev.org/#/c/671794/ | 13:37 |
*** BjoernT has joined #openstack-nova | 13:37 | |
*** BjoernT has quit IRC | 13:37 | |
stephenfin | It's not complete, and it's also breaking some functional/tempest tests because one of those objects is being spat out for the os-hypervisors API | 13:37 |
*** BjoernT has joined #openstack-nova | 13:38 | |
*** BjoernT has quit IRC | 13:38 | |
stephenfin | However, does any of the resource tracking claims stuff matter for vcpus, ram and disk anymore, given we're actually requesting things from placement? | 13:38 |
*** BjoernT has joined #openstack-nova | 13:39 | |
openstackgerrit | sahid proposed openstack/nova master: cellv2: make update_cell to support cell0 https://review.opendev.org/672045 | 13:39 |
stephenfin | Referring specifically to this https://github.com/openstack/nova/blob/master/nova/compute/claims.py#L97-L109 | 13:39 |
*** BjoernT has joined #openstack-nova | 13:39 | |
*** BjoernT has quit IRC | 13:39 | |
mriedem | they used to matter for scheduler drivers that didn't use placement, like the caching scheduler, but that's gone now, | 13:40 |
*** spatel has joined #openstack-nova | 13:40 | |
mriedem | the overhead stuff comes from the driver and that's still part of the claim | 13:40 |
mriedem | i know the libvirt driver claims overhead in certain case | 13:40 |
mriedem | *cases | 13:40 |
*** BjoernT has joined #openstack-nova | 13:40 | |
*** BjoernT has quit IRC | 13:40 | |
mriedem | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L884 | 13:40 |
stephenfin | Yup, for emulator threads, though I do have a chance to stop doing that now with the PCPU stuff | 13:40 |
mriedem | there are other drivers that implement that method as well | 13:41 |
mriedem | hyperv i think is one | 13:41 |
*** BjoernT has joined #openstack-nova | 13:41 | |
*** BjoernT has quit IRC | 13:41 | |
mriedem | https://github.com/openstack/nova/blob/master/nova/virt/hyperv/vmops.py#L119 | 13:41 |
mriedem | besides the overhead stuff i'm not sure anything cares about vcpus/ram/disk claims in the rt anymore since that should be handled by the scheduler + placement now | 13:42 |
*** BjoernT has joined #openstack-nova | 13:42 | |
mriedem | dansmith at one time had a patch to rip that out | 13:43 |
*** BjoernT has quit IRC | 13:43 | |
stephenfin | Hmm, so it sounds like that alone prevents me from simply removing this stuff https://github.com/openstack/nova/blob/master/nova/compute/claims.py#L160-L162 | 13:43 |
mriedem | https://review.opendev.org/#/c/551026/ | 13:43 |
dansmith | https://review.opendev.org/#/c/551026/ | 13:43 |
dansmith | heh | 13:43 |
*** beekneemech is now known as bnemec | 13:43 | |
stephenfin | Okay, I might incorporate that into my series, if that makes sense? | 13:44 |
dansmith | we never removed cachingscheduler did we? but surely we can now yeah? | 13:44 |
mriedem | we did | 13:44 |
openstackgerrit | sahid proposed openstack/nova master: cellv2: make update_cell to support cell0 https://review.opendev.org/672045 | 13:44 |
mriedem | i just left a comment on dan's patch | 13:44 |
dansmith | oh sweet | 13:44 |
*** BjoernT has joined #openstack-nova | 13:45 | |
*** BjoernT has quit IRC | 13:45 | |
mriedem | in boston we talked about the overhead-from-driver problem for claims / placement and i think had mostly just shrugged it off saying as a workaround operators could bump up the reserved amount of inventory on hosts that would use vms that need the overhead space | 13:45 |
*** BjoernT has joined #openstack-nova | 13:45 | |
*** BjoernT has quit IRC | 13:46 | |
*** BjoernT has joined #openstack-nova | 13:46 | |
mriedem | caching scheduler was removed in stein fwiw | 13:46 |
*** BjoernT has quit IRC | 13:46 | |
*** BjoernT has joined #openstack-nova | 13:47 | |
*** BjoernT has quit IRC | 13:47 | |
stephenfin | Hmm, okay, I clearly need to think about this | 13:47 |
*** Luzi has quit IRC | 13:48 | |
*** BjoernT has joined #openstack-nova | 13:48 | |
stephenfin | Starting with figuring out if anything else is using/cares about the ComputeNode.vcpus/vcpus_used field | 13:48 |
*** BjoernT has quit IRC | 13:48 | |
stephenfin | (and therefore whether I need a pcpu equivalent) | 13:48 |
*** shilpasd has quit IRC | 13:49 | |
*** BjoernT has joined #openstack-nova | 13:49 | |
*** BjoernT has quit IRC | 13:49 | |
*** TxGirlGeek has joined #openstack-nova | 13:49 | |
*** BjoernT has joined #openstack-nova | 13:49 | |
*** BjoernT has quit IRC | 13:50 | |
*** BjoernT has joined #openstack-nova | 13:50 | |
*** BjoernT has quit IRC | 13:50 | |
*** BjoernT has joined #openstack-nova | 13:51 | |
*** BjoernT has joined #openstack-nova | 13:52 | |
*** BjoernT has quit IRC | 13:52 | |
*** TxGirlGeek has quit IRC | 13:52 | |
stephenfin | mriedem: Can/should 'Aggregate(Ram|Disk|Core)Filter' be deprecated? | 13:55 |
dansmith | those are widely used I believe | 13:55 |
*** tesseract has quit IRC | 13:55 | |
dansmith | and don't have placement-based alternatives because they work on aggregates | 13:55 |
dansmith | we broke the allocation ratio one and people revolted | 13:56 |
stephenfin | I'm misreading this release note so, I guess :( https://github.com/openstack/nova/blob/master/releasenotes/notes/agg-resource-filters-6e24c92a69afa85f.yaml | 13:56 |
stephenfin | That suggests to me that aggregate-based overcommit ratios aren't a thing anymore | 13:56 |
stephenfin | and the filters can be removed because they're useless | 13:57 |
dansmith | there are more functions in the aggregate filters than just allocation ratios right? | 13:57 |
dansmith | however, that breakage was what led to the revolt I think | 13:57 |
dansmith | i.e. that commit | 13:57 |
stephenfin | Oh, quite possibly :) Looking at the AggregateDiskFilter, I don't see much more happening https://github.com/openstack/nova/blob/master/nova/scheduler/filters/disk_filter.py | 13:58 |
mriedem | that release note was from ocata i thought, and yeah part of the misunderstanding | 13:59 |
stephenfin | (Though the __init__.py for same should probably be overridden to not log the warning) | 13:59 |
mriedem | a more detailed description is in the docs now https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#allocation-ratios | 13:59 |
stephenfin | wait, nvm. I see DEPRECATED now | 13:59 |
*** tesseract has joined #openstack-nova | 14:00 | |
dansmith | stephenfin: yeah maybe I'm confused, but I thought there were a couple other aggregate-based operations other than just the ratio for those, but maybe I'm wrong | 14:00 |
dansmith | or maybe the filters I'm thinking of aren't tied specifically to those three classes | 14:00 |
mriedem | stephenfin: any out of tree filters/weighers could be using ComputeNode.vcpus/vcpus_used | 14:00 |
mriedem | plus the os-hypervisors api is using them | 14:00 |
mriedem | "clearly need to think about this" is probably the understatement of the year for that pcpu overhaul blueprint | 14:01 |
stephenfin | Fair point. So I probably can't outright remove them, but I'm thinking/hoping I don't need to distinguish between pcpus/vcpus at that particular level | 14:01 |
stephenfin | Amen. | 14:02 |
stephenfin | mriedem: fwiw, the doc linked says the exact same thing as the release note and gives me the same impression. If I'm reading it correctly, I assume that "Note" needs to be removed? https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#scheduling-considerations | 14:03 |
mriedem | no, | 14:06 |
mriedem | it's saying the allocation ratio set via the aggregate metadata is broken since ocata, | 14:06 |
mriedem | and you *have* to set the allocation ratio per the computes / resource providers instead | 14:07 |
mriedem | which leads into the usage scenarios section | 14:07 |
mriedem | https://review.opendev.org/#/c/640898/ is also related | 14:09 |
mriedem | and was something we talked about at the ptg in dublin | 14:09 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test for resize crash compute restart revert https://review.opendev.org/670393 | 14:10 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Pass migration to finish_revert_migration() https://review.opendev.org/668631 | 14:10 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: [DNM] testing bug/1813789 revert resize events https://review.opendev.org/664442 | 14:10 |
stephenfin | I guess (3) is the important piece from that? If you want to use aggregate-based allocation ratios, you *must* set e.g. '[DEFAULT] cpu_allocation_ratio' to 'None' | 14:11 |
*** aojea has quit IRC | 14:15 | |
mriedem | those scenarios (1 and 3 specifically) are more about specific use cases when discussing this problem with operators, i.e. cern does everything with config mgmt and cares about scenario 1. iweb people exposed the host aggregate metadata stuff to users to control their own allocation ratios so they cared about the api part, which is scenario 3 | 14:15 |
*** ildikov has joined #openstack-nova | 14:16 | |
*** ttsiouts has quit IRC | 14:16 | |
mriedem | it's been quite awhile since i've had to think about this much, and at this point i don't really know why someone would even enable those filters now | 14:16 |
*** ttsiouts has joined #openstack-nova | 14:16 | |
stephenfin | yeah, I've probably gone far enough down this rabbit hole myself | 14:16 |
mriedem | https://review.opendev.org/#/c/544683/ was the related nova spec | 14:17 |
mriedem | which was superseded with mel's osc-placement patch | 14:17 |
mriedem | the aggregates api sugar in nova is that you can set the allocation ratios per aggregate in one call, which you can't do in placement | 14:17 |
stephenfin | summary: Can't remove ComputeNode.vcpus(_used) or HostState.vcpus(_used) yet because they might be used by external filters... | 14:17 |
mriedem | and the api right? | 14:17 |
stephenfin | and the API, correct | 14:17 |
mriedem | we have talked a few times about the os-hypervisors API, or part of it, just proxying to placement rather than rely on those values set by the RT | 14:18 |
stephenfin | Might want to put them behind "this field has been deprecated" getters/setters, but that's a nice-to-have | 14:18 |
stephenfin | No one should have (Core|Disk|Ram)Filter enabled so I don't need to worry about those and could probably remove them (it's been long enough) | 14:19 |
stephenfin | Because of that, I probably don't need to distinguish between vcpus and pcpus in the ComputeNode object and can just lump them in together | 14:19 |
stephenfin | The os-hypervisors API is already lying about available vcpus (overcommit ratios aren't respected) so what's another lie | 14:20 |
stephenfin | As you say, could update that to proxy to placement but that's tangential to this and probably shouldn't be lumped into what is already a somewhat large series | 14:20 |
*** ttsiouts has quit IRC | 14:21 | |
stephenfin | and I can probably remove the resource claiming stuff for CPU, RAM and disk like dansmith was doing but I need to think of a workaround for handling overhead | 14:21 |
dansmith | overhead is handled by setting reserved in the compute configs right? | 14:22 |
stephenfin | Afraid not. XenAPI seems to be adding some amount for each instance booted | 14:22 |
stephenfin | and that's what we use to account for the extra CPU consumed when you enable emulator thread offloading in Libvirt using the 'isolate' policy | 14:23 |
dansmith | right, we discussed that, | 14:23 |
dansmith | and I think we just settled on "meh, set enough reserved to cover enough of your instances" | 14:23 |
dansmith | in dublin, IIRC | 14:23 |
stephenfin | I'm happy with that but am I going to break people? | 14:24 |
*** artom has joined #openstack-nova | 14:24 | |
stephenfin | i.e. nova will now allow you to boot N+1 instances on a XenAPI host because overcommit is being ignored and the operator forgot to update their reserved | 14:24 |
mriedem | dansmith: boston was the first time i remember talking about it, but yeah it was awhile ago and the agreed workaround was bumping reserved in config | 14:24 |
mriedem | jay said he'd write a doc about how to calculate all that but... | 14:25 |
* dansmith nods | 14:25 | |
mriedem | https://bugs.launchpad.net/nova/+bug/1683858 has the details | 14:27 |
openstack | Launchpad bug 1683858 in OpenStack Compute (nova) "Allocation records do not contain overhead information" [Medium,Won't fix] | 14:27 |
stephenfin | I'm already changing all the things, so what's another thing to that pile | 14:28 |
*** artom has quit IRC | 14:29 | |
*** ttsiouts has joined #openstack-nova | 14:29 | |
efried | (stephenfin, I'm glad Matt & Dan answered you; I probably wouldn't have been much help there) | 14:35 |
bauzas | I'm honestly not really paying attention to this channel, but I saw the above discussion | 14:37 |
bauzas | stephenfin: dansmith: others: anything I can help with? | 14:37 |
stephenfin | bauzas: I think I'm good for now, but keep your ears open :) | 14:39 |
bauzas | okidoki | 14:42 |
efried | bauzas: mriedem, or johnthetubaguy: could I please ask you to do the final on https://review.opendev.org/#/c/651681/ ? I was going to proxy sean-k-mooney's +1, but I think that would be three Intels. | 14:43 |
bauzas | efried: lemme look | 14:43 |
efried | (auto-converge/post-copy) | 14:43 |
* bauzas needs to be more there in the channel | 14:44 | |
*** artom has joined #openstack-nova | 14:44 | |
*** artom has quit IRC | 14:45 | |
*** artom has joined #openstack-nova | 14:45 | |
artom | dansmith, could we get your stamp of (dis)approval on https://review.opendev.org/#/c/671471/ when you get a chance? tia :) | 14:45 |
bauzas | efried: oh this one ? | 14:46 |
bauzas | sure, I can +W | 14:46 |
efried | thanks | 14:46 |
bauzas | I already reviewed it once | 14:46 |
openstackgerrit | Bence Romsics proposed openstack/os-vif master: Insert osprofiler trace info as external_ids to the bridge table https://review.opendev.org/665715 | 14:48 |
*** jaosorior has joined #openstack-nova | 14:53 | |
*** spatel has quit IRC | 14:55 | |
mnaser | in nova, has there been some sort of 'standard' on what you do in code that is inherently race-y in the cases that you lose the race to another worker | 14:56 |
jaypipes | stephenfin: hyperv has a completely different calculation as well for overhead, IIRC. | 14:57 |
*** belmoreira has joined #openstack-nova | 14:57 | |
mnaser | in cinder, there's a cast operation that involves a race between workers, and the worker that loses seems to be raising an exception | 14:57 |
*** ircuser-1 has joined #openstack-nova | 14:57 | |
mnaser | which to me seems a little too much, raising an exception for a non-failing scenario | 14:57 |
efried | mnaser: That sounds really broad to me. I would think one would need to know more about the specific situation. | 14:58 |
efried | Sometimes you could retry. Sometimes you could log and continue. Sometimes you would have to fail the operation... | 14:59 |
mnaser | https://github.com/openstack/cinder/blob/master/cinder/objects/cleanable.py#L142-L155 | 14:59 |
mnaser | so the code pretty much gives me the indication that the other service created the worker, and now it'll resume happily after and we'll stop | 15:00 |
*** boxiang_ has quit IRC | 15:04 | |
*** ccamacho has quit IRC | 15:05 | |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove deprecated Core/Ram/DiskFilter https://review.opendev.org/672065 | 15:06 |
stephenfin | jaypipes: ack. This is going to be fun | 15:06 |
stephenfin | I might be kicking that can down the road... | 15:07 |
*** gyee has joined #openstack-nova | 15:08 | |
efried | mnaser: If the code is saying "make sure this thing gets cleaned" and the exception indicates "another thread cleaned it" then it seems like the right thing is to log info ("Another thread took care of this for us, carrying on") and proceed. | 15:10 |
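The "log and carry on" handling efried describes could be sketched roughly as below. This is only an illustration under stated assumptions: `AlreadyClaimed`, `ensure_cleaned` and the `claim` callable are hypothetical names, not the cinder or nova API.

```python
import logging

LOG = logging.getLogger(__name__)


class AlreadyClaimed(Exception):
    """Hypothetical error meaning another worker won the race."""


def ensure_cleaned(resource_id, claim):
    """Try to claim the cleanup work for resource_id.

    `claim` is a hypothetical callable that raises AlreadyClaimed when
    another worker got there first. Returns True if this worker owns the
    cleanup, False if another worker already took care of it.
    """
    try:
        claim(resource_id)
    except AlreadyClaimed:
        # Losing the race is not a failure: log at info level and proceed.
        LOG.info("Another worker took care of %s for us, carrying on",
                 resource_id)
        return False
    return True
```

Whether losing the race should be info-level or a real error still depends on the specific operation, as noted earlier in the discussion.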
*** cdent has quit IRC | 15:11 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test to resize volume-backed server with zero root disk https://review.opendev.org/672067 | 15:12 |
*** cdent has joined #openstack-nova | 15:13 | |
efried | mriedem: Ima rebase & reapprove https://review.opendev.org/#/c/669523/1 unless you're already in the middle | 15:14 |
openstackgerrit | Eric Fried proposed openstack/nova master: Remove Newton-era min compute checks for server create with device tags https://review.opendev.org/669523 | 15:15 |
*** Alphazero_ has joined #openstack-nova | 15:16 | |
mriedem | efried: haven't noticed | 15:16 |
efried | done | 15:16 |
Alphazero_ | Hi All, I recently uninstalled and reinstalled keystone via Juju on an up and running cluster and have been receiving this message when trying to list existing instances: | 15:17 |
Alphazero_ | The server is currently unavailable. Please try again at a later time. | 15:18 |
Alphazero_ | (HTTP 503) (Request-ID: req-22f009ce-d65c-4795-8dd5-ff7c2a351d6d) | 15:18 |
Alphazero_ | checked nova.conf and its being configured by JuJu | 15:18 |
Alphazero_ | Any help would be greatly app. thanks! | 15:19 |
*** priteau has quit IRC | 15:20 | |
stephenfin | Alphazero_: You probably want #openstack or, at a stretch, #openstack-keystone | 15:25 |
Alphazero_ | whoops sorry ^^ just noticed the message at the top - will shift over... | 15:25 |
* artom humbly asks for non-RH eyes on https://review.opendev.org/#/c/670593/. We have a +3, looking for that extra point :) efried again? | 15:27 | |
artom | Hrmm, maybe wait for sean-k-mooney to weigh in, seems like he has doubts as to the completeness of the fix | 15:28 |
sean-k-mooney | i think you are still relying on the pf being bound to a networking driver and having a netdev | 15:29 |
sean-k-mooney | if the PF is bound to vfio-pci | 15:29 |
sean-k-mooney | it will not have a net dev | 15:29 |
sean-k-mooney | in which case im not sure pci_utils.get_mac_by_pci_address will work | 15:30 |
dansmith | artom: sorry meant to say "ack" above, but...ack. | 15:30 |
sean-k-mooney | im looking at that code now | 15:30 |
artom | dansmith, no worries, I saw the +W go through, and am thankful (in silence) | 15:30 |
dansmith | artom: ack | 15:30 |
*** lpetrut has joined #openstack-nova | 15:30 | |
sean-k-mooney | artom: so ya https://github.com/openstack/nova/blob/master/nova/pci/utils.py#L162-L180 relies on the PF being bound to a networking driver not vfio-pci | 15:31 |
sean-k-mooney | so in that case we will still skip exposing the metadata for that PF | 15:32 |
artom | sean-k-mooney, so, I did test this (unlike the original patch :( ), what would cause the PF to use networking driver vs vfio-pci? | 15:32 |
artom | Because in the env I had it was using the former, going by what you're saying | 15:33 |
sean-k-mooney | udev | 15:33 |
sean-k-mooney | or the kernel configuration in general | 15:33 |
sean-k-mooney | try manually binding the PF driver to vfio-pci | 15:33 |
artom | Env is gone :( | 15:33 |
sean-k-mooney | it will still be usable by kvm but the metadata will be empty | 15:34 |
sean-k-mooney | your change won't break anything, it will just end up in the except block and return None | 15:34 |
artom | So, strictly speaking, better than what we have now? ;) | 15:34 |
sean-k-mooney | so your fix is valid if the PF is bound to say i40e | 15:34 |
sean-k-mooney | yep | 15:34 |
sean-k-mooney | as i said i think it's incomplete, not wrong. | 15:35 |
artom | Some of the PFs out there will start working, the others will remain device-tag-less | 15:35 |
sean-k-mooney | so maybe partial-bug? | 15:35 |
artom | That'd wfm | 15:35 |
sean-k-mooney | instead of close-bug and i'd be +1 on it | 15:35 |
artom | Ack, leave a review and I'll update the commit message | 15:35 |
sean-k-mooney | sure thing. ill do that in a few minutes. just grabbing something to drink brb | 15:36 |
artom | And if I ever get an SRIOV env again, I'll come up with a patch to address vfio-pci-bound PFs, presumably with some new method to find the MAC address from the PCI address. | 15:36 |
artom | mriedem, thanks for looking, replied | 15:41 |
artom | And welcome back, btw. No way 1 week was enough, but we're still happy to see you :) | 15:41 |
jangutter | artom: I think if the PF is bound to vfio-pci then the kernel knows almost nothing about the device. Only way to get info is via some kind of backdoor (like another PCI device that happens to "manage" the PF). | 15:44 |
artom | jangutter, so in those cases we should just give up on getting its MAC address? | 15:45 |
jangutter | artom: the really ugly way is to rebind the PCI device to a kernel driver, let it probe, check what the MAC is, rebind it to vfio-pci. | 15:46 |
artom | jangutter, I feel like that's not Nova's job... | 15:46 |
jangutter | artom: I concur. | 15:46 |
artom | Would the device's PCI address be the same in the guest as on the host? If so it'd be enough to just expose tag + PCI address in the metadata | 15:47 |
artom | If not, exposing just a device tag makes no sense | 15:47 |
sean-k-mooney | jangutter: well for neutron sriov passthrough of PF we are meant to be discovering the PF mac and setting the neutron port to it | 15:47 |
sean-k-mooney | jangutter: im not sure exactly how we do that however | 15:48 |
sean-k-mooney | im hoping it's more robust than what we do for tagging | 15:48 |
sean-k-mooney | jangutter: but yes when it's bound to vfio-pci the kernel only knows what is reported in the pci config space | 15:49 |
sean-k-mooney | so what is reported by lspci | 15:49 |
sean-k-mooney | but i don't think the mac is part of that | 15:49 |
jangutter | yeah, the problem is that vfio-pci is _not_ a driver. It's technically just "hey, this is a raw pci device and I'm not going to interpret anything about it." | 15:49 |
jangutter | there's code in libvirt that does the whole "driver rebinding" thing for the MLX-3 series for VF's, because it needs info from the driver to figure out which port the VF is connected to. | 15:50 |
cdent | efried: am I right that the shared disk spec is effectively dead: https://review.opendev.org/650188 (trying to trim my attention) | 15:50 |
jangutter | It would not be surprising to me if that's also done for PF's. | 15:51 |
artom | My stomach is angry, so I'm going to go get phood, but I'll read the scrollback when I get back | 15:51 |
jangutter | could this be exposed via devlink? My spider-sense says 'unlikely'. | 15:52 |
*** mlavalle has joined #openstack-nova | 16:01 | |
*** udesale has quit IRC | 16:08 | |
openstackgerrit | Andreas Jaeger proposed openstack/nova master: Update api-ref location https://review.opendev.org/672077 | 16:09 |
*** belmoreira has quit IRC | 16:09 | |
*** ttsiouts has quit IRC | 16:11 | |
*** ttsiouts has joined #openstack-nova | 16:12 | |
*** belmoreira has joined #openstack-nova | 16:16 | |
*** lpetrut has quit IRC | 16:16 | |
*** jaosorior has quit IRC | 16:17 | |
*** belmoreira has quit IRC | 16:17 | |
*** ttsiouts has quit IRC | 16:17 | |
*** tssurya has quit IRC | 16:17 | |
sean-k-mooney | artom: commented. i gave it a tentative +1 rather than -1 given it looks like the vfio-pci case was never supported for metadata generation | 16:20 |
*** rpittau is now known as rpittau|afk | 16:21 | |
artom | sean-k-mooney, I'm all for respinning with Partial-bug | 16:21 |
artom | I'll also improve the LOG message | 16:22 |
*** mlavalle has quit IRC | 16:23 | |
artom | sean-k-mooney, what does the passed through PF look like from the guest? I'm assuming its guest PCI address won't be the same as the host | 16:23 |
artom | My train of thought is - if we can't find its MAC, and we don't know what its guest PCI address is, there's no point in exposing just a tag, because the guest will have no way of associating that tag with a device | 16:24 |
*** davidsha has quit IRC | 16:24 | |
sean-k-mooney | it wont but the guest pci address is stored in the target element of the host dev element and the host pci address is in the source element | 16:25 |
artom | Ah, so we ignore the MAC entirely, and just expose that along with the tag | 16:25 |
sean-k-mooney | yep | 16:25 |
artom | I could do that in this patch as well, however I have no way if testing that | 16:26 |
artom | *of | 16:26 |
sean-k-mooney | i could probably test it on my server that i use for sriov dev but im not in shannon at the moment | 16:26 |
openstackgerrit | Andreas Jaeger proposed openstack/nova master: Update api-ref location https://review.opendev.org/672077 | 16:28 |
*** cdent has quit IRC | 16:28 | |
artom | It'd be a beefier change though | 16:28 |
artom | Currently LibvirtConfigGuestHostdevPCI doesn't even understand <target> | 16:29 |
artom | Which actually means we're putting the wrong PCI address into the metadata, as we're using the <source> | 16:29 |
artom | *facepalm* | 16:29 |
artom | ... but we have no concept of PCI address in the VIF data structure that we use to store the tags | 16:31 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Rename exception argument https://review.opendev.org/671795 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: trivial: Remove unused function parameter https://review.opendev.org/671796 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.get_host_numa_usage_from_instance' https://review.opendev.org/671797 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.host_topology_and_format_from_host' https://review.opendev.org/671798 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'hardware.instance_topology_from_instance' https://review.opendev.org/671799 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: WIP: hardware: Differentiate between shared and dedicated CPUs https://review.opendev.org/671800 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Add support translating CPU policy extra specs, image meta https://review.opendev.org/671801 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove deprecated CPU, RAM, disk claiming in resource tracker https://review.opendev.org/551026 | 16:53 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove 'nova.virt.driver.ComputeDriver.estimate_instance_overhead' https://review.opendev.org/672106 | 16:53 |
*** igordc has joined #openstack-nova | 16:54 | |
stephenfin | dansmith: ^ I think the reno I've added to https://review.opendev.org/551026 should cover how to handle overhead sufficiently, when combined with https://review.opendev.org/671801 | 16:55 |
stephenfin | (Just FYI. The series is still WIP) | 16:55 |
*** ricolin has quit IRC | 16:57 | |
mriedem | is it just me or are the zuul comments now always showing up regardless of the toggle CI button? | 16:58 |
dansmith | stephenfin: commented on something you said in there | 16:58 |
artom | sean-k-mooney, still around? Can I pick your brain for a thing? | 17:01 |
artom | (Related to that device tagging patch) | 17:02 |
stephenfin | mriedem: The zuul comments, yes. Third party CI are hidden | 17:02 |
stephenfin | I suspect that was intentional, since the button now reads "Toggle Extra CI" | 17:02 |
stephenfin | dansmith: Think you might be right, in which case this is much ado about nothing. I'll respond once I've made sure | 17:07 |
efried | artom: looks like mriedem is engaged with https://review.opendev.org/#/c/670593/ ? | 17:07 |
artom | efried, yep, thanks for looking | 17:07 |
dansmith | stephenfin: I didn't realize you were worried about something acutely happening now when we discussed earlier, I thought you were just generally wondering about the impact of moving away from those filters | 17:08 |
artom | and I think we're going back to drawing board for that one anyways | 17:08 |
efried | cdent: re shared disk spec, if it doesn't get some comeback from the author (or get picked up by someone else) then I guess it's dead, yeah. | 17:08 |
artom | The review process (sean-k-mooney mostly) unearthed other problems in that code | 17:08 |
stephenfin | No, it was pretty much the 'hw:emulator_threads_policy=isolate' case that I was worried about breaking | 17:08 |
*** Alphazero_ has quit IRC | 17:09 | |
*** amodi has quit IRC | 17:13 | |
*** mlavalle has joined #openstack-nova | 17:14 | |
sean-k-mooney | artom: my mother just arrived home but yes im still around | 17:15 |
openstackgerrit | Merged openstack/nova master: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/671471 | 17:16 |
artom | sean-k-mooney, I thought I had a way of solving the tag <-> XML device problem for VFIO-PCI, but when writing it out in the review, turns out I didn't | 17:16 |
artom | I was thinking of using instance.pci_devices, but they don't have any MAC either | 17:16 |
artom | Unless we do a DB migration and start shoving them in there | 17:17 |
sean-k-mooney | you can use the vif['binding_profile']['pci_slot'] | 17:17 |
openstackgerrit | Andreas Jaeger proposed openstack/nova master: Update api-ref location https://review.opendev.org/672077 | 17:17 |
artom | sean-k-mooney, that won't solve the problem | 17:18 |
sean-k-mooney | that will be the host pci address for that port | 17:18 |
*** mvkr_ has quit IRC | 17:18 | |
*** _mlavalle_1 has joined #openstack-nova | 17:19 | |
sean-k-mooney | artom: we can construct the NetworkInterfaceMetadata without setting the mac field | 17:21 |
sean-k-mooney | or rather setting it conditionally | 17:21 |
artom | sean-k-mooney, I know, but we need to correlate the tag in nova/objects/network_request.py to the correct device in the instance XML | 17:22 |
artom | Currently that's done by MAC address, because it's the one piece of info that's common (sometimes) to both | 17:22 |
*** bbowen has quit IRC | 17:22 | |
*** mlavalle has quit IRC | 17:22 | |
sean-k-mooney | artom: but it's not the "one" piece of info that is common | 17:23 |
sean-k-mooney | the pci address should be common too i believe | 17:23 |
artom | sean-k-mooney, right, but all of the existing code assumes the MAC will be used | 17:23 |
artom | So we write the tag in a VirtualInterface object, which has the MAC and nothing else. | 17:24 |
sean-k-mooney | well that is entirely broken | 17:24 |
artom | Now you tell me :P | 17:25 |
sean-k-mooney | you can have two ports with the same mac on different networks and attach them both to the same instance | 17:25 |
*** panda has quit IRC | 17:25 | |
artom | Where were you 3 years ago or whenever Newton happened | 17:25 |
artom | ;) | 17:25 |
artom | See, I thought Neutron placed a MAC uniqueness constraint per instance as well, not just per-Network | 17:26 |
*** panda has joined #openstack-nova | 17:26 | |
sean-k-mooney | i probably pointed that out 3 years ago if i reviewed it :) | 17:26 |
sean-k-mooney | no | 17:26 |
sean-k-mooney | it does not | 17:26 |
sean-k-mooney | but nova probably does | 17:27 |
artom | I recall *something* of that sort | 17:27 |
sean-k-mooney | in any case vifs_to_expose is a list of nova.model.Vif objects? | 17:27 |
artom | No, VirtualInterface | 17:27 |
sean-k-mooney | ah that was what i was about to ask | 17:27 |
sean-k-mooney | looking at the object https://github.com/openstack/nova/blob/master/nova/objects/virtual_interface.py#L34-L49 | 17:28 |
sean-k-mooney | you have the neutron port uuid | 17:29 |
sean-k-mooney | so you can get the full vifs from the network info cache | 17:29 |
sean-k-mooney | and then get the mac for that | 17:29 |
artom | We have the network ID, not the port | 17:30 |
sean-k-mooney | is this not the port id https://github.com/openstack/nova/blob/master/nova/objects/virtual_interface.py#L47 | 17:32 |
artom | NO that | 17:32 |
artom | That's the VIF uuid itself | 17:32 |
sean-k-mooney | with is the VIF uuid? | 17:33 |
sean-k-mooney | why would it need one? | 17:33 |
sean-k-mooney | also is this really the unique constraint https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L860-L861 | 17:33 |
*** _mlavalle_1 has quit IRC | 17:33 | |
*** mlavalle has joined #openstack-nova | 17:34 | |
*** jmlowe has quit IRC | 17:34 | |
artom | sean-k-mooney, IIRC there was something else higher up | 17:35 |
artom | (About the constraint) | 17:35 |
sean-k-mooney | well that constraint is wrong | 17:35 |
sean-k-mooney | it would only allow 1 instance of the mac in a cell | 17:35 |
sean-k-mooney | which is not required by neutron | 17:35 |
artom | Makes device tagging mostly work though ;) | 17:36 |
sean-k-mooney | im not sure we want to fix all the issues here in one patch | 17:37 |
sean-k-mooney | we could but im not sure you want to have to write db migrations to fix this | 17:37 |
sean-k-mooney | the unique constraint should be (address,deleted,uuid) if uuid is the network uuid | 17:39 |
artom | sean-k-mooney, btw, just double checked, and the virtual_interface uuid does appear to be the interface UUID, not the port | 17:40 |
*** bbowen has joined #openstack-nova | 17:40 | |
sean-k-mooney | the instance uuid should be in the instance_uuid field | 17:40 |
sean-k-mooney | not the uuid field | 17:40 |
artom | I mean the virtual_interface has its own UUID, much like an instance | 17:41 |
artom | Dunno why, but that's what it is | 17:41 |
sean-k-mooney | a ok | 17:41 |
artom | I checked on my devstack, the UUIDs don't match between ports and virtual_interfaces | 17:42 |
sean-k-mooney | in that case the constraint would be (address,deleted,network_id) | 17:42 |
sean-k-mooney | ok | 17:42 |
artom | I don't disagree, but it has the side effect of allowing device tagging to work, and it's kinda too late to impose it now | 17:42 |
sean-k-mooney | it's a fairly severe regression | 17:43 |
sean-k-mooney | that was introduced a long time ago | 17:43 |
*** ralonsoh has quit IRC | 17:43 | |
sean-k-mooney | can you try and boot two vms with ports that have the same mac on different network and see what happens? | 17:44 |
sean-k-mooney | i need to run to the shop to grab food for dinner | 17:44 |
artom | Sure, thanks for the help so far | 17:45 |
sean-k-mooney | ill be back in an hour or so. for now i would suggest lets just fix the case we know is broken but should work | 17:45 |
artom | I suppose | 17:45 |
sean-k-mooney | we can try and figure out how to make vfio-pci work as a separate patch | 17:45 |
artom | Yeah, incremental improvement and all that | 17:46 |
sean-k-mooney | yep. i also agree with mriedem it would be nice to add a functional test for this if we can. | 17:46 |
sean-k-mooney | anyway ill be back in about an hour and a half after dinner | 17:47 |
artom | That's always the problem with functional tests - unless you *know* all the possible things the hardware can do, they're mostly useless | 17:47 |
*** ociuhandu_ has joined #openstack-nova | 17:47 | |
mriedem | i didn't necessarily ask for a functional test, | 17:47 |
mriedem | anything in functional testing for that would have to be stubbed with fakelibvirt anyway | 17:47 |
artom | Otherwise you're essentially re-writing the same code, but shaped like a test | 17:47 |
mriedem | bbiab but i'll review https://review.opendev.org/#/c/667177/ when i get back | 17:48 |
artom | And clearly I *don't* know all the possible things the hardware can do, that's why we're here in the first place | 17:48 |
*** ociuhandu has quit IRC | 17:49 | |
*** ociuhandu_ has quit IRC | 17:52 | |
sean-k-mooney | ya i guess that is fair. i wont be back at my place in shannon until wednesday but i can look at recreating the vfio-pci case later this week or early next week. | 17:55 |
sean-k-mooney | i was assuming we would just extend https://github.com/openstack/nova/blob/master/nova/tests/functional/libvirt/test_pci_sriov_servers.py to check metadata was generated. | 17:56 |
artom | sean-k-mooney, for the record, 2 VMs with identical MACs on different networks (1 VM per network) work fine | 17:57 |
sean-k-mooney | cool, the unique constraint must have been changed somewhere | 17:59 |
artom | ... and actually, 1 VM with 2 ports, same MAC on both but different networks, also works fine | 17:59 |
artom | Which is... worrying | 17:59 |
*** jmlowe has joined #openstack-nova | 17:59 | |
*** tesseract has quit IRC | 18:01 | |
artom | Oh, the address is now being stored as MAC/port_uuid | 18:01 |
artom | This seems new... | 18:01 |
sean-k-mooney | oh right i remember that hack | 18:07 |
sean-k-mooney | ok that makes sense | 18:07 |
sean-k-mooney | we did it that way because nova networks required a unique mac | 18:08 |
sean-k-mooney | but neutron did not | 18:08 |
artom | So https://review.opendev.org/#/c/304511/, which depends on MAC address global uniqueness (or at least per-instance uniqueness), merged the same day as https://review.opendev.org/#/c/336069/, which removed said uniqueness | 18:08 |
mriedem | that's from newton | 18:12 |
artom | I know, I'm just going back in time and realizing that device tagging never fully worked | 18:12 |
dansmith | WAT | 18:13 |
artom | Because if you had a VM with 2 identically-MAC'ed NICs, device tags could get switched around | 18:13 |
artom | Most people probably have autogenerated MACs, so the likelihood of collisions is pretty low (made apparent by this not having been reported in 3 years), but still technically broken | 18:15 |
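The collision artom describes can be illustrated with a minimal sketch. It assumes tags are looked up purely by MAC address, as described above; the tuples, network names and function names are made up for illustration and are not nova's actual data structures.

```python
# Hypothetical (mac, network, tag) records for one instance with two NICs
# that share a MAC address on different networks.
vifs = [
    ("fa:16:3e:00:00:01", "net-a", "front"),
    ("fa:16:3e:00:00:01", "net-b", "back"),
]


def index_tags_by_mac(records):
    """Key tags by MAC only: a duplicate MAC overwrites the earlier tag."""
    return {mac: tag for mac, _network, tag in records}


def index_tags_by_mac_and_network(records):
    """Key tags by (MAC, network): both NICs keep their own tag."""
    return {(mac, network): tag for mac, network, tag in records}


by_mac = index_tags_by_mac(vifs)
by_mac_and_network = index_tags_by_mac_and_network(vifs)
```

With the MAC-only index the two NICs collapse into a single entry, so one tag is lost or attached to the wrong device; any additional key that is common to both sides (the discussion suggests the PCI address) would avoid that.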
dansmith | ah well, that's less concerning to me I think | 18:16 |
dansmith | a single machine with two identical macs on different networks would be confusing to the guest OS as well | 18:17 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Device tags: don't pass pf_interface=True to get_mac_by_pci_address https://review.opendev.org/670593 | 18:27 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: WIP: Device tagging: expose target PCI address, not source https://review.opendev.org/672127 | 18:27 |
artom | mriedem, sean-k-mooney ^^ | 18:27 |
artom | (Thanks for hitting those backports!) | 18:28 |
mriedem | you didn't really need to rebase this... https://review.opendev.org/#/c/670593/1..2 | 18:29 |
sean-k-mooney | dansmith: i think the usecase for identical mac on different networks had something to do with bonds/loadbalancing but its limited scope for when it actully makes sense | 18:34 |
dansmith | sean-k-mooney: that only makes sense if you've got upstream filtering by the network or something.. normally bonding two nics in linux uses two macs and the bonding agent chooses one | 18:34 |
sean-k-mooney | i know that people used to use it with sriov and custom physnets to force VFs to come from different physical NICs which they would bond in the guest | 18:35 |
sean-k-mooney | so they would create 2 networks with the same vlan on different physnets that were physically the same network then create two ports with the same mac and bond them | 18:36 |
dansmith | presumably because neutron or ovs or whatever will not allow you to transmit as a different mac than the port expects right? | 18:37 |
sean-k-mooney | dansmith: right the correct way to do it would be two macs and then use the allowed address pairs extension to allow the other mac | 18:37 |
dansmith | I understand the use as a workaround, but it's a nightmare for exactly this sort of confusion to nova and the guest inside | 18:37 |
sean-k-mooney | yes neutron mac anti spoofing rules would drop it if you didn't use the allowed address pairs extension | 18:37 |
sean-k-mooney | at least for ovs | 18:38 |
dansmith | aye | 18:38 |
sean-k-mooney | for sriov i think it would similarly be dropped by the nic unless you told the nic specifically to allow the other mac | 18:38 |
sean-k-mooney | im not sure if we support that or not | 18:38 |
*** jmlowe has quit IRC | 18:39 | |
sean-k-mooney | dansmith: but ya it's a pain in the ass and really just a hack | 18:39 |
artom | mriedem, failure to git review -R strikes again :( | 18:42 |
efried | artom: I think you skipped out the other day when I mentioned this, but you can put | 18:45 |
efried | [gitreview] | 18:45 |
efried | rebase = false | 18:45 |
efried | in your .gitconfig and you'll never have to remember -R again. | 18:45 |
artom | efried, cheers for that | 18:46 |
artom | (Though I think in this case I actually did a git rebase -i mindlessly when I realized I'd forgotten something in the patch below the new WIP one) | 18:47 |
*** efried is now known as efried_pto | 18:47 | |
* efried_pto is out til tomorrow o/ | 18:47 | |
openstackgerrit | Andreas Jaeger proposed openstack/python-novaclient master: Update api-ref location https://review.opendev.org/672135 | 18:52 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add FUP unit test for port heal allocations https://review.opendev.org/672142 | 19:06 |
sean-k-mooney | git review does not automatically rebase | 19:07 |
sean-k-mooney | it checks if the patch can be rebased/merged but it doesn't auto rebase | 19:08 |
sean-k-mooney | man git-review | 19:08 |
* sean-k-mooney wrong window | 19:09 | |
mriedem | melwitt: i cherry picked this using the gerrit ui https://review.opendev.org/#/c/672038/ | 19:10 |
mriedem | so i'm not sure what merge conflict you're seeing | 19:10 |
melwitt | mriedem: I know. just saying when I cherry-pick -x it, the test conflicts. unless I've done something wrong | 19:11 |
mriedem | i'm not sure why the gerrit ui wouldn't fail then | 19:13 |
mriedem | usually the ui is pickier | 19:13 |
*** eharney has quit IRC | 19:23 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Correct project/user id descriptions for os-instance-actions https://review.opendev.org/670027 | 19:24 |
*** igordc has quit IRC | 19:27 | |
*** Roamer` has quit IRC | 19:30 | |
*** igordc has joined #openstack-nova | 19:30 | |
mriedem | easy api-ref fix ^ | 19:30 |
*** bbowen_ has joined #openstack-nova | 19:33 | |
*** bbowen has quit IRC | 19:36 | |
*** sean-k-mooney has quit IRC | 19:37 | |
*** lbragstad has quit IRC | 19:37 | |
*** sean-k-mooney has joined #openstack-nova | 19:39 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/stein: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672154 | 19:44 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/stein: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672154 | 19:47 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/rocky: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672155 | 19:47 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/stein: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672154 | 19:48 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/rocky: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672155 | 19:49 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/queens: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672161 | 20:00 |
*** eharney has joined #openstack-nova | 20:02 | |
*** igordc has quit IRC | 20:03 | |
*** maciejjozefczyk has quit IRC | 20:07 | |
mriedem | dansmith: Kevin_Zheng: maybe you have ideas on this spec https://review.opendev.org/#/c/667894/ - i feel like we've talked about this before, but in this case they basically just want to know who (among several admins) initiated a migration operation on a server. that could be determined via some existing APIs but it gets a bit clunky to handle client-side. | 20:07 |
*** igordc has joined #openstack-nova | 20:08 | |
*** jmlowe has joined #openstack-nova | 20:10 | |
*** bbowen_ has quit IRC | 20:24 | |
dansmith | ugh | 20:24 |
openstackgerrit | Artom Lifshitz proposed openstack/nova stable/queens: libvirt: move checking CONF.my_ip to init_host() https://review.opendev.org/672161 | 20:30 |
*** ivve has joined #openstack-nova | 20:32 | |
openstackgerrit | sean mooney proposed openstack/nova master: libvirt: harden get_domain_capabilities https://review.opendev.org/670189 | 20:37 |
openstackgerrit | sean mooney proposed openstack/nova master: Libvirt: report storage bus traits https://review.opendev.org/666914 | 20:37 |
openstackgerrit | sean mooney proposed openstack/nova master: use domain capablites to get supported device models https://review.opendev.org/666915 | 20:37 |
openstackgerrit | sean mooney proposed openstack/nova master: Add transform_image_metadata request filter https://review.opendev.org/665775 | 20:37 |
*** amodi has joined #openstack-nova | 20:38 | |
*** luksky11 has joined #openstack-nova | 20:39 | |
*** xek has quit IRC | 20:44 | |
*** xek has joined #openstack-nova | 20:45 | |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: Pass migration to finish_revert_migration() https://review.opendev.org/668631 | 20:53 |
openstackgerrit | Artom Lifshitz proposed openstack/nova master: [DNM] testing bug/1813789 revert resize events https://review.opendev.org/664442 | 20:53 |
artom | Check it out, I updated a patch without giving reviewers a migraine because I rebased it | 20:54 |
*** vishwanathj has joined #openstack-nova | 20:57 | |
*** whoami-rajat has quit IRC | 21:01 | |
*** vishwanathj has quit IRC | 21:02 | |
*** vishwanathj has joined #openstack-nova | 21:02 | |
mriedem | artom: +2 on that but remember to send the courtesy warning | 21:03 |
*** xek has quit IRC | 21:03 | |
mriedem | unless you already did | 21:03 |
*** xek has joined #openstack-nova | 21:04 | |
*** mvkr_ has joined #openstack-nova | 21:04 | |
artom | mriedem, not yet, I can send it now | 21:04 |
artom | Your patch is below it, so it's not like a second +2 will send mine through the gate tomorrow | 21:05 |
artom | (Not that your patch needs work, it just adds time) | 21:05 |
artom | Sent. | 21:09 |
*** pcaruana has quit IRC | 21:10 | |
*** vishwanathj has quit IRC | 21:30 | |
*** vishwanathj has joined #openstack-nova | 21:30 | |
*** artom has quit IRC | 21:31 | |
openstackgerrit | sean mooney proposed openstack/nova master: libvirt: delegate ovs plug to os-vif https://review.opendev.org/602432 | 21:33 |
*** vishwanathj has quit IRC | 21:35 | |
*** luksky11 has quit IRC | 21:40 | |
*** bbowen has joined #openstack-nova | 21:42 | |
*** sean-k-mooney has quit IRC | 21:43 | |
openstackgerrit | Merged openstack/nova stable/stein: Revert resize: wait for events according to hybrid plug https://review.opendev.org/670645 | 21:46 |
openstackgerrit | Merged openstack/nova master: Remove Newton-era min compute checks for server create with device tags https://review.opendev.org/669523 | 21:46 |
openstackgerrit | Merged openstack/nova master: nova-manage: heal port allocations https://review.opendev.org/637955 | 21:59 |
openstackgerrit | Merged openstack/nova master: Move consts from neutronv2/api to constants module https://review.opendev.org/668945 | 21:59 |
openstackgerrit | Merged openstack/nova master: Use neutron contants in cmd/manage.py https://review.opendev.org/668946 | 21:59 |
openstackgerrit | Merged openstack/nova master: Add 'resource_request' to neutronv2/constants https://review.opendev.org/668947 | 21:59 |
*** betherly has joined #openstack-nova | 22:00 | |
*** slaweq has quit IRC | 22:01 | |
*** betherly has quit IRC | 22:05 | |
*** vishwanathj has joined #openstack-nova | 22:10 | |
*** vishwanathj has quit IRC | 22:15 | |
*** betherly has joined #openstack-nova | 22:23 | |
*** betherly has quit IRC | 22:30 | |
*** rcernin has joined #openstack-nova | 22:30 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Use the safe get_binding_profile https://review.opendev.org/669817 | 22:38 |
*** mriedem has quit IRC | 22:39 | |
*** tkajinam has joined #openstack-nova | 22:39 | |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Introduces the openstacksdk to nova https://review.opendev.org/643664 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Introduces SDK to IronicDriver and uses for node.get https://review.opendev.org/642899 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Use SDK for node.list https://review.opendev.org/656027 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Use SDK for validating instance and node https://review.opendev.org/656028 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Use SDK for setting instance id https://review.opendev.org/659690 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Use SDK for add/remove instance info from node https://review.opendev.org/659691 | 22:50 |
openstackgerrit | Dustin Cowles proposed openstack/nova master: Use SDK for getting network metadata from node https://review.opendev.org/670213 | 22:50 |
*** ivve has quit IRC | 22:57 | |
*** mlavalle has quit IRC | 23:00 | |
*** artom has joined #openstack-nova | 23:05 | |
*** vishwanathj has joined #openstack-nova | 23:20 | |
*** threestrands has joined #openstack-nova | 23:42 | |
*** TxGirlGeek has joined #openstack-nova | 23:54 | |
*** TxGirlGeek has quit IRC | 23:56 | |
*** jaypipes has quit IRC | 23:56 |