*** sapd1_x has quit IRC | 00:07 | |
*** spatel has joined #openstack-nova | 00:08 | |
*** slaweq has joined #openstack-nova | 00:11 | |
*** slaweq has quit IRC | 00:15 | |
*** lbragstad has quit IRC | 00:23 | |
*** hamzy has joined #openstack-nova | 00:27 | |
*** brinzhang has joined #openstack-nova | 00:32 | |
*** spatel has quit IRC | 00:34 | |
*** bhagyashris has joined #openstack-nova | 00:50 | |
*** takashin has joined #openstack-nova | 00:51 | |
*** brinzhang has quit IRC | 00:55 | |
*** bhagyashris has quit IRC | 00:59 | |
*** _alastor_ has quit IRC | 01:12 | |
*** brinzhang has joined #openstack-nova | 01:18 | |
*** spatel has joined #openstack-nova | 01:30 | |
*** spatel has quit IRC | 01:30 | |
openstackgerrit | Merged openstack/nova master: hacking: Resolve W503 (line break occurred before a binary operator) https://review.opendev.org/651555 | 01:31 |
openstackgerrit | Merged openstack/nova master: hacking: Resolve E741 (ambiguous variable name) https://review.opendev.org/652103 | 01:31 |
*** mgoddard has quit IRC | 01:40 | |
*** mgoddard has joined #openstack-nova | 01:48 | |
*** yedongcan has joined #openstack-nova | 01:53 | |
openstackgerrit | Yongli He proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | 01:54 |
openstackgerrit | Yongli He proposed openstack/nova master: Clean up orphan instances virt driver https://review.opendev.org/648912 | 01:57 |
openstackgerrit | Yongli He proposed openstack/nova master: clean up orphan instances https://review.opendev.org/627765 | 01:57 |
*** igordc has quit IRC | 01:58 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:05 | |
*** slaweq has joined #openstack-nova | 02:11 | |
*** slaweq has quit IRC | 02:16 | |
openstackgerrit | Bhagyashri Shewale proposed openstack/nova master: Ignore root_gb for BFV in simple tenant usage API https://review.opendev.org/612626 | 02:27 |
*** bhagyashris has joined #openstack-nova | 02:33 | |
*** hongbin has joined #openstack-nova | 02:44 | |
openstackgerrit | Alex Xu proposed openstack/nova master: Correct the comment of RequestSpec's network_metadata https://review.opendev.org/667061 | 02:44 |
*** cfriesen has quit IRC | 02:52 | |
*** ricolin has joined #openstack-nova | 03:01 | |
*** takashin has left #openstack-nova | 03:02 | |
*** whoami-rajat has joined #openstack-nova | 03:04 | |
*** hongbin has quit IRC | 03:18 | |
*** mkrai__ has joined #openstack-nova | 03:30 | |
*** Dinesh_Bhor has quit IRC | 03:33 | |
*** psachin has joined #openstack-nova | 03:35 | |
*** udesale has joined #openstack-nova | 03:51 | |
*** hongbin has joined #openstack-nova | 03:54 | |
*** slaweq has joined #openstack-nova | 04:11 | |
*** hongbin has quit IRC | 04:12 | |
*** slaweq has quit IRC | 04:16 | |
*** jhesketh has quit IRC | 04:19 | |
*** jhesketh has joined #openstack-nova | 04:19 | |
*** dave-mccowan has quit IRC | 04:23 | |
*** mkrai__ has quit IRC | 04:27 | |
*** mkrai__ has joined #openstack-nova | 04:29 | |
*** shilpasd has joined #openstack-nova | 04:29 | |
*** ratailor has joined #openstack-nova | 04:58 | |
openstackgerrit | Yongli He proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | 05:22 |
*** konetzed has quit IRC | 05:28 | |
*** ivve has joined #openstack-nova | 05:36 | |
*** mmethot has quit IRC | 05:46 | |
*** mgoddard has quit IRC | 05:54 | |
*** shilpasd has quit IRC | 05:56 | |
*** Bidwe_jay65 has quit IRC | 05:56 | |
*** lpetrut has joined #openstack-nova | 05:56 | |
*** lpetrut has quit IRC | 05:57 | |
*** lpetrut has joined #openstack-nova | 05:57 | |
*** tetsuro has joined #openstack-nova | 05:59 | |
*** mgoddard has joined #openstack-nova | 06:00 | |
*** dpawlik has joined #openstack-nova | 06:05 | |
*** slaweq has joined #openstack-nova | 06:11 | |
*** ratailor has quit IRC | 06:14 | |
*** slaweq has quit IRC | 06:16 | |
*** artom has joined #openstack-nova | 06:16 | |
*** artom is now known as artom|gmtplus3 | 06:16 | |
*** ratailor has joined #openstack-nova | 06:21 | |
*** pcaruana has joined #openstack-nova | 06:26 | |
*** belmoreira has joined #openstack-nova | 06:33 | |
*** rdopiera has joined #openstack-nova | 06:35 | |
*** mkrai__ has quit IRC | 06:39 | |
*** mkrai_ has joined #openstack-nova | 06:39 | |
openstackgerrit | Merged openstack/nova master: Remove comments about mirroring changes to nova/cells/messaging.py https://review.opendev.org/667107 | 06:43 |
openstackgerrit | Merged openstack/nova master: Drop source node allocations if finish_resize fails https://review.opendev.org/654067 | 06:43 |
*** slaweq has joined #openstack-nova | 06:44 | |
*** belmoreira has quit IRC | 06:45 | |
*** luksky has joined #openstack-nova | 06:45 | |
*** belmoreira has joined #openstack-nova | 06:47 | |
*** maciejjozefczyk has joined #openstack-nova | 06:50 | |
*** rdopiera has quit IRC | 06:52 | |
*** rdopiera has joined #openstack-nova | 06:52 | |
openstackgerrit | Brin Zhang proposed openstack/python-novaclient master: Microversion 2.74: Support Specifying AZ to unshelve https://review.opendev.org/665136 | 06:52 |
*** bhagyashris has quit IRC | 07:01 | |
*** rcernin has quit IRC | 07:06 | |
*** belmoreira has quit IRC | 07:13 | |
*** belmoreira has joined #openstack-nova | 07:14 | |
*** tesseract has joined #openstack-nova | 07:16 | |
brinzhang | efried: Are you around? | 07:29 |
*** belmoreira has quit IRC | 07:34 | |
*** ccamacho has joined #openstack-nova | 07:37 | |
*** tetsuro has quit IRC | 07:40 | |
*** rajinir has quit IRC | 07:45 | |
*** damien_r has joined #openstack-nova | 07:46 | |
*** belmoreira has joined #openstack-nova | 07:48 | |
*** ttsiouts has joined #openstack-nova | 07:48 | |
*** ralonsoh has joined #openstack-nova | 07:49 | |
*** psachin has quit IRC | 07:55 | |
*** tetsuro has joined #openstack-nova | 07:58 | |
*** tssurya has joined #openstack-nova | 08:00 | |
*** ttsiouts has quit IRC | 08:00 | |
*** ttsiouts has joined #openstack-nova | 08:01 | |
*** ttsiouts has quit IRC | 08:05 | |
*** ttsiouts has joined #openstack-nova | 08:06 | |
*** tkajinam has quit IRC | 08:16 | |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: pull out functions from _heal_allocations_for_instance https://review.opendev.org/655457 | 08:25 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: reorder conditions in _heal_allocations_for_instance https://review.opendev.org/655458 | 08:25 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: Prepare _heal_allocations_for_instance for nested allocations https://review.opendev.org/637954 | 08:25 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: pull out put_allocation call from _heal_* https://review.opendev.org/655459 | 08:25 |
openstackgerrit | Balazs Gibizer proposed openstack/nova master: nova-manage: heal port allocations https://review.opendev.org/637955 | 08:25 |
*** tetsuro has quit IRC | 08:28 | |
*** Luzi has joined #openstack-nova | 08:31 | |
*** dtantsur|afk is now known as dtantsur|mtg | 08:37 | |
*** tesseract has quit IRC | 08:38 | |
*** tesseract has joined #openstack-nova | 08:40 | |
*** rpittau|afk is now known as rpittau|mtg | 08:41 | |
*** imacdonn has quit IRC | 08:42 | |
*** imacdonn has joined #openstack-nova | 08:43 | |
*** rcernin has joined #openstack-nova | 08:46 | |
*** ociuhandu has joined #openstack-nova | 08:47 | |
*** mdbooth has quit IRC | 09:02 | |
openstackgerrit | Surya Seetharaman proposed openstack/nova master: Grab fresh power state info from the driver https://review.opendev.org/665975 | 09:04 |
*** ociuhandu has quit IRC | 09:04 | |
*** rcernin has quit IRC | 09:07 | |
*** jaosorior has quit IRC | 09:22 | |
*** jaosorior has joined #openstack-nova | 09:24 | |
openstackgerrit | Boxiang Zhu proposed openstack/nova master: Update AZ admin doc to mention the new way to specify hosts https://review.opendev.org/666767 | 09:29 |
*** mdbooth has joined #openstack-nova | 09:32 | |
kashyap | Does anyone know of an existing bug in the Gate, where the "tempest-slow-py3" job is failing with: | 09:32 |
kashyap | tempest.exceptions.BuildErrorException: Server 008c5c50-ff54-49f4-adb0-23775e8af5f1 failed to build and is in ERROR status | 09:32 |
kashyap | Details: {'code': 500, 'created': '2019-06-25T20:55:49Z', 'message': 'Unexpected vif_type=binding_failed'} | 09:32 |
kashyap | http://logs.openstack.org/89/667389/1/check/tempest-slow-py3/2606bcc/testr_results.html.gz | 09:32 |
kashyap | [That is a stable/stein backport] | 09:32 |
kashyap | Okay, I see timeouts (also in the stable/rocky backport) in the 'testr_results'. /me goes to 'recheck' | 09:37 |
*** psachin has joined #openstack-nova | 09:39 | |
*** xek has joined #openstack-nova | 09:55 | |
*** ratailor_ has joined #openstack-nova | 10:01 | |
*** ratailor has quit IRC | 10:04 | |
*** ociuhandu has joined #openstack-nova | 10:06 | |
*** artom|gmtplus3 has quit IRC | 10:06 | |
*** ttsiouts has quit IRC | 10:10 | |
*** ttsiouts has joined #openstack-nova | 10:10 | |
*** artom has joined #openstack-nova | 10:10 | |
*** liuyulong has joined #openstack-nova | 10:14 | |
*** ttsiouts has quit IRC | 10:15 | |
*** brinzhang has quit IRC | 10:18 | |
*** luksky has quit IRC | 10:28 | |
*** mkrai_ has quit IRC | 10:29 | |
*** mkrai_ has joined #openstack-nova | 10:29 | |
*** mkrai_ has quit IRC | 10:31 | |
*** mkrai__ has joined #openstack-nova | 10:31 | |
*** dpawlik has quit IRC | 10:37 | |
*** dpawlik has joined #openstack-nova | 10:38 | |
*** davidsha has joined #openstack-nova | 10:40 | |
*** brinzhang has joined #openstack-nova | 10:41 | |
*** dpawlik has quit IRC | 10:42 | |
*** sapd1_x has joined #openstack-nova | 10:42 | |
*** dpawlik has joined #openstack-nova | 10:45 | |
*** liuyulong has quit IRC | 10:47 | |
*** dpawlik has quit IRC | 10:50 | |
*** dpawlik has joined #openstack-nova | 10:53 | |
*** Bidwe_jay has joined #openstack-nova | 10:57 | |
*** mkrai__ has quit IRC | 10:58 | |
*** mkrai_ has joined #openstack-nova | 10:58 | |
*** luksky has joined #openstack-nova | 10:58 | |
*** dpawlik has quit IRC | 11:00 | |
*** dpawlik has joined #openstack-nova | 11:01 | |
*** sapd1_x has quit IRC | 11:02 | |
NewBruce | Hey kashyap | 11:06 |
NewBruce | so good news, i didn't try to mess around with xml, instead just used SELinux ;) but not sure if you can help out on this one - migrations are failing with | 11:07 |
NewBruce | Live Migration failure: Library function returned error but did not set virError: libvirtError: Library function returned error but did not set vir | 11:07 |
NewBruce | digging into the libvirt logs - | 11:08 |
NewBruce | 2019-06-26 09:46:22.816+0000: 30621: error : virNetClientStreamRaiseError:200 : stream had I/O failure | 11:08 |
NewBruce | 2019-06-26 09:46:23.190+0000: 19228: error : virNetClientProgramDispatchError:177 : internal error: qemu unexpectedly closed the monitor: 2019-06-26T09:46:22.815029Z qemu-kvm: Failed to load PCIDevice:config | 11:08 |
NewBruce | 2019-06-26T09:46:22.815033Z qemu-kvm: Failed to load virtio-net:virtio | 11:08 |
NewBruce | 2019-06-26T09:46:22.815036Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:03.0/virtio-net' | 11:08 |
NewBruce | thing is, from the control side of things, the migration completed - no errors or anything are returned… also, i could migrate fine between these hosts before i added SELinux, and it (rarely) works to migrate a machine… i'm lost as to whether it's a libvirt or nova issue at this point - thoughts? | 11:09 |
NewBruce | the osc reports life is peachy : openstack server migrate --live cc-compute28-sto2 aadfe56a-88b8-49c0-9dac-41a4c494c1b5 --wait | 11:10 |
NewBruce | Progress: 97Complete | 11:10 |
NewBruce | but nova never gets to post-migration, and i don't think it's actually doing the migration itself - on the source | 11:11 |
NewBruce | Took 2.35 seconds for pre_live_migration on destination host cc-compute26-sto2. | 11:11 |
NewBruce | Migration running for 0 secs, memory 100% remaining; (bytes processed=0, remaining=0, total=0) | 11:11 |
NewBruce | Migration operation has aborted | 11:11 |
*** ttsiouts has joined #openstack-nova | 11:13 | |
*** ociuhandu has quit IRC | 11:23 | |
*** ociuhandu has joined #openstack-nova | 11:23 | |
*** tbachman has joined #openstack-nova | 11:27 | |
*** mkrai_ has quit IRC | 11:31 | |
*** shilpasd has joined #openstack-nova | 11:31 | |
*** shilpasd10 has joined #openstack-nova | 11:31 | |
*** shilpasd10 has quit IRC | 11:32 | |
*** shilpasd63 has joined #openstack-nova | 11:33 | |
*** sean-k-mooney has joined #openstack-nova | 11:43 | |
*** ratailor_ has quit IRC | 11:48 | |
*** ivve has quit IRC | 11:51 | |
*** udesale has quit IRC | 11:51 | |
*** udesale has joined #openstack-nova | 11:52 | |
*** eharney has quit IRC | 11:55 | |
*** _erlon_ has joined #openstack-nova | 11:59 | |
*** _alastor_ has joined #openstack-nova | 12:00 | |
*** luksky has quit IRC | 12:02 | |
*** luksky has joined #openstack-nova | 12:03 | |
*** francoisp has joined #openstack-nova | 12:04 | |
*** _alastor_ has quit IRC | 12:04 | |
*** mdbooth has quit IRC | 12:11 | |
*** ttsiouts has quit IRC | 12:13 | |
*** ttsiouts has joined #openstack-nova | 12:13 | |
*** lbragstad has joined #openstack-nova | 12:17 | |
*** ttsiouts has quit IRC | 12:18 | |
*** ttsiouts has joined #openstack-nova | 12:20 | |
*** mdbooth has joined #openstack-nova | 12:21 | |
*** martinkennelly has joined #openstack-nova | 12:23 | |
*** mriedem has joined #openstack-nova | 12:25 | |
mriedem | lyarwood: is your plan for https://bugs.launchpad.net/nova/+bug/1832248 to get https://review.opendev.org/#/c/664418/ released and then bump the lower constraint dependency from nova to os-brick and then consider the nova bug fixed? | 12:27 |
openstack | Launchpad bug 1832248 in OpenStack Compute (nova) "tempest.api.volume.test_volumes_extend.VolumesExtendAttachedTest.test_extend_attached_volume failing when using the Q35 machine type" [Undecided,New] | 12:27 |
*** shilpasd63 has quit IRC | 12:30 | |
*** shilpasd80 has joined #openstack-nova | 12:31 | |
alex_xu | mriedem: hope we answered all your questions https://review.opendev.org/601596, looking for one more +2 :) | 12:32 |
alex_xu | johnthetubaguy: ^ hope you around, the vpmem spec is in good shape | 12:33 |
lyarwood | mriedem: no the nova bug is seperate, the os-brick change just works around the underlying QEMU issue Nova is hitting with q35 | 12:34 |
*** Luzi has quit IRC | 12:35 | |
mriedem | lyarwood: oh ok | 12:36 |
mriedem | alex_xu: ack, i still need to read all of the replies... | 12:36 |
alex_xu | mriedem: hah, i see, a lot | 12:37 |
*** brinzhang has quit IRC | 12:38 | |
*** brinzhang has joined #openstack-nova | 12:39 | |
alex_xu | mriedem: also, there is the code for reference https://review.opendev.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/virtual-persistent-memory; although it is in merge conflict, it is still good to see what we probably are going to change in the code | 12:39 |
*** dpawlik has quit IRC | 12:40 | |
sean-k-mooney | mriedem: can you take a look at https://review.opendev.org/#/c/667264/ its a osc change for force down. you sent a mail to the list about droping computenode host/service id compat code and im wondering if that is related or not | 12:42 |
mriedem | i think my biggest hangups were on the (1) flavor extra spec definition which was a bit hard to parse from a user perspective in my opinion and (2) the questions about the new data model and versioned object which were very similar to a BDM but i realize we don't want to re-use BDMs for this | 12:43 |
mriedem | sean-k-mooney: different issue | 12:44 |
mriedem | sean-k-mooney: before 2.53 you had to call a force-down route, with 2.53 you just call the normal PUT route | 12:44 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add mising tests for flavor extra_specs mv 2.61 https://review.opendev.org/667600 | 12:44 |
mriedem | https://developer.openstack.org/api-ref/compute/#update-forced-down for <2.53 | 12:44 |
sean-k-mooney | right i saw that | 12:44 |
mriedem | https://developer.openstack.org/api-ref/compute/#update-compute-service >=2.53 | 12:44 |
sean-k-mooney | what i was concerned about is the new form uses service id | 12:45 |
mriedem | with 2.53 the service_id in the API is a uuid | 12:45 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add missing tests for flavor extra_specs mv 2.61 https://review.opendev.org/667600 | 12:45 |
sean-k-mooney | the old used host and binary name | 12:45 |
mriedem | service_id is the uuid of the service, it's fine | 12:45 |
sean-k-mooney | and i was not clear what you were proposing dropping in the mail | 12:45 |
mriedem | it's unrelated to the relationship between compute nodes and services | 12:45 |
sean-k-mooney | ok | 12:45 |
mriedem | see all of the notes/todos around ComputeNode.service_id in the code | 12:45 |
mriedem | and ComputeNode.host | 12:46 |
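The two request shapes mriedem contrasts (the pre-2.53 force-down route vs the 2.53+ plain update route, per the api-ref links above) can be sketched as a small helper. The helper itself is only an illustration; the routes, header, and body keys follow the linked compute API docs:

```python
def force_down_request(microversion, host=None, binary=None,
                       service_uuid=None, down=True):
    """Return (method, path, headers, body) for forcing a service down.

    Before 2.53 the dedicated force-down route identifies the service
    by host + binary; from 2.53 on, the normal service update route
    takes the service uuid and a simple body.
    """
    mv = tuple(int(p) for p in microversion.split("."))
    headers = {"X-OpenStack-Nova-API-Version": microversion}
    if mv < (2, 53):
        return ("PUT", "/os-services/force-down", headers,
                {"host": host, "binary": binary, "forced_down": down})
    return ("PUT", "/os-services/%s" % service_uuid, headers,
            {"forced_down": down})
```

Note the microversion is compared as a (major, minor) tuple, since a plain string compare would mis-order e.g. "2.9" and "2.53".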
lyarwood | mriedem: https://review.opendev.org/#/c/457886/ - btw, would you mind taking a look at this if you have time this week. | 12:46 |
sean-k-mooney | mriedem: ok im reading them now thanks | 12:46 |
mriedem | lyarwood: sure | 12:47 |
lyarwood | thanks | 12:47 |
*** dave-mccowan has joined #openstack-nova | 12:52 | |
*** dpawlik has joined #openstack-nova | 12:55 | |
*** mmethot has joined #openstack-nova | 12:57 | |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Add missing tests for flavor extra_specs mv 2.61 https://review.opendev.org/667600 | 13:00 |
*** xek_ has joined #openstack-nova | 13:05 | |
*** xek has quit IRC | 13:07 | |
bauzas | mriedem: FWIW, I need to reload a shit ton of context from Kilo before replying to you but I saw your email | 13:08 |
bauzas | mriedem: because I wonder if we need a major version bump for the ComputeNode object | 13:09 |
*** mmethot is now known as mmethot|brb | 13:10 | |
mriedem | bauzas: i wondered about that as well but figured it wasn't required | 13:15 |
openstackgerrit | Martin Midolesov proposed openstack/nova master: Implementing graceful shutdown. https://review.opendev.org/666245 | 13:15 |
*** eharney has joined #openstack-nova | 13:15 | |
mriedem | i think we've only ever bumped the major version on an object and that's when dansmith did Instance v2.0 | 13:16 |
mriedem | i don't remember the details of how complicated it was but i'm pretty sure i'd screw it up if i tried to do it myself | 13:16 |
*** rajinir has joined #openstack-nova | 13:18 | |
bauzas | mriedem: yeah I need to remember why I was thinking about that by Kilo time | 13:24 |
mdbooth | stephenfin or sean-k-mooney: https://review.opendev.org/#/c/663382/4/nova/compute/manager.py Not my area of expertise, but would the prior call to _deallocate_network not mean that neutron would no longer return this stuff? | 13:28 |
mriedem | sean-k-mooney: i've replied on https://review.opendev.org/#/c/667264/2 with what i think they should do in the 2.53 case, | 13:28 |
mriedem | whether or not novaclient has all of the plumbing they need i haven't checked | 13:28 |
*** eharney has quit IRC | 13:29 | |
sean-k-mooney | mriedem: thanks osc is not what i normally review but since they asked me to take a look i said i would review | 13:29 |
*** mmethot|brb is now known as mmethot | 13:29 | |
mriedem | sean-k-mooney: mdbooth: also commented in https://review.opendev.org/#/c/663382/4 | 13:31 |
*** spatel has joined #openstack-nova | 13:31 | |
sean-k-mooney | maybe, i'm looking. we could probably use try_deallocate_networks there too | 13:31 |
mdbooth | mriedem: Ooh, I'd forgotten that gem. | 13:31 |
mriedem | mdbooth: what? force_refresh? | 13:33 |
dansmith | mriedem: correct, and yes, it's complicated | 13:33 |
mriedem | you don't want to use that in this case | 13:33 |
mriedem | mdbooth: because force_refresh only goes back to stein and i'm guessing you want to backport this further than that | 13:33 |
mdbooth | mriedem: Ack. | 13:33 |
*** shilpasd80 has quit IRC | 13:34 | |
*** spatel has quit IRC | 13:36 | |
kashyap | Anyone else seeing stable/stein failures with the 'tempest-slow-py3' job? | 13:43 |
kashyap | http://logs.openstack.org/89/667389/1/check/tempest-slow-py3/2606bcc/testr_results.html.gz | 13:44 |
mriedem | kashyap: yes known issue | 13:44 |
mriedem | https://review.opendev.org/#/c/667216 | 13:44 |
kashyap | Ah, thanks. I didn't want to mindlessly do 'recheck' | 13:45 |
*** yedongcan has quit IRC | 13:45 | |
sean-k-mooney | mdbooth: deallocate_network deletes the neutron ports that were auto-allocated by nova, so yes we probably should move that to the end of the function since it clears the network info cache https://opendev.org/openstack/nova/src/branch/master/nova/network/neutronv2/api.py#L1603-L1604 | 13:46 |
*** mloza has quit IRC | 13:47 | |
*** eharney has joined #openstack-nova | 13:48 | |
*** eharney has quit IRC | 13:48 | |
*** eharney has joined #openstack-nova | 13:49 | |
*** mlavalle has joined #openstack-nova | 13:54 | |
*** belmoreira has quit IRC | 13:55 | |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Add a rbd_connect_timeout configurable https://review.opendev.org/667421 | 13:56 |
mriedem | sean-k-mooney: i left some more comments/questions in that one and added some vmware and zvm driver devs | 14:00 |
sean-k-mooney | mriedem: it looks like https://review.opendev.org/#/c/660761/8 is trying to fix the same or a similar bug | 14:01 |
*** belmoreira has joined #openstack-nova | 14:02 | |
*** brinzhang has quit IRC | 14:03 | |
sean-k-mooney | mriedem: if we delete while building there is a second race which causes us to not clean up the vif | 14:03 |
*** brinzhang has joined #openstack-nova | 14:03 | |
mriedem | that's amorin's fix yes | 14:04 |
mriedem | which is different from stephenfin's which is handling a failure while building | 14:04 |
sean-k-mooney | e.g. if the vm has spawned but we get the delete before we update the instance state in the db, we raise an exception, which is what causes us to not clean them up | 14:04 |
mriedem | and amorin was just in here the other day saying he had a similar issue there | 14:04 |
sean-k-mooney | mriedem: no, stephen's issue was a failure caused when you delete while building | 14:05 |
*** jistr is now known as jistr|call | 14:05 | |
sean-k-mooney | specifically for the customer it was caused because one of the instances in their heat stack failed to build, and that caused all of the instances to be deleted | 14:05 |
sean-k-mooney | mriedem: i think amorin's bug is a duplicate of stephen's, but i'm not sure it would fix it in all cases, as in stephen's edge case we never call destroy | 14:07 |
sean-k-mooney | well maybe they are both bugs, i did not fully review their bug in detail | 14:08 |
mriedem | as i said amorin said he still has an issue which stephenfin's patch might resolve | 14:12 |
mriedem | amorin said he was going to try and recreate and use stephen's patch to test it | 14:12 |
sean-k-mooney | ya, i think on reading their bug both would be needed | 14:12 |
sean-k-mooney | mriedem: amorin is fixing the fact that we might be using an outdated network_info object from the instance, and stephenfin is fixing that if we fail due to the db update we never even try to clean up the vifs | 14:13 |
sean-k-mooney | so to fix the downstream bug we will need to backport both. | 14:14 |
sean-k-mooney | ok, this makes more sense to me now. | 14:14 |
*** _alastor_ has joined #openstack-nova | 14:15 | |
amorin | hey all | 14:20 |
amorin | the bug I faced 2 days ago was not fixed by stephenfin patch | 14:22 |
*** jistr|call is now known as jistr | 14:22 | |
amorin | I found that it was something else in our code | 14:22 |
amorin | cc mriedem sean-k-mooney | 14:23 |
*** _erlon_ has quit IRC | 14:23 | |
mriedem | mnaser: i think you just hit something like this nw info cache lost thing, so you might have input here http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007363.html | 14:23 |
amorin | by the way, I faced an other one, related to the patch I did: | 14:23 |
amorin | https://review.opendev.org/#/c/667294/ | 14:23 |
mriedem | maciejjozefczyk: sean-k-mooney: ^ | 14:23 |
mriedem | amorin: one step forward, two steps back :( | 14:24 |
amorin | yup | 14:25 |
mriedem | i remember a similar check was added here https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/network/base_api.py#L35 | 14:25 |
amorin | exact | 14:25 |
sean-k-mooney | ok so we might need all 3 patches | 14:26 |
sean-k-mooney | amorin: stephenfin's patch is a generalised fix to a very specific edge case | 14:27 |
sean-k-mooney | amorin: what you originally tried to fix was more subtle as we were passing stale data in some cases | 14:28 |
maciejjozefczyk | ehh, instance_info_cache :) | 14:29 |
openstackgerrit | Martin Midolesov proposed openstack/nova master: Implementing graceful shutdown. https://review.opendev.org/666245 | 14:29 |
sean-k-mooney | maciejjozefczyk: yep, it's awesome... | 14:30 |
sean-k-mooney | mriedem: out of interest why do we store the instance info cache in the db? | 14:30 |
sean-k-mooney | i feel like we would have fewer bugs related to it if we actually just made it an in-process dict cache | 14:31 |
mriedem | sean-k-mooney: i'll direct your question to the people that worked on nova back in 2011 or something | 14:31 |
sean-k-mooney | well my next question was going to be "i assume this is because of nova networks legacy choices" | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove no longer required "inner" methods. https://review.opendev.org/655282 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Privsepify ipv4 forwarding enablement. https://review.opendev.org/635431 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Remove unused FP device creation and deletion methods. https://review.opendev.org/635433 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Privsep the ebtables modification code. https://review.opendev.org/635435 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move adding vlans to interfaces to privsep. https://review.opendev.org/635436 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move iptables rule fetching and setting to privsep. https://review.opendev.org/636508 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move dnsmasq restarts to privsep. https://review.opendev.org/639280 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move router advertisement daemon restarts to privsep. https://review.opendev.org/639281 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move calls to ovs-vsctl to privsep. https://review.opendev.org/639282 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move setting of device trust to privsep. https://review.opendev.org/639283 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Move final bridge commands to privsep. https://review.opendev.org/639580 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Cleanup the _execute shim in nova/network. https://review.opendev.org/639581 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: We no longer need rootwrap. https://review.opendev.org/554438 | 14:32 |
openstackgerrit | Stephen Finucane proposed openstack/nova master: Cleanup no longer required filters and add a release note. https://review.opendev.org/639826 | 14:32 |
mriedem | sean-k-mooney: idk, you'd have to do some digging to find out when the network info cache was introduced, i don't know if it was before quantum or not | 14:33 |
mriedem | but we also store bdms in the db which are essentially the same thing - a cache of volume information for the server | 14:33 |
mriedem | which was probably before cinder existed | 14:33 |
sean-k-mooney | im seeing a pattern there | 14:33 |
sean-k-mooney | ok, well, let's fix the current issue first, but i think i might look into whether we could stop storing it in the db | 14:34 |
amorin | I would love that | 14:34 |
amorin | :p | 14:34 |
sean-k-mooney | caching in memory in the compute agent would likely be enough | 14:34 |
sean-k-mooney | we would have to rebuild it every time the compute agent restarts but i think that is fine | 14:35 |
sean-k-mooney | actually we could use memcache to cache it too, which would mean all the services would have access to it. anyway, it's now on my todo list | 14:37 |
sean-k-mooney | messing up the neutron policy and corrupting the network info cache is what caused our ci cloud production outage at the weekend | 14:38 |
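The in-process dict cache sean-k-mooney is proposing could look roughly like the sketch below. The class name, the `refresh_cb` rebuild hook, and the shape of `network_info` are all hypothetical; nova's real cache is persisted in the instance_info_caches table, which is exactly what this sketch would replace:

```python
import threading


class NetworkInfoCache:
    """Hypothetical in-process cache of instance network_info,
    keyed by instance uuid, living only in the compute agent."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cache = {}

    def get(self, instance_uuid, refresh_cb=None):
        with self._lock:
            info = self._cache.get(instance_uuid)
        if info is None and refresh_cb is not None:
            # rebuild from neutron, e.g. after a compute agent restart
            info = refresh_cb(instance_uuid)
            self.put(instance_uuid, info)
        return info

    def put(self, instance_uuid, network_info):
        with self._lock:
            self._cache[instance_uuid] = network_info

    def drop(self, instance_uuid):
        # called on instance delete / network deallocation
        with self._lock:
            self._cache.pop(instance_uuid, None)
```

As noted in the channel, the cost is that the cache must be rebuilt on restart (or shared via something like memcache if other services need it).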
*** aarents has joined #openstack-nova | 14:38 | |
*** luksky has quit IRC | 14:43 | |
mriedem | TheJulia: is this a known busted job? http://logs.openstack.org/17/667417/1/check/ironic-tempest-ipa-wholedisk-bios-agent_ipmitool-tinyipa/db33ba3/controller/logs/devstacklog.txt.gz#_2019-06-26_05_47_14_168 | 14:45 |
mriedem | sean-k-mooney: redoing how the nw info cache works is hopefully wayyyyyyy down on your todo list | 14:46 |
shilpasd | efried: mriedem: can you tell me how to trigger live migration sync and async way, any CLI commands? | 14:47 |
mriedem | shilpasd: i don't know what you mean, sync and async way | 14:48 |
shilpasd | mriedem: i mean nova live-migration <instance_id> triggers a live migration, but is there another way to live migrate, any periodic call or something? | 14:48 |
mriedem | no nova doesn't auto-live migrate things for you | 14:49 |
shilpasd | mriedem: i am in the process of verifying all move operations against the NFS changes done in https://review.opendev.org/#/c/650188/ | 14:49 |
shilpasd | so want to take care of all move operations | 14:50 |
shilpasd | so just want to know about it | 14:50 |
mriedem | all move operations are user-initiated | 14:51 |
mriedem | as far as i know anyway | 14:51 |
shilpasd | ok, as of now verifying SHELVE + SHELVE with offload + UNSHELVE + REBUILD + RESIZE + RESIZE REVERT + EVACUATION + COLD MIGRATION + COLD MIGRATION REVERT + LIVE MIGRATION | 14:52 |
*** lpetrut has quit IRC | 14:52 | |
shilpasd | just list if i missed anything | 14:52 |
mriedem | by rebuild i assume you mean evacuate | 14:52 |
mriedem | rebuild (the server action in the api) isn't a move, | 14:52 |
mriedem | but evacuate is | 14:52 |
efried | brinzhang: I'm here now, what's up? | 14:52 |
mriedem | evacuate = rebuild on another host | 14:52 |
shilpasd | rebuild using another image | 14:52 |
mriedem | rebuild + a new image is not a move | 14:53 |
mriedem | it's rebuilding the server's root disk image on the same host | 14:53 |
bauzas | mriedem: not sure I understood your point in https://bugs.launchpad.net/nova/+bug/1793569/comments/5 | 14:53 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] - Assigned to Sylvain Bauza (sylvain-bauza) | 14:53 |
mriedem | also, shelve w/o offload and then unshelve is also not a move operation, | 14:53 |
bauzas | mriedem: do you want heal_allocations to support this or the "placement audit' rather ? | 14:53 |
mriedem | if the instance is shelved but not offloaded, and then the user unshelves, it's just unshelved on the same host | 14:53 |
shilpasd | mriedem: ok, noted | 14:53 |
shilpasd | mriedem: what about resize | 14:54 |
shilpasd | it's a move operation, right, since it can resize onto another host too | 14:55 |
mriedem | shilpasd: maybe :) | 14:55 |
mriedem | unless nova is configured with allow_resize_to_same_host and the scheduler picks the same host the instance is already one, | 14:55 |
mriedem | which is possible in a small edge site or if the server is in a strict affinity group and can't be moved | 14:55 |
mriedem | *already on | 14:56 |
shilpasd | got it | 14:56 |
mriedem | https://bugs.launchpad.net/nova/+bug/1790204 is all about that problem | 14:56 |
openstack | Launchpad bug 1790204 in OpenStack Compute (nova) "Allocations are "doubled up" on same host resize even though there is only 1 server on the host" [High,Triaged] | 14:56 |
mriedem | bauzas: i think i meant to say "nova-manage placement audit" there, | 14:58 |
mriedem | since heal_allocations doesn't report on things really, nor does it delete allocations, it only adds allocations for instances (not migrations) that are missing | 14:58 |
bauzas | mriedem: ack, will add this there then | 14:58 |
mriedem | i went on to continue talking about heal_allocations but idk, it's a blur | 15:00 |
shilpasd | mriedem: one more query, i have an NFS configuration, and while performing resize to another host, it goes to create an instance data file on the dest system via SSH | 15:00 |
shilpasd | refer code at https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L8861 | 15:00 |
shilpasd | mriedem: during the shared resource provider check, why is this check necessary? | 15:01 |
shilpasd | _is_storage_shared_with() | 15:02 |
mriedem | shilpasd: it may be ssh or rsync, it depends on config https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.remote_filesystem_transport | 15:03 |
mriedem | the default is ssh | 15:03 |
mriedem | i'm less familiar with this code, but for one we don't have shared storage provider support in the libvirt driver anyway, | 15:04 |
mriedem | but this is presumably one of the things we could replace if we had compute nodes modeled in a shared storage aggregate and we could avoid the "temp file create" tests and such for shared storage | 15:04 |
mriedem | as i'm sure lyarwood and mdbooth could probably attest, shared storage support in the libvirt driver can be very confusing because there are the instance files like console logs and such, and there is the image backend, and that can all be different and be a mix of shared storage and non-shared storage, e.g. the root disk images might be in the rbd image backend but the instance files, like console logs, could be on local disk and get ssh'ed/rsync'ed around | 15:05 |
mriedem | e.g. | 15:06 |
mriedem | https://docs.openstack.org/nova/latest/configuration/config.html#workarounds.ensure_libvirt_rbd_instance_dir_cleanup | 15:06 |
sean-k-mooney | mriedem: yes it is but if we keep getting bugs with it i might have to raise it. but ya not before m2, likely not before m3, if in train at all. | 15:07 |
sean-k-mooney | ^ network info cache rework | 15:07 |
mriedem | bauzas: i think what i was thinking of was an audit command could detect that you have orphaned allocations tied to a not-in-progress migration, e.g. a migration that failed but we failed to cleanup the allocations, | 15:07 |
mriedem | bauzas: and then that information could be provided to the admin to then determine what to do, e.g. delete the allocations for the migration record consumer and potentially the related instance, | 15:08 |
bauzas | mriedem: yeah ok | 15:08 |
mriedem | and if they delete the allocations for the instance, then they could run heal_allocations on the instance to fix things up | 15:08 |
mriedem | we could also eventually build on that to make it automatic with options | 15:08 |
mriedem | e.g. nova-manage placement audit --heal | 15:08 |
mriedem | something like that | 15:08 |
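[editor's note: the audit/heal flow mriedem sketches above can be illustrated in Python. This is a hypothetical sketch, not nova's actual code; `list_consumers`/`delete_allocations` stand in for placement API calls, and the real command would gather instance and in-progress-migration UUIDs from the cell databases.]

```python
# Hypothetical sketch of "nova-manage placement audit [--heal]":
# an allocation whose consumer UUID matches neither a non-deleted
# instance nor an in-progress migration is considered orphaned.

def classify_consumers(allocation_consumers, instance_uuids,
                       inprogress_migration_uuids):
    """Split allocation consumers into (healthy, orphaned) sets."""
    known = set(instance_uuids) | set(inprogress_migration_uuids)
    consumers = set(allocation_consumers)
    return consumers & known, consumers - known


def audit(placement, instance_uuids, migration_uuids, heal=False):
    """Report orphaned allocations; delete them when heal=True."""
    _healthy, orphans = classify_consumers(
        placement.list_consumers(), instance_uuids, migration_uuids)
    for consumer in sorted(orphans):
        print('orphaned allocations for consumer %s' % consumer)
        if heal:
            # the real API call would be DELETE /allocations/{consumer}
            placement.delete_allocations(consumer)
    return orphans
```

[an admin who deletes an instance's allocations this way could then run heal_allocations on that instance to rebuild them, as discussed above.]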
shilpasd | mriedem: thanks for discussing doubts, will go through the sharings and get back to you for any further | 15:09 |
mriedem | sean-k-mooney: redoing nova's nw info cache at this point in the game is going to be a big undertaking, and i would not be surprised if trying to use a global cache like memcache or etcd or something just generates more or different kinds of bugs than what we've already been patching lo these many years, as i'm sure dansmith can agree | 15:09 |
* mriedem feels the need to phone a friend | 15:10 | |
dansmith | oh mahgod | 15:10 |
efried | yonglihe: I'm going to fix your pep8 error on https://review.opendev.org/#/c/627765/ real quick, k? | 15:10 |
dansmith | why do we need a memcache? it's in the database | 15:10 |
efried | it's due to a new rule that recently merged. | 15:10 |
sean-k-mooney | dansmith: i was suggesting not keeping it in the database and only having a dict cache or maybe use memcache | 15:11 |
dansmith | sean-k-mooney: ...why? | 15:11 |
sean-k-mooney | mriedem: and ya it would be a blueprint or spec not a bug fix | 15:11 |
mriedem | sean-k-mooney: we can just as easily f that up | 15:11 |
sean-k-mooney | well if it's in process as a dict cache then if we f it up it's fixed by restarting the compute agent | 15:12 |
sean-k-mooney | memcache is probably not going to help with anything | 15:12 |
dansmith | we store some stuff in nwinfo that isn't anywhere else, IIRC, like which ports we created vs. the user, so that has to be persisted somewhere if we were going to use memcache | 15:12 |
dansmith | ...yeah ;) | 15:12 |
dansmith | what problem is being solved here? | 15:13 |
mriedem | i don't think that overhauling to use an external cache service and restarting the compute is the giant hammer we really need for what we're trying to solve | 15:13 |
sean-k-mooney | nothing at the moment, reworking it is unrelated to what we are trying to fix | 15:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: Clean up orphan instances virt driver https://review.opendev.org/648912 | 15:13 |
openstackgerrit | Eric Fried proposed openstack/nova master: clean up orphan instances https://review.opendev.org/627765 | 15:13 |
mriedem | so this is a....thought exercise? | 15:13 |
efried | sean-k-mooney, gibi: Would y'all please have another look at these --^ | 15:13 |
sean-k-mooney | yes | 15:13 |
sean-k-mooney | its on my todo list to figure out if it makes sense to even do | 15:14 |
gibi | efried: I have it open | 15:14 |
efried | thanks gibi | 15:15 |
efried | thanks sean-k-mooney | 15:15 |
efried | sean-k-mooney: fyi it's apparently a thing stx cares about | 15:15 |
efried | thus presumably it "makes sense" in some capacity :) | 15:15 |
mriedem | efried: hyperv ci is happy with the update_provider_tree patch https://review.opendev.org/#/c/667417/ | 15:17 |
efried | mriedem: thanks for the reminder | 15:17 |
mriedem | efried: fwiw that cleanup orphan instances thing is also something that the public cloud SIG (and huawei public cloud ops) care about as well, which is why i was initially reviewing it awhile back | 15:17 |
mriedem | the concern at the last ptg was how much duplication there was with the existing periodic to cleanup running deleted (but not orphaned) instances | 15:18 |
efried | okay, thanks for that background. | 15:20 |
mriedem | something something live migration fails and you've got untracked guests on the host consuming resources (which aren't tracked obviously) so then trying to schedule things to those hosts fails b/c you're out of resources | 15:21 |
efried | sounds like we need a patch to clean up those orphaned instances | 15:21 |
* mriedem shrugs | 15:22 | |
mriedem | i'm sure lots of operators have already just written scripts to detect and clean those types of things up | 15:22 |
mriedem | but yeah it's better to have it native probably | 15:22 |
efried | mriedem: We don't have a way to prove the xen one is being hit, do we? (update_provider_tree) | 15:42 |
efried | since their CI is dead? | 15:42 |
efried | mriedem: also, if you haven't already, there should be a note to the ML warning of this (and another before we remove the code path, obvsly) | 15:43 |
efried | ...for oot folk | 15:44 |
mriedem | sorry was just doing tech support with my mom | 15:44 |
efried | (I know nova_powervm is copacetic fwiw) | 15:44 |
mriedem | i was waiting to send the oot ML email until we were more sure about what i've proposed | 15:45 |
mriedem | and idk about the xen one if their CI is dead, though it's pretty damn basic | 15:45 |
mriedem | just a port of get_inventory | 15:45 |
bauzas | efried: mriedem: heh, the reportclient doesn't of course support all placement API queries, so I wonder whether I should add something like "get_resource_providers()" method in the reportclient just for nova-manage caller, or calling directly the Placement API | 15:46 |
bauzas | thoughts on that ? | 15:46 |
efried | bauzas: If it's something simple like GET /resource_providers (you really want all of them?) then yeah, just call SchedulerReportClient.get() | 15:47 |
bauzas | zactly | 15:47 |
efried | sfine | 15:47 |
bauzas | efried: but then I don't have a safe_connect connection | 15:48 |
mriedem | if you're not going to page, you could be listing 14K providers in the case of cern... | 15:48 |
efried | bauzas: We don't want @safe_connect | 15:48 |
efried | ever, anywhere | 15:48 |
efried | Handle ksa.ClientException at the caller instead. | 15:48 |
efried | And if you see @safe_connect anywhere in your travels and want to kill it and do that ^, I will buy your drinks. | 15:48 |
* mriedem notes that GET /resource_providers doesn't support paging | 15:48 | |
efried | true story | 15:48 |
bauzas | it's 40°C here, I'm all for a drink | 15:49 |
efried | bauzas: what are you trying to do with the master list? | 15:49 |
bauzas | efried: looking up all allocations to see whether they're orphaned | 15:49 |
bauzas | mriedem: ah shit, excellent point | 15:49 |
mriedem | you could instead page the compute nodes in the cells and hit this api https://developer.openstack.org/api-ref/placement/?expanded=#list-resource-provider-allocations | 15:50 |
bauzas | we could possibly need to look at all allocations per resource provider, which would be given by a list of compute services (which is paginated AFAIK) | 15:50 |
bauzas | heh, jinxed | 15:50 |
mriedem | compute service != compute node == resource provider | 15:50 |
bauzas | shit, typo, nodes indeed | 15:50 |
bauzas | tell me about my Kilo bp | 15:50 |
mriedem | so once you get the allocations for a given provider, what are you going to do? | 15:51 |
mriedem | check if an instance (or migration) exists with the given consumer uuid? | 15:51 |
mriedem | and if not, consider the allocation orphaned? | 15:51 |
mriedem | iff the allocation has resources that nova "owns" like VCPU | 15:52 |
mriedem | without consumer types in the allocations response we have to rely on the resource class | 15:52 |
bauzas | exactly this, I was about to say which resource classes were nova-related | 15:52 |
efried | ugh, relying on resource class... | 15:53 |
efried | this is where the concept of provider owner would be handy. | 15:54 |
bauzas | yeah I know | 15:54 |
efried | hopefully we're not allowing allocations from different owners against the same provider anywhere | 15:54 |
bauzas | we could also add an argument asking for the resource class we wanna check | 15:54 |
efried | no, we shouldn't do it by resource class | 15:54 |
efried | because same resource class may be managed by different owners in different providers | 15:55 |
efried | think VF (nova-PCI vs cyborg vs neutron) | 15:55 |
efried | but we (need to make sure we) have a rule that a provider as a whole is only managed by a single owner. | 15:55 |
bauzas | hmmm | 15:56 |
bauzas | actually, I'm checking consumer_id | 15:57 |
bauzas | so I guess all resource providers corresponding to compute nodes (and associated children) should have allocations against a consumer_id that *is* either a migration object or a nova instance | 15:57 |
bauzas | even cyborg, right? | 15:58 |
openstackgerrit | Nate Johnston proposed openstack/nova stable/stein: [DNM] Test change to check for port/instance project mismatch https://review.opendev.org/667663 | 15:58 |
bauzas | efried: ^? | 15:59 |
efried | bauzas: If what you're looking to do is clean up allocations against orphaned instances, I think it's legit to remove all the allocations associated with that consumer, even if they're on providers you don't own. That's symmetrical with what we do when we schedule (we claim all of those atomically from nova). | 16:00 |
efried | and | 16:00 |
efried | if there's an allocation against a compute node RP, you can legitimately assume it's in that category | 16:01 |
efried | but | 16:01 |
efried | that will break eventually if we ever have resourceless roots | 16:01 |
efried | because | 16:01 |
efried | you can *not* assume that all children of the compute node RP *also* belong to nova. | 16:01 |
bauzas | baby steps here :) | 16:01 |
efried | yeah, just leave a note/todo I guess. | 16:02 |
bauzas | at least if I can support nested rps, it would be cool | 16:02 |
bauzas | because eg. VGPU allocations are still made *against* a consumer which is an instance, yeepee | 16:02 |
bauzas | but, that would mean I would look at all resource providers, not only the ones Nova owns | 16:03 |
efried | yeah, it would be 1) compute node => 2) compute node RP => 3) allocations against that RP => 4) consumer for that allocation => 5) filter down to orphan consumers => 6) allocations for those consumers => 7) delete all of those | 16:03 |
efried | no | 16:03 |
bauzas | and here comes pagination... | 16:04 |
efried | with the limitation noted above (stops working for resourceless roots, which we're a long way off of), the above process will get you there. | 16:04 |
efried | Step 1 done by paginating from the nova API. | 16:04 |
bauzas | cool then | 16:04 |
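[editor's note: efried's seven steps above can be sketched as a single loop. This is illustrative only; `iter_compute_node_uuids`, `get_allocations_for_provider` and `delete_allocations` are hypothetical stand-ins for paginating compute nodes from the nova API and for SchedulerReportClient GET/DELETE calls, and the whole approach carries the caveats from the log: GET /resource_providers/{uuid}/allocations, no assumption that children of the compute node RP belong to nova, and it stops working for resourceless roots.]

```python
# Sketch of: 1) compute node -> 2) compute node RP -> 3) allocations
# against that RP -> 4) consumer per allocation -> 5) filter orphans
# -> 6) allocations for those consumers -> 7) delete them.

def purge_orphan_allocations(nova_api, placement, is_orphan):
    deleted = []
    # 1-2) page through compute nodes; a compute node's UUID is also
    #      its resource provider UUID in placement
    for node_uuid in nova_api.iter_compute_node_uuids():
        # 3) allocations keyed by consumer on this provider
        allocs = placement.get_allocations_for_provider(node_uuid)
        # 4-5) keep consumers with no matching instance/migration
        orphans = {c for c in allocs if is_orphan(c)}
        # 6-7) deleting by consumer removes that consumer's
        #      allocations on every provider it touches, mirroring
        #      the atomic claim made at schedule time
        for consumer in sorted(orphans):
            placement.delete_allocations(consumer)
            deleted.append(consumer)
    return deleted
```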
efried | this is in a nova-manage type utility? | 16:05 |
efried | So we don't care that it'll take FOREVER to run at cern? | 16:05 |
bauzas | a nova-manage placement audit thing | 16:05 |
efried | mm | 16:05 |
bauzas | so a cron job basically | 16:05 |
bauzas | marker and the likes | 16:05 |
efried | mm | 16:06 |
bauzas | zactly like heal_allocations | 16:06 |
efried | sure would be nice to find a way to make it more efficient then. | 16:06 |
mriedem | heal_allocations doesn't have a marker | 16:06 |
efried | but: make it, make it right, make it fast | 16:06 |
mriedem | it has a limit of things to process | 16:06 |
mriedem | nor does heal_allocations deal with nested allocations | 16:06 |
mriedem | the audit command could also take a --consumer option to just investigate what the operator thinks is a problem instance/migration | 16:08 |
mriedem | note that i added --instance to heal_allocations later for that reason | 16:09 |
bauzas | yup I saw | 16:09 |
mriedem | and --dry-run | 16:09 |
mriedem | depends on what the command will do though, if it's just reporting then you don't need a --dry-run | 16:09 |
bauzas | I was thinking of just telling the orphaned, but then later adding a --remove option | 16:11 |
bauzas | *later* | 16:11 |
bauzas | anyway, needs to go off and run by hot summer nights | 16:12 |
bauzas | I think I have everything I needed, thanks folks | 16:13 |
* mriedem assumes "hot summer nights" is a french adult store that bauzas frequents | 16:14 | |
dansmith | hoo boy | 16:14 |
mriedem | strictly adult cheese, wine and things of that nature | 16:15 |
melwitt | now, for another fun topic | 16:16 |
melwitt | mriedem, dansmith: I was reading these comments on an old [unmerged] patch: https://review.opendev.org/#/c/462521/12/releasenotes/notes/resize-auto-revert-6e1648828aba16b2.yaml@5, | 16:17 |
melwitt | and it made me think of the [recently merged] patch: https://review.opendev.org/633227 again and how it changed ERROR state to ACTIVE (or STOPPED) state. now I'm worried that wasn't an ok thing to do (API change?) | 16:17 |
melwitt | for a failed cold migration to self | 16:18 |
mriedem | not the same | 16:18 |
mriedem | in my change, | 16:18 |
mriedem | we failed in prep_resize before we actually did anything to the guest | 16:18 |
mriedem | in that case, putting the instance in ERROR status makes no sense imo | 16:18 |
mriedem | as i said, the only way you can get it out of error then is to do something like rebuild, hard reboot and/or reset status to ACTIVE, | 16:19 |
dansmith | and was yours also resetting to ACTIVE if it was actually shutoff? | 16:19 |
dansmith | I forget | 16:19 |
mriedem | and if i started a resize or cold migration of a STOPPED instance, then resetting it to ACTIVE isn't what i want, nor is rebuild or hard reboot really | 16:19 |
mriedem | dansmith: that was the point of my fix | 16:19 |
mriedem | to reset to STOPPED if it was STOPPED | 16:19 |
dansmith | mriedem: right | 16:19 |
mriedem | well, in part, | 16:19 |
mriedem | the main point was don't put it in ERROR status | 16:20 |
melwitt | ok, I think I see. this is ok because the instance is actually ok (other than cosmetic), whereas for the first example, the instance was not ok and was proposed to auto-correct to an ok/healthy state | 16:21 |
dansmith | the auto-revert actually moved stuff back, IIRC | 16:22 |
dansmith | not just correcting state, but actual revert | 16:22 |
melwitt | yeah it did | 16:22 |
melwitt | I was zooming in on the vm_state part of it, how it appears to an external script like in your example in the comment | 16:23 |
melwitt | and then I was thinking, is that a problem, if we imagine an external script executing a cold migrate and it fails and the instance stays ACTIVE so the script doesn't know it didn't work. that sort of thing | 16:24 |
melwitt | I was wondering about that after I read the comments on the old auto-revert patch | 16:25 |
dansmith | but the difference is, | 16:26 |
mriedem | the external thing should be waiting for task_state to be None to know the operation is done (or the instance action is finished/error, or the migration status is 'finished' or whatever in this case) | 16:26 |
dansmith | the merged patch corrected state before it changed from $orig to MIGRATING or whatever, right? | 16:26 |
mriedem | polling the vm_state in the API is not sufficient | 16:26 |
dansmith | the auto-revert one has it go into all the migrating states and then pop back | 16:26 |
dansmith | specifically, potentially pop back to ACTIVE and not have moved, IIRC | 16:27 |
melwitt | yes, I believe it did prevent an ERROR state that occurred before going to MIGRATING | 16:27 |
mriedem | i'm getting lost in the "it" references here when talking about separate changes | 16:28 |
melwitt | heh, sorry. the merged change | 16:28 |
mriedem | booth's change was, | 16:28 |
mriedem | resize/cold migrated failed somewhere and somehow, and the instance was set to ERROR status, right? | 16:28 |
mriedem | and if you tried doing a revertResize API call on that ERROR instance, it would do the revert resize flow to go back from the dest to the source host | 16:29 |
dansmith | no, it did a full revert I think | 16:29 |
mriedem | even though what we could have failed on was maybe something in prep_resize or resize_instance before the guest / volumes / networking ever actually *got* to the dest host | 16:29 |
dansmith | so we get to the dest host, fail, auto-revert back to source, and go back to ACTIVE | 16:30 |
dansmith | you wait for ACTIVE to mean "success" but really it failed and the instance hasn't resized or moved | 16:30 |
melwitt | yeah, I think it was a full revert on the booth change. i.e. do automatically what a user would have to do, initiate a revert | 16:30 |
mriedem | oh i see https://review.opendev.org/#/c/462521/12/nova/compute/manager.py@4449 | 16:30 |
dansmith | granted it's been 18 months since I last looked at this | 16:30 |
dansmith | it's really the opposite of what mriedem's change was doing, | 16:30 |
dansmith | which was keep it active if we don't start | 16:31 |
mriedem | or stopped rather than active... | 16:31 |
dansmith | well, and that's an important piece yeah | 16:31 |
mriedem | i.e. start resize with a stopped server, prep_resize fails, don't reset to active *because it's stopped* | 16:31 |
dansmith | right | 16:31 |
mriedem | eventually the power sync task would stop the instance i think but still | 16:31 |
dansmith | or restart it when it shouldn't, right? | 16:32 |
melwitt | yeah, makes sense | 16:32 |
dansmith | if vm_state is active, it was stopped, power state sync says "hmm, this should be running" | 16:32 |
mriedem | i don't think that task ever starts anything | 16:32 |
mriedem | even though people have asked for that in the past | 16:32 |
dansmith | no? I thought it would for things like post-host-failure recovery | 16:32 |
mriedem | i believe the reasoning was always, we don't want to turn things on by guessing and then bill the user | 16:32 |
dansmith | well, billing is unrelated to started or stopped, but okay :) | 16:33 |
dansmith | it's a complex enough not-really-a-state-machine that I'm sure I'm getting it wrong | 16:33 |
mriedem | depends on how you do your billing | 16:33 |
dansmith | regardless, ACTIVE but not running is about as bad | 16:33 |
mriedem | same - it's been a long time since i looked | 16:33 |
mriedem | anyway, i agree that if i'm doing a resize (and i'm sure tempest would do this), you're waiting for the instance to go to VERIFY_RESIZE with task_state=None, | 16:34 |
mriedem | if the instance goes back to ACTIVE with task_state=None, i'd wait indefinitely | 16:34 |
mriedem | unless i've got a timeout, | 16:34 |
dansmith | especially if you went into RESIZING in between | 16:34 |
mriedem | or also checking instance actions or migration status (which might be admin-only info) | 16:34 |
mriedem | i personally wouldn't try to track the task_state transitions since that's probably a losing game | 16:35 |
mriedem | i would just wait for terminal states but yeah | 16:35 |
dansmith | the thing is, ACTIVE is a terminal state for auto-confirm | 16:35 |
mriedem | true yeah | 16:35 |
dansmith | so if it went ACTIVE -> RESIZING -> ACTIVE, you should assume it actually resized and was auto-confirmed | 16:35 |
dansmith | but with auto-revert, | 16:35 |
mriedem | i know powervc set auto-confirm to 1 second | 16:35 |
dansmith | that breaks that behavior | 16:35 |
mriedem | lbragstad had to fix a few race bugs as a result :) | 16:36 |
dansmith | with auto-revert, ACTIVE->RESIZING->ACTIVE could mean "it worked" or "it didn't" | 16:36 |
mriedem | dansmith: yeah, and you wouldn't know unless you checked the migration or instance actions, which you as a non-admin might not have access to those details | 16:36 |
melwitt | yeah, I see | 16:36 |
dansmith | it turns waiting for a terminal state into a much more complex affair for sure | 16:36 |
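[editor's note: the external-script problem discussed above, sketched in Python. This assumes a `get_server()` callable returning the compute API view of the server (`status` plus the `OS-EXT-STS:task_state` extension key); per mriedem's advice, it waits for a terminal status with task_state None rather than tracking intermediate task states, and the comment marks where the proposed auto-revert would make ACTIVE ambiguous.]

```python
import time

def wait_for_resize(get_server, timeout=300, interval=5,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll a server until its resize reaches a terminal state."""
    deadline = clock() + timeout
    while clock() < deadline:
        server = get_server()
        status = server['status']
        task_state = server.get('OS-EXT-STS:task_state')
        if task_state is None:
            if status == 'VERIFY_RESIZE':
                return True   # resize finished, awaiting confirm
            if status == 'ERROR':
                return False  # resize failed
            # ACTIVE here is ambiguous: auto-confirm means success,
            # but with the proposed auto-revert it could equally mean
            # the resize failed and was rolled back
        sleep(interval)
    raise TimeoutError('server never reached a terminal resize state')
```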
openstackgerrit | Merged openstack/nova master: Replace deprecated with_lockmode with with_for_update https://review.opendev.org/666221 | 16:37 |
melwitt | that's a helpful way to think about it, imagining what a tempest (or func test) would need to do to be able to automate it | 16:37 |
mriedem | maybe should link this conversation into the abandoned change so we have that when this comes up again in 2 years :) | 16:40 |
melwitt | yeah, that's a good idea. let me do that now | 16:40 |
sean-k-mooney | ... i started reading the scrollback and i think on second thought i'm not going to do that | 16:44 |
sean-k-mooney | melwitt: the only way for a non admin to determine if a cold migrate succeeded would be to check the hashed host id before and after | 16:47 |
sean-k-mooney | for resize they could check if the flavor is the one they expected | 16:48 |
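[editor's note: sean-k-mooney's non-admin checks, sketched. The `before`/`after` dicts are assumed to look like the compute API's GET /servers/{id} response ('hostId' is the per-tenant hashed host id). As dansmith points out next, the hostId check only holds for a strict cold migration, since resize may legally land on the same host with allow_resize_to_same_host.]

```python
def cold_migrated(before, after):
    """True if the server appears to have moved hosts (hostId changed)."""
    return before['hostId'] != after['hostId']

def resized(before, after):
    """True if the server's flavor changed (a resize rather than a
    pure cold migration)."""
    return before['flavor']['id'] != after['flavor']['id']
```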
dansmith | sean-k-mooney: not really | 16:48 |
dansmith | oh for a strict migration, yeah | 16:48 |
dansmith | was going to say, resize to same host breaks that assumption | 16:48 |
melwitt | sean-k-mooney: could also observe ACTIVE -> RESIZING -> ACTIVE as dansmith described, right? as non-admin | 16:49 |
dansmith | melwitt: yes | 16:49 |
sean-k-mooney | you could observe it if you poll but you would not know if it succeeded or failed | 16:49 |
sean-k-mooney | without also checking if the flavor is the old or new one | 16:49 |
dansmith | sean-k-mooney: you won't go back to active from resizing currently | 16:49 |
sean-k-mooney | oh ok so that was the change ye were talking about | 16:50 |
melwitt | sean-k-mooney: if it failed [after going to RESIZING] it would go to ERROR. are you talking hypothetically with the abandoned patch? | 16:50 |
sean-k-mooney | melwitt: there are cases where i thought it would auto revert that went back to active | 16:51 |
melwitt | sean-k-mooney: no, that was the proposal in the abandoned patch | 16:51 |
sean-k-mooney | ok i might be thinking about live migrate then | 16:52 |
sean-k-mooney | for live migrate we can fail to migrate but still be in active | 16:52 |
sean-k-mooney | so ya looking at a code search, revert_resize is only ever called from the api which simplifies some things but not others | 16:58 |
sean-k-mooney | melwitt: do we currently allow you to revert a resize for an instance that is in error because the resize failed | 16:59 |
sean-k-mooney | so you can go active->resizing->error->active? | 16:59 |
mriedem | fwiw, as a non-admin i think you can tell if your resize failed if the instance action "message" is not null: GET /servers/{server_id}/os-instance-actions/{request_id} | 17:00 |
melwitt | sean-k-mooney: I think so, based on the abandoned patch. it was proposing to do that automatically (from error) | 17:00 |
sean-k-mooney | melwitt: ok if we did not, you would have to do reset state (which is admin only?) + a hard reboot | 17:01 |
melwitt | mriedem: you mean failed before resize started right | 17:02 |
mriedem | no if the operation failed | 17:02 |
mriedem | if any event in an action (operation) fails, the overall action 'message' is always 'Error': https://github.com/openstack/nova/blob/707deb158996d540111c23afd8c916ea1c18906a/nova/db/sqlalchemy/api.py#L5227 | 17:02 |
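[editor's note: mriedem's two points above, sketched. The record shapes assume the os-instance-actions API: an action has a top-level 'message' and its events each have a 'result'. As the following lines note (bug 1824420), the aggregation is a false positive when an early event fails but a reschedule succeeds, so the non-admin check is not foolproof.]

```python
def summarize_action(events):
    """How the overall action 'message' gets derived: any errored
    event marks the whole action 'Error', even if a later reschedule
    to another host succeeded."""
    if any(e.get('result') == 'Error' for e in events):
        return 'Error'
    return None

def action_failed(action):
    """The non-admin check: a non-null top-level 'message' means the
    action apparently failed."""
    return action.get('message') is not None
```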
melwitt | sean-k-mooney: if we did not allow revert from error? I don't think reset state + reboot would put everything back properly | 17:02 |
mriedem | which is actually a bug... | 17:02 |
mriedem | https://bugs.launchpad.net/nova/+bug/1824420 | 17:03 |
openstack | Launchpad bug 1824420 in OpenStack Compute (nova) "Live migration succeeds but instance-action-list still has unexpected Error status" [Undecided,Triaged] | 17:03 |
melwitt | oh | 17:03 |
mriedem | so before we go down the road of "well the user can track the operation to see if it was auto-reverted on error because of instance actions" let me point out that relying on instance actions that way isn't foolproof because of that bug | 17:04 |
mriedem | and especially b/c it's a result of failures on hosts and then doing reschedules to other hosts | 17:04 |
mriedem | which resize can do | 17:04 |
sean-k-mooney | the instace should become active on the source host but it might not fix the allocation in placment properly | 17:04 |
mriedem | auto-reverting a failed resize could be all sorts of f'ed up | 17:05 |
mriedem | because rollbacks are near impossible | 17:05 |
mriedem | hard to test | 17:05 |
mriedem | i'm fairly certain our live migration rollback code is also quite janky in several ways | 17:05 |
mriedem | because we don't test it in the gate | 17:06 |
sean-k-mooney | just looking at that bug the live migration failed right? | 17:08 |
sean-k-mooney | so we would exepct there to be an error in the instance action log? | 17:09 |
mriedem | no | 17:10 |
mriedem | read my comments on the bug | 17:10 |
sean-k-mooney | maybe im misreading it as its kind of hard to read the initial bug | 17:10 |
mriedem | a pre-check on one of the candidate dest hosts failed | 17:10 |
mriedem | which triggers a reschedule to another dest host in the conductor live migration task | 17:10 |
mriedem | the 2nd host works | 17:10 |
mriedem | but b/c the pre-check failed on the first dest host, the instance action event for that one is an error, which sets the overall action message to 'Error' | 17:11 |
sean-k-mooney | ah ok | 17:11 |
mriedem | iow, actions aren't reschedule aware | 17:11 |
mriedem | or what is a non-fatal error | 17:11 |
sean-k-mooney | right | 17:12 |
sean-k-mooney | should we be logging the prechecks as events? | 17:12 |
sean-k-mooney | i was not expecting to see compute_check_can_live_migrate_destination events | 17:12 |
mriedem | hard to say | 17:13 |
mriedem | if you configure nova to not have alternate hosts for reschedules | 17:13 |
mriedem | then you'd likely want to know it's that dest pre-check that failed right? | 17:13 |
sean-k-mooney | maybe, or just that you had no valid hosts? | 17:14 |
sean-k-mooney | / exhausted the list of alternates | 17:14 |
sean-k-mooney | i think we would still log the failure right | 17:14 |
openstackgerrit | Merged openstack/nova master: Remove orphaned comment from _get_group_details https://review.opendev.org/667135 | 17:15 |
mriedem | sure, if you set https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.migrate_max_retries to 0 for no retries, or you don't have any alternate hosts | 17:15 |
mriedem | idk, anyway, it's tangential to the auto-revert failed resize thing mel was asking about | 17:15 |
sean-k-mooney | for me this feels like we are leaking an implementation detail as an event | 17:15 |
sean-k-mooney | ya it is | 17:16 |
mriedem | instance action events are basically entirely leaked implementation details :) | 17:16 |
mriedem | the event names come from the methods they decorate | 17:16 |
mriedem | there is no guarantee on api stability for those things | 17:16 |
sean-k-mooney | ok personally i would prefer not to decorate that function | 17:17 |
sean-k-mooney | but as you said its tangential to melwitt's topic | 17:17 |
openstackgerrit | Lee Yarwood proposed openstack/nova master: libvirt: Add a rbd_connect_timeout configurable https://review.opendev.org/667421 | 17:36 |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | 17:36 |
Nick_A | any idea why metadata would send all /24 routes in a region to each instance? http://paste.openstack.org/show/y0lE42EA59yhnu7G1KnY/ | 17:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix AttributeError in RT._update_usage_from_migration https://review.opendev.org/667687 | 17:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix RT init arg order in test_unsupported_move_type https://review.opendev.org/667688 | 17:38 |
openstackgerrit | Ghanshyam Mann proposed openstack/nova master: Multiple API cleanup changes https://review.opendev.org/666889 | 17:48 |
sean-k-mooney | yonglihe: efried just reviewing https://review.opendev.org/#/c/648912/14 but why are we looking up instance by name? | 17:59 |
*** altlogbot_2 has quit IRC | 18:01 | |
efried | sean-k-mooney: I haven't the foggiest. I'm involved here in an administrative capacity :) | 18:01 |
sean-k-mooney | the domain xml has had the instance uuid stored in the uuid field for several releases now, so im wondering why we would use the instance domain name instead | 18:02 |
*** altlogbot_1 has joined #openstack-nova | 18:03 | |
melwitt | sean-k-mooney: been meaning to get to that review. after what we briefly discussed at the ptg, that change would be best to piggyback onto the existing periodic for handling deleted instances. I don't know why it would be looking up instances by name | 18:03 |
efried | k, hopefully yonglihe will be able to answer on the review. Thanks sean-k-mooney. | 18:03 |
sean-k-mooney | melwitt: it is piggybacking on that task | 18:04 |
melwitt | but I see a lot of new methods | 18:04 |
sean-k-mooney | efried: ok im trying to find where it gets the list of suspected instances | 18:04 |
sean-k-mooney | melwitt: ya there are. im not sure if they are all needed. | 18:05 |
melwitt | yeah, I would think there should be none new needed | 18:05 |
sean-k-mooney | well we need a new method to query the driver for the instances that are running on the host but not in the db | 18:06 |
sean-k-mooney | and then we can call the old methods that implement the policy, e.g. reap or log or do nothing, whatever you have set in the config | 18:07 |
melwitt | why? there's a self._get_instances_on_driver method already | 18:07 |
sean-k-mooney | that is a good question :) | 18:08 |
sean-k-mooney | i have only properly looked at https://review.opendev.org/#/c/648912/14 which is doing the driver change, but i need to look at how that is being used in https://review.opendev.org/#/c/627765/28 | 18:08 |
melwitt | anyway, just saying skimming those patches I don't see why they're so large | 18:08 |
melwitt | or rather, I expected not such a large patch for this | 18:09 |
sean-k-mooney | melwitt: this seems overly complex https://review.opendev.org/#/c/627765/28/nova/compute/manager.py@8884 | 18:13 |
sean-k-mooney | also _destroy_orphan_instances is not a great name for that since it might not destroy anything. | 18:14 |
sean-k-mooney | melwitt: i was wrong however, its adding a new periodic task, not extending the existing task | 18:15 |
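The detection step being discussed, finding guests the hypervisor reports but the database does not know about, boils down to a set difference. A hypothetical helper as illustration; the function and both argument names are made up, and the real patch would get its inputs from the virt driver and an InstanceList query for the host:

```python
def find_orphan_guests(driver_uuids, db_uuids):
    """Return UUIDs of guests running on the host but unknown to the DB.

    driver_uuids: UUIDs reported by the virt driver (e.g. libvirt domains).
    db_uuids: UUIDs of instances the database maps to this host.
    Sorted so the result is deterministic for logging.
    """
    return sorted(set(driver_uuids) - set(db_uuids))

print(find_orphan_guests(
    ['aaa', 'bbb', 'ccc'],  # guests the hypervisor reports
    ['aaa', 'ccc']))        # instances the DB knows about
# ['bbb']
```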
*** hongbin has joined #openstack-nova | 18:33 | |
*** ociuhandu has quit IRC | 18:36 | |
melwitt | sean-k-mooney: yeah, that's what I had thought. it should get much simpler if it's changed to extend the existing periodic. and I think the suggestion at the ptg from dansmith was to add another enumerated choice to the existing conf option that is something like "reap-unknown" which will reap both known deleted and unknown guests. and otherwise just log unknowns | 18:41 |
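For reference, the existing option is `running_deleted_instance_action`; the PTG suggestion above would add a new enumerated choice to it rather than a new option. Shown as a nova.conf fragment, with "reap-unknown" being the hypothetical name floated in the discussion, not a settled or existing choice:

```ini
[DEFAULT]
# Existing choices are noop, log, shutdown and reap, and apply to guests
# still running on the host but marked deleted in the database. The
# proposal adds something like "reap-unknown", which would also reap
# guests nova has no record of; with plain "reap", unknown guests would
# only be logged.
running_deleted_instance_action = reap
```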
openstackgerrit | Merged openstack/nova master: Fix update_provider_tree signature in reference docs https://review.opendev.org/667408 | 18:43 |
openstackgerrit | Eric Fried proposed openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | 18:43 |
mriedem | sean-k-mooney: it's named _destroy* because it's similar to the existing _destroy_running_instances | 18:44 |
mriedem | which might also not destroy anything | 18:44 |
sean-k-mooney | mriedem: ah ok | 18:44 |
sean-k-mooney | my main issue with this is it will only really work for the libvirt driver | 18:45 |
sean-k-mooney | and the fake driver | 18:45 |
*** ivve has joined #openstack-nova | 18:45 | |
mriedem | yup | 18:53 |
mriedem | i believe i noted something along those lines last time i did a deep review on it | 18:54 |
openstackgerrit | Miguel Ángel Herranz Trillo proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | 18:54 |
mriedem | i seem to remember why they needed to lookup by name at the time, and it was libvirt-specific | 18:54 |
sean-k-mooney | ya im looking at it again | 18:54 |
sean-k-mooney | its because we can have libvirt domains that are for instances that nova created but have been deleted from the db | 18:55 |
sean-k-mooney | so to destroy the running guest they need to use the libvirt domain name | 18:55 |
mriedem | having said all that, i'm not opposed to starting with something that could eventually be implemented by other drivers | 18:55 |
mriedem | as long as it's graceful about other drivers not implementing the necessary interface | 18:56 |
sean-k-mooney | ya it handles the not-implemented exception correctly in the manager | 18:56 |
sean-k-mooney | im thinking of asking them to use the uuid instead of the name however | 18:56 |
sean-k-mooney | the instance uuid is set in the libvirt domain's uuid field | 18:57 |
mriedem | yeah uuid is ideal if we can use it | 18:57 |
sean-k-mooney | but idk if libvirt allows us to delete by uuid | 18:58 |
sean-k-mooney | we might just be pushing the translation into the driver | 18:58 |
sean-k-mooney | im also wondering how to deal with the fact we could be leaking plugged interfaces | 18:59 |
*** maciejjozefczyk has quit IRC | 19:02 | |
mriedem | heh, | 19:03 |
mriedem | well we could be leaking all sorts of things | 19:04 |
mriedem | storage connections | 19:04 |
sean-k-mooney | will undefining the domain delete the root disk? | 19:04 |
sean-k-mooney | or other disks | 19:04 |
mriedem | i doubt it | 19:05 |
sean-k-mooney | if we are reaping the orphan vms that were created by nova but are deleted in the db, we really should be cleaning up all their local resources in that case, which this current patch does not attempt to do | 19:05 |
mriedem | otherwise we wouldn't need separate calls during driver.destroy to cleanup the volumes via brick and unplug vifs via os-vif | 19:06 |
mriedem | at some point this could also be cyborg devices and such couldn't it? | 19:06 |
sean-k-mooney | ya that is a good point | 19:06 |
sean-k-mooney | ya | 19:06 |
sean-k-mooney | so if we are going to reap these we need to try and clean up as much of the resources as we can, or just support powering down the instance but not reaping them? | 19:07 |
openstackgerrit | Merged openstack/nova-specs master: grammar fix for show-server-numa-topology spec https://review.opendev.org/667487 | 19:07 |
sean-k-mooney | if we just delete the domain the operator has nothing to go on to figure out what they need to clean up manually | 19:07 |
sean-k-mooney | its tricky however, as from the domain we dont know what nova created or not, but i think we should at least try to do some cleanup | 19:09 |
*** dklyle has quit IRC | 19:09 | |
sean-k-mooney | anyway im going to get something to eat | 19:12 |
sean-k-mooney | o/ | 19:13 |
*** ralonsoh has quit IRC | 19:17 | |
*** phughk has quit IRC | 19:18 | |
mriedem | there is meta in the domain that tells us if nova created it | 19:19 |
sean-k-mooney | yes there is | 19:19 |
mriedem | for libvirt anyway | 19:19 |
sean-k-mooney | which i asked yonglihe to check when deleting the domain | 19:20 |
sean-k-mooney | in the libvirt case that is | 19:20 |
efried | mriedem: +1 on the ML note, thanks for that | 19:20 |
mriedem | np | 19:21 |
sean-k-mooney | that is what https://review.opendev.org/#/c/648912/14/nova/virt/libvirt/driver.py@9695 is doing and its used to filter the domains here https://review.opendev.org/#/c/627765/28/nova/compute/manager.py@8886 | 19:22 |
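Both points above, the instance UUID living in the domain's <uuid> element and the nova metadata marking guests nova created, can be seen by parsing a domain's XML dump. A self-contained sketch; the trimmed example document and its UUID are made up, and the real code would get the XML from libvirt's XMLDesc rather than a string:

```python
import xml.etree.ElementTree as ET

# Trimmed example of a libvirt domain XML dump: the instance UUID is in
# <uuid>, and nova tags guests it created with metadata under its own
# XML namespace. All values here are invented for illustration.
DOMAIN_XML = """
<domain type='kvm'>
  <name>instance-0000002a</name>
  <uuid>9873dcf7-4b27-4dd1-a427-97de7fa9e3c9</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="19.0.0"/>
    </nova:instance>
  </metadata>
</domain>
"""

NOVA_NS = 'http://openstack.org/xmlns/libvirt/nova/1.0'

def describe_domain(xml):
    """Return (uuid, created_by_nova) parsed from a domain XML dump."""
    root = ET.fromstring(xml)
    uuid = root.findtext('uuid')
    # A guest counts as nova-created if the nova metadata element exists.
    created_by_nova = root.find('./metadata/{%s}instance' % NOVA_NS) is not None
    return uuid, created_by_nova

print(describe_domain(DOMAIN_XML))
# ('9873dcf7-4b27-4dd1-a427-97de7fa9e3c9', True)
```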
*** whoami-rajat has quit IRC | 19:22 | |
*** maciejjozefczyk has joined #openstack-nova | 19:31 | |
*** dklyle has joined #openstack-nova | 19:36 | |
*** panda has quit IRC | 19:39 | |
*** panda has joined #openstack-nova | 19:40 | |
*** altlogbot_1 has quit IRC | 19:45 | |
*** altlogbot_2 has joined #openstack-nova | 19:47 | |
*** maciejjozefczyk has quit IRC | 19:50 | |
*** tbachman has quit IRC | 19:55 | |
efried | mriedem: took me three days to go through all the specs, but http://lists.openstack.org/pipermail/openstack-discuss/2019-June/007381.html | 19:59 |
*** tbachman has joined #openstack-nova | 19:59 | |
*** mriedem has quit IRC | 20:04 | |
*** mriedem has joined #openstack-nova | 20:07 | |
mriedem | efried: huh https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.*+reviewedby:self seems to not work properly, it's only showing me https://review.opendev.org/#/c/631154/ but i've clearly commented on https://review.opendev.org/#/c/648686/ as well | 20:10 |
mriedem | maybe reviewedby is only on the latest PS? | 20:10 |
mriedem | aha | 20:10 |
mriedem | https://review.opendev.org/#/q/project:openstack/nova-specs+status:open+path:%255Especs/train/approved/.*+reviewer:self | 20:10 |
mriedem | reviewer:self, not reviewedby:self | 20:11 |
efried | whoops, thanks | 20:14 |
*** altlogbot_2 has quit IRC | 20:15 | |
*** eharney has quit IRC | 20:16 | |
*** altlogbot_2 has joined #openstack-nova | 20:19 | |
*** altlogbot_2 has quit IRC | 20:43 | |
*** altlogbot_2 has joined #openstack-nova | 20:45 | |
*** Bidwe_jay has quit IRC | 20:59 | |
*** altlogbot_2 has quit IRC | 21:00 | |
*** altlogbot_3 has joined #openstack-nova | 21:05 | |
*** pcaruana has quit IRC | 21:05 | |
*** ivve has quit IRC | 21:07 | |
*** cfriesen has quit IRC | 21:24 | |
melwitt | mnaser, imacdonn: hi, as responders to the ML thread awhile back, I have a spec up for showing server status as UNKNOWN if host status is UNKNOWN that has been receiving some reviews. your reviews would be helpful for deciding whether it goes forward https://review.opendev.org/666181 | 21:31 |
*** cfriesen has joined #openstack-nova | 21:31 | |
mriedem | dansmith: you may want to drop your +2 to a +1 or 0 https://review.opendev.org/#/c/457886/ until i get the ceph job results on it | 21:33 |
openstackgerrit | Miguel Ángel Herranz Trillo proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | 21:41 |
*** panda has quit IRC | 21:42 | |
mriedem | dansmith: nvm, lyarwood already had a patch up to test that | 21:44 |
*** panda has joined #openstack-nova | 21:45 | |
*** takashin has joined #openstack-nova | 21:50 | |
*** rcernin has joined #openstack-nova | 22:00 | |
*** mlavalle has quit IRC | 22:09 | |
*** xek__ has quit IRC | 22:10 | |
*** shilpasd has quit IRC | 22:11 | |
*** eharney has joined #openstack-nova | 22:12 | |
mriedem | efried: are you ok with me just pushing up this test change and +2ing? https://review.opendev.org/#/c/659780/3/nova/tests/unit/virt/disk/mount/test_nbd.py | 22:14 |
efried | ... | 22:14 |
efried | mriedem: yes, and I'll +A | 22:15 |
*** luksky has quit IRC | 22:16 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Fix type error on call to mount device https://review.opendev.org/659780 | 22:16 |
mriedem | done | 22:17 |
efried | +A | 22:18 |
*** rcernin has quit IRC | 22:19 | |
*** rcernin has joined #openstack-nova | 22:20 | |
mnaser | melwitt: left a comment thanks :D | 22:25 |
melwitt | thanks | 22:43 |
*** tkajinam has joined #openstack-nova | 23:05 | |
*** threestrands has joined #openstack-nova | 23:15 | |
*** igordc has quit IRC | 23:25 | |
*** threestrands has quit IRC | 23:29 | |
*** mriedem has quit IRC | 23:42 | |
*** hongbin has quit IRC | 23:43 | |
*** slaweq has quit IRC | 23:50 | |
*** icarusfactor has quit IRC | 23:51 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!