Monday, 2018-08-06

*** efried1 has joined #openstack-nova00:16
*** efried has quit IRC00:20
*** efried1 is now known as efried00:20
*** frankwang has joined #openstack-nova00:27
*** gbarros has quit IRC00:30
*** gbarros has joined #openstack-nova00:51
*** fanzhang has joined #openstack-nova01:04
*** mrsoul has quit IRC01:05
*** chason has joined #openstack-nova01:21
*** tetsuro_ has quit IRC01:23
*** chenyb4 has joined #openstack-nova01:35
*** hongbin has joined #openstack-nova01:39
*** Nel1x has joined #openstack-nova01:43
*** liuyulong has joined #openstack-nova01:54
openstackgerritzhufl proposed openstack/nova master: Fix none-ascii char in doc  https://review.openstack.org/58842201:55
openstackgerritPierre Blanc proposed openstack/nova master: Docs: Add guide to migrate instance with snapshot  https://review.openstack.org/58444202:01
*** bkopilov has quit IRC02:14
*** Dinesh_Bhor has joined #openstack-nova02:17
openstackgerritlei zhang proposed openstack/nova master: [Docs] Update the confusing console output  https://review.openstack.org/58900402:30
*** psachin has joined #openstack-nova02:31
openstackgerritYikun Jiang (Kero) proposed openstack/nova master: Add microversion info in the os-server-groups API samples  https://review.openstack.org/58900602:34
*** kiseok7 has quit IRC02:35
*** alex_xu has joined #openstack-nova02:42
*** gbarros has quit IRC02:46
*** gbarros has joined #openstack-nova02:50
openstackgerritlei zhang proposed openstack/nova master: [Docs] Update the confusing console output  https://review.openstack.org/58900402:52
*** gbarros has quit IRC02:52
*** vivsoni has joined #openstack-nova02:56
openstackgerritlei zhang proposed openstack/nova master: [Docs] Update the compute node installation steps  https://review.openstack.org/58900803:02
*** Nel1x has quit IRC03:34
*** udesale has joined #openstack-nova03:52
*** kevinbenton has quit IRC04:00
*** d34dh0r53 has quit IRC04:00
*** kevinbenton has joined #openstack-nova04:00
*** d34dh0r53 has joined #openstack-nova04:04
*** d34dh0r53 has quit IRC04:05
*** d34dh0r53 has joined #openstack-nova04:06
*** Dinesh_Bhor has quit IRC04:08
*** hongbin has quit IRC04:09
*** frankwang has quit IRC04:11
*** frankwang has joined #openstack-nova04:12
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: Don't filter out sibling sets with one core  https://review.openstack.org/58857004:15
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: Ensure emulator threads are always calculated  https://review.openstack.org/58857104:15
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: Always pass 'NUMACell.siblings' to _pack_instance_onto_cores'  https://review.openstack.org/58857204:15
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: trivialfix: cleanup _pack_instance_onto_cores()  https://review.openstack.org/58857304:15
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: Add unit tests for EmulatorThreadsTestCase  https://review.openstack.org/58857404:15
openstackgerritTetsuro Nakamura proposed openstack/nova stable/queens: Not use thread alloc policy for emulator thread  https://review.openstack.org/58857504:15
*** frankwang has quit IRC04:51
*** icey has quit IRC05:13
*** icey has joined #openstack-nova05:14
*** Dinesh_Bhor has joined #openstack-nova05:24
*** hoonetorg has quit IRC05:28
*** udesale has quit IRC05:34
*** janki has joined #openstack-nova05:38
*** frankwang has joined #openstack-nova05:40
*** hoonetorg has joined #openstack-nova05:42
*** ircuser-1 has quit IRC05:43
*** Luzi has joined #openstack-nova05:53
*** nicolasbock has joined #openstack-nova06:05
*** moshele has joined #openstack-nova06:06
*** pcaruana has joined #openstack-nova06:07
*** udesale has joined #openstack-nova06:11
*** ratailor has joined #openstack-nova06:17
*** tetsuro_ has joined #openstack-nova06:34
*** adrianc_ has joined #openstack-nova06:36
*** tetsuro_ has quit IRC06:41
*** luksky has joined #openstack-nova06:42
*** ivve has joined #openstack-nova06:46
*** rcernin has quit IRC06:52
*** slaweq has quit IRC06:55
*** maciejjozefczyk has joined #openstack-nova06:56
*** slaweq has joined #openstack-nova06:58
openstackgerritChen proposed openstack/nova master: Trivial fix on migration doc  https://review.openstack.org/58902807:08
*** ajo has joined #openstack-nova07:09
*** deepak_mourya_ has joined #openstack-nova07:10
*** jhesketh_ has joined #openstack-nova07:10
*** jhesketh has quit IRC07:11
*** rmart04 has joined #openstack-nova07:14
*** gvrangan has joined #openstack-nova07:14
*** quiquell has joined #openstack-nova07:15
quiquellGood morning07:15
quiquellWe have found this http://logs.openstack.org/95/583195/18/check/tripleo-ci-centos-7-containers-multinode/769edac/logs/undercloud/var/log/containers/nova/nova-scheduler.log.txt.gz07:15
quiquellWhat does it mean?07:15
quiquell2018-08-03 22:57:32.138 26 ERROR nova.scheduler.client.report [req-e207e96f-2e1c-4648-afc7-be5c51d396ef 0dd904c9905d4ab7962c306a23e8e4ff 6825b026e19a450283f76bcc85f30a43 - default default] Failed to retrieve allocation candidates from placement API for filters: RequestGroup(use_same_provider=False, resources={CUSTOM_BAREMETAL:1}, traits=[], aggregates=[])07:16
quiquellGot 400: {"errors": [{"status": 400, "request_id": "req-8b3d0e7f-0238-4290-99f9-4a3df8213231", "code": "placement.undefined_code", "detail": "The server could not comply with the request since it is either malformed or otherwise incorrect.\n\n Invalid resource class in resources parameter: No such resource class CUSTOM_BAREMETAL.  ", "title": "Bad Request"}]}.07:16
*** Dinesh_Bhor has quit IRC07:18
*** vishakha_ has joined #openstack-nova07:19
*** Bhujay has joined #openstack-nova07:21
*** ccamacho has joined #openstack-nova07:23
*** Dinesh_Bhor has joined #openstack-nova07:24
alex_xuquiquell: 'CUSTOM_BAREMETAL' is a custom trait; you must create it via the placement API before you refer to it. In the ironic case, nova-manager will create that custom trait for the ironic virt driver.07:30
alex_xuquiquell: check the nova-manager log, sounds like nova-manager doesn't start up correctly http://logs.openstack.org/95/583195/18/check/tripleo-ci-centos-7-containers-multinode/769edac/logs/undercloud/var/log/containers/nova/nova-compute.log.txt.gz#_2018-08-03_23_06_47_76407:30
*** Bhujay has quit IRC07:30
openstackgerritGhanshyam Mann proposed openstack/nova master: Define irrelevant-files for tempest-full-py3 job  https://review.openstack.org/58903907:30
alex_xuquiquell: also, the ironic driver returned zero nodes http://logs.openstack.org/95/583195/18/check/tripleo-ci-centos-7-containers-multinode/769edac/logs/undercloud/var/log/containers/nova/nova-compute.log.txt.gz#_2018-08-03_23_06_47_88207:30
alex_xuquiquell: sorry, s/nova-manager/nova-compute/07:31
alex_xuquiquell: that means nova-compute doesn't reach the code that creates CUSTOM_BAREMETAL for the ironic virt driver07:31
alex_xuquiquell: also sorry, CUSTOM_BAREMETAL is a custom resource class, it is not a custom trait... really too many typos and wrong words...07:32
quiquellalex_xu: Do we have any virt driver logs to check why it returns 0?07:32
alex_xuquiquell: i guess there isn't much, you probably need to check the ironic log to see whether ironic works correctly07:34
*** Bhujay has joined #openstack-nova07:39
*** jpenag is now known as jpena07:40
*** tssurya has joined #openstack-nova07:40
gmanngiblet: should we mark these bugs fixed/invalid now (fixed by zuul) - https://bugs.launchpad.net/nova/+bug/1745405  https://bugs.launchpad.net/nova/+bug/174543107:43
openstackLaunchpad bug 1745405 in OpenStack Compute (nova) "tempest-full job triggered for irrelevant changes" [Undecided,In progress] - Assigned to Balazs Gibizer (balazs-gibizer)07:43
openstackLaunchpad bug 1745431 in OpenStack Compute (nova) "neutron-grenade job is triggered for irrelevant changes " [Undecided,In progress] - Assigned to Balazs Gibizer (balazs-gibizer)07:43
*** Kvisle is now known as Kvisle_07:44
*** avolkov has joined #openstack-nova07:44
*** dpawlik has joined #openstack-nova07:50
*** rpittau has joined #openstack-nova07:53
*** Bhujay has quit IRC07:54
*** dtantsur|afk is now known as dtantsur07:59
openstackgerritGhanshyam Mann proposed openstack/nova master: Add tempest-slow job to run the tempest slow tests  https://review.openstack.org/56769707:59
*** cdent has joined #openstack-nova08:00
*** tetsuro_ has joined #openstack-nova08:10
*** jaosorior has joined #openstack-nova08:21
*** Dinesh_Bhor has quit IRC08:22
*** Bhujay has joined #openstack-nova08:45
*** cdent has quit IRC08:45
*** Dinesh_Bhor has joined #openstack-nova08:58
*** maciejjozefczyk has quit IRC09:04
*** amarao has joined #openstack-nova09:06
*** mdbooth has joined #openstack-nova09:09
*** maciejjozefczyk has joined #openstack-nova09:15
openstackgerritSylvain Bauza proposed openstack/nova master: Pass allocations to virt drivers when resizing  https://review.openstack.org/58908509:18
*** panda is now known as panda|rover|off09:25
*** maciejjozefczyk has quit IRC09:29
*** adrianc_ has quit IRC09:36
*** maciejjozefczyk has joined #openstack-nova09:36
openstackgerritSurya Seetharaman proposed openstack/nova master: Cleanup comp_node, res_prov, services, aggregate_hosts during cell deletion  https://review.openstack.org/54666009:40
*** vishakha_ is now known as vishakha09:40
*** tetsuro has joined #openstack-nova09:43
*** sahid has joined #openstack-nova09:47
*** kaliya has joined #openstack-nova09:54
*** vivsoni has quit IRC10:00
*** giblet is now known as gibi10:04
*** liuyulong has quit IRC10:05
*** cdent has joined #openstack-nova10:06
cdentgibi: another good thing to get in: https://review.openstack.org/#/c/587772/10:06
*** do3meli has joined #openstack-nova10:10
gibigmann: marked both bugs invalid in nova. Thanks for the notice10:10
gibicdent: looking...10:10
*** do3meli has left #openstack-nova10:11
*** frankwang has quit IRC10:15
openstackgerritLee Yarwood proposed openstack/nova master: fixtures: Track volume attachments within CinderFixtureNewAttachFlow  https://review.openstack.org/58701310:18
openstackgerritLee Yarwood proposed openstack/nova master: Add regression test for bug#1784353  https://review.openstack.org/58701410:19
openstackgerritLee Yarwood proposed openstack/nova master: compute: Recreate volume attachments during a reschedule  https://review.openstack.org/58707110:19
lyarwooddansmith: morning, nothing urgent but could you take a look at ^ specifically a question raised by mriedem on a previous patchset - https://review.openstack.org/#/c/587071/4/nova/compute/manager.py10:19
gibicdent: +210:19
openstackgerritLee Yarwood proposed openstack/nova master: compute: Recreate volume attachments during a reschedule  https://review.openstack.org/58707110:20
cdentthanks gibi10:22
*** sambetts_ is now known as sambetts|afk10:24
gmanngibi: thanks10:25
*** chenyb4 has quit IRC10:27
*** luksky has quit IRC10:28
*** vivsoni has joined #openstack-nova10:29
*** hughsaunders has quit IRC10:32
*** hughsaunders has joined #openstack-nova10:33
*** udesale has quit IRC10:34
*** adrianc has joined #openstack-nova10:36
*** Dinesh_Bhor has quit IRC10:46
*** dave-mccowan has joined #openstack-nova10:47
*** sahid has quit IRC10:55
*** tetsuro_ has quit IRC10:56
*** tetsuro_ has joined #openstack-nova10:56
*** luksky has joined #openstack-nova11:02
*** dave-mccowan has quit IRC11:03
*** cdent has quit IRC11:09
*** gvrangan has quit IRC11:09
*** jpena is now known as jpena|lunch11:14
*** tetsuro has quit IRC11:19
*** cdent has joined #openstack-nova11:23
*** leakypipes is now known as jaypipes11:28
*** Dinesh_Bhor has joined #openstack-nova11:32
*** Dinesh_Bhor has quit IRC11:32
*** slagle has joined #openstack-nova11:49
*** mardim has joined #openstack-nova11:59
mardimhello guys11:59
*** Kevin_Zheng has joined #openstack-nova11:59
mardimI have one question11:59
mardimI have this libvirt xml cpu topology12:00
mardim<cpu mode='host-passthrough' check='none'>12:00
mardim    <topology sockets='1' cores='2' threads='2'/>12:00
mardim  </cpu>12:00
mardimThen I tried to spin up a cirros instance with 2 vcpus and hw:cpu_policy='dedicated', hw:cpu_thread_policy='isolate'12:01
mardimBut the instance is in error state because I get "No valid host was found"12:01
mardimApparently the scheduler cannot pin and isolate the processes of the instance12:01
mardimbut I do not know why12:01
mardimhere are some scheduler logs which might help12:02
mardimcpu_usage": 4, "memory_usage": 64, "cpuset": [0, 1, 2, 3], "pinned_cpus": [0, 1], "siblings": [[0, 1], [2, 3]],12:02
*** cdent has quit IRC12:02
mardimdo you have any clue what the problem is?12:03
mardimThanks !!!12:03
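One possible reading of the log line above, sketched in Python. This is a simplified model and not nova's actual pinning code: it assumes that under hw:cpu_thread_policy='isolate' every vCPU needs a whole core (sibling set) with no thread already pinned, and that the logged cell state is what the scheduler saw before placing the instance. The helper names are hypothetical.

```python
def free_isolated_cores(siblings, pinned_cpus):
    """Return the sibling sets (physical cores) with no pinned thread."""
    pinned = set(pinned_cpus)
    return [core for core in siblings if not pinned.intersection(core)]


def fits_isolate(siblings, pinned_cpus, vcpus):
    """True if each vCPU can claim a completely free core of its own."""
    return len(free_isolated_cores(siblings, pinned_cpus)) >= vcpus


# Values copied from the scheduler log line above.
siblings = [[0, 1], [2, 3]]
pinned_cpus = [0, 1]

print(free_isolated_cores(siblings, pinned_cpus))     # [[2, 3]]
print(fits_isolate(siblings, pinned_cpus, vcpus=2))   # False
```

Under these assumptions only one core is entirely free, so a 2-vCPU instance with the isolate policy would not fit on this cell.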
*** ratailor has quit IRC12:04
*** gbarros has joined #openstack-nova12:15
*** jpena|lunch is now known as jpena12:18
*** vivsoni has quit IRC12:21
*** vivsoni has joined #openstack-nova12:22
*** vivsoni has quit IRC12:23
*** cdent has joined #openstack-nova12:23
*** vivsoni_ has joined #openstack-nova12:23
*** tbachman_ has joined #openstack-nova12:31
*** quiquell is now known as quiquell|lunch12:32
*** tbachman has quit IRC12:33
*** tbachman_ is now known as tbachman12:33
*** Luzi has quit IRC12:34
*** dtantsur is now known as dtantsur|brb12:36
openstackgerritLee Yarwood proposed openstack/nova master: fixtures: Track volume attachments within CinderFixtureNewAttachFlow  https://review.openstack.org/58701312:38
openstackgerritLee Yarwood proposed openstack/nova master: Add regression test for bug#1784353  https://review.openstack.org/58701412:38
openstackgerritLee Yarwood proposed openstack/nova master: compute: Recreate volume attachments during a reschedule  https://review.openstack.org/58707112:38
*** cdent has quit IRC12:39
*** Sigyn has quit IRC12:41
*** Sigyn has joined #openstack-nova12:42
*** cdent has joined #openstack-nova12:43
*** Luzi has joined #openstack-nova12:49
*** dave-mccowan has joined #openstack-nova12:53
*** mchlumsky has joined #openstack-nova12:56
*** quiquell|lunch is now known as quiquell|off12:57
*** Luzi has quit IRC12:58
*** panda|rover|off is now known as panda|rover-ish12:58
*** eharney has joined #openstack-nova13:00
*** quiquell|off has quit IRC13:01
*** dave-mccowan has quit IRC13:03
*** mchlumsky has quit IRC13:04
*** mchlumsky has joined #openstack-nova13:06
mdboothI'm trying to make 'python setup.py test' in the osc-placement repo exclude functional tests. Does anybody have any ideas where to start looking?13:10
mdboothIt seems to be pbr voodoo, so... any advice short of rtfs appreciated :)13:11
* mdbooth hasn't found any obvious documentation13:12
*** Luzi has joined #openstack-nova13:13
*** edmondsw has joined #openstack-nova13:13
cdentmdbooth: sorry, I've got no clues. I've not been much involved in osc-placement's birth13:16
efriedmdbooth: --test-path=...13:16
mdboothefried: Is that passed to testr?13:17
mdboothhttps://docs.openstack.org/pbr/latest/user/using.html#testing This suggests it might be tox.ini, although there's a deprecation notice13:17
efriedmdbooth: have we not switched osc-placement to stestr yet?13:18
mdboothefried: pbr is the darkest most arcane magik13:18
mdboothI have no idea :)13:19
cdentare you building rpms mdbooth, and thus forced through setup.py?13:19
*** jroll has quit IRC13:19
mdboothcdent: Yes13:19
efriedmdbooth: I would run to mtreinish if it were me.13:19
* cdent gives mdbooth a hug13:19
mdboothWell not *forced*. It's just a script, I can run whatever I want.13:19
*** jroll has joined #openstack-nova13:19
mdboothBut everything else runs setup.py test, so I'd prefer to make that work for consistency13:19
cdentyou might also try efried's suggestion with a --<space> before the --test-path13:19
mdboothWhat is the argument to --test-path?13:20
efriednot having the repo in front of me, I would guess ./osc_placement/tests/unit13:20
* cdent nods13:21
mdboothefried: Ah, ok.13:21
*** janki has quit IRC13:24
mdboothEurgh... It doesn't work and I'm way too hot and I just want to lie in an ice bucket and weep until it all goes away.13:24
*** Bhujay has quit IRC13:31
*** Luzi has quit IRC13:34
*** takashin has joined #openstack-nova13:36
mdboothSo I have a behaviour difference here between the CI environment and locally. CI runs 'python2 setup.py test' and it does All The Things(tm). Locally it does nothing: http://paste.openstack.org/show/727412/13:40
*** mriedem has joined #openstack-nova13:41
mdboothCI: (search for 'python2 setup.py test') https://logs.rdoproject.org/81/15181/3/check/legacy-rdoinfo-DLRN-check/83a6ba9/buildset/centos-rpm-master/repos/95/77/9577cd899541b4c2a5b9fa74a59f4c346bd5addf_dev/rpmbuild.log13:41
*** MasterofJOKers has quit IRC13:41
mdboothAny idea what environment factors affect setup.py test?13:42
*** MasterofJOKers has joined #openstack-nova13:43
*** lbragstad has joined #openstack-nova13:43
efriedn-sch meeting in 10 minutes in #openstack-meeting-alt13:50
dansmithlyarwood: mriedem I left comments on that review.. definitely seems like obligatory pings to cinder for every build is a less-than-awesome change, and would rather see it be something we do on reschedule if we need13:51
dansmithalso, queried about whether or not the GET is enough. I guess I would expect there are some attachment states that can't be reversed blindly by the boot process?13:51
lyarwooddansmith: ack thanks13:52
openstackgerritsahid proposed openstack/nova master: hardware: fix memory check usage for small/large pages  https://review.openstack.org/53216813:55
openstackgerritLee Yarwood proposed openstack/nova master: DNM/WIP compute: Reduce likelihood of bdm creation race during attach  https://review.openstack.org/58916413:56
mriedemdansmith: lyarwood: that's where i'm leaning - re-create attachments in conductor build_instances if we're rescheduling13:59
dansmithyeah13:59
dansmitht'would make more sense to me to do that14:00
lyarwoodmriedem / dansmith ; kk, tbh I've spent very little time looking around within conductor and just assumed it wouldn't be the right place, I'll try to move things over now and respin.14:00
mdboothOS_TEST_PATH14:04
efriedmdbooth: \o/ nice one14:07
mdboothAlthough I can't see why rpmbuild isn't using the default14:08
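A minimal sketch of how OS_TEST_PATH could be used to restrict a run to unit tests. It assumes the repo's test runner config expands something like ${OS_TEST_PATH:-<default>}, which is a common OpenStack convention but is not confirmed here for osc-placement. The default path is efried's guess from earlier in the log, and the helper names are hypothetical.

```python
import os

# Hypothetical default; efried guessed this path without the repo at hand.
DEFAULT_TEST_PATH = "./osc_placement/tests/unit"


def resolve_test_path(environ=None, default=DEFAULT_TEST_PATH):
    """Mimic the shell expansion ${OS_TEST_PATH:-default}.

    The default is used when the variable is unset or empty.
    """
    environ = os.environ if environ is None else environ
    return environ.get("OS_TEST_PATH") or default


def env_with_test_path(test_path, environ=None):
    """Return a copy of the environment with OS_TEST_PATH set."""
    env = dict(os.environ if environ is None else environ)
    env["OS_TEST_PATH"] = test_path
    return env


# Example (not executed here): run only the unit tests.
# import subprocess
# subprocess.run(["python", "setup.py", "test"],
#                env=env_with_test_path(DEFAULT_TEST_PATH))
```

Passing an explicit environment to the subprocess keeps the choice of test path visible in the build script instead of depending on whatever the caller happens to export.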
*** spotz has joined #openstack-nova14:08
* mriedem continues the starlingx diff dive14:14
mdboothmriedem: Wow, brave.14:14
mriedemalready mostly done14:15
mdboothAny good nuggets?14:15
*** efried1 has joined #openstack-nova14:16
*** efried has quit IRC14:16
*** efried1 is now known as efried14:16
mriedemmdbooth: i'll post a summary with a link to my spreadsheet in the ML once i'm done14:16
mriedemthere are definitely some things we can upstream to nova14:16
mriedemgiven it was a snapshot based on pike, there are also several bug fixes they have backported so not really forks, just cherry picks we don't have in stable/pike14:17
mriedemand other bug fixes that weren't reported upstream14:17
mriedeme.g. https://review.openstack.org/#/c/588689/14:17
*** Bhujay has joined #openstack-nova14:18
mriedemand https://review.openstack.org/#/c/588657/14:18
mdboothNice14:18
mdboothAh, so you do at least have a git repo14:18
mdboothYou're not just staring at a massive unified diff14:18
mriedemi am14:18
mriedemhttps://github.com/starlingx-staging/stx-nova/commit/71acfeae0d1c59fdc77704527d763bd85a276f9a14:18
mriedemoh yeah i do have the actual git repo cloned14:19
mriedemso i can search things for context14:19
mriedembut i'm mostly starting with the diff14:19
mriedemi'm also starting to glaze over things toward the end here out of sheaer fatigue14:20
mriedem*sheer14:20
*** kklimonda_ has joined #openstack-nova14:23
*** Tahvok_ has joined #openstack-nova14:23
*** tetsuro_ has quit IRC14:25
*** kklimonda has quit IRC14:26
*** Tahvok has quit IRC14:26
*** baffle has quit IRC14:26
*** colby_ has quit IRC14:26
*** szaher has quit IRC14:26
*** bjolo has quit IRC14:26
*** dtantsur|brb has quit IRC14:26
*** Tahvok_ is now known as Tahvok14:26
*** kklimonda_ is now known as kklimonda14:26
gibidansmith: hi! I'm looking at some migration revert functional test case and it seems we are still hitting the legacy allocation handling code path when reverting the allocation on the destination14:26
gibidansmith: I think it is a bug in https://github.com/openstack/nova/blob/8688b25ca7379391cba28fab30b5a628957e673e/nova/compute/manager.py#L3952-L395514:27
gibidansmith: cn_uuid points to the destination host, but migration.uuid is only supposed to hold allocations on the source host14:27
gibidansmith: So the if condition can never be true14:27
*** eharney has quit IRC14:28
*** dtantsur has joined #openstack-nova14:28
gibidansmith: Did I miss something, or is it really a bug?14:28
*** Bhujay has quit IRC14:29
*** naichuans has quit IRC14:29
dansmithgibi: there's something tricky about this, let me re-load context14:30
*** dpawlik has quit IRC14:31
openstackgerritLee Yarwood proposed openstack/nova master: conductor: Recreate volume attachments during a reschedule  https://review.openstack.org/58707114:37
dansmithgibi: this is the context I was thinking of, although reading that review comment now doesn't seem to make sense: https://review.openstack.org/#/c/498948/9/nova/compute/manager.py@364814:40
dansmithgibi: however, I rarely add code like this without some test fail making me do it,14:41
dansmithso I'm not sure14:41
dansmithmaybe raise there and run all the test_servers tests to see if something fails?14:41
dansmithit's possible that it was there for some case which has since been removed, but.. worth a shot14:43
dansmithI was thinking that we actually call that on both the source and destination14:43
dansmithbecause of finish_resize_revert_on_destination(),14:43
dansmithbut i don't see it in there14:44
gibidansmith: I will check the resize-same-host scenario. a simple migrate revert func test now goes to the legacy path14:44
dansmithso I wonder if it was at some point14:44
dansmithgibi: we have those tests in tree though right?14:44
dansmithto verify the migration-holding allocations?14:44
gibidansmith: we have functional tests14:44
gibidansmith: for same host resize and migrate too14:44
dansmithgibi: right, which validate the non-legacy path works yeah?14:45
gibidansmith: which validate that allocations are handled properly in the non-legacy case. But I think those tests are still hitting the legacy path14:45
*** cdent has quit IRC14:45
gibilet me reproduce both with migrate and same host resize14:46
*** gbarros has quit IRC14:46
dansmithhmm14:46
dansmithso the tests aren't noticing that we're leaking an allocation or something?14:46
gibidansmith: I think we are eventually not leaking as the legacy codepath also handles the allocation properly14:47
dansmithokay I'm not sure how that could be, if it's not handling the migration uuid, but I'll wait for your analysis :)14:48
gibidansmith: https://github.com/openstack/nova/blob/8688b25ca7379391cba28fab30b5a628957e673e/nova/compute/manager.py#L3982 this will remove the allocation on the destination14:48
dansmithon the dest, yeah, but the source?14:48
openstackgerritMerged openstack/nova master: Reload oslo_context after calling monkey_patch()  https://review.openstack.org/58777214:48
gibidansmith: yeah, I'm confused now. I will do the reproduction14:49
mriedemthe allocation on the source should get moved by conductor right? or is this for the "old computes" scenario?14:50
dansmithmriedem: he thinks we're taking the old doubled path during some cleanups14:51
dansmithbut I'm not sure how we could, without failing those tests (or having big holes in them)14:51
dansmithwell, and his point is there is a clause in the delete_after_move function we can't explain14:51
dansmithwhich he thinks might be related14:52
*** gyee has joined #openstack-nova14:52
dansmithgibi: I added an exception to that clause and we _do_ hit it in the functional tests14:56
dansmithso that _is_ true somewhere :)14:56
dansmithduring revert resize14:56
dansmithin three tests14:57
*** dave-mccowan has joined #openstack-nova14:57
dansmithtest_migrate_revert, test_resize_revert, and test_resize_revert_reverse in ServerMovingTests14:57
*** moshele has quit IRC15:03
*** ircuser-1 has joined #openstack-nova15:04
openstackgerritMatt Riedemann proposed openstack/nova master: Define irrelevant-files for tempest-full-py3 job  https://review.openstack.org/58903915:05
*** luksky has quit IRC15:05
gibidansmith: did you put the raise in the legacy path?15:06
dansmithgibi: no, in the clause you think can't ever be true15:06
dansmithgibi: this: https://pastebin.com/GC17gjG415:07
gibidansmith: I put it in front of the legacy code and hit the same tests you hit15:08
gibidansmith: it doesn't make any sense15:09
*** amarao has quit IRC15:09
dansmithhrm.15:09
dansmithoh, well,15:09
dansmiththat can happen in the legit case,15:09
dansmithbecause we're just checking for allocs there. if we find none, we will fall through to the legacy path as expected15:10
*** eharney has joined #openstack-nova15:10
gibidansmith: ahh. When I questioned the existence of an if condition above, I meant the if allocs15:10
dansmithoh15:11
gibias we are querying allocations held by the migration.uuid on the _dest_ host15:11
gibibut that is always empty15:11
gibias migration holds allocation on the source host only15:11
dansmithright right15:12
dansmithexcept for same-host, but that should never hit this because of the source check above15:12
*** pcaruana has quit IRC15:12
dansmithI'll move the exception and re-run to confirm15:12
gibiI did that and got green results15:12
gibiso my theory is that in case of revert on dest we always see empty allocs and don't return so we hit the legacy path below15:13
dansmithokay15:14
dansmithhmm, which I guess isn't a problem,15:14
dansmithif we're deleting dest allocs and restoring source allocs,15:15
dansmithexcept for the case of if we miss the migration allocs we need to delete15:15
gibiit seems it doesn't cause a leak, but I have to look a bit closer15:15
dansmithgibi: still not sure where this is coming from though -- are you chasing a bug or writing a test or what?15:15
gibidansmith: trying to implement placement 1.28 support in the report client (consumer_generation) and writing consumer gen conflict tests, and I hit this legacy path that I thought I should not hit any more15:16
dansmithokay15:16
dansmithwe should be able to clean all this up at this point anyway I think15:17
dansmithperhaps we should make a point of ripping this all out for stein?15:17
*** rmart04 has quit IRC15:17
*** cdent has joined #openstack-nova15:17
dansmithanyone with an unconfirmed migration between pike and stein probably has other problems :)15:17
*** janki has joined #openstack-nova15:17
*** eharney has quit IRC15:18
gibidansmith: to avoid hitting the legacy and later rip it out I need to change https://github.com/openstack/nova/blob/8688b25ca7379391cba28fab30b5a628957e673e/nova/compute/manager.py#L3952-L3955 to query the migration allocation from the source host. Does that make sense to you?15:19
dansmithwell..15:19
dansmiththe point of that is to make the destination node not do anything at all (i.e. not run the legacy path) if the new-style allocations were used15:20
dansmithso, I guess, but you'll have to do a lookup of the source uuid I think in order to use that method15:21
gibidansmith: yes that is the goal. if that code sees the allocation held by the migration on the source host it returns and let the finish_revert_resize do the work15:21
dansmithwhich is a little bit icky15:21
*** eharney has joined #openstack-nova15:21
*** hoonetorg has quit IRC15:22
gibidansmith: ohh. Then we just assume there is no legacy migration any more and unconditionally do nothing on the dest host15:22
dansmithgibi: ah, just change that call to get_allocations_by_consumer() and if the migration has any allocations, then we must be doing new-style15:22
gibidansmith: good point, we can do that15:22
gibithen I will file a bug and this small change as a bugfix. then later when Stein is open I can remove the whole legacy path in a separate patch15:23
dansmithyar15:23
*** ebbex has joined #openstack-nova15:23
gibidansmith: thanks for the brainpower15:23
dansmithnp :)15:24
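The check dansmith suggests could be sketched roughly as follows. This is not nova's code: FakeReportClient is a stand-in, the single-argument signature of get_allocations_by_consumer() and the shape of its return value are assumptions, and the function name is hypothetical. The idea is only that if the migration record holds any allocations at all, regardless of which host, the move used new-style (migration-held) allocations and the destination can skip the legacy path.

```python
class FakeReportClient:
    """Stand-in for the placement report client (not nova's real class)."""

    def __init__(self, allocations_by_consumer):
        self._allocs = allocations_by_consumer

    def get_allocations_by_consumer(self, consumer_uuid):
        # Assumed return shape: {rp_uuid: {"resources": {...}}}
        return self._allocs.get(consumer_uuid, {})


def migration_uses_new_style_allocations(reportclient, migration_uuid):
    """True if the migration consumer holds allocations on any provider."""
    allocs = reportclient.get_allocations_by_consumer(migration_uuid)
    return bool(allocs)


# Hypothetical identifiers for illustration only.
client = FakeReportClient({
    "migration-1": {"source-rp": {"resources": {"VCPU": 1, "MEMORY_MB": 512}}},
})

print(migration_uses_new_style_allocations(client, "migration-1"))  # True
print(migration_uses_new_style_allocations(client, "migration-2"))  # False
```

Looking up by consumer alone avoids the extra query for the source node's UUID that dansmith called "a little bit icky".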
*** jiapei has joined #openstack-nova15:26
mdboothIncidentally, my pbr test problem was that I wasn't explicitly installing python-testrepository. This resulted in <unfathomable pbr-related weirdness> ultimately resulting in the tests attempting to run functional rather than just unit.15:30
* mdbooth would have preferred a failure with a message that python-testrepository wasn't installed.15:31
efriedjaypipes, mriedem, dansmith: What is it going to take to have sufficient confidence in any of the various attempts to reduce the number of redundant placement calls per periodic to actually merge code?15:35
efriede.g. these two are quite similar15:36
efriedhttps://review.openstack.org/#/c/588091/  (removes the "first" _update)15:36
efriedhttps://review.openstack.org/#/c/587050/  (removes the "last" _update)15:36
efriedand are both showing green across the board test-wise.15:36
jaypipesefried: which one do you prefer?15:37
efriedjaypipes: It makes little difference to me. The first one removes more lines of code. At the moment it's sitting on top of the reshaper series, but could be extracted easily.15:38
mriedemidk15:42
mriedemstuff like that likely won't show a side effect until we've merged it and run it awhile and found some weird failures,15:42
mriedemwhich will likely be non-trivial to debug15:42
efriedjust so.15:42
mriedemso definitely not rocky15:42
*** jiapei_ has joined #openstack-nova15:42
*** jiapei has left #openstack-nova15:43
mriedemand i'd expect what you'd see are weird scheduling failures15:43
mriedemdue to some timing issue15:43
dansmithand maybe not at all at gate-level scale15:43
mriedemwould be nice to have some company with a stress test lab kick around either of those to see what falls out15:44
openstackgerritMatt Riedemann proposed openstack/nova master: Delete instance_group_member records from API DB during archive  https://review.openstack.org/58894315:44
mriedembut i'm pretty sure at this point in my time working on openstack, no company has a stress test lab that shares results publicly :)15:45
mriedemif we don't see it in the gate,15:45
mriedemwe'll see it 18-24 months from now when someone actually upgrades to use it and finds problems15:45
efriedright15:45
jaypipesmriedem: 36-64 months from now.15:46
mriedemwell, cern would be the first to hit it probably15:46
efriedBut even that is a big "maybe".15:46
mriedemi also haven't had the time to read cdent's write up on this either yet15:46
efriedmriedem: TL;DR we're calling _update twice every periodic.15:47
mriedemyeah i knew that much :)15:47
efriedwhich calls all the placement things to refresh the cache etc.15:47
efriedthat's really all there is to it.15:47
cdentmriedem: there's not much more than what efried says, just more words to indicate the code path15:47
efriedDo we need both calls for some reason?15:47
mriedemalso https://review.openstack.org/#/c/520024/15:48
*** jaosorior has quit IRC15:48
mriedemgiven ^ you should probably see if the ovh gang is interested in testing out either of those changes15:48
mriedemwhen did we start doing the 2 calls?15:48
mriedemb/c ovh might not have a region running that yet15:48
cdent>18 months ago15:49
cdentit was there when I did the first version of the post15:49
*** jaosorior has joined #openstack-nova15:49
efriedoh look, https://review.openstack.org/#/c/520024 is exactly the same as https://review.openstack.org/#/c/588091/15:50
mriedemlooking at ^ from ovh they opted to leave in the _update call at the end15:50
mriedemthe +1 from minho on that was also because they had done a duplicate of the same patch15:50
openstackgerritEric Fried proposed openstack/nova master: Update resources once in update_available_resource  https://review.openstack.org/52002415:51
*** adrianc has quit IRC15:52
mriedemok i have voted15:52
openstackgerritChen proposed openstack/nova master: Trivial fix on migration doc  https://review.openstack.org/58902815:52
mriedemif it's a coin toss, go with what ovh is already using15:52
mriedemand drop the duplicate change from efried and coalesce on the ovh patch15:53
efriedNote that there's also https://review.openstack.org/#/c/588094/ which is failing tox (expected because I haven't updated those tests yet) but also live migration consistently.15:53
efriedmriedem: The coalesce is a no-op; the extras in mine are only there because of the reshaper series. I've abandoned it.15:53
mriedemmaciejjozefczyk: are you guys running with this in production now? https://review.openstack.org/#/c/520024/15:54
mriedemmaciejjozefczyk: any side effects or issues with that patch?15:54
*** dklyle has joined #openstack-nova15:57
*** efried has quit IRC15:58
mriedemjaypipes: has cfriesen talked with you about how they account for hosting shared and pinned cpus on the same host by making VCPU inventory a fraction?16:00
mriedemi assume it would have come up during https://review.openstack.org/#/c/555081/16:00
jaypipesmriedem: nope.16:01
jaypipesmriedem: and I'm 100% against making amount a non-integer value.16:01
cdentIt was briefly glossed over in Dublin, but the response then was "placement can't do that"16:03
cdentor maybe s/can't/won't/16:04
dansmithI'm also 100% against that :)16:04
mriedemi'm not suggesting we do that,16:04
mriedembut i thought there was an alternative way to model it via nested providers, or some other kind of inventory16:04
mriedemPCPUs?16:04
mriedemlooks like that is the proposal in the spec16:05
dansmithwe talked about making dedicated cpus a different inventory item in placement16:05
dansmithif that's what you mean16:05
mriedemyes16:05
mriedemthere is just a shit load of code in starlingx to deal with shared and pinned cpus on the same host,16:06
mriedemand i'm mostly at the point of glossing over all of it and just saying, we have a spec for this16:06
mriedemat the end of my diff dive i want to be able to give a sort of tl;dr on the major changes16:07
mriedemwhich at this point is i think just shared/pinned on same host, live resize (cpu only), and l3 cache partitioning16:07
mriedemthere are a lot of other things, but those are the big ones that affect the entire stack16:07
mriedemif you're all lucky, i might even make some m'fing charts!16:08
*** tssurya has quit IRC16:09
*** efried has joined #openstack-nova16:10
*** jpena is now known as jpena|off16:12
*** janki has quit IRC16:15
mriedemsurprisingly i don't see the tpm stuff in here16:17
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Add descriptions for rebuild  https://review.openstack.org/58893116:19
*** takashin has left #openstack-nova16:20
mdboothlyarwood: https://review.openstack.org/#/c/587071/ I think that's the wrong build16:21
mdboothlyarwood: I won't be bowled over with surprise if I'm wrong, though.16:22
*** kaliya has quit IRC16:27
mdboothlyarwood: Hmm, looks like compute calls build_instances directly on reschedule? So... I could be wrong.16:29
*** imacdonn has quit IRC16:38
*** imacdonn has joined #openstack-nova16:38
*** betherly_ is now known as betherly-afk16:40
*** SamYaple has joined #openstack-nova16:44
mdboothmriedem: I assume it's not possible to 'demote' an attachment to a reservation?16:46
mriedemfirst, that sounds borderline genocidal16:50
mriedemsecond, i don't know what that means16:50
mriedeman empty volume attachment reserves the volume16:50
mriedemactually i think even a volume attachment that has a host connector is still not considering the volume as in-use,16:50
mriedemthat's why we have to call the 'complete' action on the attachment16:51
*** evrardjp has quit IRC16:51
mriedemto move the volume to in-use16:51
mdboothmriedem: Right. I'm just thinking of lyarwood's bug. The issue, IIUC, is that we create a 'reservation', which is an empty attachment, then the compute turns it into a real attachment, right?16:51
mdboothAnd the only way to get rid of that is to delete it.16:51
mriedemdepends on where we fail,16:51
mriedemif we didn't get to the point of calling the complete action on the attachment, the volume is not in-use16:52
mdboothSure, but if it's after we hydrated the attachment, this is how to rollback.16:52
*** hongbin has joined #openstack-nova16:52
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider sane  https://review.openstack.org/58459816:52
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459916:52
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464816:52
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503416:52
openstackgerritEric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504916:52
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623616:52
mriedemi believe i talked to myself at length on an earlier PS on the review about how we could just simply not delete the attachment before rescheduling16:52
mdboothI'm pretty sure the answer's no, but I'm just wondering if there's any direct opposite to the compute's action which would cause the attachment to go back to being just a 'reservation'16:53
mdboothOk, sounds like you already did that dance.16:53
mriedemhttps://review.openstack.org/#/c/587071/3/nova/compute/manager.py@163116:54
mriedemtl;dr is i think this is no worse than what happened during reschedule *before* the attachments stuff,16:55
mriedemwe could go either way and there are pros/cons both ways16:55
mriedeme.g. if compute doesn't delete the attachment, conductor would have to when we exhausted retries16:56
*** gbarros has joined #openstack-nova16:57
mdboothmriedem: Nice writeup, thanks.16:58
*** luksky has joined #openstack-nova16:59
mriedemif anyone knows about or cares about uefi instances, this might be a bug fix we need in nova https://github.com/starlingx-staging/stx-nova/commit/71acfeae0d1c59fdc77704527d763bd85a276f9a#diff-f4019782d93a196a0d026479e6aa61b1R851417:00
mriedemah looky here https://bugs.launchpad.net/nova/+bug/178512317:00
openstackLaunchpad bug 1785123 in OpenStack Compute (nova) "UEFI NVRAM lost on cold migration or resize" [Undecided,New]17:00
*** hemna_ has joined #openstack-nova17:06
*** rmart04 has joined #openstack-nova17:11
melwitt.17:23
mdboothmriedem: I wonder if live migration handles that.17:24
mdboothmriedem: Pretty sure we care, btw.17:24
*** tssurya has joined #openstack-nova17:30
*** evrardjp has joined #openstack-nova17:41
*** vivsoni_ has quit IRC17:44
*** vivsoni_ has joined #openstack-nova17:44
*** psachin has quit IRC17:45
*** moshele has joined #openstack-nova17:45
*** jiapei_ has quit IRC17:51
mriedemmdbooth: they also have lvm thin pools; i looked up the patch to add that to nova and you nacked it saying we already have sparse lvms,17:53
mriedembut sparse_logical_volumes was deprecated in rocky17:53
* mriedem makes trombone sound17:53
*** dtantsur is now known as dtantsur|afk17:53
mriedemthey have quite a bit of support for lvm, not sure if that's just for their 1-2 node configs, and then use ceph for their 100 node deployment or what17:55
mriedemand i'm done with the diff \o/17:57
penickI saw the diff and backed away18:04
* penick moonwalks away from the diff18:04
*** tbachman has quit IRC18:08
*** tbachman has joined #openstack-nova18:09
openstackgerritRadoslav Gerganov proposed openstack/nova stable/queens: Reload oslo_context after calling monkey_patch()  https://review.openstack.org/58924918:16
openstackgerritRadoslav Gerganov proposed openstack/nova stable/queens: Reload oslo_context after calling monkey_patch()  https://review.openstack.org/58924918:17
openstackgerritRadoslav Gerganov proposed openstack/nova stable/pike: Reload oslo_context after calling monkey_patch()  https://review.openstack.org/58925118:18
*** moshele has quit IRC18:19
*** nicolasbock has quit IRC18:23
*** nicolasbock has joined #openstack-nova18:24
mriedempenick: here you go https://docs.google.com/spreadsheets/d/1ugp1FVWMsu4x3KgrmPf7HGX8Mh1n80v-KVzweSDZunU/edit?usp=sharing18:32
penickdamn, that's handy. Thanks!18:36
mriedemnow i'll figure out how to digest that a bit for some simple charts18:38
*** moshele has joined #openstack-nova18:49
*** gbarros has quit IRC18:51
*** gbarros has joined #openstack-nova18:54
*** moshele has quit IRC18:58
*** rmart04 has quit IRC18:59
*** hoonetorg has joined #openstack-nova19:17
*** _ix has joined #openstack-nova19:45
openstackgerritmelanie witt proposed openstack/nova stable/ocata: [stable only] Add functional regression test for bug 1783613  https://review.openstack.org/58841619:59
openstackbug 1783613 in OpenStack Compute (nova) ocata "[ocata only] quota usage not decremented during boot/delete race" [Undecided,In progress] https://launchpad.net/bugs/1783613 - Assigned to melanie witt (melwitt)19:59
openstackgerritmelanie witt proposed openstack/nova stable/ocata: [stable only] Handle quota usage during create/delete races  https://review.openstack.org/58241319:59
*** dtroyer has joined #openstack-nova20:02
*** moshele has joined #openstack-nova20:05
dansmithmriedem: I dunno when this ^ regressed exactly, but it seems like a minor enough fix to be worth getting into the older releases that suffer from it20:05
dansmithgiven where most people are, and to make sure anyone holding onto a non-counting-quotas release has a snowball's chance20:05
mriedemok20:07
openstackgerritMerged openstack/nova master: Increase max_unit in placement test fixture  https://review.openstack.org/58815820:21
*** moshele has quit IRC20:29
*** rtjure has joined #openstack-nova20:31
*** tbachman_ has joined #openstack-nova20:34
*** tbachman has quit IRC20:36
*** tbachman_ has quit IRC20:39
*** eharney has quit IRC20:39
*** tbachman has joined #openstack-nova20:46
openstackgerritMerged openstack/nova master: [placement] Debug log per granular request group  https://review.openstack.org/58835020:46
*** moshele has joined #openstack-nova20:46
*** tbachman has quit IRC21:02
openstackgerritmelanie witt proposed openstack/nova stable/ocata: [stable only] Add functional regression test for bug 1783613  https://review.openstack.org/58841621:02
openstackbug 1783613 in OpenStack Compute (nova) ocata "[ocata only] quota usage not decremented during boot/delete race" [Undecided,In progress] https://launchpad.net/bugs/1783613 - Assigned to melanie witt (melwitt)21:02
openstackgerritmelanie witt proposed openstack/nova stable/ocata: [stable only] Handle quota usage during create/delete races  https://review.openstack.org/58241321:02
*** tbachman has joined #openstack-nova21:04
*** edmondsw has quit IRC21:15
*** harlowja has joined #openstack-nova21:21
*** moshele has quit IRC21:27
*** tssurya has quit IRC21:31
*** cdent has quit IRC21:40
*** slagle has quit IRC21:53
*** rcernin has joined #openstack-nova22:15
jaypipesmriedem: looks like you had to spend a lot of time going through that starling-x diff. :(22:18
*** avolkov has quit IRC22:24
*** nicolasbock has quit IRC22:25
*** luksky has quit IRC22:25
*** Kevin_Zheng has quit IRC22:28
mriedemdefinitely > 022:29
mriedemi put it off for a couple of weeks and then once i got going it was actually kind of interesting22:33
mriedemalthough i'll say i glossed over a ton of the l3 cache and shared/pinned floating cpus and scaling (live resize) stuff22:33
mriedembecause it's just a ton of code22:33
mriedemplus it's all super low-level in the hardware.py module which i avoid at all costs22:33
melwittnice job on the spreadsheet, lots of interesting info there22:42
*** bitskrie1 has joined #openstack-nova22:43
*** mvkr has joined #openstack-nova22:45
mriedemheh, maybe don't do this https://bugs.launchpad.net/nova/+bug/178519322:51
openstackLaunchpad bug 1785193 in OpenStack Compute (nova) "changing a node's cell results in duplicate hypervisors" [Undecided,New] - Assigned to Chen (chenn2)22:51
mriedem"it hurts when i do x." "then don't do x"22:52
melwittwe need a safety fence22:52
melwittoh, they did it by editing nova.conf? heh. I had assumed it was through a nova-manage command of some sort22:53
openstackgerritmelanie witt proposed openstack/nova master: Add a prelude release note for the 18.0.0 Rocky GA  https://review.openstack.org/58930322:56
* melwitt will bbl22:59
mriedemwell, this isn't really a cells thing23:00
mriedemchanging the db config at any point randomly would have caused weird issues23:01
melwittyeah, I hadn't yet read the bug and assumed a thing had occurred from use of a nova-manage command23:02
melwittwhen I said, "we need a safety fence"23:02
mriedemnova is not enterprise ready23:06
melwitt:)23:07
*** hongbin has quit IRC23:19
*** tbachman has quit IRC23:21
*** hemna_ has quit IRC23:36
*** tetsuro_ has joined #openstack-nova23:48
