Tuesday, 2018-10-09

openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (8)  https://review.openstack.org/57531100:07
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (9)  https://review.openstack.org/57558100:07
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (10)  https://review.openstack.org/57601700:07
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (11)  https://review.openstack.org/57601800:18
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (12)  https://review.openstack.org/57601900:18
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (13)  https://review.openstack.org/57602000:32
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (14)  https://review.openstack.org/57602700:38
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (15)  https://review.openstack.org/57603100:38
openstackgerritliuming proposed openstack/nova master: Deletes evacuated instance files when source host is ok  https://review.openstack.org/60598700:41
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Remove a description in servers-actions.inc  https://review.openstack.org/60879600:44
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (16)  https://review.openstack.org/57629900:49
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (17)  https://review.openstack.org/57634400:49
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (18)  https://review.openstack.org/57667300:57
openstackgerritMerged openstack/nova master: Fix nits in choices documentation  https://review.openstack.org/60831001:12
naichuansbauzas: Thanks.01:14
openstackgerritlei zhang proposed openstack/nova master: Remove useless TODO section  https://review.openstack.org/60880201:44
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (19)  https://review.openstack.org/57667602:05
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (20)  https://review.openstack.org/57668902:05
openstackgerritTakashi NATSUME proposed openstack/python-novaclient master: Replace MB with MiB  https://review.openstack.org/60880702:21
openstackgerritTakashi NATSUME proposed openstack/nova stable/rocky: Remove unnecessary redirect  https://review.openstack.org/60740002:28
openstackgerritBrin Zhang proposed openstack/nova master: Add compute version 36 to support ``volume_type``  https://review.openstack.org/57936002:34
openstackgerritMerged openstack/nova master: Remove redundant irrelevant-files from neutron-tempest-linuxbridge  https://review.openstack.org/60698902:51
openstackgerritTakashi NATSUME proposed openstack/python-novaclient master: doc: Start using openstackdoctheme's extlink extension  https://review.openstack.org/60882903:23
openstackgerritBrin Zhang proposed openstack/nova master: Add compute version 36 to support ``volume_type``  https://review.openstack.org/57936004:48
alex_xujaypipes: sorry, I have one more question on it https://review.openstack.org/#/c/555081/20, it's my last question I think; hope the never-ending spec review isn't getting annoying04:51
openstackgerritNaichuan Sun proposed openstack/nova master: xenapi(N-R-P): Add API to support vgpu resource provider create  https://review.openstack.org/52031305:26
openstackgerritNaichuan Sun proposed openstack/nova master: xenapi(N-R-P):Get vgpu info from `allocations`  https://review.openstack.org/52171705:28
openstackgerritNaichuan Sun proposed openstack/nova master: xenapi(N-R-P): support compute node resource provider update  https://review.openstack.org/52104105:28
openstackgerritNaichuan Sun proposed openstack/nova master: os-xenapi(n-rp): add traits for vgpu n-rp  https://review.openstack.org/60426905:28
openstackgerritTakashi NATSUME proposed openstack/nova master: api-ref: Replace non UUID string with UUID  https://review.openstack.org/60885406:02
openstackgerritTakashi NATSUME proposed openstack/nova master: Transform volume.usage notification  https://review.openstack.org/58034506:28
openstackgerritTakashi NATSUME proposed openstack/nova master: Transform compute_task notifications  https://review.openstack.org/48262906:29
openstackgerritOpenStack Proposal Bot proposed openstack/nova stable/rocky: Imported Translations from Zanata  https://review.openstack.org/60426006:30
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (21)  https://review.openstack.org/57670907:29
openstackgerritTakashi NATSUME proposed openstack/nova master: Remove mox in unit/network/test_neutronv2.py (22)  https://review.openstack.org/57671207:29
bauzasgood morning nova07:36
gibigood morning07:57
gibibauzas, mriedem, jaypipes, efried: I saw the discussion about force live migrate in the scheduler meeting. I think I will post a mail to the ML with a summary of the different pieces and possibilities07:58
bauzasok cool07:58
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif master: Remove IPTools deprecated implementation  https://review.openstack.org/60542207:59
openstackgerritJan Gutter proposed openstack/os-vif master: Add support for generic representors  https://review.openstack.org/60869308:02
mnaserfwiw08:54
mnaserhttp://logs.openstack.org/15/608315/4/check/gpu-test/7243464/job-output.txt.gz08:54
mnaserhttps://review.openstack.org/#/c/608315/08:54
mnaseryou should be able to do tests with access to a k80 gpu08:54
mnaserwith nested virt too08:56
mnasercc sean-k-mooney ^08:56
openstackgerritZhenyu Zheng proposed openstack/nova-specs master: Detach and attach boot volumes - Stein  https://review.openstack.org/60062809:24
openstackgerritBrin Zhang proposed openstack/nova master: Add compute API version for when a ``volume_type`` is requested  https://review.openstack.org/60557309:25
jaypipesgibi: cool with me. thank you!09:36
gibibauzas, mriedem, jaypipes, efried: I've posted the mail. Sorry it turned out as a long one. http://lists.openstack.org/pipermail/openstack-dev/2018-October/135551.html09:42
jaypipesgibi: :) no worries, it's a complicated subject.09:43
sean-k-mooneymnaser: oh cool. bauzas should be pleased. thanks :) now we just need to figure out how to use devstack to deploy with vgpus09:46
sean-k-mooneybauzas: do you have a local.conf/devstack plugin that automates the setup09:46
mnaseryeah feel free to hack away at it09:47
sean-k-mooneymnaser: actually looking at https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html nvidia may have locked out the k80s... we can still try them09:49
mnaser#justnvidiathings09:50
sean-k-mooneymnaser: you know im really looking forward to the point where someone implements vgpu support in the nouveau09:51
sean-k-mooneydriver so there are no nvidia locks on the hardware09:51
sean-k-mooneythat said i personally use the nvidia binary driver because of the performance09:52
mnaseri'd say this is where things go beyond my knowledge09:54
sean-k-mooneynouveau is the opensource linux driver for nvidia gpus. the official binary blob gives you about 10-15% better performance in games and i think it's required to use some of the nvidia-only techs like hairworks.  for vgpus, instead of the normal driver you run their grid driver, which is designed for their datacenter gpus09:56
sean-k-mooneythat said im 98% certain that if you could remove the sku check it would likely work on their desktop and workstation gpus, but nvidia want to charge for the privilege of using virtualisation with their gpus09:57
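For context on the devstack question above: once the guest driver is sorted out, the manual part of a vGPU setup on the nova side is mostly configuration. A minimal local.conf sketch, assuming a vGPU-capable host with the vendor GRID driver already installed; the vGPU type name below is host- and driver-specific and only an example:

    [[post-config|$NOVA_CONF]]
    [devices]
    enabled_vgpu_types = nvidia-35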
openstackgerritTetsuro Nakamura proposed openstack/nova stable/rocky: Add alloc cands test with nested and aggregates  https://review.openstack.org/60745410:02
openstackgerritTetsuro Nakamura proposed openstack/nova stable/rocky: Fix aggregate members in nested alloc candidates  https://review.openstack.org/60890310:02
openstackgerritMerged openstack/nova master: conf: Gather 'live_migration_scheme', 'live_migration_inbound_addr'  https://review.openstack.org/45657210:28
nehaalhat_Hi, can any one help me to merge this patch: https://review.openstack.org/#/c/581218/10:29
openstackgerritBrin Zhang proposed openstack/nova master: Add microversion 2.67 to support volume_type  https://review.openstack.org/60639811:32
openstackgerritMerged openstack/nova master: Move test.nested to utils.nested_contexts  https://review.openstack.org/60841611:46
pooja_jadhavHi team, can anyone help me with https://github.com/openstack/nova/blob/85b36cd2f82ccd740057c1bee08fc722209604ab/nova/tests/functional/api_sample_tests/test_simple_tenant_usage.py#L85-L93? When we run the test "test_get_tenants_usage" and pass instance_uuid_1 in the query, then check the instances in the simple tenant usage controller, we see instance-2's object data in the12:41
pooja_jadhavinstances list. If we pass instance-2 in the query, we see instance-3 in the instances. So how is it actually working?12:41
openstackgerritLucian Petrut proposed openstack/nova master: Fix os-simple-tenant-usage result order  https://review.openstack.org/60868512:45
bauzasmnaser: sean-k-mooney: sorry was afk12:49
bauzasmnaser: thanks for the proposal, but unfortunately, AFAIK, k80 devices aren't supported by nvidia for vGPUs12:50
* bauzas raises fist 12:50
bauzasgibi: saw your thread, I need proper time to read it and reply to it12:51
* bauzas is currently stuck in downstream universe12:51
gibibauzas: sure, it needs time. The reason for the mail was to summarize the problem, as it is pretty hard to solve it consistently without seeing every corner of it12:55
stephenfindansmith: What's the by-service approach you refer to here? https://review.openstack.org/#/c/608703/ (link to a spec/commit is good)13:02
dansmithstephenfin: did you see the link to the bug?13:32
dansmithstephenfin: regardless, as I said, I think mentioning the discover step (whether by service or regular) in the devstack docs is the right thing to do13:32
stephenfinYup, that's the plan. Didn't click the bug link though. Will do now13:33
stephenfinWell, soon as I'm done with downstream fun13:33
efriedgibi: Can you help me understand some basics of selecting a destination host during evac/migrate?13:47
efriedor dansmith bauzas13:51
efriedYou can specify a host without the force flag and we'll run GET /a_c, right?13:51
dansmithefried: I'm not sure what you're asking.. it's not really any different13:51
dansmithyeah, IIRC13:51
dansmithonly the force flag makes us totally skip I think13:51
efriedAnd then what happens if the GET /a_c returns no candidates for the requested host?13:51
efriedDo we fail or do we select a different host?13:51
dansmithwe should fail13:53
dansmithlemme find a thread to pull13:53
dansmithhttps://github.com/openstack/nova/blob/master/nova/conductor/manager.py#L983-L99513:55
dansmithschedule with a single host in the destination field.. if we get back novalidhost, we error the migration13:56
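The conductor code dansmith links boils down to the behavior below. A rough sketch, with hypothetical names rather than nova's actual ones: without the force flag a requested destination is handed to the scheduler as the only candidate (so GET /allocation_candidates and the filters still run), and NoValidHost errors the migration instead of falling back to another host.

    class NoValidHost(Exception):
        pass

    def schedule_move(scheduler, request_spec, migration, requested_host=None):
        if requested_host is not None:
            # Restrict scheduling to the single requested destination;
            # filtering and the placement claim still happen.
            request_spec.requested_destination = requested_host
        try:
            return scheduler.select_destinations(request_spec)
        except NoValidHost:
            # The requested host failed the filters or the claim:
            # error the migration rather than choosing another host.
            migration.status = 'error'
            migration.save()
            raise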
*** adrianc has joined #openstack-nova13:56
efriedSo what use was the force flag ever supposed to be? Literally an optimization to avoid running some code, but behaviorally/algorithmically no difference?13:58
efriedBecause in theory since we're using placement now, which is fast, that optimization buys us almost nothing. So if ^ is true, the force flag is basically obsolete anyway.14:00
mriedemi can field this one...14:02
mriedemefried: before the force flag, specifying a host at all bypassed the scheduler,14:02
mriedemthen a microversion was added which made passing a host go through the scheduler for validation, but apparently at least one person thought we should preserve the ability to bypass the scheduler, so the force flag was added to do that backdoor14:02
*** ttsiouts has joined #openstack-nova14:02
efriedWhat was the motivation to "bypass the scheduler"? Because it was inefficient?14:03
mriedemidk, i'm assuming to oversubscribe a host just to move things around14:03
mriedemat least temporarily14:03
efriedis oversubscribe even possible at this point?14:04
efried(Outside of allocation ratio, which doesn't count)14:04
mriedemi don't think so, at least not for vcpu/ram/disk14:04
mriedemas i noted in gibi's patch - we broke that in pike when force still goes through claiming resource allocations in conductor14:04
sean-k-mooneyefried: if it bypassed the scheduler it probably bypassed the placement claim too but i have never checked that14:04
mriedemsean-k-mooney: incorrect14:05
efriedYeah, that's the point. Since we started claiming from placement, you can't oversubscribe.14:05
mriedemefried: in the old days, before placement, if you bypass the scheduler, conductor would not send limits down to the RT so it wouldn't fail the limits check on the resource claim14:05
sean-k-mooneymriedem: was that unintentional, however? as you just said, we "broke" that in pike14:05
efriedhas anyone screamed about that breakage?14:05
efriedOr do we still not have enough serious operators on pike yet? :P14:06
mriedemthere are like 3 people i know on >= pike but no...14:06
edleafeefried: one reason for the force option was that admins wanted to be able to say "I know what I'm doing, dammit!". It wasn't about code efficiency14:06
efriededleafe: But IIUC, the destination host is observed regardless.14:07
mriedemobserved?14:07
efriedmeaning you either get on the suggested host or you die14:07
mriedemcorrect14:07
efriedI'm not sure how we would "fix" the oversubscribe thing at this point, without adding a placement feature to allow it.14:07
mriedemi don't think adding features to support shooting yourself is something we want to do at this point14:08
efriedwhich I doubt we want to do14:08
efried"Hey placement, you know that one thing you're supposed to be designed to do? Yeah, don't do that."14:08
edleafeefried: it isn't just oversubscription that would cause the scheduler to reject it. Things like affinity, etc., would also cause it to fail14:08
mriedemnote that even with forced live migration, conductor still runs some checks that could cause us to reject the host14:08
mriedemedleafe: nope14:09
efriedyou mean besides the allocation claim?14:09
mriedemwe don't do any affinity checks outside of the scheduler for live migration14:09
edleafemriedem: without the force option, we do14:09
mriedemin the scheduler14:09
mriedembut that's more to my point - force is bad14:10
efriedI think what I'm getting at is, force isn't so much bad as... obsolete?14:10
mriedemyou could screw up AZs too14:10
efriedLike, it doesn't do anything useful anymore.14:10
edleafewell, yeah, but admins wanted to be able to override14:10
mriedemefried: agree14:10
edleafeI'm not saying it's right14:10
mriedemefried: it made a bit more sense pre-placement14:10
edleafeJust that that was the pushback at the time14:10
efriedbecause oversubscribe14:10
efriedOkay, thanks, this helps validate my response to gibi's thread.14:11
openstackgerritMerged openstack/os-vif master: Remove IPTools deprecated implementation  https://review.openstack.org/60542214:13
openstackgerritJan Gutter proposed openstack/os-vif master: Extend port profiles with datapath offload type  https://review.openstack.org/57208114:15
belmorei_hi. I'm having an issue with Ironic allocations in placement14:17
belmorei_We define the Ironic flavors with resources:VCPU=0, ... as documented in "https://docs.openstack.org/ironic/queens/install/configure-nova-flavors.html".14:18
belmorei_What I see is that allocations are only done for the resource class, and not vcpus, ram, disk. I imagine this is intentional?14:18
efriedwhat do you mean, only done for the resource class?14:20
belmorei_the allocation for the resource_provider is only the ironic resource_class14:20
dansmithbelmorei_: yes, expected14:20
belmorei_great. Let me describe the issue14:21
belmorei_The problem is that we enable the ironic flavors per project (these projects are only for baremetal) (projects are mapped to a cell that only has the baremetal nodes)14:24
belmorei_However, users also have access to the "default" flavors that are for VMs in these projects (we can't remove public flavors)14:24
belmorei_If a user makes the mistake of using a default flavor in these projects (a flavor for VMs), placement can return already-in-use baremetal nodes because they have cpu, ram, ...14:26
dansmithbelmorei_: the baremetal nodes should be exposing no cpu,ram,etc inventory14:27
dansmithbelmorei_: they should expose one inventory item of the baremetal resource class and nothing else14:27
belmorei_dansmith: Good to know. I would expect that, but it's not happening. Maybe a conf issue on my side14:28
dansmithbelmorei_: yeah, I'm not sure how that could be happening anymore.. jroll dtantsur ?14:29
* dtantsur is on a meeting, but will check soon14:29
mriedemhttps://github.com/openstack/nova/blob/stable/queens/nova/virt/ironic/driver.py#L79014:29
mriedemwe still reported cpu/ram/disk inventory for ironic nodes in queens14:29
mriedemhttps://github.com/openstack/nova/commit/a985e34cdeef777fe7ff943e363a5f1be6d991b7#diff-1e4547e2c3b36b8f836d8f851f85fde7 removed that in stein14:30
dansmithmriedem: um, we should have had a cutover so we're not exposing both right?14:31
dansmithI thought that was like pike14:31
dansmithcomment there says zero in pike14:31
jrolldansmith: we didn't do the cutover so people could migrate their flavor14:31
dansmithjroll: right, but in queens?14:32
jrollthis is the first I've heard of this, fwiw, though it makes sense14:32
jrolldansmith: people were scared to remove it14:32
dansmithhmm14:32
jrolla quick workaround would likely be to set 0 for cpu/ram in ironic node.properties14:33
jrolland then it will report 014:33
dansmithI thought this was long since sorted14:33
jrollyeah, I thought it just worked as well14:33
dansmithwhat was the plan during the overlap to avoid exposing the old values once flavors were migrated?14:33
dansmithoverride properties in ironic?14:33
jrollI don't remember, sorry14:33
belmorei_jroll: that means updating all ironic nodes... can't do it.14:34
belmorei_How about if I remove the "resources:VCPU=0", ... from the flavors?14:34
mriedemyour bm flavors aren't the problem14:34
mriedemthe problem is the vm flavors are getting scheduled to the ironic nodes right?14:35
dansmithright14:35
jrollthat would prevent VMs from landing on an active ironic node, but not an inactive one14:35
mriedemand that's because the bm nodes are reporting ram/cpu/disk inventory14:35
belmorei_jroll: true14:35
belmorei_ok, so for me the best is to backport the commit mentioned by mriedem14:39
mriedembelmorei_: you might be able to just patch https://github.com/openstack/nova/blob/stable/queens/nova/virt/ironic/driver.py#L797 to be 014:39
belmorei_yeah, thanks14:39
mriedembelmorei_: i'm guessing that's not going to backport cleanly to queens14:39
bauzasgibi: thanks for the email recap14:39
belmorei_any plan to still include this in rocky?14:39
bauzasgibi: I now understand the problem :)14:40
belmorei_because for me it is a bug14:40
gibibauzas: :)14:40
mriedembelmorei_: i'm not sure how others would feel about a stable/queens / rocky only config option to disable reporting vcpu/ram/disk inventory for ironic14:40
mriedema workarounds option14:40
gibimriedem, efried, edleafe: force flag means I want to move the server so desperately that I don't care about any safeties14:41
gibiat least for me it means that14:41
mriedemdefault to disabled, but if enabled, report 0 total vcpu/ram/disk inventory for ironic nodes14:41
mriedemdansmith: ^ how do you feel about a config option backdoor for belmorei_'s case?14:41
mriedemwe're not going to backport https://github.com/openstack/nova/commit/a985e34cdeef777fe7ff943e363a5f1be6d991b714:42
dansmithmriedem: I'm for it because I'm not sure how we're not effing people over with this right now14:42
belmorei_mriedem: this is easy for me to patch. Having a conf backdoor option doesn't seem good14:42
dansmithI'm guessing tripleo people don't care because undercloud/overcloud14:42
mriedembelmorei_: a workarounds config option provides a generic solution for *everyone* with this problem14:42
belmorei_mriedem: true14:43
mriedemessentially it means enabling it says you've done your ironic instance flavor migration and you're good to go14:43
dansmithright14:43
mriedemand we have a nova-status check for that as well14:43
mriedembelmorei_: how about you report a bug to start and we can go from there?14:43
mriedemjroll: btw i do remember something breaking after we removed that code in stein, but i can't remember what off the top of my head14:44
mriedemwhich is why i wanted to hold off on removing it right before the rocky GA14:44
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Don't emit warning when ironic properties are zero  https://review.openstack.org/60857314:46
jrollmriedem: this? https://bugs.launchpad.net/nova/+bug/178750914:46
openstackLaunchpad bug 1787509 in OpenStack Compute (nova) "Baremetal filters and default filters cannot be used simultaneously in the same nova" [Undecided,Won't fix]14:46
jrollor maybe https://bugs.launchpad.net/tripleo/+bug/1787910/14:46
openstackLaunchpad bug 1787910 in OpenStack Compute (nova) rocky "OVB overcloud deploy fails on nova placement errors" [High,Fix committed] - Assigned to Matt Riedemann (mriedem)14:46
belmorei_mriedem I will create the bug report14:47
mriedemmaybe14:47
belmorei_mriedem dansmith jroll thanks for the help14:47
mriedemjroll: around the time we deprecated the core/ram/disk filters14:47
jrollmriedem: yeah I don't remember what either, just going by irc logs14:47
mriedemyeah https://bugs.launchpad.net/tripleo/+bug/178791014:47
openstackLaunchpad bug 1787910 in OpenStack Compute (nova) rocky "OVB overcloud deploy fails on nova placement errors" [High,Fix committed] - Assigned to Matt Riedemann (mriedem)14:47
dansmithmriedem: so they still had ram required in those flavors and failed when ironic stopped reporting ram inventory14:52
dansmithyeah?14:52
dansmithwe have to cut over at some point and I thought we already had.. workaround config flag to let them get over the hump seems like the best thing at this point14:52
mriedemdansmith: they being tripleo in that bug?14:54
dansmithwell, ovb but yeah14:54
mriedemyeah looks like it based on https://review.openstack.org/#/c/596093/14:55
dtantsurbelmorei_: a workaround may be to remove memory_mb and vcpus from ironic nodes properties15:05
dtantsurwith something like $ openstack baremetal node unset <node> --property memory_mb15:06
dtantsurmriedem: this may be a bit easier than hacking nova ^^15:07
belmorei_dtantsur: thanks, but the problem is the number of baremetal nodes that we have. Also, we would need to change the commission procedure to include that.15:08
belmorei_for now I will just patch this in nova15:08
dtantsurbelmorei_: do you use something like inspection to populate these properties?15:08
dtantsuralso before resource classes I used to use host aggregates to more or less separate bm and vm nodes on the same nova15:09
belmorei_dtantsur: yes, inspection populates them15:11
mriedembelmorei_: out of curiosity, before cells v2, could your vm flavors get scheduled to bm cells?15:13
mriedemor were the flavors segregated at the top cell layer?15:14
mriedembelmorei_: also fyi, you can't set total=0 for inventory on the resource class as i said above, placement api will reject that since total must be >=115:17
mriedemso need to just omit posting those non-custom-resource-class inventories15:17
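As a sketch of what that workaround amounts to in the ironic driver's inventory reporting (hypothetical names and flag; the real change is the review discussed above), the standard resource classes are omitted entirely rather than zeroed, since placement requires total >= 1:

    def get_inventory(node, report_standard_inventory=True):
        """Placement inventory for one ironic node (illustrative only)."""
        inventory = {}
        if report_standard_inventory:
            # Legacy behavior: expose VCPU/MEMORY_MB/DISK_GB so unmigrated
            # flavors can still land on (and overlap with) the node.
            inventory['VCPU'] = {'total': node.cpus}
            inventory['MEMORY_MB'] = {'total': node.memory_mb}
            inventory['DISK_GB'] = {'total': node.local_gb}
        if node.resource_class:
            # Each node exposes exactly one unit of its custom resource
            # class, e.g. CUSTOM_BAREMETAL_GOLD (simplified normalization).
            name = 'CUSTOM_' + node.resource_class.upper().replace('-', '_')
            inventory[name] = {'total': 1}
        return inventory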
belmorei_mriedem: with cellsV1 we were using the baremetal filters, so they would not be scheduled to an already deployed node. But yes, if a user used a vm flavor I think it would be the same (it would use the physical node to create the vm flavor instance)15:20
belmorei_mriedem: thanks for the heads up for the patch15:21
openstackgerritMartin Midolesov proposed openstack/nova master: vmware:PropertyCollector for caching instance properties  https://review.openstack.org/60827815:32
mriedemdansmith: belmorei_: fyi i'm working on a rocky patch with the workaround option15:33
dansmithmriedem: cool15:37
belmorei_mriedem: thanks15:39
belmorei_mriedem: https://bugs.launchpad.net/nova/+bug/179692015:39
openstackLaunchpad bug 1796920 in OpenStack Compute (nova) "Baremetal nodes should not be exposing non-custom-resource-class (vcpu, ram, disk)" [Undecided,New]15:39
mriedemdansmith: looks like zuulv3 status something or other changed and now openstack-gerrit-dashboard is getting NoneType errors - you see the same?15:58
dansmithmriedem: I noticed it was failing this morning but didn't go to look if zuul was down. usually that's the reason15:59
mriedemi'm guessing API change http://zuul.openstack.org/status15:59
mriedemnot sure, but the dashboard is different15:59
dansmithah yeah15:59
imacdonndansmith: could you take a peek at this, please? https://review.openstack.org/60809116:03
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: [stable-only] Add report_ironic_standard_resource_class_inventory option  https://review.openstack.org/60904316:08
mriedemdansmith: jroll: dtantsur: ^ belmiro took off....would be nice if he can confirm that fixes his problem16:08
jrollthanks16:09
imacdonnthat's one long option name :)16:10
mriedemsuggestions welcome16:10
mriedemi figured do_the_dew wouldn't be helpful16:11
dansmithimacdonn: done16:11
dansmithimacdonn: mriedem should look at that too16:11
dansmithor rather16:11
mriedemi did once..16:11
dansmithmriedem should look at and agree with me on that too16:11
imacdonnI do see your point16:13
imacdonnnot sure if anyone is actually doing the "keep hammering on it until it concedes" approach, but yeah16:14
edmondswand that notification having the message is also important for PowerVC, since it has means to present errors from notifications in the PowerVC GUI16:14
dansmithimacdonn: I expect everyone is16:14
edmondswoops, ignore ^, somehow jumped channels16:15
imacdonnmy suspicion is that some people are running it once, and missing the fact that there are failures, and maybe others are not running it at all16:15
dansmithpeople have to run this at various times or things won't work16:16
imacdonnthat may not be immediately obvious16:17
imacdonnI've upgraded at least pike -> queens -> rocky without doing any online migrations, and nothing obviously didn't work16:17
dansmithwe have some db migrations which have blocked if you haven't run these to completion16:18
dansmithmaybe none since pike, but..16:18
mriedemhttp://git.openstack.org/cgit/openstack/openstack-ansible-os_nova/tree/tasks/nova_db_setup.yml#n9816:18
mriedemosa is certainly using it16:18
dansmithI guess the default now is to run until completion, which is probably what people are doing I guess16:18
dansmithbut I know the return value here was critical earlier when people were running batches themselves16:19
mriedemimacdonn: we also migrate some stuff online outside of the command16:19
mriedemlike on read from the db16:19
mriedemor new resource create16:19
imacdonnyeah, I know ... my point is that it's possible to get away without running the command, at least in some circumstances16:20
dansmithimacdonn: I'm not sure what that has to do with anything16:20
dansmithOSA and tripleo, and I expect other systems run this explicitly16:20
mriedemif it's possible it's by chance16:20
dansmithif you don't and it doesn't break in the versions you use, then you got lucky,16:21
mriedemlike dan said, we probably just haven't had a blocker migration in awhile16:21
dansmithbut that doesn't really mean anything for how important this is to notbreak16:21
mriedemalso depends on how old your data is,16:21
mriedemi plan on dropping our request spec compat from newton in stein and if you don't have that migration done you'll fail to do things like migrate instances16:21
imacdonnOK, nevermind .. I wasn't disagreeing that it's important to solve ...16:21
mriedemso just make this return 2, doc and reno it and we're happy right?16:22
imacdonnyeah, that'd work for this particular problem ... although it's probably not backportable ?16:23
imacdonn(since it'll break things that don't know to check for 2)16:23
mriedemi'm not sure; if things are failing but tooling is not aware of it, i think it's probably better to err on the side of putting an upgrade release note in and saying this will fail now16:23
mriedembut i'd rather know something isn't working explicitly16:24
mriedemdansmith: agree? ^16:24
imacdonnbut if the automation is just repeating infinitely until it gets a 0, it'll .... repeat infinitely16:24
dansmithwhat if 2 means "I didn't do anything but there were exceptions", 1 means "I did things, maybe there were some exceptions too", 0 means "I didn't do anything, but no errors"16:25
dansmithrepeat on 1, done on 0, 2 means we hit terminal fail state16:25
dansmitheither way people that loop on nonzero will break with anything you're going to do, which is why reno and doc is super important16:26
imacdonnright16:26
dansmithnot sure how I feel about changing behavior in a backport with a retroactive reno, but mriedem is the authority here, so I'd do whatever he says16:27
imacdonnI'm thinking that most people only read release notes for a new release, not for errata updates16:28
dansmithunfortunately I don't think they even read them for new releases, but.. yeah16:28
imacdonnheh yeah, there is that16:28
mriedemi'll defer to tonyb16:30
dansmiththe AUD stops with tonyb16:31
melwitt.16:36
mriedemi get it16:43
mriedemtook me awhile16:43
*** mriedem is now known as mriedem_stew16:43
imacdonndansmith: I'm on the fence about your last suggestion .... I think I tend towards exceptions being bad, requiring some problem to be addressed immediately ... but then there was a suggestion that some migrations may raise exceptions "by design" if some other migration had not yet been completed16:48
dansmithimacdonn: well, by design or not, we've had some that won't complete until others do16:49
imacdonndansmith: is it defined somewhere that a migration should raise an exception in such a case? (as opposed to just not doing any work) Seems like ideally there should be a way for a migration to explicitly state that "I can't do this *yet*"16:55
dansmiththe way to do that is to return nonzero found, with zero done16:55
imacdonnso how do you distinguish that from "I can't do this *at all, ever*" ?16:56
dansmithregardless, because of the complexity of hitting all the cases of live data, which we're historically bad at doing, making the process stop on exception is just practically not the best plan, IMHO16:56
dansmiththere's no case where found is nonzero where done is expected to remain zero forever16:56
dansmithfound is items that should be migratable16:56
imacdonnOK16:59
imacdonnI'll try to implement that and see what falls out17:00
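A sketch of the exit-code scheme dansmith proposes above for the online data migrations runner (illustrative, not the merged implementation; each migration is assumed to return a (found, done) pair and may raise):

    def run_migrations_once(migrations, count=50):
        found = done = errors = 0
        for migrate in migrations:
            try:
                f, d = migrate(count)  # (rows found, rows migrated)
                found += f
                done += d
            except Exception:
                errors += 1
        # A migration that can't run *yet* (blocked on another one)
        # reports found > 0 with done == 0 instead of raising.
        if done:
            return 1   # made progress; the caller should run again
        if errors:
            return 2   # no progress and something raised: terminal failure
        return 0       # nothing left to do

A caller would repeat on 1, stop on 0, and treat 2 as a hard failure, as described above.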
sean-k-mooneymelwitt: i left some feedback in https://review.openstack.org/#/c/575735/2 fyi. hope that helps. the code should work but it's duplicating logic that is not needed.18:30
*** mriedem_stew is now known as mriedem18:31
melwittsean-k-mooney: cool thanks18:31
mriedemdansmith: off the top of your head, do you know much about the migration_context we stash on the instance during cold migration / resize (created by the RT move claim) and what we need it for besides routing neutron events to the source and dest host? looks like it's otherwise for tracking numa/pci on the source and dest host,18:43
mriedemreason i ask is because i'm currently using a move claim for the cross-cell resize but i have alternatives to using a resize_claim,18:44
mriedemboth claim ways in the RT are kind of weird for how i'm doing this18:44
mriedemre: https://review.openstack.org/#/c/603930/9/nova/compute/resource_tracker.py and https://review.openstack.org/#/c/603930/9/nova/compute/manager.py@513818:46
mriedemi don't think i need to care about the migration_context for the same reasons as normal resize because for cross-cell, we'll have shelved offloaded from the source by the time we claim on the dest, so meh18:49
dansmithmriedem: well, the point of it was to avoid doing things like looking up the most recent unfinished migration for an instance in order to get at things we were going to stash on there18:50
dansmithprobably for things that aren't covered by the old/new flavor, so yeah probably numaish things18:50
dansmithmaybe, but I guess I'd hope that we could make it as similar of a process as possible,18:50
dansmithand the fact that we only have migration-context for cold moves right now is unfortunate I think18:50
mriedemwell, we can make it similar, but things get weird if we do, as noted in those links above18:51
mriedemi just haven't run into anything with this that requires the migration_context to be set on the instance18:52
dansmithwe use it for directing notifications to both computes right?18:52
mriedemyes, but i don't really need that with cross-cell resize,18:53
mriedembecause sending an event to the source is useless b/c we've shelved offloaded from the source18:53
mriedemwhen we unshelve on the target, the instance.host gets set and the API will route the event there18:54
dansmithsure, I was just responding to "we don't use it anywhere"18:54
mriedemi also have to do kludgy shit like this https://review.openstack.org/#/c/603930/9/nova/compute/manager.py@518918:54
mriedemif not using an instance_claim18:54
dansmithso, one question I had but was saving was.. are you going to have a resize operation end up going through SHELVED_OFFLOADED from the user's perspective?18:54
mriedemthe terminal state is VERIFY_RESIZE for the user18:55
mriedemi have a TODO related to that https://review.openstack.org/#/c/603930/9/nova/compute/manager.py@502418:55
dansmithbecause the context would be one place to stash what we're doing to hide that18:56
mriedemi think i need to leave the task_state set there,18:56
mriedemand then on unshelve on the target host, we set the vm_state to RESIZED rather than ACTIVE: https://review.openstack.org/#/c/603930/9/nova/compute/manager.py@519518:57
mriedemso the functional test is like a normal resize where the caller issues the resize and then polls for VERIFY_RESIZE status18:57
dansmithaight, well, whatever.. point being, it seems bad to create more places where we don't have that set.. more places where if we needed to know which kind of migration we're doing we have to look at the list and guess18:58
dansmithfor example,18:58
dansmithwe have this long standing bug where we don't properly consider live migrations with pinned cpus as needing to match 1:1 for the destination host,18:59
dansmithwhich is solvable in other ways,18:59
dansmithbut one downstream hack tries to find the migration, determine if it's a live one, so it can make better choices18:59
dansmithwhich is icky for other reasons, but.. finding out what kind of migration is currently going on and what those details are seems like a good thing to me19:00
dansmithso, not a real helpful opinion, but....there it is19:00
mriedemi can do it either way, it's mostly just a question of how gorpy we want the RT resize_claim flow to become when we're unshelving during a cross-cell resize....since as noted we have to do some things manually if we're not doing an instance_claim19:01
dansmithdo we not have any places where we might need to look at a late-stage migration and know if it was cross-cell before we allow or disallow something?19:01
dansmithwell, fwiw, the migration context setting via instance_claim never made sense to me, and continues to confuse me and others when we try to remember where it gets set (and why not for things like live migration)19:02
mriedemit's not set in instance_claim, it's in resize_claim19:02
dansmithyou know what I mean gdi :)19:02
dansmithin the claim process19:02
mriedemi'm not aware of late stage thingies that rely on the migration_context, maybe there are in the actual finish_resize/finish_revert_resize/confirm_resize flows...which i'm not using19:03
dansmithI'm saying things we'd need to handle in this process, not existing ones19:03
mriedemwhen we revert/confirm the API looks up the migration directly, not via the migration_context: https://review.openstack.org/#/c/603930/9/nova/compute/api.py@324419:03
dansmithyeah, which is kinda silly19:04
mriedemwhich, i guess ^ works because we have the finished status and then it goes to 'completed' or something in the compute19:04
dansmiththat would be much nicer as "get the instance and get the current migration" instead of "sort and assume the last one is legit"19:04
mriedemwhich sounds the same to me19:04
dansmithI should just stop talking. No, I don't have any good reasons.19:05
mriedemoh i guess on revert the migration status goes to 'reverted'19:05
mriedemand 'confirmed' on confirm19:05
mriedemalright, well, it's just a drop in the bucket of questions in here19:05
mriedemi just wanted to plant the seed of doubt in someone else's mind about this19:06
mriedemyou're welcome19:06
sean-k-mooneyso can i ask a dumb question related to that conversation.19:18
sean-k-mooneywhat is the difference logically between a migration context, a migration object and migrate_data19:19
sean-k-mooneyi know all three exist but not sure why there is not just one datastructure19:19
sean-k-mooneymriedem: dansmith: is the answer to ^ documented anywhere?19:20
dansmithmigration context is attached to an instance and contains a link to the migration record (i.e. the current one) and some other current details19:20
dansmithmigration objects are the in-progress and archival history of an instance's movements19:21
dansmithmigrate data is transient virt-specific detailage about the low-level bits that is ferried back and forth but never persisted19:21
sean-k-mooneyoh ok that actually kind of makes sense. the naming is unfortunate but the reason for having 3 distinct entities makes sense19:22
mriedemmigrate_data is also only for live migration19:23
mriedemhence the name, LiveMigrateData19:23
dansmithyeah, left that detail out19:24
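To pin down the three structures dansmith distinguishes above, a rough sketch with illustrative fields only (not the actual object definitions):

    class Migration:
        # Persisted record: the in-progress and archival history of a move.
        uuid = None
        status = None          # e.g. 'finished', 'reverted', 'confirmed'
        migration_type = None  # e.g. 'resize', 'live-migration', 'evacuation'

    class MigrationContext:
        # Attached to the Instance for the *current* move only; links back
        # to the migration record and carries claim details (NUMA, PCI)
        # for the source and destination hosts. Dropped on completion.
        migration_id = None
        old_numa_topology = None
        new_numa_topology = None

    class LiveMigrateData:
        # Transient, virt-driver-specific detail ferried between the
        # computes during live migration; never persisted.
        is_volume_backed = None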
sean-k-mooneymriedem: yes but for cold migration we kind of abuse the migration_context for associating the claim with the active migration too right19:24
sean-k-mooneyso we don't actually stuff the resource tracker claims into the migration_context as far as i know, but i think we use the uuid of the migration_context when claiming the resources or something like that.19:25
mriedembased on my questions above, i'm clearly not the person to ask about the intricacies of how the migration_context is used during resize19:27
dansmithsean-k-mooney: migration context has no uuid, it's attached to the instance19:27
mriedemMigrationContext.migration_id could be used to find the migration object if needed19:28
mriedemwithin the same cell19:28
sean-k-mooneysorry, the migration record/object referenced by the migration_context has a uuid which we use19:28
dansmithsean-k-mooney: it has things that we don't need to persist after completion, unlike things we store in the migration record for posterity, like what flavor it was and what flavor it is now19:28
dansmithmriedem: I meant the context doesn't have its own identifier19:28
mriedemah yeah19:28
dansmithand, unfortunate that we used the id there I guess, as it makes it potentially less helpful for the cross-cell case19:29
mriedemlaura is home, the 2:30 vacuuming has started19:29
mriedemfor cross-cell everything is scoped to the cell db, so it's not a big deal19:29
dansmithwell,19:30
mriedemconductor orchestrates the db switching when needed19:30
dansmithonce we want cross-cell live migration it'll probably be relevant19:30
mriedemi think i'll probably be back at ibm working on php / xenapi by the time that happens..19:30
* dansmith nods19:30
mriedemyou know, the up and coming tech19:31
dansmithit's a thing people want though19:31
mriedemyeah, huawei's public cloud has a beta for cross-cell live migration,19:31
dansmithespecially if it were to make it easier to migrate from an older cloud to a newer one by making one a cell of the other until it's emptied19:31
mriedemi'm not sure how they do it but...19:31
dansmiththat's been requested since icehouse or so19:31
sean-k-mooneydansmith: we all know livemigration never works :P also if i get sriov live migration working this cycle i have no plan to test cross-cell, sriov, numa-aware live migration between different releases during upgrade :)19:32
* sean-k-mooney now that i have said it a telco will ask for it19:33
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: WIP: Test report_ironic_standard_resource_class_inventory=False  https://review.openstack.org/60910719:33
mriedemthere is no cross cell live migration...19:33
sean-k-mooneymriedem: didn't you just say huawei's public cloud has beta support? i would assume they will ask you to upstream it at some point, or is that a completely different division from yours?19:34
mriedemcompletely different19:35
mriedemtheir public cloud is still using cascading, which is their proprietary cells v119:35
mriedemthey are working on migrating off that to cells v219:36
dansmiththat's more like cross-deployment live migration19:36
sean-k-mooneydansmith: is that still an ask from the edge working group?19:36
dansmithis what? cross-deployment live migration?19:37
dansmithI'm sure it'll be on their list at some point19:37
dansmithI can see the cross-cell thing being fine, but I dunno about cross-deploy19:37
sean-k-mooneyya when i was in the edge room in dublin i spent 15 minutes explaining why cross-cloud inter-hypervisor live migration was never going to be a thing and should not be in the phase 1 basic feature support for edge19:38
sean-k-mooneythey literally wanted to live migrate from libvirt + ceph in one edge site to vmware on another19:39
jaypipesdansmith, mriedem, melwitt: is there a reason we don't pass instance metadata to the RequestSpec?19:39
*** k_mouza has quit IRC19:40
dansmithjaypipes: why would we need to?19:40
jaypipesdansmith: so that scheduler filters can look at the instance metadata? :)19:40
dansmiththey should never do that19:40
dansmithinstance metadata is owned by the user not nova19:40
jaypipesdansmith: oh, but Oath disagrees strongly. ;)19:40
dansmithlooking at it, especially for placement would violate19:40
dansmithjaypipes: -219:40
jaypipeshehe19:41
melwittbut what about... custom filters YALL19:41
jaypipesdansmith: what melwitt said :)19:41
dansmithyeah19:41
jaypipesdansmith: example...19:41
sean-k-mooneyjaypipes: jay, just stuff the info in a scheduler hint and use the json filter19:41
* sean-k-mooney ducks19:41
mriedemthe request spec has an instance_uuid on it, you can get the instance from that and pull the metadata off the instance; that won't work for multi-create, but you probably don't care at oath19:41
dansmithyup19:42
jaypipesdansmith: nova boot --property ytag=SOME_CUSTOM_YAHOO_GOOP; nova scheduler has a filter that does an external lookup to our inventory management system of the ytag to grab the availability zone (really, just a power domain) to send the instance to19:42
dansmithor hint yourself to an external artifact, fetch it and go nuts19:42
dansmithjaypipes: yep, I got it, but metadata is off limits19:42
dansmith--hint ytag-fml -> look up fml externally, do a thing19:43
jaypipesmriedem: you can't do that. the instance_uuid is for the mapping, but the metadata doesn't exist yet, so if you try to call Instance.get_by_uuid(instance_uuid) from the filter, that's a dead end.19:43
mriedemjaypipes: get the build request then19:43
jaypipesmriedem: build request doesn't store metadata.19:43
melwittlast I heard, oath does use multi-create. there's auto-scale stuff that boots several instances at once19:43
mriedemjaypipes: sure it does,19:43
mriedemit stores the instance19:43
mriedemwhich stores the metadata19:43
dansmithright, that's how we know what to create when we've picked a host :)19:44
jaypipesmriedem: https://github.com/openstack/nova/blob/stable/ocata/nova/compute/api.py#L1004-L100719:44
jaypipesmriedem: where exactly does the build request store the instance metadata?19:44
mriedeminstance=instance19:44
dansmithit's in instance19:44
dansmithheh yeah19:44
mriedeminstance.update(base_options)19:45
mriedembuild request stores a serialized instance object19:45
jaypipesah..19:45
jaypipesI missed that. sorry.19:45
jaypipesok, so I'll change the filter to pull the build request by instance_uuid.19:45
mriedemhttps://github.com/openstack/nova/blob/stable/ocata/nova/compute/api.py#L93619:45
jaypipesthanks y'all19:45
mriedemperformance will suck19:45
mriedembut...19:45
dansmithbut you're going to hell anyway?19:45
dansmithyah.19:45
jaypipesmriedem: yes, understood.19:45
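A sketch of the approach being settled on, with hypothetical helpers (get_build_request() and lookup_power_domain() are assumptions, not real nova APIs): an out-of-tree filter loads the BuildRequest by the request spec's instance_uuid and reads the user metadata off the serialized instance inside it.

    def get_build_request(instance_uuid):
        """Assumed helper: load the BuildRequest from the API DB."""
        raise NotImplementedError

    def lookup_power_domain(ytag):
        """Assumed helper: REST call to an external inventory system."""
        raise NotImplementedError

    class PowerDomainFilter:
        """Hypothetical out-of-tree host filter using instance metadata."""

        def host_passes(self, host_state, request_spec):
            # Extra DB round trip per host/request: the performance
            # cost mriedem warns about above.
            build_req = get_build_request(request_spec.instance_uuid)
            ytag = build_req.instance.metadata.get('ytag')
            if ytag is None:
                return True  # no tag, nothing to enforce
            # Map the tag to a power domain (availability zone) via the
            # external system and pin the instance there.
            return host_state.availability_zone == lookup_power_domain(ytag)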
sean-k-mooney hmm in that case the json filter and compute capability filters can already read it and do stuff...19:46
jaypipessean-k-mooney: we already have a custom IronicCapabilitiesFilter. don't get me started :)19:46
melwittjaypipes: yeah, you might run into perf problems doing a db lookup in a filter (but I guess you said earlier you're doing an external system lookup in a filter already and that wasn't hurting perf?)19:46
*** k_mouza has quit IRC19:46
sean-k-mooneyjaypipes: are you thinking of pulling this stuff out in a pre-placement filter or post, out of interest?19:46
jaypipesthis one's actually a network availability filter that looks for num_additional_ipv4 and num_ipv6 custom metadata key/values, calls out to our IPAM system from within the filter itself, and determines if the target system has enough IP addresses available.19:47
jaypipesdansmith: you're welcome ^19:47
melwittI remember when I worked at yahoo, we ran into perf issues with an in-tree filter that was doing db lookups and had to patch it out (and upstream fixed it soon after)19:47
* dansmith hands jaypipes a handful of poo19:47
jaypipesmelwitt: this is even worse. :) it's doing out of band calls to a REST API from within the in-tree filter :)19:48
sean-k-mooneyjaypipes: for l3 routed networks we are storing that kind of info in placement19:48
melwittyeah. interesting that it's not causing perf issues19:48
jaypipesmelwitt: well, it's not like the traffic to the scheduler is huge...19:48
jaypipesmelwitt: I mean, it's not like there's thousands of concurrent callers of nova boot for ironic hosts.19:49
sean-k-mooneyjaypipes: any way you could just write some kind of bridge between neutron and the ipam to model it in placement and not use a scheduler filter19:49
melwittit used to be, is what I'm saying. but that was back before we had placement filtering the set of compute nodes down19:49
jaypipessean-k-mooney: baby steps :)19:49
jaypipesmelwitt: we're getting there.. slowly :)19:49
sean-k-mooneyjaypipes: hehe ok, it just seems like some of that code might already exist for the l3 routed networks, and with gibi's resource stuff you might be able to have neutron pass back, say, a host aggregate and an ip resource request19:50
sean-k-mooney* placement aggregate19:50
melwittoh, ironic. I don't know much about that. I think what I was referring to was VMs. it was that with 1000 compute nodes, having a db lookup once per the 1000 plus concurrent requests caused the problems19:51
jaypipessean-k-mooney: ALL of that code already exists in the L3 routed net segments stuff :) we just need to get there first..19:51
melwitt*db lookup for each of the 100019:51
*** k_mouza has joined #openstack-nova19:51
sean-k-mooneyjaypipes: so playing devil's advocate here, if you wrote a neutron ipam plugin driver and we had a way to get the info from neutron to nova via the port, you would not need the filter, right?19:53
*** k_mouza has quit IRC19:54
jaypipessean-k-mooney: correct.19:54
sean-k-mooneyjaypipes: i'm just thinking this would be generically useful outside oath too, but for mass reuse we would not want to go the scheduler filter route19:54
jaypipessean-k-mooney: I welcome your imminent pull request to our repo.19:54
sean-k-mooney:) well have you talked to any neutron folk about how much effort that would be?19:55
sean-k-mooneysounds like you can do the scheduler filter out of tree without nova changes in any case19:55
mriedemdansmith: on the detach/attach root volume spec https://review.openstack.org/#/c/600628/ did you see that it was updated to also allow detaching the root volume from stopped instances in addition to shelved offloaded?19:56
jaypipessean-k-mooney: though the "interesting" part about this particular filter is that it's quantitative -- "make sure this baremetal host is in a rack that has X number of available IPv4s in its subnet" -- but the request isn't actually *for* that amount. It's basically "ok, just make sure I *could* get this many IPs, but don't actually grab those IPs for me right now. maybe later, ok thx bai"19:56
dansmithmriedem: no I saw some activity on that this morning and have it queued19:57
mriedemjaypipes: sounds like what you want is quotas and resource claims brother!19:57
melwittjaypipes: that sounds kind of like the key/value discussion we've been having. need disk_typeA=4 but don't want to actually consume them19:57
mriedemyour fingers say no but your mouth....also says no19:58
melwittso I wonder if you could solve your complex partitioning/layout issues similarly19:58
*** gouthamr has quit IRC19:58
mriedemdansmith: yeah i'm not sure how i feel about it....but i'm also not sure i have a good excuse against allowing it19:58
sean-k-mooneyjaypipes: right ok so you're not reserving the ips, so you're hoping that if you need them in the future you could reserve them19:58
sean-k-mooneyjaypipes: that sounds more like a weigher19:58
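
If the "could I get this many IPs" check is advisory rather than a hard claim, a weigher fits: it ranks hosts without reserving anything. A minimal sketch, assuming a hypothetical available_ips() call into the external IPAM system:

    from nova.scheduler import weights

    def available_ips(host):
        """Hypothetical IPAM query: free IPv4s in the host's rack/subnet."""
        raise NotImplementedError

    class AvailableIpWeigher(weights.BaseHostWeigher):
        """Sketch: prefer hosts with more free IPs, claim nothing."""

        def _weigh_object(self, host_state, weight_properties):
            # Higher return value -> host sorts earlier; nothing is
            # reserved, matching the "maybe later, ok thx bai" semantics.
            return available_ips(host_state.host)
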
dansmithmriedem: well, I thought the shelve bit was because we didn't have to worry about disconnecting on the compute node19:59
mriedemit's definitely more straight forward if the instance is shelved19:59
dansmithyeah19:59
dansmithanyway19:59
dansmithI'll try to get around to it19:59
*** k_mouza has joined #openstack-nova20:04
sean-k-mooneymelwitt: jaypipes not sure how the disk_typeA=4 is intended to work, but assuming this was all in placement you "could" ask placement for 4 ips in this case but instead of claiming all the resources in the allocation candidate only claim 1 ip.20:05
sean-k-mooneymelwitt: jaypipes that said, for that code to be in the nova tree it would have to be generic and not for just this use case. not sure how you would model that in flavor extra specs however20:06
mriedemjaypipes: efried: i'm trying to figure out what's going on with https://review.openstack.org/#/c/552105/ and https://review.openstack.org/#/c/544683/ from reading the ptg etherpad and it's not really clear to me if those are supposed to be combined or initial allocation ratios is a dependency for the other spec?20:06
mriedemi spoke with yikun last night and he's confused as to what should be changed based on the etherpad, and i kind of am too since the etherpad is just mostly discussion20:07
melwittI think the initial ratios spec is a dependency for the other spec. two specs needed20:07
mriedemhttps://review.openstack.org/#/c/544683/ says "#agreed in Stein PTG to squash this into [1]"20:07
mriedemand the etherpad says "Sounds like the two specs need to be combined a bit."20:08
melwittok, then I must not have understood20:08
melwittI thought it was two specs, one to define the initial allocation ratios and another to define how to handle the initial allocation ratios20:08
mriedemwhy wouldn't that just be one spec?20:09
*** k_mouza has quit IRC20:09
sean-k-mooneymriedem: we said to combine them at the ptg yes, but i'm trying to remember the details20:09
melwittI don't know. but that's what was being talked about in the room at the time, or so I thought20:10
jaypipessean-k-mooney, mriedem, dansmith, melwitt: to be clear, I'm not asking for anything at all :) I'm really just gonna do this custom filter thing as a stop-gap measure until we get on a more up to date version of nova.20:10
mriedemagreed to add new initial_allocation ratio options with the default values from the ComputeNode object today,20:10
mriedemchange the existing *_allocation_ratio values from 0.0 defaults to None20:10
*** k_mouza has joined #openstack-nova20:10
sean-k-mooneyjaypipes: oh i know. it just sounded like a useful thing. maybe for T20:10
mriedemdelete the code in the ComputeNode object so it's all config driven20:10
jaypipesmriedem: yes, change them back to None from 0.0.20:11
mriedemand then something something if config is set, that trumps the API20:11
jaypipesmriedem: but I distinctly remember saying I was at the end of my proverbial rope with both of those specs and someone else would need to pick it up.20:11
jaypipes:)20:11
mriedemso mgagne's use case can use the API exclusively and CERN can use the config exclusively20:11
mriedemjaypipes: yes yikun is happy to pick it up,20:11
jaypipescool, thx20:11
mriedembut he doesn't understand what the direction is...20:11
sean-k-mooneymriedem: yes i think we said if you want to be api driven, in the config you set the value to none or remove it20:11
mriedemwhich is why i'm trying to be a middleman here20:11
sean-k-mooneymriedem: if you want to be config driven you set the config value and don't touch it from the api20:12
jaypipesmriedem: you are wonderful middleware.20:12
mriedemand yikun had a question, "How to address the upgrade case? If we already have a 0.0 cpu ratio in db, should we change it to 16.0 first? online migration?"20:13
*** k_mouza_ has joined #openstack-nova20:13
sean-k-mooneymriedem: specifically, you set cpu_allocation_ratio=None initial_cpu_allocation_ratio=16.0 if you want to set a default for new nodes but manage the actual value from the api20:13
mriedemwould we change the ComputeNode.*_allocation_ratio to the config value on read if the value in the db is 0.0?20:14
sean-k-mooneyand set cpu_allocation_ratio=x if you want to manage via config20:14
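
The two modes sean-k-mooney describes would look roughly like this in nova.conf on the compute node (illustrative values; the initial_* option names come from the spec under review, not from released code):

    [DEFAULT]
    # API-driven: leave cpu_allocation_ratio unset (None). The initial_*
    # value only seeds brand-new compute nodes; after that the ratio is
    # managed via the API and the config is never consulted.
    initial_cpu_allocation_ratio = 16.0

    # Config-driven alternative: set the ratio explicitly; the config
    # value then takes precedence over anything set via the API.
    # cpu_allocation_ratio = 4.0
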
*** k_mouza has quit IRC20:15
sean-k-mooneymriedem: does that make sense?20:15
mriedemyes i get that,20:16
mriedemthe question is upgrades https://review.openstack.org/#/c/552105/5/specs/stein/approved/initial-allocation-ratios.rst@11420:16
*** k_mouza has joined #openstack-nova20:17
*** k_mouza_ has quit IRC20:18
sean-k-mooneymriedem: i guess on upgrade if 0.0 is set in the db that would also imply that the resource provider allocation_ratio or whatever is 020:18
sean-k-mooney*is 0.0, to be correct20:18
sean-k-mooneymriedem: if the resource provider exists and we get 0.0 from the db but placement has another value, i would assume we should keep the placement value, but not sure if that case can happen today20:20
sean-k-mooneythe compute node will just override placement with the value it gets from the resource tracker today in update_provider_tree right?20:21
*** k_mouza has quit IRC20:21
*** k_mouza_ has joined #openstack-nova20:24
*** cfriesen has joined #openstack-nova20:24
mriedemthe allocation ratio in placement can't be 0.020:25
mriedemit will literally shit itself20:25
cfriesenI assume it's a divide by zero thing?20:26
sean-k-mooneymriedem: ok in that case the logic is simple. on upgrade, if the placement provider exists and the db value is 0.0, set the db value to the placement value. if the placement provider doesn't exist, set the db to the initial_* value and create the provider as normal20:26
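
sean-k-mooney's proposed upgrade handling, written out as a sketch (names and structure are illustrative, not code from the spec):

    def heal_allocation_ratio(compute_node, placement_ratio, conf):
        """Sketch: resolve a legacy 0.0 cpu_allocation_ratio on upgrade."""
        if compute_node.cpu_allocation_ratio == 0.0:
            if placement_ratio is not None:
                # A resource provider already exists: trust placement.
                compute_node.cpu_allocation_ratio = placement_ratio
            else:
                # No provider yet: seed from the configured initial
                # default (e.g. initial_cpu_allocation_ratio = 16.0)
                # and create the provider as normal.
                compute_node.cpu_allocation_ratio = (
                    conf.initial_cpu_allocation_ratio)
        return compute_node
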
cfriesenshould placement check for that?20:26
openstackgerritiain MacDonnell proposed openstack/nova master: Handle online_data_migrations exceptions  https://review.openstack.org/60809120:27
sean-k-mooneycfriesen: it's not actually a divide by 0, but we multiply the available capacity by 0 and see if it's larger than what we requested20:27
*** k_mouza has joined #openstack-nova20:27
sean-k-mooneycfriesen: so placement will not have a math error, but you won't ever be able to get allocations against that resource provider20:28
cfriesenah, thanks20:28
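
Roughly, placement's capacity test per resource class is used + requested <= (total - reserved) * allocation_ratio, so a 0.0 ratio zeroes the right-hand side and nothing ever fits:

    # Illustrative numbers for a 32-VCPU host with a 0.0 ratio.
    total, reserved, used, requested = 32, 0, 0, 1
    allocation_ratio = 0.0
    capacity = (total - reserved) * allocation_ratio  # 0.0
    fits = used + requested <= capacity               # False, always
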
*** k_mouza_ has quit IRC20:29
*** k_mouza_ has joined #openstack-nova20:31
*** k_mouza has quit IRC20:31
*** k_mouza has joined #openstack-nova20:34
*** k_mouza_ has quit IRC20:36
*** k_mouza has quit IRC20:36
*** k_mouza has joined #openstack-nova20:37
*** pcaruana has quit IRC20:37
*** k_mouza_ has joined #openstack-nova20:40
*** k_mouza has quit IRC20:41
*** k_mouza has joined #openstack-nova20:44
*** k_mouza_ has quit IRC20:44
*** hamzy_ has quit IRC20:46
*** k_mouza_ has joined #openstack-nova20:47
*** k_mouza has quit IRC20:48
*** erlon_ has quit IRC20:50
*** gouthamr has joined #openstack-nova20:50
*** k_mouza has joined #openstack-nova20:50
efriedmriedem: We talked about this in the sched meeting yesterday. jaypipes said he was about ready to abandon those two specs. We also discussed the possibility of generic inventory yaml leading to a solution.20:50
*** k_mouza_ has quit IRC20:51
mriedemefried: i know i was there and said i'd reach out to yikun to pick up the specs,20:51
mriedembut he's confused about the direction, as am i20:51
mriedemhence questions20:51
*** ttsiouts has joined #openstack-nova20:51
mriedemi'm going through https://review.openstack.org/#/c/552105/ again now,20:51
mriedemsome of that is outdated given https://github.com/openstack/nova/commit/2588af87c862cfd02d860f6b860381e907b279ff20:51
*** k_mouza has quit IRC20:54
*** ttsiouts has quit IRC21:00
*** rmart04 has joined #openstack-nova21:00
*** ttsiouts has joined #openstack-nova21:00
*** rmart04 has quit IRC21:01
*** k_mouza has joined #openstack-nova21:02
*** eharney has quit IRC21:05
*** k_mouza has quit IRC21:05
*** k_mouza has joined #openstack-nova21:06
mriedemalright i've dumped comments in https://review.openstack.org/#/c/552105/ - i think i could probably update the spec at this point to cover the upgrade impact21:07
mriedemi don't think we should leave the existing options defaulting to 0.0 like bauzas is asking for - that just prolongs the confusion of what those defaults mean21:08
mriedemjaypipes: if you can skim my comments to see if they make sense i can take over updating the spec21:08
*** k_mouza has quit IRC21:08
*** k_mouza has joined #openstack-nova21:12
*** k_mouza has quit IRC21:14
*** k_mouza has joined #openstack-nova21:15
*** priteau has quit IRC21:15
*** cfriesen has quit IRC21:19
*** k_mouza has quit IRC21:20
sean-k-mooneymriedem: are you proposing defaulting them to None, or to 16.0 / whatever the real default is for that resource?21:20
mriedemwhat the spec says21:21
mriedemchange the *_allocation_ratio defaults from 0.0 to None21:21
mriedemthe initial_*_allocation_ratio defaults become what is in the ComputeNode object facade today21:21
mriedemand we drop the facade21:21
sean-k-mooneyright, that makes sense to me, and initial_*_allocation_ratios will have per-resource-type defaults, correct?21:22
mriedemyes21:22
sean-k-mooneyya that all sounds sane to me. i have not read bauzas's comment arguing for keeping 0.021:22
sean-k-mooneycurrent 0.0 has a special meaning right? e.g. use scheduler/conductor values not compute node, or something like that21:23
sean-k-mooneyi assume that is what his comment was related too.21:24
* sean-k-mooney clicks spec link to read21:24
sean-k-mooneythe nova-status check makes sense but i'm not sure you can check the config as part of it21:26
mriedemif the config explicitly sets the *_allocation_ratios to 0.0 when we have changed the defaults to None, that means their config mgmt system is setting that on purpose and is likely busted21:27
sean-k-mooneymriedem: true, i was just thinking for FFU or in general the nova-status command can't check the config on each compute unless you ran it on each compute21:29
sean-k-mooneythat said, if you use oslo.config's ability to auto-generate configs, does it generate the config with all the values commented out or set to their default? just trying to think if there was a reasonable reason why it might be set to 0.021:30
mriedemthe allocation ratios aren't read on control plane services, so i think it's reasonable to assume if someone's config said 0.0 for those values when the defaults are None they are just setting the config globally and it's wrong21:30
sean-k-mooneysorry, i think i missed that bit. where will these new values be read? scheduler/conductor or compute node21:34
sean-k-mooneyi had assumed this was all config that was being read by the compute node?21:35
mriedemthe compute is what sets it21:38
mriedemthe scheduler will read it21:38
mriedemfrom the compute node object21:38
*** tbachman has joined #openstack-nova21:39
*** ttsiouts has quit IRC21:39
mriedemthe only service that reads the config for these options is the compute service21:39
*** ttsiouts has joined #openstack-nova21:39
openstackgerritAdam Harwell proposed openstack/nova master: Add apply_cells to nova-manage  https://review.openstack.org/56898721:41
sean-k-mooneymriedem: oh ok, i was under the impression if the scheduler received a 0.0 from the compute node it would read its own config and use the allocation ratio it got. was that how it used to work or am i just imagining things?21:42
*** tbachman_ has joined #openstack-nova21:43
sean-k-mooneymriedem: but in any case i think the nova-status check is sufficient. if there is a 0.0 in the db for a value, the operator should first update their config and then upgrade/run online migrations, whatever is needed21:43
*** tbachman has quit IRC21:44
*** tbachman_ is now known as tbachman21:44
*** ttsiouts has quit IRC21:44
sean-k-mooneyi.e. im agreeing with your suggestion thanks for explaining :)21:44
imacdonnmriedem dansmith: Fixing the migrations thing made grenade go boom ... there actually was another latent bug that I stumbled on, which was causing the exit code to be zero even though work had been done - with that fixed, grenade is now doing the right thing (repeating until exit status 0)21:48
sean-k-mooneyimacdonn: what is the other bug?21:48
mriedemjaypipes: yikun: i've also gone through https://review.openstack.org/#/c/544683/ and left comments; i'm not on board with all it's proposing, but i think some of that is outdated now per the ptg discussions21:49
imacdonnsean-k-mooney: "ran" gets reset to zero for each iteration of the loop here: https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L71821:49
mriedemi think the gist of ^ is that it's proposing to proxy aggregate allocation ratio metadata to placement, correct?21:49
imacdonnsean-k-mooney: then it gets used later to determine if any migrations were ran/run at all here: https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L73921:50
imacdonnsean-k-mooney: so if the last iteration of the loop didn't do anything (even though a previous iteration did), it'd not count21:51
sean-k-mooney imacdonn i'm not sure that is incorrect. if the last iteration did nothing it means migrations was empty, so we break out of the while21:52
imacdonnsean-k-mooney: yes, but the decision about whether or not any work had been done needs to consider ALL iterations of the loop21:52
sean-k-mooneydoes it? why?21:53
imacdonnsean-k-mooney: because, IIUC, that's what exit code 1 means (some migrations did work)21:53
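
The shape of the loop being discussed, reduced to a sketch (simplified from nova/cmd/manage.py; run_migration_batch is an illustrative stand-in for the real per-batch runner, not the actual helper name):

    def online_data_migrations(max_count=None, unlimited=True):
        # Latent bug imacdonn found: 'ran' was reset on every pass, so
        # only the final (empty) batch decided the exit code.
        total_ran = 0
        while True:
            migrations = run_migration_batch(max_count)
            ran = sum(done for _found, done in migrations.values())
            total_ran += ran  # accumulate across ALL passes (the fix)
            if not migrations or not unlimited:
                break
        # 'ran and 1 or 0' reads as "1 if non-zero else 0": exit 1 if
        # any pass did work, 0 if there was nothing left to do.
        return total_ran and 1 or 0
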
*** k_mouza has joined #openstack-nova21:54
* sean-k-mooney currently parsing "return ran and 1 or 0"21:54
imacdonnI interpret that as "if ran is not zero, return 1, otherwise return 0"21:55
sean-k-mooneyimacdonn: yes, but i'm trying to think what that logically means21:56
sean-k-mooneyis return 0 being used to indicate success like in bash, or does 1 indicate success21:56
*** k_mouza has quit IRC21:57
imacdonnsean-k-mooney: it's complicated :) (again, IIUC)... zero means that there is no migration work remaining to be done, 1 means "some migrations did work, and there may be some more work that still needs to be done"21:58
imacdonnso you're supposed to keep running the command until it doesn't return 121:58
sean-k-mooneyimacdonn: right, in that case you want ran to be 0 when all pending migrations have been processed, so you want to reset it in the loop21:58
imacdonnsean-k-mooney: no... you're supposed to re-run the command ... that's not what that loop is for21:59
sean-k-mooneythat is not how i read what is currently written22:00
sean-k-mooneyfrom the current code it looks like its intent is to run all migrations in batches up to max count and then exit when there are none left22:01
imacdonnsean-k-mooney: yes, but there are scenarios where some of them will not work the first time (due to dependencies on others), so22:02
imacdonnit's necessary to iterate the whole command until you get a 022:02
imacdonn.... is what I've been led to understand22:02
imacdonnsee comments on https://review.openstack.org/#/c/608091/22:02
sean-k-mooneythis comment https://review.openstack.org/#/c/608091/1/nova/cmd/manage.py@753 or another one22:04
imacdonnyes, those22:04
sorrisonmriedem I have another fun policy change https://review.openstack.org/#/c/608474/22:05
*** k_mouza has joined #openstack-nova22:06
sorrisonmriedem: I've been trying to figure out the whole admin/user context thing with tests and requests. I think I'm missing something22:06
sean-k-mooneyso from mriedem's comment, we did not raise exceptions before, so while we said we could exit with a 1 error code we did not22:06
imacdonnsean-k-mooney: I'm still not completely clear on what the exit code should be ... I thought I understood that '1' means "some migrations did work"22:08
*** k_mouza has quit IRC22:09
*** k_mouza_ has joined #openstack-nova22:09
*** skatsaounis has quit IRC22:10
sean-k-mooneyi think for mriedem's comment, 1 means some migrations ran but there were no errors, he was suggesting 2 for some migrations ran and there were errors, and 0 means all migrations ran and no errors22:11
imacdonnI'm not sure how to distinguish between "some ran" and "all ran"22:11
sean-k-mooneyimacdonn: if you specify max count before your change and you had more migrations pending than max count, it would have exited with 122:12
sean-k-mooneyso just changing the return on line 753 to 2 would then be correct, but we need a docs update to say what 2 means22:13
*** k_mouza_ has quit IRC22:14
imacdonnsean-k-mooney: the suggestion from today was to change it to return 2 *only if no migrations did work this time*22:14
sean-k-mooneywhen --max-count is set, unlimited is set to false so we break out of the loop on line 735 and ran is non-0, so "return ran and 1 or 0" returns 122:14
sean-k-mooneyimacdonn: ok in that case on line 734 add if got_exceptions: return 222:16
*** moshele has joined #openstack-nova22:16
*** slaweq has quit IRC22:17
sean-k-mooneyin fact you could do the exception check on line 725 if you really wanted to exit fast22:17
imacdonnthat would be bad22:18
imacdonnbecause some other migration may be a dependency for the first one to not fail, and you'd never get to run the second one22:18
sean-k-mooneyyou just said you want to exit if there are exceptions22:19
imacdonnnope, pretty sure I didn't22:19
imacdonnI said they want the final exit status to be 2, if there were exceptions AND no migrations did any work22:19
sean-k-mooneyoh "*only if no migrations did work this time*" i misread that22:20
mriedemsorrison: comments inline22:20
imacdonnI guess "did work" is a confusing term ... "took effect"? ......22:21
sean-k-mooneymriedem: so we are guessing about what you want return 2 to mean for https://review.openstack.org/#/c/608091/1/nova/cmd/manage.py@75322:21
sorrisonThanks mriedem :-)22:21
sean-k-mooneymriedem: since you're here, can you clarify for imacdonn22:21
* imacdonn reckons mriedem is in NUMA mode, and we're on the other node ;)22:22
sean-k-mooneymriedem: based on the current code, return 1 could have been returned if we passed --max-count and we had more than max-count migrations, so i think return 1 means some migrations ran with no errors and there are more to run, and return 2 should mean there were errors when running migrations and you'd better check out what went wrong22:23
*** k_mouza has joined #openstack-nova22:24
imacdonnsean-k-mooney: per dansmith, we should only return 2 if there were exceptions *and* we've determined that no other migrations did work (i.e. modified rows)22:26
sean-k-mooneyimacdonn: i don't see that in his comment on the review. was that in the scheduler meeting?22:28
mriedemit was in channel22:28
mriedemseveral hours ago22:28
*** k_mouza has quit IRC22:28
mriedemand it sounds like you guys are talking about it all over again to come back to the same conclusion?22:28
sean-k-mooneyno, there was a valid case to return 1 before22:28
mriedem"(11:25:34 AM) dansmith: what if 2 means "I didn't do anything but there were exceptions", 1 means "I did things, maybe there were some exceptions too", 0 means "I didn't do anything, but no errors""22:29
imacdonnmriedem: the sticking point now is the meaning of '1'22:29
sean-k-mooneywe would have returned 1 if we set a max count and there were more than max-count migrations22:29
imacdonnmriedem: In the current implementation, if it runs through all migrations (in batches), the final return code is 0, even though work was done22:30
sean-k-mooneymriedem: so if there were 100 migrations to run and we set --max-count=90 before and all 90 ran successfully, we would have returned 1 to indicate there are more to run22:30
imacdonnin my interpretation of the discussion this morning, it should be '1' if work was done, even if there is no work remaining to do22:31
*** tbachman has quit IRC22:31
*** k_mouza has joined #openstack-nova22:31
sean-k-mooneymriedem: 0 used to mean i ran all migrations successfully, since 0 is success in bash22:32
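
dansmith's proposed semantics from earlier in the day, expressed as a sketch (total_ran and got_exceptions are assumed to be tracked across all batches):

    def exit_code(total_ran, got_exceptions):
        # 1: did things, maybe with some exceptions too -> run again
        # 2: did nothing but there were exceptions -> investigate first
        # 0: did nothing and no errors -> fully migrated
        if total_ran:
            return 1
        return 2 if got_exceptions else 0
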
mriedemsorry but i'm past the point of having attention to think about this today22:33
sean-k-mooneymriedem: no worries, i'll leave a comment in the patch with what i understand the current logic is and you and dan can check when ye have had some rest22:33
sean-k-mooneyimacdonn: are you ok with waiting for them to look at this tomorrow?22:34
*** k_mouza has quit IRC22:34
imacdonnsean-k-mooney: sure22:34
*** panda has quit IRC22:37
*** panda has joined #openstack-nova22:39
*** rcernin has joined #openstack-nova22:42
*** k_mouza has joined #openstack-nova22:43
*** owalsh is now known as owalsh_away22:47
*** mriedem has quit IRC22:57
*** hongbin has quit IRC22:58
*** lbragstad has quit IRC23:01
*** spatel has joined #openstack-nova23:03
*** cfriesen has joined #openstack-nova23:05
*** spatel has quit IRC23:07
*** slaweq has joined #openstack-nova23:11
*** efried has quit IRC23:12
*** efried has joined #openstack-nova23:13
*** itlinux has quit IRC23:14
*** slaweq has quit IRC23:16
openstackgerritsean mooney proposed openstack/nova master: add get_pci_requests_from_vifs to request.py  https://review.openstack.org/60916623:22
*** macza has quit IRC23:32
*** mlavalle has quit IRC23:35
*** cfriesen has quit IRC23:41
*** cfriesen has joined #openstack-nova23:42
*** Swami has quit IRC23:43
*** erlon_ has joined #openstack-nova23:44
*** sambetts|afk has quit IRC23:44
*** sambetts_ has joined #openstack-nova23:45
openstackgerritChris Friesen proposed openstack/nova-specs master: Add support for emulated virtual TPM  https://review.openstack.org/57111123:56
