Wednesday, 2019-04-03

*** wolverineav has joined #openstack-nova00:00
*** tetsuro has joined #openstack-nova00:04
*** liuyulong is now known as liuyulong|away00:05
*** wolverineav has quit IRC00:06
openstackgerritMerged openstack/nova master: Fix bug preventing forbidden traits from working  https://review.openstack.org/64865300:16
openstackgerritTakashi NATSUME proposed openstack/nova master: Add a live migration regression test  https://review.openstack.org/64120000:17
mriedemoh wow this is pretty bad https://review.openstack.org/#/c/648653/00:18
*** takashin has joined #openstack-nova00:19
mriedemefried: so forbidden traits just never worked in nova? ^00:19
*** brinzhang has joined #openstack-nova00:26
*** hamzy has joined #openstack-nova00:28
*** wolverineav has joined #openstack-nova00:28
*** wolverineav has quit IRC00:58
*** wolverineav has joined #openstack-nova01:02
*** wolverineav has quit IRC01:06
*** tetsuro_ has joined #openstack-nova01:10
*** alex_xu has quit IRC01:10
*** lbragstad_ has joined #openstack-nova01:10
*** tetsuro has quit IRC01:12
*** cfriesen has quit IRC01:12
*** lbragstad has quit IRC01:12
*** phasespace has quit IRC01:12
*** nicholas has quit IRC01:12
*** mtreinish has quit IRC01:12
*** toabctl has quit IRC01:12
*** spotz has quit IRC01:12
*** mtreinish has joined #openstack-nova01:13
openstackgerritYongli He proposed openstack/nova master: Clean up orphan instances virt driver  https://review.openstack.org/64891201:13
openstackgerritYongli He proposed openstack/nova master: Clean up orphan instances  https://review.openstack.org/62776501:13
*** bbowen__ has quit IRC01:16
*** wolverineav has joined #openstack-nova01:18
*** mmethot has quit IRC01:19
*** wolverineav has quit IRC01:22
*** dtantsur|afk has quit IRC01:24
*** dtantsur has joined #openstack-nova01:25
*** igordc has quit IRC01:29
*** wolverineav has joined #openstack-nova01:39
*** hongbin has joined #openstack-nova01:41
*** bbowen__ has joined #openstack-nova01:41
*** wolverineav has quit IRC01:43
*** tetsuro_ has quit IRC01:49
*** whoami-rajat has joined #openstack-nova01:57
*** mriedem has quit IRC02:07
*** spsurya has joined #openstack-nova02:12
*** BjoernT has joined #openstack-nova02:27
*** alex_xu has joined #openstack-nova02:32
*** BjoernT has quit IRC02:58
*** hongbin has quit IRC03:00
*** Sundar has quit IRC03:02
openstackgerritYongli He proposed openstack/nova master: Clean up orphan instances virt driver  https://review.openstack.org/64891203:04
openstackgerritYongli He proposed openstack/nova master: Clean up orphan instances  https://review.openstack.org/62776503:04
*** BjoernT has joined #openstack-nova03:05
*** BjoernT has quit IRC03:06
*** BjoernT has joined #openstack-nova03:09
*** lbragstad_ is now known as lbragstad03:14
*** psachin has joined #openstack-nova03:16
*** zhubx has joined #openstack-nova03:17
*** zhubx has quit IRC03:30
*** zhubx has joined #openstack-nova03:31
*** nicolasbock has quit IRC03:33
*** wolverineav has joined #openstack-nova03:40
*** wolverineav has quit IRC03:44
*** udesale has joined #openstack-nova04:00
*** brinzhang has quit IRC04:11
*** brinzhang has joined #openstack-nova04:11
*** igordc has joined #openstack-nova04:18
*** whoami-rajat has quit IRC04:57
*** gbarros has joined #openstack-nova05:02
*** BjoernT has quit IRC05:11
*** ratailor has joined #openstack-nova05:18
*** toabctl has joined #openstack-nova05:21
*** abhishekk has joined #openstack-nova05:26
*** gbarros has quit IRC05:27
*** lbragstad has quit IRC05:28
*** krypto has joined #openstack-nova05:51
*** aarora06 has joined #openstack-nova06:07
*** janki has joined #openstack-nova06:17
*** sridharg has joined #openstack-nova06:18
*** shilpasd has joined #openstack-nova06:20
*** slaweq has joined #openstack-nova06:26
*** pcaruana has joined #openstack-nova06:36
*** pcaruana has quit IRC06:38
*** pcaruana has joined #openstack-nova06:38
openstackgerritMerged openstack/nova master: Remove flavor id and name validation code  https://review.openstack.org/63815006:39
*** mdbooth_ has joined #openstack-nova06:40
*** mdbooth has quit IRC06:43
*** Cardoe has quit IRC06:47
*** Cardoe has joined #openstack-nova06:47
*** dpawlik has joined #openstack-nova06:47
*** krypto has quit IRC06:49
*** rpittau|afk is now known as rpittau06:50
*** krypto has joined #openstack-nova06:50
*** luksky has joined #openstack-nova06:50
*** phasespace has joined #openstack-nova06:52
*** tkajinam has quit IRC06:58
*** tkajinam has joined #openstack-nova07:03
*** tosky has joined #openstack-nova07:09
*** whoami-rajat has joined #openstack-nova07:13
*** awalende has joined #openstack-nova07:13
*** tesseract has joined #openstack-nova07:16
*** bbobrov has quit IRC07:17
*** bbobrov has joined #openstack-nova07:17
*** ralonsoh has joined #openstack-nova07:18
*** xek has joined #openstack-nova07:19
*** jistr is now known as jistr|afk07:22
*** tssurya has joined #openstack-nova07:32
*** helenafm has joined #openstack-nova07:42
awalendeI want to set the numatune memory mode to "preferred" in nova, how do I do that in Queens?07:44
*** igordc has quit IRC08:04
openstackgerritYongli He proposed openstack/nova master: Clean up orphan instances  https://review.openstack.org/62776508:06
openstackgerritKashyap Chamarthy proposed openstack/nova-specs master: Re-propose the spec to allow specifying a list of CPU models  https://review.openstack.org/64203008:11
openstackgerritTetsuro Nakamura proposed openstack/nova master: Add node_uuid field to Destination object  https://review.openstack.org/64953208:16
openstackgerritTetsuro Nakamura proposed openstack/nova master: Pass node uuid to new Destination.node_uuid  https://review.openstack.org/64953308:16
openstackgerritTetsuro Nakamura proposed openstack/nova master: Add in_tree field to RequestGroup object  https://review.openstack.org/64953408:16
openstackgerritTetsuro Nakamura proposed openstack/nova master: node_uuid from RequestSpec to ResourceRequest  https://review.openstack.org/64953508:16
openstackgerritLee Yarwood proposed openstack/nova master: Block swap volume on volumes with >1 rw attachment  https://review.openstack.org/57279008:16
*** ttsiouts has joined #openstack-nova08:18
*** ccamacho has joined #openstack-nova08:25
*** priteau has joined #openstack-nova08:29
*** takashin has left #openstack-nova08:29
*** abhishekk has quit IRC08:33
*** owalsh_ is now known as owalsh08:43
*** tkajinam has quit IRC08:45
*** wolverineav has joined #openstack-nova08:45
*** derekh has joined #openstack-nova08:46
*** luksky has quit IRC08:46
*** wolverineav has quit IRC08:50
*** luksky has joined #openstack-nova09:18
*** xek has quit IRC09:20
*** xek has joined #openstack-nova09:21
*** jangutter has quit IRC09:21
*** jangutter has joined #openstack-nova09:22
*** xek has quit IRC09:22
*** shilpasd has quit IRC09:22
openstackgerritMerged openstack/nova master: De-cruft compute manager live migration  https://review.openstack.org/64144909:36
*** whoami-rajat has quit IRC09:37
*** markvoelker has joined #openstack-nova09:55
*** sidx64_ has joined #openstack-nova10:02
*** sidx64_ has quit IRC10:18
openstackgerritStephen Finucane proposed openstack/nova master: Remove unreachable codepaths  https://review.openstack.org/64955910:19
*** sidx64 has joined #openstack-nova10:19
*** ttsiouts has quit IRC10:22
*** ttsiouts has joined #openstack-nova10:23
*** markvoelker has quit IRC10:27
*** ttsiouts has quit IRC10:27
openstackgerritStephen Finucane proposed openstack/nova master: doc: Group API versions by release  https://review.openstack.org/64956010:27
openstackgerritStephen Finucane proposed openstack/nova master: doc: Trivial fixes to API version history  https://review.openstack.org/64956110:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead code  https://review.openstack.org/64956210:27
openstackgerritStephen Finucane proposed openstack/nova master: hacking: Fix dodgy check  https://review.openstack.org/64956310:27
openstackgerritStephen Finucane proposed openstack/nova master: zvm: Remove dead code  https://review.openstack.org/64956410:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead 'ALIAS' constant  https://review.openstack.org/64956510:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead placement API functions  https://review.openstack.org/64956610:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove unused constants, functions  https://review.openstack.org/64956710:27
openstackgerritStephen Finucane proposed openstack/nova master: Use '_' for unused variables  https://review.openstack.org/64956810:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead resource tracker code  https://review.openstack.org/64956910:27
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead nova.db functions  https://review.openstack.org/64957010:28
stephenfinawalende: You can't. We don't expose that10:28
stephenfinawalende: It was planned but never implemented10:28
*** nicolasbock has joined #openstack-nova10:33
*** Dinesh_Bhor has quit IRC10:34
*** erlon has joined #openstack-nova10:38
openstackgerritKashyap Chamarthy proposed openstack/nova-specs master: Add Secure Boot support for KVM- and QEMU-based guests  https://review.openstack.org/50672010:39
* kashyap hates the dangling hyphen. But is there a better way to put it?10:39
sean-k-mooneyyes just remove the hyphens10:40
sean-k-mooneyfor kvm and qemu based guests10:41
sean-k-mooneyyou can also remove based10:41
NewBrucehowdy sean-k-mooney10:41
sean-k-mooneyjust looking at your proposal i don't see any reference to the use of the flavor extra specs for controlling secure boot10:42
sean-k-mooneyNewBruce: hi10:42
NewBrucewas a bit like christmas last night, waiting to see how the tests would run :)10:42
NewBruceshame it didn’t blow up, huh?! ;)10:42
sean-k-mooneyya i had the tab open for a while but needed to go sleep10:42
sean-k-mooneythere are some other things we could try, like forcing a different compute level on either node10:43
NewBruceyep, same - we’re gonna roll out rocky across the full compute today and see what that does10:43
NewBruce…. well, I was going to, until a net node decided to fall over due to excessive CPU usage in netlink10:44
kashyapsean-k-mooney: But the sentence is syntactically correct.  (I read about dangling hyphens enough that the above is correct, but ugly :D)10:44
sean-k-mooneythe grenade job should have been testing queens to rocky migration i think10:44
NewBruceever seen this?10:44
NewBruce2019-04-03T10:30:28.283Z|2158410|poll_loop(handler44)|INFO|Dropped 3426304 log messages in last 6 seconds (most recently, 0 seconds ago) due to excessive rate10:44
NewBruce2019-04-03T10:30:28.283Z|2158411|poll_loop(handler44)|INFO|wakeup due to [POLLIN] on fd 33 (unknown anon_inode:[eventpoll]) at ../lib/dpif-netlink.c:2786 (99% CPU usage)10:44
sean-k-mooneyactually no it would have been rocky to master10:44
sean-k-mooneyis that from ovs?10:44
NewBruceyup10:45
NewBruceAbsolutely nothing on the machine; freshly rebooted10:45
NewBruce  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND10:45
NewBruce 1979 root      10 -10 1000172 135296  10912 S 799.3  0.2   1136:19 ovs-vswitchd10:45
kashyapsean-k-mooney: Hmm, I mentioned the metadata property (`os_secure_boot`), but didn't write anything about flavor extra_specs10:45
NewBruce… ouch; seems like a FD leak perhaps10:45
kashyapsean-k-mooney: I'm ill a bit; will be taking the noon off.  Will respond to feedback tomm10:45
sean-k-mooneykashyap: take it easy. i'll see if i can find the relevant place to add it and do a full review10:46
sean-k-mooneyopenstack flavor set FLAVOR-NAME \10:46
sean-k-mooney    --property os:secure_boot=SECURE_BOOT_OPTION10:46
*** wolverineav has joined #openstack-nova10:46
sean-k-mooneyvalues are required|optional|disabled10:47
sean-k-mooneyNewBruce: perhaps it could be just ovs trying frantically to reconnect to all the tap devices for the vms or something10:48
sean-k-mooneyi have seen that message or one like it when ovs has been under heavy load but i don't recall it on startup10:49
kashyapsean-k-mooney: Yeah, makes sense.  Please add your comment on the review.  And thanks for the quick feedback10:49
kashyapAppreciate your time10:49
sean-k-mooneykashyap: it's just a mirror of the image property, but it is used so that operators can use the aggregate filter to ensure that only instances with secure boot enabled can be scheduled to a host aggregate, for increased security10:50
sean-k-mooneyat least in hyperv land where this works today10:50
NewBrucelooks like its got itself into a loop waiting for netlink and just stuck there10:51
*** wolverineav has quit IRC10:51
*** tbachman has quit IRC10:51
kashyapsean-k-mooney: I see.  Noted.10:52
kashyapsean-k-mooney: The current problem for KVM/QEMU guests, upstream is that it gives a bogus sense of "secure boot" :-(10:52
kashyapBecause currently in upstream Nova (a) no way to configure "SMM"; (b) no way to specify NVRAM file with enrolled keys; (c) auto-select the _right_ OVMF binary10:53
kashyapAnyway, more discussion later :-)10:53
*** mvkr has quit IRC11:01
*** sidx64 has quit IRC11:01
*** jistr|afk is now known as jistr11:02
*** sidx64_ has joined #openstack-nova11:02
*** helenafm has quit IRC11:04
*** ttsiouts has joined #openstack-nova11:08
*** hoonetorg has joined #openstack-nova11:16
*** markvoelker has joined #openstack-nova11:24
*** brinzhang has quit IRC11:24
*** udesale has quit IRC11:26
*** dikonoor has joined #openstack-nova11:30
openstackgerritsean mooney proposed openstack/nova stable/ocata: PCI: do not force remove allocated devices  https://review.openstack.org/63507511:30
*** liuyulong|away is now known as liuyulong11:30
*** sidx64_ has quit IRC11:32
*** ratailor has quit IRC11:36
*** rcernin has quit IRC11:38
*** aarora06 has quit IRC11:45
openstackgerritMerged openstack/nova master: Docs: emulator threads: clarify expected behavior  https://review.openstack.org/64941611:47
*** yan0s has joined #openstack-nova11:52
*** mvkr has joined #openstack-nova11:54
*** artom has quit IRC11:56
*** markvoelker has quit IRC11:57
openstackgerritTetsuro Nakamura proposed openstack/nova master: Add node_uuid field to Destination object  https://review.openstack.org/64953212:05
openstackgerritTetsuro Nakamura proposed openstack/nova master: Pass node uuid to new Destination.node_uuid  https://review.openstack.org/64953312:05
openstackgerritTetsuro Nakamura proposed openstack/nova master: Add in_tree field to RequestGroup object  https://review.openstack.org/64953412:05
openstackgerritTetsuro Nakamura proposed openstack/nova master: node_uuid from RequestSpec to ResourceRequest  https://review.openstack.org/64953512:05
*** shilpasd has joined #openstack-nova12:06
shilpasdhi: observed that in multinode setup, instance data files created at all nodes (controller + compute in my case) at 'instances_path', why is this so?12:08
openstackgerritMerged openstack/nova master: Style corrections for privsep usage.  https://review.openstack.org/64861512:09
openstackgerritMerged openstack/nova master: Remove mox in unit/network/test_neutronv2.py (4)  https://review.openstack.org/57410612:09
openstackgerritMerged openstack/nova master: Remove mox in unit/network/test_neutronv2.py (5)  https://review.openstack.org/57411012:10
*** tbachman has joined #openstack-nova12:12
*** odyssey4me has quit IRC12:12
sean-k-mooneyo/12:18
sean-k-mooneyare there any known gate issues related to installing hacking package12:19
sean-k-mooneyi think the failure i'm seeing was just an intermittent network issue but12:19
sean-k-mooneybefore i recheck https://review.openstack.org/#/c/649409/ i just said i would ask12:19
*** whoami-rajat has joined #openstack-nova12:19
sean-k-mooneyi'll just recheck as everything else passed12:21
sean-k-mooneymelwitt: when https://review.openstack.org/#/c/649409/ merges ill cherry pick it back to stable/stein12:22
*** artom has joined #openstack-nova12:25
sean-k-mooneytonyb: speaking of stable backport i fixed the indentation in https://review.openstack.org/#/c/635075/2..312:25
*** odyssey4me has joined #openstack-nova12:26
*** eharney has quit IRC12:33
*** dtantsur is now known as dtantsur|brb12:36
*** priteau has quit IRC12:36
*** dtantsur|brb is now known as dtantsur12:36
*** mmethot has joined #openstack-nova12:36
*** janki has quit IRC12:38
*** priteau has joined #openstack-nova12:38
*** sridharg has quit IRC12:38
efriedmriedem: Right (forbidden traits never worked)12:38
*** helenafm has joined #openstack-nova12:39
alex_xueandersson: yea, tags are better for just labeling instances, and searching http://specs.openstack.org/openstack/nova-specs/specs/newton/implemented/tag-instances.html12:39
sean-k-mooneyefried: it worked in a flavor right just not in an image?12:43
sean-k-mooneyefried: or is it just totally borked for everything on the nova sise12:43
sean-k-mooneyside12:43
efriedsean-k-mooney: it didn't work in flavor, but it was supposed to.12:43
efriedsean-k-mooney: it doesn't work in image, but we never implemented that, so that part's not surprising.12:44
sean-k-mooneyoh ok so it only works at the placement api level12:44
efriedright12:44
efriedwell, now it works from flavor too.12:44
sean-k-mooneyah ok12:44
efriedbecause fixed12:44
sean-k-mooney:)12:44
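For context on the exchange above: a flavor marks a trait as forbidden with an extra spec of the form trait:<NAME>=forbidden, and placement expresses a forbidden trait in its "required" query parameter with a "!" prefix. The following is a minimal sketch of that translation, assuming a plain dict of extra specs. The function name required_param and the sample trait names are hypothetical, and this is not nova's actual code.

```python
def required_param(extra_specs):
    """Build a placement-style 'required' value from flavor extra specs.

    Hypothetical helper for illustration only; nova's real translation
    lives elsewhere and handles more cases (e.g. numbered groups).
    """
    parts = []
    for key, value in sorted(extra_specs.items()):
        if not key.startswith("trait:"):
            continue  # not a trait extra spec, ignore it
        name = key[len("trait:"):]
        if value == "required":
            parts.append(name)
        elif value == "forbidden":
            # Forbidden traits are written with a leading '!'
            parts.append("!" + name)
    return ",".join(parts)


specs = {
    "trait:HW_CPU_X86_AVX2": "required",
    "trait:CUSTOM_WINDOWS_LICENSED": "forbidden",
    "hw:cpu_policy": "dedicated",
}
print(required_param(specs))
```

The bug discussed here was on the nova side of this path: placement already understood the "!" form, but forbidden values from the flavor were not making it into the request.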
efriedprobably stein backport-worthy.12:44
*** priteau has quit IRC12:44
efriedbut probably not RC-worthy12:44
efriednot sure12:45
sean-k-mooneyit was "introduced" in rocky right?12:45
* efried looks...12:45
sean-k-mooneyyep https://github.com/openstack/nova-specs/blob/master/specs/rocky/implemented/placement-forbidden-traits.rst12:45
efriedsean-k-mooney: The placement side was a year ago.12:45
efriedtrying to find the nova side12:46
sean-k-mooneyi think it was all covered by the same spec for rocky12:47
shilpasdefried: observed that in multinode setup, instance data files created at all nodes (controller + compute in my case) at 'instances_path', is it correct behavior?12:47
*** wolverineav has joined #openstack-nova12:47
efriedsean-k-mooney: https://blueprints.launchpad.net/nova/+spec/forbidden-traits-in-nova different spec12:49
efriedshilpasd: sounds like a libvirt question. sean-k-mooney or stephenfin, y'all have that answer off the top?12:49
sean-k-mooneyefried: assuming this was added in rocky then it's not an rc candidate since the regression was not in stein, but i would argue the fix should be backported to both stable/stein and stable/rocky as it was just a bug in the original implementation and not a new feature12:50
*** gbarros has joined #openstack-nova12:50
sean-k-mooneyefried: shilpasd i did not answer because i did not know.12:50
sean-k-mooneyif by instance data path you mean under /var/lib/libvirt* then yes12:50
efriedsean-k-mooney: https://review.openstack.org/#/c/561677/ <== rocky. So yeah.12:51
sean-k-mooneyi think that is correct12:51
shilpasdefried: sean-k-mooney: okay, will check with stephenfin, 'instances_path = /opt/stack/data/nova/instances'12:51
openstackgerritMatthew Booth proposed openstack/nova master: systemd detection result caching nit fixes  https://review.openstack.org/64922912:51
sean-k-mooneyefried: so ya not an RC candidate but assuming the fix is non invasive it should be a backport candidate12:51
sean-k-mooneyshilpasd: is this a devstack setup12:52
*** wolverineav has quit IRC12:52
shilpasdsean-k-mooney: yes12:53
sean-k-mooneyif so the /opt/stack/data directory is where we store images and cinder volumes and other persistent data from the openstack services12:53
sean-k-mooneyso i would not be surprised if the instance root disk is also stored there12:53
sean-k-mooneyi have never really checked however.12:54
shilpasdsean-k-mooney: but in multinode case, it was stored at both node in my case, at controller side and at compute side12:54
sean-k-mooneyshilpasd: it won't be stored in both locations, it will be copied if you do a migration12:55
sean-k-mooneye.g. we won't create the instance root disk on all nodes in the multinode devstack cloud12:55
openstackgerritEric Fried proposed openstack/nova stable/stein: Adding tests to demonstrate bug #1821824  https://review.openstack.org/64960012:56
openstackbug 1821824 in OpenStack Compute (nova) stein "Forbidden traits in flavor properties don't work" [High,Confirmed] https://launchpad.net/bugs/182182412:56
shilpasdsean-k-mooney: yes, but before migration i am verifying 'instances_path' and observed that instance files are already there12:56
openstackgerritEric Fried proposed openstack/nova stable/stein: Fix bug preventing forbidden traits from working  https://review.openstack.org/64960112:56
*** lbragstad has joined #openstack-nova12:56
shilpasdsean-k-mooney: IMO it should not be the case, as you said during migration it should copy12:56
sean-k-mooneyif you are doing a block migration it will copy them yes12:56
*** jmlowe has quit IRC12:57
sean-k-mooneyyou can also mount /opt/stack/data/nova on nfs to avoid this but you would have to do that manually12:57
openstackgerritEric Fried proposed openstack/nova stable/rocky: Adding tests to demonstrate bug #1821824  https://review.openstack.org/64960212:57
openstackbug 1821824 in OpenStack Compute (nova) stein "Forbidden traits in flavor properties don't work" [High,In progress] https://launchpad.net/bugs/1821824 - Assigned to Eric Fried (efried)12:57
openstackgerritEric Fried proposed openstack/nova stable/rocky: Fix bug preventing forbidden traits from working  https://review.openstack.org/64960312:57
sean-k-mooneyshilpasd: are you trying to debug something locally or in the gate12:58
sean-k-mooneyor  just understand how it works12:58
shilpasdsean-k-mooney: tried with NFS + multinode, there saw instance data file created at all 3 places in my case, at NFS shared location, at controller and at compute node.12:58
sean-k-mooneyah it's not created at all 3 locations12:58
sean-k-mooneythe data is stored on the nfs share and it's just mounted on the controller and compute12:59
sean-k-mooneythere is one copy of the data but it's accessible via nfs from multiple locations12:59
shilpasdsean-k-mooney: yes, I wondered too, and here during evacuation getting error 'libvirtError: internal error: process exited while connecting to monitor.', 'ERROR oslo_messaging.rpc.server Is another process using the image?'12:59
sean-k-mooneyshilpasd: what version of openstack are you deploying13:00
*** gaoyan has joined #openstack-nova13:00
shilpasdsean-k-mooney: stein13:00
sean-k-mooneywhen we do an evacuate we assume the guest on the down node is stopped13:01
sean-k-mooneyif its not its not safe to evacuate13:01
shilpasdopenstack 3.18.013:01
sean-k-mooneydid you stop the vm?13:01
shilpasdpurposefully stopping n-cpu by service-force-down13:01
sean-k-mooneythat will not stop the vm13:02
shilpasdyes but evacuation happens in this case also13:02
sean-k-mooneyyes, and if you use force down you are required to check that the vms are stopped on the host13:02
sean-k-mooneyforce down was added specifically for operators that had external monitoring that could determine the host had failed and would stop all running guests before setting it13:03
shilpasdok, will check that, but IMO error i am getting is because of same data files already available and during evacuation the said error ''ERROR oslo_messaging.rpc.server Is another process using the image?''13:04
sean-k-mooneyhttps://specs.openstack.org/openstack/nova-specs/specs/liberty/implemented/mark-host-down.html#use-cases13:04
shilpasdok, will check after VM down, but original question remains, why are instance data files created at all 3 places NFS, controller and compute?13:05
shilpasdis it correct behaviour?13:05
sean-k-mooneyyes13:05
sean-k-mooneyassuming that the instance directory is on nfs13:05
sean-k-mooneywhat does mount show on the controller or compute node13:05
sean-k-mooneywe have special logic that detects if the instance directory is on nfs and skips copying the disk in that case13:06
shilpasdsame shared location13:06
shilpasdcan you please help me where exactly this logic is13:06
shilpasdi have checked def _create_image() of libvirt13:07
openstackgerritJared Winborne proposed openstack/nova master: Leave brackets on Ceph IP addresses for libguestfs  https://review.openstack.org/64940513:08
*** Alon_KS has joined #openstack-nova13:09
*** cdent has joined #openstack-nova13:09
sean-k-mooneyshilpasd: https://github.com/openstack/nova/blob/1554d35834a474514f827449bd7d4f1d2f0af1d6/nova/virt/libvirt/driver.py#L6611-L664113:10
*** mriedem has joined #openstack-nova13:10
sean-k-mooneyshilpasd: the issue here however is not related to this check.13:11
KH-JaredI'm just going to blame my long lines on trusting my IDE too much. Its pep8 warning was at 120 characters instead of 79, whoops13:11
sean-k-mooneyif you use force host down, you are required to ensure the vm is not running before you call evacuate. that is the cause of the error you hit13:12
*** amodi has joined #openstack-nova13:12
shilpasdsean-k-mooney: yes, thanks for this input, will check with VM down, and will debug more the given reference code, and get back to you on the same later13:13
sean-k-mooneyKH-Jared: ya the 79 column limit is particularly annoying because no ide defaults to 79. some default to 80 but even pycharm does not default to 79 and it's a python focused ide13:13
KH-JaredI can get behind it though, the code has always been extremely nice to read, I assume it adds to that. I also have it fixed in my ide now, so hopefully won't see a pep8 failure again in the future13:15
sean-k-mooneyi have been working on openstack for the better part of 6 years and i still don't write pep8 compliant code by default13:16
sean-k-mooneybut i run fast8 on it most of the time before i push13:16
sean-k-mooneyKH-Jared: you are aware of the fast8 env13:16
sean-k-mooneytox -e fast813:16
KH-Jaredi am now13:16
sean-k-mooneyit runs pep8 but just on your patched files13:17
sean-k-mooneynot on all of nova13:17
sean-k-mooneyis way fater13:17
artomsean-k-mooney, whoa, that exists?13:17
sean-k-mooney*faster13:17
sean-k-mooneyartom: yes...13:17
artomI've been hacking something like it with $(tox -e pep8 `git show --name-only | grep ^nova`)13:17
sean-k-mooneyartom: stephenfin added it in pike13:18
artom...13:18
KH-Jaredthis was my first change on a large project, not even just openstack, all testing I've done has been much smaller by comparison. I'm going to be using flake8 more often at least13:18
artom\o/13:18
sean-k-mooneyits also in a few other repos at this point13:18
efriedjaypipes: Do we still have one RT per ironic node?13:18
*** wolverineav has joined #openstack-nova13:19
mriedemgibi_off: do you know why this was made into a warning? https://github.com/openstack/nova/blob/b33fa1c054ba4b7d4e789aa51250ad5c8325da2d/nova/scheduler/client/report.py#L1880 we hit that a lot in normal resizes: https://bugs.launchpad.net/nova/+bug/182291713:19
openstackLaunchpad bug 1822917 in OpenStack Compute (nova) ""Overwriting current allocation" warnings in logs during move operations although there are no failures" [Undecided,In progress] - Assigned to Takashi NATSUME (natsume-takashi)13:19
mriedemefried: there is one RT per nova-compute service,13:20
mriedemand the RT has a dict of compute nodes13:20
*** erlon has quit IRC13:20
mriedemhttps://github.com/openstack/nova/blob/master/nova/compute/manager.py#L54413:21
mriedemhttps://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L13913:21
efriedmriedem: I'm looking at stephenfin's change https://review.openstack.org/#/c/649559/ which seems sane in itself, but https://review.openstack.org/#/c/649559/1/nova/compute/resource_tracker.py@605 is only being run through once, for the "first" node.13:21
efriedit's probably moot by luck because we don't track PCI devices on ironic nodes (right?)13:22
mriedemprobably13:23
*** wolverineav has quit IRC13:23
jaypipesefried: no, not for years.13:25
efriedswhat I thought, see above13:25
jaypipesk, will look.13:26
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead resource tracker code  https://review.openstack.org/64956913:29
*** ricolin has joined #openstack-nova13:30
*** dklyle has quit IRC13:30
sean-k-mooneystephenfin: regarding ^ this conflicts with some other changes that might be starting to use some of that dead code13:30
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead nova.db functions  https://review.openstack.org/64957013:31
sean-k-mooneyim thinking about https://review.openstack.org/#/q/topic:bug/1809095+(status:open+OR+status:merged)13:31
*** dklyle has joined #openstack-nova13:31
*** elod_off has quit IRC13:31
*** elod_off has joined #openstack-nova13:32
stephenfinsean-k-mooney: Possibly. If so, let me know. I was just using vulture (https://pypi.org/project/vulture/) to figure that stuff out so there's a lot of missing context13:32
*** gbarros has quit IRC13:32
sean-k-mooneystephenfin: actually no never mind13:32
sean-k-mooneythey are not13:32
sean-k-mooneyit's just a conflict in the tests13:32
sean-k-mooneystephenfin: we might have some functions that were added for the sriov migration code too13:33
sean-k-mooneywe merged all the resource tracker code but have not merged the 2 patches that use it13:33
openstackgerritMatt Riedemann proposed openstack/nova stable/stein: Error out migration when confirm_resize fails  https://review.openstack.org/64942113:33
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead nova.db functions  https://review.openstack.org/64957013:36
mriedemsean-k-mooney: are you going to cherry pick https://review.openstack.org/#/c/649409/ to stable/stein now?13:36
sean-k-mooneyspeaking of https://review.openstack.org/#/q/topic:bp/libvirt-neutron-sriov-livemigration+(status:open) if people have time to review that again it would be nice to see that merged now that train is open13:36
mriedemefried: we're doing an rc2 for https://review.openstack.org/#/c/649409/ right?13:36
gibi_offmriedem: I didn't find the specific reason for the warning in move_allocation so I guess I added it because I thought that it should not really happen that we overwrite an existing allocation during a move13:36
sean-k-mooneyim waiting for zuul to merge it13:36
sean-k-mooneybut yes ill cherry pick it once that is done13:37
mriedemgibi_off: it's there because of https://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/migration-allocations.html13:37
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove unused constants, functions  https://review.openstack.org/64956713:37
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead resource tracker code  https://review.openstack.org/64956913:37
openstackgerritStephen Finucane proposed openstack/nova master: trivial: Remove dead nova.db functions  https://review.openstack.org/64957013:37
sean-k-mooney... it's going to fail again because of a failure in lower constraints13:38
stephenfinmriedem, gibi_off: Have reshuffled those around to remove the placement changes and fix an issue with the last patch, FYI13:38
sean-k-mooneyhttp://logs.openstack.org/09/649409/3/check/openstack-tox-lower-constraints/1a8274d/testr_results.html.gz13:38
mriedemi assume that's related to http://status.openstack.org/elastic-recheck/#1793364 but the signature on that query is old13:40
gibi_offmriedem: so during revert resize/migrate the target allocation is non empty as we keep the allocation on both the source and the target host of the migration13:40
*** awaugama has joined #openstack-nova13:40
mriedemduring revert we should drop the target node allocations, held by the instance consumer, and move the source node allocations, held by the migration consumer, to the instance consumer13:41
sean-k-mooneymriedem: well the patch to fix our lower constraints job just landed so maybe this is a bug that was fixed in later versions of sqlalchemy/pymysql13:41
mriedemso after revert there are no allocations for the instance on the target node and the instance has the source node allocations again13:41
gibi_offmriedem: yeah I mixed it up, it is not moving allocations between hosts, it moves them between consumers13:42
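The revert behaviour described above (drop the instance consumer's target-node allocations, then hand the migration consumer's source-node allocations back to the instance) can be sketched with a toy in-memory model. This is only an illustration: the dict layout, the function name and the consumer/node names are assumptions, not nova's or placement's actual API.

```python
# Toy model (hypothetical): allocations keyed by consumer, then by node.
def revert_resize_allocations(allocations, instance_uuid, migration_uuid):
    """Return new allocations after a revert, moving between consumers."""
    result = {consumer: dict(allocs) for consumer, allocs in allocations.items()}
    # The instance consumer holds the target node allocations: drop them.
    result.pop(instance_uuid, None)
    # The migration consumer holds the source node allocations: move them
    # to the instance consumer, leaving the migration consumer empty.
    result[instance_uuid] = result.pop(migration_uuid, {})
    return result


allocations = {
    "instance-1": {"target-node": {"VCPU": 2, "MEMORY_MB": 2048}},
    "migration-1": {"source-node": {"VCPU": 2, "MEMORY_MB": 2048}},
}
after = revert_resize_allocations(allocations, "instance-1", "migration-1")
```

After the call, the instance only holds allocations on the source node and the migration consumer holds nothing, which matches the end state mriedem describes.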
mriedemdansmith: sounds like we might be doing an RC2 for stein, and bug 1715374 is latent, but do you think it would be worth putting out a known issue release note for stein anyway?13:42
openstackbug 1715374 in OpenStack Compute (nova) "Reloading compute with SIGHUP prevents instances from booting" [High,In progress] https://launchpad.net/bugs/1715374 - Assigned to Ralf Haferkamp (rhafer)13:42
dansmithmriedem: I dunno, it's been this way for a long time apparently13:42
dansmithso doesn't really seem like it13:43
gibi_offmriedem: anyhow I think the warning can be deleted13:43
mriedemgibi_off: or at least dropped to debug13:43
*** ttsiouts has quit IRC13:43
gibi_offyeah13:44
mriedemdansmith: just thinking about our upgrade docs and such that mention to use sighup during an upgrade https://docs.openstack.org/nova/stein/user/upgrade.html?highlight=sighup#concepts13:45
*** ttsiouts has joined #openstack-nova13:45
dansmithwell, I know13:45
mriedemmaybe that should have a note instead13:45
dansmithyeah, that would make more sense I think13:46
mriedemand https://docs.openstack.org/nova/stein/configuration/config.html?highlight=sighup#compute.resource_provider_association_refresh for efried13:46
mriedem^ might work for that option, but then kills the event listening stuff so you can't boot a server right?13:46
mriedemneutron events i mean13:46
mriedemblech we have stuff in here too https://docs.openstack.org/nova/stein/admin/configuration/schedulers.html?highlight=sighup#compute-capabilities-as-traits13:48
*** jmlowe has joined #openstack-nova13:49
sean-k-mooneyso apparently rechecking a running job does not work so i can either rebase to head of master if people feel like +w again or i can recheck in 50mins when zuul finishes its current run13:49
mriedemin other fun news, i learned last night that with cross-cell resize, we have to do a hard delete of the instance in a cell db rather than a soft delete13:49
mriedemsean-k-mooney: just rebase13:50
openstackgerritsean mooney proposed openstack/nova master: Libvirt: gracefully handle non-nic VFs  https://review.openstack.org/64940913:50
sean-k-mooneydone that will retrigger the jobs and it just needs +w. im going to grab lunch before a meeting so brb13:51
mriedemso i wonder if we could recreate this sighup issue in one of our post-test hook scripts, we'd just sighup the local compute service and then create a server which should timeout waiting for the network-vif-plugged callback right?13:54
dansmithno,13:54
dansmithyou have to have a server in the middle of creating when you sighup I think13:55
*** oanson has quit IRC13:55
*** eharney has joined #openstack-nova13:55
dansmithbefore the event comes in13:55
dansmiththe other option is probably just to pick another signal as a stop-gap, register for it ourselves, and wire it up to our existing handlers13:55
dansmithbut that's even more icky for rc213:55
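The stop-gap dansmith mentions (pick another signal, register for it ourselves, wire it to the existing handler) could look roughly like the sketch below. The choice of SIGUSR1, the FakeManager class and register_reload_signal are assumptions for illustration only; nothing here is taken from nova or oslo.service.

```python
import signal


class FakeManager:
    """Hypothetical stand-in for a manager that exposes reset()."""

    def __init__(self):
        self.reset_calls = 0

    def reset(self):
        self.reset_calls += 1


def register_reload_signal(manager, signum=signal.SIGUSR1):
    """Call manager.reset() whenever signum is delivered to this process."""

    def _on_signal(received_signum, frame):
        manager.reset()

    # signal.signal() must be called from the main thread.
    signal.signal(signum, _on_signal)
    return _on_signal


manager = FakeManager()
handler = register_reload_signal(manager)
```

With this in place, sending SIGUSR1 to the process would run reset() without going through the SIGHUP restart path at all.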
mriedemon a new server create after the sighup, won't the _events dict be None and when registering the callback we'd hit this? https://review.openstack.org/#/c/420026/7/nova/compute/manager.py@31613:56
*** BjoernT has joined #openstack-nova13:56
*** udesale has joined #openstack-nova13:56
dansmithI don't think so because we'll be re-created at the point before we get there13:56
*** BjoernT has quit IRC13:56
dansmithI think it only happens if we end up with a server in the middle of that while the event comes in, but once the sighup has finished the full restart, the new manager is hooked to new rpc connections, etc13:57
dansmithit doesn't completely bork the server forever, AFAIK13:57
dansmithelse it would be really obvious that it's totally broken13:57
openstackgerritStephen Finucane proposed openstack/nova master: Remove unreachable codepaths  https://review.openstack.org/64955913:58
mriedemok i thought the service was fubar after the sighup13:59
mriedemso this is less severe than i thought13:59
bauzasany urgent reviews I should do before RC2 tagging ?14:01
*** mlavalle has joined #openstack-nova14:01
dansmithnot that I know of14:01
tridentmriedem: Regarding https://review.openstack.org/#/c/648653/14:01
bauzashttps://etherpad.openstack.org/p/nova-stein-rc-potential is pretty done14:01
*** igordc has joined #openstack-nova14:01
tridentmriedem: You are correct, forbidden traits never worked unless there was also a required trait in the same flavor. If there was, both would be used, if not, the forbidden trait was lost.14:02
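The failure mode trident describes can be illustrated with a small sketch. It assumes traits end up in a single comma-separated `required` query value with forbidden traits prefixed by `!`; both function names are hypothetical and this is not the actual nova code, only a model of the described behaviour.

```python
def build_required_param_buggy(required, forbidden):
    """Models the bug: forbidden traits are only emitted alongside required ones."""
    if required:
        terms = sorted(required) + ["!" + trait for trait in sorted(forbidden)]
        return ",".join(terms)
    return None  # forbidden traits silently lost here


def build_required_param(required, forbidden):
    """Emit the parameter whenever either kind of trait is present."""
    terms = sorted(required) + ["!" + trait for trait in sorted(forbidden)]
    return ",".join(terms) if terms else None


lost = build_required_param_buggy(set(), {"CUSTOM_B"})
kept = build_required_param(set(), {"CUSTOM_B"})
```

In the buggy variant a flavor with only a forbidden trait yields no filter at all, which is why the problem stayed hidden whenever a required trait was also set.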
*** awalende has quit IRC14:03
bauzasmriedem: https://review.openstack.org/#/c/649454/ wanting it for RC2 ? I can vote on my own change given melwitt proposed it14:03
bauzasI think it should clarify our Stein docs14:03
*** awalende has joined #openstack-nova14:03
bauzaseven if we don't honestly branch them14:03
mriedembauzas: i'm waiting to hear that yes we're doing an rc214:06
mriedemand then i'd like to step through with the ptl which stein backports are going to go into it14:06
*** BjoernT has joined #openstack-nova14:06
bauzassoooooo... efried?14:06
mriedemthere is a 50% chance efried is the PTL coordinating the stein RC2 :)14:07
*** awalende_ has joined #openstack-nova14:07
*** awalende has quit IRC14:08
bauzasoh, right, it's still melwitt's point :p14:08
efriedhi, sorry, catching up.14:09
mriedembauzas: you could start by backporting https://review.openstack.org/#/c/649409/ to stable/stein14:10
bauzasmriedem: ok, I can do it14:10
efriedI think we need an RC2 for sure for https://review.openstack.org/#/c/649409/ at least14:10
efriedand if we're doing it, we might as well put the docs in, no risk there.14:10
mriedemhttps://review.openstack.org/#/c/649454/ specifically yeah?14:11
*** awalende_ has quit IRC14:11
efriedyes14:12
efriedmriedem: wanna +A that one please?14:13
* bauzas would love the Gerrit cherry-pick button to amend the commit msg like -x 14:13
mriedemdone14:14
efriedmriedem: Has anyone done docs on the SIGHUP thing?14:14
efriedL5614:14
mriedemno14:14
mriedemi was just creating a devstack env to try and recreate the bug14:14
efriedmriedem: I can go through and add a quick14:15
efried.. note:: SIGHUP is broken, see `bug 1715374`_14:15
openstackbug 1715374 in OpenStack Compute (nova) "Reloading compute with SIGHUP prevents instances from booting" [High,In progress] https://launchpad.net/bugs/1715374 - Assigned to Ralf Haferkamp (rhafer)14:15
efriedto all the docs where it's mentioned if you like.14:15
*** dklyle has quit IRC14:15
mriedemwell, if the nuance is it's broken for a window while servers are being created and waiting for an event when the sighup runs, that gets a bit hard to communicate in all the places we have sighup mentioned in the docs14:16
efried.. note:: SIGHUP behavior is questionable, see `bug 1715374`_ if you try it and things go wobbly.14:17
openstackbug 1715374 in OpenStack Compute (nova) "Reloading compute with SIGHUP prevents instances from booting" [High,In progress] https://launchpad.net/bugs/1715374 - Assigned to Ralf Haferkamp (rhafer)14:17
efried?14:17
mriedemidk, could give a more detailed explanation of the bug in the reset() portion of https://docs.openstack.org/nova/latest/reference/services.html#the-nova-manager-module for just the compute service14:18
mriedemi wouldn't want to mention a bug in 5 places just to have to remember after the bug is fixed to go back and remove all of those 5 places14:19
efriedas you wish14:20
openstackgerritSylvain Bauza proposed openstack/nova stable/stein: Libvirt: gracefully handle non-nic VFs  https://review.openstack.org/64963014:21
bauzasmriedem: ^14:21
*** cdent has quit IRC14:24
*** cdent has joined #openstack-nova14:26
mriedemis there anything else we want to consider? https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/stein14:27
bauzasmriedem: https://review.openstack.org/#/c/647310/ is controversial for RC214:29
dansmithI thought we said no to that?14:31
dansmithand sounds like lyarwood is on board with the original plan too14:31
openstackgerritHelena proposed openstack/nova-specs master: Spec for a new nova virt driver to manage an RSD, composable infrastructure deployment  https://review.openstack.org/64866514:32
lyarwoodyarp, missed that it had already been discussed in the change once I got back online yesterday. gertty--14:32
lyarwoodDoes anyone have any idea why nova-stable-maint already has +2 on stable/stein btw?14:34
mriedemi'm not sure if the release team distinguishes anymore14:34
mriedemsmcginnis: ^?14:34
*** awalende has joined #openstack-nova14:35
jaypipesmriedem, dansmith: for online data migrations, did we envision those migration routines living forever or are we envisioning being able to delete some over time?14:36
mriedemjaypipes: we already have deleted some over time14:36
dansmithjaypipes: we have deleted many14:36
jaypipesoh, ok.14:36
mriedemmost recent memory is i removed the request spec and flavor migration ones14:36
dansmithwe're not super good at setting that timer and then doing it, like any other cleanup14:37
jaypipesmriedem, dansmith: after looking into mnaser's troubles, I think the removing the keypairs migration entirely might be the best solution.14:37
dansmithbut most of them should be relatively no-op-ish if done properly14:37
jaypipesit's Newton-era14:37
jaypipesdansmith: unfortunately that one does a full table scan across instance_extra each time it's run.14:37
mriedemjaypipes: yeah i started looking at that the other day, but i don't think we have a blocker migration or anything in place to make sure the migration has completed before we rip that out14:37
jaypipesdansmith: looking for WHERE keypairs=NULL14:37
mriedemlike we did for flavors14:37
dansmithmriedem: we might not be able to for that one14:38
jaypipesdansmith: and for large DBs like mnaser's 3M+ instance_extra records, it really is a resource hog.14:38
dansmithmriedem: but since it's so old, if it hadn't finished, we probably know that other things would be broken14:38
mriedemif there are keypairs in the cell db wouldn't that be an indication?14:38
dansmithjaypipes: I hear you14:38
mriedemfor the request spec migration, we didn't have a blocker migration, just a nova-status upgrade check: https://github.com/openstack/nova/commit/ed4fe3ead62c09ec7de7b6a11072295a99997b4f#diff-91e852cb498abb50ca653ef6418bd65a14:38
mriedemoh i added the upgrade check in rocky, dropped the reqspec migration code in stein14:39
dansmithjaypipes: I don't agree that going back to putting this inside the schema migration would make people happy, as his cloud would still be down while he waits for that to happen, but.. you know :)14:39
*** awalende has quit IRC14:39
dansmithjaypipes: hopefully you just meant some sort of version-like sentinel instead of scanning to determine if things needed to be done :)14:40
jaypipesdansmith: why would his cloud be down?14:40
dansmithjaypipes: if we did it in the middle of a schema migration?14:40
jaypipesyeah.14:40
dansmithbecause he'd be halfway between two schema versions with two versions of code that can't use it?14:40
dansmithi.e. the reason we decoupled those tasks in the first place14:41
jaypipessorry, you misunderstand...14:41
*** dklyle has joined #openstack-nova14:41
mriedemwouldn't dropping the keypairs migration be similar to dropping the instance groups migration? https://github.com/openstack/nova/commit/1160921c2d053ce33279ca4ec1f00572271e7c95#diff-91e852cb498abb50ca653ef6418bd65a14:41
mriedemjaypipes: fyi https://review.openstack.org/#/q/topic:remove-newton-online-compat-code+(status:open+OR+status:merged)14:42
mriedemfor a blueprint14:42
jaypipesdansmith: if we did the data migration as an Alembic/sqlalchemy-migrate migration, then we wouldn't have to ever run the "check to see if this pre-condition exists" (like the SELECT against the instance_extra table...) more than once.14:42
jaypipesmriedem: ack14:42
mriedemi seem to remember we've fucked up the schema migration scripts before as well14:42
jaypipesdansmith: I'm not saying the data migration would be the *same* migration as the schema migration. just that we could do it *as* a migration.14:43
dansmithjaypipes: that's what I said above about the sentinel approach for scanning, but definitely do not agree they should be in the same migration14:43
jaypipesinstead of the whole separate nova-manage thing.14:43
mriedemand since you can't downgrade the schema to get back, you'd have to reset the version in the migrations table and re-run those if we had a bug14:43
jaypipesbut that train has passed...14:43
dansmithjaypipes: okay, well, you can see why people might be confused about such a statement :)14:43
jaypipesmriedem: not once, ever, has anyone ever done a downgrade of a schema migration.14:43
mriedemthat's not true14:43
mriedemhell we used to gate on being able to upgrade and downgrade the schema14:44
jaypipesgating != ever being done in production, ever.14:44
dansmithit didn't actually work for real data14:44
dansmithyeah14:44
dansmithheh14:44
*** dpawlik has quit IRC14:44
jaypipesit just doesn't happen, sorry.14:44
dansmithwhich is why I don't want to couple the two processes14:44
jaypipesthe procedure is a) backup your stuff, b) run migrations, c) test and if problem, d) restore from backup.14:45
dansmithbut jaypipes, we could of course just set up our own version counter to be a sentinel to avoid re-running migrations,14:45
jaypipesnobody ever ran a schema downgrade in production. I can almost guarantee that.14:45
mriedemi should introduce you to some chinese people i know then14:45
mriedemanyway, i'm not saying it's something that should be done14:45
jaypipesmriedem: then, clearly, SOMEONE is WRONG on the Internet.14:46
jaypipes:P14:46
dansmithjaypipes: we certainly have people that think they want it, but I definitely agree they're misguided :)14:46
mriedemi have also told them ^14:46
jaypipesI'm a bit slap-happy today, guys, you'll have to forgive me.14:46
* jaypipes goes back to shooting himself in the head with Ironic rebuild issues.14:46
*** luksky has quit IRC14:47
mriedemi'd be happy to drop the newton-era keypair migrations, but i think it's more complicated than just deleting that code, since we normally have something that checks to see if you have completed your homework before we just drop the migration routine and all of the compat code14:48
mriedemwhich requires some thinking14:48
dansmithwell,14:48
dansmithif the migration is acutely painful we could drop it without dropping the compat code and then take our time with the latter I think14:49
dansmithor fix the migration to be more efficient if that's really possible14:49
dansmithISTR that since they're one-to-many and across DBs it's harder than that14:49
dansmithbut it's been a long time14:49
jaypipesmriedem: yeah, I was pondering what that check would be other than SELECT COUNT(*) FROM instance_extra WHERE keypairs IS NULL, though.14:49
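The check jaypipes mentions, combined with the sentinel idea dansmith raised earlier, can be sketched against an in-memory SQLite database. The simplified instance_extra schema and the data_migration_markers table are assumptions made up for this example; they do not reflect nova's real schema.

```python
import sqlite3


def keypairs_migration_needed(conn):
    """Return True if rows still need migrating; record a marker once done."""
    marker = conn.execute(
        "SELECT 1 FROM data_migration_markers WHERE name = 'keypairs'"
    ).fetchone()
    if marker:
        return False  # sentinel present: skip the expensive scan entirely
    (remaining,) = conn.execute(
        "SELECT COUNT(*) FROM instance_extra WHERE keypairs IS NULL"
    ).fetchone()
    if remaining == 0:
        conn.execute("INSERT INTO data_migration_markers (name) VALUES ('keypairs')")
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instance_extra (instance_uuid TEXT, keypairs TEXT)")
conn.execute("CREATE TABLE data_migration_markers (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO instance_extra VALUES ('uuid-1', NULL)")
needed_before = keypairs_migration_needed(conn)
conn.execute("UPDATE instance_extra SET keypairs = '[]' WHERE keypairs IS NULL")
needed_after = keypairs_migration_needed(conn)
```

Once the marker row exists, later calls return without touching instance_extra, which is the point of a sentinel on a table with millions of rows.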
mriedemthere is also https://review.openstack.org/#/c/517158/ which might mean some of the migration routine is dead now14:50
mriedemi.e. https://github.com/openstack/nova/blob/master/nova/objects/keypair.py#L24214:50
mriedemoh nvm that just means we wouldn't hit this https://github.com/openstack/nova/blob/master/nova/objects/keypair.py#L26014:51
mriedemhttps://github.com/openstack/nova/commit/be8242cb5a0f8396f6b8c042813847db0571df14#diff-f5877540177ee26b63552ec5f56d74fb14:53
mriedemso as of ocata, the keypairs table in the cell dbs should be empty14:53
mriedemand the keypair information per instance should be in the instance_extra table yeah14:53
mriedem?14:53
mriedemi'm confused, if you can't upgrade to ocata while there are keypairs in the cell db https://github.com/openstack/nova/commit/be8242cb5a0f8396f6b8c042813847db0571df14#diff-f5877540177ee26b63552ec5f56d74fb then how would the 'migrate to api db' still hit anything here? https://github.com/openstack/nova/blob/master/nova/objects/keypair.py#L26514:56
dansmithmriedem: I think it's not hitting anything, it's just scanning the whole table which is the problem14:57
mriedem"select([func.count()]).select_from(keypairs).where(keypairs.c.deleted == 0).scalar()" is the query jaypipes just said14:57
mriedemoh nvm it's not14:58
mriedembased on the commit message in https://review.openstack.org/#/c/517158/ i think we probably have reasonable justification to just kill that 'migrate to api db' data migration from newton that is still getting run14:59
jaypipesmriedem: yeah, the problematic one is the scan on instance_extra14:59
*** artom has quit IRC14:59
*** artom has joined #openstack-nova15:00
mriedemi'm not sure why i didn't go further on https://review.openstack.org/#/c/517158/2/nova/cmd/manage.py and remove keypair_obj.migrate_keypairs_to_api_db - i probably just ran out of time and was doing them in order, and wanted to handle the request spec one first, which i did in rocky/stein15:00
*** artom has quit IRC15:00
mriedemso.....i think we can probably remove migrate_keypairs_to_api_db now?15:00
jaypipesmriedem: maybe you were performing a schema downgrade in production.15:01
mriedemi will cut you15:01
* jaypipes dons armor15:01
mriedemif we think we can drop this, and we're doing an rc2 for stein we might want to get off this pot and include it in rc2 if it's really causing pain for upgrades15:01
mriedembut i feel like i'm trying to convince myself this is ok15:02
mriedemlike everything i do in nova, which comes back to bite my ass15:02
dansmithseems risky for an rc215:04
dansmithnot from any real data I have15:04
dansmithbut rc2 should be "we can't release without this" and this doesn't seem to fit that, IMHO15:04
*** artom has joined #openstack-nova15:04
dansmithwe probably need to nuke all those "added in newton" migrations15:05
mriedemi've got a local change, sec15:07
dansmithonly one more now I guess15:07
dansmithlooking at your patch from earlier there were a bunch15:07
mriedemyeah https://review.openstack.org/#/q/topic:remove-newton-online-compat-code+(status:open+OR+status:merged)15:07
mriedemi tried15:07
mriedemgot tired15:07
*** phasespace has quit IRC15:12
*** zhubx has quit IRC15:16
*** zhubx has joined #openstack-nova15:16
cdentIs there a concise description on the rules about migrations between AZs somewhere? cold and live, force and not force?15:22
mriedemlikely not, at least off the top of my head, but i could probably explain it quick15:23
mriedemthen we could report a docs bug to fill that in later15:23
cdentI'll take that15:24
mriedemtl;dr if the user creates the server with a specific AZ or https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.default_schedule_zone is not None (meaning it goes into some default AZ), then the instance is restricted to that AZ for all move operations,15:24
mriedemUNLESS it is forced like with live migrate or evacuate15:24
mriedemwhich, by default, with OSC you get a forced live migration every time15:25
mriedemso very easy to shoot yourself in the foot there as an admin15:25
mriedemwhich is why there are at least 5 patches to osc to fix that, listed at L18 here https://etherpad.openstack.org/p/DEN-osc-compute-api-gaps15:25
*** gaoyan has quit IRC15:25
cdent\o/15:26
cdentare there any restrictions that distinguish the default AZ as special with regard to other AZs?15:26
mriedemdocs on migrate caveats regarding AZs could probably live here https://docs.openstack.org/nova/latest/user/aggregates.html#availability-zones-azs if you want to report a bug and i can wordsmith it later15:27
mriedemno15:27
mriedemif DEFAULT.default_schedule_zone is not None and the user doesn't request an AZ explicitly, it's treated as if the user did request the default AZ15:27
cdentI guess if you specify that a server is in foo-AZ (non-default) then you can't cold migrate from there to any other one?15:27
mriedemthat happens here https://github.com/openstack/nova/blob/357da989c194a8b59842629cb64b2809143a4eae/nova/api/openstack/compute/servers.py#L64115:28
mriedemcdent: correct15:28
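The AZ rules mriedem summarises here can be written down as a small decision function. This is a sketch of the rules as stated in the conversation only; the function name and arguments are hypothetical and it does not model the scheduler itself.

```python
def allowed_target_azs(requested_az, default_schedule_zone, all_azs, forced=False):
    """Return the set of AZs a server may move to under the rules above."""
    # No explicit request falls back to the configured default zone, if any.
    pinned_az = requested_az or default_schedule_zone
    # A forced move (live migrate or evacuate to a named host) skips the check.
    if forced or pinned_az is None:
        return set(all_azs)
    return {pinned_az}


zones = ["az1", "az2", "nova"]
explicit = allowed_target_azs("az1", None, zones)
defaulted = allowed_target_azs(None, "nova", zones)
forced = allowed_target_azs("az1", None, zones, forced=True)
```

The forced case returning every zone is exactly the foot-gun described above when a client forces live migrations by default.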
*** mvkr has quit IRC15:28
cdentokay. thanks. I'll make a bug, summarize this stuff there.15:28
mriedemnote there is a spec proposed to allow passing a new AZ on unshelve, which i think is reasonable15:28
* cdent nods15:30
mriedem[cinder]/cross_az_attach=False also makes this all more complicated....15:30
mriedemala https://review.openstack.org/#/c/469675/ wee15:30
* cdent burns everything down15:31
mriedemtl;dr if the cloud is configured for [cinder]/cross_az_attach=False and you boot from volume where the volume is in a non-default zone, server create explodes immediately15:31
mriedem*and you don't create the server in the same zone15:31
mriedemsorrison suffers from ^15:31
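The cross_az_attach failure described above boils down to a zone comparison at server create time. The sketch below only models that rule as stated in the chat; the exception class and function name are hypothetical and are not nova's actual code.

```python
class VolumeZoneMismatch(Exception):
    """Hypothetical error for a boot volume in a different AZ than the server."""


def check_boot_volume_zone(cross_az_attach, server_az, volume_az):
    """Fail fast when cross-AZ attach is disabled and the zones differ."""
    if not cross_az_attach and server_az != volume_az:
        raise VolumeZoneMismatch(
            "volume is in %r but server is in %r" % (volume_az, server_az)
        )


# Same zone, or cross-AZ attach allowed: no error.
check_boot_volume_zone(False, "az1", "az1")
check_boot_volume_zone(True, "nova", "az1")
```

With cross_az_attach=False, a server created without naming the volume's zone hits the error immediately.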
* bauzas just seeing the backlog15:32
mriedemside effects include chronic vegimitis15:32
dansmithgdi mriedem, save your depressing stuff for mondays will you? wednesday is supposed to be cresting the hill of depression and heading down to friday happiness15:32
mriedembeing constantly depressed about bugs from essex still being in our code is all i have to live for anymore15:32
* cdent prefers his depressed on tuesday15:32
mriedemnot to mention all of my software skills are stuck in 6 years ago and i'm not learning anything new as a developer...wahwah15:34
artomYou're learning new soft skills15:35
artomLike... how to deal with depression ^_^15:36
dansmithnice15:37
*** burt has quit IRC15:38
mriedemthat'll be good on my next interview15:40
openstackgerritMatt Riedemann proposed openstack/nova master: Drop migrate_keypairs_to_api_db data migration  https://review.openstack.org/64964815:40
mriedemdansmith: jaypipes: mnaser: ^ there you go15:40
*** burt has joined #openstack-nova15:42
openstackgerritMatt Riedemann proposed openstack/nova master: Drop migrate_keypairs_to_api_db data migration  https://review.openstack.org/64964815:43
mriedemcdent: heh sound familiar? https://bugs.launchpad.net/nova/+bug/182298615:45
openstackLaunchpad bug 1822986 in OpenStack Compute (nova) "Not clear if www_authenticate_uri is really needed" [Undecided,New]15:45
cdentoh hai15:45
cdenthmm, yes, rather familiar15:46
*** dpawlik has joined #openstack-nova15:48
*** BjoernT has quit IRC15:48
*** tssurya has quit IRC15:49
melwittmriedem, efried: I thought I was coordinating RC2, but if someone else wants to, that's fine with me. I put the things I thought should go into it on https://etherpad.openstack.org/p/nova-stein-rc-potential last night15:51
mriedemmelwitt: i think those are the same two changes that are being put in, so it's aligned15:52
*** dpawlik has quit IRC15:52
melwittok, cool15:53
*** manjeets has joined #openstack-nova15:54
*** rpittau is now known as rpittau|afk15:57
*** phasespace has joined #openstack-nova15:58
jaypipesmriedem: ++16:00
*** nicolasbock has quit IRC16:02
*** erlon has joined #openstack-nova16:02
*** dpawlik has joined #openstack-nova16:02
*** dikonoor has quit IRC16:05
melwittmriedem, efried: fyi I've proposed RC2 at https://review.openstack.org/64965616:06
*** yan0s has quit IRC16:09
*** BjoernT has joined #openstack-nova16:11
*** artom has quit IRC16:12
*** imacdonn has quit IRC16:12
*** imacdonn has joined #openstack-nova16:13
*** gbarros has joined #openstack-nova16:17
*** gbarros_ has joined #openstack-nova16:17
*** gbarros has quit IRC16:18
efriedmriedem: was away for a bit there. Did you decide to do something about SIGHUP for RC2?16:19
*** udesale has quit IRC16:19
mriedemefried: no16:19
mriedemi went on an online data migrations tangent16:20
efriedight16:20
mriedemkashyap: MIN_LIBVIRT_VERSION = (3, 0, 0) and MIN_LIBVIRT_POSTCOPY_VERSION = (1, 3, 3), do you have a change to remove MIN_LIBVIRT_POSTCOPY_VERSION?16:21
*** gbarros_ has quit IRC16:22
mriedemdoesn't look like it16:24
*** dtantsur is now known as dtantsur|afk16:24
*** ttsiouts has quit IRC16:32
*** ttsiouts has joined #openstack-nova16:33
*** psachin has quit IRC16:34
mriedemNewBruce: what do you have set for live_migration_permit_post_copy in nova.conf on your compute services?16:34
*** cfriesen has joined #openstack-nova16:36
*** bbowen__ has quit IRC16:37
*** bbowen__ has joined #openstack-nova16:37
*** ttsiouts has quit IRC16:37
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: drop MIN_LIBVIRT_POSTCOPY_VERSION  https://review.openstack.org/64967116:38
*** READ10 has joined #openstack-nova16:43
*** spsurya has quit IRC16:46
*** tesseract has quit IRC16:49
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: remove conditional on VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY  https://review.openstack.org/64967416:50
mriedemsean-k-mooney: so on https://review.openstack.org/#/c/649464/ we're never getting a post-copy event during live migration so we don't activate the dest host port binding prior to post_live_migration_at_destination, so that kind of throws that theory out the window as being the issue NewBruce is hitting16:52
openstackgerritEric Fried proposed openstack/nova stable/rocky: Fix bug preventing forbidden traits from working  https://review.openstack.org/64960316:52
*** cdent has quit IRC16:53
*** helenafm has quit IRC16:54
*** luksky has joined #openstack-nova16:55
*** derekh has quit IRC16:56
*** wolverineav has joined #openstack-nova16:57
sean-k-mooneymriedem: ya i noticed that this morning that it passed16:58
sean-k-mooneyi was wondering if we should maybe try and force the compute level to be different on the controller vs the compute and see if that triggers the issue17:00
sean-k-mooneymriedem: were you going to resubmit with post copy enabled?17:00
*** BjoernT has quit IRC17:01
mriedemsean-k-mooney: yeah i was going to try enabling post-copy but these cirros guests are so tiny i'm not sure it will mean anything17:02
mriedemefried: dansmith: on that SIGHUP n-cpu issue, i tried that in a train devstack created this morning and i don't even get to the wait for network event part17:02
mriedemhttp://paste.openstack.org/show/748820/17:02
mriedemlibvirt blows up in privsep17:02
sean-k-mooneymriedem: i might try and create a neutron fullstack test to reproduce the neutron db error17:03
mriedemthis is how i hup'ed: sudo systemctl kill -s HUP devstack@n-cpu.service17:03
dansmithmriedem: what happens if you try it again?17:03
mriedemthe server create? or the hup?17:03
dansmithmriedem: the create17:03
mriedemsame thing17:03
dansmithokay17:03
dansmithI have not seen it manifest that way17:04
dansmithobviously that seems like a much bigger deal17:04
mriedemunplug on cleaning up from the failure also then blows up b/c privsep http://paste.openstack.org/show/748821/17:04
dansmithwhich should be more evidence that the oslo behavior is completely wrong17:04
mriedemdid a full systemctl restart and created a server and it was fine as expected17:06
dansmithI forget, but can people still choose to run privsep things via rootwrap so we can restart the daemon?17:06
dansmithbecause if so, maybe that's why you see that and it's devstack default config or whatever17:06
dansmithmaybe I could get mikal to yell at us over twitter to answer that17:06
mriedemidk, privsep config et al is something i haven't had to look at in years17:07
mriedembut was always confusing to me17:07
dansmithyeah17:07
dansmithor maybe it's different with some of the privsep things we merged recently,17:07
dansmithbut people reporting it are missing more of the privsepification17:08
dansmithdid you create a server before the hup too?17:08
mriedemyup17:12
mriedemcreated test1, fine, sighup'ed, create test2 fails, create test3 fails, restart n-cpu, create test4 ok17:12
mriedemcommented 34 and 35 fwiw in bug https://bugs.launchpad.net/nova/+bug/171537417:14
openstackLaunchpad bug 1715374 in OpenStack Compute (nova) "Reloading compute with SIGHUP prevents instances from booting" [High,In progress] - Assigned to Ralf Haferkamp (rhafer)17:14
*** artom has joined #openstack-nova17:14
dansmithokay just trying to think of reasons you see it differently than reported, but probably because of recent changes I guess17:14
mriedemthe privsep-helper child processes are definitely gone after the SIGHUP http://paste.openstack.org/show/748822/17:18
dansmithso we probably just didn't notice before or something as fewer things used it17:19
*** BjoernT has joined #openstack-nova17:19
dansmithbut I don't think there's any reason that they're being killed now and not before17:19
*** erlon has quit IRC17:20
mriedemyeah, this is what i see in the n-cpu logs on the HUP http://paste.openstack.org/show/748823/17:20
mriedemnote the17:21
mriedemApr 03 17:15:28 train nova-compute[19990]: DEBUG oslo_privsep.comm [-] EOF on privsep read channel {{(pid=19990) _reader_main /usr/local/lib/python2.7/dist-packages/oslo17:21
dansmithyep17:21
dansmithand that we're calling the sighup handler, but then restarting anyway17:21
dansmithsoft restart17:21
dansmithso you might be able to do something to the global privsep state to cause it to be respawned again after the restart, but I dunno how that works17:21
dansmithit's just all pretty wrong17:22
*** BjoernT has quit IRC17:23
mriedemnot much interesting in the unit either http://paste.openstack.org/show/748824/17:24
*** spsurya has joined #openstack-nova17:27
*** BjoernT has joined #openstack-nova17:28
mriedemyeah i'm not sure what does this http://logs.openstack.org/64/649464/1/check/tempest-full-py3/f849a52/controller/logs/screen-n-cpu.txt.gz#_Apr_02_23_01_32_82205917:29
mriedemoh probably this https://github.com/openstack/nova/blob/master/nova/cmd/compute.py#L4517:30
mriedemi'll throw that into ComputeManager.reset() and see what happens17:33
dansmithI think that won't help, because it's before the restart17:33
dansmithbut worth a try I guess17:34
mriedemyeah it didn't do anything17:36
mriedemwell i guess we can say SIGHUP of n-cpu is f'ed17:37
mnaservery very f'd17:37
mnaserI mean I caught this through the openstack-ansible CI which was relying on it to refresh RPC versions after upgrades17:38
mnaserbut for some reason we had vif_plugging_is_fatal=false so it never broke17:38
mnaserbut once i got rid of that option, we resorted to restarting all agents which is not ideal really17:38
mriedemi'm not sure if i should report a new bug for the privsep wrinkle here, or if that is a regression in stein - to find out i'd have to spin up a stable/rocky devstack17:38
dansmithbut this is more broken than what you were seeing17:38
mnaser:<17:39
mriedemright, i could probably just recreate in our post-test hook in the nova-next job17:39
mriedembut first i need some lunch17:39
eanderssonalex_xu, for sure, but nothing supports it yet, not even the openstackclient17:40
*** dpawlik has quit IRC17:45
KH-JaredI'm learning that simple changes aren't always simple. Apparently I'm making the tearDown of a test fail17:52
*** jmlowe has quit IRC17:52
*** jmlowe has joined #openstack-nova17:55
*** wolverineav has quit IRC17:56
artomKH-Jared, yep. Actual fix: 1 day's work. Tests: Methuselah.17:56
*** wolverineav has joined #openstack-nova17:57
*** amodi has quit IRC18:03
*** wolverineav has quit IRC18:09
*** wolverineav has joined #openstack-nova18:10
*** lbragstad has quit IRC18:11
*** lbragstad has joined #openstack-nova18:12
*** igordc has quit IRC18:14
*** samueldmq has joined #openstack-nova18:25
openstackgerritMatt Riedemann proposed openstack/nova master: DNM: Test theory about bug 1822884  https://review.openstack.org/64946418:26
openstackbug 1822884 in OpenStack Compute (nova) "live migration fails due to port binding duplicate key entry in post_live_migrate" [Undecided,New] https://launchpad.net/bugs/182288418:26
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: drop MIN_LIBVIRT_POSTCOPY_VERSION  https://review.openstack.org/64967118:27
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: remove conditional on VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY  https://review.openstack.org/64967418:27
mriedemhmm wtf http://logs.openstack.org/53/641453/1/check/nova-grenade-live-migration/88cd6f4/logs/screen-n-cpu.txt.gz?level=TRACE#_Apr_03_15_47_39_46961718:29
*** READ10 has quit IRC18:32
*** ricolin has quit IRC18:38
KH-JaredI feel like I'm missing something, because what failed for 'IBM zVM CI' was a tempest test, but under Zuul, the same tempest test was successful18:41
mriedemKH-Jared: you don't need to worry about the zvm ci, your rbd change is only for the libvirt driver which is not what zvm ci is using18:42
mriedemthe zuul jobs and specifically the non-voting ceph plugin job are what you'd care about18:42
KH-Jaredgood to know. I figured, but I was still going to dig to make sure it wasn't my changes somehow18:43
KH-Jaredso that was some time spent that I probably should've just asked sooner18:43
efriedmriedem: If privsep is part of what doesn't start back up properly, and there's privsep stuff in the critical path that's merged since that bug was opened (https://review.openstack.org/#/q/status:merged+project:openstack/nova+branch:master+topic:my-own-personal-alternative-universe) then that'd do it.18:45
sean-k-mooneyKH-Jared: well it's good that you are paying attention to the third party ci jobs but yes it's always good to ask yourself is it reasonable that my change could have broken it18:45
sean-k-mooneyand if you dont know always feel free to ask18:45
efriedmriedem: We don't know if privsep was borked before and just not being hit because those paths were still using rootwrap18:45
mriedemefried: i'm going to try on rocky devstack to see if it is a very obvious regression in stein because if so we should have a known issue reno18:46
efriedack18:46
*** igordc has joined #openstack-nova18:50
*** nicolasbock has joined #openstack-nova18:51
*** jmlowe has quit IRC18:57
*** tbachman has quit IRC19:08
*** tbachman has joined #openstack-nova19:10
*** wolverineav has quit IRC19:10
*** wolverineav has joined #openstack-nova19:10
*** wolverineav has quit IRC19:15
*** sidx64 has joined #openstack-nova19:19
*** erlon has joined #openstack-nova19:32
*** eharney has quit IRC19:37
*** tosky has quit IRC19:44
*** jmlowe has joined #openstack-nova19:48
kashyapmriedem: Was AFK as I was a bit "under the weather".  To your question, no, I don't yet have a patch to remove MIN_LIBVIRT_POSTCOPY_VERSION (and a couple of other constants that I noted in the main commit message)19:53
kashyapI'll get to it this week.  As I noted I will fix them19:53
kashyapmriedem: Curious, what made you notice it?  Or just regular code audit caught your eye?19:53
*** wolverineav has joined #openstack-nova19:58
*** READ10 has joined #openstack-nova19:58
* kashyap will catch up tomm early (CEST)19:58
openstackgerritMerged openstack/nova stable/stein: Add doc on VGPU allocs and inventories for nrp  https://review.openstack.org/64945419:59
mriedemkashyap: i already pushed up a change, and i just noticed b/c i was looking at that post-copy code19:59
*** mriedem has quit IRC20:01
*** sidx64 has quit IRC20:01
*** mriedem has joined #openstack-nova20:06
*** erlon has quit IRC20:07
*** krypto has quit IRC20:08
*** BjoernT has quit IRC20:10
*** mrhillsman is now known as mrhillsman_bbiab20:11
mriedemefried: dansmith: i recreated that sighup privsep issue on rocky devstack so it's not a stein regression20:14
dansmith"nice"20:16
*** igordc has quit IRC20:18
*** awalende has joined #openstack-nova20:20
*** whoami-rajat has quit IRC20:29
*** READ10 has quit IRC20:34
*** BjoernT has joined #openstack-nova20:44
*** rcernin has joined #openstack-nova20:45
*** BjoernT has quit IRC20:45
*** artom has quit IRC20:47
mriedemheh, if an instance action fails, the message is always "Error"20:48
mriedemthat's it20:48
mriedemthe api says, "The related error message for when an action fails." - but that's really only 'Error'20:48
mriedemvery helpful20:48
*** wolverineav has quit IRC20:49
*** BjoernT has joined #openstack-nova20:50
openstackgerritMerged openstack/nova master: Libvirt: gracefully handle non-nic VFs  https://review.openstack.org/64940920:51
*** BjoernT has quit IRC20:54
*** ralonsoh has quit IRC20:54
*** BjoernT has joined #openstack-nova20:57
*** spsurya has quit IRC20:59
*** igordc has joined #openstack-nova20:59
*** mmethot has quit IRC21:00
*** BjoernT has quit IRC21:05
mriedemso this is probably going to be filed under a big pile of don't care, but when you resize a server, you get at least 2 events on the action in conductor, one here (conductor_migrate_server): https://github.com/openstack/nova/blob/e7ae6c65cd24fb3e0776fac80fbab2ab16e9d9ed/nova/conductor/manager.py#L26621:05
mriedemand one here which is just called 'cold_migrate': https://github.com/openstack/nova/blob/e7ae6c65cd24fb3e0776fac80fbab2ab16e9d9ed/nova/conductor/manager.py#L28821:05
mriedemthe latter is confusing if you're doing a resize and not a cold migration21:05
mriedemand both together is redundant21:05
mriedemthe latter was here first though21:05
mriedemformer was added in newton21:06
mriedemany validity in dropping the confusing 'cold_migrate' one?21:06
*** pcaruana has quit IRC21:06
mriedemnote for a resize the action name is still 'resize' rather than (cold) 'migrate'21:06
*** BjoernT has joined #openstack-nova21:06
*** owalsh has quit IRC21:07
openstackgerritMatt Riedemann proposed openstack/nova stable/stein: Libvirt: gracefully handle non-nic VFs  https://review.openstack.org/64963021:08
mriedemefried: melwitt: ^ approved21:09
openstackgerritMatt Riedemann proposed openstack/nova master: Fix ProviderUsageBaseTestCase._run_periodics for multi-cell  https://review.openstack.org/64117921:12
openstackgerritMatt Riedemann proposed openstack/nova master: Improve CinderFixtureNewAttachFlow  https://review.openstack.org/63938221:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add functional recreate test for bug 1818914  https://review.openstack.org/64152121:12
openstackbug 1818914 in OpenStack Compute (nova) "Hypervisor resource usage on source still shows old flavor usage after resize confirm until update_available_resource periodic runs" [Low,In progress] https://launchpad.net/bugs/1818914 - Assigned to Matt Riedemann (mriedem)21:12
openstackgerritMatt Riedemann proposed openstack/nova master: Remove unused context parameter from RT._get_instance_type  https://review.openstack.org/64179221:12
openstackgerritMatt Riedemann proposed openstack/nova master: Update usage in RT.drop_move_claim during confirm resize  https://review.openstack.org/64180621:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add Migration.cross_cell_move and get_by_uuid  https://review.openstack.org/61401221:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add InstanceAction/Event create() method  https://review.openstack.org/61403621:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add Instance.hidden field  https://review.openstack.org/63112321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add TargetDBSetupTask  https://review.openstack.org/62789221:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add CrossCellMigrationTask  https://review.openstack.org/63158121:12
openstackgerritMatt Riedemann proposed openstack/nova master: Execute TargetDBSetupTask  https://review.openstack.org/63385321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add can_connect_volume() compute driver method  https://review.openstack.org/62131321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_dest compute method  https://review.openstack.org/63329321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtDestTask  https://review.openstack.org/62789021:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add prep_snapshot_based_resize_at_source compute method  https://review.openstack.org/63483221:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add nova.compute.utils.delete_image  https://review.openstack.org/63760521:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add PrepResizeAtSourceTask  https://review.openstack.org/62789121:12
openstackgerritMatt Riedemann proposed openstack/nova master: Refactor ComputeManager.remove_volume_connection  https://review.openstack.org/64218321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add revert_snapshot_based_resize conductor RPC method  https://review.openstack.org/63804721:12
openstackgerritMatt Riedemann proposed openstack/nova master: Revert cross-cell resize from the API  https://review.openstack.org/63804821:12
openstackgerritMatt Riedemann proposed openstack/nova master: Confirm cross-cell resize while deleting a server  https://review.openstack.org/63826821:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add CrossCellWeigher  https://review.openstack.org/61435321:12
openstackgerritMatt Riedemann proposed openstack/nova master: Add cross-cell resize policy rule and enable in API  https://review.openstack.org/63826921:12
*** awaugama has quit IRC21:13
mriedemdansmith: i added a test for this cross-cell resize issue i ran into yesterday: https://review.openstack.org/#/c/643451/5/nova/tests/functional/test_cross_cell_migrate.py@499 - tl;dr is i'm going to need to hard destroy the instance in the db, instance.destroy() won't cut it21:14
mriedemfor example, resize to target cell and then revert back to source, then try to resize again to target will fail (that's that test) because there is a (soft) deleted instance record in the target cell db still21:14
mriedemand the uuid unique constraint will prevent us from creating the instance in the target cell db on the 2nd attempt21:15
mriedemsame for rollbacks on failure (hard destroy in target cell db) and confirm resize (hard destroy from source cell db)21:15
mriedemi think it's probably ok, although it sounds kind of scary - but you'll always have a copy of the instance in one of the db when the operation is over, so we don't lose anything21:15
*** owalsh has joined #openstack-nova21:16
mriedema real shitty alternative limitation/workaround is that it's just a no-go for cross-cell resizing that instance until the problem db is purged of the deleted instance21:16
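
[editor's note] The soft-delete problem mriedem describes above can be reproduced in miniature. This is only a sketch using an in-memory SQLite table. The schema and the helper names (make_cell_db, soft_delete, hard_delete, can_recreate) are hypothetical stand-ins, not nova's actual models or DB API. It assumes a unique constraint on uuid alone and a soft delete that marks the row rather than removing it.

```python
import sqlite3


def make_cell_db():
    """Create a toy 'cell database' with a unique constraint on uuid."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        " id INTEGER PRIMARY KEY,"
        " uuid TEXT NOT NULL UNIQUE,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def create_instance(conn, uuid):
    conn.execute("INSERT INTO instances (uuid) VALUES (?)", (uuid,))


def soft_delete(conn, uuid):
    # Assumption: soft delete only flags the row (deleted != 0); the row stays.
    conn.execute("UPDATE instances SET deleted = id WHERE uuid = ?", (uuid,))


def hard_delete(conn, uuid):
    # Hard destroy: the row is physically removed.
    conn.execute("DELETE FROM instances WHERE uuid = ?", (uuid,))


def can_recreate(conn, uuid):
    """Return True if a new row with this uuid can be inserted."""
    try:
        create_instance(conn, uuid)
        return True
    except sqlite3.IntegrityError:
        # The leftover (soft-deleted) row still holds the uuid.
        return False
```

In this toy model, a second create after soft_delete() is rejected by the unique constraint, while the same create after hard_delete() goes through. That mirrors the "resize, revert, resize again" sequence discussed above.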
*** eharney has joined #openstack-nova21:21
melwittmriedem: thx21:30
efriedmriedem: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Deadlock%20found%20when%20trying%20to%20get%20lock%3B%20try%20restarting%20transaction%5C%22%20AND%20message%3A%5C%22UPDATE%20migrations%20SET%20updated_at%3D%25(updated_at)s%2C%20status%3D%25(status)s%20WHERE%20migrations.id%20%3D%20%25(migrations_id)s%5C%2221:33
efriedlooks... new?21:33
mriedemefried: nope21:34
mriedemhttps://bugs.launchpad.net/nova/+bug/164253721:35
openstackLaunchpad bug 1642537 in OpenStack Compute (nova) stein "finish_resize fails with DBDeadlock on migrations table" [Medium,In progress] - Assigned to Matt Riedemann (mriedem)21:35
dansmithmriedem: you can't undelete it if you need to restore?21:36
efriedmriedem: not in e-r I guess?21:36
tonybIs there a configurable 'timeout' for server build operations?21:37
*** slaweq has quit IRC21:38
mriedemefried: used to be i think21:39
mriedemdansmith: no....but restore how/where?21:39
* tonyb is trying to build a baremetal and the instance is hitting exception.MaxRetriesExceeded but the build doesn't actually fail or complete the node just gets powered down too soon21:39
mriedemlike i said, you'd have a copy of the instance and related records in the other cell db21:39
dansmithmriedem: I mean undelete if you need to revert21:39
dansmithmriedem: look it up with deleted=yes, set deleted=0, save21:39
dansmithyou'll have to fix up things we don't soft-delete, but that seems easier21:40
dansmithI think you probably do want to hard-delete it when you confirm though, so you don't have to worry about the was-it-in-this-cell-before case each time,21:40
mriedemthat's not really the issue, unless i'm misunderstanding21:40
dansmithbut I dunno, while we're waiting for confirm...21:40
dansmithokay21:40
dansmithoh, I see,21:41
mriedemwhen you revert, the instance is in both dbs,21:41
melwitttonyb: I found this https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.instance_build_timeout21:41
dansmithyou're talking about the instance in the target cell that got reverted back to source, a week later when you try to migrate again?21:41
mriedemyeah21:41
mriedemwe literally can't have the instance.deleted!=0 in the cell db we're moving to21:41
tonybmelwitt: Thanks21:42
tonyb(undercloud) [root@director config-data]# grep -Erin instance_build_timeout nova21:42
tonybnova/etc/nova/nova.conf:983:#instance_build_timeout=021:42
mriedemheh i didn't know instance_build_timeout existed21:42
*** rcernin has quit IRC21:42
tonybso I guess that isn't it21:42
mriedemit's a periodic21:42
mriedemlooks like a hack for just setting instance status to ERROR for things that were hung or we didn't properly set to ERROR state on failure21:43
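
[editor's note] A rough sketch of the kind of periodic check mriedem describes: flag anything that has been building for longer than a timeout and set it to an error state. This is not nova's implementation. The Instance class, the state strings and check_build_timeouts are hypothetical names for illustration. Treating 0 as "disabled" is an assumption based on the commented-out default in tonyb's grep output.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    uuid: str
    vm_state: str      # hypothetical states: "building", "active", "error"
    created_at: float  # seconds since some epoch


def check_build_timeouts(instances, now, timeout):
    """Mark instances stuck in 'building' for more than `timeout` seconds.

    Returns the uuids that were moved to 'error'. A timeout of 0 (or less)
    is treated as "check disabled" in this sketch.
    """
    if timeout <= 0:
        return []
    timed_out = []
    for inst in instances:
        if inst.vm_state == "building" and now - inst.created_at > timeout:
            inst.vm_state = "error"
            timed_out.append(inst.uuid)
    return timed_out
```

With the timeout left at 0 the check never touches anything, which would fit tonyb's conclusion that this option is not what is powering the node down early.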
*** takashin has joined #openstack-nova21:43
dansmithmriedem: okay, I guess that seems obvious to me so I'm not sure why you're bringing it up as if it's a problem or something21:43
mriedemdansmith: we just haven't had hard destroy on stuff in the cell db records, except tags i guess21:43
mriedemso it seems new and scary21:43
melwitttonyb: oh yeah, hm21:43
dansmithmmkay21:44
* tonyb wander off and look harder21:44
mriedemluckily theodoros already has a patch to add that support for his rebuild from cell0 series https://review.openstack.org/#/c/570202/21:44
mriedemwhich it turned out he later didn't need, but i will21:44
mriedemanyway, i'll update the re-proposed spec to note it21:47
*** bbowen__ has quit IRC21:51
*** slaweq has joined #openstack-nova21:55
*** awalende has quit IRC21:59
*** slaweq has quit IRC22:00
*** bbowen__ has joined #openstack-nova22:00
openstackgerritMatt Riedemann proposed openstack/nova master: api-ref: document ordering for instance actions and events  https://review.openstack.org/64974822:06
*** wolverineav has joined #openstack-nova22:06
*** weshay is now known as weshay|ruck22:08
openstackgerritMatt Riedemann proposed openstack/nova master: api-ref: fix description of os-server-external-events 'events' param  https://review.openstack.org/64975022:09
*** BjoernT has quit IRC22:12
*** wolverineav has quit IRC22:12
*** mlavalle has quit IRC22:15
*** artom has joined #openstack-nova22:27
openstackgerritMatt Riedemann proposed openstack/nova master: Remove MIN_COMPUTE_MULTIATTACH conditions in API  https://review.openstack.org/64975722:32
*** bbowen__ has quit IRC22:33
*** bbowen__ has joined #openstack-nova22:33
*** nicolasbock has quit IRC22:34
*** lbragstad has quit IRC22:49
*** lbragstad has joined #openstack-nova22:50
openstackgerritMatt Riedemann proposed openstack/nova master: Add nova-status upgrade check for minimum required cinder API version  https://review.openstack.org/64975922:51
efriedmriedem: where does grenade source live?23:01
*** wolverineav has joined #openstack-nova23:02
mriedemhttp://git.openstack.org/cgit/openstack-dev/grenade/23:03
*** tkajinam has joined #openstack-nova23:04
*** slaweq has joined #openstack-nova23:06
*** wolverineav has quit IRC23:06
*** slaweq has quit IRC23:10
*** wolverineav has joined #openstack-nova23:56
*** brinzhang has joined #openstack-nova23:58
*** tbachman has quit IRC23:58
