Tuesday, 2018-08-21

*** gyee has quit IRC00:00
*** cdent has quit IRC00:02
*** tbachman has quit IRC00:04
*** mvkr has quit IRC00:05
*** mvkr has joined #openstack-nova00:18
*** prometheanfire has left #openstack-nova00:24
*** hongbin has joined #openstack-nova00:26
*** macza has quit IRC00:41
*** erlon has quit IRC00:41
*** sapd1 has joined #openstack-nova00:46
*** imacdonn has quit IRC00:49
*** imacdonn has joined #openstack-nova00:49
*** moshele has quit IRC00:53
*** bhagyashris has joined #openstack-nova01:08
*** bzhao__ has joined #openstack-nova01:10
*** mhen has quit IRC01:13
*** mhen has joined #openstack-nova01:17
*** mrsoul has quit IRC01:25
*** mriedem has quit IRC01:26
*** janki has joined #openstack-nova01:28
*** Dinesh_Bhor has joined #openstack-nova01:35
*** brinzhang has joined #openstack-nova01:51
*** Bhujay has joined #openstack-nova02:00
*** bhagyashris has quit IRC02:00
*** gcb_ has joined #openstack-nova02:07
*** Dinesh_Bhor has quit IRC02:25
*** bhagyashris has joined #openstack-nova02:25
*** gbarros has quit IRC02:28
*** gbarros has joined #openstack-nova02:37
*** Dinesh_Bhor has joined #openstack-nova02:40
*** gbarros has quit IRC02:53
*** markvoelker has joined #openstack-nova03:01
*** bhagyashris has quit IRC03:01
*** psachin has joined #openstack-nova03:06
*** janki has quit IRC03:06
*** Bhujay has quit IRC03:24
*** yikun has joined #openstack-nova03:24
*** Nel1x has quit IRC03:34
*** udesale has joined #openstack-nova04:00
*** Dinesh_Bhor has quit IRC04:07
*** hongbin has quit IRC04:10
*** janki has joined #openstack-nova04:27
*** psachin has quit IRC04:27
*** hoonetorg has quit IRC04:29
*** gbarros has joined #openstack-nova04:31
*** markvoelker has quit IRC04:32
*** markvoelker has joined #openstack-nova04:43
*** Bhujay has joined #openstack-nova04:45
*** hoonetorg has joined #openstack-nova04:47
*** Bhujay has quit IRC04:51
*** markvoelker has quit IRC04:52
*** Bhujay has joined #openstack-nova04:53
*** hoonetorg has quit IRC04:54
*** holser_ has joined #openstack-nova04:59
*** ratailor has joined #openstack-nova05:04
*** hoonetorg has joined #openstack-nova05:06
*** Dinesh_Bhor has joined #openstack-nova05:10
*** psachin has joined #openstack-nova05:14
*** gbarros has quit IRC05:42
*** jaosorior has quit IRC05:45
*** jaosorior has joined #openstack-nova05:47
*** Luzi has joined #openstack-nova05:51
vishakhamriedem: Hi, is this change valid? https://review.openstack.org/#/c/580271/ Kindly review, as I am a little confused by the comments. Thanks05:52
*** abhishekk has joined #openstack-nova05:55
*** moshele has joined #openstack-nova06:01
*** holser_ has quit IRC06:02
*** janki has quit IRC06:06
*** yikun has quit IRC06:11
*** abhishekk has quit IRC06:18
*** alexchadin has joined #openstack-nova06:27
*** ccamacho has joined #openstack-nova06:28
openstackgerritChen proposed openstack/nova stable/rocky: Update ssh configuration doc  https://review.openstack.org/59404106:31
*** maciejjozefczyk has joined #openstack-nova06:33
*** ratailor_ has joined #openstack-nova06:34
*** ratailor has quit IRC06:35
*** ratailor_ has quit IRC06:36
*** ratailor has joined #openstack-nova06:36
*** abhishekk has joined #openstack-nova06:36
openstackgerritChen proposed openstack/nova stable/rocky: Revisons on notifications doc  https://review.openstack.org/59404206:37
*** rcernin has quit IRC06:38
*** ratailor_ has joined #openstack-nova06:38
*** ratailor__ has joined #openstack-nova06:40
*** rcernin has joined #openstack-nova06:40
*** ratailor has quit IRC06:41
*** ratailor__ has quit IRC06:41
*** ratailor__ has joined #openstack-nova06:41
*** pcaruana has joined #openstack-nova06:42
*** ratailor_ has quit IRC06:43
*** abhishekk has quit IRC06:47
*** luksky has joined #openstack-nova06:50
*** adrianc has joined #openstack-nova06:51
*** rcernin has quit IRC06:51
*** tssurya has joined #openstack-nova06:52
*** NostawRm has quit IRC07:09
openstackgerritJiri Suchomel proposed openstack/nova master: Ignore deleted instances when populating with availability zones  https://review.openstack.org/59405007:13
gmannalex_xu: i am on vacation for 2 weeks (till 31st Aug ) so will not be able to do API office hour.07:17
gmannmelwitt: ^^ i will be able to provide API updates from 31st Aug onward (on vacation for 2 weeks.)07:17
*** alexchadin has quit IRC07:21
*** sahid has joined #openstack-nova07:22
*** alexchadin has joined #openstack-nova07:32
*** tetsuro has quit IRC07:36
*** jpena|off is now known as jpena07:40
*** macza has joined #openstack-nova08:06
*** tetsuro has joined #openstack-nova08:08
*** yankcrime has joined #openstack-nova08:11
*** macza has quit IRC08:11
*** burt has quit IRC08:20
*** burt has joined #openstack-nova08:21
*** cdent has joined #openstack-nova08:33
*** macza has joined #openstack-nova08:48
*** sahid has quit IRC08:53
*** alexchadin has quit IRC08:53
*** macza has quit IRC08:53
*** holser_ has joined #openstack-nova08:53
*** macza has joined #openstack-nova09:09
*** macza has quit IRC09:14
*** alexchadin has joined #openstack-nova09:14
*** luksky has quit IRC09:28
*** macza has joined #openstack-nova09:30
*** macza has quit IRC09:35
*** Dinesh_Bhor has quit IRC09:39
*** mriedem has joined #openstack-nova09:42
mriedemo/09:43
kashyapIsn't it terribly early for you there?09:44
* cdent checks the time09:44
cdentjet lag?09:44
sean-k-mooneymriedem: o/09:44
* kashyap waves hi, getting back after a 2-ish week PTO09:44
mriedemjet lag and work on the brain09:45
lbragstadmriedem: same, my sleep schedule is completely screwed09:45
sean-k-mooneymriedem: well do what jay pipes used to do. if you're up early get work done early and be done by 1/2 pm and enjoy the rest of your day09:46
* kashyap just bought the book (everybody and their dog are recommending it) -- https://www.amazon.com/Why-We-Sleep-Unlocking-Dreams/dp/1501144316. On almost every page the author backs up his claims with solid science09:47
kashyapmriedem: ^ Might want to check it out, when you're not sleeping :P09:48
kashyap(FWIW, the author is not a "journalist" writing junk 'pop science'; he's a serious researcher on that topic.)09:49
*** sahid has joined #openstack-nova09:56
mriedemsean-k-mooney: question in https://bugs.launchpad.net/nova/+bug/178801409:57
openstackLaunchpad bug 1788014 in OpenStack Compute (nova) "when live migration fails due to a internal error rollback is not handeled correctly." [Undecided,New]09:57
*** dtantsur|afk is now known as dtantsur09:59
*** adrianc has quit IRC10:00
*** luksky has joined #openstack-nova10:02
*** dpawlik_ has quit IRC10:03
sean-k-mooneymriedem: hi i think that is the issue yes. i have not had time to pin down the exact cause but i suspect it's because we are not activating the source binding after deleting the dest10:03
mriedemhmm, i guess i would have expected neutron to automatically activate the source host port bindings when the dest host bindings were deleted10:04
mriedembecause when we activate the dest host bindings, neutron automatically de-activates the source host bindings,10:04
mriedemso my thinking was when we delete the dest bindings on rollback, neutron would say, oh i need to activate the only other bindings (source) left10:04
mriedemi could have a wip patch for you to test with if you still have that live migration env available10:05
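A toy model of the rollback reasoning above, not Neutron's or nova's actual code: the class, the function, and the host name `devstack1` are hypothetical. It assumes, as described, that activating one binding deactivates the others, and that deleting a binding does not re-activate anything by itself, so the rollback has to re-activate the source explicitly.

```python
class PortBindings:
    """Hypothetical stand-in for the per-host bindings of one port."""

    def __init__(self, source_host):
        self.status = {source_host: "ACTIVE"}

    def add_inactive(self, host):
        # Destination binding created before live migration starts.
        self.status[host] = "INACTIVE"

    def activate(self, host):
        # Assumption from the discussion: activating one binding
        # deactivates every other binding on the port.
        for h in self.status:
            self.status[h] = "ACTIVE" if h == host else "INACTIVE"

    def delete(self, host):
        # Assumption: deletion does not re-activate any remaining binding.
        del self.status[host]


def rollback(bindings, source_host, dest_host):
    """Delete the dest binding and make sure the source is active again."""
    bindings.delete(dest_host)
    if bindings.status.get(source_host) != "ACTIVE":
        bindings.activate(source_host)


bindings = PortBindings("devstack1")
bindings.add_inactive("devstack2")
bindings.activate("devstack2")          # e.g. during post-copy
after_activate = dict(bindings.status)  # source is now INACTIVE
rollback(bindings, "devstack1", "devstack2")
```

If the destination binding was never activated, the source is still active and the explicit re-activation in `rollback` is skipped.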
*** dpawlik has joined #openstack-nova10:08
*** adrianc has joined #openstack-nova10:09
*** macza has joined #openstack-nova10:12
sean-k-mooneymriedem: i have the devstack vms shut down but i can have it set up quickly again10:16
openstackgerritSlawek Kaplonski proposed openstack/os-vif master: Avoid os-vif to add ovs ports as trunk by default  https://review.openstack.org/59411810:16
*** macza has quit IRC10:16
sean-k-mooneyi'm going to work on the first two neutron bugs first10:16
sean-k-mooney^ that is confusing.... we dont10:17
sapd1sean-k-mooney: Are you working on SR-IOV attach/detach?10:20
sapd1:D10:20
sean-k-mooneysapd1: its on my todo list10:21
*** alexchadin has quit IRC10:25
*** Bhujay has quit IRC10:26
openstackgerritJiri Suchomel proposed openstack/nova master: Set default AZ explicitely for instances without host. Ignore deleted instances when populating with availability zones  https://review.openstack.org/59405010:26
openstackgerritJiri Suchomel proposed openstack/nova master: Set default AZ explicitely for instances without host.  https://review.openstack.org/59405010:27
*** neha30 has quit IRC10:30
mriedemtssurya: on ^, we should just filter out instances w/o a host10:31
mriedemdefault_availability_zone isn't the right config option for instances10:32
mriedemdefault_schedule_zone is, but it defaults to None so it wouldn't fix the bug10:32
*** macza has joined #openstack-nova10:33
*** Dinesh_Bhor has joined #openstack-nova10:34
*** alexchadin has joined #openstack-nova10:36
sean-k-mooneymriedem: isn't the default availability zone nova?10:36
mriedemdefault default_availability_zone is nova10:36
mriedemif the instance is on a host10:36
mriedemdefault_schedule_zone is the thing we set on instance.availability_zone if the user didn't request an az10:37
mriedemand that defaults to None10:37
*** ratailor__ has quit IRC10:37
*** ratailor has joined #openstack-nova10:37
*** macza has quit IRC10:38
sean-k-mooneyoh ok, does horizon handle that differently?10:38
sean-k-mooneyor does devstack set them both to nova?10:38
mriedemno10:39
mriedemGET /servers/{id} will return '' for the az if the instance doesn't have a host set https://github.com/openstack/nova/blob/722d5b477219f0a2435a9f4ad4d54c61b83219f1/nova/api/openstack/compute/views/servers.py#L17010:40
*** ratailor_ has joined #openstack-nova10:40
*** ratailor_ has quit IRC10:40
sean-k-mooneymriedem: when can an instance not have a host set. when its shelved?10:41
mriedemif it fails during scheduling10:42
mriedemNoValidHost10:42
sean-k-mooneyah ok10:42
mriedemand yes if it's shelved offloaded10:42
mriedemi'm not sure that we clear out the instance availability_zone on shelve offload though10:42
sean-k-mooneywell it makes sense, if it's not scheduled to a node it should not have an az, right?10:42
mriedemif not that's like another bug10:42
mriedemcorrect10:42
mriedemthat's what i'm saying in the review10:43
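A minimal sketch of the "filter out instances w/o a host" idea discussed above. This is not the code under review: the function name, the dict-based instances, and the `host_to_az` lookup are hypothetical. It only assumes what the discussion states, namely that an instance with no host should get no AZ and that an instance on a host falls back to the default zone `nova`.

```python
DEFAULT_AVAILABILITY_ZONE = "nova"  # default for instances that are on a host


def populate_availability_zones(instances, host_to_az):
    """Return {uuid: az} for instances that have a host; skip the rest."""
    updates = {}
    for inst in instances:
        host = inst.get("host")
        if host is None:
            # Failed scheduling (NoValidHost) or shelved offloaded:
            # there is no host, so there is no AZ to fill in.
            continue
        updates[inst["uuid"]] = host_to_az.get(host, DEFAULT_AVAILABILITY_ZONE)
    return updates


instances = [
    {"uuid": "a", "host": "compute1"},
    {"uuid": "b", "host": None},
    {"uuid": "c", "host": "compute2"},
]
host_to_az = {"compute1": "az1"}  # compute2 is in no AZ aggregate
result = populate_availability_zones(instances, host_to_az)
```

Instance `b` is left out entirely rather than being assigned a default zone.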
*** ratailor has quit IRC10:43
mriedemmelwitt: i've marked https://bugs.launchpad.net/nova/+bug/1788115 for rc potential10:44
openstackLaunchpad bug 1788115 in OpenStack Compute (nova) "nova-manage db online_data_migrations hangs on instances with no host set" [Medium,In progress] - Assigned to Jiri Suchomel (jsuchome)10:44
mriedemyeah we don't clear out the instance.az on shelve offload10:46
mriedemwe will update it on unshelve though10:46
mriedemhttps://github.com/openstack/nova/blob/722d5b477219f0a2435a9f4ad4d54c61b83219f1/nova/conductor/manager.py#L81510:46
mriedemwhich reminds me https://review.openstack.org/#/c/559828/10:47
mriedemas noted in ^ we also don't clear the port binding10:48
tssuryamriedem: back from lunch, oh okay yea would be better to altogether filter out the ones where host is None10:49
mriedemtssurya: i think that still works for your original bug too10:49
mriedemcomments inline on why10:50
* tssurya looking10:50
*** macza has joined #openstack-nova10:54
*** macza has quit IRC10:59
tssuryamriedem: I agree,10:59
*** Dinesh_Bhor has quit IRC11:05
*** jpena is now known as jpena|lunch11:10
*** Bhujay has joined #openstack-nova11:13
*** udesale has quit IRC11:25
*** macza has joined #openstack-nova11:35
*** macza has quit IRC11:40
openstackgerritBrin Zhang proposed openstack/nova-specs master: Resource retrieving: add change-before filter  https://review.openstack.org/59197611:44
openstackgerritJiri Suchomel proposed openstack/nova master: Filter out instances without a host when populating AZ  https://review.openstack.org/59405011:47
*** MasterofJOKers has quit IRC11:49
*** ujjain has quit IRC11:51
mriedemsean-k-mooney: see how ^ floats your boat11:53
mriedemoops11:53
openstackgerritMatt Riedemann proposed openstack/nova master: Re-activate source host port bindings on live migration rollback  https://review.openstack.org/59413911:53
mriedemsean-k-mooney: ^11:53
mriedemi want to see what miguel thinks about that too11:53
sean-k-mooneymriedem: i'll test that and see. the other thing is i'm not sure what the state of qemu is when this particular bug happens, due to the fact the monitor closed.11:55
sean-k-mooneythat is kind of a separate bug however11:56
sean-k-mooneyat least with this patch if i do a hard reboot i think everything should work properly again.11:56
*** MasterofJOKers has joined #openstack-nova11:56
mriedemi am a bit surprised that qemu would bomb out after we went into post-copy mode11:56
mriedemwe only activate the dest host port bindings during post-copy or post-live migration11:56
mriedemso your qemu failure must have happened after post-copy for us to activate the dest host bindings11:57
mriedemyou'd know if you saw "Binding ports to destination host" in the source host compute logs11:57
*** yikun_ has joined #openstack-nova11:57
sean-k-mooneyi'll reproduce and check that, then apply your patch and see what happens11:58
*** ujjain has joined #openstack-nova12:00
*** macza has joined #openstack-nova12:04
openstackgerritYikun Jiang (Kero) proposed openstack/nova master: [placement] Use oslotest uuidsentinel  https://review.openstack.org/59414412:05
mriedemcdent: ^12:05
cdentnice12:05
openstackgerritBrin Zhang proposed openstack/nova-specs master: Resource retrieving: add change-before filter  https://review.openstack.org/59197612:07
cdentmriedem: wait12:07
cdentnm12:08
cdentefried: did you see https://review.openstack.org/#/c/594068/12:08
efriedlooking...12:08
*** macza has quit IRC12:08
efriedcdent: Thanks, I was just starting to poke on that.12:09
cdentefried: cool, wasn't sure if you had started your own12:09
*** macza has joined #openstack-nova12:11
mriedemclearly that will have to go through the release dance12:11
*** macza has quit IRC12:12
*** jpena|lunch is now known as jpena12:18
openstackgerritBrin Zhang proposed openstack/nova-specs master: Resource retrieving: add change-before filter  https://review.openstack.org/59197612:23
openstackgerritSlawek Kaplonski proposed openstack/os-vif master: Avoid os-vif to add untagged ports to ovs ports by default  https://review.openstack.org/59411812:23
*** brinzhang has quit IRC12:24
openstackgerritJiri Suchomel proposed openstack/nova master: Filter out instances without a host when populating AZ  https://review.openstack.org/59405012:26
*** Luzi has quit IRC12:28
openstackgerritSlawek Kaplonski proposed openstack/os-vif master: DNM Testing different CDN projects with DEAD_VLAN_TAG  https://review.openstack.org/59415312:28
*** tbachman has joined #openstack-nova12:32
*** macza has joined #openstack-nova12:35
*** tbachman_ has joined #openstack-nova12:35
*** Luzi has joined #openstack-nova12:35
*** tbachman has quit IRC12:37
*** tbachman_ is now known as tbachman12:37
*** macza has quit IRC12:40
tssuryaalex_xu, gmann or other api-experts: nova list seems to have this filter "--instance-name" which doesn't seem to be processed anywhere, was wondering where/why it was used ?12:58
*** NostawRm has joined #openstack-nova13:00
alex_xutssurya: I didn't even know we have an '--instance-name' filter in the API13:00
alex_xuI don't think we have that in the API13:01
tssuryaalex_xu: I didn't find it either; but its listed in the options13:01
mriedemwhat is it translated to in novaclient?13:01
mriedemadded in 2011 by rackspace so it was probably something in rax13:03
*** efried is now known as efried_goatin13:03
mriedemnot upstream13:03
mriedemlooks like it's meant to filter on OS-EXT-SRV-ATTR:instance_name13:04
alex_xuwe have attribute 'OS-EXT-SRV-ATTR:instance_name', try to find out if there is any translate13:04
alex_xumriedem: yea13:04
mriedemright which would be instance.name13:04
mriedemfiltering on --name would be instance.display_name13:04
tssuryabut we also have the "--name"13:04
tssuryaah okay13:04
mriedemhttps://github.com/openstack/python-novaclient/commit/dcd5544133f1cc1171f8078b2ed54143b52fb06413:06
alex_xubut it doesn't work on the server side13:07
tssuryaalex_xu: yea, was just trying it13:07
*** s10 has joined #openstack-nova13:08
tssuryabecause we don't have it on the server side right ?13:08
alex_xuyes, I think so13:09
alex_xugmann: enjoy your vacation!13:09
tssuryacool, should I open a bug on the client to remove it from the option for the users ? or do we have plans on putting it on the server side13:10
mriedemthat initial novaclient change wasn't even correct,13:11
mriedemit was later updated in https://github.com/openstack/python-novaclient/commit/fc8e5e3fe3a1164eb2e923ed599e63a2af1a4f3c13:11
alex_xutssurya: we have a filter called 'name', it should be the '--instance-name'13:11
tssuryaeither way, I was trying to skip the minimal constructs for down cells for "all" filters and came across this one not abiding by the rules and doing nothing except printing everything13:11
alex_xutssurya: sorry, that 'name' is mapping to 'display_name' also13:12
tssuryaalex_xu: yea, the documentation for those options needs to be clearer about what means what if we are going to have both13:12
alex_xutssurya: yea13:12
tssuryamriedem: oh, so you want to keep both ?13:13
mriedemnot necessarily,13:13
mriedemclearly there is a bug in novaclient which needs to be reported13:13
tssuryamriedem: right, I can open a bug now and we can see if this option is really useful to implement on the server side, else we can punt it. At least the documentation should be clearer for those options13:14
mriedemso the --name filter in nova list is being mapped to filter on instance.name rather than display_name?13:15
mriedemhttps://github.com/openstack/python-novaclient/commit/fc8e5e3fe3a1164eb2e923ed599e63a2af1a4f3c13:16
mriedemoops13:16
mriedemfilter_mapping = {13:16
mriedem                'image': 'image_ref',13:16
mriedem                'name': 'display_name',13:16
mriedemso we map name to display_name in the compute API code13:16
mriedemand instance_name should map to 'name'13:16
mriedemis what i think alex_xu was saying13:16
mriedemif we want to support that in the server13:16
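A small sketch of the mapping described above. The `image` and `name` entries are the ones quoted from the compute API code; the `instance_name` entry is the hypothetical addition being discussed (it is not in the server today), and `map_filters` plus the sample values are illustrative only.

```python
FILTER_MAPPING = {
    "image": "image_ref",
    "name": "display_name",
    # Hypothetical: what server-side --instance-name support would add.
    "instance_name": "name",
}


def map_filters(api_filters, mapping=FILTER_MAPPING):
    """Translate API filter keys to instance field names; pass others through."""
    return {mapping.get(key, key): value for key, value in api_filters.items()}


mapped = map_filters({
    "name": "web",
    "instance_name": "instance-0000002a",
    "status": "ACTIVE",
})
```

With this mapping, `--name` filters on `instance.display_name` and `--instance-name` would filter on `instance.name`, which is the distinction drawn in the conversation.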
mriedembut the client side --instance-name filter doesn't do anything today, right?13:16
alex_xumriedem: we have instance_name filter long time before https://github.com/openstack/nova/commit/1c90eb34085dbb69f37e2f63dea7496afabb06b3#diff-516904cc81cade24a9122ecf96707bf0R70213:17
mriedemright13:17
mriedem(8:16:28 AM) mriedem: and instance_name should map to 'name'13:17
*** psachin has quit IRC13:17
mriedemso when was that removed?13:17
alex_xumriedem: probably the folsom release, I see that filter in that release, but it disappeared after grizzly13:20
*** eharney has quit IRC13:26
alex_xumriedem: tssurya here https://review.openstack.org/#/c/10917/313:26
mriedemah ok, and forgot to remove the novaclient side of that13:27
tssuryaah thanks13:28
mriedemand apparently no one has noticed since folsom13:28
*** erlon has joined #openstack-nova13:28
openstackgerritChris Dent proposed openstack/nova master: Set policy_opt defaults in placement gabbi fixture  https://review.openstack.org/59417213:28
mriedemhttps://bugs.launchpad.net/python-novaclient/+bug/1295126/comments/313:29
openstackLaunchpad bug 1295126 in python-novaclient "Admin only shown for args that can be used by non-admin" [Wishlist,Fix released] - Assigned to Verónica Musso (veronica-a-musso)13:29
mriedem"and --instance-name has no effect for both"13:29
mriedemtssurya: alex_xu: i'd probably just deprecate the --instance-name option in nova list, it's not done anything since essex13:30
mriedemadding the support server-side at this point is likely a microversion13:30
alex_xumriedem: yea, and it is admin-only filter so we can deprecate it13:30
tssuryamriedem: ack, I don't think its that essential a filter13:30
tssuryaits not an admin-only..13:31
alex_xutssurya: the instance_name field can only be seen by the admin?13:33
mriedemno,13:35
mriedemOS-EXT-SRV-ATTR:instance_name is also shown for non-admins13:35
mriedemit's in ExtendedServerAttributesController13:35
mriedemoh wait no alex_xu is correc13:36
mriedem*correct13:36
mriedemos_compute_api:os-extended-server-attributes defaults to admin-only13:36
mriedemhttps://docs.openstack.org/nova/latest/configuration/policy.html13:37
*** efried_goatin is now known as efried13:37
mriedemstephenfin: see https://docs.openstack.org/nova/latest/configuration/policy.html and os_compute_api:os-extended-server-attributes - i thought we had restructured text formatting on policy option help?13:38
mriedemmaybe that's only in oslo.config option help?13:38
mriedemefried: ? ^13:38
*** ccamacho has quit IRC13:38
efriedmriedem: Patch not merged. Lemme grab it...13:39
tssuryaoh, well its confusing because the help for the options doesn't say its Admin only and the bug above ^ says they changed it: https://bugs.launchpad.net/python-novaclient/+bug/1295126/comments/613:39
openstackLaunchpad bug 1295126 in python-novaclient "Admin only shown for args that can be used by non-admin" [Wishlist,Fix released] - Assigned to Verónica Musso (veronica-a-musso)13:39
mriedemtssurya: yup,13:39
mriedemdespite that one person saying it was never even used13:39
mriedemtssurya: so just report a bug and deprecate --instance-name13:39
mriedemi'll +2 that13:39
tssuryamriedem: cool13:39
mriedemit predates gerrit so i'm not surprised it's a mess13:40
stephenfinmriedem: Yeah, just oslo.config, I think13:40
stephenfinthough I had it in my head oslo.policy wasn't broken in the first place. Obviously not13:40
efriedmriedem:13:41
efried- nova patch to twiddle a couple of options to prove it works13:41
efried- oslo.config patch to address complaint that using the rst role in help text shows up ugly in the sample: https://review.openstack.org/#/c/583064/13:41
efriedhttps://review.openstack.org/#/c/583025/ shoulda been that first link, sorry13:41
stephenfinefried: I think that's a different issue13:41
stephenfinefried: mriedem's asking why newlines and the likes in policy.help aren't being parsed13:42
stephenfin...in the HTML output. Your patch affects the ini output, right?13:42
tssuryaokay, so instance.display_name is name and instance.hostname is OS-EXT-SRV-ATTR:hostname and we don't care about OS-EXT-SRV-ATTR:instance_name.13:43
efriedstephenfin: the oslo.config patch, yes.13:43
efriedoh, reread what mriedem was actually saying. Yeah, I don't know about that, sorry.13:44
efriedI would have asked stephenfin :)13:44
mriedemha13:44
openstackgerritMatt Riedemann proposed openstack/nova master: Filter out instances without a host when populating AZ  https://review.openstack.org/59405013:44
mriedem^ is likely an RC3 issue13:45
mriedemregarding petr's email about install guide testing,13:47
stephenfinmriedem: Agreed13:47
mriedemi wonder how valid it is, or how much time would be saved, by starting up devstack but not enabling nova, so that you can do that manually after keystone/glance/cinder/neutron are already set up13:48
mriedemi think the only major thing in the install guide in rocky was the placement db13:48
sean-k-mooneymriedem: i think you will hit dependency issues13:49
*** awaugama has joined #openstack-nova13:49
mriedemon other openstack services?13:50
mriedemor things like setting up libvirt?13:50
sean-k-mooneywell neutron would expect to be able to talk to placement for things like routed networks13:50
openstackgerritMatt Riedemann proposed openstack/nova stable/rocky: Filter out instances without a host when populating AZ  https://review.openstack.org/59417813:50
mriedemrouted networks are optional and devstack doesn't set those up anyway13:50
mriedemwe definitely *should* have a ci job that uses routed networks13:51
mriedemacross a 2-node deploy where each host is in a separate aggregate13:51
mriedembut that would require time and people that care to make sure it continues to work13:51
sean-k-mooneyin theory devstack should be able to help i guess13:52
*** gbarros has joined #openstack-nova13:52
sean-k-mooneymriedem: is placement installation considered part of the nova install guide?13:52
*** moshele has quit IRC13:53
*** alexchadin has quit IRC13:55
dansmithtssurya: looks like the down-cell stack needs rebasing again13:55
dansmithpresumably its review-able regardless?13:56
efriedmriedem: https://review.openstack.org/594179 <== alternative uuidsentinel impl13:56
tssuryadansmith: yea, its ready for a first time review13:56
tssuryastill working on filtering part13:56
dansmithokay13:56
tssuryabut would be nice to get opinions13:56
tssuryaI have them as separate patches for now, will squash them with the version BUMP13:56
tssuryaonce we review the approach13:57
tssuryaand, mriedem: sorry about missing the instance.host None case earlier on and the backport headaches.13:57
openstackgerritJiri Suchomel proposed openstack/nova stable/pike: Filter out instances without a host when populating AZ  https://review.openstack.org/59418413:58
*** alexchadin has joined #openstack-nova13:58
openstackgerritSurya Seetharaman proposed openstack/nova stable/queens: Filter out instances without a host when populating AZ  https://review.openstack.org/59418514:00
stephenfinmriedem: https://bugs.launchpad.net/oslo.policy/+bug/178818314:02
openstackLaunchpad bug 1788183 in oslo.policy "Rule description not rendered as rST" [Undecided,New]14:02
mriedemtssurya: not your fault, we have reviewers for a reason14:04
mriedemand i obviously missed it as well14:04
mriedemsean-k-mooney: i think so yes14:04
mriedemefried: why not in oslotest? because of the circular dep?14:05
efriedmriedem: And because it's... a UUID util. And because just because I can't think of a reason for it to be used outside of test, doesn't mean it can't be. See commit message.14:05
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: Filter out instances without a host when populating AZ  https://review.openstack.org/59418514:06
sean-k-mooneymriedem: then in that case if you wanted to test the nova install guide, and use devstack to help, you would just have devstack install keystone, mysql, and rabbitmq, and perhaps memcached14:07
mriedembut nova also needs neutron14:07
mriedemand i don't want to go through the neutron install guide to test nova's install14:07
mriedemsame for glance14:07
mriedemcue kevin fox to say it should all be one monolithic install14:08
*** ccamacho has joined #openstack-nova14:08
sean-k-mooneywell i would not expect the install guide for nova to cover the glance or neutron parts14:09
sean-k-mooneyi also would not assume you could boot a vm after finishing it14:09
sean-k-mooneyi would just assume i had the nova components deployed and functioning14:09
sean-k-mooneye.g. nova hypervisor list should show all the resources but openstack server create would fail14:10
mriedemwell, if i'm installing nova, i would like to be able to create a vm by the end of it14:10
mriedemotherwise i don't know if i f'ed up the install somewhere14:10
sean-k-mooneyin that case it does have to be a multi service install guide14:11
mriedemalso, https://docs.openstack.org/nova/latest/install/controller-install-ubuntu.html#install-and-configure-components refers to the neutron install guide14:11
*** alexchadin has quit IRC14:13
sean-k-mooneyi guess referring to the other guide also works. that said, until nova-network is fully dead neutron is technically not a nova dependency14:13
sean-k-mooneybut i could see adding neutron to the devstack install.14:13
*** alexchadin has joined #openstack-nova14:13
openstackgerritMatt Riedemann proposed openstack/nova stable/pike: Filter out instances without a host when populating AZ  https://review.openstack.org/59418414:14
sean-k-mooneyglance i guess would also be required because there is no way to boot a vm otherwise. unless you used the fake drivers14:14
*** Luzi has quit IRC14:23
*** eharney has joined #openstack-nova14:23
*** mlavalle has joined #openstack-nova14:25
*** vivsoni has quit IRC14:25
*** lbragstad has quit IRC14:26
*** munimeha1 has joined #openstack-nova14:27
mriedemjroll: pretty sure this has always been true yeah? https://bugs.launchpad.net/nova/+bug/178750914:27
openstackLaunchpad bug 1787509 in OpenStack Compute (nova) "Baremetal filters and default filters cannot be used simultaneously in the same nova" [Undecided,New]14:27
mriedemuntil pike anyway14:27
dansmithmriedem: so reviewing tssurya's series just now made me (re-)realize14:34
dansmithwe're still iterating all of the instances from a cell before returning them in order to do the fault stuff14:35
mriedemdansmith: i think when you were adding instance lister,14:35
mriedemyou pre-joined faults in the db api and it didn't seem to make a difference in perf14:35
mriedemand it might have caused some other issue14:35
dansmithso I'm surprised we gained as much as we did by my batching, and so I wonder if we push the faults into the batches if it would help14:36
dansmithmriedem: yeah, I remember that now14:36
mriedemwe also only show fault if the vm state is ERROR or DELETED14:36
dansmithright14:36
*** Luzi has joined #openstack-nova14:36
mriedemwell this bug says nova list is too slow https://bugs.launchpad.net/nova/+bug/178814914:37
openstackLaunchpad bug 1788149 in OpenStack Compute (nova) "nova list too slow" [Undecided,New]14:37
mriedemso there is that14:37
*** Luzi has quit IRC14:37
dansmithheh14:37
tssuryanice14:38
mriedemalex_xu: did you say gmann was on vacation? https://review.openstack.org/#/c/584223/14:41
mriedem(8:09:57 AM) alex_xu: gmann: enjoy your vacation!14:41
tssuryamriedem: yes untill 31st14:41
tssuryauntil*14:42
sean-k-mooneymriedem: regarding the migration issue i dont see "Binding ports to destination host" in either the source or dest compute logs14:45
sean-k-mooneythe dest does have "Plugging VIFs using destination host port bindings before live migration." and "Deleted binding for port 3218fd70-ea82-4ee1-9a5b-2d3c9d8b9fa0 and host devstack2."14:46
mriedemthe former is when we do pre_live_migration on the dest host,14:46
mriedemat that point port bindings are still active for the source host14:46
mriedemthe latter is when we're rolling back after the failed migration14:47
mriedemso i'm not sure that my patch would fix your issue if we never deactivated the source host bindings14:47
*** r-daneel has joined #openstack-nova14:47
mriedemthat's why i was saying i'd be surprised if it fixed it b/c it would mean the failure happened after post-copy14:47
sean-k-mooneyya. i expected to see boot. could deleting the port bindings be the root of the issue?14:47
mriedemmaybe?14:47
mriedemmaybe neutron f's up and screws up the active source host port binding?14:48
sean-k-mooneyi'll apply the patch in any case and see what happens14:48
s10mriedem: bug https://bugs.launchpad.net/nova/+bug/1788149 could be eventlet related and maybe caused by nova-neutron connection on every nova show/nova list (https://bugs.launchpad.net/nova/+bug/1567655)14:50
openstackLaunchpad bug 1788149 in OpenStack Compute (nova) "nova list too slow" [Undecided,Incomplete]14:50
openstackLaunchpad bug 1567655 in OpenStack Compute (nova) "500 error when trying to list instances and neutron-server is down" [Medium,Confirmed]14:50
mriedems10: hmm yeah good point re eventlet14:51
mriedemalso if they are running nova-api via wsgi in pike we aren't monkey patching eventlet14:51
*** gbarros has quit IRC14:52
mriedemand yeah https://bugs.launchpad.net/nova/+bug/1567655 came up again with my product team overlords last week14:52
openstackLaunchpad bug 1567655 in OpenStack Compute (nova) "500 error when trying to list instances and neutron-server is down" [Medium,Confirmed]14:52
mriedemregarding perf/scaling issues with nova api14:52
*** gbarros has joined #openstack-nova14:52
mriedemtl;dr we should cache the port security group information in instance.info_cache like we do for other port information14:52
*** ccamacho has quit IRC14:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge config drive extension response into server controller  https://review.openstack.org/58422314:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge extended server attributes extension response  https://review.openstack.org/58459014:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge keypair extension response into server view builder  https://review.openstack.org/58474814:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge server usage extension response into server view builder  https://review.openstack.org/58526214:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge security groups extension response into server view builder  https://review.openstack.org/58547514:53
openstackgerritMatt Riedemann proposed openstack/nova master: Merge extended_status extension response into server view builder  https://review.openstack.org/59209214:53
openstackgerritDan Smith proposed openstack/nova master: WIP: Record cell success/failure/timeout in CrossCellLister  https://review.openstack.org/59426514:56
dansmithtssurya: ^14:56
*** alexchadin has quit IRC14:56
dansmithtssurya: what if we do that, and then in get_instance_objects_sorted() (or above) we just get a handle on InstanceLister itself14:56
dansmithtssurya: then we can construct the missing instances from the failed cells separately from the intricate multi-cell-listing logic?14:57
tssuryadansmith: sounds good, guess its better to move it out of the generator14:57
*** alexchadin has joined #openstack-nova14:57
sean-k-mooneymriedem: this is really annoying. i'm not seeing the issue going from ovs-dpdk to kernel ovs. the migration fails but the bindings are fine. i was hitting this going from ovs to ovs-dpdk when i reported the bug so i'll test that next.14:58
dansmithtssurya: yeah I actually would like it higher than get_instance_objects_sorted(), but definitely want it out of the low-level logic if possible14:58
tssuryadansmith: when you say higher than get_instance_objects_sorted(), then you mean we return the list of non-responsive cells instead from ^^ patch and constrcut it in compute/api ?15:01
tssuryaconstruct*15:01
dansmithtssurya: yes, I'd like that better personally, just to keep "api logic" closer to the api15:01
dansmithtssurya: maybe just return a tuple from get_instance_objects_sorted() indicating (failed_cell_uuids, instances_i_actually_got)15:02
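A minimal sketch of the tuple-return idea dansmith describes above. Everything here is hypothetical: the `CELL_FAILED` sentinel, the dict-shaped instances, and the `results_by_cell` input are stand-ins, not nova's actual API. The point is only that the low-level lister reports which cells failed and leaves the decision about what to do with them to the API layer.

```python
# Hypothetical sentinel marking a cell that failed or timed out.
CELL_FAILED = object()


def get_instance_objects_sorted(results_by_cell, sort_key="created_at"):
    """Return (failed_cell_uuids, instances) from per-cell results.

    results_by_cell maps a cell UUID to either a list of instance dicts
    or CELL_FAILED. It stands in for the real scatter-gather output.
    """
    failed_cell_uuids = []
    instances = []
    for cell_uuid, result in results_by_cell.items():
        if result is CELL_FAILED:
            # Record the failure; do not try to build placeholders here.
            failed_cell_uuids.append(cell_uuid)
        else:
            instances.extend(result)
    instances.sort(key=lambda inst: inst[sort_key])
    return failed_cell_uuids, instances
```

The caller (the "api logic" in dansmith's wording) would then construct whatever it needs for the failed cells separately from the multi-cell listing code.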
*** lbragstad has joined #openstack-nova15:02
*** dpawlik has quit IRC15:03
*** alexchadin has quit IRC15:03
tssuryadansmith: okay looks doable, I will try it out15:03
dansmithcool15:04
tssuryathanks for the review btw, was getting kind of lost in the details15:04
tssurya:)15:04
dansmithtssurya: no problem, this is complicated and I've been neglecting it too long15:04
tssuryahehe :D15:04
openstackgerritJose Castro Leon proposed openstack/nova master: Add extend in-use volumes support for RBD  https://review.openstack.org/59427315:04
jrollmriedem: yeah, death to the baremetal filters15:05
mriedemi left comments on the bug,15:05
mriedemwas mostly looking for input on how people have done vm/bm in a single compute endpoint, i assume host aggregates15:05
mriedembut i've heard there are also quota issues when doing it that way15:06
*** pcaruana has quit IRC15:09
*** gbarros has quit IRC15:09
*** gbarros has joined #openstack-nova15:11
*** priteau has joined #openstack-nova15:14
sean-k-mooneymriedem: you can use capabilities in the flavor to avoid the need for host aggregates15:15
sean-k-mooneynot sure how many people go that route vs AZs or host aggregates15:16
*** sambetts|afk is now known as sambetts15:16
sean-k-mooneyactually with more recent releases you can just use resource classes + dedicated baremetal or vm flavor and let placement handle it15:17
mriedemthat's why the baremetal filter options were deprecated in pike and removed in rocky15:18
*** s10 has quit IRC15:18
mriedemwhich is why i marked the bug as won't fix15:18
*** mhen has quit IRC15:18
*** mhen has joined #openstack-nova15:20
dansmithmriedem: do you know if Kevin_Zheng and yikun_ are still working on tests? because I think if they have no ERROR instances, I could hack up a generator they could test to compare apples to apples on whether or not that object list loop could go faster15:20
jrock_cfdghello - i'm trying to add a serial device with specific parameters to an instance at creation time (source mode=connect host=0.0.0.0 port=4555) ; I think i've narrowed it down to these 3 scripts (/usr/lib/python-2.7/site-packages/nova/virt/libvirt/{config,driver,guest}.py) - which is the correct place to make this change? And has anyone here done anything like this and maybe have some examples?15:20
mriedemidk15:20
dansmithmeaning still have their profiling setup accessible or whatever15:20
dansmithokay15:20
mriedemi'm sure it's still setup15:21
Kevin_Zhengwe can still test15:21
mriedemit's just a bash script on a devstack deploy on a baremetal host15:21
dansmithKevin_Zheng: ohai15:21
mriedemthe lurker15:21
mriedemoh right, monday, tuesday thursday are work late days for kevin and yikun15:21
dansmithah15:22
dansmithKevin_Zheng: I assume all your test instances are ACTIVE or something right?15:22
*** moshele has joined #openstack-nova15:22
Kevin_ZhengYes15:22
Kevin_ZhengAll active15:22
dansmithKevin_Zheng: so right here, we iterate all the instances: https://github.com/openstack/nova/blob/master/nova/compute/instance_list.py#L124-L12615:22
dansmithKevin_Zheng: and so I'm wondering if removing that would also help your perf a bit.. the problem is we have to handle faults, which are handled in that list method right now15:23
*** alex_xu has quit IRC15:23
mriedemmaciejjozefczyk: if you're around https://review.openstack.org/#/c/591607/15:23
dansmithKevin_Zheng: but with the batching, we *might* be better off doing that in the batches instead of at the top to reduce latency15:23
mriedemmaciejjozefczyk: our public cloud ops team reported the same issue15:24
Kevin_ZhengSo instead of all instances, we do what?15:24
dansmithKevin_Zheng: well, we'd do it in the batch handler, so we fill faults on ~100 instances at a time in "parallel" instead of on 1000 instances serially15:24
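A rough sketch of filling faults per batch rather than over the whole result list at the end, as dansmith outlines above. The names are assumptions: `lookup_faults` is a hypothetical callable standing in for the database query, and instances are plain dicts. This version handles batches one after another; the "parallel" aspect in the discussion would come from each cell's batch handler doing this independently.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def fill_faults_in_batches(instances, lookup_faults, batch_size=100):
    """Attach faults per batch instead of once over the full list.

    lookup_faults is a hypothetical callable taking a list of instance
    UUIDs and returning {uuid: fault}; it stands in for a DB query.
    """
    for batch in batched(instances, batch_size):
        faults = lookup_faults([inst["uuid"] for inst in batch])
        for inst in batch:
            inst["fault"] = faults.get(inst["uuid"])
            yield inst
```

Because the function is a generator, the first instances are available after one small fault lookup instead of after a pass over all 1000.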
mriedemefried: i guess we can land this now huh https://review.openstack.org/#/c/520024/15:25
Kevin_ZhengOh OK15:25
dansmithKevin_Zheng: sounds like if I come up with a test patch you could run it again and compare to without the patch just to see if it helps or hurts?15:25
Kevin_ZhengGuess I have to generate some error instance then15:25
cdentyay! on 52002415:26
Kevin_ZhengYeah we can do it15:26
mriedemyou insert them right into the cell db right?15:26
Kevin_ZhengYes15:26
dansmithKevin_Zheng: well, the first test would still be all active, just to measure what the perf impact of unrolling that loop is15:26
dansmithKevin_Zheng: then we'd test a patch with some error instances to see if we lose all of that with the fault handling, or only a fraction of the gain we made15:27
*** Bhujay has quit IRC15:27
Kevin_ZhengOk15:27
dansmithKevin_Zheng: I'll try cooking something up and will add you to the review15:30
Kevin_ZhengCool, I will go to bed and check in the morning15:30
dansmiththanks15:31
sean-k-mooneymriedem: so regarding the live migration bug. the source node is activating the binding on the dest host after the migration aborts and this is also racing with the deletion of the binding on the dest host ...15:33
efriedmriedem: Yes, on 024, thanks.15:33
*** gbarros has quit IRC15:33
*** gbarros has joined #openstack-nova15:34
mriedemsean-k-mooney: so we're hitting post-copy and then aborting?15:37
mriedemthere are only 2 places that live migration activates the dest host port binding:15:38
mriedem1. post-copy event callback from libvirt15:38
mriedem2. _post_live_migration after the hypervisor said the live migration was successful15:38
*** dklyle has quit IRC15:41
openstackgerritMatt Riedemann proposed openstack/nova master: Explicitly fail if trying to attach SR-IOV port  https://review.openstack.org/59189815:42
*** luzC has quit IRC15:42
sean-k-mooneymriedem: http://paste.openstack.org/show/728534/15:42
dansmithugh, the expectation that we return an instancelist from get_all makes this harder than I thought15:42
sean-k-mooneyi think we are getting an update from neutron and that is triggering the activate. perhaps hitting the _post_live_migration code15:42
*** dklyle has joined #openstack-nova15:44
sean-k-mooneymriedem: lines 43-50 are the ones i'm suspicious of15:45
mriedema neutron event wouldn't make us activate a port15:45
mriedemjust refresh the info cache15:45
mriedemAug 21 16:19:17 devstack2 nova-compute[25894]: WARNING nova.compute.manager [None req-594840ec-7af2-47d2-929b-cef9dda07bb8 service nova] [instance: fead1ca6-beab-4c47-a73e-a3ab7f7c4de2] Received unexpected event network-vif-unplugged-ef02ea3f-9a11-4519-bcd3-2bfca97edf26 for instance with vm_state active and task_state migrating.15:46
mriedemmeans we ignore it15:46
sean-k-mooneyhum ok well on line 50 we activate the port binding for devstack5 which was the destination node. but the migration has already aborted15:47
mriedemhmm15:47
mriedemAug 21 16:19:17 devstack2 nova-compute[25894]: DEBUG nova.network.neutronv2.api [None req-c8b07cbc-52f7-4d20-aacc-f3036ad90c8d None None] Activated binding for port ef02ea3f-9a11-4519-bcd3-2bfca97edf26 and host devstack5. {{(pid=25894) activate_port_binding /opt/stack/nova/nova/network/neutronv2/api.py:1352}}15:47
mriedemindeed15:48
*** luksky has quit IRC15:48
mriedemsean-k-mooney: do you see any "(Lifecycle Event)" messages right before the traceback on the source node?15:50
sean-k-mooneychecking15:50
mriedemshould have also seen "Binding ports to destination host" if handle_lifecycle_event was what was activating the binding15:51
mriedemer,15:51
mriedemsean-k-mooney: are these the logs before or after my patch from a few hours ago?15:51
sean-k-mooneybefore. and ya the migration completes...15:53
sean-k-mooneyi'll paste the log section15:53
mriedemso you're seeing the "Migration completed" lifecycle event15:53
mriedem?15:53
mriedemmaybe that's sent in both failure and success cases15:54
sean-k-mooneyhttp://paste.openstack.org/show/728539/15:54
mriedembingo15:54
mriedemAug 21 16:19:16 devstack2 nova-compute[25894]: INFO nova.compute.manager [None req-c8b07cbc-52f7-4d20-aacc-f3036ad90c8d None None] [instance: fead1ca6-beab-4c47-a73e-a3ab7f7c4de2] VM Migration completed (Lifecycle Event)15:54
sean-k-mooneyline 25 is the completion and line 31 is the failure15:55
mriedemAug 21 16:19:17 devstack2 nova-compute[25894]: DEBUG nova.compute.manager [None req-c8b07cbc-52f7-4d20-aacc-f3036ad90c8d None None] [instance: fead1ca6-beab-4c47-a73e-a3ab7f7c4de2] Binding ports to destination host: devstack5 {{(pid=25894) handle_lifecycle_event /opt/stack/nova/nova/compute/manager.py:1130}}15:55
sean-k-mooneyya i just saw that too15:55
mriedemAug 21 16:19:17 devstack2 nova-compute[25894]: ERROR nova.virt.libvirt.driver [-] [instance: fead1ca6-beab-4c47-a73e-a3ab7f7c4de2] Live Migration failure: internal error: qemu unexpectedly closed the monitor: 2018-08-21T15:19:15.187710Z qemu-kvm: -chardev socket,id=charnet0,path=/var/run/openvswitch/vhuef02ea3f-9a,server: info: QEMU waiting15:55
mriedemyeah so the driver is sending the 'migration completed' event even though the job failed15:55
mriedemthat's the bug15:55
mriedemand that's why we are activating the dest host port bindings on failure15:55
mriedemand then deleting them in rollback :)15:56
sean-k-mooneyya. so libvirt bug?15:56
mriedemlibvirt driver bug yeah15:56
*** gyee has joined #openstack-nova15:56
sean-k-mooneywell the live migration completion event is coming from libvirt no?15:57
mriedemyes, but the driver should check the job status to see if it failed or not15:57
mriedemif we can15:57
mriedemotherwise i don't think we can rely on that event15:57
sean-k-mooneylet me see if danpb is about15:57
openstackgerritChris Dent proposed openstack/nova master: Set policy_opt defaults in placement deploy unit test  https://review.openstack.org/59433415:57
*** danpb has joined #openstack-nova15:58
*** itlinux has joined #openstack-nova15:59
sean-k-mooneydanpb: thanks. am regarding http://paste.openstack.org/show/728539/. does the live migration completion event from libvirt have a status we can check for failures?15:59
mriedemi've updated https://review.openstack.org/#/c/594139/ with comments15:59
danpbsean-k-mooney: you summoned me  :-)15:59
mriedemhow many goats had to be sacrificed?16:00
sean-k-mooneyhaha TBD16:00
*** macza has joined #openstack-nova16:00
danpbsean-k-mooney: you have any more context than just that log file ?16:01
sean-k-mooneydanpb: yes im testing live migration between ovs to ovs-dpdk in this case16:01
sean-k-mooneythat is causing qemu to have an internal error16:02
danpbyep, looks like QEMU on target saw error in expected state & exited16:02
mdboothmriedem: I think if we've reached the point of post-copy, we shouldn't rollback.16:02
sean-k-mooneynova is assuming that when we get the live migration complete event that everything worked fine but in this case qemu explodes and the migration fails16:02
danpbwhich should have caused libvirt to abort migration & nova to rollback16:02
mdboothmriedem: Because the guest was actually running on the destination at that point.16:02
mdboothGuessing that could be tricky with the current code structure, though...16:03
danpbmdbooth: this logfile isn't showing post-copy is it ?  looks like normal pre-copy to me16:03
sean-k-mooneymdbooth: the guest is still running fine after this. its networking is messed up but it's still running on the host16:03
mriedemmdbooth: we aren't hitting post-copy16:03
mdboothdanpb: We could be talking about different things. I was referring to https://review.openstack.org/#/c/594139/16:03
danpboh fun, two different live migration discussions in parallel :-)16:04
sean-k-mooneymdbooth: danpb its the same one16:04
mriedemdanpb: so this is all new code since you've been in nova16:04
sean-k-mooneyi reproduced it16:04
mriedemwe're just assuming "migration completed" means it was successful, which is wrong in this case16:04
*** sahid has quit IRC16:05
mriedemso just need to not send that event callback up to the compute manager if the job failed16:05
mriedemwhich i think we can glean from the jobState object16:05
mriedemassuming that info is available to us in the params to _event_lifecycle_callback16:05
mriedemi'm not sure that it does though, we get event and detail16:06
mriedembut not the job status16:06
danpbmriedem:  which migration events are you referring to ?16:07
mdboothI don't think we're actually consuming events. We're polling the migration job.16:07
mriedemhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/host.py#L179-L18416:07
mriedemmdbooth: no16:07
mriedemnot for this16:08
danpboh, so you're just looking at the lifecycle events16:08
mdboothmriedem: ack. Was looking at _live_migration_monitor.16:08
danpbi'm not convinced that's a desirable way to determine success vs failure16:08
mriedemwe're getting VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED16:09
mriedemand assuming it's success16:09
danpbthe job status from the _live_migration_monitor is a better way to check for failure16:09
mriedemyeah, but we're on different threads here16:09
mriedemwe have the domain, could we get the jobState from that?16:09
danpbmriedem: that  VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED  just says that the guest has been paused, as a result of the live migration operation16:09
danpbit doesn't say anything about the operation being success or failure16:10
mriedemright, and that's our bug :)16:10
danpbso you definitely can't assume  success from that16:10
mriedemright,16:11
mriedemso i can remove that to fix this quick16:11
mriedemor try to find the jobState from the domain and check the status?16:11
dansmithKevin_Zheng: okay I've changed my mind for the moment.. the api code is so generator-unfriendly that a quick hack to test this is more involved than I thought16:11
danpbmriedem: if there's some action that needs to take place during the migration operation16:11
danpbmriedem: then my gut feeling would be to have the _live_migration_monitor thread take care of it16:12
sean-k-mooneymriedem: well i'm not sure we need to change that code. where is the EVENT_LIFECYCLE_MIGRATION_COMPLETED event consumed? because we have stopped moving stuff at this point, we just don't know if it succeeded16:12
mriedemdanpb: yeah most likely - and that's inline with what dansmith said on the review for this change16:12
mriedemsince it was baking libvirt logic into the compute manager lifecycle callback handler16:12
mriedemsean-k-mooney: ComputeManager.handle_lifecycle_event16:13
danpbif the lifecycle events are needed, then forward those onto that thread too16:13
sean-k-mooneymriedem: so what we actually need to do is check the job status here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1126-L113916:15
mriedemwe're not going to do that in the compute manager16:16
*** adrianc has quit IRC16:16
openstackgerritDan Smith proposed openstack/nova master: Batch results per cell when doing cross-cell listing  https://review.openstack.org/59269816:16
openstackgerritDan Smith proposed openstack/nova master: List instances from all cells explicitly  https://review.openstack.org/59371716:16
openstackgerritDan Smith proposed openstack/nova master: Make instance_list perform per-cell batching  https://review.openstack.org/59313116:16
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692716:16
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503316:16
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider sane  https://review.openstack.org/58459816:17
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459916:17
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464816:17
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503416:17
openstackgerritEric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504916:17
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623616:17
openstackgerritEric Fried proposed openstack/nova master: [placement] Regex consts for placement schema  https://review.openstack.org/59186316:17
danpbmriedem: yeah you'd want to check status in the libvirt driver, and if some action is required in the compute manager, trigger some callout for the compute manager to act on i guess16:17
mriedemhow does one even determine job status based on https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainJobInfo ?16:18
mriedemhttps://libvirt.org/html/libvirt-libvirt-domain.html#virDomainJobType ?16:18
*** jpena is now known as jpena|off16:19
openstackgerritEric Fried proposed openstack/nova master: [placement] Regex consts for placement schema  https://review.openstack.org/59186316:19
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692716:19
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503316:20
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider sane  https://review.openstack.org/58459816:20
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459916:20
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464816:20
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503416:20
openstackgerritEric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504916:20
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623616:20
mriedemah nvm i see how we get this info in nova16:21
danpbmriedem: yeah the job type field is what we're hooking off16:21
mriedemyup16:21
mriedemelif info.type == libvirt.VIR_DOMAIN_JOB_FAILED:16:21
mriedemdanpb: alright thanks i think i know what to do here,16:22
mriedemsean-k-mooney: i probably won't have something for you to test by your eod16:22
mriedemalthough your eod varies wildly16:22
mriedembut i'm in serious need of a shower and lunch at this point....i'm devolving16:23
mdboothmriedem danpb: IIRC we encountered limitations with this in the block rebase operation. Isn't there a race with the job info disappearing? If the job is no longer present, we no longer know if it failed or not, and the solution was supposed to be to consume events?16:23
sean-k-mooneyhaha yes it does. today i need to drive home which is an hour and a half away so i'll be leaving shortly. if you have something i'll test it as soon as i'm back online16:23
mdboothYeah, I wrote one of my comment essays about it16:24
mdboothhttps://github.com/openstack/nova/blob/master/nova/virt/libvirt/guest.py#L827-L83816:25
sean-k-mooneymdbooth: right. am can we check if the domain is still present on the source node? if it is it would mean it failed right?16:28
danpbmdbooth: with new enough libvirt the job will stick around16:31
danpbmdbooth: with older libvirt the _live_migration_monitor code has a heuristic to try to figure out if no-job == failed vs success16:32
mdboothdanpb: I got the impression at the time that eric was piling on heuristics in there for us, but really we weren't supposed to be doing that. Sounds like that's out of date?16:33
*** panda is now known as panda|off16:33
danpbmdbooth: what do you mean ?16:34
mdboothThere was also the heuristic for status.end.16:36
mdboothI just got the strong impression at the time that consuming events was the intended approach here.16:36
mdboothIf we've sat on the problem for that to be out of date... result :)16:36
*** dtantsur is now known as dtantsur|afk16:38
mriedemmdbooth: if i get no job, i'm going to not send the callback event to compute manager to trigger the port binding activation,16:38
mriedembecause worst case is the job failed and we're screwing up networking, which is what sean is seeing,16:39
mriedembest case is we don't know, but post live migration will still activate the port bindings,16:39
mriedemyou just have a bigger window of network downtime16:39
mriedem*plus*, if the job was successful and we go into post-copy, we activate the port bindings then too16:39
*** s10 has joined #openstack-nova16:39
mriedemi'm fairly certain this is 100% fool proof and will forever be bug free16:40
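The rule mriedem lays out above can be sketched as a small decision function. This is an illustration under stated assumptions, not the actual patch: the constants are local stand-ins that mirror libvirt's virDomainJobType values so the snippet runs without the libvirt bindings, `should_emit_migration_completed` is a hypothetical name, and treating a cancelled job like a failed one and a still-running job as "forward the event" are assumptions that the discussion does not settle.

```python
# Stand-ins mirroring libvirt's virDomainJobType values, so the sketch
# runs without the libvirt bindings installed.
VIR_DOMAIN_JOB_NONE = 0
VIR_DOMAIN_JOB_BOUNDED = 1
VIR_DOMAIN_JOB_UNBOUNDED = 2
VIR_DOMAIN_JOB_COMPLETED = 3
VIR_DOMAIN_JOB_FAILED = 4
VIR_DOMAIN_JOB_CANCELLED = 5

# Job types for which the "migration completed" lifecycle event is
# suppressed. NONE and FAILED come from the discussion; CANCELLED is
# an assumption made for this sketch.
_SUPPRESS = frozenset({
    VIR_DOMAIN_JOB_NONE,
    VIR_DOMAIN_JOB_FAILED,
    VIR_DOMAIN_JOB_CANCELLED,
})


def should_emit_migration_completed(job_type):
    """Decide whether to forward the event to the compute manager.

    With no job info we cannot tell success from failure, so we stay
    quiet: post live migration still activates the destination port
    bindings on success, just with a longer network downtime window.
    """
    return job_type not in _SUPPRESS
```

The trade-off matches the reasoning in the channel: suppressing on "no job" costs some network downtime in the success case, while forwarding on a failed job activates the destination bindings that rollback then deletes.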
*** s10 has quit IRC16:46
tssuryadansmith: would you prefer me returning (1) the failed_cell_uuids from get_instance_objects_sorted only if cell_down_support is set ? or (2) you don't want this flag creeping down even to that level and so we just return the tuple under all conditions ?16:47
tssuryaand deal with it in the api16:47
tssuryaI am asking because its called "get_instance_objects_sorted" and returning the tuple under all conditions kind of might be weird ?16:49
dansmithtssurya: just return it always, and let the api decide what to do with it based on the version I think16:49
tssuryadansmith: ack16:49
dansmithtssurya: you can change the name if you think that's important16:49
tssuryaI will put it up for review and we can see16:50
*** markvoelker has joined #openstack-nova16:50
tssuryathanks16:51
dansmithcool16:51
*** sambetts is now known as sambetts|afk16:52
sean-k-mooneymriedem:  can we get that on a tee shirt.16:52
*** NostawRm has quit IRC16:52
mriedemsean-k-mooney: my bug free guarantee?16:57
mriedemit only applies from today through labor day16:57
*** nicolasbock has joined #openstack-nova16:59
*** danpb has quit IRC17:00
*** sean-k-mooney has quit IRC17:00
dansmithoof, 329 in check17:02
melwitt.17:02
mriedemso ima also mark https://bugs.launchpad.net/nova/+bug/1788014 as rc potential17:04
openstackLaunchpad bug 1788014 in OpenStack Compute (nova) "when live migration fails due to a internal error rollback is not handeled correctly." [Medium,In progress] - Assigned to Matt Riedemann (mriedem)17:04
mriedemgiven it's a regression when live migration fails17:04
mriedemmy only question on that one is doing a tactical fix for the GA17:05
melwittok, so rc3 now17:06
*** NostawRm has joined #openstack-nova17:11
*** tbachman has quit IRC17:11
openstackgerritMerged openstack/nova master: Update resources once in update_available_resource  https://review.openstack.org/52002417:36
openstackgerritMerged openstack/nova master: Set policy_opt defaults in placement gabbi fixture  https://review.openstack.org/59417217:36
openstackgerritMerged openstack/nova master: Set policy_opt defaults in placement deploy unit test  https://review.openstack.org/59433417:39
dansmithmriedem: what's the plan here? https://review.openstack.org/#/c/591735/17:40
dansmithwe would like that to be in all current upstream stable, but will backport it ourselves if we're not going to do it upstream, so I just wanna know if I should hold off or not17:41
*** tbachman has joined #openstack-nova17:45
*** tssurya has quit IRC17:50
mriedemdansmith: i was waiting for you to rebase it17:53
dansmithoh heh sorry17:53
mriedemsni17:53
mriedemsnip snap17:53
mriedemmelwitt: yeah so probably rc317:53
mriedemthese are the 2 as of today https://bugs.launchpad.net/nova/+bugs?field.tag=rocky-rc-potential17:54
mriedemlooks like final rc is thursday17:54
mriedemthat first one has a fix in the gate17:54
mriedemi wouldn't mind bouncing of a few of you on the 2nd one17:54
mriedem*off a few17:54
melwittack17:55
mriedemdansmith: melwitt: so tl;dr, the issue in https://bugs.launchpad.net/nova/+bug/1788014 is that live migration fails and we trigger a lifecycle event which activates the port binding on the dest host incorrectly, it shouldn't do that,17:58
openstackLaunchpad bug 1788014 in OpenStack Compute (nova) "when live migration fails due to a internal error rollback is not handeled correctly." [Medium,In progress] - Assigned to Matt Riedemann (mriedem)17:58
mriedemwe're getting an event from libvirt but don't know if it's success or failure for the job,17:58
mriedemso what i could do if we want to be low risk with the fix for rocky GA is just not listen on that event and we'll activate port bindings on success like we always did before the change, and we'll still do the early port activating on post-copy events if the live migration is successful17:59
mriedemlong-term we could check the actual job status and if failed, don't trigger our lifecycle event, but that's riskier for rocky GA at this point IMO17:59
mriedemso i'd propose a 2-part fix, one partial that we backport and one for just stein18:00
*** luksky has joined #openstack-nova18:02
melwittok18:03
openstackgerritDan Smith proposed openstack/nova stable/queens: Fix cancel_all_events event name parsing  https://review.openstack.org/59208618:04
openstackgerritDan Smith proposed openstack/nova stable/queens: Wait for network-vif-plugged before starting live migration  https://review.openstack.org/59173518:04
openstackgerritDan Smith proposed openstack/nova stable/queens: DNM: Debug patch to test live migration waiting  https://review.openstack.org/59177518:04
melwittI guess I don't understand what the lifecycle event gives us if we already know success or failure18:05
melwittwithout listening for it18:05
mriedemin rocky we started listening on 2 new events,18:07
mriedemone is post-copy and one is migration completed18:07
mriedemthe idea is that as soon as we switch we activate the port bindings on the dest host for minimal downtime18:07
mriedemthe problem is we get the latter event even if live migration fails18:07
mriedemand we're not doing any conditional logic in that one to see if the job failed or not18:08
melwittI think I understand that part, it's just when you said "we'll activate port bindings on success like we always did before the change" it makes me not understand what gain the lifecycle event was supposed to give, if we already know success or failure without it18:09
*** tbachman has quit IRC18:11
mriedembecause if we can activate the network on the dest at the point the guest is paused to complete the transfer, it makes the network downtime window shorter18:11
*** tbachman has joined #openstack-nova18:12
melwittand without the lifecycle event, we activate the network later on after the guest is paused18:14
mriedemwe might still need my other fix for rollback, i'm not sure; what sean is hitting isn't a failure after post-copy18:14
mriedemwe activate the network after the guest transfer is complete and resumed on the dest18:14
*** tbachman has quit IRC18:16
melwittok, got it18:16
*** tbachman has joined #openstack-nova18:18
*** cfriesen has joined #openstack-nova18:21
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: Don't react to VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED events  https://review.openstack.org/59450818:30
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: Don't react to VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED events  https://review.openstack.org/59450818:33
*** NobodyCam has quit IRC18:37
*** Kevin_Zheng has quit IRC18:37
*** kklimonda has quit IRC18:37
*** leifz has quit IRC18:37
*** andrewbogott has quit IRC18:37
*** ttx has quit IRC18:37
*** alaski has quit IRC18:37
*** kencjohnston has quit IRC18:37
*** zer0c00l has quit IRC18:37
*** fungi has quit IRC18:37
*** kencjohnston_ has joined #openstack-nova18:38
*** kosamara has quit IRC18:42
*** TheJulia has quit IRC18:42
*** fyx has quit IRC18:42
*** mnaser has quit IRC18:42
*** nicholas has quit IRC18:42
*** TheJulia has joined #openstack-nova18:48
*** fungi has joined #openstack-nova18:48
*** eharney has quit IRC18:52
*** eharney has joined #openstack-nova18:59
*** priteau has quit IRC19:00
*** mnaser has joined #openstack-nova19:09
*** luksky11 has joined #openstack-nova19:14
*** cdent has quit IRC19:16
*** luksky has quit IRC19:17
*** bnemec has quit IRC19:20
*** bnemec has joined #openstack-nova19:20
*** r-daneel has quit IRC19:24
*** r-daneel has joined #openstack-nova19:24
*** moshele has quit IRC19:25
*** theanalyst has quit IRC19:33
*** melwitt has quit IRC19:33
*** sdake has quit IRC19:33
*** melwitt has joined #openstack-nova19:34
*** sdake has joined #openstack-nova19:34
openstackgerritMatt Riedemann proposed openstack/nova master: libvirt: check job status for VIR_DOMAIN_EVENT_SUSPENDED_MIGRATED event  https://review.openstack.org/59452719:36
*** moshele has joined #openstack-nova19:36
*** luksky has joined #openstack-nova19:36
mriedemmelwitt: dansmith: alright fyi ^ would need sean-k-mooney to test out the 2nd more complicated fix since he has the env that recreates the bug19:37
mriedemand i need to run to an appt19:37
*** luksky11 has quit IRC19:38
*** evrardjp has quit IRC19:39
*** Vek has quit IRC19:39
*** pcarver has quit IRC19:39
*** jistr|off has quit IRC19:39
*** aarents has quit IRC19:39
*** jcosmao has quit IRC19:39
*** ingy has quit IRC19:39
*** gryf has quit IRC19:39
mriedemalso threw that stuff in the rc todo etherpad19:39
*** mriedem is now known as mriedem_afk19:39
*** sambetts|afk has quit IRC19:42
melwittack19:43
*** gryf has joined #openstack-nova19:44
*** sambetts_ has joined #openstack-nova19:45
*** moshele has quit IRC19:45
*** samueldmq has quit IRC19:46
*** _hemna has joined #openstack-nova19:47
*** jistr has joined #openstack-nova19:47
*** BlackDex has quit IRC19:48
*** samueldmq has joined #openstack-nova19:49
*** BlackDex has joined #openstack-nova19:52
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692719:54
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503319:54
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider raise  https://review.openstack.org/58459819:54
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459919:54
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464819:55
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503419:55
openstackgerritEric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504919:55
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623619:55
*** luksky11 has joined #openstack-nova19:56
*** eharney has quit IRC19:59
*** theanalyst has joined #openstack-nova19:59
*** luksky has quit IRC20:00
*** r-daneel has quit IRC20:03
*** r-daneel has joined #openstack-nova20:03
*** tbachman has quit IRC20:08
*** Tahvok_ has joined #openstack-nova20:12
*** gmann_ has joined #openstack-nova20:13
*** jaosorior_ has joined #openstack-nova20:14
dansmithso I20:14
dansmitham pretty sure my batching breaks tssurya's down cell in its current form20:14
dansmithmore specifically, it'll cause scatter gather to never notice failures20:15
*** jaosorior has quit IRC20:17
*** edmondsw_ has joined #openstack-nova20:18
*** Tahvok has quit IRC20:19
*** edmondsw has quit IRC20:19
*** gmann has quit IRC20:19
*** edmondsw_ is now known as edmondsw20:19
*** gmann_ is now known as gmann20:19
*** Tahvok_ is now known as Tahvok20:19
melwittdansmith: it's specific to the batching? it looks like pre-batch the code still removes the "did not respond" and "raised exception" results20:24
dansmithstill removes?20:24
melwittdansmith: this part looks like it's removing "down cells" from results? https://review.openstack.org/#/c/592698/5/nova/compute/multi_cell_list.py@31120:25
dansmithright but with the batching we never hit that because we don't start executing the queries until the heapq20:26
melwittoh, I see20:27
melwitthmm20:27
dansmithI'll just have to get a little more into the middle of that process and we just won't get the standard handlers from scatter gather20:29
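The exchange above turns on lazy evaluation: if the per-cell queries only start running once the heapq merge consumes them, a failure check done right after the "scatter" step has nothing to find. The toy sketch below illustrates that effect with plain Python generators and `heapq.merge`. It is not nova's code; `cell_results`, `scatter`, and the cell names are hypothetical stand-ins.

```python
import heapq


def cell_results(name, rows, fail=False):
    """Hypothetical per-cell query. It is a generator, so the body
    (including the failure) does not run until something iterates it."""
    if fail:
        raise RuntimeError(f"cell {name} is down")
    for row in rows:
        yield row


def scatter(cells):
    # Calling a generator function only creates the generator object;
    # a down cell raises nothing at this point.
    return {name: cell_results(name, rows, fail)
            for name, (rows, fail) in cells.items()}


cells = {"cell1": ([1, 4, 7], False), "cell2": ([2, 5], True)}
results = scatter(cells)

# A failure check done here (where a scatter-gather style handler would
# normally filter out bad cells) finds nothing to remove.
failures_seen_early = [n for n, r in results.items()
                       if isinstance(r, Exception)]

# The failure only surfaces once the sorted merge pulls from each cell.
merged = heapq.merge(*results.values())
try:
    list(merged)
    late_failure = None
except RuntimeError as exc:
    late_failure = str(exc)

print(failures_seen_early, late_failure)
```

In this sketch the only place the error can be caught is around the consumer of the merge, which is roughly what "get a little more into the middle of that process" suggests.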
*** erlon has quit IRC20:39
*** holser_ has quit IRC20:44
*** awaugama has quit IRC20:51
*** harlowja has joined #openstack-nova21:02
*** mriedem_afk is now known as mriedem21:04
mriedemhuh, you don't see instance.save() messaging timeouts in the gate very often http://logs.openstack.org/98/591898/3/check/nova-next/2d5e60c/logs/screen-n-cpu.txt.gz#_Aug_21_17_10_17_73226321:13
mriedemguessing this isn't good http://logs.openstack.org/98/591898/3/check/nova-next/2d5e60c/logs/screen-n-cpu.txt.gz#_Aug_21_17_10_17_42642021:14
mriedemseen here too http://logs.openstack.org/85/567785/7/check/nova-tox-functional-py35/d5a8036/job-output.txt#_2018-08-17_09_06_20_24142721:16
*** munimeha1 has quit IRC21:21
*** luksky11 has quit IRC21:29
mriedemmelwitt: what do you think is missing from placement for shared storage providers support? as far as i know, it's the nova stuff that's lacking as we identified ~2 weeks ago21:36
mriedemfor sure move operations are not ready for shared storage providers21:37
mriedemthe placement side of shared storage is pretty simple though, and has been done for a long time21:37
openstackgerritDmitry Sutyagin proposed openstack/nova-specs master: Allow disabling KSM / mem-merge via extra spec  https://review.openstack.org/59319721:40
melwittmriedem: it's not that I think anything is missing, I'm pragmatically thinking of the integration work and if there is something unforeseen we need to fix. I expect bugs to shake out when we integrate something for the first time. I think not having bugs shake out will be the rarer case21:41
dansmithmriedem: I think we were saying the same about aggregates being done in placement before we added the placement filter stuff and realized we needed tweaks21:41
dansmithand surely thought what was being done in placement for NRPs was going to be usable by nova until we thought about it21:41
mriedemdansmith: the member_of thing with aggs right?21:46
dansmithwe needed member_of with and and or21:46
dansmithand I meant prefilter above21:46
dansmithor request filter21:46
dansmithor whatever that guy called it21:46
dansmiththe granular request stuff is another similar example, where we try to use what we think is just clean resource requests for actual nova stuff and realize we need this giant complex syntax instead21:47
dansmithall that could be developed in two separate rooms for sure, just like multiattach or multiple host bindings21:47
dansmithand hot damn, in a few years we'll be golden21:48
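For readers unfamiliar with the "member_of with and and or" point: as far as I understand the placement API, a single `member_of` value with an `in:` prefix means "in any of these aggregates" (OR), and repeating the `member_of` parameter means "must satisfy all of them" (AND). The sketch below only builds such a query string; the helper name `member_of_params` and the aggregate names are made up for illustration (real values would be aggregate UUIDs), and it does not call placement.

```python
from urllib.parse import urlencode


def member_of_params(groups):
    """groups is a list of lists of aggregate identifiers.
    Each inner list is ORed together; the outer list is ANDed.
    (Hypothetical helper, not part of nova or placement.)"""
    params = []
    for aggs in groups:
        value = aggs[0] if len(aggs) == 1 else "in:" + ",".join(aggs)
        params.append(("member_of", value))
    return params


# Providers in (agg1 OR agg2) AND in agg3.
query = urlencode(
    [("resources", "VCPU:1,MEMORY_MB:512")]
    + member_of_params([["agg1", "agg2"], ["agg3"]]),
    safe=":,",  # keep ':' and ',' readable instead of percent-encoding them
)
print(query)
```

The resulting string would be appended to a `GET /allocation_candidates` request; which microversion is needed for the repeated form is left for the API reference to confirm.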
dansmithmriedem: by the way, I wonder if the object action rpc methods to conductor ought to be long_rpc_timeouts21:49
melwittyeah. in case it wasn't clear, the thing I care about is delivering stuff that operators and users need, that we know they need, and I don't see how becoming two separate groups helps that21:49
dansmithmriedem: they really should never hang for a long time, but just piling up more because we time out, run the periodic again and generate more traffic is probably worse21:49
dansmithmriedem: re: your save timeout21:50
*** itlinux has quit IRC21:51
mriedemdansmith: it seems something weird happened with the servicegroup in that failure21:52
dansmithyeah, not related to your actual thing, but just thinking of what that reminds me of, which is conductor is overwhelmed21:52
mriedemdansmith: also, i was talking with efried about kevin's exclusive trait thing, and guess what https://review.openstack.org/#/c/593475/21:58
mriedemit's already been proposed :)21:58
dansmithwell, having not read it and skimmed the -1, I'm assuming it's for the same reason I think it's a non-starter22:00
*** rha has quit IRC22:02
*** rha has joined #openstack-nova22:03
mriedemencoding metadata in a trait name22:03
dansmithand it's only one special prefix,22:05
dansmithwhich means one class22:05
mriedemCUSTOM_INTEL_FOR_SERIOUS_WORKLOADS22:06
mriedemi can see it now22:06
efriedPOST /traits/CUSTOM_INTEL_FOR_SERIOUS_WORKLOADS22:08
efried{ 'name': 'CUSTOM_INTEL_FOR_SERIOUS_WORKLOADS',22:08
efried  'required': true,22:08
efried  'allowed_user_ids': [...],22:08
efried  'allowed_project_ids': [...],22:08
efried  ...22:08
efried}22:08
mriedemqueue jay vomit22:08
mriedem*cue22:09
efriedwe could do the same thing with aggregates22:09
mriedemdansmith: re granular, we could still do POST queries....22:09
mriedemjust saying22:09
mriedemgranular request group syntax is likely something that could benefit from some kind of flavor extra specs validate api22:10
*** rcernin has joined #openstack-nova22:10
efriedno argument there22:13
*** mriedem is now known as mriedem_away22:22
*** moshele has joined #openstack-nova22:46
openstackgerritEric Fried proposed openstack/nova master: [placement] Regex consts for placement schema  https://review.openstack.org/59186323:05
openstackgerritEric Fried proposed openstack/nova master: [placement] Add /reshaper handler for POST  https://review.openstack.org/57692723:05
openstackgerritEric Fried proposed openstack/nova master: reshaper: Look up provider if not in inventories  https://review.openstack.org/58503323:05
openstackgerritEric Fried proposed openstack/nova master: Make get_allocations_for_resource_provider raise  https://review.openstack.org/58459823:05
openstackgerritEric Fried proposed openstack/nova master: Report client: Real get_allocs_for_consumer  https://review.openstack.org/58459923:05
openstackgerritEric Fried proposed openstack/nova master: Report client: get_allocations_for_provider_tree  https://review.openstack.org/58464823:05
openstackgerritEric Fried proposed openstack/nova master: Report client: _reshape helper, placement min bump  https://review.openstack.org/58503423:05
openstackgerritEric Fried proposed openstack/nova master: Report client: update_from_provider_tree w/reshape  https://review.openstack.org/58504923:05
openstackgerritEric Fried proposed openstack/nova master: Compute: Handle reshaped provider trees  https://review.openstack.org/57623623:05
*** erlon has joined #openstack-nova23:06
*** r-daneel has quit IRC23:09
*** erlon has quit IRC23:22
* melwitt will bbl23:28
*** Kevin_Zheng has joined #openstack-nova23:35
*** macza has quit IRC23:37
openstackgerritBrin Zhang proposed openstack/nova-specs master: Resource retrieving: add change-before filter  https://review.openstack.org/59197623:42
*** mlavalle has quit IRC23:42
openstackgerritDan Smith proposed openstack/nova master: Batch results per cell when doing cross-cell listing  https://review.openstack.org/59269823:48
openstackgerritDan Smith proposed openstack/nova master: List instances from all cells explicitly  https://review.openstack.org/59371723:48
openstackgerritDan Smith proposed openstack/nova master: Make instance_list perform per-cell batching  https://review.openstack.org/59313123:48
openstackgerritDan Smith proposed openstack/nova master: Record cell success/failure/timeout in CrossCellLister  https://review.openstack.org/59426523:48
openstackgerritDan Smith proposed openstack/nova master: Make CELL_TIMEOUT a constant  https://review.openstack.org/59457023:48
openstackgerritDan Smith proposed openstack/nova master: Stash the cell uuid on the context when targeting  https://review.openstack.org/59457123:48
openstackgerritDan Smith proposed openstack/nova master: Make RecordWrapper record RequestContext and expose cell_uuid  https://review.openstack.org/59457223:48
*** gbarros has quit IRC23:49
*** gbarros has joined #openstack-nova23:51
