Tuesday, 2019-05-07

*** mriedem has quit IRC00:09
*** takashin has joined #openstack-nova00:16
*** brinzhang has joined #openstack-nova00:45
*** takashin has quit IRC01:13
*** whoami-rajat has joined #openstack-nova01:16
*** dannins has joined #openstack-nova01:19
*** tiendc has joined #openstack-nova01:31
*** logan- has quit IRC01:33
*** logan- has joined #openstack-nova01:37
*** Swami has quit IRC01:44
*** cdent has quit IRC01:44
*** bbowen_ has quit IRC02:04
*** hongbin has joined #openstack-nova02:19
*** JamesBenson has joined #openstack-nova02:34
*** cfriesen has quit IRC02:39
*** udesale has joined #openstack-nova02:42
*** rcernin has quit IRC02:43
*** samueldmq has quit IRC02:46
*** boxiang has quit IRC02:46
*** boxiang has joined #openstack-nova02:50
*** lbragstad has quit IRC02:54
*** rcernin has joined #openstack-nova03:29
*** cdent has joined #openstack-nova03:44
*** brinzhang has quit IRC03:44
*** brinzhang has joined #openstack-nova03:45
openstackgerritChason Chan proposed openstack/python-novaclient master: Tiny fix of documentation  https://review.opendev.org/65752203:57
*** cdent has quit IRC03:58
*** zhongjun2 has quit IRC04:05
*** tkajinam has quit IRC04:07
*** tkajinam has joined #openstack-nova04:08
*** tkajinam has quit IRC04:08
*** tkajinam has joined #openstack-nova04:08
*** KH-Jared has quit IRC04:14
*** ivve has quit IRC04:17
*** ricolin has joined #openstack-nova04:26
*** hongbin has quit IRC04:33
*** janki has joined #openstack-nova04:37
openstackgerritTakashi NATSUME proposed openstack/nova master: Fix assert methods in unit tests  https://review.opendev.org/65752504:48
*** KH-Jared has joined #openstack-nova05:04
*** JamesBenson has quit IRC05:11
*** ivve has joined #openstack-nova05:20
*** ricolin has quit IRC05:24
*** JamesBenson has joined #openstack-nova05:28
*** JamesBenson has quit IRC05:35
*** HW-Peter has quit IRC05:41
*** maciejjozefczyk_ has joined #openstack-nova05:57
*** KH-Jared has quit IRC06:06
*** lpetrut has joined #openstack-nova06:07
*** pcaruana has joined #openstack-nova06:21
*** slaweq has joined #openstack-nova06:36
*** rpittau|afk is now known as rpittau06:47
*** hoonetorg has quit IRC07:04
*** hoonetorg has joined #openstack-nova07:08
*** KH-Jared has joined #openstack-nova07:11
*** threestrands has joined #openstack-nova07:11
*** KH-Jared has quit IRC07:15
*** tosky has joined #openstack-nova07:16
*** ccamacho has joined #openstack-nova07:23
*** ivve has quit IRC07:25
*** tesseract has joined #openstack-nova07:26
*** rcernin has quit IRC07:27
*** ivve has joined #openstack-nova07:33
*** ivve has quit IRC07:44
*** ivve has joined #openstack-nova07:59
*** trident has quit IRC08:01
*** trident has joined #openstack-nova08:02
*** ccamacho has quit IRC08:03
*** threestrands has quit IRC08:18
*** tkajinam has quit IRC08:35
*** maciejjozefczyk_ has quit IRC08:37
*** maciejjozefczyk_ has joined #openstack-nova08:37
*** priteau has joined #openstack-nova08:44
*** ralonsoh has joined #openstack-nova08:48
openstackgerritMerged openstack/nova master: Use migration_status during volume migrating and retyping  https://review.opendev.org/63722409:07
*** KH-Jared has joined #openstack-nova09:24
*** alex_xu has quit IRC09:28
*** jaosorior has joined #openstack-nova09:48
*** KH-Jared has quit IRC09:53
*** zbr has joined #openstack-nova10:04
*** Woutifier has joined #openstack-nova10:07
*** Woutifier has left #openstack-nova10:07
*** KH-Jared has joined #openstack-nova10:07
*** pcaruana has quit IRC10:19
*** yan0s has joined #openstack-nova10:24
*** bbowen_ has joined #openstack-nova10:41
*** udesale has quit IRC10:42
*** tbachman has quit IRC10:42
*** udesale has joined #openstack-nova10:42
*** bbowen_ has quit IRC10:50
*** pcaruana has joined #openstack-nova10:55
*** panda is now known as panda|lunch10:56
*** mdbooth has joined #openstack-nova10:57
*** priteau has quit IRC11:11
*** rpittau has quit IRC11:14
*** rpittau has joined #openstack-nova11:17
openstackgerritMerged openstack/nova master: Log when port resource is leaked during port delete  https://review.opendev.org/65707911:27
openstackgerritMerged openstack/nova master: Fix assert methods in unit tests  https://review.opendev.org/65752511:27
openstackgerritJohn Garbutt proposed openstack/nova-specs master: Policy Default Refresh spec  https://review.opendev.org/54785011:30
openstackgerritLee Yarwood proposed openstack/nova stable/stein: Use migration_status during volume migrating and retyping  https://review.opendev.org/65757511:39
openstackgerritJohn Garbutt proposed openstack/nova-specs master: Add Unified Limits Spec  https://review.opendev.org/60220111:39
openstackgerritLee Yarwood proposed openstack/nova stable/rocky: Use migration_status during volume migrating and retyping  https://review.opendev.org/65757711:43
openstackgerritLee Yarwood proposed openstack/nova stable/queens: Use migration_status during volume migrating and retyping  https://review.opendev.org/65757911:43
openstackgerritBalazs Gibizer proposed openstack/nova stable/stein: Log when port resource is leaked during port delete  https://review.opendev.org/65758111:50
*** lyarwood has quit IRC11:51
*** bbowen has joined #openstack-nova11:52
*** bbowen_ has joined #openstack-nova11:53
openstackgerritBalazs Gibizer proposed openstack/nova-specs master: Resource provider - request group mapping in allocation candidate  https://review.opendev.org/59760111:54
*** ygk_12345 has joined #openstack-nova11:55
*** bbowen has quit IRC11:56
*** mriedem has joined #openstack-nova12:05
ygk_12345hi all12:05
ygk_12345can someone help me get a vm's flavor from the nova database12:05
mriedeminstance_extras.flavor column12:06
mriedem*instance_extra12:06
mriedemhttps://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L38612:06
gibiygk_12345: mysql nova_cell1 -e 'select flavor from instance_extra;'12:08
gibiahh mriedem was faster12:08
ygk_12345gibi: ok thanks got it :)12:09
ygk_12345gibi: but is there a way to get that name of the flavor only in the output ?12:09
mriedemygk_12345: original_name is provided in the output i think12:09
mriedemhttps://developer.openstack.org/api-ref/compute/?expanded=show-server-details-detail#show-server-details12:10
mriedemflavor.original_name12:10
mriedemthat's available since microversion 2.4712:10
ygk_12345mriedem: is flavor a table ?12:10
gibiand there is "name" key in the flavor json in the database12:11
mriedemflavors is a table in the nova_api db yes12:11
mriedembut the flavor used to create the instance is not necessarily still in that table,12:11
mriedemor with the same values, i.e. extra specs12:11
mriedemthe copy of the flavor used to create or last resize the instance is in the instance_extra table12:11
ygk_12345gibi: yes I see but how to get it only in the query output. I am getting all the cluttered output12:12
mriedemygk_12345: it's stored as a json blob in the instance_extra table12:12
gibiygk_12345: ^^12:12
mriedemso you'd have to deserialize it12:12
mriedemgibi: were you planning on backporting https://review.opendev.org/#/q/topic:bug/1819923+(status:open+OR+status:merged) ? i'm wondering about ordering for https://review.opendev.org/#/c/651945/12:12
*** tbachman has joined #openstack-nova12:12
ygk_12345gibi: mriedem now I see in the nova_api db. I  will extract it somehow. Thanks for your time guys12:13
mriedemjust remember what i said about the flavor used to create the server may not be in the flavors table anymore12:14
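
A minimal sketch of the deserialization step mriedem describes. It assumes the instance_extra.flavor column holds a JSON document in the serialized versioned-object layout, with a top-level "cur" entry wrapping "nova_object.data". That layout and the sample blob below are assumptions for illustration, not taken from this log, so compare them against a real row before relying on this.

```python
import json


def flavor_name(flavor_blob):
    """Return the flavor name from an instance_extra.flavor JSON string."""
    doc = json.loads(flavor_blob)
    # Assumed layout:
    # {"cur": {"nova_object.data": {"name": ...}}, "old": ..., "new": ...}
    current = doc.get("cur") or {}
    data = current.get("nova_object.data") or {}
    return data.get("name")


# Hypothetical sample row, trimmed to the fields used here.
sample = json.dumps({
    "cur": {
        "nova_object.name": "Flavor",
        "nova_object.data": {"name": "m1.small", "vcpus": 1},
    },
    "old": None,
    "new": None,
})
print(flavor_name(sample))
```

Feeding it the output of gibi's query one row at a time would give just the name. As noted above, this is the copy of the flavor stored with the instance, which may no longer match the flavors table in nova_api.
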
gibimriedem: yes, I'm planning to backport heal port allocation to stein12:15
gibimriedem: if https://review.opendev.org/#/c/651945/ merges then I will handle the conflicts12:16
gibior we might consider https://review.opendev.org/#/c/651945/ also as a backport12:18
*** nicolasbock has joined #openstack-nova12:20
*** mchlumsky has joined #openstack-nova12:28
*** alex_xu has joined #openstack-nova12:35
*** brinzhang has quit IRC12:41
*** _hemna has quit IRC12:47
*** hemna has joined #openstack-nova12:48
*** tiendc has quit IRC12:53
*** mmethot has joined #openstack-nova13:04
*** tssurya has joined #openstack-nova13:05
*** jdillaman has joined #openstack-nova13:12
*** lbragstad has joined #openstack-nova13:20
mriedemtssurya: i'm reviewing your locked_reason patch so hold up if you're making changes13:22
tssuryamriedem: yea I am13:22
tssuryaok will hold up :)13:23
* bauzas is back with a bit of jetlag13:23
tssuryabauzas: hi-fi13:23
bauzas\o13:23
* gibi survives the jetlag with coffee13:24
tssuryagibi: thanks for the review on the locked_reason patch, sorry for the big patch size, couldn't help it13:24
gibitssurya: no worries. at the end it was fairly easy to review13:25
gibitssurya: thanks for working on that feature13:25
*** panda|lunch is now known as panda13:25
tssuryamriedem: while you are reviewing could you confirm if you are okay with the comment here? : https://review.opendev.org/#/c/648662/8/api-ref/source/servers-actions.inc@64013:25
*** hemna has quit IRC13:25
tssuryagibi: :)13:25
*** hemna has joined #openstack-nova13:26
*** mgoddard has quit IRC13:39
*** mgoddard has joined #openstack-nova13:40
*** lpetrut has quit IRC13:40
*** artom has quit IRC13:45
*** jaosorior has quit IRC13:49
openstackgerritBalazs Gibizer proposed openstack/nova stable/stein: Reproduce bug #1819460 in functional test  https://review.opendev.org/65760013:49
openstackbug 1819460 in OpenStack Compute (nova) stein "instance stuck in BUILD state due to unhandled exceptions in conductor" [Medium,Confirmed] https://launchpad.net/bugs/181946013:49
openstackgerritBalazs Gibizer proposed openstack/nova stable/stein: Fix exception type in test_boot_reschedule_fill_provider_mapping_raises  https://review.opendev.org/65760113:49
openstackgerritBalazs Gibizer proposed openstack/nova stable/stein: Handle placement error during re-schedule  https://review.opendev.org/65760213:49
openstackgerritBalazs Gibizer proposed openstack/nova stable/stein: Only call _fill_provider_mapping if claim succeeds  https://review.opendev.org/65760313:49
mriedemhuh, auto_disk_config is a valid filter parameter for listing servers but not handled in the DB API at all from what i can tell13:55
mriedemoh well, it's xen only anyway13:55
openstackgerritDan Smith proposed openstack/nova master: Add a workaround config toggle to refuse ceph image upload  https://review.opendev.org/65707814:02
*** boxiang has quit IRC14:02
*** boxiang has joined #openstack-nova14:03
*** tbachman has quit IRC14:03
*** jdillaman has quit IRC14:03
*** ccamacho has joined #openstack-nova14:06
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/stein: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65567814:08
*** artom has joined #openstack-nova14:12
*** jistr is now known as jistr|call14:14
*** janki has quit IRC14:14
mriedemtssurya: done14:17
mriedemmight be a record14:17
tssuryamriedem: thanks! record for a hard rain of comments?14:18
tssurya:D14:18
*** artom has quit IRC14:18
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/rocky: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65569214:18
*** pcaruana has quit IRC14:19
mriedemtssurya: yeah14:19
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/queens: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65569414:19
tssuryamriedem: heh but thanks a lot for the time, I'll work on them14:19
tssuryamriedem: btw am I causing these tests to fail ? http://logs.openstack.org/62/648662/8/check/openstack-tox-py36/7604398/testr_results.html.gz14:20
tssuryaI couldn't find the cause14:20
mriedemno14:21
mriedemhttp://status.openstack.org/elastic-recheck/#182325114:21
tssuryamriedem: oh cool thanks14:21
*** jistr|call is now known as jistr14:23
*** tbachman has joined #openstack-nova14:25
*** ygk_12345 has quit IRC14:27
*** JamesBenson has joined #openstack-nova14:27
mriedemdansmith: did you see my question/concern on the image type request filter patch regarding bfv instances?14:29
mriedemi didn't -1 yet14:29
dansmithno14:29
*** mrch_ has joined #openstack-nova14:29
mrch_has anyone had problems bringing up nova-compute after upgrading queens>rocky (centos)?14:30
*** mlavalle has joined #openstack-nova14:31
*** JamesBenson has quit IRC14:31
*** JamesBenson has joined #openstack-nova14:34
*** mdbooth has quit IRC14:35
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/rocky: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65569214:37
*** pcaruana has joined #openstack-nova14:38
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/queens: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65569414:38
*** efried is now known as efried_pto14:39
openstackgerritMatt Riedemann proposed openstack/nova master: api-ref: fix mention of all_tenants filter for non-admins  https://review.opendev.org/65762014:49
mriedemsorrison: you know how you added the "os_compute_api:servers:allow_all_filters" policy rule? if using all_tenants as the filter parameter, that could still be filtered out for non-admins because all_tenants has its own specific rules (os_compute_api:servers:index:get_all_tenants and os_compute_api:servers:detail:get_all_tenants)14:50
mriedemit seems that os_compute_api:servers:allow_all_filters should also apply to the all_tenants filter, yeah?14:51
*** Sundar has joined #openstack-nova14:53
openstackgerritDan Smith proposed openstack/nova master: Expose Hyper-V supported image types  https://review.opendev.org/65513715:01
openstackgerritDan Smith proposed openstack/nova master: Make libvirt expose supported image types  https://review.opendev.org/65345415:01
openstackgerritDan Smith proposed openstack/nova master: Add ironic driver image type capabilities  https://review.opendev.org/65572915:01
openstackgerritDan Smith proposed openstack/nova master: Add vmware driver image type capabilities  https://review.opendev.org/65573015:01
openstackgerritDan Smith proposed openstack/nova master: Add xenapi driver image type capabilities  https://review.opendev.org/65573115:01
openstackgerritDan Smith proposed openstack/nova master: Add zvm driver image type capabilities  https://review.opendev.org/65573215:01
openstackgerritDan Smith proposed openstack/nova master: Add image type request filter  https://review.opendev.org/65641315:01
openstackgerritDan Smith proposed openstack/nova master: Enable image type query support in nova-next  https://review.opendev.org/65690315:01
openstackgerritDan Smith proposed openstack/nova master: Add docs for image type support request filter  https://review.opendev.org/65702515:01
dansmithmriedem: I've been thinking about the tracing thing too15:03
*** cfriesen has joined #openstack-nova15:03
*** Sundar has quit IRC15:04
dansmithmore about making the existing filters log what they're doing, but the timer thing is legit too15:05
*** tosky has quit IRC15:06
*** mgariepy has quit IRC15:07
mriedemi'd think that osprofiler should be handling the time spent in each filter, but not sure how granular that is15:08
dansmithmeaning we need some sort of hook thing?15:10
*** imacdonn has quit IRC15:11
*** imacdonn has joined #openstack-nova15:11
mriedemi'm not sure if the @profiler.trace_cls decorator on the SchedulerAPI (rpc) gives us what we'd want there15:12
mriedemsince select_destinations is a call it might, but not sure15:12
*** mrch_ has quit IRC15:12
openstackgerritSurya Seetharaman proposed openstack/nova master: [Docs] Change the server query parameter display into a list docs.  https://review.opendev.org/65762415:12
mriedemthere was a recent ML thread about nicer osprofiler visualizations but i can't find it15:14
*** tbachman has quit IRC15:15
*** pcaruana has quit IRC15:16
mriedemoh right, something like this http://logs.openstack.org/69/617269/3/check/tempest-smoke-py3-osprofiler-redis/d5563c0/osprofiler-traces/trace-58ed6da6-f2c8-418c-9c60-337bf00ad86a.html.gz15:16
openstackgerritSurya Seetharaman proposed openstack/nova master: [Docs] Change the server query parameter display into a list.  https://review.opendev.org/65762415:16
mriedemnot sure how to easily find a server create trace in there15:16
*** ccamacho has quit IRC15:16
*** hamzy has quit IRC15:20
dansmithmriedem: well, I'll start something and you can add to it if you want15:21
dansmithit'd be really nice if we could see what the reqspec looks like before/after in a generic way, but that's probably too detailed and too fragile, vs. each one just logging what it's doing15:22
mriedemi guess http://logs.openstack.org/69/617269/3/check/tempest-smoke-py3-osprofiler-redis/d5563c0/osprofiler-traces/trace-46016681-2a3d-4ca2-adf7-c4941649d753.html.gz and filter on nova-scheduler15:22
mriedemdoesn't give me the time per filter though15:23
mriedemjust says that like select_destinations took 302ms15:23
*** tbachman has joined #openstack-nova15:24
*** artom has joined #openstack-nova15:27
*** mgariepy has joined #openstack-nova15:29
*** ivve has quit IRC15:29
mriedemah timeutils already has something i was thinking to use15:29
dansmithI was just using stopwatch15:29
mriedemlooks like you can just use oslo_utils.timeutils.time_it15:30
dansmithyeah, but I want to not log if the thing appears disabled15:31
*** jdillaman has joined #openstack-nova15:31
mriedemthe decorator takes an enabled kwarg15:32
dansmithbut we don't know until after it's run, unless we change something15:32
*** yan0s has quit IRC15:32
dansmithI was just going to have them return a boolean if they're enabled/effective15:32
dansmithlike, if we're is_bfv=True, no point in logging 0.015:33
dansmithI could only log if above a threshold, but then you don't know if something is running or running fastly15:33
mriedemok that's more granular than i was thinking at first, like this is pretty easy:15:33
mriedemhttp://paste.openstack.org/show/750876/15:33
*** mdbooth has joined #openstack-nova15:33
dansmithyeah, that works for the two simple ones, but not for my new one15:33
dansmithbecause it has three reasons to not log15:34
dansmithgive me a few and I'll push up what I was thinking15:34
mriedemmap_az_to_placement_aggregate also has several reasons to not do anything15:34
mriedemso depends on how granular you want this15:34
openstackgerritDan Smith proposed openstack/nova master: WIP: Add extra logging to request filters  https://review.opendev.org/65762915:38
dansmithmriedem: ^15:38
dansmithgeneric timer and contextual logs from each filter about what change it's making15:39
dansmithwhich I think is what I'd want to validate or diagnose15:39
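
A rough sketch of the idea discussed above: time each request filter, but log only when the filter reports that it actually did something. It is not the code from the WIP review. It uses the standard library rather than oslo_utils.timeutils to stay self-contained, and every name in it (trace_request_filter, example_filter, the dict-based request spec, HYPOTHETICAL_TRAIT) is a placeholder.

```python
import functools
import logging
import time

LOG = logging.getLogger(__name__)


def trace_request_filter(fn):
    """Log how long a request filter took, but only if it reports doing work."""
    @functools.wraps(fn)
    def wrapper(ctxt, request_spec):
        start = time.monotonic()
        ran = fn(ctxt, request_spec)
        elapsed = time.monotonic() - start
        # Filters return True when enabled/effective, False when they skipped.
        if ran:
            LOG.debug("Request filter %r took %.3f seconds",
                      fn.__name__, elapsed)
        return ran
    return wrapper


@trace_request_filter
def example_filter(ctxt, request_spec):
    # Hypothetical filter: nothing to do for boot-from-volume requests.
    if request_spec.get("is_bfv"):
        return False
    request_spec.setdefault("required_traits", set()).add("HYPOTHETICAL_TRAIT")
    return True
```

Returning a boolean keeps the decision about whether a timing line is worth logging inside the filter itself, which is what lets a filter with several early-exit reasons stay quiet.
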
*** tbachman has quit IRC15:39
*** tbachman has joined #openstack-nova15:43
*** sapd1_x has joined #openstack-nova15:45
*** hamzy has joined #openstack-nova15:48
*** tbachman has quit IRC15:48
mriedemi think your unit tests in https://review.opendev.org/#/c/656413/ are going to blow up15:50
*** rpittau is now known as rpittau|afk15:51
dansmithI thought is_bfv was usage safe so it would just return false15:51
dansmithbut I didn't run them15:51
dansmithor maybe we set it somewhere before we get too far, but I won't hit that in the unit tests15:52
mriedemconductor sets it before the scheduler for a move, and api for a new server, but your unit tests won't lazy-load the value15:52
dansmithyeah15:53
dansmithI need to remove the empty image from the volume case anyway15:54
*** gyee has joined #openstack-nova15:55
openstackgerritDan Smith proposed openstack/nova master: Enable image type query support in nova-next  https://review.opendev.org/65690315:58
openstackgerritDan Smith proposed openstack/nova master: Add docs for image type support request filter  https://review.opendev.org/65702515:58
*** cdent has joined #openstack-nova16:04
*** beagles has quit IRC16:11
*** beagles has joined #openstack-nova16:13
*** manjeets__ is now known as manjeets16:21
*** jbernard_ is now known as jbernard16:21
openstackgerritDan Smith proposed openstack/nova master: WIP: Add extra logging to request filters  https://review.opendev.org/65762916:24
*** whoami-rajat has quit IRC16:25
*** nicolasbock has quit IRC16:29
*** nicolasbock has joined #openstack-nova16:29
openstackgerritJohn Garbutt proposed openstack/nova-specs master: Add Unified Limits Spec  https://review.opendev.org/60220116:40
*** ivve has joined #openstack-nova16:40
*** whoami-rajat has joined #openstack-nova16:43
*** tbachman has joined #openstack-nova16:53
*** sapd1_x has quit IRC16:58
*** udesale has quit IRC17:13
*** maciejjozefczyk_ has quit IRC17:17
*** ttsiouts has joined #openstack-nova17:22
dansmithah crap17:23
openstackgerritDan Smith proposed openstack/nova master: Add image type request filter  https://review.opendev.org/65641317:24
openstackgerritDan Smith proposed openstack/nova master: Enable image type query support in nova-next  https://review.opendev.org/65690317:24
openstackgerritDan Smith proposed openstack/nova master: Add docs for image type support request filter  https://review.opendev.org/65702517:24
openstackgerritDan Smith proposed openstack/nova master: WIP: Add extra logging to request filters  https://review.opendev.org/65762917:24
openstackgerritRodolfo Alonso Hernandez proposed openstack/os-vif stable/rocky: Prevent "qbr" Linux Bridge from replying to ARP messages  https://review.opendev.org/65569217:34
*** ttsiouts has quit IRC17:47
*** ttsiouts has joined #openstack-nova17:48
dansmithmriedem: have you been following the helm and health check stuff that wants to throw a non-nova tool on our bus to ping all our services all the time for health checks?17:50
dansmithI've been arguing for going the real http-based healthcheck route to avoid extra message bus load, the upgrade concerns, and to, you know, allow health checks to work in the absence of a working message bus, which is the most likely failure in any deployment :)17:51
artom<drive by> I'd argue that you'd still need to check message queue health, but that it's a separate thing and shouldn't be confused with Nova17:52
artomCuz if your message queue is down, not much is going to work :)17:53
dansmithnova does this already, and, the http based check can report "hey I can't talk to rabbit"17:53
dansmiththe http check returns a json blob of "things I know about" and "are they working or not"17:53
dansmithso, "can I talk to rabbit" "am I able to report my service status" "can I talk to libvirt right now" etc17:54
dansmithdoing that over the message bus adds load to the most loaded thing, and doesn't work if the bus is down,17:55
dansmithbut also won't work if conductor is down after a few minutes, because nova-compute is likely to have exhausted its threadpool trying to check in17:55
dansmithso it's really a fragile expectation I think17:55
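
To make the "json blob of things I know about" concrete, here is a small sketch of such an HTTP check built only on the Python standard library. It is an illustration of the approach, not an existing nova or oslo.middleware interface: the check names, the port, and the 200/503 convention are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_checks(checks):
    """Run each named check and return (http_status, json_body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as failing instead of crashing the probe.
            results[name] = False
    status = 200 if all(results.values()) else 503
    return status, json.dumps(results, sort_keys=True)


# Hypothetical checks; real ones would probe rabbit, libvirt, and so on.
CHECKS = {
    "message_bus": lambda: True,
    "service_status_report": lambda: True,
    "hypervisor": lambda: True,
}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = run_checks(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8099):
    # Bind to localhost only; the port number is an arbitrary example.
    HTTPServer(("127.0.0.1", port), HealthHandler).serve_forever()
```

Calling serve() starts the endpoint. Because the probe never touches the message bus, it can still answer, and report message_bus as false, when rabbit is the thing that is down.
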
*** lbragstad has quit IRC17:59
*** lbragstad has joined #openstack-nova17:59
*** ralonsoh has quit IRC18:00
*** ttsiouts has quit IRC18:14
artomdansmith, oh, we have a specific health check "API"?18:15
artomTIL18:15
dansmithno, but kube has a standard format that people have proposed adding to our services18:16
dansmithand I'm saying we should do _that_18:16
artomSeems sensible to me18:16
artomBetter than doing it through the back door18:16
* dansmith screenshots for later18:16
artomWhat, that I agreed with you?18:16
dansmithno, nevermind :)18:17
artomOh I know.18:17
artomBut... this being a professional channel and all.18:17
cdent"we should do _that_"++18:22
mriedemdansmith: no i haven't been following it18:23
dansmiththey already have an agent that sits on our bus and pings services, and unsurprisingly, have found that the version matters and things break if you don't have that right18:24
dansmithand I just want to make sure we're not getting into a situation where we bless that kind of activity, and/or start trying to worry about external consumers of our RPC APIs and hold versions around because of what might be out in the wild18:25
dansmithhttps://review.opendev.org/#/c/651140/18:25
artomI mean, we could be evil and just remove ping()18:28
dansmithartom: we need it for our own use18:28
dansmithand note that that error dump was on the *nova* server side, as a result of pinging with an old version,18:28
dansmithso that tool potentially generates bug reports for us18:29
dansmithartom: compute pings conductor at startup and waits until it replies, in case it gets started earlier, since it'll fail without conductor being up18:29
dansmithI told them about ping because they wanted to add their own ping-like thing to our rpcapi that they would use to do that,18:29
*** ttsiouts has joined #openstack-nova18:29
dansmithso I was like "we already have this, so please don't add another one, but also... about that approach..."18:30
artomWait, ping() is in the conductor rpcapi, right?18:31
artomSo... how are they going to check for Nova API health?18:31
artomWhich I assume is what they most care about...18:31
openstackgerritChris Dent proposed openstack/nova master: Make all functional tests reusable by other projects  https://review.opendev.org/65765918:31
artomAh, it's in the base class18:33
artomSo all our services have it18:33
dansmiththey can't check api that way anyway18:34
dansmithbecause api doesn't listen18:34
dansmithbut yeah, it's baked in for the services that do18:34
*** ttsiouts has quit IRC18:35
artomSo that's a good argument: you can only use ping() on services that listen to the message queue, which API doesn't.18:35
artomAnd API is kinda important ;)18:35
dansmithone could argue that you can test nova-api externally already ... via the api18:36
dansmithbut yeah, it's an argument.. the others I have are more important to me though :)18:36
artomRight, I'm trying to think of what's important to *them* :)18:36
artomSince you know, it's how you actually convince people ;)18:36
artomReading aspiers's reply on https://review.opendev.org/#/c/653707/2/specs/rpc-health-checks.rst@43, I can see where the self-healing SIG people are coming from. You want a thing that tells you directly, with no proxy, whether a single service is healthy or not.18:39
artomYou then have your own intelligence on what to do. But you don't need further analysis to see where the actual failure is - well, less of it, anyways18:39
artomDoing it through the RPC back door is one less layer to think about18:39
dansmithit's dependent on the network and rabbit working though18:40
dansmithand hides the visibility of what might be wrong on the network18:40
dansmithlike, if you get no route, you know the computer is down. if you get refused, you know the service is down18:41
dansmithvia rpc, you'll just not hear back, which could mean pretty much anything, including "the bus is overloaded right now"18:41
artomAlso true.18:42
dansmithregardless, I don't want them to be on our bus as a matter of principle, aside from the fact that I think they'll get bad data from doing so18:42
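
The diagnostic difference dansmith points at (no route versus connection refused versus silence) can be sketched with a plain TCP probe. This is a hedged illustration only: probe() and its result labels are invented here, and a real liveness probe would also issue the HTTP request.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a TCP connection attempt to a health endpoint."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "listening"
    except ConnectionRefusedError:
        # Host is reachable but nothing is listening: the service is down.
        return "refused"
    except socket.timeout:
        # No answer at all: a host or network problem.
        return "timeout"
    except OSError:
        # For example "no route to host": the machine itself is down.
        return "unreachable"
```

An RPC ping over the bus collapses all of these outcomes into a single "no reply", which is the loss of visibility described above.
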
artomAnyways, I need to step out of this philosophical discussion, things need doing :)18:42
*** purplerbot has quit IRC18:42
artomWas fun thinking about, though18:42
*** purplerbot has joined #openstack-nova18:42
openstackgerritSurya Seetharaman proposed openstack/nova master: Microversion 2.73: Support adding the reason behind a server lock  https://review.opendev.org/64866218:43
tssuryamriedem: thanks again for the detailed review ^. I totally appreciate it.18:44
*** zbr is now known as zbr|pto18:45
aspiersdansmith: thanks for the quick and helpful reply. I think I'm totally fine with your suggestion of opening up new dedicated HTTP endpoints for health-checking RPC-only services, and the advantages you cite make sense18:49
*** _hemna has joined #openstack-nova18:51
aspiersdansmith, artom: FWIW, the problem the OSH folks encountered was not due to pinging with an old version - it was due to not knowing ping() exists and therefore deliberately invoking a non-existent RPC call, which (unsurprisingly) caused errors in the server logs18:53
aspiersThe errors were actually masked for a while, but surfaced when something in oslo logging changed18:53
artomaspiers, ah, yeah that makes more sense with "Attempted method: pod_health_probe_method_ignore_errors"18:54
dansmithaspiers: it's the same thing that will happen if you do use ping, but with the wrong version18:54
artomThat's definitely not a method we have ')18:54
dansmithwhich was my point18:54
aspiersdansmith: gotcha18:54
artomIt does add an external thing to the upgrade process. Nova internally can support N/N+118:55
artomWith the proposed health check, suddenly this new external thing has to be upgraded at the same time as part of the Nova upgrade18:55
dansmithartom: we actually only support N/N+1 internally for major versions on bridge releases18:55
dansmithwe support N.0-N.x on regular releases18:56
aspiersartom: You lost me - what has to be upgraded?18:56
dansmiththis ^ :)18:56
artomaspiers, and dansmith lost me :)18:56
artomWhat's a bridge release?18:57
aspiersAre you talking about a scenario where the ping() interface changes in future versions?18:57
dansmithwe haven't had one in a while, but when we bump the major, we support the old and new majors on the server side18:57
dansmithit's a whole complex thing18:57
dansmithaspiers: ping is unlikely to change, but it could, but even if it doesn't, you have to track the supported major in your tool else you'll generate errors like the one in the commit message when we bump, even though ping hasn't changed18:58
aspiersdansmith: you mean because it won't support calls from (much) older versions?18:59
dansmithand during a major bump release, you'd have to know which computes are upgraded to send the new major18:59
dansmithand conductors don't support multiple majors18:59
artomAh, so my confusion stemmed from thinking release = major bump, which isn't the case18:59
dansmithartom: right18:59
dansmithaspiers: from anything other than major.0 right18:59
*** mgariepy has quit IRC18:59
aspiersOK got it (I think)18:59
artomSo technically queens computes can talk to stein conductors, because both are major 518:59
aspierseven though I suspect the liveness probe would be packaged within the same container as the service it's checking19:00
artomIt would still have to be code-updated at the same time with the new major19:01
aspiersright19:01
artomWhich is harder when it's not the same repo :)19:01
dansmiththis is why you shouldn't do this :)19:01
aspierswhereas going through oslo.middleware (say) wouldn't have any of these problems19:01
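
A simplified model of the versioning rule under discussion, to show why an external tool has to track the RPC major: the majors must match exactly, and within a major the server accepts any minor up to its own. The function name and the version numbers are illustrative assumptions, not nova's actual code or versions.

```python
def version_is_compatible(server_version, requested_version):
    """Return True if a server at server_version can serve requested_version."""
    server_major, server_minor = (int(p) for p in server_version.split("."))
    req_major, req_minor = (int(p) for p in requested_version.split("."))
    # A major bump is a breaking change, so majors must match exactly.
    if server_major != req_major:
        return False
    # Within a major, the server handles any minor up to its own.
    return req_minor <= server_minor


print(version_is_compatible("5.3", "5.0"))  # same major, older minor
print(version_is_compatible("6.0", "5.0"))  # major bumped on the server
```

Under this model a ping sent at major.0 keeps working across regular releases, but starts failing after a major bump even though ping() itself never changed.
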
*** ttsiouts has joined #openstack-nova19:02
dansmithand we'd have a defined health check schema we agree on and support19:03
dansmithi.e. list of tests19:03
aspiersHmm, re-reading the scrollback - did you say nova already exposes some health data on an existing endpoint?19:04
aspiersI wasn't aware of that ... /me checks the API docs19:05
dansmithno19:05
aspiersMaybe it's just using the existing oslo.middleware mechanism19:05
dansmithno, I'm saying we *should* do that19:05
dansmithnone of our services (other than API) even have http interfaces, so we need to do that thing19:05
aspiersAh, wrong interpretation of "this" --> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-05-07.log.html#t2019-05-07T17:53:2519:05
*** mgariepy has joined #openstack-nova19:05
dansmithI'm saying this is what we should be doing19:05
aspiersYeah OK, that's what I thought. Just got momentarily confused by the scrollback.19:06
aspiersSorry :)19:06
*** ttsiouts has quit IRC19:06
artomI have to say, opening up internal services with an HTTP endpoint seems dangerous, security wise.19:09
artomDepends on how it's implemented, I guess19:09
artomBut being able to DDoS the conductor doesn't seem fun19:09
artomThough I guess any self-respecting operator will firewall the crap out of those19:10
aspiersartom: I doubt OS-Helm has any intention of carrying out DDoS attacks ;-)19:10
aspiersExactly. They probably don't even need to listen outside localhost19:10
*** tesseract has quit IRC19:10
aspiersRemember this is k8s monitoring from the container host19:10
aspiersor similar19:10
artomaspiers, I don't doubt Helm's intentions, it's others I'm paranoid about19:11
aspiersIt won't be open to others19:11
aspiers... unless the operator wants it to be. We're on the same page :)19:11
dansmithartom: it's read-only, has no real info, and you can make it depth-1, so literally only one request at a time19:12
aspiersRight19:12
artomdansmith, what, you mean dumping a detailed 1MB JSON blob of stats isn't a good idea? ;)19:12
aspiersIt actually links up quite nicely with dirk's idea of exposing more internal metrics for consumption by e.g. Prometheus19:13
aspiersAnyway, gotta go. Back tomorrow.19:14
*** jobewan has joined #openstack-nova19:24
*** tssurya has quit IRC19:29
*** _hemna has quit IRC19:34
*** ttsiouts has joined #openstack-nova19:43
*** itlinux has joined #openstack-nova19:58
*** bbowen_ has quit IRC20:12
*** ttsiouts has quit IRC20:16
*** itlinux has quit IRC20:19
*** hamzy has quit IRC20:27
*** artom has quit IRC20:31
*** slaweq has quit IRC20:36
*** slaweq has joined #openstack-nova20:42
*** tssurya has joined #openstack-nova20:52
*** slaweq has quit IRC20:55
*** jdillaman has quit IRC21:00
*** slaweq has joined #openstack-nova21:13
*** ttsiouts has joined #openstack-nova21:18
*** ttsiouts has quit IRC21:23
*** slaweq has quit IRC21:25
*** nicolasbock has quit IRC21:25
*** nicolasbock has joined #openstack-nova21:25
*** bbowen_ has joined #openstack-nova21:30
*** hongbin has joined #openstack-nova21:35
*** mriedem has quit IRC21:36
cfriesenis opendev.org down?21:49
cdentcfriesen: looks that way22:02
*** JamesBenson has quit IRC22:03
*** slaweq has joined #openstack-nova22:11
*** ccstone has joined #openstack-nova22:20
*** slaweq has quit IRC22:24
openstackgerritJohn Garbutt proposed openstack/nova master: Move default policy target  https://review.opendev.org/65769622:43
openstackgerritJohn Garbutt proposed openstack/nova master: Better policy unit tests  https://review.opendev.org/65769722:43
openstackgerritJohn Garbutt proposed openstack/nova master: Add functional test for admin_actions  https://review.opendev.org/65769822:43
*** panda has quit IRC22:44
*** jobewan has quit IRC22:45
*** panda has joined #openstack-nova22:46
*** zer0c00l_ has joined #openstack-nova22:48
openstackgerritJohn Garbutt proposed openstack/nova master: WIP: Integrating with unified limits  https://review.opendev.org/61518022:57
*** threestrands has joined #openstack-nova23:00
*** tkajinam has joined #openstack-nova23:00
*** threestrands has quit IRC23:00
*** rcernin has joined #openstack-nova23:06
*** slaweq has joined #openstack-nova23:11
*** artom has joined #openstack-nova23:21
*** openstackstatus has joined #openstack-nova23:24
*** ChanServ sets mode: +v openstackstatus23:24
*** slaweq has quit IRC23:24
-openstackstatus- NOTICE: If your jobs failed due to connectivity issues to opendev.org they can be rechecked now. Services have been restored at that domain.23:26
*** hoonetorg has quit IRC23:33
*** ttsiouts has joined #openstack-nova23:35
*** hongbin has quit IRC23:41
*** mlavalle has quit IRC23:43
*** hoonetorg has joined #openstack-nova23:47
*** bbowen_ has quit IRC23:47
*** bbowen_ has joined #openstack-nova23:47
*** JamesBenson has joined #openstack-nova23:53
*** whoami-rajat has quit IRC23:55
*** JamesBenson has quit IRC23:57
