Monday, 2020-07-13

*** hongbin has quit IRC		00:20
*** hongbin has joined #heat		00:22
*** ricolin has joined #heat		02:33
*** hongbin has quit IRC		03:26
*** ramishra has joined #heat		03:44
*** ricolin_ has joined #heat		04:56
*** ricolin has quit IRC		04:57
*** udesale has joined #heat		05:38
*** rcernin has quit IRC		06:02
*** rcernin has joined #heat		06:11
*** vishalmanchanda has joined #heat		06:54
*** rcernin has quit IRC		08:06
*** ramishra has quit IRC		08:28
*** ramishra has joined #heat		08:29
*** k_mouza has joined #heat		09:21
*** rcernin has joined #heat		09:54
*** ricolin_ has quit IRC		10:01
*** rcernin has quit IRC		10:03
*** rcernin has joined #heat		10:08
*** rcernin has quit IRC		10:40
*** ricolin_ has joined #heat		11:23
*** tkajinam has quit IRC		11:37
*** udesale_ has joined #heat		12:09
*** udesale has quit IRC		12:12
*** ramishra has quit IRC		13:22
*** ramishra has joined #heat		13:26
*** udesale_ has quit IRC		13:27
*** irclogbot_2 has quit IRC		13:27
*** udesale_ has joined #heat		13:28
*** irclogbot_2 has joined #heat		13:28
*** beekneemech is now known as bnemec		14:38
*** udesale_ has quit IRC		15:07
*** k_mouza has quit IRC		16:33
*** ayoung has joined #heat		16:55
*** ayoung has quit IRC		17:14
*** ayoung has joined #heat		17:17
*** ayoung has quit IRC		17:29
*** ayoung has joined #heat		17:30
*** ricolin_ has quit IRC		17:40
*** ayoung has quit IRC		18:03
*** ayoung has joined #heat		18:04
zaneb	mnaser: I think this is the same issue: https://storyboard.openstack.org/#!/story/2007843	19:04
mnaser	zaneb: yep.. we've resorted to running `heat-manage service clean` for now every 1 hour + manually setting `host` to match the system hostname (we don't want to run with `hostNetwork` and this is running inside k8s)	19:05
mnaser	it pretty much does mean that autoscaling heat-engine inside k8s is not a possibility	19:06
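Editor's note: the hourly cleanup mnaser describes could be scripted roughly as below. This is a minimal sketch, not mnaser's actual setup. Only the `heat-manage service clean` command comes from the log; the function names, the `runner` and `sleep` parameters, and the failure counting are hypothetical and exist only to make the loop testable.

```python
import subprocess
import time

# The command quoted in the log; everything else is a hypothetical wrapper.
CLEAN_COMMAND = ["heat-manage", "service", "clean"]


def run_clean(runner=subprocess.run):
    """Run the clean command once and return its exit code."""
    result = runner(CLEAN_COMMAND, check=False)
    return result.returncode


def clean_periodically(interval_seconds=3600, iterations=None,
                       runner=subprocess.run, sleep=time.sleep):
    """Run the clean command every `interval_seconds`.

    `iterations=None` loops forever; a number bounds the loop (useful for
    tests). Returns how many runs exited with a non-zero code.
    """
    failures = 0
    count = 0
    while iterations is None or count < iterations:
        if run_clean(runner) != 0:
            failures += 1
        count += 1
        # Sleep between runs, but not after the final bounded run.
        if iterations is None or count < iterations:
            sleep(interval_seconds)
    return failures
```

Calling `clean_periodically()` with no arguments would run the command once an hour until the process is stopped. Inside Kubernetes a scheduled job would likely be a better fit than a long-running loop.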
zaneb	why not? rabbit gets overwhelmed?	19:07
mnaser	zaneb: well, what i noticed is on scale-down events (when heat gets SIGTERM), it doesn't clean up its queues behind it, so for a bit, engines are still up	19:08
mnaser	zaneb: so things queue up in those and then api calls timeout (for example, a stack list)	19:09
zaneb	mnaser: that's weird. when you do a graceful shutdown it should stop pulling new requests off the queue and respond to in-flight API calls before shutting down	19:10
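Editor's note: the graceful-shutdown behaviour zaneb describes (stop pulling new requests, finish what is in flight, then exit) can be illustrated with a toy worker. This is a sketch under stated assumptions, not Heat's or oslo.messaging's implementation. `GracefulWorker`, `install_sigterm_handler`, and the in-memory `queue.Queue` standing in for the message bus are all hypothetical.

```python
import queue
import signal


class GracefulWorker:
    """Toy consumer that stops taking new work once a stop is requested."""

    def __init__(self, inbox):
        self.inbox = inbox
        self.stopping = False
        self.handled = []

    def request_stop(self, signum=None, frame=None):
        # Signature matches a signal handler, so it can be used with SIGTERM.
        self.stopping = True

    def handle(self, message):
        # Stand-in for real request processing.
        self.handled.append(message)

    def run(self):
        # The flag is checked only between messages, so a message that has
        # already been taken off the queue is always handled to completion.
        while not self.stopping:
            try:
                message = self.inbox.get(timeout=0.1)
            except queue.Empty:
                continue
            self.handle(message)


def install_sigterm_handler(worker):
    """Route SIGTERM to the worker (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

In this model, messages still in the queue at shutdown are left for another consumer. That only helps if some other consumer is listening on the same queue.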
mnaser	zaneb: but because the engine state is still up (cause i assume it only goes 'down' when it hits $timeout)	19:11
mnaser	so things still get queue'd to it (i noticed this with engine_worker with a few messages like 13-15)	19:11
zaneb	we don't unicast API requests to a particular engine afaik, except when we're checking if it's still alive for the purposes of stealing its locks	19:12
mnaser	zaneb: maybe i'm misinterpreting the real issue, but the artifact is api timeouts and some engine listener queues with 13-15 msgs (or more depending on how busy heat was at the time)	19:12
zaneb	mnaser: looks like the engine-listener ones are purely messages asking the engine if it's still alive. if there's no reply we'll conclude that it's not. so that shouldn't break anything	19:31
zaneb	API requests shouldn't time out unless the engine is shutdown ungracefully though	19:32
mnaser	zaneb: well also engine_worker_xxxx thing	19:32
mnaser	those also end up with messages	19:32
zaneb	that's weird because we literally only ever use cast() with engine_worker queues afaict	19:33
mnaser	zaneb: but what about the theory of a cast() from the api going towards engine_worker queues that are no longer being listened to because the worker has shut down?	19:34
mnaser	this could also be a by-product of other things, upgrading to ussuri really hurt the performance till enabling cache :x	19:34
zaneb	there's 3 kinds of topics	19:35
zaneb	engine - uses call() to respond to API calls. no idea how oslo.messaging chooses which engine's queue to send to	19:36
zaneb	engine-listener - uses call() to check if a particular engine is alive	19:36
zaneb	engine_worker - uses cast() exclusively	19:36
zaneb	I'd have assumed that for cast() all messages would go in the same queue and any engine could pick them up	19:37
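Editor's note: the difference between call() and cast() discussed above can be modelled with plain in-memory queues. This is a toy model, not the oslo.messaging API. `ToyTopic`, `cast`, `call`, and `serve` are hypothetical names, and how real messages are routed to queues is exactly the open question in this conversation. The sketch shows only the general semantics: a cast returns immediately and its message can sit in an unconsumed queue, while a call blocks until a reply arrives or a timeout expires.

```python
import queue
import threading


class ToyTopic:
    """A named in-memory queue standing in for a message-bus queue."""

    def __init__(self, name):
        self.name = name
        self.messages = queue.Queue()


def cast(topic, payload):
    """Fire-and-forget: enqueue and return at once, no reply expected."""
    topic.messages.put({"payload": payload, "reply": None})


def call(topic, payload, timeout):
    """Request/response: enqueue, then block for a reply or time out."""
    reply = queue.Queue(maxsize=1)
    topic.messages.put({"payload": payload, "reply": reply})
    try:
        return reply.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError(
            f"no reply on {topic.name} within {timeout}s") from None


def serve(topic, stop):
    """Toy engine loop: consume messages until `stop` is set."""
    while not stop.is_set():
        try:
            message = topic.messages.get(timeout=0.05)
        except queue.Empty:
            continue
        if message["reply"] is not None:
            message["reply"].put(f"done: {message['payload']}")


if __name__ == "__main__":
    topic = ToyTopic("engine_worker_demo")
    stop = threading.Event()
    worker = threading.Thread(target=serve, args=(topic, stop))
    worker.start()
    print(call(topic, "ping", timeout=1.0))  # answered while the worker runs
    stop.set()
    worker.join()
    for i in range(3):
        cast(topic, i)  # nobody is listening any more
    print(topic.messages.qsize())  # casts pile up in the orphaned queue
```

If each engine had its own queue and that queue outlived the engine, this model would reproduce both symptoms mnaser reports: queues holding a handful of undelivered messages, and callers timing out.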
mnaser	zaneb: so this is what i end up with https://www.irccloud.com/pastebin/CqDCzURw/	19:38
zaneb	thanks. oslo.messaging docs are non-existent, so we'll have to find an expert to explain how it is supposed to work	19:44
zaneb	bnemec: who is the oslo.messaging expert these days? kgiusti?	19:46
bnemec	He's the first person I would point you to, yeah.	19:46
zaneb	asked in #openstack-oslo	19:57
*** vishalmanchanda has quit IRC		21:08
*** k_mouza has joined #heat		22:12
*** tkajinam has joined #heat		22:54
*** ayoung has quit IRC		23:06
*** ayoung has joined #heat		23:08
*** rcernin has joined #heat		23:08
*** k_mouza has quit IRC		23:17
*** hoonetorg has quit IRC		23:24
*** k_mouza has joined #heat		23:30
*** k_mouza has quit IRC		23:34
*** hoonetorg has joined #heat		23:39
*** k_mouza has joined #heat		23:40
*** k_mouza has quit IRC		23:45
*** k_mouza has joined #heat		23:48
*** k_mouza has quit IRC		23:53
*** k_mouza has joined #heat		23:57
