*** hongbin has quit IRC | 00:20 | |
*** hongbin has joined #heat | 00:22 | |
*** ricolin has joined #heat | 02:33 | |
*** hongbin has quit IRC | 03:26 | |
*** ramishra has joined #heat | 03:44 | |
*** ricolin_ has joined #heat | 04:56 | |
*** ricolin has quit IRC | 04:57 | |
*** udesale has joined #heat | 05:38 | |
*** rcernin has quit IRC | 06:02 | |
*** rcernin has joined #heat | 06:11 | |
*** vishalmanchanda has joined #heat | 06:54 | |
*** rcernin has quit IRC | 08:06 | |
*** ramishra has quit IRC | 08:28 | |
*** ramishra has joined #heat | 08:29 | |
*** k_mouza has joined #heat | 09:21 | |
*** rcernin has joined #heat | 09:54 | |
*** ricolin_ has quit IRC | 10:01 | |
*** rcernin has quit IRC | 10:03 | |
*** rcernin has joined #heat | 10:08 | |
*** rcernin has quit IRC | 10:40 | |
*** ricolin_ has joined #heat | 11:23 | |
*** tkajinam has quit IRC | 11:37 | |
*** udesale_ has joined #heat | 12:09 | |
*** udesale has quit IRC | 12:12 | |
*** ramishra has quit IRC | 13:22 | |
*** ramishra has joined #heat | 13:26 | |
*** udesale_ has quit IRC | 13:27 | |
*** irclogbot_2 has quit IRC | 13:27 | |
*** udesale_ has joined #heat | 13:28 | |
*** irclogbot_2 has joined #heat | 13:28 | |
*** beekneemech is now known as bnemec | 14:38 | |
*** udesale_ has quit IRC | 15:07 | |
*** k_mouza has quit IRC | 16:33 | |
*** ayoung has joined #heat | 16:55 | |
*** ayoung has quit IRC | 17:14 | |
*** ayoung has joined #heat | 17:17 | |
*** ayoung has quit IRC | 17:29 | |
*** ayoung has joined #heat | 17:30 | |
*** ricolin_ has quit IRC | 17:40 | |
*** ayoung has quit IRC | 18:03 | |
*** ayoung has joined #heat | 18:04 | |
zaneb | mnaser: I think this is the same issue: https://storyboard.openstack.org/#!/story/2007843 | 19:04 |
---|---|---|
mnaser | zaneb: yep.. we've resorted to running `heat-manage service clean` for now every 1 hour + manually setting `host` to match the system hostname (we don't want to run with `hostNetwork` and this is running inside k8s) | 19:05 |
mnaser | it pretty much does mean that autoscaling heat-engine inside k8s is not a possibility | 19:06 |
zaneb | why not? rabbit gets overwhelmed? | 19:07 |
mnaser | zaneb: well, what i noticed is on scale-down events (when heat gets SIGTERM), it doesn't clean up its queues behind it, so for a bit, engines are still up | 19:08 |
mnaser | zaneb: so things queue up in those and then api calls timeout (for example, a stack list) | 19:09 |
zaneb | mnaser: that's weird. when you do a graceful shutdown it should stop pulling new requests off the queue and respond to in-flight API calls before shutting down | 19:10 |
mnaser | zaneb: but because the engine state is still up (cause i assume it only goes 'down' when it hits $timeout) | 19:11 |
mnaser | so things still get queue'd to it (i noticed this with engine_worker with a few messages like 13-15) | 19:11 |
zaneb | we don't unicast API requests to a particular engine afaik, except when we're checking if it's still alive for the purposes of stealing its locks | 19:12 |
mnaser | zaneb: maybe i'm misinterpreting the real issue, but the artifact is api timeouts and some engine listener queues with 13-15 msgs (or more depending on how busy heat was at the time) | 19:12 |
zaneb | mnaser: looks like the engine-listener ones are purely messages asking the engine if it's still alive. if there's no reply we'll conclude that it's not. so that shouldn't break anything | 19:31 |
zaneb | API requests shouldn't time out unless the engine is shut down ungracefully though | 19:32 |
mnaser | zaneb: well also engine_worker_xxxx thing | 19:32 |
mnaser | those also end up with messages | 19:32 |
zaneb | that's weird because we literally only ever use cast() with engine_worker queues afaict | 19:33 |
mnaser | zaneb: but the theory of a cast() from the api going towards engine_worker queues that are no longer being listened to because the worker has shut down? | 19:34 |
mnaser | this could also be a by-product of other things, upgrading to ussuri really hurt the performance till enabling cache :x | 19:34 |
zaneb | there's 3 kinds of topics | 19:35 |
zaneb | engine - uses call() to respond to API calls. no idea how oslo.messaging chooses which engine's queue to send to | 19:36 |
zaneb | engine-listener - uses call() to check if a particular engine is alive | 19:36 |
zaneb | engine_worker - uses cast() exclusively | 19:36 |
zaneb | I'd have assumed that for cast() all messages would go in the same queue and any engine could pick them up | 19:37 |
mnaser | zaneb: so this is what i end up with https://www.irccloud.com/pastebin/CqDCzURw/ | 19:38 |
zaneb | thanks. oslo.messaging docs are non-existent, so we'll have to find an expert to explain how it is supposed to work | 19:44 |
zaneb | bnemec: who is the oslo.messaging expert these days? kgiusti? | 19:46 |
bnemec | He's the first person I would point you to, yeah. | 19:46 |
zaneb | asked in #openstack-oslo | 19:57 |
*** vishalmanchanda has quit IRC | 21:08 | |
*** k_mouza has joined #heat | 22:12 | |
*** tkajinam has joined #heat | 22:54 | |
*** ayoung has quit IRC | 23:06 | |
*** ayoung has joined #heat | 23:08 | |
*** rcernin has joined #heat | 23:08 | |
*** k_mouza has quit IRC | 23:17 | |
*** hoonetorg has quit IRC | 23:24 | |
*** k_mouza has joined #heat | 23:30 | |
*** k_mouza has quit IRC | 23:34 | |
*** hoonetorg has joined #heat | 23:39 | |
*** k_mouza has joined #heat | 23:40 | |
*** k_mouza has quit IRC | 23:45 | |
*** k_mouza has joined #heat | 23:48 | |
*** k_mouza has quit IRC | 23:53 | |
*** k_mouza has joined #heat | 23:57 |
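
The theory discussed at 19:34 to 19:37 (a cast() landing on a per-engine queue that nobody consumes any more, versus a call() that times out and marks the engine as dead) can be sketched with a toy in-memory broker. This is only an illustration under stated assumptions: `ToyBroker`, `simulate_scale_down`, the queue names and the message strings are all hypothetical, and nothing here uses or reproduces the real oslo.messaging API or its routing behaviour, which the log leaves unresolved.

```python
from collections import defaultdict, deque


class ToyBroker:
    """Hypothetical in-memory stand-in for a message broker (not oslo.messaging)."""

    def __init__(self):
        self.queues = defaultdict(deque)   # queue name -> pending messages
        self.consumers = {}                # queue name -> handler callable

    def listen(self, name, handler):
        self.consumers[name] = handler
        self._drain(name)

    def stop_listening(self, name):
        # Models an engine that exits without deleting its queue.
        self.consumers.pop(name, None)

    def cast(self, name, msg):
        # Fire-and-forget: the sender never learns whether anyone consumed it.
        self.queues[name].append(msg)
        self._drain(name)

    def call(self, name, msg):
        # Request/response: with no consumer, the caller only sees a timeout.
        handler = self.consumers.get(name)
        if handler is None:
            self.queues[name].append(msg)
            raise TimeoutError(f"no reply on {name}")
        return handler(msg)

    def _drain(self, name):
        handler = self.consumers.get(name)
        while handler is not None and self.queues[name]:
            handler(self.queues[name].popleft())

    def depth(self, name):
        return len(self.queues[name])


def simulate_scale_down(pending=13):
    """Run one engine, stop it, then keep sending to its old queues."""
    broker = ToyBroker()
    handled = []
    worker_q = "engine_worker.engine-a"      # assumed per-engine queue name
    listener_q = "engine-listener.engine-a"  # assumed per-engine queue name

    broker.listen(worker_q, handled.append)
    broker.listen(listener_q, lambda msg: True)

    broker.cast(worker_q, "work-0")
    alive_before = broker.call(listener_q, "ping")

    # Scale-down: consumers disappear, queues stay behind.
    broker.stop_listening(worker_q)
    broker.stop_listening(listener_q)

    for i in range(pending):
        broker.cast(worker_q, f"work-{i + 1}")
    try:
        alive_after = broker.call(listener_q, "ping")
    except TimeoutError:
        alive_after = False  # "if there's no reply we'll conclude that it's not"

    return {
        "handled": len(handled),
        "stranded_casts": broker.depth(worker_q),
        "listener_backlog": broker.depth(listener_q),
        "alive_before": alive_before,
        "alive_after": alive_after,
    }


if __name__ == "__main__":
    print(simulate_scale_down())
```

In this toy model the liveness check fails harmlessly, while cast messages sent after shutdown simply pile up, which matches the symptom mnaser describes. Whether real cast() traffic is routed to per-engine queues at all is exactly the open question that was taken to #openstack-oslo.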