Wednesday, 2020-07-29

*** rcernin has quit IRC 00:05
*** k_mouza has joined #heat 00:15
*** k_mouza has quit IRC 00:20
*** rcernin has joined #heat 00:31
*** weshay|ruck has quit IRC 00:53
*** weshay_ has joined #heat 00:53
*** andrein has quit IRC 00:57
*** zzzeek has quit IRC 00:57
*** andrein has joined #heat 01:01
*** zzzeek has joined #heat 01:01
*** tkajinam has quit IRC 01:06
*** tkajinam has joined #heat 01:06
*** k_mouza has joined #heat 01:31
*** tkajinam has quit IRC 01:33
*** tkajinam has joined #heat 01:34
*** k_mouza has quit IRC 01:36
*** ricolin has quit IRC 02:15
*** ricolin has joined #heat 02:21
*** k_mouza has joined #heat 02:45
*** k_mouza has quit IRC 02:50
*** k_mouza has joined #heat 03:27
*** k_mouza has quit IRC 03:31
*** udesale has joined #heat 04:45
*** ramishra has quit IRC 05:11
*** ramishra has joined #heat 05:16
*** brtknr has quit IRC 06:17
*** vishalmanchanda has joined #heat 06:39
*** tosky has joined #heat 07:39
*** brtknr has joined #heat 08:46
*** rcernin has quit IRC 09:12
*** k_mouza has joined #heat 09:29
*** tkajinam has quit IRC 10:01
*** ramishra has quit IRC 10:08
*** ramishra has joined #heat 10:49
*** ricolin has quit IRC 10:52
<openstackgerrit> Rabi Mishra proposed openstack/heat stable/rocky: Don't store signal_url for ec2 signaling of deployments  https://review.opendev.org/743732 11:50
*** udesale_ has joined #heat 12:19
*** udesale has quit IRC 12:21
*** weshay_ is now known as weshay|ruck 12:54
<zaneb> nsmeds: I think this is the bug report for that issue: https://storyboard.openstack.org/#!/story/2007843 12:55
<pas-ha> I believe it all stems from the fact that the heat engines' ids are not stable and are re-created on each restart. the same zombie entries are then seen in heat service-list, and must be periodically cleaned up from outside 13:22
<pas-ha> ideally i'd like them to be stable enough, like hostname-<worker-index>. 13:23
<pas-ha> this is not a complete solution, but the zombies will be left only if one downscales the heat-engine workers on the host, which is IMO a way less frequent operation than a heat-engine restart 13:24
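To make pas-ha's suggestion concrete, here is a minimal Python sketch contrasting an ID that is regenerated on every start with a stable `hostname-<worker-index>` ID. It is not Heat's actual code: both function names are hypothetical, and modelling the current behaviour as a random UUID is an assumption based only on the description above.

```python
import socket
import uuid
from typing import Optional


def random_engine_id() -> str:
    """Hypothetical model of the behaviour described above: a new ID per start."""
    return str(uuid.uuid4())


def stable_engine_id(worker_index: int, hostname: Optional[str] = None) -> str:
    """Hypothetical stable ID in the hostname-<worker-index> form pas-ha suggests."""
    host = hostname or socket.gethostname()
    return f"{host}-{worker_index}"
```

With stable IDs, a restarted worker would reuse the same name, so stale entries would only appear when the number of workers on a host is reduced, which is the trade-off pas-ha describes.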
<zaneb> pas-ha: I wish I understood better how those queues are used. are they just there to wait for responses to messages sent out by the engine using call()? 13:28
<jrosser> if you ever get in a situation where the heat services constantly restart, bad things end up happening real quick 13:28
<jrosser> that's caused us a whole-cloud outage at least once 13:28
<zaneb> o.O 13:29
<jrosser> because you've got just a bazillion queues generated and rabbitmq is just wedged totally 13:29
<zaneb> I'd love to fix this, it seems to be getting worse. I'd never heard of it before like a year ago and now there's multiple people (at least 4-5) complaining about it 13:29
<zaneb> the only proposal we've had was to fix it in devstack, which I nixed because that doesn't actually help any real clouds 13:30
<zaneb> we store in the DB a list of the engines and their status, so we know if they're not still alive. in principle we could delete those queues. in practice, I don't know if oslo.messaging gives us the APIs to do that 13:31
<zaneb> but also I really wanna know if we're losing messages when we do that 13:32
<zaneb> (I suspect the increased reports are a result of more people running OpenStack on top of k8s) 13:33
*** gmoro has joined #heat 13:36
<jrosser> i don't know the root cause but the symptom we had was like this 13:40
<jrosser> INFO heat.engine.worker [-] Starting engine_worker (1.4) in engine b3ab1e59-1ed5-4293-b16130 13:41
<jrosser> INFO oslo_service.service [-] Child 16146 killed by signal 9 13:41
<jrosser> just over and over in the heat service log 13:41
<nsmeds> @zaneb @jrosser thanks for the discussion! Considering we do not use Heat (yet), as it's a fairly new OpenStack deployment, do you think we're safe to just use `rabbitmqctl` to delete all the Heat queues? 13:54
<zaneb> nsmeds: in that situation, I would probably shut down all of the heat services, delete the queues, and start them back up again 13:55
<nsmeds> Btw, we use openstack-ansible - so not k8s. This only occurred on the staging cluster (which we deployed first, and had a few issues figuring out settings). The production cluster was deployed "smoothly" and doesn't have this same issue with Heat queues. 13:55
<nsmeds> Ok, I will try so today. Thank you! 13:56
<zaneb> that way you know you have what you need and no more 13:56
*** cliffparsons has quit IRC 13:57
<jrosser> nsmeds: i take no credit for the bash skills here but someone gave me this to help clean up http://paste.openstack.org/show/796430/ 13:57
<nsmeds> @jrosser thank you! 13:58
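A hedged sketch of the "delete the queues" step in zaneb's procedure (shut down all Heat services, delete the queues, start the services again). It is not the script from jrosser's paste, whose contents are not reproduced in this log. The helper names are hypothetical, and the queue-name pattern and vhost are assumptions that should be checked against what `rabbitmqctl list_queues` actually shows on the cluster. The code only selects names and builds `rabbitmqctl delete_queue` command lines; it does not run anything.

```python
import re
from typing import List, Pattern

# ASSUMPTION: Heat's per-engine queues start with one of these prefixes.
# Verify against the real queue list before deleting anything.
HEAT_QUEUE_PATTERN = re.compile(r"^(heat-engine-listener|engine_worker)")


def select_queues(list_queues_output: str,
                  pattern: Pattern[str] = HEAT_QUEUE_PATTERN) -> List[str]:
    """Pick matching queue names from one-name-per-line text,
    e.g. the output of `rabbitmqctl list_queues --quiet name`."""
    names = []
    for line in list_queues_output.splitlines():
        name = line.strip()
        if name and pattern.search(name):
            names.append(name)
    return names


def delete_commands(names: List[str], vhost: str = "/") -> List[List[str]]:
    """Build one `rabbitmqctl delete_queue` argument list per queue name."""
    return [["rabbitmqctl", "delete_queue", "-p", vhost, n] for n in names]
```

Each returned argument list could be passed to `subprocess.run` once all Heat services are stopped, so that only the queues the restarted services recreate remain.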
<zaneb> jrosser: 9 is SIGKILL so that would do it 13:58
*** cliffparsons has joined #heat 13:58
*** ricolin has joined #heat 14:40
*** cliffparsons has quit IRC 14:40
*** rcernin has joined #heat 14:42
*** rcernin_ has joined #heat 14:52
<ricolin> tosky, I will send a patch to migrate those two jobs 14:53
*** rcernin_ has quit IRC 14:56
*** rcernin has quit IRC 14:58
*** irclogbot_2 has quit IRC 14:58
<tosky> ricolin: thank you! 14:58
*** irclogbot_1 has joined #heat 14:58
<ricolin> nsmeds, heat generates queues for the engine service every time an engine service starts, so what you need to do is set an expiry time for the queues, so that after a restart the old queues will eventually be removed 14:59
<ricolin> nsmeds, probably some commands like 14:59
<ricolin> nsmeds, http://paste.openstack.org/show/796434/ 14:59
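The contents of ricolin's paste are not reproduced in this log, so the following is only a sketch of one way to set a queue expiry: a RabbitMQ policy with the `expires` key, which removes a queue after it has been unused for the given number of milliseconds. The helper name, policy name, queue pattern, and vhost are all assumptions for illustration. The function only builds the `rabbitmqctl set_policy` argument list and does not run it.

```python
import json
from typing import List


def expiry_policy_command(name: str, pattern: str, expires_ms: int,
                          vhost: str = "/") -> List[str]:
    """Build a `rabbitmqctl set_policy` command that expires unused queues.

    `name`, `pattern` and `vhost` are caller-supplied; nothing here is
    specific to Heat.
    """
    if expires_ms <= 0:
        raise ValueError("expires_ms must be a positive number of milliseconds")
    definition = json.dumps({"expires": expires_ms})
    return ["rabbitmqctl", "set_policy", "-p", vhost,
            "--apply-to", "queues", name, pattern, definition]


# Hypothetical usage: expire matching queues after one hour without use.
cmd = expiry_policy_command("heat_rpc_expire", r"^heat-engine-listener\.", 3600000)
```

Because the policy applies to any matching queue that goes unused, queues left behind by a restarted engine would be cleaned up by RabbitMQ itself rather than by an external script.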
<ricolin> tosky, no, thank you! :) 15:00
* ricolin keeps forgetting those two jobs 15:00
*** hoonetorg has quit IRC 15:09
<nsmeds> @ricolin ok, I'll look into that as well :) 15:16
*** hoonetorg has joined #heat 15:21
*** cliffparsons has joined #heat 15:47
*** k_mouza has quit IRC 16:09
*** udesale_ has quit IRC 16:24
*** k_mouza has joined #heat 16:28
*** k_mouza has quit IRC 16:33
*** k_mouza has joined #heat 16:46
*** k_mouza has quit IRC 16:50
*** k_mouza has joined #heat 16:57
*** ricolin has quit IRC 17:01
*** k_mouza has quit IRC 17:01
*** k_mouza has joined #heat 17:15
*** k_mouza has quit IRC 17:19
*** k_mouza has joined #heat 17:35
*** k_mouza has quit IRC 17:39
*** k_mouza has joined #heat 17:54
*** k_mouza has quit IRC 17:58
*** k_mouza has joined #heat 18:10
*** k_mouza has quit IRC 18:14
*** vishalmanchanda has quit IRC 18:29
*** k_mouza has joined #heat 18:31
*** k_mouza has quit IRC 18:39
*** k_mouza has joined #heat 19:22
*** k_mouza has quit IRC 19:27
*** k_mouza has joined #heat 19:33
*** k_mouza has quit IRC 19:37
*** k_mouza has joined #heat 19:42
*** k_mouza has quit IRC 19:47
*** k_mouza has joined #heat 20:05
*** k_mouza has quit IRC 20:09
*** rcernin_ has joined #heat 22:36
*** rcernin_ has quit IRC 22:48
*** rcernin has joined #heat 22:48
*** tkajinam has joined #heat 22:53
*** cliffparsons has quit IRC 23:13
*** tosky has quit IRC 23:21
*** ramishra has quit IRC 23:52
