Thursday, 2018-10-18

*** bobh has quit IRC		01:00
*** rakhmerov has joined #openstack-mistral		03:16
*** akovi has joined #openstack-mistral		04:18
akovi	@vgvoleg: clearing out the cache everywhere may be the right solution. It's a cache anyway. However, the whole rerun feature seems to be very much off-course in this respect. Just asking: why are you rerunning stuff instead of creating a normal execution?	05:27
*** hardikjasani has joined #openstack-mistral		06:02
*** jtomasek has joined #openstack-mistral		06:18
openstackgerrit	Akhil jain proposed openstack/mistral master: Add framework for mistral-status upgrade check https://review.openstack.org/611513	06:47
rakhmerov	d0ugal: hi	07:16
rakhmerov	here?	07:16
*** shardy has joined #openstack-mistral		07:23
rakhmerov	d0ugal, akovi: hey, I just sent an email to openstack-dev about oslo messaging & "blocking" executor	07:23
rakhmerov	would be good if you could say a couple supporting words	07:24
*** openstackgerrit has quit IRC		07:35
*** apetrich has quit IRC		07:39
vgvoleg	akovi: e.g. service was unavailable and we perform rerun	07:46
akovi	doesn't sound like an idempotent solution :(	07:46
akovi	Still, I think it should be the WF that takes care of not running actions that need not be running	07:47
akovi	The whole concept of rerun with overwriting history is not a particularly good design decision	07:48
vgvoleg	ok, I got you, rerun == deprecated feature	07:52
vgvoleg	But the problem is that it isn't deprecated and we found some mistral problems using it	07:53
vgvoleg	My question was about how to get rid of them	07:55
vgvoleg	Not about advantages or disadvantages of rerun feature	07:56
vgvoleg	For now, caches may cause some problems in case of using multiple engine instances	07:57
vgvoleg	I am not sure if I understand this issue right or not	08:00
*** apetrich has joined #openstack-mistral		08:00
d0ugal	rakhmerov: sure, I'll take a look shortly	08:14
d0ugal	I need to go out briefly this morning tho	08:14
akovi	yes, so the cache is building on the assumption that tasks in final states will not change	08:18
*** shardy has quit IRC		08:18
therve	rakhmerov: What doesn't work with pymysql and eventlet?	08:18
akovi	rerun simply violates this	08:18
akovi	one solution would be your proposal, to blow the cache if rerun is executed	08:19
akovi	you will have to implement some enhancement that ensures all engines do this before beginning to deal with outdated tasks	08:20
akovi	I don't have an idea now how this should be implemented	08:21
akovi	synchronization between the engines exists only through the DB	08:21
akovi	but storing this in the DB would require making an additional check for the new state before every operation on the tasks	08:22
akovi	therve: we see that the engine process is getting overwhelmed by parallel tasks and starts failing towards external services	08:23
therve	akovi: Failing how?	08:24
akovi	therve: in our case RMQ heartbeats are missed and then the whole cluster crashes when the different mistral processes start reconnecting every few seconds	08:24
therve	That doesn't sound closely related to eventlet	08:25
therve	More a general concurrency issue	08:25
akovi	sure	08:25
akovi	with the blocking executor it happens less frequently and the service stays stable in the long run	08:26
*** shardy has joined #openstack-mistral		08:26
therve	I'd say that ought to be fixed by configuring the concurrency properly	08:27
akovi	can you help with this?	08:27
therve	Set executor_thread_pool_size for example	08:27
*** openstackgerrit has joined #openstack-mistral		09:15
openstackgerrit	Renat Akhmerov proposed openstack/mistral stable/rocky: Update OnClauseSPec task name criteria https://review.openstack.org/611550	09:15
rakhmerov	apetrich: hi, were you going to take this one? https://bugs.launchpad.net/mistral/+bug/1793651	09:17
openstack	Launchpad bug 1793651 in Mistral "Backwards compatibility issue: when starting a workflow "params" can't be null now" [High,Confirmed]	09:17
rakhmerov	I may be mistaken	09:17
rakhmerov	just want to confirm	09:18
*** gkadam has joined #openstack-mistral		09:21
*** gkadam has quit IRC		09:21
rakhmerov	akovi: as far as the cache, it simply needs to be cleared out on rerun	09:24
rakhmerov	it's safe to do	09:25
rakhmerov	btw, I found an issue with that cache recently, will soon be pushed upstream	09:25
rakhmerov	therve: Andras described his issues with rmq, I've also seen some other things. For example, we have a mechanism of DB-based locks, built on transactional properties, where a thread should block till it gets a lock in DB	09:28
rakhmerov	and it doesn't work	09:28
rakhmerov	it's not blocked and moves forward	09:28
rakhmerov	therve: it's still very early feedback though and I keep investigating	09:28
therve	rakhmerov: OK. That sounds fixable, but I don't know how hard it is obviously	09:43
rakhmerov	therve: you mean fixable what?	09:44
rakhmerov	not removing "blocking" executor for now?	09:45
therve	rakhmerov: Your db locking mechanism	09:45
rakhmerov	aah	09:45
rakhmerov	it worked but without eventlet	09:45
rakhmerov	that's the whole point	09:45
rakhmerov	if I run two processes and simulate a situation when one process (a thread within it) needs to block then it blocks	09:46
rakhmerov	but yeah... we will see	09:46
therve	Yeah that worked because you had no concurrency	09:47
therve	That helps locking mechanism in general :)	09:47
rakhmerov	therve: no-no	09:56
rakhmerov	I may have misunderstood	09:56
rakhmerov	it worked with different processes	09:57
rakhmerov	a thread in one process would block until a thread in the other one released it	09:57
rakhmerov	it'd be totally fine if eventlet dispatched to another green thread in this situation	09:58
rakhmerov	but it seems to let the same thread move forward	09:58
apetrich	rakhmerov, I wasn't but I probably can	10:13
rakhmerov	apetrich: ooh, ok )	10:17
rakhmerov	so, if you could look.. :)	10:17
rakhmerov	should be very simple	10:17
openstackgerrit	Akhil jain proposed openstack/mistral master: Add framework for mistral-status upgrade check https://review.openstack.org/611513	10:17
apetrich	rakhmerov, aye	10:18
apetrich	assigning to me.	10:18
rakhmerov	thanks!	10:19
akovi	rakhmerov: yes, the cache should be cleared; the issue is how do you do this on all engine instances?	10:44
rakhmerov	akovi: ooh... right	10:44
rakhmerov	got your point now	10:44
rakhmerov	I thought about having a TTL cache	10:45
rakhmerov	it's not 100% reliable although if set to a reasonable value it would help	10:45
akovi	unfortunately, a ttl would not be deterministic	10:48
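The TTL idea discussed above can be sketched roughly as follows. This is a minimal, standalone illustration and not Mistral's actual cache: the `TTLCache` class, its methods, and the injectable `clock` parameter are all hypothetical names introduced here. It assumes a per-process cache, so each engine would hold its own copy.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Hypothetical in-process cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be simulated
        self._items: Dict[Hashable, Tuple[float, Any]] = {}

    def put(self, key: Hashable, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry is too old: drop it so the caller reloads from the DB.
            del self._items[key]
            return default
        return value

    def clear(self) -> None:
        # Local-only invalidation, e.g. on the engine that handles a rerun.
        self._items.clear()
```

Under these assumptions, an engine that did not handle the rerun could keep serving a stale entry for up to `ttl` seconds, which is why a TTL narrows the window without making the result deterministic.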
openstackgerrit	Andras Kovi proposed openstack/mistral master: Fix state change propagation in workflows https://review.openstack.org/607960	11:00
*** thrash|g0ne is now known as thrash		11:44
*** apetrich has quit IRC		12:21
*** apetrich has joined #openstack-mistral		12:33
*** gkadam has joined #openstack-mistral		12:34
openstackgerrit	Renat Akhmerov proposed openstack/mistral master: WIP: improving join https://review.openstack.org/610461	12:39
d0ugal	rakhmerov: I thought the oslo team had agreed to keep the blocking executor?	12:52
d0ugal	has there been a recent change and a move to remove it again?	12:53
*** bobh has joined #openstack-mistral		13:01
openstackgerrit	Merged openstack/mistral master: Reduce the concurrency in the 500 wb join Rally task https://review.openstack.org/608910	13:10
*** jrist has quit IRC		13:11
*** jrist has joined #openstack-mistral		13:13
*** akovi has quit IRC		13:28
*** hardikjasani has quit IRC		13:38
*** thrash is now known as thrash|biab		13:56
*** gkadam has quit IRC		13:59
*** gkadam has joined #openstack-mistral		13:59
*** apetrich has quit IRC		14:11
*** thrash|biab is now known as thrash		14:16
*** apetrich has joined #openstack-mistral		14:20
*** gkadam_ has joined #openstack-mistral		14:33
*** gkadam has quit IRC		14:35
*** gkadam_ has quit IRC		15:38
*** shardy has quit IRC		16:43
*** apetrich has quit IRC		18:10
*** apetrich has joined #openstack-mistral		18:24
*** bobh has quit IRC		19:17
*** apetrich has quit IRC		19:44
*** openstackgerrit has quit IRC		20:36
*** bobh has joined #openstack-mistral		22:37
*** bobh has quit IRC		22:41
