*** bobh has quit IRC | 01:00 | |
*** rakhmerov has joined #openstack-mistral | 03:16 | |
*** akovi has joined #openstack-mistral | 04:18 | |
akovi | @vgvoleg: clearing out the cache everywhere may be the right solution. It's a cache anyway. However, the whole rerun feature seems to be very much off-course in this respect. Just asking: why is it so that you are rerunning stuff instead of creating a normal execution? | 05:27 |
---|---|---|
*** hardikjasani has joined #openstack-mistral | 06:02 | |
*** jtomasek has joined #openstack-mistral | 06:18 | |
openstackgerrit | Akhil jain proposed openstack/mistral master: Add framework for mistral-status upgrade check https://review.openstack.org/611513 | 06:47 |
rakhmerov | d0ugal: hi | 07:16 |
rakhmerov | here? | 07:16 |
*** shardy has joined #openstack-mistral | 07:23 | |
rakhmerov | d0ugal, akovi: hey, I just sent an email to openstack-dev about oslo messaging & "blocking" executor | 07:23 |
rakhmerov | would be good if you could say a couple supporting words | 07:24 |
*** openstackgerrit has quit IRC | 07:35 | |
*** apetrich has quit IRC | 07:39 | |
vgvoleg | akovi: e.g. service was unavailable and we perform rerun | 07:46 |
akovi | doesn't sound like an idempotent solution :( | 07:46 |
akovi | Still, I think it should be the WF that takes care of not running actions that need not be running | 07:47 |
akovi | The whole concept of rerun with overwriting history is not a particularly good design decision | 07:48 |
vgvoleg | ok, I got you, rerun == deprecated feature | 07:52 |
vgvoleg | But the problem is that it isn't deprecated and we found some mistral problems using it | 07:53 |
vgvoleg | My question was about how to get rid of them | 07:55 |
vgvoleg | Not about advantages or disadvantages of rerun feature | 07:56 |
vgvoleg | For now, caches may cause some problems in case of using multiple engine instances | 07:57 |
vgvoleg | I am not sure if I understand this issue right or not | 08:00 |
*** apetrich has joined #openstack-mistral | 08:00 | |
d0ugal | rakhmerov: sure, I'll take a look shortly | 08:14 |
d0ugal | I need to go out briefly this morning tho | 08:14 |
akovi | yes, so the cache is building on the assumption that tasks in final states will not change | 08:18 |
*** shardy has quit IRC | 08:18 | |
therve | rakhmerov: What doesn't work with pymysql and eventlet? | 08:18 |
akovi | rerun simply violates this | 08:18 |
akovi | one solution would be your proposal, to blow the cache if rerun is executed | 08:19 |
akovi | you will have to implement some enhancement that ensures all engines do this before beginning to deal with outdated tasks | 08:20 |
akovi | I don't have an idea now how this should be implemented | 08:21 |
akovi | synchronization between the engines exists only through the DB | 08:21 |
akovi | but storing this in the DB would require making an additional check for the new state before every operation on the tasks | 08:22 |
akovi | therve: we see that the engine process is getting overwhelmed by parallel tasks and starts failing towards external services | 08:23 |
therve | akovi: Failing how? | 08:24 |
akovi | therve: in our case RMQ heartbeats are missed and then the whole cluster crashes when the different mistral processes start reconnecting every few seconds | 08:24 |
therve | That doesn't sound closely related to eventlet | 08:25 |
therve | More a general concurrency issue | 08:25 |
akovi | sure | 08:25 |
akovi | with the blocking executor it happens less frequently and the service stays stable in the long run | 08:26 |
*** shardy has joined #openstack-mistral | 08:26 | |
therve | I'd say that ought to be fixed by configuring the concurrency properly | 08:27 |
akovi | can you help with this? | 08:27 |
therve | Set executor_thread_pool_size for example | 08:27 |
*** openstackgerrit has joined #openstack-mistral | 09:15 | |
openstackgerrit | Renat Akhmerov proposed openstack/mistral stable/rocky: Update OnClauseSPec task name criteria https://review.openstack.org/611550 | 09:15 |
rakhmerov | apetrich: hi, were you going to take this one? https://bugs.launchpad.net/mistral/+bug/1793651 | 09:17 |
openstack | Launchpad bug 1793651 in Mistral "Backwards compatibility issue: when starting a workflow "params" can't be null now" [High,Confirmed] | 09:17 |
rakhmerov | I may be mistaken | 09:17 |
rakhmerov | just want to confirm | 09:18 |
*** gkadam has joined #openstack-mistral | 09:21 | |
*** gkadam has quit IRC | 09:21 | |
rakhmerov | akovi: as far as the cache, it simply needs to be cleared out on rerun | 09:24 |
rakhmerov | it's safe to do | 09:25 |
rakhmerov | btw, I found an issue with that cache recently, will soon be pushed upstream | 09:25 |
rakhmerov | therve: Andras described his issues with rmq, I've also seen some other things. For example, we have a mechanism of DB-based locks based on transactional properties where a thread should block till it gets a lock in DB | 09:28 |
rakhmerov | and it doesn't work | 09:28 |
rakhmerov | it's not blocked and moves forward | 09:28 |
rakhmerov | therve: it's still a very early feedback though and I keep investigating | 09:28 |
therve | rakhmerov: OK. That sounds fixable, but I don't know how hard it is obviously | 09:43 |
rakhmerov | therve: you mean fixable what? | 09:44 |
rakhmerov | not removing "blocking" executor for now? | 09:45 |
therve | rakhmerov: Your db locking mechanism | 09:45 |
rakhmerov | aah | 09:45 |
rakhmerov | it worked but without eventlet | 09:45 |
rakhmerov | that's the whole point | 09:45 |
rakhmerov | if I run two processes and simulate a situation when one process (a thread within it) needs to block then it blocks | 09:46 |
rakhmerov | but yeah... we will see | 09:46 |
therve | Yeah that worked because you had no concurrency | 09:47 |
therve | That helps locking mechanism in general :) | 09:47 |
rakhmerov | therve: no-no | 09:56 |
rakhmerov | I may have misunderstood | 09:56 |
rakhmerov | it worked with different processes | 09:57 |
rakhmerov | a thread in one process would block until a thread in the other one released it | 09:57 |
rakhmerov | it'd be totally fine if eventlet dispatched to another green thread in this situation | 09:58 |
rakhmerov | but it seems to let the same thread move forward | 09:58 |
apetrich | rakhmerov, I wasn't but I probably can | 10:13 |
rakhmerov | apetrich: ooh, ok ) | 10:17 |
rakhmerov | so, if you could look.. :) | 10:17 |
rakhmerov | should be very simple | 10:17 |
openstackgerrit | Akhil jain proposed openstack/mistral master: Add framework for mistral-status upgrade check https://review.openstack.org/611513 | 10:17 |
apetrich | rakhmerov, aye | 10:18 |
apetrich | assigning to me. | 10:18 |
rakhmerov | thanks! | 10:19 |
akovi | rakhmerov: yes, the cache should be cleared; the issue is how do you do this on all engine instances? | 10:44 |
rakhmerov | akovi: ooh... right | 10:44 |
rakhmerov | got your point now | 10:44 |
rakhmerov | I thought about having a TTL cache | 10:45 |
rakhmerov | it's not 100% reliable although if set to a reasonable value it would help | 10:45 |
akovi | unfortunately, a ttl would not be deterministic | 10:48 |
openstackgerrit | Andras Kovi proposed openstack/mistral master: Fix state change propagation in workflows https://review.openstack.org/607960 | 11:00 |
*** thrash|g0ne is now known as thrash | 11:44 | |
*** apetrich has quit IRC | 12:21 | |
*** apetrich has joined #openstack-mistral | 12:33 | |
*** gkadam has joined #openstack-mistral | 12:34 | |
openstackgerrit | Renat Akhmerov proposed openstack/mistral master: WIP: improving join https://review.openstack.org/610461 | 12:39 |
d0ugal | rakhmerov: I thought the oslo team had agreed to keep the blocking executor? | 12:52 |
d0ugal | has there been a recent change and a move to remove it again? | 12:53 |
*** bobh has joined #openstack-mistral | 13:01 | |
openstackgerrit | Merged openstack/mistral master: Reduce the concurrency in the 500 wb join Rally task https://review.openstack.org/608910 | 13:10 |
*** jrist has quit IRC | 13:11 | |
*** jrist has joined #openstack-mistral | 13:13 | |
*** akovi has quit IRC | 13:28 | |
*** hardikjasani has quit IRC | 13:38 | |
*** thrash is now known as thrash|biab | 13:56 | |
*** gkadam has quit IRC | 13:59 | |
*** gkadam has joined #openstack-mistral | 13:59 | |
*** apetrich has quit IRC | 14:11 | |
*** thrash|biab is now known as thrash | 14:16 | |
*** apetrich has joined #openstack-mistral | 14:20 | |
*** gkadam_ has joined #openstack-mistral | 14:33 | |
*** gkadam has quit IRC | 14:35 | |
*** gkadam_ has quit IRC | 15:38 | |
*** shardy has quit IRC | 16:43 | |
*** apetrich has quit IRC | 18:10 | |
*** apetrich has joined #openstack-mistral | 18:24 | |
*** bobh has quit IRC | 19:17 | |
*** apetrich has quit IRC | 19:44 | |
*** openstackgerrit has quit IRC | 20:36 | |
*** bobh has joined #openstack-mistral | 22:37 | |
*** bobh has quit IRC | 22:41 |
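
The cache exchange between akovi and rakhmerov (clear the cache on rerun, versus a TTL cache that is "not deterministic") can be illustrated with a minimal sketch. This is not Mistral's actual cache: `TTLTaskCache`, `FakeClock`, the task id, the state string and the 30-second TTL are all hypothetical names and values chosen for illustration. It only shows why clearing one engine's cache does not help another engine, which keeps serving the old state until its TTL runs out.

```python
import time


class TTLTaskCache:
    """Hypothetical per-engine cache of task states with a time-to-live."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the example is deterministic
        self._entries = {}       # task_id -> (state, stored_at)

    def put(self, task_id, state):
        self._entries[task_id] = (state, self._clock())

    def get(self, task_id):
        entry = self._entries.get(task_id)
        if entry is None:
            return None
        state, stored_at = entry
        if self._clock() - stored_at >= self._ttl:
            # Entry is too old: drop it so the caller re-reads from the DB.
            del self._entries[task_id]
            return None
        return state

    def clear(self):
        """What an engine would do locally when it handles a rerun."""
        self._entries.clear()


class FakeClock:
    """Manually advanced clock, used instead of sleeping."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
engine_a = TTLTaskCache(ttl_seconds=30, clock=clock)
engine_b = TTLTaskCache(ttl_seconds=30, clock=clock)

# Both engines cached the same task in a final state.
engine_a.put("task-1", "SUCCESS")
engine_b.put("task-1", "SUCCESS")

# Engine A handles the rerun and clears only its own cache.
engine_a.clear()

stale = engine_b.get("task-1")    # engine B still sees the old state
clock.advance(31)
expired = engine_b.get("task-1")  # only after the TTL does it fall through
```

In this sketch the window in which engine B can act on the outdated state is bounded by the TTL but not eliminated, which matches the objection raised in the channel; removing it entirely would need some cross-engine signal, and the log leaves open how that should be done.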