08:00:43 <rakhmerov> #startmeeting Mistral
08:00:44 <openstack> Meeting started Wed Sep 11 08:00:43 2019 UTC and is due to finish in 60 minutes.  The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:00:45 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:00:46 <rakhmerov> hi all
08:00:47 <openstack> The meeting name has been set to 'mistral'
08:00:51 <vgvoleg> hi!
08:01:03 <rakhmerov> if there's anyone here for the meeting reveal yourself! )
08:01:08 <rakhmerov> vgvoleg: hi
08:01:40 <rakhmerov> eyalb: ^
08:02:16 <vgvoleg> I'd like to discuss something
08:02:36 <rakhmerov> guys, I have to step away urgently for 20-30 mins. Oleg, please start writing your topics/questions, I'll join later
08:02:48 <vgvoleg> First of all, I've found that notifier base is not moved to mistral_lib
08:03:08 <vgvoleg> so to write a custom publisher we have to import mistral
08:03:57 <vgvoleg> I guess that it is not OK and I'm going to move it
08:04:03 <vgvoleg> Nod if you agree :D
08:04:51 <rakhmerov> Nodding.. )
08:04:57 <vgvoleg> Secondly, I've found that PUT operations in our api are not safe at all
08:05:06 <rakhmerov> ?
08:06:12 <vgvoleg> We can break anything if we send multiple put requests to one execution
08:06:22 <vgvoleg> no check mechanisms
08:06:32 <vgvoleg> no locks
08:07:48 <vgvoleg> we can send contradictory commands to engine at the same time
08:07:55 <vgvoleg> it is not ok
08:08:29 <vgvoleg> And, IMO, it should be fixed on the api side
08:08:50 <vgvoleg> But I don't know how to do it nicely
08:09:33 <vgvoleg> And the third topic I'd like to discuss is our ERROR state
08:10:17 <vgvoleg> in our state machine, SUCCESS is truly terminal, we can't do anything with execution if it was completed successfully
08:10:40 <vgvoleg> but ERROR is not truly terminal - we can rerun it, for example
08:11:44 <vgvoleg> and I think that it is a gap, that we don't have truly terminal state to indicate error
08:12:44 <vgvoleg> Something that moves ERROR execution to read-only
08:13:21 <vgvoleg> Something that means "OK, we don't care that it is failed and we are not going to do anything with it"
08:14:40 <vgvoleg> I think we can use CANCELLED state for it, but current implementation does not support this transition
08:15:24 <vgvoleg> maybe we should add an additional state for this
08:15:34 <openstackgerrit> yatin proposed openstack/mistral master: moved generic util functions from mistral to mistral-lib  https://review.opendev.org/676373
08:15:44 <vgvoleg> I dont know tbh
08:16:03 <vgvoleg> So I'll be glad to hear your opinion
08:22:06 <rakhmerov> I'm here
08:22:29 <rakhmerov> reading..
08:22:51 <rakhmerov> vgvoleg: on your 2nd thing, can you give an example?
08:24:31 <vgvoleg> rakhmerov: sure, we can send two PUT requests to /v2/executions, the first request should pause execution, the second one should cancel it
08:24:32 <rakhmerov> on #3 I don't see why it is a real problem. It's all kind of relative, terminal or non-terminal. ERROR is really considered terminal from perspective of the running workflow
08:24:54 <rakhmerov> rerun mechanism is not a regular part of the execution process
08:25:13 <rakhmerov> vgvoleg: so what can happen?
08:25:16 <vgvoleg> and API will send two commands to the engine
08:25:22 <vgvoleg> which is not ok
08:25:22 <rakhmerov> so?
08:25:29 <rakhmerov> what bad is going to happen?
08:26:04 <rakhmerov> what's going to break?
08:26:48 <vgvoleg> it's a dice roll
08:26:55 <rakhmerov> why?
08:27:08 <rakhmerov> Oleg, it is something that a user can legally do
08:27:16 <rakhmerov> it's out of our control
08:27:25 <rakhmerov> yes, they can do it virtually simultaneously
08:27:34 <rakhmerov> but what will be broken in Mistral?
08:28:02 <vgvoleg> ok, I dont have the concrete example right now :D
08:28:27 <rakhmerov> if the CANCEL request comes first it will win, the PAUSE request will fail because it'll see that the execution is not in a proper state
08:29:01 <rakhmerov> if we have the opposite order I don't see any issues either
08:29:19 <rakhmerov> as far as I remember we can legally cancel workflows that are in PAUSE state
08:29:39 <vgvoleg> ok ok ok I'll find a bug for sure
08:30:06 <vgvoleg> about #3
08:30:24 <rakhmerov> remember that in both cases it will be one DB TX
08:30:48 <vgvoleg> we should strictly separate terminal states and not-terminal states
08:30:57 <rakhmerov> it will either fail w/o changing anything in DB or will succeed and not let the other one make changes
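
Note: the argument above (each request checks the current state and updates it within one DB transaction, so the loser fails without changing anything) can be sketched roughly as follows. This is a minimal illustration, not Mistral code: the `Execution` class, the `ALLOWED` transition table, and `apply_in_order` are hypothetical names, and a `threading.Lock` stands in for the database transaction.

```python
import threading

# Hypothetical transition table for illustration only; not Mistral's actual one.
ALLOWED = {
    "RUNNING": {"PAUSED", "CANCELLED"},
    "PAUSED": {"RUNNING", "CANCELLED"},
    "CANCELLED": set(),
}


class Execution:
    """Toy execution object; the lock stands in for a single DB transaction."""

    def __init__(self, state="RUNNING"):
        self.state = state
        self._lock = threading.Lock()

    def set_state(self, new_state):
        """Check and update atomically; return True if the change was applied."""
        with self._lock:
            if new_state not in ALLOWED[self.state]:
                # Rejected: the stored state is left untouched.
                return False
            self.state = new_state
            return True


def apply_in_order(requests):
    """Apply state-change requests in the given order to a fresh execution."""
    ex = Execution()
    results = [ex.set_state(state) for state in requests]
    return results, ex.state
```

Under these assumptions, `apply_in_order(["CANCELLED", "PAUSED"])` rejects the pause, while `apply_in_order(["PAUSED", "CANCELLED"])` accepts both requests; in either order the execution ends up CANCELLED.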
08:31:15 <rakhmerov> vgvoleg: what does "strictly" mean here?
08:31:18 <vgvoleg> since we have rerun mechanism, ERROR is not terminal state
08:31:42 <rakhmerov> it is a terminal state from perspective of the certain scenario
08:31:44 <vgvoleg> so we can't work with it like it is a read-only
08:31:54 <vgvoleg> we can't cache it and so on
08:32:06 <rakhmerov> ok, what's the practical task you're trying to solve? )
08:32:27 <rakhmerov> we can but we need to do a better job when caching
08:32:31 <rakhmerov> invalidating etc.
08:33:04 <vgvoleg> I want to have a state that means ERROR, but will be read only
08:33:45 <rakhmerov> what will be the difference?
08:33:50 <rakhmerov> from the regular ERROR?
08:33:51 <vgvoleg> so that I’m sure that it will not change
08:33:54 <rakhmerov> I don't understand
08:34:07 <vgvoleg> the current ERROR execution could be changed
08:34:18 <rakhmerov> what's the point? Why can't we rerun a workflow in that state too?
08:34:55 <vgvoleg> because it is read only
08:35:07 <rakhmerov> I mean logically what will be the difference?
08:35:08 <vgvoleg> it is a terminal state, that means it will not be changed
08:35:30 <rakhmerov> how are we going to explain to users why we can rerun one ERROR state and can't rerun another kind of ERROR state?
08:36:19 <vgvoleg> so there will be a transition (something like) TEMPORARY ERROR -> ERROR
08:36:44 <vgvoleg> and this will be human-initiated operation
08:37:04 <rakhmerov> nope
08:37:22 <rakhmerov> I fail to understand this..
08:37:32 <vgvoleg> Renat, right now we call the ERROR state terminal
08:37:44 <vgvoleg> but it is not terminal state
08:37:51 <rakhmerov> so all workflows have to be moved to that state only if a human says to do so?
08:38:07 <vgvoleg> so I want to have the FINAL_ERROR state
08:38:10 <rakhmerov> Oleg, again: in a certain scenario it is a terminal state
08:38:32 <rakhmerov> caching is a completely different problem
08:38:43 <rakhmerov> it's an implementation issue that we need to solve
08:38:44 <vgvoleg> caching was my stupid example
08:38:54 <rakhmerov> w/o letting users know about it
08:39:02 <vgvoleg> of use cases of read only objects
08:39:27 <vgvoleg> I was talking about external caching
08:39:30 <vgvoleg> not in Mistral
08:39:37 <rakhmerov> I don't see any point in having one more state for ERROR, really. What would be an explanation for users?
08:40:20 <rakhmerov> imagine someone coming here and asking "Guys, why did you add one more state? I lived just fine w/o it."
08:40:29 <rakhmerov> what are you going to answer?
08:40:38 <vgvoleg> Because we want to be honest with our clients
08:40:43 <vgvoleg> ERROR is not terminal
08:40:45 <rakhmerov> "The new state is terminal, the old one is not" ?
08:40:49 <vgvoleg> yes
08:40:52 <vgvoleg> :)
08:41:08 <rakhmerov> vgvoleg: Oleg, lots of clients don't care at all whether something is terminal or not :)
08:41:28 <vgvoleg> because right now we say that ERROR is a terminal state
08:41:48 <rakhmerov> in a certain (most common) scenario it's 100% true
08:42:35 <vgvoleg> yes
08:42:36 <vgvoleg> sure
08:42:52 <rakhmerov> let's not bother users with such mathematical kind of terms at all. It's not what they care about
08:43:11 <vgvoleg> until people find out that mistral has an amazing feature like rerun
08:43:46 <rakhmerov> ERROR is a truly terminal state meaning that a system doesn't have an automatic algorithm that can change this state to something else
08:43:59 <rakhmerov> only a human reasonably can
08:44:34 <rakhmerov> but in this case they know what they are doing and they don't care if it's not considered globally terminal anymore
08:44:44 <rakhmerov> because it's their decision to change it
08:44:59 <rakhmerov> but if we're talking about automatic processing it's 100% terminal
08:45:28 <vgvoleg> while we say that this is a terminal state, users can assume that such a state does not change, they can build their logic around this
08:46:59 <rakhmerov> again: they make a decision to change this state themselves :)
08:47:14 <rakhmerov> they know that it can change
08:47:17 <vgvoleg> ok I got what you mean
08:47:18 <rakhmerov> but only if they want to
08:47:41 <rakhmerov> automatically it can never ever change to something else
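
Note: the two readings of "terminal" in this discussion can be made concrete with a small sketch. It assumes a simplified state set and hypothetical names (`AUTOMATIC`, `MANUAL`, `is_terminal`); it is not how Mistral defines its states. A state is terminal for automatic processing if the system itself never leaves it, and strictly terminal only if a human-initiated operation such as rerun cannot leave it either.

```python
# Simplified, hypothetical state model for illustration; not Mistral's definitions.
# Transitions the system may perform on its own:
AUTOMATIC = {
    "RUNNING": {"SUCCESS", "ERROR"},
    "SUCCESS": set(),
    "ERROR": set(),
}

# Transitions that only a human-initiated operation (e.g. rerun) may perform:
MANUAL = {
    "ERROR": {"RUNNING"},
}


def is_terminal(state, include_manual=False):
    """Return True if no transition leaves `state`.

    With include_manual=False only automatic transitions are considered;
    with include_manual=True human-initiated transitions count as well.
    """
    if AUTOMATIC.get(state):
        return False
    if include_manual and MANUAL.get(state):
        return False
    return True
```

With this model, `is_terminal("ERROR")` holds while `is_terminal("ERROR", include_manual=True)` does not, and SUCCESS is terminal under both readings, which is the asymmetry being debated.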
08:49:54 <vgvoleg> probably I'll return with this discussion when there will be more people :D
08:50:01 <vgvoleg> that's all, thank you!
08:51:30 <rakhmerov> ok :)
08:51:39 <boxiang> rakhmerov: https://review.opendev.org/#/c/680858/ can not fix my issue.
08:55:04 <rakhmerov> ok, let's wrap for now
08:55:08 <rakhmerov> #endmeeting