08:06:19 <rakhmerov> #startmeeting Mistral
08:06:20 <openstack> Meeting started Wed Sep 4 08:06:19 2019 UTC and is due to finish in 60 minutes. The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
08:06:21 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
08:06:23 <rakhmerov> vgvoleg: ok
08:06:24 <openstack> The meeting name has been set to 'mistral'
08:06:25 <rakhmerov> go ahead
08:06:57 <rakhmerov> eyalb, abdelal, apetrich: ^
08:07:00 <vgvoleg> First of all, this one https://blueprints.launchpad.net/mistral/+spec/mistral-task-skipping-feature
08:07:09 <apetrich> o/
08:07:20 <eyalb> o/
08:07:40 <vgvoleg> It's already done in our fork, but I didn't see any reaction here :(
08:07:48 <abdelal> o/
08:07:54 <rakhmerov> vgvoleg: ok, let me read (again)
08:08:19 <vgvoleg> I've done changes in cloudflow too to support this
08:09:07 <vgvoleg> Just tell me that it would be useful and I'll push it :)
08:10:29 <vgvoleg> https://sun9-12.userapi.com/c854220/v854220580/df9c0/j5PU_qZKAi8.jpg
08:10:32 <vgvoleg> something like this
08:11:41 <abdelal> a question regarding that, if t2 published x1, how will you pass it to t3 since t2 is skipped?
08:12:22 <vgvoleg> skipped task also has publish section
08:12:30 <vgvoleg> it works the same way
08:12:39 <abdelal> it will publish even if it was skipped?
08:12:47 <vgvoleg> oh wait
08:12:56 <vgvoleg> Maybe I understand you wrong
08:13:18 <vgvoleg> there is 'publish-on-skip' section
08:13:28 <rakhmerov> no-no, I think we can't publish anything using the regular "publish" if the state is not SUCCESS
08:13:51 <vgvoleg> where you can, for example, mock data
08:13:58 <rakhmerov> vgvoleg: are you aware of different syntax that you can use to publish vars?
08:14:13 <vgvoleg> It is OK with publish-on-error
08:14:15 <rakhmerov> you can have publish under "on-success", "on-error", etc.
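For reference, a minimal sketch of the nested form mentioned in the last message, where "publish" sits under an "on-..." clause instead of at task level. The workflow, task, and variable names (wf, task1, task2, report_errors, result_var, shared_var, error_info) are hypothetical, and the layout should be checked against the Mistral workflow language reference before relying on it.

```yaml
version: '2.0'

wf:
  tasks:
    task1:
      action: std.noop
      on-success:
        publish:
          # "branch" scope: visible to tasks downstream in this branch
          branch:
            result_var: "published only when task1 ends in SUCCESS"
          # "global" scope: visible across the whole workflow execution
          global:
            shared_var: "also published only on SUCCESS"
        next:
          - task2
      on-error:
        publish:
          branch:
            error_info: "published only when task1 ends in ERROR"
        next:
          - report_errors

    task2:
      action: std.noop

    report_errors:
      action: std.noop
```

Each "on-..." clause carries its own "publish", so what gets published depends on the state the task ends in.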
08:14:33 <vgvoleg> I don't see any differences with publish-on-skip
08:14:50 <rakhmerov> vgvoleg: I think we shouldn't mix these things
08:15:05 <vgvoleg> we didn't mix them
08:15:14 <vgvoleg> it is another publish section
08:15:15 <rakhmerov> different states => different language keywords
08:15:30 <vgvoleg> if task is skipped, no 'publish' will be published
08:15:31 <vgvoleg> :)
08:15:45 <rakhmerov> vgvoleg: that's right
08:16:06 <rakhmerov> vgvoleg: regular "publish" and "publish-on-error" will be deprecated soon, I think
08:16:26 <rakhmerov> we'll be encouraging people to use "publish" under "on-..."
08:16:47 <rakhmerov> where you can define scopes (currently "branch" and "global")
08:16:49 <vgvoleg> the only thing I'm not sure about is what we should do if a task doesn't have an 'on-skip' branch
08:17:04 <rakhmerov> I'm thinking maybe we shouldn't even introduce this "publish-on-skip"
08:17:50 <rakhmerov> vgvoleg: so, it's still a bit confusing to me..
08:17:52 <vgvoleg> In our fork, if we skip a task with no 'on-skip' section, we use 'on-success'
08:18:11 <rakhmerov> vgvoleg: no, I disagree with this approach
08:18:21 <rakhmerov> it should be a totally separate things
08:18:25 <rakhmerov> thing
08:18:57 <rakhmerov> vgvoleg: can you express with one phrase why this functionality is needed? :)
08:18:59 <rakhmerov> this whole thing
08:19:15 <rakhmerov> I'm still struggling with the idea I guess
08:19:32 <rakhmerov> apetrich, eyalb, abdelal: maybe you have some thoughts
08:19:50 <rakhmerov> vgvoleg: so we do skipping if what?
08:20:08 <rakhmerov> on an external failure?
08:20:25 <vgvoleg> If the flow execution is too long, it will be great to have an opportunity to skip a failed task in the tail of the flow
08:20:29 <apetrich> sorry, I'm trying to understand it too.
08:20:44 <vgvoleg> if there is an external failure
08:20:55 <vgvoleg> so retry will not help
08:21:20 <vgvoleg> So it is all about manual control of the flow
08:21:31 <vgvoleg> like rerun, cancel, pause
08:22:13 <openstackgerrit> Eyal proposed openstack/mistral master: Add a cookiecutter template to generate custom stuff https://review.opendev.org/679782
08:22:27 <vgvoleg> If we see that the result of the task is not so important right now, we can skip this task and continue the execution of the flow
08:22:50 <rakhmerov> ok
08:23:36 <akovi> hi All!
08:23:46 <rakhmerov> akovi: hi Andras
08:23:59 <rakhmerov> we're discussing https://blueprints.launchpad.net/mistral/+spec/mistral-task-skipping-feature
08:23:59 <vgvoleg> and we should be sure that the execution will not break
08:24:28 <vgvoleg> so we provide an opportunity to 'mock' any data in 'publish-on-skip'
08:24:34 <rakhmerov> vgvoleg: what's the problem if it breaks?
08:24:50 <rakhmerov> it will just have status "ERROR" but it will do all the work
08:25:04 <vgvoleg> in case we need something from 'publish' further on
08:25:17 <rakhmerov> maybe you just need some mechanisms (in your system) to interpret the results of a failed workflow correctly?
08:25:47 <vgvoleg> We can't anticipate every situation
08:25:57 <rakhmerov> vgvoleg: again, as for "publish-on-skip", I'm not sure we need this at all
08:26:53 <rakhmerov> but maybe for the sake of symmetry, we need to do both options: "publish-on-skip" and the totally new clause "on-skip" that can have "publish" inside just like for other "on-xxx" things
08:27:22 <vgvoleg> I'm not sure if I understand you right
08:27:29 <rakhmerov> ok
08:27:34 <vgvoleg> you're just talking about global problems
08:27:40 <rakhmerov> no
08:27:43 <vgvoleg> not about problems of this feature?
08:27:50 <rakhmerov> I'm talking about the new syntax for "publish"
08:28:00 <rakhmerov> no-no, it's related
08:28:04 <rakhmerov> that's why I touched on it
08:28:06 <akovi> so, to clarify: the basic idea is to let a task be skipped before rerunning it. Right?
08:28:11 <vgvoleg> Ok I got it
08:28:22 <rakhmerov> akovi: even after running it
08:28:27 <rakhmerov> vgvoleg: is that right?
08:28:47 <vgvoleg> akovi: yes, we skip the task and rerun the workflow
08:29:00 <akovi> the execution fails, we skip the task and rerun
08:29:05 <akovi> ok
08:29:05 <rakhmerov> yep
08:29:10 <vgvoleg> Consider the situation: in the flow there is a task that requests any data from a third-party service while it is dead. Retries will not help in this situation, the task will go to the ERROR terminal state and the workflow will finish its work. Starting the workflow from the very beginning can be very expensive - it could have been several days before
08:29:11 <vgvoleg> the fall. For such cases in the mistral there is a rerun mechanism - a certain decision maker determines whether circumstances have changed (whether third-party service has come to life), and if so, the workflow will continue its work from the fallen task.
08:29:24 <rakhmerov> because "retry" wouldn't make sense in many cases
08:29:32 <vgvoleg> In fact, the environment cannot always stabilize, and it can be very expensive to adapt the workflow to work with a new environment. Also it is not always possible to automatically assess the nature of the error that led to the fall. It can be something fatal, and maybe something insignificant, which in general does not affect the execution of the whole
08:29:32 <vgvoleg> workflow. The decision maker can assess how important the results of the current task are and continue the execution of the workflow if not important.
08:29:35 <vgvoleg> yes
08:29:42 <vgvoleg> it's from the blueprint :D
08:29:49 <rakhmerov> I figured )
08:29:50 <vgvoleg> all the arguments are there
08:30:03 <akovi> this is probably useful when wfs are executed under human surveillance
08:30:10 <vgvoleg> yes
08:30:21 <rakhmerov> vgvoleg: too many arguments, usually if we can't express an idea with 1 phrase then it's a bad idea )
08:30:37 <rakhmerov> akovi: well, yes, it's exactly about that
08:31:05 <rakhmerov> akovi: basically, that way we provide more ways for humans to influence workflow executions
08:32:13 <akovi> so my general stance on wfs is that if needed, they should be implemented in an idempotent way
08:32:15 <rakhmerov> so I guess, I'm not against it if 1) it's 100% backwards compatible (shouldn't be a problem here) 2) it's a totally separate feature that doesn't interfere with existing stuff
08:32:23 <akovi> however, it's freaking hard in many cases
08:32:34 <rakhmerov> by #2 I mean that it doesn't reuse "on-success" etc.
08:32:45 <rakhmerov> akovi: yeah..
08:33:29 <akovi> I think this feature is a useful one.
08:33:57 <vgvoleg> It would be very convenient to reuse 'on-success' if 'on-skip' is missing
08:34:23 <akovi> Unfortunately it will work in many cases only if the publish-on-skip clause is defined in the wf spec
08:34:36 <rakhmerov> vgvoleg: no, let's please not do this
08:34:45 <vgvoleg> In many cases we want to continue 'on-success' execution
08:34:48 <rakhmerov> well, on the other hand...
08:34:55 <vgvoleg> So we will have duplicates
08:34:57 <vgvoleg> in every task
08:35:09 <rakhmerov> I know you want it, but I'm not sure at all if other people will want it )
08:35:27 <rakhmerov> I want to make sure we have common sense here
08:35:31 <vgvoleg> akovi: yes, that's the main idea: if you want to use this feature, be sure that your flow is ready for this
08:36:01 <rakhmerov> vgvoleg: then you can have "on-skip" where needed
08:36:17 <akovi> what if we omit the publish-on-skip and substitute it with a noop task that just publishes the same values?
08:36:17 <rakhmerov> vgvoleg: I somewhat don't like the idea to reuse "on-success"
08:36:18 <vgvoleg> rakhmerov: it's not a problem if we have 'on-skip: t1, on-success: t1'
08:36:41 <rakhmerov> other people can say "we want to consider such tasks failed but w/o moving them to ERROR status" )
08:36:56 <vgvoleg> but if we have a long array with next tasks, duplicating them would be ugly
08:37:12 <rakhmerov> vgvoleg: why?
08:37:16 <rakhmerov> why ugly?
08:37:21 <rakhmerov> it's just about your case
08:37:45 <rakhmerov> but like I said, we're considering a completely different event that may happen in a workflow
08:38:02 <rakhmerov> and different people may treat it differently
08:38:25 <rakhmerov> that's why I want to make it more generic and not let it interfere with the existing mechanisms
08:38:36 <vgvoleg> well, I can write some docs... :D
08:38:46 <rakhmerov> docs for what?
08:38:52 <vgvoleg> For this feature
08:38:59 <akovi> wf:
08:38:59 <akovi>   task1:
08:38:59 <akovi>     action: some_custom_action
08:38:59 <akovi>     publish:
08:38:59 <akovi>       var1: <% task().result.var1 %>
08:39:00 <akovi>       var2: <% task().result.var2 %>
08:39:00 <akovi>       var3: <% task().result.var3 %>
08:39:01 <akovi>     on-success: task2
08:39:01 <akovi>     on-skip: task2
08:39:02 <akovi>   skipped-task1:
08:39:02 <akovi>     action: std.noop
08:39:03 <akovi>     publish:
08:39:03 <akovi>       var1: "var1"
08:39:04 <akovi>       var2: "var2"
08:39:16 <vgvoleg> If something is described in docs, it is legal
08:39:35 <rakhmerov> vgvoleg: let me put it this way: you may want to have lots of repeating tasks in "on-success" and in "on-error". But we don't say "it's ugly to repeat them"
08:39:42 <rakhmerov> because those are totally different cases
08:40:09 <rakhmerov> for that we actually have "on-complete" where we can move repeating stuff
08:40:44 <rakhmerov> vgvoleg: no-no, I can't accept that approach ("If something is described in docs, it is legal"), sorry
08:41:32 <rakhmerov> docs must not aim to explain why we came up with a bad design
08:41:46 <vgvoleg> akovi: how do you want to describe them?
08:42:17 <vgvoleg> if we run this flow the skipped task will be executed in parallel with task1 :D
08:42:31 <akovi> no
08:42:35 <akovi> I messed it up
08:42:55 <akovi> task1.on-skip = skipped-task1
08:43:15 <vgvoleg> oh
08:43:21 <akovi> this way there's no need for alternative publish methods
08:44:04 <vgvoleg> I can't understand why 'publish-on-error' is OK and 'publish-on-skip' is not OK
08:44:08 <akovi> where do we usually share copy-pastes? I forgot the name of the service
08:44:18 <eyalb> pastebin
08:44:26 <vgvoleg> I think that creating redundant instances for publish is not OK
08:44:49 <rakhmerov> redundant instances?
08:44:52 <rakhmerov> what's that?
08:45:04 <vgvoleg> noop task just for publish
08:45:12 <rakhmerov> guys, please let's be more accurate with terms
08:45:58 <rakhmerov> noop for publish... I fail to understand this
08:46:14 <vgvoleg> look at Andras' example
08:46:33 <rakhmerov> ok, yes...
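For readability, Andras' pasted example can be laid out as below with his correction (task1.on-skip = skipped-task1) applied. This is only a sketch: "on-skip" is the clause proposed in the blueprint, not syntax that existed in upstream Mistral at the time of this meeting; "some_custom_action" is the placeholder from the chat; and the "version" and "tasks" keys, the "on-success: task2" transition on skipped-task1, and the body of task2 are assumptions added to make the fragment complete.

```yaml
version: '2.0'

wf:
  tasks:
    task1:
      action: some_custom_action      # placeholder action name from the chat
      publish:
        var1: <% task().result.var1 %>
        var2: <% task().result.var2 %>
        var3: <% task().result.var3 %>
      on-success: task2
      on-skip: skipped-task1          # PROPOSED clause (blueprint), corrected target

    skipped-task1:
      action: std.noop                # does nothing except publish stand-in values
      publish:
        var1: "var1"
        var2: "var2"
      on-success: task2               # assumption: rejoin the normal path

    task2:
      action: std.noop                # hypothetical downstream task
```

With this layout the stand-in values come from an ordinary task, so no separate "publish-on-skip" clause is needed; the cost is the extra noop task that vgvoleg objects to.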
08:47:02 <rakhmerov> well, IMO it's not a problem to make a separate "publish-on-skip" thing
08:47:22 <rakhmerov> and "publish" under "on-skip"
08:47:32 <rakhmerov> it's easy
08:48:00 <rakhmerov> I just don't like the idea to reuse either "on-success" or something else that already exists to handle skipping
08:48:22 <rakhmerov> duplicates, in my opinion, are not a problem that we need to solve now
08:48:44 <vgvoleg> ok
08:48:45 <abdelal> the plan is to eventually remove publish and publish-on-error, right? I don't think it's wise to add publish-on-skip too
08:48:58 <rakhmerov> abdelal: yes!
08:49:10 <rakhmerov> abdelal: yes, right
08:49:32 <rakhmerov> I just thought that maybe we still need to add it, but just for the sake of symmetry with other clauses
08:49:40 <abdelal> so i think we should follow the current synyax we want, just have publish under on-skip if anything
08:49:53 <abdelal> syntax*
08:49:56 <rakhmerov> that's for sure, yes
08:50:02 <vgvoleg> guys I'm OK with these changes but please don't mix them into the feature discussion
08:50:05 <eyalb> I agree
08:50:33 <rakhmerov> I'm just saying that the language should always be symmetric around similarities that we're adding
08:50:51 <rakhmerov> vgvoleg: :)
08:51:29 <rakhmerov> so, if there are no serious objections then I'd say "go ahead and push a patch"
08:51:31 <abdelal> i think this feature is useful overall, but also as renat said, it should be as generic as possible and be aligned with the current syntax
08:51:46 <rakhmerov> these technical nuances are something that we'll polish a bit later
08:51:47 <rakhmerov> not now
08:51:57 <rakhmerov> yes
08:51:58 <akovi> ok, great
08:52:12 <rakhmerov> vgvoleg: sounds OK?
08:52:23 <vgvoleg> sure :)
08:52:24 <akovi> so on-skip.publish? or publish.on-skip?
08:52:33 <rakhmerov> once we see your patch we may notice something else to discuss
08:53:15 <rakhmerov> akovi: I'm for adding "publish-on-skip" and "on-skip" (that may contain "publish") just the same way as for other clauses
08:53:26 <rakhmerov> then it'll be 100% symmetric
08:53:44 <vgvoleg> guys, in the blueprint everything is symmetric
08:53:47 <vgvoleg> 102%
08:53:55 <akovi> hmm
08:54:12 <akovi> the removal of publish-on* was new info for me
08:54:38 <vgvoleg> It's because they mix two different topics
08:54:42 <akovi> and if this is being removed then I'm not sure why we would provide this syntax for a new feature
08:54:45 <akovi> but I'm ok
08:54:47 <vgvoleg> relax :)
08:55:38 <rakhmerov> vgvoleg: yes, sorry for that
08:55:44 <rakhmerov> but it's kind of related
08:55:56 <rakhmerov> if we are to discuss particular syntax
08:56:03 <vgvoleg> so the second blueprint I'd like to discuss in the next meeting
08:56:11 <rakhmerov> vgvoleg: yes, please
08:56:15 <vgvoleg> thank you!
08:56:16 <rakhmerov> we don't have enough time today
08:56:25 <rakhmerov> thanks guys, I have to wrap up )
08:56:33 <rakhmerov> thanks everyone for joining
08:56:43 <rakhmerov> I'd encourage you to do it more often
08:56:45 <eyalb> bye
08:56:48 <rakhmerov> bye
08:56:49 <vgvoleg> bye!
08:56:51 <akovi> great, I'm looking forward to seeing this working
08:56:58 <rakhmerov> #endmeeting
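As a rough illustration of the outcome, the two spellings agreed on above might look like the sketch below. Both "publish-on-skip" and "on-skip" are proposals from the blueprint and this discussion, not confirmed upstream syntax as of the meeting; the task and variable names (task_a, task_b, task_next, var1) are hypothetical, and the nested layout simply mirrors how "publish" is written under the other "on-..." clauses.

```yaml
version: '2.0'

wf:
  tasks:
    # Spelling 1 (proposed): flat clause, symmetric with "publish-on-error"
    task_a:
      action: std.noop
      publish-on-skip:
        var1: "stand-in value used when task_a is skipped"
      on-skip:
        - task_next

    # Spelling 2 (proposed): "publish" nested under the new "on-skip" clause,
    # symmetric with "publish" under "on-success" / "on-error"
    task_b:
      action: std.noop
      on-skip:
        publish:
          branch:
            var1: "stand-in value used when task_b is skipped"
        next:
          - task_next

    task_next:
      action: std.noop
```

Neither task falls back to "on-success" when skipped, which matches the position taken in the meeting that skipping is handled only by its own clauses.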