16:00:56 #startmeeting Mistral
16:00:58 Meeting started Mon Dec 22 16:00:56 2014 UTC and is due to finish in 60 minutes. The chair is rakhmerov. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:59 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:01:01 The meeting name has been set to 'mistral'
16:01:07 hi all
16:01:23 hi)
16:02:05 hi hi
16:02:15 let's wait for others
16:02:28 in the meantime I'll list out our agenda items
16:03:04 Review action items
16:03:04 Current status (progress, issues, roadblocks, further plans)
16:03:08 "Kilo-1" scope and blueprints
16:03:10 "for-each"
16:03:12 Scoping (global, local etc.)
16:03:14 Load testing
16:03:16 Open discussion
16:03:24 sorry, that wasn't too accurate
16:03:40 I just copied from wiki and pasted it
16:04:28 ok, let's start slowly
16:04:38 #topic Review Action Items
16:04:42 1. akuznetsova, share all the work on HA testing & benchmarking
16:04:46 it's done
16:04:56 2. nmakhotkin, fix the bug with "=" sign in simplified syntax if Lakshmi doesn't fix it himself by the end of Monday
16:04:59 done too
16:05:13 3. nmakhotkin, add full example(s) to 'for-each' spec
16:05:16 done
16:05:21 even more than we planned
16:05:33 4. all, keep discussing contexts (scopes) in the mailing list and other channels
16:05:39 it's going on
16:06:05 and I think we're close to finding consensus
16:06:16 hi !
16:06:28 hi Nikolay
16:06:34 #topic Current status (progress, issues, roadblocks, further plans)
16:06:51 as usual, let's quickly report our statuses (a couple of sentences)
16:07:37 hi all, sorry I’m a bit late
16:07:51 my status: last week I added a couple of "join" examples to extra and created several README files for examples, also fixed some small bugs and released kilo-1
16:08:06 and participated in lots of email exchanges
16:08:09 hi dzimine
16:08:11 np
16:08:22 Finished mistral examples docs and prepared doc for for-each draft
16:08:50 also fixed that bug with "="
16:08:57 ok
16:09:16 I tested Mistral before release, verified bugs and prepared some docs with results of Rally testing
16:09:28 ok
16:10:30 my status: last week contributed to for-each spec, and worked with Winson on defining two other blueprints: “action providers”, and “context”.
16:10:48 ok
16:11:11 dzimine, do you know when Winson is planning to share this "action providers" thing?
16:11:28 I'm excited for him to share it with other folks
16:12:05 absolutely, as you suggested - he’ll do a next level of details write-up/diagram and put it up.
16:12:11 is he here by any chance?
16:12:18 not sure
16:12:21 Winson?
16:13:01 looks like he's not
16:13:09 ok, let's go to the next topic
16:13:45 #topic "Kilo-2" scope and blueprints
16:13:57 ok, the end date for Kilo-2 is Feb 5
16:14:14 so we have about a month and a half
16:14:30 here's the release page: https://launchpad.net/mistral/+milestone/kilo-2
16:14:56 for now I assigned blueprints that I think are important (IMO)
16:14:58 but
16:15:20 you can propose which blueprints you'd like to see done in Kilo-2
16:15:26 and we can discuss it
16:16:09 so please take a look at the page now and make your comments/suggestions
16:16:22 or if you need time we can do it offline in ML
16:17:00 also some of the BPs that may be included in it are not ready yet
16:17:04 like "action providers"
16:17:36 workflow_constants https://blueprints.launchpad.net/mistral/+spec/mistral-workflow-constants is going to be covered by what Winson is proposing with context, let’s assign it to him
16:17:37 so after we agree on a set of blueprints we need to set a time to do a planning poker like we did before
16:17:57 yes, I asked him about it today
16:18:00 Action Providers - I’ll create a blueprint and yes, I propose it for kilo-2
16:18:09 you're right that it may be a part of a bigger BP
16:18:16 ok
16:18:17 and dry-run - propose to move away
16:18:28 on this one I tend to agree
16:18:41 intuitively, i think we're not there yet
16:18:42 No-op task. It seems to be done already
16:18:46 and it's less important
16:18:55 NikolayM, not completely
16:19:01 no-op action is done
16:19:05 main reason - I think we bring up quality threshold and begin making things less prototypy, so it will take longer.
16:19:13 no-op task is related to it but a little different
16:19:25 Should for-each be here, too?
16:19:30 rakhmerov, are you sure that this blueprint mistral-dashboard-crud-operations will be finished?
16:20:05 for-each definitely, let's include BPs on it, I just wasn't sure if they were all prepared
16:20:10 but yes, +1 for for-each
16:20:34 akuznetsova, yes, I would love to see it done a couple of months ago :))
16:21:01 akuznetsova, I think it will be hard to do without mistral-workbook-builder
16:21:04 and btw, I also have a desire to dive into UI a little bit
16:21:16 so I would like it to be assigned to me
16:21:23 but if we have a simple way to create wbs and wfs, it will be great
16:21:34 yes
16:21:39 NikolayM, yes, we need to wait until Timur S and his team finish their work
16:21:49 to complete PAUSE/RESUME functionality, we must provide ways to update task’s data context. Again I think it’s ok if it’s not in kilo-2, but let me write down a BP for tracking.
16:22:08 yes, but for now we decided with Timur that we shouldn't sync our work
16:22:30 rakhmerov, oh, I didn't know it
16:22:31 builder will be mostly independent thread of work for now
16:22:42 yeah, it was discussed about a week ago
16:23:01 dzimine, yes, ok
16:23:04 please do that
16:23:20 btw, from code perspective it's almost done
16:23:40 recently we made some changes that will make this simpler
16:25:18 didn't we finish this one https://blueprints.launchpad.net/mistral/+spec/mistral-yaml-request-body ?
16:25:36 I guess NikolayM worked on it
16:25:42 so ideally we need to spend a day or two maximum to agree on blueprints and I want to suggest we meet at the end of this week (or beginning of the next one) to do planning poker
16:25:58 akuznetsova, yes, it's in BETA AVAILABLE
16:26:05 yes
16:26:16 because I wasn't careful enough with releasing Kilo-1 :)
16:26:33 now it's closed so we have to assign it to something else
16:26:34 the same with bp on docs
16:26:37 my mistake
16:26:43 right
16:26:54 rakhmerov, ok, now I see
16:27:11 NikolayM, can you please assign the BPs you created for "for-each" to Kilo-2?
16:27:30 yes, sure
16:27:51 akuznetsova, what about https://blueprints.launchpad.net/mistral/+spec/mistral-milti-tenancy-tests ?
16:27:55 #action stackstorm: create "action providers" blueprint and assign to Kilo-2
16:28:08 I thought you had finished them
16:28:15 #action NikolayM: assign all relevant "for-each" BPs to Kilo-2
16:28:29 NikolayM, done)
16:29:54 I assigned it to kilo-2 and passed BETA AVAILABLE status )
16:29:55 yeah, it's in BETA AVAILABLE as well
16:30:01 :)
16:30:02 ok
16:30:19 I also personally have doubts about this one
16:30:27 https://blueprints.launchpad.net/mistral/+spec/mistral-ceilometer-integration
16:30:38 I don't really believe we can get it done in a month
16:30:42 so, we have 3 bps which are already done :)
16:30:45 unless someone helps us
16:30:54 yes :)
16:31:10 I want to create bp for creating Rally gate and pylint gate
16:31:16 it actually happened because there were BPs with no milestone assigned
16:31:25 so I missed them when I was releasing Kilo-1
16:32:25 akuznetsova, sure, it must be included for its apparent importance
16:33:08 and we will need gate for dashboard tests
16:33:09 #action akuznetsova, create a BP for load testing and benchmarking and assign it to Kilo-2
16:33:23 yes
16:33:39 I'm just actually not sure if it's feasible to do in this cycle
16:33:53 I think we definitely need to create a BP
16:34:08 assign it to Kilo-2 with say "medium" priority
16:34:25 and during planning poker we first need to estimate all BPs with "high" priority
16:34:36 you assigned this bp https://blueprints.launchpad.net/mistral/+spec/mistral-dashboard-tests to kilo-2, it means that we need a gate for them)
16:34:38 and see what else we have chances to do
16:34:58 hm....
16:35:11 the form may be different
16:35:15 but yes, agree
16:35:23 ideally they should run on a gate
16:35:55 it would be nice if someone from deployers could help us set a gate for this
16:36:07 can we discuss the content of the tests themselves here, or separately?
16:36:26 #action rakhmerov, find out if there's a chance to get a deployer involved to set a gate for dashboard tests
16:36:39 yes
16:36:44 let's then finish this topic?
16:37:00 the rest on Kilo-2 scope will be done offline I guess
16:37:26 #topic Load testing
16:37:34 so pls go ahead
16:38:12 since we don't have a huge amount of time I could suggest we point out just major points probably
16:38:24 dzimine, you had a couple of questions
16:38:28 and continue to shape out the details offline
16:39:03 dzimine, btw I answered your question in the mailing thread
16:39:11 why don’t you sum up what’s the goal of the tests? Maybe this is so obvious… but missed in descriptions.
16:39:17 yes I saw the reply, thank you.
16:40:20 load and performance tests, right, so what is the goal and how is the load defined? and how is performance defined and measured?
16:41:32 which paths do we worry about the most, how do the tests relate to these paths? Like do we really worry the most about API being able to get something from db and serve it to the client????
16:41:52 why does list_workbooks get the Rabbit communication altogether?
16:42:22 hm... good question
16:42:39 it actually shouldn't
16:42:40 goals are to measure the time of main Mistral requests and determine when Mistral will not be able to answer; that is what I do now
16:42:51 you see, something has already been revealed! ;))
16:43:02 I think we need to help akuznetsova with some info on where we expect Mistral to fail, and how to create the type of load that is representative of the production load.
16:43:18 dzimine, yes, please
16:43:40 dmitri, I think load is always defined as a number of operations (of certain types) occurring in a system simultaneously
16:43:47 cause if I run the load_workbook test, I can happily shut down engine and executor processes and it will work even better.
16:43:53 I guess we need to define some main scenarios which have to be tested
16:43:59 da-h.
16:43:59 but you may be right that we need to tell all these things explicitly
16:44:09 yes
16:44:20 dzimine, I already have scenario with simple wf execution
16:44:34 ok, the question to you Dmitri :)
16:44:43 the key operations for a workflow system are number of tasks/actions executing per second, number of workflows running simultaneously,
16:44:52 what scenarios and metrics are important for you?
16:45:24 let's discuss it and make it a part of the plan
16:45:44 once we define the “load” (system is doing something that it is supposed to do) we add the tests to see if it remains operational - e.g., run full suite of API tests on top of loaded system to see that it can serve the API while under load
16:45:51 Nastya's main goal for the previous cycle was to start doing at least something, learn Rally etc.
16:46:22 to me? or used with standard comparison of workflow engines? which is more interesting ? :)
16:46:47 no, I'm not interested in other workflow engines :)
16:46:54 rather in your opinion
16:47:00 hahaha
16:47:10 (kidding about other engines of course)
16:47:20 dzimine, running API tests from Rally is not an obvious task and I need to think about it
16:47:31 ok, sure
16:47:34 my opinion, although interesting indeed, is irrelevant :)
16:47:42 Rally is just a convenient tool
16:47:54 it works in a pretty straightforward way
16:48:05 :)))
16:48:22 well I am not insisting on running our API tests but you’ll likely redo some as we need to be sure that API is available while system is under load.
16:48:37 I meant "your experience is indeed helpful for us"
16:48:38 Now given that not all APIs are created equal:
16:49:25 agree on that thought: check API methods when the system is under load (running workflows)
16:49:28 if we load system with workflows and then run “GET TRIGGERS” API, it will be less representative than if we run GET EXECUTIONS and GET TASKS
16:49:56 yes
16:50:03 akuznetsova: I am not trying to reinvent the wheel here
16:50:27 so thought #2: we should make sure to have a highly concurrent access to alike objects
16:50:30 akuznetsova: how do you guys performance/load test other products?
16:51:15 from my experience, in all projects it's made differently
16:51:26 dzimine, some projects just have a couple of Rally scenarios and Rally gate
16:51:27 necessity is the mother of invention
16:51:28 :)
16:52:27 I guess a lot of projects don't do this at all
16:52:33 that's the answer ;)
16:52:59 it brings us back to the goal of load/performance tests. If it is ‘formal’ let’s get rally test mechanics in place for now and deal with this later - fine.
16:53:22 problem is that other openstack projects have lower load threshold
16:53:28 what do you mean, dmitri?
16:53:41 I didn't quite understand you
16:53:41 exactly that, lower threshold.
16:53:55 you mean we should get rid of Rally for now?
16:54:07 no, not at all.
16:54:11 ok
16:54:27 and for example in some scenarios keystone will die earlier than Mistral
16:54:32 I mean, we should not stop at formal “couple of rally tests” but do some performance
16:54:33 believe me, Rally is very cool stuff :)
16:54:47 ooh, yeah
16:54:54 I have no opinion on rally and I think it’s good.
16:55:04 It's all on us which tests we go for.
16:55:09 ok, let's try to work on testing strategy together then
16:56:00 I think the criteria can be something like: make some assumptions what would be a typical load during production usage and try to emulate that
16:56:27 but the challenge is that there are so many dimensions here:
16:56:32 1) deployment schema
16:56:37 Ok, simply, let’s shit-load engine with multiple highly parallel workflows. Let’s make sure we measure number of WF and running tasks.
16:56:39 2) type of operation
16:56:44 etc. etc.
16:57:06 One aspect here is risk.
16:57:23 yes, akuznetsova, I think we discussed something like that with you before and it should be something in our docs, right?
16:57:26 I don’t see getting data from API as risky. Except on highly concurrent objects.
16:57:34 Which we’ll get to later.
16:57:35 I mean "highly parallel workflows"
16:57:49 yes
16:58:00 rakhmerov, not yet
16:58:03 but as it turns out we have problems even with simple operations
16:58:06 ok
16:58:34 The potential troubles are engine concurrency and throughput, and action/worker communication. And at what point executors begin to block.
16:58:39 so looks like testing even simple operations gives us some good information about the system
16:58:52 like?
16:59:02 “testing simple operations?”
16:59:13 the goal of testing is “FIND BIG PROBLEMS FAST”.
16:59:35 like "for some reason something is wrong even if we do 'workflow-list' in parallel"
16:59:40 that's what I mean
16:59:41 my whole point is: “let’s jump right where the risks are”.
16:59:46 it's already a useful piece of info
16:59:56 ok
17:00:04 let's create a separate discussion, we are out of time now
17:00:12 ok.
17:00:25 generally I agree, but again. We already revealed problems even on simple things so we shouldn't completely ignore them too
17:00:31 akuznetsova: do you agree or did I confuse you?
17:00:31 but maybe we should spend less time
17:00:37 ok, folks