20:05:29 <mikeyp> #startmeeting Orchestration
20:05:30 <openstack> Meeting started Thu Nov 17 20:05:29 2011 UTC. The chair is mikeyp. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:05:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic.
20:05:54 <maoy> #topic workflow engines
20:06:17 <mikeyp> #topic workflow engines
20:06:24 <maoy> :)
20:07:00 <maoy> i was wondering about the error handling in the mailing list
20:07:03 <mikeyp> we don't have a full agenda - I think it's workflow engines, eventlet/zookeeper, and anything else
20:07:20 <mikeyp> ok
20:07:35 <maoy> i'm interested more in how they handle runtime errors
20:08:14 <maoy> and also if it deals with some failure issues, such as a node is crashed
20:08:26 <mikeyp> main thing I noticed was that exceptions are just raised; there didn't appear to be any concept of exception handling specific to the workflow.
20:08:55 <maoy> is the exception raised in another node (or another Python interpreter)?
20:09:27 <mikeyp> it's single threaded, no concept of concurrency or parallelism.
20:09:37 <maoy> got it
20:09:51 <maoy> but we need something that can handle those
20:11:19 <mikeyp> definitely, but I didn't find any cloud-grade (tm) workflow libraries
20:12:04 <mikeyp> it raises the larger point of how this will all work together - think we need Sandy for that.
20:12:26 <maoy> ok. i'll put some thoughts on that too.
20:12:30 <maoy> i'll try to convert my powerpoint proposal to a wiki page before next meeting.
20:12:57 <mikeyp> the strawman I have in my head is 'orchestration' is a reliable service, that calls into other openstack services.
20:13:01 <maoy> now that i've read through the nova code i have a better idea how to fit in the code..
20:13:14 <maoy> yes
20:13:43 <mikeyp> I'm not sure what the granularity would be, either in initial or later releases.
20:14:24 <maoy> i think combining that, with more orchestration cooperation logic inside the compute/network nodes, we have something there.
20:14:28 <mikeyp> it seems like TROPIC could support fine grained control.
20:14:55 <maoy> the "orchestrator" might actually nicely fit with the scheduler
20:15:47 <mikeyp> agreed - I see changes there.
20:16:20 <maoy> mike, can you elaborate on "fine grained control"?
20:16:47 <mikeyp> just the general level of steps.
20:16:54 <maoy> ok
20:17:33 <mikeyp> so today, the operations are pretty high level. Schedule calls create, and a large number of things happen.
20:18:15 <mikeyp> should those individual operations be coordinated by orchestration ?
20:18:43 <maoy> i think if they are non trivial, e.g. takes a while to finish
20:18:53 <maoy> they should report their status
20:19:10 <maoy> so that the orchestrator could 1) know what's going on
20:19:17 <maoy> 2) if it's stuck/dead/crashed
20:19:29 <maoy> 3) abort, or restart if necessary
20:20:37 <mikeyp> #action get sandys input on granularity of orchestration
20:21:00 <maoy> the state of the workflow progress should be available
20:21:14 <maoy> it could be either in database, or in zookeeper
20:21:48 <maoy> right now, the task_state column is kind of like that
20:22:12 <maoy> but can definitely be improved
20:22:14 <mikeyp> yes - when I've done this in the past, workflow runs independently of other operations, and can be interrogated
20:24:05 <mikeyp> in your TROPIC work, were there multiple workflow servers ?
20:24:12 <maoy> i also like to use the analogy of the OS process
20:24:45 <maoy> we essentially need to build mechanisms to track the distributed processes as a coherent workflow
20:24:54 <maoy> restart, or abort it if necessary
20:26:07 <maoy> if you look at those business workflow management software, they are solving a different problem
20:26:53 <maoy> yes. in TROPIC we call them controllers
20:27:01 <maoy> there are multiple of them
20:27:04 <mikeyp> business workflow tends to focus on process control, rather than process execution.
20:27:22 <maoy> but only one is elected as leader to make decisions
20:27:49 <mikeyp> so one is active, the others are 'standby' or failover ?
20:28:08 <maoy> yes
20:28:36 <mikeyp> got it - that's what I thought the paper said.
20:28:41 <maoy> it's hard to make distributed decision. :)
20:29:07 <maoy> although possible, we ran the numbers and it seems one active is fast enough
20:29:40 <maoy> i also looked at the other proposal mentioned in last meeting
20:29:44 <maoy> from dragon
20:30:00 <maoy> i felt it's very similar to the ppt file I sent
20:30:03 <mikeyp> I#topic pacemaker
20:30:12 <mikeyp> #topic pacemaker
20:30:47 <mikeyp> I haven't really reviewed it, was mostly looking at libraries.
20:31:03 <mikeyp> what are the main differences ?
20:31:27 <maoy> between dragon's and mine?
20:31:52 <mikeyp> yes
20:32:42 <maoy> mine also proposes to keep logs so that we can roll back automatically
20:33:42 <maoy> hold on
20:33:53 <maoy> i need to refresh my memory. :)
20:34:31 <maoy> #action maoy gives dragon proposal feedback
20:34:42 <maoy> i'll do this in an email after the meeting
20:34:53 <mikeyp> #action maoy gives dragon proposal feedback
20:35:15 <mikeyp> ok, I will also review it.
20:35:32 <maoy> i don't know much about pacemaker
20:35:42 <mikeyp> #action mikeyp to review dragon's proposal
20:36:00 <maoy> the picture of pacemaker seems to suggest that corosync is a dependency which i also know nothing about
20:37:02 <maoy> i got zookeeper working with eventlet
20:37:11 <maoy> so that's not a concern.
20:37:29 <mikeyp> #link https://lists.launchpad.net/openstack/msg03767.html dragondm's proposal
20:37:44 <mikeyp> #topic zookeeper / eventlet
20:38:00 <mikeyp> yes, I saw that, good progress.
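[Editor's sketch] maoy's three points earlier in the log (tasks report status so the orchestrator can know what is going on, notice when something is stuck/dead/crashed, and abort or restart) could look roughly like the following. This is only an illustration under stated assumptions: `TaskRecord`, `StatusTracker`, the timeout, and the retry budget are hypothetical names and values, and the in-memory dict stands in for the database or ZooKeeper store mentioned in the meeting. It is not code from nova or TROPIC.

```python
import time
from dataclasses import dataclass


@dataclass
class TaskRecord:
    """Last known status of one long-running step (hypothetical structure)."""
    status: str = "running"
    last_heartbeat: float = 0.0
    restarts: int = 0


class StatusTracker:
    """In-memory stand-in for the database or ZooKeeper state store."""

    def __init__(self, timeout=30.0, max_restarts=1, clock=time.monotonic):
        self.timeout = timeout            # seconds without a report before a task counts as stuck
        self.max_restarts = max_restarts  # restarts allowed before giving up
        self.clock = clock                # injectable so the logic can be tested without sleeping
        self.tasks = {}

    def report(self, task_id, status="running"):
        """Called by a worker to report progress (point 1)."""
        record = self.tasks.setdefault(task_id, TaskRecord())
        record.status = status
        record.last_heartbeat = self.clock()

    def stuck(self):
        """Return ids of running tasks whose last report is overdue (point 2)."""
        now = self.clock()
        return [tid for tid, rec in self.tasks.items()
                if rec.status == "running"
                and now - rec.last_heartbeat > self.timeout]

    def recover(self):
        """Restart overdue tasks until the retry budget is spent, then abort (point 3)."""
        decisions = {}
        for tid in self.stuck():
            rec = self.tasks[tid]
            if rec.restarts < self.max_restarts:
                rec.restarts += 1
                rec.last_heartbeat = self.clock()  # give the restarted task a fresh window
                decisions[tid] = "restart"
            else:
                rec.status = "aborted"
                decisions[tid] = "abort"
        return decisions
```

In this sketch the orchestrator would call `recover()` periodically and act on the returned decisions; how a restart or abort is actually carried out on a compute or network node is left open, as it was in the meeting.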
20:38:37 <mikeyp> #topic vm-stat transitions
20:38:53 <mikeyp> The proposed vm state transitions are in review
20:39:06 <mikeyp> #link https://review.openstack.org/#change,1695
20:39:34 <mikeyp> They seem to be held up, but I'm reviewing the changes anyways.
20:40:07 <mikeyp> I should have said state transition management
20:41:06 <maoy> somehow i felt that the solution they proposed is a little too complicated
20:41:41 <maoy> i remember i saw a big state transition table in the summit
20:42:01 <maoy> hopefully it can be simplified, otherwise it's hard to debug
20:42:53 <mikeyp> hopefully, orchestration can remove some of the complications.
20:43:00 <maoy> exactly
20:43:20 <mikeyp> so, what else do we have ?
20:43:37 <mikeyp> #topic wrap up
20:43:56 <maoy> not much
20:44:34 <maoy> next week is thanksgiving
20:44:36 <mikeyp> OK, then let's wrap up till Sandy can review - I know he was out of pocket travelling today.
20:45:20 <maoy> cool
20:45:28 <mikeyp> #action mikeyp to send email re: next week schedule
20:45:34 <mikeyp> ok, ttyl.
20:45:48 <mikeyp> #endmeeting
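
[Editor's sketch] On the state transition discussion above: one way a transition table might be kept small enough to debug is a single mapping from (state, event) to the next state, with everything not listed rejected. The states and events below are invented for illustration; they are not the ones in the change under review, and `next_state` and `allowed_events` are hypothetical helpers.

```python
# Hypothetical states and events, chosen for illustration only; they are not
# the states proposed in the change under review.
TRANSITIONS = {
    ("building", "build_ok"): "active",
    ("building", "build_failed"): "error",
    ("active", "stop"): "stopped",
    ("stopped", "start"): "active",
    ("active", "delete"): "deleted",
    ("stopped", "delete"): "deleted",
    ("error", "delete"): "deleted",
}


class InvalidTransition(Exception):
    """Raised when an event is not allowed in the current state."""


def next_state(state, event):
    """Look up the next state; anything not in the table is rejected."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise InvalidTransition(
            f"{event!r} is not allowed in state {state!r}") from None


def allowed_events(state):
    """List the events the table permits from a given state, for debugging."""
    return sorted(e for (s, e) in TRANSITIONS if s == state)
```

Keeping the whole table in one place means an unexpected transition fails loudly at the lookup rather than deep inside a compute operation, which is the debugging concern raised in the meeting.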