20:00:43 #startmeeting state-management
20:00:44 Meeting started Thu Jun 27 20:00:43 2013 UTC. The chair is harlowja. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:45 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:47 The meeting name has been set to 'state_management'
20:00:54 hallo all!
20:00:55 hi
20:01:05 hi hi
20:01:10 hi
20:01:16 hola
20:01:43 howday
20:02:25 so i guess we got enough of the major people, others feel free to chime in :)
20:02:44 so lets see, action items from last time
20:02:45 #link http://eavesdrop.openstack.org/meetings/state_management/2013/state_management.2013-06-20-20.00.html
20:02:56 seems like most of them were *started/completed*
20:03:11 #link https://wiki.openstack.org/wiki/TaskFlow is forming up pretty nicely
20:03:36 thx adrian_otto and others
20:03:59 also just got some examples going
20:04:16 late, but I'M HERE.
20:04:16 https://wiki.openstack.org/wiki/TaskFlow#Examples (nothing super duper yet)
20:04:22 howday!
20:04:29 woah.. caps lock went crazy. sorry.
20:04:41 haha Just thought you were really excited to be here. : P
20:05:04 that too jlucci
20:05:13 kebray DEEP BREATHS
20:05:15 lol
20:05:34 kebray just going over action items that i think were pretty much all mostly done
20:05:42 * adrian_otto eye roll
20:05:43 #link https://wiki.openstack.org/wiki/TaskFlow/HavanaSummitPresentationAbstract was yours which seems like its starting to shape up
20:05:56 yep.
20:06:10 what else, ummm, i put some comments on https://wiki.openstack.org/wiki/Heat/TaskSystemRequirements but waiting for zane to get back to discuss more
20:06:19 #link https://wiki.openstack.org/wiki/Heat/TaskSystemRequirements
20:06:22 My other action is happening organically.. and, with Zane out, I think we are progressing well for now.
20:06:27 ya
20:06:35 I left an open question unanswered on that
20:06:55 ??
20:07:09 on the HavanaSummitPresentationAbstract I had mentioned a need for a mission statement
20:07:14 ah
20:07:19 Keith followed up with a question asking for clarification
20:07:39 ya, all coming back now
20:08:04 * kebray thinks mission statement isn't needed since we aren't proposing TaskFlow for an OpenStack top-level project.
20:08:05 I meant that if it were to be considered as what is now being discussed as an "OpenStack Program" that it would need an individual project mission statement
20:08:17 that's right, it's not needed for the current scope
20:08:34 It'll be needed when we expand scope to something more like Convection.
20:08:37 "take over the world" not allowable as a mission statement?
20:08:42 where my remark matters is where Joshua pulled in the reference to Thierry's proposal email to the openstack-dev mailing list
20:09:01 so is that clear? I'm suggesting we table that.
20:09:08 sure, thats fine with me
20:09:10 Works for me.
20:09:12 and cross that bridge when we come to it
20:09:40 sounds great
20:09:40 I'm happy to switch topics to https://wiki.openstack.org/wiki/Heat/TaskSystemRequirements now
20:10:04 sure, lets jump there and see how much to talk about
20:10:13 #topic continued-heat
20:10:54 so theres a few on there that i'll have to talk to zane about, or just in general
20:11:38 "Tasks propagate exceptions" is a little awkward when u have many tasks running in parallel, what happens to the potential X exceptions that can pop out
20:11:43 where X >1
20:12:05 and some of them on that requirement list seem pretty subjective, so need to understand those more clearly
20:12:14 'Tasks don't make debugging unnecessarily difficult' for example
20:12:47 but i think all those can be discussed and either removed or clarified
20:13:28 any comments/concerns/questions from others?
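One way to picture the "X exceptions where X > 1" question raised above: run the tasks in threads, wait for all of them, and raise a single wrapper that carries every failure. This is only a sketch under assumptions; `AggregateFailure` and `run_parallel` are hypothetical names made up for illustration and are not TaskFlow or Heat APIs.

```python
from concurrent.futures import ThreadPoolExecutor


class AggregateFailure(Exception):
    """Hypothetical wrapper carrying every exception raised by a batch of tasks."""

    def __init__(self, failures):
        self.failures = failures  # list of (task_name, exception) pairs
        names = ", ".join(name for name, _ in failures)
        super().__init__("%d task(s) failed: %s" % (len(failures), names))


def run_parallel(tasks):
    """Run named callables in threads; return results or raise AggregateFailure."""
    results, failures = {}, []
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {name: pool.submit(fn) for name, fn in tasks.items()}
    # Leaving the `with` block waits for every submitted task to finish.
    for name, future in futures.items():
        exc = future.exception()
        if exc is None:
            results[name] = future.result()
        else:
            failures.append((name, exc))
    if failures:
        # Nothing is dropped: the caller sees all failures, not just the first.
        raise AggregateFailure(failures)
    return results
```

A caller would catch `AggregateFailure` and inspect `.failures` to decide what to do about each failed task; whether that is the right propagation rule for Heat is exactly the open question.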
20:13:48 We should probably revisit the requirements when Zane is available to discuss them
20:13:54 ya
20:13:59 + 1
20:14:08 there are enough details present to remind us once we understand them
20:14:09 +1
20:14:13 but that's by no means a spec
20:14:25 agreed
20:14:26 many are subjective
20:14:38 agreed
20:14:58 so those will need to be expanded to include some concrete criteria by which to measure conformance
20:15:03 sure
20:15:13 +1
20:15:21 let's not go overboard, but there is enough in question so that it's not actionable without refinement
20:15:59 I think we have enough to avoid going in a completely different direction though, so I'm happy we have something in writing.
20:16:14 ya, it seems like we have been thinking along the same lines
20:16:20 +- a little
20:16:57 sweet, onto next topic?
20:17:04 ok, so can we commit to action items relating to building usage examples?
20:17:17 I see that in our critical path for adoption
20:17:34 sure, i've started some of them, but not a lot
20:17:45 without that, we will carry the burden of doing all the implementations, rather than empowering the other projects to leverage what's there
20:17:49 agreed
20:17:58 #link https://wiki.openstack.org/wiki/TaskFlow#Examples to start
20:18:07 but ya, it needs more
20:18:18 I believe Angus may have volunteered to start some as well.
20:18:23 #action harlowja make some more examples
20:18:32 https://github.com/stackforge/taskflow/blob/master/docs/examples/reverting_linear.py
20:18:33 2 hours ago
20:18:33 https://github.com/stackforge/taskflow/commit/6c40c56c5bd25056a396683219b548c5af0dac0c [https://github.com/harlowja]
20:18:33 https://github.com/stackforge/taskflow/blob/master/docs/examples/simple_linear.py
20:18:33 2 hours ago
20:18:33 https://github.com/stackforge/taskflow/commit/6c40c56c5bd25056a396683219b548c5af0dac0c [https://github.com/harlowja]
20:18:34 if everyone adds 1 example, that would be lots of examples ;)
20:18:34 https://github.com/stackforge/taskflow/blob/master/docs/examples/simple_linear_listening.py
20:18:35 2 hours ago
20:18:35 https://github.com/stackforge/taskflow/commit/6c40c56c5bd25056a396683219b548c5af0dac0c [https://github.com/harlowja]
20:18:38 I don't want to commit him to any actions… but, will be great if he chips in.
20:18:56 is Angus present now?
20:19:22 doubtful.. it's early am for him.
20:19:51 so i'll try to keep on adding more examples
20:20:04 ok, who's best to follow up with him to politely request one?
20:20:22 He volunteered last night on the Heat channel IIRC.
20:20:37 ok, let's take an action to follow up next week then
20:20:41 i can ping him later to see if he has had any luck, but maybe wait a day or 2?
20:20:46 so the offer does not go stale
20:20:51 agreed
20:20:55 agreed on the wait a day or two.
20:21:36 is the location of those examples fine, or should we place elsewhere??
20:21:49 potentially someday we could have a readthedocs.org site or something
20:22:33 I think they belong in github personally, as part of the in-repo documentation.
20:22:44 k
20:22:59 We can link the wiki to the appropriate file on github.
20:23:08 works for me
20:23:40 I will volunteer to dress up the example section on the TaskFlow wiki page to describe each example in more detail (catalog style)
20:23:51 nice!
20:24:00 thx adrian_otto
20:24:18 #action adrian_otto dress up examples on taskflow wiki
20:24:25 yep
20:24:28 cool
20:24:39 alright, next big topic that i have
20:24:42 #topic release
20:24:49 soooo
20:25:19 am thinking when we can have or should we have some type of taskflow alpha release
20:25:44 so that the basic usage by cinder could possibly go through
20:25:47 #link https://review.openstack.org/#/c/29862/
20:26:10 and what do we want to recommend to be in that release (and when)
20:26:40 or do we want to wait...
20:26:56 and how do we want to release if we do
20:27:03 harlowja: interested in your thoughts first… as, you're implementing the Cinder one :-)
20:27:54 what's up with the −1 from Duncan Thomas?
20:29:38 sure, that refactoring was more of restructuring into tasks, simple linear flow, with a memory backend, now if we had a simple db backend working, then we could say have a release with just that, and keep the rest as a WIP, orrr, we can do the copy/paste path and just move the needed/used taskflow modules into cinder
20:30:08 or release to pypi in a little with the other flows, but let others know that those aren't done yet
20:30:44 I'm sort of leaning towards the pypi release
20:30:45 adrian_otto i think the duncan thing was just a logic change that he found
20:31:37 so jlucci i guess if we lean toward the pypi release, that requires that we have things mostly working, and the definition of mostly :-P
20:31:56 Regarding the cinder refactoring - how do you plan to address "Eventual addition of resumption logic to recover from operations stopped halfway through."
20:32:09 Yeah, I'm working on fixing up distributed to "fully functional" right now
20:32:52 ehudtr so that requires a 'transaction log' concept, which we have accounted for in taskflow (in a few different ways)
20:33:04 so once u have a "transaction log" u can know what to resume from and where
20:33:45 jlucci kebray kchenweijie do u guys think like another week we can have things mostly working? its ok if we strip some not so working things out for the release
20:33:52 Don't you want the tasks to be idempotent
20:34:02 just if it goes on pypi people will be like thats broke and file a bug
20:34:02 I'd say yes for things on my end
20:34:11 i can say yes for what im working on
20:34:13 k
20:34:29 we can talk about anything that we might need to strip out if we need to
20:34:33 *graph_flow* cough
20:34:35 lol
20:34:48 nobody is using it right now anyway (and this parallel stuff removes the need for it)
20:35:00 hehe
20:35:09 ehudtr so in an ideal world everything is idempotent, but i don't think that ideal world exists so much :-/
20:35:41 harlowja: i can agree with that. its not cooperating with me right now...
20:35:56 kchenweijie u broke python though, ha
20:35:57 Should we align with H2 release? Or is July 18 too far in the future?
20:36:01 if you are only logging then you may do the task twice
20:36:45 ehudtr sure, there are cases where idempotent does work, just it becomes a real pain in a stateful system, ideally such a stateful system wouldn't exist, but it seems to
20:37:00 kebray i wouldn't mind trying to align with H2
20:37:06 and seeing what happens
20:37:26 which does bring up a good question
20:37:36 i'll be in NY july12->20th
20:38:06 but jlucci will have it all covered ;)
20:38:07 ehudtr: if a task is, let's say, a POST api call to another service (I'm speaking generically, not Cinder specific), then the task can't be idempotent me thinks.
20:38:57 Totally. I'll just leave everything on fire till you get back
20:39:00 Maybe we should shoot for July 11 :-)
20:39:06 all the chain needs to be idempotent
20:39:09 so does it seem fine to try to see what we can get done by next thur meeting, and then access from there for a release?
20:39:16 *or july11
20:39:17 +1
20:39:26 *assess not access, lol
20:39:32 sp suxage ;)
20:39:37 I'd see what we can get done by next meeting, then decide from there a release date
20:39:43 k
20:39:44 +1
20:39:47 works for me.
20:40:09 ehudtr so it'd be nice to have all chains idempotent, but i think thats outside taskflows control
20:40:25 so we need to at least provide a mechanism for when they aren't
20:40:43 idempotency is a pipe dream without redoing basically *all* of OpenStack to conform to that design principle
20:40:59 Yeah.. we can't enforce idempotency in the TaskFlow library I don't think… because, the user of the library creates the tasks… so, no guarantee they'll be idempotent.
20:41:03 we can't reasonably expect that at this stage
20:41:23 yeah, what adrian_otto said.
20:41:38 agreed, it'd be really nice and would make stuff easier, but in the meantime something like a transaction log helps
20:41:42 ehudtr… curious if maybe I don't understand your use case though… would like to know more if we can help.
20:41:48 not just for resuming, but for analysis of whats going on...
20:41:54 the best we can hope for is a place to put rollback code, and execute it upon failure
20:42:05 adrian_otto agreed
20:42:45 I don't want to suggest that idempotent systems are impossible, but I think that's really got to be out of scope for this effort.
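A rough sketch of the two ideas discussed above, a transaction log to resume from and a place to put rollback code that runs on failure. It assumes a plain JSON file as the log; `Task` and `run_linear` are hypothetical names for illustration only and are not the TaskFlow API.

```python
import json
import os


class Task:
    """Hypothetical task: apply() does the work, revert() undoes it."""

    def __init__(self, name, apply, revert=lambda: None):
        self.name, self.apply, self.revert = name, apply, revert


def run_linear(tasks, log_path):
    """Run tasks in order, recording each completed task name in a JSON log."""
    done = []
    if os.path.exists(log_path):
        with open(log_path) as fh:
            done = json.load(fh)  # names completed by an earlier, interrupted run
    finished = [t for t in tasks if t.name in done]
    try:
        for task in tasks:
            if task.name in done:
                continue  # already logged as complete, so skip it on resume
            task.apply()
            finished.append(task)
            done.append(task.name)
            with open(log_path, "w") as fh:
                json.dump(done, fh)
    except Exception:
        # Roll back completed work in reverse order, then discard the log.
        for task in reversed(finished):
            task.revert()
        if os.path.exists(log_path):
            os.remove(log_path)
        raise
```

Note the gap between `task.apply()` and the log write: a crash there means the task runs again on resume, which is the "may do the task twice" concern, so the log reduces the need for idempotent tasks without removing it.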
20:43:12 ehudtr the other thought here is that a developer using taskflow can potentially provide their own "resumption" strategy, and if u don't provide one, then thats fine, but u'll have to make your tasks and downstream services be idempotent (which is actually really hard)
20:43:45 you can do coarse grained resumes as well
20:44:04 they are less efficient, but you basically roll back to the last checkpoint, and proceed again from there
20:44:11 ya
20:44:18 and you leave it up to the calling code where to define checkpoints
20:44:39 otherwise it's task by task
20:44:44 OK
20:45:04 ya, its tough to enforce any which way :)
20:45:31 ok, so onto next topic stuff
20:45:45 actually more of open topics i guess
20:45:50 *since not super important*
20:45:50 lol
20:45:57 #topic open-dicuss
20:46:48 so any feedback on https://review.openstack.org/#/c/34488/ would be cool, its a similar flow as jlucci distributed one, but runs locally instead
20:47:05 it has similar problems, but would likely be a solution for heat (instead of heats coroutines)
20:47:39 right now it will run a 'thread' per task in the job, which might need reworking (but might be ok with greenthreads)
20:47:58 that can be adjusted if we feel necessary with a different way of running the flow
20:48:20 that's actually what Zane has implemented in his scheduler.py stuff in Heat with the coroutines
20:48:32 so it would be wise to collect his input
20:48:40 harlowja I still haven't looked at it in detail, but conceptually I'm fine with it… my main goal is to get the interface correct such that we use the 34488 approach to gain adoption into Heat, but swap out the backend as needed.
20:48:51 34488??
20:49:03 short hand for https://review.openstack.org/#/c/34488/
20:49:05 ah
20:49:15 thx :)
20:49:47 ya, the interface is nearly matching the other flows, but not 100% yet
20:49:52 pretty close though
20:49:53 because, to run this at service provider scale, I'm placing my bets on the work jlucci is doing. So, we need to plug that into TaskFlow as "our" back-end to running Heat Tasks.
20:50:10 agreed
20:50:43 +1
20:51:03 although i could imagine something like nova just wanting to use the parallel flow, since the conductor + mq 'concept' is pretty similar to celery (in a way)
20:51:25 so if they switch to parallel flow, with tasks being ran, then eventually they just switch to celery
20:51:29 and *magic*
20:51:34 If your implementation runs at scale, that's cool too :-) jlucci will still give us the added ability to modify the graph on the fly while a workflow is executing. speaking of which, I need to think up some real world use cases for harlowja on that.
20:52:00 #action kebray real-world cases for modification on the fly
20:52:04 thx
20:52:36 i agree, just thinking that existing openstack projects form their own 'celery' in a manner, so we should make sure without refactoring all of those projects that they can still take advantage of some of the benefits
20:53:00 and slowly refactor them toward this model (as that model proves itself)
20:53:03 *then profit*
20:53:10 which of those are we aware of?
20:53:22 It would be nice to actually put together a roster of those
20:53:35 the openstack projects?
20:53:50 the "Own Celery" implementations in various projects
20:53:58 I'd like to speak about them in less abstract terms
20:54:11 ah, well that gets into the gray area of when does a project seem to be creating its own celery :)
20:54:38 but ya, maybe we could form something (?? how to avoid it being controversial??)
20:54:42 I'm suggesting that we produce a list of known task implementations
20:55:18 and provide some informed guidance about which of those are a good fit for Taskflow
20:55:28 sure, that seems reasonable
20:55:33 or ask around for some
20:55:58 sure, some of them are still being formed as we speak i think
20:56:01 since we really care about boosting the quality of OpenStack and making it easier for various projects to benefit from a shared collaborative solution
20:56:17 ok, so let's find some way to describe them
20:56:18 agreed
20:56:26 even if they are wip
20:56:28 Maybe we can collaborate on this.. for example, I can probably get one of the trove devs to write up something tiny on their task execution within Trove.
20:56:49 And at least provide a link to the code where it's implemented within Trove.
20:56:50 sure, i can try in the nova meeting to see if anyone there wants to, maybe john garbutt?
20:57:00 since he's been doing that WIP
20:57:08 heat we already seem to know about
20:57:10 this exercise can help make it really clear where this can fit, and why it needs to be community property in OpenStack
20:57:17 sure
20:57:34 Yeah, the longer the list, the more repetitive the code, the more the need for common code.
20:57:48 +1
20:58:05 exactly.
20:58:06 i did start https://wiki.openstack.org/wiki/StructuredWorkflows a while ago, but its pretty low level, not at the high level that i think u guys are thinking of
20:58:43 it may be a little early to put too much energy into this, but we can build it opportunistically
20:58:48 #action harlowja see if i can work with the nova folks to get some kind of requirements or list of task like stuff there
20:59:17 sure, depends on how busy people are and all that
20:59:48 i can write up some stuff on nova (from what i know) but it will probably be biased and may not be what those folks believe is correct
21:00:04 alright, next time lets see what we have figured out
21:00:09 out of time
21:00:12 #end-meeting
21:00:13 #action kebray to see if someone from Trove can provide a high level sentence or two (and link to code) on their task execution code, desires around a common library, etc.
21:00:16 dang it.
21:00:18 just missed.
21:00:20 oops, u still in
21:00:23 #endmeeting