20:00:58 #startmeeting state-management
20:00:58 hmmmm
20:00:59 Meeting started Thu Jun 20 20:00:58 2013 UTC. The chair is harlowja. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:00 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:01:03 The meeting name has been set to 'state_management'
20:01:03 ah, just slow
20:01:05 hi
20:01:11 hi there!
20:01:13 hello
20:01:17 hi hi
20:01:19 hello~
20:01:20 hello
20:01:46 *sorry about getting that invite email out late* always get busy with other stuff and forget to send it, lol
20:01:50 hurr
20:02:07 #action harlowja make cron-job that sends it ;)
20:02:44 so maybe just a little status check before moving to other stuff
20:02:49 #topic status-check
20:03:04 who all has anything to report?
20:03:08 *i do*
20:03:22 create an action for me to propose a TaskFlow talk for summit, please :-)
20:03:32 any problems/things being worked on that are really cool?? ;)
20:03:35 or, I guess I can create actions too?
20:03:40 anyone can
20:03:42 #action kebray taskflow talk for HK summit
20:03:54 there u go
20:04:06 * kebray celebrates technology.
20:04:32 Umm, not too much here. Been doing a lot of documentation/research type stuff
20:04:35 so i got cinder to work using the current taskflow, unit tests are adjusted, nothing major there, working on a few review comments and awaiting final merge (need taskflow to appear somewhere first)
20:04:59 *the cinder flow for creating a volume*
20:04:59 kudos, harlowja
20:05:21 that's a milestone worth recognizing, for sure
20:05:21 def, its a piece of the puzzle
20:05:37 *reminds me of the nova prototype a little that we showed at the previous summit
20:05:51 awesome harlowja! I've been following the gerrit reviews.. you've done great work.
20:05:59 #link https://review.openstack.org/#/c/29862/
20:06:04 alot of people liking it, so thats great
20:06:23 makes me feel good, haha, "While I definitely applaud this work," and such...
20:06:31 so thats great :)
20:06:42 jlucci what kind of research/documentation stuffs?
20:07:07 So, I explored gear as a possible option to replace celery in distributed
20:07:16 gear?
20:07:41 #link http://www.gearman.org/ i think right?
20:07:42 Since infra was already using gearman, using gear might've made some people less antsy to introduce a new dependency
20:07:47 yessir
20:08:14 However, through said research, found that gear actually required more overhead than celery, in that you had to spin up and maintain a delegated gearserver
20:08:41 i've talked to others and they agree, celery seems better than gearman
20:08:44 While you can run celery locally without any additional resources required (although you can split it up if you felt like it)
20:08:52 yups
20:08:54 Also
20:08:59 https://wiki.openstack.org/wiki/Celery
20:09:06 #link https://wiki.openstack.org/wiki/Celery
20:09:14 cool
20:09:15 Documented exactly what celery is, what the architecture looks like, usage, and how to debug
20:09:23 reminds me also
20:09:31 #link https://wiki.openstack.org/wiki/TaskFlow for the wiki i was assigned to work on
20:09:35 *pieces filling in*
20:09:36 For anyone unfamiliar with celery to sort of answer all the questions peeps keep asing
20:09:38 asking *
20:09:46 Yeah. The wiki is looking good!
20:09:46 very cool!
20:09:47 jlucci.. one piece of feedback on the Celery wiki… you make several references to "Celery Client", but the "Client" doesn't show up in any of your diagrams.
20:10:03 Ah yeah
20:10:06 But, overall, looks great!
20:10:08 The client is part of your application
20:10:14 I guess I should say that somewhere. :P
20:10:27 kebray adrian_otto if u want to checkover https://wiki.openstack.org/wiki/TaskFlow that'd be great also, hopefully nothing to incorrect there ;)
20:10:36 *any others feedback is welcome to*
20:10:49 Anywaaay. That stuff, trying to get to harlowja's reviews as well
20:10:55 thx :)
20:10:58 k… read that one at some point, but need to spend more time with it. I will review.
20:11:06 much appreciated
20:11:13 And that's it for me
20:11:28 cool, same for me, kchenweijie how are u doin
20:11:36 hopefully don't hate me that much yet ;)
20:11:41 lol
20:11:48 doing well. working on making the generic types the api, like we discussed in the other irc yesterday
20:11:59 its working so far, so i just need to finish it up
20:12:04 awesomeness
20:12:08 ill submit another code review as soon as its done
20:12:20 sounds great
20:12:21 #action adrian_otto to review/edit https://wiki.openstack.org/wiki/TaskFlow
20:12:31 thx adrian_otto
20:12:34 np
20:12:35 thats pretty much it for me
20:12:38 *its not complete, not sure it will ever be of course ;)
20:12:53 great stuff
20:13:11 any at&t folks around??
20:13:33 yes
20:13:40 we followed up with the nttdata folks, they are doing a little, but mostly busy with other stuff at the moment (guess thats how it goes)
20:13:56 changbl any kind of stuff u need help with, or want any input on?
20:14:20 or any other complaints ... ;)
20:14:24 harlowja, sorry that my progress indeed got delayed
20:14:36 np, its how it goes :)
20:14:50 Yun Mao here is leaving the company, and I need to take over his projects
20:14:53 :(
20:15:03 sad
20:15:05 I planned to work on ZK backend storage and locking
20:15:23 is he leaving openstack also?
20:15:28 *if u know*
20:15:30 I did some research on locking using an SQL table
20:15:48 harlowja, maybe, less relevant to openstack
20:15:52 he will join FB
20:15:59 ah, double sad
20:16:03 adrian_otto what did u find out :)
20:16:05 and I'm now convinced that it could be done safely with MySQL, because the isolation level is selectable within the session.
20:16:18 very cool, adrian_otto do u want to do the locking api ;)
20:16:28 so that could work for low concurrency use cases where ZK is not desired
20:16:40 at the very least I'll outline it
20:16:42 harlowja, i will still stick to ZK backend for storage and locking, how does the plan sound?
20:17:00 changbl that'd be great, if u have time, but completely understandable if u don't
20:17:10 harlowja, thanks!
20:17:17 adrian_otto that would be much appreciated, more knowledge sharing the better
20:17:45 for the locking API, are we thinking library methods, or a REST interface?
20:18:28 it's probably only useful if you can reach it remotely in the distributed use cases
20:18:52 library i think, but library backends could be one with a rest interface
20:19:13 yes, that's what I'm thinking as well
20:19:15 +1 library
20:19:38 cool
20:19:58 ok, jlucci let's plan to touch base on this so we end up with something that works with your distributed use case
20:20:27 Sounds good to em
20:20:29 me *
20:21:10 sweet
20:21:42 any nova folks in the room, i have been watching a little there, but would be interested in how that is going
20:21:50 russellb do u know?
20:22:34 ok, will wait a little to see whats happening when there meeting happens
20:23:19 so how about we open discussion on the heat requirements, does that seem fine?
20:23:36 yep
20:24:16 #topic heat-requirements
20:24:20 #link #link https://wiki.openstack.org/wiki/Heat/TaskSystemRequirements
20:24:24 #link https://wiki.openstack.org/wiki/Heat/TaskSystemRequirements
20:24:38 * harlowja letting people check it out
20:25:04 a few things that we might have to address (possibly in different ways also)
20:25:19 - Tasks run in parallel (without going fully distributed?)
20:25:40 - Tasks can spawn other tasks (this or similar is being discussed right now)
20:26:11 the parallel task requirement does not require distributed
20:26:15 - Tasks can time out (this implies that u have an entity that can kill the task when it times out, the implications of that sound scary)
20:26:43 if kill=rollback that's not bad
20:27:08 small interjection - celery (distributed approach here) already has that capability
20:27:17 It can also set an action to perform on timeout
20:27:30 Doesn't necessarily have to be a kill (could be a try this other task instead type thing)
20:27:39 And timeouts can be set during runtime
20:27:43 but if kill == stop task while its still running, thats weird, since then u are basically stopping code from running at a random point, that usually requires some pretty complicated stuff (what if the task is using a file, or locks, or ...)
20:28:22 i'm pretty sure linux has a whole substructure just to do that cleanup, so we might need to get clarification on the scope of 'kill'/timeout
20:29:02 harlowja, isn't that what exceptions/destructor-cleanup stuff is for? You can signal a task to die, but that doesn't mean it stops execution immediately.. it means it gets a signal to gracefully exit and clean up as soon as possible, no?
20:29:02 harlowja: ok, make a note in the wiki about what clarification is needed
20:29:37 there are different kinds of "kill" and that ambiguity needs to be clarified
20:29:40 kk
20:29:47 anyway, that's how I interpreted Zane's requirement.. basically to allow for tasks to exit gracefully upon signal to kill-self.
20:29:48 #action get clarification on timeout
20:29:59 clarification is good :-)
20:30:34 well that means tasks have checkpoints right?
20:30:34 #action harlowja get clarification on timeout
20:30:43 yes
20:30:44 so it brings up the question of should tasks have checkpoints, or should workflows have checkpoints and tasks shouldn't...
20:30:50 in fact, he raises that as a requirement
20:30:57 when he mentions yield
20:31:37 so that brings up an interesting question, is why should a task have checkpoints, if its a single unit of code that does one thing, why should it be doing many things that require checkpoints ;)
20:31:46 split up the task instead right?
20:31:54 then have the workflow do the checkpointing
20:32:02 a workflow could be implemented as a task, in which case if tasks support a yield equivalent, then that's enough
20:32:33 I think that's what he's thinking about when he mentions that tasks create subtasks
20:32:45 not quite sure there
20:32:54 I disagree
20:32:59 Tasks should not be implemented as a workflow
20:33:06 reverse
20:33:22 A task is the smallest executable bit of data within a workflow
20:33:37 If you want a task to accomplish multiple things, you should split that task up into a set of tasks
20:33:53 yielding to me is a way to switch coroutines, which is in a manner creating a structure of yields that is a workflow
20:33:53 our workflow is explicit, via yielding u can make it somewhat implicit
20:34:10 so this is one of those pieces where the yielding code is different than taskflow :)
20:34:35 likely we just need to figure out what way to go, yielding makes it hard to run in distributed afaik, since u need the explicit structure i think
20:34:44 would that allow the representation of a "arbitrary directed acyclic graph"?
20:35:07 so thats another interesting one, its not really arbitrary after they construct the DAG right?
20:35:19 the user provides a document, they construct the DAG, at that point its not arbitrary anymore
20:35:32 so i'm not sure about that requirement so much
20:35:39 yes, but they are generated on the fly
20:35:44 that's what he's getting at
20:36:04 you need the ability to build one up, and then set it into action
20:36:15 Sooo, I'm sure everyone is sick of hearing this, but distributed has that capability
20:36:20 :)
20:36:30 You can create and modify workflows on the fly/during runtime
20:36:41 i think the heat requirement isn't even at runtime :)
20:36:51 it seems to be more of a build time dynamic, which yes we have
20:37:03 and build time dynamic is really nice, since then we can analyze it
20:37:18 distributed can deal with flows that have cycles though.. so, it need not be a DAG, although you could require it to be a DAG.
20:37:28 agreed
20:37:31 in Heat stacks can be modified
20:37:36 in which case the graph is updated
20:37:45 interesting
20:37:45 so it is a runtime requirement
20:38:06 i wonder how much updating they allow, because u can basically invalidate the whole DAG
20:38:30 if a task can invalidate the DAG it is in, i wonder how that DAG being broken is resolved
20:38:37 I have not reviewed that code, so I can't speak intelligently about the implementation.
20:38:41 np
20:39:11 #action harlowja figure out what kind of DAG modifications heat can do
20:39:33 adrian_otto does Heat actually keep the graph around and modify it on a modify stack operation? I don't know the code well enough to say.
20:39:55 I think it gets persisted, and can be recalled, but I'm not certain.
20:40:01 that scares me, it brings up a whole world of problems when thats possible
20:40:11 which one is persisted, all of the modified ones ;)
20:40:38 and so on
20:40:38 if a DAG is broken in the middle after an update, do u revert back to the previously persisted one to rollback
20:40:47 I do know it's possible to ask the api for the current stack in a serialized representation (JSON)
20:40:47 and that's the current state, not the original state
20:40:52 and it comes "from the database"
20:40:59 interesting
20:41:16 i definitely wonder how that works
20:41:20 no rollback is implemented that I know of
20:41:29 a stack can go into an error state, but can't be rolled back
20:41:45 ya, i can see why its pretty hard if u allow graph modifications and don't keep the previous graphs ;)
20:41:48 that's where taskflow could make it much more robust.
20:42:17 agreed, although i wonder the real need for said modifications, if we can avoid them that will make everyones lives easier, haha
20:42:43 although i know jlucci will be fine with them ;)
20:42:48 modification to a deployed stack seems like a valid feature.
20:43:00 implementation is surely up for debate :-)
20:43:09 so it seems that we need the authors of that work to participate in a collaboration session to get a better sense of what's needed.
20:43:17 sure, in the forward operation where u don't want to go backwards and rollback i totally see it as a feature, until u start asking the questions of errors ;)
20:43:26 adrian_otto 100% agreed
20:44:04 i think that would help tremendously, we both are doing similar stuff, so why not combine our forces into captain planet
20:44:12 i mean the best library possible
20:44:16 haha
20:44:16 let's see if we can arrange that to happen soon
20:44:23 and address these issues upfront instead of later
20:44:40 kebray: do you have a list of Rackers who should participate?
20:44:40 I'll CC on an e-mail thread we've started with Zane, adrian_otto
20:44:53 since later to me i think might be pretty painful for heat as a project if they don't think about errors and such, but thats my 2 cents
20:44:57 In regards to taskflow/heat compatibility and needs
20:45:03 agreed
20:45:03 jlucci thx!
20:45:30 #action jlucci start nice email thread with heat folks, zaneb, other interesting fellows
20:45:47 :D
20:45:55 adrian_otto: jlucci, randall_burt, you, me, for sure… there are a few others that may be interested.
20:45:56 I'm thinking a Hangout+irc might be a good way to discuss this.
20:46:25 def
20:46:28 jlucci, you may want to correspond via the Heat OpenStack mailing list.
20:46:35 I think there are others that may be interested.
20:46:43 agreed
20:46:48 that'd be fine with me
20:46:52 Sounds good. I'll go ahead and re-do that email then
20:47:05 but I'm convinced that a requirements discussion is beyond the scope of what openstack-dev is good for
20:47:20 so let's arrange something interactive that all stakeholders can attend
20:47:26 great
20:47:53 kebray: do you want to lead coordination, or do you want a hand with that?
20:48:04 i'll try to add some comments to the requirements under [jh] and will try not to be to harsh ;)
20:48:17 * harlowja josh be nice
20:48:19 lol
20:48:40 haha
20:48:48 Just end every sentence with a smiley.
20:48:50 def
20:48:54 lol
20:49:21 adrian_otto… I can add it to my to-do list… but, if I haven't gotten around to it in a few days, then will ask for some help :-)
20:49:30 kebray on a different topic, the summit, how many sessions do u think would be nice, all of them ;)
20:49:36 i think all of them
20:49:37 lol
20:49:39 let's simply recognize that Heat is the best OpenStack use case for the taskflow library, and it's important to make sure they get what they need
20:49:50 #action kebray to coordinate meeting with Heat devs to discuss coordination of task management
20:49:51 100% agree
20:50:06 the other projects, yes, they need it also, but heat is also a core user
20:50:35 kebray: Ok, I will follow up Monday if we don't have the plans secured by then.
20:50:36 the other projects def need it to also, so keep that in mind ;)
20:50:36 nova, cinder...
20:50:36 harlowja: I'd like to see TaskFlow become it's own project track at the summit.. not sure how to make that happen.
20:51:00 interesting
20:51:16 *that would be interesting* (ponder ponder)
20:51:29 ? we had some IBM'ers express interest in contributing to TaskFlow at one point.. .we have anyone in here from "other" companies than Yahoo, Rackspace, ATT, NTT?
20:51:29 let's get it to the point where it can give a good demo
20:52:14 adrian_otto sure i think a great demo is useful, but also just the fundamental concepts are useful to, i guess both would be superb
20:52:18 agreed with adrian_otto… if we can show up at next summit with a kick-butt demo and claim it "stable" for folks to be using, and have the Cinder dependency done, that will be awesome.
20:52:31 + 1
20:52:33 ah, that brings up a good question, how much of cinder :-P
20:52:43 jgriffith yt
20:52:46 How much of cinder done?
20:52:47 ideally all of it
20:53:05 that might require more than me just chugging away on cinder, lol
20:53:08 what's the scope of that effort?
20:53:33 it need not be all done by you
20:53:47 I think it would require some Cinder devs to come on board
20:53:57 hehe, i think the scope is all encompassing, at least it was, https://blueprints.launchpad.net/cinder/+spec/cinder-state-machine
20:53:58 if they have good examples to follow it may be easy enough to divide and conquer
20:54:01 agreed
20:54:15 #action harlowja create those examples for jgriffith and hemna
20:54:44 :)
20:54:57 not sure if they are used to pair sessions, but that's another approach that could work
20:55:06 pair coding?
20:55:12 yep
20:55:14 can be done remotely
20:55:21 interesting, i've never done that, but could be
20:55:42 i think maybe the examples and such (simple examples) would help
20:55:55 +1
20:56:05 and make sure to address more questions that hemna and jgriffith have :)
20:56:06 *at your service*
20:56:08 yes, that's step one, and then you pair up to provide real-time support for them to get the first one done.
20:56:16 and then they can repeat, repeat
20:56:52 that sounds pretty cool, might work
20:56:52 hemna any thoughts?
20:56:52 *ack 4 minutes
20:56:59 where does the time go, lol
20:57:14 sounds good
20:57:24 sweet
20:57:31 I'm still stuck in refactoring nova's attach code into brick
20:57:44 so I've not been much help with this :(
20:57:49 did u guys rename cinder to brick
20:57:51 did i miss that email
20:57:57 *stop renaming stuff people, haha*
20:58:00 :-p
20:58:18 oh, nm, hehe
20:58:30 #link https://github.com/j-griffith/brick
20:58:52 heh
20:59:09 nah brick is a subproject for cinder and hopefully nova at some point
20:59:11 :)
20:59:12 cool
20:59:41 alright, until next time, anyone need more #openstack-state-management :)
20:59:44 hemna: is that the local storage handling stuff?
20:59:57 #endmeeting
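
A rough sketch of one position argued between 20:26 and 20:33: keep each task a single small unit, let the workflow do the checkpointing between tasks, and treat a timeout as "kill = rollback". This is not TaskFlow's or Celery's API. The names `Task`, `run_flow`, `deadline_seconds` and `clock` are hypothetical, chosen only for illustration, and the deadline is checked only between tasks, so a running task is never interrupted mid-way.

```python
import time


class Task:
    """Hypothetical minimal task: one unit of work plus how to undo it."""

    def __init__(self, name, apply, revert):
        self.name = name
        self.apply = apply
        self.revert = revert


def run_flow(tasks, deadline_seconds, clock=time.monotonic):
    """Run tasks in order, checking the deadline at each task boundary.

    The boundary between two tasks is the only checkpoint. If the deadline
    has passed there, or a task raises, every task that already completed
    is reverted in reverse order and the error is re-raised.
    """
    start = clock()
    done = []
    try:
        for task in tasks:
            # Workflow-level checkpoint: never stop a task at a random point.
            if clock() - start > deadline_seconds:
                raise TimeoutError("deadline passed before " + task.name)
            task.apply()
            done.append(task)
    except Exception:
        # "kill = rollback": undo completed work, newest first.
        for task in reversed(done):
            task.revert()
        raise
    return [task.name for task in done]
```

A task that raises half-way through its own `apply` is not reverted here, since it never reached `done`. Whether a task should clean up after itself in that case is part of the "what does kill mean" clarification requested in the meeting.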