20:00:11 <harlowja> #startmeeting state-management
20:00:12 <openstack> Meeting started Thu Sep 19 20:00:11 2013 UTC and is due to finish in 60 minutes. The chair is harlowja. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:15 <openstack> The meeting name has been set to 'state_management'
20:00:16 <harlowja> howdy folks!
20:00:23 <harlowja> #link https://wiki.openstack.org/wiki/Meetings/StateManagement#Agenda_for_next_meeting
20:00:29 <melnikov> hi there
20:00:54 <kebray> hello
20:00:58 <caitlin56> hello
20:01:02 <kebray> I didn't do my action item.
20:01:06 <kebray> :-(
20:01:06 <changbl> hey guys
20:01:10 <harlowja> hi hi
20:01:11 <harlowja> ha
20:01:19 <harlowja> kebray u have been a bad boy
20:01:29 <harlowja> #link http://eavesdrop.openstack.org/meetings/state_management/2013/state_management.2013-09-12-20.00.html
20:01:43 <harlowja> hi caitlin56 melnikov changbl
20:01:56 <harlowja> so lets see
20:02:03 <harlowja> #topic action-items-from-last-time
20:02:21 <adrian_otto> hi
20:02:25 <harlowja> so i did some documentation updates for mine, reworking existing docs a little also
20:02:27 <harlowja> hi adrian_otto
20:02:59 <harlowja> #link https://wiki.openstack.org/wiki/TaskFlow/Engines was one of those
20:03:21 <harlowja> #link https://wiki.openstack.org/wiki/TaskFlow/Persistence was the other
20:03:48 <harlowja> i should probably continue improving those (of course)
20:04:22 <harlowja> since without docs, nobody knows what we are doing, ha
20:04:34 <changbl> +1 harlowja
20:04:54 <melnikov> i did my action item, almost -- new convenient function is called taskflow.engines.run(), and it is under review: https://review.openstack.org/46458
20:05:03 <harlowja> thx melnikov !
20:05:33 <changbl> +1 melnikov , I put some comments there
20:05:40 <harlowja> looks like thats progressing well, will be a very nice function to make it really simple to use taskflow
20:05:43 <harlowja> thx changbl
20:05:58 <changbl> yes, as simple as possible, please guys
20:06:12 <harlowja> :)
20:06:20 <harlowja> yes sir!
20:06:24 <changbl> lol
20:06:49 <harlowja> kebray next week maybe u can do your action item ;)
20:06:58 <kebray> hopefully.
20:07:04 <kebray> Will certainly do my best.
20:07:17 <harlowja> :)
20:07:32 <harlowja> #action kebray mess around with taskflow
20:08:04 <harlowja> so cool, lets see about any needed coordination stuffs
20:08:14 <harlowja> #topic overall-effort-and-coordination
20:08:46 <harlowja> so this week i think the big things that are (hopefully) going in are melnikov's run(), flow flattening, and the rework of the multi-threaded engine
20:09:08 <harlowja> #link https://review.openstack.org/#/c/46458/
20:09:15 <harlowja> #link https://review.openstack.org/#/c/46692/
20:09:22 <harlowja> #link https://review.openstack.org/#/c/47238/
20:09:27 <harlowja> (in order of mentioning)
20:09:44 <harlowja> i think jessica has been chugging away on the distributed engine also
20:10:14 <harlowja> and lets see, what else, a shout out to sandywalsh_ :-P
20:10:19 <harlowja> we got mentioned on his blog, ha
20:10:25 <harlowja> #link http://www.sandywalsh.com/2013/09/notification-usage-in-openstack-report.html
20:10:46 <harlowja> "We have big hopes that the TaskFlow project will mature and that existing projects will start to use it. Having a common state management library will mean a central location for notification generation." :)
20:10:54 <sandywalsh_> you're welcome :) ... I just watched Jessica's taskflow video ... it's very good
20:10:55 <harlowja> so thanks sandywalsh_
20:11:16 <kebray> harlowja has TaskFlow been accepted into mainline yet?
20:11:27 <kebray> if not, do you know what coordination stuff needs to happen?
20:11:28 <harlowja> mainline meaning?
20:11:31 <kebray> to make that happen?
20:11:32 <kebray> master
20:11:38 <harlowja> master of what?
20:11:42 <kebray> or is it still wip
20:11:50 <harlowja> master of oslo? master of ?
20:11:50 <kebray> gerrit review acceptance
20:11:56 <kebray> master of TaskFlow
20:12:06 <harlowja> confused
20:12:12 <kebray> ok.. we can take that offline
20:12:13 <harlowja> Taskflow has been accepted into master of Taskflow
20:12:15 <harlowja> :-P
20:12:24 <kebray> doh.. sorry, distributed engine.
20:12:25 <kebray> err.
20:12:27 <kebray> typo.
20:12:33 <kebray> has distributed engine been accepted into TaskFlow?
20:12:43 <harlowja> not yet, still being worked on afaik
20:12:57 <harlowja> #link https://review.openstack.org/#/c/45585/
20:13:15 <kebray> Last I heard, it hadn't landed because so many TaskFlow changes were happening... it was like a moving target, and kept breaking distributed WIP.
20:13:40 <kebray> Was wondering if there is any coordination that needs to happen... I want to make sure distributed gets in before the summit.
20:13:51 <harlowja> i think it will, it should calm down i think
20:13:52 <kebray> Was wondering if others are in agreement on that, or against that.
20:14:18 <harlowja> if jessica feels like its ready, and others do as well, then i see no problem in getting it in
20:14:25 <kebray> k.
20:15:15 <kebray> I know there was some frustration that she couldn't get it in because everything kept changing under the hood... has caused a lot of reworking.. hoping y'all can help with that, get distributed in, then do rework that is inclusive of ensuring distributed continues to work along the way.
20:15:39 <kebray> otherwise she'll never be able to keep up as the lone distributed developer right now.
20:15:49 <kebray> and, if things continue to be in flux.
20:16:29 <harlowja> sure, but this is the way of how software goes, things may always be in flux
20:16:36 <harlowja> *to some level*
20:16:59 <harlowja> although i do know that being a lone distributed developer is not sustainable, so hopefully that can change sometime
20:17:22 <sandywalsh_> https://twitter.com/TheSandyWalsh/status/380776942196121600
20:17:32 <harlowja> thx sandywalsh_
20:18:13 <harlowja> kebray so i understand the concern, and i agree with u that it needs to get in, although the issue of 'long distributed developer' doesn't go away if its in :)
20:18:19 <harlowja> *lone
20:18:34 <harlowja> especially since i think distributed is actually the most complicated one :)
20:18:49 <harlowja> and has to be done really really carefully
20:18:56 <harlowja> *even if celery is behind it*
20:19:25 <harlowja> but we can chat afterwards maybe about this more?
20:19:40 <harlowja> sound ok kebray ?
20:20:00 <kebray> sounds good.
20:20:40 <harlowja> cool
20:20:55 <harlowja> #topic new-use-cases
20:21:16 <harlowja> so there was a new use-case and maybe some new ideas for cinder backend activities that caitlin56 had recently
20:21:18 <harlowja> #link https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:21:36 <harlowja> caitlin56 do u mind giving a little summary of your idea
20:21:49 <harlowja> *and maybe how u think it relates to taskflow
20:23:01 <harlowja> hmmm, ok, maybe she went away
20:23:25 <harlowja> so the part of that wiki that i was wondering about was
20:23:26 <harlowja> 'To properly utilize Volume Driver abilities, the taskflow code will need to be able to read a list of Volume Driver attributes that document these capabilities. '
20:23:51 <harlowja> where attributes could be
20:23:54 <harlowja> Stateless Activities:
20:23:58 <harlowja> Snapshot Replication:
20:23:59 <harlowja> ...
20:24:12 <harlowja> so i'm working on wrapping my head around how that might affect taskflow
20:24:26 * caitlin56 apologizes - phone call from her boss.
20:24:29 <harlowja> np
20:24:34 <caitlin56> I can discuss the cinder backend now.
20:24:36 <harlowja> sweet
20:24:38 <harlowja> thx caitlin56
20:25:29 <caitlin56> As to the capabilities: these need to cover things like whether the Cinder volume is on the same machine as Cinder or on an external machine.
20:25:54 <caitlin56> The current code assumes the former, and fetches the payload, compresses it and then puts it in object storage.
20:26:28 <caitlin56> This is not the optimal algorithm if the volume is actually on a third machine -- you want to tell *that* machine to put it to the Object Server.
20:26:44 <caitlin56> You can't write a taskflow that is smart without being able to query attributes like that.
20:27:33 <harlowja> so this might affect the construction/runtime of the flow based on those attributes?
20:27:47 <caitlin56> Yes.
20:27:53 <harlowja> k
20:28:00 <caitlin56> So we want to keep them minimal and big.
20:28:39 <harlowja> do u think that the attributes need to be run at runtime (while executing the flow) or before (during construction)
20:28:55 <harlowja> something like if 'xattribute' use this flow; otherwise use that one
20:29:02 <caitlin56> I'm assuming they would be static.
20:29:16 <caitlin56> You aren't going to change your method of doing snapshots on the fly.
20:29:20 <harlowja> sure
20:29:36 <caitlin56> That's probably an important thing to state, however.
20:29:47 <harlowja> ya, runtime makes it a lot harder :-P
20:29:56 <harlowja> ahead of time, not so bad
20:30:12 <caitlin56> Incredibly hard if the backend thinks it can change dynamically and the taskflow assumes it is static.
20:30:32 <harlowja> ya
20:31:19 <caitlin56> There are several storage scenarios, but what I think is best is to concentrate on backing a volume up to an object as the first, get it right, and then generalize.
20:31:28 <harlowja> agreed
20:31:31 <harlowja> so how would these attributes affect taskflow i guess is my question, if the attributes are ahead-of-time determined, then the thing that uses taskflow can look at those attributes and form a 'set' of tasks using those attributes?
20:31:52 <harlowja> or do u imagine something else there?
20:31:55 <caitlin56> yes, a big outer if statement.
20:32:20 <harlowja> the way i could see it affecting taskflow is if u have something like the following
20:32:22 <caitlin56> I think we can have 2 or 3 strategies, and deal with hundreds of backends.
20:32:36 <harlowja> def volume_get_flow_for_snapshot()
20:32:40 <harlowja> this would then return a flow
20:32:51 <harlowja> and then the outer thing would analyze that returned flow for 'attributes'
20:33:02 <harlowja> but it seems like the outer if statement wouldn't need that
20:33:35 <harlowja> so would taskflow i guess be the thing responsible for just holding the attribute -> flow mapping
20:33:51 <harlowja> similar to #link https://review.openstack.org/#/c/43051/
20:33:52 <caitlin56> Well, to cite one specific example, you have storage backends with relatively low cost snapshots and those with heavy cost snapshots.
20:34:10 <harlowja> sure
20:34:26 <caitlin56> With cheap snapshots, you do a backup by taking an anonymous snapshot, backing the snapshot up and then deleting it.
20:34:40 <caitlin56> The state of the volume is not impacted by your backing it up.
20:35:03 <caitlin56> With expensive snapshots, you want to quiesce the volume (putting it in a 'backing-up' state) and copy from the volume itself.
20:35:22 <caitlin56> But that's two strategies, not one strategy for each vendor.
20:35:26 <harlowja> right
20:35:27 <harlowja> gotcha
20:35:43 <harlowja> so u thinking that taskflow could provide that attribute -> strategy layer (?)
20:35:48 <harlowja> or should that reside in cinder
20:35:49 <caitlin56> The trick is to provide progress in either case.
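The cheap-versus-expensive snapshot split discussed above can be sketched as a construction-time choice between two step lists. This is a minimal illustration only, not TaskFlow or Cinder code: `DriverAttributes`, `cheap_snapshots`, the step names, and `run_steps` are all hypothetical, and the final un-quiesce step is an assumption that the discussion does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class DriverAttributes:
    """Hypothetical static capabilities a volume driver might advertise."""
    cheap_snapshots: bool


def backup_via_snapshot() -> List[str]:
    # Cheap snapshots: back up an anonymous snapshot, then delete it.
    # The volume's own state is not affected.
    return ["create_anonymous_snapshot", "backup_snapshot", "delete_snapshot"]


def backup_via_quiesce() -> List[str]:
    # Expensive snapshots: quiesce the volume and copy from the volume itself.
    # The last step is an assumption; the discussion only covers the first two.
    return ["quiesce_volume", "copy_from_volume", "unquiesce_volume"]


def build_backup_steps(attrs: DriverAttributes) -> List[str]:
    """The 'big outer if': pick one strategy before anything runs."""
    if attrs.cheap_snapshots:
        return backup_via_snapshot()
    return backup_via_quiesce()


def run_steps(steps: List[str], report: Callable[[str, float], None]) -> None:
    """Walk the steps in order, reporting fractional progress after each."""
    for index, step in enumerate(steps, start=1):
        # A real step would call into the storage backend here.
        report(step, index / len(steps))
```

Because the attributes are treated as static, the choice happens once while the step list is built, and progress is reported the same way whichever strategy was picked.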
20:36:19 <harlowja> sure
20:36:32 <harlowja> which taskflow can do since its hooking in at that 'level'
20:36:37 <caitlin56> It will vary. I think that one can be done with clever coding of the backup_volume method. Others will require a more outer switch.
20:37:01 <harlowja> sure
20:37:21 <caitlin56> BTW, everything we're discussing with Cinder would apply to Manila (NFS shares for openstack) if that project is approved.
20:37:34 <harlowja> of course, i think it applies elsewhere also :)
20:38:12 <caitlin56> Anywhere you are dealing with long running operations and differing implementation strategies.
20:38:12 <harlowja> just trying to distill i down to how it might look in taskflow :)
20:38:21 <harlowja> *it down
20:39:05 <harlowja> sure, so maybe it starts off with that attribute -> strategy object, this object can be queried given a bunch of attributes (and gives back the strategy)
20:39:28 <harlowja> then cinder can register its attributes/strategies
20:39:43 <harlowja> and tell taskflow to go execute the strategy with XYZ attributes
20:40:00 <caitlin56> I'm not an expert on python style, so I'm very flexible about exact coding representation.
20:40:05 <harlowja> :)
20:40:28 <harlowja> does the general idea though seem to be what u are aiming for?
20:41:15 <caitlin56> Yes, plus creating some sort of abstraction that is inclusive of a normal task and something running on a server not under openstack control.
20:41:40 <caitlin56> Both would report progress, but you wouldn't be able to control priorities/etc for an external "activity".
20:41:57 <caitlin56> So I suppose it's a subset interface.
20:42:10 <harlowja> sure, i think #link https://review.openstack.org/#/c/46331/ is part of that second one
20:42:36 <ekarlso> so
20:42:40 <harlowja> although priority control taskflow currently doesn't do (although its an interesting idea), ha
20:42:40 <ekarlso> what's goin' on ?
20:42:51 <harlowja> hi ekarlso
20:43:02 <caitlin56> harlowja: yes. That combines the progress reporting too.
20:43:14 <harlowja> ekarlso just chatting about https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:44:03 <caitlin56> But none of us (Nexenta) are really expert at core Nova compute stuff. If we're doing things in a way that is awkward do not be afraid to tell us what would be a more normal interface.
20:44:14 <harlowja> i think it seems like an ok interface
20:44:27 <harlowja> its a generally useful set of concepts i think :)
20:45:19 <harlowja> caitlin56 let me write up a blueprint for this in taskflow, and see if i captured it correctly
20:45:21 <harlowja> sound ok?
20:45:21 <caitlin56> And I think it would even take us as far as having taskflow that managed failovers between hot standbys -- without requiring each vendor to code that.
20:45:35 <caitlin56> sounds good.
20:46:03 <harlowja> #action harlowja writeup blueprint with distilled taskflow (work idea) for https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:46:21 <harlowja> managed failovers with hot standbys, hmmmm
20:46:44 <harlowja> that requires the persistence stuff though, to know where to 'pick up last'
20:46:58 <harlowja> correct?
20:47:02 <caitlin56> correct.
20:47:18 <harlowja> k, np, thats being polished as we speak
20:47:33 <caitlin56> The strategy I favor is incremental snapshots, and cloning the volume from the most recent snapshot.
20:47:52 <caitlin56> But you can also do a continuous transaction feed.
20:48:03 <caitlin56> One policy - multiple implementations.
20:48:09 <harlowja> sure
20:48:26 <harlowja> thats the openstack way :-P
20:49:21 <harlowja> cool, thx caitlin56 for discussing, i'll try to see if i can write what might need to be done up :)
20:49:30 <harlowja> and pass it by u
20:49:34 <harlowja> *and others*
20:50:01 <harlowja> #topic open-discuss
20:50:21 <harlowja> anything anyone wants to talk about that i missed?? :)
20:50:41 <harlowja> thx changbl for helping out with some reviews
20:50:46 <changbl> np harlowja
20:50:48 <melnikov> what about some crazy ideas? #link https://blueprints.launchpad.net/taskflow/+spec/eliminate-patterns
20:50:48 <changbl> maybe out of scope, I have one question:
20:50:58 <harlowja> oh crazy ideas, haha!
20:51:51 <harlowja> i like the crazy ideas, although i want to make sure jessica doesn't hate us too much by changing that while she is still working on the big distributed engine piece
20:51:55 <changbl> generalize linear_flow to be graph_flow?
20:52:02 <changbl> lol
20:52:15 <harlowja> changbl i think u have the same crazy idea as melnikov :-P
20:52:28 <changbl> i have one question on concurrency though
20:52:54 <harlowja> sure
20:52:59 <changbl> so say I have flow1 which touches some resources, and so does flow2, and I execute them simultaneously
20:53:13 <harlowja> k
20:53:21 <changbl> any way to guarantee flow1 and flow2 do not touch the same resources?
20:53:24 <changbl> by openstack?
20:53:32 <harlowja> not without https://wiki.openstack.org/wiki/StructuredWorkflowLocks
20:53:51 <harlowja> *which doesn't exist yet
20:54:17 <changbl> so basically there is no concurrency control yet?
20:54:23 <harlowja> file level concurrency control
20:54:35 <harlowja> some attempts at database level locking used for concurrency control
20:54:59 <harlowja> nova and cinder use a 'state' field in the DB to try to not squash themselves
20:55:05 <harlowja> and file level locks
20:55:14 <harlowja> caitlin56 i think has experience with the cinder one
20:55:24 <harlowja> *since its problematic for her i think
20:55:37 <changbl> ok, got the DB part. still confused with the file-level one...
20:55:56 <changbl> can you illustrate the file-level one more?
20:55:59 <harlowja> nova locks a file to make sure its simultaneously mutating a hypervisor
20:56:03 <harlowja> *a vm i mean
20:56:16 <harlowja> *to make sure its not simultaneously mutating a vm
20:56:18 <changbl> ok, got it
20:56:24 <changbl> thanks
20:56:32 <harlowja> sure, its not ideal imho
20:56:57 <harlowja> i was hoping for something like that above wiki, and a way for tasks to 'define' what resources they are using, and taskflow would lock them
20:57:05 <harlowja> and unlock them
20:57:28 <harlowja> and u could use different locking implementations (up to the deployer)
20:57:33 <changbl> yes, in our TROPIC paper, we built some kind of lock manager
20:57:36 <harlowja> ya
20:57:54 <harlowja> although u guys did a much more exhaustive lock analysis :-P
20:58:01 <changbl> :)
20:58:16 <harlowja> in openstack, if taskflow can just attempt to do resource locks, that will be a good start, ha
20:58:24 <harlowja> and tasks can declare what resources they will touch
20:58:24 <changbl> borrowed some ideas from DB locking
20:58:31 <harlowja> ya
20:58:44 <caitlin56> Avoiding locks is even better, but with current DBs it might be the only way
20:58:57 <harlowja> so full on tropic stuff is hard i think with openstack, but something in the middle might be doable
20:59:23 <harlowja> so thats my thinking anyway changbl
20:59:29 * harlowja i did read your guys paper :)
20:59:34 <changbl> thanks harlowja
20:59:38 <changbl> :)
20:59:42 <harlowja> ok, times up
20:59:42 <harlowja> eck
20:59:44 <changbl> i think it is time
20:59:54 <harlowja> jump into #openstack-state-management if u want to talk more
20:59:58 <harlowja> #endmeeting
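The idea raised in the open discussion, tasks declaring the resources they will touch and the runner locking and unlocking them, might look roughly like the sketch below. It is an assumption-laden illustration rather than anything TaskFlow provided at the time: `ResourceLocks`, `Task`, and `run_task` are hypothetical names, and an in-process `threading.Lock` stands in for whatever lock backend a deployer would choose.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterable, Iterator, List


class ResourceLocks:
    """In-process stand-in for a pluggable lock backend (hypothetical)."""

    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: Dict[str, threading.Lock] = defaultdict(threading.Lock)

    @contextmanager
    def holding(self, resources: Iterable[str]) -> Iterator[None]:
        # Acquire in sorted order so two tasks that share resources
        # always take the locks in the same order and cannot deadlock.
        names = sorted(set(resources))
        with self._guard:
            locks = [self._locks[name] for name in names]
        acquired: List[threading.Lock] = []
        try:
            for lock in locks:
                lock.acquire()
                acquired.append(lock)
            yield
        finally:
            # Release whatever was taken, newest first.
            for lock in reversed(acquired):
                lock.release()


class Task:
    """A unit of work that declares up front which resources it touches."""

    def __init__(self, name: str, resources: Iterable[str],
                 action: Callable[[], Any]) -> None:
        self.name = name
        self.resources = list(resources)
        self.action = action


def run_task(task: Task, locks: ResourceLocks) -> Any:
    """Lock the declared resources, run the task, then unlock them."""
    with locks.holding(task.resources):
        return task.action()
```

Two tasks that both declare `"vm-1"` would then be serialized by `run_task`, while tasks with disjoint resource lists could proceed concurrently.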