#openstack-meeting log

20:00:11 <harlowja> #startmeeting state-management
20:00:12 <openstack> Meeting started Thu Sep 19 20:00:11 2013 UTC and is due to finish in 60 minutes.  The chair is harlowja. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:15 <openstack> The meeting name has been set to 'state_management'
20:00:16 <harlowja> howdy folks!
20:00:23 <harlowja> #link https://wiki.openstack.org/wiki/Meetings/StateManagement#Agenda_for_next_meeting
20:00:29 <melnikov> hi there
20:00:54 <kebray> hello
20:00:58 <caitlin56> hello
20:01:02 <kebray> I didn't do my action item.
20:01:06 <kebray> :-(
20:01:06 <changbl> hey guys
20:01:10 <harlowja> hi hi
20:01:11 <harlowja> ha
20:01:19 <harlowja> kebray u have been a bad boy
20:01:29 <harlowja> #link http://eavesdrop.openstack.org/meetings/state_management/2013/state_management.2013-09-12-20.00.html
20:01:43 <harlowja> hi caitlin56 melnikov changbl
20:01:56 <harlowja> so lets see
20:02:03 <harlowja> #topic action-items-from-last-time
20:02:21 <adrian_otto> hi
20:02:25 <harlowja> so i did some documentation updates for mine, reworking existing docs a little also
20:02:27 <harlowja> hi adrian_otto
20:02:59 <harlowja> #link https://wiki.openstack.org/wiki/TaskFlow/Engines was one of those
20:03:21 <harlowja> #link https://wiki.openstack.org/wiki/TaskFlow/Persistence was the other
20:03:48 <harlowja> i should probably continue improving those (of course)
20:04:22 <harlowja> since without docs, nobody knows what we are doing, ha
20:04:34 <changbl> +1 harlowja
20:04:54 <melnikov> i did my action item, almost -- new convenient function is called taskflow.engines.run(), and it is under review: https://review.openstack.org/46458
20:05:03 <harlowja> thx melnikov !
20:05:33 <changbl> +1 melnikov , I put some comments there
20:05:40 <harlowja> looks like thats progressing well, will be a very nice function to make it really simple to use taskflow
20:05:43 <harlowja> thx changbl
20:05:58 <changbl> yes, as simple as possible, please guys
20:06:12 <harlowja> :)
20:06:20 <harlowja> yes sir!
20:06:24 <changbl> lol
20:06:49 <harlowja> kebray next week maybe u can do your action item ;)
20:06:58 <kebray> hopefully.
20:07:04 <kebray> Will certainly do my best.
20:07:17 <harlowja> :)
20:07:32 <harlowja> #action kebray mess around with taskflow
20:08:04 <harlowja> so cool, lets see about any needed coordination stuffs
20:08:14 <harlowja> #topic overall-effort-and-coordination
20:08:46 <harlowja> so this week i think the big things that are (hopefully) going in are melnikov run(), flow flattening, and the rework of the multi-threaded engine
20:09:08 <harlowja> #link https://review.openstack.org/#/c/46458/
20:09:15 <harlowja> #link https://review.openstack.org/#/c/46692/
20:09:22 <harlowja> #link https://review.openstack.org/#/c/47238/
20:09:27 <harlowja> (in order of mentioning)
20:09:44 <harlowja> i think jessica has been chugging away on the distributed engine also
20:10:14 <harlowja> and lets see, what else, a shout out to sandywalsh_  :-P
20:10:19 <harlowja> we got mentioned on his blog, ha
20:10:25 <harlowja> #link http://www.sandywalsh.com/2013/09/notification-usage-in-openstack-report.html
20:10:46 <harlowja> "We have big hopes that the TaskFlow project will mature and that existing projects will start to use it. Having a common state management library will mean a central location for notification generation." :)
20:10:54 <sandywalsh_> you're welcome :) ... I just watched Jessica's taskflow video ... it's very good
20:10:55 <harlowja> so thanks sandywalsh_
20:11:16 <kebray> harlowja has TaskFlow been accepted into mainline yet?
20:11:27 <kebray> if not, do you know what coordination stuff needs to happen?
20:11:28 <harlowja> mainline meaning?
20:11:31 <kebray> to make that happen?
20:11:32 <kebray> master
20:11:38 <harlowja> master of what?
20:11:42 <kebray> or is it still wip
20:11:50 <harlowja> master of oslo? master of ?
20:11:50 <kebray> gerrit review acceptance
20:11:56 <kebray> master of TaskFlow
20:12:06 <harlowja> confused
20:12:12 <kebray> ok.. we can take that offline
20:12:13 <harlowja> Taskflow has been accepted into master of Taskflow
20:12:15 <harlowja> :-P
20:12:24 <kebray> doh.. sorry, distributed engine.
20:12:25 <kebray> err.
20:12:27 <kebray> typo.
20:12:33 <kebray> has distributed engine been accepted into TaskFlow?
20:12:43 <harlowja> not yet, still being worked on afaik
20:12:57 <harlowja> #link https://review.openstack.org/#/c/45585/
20:13:15 <kebray> Last I heard, it hadn't landed because so many TaskFlow changes were happening... it was like a moving target, and kept breaking distributed WIP.
20:13:40 <kebray> Was wondering if there is any coordination that needs to happen... I want to make sure distributed gets in before the summit.
20:13:51 <harlowja> i think it will, it should calm down i think
20:13:52 <kebray> Was wondering if others are in agreement on that, or against that.
20:14:18 <harlowja> if jessica feels like its ready, and others do as well, then i see no problem in  getting it in
20:14:25 <kebray> k.
20:15:15 <kebray> I know there was some frustration that she couldn't get it in because everything kept changing under the hood... has caused a lot of reworking.. hoping ya'll can help with that, get distributed in, then do rework that is inclusive of ensuring distributed continues to work along the way.
20:15:39 <kebray> otherwise she'll never be able to keep up as the lone distributed developer right now.
20:15:49 <kebray> and, if things continue to be in flux.
20:16:29 <harlowja> sure, but this is the way of how software goes, things may always be in flux
20:16:36 <harlowja> *to some level*
20:16:59 <harlowja> although i do know that being a lone distributed developer is not sustainable, so hopefully that can change sometime
20:17:22 <sandywalsh_> https://twitter.com/TheSandyWalsh/status/380776942196121600
20:17:32 <harlowja> thx sandywalsh_
20:18:13 <harlowja> kebray so i understand the concern, and i agree with u that it needs to get in, although the issue of 'long distributed developer' doesn't go away if its in :)
20:18:19 <harlowja> *lone
20:18:34 <harlowja> especially since i think distributed is actually the most complicated one :)
20:18:49 <harlowja> and has to be done really really carefully
20:18:56 <harlowja> *even if celery is behind it*
20:19:25 <harlowja> but we can chat afterwards maybe about this more?
20:19:40 <harlowja> sound ok kebray ?
20:20:00 <kebray> sounds good.
20:20:40 <harlowja> cool
20:20:55 <harlowja> #topic new-use-cases
20:21:16 <harlowja> so there was a new use-case and maybe some new ideas for cinder backend activities that caitlin56  had recently
20:21:18 <harlowja> #link https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:21:36 <harlowja> caitlin56 do u mind giving a little summary of your idea
20:21:49 <harlowja> *and maybe how u think it relates to taskflow
20:23:01 <harlowja> hmmm, ok, maybe she went away
20:23:25 <harlowja> so the part of that wiki that i was wondering about was
20:23:26 <harlowja> 'To properly utilize Volume Driver abilities, the taskflow code will need to be able to read a list of Volume Driver attributes that document these capabilities. '
20:23:51 <harlowja> where attributes could be
20:23:54 <harlowja> Stateless Activities:
20:23:58 <harlowja> Snapshot Replication:
20:23:59 <harlowja> ...
20:24:12 <harlowja> so i'm working on wrapping my head around how that might affect taskflow
20:24:26 * caitlin56 apologizes - phone call from her boss.
20:24:29 <harlowja> np
20:24:34 <caitlin56> I can discuss the cinder backend now.
20:24:36 <harlowja> sweet
20:24:38 <harlowja> thx caitlin56
20:25:29 <caitlin56> As to the capabilities: these need to cover things like whether the Cinder volume is on the same machine as Cinder or on an external machine.
20:25:54 <caitlin56> The current code assumes the former, and fetches the payload, compresses it and then puts it in object storage.
20:26:28 <caitlin56> This is not the optimal algorithm if the volume is actually on a third machine -- you want to tell *that* machine to put it to the Object Server.
20:26:44 <caitlin56> You can't write a taskflow that is smart without being able to query attributes like that.
20:27:33 <harlowja> so this might affect the construction/runtime of the flow based on those attributes?
20:27:47 <caitlin56> Yes.
20:27:53 <harlowja> k
20:28:00 <caitlin56> So we want to keep them minimal and big.
20:28:39 <harlowja> do u think that the attributes need to be checked at runtime (while executing the flow) or before (during construction)
20:28:55 <harlowja> something like if 'xattribute' use this flow; otherwise use that one
20:29:02 <caitlin56> I'm assuming they would be static.
20:29:16 <caitlin56> You aren't going to change your method of doing snapshots on the fly.
20:29:20 <harlowja> sure
20:29:36 <caitlin56> That's probably an important thing to state, however.
20:29:47 <harlowja> ya, runtime makes it a lot harder :-P
20:29:56 <harlowja> ahead of time, not so bad
20:30:12 <caitlin56> Incredibly hard if the backend thinks it can change dynamically and the taskflow assumes it is static.
20:30:32 <harlowja> ya
20:31:19 <caitlin56> There are several storage scenarios, but what I think is best is to concentrate on backing a volume up to an object as the first, get it right, and then generalize.
20:31:28 <harlowja> agreed
20:31:31 <harlowja> so how would these attributes affect taskflow i guess is my question, if the attributes are ahead-of-time determined, then the thing that uses taskflow can look at those attributes and form a 'set' of tasks using those attributes?
20:31:52 <harlowja> or do u imagine something else there?
20:31:55 <caitlin56> yes, a big outer if statement.
20:32:20 <harlowja> the way i could see it affecting taskflow is if u have something like the following
20:32:22 <caitlin56> I think we can have 2 or 3 strategies, and deal with hundreds of backends.
20:32:36 <harlowja> def volume_get_flow_for_snapshot()
20:32:40 <harlowja> this would then return a flow
20:32:51 <harlowja> and then the outer thing would analyze that returned flow for 'attributes'
20:33:02 <harlowja> but it seems like the outer if statement wouldn't need that
20:33:35 <harlowja> so would taskflow i guess be the thing responsible for just holding the attribute -> flow mapping
20:33:51 <harlowja> similar to #link https://review.openstack.org/#/c/43051/
20:33:52 <caitlin56> Well, to cite one specific example, you have storage backends with relatively low cost snapshots and those with heavy cost snapshots.
20:34:10 <harlowja> sure
20:34:26 <caitlin56> With cheap snapshots, you do a backup by taking an anonymous snapshot, backing the snapshot up and then deleting it.
20:34:40 <caitlin56> The state of the volume is not impacted by your backing it up.
20:35:03 <caitlin56> With expensive snapshots, you want to quiesce the volume (putting it in a 'backing-up' state) and copy from the volume itself.
20:35:22 <caitlin56> But that's two strategies, not one strategy for each vendor.
20:35:26 <harlowja> right
20:35:27 <harlowja> gotcha
20:35:43 <harlowja> so u thinking that taskflow could provide that attribute -> strategy layer (?)
20:35:48 <harlowja> or should that reside in cinder
20:35:49 <caitlin56> The trick is to provide progress in either case.
20:36:19 <harlowja> sure
20:36:32 <harlowja> which taskflow can do since its hookin in at that 'level'
20:36:37 <caitlin56> It will vary. I think that one can be done with clever coding of the backup_volume method. Others will require a more outer switch.
20:37:01 <harlowja> sure
20:37:21 <caitlin56> BTW, everything we're discussing with Cinder would apply to Manila (NFS shares for openstack) if that project is approved.
20:37:34 <harlowja> of course, i think it applies elsewhere also :)
20:38:12 <caitlin56> Anywhere you are dealing with long running operations and differing implementation strategies.
20:38:12 <harlowja> just trying to distill i down to how it might look in taskflow :)
20:38:21 <harlowja> *it down
20:39:05 <harlowja> sure, so maybe it starts off with that attribute -> strategy object, this object can be queried given a bunch of attributes (and gives back the strategy)
20:39:28 <harlowja> then cinder can register its attributes/strategies
20:39:43 <harlowja> and tell taskflow to go execute the strategy with XYZ attributes
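[Sketch of the attribute -> strategy object described above. This is a minimal illustration only, assuming static attributes that are checked before the flow is built. Every name here (StrategyRegistry, backup_via_snapshot, backup_via_quiesce, the "cheap_snapshots" attribute) is hypothetical and is not part of taskflow or cinder; strategies return plain lists of step names rather than real flows.]

```python
class StrategyRegistry(object):
    """Hypothetical attribute -> strategy mapping (not a taskflow API)."""

    def __init__(self):
        self._entries = []  # list of (required attributes, strategy)

    def register(self, required, strategy):
        # A strategy applies when all of its required attributes are present.
        self._entries.append((frozenset(required), strategy))

    def lookup(self, attributes):
        attrs = frozenset(attributes)
        best = None
        for required, strategy in self._entries:
            # Prefer the most specific strategy whose requirements are met.
            if required <= attrs and (best is None or len(required) > len(best[0])):
                best = (required, strategy)
        if best is None:
            raise LookupError("no strategy for attributes %r" % sorted(attrs))
        return best[1]


def backup_via_snapshot(volume):
    # Cheap snapshots: the volume state is not impacted by the backup.
    return ["snapshot %s" % volume, "copy snapshot to object store",
            "delete snapshot"]


def backup_via_quiesce(volume):
    # Expensive snapshots: quiesce and copy from the volume itself.
    return ["quiesce %s" % volume, "copy volume to object store",
            "resume %s" % volume]


registry = StrategyRegistry()
registry.register({"cheap_snapshots"}, backup_via_snapshot)
registry.register(set(), backup_via_quiesce)  # fallback when nothing else fits

steps = registry.lookup({"cheap_snapshots"})("vol-1")
```

[The point of the sketch is that two or three registered strategies can cover many backends: a driver only reports its attributes, and the lookup picks the strategy.]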
20:40:00 <caitlin56> I'm not an expert on python style, so I'm very flexible about exact coding representation.
20:40:05 <harlowja> :)
20:40:28 <harlowja> does the general idea though seem to be what u are aiming for?
20:41:15 <caitlin56> Yes, plus creating some sort of abstraction that is inclusive of a normal task and something running on a server not under openstack control.
20:41:40 <caitlin56> Both would report progress, but you wouldn't be able to control priorities/etc for an external "activity".
20:41:57 <caitlin56> So I suppose it's a subset interface.
20:42:10 <harlowja> sure, i think #link https://review.openstack.org/#/c/46331/ is part of that second one
20:42:36 <ekarlso> so
20:42:40 <harlowja> although priority control taskflow currently doesn't do (although its an interesting idea), ha
20:42:40 <ekarlso> what's goin' on ?
20:42:51 <harlowja> hi ekarlso
20:43:02 <caitlin56> harlowja: yes. That combines the progress reporting too.
20:43:14 <harlowja> ekarlso just chatting about https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:44:03 <caitlin56> But none of us (Nexenta) are really expert at core Nova compute stuff. If we're doing things in a way that is awkward do not be afraid to tell us what would be a more normal interface.
20:44:14 <harlowja> i think it seems like an ok interface
20:44:27 <harlowja> its a generally useful set of concepts i think :)
20:45:19 <harlowja> caitlin56 let me write up a blueprint for this in taskflow, and see if i captured it correctly
20:45:21 <harlowja> sound ok?
20:45:21 <caitlin56> And I think it would even take us as far as having a taskflow that managed failovers between hot standbys -- without requiring each vendor to code that.
20:45:35 <caitlin56> sounds good.
20:46:03 <harlowja> #action harlowja writeup blueprint with distilled taskflow (work idea) for https://wiki.openstack.org/wiki/Cinder_Backend_Activities
20:46:21 <harlowja> managed failovers with hot standbys, hmmmm
20:46:44 <harlowja> that requires the persistence stuff though, to know where to 'pick up last'
20:46:58 <harlowja> correct?
20:47:02 <caitlin56> correct.
20:47:18 <harlowja> k, np, thats being polished as we speak
20:47:33 <caitlin56> The strategy I favor is incremental snapshots, and cloning the volume from the most recent snapshot.
20:47:52 <caitlin56> But you can also do a continuous transaction feed.
20:48:03 <caitlin56> One policy - multiple implementations.
20:48:09 <harlowja> sure
20:48:26 <harlowja> thats the openstack way :-P
20:49:21 <harlowja> cool, thx caitlin56 for discussing, i'll try to see if i can write what might need to be done up :)
20:49:30 <harlowja> and pass it by u
20:49:34 <harlowja> *and others*
20:50:01 <harlowja> #topic open-discuss
20:50:21 <harlowja> anything anyone wants to talk about that i missed?? :)
20:50:41 <harlowja> thx changbl for helping out with some reviews
20:50:46 <changbl> np harlowja
20:50:48 <melnikov> what about some crazy ideas? #link https://blueprints.launchpad.net/taskflow/+spec/eliminate-patterns
20:50:48 <changbl> maybe out of scope, I have one question:
20:50:58 <harlowja> oh crazy ideas, haha!
20:51:51 <harlowja> i like the crazy ideas, although i want to make sure jessica doesn't hate us too much by changing that while she is still working on the big distributed engine piece
20:51:55 <changbl> generalize linear_flow to be graph_flow?
20:52:02 <changbl> lol
20:52:15 <harlowja> changbl i think u have the same crazy idea as melnikov :-P
20:52:28 <changbl> i have one question on concurrency though
20:52:54 <harlowja> sure
20:52:59 <changbl> so say I have flow1 which touches some resources, and so does flow2, and I execute them simultaneously
20:53:13 <harlowja> k
20:53:21 <changbl> any way to guarantee flow1 and flow2 do not touch the same resources?
20:53:24 <changbl> by openstack?
20:53:32 <harlowja> not without https://wiki.openstack.org/wiki/StructuredWorkflowLocks
20:53:51 <harlowja> *which doesn't exist yet
20:54:17 <changbl> so basically there is no concurrency control yet?
20:54:23 <harlowja> file level concurrency control
20:54:35 <harlowja> some attempts at database level locking used for concurrency control
20:54:59 <harlowja> nova and cinder use a 'state' field in the DB to try to not squash themselves
20:55:05 <harlowja> and file level locks
20:55:14 <harlowja> caitlin56 i think has experience with the cinder one
20:55:24 <harlowja> *since its problematic for her i think
20:55:37 <changbl> ok, got the DB part. still confused with the file-level one...
20:55:56 <changbl> can you illustrate the file-level one more?
20:55:59 <harlowja> nova locks a file to make sure its simultaneously mutating a hypervisor
20:56:03 <harlowja> *a vm i mean
20:56:16 <harlowja> *to make sure its not simultaneously mutating a vm
20:56:18 <changbl> ok, got it
20:56:24 <changbl> thanks
20:56:32 <harlowja> sure, its not ideal imho
20:56:57 <harlowja> i was hoping for something like that above wiki, and a way for tasks to 'define' what resources they are using, and taskflow would lock them
20:57:05 <harlowja> and unlock them
20:57:28 <harlowja> and u could use different locking implementations (up to the deployer)
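[Sketch of the idea above: tasks declare the resources they use, and the runner locks and unlocks them around execution. This is an illustration under assumptions, not how taskflow works (the StructuredWorkflowLocks proposal did not exist yet at the time of this meeting). Task, LocalLockManager and run_with_locks are hypothetical names, and the in-process threading.Lock backend stands in for whatever locking implementation a deployer would choose.]

```python
import threading
from contextlib import ExitStack


class LocalLockManager(object):
    """Hypothetical in-process lock backend; a deployer could swap in another."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def lock_for(self, resource):
        # One lock object per resource name, created on first use.
        with self._guard:
            return self._locks.setdefault(resource, threading.Lock())


class Task(object):
    # Names of the resources this task will touch (declared up front).
    resources = ()

    def execute(self):
        raise NotImplementedError


def run_with_locks(task, manager):
    with ExitStack() as stack:
        # Acquire in sorted order so two tasks sharing resources cannot deadlock.
        for name in sorted(set(task.resources)):
            stack.enter_context(manager.lock_for(name))
        return task.execute()  # all locks are released when the block exits


class ResizeVm(Task):
    resources = ("vm-1", "volume-7")

    def execute(self):
        return "resized"


manager = LocalLockManager()
result = run_with_locks(ResizeVm(), manager)
```

[Two flows that declare an overlapping resource would serialize on that lock, while flows with disjoint resources run freely; this is much less exhaustive than the lock analysis mentioned next.]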
20:57:33 <changbl> yes, in our TROPIC paper, we built some kind of lock manager
20:57:36 <harlowja> ya
20:57:54 <harlowja> although u guys did a much more exhaustive lock analysis :-P
20:58:01 <changbl> :)
20:58:16 <harlowja> in openstack, if taskflow can just attempt to do resource locks, that will be a good start, ha
20:58:24 <harlowja> and tasks can declare what resources they will touch
20:58:24 <changbl> borrowed some ideas from DB locking
20:58:31 <harlowja> ya
20:58:44 <caitlin56> Avoiding locks is even better, but with current DBs it might be the only way
20:58:57 <harlowja> so full on tropic stuff is hard i think with openstack, but something in the middle might be doable
20:59:23 <harlowja> so thats my thinking anyway changbl
20:59:29 * harlowja i did read your guys paper :)
20:59:34 <changbl> thanks harlowja
20:59:38 <changbl> :)
20:59:42 <harlowja> ok, times up
20:59:42 <harlowja> eck
20:59:44 <changbl> i think it is time
20:59:54 <harlowja> jump into #openstack-state-management if u want to talk more
20:59:58 <harlowja> #endmeeting