13:00:39 <Qiming> #startmeeting senlin
13:00:40 <openstack> Meeting started Tue Sep  1 13:00:39 2015 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:41 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:43 <openstack> The meeting name has been set to 'senlin'
13:00:56 <haiwei> hi
13:01:15 <jruano> hi
13:01:25 <yanyanhu> hi
13:02:05 <Qiming> ?
13:02:45 <Qiming> okay, it is working
13:02:52 <Qiming> maybe my network connection is too bad
13:03:01 <yanyanhu> guess so ;)
13:03:16 <Qiming> anyway
13:03:42 <Qiming> please feel free to add agenda items: https://wiki.openstack.org/wiki/Meetings/SenlinAgenda
13:04:13 <Qiming> #topic l-3 milestone items
13:04:20 <Qiming> #link https://etherpad.openstack.org/p/senlin-liberty-workitems
13:04:52 <Qiming> just did a cleanup on the etherpad page
13:05:22 <Qiming> in backlog, we still have some test cases
13:05:42 <Qiming> keystone and sdk test cases still not there?
13:06:01 <yanyanhu> yes
13:06:01 <jruano> yes some of those are mine. i am working on them today
13:06:25 <Qiming> I'm signing on the keystone and sdk unit tests
13:06:35 <Qiming> L3 goals
13:06:49 <Qiming> container clusters ...
13:07:25 <Qiming> the progress of last week was good, haven't heard a thing since then from the team
13:07:30 <Qiming> need to catch up
13:07:42 <Qiming> #action Qiming to catch up with the SUR team on progress
13:08:09 <Qiming> placement policy, we have a simple POC there, 219212
13:08:16 <Qiming> need to make it work before release
13:08:47 <Qiming> I'm not worrying about cross-region support, the key is about algorithm, it has to be flexible
13:09:21 <Qiming> patch 219212 was just checked in by Xinhui, Xinhui cannot join us today due to biz trip
13:09:21 <patchbot> Qiming: https://review.openstack.org/#/c/219212/
13:09:38 <Qiming> hello patchbot
13:09:52 <Qiming> exception handling ..
13:09:58 <Qiming> haiwei, anything new?
13:10:18 <haiwei> no, it's already finished i think
13:10:28 <Qiming> I'm seeing all items crossed out
13:10:30 <haiwei> I ran some tests these days and it works fine
13:10:35 <Qiming> great.
13:10:56 <Qiming> next is functional tests
13:11:15 <Qiming> I believe we had an issue here
13:11:17 <haiwei> just a little worried that someone may complain it is not suitable for cloud operators
13:11:22 <yanyanhu> just finished the cluster scaling test case
13:11:45 <Qiming> the test case passed?
13:12:04 <yanyanhu> nope, it was blocked by the problem in Action progress
13:12:11 <yanyanhu> without this issue, it passed
13:12:27 <Qiming> okay, we had two issues here actually
13:12:38 <Qiming> one is about the decorator for connection creation
13:12:46 <Qiming> it has been solved
13:12:52 <yanyanhu> yep
13:13:00 <yanyanhu> the second one is a big one :)
13:13:07 <Qiming> another one is related to context usage in action hierarchy
13:13:19 <Qiming> yanyanhu, I'll work with you on this tomorrow
13:13:30 <yanyanhu> thanks, that will be very helpful :)
13:13:35 <haiwei> the problem is?
13:13:42 <yanyanhu> took almost two days on this problem
13:14:00 <Qiming> it is about concurrent operations on sqlalchemy DB
13:14:04 <haiwei> saw your conversation this afternoon, not very clear about it
13:14:30 <Qiming> the data written from one session cannot be seen from another session immediately
13:15:01 <haiwei> oh
13:15:02 <Qiming> we are having some context/session management problems that just surfaced after some "bug fixes"
13:15:47 <Qiming> we may need to rethink whether our usage of oslo_context.get_current() is "green-thread-safe"
13:16:19 <jruano> oh wow. that does seem like a big problem to debug
13:16:20 <Qiming> I believe some of you have seen it this way or that
13:16:23 <yanyanhu> looks like there are still some issues about DB session we need to figure out
13:16:48 <yanyanhu> jruano, yes :)
13:17:09 <Qiming> most projects are not using oslo_context.get_current(), we may be the first to do async executions in engine as well
13:17:09 <haiwei> in what kind of use case can this problem be reproduced?
13:17:40 <Qiming> some locks are not released when the action completes
13:18:01 <haiwei> I think I met it before
13:18:02 <yanyanhu> hi, haiwei, I think some operations like cluster-create/delete/update
13:18:07 <Qiming> especially when the action involves both cluster-action and node-action
13:18:11 <yanyanhu> and also resize/scalein/scaleout
13:18:14 <haiwei> and there is a bug report for it
13:18:51 <Qiming> this is a critical issue, we need to solve it as early as possible before it is getting too complicated
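The symptom described above (data written from one session not being visible from another session immediately) can be reproduced in miniature. The following is only a sketch, assuming the cause is an uncommitted transaction; the log does not confirm the root cause in senlin. It uses the standard-library sqlite3 module rather than sqlalchemy, and the table name `action` and the function name are made up for illustration.

```python
import os
import sqlite3
import tempfile


def visible_rows_before_and_after_commit():
    """Return the row counts a second connection sees before and after commit."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.db")
        # Two independent connections stand in for two DB sessions.
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute(
                "CREATE TABLE action (id INTEGER PRIMARY KEY, status TEXT)")
            writer.commit()

            # The INSERT opens an implicit transaction on the writer;
            # nothing is committed yet.
            writer.execute("INSERT INTO action (status) VALUES ('READY')")

            # The reader cannot see the uncommitted row.
            before = len(reader.execute("SELECT id FROM action").fetchall())

            writer.commit()

            # After the commit, a fresh query on the reader sees the row.
            after = len(reader.execute("SELECT id FROM action").fetchall())
        finally:
            reader.close()
            writer.close()
        return before, after
```

With SQLite's default settings, the function should return (0, 1): the second connection sees the row only after the first one commits.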
13:20:10 <Qiming> next item along the list
13:20:30 <Qiming> senlinclient test cases, I have just started working on it
13:20:46 <Qiming> need more hands on it
13:21:07 <haiwei> I assigned one, but not moving on
13:21:20 <Qiming> actually, there is something I don't think we need to test
13:21:38 <Qiming> I mean the "models" module
13:21:45 <haiwei> ok
13:21:49 <jruano> i can get you an extra hand qi ming. colleague reached out to me other day wanting to see where in openstack he can help
13:21:54 <Qiming> they will eventually get contributed to sdk
13:22:00 <haiwei> shell.py and client.py are necessary i think
13:22:16 <Qiming> jruano, that would be great
13:22:40 <Qiming> it is a pretty labor-intensive job
13:23:07 <Qiming> today is the l-3 milestone ...
13:23:49 <Qiming> I'm thinking maybe we need to create a branch in the coming days and practice feature freeze
13:23:58 <yanyanhu> agree
13:24:30 <Qiming> we will create a release in this branch, and continue development on master
13:24:30 <jruano> yes, that will be the most efficient way to get a release
13:25:01 <haiwei> adding test should be allowed I think
13:25:08 <haiwei> not only bug fix
13:25:10 <Qiming> in the 'release' (0.2?) branch, we can delete all half-baked things
13:25:22 <yanyanhu> yes, 0.2 sounds good :)
13:25:31 <Qiming> yes, we just make sure it is a usable package
13:26:13 <yanyanhu> so we may need some manual tests on all important features
13:26:21 <Qiming> yes
13:26:44 <yanyanhu> after fixing the existing bugs, we can start the test
13:27:00 <yanyanhu> hopefully we can finish the test and debug in a week I think
13:27:01 <Qiming> there will be a branch for senlinclient as well
13:27:05 <yanyanhu> if we focus on this
13:27:20 <Qiming> the senlin-dashboard project needs a senlinclient package on Pypi
13:27:26 <haiwei> the next is dc-1?
13:27:37 <haiwei> rc-1
13:27:47 <Qiming> guess so
13:27:58 <Qiming> not quite familiar with the process
13:28:19 <Qiming> we learn by doing it, as always
13:28:33 <yanyanhu> yea
13:28:53 <Qiming> so ... haiwei, you just mentioned something about exception handling, not appropriate for cloud operators
13:29:12 <Qiming> can you elaborate on that? something we can improve/fix?
13:29:53 <haiwei> yes, one of my colleagues complained about it
13:30:19 <Qiming> specifics?
13:30:48 <haiwei> because we changed all the sdk exceptions to internal error, we can't get the original information from drivers
13:31:21 <yanyanhu> but I think the original msg from the driver is recorded in the log
13:31:37 <yanyanhu> Qiming is disconnected?
13:31:53 <Qiming> I missed the previous sentence ...
13:31:56 <yanyanhu> oh, just connection reset
13:32:01 <haiwei> and also we catch the exceptions and don't raise them again, so there is no error trace in the engine logs, and it is difficult for the operator to debug the exception
13:32:13 <yanyanhu> <haiwei> because we changed all the sdk exceptions to internal error, we can't get the original information from drivers
13:32:16 <haiwei> because we changed all the sdk exceptions to internal error, we can't get the original information from drivers
13:32:31 <Qiming> okay, that is something we can improve
13:32:53 <Qiming> we can still write logs
13:32:55 <yanyanhu> yes, agree that the exception dump stack is important for debug
13:33:00 <haiwei> it seems we can do a middleware to handle the exception
13:33:13 <Qiming> stack dump is annoying to users
13:33:33 <yanyanhu> haiwei, that is important :)
13:33:43 <haiwei> magnum seems to do it that way
13:33:59 <Qiming> exception handling is always a cross-cutting concern in software engineering
13:34:33 <Qiming> we have been trying to consolidate it into a more manageable framework
13:35:07 <Qiming> please feel free to improve the driver end exception handling
13:35:18 <haiwei> ok
13:35:31 <Qiming> we need to make sure the operators (at least) know what has been going wrong
13:35:41 <haiwei> yes
13:36:22 <Qiming> at the same time, we filter out messages that are not supposed to be seen by end users
13:36:53 <Qiming> there is always a gray area in-between
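The trade-off discussed above (operators need the original driver error and trace, end users should only see a filtered message) could be handled along these lines. This is a minimal sketch, not senlin's actual code: `DriverError`, `InternalError` and `call_driver` are hypothetical names, and the logger name is arbitrary.

```python
import logging

LOG = logging.getLogger("senlin_sketch")


class DriverError(Exception):
    """Hypothetical stand-in for an exception raised by the SDK/driver."""


class InternalError(Exception):
    """Hypothetical user-facing error carrying a sanitized message."""


def call_driver(func, *args, **kwargs):
    """Call a driver function, logging failures in full for operators."""
    try:
        return func(*args, **kwargs)
    except DriverError as exc:
        # LOG.exception records the message plus the full traceback,
        # so the engine log keeps what the operator needs for debugging.
        LOG.exception("Driver call %s failed",
                      getattr(func, "__name__", repr(func)))
        # The end user only sees a generic message; the original
        # exception stays attached as __cause__ for internal callers.
        raise InternalError("The request could not be completed.") from exc
```

The original exception is chained with `raise ... from`, so code inside the engine can still inspect it, while the message returned to the user carries no driver details.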
13:37:27 <Qiming> #topic revisions to profile/policy schema
13:38:10 <Qiming> during the past week (weekend actually), the biggest modification to the code is about profile and policy definitions
13:38:40 <Qiming> we were using 'senlin profile-create -t os.heat.stack -s specfile name' command to create profiles
13:38:49 <Qiming> and a similar command to create policies
13:39:04 <Qiming> actually, the '-t os.heat.stack' should be part of the specfile
13:39:39 <Qiming> so, we have changed the format of the profile spec and policy spec
13:39:48 <Qiming> now a profile looks like this:
13:39:55 <Qiming> type: os.heat.stack
13:39:58 <Qiming> version: 1.0
13:40:00 <Qiming> properties:
13:40:08 <Qiming>   template: blah blah
13:40:15 <Qiming>   parameters: blah blah
13:40:29 <Qiming>   <and everything else you need to create a heat stack>
13:40:42 <Qiming> a policy will look like this:
13:40:47 <Qiming> type: senlin.policy.deletion
13:40:49 <Qiming> version: 1.0
13:40:53 <Qiming> properties:
13:41:03 <Qiming>   destroy_after_deletion: True
13:41:16 <Qiming>   criteria: OLDEST_FIRST
13:41:30 <Qiming>   <and other properties we had in the deletion policy>
13:42:05 <Qiming> this was a disruptive change we had to make, and hopefully it is done once and for all
13:42:24 <Qiming> in future, we can just change the version number to accommodate new properties
13:42:56 <Qiming> this was also an effort to get senlin policy definition better aligned with TOSCA
13:43:15 <Qiming> all relevant changes have been merged
13:43:45 <Qiming> if you are using the master code, you will need to delete existing profiles/policies and create new ones
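The new spec layout described above has three top-level keys: type, version and properties. As a rough illustration, the deletion policy from the discussion is shown below as an already-parsed Python dict, with a small check of the top-level structure. `validate_spec` is a hypothetical helper, not part of senlin, and the real spec files are YAML rather than Python.

```python
REQUIRED_KEYS = ("type", "version", "properties")


def validate_spec(spec):
    """Check the three top-level keys of a parsed spec (hypothetical helper)."""
    missing = [key for key in REQUIRED_KEYS if key not in spec]
    if missing:
        raise ValueError("spec is missing keys: " + ", ".join(missing))
    if not isinstance(spec["properties"], dict):
        raise ValueError("'properties' must be a mapping")
    return spec["type"], spec["version"]


# The deletion policy from the meeting, as it would look once parsed.
deletion_policy_spec = {
    "type": "senlin.policy.deletion",
    "version": 1.0,
    "properties": {
        "destroy_after_deletion": True,
        "criteria": "OLDEST_FIRST",
    },
}
```

Calling `validate_spec(deletion_policy_spec)` returns the type and version pair, which is the information the old '-t' command-line option used to carry.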
13:44:37 <Qiming> there are still some open issues
13:45:09 <Qiming> for example, whether we use os.heat.stack as the 'type' or 'os.heat.stack-1.0' as the type name
13:45:16 <yanyanhu> Qiming, does that mean for specific type of profile/policy, we will support different versions in the same module?
13:45:32 <Qiming> good question
13:46:11 <Qiming> maybe we need to revise the setup.cfg file to spell out version numbers
13:46:52 <yanyanhu> yes
13:47:09 <Qiming> still thinking what is the best way to express version difference
13:48:01 <Qiming> when listing profile types or policy types, we need version numbers there too
13:49:08 <Qiming> #topic open discussions
13:49:47 <Qiming> anything?
13:50:04 <yanyanhu> nope from me
13:50:22 <jruano> when are we targeting code freeze?
13:50:26 <haiwei> i am ok
13:50:50 <Qiming> jruano, i was calling it a feature freeze
13:50:58 <jruano> ah, gotcha
13:51:09 <jruano> sounds good to me
13:51:12 <Qiming> the code won't be frozen, we will "backport" bug fixes when necessary
13:51:33 <Qiming> as for feature freeze, it's today
13:52:10 <Qiming> there can be FFE (feature freeze exceptions), though, :)
13:52:34 <Qiming> any new feature we want to add before doing a release
13:53:31 <Qiming> if there is nothing else, we can call an end to the meeting
13:53:34 <jruano> yeah
13:53:36 <jruano> sounds good
13:53:46 <Qiming> 3
13:53:54 <Qiming> 2
13:54:00 <Qiming> 1
13:54:06 <Qiming> 0.5
13:54:09 <Qiming> #endmeeting