13:01:05 <yanyanhu> #startmeeting senlin
13:01:06 <openstack> Meeting started Tue Jun 28 13:01:05 2016 UTC and is due to finish in 60 minutes.  The chair is yanyanhu. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:07 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:09 <openstack> The meeting name has been set to 'senlin'
13:01:20 <yanyanhu> hi
13:01:28 <elynn> o/
13:01:57 <yanyanhu> hi, elynn
13:02:01 <yanyanhu> hi, lixinhui_
13:02:05 <yanyanhu> long time no see :P
13:02:33 <yanyanhu> ok, here is the agenda, plz feel free to add any item you want to discuss
13:02:38 <yanyanhu> https://wiki.openstack.org/wiki/Meetings/SenlinAgenda#Weekly_Senlin_.28Clustering.29_meeting
13:03:06 <yanyanhu> #topic newton workitem
13:03:17 <yanyanhu> https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:29 <yanyanhu> here is the etherpad track our newton workitem
13:03:39 <yanyanhu> first item, testing
13:03:47 <yanyanhu> tempest API test has been done
13:03:56 <yanyanhu> and tempest functional test is in good progress
13:04:03 <elynn> Are the release notes in?
13:04:14 <yanyanhu> I think all existing functional test cases have been migrated(some in progress)
13:04:39 <yanyanhu> elynn, not yet, several patches are still under review :)
13:04:53 <yanyanhu> but I think we can finish it in this week
13:05:15 <yanyanhu> the new gate job for tempest functional test is also available now
13:05:35 <yanyanhu> although it is experimental and doesn't vote
13:05:58 <yanyanhu> after the functional test migration is done, I will propose a patch to remove the old functional test gate job
13:06:06 <yanyanhu> it will be replaced by the new one :)
13:06:27 <yanyanhu> then we will have two jobs enabled in both check and gate pipeline and they will vote
13:06:45 <yanyanhu> ok, this is the tempest test part
13:06:58 <yanyanhu> about rally plugin, didn't get time to work on it this week
13:07:13 <yanyanhu> but our patch to add cluster/profile plugin for rally got first +2
13:07:19 <yanyanhu> need another +2 and workflow
13:07:28 <yanyanhu> basically, it looks good now
13:07:40 <yanyanhu> ok
13:07:55 <yanyanhu> hi, lixinhui_ , around?
13:08:07 <yanyanhu> any update on HA related work :)
13:08:28 <yanyanhu> I think Qiming didn't get time to work on it in last week
13:08:35 <lixinhui_> yes
13:08:48 <lixinhui_> I finished the fencing tests
13:08:59 <yanyanhu> great!
13:09:01 <lixinhui_> need to discuss how to bring into
13:09:03 <lixinhui_> Senlin
13:09:06 <yanyanhu> that is important for our HA solution
13:09:08 <yanyanhu> ok
13:09:26 <lixinhui_> And I tried the resilient
13:09:34 <yanyanhu> I think we can make further discussion in irc channel to decide how to merge into current HA framework
13:09:42 <lixinhui_> elastic cluster template with ceilometer/aodh/gnocchi
13:09:52 <lixinhui_> yes, yanyanhu
13:10:17 <yanyanhu> so monitoring has been included?
13:10:31 <yanyanhu> for failure detection?
13:11:47 <yanyanhu> actually I'm thinking how to build the basic workflow of our HA solution, including leverage other monitor service to detect failure in different layers
13:11:48 <lixinhui_> what the failure detection?
13:12:02 <yanyanhu> like host/VM crash or app failure
13:12:04 <lixinhui_> the failure detection is based on node status
13:12:18 <lixinhui_> yes
13:12:20 <lixinhui_> yanyanhu
13:12:23 <yanyanhu> yes, from nova notification, e.g.
13:12:26 <yanyanhu> for VM failure
13:12:43 <yanyanhu> I think Qiming was working on this?
13:12:44 <lixinhui_> one problem I met is about heat
13:12:49 <yanyanhu> yes?
13:13:06 <lixinhui_> where do you know to set the timeout or retry time
13:13:18 <yanyanhu> retry for?
13:13:20 <lixinhui_> creation of loadbalancer is very slow
13:13:31 <lixinhui_> then heat stack-create will keep retrying
13:13:32 <yanyanhu> yes, it is sometimes
13:13:48 <yanyanhu> I'm not sure there is such an option to customize
13:13:54 <lixinhui_> then there will be several loadbalancers under creation
13:13:57 <yanyanhu> hi, elynn, do you have any idea?
13:14:35 <lixinhui_> I think there should be somewhere can be customized
13:14:49 <lixinhui_> and I know Qiming is adding listener
13:14:56 <lixinhui_> for detection functions
13:15:09 <yanyanhu> I guess that is a fixed value defined in heat engine?
13:15:09 <yanyanhu> so you put all those resources in a heat template?
13:15:31 <lixinhui_> yes
13:15:35 <elynn> retry time for creating resource?
13:15:40 <lixinhui_> yes
13:15:43 <lixinhui_> elynn
13:16:03 <yanyanhu> lixinhui_, maybe you can try to split lb resource from other ones if the stack creation always failed for the timeout of lb creation
13:16:13 <lixinhui_> to avoid duplicated creation of loadbalancer
13:16:44 <lixinhui_> yanyanhu, I am using the same template as we presented in Austin
13:16:58 <lixinhui_> which is committed into as a tutorial
13:17:03 <yanyanhu> lixinhui_, that should happened I think if you mean duplicated creation of lb for timeout retry
13:17:03 <elynn> I can't remember any property or configurations about that.
13:17:20 <yanyanhu> lixinhui_, I see
13:17:30 <lixinhui_> that is okay elynn
13:17:38 <lixinhui_> octavia is so slow
13:17:38 <yanyanhu> sorry, that shouldn't happen
13:17:49 <yanyanhu> so haproxy works well
13:17:51 <elynn> I can check later
13:18:01 <yanyanhu> but octavia doesn't?
13:18:05 <lixinhui_> that will be very nice, elynn
13:18:12 <elynn> That would be a problem if it happens on presentation...
13:18:18 <lixinhui_> yes, yanyanhu
13:18:30 <yanyanhu> I see
13:18:39 <lixinhui_> elynn, nsx and haproxy work well
13:18:43 <yanyanhu> elynn, hope we can find some solution for it
13:19:00 <yanyanhu> or in worst case, use some workaround
13:19:00 <lixinhui_> why so many people use octavia
13:19:12 <yanyanhu> lixinhui_, we don't :)
13:19:31 <lixinhui_> yanyanhu, what kinds backend for IBM
13:19:32 <lixinhui_> ?
13:19:35 <elynn> We can at least change the hardcode retry times in our env ;)
13:20:10 <yanyanhu> I'm not sure. But I think octavia is better than haproxy hosted in network controller
13:20:24 <elynn> Saw an option client_retry_limit
13:20:32 <elynn> we can try to increase it in heat.conf
13:20:43 <yanyanhu> not only reduce the risk of single failure, but also much better scalability I think
13:20:50 <lixinhui_> oh, elynn?
13:21:11 <yanyanhu> sorry, single point failure
13:21:11 <lixinhui_> okay yanyanhu
13:21:18 <lixinhui_> I see
13:21:36 <yanyanhu> BTW, I have drafted a topic proposal about HA for Barcelona summit
13:21:43 <yanyanhu> https://www.openstack.org/summit/barcelona-2016/call-for-presentations/manage/15037/summary
13:21:56 <yanyanhu> hope we can finish our HA design and make a presentation for it
13:22:07 <yanyanhu> I added you two and Qiming's name as speaker
13:22:17 <elynn> And also we can specify a timeout parameter when creating new stack.
13:22:31 <yanyanhu> I think we can make further discussion to see how can we build this demo
13:22:40 <lixinhui_> okay, yanyanhu. will read it and raise discussion
13:22:58 <lixinhui_> sounds like a better solution, elynn
13:23:02 <yanyanhu> lixinhui_, thanks :) I think HA is an important feature we need to finish in this cycle
13:23:56 <yanyanhu> although maybe the basic one, we hope it is a complete loop, from failure detection to recovery
13:24:00 <lixinhui_> is that a cli parameter? elynn
13:24:23 <elynn> lixinhui_: yes
13:24:38 <lixinhui_> okay, I will try it
13:24:47 <lixinhui_> yes, yanyanhu
13:24:53 <yanyanhu> lixinhui_, timeout works well I think
13:25:01 <yanyanhu> for heat stack creation
13:25:16 <lixinhui_> :)
13:25:37 <yanyanhu> you can have a try. I tried to increase it from 1 hour to 10 hours when deployed a very large and complicated stack in softlayer
13:25:39 <yanyanhu> :)
13:26:09 <elynn> That is so long.
13:26:11 <lixinhui_> happy to know
13:26:34 <yanyanhu> ok, I think we can collect our idea about HA topic using  this existing etherpad: https://etherpad.openstack.org/p/senlin-ha-recover
13:26:53 <yanyanhu> elynn, yes, since the deployment of some services inside VM is very slow :)
13:26:57 <yanyanhu> like DB2
13:27:14 <yanyanhu> and we used software deployment for it
13:27:21 <elynn> that's true...
13:27:33 <yanyanhu> anyway, it works good
13:27:54 <yanyanhu> ok, so this is about HA?
13:27:59 <lixinhui_> yes
13:28:01 <yanyanhu> we can have more discussion offline
13:28:06 <yanyanhu> thanks, xinhui
13:28:06 <lixinhui_> sure
13:28:22 <lixinhui_> pleasure
13:28:22 <yanyanhu> next one
13:28:27 <yanyanhu> :)
13:28:31 <yanyanhu> lets skip document
13:28:43 <yanyanhu> I guess haiwei is not here?
13:29:00 <yanyanhu> I noticed he proposed patch to add docker driver
13:29:16 <yanyanhu> although just for most basic operations like creating, deleting
13:29:22 <yanyanhu> it's a startpoint I think
13:29:45 <yanyanhu> will talk with him to add more driver interfaces
13:30:10 <yanyanhu> umm, for other left items, we have no progress I think
13:30:18 <yanyanhu> so topic 2
13:30:27 <yanyanhu> #topic proposal for summit
13:30:47 <yanyanhu> hi, elynn, lixinhui_ , any idea :)
13:30:53 <yanyanhu> besides the HA one
13:31:14 <elynn> Not sure what else we can propose
13:31:25 <lixinhui_> cluster.do
13:31:42 <lixinhui_> I think that is worthy a talk
13:31:47 <yanyanhu> you mean deployment? lixinhui_
13:31:52 <yanyanhu> of app
13:31:57 <lixinhui_> yes
13:32:14 <lixinhui_> and any cluster management
13:32:15 <yanyanhu> I see. It's very useful and powerful I believe
13:32:33 <yanyanhu> just I feel we may need a use case as reference
13:32:55 <lixinhui_> integration with some given agent
13:32:59 <yanyanhu> otherwise, pure technical discussion could be difficult to understand for audience who are not familiar with Senlin code
13:33:23 <lixinhui_> yes
13:33:24 <yanyanhu> lixinhui_, for agent, you mean?
13:33:37 <elynn> agree
13:33:39 <lixinhui_> agent of some given monitor
13:34:05 <yanyanhu> so this for HA solution?
13:34:15 <yanyanhu> as monitoring part?
13:34:20 <lixinhui_> I think it should be a separate topic
13:34:26 <yanyanhu> ok
13:34:27 <elynn> We can think of some use cases, like scaling from a standby pool, green-blue deployment
13:34:39 <lixinhui_> cool, ekynn
13:34:41 <yanyanhu> elynn, that's interesting as well
13:34:44 <lixinhui_> elynn
13:34:54 <yanyanhu> ok, I will create an etherpad to collect these ideas
13:34:55 <elynn> How to use senlin for better management.
13:35:07 <yanyanhu> and then we can discuss and refine them to see what we can propose
13:35:47 <elynn> yes
13:35:48 <yanyanhu> #action yanyanhu to create an etherpad to collect proposal ideas
13:35:48 <yanyanhu> oh, BTW, I also talked with eldon to see whether we can propose one to share their experience on managing cluster in large scale
13:35:48 <yanyanhu> eldon from cmcc
13:36:16 <lixinhui_> cool yanyanhu
13:36:18 <yanyanhu> they have tried to use senlin to manage a cluster consisting of a thousand VMs?
13:36:19 <yanyanhu> I think
13:36:37 <lixinhui_> you know I always wanna to learn more details
13:36:39 <yanyanhu> so that can be a very good demonstration and sharing
13:36:50 <yanyanhu> they also met some problems I believe
13:36:58 <yanyanhu> lixinhui_, sure, me too :)
13:37:07 <lixinhui_> :)
13:37:14 <yanyanhu> so lesson and learn
13:37:18 <yanyanhu> ok
13:37:28 <yanyanhu> will add it as well
13:37:47 <yanyanhu> I will create the etherpad and post the link into irc channel
13:38:10 <yanyanhu> the deadline is 13th ?
13:38:38 <elynn> yes
13:38:39 <yanyanhu> so we need to propose before that date :)
13:38:51 <elynn> about 2 weeks
13:38:55 <yanyanhu> yea
13:39:08 <yanyanhu> lots of thinking needed :P
13:39:26 <yanyanhu> ok, that's for proposal for summit
13:39:43 <yanyanhu> #topic open discussion
13:39:57 <yanyanhu> any other items you guys want to discuss?
13:40:43 <elynn> no from me
13:40:55 <yanyanhu> if not, we can finish 20 minutes earlier :)
13:41:11 <lixinhui_> Qiming should be a good dad
13:41:28 <elynn> Congratulations to him :)
13:41:36 <yanyanhu> lixinhui_, I believe so :)
13:41:39 <yanyanhu> yea
13:41:53 <lixinhui_> :)
13:42:11 <yanyanhu> so maybe a month later, we can go to his home to see his baby:)
13:42:18 <lixinhui_> haha
13:42:22 <yanyanhu> heihei
13:42:29 <yanyanhu> lets go together
13:42:37 <lixinhui_> he will be managed by two women
13:42:41 <elynn> haha
13:42:42 <lixinhui_> since then
13:42:43 <yanyanhu> haha
13:42:49 <yanyanhu> lixinhui_, LOL
13:42:55 <yanyanhu> yep
13:43:03 <yanyanhu> ok, thank you so much for joining
13:43:12 <yanyanhu> I think we can finish the meeting now
13:43:14 <lixinhui_> see you next time
13:43:20 <yanyanhu> lets make further discussion later
13:43:20 <elynn> cu
13:43:22 <yanyanhu> see U
13:43:27 <yanyanhu> #endmeeting