#openstack-meeting log

13:03:12 <Qiming> #startmeeting senlin
13:03:12 <openstack> Meeting started Tue Mar 22 13:03:12 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:03:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:16 <openstack> The meeting name has been set to 'senlin'
13:03:24 <Qiming> hello
13:03:28 <lixinhui> hi
13:03:30 <yanyanhu_> o/
13:03:31 <elynn> Hi
13:03:33 <haiwei_> hi
13:03:35 <cschulz> Hi
13:03:42 <Qiming> sorry for being late, trapped in some microversion patches
13:03:51 <Qiming> pls review agenda and see if you have things to add
13:04:09 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda
13:04:37 <Qiming> first thing I'd like to bring up is about summit prep
13:05:48 <Qiming> for the deep dive session, we'll need a high-level structure so every presenter can start preparing his/her own section
13:06:16 <Qiming> elynn brought up this last week?
13:06:34 <elynn> this week :)
13:06:42 <Qiming> okay, :)
13:06:42 <chenying__> Hi
13:06:54 <Qiming> chenying__, welcome
13:07:26 <Qiming> so my suggestion is like this: elynn to start with why we started senlin
13:07:52 <Qiming> based on our understanding of the requirement, the discussions with Heat team etc
13:08:31 <Qiming> then I follow with an intro to the senlin architecture and some design decisions
13:09:02 <Qiming> yanyan can close the session with some status update, roadmap and pointers, examples etc
13:09:14 <yanyanhu_> ok
13:09:30 <elynn> great!
13:09:44 <Qiming> okay, next, the autoscaling one
13:09:48 <yanyanhu_> will start the paper work in coming week :)
13:10:14 <Qiming> mark will start talking about auto-scaling requirements
13:10:38 <Qiming> and I can follow with a generic intro of senlin
13:10:59 <Qiming> then xinhui can close this with a real deployment example and lessons learnt
13:11:16 <lixinhui> okay
13:11:24 <Qiming> lixinhui, is the other presenter 'mark'?
13:11:37 <Qiming> just some rough ideas
13:11:51 <lixinhui> I think it would be good
13:11:54 <Qiming> any comments are welcome
13:12:05 <cschulz> We need to start seeding the autoscaling discussions with topics beyond the basic Reactive autoscaling
13:12:08 <lixinhui> will involve you in the discussion with Mark
13:12:21 <cschulz> Predictive autoscaling
13:12:37 <Qiming> okay, great, cschulz, will incorporate those
13:12:57 <cschulz> Also the requirement to bring things like Business Process Management into the scaling decisions.
13:13:00 <Qiming> and in some cases, semi-auto-scaling
13:13:08 <cschulz> Yes
13:13:23 <cschulz> that is one way to look at it.
13:13:59 <cschulz> The scaling driven by events from outside the Cloud
13:14:01 <Qiming> cschulz, we can dive into that requirement later, when the newton development cycle is open
13:14:41 <Qiming> yes, that is why we decided to provide 'receiver' instead of 'trigger' abstraction
13:15:18 <Qiming> in real deployments, people have various monitoring/alerting solutions to hook into the scaling workflow
13:15:19 <cschulz> Has the receiver security issue been solved?
13:15:39 <cschulz> DOS ...
13:15:48 <Qiming> cschulz, not yet
13:16:18 <Qiming> we will need to explore other flavors of receivers, beyond the basic webhook one
13:17:11 <Qiming> one promising alternative is to bring zaqar into the picture, so the triggering end doesn't have to know where the receiving end resides
13:17:39 <Qiming> another related topic is to add a rate-limit middleware in front of senlin-api
13:17:45 <cschulz> I will look into zaqar.  That is new to me.
13:18:08 <Qiming> it is basically the openstack version of AWS SQS
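A minimal sketch of the rate-limit middleware idea raised above, assuming a plain WSGI stack and a single shared token bucket. The names `TokenBucket` and `RateLimitMiddleware` are hypothetical and are not part of senlin; a real deployment would probably key buckets per client or per receiver.

```python
import time


class TokenBucket:
    """Token bucket: refills `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        # Refill according to elapsed time, then try to spend one token.
        now = self.clock()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimitMiddleware:
    """WSGI middleware that answers 429 when the bucket is empty."""

    def __init__(self, app, bucket):
        self.app = app
        self.bucket = bucket

    def __call__(self, environ, start_response):
        if self.bucket.allow():
            return self.app(environ, start_response)
        start_response("429 Too Many Requests",
                       [("Content-Type", "text/plain")])
        return [b"rate limit exceeded\n"]
```

Rejecting requests before they reach the API handlers keeps a flood of webhook triggers from turning into a flood of actions, which is the DoS concern mentioned earlier.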
13:18:43 <Qiming> last one is about containers, I saw haiwei has submitted a new patchset on that spec
13:18:53 <Qiming> I haven't got time to review it
13:19:27 <haiwei_> almost the same as what we discussed last week
13:19:32 <Qiming> still evaluating whether senlin + ansible is a good solution for users to create/manage containers
13:20:16 <haiwei_> have no idea how to introduce ansible to container cluster
13:20:16 <Qiming> I've been teaching myself some ansible basics, so far, it is pretty useful
13:20:35 <Qiming> haiwei_, it would be something like this:
13:21:00 <Qiming> senlin cluster-create => a vm cluster, with list of IPs
13:21:30 <Qiming> ansible -i <cluster ip list> -m <ansible_module>
13:22:20 <Qiming> you can ssh into those VMs in batch and perform operations to do container environment preparation and even operate them
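A minimal sketch of the two steps above, assuming the node IPs of the VM cluster have already been collected. It only builds the ad-hoc `ansible` command line and does not run it; the helper name `build_ansible_command` and the sample IPs are hypothetical. Note that the real CLI also expects a host pattern (here `all`), which the shorthand in the chat leaves out.

```python
import shlex


def build_ansible_command(node_ips, module, module_args=None, remote_user=None):
    """Build an ad-hoc ansible command targeting every node in a cluster."""
    if not node_ips:
        raise ValueError("cluster has no node IPs")
    # A trailing comma makes ansible treat the value as an inline host
    # list instead of a path to an inventory file.
    inventory = ",".join(node_ips) + ","
    cmd = ["ansible", "all", "-i", inventory, "-m", module]
    if module_args:
        cmd += ["-a", module_args]
    if remote_user:
        cmd += ["-u", remote_user]
    return cmd


# Hypothetical IPs, standing in for the output of a cluster-create call.
ips = ["10.0.0.11", "10.0.0.12"]
cmd = build_ansible_command(ips, "ping")
print(shlex.join(cmd))
```

The returned list can be handed to `subprocess.run` by an operator script, or the same inputs could feed a wrapper action inside the service.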
13:22:40 <haiwei_> the ansible command should be typed by hand by users?
13:22:50 <Qiming> good question
13:23:03 <Qiming> for operators, it is not a difficult task
13:23:33 <Qiming> for end users, we can invoke ansible by calling its API from inside senlin, and do whatever jobs are necessary on the VMs created
13:24:47 <haiwei_> that means we still need to create a scheduler and something else?
13:24:50 <yanyanhu_> maybe we should allow this ansible deployment to be triggered by some events
13:24:56 <Qiming> I have tried writing some playbooks (ansible term) for this kind of a task, it is pretty neat
13:25:21 <yanyanhu_> just like other actions
13:25:22 <Qiming> maybe a custom action
13:25:25 <yanyanhu_> yes
13:25:36 <Qiming> just a wrapper of ansible
13:26:05 <Qiming> previously, I tried paramiko, the underlying module used by ansible
13:26:10 <Qiming> it works as well
13:26:32 <Qiming> but 'it works' <> 'it is an elegant design'
13:26:45 <yanyanhu_> do we need to provide a high level abstraction for deployment and support ansible as one of the deployment types?
13:26:55 <Qiming> haiwei_, we can have a phone call tomorrow?
13:27:10 <Qiming> yanyanhu_, that is a good idea
13:27:13 <haiwei_> I am ok tomorrow afternoon
13:27:14 <yanyanhu_> maybe, as the default type
13:27:32 <Qiming> the high level abstraction may be just a container image
13:28:17 <Qiming> by default it would be a docker image, for example
13:28:29 <yanyanhu_> exactly
13:29:06 <Qiming> okay, this is something we need to dive into this week
13:29:19 <Qiming> #topic newton work items
13:29:32 <haiwei_> what about 14:00~15:00 beijing time
13:29:39 <haiwei_> for the phone meeting?
13:29:42 <Qiming> just moved etherpad items here: https://etherpad.openstack.org/p/senlin-newton-workitems
13:30:10 <Qiming> haiwei_, fine with me
13:30:42 <haiwei_> yanyanhu_ ?
13:30:54 <yanyanhu_> I'm fine with it
13:30:59 <haiwei_> anyone interested in this can join
13:31:14 <lixinhui> I will dial in
13:31:28 <elynn> o/
13:31:38 <Qiming> okay, I'll send an invitation to you all
13:32:05 <Qiming> last Friday, we cut the RC1 branch
13:32:19 <Qiming> now we have a stable/mitaka branch on git
13:32:31 <lixinhui> cool!
13:32:38 <Qiming> that branch is meant to be the base for the coming stable release
13:32:38 <zzxwill> I saw the release tag in senlin branch.
13:33:03 <Qiming> if you have found any critical bugs, please bring them up asap
13:33:54 <Qiming> back to newton work items
13:34:28 <Qiming> tempest tests are coming in
13:34:40 <yanyanhu_> saw elynn's patch
13:34:50 <elynn> I created a BP for it.
13:35:07 <elynn> And started to look into it
13:35:17 <Qiming> maybe we should retarget the test cases about failure scenarios and stress tests on it?
13:35:29 <yanyanhu_> yes, I think so
13:35:52 <Qiming> thanks, elynn, that would be a very important piece of QA infra
13:36:10 <yanyanhu_> those tests should be included in tempest test framework
13:36:24 <elynn> oh...seems I need to speed up to not let you blame me ;)
13:36:57 <Qiming> yanyanhu_ will blame you
13:37:05 <yanyanhu_> sorry. really trapped by some other work in recent weeks
13:37:12 <yanyanhu_> yes...
13:37:32 <Qiming> just kidding
13:37:37 <elynn> After I rebuild my env, I will start to migrate existing tests to the tempest framework.
13:37:40 <yanyanhu_> will investigate rally ASAP to decide how we should support stress test
13:37:54 <Qiming> we need such an infra to enable more thorough tests
13:38:08 <Qiming> cool, yanyanhu_
13:38:15 <yanyanhu_> Qiming, no problem, really my fault :(
13:38:34 <Qiming> oh .... no one's fault
13:38:49 <Qiming> we are all heroes, ;)
13:39:00 <Qiming> health management side
13:39:48 <Qiming> where are we on lb based health detection?
13:40:01 <Qiming> lixinhui, asleep?
13:40:04 <lixinhui> I just
13:40:18 <lixinhui> added lb-node-polling
13:40:25 <lixinhui> into the prototype
13:40:48 <lixinhui> underlying, it will rely on neutron
13:40:50 <yanyanhu_> lixinhui, so you're using haproxy as the service provider now?
13:40:55 <lixinhui> ayes
13:40:56 <lixinhui> yes
13:41:06 <yanyanhu_> ok, at least it works
13:41:26 <lixinhui> I am thinking
13:41:36 <lixinhui> about the demo
13:41:44 <lixinhui> actually the prototype
13:42:10 <lixinhui> maybe we could attach the health_polling policy onto lb cluster
13:42:21 <lixinhui> then simulate member failure
13:42:42 <lixinhui> then watch the auto-healing of the failed member
13:42:43 <Qiming> yes?
13:43:06 <lixinhui> by listing the pool members and the senlin cluster node status before and after auto-healing
13:43:09 <Qiming> I believe there are many details to be figured out?
13:43:22 <lixinhui> yes
13:43:35 <Qiming> right, a cluster with scaling policy, health policy and lb policy
13:44:01 <Qiming> let's see how they dance/fight together
13:44:17 <lixinhui> yes
13:44:21 <lixinhui> Xujun
13:44:22 <yanyanhu_> you mean the auto healing will be triggered by simulated failure which is actually not a failure detected by lb monitoring
13:44:38 <lixinhui> one intern is trying to write the heat template
13:44:49 <lixinhui> including everything into a template
13:45:05 <lixinhui> yanyanhu_
13:45:34 <lixinhui> I added lb member status polling logic
13:45:37 <lixinhui> for this demo
13:45:55 <lixinhui> supposedly the status of the member should be changed by the health monitor
13:46:11 <yanyanhu_> ah, I see
13:46:12 <Qiming> lixinhui, please help bring xujun into the IRC channel, so when he has questions, he knows where to seek help
13:46:24 <lixinhui> sure
13:46:45 <lixinhui> one problem in my mind
13:46:50 <Qiming> we don't have octavia in the picture now?
13:46:56 <lixinhui> no
13:46:58 <lixinhui> now
13:47:03 <Qiming> great
13:47:19 <lixinhui> I am deploying nsx based lbaas
13:47:34 <lixinhui> to see if there are any problems there and ask mark to help if so
13:47:45 <Qiming> cool
13:48:16 <Qiming> for the quick demo, we can focus on haproxy, but it would be great if someone helps on validating it using nsx
13:48:31 <lixinhui> yes
13:48:49 <haiwei_> what is nsx?
13:48:53 <lixinhui> hope it would add cents to senlin marketing
13:49:06 <lixinhui> NSX is VMware's neutron engine
13:49:21 <yanyanhu_> hardware loadbalancer from citrix?
13:49:25 <lixinhui> almost all customers bought its license
13:49:27 <yanyanhu_> oh
13:49:44 <cschulz> HW loadbalancer?
13:49:52 <lixinhui> SDDN
13:49:55 <yanyanhu_> sorry, I was wrong :)
13:50:05 <lixinhui> pure
13:50:07 <lixinhui> software
13:50:17 <cschulz> Makes sense
13:50:29 <lixinhui> one problem in my mind
13:50:37 <lixinhui> but maybe not important for this demo
13:50:56 <lixinhui> is how to differentiate a normal node stop from an abnormal down
13:51:02 <Qiming> my guess is it is a question, not a problem, :P
13:51:10 <lixinhui> yes :)
13:51:13 <cschulz> Maintenance mode?
13:51:19 <Qiming> exactly
13:51:23 <lixinhui> yes
13:51:39 <Qiming> the HA policy has to be put on hold in some scenarios
13:51:58 <cschulz> Or at least for a subset of the cluster
13:51:59 <yanyanhu_> disable the HA policy temporarily?
13:52:00 <Qiming> that is why we have cluster-policy binding and there is an 'enabled' field there
13:52:31 <Qiming> if it is always effective, you won't be able to do a thing to your cluster, :)
13:52:43 <lixinhui> ok
13:52:53 <lixinhui> when to disable it
13:53:22 <cschulz> So should we add a 'semi-auto scaling' topic to Newton?
13:53:35 <lixinhui> :)
13:53:37 <Qiming> sure, cschulz
13:53:50 <cschulz> OK will add some stuff
13:54:11 <Qiming> lixinhui, sounds to me more like an interaction between policy and action again
13:54:28 <lixinhui> yes, Qiming
13:54:43 <lixinhui> this will be more complex
13:54:55 <Qiming> before some actions are performed, we will have to disable ha, after that action is done, we re-enable it
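A minimal sketch of the disable/act/re-enable pattern described above, assuming the cluster-policy binding can be modeled as an object with an `enabled` field. `PolicyBinding` and `policy_suspended` are hypothetical in-memory stand-ins, not senlin APIs.

```python
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class PolicyBinding:
    """Hypothetical stand-in for a cluster-policy binding record."""
    cluster_id: str
    policy_id: str
    enabled: bool = True


@contextmanager
def policy_suspended(binding):
    """Disable a policy binding for the duration of an action."""
    previous = binding.enabled
    binding.enabled = False
    try:
        yield binding
    finally:
        # Restore the earlier state even if the action raised, so a
        # failed maintenance step does not leave HA switched off.
        binding.enabled = previous


binding = PolicyBinding("web-cluster", "health-policy")
with policy_suspended(binding):
    # Maintenance work (for example, stopping a node on purpose) goes here.
    pass
print(binding.enabled)
```

Restoring the previous value rather than forcing `True` means a binding that was already disabled by the user stays disabled afterwards.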
13:55:52 <lixinhui> will think more about it
13:55:53 <yanyanhu_> yes, maybe we should throw some warning to users when they perform 'maintenance operations' with HA policy enabled
13:56:18 <lixinhui> :)
13:56:24 <Qiming> okay, that's something that deserves a whole hour for discussion
13:56:31 <Qiming> let's continue this on senlin tomorrow
13:56:38 <lixinhui> okay
13:56:45 <Qiming> last update from me
13:56:57 <Qiming> policy docs are almost all out there
13:57:19 <Qiming> just started working on micro-versioning support for senlin-api
13:57:37 <Qiming> before the versioning thing has landed, we won't accept any changes to senlin-api
13:57:56 <Qiming> #topic open discussion
13:58:09 <Qiming> 2 minutes
13:58:19 <yanyanhu_> no more topics from me
13:58:31 <lixinhui> none from me
13:58:38 <cschulz> We've run into Action suspend not working.  Has that ever been tested?
13:58:42 <haiwei_> ok for me
13:58:59 <Qiming> cschulz, that .... is not thoroughly tested
13:59:02 <yanyanhu_> cschulz, it is not supported I think
13:59:22 <yanyanhu_> really haven't tested it yet...
13:59:30 <Qiming> there is some db level support
13:59:31 <cschulz> OK,  That's OK since we will be following a different approach
13:59:37 <Qiming> but the whole thing is half-baked
13:59:55 <Qiming> thanks for joining, guys, have a good night/nice day
14:00:00 <lixinhui> u2
14:00:01 <Qiming> #endmeeting