13:03:12 <Qiming> #startmeeting senlin
13:03:12 <openstack> Meeting started Tue Mar 22 13:03:12 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:03:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:03:16 <openstack> The meeting name has been set to 'senlin'
13:03:24 <Qiming> hello
13:03:28 <lixinhui> hi
13:03:30 <yanyanhu_> o/
13:03:31 <elynn> Hi
13:03:33 <haiwei_> hi
13:03:35 <cschulz> Hi
13:03:42 <Qiming> sorry for being late, trapped in some microversion patches
13:03:51 <Qiming> pls review agenda and see if you have things to add
13:04:09 <Qiming> #link https://wiki.openstack.org/wiki/Meetings/SenlinAgenda
13:04:37 <Qiming> first thing I'd like to bring up is about summit prep
13:05:48 <Qiming> for the deep dive session, we'll need a high-level structure so every presenter can start preparing his/her own section
13:06:16 <Qiming> elynn brought this up last week?
13:06:34 <elynn> this week :)
13:06:42 <Qiming> okay, :)
13:06:42 <chenying__> Hi
13:06:54 <Qiming> chenying__, wlc
13:07:26 <Qiming> so my suggestion is like this: elynn to start with why we started senlin
13:07:52 <Qiming> based on our understanding of the requirement, the discussions with Heat team etc
13:08:31 <Qiming> then I follow with an intro to the senlin architecture and some design decisions
13:09:02 <Qiming> yanyan can close the session with some status update, roadmap and pointers, examples etc
13:09:14 <yanyanhu_> ok
13:09:30 <elynn> great!
13:09:44 <Qiming> okay, next, the autoscaling one
13:09:48 <yanyanhu_> will start the paper work in coming week :)
13:10:14 <Qiming> mark will start talking about auto-scaling requirements
13:10:38 <Qiming> and I can follow with a generic intro of senlin
13:10:59 <Qiming> then xinhui can close this with a real deployment example and lessons learnt
13:11:16 <lixinhui> okay
13:11:24 <Qiming> lixinhui, is the other presenter 'mark'?
13:11:37 <Qiming> just some rough ideas
13:11:51 <lixinhui> I think it would be good
13:11:54 <Qiming> any comments are welcome
13:12:05 <cschulz> We need to start seeding the autoscaling discussions with topics beyond the basic Reactive autoscaling
13:12:08 <lixinhui> will involve you to discuss with Mark
13:12:21 <cschulz> Predictive autoscaling
13:12:37 <Qiming> okay, great, cschulz, will incorporate those
13:12:57 <cschulz> Also the requirement to bring things like Business Process Management into the scaling decisions.
13:13:00 <Qiming> and in some cases, semi-auto-scaling
13:13:08 <cschulz> Yes
13:13:23 <cschulz> that is one way to look at it.
13:13:59 <cschulz> The scaling driven by events from outside the Cloud
13:14:01 <Qiming> cschulz, we can dive into that requirement later, when the newton development cycle is open
13:14:41 <Qiming> yes, that is why we decided to provide 'receiver' instead of 'trigger' abstraction
13:15:18 <Qiming> in real deployments, people have various monitoring/alerting solutions to hook into the scaling workflow
13:15:19 <cschulz> Has the receiver security issue been solved?
13:15:39 <cschulz> DOS ...
13:15:48 <Qiming> cschulz, not yet
13:16:18 <Qiming> we will need to explore other flavors of receivers, beyond the basic webhook one
13:17:11 <Qiming> one promising alternative is to bring zaqar into the picture, so the triggering end doesn't have to know where the receiving end resides
13:17:39 <Qiming> another related topic is to add a rate-limit middleware in front of senlin-api
13:17:45 <cschulz> I will look into zaqar. That is new to me.
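Note on the rate-limit idea raised at 13:17:39: the log does not say how such a middleware would work. One common approach is a token bucket, sketched below purely as an illustration. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this sketch, not part of senlin-api.

```python
import time


class TokenBucket:
    """Hypothetical limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)          # tokens added per second
        self.capacity = float(capacity)  # maximum tokens the bucket can hold
        self.clock = clock               # injectable so the logic can be tested
        self.tokens = float(capacity)    # start with a full bucket
        self.updated = clock()

    def allow(self):
        """Return True if a request may pass, consuming one token."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill according to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A middleware in front of a webhook receiver could keep one bucket per caller and reject a request whenever `allow()` returns False; how callers would be identified is left open here, as it was in the meeting.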
13:18:08 <Qiming> it is basically the openstack version of AWS SQS
13:18:43 <Qiming> last one is about containers, I saw haiwei has submitted a new patchset on that spec
13:18:53 <Qiming> I haven't had time to review it
13:19:27 <haiwei_> almost the same as what we discussed last week
13:19:32 <Qiming> still evaluating whether senlin + ansible is a good solution for users to create/manage containers
13:20:16 <haiwei_> have no idea how to introduce ansible to container cluster
13:20:16 <Qiming> I've been teaching myself some ansible basics, so far, it is pretty useful
13:20:35 <Qiming> haiwei_, it would be something like this:
13:21:00 <Qiming> senlin cluster-create => a vm cluster, with list of IPs
13:21:30 <Qiming> ansible -i <cluster ip list> -m <ansible_module>
13:22:20 <Qiming> you can ssh into those VMs in batch and perform operations to do container environment preparation and even operate them
13:22:40 <haiwei_> the ansible command should be typed by hand by users?
13:22:50 <Qiming> good question
13:23:03 <Qiming> for operators, it is not a difficult task
13:23:33 <Qiming> for end users, we can invoke ansible by calling its API from inside senlin, do whatever necessary jobs on those VMs created
13:24:47 <haiwei_> that means we still need to create a scheduler and something else?
13:24:50 <yanyanhu_> maybe we should allow this ansible deployment triggered by some events
13:24:56 <Qiming> I have tried writing some playbooks (ansible term) for this kind of a task, it is pretty neat
13:25:21 <yanyanhu_> just like other actions
13:25:22 <Qiming> maybe a custom action
13:25:25 <yanyanhu_> yes
13:25:36 <Qiming> just a wrapper of ansible
13:26:05 <Qiming> previously, I tried paramiko, the underlying module used by ansible
13:26:10 <Qiming> it works as well
13:26:32 <Qiming> but 'it works' <> 'it is an elegant design'
13:26:45 <yanyanhu_> do we need to provide high level abstraction for deployment and support ansible as one of the deployment types?
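Note on the "wrapper of ansible" idea (13:21:00 to 13:25:36): a minimal sketch of how a list of cluster node IPs could be turned into an ad-hoc ansible invocation. The function name `build_ansible_command` is hypothetical, and the sketch only builds the argument list; it relies on ansible accepting a comma-separated host list (with a trailing comma) for `-i`, and it adds the `all` host pattern that the shorthand in the log leaves out.

```python
def build_ansible_command(node_ips, module, module_args=None):
    """Build an ad-hoc ansible argv for the given node IPs (hypothetical helper)."""
    if not node_ips:
        raise ValueError("cluster has no node IPs")
    # A trailing comma makes ansible treat the value as an inline host list
    # rather than as a path to an inventory file.
    inventory = ",".join(node_ips) + ","
    cmd = ["ansible", "all", "-i", inventory, "-m", module]
    if module_args:
        cmd += ["-a", module_args]
    return cmd
```

For example, `build_ansible_command(["10.0.0.5", "10.0.0.6"], "ping")` yields an argv that could be handed to `subprocess.run`; the IPs here are made up, and obtaining the real list from a senlin cluster is outside this sketch.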
13:26:55 <Qiming> haiwei_, we can have a phone call tomorrow?
13:27:10 <Qiming> yanyanhu_, that is a good idea
13:27:13 <haiwei_> I am ok tomorrow afternoon
13:27:14 <yanyanhu_> maybe, as the default type
13:27:32 <Qiming> the high level abstraction may be just a container image
13:28:17 <Qiming> by default it would be a docker image, for example
13:28:29 <yanyanhu_> exactly
13:29:06 <Qiming> okay, this is something we need to dive into this week
13:29:19 <Qiming> #topic newton work items
13:29:32 <haiwei_> what about 14:00~15:00 beijing time
13:29:39 <haiwei_> for the phone meeting?
13:29:42 <Qiming> just moved etherpad items here: https://etherpad.openstack.org/p/senlin-newton-workitems
13:30:10 <Qiming> haiwei_, fine with me
13:30:42 <haiwei_> yanyanhu_ ?
13:30:54 <yanyanhu_> I'm fine with it
13:30:59 <haiwei_> anyone interested in this can join
13:31:14 <lixinhui> I will dial in
13:31:28 <elynn> o/
13:31:38 <Qiming> okay, I'll send an invitation to you all
13:32:05 <Qiming> last Friday, we cut the RC1 branch
13:32:19 <Qiming> now we have a stable/mitaka branch on git
13:32:31 <lixinhui> cool!
13:32:38 <Qiming> that branch is meant to be the base for the coming stable release
13:32:38 <zzxwill> I saw the release tag in senlin branch.
13:33:03 <Qiming> if you have found any critical bugs, please bring them up asap
13:33:54 <Qiming> back to newton work items
13:34:28 <Qiming> tempest tests are coming in
13:34:40 <yanyanhu_> saw elynn's patch
13:34:50 <elynn> I created a BP for it.
13:35:07 <elynn> And started to look into it
13:35:17 <Qiming> maybe we should retarget the test cases about failure scenarios and stress tests on it?
13:35:29 <yanyanhu_> yes, I think so
13:35:52 <Qiming> thanks, elynn, that would be a very important QA infra
13:36:10 <yanyanhu_> those tests should be included in tempest test framework
13:36:24 <elynn> oh...seems I need to speed up to not let you blame me ;)
13:36:57 <Qiming> yanyanhu_ will blame you
13:37:05 <yanyanhu_> sorry.
really trapped by some other work in recent weeks
13:37:12 <yanyanhu_> yes...
13:37:32 <Qiming> just kidding
13:37:37 <elynn> After I rebuild my env, I will start to migrate existing tests to the tempest framework.
13:37:40 <yanyanhu_> will investigate rally ASAP to decide how we should support stress test
13:37:54 <Qiming> we need such an infra to enable more thorough tests
13:38:08 <Qiming> cool, yanyanhu_
13:38:15 <yanyanhu_> Qiming, no problem, really my fault :(
13:38:34 <Qiming> oh .... no one's fault
13:38:49 <Qiming> we are all heroes, ;)
13:39:00 <Qiming> health management side
13:39:48 <Qiming> where are we about lb based health detection?
13:40:01 <Qiming> lixinhui, asleep?
13:40:04 <lixinhui> I just
13:40:18 <lixinhui> added lb-node-polling
13:40:25 <lixinhui> into the prototype
13:40:48 <lixinhui> underlying, it will rely on neutron
13:40:50 <yanyanhu_> lixinhui, so you're using haproxy as the service provider now?
13:40:55 <lixinhui> ayes
13:40:56 <lixinhui> yes
13:41:06 <yanyanhu_> ok, at least it works
13:41:26 <lixinhui> I am thinking
13:41:36 <lixinhui> about the demo
13:41:44 <lixinhui> actually the prototype
13:42:10 <lixinhui> maybe we could attach the health_polling policy onto lb cluster
13:42:21 <lixinhui> then simulate member failure
13:42:42 <lixinhui> then watch the auto-healing of the failed member
13:42:43 <Qiming> yes?
13:43:06 <lixinhui> by listing the members of the pool and senlin cluster node status before and after auto-healing
13:43:09 <Qiming> I believe there are many details to be figured out?
13:43:22 <lixinhui> yes
13:43:35 <Qiming> right, a cluster with scaling policy, health policy and lb policy
13:44:01 <Qiming> let's see how they dance/fight together
13:44:17 <lixinhui> yes
13:44:21 <lixinhui> Xujun
13:44:22 <yanyanhu_> you mean the auto healing will be triggered by simulated failure which is actually not a failure detected by lb monitoring
13:44:38 <lixinhui> one intern is trying to write the heat template
13:44:49 <lixinhui> including everything into a template
13:45:05 <lixinhui> yanyanhu_
13:45:34 <lixinhui> I add lb member status polling logic
13:45:37 <lixinhui> for this demo
13:45:55 <lixinhui> suppose the status of member should be changed by health monitor
13:46:11 <yanyanhu_> ah, I see
13:46:12 <Qiming> lixinhui, please help bring xujun into the IRC channel, so when he has got questions, he knows where to seek help
13:46:24 <lixinhui> sure
13:46:45 <lixinhui> one problem in my mind
13:46:50 <Qiming> we don't have octavia in the picture now?
13:46:56 <lixinhui> no
13:46:58 <lixinhui> now
13:47:03 <Qiming> great
13:47:19 <lixinhui> I am deploying nsx based lbaas
13:47:34 <lixinhui> to see any problems there and ask mark to help if there are
13:47:45 <Qiming> cool
13:48:16 <Qiming> for the quick demo, we can focus on haproxy, but it would be great if someone helps on validating it using nsx
13:48:31 <lixinhui> yes
13:48:49 <haiwei_> what is nsx?
13:48:53 <lixinhui> hope it would add cents to senlin marketing
13:49:06 <lixinhui> NSX is vmware neutron engine
13:49:21 <yanyanhu_> hardware loadbalancer from citrix?
13:49:25 <lixinhui> almost all customers bought its license
13:49:27 <yanyanhu_> oh
13:49:44 <cschulz> HW loadbalancer?
13:49:52 <lixinhui> SDDN
13:49:55 <yanyanhu_> sorry, I was wrong :)
13:50:05 <lixinhui> pure
13:50:07 <lixinhui> software
13:50:17 <cschulz> Makes sense
13:50:29 <lixinhui> one problem in my mind
13:50:37 <lixinhui> but maybe not important for this demo
13:50:56 <lixinhui> is how to differentiate a normal node stop from an abnormal down
13:51:02 <Qiming> my guess is it is a question, not a problem, :P
13:51:10 <lixinhui> yes :)
13:51:13 <cschulz> Maintenance mode?
13:51:19 <Qiming> exactly
13:51:23 <lixinhui> yes
13:51:39 <Qiming> the HA policy has to be put on hold in some scenarios
13:51:58 <cschulz> Or at least for a subset of the cluster
13:51:59 <yanyanhu_> disable the HA policy temporarily?
13:52:00 <Qiming> that is why we have cluster-policy binding and there is an 'enabled' field there
13:52:31 <Qiming> if it is always effective, you won't be able to do a thing to your cluster, :)
13:52:43 <lixinhui> ok
13:52:53 <lixinhui> when to disable it
13:53:22 <cschulz> So should we add a 'semi-auto scaling' topic to Newton?
13:53:35 <lixinhui> :)
13:53:37 <Qiming> sure, cschulz
13:53:50 <cschulz> OK will add some stuff
13:54:11 <Qiming> lixinhui, sounds to me more like an interaction between policy and action again
13:54:28 <lixinhui> yes, Qiming
13:54:43 <lixinhui> this will be more complex
13:54:55 <Qiming> before some actions are performed, we will have to disable ha, after that action is done, we re-enable it
13:55:52 <lixinhui> will think more about it
13:55:53 <yanyanhu_> yes, maybe we should throw some warning to user when they perform 'maintenance operations' with HA policy enabled
13:56:18 <lixinhui> :)
13:56:24 <Qiming> okay, that's something that deserves a whole hour for discussion
13:56:31 <Qiming> let's continue this on senlin tomorrow
13:56:38 <lixinhui> okay
13:56:45 <Qiming> last update from me
13:56:57 <Qiming> policy docs are almost all out there
13:57:19 <Qiming> just started working on micro-versioning support for senlin-api
13:57:37 <Qiming> before the versioning thing has landed, we won't accept any changes to senlin-api
13:57:56 <Qiming> #topic open discussion
13:58:09 <Qiming> 2 minutes
13:58:19 <yanyanhu_> no more topics from me
13:58:31 <lixinhui> no from me
13:58:38 <cschulz> We've run into Action suspend not working. Has that ever been tested?
13:58:42 <haiwei_> ok for me
13:58:59 <Qiming> cschulz, that .... is not thoroughly tested
13:59:02 <yanyanhu_> cschulz, it is not supported I think
13:59:22 <yanyanhu_> really haven't tested it yet...
13:59:30 <Qiming> there is some db level support
13:59:31 <cschulz> OK, That's OK since we will be following a different approach
13:59:37 <Qiming> but the whole thing is half-baked
13:59:55 <Qiming> thanks for joining, guys, have a good night/nice day
14:00:00 <lixinhui> u2
14:00:01 <Qiming> #endmeeting
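Note on putting the HA policy on hold around maintenance (13:51:39 to 13:54:55): the pattern Qiming describes, disable before the action and re-enable afterwards, can be sketched with a context manager. `PolicyBinding` below is a hypothetical in-memory stand-in for a cluster-policy binding with an 'enabled' field; it does not call any senlin API.

```python
from contextlib import contextmanager


class PolicyBinding:
    """Hypothetical stand-in for a cluster-policy binding with an 'enabled' flag."""

    def __init__(self, policy_name, enabled=True):
        self.policy_name = policy_name
        self.enabled = enabled


@contextmanager
def policy_on_hold(binding):
    """Disable the policy for the duration of a block, then restore its state."""
    previous = binding.enabled
    binding.enabled = False
    try:
        yield binding
    finally:
        # Restore the earlier state even if the maintenance action fails,
        # so a crashed action cannot leave the policy disabled.
        binding.enabled = previous
```

Usage would look like `with policy_on_hold(ha_binding): stop_node_for_maintenance()`, where `stop_node_for_maintenance` is a placeholder for whatever action is being performed. Restoring the previous value, rather than always setting True, keeps a policy that was already disabled from being switched on by accident.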