13:01:47 <Qiming> #startmeeting senlin
13:01:48 <openstack> Meeting started Tue Mar 8 13:01:47 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:51 <openstack> The meeting name has been set to 'senlin'
13:01:55 <Qiming> halo
13:02:00 <haiwei> hi
13:02:02 <zzxwill> Hello.
13:02:06 <yanyanhu> hello
13:02:09 <elynn> o/
13:02:13 <Qiming> got some problems updating the meeting agenda page
13:02:15 <Qiming> sorry
13:02:47 <Qiming> also got a note from xinhui, she won't be able to join us today, because of some company activities ... parties?
13:02:56 <Qiming> anyway, let's get started
13:03:03 <yanyanhu> also can't access it even with proxy
13:03:03 <Qiming> #topic mitaka work items
13:03:08 <yanyanhu> :)
13:03:21 <Qiming> #link https://etherpad.openstack.org/p/senlin-mitaka-workitems
13:03:22 <haiwei> me too
13:03:46 <Qiming> great, site is down, :), not my fault
13:04:12 <Qiming> I was trying to reach xujun about scalability testing, but haven't got a response yet
13:04:35 <Qiming> need an update from bran on stress testing
13:04:54 <yanyanhu> yes, he is not here I think
13:04:58 <Qiming> I myself spent some time studying tempest
13:05:25 <Qiming> seems tempest is still the way to do api tests, but the code is now suggested to live closer to individual projects
13:05:56 <Qiming> so even if we start work on that, we are not supposed to commit to tempest directly; will need to confirm this
13:06:09 <yanyanhu> ok
13:06:10 <haiwei> only some core projects' functional tests remain in tempest
13:06:31 <Qiming> if the code is to be committed to senlin, then we are still expected to use tempest lib
13:06:49 <Qiming> tempest lib was a separate project, but recently it has been merged back into tempest
13:06:58 <Qiming> ... a lot of things happen every day
13:07:28 <Qiming> haiwei, it would be good if someone from NEC can help explain to us the right direction to go
13:07:35 <haiwei> you want to add API tests?
13:07:41 <Qiming> I know you have some guys active on that
13:07:46 <haiwei> yes
13:08:01 <Qiming> we also need someone to teach us the right way to do stress tests
13:08:20 <Qiming> api tests, stress tests and scenario tests were all in the scope of tempest before
13:08:27 <haiwei> like I said in the mid-cycle, there is something called a tempest external plugin
13:08:27 <Qiming> but I don't know the current situation
13:08:52 <Qiming> okay, then that is the direction to go for api tests
13:09:00 <Qiming> any idea about stress tests?
13:09:04 <haiwei> the tempest external plugin is something for scenario tests
13:09:09 <Qiming> are we supposed to reinvent the wheel?
13:09:24 <Qiming> scenario tests are different from api tests
13:09:33 <haiwei> for stress tests, I haven't heard of that before
13:09:43 <haiwei> I am not sure if tempest supports it
13:09:54 <Qiming> http://git.openstack.org/cgit/openstack/tempest/tree/tempest/README.rst
13:10:02 <Qiming> line 24 is about api tests
13:10:08 <Qiming> line 39 is about scenario tests
13:10:13 <Qiming> line 50 is about stress tests
13:10:22 <haiwei> oh, saw it
13:10:41 <haiwei> ok, I will ask the tempest guys tomorrow
13:10:51 <Qiming> not sure that is still the consensus
13:11:10 <haiwei> or ask them to join the Senlin IRC channel
13:11:13 <Qiming> if we don't get an answer from them, we have to ask on the -dev list
13:11:23 <haiwei> yes
13:11:23 <Qiming> okay, either way works
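For reference, an in-tree API test client built on tempest.lib (the approach discussed above) might start out like the sketch below. This is a minimal sketch only: the ClusteringClient class and the 'clusters' endpoint shape are illustrative assumptions, not actual senlin test code.

```python
import json

from tempest.lib.common import rest_client


class ClusteringClient(rest_client.RestClient):
    """Hypothetical senlin API client built on tempest.lib.

    The class name and the 'clusters' endpoint are assumptions for
    illustration; a real client would follow the senlin API reference.
    """

    def list_clusters(self):
        # GET /clusters -- an API surface test asserts both the status
        # code and the body shape, for failure paths as well as success
        resp, body = self.get('clusters')
        self.expected_success(200, resp.status)
        return rest_client.ResponseBody(resp, json.loads(body))
```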
13:11:49 <Qiming> yanyanhu is still on functional tests?
13:12:04 <Qiming> saw some patches about profile-level support for updates
13:12:06 <yanyanhu> yes. but basic support is almost done
13:12:14 <Qiming> great
13:12:33 <Qiming> api surface tests are important
13:12:40 <yanyanhu> yea, that's what I hoped to do, but I met some issues
13:13:00 <Qiming> those are supposed to test how the service fails in addition to how it succeeds
13:13:01 <yanyanhu> that I don't know how to address
13:13:18 <Qiming> okay, maybe we need some experts in this field
13:13:28 <yanyanhu> yes, we need to cover failure cases as well
13:13:39 <Qiming> testing is supposed to be our focus in the coming weeks
13:14:00 <Qiming> next, health management
13:14:18 <Qiming> we have a basic version of the health manager and a basic health policy
13:14:42 <Qiming> need some tests on them to make sure they work, so that at least we can remove line 16
13:15:31 <Qiming> lb based health detection, yanyanhu do you know the progress?
13:15:40 <yanyanhu> it's done I think
13:15:46 <yanyanhu> oh, sorry
13:15:52 <yanyanhu> you mean health detection
13:15:57 <Qiming> the health monitor support part is done
13:16:11 <yanyanhu> not sure about it. Just ensured the HM support in the lb policy works now
13:16:12 <Qiming> we need a poller to check it, right?
13:16:13 <yanyanhu> yep
13:16:25 <yanyanhu> yes, I think so
13:16:45 <yanyanhu> the poller needs to check the health status of pool members
13:17:04 <yanyanhu> to decide whether a member is active or not
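A rough sketch of the poller yanyanhu describes here: periodically list the members of the LB pool and flag the ones the health monitor has marked as unreachable. The `lb_client.list_members(pool_id)` call and the 'operating_status' field are assumptions modelled loosely on the LBaaS v2 API, not confirmed senlin code.

```python
import time


def poll_pool_members(lb_client, pool_id, interval=60, on_unhealthy=None):
    """Flag LB pool members whose operating status is not ONLINE."""
    while True:
        for member in lb_client.list_members(pool_id):
            # The LB health monitor marks members it cannot reach;
            # anything other than ONLINE is treated as a failed node,
            # so there is no need to probe individual nodes directly.
            if member.get('operating_status') != 'ONLINE':
                if on_unhealthy is not None:
                    on_unhealthy(member)
        time.sleep(interval)
```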
13:17:13 <Qiming> yesterday, I noticed there had been some talk about HA in this channel; it turned out to be an HA team meeting
13:17:47 <Qiming> yanyanhu, that is already good because we don't have to check individual nodes
13:18:13 <yanyanhu> yes
13:18:22 <Qiming> in that meeting, I gave people a quick update on senlin, what it is doing on HA
13:18:49 <Qiming> hopefully, we can work with more people on this to solve some real concerns from enterprise users
13:19:12 <Qiming> this will also be one of the subtopics in lixinhui's summit presentation
13:19:42 <Qiming> she is working very hard on that; yesterday, she was still working at 11pm ...
13:19:54 <yanyanhu> hard worker :)
13:20:01 <Qiming> hope we can have something to share soon
13:20:08 <yanyanhu> she is back from the party I think
13:20:19 <lixinhui_> yes...
13:20:31 <Qiming> the basic problem is about neutron, lbaas and octavia integration
13:20:32 <cschulz> Spoke with Marco on HA a while back. He said that a simple restart of the failed node was probably the place to start.
13:20:34 <lixinhui_> sorry for being late...
13:20:51 <Qiming> thanks, cschulz, for the input
13:21:10 <Qiming> we are wondering if recovery should really be automated
13:21:33 <lixinhui_> nice to catch this
13:21:40 <Qiming> there were some proponents of auto-recovery, no matter what the "recover" operation is
13:22:25 <Qiming> we can dig into that when we have some basic working code
13:22:29 <cschulz> I agree that recovery is a process that could be different for different customers.
13:23:17 <Qiming> customizability is always desired, however, we have to control the degree to which we want it to be customizable
13:23:36 <yanyanhu> it really depends on the use case I think. Especially whether the app/service deployed in the VM can recover automatically after restarting
13:24:14 <Qiming> I was thinking of the standby cluster support
13:24:23 <cschulz> Yes, maybe we should have a list of common recovery procedures initially
13:24:41 <Qiming> having a standby cluster will speed up 'auto-scaling' and 'recovery'
13:24:59 <Qiming> but it will waste some resources for sure, :)
13:25:31 <cschulz> Depending on the accounting process, it may waste a lot of $
13:25:38 <yanyanhu> some nodes with the role of 'backup'
13:25:44 <Qiming> for nova servers, we can do reboot, rebuild, evacuate, recreate
13:26:01 <Qiming> in some cases, you will need fencing
13:26:30 <Qiming> there is no fencing API in openstack, which means fencing has to be a per-device-model thing
13:26:34 <cschulz> Yes, and in some cases there will be a cluster manager to be informed.
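The recovery operations Qiming lists (reboot, rebuild, evacuate, recreate) could be modelled as an ordered chain tried from cheapest to most drastic, which is one way to give customers the per-deployment flexibility cschulz asks about. A hedged sketch follows; the escalation order and helper are assumptions, not the health policy's actual behaviour, though the nova calls mirror python-novaclient's servers API.

```python
def recover_server(nova, server_id,
                   operations=('REBOOT', 'REBUILD', 'RECREATE')):
    """Try recovery operations in order of increasing cost."""
    for op in operations:
        try:
            if op == 'REBOOT':
                nova.servers.reboot(server_id, reboot_type='HARD')
            elif op == 'REBUILD':
                # rebuild against the server's current image
                image_id = nova.servers.get(server_id).image['id']
                nova.servers.rebuild(server_id, image_id)
            elif op == 'RECREATE':
                # delete the node and let the cluster create a replacement
                nova.servers.delete(server_id)
            return op  # request accepted; the caller verifies health later
        except Exception:
            continue  # escalate to the next, more drastic operation
    return None  # every operation failed; fencing/manual action needed
```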
13:27:33 <Qiming> exactly, we need a division of the problem domain, and to work out solutions step by step
13:27:55 <Qiming> starting from the basics
13:28:05 <Qiming> how about an etherpad for this?
13:28:17 <lixinhui_> okay
13:28:26 <cschulz> OK
13:28:33 <lixinhui_> will get one up for this discussion
13:28:39 <Qiming> that way we can collect many inputs
13:28:46 <Qiming> thanks, lixinhui_
13:28:53 <Qiming> let's move on
13:28:54 <lixinhui_> :)
13:29:11 <Qiming> documentation will be my main focus in the coming weeks
13:29:26 <Qiming> documenting use cases and the design of policies
13:29:57 <Qiming> end-to-end autoscaling story
13:30:05 <Qiming> I think xinhui is working on one
13:30:11 <lixinhui_> I am prototyping this
13:30:19 <Qiming> https://www.openstack.org/summit/austin-2016/summit-schedule/events/7469
13:30:28 <lixinhui_> still some problems with neutron lbaas v2
13:30:29 <Qiming> congrats on your talk being accepted
13:30:47 <lixinhui_> thanks for the wisdom from all of you
13:31:17 <Qiming> lixinhui_, any specifics we can help with?
13:31:49 <lixinhui_> Thanks for the suggestions on the alarm side from you and yanyanhu
13:31:52 <yanyanhu> lixinhui_, can ceilometer generate samples for a lbaas v2 pool correctly?
13:32:21 <lixinhui_> according to my trial about half a month ago
13:32:24 <lixinhui_> it works
13:32:28 <lixinhui_> but recently
13:32:29 <yanyanhu> nice!
13:32:33 <lixinhui_> neutron cannot
13:32:49 <lixinhui_> create a lbaas v2 loadbalancer successfully
13:32:58 <lixinhui_> always "pending to create"
13:33:10 <yanyanhu> a new bug?
13:33:14 <lixinhui_> I am blocked by this problem
13:33:20 <lixinhui_> need to search more
13:33:35 <Qiming> ah, I see, let's spend some time together tomorrow on this
13:33:39 <yanyanhu> I recall I met a similar issue before, when using lbaas v1
13:33:45 <Qiming> seems to me like a VM creation problem
13:33:52 <yanyanhu> it was caused by an incorrect haproxy configuration
13:33:57 <lixinhui_> Thanks in advance
13:34:12 <Qiming> okay, will be online for this tomorrow
13:34:14 <yanyanhu> basically, the driver of lbaas didn't work correctly
13:34:25 <Qiming> next, profile for container support
13:34:26 <yanyanhu> not sure whether this is the problem you met
13:34:43 <Qiming> the profile part is easy ...
13:34:50 <Qiming> the difficult part is about scheduling
13:35:03 <Qiming> when we have new containers to create, we have to specify a 'host' for each
13:35:24 <Qiming> (let's pretend we don't have network/storage problems at the moment)
13:35:28 <haiwei> you will start containers on vms?
13:35:38 <Qiming> the 'host' could be a VM, it could be a physical machine
13:35:45 <haiwei> ok
13:36:10 <Qiming> let's start with containers inside VMs, which seems a common scenario today
13:36:27 <Qiming> the VMs are created by senlin or other services
13:36:36 <Qiming> we need to do some kind of scheduling
13:37:05 <Qiming> one option is to introduce a mesos-like framework, including mesos agents in the guest images
13:37:19 <haiwei> If the containers run on vms, there will be two kinds of clusters at the same time
13:37:26 <Qiming> then we would still need a scheduler, e.g. marathon
13:37:44 <Qiming> users don't need to care
13:37:56 <Qiming> we can manage the hosts for the containers
13:38:12 <Qiming> in some use cases I have heard of
13:38:45 <Qiming> people build a huge resource pool to run containers, but that huge resource (vm or physical) pool is transparent to users
13:39:20 <Qiming> one option, discussed with yanyanhu today, is to do some resource based 'host' selection
13:39:34 <Qiming> that will give us a very simple starting point to go forward
13:40:01 <haiwei> does the 'user' not include the cloud operator?
13:40:18 <Qiming> we can leverage ceilometer (or others) to monitor per-vm resource availability and find a candidate node to launch the container
13:40:41 <Qiming> the cloud operator will know the underlying cluster of VMs
13:41:00 <Qiming> they may even need to autoscale this VM cluster to accommodate more containers
13:41:59 <Qiming> we can start getting our hands dirty and see if this is just another 'placement' policy
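The resource-based 'host' selection Qiming and yanyanhu describe might start as simply as scoring candidate VMs by recent memory usage from ceilometer and picking the least loaded one. In this sketch the meter name, query shape, and python-ceilometerclient call are assumptions; a real placement policy would weigh CPU, memory, and container count together.

```python
def pick_host(ceilometer, candidate_vm_ids):
    """Return the candidate VM with the lowest recent memory usage."""
    best_vm, best_usage = None, float('inf')
    for vm_id in candidate_vm_ids:
        # fetch the most recent memory.usage sample for this VM
        samples = ceilometer.samples.list(
            meter_name='memory.usage',
            q=[{'field': 'resource_id', 'op': 'eq', 'value': vm_id}],
            limit=1)
        if not samples:
            continue  # no telemetry for this VM; skip it
        usage = samples[0].counter_volume
        if usage < best_usage:
            best_vm, best_usage = vm_id, usage
    return best_vm  # None means no candidate had usable samples
```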
13:42:40 <Qiming> btw, congrats on haiwei's talk proposal being accepted
13:43:10 <Qiming> we are gonna help make that presentation a successful one
13:43:14 <haiwei> currently I can't think of a way to auto-scale containers onto vms which are not controlled by Senlin
13:43:32 <haiwei> Thank you for the help with the session proposal
13:43:46 <Qiming> there are two levels of entities to manage
13:43:48 <Qiming> the VM pool
13:43:53 <Qiming> the container pool
13:43:54 <haiwei> I mean that the vms seem not to be under the control of Senlin
13:44:01 <yanyanhu> actually, for that scenario, the VM cluster has to be managed by Senlin I think
13:44:18 <Qiming> well, they may and may not be
13:44:30 <Qiming> you just need to know their IPs
13:44:52 <Qiming> and possibly some secrets to log into them
13:45:25 <Qiming> having senlin manage the two clusters makes total sense to me
13:45:37 <Qiming> was just trying to be open minded on this
13:45:47 <haiwei> ok
13:45:55 <Qiming> haiwei, can you help revise the specs
13:46:05 <Qiming> so we can start drilling down into the details?
13:46:09 <haiwei> ok, I will do it
13:46:16 <yanyanhu> yes, a resource (host) finding process becomes necessary if the VM cluster is not created by Senlin
13:46:24 <Qiming> or if a spec is not the right tool, we can use etherpads as well
13:46:39 <haiwei> I can't spend much time on that recently, but I will try my best
13:47:27 <Qiming> we need careful preparation for all the presentations
13:47:49 <haiwei> yes, indeed
13:48:02 <Qiming> okay
13:48:14 <Qiming> last item, rework of NODE_CREATE/DELETE
13:48:22 <Qiming> it has been there for a long time
13:48:39 <Qiming> let's keep them there as is, :P
13:49:01 <Qiming> driver work
13:49:15 <yanyanhu> it has been done
13:49:24 <Qiming> it was much smoother than we had thought
13:49:27 <yanyanhu> just needed a little change in the neutron driver
13:49:33 <Qiming> that was great
13:49:34 <yanyanhu> yep :P
13:49:39 <Qiming> btw
13:49:45 <lixinhui_> cool!
13:49:53 <Qiming> we just released the mitaka-3 milestone for senlin and senlinclient last week
13:50:13 <Qiming> that is a milestone for everyone
13:50:25 <Qiming> thank you all for your contributions during the past months
13:50:28 <Qiming> we made it
13:50:45 <Qiming> #topic open discussion
13:50:55 <lixinhui_> Qiming, I remember there is still a patch on senlinclient about check/recover
13:51:13 <lixinhui_> once you -2'ed it to avoid it being merged by accident
13:51:26 <lixinhui_> not sure whether we will merge it or not
13:51:58 <Qiming> oh, yes
13:51:59 <lixinhui_> at that time, the patch was blocked by the sdk version
13:52:31 <Qiming> that one can be unblocked now
13:52:43 <lixinhui_> ok
13:52:53 <Qiming> should have merged it into m-3
13:53:03 <haiwei> what about the SDK's issue
13:53:27 <Qiming> the sdk version has been bumped to 0.8.1
13:54:23 <yanyanhu> the global requirement has been updated
13:54:26 <Qiming> https://review.openstack.org/#/c/285599/
13:55:22 <Qiming> anything else?
13:55:33 <yanyanhu> nope from me
13:55:36 <lixinhui_> no
13:55:39 <elynn> no
13:55:43 <cschulz> no
13:55:52 <Qiming> thanks everyone for joining
13:55:52 <haiwei> no
13:56:01 <Qiming> good night/day
13:56:06 <Qiming> #endmeeting