13:00:35 <Qiming> #startmeeting senlin
13:00:36 <openstack> Meeting started Tue May 31 13:00:35 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:39 <openstack> The meeting name has been set to 'senlin'
13:00:56 <Qiming> evening
13:01:07 <xuhaiwei__> hi
13:01:15 <elynn> o/
13:02:01 <Qiming> #topic newton work items
13:02:20 <Qiming> yanyan won't be able to join today due to family reasons
13:02:32 <Qiming> tempest
13:02:50 <elynn> tempest api gate job is enabled
13:02:57 <Qiming> gate job added and enabled as experimental, yes
13:03:00 <elynn> in experimental queue
13:03:09 <Qiming> removing line 6-7
13:03:16 <elynn> negative tests are slow...
13:03:34 <Qiming> event show test is in, right?
13:03:39 <elynn> yes
13:04:10 <elynn> I will continue negative tests in my spare time.
13:04:14 <Qiming> I suggest we do finer-grained tests for cluster actions
13:04:52 <Qiming> one of the reasons we didn't document the api using openapi is that we have many cluster actions, all on the same uri
13:05:33 <elynn> So you suggest to test rest cluster actions?
13:05:43 <Qiming> those actions differ from each other regarding parameters, better test each and every one of them because they are all different apis
13:06:22 <Qiming> yes, cluster_add_node cluster_del_node cluster_resize ... etc
13:06:41 <elynn> Okay, I will work on them.
13:06:48 <elynn> Do we need to test all parameters?
13:06:51 <Qiming> thanks
13:06:58 <elynn> for each cluster action
13:07:05 <Qiming> by the way, I spent quite some time reworking the tempest test cases
13:07:34 <elynn> I saw that, you moved some functions to util.py
13:07:48 <Qiming> separating util functions out; and using setUp and addCleanup for test case preparation
13:07:53 <elynn> Thanks for doing that!
13:08:08 <Qiming> I don't get the idea of doing resource_setup classmethod calls
13:08:38 <Qiming> it is very cumbersome to do cls.profile = ... then later reference it as self.profile
13:09:01 <elynn> Yes, I thought it's weird...
13:09:05 <Qiming> I think adding new ones would be much easier now
13:09:10 <elynn> Most of them are classmethod...
13:09:37 <elynn> Thanks for that job
13:09:43 <Qiming> I started trying on a few of them and it worked, so I extended the revision to all tests
13:10:00 <Qiming> I have removed some validations after api triggering
13:10:26 <Qiming> because they were making the api tests impure, i.e. making them more like functional tests
13:11:02 <elynn> Saw that too, some of the tests are using two or more APIs in one test.
13:11:13 <Qiming> once the whole collection of api tests is in place, we may want to make the gate voting
13:11:31 <elynn> I will follow your modification to add new tests.
13:11:34 <Qiming> or, maybe we can enable it now
13:11:39 <Qiming> thanks
13:12:03 <elynn> Okay, I will submit a patch to enable it later :)
13:12:08 <Qiming> lixinhui_, there?
13:12:17 <lixinhui_> Yes, Qiming
13:12:31 <Qiming> hi, any news from stress tests?
13:12:46 <lixinhui_> Sorry, Qiming
13:12:50 <Qiming> np
13:13:00 <Qiming> good news is that #138453 is in
13:13:06 <lixinhui_> We are still focusing on integrating Senlin with VIO 2.5
13:13:26 <Qiming> now we can add some more rally tests when yanyan gets cycles
13:13:27 <lixinhui_> but I am thinking of delaying it
13:14:16 <Qiming> okay, let's switch to that later
13:14:35 <Qiming> health management ... no progress last week
13:14:47 <lixinhui_> not really
13:14:50 <lixinhui_> Qiming
13:14:57 <lixinhui_> I am thinking
13:15:13 <Qiming> i have refactored the health_manager code last week, to make room for event listener
13:15:17 <Qiming> oh?
13:15:17 <lixinhui_> to use heat stack installing the linux ha
13:15:19 <lixinhui_> agent
13:15:41 <Qiming> into vm instances?
13:15:52 <lixinhui_> is that a right way to think that?
13:15:55 <lixinhui_> yes
13:15:59 <lixinhui_> to help fencing
13:16:00 <Qiming> that is one option
13:16:20 <Qiming> fencing has to be done on physical servers, which is beyond heat control
13:16:30 <lixinhui_> do we have other choice?
13:16:55 <lixinhui_> I know
13:17:09 <lixinhui_> just still need to install an agent in the vm
13:17:17 <lixinhui_> for vm level control, right?
13:17:22 <Qiming> I have an impression that nova has some work on fencing interface, but cannot recall the details at the moment
13:17:43 <lixinhui_> did get any info about that
13:17:48 <lixinhui_> did not
13:17:48 <Qiming> ... I am not a fan of installing things into VMs
13:18:04 <Qiming> not at this stage at least
13:18:04 <lixinhui_> I see
13:18:26 <lixinhui_> so I am trying to find out if there is any other choice
13:18:27 <Qiming> our next topic on agenda will touch this topic
13:18:32 <Qiming> okay
13:18:45 <lixinhui_> okay
13:18:50 <lixinhui_> go ahead
13:19:07 <Qiming> no comment received on the senlin-ha-recover etherpad
13:19:11 <Qiming> moving on
13:19:39 <Qiming> no news about documentation last week, we have done api-ref migration I believe
13:19:50 <Qiming> the new site is up and looks great so far
13:20:02 <Qiming> container support
13:20:19 <xuhaiwei__> yes, i am working on adding docker driver
13:20:28 <Qiming> saw that, xuhaiwei__
13:20:29 <Qiming> thanks
13:21:00 <xuhaiwei__> need to get the ip from nova server
13:21:08 <Qiming> let's see if our driver work can help shape the API design of higgins
13:21:36 <Qiming> yes, you will get that info
13:21:37 <xuhaiwei__> currently I am concerned about one thing
13:21:47 <Qiming> when node-create, or cluster-create is triggered
13:21:59 <Qiming> the profile will get that information filled in
13:22:23 <xuhaiwei__> that is Senlin supports two kinds of nodes, nova server and heat stack, heat stack node can contain more than one nova server
13:22:27 <Qiming> it is like the region_name, scheduler_hints we fed to nova server
13:22:57 <Qiming> then we don't support specifying a heat stack cluster as the hosting cluster for containers
13:23:22 <Qiming> that is the easy, quick decision
13:23:40 <xuhaiwei__> ok
13:23:51 <Qiming> in the long run, I think we can re-enable heat stack clusters to play this role, provided it has an OS::Server output
13:24:01 <xuhaiwei__> maybe for the first step, we care about nova server node only
13:24:23 <Qiming> I have an impression that a heat stack has a secret output attribute allowing that stack to be treated as a nova server
13:24:40 <Qiming> if you really need that, please dig it out
13:24:51 <Qiming> as the first step, it is fine
13:25:18 <Qiming> will take a look at the driver code and think about it, ... how can we generalize that
13:25:36 <Qiming> no progress on engine work
13:25:39 <xuhaiwei__> In fact when I did the demo for the summit session, I used heat stack output directly to get the server's ip
13:25:52 <Qiming> yes, it is possible
13:26:06 <xuhaiwei__> it's convenient
13:26:20 <Qiming> so long as we control the heat template ... to ensure that the template has server ip in its output
13:26:51 <Qiming> please keep on exploring that
13:26:52 <xuhaiwei__> yes, that's kind of forcing user to do it
13:27:03 <Qiming> and feel free to call for discussions on details
13:27:12 <xuhaiwei__> ok
13:27:18 <Qiming> it is not generic, but still doable, :)
13:27:36 <xuhaiwei__> yes
13:27:44 <Qiming> zaqar support, em ... I'm thinking if we can get some support from fei long on that
13:28:12 <Qiming> there has been a spec proposal
13:28:13 <Qiming> https://review.openstack.org/#/c/318202/3/specs/newton/mistral-notifications.rst
13:28:50 <Qiming> haven't got time to catch up on the review history
13:29:11 <Qiming> if you are interested in connecting the dots, that might be an interesting thread to follow
13:29:25 <Qiming> moving on
13:29:32 <Qiming> event/notifications
13:29:46 <Qiming> that is a bigger topic than I imagined
13:30:17 <Qiming> so after reading the nova specs, I figured that we need to get oslo.versionedobjects landed first
13:30:38 <Qiming> all senlin db objects then can be represented as a versioned object
13:31:12 <Qiming> we can isolate the db changes from senlin-engine/senlin-api, so in future, live upgrade of the service is possible
13:32:06 <Qiming> when working on that, I was also hoping that we can use o.vo to model API requests which should be versioned as well, and notifications, which need versioning too
13:32:58 <Qiming> notification's priority is higher than requests because we got requirements to notify other software what happened in senlin
13:33:27 <Qiming> that is a key interface for integrating with existing software
13:33:41 <Qiming> will keep working on that in coming weeks
13:33:58 <Qiming> so ... that's all from the agenda
13:34:09 <Qiming> the first topic
13:34:21 <Qiming> questions/comments?
13:35:13 <Qiming> while implementing the o.vo, there were two blockers ...
13:35:22 <Qiming> can share with you as experiences
13:35:41 <elynn> That's great!
13:35:44 <Qiming> one is that many databases are storing DateTime fields without time zone info
13:35:51 <Qiming> including mysql
13:36:15 <Qiming> but o.vo is forcing the DateTimeField to carry TZ info by default
13:36:47 <Qiming> to solve this conflict, you can either turn off TZ in o.vo, or enable TZ at sqlalchemy layer
13:36:55 <Qiming> I chose the latter
13:37:33 <Qiming> another one is about obj_name, which was a column in the event table in senlin db
13:37:53 <Qiming> but o.vo VersionedObject has an obj_name method
13:38:02 <Qiming> yes, the same name
13:38:13 <elynn> Then?...
13:38:16 <Qiming> so we had to change the db schema to make things smooth
13:38:44 <Qiming> that is the reason why we have version 5 of db migration
13:39:06 <Qiming> we changed obj_id, obj_name, and obj_type to oid, oname and otype respectively
13:39:33 <Qiming> still working on some complaints about some IDs not being UUID format
13:39:53 <Qiming> but anyway, stricter checks do help make the code stable
13:40:07 <Qiming> #topic senlin cluster-do operation
13:40:35 <Qiming> during a discussion with lixinhui_ and a long 4.5-hour meeting today with a customer
13:40:58 <Qiming> I'm feeling an urgent need to add the cluster_do operation to senlin-api
13:41:22 <xuhaiwei__> what is it?
13:41:44 <Qiming> in general, it is an API that allows users to do things they want on all or some specific nodes in a cluster (focus on nova server cluster now)
13:42:13 <Qiming> I'm thinking of three layers of "things to do" at the moment
13:42:43 <Qiming> layer one: operations exposed by the backend drivers, thus implementable in senlin profiles
13:43:02 <Qiming> e.g. nova evacuate, nova reboot, nova shelve, ...
13:43:24 <Qiming> these operations are specific to a profile type
13:43:40 <Qiming> e.g. 'evacuate' only makes sense to nova servers
13:44:21 <Qiming> we can augment senlin profile types by adding an 'operations' property that captures the operations a backend can understand
13:44:30 <Qiming> then a user can do this:
13:44:46 <Qiming> senlin cluster-do evacuate <cluster_id>
13:44:50 <Qiming> or
13:45:13 <Qiming> senlin cluster-do evacuate --role blue_region <cluster_id>
13:45:40 <xuhaiwei__> sounds reasonable
13:45:54 <Qiming> it is a batch operation you can perform on a cluster, and the operation logic is programmed into the profile type implementation
13:46:28 <Qiming> layer two: run some user specified scripts on a cluster (still nova cluster here)
13:46:59 <Qiming> senlin cluster-do --script <install_linux_ha.sh> <cluster_id>
13:47:41 <Qiming> senlin can take the specified script and scp that code to each and every cluster node, and ssh to those nodes for execution
13:48:16 <Qiming> it is a generic logic that can be used to run any shell script
13:48:48 <Qiming> its function is comparable to the software-config and software-deployment, but with fewer constraints
13:49:24 <Qiming> layer three: enable senlin to run an ansible playbook directly
13:50:01 <Qiming> senlin cluster-do --playbook install_mysql_cluster.yml <cluster_id>
13:50:41 <Qiming> behind the scenes, you can imagine, senlin is calling ansible to run the playbooks
13:51:24 <lixinhui_> very useful
13:52:00 <Qiming> this is a measure to save those guys trapped by heat softwareconfig/deployments
13:52:23 <Qiming> we worked with them a long time ago to set up a multi-tiered enterprise application
13:53:02 <Qiming> at least layer one and layer two will be useful for your use case, lixinhui_, right?
13:53:18 <lixinhui_> to be honest
13:53:27 <lixinhui_> I wanna layer3
13:53:36 <Qiming> :D
13:53:37 <lixinhui_> or a super CLI
13:54:03 <lixinhui_> you know
13:54:10 <lixinhui_> install some agent
13:54:17 <lixinhui_> or remove some agent
13:54:57 <Qiming> yes, it would be very useful and very convenient to provision application HA
13:55:46 <lixinhui_> :)
13:56:14 <Qiming> I have already tried to invoke ansible from a python program ... it works
13:56:22 <lixinhui_> cool
13:56:31 <Qiming> though I need to figure out how to manage the keys
13:56:57 <Qiming> I'm not a super fan of layer 3, to be honest
13:57:22 <lixinhui_> reasons
13:57:28 <Qiming> it sounds like a wrapper around ansible, if users already have some ansible playbooks, they may want to use ansible directly
13:57:35 <lixinhui_> yes
13:58:29 <Qiming> the advantage of layer 3 ... is that with such an API, it can be exposed to horizon
13:58:46 <lixinhui_> yes
13:58:54 <Qiming> a user can paste their playbook directly into the web page
13:59:11 <Qiming> or the link to the playbook to the web page, and click 'run', :)
13:59:21 <Qiming> oh ... time's up
13:59:27 <lixinhui_> ...
13:59:27 <Qiming> have to free the channel
13:59:38 <lixinhui_> okay, talk more tomorrow
13:59:42 <Qiming> thanks for your time, guys, you are always good listeners, :D
13:59:48 <Qiming> good night
13:59:55 <Qiming> #endmeeting
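
A minimal sketch of the "layer one" cluster-do idea from the meeting: a profile type carries an 'operations' mapping, and a batch call applies one named operation to all nodes or only to nodes with a given role. This is an illustration only, not Senlin code. The names Node, ServerProfile and cluster_do are hypothetical, and the handlers just return a message instead of calling any backend driver.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a cluster node."""
    node_id: str
    role: Optional[str] = None


class ServerProfile:
    """Hypothetical profile type that exposes backend operations by name."""

    def __init__(self) -> None:
        # Operation name -> handler. A real profile would call its backend
        # driver here; these handlers only describe the request.
        self.operations: Dict[str, Callable[[Node], str]] = {
            "reboot": lambda node: f"reboot requested for {node.node_id}",
            "evacuate": lambda node: f"evacuate requested for {node.node_id}",
        }


def cluster_do(profile: ServerProfile, nodes: List[Node], operation: str,
               role: Optional[str] = None) -> List[str]:
    """Apply one operation to every node, or only to nodes with `role`."""
    if operation not in profile.operations:
        raise ValueError(f"operation '{operation}' is not supported by this profile type")
    handler = profile.operations[operation]
    targets = [n for n in nodes if role is None or n.role == role]
    return [handler(n) for n in targets]


if __name__ == "__main__":
    nodes = [Node("n1", "blue_region"), Node("n2", "green_region")]
    for line in cluster_do(ServerProfile(), nodes, "evacuate", role="blue_region"):
        print(line)
```

Keeping the mapping on the profile type means an operation such as 'evacuate' is only offered where it makes sense, which matches the point that these operations are specific to a profile type.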