#openstack-meeting log

13:00:35 <Qiming> #startmeeting senlin
13:00:36 <openstack> Meeting started Tue May 31 13:00:35 2016 UTC and is due to finish in 60 minutes.  The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:37 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:39 <openstack> The meeting name has been set to 'senlin'
13:00:56 <Qiming> evening
13:01:07 <xuhaiwei__> hi
13:01:15 <elynn> o/
13:02:01 <Qiming> #topic newton work items
13:02:20 <Qiming> yanyan won't be able to join today due to family reasons
13:02:32 <Qiming> tempest
13:02:50 <elynn> tempest api gate job is enabled
13:02:57 <Qiming> gate job added and enabled as experimental, yes
13:03:00 <elynn> in experimental queue
13:03:09 <Qiming> removing line 6-7
13:03:16 <elynn> negative tests are slow...
13:03:34 <Qiming> event show test is in, right?
13:03:39 <elynn> yes
13:04:10 <elynn> I will continue working on negative tests in my spare time.
13:04:14 <Qiming> I suggest we do finer-grained tests for cluster actions
13:04:52 <Qiming> one of the reasons we didn't document api using openapi is that we have many cluster actions, all on the same uri
13:05:33 <elynn> So you suggest to test rest cluster actions?
13:05:43 <Qiming> those actions differ from each other in their parameters; better to test each and every one of them because they are all different apis
13:06:22 <Qiming> yes, cluster_add_node cluster_del_node cluster_resize ... etc
13:06:41 <elynn> Okay, I will work on them.
13:06:48 <elynn> Do we need to test all parameters?
13:06:51 <Qiming> thanks
13:06:58 <elynn> for each cluster action
13:07:05 <Qiming> by the way, I spent quite some time reworking the tempest test cases
13:07:34 <elynn> I saw that, you moved some functions to util.py
13:07:48 <Qiming> separating util functions out; and use setUp and addCleanup for test case preparation
13:07:53 <elynn> Thanks for doing that!
13:08:08 <Qiming> I don't get the idea of doing resource_setup classmethod calls
13:08:38 <Qiming> it is very cumbersome to do cls.profile = ... then later reference it as self.profile
13:09:01 <elynn> Yes, I thought it's weird...
13:09:05 <Qiming> I think adding new ones would be much easier now
13:09:10 <elynn> Most of them are classmethod...
13:09:37 <elynn> Thanks for that job
13:09:43 <Qiming> I started trying on a few of them and it worked, so I extended the revision to all tests
13:10:00 <Qiming> I have removed some validations after api triggering
13:10:26 <Qiming> because they were making the api test impure, i.e. making them more like functional tests
13:11:02 <elynn> Saw that too, some of the tests use two or more APIs in one test.
13:11:13 <Qiming> once the whole collection of api tests is in place, we may want to make the gate voting
13:11:31 <elynn> I will follow your modifications when adding new tests.
13:11:34 <Qiming> or, maybe we can enable it now
13:11:39 <Qiming> thanks
13:12:03 <elynn> Okay, I will submit a patch to enable it later :)
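A minimal sketch of the setUp/addCleanup style discussed above, using only the standard unittest module. FakeClient and its methods are hypothetical stand-ins for illustration, not the Senlin tempest client or the actual util.py helpers.

```python
import unittest


class FakeClient:
    """Hypothetical stand-in for an API client (not the real Senlin client)."""

    def __init__(self):
        self.profiles = {}

    def create_profile(self, name):
        self.profiles[name] = {"name": name}
        return self.profiles[name]

    def delete_profile(self, name):
        self.profiles.pop(name, None)


CLIENT = FakeClient()


class TestProfileShow(unittest.TestCase):
    def setUp(self):
        super().setUp()
        # The fixture lives on the instance, so there is no need to set
        # cls.profile in a classmethod and read it back as self.profile.
        self.profile = CLIENT.create_profile("test-profile")
        # Cleanup is registered next to creation and also runs when the
        # test itself fails.
        self.addCleanup(CLIENT.delete_profile, "test-profile")

    def test_profile_show(self):
        self.assertEqual("test-profile", self.profile["name"])


def run_tests():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestProfileShow)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With this shape, adding a new test mostly means adding one more test method; the per-test resource is created and removed around each of them.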
13:12:08 <Qiming> lixinhui_, there?
13:12:17 <lixinhui_> Yes, Qiming
13:12:31 <Qiming> hi, any news from stress tests?
13:12:46 <lixinhui_> Sorry, Qiming
13:12:50 <Qiming> np
13:13:00 <Qiming> good news is that #138453 is in
13:13:06 <lixinhui_> We are still focusing on integrating Senlin with VIO 2.5
13:13:26 <Qiming> now we can add some more rally tests when yanyan gets cycles
13:13:27 <lixinhui_> but I am thinking to delay it
13:14:16 <Qiming> okay, let's switch to that later
13:14:35 <Qiming> health management ... no progress last week
13:14:47 <lixinhui_> not really
13:14:50 <lixinhui_> Qiming
13:14:57 <lixinhui_> I am thinking
13:15:13 <Qiming> i have refactored the health_manager code last week, to make room for event listener
13:15:17 <Qiming> oh?
13:15:17 <lixinhui_> to use heat stack installing the linux ha
13:15:19 <lixinhui_> agent
13:15:41 <Qiming> into vm instances?
13:15:52 <lixinhui_> is that a right way to think that?
13:15:55 <lixinhui_> yes
13:15:59 <lixinhui_> to help fencing
13:16:00 <Qiming> that is one option
13:16:20 <Qiming> fencing has to be done on physical servers, which is beyond heat control
13:16:30 <lixinhui_> do we have other choice?
13:16:55 <lixinhui_> I know
13:17:09 <lixinhui_> just still need to install an agent in the vm
13:17:17 <lixinhui_> for vm level control, right?
13:17:22 <Qiming> I have an impression that nova has some work on fencing interface, but cannot recall the details at the moment
13:17:43 <lixinhui_> did get any info about that
13:17:48 <lixinhui_> did not
13:17:48 <Qiming> ... I am not a fan of installing things into VMs
13:18:04 <Qiming> not at this stage at least
13:18:04 <lixinhui_> I see
13:18:26 <lixinhui_> so I am trying to find out if there is any other choice
13:18:27 <Qiming> our next topic on agenda will touch this topic
13:18:32 <Qiming> okay
13:18:45 <lixinhui_> okay
13:18:50 <lixinhui_> go ahead
13:19:07 <Qiming> no comment received on the senlin-ha-recover etherpad
13:19:11 <Qiming> moving on
13:19:39 <Qiming> no news about documentation last week, we have done api-ref migration I believe
13:19:50 <Qiming> the new site is up and looks great so far
13:20:02 <Qiming> container support
13:20:19 <xuhaiwei__> yes, i am working on adding docker driver
13:20:28 <Qiming> saw that, xuhaiwei__
13:20:29 <Qiming> thanks
13:21:00 <xuhaiwei__> need to get the ip from nova server
13:21:08 <Qiming> let's see if our driver work can help shape the API design of higgins
13:21:36 <Qiming> yes, you will get that info
13:21:37 <xuhaiwei__> currently I am concerned about one thing
13:21:47 <Qiming> when node-create, or cluster-create is triggered
13:21:59 <Qiming> the profile will get those information filled in
13:22:23 <xuhaiwei__> that is Senlin supports two kinds of nodes, nova server and heat stack, heat stack node can contain more than one nova server
13:22:27 <Qiming> it is like the region_name, scheduler_hints we fed to nova server
13:22:57 <Qiming> then we don't support specifying a heat stack cluster as the hosting cluster for containers
13:23:22 <Qiming> that is the easy, quick decision
13:23:40 <xuhaiwei__> ok
13:23:51 <Qiming> in the long run, I think we can re-enable heat stack clusters to play this role, provided it has a OS::Server output
13:24:01 <xuhaiwei__> maybe for the first step, we care about nova server node only
13:24:23 <Qiming> I have an impression that a heat stack has a secret output attribute allowing to treat that stack as a nova server
13:24:40 <Qiming> if you really need that, please dig it out
13:24:51 <Qiming> as the first step, it is fine
13:25:18 <Qiming> will take a look at the driver code and think about it, ... how can we generalize that
13:25:36 <Qiming> no progress on engine work
13:25:39 <xuhaiwei__> In fact when I did the demo for the summit session, I used heat stack output directly to get the server's ip
13:25:52 <Qiming> yes, it is possible
13:26:06 <xuhaiwei__> it's convenient
13:26:20 <Qiming> so long as we control the heat template ... to ensure that the template has server ip in its output
13:26:51 <Qiming> please keep on exploring that
13:26:52 <xuhaiwei__> yes, that's kind of forcing user to do it
13:27:03 <Qiming> and feel free to call for discussions on details
13:27:12 <xuhaiwei__> ok
13:27:18 <Qiming> it is not generic, but still doable, :)
13:27:36 <xuhaiwei__> yes
13:27:44 <Qiming> zaqar support, em ... I'm thinking if we can get some support from fei long on that
13:28:12 <Qiming> there has been a spec proposal
13:28:13 <Qiming> https://review.openstack.org/#/c/318202/3/specs/newton/mistral-notifications.rst
13:28:50 <Qiming> haven't got time to catch up on the review history
13:29:11 <Qiming> if you are interested in connecting the dots, that might be an interesting thread to follow
13:29:25 <Qiming> moving on
13:29:32 <Qiming> event/notifications
13:29:46 <Qiming> that is a bigger topic than I imagined
13:30:17 <Qiming> so after reading the nova specs, I figured that we need to get oslo.versionedobjects landed first
13:30:38 <Qiming> all senlin db objects then can be represented as a versioned object
13:31:12 <Qiming> we can isolate the db changes from senlin-engine/senlin-api, so in future, live upgrade of the service is possible
13:32:06 <Qiming> when working on that, I was also hoping that we can use o.vo to model API requests, which should be versioned as well, and notifications, which need versioning too
13:32:58 <Qiming> notification's priority is higher than requests because we got requirements to notify other software what happened in senlin
13:33:27 <Qiming> that is a key interface for integrating with existing software
13:33:41 <Qiming> will keep working on that in coming weeks
13:33:58 <Qiming> so ... that's all from the agenda
13:34:09 <Qiming> the first topic
13:34:21 <Qiming> questions/comments?
13:35:13 <Qiming> while implementing the o.vo, there were two blockers ...
13:35:22 <Qiming> can share with you as experiences
13:35:41 <elynn> That's great!
13:35:44 <Qiming> one is that many databases store DateTime fields without time zone info
13:35:51 <Qiming> including mysql
13:36:15 <Qiming> but o.vo is forcing the DateTimeField to carry TZ info by default
13:36:47 <Qiming> to solve this conflict, you can either turn off TZ in o.vo, or enable TZ at sqlalchemy layer
13:36:55 <Qiming> I chose the latter
13:37:33 <Qiming> another one is about obj_name, which was a column in the event table in senlin db
13:37:53 <Qiming> but o.vo VersionedObject has an obj_name method
13:38:02 <Qiming> yes, the same name
13:38:13 <elynn> Then?...
13:38:16 <Qiming> so we had to change the db schema to make things smooth
13:38:44 <Qiming> that is the reason why we have version 5 of db migration
13:39:06 <Qiming> we changed obj_id, obj_name, and obj_type to oid, oname and otype respectively
13:39:33 <Qiming> still working on some complaints about some IDs not being UUID format
13:39:53 <Qiming> but anyway, stricter checks do help make the code stable
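A small illustration of the DateTime mismatch described above, using only the Python standard library. It does not use oslo.versionedobjects or SQLAlchemy; to_aware_utc and can_compare are hypothetical helper names, and the sketch assumes the naive values read from the database are in UTC.

```python
from datetime import datetime, timezone


def to_aware_utc(value):
    """Attach UTC to a naive datetime; convert an aware one to UTC."""
    if value.tzinfo is None:
        # Assumption: naive values coming from the database are UTC.
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def can_compare(a, b):
    """Return True if ordering a against b does not raise TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


# A naive value, as a database without time zone support would return it.
naive = datetime(2016, 5, 31, 13, 0, 35)
# The same instant with explicit time zone info attached.
aware = to_aware_utc(naive)
```

Python refuses to order a naive datetime against an aware one, which is why one side has to be normalized, either where the field is validated or where values are loaded from the database.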
13:40:07 <Qiming> #topic senlin cluster-do operation
13:40:35 <Qiming> during a discussion with lixinhui_ and a long 4.5-hour meeting today with a customer
13:40:58 <Qiming> I'm feeling an urgent need to add the cluster_do operation to senlin-api
13:41:22 <xuhaiwei__> what is it?
13:41:44 <Qiming> in general, it is an API that allows users to do things they want on all or some specific nodes in a cluster (focus on nova server cluster now)
13:42:13 <Qiming> I'm thinking of three layers of "things to do" at the moment
13:42:43 <Qiming> layer one: operations exposed by the backend drivers, thus implementable in senlin profiles
13:43:02 <Qiming> e.g. nova evacuate, nova reboot, nova shelve, ...
13:43:24 <Qiming> these operations are specific to a profile type
13:43:40 <Qiming> e.g. 'evacuate' only makes sense to nova servers
13:44:21 <Qiming> we can augment senlin profile types by adding an 'operations' property that captures the operations a backend can understand
13:44:30 <Qiming> then a user can do this:
13:44:46 <Qiming> senlin cluster-do evacuate <cluster_id>
13:44:50 <Qiming> or
13:45:13 <Qiming> senlin cluster-do evacuate --role blue_region <cluster_id>
13:45:40 <xuhaiwei__> sounds reasonable
13:45:54 <Qiming> it is a batch operation you can perform on a cluster, and the operation logic is programmed into the profile type implementation
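A rough sketch of how layer one could dispatch, under stated assumptions: Node, ServerProfile, the do_<operation> naming and the cluster_do function are all hypothetical names for illustration, not existing Senlin code. Only the idea of an 'operations' property on the profile type and the optional role filter come from the discussion above.

```python
class Node:
    """Hypothetical cluster member with an optional role."""

    def __init__(self, name, role=None):
        self.name = name
        self.role = role


class ServerProfile:
    """Hypothetical profile type for nova servers."""

    # Operations this backend understands (the proposed 'operations' property).
    operations = ("evacuate", "reboot")

    def do_evacuate(self, node):
        # A real implementation would call the compute driver here.
        return "evacuate " + node.name

    def do_reboot(self, node):
        return "reboot " + node.name


def cluster_do(profile, nodes, operation, role=None):
    """Run one profile operation on all nodes, or only on nodes with a role."""
    if operation not in profile.operations:
        raise ValueError("unsupported operation: " + operation)
    handler = getattr(profile, "do_" + operation)
    targets = [n for n in nodes if role is None or n.role == role]
    return [handler(n) for n in targets]
```

Here cluster_do(ServerProfile(), nodes, "evacuate", role="blue_region") would correspond to the second command line above, and an operation missing from the profile type is rejected before any node is touched.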
13:46:28 <Qiming> layer two: run some user specified scripts on a cluster (still nova cluster here)
13:46:59 <Qiming> senlin cluster-do --script <install_linux_ha.sh> <cluster_id>
13:47:41 <Qiming> senlin can take the specified script and scp that code to each and every cluster node, and ssh to those nodes for execution
13:48:16 <Qiming> it is a generic logic, that can be used to run any shell scripts
13:48:48 <Qiming> its function is comparable to the software-config and software-deployment, but with fewer constraints
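The scp-then-ssh flow of layer two might look like the following sketch. It only builds the command lines and does not run them; build_script_commands, the default user "ubuntu" and the remote directory "/tmp" are assumptions for illustration, and key management is left out entirely.

```python
import posixpath
import shlex


def build_script_commands(script_path, hosts, user="ubuntu", remote_dir="/tmp"):
    """Return one (scp_cmd, ssh_cmd) pair per host as argument lists."""
    remote = posixpath.join(remote_dir, posixpath.basename(script_path))
    plans = []
    for host in hosts:
        target = user + "@" + host
        # Step 1: copy the script to the node.
        scp_cmd = ["scp", script_path, target + ":" + remote]
        # Step 2: run it there; quote the path for the remote shell.
        ssh_cmd = ["ssh", target, "sh " + shlex.quote(remote)]
        plans.append((scp_cmd, ssh_cmd))
    return plans
```

Each pair could then be handed to subprocess.run in order, stopping for a node when the copy step fails.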
13:49:24 <Qiming> layer three: enable senlin to run an ansible playbook directly
13:50:01 <Qiming> senlin cluster-do --playbook install_mysql_cluster.yml <cluster_id>
13:50:41 <Qiming> behind the scenes, you can imagine, senlin is calling ansible to run the playbooks
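One possible shape for layer three, sketched with the ansible-playbook command line rather than Ansible's Python API. build_playbook_command is a hypothetical helper; it only assembles the argument list from the node addresses and does not invoke Ansible.

```python
def build_playbook_command(playbook, hosts):
    """Build an ansible-playbook command for an ad hoc list of hosts."""
    if not hosts:
        raise ValueError("cluster has no nodes to run the playbook on")
    # A comma-separated host list with a trailing comma is treated by
    # ansible-playbook as an inline inventory rather than a file path.
    inventory = ",".join(hosts) + ","
    return ["ansible-playbook", "-i", inventory, playbook]
```

The resulting list can be passed to subprocess.run; how SSH keys reach the process is a separate question that this sketch does not address.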
13:51:24 <lixinhui_> very useful
13:52:00 <Qiming> this is a measure to save those guys trapped by heat softwareconfig/deployments
13:52:23 <Qiming> we worked with them a long time ago to set up a multi-tiered enterprise application
13:53:02 <Qiming> at least layer one and layer two will be useful for your use case, lixinhui_, right?
13:53:18 <lixinhui_> to be honest
13:53:27 <lixinhui_> I wanna layer3
13:53:36 <Qiming> :D
13:53:37 <lixinhui_> or a super CLI
13:54:03 <lixinhui_> you know
13:54:10 <lixinhui_> install some agent
13:54:17 <lixinhui_> or remove some agent
13:54:57 <Qiming> yes, it would be very useful and very convenient to provision application HA
13:55:46 <lixinhui_> :)
13:56:14 <Qiming> I have already tried to invoke ansible from a python program ... it works
13:56:22 <lixinhui_> cool
13:56:31 <Qiming> though I need to figure out how to manage the keys
13:56:57 <Qiming> I'm not a super fan of layer 3, to be honest
13:57:22 <lixinhui_> reasons
13:57:28 <Qiming> it sounds like a wrapper around ansible; if users already have some ansible playbooks, they may want to use ansible directly
13:57:35 <lixinhui_> yes
13:58:29 <Qiming> the advantage of layer 3 ... is that with such an API, it can be exposed to horizon
13:58:46 <lixinhui_> yes
13:58:54 <Qiming> a user can paste their playbook directly into the web page
13:59:11 <Qiming> or paste the link to the playbook into the web page, and click 'run', :)
13:59:21 <Qiming> oh ... time's up
13:59:27 <lixinhui_> ...
13:59:27 <Qiming> have to free the channel
13:59:38 <lixinhui_> okay, talk more tomorrow
13:59:42 <Qiming> thanks for your time, guys, you are always good listeners, :D
13:59:48 <Qiming> good night
13:59:55 <Qiming> #endmeeting