13:00:23 <Qiming> #startmeeting senlin
13:00:23 <openstack> Meeting started Tue Jul 26 13:00:23 2016 UTC and is due to finish in 60 minutes. The chair is Qiming. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:00:24 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:27 <openstack> The meeting name has been set to 'senlin'
13:00:33 <zzxwill> Hello.
13:00:36 <yanyanhu> hi
13:00:38 <elynn> o/
13:00:38 <Qiming> #topic roll call
13:01:07 <Qiming> hello
13:01:30 <yanyanhu> o/
13:01:47 <elynn> Evening!
13:01:51 <Qiming> xinhui or haiwei online?
13:01:59 <lixinhui_> Yes
13:02:06 <lixinhui_> Just jumped in
13:02:24 <Qiming> you jumped in beautifully
13:02:30 <lixinhui_> :)
13:02:36 <Qiming> #topic newton work items
13:02:47 <Qiming> #link https://etherpad.openstack.org/p/senlin-newton-workitems
13:03:13 <Qiming> any progress on stress testing last week?
13:03:57 <Qiming> I saw yanyan's rally work blocked by the release-cut window
13:04:07 <yanyanhu> Qiming, yes
13:04:27 <yanyanhu> still waiting, guess we still need to wait for a while
13:04:42 <Qiming> is that patch the last one we will "beg" rally to merge in?
13:05:03 <yanyanhu> :)
13:05:11 <yanyanhu> Qiming, it is not necessary to add all of them into the rally repo
13:05:22 <Qiming> yep
13:05:25 <yanyanhu> but letting them stay on the rally side is better than keeping them inside senlin
13:05:35 <Qiming> benefit?
13:05:44 <yanyanhu> so I will first add the plugins into our repo and migrate them into rally gradually
13:05:54 <yanyanhu> we don't need to hold them by ourselves
13:06:12 <Qiming> but it will still be the senlin team maintaining them
13:06:20 <yanyanhu> sure
13:06:33 <Qiming> then what's the benefit?
13:07:02 <yanyanhu> just that once there is some structural refactoring inside rally, we will know at once if it breaks the senlin plugin, I think
13:07:03 <Qiming> hopefully, we won't forget adding/modifying rally jobs when we change things ...
13:07:11 <yanyanhu> sure
13:07:21 <Qiming> that testing can be done at the senlin gate as well
13:08:09 <Qiming> a little bit upset by the slow reviews there
13:08:20 <yanyanhu> Qiming, yes, me too...
13:08:47 <yanyanhu> looks like the team doesn't have enough bandwidth for all these reviews...
13:09:15 <yanyanhu> but we do get lots of important comments :)
13:09:33 <yanyanhu> they help improve my patches and give me a better understanding of rally
13:09:39 <Qiming> anyway, I agree we should do this in the senlin repo at first, then migrate to rally step by step
13:09:51 <yanyanhu> Qiming, yes
13:09:53 <yanyanhu> this is my plan
13:10:00 <Qiming> great
13:10:23 <Qiming> any other updates about benchmarking/performance testing?
13:10:56 <Qiming> guess no
13:10:57 <Qiming> moving on
13:10:59 <yanyanhu> no other progress I think
13:11:02 <Qiming> health management
13:11:40 <Qiming> I spent some time reading the oslo.messaging code
13:11:47 <Qiming> 2 findings
13:12:27 <Qiming> 1. the transport used for listeners is supposed to be different from the one used for RPC; that one has been fixed, although we still got a working listener there somehow
13:13:21 <Qiming> 2. when invoking 'get_notification_listener', we have an opportunity to specify the 'executor'
13:13:28 <Qiming> which defaults to 'blocking' today
13:13:44 <Qiming> other choices are 'threading' and 'eventlet'
13:14:13 <lixinhui_> oh?
13:14:14 <Qiming> I tried them both but had to revert to 'blocking' for the listeners to work properly
13:14:48 <lixinhui_> what do the other two mean?
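[Editor's note: the executor choices discussed here ('blocking', 'threading', 'eventlet') come from the futurist package, which oslo.messaging uses to dispatch incoming notifications. The following is a minimal sketch of the behavioral difference only, using stdlib concurrent.futures as a stand-in; the handler and function names are hypothetical, not senlin or oslo.messaging code.]

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical notification handler; a real oslo.messaging endpoint would
# expose methods like info(ctxt, publisher_id, event_type, payload, metadata).
def handle_event(event):
    return "handled %s" % event

def dispatch_blocking(events):
    # 'blocking' executor style: each event is processed in the caller's
    # thread, so the poll loop cannot pick up the next message until the
    # handler returns.
    return [handle_event(e) for e in events]

def dispatch_threaded(events):
    # 'threading' executor style: events are handed to a thread pool and
    # the poll loop keeps running; results are collected asynchronously.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(handle_event, e) for e in events]
        return [f.result() for f in futures]

if __name__ == "__main__":
    events = ["instance.create", "instance.delete"]
    print(dispatch_blocking(events))
    print(dispatch_threaded(events))
```

The 'eventlet' option is the same idea with green threads instead of OS threads; as noted below, only 'blocking' worked reliably for the senlin listeners at the time.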
13:14:58 <Qiming> the only pitfall is that we will get a warning from oslo.messaging saying that our listener may hang forever listening to events
13:15:04 <Qiming> that is acceptable
13:15:23 <Qiming> they are imported from the 'futurist' package directly
13:15:35 <Qiming> that package provides options to execute tasks in different flavors
13:15:47 <Qiming> I don't have a lot of bandwidth to dig into that
13:15:56 <lixinhui_> ok
13:16:07 <Qiming> if anyone is interested in this, here is the doc: http://docs.openstack.org/developer/futurist/api.html#executors
13:16:36 <Qiming> that is how oslo.messaging dispatches events
13:16:52 <lixinhui_> ok
13:16:54 <Qiming> LB bug fix, any news there?
13:17:26 <lixinhui_> Two of the three patches have been accepted
13:17:35 <lixinhui_> still this one: https://review.openstack.org/325624
13:17:43 <Qiming> btw, someone stopped by on the senlin channel asking for a working version of the health policy
13:17:56 <Qiming> he said he watched our presentation at the austin summit
13:18:15 <lixinhui_> oh
13:18:16 <lixinhui_> I can provide one
13:18:18 <Qiming> that is ringing a loud alarm for me
13:18:25 <lixinhui_> if he or she needs it
13:18:46 <Qiming> we should be very very very careful when delivering presentations/demos
13:18:47 <lixinhui_> Adam has some concerns
13:19:09 <Qiming> unless we can ensure users can reproduce the demo easily using the public code base
13:19:42 <lixinhui_> I think you have revised the health policy from WIP
13:19:44 <lixinhui_> right?
13:20:03 <Qiming> or else, we will have difficulties attracting them to come back
13:20:17 <Qiming> that health policy is still not working
13:20:31 <Qiming> the loop is not closed
13:20:37 <Qiming> and fencing is not there yet
13:21:13 <Qiming> people will git clone it, try it, and see that it doesn't work
13:21:16 <Qiming> then they leave
13:22:05 <Qiming> so ... for the coming barcelona presentations, no matter which one(s) are accepted
13:22:16 <Qiming> the demos used in those talks must work
13:22:49 <Qiming> the code/profile/policy has to show up in the main tree
13:23:39 <Qiming> I'll spend time on health management this week
13:23:51 <Qiming> try to close the loop asap
13:24:07 <Qiming> let's move on?
13:24:17 <yanyanhu> one question
13:24:24 <Qiming> shoot
13:24:28 <yanyanhu> does https://review.openstack.org/345916 fix the issues xinhui mentioned?
13:24:43 <yanyanhu> about wait after the listener is started
13:24:43 <yanyanhu> https://review.openstack.org/346390
13:24:45 <yanyanhu> this one
13:25:06 <Qiming> pls check the bug report
13:25:17 <Qiming> https://launchpad.net/bugs/1605869
13:25:17 <openstack> Launchpad bug 1605869 in senlin "hang: wait is waiting for stop to complete" [Undecided,In progress] - Assigned to Cindia-blue (miaoxinhuili)
13:25:36 <Qiming> it is not an error reported by oslo.messaging
13:25:50 <Qiming> oslo.messaging is too smart in this respect
13:26:14 <Qiming> when it detects we didn't set a timer when calling wait()
13:26:29 <Qiming> it will warn us that the listener may listen forever
13:26:30 <yanyanhu> I see
13:26:33 <Qiming> thus a 'hang'
13:26:49 <Qiming> actually, that is what we wanted in a listener thread
13:27:05 <Qiming> a dedicated listener thread
13:27:23 <Qiming> okay, moving on
13:27:26 <yanyanhu> we just need to ensure stop is explicitly called before stopping the health manager
13:27:39 <Qiming> yep, that would be desirable
13:27:58 <Qiming> however, in a multi-engine setup, we don't have a way to gracefully shut down all threads
13:28:24 <Qiming> if we start a single engine, we can see that all threads are gracefully killed
13:28:33 <Qiming> that is a broader problem to solve
13:28:43 <yanyanhu> yes, it is
13:28:56 <Qiming> moving on, documentation
13:29:24 <Qiming> I'm working on the tutorial documentation for autoscaling today
13:29:44 <Qiming> to make auto-scaling work, I am using ceilometer
+ aodh + senlin
13:29:59 <Qiming> many interesting/annoying findings
13:30:16 <Qiming> but finally, I got auto-scaling with cpu_util working now
13:30:26 <Qiming> though I knew in theory it should work
13:30:35 <joehuang> exit
13:30:40 <Qiming> let me share some findings with you:
13:31:19 <Qiming> 1. aodh alarm-update cannot process --query parameters properly; we have to get --query specified properly when doing 'aodh alarm create'
13:32:04 <Qiming> 2. recent modifications to python-openstacksdk are breaking server details retrieval
13:32:05 <yanyanhu> sounds like a bug?
13:32:24 <Qiming> we cannot get the 'image' and 'flavor' properties if we are using latest master
13:32:33 <yanyanhu> means senlin node-show -D will break as well?
13:32:56 <Qiming> yes, that one was broken as well
13:33:06 <yanyanhu> I see...
13:33:10 <Qiming> I have rebased senlin resources to resource2/proxy2
13:33:37 <yanyanhu> great
13:33:38 <Qiming> https://review.openstack.org/#/c/344662/
13:34:12 <Qiming> to make that work, I spent a lot of time discussing with the sdk team about the 'to_dict()' method which was removed from resource2.Resource
13:34:23 <Qiming> its removal would break all senlinclient resource show commands
13:34:36 <yanyanhu> yes, think so
13:34:46 <yanyanhu> we use [''] now
13:35:01 <Qiming> if you are interested in this, you can check the review history: https://review.openstack.org/#/c/331518/
13:35:12 <yanyanhu> not a backward compatible change
13:35:23 <Qiming> it took about one month to get that accepted
13:36:11 <Qiming> back to the auto-scaling experiment
13:36:23 <yanyanhu> yes, noticed the discussion between you and brian
13:36:27 <yanyanhu> will check it :)
13:36:52 <Qiming> this is how I created an alarm:
13:36:53 <Qiming> aodh alarm create -t threshold --name c1a1 -m cpu_util --threshold 50 --comparison-operator gt --evaluation-periods 1 --period 60 --alarm-action http://node1:8778/v1/webhooks/518fc9b7-01e8-410a-ac34-59fb33cb398f/trigger?V=1 --repeat-actions True --query metadata.user_metadata.cluster=113707a0-8fdc-434f-b824-98fd706a5e0d
13:37:25 <Qiming> the tricky part is the --query parameter; it is not well documented, and it is using 'pyparsing'
13:37:36 <Qiming> the docs say that '==' can be used, but it won't work
13:38:00 <Qiming> no one tells you that you should use 'metadata.user_metadata.cluster' for filtering
13:38:08 <yanyanhu> well, inconsistency in documents again...
13:38:18 <Qiming> had to read the source code to get it to work
13:38:38 <yanyanhu> sure, I did that two and a half years ago
13:38:47 <yanyanhu> when I first tried filtering in ceilometer
13:38:48 <Qiming> even after this step, you won't get an alarm
13:38:55 <yanyanhu> still happening :)
13:39:27 <Qiming> because in all the cpu_util samples, you won't see the nova metadata included
13:39:48 <Qiming> then ceilometer cannot evaluate the samples, and aodh cannot fire an alarm
13:40:10 <yanyanhu> looks weird
13:40:12 <Qiming> after reading the source code, I figured out that I have to add one line into the ceilometer.conf file:
13:40:23 <Qiming> reserved_metadata_keys = cluster
13:40:39 <yanyanhu> what does that mean?
13:40:40 <Qiming> after that, restart the ceilometer compute agent
13:41:17 <Qiming> the ceilometer compute pollster will now know that the 'cluster' value in the nova metadata should be preserved
13:41:49 <Qiming> or else, ceilometer drops all metadata key-values, unless the keys are prefixed with 'metering.'
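[Editor's note: a sketch of the ceilometer.conf fragment implied by the discussion above. Only the reserved_metadata_keys line is quoted verbatim from the meeting; its placement in [DEFAULT] is an assumption, since no section name was mentioned.]

```ini
# ceilometer.conf on the compute node (section placement assumed)
[DEFAULT]
# Preserve the 'cluster' key from nova instance metadata in samples, so
# that aodh queries on metadata.user_metadata.cluster can match. Without
# this, ceilometer drops metadata keys unless prefixed with 'metering.'.
reserved_metadata_keys = cluster
```

As noted above, the ceilometer compute agent must be restarted for this to take effect.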
13:41:58 <Qiming> I don't think this is documented anywhere
13:42:06 <yanyanhu> I see
13:42:21 <yanyanhu> I recall I met a similar problem before
13:42:27 <yanyanhu> at the end of 2014
13:42:46 <Qiming> I'll document the process in the tutorial doc, so users will know how to make the whole thing work
13:42:57 <yanyanhu> needed to do some hacks to address it
13:43:29 <yanyanhu> since this condition was not always satisfied
13:43:36 <Qiming> yep
13:44:09 <Qiming> since haiwei is not online and no one is working on container support, we can skip the container profile item
13:44:17 <yanyanhu> not a pleasant experience :)
13:44:25 <Qiming> engine, NODE_CREATE, NODE_DELETE
13:44:35 <Qiming> I think the problem is solved now
13:44:48 <yanyanhu> yes
13:44:54 <yanyanhu> saw those patches
13:44:59 <Qiming> I was thinking of deriving cluster actions from node actions so that policies would be respected
13:45:09 <Qiming> but it turned out to be too complicated
13:45:24 <yanyanhu> the current solution is good I think
13:45:34 <Qiming> I did a workaround, making policies aware of NODE_xxx actions
13:45:47 <Qiming> that makes things much clearer
13:45:54 <yanyanhu> yes, and differentiates node actions derived from different sources
13:46:00 <Qiming> so .. deleting that work item
13:46:26 <Qiming> yep, we had that design/impl in place, these patches were just leveraging them
13:46:40 <yanyanhu> yea
13:46:48 <Qiming> em ... need to add some release notes about this
13:46:56 <yanyanhu> right :)
13:47:07 <Qiming> the zaqar receiver thing
13:47:12 <Qiming> where are we?
13:47:15 <yanyanhu> no progress this week...
13:47:22 <yanyanhu> still pending on sdk support
13:47:30 <yanyanhu> and also document updating
13:47:50 <yanyanhu> I have done some local tests on the 'message' resource
13:47:52 <Qiming> if sdk support is in, we will get a working version soon?
13:47:56 <yanyanhu> but still some problems need to be fixed
13:47:59 <yanyanhu> to figure out
13:48:08 <Qiming> then grab wangfl
13:48:11 <yanyanhu> nope, it is just for queue
13:48:16 <yanyanhu> yea
13:48:23 <yanyanhu> he is working on that I think
13:48:27 <yanyanhu> saw his patch
13:48:29 <Qiming> okay
13:48:45 <Qiming> then continue grabbing him when necessary, :)
13:49:03 <Qiming> no update about event/notification since last week
13:49:08 <yanyanhu> sure :) we owe him a beer
13:49:17 <Qiming> ok
13:49:30 <Qiming> #topic newton deliverables
13:49:49 <Qiming> guys, if you take a look at the newton release schedule
13:49:51 <Qiming> #link http://releases.openstack.org/newton/schedule.html
13:50:04 <Qiming> you will see that we are at week R-10
13:50:36 <Qiming> that means we still have 10 weeks before the final 2.0.0 release
13:50:37 <yanyanhu> a month left
13:51:04 <Qiming> if we consider the newton-3 milestone, we only have 1 month
13:51:21 <yanyanhu> yes, for feature freeze
13:51:33 <Qiming> hopefully, we can deliver what we planned at the beginning of this cycle
13:51:59 <Qiming> e.g. profile-validate, policy-validate, cluster-collect, cluster-do, health policy, notification, container profile
13:52:20 <yanyanhu> also the message type of receiver
13:52:20 <elynn> I might get some spare time next week, hope we can finish that.
13:52:35 <yanyanhu> elynn, great :)
13:52:53 <Qiming> yep, time to step up and claim the items that most interest you
13:52:56 <yanyanhu> I know you are really trapped on some annoying stuff :)
13:53:07 <Qiming> that is life
13:53:13 <elynn> :)
13:53:17 <yanyanhu> yea
13:53:22 <yanyanhu> always :)
13:53:26 <Qiming> it was never meant to be easy for anybody
13:53:37 <lixinhui_> :)
13:54:07 <Qiming> glad we can still get things moving forward and even accomplish some things we feel good about
13:54:34 <Qiming> let's see what we can complete during the coming month
13:54:40 <Qiming> #topic open discussions
13:54:49 <yanyanhu> oh, BTW, about the mascot
13:55:01 <Qiming> right, I replied to their email
13:55:01 <yanyanhu> I guess forest?
13:55:04 <yanyanhu> :P
13:55:07 <Qiming> maybe just a forest
13:55:13 <yanyanhu> it's an obvious choice for us
13:55:22 <Qiming> that is what senlin means
13:55:29 <lixinhui_> agree
13:55:32 <Qiming> we still have choices
13:55:41 <Qiming> if you have some favorite animal
13:55:47 <elynn> yes! that is what senlin is :)
13:56:02 <yanyanhu> forest is straightforward :)
13:56:20 <yanyanhu> easy to understand, I think the picture we always use in the slides is ok
13:56:40 <Qiming> email from Heidi:
13:56:40 <Qiming> Thank you so much for the reply! Of course I won't mock you. Actually, I'm thrilled to know you already have a great mascot that works with this project. Senlin will have the first right of refusal on a forest since that's already your logo. You might want to discuss with your team whether you intend the trees in your forest to look deciduous, evergreen, or a specific variety (stands of Aspen, for example). That can help guide our illustrator to make a forest that reflects what you like.
13:56:41 <Qiming> Cheers,
13:56:41 <yanyanhu> hope no conflict with other projects :P
13:56:42 <Qiming> Heidi Joy
13:57:08 <yanyanhu> haha
13:57:17 <Qiming> deciduous, evergreen, or ...
13:57:48 <yanyanhu> evergreen sounds good, haha
13:57:50 <yanyanhu> for HA
13:57:57 <Qiming> good point
13:58:50 <Qiming> 2 minutes left
13:59:35 <Qiming> thanks for joining, boys and girls
13:59:37 <yanyanhu> no other topic from me
13:59:42 <yanyanhu> thanks
13:59:42 <Qiming> wish you all a happy night
13:59:45 <Qiming> a pleasant one
13:59:51 <yanyanhu> take good care of your baby :)
13:59:54 <elynn> :)
13:59:54 <Qiming> #endmeeting