16:00:07 #startmeeting Performance Team
16:00:08 Meeting started Tue Aug 23 16:00:07 2016 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:12 The meeting name has been set to 'performance_team'
16:00:22 I think we may start with organizational questions
16:00:28 as the new time for the meeting :)
16:00:34 #topic New time for the meeting
16:01:18 so we have several options
16:01:36 16:00 UTC (current time, really tough) vs 15:30 UTC vs 17:30 UTC vs 18:00 UTC
16:02:04 1200 EST isn't great.... :)
16:02:20 o/
16:02:28 rook indeed
16:02:48 from what I heard from people 30 minutes earlier (15:30 UTC) seems to be the best option for now
16:02:53 hey everyone :)
16:03:03 rohanion o/
16:03:50 15:30 UTC (11:30 EST, 8:30 PST) is something I've heard from AugieMena and ad_rien_ and other Inria folks at least
16:03:57 for French Inria Guys 15:30 UTC is a good choice - (17:30 can be a second one)
16:04:09 +1 to 1530
16:04:16 rook ack
16:04:51 I want to grad confirmation from klindgren - if at least he'll be able to attend from GoDaddy side on more or less weekly basis it'll be cool
16:05:02 I know that for harlowja it'll be too early :D
16:05:15 glad -> grab **
16:05:50 although these two are having right now a daily meeting ^^, so let's wait for their response later
16:06:24 #info preliminary agreed on moving the meeting 30 minutes earlier on Tuesdays, 15:30 UTC (11:30 EST, 8:30 PST)
16:06:49 rohanion am I right that for Moscow it'll be convenient as well?
16:07:04 rohanion it'll be 18:30
16:07:13 yeah, no problem
16:07:21 18:30 RTZ-2
16:07:30 ok, cool, I suspected this, but wanted to collect the information
16:07:48 ok, so I'll send a final update via email thread
16:08:11 so let's jump to the current progress on test plans
16:08:12 #topic Current progress on the planned tests
16:08:29 from Mirantis side we (finally) were able to move forward with labs configuration
16:09:12 our 200 nodes lab is under intensive Neutron testing for now (dataplane tests with huge enough number of Neutron objects and VMs)
16:10:05 and it looks like we finally were able to filter all non-working nodes from 500 nodes lab, so we'll have 400+ nodes Openstack cluster (installed by MOS)
16:10:28 so we'll run http://docs.openstack.org/developer/performance-docs/test_plans/control_plane/plan.html against it
16:10:37 hopefully this will start tomorrow
16:10:41 MOS ?
16:10:52 Mirantis OpenStack (way of deploying)
16:11:07 Ah MOS != Fuel ?
16:11:13 rook yep :D
16:11:35 MOS includes OpenStack packages with more fixes and backports
16:11:41 From inria side we made some progress on the 1000 nodes test plan
16:11:42 than usual stable/* usually have
16:11:57 but deployment tool itself is Fuel, yes
16:12:03 msimonin cool!
16:12:07 please share some details
16:12:11 but …
16:12:13 :)
16:12:16 :D
16:12:27 we still have some performance issue when scaling with 500+ computes
16:12:44 msimonin you use fake computes?
16:12:46 for the record we are using a kolla based deployment
16:12:51 o/
16:12:55 luzC o/
16:12:57 DinaBelova: yes fake drivers
16:13:02 msimonin: what sort of scaling issues? Control plane?
16:13:08 msimonin ok, so kolla + fake driver
16:13:12 what kind of issues?
16:13:16 hello Dina
16:13:31 actually, let me share a document that we are maintaining
16:13:40 msimonin it'll be cool
16:13:49 #link https://pad.inria.fr/p/FK18ZSyjFyLPITfE
16:14:31 msimonin ok, I see " very slow apis when using 500 computes"
16:14:32 apis + mariadb are behind the haproxy
16:14:38 yep
16:14:42 that's the issue
16:14:52 the issue is haproxy?
16:14:59 does it relate to all OpenStack REST APIs?
16:15:29 rook: maybe haproxy, but we need more insight
16:16:15 msimonin I suspect some parameters of HAproxy may influence here indeed on the performance, I'll agree with rook here
16:16:30 yes it's my feeling too
16:16:41 do you have some experience with haproxy ?
16:16:52 msimonin not too big sadly
16:16:53 msimonin do you know at what # of computes you start seeing the problem?
16:17:15 100 computes is ok, 500 isn't
16:17:24 well you go 5x :)
16:17:30 sure :)
16:17:43 msimonin I can suggest you to ping alwex (Alex Shaposhnikov) and try to discuss this with him, he may have some ideas
16:17:51 cool, I'll do that
16:17:54 msimonin we have a monitoring solution to help track down what might be slowing down
16:18:09 using collectd/graphite/grafana
16:18:28 might provide insight into the slowness or the possible culprit.
16:18:29 actually we are using cadvisor/influx/grafana
16:18:45 but it just keep track of basics metrics
16:18:51 #info Mirantis: Neutron testing on 200 nodes in progress, 400 nodes cluster almost installed for baseline performance testing
16:19:07 #info Inria facing issues with 1000 nodes emulation experiment: https://pad.inria.fr/p/FK18ZSyjFyLPITfE
16:19:09 msimonin: sure, we break out per-process
16:20:23 msimonin I think you may try the same method rook is talking about + let's ask alwex to help
16:20:56 #action DinaBelova msimonin alwex find out what's slowing down OpenStack REST APIs in the Inria 1000 nodes experiment
16:21:16 rook: do you have what metrics can help tracking the issue ?
16:21:22 do you know*
16:21:28 Since we built our deployment framework on top of Kolla, we get the ha-proxy by default…. DinaBelova do you think it can be possible to see how you are deploying the different containers on your side ?
16:22:03 For us it is not so complex/time consuming to redeploy a set of containers on top of G5K
16:22:07 ad_rien_ I think that's possible, let's anyway communicate this with Alex as he was originating the experiment
16:22:12 msimonin: since it seems to be based on the # of computes... we could look at the msging services (rabbit?), haproxy, and possibly the individual nova services. since each of the compute nodes will be checking in on an interval.
16:22:49 ad_rien_ fyi now we're playing with k8s and fuel-ccp (containerized control plane) to see if we can use it for the experiments and testing
16:23:16 rook: ack, rabbit may be the issue as well indeed
16:23:32 msimonin: honestly, it wouldn't shock me.
16:23:44 msimonin: what errors are you seeing?
16:23:49 ad_rien_ although I believe HAproxy is either not an issue at all or this can be fixed even without redeployment
16:24:29 rook: WARNING nova.openstack.common.loopingcall [-] task > run outlasted interval
16:24:35 in the nova-compute.log
16:24:49 yeah - could be msging for sure.
16:24:59 yes
16:25:11 msimonin so let's check this option as well
16:25:12 * rook isn't familiar with your control-plane setup
16:25:26 DinaBelova: actually to be fair with your experiments, we should remove it…. ok let's see whether we can fix that point by discussing with Alex.
16:25:36 ad_rien_ ack
16:25:48 (to be fair, i.e. to provide fair comparisons)
16:26:08 ad_rien_ msimonin I'll start separated email thread with Alex
16:26:18 DinaBelova: ack
16:26:21 DinaBelova: ack thanks
16:26:33 it looks like we may proceed for now
16:26:41 DinaBelova: could you also put me in CC
16:26:46 rcherrueau sure
16:26:52 thanks
16:26:59 ok, so let's jump to the OSprofiler
16:27:11 #topic OSProfiler weekly update
16:27:24 rohanion aignatev - the floor is yours
16:27:33 need review for https://review.openstack.org/#/c/340936/
16:27:47 other than that I'm not currently working with osprofiler, so no updates
16:27:50 ok, will look at it today
16:28:04 thanks!
16:28:18 aignatev I'll take a look as well, thanks
16:28:24 thank you
16:28:29 rohanion any updates from your side?
16:28:37 yep, writing :)
16:28:57 I'm working on enabling osprofiler in Fuel. basically I have to code a PoC that updates configuration and reloads the services.
16:29:10 this PoC will be turned into a Fuel plugin
16:29:34 Also I'm thinking about a set of profiling scenarios with single entry-point and results in table format.
16:29:38 rohanion I believe you found out that's a bit overkill solution?
16:29:52 I mean fule plugin?
16:29:55 fuel*
16:30:48 yes, a little. but still it will be a good option to set all the needed variables in a nice UI
16:30:49 or you decided it's not too difficult to develop this profiling plugin?
16:31:07 rohanion ack
16:31:28 it will be tough but I think I'll handle that.
16:31:42 bummer, will it be written to only work with Fuel?
16:31:44 not OOO?
16:31:49 no updates regarding osprofiler itself
16:31:54 rook, no
16:32:03 no to only Fuel ?
16:32:10 the script will work with vanilla openstack
16:32:14 ah ha cool
16:32:26 rook it's just a way for us to deploy it automatically using fuel, but it works with any OpenStack and is automated to be used with devstack
16:32:36 the plugin will just launch this script, sending all necessary parameters
16:32:51 cool rohanion - have a link to this work
16:33:35 ok, rohanion is this all for now?
16:33:46 I'll publish the changes as soon as I write the code :)
16:33:51 yes, that's everything
16:33:54 ok, cool
16:34:01 so we may jump to open discussion :)
16:34:06 #topic Open Discussion
16:34:45 anything else to share?
16:35:03 We really want to contribute to the performance documentation
16:35:12 rcherrueau you're really welcome :)
16:35:14 do you think it's possible
16:35:25 rcherrueau of course :)
16:35:37 that's usual enough contribution process
16:35:41 to the repository
16:35:48 for instance, do you think it's possible to add a specific subsection in the labs section with information about our grid?
16:35:52 #link https://github.com/openstack/performance-docs
16:35:57 rcherrueau for sure
16:36:03 to have all info shared
16:36:09 cool
16:36:18 and you'll mention this lab in the test plans/test results
16:36:32 please what is the process to do so
16:36:47 rcherrueau you need to upload several commits
16:36:57 commit #1 - regarding the lab description
16:37:03 lemme find an example
16:37:14 yes thanks
16:37:29 rcherrueau like this https://review.openstack.org/#/c/292835/
16:37:38 but regarding your lab
16:37:58 ok DinaBelova I think we got the information we were looking for
16:38:06 ok thank you DinaBelova
16:38:11 and second one with your results, as you're going to use the same test plan as we did
16:38:39 rcherrueau https://review.openstack.org/#/c/324806/ - test results part
16:38:39 yes modulo the way we perform(ed) the experiments
16:38:47 ad_rien_ indeed
16:39:30 the overall documentation is free form, the only rule is to set test plans regarding the template https://github.com/openstack/performance-docs/blob/master/doc/source/test_plans/template.rst
16:39:40 but you're not going to propose your own test plan :)
16:39:48 so feel free to skip my last message :)
16:39:53 ok we gonna give a look
16:40:12 OK great, that's all for us.
16:40:17 ad_rien_ sure, will take a look with great pleasure :)
16:40:43 anything else here? from my side I have nothing more to share
16:41:00 not sure whether it will be done for the next week as right now we are focusing on kolla related issues but once we get relevant results we will commit them
16:41:15 ad_rien_ ack, thanks for the update
16:41:44 ok, it looks like we're done for now
16:41:51 please wait for the meeting time update email
16:42:01 I think we'll stop with 15:30 UTC
16:42:16 thanks folks for coming and participation
16:42:21 see you
16:42:25 #endmeeting
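
The 15:30 UTC slot settled on in the meeting can be checked against the local times quoted in the log with a short Python sketch. The fixed UTC offsets below are assumptions, not taken from the log: they are the daylight-saving offsets in effect in late August 2016, which is what the "11:30 EST" and "8:30 PST" figures in the discussion correspond to. The helper name local_times and the place labels are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed UTC offsets (hours) for late August 2016.
# They are hard-coded for illustration and ignore later DST changes.
OFFSETS_HOURS = {
    "US Eastern": -4,  # daylight-saving time
    "US Pacific": -7,  # daylight-saving time
    "Paris": 2,        # Central European Summer Time
    "Moscow": 3,
}


def local_times(utc_hour, utc_minute):
    """Return the local HH:MM in each place for a meeting time given in UTC."""
    # Tuesday 30 Aug 2016, the week after the meeting in this log.
    meeting = datetime(2016, 8, 30, utc_hour, utc_minute, tzinfo=timezone.utc)
    return {
        place: meeting.astimezone(timezone(timedelta(hours=offset))).strftime("%H:%M")
        for place, offset in OFFSETS_HOURS.items()
    }


if __name__ == "__main__":
    for place, clock in local_times(15, 30).items():
        print(f"{place}: {clock}")
```

Under these assumed offsets, 15:30 UTC comes out as 11:30 US Eastern, 08:30 US Pacific, 17:30 in Paris and 18:30 in Moscow, matching the times mentioned by the participants. A version meant to stay correct across the year would use real time zone data instead of fixed offsets.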