15:30:00 <DinaBelova> #startmeeting Performance Team
15:30:00 <openstack> Meeting started Tue Apr 18 15:30:00 2017 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:30:01 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:30:04 <openstack> The meeting name has been set to 'performance_team'
15:30:09 <rcherrueau> o/
15:30:19 <DinaBelova> rcherrueau good evening sir :)
15:30:31 <DinaBelova> tovin07 akrzos o/
15:30:31 <tovin07> hello o/
15:30:51 <akrzos> DinaBelova: o/
15:31:03 <DinaBelova> let's get started I guess :)
15:31:08 <DinaBelova> #topic Action Items
15:31:15 <DinaBelova> last time we had only one action item on rcherrueau
15:31:29 <rcherrueau> hum ...
15:31:29 <DinaBelova> regarding adding new testing methodology to the perf docs
15:31:37 <rcherrueau> ongoing :)
15:31:46 <DinaBelova> yeah, I thought so as well :)
15:31:51 <DinaBelova> so let's keep it :)
15:32:00 <DinaBelova> #action rcherrueau add OpenStack testing under networking delays (e.g. multisite deployment) methodology to performance docs (openstack under WAN)
15:32:07 <DinaBelova> #topic Current progress on the planned tests
15:32:21 <DinaBelova> in the meanwhile rcherrueau please share your current progress :)
15:32:26 <rcherrueau> We have our first results :)
15:32:34 <DinaBelova> yay :)
15:32:42 <rcherrueau> The deployment model we use is always the same: control, network and volume services are on the same nodes. computes are on dedicated nodes.
15:32:58 <rcherrueau> Then, we add latency between computes and the control node, and run Rally scenarios.
15:33:09 <rcherrueau> The latency variations are: 10, 25, 50, 100 and 150ms.
15:33:20 <rcherrueau> First experiments show that a bigger latency implies a longer VM boot time and delete time.
15:33:33 <rcherrueau> This was expected. Here are some results: At 10 ms, the average boot time is 22 sec, whereas at 150 ms it is 30 sec.
15:33:43 <rcherrueau> Respectively 5 sec and 8 sec for delete time.
15:33:50 <DinaBelova> yeah, this looks expected
15:33:57 <rcherrueau> I expect the time difference during boot comes from Glance. No idea for the delete time.
15:34:11 <rcherrueau> except oslo_messaging communications
15:34:21 <rcherrueau> I can say more soon, because I used OSProfiler at the same time to produce traces.
15:34:37 <rcherrueau> I have to dig into that and make a diff between two OSProfiler traces to see which functions are responsible for this difference.
15:34:41 <DinaBelova> cool, this should give us specific place
15:34:48 <rcherrueau> yep
15:34:57 <rcherrueau> For next week I plan to test neutron and I also wanna see how latency affects OpenStack when you have many clients and thus many messages in the bus (by varying concurrency in Rally).
15:35:10 <rcherrueau> If I have the time, I also wanna run Shaker.
15:35:23 <rcherrueau> That's all for Inria
15:35:29 <DinaBelova> ack, thank you sir
15:35:30 <DinaBelova> thanks again
15:35:48 <DinaBelova> akrzos, sir, do you have any news regarding the gnocchi testing?
15:36:18 <akrzos> Yeah continuing to try and scale up
15:36:34 <akrzos> Hit an issue where gnocchi lost coordinator on one controller
15:36:49 <akrzos> thus if thats 3 controllers
15:36:56 <akrzos> you lose 1/3 of your capacity
15:37:01 <akrzos> not sure why that occurred yet
15:37:23 <akrzos> been able to get ~5k instances
15:37:27 <akrzos> still trying to get to 10k
15:37:29 <DinaBelova> do data from the monitoring? can't it be some overload somewhere?
15:37:34 <akrzos> also removed the collector
15:37:49 <akrzos> but now looks like rpc settings not necessarily optimal for agent-notification
15:37:58 <akrzos> and the gnocchi api to receive so many requests
15:38:41 <akrzos> not really sure from the monitoring other than it's easily fixable with restarting the metricd processes
15:38:49 <akrzos> but you have to catch it
15:39:06 <DinaBelova> okay, what's your current feeling? Do you think it's still possible to reach 10k?
15:39:36 <akrzos> there is plenty of resources still on this setup its getting the services to use it
15:39:55 <akrzos> i have like 4 days now to figure it out
15:40:00 <akrzos> so probably not likely
15:40:06 <akrzos> unfortunately
15:40:09 <DinaBelova> :(
15:40:30 <DinaBelova> that's sad, but let's hope you'll overcome this issue
15:40:36 * akrzos fingers crossed
15:40:42 <DinaBelova> true
15:41:30 <DinaBelova> ok, so from mirantis side I was able to confirm that we're switching to Mirantis Cloud Platform (MCP) usage in our scale and performance tests
15:41:44 <DinaBelova> this is new experience for us, so we need to learn all tips and tricks first :)
15:41:53 <DinaBelova> before going to the tests themselves
15:42:41 <DinaBelova> so I suspect in nearest future we'll be trying to deploy it against various scale and create automatizations around this process with usual set of baseline tests run against it
15:43:29 <DinaBelova> so I suspect my updates won't be really interesting in next month or so :D
15:43:44 <DinaBelova> I think that's all from my side
15:43:59 <DinaBelova> #topic Open Discussion
15:44:06 <DinaBelova> tovin07 anything to talk about?
15:44:11 <tovin07> yes
15:44:33 * tovin07 getting link
15:45:12 <tovin07> #link Rally + OSprofiler https://review.openstack.org/#/c/456278/
15:45:36 * DinaBelova adds this change to the review list
15:45:39 <tovin07> Last week, Rally PTL created a spec for this feature
15:45:57 <tovin07> rcherrueau had some comment in this patch already
15:46:05 <rcherrueau> I wrote a small comment on this one on the form. But, on the basis this is really good.
15:46:29 <DinaBelova> sadly I did not take a look on it so far
15:46:36 <DinaBelova> but will do
15:47:19 <tovin07> besides, I tried some test with rally (with OSprofiler enabled) to measure overhead of OSprofiler in devstack environment
15:47:34 <tovin07> *tests
15:47:36 <DinaBelova> tovin07 oh, interesting
15:47:52 <DinaBelova> I suspect the influence depends much on the chosen storage background
15:48:18 <DinaBelova> tovin07 what is your current feeling on this?
15:48:19 <tovin07> currently, I try with redis
15:49:29 <tovin07> with some small tests, I saw that it take about 0 -> 7% overhead in testing time
15:49:56 <tovin07> with zero-optimization, I think it’s a good result :D
15:50:21 <DinaBelova> how does it depend on the trace points number?
15:50:42 <DinaBelova> is the dependency linear?
15:51:05 <tovin07> I did not have the detail answer for this :/
15:51:30 <DinaBelova> ok, please keep us updated :)
15:51:52 <tovin07> yup
15:52:01 <tovin07> that’s all from me this week
15:52:06 <DinaBelova> tovin07 thank you
15:52:17 <tovin07> How your Easter?
15:52:22 <DinaBelova> ok, anything else to cover? rcherrueau akrzos?
15:52:26 <DinaBelova> tovin07 fine, thanks :)
15:52:29 <rcherrueau> nop
15:52:46 <DinaBelova> ok, so thanks everyone for joining today, let's have a good week :)
15:52:50 <DinaBelova> bye!
15:52:51 <tovin07> thanks
15:52:52 <tovin07> :D
15:52:56 <DinaBelova> #endmeeting