16:00:20 <DinaBelova> #startmeeting Performance Team
16:00:21 <openstack> Meeting started Tue Feb 9 16:00:20 2016 UTC and is due to finish in 60 minutes. The chair is DinaBelova. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:24 <openstack> The meeting name has been set to 'performance_team'
16:00:44 <DinaBelova> ok guys I'm not sure if people are around :)
16:00:44 <andreykurilin> o/
16:00:48 <DinaBelova> andreykurilin o/
16:01:09 <DinaBelova> anyone else around? :)
16:01:15 <rvasilets_> me=)
16:01:18 <rvasilets_> Hi
16:01:24 <DinaBelova> rvasilets_ o/
16:01:35 <AugieMena> Hi
16:01:56 <DinaBelova> we have the usual agenda
16:02:02 <DinaBelova> #topic Action Items
16:02:19 <DinaBelova> last time we had in fact only one official action item on me
16:02:50 <DinaBelova> I went through the spec about performance testing by the Dragonflow team - it seems to be ok
16:03:11 <DinaBelova> the only thing I need to do right now is to get them to publish it to our docs as well
16:03:12 <DinaBelova> :)
16:04:07 <DinaBelova> as for the other changes to performance-docs, we in fact did not have much progress last week - no significant changes proposed
16:04:17 <DinaBelova> so it looks like this meeting will be quick :)
16:04:29 <DinaBelova> #topic Test plans statuses
16:04:34 <gokrokve> hi
16:04:37 <DinaBelova> gokrokve o/
16:04:56 <gokrokve> Give me a couple of minutes at the end, please
16:05:03 <DinaBelova> gokrokve sure
16:05:08 <gokrokve> We can discuss what we will do in the lab in Q1
16:05:19 <DinaBelova> gokrokve ack
16:05:41 <DinaBelova> so going back to the test plans
16:06:01 <DinaBelova> currently some refactoring work is in progress by ilyashakhat
16:06:23 * SpamapS has a test plan with a status.
16:06:38 <DinaBelova> he's splitting test plans into something generic + examples with real tools
16:06:52 <DinaBelova> SpamapS - oh, cool :) can you share some details?
16:07:07 <SpamapS> DinaBelova: certainly
16:07:53 <SpamapS> So, unfortunately, the ansible I'm using to deploy is not open source, though I'm working on that. :) But we've successfully spun up tooling to test fake nova in 1000 containers using just a couple of smallish hosts.
16:08:32 <SpamapS> We are going to test 2000 nova-compute's this week, and also the new Neutron OVN work which moves a lot of the python+rabbitmq load off into small C agents.
16:08:45 <DinaBelova> SpamapS wow, that's very cool
16:08:54 <SpamapS> Our results show that with a _REALLY_ powerful RabbitMQ server, you can absolutely scale Nova well beyond 1000 hypervisors.
16:09:10 <DinaBelova> SpamapS do you have any plans on pushing the test plan and the results to performance-docs?
16:09:20 <DinaBelova> to share it with a wider audience?
16:09:25 <SpamapS> 1000 hypervisors, with 60 busy users, results in RabbitMQ using 25 cores (fast baremetal cores) and 10GB of RAM.
16:09:40 <SpamapS> I plan to publish a paper and discuss it at the summit.
16:09:58 <SpamapS> And I'm hoping to include ansible roles for spinning up the thousands of fake nova's.
16:10:28 <SpamapS> What we have found, though, is that the scheduler is a HUGE bottleneck.
16:10:40 <DinaBelova> SpamapS - I think we can place all testing artefacts in performance-docs as well
16:10:47 <DinaBelova> SpamapS did you file a bug?
16:10:47 <SpamapS> With 2 schedulers running, they eat 2 cores up, and take increasingly long to finish scheduling requests as more busy users are added.
16:11:09 <SpamapS> I have not filed bugs just yet, because I have not tested with Mitaka yet. Most of this has been on Liberty
16:11:17 <DinaBelova> SpamapS ack
16:11:51 <SpamapS> Keystone and Neutron also eat cores very fast, but at a linear scale so far.
16:12:20 <DinaBelova> ok, so can you keep us posted on the progress for this testing?
also test plan sharing is very welcome for us to take a deeper look :)
16:12:27 <SpamapS> The problem with the scheduler is when we run with 3, it does not speed up scheduling, because they start stepping on each other, resulting in retries.
16:13:22 <gokrokve> This is interesting. I heard there is a config option for the scheduler to increase the number of threads.
16:13:23 <SpamapS> So yeah I hope to open this up a bit. My long term goal is to be able to spin up a similar version of this inside infra, with maybe 10 nova-compute's and my counter-inspection stuff verifying that we haven't had any non-linear scale effects.
16:13:36 <SpamapS> gokrokve: yeah, threads just don't help unfortunately.
16:13:43 <SpamapS> We have some ideas how they could though
16:13:58 <gokrokve> Nice
16:13:59 <SpamapS> Like, have a random filter, that randomly removes half the hosts that come back as possibilities.
16:14:16 <SpamapS> should reduce conflicts
16:14:25 <gokrokve> Sounds like a workaround
16:14:53 <gokrokve> I wonder if the Gantt project is alive and wants to fix this issue
16:15:03 <SpamapS> Long term, I'm not sure we can get away from cells as the answer for this.
16:15:14 <SpamapS> AFAIK, gantt is dead.
16:15:27 <SpamapS> But maybe it's doing things more now and I missed it.
16:15:32 <gokrokve> That's what I heard as well
16:15:35 <DinaBelova> hm, I wonder what's the mechanism schedulers are using for the coordination (if any) - it looks like it's wrong
16:15:44 <SpamapS> they have 0 coordination with each other.
16:15:55 <SpamapS> They read the database, choose a host, then attempt to claim resources.
16:16:02 <DinaBelova> SpamapS :D heh, looks like the issue
16:16:21 <SpamapS> What they should do is attempt to claim things in the database, all in one step.
16:16:34 <gokrokve> +1
16:16:37 <DinaBelova> +1
16:16:42 <gokrokve> Sounds like we need to engage the nova team
16:17:09 <gokrokve> We can start with Dims and Jay Pipes.
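The "claim things in the database, all in one step" idea from 16:16:21 can be illustrated with a conditional UPDATE. This is only a minimal sketch under stated assumptions, not Nova's actual code: the `hosts` table, its `free_ram_mb` column, and the `make_db`, `claim` and `schedule` helpers are all hypothetical, and SQLite stands in for the real database.

```python
import sqlite3


def make_db(hosts):
    """Create an in-memory table of hosts and their free RAM (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE hosts (name TEXT PRIMARY KEY, free_ram_mb INTEGER NOT NULL)"
    )
    db.executemany("INSERT INTO hosts VALUES (?, ?)", hosts)
    db.commit()
    return db


def claim(db, host, ram_mb):
    """Try to claim ram_mb on host with a single conditional UPDATE.

    The WHERE clause re-checks capacity at write time, so a scheduler that
    chose the host from a stale read cannot overcommit it.
    Returns True if the claim succeeded.
    """
    cur = db.execute(
        "UPDATE hosts SET free_ram_mb = free_ram_mb - ? "
        "WHERE name = ? AND free_ram_mb >= ?",
        (ram_mb, host, ram_mb),
    )
    db.commit()
    return cur.rowcount == 1


def schedule(db, ram_mb):
    """Read candidate hosts, then claim one; move to the next host on conflict."""
    rows = db.execute(
        "SELECT name FROM hosts WHERE free_ram_mb >= ? "
        "ORDER BY free_ram_mb DESC, name",
        (ram_mb,),
    ).fetchall()
    for (name,) in rows:
        if claim(db, name, ram_mb):
            return name
    return None
```

With this shape, a lost race shows up as a failed UPDATE inside the scheduler rather than as a failed boot on the compute node followed by a full retry.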
16:17:10 <SpamapS> Also they could collaborate by making a hash ring of hosts, and only ever scheduling things to the hosts they own on the hash ring until those hosts fail, and then move down the line to the next hosts.
16:17:24 <SpamapS> The nova team is focused on cells.
16:17:31 <SpamapS> That is their answer.
16:17:51 <SpamapS> And I am increasingly wondering if that will just have to be the answer (even though I think it makes things unnecessarily complex)
16:17:56 <DinaBelova> well, we can ping someone from Mirantis Nova team and find out if they are interested in fixing
16:18:10 <SpamapS> To be clear, I'm interested in fixing too. :)
16:18:13 <DinaBelova> SpamapS I would say it'll be overkill
16:18:22 <DinaBelova> SpamapS :D
16:18:35 <SpamapS> Yeah, we want to be able to run cells of 1000+ hypervisors.
16:18:57 <DinaBelova> well, yeah - so let's keep our eye on this issue and testing
16:19:10 <SpamapS> One reason cells have been kept below 800 is that RAX ops is just not built to improve OpenStack reliability, so they have needed to work around the limitations while development works on features.
16:19:26 <jaypipes> somebody say scheduler?
16:19:28 <jaypipes> :P
16:19:32 <SpamapS> Everybody else who is running big clouds is in a similar boat. Cells allows you to make small clouds and stitch them together.
16:19:38 <SpamapS> jaypipes: \o/
16:19:38 <DinaBelova> #info SpamapS is working on testing 2000 nova-compute's, and new Neutron OVN work which moves a lot of the python+rabbitmq load off into small C agents.
16:19:45 <DinaBelova> jaypipes :D
16:19:59 <SpamapS> jaypipes: indeed. I'll summarize where I'm bottlenecking...
16:20:20 <DinaBelova> SpamapS absolutely agree with your thoughts
16:20:26 <SpamapS> jaypipes: We run 1000 pretend fake hypervisors, and 60 pretend users doing create/list/delete.
16:21:05 <SpamapS> jaypipes: With 1 scheduler, this takes an average of 60s per boot. With 2 schedulers, this takes an average of 25s per boot.
With 3 schedulers, this takes an average of 60s per boot.
16:21:19 <SpamapS> jaypipes: the 3rd scheduler actually makes it worse, because retries.
16:21:31 <jaypipes> SpamapS: yes, that is quite expected.
16:21:43 <SpamapS> yeah, no shock on this face ;)
16:22:01 <DinaBelova> but not as scalable as people probably want :D
16:22:07 <SpamapS> jaypipes: I think one reason there's no urgency on this problem is that cells is just expected to be the way we get around this.
16:23:04 <jaypipes> SpamapS: not really... the medium-to-long term solution to this is moving claims to the scheduler and getting rid of the cache in the scheduler.
16:23:06 <SpamapS> But a) I want to build really big cells with fewer bottlenecks, and b) what about a really busy medium sized cell?
16:23:31 <SpamapS> jaypipes: claims to the scheduler is definitely the thing I want to work on.
16:23:41 <cdent> the patchset that's in progress right now to duplicate claims in the scheduler probably helps those retries
16:23:42 * cdent locates
16:23:58 <DinaBelova> SpamapS - it looks like jaypipes has the same approach
16:23:59 <DinaBelova> yeah
16:24:01 <SpamapS> sweet
16:24:17 <jaypipes> SpamapS: I am writing an email to the ML at the moment that describes the progress on the resource tracker and scheduler front. you should read it when it arrives in your inbox.
16:24:36 <cdent> https://review.openstack.org/#/c/262938/
16:24:44 <DinaBelova> #link https://review.openstack.org/#/c/262938/
16:25:11 <SpamapS> jaypipes: wonderful. I have been editing a draft of questions since I started this work, and never ready to send it because I keep finding new avenues to explore. Hopefully I can just delete the draft and reply to you. :)
16:25:37 <jaypipes> cdent: actually, that patch doesn't help much at all... and is likely to be replaced in the long run, but I suppose it serves the purpose of illustration for the time being.
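The hash ring SpamapS floated at 16:17:10 (each scheduler only places work on the hosts it owns) could look roughly like the following. It is a sketch of plain consistent hashing, not anything that exists in Nova: the `HashRing` class, the `partition` helper, the `replicas` parameter and the scheduler and host names are all made up for illustration, and the "move down the line when hosts fail" fallback is left out.

```python
import bisect
import hashlib


def _hash(key):
    """Stable 64-bit hash of a string (md5 used for spread only, not security)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, schedulers, replicas=64):
        # Each scheduler gets `replicas` points on the ring to even out the split.
        self._points = sorted(
            (_hash(f"{s}#{i}"), s) for s in schedulers for i in range(replicas)
        )
        self._keys = [point for point, _ in self._points]

    def owner(self, host):
        """Return the scheduler owning host: first ring point at or after its hash."""
        idx = bisect.bisect_left(self._keys, _hash(host)) % len(self._keys)
        return self._points[idx][1]


def partition(hosts, schedulers):
    """Group hosts by owning scheduler so that no two schedulers share a host."""
    ring = HashRing(schedulers)
    owned = {s: [] for s in schedulers}
    for host in hosts:
        owned[ring.owner(host)].append(host)
    return owned
```

Because ownership is disjoint, two schedulers never race for the same host. When a scheduler leaves the ring, only the hosts it owned are reassigned; the others keep their owner.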
16:25:41 <DinaBelova> SpamapS :D
16:26:00 <cdent> jaypipes: I know it is going to be replaced, just it is a datapoint
16:26:03 <SpamapS> cdent: oh yes, that's definitely along the lines I was thinking to get them to stop stomping on each other.
16:26:54 <SpamapS> The other thing I want to experiment with is a scheduler that runs on each nova-compute, and pulls work.
16:26:55 <DinaBelova> SpamapS jaypipes cdent - anything else here?
16:27:01 <cdent> It's a given that the shared state management in the scheduler, inter and intra, is a mess and having a lock and claim stuff are just stopgaps.
16:27:40 <jaypipes> SpamapS: that's not the direction we're going but is an interesting conversation nonetheless
16:28:20 <SpamapS> jaypipes: yeah no it's a spike to solve other problems. Making what the general community has work well would be preferred.
16:28:23 <cdent> presumably it will just move around the slowness rather than getting to right sooner
16:28:41 <SpamapS> cdent: it would strip all the features out.
16:28:56 <DinaBelova> #info let's go through jaypipes' email that describes the progress on the resource tracker and scheduler front - let
16:28:57 <SpamapS> precalc queues based on flavors only
16:29:07 <cdent> SpamapS: sorry, my "it" was the patch I was referencing.
16:29:21 <SpamapS> and each compute can decide "can I handle that flavor right now? yes: I will pull jobs from that queue"
16:29:25 <SpamapS> cdent: ah k.
16:29:55 <SpamapS> anyway, yes I patiently await your email sir pipes.
16:30:07 <DinaBelova> ok, cool
16:30:28 <SpamapS> Oh also we haven't been able to try zmq to solve the 25 core eating RMQ monster because of time constraints.
16:30:59 <SpamapS> Turns out a couple of 56 core boxes are relatively cheap when you're building thousands of hypervisors out.
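The pull model SpamapS describes above (queues precalculated per flavor, with each compute node deciding whether it can take a job) can be sketched as follows. Everything here is hypothetical: `FlavorQueues`, `Compute`, the RAM-only fit check and the in-process `deque` are stand-ins, and a real deployment would presumably use a message queue instead.

```python
from collections import deque


class FlavorQueues:
    """One FIFO of pending boot requests per flavor (hypothetical structure)."""

    def __init__(self, flavors):
        self.flavors = dict(flavors)  # flavor name -> required RAM in MB
        self.queues = {name: deque() for name in self.flavors}

    def submit(self, flavor, request_id):
        self.queues[flavor].append(request_id)


class Compute:
    def __init__(self, name, free_ram_mb):
        self.name = name
        self.free_ram_mb = free_ram_mb
        self.instances = []

    def pull(self, fq):
        """Pull at most one request from a flavor queue this node can fit right now."""
        for flavor, ram in fq.flavors.items():
            if ram <= self.free_ram_mb and fq.queues[flavor]:
                request_id = fq.queues[flavor].popleft()
                self.free_ram_mb -= ram
                self.instances.append((request_id, flavor))
                return request_id
        return None
```

For example, a node with 1024 MB free skips a queued 4096 MB request and takes a 512 MB one instead. In this toy version the node checks only its own local state and dequeuing hands each request to one node, so there is no central claim to retry. As noted in the discussion, this also strips out the filtering features of the scheduler.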
16:31:53 <dims> SpamapS : on that front, we need help with the zmq driver development
16:32:13 <DinaBelova> SpamapS our internal experiments show that ZMQ is much quicker and eats fewer resources - but it'll be interesting to take a look at the comparison of the results on your env and your load indeed
16:32:15 <dims> SpamapS : and testing of course :)
16:32:24 <DinaBelova> SpamapS yeah, dims is right
16:32:31 <DinaBelova> work on zmq driver is in progress
16:32:35 <SpamapS> dims: I hope to test it later this week once I get through repeating my work with mitaka.
16:32:43 <DinaBelova> and continuous improvement
16:32:57 <DinaBelova> SpamapS ack
16:33:01 <SpamapS> I also need to send notifications via rabbitmq still, so that will be interesting. :)
16:33:13 <DinaBelova> :D yeah
16:33:30 <DinaBelova> So I suppose we may move on?
16:33:54 <SpamapS> Yes, thanks for the in depth discussion though. :)
16:34:03 <DinaBelova> SpamapS you're welcome sir :)
16:34:17 <DinaBelova> I just hope to get this info publickly available soon
16:34:35 <DinaBelova> publicly*
16:34:48 <DinaBelova> ok, so let's jump to the osprofiler topic
16:34:55 <DinaBelova> #topic OSProfiler weekly update
16:35:10 <DinaBelova> in fact profiler had a nervous week :)
16:35:34 <DinaBelova> we broke some gates with 1.0.0 release and the methods we used for static methods tracing
16:35:37 <DinaBelova> :D
16:36:08 <DinaBelova> so the workaround was to skip staticmethods tracing for now, as it has lots of corner cases that need to be treated separately
16:36:51 <DinaBelova> and 1.0.1 release seems to work ok - some internal testing by Mirantis folks who needed it has shown good results
16:37:05 <DinaBelova> dims thanks one more time for quick 1.0.1 release pushing
16:37:53 <DinaBelova> #info Neutron change https://review.openstack.org/273951 is still in progress
16:38:17 <DinaBelova> sadly I did not have enough time to make it fully workable yet, that's still in progress
16:38:47 <DinaBelova> the situation
with nova change is a bit different - but with the same result
16:39:20 <DinaBelova> #info Nova change https://review.openstack.org/254703 won't land in Mitaka due to the lack of functional testing
16:39:48 <DinaBelova> so either Boris or I will need to add this specific job to get it landed early in Newton
16:40:38 <DinaBelova> johnthetubaguy - thanks btw for letting me know what you guys want to see to get it merged :) it was sad that it won't land in M release, but still
16:41:38 <DinaBelova> in fact all other work is still in progress with the patches for neutron, nova, keystone on review
16:42:24 <DinaBelova> any questions regarding the profiler?
16:42:46 <DinaBelova> ok, so let's go to the open discussion
16:42:51 <DinaBelova> #topic Open Discussion
16:42:56 <DinaBelova> gokrokve the floor is yours
16:43:01 <gokrokve> Cool. Thanks
16:43:19 <gokrokve> As you probably know we now have a lab with 240 physical nodes
16:44:02 <gokrokve> In Q1 we plan to do several tests related to: MQ (RabbitMQ vs ZMQ), DB (MySQL Galera) and Nova conductor behavior
16:44:43 <gokrokve> In addition to that we are working on a rally enhancement to add networking workloads to test data plane networking performance for tenant networks
16:44:56 <DinaBelova> + baremetal provisioning scalability for the cloud deployment
16:45:18 <gokrokve> This will be done in upstream rally. Most of the functionality is already available but we need to create the workload itself
16:45:30 <gokrokve> Yep and baremetal as well.
16:46:01 <gokrokve> We are also working on the test results format
16:46:20 <gokrokve> So that it can be published more or less automatically in RST format
16:46:35 <DinaBelova> gokrokve I think that almost all details are published on the test plans, we'll probably change them if something else needs to be covered
16:46:37 <gokrokve> I hope we will be able to upstream this as well
16:46:39 <DinaBelova> gokrokve indeed
16:47:05 <gokrokve> Sure, test plans will be changed accordingly
16:47:26 <DinaBelova> also afair kun_huang 's lab is going to appear sometime close to Feb 18th
16:47:37 <gokrokve> This is great.
16:47:48 <DinaBelova> so listomin will coordinate if we can run something else on that lab as well
16:47:53 <DinaBelova> gokrokve +1 :)
16:48:00 <gokrokve> The more independent runs we have, the more confident we will be with the results
16:48:08 <DinaBelova> gokrokve indeed
16:48:24 <gokrokve> So that is almost it.
16:48:38 <gokrokve> I just want to share these plans for Q1 with the audience.
16:48:49 <DinaBelova> gokrokve thank you sir
16:49:08 <DinaBelova> ok, folks, any questions?
16:50:03 <DinaBelova> ok, so thanks everyone for coming and special thanks to SpamapS for raising a very interesting topic
16:50:12 <DinaBelova> see u guys
16:50:16 <DinaBelova> #endmeeting