13:02:41 #startmeeting hyper-v
13:02:42 Meeting started Wed Jan 13 13:02:41 2016 UTC and is due to finish in 60 minutes. The chair is alexpilotti. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:02:43 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:02:46 The meeting name has been set to 'hyper_v'
13:02:47 hi guys
13:02:50 hello
13:02:51 hi
13:02:52 hi
13:02:53 hi
13:02:57 morning folks!
13:03:19 #topic FC
13:03:32 Hello All
13:03:49 lpetrut: want to give us some updates?
13:04:33 well, a new os-win release has been made, including the FC work
13:04:57 nice
13:05:28 did you guys give it a try?
13:05:52 #link https://review.openstack.org/#/c/258617/
13:05:59 not yet, I have requested QA support for it
13:06:09 waiting for QA to take it
13:06:20 this means that with the latest os-win, all FC patches got a green light on Jenkins
13:06:34 we're still waiting for HP reviews BTW
13:06:47 jobs were failing this morning, India time
13:06:50 sagar_nikam: any news from hemna or kurt?
13:06:56 are they passing now?
13:07:10 waiting for Jenkins to pass, and then will ping them
13:07:32 sagar_nikam: this happened today, so probably after India morning time
13:07:46 ok
13:08:07 #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/hyperv-fibre-channel
13:08:24 all patches in this topic are +1 from Jenkins
13:08:39 do you guys have any questions on this?
13:08:45 and even before, they were all green except the last one, which depended on os-win
13:08:59 so reviews could have been done even before, to save time
13:09:11 will now request Kurt and hemna to review
13:09:25 sagar_nikam: great, thanks
13:09:27 lpetrut: no questions as of now
13:09:44 changing topic then :)
13:09:46 this is with PyMI... correct?
13:09:53 ?
13:09:58 #topic PyMI
13:10:18 sagar_nikam: what about PyMI?
13:10:40 the FC patches all use PyMI... am I right?
13:11:07 we use the hbaapi.dll lib directly, so WMI is not used
13:11:09 sagar_nikam: they work either with the old wmi or with PyMI
13:11:31 and as lpetrut said, the os-win FC part is plain ctypes
13:11:41 ok
13:11:54 ok
13:12:03 switching to PyMI: as discussed by email, we just released a wheel for Py27 x64
13:12:04 is there any perf impact due to that?
13:12:31 sagar_nikam: perf impact on what in particular?
13:12:50 using ctypes instead of WMI?
13:13:16 other than an overall speed boost, nope
13:14:07 ok
13:14:13 for FC we also had some issues with API coverage on WMI, right lpetrut?
13:14:46 yeah, the reason we were forced to use ctypes is that some of the WMI API functions were broken
13:14:46 alexpilotti: regarding the new wheel for py27, will try it today and let you know if it works
13:15:02 sagar_nikam: perfect, thanks
13:15:29 any more questions on PyMI?
13:15:38 one question on ctypes... I hope it works on Windows Core and Nano
13:15:47 of course
13:15:51 ok
13:16:07 no more questions on PyMI, will try it and let you know
13:16:20 next...
13:16:43 hi guys, sorry to interrupt; I wanted to share some results from the tests we ran last week. Do let me know when I can discuss them
13:16:45 #topic The great networking-hyperv SGR performance boost
13:17:09 kvinod: sure, on what topic in particular?
13:17:38 in terms of performance and scale
13:18:09 kvinod: any particular area?
13:18:36 kvinod: e.g. compute, networking, WMI vs PyMI, ACL/SGR, etc
13:18:52 kvinod: master, liberty?
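[Editor's sketch, for context on the "plain ctypes" remark above: a minimal example of calling hbaapi.dll directly from Python, sidestepping WMI entirely. HBA_GetNumberOfAdapters is part of the standard SNIA HBA API; the actual os-win FC wrapper is far more thorough, and this snippet only runs on a Windows host with the FC HBA stack installed.]

import ctypes

# Load the SNIA HBA API library directly; no WMI involved.
hbaapi = ctypes.cdll.LoadLibrary('hbaapi.dll')

# HBA_GetNumberOfAdapters() takes no arguments and returns the number
# of Fibre Channel HBAs registered with the HBA API.
adapter_count = hbaapi.HBA_GetNumberOfAdapters()
print('FC HBA adapters found: %d' % adapter_count)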
13:18:54 networking and nova, both
13:19:23 ok, let's get through the current topic, then you're next :)
13:19:25 we used the MSI from the cloudbase website
13:19:49 I hope the new one is based on liberty
13:19:56 k
13:19:58 on the current topic, "The great networking-hyperv SGR performance boost"
13:20:39 as already discussed at length previously, we went ahead with parallelizing ACL management in networking-hyperv
13:20:59 claudiub: want to share some details?
13:21:21 k
13:21:29 yeah. so, I've run some tests, and it seems to be doing about 2x better now, using native threads
13:21:43 and using 5 threads as workers
13:21:56 claudiub: on a host with how many cores?
13:22:18 at some point, the performance enhancement seems to hit a maximum due to VMMS
13:22:26 4 cores
13:23:03 it seems that after a certain point, VMMS can't handle many more requests at once anyway
13:23:13 I'm currently testing on a host with 32 cores
13:23:29 claudiub: so when you say workers, you mean worker green threads?
13:23:30 claudiub: how many "workers" do you have there?
13:23:42 kvinod: native threads
13:23:43 on the 32 cores, 10 workers
13:23:51 ok
13:24:17 claudiub: what about one thread per core, aka 32 threads?
13:24:18 but in my opinion, VMMS won't be able to keep up with the workers
13:24:30 so you are making use of all cores to schedule worker threads?
13:24:58 just for testing, as 32 workers might be a bit insane in production :)
13:25:17 native threads should be scheduled on separate cores
13:25:41 claudiub: can you do a test with 32 "workers"? ^
13:25:49 will do
13:26:18 I'd like to see a graph with time on the Y axis and number of workers on the X axis
13:26:38 which patch is this perf enhancement due to? is it the BP submitted by kvinod, or due to PyMI?
13:26:56 if VMMS is the bottleneck as we suspect, the graph will flatten fast
13:27:02 also, it seems that VMMS gets slower depending on how many ports have been bound
13:27:18 claudiub: so has the design changed now? have we completely left green threads and moved on to native threads?
13:27:33 sagar_nikam: this is the new patchset
13:27:52 https://review.openstack.org/#/c/264235/8
13:27:55 the native threads patch
13:28:03 with PyMI
13:28:06 kvinod: as discussed last time, this uses native threads + PyMI
13:28:20 ok
13:28:30 in short, PyMI allows real multithreading to work
13:28:36 unlike the old WMI
13:29:16 this way real threads in Python work almost as in a native context
13:29:23 anyway, the patch will require the latest PyMI version
13:29:29 the master version ATM
13:29:30 ok, in that case I would be interested to try that out on my setup (native threads + PyMI)
13:29:45 kvinod: that'd be great
13:30:12 claudiub: can you pass kvinod the patch to test?
13:30:15 kvinod: you wanted to discuss some test results...
13:30:24 claudiub: can you please share the details needed to consume it
13:30:27 what was it using?
13:30:44 already passed the patch a little bit earlier
13:31:01 this branch also includes a bunch of additional patches that improve other areas
13:31:20 so if you guys help with testing, it's much appreciated, as usual
13:31:27 any other questions?
13:31:41 kvinod: so, check out the commit, install it, make sure you have PyMI master, and it should be fine
13:31:49 ok, it would be good if someone could share the consolidated list of patches that I should apply
13:32:03 shouldn't require anything else
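[Editor's sketch of the worker model discussed above, not the networking-hyperv patch itself: a fixed pool of native threads processing security group rules per port. PyMI releases the GIL during MI calls, so native threads can issue WMI operations in parallel, unlike the old wmi module. apply_port_rules and the port list are hypothetical placeholders.]

from concurrent import futures

WORKER_COUNT = 10  # e.g. 10 workers on the 32-core test host


def apply_port_rules(port):
    # Placeholder for the real per-port ACL/SGR processing, which would
    # issue WMI calls against the Hyper-V virtualization namespace.
    print('applying security group rules for port %s' % port)


def process_ports(ports):
    with futures.ThreadPoolExecutor(max_workers=WORKER_COUNT) as pool:
        # Block until every port has been processed; list() also
        # surfaces any exception raised by a worker.
        list(pool.map(apply_port_rules, ports))


process_ports(['port-%d' % i for i in range(100)])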
13:32:51 claudiub: also wanted to know how to install and use/enable PyMI
13:33:02 kvinod: pip install pymi
13:33:07 ok
13:33:14 kvinod: as easy as that
13:33:22 ok, next:
13:33:35 and how about the list of patch sets?
13:34:02 claudiub: does the patch you listed depend on the previous ones?
13:34:26 I still have trouble finding it in the new Gerrit UI :-)
13:34:32 it does, but a checkout will take the whole branch
13:34:48 git fetch http://review.openstack.org/openstack/networking-hyperv refs/changes/35/264235/10 && git checkout FETCH_HEAD
13:35:08 so, this will also take the previous commits
13:35:08 perfect
13:35:12 moving on then!
13:35:21 #topic "Performance"
13:35:31 kvinod: the stage is yours :)
13:35:45 ok
13:36:24 I would like to introduce Thala, who does the scale testing
13:36:34 he is on IRC now
13:37:02 Hello All,
13:37:09 hello
13:37:10 so, we downloaded the MSI from the cloudbase website and used it to bring up the computes
13:37:40 kvinod: what MSI?
13:37:43 these were our test details
13:37:46 liberty, master, etc?
13:37:55 the installer file
13:38:05 overall VMs targeted: 1k; user concurrency: 10; VMs per project: 100; networks per project: 1; security group used: default; image used to spawn VMs: Fedora; DHCP timeout for VMs: 60 seconds
13:38:06 kvinod: there are different MSIs, one for each version
13:38:26 kvinod: did you add PyMI?
13:38:40 no, that is something we did not do
13:38:58 so I feel I could have done that
13:39:05 I still need to know what MSI you used
13:39:11 we will give it one more run with PyMI
13:39:15 ok
13:39:17 kvinod: you will need to use the mitaka MSI
13:39:44 so we saw around a 37% nova failure rate
13:39:45 the mitaka one (aka the dev / beta one) is just a nightly build from master
13:40:05 the current production one is the liberty one
13:40:21 i.e. out of 1000 VMs, 37% went into the error state
13:40:40 kvinod: that's quite impossible on liberty
13:40:48 Thala, could you please tell us which installer you downloaded
13:41:17 sorry, but we actually saw that happening
13:41:20 while the master one (mitaka) is just for dev purposes, so it's not relevant for performance discussion
13:41:47 Thala: could you please confirm which installer you used
13:42:38 also around 37% failures for neutron as well
13:42:57 I'm curious to see the nova-compute logs on the Hyper-V nodes
13:43:14 vinod: package created on 1/8/2016
13:43:29 Thala: that's master
13:43:42 out of the successfully booted VMs, only 408 got an IP assigned
13:43:43 so all the work you did is useless, sorry to tell you
13:44:19 is the MSI name something like "HyperVNovaCompute_Beta.msi"?
13:44:37 alexpilotti: since PyMI can also work with liberty, are you suggesting thala use the liberty MSI?
13:44:41 ok, then do you suggest bringing up the setup with liberty, installing PyMI, and then running the test?
13:45:12 sagar_nikam: definitely. There's no reason to do any performance testing on development nightly builds
13:45:34 alexpilotti: agree
13:46:14 thala: will you be able to test scale with and without PyMI using the liberty MSI?
13:46:15 fine, this run we will try with stable liberty with PyMI
13:46:22 we will start doing performance tests on mitaka after M3
13:46:27 then we can compare the results
13:46:37 that'd be great
13:46:41 I mean with and without PyMI
13:46:52 thala: is it possible?
13:47:03 sagar_nikam: targeting 1k VMs, will do that
13:47:04 remember to do a "pip install pymi", as liberty does not come with PyMI (yet)!
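[Editor's sketch of a quick sanity check after "pip install pymi", assuming the usual PyMI behavior: PyMI ships a wmi-compatible module, so existing code doing "import wmi" picks it up transparently, now with multithreading-friendly MI calls underneath. The moniker format mirrors what os-win uses; run on a Hyper-V host.]

import wmi

# With PyMI installed, this resolves to PyMI's wmi compatibility layer.
conn = wmi.WMI(moniker='//./root/virtualization/v2')

# List the Hyper-V host plus any defined VMs via the standard
# Msvm_ComputerSystem WMI class.
for system in conn.Msvm_ComputerSystem():
    print(system.ElementName)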
13:47:38 thala: so you need to run the tests twice, maybe on the same set of Hyper-V hosts
13:47:51 Thala: in the meantime, from a development standpoint, seeing the logs you collected could help
13:48:20 Thala: do you think you could send some of them to claudiub?
13:48:21 agreed
13:48:59 also, one detail that was missed: how many compute nodes?
13:49:02 sagar_nikam: I will send a few compute logs
13:49:10 Thala: tx!
13:49:51 but remember I have not enabled debug, so no idea what dev will get; better I run with debug enabled and update you folks
13:50:16 errors should still be visible in the logs
13:51:06 claudiub: fine, those errors I can share
13:51:16 Thala: great
13:51:19 also, it might be interesting to test with abalutoiu's associators improvements. that should shave off a lot of execution time
13:51:44 claudiub: yeah, but that requires mitaka ATM
13:52:00 we still have to backport it to liberty
13:52:46 is that in os-win now... in mitaka?
13:52:55 the associators patch
13:52:57 sagar_nikam: yep
13:53:05 ok
13:53:24 talking about performance, claudiub just mentioned a set of patches that abalutoiu is working on
13:53:41 we identified a few areas that can provide a significant performance boost
13:53:54 another 20-30% in current tests
13:54:36 based on current Rally results, this brings us much closer to KVM, performance-wise
13:54:44 last topic:
13:54:53 I just finished a test on liberty with the patch that I'm currently working on, and it shows a ~30% performance boost in comparison
13:55:00 #topic networking-hyperv VLAN bug
13:55:09 claudiub: wanna say something?
13:55:19 this is the "false" binding
13:55:38 yeah. so, it seems that there's a small chance that the VLAN is not bound to a port, even if Hyper-V says it was bound
13:55:56 the result returned by Hyper-V being positive
13:56:22 there's a fix for this, but it will cost a bit of performance
13:56:48 but we should prefer reliability over performance
13:57:08 https://review.openstack.org/#/c/265728/
13:57:13 the fix
13:57:15 to put this in context, this is apparently a Hyper-V race condition bug that seems to happen every 50-100 bindings
13:57:49 claudiub: is the issue present in liberty?
13:57:54 it's very annoying, as WMI reports that the port binding was successful, while actually it didn't happen
13:57:57 after a few Rally runs, I've managed to capture this: http://paste.openstack.org/show/483347/
13:58:03 we can check if thala's tests hit the issue
13:58:13 claudiub: shall we take this fix as well when we do the testing?
13:58:26 it happens on any Hyper-V version we are testing
13:58:35 independently from the OpenStack release
13:58:40 yeah, it's Hyper-V related
13:58:47 ok
13:59:02 then do you want thala to include that patch?
13:59:02 2' to go
13:59:27 anything else to mention before closing?
13:59:34 sagar_nikam claudiub ^
13:59:40 nope
13:59:44 alexpilotti: before we finish, the FC patch failed Jenkins: https://review.openstack.org/#/c/258617/
13:59:50 just saw it now
14:00:11 probably a random failure?
14:00:14 lpetrut: there's a Python 3.4 failure
14:00:21 looks transient
14:00:22 py3 in nova is a bit... funky
14:00:28 lpetrut: can you issue a recheck?
14:00:38 gate-nova-python34
14:00:41 2.7 passed
14:00:49 time's up
14:00:53 #endmeeting
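[Editor's sketch of the verify-after-set idea behind the VLAN fix discussed above (the actual change is https://review.openstack.org/#/c/265728/): since Hyper-V can report a successful VLAN binding that did not actually take effect, re-read the port's VLAN setting and retry until it matches. The _set/_get helpers are hypothetical stand-ins for the networking-hyperv WMI calls, simulated here so the sketch runs standalone.]

import random
import time

_port_vlans = {}


def _set_vlan_id(port_id, vlan_id):
    # Simulate the race: the call "succeeds", but occasionally the
    # binding silently does not land, as observed every 50-100 bindings.
    if random.random() > 0.02:
        _port_vlans[port_id] = vlan_id


def _get_vlan_id(port_id):
    return _port_vlans.get(port_id)


def bind_vlan_verified(port_id, vlan_id, retries=3, interval=1):
    for _ in range(retries):
        _set_vlan_id(port_id, vlan_id)        # reported as successful...
        if _get_vlan_id(port_id) == vlan_id:  # ...but verify anyway
            return
        time.sleep(interval)
    raise RuntimeError('VLAN %s not bound on port %s' % (vlan_id, port_id))


bind_vlan_verified('port-0', 1000)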