13:02:41 <alexpilotti> #startmeeting hyper-v
13:02:42 <openstack> Meeting started Wed Jan 13 13:02:41 2016 UTC and is due to finish in 60 minutes. The chair is alexpilotti. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:02:43 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:02:46 <openstack> The meeting name has been set to 'hyper_v'
13:02:47 <primeministerp> hi guys
13:02:50 <claudiub> hello
13:02:51 <itoader> hi
13:02:52 <abalutoiu> hi
13:02:53 <atuvenie_> hi
13:02:57 <alexpilotti> morning folks!
13:03:19 <alexpilotti> #topic FC
13:03:32 <Thala> Hello All
13:03:49 <alexpilotti> lpetrut: want to give us some updates?
13:04:33 <lpetrut> well, a new os-win release has been made, including the FC work
13:04:57 <sagar_nikam> nice
13:05:28 <lpetrut> did you guys give it a try?
13:05:52 <alexpilotti> #link https://review.openstack.org/#/c/258617/
13:05:59 <sagar_nikam> not yet, i have requested QA support for it
13:06:09 <sagar_nikam> waiting for QA to take it
13:06:20 <alexpilotti> this means that with the latest os-win, all FC patches gave green light on jenkins
13:06:34 <alexpilotti> we're still waiting for HP reviews BTW
13:06:47 <sagar_nikam> jobs were failing today morning India time
13:06:50 <alexpilotti> sagar_nikam: any news from hemna or kurt?
13:06:56 <sagar_nikam> are they passing now
13:07:10 <sagar_nikam> waiting for jenkins to pass, and then will ping them
13:07:32 <alexpilotti> sagar_nikam: this happened today, so probably after India morning time
13:07:46 <sagar_nikam> ok
13:08:07 <alexpilotti> #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/hyperv-fibre-channel
13:08:24 <alexpilotti> all patches in this topic are +1 from Jenkins
13:08:39 <lpetrut> do you guys have any questions on this?
13:08:45 <alexpilotti> and even before, except the last one that was depending on os-win, they were green
13:08:59 <alexpilotti> so reviews could have been done even before, to save time
13:09:11 <sagar_nikam> will now request Kurt and hemna to review
13:09:25 <alexpilotti> sagar_nikam: great, thanks
13:09:27 <sagar_nikam> lpetrut: no questions as of now
13:09:44 <alexpilotti> changing topic then :)
13:09:46 <sagar_nikam> this is with pymi ... correct ?
13:09:53 <lpetrut> ?
13:09:58 <alexpilotti> #topic PyMI
13:10:18 <alexpilotti> sagar_nikam: what is with PyMI?
13:10:40 <sagar_nikam> FC patches are all with pyMI... am i right ?
13:11:07 <lpetrut> we use the hbaapi.dll lib directly, so WMI is not used
13:11:09 <alexpilotti> sagar_nikam: they work either with old wmi or with PyMI
13:11:31 <alexpilotti> and as lpetrut said, the os-win FC part is plain ctypes
13:11:41 <sagar_nikam> ok
13:11:54 <sagar_nikam> ok
13:12:03 <alexpilotti> switching to PyMI, as discussed by email we just released a wheel for Py27 x64
13:12:04 <sagar_nikam> is there any perf impact due to that ?
13:12:31 <alexpilotti> sagar_nikam: perf impact on what in particular?
13:12:50 <sagar_nikam> using ctypes instead of WMI ?
13:13:16 <lpetrut> other than overall speed boost, nope
13:14:07 <sagar_nikam> ok
13:14:13 <alexpilotti> for FC we also had some issues in API coverage on WMI, right lpetrut?
13:14:46 <lpetrut> yeah, the reason we were forced to use ctypes is that some of the WMI API functions were broken
13:14:46 <sagar_nikam> alexpilotti: regarding the new wheel for py27, will try it today and let you know if it works
13:15:02 <alexpilotti> sagar_nikam: perfect thanks
13:15:29 <alexpilotti> any more questions on PyMI?
13:15:38 <sagar_nikam> one question on ctypes ... i hope it works on windows core and nano
13:15:47 <alexpilotti> of course
13:15:51 <sagar_nikam> ok
13:16:07 <sagar_nikam> no more questions on pyMI, will try and let you know
13:16:20 <alexpilotti> next...
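(Editor's note: the os-win FC code calls hbaapi.dll directly via ctypes rather than going through WMI. As a minimal sketch of the ctypes mechanism only — hbaapi.dll is Windows-only and its real bindings live in os-win — here is a hypothetical illustration calling a plain C library function from Python; the use of libc's `abs()` is purely illustrative.)

```python
import ctypes

# Load a C library. On Windows this would be something like
# ctypes.WinDLL("hbaapi.dll"); here we load the C runtime of the
# current process (POSIX) purely to illustrate the mechanism.
libc = ctypes.CDLL(None)

# Declare the signature so ctypes marshals arguments correctly.
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

print(libc.abs(-42))  # -> 42
```

The same pattern — load the DLL, declare argument/return types, call the function — is what lets os-win bypass WMI entirely for the FC paths.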
13:16:43 <kvinod> hi guys, sorry to interrupt in between, I wanted to share some results with respect to test that we have done last week, do let me know when can i discuss the same
13:16:45 <alexpilotti> #topic The great networking-hyperv SGR performance boost
13:17:09 <alexpilotti> kvinod: sure, on what topic in particular?
13:17:38 <kvinod> in terms of performance and scale
13:18:09 <alexpilotti> kvinod: any particular area?
13:18:36 <alexpilotti> kvinod: e.g. compute, networking, WMI vs PyMI, ACL/SGR, etc
13:18:52 <alexpilotti> kvinod: master, liberty?
13:18:54 <kvinod> Networking and nova both
13:19:23 <alexpilotti> ok, let's get over the current topic then you're next :)
13:19:25 <kvinod> we used msi from cloudbase website
13:19:49 <kvinod> i hope the new one is based out of liberty
13:19:56 <kvinod> k
13:19:58 <alexpilotti> on the current topic "The great networking-hyperv SGR performance boost"
13:20:39 <alexpilotti> as already broadly discussed previously, we went on with parallelizing ACL management in networking-hyperv
13:20:59 <alexpilotti> claudiub: want to share some details?
13:21:21 <kvinod> k
13:21:29 <claudiub> yeah. so, i've run some tests, it seems to be doing about 2x better now, using native threads
13:21:43 <claudiub> and using 5 threads as workers
13:21:56 <alexpilotti> claudiub: on a host with how many cores?
13:22:18 <claudiub> at some point, the performance enhancement seems to hit a maximum due to vmms
13:22:26 <claudiub> 4 cores
13:23:03 <claudiub> it seems that after a certain point, vmms can't handle many more requests at once anyways.
13:23:13 <claudiub> i'm currently testing on a host with 32 cores.
13:23:29 <kvinod> claudiub: so when you say workers, you mean worker green threads?
13:23:30 <alexpilotti> claudiub: how many "workers" do you have there?
13:23:42 <alexpilotti> kvinod: native threads
13:23:43 <claudiub> on the 32 cores, 10 workers.
13:23:51 <kvinod> ok
13:24:17 <alexpilotti> claudiub: what about one thread per core? aka 32 threads?
13:24:18 <claudiub> but in my opinion, vmms won't be able to keep up with the workers.
13:24:30 <kvinod> so you are making use of all cores to schedule worker threads?
13:24:58 <alexpilotti> just for testing, as 32 workers might be a bit insane on production :)
13:25:17 <claudiub> native threads should be scheduled on separate cores.
13:25:41 <alexpilotti> claudiub: can you do a test w 32 "workers"? ^
13:25:49 <claudiub> will do.
13:26:18 <alexpilotti> I'd like to see a graph with: time on Y axis and number of workers on X axis
13:26:38 <sagar_nikam> this perf enhancement is due to which patch ?, is it the BP submitted by kvinod: or due to pyMI
13:26:56 <alexpilotti> if vmms is the bottleneck as we suspect, the graph will flatten fast
13:27:02 <claudiub> also, it seems that vmms is working slower depending on how many ports have been bound.
13:27:18 <kvinod> claudiub: so has the design changed now and we have completely left green threads and moved on to native threads
13:27:33 <alexpilotti> sagar_nikam: this is the new patchset
13:27:52 <claudiub> https://review.openstack.org/#/c/264235/8
13:27:55 <claudiub> native threads patch
13:28:03 <claudiub> with pymi
13:28:06 <alexpilotti> kvinod: as discussed last time, this uses native threads + PyMI
13:28:20 <kvinod> ok
13:28:30 <alexpilotti> in short, PyMI allows real multithreading to work
13:28:36 <alexpilotti> unlike the old WMI
13:29:16 <alexpilotti> this way real threads on Python work almost as in a native context
13:29:23 <claudiub> anyways, the patch will require latest pymi version
13:29:29 <claudiub> master version atm.
13:29:30 <kvinod> ok, in that case I would be interested to try that out on my setup (native thread + PyMI)
13:29:45 <alexpilotti> kvinod: that'd be great
13:30:12 <alexpilotti> claudiub: can you pass to kvinod the patch to test?
13:30:15 <sagar_nikam> kvinod: you wanted to discuss some test results ..
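(Editor's note: a minimal sketch of the native-thread worker model discussed above, using Python's standard thread pool. The `apply_port_acls` function and port IDs are hypothetical placeholders, not the actual networking-hyperv code, which lives in the patch linked below.)

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the per-port security-group rule work
# that networking-hyperv parallelizes; the real code calls into
# os-win / PyMI here.
def apply_port_acls(port_id):
    return "port-%s: ACLs applied" % port_id

ports = range(10)

# 5 native worker threads, as in claudiub's 4-core test. The point of
# PyMI is that it releases the GIL during MI calls, so these threads
# can actually run concurrently, unlike with the old WMI module.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(apply_port_acls, ports))

print(len(results))  # -> 10
```

This also illustrates why the speedup flattens: past a certain worker count, the bottleneck moves from Python to vmms serializing the requests, which is exactly what the proposed time-vs-workers graph would show.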
13:30:24 <kvinod> claudiub: can you please share the details to consume the same
13:30:27 <sagar_nikam> what was it using ?
13:30:44 <claudiub> already passed the patch a little bit earlier
13:31:01 <alexpilotti> this branch includes also a bunch of additional patches that improve other areas
13:31:20 <alexpilotti> so if you guys help on testing, it's much appreciated, as usual
13:31:27 <alexpilotti> any other questions?
13:31:41 <claudiub> kvinod: so, checkout the commit, install it, make sure you have the pymi master and it should be fine.
13:31:49 <kvinod> ok, will be good if someone can share the consolidated list of patches that I can apply
13:32:03 <claudiub> shouldn't require anything else.
13:32:51 <kvinod> claudiub: also wanted to know how to install and use/enable PyMI
13:33:02 <alexpilotti> kvinod: pip install pymi
13:33:07 <kvinod> ok
13:33:14 <alexpilotti> kvinod: as easy as that
13:33:22 <alexpilotti> ok, next:
13:33:35 <kvinod> and how about the list of patch sets?
13:34:02 <alexpilotti> claudiub: the patch you listed depends on the previous ones?
13:34:26 <alexpilotti> I still have trouble finding it in the new Gerrit UI :-)
13:34:32 <claudiub> it does, but a checkout will take the whole branch.
13:34:48 <claudiub> git fetch http://review.openstack.org/openstack/networking-hyperv refs/changes/35/264235/10 && git checkout FETCH_HEAD
13:35:08 <claudiub> so, this will also take the previous commits.
13:35:08 <alexpilotti> perfect
13:35:12 <alexpilotti> moving on then!
13:35:21 <alexpilotti> #topic "Performance"
13:35:31 <alexpilotti> kvinod: the stage is yours :)
13:35:45 <kvinod> ok
13:36:24 <kvinod> I would like to introduce Thala who does the scale testing
13:36:34 <kvinod> He is in IRC now
13:37:02 <Thala> Hello All,
13:37:09 <claudiub> hello
13:37:10 <kvinod> So, we downloaded the MSI from cloudbase website and used the same to bring up the Computes
13:37:40 <alexpilotti> kvinod: what MSI?
13:37:43 <kvinod> These were our test details
13:37:46 <alexpilotti> liberty, master, etc?
13:37:55 <kvinod> The installer file
13:38:05 <kvinod> Overall VMs targeted: 1k; user concurrency: 10; VMs per project: 100; networks per project: 1; security group used: default; image used to spawn VMs: Fedora; DHCP timeout for VMs: 60 seconds
13:38:06 <alexpilotti> kvinod: there are different MSIs, one for each version
13:38:26 <alexpilotti> kvinod: did you add PyMI?
13:38:40 <kvinod> No that is something we did not do
13:38:58 <kvinod> So I feel I could have done that
13:39:05 <alexpilotti> I still need to know what MSI you used
13:39:11 <kvinod> We will give one more run with PyMI
13:39:15 <kvinod> ok
13:39:17 <sagar_nikam> kvinod: you will need to use mitaka MSI
13:39:44 <kvinod> so we saw around 37% nova failure
13:39:45 <alexpilotti> the mitaka one (aka the dev / beta one) is just a nightly build from master
13:40:05 <alexpilotti> the current production one is the liberty one
13:40:21 <kvinod> i.e. out of 1000 VMs, 37% went into error state
13:40:40 <alexpilotti> kvinod: that's quite impossible on liberty
13:40:48 <kvinod> Thala, could you please tell us which installer you downloaded
13:41:17 <kvinod> sorry, but we actually saw that happening
13:41:20 <alexpilotti> while the master one (mitaka) is just for dev purposes, so it's not relevant for performance discussion
13:41:47 <kvinod> Thala: could you please confirm which installer you used
13:42:38 <kvinod> Also around 37% failure for neutron as well
13:42:57 <claudiub> I'm curious to see the nova-compute logs on the hyper-v nodes.
13:43:14 <Thala> vinod: package created on 1/8/2016
13:43:29 <alexpilotti> Thala: that's master
13:43:42 <kvinod> out of the successfully booted VMs only 408 got an IP assigned
13:43:43 <alexpilotti> so all the work you did is useless, sorry to tell you
13:44:19 <claudiub> is the msi name something like "HyperVNovaCompute_Beta.msi"?
13:44:37 <sagar_nikam> alexpilotti: since pyMI can also work with Liberty, you are suggesting thala: use liberty MSI ?
13:44:41 <kvinod> ok, then do you suggest to bring up the setup with liberty and install PyMI and then run the test
13:45:12 <alexpilotti> sagar_nikam: definitely. There's no reason to do any performance testing on development nightly builds
13:45:34 <sagar_nikam> alexpilotti: agree
13:46:14 <sagar_nikam> thala: will you be able to test scale with and without pymi using liberty MSI ?
13:46:15 <kvinod> fine this run we will try with stable liberty with PyMI
13:46:22 <alexpilotti> we will start doing performance tests on Mitaka after M3
13:46:27 <sagar_nikam> then we can compare the results
13:46:37 <alexpilotti> that'd be great
13:46:41 <sagar_nikam> i mean with and without pymi
13:46:52 <sagar_nikam> thala: is it possible ?
13:47:03 <Thala> sagar_nikam: targeting 1k VMs, will do that.
13:47:04 <alexpilotti> remember to do a "pip install pymi", as Liberty does not come with pymi (yet)!
13:47:38 <sagar_nikam> thala: so you need to run tests twice, may be on the same set of hyperv hosts
13:47:51 <alexpilotti> Thala: in the meantime, from a development standpoint, seeing the logs you collected could help
13:48:20 <alexpilotti> Thala: do you think you could send some of them to claudiub?
13:48:21 <claudiub> agreed
13:48:59 <claudiub> also, one detail that was missed, how many compute nodes?
13:49:02 <Thala> sagar_nikam: I will send a few compute logs to claudiub
13:49:10 <alexpilotti> Thala: tx!
13:49:51 <Thala> but remember I have not enabled debug, no idea what dev will get, better I will run with debug enabled and update you folks
13:50:16 <claudiub> errors should still be visible in the logs.
13:51:06 <Thala> claudiub: fine, those errors I can share.
13:51:16 <alexpilotti> Thala: great
13:51:19 <claudiub> also, it might be interesting to test with abalutoiu's associators improvements. that should shave off a lot of execution time.
13:51:44 <alexpilotti> claudiub: yeah but that requires mitaka ATM
13:52:00 <alexpilotti> we still have to backport to liberty
13:52:46 <sagar_nikam> is that there in os-win now ... in mitaka
13:52:55 <sagar_nikam> the associators patch
13:52:57 <alexpilotti> sagar_nikam: yep
13:53:05 <sagar_nikam> ok
13:53:24 <alexpilotti> talking about performance, claudiub just mentioned a set of patches that abalutoiu is working on
13:53:41 <alexpilotti> we identified a few areas that can provide a significant performance boost
13:53:54 <alexpilotti> another 20-30% in current tests
13:54:36 <alexpilotti> based on current Rally results, this brings us much closer to KVM, performance wise
13:54:44 <alexpilotti> last topic:
13:54:53 <abalutoiu> I just finished a test on liberty with the patch that I'm currently working on and it seems to be a ~30% performance boost compared
13:55:00 <alexpilotti> #topic networking-hyperv VLAN bug
13:55:09 <alexpilotti> claudiu wanna say something?
13:55:19 <alexpilotti> this is the "false" binding
13:55:38 <claudiub> yeah. so, it seems that there's a small chance that the vlan is not bound to a port, even if Hyper-V says it was bound
13:55:56 <claudiub> the result returned by Hyper-V being positive
13:56:22 <claudiub> there's a fix for this, but it will draw back a bit of the performance
13:56:48 <claudiub> but we should prefer reliability over performance
13:57:08 <claudiub> https://review.openstack.org/#/c/265728/
13:57:13 <claudiub> the fix
13:57:15 <alexpilotti> to put this in context, this is apparently a Hyper-V race condition bug, that seems to happen every 50-100 bindings
13:57:49 <sagar_nikam> claudiub: is the issue present in Liberty ?
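(Editor's note: a minimal sketch of the verify-after-set approach the fix takes — trading a little performance for reliability by re-reading the binding after Hyper-V reports success, and retrying if it did not stick. All names here are hypothetical stand-ins, not the actual networking-hyperv code in the linked patch.)

```python
import time

# Hypothetical in-memory stand-in for the Hyper-V port/VLAN state;
# the real code talks to vmms through os-win / PyMI.
_bindings = {}

def set_vlan_id(port, vlan_id):
    # Simulates the call that reports success but may silently
    # fail to take effect under the race condition.
    _bindings[port] = vlan_id

def get_vlan_id(port):
    return _bindings.get(port)

def bind_vlan_reliably(port, vlan_id, retries=3, delay=0.0):
    """Set the VLAN, verify it actually stuck, and retry if not."""
    for _ in range(retries):
        set_vlan_id(port, vlan_id)
        # The extra read-back is the performance cost of the fix.
        if get_vlan_id(port) == vlan_id:
            return True
        time.sleep(delay)
    return False
```

Since the race reportedly hits only every 50-100 bindings, the verification read usually succeeds on the first pass, keeping the overhead modest.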
13:57:54 <alexpilotti> it's very annoying as WMI reports that the port binding was successful, while actually it didn't happen
13:57:57 <claudiub> after a few rally runs, I've managed to capture this: http://paste.openstack.org/show/483347/
13:58:03 <sagar_nikam> we can check if thala: tests hit the issue
13:58:13 <kvinod> claudiub: shall we take this fix as well when we do the testing
13:58:20 <kvinod> ?
13:58:26 <alexpilotti> it happens on any Hyper-V version we are testing
13:58:35 <alexpilotti> independently from the OpenStack release
13:58:40 <claudiub> yeah, it's hyper-v related
13:58:47 <sagar_nikam> ok
13:59:02 <sagar_nikam> then do you want thala: to include that patch
13:59:02 <alexpilotti> 2' to go
13:59:27 <alexpilotti> anything else to mention before closing?
13:59:34 <alexpilotti> sagar_nikam claudiub ^
13:59:40 <claudiub> nope
13:59:44 <sagar_nikam> alexpilotti: before we finish, FC patch failed jenkins https://review.openstack.org/#/c/258617/
13:59:50 <sagar_nikam> just saw it now
14:00:11 <claudiub> probably random failure?
14:00:14 <alexpilotti> lpetrut: there's a Python 3.4 failure
14:00:21 <alexpilotti> looks transient
14:00:22 <claudiub> py3 in nova is a bit... funky
14:00:28 <alexpilotti> lpetrut: can you issue a recheck?
14:00:38 <sagar_nikam> gate-nova-python34
14:00:41 <alexpilotti> 2.7 passed
14:00:49 <alexpilotti> time's up
14:00:53 <alexpilotti> #endmeeting