13:00:58 #startmeeting hyper-v
13:00:59 Meeting started Wed Jan 6 13:00:58 2016 UTC and is due to finish in 60 minutes. The chair is alexpilo_. Information about MeetBot at http://wiki.debian.org/MeetBot.
13:01:00 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:01:03 The meeting name has been set to 'hyper_v'
13:01:14 Morning!
13:01:17 morning!
13:01:20 o/
13:01:27 Hi All, Happy new year !!!!
13:01:35 happy new year. :)
13:01:43 Happy New Year!! :-)
13:02:08 o/
13:02:27 up your "hands" folks, let's see who we have today
13:02:38 lpetrut?
13:02:59 sagar_nikam: is sonu joining us?
13:02:59 o/
13:03:10 o/
13:03:33 yes
13:03:43 he said he will join
13:03:54 o/
13:04:01 he sent a patch yesterday, which we reviewed quickly so we could discuss it today
13:04:09 sure
13:04:13 but we can start with the other topics
13:04:20 kvinod: Sonu joining ?
13:04:21 #topic FC support
13:04:39 lpetrut: any updates?
13:04:59 Sonu will join
13:05:02 yep, so, all the os-win FC related patches have merged
13:05:42 the CI and unit tests should be all green on the nova patches once we do an os-win version bump, so we should try to get those merged as soon as possible
13:06:17 sweet
13:06:26 sagar: have you guys looked over those patches?
13:06:38 sagar: it would be great if you could give it a try in your environment
13:06:45 sure
13:06:56 i will have somebody check it
13:07:21 can you add me as reviewer
13:07:23 thanks a lot, please let me know how it works for you
13:07:24 sure
13:08:55 #link https://review.openstack.org/#/c/258617/
13:08:57 I know this is unrelated to FC, but here's the iSCSI refactoring patch: https://review.openstack.org/#/c/249291/ it's currently marked as WIP because we're working on the unit tests, but it would be great if you guys could give us some feedback on this as well
13:09:29 most probably we'll get this merged today
13:09:30 there's still a -1 by walter here
13:09:40 does anybody know his IRC nick?
13:10:01 hemna, afaik
13:10:10 yep, basically, he requested a bit more info to be provided, such as whether multipath should be used, and the os_type and platform
13:10:12 yep
13:10:12 lpetrut: this has the fix for CHAP ?
13:10:12 he's not on this channel anyway
13:10:35 lpetrut: let's stick to FC
13:10:35 ok
13:10:40 hemna is on the #openstack-nova channel.
13:10:46 we'll switch topics to iSCSI soon
13:10:47 my bad, sure
13:10:49 and the fix for rescan ?
13:10:57 sagar_nikam: ^ :-)
13:11:02 ok
13:11:34 I guess I'm a bit over-enthusiastic about this one :)
13:11:48 any other questions related to FC?
13:12:00 there are no HP reviews on the Nova FC patches
13:12:02 not from my side
13:12:10 only the last one in the chain
13:12:14 kurt ?
13:12:18 we need reviews on all of them:
13:12:31 #link https://review.openstack.org/#/c/258614
13:12:32 i thought he did the reviews
13:12:43 #link https://review.openstack.org/#/c/258615
13:12:52 #link https://review.openstack.org/#/c/260980
13:12:55 just a side note: I'll add the vHBA support ASAP
13:13:07 just a quick note on vHBA
13:13:33 we have two separate options, passthrough and vHBA
13:14:06 vHBA is a more flexible feature and easier to implement, as it doesn't require all the hassle required by passthrough
13:14:15 especially for live migration
13:14:17 yes, I remember, and i think we decided to implement passthrough first
13:14:25 but it has some very hard limitations:
13:14:34 1) no boot from volume
13:14:53 2) guest OS support requirement
13:15:00 so we'll implement both, maybe passthrough as default, letting the user opt for vHBA by setting the bus type to FC
13:15:11 so for this reason it will be implemented separately as soon as the current patches merge
13:15:37 ok
13:15:39 I'm just recapping this here to make sure we keep the focus on passthrough
13:16:04 i think even kurt had suggested passthrough... if i remember right
13:16:15 for the rest, priority is on getting Cloudbase and HP +1s
13:16:34 so that we can move the patches to the Nova etherpad queue
13:16:42 alexpilo_: we have 3 patches ?
13:16:48 that need review ?
13:16:53 4
13:17:17 the last one is the first link I posted (the one with hemna's review)
13:17:25 614, 615 and 980
13:17:40 ok
13:17:40 258617
13:18:00 anything else you'd like to add on FC?
13:18:08 sagar_nikam, lpetrut
13:18:16 not on my side
13:18:30 no, we will start the review, i think we will need Kurt to review all 4
13:18:39 sweet, tx!
13:18:42 will let him know
13:18:55 #topic iSCSI 3par support
13:19:09 now is the time to get wild on iSCSI :-)
13:19:23 buckwild
13:19:24 there are 2 issues that we have seen with 3par iSCSI: CHAP and rescan
13:19:29 heh, so, LUN rescanning is in place now
13:19:48 lpetrut: already merged ?
13:19:58 LUN rescan
13:20:03 nope, it's the patch I mentioned before
13:20:22 passing CHAP credentials when logging in to portals: not yet. As this was causing issues with other backends, I wanted to test this first.
13:20:43 sagar_nikam: this brings up the topic we discussed on Monday
13:21:00 using the FC 3par array for iSCSI testing as well
13:21:07 *nod*
13:21:30 We need HBAs for the array
13:21:45 so, as soon as we have the additional HW in place, we will start testing this ASAP
13:22:17 primeministerp would you like to add something here?
13:22:22 sagar_nikam, someone on your side was looking into that
13:23:37 basically we're going to need the HBA to add the iSCSI functionality to the array
13:23:49 we may need additional licensing as well
13:24:21 lpetrut: did you see kmbharath's patch
13:24:25 on CHAP
13:24:33 he had a fix for it many months back
13:24:46 hmmm
13:25:05 hopefully he'll reconnect
13:25:32 yep, but I was wondering whether we can save time by not logging in to the portal twice (once without CHAP creds, once with CHAP creds), and maybe use a flag on the Cinder side, like "portals_requires_chap_auth" or something similar
13:25:33 i am connected
13:25:45 ;)
13:26:00 kmbharath: your comments ?
13:26:11 on lpetrut's suggestion
13:26:12 we could just push this into the volume connection info
13:26:25 sagar_nikam_: as previously discussed, we need to ensure that this patch won't cause issues for other backends
13:26:33 Yes, agreed, if we have a flag and do it that way, it would be better
13:26:43 as the 3par one seems to be the only one with this requirement
13:26:51 ok
13:26:56 we had tested it on HP LeftHand Networks and 3par earlier
13:27:02 the option is probably the only way to do that
13:27:08 great, do you know by any chance which backends require this? Is it just 3PAR? Do all 3PAR backends require this?
13:27:21 but:
13:27:30 from our tests, only 3par
13:27:44 what if we have 2 backends, e.g. a 3par and another 3rd party one?
13:27:47 LHN/VSA worked without any change
13:27:57 one expecting portal logins and the other one failing?
13:28:22 the flag won't help, as it would force login on all of them
13:28:31 no, why?
13:28:43 because the other backend would fail
13:29:05 an option (I believe suggested by lpetrut) would be to try the login and silently continue if it fails
13:29:29 the other backend would not set this flag, and we would not use CHAP creds when logging in to the portal, so it should not fail
13:29:34 please correct me if I got something wrong
13:29:52 can't we do a check for the backend type....
13:29:55 that would require changes on the cinder side
13:30:09 because from what we had seen, it's only 3par that needs the portal login
13:30:34 lpetrut: who will set the flag ? every cinder driver ?
13:30:51 this can be optional, but the driver would set it when providing the connection info
13:30:59 or we could simply have a list of backends in the Nova driver with the drivers requiring portal login
13:31:08 also, the connection info does not include the backend type at the moment
13:31:21 by making it an option, this could be configurable
13:31:35 lpetrut: d'oh, that's a blocker
13:31:41 alexpilo_: this option looks better
13:32:15 lpetrut sagar_nikam_: let's bring this back to the whiteboard and sync again next week
13:32:19 having a list of backends in nova is better than cinder sending it in connection_info
13:32:31 the target_iqn in connection info could help us to identify the backend
13:32:52 kmbharath: umm, is this reliable enough?
13:33:23 alexpilo_: ok, we can talk about this later so that we don't block the meeting on this
13:33:29 lpetrut: i think the iqn had 3par in it
13:33:41 ok, let's move to the next topic
13:33:46 sagar_nikam: sure, just wanted to make sure that this happens all the time
13:33:54 can we have some networking discussion
13:33:56 yes, we had seen it every time
13:34:04 sonu: is here i think
13:34:07 did sonu join?
13:34:17 I am listening..
13:34:21 i saw him joining
13:34:24 great
13:34:39 sagar_nikam: yes
13:34:44 #topic SGR RPC patch
13:34:55 * alexpilo_ fetches link...
13:35:27 #link https://review.openstack.org/#/c/263865/
13:35:41 first, thanks Sonu for the patch
13:36:07 did you see claudiub's review?
13:36:22 I am reviewing the same.
13:36:33 Thanks for the comments. I will fix them and re-post
13:36:34 we prioritized it right away to be sure we could talk about it today
13:36:45 Thanks for that
13:36:52 claudiub Sonu: anything to add?
13:37:10 This patch is dependent on https://review.openstack.org/#/c/240577/
13:37:31 there was a bug in the base security groups driver, which I have fixed; the review is in progress.
13:37:36 could you then add a Depends-On: here?
13:37:43 btw we need a BP for this
13:37:43 I will mark the dependency
13:37:53 cool ty. :)
13:37:55 Got it. I will work on the same.
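To make the CHAP portal-login idea from the iSCSI topic above more concrete, here is a minimal sketch of how the flag lpetrut proposed could be consumed on the consumer side, assuming it is passed through the volume connection_info as suggested. The "portals_requires_chap_auth" name comes from the discussion; the helper function, the surrounding structure and the example values are hypothetical and do not reflect the actual patch.

```python
# Hypothetical sketch: deciding whether to pass CHAP credentials when
# logging in to an iSCSI portal, based on an optional flag provided by
# the Cinder driver in the volume connection_info.


def get_portal_login_auth(connection_info):
    """Return (user, password) to use for the portal login, or None.

    'portals_requires_chap_auth' is the flag proposed in the meeting;
    drivers that do not set it keep the current behaviour (no CHAP
    credentials at portal login time), so other backends are unaffected.
    """
    data = connection_info.get('data', {})
    if not data.get('portals_requires_chap_auth', False):
        return None
    if data.get('auth_method', '').upper() != 'CHAP':
        return None
    return data['auth_username'], data['auth_password']


# Example connection_info, roughly as a 3PAR-style backend might provide
# it (all values are made up for illustration):
example = {
    'driver_volume_type': 'iscsi',
    'data': {
        'target_portal': '10.0.0.10:3260',
        'target_iqn': 'iqn.2000-05.com.3pardata:example',
        'auth_method': 'CHAP',
        'auth_username': 'chap_user',
        'auth_password': 'chap_secret',
        'portals_requires_chap_auth': True,
    },
}

print(get_portal_login_auth(example))  # -> ('chap_user', 'chap_secret')
```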
13:38:17 sweet
13:38:28 moving to a broader topic:
13:38:38 #topic networking-hyperv improvements
13:39:00 let me share some of the design aspects related to this agent
13:39:11 there are a few improvements going on
13:39:57 we already talked about this a while back, when we talked about the multiprocessing patches that HP sent for SGR
13:40:22 we are currently finalizing the results, which will turn into a BP
13:40:41 1) drop the port discovery loop and replace it with WMI events
13:41:09 2) use PyMI
13:41:31 3) replace associator queries with direct WQL queries
13:41:58 4) parallelize all the things :)
13:42:28 this includes in particular the ACLs, which are a big bottleneck
13:42:55 alexpilo_ : are you talking about a new approach to multiprocessing
13:43:00 unlike the Juno patch that HP is using, this doesn't require multiprocessing
13:43:27 is this blueprint on multiprocessing going to be different from what we posted?
13:43:30 PyMI (unlike the old WMI + pywin32) is designed to work with multiple threads
13:43:55 kvinod it's quite different, especially in the implementation
13:44:27 this is the reason why it has been kept on hold
13:44:45 ok, then are you saying that HP's patch sets are not required and will not get merged?
13:45:01 alexpilo_: how are the tests in a scale environment with this new approach ?
13:45:04 surely not in their current state
13:45:13 so, I've tested the patchsets you've sent and there are a couple of issues.
13:45:17 kvinod: ^
13:45:39 One of the biggest problems that we encountered at scale was too many port updates introducing delays in the processing of new port additions.
13:45:44 sagar_nikam_: we are testing at scale with Rally
13:45:52 there are 2 big issues atm: 1. logging doesn't work, apparently, I've just noticed 30 minutes ago
13:46:11 that's one reason we separated the addition of ports into different workers scheduled on another CPU
13:46:34 Sonu: you can just use threads for that, no need for multiple processes
13:46:35 with the workers patch, logging is only done to stdout, the neutron-hyperv-agent.log is empty
13:47:02 and second, it seems that the agents die randomly during Rally.
13:47:43 they freeze, failing to report the alive state, which leads to failures spawning VMs, as the neutron agents are considered dead and the neutron ports cannot be bound
13:48:24 hmm, that's news :)
13:48:25 claudiub : we already noticed the logging issue and we solved it by making the child process send a message to the parent process about port binding success or failure, and the parent process logs it to the log file
13:48:45 kvinod: that is unnecessary work
13:49:07 and thirdly, if there is any issue binding a port over and over again, the neutron-hyperv-agent process will consume the whole CPU.
13:49:48 alexpilo_ : we did it that way due to a limitation in the logging framework, as it does not work with multiprocessing
13:49:54 sorry for trimming the discussion, as we have only 10 minutes left
13:50:09 kvinod: yeah, i've seen that. if i start the process manually and watch the stdout, i can see what happens in the child processes, including traces and so on. but there's nothing in the log file.
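On the threads-versus-multiprocessing point being discussed here, the following is a minimal sketch of how port additions could be fanned out to a native thread pool inside a single process rather than to child processes. Because the workers share the process, standard logging reaches the agent log file without the parent/child message passing described above, and non-blocking WMI calls (as with PyMI) can run concurrently. All names below (bind_port, process_added_ports) are hypothetical and are not the agent's actual code.

```python
# Hypothetical sketch: processing port binding work with a native thread
# pool instead of multiprocessing. Worker threads share the process, so
# the standard logging setup works and no parent/child IPC is needed.
import logging
from concurrent.futures import ThreadPoolExecutor

LOG = logging.getLogger(__name__)


def bind_port(port_id):
    # Placeholder for the real work (VLAN settings, ACL rules, etc.),
    # which would call into the Hyper-V networking utils. If the
    # underlying calls release the GIL, the threads run concurrently.
    LOG.info("Binding port %s", port_id)
    return port_id


def process_added_ports(port_ids, max_workers=10):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(bind_port, p): p for p in port_ids}
        for future, port_id in futures.items():
            try:
                future.result()
            except Exception:
                # A failed binding is logged and left for a later retry,
                # rather than retried in a tight loop that pegs the CPU.
                LOG.exception("Failed to bind port %s", port_id)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    process_added_ports(["port-1", "port-2", "port-3"])
```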
13:50:36 the idea of parallel execution in the agent is of course the common goal here
13:50:48 Python's multiprocessing brings a lot of unnecessary drawbacks and there's no reason to use it
13:51:14 threads / green threads work perfectly fine as long as the underlying WMI calls are non-blocking
13:51:38 (otherwise we'd hit the GIL issue, which is I guess why you opted for multiprocessing)
13:52:14 yes, you got it right
13:52:36 the discussion is anyway much broader, which is why this BP is taking a while
13:52:39 i'm currently trying out native threads, to see how it goes with those.
13:52:52 Vinod had tried all such possibilities
13:52:57 Sonu: can you review the new approach once a patchset is available
13:53:13 and check how it will work
13:53:15 he can help you with his findings and observations
13:53:17 i haven't uploaded the native threads patch yet
13:53:23 based on our scale tests run on Juno
13:53:26 Sonu kvinod: that's why we wrote PyMI ;)
13:53:32 great
13:53:42 the main point here is that the ACL APIs are simply terrible and, no matter how you look at them, they don't scale
13:53:43 I will have a look at the new patch set and try it
13:54:02 claudiub: patchset not available yet ?
13:54:10 ok, please upload your patches
13:54:18 sagar_nikam_: native threads, not yet, still working on it.
13:54:23 so all this parallelization work is improving the situation a bit, but a more drastic approach will be needed
13:54:40 kvinod: expect the patches sometime next week
13:55:19 1) the OVS driver will become the preferred option as soon as conntrack is available on our Windows port
13:55:29 sonu: kvinod: you had seen issues with security groups as well
13:55:35 as this is required for SGR as well
13:55:51 have you seen the development - https://review.openstack.org/#/c/249337
13:56:05 2) we're evaluating a complete rewrite of the ACL WMI API
13:56:27 Sonu: yep, that's what I'm referring to
13:56:48 this is the OVS firewall being done once OVS conntrack is available
13:56:49 we need conntrack on Windows for that to work
13:57:09 which is our main goal for OVS 2.6
13:57:13 Sonu: could we have an offline discussion about the HP networking hardware that supports protocol acceleration? I want to see what it would take to add OVS and native vSwitch testing on the appropriate network HBAs; the ones we currently have in the HP 3Par CI only support accelerated iSCSI
13:57:13 got it.
13:57:34 3 minutes to go
13:57:47 sure primeministerp
13:57:55 Sagar mentioned it
13:58:01 primeministerp: sure, i will connect you and Sonu on this topic
13:58:02 Sonu, awesome
13:58:07 changing topic; if you'd like to continue with this topic, let's move to #openstack-hyper-v
13:58:10 thanks sagar_nikam_
13:58:17 #topic PyMI
13:58:32 so PyMI is feature complete for all OpenStack use cases
13:58:42 it has been tested under heavy load with Rally
13:58:45 that is very good news
13:58:59 woot
13:59:12 sagar_nikam_: do you think you could test it in your environments?
13:59:21 yes, we will
13:59:24 we tested kilo, liberty and mitaka (master)
13:59:36 i need to get some slots from the scale team to test this
13:59:58 all CIs are now switching to it, including nova, neutron, networking-hyperv, compute-hyperv and the cinder ones
14:00:08 most of our environments have also moved to Liberty
14:00:15 need to find Juno
14:00:20 for testing PyMI
14:00:30 perfect tx!
14:00:37 time's up! :)
14:00:40 #endmeeting
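As a closing note on the PyMI topic discussed above: PyMI is positioned as a drop-in replacement for the old WMI + pywin32 stack, keeping the familiar wmi module interface while being usable from multiple threads. Below is a minimal sketch of the usage pattern that layer is meant to preserve (Windows only; the namespace, class name and query are illustrative, and the exact compatibility surface should be checked against the PyMI documentation):

```python
# Hypothetical sketch: the classic wmi-module usage pattern that PyMI's
# compatibility layer aims to keep working, so the import stays the same.
import wmi

# Connect to the Hyper-V virtualization namespace (Windows host only).
conn = wmi.WMI(moniker='//./root/virtualization/v2')

# A direct WQL query, in the spirit of the agent improvement mentioned in
# the meeting (direct queries instead of walking associators).
for vm in conn.query(
        "SELECT * FROM Msvm_ComputerSystem "
        "WHERE Caption = 'Virtual Machine'"):
    print(vm.ElementName)
```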