05:31:15 <anil_rao> #startmeeting taas
05:31:17 <openstack> Meeting started Wed Sep 7 05:31:15 2016 UTC and is due to finish in 60 minutes. The chair is anil_rao. Information about MeetBot at http://wiki.debian.org/MeetBot.
05:31:18 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
05:31:20 <openstack> The meeting name has been set to 'taas'
05:31:23 <soichi> hi
05:31:26 <kaz> hi
05:31:27 <anil_rao> Hi
05:32:04 <anil_rao> #topic Performance measurement (progress report)
05:33:10 <soichi> #link: https://wiki.openstack.org/w/images/2/22/Increasing_Source_VMs-20160906.png
05:33:14 <kaz> I uploaded a document about performance measurement last week.
05:33:35 <kaz> I guess the cause is that softirq handling was concentrated on one physical CPU.
05:34:28 <kaz> So we tried to balance softirqs among the CPUs.
05:34:58 <anil_rao> kaz: That is interesting to see.
05:35:23 <kaz> It seems that the received packet counts without mirroring have increased.
05:35:31 <kaz> I guess that is because of the softirq balancing.
05:37:10 <anil_rao> Can you please clarify what you mean by "when source VMs are increased"?
05:38:17 <kaz> Please see last week's slide, page 1.
05:38:43 <soichi> #link: https://wiki.openstack.org/w/images/7/74/Increasing_Source_VMs-20160831.png
05:39:24 <kaz> It means that the number of source VMs is increasing while the number of destination VMs and monitor VMs is fixed.
05:39:51 <anil_rao> Thanks.
05:40:13 <vnyyad> kaz: So overall the total number of flows being monitored is increased.
05:40:39 <kaz> yes
05:40:51 <anil_rao> So, if I am reading this right, when mirroring is enabled, we halve the throughput of the receiving VM but send at the same rate to the monitor VM.
05:41:57 <kaz> yes, i think so
05:42:00 <vnyyad> anil_rao: looks like
05:44:29 <soichi> Last week, I got several valuable comments from anil:
05:44:42 <soichi> 1) it is better to also measure the TCP case
05:45:37 <soichi> 2) it is better to use the SIPp benchmark, too
05:46:29 <anil_rao> soichi: I am not sure if those cases are better, but they would serve to highlight other aspects. :-)
05:46:48 <soichi> okay, i see
05:48:38 <anil_rao> kaz: I did not fully understand the last (2nd) bullet item below the graph in slide #4.
05:49:34 <anil_rao> Compared to the results in last week's graph, both cases (with and without mirroring) have improved after IRQ balancing.
05:55:22 <anil_rao> Without mirroring, the receiving VM is getting between 200K and 250K pps. With mirroring it gets between 100K and 130K pps, but the monitor VM also receives at the same rate.
05:56:17 <anil_rao> Both the receiving VM and the monitor VM are on the same host, so we are essentially dealing with the same volume of traffic, split between 2 VMs.
05:56:42 <anil_rao> Shouldn't this be expected?
05:57:30 <vnyyad> +1
05:58:24 <kaz> Sorry, I don't know why yet.
05:59:52 <anil_rao> What iperf is doing is maxing out the bandwidth (for any given packet size). So once we have reached that point without mirroring and then turn on mirroring, we can expect the performance to be halved.
06:00:25 <soichi> +1
06:00:57 <anil_rao> It would be interesting to drive the receiving VM at, say, 100K pps without mirroring and then turn on mirroring. In that case we should expect no change in the performance of the receiving VM.
06:01:30 <soichi> i think so, too
06:01:37 <anil_rao> The receiving VM should continue to get 100K pps, but the monitor VM should get the same rate too.
06:01:50 <soichi> sure
06:02:15 <kaz> I will try.
06:02:35 <anil_rao> Thanks kaz.
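(Editor's note: a minimal sketch of how the fixed-rate test proposed above might be driven. It is not from the meeting; the receiver address, packet size, and target rate are placeholders, and it assumes iperf 2.x in UDP mode with a bandwidth cap, as in the earlier measurements. The intent is to offer a load well below the measured saturation point, then rerun with mirroring enabled and compare the receiving VM's and monitor VM's rates.)

```python
#!/usr/bin/env python
# Sketch only: drive the receiving VM at a fixed packet rate with iperf (2.x)
# so that mirroring overhead can be observed without saturating the host.
import subprocess

TARGET_PPS = 100_000       # well under half of the ~250K pps saturation rate
PAYLOAD_BYTES = 64         # UDP payload size for the run (placeholder)
DURATION_SEC = 30
RECEIVER_IP = "10.0.0.11"  # receiving VM (placeholder)

def target_bandwidth_bps(pps, payload_bytes):
    """Convert a packets-per-second target into an iperf -b value (payload bits/s)."""
    return pps * payload_bytes * 8

def run_iperf_udp(server_ip, bandwidth_bps, duration, payload_bytes):
    """Offer a fixed UDP load with iperf 2.x and return the client-side report."""
    cmd = [
        "iperf", "-c", server_ip, "-u",
        "-b", str(bandwidth_bps),
        "-l", str(payload_bytes),
        "-t", str(duration),
    ]
    return subprocess.run(cmd, capture_output=True, text=True).stdout

if __name__ == "__main__":
    bw = target_bandwidth_bps(TARGET_PPS, PAYLOAD_BYTES)
    print("Offering %d pps (~%.1f Mbit/s payload) for %ds" % (TARGET_PPS, bw / 1e6, DURATION_SEC))
    # Run once with mirroring disabled, then enable the TaaS tap flow and rerun;
    # the receiving VM's rate should stay roughly unchanged while the monitor VM
    # sees about the same rate in the second run.
    print(run_iperf_udp(RECEIVER_IP, bw, DURATION_SEC, PAYLOAD_BYTES))
```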
06:02:44 <vnyyad> thanks
06:03:38 <anil_rao> Looking at last week's results, I see the same behavior there too. I.e., even without IRQ balancing, when mirroring was turned on, the receiving VM + monitor VM together were getting the same rate as just the receiving VM without mirroring.
06:04:07 <anil_rao> IRQ balancing has definitely helped improve the overall host throughput.
06:04:30 <kaz> yes
06:04:37 <vnyyad> yeah
06:05:12 <anil_rao> These are good results!
06:05:31 <kaz> thank you
06:06:28 <anil_rao> To actually demonstrate the overhead of monitoring, it might be better not to saturate the system's bandwidth limit, i.e., we keep enough headroom for the extra volume generated by mirroring. This way we should be able to show that mirroring doesn't affect the receiving VM (or at least that is the goal).
06:06:28 <reedip> hi
06:07:00 <anil_rao> reedip: Hi
06:07:05 <soichi> reedip: hi
06:07:13 <reedip> sorry, was late, reading up the logs
06:08:03 <soichi> anil_rao: agree
06:08:14 <kaz> anil_rao: +1
06:09:40 <anil_rao> Here is a proposal for the test:
06:09:54 <anil_rao> Compute the highest throughput for the receiving VM.
06:10:17 <anil_rao> Send at less than half that rate to the receiving VM (for multiple source VMs).
06:10:24 <anil_rao> Enable mirroring.
06:10:47 <anil_rao> See the difference in the rate at the receiving VM and the monitor VM.
06:11:01 <anil_rao> Expected result: no change to the receiving VM; the same rate at the monitor VM.
06:11:23 <anil_rao> In reality there might be a little difference, and we should report that.
06:12:15 <kaz> OK, i will try that.
06:12:52 <anil_rao> Thanks kaz. I look forward to the results.
06:12:54 <soichi> I guess we can see an increase in CPU usage on the host
06:13:21 <soichi> after enabling mirroring.
06:13:38 <anil_rao> soichi: Yes. That would be nice to measure.
06:13:43 <kaz> soichi: I think so
06:14:00 <anil_rao> If we don't hit 100% we get the true overhead; otherwise the overhead is clipped and we don't get a worthwhile result.
06:14:19 <soichi> +1
06:14:39 <kaz> I agree
06:15:10 <anil_rao> If folks are interested, we can discuss the TaaS bug related to ingress-side mirroring.
06:15:24 <kaz> sure
06:15:24 <vnyyad> anil_rao: +1
06:15:28 <soichi> +1
06:15:36 <anil_rao> #topic Open Discussion
06:16:18 <anil_rao> I had sent out a mail to the Neutron mailing list with a detailed description of the problem, the root cause, and a proposal to move forward.
06:17:10 <anil_rao> In summary, given the way OVS treats VLAN-tagged ports on a host, we don't have any options left for solutions completely within the scope of TaaS.
06:18:00 <anil_rao> We will need the core Neutron OVS driver to explicitly tag VLAN ids for packets coming into br-int from the 'instance' ports.
06:18:15 <soichi> +1
06:18:47 <anil_rao> I am prototyping this solution and will report back to the mailing list when I have a working version.
06:18:59 <vnyyad> anil_rao: Can we handle this specific case by not forwarding the mirrored traffic to br-tap but handling it in br-int?
06:19:25 <vnyyad> It would be a crude solution but might work.
06:20:03 <anil_rao> vnyyad: We cannot avoid forwarding to br-tap because the mirror destination may be on a different host.
06:20:30 <anil_rao> Here is the basic problem.
06:20:36 <vnyyad> hmmm... yes, true, realized it...
06:20:46 <anil_rao> OVS does not tag packets flowing within the same host's br-int.
06:21:12 <anil_rao> Neutron specifies that port MACs are unique only within a network.
06:21:39 <anil_rao> This means that it is (technically) possible for two ports on different networks but on the same host to have the same MAC.
06:21:50 <vnyyad> yes
06:22:15 <anil_rao> If these two networks belong to different tenants, TaaS would have really broken tenant isolation, because we would leak traffic of one tenant to another.
06:22:34 <soichi> yes
06:22:40 <kaz> +1
06:22:49 <vnyyad> yes... a thin chance of happening, but nevertheless it can happen
06:23:06 <anil_rao> vnyyad: Yes. :-(
06:24:20 <anil_rao> My prototype involves having the Neutron driver explicitly add a VLAN tag (corresponding to the port) for all packets coming in via that port. After that, TaaS works without any modification.
06:24:45 <soichi> +1
06:25:08 <anil_rao> This way our current solution for broadcast/multicast ingress traffic also works as is.
06:25:34 <vnyyad> Any rationale why they don't tag? Maybe it's an optimization.
06:25:34 <soichi> it sounds good
06:25:45 <vnyyad> but this solution should be good to have
06:26:23 <soichi> i have another topic
06:26:29 <anil_rao> When OVS works in normal mode it operates as a legacy switch and just keeps track of ports and tags internally, without having to actually tag packets.
06:27:00 <anil_rao> Neutron has also set br-int in legacy (or normal) mode for its typical operation. So everything seemed good until now.
06:27:21 <vnyyad> ok
06:27:29 <anil_rao> We are the first application that is trying to detect packets ingressing a VM's vNIC in br-int. However, I am sure there will be others soon.
06:27:43 <vnyyad> for sure
06:28:37 <soichi> anil_rao: Would you please submit to the vBrownBag Tech Talks at the Barcelona Summit? The submission deadline is Sep. 15th.
06:28:38 <anil_rao> Looks like we are running out of time. Any other topics?
06:28:38 <soichi> In my understanding, the speakers will be anil, kaz, and reedip (3 min. each?).
06:28:53 <anil_rao> soichi: I will do that tomorrow morning.
06:29:03 <soichi> okay, thank you
06:30:32 <anil_rao> We'll continue the discussion next week.
06:30:38 <soichi> bye
06:30:42 <kaz> bye
06:30:43 <anil_rao> #endmeeting
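(Editor's note: an illustrative sketch of the kind of br-int flow that "explicitly tag packets coming in from an instance port", as discussed in the Open Discussion above, could translate to. This is not anil_rao's actual prototype; the bridge name, OpenFlow port number, and local VLAN id are placeholders, and the flow is assumed to be installed alongside Neutron's existing NORMAL-mode forwarding.)

```python
#!/usr/bin/env python
# Illustrative sketch only: tag untagged traffic arriving on a VM's vNIC port
# with the local VLAN id of its network, so flows on br-int (e.g. TaaS mirror
# rules) can distinguish ports that happen to share a MAC across networks.
import subprocess

BRIDGE = "br-int"
INSTANCE_OFPORT = 42   # OpenFlow port number of the VM's vNIC (placeholder)
LOCAL_VLAN = 5         # local VLAN id Neutron assigned to the port's network (placeholder)

def tag_ingress_from_port(bridge, ofport, vlan_id):
    """Add a flow that pushes the local VLAN tag on untagged packets from this port."""
    flow = (
        "priority=25,in_port={port},vlan_tci=0x0000,"
        "actions=mod_vlan_vid:{vlan},normal"
    ).format(port=ofport, vlan=vlan_id)
    subprocess.check_call(["ovs-ofctl", "add-flow", bridge, flow])

if __name__ == "__main__":
    tag_ingress_from_port(BRIDGE, INSTANCE_OFPORT, LOCAL_VLAN)
```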