05:34:25 <reedip> #startmeeting taas
05:34:26 <openstack> Meeting started Wed Jul 12 05:34:25 2017 UTC and is due to finish in 60 minutes. The chair is reedip. Information about MeetBot at http://wiki.debian.org/MeetBot.
05:34:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
05:34:29 <openstack> The meeting name has been set to 'taas'
05:34:38 <anil_rao> Hi
05:34:43 <kaz> hi
05:34:55 <reedip> #chair kaz anil_rao reedip
05:34:56 <openstack> Current chairs: anil_rao kaz reedip
05:35:06 <reedip> #link https://wiki.openstack.org/wiki/Meetings/taas
05:35:32 <reedip> #topic L2 Extension support
05:36:03 <anil_rao> Thanks for sending out the patch for review. I'll get back in a day or two.
05:36:21 <kaz> thank you
05:36:28 <reedip> hi kaz, thanks for the patch. I was unaware that the L2 extension was being created. I would abandon the one I created some time back :)
05:37:19 <kaz> sorry for the late announcement :)
05:38:03 <anil_rao> kaz: Have you noticed that the current TaaS does not work correctly with Ocata stable?
05:38:44 <reedip> kaz : please update https://review.openstack.org/#/c/454788/6/specs/stadium/pike/tap-as-a-service.rst@50 with the patch information for L2 extension
05:38:55 <kaz> anil_rao: no i do not
05:39:14 <reedip> anil_rao : ^^ above link is for the Project status for getting TaaS accepted into Neutron stadium
05:39:40 <anil_rao> kaz: I found that our TaaS flows in br-tun are getting wiped out. I am assuming this is by Neutron.
05:40:06 <anil_rao> reedip: Yes, I have seen the document you have uploaded. That is a great summary of the remaining work items.
05:40:30 <bennco1> @anil_rao evening, you mind sharing the link again please?
05:40:46 <reedip> @bennco1 : which link ???
05:41:00 <anil_rao> Hi bennco1
05:41:02 <bennco1> R: <reedip> anil_rao : ^^ above link is for the Project status for getting TaaS accepted into Neutron stadium
05:41:11 <reedip> https://review.openstack.org/#/c/454788/6/specs/stadium/pike/tap-as-a-service.rst
05:41:17 <bennco1> thank you
05:41:59 <anil_rao> kaz: I am assuming that the L2 extension version will allow us to better coordinate things with Neutron.
05:42:22 <kaz> anil_rao: is it occurring in l2 extension agent?
05:43:03 <anil_rao> kaz: No, the problem is with the (current) TaaS Agent/Driver and Ocata Stable. It is a very recent problem. I didn't see it a month back.
05:43:25 <kaz> i see.
05:43:48 <yamamoto> anil_rao: why are you using current taas, rather than ocata taas?
05:44:29 <anil_rao> yamamoto: Yes, I mean Ocata TaaS and not the L2 extension version that kaz has out for review.
05:44:32 <reedip> is the problem occurring in Ocata branch or the current branch as well ?
05:44:45 <reedip> I mean the master branch
05:44:48 <yamamoto> anil_rao: i see
05:46:07 <anil_rao> reedip: We are doing some validation work and wanted a stable branch, that is why we are using Ocata Stable.
05:46:38 <anil_rao> Every so often, I find that our (TaaS) flows in br-tun are all gone.
05:47:46 <reedip> hmm
05:48:01 <anil_rao> My guess is that something is triggering resets, which ML2 can handle. Since we don't have flow restore logic, our stuff, once wiped out, doesn't get repopulated.
05:48:19 <kaz> it seems the flow entries have a cookie which is marked by neutron agent.
05:48:37 <anil_rao> kaz: That is why I was surprised.
05:49:05 <kaz> But, the flow entries made by taas agent have no cookies.
05:49:34 <anil_rao> I try to address this issue with the error handling logic I am working on.
05:51:22 <anil_rao> If folks don't mind I would like to discuss how we can handle a very large number of tap-services...
05:51:34 <reedip> sure ..
05:51:48 <anil_rao> Here is the problem
05:52:03 <anil_rao> Today, we consume a VLAN id for each and every tap-service.
05:52:25 <reedip> hmm
05:52:50 <kaz> i see
05:52:59 <anil_rao> Since a tap-service represents the destination end of a port-mirror session, we cannot have too many tap-flows associated with a single tap-service, especially if there is a lot of traffic being mirrored
05:53:31 <anil_rao> So if one wants to monitor a very large number of VMs, we will end up requiring a lot of separate tap-services.
05:54:09 <anil_rao> e.g. We want to monitor say 1000 VMs.
05:54:56 <anil_rao> Assuming 25 tap-flows to a tap-service (which may already be excessive) we need 40 tap-services.
05:55:24 <anil_rao> So we need some way to deal with this issue (and the consumption of VLAN ids)
05:57:12 <reedip> that's a production level issue and makes sense
05:57:17 <anil_rao> I think we'll need to treat VLAN ids as local to a compute/network node (just like Neutron) instead of global.
05:57:56 <anil_rao> The id management at the TaaS centralized location (on the controller node) will need to be more sophisticated.
05:59:28 <yamamoto> i forgot how vlan is used by our flows. anil_rao, is the nice flow diagram you had a while ago available somewhere?
06:00:00 <yamamoto> it (or something similar) ought to be under our doc/ i guess.
06:00:05 <anil_rao> yamamoto: My bad. I'll upload that diagram when I get in to work tomorrow morning. I keep forgetting to do so. Sorry.
06:00:40 <anil_rao> Today, we consume a VLAN id when a tap-service is created. I.e. each tap-service takes up one VLAN id.
06:01:20 <anil_rao> We then switch between this VLAN id and tunnel id to safely transport traffic belonging to a tap-service from the source port to the destination port.
06:02:43 <anil_rao> Neutron also uses VLAN ids to segregate tenant virtual networks but they use the id as local to a compute/network node.
06:03:19 <anil_rao> I am thinking that we can do the same. Except that our logic in the centralized location will be a little more complex.
06:05:12 <anil_rao> This issue (number of tap-services we can support) will directly influence the per-tenant tap-flow / tap-service quotas.
06:05:47 <yamamoto> vlan ids are the only concern?
06:06:38 <anil_rao> I think there are more but this one is critical because it will limit how many VMs we can monitor -- system-wide and also per-tenant.
06:07:49 <bennco1> @anil_rao if we use what I call dummy vlans how can we preserve who the traffic actually is coming from?
06:09:07 <yamamoto> bennco1: what's what you call dummy vlans?
06:09:20 <anil_rao> bennco1: Within a host we use VLAN tags (they are real tags). For the hop between two hosts we translate from VLAN id to tunnel id and then back to VLAN id
06:09:58 <bennco1> ok makes sense
06:10:13 <anil_rao> The mirrored traffic associated with a tap-service is segregated from other tap-service traffic and also from production traffic.
06:11:49 <reedip> anil_rao : the diagram would be great on this :)
06:12:10 <anil_rao> Sure. I'll upload it to the TaaS place tomorrow.
06:12:49 <anil_rao> Are the TaaS quotas actually effective today?
06:14:25 <yamamoto> i suppose no, as reedip has a patch.
06:15:46 <yamamoto> https://review.openstack.org/#/c/373929/
06:15:57 <anil_rao> Thanks!
06:16:14 <anil_rao> When we have TaaS as an ML2 extension (in review currently) we
06:16:50 <anil_rao> will need to negotiate the VLAN id range used by TaaS. I am not sure how the Neutron core team will react to this reservation. :-)
06:18:33 <yamamoto> it isn't necessary with taas agent?
06:19:03 <anil_rao> Since we were on the sidelines all this while nobody cared.
06:20:16 <anil_rao> I think the br-tun table use (by TaaS) should be a less contentious issue.
06:21:07 <yamamoto> i guess it doesn't matter much as the reservation will be short-term anyway. (until l2 flow management things actually happen)
06:22:06 <anil_rao> yamamoto: Yes, that is the real ticket. :-)
06:23:57 <anil_rao> One other issue that we need to take care of is handling the other flows in br-int today. At least for the short term until l2 flow mgmt is in place.
06:24:23 <anil_rao> The current TaaS implementation has effectively shut off ARP anti-spoofing.
06:25:35 <yamamoto> i guess we should ensure having LP bugs for each known issue.
06:25:55 <yamamoto> right now our LP is largely unmaintained.
06:26:20 <anil_rao> Agree. Did Vinay get around to providing access?
06:26:25 <yamamoto> no
06:26:38 <anil_rao> Let me ping him again.
06:26:58 <yamamoto> i guess you have more permission than the rest of us because you are a maintainer.
06:27:28 <anil_rao> Surprisingly even I don't have access to that. Strange.
06:27:33 <yamamoto> heh
06:27:50 <yamamoto> 3 mins left
06:28:11 <yamamoto> i guess we should mention Sydney cfp as its deadline is this week
06:28:33 <yamamoto> i've heard reedip submitted something
06:28:34 <bennco1> btw I also created an lp report here
06:28:35 <bennco1> https://bugs.launchpad.net/tap-as-a-service/+bug/1694605
06:28:37 <openstack> Launchpad bug 1694605 in tap-as-a-service "Transport Support w/o br-tun " [Undecided,Confirmed]
06:29:11 <yamamoto> bennco1: thank you, it's a valid issue.
06:30:06 <bennco1> I'm hoping to get some focus on it - it appears to be more common than I expected
06:30:38 <anil_rao> bennco1: +1
06:30:49 <kaz> +1
06:31:19 <anil_rao> I guess we are out of time. Let's pick up next week.
06:32:15 <kaz> anil_rao: sure
06:32:56 <yamamoto> Sydney CFP deadline is 14th July. if you have anything about that, don't wait for the next week meeting. use email.
06:34:43 <yamamoto> who can close the meeting?
06:35:02 <kaz> maybe i can.
06:35:03 <reedip> #endmeeting
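
Illustration of the VLAN id discussion (05:52 to 06:03): a minimal Python sketch of the sizing arithmetic anil_rao gave (1000 VMs at 25 tap-flows per tap-service) and of what treating VLAN ids as local to each compute/network node could look like. Nothing here is TaaS or Neutron code. The names tap_services_needed and HostVlanAllocator, the string tap-service ids, and the 1-4094 id range are assumptions made for illustration only.

```python
import math

# Assumption: the full usable 802.1Q range; a real deployment would
# reserve a narrower range negotiated with Neutron.
VLAN_MIN, VLAN_MAX = 1, 4094


def tap_services_needed(num_vms, flows_per_service):
    """Number of tap-services needed if each one serves a fixed number of tap-flows."""
    return math.ceil(num_vms / flows_per_service)


class HostVlanAllocator:
    """Hypothetical per-host table mapping a tap-service id to a host-local VLAN id."""

    def __init__(self, vlan_min=VLAN_MIN, vlan_max=VLAN_MAX):
        # Stored in descending order so that pop() hands out the lowest id first.
        self._free = list(range(vlan_max, vlan_min - 1, -1))
        self._by_service = {}

    def allocate(self, tap_service_id):
        # Repeated requests for the same tap-service return the same VLAN id.
        if tap_service_id in self._by_service:
            return self._by_service[tap_service_id]
        if not self._free:
            raise RuntimeError("no free VLAN ids on this host")
        vlan = self._free.pop()
        self._by_service[tap_service_id] = vlan
        return vlan

    def release(self, tap_service_id):
        # Return the id to the pool so another tap-service on this host can reuse it.
        vlan = self._by_service.pop(tap_service_id, None)
        if vlan is not None:
            self._free.append(vlan)


if __name__ == "__main__":
    print(tap_services_needed(1000, 25))  # the example from the meeting

    host_a = HostVlanAllocator()
    host_b = HostVlanAllocator()
    # Different tap-services may hold the same VLAN id on different hosts.
    print(host_a.allocate("ts-1"), host_b.allocate("ts-7"))
```

With one allocator per host, a host only spends VLAN ids on the tap-services that actually have ports on it. Under this assumption the tunnel id used on the hop between hosts would have to remain the globally unique identifier, which is the extra bookkeeping the centralized location would take on.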