05:34:25 <reedip> #startmeeting taas
05:34:26 <openstack> Meeting started Wed Jul 12 05:34:25 2017 UTC and is due to finish in 60 minutes.  The chair is reedip. Information about MeetBot at http://wiki.debian.org/MeetBot.
05:34:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
05:34:29 <openstack> The meeting name has been set to 'taas'
05:34:38 <anil_rao> Hi
05:34:43 <kaz> hi
05:34:55 <reedip> #chair kaz anil_rao reedip
05:34:56 <openstack> Current chairs: anil_rao kaz reedip
05:35:06 <reedip> #link https://wiki.openstack.org/wiki/Meetings/taas
05:35:32 <reedip> #topic L2 Extension support
05:36:03 <anil_rao> Thanks for sending out the patch for review. I'll get back in a day or two.
05:36:21 <kaz> thank you
05:36:28 <reedip> hi kaz, thanks for the patch. I was unaware that the L2 extension was being created. I will abandon the one I created some time back :)
05:37:19 <kaz> sorry for the late announcement :)
05:38:03 <anil_rao> kaz: Have you noticed that the current TaaS does not work correctly with Ocata stable?
05:38:44 <reedip> kaz : please update https://review.openstack.org/#/c/454788/6/specs/stadium/pike/tap-as-a-service.rst@50 with the patch information for L2 extension
05:38:55 <kaz> anil_rao: no, I have not
05:39:14 <reedip> anil_rao : ^^ above link is for the Project status for getting TaaS accepted into Neutron stadium
05:39:40 <anil_rao> kaz: I found that our TaaS flows in br-tun are getting wiped out. I am assuming this is by Neutron.
05:40:06 <anil_rao> reedip: Yes, I have seen the document you have uploaded. That is a great summary of the remaining work items.
05:40:30 <bennco1> @anil_rao evening, would you mind sharing the link again please?
05:40:46 <reedip> @bennco1 : which link ???
05:41:00 <anil_rao> Hi bennco1
05:41:02 <bennco1> R: <reedip> anil_rao : ^^ above link is for the Project status for getting TaaS accepted into Neutron stadium
05:41:11 <reedip> https://review.openstack.org/#/c/454788/6/specs/stadium/pike/tap-as-a-service.rst
05:41:17 <bennco1> thank you
05:41:59 <anil_rao> kaz: I am assuming that the L2 extension version will allow us to better coordinate things with Neutron.
05:42:22 <kaz> anil_rao: is it occurring in l2 extension agent?
05:43:03 <anil_rao> kaz: No, the problem is with the (current) TaaS Agent/Driver and Ocata Stable. It is a very recent problem. I didn't see it a month back.
05:43:25 <kaz> i see.
05:43:48 <yamamoto> anil_rao: why are you using current taas, rather than ocata taas?
05:44:29 <anil_rao> yamamoto: Yes, I mean Ocata TaaS and not the L2 extension version that kaz has out for review.
05:44:32 <reedip> is the problem occurring in the Ocata branch or the current branch as well?
05:44:45 <reedip> I mean the master branch
05:44:48 <yamamoto> anil_rao: i see
05:46:07 <anil_rao> reedip: We are doing some validation work and wanted a stable branch, which is why we are using Ocata Stable.
05:46:38 <anil_rao> Every so often, I find that our (TaaS) flows in br-tun are all gone.
05:47:46 <reedip> hmm
05:48:01 <anil_rao> My guess is that something is triggering resets, which ML2 can handle. Since we don't have flow restore logic, our stuff, once wiped out, doesn't get repopulated.
05:48:19 <kaz> it seems the flow entries have a cookie which is marked by the neutron agent.
05:48:37 <anil_rao> kaz: That is why I was surprised.
05:49:05 <kaz> But the flow entries made by the taas agent have no cookies.
05:49:34 <anil_rao> I will try to address this issue with the error handling logic I am working on.
05:51:22 <anil_rao> If folks don't mind I would like to discuss how we can handle a very large number of tap-services...
05:51:34 <reedip> sure ..
05:51:48 <anil_rao> Here is the problem
05:52:03 <anil_rao> Today, we consume a VLAN id for each and every tap-service.
05:52:25 <reedip> hmm
05:52:50 <kaz> i see
05:52:59 <anil_rao> Since a tap-service represents the destination end of a port-mirror session, we cannot have too many tap-flows associated with a single tap-service, especially if there is a lot of traffic being mirrored.
05:53:31 <anil_rao> So if one wants to monitor a very large number of VMs, we will end up requiring a lot of separate tap-services.
05:54:09 <anil_rao> e.g. We want to monitor say 1000 VMs.
05:54:56 <anil_rao> Assuming 25 tap-flows to a tap-service (which may already be excessive) we need 40 tap-services.
05:55:24 <anil_rao> So we need some way to deal with this issue (and the consumption of VLAN ids)
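The arithmetic above can be sketched in a few lines of Python. This is only an illustration: it assumes one tap-flow per monitored VM, and it assumes the whole 802.1Q VLAN id space (4094 usable ids) were available to TaaS, which in practice it would not be. The function name is hypothetical and not part of TaaS.

```python
import math

# 802.1Q VLAN ids are 12 bits wide; 0 and 4095 are reserved, leaving 4094.
# Assumption: the real range available to TaaS would be a smaller slice.
USABLE_VLAN_IDS = 4094


def tap_services_needed(num_vms, flows_per_service):
    """Tap-services required if every VM contributes one tap-flow."""
    return math.ceil(num_vms / flows_per_service)


needed = tap_services_needed(1000, 25)
print(needed)  # 40, matching the figure quoted in the meeting

# With one global VLAN id per tap-service, this is the share of the id
# space that monitoring 1000 VMs would consume.
print(round(100 * needed / USABLE_VLAN_IDS, 2), "% of the VLAN id space")
```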
05:57:12 <reedip> that's a production level issue and makes sense
05:57:17 <anil_rao> I think we'll need to treat VLAN ids as local to a compute/network node (just like Neutron) instead of global.
05:57:56 <anil_rao> The id management at the TaaS centralized location (on the controller node) will need to be more sophisticated.
05:59:28 <yamamoto> i forgot how vlan is used by our flows.  anil_rao, is the nice flow diagram you had a while ago available somewhere?
06:00:00 <yamamoto> it (or something similar) ought to be under our doc/ i guess.
06:00:05 <anil_rao> yamamoto: My bad. I'll upload that diagram when I get in to work tomorrow morning. I keep forgetting to do so. Sorry.
06:00:40 <anil_rao> Today, we consume a VLAN id when a tap-service is created. I.e. each tap-service takes up one VLAN id.
06:01:20 <anil_rao> We then switch between this VLAN id and tunnel id to safely transport traffic belonging to a tap-service from the source port to the destination port.
06:02:43 <anil_rao> Neutron also uses VLAN ids to segregate tenant virtual networks, but it treats the id as local to a compute/network node.
06:03:19 <anil_rao> I am thinking that we can do the same. Except that our logic in the centralized location will be a little more complex.
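A minimal sketch of what host-local VLAN id management might look like, assuming a per-host allocator tracked by the central TaaS service. The class name, the id range, and the host and tap-service names are all hypothetical; none of this is taken from the TaaS code base.

```python
class HostVlanAllocator:
    """Hypothetical per-host pool of VLAN ids reserved for TaaS."""

    def __init__(self, host, vlan_min, vlan_max):
        self.host = host
        # Stored in descending order so that pop() hands out the lowest id.
        self._free = list(range(vlan_max, vlan_min - 1, -1))
        self._by_service = {}

    def allocate(self, tap_service_id):
        """Return the local VLAN id for a tap-service, allocating if needed."""
        if tap_service_id in self._by_service:
            return self._by_service[tap_service_id]
        if not self._free:
            raise RuntimeError("no free TaaS VLAN ids on host %s" % self.host)
        vlan = self._free.pop()
        self._by_service[tap_service_id] = vlan
        return vlan

    def release(self, tap_service_id):
        """Return a tap-service's VLAN id to the pool, if it holds one."""
        vlan = self._by_service.pop(tap_service_id, None)
        if vlan is not None:
            self._free.append(vlan)


# Each host draws from its own pool, so the same VLAN id can represent
# different tap-services on different hosts.
compute1 = HostVlanAllocator("compute1", 3900, 3902)
compute2 = HostVlanAllocator("compute2", 3900, 3902)
compute1.allocate("ts-a")
print(compute1.allocate("ts-b"), compute2.allocate("ts-b"))
```

In a scheme like this a VLAN id is consumed only on hosts that actually carry traffic for a given tap-service, and the tunnel id stays the one global identifier used between hosts.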
06:05:12 <anil_rao> This issue (number of tap-services we can support) will directly influence the per-tenant tap-flow / tap-service quotas.
06:05:47 <yamamoto> vlan ids are the only concern?
06:06:38 <anil_rao> I think there are more but this one is critical because it will limit how many VMs we can monitor -- system-wide and also per-tenant.
06:07:49 <bennco1> @anil_rao if we use what I call dummy vlans how can we preserve who the traffic actually is coming from?
06:09:07 <yamamoto> bennco1: what do you mean by dummy vlans?
06:09:20 <anil_rao> bennco1: Within a host we use VLAN tags (they are real tags). For the hop between two hosts we translate from VLAN id to tunnel id and then back to VLAN id
06:09:58 <bennco1> ok, makes sense
06:10:13 <anil_rao> The mirrored traffic associated with a tap-service is segregated from other tap-service traffic and also from production traffic.
06:11:49 <reedip> anil_rao : the diagram would be great on this :)
06:12:10 <anil_rao> Sure. I'll upload it to the TaaS place tomorrow.
06:12:49 <anil_rao> Are the TaaS quotas actually effective today?
06:14:25 <yamamoto> i suppose no, as reedip has a patch.
06:15:46 <yamamoto> https://review.openstack.org/#/c/373929/
06:15:57 <anil_rao> Thanks!
06:16:14 <anil_rao> When we have TaaS as an ML2 extension (in review currently) we
06:16:50 <anil_rao> will need to negotiate the VLAN id range used by TaaS. I am not sure how the Neutron core team will react to this reservation. :-)
06:18:33 <yamamoto> it isn't necessary with taas agent?
06:19:03 <anil_rao> Since we were on the sidelines all this while, nobody cared.
06:20:16 <anil_rao> I think the br-tun table use (by TaaS) should be a less contentious issue.
06:21:07 <yamamoto> i guess it doesn't matter much as the reservation will be short-term anyway.  (until l2 flow management things actually happen)
06:22:06 <anil_rao> yamamoto: Yes, that is the real ticket. :-)
06:23:57 <anil_rao> One other issue that we need to take care of is handling the other flows in br-int today. At least for the short term until l2 flow mgmt is in place.
06:24:23 <anil_rao> The current TaaS implementation has effectively shut off ARP anti-spoofing.
06:25:35 <yamamoto> i guess we should ensure we have an LP bug for each known issue.
06:25:55 <yamamoto> right now our LP is largely unmaintained.
06:26:20 <anil_rao> Agreed. Did Vinay get around to providing access?
06:26:25 <yamamoto> no
06:26:38 <anil_rao> Let me ping him again.
06:26:58 <yamamoto> i guess you have more permission than the rest of us because you are a maintainer.
06:27:28 <anil_rao> Surprisingly even I don't have access to that. Strange.
06:27:33 <yamamoto> heh
06:27:50 <yamamoto> 3 mins left
06:28:11 <yamamoto> i guess we should mention the Sydney cfp as its deadline is this week
06:28:33 <yamamoto> i've heard reedip submitted something
06:28:34 <bennco1> btw I also created an lp report here
06:28:35 <bennco1> https://bugs.launchpad.net/tap-as-a-service/+bug/1694605
06:28:37 <openstack> Launchpad bug 1694605 in tap-as-a-service "Transport Support w/o br-tun " [Undecided,Confirmed]
06:29:11 <yamamoto> bennco1: thank you, it's a valid issue.
06:30:06 <bennco1> I'm hoping to get some focus on it - it appears to be more common than I expected
06:30:38 <anil_rao> bennco1: +1
06:30:49 <kaz> +1
06:31:19 <anil_rao> I guess we are out of time. Let's pick up next week.
06:32:15 <kaz> anil_rao: sure
06:32:56 <yamamoto> Sydney CFP deadline is 14th July.  if you have anything about that, don't wait for next week's meeting.  use email.
06:34:43 <yamamoto> who can close the meeting?
06:35:02 <kaz> maybe i can.
06:35:03 <reedip> #endmeeting