15:01:24 <Swami> #startmeeting distributed-virtual-router
15:01:26 <openstack> Meeting started Wed Mar 19 15:01:24 2014 UTC and is due to finish in 60 minutes. The chair is Swami. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:30 <openstack> The meeting name has been set to 'distributed_virtual_router'
15:01:37 <xuhanp> hi
15:01:45 <Swami> xuhanp: hi
15:01:59 <Swami> #topic agenda
15:02:15 <Swami> Agent design doc status
15:02:30 <Swami> L3 Namespace
15:02:39 <Swami> L2 Pop FDB timeout
15:02:50 <Swami> admin_state_up
15:02:56 <Swami> DVR+HA
15:03:03 <Swami> Distributed DHCP
15:03:22 <Swami> Is there any other topic that you would like to discuss today as part of the DVR?
15:03:39 <Vivek-Narasimhan> the arbitrary port for router
15:03:45 <Vivek-Narasimhan> if that's not communicated already
15:04:19 <Swami> vivek: thanks for bringing it up
15:05:08 <Swami> We will also discuss support for attaching an arbitrary port to an existing router with a subnet, in the DVR context.
15:05:25 <Swami> #topic Agent Design doc status
15:06:07 <Swami> #link https://docs.google.com/document/d/1depasJSnGZPOnRLxEC_PYsVLcGVFXZLqP52RFTe21BE/edit
15:06:11 <Swami> L2 agent design doc
15:06:18 <Swami> ajo: hi
15:06:56 <ajo> hi Swami :)
15:07:55 <Swami> Hi folks, the L2 agent design doc is out for community review. If you have any questions or concerns, please feel free to let us know.
15:08:14 <Swami> If your comments are not addressed, please let me know and I will make sure that they are.
15:08:26 <Swami> #link https://docs.google.com/document/d/1jCmraZGirmXq5V1MtRqhjdZCbUfiwBhRkUjDXGt5QUQ/edit
15:08:30 <Swami> L3 agent design doc
15:08:50 <Swami> This week we posted the L3 agent design doc for review.
15:09:13 <Swami> Both these docs were created based on the community feedback from the face to face meeting.
15:09:48 <Swami> Please review those documents and give me your feedback.
15:10:04 <Swami> Is sylvain in here?
15:10:33 <Swami> I don't see sylvain on IRC today.
15:10:39 <Swami> #topic L2 Pop
15:11:04 <Swami> We found out that L2 Pop entries time out.
15:11:13 <Vivek-Narasimhan> Swami, a clarification
15:11:21 <Vivek-Narasimhan> we just found out that they don't time out today
15:11:28 <Swami> vivek: sure
15:11:35 <Vivek-Narasimhan> there is a mix of timing out and non-timing out entries though
15:11:52 <Vivek-Narasimhan> jumbled by tunnel-bridge learning and L2-POP both working on Table 20 - UCAST_TO_TUN
15:12:04 <Swami> vivek: can you explain the mix of timeout and not-timeout?
15:12:23 <Vivek-Narasimhan> basically, tunnel bridge learning entries also show up in Table 20 even though L2-Pop is enabled
15:12:40 <Vivek-Narasimhan> so there are two entries for the same destination mac in table 20, one learnt by tunnel-bridge and the other pushed by L2-POP
15:13:13 <Vivek-Narasimhan> the entries pushed in by the tunnel-bridge learning logic have a hard_timeout of 300 secs (i.e., they expire in 5 minutes)
15:13:21 <Vivek-Narasimhan> the entries (same dest mac) pushed by L2-POP do not expire
15:13:30 <Swami> are you proposing to disable one of the learning mechanisms, or trying to understand whether there is a way to disable one of them?
15:13:50 <Vivek-Narasimhan> i thought that having both learning and l2-pop is redundant as they
15:13:57 <Vivek-Narasimhan> occupy tunnel-bridge finite flow space
15:13:57 <Swami> carl: do you have any insight on this?
15:14:37 <carl_baldwin> Swami: Not yet.
15:14:55 <Swami> vivek: Maybe sylvain would be able to answer your question. Unfortunately he is not here.
15:15:14 <ajo> It sounds redundant to me,
15:15:29 <ajo> is there something not populated via L2-POP that could be discovered by learning?
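
The duplicate-entry situation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `FlowEntry` and `find_redundant_macs` are hypothetical names, the flows are hand-written sample data rather than real switch output, and a `hard_timeout` of 0 is taken to mean "never expires", as in OpenFlow. The sketch keys on destination MAC alone, so the open question about differing VLANs is not modelled.

```python
from collections import defaultdict
from dataclasses import dataclass

UCAST_TO_TUN = 20  # table number named in the discussion


@dataclass(frozen=True)
class FlowEntry:
    """Hypothetical, simplified view of one flow entry."""
    table: int
    dl_dst: str        # destination MAC the entry matches on
    hard_timeout: int  # seconds; 0 means the entry never expires


def find_redundant_macs(flows):
    """Return MACs in table 20 that have both an expiring and a permanent entry.

    The result maps each such MAC to (expiring_count, permanent_count).
    """
    by_mac = defaultdict(list)
    for flow in flows:
        if flow.table == UCAST_TO_TUN:
            by_mac[flow.dl_dst].append(flow)

    redundant = {}
    for mac, entries in by_mac.items():
        expiring = [e for e in entries if e.hard_timeout > 0]
        permanent = [e for e in entries if e.hard_timeout == 0]
        if expiring and permanent:
            redundant[mac] = (len(expiring), len(permanent))
    return redundant


# Made-up sample data: one MAC has a learned (300 s) and a permanent entry.
flows = [
    FlowEntry(20, "fa:16:3e:00:00:01", 300),
    FlowEntry(20, "fa:16:3e:00:00:01", 0),
    FlowEntry(20, "fa:16:3e:00:00:02", 0),
    FlowEntry(10, "fa:16:3e:00:00:03", 300),
]
print(find_redundant_macs(flows))
```

With the sample data, only the first MAC is reported, since it is the only one holding both kinds of entry in table 20.
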
15:15:34 <Swami> I have sent an email to sylvain; if not, he will be attending the L3-subteam meeting tomorrow and I will try to bring up this topic there.
15:15:48 <Vivek-Narasimhan> yes
15:16:24 <Swami> ajo: We need to identify if this is the expected behavior or a bug. If it is a bug, we need to log it on Launchpad or discuss it with Sylvain.
15:16:25 <Vivek-Narasimhan> (or) at least shut off tunnel bridge learning and have L2-POP totally deal with entries in Table 20
15:16:43 <ajo> aha
15:16:43 <Vivek-Narasimhan> having double entries for same MAC seems to be redundant
15:16:49 <Murali_> But here different VLAN with same Mac address in the ovs-rules
15:16:58 <ajo> Vivek-Narasimhan, the double mac entries, are for vrouters, right?
15:16:58 <Vivek-Narasimhan> No murali
15:17:05 <Vivek-Narasimhan> the different VLAN thing we will investigate further
15:17:06 <ajo> which is what's populated via L2-POP
15:17:10 <ajo> or do we populate something else?
15:18:03 <Swami> Vivek: Good point, thanks for bringing this up. I will log it and discuss it with sylvain.
15:18:39 <Swami> #topic admin_state_up
15:19:21 <Swami> In the current openstack neutron implementation the "admin_state_up" flag does not do anything useful to get the state of the object.
15:20:08 <mrsmith> or control the state
15:20:29 <Swami> For example if the "router" admin_state_up is down, the router is still active and serviced by the scheduler.
15:20:37 <ajo> Swami, the problem is that it works for some objects, but it doesn't for others?
15:20:39 <rajeev> That is a defect
15:20:49 <Swami> ajo: May be right.
15:20:54 <ajo> yes, the vrouter should not be created in such a case.
15:21:04 <ajo> afaik
15:21:15 <rajeev> Yes that is how it was in Grizzly
15:21:37 <ajo> I think the setting name itself is confusing
15:21:41 <Swami> In the "neutron-team-meeting" this monday there was a question about getting rid of "admin_state_up" and using the "status".
15:21:50 <ajo> "admin_state_up", talks about "status", not a setting...
15:21:51 <VivekNarasimhan> just back
15:22:09 <ajo> "admin_disabled" would be something more appropriate IMHO
15:22:26 <xuhanp> so how to set the status without admin_state_up anymore?
15:22:41 <ajo> I think those should be two different things
15:22:56 <ajo> status = current status, is it having any problem. Read only from the admin perspective
15:23:13 <Swami> If this is a bug then we need to file a bug on this. If there is already a bug on this, we need to see if we can take the bug and fix it.
15:23:16 <ajo> admin_disabled (current admin_state_up ... bad name) = read/write, to let the admin disable a resource.
15:23:55 <xuhanp> ajo, ok. that makes sense. as long as there is a way to do that.
15:24:21 <Swami> The reason I brought up this topic is because if this is not fixed, then there is no good way of re-scheduling the routers to a different agent, unless you manually associate a router to a specific agent.
15:24:29 <ajo> a name change affects the API specification: https://wiki.openstack.org/wiki/Neutron/APIv2-specification
15:24:35 <mrsmith> open bug: https://bugs.launchpad.net/neutron/+bug/1215387
15:24:50 <mrsmith> from bug: We can not stop router forwarding packets by admin_state_up false. Master branch has this problem (stable/grizzly branch don't have the problem). I know the cause. When run router-update --admin_state_up false, transitions as follows:
15:25:02 <ajo> Swami, I don't see the connection between admin_state_up and rescheduling
15:25:07 <ajo> could you elaborate on the problem?
15:25:12 <Swami> mrsmith: thanks for the link
15:25:49 <mrsmith> swami: np
15:26:32 <Murali_> at agent side if the admin_state_up is false we don't allow port updates for that router
15:26:46 <Swami> ajo: Yes, from our DVR design we were planning to use the "admin_state_up" to disable or disassociate a router from the current agent and then re-associate to a new agent, when an update occurs (this is for migration, when people want to migrate from a centralized router to a distributed router)
15:27:31 <ajo> Swami, I believe that if admin_state_up is set to False by admin, we may obey that setting, and not schedule the router anywhere
15:28:02 <ajo> the flag (AFAIK, I could be wrong, I had to read the description several times) means that the admin wants this router disabled (for some administrative or security reason)
15:28:14 <Murali_> we can set an existing router to false also
15:28:19 <Swami> As a team we need to identify if "admin_state_up" can be fixed and used in the way it is designed or else we need to come up with "admin_state_enabled" or "admin_state_disabled" true/false.
15:29:13 <ajo> Swami, I think you may use some other flag for that
15:29:35 <amuller> Swami: Do you have to set the router as 'down' in order to migrate it?
15:29:35 <ajo> but admin_state_enabled or admin_state_disabled doesn't look very descriptive,
15:30:15 <ajo> Swami, a normal l3-agent-router-remove
15:30:17 <Swami> amuller: We need to hand over the control of the router to a different agent, when we migrate.
15:30:18 <Murali_> swami for migration admin_state_up may not be the right option
15:30:18 <ajo> wouldn't do?
15:30:33 <ajo> I mean, you remove it from the agent, and then it knows it needs to reschedule somewhere else?
15:30:49 <ajo> hmm, but I suppose, you need to mark the agent to avoid it being rescheduled to the same agent again
15:30:53 <ajo> , is that your problem?
15:31:40 <Swami> ajo: Yes l3-agent-router-remove will remove the router, but when a router is in an operational state, can we move it to a different agent?
15:32:07 <Swami> That is the reason that we wanted to set the state of the router to inactive and then move the router.
15:32:13 <ajo> Swami, wouldn't it be enough to start the scheduling logic once you remove it from an agent?
15:32:17 <carl_baldwin> Swami: We've been able to migrate a router with admin_state_up=True if that is what you mean.
15:33:27 <Swami> carl: Thanks that helps. But will there be a network glitch when it happens?
15:33:35 <carl_baldwin> Yes.
15:34:18 <Swami> ok, thanks for the inputs.
15:34:25 <Swami> #topic namespace
15:34:49 <Swami> The next topic is about the router namespace and when to delete the namespace.
15:35:38 <Swami> In the DVR, we will be starting the IR (internal router) namespace on all the compute nodes.
15:35:55 <Swami> But what would be the right approach to clean up the namespaces?
15:36:38 <carl_baldwin> Swami: "all compute nodes" means all compute nodes with a VM on the same network, right?
15:37:26 <Swami> carl: yes, "all compute nodes" means those with a VM on the compute node, the VM being part of a network that is connected to an active router.
15:38:17 <ajo> Swami, I suppose namespaces need to be cleaned up, when there are no resources in the compute node making use of such namespace
15:38:37 <ajo> clean up the iptables rules inside the namespace, kill the processes, remove the ports, kill the namespace
15:38:39 <Swami> Today we are able to create a namespace only if there is a VM in the network, otherwise that compute will not have an active namespace.
15:38:54 <ajo> that makes sense
15:39:01 <amuller> Swami: I have to say, this would be a lot more productive if you guys published your code as WIP
15:39:12 <amuller> Swami: The rest of us would know what problems you guys are facing and try to help
15:39:30 <Swami> amuller: Thanks for reminding us, we will be publishing the WIP code soon.
15:39:46 <amuller> We're discussing details right now but we don't have the code, it's very difficult
15:40:12 <ajo> Swami,
15:40:44 <ajo> what do you think about cleaning up the namespace when no VMs on the host are attached to such namespace router, + a timeout
15:40:52 <ajo> for the case when you shut off / shut on a VM
15:40:59 <Swami> amuller: As per my previous discussion with the community the work is in progress, and we will be posting the WIP code as soon as we have something that can be tested by others.
15:41:09 <ajo> if after a configurable timeout, no resource is making use of such namespace, it's cleaned up and disposed
15:41:19 <amuller> Swami: Thank you
15:41:29 <Swami> ajo: That's what we are thinking of.
15:41:52 <mrsmith> yes - I don't think it is a difficult thing to do
15:41:58 <mrsmith> is the question whether we should do it or not?
15:42:18 <ajo> definitely, we should, resources are not unlimited,
15:42:27 <mrsmith> agreed
15:42:33 <ajo> and if VMs are moving around in a production environment, you could end up with a machine cluttered with namespaces
15:42:42 <ajo> then sudo degrades, ip netns degrades, ... etc
15:42:51 <Swami> mrsmith: The question is do we need to handle it and will there be a race condition when we delete a namespace and a VM comes back again?
15:42:55 <mrsmith> swami: did you have timing/churn concerns?
15:43:00 <Swami> Will it not create a performance issue?
15:43:00 <mrsmith> ok - right
15:43:12 <Swami> I do have concerns on performance.
15:43:29 <ajo> Swami, I'd leave that tuneable...
15:43:40 <ajo> timeout could be infinite...
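
The cleanup idea discussed above (dispose of a namespace only after it has been unused for a configurable, possibly infinite, timeout) might look roughly like the sketch below. It is not the DVR team's code: `NamespaceReaper`, its method names, and the `qrouter-a` / `qrouter-b` labels are hypothetical, and the sketch only tracks bookkeeping. It does not delete anything, and it does not close the race between deciding to clean up and a VM reappearing during the actual deletion.

```python
import time


class NamespaceReaper:
    """Hypothetical bookkeeping for idle router namespaces."""

    def __init__(self, grace_seconds):
        # grace_seconds=None models the "infinite timeout": never clean up.
        self.grace_seconds = grace_seconds
        self._idle_since = {}

    def mark_idle(self, namespace, now=None):
        """Record that no VM on this host uses the namespace any more."""
        now = time.monotonic() if now is None else now
        # Keep the earliest idle timestamp if it was already marked idle.
        self._idle_since.setdefault(namespace, now)

    def mark_in_use(self, namespace):
        """A VM came back (e.g. shut off, then on): cancel pending cleanup."""
        self._idle_since.pop(namespace, None)

    def due_for_cleanup(self, now=None):
        """Return namespaces that have been idle for the full grace period."""
        if self.grace_seconds is None:
            return []
        now = time.monotonic() if now is None else now
        return sorted(
            ns for ns, since in self._idle_since.items()
            if now - since >= self.grace_seconds
        )


# Explicit timestamps are passed in so the example is deterministic.
reaper = NamespaceReaper(grace_seconds=60)
reaper.mark_idle("qrouter-a", now=0)
reaper.mark_idle("qrouter-b", now=10)
reaper.mark_in_use("qrouter-a")  # the VM returned before the timeout
print(reaper.due_for_cleanup(now=70))
```

Passing the clock in as a parameter keeps the timeout logic testable without sleeping; a real agent would presumably run this check periodically and then perform the teardown steps listed in the discussion (rules, processes, ports, namespace).
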
15:43:59 <Swami> ajo: thanks
15:44:07 <Swami> ok, before we run out of time.
15:44:14 <Murali_> but here we need to keep on monitoring if someone is using the namespace or not for every change in VM
15:44:25 <ajo> neutron, definitely has many problems at this stage regarding this, which will eventually be mitigated by better nova/neutron synchronization, but we're also having problems with resources/starvation
15:44:26 <Swami> I need to discuss the arbitrary port.
15:44:34 <ajo> sure
15:44:35 <ajo> thanks Swami
15:44:49 <Swami> #topic Router arbitrary port.
15:45:17 <Swami> Hi folks, in havana, a subnet can be connected to two different routers.
15:45:56 <Swami> vivek: can you update the status on this?
15:46:03 <VivekNarasimhan> yes swami
15:46:15 <VivekNarasimhan> for a given router, a subnet could be attached as an interface
15:46:37 <VivekNarasimhan> similarly for another router by the same tenant, a handcrafted 'port' on the same subnet can be attached as an interface
15:46:53 <VivekNarasimhan> this would be done by the tenant typically to have multiple gateways for his VMs.
15:47:17 <VivekNarasimhan> one router dealing with routing some set of internal networks and another router dealing with routing to a different set of internal networks
15:47:29 <amuller> VivekNarasimhan: each router scheduled on a different agent, then configure source routing on your VMs?
15:47:29 <VivekNarasimhan> but both routers have one interface on the same subnet
15:47:58 <VivekNarasimhan> i was mentioning normal routers
15:48:00 <VivekNarasimhan> not dvr routers
15:48:12 <VivekNarasimhan> the above scenario mentioned is available for normal routers today
15:48:15 <amuller> I know
15:48:20 <VivekNarasimhan> for the dvr we are pursuing the
15:48:30 <VivekNarasimhan> source mac information is lost, so on the receiving l2-agent
15:48:50 <VivekNarasimhan> we cannot decide which gateway-port to emulate in the source mac to send the frame to destination VM
15:48:59 <Swami> Vivek: what you wanted to know is if this is one thing that we need to support for DVR or not.
15:49:09 <VivekNarasimhan> correct swami
15:49:24 <amuller> Sounds like a classic follow up blueprint, shouldn't delay DVR imo
15:49:47 <Swami> My question to the audience is: can this be phase II, taken up in the next version, or does it need to be supported in the first phase?
15:49:57 <Swami> amuller: agree
15:50:03 <ajo> I agree too
15:50:16 <Swami> It would be too much work to take on everything and so we should push this to a follow-on update.
15:50:18 <carl_baldwin> I think it could be a follow-up bp. But, it would be good to have some sense for the feasibility of the follow-up.
15:50:28 <VivekNarasimhan> we have a feasible design
15:50:34 <amuller> But we need tracking... At least file the BP
15:50:35 <Swami> Hope everyone is on the same page.
15:50:41 <ajo> I agree with carl_baldwin
15:50:46 <Swami> Vivek: we should take it up in the next release.
15:50:55 <Swami> Hope this helps.
15:51:03 <VivekNarasimhan> OK swami. BTW, the design was to allocate some bits in the unique DVR LMAC for subnet differentiation
15:51:05 <ajo> even if it's not possible to implement now, be sure it can be supported somehow.
15:51:07 <Swami> #topic Any Open Discussions
15:51:29 <Swami> Sorry we ran out of time today.
15:52:00 <Swami> Is there any other topic that we wanted to discuss?
15:52:27 <VivekNarasimhan> the duplicate macs in L2-POP
15:52:30 <VivekNarasimhan> as i dropped out
15:52:30 <Swami> amuller: You, sylvain and I need to sync up on the DVR+HA, we can take that as the first topic next week.
15:52:49 <VivekNarasimhan> can the community folks let me know if that's an L2-Pop bug or just a behavior?
15:53:11 <Swami> vivek: we will check it with sylvain and let you know.
15:53:18 <VivekNarasimhan> ok swami, thanks
15:53:19 <amuller> Swami: Ok, if you could send an email with any questions that would help, let me prepare before the meeting...
15:53:20 <Swami> vivek: thanks for your feedback and input.
15:53:31 <Swami> amuller: thanks will do.
15:53:32 <VivekNarasimhan> np
15:53:44 <Swami> Thanks to everyone who joined the meeting today.
15:53:48 <amuller> Thanks everyone, see you in tomorrow's L3 meeting :)
15:53:56 <Swami> Please review the doc and let us know if you have any questions.
15:54:03 <amuller> Swami: Will do
15:54:11 <carl_baldwin> btw, tomorrow's l3 meeting is same time. I'll send email.
15:54:14 <Swami> We will make sure that we push the WIP code as early as possible.
15:54:57 <Swami> Folks tomorrow at the same time we have the L3 subteam meeting, so any general L3 topics can be discussed, if you are interested please join it.
15:54:58 <Swami> Thanks
15:55:12 <Swami> #endmeeting