15:00:23 #startmeeting neutron_l3
15:00:24 Meeting started Thu Jul 31 15:00:23 2014 UTC and is due to finish in 60 minutes. The chair is carl_baldwin. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:25 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:28 The meeting name has been set to 'neutron_l3'
15:00:31 #topic Announcements
15:00:37 o/
15:00:37 #link https://wiki.openstack.org/wiki/Meetings/Neutron-L3-Subteam
15:00:53 howdy carl_baldwin
15:01:07 Already a week into Juno-3. Things are moving fast.
15:01:14 hi
15:01:29 juno-3 is targeted for September 4th.
15:01:38 #link https://wiki.openstack.org/wiki/Juno_Release_Schedule
15:02:26 Also, the initial DVR implementation has been merged. This should enable broader testing.
15:02:46 The infra patches to enable the experimental job have also merged, I think.
15:02:52 carl_baldwin: the first experimental job is running as we speak
15:03:12 http://status.openstack.org/zuul/
15:03:18 great.
15:03:25 armax: Great. I was just going to look this morning.
15:03:33 https://jenkins06.openstack.org/job/check-tempest-dsvm-neutron-dvr/1/console
15:03:59 for review 108177,10
15:04:49 Does this require an explicit "check experimental" to run?
15:04:59 yes
15:05:18 I did post the comment 'check experimental' explicitly
15:05:37 #topic neutron-ovs-dvr
15:06:15 all: Make use of this job on our DVR-related patches.
15:06:49 * carl_baldwin goes to run 'check experimental' on his reviews
15:07:07 Swami: Anything to report?
15:07:22 carl_baldwin: hi
15:07:43 we had a couple of issues that we wanted to discuss
15:07:52 This is related to migration
15:08:52 You have the floor.
15:08:55 The first question that we have is: for a "router-migration", can we make use of "admin_state_up/admin_state_down" first before issuing a router-update?
15:09:32 Do you mean to require that a router be in admin_state_down before migration?
15:09:43 Swami: I think it's sensible
15:09:46 The reason is, when the admin issues these commands, the existing state of the routers is cleaned up and then we can move or migrate the routers to the new agent.
15:10:00 carl_baldwin: Yes
15:10:22 it comes down to the admin running 3 commands or one
15:10:29 Swami: I don't see a problem with that. It should be documented.
15:10:38 admin_state_down/up are there for a reason
15:10:57 We were initially debating whether the admin would be ok with issuing two commands for a migration. First would be to set the 'admin_state' and next would be to do an update.
15:11:11 could we do this internally
15:11:13 we could flip the state automatically in the migration process, but I'd vote to be more explicit
15:11:19 without the admin explicitly running 3 commands.
15:11:33 does he need to know 3 commands to be executed in a certain order?
15:11:38 i vote for explicit, this is a one-time thing, right?
15:11:58 the workflow usually goes like this: you warn your tenant of a maintenance
15:12:02 you bring the router down
15:12:04 you migrate
15:12:08 Can the admin run them one after the other with no delay?
15:12:13 you bring it up (and hope that everything works)
15:12:24 then go back to the tenant and tell him that everything is okay :)
15:12:28 viveknarasimhan: I agree with you and that is the reason we wanted consensus from all of us before proceeding.
15:12:37 Or, does the admin need to wait on something before being allowed to run the migration?
15:13:11 carl_baldwin: we need to check it out.
15:13:34 So if we all agree with armax: this is how it has to be done.
15:13:39 I ask because it could increase router downtime. But, it is a one-time migration and can be planned downtime.
15:13:56 if one of the 3 commands fails
15:13:59 is there a way to rollback
15:14:04 I vote for explicit - it is more straightforward
15:14:07 or does he need to recreate the centralized router again?
15:14:07 We will document that the "admin" needs to first bring down the router, migrate the router, and then tell the tenant to use it.
15:14:44 carl_baldwin: i imagine that it's better being explicit
15:14:44 any questions or concerns there?
15:15:15 we might want to give ourselves some room between the router going down and the migration
15:15:30 if we do everything in one shot there's a risk something gets scheduled on that router in between
15:15:49 armax, agreed
15:15:54 armax: agreed
15:16:00 scheduled as in something happens to that router
15:16:01 I don't think I'm concerned.
15:16:12 unlikely, but you never know
15:16:18 if a problem happens, it could happen in the 3-step process as well
15:16:19 explicit sounds less surprising from the POV of admins
15:16:27 and rollback is part of the migration failure case, right?
15:17:12 yyywu: when do you think that rollback should happen
15:17:46 Swami: I am thinking that if a migration failure happens, rollback should kick in.
15:17:47 right now we are not targeting 'rollback' but we can flag a "migration-error" if something odd happens.
15:18:14 rollback can't really happen if we don't implement the distributed->centralized path
15:18:21 Swami: i think we can live with it as it is.
15:18:44 a recovery procedure would be to destroy and recreate the router (with all the interfaces and gateway associated with it)
15:18:45 armax: you are right.
15:19:10 Anything else on migration / admin_state?
15:19:31 carl_baldwin: admin_state is done.
15:19:41 Next question is on the VM migration.
15:20:33 How will VM migration be handled during router conversion?
15:20:41 There are two cases.
15:21:15 One: the admin would like to use the same compute node, so they will not disturb the VMs, but restart the l3-agent with DVR mode enabled.
15:22:29 The other case is where the admin wants to move all their VMs to a greenfield deployment of DVR-enabled nodes. So they bring up new compute nodes with DVR-enabled L3-agents. In this case the VM migration is out of scope for the dvr team.
15:22:47 The first is the only scenario I had in mind.
15:23:25 in the first case the l3-agent would need to be updated as well as ovs
15:23:41 Is this live migration in the second case?
15:24:04 carl_baldwin: Ok, if we only target the first scenario, then we will go through the use cases.
15:24:17 one question, during router conversion, could nova initiate vm migration?
15:24:27 carl_baldwin: yes it is a kind of live migration.
15:25:06 yyywu: nova has no idea of router conversion, I don't think nova will be aware of the router changes.
15:25:16 I think we should consider the second case out of scope.
15:25:39 Swami: prior to doing the migration every compute host needs to run the l2 (with dvr enabled) and l3 agents
15:25:44 correct?
15:25:54 correct armax
15:26:14 armax: agreed.
15:26:31 So the admin issues a "router_admin_state_down".
15:26:34 does it make sense to keep the compute host disabled during the migration?
15:26:48 Then the admin prepares the compute node for migration.
15:27:12 And the admin updates the router for migration.
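To make the explicit workflow discussed above concrete, here is a minimal sketch of the three-step centralized-to-DVR migration, assuming the Juno-era python-neutronclient library; the credentials, endpoint, and router ID below are hypothetical placeholders, not values from the meeting.

    from neutronclient.v2_0 import client

    # Hypothetical admin credentials and endpoint; replace with real values.
    neutron = client.Client(username='admin', password='secret',
                            tenant_name='admin',
                            auth_url='http://controller:5000/v2.0')

    router_id = 'ROUTER_UUID'  # hypothetical ID of the router being converted

    # 1. Warn the tenant, then take the router down for the planned maintenance window.
    neutron.update_router(router_id, {'router': {'admin_state_up': False}})

    # 2. Convert the centralized router to a distributed (DVR) router.
    neutron.update_router(router_id, {'router': {'distributed': True}})

    # 3. Bring the router back up and verify that traffic flows again.
    neutron.update_router(router_id, {'router': {'admin_state_up': True}})

Keeping the three calls separate, as the discussion leans toward, leaves room to prepare the compute hosts between taking the router down and converting it.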
15:27:33 my understanding was that during a planned upgrade the admin would deploy the right services with the right configs
15:27:35 This is for the "case 1" where we use the existing compute nodes.
15:27:38 on the elements of the cloud
15:28:24 but router migration should probably be a step right after the upgrade is complete
15:28:47 not 100% true in every case
15:29:07 especially if the default router type is 'centralized'
15:30:01 carl_baldwin: armax: viveknarasimhan: mrsmith: are we all in agreement
15:30:25 Swami: on VM migration?
15:30:33 focus on case 1?
15:30:45 I think so. I'm not keen on adding the second case to our scope.
15:31:05 i agree. we will try to get case 1 fully covered
15:31:07 Maybe in kilo if there is demand.
15:31:14 case 2 looks a bit complex
15:31:16 makes sense
15:31:34 Swami: anything else?
15:31:39 even though moving a vm to a new host
15:31:41 mrsmith: carl_baldwin's reply should have answered your question about VM migration. We should reduce our scope to the case 1 that we discussed.
15:31:48 does look pretty much like a scheduling event
15:31:56 so dvr should handle it just as well
15:32:08 carl_baldwin: that's all from me.
15:32:48 Swami: thanks. Let's keep pounding on the DVR code and fixing bugs. We've already got some fixes done and a few more on the way. Great job!
15:33:02 i have a small dvr question
15:33:11 yamamoto_: go ahead.
15:33:17 carl_baldwin: np
15:33:17 see https://review.openstack.org/#/c/110188/
15:33:30 it's about ofagent but ovs-agent looks the same
15:33:49 isn't it a problem for unbind_port_from_dvr?
15:34:43 yamamoto_: this will take a bit to look into. Do you mind if we take the question to the neutron room?
15:35:00 np. i just wanted the dvr folks to know.
15:35:26 Now is a good time to grab them in the neutron room.
15:35:31 #topic l3-high-availability
15:35:37 safchain: armax: Any update here?
15:35:51 hi
15:35:55 carl_baldwin: going through the review bits
15:36:19 armax: I need to allocate more time to this though
15:36:22 I addressed comments and reworked base classes
15:36:36 Hi guys, sorry I'm late, tried to find the correct meeting room :)
15:36:44 Working on l3 agent functional testing: https://review.openstack.org/#/c/109860/
15:36:47 I said I'd review last week and did not. But, now with the bulk of DVR merged, I have some review cycles.
15:36:48 Something basic for starters
15:37:04 currently rebasing the scheduler part
15:37:19 The l3 agent patch itself: https://review.openstack.org/#/c/70700/ - Adds HA routers to the functional tests
15:37:54 Once that's working I'll be able to respond to reviewer comments and refactor the HA code in the l3 agent so that it isn't as obtrusive
15:38:33 amuller: do you have a timeline for getting that working?
15:39:16 the base patch that adds the functional tests is working
15:39:49 the ha additions in the l3 ha agent patch aren't... I figure I need 2-3 days working on that and I'll start pushing new patchsets that change the code itself and not the functional tests
15:40:18 amuller: Thanks. I need to catch up on the progress. I'll review today.
15:40:25 Anything else?
15:40:38 ok for me
15:40:49 all good
15:40:50 Thanks
15:40:56 #topic bgp-dynamic-routing
15:41:01 devvesa_: hi
15:41:06 hi
15:41:22 sorry, i was out last week
15:41:59 Anything to report?
15:42:30 still working on it, i am close to pushing a WIP patch soon
15:42:50 so you can start reviewing it
15:43:13 devvesa_: That'd be great.
15:43:27 Be sure to ping me when you post it and I'll have a look.
15:43:41 ok, great
15:44:00 devvesa_: Anything else?
15:44:14 no, nothing else for the moment
15:44:39 devvesa_: thanks
15:44:44 thanks carl
15:45:10 All of the other usual topics are deferred to Kilo. I'll defer discussion for now.
15:45:28 #topic reschedule-routers-on-dead-agents
15:45:36 kevinbenton: hi, this one is yours.
15:46:13 topic title is pretty self-explanatory. i would like routers to be taken off of dead agents so they can be automatically rescheduled
15:46:29 here is one approach https://review.openstack.org/#/c/110893/
15:46:59 So, the L3 HA blueprint solves the same problem
15:47:09 this needs to be in icehouse
15:47:14 IMO
15:47:21 so i was hoping for a bugfix
15:47:33 kevinbenton, amuller I think the two overlap
15:47:47 and I see kevinbenton's approach also as a contingency plan
15:47:51 I was under the impression that people use pacemaker and other solutions currently
15:47:54 This might cross the line from bug fix to feature. Might be hard to get into Icehouse.
15:48:19 that mitigates the need to rely on external elements for the fail-over process
15:48:29 amuller: We've toyed around with a pacemaker solution. A colleague gave a talk at the Atl. summit.
15:49:01 amuller: I can confirm it from the Mirantis Fuel perspective
15:49:06 We use a Pacemaker based solution in RH OpenStack as well
15:49:11 to solve the L3 HA issue
15:49:31 it's annoying to have to use an external process to do something as simple as rescheduling
15:49:34 we are using pm/crm to manage l3 agents and force rescheduling of routers. Of course with some downtime ;(
15:49:49 In our testing, we found it very easy to get into situations where nodes start shooting themselves. It turned out to be somewhat difficult to get right.
15:50:21 kevinbenton: I agree, but I'm really conflicted about whether something like this should be merged... Since L3 HA is the consensus on how to do it, I'd be really careful with making the code any more complicated pre L3 HA
15:50:40 L3 HA == VRRP blueprint
15:51:03 i don't see how this is the same really
15:51:16 it just does what can be done with the existing API
15:51:21 I think kevinbenton's proposal targets non-HA deployments
15:51:31 granted we want to minimize potential code conflicts
15:51:49 so let's see how the two develop and make a call later on when the code is more mature
15:51:56 armax: right, and i don't think there would be
15:51:58 From my point of view VRRP+cn_sync looks easier than rescheduling, in terms of the technologies used. But it's not so easy to implement.
15:52:12 aleksandr_null: what?
15:52:16 I know that Rackspace uses something similar to your proposal, Kevin
15:52:20 aleksandr_null: did you see my patch?
15:52:23 I'd see L3HA as the canonical way of doing things
15:52:29 it's like 10 lines
15:52:30 they monitor the RPC bus and reschedule routers as needed
15:52:37 kevinbenton: Will take a look, of course.
15:52:48 armax: +1
15:53:10 that said, there are situations where L3HA as a solution won't be available
15:53:11 armax: yes, l3ha is definitely the way to move forward, but I'm trying to address an issue in icehouse
15:53:19 if possible
15:53:28 now, people might have come up with their own solutions
15:53:37 homegrown and painful
15:53:50 I think kevinbenton is trying to see whether some of that pain can be taken away :)
15:53:56 it's embarrassing that a node goes down and we just throw our hands up
15:53:57 amuller: But what happens if something goes wrong with communications inside the cloud? MQ fails from time to time; of course it's out of scope, but VRRP will handle that autonomously
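As an illustration of the monitor-and-reschedule approach described above (taking routers off dead L3 agents with the existing API), here is a rough sketch assuming python-neutronclient; it is not kevinbenton's patch, and the client setup is a hypothetical placeholder.

    from neutronclient.v2_0 import client

    # Hypothetical admin credentials and endpoint; replace with real values.
    neutron = client.Client(username='admin', password='secret',
                            tenant_name='admin',
                            auth_url='http://controller:5000/v2.0')

    agents = neutron.list_agents(agent_type='L3 agent')['agents']
    dead = [a for a in agents if not a['alive']]
    live = [a for a in agents if a['alive']]

    for agent in dead:
        routers = neutron.list_routers_on_l3_agent(agent['id'])['routers']
        for router in routers:
            # Remove the router from the dead agent. Note this does not clean
            # up whatever namespaces the dead (or unreachable) agent may have
            # left behind -- the "zombie router" concern raised below.
            neutron.remove_router_from_l3_agent(agent['id'], router['id'])
            # Re-add it to a live agent; alternatively, leave it unscheduled
            # and let the scheduler pick an agent on the next sync.
            if live:
                neutron.add_router_to_l3_agent(live[0]['id'],
                                               {'router_id': router['id']})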
15:54:08 I'm concerned that simply rescheduling will not be enough. A pacemaker/corosync type solution would shoot the dead node. This solution would not. With the agent down, there is no one left around to clean up the old router.
15:54:53 I'd promote that effort, but I'd reserve judgment on whether it's icehouse/juno material once the code is complete
15:54:57 kevinbenton: If the code doesn't end up being more complicated after it's properly tested, and properly solves the problem, then it's safe enough to merge as it is, but I have a gut feeling you'll find that it's gonna end up a lot more complicated
15:54:59 kevinbenton: how far off are you?
15:55:17 In many situations, the old routers could still be plumbed and moving traffic.
15:55:42 carl_baldwin: do your compute nodes frequently lose connectivity to the neutron server?
15:56:30 kevinbenton: it depends on the architecture of the cluster. We had a situation where a customer just disabled the mgmt/comm network.
15:56:36 kevinbenton: There are many reasons an agent can be considered dead.
15:56:38 For a while.
15:56:58 armax: i have the basic patch there, but it doesn't address zombie nodes like carl_baldwin mentioned
15:57:14 aleksandr_null, carl_baldwin: how does the openvswitch agent handle a broken management network?
15:57:34 we have to assume that's down too then, right?
15:57:43 yep.
15:58:13 kevinbenton: yes, the agent goes inactive.
15:58:41 well then yes, i wasn't aware we supported headless operational modes
15:59:03 my patch is pointless and this isn't a problem that can be solved from the neutron server
16:00:04 because it doesn't actually know if these routers are online or not
16:00:22 IMHO this could be solved only by using autonomous solutions like vrrp; I don't have any other solutions related to RPC/MQ because they couldn't work autonomously =(
16:01:41 kevinbenton: It is still something that needs to be addressed. Our pacemaker / corosync solutions have not been as great as we'd hoped.
16:01:44 kevinbenton's solution obviously needs cooperation between servers and agents
16:02:13 kevinbenton: saying it's pointless is a bit harsh
16:02:14 :)
16:02:18 of course a mgmt network outage is an extraordinary case. The rescheduling that kevinbenton suggests may be improved by monitoring neighboring nodes and/or the mgmt net, and if something happens to the mgmt net then don't do anything. Just a suggestion.
16:02:33 every solution has tradeoffs
16:02:45 armax: +1
16:02:51 +1
16:02:55 the larger question is: do we want to provide some degree of built-in functionality?
16:03:07 with any of the cons that may have?
16:03:18 external pcm/cm also has issues
16:03:37 carl_baldwin: Completely agree. pm/crm does almost the same as what Kevin suggested and also wouldn't work well if the mgmt network is down.
16:03:46 so long as kevin's proposal is not disruptive to the current effort for L3HA
16:03:52 the corosync cluster will just split up.
16:03:59 kevinbenton: We'd appreciate any contributions to the L3HA efforts :)
16:04:02 I'd like to have the option to decide whether to take it or not
16:04:25 I could test it in a corosync environment and without one
16:04:26 amuller: indeed, would kevin's time be best put to L3HA?
16:04:31 kevinbenton: that's a question for kevin :)
16:04:41 kevinbenton: testing/reviewing would be awesome, and there are loose ends too
16:04:44 he might have a hidden customer requirement ;)
16:04:49 armax: that's not going to happen right now. I need a solution for icehouse
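For comparison, a sketch of the L3 HA (VRRP) alternative promoted in this discussion, assuming the `ha` router attribute proposed in the patches under review (not yet merged at the time of this meeting); the client setup and router name are hypothetical placeholders.

    from neutronclient.v2_0 import client

    # Hypothetical admin credentials and endpoint; replace with real values.
    neutron = client.Client(username='admin', password='secret',
                            tenant_name='admin',
                            auth_url='http://controller:5000/v2.0')

    # keepalived/VRRP on the hosting L3 agents handles failover autonomously,
    # without depending on the management network or an external
    # pacemaker/corosync cluster to reschedule the router.
    router = neutron.create_router({'router': {'name': 'ha-router1', 'ha': True}})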
16:04:54 armax: not hidden :-)
16:05:03 kevinbenton: right, you know what I mean
16:05:47 kevinbenton: I think it makes sense if you keep on working on this, let's revisit the progress in a week
16:05:54 I'm glad the discussion is open. HA will be a hard nut to crack. Is this something we want to add to the permanent agenda?
16:06:05 I just noticed we're over time. Anyone else waiting for the room?
16:06:09 and see how far we got
16:06:14 waaaay over time
16:06:32 * carl_baldwin is really sorry about going over time if someone is waiting for the room.
16:06:50 looks like nobody :)
16:06:53 carl_baldwin: they would have kicked us out
16:06:56 :)
16:07:00 bye everyone
16:07:06 bye
16:07:10 bye
16:07:18 bye guys!
16:07:25 I've got to run. I'd like to discuss rescheduling routers more. I'll keep it in the agenda near HA.
16:07:45 I'll also get some of our guys with experience with our HA solution on kevinbenton's review to provide insight.
16:07:52 Thanks all
16:08:06 #endmeeting