14:30:23 <mestery> #startmeeting neutron nova-network parity
14:30:24 <openstack> Meeting started Wed Jul 23 14:30:23 2014 UTC and is due to finish in 60 minutes. The chair is mestery. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:30:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:30:27 <openstack> The meeting name has been set to 'neutron_nova_network_parity'
14:30:35 <openstack> mestery: Error: Can't start another meeting, one is in progress. Use #endmeeting first.
14:30:41 <mestery> Whoops
14:30:45 <mestery> #link https://wiki.openstack.org/wiki/Meetings/NeutronNovaNetworkParity Agenda
14:31:00 <Swami> hi everyone
14:31:09 * mestery waits a few minutes to let anyone else filter in.
14:31:52 <mestery> #topic Gap Analysis Plan
14:31:57 <mestery> #link https://wiki.openstack.org/wiki/Governance/TechnicalCommittee/Neutron_Gap_Coverage
14:32:13 <mestery> So, maybe to start, we could quickly go over where we're at for each item in this plan?
14:32:27 <markmcclain> want me to run through it?
14:32:31 <mestery> markmcclain: Please do :)
14:32:42 <markmcclain> #info Gap 0 is complete
14:33:51 <markmcclain> we merged a healing migration that updates the various schema branches that had formed to the canonical version that includes all models
14:34:09 <mestery> That was awesome work by the team working on gap 0!
14:34:34 <markmcclain> yeah I was super happy to see everyone working to develop a solution for it
14:35:04 <markmcclain> Gap 1 Tempest testing is nearly done
14:35:18 <markmcclain> the lone holdout is enabling the full tempest job for voting
14:35:33 <markmcclain> salv-orlando is working on that bit
14:35:47 <markmcclain> Gap 2 is resumption of Grenade testing
14:36:15 <mestery> #link http://lists.openstack.org/pipermail/openstack-dev/2014-July/040973.html Full Job Email Update
14:37:04 <markmcclain> mestery: thanks for the link
14:37:19 <mestery> markmcclain: np
14:37:46 <markmcclain> for Gap 2 we have to make some changes to the way devstack sets up Neutron
14:38:20 <markmcclain> basically we need to stop misusing enable_service
14:38:43 <markmcclain> I'm working on those and should have them proposed shortly
14:38:57 <mestery> Excellent!
14:39:04 <markmcclain> Gap 3 is dependent on Gap 1 and 2
14:39:34 <mestery> #link https://review.openstack.org/#/c/105785/ Neutron as default in devstack
14:40:38 <markmcclain> cool.. just need to track down the failures
14:40:51 <markmcclain> Gap 4 is where we are hurting the most
14:41:08 <markmcclain> the spec within Nova did not get approved by the deadline
14:41:35 <markmcclain> oops.. Gap 5 spec did not get approved
14:41:36 <mestery> Wait, you meant gap 6
14:41:43 <mestery> 4 is missing API calls :)
14:41:46 * markmcclain needs more coffee
14:42:10 * mestery hands markmcclain coffee with kahlua
14:42:26 <marun> obondarev - you want to update on your progress?
14:42:31 <markmcclain> Gap 4 API calls is complete
14:42:40 <markmcclain> Gap 5 is DVR work which is on track
14:42:48 <obondarev> marun: sure I will
14:43:16 <mestery> obondarev: This is for Gap 6 now (nova-net to neutron migration)
14:43:33 <obondarev> mestery: ok
14:43:47 <obondarev> so according to nova team feedback on the spec the overall design of the neutron migration was slightly changed
14:44:11 <obondarev> now instead of migrating to neutron within one compute host the idea is to migrate between hosts
14:44:27 <obondarev> and make neutron migration part of the existing live-migration mechanism
14:44:33 <mestery> #link https://review.openstack.org/#/c/101921/ Neutron migration specification
14:44:49 <obondarev> which is reasonable as it is the usual way to perform big upgrades
14:44:53 <mestery> obondarev: ++
14:44:54 <obondarev> I mean host-by-host
14:45:04 <obondarev> mestery: thanks for the link
14:45:24 <obondarev> currently I'm working on agreeing the design with the nova team and implementing a POC in parallel
14:45:31 <obondarev> thanks to dansmith for his reviews and suggestions!
14:46:24 <obondarev> there are some difficulties with the POC as now I'm not even able to perform an original live migration on a multinode devstack
14:46:27 <marun> I notice that daniel berrange has provided contradictory advice as to the chosen strategy on the most recent patch :/
14:46:46 <marun> Hopefully we can hammer out the contradictory advice at the mid-cycle meetup next week.
14:46:52 <obondarev> marun: right
14:46:58 <mestery> marun: ++, you and markmcclain will be busy with that next week
14:47:17 <dansmith> well,
14:47:23 <obondarev> I followed the guide from official openstack docs but still facing some issues with live-migration
14:47:33 <dansmith> he said that live migration can't always work, which is what I said regarding making sure that cold migration works as well
14:47:48 <marun> obondarev: as per dan's comment, maybe the starting point is more properly cold migration, since it doesn't have hypervisor dependencies?
14:47:54 <marun> dansmith: ah, gotcha.
14:48:20 <marun> dansmith: though there was a comment from him that talked about in-place network switch
14:48:25 <obondarev> is cold migration the one that is not a "true" live migration?
14:48:32 <marun> dansmith: on line 119
14:48:33 <mestery> I'm concerned that cold migration won't satisfy the downtime requirements the TC put forth though, just wanted to throw that out there.
14:48:47 <marun> mestery: where is this requirement?
14:49:00 <mestery> marun: Documented in the TC meeting minutes from last April I'm afraid :(
14:49:01 <dansmith> marun: yeah, I wonder if he reviewed the previous version of this and saw what that implied
14:49:03 <mestery> markmcclain: Thoughts on this?
14:49:16 <marun> mestery: I think we should pull it out of the minutes and formalize it in the gap coverage page
14:49:24 <mestery> marun: ++ I'll take an action to do that.
14:49:25 <marun> mestery: the lack of visibility is a problem both on the nova and neutron sides
14:49:25 <dansmith> mestery: I would expect that "zero downtime for any and all VM types" is probably not a reasonable requirement anyway
14:49:36 <mestery> #action mestery to scour meeting minutes around downtime requirements for migration and add to coverage wiki
14:49:41 <mestery> marun: agreed
14:49:57 <mestery> dansmith: I agree, just throwing it out there from my memory of the TC meeting.
14:49:57 <markmcclain> right… so originally we specified that network connections can/will drop
14:51:00 <markmcclain> but committed to keeping the VMs running; if it is impossible then we can always go back and say that keeping them running will be problematic and why
14:51:30 <mestery> markmcclain: Makes sense to me.
14:52:15 <dansmith> keeping them running and doing it in-place would be awesome, but I'm not sure it's worth what we'll have to do in order to support it
14:53:24 <obondarev> dansmith: by cold migration do you mean "nova migrate..." one?
14:53:26 <markmcclain> agreed… there are some that don't want to reboot everything to upgrade
14:53:34 <dansmith> obondarev: yes
14:54:05 <markmcclain> I just think we have to document everything properly
14:54:08 <marun> there is some question, though, of just how valuable a migration mechanism is - who the target audience is, and what kind of use cases they have for migration
14:54:30 <marun> we need something, but it's not clear what without more involvement from deployers who want to use it
14:54:41 <marun> is it worth raising the question on the operators list?
14:54:56 <dansmith> right, I think we're missing some definition about these details and requirements
14:55:03 <markmcclain> so part of the issue is that if we EOL nova-net then the operators have to have something
14:55:41 <marun> markmcclain: sure. I don't think that precludes doing some research to see what 'something' should e
14:55:41 <marun> be
14:56:05 <marun> and maybe that should be driven from the TC side, given that it's them that are setting the requirements
14:56:49 <markmcclain> I'll work on narrowing the scope a bit
14:57:29 <marun> from the TC side you mean?
14:58:07 <marun> dansmith mentioned that there could be folks from metacloud at the mid-cycle next week (if we're lucky)
14:58:08 <markmcclain> marun: yes
14:58:23 <marun> even if not, it might be worth engaging with them since they're a pretty heavy nova network user
14:58:30 <dansmith> yeah,
14:58:45 <dansmith> knowing what they'd expect a migration to have to look like before they'd be willing to take it would be good data
14:59:21 <markmcclain> good point
14:59:26 <marun> markmcclain: so, we can leave it to you to drive narrowing the requirements from the TC side
14:59:45 <Swami> mestery: I need to quit, I have to be in the DVR status meeting, if you need any info please ping me.
14:59:48 <marun> markmcclain: and I'd hope you'd raise the point of engaging with operators so that the effort is able to be grounded in actual requirements
14:59:54 <mestery> Swami: will do, thanks!
15:00:10 <markmcclain> marun: yeah.. I'll try to make items more specific
15:00:18 <markmcclain> re cold vs live migration
15:00:42 <dansmith> markmcclain: also, it'd be good to know if the TC expects every possible vertex on the matrix of configurations to be migratable, without downtime, etc
15:01:25 <dansmith> markmcclain: because it could be that just providing a nova-manage command to tweak the database while everything control-plane-wise is shut down would be sufficient for folks that can't do migrations
15:01:41 <mestery> dansmith: we didn't delve into that level of specifics at the meeting, there is a definite gray zone here.
15:01:48 <markmcclain> we know there will be configurations that cannot be universally upgraded, so part of the process is documenting the ones that cannot be done
15:01:56 <dansmith> markmcclain: and then let nova-compute startup migrate the VIFs on the next startup
15:02:07 <markmcclain> and then give the operator options to manually resolve
15:02:18 <dansmith> markmcclain: okay, well, we should be able to make serious progress on this next week, in terms of ideas and feasibility I think
15:02:51 <markmcclain> dansmith: agreed
15:04:23 <obondarev> also Nachi Ueno has proposed another possible way for neutron migration recently
15:04:42 <obondarev> #link https://docs.google.com/presentation/d/12w28HhNpLltSpA6pJvKWBiqeEusaI-NM-OZo8HNyH2w/edit#slide=id.p
15:04:52 <nati_ueno> hi
15:05:17 <obondarev> hi Nachi
15:05:27 <nati_ueno> it's still in idea phase, but I think using nova-network manager code is also a simple way
15:05:35 <marun> I'm not sure of the value of the proposed approach, since it still requires migrating responsibility between nova network and neutron
15:05:44 <obondarev> The idea looks nice as it requires minimal nova-side changes and it seems it can be implemented fairly quickly
15:05:47 <marun> what does this intermediate step buy us?
15:05:48 <nati_ueno> I think we can make no downtime
15:06:04 <dansmith> nati_ueno: this is more like what I was expecting us to have
15:06:05 <obondarev> but it is still not a true neutron migration I think
15:06:21 <dansmith> nati_ueno: changes to nova-network to bridge the gap until we could "go direct to neutron" after some amount of small transition
15:06:31 <nati_ueno> so our current approach is 100% neutron compat 50% nova compat.
15:06:42 <nati_ueno> I think we should start with 100% nova compat 50% neutron compat
15:06:50 <nati_ueno> then we can improve compatibility in neutron side
15:06:52 <dansmith> right
15:06:55 <nati_ueno> dansmith: ya
15:07:15 <Swami> #link: https://review.openstack.org/#/q/status:open+project:openstack/neutron+branch:master+topic:bp/neutron-ovs-dvr,n,z
15:07:46 <Swami> sorry wrong place.
15:09:01 <nati_ueno> so I believe north bound api and data plane downtime matter
15:09:52 <dansmith> nati_ueno: ah, this is why you were asking about the migrate_* methods :)
15:10:07 <nati_ueno> dansmith: yes.
15:10:44 <nati_ueno> _setup_network_on_host is missing in neutron side, but we can call it when we create/delete port
15:11:04 <marun> I think this proposal is a nicer solution, but it's not clear to me what the implementation cost would be compared to the strategies already proposed.
15:11:06 <markmcclain> we considered this approach earlier too… there are still some issues with how different elements that are currently shared between Nova and Neutron cooperate
15:11:32 <marun> and we'd need clarity from a user perspective to decide whether that cost was worth paying
15:11:52 <marun> (of course, if it's cheaper and simpler, it would be hard to argue against it)
15:11:55 <nati_ueno> markmcclain: what issues did you have?
15:11:58 <mestery> marun: ++
15:12:14 <markmcclain> iptables is one of them
15:12:39 <nati_ueno> markmcclain: For that part, we can use neutron side driver
15:12:49 <nati_ueno> markmcclain: nova-network has no security group code
15:13:06 <nati_ueno> Anyway, I agree with marun. I'll try to POC this.
15:13:19 <nati_ueno> maybe, if it works, it can be one option, right?
15:13:40 <nati_ueno> it may take 1 week, or 3 months :)
15:13:41 <markmcclain> it can definitely be an option
15:13:46 <marun> nati_ueno: I would recommend collaborating with obondarev to add details to the spec as a secondary step
15:14:03 <mestery> marun: Yes, we should coordinate this as much as possible.
15:14:03 <nati_ueno> marun: sure
15:14:06 <marun> nati_ueno: we'll need comparison between proposed approaches if we're to make a decision
15:14:06 <markmcclain> but we probably should fail fast vs letting it linger
15:14:22 <marun> markmcclain++
15:15:02 <nati_ueno> ya. anyway, this is still just an idea. Let me POC it
15:15:26 <nati_ueno> so the team should go the existing way
15:15:56 <mestery> nati_ueno: Thanks!
15:16:16 <mestery> #topic Open Discussion
15:16:21 <mestery> That's all I had on the agenda for this week.
15:16:28 <mestery> Going over the gaps and reporting progress.
15:16:33 <mestery> Anything else from anyone?
15:17:03 <markmcclain> that's all from me
15:17:31 <mestery> OK, thanks everyone!
15:17:44 <mestery> markmcclain and marun and dansmith: I hope you folks make some serious progress in person next week at the nova mid-cycle.
15:17:53 <mestery> We'll have the meeting next week as well.
15:17:56 <mestery> Thanks everyone!
15:17:59 <mestery> #endmeeting