14:00:11 #startmeeting Nova Live Migration
14:00:12 Meeting started Tue May 3 14:00:11 2016 UTC and is due to finish in 60 minutes. The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 The meeting name has been set to 'nova_live_migration'
14:00:30 Hi, is anyone here for live migration ?
14:00:36 Hi!
14:00:36 o/
14:00:38 hi I am
14:00:40 o/
14:00:46 o/
14:00:53 o/
14:00:56 \o/
14:00:59 * mriedem is here but has to run to another physical meeting for a half hour
14:01:13 o/
14:01:27 just gonna wait a moment while people pop up
14:01:44 There is an agenda here: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:02:04 * kashyap waves
14:02:08 Now I'm back home I will try to get these agendas sorted on Fridays
14:02:31 #topic Review summit outcome
14:02:45 Did everyone have a good summit ?
14:03:03 yes
14:03:20 I put the new deadlines on the agenda
14:03:28 but so you can see them here:
14:03:45 Non-priority spec freeze: May 30-03 (R-18)
14:03:45 Non-priority feature freeze: Jun 27-01 (R-14)
14:03:45 Priority spec freeze: Aug 01-05 (R-9)
14:03:45 Priority feature freeze: Aug 29-02 (R-5)
14:03:45 No deadline for CI work
14:04:02 #link https://wiki.openstack.org/wiki/Nova/Newton_Release_Schedule
14:04:08 May 30-03 means May 30th?
14:04:32 May 30 to June 3 probably
14:04:34 I think it means the week of May 30th (goes up to 3rd June)
14:04:36 june 2nd
14:04:48 thursday 6/2 is n-1 and non-priority spec approval freeze
14:05:25 mriedem, do all those dates refer to Thursday as the deadline ?
14:06:28 Anyway, you have the link there
14:06:47 I also did an ML post - with some corrections in the thread....
14:07:04 #link Summit round up: http://lists.openstack.org/pipermail/openstack-dev/2016-April/093538.html
14:07:26 Feel free to add corrections if you see any
14:07:31 mistakes
14:07:53 It was a great roundup, thanks
14:08:20 Yeah, useful to those of us who were not present
14:08:40 The storage pools work was considered right to be made a priority
14:08:57 because it addresses security issues and is a mess now
14:09:09 so we will push that one as best we can
14:09:17 the others are non priority
14:09:46 mdbooth, paul-carlton1 and I will need a little help
14:10:04 mdbooth, are you going to share a list of patches soon ?
14:10:40 PaulMurray: They should be automatically attached to the bp?
14:11:00 So, breaking news, there are now 2 of us working on this series
14:11:12 * diana_clarke waves hello! I'll be working with mdbooth to help with this.
14:11:12 mdbooth, who is the second ?
14:11:19 ^^^
14:11:26 hello diana_clarke
14:11:33 what time zone are you ?
14:11:34 PaulMurray: hello!
14:11:45 PaulMurray: Toronto, Canada (EDT)
14:11:50 diana_clarke: welcome
14:11:58 andrearosa: Thanks!
14:12:30 great, good to have you with us
14:12:57 mdbooth, is the spec approved - I didn't look before the meeting
14:13:10 I don't think so, but I know it's being approved
14:13:14 s/approved/reviewed/
14:13:38 No, it's not approved
14:13:40 #link mdbooth's spec: https://review.openstack.org/#/c/302117/
14:13:48 right one ^^ ?
14:13:57 Yes
14:14:34 ok - let's get this reviewed
14:14:44 dansmith said he was looking at it
14:15:00 as we speak
14:15:01 good - dansmith did say he would help with core reviews
14:15:02 So I'm expecting a -2 any time now ;)
14:15:08 thanks dansmith
14:15:55 The other spec we focused on in the session was post-copy
14:16:20 luis5tb, I think that's your spec right ?
14:16:26 yes
14:16:40 #link Add post copy: https://review.openstack.org/#/c/301509/
14:16:46 Were you there ?
14:16:57 PaulMurray: Did you see danpb's post about performance testing?
14:17:03 no, unfortunately I couldn't attend the summit
14:17:26 mdbooth, no, I didn't - do you have a link ?
14:17:48 http://lists.openstack.org/pipermail/openstack-dev/2016-May/093741.html
14:17:57 how should we proceed with that one? should I modify the specs so that it is an admin option (at nova.conf) to enable/disable postcopy?
14:18:10 and then automatic switch based on number of memory iterations?
14:18:26 PaulMurray: I think it's worth parking this discussion until next week when hopefully he'll have posted the results
14:19:07 but regardless of the results, we still need to have a decision on how to enable post-copy
14:19:07 the idea about only using it for small VMs is odd; using it for big VMs works very nicely
14:19:14 just for luis5tb's sake.... we need to have a discussion on your spec
14:19:18 or possibly on the ML
14:19:34 and danpb's work might inform it a little
14:19:41 even if postcopy is better for performance (which I'm assuming it is), for reliability it may not be the best solution
14:19:49 PaulMurray: We also need to pull in other drivers
14:20:06 We need a general api, not a libvirt api
14:20:28 mdbooth, agreed, but it will take time to get any consensus on it
14:20:54 luis5tb, might be able to create an argument for having it as a fixed config option
14:20:54 It's taking enough just to decide whether it's a good idea
14:21:23 can we, for now, just propose an admin configurable variable (no API modification) and automatic switch?
14:21:39 maybe later we can decide if it is worth including it as an API or not
14:22:01 We have a force completion api already, right?
14:22:24 yes, but that will degrade the performance of the VM, or even make it "non-live"
14:22:46 yes an idea we discussed earlier today was to have force complete switch to post copy if post copy is enabled in conf
14:22:50 mdbooth, right, does it make sense to do without an auto switch - i.e.
use post copy if config opt set, otherwise use pause
14:22:54 on force complete
14:22:55 We might have an admin knob to configure the behaviour of that
14:23:14 paul-carlton1: +1
14:23:28 or specify it in the body of the force complete API call, if I am not wrong we pass something in the body
14:23:46 andrearosa, that's the option they don't like
14:23:51 andrearosa: The problem with that is you start exposing qemu details in the api
14:23:53 it becomes visible in the api
14:24:02 Nobody likes that
14:24:12 oh yes because it is not available for all drivers
14:24:52 so looks like mdbooth paul-carlton1 and me ( PaulMurray ) think doing it on force-complete is ok if set by config
14:24:59 any other suggestions ?
14:25:07 I think the automatic switching is not the problem, d.Berrange likes that
14:25:40 but the discussion is still how to include the post-copy flag in the VMs, as this needs to be included when triggering the migration
14:25:46 so, a config option?
14:26:02 auto switch over is a viable option but gives the user less control
14:26:03 yes, sounds good to me - at least for now
14:26:04 I also like the idea to also include post-copy at force option
14:26:08 fyi, hyperv and xenapi appear at first glance to support live migration
14:26:17 And I know it's on vmware's immediate todo list
14:26:22 It would be nice to at least ask them
14:26:35 mdbooth, yep, we can do that
14:27:04 You could do both, i.e. allow force complete to invoke it but also allow the migration code to decide it is needed if the instance is not making progress
14:27:20 my initial idea for the automatic switching was to be based on a variable regarding the number of memory iterations before the switching
14:27:23 paul-carlton1: Yes, that's a good combo
14:27:33 paul-carlton1: The latter could be an api option, btw
14:27:40 that could be working together with the force migration + post-copy too
14:27:45 i.e. live-migration auto-force=True
14:28:32 mdbooth, do you mean that as a config opt or as
a flag in the API ?
14:28:44 PaulMurray: I was thinking api flag
14:28:51 not sure I like that
14:28:55 I like paul-carlton1's idea, pretty similar to what I already have
14:29:22 PaulMurray: I'm not sufficiently wedded to it to bike-shed it :)
14:29:23 ...but would need to think
14:29:44 luis5tb, would you like to do an updated spec
14:29:50 then we can discuss it there
14:29:58 yes, I was waiting for summit decision to do so
14:30:03 and maybe promote its existence on the ML when it's up
14:30:11 I am not 100% sure about the auto-switch if we are not making any progress, some users could prefer to abort the live migration and not go for the risky post-copy option
14:30:40 but if no progress is being made, you could do the force-completion
14:30:44 andrearosa: That's why I was thinking api flag rather than config opt
14:30:46 andrearosa: As a feature that's enablable
14:30:49 The plan is to discuss it on the spec / ML to converge on a plan
14:30:51 That is where my spec comes in https://review.openstack.org/#/c/306561 Automatic Live Migration Completion
14:30:53 indeed
14:31:22 So, luis5tb, we'll leave it with you to do the spec and tell us when it's up
14:31:26 and we can move on here
14:31:40 ok
14:31:44 #topic Specs
14:31:51 I'll update the spec and send the email
14:32:22 #link specs for review: https://review.openstack.org/#/q/project:openstack/nova-specs+status:open
14:32:42 #undo
14:32:42 Removing item from minutes:
14:32:49 that's not what I meant to post
14:33:15 I think this is what I was looking for
14:33:24 #link subteam specs for review: https://etherpad.openstack.org/p/newton-nova-priorities-tracking
14:33:47 The deadline for non-priority is very soon
14:34:57 Does anyone want to talk about one of them now ?
14:35:44 ok
14:36:07 We will review progress on these specs each week up to the freeze date
14:36:14 https://review.openstack.org/#/c/307131 Live Migration of Rescued Instances
14:36:40 reviews welcome, there is code ready to go too
14:36:40 paul-carlton1, ?
14:36:56 You asked about spec reviews
14:37:00 yep,
14:37:21 that is an interesting one - should be a quick win if it's all correct
14:37:47 next topic
14:38:05 #topic Open Discussion
14:38:25 aha - https://review.openstack.org/#/c/215483/3
14:38:27 hi, this is related to bug https://bugs.launchpad.net/nova/+bug/1470420
14:38:29 Launchpad bug 1470420 in OpenStack Compute (nova) "Set migration status to 'error' instead of 'failed' during live-migration" [Low,In progress] - Assigned to Rajesh Tailor (rajesh-tailor)
14:38:57 during summit we had a discussion with PaulMurray and alaski about the same.
14:39:17 please take a look at it, IMO, as of now we can have this solution and once a permanent fix is implemented we can remove the periodic task _cleanup_incomplete_migrations
14:40:01 abhishek: Is this just the freeform status field?
14:40:13 currently the patch is in merge conflict, I can rebase it in a few minutes if it is required
14:40:21 what did alaski say when you spoke to him?
14:40:53 alaski said he will have a look again at it, we have added a comment on the patch for him
14:41:06 This should follow whatever we do at project level, tbh. It looks like user-visible api to me, which means changing it could be regarded as a regression.
14:41:21 I mean, it's obviously a wart, but it might be a wart we have to live with
14:41:54 However, if we've decided as a project that we fix this sort of thing...
14:41:59 mdbooth, but with this we will be able to clean up the files from the source or destination node
14:42:06 mdbooth, I think we need to look at the migrations reporting in general a bit more carefully
14:42:16 We can punch operators in the face /so many times/ before they complain
14:42:37 at the moment the migrations record is used in several different ways
14:42:58 some types even end up finished when others end up completed
14:43:12 Eww
14:43:12 PaulMurray: right
14:43:41 abhishek, if it's about a quick fix, are there any other non-user facing parameters that can be used to flag that a clean up is needed ?
14:43:55 currently the patch is in merge conflict, I can rebase it
14:44:14 If we're going to do this, can we please create an enum and enforce it somewhere?
14:44:21 PaulMurray, IMO no
14:44:24 Perhaps at the object layer?
14:45:11 mdbooth, this can be done
14:45:16 abhishek, I think mdbooth's point is where this got stuck before - it's user facing and some people have tooling that looks for the values
14:47:15 PaulMurray, I think it's still good to have a way to clean up files rather than keeping it as it is
14:47:35 abhishek, I think we may need to find another way though, we keep going around on this one
14:48:26 PaulMurray, I will try to get alaski's view on this
14:48:42 abhishek, ok
14:49:09 abhishek, thanks for keeping on with it, but be open minded about how it will end up
14:49:25 PaulMurray, mdbooth: sure, thank you for your time
14:49:39 anything else for the last few minutes ?
14:49:52 anything anyone would like to see in these meetings ?
14:50:09 (in future I mean)
14:51:14 I will try to organise the subteam page: https://etherpad.openstack.org/p/newton-nova-live-migration
14:51:24 in the mean time - thanks for coming
14:51:29 Nice to meet you, folks!
14:51:42 #endmeeting
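
Note on the post-copy discussion (14:22 to 14:27): the combination that got the most support was "force complete uses post-copy if enabled by config, otherwise pauses", optionally with an automatic switch after a number of memory iterations. The sketch below is only an illustration of that decision logic. It is not Nova code, and every name in it (MigrationPolicy, post_copy_enabled, max_memory_iterations, next_action) is hypothetical; the real option names and placement were left to the updated spec.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """What the migration monitor does next (hypothetical names)."""
    CONTINUE = "continue"    # keep iterating in pre-copy mode
    PAUSE = "pause"          # pause the VM so the copy can finish
    POST_COPY = "post_copy"  # switch the migration to post-copy


@dataclass(frozen=True)
class MigrationPolicy:
    # Assumed admin-set config values; not real nova.conf option names.
    post_copy_enabled: bool = False
    max_memory_iterations: int = 0  # 0 means "no automatic switch"


def next_action(policy: MigrationPolicy,
                memory_iterations: int,
                force_complete_requested: bool) -> Action:
    """Pick the next step for an in-progress live migration."""
    if force_complete_requested:
        # Force complete: post-copy if the admin enabled it, else pause.
        if policy.post_copy_enabled:
            return Action.POST_COPY
        return Action.PAUSE

    auto_switch = (
        policy.post_copy_enabled
        and policy.max_memory_iterations > 0
        and memory_iterations >= policy.max_memory_iterations
    )
    if auto_switch:
        # Not converging after the configured number of memory iterations.
        return Action.POST_COPY
    return Action.CONTINUE
```

With post_copy_enabled left at False the function only ever returns CONTINUE or PAUSE, which matches the existing force-complete behaviour described in the meeting; nothing post-copy specific is exposed to the caller.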