14:00:11 <PaulMurray> #startmeeting Nova Live Migration
14:00:12 <openstack> Meeting started Tue May  3 14:00:11 2016 UTC and is due to finish in 60 minutes.  The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:13 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:16 <openstack> The meeting name has been set to 'nova_live_migration'
14:00:30 <PaulMurray> Hi, is anyone here for live migration ?
14:00:36 <luis5tb> Hi!
14:00:36 <rdopiera> o/
14:00:38 <andrearosa> hi I am
14:00:40 <paul-carlton1> o/
14:00:46 <mdbooth> o/
14:00:53 <diana_clarke> o/
14:00:56 <abhishek> \o/
14:00:59 * mriedem is here but has to run to another physical meeting for a half hour
14:01:13 <davidgiluk> o/
14:01:27 <PaulMurray> just gonna wait a moment while people pop up
14:01:44 <PaulMurray> There is an agenda here: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:02:04 * kashyap waves
14:02:08 <PaulMurray> Now I'm back home I will try to get these agendas sorted on Fridays
14:02:31 <PaulMurray> #topic Review summit outcome
14:02:45 <PaulMurray> Did everyone have a good summit ?
14:03:03 <abhishek> yes
14:03:20 <PaulMurray> I put the new deadlines on the agenda
14:03:28 <PaulMurray> but so you can see them here:
14:03:45 <PaulMurray> Non-priority spec freeze: May 30-03 (R-18)
14:03:45 <PaulMurray> Non-priority feature freeze: Jun 27-01 (R-14)
14:03:45 <PaulMurray> Priority spec freeze: Aug 01-05 (R-9)
14:03:45 <PaulMurray> Priority feature freeze: Aug 29-02 (R-5)
14:03:45 <PaulMurray> No deadline for CI work
14:04:02 <mriedem> #link https://wiki.openstack.org/wiki/Nova/Newton_Release_Schedule
14:04:08 <paul-carlton1> May 30-03 means May 30th?
14:04:32 <abhishek> May 30 to June 3 probably
14:04:32 <PaulMurray> I think it means the week of May 30th (goes up to 3rd June)
14:04:36 <mriedem> june 2nd
14:04:48 <mriedem> thursday 6/2 is n-1 and non-priority spec approval freeze
14:05:25 <PaulMurray> mriedem, do all those dates refer to Thursday as the deadline ?
14:06:28 <PaulMurray> Anyway, you have the link there
14:06:47 <PaulMurray> I also did an ML post - with some corrections in the thread....
14:07:04 <PaulMurray> #link Summit round up: http://lists.openstack.org/pipermail/openstack-dev/2016-April/093538.html
14:07:26 <PaulMurray> Feel free to add corrections if you see any
14:07:31 <PaulMurray> mistakes
14:07:53 <mdbooth> It was a great roundup, thanks
14:08:20 <kashyap> Yeah, useful to those of us who were not present
14:08:40 <PaulMurray> The storage pools work was considered right to be made a priority
14:08:57 <PaulMurray> because it addresses security issues and is a mess now
14:09:09 <PaulMurray> so we will push that one as best we can
14:09:17 <PaulMurray> the others are non priority
14:09:46 <PaulMurray> mdbooth, paul-carlton1 and I will need a little help
14:10:04 <PaulMurray> mdbooth, are you going to share a list of patches soon ?
14:10:40 <mdbooth> PaulMurray: They should be automatically attached to the bp?
14:11:00 <mdbooth> So, breaking news, there are now 2 of us working on this series
14:11:12 * diana_clarke waves hello! I'll be working with mdbooth to help with this.
14:11:12 <PaulMurray> mdbooth, who is the second ?
14:11:19 <mdbooth> ^^^
14:11:26 <PaulMurray> hello diana_clarke
14:11:33 <PaulMurray> what time zone are you ?
14:11:34 <diana_clarke> PaulMurray: hello!
14:11:45 <diana_clarke> PaulMurray: Toronto, Canada (EDT)
14:11:50 <andrearosa> diana_clarke: welcome
14:11:58 <diana_clarke> andrearosa: Thanks!
14:12:30 <PaulMurray> great, good to have you with us
14:12:57 <PaulMurray> mdbooth, is the spec approved - I didn't look before the meeting
14:13:10 <mdbooth> I don't think so, but I know it's being approved
14:13:14 <mdbooth> s/approved/reviewed/
14:13:38 <mdbooth> No, it's not approved
14:13:40 <PaulMurray> #link mdbooth's spec: https://review.openstack.org/#/c/302117/
14:13:48 <PaulMurray> right one ^^ ?
14:13:57 <mdbooth> Yes
14:14:34 <PaulMurray> ok - let's get this reviewed
14:14:44 <mdbooth> dansmith said he was looking at it
14:15:00 <dansmith> as we speak
14:15:01 <PaulMurray> good - dansmith did say he would help with core reviews
14:15:02 <mdbooth> So I'm expecting a -2 any time now ;)
14:15:08 <PaulMurray> thanks dansmith
14:15:55 <PaulMurray> The other spec we focused on in the session was post-copy
14:16:20 <PaulMurray> luis5tb, I think that's your spec right ?
14:16:26 <luis5tb> yes
14:16:40 <PaulMurray> #link Add post copy: https://review.openstack.org/#/c/301509/
14:16:46 <PaulMurray> Were you there ?
14:16:57 <mdbooth> PaulMurray: Did you see danpb's post about performance testing?
14:17:03 <luis5tb> no, unfortunately I couldn't attend the summit
14:17:26 <PaulMurray> mdbooth, no, I didn't - do you have a link ?
14:17:48 <mdbooth> http://lists.openstack.org/pipermail/openstack-dev/2016-May/093741.html
14:17:57 <luis5tb> how should we proceed with that one? should I modify the specs so that it is an admin option (at nova.conf) to enable/disable postcopy?
14:18:10 <luis5tb> and then automatic switch based on number of memory iterations?
14:18:26 <mdbooth> PaulMurray: I think it's worth parking this discussion until next week when hopefully he'll have posted the results
14:19:07 <luis5tb> but regardless of the results, we still need to have a decision on how to enable post-copy
14:19:07 <davidgiluk> the idea about only using it for small VMs is odd; using it for big VMs works very nicely
14:19:14 <PaulMurray> just for luis5tb's sake.... we need to have a discussion on your spec
14:19:18 <PaulMurray> or possibly on the ML
14:19:34 <PaulMurray> and danpb's work might inform it a little
14:19:41 <luis5tb> even if postcopy is better for performance (which I'm assuming it is), for reliability it may not be the best solution
14:19:49 <mdbooth> PaulMurray: We also need to pull in other drivers
14:20:06 <mdbooth> We need a general api, not a libvirt api
14:20:28 <PaulMurray> mdbooth, agreed, but it will take time to get any consensus on it
14:20:54 <PaulMurray> luis5tb, might be able to create an argument for having it as a fixed config option
14:20:54 <mdbooth> It's taking long enough just to decide whether it's a good idea
14:21:23 <luis5tb> can we, for now, just propose an admin configurable variable (no API modification) and automatic switch?
14:21:39 <luis5tb> maybe later we can decide if it is worthy to include it as an API or not
14:22:01 <mdbooth> We have a force completion api already, right?
14:22:24 <luis5tb> yes, but that will degrade the performance of the VM, or even make it "non-live"
14:22:46 <paul-carlton1> yes an idea we discussed earlier today was to have force complete switch to post copy if post copy is enabled in conf
14:22:50 <PaulMurray> mdbooth, right, does it make sense to do without an auto switch - i.e. use post copy if config opt set, otherwise use pause
14:22:54 <PaulMurray> on force complete
14:22:55 <mdbooth> We might have an admin knob to configure the behaviour of that
14:23:14 <mdbooth> paul-carlton1: +1
14:23:28 <andrearosa> or specify it in the body of the force complete API call, if I am not wrong we pass something in the body
14:23:46 <PaulMurray> andrearosa, that's the option they don't like
14:23:51 <mdbooth> andrearosa: The problem with that is you start exposing qemu details in the api
14:23:53 <PaulMurray> it becomes visible in the api
14:24:02 <mdbooth> Nobody likes that
14:24:12 <andrearosa> oh yes because it is not available for all drivers
14:24:52 <PaulMurray> so looks like mdbooth paul-carlton1 and me ( PaulMurray ) think doing it on force-complete is ok if set by config
14:24:59 <PaulMurray> any other suggestions ?
14:25:07 <luis5tb> I think the automatic switching is not the problem, d.Berrange likes that
14:25:40 <luis5tb> but the discussion is still how to include the post-copy flag in the VMs, as this needs to be included when triggering the migration
14:25:46 <luis5tb> so, a config option?
14:26:02 <paul-carlton1> auto switch over is a viable option but gives the user less control
14:26:03 <PaulMurray> yes, sounds good to me - at least for now
14:26:04 <luis5tb> I also like the idea to also include post-copy at force option
14:26:08 <mdbooth> fyi, hyperv and xenapi appear at first glance to support live migration
14:26:17 <mdbooth> And I know it's on vmware's immediate todo list
14:26:22 <mdbooth> It would be nice to at least ask them
14:26:35 <PaulMurray> mdbooth, yep, we can do that
14:27:04 <paul-carlton1> You could do both, i.e. allow force complete to invoke it but also allow the migration code to decide it is needed if the instance is not making progress
14:27:20 <luis5tb> my initial idea for the automatic switching was to be based on a variable regarding the number of memory iterations before the switching
14:27:23 <davidgiluk> paul-carlton1: Yes, that's a good combo
14:27:33 <mdbooth> paul-carlton1: The latter could be an api option, btw
14:27:40 <luis5tb> that could be working together with the force migration + post-copy too
14:27:45 <mdbooth> i.e live-migration auto-force=True
14:28:32 <PaulMurray> mdbooth, do you mean that as a config opt or as a flag in the API ?
14:28:44 <mdbooth> PaulMurray: I was thinking api flag
14:28:51 <PaulMurray> not sure I like that
14:28:55 <luis5tb> I like paul-carlton1 idea, pretty similar to what I already have
14:29:22 <mdbooth> PaulMurray: I'm not sufficiently wedded to it to bike-shed it :)
14:29:23 <PaulMurray> ...but would need to think
14:29:44 <PaulMurray> luis5tb, would you like to do an updated spec
14:29:50 <PaulMurray> then we can discuss it there
14:29:58 <luis5tb> yes, I was waiting for summit decision to do so
14:30:03 <PaulMurray> and maybe promote its existence on the ML when it's up
14:30:11 <andrearosa> I am not 100% sure about the auto-switch if we are not making any progress, some users could prefer to abort the live migration and not go for the risky post-copy option
14:30:40 <luis5tb> but if no progress is being made, you could do the force-completion
14:30:44 <mdbooth> andrearosa: That's why I was thinking api flag rather than config opt
14:30:46 <davidgiluk> andrearosa: As a feature that's enablable
14:30:49 <PaulMurray> The plan is to discuss it on the spec / ML to converge on a plan
14:30:51 <paul-carlton1> That is where my spec comes in: https://review.openstack.org/#/c/306561 Automatic Live Migration Completion
14:30:53 <mdbooth> indeed
14:31:22 <PaulMurray> So, luis5tb, we'll leave it with you to do the spec and tell us when it's up
14:31:26 <PaulMurray> and we can move on here
14:31:40 <luis5tb> ok
14:31:44 <PaulMurray> #topic Specs
14:31:51 <luis5tb> I'll update the spec and send the email
14:32:22 <PaulMurray> #link specs for review: https://review.openstack.org/#/q/project:openstack/nova-specs+status:open
14:32:42 <PaulMurray> #undo
14:32:42 <openstack> Removing item from minutes: <ircmeeting.items.Link object at 0xa92bb10>
14:32:49 <PaulMurray> that's not what I meant to post
14:33:15 <PaulMurray> I think this is what I was looking for
14:33:24 <PaulMurray> #link subteam specs for review: https://etherpad.openstack.org/p/newton-nova-priorities-tracking
14:33:47 <PaulMurray> The deadline for non-priority is very soon
14:34:57 <PaulMurray> Does anyone want to talk about one of them now ?
14:35:44 <PaulMurray> ok
14:36:07 <PaulMurray> We will review progress on these specs each week up to the freeze date
14:36:14 <paul-carlton1> https://review.openstack.org/#/c/307131 Live Migration of Rescued Instances
14:36:40 <paul-carlton1> reviews welcome, there is code ready to go too
14:36:40 <PaulMurray> paul-carlton1, ?
14:36:56 <paul-carlton1> You asked about spec reviews
14:37:00 <PaulMurray> yep,
14:37:21 <PaulMurray> that is an interesting one - should be a quick win if it's all correct
14:37:47 <PaulMurray> next topic
14:38:05 <PaulMurray> #topic Open Discussion
14:38:25 <PaulMurray> aha - https://review.openstack.org/#/c/215483/3
14:38:27 <abhishek> hi, this is related to bug https://bugs.launchpad.net/nova/+bug/1470420
14:38:29 <openstack> Launchpad bug 1470420 in OpenStack Compute (nova) "Set migration status to 'error' instead of 'failed' during live-migration" [Low,In progress] - Assigned to Rajesh Tailor (rajesh-tailor)
14:38:57 <abhishek> during summit we had discussion with PaulMurray and alaski for the same.
14:39:17 <abhishek> please take a look at it, IMO, as of now we can have this solution and once a permanent fix is implemented we can remove the periodic task _cleanup_incomplete_migrations
14:40:01 <mdbooth> abhishek: Is this just the freeform status field?
14:40:13 <abhishek> currently patch is in merge conflict, I can rebase it in a few minutes if it is required
14:40:21 <PaulMurray> what did alaski say when you spoke to him?
14:40:53 <abhishek> alaski said he will have a look again at it, We have added comment on patch for him
14:41:06 <mdbooth> This should follow whatever we do at project level, tbh. It looks like user-visible api to me, which means changing it could be regarded as a regression.
14:41:21 <mdbooth> I mean, it's obviously a wart, but it might be a wart we have to live with
14:41:54 <mdbooth> However, if we've decided as a project that we fix this sort of thing...
14:41:59 <abhishek> mdbooth, but with this we will be able to cleanup the files from source or destination node
14:42:06 <PaulMurray> mdbooth, I think we need to look at the migrations reporting in general a bit more carefully
14:42:16 <mdbooth> We can punch operators in the face /so many times/ before they complain
14:42:37 <PaulMurray> at the moment the migrations record is used in several different ways
14:42:58 <PaulMurray> some types even end up finished when others end up completed
14:43:12 <mdbooth> Eww
14:43:12 <abhishek> PaulMurray: right
14:43:41 <PaulMurray> abhishek, if it's about a quick fix, are there any other non-user-facing parameters that can be used to flag that a clean up is needed ?
14:43:55 <abhishek> currently patch is in merge conflict, I can rebase it
14:44:14 <mdbooth> If we're going to do this, can we please create an enum and enforce it somewhere?
14:44:21 <abhishek> PaulMurray, IMO no
14:44:24 <mdbooth> Perhaps at the object layer?
14:45:11 <abhishek> mdbooth, this can be done
14:45:16 <PaulMurray> abhishek, I think mdbooth's point is where this got stuck before - it's user facing and some people have tooling that looks for the values
14:47:15 <abhishek> PaulMurray, I think it's still good to have a way to clean up files rather than keeping it as it is
14:47:35 <PaulMurray> abhishek, I think we may need to find another way though, we keep going around on this one
14:48:26 <abhishek> PaulMurray, I will try to get alaski's view on this
14:48:42 <PaulMurray> abhishek, ok
14:49:09 <PaulMurray> abhishek, thanks for keeping on with it, but be open minded about how it will end up
14:49:25 <abhishek> PaulMurray, mdbooth: sure, thank you for your time
14:49:39 <PaulMurray> anything else for the last few minutes ?
14:49:52 <PaulMurray> anything anyone would like to see in these meetings ?
14:50:09 <PaulMurray> (in future I mean)
14:51:14 <PaulMurray> I will try to organise the subteam page: https://etherpad.openstack.org/p/newton-nova-live-migration
14:51:24 <PaulMurray> in the meantime - thanks for coming
14:51:29 <diana_clarke> Nice to meet you, folks!
14:51:42 <PaulMurray> #endmeeting