14:00:40 <tdurakov> #startmeeting Nova Live Migration
14:00:41 <openstack> Meeting started Tue Mar 7 14:00:40 2017 UTC and is due to finish in 60 minutes. The chair is tdurakov. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:42 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:45 <openstack> The meeting name has been set to 'nova_live_migration'
14:00:46 <tdurakov> hello everyone
14:01:19 * johnthetubaguy lurks
14:01:23 <davidgiluk> o/
14:02:07 <tdurakov> agenda
14:02:09 <tdurakov> let's start
14:02:13 <tdurakov> https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration#Agenda_for_next_meeting
14:02:22 <tdurakov> #topic Pike specs
14:02:43 <tdurakov> https://review.openstack.org/#/c/438467/
14:02:47 <tdurakov> the first one
14:02:59 <tdurakov> johnthetubaguy: I left several comments on that
14:03:59 <johnthetubaguy> yes, had some good chat about that this time last week
14:04:07 <johnthetubaguy> raj_singh is taking the action to get that one updated
14:04:17 <tdurakov> raj_singh: ok
14:04:26 <tdurakov> johnthetubaguy, ok
14:04:27 <johnthetubaguy> I think we something for how long it takes to timeout
14:04:27 <tdurakov> btw
14:04:34 <johnthetubaguy> and something to say what to do on timeout
14:04:43 <johnthetubaguy> both with defaults in CONF, probably
14:04:57 <tdurakov> do we really need additional option on nova.conf?
14:05:19 <johnthetubaguy> I think it replaces the automatic post copy basically
14:05:41 <raj_singh> johnthetubaguy: +1
14:05:42 <johnthetubaguy> so, optionally, force live-migration, rather than abort live-migration, when it times out
14:05:44 <tdurakov> yes, but what's the point, as replace means deprecation process
14:05:49 <raj_singh> tdurakov: I will update the spec today
14:06:00 <tdurakov> raj_singh: yes, please
14:06:15 <johnthetubaguy> so I don't think auto-post copy will get a deprecation cycle as such, its more turned off by default
14:06:33 <johnthetubaguy> the spec will eventually give more context on the thinking
14:06:51 <tdurakov> I still proposing not to introduce new options as long as we could manage this behaviour using existing options
14:06:54 <johnthetubaguy> it should leave us with a must easier to understand set of live-migrate configs
14:07:09 <johnthetubaguy> there are no options for what to do on timeout right now
14:07:31 <johnthetubaguy> I would rather we have a new clearer, easy to understand option, than make some crazy combo of options mean something special
14:08:28 <johnthetubaguy> raj_singh: we might be missing a spec actually, around the config changes pkoniszewski was thinking about
14:08:44 <raj_singh> johnthetubaguy: That is on my todo list
14:08:48 <johnthetubaguy> cool
14:08:51 <raj_singh> Will be done today or tomorrow
14:09:01 <johnthetubaguy> so here is what I was thinking, roughly
14:09:09 <johnthetubaguy> new API, as discussed
14:09:23 <johnthetubaguy> defaults in CONF (what to do on timeout)
14:09:34 <tdurakov> the thing is that default value of the new config option is kind of meaningless
14:09:36 <johnthetubaguy> remove the failed automatic post-copy and deprecated progress timeout logic
14:10:14 <johnthetubaguy> tdurakov: conf defaults to "abort" options are "abort" or "force", thing to do on completion timeout
14:10:33 <johnthetubaguy> it replaces the auto post copy thing, that used to happen whenever you turn on post copy
14:10:58 <tdurakov> raj_singh: please update the spec, will take another look
14:11:11 <raj_singh> tdurakov: yes sir
14:11:14 <johnthetubaguy> yeah, I think we are not describing the picture well over IRC
14:11:22 <johnthetubaguy> to the spec for more details
14:12:47 <tdurakov> but want to mention that overall is ok, and except the option story I like the way it's going
14:13:25 <tdurakov> ok
14:13:28 <tdurakov> next one
14:13:29 <tdurakov> https://review.openstack.org/#/c/347161/
14:13:38 <tdurakov> live migration of rescued instances
14:13:45 <tdurakov> any updates on that?
14:14:43 <tdurakov> seems to be no
14:14:45 <johnthetubaguy> there is a POC patch up I think
14:15:06 <raj_singh> So Siva is working on that, he should have another revision soon
14:15:25 <tdurakov> johnthetubaguy: https://review.openstack.org/#/c/308198 - this one?
14:15:37 <johnthetubaguy> yeah, I can't find the POC right now, but he got it working I think
14:15:45 <johnthetubaguy> tdurakov: no, its quite a different approach I believe
14:15:51 <raj_singh> https://review.openstack.org/#/c/440904/
14:16:10 <johnthetubaguy> yeah, thats the one
14:16:40 <tdurakov> we need a tempest test for that
14:16:40 <johnthetubaguy> it basically needs the compatibility and upgrade sugar, but yeah, thats the crux of it
14:16:43 <tdurakov> I assume
14:16:48 <johnthetubaguy> tdurakov: thats in the spec I believe
14:17:15 <johnthetubaguy> boot -> rescue -> live-migrate -> unrescue
14:17:18 <johnthetubaguy> or something like that
14:17:48 <tdurakov> because for now I'm not sure about that copy should be done in this if section: if not is_shared_block_storage
14:18:12 <johnthetubaguy> oh, it needs way more guards against when that is done, its just a hack to prove its possible to make it work
14:18:38 <tdurakov> johnthetubaguy: yeah, the idea for test is ok
14:19:00 <tdurakov> I'll comment PoC patch
14:19:24 <tdurakov> any other specs I've missed?
14:20:01 <tdurakov> ok
14:20:05 <tdurakov> moving on
14:20:09 <tdurakov> #topic bugs
14:20:29 <tdurakov> my favorite one: https://review.openstack.org/#/c/244489/
14:21:20 <tdurakov> johnthetubaguy: do we have blockers for that?
14:22:18 <johnthetubaguy> no had chance to re-review it
14:22:33 <johnthetubaguy> stuck in herding specs right now
14:23:01 <tdurakov> the case with upgrades seems to be addressed already, please review when sufficient
14:23:16 <johnthetubaguy> its on my list (of doom!)
14:23:28 <tdurakov> lol
14:23:30 <tdurakov> ok
14:23:34 <tdurakov> any new bugs?
14:23:47 <johnthetubaguy> there is the one around post copy
14:23:57 <johnthetubaguy> its not new as such
14:24:25 <tdurakov> the network switch for dvr?
14:24:29 <johnthetubaguy> #link https://review.openstack.org/#/c/434870/
14:24:52 <johnthetubaguy> well, making sure we switch the networking to the destination host in a much more timely way
14:25:09 <johnthetubaguy> it actually changes things for post copy and regular live-migration
14:25:16 <johnthetubaguy> using the libvirt events
14:25:37 <johnthetubaguy> needs careful review, but I kinda like the improvement
14:25:58 <johnthetubaguy> we spot when the VM is paused on the source host, and then switch over to the destination host
14:26:04 <tdurakov> yes, need a closer look
14:26:38 <johnthetubaguy> the manual testing seems to show significantly less pings dropped during a live-migrate
14:28:10 <johnthetubaguy> tdurakov: are you going to get chance to review the latest version of https://review.openstack.org/#/c/244489/ as well, that would be great if you could hit that and make sure its right now
14:28:25 <davidgiluk> johnthetubaguy: Please get libvirt guys to review it
14:28:37 <tdurakov> ok, starred the post-copy change
14:28:44 <johnthetubaguy> davidgiluk: I have asked a few folks, but would love more eyes on that for sure
14:28:54 <tdurakov> sure, will review claims bug again
14:29:19 <davidgiluk> johnthetubaguy: I'll ask jdenemar
14:29:33 <johnthetubaguy> davidgiluk: that would be great, thank you
14:30:04 <tdurakov> moving on
14:30:07 <tdurakov> #topic ci
14:30:34 <tdurakov> haven't heard updates on serial console hook
14:30:44 <tdurakov> need to ping markus_z for update
14:30:57 <tdurakov> anyone has context?
14:32:01 <tdurakov> seems no
14:32:13 <tdurakov> #topic open discussions
14:32:26 <tdurakov> who is going to visit operators mid-cycle?
14:32:27 <raj_singh> https://etherpad.openstack.org/p/MIL-ops-live-migration
14:32:43 <raj_singh> Oprators etherpad for LM
14:33:10 <raj_singh> *operators
14:33:17 <tdurakov> thanks
14:34:08 <tdurakov> I'll respond in ml for that
14:34:13 <tdurakov> thank you
14:34:17 <tdurakov> any other topics?
14:34:27 <davidgiluk> raj_singh: what is 'rolling migration' ?
14:34:55 <tdurakov> one is done during upgrades?
14:35:04 <raj_singh> davidgiluk: not sure, what is the context?
14:35:17 <davidgiluk> raj_singh: It's in that pad you posted
14:35:36 <tdurakov> line 40
14:35:55 <raj_singh> it should be live migration i think
14:36:23 <tdurakov> heh
14:36:51 <johnthetubaguy> its probably about patching hosts, I suspect
14:37:02 <johnthetubaguy> although its a bit odd
14:37:09 <tdurakov> +1
14:37:22 <raj_singh> VM uptime/liveliness
14:37:35 <johnthetubaguy> I plan to reach out to that etherpad author, I don't understand most of the questions they are trying to ask in there
14:37:59 <tdurakov> johnthetubaguy: please use ml for that
14:38:08 <tdurakov> so everyone will be on the same page
14:38:48 <tdurakov> if there are no other things, let's finish sync up
14:38:53 <tdurakov> thanks everyone
14:39:04 <tdurakov> #endmeeting