14:00:40 #startmeeting Nova Live Migration
14:00:41 Meeting started Tue Mar 7 14:00:40 2017 UTC and is due to finish in 60 minutes. The chair is tdurakov. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:42 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:45 The meeting name has been set to 'nova_live_migration'
14:00:46 hello everyone
14:01:19 * johnthetubaguy lurks
14:01:23 o/
14:02:07 agenda
14:02:09 let's start
14:02:13 https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration#Agenda_for_next_meeting
14:02:22 #topic Pike specs
14:02:43 https://review.openstack.org/#/c/438467/
14:02:47 the first one
14:02:59 johnthetubaguy: I left several comments on that
14:03:59 yes, had some good chat about that this time last week
14:04:07 raj_singh is taking the action to get that one updated
14:04:17 raj_singh: ok
14:04:26 johnthetubaguy, ok
14:04:27 I think we something for how long it takes to timeout
14:04:27 btw
14:04:34 and something to say what to do on timeout
14:04:43 both with defaults in CONF, probably
14:04:57 do we really need additional option on nova.conf?
14:05:19 I think it replaces the automatic post copy basically
14:05:41 johnthetubaguy: +1
14:05:42 so, optionally, force live-migration, rather than abort live-migration, when it times out
14:05:44 yes, but what's the point, as replace means deprecation process
14:05:49 tdurakov: I will update the spec today
14:06:00 raj_singh: yes, please
14:06:15 so I don't think auto-post copy will get a deprecation cycle as such, its more turned off by default
14:06:33 the spec will eventually give more context on the thinking
14:06:51 I'm still proposing not to introduce new options as long as we could manage this behaviour using existing options
14:06:54 it should leave us with a much easier to understand set of live-migrate configs
14:07:09 there are no options for what to do on timeout right now
14:07:31 I would rather we have a new clearer, easy to understand option, than make some crazy combo of options mean something special
14:08:28 raj_singh: we might be missing a spec actually, around the config changes pkoniszewski was thinking about
14:08:44 johnthetubaguy: That is on my todo list
14:08:48 cool
14:08:51 Will be done today or tomorrow
14:09:01 so here is what I was thinking, roughly
14:09:09 new API, as discussed
14:09:23 defaults in CONF (what to do on timeout)
14:09:34 the thing is that default value of the new config option is kind of meaningless
14:09:36 remove the failed automatic post-copy and deprecated progress timeout logic
14:10:14 tdurakov: conf defaults to "abort" options are "abort" or "force", thing to do on completion timeout
14:10:33 it replaces the auto post copy thing, that used to happen whenever you turn on post copy
14:10:58 raj_singh: please update the spec, will take another look
14:11:11 tdurakov: yes sir
14:11:14 yeah, I think we are not describing the picture well over IRC
14:11:22 to the spec for more details
14:12:47 but want to mention that overall is ok, and except the option story I like the way it's going
14:13:25 ok
14:13:28 next one
14:13:29 https://review.openstack.org/#/c/347161/
14:13:38 live migration of rescued instances
14:13:45 any updates on that?
14:14:43 seems to be no
14:14:45 there is a POC patch up I think
14:15:06 So Siva is working on that, he should have another revision soon
14:15:25 johnthetubaguy: https://review.openstack.org/#/c/308198 - this one?
14:15:37 yeah, I can't find the POC right now, but he got it working I think
14:15:45 tdurakov: no, its quite a different approach I believe
14:15:51 https://review.openstack.org/#/c/440904/
14:16:10 yeah, thats the one
14:16:40 we need a tempest test for that
14:16:40 it basically needs the compatibility and upgrade sugar, but yeah, thats the crux of it
14:16:43 I assume
14:16:48 tdurakov: thats in the spec I believe
14:17:15 boot -> rescue -> live-migrate -> unrescue
14:17:18 or something like that
14:17:48 because for now I'm not sure about that copy should be done in this if section: if not is_shared_block_storage
14:18:12 oh, it needs way more guards against when that is done, its just a hack to prove its possible to make it work
14:18:38 johnthetubaguy: yeah, the idea for test is ok
14:19:00 I'll comment PoC patch
14:19:24 any other specs I've missed?
14:20:01 ok
14:20:05 moving on
14:20:09 #topic bugs
14:20:29 my favorite one: https://review.openstack.org/#/c/244489/
14:21:20 johnthetubaguy: do we have blockers for that?
14:22:18 no had chance to re-review it
14:22:33 stuck in herding specs right now
14:23:01 the case with upgrades seems to be addressed already, please review when sufficient
14:23:16 its on my list (of doom!)
14:23:28 lol
14:23:30 ok
14:23:34 any new bugs?
14:23:47 there is the one around post copy
14:23:57 its not new as such
14:24:25 the network switch for dvr?
14:24:29 #link https://review.openstack.org/#/c/434870/
14:24:52 well, making sure we switch the networking to the destination host in a much more timely way
14:25:09 it actually changes things for post copy and regular live-migration
14:25:16 using the libvirt events
14:25:37 needs careful review, but I kinda like the improvement
14:25:58 we spot when the VM is paused on the source host, and then switch over to the destination host
14:26:04 yes, need a closer look
14:26:38 the manual testing seems to show significantly less pings dropped during a live-migrate
14:28:10 tdurakov: are you going to get chance to review the latest version of https://review.openstack.org/#/c/244489/ as well, that would be great if you could hit that and make sure its right now
14:28:25 johnthetubaguy: Please get libvirt guys to review it
14:28:37 ok, starred the post-copy change
14:28:44 davidgiluk: I have asked a few folks, but would love more eyes on that for sure
14:28:54 sure, will review claims bug again
14:29:19 johnthetubaguy: I'll ask jdenemar
14:29:33 davidgiluk: that would be great, thank you
14:30:04 moving on
14:30:07 #topic ci
14:30:34 haven't heard updates on serial console hook
14:30:44 need to ping markus_z for update
14:30:57 anyone has context?
14:32:01 seems no
14:32:13 #topic open discussions
14:32:26 who is going to visit operators mid-cycle?
14:32:27 https://etherpad.openstack.org/p/MIL-ops-live-migration
14:32:43 Oprators etherpad for LM
14:33:10 *operators
14:33:17 thanks
14:34:08 I'll respond in ml for that
14:34:13 thank you
14:34:17 any other topics?
14:34:27 raj_singh: what is 'rolling migration' ?
14:34:55 one is done during upgrades?
14:35:04 davidgiluk: not sure, what is the context?
14:35:17 raj_singh: It's in that pad you posted
14:35:36 line 40
14:35:55 it should be live migration i think
14:36:23 heh
14:36:51 its probably about patching hosts, I suspect
14:37:02 although its a bit odd
14:37:09 +1
14:37:22 VM uptime/liveliness
14:37:35 I plan to reach out to that etherpad author, I don't understand most of the questions they are trying to ask in there
14:37:59 johnthetubaguy: please use ml for that
14:38:08 so everyone will be on the same page
14:38:48 if there are no other things, let's finish sync up
14:38:53 thanks everyone
14:39:04 #endmeeting