14:00:29 <PaulMurray> #startmeeting Nova Live Migration
14:00:30 <openstack> Meeting started Tue Mar 29 14:00:29 2016 UTC and is due to finish in 60 minutes. The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:32 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:34 <openstack> The meeting name has been set to 'nova_live_migration'
14:00:40 <pkoniszewski> o/
14:00:43 <mdbooth> o/
14:00:46 <davidgiluk> o/
14:01:25 <PaulMurray> hi - I'm surprised to find people here - it's been so quiet today
14:01:45 * davidgiluk was here an hour ago, but then I noticed this is a UTC meeting :-)
14:01:46 <scheuran> o/
14:01:51 <PaulMurray> Agenda: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:02:10 <PaulMurray> let's wait one more minute in case of late arrivals
14:02:15 <jlanoux> o/
14:02:43 <pkoniszewski> yeah, at the very beginning I wanted to say thanks for reminding about the time change, I'd fail hard today ;)
14:03:11 <pkoniszewski> PaulMurray: ^
14:03:18 <PaulMurray> #topic Bugs
14:03:32 <PaulMurray> https://bugs.launchpad.net/nova/+bug/1550250
14:03:32 <openstack> Launchpad bug 1550250 in OpenStack Compute (nova) "migrate in-use status volume, the volume's "delete_on_termination" flag lost" [High,In progress] - Assigned to YaoZheng_ZTE (zheng-yao1)
14:03:51 <PaulMurray> markus_z sent out a mail about this one
14:04:25 <PaulMurray> #link mail thread about "delete_on_termination" bug http://lists.openstack.org/pipermail/openstack-dev/2016-March/090684.html
14:04:43 <PaulMurray> Does anyone have an opinion on this bug?
14:05:17 <PaulMurray> it seems volumes lose their delete_on_termination flag on live migration
14:05:25 <pkoniszewski> is this somehow related to live migration?
14:05:35 <pkoniszewski> I thought that this is cinder migration, not nova's live migration
14:05:52 <pkoniszewski> 4. run cinder migrate volume
14:06:01 <pkoniszewski> (one of the provided steps)
14:06:07 <mdbooth> Yup, that's not nova live migration
14:06:21 <PaulMurray> ah, you're right - I was given the impression it was nova, but it looks like cinder?
14:07:03 <PaulMurray> ok - moving on
14:07:09 <PaulMurray> (embarrassed)
14:07:37 <PaulMurray> are there any bugs anyone wants to bring up?
14:08:12 <PaulMurray> #topic Summit sessions
14:08:48 <PaulMurray> Just a reminder to add anything you want considered to the etherpad
14:09:06 <PaulMurray> #link summit sessions: https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:09:41 <PaulMurray> #topic Update libvirt domain xml interface section on live migration
14:09:54 <PaulMurray> this was added by scheuran
14:09:57 <scheuran> Hi
14:10:01 <PaulMurray> scheuran, do you want to take this?
14:10:04 <scheuran> yep
14:10:30 <scheuran> so my goal is to update the interface section of the domain.xml before live migration starts
14:10:59 <scheuran> I have 2 use cases
14:11:03 <mdbooth> scheuran: On the destination I assume?
14:11:17 <scheuran> during pre_live_migration
14:11:33 <scheuran> but update the xml with the destination information, right
14:11:49 <scheuran> #1 Live migration with the newly added neutron macvtap agent
14:11:57 <scheuran> #2 live migration across neutron agents
14:12:14 <scheuran> e.g. migrate from a host with the linuxbridge agent to a host with the ovs agent
14:12:38 <scheuran> what I need for this is the vif information for the destination host
14:12:58 <scheuran> neutron generates this when the update_binding_host_id call is made in post live migration
14:13:09 <pkoniszewski> is this something that you can take only from the destination?
I mean, this vif information
14:13:23 <scheuran> pkoniszewski, from the neutron server
14:13:30 <scheuran> and only from the neutron server
14:13:37 <pkoniszewski> okay
14:13:57 <scheuran> so in pre_live_migration I need to call neutron and ask for the vif information
14:14:15 <scheuran> but today this information gets updated in post live migration, as described before
14:14:38 <pkoniszewski> is there any incompatibility between agents? or can you live migrate from any to any?
14:14:43 <scheuran> so what I want to discuss is whether we can make the update_binding_host_id call in pre_live_migration instead of post_live_migration
14:15:03 <scheuran> pkoniszewski, today you cannot migrate between agents
14:15:09 <PaulMurray> scheuran, at the moment the port binding update triggers creating networking on dest
14:15:11 <pkoniszewski> i mean from neutron perspective
14:15:16 <scheuran> as on the target, nova always plugs the old vif type
14:15:28 <pkoniszewski> yeah, that's right
14:15:32 <PaulMurray> there are a few neutron-nova bugs that want network setup in pre_live_migrate
14:15:36 <scheuran> pkoniszewski, no, nothing from Neutron
14:15:46 <pkoniszewski> okay
14:16:04 <scheuran> PaulMurray, right, I'm aware of them - but none of them solve this issue
14:16:22 <scheuran> #link https://review.openstack.org/#/c/297100/
14:16:27 <scheuran> this is the prototype I did
14:16:36 <scheuran> it's working for the good cases
14:17:06 <scheuran> however I'm still looking for a way to roll back the port binding if migration failed...
14:17:24 <pkoniszewski> don't you have the old domain XML on the source?
14:17:30 <scheuran> yes I have
14:18:00 <scheuran> and I need to update it with the new vif information
14:18:10 <mdbooth> Probably a stupid question as I know very little about networking: I assume it's possible to have these different agents on the same segment as presented to the vm?
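[Editor's note: the approach discussed above - making the update_binding_host_id call in pre_live_migration, and rebinding to the source host on rollback - can be sketched roughly as below. The helper name and the surrounding flow are hypothetical; the `update_port` call shape follows python-neutronclient, and the real Nova change is in the prototype linked above.]

```python
def bind_port_to_host(neutron, port_id, host):
    """Point a neutron port's binding at the given compute host.

    'binding:host_id' is the portbindings attribute ML2 uses to decide
    which agent (linuxbridge, ovs, macvtap, ...) binds the port, so
    updating it makes neutron generate the vif details for that host.
    """
    return neutron.update_port(port_id, {'port': {'binding:host_id': host}})

# Hypothetical flow (a sketch, not the actual Nova code):
#   pre_live_migration:      bind_port_to_host(neutron, port_id, dest_host)
#                            -> fetch the new vif details and rewrite the
#                               <interface> section of the domain xml
#   rollback_live_migration: bind_port_to_host(neutron, port_id, source_host)
```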
14:19:04 <scheuran> mdbooth, yes, those agents can serve the same network segment
14:19:20 <scheuran> but not on the same host of course
14:19:51 <scheuran> but you could have a mixed cloud, running linuxbridge and running ovs, and everybody could talk to each other..
14:20:03 <PaulMurray> scheuran, does updating the port binding affect the source networking?
14:20:16 <scheuran> PaulMurray, no
14:20:28 <scheuran> it's just a database operation
14:20:52 <PaulMurray> I saw you removed the migrate_instance_finish() - what does that do?
14:21:47 <pkoniszewski> scheuran: so, you are asking about rollback, maybe a stupid question, but isn't rollback_live_migration in the compute manager enough? if source networking is unaffected it should work
14:21:57 <scheuran> PaulMurray, did I? In which file?
14:22:13 <pkoniszewski> in the compute manager
14:22:14 <PaulMurray> https://review.openstack.org/#/c/297100/2/nova/compute/manager.py
14:22:21 <PaulMurray> it's commented out
14:22:23 <pkoniszewski> https://review.openstack.org/#/c/297100/2/nova/compute/manager.py@5562
14:22:34 <scheuran> PaulMurray, ah right
14:22:52 <scheuran> the only purpose of this method is updating the binding_host_id
14:23:02 <scheuran> today
14:23:37 <scheuran> so for production code we could move things around a bit so that this hook is still present
14:24:03 <scheuran> pkoniszewski, is rollback_live_migration executed in every case?
14:24:20 <pkoniszewski> if something after pre_live_migration fails - yes
14:24:36 <pkoniszewski> including the pre_live_migration step
14:24:51 <scheuran> pkoniszewski, yeah - the problem is that today it won't fail in pre_live_migration
14:25:02 <scheuran> but during the migration operation
14:25:10 <scheuran> when libvirt tries to define the xml on the destination
14:25:12 <pkoniszewski> it's okay, the LM monitor will trigger rollback
14:25:20 <scheuran> and the requested devices are not present...
14:25:47 <pkoniszewski> you mean when LM is already finished from hypervisor perspective?
14:26:19 <pkoniszewski> ah ok, got it
14:26:29 <scheuran> pkoniszewski, not sure - when libvirt is executing the migration
14:26:42 <scheuran> pkoniszewski, not sure if nova treats this as already finished...
14:27:10 <pkoniszewski> this is fine, live_migration_operation goes in a separate thread
14:27:20 <pkoniszewski> we will still start the monitor
14:27:23 <pkoniszewski> that will call rollback
14:27:36 <scheuran> pkoniszewski, ok, so I'll give it a try!
14:28:05 <scheuran> just to name the alternatives..
14:28:38 <scheuran> it would be a new neutron api that returns the vif information for the destination - without persisting it in the database...
14:28:56 <scheuran> or allow a port to be bound to 2 nodes (source & target)
14:29:21 <scheuran> but just doing the port binding in pre_live_migration seemed to be the easiest way
14:29:55 <pkoniszewski> we use the last approach that you mentioned for volumes
14:30:29 <scheuran> pkoniszewski, so during migration both hosts own the volume?
14:30:44 <PaulMurray> scheuran, volumes are attached to both hosts
14:30:50 <PaulMurray> during migration
14:30:57 <pkoniszewski> we have a connection open to both hosts, so even if nova fails during LM, the instance will keep operating
14:31:08 <scheuran> pkoniszewski, ok I see
14:31:16 <pkoniszewski> sounds like the most secure approach to me
14:31:24 <scheuran> but in the database the owner is still the source, until migration finished, right?
14:31:41 <pkoniszewski> the owner is the instance, which does not change
14:31:41 <PaulMurray> but it's slightly different - volumes can be used from both hosts
14:31:50 <PaulMurray> with networking we want only one to get packets
14:31:52 <scheuran> right, because for neutron it's just a database problem
14:32:12 <scheuran> physically I wouldn't do anything different than today
14:32:32 <scheuran> it's "just" about updating the database record
14:33:00 <PaulMurray> scheuran, do you have a spec for this?
14:33:08 <scheuran> so during migration, the database says the port is bound to the destination, although it is still active on the source
14:33:17 <scheuran> PaulMurray, not yet
14:33:28 <scheuran> PaulMurray, I first wanted to get a feeling for which approach is the best one
14:33:43 <PaulMurray> I understand
14:33:56 <PaulMurray> specs are a good way to get wider opinion as well
14:34:06 <scheuran> PaulMurray, ok
14:34:06 <PaulMurray> so when you think you have an idea
14:34:13 <PaulMurray> it's good to write it down
14:34:18 <scheuran> also creating a blueprint?
14:34:22 <PaulMurray> yes
14:34:37 <PaulMurray> blueprints are really only used for tracking
14:34:47 <PaulMurray> but the spec will get reviewed and you get feedback
14:34:53 <scheuran> PaulMurray, ok, then this is my todo until next week..
14:35:30 <scheuran> I'm a Neutron guy - so not very familiar with the nova process..
14:35:40 <scheuran> so spec + bp, perfect
14:35:41 <PaulMurray> no worries, we're friendly
14:35:45 <scheuran> :)
14:36:09 <pkoniszewski> scheuran: also if you have a spec, you can discuss it during the nova unconference session at the summit
14:36:24 <scheuran> pkoniszewski, good point
14:36:40 <scheuran> I already added this topic to the nova-neutron topics (in the neutron etherpad)
14:36:51 <pkoniszewski> which is actually the best way to clarify things that can be implemented in different ways
14:37:21 <scheuran> ok. so to summarize - I'll try out the rollback stuff
14:37:29 <scheuran> and come up with a bp + spec by next week
14:37:49 <PaulMurray> good - thanks for bringing this to our attention
14:38:00 <PaulMurray> #topic Open Discussion
14:38:00 <scheuran> yes, thank you guys!
14:38:16 <PaulMurray> does anyone have anything else to bring up?
14:39:20 <PaulMurray> ok - thanks for coming
14:39:24 <luis5tb> Hi!
14:39:39 <davidgiluk> luis5tb: Hi Luis
14:39:43 <luis5tb> I would like to know if someone is taking a look at including post-copy live migration
14:40:08 <PaulMurray> luis5tb, are you interested in that?
14:40:16 <luis5tb> I've been working on including it (but for the Juno version) and would like to know if that would be of interest
14:40:45 <PaulMurray> what do you mean by "for Juno version"?
14:40:55 <luis5tb> yep, I want to take a look at the latest migration code and try to adapt it (many things have changed since then)
14:41:23 <luis5tb> I integrated post-copy into nova (OpenStack Juno release) a year ago
14:41:42 <PaulMurray> I think there is interest
14:41:50 <PaulMurray> there is a list here: https://etherpad.openstack.org/p/newton-nova-live-migration
14:42:02 <PaulMurray> it is on the list - you could add yourself
14:42:11 <PaulMurray> or rather anything you want to add as information
14:42:12 <luis5tb> I saw points 6 and 7
14:42:21 <davidgiluk> but if I understand it, the next step would be to write a spec?
14:42:46 <PaulMurray> davidgiluk, yes, took the words out of my mouth
14:42:57 <PaulMurray> I was actually wondering if someone is already doing that
14:43:00 <PaulMurray> ?
14:43:08 <PaulMurray> If not then please do
14:43:15 * davidgiluk doesn't know of anyone writing a spec
14:43:26 <luis5tb> ok, just wondering if someone else already took a look into it, or if it is on the to do list for the future
14:43:46 <luis5tb> ok
14:44:01 <pkoniszewski> so, the libvirt change is not in yet
14:44:08 <davidgiluk> pkoniszewski: Oh yes it is!
14:44:11 <PaulMurray> luis5tb, I think we put it off before (as a group) because we had a lot to do
14:44:13 <luis5tb> yep, it is
14:44:18 <pkoniszewski> oh, I missed it then
14:44:22 <PaulMurray> also there was the 2.6 change
14:44:42 <davidgiluk> pkoniszewski: Got merged last week
14:44:59 <pkoniszewski> good news
14:45:02 <PaulMurray> so it may be a good time to move on with it
14:45:07 <pkoniszewski> yeah
14:45:13 <luis5tb> great
14:45:49 <PaulMurray> luis5tb, don't worry about waiting for others - if someone else wanted to do it they can get together with you
14:46:23 <luis5tb> ok, I'll try to write a spec regarding work items 6/7
14:46:48 <PaulMurray> great - that's a big help
14:46:49 <pkoniszewski> yup, I will be interested in helping there
14:46:50 <PaulMurray> thanks
14:47:03 <davidgiluk> pkoniszewski: Great
14:47:16 <luis5tb> great
14:47:38 <PaulMurray> please add it to the list on the etherpad too
14:48:18 <PaulMurray> ok - anything else?
14:48:29 * PaulMurray will give a big long pause this time
14:49:32 <PaulMurray> thanks for coming
14:49:36 <PaulMurray> #endmeeting
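[Editor's note: for context on the post-copy discussion above - the libvirt support davidgiluk mentions exposes post-copy through a migration flag, with the migration still starting as pre-copy and the caller switching to post-copy on demand. The sketch below illustrates the flag composition; the constant values mirror libvirt's virDomainMigrateFlags enum but a real implementation would import them from the libvirt python bindings rather than redefining them.]

```python
# Flag values mirroring libvirt's virDomainMigrateFlags enum
# (illustrative copies; real code would use the libvirt module).
VIR_MIGRATE_LIVE = 1 << 0
VIR_MIGRATE_PEER2PEER = 1 << 1
VIR_MIGRATE_POSTCOPY = 1 << 15  # permit switching to post-copy later

def migration_flags(enable_postcopy):
    """Compose the flags a libvirt driver would pass to migrateToURI*."""
    flags = VIR_MIGRATE_LIVE | VIR_MIGRATE_PEER2PEER
    if enable_postcopy:
        # The migration still begins as normal pre-copy; the driver
        # later calls the start-post-copy operation to switch, e.g.
        # when it decides the migration is not converging.
        flags |= VIR_MIGRATE_POSTCOPY
    return flags
```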