14:00:21 #startmeeting Nova Live Migration
14:00:22 Meeting started Tue Apr 12 14:00:21 2016 UTC and is due to finish in 60 minutes. The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:23 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:25 The meeting name has been set to 'nova_live_migration'
14:00:33 hi
14:00:33 Hi, anyone here for live migration ?
14:00:36 * eliqiao lurks
14:00:36 o/
14:00:37 o/
14:00:40 hi
14:00:41 o/
14:01:00 Agenda: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:01:04 hi
14:01:04 o/
14:01:21 wait just one minute
14:01:27 hi
14:01:29 for stragglers (if any)
14:02:22 o/
14:02:33 PaulMurray: while waiting can i ask to review this series: https://review.openstack.org/#/c/304621/
14:02:49 you just did :)
14:02:59 #topic CI
14:03:00 :)
14:03:04 okay
14:03:17 tdurakov, what is the state of play for CI ?
14:03:45 here is an update: now I'm working on coverage for negative cases for live-migration. To check rollback things.
14:04:28 about job stability, any news for devstack plugin?
14:04:53 tdurakov, who was looking at devstack plugin ?
14:05:39 last time I asked there was activity led by markus_z
14:06:33 so, if it's done - we could check whether the problem with jobs is fixed by newer versions of libvirt/qemu or not
14:07:23 tdurakov, and I just pinged markus_z so let's see if he turns up
14:07:36 yep
14:08:14 he says he will be here in 5-10 mins
14:08:48 anything else while we are waiting ?
14:08:57 maybe let's move on and get back to this topic when markus is ready?
14:09:01 :)
14:09:06 ok
14:09:17 #topic Summit Sessions
14:09:34 I want to get organised for the summit sessions
14:09:50 to start with I want to make sure we have enough time
14:09:59 we have one session dedicated to us so far
14:10:13 see: https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:10:33 Wed 11:50-12:30: live migration (gaps, re-approved specs, testing)
14:10:54 So we need to make sure there are no significant clashes
14:11:02 for people with talks etc.
14:11:21 Let me know if you do (or put it on the etherpad)
14:11:37 #link summit session planning https://etherpad.openstack.org/p/newton-nova-summit-ideas
14:11:55 mdbooth, do you need time to go through storage pools
14:12:13 I know paul-carlton2 will not be there
14:12:16 I'm wondering.
14:12:33 My agenda is more around inculcating the importance of cleanup in this code.
14:13:13 But I've already put that in a spec: https://review.openstack.org/#/c/302117/
14:13:45 If people could get eyes on that spec, if it's not controversial I don't think we need a separate session
14:14:23 * markus_z joins late
14:14:25 mdbooth, do you think it is realistic to get storage pools and changing migrate to use it in this cycle
14:14:36 Yes
14:14:41 Umm..
14:14:46 Storage pools, yes
14:14:59 The goal of eliminating ssh... maybe not
14:16:13 I would like a plan to get both done if possible - it's a relatively high want for security
14:16:48 Hi, markus_z, will come to you in a moment
14:17:25 We can focus on this but there may be some changes for storage pools that require a change to externals that need a release notice, so early in O might be more realistic
14:18:27 ok - we can talk about it outside this meeting - and maybe in the summit
14:19:04 PaulMurray: Yes, that might want a session.
14:19:26 There is also discussion going on around post-copy
14:19:31 How is that going ?
14:20:03 Is luis5tb_ here ?
14:20:07 I know danpb has some comments on luis5tb's spec
14:20:08 I did not receive that much feedback besides what we already had last week
14:20:36 i can say that i agree with Daniel, we shouldn't expose it through API
14:20:55 the question then is how to structure it instead
14:20:59 we need something that paul-carlton2 mentioned in ml (some kind of SLA) or task api finally
14:21:01 I think that we can include a configuration in nova.conf
14:21:19 so that post-copy migration will be used or not based on that, per host level, instead of VM level
14:21:42 maybe somehow similar to the way of enabling tunneling for migrations
14:22:00 and then, I thought of a simple way of switching from pre-copy to post-copy
14:22:07 I'm working on a spec to define instances as cattle, pets or pandas and migrations as low, medium or high importance then use this to work out if/when to use features like post copy
14:22:09 based on the number of iterations
14:22:11 imo it's a bad idea to put it into nova.conf
14:22:13 again
14:22:31 host based is probably wrong because it does depend on the feature of the VMs
14:23:01 what's the point of not exposing it over api?
14:23:05 so we have configuration vs api
14:23:13 to not expose hypervisor-specific details through API
14:23:33 probably we should discuss it during summit
14:23:35 yep, I completely agree with Daniel regarding the automatic switching
14:23:36 * andrearosa thinks that post-copy-lm needs a discussion at the summit as well
14:23:43 well, we could "downgrade" to pre-copy if it's not supported
14:23:54 i'd go with api approach
14:23:54 though I still think it is nice to have a flag for --post-copy to only use it when you like
14:24:11 we never know when it is supported on the api level, that's the point
14:24:30 pkoniszewski: we could check it :p
14:24:43 we can check if libvirt has the VIR_MIGRATE_POSTCOPY flag
14:24:48 tdurakov: then nova can check it instead of the user
14:25:18 andrearosa: what i think is we should make l-m fully async
14:25:30 can we agree that it should be discussed during summit? we won't agree on a solution here..
14:25:30 and force user to check result with instance actions
14:25:37 pkoniszewski: +1
14:25:45 luis5tb_, my approach is to have one flag on live-migration, optional importance defaulting to medium and let it work out what features to use to force migration to complete based on that and the instance type
14:26:02 we will have all cores there and it will be their decision, we can just show pros and cons of both, they will summarize
14:26:45 so, then I should leave the spec as it is until a decision is made during the summit?
14:26:49 It will be a good idea to come in with a proposal - a straw man - and then see what everyone wants to do
14:27:13 luis5tb: yes
14:27:17 ok
14:27:20 the summit should be for decisions - discussion needs to come before if possible so everyone knows what they think before they get there
14:27:37 we have two proposals already
14:27:54 I know, but no agreement yet
14:27:56 * davidgiluk isn't sure if anyone who has experience with postcopy is at the summit, but still OK
14:27:57 with paul-carlton2's idea i think that we have three proposals
14:28:43 davidgiluk: Probably you should head there, yourself? :-)
14:28:43 IMHO, main advantage of having the flag is that you don't need to always include the VIR_MIGRATE_POSTCOPY (and its overhead) when starting the migrations
14:29:01 but overhead is very small, isn't it?
14:29:11 So that's two topics that need some time
14:29:24 is it even noticeable? do we have any numbers?
14:29:44 PaulMurray: maybe, get back to devstack plugin?
14:29:51 davidgiluk: do you have some numbers about the performance impact of using postcopy flag?
14:30:08 tdurakov, just want to finish this topic - markus_z says he can wait a bit
14:30:09 i know that it will add a few milliseconds to VM downtime in final phase
14:30:18 PaulMurray: cool
14:30:22 luis5tb_: No I don't; but I'm more worried about restrictions on features than actual overhead
14:30:40 u mean, like RDMA or compression?
14:30:42 davidgiluk: Unfortunately danpb won't be there.
14:30:55 pkoniszewski: Right
14:31:05 so, we do not support compression anyway
14:31:11 pkoniszewski: I meant overhead of including the flag, but including the flag does not mean you are actually using post-copy mode
14:31:16 and if you use RDMA do you really need to use post-copy at all?
14:31:25 luis5tb_: That overhead is pretty low
14:31:32 ok
14:32:00 pkoniszewski: My bet is it's still a bad thing to autoenable based on host; but I may be wrong
14:32:27 let's cut this one now
14:32:38 is there anything else that will need time ?
14:32:51 We will go over things to do and CI
14:32:55 davidgiluk: tbh, i don't like either solution
14:33:29 I'm wondering about anything that can't be sorted out without discussion there
14:33:58 There is some neutron/nova stuff that is listed at the end of the agenda
14:34:16 and it has been listed in the neutron/nova session for the summit
14:34:40 so may not need to budget for that - I'll make sure to follow up on that
14:35:13 ok - let's go back to markus_z now then
14:35:19 #topic Back To CI
14:35:32 are you still here markus_z
14:35:34 yep
14:35:56 tdurakov was asking about devstack plugin for versions of libvirt/qemu
14:36:10 what is the status
14:36:23 The "happy path" isn't working right now. I have 2 patches which should solve that.
14:36:24 https://review.openstack.org/#/q/project:openstack/devstack-plugin-additional-pkg-repos
14:36:39 tonyb is the only core of the project and I didn't get hold of him in the last few days
14:36:53 starred this
14:37:17 I wanted to discuss with him at the summit how to proceed.
14:37:32 markus_z: any estimates?
14:38:06 tdurakov: estimates for what? Until it is fully finished and no experimental job anymore?
14:38:27 markus_z: plugin readiness i mean
14:38:46 tdurakov: I hope to get it done by newton-1
14:38:52 acked
14:39:07 markus_z: thank you for details
14:39:09 markus_z, do you need any help - or is it just down to core review ?
14:39:32 PaulMurray: It's down to core review. We also need cores for that plugin project.
14:40:10 markus_z, is it a separate project on its own ?
14:40:11 PaulMurray: After I talked with tony I would post on the ML what the version bump would look like. Then we can discuss there.
14:40:24 PaulMurray: The devstack-plugin is, yes
14:40:47 what about moving ubuntu versions for ci instead?
14:40:51 PaulMurray: It's also of use for Neutron and Cinder, that's why we decided to keep it out-of-tree of Nova.
14:41:32 markus_z, got it - didn't clock that
14:41:46 tdurakov: You mean the test nodes in the jenkins jobs?
14:41:52 yes
14:42:04 just bump whole linux version instead :)
14:42:24 We get the pre-built libvirt(kvm) packages from Ubuntu-Cloud-Archive (at the moment).
14:42:36 batteries will be included then
14:43:22 tdurakov: I think we will bump the libvirt version more often than the linux version of the test nodes
14:43:49 markus_z: ok
14:44:27 markus_z: Only libvirt version, but not QEMU, too?
14:44:34 If you like, we can discuss this at the summit. I'll grab you, PaulMurray and tonyb and figure this out in a few minutes
14:44:47 markus_z, sounds good to me
14:44:56 markus_z: I'll miss Austin
14:45:04 markus_z, also only one core is not a good place to be - we can sort that out I'm sure
14:45:40 Thanks markus_z
14:45:46 we have xenial available for experimental jobs now
14:46:16 and sometime after summit and xenial's release will switch trusty jobs over
14:46:33 clarkb: could we discuss it on #openstack-infra i got one experimental job for this
14:46:35 What's xenial ?
14:46:41 16.0.4
14:46:49 16.04 i mean
14:47:06 new ubuntu version
14:47:17 oh, got it
14:47:32 #topic Specs
14:47:41 PaulMurray: I think it will fit live-migration job
14:48:01 I just wanted to point out that the specs need reviews
14:48:03 https://etherpad.openstack.org/p/newton-nova-priorities-tracking
14:48:28 And paul-carlton2, you need to submit newton versions of your specs
14:48:37 I am going to remove the mitaka ones from the page
14:49:41 If there are specs missing from that page please put them there
14:49:51 I will go through reviewing everything this week
14:50:00 please all do the same if you can
14:50:13 #topic Open Discussion
14:50:31 The only one here is from scheuran
14:50:42 hi
14:50:43 VIF information on migration
14:50:51 We did mention it last week
14:51:03 and I know you said the main input is needed from neutron at the moment
14:51:05 is that right ?
14:51:11 Right
14:51:18 i posted a mail on the ML
14:51:23 #link http://lists.openstack.org/pipermail/openstack-dev/2016-April/092073.html
14:52:02 The point is that we cannot just set the binding:host_id in pre_live_migration, as this would break some third-party mechanisms + the l2pop mechanism
14:52:13 so we need to figure out another way on the Neutron side
14:52:35 One idea has already been posted on the ML
14:52:37 PaulMurray, ack
14:53:02 I'll come back to you guys as soon as I have a direction from Neutron side...
14:54:01 I posted to the ML last week about image cache. Anybody get a chance to look at that?
14:54:15 I think markus_z was interested.
14:54:46 Basically, we've got some stored technical debt in image cache which is going to require a layout migration at some point when it's fixed.
14:54:57 storage pools will also require a layout migration.
14:55:34 I suspect we don't want to try to fix image cache in this cycle, but I'd prefer to make this decision eyes open.
14:55:45 i.e. it will require 2 layout migrations.
14:56:24 I want to ask why - but I suspect I should re-read the ML
14:56:28 the layout migration you refer to is disk.info?
14:56:52 paul-carlton2: Short answer: image cache stores data in the wrong place.
14:57:24 Essentially it's broken in design as well as implementation.
14:57:54 paul-carlton2: Nope, the _base directory needs to go away
14:58:44 We only have 1 minute left
14:59:05 I would like to start discussing this - but we can't really
14:59:08 Discussion on the list is fine by my
14:59:09 me
14:59:13 how about we read and answer
14:59:21 +1 thanks
14:59:29 and we can talk again next week if needed
14:59:41 got to cut off now I'm afraid
14:59:46 thanks all for coming
14:59:53 #endmeeting
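
Note on the post-copy exchange (14:23:43 to 14:24:48): the suggestion there was that nova could check whether libvirt has the VIR_MIGRATE_POSTCOPY flag and "downgrade" to pre-copy when it does not. A minimal sketch of that idea follows. It is an illustration only, not code from nova or from any spec discussed in the meeting. The function name pick_migration_mode and the want_post_copy parameter are hypothetical, and the SimpleNamespace objects are stand-ins for the real libvirt Python module so the sketch runs without libvirt installed. It also assumes that probing the binding for the constant is a good enough test; that only shows the binding knows the flag, not that the hypervisor on a given host can actually do post-copy.

```python
from types import SimpleNamespace


def pick_migration_mode(libvirt_module, want_post_copy):
    """Return 'post-copy' only if it was requested and the flag is defined.

    Hypothetical helper for illustration; not part of nova or libvirt.
    """
    # Older libvirt Python bindings do not define VIR_MIGRATE_POSTCOPY,
    # so probe for the attribute rather than assuming it exists.
    supported = hasattr(libvirt_module, "VIR_MIGRATE_POSTCOPY")
    if want_post_copy and supported:
        return "post-copy"
    # "Downgrade" to pre-copy when post-copy is unavailable or not requested.
    return "pre-copy"


if __name__ == "__main__":
    # Stand-ins for the real `libvirt` module (assumption for this sketch).
    # The flag value here is a placeholder, not the real constant.
    new_binding = SimpleNamespace(VIR_MIGRATE_POSTCOPY=1)
    old_binding = SimpleNamespace()
    print(pick_migration_mode(new_binding, want_post_copy=True))
    print(pick_migration_mode(old_binding, want_post_copy=True))
```

With a real deployment the first argument would be the imported libvirt module; whether the request comes from an API flag, nova.conf, or an importance level is exactly the question the meeting deferred to the summit.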