14:00:17 #startmeeting Nova Live Migration
14:00:18 Meeting started Tue Nov 8 14:00:17 2016 UTC and is due to finish in 60 minutes. The chair is tdurakov. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:19 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:22 The meeting name has been set to 'nova_live_migration'
14:00:37 hi everyone
14:01:22 * kashyap waves
14:01:40 kashyap: hi!
14:01:49 anyone else is here?
14:02:11 o/
14:02:18 tdurakov: Might want to make another reminder on -nova
14:02:36 did it 5 minutes ago
14:02:38 so
14:02:43 tdurakov: This is the internet, no many have attention spans beyond 10 seconds
14:02:46 s/no/not/
14:03:01 hi
14:03:26 so, let's start
14:03:58 #topic ci
14:04:42 https://review.openstack.org/#/c/379638 - it seems like the patch isn't merged yet
14:05:30 while it was resubmitted, will try to ping folks in openstack-qa after meeting
14:06:32 another one: https://review.openstack.org/#/c/389546/ - chain is ready, hope nova cores will review it soon
14:06:46 so we will have ceph backend tested
14:07:10 anything on ci topic?
14:08:13 tdurakov: Do we test post-copy in one of the experimental Gate CI jobs?
14:08:37 not sure
14:08:47 it's worth to check imo
14:09:17 o/
14:09:20 at least ci uses xenial
14:09:24 we don't
14:09:35 qemu on xenial does not support post copy AFAIK
14:10:31 Hm
14:10:43 yeah, it is 2.5 that has only experimental support (and no support in libvirt), we need qemu 2.6 to support it and higher version of libvirt
14:10:57 pkoniszewski: I think we need to get that "Test Nova w/ libvirt / QEMU from source" thing in the Gate "soon"
14:11:12 yeah, would be worth to have it
14:11:44 It came up at the Summit discussion, /me tries to pull up the notes
14:13:57 any blockers now to test post-copy?
14:14:33 https://etherpad.openstack.org/p/ocata-qa-devstack-plugin
14:15:32 no support in gate sounds like a blocker
14:15:38 tdurakov: Not that I know of. I wasn't following it closely
14:15:57 pkoniszewski: Yeah, apart from what pkoniszewski says :-)
14:16:26 another thing is that live migration start in pre-copy mode
14:16:40 even if post copy is allowed
14:17:02 I mean could we upgrade qemu during post hook
14:17:03 we would need to have deterministic way to slow down LM a bit so that we are sure that post-copy trigger will happen
14:17:22 hmm, this might make sense
14:18:02 post_hook is after tempest
14:18:21 local.sh run by devstack may be a choice to run stuff after stacking and before tempest
14:19:01 pkoniszewski: set the bandwidth really low normally works or better start something busy in the guest
14:19:41 VMs that we have in gate are very small with software virtualization
14:19:43 even if we upgrade qemu after tempest
14:20:07 first run
14:20:59 setting bandwidth to something low sounds better
14:21:50 pkoniszewski: let's try that way then
14:22:00 pkoniszewski: Although depending how you trigger postcopy (do you do it based on time or do you do it on number of iterations I can't remember)
14:22:27 based on percentage increase
14:22:49 try a low bandwidth setting, it's the best bet if you've got nothing to run in the guest
14:23:10 if percentage increase in a subsequent iteration is less than 10% then we trigger post copy
14:23:10 pkoniszewski: although if desperate something like a dd bs= in the guest can get you a bit of dirtying
14:24:40 who wants to try?
14:26:39 #action tdurakov to check possibilities for post-copy testing
14:27:05 about intel nfv ci
14:27:48 http://lists.openstack.org/pipermail/openstack-dev/2016-November/106949.html
14:28:04 re nfv tests upstream vs. tempest plugin
14:28:25 wznoinsk: thanks for tread started
14:28:44 so we have 2 options, let's wait for feedback from qa
14:28:54 as for me both options will work
14:29:30 any thoughts on that?
14:30:01 I'm only wondering would some companies be not willing using 'intel' tests repo :)
14:30:29 we could move them to openstack tree
14:30:42 meaning: governed mainly by intel + sfinucan from RH at this moment
14:31:09 tdurakov: They're in the OpenStack tree (big tent) alright
14:31:10 it's under opentsack namespace https://github.com/openstack/intel-nfv-ci-tests
14:31:30 wznoinsk: Maybe we could just rename to nfv-tests? GitHub lets us do that
14:31:47 sfinucan: ++
14:32:01 sfinucan, I was having similar thinking, I think what's unsure is the scope of that tho
14:32:37 sfinucan, I mean should be broaden what tests we'd like to see there, nfv, hw related too? (PCI, SRIOV) etc.
14:32:49 wznoinsk: https://help.github.com/articles/renaming-a-repository/ Can't speak for the official OpenStack Git repos though
14:32:59 I'd like to see the PCI tests included in there also, yes
14:33:16 sfinucan: what pci tests?
14:33:18 Basically anything that needs hardware and is NFV-related
14:33:45 tdurakov: Any custom tests the Intel PCI CI may be running at the moment
14:33:55 yeah, let's rename it then, and wait for feedback from qa
14:34:02 I don't know how that works though. wznoinsk?
14:34:29 tdurakov I will find a link to these PCI tests sfinucan is talking about in a moment,
14:34:32 (Getting Mellanox to open source their test suit would be great too, but that's another thing)
14:34:43 tdurakov: Sounds good
14:34:46 re renaming I think we should compile work items list first and then get them done
14:34:55 before renaming repo check that existing nfv job config is updated
14:35:26 wznoinsk: ++
14:35:45 no point renaming if we end up merging into the Tempest tree :)
14:35:50 tdurakov, here's a copy of PCI tests written and run by Intel PCI CI https://github.com/wznoinsk/hw-tests
14:36:14 yes, ML can help with defining the scope as well I hope
14:36:32 wznoinsk: got it, will ping qa after the meeting
14:36:40 let's wait for feedback first
14:36:46 I'll start an etherpad on nfv-tests then
14:36:51 ++
14:37:49 wznoinsk: please answer with the etherpad to that thread, so we could track this topic
14:37:51 #link https://etherpad.openstack.org/p/nfv-tests should have some in it on the next meeting
14:37:54 let's go next
14:38:05 #topic bugs
14:39:01 pkoniszewski: https://review.openstack.org/#/c/389687/ - so, have you discussed it with jaypipes?
14:39:20 raj_singh: I noticed you'd taken the postcopy network bug off Matt; if you need any help from the qemu/postcopy side let me know
14:39:44 not yet, patch that sahid mentioned has just been merged https://review.openstack.org/#/c/394808/
14:39:56 so i'll ask jay to take a look once again on this one
14:40:11 pkoniszewski: ok, do you need any help with this?
14:40:40 davidgiluk: Thx will do.
14:41:05 i don't think so, we have a lot of people there trying to get this stuff merged
14:42:12 there is another one, worth to highlight, https://review.openstack.org/#/c/338929/
14:43:15 need to ping Lee Yarwood
14:43:25 any other bugs?
14:43:52 #topic specs
14:44:14 pkoniszewski: do you have an update for sr-iov topic?
14:45:04 we are still stuck on https://review.openstack.org/#/c/389687/ and https://review.openstack.org/#/c/244489/
14:45:16 Nikola's change has one more race condition that we need to solve
14:45:39 oh, claims
14:45:47 will review it soon
14:45:55 added to list
14:47:40 other topics?
14:48:21 I will like to work on https://review.openstack.org/#/c/347161/. So reviews will help
14:49:45 raj_singh: what about Paul Carlton?
14:49:57 is he ok with that?
14:50:10 Already talked to him and he don't have bandwidth to work on it
14:50:19 raj_singh: acked
14:50:26 ok then
14:50:48 it looks like spec is almost aproved
14:51:10 yea almost :)
14:51:30 good, then you could start to work on implementation, imo
14:51:48 join
14:51:50 Paul had some code up for review
14:51:57 you might want to reuse at least part of it
14:52:02 raj_singh: ^
14:52:08 yes I am going to start on it soon
14:52:27 AFAIK it was working, but required some quality improvements
14:52:29 pkoniszewski: Ok I will take a look
14:53:07 * johnthetubaguy adds patches to TODO list
14:53:07 let's move on
14:53:16 #topic open discussion
14:55:08 johnthetubaguy: could you please take a look on patches that block sr-iov topic, see above
14:56:08 anything else to bring?
14:56:38 pkoniszewski: You might find dropping bandwidth doesn't help if you're triggering off % - unless the guest is dirtying pages the % wont rise just because migration is slow
14:57:03 davidgiluk: yeah, i'm still thinking of your idea about running dd on guest
14:58:36 davidgiluk: i tried to slow down live migrations in gate some time ago, but i hit the wall
14:59:35 need to finish meeting
14:59:44 thanks you for coming
14:59:49 #endmeeting
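
Note on the post-copy trigger mentioned at 14:23:10 ("if percentage increase in a subsequent iteration is less than 10% then we trigger post copy"): the rule can be sketched roughly as below. This is only an illustration of the rule as stated in the meeting, not Nova's actual code. The function name, its parameters, and the reading of "percentage increase" as the share of remaining data cleared between two memory-copy iterations are all assumptions.

```python
def should_trigger_post_copy(previous_remaining, current_remaining,
                             threshold_pct=10.0):
    """Hypothetical helper: decide whether to switch to post-copy.

    previous_remaining / current_remaining are the amounts of data still
    to be transferred at the end of two consecutive iterations (any unit,
    as long as both use the same one). The 10% default comes from the
    meeting; everything else here is an assumption.
    """
    if previous_remaining <= 0:
        # Nothing was left to copy, so there is no progress to measure.
        return False
    # Progress made in this iteration, as a percentage of what was left.
    progress_pct = (previous_remaining - current_remaining) * 100.0 / previous_remaining
    return progress_pct < threshold_pct


if __name__ == "__main__":
    print(should_trigger_post_copy(1000, 950))   # 5% progress
    print(should_trigger_post_copy(1000, 500))   # 50% progress
    print(should_trigger_post_copy(1000, 1100))  # guest dirties faster than copy
```

Under this reading, a guest that re-dirties memory about as fast as it is copied shows little or negative progress per iteration, which is the situation the `dd` suggestion in the guest is meant to create.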