14:00:31 #startmeeting Nova Live Migration
14:00:32 Meeting started Tue Feb 2 14:00:31 2016 UTC and is due to finish in 60 minutes. The chair is PaulMurray. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:00:33 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:00:36 The meeting name has been set to 'nova_live_migration'
14:00:40 hello
14:00:45 hi
14:00:48 o/
14:00:48 o/
14:00:53 o/
14:00:56 o/
14:00:57 o/
14:00:57 hi
14:01:25 Hi all
14:01:54 Agenda here: https://wiki.openstack.org/wiki/Meetings/NovaLiveMigration
14:02:22 diving straight in....
14:02:28 #topic Priority reviews
14:03:02 I noticed yesterday that we have not completed any of our specs yet
14:03:14 and I wanted to do an update message after the mid cycle
14:03:15 that's true
14:03:18 panic mode ON
14:03:20 :)
14:03:47 We don't have any cores in the team.
14:03:49 so I would like to go through the ones on the review etherpad quickly and understand what has to be done to complete them
14:04:09 and I will see what I can do to get core attention
14:04:31 #info we have 4 weeks to feature freeze
14:04:42 +1, let's go through
14:05:02 #link reviews here https://etherpad.openstack.org/p/mitaka-nova-priorities-tracking
14:05:27 what is left for Enable block live migration with attached volumes
14:06:08 pkoniszewski, I think you are doing those?
14:06:43 so I proposed a new solution according to Daniel's review, can't really do anything more there
14:07:25 do you need danpb to review?
14:07:50 question is whether my solution covers all scenarios with shared devices, at the midcycle Daniel said that it should work, we need cores' eyes on it as they are the most knowledgeable ppl there
14:09:08 making a note on pad
14:09:55 I'll see if we can get him and others
14:10:05 I think a lot of this is going to be about getting reviews
14:10:12 next one is Split network
14:10:30 I reviewed that one, and have some comments on it.
14:10:34 AFAIK split network got some good quality reviews in the last 2 days
14:10:39 it requires some more work
14:10:46 alex_xu -1 on it for some UT stuff.
14:11:42 I saw the object change, I should have seen that - I've done a lot of objects.....
14:11:53 I have a concern on it. for the validation of live_migration_connect_addr which is configured by admin/operator, can we trust it and use it without any verification?
14:12:14 eliqiao: we don't validate config options
14:12:35 that is not friendly.
14:12:50 eliqiao, if someone has access to conf...
14:13:06 eliqiao, what validation are you thinking of - check format?
14:13:37 if ip address, check if it is valid, if host name, check if it is valid. and moreover, check if that hostname/ip can be pinged.
14:13:39 PaulMurray, i think it's about security?
14:13:57 currently, this patch has no validation on it.
14:14:16 * alex_xu thinks the split one is pretty close, he will +2 it very soon after update
14:14:20 so if an operator configures it by mistake (typo) or something like that, LM will fail
14:14:45 I wouldn't worry about pinging it - that can fail anyway
14:14:51 eliqiao: pinging is not an option imo as you might play with hostnames at any point
14:15:03 s/point/time
14:15:08 other places with ip addresses may check format, but not if it actually works
14:15:22 you find that out when you try to use it
14:15:26 +1
14:15:29 no need for this kind of validation, imo
14:16:03 * davidgiluk can imagine checking the address is on an internal rather than an external network
14:16:16 Okay, then please make a prominent note in the new feature release note: this option should be taken care of by operators themselves.
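The format-only check raised in this discussion (validate the IP or hostname syntax, skip the ping) could look roughly like the sketch below. It is an illustration only, not code from the patch under review: `looks_like_address` is a hypothetical helper name, and the rule that a hostname's last label must not be all digits is an assumption added so that a typo such as 10.0.0.999 is rejected rather than accepted as a hostname.

```python
import ipaddress
import re

# Hypothetical helper, not part of Nova. It checks syntax only and never
# tries to resolve or ping the address.
_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def looks_like_address(value: str) -> bool:
    """Return True if value is a well-formed IP address or hostname."""
    try:
        ipaddress.ip_address(value)  # accepts IPv4 and IPv6 literals
        return True
    except ValueError:
        pass  # not an IP literal, fall through to the hostname check

    if not value or len(value) > 253:
        return False
    labels = value.rstrip(".").split(".")
    # Assumption: an all-numeric last label is treated as a mistyped IP.
    if labels[-1].isdigit():
        return False
    return all(_LABEL.fullmatch(label) for label in labels)
```

A check like this would catch a typo at startup, while reachability would still only be discovered when a migration actually uses the address.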
14:16:19 eliqiao: i wouldn't worry about validation at all, let's say a bit about ip format, but not that much
14:16:52 eliqiao: remember that it will fall back to the old method if the IP is invalid so it will still work
14:17:30 It's worth a log message if it doesn't work
14:17:36 comments on patch if anything else
14:17:52 next is Pause VM during live migration
14:17:55 pkoniszewski: I don't find any logic that says it will fall back to the old method.
14:17:58 (skipping the bug for now)
14:18:12 Okay, let's talk about it offline.
14:18:31 eliqiao: might be that Zhenyu only proposed that but hasn't added it yet
14:19:14 pkoniszewski: hmm.. don't think so, I proposed the validation, he kind of agreed, but no update about it.
14:19:20 so, forcing LM is up for review, alex_xu made some good reviews from the API side, IBM folks are also ok with my change after I reorganized layers a bit
14:19:55 change looks like it is pretty big, but it is because of docs and microversions
14:20:19 I saw you rebased yesterday - was there any other change with that patchset?
14:20:22 yeah, but first we should merge https://review.openstack.org/#/c/257270 too, I can help on API layer too.
14:20:50 no, only a merge conflict... because I need to bump both microversions, RPC and API, I'm getting a lot of merge conflicts there
14:21:00 that's why I'm rebasing it very often
14:21:25 yeah, because dan's migration_data patch got merged, we need to bump RPC version.
14:21:27 yes - there are 47 patches in the same code at the moment
14:22:03 I'm not changing the code itself
14:22:35 PaulMurray: I think we can move this patch up to the 'Nova core review' section, right?
14:22:57 as eliqiao said, we should have some core approving the change which introduces the new DB API method
14:23:13 +1
14:23:14 eliqiao, I think so, they don't want more than a couple of patches up there though
14:23:27 we have three changes that require this new DB API...
14:23:48 the biggest problem for LM team is 'we lack cores'
14:24:07 eliqiao, noted
14:24:17 eliqiao: yes but LM is high priority so hopefully we will have full attention from cores
14:25:00 can we look at report live migration progress next
14:25:38 anyone know about these?
14:25:47 not so much movement in last couple of weeks
14:25:57 yeah, there is a question about update interval.
14:26:29 PaulMurray made a comment that 2s is reasonable, need to ping shaohe_feng to update patch.
14:26:42 I had a chat with rackspace guys - HP and rax think 2 seconds is ok
14:26:58 it could be made a config, but we weren't worried
14:27:04 PaulMurray, got concerns about writing too frequently to db on this
14:27:04 well one proposition from me
14:27:36 oh, ignore me, it's already 2s.. :(
14:27:54 I think that this patch to implement index/show on migrations is actually more important than writing migration progress to the DB
14:28:05 I mean this one https://review.openstack.org/#/c/258771/
14:28:10 tdurakov, even if you are doing 100 migrations concurrently this is still a tiny query rate given overall system load on db
14:28:24 it is required by cancelling and forcing LM
14:28:36 paul-carlton2, there is also rpc for this
14:28:39 there's imo no reason to write progress to the DB if we don't have a way to read it...
14:29:04 true, but rabbitmq can handle much higher rates than db
14:29:54 pkoniszewski, we do have a way to read it, the GET on server//migrations
14:30:05 will it report migration progress?
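As rough arithmetic behind the "tiny query rate" remark: assuming one DB write per in-progress migration per update interval (an assumption; the log does not say how the writes are issued), 100 concurrent migrations at a 2-second interval come to about 50 writes per second. A minimal sketch, with a hypothetical function name:

```python
def progress_writes_per_second(concurrent_migrations: int,
                               interval_seconds: float) -> float:
    """Estimate the DB write rate from progress reporting.

    Assumes exactly one write per migration per interval; this is an
    illustrative model, not a measurement of Nova's behaviour.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return concurrent_migrations / interval_seconds


# The figures mentioned in the meeting: 100 migrations, 2-second interval.
rate = progress_writes_per_second(100, 2.0)
```

Doubling the interval halves the estimated rate, which is the trade-off behind the suggestion that the interval could be made a config option.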
14:30:13 pkoniszewski, paul-carlton2 PaulMurray, as decided, compute will have state by itself
14:30:23 thought that it is just a list of migrations
14:30:33 and progress will be excluded from this list
14:30:49 it will be easier to make things clear in that area
14:31:01 there is a spec for getting this info, I'm sure
14:31:07 paul-carlton2: https://review.openstack.org/#/c/258797/11/nova/api/openstack/compute/migrations.py
14:31:47 also GET on specific migration will return details
14:31:47 according to this patch list will not include progress details
14:31:55 but it's not implemented yet
14:32:03 and it is, let's be honest, in bad condition
14:32:30 so we shouldn't focus that much on writing the progress to the DB because we can't read it yet - https://review.openstack.org/#/c/258771/
14:33:19 It sounds like these patches need some work to get them straight
14:33:28 this is crazy, why would we suppress this info if we are collecting it for user info
14:33:37 How about a question on ML regarding update rate
14:33:43 to settle it
14:34:02 We can go back over the specs and leave discussion on the patch
14:34:04 paul-carlton2: we can talk offline, this is reasonable because current call returns only a list of migrations, there is no GET yet
14:34:23 can someone help to summarize it?
14:34:32 What is the point of collecting info if you can't see it
14:34:40 pkoniszewski: seems it's related to at least 2 specs?
14:34:43 paul-carlton2: that's my point
14:35:14 the GET server//migration/ call should display details, if not that is wrong
14:35:57 let's agree that there is more work to be done in reporting and getting progress and let's move forward
14:36:06 +1
14:36:33 agreed, list should be summary info, GET ... should display details
14:36:49 Those are the main priority specs that can make progress this cycle
14:36:54 and andrearosa is about
14:37:03 to add the cancel based on the pause
14:37:09 correct
14:37:14 It would be good to get those 5 through
14:37:34 that's great, if you can add me to review i will take a look at this
14:37:45 pause, cancel, progress, block migration with volumes and split network
14:37:54 pkoniszewski: sure thing
14:38:27 So I will see what I can do to get some reviews from cores on the ones that are near and we can all try to push these through
14:38:31 I think that we should distinguish between progress and show/index on migrations
14:38:31 andrearosa: please add me too. thx.
14:38:47 ack
14:39:01 show/index on migrations is more important than progress
14:39:24 There is also Making the live-migration API friendly
14:39:33 anyone know what is left on that?
14:39:35 but is the original author working on show/index?
14:39:40 it seems pretty much stuck to me
14:39:45 yeah, I am owner.
14:39:51 cool
14:40:09 not many reviews on it.
14:40:47 I rebased today because dan's migrate_data patch got merged (bump RPC version)
14:40:57 There is a request for release notes on https://review.openstack.org/#/c/259319/
14:41:00 eliqiao, this one is in merge conflict https://review.openstack.org/#/c/259319/
14:41:48 PaulMurray: tdurakov yeah, I noticed that, and it will always be in merge conflict so I haven't updated it for a long time.
14:42:14 eliqiao, does it need the release notes ?
14:42:16 But can we do the review on the dependency patch first (virt layer and compute api layer)
14:42:27 eliqiao, sure
14:42:46 PaulMurray: it changes API behaviors, it's better to have a release note.
14:42:46 the one that returns data from task.execute is almost fine for me
14:43:07 eliqiao, will leave comments
14:43:39 also got a question for libvirt side
14:43:51 tdurakov: thanks, I will update tomorrow, hope it can be done before next week since I will be on vacation for Chinese New Year in the next 1 or 2 weeks
14:44:21 tdurakov: please?
14:45:00 eliqiao, can someone take over for your vacation?
14:45:24 yeah, I can ask alex_xu for help.
14:45:28 PaulMurray: ^^
14:45:43 can you ask him to approve some as well ?
14:45:59 he is a core :)
14:46:09 it might be better if we do the work and he reviews
14:46:16 PaulMurray: good idea :), he promised me that he will do the review next week :)
14:46:48 PaulMurray: yeah. I will try to talk to him before I am on vacation.
14:47:06 I want to do an action now
14:47:21 someone needs to do the email for ML about report live migration progress
14:47:33 who will do that?
14:47:54 PaulMurray: I can do that
14:48:01 thanks
14:48:23 +1 for pkoniszewski , I will follow. Thanks!
14:48:24 #action pkoniszewski to write email for dev ML about time interval for writing to DB in report live migration progress
14:48:56 Lastly there are a couple of patches hanging around for Series to deprecate migration flags config:
14:49:36 PaulMurray: today sdague approved one and -1 another
14:50:10 pkoniszewski: yeh, it's mostly for a missing test case
14:50:25 it should be a quick turnaround on the 2 things I -1ed today
14:50:45 exactly, so this series is almost in
14:50:52 sdague, do you know if mark is around (can't remember nick)
14:51:06 markmcclain:
14:51:10 should be.
14:51:13 PaulMurray: it's markmc, and I don't know that he's been hanging around much
14:51:18 eliqiao: no, that's the other mark
14:51:30 sdague: got it, thx.
14:51:33 markmc is in #nova right now
14:51:42 I think it would also be fine for someone else to update with the fixes
14:51:54 sdague, well, I'll give it a few days and see if I can find him if he doesn't see it
14:52:05 well, I'd suggest turning it around while it's fresh
14:52:21 especially given that they are pretty straightforward
14:52:29 and mark is a pretty busy person
14:52:44 yes, can do - can someone do that? (I'm pretty busy right now too)
14:52:54 any offers to fix the patch up?
14:52:57 i've got to go, thanks everyone, bye!
14:53:19 markmc just replied to sdague on the patch one hour ago.
14:53:24 eliqiao: cool
14:53:42 which patch?
14:53:52 #link https://review.openstack.org/#/c/263434/9/nova/tests/unit/virt/libvirt/test_driver.py
14:54:01 oh yes, I see it
14:54:05 he got a test case proposed on Ic480be79772e2721e0f57f17ccc9d893e92af4c9
14:54:13 let's leave it with him then
14:54:43 We are nearly on the hour now
14:54:50 #topic Bugs
14:55:13 we have not done much in the way of bug triage that I am aware of
14:55:20 has anyone been looking at them?
14:55:35 I'm happy with markmc's comments as soon as the oslo.config test results come back
14:55:59 sdague, cool - thanks
14:56:59 mikal was going to do something for bugs but ended up too busy
14:57:08 I think it's too late to discuss here now
14:57:09 PaulMurray: I promised to help, but I didn't get any response from mikal.
14:57:32 PaulMurray: if you could come up with alternative wording for https://review.openstack.org/#/c/263436 - I'll +2 it, but I agree the deprecation warning is really hard to get your head around.
14:57:53 sdague, I'll take a look right after this
14:57:58 cool
14:58:17 eliqiao, let's go offline with the bugs topic
14:58:29 PaulMurray: Okay.
14:58:42 no time for other discussion, so sorry if you have any
14:58:57 remember that you can add things to the agenda if you want to make sure you get time
14:59:05 thanks all for coming
14:59:13 PaulMurray: Any chance of a higher-bandwidth chat about storage pools some time soon with whoever's interested?
14:59:19 Unfortunately I couldn't make the midcycle.
14:59:23 mdbooth, you back in uk now?
14:59:25 #endmeeting