12:59:56 #startmeeting hyper-v
12:59:57 Meeting started Wed Jul 6 12:59:56 2016 UTC and is due to finish in 60 minutes. The chair is claudiub. Information about MeetBot at http://wiki.debian.org/MeetBot.
12:59:58 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
13:00:00 The meeting name has been set to 'hyper_v'
13:00:03 hi all
13:00:06 hello
13:00:08 hi
13:00:11 hi
13:00:12 Hi
13:00:38 anyone else joining today?
13:00:54 \o
13:01:21 not sure.. today is a holiday in india...
13:01:27 oh, I see.
13:01:28 we can start
13:01:42 #topic os-brick patches status
13:02:00 sooo... the iSCSI connector patch is merged
13:02:23 good.... so that leaves only the nova patch ?
13:02:25 the fibre channel and smb ones are still pending. they got a review from hemna, to include a release note
13:02:35 ok
13:03:04 those 2 would also get merged soon ? what are the chances
13:03:12 so, other than that, I hope they get in soon.
13:03:24 but it's not really up to me. :)
13:03:31 will ping hemna again later today,
13:03:46 ok
13:04:18 #topic networking-hyperv patches status
13:04:24 one point
13:04:37 before we go to the next topic
13:05:01 we started testing this https://review.openstack.org/#/c/301233/
13:05:12 works well at least for nova operations... nice work
13:05:16 now my question
13:05:27 since the topic is on cinder
13:05:41 for us to test cinder attach in mitaka
13:05:49 with cluster driver
13:05:52 #undo
13:05:52 Removing item from minutes:
13:06:12 are all patches that are required already merged in mitaka ?
13:06:35 the cluster driver atm works only with the cinder smb driver
13:06:40 sagar_nikam: hm, not really sure what you mean. what other patches?
13:07:22 claudiub: i meant ... any patches required in nova or os-brick
13:07:41 that are required for testing cluster driver for attaching volumes
13:08:06 atuvenie: only smb volumes ? what about iscsi... we use iscsi only
13:08:13 hm, as far as I know, there aren't any other patches, other than that.
13:08:34 sagar_nikam: yes, atm only smb
13:08:48 sagar_nikam: we are looking into iscsi as well, but it's a little more complicated
13:09:10 atuvenie: oh... we don't have smb as of now... nor planned in the near future
13:09:18 we only have iscsi
13:09:46 sagar_nikam: since we can't predict where a machine will fail over to, it's complicated to ensure that the target is logged in on that particular host
13:10:03 but... when i last discussed this topic many months back.. alexpilotti: mentioned that iscsi, fc and smb will all work and will be supported
13:10:39 atuvenie: why don't we present the iscsi volumes to all hosts in the cluster
13:10:51 ?
13:10:53 Hi guys
13:11:01 as part of attaching volumes
13:11:08 sagar_nikam: That is one option we thought about, but it can get complicated pretty fast
13:11:16 lpetrut: hi.... nice time to join... the cinder expert is here
13:11:36 what issues ?
13:11:48 sagar_nikam: another thing we were considering is building a custom provider for the cluster service in hyperv that does this
13:14:01 sagar_nikam: I think it's a little bit overkill to have all the targets exposed to every host in the cluster
13:14:52 only presented... not attached
13:15:10 as and when the vm moves to a new host in a cluster due to host failure
13:15:31 mscluster will automatically attach the volume again
13:16:29 well, for iSCSI/FC we do passthru, so for this to happen, the volume would have to be attached to all of the hosts
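
A minimal sketch of the approach being discussed: presenting the volume to every cluster node would roughly mean logging the iSCSI target in on each host through os-brick's initiator connector. The connection_properties dict is assumed to come from Cinder's initialize_connection(); the helper below is illustrative only, not code from the Hyper-V driver.

    # Sketch only: log an iSCSI target in on the local host so the LUN is
    # visible there before any failover. Not taken from the Hyper-V driver.
    from os_brick.initiator import connector

    def present_volume_locally(connection_properties):
        # os-brick returns the appropriate iSCSI connector implementation
        # behind this factory; no root helper is needed on Windows.
        conn = connector.InitiatorConnector.factory('ISCSI', root_helper=None)
        # Logs in to the target and returns device info such as {'path': ...}.
        # The client-side path / disk number can differ from host to host for
        # the same shared LUN, which is the problem raised below.
        return conn.connect_volume(connection_properties)
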
13:16:44 it is an iscsi connection... should be fine i think
13:16:59 lpetrut: yes that is the suggestion
13:17:15 presented to all the hosts
13:17:57 there is another reason why simply exposing the disk to all the hosts of the cluster wouldn't solve the issue. Hyper-V identifies the disks attached to instances by the drive number, which may differ among hosts for the same shared volume
13:18:56 so, as atuvenie mentioned, the only feasible way of doing this would involve a custom cluster service provider
13:19:18 lpetrut: let me get back to you on this... my teammate bharath has done some checks on this
13:19:41 what is the custom cluster service provider ?
13:20:57 the idea would be to have our own hooks, controlling what happens when an instance is failed over, before being started
13:21:24 still, there may be another approach
13:22:07 sagar_nikam: basically a custom provider that would do exactly what the hyper-v one does, just that it also deals with the volumes when the machine is failed over to the new host
13:22:20 if an instance having iSCSI/FC attached is failed over, it will enter an error state after being moved, as the disks are not accessible. We could just take this error-state instance, attach all the disks at this stage, and then power it back on
13:23:10 lpetrut: yeah, but that defeats the purpose of the cluster. The way it works, we have minimum downtime at a failover
13:23:37 that may just add a few seconds, but I think that's reasonable
13:23:45 atuvenie: it would still be pretty minimal, imo. you'd have a few seconds of downtime.
13:24:14 sagar_nikam: what's your view on this?
13:24:35 i thought the connection_info we get from cinder
13:24:44 will have the disk number
13:24:57 lpetrut, claudiub: I don't know, this should be investigated, but I feel the downtime may be more than that.
13:25:07 and all the details which can help us to attach the volumes
13:25:40 also presenting the volumes as a clustered disk.. means it gets presented to all hosts
13:26:20 that's the target-side LUN, but hyper-v cares about the client-side disk number
13:26:21 lpetrut, claudiub: also, we are dealing with a cluster resource, not a virtual machine, so when it gets into an error state it may not be recoverable like a normal vm. Again, this should be investigated
13:26:53 atuvenie: yep, I agree, just saying that this is one possible approach
13:26:53 the cluster driver not having iscsi/fc support is a major drawback
13:27:07 atuvenie: from my experience, you could simply reattach the disks and restart it.
13:27:13 most cloud instances will have volumes
13:28:09 we need to find a way to solve this issue
13:28:17 sagar_nikam: sure, that's why we thought that for the beginning, we'd support only SMB-backed volumes, while investigating a proper solution for iSCSI/FC volumes
13:28:35 lpetrut: ok
13:28:37 claudiub: restarting it is a no-no in my opinion. If it's in a saved state, then yes, it may be ok to start it again, but restarting it is really a no-no.
13:29:16 claudiub: you mean restart the instance after failover
13:29:26 ?
13:29:47 but ... failover should be transparent to the user
13:29:53 ^ exactly
13:30:02 atuvenie: i don't remember exactly, but if the vm couldn't start after failover, it was ending up in a stopped state
13:30:16 restarting may not be fine for tenant users
13:30:29 claudiub: then that is not a solution.
13:30:31 claudiub: how is it done in nova.. other drivers ?
13:30:39 libvirt driver
13:30:49 does it restart the instances ?
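
Before the conversation moves on, a hypothetical sketch of lpetrut's "reattach and power back on" suggestion above; the handle_failover name and the volumeops/vmops helpers are placeholders, not the actual compute-hyperv API.

    # Hypothetical sketch of the "reattach and power back on" idea; the helper
    # objects and method names are placeholders, not compute-hyperv code.
    def handle_failover(instance, block_device_info, volumeops, vmops):
        # Log the iSCSI/FC targets in and attach the disks on the new owner host.
        for bdm in block_device_info.get('block_device_mapping', []):
            volumeops.attach_volume(bdm['connection_info'], instance.name)
        # The VM ended up stopped because its disks were missing, so it has to
        # be started again -- the few seconds of extra downtime debated above.
        vmops.power_on(instance)
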
13:30:51 I think that the only way in which a reboot may be avoided is using Hyper-V replicas (which we may have to consider)
13:31:07 sagar_nikam: as far as i know, there's only vmware that has a cluster driver.
13:31:15 sagar_nikam: how does libvirt do what?
13:31:37 i have worked on the vmware driver... the volume is a vm disk there
13:32:09 sagar_nikam: are you talking about a vmware cluster driver?
13:32:15 claudiub: atuvenie: i meant migration triggered from the nova api
13:32:45 atuvenie: the vmware driver only supports vcenter clusters
13:32:55 sagar_nikam: I see
13:33:14 when migration is triggered from the nova api... does it end up rebooting the vm ?
13:33:17 sagar_nikam: you mean cold / live migration triggered from nova api? the volumes are mounted on both source and destination nodes before starting the migration.
13:33:51 sagar_nikam: claudiub: but again, in that case we know the destination beforehand
13:33:54 but in the case of failover, the mscluster chooses the destination node.
13:33:57 yes... live and cold migration
13:34:31 since we are waiting for cluster events to identify that the vm has moved
13:34:41 yep
13:34:48 we use those events and update the nova db
13:34:52 with the new hostname
13:35:12 can't we do something similar here... present that volume to the moved host
13:35:23 on getting the event from mscluster
13:35:46 you get the event that the instance was moved, but meanwhile it gets into an error state as it does not have the disks
13:35:52 yes, but by the time we get the event from mscluster, the machine is on the new host and started, which without the volumes means it will be in an error state
13:36:31 sagar_nikam: yeah, we basically get the event that the failover already happened, not that it will happen.
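
A rough sketch of the event flow just described, assuming the standard root/MSCluster WMI classes and the Python wmi module; the function name and the on_failover callback are hypothetical and this is not the actual driver code. It only illustrates that the modification event arrives after the VM resource already has a new OwnerNode.

    # Rough sketch, not driver code: watch MSCluster resource modifications and
    # report which node a VM resource now belongs to. The on_failover callback
    # (e.g. update the instance host, deal with its volumes) is hypothetical.
    import wmi

    def watch_failovers(on_failover):
        conn = wmi.WMI(moniker='//./root/MSCluster')
        watcher = conn.MSCluster_Resource.watch_for(
            notification_type='Modification', Type='Virtual Machine')
        while True:
            resource = watcher()
            # The event arrives only after the failover happened, so the VM may
            # already be starting on resource.OwnerNode without its disks.
            on_failover(resource.Name, resource.OwnerNode)
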
13:36:50 ok.. got it... then the only possible solution.. we need to present the lun to all hosts... and find a way to get the correct disk number
13:37:49 sagar_nikam: I'm not a fan of that solution, we considered it
13:38:23 atuvenie: ok .. but we need some way to support iscsi and fc volumes with the cluster driver
13:40:58 yeah, there has to be one anyways.
13:41:52 speaking of hyper-v cluster..
13:42:05 #topic hyper-v cluster status
13:42:26 well, it seems that nova-core has some more concerns about this feature and how it works
13:42:37 can the patch be moved up in priority for core reviewers ?
13:42:43 oh ...ok
13:42:47 and potential race conditions
13:43:09 which is why they've decided for it to be a topic for the next nova midcycle meetup
13:43:14 which will happen in 2 weeks
13:43:20 ok
13:43:30 I will be attending, armed to the teeth with answers. :)
13:43:56 nice... as far as the code goes... it works as seen in our tests
13:44:19 so hopefully claudiub: has all the answers and it gets merged soon
13:44:28 yeah, but there might be a few race conditions that we didn't / couldn't take into account
13:44:39 like ...
13:44:53 can you let me know where it can fail ?
13:45:08 for example, if an instance is scheduled to start on host A, and it would fit, but a vm fails over to host A faster, the scheduled instance might not fit anymore.
13:45:51 ok
13:46:34 plus, we have to make sure that the resource claims are also moved to the new host, in order to ensure that the scheduler has a proper resource view of the nodes.
13:47:03 ok....
13:47:16 in the vmware driver
13:47:25 we don't hit such issues
13:47:38 since one compute -- one cluster
13:47:45 anyways, until then, we'll have to get this in: https://review.openstack.org/#/c/335114/
13:47:50 here we have n computes for n hosts in a cluster
13:48:40 it improves how the driver handles shared storage, which is a must for the cluster driver anyways.
13:48:51 ok
13:49:10 sagar_nikam: yeah, that's true, but on vmware there are other issues regarding this
13:49:21 claudiub: any thoughts on having a single nova-compute for a cluster ?
13:49:37 i meant mscluster
13:49:51 one main question... where do we run it
13:50:22 sagar_nikam: for example, on a vmware cluster, the scheduler might see that the cluster has 128 GB memory free, but that doesn't mean you can spawn an instance with 128 GB memory there
13:50:50 as that 128 GB free is spread across the whole cluster.
13:50:54 agree.... i have seen this issue in the vmware driver
13:51:16 not just memory, same problem for disk as well
13:51:17 sagar_nikam: the mscluster is a clustered service, meaning it will run in the cluster. anywhere / everywhere.
13:51:45 sagar_nikam: if at one point it is on host A and host A goes down, it restarts on host B.
13:51:57 sagar_nikam: same as any other clustered service / resource.
13:52:12 no ... i meant where do we run nova-compute
13:52:28 sagar_nikam: ah, gotcha. on each compute node.
13:52:33 if we go with the option of one nova-compute per mscluster
13:53:08 just got an idea... but too many issues... may not be a good option
13:53:13 sagar_nikam: there was a discussion about this before this spec was approved in the first place. and having a nova-compute on each node was the most favorable one
13:53:24 ok
13:53:55 moving on, as we are short on time.
13:54:04 #topic networking-hyperv patches status
13:54:14 #link https://review.openstack.org/#/c/332715/
13:54:18 ^ this merged on master
13:54:22 we have used most of this meeting's time on the cluster driver. nice discussion... we need to find a way to solve iscsi support
13:54:34 it fixes the issue when the security group is changed on a port.
13:54:39 backported it to mitaka.
13:54:50 ok nice
13:55:09 will let vinod know about it
13:55:23 5 mins only
13:55:24 during last week's meeting, vinod said that he couldn't replicate the bug on liberty.
13:55:34 ok
13:55:40 so, it seems it doesn't affect liberty.
13:55:57 #link https://review.openstack.org/#/c/328210/
13:55:57 we are slowly moving to mitaka
13:56:08 this has a few comments, but hyper-v ci is passing, which is good.
13:56:10 we will check this patch
13:56:25 sagar_nikam: cool, sounds good. :)
13:56:31 last topic.. before time ends
13:56:39 what's happening with freerdp
13:56:40 so, if the comments are addressed, +2 from my part.
13:56:44 #topic open discussion
13:56:53 we raised a defect
13:57:19 freerdp does not support a tls-enabled keystone endpoint
13:58:19 c64cosmin: ohai
13:58:29 hi
13:58:49 hi
13:58:59 how is the freerdp work going ?
13:59:18 any new solution for the issue we found ?
13:59:30 well, things are going well, just that I must delay that work for a bit
13:59:43 I've been assigned a new project
13:59:44 ok
13:59:54 but it seems that the problem you have is solvable
13:59:54 ok
14:00:01 the https keystone one, I suppose
14:00:44 almost end of time
14:00:49 yeah..
14:00:53 we can discuss offline
14:00:55 i'll have to end it here
14:00:56 thanks all
14:01:07 thanks folks for joining, see you next week!
14:01:09 #endmeeting