17:00:12 <hartsocks> #startmeeting VMwareAPI
17:00:13 <openstack> Meeting started Wed Oct 23 17:00:12 2013 UTC and is due to finish in 60 minutes. The chair is hartsocks. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:17 <openstack> The meeting name has been set to 'vmwareapi'
17:00:25 <hartsocks> hi all
17:00:30 <hartsocks> I'm back.
17:00:35 <hartsocks> Did you miss me?
17:00:42 <hartsocks> Raise a hand if you did :-)
17:00:43 <tjones> I DID!!!
17:00:49 <hartsocks> *lol*
17:01:08 <tjones> :-D
17:01:48 <hartsocks> I'm still reading my old emails but I'm reading from newest to oldest so if there's something you need me to help on, please re-send… it'll get taken care of sooner (and perhaps twice even).
17:02:14 <hartsocks> One time I actually read all my emails and I didn't know what to do.
17:02:25 <hartsocks> Then the problem went away.
17:03:12 <hartsocks> Vui said he'd be online later so we might see him.
17:03:22 <hartsocks> Anybody else around?
17:03:55 <sarvind> sarvind; I'm here. First time for the IRC meeting.
17:04:08 <tjones> welcome sarvind
17:04:28 <tjones> i may start calling you that face to face ;-)
17:05:01 <hartsocks> *lol*
17:06:19 <hartsocks> Well, let's get rolling then...
17:06:22 <hartsocks> #topic bugs
17:06:59 <hartsocks> #link http://goo.gl/uD7VDR
17:07:44 <hartsocks> here's a query on launchpad that combs for open bugs… new, in progress, etc.
17:08:25 <hartsocks> We have a few that have popped up since last week.
17:08:35 <hartsocks> Looks like 5 or so.
17:09:08 <hartsocks> This one troubles me...
17:09:10 <smurugesan> This is sabari here. Hi all.
17:09:11 <hartsocks> #link https://bugs.launchpad.net/nova/+bug/1241350
17:09:13 <uvirtbot> Launchpad bug 1241350 in nova "VMware: Detaching a volume from an instance also deletes the volume's backing vmdk" [Undecided,In progress]
17:09:17 <hartsocks> hey sabari!
17:10:07 <danwent> hi folks, garyk says he is having technical issues with his irc client
17:10:14 <hartsocks> okay.
17:10:16 <tjones> looks like that is out for review
17:10:25 <danwent> (he let me know via skype, apparently a more reliable channel)
17:10:41 <sarvind> i had trouble with the irc client as well; switched to the webchat for now
17:11:04 <hartsocks> okay.
17:11:09 <garyk> hi
17:11:33 <hartsocks> My plan for today is just to hit bugs, then blueprints, then HK summit stuff.
17:11:42 <hartsocks> Hey gary.
17:11:57 <hartsocks> We were just talking about #link https://bugs.launchpad.net/nova/+bug/1241350
17:11:58 <uvirtbot> Launchpad bug 1241350 in nova "VMware: Detaching a volume from an instance also deletes the volume's backing vmdk" [Undecided,In progress]
17:12:26 <garyk> that is a critical issue - hopefully we can get it backported asap
17:12:26 <tjones> jay has it out for review
17:12:49 <garyk> there are some minor issues with test cases, but nothing blocking
17:13:08 <hartsocks> okay good.
17:13:27 <hartsocks> #link https://bugs.launchpad.net/nova/+bug/1243222
17:13:28 <uvirtbot> Launchpad bug 1243222 in nova "VMware: Detach leaves volume in inconsistent state" [Undecided,New]
17:14:02 <garyk> i am currently working on that
17:14:11 <hartsocks> Okay.
17:14:15 <garyk> it is when a snapshot takes place on the instance
17:14:53 <hartsocks> you've confirmed it then?
17:15:22 <garyk> confirmed the bug?
17:15:45 <hartsocks> yes. So it can be marked "confirmed" not "new" or something else?
17:16:09 <garyk> i'll mark it as confirmed
17:16:37 <hartsocks> groovy.
17:16:47 <danwent> garyk: ok, so that only happens when a snapshot is done?
17:17:05 <danwent> the title does not indicate that anywhere, making it sound much larger :)
17:17:16 <hartsocks> heh. good point.
17:17:33 <garyk> the issue is as follows: when we do a snapshot we read the disk from the hardware summary. problem is that due to the fact that there are 2 disks we do not read the right disk
17:17:47 <garyk> this causes us to snapshot the cinder disk instead of the nova disk.
17:17:50 <danwent> we seem to have a problem with that in general, giving bugs very general titles that make things seem very broken
17:18:12 <garyk> danwent: correct. we need to work on our descriptions
17:18:13 <hartsocks> I've edited the title to describe step 4 in the repro steps.
17:18:32 <danwent> ok, i'll send a note to the list just to remind people about this
17:18:33 <tjones> danwent: we want to keep you on your toes
17:18:49 <hartsocks> *lol* what's wrong with: "broken-ness all things happen bad"?
17:19:00 <danwent> tjones: or the hospital? i almost had a heart attack when i saw that
17:19:36 <hartsocks> I see things like that and I usually say, "yeah right"
17:19:37 <tjones> :-P
17:20:08 <tjones> garyk: did this break in tempest or do we need to add more tests?
17:20:15 <danwent> or wait, that's a different bug than I thought we were talking about
17:20:17 <danwent> one sec
17:20:40 <garyk> tjones: i do not think that this is covered in tempest - if it is, it does not really check the validity of the disks
17:21:00 <hartsocks> This is a good point...
17:21:04 <garyk> danwent: there are 2 bugs which are very closely related - they may even be the same
17:21:09 <danwent> https://bugs.launchpad.net/nova/+bug/1241350
17:21:11 <uvirtbot> Launchpad bug 1241350 in nova "VMware: Detaching a volume from an instance also deletes the volume's backing vmdk" [High,In progress]
17:21:26 <danwent> is the one i saw
17:21:27 <hartsocks> yeah.
17:21:37 <hartsocks> That was top of my list too.
17:22:03 <garyk> that bug already has a patch upstream. that is a blocker
17:22:16 <garyk> we have discussed that a few minutes ago
17:22:24 <danwent> i'm confused, does this mean any use of volumes is broken?
17:23:00 <garyk> danwent: something changed as this is a scenario that we have done a million times
17:23:19 <garyk> i am not sure if it is in cinder. nothing on our side was changed here (we also have this in a lab)
17:23:25 <danwent> yeah, that is my sense too.
17:23:36 <hartsocks> slow down guys.
17:23:52 <hartsocks> This looks like more bad naming confusion here.
17:24:04 <hartsocks> If I'm reading this line right...
17:24:09 <hartsocks> https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vm_util.py#L471
17:24:13 <garyk> i think that subbu and kartik were looking deeper at the status of the disk and that may have highlighted the problem
17:24:31 <hartsocks> assuming the bug reporter was linking correctly… this means the
17:24:38 <garyk> hartsocks: that is what the fix addresses
17:24:41 <garyk> there are two cases
17:24:49 <garyk> 1. a consolidated disk needs to be deleted
17:24:53 <hartsocks> nova volume-detach
17:24:56 <hartsocks> calls the delete volume code
17:24:57 <garyk> 2. a detachment does not need to be deleted
17:25:03 <danwent> one at a time please :)
17:25:06 <hartsocks> def delete_virtual_disk_spec(
17:25:14 <hartsocks> which is not the right thing to do.
17:25:23 <hartsocks> Since deleting is not detaching.
17:25:34 <hartsocks> So.
17:25:37 <hartsocks> I'm saying:
17:25:41 <hartsocks> delete is not detach.
17:25:43 <garyk> please look at https://review.openstack.org/#/c/52645/
17:26:17 <hartsocks> great.
17:26:23 <hartsocks> I was scared there for a second.
17:26:40 <danwent> ok, so can we come up a level and talk about impact on customer?
17:26:48 <hartsocks> So the impact.
17:26:53 <hartsocks> can only be on
17:27:04 <hartsocks> the nova created vmdk right?
17:27:13 <hartsocks> this can't be bleeding into cinder somehow?
17:27:24 <garyk> hartsocks: danwent: no, the problem is the cinder volume
17:27:47 <garyk> the 'detachment' 'deletes' the cinder volume.
17:28:05 <garyk> due to the fact that it is attached to an empty vm it will not be deleted but may turn into read only
17:28:31 <garyk> so the case is: we have instance X
17:28:35 <garyk> that uses volume Y
17:28:40 <garyk> and we write to Y
17:28:43 <garyk> then detach
17:29:07 <garyk> and attach to instance Z, then we can read what was written by X but may not be able to write again
17:29:27 <garyk> sorry for the piecemeal comments - the web client is hard and my irc client is broken
17:29:52 <danwent> well, the bug says that the re-attach actually fails
17:30:02 <danwent> not that it succeeds, but the volume is read-only
17:30:27 <hartsocks> in my book, that means we haven't really confirmed this bug.
17:30:51 <garyk> hartsocks: subbu and kartik have confirmed this and i have tested the patch
17:31:23 <danwent> garyk: confirmed what behavior? what is written in the bug (second attach fails) or what you mentioned (second attach works, but read-only)
17:31:50 <hartsocks> I have no doubt you've found *a* bug and fixed it.
17:32:31 <garyk> danwent: i think that they have confirmed what is written in the bug. i am not 100% sure, but I discussed this with them
17:33:31 <danwent> ok… well, i guess one thing that is different in the bug from what I personally have tested is that I've never tried to immediately re-attach a volume to the same VM, we always booted another VM and attached the volume to that vm
17:33:32 <garyk> danwent: my understanding, and i may be wrong, or confused, most likely the latter, is that the disk could become read only when we do something like a delete or a snapshot and it is owned by someone else
17:34:46 <garyk> danwent: that is the scenario that i always tested
17:35:37 <danwent> ok, i don't totally follow on the read-only part, I'm just trying to understand how pervasive the bug is, as the write-up makes it sound like any volume that is detached is deleted and can never be attached to another VM, which means the whole point of volumes is in question.
17:36:08 <danwent> but that seems to contradict what we've tested.
17:36:14 <garyk> danwent: i'll follow up with subbu and kartik and get all of the details so that we can paint a better picture
17:36:30 <danwent> ok, thanks, yeah, don't need to take up the whole meeting, but this does seem pretty important
17:36:39 <garyk> hartsocks: you can action item that for me
17:36:44 <danflorea> I agree. We need to know if we should say "don't snapshot when you have Cinder volumes attached" or "don't use our Cinder driver"
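For readers following the "delete is not detach" thread above: in the vSphere API, removing a disk from a VM is a reconfigure with a VirtualDeviceConfigSpec whose operation is "remove"; the backing VMDK is only destroyed if fileOperation is additionally set to "destroy". The sketch below contrasts the two specs. It assumes the suds-style client_factory pattern used in nova/virt/vmwareapi; the helper names are illustrative only and are not the actual patch under review at https://review.openstack.org/#/c/52645/.

    # Illustrative sketch only -- helper names are hypothetical, not the
    # patch under review. Assumes a suds-style client_factory as used
    # elsewhere in nova/virt/vmwareapi.

    def detach_disk_spec(client_factory, device):
        """Spec that removes the disk from the VM but keeps the backing
        VMDK on the datastore (detach is not delete)."""
        device_spec = client_factory.create('ns0:VirtualDeviceConfigSpec')
        device_spec.operation = "remove"
        # NOTE: no fileOperation here -- setting it would destroy the backing file.
        device_spec.device = device

        config_spec = client_factory.create('ns0:VirtualMachineConfigSpec')
        config_spec.deviceChange = [device_spec]
        return config_spec

    def delete_disk_spec(client_factory, device):
        """Spec that removes the disk AND destroys its backing VMDK --
        only appropriate for a disk nova itself owns (e.g. a consolidated
        local disk), never for an attached cinder volume."""
        device_spec = client_factory.create('ns0:VirtualDeviceConfigSpec')
        device_spec.operation = "remove"
        device_spec.fileOperation = "destroy"
        device_spec.device = device

        config_spec = client_factory.create('ns0:VirtualMachineConfigSpec')
        config_spec.deviceChange = [device_spec]
        return config_spec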
17:36:52 <garyk> yeah i concur it is very important
17:37:31 <hartsocks> #action garyk follow up on https://bugs.launchpad.net/nova/+bug/1241350 and narrow scope/descriptions
17:37:33 <uvirtbot> Launchpad bug 1241350 in nova "VMware: Detaching a volume from an instance also deletes the volume's backing vmdk" [High,In progress]
17:37:52 <hartsocks> Which brings me to...
17:37:55 <hartsocks> #link https://bugs.launchpad.net/nova/+bug/1243193
17:37:56 <uvirtbot> Launchpad bug 1243193 in nova "VMware: snapshot backs up wrong disk when instance is attached to volume" [Undecided,New]
17:38:13 <hartsocks> which seems related.
17:38:18 <garyk> hartsocks: i am currently debugging this
17:38:24 <hartsocks> (if only by subject matter)
17:38:31 <garyk> this is related to https://bugs.launchpad.net/nova/+bug/1243222
17:38:34 <uvirtbot> Launchpad bug 1243222 in nova "VMware: Detach after snapshot leaves volume in inconsistent state" [Undecided,Confirmed]
17:39:01 <hartsocks> yeah, glad you're on it.
17:39:11 <garyk> i need a stiff drink
17:39:46 <hartsocks> putting your name on the bug so I don't accidentally try to pick it up.
17:40:15 <hartsocks> Whoever is in HK should buy Gary a round.
17:40:19 <tjones> garyk: at least it's late enough for you to do just hat
17:40:29 <tjones> that
17:40:37 <garyk> :)
17:40:44 <hartsocks> a hat full of vodka.
17:40:47 <hartsocks> :-)
17:41:15 <tjones> :-D
17:41:24 <hartsocks> any other pressing things?
17:41:32 <hartsocks> (on the topic of bugs that is)
17:42:16 <hartsocks> anyone look at #link https://bugs.launchpad.net/nova/+bug/1240355
17:42:18 <uvirtbot> Launchpad bug 1240355 in nova "Broken pipe error when copying image from glance to vSphere" [Undecided,New]
17:43:19 <hartsocks> That seems like someone with a screwy setup more than anything.
17:43:26 <hartsocks> Okay.
17:43:35 <smurugesan> I think this is related to the bug Tracy is working on. let me pull it up
17:43:47 <garyk> i have seen that on a number of occasions. have never been able to debug it
17:44:15 <hartsocks> hmm… so maybe not just a screwy setup (I've never seen this)
17:44:21 <smurugesan> Could this be because the vmdk descriptor file exists but not the flat file?
17:44:41 <garyk> i actually think that it happens when the image is copied to the vc - i do not think that devstack uses ssl between nova and glance
17:45:14 <garyk> i see it once every few days using a vanilla devstack installation with the debian instance
17:45:50 <hartsocks> Really?!?
17:46:03 <tjones> odd, i have never seen it
17:46:07 <garyk> i am not sure if a packet is discarded or corrupted. but it is a tcp session so it should be retransmitted
17:46:13 <hartsocks> That error looks to me like a transient networking failure.
17:46:42 <hartsocks> Yes. TCP should cover retransmit of the occasional packet loss.
17:47:14 <garyk> my thinking is that the current connection is terminated and a new session with the vc is started. the file download is not restarted... but then again i have not been able to reproduce to be able to say for sure
17:47:45 <hartsocks> Hmmm…
17:48:24 <hartsocks> When we transfer to the datastores...
17:48:37 <hartsocks> are we using the HTTP "rest" like interfaces?
17:48:47 <hartsocks> I don't recall… I suppose we would have to.
17:49:09 <hartsocks> I recall that there is a problem with session time-outs between the two forms of connections.
17:49:19 <hartsocks> The vanilla HTTP connection used for large file transfer...
17:49:27 <hartsocks> and the SOAP connection have different sessions.
17:49:36 <hartsocks> One can time out and the other can still be active.
17:49:51 <hartsocks> This would tend to happen on long running large file transfers.
17:50:03 <hartsocks> Is that what you've seen Gary?
17:50:46 <garyk> i have just seen the exception. have not delved any deeper than that
17:51:08 <garyk> i'll try and run tcpdump and see if it reproduces. this may crystallize your theory
17:51:33 <danflorea> It would be good to know if this is isolated to one testbed or if we see it in multiple. If it's the latter, it's less likely that this is just a network/setup issue.
17:52:04 <danflorea> PayPal in particular has big image files and is sensitive to failures like this, so it's definitely worth investigating.
17:52:20 <garyk> both ryand and i have seen this. only thing in common is that we use the same cloud
17:52:26 <tjones> garyk: you said it was with the debian image?
17:53:03 <garyk> tjones: yes
17:53:33 <hartsocks> Does the transfer ever take more than 30 minutes?
17:53:36 <tjones> that's only 1G
17:54:30 <garyk> nope, it is usually a few seconds, maybe a minute at most
17:54:46 <hartsocks> Okay. That doesn't support my theory at all.
17:55:23 <hartsocks> Hmm… who should look at this?
17:55:44 <garyk> i am bigged donw in the disk and datastores
17:55:44 <tjones> If ryan can show me what he does i can take a look
17:55:52 <tjones> yeah garyk has enough ;-)
17:55:57 <garyk> bogged not bigged
17:55:58 <hartsocks> totally.
17:56:05 <hartsocks> why not both?
17:56:06 <tjones> bugged
17:56:18 <garyk> nah, not bugged.
17:56:53 <tjones> at least i can put some debugging in there so we can catch it more easily if i cannot repro
17:57:07 <hartsocks> okay.
17:57:08 <hartsocks> #action tjones to follow up on https://bugs.launchpad.net/nova/+bug/1240355
17:57:10 <uvirtbot> Launchpad bug 1240355 in nova "Broken pipe error when copying image from glance to vSphere" [Undecided,New]
17:57:27 <hartsocks> We spent most of the meeting on bugs.
17:57:47 <hartsocks> #topic open discussion
17:58:03 <hartsocks> Anything else pressing we need to talk about?
17:58:19 <danflorea> Just one request. Please review upstream Nova driver & Cinder driver docs :)
17:58:25 <tjones> :-D
17:58:43 <danflorea> Nova driver patch: https://review.openstack.org/#/c/51756/
17:59:05 <danflorea> Cinder driver doc is already merged. But send me comments if you have any and I can update that one too.
17:59:12 <hartsocks> #action everyone give some review love to upstream nova driver and cinder docs!
17:59:35 <hartsocks> So we're out of time.
17:59:41 <tjones> adios
17:59:41 <hartsocks> Thanks for the turnout today.
17:59:50 <danflorea> bye
18:00:08 <hartsocks> We're on #openstack-vmware just hangin' out if anyone needs to chat.
18:00:13 <hartsocks> #endmeeting
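A footnote on the broken-pipe discussion (bug 1240355): tjones offered to add debugging around the glance-to-datastore copy so the failure is easier to catch. The sketch below shows the kind of instrumentation that could help. It assumes a generic chunked read/write loop with file-like handles rather than nova's actual transfer plumbing; the function name and chunk size are hypothetical.

    import logging
    import time

    LOG = logging.getLogger(__name__)

    CHUNK_SIZE = 64 * 1024  # hypothetical chunk size


    def copy_with_progress(read_handle, write_handle, total_size):
        """Copy an image stream while logging progress, so a broken pipe
        can be correlated with how far the transfer got and how long it took."""
        transferred = 0
        start = time.time()
        try:
            while True:
                chunk = read_handle.read(CHUNK_SIZE)
                if not chunk:
                    break
                write_handle.write(chunk)
                transferred += len(chunk)
        except IOError:
            # Log how much of the image made it across before the pipe broke.
            LOG.exception("Transfer failed after %d/%d bytes in %.1fs",
                          transferred, total_size, time.time() - start)
            raise
        LOG.debug("Transferred %d/%d bytes in %.1fs",
                  transferred, total_size, time.time() - start)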