17:03:21 #startmeeting vmwareapi
17:03:21 Meeting started Wed Sep 11 17:03:21 2013 UTC and is due to finish in 60 minutes. The chair is hartsocks. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:03:22 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:03:25 The meeting name has been set to 'vmwareapi'
17:03:35 Greetings! Who's about?
17:03:46 hi
17:03:48 hi
17:03:50 yo
17:05:03 We're past feature freeze!
17:05:21 That hopefully means there's not much in the way of blueprints to talk about.
17:05:36 Does anyone have a blueprint we need to discuss?
17:06:17 hartsocks: i would like to mention a few things about the multi cluster
17:06:25 go for it.
17:06:51 it is a nice addition. i think that we need to do some extensive testing here.
17:07:06 agreed
17:07:29 it looks like there are a few places where the _vmops parameter is not set correctly. i am checking now.
17:07:45 can you guys please look at the vnc_console. this may be one of them
17:08:00 give me a sec and i'll paste the line
17:08:22 There's a general state keeping problem in the driver… something we may need to address in the future.
17:08:32 IE: don't keep state
17:08:47 https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L468
17:09:15 i think it is missing _vmops = self._get_vmops_for_compute_node(instance['node'])
17:09:35 but as i said we need to do additional testing
17:10:00 hmm: https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L422
17:10:18 but if the instance is on another cluster
17:10:31 I see… this is interesting… adding the multi-cluster breaks assumptions.
17:10:44 The assumption being _vmops is set at init.
17:10:51 This is definitely a problem.
17:11:08 i'll test and fix tomorrow. thanks for the input
17:11:11 quite a few more self._vmops…. too
17:11:24 woah… hold your horses there...
17:11:30 with the work on the migrations i have seen the issues there and am dealing with them
17:11:52 Lines 422 through 425 imply the driver object's state is set at init time.
17:12:39 I think this means you're going to have to switch something around.
17:12:41 that is for the first cluster but not for the additional ones (which are part of the _vmops dictionary)
17:13:02 which means some ops only work for the first.
17:13:07 cluster
17:13:11 correct
17:13:49 i think that this is good that it is on the radar. harlowja
17:13:58 ???
17:13:59 oh
17:14:05 I think we need to remove self._vmops and a few of these others.
17:14:40 harlowja: that was an auto completion. sorry
17:15:00 np
17:15:09 hartsocks: i think that at this stage of the game we should do a case by case fix
17:15:56 i am just happy that this is on everyone's radar now. i guess we can continue to discuss after the meeting
17:15:59 garyk: it will be easier to get a bug fix through case-by-case but the fact you have a self._vmops at all becomes dangerous.
17:16:10 agreed
17:16:31 Should we consider this a first refactor for IceHouse?
17:16:43 (spot fix the bugs naturally)
17:17:20 Something like bp/vmware_refactor_cluster_sensitive_objects
17:17:56 Then we can delete the self._vmops and other sensitive objects that might lead to a bug sneaking through.
17:18:40 The comment at 419 troubles me too… BTW.
17:18:46 let's hope our testing finds all of the issues so we will not need to do something like that
17:18:52 https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L419
17:19:26 garyk: okay, but that design feels fundamentally wrong.
17:19:31 agreed
17:20:35 I really don't want to leave that code in place since it will let bugs slip through.
17:20:43 If designed properly...
17:20:44 https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L468
17:20:55 should have thrown a NoneType exception
17:21:09 That actually troubles me more than anything else.
17:21:25 #action triage bug https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L468
17:21:36 garyk: you will report that bug? yes?
17:22:14 wow… I need to stop looking there...
17:22:19 hartsocks: sec
17:22:24 https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/driver.py#L458
17:22:44 that's the migration stuff that gary was talking about
17:22:50 hartsocks: i'd like to add the fix to https://review.openstack.org/#/c/43616/
17:23:16 i'll post and update soon
17:23:19 Okay. I did not understand what that was talking about.
17:23:38 vuil: yes, that is correct. i am busy working on that now
17:24:30 #undo
17:24:31 Removing item from minutes:
17:24:48 So, let me get this straight...
17:25:10 You are fixing all these other issues that I see on that page in this bug...
17:25:18 #link https://bugs.launchpad.net/nova/+bug/1216510
17:25:19 Launchpad bug 1216510 in nova "VMware: exception when accessing invalid nodename" [Medium,In progress]
17:26:51 hartsocks: that is where i think that we should have the console fix
17:27:07 console fix?
17:27:10 hartsocks: in addition to this i am looking into the resize issue
17:27:42 have i managed to confuse everyone including myself.
17:27:43 sorry
17:27:53 so… what are you doing in bug 1216510?
17:27:54 Launchpad bug 1216510 in nova "VMware: exception when accessing invalid nodename" [Medium,In progress] https://launchpad.net/bugs/1216510
17:28:19 in that bug fix i have done 2 things:
17:28:26 I understand we have separate issues: resize, console, etc. Are you saying these are all one bug?
17:28:43 1. made fixes to the data structure so that it does not need a redirection to get the _vmops
17:28:52 2. validated that the resource exists prior to accessing
17:29:00 the console will be treated here too
17:29:24 make sense?
17:29:47 I understand multiple patches under one bug… that I understand.
17:30:07 i have pushed the change. i guess that we can discuss it on gerrit unless you want to continue here
17:30:52 Well… I'll make it a higher priority than medium!
17:31:07 agreed. thanks!
17:31:30 I'm twiddling some bits on that bug… hold on...
17:31:54 okay… that's done...
17:32:00 #link https://bugs.launchpad.net/openstack-vmwareapi-team/+bug/1216510
17:32:03 Launchpad bug 1216510 in openstack-vmwareapi-team "VMware: exception when accessing invalid nodename" [Critical,In progress]
17:32:24 So, what I've done is link the bug back to a VMwareAPI-Team project.
17:32:30 sorry for taking up everyone's time
17:32:50 Better to talk it out than fly around in the dark!
17:33:07 no these are impt to get right.
17:33:09 i have retinitis pigmentosa so i am always in the dark :)
17:33:51 So that project is something Tracy, Dan, and I built because … notice this bug got bumped down to "Medium" … the Nova team guys didn't think our driver bugs were prioritized right.
17:33:58 while we are talking about bugs - i'm almost ready to push the bug on booting multiple instances with a concurrent image upload - but i am going to make it WIP as i'd like a few pairs of eyes on it since i am mucking with concurrency issues
17:34:11 tjones: great
17:34:19 tjones: yay!
17:35:35 I was going to ask about bugs next anyway… :-)
17:36:03 #link https://bugs.launchpad.net/nova/+bug/1190515
17:36:05 Launchpad bug 1190515 in openstack-vmwareapi-team "Incorrect host stats reported by VMWare VCDriver" [High,In progress]
17:36:28 That one is the only bug from my FF list that hasn't merged.
17:36:41 #link https://review.openstack.org/#/c/33100/
17:36:58 So with that at the top of my list and RC1 coming up on Sept. 26th ...
17:37:23 what else should I track?
17:37:39 (I think we got a good idea on 2 or three right now)
17:39:05 * hartsocks did I kill the conversation?
17:39:24 i have a few.
17:39:29 gary sent out a list two days ago, and Sabari and I tagged on a few
17:39:59 https://review.openstack.org/#/c/41977/ - i think that we need a bug for this one
17:40:14 https://review.openstack.org/#/c/43268/ - vnc password
17:40:38 https://review.openstack.org/#/c/43616/ - we spoke about this at the beginning of the meeting
17:41:26 issues with volumes - https://review.openstack.org/#/c/45864/ and https://review.openstack.org/#/c/46027/
17:41:42 guess we should maybe go back to the mail list or have a wiki with all of the issues
17:42:42 busy guy.
17:43:02 … and you're running the scheduler meetings now too!
17:43:54 two on my list:
17:43:56 https://review.openstack.org/#/c/40298/: snapshot failure
17:44:20 https://review.openstack.org/#/c/43994/: spawn failure with sparse disk
17:45:11 can we just add these to the VMwareAPI-Team project so we have 1 place to track them??
17:45:25 i'd rather do that than a wiki that we would forget to update
17:45:29 tjones: that sounds like a good idea.
17:45:47 We also need to have a working priority system we can manage ourselves.
17:45:56 which we have on that list
17:46:28 So, to me that sounds good.
17:46:36 AND that list ends up being on this report - which tracks where they land http://partnerweb.vmware.com/programs/vmdkimage/customer_bugs.html
17:47:23 hurrah!
17:47:50 So what we track there on partnerweb...
17:47:57 that's...
17:48:03 Critical = don't deploy without
17:48:10 High = Strongly recommended
17:48:21 and nothing else right now...
17:48:23 right
17:48:24 ?
17:48:34 oh yeah. let's continue this discussion after the meeting. i don't want to add "yet another project"
17:48:48 okay.
17:49:14 But, in general…
17:49:31 I think we just need to identify things for our driver *we*
17:49:35 that is all of us...
17:49:44 would classify as "critical" or "high"
17:49:53 but the nova team would bump to "Medium"
17:50:08 So far, I'm pretty sure that's everything that has been linked to so far.
17:50:20 (in this meeting)
17:51:08 I'll go back over the minutes and fix each related bug report to a driver-specific priority then.
17:51:33 anything else on bugs?
17:52:06 or does anyone have anything else we need to talk about as a group?
17:52:06 can i get some reviews on https://review.openstack.org/#/c/33504/ please?
17:52:55 will do
17:53:06 oh no, jenkins tripped.
17:53:12 I'll look at it later.
17:53:52 hartsocks: that grumpy old man again. i have run a 'recheck'
17:54:10 arosen: thanks for the review
17:54:20 BTW: I've been looking at some of these Jenkins failures… lots of that testing code is multi-process, asynchronous, and uses timers!
17:55:36 sadly, we're just going to have to kick Jenkins in the shins from time to time.
17:57:12 So it sounds like we have some really high priority issues to fix-up.
17:57:25 sorry guys. i need to go and put the kids to bed.
17:57:38 Just about to close the meeting anyhow.
17:57:42 Thanks.
17:58:53 We're over in #openstack-vmware if you need us. Let's try and synchronize efforts more over there.
17:59:01 #endmeeting
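
Sketch for the self._vmops discussion (17:07 to 17:21 and 17:28): a minimal, hedged illustration of the idea raised in the meeting, namely keeping no "default" ops object on the driver, resolving the ops object per compute node, and validating the node name before use. Only the names _get_vmops_for_compute_node and instance['node'] come from the log; NodeNotFound, FakeVMOps, MultiClusterDriver, and _vmops_by_node are hypothetical stand-ins, not the actual nova code or the patch under review.

```python
class NodeNotFound(Exception):
    """Hypothetical error raised for an unknown compute node name."""


class FakeVMOps:
    """Stand-in for a per-cluster operations object (not the real nova class)."""

    def __init__(self, cluster):
        self.cluster = cluster

    def get_vnc_console(self, instance):
        # Report which cluster served the request so routing is visible.
        return {"cluster": self.cluster, "instance": instance["name"]}


class MultiClusterDriver:
    """Toy driver with no shared self._vmops, only a per-node mapping."""

    def __init__(self, clusters):
        # One ops object per cluster, keyed directly by node name.
        self._vmops_by_node = {name: FakeVMOps(name) for name in clusters}

    def _get_vmops_for_compute_node(self, nodename):
        # Validate that the resource exists before accessing it, so a bad
        # node name fails loudly instead of silently using the first cluster.
        if nodename not in self._vmops_by_node:
            raise NodeNotFound(nodename)
        return self._vmops_by_node[nodename]

    def get_vnc_console(self, instance):
        # Every operation resolves its ops object from the instance's node.
        vmops = self._get_vmops_for_compute_node(instance["node"])
        return vmops.get_vnc_console(instance)


driver = MultiClusterDriver(["cluster-a", "cluster-b"])
console = driver.get_vnc_console({"name": "vm1", "node": "cluster-b"})
```

Because the toy driver has no init-time default to fall back on, an operation that forgets the per-node lookup cannot quietly run against the first cluster, which is the failure mode the meeting worried about.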