17:00:15 <tjones> #startmeeting VMwareAPI
17:00:16 <openstack> Meeting started Wed Oct 2 17:00:15 2013 UTC and is due to finish in 60 minutes. The chair is tjones. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:17 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:19 <openstack> The meeting name has been set to 'vmwareapi'
17:00:22 <tjones> hi folks - who's here?
17:00:49 <dims> howdy tjones
17:00:53 <tjones> hi dims
17:00:54 <garyk> hi, i am here
17:00:57 <tjones> hi gary
17:01:22 <garyk> tjones: hi
17:01:37 <vuil> hi
17:01:46 <tjones> ok - i think everyone knows that hartsocks is on paternity leave for a few weeks. so i'll run the meetings for a bit
17:02:05 <garyk> mazal tov! (aka good luck)
17:02:12 <tjones> lets get started - i ran the bug report this morning - http://paste.openstack.org/show/47843/
17:02:22 <tjones> oops -
17:02:26 <tjones> #topic bugs
17:02:57 <tjones> in terms of high/critical bugs we have
17:03:01 <tjones> #link https://bugs.launchpad.net/bugs/1227825
17:03:04 <uvirtbot> Launchpad bug 1227825 in openstack-vmwareapi-team "datastore selection bug - fills first disk only" [Critical,In progress]
17:03:14 <tjones> which hartsocks was working on.
17:03:24 <garyk> tjones: that has been deferred to I as it is a feature
17:03:26 <smurugesan> Hey All, Sabari here
17:03:52 <tjones> this is the issue where we will only use 1 datastore and then throw exceptions when full. russellb has put it back into rc potential. can someone take this over?
17:03:52 <garyk> russellb said that there may be a slight chance of getting review for it but it is doubtful
17:04:01 <smurugesan> I can take it over
17:04:13 <tjones> i think it's very close. thanks smurugesan
17:04:24 <smurugesan> because i have a disk usage bug that can only be fixed after Shawn's patch.
17:04:31 <smurugesan> let me pull that for record
17:04:39 <tjones> #action smurugesan take over https://bugs.launchpad.net/bugs/1227825
17:04:41 <uvirtbot> Launchpad bug 1227825 in openstack-vmwareapi-team "datastore selection bug - fills first disk only" [Critical,In progress]
17:04:56 <tjones> then we have 2 high/medium that need revision
17:05:10 <tjones> #link https://bugs.launchpad.net/bugs/1184807
17:05:12 <uvirtbot> Launchpad bug 1184807 in openstack-vmwareapi-team "Snapshot failure with VMwareVCDriver" [High,Fix committed]
17:05:15 <smurugesan> https://bugs.launchpad.net/nova/+bug/1220459 will piggy back on 1227825's fix
17:05:17 <uvirtbot> Launchpad bug 1220459 in nova "VMware Driver reports incorrect disk usage" [High,Confirmed]
17:05:40 <tjones> oops - looks like the script has a bug. that one is already committed
17:05:56 <garyk> tjones: that has been approved (it may be in the script as it is a grizzly backport)
17:06:21 <tjones> ah ok. lets look at the other one #link https://bugs.launchpad.net/bugs/1213269
17:06:22 <uvirtbot> Launchpad bug 1213269 in openstack-vmwareapi-team "_check_if_folder_file_exists only checks for metadata file" [High,In progress]
17:06:36 <tjones> hee hee - that one is mine. i'll get on it today
17:07:09 <tjones> the other 2 need review - in fact there are quite a few needing +1
17:07:37 <smurugesan> I will be doing some reviews today, I will take a look at them.
17:07:50 <tjones> by "quite a few" i actually mean 2 high prio and 6 low. lets focus on the critical/high
17:07:59 <tjones> any other bugs needing discussion?
17:08:38 <tjones> *listens*
17:08:44 <garyk> tjones: https://bugs.launchpad.net/nova/+bug/1225002
17:08:46 <uvirtbot> Launchpad bug 1225002 in nova "VMware: no VM connectivity when opaque network does not match bridge id" [Medium,In progress]
17:09:35 <garyk> It also needs to be backported to grizzly. it has been around for quite a while now and we need to escalate to core reviewers
17:09:58 <smurugesan> Other bugs, I am working on https://bugs.launchpad.net/nova/+bug/1193980 - should push a patch today. It's a regression over Grizzly.
17:10:00 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:10:01 <tjones> is it in review?
17:10:12 <tjones> i don't see the link in the bug
17:10:32 <garyk> it has been in review since august - give me a sec
17:10:42 <garyk> tjohttps://review.openstack.org/#/c/41977/
17:10:48 <garyk> tjones: https://review.openstack.org/#/c/41977/
17:11:14 <tjones> wow - yes it's very ready for core
17:12:06 <tjones> funny no link in the bug. ok that one is marked need core review in the report.
17:12:25 <garyk> tjones: https://code.launchpad.net/bugs/1197041
17:12:26 <uvirtbot> Launchpad bug 1197041 in nova "nova compute crashes if you do not have any hosts in your cluster" [Medium,In progress]
17:12:55 <smurugesan> for some reviews, the bug is not getting updated. It happened with me as well.
17:13:12 <tjones> garyk: yes that one is a pain to debug. i have hit it and other users have reported it with VOVA.
17:13:38 <tjones> smurugesan: that iscsi - i'll add to the list
17:13:47 <tjones> any other bugs?
17:13:48 <garyk> i do not know why russellb removed this from the rc candidate. i'll try and check with him later
17:13:48 <smurugesan> thanks!
17:14:23 <garyk> tjones: https://bugs.launchpad.net/nova/+bug/1228847
17:14:24 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:14:31 <tjones> smurugesan: russellb has added havana-rc-potential to that one
17:14:34 <garyk> this is really problematic
17:14:48 <tjones> yes it is hitting our CI guys
17:14:59 <tjones> do you have root cause on it?
17:15:04 <garyk> if there is an exception in the driver - for example nova fixed ips are all used up, then the actual exception is corrupted
17:15:21 <garyk> this one needs to be moved to high
17:15:43 <tjones> i don't believe i have the ability to do so. But we can ping russellb
17:16:10 <garyk> i think that anyone who is part of the nova bugs team can (you just need to join the group)
17:16:22 <tjones> oh ok - i'll do that :-D
17:16:41 <garyk> russellb moved it from high to medium (just saw this now). i do not agree with his assessment. it makes troubleshooting practically impossible
17:16:43 <tjones> #action set https://bugs.launchpad.net/nova/+bug/1228847 to high prio
17:16:45 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:17:16 <tjones> #action target https://bugs.launchpad.net/nova/+bug/1193980 for rc
17:17:18 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:17:39 <tjones> any other bug issues besides triage?
17:18:47 <tjones> #topic bug triage
17:18:54 <tjones> ok here's the list: http://goo.gl/pTcDG
17:19:18 <tjones> #link https://bugs.launchpad.net/nova/+bug/1232348
17:19:20 <uvirtbot> Launchpad bug 1232348 in nova "VMware: vmdk converted via qemu-img may not boot as SCSI disk" [High,New]
17:19:41 <vuil> i filed this. Can someone confirm this well known issue.
17:19:48 <smurugesan> vuil is working on couple of these bugs.
17:19:50 <smurugesan> oh there he is
17:19:54 <smurugesan> :)
17:20:03 <vuil> sorry out for a couple minutes
17:20:17 <vuil> I am testing a fix. Hopefully out today.
17:20:47 <tjones> ok next is #link https://bugs.launchpad.net/nova/+bug/1194076
17:20:50 <uvirtbot> Launchpad bug 1194076 in nova "current_workload in nova hypervisor-show not recover after nova suspend/resume" [Medium,Incomplete]
17:21:43 <tjones> garyk it looks like you were discussing that with the reporter. do you think user error?
17:21:44 <garyk> i was unable to reproduce this
17:21:52 <tjones> ok lets leave that one
17:22:14 <tjones> next is #link https://bugs.launchpad.net/nova/+bug/1226543
17:22:16 <uvirtbot> Launchpad bug 1226543 in nova "VMware: attaching a volume to the VM failed" [Medium,Incomplete]
17:22:40 <tjones> no action on this after garyk
17:22:48 <tjones> commented.
17:23:29 <tjones> last we have the results of our test team doing stress tests #link https://bugs.launchpad.net/nova/+bug/1230047
17:23:31 <uvirtbot> Launchpad bug 1230047 in nova "VMware: spawning large amounts of VMs sometimes causes errors" [Undecided,New]
17:24:24 <tjones> rhsu was looking into the nfs server to see if that was the issue. did anyone else look at this?
17:24:24 <garyk> tjones: this is a tough one. initially we thought it was https://code.launchpad.net/bugs/1228847
17:24:27 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:25:02 <garyk> after we added the patch that i have fixed for that problem we saw the real exception and it was that the VC was returning an exception that a vmdk was not found
17:25:12 <garyk> when ryan tried on another setup he was unable to reproduce
17:25:20 <garyk> i was also unable to reproduce
17:25:26 <garyk> we need to try and reproduce this
17:25:38 <tjones> hum - the other setup was one in the BLR lab - so that is why he was thinking it was the NFS server somehow
17:26:00 <tjones> ok let me talk with rhsu and see what the next steps are on this.
17:26:16 <tjones> #action talk to the test team about repro on https://bugs.launchpad.net/nova/+bug/1230047
17:26:18 <uvirtbot> Launchpad bug 1230047 in nova "VMware: spawning large amounts of VMs sometimes causes errors" [Undecided,New]
17:26:46 <tjones> any other issues before we go to open discussion?
17:27:16 <garyk> tjones: i think that we need to talk about stable grizzly
17:27:28 <tjones> #topic open discussion
17:27:30 <garyk> tjones: we should also go over documentation
17:27:32 <tjones> ok lets talk about that
17:27:46 <tjones> we have until 10/10 to get backports in correct?
17:27:56 <garyk> ok. regarding the stable grizzly, the feature freeze is the 10th of the month
17:28:18 <garyk> i think that the release will be a few days later (due to gating problems over the last few days and stable gate is broken)
17:28:38 <garyk> we need to make sure that we have all of our critical and high bugs backported and tested hopefully by then
17:28:58 <garyk> it would be nice if we can make this a formal part of these meetings as the stable branch is very important for all
17:29:16 <tjones> #action add grizzly backports to the meeting agenda
17:29:30 <garyk> tjones: thanks
17:29:52 <tjones> ok we have 5 patches that gary has called out as needing backport
17:29:58 <garyk> regarding documentation i saw https://review.openstack.org/#/c/48859/ (and had a comment). can someone else please take a look
17:29:59 <vuil> tjones: where is the list of bugs you showed yesterday that had grizzly status
17:30:21 <tjones> http://partnerweb.vmware.com/programs/vmdkimage/customer_bugs.html
17:31:07 <tjones> garyk: i have that on my list to review today
17:31:32 <garyk> tjones: thanks!
17:32:24 <vuil> will take a look as well
17:32:30 <garyk> thanks
17:32:41 <tjones> #action bug owners review http://partnerweb.vmware.com/programs/vmdkimage/customer_bugs.html and backport their bugs if they have grizzly-backport-potential tags
17:33:21 <tjones> off the top of my head - vui - you have 2, sabari you have 2, i have 1
17:33:28 <vuil> was going to ask what the procedure is.
17:33:40 <smurugesan> sure
17:33:45 <vuil> sure.
17:33:50 <tjones> literally off the top of my head :-) so please check.
17:34:13 <garyk> i think that i have done one of vui's (and was planning on doing a few others)
17:34:25 <vuil> I saw that. Thanks.
17:34:30 <tjones> vuil - i'll paste the email from garyk on pastebin
17:34:43 <garyk> tjones: thanks
17:34:59 <tjones> here you go - http://paste.openstack.org/show/47845/
17:35:10 <russellb> guys, please don't target new stuff to rc1
17:35:19 <russellb> it's being released today, just waiting on the last change to go through the gate
17:35:26 <tjones> hey russellb
17:35:51 <vuil> Probably can't assume folks not at this meeting get to the minutes in time to deal with their backports, so best we all take a pass through that list.
17:36:13 <garyk> russellb: so will these bugs be targeted for rc2?
17:36:24 <russellb> only if we determine that they qualify as release blockers
17:37:10 <russellb> none of these really seem to be, but happy to evaluate if you think something qualifies
17:37:16 <garyk> ok, understood, the reason we added them to the rc1 list is that we feel they are release blockers
17:37:34 <garyk> for example if there is an exception in the driver we may be unable to troubleshoot as an invalid stack is logged
17:37:35 <russellb> they're not things that block everyone, are they?
17:37:36 <tjones> for example - one is a grizzly regression
17:37:50 <garyk> they block everyone using the vmware drivers
17:38:05 <russellb> you have an interesting definition of block, then
17:38:16 <garyk> can you please clarify
17:38:45 <garyk> my take is that if the service crashes for some reason or another then that is a blocking issue
17:38:49 <russellb> at this point, block needs to be something that can't be worked around, and affects major functionality
17:38:59 <russellb> would you like to go 1 by 1?
17:39:10 <garyk> sure.
17:39:18 <garyk> i think that there are 2 serious bugs:
17:39:33 <garyk> https://review.openstack.org/#/c/41977/
17:39:50 <russellb> ok on that one, is this a configuration error basically?
17:39:53 <russellb> how do you hit the problem?
17:39:58 <garyk> this may result in a case where there is no network connectivity with the vm
17:40:21 <garyk> the problem happens when the esx host does not have a matching opaque network.
17:40:34 <russellb> so, a setup error?
17:40:41 <garyk> it can happen after a reboot of the host
17:41:26 <garyk> if a new host is added to a cluster then vm's deployed on that host may not have network connectivity
17:41:27 <russellb> what causes it to happen?
17:42:02 <garyk> if the host goes into maintenance mode and then say is rebooted (power outage for example)
17:42:45 <garyk> it is kind of like the ovs having no rules that match traffic
17:43:56 <garyk> my bad is that i did not convey this information on the bug (was away on vacation…)
17:44:05 <garyk> but that is not an excuse
17:44:15 <russellb> to be honest, i still don't understand what you're saying
17:44:50 <russellb> can you write up on the bug in more detail what the problem is, how it occurs, and make a point to demonstrate that it's a bug with no workaround?
17:45:13 <garyk> sure, i'll do that
17:45:27 <tjones> #action gark: update https://review.openstack.org/#/c/41977/ with more detail
17:45:38 <tjones> #undo
17:45:38 <openstack> Removing item from minutes: <ircmeeting.items.Action object at 0x30f3c90>
17:45:46 <garyk> it is like someone reboots a host with libvirt and after the reboot they are unable to run traffic to any vms on that host
17:45:51 <tjones> #action garyk: update https://review.openstack.org/#/c/41977/ with more detail
17:46:14 <russellb> garyk: i get the end result, but not the steps that lead up to putting the host in that situation
17:46:15 <garyk> the second issue is https://code.launchpad.net/bugs/1228847
17:46:17 <uvirtbot> Launchpad bug 1228847 in nova "VMware: VimException: Exception in __deepcopy__ Method not found" [Medium,In progress]
17:46:21 <garyk> ok
17:46:29 <russellb> i saw you saying that one helps debugging
17:46:36 <russellb> but it's not preventing functionality from working for users
17:46:40 <russellb> so i don't consider that a blocker
17:47:03 <russellb> keep in mind that this far into the RC period, the bar has to be *really* high, or we'll never release
17:47:28 <garyk> ok.
17:47:41 <garyk> the previous bug is really high and this one can be defered
17:47:54 <garyk> deferred (i think my spelling is a mess)
17:48:11 <russellb> k, ping me when you have a more detailed writeup on the bug ready, and hopefully i'll see what you see then
17:48:18 <garyk> ok, thanks
17:49:19 <tjones> ok one more - this is a regression from grizzly
17:49:20 <tjones> https://bugs.launchpad.net/nova/+bug/1193980
17:49:22 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:50:15 <russellb> no patch?
17:50:28 <russellb> and it's tagged grizzly-backport-potential?
17:50:36 <russellb> so does that mean it exists in grizzly as well?
17:50:57 <tjones> we've been having some issues with bugs not showing they are in review. sabari - can you comment on this one?
17:51:28 <russellb> even if it doesn't happen automatically, you should update it manually :-)
17:51:43 <tjones> um - if it's marked for grizzly i may have mixed it up with another :-D i'll check on it
17:51:51 <russellb> k
17:52:11 <tjones> anything else for russellb folks? we are getting close to tiem
17:52:12 <tjones> time
17:52:26 <russellb> looks like the grizzly tag was added when the bug was filed
17:52:32 <garyk> a hug when i start to cry :)
17:52:43 <tjones> :-D
17:52:45 <russellb> so, clarify if it's a regression from grizzly to havana
17:52:51 <tjones> ok will do
17:52:53 <russellb> and update the bug with the review, and set to In Progress, if there's a patch up
17:53:04 <russellb> and ping me after those updates
17:53:19 <tjones> #action follow up on https://bugs.launchpad.net/nova/+bug/1193980
17:53:22 <uvirtbot> Launchpad bug 1193980 in nova "Cinder Volumes "unable to find iscsi target" for VMware instances" [High,Confirmed]
17:53:24 <garyk> russellb: thanks for the time. much appreciated
17:53:36 <russellb> yep, sorry to be tough, just have to protect the release timeline
17:53:55 <tjones> russellb: thanks! no worries - this is the *fun* part
17:54:04 <russellb> yup
17:54:17 <tjones> anything else folks?
17:54:33 <tjones> going once….
17:55:09 <tjones> thanks for attending! see you next week
17:55:11 <tjones> #endmeeting