17:01:33 #startmeeting VMwareAPI
17:01:33 that was quick
17:01:34 Meeting started Wed Nov 20 17:01:33 2013 UTC and is due to finish in 60 minutes. The chair is hartsocks. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:01:35 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:01:37 lol
17:01:37 The meeting name has been set to 'vmwareapi'
17:01:47 ok we are done!
17:01:53 I didn't realize it would take the whole name...
17:01:54 tjones: and we completed it without incident :)
17:02:01 awesome!
17:02:10 chat to you guys next week ...
17:02:11 :-)
17:02:25 hi guys
17:02:58 hey. Who else is around?
17:04:49 Okay. Well, we'll just hope people who can jump in will.
17:04:53 hartsocks: o/
17:04:58 #topic bugs
17:05:08 #link https://bugs.launchpad.net/nova/+bugs?field.tag=vmware+&field.status%3Alist=NEW
17:05:41 We've got 3 bugs I've not managed to cycle back to yet.
17:05:49 I haven't set a priority on:
17:06:00 #link https://bugs.launchpad.net/nova/+bug/1240355
17:06:03 Launchpad bug 1240355 in nova "Broken pipe error when copying image from glance to vSphere" [Undecided,New]
17:06:09 hartsocks: what do you mean we have 3 bugs?
17:06:15 that is intermittent
17:06:20 Because I've had no luck reproducing it.
17:06:22 3 non-triaged bugs
17:06:32 ok, thanks for the clarification
17:06:35 We have 3 bugs without priority. Sorry, I a word.
17:06:57 So 1240355 in nova "Broken pipe error when copying image from glance to vSphere" [
17:07:01 #link https://bugs.launchpad.net/nova/+bug/1252827
17:07:03 Launchpad bug 1252827 in nova "VMWARE: Intermittent problem with stats reporting" [Undecided,New]
17:07:08 I can't repro. But I understand others have.
17:07:26 i have yet to reproduce this one, but it is critical in my opinion
17:07:38 this one is critical - it's blocking our CI.
17:07:41 which one? reporting or intermittent pipe error?
17:07:45 if someone has reproduced then they should set it as confirmed
17:07:51 sorry - should have finished pipe discussion 1st
17:08:06 the statistics - the reason is that if the stats are not reported then VMs cannot be launched
17:08:09 Well, I thought Gary had reproduced the pipe thing.
17:08:26 yeah. That's pretty bad. So we can call that "High" then.
17:08:33 which bug are we talking about?
17:08:47 *lol*
17:08:49 let's finish the pipe discussion and move on - my bad
17:08:57 tjones: ok. np
17:09:02 So the pipe bug.
17:09:13 Gary. You said you've seen it? I can't make it happen.
17:09:35 i have seen this one a few times. not as of late
17:09:54 i am marking it as confirmed as i have seen it on my setup
17:09:59 So… it could be gone now for all we know?
17:10:14 Okay.
17:10:23 When it happens is it bad?
17:10:30 Can you keep working?
17:10:51 yes, when it happens one is unable to boot a VM and you need to try again. That is bad in my opinion.
17:11:12 so is the cloud dead from that point on or does it recover?
17:11:34 … and yet three groups have tried to repro this and can't.
17:12:02 the cloud is not dead. the current VM being booted fails.
17:12:09 well, I'm calling that Medium for now and moving on to a more important thing then.
17:12:21 #link https://bugs.launchpad.net/nova/+bug/1252827
17:12:23 Launchpad bug 1252827 in nova "VMWARE: Intermittent problem with stats reporting" [Undecided,New]
17:12:28 tracy go!
17:12:36 CRITICAL! blocking CI
17:12:58 awesome.
17:13:04 and someone from this TZ should take it on so we can work with ryan and sreeram on it.
gary was looking, but it's at the end of his day
17:13:23 sabari was taking it but he was sick yesterday
17:13:27 Technically, we can't put things under "Critical" since our driver's failures don't hit critical in the grand OpenStack scheme of things.
17:13:35 ok super high ;-)
17:13:47 tjones: i looked at the log files that they provided but there was only one concerning thing - an image was unable to be deleted
17:13:50 So this I'll tack to the vmwareapi subteam thingy.
17:13:53 but it is affecting CI which makes it important
17:14:11 the bug is critical
17:14:51 yeah. I guess now that our CI is feeding the OpenStack infra we can claim that. Let's do it and see what happens.
17:14:58 i think that we need to provide a debug version to sreeram and ryan and see where it goes from there.
17:15:04 i'll give them something soon
17:15:11 i don't know if sabari is here today - if not who wants to take it? I can't as i have prep for a demo tomorrow :-P
17:15:30 i can look into it in the coming hours
17:15:53 #action Gary to follow up on blocking bug/1252827
17:15:56 cool.
17:15:57 thanks garyk - you'll need to pass it on to someone at the end of your day
17:16:05 tjones: ok, np
17:16:47 okay, let's not drop that one then.
17:16:58 last one… then on to other topics
17:17:05 #link https://bugs.launchpad.net/nova/+bug/1251501
17:17:06 Launchpad bug 1251501 in nova "VMware: error when booting sparse images" [Undecided,New]
17:17:18 Where did we go on that one?
17:17:20 hi just checking in.
17:17:30 hey.
17:17:31 that looks like a backport issue (on my part) - it does not happen in master
17:17:33 Just in time.
17:17:36 i'll take a look
17:17:41 cool.
17:17:55 cool. was wondering what Ryan meant
17:18:06 so that might not be an issue.
17:18:09 good.
17:18:17 not for master :-)
17:18:45 Okay.
17:18:48 So bugs in general.
17:19:25 #link https://bugs.launchpad.net/nova/+bug/1195139
17:19:27 Launchpad bug 1195139 in nova "vmware Hyper doesn't report hypervisor version correctly to database" [Critical,In progress]
17:19:32 This one caught my attention.
17:19:43 that is fixed and waiting for review
17:19:48 Gary, you moved the priority on this and it looks like you got some push back in review.
17:19:55 it was -2'd due to the invalid bug status 'won't fix'
17:20:07 it was fixed and validated by a QE engineer at HP who encountered the problem
17:20:09 #link https://review.openstack.org/#/c/53109/
17:20:29 i have updated the bug status and hopefully the reviewer will understand
17:20:57 please excuse my spelling - i am trying to do three things at once and not succeeding in any of them
17:21:01 So as I remember, the issue is that the idea of numbers for versions doesn't work universally.
17:21:28 at the moment no one has provided a hypervisor version that does not work with the numbers.
17:21:32 I understand we can't fix upstream … so this is an attempt to solve a blocking problem.
17:21:49 So the argument is that it's an academic argument.
17:22:10 upstream there are 2 issues
17:22:11 For example: W13k isn't the version of any existing hypervisor.
17:22:14 1. the bug fix
17:22:22 2. changing the way that it is used in the db
17:22:34 the latter was not liked for reasons unbeknownst to me
17:22:41 in the short term we should go with #1
17:23:04 If they won't change the DB then I suppose we're forced to do it this way.
17:23:24 sadly this is what we have until someone makes the changes to the db.
17:23:33 Why is it critical? Is it blocking something?
17:23:41 at this stage i am not sure if anyone is actually doing that work
17:23:55 it is critical - one cannot use postgres with our hypervisor
17:25:05 I'm expecting we'll get a knock on the head for calling it critical but I'll let it ride just to see where the line is.
17:25:37 if others disagree they can change the severity
17:25:42 It's a shame this got knocked around so long if it's critical.
17:26:18 Do you need someone to reach out to Joe since he's the one who blocked you?
17:26:35 I'll take that off line.
17:26:36 i sent him a mail and wrote on the review. if you could reach out to him it would be great.
17:26:50 thanks
17:27:04 #action follow up on why bug/1195139 is blocked.
17:27:18 Okay.
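[Editor's note: the fix discussed above works by packing the dotted hypervisor version string into a single integer so it fits the database's integer column and still sorts correctly. A minimal sketch of that general approach follows; the function name and the 1000-per-component packing scheme are illustrative assumptions, not necessarily the exact code in review 53109.]

```python
# Illustrative sketch only: pack a dotted hypervisor version such as
# "5.1.0" into one integer so it can live in an integer DB column.
# The helper name and 1000-per-component scheme are assumptions, not
# necessarily what the patch under review does.

def convert_version_to_int(version_str):
    """Convert 'major.minor.patch' into a single comparable integer."""
    parts = [int(p) for p in version_str.split('.')]
    while len(parts) < 3:          # pad short versions like "5.1"
        parts.append(0)
    major, minor, patch = parts[:3]
    return (major * 1000 + minor) * 1000 + patch

# "5.1.0" -> 5001000, and integer comparison still orders versions:
assert convert_version_to_int('5.1.0') > convert_version_to_int('5.0.3')
```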
17:27:32 So does anyone else have a bug we need to discuss and sync up on?
17:28:09 * hartsocks politely listens for people who are slower on the keys.
17:29:42 okay. We'll have open discussion at the end.
17:29:47 #topic blueprints
17:29:52 #link https://blueprints.launchpad.net/nova?searchtext=vmware
17:30:04 So let's do this.
17:30:42 Let's pick a blueprint and discuss it and its priority to the project, and try and do a different one each week.
17:31:06 hartsocks: sadly all bps are set as low until 2 cores jump in...
17:31:14 yes.
17:31:19 For background...
17:31:50 if you weren't following, the BP process changed. So that if you want higher than "low" priority you have to get 2 core developers signed up for your BP.
17:32:24 In all fairness this doesn't mean a BP won't get done, it just means it won't get done as fast. Virtually all our BPs were "low" last round and many still made it in.
17:33:01 On the link I posted...
17:33:11 so our config checker (which i have done nothing on) didn't make this list cause it doesn't say vmware in it…
17:33:28 i just hope that we manage to convince the core guys to jump in and review our stuff. at the moment it seems to be going as usual. an example is the diagnostics. russell has been very helpful here
17:33:28 *all* but one of our BPs have been bumped *below* "low" priority BTW.
17:33:43 tjones: it's a dumb query, best I have.
17:33:50 that's ok - i'll work around it
17:34:11 tjones: post your BP so we can pick on… er… discuss it.
17:34:15 :-)
17:34:32 https://blueprints.launchpad.net/nova/+spec/config-validation-script
17:34:37 it's OUR BP ;-)
17:34:51 Actually, it's *awesome* that this happened.
17:35:06 :)
17:35:12 i just put vmware in it ;-)
17:35:14 I called my old BP on the configuration validator for the driver defunct. This is much better.
17:36:08 ogelbukh did a nice job of capturing requirements in https://etherpad.openstack.org/p/w5BwMtCG6z
17:36:42 we have 2 distinct parts in it
17:36:47 tjones: make sure to link that into the BP.
17:36:56 just did
17:37:10 ogelbukh: go ahead
17:37:56 first is modifications to common config
17:38:25 additional flag types and validations
17:38:52 i like the idea of doing that part in oslo and auto generating
17:38:54 there are multiple blueprints along those lines
17:39:22 and i think the vmware part will be the first one to implement as it's the first use case
17:39:40 Yeah. The validation and config-check thing is a cross-cutting concern for even VMwareAPI related drivers...
17:39:54 the folks on Cinder have some of the same validation checks the folks on Nova will.
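[Editor's note on the "first part" just described: a minimal sketch of what "additional flag types and validations" in oslo.config could look like. The ``types`` module and its String(regex=...)/Float(min=...) signatures are assumptions here (they postdate this meeting), and the option group and names are invented for illustration.]

```python
# Hypothetical sketch: declaring config options that validate their
# values at parse time via constrained types. The ``types`` module is
# an assumed oslo.config feature; the 'vmware' group and option names
# are made up for this example.
from oslo_config import cfg
from oslo_config import types

vmware_opts = [
    cfg.Opt('host_ip',
            type=types.String(regex=r'^(\d{1,3}\.){3}\d{1,3}$'),
            help='vCenter/ESX host IP address'),
    cfg.Opt('task_poll_interval',
            type=types.Float(min=0.1),
            default=5.0,
            help='Seconds between polls of a remote vSphere task'),
]

CONF = cfg.CONF
CONF.register_opts(vmware_opts, group='vmware')
# A malformed value (e.g. host_ip = banana) now fails at config parse
# time instead of surfacing later as a confusing runtime error.
```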
17:39:55 second part is a standalone tool capable of per-service validation
17:40:22 and of cross-services consistency
17:41:29 My chief concern about validation at service start up and validation in a stand alone tool… was that this would be mostly the same code … so I wanted to see code reuse to avoid duplicate work.
17:41:48 absolutely
17:42:05 my idea right now is that it should 'register' service config or something like that
17:42:15 So I take it this is going to be part Oslo-level work and part work at the driver level?
17:42:19 and validate against 'known' configs
17:42:26 but that has implications
17:42:44 and I'm still trying to identify all of them
17:43:24 hartsocks: tjones: I'm not sure that validation logic will be the same for those 2 parts
17:43:35 Okay.
17:43:38 config validation?
17:44:02 with the oslo.config part it is mostly additional types and regexp matching
17:44:21 Hmm...
17:44:34 while in the cross-services part we'll have to inspect semantics
17:45:40 logical connections between services
17:45:49 that are not explicit in the code
17:45:55 service validation at runtime would get deeper into the config but config validation can be done in either place
17:46:02 2 different things to attack
17:46:03 sure
17:46:05 yes
17:46:27 should we have a separate session to discuss this in depth?
17:46:30 so how best to work on this further? another irc meeting?
17:46:35 lol - read my mind
17:46:38 i believe so )
17:46:42 :-)
17:47:00 we could have a call in webex or google hangouts if you like
17:47:21 but time windows are really narrow
17:47:23 #action set up meeting (in IRC or otherwise) for ogelbukh, tjones, hartsocks, (and anyone else interested) to discuss config validation
17:47:27 given i'm in utc+8
17:47:33 *utc+4
17:47:34 Yeah.
17:47:47 yes they are very narrow - are you in australia?
17:47:47 and 12 hours difference with PST
17:47:55 Well I have a teeny tiny baby… so sometimes 8pm to midnight EST is the best time for me.
17:47:55 no, Russia
17:48:00 Moscow TZ
17:48:04 ah
17:48:11 cool
17:48:15 very cool
17:48:23 sometimes snowy and cold even.
17:48:28 :)
17:48:28 :-)
17:48:35 probably in 2 weeks )
17:48:43 so we could start with another irc
17:49:10 We are holding #openstack-vmware for discussions people aren't 100% sure go in #openstack-nova
17:49:14 ok
17:49:17 that's cool
17:49:21 This is one of those that can probably go either place.
17:49:22 i'm already there
17:49:54 So let's table that BP for now.
17:50:00 #topic open discussion
17:50:36 Last 10 minutes, for anything people need to call out.
17:50:46 * hartsocks listens
17:51:20 fyi - i have given sreeram and ryan a debug version.
17:51:35 troubling line of code - https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/host.py#L118
17:51:54 i have seen exceptions showing that they have datastore access problems
17:52:08 and everything is returning 0
17:52:09 sreeram had a very good idea of not resetting the stats - we just need to validate it
17:52:40 Can happen sometimes especially with NFS datastores.
17:53:13 So, sometimes an NFS datastore is "not found" and then later it is?
17:53:21 * hartsocks boggles
17:53:21 ugh
17:53:31 yeah. transient network connectivity issues can cause that
17:53:39 VC too then
17:53:43 * hartsocks nods knowingly
17:53:43 wonder how they handle it
17:53:43 i am not sure. hopefully after a run or 2 we'll have some debug info
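[Editor's note: a sketch of sreeram's "don't reset the stats" idea mentioned above — on a transient datastore or network failure, keep the last known capacity figures instead of reporting zeros, which stops the scheduler from placing VMs. All class and method names below are hypothetical; this is not the actual nova/virt/vmwareapi/host.py code.]

```python
# Illustrative sketch of the idea discussed above: cache the last good
# host stats and fall back to them when a transient failure (e.g. an
# NFS datastore briefly "not found") would otherwise make every
# capacity figure report as 0. Names are hypothetical.
class HostStats(object):
    def __init__(self, session):
        self._session = session
        self._last_good_stats = None

    def get_host_stats(self, refresh=False):
        if not refresh and self._last_good_stats is not None:
            return self._last_good_stats
        try:
            stats = self._query_hypervisor()  # may raise on a flaky network
        except Exception:
            # Transient failure: reuse the previous figures instead of
            # zeroing them out, so the scheduler can still place VMs.
            if self._last_good_stats is not None:
                return self._last_good_stats
            raise
        self._last_good_stats = stats
        return stats

    def _query_hypervisor(self):
        # Placeholder for the real vSphere API calls that gather
        # datastore capacity, vCPU, and memory information.
        raise NotImplementedError
```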
17:54:03 It's "the 7 fallacies of Network programming"
17:54:10 or something like that.
17:54:35 #link http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing
17:54:48 Looks like I remembered wrong. There are 8.
17:54:53 1. the network is reliable
17:54:58 *lol*
17:54:59 yep.
17:55:34 Is it *ironic* that our "cloud computing" code suffers from a lot of these?
17:55:56 i think that they need to reevaluate after the advent of SDN
17:56:21 there might be more?
17:56:39 :-)
17:56:56 :)
17:57:49 I have nothing against short meetings.
17:58:00 As I proved earlier. :-) … going once...
17:58:26 … twice ...
17:59:09 … three times...
17:59:13 #endmeeting