17:00:29 #startmeeting qa
17:00:30 Meeting started Thu Dec 6 17:00:29 2012 UTC. The chair is jaypipes. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:31 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:33 The meeting name has been set to 'qa'
17:00:59 <- here
17:01:06 davidkranz: present
17:01:06 hi
17:01:13 I'm here
17:01:18 Chun Wang (April) is here.
17:01:20 * afazekas here
17:01:35 * dwalleck is sort of awake
17:01:55 chunwang: hi! :) welcome!
17:02:05 * sdague hands dwalleck virtual coffee
17:02:17 jaypipes: thanks :)
17:02:17 * jaypipes grabs coffee from sdague
17:02:18 Any one else here for the meeting?
17:02:32 lol
17:02:35 heh
17:02:49 here as well
17:02:58 donaldngo: hi there
17:03:05 hello
17:03:06 OK, welcome back davidkranz
17:03:16 #topic Open reviews
17:03:23 #link https://review.openstack.org/#/q/status:open+project:openstack/tempest,n,z
17:03:39 Shall we discuss from bottom to top?
17:03:49 sure
17:03:53 First one (not WIP): https://review.openstack.org/#/c/17063/
17:03:59 mordred: ping
17:04:07 and fungi ^^
17:04:34 You guys want to comment on that review? afazekas was able to do some setup.py factoring that just went in to tempest this morning.
17:04:57 I think you just need to revisit that patch and see if a) it's still relevant and b) if so, if it needs a few changes
17:05:05 mordred, fungi: ^^
17:05:12 looking
17:05:42 afazekas, sdague: Next review is the boto versioning one: https://review.openstack.org/#/c/17467/
17:06:02 afazekas: any status on that one regarding updates to glance and nova for updated boto?
17:06:10 https://review.openstack.org/#/c/17256/
17:06:18 nova done, glance missing
17:06:22 kk
17:06:38 jaypipes: yeh, my only comments remain. I don't think we want commented out lines of code in the tree
17:07:00 afazekas: +2'd that glance review... should be merged shortly.
17:07:01 if someone else wants to override me, I'm cool with that :)
17:07:39 sdague: Workaround must be look like a workaround, which is wrong
17:07:50 sdague: I agree with you, but I think afazekas can abandon once 17256 gets merged.
17:07:55 afazekas: correct?
17:08:07 jaypipes: yes
17:08:19 ok, so not really an issue then?
17:08:39 ok, cool, then we'll just wait to see if 17256 gets through the gate, and then afazekas can mark that tempest patch abandoned.
17:08:43 alright, next one!
17:08:54 https://review.openstack.org/#/c/17464/
17:08:58 is Rohan here?
17:09:11 I believe he's in India, so probably not :)
17:09:21 anyone from NTTDATA here?
17:09:32 jaypipes: aroo?
17:09:49 jaypipes: The comment says he is waiting for a cinder bug fix. But the link in the review is broken.
17:09:53 looks like Rohan is waiting on a Cinder bug...
17:09:57 davidkranz: yeah.
17:10:00 davidkranz: remove the .
17:10:14 gerrit is detecting the link incorrectly
17:10:41 sdague: Heh. So that one is just in process.
17:10:44 updated the review
17:10:57 the bug isn't actually being worked as far as I can see
17:11:02 afazekas, sdague: I believe https://review.openstack.org/#/c/15972/ should be abandoned now that afazekas's work on the devstack lib/tempest is done.
17:11:58 jaypipes: yes
17:12:01 cool
17:12:17 afazekas: that lib/tempest work was quite cool btw, cleans stuff up a lot
17:12:19 afazekas: ok, cool. if you could ping jaroslav to abandon at his convenience, that would be great
17:12:22 sdague: ++
17:12:29 very cool indeed.
17:13:03 OK, I believe all the remaining merge requests are in the process of working through the gate.
17:13:23 jaypipes: done
17:13:27 except for dwalleck's server actions one, which I conveniently skipped over (we'll discuss later)
17:13:40 may I know what lib/tempest is doing? which blue print is it for?
17:13:50 chunwang: sure, let me explain
17:13:55 chunwang: it's a cleanup on devstack
17:14:07 ok
17:14:17 chunwang: so prior to afazekas's work, we had an /etc/tempest.conf.sample and a /etc/tempest.conf.tpl
17:15:01 chunwang: his work gets rid of the redundant/duplicate tempest.conf.tpl file and uses the ini_set bash library routine in devstack to construct the tempest configuration file used in the continuous integration gate tests
17:15:34 chunwang: afazekas's patch cleaned up a bunch of duplicate code and made the generation of tempest's configuration file match how the generation for other OpenStack projects is done in devstack
17:15:52 chunwang: it didn't change the configuration file options itself, just the way it was generated.
17:16:05 thanks, got it.
17:16:08 abandoned
17:16:15 jhenner: ah, cheers!
17:16:21 with a pleasure
17:16:26 :)
17:16:35 tempest in devstack is now getting clean enough that we might want to enable it by default
17:17:02 chunwang: sticking with you, I was going to change topic to discuss the tenant isolation issues we were discussing on https://bugs.launchpad.net/tempest/+bug/1087298. Is that OK with you?
17:17:03 Launchpad bug 1087298 in tempest "Request rate too high during test_security_group_rules running" [Undecided,New]
17:17:23 sure. we can discuss it here...
17:17:26 #topic How tenant isolation works (and possible bugs/inconsistencies)
17:17:39 if everyone could give a quick read through the above bug, that would be great
17:18:01 * jaypipes executes sleep 60
17:18:34 https://github.com/openstack-ci/devstack-gate/blob/master/devstack-vm-gate.sh#L162 is it needed ?
17:18:49 yes, let me introduce the background. The issues found in our deployed environments, both E & F version openstack.
17:19:42 many instances/image snapshots/security groups are seen after the test running in dashboard. Seems the teardown function is not working properly.
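[Editor's sketch] The ini_set approach jaypipes describes at 17:15:01 builds the configuration file by setting individual options instead of filling in a template. The Python below is only an analogue of that idea, not devstack's actual bash routine; the helper names `ini_set` and `render`, and the section and option names, are assumptions chosen for illustration.

```python
import configparser
import io


def ini_set(config, section, option, value):
    """Set one option, creating the section first if it does not exist."""
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, option, str(value))


def render(config):
    """Return the configuration as the text that would be written to disk."""
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


config = configparser.ConfigParser()
# Start from sample defaults (illustrative values, not real tempest defaults).
config.read_string("[identity]\nhost = 127.0.0.1\n")

# Override or add single options, one call per option.
ini_set(config, "identity", "host", "10.0.0.5")
ini_set(config, "compute", "allow_tenant_isolation", "true")

print(render(config))
```

Because each option is set individually, there is no second template file to keep in sync with the sample file, which is the duplication the patch is said to remove.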
17:20:27 so there is also a real bug in that code
17:20:29 chunwang: do you have details on which tests? the tenant will be named for the test...
17:21:43 err..actually many cases are using rand_name('tenant') as tenant name in the script
17:21:54 afazekas: AFAICT, yes, since the devstack install of tempest will now create the config file.
17:22:12 I suppose the script name I mentioned in bug. If it's really necessary, I may duplicate the issue and figure it out
17:22:33 chunwang: any test that derives from the base test classes should not be doing that... so it is a bug if they are not doing so.
17:23:15 ok, then there is another bug...
17:24:36 chunwang: I think we'll need to do some deeper investigation into this and figure out if any tests are leaving side effects if tenant_isolation is enabled (none of them should, and it is a bug if they do...)
17:24:43 so the error may happen when the test instance/imagesnapshot is not created in base classes?
17:25:13 chunwang: you mean if/when a fixture gets created in the setUpClass() methods, I presume?
17:25:14 chunwang: that exception you see in the logs is going to mean the final stages of cleanup don't happen
17:25:23 yes
17:25:54 I understand the teardown won't happen when case got errors
17:26:12 but the tenant not delete issue I saw is when cases all passed
17:26:25 chunwang: right, and that's the one I'm concerned about...
17:27:21 chunwang: OK, I will do a deeper investigation this afternoon on this issue, ok?
17:27:36 chunwang: and I'll update the bug report with my findings.
17:27:41 sure. if any help or information needed, I will provide
17:28:19 cool.
17:28:55 OK, so, shall we move on to discuss dwalleck's server actions patch? or is there any other topic folks would like to discuss before that?
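[Editor's sketch] The expectation stated in this topic is that a test class deriving from the base classes gets a tenant named for the test and leaves nothing behind after a passing run. The sketch below shows one way such a base class could look. It is not Tempest's code: `FakeIdentityClient`, `IsolatedTenantTestCase` and this `rand_name` are hypothetical stand-ins, and the client is an in-memory fake so the example runs without a cloud.

```python
import random
import unittest


def rand_name(prefix=""):
    """Return the prefix followed by a random integer suffix."""
    return prefix + str(random.randint(1, 0x7FFFFFFF))


class FakeIdentityClient:
    """Hypothetical in-memory stand-in for an identity API client."""

    def __init__(self):
        self.tenants = set()

    def create_tenant(self, name):
        self.tenants.add(name)
        return name

    def delete_tenant(self, name):
        self.tenants.discard(name)


CLIENT = FakeIdentityClient()


class IsolatedTenantTestCase(unittest.TestCase):
    """Base class: one tenant per test class, removed in tearDownClass."""

    @classmethod
    def setUpClass(cls):
        # Name the tenant after the test class so leftovers can be traced.
        cls.tenant = CLIENT.create_tenant(rand_name(cls.__name__ + "-tenant-"))

    @classmethod
    def tearDownClass(cls):
        CLIENT.delete_tenant(cls.tenant)


class SecurityGroupRulesTest(IsolatedTenantTestCase):
    def test_tenant_exists(self):
        self.assertIn(self.tenant, CLIENT.tenants)


def run():
    """Run the example class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SecurityGroupRulesTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

After `run()` returns, an empty `CLIENT.tenants` is the "no side effects" condition discussed above; a tenant still present after an all-passing run would correspond to the behaviour chunwang reports.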
17:29:10 jaypipes: this was the second time when the ./tools/configure_tempest.sh called, first it is called by https://github.com/openstack-ci/devstack-gate/blob/master/devstack-vm-gate.sh#L150, you can see in the log the double config: search for example this string 'source /opt/stack/devstack/lib/database' , here http://logs.openstack.org/17256/3/check/gate-tempest-devstack-vm/21128/console.html.gz
17:29:27 jaypipes: We should discuss how to get the full (or at least almost full) gate on all projects.
17:29:44 jaypipes: Another nova bug slipped through yesterday.
17:30:04 The biggest issue, other than some slow tests, is https://bugs.launchpad.net/nova/+bug/1079687
17:30:05 Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New]
17:30:14 afazekas: unless I'm mistaken, I believe you should be able to remove all calls to tools/configure_tempest.sh after your devstack patch lands
17:30:20 which fails the hourly tempest build several times per day.
17:30:33 #topic Blockers for enabling the full gate on core projects
17:30:45 jaypipes: There has not been any progress on this bug as far as I can see.
17:30:59 davidkranz: one sec, reading into bug
17:31:00 davidkranz: I don't know if I'd say those are Tempest issues. In my experience, servers going into error status have always been Nova issues
17:31:18 dwalleck: Yes, the bug is a nova bug ticket.
17:31:26 ahh, sorry
17:31:31 * dwalleck is not fully awake yet
17:31:38 davidkranz: is that related to any of the other nova bugs?
17:31:46 sdague: I don't think so.
17:31:47 I thought the blocker was a different issue
17:31:53 davidkranz: strange that all those failures are the XML server tests, not the JSON ones...
17:32:21 jaypipes: I am not sure that is true. The failures are flaky.
17:32:33 jaypipes: They happen in various places.
17:32:55 davidkranz: oh, ok...
17:33:17 jaypipes: You can see a lot of stuff at https://jenkins.openstack.org/job/periodic-tempest-devstack-vm-check-hourly/? if you want.
17:33:46 IMO, this should be the highest priority for some one on the nova team.
17:33:50 davidkranz: k, will look further into it. do we have the rabbit logs somewhere?
17:33:59 davidkranz: agreed about priority.
17:34:36 jaypipes: Yes, all the logs are at http://logs.openstack.org/periodic/periodic-tempest-devstack-vm-check-hourly/
17:34:38 vishy_zz: can we get some prioritization on https://bugs.launchpad.net/nova/+bug/1079687 ?
17:34:39 Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New]
17:34:56 jaypipes: Whether there is enough info in them is a different issue.
17:35:08 perhaps russellb would be able to look into the RPC issues and help us investigate
17:35:09 jaypipes: More instrumentation may need to be added.
17:35:32 ok, I'll talk to mtreinish about it, he was chasing what I thought was the blocking bug for tempest gate :)
17:35:40 davidkranz: sure, agreed. I'm hoping russellb or vishy_zz can tell us what info they would need to diagnose...
17:35:41 but i'll ask him to dive in on it
17:35:44 sdague: Which did you think it was?
17:36:31 sdague: The log error bugs are desirable for better test monitoring but things can proceed without that.
17:36:35 davidkranz: I don't know at the moment
17:36:45 sdague, davidkranz: I was having trouble reproducing it. But, I got sidetracked on coverage extension stuff for a while so I haven't put much time into it.
17:37:22 vishy_zz: had a theory that it was related to the instance getting deleted during bring up
17:37:26 actually, eglynn is also a good resource for digging into these tricky issues... Eoghan, feel like helping out? :)
17:37:31 mtreinish: It only fails maybe 10-15% of the time.
17:37:40 jaypipes: I was going to do this but forgot. I should be able to report the reason the server went into error status as well.
The reason is in the GET response, so it should be easy to tag on to the fault. I have it partially done, just having some issues with the text population
17:37:46 davidkranz: the issue is making nova the right amount of slow for it to show up
17:37:54 it's definitely a race
17:38:35 jaypipes: yep, I can look later on or tmrw (on a train ATM, dodgy wifi coverage ...)
17:38:40 sdague: OK. I just wanted to make sure some one was looking at it with nova smarts.
17:38:46 eglynn: :) thx man!
17:39:27 jaypipes: I think we can move on then.
17:39:27 davidkranz: I will write a mailing list post about the above bug and try to get prioritization.
17:39:40 jaypipes: Great.
17:39:49 #action jaypipes to write ML post about bug 1079687
17:39:51 we can take that into the nova meeting this afternoon as well
17:39:51 Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New] https://launchpad.net/bugs/1079687
17:39:58 kk
17:40:13 #topic dwalleck patch for refactoring server actions tests
17:40:31 dwalleck: OK, I want to understand better your overall plan for these patches.
17:40:42 dwalleck: starting with the impetus behind the patches
17:42:38 dwalleck: more specifically, I'm curious about this in the commit msg: "next patch will break the single test method into each test class into distinct test methods"
17:42:55 Right...
17:43:04 And this is where I wish I had a whiteboard
17:43:09 heh
17:43:16 just do your best
17:43:17 dwalleck: etherpads work pretty well :)
17:43:36 Let me go for the direct answer first and try to hit the whys along the way
17:43:59 Good call! There's an OpenStack etherpad, right? Anyone have a link so we could pop one open?
17:44:44 http://etherpad.openstack.org
17:44:53 dwalleck: http://etherpad.openstack.org/RefactorServerActionsTest
17:45:29 Okay, so this is test_server_actions right now....
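[Editor's sketch] Relating to the 17:37:40 remark about reporting why a server went into error status: a status-polling helper can attach the fault text from the GET response to the exception it raises. This is a sketch under assumptions, not Tempest's implementation. The names `wait_for_server_status`, `BuildErrorException` and `TimeoutException`, the `get_server` callable, and the `fault`/`message` keys are all assumed for illustration.

```python
import time


class BuildErrorException(Exception):
    """Raised when a server enters ERROR while waiting for another status."""


class TimeoutException(Exception):
    """Raised when the expected status is not reached in time."""


def wait_for_server_status(get_server, server_id, status,
                           timeout=60, interval=1, sleep=time.sleep):
    """Poll get_server(server_id) until the server reaches `status`.

    `get_server` is any callable returning a dict with a "status" key and,
    for failed servers, an optional "fault" dict (assumed response shape).
    """
    deadline = time.monotonic() + timeout
    while True:
        server = get_server(server_id)
        if server["status"] == status:
            return server
        if server["status"] == "ERROR":
            # Carry the reported reason into the failure message.
            reason = server.get("fault", {}).get("message", "no fault reported")
            raise BuildErrorException(
                "Server %s entered ERROR status: %s" % (server_id, reason))
        if time.monotonic() >= deadline:
            raise TimeoutException(
                "Server %s did not reach %s within %s seconds"
                % (server_id, status, timeout))
        sleep(interval)
```

With the reason in the exception text, a flaky failure like the ones in bug 1079687 would at least show what the API reported, rather than only that the server never became ACTIVE.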
17:46:09 haha...it's doing well
17:46:40 so there's a few of the test methods, along with the assertions in one test
17:46:50 dwalleck: are you referring to the existing smoke test (test_basic_server_ops)? or something else that currently is not a smoke test -- /tempest/tests/compute/test_server_actions.py?
17:47:23 afazekas: glance boto patch merged...
17:47:38 The problem I'm trying to solve is that there are many differing assertions in one test. Let me give you a larger scale, what I'd like to get in Tempest problem...
17:49:03 So if one small part of test_resize_server_confirm fails, I don't want the whole test to fail
17:49:28 I want the specific part that is invalid to be documented, but let the reports show that everything else is fine
17:49:31 dwalleck: ah, I see... so my next question would be, why not use the smoke.SmokeTest base classes for these, since the test methods for these test cases will be order-dependent?
17:50:17 jaypipes: Actually in this case they wouldn't be. Let me draw...
17:50:22 dwalleck: it seems you just want more detailed test methods, in a specific order.
17:51:47 not necessarily in a specific order, but more detailed, yes
17:52:43 I see your point, this is something Sam and my guys have gone back and forth about....the point he made to me is not that I was trying to test the change password request itself, but the results
17:53:06 So the fixture prepares the scenario and test verifies the results
17:54:24 dwalleck: and what happens when something errors in preparing the scenario (building the fixture)?
17:54:25 jaypipes: And you're right, you can do that with ordered tests, which once we ditch nose and move to plain unit test you can do by overloading the load_tests method of the class
17:54:49 So this is one way (without having that) that my guys devised a way to work around it
17:55:51 jaypipes: The fixture fails, so none of the tests for it run.
This makes sense because if the server failed to create, that's something different than the password failing to change
17:57:14 So, back out of the solution I proposed. The problem I'm trying to solve is precise failure of tests, which makes working with results easier
17:57:31 dwalleck: ok, I'm getting you... I guess it's difficult to see without the code, but I'm open to the direction you're going.
17:57:58 How about this. Lets hold off, and let me get a full, real example for folks to look at
17:58:16 seems the change is too large if it's only for get more detailed test results. <-- just my opinion...
17:58:32 We can discuss on the mailing list and get into more details
17:58:38 dwalleck: yeah, how about this:
17:58:55 dwalleck: focus on a complete example for just *one* of the actions (change_password for example)
17:59:04 dwalleck: and complete it out
17:59:14 can do
17:59:14 dwalleck: then we can see the whole picture for a full scenario
17:59:18 dwalleck: kk.
17:59:39 I'll get something out before the weekend
18:00:04 dwalleck: bottom line, I definitely support the goal of getting more detailed results, decoupling results processing from fixture setup, and parallelizing stuff... just need an easier-to-consume-and-review patch :)
18:00:22 alright, we're at one hour right now...
18:00:24 jaypipes: I think we are out of time, but we should discuss the blueprints next week.
18:00:35 davidkranz: kk, agreed.
18:00:41 sounds good
18:00:51 davidkranz: I think we made some good progress today, though... I will send out a status report to the ML
18:01:01 davidkranz: sorry for being absent the last few weeks...
18:01:23 jaypipes: NP. It happens to all of us :)
18:01:39 alright folks, see you all next week!
18:01:42 #endmeeting
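[Editor's sketch] The structure dwalleck describes in the last topic ("the fixture prepares the scenario and test verifies the results") can be illustrated with plain unittest. This is not the patch under review, which the log does not show. `FakeServer` and `ChangePasswordTest` are hypothetical names, and the server is an in-memory fake, so only the shape of the test class is meaningful here.

```python
import unittest


class FakeServer:
    """Hypothetical stand-in for a server handled through a compute API."""

    def __init__(self):
        self.status = "ACTIVE"
        self.admin_pass = "old-pass"
        self.response_code = None

    def change_password(self, new_pass):
        self.admin_pass = new_pass
        self.response_code = 202


class ChangePasswordTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Fixture: build the scenario and perform the action exactly once.
        cls.server = FakeServer()
        cls.server.change_password("new-pass")

    # Each method checks one result, so one failing check is reported
    # on its own while the other checks still run and pass.
    def test_response_code(self):
        self.assertEqual(202, self.server.response_code)

    def test_server_still_active(self):
        self.assertEqual("ACTIVE", self.server.status)

    def test_password_updated(self):
        self.assertEqual("new-pass", self.server.admin_pass)


def run(case):
    """Run one TestCase class and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The three checks do not depend on each other's order, which matches the "not necessarily in a specific order" point. If `setUpClass` raises, unittest records a single error for the class and runs none of its test methods, which is the behaviour described at 17:55:51 for a failed fixture.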