17:00:29 <jaypipes> #startmeeting qa
17:00:30 <openstack> Meeting started Thu Dec  6 17:00:29 2012 UTC.  The chair is jaypipes. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:33 <openstack> The meeting name has been set to 'qa'
17:00:59 <sdague> <- here
17:01:06 <davidkranz> davidkranz: present
17:01:06 <Ravikumar_hp> hi
17:01:13 <mtreinish> I'm here
17:01:18 <chunwang> Chun Wang (April) is here.
17:01:20 * afazekas here
17:01:35 * dwalleck is sort of awake
17:01:55 <jaypipes> chunwang: hi! :) welcome!
17:02:05 * sdague hands dwalleck virtual coffee
17:02:17 <chunwang> jaypipes: thanks :)
17:02:17 * jaypipes grabs coffee from sdague
17:02:18 <davidkranz> Any one else here for the meeting?
17:02:32 <jaypipes> lol
17:02:35 <sdague> heh
17:02:49 <donaldngo> here as well
17:02:58 <jaypipes> donaldngo: hi there
17:03:05 <donaldngo> hello
17:03:06 <jaypipes> OK, welcome back davidkranz
17:03:16 <jaypipes> #topic Open reviews
17:03:23 <jaypipes> #link https://review.openstack.org/#/q/status:open+project:openstack/tempest,n,z
17:03:39 <jaypipes> Shall we discuss from bottom to top?
17:03:49 <Ravikumar_hp> sure
17:03:53 <jaypipes> First one (not WIP): https://review.openstack.org/#/c/17063/
17:03:59 <jaypipes> mordred: ping
17:04:07 <jaypipes> and fungi ^^
17:04:34 <jaypipes> You guys want to comment on that review? afazekas was able to do some setup.py factoring that just went into tempest this morning.
17:04:57 <jaypipes> I think you just need to revisit that patch and see if a) it's still relevant and b) if so, if it needs a few changes
17:05:05 <jaypipes> mordred, fungi: ^^
17:05:12 <fungi> looking
17:05:42 <jaypipes> afazekas, sdague: Next review is the boto versioning one: https://review.openstack.org/#/c/17467/
17:06:02 <jaypipes> afazekas: any status on that one regarding updates to glance and nova for updated boto?
17:06:10 <afazekas> https://review.openstack.org/#/c/17256/
17:06:18 <afazekas> nova done, glance missing
17:06:22 <jaypipes> kk
17:06:38 <sdague> jaypipes: yeh, my only comments remain. I don't think we want commented out lines of code in the tree
17:07:00 <jaypipes> afazekas: +2'd that glance review... should be merged shortly.
17:07:01 <sdague> if someone else wants to override me, I'm cool with that :)
17:07:39 <afazekas> sdague: A workaround must look like a workaround, i.e. like something that is wrong
17:07:50 <jaypipes> sdague: I agree with you, but I think afazekas can abandon once 17256 gets merged.
17:07:55 <jaypipes> afazekas: correct?
17:08:07 <afazekas> jaypipes: yes
17:08:19 <sdague> ok, so not really an issue then?
17:08:39 <jaypipes> ok, cool, then we'll just wait to see if 17256 gets through the gate, and then afazekas can mark that tempest patch abandoned.
17:08:43 <jaypipes> alright, next one!
17:08:54 <jaypipes> https://review.openstack.org/#/c/17464/
17:08:58 <jaypipes> is Rohan here?
17:09:11 <jaypipes> I believe he's in India, so probably not :)
17:09:21 <jaypipes> anyone from NTTDATA here?
17:09:32 <mordred> jaypipes: aroo?
17:09:49 <davidkranz> jaypipes: The comment says he is waiting for a cinder bug fix. But the link in the review is broken.
17:09:53 <jaypipes> looks like Rohan is waiting on a Cinder bug...
17:09:57 <jaypipes> davidkranz: yeah.
17:10:00 <sdague> davidkranz: remove the .
17:10:14 <sdague> gerrit is detecting the link incorrectly
17:10:41 <davidkranz> sdague: Heh. So that one is just in process.
17:10:44 <sdague> updated the review
17:10:57 <sdague> the bug isn't actually being worked as far as I can see
17:11:02 <jaypipes> afazekas, sdague: I believe https://review.openstack.org/#/c/15972/ should be abandoned now that afazekas's work on the devstack lib/tempest is done.
17:11:58 <afazekas> jaypipes: yes
17:12:01 <sdague> cool
17:12:17 <sdague> afazekas: that lib/tempest work was quite cool btw, cleans stuff up a lot
17:12:19 <jaypipes> afazekas: ok, cool. if you could ping jaroslav to abandon at his convenience, that would be great
17:12:22 <jaypipes> sdague: ++
17:12:29 <jaypipes> very cool indeed.
17:13:03 <jaypipes> OK, I believe all the remaining merge requests are in the process of working through the gate.
17:13:23 <afazekas> jaypipes: done
17:13:27 <jaypipes> except for dwalleck's server actions one, which I conveniently skipped over (we'll discuss later)
17:13:40 <chunwang> may I know what lib/tempest is doing? which blue print is it for?
17:13:50 <jaypipes> chunwang: sure, let me explain
17:13:55 <sdague> chunwang: it's a cleanup on devstack
17:14:07 <chunwang> ok
17:14:17 <jaypipes> chunwang: so prior to afazekas's work, we had an /etc/tempest.conf.sample and a /etc/tempest.conf.tpl
17:15:01 <jaypipes> chunwang: his work gets rid of the redundant/duplicate tempest.conf.tpl file and uses the ini_set bash library routine in devstack to construct the tempest configuration file used in the continuous integration gate tests
17:15:34 <jaypipes> chunwang: afazekas's patch cleaned up a bunch of duplicate code and made the generation of tempest's configuration file match how the generation for other OpenStack projects is done in devstack
17:15:52 <jaypipes> chunwang: it didn't change the configuration file options itself, just the way it was generated.
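A rough Python analogue of the approach jaypipes describes: set options one at a time in an ini-style file instead of maintaining a second template copy. This is only a sketch. devstack's actual routine is a bash function, and the helper name `ini_set`, the section names, and the option names below are hypothetical placeholders rather than real tempest settings.

```python
import configparser
import io


def ini_set(config, section, option, value):
    """Set option=value in section, creating the section if it is missing."""
    if not config.has_section(section):
        config.add_section(section)
    config.set(section, option, str(value))


def render(config):
    """Return the ini text that would be written to the config file."""
    buf = io.StringIO()
    config.write(buf)
    return buf.getvalue()


# Hypothetical sections and options, for illustration only.
config = configparser.ConfigParser()
ini_set(config, "example_identity", "example_host", "127.0.0.1")
ini_set(config, "example_compute", "example_timeout", 400)
# Setting the same option again overwrites it instead of duplicating it.
ini_set(config, "example_compute", "example_timeout", 600)
text = render(config)
```

Because each option is set by name, the set of options and the way the file is generated stay independent, which matches the point that the options themselves did not change.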
17:16:05 <chunwang> thanks, got it.
17:16:08 <jhenner> abandoned
17:16:15 <jaypipes> jhenner: ah, cheers!
17:16:21 <jhenner> with a pleasure
17:16:26 <jaypipes> :)
17:16:35 <sdague> tempest in devstack is now getting clean enough that we might want to enable it by default
17:17:02 <jaypipes> chunwang: sticking with you, I was going to change topic to discuss the tenant isolation issues we were discussing on https://bugs.launchpad.net/tempest/+bug/1087298. Is that OK with you?
17:17:03 <uvirtbot> Launchpad bug 1087298 in tempest "Request rate too high during test_security_group_rules running" [Undecided,New]
17:17:23 <chunwang> sure. we can discuss it here...
17:17:26 <jaypipes> #topic How tenant isolation works (and possible bugs/inconsistencies)
17:17:39 <jaypipes> if everyone could give a quick read through the above bug, that would be great
17:18:01 * jaypipes executes sleep 60
17:18:34 <afazekas> https://github.com/openstack-ci/devstack-gate/blob/master/devstack-vm-gate.sh#L162 is it needed ?
17:18:49 <chunwang> yes, let me introduce the background. The issues were found in our deployed environments, on both the E and F versions of OpenStack.
17:19:42 <chunwang> many instances/image snapshots/security groups are seen in the dashboard after the tests run. Seems the teardown function is not working properly.
17:20:27 <sdague> so there is also a real bug in that code
17:20:29 <jaypipes> chunwang: do you have details on which tests? the tenant will be named for the test...
17:21:43 <chunwang> err..actually many cases are using rand_name('tenant') as tenant name in the script
17:21:54 <jaypipes> afazekas: AFAICT, yes, since the devstack install of tempest will now create the config file.
17:22:12 <chunwang> I believe I mentioned the script name in the bug. If it's really necessary, I can reproduce the issue and figure it out
17:22:33 <jaypipes> chunwang: any test that derives from the base test classes should not be doing that... so it is a bug if they are not doing so.
17:23:15 <chunwang> ok, then there is another bug...
17:24:36 <jaypipes> chunwang: I think we'll need to do some deeper investigation into this and figure out if any tests are leaving side effects if tenant_isolation is enabled (none of them should, and it is a bug if they do...)
17:24:43 <chunwang> so the error may happen when the test instance/image snapshot is not created in the base classes?
17:25:13 <jaypipes> chunwang: you mean if/when a fixture gets created in the setUpClass() methods, I presume?
17:25:14 <sdague> chunwang: that exception you see in the logs is going to mean the final stages of cleanup don't happen
17:25:23 <chunwang> yes
17:25:54 <chunwang> I understand the teardown won't happen when a case gets errors
17:26:12 <chunwang> but the tenant-not-deleted issue I saw is when the cases all passed
17:26:25 <jaypipes> chunwang: right, and that's the one I'm concerned about...
17:27:21 <jaypipes> chunwang: OK, I will do a deeper investigation this afternoon on this issue, ok?
17:27:36 <jaypipes> chunwang: and I'll update the bug report with my findings.
17:27:41 <chunwang> sure. if any help or information needed, I will provide
17:28:19 <jaypipes> cool.
17:28:55 <jaypipes> OK, so, shall we move on to discuss dwalleck's server actions patch? or is there any other topic folks would like to discuss before that?
17:29:10 <afazekas> jaypipes: this is the second time ./tools/configure_tempest.sh gets called; it is first called by https://github.com/openstack-ci/devstack-gate/blob/master/devstack-vm-gate.sh#L150. You can see the double config in the log: search for example for the string 'source /opt/stack/devstack/lib/database' here: http://logs.openstack.org/17256/3/check/gate-tempest-devstack-vm/21128/console.html.gz
17:29:27 <davidkranz> jaypipes: We should discuss how to get the full (or at least almost full) gate on all projects.
17:29:44 <davidkranz> jaypipes: Another nova bug slipped through yesterday.
17:30:04 <davidkranz> The biggest issue, other than some slow tests, is https://bugs.launchpad.net/nova/+bug/1079687
17:30:05 <uvirtbot> Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New]
17:30:14 <jaypipes> afazekas: unless I'm mistaken, I believe you should be able to remove all calls to tools/configure_tempest.sh after your devstack patch lands
17:30:20 <davidkranz> which fails the hourly tempest build several times per day.
17:30:33 <jaypipes> #topic Blockers for enabling the full gate on core projects
17:30:45 <davidkranz> jaypipes: There has not been any progress on this bug as far as I can see.
17:30:59 <jaypipes> davidkranz: one sec, reading into bug
17:31:00 <dwalleck> davidkranz: I don't know if I'd say those are Tempest issues. In my experience, servers going into error status have always been Nova issues
17:31:18 <davidkranz> dwalleck: Yes, the bug is a nova bug ticket.
17:31:26 <dwalleck> ahh, sorry
17:31:31 * dwalleck is not fully awake yet
17:31:38 <sdague> davidkranz: is that related to any of the other nova bugs?
17:31:46 <davidkranz> sdague: I don't think so.
17:31:47 <sdague> I thought the blocker was a different issue
17:31:53 <jaypipes> davidkranz: strange that all those failures are the XML server tests, not the JSON ones...
17:32:21 <davidkranz> jaypipes: I am not sure that is true. The failures are flaky.
17:32:33 <davidkranz> jaypipes: They happen in various places.
17:32:55 <jaypipes> davidkranz: oh, ok...
17:33:17 <davidkranz> jaypipes: You can see a lot of stuff at https://jenkins.openstack.org/job/periodic-tempest-devstack-vm-check-hourly/? if you want.
17:33:46 <davidkranz> IMO, this should be the highest priority for some one on the nova team.
17:33:50 <jaypipes> davidkranz: k, will look further into it. do we have the rabbit logs somewhere?
17:33:59 <jaypipes> davidkranz: agreed about priority.
17:34:36 <davidkranz> jaypipes: Yes, all the logs are at http://logs.openstack.org/periodic/periodic-tempest-devstack-vm-check-hourly/
17:34:38 <jaypipes> vishy_zz: can we get some prioritization on https://bugs.launchpad.net/nova/+bug/1079687 ?
17:34:39 <uvirtbot> Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New]
17:34:56 <davidkranz> jaypipes: Whether there is enough info in them is a different issue.
17:35:08 <jaypipes> perhaps russellb would be able to look into the RPC issues and help us investigate
17:35:09 <davidkranz> jaypipes: More instrumentation may need to be added.
17:35:32 <sdague> ok, I'll talk to mtreinish about it, he was chasing what I thought was the blocking bug for tempest gate :)
17:35:40 <jaypipes> davidkranz: sure, agreed. I'm hoping russellb or vishy_zz can tell us what info they would need to diagnose...
17:35:41 <sdague> but i'll ask him to dive in on it
17:35:44 <davidkranz> sdague: Which did you think it was?
17:36:31 <davidkranz> sdague: The log error bugs are desirable for better test monitoring but things can proceed without that.
17:36:35 <sdague> davidkranz: I don't know at the moment
17:36:45 <mtreinish> sdague, davidkranz: I was having trouble reproducing it. But, I got sidetracked on coverage extension stuff for a while so I haven't put much time into it.
17:37:22 <mtreinish> vishy_zz: had a theory that it was related to the instance getting deleted during bring up
17:37:26 <jaypipes> actually, eglynn is also a good resource for digging into these tricky issues... Eoghan, feel like helping out? :)
17:37:31 <davidkranz> mtreinish: It only fails maybe 10-15% of the time.
17:37:40 <dwalleck> jaypipes: I was going to do this but forgot. I should be able to report the reason the server went into error status as well. The reason is in the GET response, so it should be easy to tag on to the fault. I have it partially done, just having some issues with the text population
17:37:46 <sdague> davidkranz: the issue is making nova the right amount of slow for it to show up
17:37:54 <sdague> it's definitely a race
17:38:35 <eglynn> jaypipes: yep, I can look later on or tmrw (on a train ATM, dodgy wifi coverage ...)
17:38:40 <davidkranz> sdague: OK. I just wanted to make sure some one was looking at it with nova smarts.
17:38:46 <jaypipes> eglynn: :) thx man!
17:39:27 <davidkranz> jaypipes: I think we can move on then.
17:39:27 <jaypipes> davidkranz: I will write a mailing list post about the above bug and try to get prioritization.
17:39:40 <davidkranz> jaypipes: Great.
17:39:49 <jaypipes> #action jaypipes to write ML post about bug 1079687
17:39:51 <sdague> we can take that into the nova meeting this afternoon as well
17:39:51 <uvirtbot> Launchpad bug 1079687 in nova "Flaky failures of instances to reach BUILD and ACTIVE states" [Undecided,New] https://launchpad.net/bugs/1079687
17:39:58 <jaypipes> kk
17:40:13 <jaypipes> #topic dwalleck patch for refactoring server actions tests
17:40:31 <jaypipes> dwalleck: OK, I want to understand better your overall plan for these patches.
17:40:42 <jaypipes> dwalleck: starting with the impetus behind the patches
17:42:38 <jaypipes> dwalleck: more specifically, I'm curious about this in the commit msg: "next patch will break the singe test method into each test class into distinct test methods"
17:42:55 <dwalleck> Right...
17:43:04 <dwalleck> And this is where I wish I had a whiteboard
17:43:09 <jaypipes> heh
17:43:16 <jaypipes> just do your best
17:43:17 <sdague> dwalleck: etherpads work pretty well :)
17:43:36 <dwalleck> Let me go for the direct answer first and try to hit the whys along the way
17:43:59 <dwalleck> Good call! There's an OpenStack etherpad, right? Anyone have a link so we could pop one open?
17:44:44 <sdague> http://etherpad.openstack.org
17:44:53 <jaypipes> dwalleck: http://etherpad.openstack.org/RefactorServerActionsTest
17:45:29 <dwalleck> Okay, so this is test_server_actions right now....
17:46:09 <chunwang> haha...it's doing well
17:46:40 <dwalleck> so there's a few of the test methods, along with the assertions in one test
17:46:50 <jaypipes> dwalleck: are you referring to the existing smoke test (test_basic_server_ops)? or something else that currently is not a smoke test -- /tempest/tests/compute/test_server_actions.py?
17:47:23 <jaypipes> afazekas: glance boto patch merged...
17:47:38 <dwalleck> The problem I'm trying to solve is that there are many differing assertions in one test. Let me give you a larger scale, what I'd like to get in Tempest problem...
17:49:03 <dwalleck> So if one small part of test_resize_server_confirm fails, I don't want the whole test to fail
17:49:28 <dwalleck> I want the specific part that is invalid to be documented, but let the reports show that everything else is fine
17:49:31 <jaypipes> dwalleck: ah, I see... so my next question would be, why not use the smoke.SmokeTest base classes for these, since the test methods for these test cases will be order-dependent?
17:50:17 <dwalleck> jaypipes: Actually in this case they wouldn't be. Let me draw...
17:50:22 <jaypipes> dwalleck: it seems you just want more detailed test methods, in a specific order.
17:51:47 <dwalleck> not necessarily in a specific order, but more detailed, yes
17:52:43 <dwalleck> I see your point, this is something Sam and my guys have gone back and forth about....the point he made to me is not that I was trying to test the change password request itself, but the results
17:53:06 <dwalleck> So the fixture prepares the scenario and test verifies the results
17:54:24 <jaypipes> dwalleck: and what happens when something errors in preparing the scenario (building the fixture)?
17:54:25 <dwalleck> jaypipes: And you're right, you can do that with ordered tests, which once we ditch nose and move to plain unit test you can do by overloading the load_tests method of the class
17:54:49 <dwalleck> So this is one way (without having that) that my guys devised a way to work around it
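A minimal sketch of the ordered-tests idea dwalleck mentions, using only the standard library. In plain unittest, `load_tests` is a module-level hook rather than a method on the test class, so it is shown that way here. The class name, test names, and `ORDER` list are hypothetical, and the test bodies only record which step ran.

```python
import unittest


class ServerActionSteps(unittest.TestCase):
    """Hypothetical steps; each one only records that it ran."""

    calls = []

    def test_verify_login(self):
        self.calls.append("verify_login")

    def test_change_password(self):
        self.calls.append("change_password")

    def test_create_server(self):
        self.calls.append("create_server")


# Explicit order; the default loader would sort the names alphabetically.
ORDER = ["test_create_server", "test_change_password", "test_verify_login"]


def load_tests(loader, standard_tests, pattern):
    """Module-level hook used by unittest's loader to build the suite."""
    suite = unittest.TestSuite()
    for name in ORDER:
        suite.addTest(ServerActionSteps(name))
    return suite
```

A suite built this way runs the steps in the listed order, but a failure in an early step does not stop the later ones from running.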
17:55:51 <dwalleck> jaypipes: The fixture fails, so none of the tests for it run. This makes sense because if the server failed to create, that's something different than the password failing to change
17:57:14 <dwalleck> So, back out of the solution I proposed. The problem I'm trying to solve is precise failure of tests, which makes working with results easier
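The fixture-versus-results split described above can be sketched with plain unittest, assuming the action runs once in `setUpClass` and each test method checks one result. `FakeServer`, `change_password`, and `make_case` are hypothetical stand-ins and do not reflect tempest's real clients or base classes.

```python
import unittest


class FakeServer:
    """Hypothetical stand-in for a server record from a compute API."""

    def __init__(self):
        self.status = "ACTIVE"
        self.password_changed = False


def change_password(server):
    """Hypothetical action under test; marks the password as changed."""
    server.password_changed = True
    return server


def make_case(build_ok):
    """Build a test class whose fixture succeeds or fails on demand."""

    class ChangePasswordResults(unittest.TestCase):
        @classmethod
        def setUpClass(cls):
            # Fixture: prepare the scenario exactly once for all checks.
            if not build_ok:
                raise RuntimeError("server failed to build")
            cls.server = change_password(FakeServer())

        def test_server_still_active(self):
            self.assertEqual(self.server.status, "ACTIVE")

        def test_password_marked_changed(self):
            self.assertTrue(self.server.password_changed)

    return ChangePasswordResults


def run(case):
    """Run one test class and return the collected TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    result = unittest.TestResult()
    suite.run(result)
    return result


ok = run(make_case(build_ok=True))
broken = run(make_case(build_ok=False))
```

With a working fixture, each assertion is reported as its own test. With a failing fixture, the result holds a single class-level error and none of the result checks run, which keeps a build failure distinct from a password-change failure.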
17:57:31 <jaypipes> dwalleck: ok, I'm getting you... I guess it's difficult to see without the code, but I'm open to the direction you're going.
17:57:58 <dwalleck> How about this. Lets hold off, and let me get a full, real example for folks to look at
17:58:16 <chunwang> seems the change is too large if it's only to get more detailed test results.  <-- just my opinion...
17:58:32 <dwalleck> We can discuss on the mailing list and get into more details
17:58:38 <jaypipes> dwalleck: yeah, how about this:
17:58:55 <jaypipes> dwalleck: focus on a complete example for just *one* of the actions (change_password for example)
17:59:04 <jaypipes> dwalleck: and complete it out
17:59:14 <dwalleck> can do
17:59:14 <jaypipes> dwalleck: then we can see the whole picture for a full scenario
17:59:18 <jaypipes> dwalleck: kk.
17:59:39 <dwalleck> I'll get something out before the weekend
18:00:04 <jaypipes> dwalleck: bottom line, I definitely support the goal of getting more detailed results, decoupling results processing from fixture setup, and parallelizing stuff... just need an easier-to-consume-and-review patch :)
18:00:22 <jaypipes> alright, we're at one hour right now...
18:00:24 <davidkranz> jaypipes: I think we are out of time, but we should discuss the blueprints next week.
18:00:35 <jaypipes> davidkranz: kk, agreed.
18:00:41 <dwalleck> sounds good
18:00:51 <jaypipes> davidkranz: I think we made some good progress today, though... I will send out a status report to the ML
18:01:01 <jaypipes> davidkranz: sorry for being absent the last few weeks...
18:01:23 <davidkranz> jaypipes: NP. It happens to all of us :)
18:01:39 <jaypipes> alright folks, see you all next week!
18:01:42 <jaypipes> #endmeeting