#openstack-meeting log

17:00:47 <jaypipes> #startmeeting qa
17:00:48 <openstack> Meeting started Thu Dec 13 17:00:47 2012 UTC.  The chair is jaypipes. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:49 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:51 <openstack> The meeting name has been set to 'qa'
17:01:30 <davidkranz> Hi Jay.
17:01:58 <jaypipes> #topic Proposal to add Attila (afazekas) to QA core
17:02:20 <jaypipes> I'd like to recognize afazekas's work in the past couple months on Tempest
17:02:31 <jaypipes> and am proposing his inclusion into qa-core
17:02:51 <jaypipes> I figure it's good enough to do an informal vote here
17:03:05 <davidkranz> jaypipes: +1
17:03:16 <dwalleck> sounds good +1
17:03:23 <jaypipes> If you agree with the proposal, please say so. If you have reservations or would like to hold off, please also say so
17:03:26 <Ravikumar_hp> +1 yes. Afazekas provides valuable review feedback
17:04:12 * jaypipes has been impressed with afazekas's ability to decipher the relationships between tempest, devstack and devstack-gate, which can be tricky!
17:05:08 <jaypipes> OK, well, I'll take that as a tentative agreement on Attila. I'll send a note to the QA list this afternoon asking for more feedback, and if I receive none by end of today, I will add afazekas to qa-core
17:05:22 <jaypipes> alrighty, next topic...
17:05:29 <jaypipes> #topic Outstanding reviews
17:05:37 <jaypipes> #link https://review.openstack.org/#q,status:open+project:openstack/tempest,n,z
17:05:43 <jaypipes> We will work bottom up.
17:06:04 <jaypipes> fungi, mordred: https://review.openstack.org/#/c/17063/
17:06:27 <jaypipes> fungi, mordred: any progress on this or thoughts about proceeding to getting Tempest installable via normal Python means?
17:06:59 <jaypipes> fungi, mordred: last I checked, there were issues because tempest had an /tempest/openstack.py module that name-interfered with tempest.openstack.common inclusion?
17:06:59 <fungi> i'm not entirely sure what issues mordred ran into trying to add that
17:07:15 <fungi> ahh, right, namespace problems
17:07:19 <mordred> jaypipes: that is correct - and I have not yet had time to sort that out
17:07:50 <jaypipes> mordred: OK, well it has a negative review and will be auto-expired in the next couple days I believe...
17:08:05 <jaypipes> mordred: if you want to shelve, could you mark it Work In Progress?
17:08:08 <afazekas> A similar change was added to tempest; I did not see the concurrent attempt
17:08:10 <mordred> jaypipes: yah
17:08:21 <jaypipes> mordred: also, a blueprint or bug would be great ;)
17:08:45 <afazekas> the issue with that patch is that we have an openstack.py and an openstack folder in the same location
17:08:46 <jaypipes> afazekas: I think the review above is about pulling in the Oslo (openstack-common) packaging help
17:08:51 <jaypipes> afazekas: right.
17:09:15 <jaypipes> afazekas: so it will take some effort to rename the openstack.py module, since it's used virtually everywhere ;)
17:09:44 <jaypipes> afazekas: if you're interested, mordred and fungi can provide some insight into their overall direction on that work
17:10:08 <mordred> yes.
17:10:35 <jaypipes> ok, anything more to add on that review?
17:11:07 <afazekas> I am looking for the merged setup.py change
17:11:14 <jaypipes> #action mordred to mark https://review.openstack.org/#/c/17063/ Work in Progress
17:11:38 <jaypipes> #action afazekas to work with mordred and fungi on setup.py normalization with openstack-common
17:12:02 <jaypipes> alright, next one...
17:12:04 <jaypipes> #link https://review.openstack.org/#/c/17829/
17:12:09 <jaypipes> Ravikumar_hp: you're up!
17:12:14 <Ravikumar_hp> jaypipes: object expiry test case - resubmitted incorporating review feedback. I hope it gets merged this week.
17:12:18 <davidkranz> jaypipes: I think this just needs a review.
17:12:19 <jaypipes> this is the swift object expiry test
17:12:30 <Ravikumar_hp> I will follow up with sdague to get it reviewed
17:12:37 <Ravikumar_hp> also afazekas
17:12:52 <jaypipes> Ravikumar_hp: good. I will give it a stab this afternoon, as well
17:13:28 <jaypipes> Ravikumar_hp: it looked like the initial concerns from sdague were addressed with a shorter sleep(5) call?
17:13:36 <Ravikumar_hp> yes
17:13:51 <jaypipes> Ravikumar_hp: Other than that, the test is essentially skipped until bug 1069849 is resolved, right?
17:13:52 <uvirtbot> Launchpad bug 1069849 in swift "Containers show expired objects" [Undecided,In progress] https://launchpad.net/bugs/1069849
17:14:01 <Ravikumar_hp> yes
17:14:16 <Ravikumar_hp> also it is not a gated smoke test
17:14:21 <jaypipes> right, noted
17:15:00 <jaypipes> Ravikumar_hp: anything more on that one?
17:15:09 <Ravikumar_hp> no . thanks
17:15:12 <jaypipes> np :)
17:15:17 <jaypipes> #link https://review.openstack.org/#/c/18035/
17:15:38 <jaypipes> mtreinish: ping
17:15:42 <davidkranz> jaypipes: This failed with our friendly flaky server failure. Just rechecked.
17:15:48 <jaypipes> anyone know Jaroslav Henner's IRC nick?
17:16:19 <jaypipes> davidkranz: which one? https://review.openstack.org/#/c/18035/ ?
17:16:28 <davidkranz> jaypipes: Yes.
17:16:55 <jaypipes> davidkranz: hmm, that's odd... it's the glanceclient test.
17:17:27 <jaypipes> maurosr: ping
17:17:37 <jaypipes> maurosr: you had some concerns on https://review.openstack.org/#/c/18035/
17:17:53 <jaypipes> maurosr: and I wanted to make sure you had your questions answered.
17:18:11 <jaypipes> maurosr: what I believe Jaroslav is doing is correct, though a bit obtuse
17:18:50 <jaypipes> maurosr: the set() <= set() operation is detecting whether the IDs of the added images are different from the set of "current images" from the image list call
17:19:26 <jaypipes> dwalleck: welcome back ;) gotta love VPNs.
17:19:46 <dwalleck_> yup
17:19:52 <jaypipes> OK, well it looks like folks aren't around that need to talk about https://review.openstack.org/#/c/18035/... so we'll move on.
17:20:06 <maurosr> jaypipes: right.. got it now, should have tested it before..
17:20:19 <jaypipes> maurosr: no worries!
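Editorial note: in Python, `<=` between two sets is the subset test, so a comparison of the kind discussed above passes only when every added image ID also appears in the image list. A minimal sketch of that check follows; the function name and the IDs are hypothetical and are not taken from the review.

```python
def created_images_are_listed(created_ids, listed_ids):
    """Return True when every created image ID appears in the listing."""
    # set(a) <= set(b) is True when a is a subset of b (or equal to it).
    return set(created_ids) <= set(listed_ids)


# Hypothetical IDs, for illustration only.
created = ["img-1", "img-2"]
listed = ["img-0", "img-1", "img-2"]
all_listed = created_images_are_listed(created, listed)
```

Extra images in the listing do not fail the check, which matters when other tests have left images behind.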
17:20:30 <jaypipes> #link https://review.openstack.org/#/c/18030/
17:20:55 <jaypipes> I tend to agree with mtreinish about https://review.openstack.org/#/c/18030/. The XML to JSON (and JSON to XML) stuff there is very fragile
17:21:09 <jaypipes> and I'm not sure that the proposed solution really solves the bug properly.
17:21:36 <jaypipes> davidkranz, dwalleck, afazekas, sdague: if you could give a review on https://review.openstack.org/#/c/18030/, that would be great. It's an XML output vs. JSON output mismatch issue.
17:22:07 <jaypipes> mnewby: around?
17:22:12 <afazekas> http://docs.openstack.org/compute/api/v1.1  contains API version
17:23:14 <afazekas> probably this part should be different for different compute API versions
17:23:42 <jaypipes> afazekas: sorry, I'm not following you...
17:23:58 <davidkranz> afazekas: That link does not exist
17:24:34 <afazekas> bit it is in the xml
17:24:57 <afazekas> and we might have multiple API versions to support
17:25:14 <afazekas> s/bit/but/
17:25:25 <jaypipes> Oh, I see what you're saying....
17:25:58 <afazekas> looks like the [compute] section does not have an api_version option
17:26:28 <jaypipes> afazekas: I think, though, in this case of the review, it's a matter of the XML translation in the volume_extensions rest client not being correct
17:28:23 <jaypipes> OK, well, let's move on...
17:28:45 <jaypipes> The remaining reviews seem to be just waiting on a successful tempest gate run, so we can move on to other topics.
17:28:50 <jaypipes> #topic Open Discussion
17:29:05 <jaypipes> Please feel free to bring up issues now
17:29:28 <Ravikumar_hp> jaypipes: any ideas on parallel execution or test tools?
17:29:36 <jaypipes> dwalleck_ and others: Has anyone been able to make progress on parallelization?
17:29:43 <jaypipes> Ravikumar_hp: lol, you beat me to it :)
17:30:01 <dwalleck_> jaypipes: yes, I've had success with a few different options
17:30:24 <jaypipes> dwalleck_: please do tell! :)
17:30:26 <davidkranz> dwalleck_: Enough to make a recommendation?
17:30:39 <dwalleck_> The first step for all of them though requires ripping out all of our nose tags/imports
17:31:34 <dwalleck_> The second is refactoring some of our tests to be more efficient when run in parallel (which goes back to the patch I unsubmitted/need to submit again)
17:32:16 <dwalleck_> Even something as simple as writing a short python script to gather the tests and spin them up in threads/processes/greenthreads works
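Editorial note: a minimal sketch of the "short python script" idea mentioned above, assuming plain `unittest` test classes with no nose-specific tags. The two example classes and both helper functions are hypothetical stand-ins, not Tempest code, and threads are only one of the options named.

```python
import io
import unittest
from concurrent.futures import ThreadPoolExecutor


class ListServersExample(unittest.TestCase):
    def test_list(self):
        self.assertEqual(1 + 1, 2)


class ServerActionsExample(unittest.TestCase):
    def test_reboot(self):
        self.assertTrue(True)


def run_class(test_class):
    """Run one test class and summarize the outcome."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_class)
    # Send runner output to a private buffer so threads do not interleave text.
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return test_class.__name__, result.testsRun, result.wasSuccessful()


def run_in_parallel(test_classes, workers=4):
    """Run each test class in its own worker thread; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_class, test_classes))


results = run_in_parallel([ListServersExample, ServerActionsExample])
```

Because each class runs as a unit, total wall-clock time is still bounded below by the slowest class.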
17:32:48 <jaypipes> dwalleck_: the patch that breaks out some of the tests into smaller tests?
17:32:53 <jaypipes> dwalleck_: the server actions, etc?
17:33:20 <dwalleck_> The problem we're going to run into isn't resource constraints on the test server side, it's the fact that even though you run everything in parallel, the tests will still take as long to run as the longest running test class/module
17:33:22 <dwalleck_> yes
17:33:37 <dwalleck_> So without that, the benefits are there, but minor
17:34:33 <jaypipes> dwalleck_: well, just getting to the state where tempest only takes as long as its longest test would be a huge accomplishment!
17:34:39 <dwalleck_> The only way to get around it would be to "fix" the type of parallelization nose allows, but there's a good reason no one else does it: it's tricky to implement right
17:34:41 <davidkranz> dwalleck_: Why are they minor? If the longest test takes 1 minute, that's great.
17:34:54 <jaypipes> davidkranz: heck, if it takes 5 minutes, great ;)
17:35:16 <dwalleck_> davidkranz: The longest running test class. In this case, that's test server actions, which alone takes well over 10 min
17:35:28 <davidkranz> I think the big lossage now is failing to overlap actual test cases with waiting for servers.
17:35:33 <dwalleck_> And got much longer when I added admin actions
17:35:43 <davidkranz> dwalleck_: So if we just break up that test we should be pretty good.
17:36:02 <afazekas> Unfortunately we are limited in the number of VMs we can run on a single machine at the same time
17:36:16 <dwalleck_> Yeah, that will be about as good as it gets without resorting to other tricks such as pre-building VMs for certain tests
17:36:25 <davidkranz> dwalleck_: We just need to make sure we don't sequentially allocate servers in a single test.
17:36:52 <jaypipes> afazekas: well, sure, but we aren't really hitting those issues yet... at least as far as total runtime of tempest goes. We're still a serialized execution :(
17:37:33 <davidkranz> afazekas: If we are parallel, we can just throttle vm creation and make tests wait.
17:37:38 <dwalleck_> davidkranz: You can even do that (I think we do it already in the list servers test). You just don't start waiting till you've created all the servers you want, and then start waiting
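Editorial note: the "create everything first, then wait" pattern described above can be sketched as follows. `FakeComputeClient` is a hypothetical stand-in that reports a server as ACTIVE after a fixed number of polls; it is not a Tempest or novaclient API, and `build_timeout` is an assumed parameter name. `build_interval` mirrors the polling setting discussed in this meeting.

```python
import time


class FakeComputeClient:
    """Hypothetical compute client used only for this illustration."""

    def __init__(self, polls_until_active=2):
        self._polls_left = {}
        self._polls_until_active = polls_until_active

    def create_server(self, name):
        server_id = "id-" + name
        self._polls_left[server_id] = self._polls_until_active
        return server_id

    def get_status(self, server_id):
        if self._polls_left[server_id] > 0:
            self._polls_left[server_id] -= 1
            return "BUILD"
        return "ACTIVE"


def create_then_wait(client, names, build_interval=3, build_timeout=60):
    """Request all servers up front, then poll until every one is ACTIVE."""
    # Issue every create request before waiting so the builds overlap.
    created = {client.create_server(name) for name in names}
    pending = set(created)
    deadline = time.monotonic() + build_timeout
    while pending:
        # Keep only the servers that have not finished building yet.
        pending = {s for s in pending if client.get_status(s) != "ACTIVE"}
        if not pending:
            break
        if time.monotonic() > deadline:
            raise RuntimeError("servers still building: %s" % sorted(pending))
        time.sleep(build_interval)
    return created


client = FakeComputeClient()
server_ids = create_then_wait(client, ["a", "b", "c"], build_interval=0)
```

With this shape the wait costs roughly one build time rather than one build time per server.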
17:37:56 <afazekas> jaypipes: I am speaking about what can happen if we start every  case in parallel.
17:38:03 <jaypipes> afazekas: ah, yes indeed.
17:38:20 <dwalleck_> So if no one minds me making a WIP branch, I can rip all the nose stuff out and show an example of what this could look like
17:38:33 <jaypipes> dwalleck_: go for it.
17:38:43 <dwalleck_> I think we're going to need a lot more discussion, but it gives a starting point
17:39:04 <davidkranz> dwalleck_: Yes. But once we parallelize waiting, it is not a problem any more.
17:39:14 <dwalleck_> sounds good
17:39:16 <davidkranz> dwalleck_: We become limited by the resource limit.
17:39:30 <davidkranz> dwalleck_: Which is the best we can do.
17:39:45 <dwalleck_> davidkranz: The resource limit can be worked around as well. Quotas are very easy to manipulate
17:39:49 <jaypipes> something else that could significantly improve performance is this: Only do a single setup for ListXXX tests, and then execute the XML *and* JSON clients against those fixtures. Right now, we do a setUpClass() creating servers for both XML and JSON when that isn't necessary for list tests
17:39:54 <afazekas> dwalleck_: Would be great to have some csv about how much time is spent in every method
17:40:28 <dwalleck_> afazekas: I had that somewhere (you can get the same thing by using the --with-xunit option with nose)
17:40:34 <davidkranz> afazekas: If you mean every test case you can get XML from a nosetest option
17:41:26 <jaypipes> dwalleck_: unfortunately, the xunit output only includes tests themselves, whereas a large portion of time is spent in setUpClass and tearDownClass, and those are not represented in the timings
17:41:30 <dwalleck_> But the problem is that it doesn't include the time spent in fixtures, so I did some instrumentation to get some stats on how many things we build and how long we spend waiting. Those numbers seem to be the crux of the problem anyway
17:41:37 <dwalleck_> right
17:41:38 <jaypipes> dwalleck_: :) right.
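Editorial note: since the xunit timings leave out class-level fixtures, one way to capture them is to wrap `setUpClass` and `tearDownClass` in a timing decorator. This is a minimal sketch and not the instrumentation dwalleck_ describes; `timed_fixture`, `FIXTURE_SECONDS`, `run_quietly` and the example test class are all hypothetical names.

```python
import functools
import io
import time
import unittest
from collections import defaultdict

# Seconds spent in each class-level fixture, keyed by "Class.method".
FIXTURE_SECONDS = defaultdict(float)


def timed_fixture(func):
    """Record the wall-clock time of a class-level fixture."""
    @functools.wraps(func)
    def wrapper(cls):
        start = time.perf_counter()
        try:
            return func(cls)
        finally:
            key = "%s.%s" % (cls.__name__, func.__name__)
            FIXTURE_SECONDS[key] += time.perf_counter() - start
    return wrapper


class TimedListServersExample(unittest.TestCase):
    @classmethod
    @timed_fixture
    def setUpClass(cls):
        # Stands in for building real servers.
        cls.servers = ["server-1", "server-2"]

    @classmethod
    @timed_fixture
    def tearDownClass(cls):
        cls.servers = []

    def test_list(self):
        self.assertEqual(len(self.servers), 2)


def run_quietly(test_class):
    """Run one test class with the runner output discarded."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_class)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


run_result = run_quietly(TimedListServersExample)
```

After a run, `FIXTURE_SECONDS` can be written out as CSV to show per-class fixture cost next to the per-test xunit timings.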
17:42:09 <davidkranz> dwalleck_: Sounds like you have a good handle on this. That's great.
17:42:09 <jaypipes> dwalleck_: I've found that reducing the build_interval from 10 to 3 speeds things up significantly.
17:42:10 <dwalleck_> And if my devstack environment will play nicely, I can get that. Having some odd issues with the tests hanging on deletion of floating IPs
17:42:34 <dwalleck_> But I'd be glad to share those numbers once I have them
17:42:41 <jaypipes> cool.
17:42:58 <dwalleck_> jaypipes: In the devstack case definitely. Since the servers build so fast, it makes sense to check more often
17:43:05 <davidkranz> Done with this topic?
17:43:41 <davidkranz> I have been thinking about the fuzz testing.
17:44:02 <davidkranz> Is any one actually working on that?
17:44:28 <dwalleck_> matt has been doing some with his team. Not sure about his progress though
17:44:48 <davidkranz> dwalleck_: matt?
17:44:53 <jaypipes> dwalleck_: in our CI cluster, our servers build out in about 6 seconds, on average...
17:45:00 <dwalleck_> I'll poke him to make an appearance or at least send an email out
17:45:10 <jaypipes> dwalleck_: which means setting to 3 usually is a good target
17:45:30 <dwalleck_> davidkranz: matt tesauro, app sec guy who was with my group at the conference
17:45:42 <davidkranz> dwalleck_: Got it.
17:45:57 <jaypipes> dwalleck_: I've found that deleting and waiting for a converged server delete can take as much time as launching...
17:48:03 <jaypipes> anybody have anything more to bring up? davidkranz, I have not seen any response to my ML post about the flaky test failures :( other than sdague's response about the DNS fixes.
17:48:17 <jaypipes> perhaps I should word it differently? or send to different group?
17:48:51 <davidkranz> jaypipes: Not sure. It just seems like we see this as more important than the nova group does.
17:49:10 <davidkranz> jaypipes: We could turn on the gate :)
17:50:04 <davidkranz> jaypipes: I was also thinking of trying the stress tests again.
17:50:26 <davidkranz> jaypipes: As soon as we get the bogus ERRORs out of the logs.
17:51:02 <davidkranz> jaypipes: If the problem is that machines outside of the CI environment are too fast, stress might show the problem better.
17:51:07 <jaypipes> davidkranz: well, let's do this: let's work on the ERROR crap and cleaning up the tempest output (glanceclient logging, etc) so it's easier for nova devs to work with us in diagnosing the race conditions we see sometimes, and then I'll send another post begging for help
17:51:38 <davidkranz> jaypipes: ++  All the ERROR crap is encapsulated in nova bugs I filed.
17:52:11 <afazekas> I wonder how someone will use OpenStack's JSON/XML REST API if they can't read Python (the OpenStack documentation is the source code). For example, a typical Java or C coder does not know Python.
17:52:50 <jaypipes> davidkranz: excellent.
17:53:06 <jaypipes> afazekas: it's a problem, true.
17:53:31 <jaypipes> afazekas: I believe at this point the best solution is to do the following:
17:53:47 <jaypipes> a) Identify *very specific questions* that are unanswered or vague in the docs
17:53:57 <davidkranz> afazekas: A good start would be published docstrings for the python API.
17:54:07 <jaypipes> b) Bring the specific issue to the attention of annegentle and the PTL for the project
17:54:15 <jaypipes> c) File a bug, tagged with doc-impact
17:54:22 <jaypipes> d) Rinse and repeat
17:55:06 <davidkranz> jaypipes: Part of the issue is that in novaclient, for example, there are good docstrings for the cli but not the python API.
17:55:35 <davidkranz> jaypipes: You have to read the code to use the API, but not to use the cli.
17:55:54 <jaypipes> davidkranz: yep.
17:55:58 <davidkranz> jaypipes: I was actually doing that today :)
17:57:15 <jaypipes> OK all, going to end the meeting now. Please send last-minute questions or feedback as posts to the QA mailing list (I just sent out an ML post about afazekas's nomination to qa-core)
17:57:18 <mnewby> jaypipes: hi
17:57:27 <jaypipes> Good night/day/morning/weekend ;)
17:57:32 <jaypipes> mnewby: ! there you are...
17:57:49 <jaypipes> mnewby: closing meeting right now, but let's go to #openstack-dev to chat about the quantum test review
17:57:53 <jaypipes> #endmeeting