17:00:51 #startmeeting qa
17:00:52 Meeting started Thu Jul 25 17:00:51 2013 UTC. The chair is sdague. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:53 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:55 The meeting name has been set to 'qa'
17:01:05 ok, who's around for the qa meeting?
17:01:11 me
17:01:13 Hi!
17:01:17 hi!
17:01:18 hi
17:01:37 ok, agenda is here - https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
17:01:49 hi
17:02:03 #topic Blueprint status
17:02:22 hi
17:02:43 I still need to close out H2 and move things out
17:03:06 i took care of my blueprints
17:03:06 * afazekas is still waiting for feedback
17:03:27 remaining will be finished in H3
17:03:44 on anything I need to auto bump I'm going to bump to icehouse, so update to h3 if you think you are really going to be able to make it
17:03:47 afazekas: You mean the leak stuff?
17:03:58 dkranz: yes
17:04:02 afazekas: yeh, on which thing?
17:04:21 sdague: https://review.openstack.org/#/c/35516/
17:04:27 ok, so I've looked at the patch a little, and it's something I realize I need to take more time to understand
17:04:46 do you want to give us a high level view of the approach you are taking? maybe it makes it easier to review
17:05:16 sdague: This is one reason for not making it complete the first time, I still have minor design things to finalize as well
17:05:19 or maybe some sample output of it
17:05:39 sdague: it has the same design as the pep8 checker
17:05:55 afazekas: ok, but it's not yet hooked into any base classes right?
17:06:04 You have detectors which first record the beginning state, and at the end compare it with the final state
17:06:21 I guess that's what's made it hard for me to review, as it isn't yet functional
17:06:24 sdague: It is a before and after test suite action
17:07:03 after the test runner it can say if something was not deleted properly
17:07:12 It can be anything..
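The detector approach described above (record the beginning state, then compare it with the final state) can be sketched roughly as follows. This is not the code from review 35516; `LeakDetector` and the `list_resources` callable are hypothetical names chosen for illustration, and a real detector would query an OpenStack API instead of an in-memory list.

```python
class LeakDetector:
    """Records resource IDs before a run and reports IDs left behind after it.

    Hypothetical sketch: list_resources is any callable returning the
    currently existing resource IDs (servers, volumes, ...).
    """

    def __init__(self, name, list_resources):
        self.name = name
        self._list_resources = list_resources
        self._before = None

    def record_initial_state(self):
        # Snapshot taken before the test suite action.
        self._before = set(self._list_resources())

    def find_leaks(self):
        # Anything present now but absent from the snapshot was not cleaned up.
        if self._before is None:
            raise RuntimeError("record_initial_state() was not called")
        return sorted(set(self._list_resources()) - self._before)


# Usage with a fake in-memory resource list standing in for an API query.
fake_servers = ["preexisting-server"]
detector = LeakDetector("servers", lambda: list(fake_servers))
detector.record_initial_state()
fake_servers.append("leaked-server")  # a test forgot to delete this
print(detector.name, detector.find_leaks())
```

Taking the snapshot once per suite keeps the number of list queries low, which is the performance concern raised in the discussion; running it per test would localize leaks better at the cost of more queries.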
17:07:58 oh, you get it by another wrapper
17:07:59 I would like to make some trick in the rand_name generation, to make it easier to find out which test case was responsible
17:08:27 what about having it in master setUp and tearDown for base class instead?
17:08:36 so that we don't need the wrapper
17:09:18 ok, well I'll go put some feedback out there post meeting
17:09:22 and we can carry on in qa.
17:09:26 sdague: it would require querying everything, and I have concerns about its performance
17:09:54 afazekas: it would be good to see if that was true in practice
17:10:21 I would worry about performance later, right now a more integrated prototype is probably useful
17:10:41 ok, mtreinish how goes testr?
17:11:06 so we've had the testr run for a week and I've started working on tracking down some of the race conditions
17:11:19 sdague: I can make a change to be able to run on setUpClass / tearDownClass or on setUp / tearDown as a configurable plugin
17:11:28 afazekas: great
17:11:30 I've got a fix pending on one in nova with aggregates list (that I need to rework a bit)
17:11:33 afazekas: +1
17:11:57 mtreinish: is there a wiki or etherpad page with testr issues we're running after?
17:11:58 but right now something got merged in nova that broke something with aggregates and availability zones
17:12:10 and all the testr runs are failing
17:12:26 sdague: there is https://etherpad.openstack.org/debugging-testr-tempest but I haven't updated it in a bit
17:12:37 I'll take some time today to update it
17:12:40 I think right now getting testr functioning is probably the highest priority team mission, as it's going to let us enable other things like heat
17:12:48 Is it allowed to move one host to two availability_zone?
17:12:59 mtreinish: ok, could you take a little bit of time to do that today?
17:13:16 #link current testr issues https://etherpad.openstack.org/debugging-testr-tempest
17:13:22 sdague: yes I will
17:13:25 great
17:13:48 #action mtreinish to update testr etherpad with latest status of fails so we can try to get more folks on it
17:14:12 sdague: but aside from the one that I'm currently fixing the ones listed there are still active
17:14:38 so there are 6 other outstanding issues listed there
17:14:43 ok, good to know. I'll start poking from my test env today and see if I can help
17:15:01 plus the az fail that's stopping the job from working
17:15:39 afazekas: I'm not sure, I haven't looked at things in detail yet
17:16:35 maybe we can try to enlist some of the infra guys that did testr conversions other places to help debug the issues
17:16:49 sdague: sure that sounds like a good idea
17:16:56 cool, great
17:17:28 ok, I think the last blueprint that we wanted to be sure to talk about was the stress tests
17:17:32 mkoderer the floor is yours
17:17:51 ok so the refactoring is nearly done
17:18:16 what is left is unit tests and adding more stress tests
17:18:46 cool
17:18:46 that's coming right along
17:19:01 I am currently quite busy .. so I hope I have something starting next week
17:19:19 and.. I registered a talk for the Summit about the new framework
17:19:21 http://docs.python.org/2/library/fcntl.html#fcntl.flock a lock type which can work if we need to force something to be serialized
17:19:51 mkoderer: cool
17:20:04 I want to present the new framework with some real life test cases
17:20:13 mkoderer: By unit test do you mean something that runs in tempest to make sure the stress scenarios work?
17:20:26 mkoderer: But without stress.
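The `fcntl.flock` suggestion linked above, for forcing something to be serialized between parallel test workers, might look roughly like this. It is a minimal sketch, not anything from tempest: the `serialized` helper and the lock file path are assumptions, and `fcntl` is available on Unix only.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def serialized(lock_path):
    """Hold an exclusive flock on lock_path for the duration of the block.

    Hypothetical helper: other processes entering the same block with the
    same lock_path wait until the lock is released.
    """
    with open(lock_path, "a") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Usage: only one worker at a time runs the guarded section.
lock_path = os.path.join(tempfile.mkdtemp(), "aggregates.lock")
with serialized(lock_path):
    print("running the section that must not overlap with other workers")
```

The lock is advisory, so it only serializes code paths that take the same lock; tests that skip the helper are not held back.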
17:20:33 dkranz: yes something like that
17:20:37 a sanity check
17:20:53 mkoderer: I put a suggestion in one of the reviews
17:20:54 dkranz: yes, we were chatting in the review and it occurred to me that we really should have some sanity check on checkin to make sure we didn't land broken python code
17:21:03 dkranz: that would be great
17:21:07 sdague: Definitely.
17:21:19 sdague: I basically suggested running each of them once.
17:21:39 this could be a good and easy solution
17:21:39 sdague: Cleanup is the issue.
17:22:00 dkranz: ok, well we should try to structure for that case
17:22:16 dkranz: or we can add a NOOP action just to check the framework itself
17:22:26 but I like the idea of one run only
17:22:36 that could go into the normal gate runs then
17:22:38 mkoderer: Yes, but it is new cases that may be submitted not working
17:22:46 sdague: sure this covers even more
17:22:59 sounds like a plan
17:23:03 great
17:23:04 yes let's do that!
17:23:40 #action mkoderer and the stress test folks to make it easy to run once through only for gating purposes
17:23:45 very cool
17:23:51 ok, next agenda topic
17:24:04 #topic WebDav status codes in nova? Consistently using the 404 or 422 on actions.
17:24:15 so that one I actually threw to the -dev mailing list
17:24:26 as I think we need more broad input than just us
17:24:27 #link https://bugs.launchpad.net/nova/+bug/1204999
17:24:28 Launchpad bug 1204999 in nova "422 HTTP status code on several actions" [Undecided,New]
17:24:53 #link http://lists.openstack.org/pipermail/openstack-dev/2013-July/012519.html
17:25:00 I would encourage us to discuss it there
17:25:34 so I'm going to jump from that topic quickly, and say lets do it on the list
17:25:53 #topic Adding test cases with skip attribute vote? Exact rule (afazekas)
17:25:55 sdague: IMHO it is not just a WebDav code, but in this case it is not the correct code anyway
17:26:15 well it's only in the webdav rfc
17:26:24 but anyway, lets do it on the list
17:26:33 afazekas: you're up next on the skip topic
17:26:56 #link https://review.openstack.org/#/c/35487/
17:27:14 ok
17:27:39 The current Hacking.rst just forbids changes which contain only skipped test cases
17:27:55 IMHO this is a good policy
17:27:59 so the rationale behind why we don't want to land skipped tests is that the test itself is then never tested before it's in the tree
17:28:12 This submission has some tests that are not skipped. I will vote for adding tests with skip.
17:28:29 yeah, I've had to go through and fix really broken tests before when I went to remove the skips
17:28:36 I don't want to land non tested code in the tree
17:28:39 It is really hard to come back later to add those tests.
17:28:54 so we should change the wording to we don't land any skipped tests
17:29:02 The never-run thing happened with several keystone changes, because nobody wanted to fix it
17:29:18 ravikumar_hp: it's easily breakupable into a patch with only working tests, and a different one
17:29:21 but in this case it is just a config option, and AFAIK it will be default in neutron anyway
17:30:14 Unfortunately in neutron it is not as easy https://review.openstack.org/#/c/38591/ as in devstack https://review.openstack.org/#/c/38267/
17:30:17 so I remain -1, and am verging on -2 because I'm not comfortable landing never tested code in the tree
17:31:07 afazekas: ok, that's all that's needed there? Is there a reason quotas are off by default in neutron?
17:31:10 sdague: yes. that makes sense
17:31:36 drawback of not accepting skipped tests in general is that you lose potential test coverage...
17:31:40 sdague: when we remove skip, those tests could break ..
17:32:04 mkoderer: we've had tests that are skipped for over a year
17:32:22 sdague: I do not know about any other reason than: difficult to convince the unit tests to be default..
17:32:25 would it be possible to put them into a separate branch?
17:32:36 mkoderer: we don't really work with branches well
17:32:38 sdague: I see I don't know the history.. I am new ;)
17:32:43 sdague: I think this is a grey area that is not well served by a simple rule for acceptance.
17:32:58 it does occur to me that maybe we should timestamp when a test is skipped, so we know how old it is
17:33:27 and after a certain time say people need to fix them, or we dump them, because they are bitrotting a lot when skipped
17:33:32 a timestamp in the code? or a script that checks it?
17:33:37 sdague: That works for me.
17:33:51 in the code, make it another param on the skip bug
17:33:51 sdague: sounds logical
17:33:52 dkranz: git can tell it with 2-3 combined commands
17:34:17 dkranz: landing a skipped test doesn't seem grey to me
17:34:25 it means we put code in the tree that didn't get run
17:34:39 that's a no no in my book
17:34:53 sdague: Yes, but if we reject it maybe no one will add it and we won't have it.
17:35:01 sdague: It can be a fail either way.
17:35:09 dkranz: but it's not like we have it when it's skipped :)
17:35:16 sdague: basically all neutron jobs are non-voting now, so they can run and fail :)
17:35:17 dkranz: I agree with this
17:35:49 sdague: I think we just need to be more diligent about unskipping tests, perhaps by putting a reminder on whatever action is needed to unskip.
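The idea floated above, a timestamp carried as another parameter on the skip so its age is known, could be sketched like this. The `skip_because` decorator, its `since` parameter, and `skip_age_days` are hypothetical names for illustration and are not claimed to be tempest's API; the bug number is a placeholder.

```python
import datetime
import unittest


def skip_because(bug, since):
    """Skip a test for a tracked bug and record the date the skip was added.

    Hypothetical decorator: bug is a bug number string, since is an ISO date.
    """
    def decorator(test_item):
        reason = "Skipped until bug %s is resolved (skipped since %s)" % (bug, since)
        skipped = unittest.skip(reason)(test_item)
        # Keep the metadata on the test so a report script can read it later.
        skipped.skip_bug = bug
        skipped.skip_since = datetime.date.fromisoformat(since)
        return skipped
    return decorator


def skip_age_days(test_item, today):
    """Return how many days the given test has been skipped."""
    return (today - test_item.skip_since).days


class ExampleTest(unittest.TestCase):
    @skip_because(bug="0000000", since="2013-07-25")  # placeholder bug number
    def test_quota_update(self):
        self.fail("never runs while the skip is in place")


print(skip_age_days(ExampleTest.test_quota_update, datetime.date(2013, 8, 24)))
```

Storing the date in the decorator survives refactors and whitespace changes, which is the weakness of deriving skip ages from git history that comes up a little later in the meeting.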
17:36:09 yeh, we have the skip tracker, a run on that is probably a good idea
17:36:34 the skip tracker will tell you which bugs we skip have been closed
17:36:37 sdague: we can run that once a week and send the output to the list or something to pester people :)
17:36:46 but unskipping things is still very manual
17:36:49 because of bit rot
17:36:53 Can we extend the skip tracker to show skip ages based on git history?
17:37:39 afazekas: that can be tricky because what happens if there is a code refactor that moves things around
17:37:40 afazekas: I don't know, that's why I'd rather annotate the skips
17:37:42 or a whitespace change
17:37:48 mtreinish: if you want to do this automatically you need to check the bugtracker
17:37:49 we can seed them with git values
17:37:50 mtreinish: restoring a change and rebasing is also manual
17:37:59 mkoderer: we have a script that does that
17:38:03 mkoderer: we have a script in tools that does that
17:38:09 ;) ok ok
17:38:16 mkoderer: https://github.com/openstack/tempest/blob/master/tools/skip_tracker.py
17:38:38 cool
17:38:44 we have 4 that should be fixed
17:38:53 The following bugs have been fixed and the corresponding skips
17:38:53 should be removed from the test cases:
17:38:53 ()
17:38:53 1072318
17:38:54 1074908
17:38:55 1080406
17:38:56 1170718
17:39:05 so that's easy patches for someone... maybe :)
17:39:18 ok, well only 20 minutes left, so lets move on
17:39:32 #topic py26 compatibility (afazekas)
17:40:08 #link https://review.openstack.org/#/c/38284/
17:41:17 IMHO it is a small/simple enough patch for letting tempest run with py26 without skip issues
17:41:50 ok, I think I'm fine with that. We'll probably have to revisit once we drop nose as test-requires
17:42:01 afazekas: that's probably fine. But, I personally don't feel py26 compatibility is something that is very important for tempest.
17:42:03 sdague: Sure.
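The skip tracker discussed earlier in this topic reports which skipped bugs have since been fixed. The core comparison can be sketched as below. This is not the actual tools/skip_tracker.py: the skip message format, the function names, and the hard-coded set of fixed bugs are assumptions, and a real tool would ask the bug tracker for bug status.

```python
import re

# Assumed wording of a skip reason; real skip messages may differ.
BUG_RE = re.compile(r"[Bb]ug\s*#?\s*(\d+)")


def skipped_bugs(source_text):
    """Collect bug numbers mentioned on lines that contain a skip."""
    bugs = set()
    for line in source_text.splitlines():
        if "skip" in line:
            bugs.update(BUG_RE.findall(line))
    return sorted(bugs, key=int)


def removable_skips(source_text, fixed_bugs):
    """Return skipped bug numbers that the tracker reports as fixed."""
    return [bug for bug in skipped_bugs(source_text) if bug in fixed_bugs]


# Hypothetical test source; two of the bug numbers come from the log above.
sample_source = '''
@testtools.skip("Skipped until the Bug #1072318 is resolved")
def test_one(self): pass

@testtools.skip("Skipped until the Bug #1170718 is resolved")
def test_two(self): pass

@testtools.skip("Skipped until the Bug #1999999 is resolved")
def test_three(self): pass
'''
print(removable_skips(sample_source, fixed_bugs={"1072318", "1170718"}))
```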
17:42:18 dkranz / mtreinish: can you just take a look as well
17:42:31 Yup
17:42:50 dkranz: sure
17:43:00 mtreinish: I wish it were not important for OpenStack at all but we are not there yet.
17:43:19 I do think we probably need to rethink py26 compat for icehouse, because py26 will stop having security updates
17:43:20 dkranz: we can lead the pack
17:43:33 and I assume the software channel stuff for rhel will be ga by then
17:43:49 mtreinish: You'll have to lend me a nomex suit.
17:44:04 sdague: I hope so.
17:44:08 dkranz / afazekas do you know if red hat's going to be ok with dropping 2.6 once the software channel stuff is GA?
17:44:16 I realize it's only beta now
17:44:45 sdague: Let me find out.
17:44:52 it would be nice to know, because I'm sure we're going to get asked about python 3 compat soon :)
17:45:13 sdague: Yes, this really sucks.
17:45:13 and 2.6 vs. 3 isn't pretty
17:45:24 cool, thanks dkranz
17:45:56 #topic Consistent reviewing (afazekas)
17:46:03 afazekas: the floor is yours again
17:46:29 sdague: we have too many open reviews, what can we do to make the merge process faster and simpler?
17:47:30 IMHO one thing we can do is document the exact expectations about what patch can be merged. And we should be less nit picky some times :)
17:47:50 afazekas: 30 nonWIP reviews really isn't that much
17:47:51 so we only have 35 open reviews, which is 10% of the nova queue :)
17:47:56 yeh
17:48:17 I agree that in reviews we should give people as specific feedback as possible
17:48:26 mtreinish: in tempest scale it is very much
17:48:28 sdague: I don't think what nova does is really relevant.
17:48:50 https://review.openstack.org/#/q/status:open+-Verified-1+-CodeReview-1+-CodeReview-2+(project:openstack-dev/grenade+OR+project:openstack/tempest),n,z
17:48:53 sdague: The problem is that people can work downstream much faster and we don't want people to work downstream.
17:48:59 we only have 19 without negative feedback on them
17:49:06 18 if you remove wip
17:49:21 Maybe this would be useful doc to keep handy when reviewing:
17:49:22 #link https://wiki.openstack.org/wiki/GitCommitMessages
17:49:50 and I do think it's ok to be picky some times, I was -1ing a lot of patches that were doing cleanup incorrectly
17:49:54 kashyap: I very rare read the commit message :)
17:50:05 afazekas: the commit message is important :)
17:50:06 sdague: That is not nitpicking.
17:50:07 kashyap: I very rarely read the commit message :)
17:50:10 afazekas, I always browse git commit logs
17:50:29 andreaf, it's really important to have all the information right there (without having to click bugs, etc)
17:50:37 (Oops, didn't mean to prompt him)
17:50:50 kashyap: +1
17:51:08 afazekas, The above document is written by danpb, after experience with a lot of communities like Kernel/KVM, QEMU, Libvirt.
17:51:21 I think the biggest issue is turnaround time.
17:51:21 andreaf, I certainly learnt a lot from it (while I'm still new here).
17:51:45 Darn, I keep prompting him (Sorry, again). :(
17:51:49 If turnaround was fast, having to resubmit for a nitpick would not be as much of a problem.
17:52:08 kashyap: it's a good doc, I often point people to it
17:52:19 dkranz: that's fair, I think it's also fair that if you see patches that others should take a look at, drop it in #openstack-qa
17:52:52 I just figured out how to do the enhanced queries, so that "no negative feedback" query is useful
17:53:01 sdague: If we are all ok with being pinged for reviews on-demand then that would work.
17:53:11 sdague: Within reason of course.
17:53:18 dkranz: as long as it's on #openstack-qa
17:53:26 sdague: Of course.
17:53:33 and only once a day per review, some times people ping everyone every hour
17:53:38 that's no good
17:53:46 mtreinish, True.
17:53:50 sdague: I think -dev would be fine too
17:53:56 sure -dev is fine as well
17:54:06 -qa has a captive audience though
17:54:07 sdague: We could also drop the review-by-more-than-one-company policy for a -1 that says I'm +2 after this nitpick is addressed.
17:54:26 dkranz: yeh, it's only a guideline, if people think it was fixed that's cool
17:54:33 sdague: +1
17:54:52 sdague: OK, let's give this a try
17:55:01 what about considering 3 times +1 as a +2 ?
17:55:09 afazekas: no
17:55:11 afazekas: no
17:55:18 afazekas: no :)
17:55:24 :)
17:55:26 we have lots of people that put +1s on things
17:55:41 sdague: one or two more core reviewers wouldn
17:55:46 by having +2 it means there is trust in your judgement
17:55:47 t hurt of course.
17:56:05 dkranz: yeh, let me run the numbers again, i think some folks were coming up the ranks on reviews
17:56:18 I was planning to see if we had good folks to propose next week
17:56:25 sdague: Great!
17:56:46 what numbers count to become a core reviewer?
17:56:47 I would also encourage folks to use this review query - https://review.openstack.org/#/q/status:open+-Verified-1+-CodeReview-1+-CodeReview-2+(project:openstack-dev/grenade+OR+project:openstack/tempest),n,z
17:56:52 just number of reviews?
17:56:56 mkoderer: reviews
17:57:01 ok
17:57:06 but we also want to see judgement
17:57:12 so correctly -1ing stuff
17:57:26 mkoderer: yeah its not just quantity but quality too
17:57:47 because once you are core, you can approve, and we want to trust that the core members have good judgement in what they keep out of the tree
17:58:09 sdague: three minutes for slow test...
17:58:10 that review query is the no negative feedback one, and there is no reason that shouldn't be a very short list
17:58:17 right
17:58:30 sdague: one thing i found useful was to look for open and approved patches
17:58:35 #topic Separate heat job and slow tests in general (dkranz)
17:58:36 something that got borked in a rebase
17:58:48 any one object to marking tests slow, skipping in full tempest, and having other gate jobs select the tests they want to run
17:58:54 jog0: well that doesn't actually filter -2, so those would be in that list
17:59:05 sdague: see https://github.com/openstack-infra/reviewstats/blob/master/openapproved.py
17:59:41 dkranz: so right now we just have the twin volume test that's > 60s
17:59:42 sdague: at least in nova we sometimes don't require two +2s for a trivial rebase of something already approved
17:59:45 I sent an email to the list about that.
17:59:54 dkranz: yeh, lets do it on the list
17:59:56 sdague: heat?
18:00:02 sdague: ceilometer is coming
18:00:03 well heat is a special case
18:00:15 and I think heat should be its own gate job
18:00:22 sdague: Only at the moment
18:00:28 I think ceilo is going to just augment the other tests
18:00:32 sdague: Right, that's what we said.
18:00:45 sdague: about ceilo remains to be seen
18:00:50 so I'd be practical now and start with just heat as a separate job
18:00:56 and see if we need a different solution later
18:01:00 sdague: That's what i was going to do
18:01:08 ok, I'm cool with that
18:01:09 But we need to skip the tests for that job in the main gate
18:01:17 I propose to do that by marking them slow
18:01:22 ok
18:01:25 lets try that way
18:01:31 So be it.
18:01:32 dkranz: just start the service tagging blueprint
18:01:35 tag them as heat
18:01:46 same net effect at first
18:01:46 looks like it's time for the OSSG meeting… I'll wait for the previous one to wrap up
18:01:56 mtreinish: Let's move to qa channel
18:01:59 dkranz: ok
18:02:01 bdpayne: yep, I'll wrap us up now
18:02:08 ok, thanks folks, our time is up
18:02:12 #endmeeting
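
The proposal from the last topic (mark tests as slow or tag them by service such as heat, skip them in the full run, and let other gate jobs select what they want) can be sketched as below. The `tag` decorator and `select` function are hypothetical illustrations of the idea, not tempest's or the test runner's actual selection mechanism.

```python
def tag(*names):
    """Attach tag names (for example "slow" or "heat") to a test function."""
    def decorator(func):
        func.tags = getattr(func, "tags", frozenset()) | frozenset(names)
        return func
    return decorator


def select(tests, include=None, exclude=()):
    """Pick tests by tag: keep those matching include, drop those in exclude."""
    selected = []
    for test in tests:
        tags = getattr(test, "tags", frozenset())
        if include is not None and not tags & frozenset(include):
            continue
        if tags & frozenset(exclude):
            continue
        selected.append(test)
    return selected


def test_list_servers():
    pass


@tag("slow")
def test_twin_volume():
    pass


@tag("heat", "slow")
def test_stack_create():
    pass


all_tests = [test_list_servers, test_twin_volume, test_stack_create]
# Main gate: everything except slow tests. Heat job: only heat-tagged tests.
print([t.__name__ for t in select(all_tests, exclude=["slow"])])
print([t.__name__ for t in select(all_tests, include=["heat"])])
```

Tagging by service and tagging as slow give the same net effect at first, as noted in the log, because the main job excludes on one tag while the dedicated job includes on another.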