15:12:13 <scottda> #startmeeting cinder_testing
15:12:14 <openstack> Meeting started Wed Sep 14 15:12:13 2016 UTC and is due to finish in 60 minutes. The chair is scottda. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:12:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:12:17 <openstack> The meeting name has been set to 'cinder_testing'
15:12:21 <smcginnis> scottda: Test all the things!
15:12:22 <eharney> one quick thing here
15:12:27 <hemna> smcginnis, +1
15:12:29 <xyang1> hi
15:12:30 <hemna> meeting done!
15:12:36 <erlon_> scottda: hmm ok, it's about to start
15:12:37 <smcginnis> hemna: ;)
15:12:40 <eharney> i found another bug just by running our pylint test and looking at it... please don't ignore pylint failures on patches
15:12:42 <erlon_> hi
15:12:46 <e0ne> :)
15:12:49 <eharney> it's non-voting, but useful
15:12:51 <scottda> I put something on the main cinder meeting agenda re: testing things for the release
15:12:53 <smcginnis> eharney: +1
15:13:02 <smcginnis> eharney: We really should have caught that one. :[
15:13:17 <smcginnis> scottda: Thanks for starting that etherpad.
15:13:33 <erlon> smcginnis: the testing one?
15:13:38 <scottda> eharney: +1. But there's always the problem of the tests that always fail...
15:13:43 <scottda> They make people ignore the non-voting tests that do fail.
15:13:51 <smcginnis> erlon: Yeah. Not much there so far, but it's a start.
15:14:01 <erlon> smcginnis: mhm
15:14:10 <scottda> Perhaps we should disable or otherwise not report on tests that always fail and that we are not fixing.
15:14:14 <scottda> OR fix them.
15:14:46 <scottda> #link https://etherpad.openstack.org/p/cinder-newton-testing
15:14:56 <scottda> ^^ That's for cinder-newton specific testing.
15:15:12 <erlon> scottda: hmm, I didn't have that one
15:15:40 <scottda> That's to call out areas we *should* be testing for the release.
15:15:52 <xyang> can I get some reviews on the groups functional tests? https://review.openstack.org/#/c/362584/
15:15:58 <scottda> Things that must be tested manually, or that we know aren't covered by automated testing.
15:16:10 <scottda> xyang: I'll look today
15:16:24 <xyang> scottda: thanks
15:17:33 <erlon> scottda: do you have an agenda?
15:17:45 <scottda> For today? NO
15:18:08 <scottda> And we can cut it short. I don't want to discuss the same thing we're discussing in cinder meeting...
15:18:21 <erlon> scottda: ok, then, I had some ideas about what we were talking about yesterday
15:19:03 <erlon> scottda: to create a job/tempest tests to cover migration/retype features
15:19:40 <scottda> erlon: Go ahead
15:20:03 <erlon> scottda: the idea is to have some jobs to run only tests related to that, which would reduce the time needed to run them
15:20:36 <erlon> scottda: I added some points in the etherpad: https://etherpad.openstack.org/p/Cinder-testing
15:20:57 <scottda> erlon: Sounds fine. You could add an experimental job for that.
15:21:48 <erlon> scottda: we continue with the multi-be tests in tempest, but maybe they need some changes in tempest to allow running the matrix
15:22:53 <erlon> scottda: today the test considers only 2 backends, it would need to consider more if configured so, or a DDT-like scheme to run them individually against each BE configured
15:23:46 <scottda> erlon: OK. If you get some patches up, put them on the etherpad and ping us and we can review.
15:23:58 <erlon> scottda: sure, I will
15:24:35 <scottda> erlon: Anything else on this? I want to circle back to the problem of ignoring non-voting tests.
15:24:50 <erlon> scottda: no, that's all
15:25:12 <scottda> It seems that gate-tempest-dsvm-full-drbd-devstack-nv and gate-tempest-dsvm-neutron-identity-v3-only-full-nv regularly fail.
15:25:35 <erlon> scottda: how often?
15:25:59 <scottda> And that leads to people getting used to Big Red Failures on the non-voting tests. I think this leads to people generally ignoring them.
15:25:59 <scottda> erlon: I don't know how often, I just ignore them.
15:26:03 <scottda> :)
15:26:23 <erlon> scottda: haha that is a problem
15:26:45 <erlon> scottda: usually I ignore all non-voting tests not related to what I'm changing
15:27:37 <erlon> scottda: for example, for the NFS patch, I always look for 'true' negatives in any NFS derived CI
15:27:41 <scottda> I think if things are failing, and we're not fixing them, we should consider a way to disable them. I know that getting things out of, and then back into, the infra repos can be a PITA.
15:27:59 <scottda> eharney: smcginnis e0ne What do you think?
15:28:18 <jgriffith> scottda: who is going to *do* that?
15:28:30 <jgriffith> scottda: and also that should apply to 3rd party CI, right?
15:28:50 <scottda> jgriffith: Well, that's part of the issue. It's certainly easiest to just do nothing.
15:28:51 <jgriffith> scottda: none of this is a new complaint or revelation... but the same problem exists. WHO is going to do it and how
15:29:23 <e0ne> scottda: I think gate-tempest-dsvm-full-drbd-devstack-nv and gate-tempest-dsvm-neutron-identity-v3-only-full-nv cases are different
15:29:35 <e0ne> who maintains gate-tempest-dsvm-full-drbd-devstack-nv?
15:29:39 <jgriffith> scottda: "step-1: Define a criteria"
15:29:50 <jgriffith> "step-2: Define an action/process"
15:30:01 <jgriffith> "step-3: Define who does what"
15:30:02 <e0ne> gate-tempest-dsvm-neutron-identity-v3-only-full-nv - IMO, we should fix it or move to experimental
15:30:05 <jgriffith> "step-4: do it"
15:30:25 <jgriffith> scottda: "those tests fail a lot" isn't an actionable criteria
15:31:17 <e0ne> jgriffith, scottda: agree. we have to compare their failures with tempest over lvm job
15:31:20 <erlon> if a test fails more than X%?
15:31:30 <jgriffith> e0ne: scottda I guess a good start would be some ER queries maybe to get some data?
15:32:03 <e0ne> jgriffith, scottda: yes, we did it in the past to make some jobs voting or move from the experimental queue
15:32:24 <erlon> jgriffith: there's someone working on a tool to get that, isn't there?
15:32:50 <scottda> sorry, browser crapped out on me..
15:33:03 <jgriffith> erlon: not sure where those efforts are any more
15:33:24 <jgriffith> erlon: seems like a topic everyone loves to talk about but not really interested in working on
15:33:33 <erlon> IIRC _alastor_ was doing it
15:33:39 <jgriffith> erlon: I'm as guilty as the next on that :(
15:33:45 <scottda> jgriffith: You are right on that. I am guilty on that.
15:34:26 <scottda> alright, I guess we go back to : Don't ignore the non-voting tests, just the ones that always fail and should be ignored.
15:34:44 <erlon> johnthetubaguy: so step 0: should 'be able to measure who and how much are failing'
15:34:53 <erlon> jgriffith: ^
15:35:00 <e0ne> scottda: more or less, but everybody ignores nv jobs:(
15:35:24 <jgriffith> so to be fair, the pylint test is a special case. That really should never be ignored and it doesn't have false failures very often.
15:35:34 <eharney> yeah, pylint is not exactly the same thing as a tempest run
15:35:35 <scottda> e0ne: Yes, well, that's what motivated me to bring it up. But I guess I'm not motivated enough to fix the whole thing..
15:35:40 <e0ne> erlon: you can use http://graphite.openstack.org/ to compare how many failures there were against the lvm job
15:36:10 <jgriffith> e0ne: oh, neat... I hadn't seen that
15:36:21 <e0ne> scottda: it's not a problem to fix some nv job
15:36:32 <jgriffith> e0ne: I have no idea how in the hell to use it... but it looks pretty :)
15:36:36 <e0ne> scottda: it's very hard to get it passing on a regular basis
15:36:41 <scottda> e0ne: You mean to fix it so it passes?
15:36:48 <e0ne> scottda: yep
15:36:53 <erlon> e0ne: does it gather information about all gate tests?
15:37:08 <e0ne> jgriffith: it has an ugly UX :(
15:37:11 <scottda> Well, maybe not a problem, but it doesn't get the same attention as voting jobs.
15:37:16 <erlon> jgriffith: +1 it looks awesome indeed :)
15:37:21 <jgriffith> e0ne: LOL... reminds me of Windows
15:37:27 <erlon> e0ne: I liked it
15:37:30 <e0ne> erlon: yes, it has information for the last 3-6 months
15:38:13 <scottda> Anyway, maybe if we decide to make Ocata a "stability release", this could be part of the efforts. Maybe there's not much else to do without more interest.
15:38:17 <erlon> e0ne: that can solve part of the problem, the gate jobs
15:38:42 <erlon> scottda: +1 on that
15:39:18 <e0ne> one more time
15:39:32 <e0ne> for example: we will fix failures for pylint job
15:39:34 <jgriffith> e0ne: :)
15:39:44 <erlon> e0ne: do you know how to find a comparison there?
15:39:56 <e0ne> how will we keep it passing?
15:40:13 <e0ne> erlon: I did it in the past...
15:42:02 <e0ne> erlon: I'll try to make an example now
15:43:34 <erlon> e0ne: nice
15:45:07 <scottda> alright, maybe we revisit this if there's time and effort available to actually fix it.
15:45:15 <e0ne> erlon: something like http://graphite.openstack.org/render/?width=586&height=308&_salt=1473867901.344&from=-5months&target=stats.zuul.pipeline.check.job.gate-cinder-python27-db.FAILURE&target=stats.zuul.pipeline.check.job.gate-cinder-python34.FAILURE
15:47:44 <e0ne> scottda: it's a good topic for a design session
15:48:13 <erlon> e0ne: 0.10 = 10% failure?
15:49:42 <scottda> OK, anything else before the next meeting?
15:50:22 <scottda> let's take a break then...
15:50:25 <scottda> #endmeeting
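
Note on the Graphite comparison discussed above: the sketch below is one possible way to build a render URL like the one e0ne pasted, and to apply an "X% failure" criterion of the kind erlon suggested. It is only an illustration, not something agreed in the meeting. The target pattern and the `from=-5months` window are copied from that URL; the helper names (`failure_targets`, `render_url`, `failure_rate`, `exceeds_threshold`) are hypothetical, and the success/failure counts are assumed to be supplied by whoever collects them.

```python
from urllib.parse import urlencode

# Base render endpoint, taken from the URL pasted in the meeting.
GRAPHITE_RENDER = "http://graphite.openstack.org/render/"


def failure_targets(jobs, pipeline="check"):
    """Build FAILURE metric names following the pattern seen in the log."""
    return [f"stats.zuul.pipeline.{pipeline}.job.{job}.FAILURE" for job in jobs]


def render_url(jobs, since="-5months", pipeline="check"):
    """Return a render URL that plots the FAILURE series of several jobs together."""
    params = [("from", since)]
    params += [("target", target) for target in failure_targets(jobs, pipeline)]
    return GRAPHITE_RENDER + "?" + urlencode(params)


def failure_rate(failures, successes):
    """Fraction of runs that failed; 0.0 when there were no runs at all."""
    total = failures + successes
    return failures / total if total else 0.0


def exceeds_threshold(failures, successes, max_rate):
    """Hypothetical criterion: flag a job whose failure rate is above max_rate."""
    return failure_rate(failures, successes) > max_rate


if __name__ == "__main__":
    print(render_url(["gate-cinder-python27-db", "gate-cinder-python34"]))
    # Made-up counts, purely to show the call shape.
    print(exceeds_threshold(failures=30, successes=70, max_rate=0.25))
```

Comparing a non-voting job against the LVM tempest job, as e0ne proposed, would amount to passing both job names to `render_url` and reading the two series side by side.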