#openstack-cinder log

15:12:13 <scottda> #startmeeting cinder_testing
15:12:14 <openstack> Meeting started Wed Sep 14 15:12:13 2016 UTC and is due to finish in 60 minutes.  The chair is scottda. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:12:15 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:12:17 <openstack> The meeting name has been set to 'cinder_testing'
15:12:21 <smcginnis> scottda: Test all the things!
15:12:22 <eharney> one quick thing here
15:12:27 <hemna> smcginnis, +1
15:12:29 <xyang1> hi
15:12:30 <hemna> meeting done!
15:12:36 <erlon_> scottda: hmm ok, it's about to start
15:12:37 <smcginnis> hemna: ;)
15:12:40 <eharney> i found another bug just by running our pylint test and looking at it... please don't ignore pylint failures on patches
15:12:42 <erlon_> hi
15:12:46 <e0ne> :)
15:12:49 <eharney> it's non-voting, but useful
15:12:51 <scottda> I put something on the main cinder meeting agenda re: testing things for the release
15:12:53 <smcginnis> eharney: +1
15:13:02 <smcginnis> eharney: We really should have caught that one. :[
15:13:17 <smcginnis> scottda: Thanks for starting that etherpad.
15:13:33 <erlon> smcginnis: the testing one?
15:13:38 <scottda> eharney: +1. But there's always the problem of the tests that always fail...
15:13:43 <scottda> They make people ignore the non-voting tests that do fail.
15:13:51 <smcginnis> erlon: Yeah. Not much there so far, but it's a start.
15:14:01 <erlon> smcginnis: mhm
15:14:10 <scottda> Perhaps we should disable or otherwise not report on tests that always fail, and we are not fixing.
15:14:14 <scottda> OR fix them.
15:14:46 <scottda> #link https://etherpad.openstack.org/p/cinder-newton-testing
15:14:56 <scottda> ^^ That for cinder-newton specific testing.
15:15:12 <erlon> scottda: hmm, I hadn't seen that one
15:15:40 <scottda> That's to call out areas we *should* be testing for the release.
15:15:52 <xyang> can I get some reviews on the groups functional tests?  https://review.openstack.org/#/c/362584/
15:15:58 <scottda> Things that must be tested manually, or that we know aren't covered by automated testing.
15:16:10 <scottda> xyang: I'll look today
15:16:24 <xyang> scottda: thanks
15:17:33 <erlon> scottda: do you have an agenda?
15:17:45 <scottda> For today? NO
15:18:08 <scottda> And we can cut it short. I don't want to discuss the same thing we're discussing in cinder meeting...
15:18:21 <erlon> scottda: ok, then, I had some ideas about what we were talking about yesterday
15:19:03 <erlon> scottda: to create a job/tempest tests to cover migration/retype features
15:19:40 <scottda> erlon: Go ahead
15:20:03 <erlon> scottda: the idea is to have some jobs to run only tests related to that, which would reduce the time needed to run them
15:20:36 <erlon> scottda: I added some points in the etherpad: https://etherpad.openstack.org/p/Cinder-testing
15:20:57 <scottda> erlon: Sounds fine. You could add an experimental job for that.
15:21:48 <erlon> scottda: we continue with the multi-be tests in tempest, but maybe they need some changes in tempest to allow running the matrix
15:22:53 <erlon> scottda: today the test considers only 2 backends; it would need to consider more if so configured, or use a DDT-like scheme to run them individually against each configured BE
15:23:46 <scottda> erlon: OK. IF you get some patches up, put them on the etherpad and ping us and we can review.
15:23:58 <erlon> scottda: sure, I will
15:24:35 <scottda> erlon: Anything else on this? I want to circle back to the problem of ignoring non-voting tests.
15:24:50 <erlon> scottda: no, that's all
15:25:12 <scottda> It seems that gate-tempest-dsvm-full-drbd-devstack-nv and gate-tempest-dsvm-neutron-identity-v3-only-full-nv regularly fail.
15:25:35 <erlon> scottda: how often?
15:25:59 <scottda> And that leads to people getting used to Big Red Failures on the non-voting tests. I think this leads to people generally ignoring them.
15:25:59 <scottda> erlon: I don't know how often, I just ignore them.
15:26:03 <scottda> :)
15:26:23 <erlon> scottda: haha that is a problem
15:26:45 <erlon> scottda: usually I ignore all non-voting tests not related to what I'm changing
15:27:37 <erlon> scottda: for example, for the NFS patch, I always look for 'true' negatives in any NFS derived CI
15:27:41 <scottda> I think if things are failing, and we're not fixing them, we should consider a way to disable them. I know that getting things out, and then back in, to the infra repos can be a PITA.
15:27:59 <scottda> eharney: smcginnis e0ne What do you think?
15:28:18 <jgriffith> scottda: who is going to *do* that?
15:28:30 <jgriffith> scottda: and also that should apply to 3rd party CI right?
15:28:50 <scottda> jgriffith: Well, that's part of the issue. It's certainly easiest to just do nothing.
15:28:51 <jgriffith> scottda: none of this is a new complaint or revelation... but the same problem exists.  WHO is going to do it and how
15:29:23 <e0ne> scottda: I think gate-tempest-dsvm-full-drbd-devstack-nv and gate-tempest-dsvm-neutron-identity-v3-only-full-nv  cases are different
15:29:35 <e0ne> who does maintain gate-tempest-dsvm-full-drbd-devstack-nv?
15:29:39 <jgriffith> scottda: "step-1:  Define a criteria"
15:29:50 <jgriffith> "step-2: Define an action/process"
15:30:01 <jgriffith> "step-3: Define who does what"
15:30:02 <e0ne> gate-tempest-dsvm-neutron-identity-v3-only-full-nv - IMO, we should fix it or move to experimental
15:30:05 <jgriffith> "step-4: do it"
15:30:25 <jgriffith> scottda: "those tests fail a lot" isn't an actionable criteria
15:31:17 <e0ne> jgriffith, scottda: agree. we have to compare their failures with the tempest over lvm job
15:31:20 <erlon> if a test fails more than X%?
15:31:30 <jgriffith> e0ne: scottda I guess a good start would be some ER queries maybe to get some data?
15:32:03 <e0ne> jgriffith, scottda: yes, we did it in the past to make some jobs voting or move from the experimental queue
15:32:24 <erlon> jgriffith: there's someone working on a tool to get that, isn't there?
15:32:50 <scottda> sorry, browser crapped out on me..
15:33:03 <jgriffith> erlon: not sure where those efforts are any more
15:33:24 <jgriffith> erlon: seems like a topic everyone loves to talk about but not really interested in working on
15:33:33 <erlon> IIRC _alastor_ was doing that
15:33:39 <jgriffith> erlon: I'm as guilty as the next on that :(
15:33:45 <scottda> jgriffith: You are right on that. I am guilty on that.
15:34:26 <scottda> alright, I guess we go back to : Don't ignore the non-voting tests, just the ones that always fail and should be ignored.
15:34:44 <erlon> johnthetubaguy: so step 0: should 'be able to measure who and how much are failing'
15:34:53 <erlon> jgriffith: ^
15:35:00 <e0ne> scottda: more or less, but everybody ignores nv jobs:(
15:35:24 <jgriffith> so to be fair, the pylint test is a special case.  That really should never be ignored and it doesn't have false failures very often.
15:35:34 <eharney> yeah, pylint is not exactly the same thing as a tempest run
15:35:35 <scottda> e0ne: Yes, well, that's what motivated me to bring it up. But I guess I'm not motivated enough to fix the whole thing..
15:35:40 <e0ne> erlon: you can use http://graphite.openstack.org/ to compare how many failures there were against the lvm job
15:36:10 <jgriffith> e0ne: oh, neat... I hadn't seen that
15:36:21 <e0ne> scottda: it's not a problem  to fix  some nv job
15:36:32 <jgriffith> e0ne: I have no idea how in the hell to use it... but it looks pretty :)
15:36:36 <e0ne> scottda: it's very hard to get it passing on a regular basis
15:36:41 <scottda> e0ne: You mean to fix it so it passes?
15:36:48 <e0ne> scottda: yep
15:36:53 <erlon> e0ne: does it gather information about all gate tests?
15:37:08 <e0ne> jgriffith: it has an ugly UX :(
15:37:11 <scottda> Well, maybe not a problem, but it doesn't get the same attention as voting jobs.
15:37:16 <erlon> jgriffith: +1 it looks awesome indeed :)
15:37:21 <jgriffith> e0ne: LOL... reminds me of Windows
15:37:27 <erlon> e0ne: I liked it
15:37:30 <e0ne> erlon: yes, it has information for 3-6 last months
15:38:13 <scottda> Anyway, maybe if we decide to make Ocata a "stability release", this could be part of the efforts. Maybe there's not much else to do without more interest.
15:38:17 <erlon> e0ne: that can solve part of the problem, the gate jobs
15:38:42 <erlon> scottda: +1 on that
15:39:18 <e0ne> one more time
15:39:32 <e0ne> for example: we will fix failures for pylint job
15:39:34 <jgriffith> e0ne: :)
15:39:44 <erlon> e0ne: do you know how to find a comparison there?
15:39:56 <e0ne> how will we keep it passing?
15:40:13 <e0ne> erlon: I did it in the past...
15:42:02 <e0ne> erlon: I try to make an example now
15:43:34 <erlon> e0ne: nice
15:45:07 <scottda> alright, maybe we revisit this if there's time and effort available to actually fix it.
15:45:15 <e0ne> erlon: something like http://graphite.openstack.org/render/?width=586&height=308&_salt=1473867901.344&from=-5months&target=stats.zuul.pipeline.check.job.gate-cinder-python27-db.FAILURE&target=stats.zuul.pipeline.check.job.gate-cinder-python34.FAILURE
15:47:44 <e0ne> scottda: it's a good topic for design session
15:48:13 <erlon> e0ne: 0.10 = 10% failure?
15:49:42 <scottda> OK, anything else before the next meeting?
15:50:22 <scottda> let's take a break then...
15:50:25 <scottda> #endmeeting
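
Editor's note: the "fails more than X%" criterion and the Graphite comparison against a baseline job discussed above could be sketched roughly as follows. This is a minimal illustration, not a tool anyone in the meeting wrote. The FAILURE stat path and the -5months window come from the URL e0ne posted; the matching SUCCESS series, the use of Graphite's format=json output, the helper names, and the 10-point margin are all assumptions. It also assumes both series are sampled at the same interval, so that the ratio of their sums is meaningful.

```python
from urllib.parse import urlencode

# Render endpoint taken from the URL posted in the meeting.
GRAPHITE_RENDER = "http://graphite.openstack.org/render/"


def render_url(jobs, result="FAILURE", window="-5months"):
    """Build a Graphite render URL that returns JSON for each job's series.

    The stat path mirrors the one in the meeting log; the SUCCESS variant
    is an assumption.
    """
    params = [("from", window), ("format", "json")]
    params += [
        ("target", f"stats.zuul.pipeline.check.job.{job}.{result}")
        for job in jobs
    ]
    return GRAPHITE_RENDER + "?" + urlencode(params)


def total(datapoints):
    """Sum a Graphite JSON 'datapoints' list of [value, timestamp] pairs.

    Graphite reports missing samples as null (None in Python); skip them.
    """
    return sum(value for value, _timestamp in datapoints if value is not None)


def failure_rate(failure_points, success_points):
    """Return failures / (failures + successes), or None if there is no data."""
    failures = total(failure_points)
    runs = failures + total(success_points)
    return failures / runs if runs else None


def exceeds_baseline(job_rate, baseline_rate, margin=0.10):
    """Hypothetical criterion: flag a job whose failure rate is more than
    'margin' above the baseline job's rate (for example the lvm job)."""
    return job_rate > baseline_rate + margin
```

Fetching render_url([...]) with any HTTP client and passing each series' "datapoints" list to failure_rate() would give one number per job to compare against the baseline.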