22:00:14 <mtreinish> #startmeeting qa
22:00:15 <openstack> Meeting started Thu Feb  6 22:00:14 2014 UTC and is due to finish in 60 minutes.  The chair is mtreinish. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:00:16 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:00:20 <openstack> The meeting name has been set to 'qa'
22:00:32 <mtreinish> hi who do we have today?
22:00:33 <dkranz> Here
22:00:36 <masayukig> o/
22:00:39 <rahmu> hello
22:00:43 <mlavalle> hello
22:00:59 <mtreinish> #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
22:01:04 <mtreinish> ^^^ today's agenda
22:01:12 <ken1ohmichi> hi
22:01:21 <boris-42_> hi all
22:01:22 <boris-42_> =)
22:01:29 <afazekas> hi
22:01:38 <boris-42_> how are you guys?)
22:01:43 <mtreinish> let's dive into it
22:01:54 <mtreinish> #topic Blueprints
22:02:04 <mtreinish> does anyone have a blueprint that needs attention?
22:02:14 <mtreinish> or a status update on an open blueprint?
22:03:13 <dkranz> mtreinish: no :)
22:03:26 <mtreinish> heh yeah I guess let's move on to the next topic
22:03:38 <mtreinish> #topic Neutron testing
22:03:50 <mtreinish> mlavalle: any updates on the status of things with neutron?
22:04:05 * boris-42_ enikanorov__ ping
22:04:06 <mlavalle> api tests development has continued
22:04:19 <enikanorov__> boris-42_: queque
22:04:43 <mlavalle> lots of good contributions. We have almost 100% coverage of the gaps identified in the etherpad
22:04:57 <enikanorov__> boris-42_: greate that you've just woke me up
22:04:58 <mtreinish> oh wow are they all up for review?
22:05:18 <mlavalle> they are all in some part of the review cycle
22:05:27 <enikanorov__> i wanted to ask tempest cores to look at the patch we want for quite a long time:
22:05:31 <mlavalle> I am doing about 4 reviews a day
22:05:44 <enikanorov__> oh
22:05:48 <enikanorov__> i see it got approved
22:05:51 <enikanorov__> https://review.openstack.org/#/c/58697/
22:05:59 <enikanorov__> thanks!
22:06:06 <mtreinish> mlavalle: ok and is the neutron gate stabilized so we can start pushing things through?
22:06:08 <masayukig> yeah
22:06:14 <dkranz> enikanorov__: I was a little concerned about the runtime
22:06:28 <dkranz> mtreinish: It adds 45 seconds to neutron run. What should we do about this?
22:06:39 <mlavalle> salv-orlando reported good progress this past Monday
22:06:45 <mtreinish> dkranz: the neutron run isn't that much of a concern on the time budget
22:06:48 <dkranz> mtreinish: We said we would focus on scenarios but by nature they can take some time
22:06:50 <mtreinish> because they only run smoke
22:06:57 <mlavalle> I understand they are still stabilizing this week
22:07:02 <enikanorov__> dkranz: it spawns a vm. i guess we could set up a backend on the host itself
22:07:11 <dkranz> mtreinish: Yes, but they are only running a few minutes faster...
22:07:12 <enikanorov__> and just add route to the host from the tenant network
22:07:15 <mtreinish> dkranz: it'll be an issue long term but for right now it's ok
22:07:34 <dkranz> mtreinish: sdague wanted us to watch out for this
22:07:37 <mlavalle> I will ping him again today or tomorrow and pass the status in the qa channel
22:07:43 <mtreinish> mlavalle: ok cool
22:07:53 <mtreinish> I know we merged the neutron api tenant isolation patch
22:07:56 <dkranz> mtreinish: ok, I will stop worrying and love the bomb
22:08:07 <mtreinish> which previously broke things
22:08:23 <mlavalle> we've made good progress getting people contributing and I don't want to lose their enthusiasm
22:08:33 <mtreinish> dkranz: yeah sdague just said we need to be careful with what we merge to try and control the overall runtime
22:08:57 <mlavalle> that's all I have
22:09:06 <mtreinish> dkranz: if it becomes an issue we can just start tagging things as slow
22:09:21 <mtreinish> and add a new nonheat slow job
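The slow-tagging mtreinish mentions works because tags end up in the test id, so a job's selection regex can exclude them. A rough Python illustration (the test ids and the exact regex are illustrative, not the real job configuration):

```python
import re

# Tags appear in square brackets in the test id, e.g.
#   tempest.api.compute.test_servers.TestServers.test_create[gate,slow]
# A job can then exclude slow-tagged tests with a negative lookahead.
NOT_SLOW = re.compile(r'(?!.*\[.*\bslow\b.*\])(^tempest\.(api|scenario))')

test_ids = [
    'tempest.api.compute.test_servers.TestServers.test_create[gate]',
    'tempest.scenario.test_large_ops.TestLargeOps.test_many[gate,slow]',
]

# The normal job keeps only the untagged tests; a dedicated "slow"
# job would select the complement.
selected = [t for t in test_ids if NOT_SLOW.match(t)]
```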
22:09:30 <dkranz> mtreinish: True
22:10:06 <mtreinish> ok does anyone have anything else to bring up about neutron testing?
22:10:44 <mtreinish> ok then let's move on
22:10:55 <mtreinish> #topic When can we enable failing jobs with bogus log ERRORs (dkranz)
22:11:02 <mtreinish> dkranz: you're up
22:11:15 <dkranz> mtreinish: I put in a lot of work to get this feature in.
22:11:28 <dkranz> mtreinish: It got turned off due to unstable gate
22:11:37 <dkranz> Now bugs are creeping back in
22:11:50 <dkranz> That are ignored because they just show up in a log that no one looks at.
22:12:12 <mtreinish> dkranz: yeah it was causing a lot of nondeterministic failures at a time when things were already really unstable
22:12:13 <dkranz> sdague made a comment that we could turn this back on after icehouse-2
22:12:21 <dkranz> mtreinish: which is now
22:12:34 <mtreinish> dkranz: we can't do that now because of the oslo.messaging errors
22:12:35 <dkranz> mtreinish: but now we have all these errors again
22:12:40 <mtreinish> every run will fail
22:12:50 <dkranz> mtreinish: right, so how do we get that fixed?
22:12:52 <rahmu> dkranz: mtreinish: can you please briefly explain what this feature is about?
22:13:00 <dkranz> rahmu: ok
22:13:26 <mtreinish> rahmu: sure, dkranz wrote a script that goes through all the service logs and prints out the error messages after a gate run
22:13:38 <mtreinish> it used to fail the job if there was an error in any of the logs
22:13:44 <dkranz> rahmu: There are a lot of bugs that let tests seem to pass even though something is screwed up
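In spirit, the check being described is a scan over the service logs against a whitelist of known patterns: any ERROR line that matches nothing in the whitelist counts as a real failure. A minimal sketch (the log lines and whitelist entries here are made up; the real script and whitelist live on the d-g side):

```python
import re

# Known, ignorable error patterns (hypothetical entries).
WHITELIST = [
    re.compile(r'Instance .* not found'),
    re.compile(r'Timed out waiting for a reply'),
]

def unexpected_errors(log_lines):
    """Return ERROR-level lines that match no whitelist pattern."""
    errors = []
    for line in log_lines:
        if ' ERROR ' not in line:
            continue
        if not any(pat.search(line) for pat in WHITELIST):
            errors.append(line)
    return errors

log = [
    '2014-02-06 22:00:01 INFO nova.compute starting',
    '2014-02-06 22:00:02 ERROR nova.compute Instance 42 not found',
    '2014-02-06 22:00:03 ERROR nova.api Unexpected exception in extension',
]

# Only the non-whitelisted ERROR line would fail the job.
print(unexpected_errors(log))
```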
22:14:10 <dkranz> mtreinish: is fixing the oslo thing a priority for any one?
22:14:40 <mtreinish> I know sdague brought it up on the ML, but I haven't been paying too much attention to it
22:14:55 <rahmu> mtreinish: dkranz: I understand. Thanks :)
22:15:03 <dkranz> The "lot of work" was not to write the script but to make sure all existing errors were converged into a whitelist
22:15:21 <dkranz> A task we are stuck with again
22:15:38 <mtreinish> dkranz: yeah I understand, another idea that sdague and I were throwing around was to split the log checking into a separate bot
22:16:24 <dkranz> mtreinish: You mean make it not part of the gate?
22:16:37 <mtreinish> so instead of failing a run, it would look at the logs after jenkins reported the results and would leave another voting score (only +1 or -1)
22:16:51 <mtreinish> that way we wouldn't be hit by nondeterministic issues in the gate
22:17:00 <mtreinish> but we'd still get a -1 if there were errors in the logs
22:17:25 <dkranz> mtreinish: Will that cause any one to fix them?
22:17:59 <dkranz> mtreinish: A lot, if not most, of these errors are bugs.
22:18:07 <rahmu> mtreinish: will that be a blocking -1?
22:18:09 <mtreinish> dkranz: I don't know, we've never done something like that before
22:18:10 <dkranz> mtreinish: And they are easy to track down because you can see where they came from
22:18:26 <mtreinish> rahmu: no, it'd be a -1 like the third party testing
22:19:20 <clarkb> there is no such thing as a blocking -1
22:19:24 <clarkb> only -2 can block
22:19:34 <mtreinish> dkranz: yeah, it was just another approach to consider about doing this
22:19:39 <clarkb> (in the verified column)
22:19:54 <dkranz> mtreinish: Why can't we treat it like any other test failure, as we were doing?
22:20:07 <mtreinish> clarkb: isn't it any column?
22:20:25 <clarkb> mtreinish: well approved is only 0 or 1 so 0 is blocking
22:20:40 <mtreinish> clarkb: oh yeah that's true
22:21:04 <mtreinish> dkranz: it was more about splitting out the complexity from the one job I think
22:21:25 <dkranz> mtreinish: What complexity? I think we either care about this or we don't.
22:21:48 <dkranz> mtreinish: If I am the only one who really cares about it we should just drop the whole thing, no?
22:22:30 <mtreinish> dkranz: it's polluting the console log right now and everyone ignores it. So we have to climb the hurdle to get it failing jobs again
22:22:32 <dkranz> mtreinish: Having some bot is way more complex than what it was doing.
22:22:38 <mtreinish> this was just an idea for a middle ground
22:23:01 <dkranz> mtreinish: Only the oslo thing is polluting it. If we just fixed that there would not be a problem.
22:23:09 <dkranz> mtreinish: There wasn't a problem before.
22:23:28 <dkranz> mtreinish: And if we had left this on, the oslo thing never would have gotten in!
22:24:00 <mtreinish> dkranz: I don't think that oslo.messaging is the only issue right now. It's the biggest one definitely
22:24:08 <mtreinish> and things slipped in because the script was broken for a while
22:24:15 <mtreinish> (on the d-g side)
22:24:22 <dkranz> mtreinish: Anyway, I am ok with moving on now
22:24:45 <mtreinish> dkranz: ok
22:24:50 <dkranz> mtreinish: If we are going to get anywhere, sdague will have to send an email about it.
22:25:09 <mtreinish> #topic Criteria for accepting tests that cannot run normally in the gate (dkranz)
22:25:16 <mtreinish> dkranz: this one is yours too
22:25:51 <dkranz> mtreinish: There could be a lot of valuable tests that we could share, but we can't due to our policy of only accepting code that runs upstream
22:26:04 <dkranz> mtreinish: I just thought we should clarify exactly what that means
22:26:22 <dkranz> mtreinish: So folks can decide whether to try to submit tests upstream or do them downstream, which would be a shame.
22:26:36 <mtreinish> dkranz: it's not code that runs upstream, we need results for every review with the test running
22:26:43 <mtreinish> it can be from an outside test system
22:26:52 <mtreinish> like the 3rd party testing requirements in the other projects
22:27:09 <mtreinish> the issue is that if we don't exercise tests for everything they tend to bitrot very quickly
22:27:13 <dkranz> mtreinish: So vote on every commit
22:27:26 <dkranz> mtreinish: That is a very high bar
22:27:30 <mtreinish> it's the same reason we stopped accepting commits with skips
22:27:37 <mtreinish> dkranz: otherwise we don't know if things work or not
22:27:43 <mtreinish> and that's not a good position to be in
22:27:46 <dkranz> mtreinish: That may be ok for folks trying to get drivers into the code base
22:28:01 <mtreinish> right now we've got legacy tests in the tree like live migration that I've never run
22:28:07 <mtreinish> I have no idea if they work
22:28:19 <dkranz> mtreinish: I could not justify such a third-party system just to be able to submit my tests upstream
22:28:29 <dkranz> mtreinish: How about a compromise
22:28:42 <mtreinish> dkranz: it came up earlier this week with the multi-backend cinder
22:28:49 <dkranz> mtreinish: The tests can be run and reported on by a third party nightly
22:29:03 <dkranz> mtreinish: But if they stay broken for more than X time, they are removed
22:29:28 <mtreinish> dkranz: we tried that before with the nightly periodic all job
22:29:35 <mtreinish> no one ever looked at it
22:29:42 <dkranz> mtreinish: Not the remove part :)
22:30:01 <dkranz> mtreinish: That is the teeth
22:30:04 <mtreinish> dkranz: yeah after a few months I just ripped out all the tests that got run in the 'all' job but didn't run in the gate
22:30:17 <mtreinish> I think it was mostly whitebox
22:30:19 <dkranz> mtreinish: Right
22:30:37 <dkranz> mtreinish: And any one who cared enough to put the tests upstream would probably care enough to keep them working
22:30:48 <mtreinish> dkranz: it's a good idea but we'll have to be explicit about the policy
22:30:51 <dkranz> mtreinish: Just an idea
22:31:06 <mtreinish> and it'll require someone to watch it
22:31:19 <dkranz> mtreinish: ok, I'll send out some kind of proposal to the ml if it seems worthwhile
22:31:42 <mtreinish> dkranz: that hasn't been my experience. Things normally just get thrown over the fence
22:31:58 <mtreinish> dkranz: yeah bring this out to the ml
22:31:58 <dkranz> mtreinish: The test would be external so it is likely someone would be watching
22:32:08 <mtreinish> and maybe we'll follow up at summit
22:32:30 <dkranz> mtreinish: And one motivation for this is that our plan is to increase the functionality of upstream
22:32:46 <dkranz> mtreinish: So in the future multinode tests could maybe run, and if we do this they will be there
22:33:02 <dkranz> mtreinish: That's all for now
22:33:17 <mtreinish> dkranz: the issue with this though is the integrated gating, it's not just tempest that could break things
22:33:48 <mtreinish> dkranz: ok we can move on
22:34:01 <mtreinish> #topic Bugs
22:34:16 <mtreinish> Does anyone have any bugs that they think need some attention?
22:34:30 <mtreinish> or any high priority or critical bugs that need extra eyes on them?
22:35:43 <mtreinish> ok I guess there aren't any bugs today :)
22:35:49 <afazekas> https://review.openstack.org/#/c/71575/
22:35:55 <mtreinish> #topic Critical Reviews
22:36:01 <mtreinish> afazekas: good timing
22:36:11 <mtreinish> #link https://review.openstack.org/#/c/71575/
22:36:23 <mtreinish> does anyone else have any reviews that they'd like to get some eyes on?
22:36:53 <dkranz> https://review.openstack.org/#/c/65930/
22:37:09 <boris-42_> mtreinish #link https://review.openstack.org/#/c/70131/
22:37:13 <dkranz> https://review.openstack.org/#/c/71579/
22:37:14 <mtreinish> #link https://review.openstack.org/#/c/65930/
22:37:36 <boris-42_> mtreinish it is not tempest but it is related ..
22:37:40 <mtreinish> #link https://review.openstack.org/#/c/71579/
22:37:44 <mtreinish> boris-42_: that's fine
22:37:55 <mtreinish> I'll take a look at it probably tomorrow
22:38:12 <boris-42_> mtreinish could I share 2 blueprints around integration of rally & tempest?
22:38:46 <mtreinish> like share them between projects in lp? or right now in the meeting?
22:39:20 <boris-42_> mtreinish in the meeting, could we have some topic about integration.. I would like to be closer to the OpenStack QA team..
22:39:31 <boris-42_> mtreinish sorry didn't add it to agenda =(
22:39:39 <mtreinish> boris-42_: sure
22:39:49 <boris-42_> mtreinish thank you
22:39:49 <mtreinish> first does anyone else have reviews to bring up?
22:40:46 <mtreinish> ok I guess not
22:40:54 <mtreinish> #topic Rally tempest integration
22:40:59 <mtreinish> boris-42_: go ahead
22:41:13 <boris-42_> so there are 2 parts of integration
22:41:46 <boris-42_> first of all what is rally… small diagram https://wiki.openstack.org/w/images/e/ee/Rally-Actions.png
22:42:13 <boris-42_> so it is the tool that allows you to work with different clouds, verify them, deploy on (virtual) servers and benchmark
22:42:23 <boris-42_> (in future as well profile & analyze logs)
22:42:31 <boris-42_> it is done to simplify work for humans
22:42:37 <boris-42_> =)
22:42:58 <boris-42_> We are trying to reuse as much as possible from OpenStack and related projects
22:43:12 <boris-42_> e.g. one of the deploy engines is based on DevStack
22:43:29 <boris-42_> so there are 2 first points that are related to tempest
22:43:41 <boris-42_> 1. add some kind of pretty interface to tempest
22:43:55 <boris-42_> https://blueprints.launchpad.net/rally/+spec/tempest-verification
22:44:07 <boris-42_> So when you are working around nova e.g.
22:44:10 <boris-42_> you have cloud
22:44:17 <boris-42_> you would like to have some command like
22:44:39 <boris-42_> rally verify nova (that will run only tempest tests that are related to nova)
22:44:51 <boris-42_> and after something fails
22:44:53 <boris-42_> you are fixing it
22:45:02 <boris-42_> and would like first of all to run failed tests
22:45:14 <boris-42_> so run rally verify latest_failed
22:45:31 <boris-42_> as well you would like to keep results for some cloud somewhere (it will be Rally DB)
22:45:42 <mtreinish> boris-42_: that sounds like a wrapper around a lot of things in tempest already
22:46:11 <mtreinish> boris-42_: We try to service-tag tests so you can run with a regex filter, 'compute' for example, and that should run every test that touches nova
22:46:30 <boris-42_> mtreinish yep but I would like to simplify this step a bit
22:46:37 <boris-42_> mtreinish if it is already implemented in tempest great
22:46:41 <mtreinish> and testr already keeps a db (obviously a bit more simplistic than rally's) of runs with failed jobs
22:46:43 <boris-42_> mtreinish if it could be implemented, ok
22:46:46 <mtreinish> and information about them
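The pieces mtreinish describes already exist as plain test-id matching: service tags land in the test id, so a regex passed to the runner narrows the run, and testr's per-run records let `testr run --failing` replay only the failures. A small sketch of the selection half (the test ids are made up):

```python
import re

def filter_tests(test_ids, regex):
    """Narrow a run to test ids matching a user-supplied regex,
    roughly what passing a filter string to the runner does."""
    pattern = re.compile(regex)
    return [t for t in test_ids if pattern.search(t)]

# Made-up test ids; service tags appear in the bracketed suffix.
test_ids = [
    'tempest.api.compute.test_servers.TestServers.test_list[compute,smoke]',
    'tempest.api.network.test_routers.TestRouters.test_create[network]',
    'tempest.scenario.test_minimum.TestMinimum.test_boot[compute,network]',
]

# Everything tagged (or pathed) as compute, roughly what a
# "verify nova"-style wrapper would expand to.
compute_tests = filter_tests(test_ids, 'compute')
```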
22:47:26 <boris-42_> mtreinish I know but it is not simple enough for end users imho..
22:47:49 <boris-42_> mtreinish there is a lot of tasty things that could be added
22:48:07 <boris-42_> mtreinish one more time the goal is not to reimplement stuff (just unify & simplify interface)
22:48:20 <boris-42_> mtreinish and store all results related to a specific cloud somewhere
22:48:35 <mtreinish> boris-42_: ok, I just don't know if those 2 examples have to be rally specific
22:48:46 <mtreinish> they seem generally applicable to tempest and testr and improvements
22:49:13 <boris-42_> mtreinish we will try to implement all that is possible inside tempest
22:49:15 <mtreinish> I don't know if we should wrap things to add extra functionality
22:49:21 <mtreinish> boris-42_: ok
22:49:29 <boris-42_> mtreinish the idea is next
22:49:53 <boris-42_> mtreinish it is nice when you have one unified interface for all operations
22:50:17 <boris-42_> mtreinish I mean one command to add/deploy cloud
22:50:24 <boris-42_> mtreinish then another command to play with tempest
22:50:32 <boris-42_> mtreinish then third command to benchmark
22:50:42 <boris-42_> mtreinish and even after year you will have all results
22:50:54 <boris-42_> mtreinish in one place
22:51:48 <boris-42_> So the goal is to unify, make some pretty hooks for the most common commands, store results somewhere, and so on =)
22:52:37 <boris-42_> mtreinish as well as automating generation of the tempest config by passing the cloud's endpoints
22:53:24 <boris-42_> I know when you have worked with tempest for a while, it is just putting some info here and there and it works
22:53:47 <boris-42_> but when somebody is newbie to tempest it takes a while..
22:54:10 <boris-42_> So next thing is https://blueprints.launchpad.net/rally/+spec/tempest-benchmark-scenario
22:54:10 <mtreinish> boris-42_: we've actually been working on tooling to simplify that part of the problem
22:54:27 <boris-42_> mtreinish oh it will be nice if you share your results/blueprints/discussion
22:54:39 <boris-42_> mtreinish we will be glad to help you guys
22:55:06 <mtreinish> boris-42_: this is most recent one I'm working on: https://blueprints.launchpad.net/tempest/+spec/config-verification
22:55:32 <mtreinish> boris-42_: you definitely have a lot of info to share here, but we're running out of time
22:55:38 * boris-42_ mtreinish added to bookmarks
22:55:47 <mtreinish> and it feels like we need a larger discussion about rally and tempest
22:55:47 <boris-42_> so okay just a bit about benchmarking scenario
22:55:52 <rahmu> mtreinish: speaking of which. Can you give us a quick status about it and tell us what's left to do?
22:56:04 <rahmu> mtreinish: I'm talking about the config-verification bp
22:56:10 <mtreinish> boris-42_: can you take this to the ML
22:56:10 <boris-42_> mtreinish okay we can continue in mailing list probably?)
22:56:16 <boris-42_> mtreinish sure I will take
22:56:16 <mtreinish> with all the details
22:56:24 <mtreinish> boris-42_: great thanks
22:56:24 <boris-42_> mtreinish let us then prepare demo
22:56:39 <boris-42_> mtreinish we are always able just to put some code from rally into tempest/devstack
22:56:51 <boris-42_> actually we will be only glad to reduce code base of rally=)
22:57:24 <mtreinish> rahmu: sure I'm still working on adding all the extension detection. I've got a patch up for swift now
22:57:40 <mtreinish> then I still need to finish api version discovery for keystone, and cinder
22:57:57 <mtreinish> and add endpoint/service checking to it as well
22:58:08 <mtreinish> and I'm sure there are other optional features or things I'm missing
22:58:17 <mtreinish> but that's what I have on my mind right now for it
22:58:21 <mtreinish> boris-42_: ok cool
22:58:34 <boris-42_> mtreinish thank you for giving timeframe=)
22:58:54 <mtreinish> #topic open discussion
22:59:07 <mtreinish> actually there is only ~1 min left
22:59:11 <mtreinish> so let's end here today
22:59:13 <boris-42_> ^_^
22:59:20 <dkranz> bye
22:59:23 <mtreinish> if there was something we couldn't get to we can just pick it up on -qa
22:59:26 <rahmu> bye everyone
22:59:27 <mtreinish> thanks everyone
22:59:29 <mtreinish> #endmeeting