22:00:14 #startmeeting qa
22:00:15 Meeting started Thu Feb 6 22:00:14 2014 UTC and is due to finish in 60 minutes. The chair is mtreinish. Information about MeetBot at http://wiki.debian.org/MeetBot.
22:00:16 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
22:00:20 The meeting name has been set to 'qa'
22:00:32 hi who do we have today?
22:00:33 Here
22:00:36 o/
22:00:39 hello
22:00:43 hello
22:00:59 #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
22:01:04 ^^^ today's agenda
22:01:12 hi
22:01:21 hi all
22:01:22 =)
22:01:29 hi
22:01:38 how are you guys?)
22:01:43 let's dive into it
22:01:54 #topic Blueprints
22:02:04 does anyone have a blueprint that needs attention?
22:02:14 or a status update on an open blueprint?
22:03:13 mtreinish: no :)
22:03:26 heh yeah I guess let's move on to the next topic
22:03:38 #topic Neutron testing
22:03:50 mlavalle: any updates on the status of things with neutron?
22:04:05 * boris-42_ enikanorov__ ping
22:04:06 api tests development has continued
22:04:19 boris-42_: queque
22:04:43 lots of good contributions. We have almost 100% coverage of the gap identified in the etherpad
22:04:57 boris-42_: great that you've just woken me up
22:04:58 oh wow are they all up for review?
22:05:18 they are all in some part of the review cycle
22:05:27 i wanted to ask tempest cores to look at the patch we have wanted for quite a long time:
22:05:31 I am doing about 4 reviews a day
22:05:44 oh
22:05:48 i see it got approved
22:05:51 https://review.openstack.org/#/c/58697/
22:05:59 thanks!
22:06:06 mlavalle: ok and is the neutron gate stabilized so we can start pushing things through?
22:06:08 yeah
22:06:14 enikanorov__: I was a little concerned about the runtime
22:06:28 mtreinish: It adds 45 seconds to the neutron run. What should we do about this?
22:06:39 salv-orlando reported good progress this past Monday
22:06:45 dkranz: the neutron run isn't that much of a concern on the time budget
22:06:48 mtreinish: We said we would focus on scenarios but by nature they can take some time
22:06:50 because they only run smoke
22:06:57 I understand they are still stabilizing this week
22:07:02 dkranz: it spawns a vm. i guess we could set up a backend on the host itself
22:07:11 mtreinish: Yes, but they are only running a few minutes faster...
22:07:12 and just add a route to the host from the tenant network
22:07:15 dkranz: it'll be an issue long term but for right now it's ok
22:07:34 mtreinish: sdague wanted us to watch out for this
22:07:37 I will ping him again today or tomorrow and pass the status along in the qa channel
22:07:43 mlavalle: ok cool
22:07:53 I know we merged the neutron api tenant isolation patch
22:07:56 mtreinish: ok, I will stop worrying and love the bomb
22:08:07 which previously broke things
22:08:23 we've made good progress getting people contributing and I don't want to lose their enthusiasm
22:08:33 dkranz: yeah sdague just said we need to be careful with what we merge to try and control the overall runtime
22:08:57 that's all I have
22:09:06 dkranz: if it becomes an issue we can just start tagging things as slow
22:09:21 and add a new nonheat slow job
22:09:30 mtreinish: True
22:10:06 ok does anyone have anything else to bring up about neutron testing?
22:10:44 ok then let's move on
22:10:55 #topic When can we enable failing jobs with bogus log ERRORs (dkranz)
22:11:02 dkranz: you're up
22:11:15 mtreinish: I put in a lot of work to get this feature in.
22:11:28 mtreinish: It got turned off due to the unstable gate
22:11:37 Now bugs are creeping back in
22:11:50 That are ignored because they just show up in a log that no one looks at.
22:12:12 dkranz: yeah it was causing a lot of nondeterministic failures at a time when things were already really unstable
22:12:13 sdague made a comment that we could turn this back on after icehouse-2
22:12:21 mtreinish: which is now
22:12:34 dkranz: we can't do that now because of the oslo.messaging errors
22:12:35 mtreinish: but now we have all these errors again
22:12:40 every run will fail
22:12:50 mtreinish: right, so how do we get that fixed?
22:12:52 dkranz: mtreinish: can you please briefly explain what this feature is about?
22:13:00 rahmu: ok
22:13:26 rahmu: sure, dkranz wrote a script that goes through all the service logs and prints out the error messages after a gate run
22:13:38 it used to fail the job if there was an error in any of the logs
22:13:44 rahmu: There are a lot of bugs that let tests seem to pass even though something is screwed up
22:14:10 mtreinish: is fixing the oslo thing a priority for anyone?
22:14:40 I know sdague brought it up on the ML, but I haven't been paying too much attention to it
22:14:55 mtreinish: dkranz: I understand. Thanks :)
22:15:03 The "lot of work" was not to write the script but to make sure all existing errors converged in a whitelist
22:15:21 A task we are stuck with again
22:15:38 dkranz: yeah I understand, another idea that sdague and I were throwing around was to split the log checking into a separate bot
22:16:24 mtreinish: You mean make it not part of the gate?
22:16:37 so instead of failing a run, it would look at the logs after jenkins reported the results and would leave another voting score (only +1 or -1)
22:16:51 that way we wouldn't be hit by nondeterministic issues in the gate
22:17:00 but we'd still get a -1 if there were errors in the logs
22:17:25 mtreinish: Will that cause anyone to fix them?
22:17:59 mtreinish: A lot, if not most, of these errors are bugs.
22:18:07 mtreinish: will that be a blocking -1?
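The log-checking feature discussed above (scan the service logs after a run, report any ERROR line that is not covered by a whitelist of known issues) can be sketched roughly as follows. This is an illustration of the idea, not the actual devstack-gate script; the function name and whitelist format are invented for the example.

```python
import re

def find_unlisted_errors(log_lines, whitelist_patterns):
    """Return ERROR log lines not matched by any whitelist regex.

    log_lines: raw lines from a service log.
    whitelist_patterns: regexes for known/ignorable errors.
    """
    allowed = [re.compile(p) for p in whitelist_patterns]
    unlisted = []
    for line in log_lines:
        if " ERROR " not in line:
            continue
        if any(rx.search(line) for rx in allowed):
            continue
        unlisted.append(line)
    return unlisted

# A run would fail (or get a -1 from the proposed bot) if any
# unlisted ERROR turns up:
logs = [
    "2014-02-06 22:11:00.123 INFO nova.compute starting instance",
    "2014-02-06 22:11:01.456 ERROR oslo.messaging timed out waiting for reply",
    "2014-02-06 22:11:02.789 ERROR nova.scheduler no valid host found",
]
whitelist = [r"oslo\.messaging timed out"]
unlisted = find_unlisted_errors(logs, whitelist)
```

The maintenance burden dkranz describes is exactly keeping `whitelist` converged: every new benign ERROR message needs a pattern added, while anything unlisted blocks (or flags) the run.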
22:18:09 dkranz: I don't know, we've never done something like that before
22:18:10 mtreinish: And they are easy to track down because you can see where they came from
22:18:26 rahmu: no, it'd be a -1 like the third party testing
22:19:20 there is no such thing as a blocking -1
22:19:24 only -2 can block
22:19:34 dkranz: yeah, it was just another approach to consider for doing this
22:19:39 (in the verified column)
22:19:54 mtreinish: Why can't we treat it like any other test failure, as we were doing?
22:20:07 clarkb: isn't it any column?
22:20:25 mtreinish: well approved is only 0 or 1 so 0 is blocking
22:20:40 clarkb: oh yeah that's true
22:21:04 dkranz: it was more about splitting out the complexity from the one job I think
22:21:25 mtreinish: What complexity? I think we either care about this or we don't.
22:21:48 mtreinish: If I am the only one who really cares about it we should just drop the whole thing, no?
22:22:30 dkranz: it's polluting the console log right now and everyone ignores it. So we have to climb the hurdle to get it failing jobs again
22:22:32 mtreinish: Having some bot is way more complex than what it was doing.
22:22:38 this was just an idea for a middle ground
22:23:01 mtreinish: Only the oslo thing is polluting it. If we just fixed that there would not be a problem.
22:23:09 mtreinish: There wasn't a problem before.
22:23:28 mtreinish: And if we had left this on the oslo thing never would have gotten in!
22:24:00 dkranz: I don't think that oslo.messaging is the only issue right now. It's the biggest one definitely
22:24:08 and things slipped in because the script was broken for a while
22:24:15 (on the d-g side)
22:24:22 mtreinish: Anyway, I am ok with moving on now
22:24:45 dkranz: ok
22:24:50 mtreinish: If we are going to get anywhere, sdague will have to send an email about it.
22:25:09 #topic Criteria for accepting tests that cannot run normally in the gate (dkranz)
22:25:16 dkranz: this one is yours too
22:25:51 mtreinish: There could be a lot of valuable tests that we could share but can't due to our policy of only accepting code that runs upstream
22:26:04 mtreinish: I just thought we should clarify exactly what that means
22:26:22 mtreinish: So folks can decide whether to try to submit tests upstream or do them downstream, which would be a shame.
22:26:36 dkranz: it's not code that runs upstream, we need results for every review with the test running
22:26:43 it can be from an outside test system
22:26:52 like the 3rd party testing requirements in the other projects
22:27:09 the issue is that if we don't exercise tests for everything they tend to bitrot very quickly
22:27:13 mtreinish: So vote on every commit
22:27:26 mtreinish: That is a very high bar
22:27:30 it's the same reason we stopped accepting commits with skips
22:27:37 dkranz: otherwise we don't know if things work or not
22:27:43 and that's not a good position to be in
22:27:46 mtreinish: That may be ok for folks trying to get drivers into the code base
22:28:01 right now we've got legacy tests in the tree like live migration that I've never run
22:28:07 I have no idea if they work
22:28:19 mtreinish: I could not justify such a third-party system just to be able to submit my tests upstream
22:28:29 mtreinish: How about a compromise
22:28:42 dkranz: it came up earlier this week with the multi-backend cinder
22:28:49 mtreinish: The tests can be run and reported on by a third party nightly
22:29:03 mtreinish: But if they stay broken for more than X time, they are removed
22:29:28 dkranz: we tried that before with the nightly periodic all job
22:29:35 no one ever looked at it
22:29:42 mtreinish: Not the remove part :)
22:30:01 mtreinish: That is the teeth
22:30:04 dkranz: yeah after a few months I just ripped out all the tests that got run in all that didn't run in the gate
22:30:17 I think it was mostly whitebox
22:30:19 mtreinish: Right
22:30:37 mtreinish: And anyone who cared enough to put the tests upstream would probably care enough to keep them working
22:30:48 dkranz: it's a good idea but we'll have to be explicit about the policy
22:30:51 mtreinish: Just an idea
22:31:06 and it'll require someone to watch it
22:31:19 mtreinish: ok, I'll send out some kind of proposal to the ml if it seems worthwhile
22:31:42 dkranz: that hasn't been my experience. Things normally just get thrown over the fence
22:31:58 dkranz: yeah bring this out to the ml
22:31:58 mtreinish: The test would be external so it is likely someone would be watching
22:32:08 and maybe we'll follow up at summit
22:32:30 mtreinish: And one motivation for this is that our plan is to increase functionality upstream
22:32:46 mtreinish: So in the future multinode tests maybe could run and if we do this they will be there
22:33:02 mtreinish: That's all for now
22:33:17 dkranz: the issue with this though is the integrated gating, it's not just tempest that could break things
22:33:48 dkranz: ok we can move on
22:34:01 #topic Bugs
22:34:16 Does anyone have any bugs that they think need some attention?
22:34:30 or any high priority or critical bugs that need extra eyes on them?
22:35:43 ok I guess there aren't any bugs today :)
22:35:49 https://review.openstack.org/#/c/71575/
22:35:55 #topic Critical Reviews
22:36:01 afazekas: good timing
22:36:11 #link https://review.openstack.org/#/c/71575/
22:36:23 does anyone else have any reviews that they'd like to get some eyes on?
22:36:53 https://review.openstack.org/#/c/65930/
22:37:09 mtreinish #link https://review.openstack.org/#/c/70131/
22:37:13 https://review.openstack.org/#/c/71579/
22:37:14 #link https://review.openstack.org/#/c/65930/
22:37:36 mtreinish it is not tempest but it is related ..
22:37:40 #link https://review.openstack.org/#/c/71579/
22:37:44 boris-42_: that's fine
22:37:55 I'll take a look at it probably tomorrow
22:38:12 mtreinish could I share 2 blueprints around the integration of rally & tempest?
22:38:46 like share them between projects in lp? or right now in the meeting?
22:39:20 mtreinish in the meeting, could we have some topic about integration.. I would like to be closer to the openstack QA team..
22:39:31 mtreinish sorry didn't add it to the agenda =(
22:39:39 boris-42_: sure
22:39:49 mtreinish thank you
22:39:49 first does anyone else have reviews to bring up?
22:40:46 ok I guess not
22:40:54 #topic Rally tempest integration
22:40:59 boris-42_: go ahead
22:41:13 so there are 2 parts of the integration
22:41:46 first of all what is rally… small diagram https://wiki.openstack.org/w/images/e/ee/Rally-Actions.png
22:42:13 so it is the tool that allows you to work with different clouds, verify them, deploy on (virtual) servers and benchmark
22:42:23 (in the future as well profile & analyze logs)
22:42:31 it is done to simplify work for humans
22:42:37 =)
22:42:58 We are trying to reuse as much as possible from OpenStack and related projects
22:43:12 e.g. one of the deploy engines is based on DevStack
22:43:29 so there are 2 initial points that are related to tempest
22:43:41 1. add some kind of pretty interface to tempest
22:43:55 https://blueprints.launchpad.net/rally/+spec/tempest-verification
22:44:07 So when you are working around nova e.g.
22:44:10 you have a cloud
22:44:17 you would like to have some command like
22:44:39 rally verify nova (that will run only the tempest tests that are related to nova)
22:44:51 and after something fails
22:44:53 you are fixing it
22:45:02 and would like first of all to run the failed tests
22:45:14 so run rally verify latest_failed
22:45:31 as well you would like to keep results for some cloud somewhere (it will be the Rally DB)
22:45:42 boris-42_: that sounds like a wrapper around a lot of things in tempest already
22:46:11 boris-42_: We try to service-tag tests so you can run with a regex filter, compute for example, and that should run every test that touches nova
22:46:30 mtreinish yep but I would like to simplify this step a bit
22:46:37 mtreinish if it is already implemented in tempest great
22:46:41 and testr already keeps a db (obviously a bit more simplistic than rally's) of runs with failed jobs
22:46:43 mtreinish if it could be implemented ok
22:46:46 and information about them
22:47:26 mtreinish I know but it is not simple enough for end users imho..
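The service tagging mtreinish mentions works roughly like this: each test is decorated with the services it touches, and a run can then be filtered down to one service. The sketch below is a simplified, hypothetical model of that idea, not tempest's actual decorator or runner:

```python
# Simplified model of tempest-style service tags: each test function
# records the services it exercises, and a selector picks tests by tag.
# Illustrative only -- not tempest's real implementation.
def services(*service_names):
    def decorator(func):
        func._services = set(service_names)
        return func
    return decorator

@services("compute")
def test_server_create():
    pass

@services("compute", "network")
def test_server_attach_interface():
    pass

@services("volume")
def test_volume_create():
    pass

def select_by_service(tests, service):
    """Return the tests tagged with the given service."""
    return [t for t in tests if service in getattr(t, "_services", set())]

all_tests = [test_server_create, test_server_attach_interface,
             test_volume_create]
compute_tests = select_by_service(all_tests, "compute")
```

The re-run-failures part of the workflow is what testr already provides: `testr run --failing` replays only the tests that failed in the previous run, from its local results repository.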
22:47:49 mtreinish there are a lot of tasty things that could be added
22:48:07 mtreinish one more time, the goal is not to reimplement stuff (just unify & simplify the interface)
22:48:20 mtreinish and somewhere store all results related to a specific cloud
22:48:35 boris-42_: ok, I just don't know if those 2 examples have to be rally specific
22:48:46 they seem generally applicable to tempest and testr as improvements
22:49:13 mtreinish we will try to implement all that is possible inside tempest
22:49:15 I don't know if we should wrap things to add extra functionality
22:49:21 boris-42_: ok
22:49:29 mtreinish the idea is this
22:49:53 mtreinish it is nice when you have one interface for all operations, all unified
22:50:17 mtreinish I mean one command to add/deploy a cloud
22:50:24 mtreinish then another command to play with tempest
22:50:32 mtreinish then a third command to benchmark
22:50:42 mtreinish and even after a year you will have all results
22:50:54 mtreinish in one place
22:51:48 So the goal is to unify, make some pretty hooks for the most common commands, store results somewhere and so on =)
22:52:37 mtreinish as well as automation of generation of the config for tempest by passing the endpoints of a cloud
22:53:24 I know when you have been working with tempest for a while it is just putting some info here and there and it works
22:53:47 but when somebody is a newbie to tempest it takes a while..
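Generating a tempest config from a cloud's endpoints, as boris-42_ proposes, amounts to writing discovered values into an ini file. A minimal sketch, assuming credentials and an auth URL are already known; the section and option names are modeled loosely on tempest.conf's [identity] section and the values are placeholders, not a complete or authoritative config:

```python
import configparser
import io

def build_tempest_config(auth_url, username, password, tenant_name):
    """Render a tempest-style ini [identity] section from cloud details.

    Illustrative only: a real generator would discover endpoints from
    the keystone service catalog and fill in many more sections.
    """
    cfg = configparser.ConfigParser()
    cfg["identity"] = {
        "uri": auth_url,
        "username": username,
        "password": password,
        "tenant_name": tenant_name,
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()

text = build_tempest_config(
    "http://keystone.example.com:5000/v2.0", "demo", "secret", "demo")
```

This is the "newbie" pain point in miniature: the information is trivial once you know where it goes, so a tool that asks only for an endpoint and credentials and emits the file removes most of the setup friction.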
22:54:10 So the next thing is https://blueprints.launchpad.net/rally/+spec/tempest-benchmark-scenario
22:54:10 boris-42_: we've actually been working on tooling to simplify that part of the problem
22:54:27 mtreinish oh it will be nice if you share your results/blueprints/discussion
22:54:39 mtreinish we will be glad to help you guys
22:55:06 boris-42_: this is the most recent one I'm working on: https://blueprints.launchpad.net/tempest/+spec/config-verification
22:55:32 boris-42_: you definitely have a lot of info to share here, but we're running out of time
22:55:38 * boris-42_ mtreinish added to bookmarks
22:55:47 and it feels like we need a larger discussion about rally and tempest
22:55:47 so okay just a bit about the benchmarking scenario
22:55:52 mtreinish: speaking of which. Can you give us a quick status about it and tell us what's left to do?
22:56:04 mtreinish: I'm talking about the config-verification bp
22:56:10 boris-42_: can you take this to the ML
22:56:10 mtreinish okay we can continue on the mailing list probably?)
22:56:16 mtreinish sure I will take it
22:56:16 with all the details
22:56:24 boris-42_: great thanks
22:56:24 mtreinish let us then prepare a demo
22:56:39 mtreinish we are always able just to put some code from rally into tempest/devstack
22:56:51 actually we will be only glad to reduce the code base of rally =)
22:57:24 rahmu: sure I'm still working on adding all the extension detection. I've got a patch up for swift now
22:57:40 then I still need to finish api version discovery for keystone, and cinder
22:57:57 and add endpoint/service checking to it as well
22:58:08 and I'm sure there are other optional features or things I'm missing
22:58:17 but that's what I have on my mind right now for it
22:58:21 boris-42_: ok cool
22:58:34 mtreinish thank you for giving a timeframe =)
22:58:54 #topic open discussion
22:59:07 actually there is only ~1 min left
22:59:11 so let's end here today
22:59:13 ^_^
22:59:20 bye
22:59:23 if there was something we couldn't get to we can just pick it up on -qa
22:59:26 bye everyone
22:59:27 thanks everyone
22:59:29 #endmeeting