#openstack-meeting log

17:00:10 <mtreinish> #startmeeting qa
17:00:11 <openstack> Meeting started Thu Mar 13 17:00:10 2014 UTC and is due to finish in 60 minutes.  The chair is mtreinish. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:12 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:14 <openstack> The meeting name has been set to 'qa'
17:00:21 <sdague> o/
17:00:26 <mtreinish> hi who's here today
17:00:34 <julien-llp> hi everyone
17:00:50 <mtreinish> #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Proposed_Agenda_for_March_13_2014_.281700_UTC.29
17:00:55 <andreaf> hi
17:00:57 <mtreinish> ^^^ today's agenda
17:01:30 <dkranz> o/
17:01:43 <sdague> ok, let's run through the high priority blueprints
17:01:52 <mtreinish> #topic Blueprints
17:01:56 <mtreinish> yes, let's get started
17:02:03 <mtreinish> sdague: do you want to roll through the list?
17:02:06 <sdague> sure
17:02:10 <sdague> #link https://blueprints.launchpad.net/tempest/+spec/fix-gate-tempest-devstack-vm-quantum-full
17:02:38 <coasterz> Hi all ;)
17:02:46 <sdague> rossella_ put out details on the list about the bugs found
17:03:16 <sdague> I'm going to mark that good progress, and hope it gets closed soon
17:03:37 <sdague> #link https://blueprints.launchpad.net/tempest/+spec/multi-keystone-api-version-tests
17:03:55 <sdague> andreaf: where do we stand there? there is another review out, right?
17:03:55 <andreaf> sdague: so I split my patchsets further
17:03:59 <sdague> ok, great
17:04:06 <andreaf> #link: https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:bp/multi-keystone-api-version-tests,n,z
17:04:18 <andreaf> I have 5 patchsets there now
17:04:32 <andreaf> plus I have two patchsets for getting the experimental keystone v3 check
17:04:38 <sdague> ok, cool
17:04:50 <andreaf> #link https://review.openstack.org/#/c/79212/
17:04:56 <sdague> I think that remains good progress then
17:05:00 <sdague> #link https://blueprints.launchpad.net/tempest/+spec/nova-v3-api-tests
17:05:00 <andreaf> #link https://review.openstack.org/#/c/79314/
17:05:28 <sdague> I just marked v3 api tests as implemented. I think cyeoh is going to open a new blueprint for juno
17:05:38 <mtreinish> yeah that's what we discussed last week
17:06:06 <sdague> #link https://blueprints.launchpad.net/tempest/+spec/tempest-heat-integration
17:06:14 <sdague> we have heat-slow voting now, which is good.
17:06:51 <sdague> if stevebaker is around we could figure out if we should call that done, and create a new BP in juno, or move that one to juno.
17:07:03 <sdague> or anyone else that can speak for heat
17:07:19 <sdague> it looked like they were talking about a tempest day, or set of days in the heat meeting yesterday
17:07:45 <sdague> so I'd like to make sure that we consider new heat patches high priority for review in tempest to support them in that
17:08:06 <sdague> #info consider tempest patches for heat functionality high priority from review perspective
17:08:09 <mtreinish> do you want to info that?
17:08:20 <mtreinish> oops you win
17:08:22 <sdague> heh
17:08:26 <sdague> #link https://blueprints.launchpad.net/tempest/+spec/unit-tests
17:08:28 <dkranz> sdague: I think this has been the case for a while but there have not been many patches
17:08:34 <sdague> dkranz: agreed
17:08:39 <sdague> I'm hoping that changes
17:08:45 <dkranz> sdague: Right
17:08:55 <sdague> mtreinish: how do you want to handle the unit tests blueprint
17:08:59 <mtreinish> so unit tests is making progress
17:09:06 <mtreinish> but we don't really have a defined end point
17:09:19 <mtreinish> I think there are 4 or 5 patches in progress right now
17:09:58 <mtreinish> I think we probably should just leave the bp open until release
17:10:12 <mtreinish> just to make watching unit test reviews easier
17:10:33 <mtreinish> then at summit I want to have some discussions about project policy on unit test requirements
17:10:41 <sdague> sounds good
17:11:16 <sdague> that's all the high priority blueprints, anything else we want to bring forward
17:11:44 <mtreinish> well it'd be nice to get some eyes on my verify script bp
17:11:46 <afazekas> https://blueprints.launchpad.net/tempest/+spec/stop-leaking
17:11:51 <mtreinish> I've got a few patches in progress
17:12:27 <mtreinish> afazekas: how is the stop leaking bp progressing?
17:12:59 <afazekas> mtreinish: Started the new approach, which does not record the resources at the beginning of the run
17:13:15 <afazekas> https://review.openstack.org/#/c/78251/ currently it is failing
17:13:41 <afazekas> Two small nits related to this https://review.openstack.org/#/c/80280/ , https://review.openstack.org/#/c/78345/3
17:14:38 <sdague> afazekas: cool, is there a review up yet?
17:14:51 <afazekas> So the new style is to try to delete everything in the tenant before the tenant gets deleted
17:15:04 <dkranz> afazekas: +1
17:15:07 <afazekas> The tenant will be logged on creation to a database, or some kind of file
17:15:26 <afazekas> But the tenant deletion needs to be a post-process step
17:15:53 <dkranz> afazekas: Why can't we delete stray resources when the tenant is getting deleted?
17:16:01 <afazekas> The post process will also double check whether the tenant is really empty (for example, leaks on setUpClass failures)
17:16:36 <sdague> afazekas: so honestly, I'd like to fix the leaking problem by fixing the places where things leak
17:16:38 <afazekas> dkranz: the setUpClass failures are blind spots
17:16:40 <sdague> not just cleanup at the end
17:16:41 <dkranz> afazekas: It would be better if you had a high-level description of the overall plan
17:16:48 <dkranz> afazekas: Rather than just a set of patches
17:16:54 <dkranz> afazekas: Is that possible?
17:17:30 <sdague> on that front, what do people think about taking the nova approach for juno and having a blueprint gerrit repository
17:17:31 <afazekas> I will rewrite the BP, unless someone would like to see the original plan implemented
17:17:53 <sdague> afazekas: or start with a mailing list thread
17:18:02 <sdague> because I feel like we still have some disconnect here
17:18:07 <dkranz> sdague: That is exactly what yair and I proposed for the neutron stuff. I still like it.
17:18:10 <sdague> and I'd like to get us all on the same page
17:18:30 <sdague> dkranz: which thing that I said? :)
17:18:43 <dkranz> sdague: gerrit blueprints
17:18:53 <sdague> yeh
17:19:04 <afazekas> the setUpClass methods could be wrapped by a decorator, to call tearDownClass even on failure, if that is ok for everyone
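[Editor's sketch] The decorator idea mentioned by afazekas could look roughly like the following. This is only a minimal illustration assuming plain unittest classes; the name teardown_on_setup_failure and the ExampleTest class are hypothetical and are not part of tempest.

```python
import functools
import unittest


def teardown_on_setup_failure(setup_func):
    """Wrap a setUpClass body so tearDownClass still runs if it raises.

    Hypothetical helper: unittest itself skips tearDownClass when
    setUpClass fails, which is the blind spot discussed in the meeting.
    """

    @functools.wraps(setup_func)
    def wrapper(cls, *args, **kwargs):
        try:
            return setup_func(cls, *args, **kwargs)
        except Exception:
            try:
                cls.tearDownClass()
            except Exception:
                # Ignore cleanup errors so the original failure is reported.
                pass
            raise

    return wrapper


class ExampleTest(unittest.TestCase):
    cleaned = False

    @classmethod
    @teardown_on_setup_failure
    def setUpClass(cls):
        cls.server = "fake-server"  # pretend a resource was created
        raise RuntimeError("setup failed half-way")

    @classmethod
    def tearDownClass(cls):
        cls.cleaned = True  # pretend the resource was released
```

Calling ExampleTest.setUpClass() still raises the original RuntimeError, but the cleanup in tearDownClass has run by then.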
17:19:09 <mtreinish> sdague: yeah I think we can try that starting in juno
17:19:12 <dkranz> sdague: An important issue is whether we are managing resources as GC model or malloc/free
17:19:29 <sdague> dkranz: I think it's got to be malloc / free
17:19:31 <dkranz> sdague: If the former then tenant cleanup is not to catch bugs but expected
17:19:52 <dkranz> sdague: I don't 100% disagree but why?
17:20:03 <sdague> because otherwise we can run into races where we hit high water marks inside of tests
17:20:09 <sdague> we've actually had those races before
17:20:29 <dkranz> sdague: But we already to gc at the class level which is the same as the tenant isolation level
17:20:33 <dkranz> do
17:20:54 <sdague> tenant isolation is not going to work for lots of real environments
17:21:01 <afazekas> dkranz: With the new version the behavior on unexpected resources is configurable
17:21:12 <dkranz> afazekas: I see
17:21:26 <sdague> and if we turn it into just GC, then people won't fix the issues in the tests
17:21:30 <sdague> which are totally fixable
17:21:50 <sdague> so I am -2 on automated cleanup at the end of tests
17:21:56 <sdague> test classes / tenants
17:22:05 <sdague> because I want us to be doing this right in the code
17:22:22 <mtreinish> yeah we shouldn't auto cleanup just report the issues (or fail) if there is a leak
17:22:38 <afazekas> sdague: Are you ok with the clean + raise exception settings (default)?
17:23:00 <sdague> afazekas: for starting, I just want audit
17:23:17 <sdague> and once we get clean, I'm ok with failing if we have leaked resources
17:23:44 <sdague> but I don't want tempest doing failsafe brute force cleanup on class exit
17:23:53 <dkranz> sdague: I would be more comfortable if we did that
17:23:58 <sdague> because we know that not all resources in openstack are actually discoverable
17:24:14 <dkranz> sdague: that is unfortunate and I would call it a bug
17:24:14 <sdague> especially a bunch of the network ones
17:24:18 <sdague> dkranz: sure
17:24:19 <afazekas> On a successful run we do not leak much
17:25:00 <sdague> afazekas: so build an audit report at the end of classes somehow and pull that together
17:25:06 <sdague> that will help us fix things for real
17:25:07 <dkranz> sdague: Basically I think this is really important but we should decide the semantics up front
17:25:15 <sdague> sure
17:25:22 <dkranz> afazekas: sounds good
17:25:29 <sdague> so mailing list thread is fine at this point
17:25:45 <mtreinish> ok is there anything else on blueprints?
17:25:52 <mtreinish> otherwise let's move on to the next topic
17:25:59 <afazekas> Just logging is also configurable ..
17:26:22 <mtreinish> #topic Neutron testing
17:26:44 <mtreinish> mlavalle: are you around?
17:26:51 <mlavalle> hi
17:27:01 <mtreinish> any updates on neutron testing
17:27:10 <mlavalle> Reviews of api tests have continued
17:27:29 <mlavalle> I sent a message to the ML with 6 tests that were close to merge
17:27:51 <mlavalle> They all were reviewed by cores over the past 3 days, thank you :-)
17:28:06 <mlavalle> 2 of them merged and the others required changes
17:28:24 <mlavalle> we have merged 12 api tests over the past 3 weeks
17:28:39 <mlavalle> I have also identified 5 tests that are abandoned
17:28:54 <mlavalle> I am contacting the authors to see if they still have bandwidth
17:29:07 <mlavalle> if not, will reassign tests to someone else
17:29:15 <mlavalle> all in all, good progress
17:29:29 <mtreinish> mlavalle: ok cool thanks
17:29:37 <mlavalle> that's all I have
17:29:42 <andreaf> mlavalle: this is only slightly related to neutron testing, but I think it's worth mentioning
17:29:49 <mlavalle> will continue pushing api tests reviews
17:30:18 * mlavalle listening
17:30:24 <andreaf> mlavalle: we have a run_ssh flag in tempest, which is turned off by default, so a number of API tests are not doing ssh checks. But all new tests are ignoring it.
17:31:02 <dkranz> andreaf: You mean ignoring it and doing ssh checks anyway?
17:31:05 <andreaf> mlavalle: did you try turning that on in neutron devstack? I wonder if we should just remove the flag altogether
17:31:28 <mlavalle> andreaf: no, I haven't touched that flag
17:31:40 <andreaf> dkranz: yes
17:31:57 <afazekas> andreaf: after this https://review.openstack.org/#/c/54318/ , I will rebase this https://review.openstack.org/#/c/50337/
17:31:58 <andreaf> dkranz: it must be, as one of the main gate issues is ssh failures
17:32:15 <sdague> andreaf: so I believe that was because in some environments tempest can't route to the guests
17:32:51 <dkranz> Well obviously we need to be consistent about this
17:33:02 <sdague> honestly, I'd be ok with just removing the flag if it's getting ignored so much. How terrible are things if we try to default it true?
17:34:02 <dkranz> sdague: I'd be ok with removal too but defaulting to true does not address the issue
17:34:07 <andreaf> sdague: with nova networking we may need afazekas' patch to add floating IP to servers - but in a single node devstack not even that
17:34:25 <sdague> dkranz: defaulting true first tells us how terrible the gate would collapse if we did that
17:34:40 <sdague> it's more of a sniff to figure out if it's a truly terrible idea
17:34:47 <afazekas> sdague: the current ssh_check code expects fixed ip connection, and password injection, so it needs to be able to use at least floating_ip with neutron
17:34:47 <dkranz> sdague: oh, that issue. I was talking about random tests failing if you set it to false if they are not checking
17:35:08 <andreaf> sdague: I tried that a long time ago and it wasn't going too well, but the remote client has much improved since
17:35:27 <sdague> afazekas: it won't handle cloud-init key injection?
17:35:45 <sdague> ok, so honestly, I think this is probably a bigger issue than we want to bite off at this point in the release
17:36:00 <afazekas> sdague: I will rebase it https://review.openstack.org/#/c/54318/ soon, it is able to handle
17:36:10 <sdague> but I think we should have a summit session on this to make sure we have a solid approach
17:36:12 <dkranz> What use is a cloud if a "user" (tempest) cannot access the created vms?
17:36:16 <afazekas> sdague: the key injection is working
17:36:30 <dkranz> sdague: Are you suggesting we fix the tests that are ignoring the flag, or just ignore this issue for now?
17:36:31 <afazekas> but the password (file injection) is not
17:36:40 <sdague> dkranz: I suggest ignoring the issue for now
17:36:48 <afazekas> but the cirros has default passwd as well
17:36:51 <dkranz> sdague: works for me
17:37:07 <sdague> and once the release happens, we sort out a consistent approach for this
17:37:15 <dkranz> sdague: If someone fails due to not checking they may file a bug :)
17:37:21 <afazekas> sorry this one:  https://review.openstack.org/#/c/50337/
17:37:23 <sdague> yes, sure
17:37:34 <dkranz> sdague: But that should have happened already if it was going to
17:38:06 <sdague> yep, we can fix things as hot bugs right now, I just don't want to handle the whole problem, as we've got enough to worry about with the release
17:38:31 <andreaf> sdague: ok makes sense
17:38:49 <mtreinish> ok, I think we can move on to the next topic
17:38:49 <sdague> it would be good to get this cleaned up early in juno for sure though
17:38:51 <andreaf> sdague: but what shall we do for new tests?
17:38:58 <sdague> thanks andreaf for bringing it up
17:39:04 <sdague> andreaf: use the run_ssh flag
17:39:07 <sdague> as intended
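[Editor's sketch] "Use the run_ssh flag as intended" could be read as gating every ssh check on the flag, roughly as below. This is a simplified illustration only: ComputeConfig, verify_server, and the ssh_check callable are hypothetical stand-ins, not tempest's actual configuration or remote client API.

```python
from dataclasses import dataclass


@dataclass
class ComputeConfig:
    # Hypothetical stand-in for the relevant config section; the
    # meeting notes that the flag is turned off by default.
    run_ssh: bool = False


def verify_server(server_ip, conf, ssh_check):
    """Run the API-level check always, and the ssh check only if enabled.

    Returns the names of the checks that were performed.
    """
    performed = ["api"]  # the API-level validation always happens
    if conf.run_ssh:
        # Only reach into the guest when the deployment says it is routable.
        ssh_check(server_ip)
        performed.append("ssh")
    return performed
```

With the flag off, environments where tempest cannot route to the guests still get the API-level coverage.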
17:39:19 <mtreinish> #topic Heat testing
17:39:35 <mtreinish> so I think we covered this one in the bp topic
17:39:41 <mtreinish> sdague: unless you had something to add
17:39:52 <sdague> nope, all covered
17:39:57 <mtreinish> ok then let's move on
17:40:03 <mtreinish> #topic Bugs
17:40:25 <mtreinish> so I saw that maurosr sent an email out to the ML about the bug day
17:40:58 <mtreinish> I think he was proposing we have the bug day next Wed.
17:41:29 <mtreinish> does anyone have any issues with that date?
17:41:34 <sdague> nope, sounds good
17:41:57 <mtreinish> ok cool so hopefully next week he'll have a summary of the bug day for the meeting
17:42:22 <mtreinish> let's move on to the next topic then
17:42:41 <mtreinish> oh unless someone wants to raise attention on a bug
17:42:59 <andreaf> https://review.openstack.org/#/c/77602/
17:43:21 <mtreinish> andreaf: heh well I guess that's a cue for the next topic
17:43:25 <andreaf> mtreinish: nothing critical, it just needs a +A then I can close the bug, very tiny review I promise
17:43:30 <mtreinish> #topic Critical Reviews
17:43:38 <andreaf> mtreinish: ok sorry about that
17:43:38 <mtreinish> #link https://review.openstack.org/#/c/77602/
17:44:09 <mtreinish> so does anyone else have any reviews they would like to bring up?
17:44:23 <sdague> andreaf: lgtm
17:44:43 <andreaf> sdague: thanks
17:44:45 <sdague> andreaf: I did find an earlier patch ended up dropping all the admin tests
17:44:53 <sdague> so we lost 500 tests for a few days
17:45:10 <sdague> so just as an fyi, be careful in checking test counts in the runs with auth related patches
17:45:25 <sdague> this one looks right, still 2200 tests run in tempest-full
17:45:56 <sdague> it does make me think about the idea of instituting low water mark checking
17:46:07 <andreaf> sdague: oh, thanks for letting me know, I'll be more careful
17:46:20 <sdague> because we know that tempest full should run 2200 tests
17:46:27 <sdague> so if it goes below 2000
17:46:32 <sdague> something is wrong
17:46:55 <andreaf> sdague: sounds good, but you need to tune that based on your tempest.conf -> number of skips
17:47:10 <sdague> andreaf: yeh, more like for the gate
17:47:30 <sdague> I wouldn't implement it in tempest.conf, but in something else we call in the gate
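[Editor's sketch] The low water mark idea could be a small check run by the gate after tempest-full finishes, along these lines. The function name check_test_count is hypothetical; the 2200 and 2000 figures are the ones quoted in the discussion, and how the gate obtains the count of tests run is left out as an open assumption.

```python
def check_test_count(tests_run, low_water_mark=2000):
    """Fail loudly if fewer tests ran than the agreed low water mark.

    tests_run is assumed to come from the gate job's result summary.
    """
    if tests_run < low_water_mark:
        raise RuntimeError(
            "only %d tests ran, expected at least %d; "
            "a patch may have silently dropped tests"
            % (tests_run, low_water_mark)
        )
    return tests_run


# A normal tempest-full run per the meeting (about 2200 tests) passes.
check_test_count(2200)
```

A run that lost the roughly 500 admin tests would land near 1700 and trip the check, which is the kind of regression the check is meant to surface.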
17:48:07 <sdague> anyway, this is a good patch, thanks for doing all this disconnect work
17:48:12 <sdague> any other reviews?
17:48:35 <mtreinish> ok if there aren't any other reviews let's move on to the last topic on the agenda
17:48:52 <afazekas> https://review.openstack.org/#/c/75411/
17:49:14 <mtreinish> #link https://review.openstack.org/#/c/75411/
17:49:56 <afazekas> It fixes a low-chance random gate issue
17:50:04 <mtreinish> afazekas: ok, I'll take a look at it after the meeting
17:50:12 <mtreinish> unless someone beats me to it
17:50:29 <mtreinish> ok let's move on
17:50:33 <mtreinish> #topic Running tempest as non-admin (dkranz)
17:50:37 <mtreinish> dkranz: you're up
17:50:50 <sdague> afazekas: looks good
17:51:50 <mtreinish> dkranz: ???
17:52:39 <mtreinish> ok well if dkranz isn't around does anyone have anything to say about this topic?
17:52:42 <andreaf> dkranz, mtreinish: while we wait for dkranz, I'd like to comment on this
17:52:57 <mtreinish> andreaf: sure
17:53:04 <andreaf> mtreinish: as part of the multi-auth bp, I shall change tenant isolation to support v3
17:53:23 <andreaf> so it will be possible to create users and tenants within a domain
17:53:37 <andreaf> which only requires a domain admin, rather than an overall identity admin
17:53:53 <afazekas> sounds good to me
17:53:57 <andreaf> so with that in place it will be possible to get tenant isolation with a less powerful admin
17:54:07 <mtreinish> but you still need identity admin to create the domain
17:54:16 <mtreinish> oh nm I see what you're saying you specify a domain to run in
17:54:32 <andreaf> we can use the default domain which exists
17:54:36 <sdague> andreaf: sounds very useful
17:55:13 <sdague> I still think we should support a fallback of "specify users you want me to use" and limit running to that many processes
17:55:17 <andreaf> this is just one part of the story, there are many tests which rely on admin account for doing compute volume network stuff
17:55:17 <afazekas> The question is: is it enough for everyone who wants to run tempest as non-admin?
17:55:23 <mtreinish> yeah that'll be nice to enable running in parallel without an admin
17:56:18 <sdague> ok, so in the 5 minutes we have left, I want to just float the qa-specs gerrit repository idea for official
17:56:33 <mtreinish> sdague: sure
17:56:45 <mtreinish> #topic qa-specs gerrit repository
17:56:52 <sdague> because I think we've had at least 3 different topics just in this meeting which should get a design for a blueprint
17:57:26 <sdague> and I think there is no time like the present to try this new idea
17:57:35 <dkranz> I got booted out.
17:57:41 <dkranz> mtreinish: We don't have a way to run tempest as non-admin and skip all admin tests, right?
17:57:56 <sdague> so my suggestion is I get the repository set up
17:58:18 <sdague> and we start doing these sorts of things there in review format
17:58:36 <mtreinish> dkranz: don't specify an admin in the config and the tests should get skipped
17:58:39 <sdague> dkranz: that's not true, because in one of andreaf's refactorings the admin tenant was lost
17:58:40 <mtreinish> if they don't it's a bug
17:58:47 <sdague> and we skipped 500 tests
17:58:53 <mtreinish> sdague: yeah that sounds fine to me
17:58:59 <dkranz> mtreinish: ok, cool. I didn't know that.
17:59:44 <sdague> #info qa-specs repo to be created for blueprint design / discussion / approval
17:59:51 <mtreinish> sdague: well we're out of time...
18:00:03 <mtreinish> thanks everyone
18:00:05 <mtreinish> #endmeeting