17:00:34 <mtreinish> #startmeeting qa
17:00:34 <openstack> Meeting started Thu Jul 3 17:00:34 2014 UTC and is due to finish in 60 minutes. The chair is mtreinish. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:00:37 <openstack> The meeting name has been set to 'qa'
17:00:45 <mtreinish> Hi, who's here today?
17:00:50 <k4n0> o/
17:00:50 <asselin> hi
17:00:53 <mkoderer> o/
17:00:58 <mtreinish> #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Proposed_Agenda_for_July_3_2014_.281700_UTC.29
17:00:59 <salv-orlando> aloha
17:01:02 <mtreinish> ^^^ Today's agenda
17:01:19 <dkranz> o/
17:01:28 <jlanoux> hi
17:01:55 <mtreinish> ok let's get started
17:02:04 <mtreinish> #topic Spec review day July 9th
17:02:15 <mtreinish> #link http://lists.openstack.org/pipermail/openstack-dev/2014-July/039105.html
17:02:26 <mtreinish> so I'm not sure everyone saw that ML post
17:02:36 <mtreinish> but next Wed. we're going to have a spec review day
17:02:42 <mkoderer> ok
17:02:49 <mtreinish> the goal is to go through the backlog on the qa-specs repo
17:02:56 <mtreinish> which has been a pretty slow process
17:03:17 <mtreinish> so if everyone could concentrate on spec reviews then that would be awesome
17:03:59 <mtreinish> that's all I had on this topic. So unless someone has something else to add we can move on
17:04:14 <asselin> added to my calendar
17:04:37 <mtreinish> #topic Mid-cycle Meet-up full, registration closed
17:04:47 <mtreinish> #link http://lists.openstack.org/pipermail/openstack-dev/2014-July/039209.html
17:05:05 <mtreinish> so just another announcement that we have no more space available for the midcycle meetup
17:05:08 <mkoderer> mtreinish: there was one registration after you closed it ;)
17:05:17 <mtreinish> mkoderer: heh, yeah I noticed
17:05:30 <mtreinish> I'll talk to cody-somerville...
17:05:31 <dkranz> eventual consistency
17:05:36 <mkoderer> I put him on the list anyway
17:05:47 <mtreinish> yeah we actually had 1 free slot so it's ok
17:05:56 <mtreinish> so anyway now we really are full
17:06:14 <mtreinish> so if your name's not on the list unfortunately there isn't any room
17:06:38 <mtreinish> sorry if you're unable to attend
17:07:03 <mkoderer> next time we have to plan with more ppl ;)
17:07:23 <mtreinish> honestly I wasn't expecting to hit 30
17:07:32 <mtreinish> so I'm a bit surprised
17:07:52 <mkoderer> sure but it's a good sign :)
17:08:10 <mtreinish> mkoderer: no disagreement from me :)
17:08:14 <mtreinish> that's all I had for this topic. So unless someone has something else we can move on
17:08:46 <mtreinish> #topic Specs Review
17:08:57 <mtreinish> #link https://review.openstack.org/#/c/104576/
17:09:07 <andreaf> hi - sorry I'm late
17:09:12 <mtreinish> so I pushed this out this morning to drop a spec that was superseded
17:09:31 <mtreinish> but andreaf brought up the point: do we really want to just rm the rst files if a spec is abandoned
17:09:37 <mtreinish> or archive them somewhere in tree
17:10:06 <mtreinish> so we have a record of them if they ever need to be restored
17:10:06 <dkranz> mtreinish: Can't hurt much to move them to some abandoned dir
17:10:23 <mtreinish> yeah, I don't have a strong opinion either way, rm just seemed simpler
17:10:25 <dkranz> mtreinish: I doubt there will be a large number, at least I hope not
17:10:30 <mtreinish> yeah hopefully not
17:10:46 <dkranz> It can also be recovered from git though, right?
17:11:02 <mtreinish> dkranz: yeah getting it from git is as simple as a revert
17:11:10 <andreaf> mtreinish dkranz: yes it can also be recovered by git - it's just that a folder is more visible / accessible
17:11:28 <dkranz> andreaf: But realistically who will want to see it?
17:11:38 <mtreinish> dkranz: yeah that's what I was thinking
17:11:48 <dkranz> andreaf: Anyway, I don't have a strong opinion either
17:11:53 <mtreinish> if it's abandoned it means it's not going to be implemented
17:12:03 <mtreinish> and why do we want to archive it
17:12:19 <mkoderer> and the reason why it's abandoned is in gerrit?
17:12:20 <mtreinish> the implemented dir is different because it's pseudo documentation of features
17:12:22 <andreaf> dkranz mtreinish: so can a spec stay in review forever?
17:12:29 <dkranz> And it will bitrot anyway as things change.
17:12:53 <mkoderer> I would simply remove it...
17:12:54 <mtreinish> andreaf: yes, I'm not sure cores can force an abandon, just -2
17:13:03 <andreaf> I was thinking about specs which are interesting but abandoned because no one has time to work on them atm
17:13:17 <mtreinish> andreaf: that's the TODO thing again
17:13:41 <mkoderer> andreaf: ah, ok that can be useful
17:13:47 <mkoderer> like a backlog of specs
17:13:53 <mtreinish> andreaf: so there is probably a class of things like that, specs without owners
17:13:57 <dkranz> +1
17:14:02 <k4n0> I have a question on specs, do new tests have to be proposed via a spec or a bug?
17:14:03 <mtreinish> but that's different and it hasn't come up yet
17:14:21 <dkranz> k4n0: Not by bug
17:14:24 <mtreinish> k4n0: can we discuss that after the meeting in openstack-qa?
17:14:28 <k4n0> ok
17:14:31 <k4n0> thanks
17:15:12 <mtreinish> ok so I'm thinking the direction we should take with this case and others like it when the spec is superseded is to just rm the file
17:15:21 <andreaf> mtreinish: so I don't have a very strong opinion on an abandoned folder, I thought it could be useful for new people to look into that folder to pick specs
17:15:30 <mtreinish> and if there is a case where the owner can't continue the work we can revisit archiving them somewhere else
17:15:41 <andreaf> mtreinish: ok sounds good
17:15:53 <mtreinish> ok then let's move on to the next spec on the agenda
17:15:53 <dkranz> andreaf: There is a difference between abandoned and no one has done it yet.
17:16:11 <mtreinish> #link https://review.openstack.org/#/c/101232/
17:16:29 <mtreinish> I'm assuming yfried put this on the agenda
17:16:35 <mtreinish> but I'm not sure what he wanted to discuss
17:16:35 <dkranz> mtreinish: Yes
17:16:50 <dkranz> mtreinish: He wants another core to review it
17:17:00 <dkranz> mtreinish: You disqualified yourself as a co-author
17:17:26 <mtreinish> dkranz: this is a spec, not the addcleanup patch :)
17:17:42 <mtreinish> that bounced on a merge conflict after the +A
17:17:47 <dkranz> mtreinish: Oh, sorry
17:18:11 <dkranz> mtreinish: I guess he wanted some feedback, even if not a formal review
17:18:31 <dkranz> mtreinish: I think he is eager to get going but thinks it might be controversial
17:18:53 <mtreinish> ok, well I'll take a look at it. It sounds controversial from the commit summary
17:18:55 <andreaf> dkranz: I'll add it to my review list
17:19:06 <dkranz> mtreinish: thanks
17:19:24 <mtreinish> especially given how much work is involved in a major organizational refactor
17:19:43 <mtreinish> ok are there any other specs that people would like to discuss?
17:19:57 <dkranz> mtreinish: We have some movement on the tempest config script
17:20:21 <dkranz> mtreinish: I left two more comments but am almost ready to give my +2
17:20:29 <mtreinish> dkranz: yeah I saw, I need to take another pass at it too
17:21:01 <dkranz> mtreinish: I am going to also add that the discovery part should be its own module that can be shared with the verify script or anything else that needs it
17:21:27 <andreaf> dkranz: +1
17:21:49 <mtreinish> dkranz: that makes sense to me, you can just break it out of what's in the verify script
17:21:52 <mtreinish> and go from there
17:22:01 <mtreinish> but that should be an explicit work item then
17:22:15 <dkranz> mtreinish: right, that was going to be my additional comment which I will make right after the meeting
17:22:41 <mtreinish> ok, cool
17:22:55 <mtreinish> are there any other specs, otherwise let's move on
17:23:29 <mtreinish> #topic Blueprints
17:23:36 <mtreinish> #link https://blueprints.launchpad.net/tempest/
17:23:47 <mtreinish> does anyone who has an in-progress BP have a status update
17:24:14 <mtreinish> we're down 2 from last week, we marked branchless tempest as complete, although one work item was spun off as a separate spec
17:24:20 <dkranz> mtreinish: The UI would not let me change https://blueprints.launchpad.net/tempest/+spec/client-checks-success to Started
17:24:23 <mtreinish> and the nova v3 test refactor was dropped
17:24:27 <dkranz> mtreinish: Not sure why
17:24:42 <dkranz> mtreinish: At least I could not figure out how to do it
17:24:45 <mtreinish> dkranz: hmm, I just did it for you
17:24:59 <mtreinish> didn't seem to complain
17:25:03 <dkranz> mtreinish: weird
17:25:05 <mtreinish> but lp is weird
17:25:21 <mtreinish> it still drives me crazy that I can't get new bp notifications
17:25:41 <dkranz> mtreinish: anyway I would appreciate a review of https://review.openstack.org/#/c/104290/ because it touches a lot of files
17:25:53 <dkranz> though in a simple way
17:26:02 <dkranz> and I hope to avoid rebase issues :)
17:26:11 <andreaf> dkranz: I'll have a look
17:26:27 <dkranz> That is most of the identity portion of client checking
17:26:36 <dkranz> andreaf: Thanks
17:26:37 <jlanoux> dkranz: me too
17:26:55 <mtreinish> dkranz: ok, I'll take a look. That's just to move the resp code checks into the clients right?
17:27:56 <dkranz> mtreinish: Right, along with a bug fix in the same code
17:28:30 <mtreinish> ok, are there any other BPs to discuss?
17:28:55 <mtreinish> dkranz: hmm, what happened to one logical change per commit :)
17:29:31 <mtreinish> ok, let's move on
17:29:32 <dkranz> mtreinish: The problem was that the fix for the bug and the client check change overwrite exactly the same code. You will see.
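
(Editor's sketch of the "response code checks in the clients" pattern being discussed above, in Python. The class, method, and exception names here are illustrative assumptions, not taken from the actual Tempest review.)

    # Illustrative only: the client asserts the expected status code itself,
    # so individual tests no longer have to repeat the assertion.
    class UnexpectedResponseCode(Exception):
        pass


    class IdentityClient(object):
        def __init__(self, http):
            # http is any object whose get/post methods return (resp, body)
            self.http = http

        def _expect(self, expected, resp):
            # Centralised success check; previously each test carried its own
            # assertEqual(200, resp.status) style assertion.
            if resp.status != expected:
                raise UnexpectedResponseCode(
                    'Expected %d, got %d' % (expected, resp.status))

        def list_users(self):
            resp, body = self.http.get('/v2.0/users')
            self._expect(200, resp)
            return body

With the check inside the client, every caller gets the status verification for free, which is why the change touches many test files but in a simple way.
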
17:29:43 <mtreinish> ok, I was just giving you a hard time
17:29:47 <dkranz> mtreinish: :)
17:29:51 <mtreinish> #topic Grenade
17:30:04 <mtreinish> so I don't think sdague is around right now
17:30:15 <mtreinish> but there was a discussion of javelin2 on the ML
17:30:41 <mtreinish> and I suggested that we should avoid adding new features to javelin2 until we get it working in the grenade job
17:31:02 <mtreinish> I also know that EmilienM has been doing a bunch of work on adding other services to grenade
17:31:24 <mtreinish> but I haven't been following things that closely
17:31:45 <mtreinish> so unless anyone has something to add here we can just move on
17:32:42 <mtreinish> #topic Neutron testing
17:32:54 <mtreinish> salv-orlando: I'm sure you've got something for this topic :)
17:33:16 <salv-orlando> yes. Basically we have made some progress in making the full job voting.
17:33:32 <salv-orlando> in a nutshell the problem is not the job being "full" but rather being "parallel"
17:33:58 <salv-orlando> anyway, the top offenders have been identified, and we have patches for them.
17:34:21 <salv-orlando> However, I made a mess in one of them, as I did not identify the root cause correctly
17:34:32 <salv-orlando> that's the bad patch: https://review.openstack.org/#/c/99182/
17:34:43 <mtreinish> heh, well parallel is always the trouble spot...
17:34:49 <salv-orlando> I now have the correct root cause and will push a patch soon.
17:35:25 <salv-orlando> needless to say, these failures are related to the neutron/nova event mechanism, which is the biggest feature introduced since we 'fixed' parallel testing back in February
17:35:57 <salv-orlando> beyond these failures we have a few remaining issues, mostly around 'lock wait timeouts'
17:36:18 <salv-orlando> these issues affect both smoke and full jobs, but parallelism makes their frequency slightly higher.
17:36:39 <salv-orlando> We have people working on the 'lock wait timeout' issues, but I don't have a timeline for that.
17:37:09 <mtreinish> salv-orlando: ok, so do you think after the bugginess around the event mechanism is resolved it's time to flip the switch?
17:37:23 <mtreinish> and live with the lock wait timeouts for a while
17:37:30 <salv-orlando> mtreinish: in a nutshell, yes.
17:37:46 <mtreinish> ok sounds sane to me
17:37:50 <salv-orlando> also because the lock wait timeout error is not always critical, from a job success perspective
17:37:52 * afazekas hopes just fixing #1329546 will be enough for voting
17:38:12 <salv-orlando> afazekas: 1329546 is where I made the mistake in finding the root cause
17:39:00 <salv-orlando> There is a nova issue where in some cases the VM just boots and does not wait for an event, and I thought the opposite was also happening: the compute node waiting for an event that would never come
17:39:06 <salv-orlando> as you saw, this was not the case.
17:39:17 <mtreinish> salv-orlando: and on the full everywhere vs asymmetrical thing I'm fine either way. I'll bring it up during the project meeting next week
17:39:36 <salv-orlando> mtreinish: cool, so I'll change the config patch to make the job full everywhere
17:39:48 <salv-orlando> so I guess we'll put it directly on the integrated gate?
17:39:48 <mtreinish> just to see if there is a strong opinion either way
17:40:04 <mtreinish> salv-orlando: yeah if we go with it everywhere that'd be the best way to do it
17:40:21 <mtreinish> although I don't think all the projects are using the integrated-gate template
17:40:34 <mtreinish> so you might have to update it manually for a couple of projects
17:40:51 <salv-orlando> that's all for the neutron full job side from me. If we're lucky we can get these patches merged this week so the folks at the neutron code sprint next week will deal with the increased failure rate!
17:41:09 <afazekas> #link http://logstash.openstack.org/#eyJmaWVsZHMiOltdLCJzZWFyY2giOiJtZXNzYWdlOlwiZmFpbGVkIHRvIHJlYWNoIFZFUklGWV9SRVNJWkUgc3RhdHVzXCIgQU5EIG1lc3NhZ2U6XCJDdXJyZW50IHRhc2sgc3RhdGVcXDogcmVzaXplX2ZpbmlzaFwiIEFORCB0YWdzOlwiY29uc29sZS5odG1sXCIiLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsIm9mZnNldCI6MCwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MDQyODA3NTYxNDN9 This is one of the issue types which happens significantly more frequently in the full job
17:41:36 <marun> I have a topic for discussion before we move on from Neutron, if there's time.
17:41:44 <afazekas> Solving this issue might be enough for voting
17:41:58 <mtreinish> marun: sure
17:42:27 <salv-orlando> afazekas: that signature includes both bug 1329546 and 1333654
17:42:29 <uvirtbot> Launchpad bug 1329546 in nova "Upon rebuild instances might never get to Active state" [Undecided,In progress] https://launchpad.net/bugs/1329546
17:42:30 <uvirtbot> Launchpad bug 1333654 in nova "Timeout waiting for vif plugging callback for instance" [Undecided,In progress] https://launchpad.net/bugs/1333654
17:43:24 <salv-orlando> afazekas: you'll have to dig into the logs to see if the action fails because of an error while posting to instance_external_event
17:43:28 <salv-orlando> or because of a timeout
17:43:56 <salv-orlando> marun: go ahead. I'm done if nobody has anything to add on the full job.
17:44:10 <marun> ok
17:44:14 <marun> As you all know, nova network/neutron parity is mandated by the TC, and validating the work items requires multi-node testing
17:44:34 <marun> We know that it's not going to be possible to do multi-node in the gate in the near term, so we're left with 3rd party testing.
17:44:54 <marun> We still have to hash out who's going to provide the resources for 3rd party testing, but that's a separate concern.
17:45:32 <marun> I'd like to see Tempest accept multinode-requiring tests, with the proviso (as nova and neutron already require) that such tests are run by 3rd party jobs.
17:45:53 <mtreinish> marun: awesome if there is someone running a ci with multinode
17:45:57 <clarkb> marun: why is it not possible in the gate near term?
17:46:00 <mtreinish> that opens up all sorts of new testing
17:46:03 <marun> Failing that, we'll have to put multi-node scenario tests in the Neutron tree, and it will be harder to get good oversight from the tempest team.
17:46:14 <marun> clarkb: near-term -> in the next month?
17:46:23 <clarkb> marun: I mean we support multinode testing now
17:46:25 <marun> clarkb: I'm happy to be wrong :)
17:46:26 <clarkb> no one is using it
17:46:33 <marun> clarkb: wow, news to me.
17:46:38 <dkranz> clarkb: news to me too
17:46:40 <mtreinish> marun: the only requirement related to this for tempest is that code in new tests gets executed in a ci system
17:46:49 <mtreinish> because tempest is mostly self verifying
17:46:55 <sdague> marun: honestly, it's probably a week's worth of work for someone to make it do multinode devstack testing
17:47:00 <dkranz> clarkb: Is there a wiki page or something about it?
17:47:03 <sdague> but no volunteers
17:47:15 <clarkb> dkranz: no. no one has done anything with it so nothing to wiki
17:47:24 <marun> mtreinish: awesome. so whether we can take advantage of upstream or 3rd party, so long as we are running it, the tests can go in.
17:47:39 <marun> sdague: we'll have volunteers ;)
17:47:43 <dkranz> sdague: I will try to find a volunteer.
17:47:59 <mtreinish> marun: yep, we just don't want to land code that hasn't been run. Which is why we've been blocking things that require a multi-node env
17:47:59 <afazekas> How many nodes could we use in a multinode job?
17:48:12 <clarkb> afazekas: it's technically arbitrary but starting with 2 is probably easiest
17:48:16 <sdague> afazekas: honestly, start with 2
17:48:17 <andreaf> sdague, mtreinish: there are things that may not work in multinode, one that comes to mind is the log parsing - unless multinode uses some rsyslog server or so
17:48:20 <marun> mtreinish: totally understand, wouldn't have it any other way.
17:48:36 <sdague> andreaf: there are definitely things that have to be sorted
17:48:40 <mtreinish> andreaf: well that's something for whoever implements it to solve :)
17:49:05 <sdague> but the real issue is just that no one is taking advantage of the nodepool facility yet
17:49:07 <afazekas> sdague: I can configure devstack to work multi-node, but I do not know how it could be added to the gate
17:49:11 <andreaf> marun: so do you need 1 node as now + 1 compute or do you need to split the neutron control plane as well?
17:49:27 <dkranz> sdague: sounds like afazekas is a volunteer perhaps
17:49:29 <sdague> afazekas: right, that's where we need a person to dive into that
17:49:34 <marun> andreaf: more than 1 compute node
17:49:53 <afazekas> sdague: can we discuss it at the meetup?
17:49:57 <marun> andreaf: so that we can validate connectivity between VMs on multiple nodes and test nova ha/neutron dvr
17:49:59 <andreaf> marun: ok that's great - it would also allow us to run migration tests
17:50:06 <sdague> initial multinode configurations should be all services (including compute) on 1 node, and compute only on the 2nd node
17:50:08 <marun> andreaf: great!
17:50:10 <sdague> so you'll have 2 computes
17:50:50 <sdague> afazekas: sure, we should have all the right people there
17:51:07 <mtreinish> yeah it's probably a good topic for the meetup
17:51:12 <afazekas> ok
17:51:13 <mtreinish> I'll add it to the potential topic list
17:51:23 <clarkb> I think someone could start working on it now though
17:51:27 <andreaf> mtreinish: yes good idea :)
17:51:29 <mtreinish> clarkb: very true
17:51:32 <clarkb> I am not convinced the meetup is necessary to start the work
17:51:46 <clarkb> might be good to take the hard bits to the meetup
17:51:49 <afazekas> psedlak: ^
17:52:08 <dkranz> clarkb: Can you give enough info for someone to get started before then?
17:52:22 <andreaf> clarkb: so does nodepool already understand the concept of having more than one node associated to a job?
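
(Editor's sketch, in Python, of the kind of multinode-requiring test marun is proposing: one that guards itself so it only runs in jobs, for example 3rd party CI, that actually provide a second compute node. The config option and the helper methods are assumptions for illustration, not existing Tempest APIs; only the skip-guard pattern is the point.)

    import testtools

    from tempest import config
    from tempest.scenario import manager

    CONF = config.CONF


    class TestCrossHostConnectivity(manager.NetworkScenarioTest):
        """Check VM-to-VM connectivity across two compute hosts."""

        @testtools.skipUnless(
            CONF.compute_feature_enabled.run_multinode_tests,  # assumed option
            'multi-node tests are disabled')
        def test_vms_on_different_hosts_can_reach_each_other(self):
            # Boot one server on each compute host (e.g. via scheduler hints)
            # and verify they can reach each other over the tenant network.
            server_a = self.boot_server_on_host('compute-1')  # hypothetical helper
            server_b = self.boot_server_on_host('compute-2')  # hypothetical helper
            self.assert_ping(server_a, server_b)              # hypothetical helper

Single-node jobs would simply skip such a test, which matches the requirement that new test code only lands once some CI system actually exercises it.
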
17:53:08 <clarkb> andreaf: yes
17:53:15 <mtreinish> ok well we're at < 8 min left so can you guys take the multinode conversation to -infra or -qa after the meeting
17:53:22 <clarkb> yes please, to -infra
17:53:25 <andreaf> does it make sense to have a qa-spec or infra-spec for this?
17:53:26 <afazekas> clarkb: Are all the test nodes on the same network?
17:53:34 <clarkb> afazekas: we can talk about it in -infra
17:53:38 <mtreinish> ok let's move on
17:53:43 <mtreinish> #topic Critical Reviews
17:53:58 <mtreinish> so does anyone have any reviews that they'd like to get extra eyes on
17:54:47 <mtreinish> wow this must be a first, no one has any reviews that need extra attention :)
17:55:07 <adam_g> I've got a few up that add new compute feature flags, helpful for getting our ironic testing rolling in a more vanilla fashion
17:55:13 <mtreinish> andreaf: links?
17:55:18 <adam_g> sec
17:55:34 <mtreinish> oops stupid tab complete
17:55:34 <dkranz> mtreinish: I already mentioned mine :)
17:55:50 <adam_g> https://review.openstack.org/102628 https://review.openstack.org/101381 actually, I guess it's only one at this point but there will be more coming
17:56:15 <mtreinish> #link https://review.openstack.org/102628
17:56:35 <mtreinish> #link https://review.openstack.org/#/c/101381/
17:56:41 <adam_g> thanks :)
17:57:36 <mtreinish> ok if there aren't any other reviews I guess we can move on to a brief open discussion...
17:57:41 <mtreinish> #topic Open Discussion
17:58:13 <andreaf> mtreinish: we said at some point we would discuss how to use topics in reviews
17:58:21 <andreaf> mtreinish: but we never did until now
17:58:34 <mtreinish> we did? I don't recall that
17:58:46 <andreaf> mtreinish: I still think it would be helpful to have a way of filtering reviews based on the service
17:58:55 <andreaf> at the summit
17:59:08 <mtreinish> andreaf: well that's part of the point of having only one bp for a testing effort
17:59:17 <mtreinish> to group reviews by an effort
17:59:29 <mtreinish> the problem is that the burden is on the submitter
17:59:49 <mtreinish> and I'm not sure forcing a specific topic is something we want to nitpick on
18:00:19 <mtreinish> although I guess we do nitpick on the commit msg mentioning the bp
18:00:23 <mtreinish> anyway that's time
18:00:27 <mtreinish> thanks everyone
18:00:30 <mtreinish> #endmeeting