17:02:33 <andreaf> #startmeeting qa
17:02:35 <openstack> Meeting started Thu Mar 16 17:02:33 2017 UTC and is due to finish in 60 minutes. The chair is andreaf. Information about MeetBot at http://wiki.debian.org/MeetBot.
17:02:36 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
17:02:40 <openstack> The meeting name has been set to 'qa'
17:02:54 <andreaf> sorry my wifi crashed right before the meeting....
17:02:59 <andreaf> who's here today?
17:03:06 <blancos> o/
17:03:14 <andreaf> Today's agenda: #link https://wiki.openstack.org/wiki/Meetings/QATeamMeeting#Agenda_for_March_16th_2017_.281700_UTC.29
17:03:28 <oomichi> hi
17:03:34 <andreaf> hi
17:04:23 <tosky> hi
17:04:37 <andreaf> the agenda is a bit packed today, let's start, perhaps more will join
17:05:01 <andreaf> #topic actions from previous meeting
17:05:02 <andreaf> http://eavesdrop.openstack.org/meetings/qa/2017/qa.2017-03-09-09.00.txt
17:05:07 <andreaf> #link http://eavesdrop.openstack.org/meetings/qa/2017/qa.2017-03-09-09.00.txt
17:05:38 <andreaf> one action was for gmann to check on libvirt crashes
17:05:56 <andreaf> as far as I know they are still very much around
17:06:19 <andreaf> one action was for me to work on the goals, I'll talk about those in a minute
17:06:34 <andreaf> and that's about it
17:06:42 <jordanP> (hi)
17:06:45 <andreaf> #topic The Forum, Boston
17:07:07 <andreaf> so I just wanted to briefly talk about the forum in boston
17:07:40 <andreaf> as you may have seen in the ML we have a chance to come up with ideas for topics to discuss at the forum
17:07:52 <andreaf> I set up an etherpad for brainstorming: #link https://etherpad.openstack.org/p/BOS-QA-brainstorming
17:08:03 <andreaf> please put your thoughts in there :)
17:08:10 <oomichi> yeah, that is nice to get idea :) btw how much time can we use for forum?
17:08:16 <andreaf> the ML thread is #link http://lists.openstack.org/pipermail/openstack-dev/2017-March/114017.html
17:08:32 <oomichi> 2days or 3days?
17:08:41 <andreaf> oomichi: well there's no project dedicated time as far as I understand
17:08:57 <andreaf> oomichi: it's mostly about cross project discussions
17:09:42 <oomichi> andreaf: oh, ok. I am reading the mail on -dev
17:09:57 <andreaf> err, I got the wrong ML link sorry
17:10:30 <andreaf> ML link #link http://lists.openstack.org/pipermail/openstack-dev/2017-March/113459.html
17:11:18 <andreaf> well I need to check how many days
17:11:32 <oomichi> andreaf: you are already requesting slots on forum :)
17:11:47 <andreaf> also at the summit we will have onboarding sessions
17:11:55 <andreaf> onboarding #link https://etherpad.openstack.org/p/BOS-QA-onboarding
17:12:17 <andreaf> I setup an etherpad for ideas, we're going to have about 15min to present what we do to new contributors
17:12:35 <andreaf> I think it's quite important :)
17:12:52 <andreaf> we share a 90min session with infra / stable / release, so 15 min each
17:12:53 <oomichi> mtreinish is writing testing guideline on gerrit, that would be helpful for forum
17:12:54 <chandankumar> sorry i am late
17:13:01 <chandankumar> hello all!
17:13:12 <oomichi> chandankumar: hi :)
17:13:33 <andreaf> oomichi: heh sure but I think we need to have something more presentation like about what we do
17:13:45 <oomichi> #link https://review.openstack.org/#/c/439830/
17:13:48 <andreaf> oomichi: with a lot of links to the docs :)
17:14:23 <oomichi> andreaf: yeah, doc is important :)
17:14:24 <andreaf> we won't have time to get into anything detail, we just need to highlight the cool things we do and attract folks to come and join our team
17:14:37 <oomichi> ++
17:15:01 <andreaf> so again if you have ideas about this put them in the etherpad
17:15:20 <andreaf> moving on since we have a packed agenda
17:15:23 <andreaf> #topic OpenStack University
17:15:35 <chandankumar> oomichi: andreaf i have started putting thoughts on how to improve docs which put on etherpad soon
17:15:35 <andreaf> I wanted to take one moment to mention this
17:15:47 <andreaf> chandankumar: cool thanks
17:15:59 <oomichi> chandankumar: thanks :)
17:16:10 <andreaf> I think this is a really important initiative
17:16:27 <oomichi> chandankumar: which etherpad you will put ?
17:16:42 <andreaf> making a nice and welcoming experience for new contributors is key for OpenStack
17:16:55 <chandankumar> oomichi: i will put on etherpad.o.o , will pass you link tomorrow
17:17:05 <andreaf> #info gmann is the QA liaison for openstack university
17:17:10 <oomichi> chandankumar: nice, thanks!
17:17:15 <andreaf> thanks gmann for volunteering
17:17:25 <andreaf> but anyone can put their name in to help out
17:17:51 <andreaf> volunteers for openstack university #link https://wiki.openstack.org/wiki/OpenStack_University
17:18:06 <andreaf> there are sessions at the summit so every 6 months
17:18:10 <andreaf> but it's also an ongoing effort
17:19:18 <andreaf> #topic Gate status
17:19:27 <andreaf> so the gate is not so healthy still
17:19:45 <andreaf> one change for mysql tuning was merged and reverted quickly
17:20:16 <andreaf> Gate status #link https://goo.gl/ptPgEw
17:20:23 <andreaf> failures are still too many
17:20:46 <andreaf> so I think we may still have too much load during the API phase
17:20:58 <oomichi> oh, that seems bad..
17:21:09 <andreaf> I've been working on splitting API and scenario test, but the d-g patch is still up for review
17:21:33 <jordanP> the problem seems to be memory consumption
17:21:35 <andreaf> d-g patch for scenario job: #link https://review.openstack.org/#/c/442565/
17:21:51 <oomichi> oom-killer?
17:21:58 <jordanP> if we can't reduce our memory usage and increase the memory available we are stuck
17:22:37 <jordanP> andreaf, what's the purpose of your patch ?
17:23:00 <andreaf> jordanP: so I'm setting up a scenario only job
17:23:02 <jordanP> you plan to not run any scenarios by default ?
17:23:29 <andreaf> jordanP: so if we can take scenarios out completely from the other job and reduce concurrency that would help on memory impact
17:23:38 <jordanP> no, I doubt it
17:24:02 <andreaf> jordanP: well the patch with concurrency 3 seems to behave better
17:24:32 <andreaf> jordanP: and API tests with concurrency 4 create a lot of load because of some of the server and volume creations
17:24:43 <jordanP> it's only a small workaround that will give us, maybe, a couple of months
17:24:47 <andreaf> jordanP: it's one possible direction
17:25:07 <andreaf> jordanP: well it's many efforts going on in parallel
17:25:26 <andreaf> tuning mysql, removing deprecated api versions, reduce concurrency
17:25:40 <andreaf> checking for unneeded heavy tests
17:25:50 <jordanP> Tempest can't be the only one paying the price
17:25:53 <andreaf> debug failures
17:26:09 <andreaf> jordanP: I'm not sure what you mean by that
17:26:28 <andreaf> a lot of folks in the community are looking into this not only QA
17:26:48 <jordanP> well, the problem is not with the tests or with the concurrency but with the dozens of python services openstack spawns
17:27:03 <jordanP> and the apache2 workers etc...
17:27:45 <andreaf> jordanP: I don't think anyone has the final proof on what the cause is and there are many opinions
17:28:07 <andreaf> jordanP: there was a discussion on that in the ML as well
17:28:29 <andreaf> jordanP: but we need to do our best on our side to make sure we keep the SUT under a reasonable load
17:28:49 <andreaf> jordanP: and it's true that for a while we've been adding stuff with not much control
17:29:13 <jordanP> not sure who we mean by "we"
17:29:17 <jordanP> *you
17:29:33 <andreaf> jordanP: well I meant the QA team
17:29:49 <andreaf> jordanP: hitting the SUT as hard as we can is not very good
17:30:01 <jordanP> then I disagree, we've been running with the same concurrency for a really long time
17:30:34 <oomichi> it would be important to measure memory usages for each process at the time the problem happens
17:30:42 <jordanP> already done
17:30:49 <jordanP> we have mem_tracker.py
17:31:03 <oomichi> then which process consumes?
17:31:09 <jordanP> http://logs.openstack.org/65/442565/3/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b326e3e/logs/screen-peakmem_tracker.txt.gz
17:31:32 <andreaf> jordanP: so I think we need to tackle this from many different fronts
17:31:46 <jordanP> mysql is around 500MB, the sum of apache workers around 500M, the sum of python VMs around XGB
17:31:48 <andreaf> jordanP: and just saying it's not a QA problem won't get us anywhere
17:32:28 <andreaf> jordanP: but I value a lot the concurrent testing, so I want to be very careful with keeping it
17:32:50 <tosky> that's true, but what is the plan in case the services just kept growing their resource usage?
17:32:57 <jordanP> exactly
17:34:10 <andreaf> tosky, jordanP: well that must be kept under control as well for sure but it's out of my hands at least - I can only weigh in in the ML thread and discussions at PTG/ forum
17:34:15 <oomichi> cool, the log is gotten when the end of Tempest, right?
17:34:21 <andreaf> jordanP, tosky: that's for sure a good discussion topic for the forum
17:34:53 <oomichi> because MemAvailable seems much enough
17:35:14 <andreaf> jordanP: perhaps you can write down about that in our brainstorming etherpad
17:35:33 <jordanP> memavailable is always 8GB because those VM have 8GB of ram
17:36:33 <jordanP> I may miss the Forum
17:36:35 <andreaf> jordanP: can I give an action to you to sum up the info we have on this and restore the ML thread?
17:36:40 <andreaf> jordanP: too bad
17:36:53 <andreaf> but please still do contribute to the brainstorming
17:37:05 <jordanP> andreaf, yeah ok, I can restore the ML thread
17:37:07 <andreaf> we should talk about that
17:37:16 <jordanP> which one is it exactly ?
17:37:48 <andreaf> heh there were a couple of them right?
17:38:18 <jordanP> haha, possibly, ok I'll look for it/them
17:38:54 <andreaf> #action jordanP collect memory info about the gate and write to the ML
17:38:55 <andreaf> jordanP: thanks for looking into that
17:38:56 <andreaf> I think we should move on
17:39:44 <andreaf> #topic Pike Goals
17:39:53 <andreaf> so I looked into the Pike goals
17:40:08 <andreaf> for py35 #link https://etherpad.openstack.org/p/pike-qa-goals-py35
17:40:28 <andreaf> there are py35 unit tests missing on stackviz and little more
17:40:53 <andreaf> some python 3.5 to be setup in setup.cfg
17:41:09 <andreaf> I could not get a clear statement yet about integration tests in test projects
17:41:18 <andreaf> but I guess it would be good to have them as well
17:41:39 <andreaf> so patrole has non-voting dsvm jobs - but they are not py35
17:42:16 <andreaf> blancos: I guess that would be something for you ^^^
17:42:41 <oomichi> andreaf: as the first, we need to enable py35 for unittests, right?
17:42:54 <andreaf> but I think overall it's not worth setting up a spec, I planned to have a gerrit topic or so
17:43:09 <oomichi> #link https://governance.openstack.org/tc/goals/pike/python35.html#projects-with-unit-tests-voting seems to require us to do it
17:43:14 <andreaf> yes we have that everywhere but stackviz
17:43:36 <andreaf> oomichi: have a look at the etherpad
17:44:00 <andreaf> oomichi: I checked every single project in the QA group today
17:44:20 <oomichi> andreaf: cool, thanks
17:44:25 <andreaf> for wsgi #link https://etherpad.openstack.org/p/pike-qa-goals-wsgi
17:44:47 <andreaf> I already put up the governance patch #link https://review.openstack.org/446500
17:45:29 <andreaf> for wsgi basically is no-op
17:45:47 <andreaf> the only thing more or less related would be openstack health api
17:46:01 <andreaf> but that runs as wsgi app in openstack infra already
17:46:18 <andreaf> #topic Spec reviews
17:46:27 <andreaf> spec reviews #link https://review.openstack.org/#/q/status:open+project:openstack/qa-specs,n,z
17:46:54 <andreaf> is there anything on spec reviews?
17:46:56 <andreaf> I did not have a chance yet to look at destructive testing spec
17:47:41 <andreaf> #topic Tempest
17:48:19 <andreaf> any update on bugs?
17:48:59 <oomichi> this week is jwhite
17:49:04 <andreaf> yeah
17:49:18 <andreaf> and we need more candidates for upcoming weeks
17:49:37 <oomichi> yeah
17:51:16 <andreaf> ok if there's nothing urgent on Tempest let's move on
17:51:20 <andreaf> #topic Patrole
17:51:32 <andreaf> blancos: anything on patrole?
17:51:53 <blancos> I had a question about the number of gates
17:52:07 <blancos> Is there a limit? We were planning on adding one, possibly two more
17:52:09 <felipemonteiro_> We'll change our gates to py35, but right now we're trying to achieve stability..one is almost there, the other we're debugging
17:52:34 <andreaf> blancos: there's no formal limit
17:53:01 <andreaf> blancos: more gates means more contention on test resources but usually it's not an issue unless you have a really high number
17:53:50 <andreaf> but I would not worry about adding an extra integration job if it's needed
17:53:53 <jordanP> and more false negative
17:53:57 <clarkb> I think most projects end up causing problems for themselves with a bunch of jobs that don't pass reliably before they cause problems for the collective by using too many resources
17:53:59 <clarkb> jordanP: ya that
17:54:03 <jordanP> depends on your job stability
17:54:20 <blancos> Okay, thank you. I think that's about it for Patrole.
17:54:54 <andreaf> felipemonteiro_: thanks I look forward to stable dsvm jobs there
17:55:04 <andreaf> since time is short I will skip to reviews
17:55:07 <andreaf> #topic Critical Reviews
17:55:37 <andreaf> any critical review?
17:56:00 <andreaf> sounds like not?
17:56:02 <jordanP> https://review.openstack.org/#/c/426264/
17:56:12 <jordanP> not critical but just wanted to "share"
17:56:49 <jordanP> or https://review.openstack.org/#/c/445910/1
17:56:55 <andreaf> Interesting review: #link https://review.openstack.org/#/c/426264/
17:57:24 <andreaf> Also review about memory footprint: #link https://review.openstack.org/#/c/445910/1
17:57:35 <andreaf> jordanP thanks good to know :)
17:58:11 <andreaf> #topic Open discussion
17:58:14 <clarkb> jordanP: we can also tune the number of workers in apache itself (which affects memory too)
17:58:27 <clarkb> tls proxy did that unnecessarily in fact, I can push a revert up
17:58:55 <jordanP> it's worth a try, yes
17:59:06 <andreaf> clarkb: I wonder if not running TLS in every job might help?
17:59:14 <clarkb> andreaf: we don't run tls in every job
17:59:29 <clarkb> only the "base" jobs and I think it's more specialty jobs that OOM
17:59:30 <clarkb> ?
17:59:40 <andreaf> heh ok
18:00:00 <andreaf> ok we're at time
18:00:05 <andreaf> #endmeeting QA