19:02:20 <lifeless> #startmeeting tripleo
19:02:21 <openstack> Meeting started Tue Mar 25 19:02:20 2014 UTC and is due to finish in 60 minutes.  The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:02:22 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:02:23 <greghaynes> O/
19:02:24 <openstack> The meeting name has been set to 'tripleo'
19:02:26 <derekh> hi
19:03:02 <GheRivero> o/
19:03:18 <jprovazn> hi
19:03:19 <lifeless> #topic agenda
19:03:46 <lifeless> bugs
19:03:47 <lifeless> reviews
19:03:47 <lifeless> Projects needing releases
19:03:47 <lifeless> CD Cloud status
19:03:47 <lifeless> CI
19:03:49 <lifeless> Insert one-off agenda items here
19:03:51 <lifeless> open discussion
19:03:55 <lifeless> #topic bugs
19:04:04 <lifeless> #link https://bugs.launchpad.net/tripleo/
19:04:04 <lifeless> #link https://bugs.launchpad.net/diskimage-builder/
19:04:04 <lifeless> #link https://bugs.launchpad.net/os-refresh-config
19:04:04 <lifeless> #link https://bugs.launchpad.net/os-apply-config
19:04:04 <lifeless> #link https://bugs.launchpad.net/os-collect-config
19:04:06 <lifeless> #link https://bugs.launchpad.net/tuskar
19:04:09 <lifeless> #link https://bugs.launchpad.net/python-tuskarclient
19:05:19 <lifeless> so first up, untriaged stuff...
19:05:58 <lifeless> https://bugs.launchpad.net/diskimage-builder/ has 4 untriaged bugs
19:06:30 <lifeless> and so does https://bugs.launchpad.net/tripleo
19:06:45 <derekh> https://bugs.launchpad.net/diskimage-builder/+bug/1290541 is fixed, I believe
19:06:50 <lifeless> any suggestions for keeping on top of that other than saying 'we need to try harder' ?
19:08:18 <rpodolyaka1> a script that will collect untriaged bugs and post a list of those to #tripleo periodically?
19:08:37 <lifeless> rpodolyaka1: sounds cool. are you volunteering ?
19:08:42 <rpodolyaka1> lifeless: sure
19:08:45 <lifeless> \o/
19:08:53 <lifeless> #action rpodolyaka1 to make an untriaged nag bot
19:09:08 <lifeless> criticals...
19:09:12 <greghaynes> The dib bugs all have assignees, just no importance :/
19:10:15 <lifeless> greghaynes: that typically means self filed-and-assigned, which is where folk need guidance
19:10:21 <lifeless> greghaynes: e.g. its an antipattern all of its own
19:10:48 <matty_dubs> So, as someone not too actively working on TripleO, it occurs to me -- if a whole team of people is supposed to be doing triage, but no one is, that tells me that people really don't care about triage.
19:11:08 <matty_dubs> Or the problem would be self-correcting.
19:11:20 <lifeless> matty_dubs: or that triage is hard/annoying/painful - not doing doesn't imply not caring
19:11:35 <lifeless> matty_dubs: (see 'Switch' for citations on that)
19:11:53 <matty_dubs> Yeah, true enough.
19:12:12 <lifeless> erm, I mean 'not doing doesn't *only* imply not caring'
19:12:28 <jistr> i think the bot rpodolyaka1 mentioned would help a good bit. For me the most annoying thing about triage is looking if there are any untriaged bugs.
19:12:41 <jistr> e.g. having to look per-project
19:12:41 <lifeless> https://bugs.launchpad.net/tripleo/+bugs?search=Search&field.importance=Critical
19:12:48 <derekh> according to logstash we haven't had any occurrences of https://bugs.launchpad.net/tripleo/+bug/1292141 in ci since upping the pip timeout, we should close it and reopen another for a local pip cache (no longer critical)
19:12:48 <lifeless> jistr: yeah, its quite terrible
19:13:18 <lifeless> derekh: I saw a different error string, but still a pypi issue last night
19:13:39 <derekh> lifeless: any idea where? or what the error was?
19:13:41 <lifeless> derekh: broke my dib less cp's patch (which was broken for other reasons, but spurious fail is still a problem)
19:13:54 <lifeless> derekh: sec
19:14:31 <lifeless> https://review.openstack.org/#/c/82683/ http://logs.openstack.org/83/82683/1/check-tripleo/check-tripleo-seed-precise/1f5ff5b/console.html
19:14:34 <lifeless> 2014-03-25 01:25:40.327 |     data = self._sock.recv(left)
19:14:37 <lifeless> 2014-03-25 01:25:40.327 | error: [Errno 104] Connection reset by peer
19:15:44 <lifeless> derekh: so, I think we need more error strings for network / pip fails, and the bug is still really important - but if the data doesn't support that, sure, lets downgrade.
19:16:20 <lifeless> are 1293782 and 1295703 the same issue ?
19:17:01 <lifeless> ah the flow one is fixed
19:17:17 <derekh> lifeless: ok, will downgrade, the error quoted in the bug seems to be gone, or close and open a new one?
19:17:59 <slagle_> i submitted a couple reviews this morning for https://bugs.launchpad.net/tripleo/+bug/1270646
19:18:02 <lifeless> derekh: IMO given two bugs with the same cause and different symptoms we should dupe them :)
19:18:22 <derekh> lifeless: k
19:18:35 <lifeless> derekh: so I'm not sure there is any benefit shuffling the metadata around - we depend on a network resource that isn't reliable - thats the bug
19:18:36 <slagle_> i think we should really consider just making the default mtu be 1400 in the dnsmasq options for neutron-dhcp-agent
19:18:39 <slagle_> on the overcloud
19:18:43 <slagle_> it fixes the issue for me
19:19:59 <derekh> lifeless: yes but this particular problem wasn't specific to our network (or at least may not be), I put a similar fix into devstack and it seems to have gotten rid of the bug there also, anyways getting sidetracked
19:20:52 <lifeless> slagle: I'm torn. Its clearly not a deployment caused bug. So yes, doing the workaround is appropriate.
19:21:13 <lifeless> slagle: OTOH OMG WTF ARE THEY THINKING, DONT MESS WITH THE END TO END SIGNALLING MECHANISMS!
19:21:20 <slagle> i don't feel like i have enough networking expertise to say that using 1400 is "right" thing
19:21:39 <slagle> but...all the solutions i can find online just say to use it and be done with it
19:21:56 <lifeless> lets talk after the meeting briefly
19:22:05 <slagle> could just be collective ignorance :)
19:22:12 <lifeless> what about 1290490 ?
19:22:22 <lifeless> and 1292141 ?
19:22:32 <lifeless> ah 141 is what derekh mentioned
19:22:53 <GheRivero> downgrading the MTU is a must as it's now
19:24:21 <lifeless> I've retitled 141
19:25:40 <lifeless> derekh: ^
19:26:05 <derekh> lifeless: cool, ok to downgrade to hig? since its not as common as it was ?
19:26:12 <derekh> *high
19:26:41 <lifeless> derekh: sure
19:26:50 <lifeless> I'd like to downgrade https://bugs.launchpad.net/tripleo/+bug/1290490 too
19:26:55 <lifeless> any objections ?
19:28:57 <derekh> lifeless: looks like it happens about 3 times a day, interestingly in the last week mostly happened on the 21st
19:29:12 <lifeless> 1290490 ?
19:29:20 <derekh> lifeless: yup
19:30:17 <derekh> lifeless: anyways no objection from me, just info
19:30:53 <lifeless> huh, I didn't realise 1290490 was hitting CI
19:31:01 <lifeless> if it is, I think it is still very important
19:31:05 <lifeless> derekh: got pointers?
19:31:26 <lifeless> ok what about https://bugs.launchpad.net/tripleo/+bug/1271344 ?
19:31:35 <derekh> lifeless: I'll refine my search later to ensure I'm searching on the correct thing
19:32:49 <lifeless> I think we can downgrade https://bugs.launchpad.net/tripleo/+bug/1271344 - its very important but not breaking CI at the moment, right ?
19:34:03 <derekh> lifeless: correct, at least I haven't seen it
19:34:07 <lifeless> ok
19:37:05 <lifeless> ok, time to move on
19:37:18 <lifeless> we've touched on all the criticals for which people are here, I believe.
19:37:21 <lifeless> #topic reviews
19:37:29 <lifeless> http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:37:42 <lifeless> 
19:37:42 <lifeless> Stats since the last revision without -1 or -2 :
19:37:42 <lifeless> Average wait time: 5 days, 6 hours, 2 minutes
19:37:42 <lifeless> 1st quartile wait time: 0 days, 21 hours, 34 minutes
19:37:42 <lifeless> Median wait time: 3 days, 23 hours, 44 minutes
19:37:44 <lifeless> 3rd quartile wait time: 5 days, 23 hours, 7 minutes
19:37:54 <lifeless> we're in trouble :(
19:38:09 <lifeless> any thoughts on why?
19:38:27 <derekh> lifeless: lots more people on the team and no more cores
19:38:38 <slagle> +1
19:38:41 <derekh> lifeless: oh and I could do better too
19:39:04 <lifeless> http://russellbryant.net/openstack-stats/tripleo-reviewers-30.txt if we want to look at reviewer activity
19:39:33 <lifeless> only 6 people doing more than 3 reviews a day on average
19:39:41 <Ng> I've fallen way behind where I was :/
19:39:44 <lifeless> derekh: # of cores doesn't affect the stats above
19:40:10 <slagle> maybe we should have a review day or afternoon
19:40:14 * slagle steals idea from ironic
19:40:15 <lifeless> derekh: because - if the patch is ready, cores can rubber stamp and go very fast
19:41:15 <lifeless> slagle: maybe!. I'd be quite keen on that if it looked like we're overloaded, but what the stats above say to me is that we've collectively stopped pushing as a community
19:41:29 <lifeless> also
19:41:37 <lifeless> a lot of the new contributors are not yet reviewing consistently
19:42:09 <lifeless> like 1 review or so, and (from what I've seen) primarily on patches they need (e.g. from colleagues fixing something affecting them)
19:42:10 <devananda> slagle: that we have review days is a symptom of cores not reviewing steadily IMHO
19:42:16 <lifeless> which isn't a bad thing
19:42:21 <andreaf> lifeless: e.g. I started contributing and I hope to be reviewing soon, but it takes a bit of time to get up to speed
19:42:23 <lifeless> but its not the thing we need
19:42:28 <lifeless> andreaf: hey - cool!
19:42:41 <devananda> slagle: and so when we have external deadlines, it takes a concerted effort to push through the backlog, which is not ideal
19:42:44 <lifeless> andreaf: please do. Note that the fastest way to get feedback on your reviews is to start commenting, right or wrong.
19:43:18 <slagle> ok, fair enough
19:43:20 <derekh> lifeless: I disagree, people have to spend time on a review to know its ready... anyways we also have a backlog from when ci was busted which never got cleared
19:44:04 <lifeless> derekh: point taken, and I'll send that mail asap :)
19:44:20 <lifeless> once I sort more beaureaucrap @ work
19:44:35 <bnemec> It's a tricky project to review properly.  Lots of interdependencies between git repos, lots of new (to me) components, and no unit tests so you have to verify logic in addition to everything else.
19:45:12 <greghaynes> stackalytics graph for review rate seems pretty steady lately
19:45:16 <lifeless> bnemec: thats interesting! perhaps we need a guide explaining things a bit more? To me, its easy to review because we know 'if you change a public API, you'll break something'
19:45:45 <bnemec> lifeless: Right, but entire elements in dib and t-i-e are completely untested in the gate.
19:45:57 <bnemec> We're working on Fedora, but even with that we won't be hitting everything.
19:46:10 <lifeless> bnemec: yes, I know - long way to go
19:46:37 <lifeless> more stats
19:46:47 <lifeless> Total reviews: 2176 (72.5/day)
19:46:48 <lifeless> Total reviewers: 82 (avg 0.9 reviews/day)
19:46:48 <lifeless> Total reviews by core team: 1174 (39.1/day)
19:46:48 <lifeless> Core team size: 20 (avg 2.0 reviews/day)
19:46:48 <lifeless> New patch sets in the last 30 days: 1343 (44.8/day)
19:47:12 <lifeless> cores are doing 40 reviews/day and 44 patch sets are being pushed a day
19:47:28 <lifeless> but
19:47:29 <lifeless> Queue growth in the last 30 days: 64 (2.1/day)
19:47:33 <lifeless> so we're falling behind
19:47:42 <lifeless> what if we ask all cores to do *one more review a day*
19:48:06 <lifeless> thats 16 more reviews a day
19:48:14 <lifeless> which could in principle land 8 changesets a day
19:48:30 <lifeless> and tip things back in the right direction - can everyone here commit to 1 more review a day ?
19:48:46 * derekh doesn't count them but will try
19:48:47 <slagle> so what is that? let's be explicit
19:48:53 <slagle> 2 reviews a day?
19:48:58 * jistr will try
19:49:10 <bnemec> There's some pretty low-hanging fruit out there too.  For example: https://review.openstack.org/#/c/80337/
19:49:29 <lsmola> lifeless, I will try
19:49:57 <lifeless> slagle: sure, lets say that.
19:50:04 <lifeless> lsmola: thanks
19:50:33 <lifeless> #action lifeless to propose minimum of 2 reviews/day commitment from core reviewers in tripleo
19:50:40 <andreaf> bnemec: more low hanging fruit: https://review.openstack.org/#/c/81813/ and https://review.openstack.org/#/c/82035/
19:50:43 <Ng> I will do
19:50:54 <jprovazn> ack
19:51:56 <lifeless> we need to roll
19:51:59 <lifeless> #topic
19:51:59 <lifeless> Projects needing releases
19:52:04 <lifeless> #topic Projects needing releases
19:52:08 <lifeless> rpodolyaka1: you up for this?
19:52:42 <rpodolyaka1> lifeless: yep!
19:52:45 <lifeless> \o/
19:52:52 <lifeless> #topic CI cloud status
19:53:15 <lifeless> HP region is ok, tripleo-cd still paused, but SpamapS thinks we can unpause soon with the new heat shiny
19:53:28 <lifeless> Redhat region I believe is ready to add to the CI system \o/
19:53:37 <lifeless> anyone have more to add ?
19:53:38 <Ng> \o/
19:53:50 <derekh> lifeless: patch is ready, I've been holding off on getting the nodepool vm's back to 8G
19:53:50 <lifeless> #topic CI
19:54:05 <lifeless> derekh: ack, because of machine size in that region ?
19:54:28 <lifeless> derekh: andreaf: tempest - I think tempest should just run from the slave like it does for devstack
19:54:29 <derekh> lifeless: well because of the number of them 64G x 3 in overcloud
19:54:32 <dprince> lifeless: machine size and number of machines
19:54:40 <lifeless> dprince: ack
19:54:45 <devananda> lifeless: hi! on CI, I'd like to bring up a question
19:54:50 <lifeless> devananda: shoot!
19:55:00 <devananda> lifeless: specifically, how soon can Ironic start relying on tripleo-ci
19:55:05 <lifeless> derekh: andreaf: I mailed the -dev list about tempest.
19:55:12 <andreaf> lifeless: so we can't use an element to configure tempest then?
19:55:16 <devananda> y'all are already posting -nv checks on our patch sets, which is great
19:55:35 <lifeless> andreaf: I think you'll get worse results
19:56:07 <devananda> also, as a non-integrated project, afaik, it's OK for tripleo to vote on ironic... and I think I trust you guys ;)
19:56:07 <dprince> derekh: we'll get more machines in the future so lets go for it now (regardless of size)
19:56:09 <lifeless> andreaf: you can and may want to do for prod deploys, but CI is resource limited, we need to be judicious where we run stuff
19:56:24 <derekh> lifeless: yup, saw it, either is good with me, just thought it would be good to reuse what we have from seed but whatever
19:56:55 <lifeless> derekh: andreaf: you guys should do whatever works best, was really just getting my thoughts out there where you can see them :)
19:57:17 <andreaf> lifeless: ok so we need some alternate lib/tempest to set-up tempest for tripleo kind of environment - or some more ifs in the existing one
19:57:18 <lifeless> devananda: CI for ironic with tripleo: https://bugs.launchpad.net/ironic/+bug/1297063 this bug
19:57:38 <lifeless> devananda: lists *all* the outstanding patchsets needed to be running Ironic properly in check
19:57:41 <derekh> dprince: ok, will push it up later
19:57:51 <lifeless> devananda: one each in Ironic, tripleo-incubator, image-elements, heat-templates
19:58:11 <lifeless> devananda: we get those four landed, and you should see failures such as the one reported in that bug
19:58:23 <lifeless> devananda: w.r.t. voting - once we're multi region we'll start turning the voting bit on.
19:58:31 <lifeless> with infra's cooperation
19:58:45 <lifeless> andreaf: really be guided by derekh here
19:58:48 <dprince> lifeless: once we are multi-region and stable
19:58:53 <lifeless> andreaf: he's spent way more time on it than I
19:59:02 <devananda> lifeless: voting globally vs. voting on ironic are different topics
19:59:21 <derekh> on fedora ci runs we need https://review.openstack.org/#/c/82562/ and https://review.openstack.org/#/q/status:open+branch:master+topic:add-f20-jobs,n,z
19:59:21 <lifeless> devananda: lets follow that up when its actually something we can do
19:59:27 <devananda> lifeless: ack
19:59:28 <andreaf> lifeless: ok that's fine at least I've got a clear direction on where to focus on now, thanks
20:00:24 <lifeless> #topic open discussion
20:00:32 <lifeless> 30 seconds
20:00:40 <devananda> lifeless: do you need anything from ironic folks for the tie/tht/t-i bugs?
20:00:48 <lifeless> devananda: reviews :)
20:00:55 <devananda> ack
20:01:32 <derekh> crap, time flies on these meetings
20:03:06 <lsmola> thanks guys, have a great week
20:03:14 <jistr> thanks, see ya
20:05:08 <lifeless> #endmeeting