19:01:05 #startmeeting tripleo
19:01:06 hi
19:01:06 Meeting started Tue Oct 22 19:01:05 2013 UTC and is due to finish in 60 minutes. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:01:07 hi
19:01:07 hi
19:01:07 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:01:08 hi
19:01:10 The meeting name has been set to 'tripleo'
19:01:11 hi
19:01:13 o/
19:01:23 hi
19:01:24 hiya
19:01:27 hola
19:01:51 o/
19:01:59 hi
19:02:51 #agenda
19:02:56 bugs
19:02:56 reviews
19:02:56 Projects needing releases
19:02:56 CD Cloud status
19:02:56 CI virtualized testing progress
19:02:59 Insert one-off agenda items here
19:03:01 review kanban
19:03:03 review the tweaked reviewer rules
19:03:06 open discussion
19:03:08 #topic bugs
19:03:11 #link https://bugs.launchpad.net/tripleo/
19:03:13 #link https://bugs.launchpad.net/diskimage-builder/
19:03:16 #link https://bugs.launchpad.net/os-refresh-config
19:03:18 #link https://bugs.launchpad.net/os-apply-config
19:03:21 #link https://bugs.launchpad.net/os-collect-config
19:03:23 #link https://bugs.launchpad.net/tuskar
19:03:26 #link https://bugs.launchpad.net/tuskar-ui
19:03:28 #link https://bugs.launchpad.net/python-tuskarclient
19:03:31 also good morning everyone
19:03:34 hi
19:04:02 we have bug 1241042 which is a firedrill
19:04:09 devtest is broken
19:04:27 So https://bugs.launchpad.net/tripleo/+bug/1241042 , after I reported it and tried a few things my time got sucked away on something else
19:04:35 and we have multiple untriaged bugs
19:04:43 without a REALLY good reason
19:05:29 e.g. https://bugs.launchpad.net/os-apply-config/+bug/1243263
19:05:48 * SpamapS closes one old untriaged bug
19:06:04 lifeless: derekh's patch is very close to fixing bug 1241042; I tested it today, there is one small issue with the nova config template
19:06:13 Oh, I just realized I'm not subscribed to bugmail for oac
19:06:16 rpodolyaka1: sorry to pick on that bug
19:06:22 rpodolyaka1: but you filed it without triaging it
19:06:31 lifeless: i think https://bugs.launchpad.net/tripleo/+bug/1241042 may be the reason heat stack create for the overcloud failed (even though nova instances came up) on 2 separate boxes today
19:06:32 rpodolyaka1: importance 'undecided'
19:06:37 lifeless: for some reason, despite being in the tripleo team on launchpad, I can't triage in os-apply-config :(
19:06:49 rpodolyaka1: ok, let's fix that!
19:07:04 everyone: if you find you can't do something you should be able to do, raise it!
19:07:36 oac had a per-project team for no good reason, switching it to tripleo
19:07:39 done
19:08:04 tuskar should be changed to tripleo too
19:08:05 https://bugs.launchpad.net/tripleo/+bug/1240753 also wasn't triaged though
19:08:08 * rpodolyaka1 triages
19:08:35 I would ask 'is everyone making a little time to do triage'
19:08:43 but since we have bugs untriaged for > a week
19:08:47 the answer is no :(
19:08:52 including me, obviously
19:09:14 My excuse this week is that we've had a super busy time @ HP with conference proposals for an internal thing
19:09:21 * lifeless is sorry
19:09:47 however, as a group - we need to do better.
19:09:54 Any ideas on how ?
19:10:27 I am a big believer in a unified view that helps us drive to 0
19:10:53 but, in the past that has required writing little one-off launchpad scripty things to generate a report.
19:11:55 ok
19:12:03 I agree, would be wonderful
19:12:09 but ponies.
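For reference, a minimal sketch of the kind of one-off Launchpad report script mentioned above, assuming launchpadlib is available and taking "untriaged" to mean an open bug task whose importance is still Undecided; the project list is taken from the #link lines earlier in this topic:

```python
# Hypothetical triage report: count and list untriaged bugs across the
# tripleo Launchpad projects linked from the meeting agenda.
from launchpadlib.launchpad import Launchpad

PROJECTS = [
    'tripleo', 'diskimage-builder', 'os-refresh-config', 'os-apply-config',
    'os-collect-config', 'tuskar', 'tuskar-ui', 'python-tuskarclient',
]


def main():
    # Anonymous login is enough for read-only queries.
    lp = Launchpad.login_anonymously('tripleo-triage-report', 'production')
    total = 0
    for name in PROJECTS:
        project = lp.projects[name]
        # Open tasks whose importance has not been set yet.
        tasks = list(project.searchTasks(
            status=['New', 'Confirmed', 'Incomplete', 'In Progress'],
            importance='Undecided'))
        total += len(tasks)
        print('%s: %d untriaged' % (name, len(tasks)))
        for task in tasks:
            print('  %s  %s' % (task.web_link, task.bug.title))
    print('TOTAL untriaged: %d' % total)


if __name__ == '__main__':
    main()
```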
19:12:16 Unless we want to commit to triaging all of OpenStack
19:12:29 which I think would be a bit tough until everyone gets on board
19:13:10 We do deploy all of OpenStack.. so there would be value in doing so.. but I'm not sure we can absorb the cost of all the irrelevant things we'd have to filter out.
19:13:14 however, can I get everyone to commit to - just once, one day over the next week - visiting the meetings page, ctrl-clicking on the bug section links, scrolling to the bottom and triaging all 'unknown' importance bugs ?
19:13:33 +1
19:13:34 if we make a joint commitment to do that *once* each, once a week, I think we can keep on top of it very easily.
19:13:35 ya
19:13:55 * Ng nods
19:14:00 +1
19:14:01 sounds reasonable
19:14:02 ok
19:14:04 ok
19:14:06 #vote +1 if you will triage across all tripleo LP projects [see the meetings page] once a week
19:14:07 ok
19:14:14 +1
19:14:18 +1
19:14:19 +1
19:14:20 +1
19:14:21 erm, clearly I don't know how to drive mootbot votes.
19:14:23 k
19:14:29 +1
19:14:30 +1
19:14:30 do we want to try and somehow avoid all accidentally doing it on the same day?
19:14:31 +1
19:14:45 Ng: birthday paradox
19:14:47 +1
19:15:00 +1 from me
19:15:07 Ng: that's like trying to avoid winning the lottery :-)
19:15:09 ok, so that's triage handled.
19:15:16 +1
19:15:17 Onto the critical: this is a firedrill.
19:15:32 It's a bit sad that /no one/ managed to find time to drive it forwards
19:16:04 tried an updated patch about 2 hours ago that failed, haven't looked at why yet
19:16:07 This is another case of joint responsibilities. And yes, company stuff will draw us away: let me apologise again for the HP folk who had papers to write with a deadline
19:16:12 I did not know it was in need of driving (or even in existence actually)
19:16:43 lifeless: I wasn't here, would you elaborate pls?
19:16:52 SpamapS: ok, so last week we agreed to flag firedrills: a) as critical in the bug tracker, b) in the firedrill column in trello and c) in the #tripleo channel topic.
19:17:31 Haven't looked at any of those since Friday.
19:17:38 shadower: the change to remove file injection on the cd-undercloud broke devtest because devtest's undercloud isn't being deployed identically to the cd-undercloud
19:18:15 (so therein lies the problem.. the "SpamapS is scatterbrained" bug has been open for decades)
19:18:23 SpamapS: today is your Tuesday afternoon ?
19:18:47 SpamapS: anyhow it's not about you specifically
19:19:02 so we basically need to use neutron_dhcp_agent and force use_file_injection=False? that should be done by derekh's patch with minor tweaks to the nova config template
19:19:02 there are 15-odd folk driving tripleo as a whole, each with specialities sure
19:19:14 lifeless: aye Tue 12:19 to be exact
19:19:46 rpodolyaka1: right, my concern here is that no one - myself included - said 'moving that bug forward is the most important thing for the team'
19:19:47 I managed to get a working overcloud today but haven't tried to run a user VM yet
19:19:59 rpodolyaka1: oh, but it sounds like you have - fantastic!
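As an aside on the nova config tweak discussed above (forcing use_file_injection=False via the heat template): one plausible reading - an assumption, not a diagnosis confirmed in this log - is that an unquoted False in a YAML template is parsed as a boolean rather than the literal string the config rendering expects, which is why quoting the value can behave differently. A small illustration in Python:

```python
# Illustrative only: how YAML and JSON treat a quoted versus unquoted
# use_file_injection value. Not taken from the actual patch under review.
import json
import yaml

# Unquoted False is a YAML boolean; quoting it keeps a literal string.
unquoted = yaml.safe_load('use_file_injection: False')
quoted = yaml.safe_load('use_file_injection: "False"')
print(type(unquoted['use_file_injection']).__name__)  # bool
print(type(quoted['use_file_injection']).__name__)    # str

# Heat passes metadata around as JSON, so the two values arrive at the
# instance in different shapes, which a template engine may treat
# differently (a falsy boolean versus a non-empty string).
print(json.dumps(unquoted))  # {"use_file_injection": false}
print(json.dumps(quoted))    # {"use_file_injection": "False"}
```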
19:20:37 anyhow, let's not obsess
19:20:41 I was close to reproducing/checking this yesterday but hit some strange bugs with ext4_resize_fs() :(
19:20:47 rpodolyaka1: I managed to get use_file_injection=False by forcing it to a string in the heat template (by wrapping it in quotes)
19:21:05 i got the overcloud vms up but they didn't get the ssh keys (so couldn't init the keystone setup etc)
19:21:16 though that happened very late in the afternoon and i left work a few hours ago
19:21:18 rpodolyaka1: will update the bug with details after this meeting
19:21:30 derekh: oh, is this because oac has a bug too ? That bug probably needs to be critical as well, since it's blocking another critical bug.
19:21:54 lifeless: no, the oac bug looks similar but is irrelevant
19:22:06 *it
19:22:12 rpodolyaka1: oh, ok.
19:22:31 ah, I see, it's the template.
19:22:37 Let's get on that right after the meeting
19:22:51 any other bug material?
19:22:58 8
19:23:00 derp
19:23:18 orly ? ;)
19:23:24 #topic reviews
19:23:25 http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:23:29 http://russellbryant.net/openstack-stats/tripleo-reviewers-30.txt
19:23:32 http://russellbryant.net/openstack-stats/tripleo-reviewers-90.txt
19:23:44 Stats since the last revision without -1 or -2 (ignoring jenkins):
19:23:47 Average wait time: 1 day, 13 hours, 49 minutes
19:23:50 1st quartile wait time: 0 days, 2 hours, 56 minutes
19:23:51 https://review.openstack.org/#/c/50749/
19:23:52 Median wait time: 0 days, 4 hours, 3 minutes
19:23:55 3rd quartile wait time: 1 day, 4 hours, 29 minutes
19:23:56 Longest waiting reviews (based on oldest rev without nack, ignoring jenkins):
19:23:59 and
19:24:02 6 days, 8 hours, 37 minutes https://review.openstack.org/50341 (Add unique constraint to ResourceClass.)
19:24:05 5 days, 19 hours, 58 minutes https://review.openstack.org/52236 (add python-ironicclient to openstack-clients)
19:24:08 1 day, 4 hours, 29 minutes https://review.openstack.org/49729 (Add Glance image id to `resource_classes` table)
19:24:11 0 days, 4 hours, 20 minutes https://review.openstack.org/50477 (WIP : Add tempest elements)
19:24:14 0 days, 4 hours, 3 minutes https://review.openstack.org/53128 (Add James Slagle to tripleo-cd-admins.)
19:24:17 so overall we're going ok, but there are reviews waiting nearly a week
19:24:44 Add unique constraint to ResourceClass - there are already two +2s
19:24:59 i'm gonna approve it
19:25:07 thanks
19:25:12 we didn't approve it because we had a failing jenkins at the time
19:25:23 fair enough, do you know when jenkins got fixed ?
19:25:37 well it's in the bug that we filed for it
19:25:42 lemme dig it up
19:25:59 also you can 'recheck bug XXXX' to probe and find out if jenkins is fixed without causing gate pipeline resets.
19:26:13 jistr: cool thanks, I'm curious how long the review sat in inventory, is all
19:26:21 lifeless: we did a common patch for it with pblaho https://bugs.launchpad.net/tuskar/+bug/1240934
19:26:32 https://review.openstack.org/#/c/52236/ looks like gerrit got confused
19:26:53 I'll ask about that one in -infra
19:28:07 slagle: btw it's good seeing lots of reviews from you - thank you!
19:28:26 most folk seem to have stepped up to the plate in fact
19:28:29 which is awesome
19:28:47 lifeless: thanks. been trying to keep up :)
19:29:01 lifeless, ?? https://review.openstack.org/#/c/52236 , it is waiting for its dependency to get in, right?
19:29:30 indeed
19:29:57 lsmola: yeah
19:30:06 lsmola: just had the cluebat applied to me in -infra :)
19:30:25 lifeless, :-)
19:30:28 so the review stats tool could benefit from taking that into account in some fashion
19:30:45 lifeless, +1
19:31:05 overall, I'm happy with where we are at with reviews: is anyone unhappy? Are you finding reviewing hard? Are you getting reviews in a timely manner? Are they supportive? Are you getting tossed all over by contradictory reviews?
19:31:50 lifeless: my only complaint is that non-tripleo programs do not review nearly as rapidly as tripleo programs. ;)
19:32:05 hehe
19:32:10 I've gotten rather spoiled
19:32:23 SpamapS: from a centre of excellence....
19:32:29 lifeless: 625 St. Ann Street
19:32:31 ok, next topic
19:32:46 lifeless: https://review.openstack.org/#/c/50749/
19:33:17 #topic projects needing releases
19:33:46 dkehn: interesting, did you change the commit id on my draft? I'm going to guess it was 'abandoned' and thus you couldn't push to it
19:33:58 Ng: how did you go getting a release of everything out ?
19:34:28 lifeless: tripleo-heat-templates and the three tuskars are still pending, I failed to drive those through in time for today
19:34:45 Ng: ok, want to take the challenge up for another week ?
19:34:45 all the other bits (incubator excepted) got releases last week
19:34:51 lifeless: I absolutely do
19:34:52 Ng: since many things have had commits
19:34:57 Ng: we need more releases :)
19:35:05 Ng: cool!
19:35:07 ok
19:35:13 #action ng to push the release wheelbarrow
19:35:26 #action Ng to push the release wheelbarrow
19:35:41 is it just me, or am I failing to drive mootbot?
19:35:59 meetbot
19:36:41 ok
19:36:44 #topic CD cloud status
19:36:47 I'll take this one
19:36:55 The CD cloud is deploying very reliably *except*
19:37:07 every couple of days the mellanox ethernet adapter is losing the plot
19:37:14 the symptoms are that it starts failing
19:37:19 and the logs show DNS lookup errors
19:37:40 doing a while true; do host cd-overcloud.tripleo.org; done loop
19:37:45 results in one in 20 or so failing
19:37:58 pings/mtr to the name servers don't show a failure
19:38:02 I've fixed this by
19:38:42 rmmod mlx4_en mlx4_core; modprobe mlx4_en; ip address del /26 dev eth2; ovs-vsctl del-port eth2; ovs-vsctl add-port br-ctlplane eth2
19:38:50 and it comes good for another couple of days
19:39:02 I've added a card to the 'make things better' column for someone to dig into WTF is going on.
19:39:19 lifeless: saucy is out, maybe we should try with its shiny new kernel. :)
19:39:19 Anyone tried actual workloads on the overcloud ?
19:39:42 SpamapS: I think that should wait until we're able to actually redeploy the undercloud :>
19:40:28 lifeless: yeah we have 3 whole months before raring is dead. :)
19:40:50 * Ng has not tried workloads on the overcloud, I don't actually have any cloudy workloads I could retool for such a thing :/
19:41:05 remember *everyone* can get overcloud accounts
19:41:11 just propose yourself to the incubator
19:41:25 free cloud accounts on top-grade hardware. Go for it!
19:41:43 next topic in 1 mon
19:41:47 *min*
19:41:50 and they're only going to erase your data every few hours!
19:42:07 lol
19:42:14 well, the passwords are stable
19:42:22 so if the deployment is automated
19:42:27 should be pretty straightforward
19:42:38 and hey - next MVP is data persistence :P
19:42:39 we should try juju on it
19:42:45 #topic CI virtualized testing progress
19:43:02 pleia2: how goes?
19:43:55 hey, so we now have an experimental check on the tripleo-incubator project
19:44:21 \o/
19:44:25 all hail automated tests
19:44:28 you can see it being run on this patch: https://review.openstack.org/#/c/52607/1
19:44:41 pleia2: what's next?
19:44:53 right now it's just an echo script, but we have images being successfully built and managed in the tripleo cloud from nodepool
19:45:18 next is Iteration 2 outlined here, where we actually make it do something useful: https://etherpad.openstack.org/p/tripleo-test-cluster
19:46:18 no updates on progress here really, I have my test nodepool up to get some of the dependencies sorted first (it's currently erroring on some basic things that I need to work out)
19:46:45 ok cool
19:46:56 pleia2: btw you shouldn't need nodepool for iteration 2 at all
19:47:10 pleia2: it's all now within other components
19:47:13 lifeless: hm, fair enough
19:47:24 pleia2: so if I was hacking on it, I wouldn't be worrying about nodepool for now.
19:47:30 ok
19:48:01 #topic review kanban
19:48:07 so I added this one-off topic
19:48:15 I'd like folks' feedback on the use of kanban so far
19:48:18 what's good about it?
19:48:19 what's bad?
19:48:23 What would you like changed?
19:49:20 I'm bad at keeping track day-to-day of where we are in it
19:49:40 I have found its mere presence helps me focus on the immediate.
19:49:42 Ng: do you look at it day to day ?
19:49:45 * derekh same as Ng, just not checking it enough
19:49:51 Just seeing trello in my open tabs reminds me "go work on MVP"
19:49:57 it's convenient for finding out what folks are currently working on, what the current MVPs are
19:50:07 and there is quite a sense of pride when I get to move a card to done :)
19:50:13 lifeless: no :)
19:51:01 I do not look at it day to day, because I am following the "only have one thing assigned to yourself" rule.. so unless I finish one thing a day.. it doesn't get a detailed look until I finish the thing I am doing now.
19:51:06 between bug triage and trello, I'm starting to think I need to have a stricter cadence to my days/week where I'm looking around at things more
19:51:36 so the tension is between flow and unblocking other people
19:51:41 if we just do our one thing
19:51:43 it's easy
19:51:47 but other folk can get stuck
19:52:17 represented by bugs [untriaged], firedrills [bug/topic/kanban], reviews [no -1/-2]
19:52:18 Right, so to me it is "unblock others" followed by "do MVP work"
19:52:27 SpamapS: yeah
19:52:46 Though I admit that unblocking has been 99% reviews.
19:52:56 As evidenced by the lack of triage by all of us :)
19:52:57 SpamapS: so it seems to me we need to poll the metadata for 'is someone out there blocked' at least once a day
19:53:10 as a team
19:53:15 but possibly as individuals too.
19:53:30 Any other thoughts?
19:54:25 makes sense
19:54:26 seems like that would be worth a tool
19:54:57 SpamapS: add it to the roadmap ?
19:55:02 my ponies and rainbows tool would be an IRC bot that notices when I come on in the morning and tells me if there's a firedrill, how many untriaged bugs there are, and suchlike
19:55:05 one person could probably whip up an "obvious blockers" report just pulling all of the obvious data into one place that we all start at. Given the number of people involved, probably worth the time to do it.
19:55:06 SpamapS: as a card we can pick up ?
19:55:15 lifeless: yes, doing that
19:55:18 cool
19:55:23 ok, 2nd last topic
19:55:39 oh and russell just fixed http://russellbryant.net/openstack-stats/tripleo-openreviews.html for us
19:56:05 approved patches will no longer show as stuck, because they aren't; it's their deps that are stuck, and they will be getting evolved
19:56:12 #topic review the tweaked reviewer rules
19:56:33 So, last week we decided:
19:56:43 - cd reviews could use two +2's from anywhere
19:57:00 - multiple-author reviews can use a +2 from the submitters
19:57:13 Feedback on those changes?
19:57:20 Working? Keep it? Discard it?
19:57:36 keep it, it has already been used and helped keep the train moving
19:57:45 the multiple-author rule worked well for us on the Jenkins critical bug
19:57:57 lifeless: keep it
19:58:06 +1
19:58:14 #agreed keep the review tweaks we introduced last week
19:58:24 #topic open discussion
19:58:26 I did something on two reviews which I think was slightly outside those rules, but seemed pretty reasonable
19:58:28 2 minutes y'all
19:58:58 there was a +2 already, but a typo in the commit message, which I fixed; I carried the existing +2 forwards, added my own and Approved
19:59:50 Ng: that's inside the rules
20:00:06 Ng: you + original author - one +2. Other reviewer second +2. Done.
20:00:10 :)
20:00:14 Ng: the 'approve' button is orthogonal.
20:00:55 ok, tiems pu
20:00:58 thanks everyone
20:01:01 times up.
20:01:03 #endmeeting
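A rough sketch of the "obvious blockers" report suggested under the kanban topic above - one place to glance at each morning for firedrills. It assumes launchpadlib and requests are available; the Trello list id, API key and token are placeholders, and the choice of data sources is an assumption rather than an agreed design:

```python
# Hypothetical daily blockers report: open critical bugs across the tripleo
# Launchpad projects plus the cards in the Trello firedrill column.
import requests
from launchpadlib.launchpad import Launchpad

PROJECTS = ['tripleo', 'diskimage-builder', 'os-refresh-config',
            'os-apply-config', 'os-collect-config', 'tuskar', 'tuskar-ui',
            'python-tuskarclient']
TRELLO_FIREDRILL_LIST = 'REPLACE_WITH_LIST_ID'   # placeholder
TRELLO_KEY = 'REPLACE_WITH_API_KEY'              # placeholder
TRELLO_TOKEN = 'REPLACE_WITH_TOKEN'              # placeholder


def critical_bugs():
    # Per last week's agreement, open critical bugs are the firedrills.
    lp = Launchpad.login_anonymously('tripleo-blockers-report', 'production')
    for name in PROJECTS:
        project = lp.projects[name]
        tasks = project.searchTasks(
            importance='Critical',
            status=['New', 'Confirmed', 'Triaged', 'In Progress'])
        for task in tasks:
            print('FIREDRILL (launchpad, %s): %s' % (name, task.web_link))


def trello_firedrills():
    # Cards sitting in the firedrill column of the team's kanban board.
    resp = requests.get(
        'https://api.trello.com/1/lists/%s/cards' % TRELLO_FIREDRILL_LIST,
        params={'key': TRELLO_KEY, 'token': TRELLO_TOKEN})
    resp.raise_for_status()
    for card in resp.json():
        print('FIREDRILL (trello): %s' % card['name'])


if __name__ == '__main__':
    critical_bugs()
    trello_firedrills()
```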