19:06:47 #startmeeting tripleo
19:06:49 Meeting started Tue Dec 2 19:06:47 2014 UTC and is due to finish in 60 minutes. The chair is SpamapS. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:06:50 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:06:53 The meeting name has been set to 'tripleo'
19:07:21 #link https://wiki.openstack.org/wiki/Meetings/TripleO
19:07:39 #topic agenda
19:07:40 * bugs
19:07:40 * reviews
19:07:40 * Projects needing releases
19:07:40 * CD Cloud status
19:07:42 * CI
19:07:44 * Tuskar
19:07:47 * Specs
19:07:50 * open discussion
19:07:52 Remember that anyone can use the link and info commands, not just the moderator - if you have something worth noting in the meeting minutes feel free to tag it
19:08:05 Anybody have stuff to ninja-add to the agenda?
19:09:05 I'm going to add a specific topic after Specs, which is the meetup, just to raise awareness.
19:09:12 #topic bugs
19:09:29 #link https://bugs.launchpad.net/tripleo/
19:09:30 #link https://bugs.launchpad.net/diskimage-builder/
19:09:30 #link https://bugs.launchpad.net/os-refresh-config
19:09:30 #link https://bugs.launchpad.net/os-apply-config
19:09:30 #link https://bugs.launchpad.net/os-collect-config
19:09:32 #link https://bugs.launchpad.net/os-cloud-config
19:09:34 #link https://bugs.launchpad.net/os-net-config
19:09:37 #link https://bugs.launchpad.net/tuskar
19:09:39 #link https://bugs.launchpad.net/python-tuskarclient
19:10:15 I've seen a few untriaged bugs lately, which is a) good news, as it's a sign we have users, and b) bad news, as it means we're triaging a bit slowly.
19:10:43 ohai
19:10:55 regarding https://bugs.launchpad.net/tripleo/+bug/1374626 .. I have been diverted onto _two_ other things before I can tackle that. I expect to address it in about 3 weeks.
19:10:56 Launchpad bug 1374626 in tripleo "UIDs of data-owning users might change between deployed images" [Critical,Triaged]
19:12:00 I think we might need to start managing the high bugs more aggressively. There are 132 "High" bugs. What would people say to a bug-squash day some time next week?
19:12:20 I won't be around next week, but +1 to squashing some of the high bugs.
19:12:32 We've got so many that it's almost a meaningless categorization now.
19:12:40 +1 any day but Monday.
19:13:13 +1 from me
19:13:40 +1
19:14:19 bnemec: Critical is "we have to do it", High is "we're going to do it some day". Everything else is "if you're looking for something to fix..."
19:14:56 Ok, how about Tuesday next week? Just generally the 24 hours that represent "Tuesday" on your normal calendar.
19:15:07 SpamapS: That doesn't really match my understanding of the levels.
19:15:15 bnemec: that's what they end up being. :-/
19:15:27 Agreed.
19:15:49 bnemec: targeting to milestones usually also communicates some urgency.. but we don't do milestones.
19:16:27 Yeah, milestones are nice, but they're a bunch of extra work at release time (as someone who just started doing Oslo releases).
19:16:40 It does force you to be better about triage though.
19:17:35 bnemec: it's really just a way to communicate to users what the plan is.
19:17:43 Ok, anything else on bugs?
19:18:01 I'm still a little unclear on whether or not we have an actual fix for https://bugs.launchpad.net/tripleo/+bug/1385346
19:18:02 Launchpad bug 1385346 in tripleo "upstart service unreliable after introduction of pipe to logger" [Critical,Fix committed]
19:18:20 greghaynes: ^^ weren't you looking at that?
19:19:13 SpamapS: I was looking at a different bug, but basically the same issue
19:19:40 greghaynes: any way I can get you to take over that bug and see if they're duplicates and/or resolved fully?
19:19:48 I'll comment on that report, but I think it's more of "a lot of our services are unreliably detected as to whether they started, because upstart is weird"
19:20:19 yep
19:20:29 greghaynes: ah right, so it's just the general problem that upstart's 'started' event isn't an indication that the service is ready.
19:20:44 yep
19:20:58 It'd be awesome to find a better solution for the various openstack services, at least
19:21:08 which I suspect also affects systemd, but perhaps they've added in some magic to make it more realistic.
19:21:25 I know there was that systemd notification thing added to Oslo a while back.
19:21:30 That might be relevant here.
19:21:38 bnemec: ooh yes
19:21:56 bnemec: definitely, os-svc-daemon could definitely add the cli args to turn that on and things would be more reliable on systemd.
19:22:25 and upstart has similar notification mechanisms, might be worth looking into doing something similar
19:22:28 Yeah, I'm not sure which projects have actually adopted it yet though.
19:22:38 Though I suspect even with it, services probably don't actually implement it, because it's kind of hard to say when your service is actually ready. :-P
19:23:07 #link http://git.openstack.org/cgit/openstack/oslo-incubator/tree/openstack/common/systemd.py
19:23:20 It lives there for the moment, but last I heard the plan was to graduate all of that code to a lib.
19:23:27 greghaynes: you can actually implement the systemd notification protocol in post-start in upstart, but it's evil and requires you to dig around in /proc. :-P
19:23:51 oh joy, I think I had an idea like that and then decided to remain sane
19:23:51 Anyway, let's move on
19:24:01 greghaynes: yes, forget I said it, your sanity is at risk. ;)
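For reference, the "systemd notification thing" linked above is the sd_notify readiness protocol: a service started as Type=notify writes "READY=1" to the unix datagram socket named by $NOTIFY_SOCKET once it can actually accept work. A minimal service-side sketch in Python - illustrative only, not the oslo-incubator code itself, and assuming systemd exported NOTIFY_SOCKET:

    # Sketch of the sd_notify readiness protocol discussed above.
    # Illustrative; the real implementation at the time lived in
    # openstack/common/systemd.py in oslo-incubator.
    import os
    import socket

    def notify_ready():
        """Tell systemd (if present) that the service is ready."""
        addr = os.environ.get('NOTIFY_SOCKET')
        if not addr:
            return  # not running under systemd with Type=notify
        if addr.startswith('@'):
            # Abstract-namespace socket: '@' stands for a leading NUL byte.
            addr = '\0' + addr[1:]
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            sock.connect(addr)
            sock.sendall(b'READY=1')
        finally:
            sock.close()

    # Call notify_ready() only once e.g. the RPC or WSGI socket is
    # actually listening - sending it at process start defeats the point.

The upstart equivalent hinted at above would be a post-start stanza that polls until the service answers, which sidesteps the /proc digging SpamapS warns against.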
19:24:19 #topic reviews
19:24:39 #info There's a new dashboard linked from https://wiki.openstack.org/wiki/TripleO#Review_team - look for "TripleO Inbox Dashboard"
19:24:43 #link http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:24:46 #link http://russellbryant.net/openstack-stats/tripleo-reviewers-30.txt
19:24:50 #link http://russellbryant.net/openstack-stats/tripleo-reviewers-90.txt
19:24:53 Heh, I think we might want to remove the word 'new' there. :)
19:25:05 +1
19:25:38 I haven't updated the dashboard in a while now though, has anybody?
19:25:48 (the link to the dashboard)
19:26:21 So, how does everyone feel about the stats? I think we're all lagging a bit.. but the actual number of waiting reviews isn't suffering a ton.
19:26:36 gfidente: I added os-net-config 2 weeks ago.
19:27:03 SpamapS, to the dashboard-creator or just to the link in the wiki?
19:27:31 gfidente: both
19:27:42 I remember there were some misalignments a while back, but dashboard-creator should be the source ... this isn't a very hot topic though, sorry for stopping
19:27:44 tks
19:27:46 :)
19:28:21 np, it's super useful
19:29:01 Ok, it doesn't sound like people have strong feelings about reviews. Perhaps make a push to do a few more this week so we can slam the queue next week for bug-squash-day.
19:29:20 SpamapS: +1, wait times are acceptable, + Thanksgiving etc
19:29:38 #topic Projects needing releases
19:30:14 With the previous week dominated by a US holiday, I think we could skip this topic this week, unless somebody has a huge desire to push for it.
19:30:45 I wonder if stevenk was able to release last week
19:30:47 * greghaynes checks
19:31:32 looks like yes \O/
19:31:40 so SGTM on skipping this week
19:32:05 sounds like the motion is seconded, all opposed?
19:33:45 motion passes, moving on
19:33:50 #topic CI
19:34:17 #link http://goodsquishy.com/downloads/s_tripleo-jobs.html
19:34:37 looks like we have a problem
19:34:41 anybody know what it is?
19:35:26 At a glance it looks like our hash sum mismatch is back.
19:35:35 WONDERFUL
19:35:45 But that's after looking at exactly one result, so it might not be the same for all.
19:35:49 well, that generally resolves itself as caches expire
19:36:50 bnemec: would you have time to dive into the results and see what is causing spurious fails?
19:37:01 Yeah, lots of hash sum mismatches and another one of these: http://logs.openstack.org/70/112370/12/check-tripleo/check-tripleo-ironic-undercloud-precise-nonha/d37f261/console.html#_2014-12-02_18_00_20_392
19:37:28 SpamapS: I can, but I think derek had said it was due to some sort of DNS round-robining issue.
19:37:33 ok, if it is just hash sum mismatches, then the cache times out after 4 or 8 hours, I forget.
19:37:34 I also still run into our issue where sometimes a node ends up in deleting state in heat
19:37:46 haven't had time to dive into it yet
19:37:55 Yeah, that's still a thing too.
19:38:08 bnemec: oh yeah, if the mirrors we're using are aggressively round-robining then it is more likely to cache out-of-sync copies.
19:38:22 Ok, so IMO a < 50% pass rate is worth putting in the /topic in #tripleo
19:39:15 Even if it is probably just temporary.
19:39:22 tchaypo: here?
19:39:39 tchaypo: how are the HP regions looking? (and also congratulations on being the one person I think of when I think of the HP regions ;)
19:40:01 Not really.
19:41:03 hp2 - haven't looked for a few days, currently blocked by a setup-neutron bug. Don't know about hp1
19:42:00 I think the other big news for the regions is derekh set up local bandersnatch mirrors and (I think it landed) switched toci to use them
19:42:00 tchaypo: thanks for that.
19:42:09 greghaynes: sweeeet
19:42:35 perhaps we should also set up local apt mirrors to avoid the hash sum thing. :-P
19:42:55 anyway, shall we move on?
19:42:58 Yep, or we could be like infra and use AFS
19:43:00 yep
19:43:13 greghaynes: I'd be down to get in on the infra AFS cell. :)
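For context, bandersnatch maintains a full local mirror of PyPI; pointing the CI jobs at it is then just pip configuration. A minimal sketch, where mirror.example.local is a made-up placeholder hostname and the actual toci change may have looked different:

    # /etc/pip.conf - hypothetical; mirror.example.local stands in for
    # whatever host serves the local bandersnatch mirror tree.
    [global]
    index-url = http://mirror.example.local/pypi/web/simple/

A local apt mirror would do the same job for the hash sum mismatches, since every build would then fetch from a single consistent copy of the archive indexes instead of a round-robined pool.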
19:43:21 #topic Tuskar
19:43:33 Anything to report, Tuskar party people?
19:43:43 * SpamapS pauses 60s
19:45:09 Derailing slightly - in last week's meeting we were wondering if these topics provide much value - maybe a weekly email update would be more useful
19:48:27 ...
19:48:47 I've been trying to compose an email asking what the point of these meetings is all week
19:48:55 But I can't make it sound right.
19:49:03 someone needs to unpause SpamapS
19:49:08 SpamapS: around?
19:49:11 ^Q
19:49:45 tchaypo: just do it, that's a pretty valid question to ask for a meeting and should be easy to answer :)
19:50:02 sorry
19:50:07 I got tossed off the net for a minute
19:50:10 or 7
19:50:15 I blame the rain
19:50:28 typical Angeleno :)
19:50:48 ;)
19:50:55 #topic Specs
19:50:57 gfidente: you're up
19:51:16 hey, yes, the submissions for the cinder/ceph spec are all up
19:51:23 gfidente: links?
19:51:25 they surely need refinement, but most of the spec is covered
19:51:32 https://blueprints.launchpad.net/tripleo/+spec/tripleo-kilo-cinder-ha
19:51:40 I'd like to ask a few questions:
19:51:48 1. do we want that to be the default for controlscale > 1?
19:51:55 2. do we want to test that in CI?
19:52:07 then
19:52:33 1bis: if so, shall we automagically deploy an additional node hosting the actual Ceph OSD? only one? let the user decide?
19:52:45 2bis: if so, which config shall we use for the CI job?
19:52:59 #link https://blueprints.launchpad.net/tripleo/+spec/tripleo-kilo-cinder-ha
19:53:28 gfidente: I think we should do ceph in the HA job, but I think we may want to first stabilize the current HA job.
19:53:44 +1
19:53:50 Seems like everyone is kind of tapdancing around that, as there seem to be more pressing issues.
19:54:09 But really, that is the heart and soul of TripleO for anything other than tiny clouds with tiny requirements.
19:54:21 gfidente: thanks for your work on this btw. :)
19:54:43 oh, that was fun; I fell far behind with reviews though
19:54:46 Do we want to talk about the dib tracing changes and how there may be a need for a spec?
19:55:06 pardon, one important thing is, I understand we don't want to enable it in CI immediately
19:55:30 but shall we leave users the choice of using vs. not using ceph too? how can we test the submissions in that case?
19:55:34 We should add a column in https://docs.google.com/spreadsheets/d/1LuK4FaG4TJFRwho7bcq6CcgY_7oaGnF-0E6kcK4QoQc/edit?usp=sharing
19:57:13 gfidente: I don't see why not to enable it in CI; it's simply that HA in CI doesn't pass at a rate where it would really be useful
19:57:17 gfidente: I think that may be something to posit in your spec, so there can be inline debate.
19:57:51 Anyway, before we run into the end of the meeting.. I wanted to just bring up the meetup
19:57:57 meetup/sprint/etc.
19:57:59 There's also a resource issue - can we absorb another node / more memory usage?
19:58:18 bnemec: there won't be more nodes, but there will be some more memory usage for ceph-mon and ceph-osd
19:58:23 not much tho
19:58:52 they are pretty lean when they're doing almost nothing. :)
19:59:09 Anyway, let's discuss this in spec review. Please review gfidente's spec!
19:59:14 ack
19:59:15 #topic Mid-Cycle
19:59:26 Feb 16-20
19:59:33 * slagle curses DST
19:59:35 Seattle, up
19:59:39 *hp
19:59:43 Just a reminder before we go: I haven't gotten final confirmation of the dates, but Feb 18-20 are the tentative days and Seattle is definitely the destination.
19:59:55 \O/
20:00:01 slagle: You got it right last time. :-P
20:00:01 Please watch the etherpad and ML for final confirmation. That is all
20:00:05 #endmeeting