19:18:18 <derekh> #startmeeting tripleo
19:18:19 <openstack> Meeting started Tue Apr 1 19:18:18 2014 UTC and is due to finish in 60 minutes. The chair is derekh. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:18:20 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:18:23 <openstack> The meeting name has been set to 'tripleo'
19:18:31 <bnemec> \o/
19:18:56 <derekh> basing this on last week's meeting and haven't done this before, give me some slack
19:18:58 <jrist> o/
19:19:05 <derekh> #topic agenda
19:19:10 <derekh> bugs
19:19:13 <derekh> reviews
19:19:19 <derekh> Projects needing releases
19:19:25 <derekh> CD Cloud status
19:19:27 <derekh> CI
19:19:37 <derekh> Insert one-off agenda items here
19:19:43 <derekh> #topic bugs
19:20:24 <derekh> #link https://bugs.launchpad.net/tripleo/
19:20:25 <derekh> #link https://bugs.launchpad.net/diskimage-builder/
19:20:25 <derekh> #link https://bugs.launchpad.net/os-refresh-config
19:20:25 <derekh> #link https://bugs.launchpad.net/os-apply-config
19:20:25 <derekh> #link https://bugs.launchpad.net/os-collect-config
19:20:25 <derekh> #link https://bugs.launchpad.net/tuskar
19:20:28 <derekh> #link https://bugs.launchpad.net/python-tuskarclient
19:20:42 <derekh> let's go down through criticals
19:20:42 <lifeless> SpamapS: thanks
19:20:53 <derekh> https://bugs.launchpad.net/tripleo/+bug/1270646
19:21:00 <derekh> anybody working on this?
19:21:10 <derekh> lifeless: you're here, wanna take over ?
19:21:21 <slagle> yea, i'm "working" on that
19:21:34 <slagle> i proposed some fixes, i feel they address the bug
19:21:37 <lifeless> derekh: nope
19:21:54 <lifeless> derekh: slept through my alarm, got to make breakfast for C etc
19:22:03 <slagle> the neutron bug is unassigned though
19:22:04 <lifeless> is CI back up ?
19:22:16 <jomara> make && make install cheerios ?
19:22:26 <lifeless> jomara: :)
19:22:38 <derekh> slagle: ok, cool review time so
19:22:38 <bnemec> This is Python: pip install cheerios :-P
19:22:43 <derekh> next https://bugs.launchpad.net/tripleo/+bug/1272803
19:22:44 <jomara> lifeless: i prefer the "precompiled" variety
19:22:55 <derekh> lifeless: nearly see https://etherpad.openstack.org/p/cloud-outage
19:23:06 <marios> so i guess no tripleo meet today
19:23:26 <marios> oops. hi, should have scrolled down
19:23:34 <derekh> I'm working on this one, no progress, kind of waiting on dprince's fix to ensure bridge to land first
19:23:46 <derekh> https://bugs.launchpad.net/tripleo/+bug/1272969
19:23:47 <tchaypo> marios: welcome ;)
19:23:59 <marios> tchaypo: tx :)
19:24:06 <slagle> https://bugs.launchpad.net/tripleo/+bug/1300663
19:24:13 <slagle> https://bugs.launchpad.net/tripleo/+bug/1290490
19:24:20 <slagle> unassigned criticals ^^
19:24:48 <derekh> slagle: thanks, any takers?
19:25:32 <derekh> dprince: and Ng both were looking at https://bugs.launchpad.net/tripleo/+bug/1300663 I believe there is a patch pushed up
19:25:55 <Ng> https://review.openstack.org/#/c/84466/
19:26:10 <slagle> i believe https://bugs.launchpad.net/tripleo/+bug/1290490 could be bumped down to High
19:26:16 <Ng> is the simple fix with no refactoring to make dhcp-all-interfaces less bonkers
19:26:22 <derekh> Ng: will take a proper look at it tomorrow
19:26:34 <bnemec> slagle: Agreed
19:26:40 <derekh> Ng: but looked sane when I looked earlier
19:26:49 <Ng> derekh: np. it's literally just adding $INTERFACE to stop dhcp-all-interfaces going crazy :)
19:27:11 <derekh> slagle: agreed, *changed to high*
19:27:51 <slagle> yea, i left a comment too and switched to incomplete
19:27:57 <slagle> we're awaiting submitter feedback :)
19:28:36 <derekh> slagle: ok
19:28:47 <derekh> we got another critical in o-c-c https://bugs.launchpad.net/os-collect-config/+bug/1299110
19:29:06 <derekh> fix propose needs review
19:29:12 <derekh> *proposed
19:29:27 <derekh> moving on
19:29:31 <slagle> i see no other untriaged or unassigned crit bugs
19:29:41 <derekh> slagle: cool
19:29:56 <greghaynes> Seems like untriaged bot is working :)
19:29:59 <derekh> #topic reviews
19:30:09 <derekh> greghaynes: yup, it does
19:30:28 <derekh> how are we on reviews this week?
19:31:16 <derekh> my quick glance shows that the review queue is longer than last week so commitments to do more reviews don't seem to have helped much
19:31:51 <derekh> lifeless: was also going to send out a mail to maybe increase the number of reviewers
19:31:55 <lifeless> possibly still more inputs ?
19:31:58 <lifeless> yes! I am.
19:32:00 <jistr> derekh++
19:32:01 <lifeless> cores anyhow
19:32:12 <derekh> lifeless: yup, cores sorry
19:32:17 <slagle> we have improved some from last week
19:32:23 <slagle> Stats since the last revision without -1 or -2 :
19:32:24 <slagle> Average wait time: 3 days, 19 hours, 59 minutes
19:32:34 <tchaypo> Have we fixed the untriaged nagbot to only nag about untriaged and not incomplete?
19:32:36 <slagle> that was 5 days last week
19:32:50 <lifeless> slagle: +1
19:32:54 <derekh> so let's keep the commitments up and wait for mail from lifeless
19:33:08 <lifeless> 3rd quartile wait time: 4 days, 20 hours, 50 minutes
19:33:09 <derekh> slagle: sounds good
19:33:24 <rpodolyaka1> tchaypo: it should not nag incomplete, unless the bug submitter has responded
19:33:48 <derekh> any other suggestions on reviews? things seem to be improving so let's keep it up :-)
19:34:41 <tchaypo> rpodolyaka1: ah, maybe that's what it was doing and we all just tuned them out because we saw them as not untriaged
19:34:53 <derekh> ok, moving on
19:34:55 <derekh> #topic Projects needing releases
19:35:05 <rpodolyaka1> I'm still up for this, no problems
19:35:07 <slagle> i'd like to volunteer for releases this week
19:35:14 <tchaypo> I added a link to https://wiki.openstack.org/wiki/TripleO asserting that we follow the standard tripleo reviewchecklist
19:35:19 <rpodolyaka1> slagle: go ahead :)
19:35:23 <slagle> and create the stable branches for tripleo-* as well during the release process
19:35:37 <slagle> rpodolyaka1: your wiki page has been helpful to get up to speed on the process :)
19:35:39 <tchaypo> I don't know if that's actually the case but it looked like a useful link so I thought I'd just do it and wait for people to get upset if it's wrong
19:35:39 <derekh> #action slagle to release projects
19:35:50 <slagle> i will need lifeless to add me to the tripleo-ptl group in gerrit though
19:36:10 <rpodolyaka1> slagle: cool!
19:36:27 <derekh> lifeless: is that something you can do?
19:36:43 <lifeless> hi yes!
19:37:04 <tchaypo> speaking of links on wiki pages though - maybe rpodolyaka1's page could be linked from the TripleO page as well?
19:37:09 <lifeless> slagle: have you got fully up to speed (-infra /really/ don't like mistakes in the releasing of things )
19:37:15 <rpodolyaka1> tchaypo: good point!
19:37:39 <derekh> tchaypo: sounds like a good suggestion
19:38:00 <slagle> lifeless: i've read the wiki page and understand the steps
19:38:08 <tchaypo> I'd add the link myself, if i knew where it should point ;)
19:38:17 <lifeless> adding you
19:38:20 <slagle> lifeless: beyond that, i guess i don't know what i don't know
19:38:35 <derekh> rpodolyaka1: can you point tchaypo at the link please
19:38:56 <lifeless> slagle: done
19:39:06 <rpodolyaka1> tchaypo: https://wiki.openstack.org/wiki/TripleO/ReleaseManagement
19:39:50 <tchaypo> thanks
19:39:52 <derekh> lifeless: slagle ok sounds like ye can progress on that outside of meeting
19:40:01 <derekh> moving on
19:40:06 <slagle> yea, i'll ask if anything is not clear
19:40:15 <slagle> i do not want any -infra wrath :)
19:40:20 <derekh> #topic CI cloud status
19:40:35 <derekh> so hardware failure over the weekend
19:41:03 <derekh> ci-overcloud was redeployed and we've been dealing with issues ever since
19:41:43 <SpamapS> sorry .. meatspace issues
19:41:44 <SpamapS> o/
19:41:55 <derekh> ci was up last night for a bit but the init process quickly hit its limit of file descriptors
19:42:08 <derekh> the fix for that is in dhcp-all-interfaces
19:42:19 <derekh> current status is
19:42:59 <derekh> it's still down but we made progress in the last few hours, zuul is now running jobs but they will fail because we need a new geard broker
19:43:10 <derekh> what we tried and current status is here https://etherpad.openstack.org/p/cloud-outage
19:43:18 <tchaypo> slagle: i updated the wording about stable branches as well, I'm hoping you agree with the wording (although it probably needs to change very soon, once we actually have the stable branches)
19:43:18 <derekh> anything else? questions ?
19:43:40 <slagle> tchaypo: ok, will check that out
19:44:19 <derekh> What I have put in the few lines below "Apparent Solution : neutron floatingip-delete" is what I think needs to happen next
19:44:32 <lifeless> derekh: nova thinks only 2 hypervisors are up
19:44:37 <tchaypo> derekh: no question, just a note that once things calm down I want to start getting access to and familiar with the CI infra so i can take some of the load next time this happens
19:44:42 <lifeless> derekh: I'm trying to reconcile that with your status update
19:44:51 <derekh> lifeless: the others still need the dhcp-all-interfaces update
19:44:58 <lifeless> derekh: ah! where is it
19:45:14 <derekh> we were manually poking at compute 4 5 and 6
19:46:01 <derekh> lifeless: manually change the dhcp-all-interfaces.conf upstart config to say
19:46:04 <derekh> exec /usr/local/sbin/dhcp-all-interfaces.sh $INTERFACE
19:46:18 <lifeless> tchaypo: cool, we're always looking for more admins - basically when you feel you know enough of the setup (by asking, following along, reviewing) submit yourself to the team
19:46:26 <derekh> lifeless: it's not the long term fix but seems to be good enough to get us going again
19:48:16 <derekh> lifeless: sound ok to you?
19:48:45 <lifeless> oh wow
19:48:49 <lifeless> that's a big thinko isn't it
19:49:07 <lifeless> novacompute4 has no dhcp-all-interfaces job ?
19:49:17 <lifeless> derekh: ok so recovery is- roll that out to all the nodes
19:49:23 <lifeless> derekh: rebuild the broker ?
19:49:26 <lifeless> derekh: profit ?
19:49:31 <derekh> lifeless: yup, that's the plan
19:49:52 <derekh> lifeless: I gotta run after this meeting so can you take over?
19:50:28 <derekh> at least that's my plan
19:50:45 <derekh> ok moving on again
19:50:47 <derekh> #topic CI
19:51:14 <derekh> ok, so when the cloud is up the jobs themselves seem to be mostly stable
19:51:22 <derekh> I've been keeping an eye on them
19:51:55 <derekh> sometimes we get fails, we need to chase those down
19:52:06 <derekh> and we also get broken by other projects
19:52:16 <slagle> agreed, we were humming along nicely there for a while
19:52:53 <derekh> we were broken last week by changes in swift and then neutron (although the neutron thing could arguably be our fault)
19:53:15 <SpamapS> Sure there are definitely things that are our fault, but that we find out after they've landed.
19:53:19 <notmyname> derekh: ?
19:53:27 <lifeless> derekh: I can and will
19:53:36 <SpamapS> Let's just stay focused, and keep treating our jobs as a gate.
19:53:45 <derekh> SpamapS: yup, good suggestion
19:53:52 <notmyname> derekh: please let me know (maybe after the meeting) how changes in swift have broken things for you. first I've heard of it
19:53:54 <derekh> notmyname: lifeless sent a mail to the list
19:54:01 <SpamapS> We'll have HA soon, and thus ci-overcloud will be less firedrill prone.
19:54:19 <derekh> notmyname: will dig details up for you after
19:54:25 <notmyname> derekh: thanks
19:54:27 <derekh> SpamapS: sounds good
19:54:46 <derekh> any other CI observations ?
19:55:59 <derekh> notmyname: a swift change made changes to permissions on the ring files, here was our fix https://review.openstack.org/#/c/83645/
19:56:10 <slagle> tuskar folks: i'm assuming you want stable icehouse branches as well?
19:56:37 <derekh> in general, if people could keep an eye on http://goodsquishy.com/downloads/tripleo-jobs.html, if you see 4 or 5 jobs fail in a row we have a problem
19:57:00 <derekh> time is short so
19:57:01 <derekh> #topic open discussion
19:57:05 <slagle> tuskar folks: i'm assuming you want stable icehouse branches as well?
19:57:14 <derekh> anything people want to talk about for 3 minutes ?
19:58:08 <tchaypo> favorite cat pictures of the week?
19:58:13 <jcoufal> slagle: I'd say so
19:58:18 <jdob> slagle: lsmola and jistr would know best, but I think so
19:58:23 <notmyname> derekh: thanks. we should talk after
19:58:31 <lsmola> slagle, not sure, we kind of rely on some new heat features e.g.
19:58:47 <ccrouch> i know jprovazn: had questions about next steps for HA
19:58:49 <matty_dubs> tchaypo: What about favorite doge pics? http://www.mst.edu/
19:58:51 <slagle> lsmola: right, but those features will be in heat icehouse right?
19:58:59 * ccrouch nudges jprovazn
19:59:03 <jdob> i thought heat was done with features for icehouse
19:59:06 <SpamapS> next steps for HA is to have Heat inform nodes when they're about to be rebooted or deleted.
19:59:12 <jprovazn> yes, but probably on #tripleo channel after the meeting
19:59:14 <derekh> notmyname: I gotta run quick after this but can pick up on it tomorrow no problem
19:59:16 <SpamapS> https://review.openstack.org/#/c/81666/ will need to land
19:59:19 <lsmola> slagle, most of them in Juno I hope
19:59:27 <SpamapS> which will move us to using software config/deployment from Heat
19:59:28 <derekh> notmyname: or lifeless can fill you in
19:59:38 <slagle> lsmola: i see, let's continue in #tripleo
19:59:46 <SpamapS> then we have to write a resource plugin which will create a deployment just to tell the server that it is being rebuilt or deleted
20:00:05 * jistr switches to #tripleo
20:00:25 <derekh> SpamapS: I've been meaning to review that, will do tomorrow assuming overcloud is running
20:00:37 <ccrouch> SpamapS: any more element stuff urgently required? now we have mysql/rabbitmq/keepalived
20:00:49 <SpamapS> It is blocked by https://review.openstack.org/#/c/83614/
20:00:53 <derekh> ok time's up, let's get HA discussion moving on #tripleo
20:00:58 <greghaynes> the mysql isn't quite done but it's in the review pipeline
20:01:07 <SpamapS> -> #tripleo
20:01:11 <greghaynes> yep
20:01:20 <derekh> #endmeeting