19:03:18 <lifeless> #startmeeting tripleo
19:03:18 <openstack> Meeting started Tue Apr 22 19:03:18 2014 UTC and is due to finish in 60 minutes.  The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:21 <openstack> The meeting name has been set to 'tripleo'
19:03:41 <lifeless> #topic agenda
19:03:48 <lifeless> bugs
19:03:48 <lifeless> reviews
19:03:48 <lifeless> Projects needing releases
19:03:48 <lifeless> CD Cloud status
19:03:49 <lifeless> CI
19:03:54 <lifeless> Atlanta stuff
19:04:04 <lifeless> Open discussion
19:04:46 <lifeless> #topic bugs
19:05:09 <lifeless> #link https://bugs.launchpad.net/tripleo/
19:05:09 <lifeless> #link https://bugs.launchpad.net/diskimage-builder/
19:05:09 <lifeless> #link https://bugs.launchpad.net/os-refresh-config
19:05:10 <lifeless> #link https://bugs.launchpad.net/os-apply-config
19:05:10 <lifeless> #link https://bugs.launchpad.net/os-collect-config
19:05:12 <lifeless> #link https://bugs.launchpad.net/tuskar
19:05:14 <lifeless> #link https://bugs.launchpad.net/python-tuskarclient
19:05:21 <lifeless> hmm, I think we need to add os-cloud-config to that now/soon.
19:05:51 <jcoufal> o/
19:05:53 <lifeless> 9 criticals in tripleo :(
19:06:18 <lifeless> but untriaged is in better shape \o/
19:06:31 <akrivoka> o/
19:06:45 <bnemec> Hooray for untriaged-bot :-)
19:07:24 <lifeless> so the criticals
19:07:25 <marios> i tried to set up a test-env for https://bugs.launchpad.net/neutron/+bug/1290486 today, but wasn't able to immediately repro. (i did nova boot as suggested by reporter). will continue to poke tomorrow
19:08:07 <lifeless> tchaypo: did you get up at awful oclock today? ^
19:08:55 <lifeless> derekh: do you need help on https://review.openstack.org/#/c/88223/ ?
19:09:06 <derekh> https://bugs.launchpad.net/tripleo/+bug/1308407 is killing us daily
19:09:31 <derekh> lifeless: could do with somebody who knows nodepool better than I do to confirm the problem and my approach
19:09:47 <lifeless> derekh: ok, I'll look
19:09:58 <derekh> lifeless: but I'm pretty sure what I have reproduced locally is what is happening
19:10:10 <lifeless> derekh: I think your analysis is right, but did you consider just switching the order of the nodes in nodepool.yaml ?
19:11:27 <lifeless> so all criticals have assignees
19:11:35 <lifeless> anyone need help with their bug ?
19:11:40 <lifeless> are the assignees stale ?
19:11:52 <tchaypo> Yes, awful o'clock. And because I'm not used to the time yet, and because I'm not at home, that actually meant waking up every 10 minutes starting at awful-1 o'clock in a panic
19:12:02 <derekh> lifeless: yes, that may work, it didn't in my test but I think it may have if I made the ratio 2:1 instead of 4:1 (precise:f20)
19:13:59 <lifeless> ok, let's move on, since no one else is asking for help :)
19:14:05 <lifeless> #topic reviews
19:14:22 <lifeless> http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:14:33 <lifeless> 
19:14:34 <lifeless> Stats since the last revision without -1 or -2 :
19:14:34 <lifeless> Average wait time: 8 days, 16 hours, 28 minutes
19:14:34 <lifeless> 1st quartile wait time: 4 days, 8 hours, 45 minutes
19:14:34 <lifeless> Median wait time: 6 days, 15 hours, 18 minutes
19:14:36 <lifeless> 3rd quartile wait time: 13 days, 4 hours, 34 minutes
19:14:42 <lifeless> what was it last week ?
19:15:17 <marios> lifeless: what's the status on the config pass-through work... we are getting flooded by the various 'enable foo' config reviews
19:15:21 <greghaynes> worse than last week :/
19:15:37 <lifeless> marios: ironically, waiting on reviews I believe
19:15:39 <marios> should we continue to pass these with a light touch
19:15:59 <bnemec> marios: Maybe those should be -2'd and revisited after the passthrough is in to see if that addresses their need?
19:16:12 <lifeless> https://review.openstack.org/#/c/87843/
19:16:19 <lifeless> https://review.openstack.org/#/c/87844/
19:16:36 <marios> bnemec: yeah i think that may be a good idea, otherwise we'll end up with a huge number of enabled options
19:17:08 <greghaynes> 87844 failed tests
19:17:15 <marios> why can't 87843 be pushed?
19:17:17 <marios> (approved)
19:17:17 <lifeless> I just noticed that
19:17:21 * marios looks more closely
19:17:39 <bnemec> 87843 needs a recent CI pass
19:17:43 <lifeless> looks like it can to me
19:19:01 <lifeless> 87844 will fail until 87843 is in
19:19:07 <lifeless> cross project dependency
19:19:08 * marios about to approve unless the +1 with nits have objections?
19:19:13 <marios> 87843
19:19:43 <lifeless> there's only been one patch on trunk since
19:19:47 <lifeless> which was undercloud
19:19:54 <lifeless> so the test results should still be valid
19:20:13 <marios> done
19:21:30 <lifeless> sounds like we should do a pass over all the reviews going 'plumbing - -1' for any we want to use passthrough
19:21:38 <lifeless> light touch, clear it out
19:21:43 <lifeless> or perhaps even -2
19:22:01 <lifeless> note that some may need more t-i-e patches to passthrough enable additional files
19:22:43 <lifeless> thoughts?
19:23:05 <greghaynes> I like -2 - makes the list of reviews to have a look at a lot smaller
19:23:18 <jdob> i like the idea of -2 to make it seem more like a concerted effort to purge the queue
19:23:24 <marios> i agree with this approach too
19:23:38 <bnemec> +1 to -2 :-)
19:23:55 <jdob> bnemec: i couldn't help but add that up in my head
19:23:57 <marios> then they can revisit once software config is done. for anything that is critical we can get in on a per review case (critical/urgent)
19:24:12 <Ng> weirdly, I thought we'd already done -2s to all the little config option reviews
19:24:19 <Ng> but +1 to the idea
19:24:31 <greghaynes> I've been holding off until I have something to actually point people at...
19:24:34 <jdob> Ng: i kinda thought the same thing, I thought they were all in a holding pattern
19:24:50 <lifeless> we -2'd all the heat things that were interfering with software-config
19:24:51 <greghaynes> which seems like it should be merged momentarily
19:24:53 <lifeless> that's landed
19:25:00 <lifeless> ok
19:25:19 <lifeless> #note core reviewers to do a one-pass identify-plumbing-and--2-the-world
19:25:36 <lifeless> #topic
19:25:38 <lifeless> #topic Projects needing releases
19:26:23 <lifeless> any wolunteers?
19:26:37 <lifeless> or as we say in nz 'volunteears'?
19:27:07 <lifeless> tap tap tap ?
19:27:26 <ccrouch> i'll throw slagle under the bus again
19:27:39 <ccrouch> given he's not here to defend himself ;-)
19:27:45 <lsmola> lol
19:27:56 <marios> ccrouch: good guy charles crouch
19:27:58 <lifeless> ccrouch: hah!
19:28:13 <lifeless> #action slagle to debusify himself and do releases of the world.
19:28:24 <lifeless> #topic CD cloud status
19:28:36 <lifeless> dprince: / derekh: how's the RH region? I haven't poked at it recently
19:28:42 <lifeless> it's not in CI yet I presume?
19:28:52 <dprince> lifeless: it should be soon...
19:28:52 <derekh> lifeless: no, the patch is waiting to be merged
19:28:53 <clarkb> it isn't. on my list of things to do this afternoon
19:29:10 <lifeless> wwwwwicked
19:29:11 <dprince> https://review.openstack.org/#/c/83057/
19:29:18 <lifeless> healthy otherwise?
19:29:32 <lifeless> (I know, hard to say w/out load on it)
19:29:38 <dprince> lifeless: I think so. We'll see
19:29:46 <lifeless> The HP region failed yesterday, SpamapS caught that one.
19:29:54 <derekh> lifeless: I believe so, although haven't tried it in over a week,
19:29:55 <clarkb> going to grab lunch then push that through. hopefully will start going in about 2 hours
19:29:59 <lifeless> I believe it to be fully up and happy again now
19:30:16 <dprince> lifeless: The F20 jobs in particular are running slowish in general though (on the HP rack)
19:30:35 <dprince> be interesting to see what the timings are on the RH rack for comparison
19:30:36 <derekh> clarkb: I should be back on later if any problems pop up when you flick the switch
19:30:38 <lifeless> dprince: at a guess that will be mirror access performance
19:30:48 <clarkb> derekh: thanks
19:30:50 <lifeless> but we can look at the log to see
19:30:56 <dprince> lifeless: yep, we could try that
19:31:22 <derekh> lifeless: the devstack-gate setup stuff consistently takes 5 minutes longer on f20, it's on my list to figure out
19:31:23 <dprince> derekh: what are your thoughts on the F20 slowness, will simply mirroring the RPMs closer help?
19:31:38 <lifeless> derekh: oh, interesting.
19:31:53 <derekh> lifeless: dprince  rpm mirror/cache might help also
19:31:56 <lifeless> so we have a list of things we want to do local mirrors for
19:32:13 <lifeless> I believe that's in the list, right ?
19:32:22 <derekh> lifeless: a lot of nodes seem to be going to the error state today, I took a quick look between appointments today and 3 compute nodes are having problems taking traffic (I couldn't ping them)
19:32:24 <lifeless> also
19:32:26 <lifeless> #topic CI
19:32:44 <lifeless> derekh: ruh roh, we might have an uptime related bug then
19:32:47 <derekh> but still reporting to nova as running, sounds like the issue we had on the controller a couple of times.
19:33:09 <derekh> lifeless: I haven't dug into it much more than that
19:33:10 <lifeless> derekh: since SpamapS saw a couple fall over - or perhaps we don't have the mellanox driver on the non-compute nodes
19:33:45 <lifeless> lets get the RH region going, then perhaps take the time to finish the automated bringup work, then redeploy the RH region with trusty
19:33:57 <lifeless> which we know makes the hardware in that rack much happier
19:34:20 <derekh> lifeless: did you mean the HP region with trusty?
19:34:26 <lifeless> derekh: yes
19:34:33 <derekh> lifeless: sounds like a plan
19:34:41 <lifeless> derekh: with RH live, we won't have CI downtime in the same way
19:34:54 <lifeless> we'll backlog but we won't halt
19:34:59 <derekh> yup
19:35:07 <dprince> lifeless: why are we switching the RH region to trusty again :)
19:35:26 <SpamapS> doh just noticed the time sorry guys
19:35:26 <lifeless> dprince: we're not - I'm keen to have every region's cloud be a different OS
19:35:35 <lifeless> #topic Atlanta stuff
19:35:41 <derekh> dprince: it was a typo
19:35:49 <dprince> derekh: got it :)
19:35:53 <SpamapS> lifeless: mellanox was not part of the failure yesterday.
19:36:06 <SpamapS> lifeless: there was a failure to load the module on the controller on boot.
19:36:07 <lifeless> SpamapS: ack; the panic was different?
19:36:15 <lifeless> SpamapS: on the hypervisors?
19:36:19 <SpamapS> but it was actually powered off, inexplicably
19:36:35 <SpamapS> two hypervisors were down, 1 was frozen entirely. The other had a kernel panic
19:36:42 <lifeless> SpamapS: derekh is saying he's seeing more hypervisors falling over
19:36:48 <dprince> lifeless: how many session spots do we have for Atlanta?
19:36:54 <lifeless> SpamapS: with symptoms that look like the mellanox fail
19:36:55 <lifeless> dprince: 6
19:37:04 <dprince> lifeless: well that is crap
19:37:15 <SpamapS> well
19:37:21 <SpamapS> my suggestion is that we update to trusty
19:37:21 <dprince> whose bad side did we get on?
19:37:34 <lifeless> dprince: http://lists.openstack.org/pipermail/openstack-dev/2014-April/033317.html
19:37:39 <lifeless> dprince: that's more than Ironic
19:37:44 <SpamapS> since it has the good version of the mellanox driver, and we need to get there anyway
19:37:49 <lifeless> SpamapS: indeed, mine too - see above ;)
19:37:56 <SpamapS> oh right :)
19:38:44 <lifeless> https://etherpad.openstack.org/p/tripleo-icehouse-summit <-
19:39:00 <lifeless> I'm looking to folk to help assess the sessions
19:39:12 <dprince> lifeless: I would really like to hash out the network stuff, in particular since there was so much pushback w/ the ensure-bridge refactoring.
19:39:16 <lifeless> PTL is meant to be an enabler and tie breaker - sadly only PTL can approve the sessions
19:39:35 <derekh> lifeless: should we take 6 votes each?
19:39:44 <bnemec> This will be my first summit, so I have to admit I don't really know what makes for a good session.
19:39:45 <lifeless> derekh: yeah, that might be a good way
19:39:55 <lifeless> bnemec: ok -
19:39:56 <lifeless> It seems to me we should focus on things where either:
19:39:56 <lifeless> - we need to build basic consensus
19:39:56 <lifeless> - crowdsourcing is at play
19:40:00 <lifeless> bnemec: ^ they are key IMO
19:40:09 <bnemec> lifeless: Okay, thanks
19:40:15 <lifeless> bnemec: on the side of the person putting the session forward, they need to do prep work
19:40:24 <lifeless> turning up and saying 'lets chat' == poor outcome usually
19:40:33 <bnemec> Sure, makes sense
19:40:36 <SpamapS> In my mind, summit sessions are places to build consensus on issues that are somewhat complex and could go in multiple directions.
19:40:55 <dprince> SpamapS: exactly ++
19:40:55 <lifeless> for consensus stuff, having a good well thought out overview and then drilling in to figure out where we're disagreeing - good
19:41:23 <lifeless> for crowd sourcing aspects, it's similar - have good explanations about it all and then <blank here people need to help out> bits
19:41:26 <SpamapS> They're not places to do the bulk of design work, as design by committee is not awesome.
19:41:58 <SpamapS> Good crowd sourced things are "what are some concrete use cases for this."
19:42:08 <lifeless> as a for instance, SpamapS and I are going to be proposing some fairly deep and extensive changes to heat's internals, and for that I expect we'll do a couple of hours of prep beforehand, at least.
19:42:32 <lifeless> so that the discussion can be effective
19:43:00 <lifeless> a related thing I'd like for atlanta is the new specs repo to be online
19:43:10 <lifeless> I don't think anyone has volunteered to get that set up yet ?
19:43:59 <derekh> lifeless: is it just a matter of creating a blank repo? I can get that together
19:45:03 <lifeless> derekh: yeah - copy the nova one which has doc building and a template spec
19:45:15 <derekh> lifeless: ok, will do
19:45:16 <lifeless> derekh: get it into openstack/ in gerrit
19:45:23 <derekh> yup
19:45:29 <lifeless> #action derekh to setup tripleo-specs repo
19:45:30 <lifeless> thanks!
19:45:34 <derekh> np
19:47:06 <lifeless> with only 6 sessions
19:47:12 <lifeless> I expect a fair number of double duty ones
19:47:19 <lifeless> like CI might touch on several aspects.
19:47:28 <lifeless> the very last session overlaps the wrap-up
19:47:45 <lifeless> so we might use that for either super contentious stuff, or (relatively) niche... I dunno.
19:47:50 <lifeless> any other atlanta stuff ?
19:48:04 <ttx> lifeless: note that you'll have a tripleO "project pod"
19:48:08 <ttx> for extra discussions
19:48:19 <lifeless> ttx: yeap
19:48:22 <SpamapS> lifeless: we don't have any conference sessions.
19:48:33 <lifeless> SpamapS: whew, we can all get work done :)
19:48:38 <SpamapS> lifeless: talks rather. We might want to collaborate on a single lightning talk submission.
19:48:41 <ttx> lifeless: I placed it close to the ironic pod for cross-pollination
19:49:07 <lifeless> ttx: hah... so near heat or nova might be better - Ironic and TripleO are very well connected
19:49:27 <lifeless> but we only have one pseudopod into heat, and only 3 or so into nova
19:49:39 * ttx checks the map
19:50:39 <ttx> same floor but separate rooms
19:50:44 <lifeless> all good
19:50:49 <ttx> anyway, Nova has no pod
19:50:55 <ttx> since they have sessions running all the time
19:51:00 <lifeless> #topic open discussion
19:51:27 <jprovazn> one q regarding pass-through patches
19:51:51 <jprovazn> it can be used for adding extra set of config options, right?
19:52:02 <jprovazn> something like this: https://review.openstack.org/#/c/88105/1/elements/haproxy/os-config-applier/etc/haproxy/haproxy.cfg
19:52:09 <jprovazn> couldn't be done with it
19:52:18 <lifeless> jprovazn: its got three parts
19:52:40 <lifeless> jprovazn: we passthrough enable a config file by dropping an entirely data driven section into the moustache template
19:53:03 <lifeless> jprovazn: then we generate data into that section via heat, picked up from a user parameter
19:53:16 <lifeless> jprovazn: finally the user passes in a json struct matching this
19:53:55 <tchaypo> so for haproxy.cfg, we'd need to passthrough-enable haproxy.cfg, and the user would have to provide a json file?
19:54:27 <lifeless> jprovazn: Until we figure out how to tell heat to do the merging of keys/values etc (and whatever semantics we want for that), we can't really union the user input and heat calculated inputs
19:54:40 <lifeless> jprovazn: that haproxy section looks more like heat calculated stuff to me
19:54:42 <greghaynes> AIUI you need to make sure the app will deal with duplicated config options sanely, which is something I'm not sure about with haproxy
19:55:21 <lifeless> in particular for this example, haproxy doesn't have a dict-model for its config file
19:55:55 <lifeless> so you might do something similar, but different, to make the haproxy stuff much more configurable
19:56:45 <lifeless> so this is kind of in the bucket of 'v2 passthrough' where we take the time to figure out all the possible use cases and design a long term answer. IMO.
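[editor's note] The three-part passthrough flow described above (a data-driven section in the template, data generated via heat from a user parameter, and a user-supplied JSON struct) can be sketched roughly as follows. This is only an illustration in plain Python, not the actual mustache template or heat machinery: the JSON layout (`name`, `values`, `option`, `value`), the function name `render_passthrough`, and the sample options are all hypothetical. It also shows why duplicated options matter: with no union of user input and calculated input, the user's section is simply appended after the existing one.

```python
import json


def render_passthrough(sections):
    """Render a list of user-supplied sections as INI-style text.

    `sections` is a hypothetical structure:
    [{"name": "DEFAULT", "values": [{"option": "debug", "value": "True"}]}]
    """
    lines = []
    for section in sections:
        lines.append("[{}]".format(section["name"]))
        for item in section["values"]:
            lines.append("{} = {}".format(item["option"], item["value"]))
        lines.append("")  # blank line after each section
    return "\n".join(lines)


# Part 3: the user passes in a JSON struct (hypothetical content).
user_json = '[{"name": "DEFAULT", "values": [{"option": "debug", "value": "True"}]}]'

# Stand-in for the part of the config file that the template already renders.
template_base = "[DEFAULT]\nverbose = True\n"

# Parts 1 and 2: the data-driven section is appended as-is, with no merging,
# so the [DEFAULT] section appears twice in the rendered file.
rendered = template_base + "\n" + render_passthrough(json.loads(user_json))
print(rendered)
```

Whether the consuming service tolerates the repeated section is exactly the concern raised in the discussion, and a file without a dict-like model, as noted for haproxy.cfg, would not fit this shape as written.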
19:56:47 <jprovazn> lifeless, well, I noticed this patch today, I was thinking that setting a default value in heat template + accepting optional value from metadata would be most reasonable
19:56:59 <lifeless> jprovazn: I think that's entirely reasonable.
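[editor's note] The approach jprovazn describes (a default value set in the heat template, with an optional value accepted from metadata) might look something like the sketch below. The option name `example_timeout`, its value, and the function `effective_options` are made up for illustration and are not taken from the review under discussion.

```python
# Hypothetical default, standing in for a value set in the heat template.
TEMPLATE_DEFAULTS = {"example_timeout": 30}


def effective_options(defaults, metadata):
    """Return the defaults, overridden by any matching metadata values.

    Keys that have no default are ignored, so only options the template
    explicitly exposes can be changed this way.
    """
    merged = dict(defaults)
    for key, value in metadata.items():
        if key in defaults and value is not None:
            merged[key] = value
    return merged


print(effective_options(TEMPLATE_DEFAULTS, {}))
print(effective_options(TEMPLATE_DEFAULTS, {"example_timeout": 60}))
```

Unlike passthrough, this keeps the set of tunable options small and explicit, at the cost of one patch per option.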
19:57:15 <jprovazn> but was not sure how far it's in conflict with pass through, thanks for clarification
19:57:19 <lifeless> jprovazn: and/or - feel free to push back a little on the patch and ask why it needs to be configurable.
19:57:47 <jprovazn> lifeless, yes, that's my plan ;)
19:57:49 <lifeless> the HP folk putting forward these patches have /lots/ of production experience - they may well be able to say 'X works better', and we can just change to X.
19:58:04 <lifeless> jprovazn: \o/
19:58:09 <jprovazn> good to know
19:59:36 <SpamapS> Yeah I'd like to see us default to things that production-hardened people want.
20:00:09 <lifeless> thanks for coming everyone!
20:00:11 <lifeless> #endmeeting