19:03:18 <lifeless> #startmeeting tripleo
19:03:18 <openstack> Meeting started Tue Apr 22 19:03:18 2014 UTC and is due to finish in 60 minutes. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
19:03:19 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
19:03:21 <openstack> The meeting name has been set to 'tripleo'
19:03:41 <lifeless> #topic agenda
19:03:48 <lifeless> bugs
19:03:48 <lifeless> reviews
19:03:48 <lifeless> Projects needing releases
19:03:48 <lifeless> CD Cloud status
19:03:49 <lifeless> CI
19:03:54 <lifeless> Atlanta stuff
19:04:04 <lifeless> Open discussion
19:04:46 <lifeless> #topic bugs
19:05:09 <lifeless> #link https://bugs.launchpad.net/tripleo/
19:05:09 <lifeless> #link https://bugs.launchpad.net/diskimage-builder/
19:05:09 <lifeless> #link https://bugs.launchpad.net/os-refresh-config
19:05:10 <lifeless> #link https://bugs.launchpad.net/os-apply-config
19:05:10 <lifeless> #link https://bugs.launchpad.net/os-collect-config
19:05:12 <lifeless> #link https://bugs.launchpad.net/tuskar
19:05:14 <lifeless> #link https://bugs.launchpad.net/python-tuskarclient
19:05:21 <lifeless> hmm, I think we need to add os-cloud-config to that now/soon.
19:05:51 <jcoufal> o/
19:05:53 <lifeless> 9 criticals in tripleo: (
19:06:18 <lifeless> but untriaged is in better shape \o/
19:06:31 <akrivoka> o/
19:06:45 <bnemec> Hooray for untriaged-bot :-)
19:07:24 <lifeless> so the criticals
19:07:25 <marios> i tried to setup a test-env for https://bugs.launchpad.net/neutron/+bug/1290486 today, but wasn't able to immediately repro. (i did nova boot as suggested by reporter). will continue to poke tomorrow
19:08:07 <lifeless> tchaypo: did you get up at awful oclock today? ^
19:08:55 <lifeless> derekh: do you need help on https://review.openstack.org/#/c/88223/ ?
19:09:06 <derekh> https://bugs.launchpad.net/tripleo/+bug/1308407 is killing us daily ,
19:09:31 <derekh> lifeless: could do with somebody who knows nodepool better then I confirm the problem and my approach
19:09:47 <lifeless> derekh: ok, I'll look
19:09:58 <derekh> lifeless: but I'm pretty sure what I have reproduced locally is what is happening
19:10:10 <lifeless> derekh: I think your analysis is right, but did you consider just switching the order of the nodes in nodepool.yaml ?
19:11:27 <lifeless> so all criticals have assignees
19:11:35 <lifeless> anyone need help with their bug ?
19:11:40 <lifeless> are the assignees stale ?
19:11:52 <tchaypo> Yes, awful o'clock. And because I'm not used to the time yet, and because I'm not at home, that actually meant waking up every 10 minutes starting at awful-1 o'clock in a panic
19:12:02 <derekh> lifeless: yes, that may work, it didn;t in my test but I think it may have if I made the ratio 2:1 instead of 4:1 (precise:f20)
19:13:59 <lifeless> ok, lets move on, since noone else is asking for help :)
19:14:05 <lifeless> #topic reviews
19:14:22 <lifeless> http://russellbryant.net/openstack-stats/tripleo-openreviews.html
19:14:33 <lifeless>
19:14:34 <lifeless> Stats since the last revision without -1 or -2 :
19:14:34 <lifeless> Average wait time: 8 days, 16 hours, 28 minutes
19:14:34 <lifeless> 1rd quartile wait time: 4 days, 8 hours, 45 minutes
19:14:34 <lifeless> Median wait time: 6 days, 15 hours, 18 minutes
19:14:36 <lifeless> 3rd quartile wait time: 13 days, 4 hours, 34 minutes
19:14:42 <lifeless> what was it last week ?
19:15:17 <marios> lifeless: what's the status on the config pas-through work...
we are getting flooded by the various 'enable foo' config reviews
19:15:21 <greghaynes> worse than last week :/
19:15:37 <lifeless> marios: ironically, waiting on reviews I believe
19:15:39 <marios> should we continue to pass these with a light touch
19:15:59 <bnemec> marios: Maybe those should be -2'd and revisited after the passthrough is in to see if that addresses their need?
19:16:12 <lifeless> https://review.openstack.org/#/c/87843/
19:16:19 <lifeless> https://review.openstack.org/#/c/87844/
19:16:36 <marios> bnemec: yeah i think that may be a good idea, otherwise we'll end up with a huge number of enabled options
19:17:08 <greghaynes> 87844 failed tests
19:17:15 <marios> why cant 87843 be pushed?
19:17:17 <marios> (approved)
19:17:17 <lifeless> I just noticed that
19:17:21 * marios looks more closely
19:17:39 <bnemec> 87843 needs a recent CI pass
19:17:43 <lifeless> looks like it can to me
19:19:01 <lifeless> 87844 will fail until 87843 is in
19:19:07 <lifeless> cross project dependency
19:19:08 * marios about to approve unless the +1 with nits have objections?
19:19:13 <marios> 87843
19:19:43 <lifeless> theres only been one patch on trunk since
19:19:47 <lifeless> which was undercloud
19:19:54 <lifeless> so the test results should still be valid
19:20:13 <marios> done
19:21:30 <lifeless> sounds like we should do a pass over all the reviews going 'plumbing - -1' for any we want to use passthrough
19:21:38 <lifeless> light touch, clear it out
19:21:43 <lifeless> or perhaps even -2
19:22:01 <lifeless> note that some may need more t-i-e patches to passthrough enable additional fies
19:22:04 <lifeless> files
19:22:43 <lifeless> thoughts?
19:23:05 <greghaynes> I like -2 - makes the list of reviews to have a look at a lot smaller
19:23:18 <jdob> i like the idea of -2 to make it seem more like a concerted effort to purge the queue
19:23:24 <marios> i agree with this approach too
19:23:38 <bnemec> +1 to -2 :-)
19:23:55 <jdob> bnemec: i couldn't help but add that up in my head
19:23:57 <marios> then they can revisit once software config is done. for anything that is critical we can get in on a per review case (critical/urgent)
19:24:12 <Ng> weirdly, I thought we'd already done -2s to all the little config option reviews
19:24:19 <Ng> but +1 to the idea
19:24:31 <greghaynes> Ive been holding off until I have something to actually point people at...
19:24:34 <jdob> Ng: i kinda thought the same thing, I thought they were all in a holding pattern
19:24:50 <lifeless> we -2'd all the heat things tht were interfering with software-config
19:24:51 <greghaynes> which seems like it should be merge momentarially
19:24:53 <lifeless> thats landed
19:25:00 <lifeless> ok
19:25:19 <lifeless> #note core reviewers to do a one-pass identify-plumbing-and--2-the-world
19:25:36 <lifeless> #topic
19:25:38 <lifeless> #topic Projects needing releases
19:26:23 <lifeless> any wolunteers?
19:26:37 <lifeless> or as we say in nz 'volunteears'?
19:27:07 <lifeless> tap tap tap ?
19:27:26 <ccrouch> i'll throw slagle under the bus again
19:27:39 <ccrouch> given he's not here to defend himself ;-)
19:27:45 <lsmola> lol
19:27:56 <marios> ccrouch: good guy charles crouch
19:27:58 <lifeless> ccrouch: hah!
19:28:13 <lifeless> #action slagle to debusify himself and do releases of the world.
19:28:24 <lifeless> #topic CD cloud status
19:28:36 <lifeless> dprince: / derekh: hows the RH region? I haven't poked at it recently
19:28:42 <lifeless> its not in CI yet I presume?
19:28:52 <dprince> lifeless: it should be soon...
19:28:52 <derekh> lifeless: no, the patch is waiting to be merged
19:28:53 <clarkb> it sin't.
on my list of things to do this afternoon
19:29:10 <lifeless> wwwwwicked
19:29:11 <dprince> https://review.openstack.org/#/c/83057/
19:29:18 <lifeless> healthy otherwise?
19:29:32 <lifeless> (I know, hard to say w/out load on it)
19:29:38 <dprince> lifeless: I think so. We'll see
19:29:46 <lifeless> The HP region failed yesterday, SpamapS caught that one.
19:29:54 <derekh> lifeless: I believe so, although haven't tried it in over a week,
19:29:55 <clarkb> going to grab lunch then push that through. hopefully will start going in about 2 hours
19:29:59 <lifeless> I believe it to be fully up and happy again now
19:30:16 <dprince> lifeless: The F20 jobs in particular are running slowish in general though (on the HP rack)
19:30:35 <dprince> be interesting to see what the timings are on the RH rack for comparison
19:30:36 <derekh> clarkb: I should be back on later is any problems pop up when you flick the switch
19:30:38 <lifeless> dprince: at a guess that will be mirror access performance
19:30:48 <clarkb> derekh: thanks
19:30:50 <lifeless> but we can look at the log to see
19:30:56 <dprince> lifeless: yep, we could try that
19:31:22 <derekh> lifeless: the devstack-gate setup stuff consistently taker 5 minutes linger on f20, Its on my list of figure out
19:31:23 <dprince> derekh: what are your thoughts on the F20 slowness, will simply mirroring the RPMs closer help?
19:31:38 <lifeless> derekh: oh, interesting.
19:31:53 <derekh> lifeless: dprince rpm mirror/cache might help also
19:31:56 <lifeless> so we have a list of things we want to do local mirrors for
19:32:13 <lifeless> I believe thats in the list, right ?
19:32:22 <derekh> lifeless: a lot of nodes seem to be going to the erorr state today, I took quick look between appointments today and 3 compute nodes are having problems taking trafic (I couldn't ping them)
19:32:24 <lifeless> also
19:32:26 <lifeless> #topic CI
19:32:44 <lifeless> derekh: ruh roh, we might have an uptime related bug then
19:32:47 <derekh> but still reporting to nova as running, sounds like the issue we had on the controller a couple of times.
19:33:09 <derekh> lifeless: I havn't dug into it much more then that
19:33:10 <lifeless> derekh: since SpamapS saw a couple fall over - or perhaps we don't have the mellanox driver on the non-compute nodes
19:33:45 <lifeless> lets get the RH region going, then perhaps take the time to finish the automated bringup work, then redeploy the RH region with trusty
19:33:57 <lifeless> which we know makes the hardware in that rack much happier
19:34:20 <derekh> lifeless: did you mean the HP region with trusty?
19:34:26 <lifeless> derekh: yes
19:34:33 <derekh> lifeless: sounds like a plan
19:34:41 <lifeless> derekh: with RH live, we won't have CI downtime in the same way
19:34:54 <lifeless> we'll backlog but we won't halt
19:34:59 <derekh> yup
19:35:07 <dprince> lifeless: why are we switching the RH region to trusty again :)
19:35:26 <SpamapS> doh just noticed the time sorry guys
19:35:26 <lifeless> dprince: we're not - I'm keen to have every regions cloud be a different OS
19:35:35 <lifeless> #topic Atlanta stuff
19:35:41 <derekh> dprince: it was a typo
19:35:49 <dprince> derekh: got it :)
19:35:53 <SpamapS> lifeless: mellanox was not part of the failure yesterday.
19:36:06 <SpamapS> lifeless: there was a failure to load the module on the controller on boot.
19:36:07 <lifeless> SpamapS: ack; the panic was different?
19:36:15 <lifeless> SpamapS: on the hypervisors?
19:36:19 <SpamapS> but it was actually powered off, inexplicably
19:36:35 <SpamapS> two hypervisors were down, 1 was frozen entirely.
The other had a kernel panic
19:36:42 <lifeless> SpamapS: derekh is saying he's seeing more hypervisors falling over
19:36:48 <dprince> lifeless: how many session spots do we have for Atlanta?
19:36:54 <lifeless> SpamapS: with symptoms that look like the mellanox fail
19:36:55 <lifeless> dprince: 6
19:37:04 <dprince> lifeless: well that is crap
19:37:15 <SpamapS> well
19:37:21 <SpamapS> my suggestion is that we update to trusty
19:37:21 <dprince> who's bad side did we get on?
19:37:34 <lifeless> dprince: http://lists.openstack.org/pipermail/openstack-dev/2014-April/033317.html
19:37:39 <lifeless> dprince: thats more than Ironic
19:37:44 <SpamapS> since it has the good version of the mellanox driver, and we need to get there anyway
19:37:49 <lifeless> SpamapS: indeed, mine too - see above ;)
19:37:56 <SpamapS> oh right :)
19:38:44 <lifeless> https://etherpad.openstack.org/p/tripleo-icehouse-summit <-
19:39:00 <lifeless> I'm looking to folk to help assess the sessions
19:39:12 <dprince> lifeless: I would really like to has out the network stuff, in particular since there was so much pushback /w the ensure-bridge refactoring.
19:39:16 <lifeless> PTL is meant to be an enabler and tie breaker - sadly only PTL can approve the sessions
19:39:35 <derekh> lifeless: should we take 6 votes each?
19:39:44 <bnemec> This will be my first summit, so I have to admit I don't really know what makes for a good session.
19:39:45 <lifeless> derekh: yeah, that might be a good way
19:39:55 <lifeless> bnemec: ok -
19:39:56 <lifeless> It seems to me we should focus on things where either:
19:39:56 <lifeless> - we need to build basic consensus
19:39:56 <lifeless> - crowdsourcing is at play
19:40:00 <lifeless> bnemec: ^ they are key IMO
19:40:09 <bnemec> lifeless: Okay, thanks
19:40:15 <lifeless> bnemec: on the side of the person putting the session forward, they need to do prep work
19:40:24 <lifeless> turning up and saying 'lets chat' == poor outcome usually
19:40:33 <bnemec> Sure, makes sense
19:40:36 <SpamapS> In my mind, summit sessions are places to build consensus on issues that are somewhat complex and could go in multiple directions.
19:40:55 <dprince> SpamapS: exactly ++
19:40:55 <lifeless> for consensus stuff, having a good well thought out overview and then drilling into figure out where we're disagreeing - good
19:41:23 <lifeless> for crowd sourcing aspects, its similar - have good explanations about it all and then <blank here people need to help out> bits
19:41:26 <SpamapS> They're not places to do the bulk of design work, as design by committee is not awesome.
19:41:58 <SpamapS> Good crowd sourced things are "what are some concrete use cases for this."
19:42:08 <lifeless> as a for instance, SpamapS and I are going to be proposing some fairly deep and extensive changes to heat's internals, and for that I expect we'll do a couple of hours of prep beforehand, at least.
19:42:32 <lifeless> so that the discussion can be effective
19:43:00 <lifeless> a related thing I'd like for atlanta is the new specs repo to be online
19:43:10 <lifeless> I don't think anyone has volunteered to get that setup yet ?
19:43:59 <derekh> lifeless: is it just a matter of creating a blank repo?
I can get that together
19:45:03 <lifeless> derekh: yeah - copy the nova one which has doc building and a template spec
19:45:15 <derekh> lifeless: ok, will do
19:45:16 <lifeless> derekh: get it into openstack/ in gerrit
19:45:23 <derekh> yup
19:45:29 <lifeless> #action derekh to setup tripleo-specs repo
19:45:30 <lifeless> thanks!
19:45:34 <derekh> np
19:47:06 <lifeless> with only 6 sessions
19:47:12 <lifeless> I expect a fair number of double duty ones
19:47:19 <lifeless> like CI might touch on several aspects.
19:47:28 <lifeless> the very last session overlaps the wrap-up
19:47:45 <lifeless> so we might use that for either super contentious stuff, or (relatively) niche... I dunno.
19:47:50 <lifeless> any other atlanta stuff ?
19:48:04 <ttx> lifeless: note that you'll have a tripleO "project pod"
19:48:08 <ttx> for extra discussions
19:48:19 <lifeless> ttx: yeap
19:48:22 <SpamapS> lifeless: we don't have any conference sessions.
19:48:33 <lifeless> SpamapS: whew, we can all get work done :)
19:48:38 <SpamapS> lifeless: talks rather. We might want to collaborate on a single lightning talk submission.
19:48:41 <ttx> lifeless: I placed it close to the ironic pod for cross-pollination
19:49:07 <lifeless> ttx: hah... so near heat or nova might be better - Ironic and TripleO are very well connected
19:49:27 <lifeless> but we only have one pseudopod into heat, and only 3 or so into nova
19:49:39 * ttx checks the map
19:50:39 <ttx> same floor but separate rooms
19:50:44 <lifeless> all good
19:50:49 <ttx> anyway, Nova has no pod
19:50:55 <ttx> since they have sessions running all the time
19:51:00 <lifeless> #topic open discussion
19:51:27 <jprovazn> one q regarding pass-through patches
19:51:51 <jprovazn> it can be used for adding extra set of config options, right?
19:52:02 <jprovazn> something like this: https://review.openstack.org/#/c/88105/1/elements/haproxy/os-config-applier/etc/haproxy/haproxy.cfg
19:52:09 <jprovazn> couldn't be done with it
19:52:18 <lifeless> jprovazn: its got three parts
19:52:40 <lifeless> jprovazn: we passthrough enable a config file by dropping an entirely data driven section into the moustache template
19:53:03 <lifeless> jprovazn: then we generate data into that section via heat, picked up from a user parameter
19:53:16 <lifeless> jprovazn: finally the user passes in a json struct matching this
19:53:55 <tchaypo> so for haproxy.cfg, we'd need to passthrough-enable haproxy.cfg, and the user would have to provide a json file?
19:54:27 <lifeless> jprovazn: Until we figure out how to tell heat to do the merging of keys/values etc (and whatever semantics we want for that), we can't really union the user input and heat calculated inputs
19:54:40 <lifeless> jprovazn: that haproxy section looks more like heat calculated stuff to me
19:54:42 <greghaynes> AIUI you need to make sure the app will deal with duplicated config options sanely, which is something im not sure about with haproxy
19:55:21 <lifeless> in particular for this example, haproxy doesn't have a dict-model for its config file
19:55:55 <lifeless> so you might do something similar, but different, to make the haproxy stuff much more configurable
19:56:45 <lifeless> so this is kindof in the bucket of 'v2 passthrough' where we take the time to figure out all the possible use cases and design a long term answer. IMO.
19:56:47 <jprovazn> lifeless, well, I noticed this patch today, I was thinking that setting a default value in heat template + accepting optional value from metadata would be most reasonable
19:56:59 <lifeless> jprovazn: I think thats entirely reasonable.
19:57:15 <jprovazn> but was not sure how far it's in conflict with pass through, thanks for clarification
19:57:19 <lifeless> jprovazn: and/or - feel free to push back a little on the patch and ask why it needs to be configurable.
19:57:47 <jprovazn> lifeless, yes, that's my plan ;)
19:57:49 <lifeless> the HP folk putting forward these patches have /lots/ of production experience - they may well be able to say 'X works better', and we can just change to X.
19:58:04 <lifeless> jprovazn: \o/
19:58:09 <jprovazn> good to know
19:59:36 <SpamapS> Yeah I'd like to see us default to things that production-hardened people want.
20:00:09 <lifeless> thanks for coming everyone!
20:00:11 <lifeless> #endmeeting
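
Editor's note on the three-part passthrough described in the open discussion (a data-driven section in the config template, data generated into it by heat from a user parameter, and a JSON struct supplied by the user): the following is a minimal Python sketch of that idea only. It is not the os-apply-config or tripleo-image-elements implementation. The function name `render_passthrough`, the JSON shape (`section`, `values`, `option`, `value`), and the INI-style output are all assumptions made for illustration.

```python
import json


def render_passthrough(sections):
    """Render user-supplied sections as INI-style text.

    `sections` is a hypothetical structure: a list of
    {"section": name, "values": [{"option": ..., "value": ...}]} dicts.
    Nothing here merges with template-calculated values; options are
    emitted exactly as the user supplied them.
    """
    lines = []
    for section in sections:
        lines.append(f"[{section['section']}]")
        for item in section.get("values", []):
            lines.append(f"{item['option']} = {item['value']}")
        lines.append("")  # blank line after each section
    return "\n".join(lines)


# Hypothetical JSON struct, standing in for what a user would pass
# through a heat parameter.
user_json = (
    '[{"section": "DEFAULT",'
    ' "values": [{"option": "debug", "value": "true"}]}]'
)

rendered = render_passthrough(json.loads(user_json))
print(rendered)
```

Because the sketch appends user options without merging them into anything the template already sets, an option given in both places would appear twice in the output. That is the duplicated-option concern raised in the discussion, and it is also why a file without a section/option model, such as haproxy.cfg, does not fit this shape directly.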