20:04:12 <lifeless> #startmeeting tripleo
20:04:13 <openstack> Meeting started Mon Jul 15 20:04:12 2013 UTC. The chair is lifeless. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:04:14 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:04:17 <openstack> The meeting name has been set to 'tripleo'
20:04:27 <lifeless> sorry I'm late; had a non-sleeping baby night :(
20:04:45 <SpamapS> lifeless: had one of those too. Pondering a post meeting nap :)
20:05:26 <lifeless> mmm nap
20:05:35 <jog0> o/
20:05:41 <lifeless> #topic agenda
20:05:42 <lifeless> bugs
20:05:43 <lifeless> Grizzly test rack status
20:05:43 <lifeless> CI virtualized testing progress
20:05:43 <lifeless> open discussion
20:05:49 <lifeless> #topic bugs
20:06:05 <lifeless> https://bugs.launchpad.net/tripleo/
20:06:19 <lifeless> #link https://bugs.launchpad.net/tripleo/
20:06:23 <lifeless> #link https://bugs.launchpad.net/diskimage-builder/
20:06:55 <lifeless> #link https://bugs.launchpad.net/os-refresh-config
20:07:10 <lifeless> #link https://bugs.launchpad.net/os-config-applier
20:07:11 <SpamapS> as usual, my two bugs are both a bit lacking in context
20:07:35 <lifeless> SpamapS: is there a config-collector tracker now?
20:08:04 <SpamapS> lifeless: no, I set it aside a bit to get my tripleo setup in order for testing it..
20:08:08 <SpamapS> that was a week ago
20:08:10 <SpamapS> have not booted a VM yet. :-/
20:08:27 <lifeless> ok. Would you like me to do the LP administrivia ?
20:09:00 <SpamapS> lifeless: yeah, I don't think I'm admin of the teams anyway
20:10:11 <lifeless> SpamapS: will do. (YOu don't need to be admin of them to make and hand-off a project)
20:12:09 <lifeless> ok so
20:12:22 <lifeless> hmm, bug https://bugs.launchpad.net/tripleo/+bug/1182241
20:12:25 <uvirtbot> Launchpad bug 1182241 in tripleo "first-boot.d rules are running on every boot" [Critical,Triaged]
20:12:56 <lifeless> I think thats fixed, should have been closed.
20:13:17 <lifeless> I moved the rules to orc scripts
20:13:29 <lifeless> and made them idempotent
20:13:46 <lifeless> we can't delete the first-boot feature yet
20:13:55 <lifeless> perhaps we should deprecate it though?
20:14:29 <SpamapS> lifeless: indeed I think it may need to stay around with big ugly DEPRECATED warnings for a while .. since we seem to have some adoption now.
20:14:30 <lifeless> salv-orlando can't reproduce the quantum load issue
20:14:43 <stevemar2> ayoung: why would having two delegated auth mechanism bad?
20:14:50 <stevemar2> be bad*
20:14:50 <lifeless> in bug 1184484
20:14:52 <uvirtbot> Launchpad bug 1184484 in tripleo "Quantum default settings will cause deadlocks due to overflow of sqlalchemy_pool" [Critical,Triaged] https://launchpad.net/bugs/1184484
20:14:58 <lifeless> stevemar2: this is a different meeting
20:15:08 <stevemar2> lifeless, wrong window, sry
20:15:10 <ayoung> stevemar2, having a broken delegation mechanism would be bad
20:15:13 <lifeless> stevemar2: please use the dev channel for out-of-meeting chat
20:15:16 <lifeless> stevemar2: np, thanks!
20:15:48 <ayoung> stevemar2, and having two mechanisms is fine, but duplication in general leads to fixes in one needing to be made in the other as well
20:15:51 <lifeless> We may need to do a manual update of the control plane in the POC to help him reproduce, or perhaps we can trigger it in virt.
20:16:01 <lifeless> ayoung: hey, you too please! -> ~-meeting.
20:16:10 <ayoung> lifeless, sorry
20:16:31 <lifeless> bug 1199412
20:16:33 <uvirtbot> Launchpad bug 1199412 in tripleo "seed vm build fails during cinder service install" [Critical,Triaged] https://launchpad.net/bugs/1199412
20:17:07 <lifeless> Thats fixed too isn't it ?
20:17:46 <SpamapS> lifeless: yes
20:17:59 <lifeless> and so is bug 1199568
20:18:01 <uvirtbot> Launchpad bug 1199568 in tripleo "keystone service not running during wipe-openstack" [Critical,Triaged] https://launchpad.net/bugs/1199568
20:19:08 <lifeless> jog0: you've got docs somewhere on the fake virt driver for nova right?
20:19:26 <jog0> lifeless: yeah
20:19:29 <lifeless> jog0: so that we can try booting 100 vms at once without having a 100-vm capable control plane
20:19:47 <lifeless> jog0: perhaps you'd like to try reproducing 1184484 ?
20:20:10 <jog0> https://github.com/openstack-dev/devstack/commit/baf37ea81720982050eceea2b1b1e9bbdf6f0c94
20:20:29 <lifeless> jog0: just take an overcloud and change the virt driver on the compute node then throw a big boot request at it
20:21:03 <lifeless> ok, thats all the crits I can see
20:21:06 <jog0> lifeless: sounds good to me
20:21:16 <lifeless> anyone have high bugs they want to discuss?
20:22:09 <lifeless> nada, ok.
20:22:17 <jog0> lifeless: getting a gate up?
20:22:25 <jog0> not sure if that counts as a bug
20:22:27 <SpamapS> https://bugs.launchpad.net/tripleo/+bug/1201056
20:22:29 <uvirtbot> Launchpad bug 1201056 in tripleo "init-nova requires internet access" [High,Triaged]
20:22:29 <SpamapS> fix released right?
20:22:36 <lifeless> SpamapS: yes plox
20:22:45 <lifeless> jog0: yeah, other business
20:22:53 <lifeless> #topic grizzly POC rack status
20:23:02 <lifeless> SpamapS: you were going to file some bugs about this?
20:23:08 <SpamapS> I'm having problems right now with setup-baremetal ...
20:23:33 <SpamapS> lifeless: drafting them now.
20:23:44 <lifeless> SpamapS: cool
20:24:02 <lifeless> #action SpamapS to finish drafting the bugs about long term rack running
20:24:21 <lifeless> we had the control plane for the POC go offline mid last week
20:24:45 <lifeless> the good news is that 'nova boot' on the undercloud brought it right back.
20:25:28 <lifeless> The bad news is that I suspect the [I think fixed in nova trunk - devananda will know] 'oh look IPMI didn't respond quickly, clearly the machine wants to be off' bug turned it off in the first place.
20:25:50 <lifeless> we haven't confirmed that via logs.
20:26:15 <lifeless> I'm not sure if we need to bother, since the only reason it was a fire drill was this being a non-HA setup.
20:26:27 <lifeless> thoughts?
20:27:01 <SpamapS> I think it is worth confirming that is what happened.
20:27:13 <SpamapS> Its a serious enough thing that we don't want to gloss over because it is "likely"
20:27:29 <SpamapS> Also if we had better monitoring on our POC we'd have known sooner.
20:27:43 <lifeless> SpamapS: entirely agreed.
20:27:51 <lifeless> SpamapS: perhaps you could include a bug about both of those points.
20:28:00 <lifeless> SpamapS: in your drafts
20:28:44 <SpamapS> https://bugs.launchpad.net/tripleo/+bugs?field.tag=poc
20:30:18 <SpamapS> lifeless: there, I think now I have all of them
20:30:57 <lifeless> #link https://bugs.launchpad.net/tripleo/+bugs?field.tag=poc
20:31:16 <lifeless> ok
20:31:46 <lifeless> so I think we should treat these as indeed critical and move on them after the current crits are closed
20:31:50 <lifeless> which is spamaps two fuzzy ones
20:32:04 <lifeless> #topic CI virtualized testing progress
20:32:10 <pleia2> hello
20:32:15 <lifeless> pleia2: how goes it? We synced a little on the weekend
20:32:46 <pleia2> yeah, so that was helpful in understanding some of the networking stuff that I was tripping up on, now just nailing down the specific things I want to run for this testing
20:33:15 <lifeless> pleia2: is your kvm seed setup working - can you nova boot a bm node?
20:33:32 <pleia2> lifeless: unfortunately not :(
20:33:42 <pleia2> in the middle of backing out some changes
20:34:07 <lifeless> pleia2: perhaps we should pair up again after this meeting?
20:34:31 <pleia2> lifeless: yeah, that would be good (but lunch for me first)
20:34:41 <lifeless> pleia2: kk; ping me maybe.
20:34:45 <lifeless> #topic open discussion
20:34:45 <pleia2> will do
20:35:07 <lifeless> jog0: You wanted to talk CI gates
20:35:14 <SpamapS> lifeless: I feel like we need to start pushing harder down the path toward gating.
20:35:19 <dkehn> https://review.openstack.org/#/c/30441/ fingers crossed again
20:35:59 <SpamapS> There are enough people involved, and enough moving parts, that breaking stuff is worse than not moving forward at the highest possible speed.
20:36:36 <lifeless> so we break for two reasons. Other projects. Our stuffups.
20:36:54 <pleia2> are there any small gating tests that can help that don't require my bit yet?
20:36:56 <lifeless> My sense is that in the last 2 weeks its been about 50-50 split
20:37:36 <jog0> SpamapS: ++
20:37:36 <lifeless> to gate we need all our components mutually gated/low risk of random breaks/or used via releases
20:38:27 <lifeless> uhm
20:38:49 <jog0> what kinds of issues have been breaking trunk? Perhaps there is a smaller gate we can start with as pleia2 suggested
20:38:53 <lifeless> e.g. diskimage-builder - to gate on that we need to get the ubuntu and fedora images it needs cached into the CI infrastructure so we're not dependent on internet access.
20:39:09 <jog0> maybe just getting through DIB or something
20:39:22 <lifeless> for tie we need the git caching derekh is working on, and pip caching which I put a etherpad up designing
20:39:37 <lifeless> jog0: so we do need that; but note that dib hasn't broken.
20:40:09 <jog0> lifeless: I was refering to the image elements aspect, so what generally breaks.
20:40:14 <lifeless> we had a tie rule break (cinder builds), we had neutron break (the rename of the client) and then (the quantum-server compoatbility script broke)
20:40:46 <lifeless> bah, spelling
20:40:58 <lifeless> anyhow, just to say - I'm totall +100 on CI
20:41:41 <jog0> how much of that can be detected with just getting a seed-stack vm running?
20:41:58 <lifeless> the cinder rule break would have been detected
20:42:02 <jog0> (and not any fake baremetal booting)
20:42:16 <lifeless> both neutron failures were silent until we tried to do stuff in anger
20:42:44 <lifeless> jog0: pleia2: so - we can indeed get some benefit from smaller gate checks.
20:43:00 <jog0> is it possible to do seed-stack + tempest?
20:43:18 <lifeless> jog0: in principle yes.
20:43:27 <jog0> err rather would that help us
20:43:29 <lifeless> jog0: in the current gate I very much doubt it.
20:43:32 <rwsu> assuming we can get toci back in working order, is it possible to ask everyone to run it before checking in new changes? if there is a smaller set of tests folks can run, i would be in favor of that too
20:43:51 <SpamapS> Another thing that might help is getting error reporting into the heat templates (via waitconditions + orc)
20:44:04 <lifeless> rwsu: what would be awesome would be if toci just subscribed to branches proposed to tie/the/dib/oac/orc/occ
20:44:16 <jog0> lifeless: ++
20:45:12 <lifeless> so at the moment, the toci folk are carrying the CI burden in chase-mode for most of tripleo, and pleia2 is the only person working directly on gating infrastructure
20:45:25 <lifeless> pleia2 is working on /nova/ gating infrastructure atm
20:45:28 <rwsu> lifeless: good idea, it would be nice to have it report yea or nay in the review process
20:45:30 <pleia2> and I'm getting impacted by breakage to, so it's a bit slow going :(
20:45:30 <lifeless> because nova bm isn't gated
20:46:10 <lifeless> a consequence of that (that it will gate nova) is that its going to have to be super reliable
20:46:31 <lifeless> which probably means stevebaker's packaging patch set, derekh's git cache stuff, and a pip cache
20:46:44 <lifeless> plus other ancilliary changes will all be needed just to get that
20:47:07 <lifeless> I'm going to suggest that pushing straight at that target is better than picking small side gates
20:47:55 <lifeless> because I don't think small side gates will catch any breakage that actually impacts pleia2's work - she has been hitting all the quantum rename stuff, and also issues with running in lxc not kvm etc.
20:48:23 <lifeless> opinion: we should make bugs that will prevent her gate being activated critical
20:48:37 <lifeless> because getting CI for us is now critical
20:48:54 <jog0> lifeless: ++
20:49:37 <lifeless> we may have program status soon
20:49:50 <lifeless> if we do, we can ask for -infra help gating everything
20:49:53 <lifeless> which we can't at the moment
20:50:37 <pleia2> I'm going to be out of the office next week for OSCON (checking in, but solid testing+work will be hard) but I hope to have enough progress this week that I have some kind of dependency list of what we'll need in the gate (caches, etc)
20:50:55 <rwsu> is someone already working on the pip cache?
20:51:24 <SpamapS> infra has a nice pip cache :)
20:51:51 <lifeless> rwsu: we have a design nutted out - http://etherpad.openstack.org/TripleO-pip-mirror
20:52:17 <rwsu> nice
20:52:47 <lifeless> hmm last minutes
20:52:51 <lifeless> I followed up SpamapS sprint idea
20:53:11 <lifeless> one thing is that a bunch of folk have said 'after the beta milestone please'
20:53:20 <lifeless> implicitly, just by 'this date is better'
20:53:38 <lifeless> how important is it that we be sprinting before the milestone (to get things in order for it)
20:53:48 <lifeless> vs that we be sprinting together (to get tings together)
20:54:31 <SpamapS> Hrm
20:54:52 <SpamapS> Well my thinking was to get together to push things into h3.
20:54:58 <lifeless> indeed
20:55:04 <lifeless> which is sept 5th
20:55:18 <SpamapS> But if people would rather get together to hash out ideas for working on before the icehouse summit.. I would find value in that as well.
20:55:45 <jog0> lifeless: why not something over sept 5th so we can both push to finish features and then switch to finding/triaging and fixing bugs?
20:56:09 <jog0> (so we can make things work for the burners)
20:56:14 <lifeless> jog0: I have a conference 6/7/8 sept
20:56:14 <SpamapS> I get the feeling that people not explicitly "working on tripleo" are constrained by their primary focus.
20:56:48 <lifeless> jog0: though like mordred I don't strictly need to be at the sprint, I'd really /like/ to be there.
20:57:19 <lifeless> jog0: aug 19th is early enough for most burners I think, at least for 1/2 the week.
20:57:28 <SpamapS> lifeless: I think we'd be less productive without you
20:57:29 <lifeless> 26th is the burn start
20:57:56 <jog0> aug 19th doesn't work for me, but if it works for everyone else I will just have to skip it
20:58:01 <SpamapS> And yeah this is a short sprint.. 19/20 gives them 6 days of pre-burn prep time :)
20:58:29 <jog0> what about sept 2/3?
20:58:31 <devananda> lifeless: 19/20 is probably no good for either monty or me, FWIW
20:58:39 <lifeless> devananda: ack
20:58:57 <lifeless> devananda: mordred had indicated he could do mon maybe tuesday
20:59:08 <lifeless> jog0: terrible for burners
20:59:09 <mordred> maybe tuesday, but it would be pushing it
20:59:17 <devananda> mon yes, maybe. tues is kinda driving day
20:59:18 <lifeless> jog0: they need a couple weeks decompression after the thing
20:59:31 <jog0> right
20:59:31 <jog0> maybe we should do this in http://www.doodle.com/
20:59:37 <SpamapS> and I'm not available 8/27 - 9/9 .. (not burning.. ;)
20:59:38 <devananda> doodle ++
20:59:39 <lifeless> out of time
20:59:44 <lifeless> I will take it to the list.
20:59:49 <lifeless> #endmeeting