21:02:33 <ttx> #startmeeting project
21:02:34 <openstack> Meeting started Tue Jan 21 21:02:33 2014 UTC and is due to finish in 60 minutes. The chair is ttx. Information about MeetBot at http://wiki.debian.org/MeetBot.
21:02:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
21:02:35 <lifeless> ttx: need me ? :)
21:02:37 <mordred> o/
21:02:38 <openstack> The meeting name has been set to 'project'
21:02:40 * flaper87 is here in case an update on oslo is needed
21:02:40 <sdague> o/
21:02:45 <mordred> ttx: bah. I don't believe in meetings
21:02:50 <ttx> lifeless: you can have that hour of your life back
21:02:51 <hub_cap> so does the o/ actually do something bot related?
21:02:57 <ttx> #link http://wiki.openstack.org/Meetings/ProjectMeeting
21:03:03 <ttx> hub_cap: no
21:03:09 <ttx> The main topic of discussion for today is obviously the situation with the frozen gate and the consequences for icehouse-2
21:03:17 <ttx> #topic Frozen gate situation
21:03:29 <ttx> so, on that topic...
21:03:31 <ttx> First of all, I would like to discuss the current situation: flow rate, main causes
21:03:42 <ttx> s othat everyone ois on the same page
21:03:57 <ttx> sdague, jeblair, mordred: any volunteer to summarize the current state ?
21:04:02 <sdague> sure
21:04:14 <sdague> so we've got kind of a perfect storm of issues
21:04:31 <sdague> we're hitting non linear scaling issues on zuul, because we're throwing so much at it
21:04:36 <sdague> and the gate queue is so long
21:04:47 <sdague> we're in starvation for nodes in general
21:05:01 <sdague> our working set is much bigger than all the available nodes to us
21:05:21 <sdague> and we've got a bunch of new gate reseting issues that cropped up in the last couple of weeks
21:05:31 <ttx> is that node starvation due to lack of cloud resources ? Or something else ?
21:05:44 <sdague> which we were blind to because we were actually loosing console logs to elastic search
21:05:53 <sdague> ttx: lack of cloud resources
21:06:05 <sdague> basically we're swapping
21:06:08 <ttx> sdague: is that something we need to ask our kind sponsors to help with ?
21:06:15 <sdague> ttx: it has been requested
21:06:19 <ttx> sdague: ok
21:06:23 <russellb> i think i saw mordred say he was asking
21:06:34 <sdague> yep, mordred is working that
21:06:40 <mordred> yes. I have emailed rackspace about more nodes/quota
21:06:44 <ttx> sdague, mordred: let me know if smoe kind of officioal foundation statement can help
21:06:46 <jeblair> sdague: i don't want to contradict anything you have said, all of which are correct, but i don't think starvation is having a huge impact on throughput
21:06:48 <mordred> pvo hasn't given a final answer back yet
21:07:05 <sdague> jeblair: that could be
21:07:12 <pvo> mordred: if you could take iad or dfw, that would be a huge help.
21:07:16 <pvo> is that possible?
21:07:28 <mordred> hp's new cloud region is where new capacity comes from, and those nodes do not work well for us in the gate yet- so we're dependent on rax for additional node quota if we're getting any
21:07:29 <sdague> though yesterday we were thrashing due to node starvation
21:07:33 <mordred> pvo: I believe so?
21:07:35 <sdague> in some interesting patterns
21:07:43 <mordred> jeblair: thoughts on pvo's question?
21:07:45 <jeblair> also, it is worth mentioning that the reduction in tempest concurrency is likely affecting throughput
21:07:46 <sdague> but we can debate the subtleties later
21:08:04 <sdague> jeblair: well, it saw a noticable increase in reliability
21:08:06 <pvo> mordred: I can get the quotas raised there if that'll work for you. I"ve been going back and forth with our ops guys.
21:08:11 <ttx> so what's the flow rate right now ? 20 changes per day ? more ? less ?
21:08:14 <sdague> so I think, on net, it's helping
21:08:15 <jog0> https://github.com/openstack/openstack/graphs/commit-activity
21:08:18 <jog0> flow
21:08:24 <jeblair> pvo: absolutely
21:08:27 <ttx> jog0: awesome htx
21:08:49 <pvo> jeblair: mordred: cool. I'll try to get those changes done today.
21:09:04 <mordred> pvo: thanks! I owe you much beer
21:09:19 <sdague> I've spent enough time looking at the perf data now, that I'm very much with russellb, 4x concurency puts the environment at such a high load for so long, we get unpredicable behavior
21:09:32 <russellb> +1
21:09:36 <mikal> sdague: is there a guide to running tempest somewhere? I can't see one on the wiki and am ashamed to admit I've never run it in person.
21:09:38 <russellb> it's just a non-starter right now
21:09:42 <ttx> OK, do we have reasons to hope things will get back to a flow rate in the near future that would let us absorb that backlog ?
21:09:47 <pvo> mordred: np. whats the tenant id again?
21:09:54 <russellb> pvo: thanks a ton
21:09:58 <pvo> can private message that if you want
21:10:04 <mikal> Sorry, wrong channel
21:10:08 <sdague> heh
21:10:45 <sdague> ttx: without more people helping on the race fix side... no
21:11:01 <sdague> we're going to mitigate some things, to make us thrash less
21:11:13 <sdague> but we won't get back to normal flow until we aren't doing so many resets
21:11:35 <russellb> i've had a fix for a huge offender up for 5 hours, but it's so behind we can't get it in
21:11:38 <ttx> sdague: do we have a bug number for the key issue(s)
21:11:40 <russellb> it hasn't shown up in check yet even
21:11:51 <sdague> ttx: I put 2 on the list yesterday
21:11:59 <ttx> sdague: ok, those two are still current
21:12:00 <russellb> https://review.openstack.org/#/c/68147/3 ... fixes a bug with 340 hits in the last 12 hours
21:12:02 <sdague> part of it is also discovering the resets
21:12:23 <sdague> ttx: yes, I had a morning update on list about it
21:12:43 <ttx> OK...
As far as the icehouse-2 milestone goes, we need to decide if it makes sense to cut branches at the end of the day today, or if we should just delay by a week
21:12:52 <ttx> On one hand, a milestone is just a point in time, so we can cut anytime
21:13:01 <ttx> On the other hand, we want it to be a *reference* point in time, and here we have a lot of stuff in-flight... so cutting now is a bit useless
21:13:03 <sdague> honestly, we're only at about an 80% characterization rate right now as well, so there could be killer bugs we haven't identified yet
21:13:15 <ttx> Decision all depends on our confidence that adding a week will make a difference.
21:13:24 <ttx> that the currently in-flight stuff will make it.
21:13:31 <ttx> which in turn depends on our ability to restore a decent flow rate to absorb the backlog.
21:13:38 <ttx> I'm open to suggestions here
21:13:48 <annegentle> ttx: I do see people try to install milestone releases, so do wait if you think it would help
21:14:10 <ttx> http://status.openstack.org/release/ should show how many bleuprints are stuck in queue for i2
21:14:21 <sdague> my instinct is cut what we have, the trees were always supposed to be shippable
21:14:26 <jog0> ttx: I am not sure if waiting a week is enough TBH. If we don't fix the gate a week won't help
21:14:27 <russellb> sdague: same here
21:14:37 <sdague> my concern is that if we delay i2, then all these people are going to rush the queue harder
21:14:37 <mordred> yeah. we're not timed-based-releses for nothing round here
21:14:48 <russellb> i say just cut
21:14:52 <jog0> ++
21:15:02 <russellb> sdague: that's a very good point
21:15:08 <russellb> we don't need to encourage a gate rush right now
21:15:15 <russellb> we need people to chill out for a while
21:15:17 <annegentle> yeah avoid gate rush. I'm just going back and forth here.
I'll stop
21:15:20 <ttx> sdague: I'm fine with cutting now if you PTLs are fine with that
21:15:24 <stevebaker> I vote for doing a cut now
21:15:37 <ttx> it will looks a bit empty but who cares
21:15:55 <russellb> ttx: sometimes we get to fail in public, and i think that's OK
21:15:58 <sdague> yeh, I already had one group ping me for a requirements review so they could land something in i2 this morning
21:16:00 <devananda> will this encourage folks whose changes are in-flight ut dont make it to delay them further, making a bigger rush for I3 ?
21:16:01 <hub_cap> if it will alleviate the mental push to help clean up the Q a bit, lets do it
21:16:10 <kgriffs> I'm fine with a cut now, except there is one patch that I really need to land first that has been stuck for hours.
21:16:14 <sdague> devananda: we're going to have a bigger rush in i3 regardless
21:16:18 <russellb> if we weren't all so damn busy, we could write a book on all the amazing lessons we're learning along the way :)
21:16:27 <sdague> russellb: ++
21:16:27 <ttx> so.. let's say I'll cut early next morning, which gives a bit of time for things to land
21:16:34 <kgriffs> +1
21:16:35 <ttx> that would be like 7am MST
21:16:35 <david-lyle> horizon may have a licensing issue if we cut now, may need to get a fix in
21:16:58 <ttx> would that work for everyone ?
21:17:04 <sdague> ttx: good for me
21:17:21 <dolphm> i'm fine with cutting now
21:17:28 <hub_cap> ++
21:17:31 <ttx> and then tag the next morning same hour
21:17:33 <david-lyle> https://bugs.launchpad.net/horizon/+bug/1270764
21:17:38 <ttx> or even tag on the same day
21:17:42 <mordred> ttx: we may want to get ptls to indicate super-important must-land patches in case promoting them is necessary
21:17:45 <ttx> it's not as if backpotrs would make it through anyway
21:18:08 <sdague> yeh, if there is a licensing patch, that would qualify
21:18:11 <ttx> so...
branch cut at 7am MST, tag toward the end of the afternoon
21:18:19 <dolphm> mordred: i hope you're not referring to blueprints .. ?
21:18:19 <russellb> works for me
21:18:22 <stevebaker> ttx: This Critical fix needs to get in one way or another https://review.openstack.org/#/c/68135/
21:18:40 <russellb> i dont' care about landing more features, i just want to help restore the gate :)
21:18:45 <russellb> at this point
21:18:52 <sdague> russellb: +1
21:18:52 <ttx> sdague: could you add it to your magic list of priority fixes ?
21:18:54 <mordred> dolphm: nope. I just think there are a couple of things, like the license thing, that we should help land before the morning if we can
21:18:55 <mordred> russellb: ++
21:18:58 <markmcclain> russellb: ++
21:18:59 <ttx> https://review.openstack.org/#/c/68135/
21:19:02 <dolphm> russellb: ++ please
21:19:18 <russellb> trying
21:19:26 <sdague> ttx: so it has to have *some* recent test results
21:19:31 <ttx> if we fail to land it today, we still have the backport option tomorrow
21:19:34 <sdague> before we promote
21:19:46 <ttx> sdague: ack
21:20:08 <mordred> ttx: I need to run - are we good on the portion of the meeting for which I'm useful?
21:20:15 <ttx> #agreed I2 branch cut around 7am MST tomorrow. Tags expected toward the end of day
21:20:30 <ttx> mordred: I think we cna handle it frmo here
21:20:33 <ttx> damn it
21:20:38 <russellb> sdague: nova fix is in the check queue now at least (no dsvm nodes yet)
21:20:40 <hub_cap> heh
21:20:41 <ttx> mordred: I think we can handle it from here
21:20:55 <sdague> russellb: yeh, there is something else going on, like last week
21:21:04 <markwash> 7 AM MST3K?
21:21:14 <sdague> too many meetings this afternoon to dive on it :)
21:21:27 <markwash> I'm down
21:21:28 <ttx> sdague: let's make this one quick
21:21:34 <ttx> #topic stable/havana situation (david-lyle)
21:21:50 <ttx> So stable/grizzly was broken a few weeks ago, which is turn paralyzed stable/havana (due to grenade trying to run stable/grizzly stuff)
21:21:55 <ttx> My understanding is that it's all put on the backburner until we get the master gate in order
21:22:03 <ttx> and in hte mea ntim we should not approve stable/* stuf
21:22:14 <ttx> mean time... stuff
21:22:15 <russellb> seems stable +A was removed anyway :)
21:22:18 <sdague> yeh
21:22:19 <ttx> hah
21:22:21 <ttx> good
21:22:26 <russellb> solves that.
21:22:27 <sdague> we made an executive call on that on sunday
21:22:29 <ttx> david-lyle: comment on that ?
21:22:41 <ttx> (you raised that topic)
21:22:47 <david-lyle> ok, just checking. I think queued up jobs are finally failing
21:22:48 <sdague> the stable/grizzly fix is actually in the gate
21:23:00 <sdague> david-lyle: are there still stable jobs in the gate?
21:23:08 <sdague> I thought we managed to kick those all out
21:23:12 <david-lyle> wanted to verify
21:23:37 <sdague> I believe - https://review.openstack.org/#/c/67739/ will make stable/havana work in the gate again
21:23:39 <david-lyle> sdague: I believe I saw a couple Horizon failures yesterday
21:23:44 <sdague> stable/grizzly is actually working again
21:23:49 <sdague> david-lyle: gate or check
21:23:50 <ttx> ok
21:23:59 <ttx> so this is actually being solved too
21:24:11 <ttx> david-lyle: anything more on that topic ?
21:24:14 <david-lyle> sdague: I think gate
21:24:24 <david-lyle> ttx: no, I'm good
21:24:27 <ttx> #topic common openstack client (markwash)
21:24:31 <sdague> david-lyle: so if there are any other stable/havana things in gate, we should remove them
21:24:36 <ttx> markwash: floor is yours
21:24:52 <markwash> okay, I was really just hoping to do a quick poll here
21:24:58 <david-lyle> sdague, ack
21:25:16 <markwash> the discussion in openstack dev is about what kind of approach to take: many client projects, one client project
21:25:25 <ttx> one
21:25:40 <ttx> oh
21:25:45 <markwash> I'm interested especially because I want to make some changes to glanceclient, and would love to know where to put it
21:25:59 <ttx> I'm more worried about MULTIPLE unified client projects
21:26:01 <markwash> there's also a difference between one CLI client and one library SDK
21:26:31 <ttx> I thought openstackclient was the one but hte discussion seemed to point to something else (haven't read it in detail yet)
21:26:33 <sdague> markwash: my read of the thread is you should keep doing what you are doing now, I can't imagine this becomes reality pre icehouse
21:27:02 <markwash> sdague: makes sense
21:27:34 <markwash> If no one else has anything they want to share re one vs many I'm good. . I just was hoping to hear from any PTLs who felt strongly
21:28:16 <jd__> whatever works :)
21:28:18 <dolphm> this is in regard to one SDK, correct?
21:28:22 <dolphm> not another CLI?
21:28:40 <stevebaker> I would be happy to get behind a unified client, but I've still not got around to integrating heat with python-openstackclient
21:28:59 <markwash> dolphm: I think that's the key distinction, yeah. . not a lot of opposition in my mind to unifying the CLI
21:29:09 <ttx> yeah, we really need someone to head that client projects division
21:29:23 <ttx> and push things forward
21:29:36 <ttx> in a coherent way
21:29:52 <sdague> honestly, I'm very pro unified CLI and unified SDK.
But I don't think this is an icehouse thing regardless
21:29:53 <markwash> my take is, programs should own their little part of the SDK within reason, but we can probably solve that regardless of how we structure the client repo(s)
21:29:59 <sdague> I'm happy that people are diving on it
21:30:09 <ttx> #topic Red Flag District / Blocked blueprints
21:30:18 <ttx> Any inter-project blocked work that this meeting could help unblock ?
21:30:18 <dolphm> markwash: ++
21:30:37 <ttx> Though I suspect the gate is the common issue
21:31:41 <ttx> OK, I guess none then
21:31:43 <ttx> #topic Incubated projects
21:31:53 <SergeyLukjanov> o/
21:32:00 <devananda> o/
21:32:03 <ttx> devananda, kgriffs, SergeyLukjanov: will cut your branches at the same time as the grown-up's
21:32:10 <devananda> great :)
21:32:18 <SergeyLukjanov> works for me too
21:32:33 <SergeyLukjanov> in savanna, we're only have 1 bp and 1 bug in flight
21:32:39 <SergeyLukjanov> https://launchpad.net/savanna/+milestone/icehouse-2
21:32:43 <devananda> we've got a bunch of bug-fixes in flight, stalled on the same slow gate that the grown-ups are stalled on.
21:32:54 <ttx> I will defer what's still open
21:33:03 <ttx> so keep your blueprints status current
21:33:08 <ttx> (and your bugs too)
21:33:21 <ttx> any question on that process ?
21:33:26 <devananda> our BP status is correct (2 implemented) I've deferred the rest that were targeted previously
21:33:50 <devananda> please feel free to defer any still-open bugs when you cut. we'll land what we can in the meantime
21:34:02 <devananda> no questions here
21:34:05 <ttx> #topic Open discussion
21:34:13 <ttx> Anything else, anyone ?
21:34:27 <sdague> on the cross project front...
21:34:43 <sdague> if you are using eventlet.wsgi to log requests, please turn debug off
21:34:45 <sdague> https://review.openstack.org/#/c/67737/
21:34:51 <sdague> I proposed that to neutron and nova
21:34:58 <sdague> otherwise you get a ton of log spam
21:35:01 <ttx> #info if you are using eventlet.wsgi to log requests, please turn debug off https://review.openstack.org/#/c/67737/
21:35:17 <devananda> sdague: thanks for the pointer. I suspect we need to do taht, too
21:35:22 <ttx> OK, back to useful work, then.
21:35:23 <SergeyLukjanov> ttx, are you planning to use 'proposed/icehouse-2' instead of mp branches for i3?
21:35:33 <ttx> no, mp
21:35:39 <ttx> the change wasn't merged yet
21:35:47 <ttx> we migt try that for i-3
21:35:55 <ttx> anything else ?
21:36:16 <ttx> ok then
21:36:19 <ttx> #endmeeting