20:01:23 <shardy> #startmeeting heat
20:01:24 <openstack> Meeting started Wed Aug 21 20:01:23 2013 UTC and is due to finish in 60 minutes. The chair is shardy. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:01:25 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:01:27 <openstack> The meeting name has been set to 'heat'
20:01:34 <shardy> #topic rollcall
20:01:40 <shardy> hi all, who's around?
20:01:42 <jpeeler> jpeeler here
20:01:42 <funzo> o/
20:01:43 <m4dcoder> o/
20:01:44 <radix> hello!
20:01:45 <stevebaker> here
20:01:48 <spzala> Hi!
20:01:55 <therve> Hi!
20:02:01 <zaneb> \o
20:02:36 <shardy> #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda
20:03:10 <SpamapS> o/
20:03:14 <SpamapS> (leaving early)
20:03:17 <adrian_otto> o/
20:03:24 <shardy> asalkeld, sdake?
20:03:37 <shardy> ok lets get started
20:03:56 <shardy> #topic Review last week's actions
20:04:27 <shardy> #link http://eavesdrop.openstack.org/meetings/heat/2013/heat.2013-08-14-20.00.html
20:04:35 * shardy shardy to post mission statement
20:04:42 <shardy> oops, I forgot to do that, again
20:04:59 <shardy> does anyone have a link to the thread where this was requested, then I'll reply after this meeting?
20:05:08 <shardy> #action shardy to post mission statement, again
20:05:20 <stevebaker> I think people just post new threads with the statement
20:05:33 <shardy> stevebaker: Ok, thanks
20:05:44 <shardy> anything else from last week before we move on?
20:06:23 <shardy> #topic Reminder re Feature proposal freeze
20:07:10 <shardy> So it's this Friday, any features posted for review after that will require an exception, so we should start -2ing stuff which gets posted late
20:07:12 <therve> The gate already decided for us it seems
20:07:25 <shardy> therve: haha, yea quite possibly ;)
20:07:29 * bnemec is here, but got distracted playing with stackalytics...
20:07:39 <stevebaker> maybe we should review late items at the meeting
20:08:15 <shardy> stevebaker: yup, next item is h3 bps
20:08:24 <shardy> #topic h3 blueprint status
20:08:32 <zaneb> so, I'm working on https://bugs.launchpad.net/heat/+bug/1176142
20:08:34 <uvirtbot> Launchpad bug 1176142 in heat "UPDATE_REPLACE deletes things before it creates the replacement" [High,In progress]
20:08:40 <zaneb> dunno if you would call that a "feature"
20:08:46 <shardy> #link https://launchpad.net/heat/+milestone/havana-3
20:08:49 <zaneb> but it is *not* going to land by friday
20:08:50 <SpamapS> zaneb: \o/
20:09:03 <SpamapS> zaneb: that is a bug 100%
20:09:04 <uvirtbot> Launchpad bug 100 in launchpad "uploading po file overwrites authors list" [Medium,Fix released] https://launchpad.net/bugs/100
20:09:09 <SpamapS> haha doh
20:09:22 <zaneb> basically, it takes longer to rebase my current work than time I have to work on it
20:09:24 <shardy> zaneb: well technically I think it's only BPs which are frozen, but IMO we should start deferring any big bugs which contain new functionality not just fixes
20:09:36 <zaneb> so I have a negative amount of time to spend on the actual problem
20:09:40 <shardy> zaneb: so maybe we defer until early icehouse?
20:09:45 <zaneb> which is already *really* hard
20:09:48 <shardy> #link https://launchpad.net/heat/+milestone/havana-3
20:10:21 <shardy> Ok so I think heat-trusts will slip too - I've just had too many problems with keystone, and it's taken too long to get the client support merged
20:10:27 <SpamapS> zaneb: are you saying that you have so many things in-flight that you spend all day rebasing?
20:10:30 <zaneb> yeah, let's see how things go over the next week or so, but I'm thinking it will probably have to be bumped :(
20:10:40 <radix> shardy: sad :(
20:10:59 <therve> zaneb, Considering it's a bug, can't it go past the feature freeze?
20:11:08 <therve> Or is it too big of a behavior change?
20:11:18 <zaneb> it's a big behaviour change, tbh
20:11:19 <adrian_otto> bugs are exempt
20:11:21 <shardy> zaneb, radix: IMO it's much better to bump big stuff which is too late than land loads of risky stuff and make our first integrated release really broken
20:11:31 <adrian_otto> there is no point in delaying a bugfix
20:11:57 <radix> shardy: yeah, I understand
20:12:03 <shardy> adrian_otto: This is a complete rework of our update logic, so it's not a normal bugfix
20:12:20 <adrian_otto> same logic applies
20:12:24 <shardy> but yeah, in general bugs are fine
20:12:29 <adrian_otto> unless you are adding a new feature?
20:12:29 <m4dcoder> zaneb: if the UPDATE_REPLACE is fixed, i think it's going to be easier to implement the rolling update for as-update-policy.
20:12:35 <radix> I was just sad that the keystone stuff is buggy
20:13:04 <shardy> radix: well it's very new stuff, AFAICT we're the first to try to really use it
20:13:33 <SpamapS> shardy: I think landing potentially broken stuff early is also an anti-pattern. Lets land good stuff.
20:13:34 <radix> yeah, that's usually how it goes when you have inter project dependencies like that
20:14:31 <shardy> SpamapS: yeah, but if we've got stuff which is rushed and may be flaky, or depends on stuff known-to-be-flaky in other projects, now is not really the time to land it
20:14:45 <shardy> early in the cycle is much less risky as we've got more time to fix and test
20:14:52 <SpamapS> Yes, what I am saying is, lets not count on it being landable, flaky, on day 1 of icehouse.
20:15:07 <SpamapS> There is never a time where it is ok to risk breaking everything.
20:15:53 <shardy> SpamapS: sure, but deferring gives those working on said features more time to test and solidify their stuff before it's merged
20:16:16 <SpamapS> deferring +1. Planning to de-stabilize trunk, -1.
20:16:40 <SpamapS> (and I acknowledge that you were just saying "lets defer")
20:16:53 <SpamapS> I'm just saying, don't defer, and stop working on it.
20:17:02 <SpamapS> defer landing, keep stabilizing it.
20:17:11 <shardy> Yeah, who was saying destabilize trunk, I think you just made that up ;P
20:17:17 <shardy> anyways..
20:17:32 <SpamapS> The entire software industry made that up. Its called "lets just drop it in trunk when we re-open after freeze".
20:17:43 <shardy> There are a few bp's still in "Good Progress", some of which have patches posted I think:
20:17:45 * SpamapS moves on :)
20:18:05 <shardy> #link https://blueprints.launchpad.net/heat/+spec/hot-parameters
20:18:24 <shardy> #link https://blueprints.launchpad.net/heat/+spec/multiple-engines
20:18:52 <shardy> #link https://blueprints.launchpad.net/heat/+spec/heat-multicloud
20:18:53 <stevebaker> Should have something to submit today
20:18:58 <stevebaker> for multi-engines
20:19:07 <stevebaker> i mean multicloud :/
20:19:18 <stevebaker> too many multis
20:19:23 <shardy> #link https://blueprints.launchpad.net/heat/+spec/oslo-db-support
20:19:45 <shardy> Ok, cool, just wanted to see if some of those should actually be either Implemented or Needs Code Review
20:19:57 <shardy> anyone know if hot-parameters is really done?
20:20:34 <zaneb> I think it's really sort-of done
20:20:42 <shardy> https://review.openstack.org/#/q/status:merged+project:openstack/heat+branch:master+topic:bp/hot-parameters,n,z
20:21:08 <shardy> All the stuff posted is merged, so can we claim it Implemented for h3 purposes?
20:21:22 <zaneb> for h3 purposes I think so, yes
20:21:43 <shardy> zaneb: Ok, cool, thanks
20:22:14 <zaneb> best to double check with nanjj too though
20:22:19 <radix> then maybe it needs another bp for post-h features
20:22:57 <shardy> I moved a few bugs into h3 which look like it would be good to fix, if anyone has bandwidth and needs something to do please pick them up :)
20:22:57 <zaneb> radix: yes, quite likely
20:23:20 <shardy> Ok I'll ping nanjj to check tomorrow
20:23:43 <shardy> Anyone else have anything to raise re h3 before we move on?
20:23:58 <therve> I may slip my lbaas bp in it
20:24:09 <therve> It "just" needs one more branch I think
20:24:33 <shardy> Overall I think we've done really really well, 27 bps and 60 bugs atm, if we land most of that it's going to be a great effort :)
20:24:33 <radix> what's the remaining branch?
20:24:54 <shardy> therve: Ok, cool, if things are up for review but not targeted please add them
20:24:59 <therve> radix, https://review.openstack.org/#/c/41475/
20:25:07 <m4dcoder> i'm still working to get my last patch submitted for review for as-update-policy before end of week. as-update-policy was moved to "next".
20:25:17 <shardy> I think m4dcoder posted a patch which got bumped and we may be able to pull back
20:25:21 <shardy> snap
20:25:21 <radix> oh OK
20:25:59 <shardy> m4dcoder: Ok, if it looks like it's going to land, I'll pull it back
20:26:19 <m4dcoder> thx. i have 1 in review and another 1 i'm going to submit before end of week for as-update-policy.
20:26:45 <shardy> m4dcoder: Ok, sounds good thanks
20:27:43 <shardy> #topic Open Discussion
20:27:54 <shardy> anyone have anything?
20:28:05 <spzala> shardy: I will follow up with nanji tonight on hot-parameter validation blueprint, I think it is done like you mentioned. Sorry I was on another meeting.
20:28:22 <m4dcoder> can havana release of heat still work with an openstack instances on grizzly?
20:28:31 <shardy> spzala: Ok, that would be great, thanks, pls change the Implementation status if so
20:28:43 <stevebaker> there was a recent change where if a nova boot fails, the instance resource deletes the server during create. Do we want to be doing this?
20:29:02 <zaneb> stevebaker: excellent question
20:29:02 <shardy> m4dcoder: You mean on a grizzly openstack install?
20:29:06 <spzala> shardy: OK, no problem. Yup, will do.
20:29:08 <m4dcoder> shardy: yes
20:29:14 <zaneb> I don't want to be doing that, it frightens me
20:29:29 <shardy> m4dcoder: maybe, but it's not something we support, you need to use stable/grizzly
20:29:44 <stevebaker> I'm inclined to leave the failed server there, for post-mortem if nothing else
20:29:54 <radix> I have been wondering about having an explicit "try to converge" operation in heat
20:30:37 <radix> which would basically clean up and retry to get things to look like the template
20:30:39 <SpamapS> Are we deleting it, and re-trying?
20:30:42 <SpamapS> I do like that
20:30:46 <SpamapS> an ERROR state is a dead instance.
20:30:54 <stevebaker> just deleting and putting the resource in FAILED state
20:30:55 <shardy> stevebaker: I agree, I don't think we want to delete, or try to delete until stack delete
20:31:07 <stevebaker> i shall raise a bug
20:31:13 <radix> no, it doesn't currently retry. and I don't think it should by default
20:31:27 <m4dcoder> shardy: thanks.
20:31:29 <SpamapS> yeah that could lead to a large bill. ;)
20:31:42 <radix> instance group does this too BTW - deletes the sub resources
20:32:24 <shardy> radix: yeah I was looking at that in conjunction with a patch from liang
20:32:46 <zaneb> how did a major change to the behaviour of instances sneak in in a commit that claimed to be just adding a Rackspace resource?
20:32:48 <therve> radix, It'd be nice to have at least an API call to "retry" create
20:32:50 <zaneb> https://github.com/openstack/heat/commit/2684f2bb4cda1b1a23ce596fcdb476bb961ea3f8
20:32:53 <shardy> radix: IMO that is also wrong, the InstanceGroup resource should go into a failed state (probably UPDATE, FAILED) if it can't adjust
20:32:57 <zaneb> that's extremely uncool
20:33:18 <shardy> therve: there is a bug for allowing retry of create/update
20:33:20 <radix> yeah. I didn't do it, I just tried to maintain the behavior through my redactor :)
20:33:32 <radix> f
20:34:08 <therve> zaneb, That's a bit sad and untested :/
20:34:30 <bnemec> zaneb: Wow, that was a really bad commit. It introduced the ResourceFailure bug too.
20:34:32 <SpamapS> shardy: all these transitions to failed state make the urgency of needing a "RETRY" capability go up.
20:34:39 <shardy> radix: sure, well lets raise a bug and fix it
20:35:02 <zaneb> bnemec: no, I introduced the ResourceFailure bug by not spotting that (bizarre) change
20:35:04 <SpamapS> have not had time to address the lack thereof.. but would still like to very much
20:35:08 <shardy> SpamapS: well IIRC it's assigned to you...
20:35:33 <zaneb> bnemec: and by "not spotting" I mean "relying on the unit tests instead of grep"
20:35:42 <shardy> SpamapS: if you don't have the bandwidth, let me know and we'll reassign
20:35:45 <SpamapS> Yeah, I keep running into these things where heat is a time bomb waiting to eat all of your memory/disk/cpu ... can't seem to prioritize retry over those. ;)
20:35:48 * zaneb goes to the naughty corner
20:36:06 <bnemec> zaneb: Yeah, part of the problem with that is it's too large. 800 some lines is too much to review properly.
20:36:29 <kebray> SpamapS we're running into the same problem.. other general scalability issues are keeping us from getting to implementing a Retry
20:36:47 * SpamapS watches the ducks line up
20:36:56 <kebray> but, +1 on wanting retry, both retry create, and retry individual steps of the create.
20:36:56 <SpamapS> and now I have an appointment that I have to get to.
20:36:59 <zaneb> bnemec: well, the problem is when it says "Add resource for Rackspace Cloud Servers" but actually makes fundamental changes to other resources :)
20:37:00 <SpamapS> anybody need me for something before I go?
20:37:44 <shardy> kebray: OK well lets coordinate getting someone (if SpamapS can't get to it) looking at that soon
20:37:54 <kebray> shardy sounds good.
20:38:01 <shardy> SpamapS: o/
20:38:21 * SpamapS goes poof
20:38:22 <shardy> So I have a general question re upgrade strategy when we move to trusts..
20:38:38 <stevebaker> shoot
20:39:00 <shardy> the cleanest way is to drop the DB and just use trusts for all user_creds, but I'm thinking we need to allow transition, ie existing stacks should still work
20:39:24 <bnemec> zaneb: Sure, but in a 200 line change that probably gets shot down by reviewers. In 800 it gets lost in the noise.
20:39:32 <bnemec> (sorry for the tangent in the middle of the meeting)
20:39:45 <shardy> so my current plan is to extend the context and user_creds adding a trust_id, which we use if it's there, otherwise we fall back to the user/pass for old, existing stacks
20:39:59 <therve> shardy, Is there a way to migrate existing stacks?
20:40:04 <zaneb> bnemec: fair point; that's hard to avoid when you're adding a whole new resource though
20:40:05 <stevebaker> shardy: how about storing in resource_data?
20:40:22 <shardy> therve: not really, because you don't have a connection to keystone at DB migrate time
20:40:40 <zaneb> bnemec: but yes, that should have been at least 3 patches. and at least 1 should have been rejected ;)
20:40:45 <shardy> but we could write a tool which creates a trust using the stored credentials, and migrate it that way
20:41:05 <bnemec> zaneb: Agreed. :-)
20:41:25 <stevebaker> shardy: does this mean new secrets need to make it onto instances?
20:41:34 <shardy> stevebaker: I was thinking of just adding the trust_id to the stack table, but that makes the overlap between old/new methods harder
20:41:38 <stevebaker> oh, this is just for api requests
20:41:38 <zaneb> shardy: heat-manage could do that maybe?
20:41:54 <shardy> stevebaker: no, not yet, this is just the credentials for periodic tasks in the engine
20:42:19 <shardy> zaneb: hmm, yeah, but heat-manage would need the ID of the heat service user
20:42:36 <stevebaker> short answer is, it would be nice if the old method continued to work in parallel
20:42:42 <shardy> I guess it could read it from a cli arg or config file
20:43:00 <zaneb> shardy: it would need the whole config file to decrypt the credentials anyway
20:43:08 <zaneb> shardy: but that seems doable
20:43:13 <shardy> zaneb: good point
20:43:35 <zaneb> shardy: if you want to get really fancy, you could do it in the db migration ;)
20:43:44 <shardy> stevebaker: agreed, but then do we publish e.g that we'll transition to just trusts after e.g one cycle?
20:44:21 <shardy> zaneb: haha, yeah I guess, was trying to keep things simple ;)
20:44:45 <stevebaker> shardy: actually, we may need to keep the old way for a while if we support older openstack clouds
20:45:07 <shardy> Ok thanks all for the input, will try to get a wip patch up for review soon, aiming for early Icehouse when the keystoneclient patches etc have landed
20:45:19 <stevebaker> like, indefinitely. And have some way of discovering if the keystone supports trusts
20:45:29 <zaneb> ick
20:45:36 <shardy> stevebaker: you mean for multicloud?
20:45:42 <stevebaker> shardy: yes
20:45:48 <shardy> gah
20:45:59 <zaneb> it's extremely, extremely uncool that we are storing credentials at all
20:46:00 <stevebaker> :D
20:46:02 <shardy> that could get really messy
20:46:17 <zaneb> I think it's better to say that if you don't have a compatible keystone, you lose out
20:46:23 <stevebaker> zaneb: yes, unless it is a private heat installation
20:46:39 <zaneb> than to say that if you don't have a compatible keystone, we store your password in a really insecure way
20:46:48 <shardy> zaneb: that's why I was hoping everyone would say drop-the-db, kill the stored-creds ;)
20:46:51 <stevebaker> plaintext!
20:46:59 <zaneb> it may as well be
20:47:31 <zaneb> if we keep both around, you'll never be sure which we're doing
20:47:43 <stevebaker> maybe we can look into a keystore for icehouse
20:47:47 <shardy> stevebaker: So we say master only supports havana for native openstack deployments, are you saying we somehow have to maintain indefinite backwards compat for multicloud?
20:48:29 <shardy> zaneb: If we can manage a flag-day migration, I would much prefer it, and the resulting code will be much much easier to maintain
20:48:30 <stevebaker> shardy: that is something we should discuss
20:48:44 <stevebaker> sorry, I have to go.
20:48:56 <shardy> stevebaker: Ok lets pick it up on the ML
20:49:15 <shardy> anyone else have anything for the last few minutes?
20:49:37 <radix> I was distracted for a bit, was a bug filed about not deleting failed instances?
20:49:45 <radix> I can file one for the InstanceGroup and take that on
20:50:20 <zaneb> radix: I think stevebaker was going to file one
20:50:32 <zaneb> for Instance, that is
20:50:34 <shardy> radix: not yet, please do, it was discussed ref https://review.openstack.org/#/c/42462/
20:51:03 <radix> okie doke
20:51:17 <zaneb> radix: https://bugs.launchpad.net/heat/+bug/1215132
20:51:19 <radix> whoah. nice new format for jenkins comments :D
20:51:20 <uvirtbot> Launchpad bug 1215132 in heat "Nova server gets deleted immediately after failed create" [High,Confirmed]
20:51:33 <radix> zaneb: ok, I'll create a similar one for InstanceGroup.
20:52:09 <shardy> funzo: Did you get the feedback you needed re autoscaling last week?
20:52:30 <shardy> funzo: guess you mainly need to speak with asalkeld re alarms etc from your ML post?
20:53:06 <funzo> I didn't have a long discussion, it was just pointing to the doc
20:53:14 <radix> I can't assign bugs to milestones, so if someone wants to put https://bugs.launchpad.net/heat/+bug/1215140 in h3 that'd be peachy
20:53:15 <uvirtbot> Launchpad bug 1215140 in heat "InstanceGroup shouldn't delete instances that failed to be created" [Undecided,New]
20:53:57 <funzo> shardy: I believe the general thought is we could use nested stacks as a first cut, but I've been focused more on DIB this past week.
20:54:21 <shardy> funzo: Ok, well shout if there's any info you need from us :)
20:54:23 <funzo> shardy: I'll probably talk more about the scaling work when the rhel images are booting in os
20:54:30 <funzo> shardy: definitely will, thx
20:54:34 <shardy> funzo: Ok, cool
20:54:47 <shardy> anything else before we wrap things up?
20:55:25 <shardy> Ok then, well thanks all!
20:55:31 <shardy> #endmeeting
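
Note on the trusts upgrade discussion (20:38:22 to 20:48:56): the fallback shardy outlines at 20:39:45, "use the trust_id if it's there, otherwise fall back to the user/pass for old, existing stacks", can be illustrated roughly as below. This is only a sketch under assumptions, not Heat's code: `StoredCreds` and `auth_kwargs` are hypothetical names invented here, and the returned dict merely stands in for whatever would actually be passed to the keystone client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoredCreds:
    """Hypothetical stand-in for one stored user_creds record."""
    username: Optional[str] = None
    password: Optional[str] = None
    trust_id: Optional[str] = None  # the new field proposed in the meeting


def auth_kwargs(creds: StoredCreds) -> dict:
    """Choose how a periodic engine task would authenticate.

    Prefer the trust when one is stored; otherwise fall back to the
    stored username/password so stacks created before trusts keep working.
    """
    if creds.trust_id:
        return {"method": "trust", "trust_id": creds.trust_id}
    if creds.username and creds.password:
        return {
            "method": "password",
            "username": creds.username,
            "password": creds.password,
        }
    # Neither form of credential is present: fail loudly rather than guess.
    raise ValueError("user_creds record has neither a trust_id nor a password")
```

A record that carries both a trust_id and a password resolves to the trust, which matches the "use it if it's there" ordering. zaneb's objection at 20:47:31 still applies to any scheme like this: while both paths exist, an operator cannot tell from the outside which one a given stack is using.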