20:00:09 <stevebaker> #startmeeting heat
20:00:09 <openstack> Meeting started Wed Feb 26 20:00:09 2014 UTC and is due to finish in 60 minutes. The chair is stevebaker. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:13 <openstack> The meeting name has been set to 'heat'
20:00:17 <stevebaker> #topic rollcall
20:00:39 <randallburt> 0/
20:00:54 <zaneb> howdy
20:00:58 <tango> o/
20:01:08 <jpeeler> hi
20:01:17 <bgorski> o/
20:01:57 <stevebaker> #topic Review last meeting's actions
20:02:06 <stevebaker> therve to update https://blueprints.launchpad.net/heat/+spec/parameter-nested-schema
20:02:23 <stevebaker> I've kicked that from i-3 anyway, so it no longer matters
20:02:32 <stevebaker> #topic Adding items to the agenda
20:02:34 <sdake_> o/
20:02:39 <stevebaker> #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda_.282014-2-26.29
20:03:03 <stevebaker> anything to add? we're probably all quite busy with feature freeze
20:03:40 <Slower> o/
20:03:51 <stevebaker> #topic Policy for managing syncing from Oslo
20:03:58 <stevebaker> #link https://review.openstack.org/#/c/74635/
20:04:15 <stevebaker> jpeeler: I assume shardy_afk added this, but you had the same objection
20:04:35 <jpeeler> i added that - i don't know what other projects do but i think it would be good to do regular syncing and stop doing single file updates
20:04:40 <stevebaker> dhellmann_: are you about to discuss oslo sync script?
20:05:05 <jpeeler> i asked in the channel and people had mixed feelings, so i added it to the agenda
20:05:09 <zaneb> jpeeler: +1
20:05:36 <stevebaker> jpeeler: single file updates are fine, cherry-picking single changes is probably what we want to avoid
20:05:48 <zaneb> stevebaker: I disagree
20:05:50 <jpeeler> why though?
20:06:06 <jpeeler> seems like a recipe for potential problems
20:06:07 <zaneb> why wouldn't we update everything when we need a change?
20:07:38 <zaneb> it has to be done eventually anyway
20:07:51 <zaneb> let the person who is most motivated have to do it
20:08:11 <stevebaker> zaneb: nova (for example) have got themselves into a situation where they are so out of sync that updating everything causes multiple breakages due to oslo api changes
20:08:33 <zaneb> stevebaker: so let's avoid that
20:08:36 <jpeeler> right
20:08:37 <randallburt> stevebaker: which seems like a situation we would want to avoid
20:08:45 <zaneb> by making sure we stay in sync
20:09:14 <stevebaker> in my ideal world update.py will create one heat commit per oslo commit, so if and when that happens then breakage can be tracked to a commit
20:09:36 <stevebaker> I believe that is happening but I have no idea where progress is at
20:09:52 <zaneb> that sounds nothing like the world we live in ;)
20:10:26 <sdake_> don't think update.py does that :)
20:11:06 <stevebaker> so how about for now we run python update.py --modules <module> once for each module, and create a commit per run
20:11:42 <sdake_> please don't update rpc
20:11:48 <zaneb> stevebaker: ugh, no
20:11:49 <stevebaker> sdake_: ;)
20:12:12 <sdake_> i guess my theory is if it isn't broken, why fix it :)
20:12:22 <sdake_> the review in question, something is broken
20:12:46 <SpamapS> o/
20:12:47 <zaneb> let's assume at the very least oslo is internally consistent at any given revision
20:12:47 <jpeeler> we've had to backport changes because we were out of sync
20:12:56 <zaneb> and then not go out of our way to break it
20:13:19 <sdake_> some of the updated files need modification after update.py is run as well
20:14:10 <jpeeler> sdake_: even more reason to not let it get out of date in my opinion
20:14:30 <stevebaker> shall we do a sync, but defer it to after feature freeze?
20:14:47 <stevebaker> just due to the review load
20:14:57 <jpeeler> it's not a one time thing though - it doesn't need to be done right now
20:15:08 <jpeeler> just in the future, some specified interval would be good
20:16:07 <zaneb> release milestones seem like the obvious times to do it
20:16:21 <zaneb> but milestones don't really seem to work that way in OpenStack
20:16:23 <sdake_> zaneb you mean AFTER a milestone
20:16:30 <zaneb> sdake_: yes
20:16:49 <zaneb> so after i-2 we would update to the i-2 version of oslo
20:16:57 <zaneb> for example
20:17:24 <stevebaker> so what is the action from this?
20:17:28 <sdake_> some modules depend on others
20:17:36 <zaneb> but I actually think the current system works fine: whenever somebody needs something new from oslo, they sync the whole thing
20:17:41 <sdake_> it is not as simple as updating one module per commit
20:17:45 <zaneb> that ensures we keep relatively up to data
20:18:01 <zaneb> date
20:18:03 <jpeeler> zaneb: as long as that is enforced
20:18:13 <zaneb> jpeeler: indeed
20:18:31 <Slower> makes sense to me
20:19:44 <stevebaker> so we're syncing after i-3, unless someone proposes it before and we have time to review it?
20:20:48 <sdake_> please don't sync notifier or rpc
20:20:58 <sdake_> TIA :)
20:21:22 <stevebaker> ok, moving on
20:21:26 <stevebaker> #topic Feature freeze blueprints status
20:21:27 <stevebaker> #link https://launchpad.net/heat/+milestone/icehouse-3
20:21:42 <stevebaker> EOB March 4th is feature freeze!
20:22:25 <SpamapS> ugh
20:22:36 <stevebaker> we can ask for extensions on specific blueprints, but those blueprints will need to be marked High priority
20:22:40 <SpamapS> Was hoping I'd get a couple days at the TripleO sprint to jam some patches in
20:22:55 <stevebaker> we have a lot of reviewing to do
20:23:10 <SpamapS> stevebaker: you coming to that? We need to get moving on softwareconfig....
20:24:10 <stevebaker> SpamapS: nopes, but I expect many polish bug fixes to make it work for you
20:24:35 <SpamapS> stevebaker: ok.. I'm reviewing the whole patch stream today.
20:25:01 <stevebaker> SpamapS: ok, expect completed structured config today
20:25:18 <stevebaker> SpamapS: then I'll rework a tripleo template
20:25:25 <zaneb> stevebaker: why do only Polish bugs get fixed?
20:25:59 <SpamapS> zaneb: politics....
20:26:02 <stevebaker> zaneb: :)
20:26:22 <stevebaker> I shall reverse my polish notation
20:26:34 <SpamapS> be proud, yoda would
20:27:09 <sdake_> oslo.messaging status -> 56 test cases fail, most API calls seem to work, multi-engine works, just need to beat the test cases into submission
20:27:33 <sdake_> jasond also asked for a refactor of service.py before I further refactor it in the oslo.messaging patch that I'll look into after the test cases pass
20:28:40 <zaneb> how would folks feel about sneaking in pluggable template parsers before the feature freeze?
20:28:58 <stevebaker> 17 blueprints are in review, if you're looking for guidelines on what to review first I guess it would be best to go for blueprint Priority first
20:29:15 <stevebaker> zaneb: are there changes for it?
20:29:45 <zaneb> stevebaker: it's going to be fairly minimal changes beyond what I already have posted patches for
20:30:14 <stevebaker> btw the heat-slow job is reasonably stable right now, so if your change fails on heat-slow you should check to see if it was your change that caused it
20:31:10 <stevebaker> zaneb: I don't think I mind
20:31:29 <zaneb> ok, I will post it when it's done and we will see what happens in the reviews
20:31:51 <stevebaker> #topic open discussion
20:32:31 <SpamapS> As far as "stuff to get in late" ...
20:32:44 <SpamapS> I'm pretty sure we _must_ have manual retry ability in TripleO.
20:33:13 <SpamapS> update is basically a death sentence without it.
20:33:14 <zaneb> SpamapS: retry of what, specifically?
20:33:20 <SpamapS> zaneb: update
20:33:27 <sdake_> is there a blueprint for that?
20:33:30 <SpamapS> yes
20:33:40 <zaneb> there is, it got bumped
20:33:58 <SpamapS> It got bumped, because ETOOHARD .. but I think it is too important ..
20:34:15 <stevebaker> SpamapS: is it possible to do all updates with rollback enabled?
20:34:16 <zaneb> it got bumped because ENOTIME
20:34:16 * radix arrives
20:34:26 <SpamapS> The alternative is if we make abandon/adopt work, we can write some magic to fix broken things in the abandoned dump file and re-adopt the stack.
20:34:31 <SpamapS> stevebaker: no
20:34:44 <sdake_> SpamapS who was assigned to the blueprint?
20:34:46 <SpamapS> zaneb: I'm about to make time for it.
20:34:54 <zaneb> sdake_: I was
20:35:02 <zaneb> /am
20:35:09 <SpamapS> Basically our CD tripleo cloud dies every 8 hours or so because of random 500 errors from overloaded nova-api or neutron-server.
20:35:46 <stevebaker> zaneb: I'm assuming the change would be too disruptive for it to be flagged as a bug, even though it really is
20:35:53 <SpamapS> Now, I _also_ think we should handle those by just retrying... but we need a way to recover from the cases we don't handle well or that are completely unexpected and unhandleable.
20:35:54 <sdake_> I guess what we dont want is tripleo being nonfunctional for 6 months until Juno is released
20:36:19 <SpamapS> sdake_: IMO this is Heat being non-functional
20:36:25 <SpamapS> create is fine..
20:36:31 <SpamapS> but updates are where the money is at ;)
20:36:39 <zaneb> SpamapS: I feel your pain. The problem, as you know, is that Heat ends up with inconsistent data. I'm not sure what a short-term solution looks like that doesn't make things worse
20:37:12 <SpamapS> zaneb: Yeah, I took a stab at a spike.. storing the snippets in resources and then when loading a stack, overlaying the stored snippets on the loaded template.
20:37:23 <SpamapS> zaneb: worked, but it was really hard to test.
20:37:32 <SpamapS> which told me "you have just created a monster"
20:38:02 <stevebaker> SpamapS: are we talking updates that do replacements? surely if replacements are avoided then handling the case of an inline update failing would be easier
20:38:20 <sdake_> anyone have a link to the blueprint?
20:38:42 <zaneb> stevebaker: additions/removals are the worst
20:38:52 <zaneb> we can lose the resources altogether
20:39:15 <SpamapS> stevebaker: right, as zaneb says.. just adding a server .. removing an object.. haphazard.
20:39:44 <zaneb> sdake_: https://blueprints.launchpad.net/heat/+spec/update-failure-recovery
20:39:46 <radix> replacements are implemented as add-then-delete, right?
20:39:53 <radix> (currently I mean)
20:40:24 <SpamapS> At the moment we fail and don't roll back successfully, the thing in the database (raw template) is completely non-indicative of what is actually in existence.
20:40:31 <SpamapS> So one way to handle this... is to abandon the stack...
20:40:32 <SpamapS> and then edit the abandoned stack dump to reflect actual reality...
20:40:32 <SpamapS> and then adopt
20:40:44 <zaneb> radix: yes, but in that case the resource name stays the same, so we won't actually lose it.
20:41:01 <zaneb> radix: it's not without issues though
20:41:22 <radix> SpamapS: yeah, abandandon/adopt allow for all sorts of cool hacks to work around limitations in heat ;)
20:41:54 <SpamapS> anyway
20:41:55 <SpamapS> I know time is short..
20:41:59 <SpamapS> and this is not without risk.
20:42:18 <zaneb> radix: true! :) Although, let's not forget it works by allowing us to blame the user when it goes wrong
20:42:21 <zaneb> ;)
20:42:27 <SpamapS> but I think users are really in a bad place without it if they don't know exactly what they're getting into.
20:42:29 <radix> heh heh
20:42:59 <SpamapS> So just raising my hand and sayign "this may get ugly soon"
20:43:14 <zaneb> SpamapS: I think you are on the right track
20:43:24 <SpamapS> It has been tough for me to focus on it, as I've also been trying to follow hot-software-config and piggy back my graceful-update-control thing on that.
20:43:31 <radix> so are we talking about something for icehouse? (sorry I came into the conversation late)
20:43:33 <stevebaker> SpamapS: so since you do CI, feature freeze is a window where you don't get anything but fixes, but as soon as juno opens then you can use landed features again
20:43:37 <zaneb> and I don't think this getting it in will make it harder to clean up later, so +1
20:43:45 <SpamapS> but I feel like hot-software-config is going to make our life easier for graceful updates
20:44:12 <zaneb> but get it in soon, because realistically there will probably be bugs to deal with
20:44:26 <sdake_> 5 days - tight
20:44:36 <SpamapS> right, as soon as I have a clear short term plan for graceful updates I'll put my head down on this
20:46:55 <stevebaker> anything else? we can finish early
20:47:17 <SpamapS> \o/ use this 13 minutes for reviews everyone. :)
20:48:27 <stevebaker> #action everyone uses 780 extra seconds to do code reviews
20:48:31 <stevebaker> #endmeeting