20:00:09 <stevebaker> #startmeeting heat
20:00:09 <openstack> Meeting started Wed Feb 26 20:00:09 2014 UTC and is due to finish in 60 minutes.  The chair is stevebaker. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:11 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:13 <openstack> The meeting name has been set to 'heat'
20:00:17 <stevebaker> #topic rollcall
20:00:39 <randallburt> 0/
20:00:54 <zaneb> howdy
20:00:58 <tango> o/
20:01:08 <jpeeler> hi
20:01:17 <bgorski> o/
20:01:57 <stevebaker> #topic Review last meeting's actions
20:02:06 <stevebaker> therve to update https://blueprints.launchpad.net/heat/+spec/parameter-nested-schema
20:02:23 <stevebaker> I've kicked that from i-3 anyway, so it no longer matters
20:02:32 <stevebaker> #topic Adding items to the agenda
20:02:34 <sdake_> o/
20:02:39 <stevebaker> #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda_.282014-2-26.29
20:03:03 <stevebaker> anything to add? we're probably all quite busy with feature freeze
20:03:40 <Slower> o/
20:03:51 <stevebaker> #topic Policy for managing syncing from Oslo
20:03:58 <stevebaker> #link https://review.openstack.org/#/c/74635/
20:04:15 <stevebaker> jpeeler: I assume shardy_afk added this, but you had the same objection
20:04:35 <jpeeler> i added that - i don't know what other projects do but i think it would be good to do regular syncing and stop doing single file updates
20:04:40 <stevebaker> dhellmann_: are you about to discuss oslo sync script?
20:05:05 <jpeeler> i asked in the channel and people had mixed feelings, so i added it to the agenda
20:05:09 <zaneb> jpeeler: +1
20:05:36 <stevebaker> jpeeler: single file updates are fine, cherry-picking single changes is probably what we want to avoid
20:05:48 <zaneb> stevebaker: I disagree
20:05:50 <jpeeler> why though?
20:06:06 <jpeeler> seems like a recipe for potential problems
20:06:07 <zaneb> why wouldn't we update everything when we need a change?
20:07:38 <zaneb> it has to be done eventually anyway
20:07:51 <zaneb> let the person who is most motivated have to do it
20:08:11 <stevebaker> zaneb: nova (for example) have got themselves into a situation where they are so out of sync that updating everything causes multiple breakages due to oslo api changes
20:08:33 <zaneb> stevebaker: so let's avoid that
20:08:36 <jpeeler> right
20:08:37 <randallburt> stevebaker:  which seems like a situation we would want to avoid
20:08:45 <zaneb> by making sure we stay in sync
20:09:14 <stevebaker> in my ideal world update.py will create one heat commit per oslo commit, so if and when that happens then breakage can be tracked to a commit
20:09:36 <stevebaker> I believe that is happening but I have no idea where progress is at
20:09:52 <zaneb> that sounds nothing like the world we live in ;)
20:10:26 <sdake_> don't think update.py does that :)
20:11:06 <stevebaker> so how about for now we run python update.py --modules <module> once for each module, and create a commit per run
20:11:42 <sdake_> please don't update rpc
20:11:48 <zaneb> stevebaker: ugh, no
20:11:49 <stevebaker> sdake_: ;)
20:12:12 <sdake_> i guess my theory is if it isn't broken, why fix it :)
20:12:22 <sdake_> the review in question, something is broken
20:12:46 <SpamapS> o/
20:12:47 <zaneb> let's assume at the very least oslo is internally consistent at any given revision
20:12:47 <jpeeler> we've had to backport changes because we were out of sync
20:12:56 <zaneb> and then not go out of our way to break it
20:13:19 <sdake_> some of the updated files need modification after update.py is run as well
20:14:10 <jpeeler> sdake_: even more reason to not let it get out of date in my opinion
20:14:30 <stevebaker> shall we do a sync, but defer it to after feature freeze?
20:14:47 <stevebaker> just due to the review load
20:14:57 <jpeeler> it's not a one time thing though - it doesn't need to be done right now
20:15:08 <jpeeler> just in the future, some specified interval would be good
20:16:07 <zaneb> release milestones seem like the obvious times to do it
20:16:21 <zaneb> but milestones don't really seem to work that way in OpenStack
20:16:23 <sdake_> zaneb you mean AFTER a milestone
20:16:30 <zaneb> sdake_: yes
20:16:49 <zaneb> so after i-2 we would update to the i-2 version of oslo
20:16:57 <zaneb> for example
20:17:24 <stevebaker> so what is the action from this?
20:17:28 <sdake_> some modules depend on others
20:17:36 <zaneb> but I actually think the current system works fine: whenever somebody needs something new from oslo, they sync the whole thing
20:17:41 <sdake_> it is not as simple as updating one module per commit
20:17:45 <zaneb> that ensures we keep relatively up to data
20:18:01 <zaneb> date
20:18:03 <jpeeler> zaneb: as long as that is enforced
20:18:13 <zaneb> jpeeler: indeed
20:18:31 <Slower> makes sense to me
20:19:44 <stevebaker> so we're syncing after i-3, unless someone proposes it before and we have time to review it?
20:20:48 <sdake_> please don't sync notifier or rpc
20:20:58 <sdake_> TIA :)
20:21:22 <stevebaker> ok, moving on
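A minimal sketch of the per-module sync stevebaker floated at 20:11:06: run update.py once per module and make one commit per run. Only the `python update.py --modules <module>` invocation comes from the log. The module names, the commit message format, and the helper names `plan_sync` and `run_plan` are hypothetical. It skips rpc and notifier as sdake_ requested, and it does not address his point that some modules depend on others. It prints the commands by default instead of running them.

```python
import shlex
import subprocess

# Modules sdake_ asked not to sync in this meeting.
SKIP = {"rpc", "notifier"}


def plan_sync(modules, skip=SKIP):
    """Return one [sync, stage, commit] command group per module."""
    plan = []
    for module in modules:
        if module in skip:
            continue
        plan.append([
            # Invocation as quoted in the log; real runs may need more flags.
            ["python", "update.py", "--modules", module],
            ["git", "add", "-A"],
            # Hypothetical commit message format.
            ["git", "commit", "-m", f"Sync {module} from oslo"],
        ])
    return plan


def run_plan(plan, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for group in plan:
        for cmd in group:
            if dry_run:
                print(shlex.join(cmd))
            else:
                subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical module list, for illustration only.
    run_plan(plan_sync(["log", "rpc", "timeutils"]))
```

One commit per module means a breakage can be bisected to a single module sync, which is the traceability stevebaker wanted. The whole-tree sync zaneb and jpeeler preferred would be a single update.py run and a single commit.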
20:21:26 <stevebaker> #topic Feature freeze blueprints status
20:21:27 <stevebaker> #link https://launchpad.net/heat/+milestone/icehouse-3
20:21:42 <stevebaker> EOB March 4th is feature freeze!
20:22:25 <SpamapS> ugh
20:22:36 <stevebaker> we can ask for extensions on specific blueprints, but those blueprints will need to be marked High priority
20:22:40 <SpamapS> Was hoping I'd get a couple days at the TripleO sprint to jam some patches in
20:22:55 <stevebaker> we have a lot of reviewing to do
20:23:10 <SpamapS> stevebaker: you coming to that? We need to get moving on softwareconfig....
20:24:10 <stevebaker> SpamapS: nopes, but I expect many polish bug fixes to make it work for you
20:24:35 <SpamapS> stevebaker: ok.. I'm reviewing the whole patch stream today.
20:25:01 <stevebaker> SpamapS: ok, expect completed structured config today
20:25:18 <stevebaker> SpamapS: then I'll rework a tripleo template
20:25:25 <zaneb> stevebaker: why do only Polish bugs get fixed?
20:25:59 <SpamapS> zaneb: politics....
20:26:02 <stevebaker> zaneb: :)
20:26:22 <stevebaker> I shall reverse my polish notation
20:26:34 <SpamapS> be proud, yoda would
20:27:09 <sdake_> oslo.messaging status -> 56 test cases fail, most API calls seem to work, multi-engine works, just need to beat the test cases into submission
20:27:33 <sdake_> jasond also asked for a refactor of service.py before I further refactor it in the oslo.messaging patch that I'll look into after the test cases pass
20:28:40 <zaneb> how would folks feel about sneaking in pluggable template parsers before the feature freeze?
20:28:58 <stevebaker> 17 blueprints are in review, if you're looking for guidelines on what to review first I guess it would be best to go for blueprint Priority first
20:29:15 <stevebaker> zaneb: are there changes for it?
20:29:45 <zaneb> stevebaker: it's going to be fairly minimal changes beyond what I already have posted patches for
20:30:14 <stevebaker> btw the heat-slow job is reasonably stable right now, so if your change fails on heat-slow you should check to see if it was your change that caused it
20:31:10 <stevebaker> zaneb: I don't think I mind
20:31:29 <zaneb> ok, I will post it when it's done and we will see what happens in the reviews
20:31:51 <stevebaker> #topic open discussion
20:32:31 <SpamapS> As far as "stuff to get in late" ...
20:32:44 <SpamapS> I'm pretty sure we _must_ have manual retry ability in TripleO.
20:33:13 <SpamapS> update is basically a death sentence without it.
20:33:14 <zaneb> SpamapS: retry of what, specifically?
20:33:20 <SpamapS> zaneb: update
20:33:27 <sdake_> is there a blueprint for that?
20:33:30 <SpamapS> yes
20:33:40 <zaneb> there is, it got bumped
20:33:58 <SpamapS> It got bumped, because ETOOHARD .. but I think it is too important ..
20:34:15 <stevebaker> SpamapS: is it possible to do all updates with rollback enabled?
20:34:16 <zaneb> it got bumped because ENOTIME
20:34:16 * radix arrives
20:34:26 <SpamapS> The alternative is if we make abandon/adopt work, we can write some magic to fix broken things in the abandoned dump file and re-adopt the stack.
20:34:31 <SpamapS> stevebaker: no
20:34:44 <sdake_> SpamapS who was assigned to the blueprint?
20:34:46 <SpamapS> zaneb: I'm about to make time for it.
20:34:54 <zaneb> sdake_: I was
20:35:02 <zaneb> /am
20:35:09 <SpamapS> Basically our CD tripleo cloud dies every 8 hours or so because of random 500 errors from overloaded nova-api or neutron-server.
20:35:46 <stevebaker> zaneb: I'm assuming the change would be too disruptive for it to be flagged as a bug, even though it really is
20:35:53 <SpamapS> Now, I _also_ think we should handle those by just retrying... but we need a way to recover from the cases we don't handle well or that are completely unexpected and unhandleable.
20:35:54 <sdake_> I guess what we don't want is tripleo being nonfunctional for 6 months until Juno is released
20:36:19 <SpamapS> sdake_: IMO this is Heat being non-functional
20:36:25 <SpamapS> create is fine..
20:36:31 <SpamapS> but updates are where the money is at ;)
20:36:39 <zaneb> SpamapS: I feel your pain. The problem, as you know, is that Heat ends up with inconsistent data. I'm not sure what a short-term solution looks like that doesn't make things worse
20:37:12 <SpamapS> zaneb: Yeah, I took a stab at a spike.. storing the snippets in resources and then when loading a stack, overlaying the stored snippets on the loaded template.
20:37:23 <SpamapS> zaneb: worked, but it was really hard to test.
20:37:32 <SpamapS> which told me "you have just created a monster"
20:38:02 <stevebaker> SpamapS: are we talking updates that do replacements? surely if replacements are avoided then handling the case of an inline update failing would be easier
20:38:20 <sdake_> anyone have a link to the blueprint?
20:38:42 <zaneb> stevebaker: additions/removals are the worst
20:38:52 <zaneb> we can lose the resources altogether
20:39:15 <SpamapS> stevebaker: right, as zaneb says.. just adding a server .. removing an object.. haphazard.
20:39:44 <zaneb> sdake_: https://blueprints.launchpad.net/heat/+spec/update-failure-recovery
20:39:46 <radix> replacements are implemented as add-then-delete, right?
20:39:53 <radix> (currently I mean)
20:40:24 <SpamapS> At the moment we fail and don't roll back successfully, the thing in the database (raw template) is completely non-indicative of what is actually in existence.
20:40:31 <SpamapS> So one way to handle this... is to abandon the stack...
20:40:32 <SpamapS> and then edit the abandoned stack dump to reflect actual reality...
20:40:32 <SpamapS> and then adopt
20:40:44 <zaneb> radix: yes, but in that case the resource name stays the same, so we won't actually lose it.
20:41:01 <zaneb> radix: it's not without issues though
20:41:22 <radix> SpamapS: yeah, abandon/adopt allow for all sorts of cool hacks to work around limitations in heat ;)
20:41:54 <SpamapS> anyway
20:41:55 <SpamapS> I know time is short..
20:41:59 <SpamapS> and this is not without risk.
20:42:18 <zaneb> radix: true! :) Although, let's not forget it works by allowing us to blame the user when it goes wrong
20:42:21 <zaneb> ;)
20:42:27 <SpamapS> but I think users are really in a bad place without it if they don't know exactly what they're getting into.
20:42:29 <radix> heh heh
20:42:59 <SpamapS> So just raising my hand and saying "this may get ugly soon"
20:43:14 <zaneb> SpamapS: I think you are on the right track
20:43:24 <SpamapS> It has been tough for me to focus on it, as I've also been trying to follow hot-software-config and piggy back my graceful-update-control thing on that.
20:43:31 <radix> so are we talking about something for icehouse? (sorry I came into the conversation late)
20:43:33 <stevebaker> SpamapS: so since you do CI, feature freeze is a window where you don't get anything but fixes, but as soon as juno opens then you can use landed features again
20:43:37 <zaneb> and I don't think getting this in will make it harder to clean up later, so +1
20:43:45 <SpamapS> but I feel like hot-software-config is going to make our life easier for graceful updates
20:44:12 <zaneb> but get it in soon, because realistically there will probably be bugs to deal with
20:44:26 <sdake_> 5 days - tight
20:44:36 <SpamapS> right, as soon as I have a clear short term plan for graceful updates I'll put my head down on this
20:46:55 <stevebaker> anything else? we can finish early
20:47:17 <SpamapS> \o/ use these 13 minutes for reviews everyone. :)
20:48:27 <stevebaker> #action everyone uses 780 extra seconds to do code reviews
20:48:31 <stevebaker> #endmeeting
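SpamapS's remark at 20:35:53 that random 500 errors from an overloaded nova-api or neutron-server should be handled "by just retrying" can be sketched as a retry loop with exponential backoff. This is an illustration only, not Heat code: `ServerError` and `call_with_retry` are hypothetical names, and the attempt count and delays are assumed values. It covers only the automatic-retry half of his point, not the manual recovery tracked in the update-failure-recovery blueprint.

```python
import time


class ServerError(Exception):
    """Hypothetical stand-in for an HTTP error raised by an API client."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def call_with_retry(fn, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on 5xx errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except ServerError as exc:
            last_try = attempt == attempts - 1
            # Only server-side (5xx) errors are treated as transient;
            # anything else, or the final failure, is re-raised.
            if exc.status < 500 or last_try:
                raise
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the backoff can be tested without waiting. Errors that are still failing after the last attempt propagate to the caller, which is where a manual retry would still be needed.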