20:00:30 #startmeeting heat
20:00:31 Meeting started Wed Mar 11 20:00:30 2015 UTC and is due to finish in 60 minutes. The chair is asalkeld. Information about MeetBot at http://wiki.debian.org/MeetBot.
20:00:32 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
20:00:34 The meeting name has been set to 'heat'
20:00:48 hehe today it works ;)
20:00:55 \o/
20:01:24 Hi
20:01:25 /o\
20:01:28 o/
20:01:48 hi
20:01:59 hi
20:02:00 o/
20:02:07 #topic Adding items to the agenda
20:02:29 so far we have #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda
20:02:30 I have added one
20:02:40 Hi
20:03:00 hi Tango|2
20:03:38 nothing else?
20:04:03 #topic run up to kilo release
20:04:15 #link https://wiki.openstack.org/wiki/Kilo_Release_Schedule
20:04:29 (March 19 cut-off for features), let's run through what we can get in.
20:04:39 that's one week
20:04:45 afaiks
20:05:00 time for hard work
20:05:04 and we got a *lot* to get in
20:05:17 #link https://launchpad.net/heat/+milestone/kilo-3
20:05:55 has anyone seen new progress from tags and breakpoints?
20:05:56 hey, everything in code review
20:06:14 asalkeld: yes, I've been testing the latest breakpoint patches today
20:06:18 asalkeld: good progress on breakpoints (hooks)
20:06:22 the heat one is very close IMO
20:06:23 I'm reviewing at the moment
20:06:24 ok, cool
20:06:42 heatclient needs a bit more work, I assume shadower is on that already
20:06:45 shardy: agree. I think it should land this week
20:06:53 any outstanding patches which will affect db schema?
20:07:02 (the Heat one that is, not the heatclient one)
20:07:16 hi
20:07:30 inc0: there are still ~ 3 or 4 convergence patches, tho' smaller
20:07:41 inc0: pretty sure there are, yes. but all the ones I know are for convergence
20:08:23 we still have convergence, oslo versioned objects, keystone resources, tags and breakpoints
20:08:34 stevebaker: you about?
20:08:40 yes
20:08:46 what's the state of your signalling
20:08:56 https://blueprints.launchpad.net/heat/+spec/software-config-swift-signal
20:08:56 keystone resources are in contrib, so surely they are not subject to feature freeze
20:09:22 stevebaker: as long as they don't affect the string freeze
20:09:26 asalkeld: also Mistral resources
20:09:39 skraynev_: yeah phew that's right
20:09:59 asalkeld: hmm, is i18n applied to contrib? It probably shouldn't be
20:10:13 stevebaker: honestly not sure
20:10:40 stevebaker: is software-config-swift-signal done?
20:11:03 asalkeld: everything in heat has landed for software-config-trigger and swift signalling. Review on this related heatclient change is needed though https://review.openstack.org/#/c/160240/
20:11:35 ok, that's right
20:11:57 say, do we officially use heatclient blueprints?
20:12:01 there are some
20:12:19 they look forgotten
20:12:23 https://blueprints.launchpad.net/python-heatclient
20:12:30 occasionally. I'd rather use feature bugs
20:12:53 yeah, i'd rather not use them
20:13:31 so zaneb have you looked at the convergence reviews today?
20:13:40 any major issues?
20:13:42 not today
20:14:13 6am for me, just got up
20:14:13 Is there a way of knowing which are the highest priority convergence reviews, other than blueprint priority?
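For readers without the background on the breakpoints (hooks) work discussed above: it lets an operator pause a stack operation just before (or after) a chosen resource is acted on, by marking that resource in the environment. A minimal sketch, assuming the environment syntax as the feature was shaping up at the time; the resource name my_server is made up, so treat the exact keys as an assumption rather than the final documented interface.

    # breakpoints.env - pause the stack create just before 'my_server' is created
    # (keys as understood from the in-review hooks work; verify against what landed)
    resource_registry:
      resources:
        my_server:
          hooks: pre-create

Clearing the hook on the paused resource then lets the operation continue; the python-heatclient side of that workflow is the part noted above as still needing work.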
20:14:18 I don't think it matters whether those patches land before FF
20:14:34 schema changes
20:14:54 zaneb: there is a string freeze
20:14:55 zaneb: could you please take a look at my answers / questions for https://review.openstack.org/#/c/161306/
20:15:07 if you have a minute ;)
20:15:12 so we *could* commit with commented-out logs :-O
20:15:16 asalkeld: yeah, we may have to stop landing stuff between FF and RC
20:15:22 I've taken a look now - I can't see anything affecting my patches, unless I've missed something
20:15:43 inc0: you should be able to get in stack and template
20:15:47 but it's not like convergence is going to be a working thing in Kilo either way
20:15:52 zaneb: you mean it's too late to land anything useful, so we may as well wait until we branch?
20:16:01 ok, cool
20:16:02 yes
20:16:04 there is some more resource stuff
20:16:10 asalkeld, it's already ready... before k3 I'd love to have Resource as well, as it's a critical object
20:16:31 I mean, if we can land stuff safely, then by all means land it (even during FF)
20:17:15 I think if we are careful, given the resource and stack objects land, we should be good with database compatibility for the k->l upgrade
20:17:18 zaneb: you're not worried about the string freeze?
20:17:40 i guess bugs change strings
20:17:44 asalkeld: we should take that into account when landing stuff
20:17:45 IMO we shouldn't be landing convergence rearchitecting during FF
20:18:08 some things should definitely not land after FF/SF and before the branch
20:18:33 only minor things
20:18:34 other things probably can, so long as they are not called anywhere
20:18:45 don't add strings
20:18:48 &c.
20:18:57 i am ok with that
20:19:14 no db migrations during that period, obvs
20:19:20 we won't be too long out of business
20:19:24 (but new tables might be ok)
20:19:38 Ok, providing we're not making structural changes to existing code then fair enough
20:20:25 we all good with this topic?
20:20:33 +1
20:20:37 +1
20:20:43 #topic Thoughts on spec for balancing scaling groups across AZs
20:20:51 #link https://review.openstack.org/#/c/105907/
20:21:13 Hi, we have added comments to address some of the questions with this, so wondering on people's thoughts
20:21:27 not some, all
20:21:54 KarolynChambers: it's been a while since i looked at that
20:22:25 wanted to get it on the radar again so to speak
20:23:09 KarolynChambers, mspreitz: Would this mean implementing an az scheduler in heat?
20:23:29 that's pretty heavyweight wording for what we have in mind
20:23:36 all of my context on this got paged out :/
20:23:37 we propose a simple counting scheme
20:23:51 stevebaker, or using gantt :) but I'm more concerned about having to implement logic which will ensure that the volume will be in the same az as the instance... for example
20:23:57 mspreitz: KarolynChambers it might be better to raise after branching when we can give it proper time
20:24:00 and all that stuff
20:24:11 we are all super focused on kilo
20:24:25 inc0: deliberately not making Heat into a holistic scheduler here
20:24:25 it's not long away
20:24:53 inc0: there is a way to do that, you use get_attr on the server to supply the AZ to the volume
20:25:18 zaneb: problem is not really that complex. Scaling group has homogeneous members
20:25:47 all we propose is that heat counts members in each az
20:26:13 mspreitz, I wonder what's the plan when a given AZ refuses an instance (no valid host found)?
20:26:27 well, sure, it's nothing that can't be solved.
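To make the volume-follows-instance concern above concrete, here is a minimal HOT sketch (not from the meeting) that keeps a server and its volume in the same AZ by feeding one availability_zone parameter to both resources; the image and flavor names are placeholders. zaneb's get_attr variant would instead read the zone back from the created server, assuming the server exposes it as an attribute.

    heat_template_version: 2013-05-23
    parameters:
      az:
        type: string
        description: availability zone shared by the server and its volume
    resources:
      server:
        type: OS::Nova::Server
        properties:
          image: cirros            # placeholder image name
          flavor: m1.tiny          # placeholder flavor name
          availability_zone: {get_param: az}
      data_volume:
        type: OS::Cinder::Volume
        properties:
          size: 1
          availability_zone: {get_param: az}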
20:26:28 mspreitz: but heat is making placement decisions based on some algorithm (albeit a simple one). We've been deferring getting into this area until gantt is available to consume
20:26:46 pas-ha, I guess CREATE_ERROR -> same as now if nova throws
20:26:51 we cannot defer to gantt, it has no interface for choosing which group member to remove
20:26:57 on scale down
20:27:14 inc0: yes, the proposal here is very simple
20:27:26 inc0, but then the point of balancing is fading
20:27:29 well, the proposal is vague on that point, the initial impl is simple
20:27:48 the proposal is vague, allows the impl to try again elsewhere
20:27:49 Does anyone know if e.g. Neutron has full support for Availability Zones?
20:28:00 not sure shardy
20:28:04 I don't think so
20:28:05 shardy: not relevant
20:28:15 the network node is still az-agnostic
20:28:26 we are just looking for a way for heat to make an AZ choice that eventually gets to whatever the template author cares to pass it to
20:28:48 but I agree with mspreitz - it will provide some measure of safety without too many problems I guess
20:28:56 mspreitz: so if you can't have per-AZ l3 networks, what happens when an AZ goes down containing the Neutron services?
20:28:56 if we keep this thing naive
20:29:30 shardy, is that our problem or nova's? :)
20:29:34 adding just instance support for the AWS resource kinda makes sense, but folks want to scale stacks containing more than just instances with the native resources
20:29:36 or neutron's
20:30:05 shardy: when an AZ goes down it is entirely unavailable of course; when the impl graduates to handling refusals it will naturally cover that
20:30:12 inc0: I guess I'm just asking if the feature will be crippled die to being nova-centric, that's all
20:30:18 s/die/due
20:30:24 Nova is not the only thing that knows AZ
20:30:28 Cinder does too
20:30:36 but that is not critical here
20:30:49 the idea is that the template author applies this only where relevant
20:30:58 mspreitz: Ok, I'm just trying to understand the gaps, given that we don't place any restrictions on what resources can be scaled out
20:31:16 shardy: OK, let me do a case analysis
20:31:17 I guess that is up to the user
20:31:34 shardy: I don't see how that's our problem
20:31:40 case 1: scaling group of OS::Nova::Server or Cinder volume --- clear to all, I suppose
20:32:00 case 2: scaling group of an atomic resource that does not support AZ - user does not ask for balancing across AZ
20:32:23 case 3: scaling group of stack - template author propagates the AZ param to the relevant resources, recurse on the case analysis
20:33:14 well, I for one think this is a nice feature to add, but it's Liberty anyway
20:33:19 mspreitz: I guess the question is, do we bear any responsibility to communicate to the user if the scaled unit is only partially balanced over AZs
20:33:44 maybe we don't care, and we just document known limitations
20:33:48 The user can already query that info, albeit not very conveniently
20:33:57 adding a convenient query would be a nice add
20:34:53 as far as I understood the user chooses herself which resources in a nested stack to pass the az param to, so the user is aware of what is balanced
20:35:03 right
20:35:10 but the question is how does the user know the results
20:35:26 ok, mspreitz it looks like people don't have a big problem with the spec
20:35:37 just minor questions
20:35:44 an implicit output, mapping resource names to az's
20:35:55 On querying results, I think that can/should be addressed as a later add
20:36:03 can we handle as normal? and attack it in L?
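As a sketch of mspreitz's "case 3" above: the scaled unit is a nested stack that accepts an availability_zone parameter and wires it to whichever of its resources care about placement. Under the proposed spec (not Kilo behaviour) the group would choose the zone for each member via the simple counting scheme; member.yaml and its parameter name are made up for illustration.

    heat_template_version: 2013-05-23
    resources:
      group:
        type: OS::Heat::AutoScalingGroup
        properties:
          min_size: 2
          max_size: 6
          resource:
            type: member.yaml            # nested stack = one group member
            properties:
              # today the author supplies a fixed value; the proposed spec
              # would have heat vary this per member to balance across AZs
              availability_zone: az1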
20:36:05 like OS::StackID
20:36:09 OK, actually, we could do that via an attribute
20:36:34 asalkeld: pls clarify what "it" is
20:36:43 the blueprint
20:36:52 spec and implementation
20:36:53 (as in do it)
20:37:19 Karolyn is closer to the driving need now, I think
20:37:29 Karolyn, delay to L OK?
20:37:34 mspreitz: you could also add to https://etherpad.openstack.org/p/liberty-heat-sessions
20:37:38 IMHO scheduler hints are higher priority
20:37:57 KarolynChambers: I've +1ed, but would like an evaluation of gantt in the Alternatives section
20:38:07 mspreitz, it would be hard for it to land in kilo as there is 1 week left and the bp isn't even merged ;)
20:38:24 stevebaker: okay, I will look at gantt
20:38:30 stevebaker: there *is* an evaluation in the Alternatives of deferring to another scheduler
20:38:34 let's move on
20:38:42 also, clarification on how to handle resource groups with az-agnostic resources would be nice
20:38:54 Gantt has no interface for choosing which to delete on scale down
20:39:04 inc0: that is not a problem
20:39:06 mspreitz: yet
20:39:47 If Gantt *did* have an interface for choosing which to delete, Heat would need to invoke it.... individuals are identified in the generated template
20:40:00 mspreitz: what stevebaker is suggesting is: should we work with gantt to add some of that functionality there
20:40:16 and look into that
20:40:29 how would that work, would it be better
20:40:38 asalkeld: that does not get this issue entirely out of Heat, the generated template identifies individuals
20:40:42 could others, not using heat, benefit
20:41:05 mspreitz: totally, we would still need some work in heat
20:41:21 OK, so we are on to L for this.
20:41:27 #topic Documentation options for stack lifecycle scheduler hints
20:41:36 #link https://review.openstack.org/#/c/130294/
20:41:51 yes, that is what I am suggesting. Or use gantt for initial placement, and logic in heat for removal. I just want to see that analysis. The Alternatives just talks about nova. My idealised notion of gantt is that it is a completely generic service for putting $things in $places
20:41:52 i think people were okay with the code but asalkeld you had a comment in the spec about documentation being needed
20:42:22 KarolynChambers: looking - it's been a while
20:42:30 what are the options for documentation? more documentation in the spec? or is there some other place to document?
20:43:09 heat/doc/source is one option
20:43:44 do people have a preference for where you'd like to see it?
20:43:52 KarolynChambers: i am concerned with hooks as 1. devs need to know they are there
20:44:02 understand
20:44:03 so they don't regress
20:44:14 and a user might want to use them too
20:44:22 so what is it for and how do i use it
20:44:36 and how to set it up
20:44:41 yip
20:44:58 agreed
20:45:11 well, the usage we have in mind is not something users would be doing
20:45:15 it is for operators
20:45:29 mspreitz: i more mean operators
20:45:46 KarolynChambers: you happy with that?
20:45:59 anyone got a better idea for docs?
20:46:21 nope
20:46:31 KarolynChambers: topic done?
20:46:34 yes
20:46:37 thank you
20:46:44 np
20:46:51 #topic Work to get WSGI services runnable inside Apache/nginx
20:46:55 let's have them at least somewhere first, then decide what is the best place for them
20:46:56 http://lists.openstack.org/pipermail/openstack-dev/2015-February/057359.html
20:47:29 currently keystone and other services want to deprecate eventlet and use apache for these purposes
20:47:39 do we want to be in this stream?
20:47:44 uhh...
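For a concrete picture of the Apache option being discussed, this is roughly what serving heat-api via mod_wsgi could look like. It is only a sketch: the WSGI script path, process/thread counts and log locations are assumptions, since Heat did not ship an official WSGI entry point at the time of this meeting.

    # heat-api behind Apache mod_wsgi (illustrative only)
    Listen 8004
    <VirtualHost *:8004>
        WSGIDaemonProcess heat-api processes=4 threads=1 user=heat
        WSGIProcessGroup heat-api
        WSGIScriptAlias / /var/www/cgi-bin/heat/heat-api   # hypothetical script path
        WSGIApplicationGroup %{GLOBAL}
        ErrorLog /var/log/httpd/heat_api_error.log
        CustomLog /var/log/httpd/heat_api_access.log combined
    </VirtualHost>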
20:47:49 and do the same thing in L ?
20:47:57 skraynev_: sure
20:48:04 it's quite easy
20:48:13 this increases the complexity of installation - adds a new component
20:48:13 for heat-api*, sure
20:48:20 this only applies to the API though, of course
20:48:25 yes, api only
20:48:30 but I guess if ks will do it anyway...
20:48:48 inc0: we can have it as an option for starters
20:49:01 +1
20:49:11 imagine a cloud with every service on a single IP, port 80
20:49:13 and still have a binary for things like devstack
20:49:22 or a fancy checker that there is a suitable backend available :)
20:49:29 stevebaker: dreams :)
20:49:36 well, 443
20:49:46 dreams -> reality
20:50:12 skraynev_: seems like a small spec
20:50:18 +1
20:50:25 #topic open discussion
20:50:26 ok, so if all agree with this I will add a bp and spec for L.
20:50:27 how does this work with the asyncio idea?
20:50:44 inc0: hope that doesn't happen :-O
20:50:54 inc0: that is something we can consider for heat-engine
20:51:15 stevebaker, do we really want to support 2 types of runtime? ;)
20:51:27 inc0: that's the idea
20:51:29 inc0: I said consider, not adopt
20:51:30 but meh, apache will be easy enough
20:51:45 if it helps, why not
20:51:50 api uses apache, engines/workers use "something"
20:52:00 not sure why, as devstack would have apache available anyway for keystone
20:52:08 eventlet/fork/threads
20:52:15 worker
20:52:18 s
20:52:19 and it already has it for horizon
20:52:40 my point is rather... why do it? ;)
20:52:55 inc0: read the thread
20:53:13 inc0: it also gives operators some options
20:53:16 and it's easy to do
20:53:18 apache/nginx is a proven production-grade concurrent web server, better than eventlet-driven python
20:53:26 eventlet has issues, and at least for APIs moving to web servers is an easy solution
20:53:58 fair enough
20:54:10 btw, open topic: heat memory consumption
20:54:24 shardy mentioned it today, might be good to take a look
20:54:34 #link https://etherpad.openstack.org/p/liberty-heat-sessions
20:54:36 ?
20:54:51 are we using more memory than normal?
20:54:57 yeah, everyone - I did some testing today of a TripleO seed with multiple workers
20:55:12 heat-engine with 4 workers peaked at over 2G memory usage :-O
20:55:21 worse, even, than nova-api
20:55:40 that's... a lot
20:55:44 shardy: i suspect root_stack is somewhat to blame
20:55:56 if anyone has existing thoughts on steps to improve that, it'd be good to start enumerating them as bugs and working through
20:55:59 yeah... and template validation
20:56:08 we need to kill root_stack
20:56:32 asalkeld: yeah, I'm sure you're right
20:56:47 or try to offload something to the db
20:57:02 that will load every stack in every stack (if that makes sense)
20:57:14 Shall I just raise a "heat-engine is a memory hog" bug, and we can capture ideas there on how to fix it?
20:57:32 also, does Rally help with quantifying such things, or only performance?
20:57:37 shardy: as a task to make specific bugs
20:57:38 or let's talk in Vancouver?
20:58:00 inc0: it's really an investigation, isn't it
20:58:20 inc0: I'm not sure it really warrants summit time, we just need to do the analysis and fix the code ;)
20:58:20 someone needs to figure out what all the problems are
20:58:52 loading templates and files into mem
20:58:55 profiler to the rescue ;)
20:59:14 2 mins ...
21:00:12 ok, that's mostly it. Thanks all!
21:00:17 bb
21:00:18 #endmeeting