15:00:30 <therve> #startmeeting heat
15:00:32 <openstack> Meeting started Wed Jun 22 15:00:30 2016 UTC and is due to finish in 60 minutes. The chair is therve. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:34 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:36 <openstack> The meeting name has been set to 'heat'
15:00:41 <therve> #topic Roll call
15:00:46 <jdob> o/
15:00:46 <Drago> o/
15:01:11 <cwolferh> o/
15:01:21 <ochuprykov> hi
15:01:58 <zaneb> hola
15:02:38 <shardy> o/
15:03:11 <therve> #topic Adding items to agenda
15:03:21 <therve> #link https://wiki.openstack.org/wiki/Meetings/HeatAgenda#Agenda_.282016-06-22_1500_UTC.29
15:03:30 <ramishra> hi
15:04:29 <therve> #topic DB errors blocking gate
15:04:41 <therve> #link https://bugs.launchpad.net/heat/+bug/1546431
15:04:42 <openstack> Launchpad bug 1546431 in heat "Tests fails with SQL error 'Command Out of Sync'" [High,Triaged]
15:04:51 <therve> #link https://bugs.launchpad.net/heat/+bug/1483670
15:04:52 <openstack> Launchpad bug 1483670 in heat "ResourceClosedError from DB in functional tests" [High,Triaged]
15:04:59 <therve> zaneb, You added this one, no?
15:05:07 <zaneb> I think so, yes
15:05:22 <zaneb> we are hitting this *all* the time
15:05:45 <therve> Yeah, the rate has been pretty bad recently
15:06:02 <therve> Presumably the 2 new builders don't improve things
15:06:11 <zaneb> yes, that's not helping
15:06:25 <zaneb> but we need to fix it
15:06:30 <therve> I looked at this issue; AFAICT it's that we behave badly when we kill greenlets
15:06:31 <zaneb> cwolferh: about?
15:06:41 <cwolferh> yes
15:06:54 <therve> The oslo.db work that stevebaker is working on may improve things
15:07:03 <therve> But that's as much as I know
15:07:37 <zaneb> what we do to kill greenlets is kinda uncool, but we should be able to do it without breaking the db
15:08:06 <zaneb> my theory is that every transaction should be using a with ...: block
15:08:22 <zaneb> so that if *any* exception occurs then it gets rolled back
15:08:45 <therve> zaneb, https://review.openstack.org/#/c/330800/ ought to do that
15:08:52 <zaneb> but that there must be places where we are not, and so we switch threads in the middle of a transaction
15:08:55 <therve> Using a decorator, but that's the same solution
15:09:55 <zaneb> that's good enough for me :)
15:10:11 <zaneb> do we think that merging that will fix the problem then?
15:10:36 <therve> I hope so, but I'm not sure
15:10:40 <cwolferh> that looks like a good next thing to try to me :-)
15:10:55 <therve> FWIW https://bugs.launchpad.net/heat/+bug/1499669 has a good reproducer for this kind of issue
15:10:55 <openstack> Launchpad bug 1499669 in heat "Heat stucks in DELETE_IN_PROGRESS for some input data" [Medium,Triaged]
15:11:06 <zaneb> ok, then I'm a happy camper :)
15:11:22 <therve> zaneb, You mean reviewer? :)
15:11:44 <therve> There is a 10+ patch series, so we need to get onto that
15:11:50 <zaneb> therve: oh *that's* the bug I was looking for the other day
15:11:56 <zaneb> I knew I commented on it :D
15:12:14 <therve> If you need to search Launchpad, talk to me first :)
15:12:37 <zaneb> any reason we can't accelerate it to the top of the patch queue? I guess that's a question for stevebaker
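For context, the with-block/decorator pattern under discussion looks roughly like the sketch below, written against a plain SQLAlchemy session. The names session_scope and transactional are illustrative only, not Heat's actual DB API; the point is that catching *any* exception (including GreenletExit, raised when a greenlet is killed) triggers a rollback instead of leaving the connection mid-transaction.

```python
# Minimal sketch of the rollback-on-exception pattern discussed above,
# using a plain SQLAlchemy session. The helper names (session_scope,
# transactional) are illustrative, not Heat's actual DB API.
import contextlib
import functools

import sqlalchemy
from sqlalchemy import orm

engine = sqlalchemy.create_engine("sqlite://")
Session = orm.sessionmaker(bind=engine)


@contextlib.contextmanager
def session_scope():
    """Run a block in a transaction; roll back on *any* exception.

    Catching BaseException matters here: greenlet's GreenletExit does
    not inherit from Exception, so a bare 'except Exception' would
    leave the connection stuck mid-transaction when a greenlet dies.
    """
    session = Session()
    try:
        yield session
        session.commit()
    except BaseException:
        session.rollback()
        raise
    finally:
        session.close()


def transactional(func):
    """Decorator form of the same guarantee, as in the 330800 approach."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with session_scope() as session:
            return func(session, *args, **kwargs)
    return wrapper
```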
15:12:57 <zaneb> therve: aha, you have a secret weapon
15:13:15 <jdob> he just blasted out a bunch of patches yesterday, they should be at the top
15:13:27 <shardy> Another stuck DELETE_IN_PROGRESS reproducer, specific to convergence, is https://review.openstack.org/#/c/329460/
15:13:45 <shardy> https://bugs.launchpad.net/heat/+bug/1592374
15:13:45 <openstack> Launchpad bug 1592374 in heat "deleting in_progress stack with nested stacks fails with convergence enabled" [High,Confirmed]
15:14:03 <shardy> If anyone has time to look into that it'd be good - the functional test appears to reproduce the issue
15:14:05 <zaneb> jdob: I can never remember what order the 'related changes' are in, in the new gerrit
15:14:16 <jdob> dude, glad i'm not alone there
15:14:50 <zaneb> BRING BACK THE OLD GERRIT!
15:14:54 <zaneb> ahem, sorry.
15:14:57 <jdob> :)
15:15:11 <therve> It's definitely at the bottom of the queue for a reason
15:15:22 <therve> Needs some API change to make it work correctly
15:15:37 <zaneb> dang
15:16:00 <therve> shardy, Yeah, the convergence part is definitely something different from what we were talking about
15:16:22 <therve> shardy, Wondering if https://review.openstack.org/#/c/279520/ is related
15:16:25 <ramishra> Is the ResourceClosedError 100% reproducible with the steps in bug #1499669?
15:16:25 <openstack> bug 1499669 in heat "Heat stucks in DELETE_IN_PROGRESS for some input data" [Medium,Triaged] https://launchpad.net/bugs/1499669
15:16:26 <shardy> therve: ack, I just wanted to mention it because I've not got time to work on fixing it myself this week
15:16:46 * shardy travelling/meetings
15:16:57 <therve> ramishra, I had to increase the resource count for my env, but yeah, after that
15:17:13 <jdob> shardy: speaking of no time, I put my name on the env changes spec and started on it
15:17:18 <therve> We could probably turn that into a (skipped) test
15:17:23 <shardy> jdob: yup, I saw that, thanks! :)
15:17:43 <ramishra> shardy: I had a look at it today; it seems there is no way to cancel a CREATE_IN_PROGRESS to acquire/steal the resource lock for a DELETE.
15:18:18 <shardy> ramishra: Ok, thanks for the analysis - that sounds, erm, not good ;)
15:18:32 <therve> ramishra, https://review.openstack.org/#/c/301483/ isn't this patch queue about that?
15:18:36 <zaneb> ramishra: it should time out eventually though, right?
15:18:37 * shardy wonders how many times we'll fix cancelling in-progress things
15:18:41 <zaneb> and then the delete will run?
15:18:47 <ramishra> zaneb: yes
15:18:51 <zaneb> shardy: fixing it again right now :)
15:19:14 <zaneb> and by 'again' I mean 'for the first time', since it hasn't worked since Kilo
15:19:14 <shardy> zaneb: On my tripleo test env it times out after 4 hours ;)
15:19:15 * jdob working on a new cancel bug in reaction to zaneb's fix
15:19:48 * zaneb shakes fist at jdob
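The locking rule behind this exchange can be modelled in a few lines. This is an illustrative sketch, not Heat's actual stack_lock code: the lock records the holder's engine id, and a would-be stealer backs off while that engine is alive, which is why a DELETE has to wait until the in-progress operation finishes or times out.

```python
# Illustrative model of the stack-lock rule discussed above; all names
# are made up for the sketch, not Heat's stack_lock implementation.

class EngineListener:
    """Stands in for the RPC ping Heat uses to check engine liveness."""
    def __init__(self, live_engines):
        self.live_engines = set(live_engines)

    def is_alive(self, engine_id):
        return engine_id in self.live_engines


class StackLock:
    def __init__(self, engine_id, store, listener):
        self.engine_id = engine_id  # unique per heat-engine process
        self.store = store          # dict standing in for the lock table
        self.listener = listener

    def try_acquire(self, stack_id):
        holder = self.store.get(stack_id)
        if holder is None:
            self.store[stack_id] = self.engine_id
            return True
        if self.listener.is_alive(holder):
            # The holder is a running engine: never steal. A DELETE has
            # to wait until the CREATE finishes or times out.
            return False
        # The holder engine died: steal the lock.
        self.store[stack_id] = self.engine_id
        return True


# engine-1 is alive and holds the lock, so engine-2 cannot steal it:
listener = EngineListener(live_engines={"engine-1"})
lock = StackLock("engine-2", {"stack-a": "engine-1"}, listener)
assert lock.try_acquire("stack-a") is False
```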
15:20:21 <therve> I think the merge rate has been slightly better this week than last
15:20:33 <therve> We probably depend on the state of the hardware a bit somehow
15:20:48 <jdob> a shit ton of my stuff landed last night, i was super happy coming into work this morning
15:20:49 <therve> Still, let's try to focus on that, and avoid blank rechecks :)
15:20:59 <zaneb> #action stevebaker merge https://review.openstack.org/#/c/330800/ asap
15:21:23 <therve> Anything else on that topic?
15:21:30 <zaneb> nope, let's move on
15:21:45 <therve> #topic Restoring bp ironic resources implementation
15:22:22 <zaneb> cat ----> pigeons
15:22:25 <therve> OK. I have nothing against that, let's do it.
15:22:39 <therve> Unless ironic people really yell at us again
15:22:45 <shardy> +1 on this, although I'm not planning on reviving the patches I started myself
15:23:35 <shardy> A possible alternative tho is a mistral workflow
15:23:37 <shardy> https://review.openstack.org/#/c/313048/
15:23:45 <shardy> I started that, although I've not yet got it working
15:24:01 <shardy> if we can make that work, there's no need for a bespoke heat plugin
15:24:05 <zaneb> I question the sanity of anyone using this, but I (still) see no harm, especially now that we can block admin-only resource types from general view
15:24:17 <therve> prazumovsky added that but isn't around
15:24:40 <zaneb> er, s/anyone/anything/
15:25:20 <therve> #action prazumovsky restore Ironic resources
15:25:24 <shardy> zaneb: I think just a plugin representing the Ironic node isn't enough - there's a workflow around networking we need so that it emulates what Nova does
15:25:41 <shardy> that could be in a heat plugin, but it's kinda workflow-ish
15:25:55 <therve> shardy, Shouldn't you use nova in that case?
15:26:29 <shardy> therve: Nova is huge overkill when you don't care about multiple tenants, flavors or scheduling
15:26:31 <zaneb> shardy: yeah, that's one reason I question why anything would want to use this
15:27:28 <therve> #topic Legacy to convergence migration
15:27:41 <therve> #link https://review.openstack.org/#/c/232277/
15:27:51 <therve> ochuprykov, You added this one?
15:27:55 <ochuprykov> yep
15:28:09 <ochuprykov> i think it can be useful now
15:28:19 <ochuprykov> after we switch to convergence
15:28:34 <therve> So that's a new heat-manage command
15:28:41 <ochuprykov> yes
15:28:55 <ochuprykov> implementation here: https://review.openstack.org/#/c/280836/
15:29:10 <ochuprykov> but i've run into one problem
15:29:45 <ochuprykov> in order to do this migration online (and it was intended to be online) i need to lock the stacks i want to migrate
15:30:10 <ochuprykov> but such a lock can be successfully stolen by a working engine
15:30:27 <therve> Yep
15:30:43 <therve> That's a tough one
15:30:53 <zaneb> oh, because it's done by the heat-manage command and not an engine :/
15:31:04 <ochuprykov> i know this)
15:31:08 <therve> zaneb, Why does it make a difference?
15:31:18 <therve> I can steal from another engine, no?
15:31:25 <zaneb> therve: the engine id is stored in the lock
15:31:36 <ochuprykov> no, not if that engine is alive
15:31:43 <zaneb> so if the engine holding the lock is running, nothing will steal it
15:31:57 <therve> Ah
15:32:02 <therve> ochuprykov, Use the RPC API then
15:32:05 <zaneb> what ochuprykov said
15:32:57 <ochuprykov> but probably we could choose option 2 from the alternatives section
15:33:08 <ochuprykov> We could automatically convert stacks on the next stack-update.
15:33:37 <ochuprykov> which is better?
15:33:43 <zaneb> ochuprykov: then we can never deprecate the legacy path
15:33:52 <zaneb> +1 for using the RPC API
15:34:09 <ochuprykov> ok, i will try this variant
15:34:22 <ochuprykov> so, i think i don't need to change the spec
15:34:32 <therve> zaneb, Operators could trigger a stack update?
15:34:52 <therve> You'd need to be in the tenant though...
15:34:55 <zaneb> therve: mmmmmhmmm I guess
15:35:15 <zaneb> still +1 on the RPC :D
15:35:21 <therve> Yeah, I would try that :)
15:35:34 <therve> ochuprykov, Makes sense?
15:35:56 <ochuprykov> therve: yep, i will try to do it via rpc
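The "use the RPC API" suggestion amounts to something like the sketch below: heat-manage asks a running engine, over oslo.messaging, to perform the migration, so the stack lock ends up held by a live engine id that nothing will steal. The topic and the migrate_convergence_1 method name are assumptions for illustration; the real engine service would need a matching endpoint.

```python
# Sketch of the 'use the RPC API' suggestion: have heat-manage ask a
# live engine to do the migration instead of taking the lock itself.
# The topic and the migrate_convergence_1 method name are assumptions;
# the real engine service would need a matching endpoint.
from oslo_config import cfg
import oslo_messaging as messaging


def request_migration(context, stack_id):
    transport = messaging.get_transport(cfg.CONF)
    target = messaging.Target(topic='engine', version='1.0')
    client = messaging.RPCClient(transport, target)
    # The lock is now taken by whichever live engine serves this call,
    # so other engines' steal checks see a running lock holder.
    return client.call(context, 'migrate_convergence_1', stack_id=stack_id)
```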
15:35:58 <zaneb> ochuprykov: I +2'd your spec though!
15:36:02 <therve> Cool
15:36:05 <zaneb> so there's that
15:36:08 <therve> #topic Open discussion
15:36:40 <therve> prazumovsky, http://eavesdrop.openstack.org/meetings/heat/2016/heat.2016-06-22-15.00.log.txt we talked about the ironic resources
15:36:46 <shardy> I'd appreciate wider feedback re https://review.openstack.org/#/c/327149/
15:36:50 <prazumovsky> hi, sorry for being late; I've read the logs and now I understand that before discussing the ironic resources I need a deeper understanding of the problem.
15:37:01 <jdob> two specs I'd appreciate eyes on: https://review.openstack.org/328822 and https://review.openstack.org/330414
15:37:22 <shardy> basically -f yaml in oscplugin doesn't output yaml - jdob provided feedback, I'd like to get some consensus
15:38:54 <therve> shardy, I think I'm with you on that one
15:39:00 <jdob> shardy: i'll look again, I kinda forget what I was arguing
15:39:06 <therve> No point if -f yaml doesn't return yaml
15:39:28 <zaneb> agree
15:39:34 <therve> prazumovsky, Cool, let us know. Ironic is tricky :)
15:41:22 <prazumovsky> Some confusion about why implementing resource plugins isn't enough, but I'll keep investigating the issue.
15:43:46 <therve> OK, anything else?
15:43:56 <therve> 3
15:44:03 <therve> 2
15:44:11 <therve> 1
15:44:15 <therve> #endmeeting heat
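A footnote on the -f yaml thread above: the behaviour being asked for is that the yaml formatter emit parseable, round-trippable YAML rather than a rendered table. A minimal illustration with plain PyYAML follows; this is not the actual cliff/openstackclient formatter code, and the sample data is made up.

```python
# Minimal illustration of what '-f yaml' is expected to produce: real,
# round-trippable YAML. Plain PyYAML, with made-up sample data; not the
# actual cliff/openstackclient formatter code.
import yaml

data = {'outputs': [{'output_key': 'first_address',
                     'output_value': '10.0.0.4'}]}

text = yaml.safe_dump(data, default_flow_style=False)
print(text)
assert yaml.safe_load(text) == data  # the property the review asks for
```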