*** lipinski has quit IRC | 00:01 | |
*** ChanServ changes topic to "support @ https://ask.openstack.org | developer wiki @ https://wiki.openstack.org/wiki/Heat | development @ https://launchpad.net/heat | logged @ http://eavesdrop.openstack.org/irclogs/%23heat/" | 00:01 | |
*** jay_t has quit IRC | 00:05 | |
*** achampion has joined #heat | 00:10 | |
*** arbylee has quit IRC | 00:11 | |
*** andersonvom has quit IRC | 00:15 | |
*** m_22 has quit IRC | 00:16 | |
*** spzala has joined #heat | 00:18 | |
aru__ | thanks stevebaker | 00:24 |
aru__ | it seemed to solve the problem | 00:24 |
*** asalkeld has quit IRC | 00:24 | |
*** lindsayk1 has quit IRC | 00:25 | |
*** arbylee has joined #heat | 00:26 | |
*** matsuhashi has joined #heat | 00:30 | |
*** lindsayk has joined #heat | 00:33 | |
openstackgerrit | A change was merged to openstack/heat: Implement locking in abandon stack https://review.openstack.org/86663 | 00:36 |
*** asalkeld has joined #heat | 00:37 | |
*** blamar has quit IRC | 00:41 | |
*** lindsayk has joined #heat | 00:49 | |
*** andersonvom has joined #heat | 00:55 | |
*** blamar has joined #heat | 00:59 | |
*** spzala has quit IRC | 01:01 | |
*** nati_uen_ has quit IRC | 01:20 | |
*** andersonvom has quit IRC | 01:28 | |
*** lindsayk has quit IRC | 01:30 | |
*** daneyon has joined #heat | 01:32 | |
*** Qiming has joined #heat | 01:34 | |
*** david-lyle has joined #heat | 01:36 | |
*** matsuhashi has quit IRC | 01:40 | |
*** matsuhas_ has joined #heat | 01:43 | |
*** julienvey has joined #heat | 01:46 | |
*** julienvey has quit IRC | 01:51 | |
*** david-lyle has quit IRC | 02:03 | |
openstackgerrit | Jun Jie Nan proposed a change to openstack/python-heatclient: Add --preview option to stack abandon command https://review.openstack.org/84680 | 02:04 |
*** lipinski has joined #heat | 02:05 | |
*** alexpilotti has quit IRC | 02:13 | |
*** harlowja is now known as harlowja_away | 02:35 | |
*** connie has joined #heat | 02:35 | |
*** connie has quit IRC | 02:36 | |
*** julienvey has joined #heat | 02:45 | |
*** etoews has quit IRC | 02:47 | |
*** matsuhas_ has quit IRC | 02:49 | |
*** matsuhashi has joined #heat | 02:49 | |
*** julienvey has quit IRC | 02:50 | |
*** matsuhas_ has joined #heat | 02:52 | |
*** matsuhashi has quit IRC | 02:52 | |
*** matsuhas_ has quit IRC | 02:58 | |
*** zhiyan_ is now known as zhiyan | 03:01 | |
*** etoews has joined #heat | 03:05 | |
*** sergmelikyan has quit IRC | 03:10 | |
*** sergmelikyan has joined #heat | 03:13 | |
*** etoews has quit IRC | 03:14 | |
*** arbylee has quit IRC | 03:14 | |
*** arbylee has joined #heat | 03:14 | |
*** etoews has joined #heat | 03:22 | |
*** ramishra has joined #heat | 03:23 | |
*** etoews has quit IRC | 03:28 | |
*** nosnos has quit IRC | 03:35 | |
*** lipinski has quit IRC | 03:40 | |
sdake | harlowja had early dinner - which message is interrupting your business continuity? | 03:44 |
sdake | and on that note, i'm off to bed, enjoy :) | 03:45 |
*** julienvey has joined #heat | 03:46 | |
*** etoews has joined #heat | 03:47 | |
*** julienvey has quit IRC | 03:51 | |
openstackgerrit | Jun Jie Nan proposed a change to openstack/heat: Add preview option to stack abandon https://review.openstack.org/84664 | 03:52 |
*** etoews has quit IRC | 03:52 | |
*** IlyaE has quit IRC | 04:03 | |
*** etoews has joined #heat | 04:06 | |
*** sdake_ has joined #heat | 04:08 | |
*** etoews has quit IRC | 04:10 | |
*** asalkeld has quit IRC | 04:14 | |
*** asalkeld has joined #heat | 04:16 | |
*** IlyaE has joined #heat | 04:17 | |
*** sergmelikyan has quit IRC | 04:21 | |
*** sergmelikyan has joined #heat | 04:23 | |
*** nosnos has joined #heat | 04:25 | |
openstackgerrit | Jun Jie Nan proposed a change to openstack/python-heatclient: Add code coverage in resource list test https://review.openstack.org/87846 | 04:25 |
openstackgerrit | Jun Jie Nan proposed a change to openstack/python-heatclient: Fix empty resource list index out of range error https://review.openstack.org/87269 | 04:25 |
*** saju_m has joined #heat | 04:27 | |
*** saju_m has quit IRC | 04:27 | |
*** aru__ has quit IRC | 04:28 | |
*** saju_m has joined #heat | 04:31 | |
*** aweiteka has joined #heat | 04:34 | |
*** pithagora has joined #heat | 04:38 | |
*** achampio1 has joined #heat | 04:42 | |
*** achampion has quit IRC | 04:44 | |
*** julienvey has joined #heat | 04:47 | |
*** achampion has joined #heat | 04:47 | |
*** achampio1 has quit IRC | 04:48 | |
*** nanjj has joined #heat | 04:49 | |
*** julienvey has quit IRC | 04:52 | |
*** cmyster has joined #heat | 04:59 | |
*** IlyaE has quit IRC | 05:07 | |
*** nkhare has joined #heat | 05:09 | |
*** etoews has joined #heat | 05:10 | |
cmyster | morning | 05:11 |
Qiming | morning | 05:14 |
*** etoews has quit IRC | 05:17 | |
cmyster | how are you this morning Qiming ? | 05:17 |
Qiming | cmyster: feeling very hot in the office | 05:18 |
cmyster | same here, summer has started very early this year... | 05:20 |
*** chandan_kumar has joined #heat | 05:31 | |
*** pithagora has quit IRC | 05:37 | |
*** dmueller has joined #heat | 05:48 | |
*** Qiming has quit IRC | 05:50 | |
*** dmueller has quit IRC | 05:50 | |
*** julienvey has joined #heat | 05:51 | |
*** IlyaE has joined #heat | 05:54 | |
*** julienvey has quit IRC | 05:55 | |
*** sdague has quit IRC | 05:58 | |
*** sdague has joined #heat | 06:05 | |
*** slagle has quit IRC | 06:12 | |
*** slagle has joined #heat | 06:13 | |
*** saju_m has quit IRC | 06:17 | |
*** etoews has joined #heat | 06:23 | |
*** liang has joined #heat | 06:28 | |
*** etoews has quit IRC | 06:29 | |
*** saju_m has joined #heat | 06:29 | |
*** fandi has joined #heat | 06:39 | |
*** etoews has joined #heat | 06:40 | |
*** etoews has quit IRC | 06:45 | |
*** arbylee has quit IRC | 06:47 | |
therve | Good morning! | 06:50 |
*** tomek_adamczewsk has joined #heat | 06:51 | |
*** jprovazn has joined #heat | 06:51 | |
cmyster | morning | 06:52 |
*** IlyaE has quit IRC | 06:55 | |
*** chandan_kumar has quit IRC | 07:02 | |
*** chandan_kumar has joined #heat | 07:08 | |
shardy | morning all | 07:17 |
cmyster | morning | 07:20 |
*** sdake has quit IRC | 07:21 | |
*** jiangyaoguo has joined #heat | 07:21 | |
*** jiangyaoguo has left #heat | 07:22 | |
*** sdake has joined #heat | 07:36 | |
*** tspatzier has joined #heat | 07:37 | |
*** tspatzier has quit IRC | 07:42 | |
*** arbylee has joined #heat | 07:48 | |
*** asalkeld has quit IRC | 07:51 | |
*** jistr has joined #heat | 07:52 | |
*** akuznets_ has quit IRC | 07:53 | |
openstackgerrit | Zhang Yang proposed a change to openstack/heat: Allow DesiredCapacity to be zero https://review.openstack.org/87204 | 07:54 |
*** arbylee has quit IRC | 07:55 | |
pas-ha | morning all | 07:56 |
skraynev | Morning all | 07:56 |
*** akuznetsov has joined #heat | 08:00 | |
cmyster | morning | 08:01 |
openstackgerrit | Sergey Kraynev proposed a change to openstack/heat: Adding attribute schema class for attributes https://review.openstack.org/86525 | 08:11 |
openstackgerrit | Sergey Kraynev proposed a change to openstack/heat: Using attribute schema for building documentation https://review.openstack.org/86803 | 08:11 |
openstackgerrit | Sergey Kraynev proposed a change to openstack/heat: Deprecate first_address attribute of Server https://review.openstack.org/86526 | 08:12 |
*** derekh has joined #heat | 08:12 | |
openstackgerrit | Sergey Kraynev proposed a change to openstack/heat: Deprecate first_address attribute of Server https://review.openstack.org/86526 | 08:16 |
cmyster | http://download.fedoraproject.org/pub/fedora/linux/updates/20/Images/x86_64/Fedora-x86_64-20-20140407-sda.qcow2 is heartbleed free btw | 08:17 |
*** e0ne has joined #heat | 08:21 | |
*** che-arne has joined #heat | 08:33 | |
*** sorantis has joined #heat | 08:35 | |
*** petertoft has joined #heat | 08:35 | |
*** pablosan is now known as zz_pablosan | 08:36 | |
*** TonyBurn has joined #heat | 08:55 | |
*** zhangyang has joined #heat | 08:56 | |
*** rpothier has quit IRC | 09:00 | |
*** rpothier has joined #heat | 09:01 | |
*** chandan_kumar has quit IRC | 09:06 | |
*** ramishra has quit IRC | 09:06 | |
*** ramishra has joined #heat | 09:07 | |
*** ramishra has quit IRC | 09:08 | |
*** alexpilotti has joined #heat | 09:16 | |
*** chandan_kumar has joined #heat | 09:20 | |
*** chandan_kumar has quit IRC | 09:26 | |
*** chandan_kumar has joined #heat | 09:26 | |
*** tspatzier has joined #heat | 09:29 | |
openstackgerrit | Zhang Yang proposed a change to openstack/heat: Allow DesiredCapacity to be zero https://review.openstack.org/87204 | 09:30 |
*** liang has quit IRC | 09:32 | |
*** saju_m has quit IRC | 09:49 | |
*** arbylee has joined #heat | 09:53 | |
*** tspatzier has quit IRC | 09:56 | |
*** arbylee has quit IRC | 09:58 | |
*** nosnos has quit IRC | 10:07 | |
openstackgerrit | A change was merged to openstack/heat: Add hint on creating new user for Heat in DevStack https://review.openstack.org/87555 | 10:10 |
*** nosnos has joined #heat | 10:12 | |
openstackgerrit | Zhang Yang proposed a change to openstack/heat: Allow DesiredCapacity to be zero https://review.openstack.org/87204 | 10:14 |
*** dmakogon_ is now known as denis_makogon | 10:14 | |
shardy | If anyone wants lots of details on stack domain users, I just posted this: | 10:15 |
shardy | http://hardysteven.blogspot.co.uk/2014/04/heat-auth-model-updates-part-2-stack.html | 10:15 |
skraynev | shardy: thanks. will read ;) | 10:16 |
*** e0ne has quit IRC | 10:17 | |
*** e0ne has joined #heat | 10:18 | |
*** Qiming has joined #heat | 10:18 | |
*** nanjj has quit IRC | 10:21 | |
*** e0ne has quit IRC | 10:22 | |
*** etoews has joined #heat | 10:29 | |
Qiming | shardy, thank you! | 10:29 |
pas-ha | shardy: thanks, good read | 10:31 |
shardy | Qiming: ah, you're here now, no problem :) | 10:32 |
*** mestery_ has joined #heat | 10:33 | |
*** nosnos has quit IRC | 10:34 | |
*** etoews has quit IRC | 10:34 | |
*** nosnos has joined #heat | 10:35 | |
*** mestery has quit IRC | 10:36 | |
Qiming | shardy, I am stuck on sending a signal to heat using a mechanism other than an ec2-signed URL | 10:38 |
shardy | Qiming: You can send a signal via the native API | 10:38 |
shardy | but not a WaitCondition notification at the moment | 10:39 |
shardy | heat resource-signal ... | 10:39 |
*** nosnos has quit IRC | 10:39 | |
Qiming | shardy: I read the deployment code where heat-config sends back a signal now | 10:39 |
Qiming | in that implementation, the signalling side needs to have user-id, password, project-id, auth-url ... | 10:40 |
shardy | Qiming: Yes, the SoftwareDeployment resources have been designed to use either ec2 signed URLs or native signals | 10:40 |
shardy | yes, it uses a stack domain user and a randomly generated password | 10:40 |
Qiming | I hope your blog will help me better understand how trusts work | 10:40 |
shardy | My post from last week may do, but it's unrelated to in-instance signalling | 10:40 |
shardy | please read both posts and come back if you still have questions :) | 10:41 |
Qiming | then maybe I can try having the Ceilometer::Alarm post to a 'trust-url' ? | 10:41 |
shardy | Qiming: therve has posted patches which enable exactly that | 10:42 |
Qiming | still not sure how to post some data back along with the signal | 10:42 |
Qiming | shardy, that is the patch I will try, :) | 10:42 |
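The native-signal flow shardy describes above (`heat resource-signal`, optionally carrying a data payload) can be sketched with stdlib Python. Everything here except the v1 URL pattern is illustrative and not lifted from python-heatclient; the function name and argument layout are assumptions:

```python
import json

def build_signal_request(heat_endpoint, tenant_id, stack_name, stack_id,
                         resource_name, token, data=None):
    """Build (url, headers, body) for a native Heat resource signal.

    The URL layout follows the Heat v1 REST API pattern; the ``data``
    payload is what the resource receives when it handles the signal.
    All names here are illustrative, not python-heatclient internals.
    """
    url = ("%s/v1/%s/stacks/%s/%s/resources/%s/signal"
           % (heat_endpoint, tenant_id, stack_name, stack_id, resource_name))
    headers = {
        "X-Auth-Token": token,          # keystone token (possibly trust-scoped)
        "Content-Type": "application/json",
    }
    body = json.dumps(data or {})
    return url, headers, body
```

On the CLI this corresponds to something like `heat resource-signal mystack myresource -D '{"status": "ok"}'`, assuming a client version with the data option.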
* Qiming printed the blog posts and a dictionary, started to study English ... | 10:43 | |
*** e0ne has joined #heat | 10:48 | |
*** e0ne_ has joined #heat | 10:50 | |
*** sorantis has quit IRC | 10:53 | |
*** e0ne has quit IRC | 10:53 | |
*** nkhare has quit IRC | 10:58 | |
*** fandi has quit IRC | 11:04 | |
*** e0ne_ has quit IRC | 11:06 | |
*** e0ne has joined #heat | 11:06 | |
*** Michalik- has joined #heat | 11:06 | |
*** nosnos has joined #heat | 11:06 | |
*** sorantis has joined #heat | 11:08 | |
*** ifarkas has quit IRC | 11:10 | |
*** ifarkas has joined #heat | 11:22 | |
*** yassine has joined #heat | 11:33 | |
*** lipinski has joined #heat | 11:39 | |
sdake | morning | 11:40 |
cmyster | morning | 11:41 |
*** mkollaro has joined #heat | 11:42 | |
*** tspatzier has joined #heat | 11:43 | |
sdake | cmyster for some reason I thought TLV was in shutdown until the 22nd | 11:43 |
cmyster | ummm | 11:43 |
cmyster | I'm not really here ? | 11:43 |
sdake | hmm, well you're in, so I guess not :) | 11:43 |
*** etoews has joined #heat | 11:49 | |
*** etoews has quit IRC | 11:53 | |
*** arbylee has joined #heat | 11:54 | |
*** igormarnat_ has joined #heat | 11:57 | |
*** arbylee has quit IRC | 11:58 | |
*** tspatzier has quit IRC | 11:58 | |
*** Qiming has quit IRC | 11:59 | |
*** Qiming has joined #heat | 12:00 | |
*** akuznetsov has quit IRC | 12:09 | |
*** akuznets_ has joined #heat | 12:09 | |
*** alexpilotti has quit IRC | 12:15 | |
*** nosnos has quit IRC | 12:17 | |
*** tspatzier has joined #heat | 12:22 | |
*** achampion has quit IRC | 12:23 | |
*** jdob has joined #heat | 12:38 | |
*** slagle has quit IRC | 12:39 | |
*** saju_m has joined #heat | 12:40 | |
*** slagle has joined #heat | 12:40 | |
*** rbuilta has joined #heat | 12:40 | |
*** akuznets_ has quit IRC | 12:43 | |
*** akuznetsov has joined #heat | 12:43 | |
*** saju_m has quit IRC | 12:45 | |
*** blomquisg has joined #heat | 12:46 | |
*** saju_m has joined #heat | 13:02 | |
*** pafuent has joined #heat | 13:03 | |
*** spzala has joined #heat | 13:07 | |
*** erecio has quit IRC | 13:08 | |
*** alexpilotti has joined #heat | 13:14 | |
*** erecio has joined #heat | 13:14 | |
*** achampion has joined #heat | 13:16 | |
*** mestery_ is now known as mestery | 13:22 | |
*** ramishra has joined #heat | 13:27 | |
*** jprovazn has quit IRC | 13:29 | |
*** zz_gondoi is now known as gondoi | 13:31 | |
*** gondoi is now known as zz_gondoi | 13:31 | |
*** dims has quit IRC | 13:32 | |
*** samstav has joined #heat | 13:34 | |
*** zz_gondoi is now known as gondoi | 13:35 | |
*** etoews has joined #heat | 13:36 | |
*** igormarnat_ has quit IRC | 13:36 | |
*** arbylee has joined #heat | 13:41 | |
*** pafuent has left #heat | 13:43 | |
*** pafuent has joined #heat | 13:44 | |
*** gondoi is now known as zz_gondoi | 13:45 | |
*** zz_gondoi is now known as gondoi | 13:51 | |
*** arbylee has quit IRC | 13:52 | |
*** arbylee has joined #heat | 13:52 | |
*** jprovazn has joined #heat | 13:53 | |
*** vijendar has joined #heat | 13:53 | |
*** julienvey has joined #heat | 13:59 | |
*** spzala has quit IRC | 14:00 | |
*** spzala has joined #heat | 14:01 | |
*** tspatzier has quit IRC | 14:02 | |
*** aweiteka has quit IRC | 14:02 | |
*** spzala has quit IRC | 14:04 | |
*** sjmc7 has joined #heat | 14:05 | |
*** aweiteka has joined #heat | 14:15 | |
*** jaustinpage has joined #heat | 14:16 | |
Qiming | shardy: there? | 14:17 |
*** dims has joined #heat | 14:18 | |
shardy | Qiming: yes | 14:20 |
Qiming | shardy: do we assign a role to a user, or assign a user to a role? | 14:21 |
Qiming | or, there is no difference, :p | 14:21 |
*** zns has joined #heat | 14:22 | |
shardy | Qiming: I would say you assign a role to a user, scoped to a project or domain | 14:23 |
jaustinpage | shardy: re: todays blog post: how does heat handle signaling when it is in standalone mode? | 14:23 |
Qiming | thanks, shardy | 14:24 |
shardy | jaustinpage: assuming you don't have permission to create a new domain, you probably have to use the old fallback behavior, which is to create the users as before, in the project of the stack-owner | 14:25 |
shardy | jaustinpage: I guess it depends on what level of control you have over the remote keystone | 14:25 |
jaustinpage | shardy: but in standalone mode, i thought there was an assumption that you couldn't create users either... | 14:25 |
jaustinpage | shardy: thanks for the reply | 14:26 |
shardy | jaustinpage: I'm not aware of any such assumption, or none of the signalling features would have worked for anyone ever | 14:26 |
jaustinpage | shardy | 14:26 |
shardy | jaustinpage: happy to get use-case feedback though, if you have specific issues :) | 14:26 |
jaustinpage | shardy: ok, thanks for the reply | 14:27 |
shardy | jaustinpage: np | 14:27 |
jaustinpage | shardy: one other question, you mentioned the ec2 method of passing keys, is there a significant difference between this method and the heat_signal method of passing keys? | 14:28 |
jaustinpage | shardy: *from the heat engine to the vm being deployed, through cloud-init if i am understanding correctly... | 14:28 |
shardy | jaustinpage: sorry, by heat_signal, you mean the native signals, e.g heat resource-signal? | 14:28 |
jaustinpage | i believe so, i am pretty sure the heat softwaredeployment resource makes use of the heat resource-signal | 14:30 |
shardy | jaustinpage: Ah, HEAT_SIGNAL for SoftwareDeployment resources creates a stack domain user, but not an ec2 keypair, instead it creates a random password, and we use that from the instance | 14:30 |
shardy | So the main difference it removes the dependency on ec2tokens being enabled in keystone, which some deployers don't enable | 14:31 |
*** julienvey has quit IRC | 14:31 | |
shardy | But the disadvantage is you have to obtain a token from the instance, e.g heatclient has to connect to keystone then heat | 14:31 |
jaustinpage | ok, so the instance, in order to signal back, would get a token from keystone, then use that to authenticate to the heat metadata? | 14:31 |
shardy | jaustinpage: exactly | 14:31 |
shardy | jaustinpage: we're still looking at ways we might avoid that additional call to keystone, x509 cert most likely | 14:33 |
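For contrast with the HEAT_SIGNAL password flow: the ec2-signed-URL path avoids fetching a token from the guest because each request carries an HMAC signature that keystone's ec2tokens extension can verify. A rough stdlib sketch of a signature-v2-style scheme (the function name and exact canonicalization details are illustrative assumptions, not Heat code):

```python
import base64
import hashlib
import hmac
from urllib.parse import quote, urlencode

def ec2_v2_signature(secret_key, method, host, path, params):
    """Sketch of an AWS signature-v2 style signing scheme.

    Heat hands the instance an ec2 keypair; the instance signs each
    request with the secret key and keystone's ec2tokens extension
    verifies it, so the guest never needs a keystone token round-trip.
    """
    # Canonical query string: parameters sorted by key, URL-encoded.
    query = urlencode(sorted(params.items()), quote_via=quote)
    string_to_sign = "\n".join([method, host.lower(), path, query])
    digest = hmac.new(secret_key.encode(), string_to_sign.encode(),
                      hashlib.sha256).digest()
    return base64.b64encode(digest).decode()
```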
*** daneyon has quit IRC | 14:34 | |
jaustinpage | shardy: ok, cool. if the call to keystone could be avoided, it would seem that a custom authentication mechanism in the heat engine's pipeline could then work, and still have signalling support | 14:34 |
*** dims has quit IRC | 14:34 | |
jaustinpage | *heat engines authentication pipeline | 14:34 |
*** daneyon has joined #heat | 14:35 | |
shardy | jaustinpage: "custom authentication mechanism"? | 14:35 |
shardy | jaustinpage: FWIW, we (or at least I) have been specifically trying to avoid inventing something heat-specific for this | 14:35 |
jaustinpage | shardy: somebody writing one of these: https://github.com/openstack/heat/blob/master/heat/common/custom_backend_auth.py | 14:36 |
lipinski | Any reason why the heat client and/or engine needs permissions to /lost+found ? | 14:36 |
lipinski | I'm failing to create a stack because of permissions on /lost+found and /root - while the heat-engine is running as heat user | 14:37 |
*** andrew_plunk has joined #heat | 14:37 | |
jaustinpage | shardy: yea, i can definitely understand trying to avoid having a custom authentication backend | 14:37 |
shardy | jaustinpage: If you write your own auth middleware then the call to keystone becomes irrelevant, you just insert your m/w earlier in the paste pipeline, and modify the client to send whatever secret your auth scheme understands | 14:37 |
*** sorantis has quit IRC | 14:38 | |
shardy | jaustinpage: The call to keystone is already optional e.g in python-heatclient, so if for example you were using the heat-api-standalone API pipeline, you could just hard-code a password in all your templates | 14:38 |
jaustinpage | shardy: thanks for the info, and thanks for putting up with all of my questions! | 14:39 |
shardy | jaustinpage: that doesn't really solve the problem that some resources are integrated with keystone functionality though, so you might have to modify them as well as your middleware | 14:40 |
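shardy's suggestion of inserting your own middleware earlier in the paste pipeline would look roughly like this in api-paste.ini. All filter and module names below are hypothetical, and the pipeline contents are only loosely modeled on heat's stock layout, so treat this as a shape, not a recipe:

```ini
# api-paste.ini (sketch; "myauth" and its module are hypothetical names)
[pipeline:heat-api-custom]
# The custom auth filter replaces authtoken, so no keystone call is made;
# it must populate the request context the same way authtoken would.
pipeline = faultwrap versionnegotiation myauth context apiv1app

[filter:myauth]
paste.filter_factory = mycompany.auth:filter_factory
```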
sdague | hey folks, we've been seeing a lot of inconsistent fails in the heat-slow jobs | 14:40 |
sdague | it would be really great to get some eyes on some of these to figure out what's going wrong | 14:40 |
shardy | jaustinpage: np, given me some things to think about re standalone mode, I've mostly been considering the integrated use-case | 14:40 |
shardy | sdague: sure, got any links? | 14:40 |
*** mriedem has joined #heat | 14:41 | |
mriedem | sdague: hi | 14:41 |
sdague | mriedem has been doing some debug shardy, he should have some links on fails | 14:41 |
jaustinpage | shardy: no worries, heat definitely walks the line between IaaS and whatever is higher up the chain than IaaS, which makes for some difficult choices when implementing features | 14:41 |
shardy | jaustinpage: Yeah, that is the challenge with standalone mode. Feel free to raise bugs if you have specific problems | 14:42 |
*** mtreinish has joined #heat | 14:42 | |
mriedem | http://goo.gl/NNAUfK | 14:42 |
mriedem | was just looking at the results for that fail after mtreinish raised the build timeout in tempest for heat jobs yesterday, which didn't help | 14:42 |
mriedem | because the timeout happens in heat, not tempest | 14:43 |
shardy | So the problem is a signal from the instance is not reaching heat | 14:43 |
sdague | it also looks like it dramatically got worse | 14:44 |
shardy | either because the instance is not running, the network is broken, or the VM deployment is just taking too long and the timeout is expiring | 14:44 |
sdague | 2 days ago | 14:44 |
mriedem | so i wonder if the nova/neutron timeout/wait stuff slowed this all down | 14:44 |
mriedem | it is failing on a slow neutron job right? | 14:44 |
mriedem | nova waits longer for neutron to callback | 14:45 |
shardy | sdague: do we have any timing data, are VMs taking massively longer to launch recently? | 14:45 |
mriedem | heat is waiting for nova? | 14:45 |
mriedem | shardy: ^ | 14:45 |
mriedem | so if nova is waiting on neutron, and heat is waiting on nova, and that all slowed down with callbacks | 14:45 |
mriedem | we're going to see timeouts | 14:45 |
*** zz_pablosan is now known as pablosan | 14:45 | |
shardy | mriedem: Yes, heat is waiting for the VM to boot, some stuff to happen inside the VM, and a signal to be POSTed back to us | 14:45 |
mriedem | shardy: is that controlled with stack_action_timeout? | 14:46 |
mriedem | which defaults to 1 minute | 14:46 |
mriedem | derp | 14:46 |
shardy | mriedem: sec, let me look at the tests | 14:46 |
mriedem | 1 hour i should say | 14:46 |
shardy | http://docs.openstack.org/developer/heat/template_guide/cfn.html#AWS::CloudFormation::WaitCondition | 14:46 |
shardy | It's controlled by the Timeout specified in the template | 14:47 |
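The Timeout shardy refers to is a per-resource property in the template, e.g. in a CFN-style fragment like the ones tempest uses (a sketch; resource names are illustrative and the server resource is elided):

```yaml
# Fragment only: the Timeout property, in seconds, bounds how long heat
# waits for the WaitCondition to be signalled before failing the stack.
Resources:
  WaitHandle:
    Type: AWS::CloudFormation::WaitConditionHandle
  WaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    Properties:
      Handle: {Ref: WaitHandle}
      Timeout: '600'
      Count: '1'
```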
*** dims has joined #heat | 14:47 | |
shardy | do we know which test is failing? | 14:47 |
mriedem | sec | 14:47 |
*** igormarnat_ has joined #heat | 14:47 | |
mriedem | there are a couple | 14:48 |
mriedem | http://logs.openstack.org/47/84447/15/check/check-tempest-dsvm-neutron-heat-slow/f2a11ec/console.html | 14:48 |
sdague | shardy: do the heat tests actually require neutron? I wonder if it's better to disconnect them from neutron failure rates to actually test heat instead of couple heat issues to neutron fails | 14:48 |
mriedem | search for FAIL: | 14:48 |
sdague | http://logs.openstack.org/47/84447/15/check/check-tempest-dsvm-neutron-heat-slow/f2a11ec/console.html#_2014-04-16_14_34_01_860 | 14:48 |
sdague | 3 different tests failed in that one | 14:48 |
shardy | sdague: Most tests probably don't, but e.g api test_neutron_resources.py does :) | 14:48 |
mriedem | sdague: what makes the slow job 'slow'? not run in parallel? | 14:48 |
sdague | mriedem: each of these tests is slow | 14:49 |
sdague | or some of them can be | 14:49 |
*** akuznetsov has quit IRC | 14:49 | |
shardy | ServerCfnInitTestJSON.test_all_resources_created[slow] 631.788 | 14:49 |
shardy | that can't be right.. | 14:49 |
sdague | shardy: right now it's a coin flip to pass heat-slow - http://jogo.github.io/gate/ | 14:49 |
shardy | I thought the entire heat-slow job took about 27 minutes | 14:50 |
sdague | the timing there is right | 14:51 |
sdague | depending on the node it drifts from 280 -> 700s | 14:51 |
sdague | honestly, the other tests take longer than reported, because some stuff is done in setupclass | 14:51 |
mriedem | http://logs.openstack.org/47/84447/15/check/check-tempest-dsvm-neutron-heat-slow/f2a11ec/console.html#_2014-04-16_14_34_13_597 | 14:51 |
sdague | which isn't time accounted | 14:51 |
mriedem | 651 sec for that run | 14:51 |
sdague | yeh, it's been running that duration for as long as I can remember | 14:52 |
shardy | So test_server_cfn_init.py is one that's failing, and that doesn't require neutron | 14:52 |
shardy | Timeout: '600' | 14:52 |
*** jaustinpage has quit IRC | 14:52 | |
shardy | If the test is taking >600s that will timeout | 14:52 |
mriedem | should be 1200 now in tempest: https://review.openstack.org/#/c/87691/ | 14:53 |
shardy | https://github.com/openstack/tempest/blob/master/tempest/api/orchestration/stacks/templates/cfn_init_signal.yaml#L71 | 14:53 |
mriedem | derp | 14:53 |
sdague | the guest boot takes 53s | 14:54 |
sdague | 563s | 14:54 |
sdague | http://logs.openstack.org/47/84447/15/check/check-tempest-dsvm-neutron-heat-slow/f2a11ec/console.html#_2014-04-16_14_34_13_590 | 14:54 |
shardy | Just to boot? ouch :( | 14:54 |
mriedem | so there is a configurable timeout in heat, there is a timeout for the heat-slow job, and there is a build_timeout for tempest.conf | 14:54 |
mriedem | the tempest ones are the same value now, but the heat value isn't changed for the slow jobs | 14:55 |
shardy | AFAICS those config values aren't overriding the values in the templates though | 14:55 |
mriedem | they aren't | 14:55 |
shardy | probably we should have a self.override_timeout(loaded_template) step in all the tests | 14:56 |
sdague | shardy: well booting a full fedora 20 cloud guest 2nd level is slow | 14:56 |
shardy | so we can globally configure the waitcondition timeout | 14:56 |
sdague | could we build a cirros with cfn tools in it? or would that just get crazy | 14:56 |
shardy | sdague: not sure tbh, I've only really used fedora images | 14:57 |
shardy | sdague: if the image has python, cloud-init and boto, then probably | 14:57 |
sdague | it has cloud init | 14:57 |
sdague | I have no idea about the rest | 14:58 |
shardy | cloud-init depends on boto, so it may work | 14:58 |
shardy | there are a few other deps, but those are the main ones | 14:58 |
*** Qiming has quit IRC | 14:58 | |
*** akuznetsov has joined #heat | 15:00 | |
SpamapS | boto is the devil | 15:01 |
SpamapS | period | 15:01 |
shardy | sdague, mriedem: want me to post a patch which aligns the WaitCondition timeout with build_timeout, but passing build_timeout as a parameter into the stack? | 15:02 |
mriedem | shardy: yeah i was just looking at that | 15:02 |
mriedem | when it reads the yaml file is it automatically converted to json? | 15:02 |
sdague | cirros does not have python | 15:03 |
shardy | mriedem: I think there are two ways, either directly override the Timeout in the template, or establish a convention where all templates containing a WaitCondition expose a parameter "timeout" | 15:03 |
sdague | I wonder if they compiled down cloud init to a binary | 15:03 |
shardy | sdague: How does cloud-init work then? | 15:03 |
*** IlyaE has joined #heat | 15:03 | |
larsks | sdague shardy : cirros has a collection of shell scripts. | 15:03 |
larsks | It's actually pretty clever, and the cli is somewhat nicer (it caches results locally, and provides cli tools for querying the data) | 15:04 |
mriedem | shardy: i have no idea how to pass parameters to templates (never looked at heat before) | 15:04 |
shardy | sdague: for WaitConditions, we don't actually need heat-cfntools, you can do it with just curl | 15:04 |
*** TonyBurn has quit IRC | 15:04 | |
shardy | mriedem: Give me 10mins, I'll post a patch showing what I mean | 15:05 |
*** igormarnat_ has left #heat | 15:05 | |
shardy | mriedem: what's the bug # for this issue? | 15:05 |
mriedem | shardy: ok, thanks - fwiw the neutron_basic.yaml in tempest also has a 600 second wait timeout | 15:05 |
*** julienvey has joined #heat | 15:05 | |
mriedem | 1297560 | 15:05 |
shardy | mriedem: thanks | 15:05 |
*** jaustinpage has joined #heat | 15:06 | |
sdague | larsks: so it just emulates cloud init? or is it totally different | 15:06 |
*** sorantis has joined #heat | 15:07 | |
larsks | sdague: It doesn't really emulate cloud-init. It will run scripts in user-data, though. I don't think it makes any attempt at reading cloud-config format data. | 15:10 |
shardy | mriedem: https://review.openstack.org/87993 | 15:12 |
shardy | mriedem: just going to try testing locally, but that's what I meant | 15:12 |
larsks | sdague: Yeah, it just looks for "#!" in userdata and runs it, otherwise it just exposes the data via "cirros-query". | 15:12 |
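As shardy notes above, a WaitCondition can be signalled without heat-cfntools: send a small JSON document to the pre-signed URL, which curl on a cirros guest can do. A sketch of building that body (field names follow the CloudFormation wait-condition convention; the helper itself is illustrative):

```python
import json

def waitcondition_signal_body(status="SUCCESS", reason="Configuration OK",
                              unique_id="1", data=""):
    """JSON body that cfn-signal sends to the pre-signed WaitCondition URL.

    With this body, something like
    `curl -X PUT -H 'Content-Type: application/json' --data-binary @body.json <signed-url>`
    from the guest should suffice; no heat-cfntools or python needed.
    """
    return json.dumps({
        "Status": status,       # SUCCESS or FAILURE
        "Reason": reason,
        "UniqueId": unique_id,  # distinguishes signals when Count > 1
        "Data": data,           # free-form payload surfaced by the resource
    })
```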
*** jprovazn is now known as jprovazn_afk | 15:13 | |
mriedem | shardy: cool, that's easy | 15:14 |
mriedem | you missed one template though | 15:14 |
shardy | mriedem: I'm doing it now, was going to post two patches | 15:14 |
shardy | or I can add it to that patch if you prefer :) | 15:14 |
mriedem | doing it in one seems good | 15:14 |
shardy | Ok, git rebase squash it is :) | 15:15 |
*** sorantis has quit IRC | 15:19 | |
shardy | mriedem: updated | 15:19 |
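The shape of the fix being discussed — expose the wait timeout as a template parameter so tempest can pass in its own build_timeout instead of a hard-coded 600 — would look roughly like this CFN-style fragment (a sketch; the actual change under review may differ in detail, and WaitHandle is elided):

```yaml
# Sketch: the caller supplies the timeout at stack-create time.
Parameters:
  timeout:
    Type: Number
Resources:
  WaitCondition:
    Type: AWS::CloudFormation::WaitCondition
    Properties:
      Handle: {Ref: WaitHandle}
      Timeout: {Ref: timeout}
```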
*** kgriffs|afk is now known as kgriffs | 15:21 | |
*** spzala has joined #heat | 15:23 | |
mriedem | shardy: looks good | 15:25 |
sdague | shardy: so the lingering question is currently fedora cloud image is nearly 2 orders of magnitude slower to complete booting than cirros | 15:25 |
sdague | I think that unless we can get it down to 1 order of magnitude, the amount of coverage we can realistically expect out of heat is going to be small | 15:26 |
sdague | so if you have any thoughts on how to trim what's in that image that would be cool | 15:26 |
*** sdake_ has quit IRC | 15:27 | |
*** aweiteka has quit IRC | 15:27 | |
*** andersonvom has joined #heat | 15:28 | |
shardy | sdague: I think test_neutron_resources.py can be converted to use an image not containing heat-cfntools | 15:29 |
*** saju_m has quit IRC | 15:29 | |
shardy | All it does in the user-data is cfn-signal, which is basically just a wrapper for curl | 15:30 |
shardy | so provided there's curl or something similar in the cirros image, perhaps we can use that? | 15:30 |
shardy | I'll have to take a look, don't think I've ever booted a cirros image before | 15:30 |
sdague | yeh, there is curl inside there | 15:31 |
shardy | sdague: I think there are only a very small subset of things which actually *need* cfntools | 15:31 |
sdague | ok, cool, well that would help a lot if we were able to isolate those things | 15:31 |
sdague | then we could run most of the tests on cirros I think | 15:32 |
*** fandi has joined #heat | 15:33 | |
shardy | Even test_server_cfn_init.py could be rewritten to not need cfn-init, although that might defeat the point of it a bit :) | 15:35 |
*** fandi has quit IRC | 15:37 | |
sdague | yeh, I'm fine with using cfn-init where it's needed to test that | 15:37 |
sdague | just given the image weight, I'd rather see what we can test with cirros so we can get broader coverage of heat that doesn't need cfn-tools | 15:38 |
sdague | we'll get more bang for our buck that way | 15:38 |
shardy | sdague: Sure, makes sense | 15:38 |
shardy | sdague: the lack of python is an issue though, as we use python hook scripts for SoftwareDeployment resources IIRC | 15:39 |
*** ramishra has quit IRC | 15:39 | |
shardy | maybe there's a way to do shell script hooks instead, not sure, stevebaker will know | 15:39 |
sdague | well, I expect software deployment resources will need the bigger image | 15:39 |
sdague | I wonder if there are things that could be stripped from the base image that would help with speed. Part of the issue is it's a 500 MB disk, which means we're generating real io, not keeping it in cache | 15:41 |
sdague | whereas the cirros disk is 13M | 15:41 |
*** smulcahy has joined #heat | 15:42 | |
sdague | anyway, got to run away for a bit | 15:42 |
shardy | sdague: In a past life I maintained a Fedora image which was <50M, but the effort to prune things to get to that point was non-trivial | 15:42 |
shardy | sdague: Ok, I'm out till next week but I'll start digging into the image requirements next week | 15:43 |
*** fandi has joined #heat | 15:44 | |
*** sdake_ has joined #heat | 15:45 | |
*** chandan_kumar has quit IRC | 15:46 | |
smulcahy | Hi folks - is anyone looking at https://bugs.launchpad.net/heat/+bug/1306743 ? A few folks in HP are but not making much progress so far. It seems to be a hard blocker for running Heat with more than 2 or 3 nodes which is a surprisingly low bar. | 15:46 |
uvirtbot | Launchpad bug 1306743 in heat "queuepool limit of size 5 overflow" [Critical,Triaged] | 15:46 |
*** arbylee has quit IRC | 15:47 | |
zaneb | therve: I looked at your stack snapshots thing, but I think shardy is more qualified to comment ;) | 15:47 |
shardy | zaneb: Yeah therve and I discussed it on IRC but I've not got around to replying to the ML post yet | 15:48 |
*** etoews has left #heat | 15:49 | |
zaneb | shardy: thanks. that wasn't a hurry-up ;) | 15:49 |
therve | zaneb, OK thanks :) | 15:49 |
zaneb | it was more of a this-is-why-I-haven't-responded-to-the-post-therve-asked-me-to-look-at :) | 15:50 |
shardy | lol :) | 15:50 |
therve | smulcahy, The bug is not super clear to be honest. I don't know where to look | 15:51 |
therve | The only fix I can think of is "make less SQL queries in Heat" | 15:55 |
therve | Which is a fine goal but you may want a faster solution | 15:55 |
smulcahy | therve: are we the only ones seeing this? | 15:55 |
therve | *cough* | 15:56 |
therve | You're the only ones reporting it at least | 15:56 |
*** e0ne has quit IRC | 15:56 | |
smulcahy | we'll see if we can peel out a simpler reproducer | 15:57 |
smulcahy | but currently blocked on any real deploys by this | 15:57 |
*** e0ne has joined #heat | 15:57 | |
therve | smulcahy, Have you simply tried tweaking those parameters? | 15:57 |
therve | 5 and 10 looks small for a real deployment | 15:58 |
smulcahy | yes and yes | 15:58 |
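For context on the error in bug 1306743: SQLAlchemy's QueuePool keeps `pool_size` persistent connections and allows up to `max_overflow` temporary extras under load; once both are exhausted, checkouts fail. A toy stdlib model of that ceiling (illustrative only, not SQLAlchemy's actual implementation):

```python
class TinyPool:
    """Toy model of SQLAlchemy QueuePool limits (not the real implementation).

    pool_size connections stay open; up to max_overflow extras may be created
    under load; beyond that, a checkout fails - which is the "queuepool limit
    of size 5 overflow" error from the bug, with the defaults of 5 and 10.
    """

    def __init__(self, pool_size=5, max_overflow=10):
        self.pool_size = pool_size
        self.max_overflow = max_overflow
        self.checked_out = 0

    def connect(self):
        if self.checked_out >= self.pool_size + self.max_overflow:
            raise TimeoutError(
                "QueuePool limit of size %d overflow %d reached"
                % (self.pool_size, self.max_overflow))
        self.checked_out += 1

    def release(self):
        self.checked_out -= 1
```

Raising the real limits is a heat.conf change; in releases of that era the knobs lived under `[database]` (`max_pool_size`, `max_overflow`), though option names vary by oslo.db version, so check your release's documentation.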
zaneb | smulcahy: what does "nodes" mean in "2 or 3 nodes"? | 15:58 |
*** jlanoux has joined #heat | 15:59 | |
smulcahy | zaneb: servers running nova bare metal | 15:59 |
zaneb | ok | 15:59 |
*** vinsh has joined #heat | 16:00 | |
smulcahy | we're trying to repro with VMs, or maybe figure out a simple Heat only test of some sort | 16:00 |
*** mkollaro has quit IRC | 16:00 | |
smulcahy | but any suggestions and input most welcome on https://bugs.launchpad.net/heat/+bug/1306743 | 16:00 |
uvirtbot | Launchpad bug 1306743 in heat "queuepool limit of size 5 overflow" [Critical,Triaged] | 16:00 |
*** e0ne has quit IRC | 16:02 | |
therve | smulcahy, Input from you would be welcome | 16:02 |
therve | We really lack enough information to help | 16:02 |
zaneb | smulcahy: so what is polling describe_stack_resource in that traceback? | 16:02 |
*** arbylee has joined #heat | 16:04 | |
*** jlanoux has quit IRC | 16:05 | |
*** tomek_adamczewsk has quit IRC | 16:06 | |
*** ramishra has joined #heat | 16:06 | |
smulcahy | zaneb: one of the os- scripts afaik | 16:09 |
zaneb | could that be the problem? how fast is it polling? | 16:09 |
*** geerdest has joined #heat | 16:10 | |
smulcahy | not sure, asking one of our other folks to pop on if they're available | 16:10 |
SpamapS | Hey if somebody can give https://bugs.launchpad.net/heat/+bug/1306743 a look.. | 16:11 |
uvirtbot | Launchpad bug 1306743 in heat "queuepool limit of size 5 overflow" [Critical,Triaged] | 16:11 |
SpamapS | we're hitting scale problems at just 30 nodes requesting metadata from Heat. | 16:11 |
*** Michalik- has quit IRC | 16:11 | |
zaneb | SpamapS: by happy coincidence we were just discussing that :) | 16:12 |
SpamapS | I'm guessing we just need to start looking at a caching layer | 16:12 |
smulcahy | zaneb: lifeless also ran into this problem on his testing last week so should be able to give more info in a bit | 16:12 |
SpamapS | oh hah | 16:12 |
SpamapS | zaneb: not such a coincidence, as smulcahy is indeed somebody probably even more motivated than I am to fix this :) | 16:12 |
SpamapS | anyway, I'm offline for a while | 16:12 |
SpamapS | good luck! | 16:13 |
SpamapS | zaneb: polling once every 30 seconds per node | 16:13 |
SpamapS | zaneb: os-collect-config btw | 16:13 |
zaneb | ok, that doesn't sound unreasonable | 16:13 |
SpamapS | zaneb: pretty slow IMO | 16:13 |
SpamapS | But if each poll takes 5 queries or something.. :-/ | 16:14 |
SpamapS | anyway.. offline.. forrealz | 16:14 |
zaneb | if it was 30 times per second then I would understand | 16:14 |
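SpamapS's polling numbers are easy to sanity-check. Assuming (hypothetically) around five SQL queries per poll, 30 nodes polling every 30 seconds should generate only a handful of queries per second, which makes the 300-400 reqs/sec smulcahy reports look like hundreds of queries per poll:

```python
# Back-of-envelope for the os-collect-config load described above.
nodes = 30
poll_interval_s = 30.0
queries_per_poll = 5          # assumption, not measured

polls_per_s = nodes / poll_interval_s             # 1.0 poll/s across the fleet
expected_qps = polls_per_s * queries_per_poll     # ~5 queries/s

observed_qps = 350.0          # midpoint of the reported 300-400 reqs/sec
implied_queries_per_poll = observed_qps / polls_per_s  # ~350 queries per poll
```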
*** TonyBurn has joined #heat | 16:16 | |
smulcahy | we're still trying to find the source of those 300-400 reqs/sec hitting mysql from heat-engine | 16:17 |
*** zhiyan is now known as zhiyan_ | 16:17 | |
petertoft | Also heat-engine pinning a CPU at 100% | 16:18 |
smulcahy | all we have so far is that it's the calls to resource_data_get(resource, key) in heat/db/sqlalchemy/api.py | 16:18 |
*** akuznets_ has joined #heat | 16:18 | |
zaneb | sounds like maybe we are creating a new session somewhere to request data that is probably already cached in our existing session | 16:19 |
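One way to act on zaneb's hunch is to memoize resource data for the lifetime of a session, so repeated lookups within one request hit the database once. This is an illustrative sketch, not Heat's actual code: `resource_data_get` here is a stand-in for the function in heat/db/sqlalchemy/api.py, and a real cache would need invalidation on writes.

```python
import functools

QUERY_COUNT = {"n": 0}


def resource_data_get(resource_id, key):
    """Stand-in for the DB API call; every invocation counts as one query."""
    QUERY_COUNT["n"] += 1
    return "value-for-%s/%s" % (resource_id, key)


def session_cached(fn):
    """Memoize per (resource_id, key); intended lifetime: one request/session."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(resource_id, key):
        if (resource_id, key) not in cache:
            cache[(resource_id, key)] = fn(resource_id, key)
        return cache[(resource_id, key)]
    return wrapper


cached_get = session_cached(resource_data_get)
```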
*** mriedem has left #heat | 16:19 | |
*** IlyaE has quit IRC | 16:20 | |
smulcahy | zaneb: there may be some cascading effect here too | 16:21 |
*** akuznetsov has quit IRC | 16:21 | |
*** cmyster has quit IRC | 16:21 | |
zaneb | I'd look very closely at the Metadata class | 16:22 |
smulcahy | zaneb: Can you put any suggestions and/or requests for more info on https://bugs.launchpad.net/heat/+bug/1306743 - it would help us in digging deeper on this | 16:22 |
uvirtbot | Launchpad bug 1306743 in heat "queuepool limit of size 5 overflow" [Critical,Triaged] | 16:22 |
*** cmyster has joined #heat | 16:23 | |
*** cmyster has joined #heat | 16:23 | |
therve | smulcahy, How many http requests to heat do you get? | 16:24 |
smulcahy | therve: again, could you post these questions to the bug - I'll need to dig to answer them | 16:25 |
*** pablosan has quit IRC | 16:29 | |
*** pablosan has joined #heat | 16:29 | |
*** IlyaE has joined #heat | 16:30 | |
*** ramishra has quit IRC | 16:31 | |
zaneb | smulcahy, therve: done | 16:32 |
*** zhiyan_ is now known as zhiyan | 16:34 | |
*** zhiyan is now known as zhiyan_ | 16:41 | |
*** gokrokve has joined #heat | 16:47 | |
*** harlowja_away is now known as harlowja | 16:51 | |
*** cmyster has quit IRC | 16:52 | |
*** yassine has quit IRC | 16:54 | |
*** cmyster has joined #heat | 16:54 | |
*** cmyster has joined #heat | 16:54 | |
*** wendar has quit IRC | 16:59 | |
*** wendar has joined #heat | 17:01 | |
*** julienvey has quit IRC | 17:02 | |
*** derekh has quit IRC | 17:02 | |
*** jstrachan has joined #heat | 17:09 | |
*** akuznets_ has quit IRC | 17:11 | |
*** Lotus907efi has joined #heat | 17:11 | |
*** Carlos44 has joined #heat | 17:12 | |
*** denis_makogon has quit IRC | 17:13 | |
*** dmakogon_ has joined #heat | 17:13 | |
Carlos44 | hey everyone i have hit this bug https://bugs.launchpad.net/heat/+bug/1290274 trying to get heat db sync'd has anyone found how to get by this? | 17:13 |
uvirtbot | Launchpad bug 1290274 in heat "Index on 'tenant' column will be inefficient and 767 bytes per index key on MySQL " [Medium,Triaged] | 17:13 |
Carlos44 | thats is the one any possible manual fixes | 17:16 |
*** jaustinpage has quit IRC | 17:18 | |
*** Carlos44 has quit IRC | 17:19 | |
*** Carlos44 has joined #heat | 17:20 | |
Carlos44 | hey everyone i have hit this bug https://bugs.launchpad.net/heat/+bug/1290274 trying to get heat db sync'd has anyone found how to get by this? | 17:20 |
uvirtbot | Launchpad bug 1290274 in heat "Index on 'tenant' column will be inefficient and 767 bytes per index key on MySQL " [Medium,Triaged] | 17:20 |
*** david-lyle has joined #heat | 17:20 | |
*** IlyaE has quit IRC | 17:26 | |
sdake | zaneb i'll be at my folks for dinner during the meeting time - so won't be able to make it | 17:26 |
sdake | enjoy :) | 17:26 |
harlowja | zanebaany heat guys around, got a probably easy question about if heat could do something a customer (mail) is asking for | 17:26 |
harlowja | zanebany, haha | 17:26 |
sdake | harlowja heat cannot solve world hunger | 17:27 |
harlowja | :( | 17:27 |
sdake | otherwise its great! | 17:27 |
harlowja | will it solve my business continuity | 17:27 |
Lotus907efi | is heat buzzword compatible? | 17:28 |
harlowja | ha, anyway the simple question is, mail wants to basically startup CI servers, but have them auto-delete if they aren't used after X minutes, my knowledge of heat is not so much, but i thought that it had some type of capability to do this, but i can't remember anymore | 17:28 |
sdake | business continuity - in the context of 1) monitor for failures 2) recover from failures 3) notify of failures 4) escalate on repeated failures | 17:28 |
sdake | harlowja that is not business continuity imo :) | 17:29 |
sdake | but yes, autoscaling will do that | 17:29 |
harlowja | lol | 17:29 |
harlowja | any good docs i can reference about this? | 17:30 |
sdake | the developer docs for openstack show the resources you would want to use | 17:30 |
sdake | the heat templates repo contains a autoscaling example | 17:30 |
sdake | imo autoscaling needs love | 17:30 |
*** jaustinpage has joined #heat | 17:30 | |
sdake | the specific problem you mention, which is autodelete a specific node if it is underutilized, heat will not do | 17:30 |
sdake | heat will take a holistic approach to machines in an autoscaling group and scale up or down based upon metrics | 17:31 |
sdake | but it doesn't target machines that are at low-utilization for removal | 17:31 |
Lotus907efi | is there any documentation that would lead a newbie through all necessary steps to do a simple example of using heat and cloud-init to do a semi-routine config task on newly booted system? | 17:31 |
harlowja | kk | 17:31 |
sdake | heat expects a load balancer to run in front of the services to evenly spread load | 17:31 |
*** lindsayk has joined #heat | 17:31 | |
sdake | so realistically when a node is killed off by autoscaling, it would have a similar load as other vms | 17:32 |
*** kgriffs is now known as kgriffs|afk | 17:32 | |
harlowja | right right, makes sense | 17:32 |
Lotus907efi | I have been looking around a for a few days and reading stuff but I am still a very confused newbie when it comes to using heat meta-data / user-data to get cloud-init to do things on first boot | 17:32 |
Lotus907efi | and the example yaml files I have looked at seem a little vague | 17:33 |
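A minimal example of the pattern Lotus907efi is after: an OS::Nova::Server whose user_data is a `#cloud-config` document that cloud-init executes on first boot. Shown here as a Python dict for brevity (a real template would be YAML); the image and flavor values are placeholders.

```python
# HOT-style template sketch: hand cloud-init a #cloud-config via user_data.
template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "server": {
            "type": "OS::Nova::Server",
            "properties": {
                "image": "fedora-20",       # placeholder
                "flavor": "m1.small",       # placeholder
                "user_data_format": "RAW",  # pass user_data straight through
                "user_data": "#cloud-config\n"
                             "packages:\n"
                             " - httpd\n"
                             "runcmd:\n"
                             " - [ systemctl, start, httpd ]\n",
            },
        },
    },
}
```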
lifeless | zaneb: o/ I'm around now | 17:34 |
harlowja | thx sdake , let me see if i can further figure out what the heck mail people want to do | 17:35 |
*** david-lyle has quit IRC | 17:35 | |
harlowja | ha | 17:35 |
*** yogesh has joined #heat | 17:36 | |
*** lindsayk has quit IRC | 17:36 | |
*** jstrachan has quit IRC | 17:38 | |
*** TonyBurn has quit IRC | 17:39 | |
sdake | you mean yahoo mail harlowja? | 17:42 |
*** petertoft has quit IRC | 17:43 | |
harlowja | ya | 17:43 |
harlowja | i do | 17:43 |
sdake | i suspect they don't care about killing a low use node if they have a LB in front | 17:44 |
*** lindsayk has joined #heat | 17:44 | |
sdake | they want to reduce utilization holistically rather then specifically | 17:44 |
sdake | we do have some folks that want to be able to target specific nodes for queisce and kill | 17:44 |
sdake | but that isn't implemented (yet) | 17:45 |
harlowja | sdake this is also for their CI, not necessarily for the mail facing servers yet, so i think we're trying to figure out what exactly they want still :) | 17:45 |
sdake | harlowja http://docs.openstack.org/developer/heat/template_guide/openstack.html#OS::Heat::AutoScalingGroup and http://docs.openstack.org/developer/heat/template_guide/openstack.html#OS::Heat::ScalingPolicy | 17:46 |
sdake | policy controls the group | 17:47 |
sdake | group contains the collection of vms | 17:47 |
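sdake's group/policy split can be sketched as a minimal template, again written as a Python dict for brevity. The sizing values, image, and flavor are placeholders, and a real deployment would also wire the policy's alarm URL to something like a Ceilometer alarm to trigger scaling:

```python
# Sketch of the two resources sdake links above: the group holds the VMs,
# the policy adjusts the group's capacity when triggered.
template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "asg": {
            "type": "OS::Heat::AutoScalingGroup",
            "properties": {
                "min_size": 1,               # placeholder sizing
                "max_size": 5,
                "resource": {
                    "type": "OS::Nova::Server",
                    "properties": {"image": "cirros", "flavor": "m1.tiny"},
                },
            },
        },
        "scale_up": {
            "type": "OS::Heat::ScalingPolicy",
            "properties": {
                "auto_scaling_group_id": {"get_resource": "asg"},
                "adjustment_type": "change_in_capacity",
                "scaling_adjustment": 1,     # add one server per trigger
                "cooldown": 60,
            },
        },
    },
}
```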
harlowja | thx sdake i'll see if i can find out more about their requirement still, and reference the above as a possible way (once the requirement becomes less blurry) | 17:48 |
sdake | lotus907efi have you launched your first stack? | 17:49 |
Lotus907efi | I have been playing around with tripleo for about a month now | 17:50 |
sdake | lotus907efi http://openstack.redhat.com/Deploy_Heat_and_launch_your_first_Application | 17:50 |
sdake | lotus907efi http://openstack.redhat.com/Deploy_an_application_with_Heat | 17:51 |
Lotus907efi | cool, thanks I will read those | 17:51 |
sdake | the first step of heat is launching a stack | 17:51 |
Lotus907efi | ok | 17:51 |
sdake | once you get that down, you can play with the various heat API operations via cli | 17:51 |
sdake | once you understand the clis, you can dig into writing your own templates | 17:51 |
sdake | I'd proceed in that order :) | 17:51 |
sdake | if you have a working openstack install, it should take less than 30 minutes to get through those 3 things | 17:52 |
sdake | one option that works really nicely - rax has stood up heat in their infrastructure, so all you need is the python client (and a credit card) to launch stacks :) | 17:52 |
Lotus907efi | ah, well I have two or three tripleo devtest environments running | 17:55 |
Lotus907efi | and I have been digging into those | 17:55 |
*** gokrokve has quit IRC | 17:57 | |
sdake | lotus907efi I have not tried tripleo + heat | 17:57 |
Lotus907efi | tripleo has heat integrated fully into it | 17:57 |
sdake | I intend to get heavily involved atleast in using that model during juno tho | 17:57 |
Lotus907efi | tripleo uses heat to bring up the undercloud and overcloud stacks that running devtest produces | 17:58 |
sdake | lotus907efi yes I know - I think the heat core is going to take a keen interest in making sure that works moar better in juno | 17:59 |
Lotus907efi | and all of those undercloud and overcloud systems have meta-data servers running at http://169.254.169.254 | 18:00 |
*** jprovazn_afk is now known as jprovazn | 18:00 | |
*** spzala has quit IRC | 18:00 | |
Lotus907efi | so are you saying that the heat bits built into tripleo might not be the most up to date fully coked bits? | 18:00 |
Lotus907efi | cooked | 18:01 |
cmyster | evening | 18:01 |
sdake | heat bits built into tripleo are most up to date yes | 18:01 |
*** spzala has joined #heat | 18:02 | |
sdake | lotus907efi I get the impression the integration could be improved | 18:02 |
zaneb | lifeless: SpamapS answered my question already; I added some stuff to the bug | 18:02 |
*** lindsayk1 has joined #heat | 18:02 | |
*** kgriffs|afk is now known as kgriffs | 18:02 | |
sdake | mostly from the heat side | 18:02 |
Lotus907efi | ah, ok | 18:02 |
sdake | (eg heat has gaps that need feed and care) | 18:02 |
Lotus907efi | hmm, from what I can see from what little I have used it the heat stuff in tripleo seems to work pretty well | 18:03 |
*** lindsayk has quit IRC | 18:04 | |
*** e0ne has joined #heat | 18:05 | |
sdake | I think SpamapS would argue with your definition of seems :) | 18:06 |
openstackgerrit | Andreas Jaeger proposed a change to openstack/heat: Check that all po/pot files are valid https://review.openstack.org/84226 | 18:06 |
*** ramishra has joined #heat | 18:09 | |
*** jstrachan has joined #heat | 18:09 | |
Lotus907efi | ah, well he is supposed to be on vacation so not allowed to grouse about stuff now | 18:09 |
*** e0ne has quit IRC | 18:10 | |
*** lindsayk1 has quit IRC | 18:11 | |
*** zhangyang has quit IRC | 18:11 | |
sdague | shardy: you still awake? | 18:12 |
*** e0ne has joined #heat | 18:12 | |
sdague | there is another race I'm seeing here - https://github.com/openstack/tempest/blob/master/tempest/api/orchestration/stacks/test_update.py#L73-L78 | 18:12 |
sdague | will a stack go through a state transition so that we can wait on something? | 18:13 |
*** ramishra has quit IRC | 18:13 | |
cmyster | hi sdague | 18:13 |
sdague | because it looks like a decent amount of the time the update isn't processing before the list pulls it back | 18:13 |
*** rbuilta1 has joined #heat | 18:13 | |
sdague | cmyster: hi | 18:14 |
*** rbuilta has quit IRC | 18:15 | |
sdague | actually, that's a more general heat question on resource wait on update | 18:15 |
*** adeb_ has joined #heat | 18:16 | |
sdague | this has failed 39 times in the last 24 hrs, so pretty bad - http://logstash.openstack.org/#eyJzZWFyY2giOiJcIkZBSUw6IHRlbXBlc3QuYXBpLm9yY2hlc3RyYXRpb24uc3RhY2tzLnRlc3RfdXBkYXRlLlVwZGF0ZVN0YWNrVGVzdEpTT04udGVzdF9zdGFja191cGRhdGVfYWRkX3JlbW92ZVwiIiwiZmllbGRzIjpbXSwib2Zmc2V0IjowLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxMzk3NjcxNzUzOTE2fQ== | 18:16 |
Lotus907efi | sdake: one comment about that "Deploy an application with Heat" document - the sentence "There are a number of sample templates available in the github repo" where "githup repo" is a link .... the link does not seem to actually lead to any example templates I can see | 18:17 |
*** jprovazn has quit IRC | 18:18 | |
*** gokrokve has joined #heat | 18:18 | |
*** jstrachan has quit IRC | 18:21 | |
*** lindsayk has joined #heat | 18:22 | |
therve | sdague, "'Unknown resource Type : OS::Heat::RandomString" ? That's a really weird error | 18:23 |
cmyster | depends on version I guess... | 18:26 |
sdague | therve: are you seeing a different issue than I am? | 18:29 |
sdague | http://logs.openstack.org/54/87554/9/check/check-tempest-dsvm-postgres-full/7922180/console.html#_2014-04-16_13_04_22_366 | 18:29 |
*** vinsh has quit IRC | 18:29 | |
sdague | the mismatch is a race on update | 18:29 |
sdague | looks like it happens 25% of the time | 18:30 |
*** spzala has quit IRC | 18:30 | |
*** spzala has joined #heat | 18:30 | |
*** pafuent1 has joined #heat | 18:33 | |
*** spzala has quit IRC | 18:33 | |
*** aweiteka has joined #heat | 18:34 | |
sdague | therve: here's the bug - https://bugs.launchpad.net/heat/+bug/1308682 | 18:35 |
uvirtbot | Launchpad bug 1308682 in tempest "Race in heat stack update " [Undecided,New] | 18:35 |
sdague | I'm curious if this is expected that stack_update is only eventually consistent | 18:35 |
sdague | and if so, if there is a wait condition to know it's done | 18:35 |
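The wait sdague is asking about is usually implemented client-side: poll the stack status until it reaches UPDATE_COMPLETE, a *_FAILED state, or a timeout. A hedged sketch (`get_status` is any callable returning the current stack status string; this is not tempest's actual helper):

```python
import time


def wait_for_stack_status(get_status, want="UPDATE_COMPLETE",
                          timeout=300.0, interval=1.0, sleep=time.sleep):
    """Poll get_status() until `want`, a *_FAILED state, or timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == want:
            return status
        if status.endswith("_FAILED"):
            raise RuntimeError("stack reached %s while waiting for %s"
                               % (status, want))
        sleep(interval)
    raise TimeoutError("timed out waiting for %s" % want)
```

With a poll loop like this, a test only proceeds to list resources once the update has actually finished, instead of racing the engine.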
*** pafuent has quit IRC | 18:36 | |
*** akuznetsov has joined #heat | 18:37 | |
sdague | therve: so if you have any thoughts before I just straight out skip the test | 18:37 |
sdague | would be appreciated | 18:37 |
*** aweiteka has quit IRC | 18:43 | |
*** kgriffs is now known as kgriffs|afk | 18:44 | |
gokrokve | Hi. Is there any good example of HARestarter resource usage available? I checked here: https://github.com/openstack/heat-templates but did not find any. | 18:46 |
*** jprovazn has joined #heat | 18:47 | |
*** chandan_kumar has joined #heat | 18:49 | |
cmyster | there is actually | 18:52 |
cmyster | http://zenodo.org/record/7571/files/CERN_openlab_report_Michelino.pdf | 18:52 |
*** russellb has quit IRC | 18:53 | |
adeb_ | My settings requires going through a proxy for downloading things | 18:53 |
*** petertoft has joined #heat | 18:53 | |
adeb_ | I am trying to create a stack using this https://github.com/openstack/heat-templates/blob/master/hot/hello_world.yaml template...but my create fails with the following error: Could not retrieve template: Failed to retrieve template: [Errno 110] ETIMEDOUT | 18:54 |
adeb_ | Is there any config file where I can set up the proxy | 18:54 |
*** aweiteka has joined #heat | 18:54 | |
adeb_ | I already have the env variable http_proxy set up in | 18:54 |
*** bgorski has joined #heat | 18:54 | |
*** russellb has joined #heat | 18:55 | |
gokrokve | cmyster: Thanks! | 19:00 |
*** gondoi is now known as zz_gondoi | 19:01 | |
*** tango has joined #heat | 19:01 | |
cmyster | np gokrokve | 19:03 |
cmyster | adeb_: and proxy is working regularly otherwise? | 19:03 |
cmyster | i.e if you go online to some web site | 19:04 |
*** aweiteka has quit IRC | 19:08 | |
*** ramishra has joined #heat | 19:09 | |
*** nati_ueno has joined #heat | 19:10 | |
*** ramishra has quit IRC | 19:14 | |
*** e0ne has quit IRC | 19:15 | |
*** jdob_ has joined #heat | 19:15 | |
*** e0ne has joined #heat | 19:16 | |
*** nati_ueno has quit IRC | 19:20 | |
*** e0ne has quit IRC | 19:21 | |
*** nati_ueno has joined #heat | 19:21 | |
therve | sdague, Sorry was away. Yes in your logstack results I saw a different error | 19:22 |
therve | Like http://logs.openstack.org/92/85392/6/check/check-tempest-master-dsvm-full-havana/6fb8d82/console.html | 19:22 |
sdague | oh, yeh, that's another job | 19:23 |
adeb_ | yes, if I go to online to other sites it works | 19:23 |
sdague | I realized that later | 19:23 |
adeb_ | sorry was away | 19:23 |
sdague | so the failure rate is more like 10% I think | 19:25 |
sdague | check-tempest-master-dsvm-full-havana is our attempt to run tempest master on stable/havana | 19:25 |
therve | Still pretty bad | 19:25 |
sdague | yeh | 19:25 |
sdague | therve: so I pushed a skip | 19:25 |
therve | Yeah I saw :/ | 19:26 |
sdague | however, that doesn't really answer the root question on whether that is expected to be non synchronous | 19:26 |
*** tspatzier has joined #heat | 19:26 | |
sdague | and if it is, how would an api user know things were ready | 19:26 |
*** cmyster has quit IRC | 19:27 | |
*** e0ne has joined #heat | 19:27 | |
therve | Uh | 19:29 |
therve | sdague, TemplateYAMLNegativeTestJSON is pretty bad... It connects to example.com | 19:29 |
*** chandan_kumar has quit IRC | 19:30 | |
sdague | therve: is it actually connecting? | 19:32 |
therve | Yeah :/ | 19:32 |
therve | Unrelated to your issues, just saw that in the logs | 19:32 |
*** nati_uen_ has joined #heat | 19:32 | |
sdague | so if we give it a totally bogus dns name will it do the right thing? | 19:32 |
sdague | I agree we should get rid of network connects like that | 19:33 |
*** nati_ueno has quit IRC | 19:33 | |
therve | It's doing an HTTP GET, so whatever answers fast would be nice | 19:34 |
*** e0ne has quit IRC | 19:34 | |
*** cmyster has joined #heat | 19:34 | |
*** cmyster has joined #heat | 19:34 | |
*** e0ne has joined #heat | 19:35 | |
*** nati_uen_ has quit IRC | 19:35 | |
*** petertoft has quit IRC | 19:37 | |
*** e0ne has quit IRC | 19:39 | |
*** jistr has quit IRC | 19:40 | |
therve | sdague, So to get back to the problem, I think we simply have a race condition in Heat | 19:41 |
therve | We set the state to UPDATE_COMPLETE but it's not really complete | 19:41 |
sdague | ok | 19:41 |
therve | And transactions are for suckers, so... | 19:41 |
sdague | so then it was right that I also marked it as a heat bug | 19:41 |
therve | I think so | 19:42 |
sdague | if you want to add that commentary in there, would be appreciated | 19:42 |
therve | I'm interested in the other failure we're seeing too. It seems weird. | 19:42 |
therve | Will do | 19:42 |
sdague | yeh, well the tempest-master ones mostly would be an incompatible change from havana to now | 19:46 |
sdague | perhaps something was added? | 19:46 |
therve | Ah yes, those tests wouldn't pass on Heat havana | 19:48 |
*** spzala has joined #heat | 19:50 | |
*** rbuilta1 has quit IRC | 19:53 | |
*** zns has quit IRC | 19:53 | |
*** saurabhs has joined #heat | 19:55 | |
*** tspatzier has quit IRC | 19:55 | |
*** vinsh has joined #heat | 19:56 | |
*** akuznetsov has quit IRC | 19:57 | |
*** chandan_kumar has joined #heat | 19:57 | |
openstackgerrit | Thomas Herve proposed a change to openstack/heat: Push COMPLETE status change at the end of update https://review.openstack.org/88075 | 19:57 |
*** IlyaE has joined #heat | 19:58 | |
*** jprovazn has quit IRC | 19:59 | |
*** jdob_ has quit IRC | 19:59 | |
*** e0ne has joined #heat | 20:04 | |
*** alexpilotti has quit IRC | 20:05 | |
*** e0ne has quit IRC | 20:08 | |
*** e0ne has joined #heat | 20:08 | |
*** ramishra has joined #heat | 20:10 | |
*** zns has joined #heat | 20:14 | |
*** ramishra has quit IRC | 20:14 | |
*** david-lyle has joined #heat | 20:16 | |
*** jistr has joined #heat | 20:17 | |
*** tspatzier has joined #heat | 20:20 | |
sdague | so it looks like - https://review.openstack.org/#/c/87993/ doesn't fix anything | 20:21 |
sdague | we're still failing on wait condition on that | 20:22 |
sdague | the cloud init errors aren't fun there though - http://logs.openstack.org/93/87993/2/gate/gate-tempest-dsvm-neutron-heat-slow/82b8ac1/console.html#_2014-04-16_17_38_16_664 | 20:23 |
sdague | with the heat job at about a 50% fail rate it's bouncing everyone else's patches at this point. I think that if we can't resolve these soon we need to stop voting with it. | 20:25 |
*** radez_g0n3 has quit IRC | 20:26 | |
*** radez_g0n3 has joined #heat | 20:26 | |
*** aweiteka has joined #heat | 20:28 | |
sdake | sdague my guess there with that last trace is that the metadata server is not metadata serving | 20:31 |
sdake | the last trace you showed showed the instance was being orchestrated up to 560 sec and timed out right around 600sec | 20:32 |
sdake | this trace doesn't show anything after 300 sec | 20:32 |
sdake | which implies cloud-init is spinning waiting for the metadata server to provide it the goods | 20:33 |
stevebaker | sdague: \o | 20:33 |
*** Tross1 has quit IRC | 20:33 | |
sdake | sdague I think you mentioned in some cases neutron doesn't setup the network properly? | 20:33 |
*** Tross has joined #heat | 20:33 | |
stevebaker | what is that failed to set hostname error about? | 20:34 |
sdake | no idea never seen that before | 20:35 |
sdake | possibly network connectivity problems not allowing nova to work with the storage? | 20:36 |
*** blomquisg has quit IRC | 20:36 | |
sdake | one thing to eliminate is "does the network actually work properly" prior to blaming wait conditions :) | 20:36 |
sdague | yeh, I think that's probably wise. The initial test conditions here assume a bit too much. So it's hard to work backwards to the failures. | 20:37 |
stevebaker | yes, a waitcondition timeout is a symptom of one failure in a *very* long chain, where most parts are not heat related | 20:37 |
stevebaker | sdague: writing the boot log is meant to help diagnose these, but I'm happy for a whole bunch more debugging to be logged to diagnose these situations | 20:38 |
sdake_ | although if cloud-init fails to set that file, it could cause cloud-init to exit the init process | 20:39 |
sdake_ | and not actually orchestrate the instance | 20:39 |
sdake_ | although I've never seen that happen in years of working on the code | 20:39 |
sdake_ | (the particular error) | 20:39 |
stevebaker | we used to have set hostname failures before the name was pinned to under 63 chars | 20:40 |
therve | It looks like there is a 9 minute window of nothing | 20:40 |
sdague | sdake_: our experience is we run so much more throughput through the gate we see issues that no one else sees, because they happened once, and people just moved past | 20:40 |
sdake_ | sdague yup understood on that point | 20:40 |
sdague | stevebaker: so what kind of additional debug, or assert are you thinking here? | 20:41 |
sdague | we could also be more deliberate about asserting on the way up with things we believe should be working | 20:41 |
stevebaker | sdague: I guess following the whole chain, checking networking connectivity from the server to the heat endpoint, checking heat-api-cfn responds to something | 20:42 |
*** gokrokve has quit IRC | 20:42 | |
sdake_ | sdague most devs also use baremetal rather than virt on virt | 20:43 |
sdague | is there an existing heat-api-cfn client in the tree to do that easily | 20:43 |
therve | Maybe set cloud init debug? | 20:43 |
stevebaker | sdague: if we could find the right assert, then we could at least fail early rather than timeout | 20:43 |
sdague | sdake_: sure, but that should only change timing | 20:43 |
*** tspatzier has quit IRC | 20:43 | |
sdague | this shouldn't *not work* in this environment | 20:43 |
sdague | so it will expose different races | 20:43 |
sdake_ | well virt on virt is a POS when I have tested previously | 20:44 |
sdake_ | stack traces, kernel oopses, machine lockups, etc | 20:44 |
sdake_ | 2014-04-16 17:38:16.628 | [ 0.015000] WARNING: This combination of AMD processors is not suitable for SMP. | 20:44 |
sdake_ | this kernel warning looks problematic | 20:44 |
sdague | well we've not really had any guest issues previously | 20:45 |
stevebaker | sdague: is this an issue that only happens on rax or HP? | 20:45 |
sdague | stevebaker: good question, let me check | 20:46 |
sdague | http://logstash.openstack.org/#eyJmaWVsZHMiOltdLCJzZWFyY2giOiJtZXNzYWdlOlwiaGVhdC5lbmdpbmUucmVzb3VyY2UgV2FpdENvbmRpdGlvblRpbWVvdXQ6IDAgb2YgMSByZWNlaXZlZFwiIEFORCB0YWdzOlwic2NyZWVuLWgtZW5nLnR4dFwiXG4iLCJ0aW1lZnJhbWUiOiI2MDQ4MDAiLCJncmFwaG1vZGUiOiJjb3VudCIsIm9mZnNldCI6MCwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwibW9kZSI6InNjb3JlIiwiYW5hbHl6ZV9maWVsZCI6ImJ1aWxkX25vZGUifQ== | 20:47 |
sdague | seems pretty equal opportunity | 20:47 |
sdake_ | mostly hp cloud | 20:47 |
sdake_ | is the load 50/50 in the gate? | 20:48 |
sdague | well, that's only the top 25 events | 20:48 |
sdague | top 25 facets | 20:48 |
sdague | we run 50 ish | 20:48 |
sdague | This combination of AMD processors is not suitable for SMP. would be a rax error though | 20:49 |
sdague | hp runs intel procs | 20:49 |
sdague | but I really don't think that's the issue | 20:49 |
sdake_ | not sure if it is, just pointing it out | 20:49 |
sdake_ | my top two guesses would be network or virt on virt | 20:51 |
sdake_ | i assume other guest tests don't run f20 | 20:51 |
sdague | correct, we're running cirros | 20:52 |
sdake_ | perhaps f20 has some incompatibility with the hypervisor on those environments | 20:52 |
sdague | could be | 20:52 |
therve | sdague, Presumably ssh access to those instances during a test is out of question? | 20:52 |
sdague | therve: no, not out of the question | 20:53 |
*** aweiteka has quit IRC | 20:53 | |
sdague | that would be completely kosher | 20:53 |
therve | It'd be interesting to see what's going on | 20:53 |
sdague | sure | 20:53 |
therve | We should have "Cloud-init v. 0.7.2 finished" in the logs, so I think it's still running | 20:53 |
sdague | well the nova console log is there in the dump | 20:54 |
therve | I'm blaming SSH access somewhere | 20:54 |
therve | s/SSH/network | 20:54 |
sdake_ | it is either running or exited in some undefined way | 20:54 |
stevebaker | historically ssh was attempted before the waitcondition returned, but it was moved to after because of ssh timeouts | 20:54 |
sdake_ | systemctl should give output of the cloudinit results | 20:54 |
*** e0ne has quit IRC | 20:54 | |
stevebaker | but actually the ssh timeout is probably exactly the same bug as our current waitcondition timeout | 20:54 |
*** e0ne has joined #heat | 20:55 | |
sdake_ | would it be possible to gate with a distro other than f20? | 20:55 |
*** jdob has quit IRC | 20:55 | |
sdake_ | so we can eliminate the distro as a source | 20:55 |
sdague | sdake_: if we can get this on cirros, that's super easy, as the image is there already | 20:55 |
sdake_ | does cirros have the proper cloudinit? | 20:56 |
sdague | sdake_: no, it's got some lightweight scripts that do part of it | 20:56 |
sdague | enough for nova tests to work | 20:57 |
*** asalkeld has joined #heat | 20:57 | |
therve | stevebaker, Actually you're right, we should see SSH host key generation in there | 20:57 |
sdake_ | sdague heat definitely needs cloudinit | 20:57 |
sdague | that should be in the backlog, we were going through that this morning actually to try to figure out | 20:57 |
*** e0ne has quit IRC | 20:57 | |
stevebaker | sdague: that test is to test cfn-init, so cirros is out. But there could be another test which uses cirros to test end to end connectivity, and cfn-signal could be replaced with curl | 20:58 |
*** pafuent1 has quit IRC | 20:58 | |
stevebaker | sdake_: we could only switch to ubuntu when a solution is found for building images in gate | 20:59 |
sdake_ | well with f20 - it could be a kernel bug, a systemd bug, a cloud init bug | 21:00 |
sdake_ | any one of those things could fail and the test would not complete | 21:00 |
sdague | ok, so it feels like we have a bunch of loose ends here. The real question is what made this spike at about 18:00 UTC Apr 14 | 21:01 |
sdake_ | with cirros + curl, it could only be a systemd bug | 21:01 |
sdake_ | sorry with cirros + curl - if the gate works, then it is definitely an f20 problem | 21:01 |
sdake_ | if cirros + curl = if the gate fails - likely a network problem | 21:01 |
sdague | yep, sure | 21:02 |
*** gokrokve has joined #heat | 21:02 | |
stevebaker | well, a problem which f20 reveals. It could equally be a neutron or nova bug | 21:02 |
*** tspatzier has joined #heat | 21:02 | |
sdake_ | stevebaker agree | 21:02 |
sdague | stevebaker: definitely could be, however, we're not seeing the same high level of failure on nova | 21:03 |
sdague | which does ~180 guest starts during a run | 21:03 |
sdague | neutron is only probably at about 60 guest starts I think | 21:03 |
*** kgriffs|afk is now known as kgriffs | 21:03 | |
stevebaker | sdague: try starting ~180 f20s ;) | 21:03 |
sdague | stevebaker: :) | 21:03 |
*** kgriffs is now known as kgriffs|afk | 21:04 | |
sdague | my point is we start 1, and we get really high failure. So the guest hypothesis is potentially interesting | 21:04 |
therve | stevebaker, So the test is doing mostly the same thing as the neutron ones that work, except cfn-int | 21:05 |
sdake_ | ya makes sense | 21:05 |
therve | cfn-init | 21:05 |
therve | Which does a metadata retrieve | 21:05 |
sdake_ | cloud-init does the metadata retrieval, not cfn-init | 21:05 |
therve | sdake_: Heat metadata | 21:06 |
sdague | ok, so we have some long term items, but we also have the short term issue of the 50% failure rate, which is liable to get flaming torches soon | 21:06 |
stevebaker | therve: the cirros test could grep the nova metadata service user_data for some pattern, then curl signal the result. No cloud-init, no heat-cfntools | 21:06 |
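The cirros test stevebaker sketches — grep the nova metadata user_data for a pattern, then curl-signal the result — could look roughly like this. This is a hypothetical sketch, not gate code: `signal_url` and `pattern` are illustrative, and the signal body follows the usual CFN waitcondition JSON convention. The fetch/post callables are injectable so the logic can be exercised without a metadata service.

```python
import json
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/user-data"  # nova metadata service

def check_and_signal(signal_url, pattern, fetch=None, post=None):
    """Fetch user_data from the nova metadata service, look for a marker
    pattern, and POST a SUCCESS/FAILURE signal to the waitcondition URL.
    No cloud-init or heat-cfntools involved, just plain HTTP."""
    fetch = fetch or (lambda url: urllib.request.urlopen(url).read().decode())
    post = post or (lambda url, body: urllib.request.urlopen(
        urllib.request.Request(url, body.encode(),
                               {"Content-Type": "application/json"})))
    status = "SUCCESS" if pattern in fetch(METADATA_URL) else "FAILURE"
    post(signal_url, json.dumps({"Status": status, "Reason": "cirros check",
                                 "Data": "", "UniqueId": "1"}))
    return status
```

On a real cirros guest the same thing would be two lines of wget/curl in a boot script.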
sdake_ | so course of action - make heat gate non-voting until we get to bottom of problem | 21:06 |
sdague | yeah, that was basically the opinion I was going to ask for | 21:06 |
sdake_ | #2 make a cirros test which uses curl to identify if f20 is the cause | 21:07 |
stevebaker | sdague: we could skip that test, and have some gerrit changes which unskip it while we continue to diagnose | 21:07 |
sdague | stevebaker: that's an option as well, skipping the test means we won't get any data on it though | 21:07 |
sdague | vs. non voting, where the runs will still happen | 21:07 |
therve | stevebaker, Right but it doesn't solve the issue of that test | 21:07 |
stevebaker | yeah, non-voting might be best | 21:07 |
sdague | so I'd like heat core team pov on which you guys think is best | 21:08 |
sdague | I'm happy to execute on either of them | 21:08 |
therve | Note that cfn-signal works, the problem seems to be with cfn-init | 21:08 |
stevebaker | therve: it looks like cfn-init isn't being run | 21:08 |
sdake_ | given the # of failures of f20 vs cirros coupled with the number of guest starts, I think we need to identify if f20 is part of the problem | 21:08 |
stevebaker | therve: you mean cloud-init? | 21:08 |
*** jaustinpage has quit IRC | 21:09 | |
therve | stevebaker, No I mean cfn-init | 21:09 |
sdague | ok, got to drop for a few to relocate. I should be back on in 30 mins or so. | 21:09 |
therve | stevebaker, Difference between https://github.com/openstack/tempest/blob/master/tempest/api/orchestration/stacks/templates/cfn_init_signal.yaml and https://github.com/openstack/tempest/blob/master/tempest/api/orchestration/stacks/templates/neutron_basic.yaml | 21:10 |
stevebaker | therve: there is no evidence in the log that cfn-init is failing though | 21:11 |
sdake_ | it doesn't seem that cfn-init is run | 21:12 |
therve | My guess would be that it hangs | 21:12 |
therve | The first thing it does is connecting to heat | 21:13 |
sdake_ | my guess would be that cloud-init hangs :) | 21:13 |
stevebaker | therve: no, cfn-init just consumes the metadata which cloud-init (and loguserdata.py) has already written to disk | 21:13 |
therve | stevebaker, ? It connects to heat metadata server, no? | 21:13 |
sdake_ | cfn-init does not connect to any server unless yum or deb are specified as files | 21:14 |
sdake_ | it reads off the local disk | 21:14 |
therve | That's not what I understand of the code | 21:14 |
stevebaker | therve: no. cloud-init fetches the user_data from the nova metadata server with an http GET | 21:15 |
stevebaker | therve: the user_data is a mime package containing the cfn metadata, plus loguserdata.py which cloud-init invokes to write that metadata to disk | 21:15 |
stevebaker | (plus a bunch of other stuff) | 21:15 |
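The user_data packaging stevebaker describes — a MIME multipart blob that cloud-init fetches, with one part carrying the cfn metadata and another the helper script that writes it to disk — can be sketched with the stdlib email package. This is a simplified illustration, not heat's actual assembly code; the part subtypes and filenames here are assumptions.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_userdata(cfn_metadata, helper_script):
    """Assemble a multipart user_data blob of the shape described above:
    one part holds the cfn metadata, another holds the script
    (loguserdata.py in heat's case) that writes that metadata to disk
    for cfn-init to consume later."""
    msg = MIMEMultipart()
    meta = MIMEText(cfn_metadata, "x-cfninitdata")  # subtype is illustrative
    meta.add_header("Content-Disposition", "attachment",
                    filename="cfn-init-data")
    script = MIMEText(helper_script, "x-shellscript")
    script.add_header("Content-Disposition", "attachment",
                      filename="loguserdata.py")
    msg.attach(meta)
    msg.attach(script)
    return msg.as_string()
```

cloud-init walks the parts by MIME type and dispatches each to the matching handler, which is why the whole bundle can ride in a single nova user_data field.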
therve | stevebaker, https://github.com/openstack/heat-cfntools/blob/master/heat_cfntools/cfntools/cfn_helper.py#L1119 | 21:16 |
therve | It seems we first try to get the remote metadata | 21:16 |
therve | And then fall back to local files | 21:16 |
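The retrieval order therve is pointing at in cfn_helper — try the heat metadata server first, then fall back to the copy cloud-init already wrote to disk — reduces to a pattern like this. A minimal sketch, not the actual cfn_helper code; the local path is the conventional heat-cfntools location and the remote fetcher is injectable so the fallback is easy to exercise.

```python
import json

def get_metadata(remote_fetch,
                 local_path="/var/lib/heat-cfntools/cfn-init-data"):
    """Return stack metadata: remote first, local file as fallback.
    `remote_fetch` is any callable that returns parsed metadata or
    raises/returns None when the metadata server is unreachable."""
    try:
        metadata = remote_fetch()
        if metadata is not None:
            return metadata
    except Exception:
        pass  # remote lookup failed; fall back to the on-disk copy
    with open(local_path) as f:
        return json.load(f)
```

The relevance to the gate bug: if the remote call hangs rather than failing fast, cfn-init never reaches the fallback, which matches the "it just never runs" symptom in the logs.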
sdake_ | therve the getting of the remote metadata described in line 1119 happens via cloud-init | 21:17 |
sdake_ | at least that is how it behaved in the past :) | 21:18 |
therve | remote_metadata seems pretty remote to me | 21:19 |
stevebaker | therve: right, ok. We could try not writing out /etc/cfn/cfn-credentials to see if the test gets further, but it would probably fail for the same reason when attempting to cfn-signal | 21:19 |
sdake | therve good point | 21:19 |
therve | Yeah I'd be curious, but anyway | 21:20 |
sdake | therve my apologies that is new code :) | 21:20 |
therve | stevebaker, Maybe we can use a custom image and set debug to some stuff? | 21:20 |
therve | Like cloud-init and cfn-tools | 21:20 |
stevebaker | therve: yeah, sorry. I forgot the reason I wrote out cfn-credentials | 21:20 |
*** andrew_plunk has quit IRC | 21:21 | |
*** zns has quit IRC | 21:31 | |
*** vijendar has quit IRC | 21:32 | |
*** jistr has quit IRC | 21:33 | |
*** achampion has quit IRC | 21:34 | |
*** dims has quit IRC | 21:52 | |
*** dims has joined #heat | 22:04 | |
*** e0ne has joined #heat | 22:05 | |
*** e0ne has quit IRC | 22:10 | |
*** lindsayk has quit IRC | 22:13 | |
*** lindsayk has joined #heat | 22:13 | |
*** lindsayk has quit IRC | 22:13 | |
*** lindsayk has joined #heat | 22:15 | |
*** zns has joined #heat | 22:16 | |
gokrokve | Hi. Is it possible to use Ceilometer alarms for HARestarter instead of CloudWatch alarms? | 22:19 |
sdague | stevebaker / therve / sdake_ : if you guys are good with this, please +1 - https://review.openstack.org/88100 - it's making the job non voting | 22:20 |
*** tspatzier has quit IRC | 22:20 | |
*** Tross1 has joined #heat | 22:28 | |
*** lindsayk has quit IRC | 22:30 | |
*** lindsayk has joined #heat | 22:30 | |
*** Tross has quit IRC | 22:30 | |
*** sjmc7 has quit IRC | 22:34 | |
stevebaker | sdague: +1 | 22:36 |
stevebaker | and if any heat-core +2s a change without checking the reason for a heat-slow failure, I keel yooo! | 22:36 |
stevebaker | winky face | 22:37 |
sdague | hehe | 22:38 |
stevebaker | gokrokve: Yes, but I think you still need to use cfn-push-stats, so the metrics go through heat on the way to ceilometer | 22:38 |
SpamapS | stevebaker: hah, that ventriliquist guy lives in my neighborhood.. ;) | 22:38 |
gokrokve | stevebaker: Do I need to configure cfn-credentials for that? | 22:39 |
stevebaker | gokrokve: um, yes? | 22:40 |
*** yogesh has quit IRC | 22:40 | |
stevebaker | gokrokve: there must be an example template somewhere | 22:40 |
gokrokve | stevebaker: What about cfn-hup? Should it be in crontab for that? | 22:40 |
gokrokve | stevebaker: I've got a JSON example from the CERN guys. It's like 2 pages of bash magic in user-data :-( | 22:41 |
mattoliverau | morning all | 22:43 |
*** IlyaE has quit IRC | 22:44 | |
*** zns has quit IRC | 22:46 | |
stevebaker | gokrokve: I would avoid cfn-hup, it's a very complicated way of achieving configuration updates | 22:48 |
gokrokve | stevebaker: I see it in all examples. So what will be the best way to setup a VM to report status to Ceilometer? | 22:48 |
stevebaker | gokrokve: look for cfn-push-stats | 22:48 |
stevebaker | gokrokve: call it from cron or a bash loop | 22:49 |
gokrokve | stevebaker: Cool. So I need to set up cfn-credentials with some secure key and then set up crontab to run cfn-push-stats | 22:50 |
stevebaker | gokrokve: yes | 22:50 |
gokrokve | stevebaker: Then create a Ceilometer alarm for specific instance gauge | 22:50 |
gokrokve | stevebaker: Ok. Thanks. Will try to figure out how to glue this all together | 22:50 |
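The "call it from cron or a bash loop" approach stevebaker suggests can be glued together as a small Python loop instead. This is a hypothetical sketch: the command, interval, and the idea of bounding the iterations are all illustrative, and the runner/sleeper are injectable so the loop can be tested without spawning processes.

```python
import subprocess
import time

def push_stats_loop(cmd, interval, iterations,
                    run=subprocess.call, sleep=time.sleep):
    """Invoke cfn-push-stats (or any reporting command) every `interval`
    seconds, `iterations` times, returning the exit codes. A crontab
    entry running the same command once a minute is equivalent."""
    codes = []
    for _ in range(iterations):
        codes.append(run(cmd))
        sleep(interval)
    return codes
```

With the pieces gokrokve lists — cfn-credentials in place, this loop (or a crontab entry) pushing the metric, and a Ceilometer alarm on the resulting gauge — the HARestarter wiring is complete.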
stevebaker | gokrokve: or you could use python instead of cfn-push-stats https://review.openstack.org/#/c/44967/5/tempest/scenario/orchestration/test_autoscaling.yaml | 22:51 |
*** lnxnut_ has joined #heat | 22:51 | |
stevebaker | gokrokve: which would be pure boto | 22:51 |
*** zns has joined #heat | 22:51 | |
*** lnxnut has quit IRC | 22:51 | |
*** lnxnut_ has quit IRC | 22:51 | |
*** lnxnut has joined #heat | 22:52 | |
gokrokve | stevebaker: That is great. I will probably use python version as it is much clearer. | 22:52 |
SpamapS | stevebaker: so, slow polling for metadata... | 22:59 |
SpamapS | stevebaker: do we actually need to parse the whole stack, to just pull the metadata for a server? | 22:59 |
*** david-lyle has quit IRC | 23:01 | |
stevebaker | SpamapS: the current implementation of _authorize_stack_user requires a parsed stack | 23:01 |
*** adeb_ has quit IRC | 23:01 | |
SpamapS | stevebaker: so here's a thought for a potential optimization: shove metadata into swift, and hand out tempurls to said metadata. | 23:02 |
stevebaker | SpamapS: my POLL_DEPLOYMENTS plan would be very low overhead, it's just a formatted SQL query | 23:03 |
SpamapS | stevebaker: if that is too radical, we could also just precompute the inputs for _authorize_stack_user and save them in resource_data. | 23:03 |
stevebaker | yeah, there are lots of potential optimisations | 23:04 |
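SpamapS's tempurl idea — push the metadata into Swift and hand servers a pre-signed URL so polling never touches heat-engine — rests on Swift's standard TempURL signature: an HMAC-SHA1 over the method, expiry timestamp, and object path, joined by newlines. A minimal sketch, assuming the container/object path and TTL shown; `now` is injectable only to make the function testable.

```python
import hmac
import time
from hashlib import sha1

def make_tempurl(key, method, path, ttl, now=time.time):
    """Build a Swift TempURL query string for `path`, valid for `ttl`
    seconds. `key` is the account/container temp-url key; the signature
    covers method, expiry epoch, and object path separated by newlines."""
    expires = int(now()) + ttl
    body = "%s\n%d\n%s" % (method, expires, path)
    sig = hmac.new(key.encode(), body.encode(), sha1).hexdigest()
    return "%s?temp_url_sig=%s&temp_url_expires=%d" % (path, sig, expires)
```

The appeal for the 100%-CPU problem above: signature verification happens entirely in the Swift proxy, so each guest's 30-second poll costs heat-engine nothing.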
stevebaker | SpamapS: does raising max_pool_size in heat.conf mitigate this? | 23:05 |
SpamapS | stevebaker: given that heat-engine is hitting 100% CPU, I think that will just change it from 500 errors to timeouts | 23:05 |
stevebaker | yeah, ok | 23:05 |
*** IlyaE has joined #heat | 23:06 | |
stevebaker | SpamapS: Once https://review.openstack.org/#/c/84269/ has landed I'll carry on with a collector which calls heatclient.software_deployments.metadata directly | 23:08 |
SpamapS | stevebaker: it's got my +2 :) | 23:11 |
*** killer_prince has quit IRC | 23:14 | |
*** ifarkas has quit IRC | 23:14 | |
stevebaker | SpamapS: cool. do you see any issue with getting a new keystone token every 30 seconds for every occ based server? | 23:14 |
SpamapS | stevebaker: seems like a huge waste. | 23:17 |
SpamapS | stevebaker: have to run for a while.. but I'll be back in a while. | 23:17 |
stevebaker | SpamapS: it does, doesn't it. the collector should really keep using the token until it is close to expiring | 23:20 |
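The token-reuse behaviour stevebaker lands on — keep using the keystone token until it is close to expiring, rather than fetching a fresh one on every 30-second poll — is a small caching wrapper. This is an illustrative sketch, not collector code: `fetch_token` stands in for whatever keystone auth call the collector uses and must return a (token, expires_at_epoch) pair.

```python
import time

class TokenCache:
    """Reuse a token until within `margin` seconds of its expiry,
    then fetch a new one. `clock` is injectable for testing."""

    def __init__(self, fetch_token, margin=60, clock=time.time):
        self._fetch = fetch_token
        self._margin = margin
        self._clock = clock
        self._token = None
        self._expires = 0.0

    def get(self):
        # Refresh only when unset or close enough to expiry to be risky.
        if self._token is None or self._clock() >= self._expires - self._margin:
            self._token, self._expires = self._fetch()
        return self._token
```

With a typical one-hour token lifetime this turns ~120 keystone round trips per server per hour into one.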
*** lazy_prince has joined #heat | 23:20 | |
*** lazy_prince is now known as killer_prince | 23:20 | |
*** gokrokve has quit IRC | 23:23 | |
*** vinsh has quit IRC | 23:26 | |
*** asalkeld_ has joined #heat | 23:30 | |
*** lipinski has quit IRC | 23:30 | |
*** asalkeld has quit IRC | 23:30 | |
*** zns has quit IRC | 23:32 | |
*** arbylee has quit IRC | 23:34 | |
*** achampion has joined #heat | 23:36 | |
*** chandan_kumar has quit IRC | 23:37 | |
*** andersonvom has quit IRC | 23:40 | |
cmyster | I am going over the API in http://api.openstack.org/api-ref-orchestration.html and I was wondering how can a software config be updated? | 23:41 |
stevebaker | cmyster: by creating a new one with different contents, they are designed to be immutable | 23:42 |
cmyster | stevebaker: needs to be the same name or something? | 23:42 |
*** asalkeld_ is now known as asalkeld | 23:43 | |
*** arbylee has joined #heat | 23:43 | |
stevebaker | cmyster: the deployment resource associates a config with a server, and creates derived configs whenever input_values changed. Its the derived config which the server ends up with | 23:43 |
*** arbylee has quit IRC | 23:43 | |
cmyster | so from a user point of view, to replace a config is to delete and recreate? | 23:44 |
*** andersonvom has joined #heat | 23:44 | |
*** cmyster has quit IRC | 23:49 | |
*** andersonvom has quit IRC | 23:50 | |
*** tango has quit IRC | 23:50 | |
*** ramishra has joined #heat | 23:52 | |
*** ramishra has quit IRC | 23:56 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!