Monday, 2016-11-07

*** flepied has quit IRC00:02
*** gfidente has joined #tripleo00:33
*** gfidente has quit IRC00:33
*** gfidente has joined #tripleo00:33
*** jkilpatr has quit IRC00:36
*** yamahata has joined #tripleo00:40
*** flepied has joined #tripleo00:49
openstackgerritMerged openstack/tripleo-heat-templates: swift/proxy: remove swift::proxy::ceilometer::rabbit_host
*** limao has joined #tripleo01:01
*** limao has quit IRC01:02
*** limao has joined #tripleo01:02
*** lblanchard has joined #tripleo01:41
*** lblanchard has quit IRC01:46
openstackgerritSteve Baker proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles
*** maeca1 has joined #tripleo02:14
*** maeca1 has quit IRC02:27
*** dmacpher has joined #tripleo02:36
*** bkopilov has quit IRC02:57
*** tzumainn has joined #tripleo03:08
openstackgerritSteve Baker proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles
*** ElCoyote_ has quit IRC03:21
*** links has joined #tripleo03:55
*** coolsvap has joined #tripleo04:10
*** bkopilov has joined #tripleo04:17
*** ayoung has quit IRC04:25
openstackgerritAnshul Jain proposed openstack/diskimage-builder: DIB element to support cinder local attach/detach functionality
*** chandankumar has joined #tripleo04:43
*** chlong has joined #tripleo04:47
*** tzumainn has quit IRC04:53
*** tzumainn has joined #tripleo04:53
*** masco has joined #tripleo05:04
*** sudipto has joined #tripleo05:05
*** sudipto_ has joined #tripleo05:05
*** skramaja has joined #tripleo05:10
*** limao has quit IRC05:13
*** limao has joined #tripleo05:13
*** oshvartz has quit IRC05:28
*** ramishra has quit IRC05:29
*** ramishra has joined #tripleo05:31
*** sudswas__ has joined #tripleo05:52
*** sudipto_ has quit IRC05:55
*** sudipto has quit IRC05:55
*** sudipto has joined #tripleo05:56
*** numans has joined #tripleo06:01
*** xuao has joined #tripleo06:02
*** rcernin has joined #tripleo06:03
*** xuao has quit IRC06:06
*** ealcaniz has joined #tripleo06:16
*** mcornea has joined #tripleo06:17
*** abregman has joined #tripleo06:18
*** panda|Zz is now known as panda|sick06:32
*** lmiccini has joined #tripleo06:33
*** pmannidi_ has quit IRC06:34
*** bfournie1 has quit IRC06:41
*** tzumainn has quit IRC06:42
*** Vijayendra has quit IRC06:43
*** pmannidi_ has joined #tripleo06:50
*** oshvartz has joined #tripleo06:54
*** bana_k has joined #tripleo06:59
*** tesseract has joined #tripleo07:06
*** tesseract is now known as Guest7131007:06
cschwedeHello! I have a small review request: needs only one more +2/+A for stable/newton - that would be very helpful07:15
*** pcaruana has joined #tripleo07:17
*** pmannidi_ has quit IRC07:18
cmystercschwede: that's actually pretty nifty for the undercloud. I can see many more options going into needless07:19
*** anshul has joined #tripleo07:19
cmysterbut I don't have +2 rights07:19
*** limao has quit IRC07:19
cschwedecmyster: thx for looking at it! yes, might be useful for other things too - curious which services could be disabled on the UC as well?07:20
cmystercschwede: I can see a dynamic list here actually, since undercloud.conf can set things like telemetry=bool this can make sure it's not there (but probably a misuse)07:22
*** rasca has joined #tripleo07:23
*** oshvartz has quit IRC07:24
*** limao has joined #tripleo07:25
*** ebarrera has joined #tripleo07:26
*** bana_k has quit IRC07:34
*** ramishra has quit IRC07:38
*** ramishra has joined #tripleo07:38
*** cylopez has joined #tripleo07:41
*** asalkeld has joined #tripleo07:43
*** asalkeld has quit IRC07:47
*** ebalduf has quit IRC07:48
*** florianf has joined #tripleo07:49
*** dsariel has joined #tripleo07:50
*** athomas has joined #tripleo07:53
*** hjensas has joined #tripleo07:55
d0ugalapetrich: oh, was the question about testing the password patch?07:59
apetrichd0ugal, yeah :)07:59
apetrichd0ugal, no wait at all. :)07:59
d0ugalapetrich: cool, it has actually landed in stable newton07:59
d0ugalapetrich: can you give it a go and see if you run into any other issues?08:00
d0ugalapetrich: so you'll want to make sure you have and (the second hasn't merged yet)08:01
d0ugaloops, the first hasn't merged - got them the wrong way around.08:01
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Pass clients to get the get_password function
*** asalkeld has joined #tripleo08:01
*** chem has joined #tripleo08:04
*** jprovazn has joined #tripleo08:05
*** tremble has joined #tripleo08:06
*** tremble has joined #tripleo08:06
*** asalkeld has quit IRC08:06
apetrichd0ugal, anyway I check if both are there anyway08:08
d0ugalapetrich: Thanks08:09
*** b00tcat has quit IRC08:09
*** b00tcat has joined #tripleo08:10
*** ccamacho has joined #tripleo08:10
*** fzdarsky|afk has joined #tripleo08:13
ccamachomorning guys!08:13
*** fzdarsky|afk is now known as fzdarsky08:14
*** chem has quit IRC08:17
*** asalkeld has joined #tripleo08:19
*** asalkeld has quit IRC08:19
bandinimorning *08:19
*** jlinkes has joined #tripleo08:21
matbud0ugal: i saw your review merged, thanks08:27
matbud0ugal: i didn't test it yet, but i will08:27
*** Vijayendra has joined #tripleo08:28
*** pmannidi has joined #tripleo08:29
*** aufi has joined #tripleo08:31
*** liverpooler has joined #tripleo08:31
*** abregman_ has joined #tripleo08:33
*** abregman_ has quit IRC08:33
*** abregman has quit IRC08:36
*** Vijayendra has quit IRC08:40
*** amoralej|off is now known as amoralej08:41
*** jpena|off is now known as jpena08:41
*** chlong has quit IRC08:41
*** abregman has joined #tripleo08:43
*** ohamada has joined #tripleo08:46
*** chem has joined #tripleo08:46
-openstackstatus- NOTICE: Gerrit is going to be restarted due to slowness and proxy errors08:47
*** openstackgerrit has quit IRC08:48
*** openstackgerrit has joined #tripleo08:48
*** jpich has joined #tripleo08:48
*** milan has joined #tripleo08:49
*** dbecker has joined #tripleo08:52
*** percevalbot has quit IRC08:55
*** gchamoul is now known as gchamoul|afk08:56
*** percevalbot has joined #tripleo08:56
*** shardy has joined #tripleo09:00
openstackgerritMerged openstack/python-tripleoclient: Updated from global requirements
*** gchamoul|afk is now known as gchamoul09:04
*** dmacpher has quit IRC09:07
*** abregman is now known as abregman|mtg09:07
*** pblaho has joined #tripleo09:08
*** abregman_ has joined #tripleo09:09
*** abregman|mtg has quit IRC09:12
openstackgerritJulie Pichon proposed openstack/tripleo-common: Add CephClusterFSID to generated passwords
*** lucas-afk is now known as lucasagomes09:18
cmystermorning lucasagomes09:19
lucasagomescmyster, morning09:19
*** hewbrocca_afk is now known as hewbrocca09:25
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements
*** dtantsur|afk is now known as dtantsur09:28
*** karthiks has joined #tripleo09:29
*** percevalbot has quit IRC09:30
*** iranzo has joined #tripleo09:31
*** iranzo has joined #tripleo09:31
*** percevalbot has joined #tripleo09:34
hewbroccafolks, how is CI looking this morning09:41
hewbroccaall unblocked and green and stuff?09:41
matbuhewbrocca: it looks ok (afaik, periodic jobs is green)09:44
*** derekh has joined #tripleo09:44
openstackgerritmathieu bultel proposed openstack-infra/tripleo-ci: Implement overcloud upgrade job - Mitaka -> Newton
shadowerI'm seeing an OVB HA failures -- not sure whether related to the previous problems:
openstackgerritMerged openstack/tripleo-ui: Refactor *DriverFields components
*** bogdando has quit IRC09:54
*** shardy has quit IRC09:55
therveRedis issue seems to still be present :/09:56
shadoweryeah :/09:56
thervematbu, Same for periodic AFAICT:
*** shardy has joined #tripleo09:57
*** zoli|gone is now known as zoli09:57
*** zoli is now known as zoliXXL09:57
*** dsariel has quit IRC10:00
d0ugalWhere can I find the reason for a package build failing?10:01
d0ugalI am trying to track down the failure behind:
*** Vijayendra has joined #tripleo10:03
zoliXXLgood morning10:03
jpichd0ugal: delorean_repos.tar.xz -> rpmbuild.log10:05
jpichd0ugal: " OSError: [Errno 17] File exists: '/tmp/tht/tripleo-heat-templates'" ...maybe transient? That's weird10:05
d0ugaljpich: aha, thanks!10:05
d0ugaljpich: yeah, that does seem weird.10:06
*** akrivoka has joined #tripleo10:06
*** shardy_ has joined #tripleo10:07
*** rickflare has quit IRC10:07
*** shardy has quit IRC10:10
*** yamahata has quit IRC10:11
*** rickflare has joined #tripleo10:11
d0ugaljpich: I rechecked it, so we shall see.10:14
jpichd0ugal: Thanks!10:14
*** shardy_ is now known as shardy10:14
shardyshadower: Hey, there's a couple of small comments on - since it looks like we'll need another recheck do you want to see if we should address them now instead of rechecking this revision?10:15
shadowershardy: all  right, sure10:16
dtantsurso, are things worth rechecking now?10:18
hewbroccaseems like we still have a Redis blocker10:18
hewbroccaWho is handling that issue?10:18
* hewbrocca mildly disturbed by the lack of responses10:21
shardySo yeah bug #1638350 has been fixed, but we still seem to have HA test failures10:21
openstackbug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,Fix released] - Assigned to Gabriele Cerami (gcerami)10:21
shardyI thought that bug had fixed all-the-HA-things :(10:21
shardypanda|sick and bandini were looking at redis things on Friday, let me see what the latest failures look like10:21
derekhhewbrocca: shardy I believe a new redis package has been built
hewbroccashardy: thanks. Do we also need a telemetry person involved, since they are the only actual Redis consumers?10:22
derekhhewbrocca: shardy but I've just looked at the logs for a recent job and we don't seem to be using it yet10:22
shardyhewbrocca: not sure yet - cistatus actually shows the HA job passing quite regularly earlier today:10:23
derekhJOBLOGS ]$  grep redis-3 overcloud-controller-1.tar.xz_/var/log/host_info.txt10:23
jpichderekh: I think there's some magic that might need to be done around CBS tags for CI to pick it up?10:23
hewbroccashardy: So one wonders if it isn't slowness/timeout related10:24
derekhjpich: probably, looking in the current repo's now10:24
bandiniwell when redis does not come up (for whatever reason), gnocchi services seem to use up loads of CPU10:24
hewbrocca*that* is a telemetry issue10:25
shardyyeah we've seen this before, the gnocchi services spin forever eating CPU when there's any kind of issue, instead of eventually failing and declaring failure10:26
derekhThe new version of redis is in this (pending?) repo
hewbroccathis is actually a significant bug10:26
derekhdo we use that?10:26
shardyso this failed about an hour ago, and it's the badstatusline error10:26
hewbroccaredis becomes temporarily unavailable and gnocchi DOS-es your cloud10:26
shardySo that's which isn't yet fixed10:27
openstackLaunchpad bug 1638908 in tripleo-quickstart "Overcloud deployment fails in minimal configuration with ('Connection aborted.', BadStatusLine("''",))" [Undecided,In progress] - Assigned to Alfredo Moralejo (amoralej)10:27
*** Vijayendra has quit IRC10:27
*** jd__ has joined #tripleo10:29
hewbroccajd__: !10:29
jd__how can I help you gentlemen?10:29
hewbroccashardy, bandini I have summoned the gnocchi maintainer10:29
shardy is trying to fix the badstatusline thing, but it's been proposed to tripleo-quickstart10:29
*** eglynn has joined #tripleo10:29
derekhThe new version of redis is in the delorean-deps repository we use, I don't know how long its been there but it looks like that problem should now be solved
derekhshardy: jpich ^10:29
jpichderekh: Great :)10:30
shardyjd__: Hi!10:30
shardyjd__: so we're trying to understand the expected behavior of gnocchi-metricsd when there's some issue on startup10:30
shardywe've had a couple of cases recently when an error (in one case a packaging bug, and most recently a problem with redis) makes the service fail to start10:31
shardybut it spins forever, eating lots of CPU, instead of failing and declaring the service failed10:31
jd__shardy: it should retry, but not as aggressive as fast10:32
jd__shardy: do you have any hint where it loops too fast?10:32
jd__or what was failing maybe?10:32
*** milan has quit IRC10:32
*** oshvartz has joined #tripleo10:32
amoralejbut we don't have gnocchi in the undercloud in the cases where we hit, i'd say10:33
openstackLaunchpad bug 1638908 in tripleo-quickstart "Overcloud deployment fails in minimal configuration with ('Connection aborted.', BadStatusLine("''",))" [Undecided,In progress] - Assigned to Alfredo Moralejo (amoralej)10:33
shardythat's an example10:33
shardythe previous time we hit it this was similar, with a bad version of cotyledon IIRC10:34
amoralejbut that's in overcloud10:34
amoralejthe bug i reported is referred to the issue in undercloud10:34
sshnaidmderekh, hi10:34
shardyamoralej: Yeah it's two different problems10:34
amoraleji'd say so10:34
*** pblaho has quit IRC10:35
amoralejwhat i've observed in these cases it's very long response time to heat GET api calls10:35
jd__shardy: ok, it's a bug, I think I know what it is10:35
sshnaidmderekh, do you know what the timeout here mean? it doesn't seems like timeout for the whole job, right? is it for waiting for a environment only?
shardyamoralej: actually10:35
jd__shardy: I'll write a fix right now :)10:35
shardywe turned telemetry services back on by default recently10:36
hewbroccajd__: thanks10:36
shardyjd__: great, thanks! :)10:36
hewbroccashardy: you happen to know is there a bugzilla for this issue10:36
jd__shardy: thanks for reporting :)10:36
*** jaosorior has joined #tripleo10:38
*** limao has quit IRC10:38
amoralejshardy, we hit the issue even without telemetry enabled,
*** chem has quit IRC10:39
amoralejlet me check last occurrence10:39
*** chem has joined #tripleo10:39
jaosoriorHey guys, just to notify, I'm feeling quite sick, so won't be doing much today :/10:43
derekhsshnaidm: that timeout should be the max amount of time the testenv exists, iirc it will be deleted if it is hit10:43
shardyhope you feel better soon jaosorior10:43
amoralejshardy, enable_telemetry=True in undercloud doesn't enable gnocchi, only ceilometer services10:44
amoralejwith database dispatcher10:45
sshnaidmderekh, and then it should kill the job as I see from the comments, although it doesn't seem to work this way...10:45
derekhsshnaidm: what makes you say it's not working this way?10:46
shardyamoralej: ah, definitely two issues then, thanks for confirming10:46
panda|sickoh, my still redis problems ? maybe we should merge ?10:47
panda|sickthat was green for me twice yesterday10:47
panda|sickI figured with the new package it wasn't needed anymore10:48
sshnaidmderekh, sorry, found out now that it worked :)10:49
sshnaidmderekh, I'm looking for a way to have the postci function to work before zuul kills everything and doesn't post logs10:49
*** Vijayendra has joined #tripleo10:52
derekhsshnaidm: I'd imagine the way to do this is to add another publisher to the job (I'm not 100% sure though)
derekhpanda|sick: as far as I can see the jobs that have run so far are still using python-redis-2.10.3-1.el7.noarch10:53
derekhpanda|sick: but the new version is there in the repo now10:53
panda|sicksshnaidm: wasn't arxcruz working on the same thing ?10:53
derekhpanda|sick: so maybe the repo just hadn't updated up to now10:53
sshnaidmpanda|sick, yeah, I'm trying to look too10:54
panda|sickderekh: so my tests yesterday worked only because of low usage of rh1 ? ... geez10:54
sshnaidmarxcruz, ^^10:54
arxcruzsshnaidm: panda|sick hey10:54
derekhpanda|sick: or maybe the proxy should be restarted in case it has the repo metadata cached10:55
* arxcruz reading10:55
arxcruzwe got a timeout and logs collected on patchset 410:55
sshnaidmarxcruz, derekh : but for publisher we need to have the logs already prepared, right? or is it possible to write publisher that will duplicate postci function?10:56
shardySo references which is a new version of redis10:56
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,Fix released] - Assigned to Gabriele Cerami (gcerami)10:56
*** pkovar has joined #tripleo10:56
panda|sickderekh: have you checked if the package is in testing repo ? asking to apevec in rdo10:56
shardythat exists in
arxcruzsshnaidm: publisher will only get the logs in /var/log and upload right?10:56
arxcruzor am i wrong ?10:56
shardybut are we also missing a puppet-redis update?10:56
sshnaidmarxcruz, it's finished, not timeouted10:56
* shardy needs more coffee10:56
derekhsshnaidm: actually now that I think about it that wouldn't work, as the compute nodes would be gone10:57
panda|sickshardy: puppet-redis should not be needed in this case10:57
panda|sicka new *10:57
derekhsshnaidm: *overcloud nodes10:57
sshnaidmderekh, yeah, testenv-client destroys everything10:57
*** zoliXXL is now known as zoli|lunch10:57
*** thrash|g0ne is now known as thrash10:57
sshnaidmderekh, arxcruz I thought, maybe to wrap everything into additional timeout in toci_instack_*.sh scripts..10:58
derekhpanda|sick: the latest HA job I'm following has passed, will check to see what version of redis is used when its done pushing logs10:58
arxcruzsshnaidm: but we will still have the global timeout from devstack-gate10:58
openstackgerritMichele Baldessari proposed openstack/puppet-tripleo: WIP DO NOT MERGE Initial Composable HA
sshnaidmarxcruz, this job was timeouted when job finished and postci function started11:00
sshnaidmarxcruz, the problem is when job is timeouted before postci starts11:00
openstackgerritRobin Cernin proposed openstack/tripleo-validations: Validation stonith device exists in OpenStack Platform HA cluster
shardySo yeah, the failure I was looking at from a couple of hours ago has the old version of redis still11:01
*** lmiccini has quit IRC11:01
jpichhonza: Good morning! There are folks interested in following the work around proper logging for the UI, I'm wondering if you could move/open the blueprint in the TripleO tracker and keep it up to date? I was going to point them to but it doesn't look like the existing patches are even linked there :( (If I'm looking in the wrong place, please let me know!)11:01
shardyderekh: I notice the job in question is using a cached overcloud-full which may be related?11:02
arxcruzIt would be so more easy if we add a timeout hook function in devstack-gate... :(11:02
arxcruzsshnaidm: well, the timeout will be triggered 13 minutes before the global timeout11:02
arxcruzwhich is plenty of time to postci script runs11:03
panda|sickshardy: derekh new redis package only arrived in ocata this morning.11:03
panda|sickmy tests yesterday likely passed because of the low usage :(11:04
panda|sickthe package arrived at about 9am, maybe in time for the periodic jobs, and to upload a new base image them11:05
shardyOk so we need a periodic promotion to update the cached image11:05
*** lmiccini has joined #tripleo11:05
shardyor temporarily disable using cached images11:05
*** ckyriakidou has joined #tripleo11:07
derekhshardy: the cached image should get the new redis, we do a yum update in it11:07
panda|sickderekh: yeah, right!11:08
shardyderekh: Ok, I'll try rechecking as I guess the job I'm looking at started just before the repo got updated11:08
panda|sickperiodic jobs this morning were still using redis-3.2.4-1.el7.x86_6411:08
sshnaidmpanda|sick, I see redis-3.2.4-2.el7.x86_6411:08
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Add an optional extra node admin ssh key parameter
derekhpanda|sick: shardy Ok, the job I've been following has passed and has the new version of redis, so I think we can stop worrying about the redis problem11:08
derekhshardy: panda|sick
shardyshadower: ^^ I pushed an edit just changing the parameter name11:09
panda|sicksshnaidm: in it was redis-3.2.4-1.el7.x86_6411:09
shardywe can see if that picks up the latest redis version11:09
panda|sicksshnaidm: this is the most annoying issue ever11:10
shadowershardy: thanks. So should I merge the two resources, into one, too?11:10
panda|sickderekh: great11:10
sshnaidmpanda|sick, I see11:11
panda|sicksshnaidm: but yea, since we are updating, everything should be back to normal again ...11:11
openstackgerritRobin Cernin proposed openstack/tripleo-validations: Validation stonith device exists in OpenStack Platform HA cluster
sshnaidmshardy, is "bad status" error handled by somebody?11:12
panda|sicksshnaidm: but then again, this is what I said yesterday, and in some degree friday11:12
yolandahi, i'm starting to implement blueprint to support full disk images
yolandapart of the blueprint is to allow not uploading kernel and vmlinuz images, but I wonder about the best way to do it on the client11:13
sshnaidmarxcruz, try to talk to clarkb on #openstack-infra, he'll suggest something if it's possible to do on devstack-gate side11:13
arxcruzsshnaidm: so, the removing some minutes from testenv-client timeout won't work ?11:13
yolandaif just skip the kernel and vmlinuz validation, or add some flag in the overcloud image upload command, such as overcloud image upload --full , to ensure that we pass a full disk image11:13
yolandaEmilienM, or other cores, what are your thoughts?11:13
sshnaidmarxcruz, nope, it just kills everything, but we need to fetch logs *before* this11:14
panda|sickarxcruz: testenv-client is the one responsible of creating and destroying test environments11:14
sshnaidmarxcruz, maybe it's an option to add callback to testenv-client actually11:15
sshnaidmderekh, ^^11:15
arxcruzsshnaidm: panda|sick yeah, but there's the trap when testenv-client fails11:15
arxcruzso it runs the postci11:15
*** chlong has joined #tripleo11:15
sshnaidmarxcruz, testenv doesn't run postci11:15
panda|sickstarting to get dizzy again, be back later.11:16
arxcruzsshnaidm: yeah, I know, but on has the trap on line 50 that will catch the exit from testenv-client right ?11:16
sshnaidmarxcruz, it traps not testenv-client, but everything that is executed in itself11:17
sshnaidmarxcruz, testenv-client runs, it's level above11:18
arxcruzsshnaidm: oh shit, I though was the opposite way, who calls testenv-client11:18
arxcruzyeah, you're right11:19
shardyshadower: I don't mind tbh, I'm fine to merge it as-is, but if you'd like to do that cleanup then feel free11:19
shadowershardy: just noticed that the edit has a bug anyway, so I'll just resubmit it11:20
shadowerit'll make the change smaller, too11:20
derekhsshnaidm: ya, some kind of post-run handler inside of testenv client could work, this won't work if we hit the Zuul timeout but it's a lot better than what we have11:20
shardyshadower: ack, thanks!11:20
openstackgerritTomas Sedovic proposed openstack/tripleo-heat-templates: Add an optional extra node admin ssh key parameter
b00tcathi, how can I get +workflow here? need to commit something else on this project and would be nice to have this commit merged ^^"11:22
yolandasshnaidm, i'm getting an error on tripleo-quickstart on my change, but looks unrelated:
yolandaatal: []: FAILED! => {"changed": false, "cmd": ["id", "-u", "stack"], "delta": "0:00:00.003234", "end": "2016-11-07 08:59:45.340246", "failed": true, "rc": 1, "start": "2016-11-07 08:59:45.337012", "stderr": "id: stack: no such user"11:22
yolandai'm not even updating the teardown on that play, is there some known error?11:23
*** rhallisey has joined #tripleo11:23
shardyb00tcat: approved!11:23
shardyb00tcat: apologies for the delay, we had major CI issues last week which blocked merging things for a few days11:24
sshnaidmyolanda, which patch is it?11:24
b00tcatshardy: thanks, and np ;)11:24
yolandamoving to oooq channel...11:24
*** athomas has quit IRC11:25
sshnaidmyolanda, yeah, there is better :)11:25
yolandapasted there11:25
sshnaidmshardy, is "bad status" error handled by somebody?11:27
jpichHey folks, had three +2s before needing a rebase. It's a missed parameter from the passwords migration, if the patch could be reviewed again so it can get backported I'd really appreciate it11:28
openstackgerritTomas Sedovic proposed openstack/tripleo-common: Fix the validation ssh keys workflow
hewbroccajpich: ooh yeah we need that, it's blocking upgrades, no?11:29
jpichhewbrocca: It's blocking Ceph deployments from the UI, as far as I understand the CLI still hardcodes it and is fine11:29
jpichhewbrocca: I could have missed more recent related bugs though11:30
hewbroccajpich: ahh, maybe not same issue then11:31
*** athomas has joined #tripleo11:32
shardysshnaidm: has been proposed by amoralej, I'm reworking that so we set those defaults in puppet-tripleo now11:32
sshnaidmshardy, thanks!11:32
shardyif anyone can reproduce it'd be excellent to get confirmation those timeout increases fix things11:33
amoralejshardy, i'm having doubts about it now. I proposed that assuming that the issue is related to slow hardware in my RDO CI, but now we are hitting it in cases with decent hardware also11:34
*** sudswas__ has quit IRC11:35
*** sudipto has quit IRC11:35
openstackgerritSteven Hardy proposed openstack/puppet-tripleo: Increase haproxy timeouts
shardyamoralej: Yeah I'm not sure either but lets recheck ^^ a few times and see if it helps11:36
shardyamoralej: I did hit this locally, and the request failed after far less than a minute11:37
openstackgerrityolanda.robla proposed openstack/tripleo-quickstart: Create directories with root
shardynot managed to reproduce yet tho :(11:37
amoralejare we hitting it in tripleo gate jobs also?11:37
shardyamoralej: yes11:37
amoralejyeah, that looks similar11:39
*** dsariel has joined #tripleo11:41
*** tobias-fiberdata has quit IRC11:45
*** nyechiel has joined #tripleo11:45
*** bfournie has joined #tripleo11:46
*** milan has joined #tripleo11:48
slaglecan i get some reviews on and
slaglethey fix a newton bug for manila11:53
slaglemarios: fyi, ^^11:53
slaglei know how important that is to you11:53
mariosslagle: ack on a call now11:55
mariosslagle: in bit will do11:55
slaglethx :)11:55
mariosslagle: appreciate the concern i don't miss anything manila11:56
slaglenp. you are the manila man11:56
*** dprince has joined #tripleo11:57
thrashbetter than being the vanilla man I suppose...11:58
jd__shardy, hewbrocca so that should fix the problem you saw FWIW :) I'll backport it11:59
*** jkilpatr has joined #tripleo11:59
*** pradk has joined #tripleo11:59
*** tdasilva has quit IRC12:06
shardyjd__: thanks for the quick response! :)12:08
*** zoli|lunch is now known as zoli12:10
*** zoli is now known as zoliXXL12:10
*** adarazs is now known as adarazs_lunch12:12
openstackgerritMerged openstack/tripleo-puppet-elements: Separate Datastax repository from the Midonet one
*** lucasagomes is now known as lucas-hungry12:15
slagleshadower: hi, saw your reply about the newton release12:16
slagleshadower: tripleo-validations has not been part of any of our previous newton releases12:16
*** tdasilva has joined #tripleo12:16
shadowerslagle: these are tht and tripleo-common patches though12:16
slagleyes, i saw the 1 tht patch12:17
hewbroccajd__: well that is fantastic12:17
slagleshadower: what is the tripleo-common one?12:17
hewbroccajd__: Once you backport it to stable I guess we'll pull it in automatically12:17
shadowerslagle: it depends on the tht one and it's the actual fix12:18
slagleshadower: ok, thanks, for some reason i thought that was a tripleo-validations patch when I looked earlier12:20
shadowerslagle: I was afraid I pasted a wrong link there :-)12:21
*** bkopilov has quit IRC12:21
*** mburned_out is now known as mburned12:22
*** anshul has quit IRC12:22
slagleshadower: the tht one lgtm. for the tripleo-common one, try to wrangle up some reviews12:22
shadowerslagle: thanks. d0ugal did one in an earlier patchset12:23
dtantsurfolks, can I get W+1 on please? 2x +2 and passed CI12:27
*** mgould|afk is now known as mgould12:28
*** anshul has joined #tripleo12:29
openstackgerritFlorian Fuchs proposed openstack/tripleo-ui: Validate JSON parameters
*** dmacpher has joined #tripleo12:30
*** maticue has joined #tripleo12:35
honzajpich: there was some launchpad-linking weirdness when i published the patch --- i certainly tried to link it :)
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: elements: Add new openssh-server element
*** dougbtv has joined #tripleo12:39
*** shardy has quit IRC12:41
*** rcernin has quit IRC12:43
*** rcernin has joined #tripleo12:44
d0ugalshadower: do you need a re-review somewhere?12:44
shadowerd0ugal: yep, here:
d0ugalshadower: cool, on it12:44
shadowerd0ugal: thanks!12:44
*** limao has joined #tripleo12:49
*** bfournie has quit IRC12:50
*** sudipto has joined #tripleo12:50
openstackgerritMerged openstack/tripleo-heat-templates: Ensure we update ceph and composable nodes
*** sudswas__ has joined #tripleo12:50
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add local template generation tox task
*** limao has quit IRC12:53
slaglesocial: you'll need to restore your newton patch for
slagleand update the commit message12:55
jpichhonza: Yeah I saw that, that's how I found the blueprint :) Thank you. There's still a number of manual updates that are required for blueprints in general, e.g. what milestone is targetted or what is the progress like, are all the patches done or is there more to come12:56
honzajpich: i'll see what i can do12:56
jpichhonza: In that case though, the first step will be to migrate the blueprint to the tripleo tracker (it's still in the ol' tripleo-ui one for now)12:56
jpichhonza: Awesome, thank you :)12:57
*** ohamada has quit IRC12:59
jaosoriorslagle: will this commit be included in the release?
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Make pep8 task run template generation
openstackgerritLukas Bezdicka proposed openstack/tripleo-heat-templates: Ensure we update ceph and composable nodes
socialslagle: ^^13:00
*** maeca1 has joined #tripleo13:00
dprincethrash: see my comment there about potentially having the Mistral action use this code instead13:01
thrashdprince: ack13:02
*** cylopez has quit IRC13:03
slaglejaosorior: it is merged in stable/newton, so yes :)13:03
slaglesocial: thanks13:04
thrashdprince: do you think it fits better in tht, or tripleo-common?13:04
dprincethrash: t-h-t I think13:04
dprincethrash: making heat depend on tripleo-common seems way heavy to me13:05
socialslagle: could you have look at and ? we need them for updates13:05
dprincethrash: but I could be persuaded13:05
thrashif we want the mistral action to use this, then tripleo-common would need to depend on tht...13:05
d0ugalthrash: it already does, in a way.13:06
thrashnot sure if that's already the case.13:06
*** rlandy has joined #tripleo13:06
d0ugalthrash: it expects the templates to be in /usr/share/13:06
d0ugalthrash: but never actually states the dep AFAIK13:06
*** jayg|g0n3 is now known as jayg13:06
thrashd0ugal: dprince it would have to become an explicit dep.13:08
thrashI'm still forming my thoughts on it...13:08
*** ccamacho is now known as ccamacho|lunch13:09
d0ugalthrash: Yeah, we should maybe add an explicit dep anyway? not sure.13:10
*** zoliXXL is now known as zoli|brb13:11
*** amoralej is now known as amoralej|lunch13:11
jpichd0ugal: Opened for the Mistral env persisting issue, feel free to correct any wrong assumption in there :)13:11
openstackLaunchpad bug 1639787 in tripleo "Mistral environment not reset between deployments" [High,Triaged]13:11
*** cylopez has joined #tripleo13:12
*** maeca1 has quit IRC13:12
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Reset the parameter_defaults between deployments via the CLI
d0ugaljpich: looks good. Initial patch ^13:13
jpichQuick! :-o13:13
openstackgerritThomas Herve proposed openstack/python-tripleoclient: Use a Zaqar queue to get stack events
d0ugaljpich: haha, it couldn't be much simpler.13:13
arxcruzsshnaidm: this is the output from
jpichd0ugal: So we're keeping the passwords around, and 'template'/'root_template' don't matter?13:15
*** adarazs_lunch is now known as adarazs13:15
*** limao has joined #tripleo13:15
arxcruzsshnaidm: and the arguments should be
d0ugaljpich: Yeah, we need the passwords, so I am glad they got duplicated. The others get updated anyway when the templates are processed IIRC.13:15
d0ugaljpich: but I need to look into this a wee bit more to be sure.13:16
panda|sickis CI stably good now ?13:16
*** limao_ has joined #tripleo13:16
*** tiswanso has quit IRC13:16
jpichd0ugal: Cool. Thanks a lot!! I'll give the patch a whirl locally as well13:16
sshnaidmpanda|sick, more or less13:17
shadowerslagle, shardy: can this get a +A before jenkins changes its mind?
sshnaidmpanda|sick, I see a row of successes both for ha and nonha, so it looks good atm13:19
panda|sickshadower: what's the "less" part ?13:19
panda|sicksshnaidm: not shadower13:19
slagleshadower: sure13:19
*** limao has quit IRC13:19
sshnaidmarxcruz, it's everything about working with gearman server, I don't think it should be involved here..13:19
sshnaidmpanda|sick, there is always something, y'know13:20
openstackgerritMerged openstack/tripleo-common: Do not try "manage" actions on nodes that are not in "enroll" state
*** cylopez has quit IRC13:21
panda|sicksshnaidm: pessimist :) but yeah, after a week of stacking issues I can see why13:21
*** bfournie has joined #tripleo13:21
*** lucas-hungry is now known as lucasagomes13:21
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Reset the parameter_defaults between deployments via the CLI
slagleis there anyone that can look at ?13:22
openstackLaunchpad bug 1634260 in tripleo "Missing get_file files don't cause deploy failures" [High,Triaged]13:22
thervepanda|sick, I don't know if the redis failure is gone13:22
*** rbowen has joined #tripleo13:23
panda|sicktherve: if it's not, I swear I'll become a farmer.13:24
openstackgerrityolanda.robla proposed openstack/python-tripleoclient: WIP: Support full disk images in TripleO
therve looked good on that front though13:24
therveStill failing for other reasons though13:24
*** rodrigods has quit IRC13:25
*** zoli|brb is now known as zoli13:25
*** rodrigods has joined #tripleo13:25
*** zoli is now known as zoliXXL13:25
panda|sicktherve: please recheck, that may have been launched before the new redis package was hitting cdn13:26
thervepanda|sick, OK. We'll see with the periodic tomorrow too13:26
*** fultonj has joined #tripleo13:26
openstackgerritDmitry Tantsur proposed openstack/tripleo-common: Do not try "manage" actions on nodes that are not in "enroll" state
*** rbrady-afk is now known as rbrady13:27
panda|sickok, why .. why: 23297:S 07 Nov 10:10:47.236 # Unable to connect to MASTER: Connection timed out13:28
panda|sick23297:S 07 Nov 10:10:48.277 * Connecting to MASTER no-such-master:637913:28
*** tiswanso has joined #tripleo13:28
panda|sickanother redis issue ...13:29
thervepanda|sick, That is sad. Doesn't impact the ci result I think though13:29
panda|sickredis is up and running, pcs status is good and there is a redis master13:29
panda|sickgnocchi-metricsd is not getting all the CPU13:30
panda|sickso there's another issue somewhere.13:30
therveYeah cinder is still returning 50013:31
panda|sickloadaverage is at 6 on controller13:31
therveOSError: [Errno 12] Cannot allocate memory13:32
therve\o/ /o\13:32
panda|sickI wonder how the crops are doing this year.13:32
hewbroccaDoes anybody here know anything about redis13:33
openstackgerritMerged openstack/tripleo-heat-templates: Include keystone authtoken config in manila-share service
panda|sickI don't see any process taking all the CPU this time in host-info13:34
beagleshewbrocca: only that it seems to be reaching legendary status as a trouble maker13:34
thervepanda|sick, is weird13:35
openstackgerritMerged openstack/tripleo-heat-templates: Move db settings from manila-api to manila-base
therveSeems to happen after the OOM error13:36
*** pradk has quit IRC13:37
*** jprovazn has quit IRC13:37
panda|sickthe max I see in MEM utilization in ps is 2%. we need a live deploy to look at13:38
thervepanda|sick, Do we store ps stats of overcloud nodes?13:39
*** cylopez has joined #tripleo13:39
panda|sicktherve: we store the output of a single ps command, at the end of the test AFAIK13:40
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Refactor addWorkflowExecution
*** links has quit IRC13:40
therveAh yeah host_info13:40
thervefree 151M13:40
therveThat looks a tad small13:40
*** jcoufal has joined #tripleo13:41
*** tiswanso has quit IRC13:43
panda|sickI'll try to launch a deploy in rh1, it will take a while.13:43
*** saneax-_-|AFK is now known as saneax13:43
panda|sickbut I'm not sure I have enough energy today to do debugging.13:44
therveWe can tweak down httpd config. cinder has 2 proc/2 threads, and horizon 10/313:44
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Move per role Services defaults into environment file
*** jpena is now known as jpena|lunch13:49
panda|sickooohh, metricd in controller-0 is clean, *BUT* in controller-1 is again spinning like crazy13:51
panda|sickbut it seems to stabilize after a while13:52
*** bkopilov has joined #tripleo13:52
*** amoralej|lunch is now known as amoralej13:53
jristanyone have trouble logging in with SSL firefox to the UI?13:54
*** tzumainn has joined #tripleo13:56
*** Goneri has joined #tripleo13:56
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Include keystone authtoken config in manila-share service
*** sshnaidm is now known as sshnaidm|afk13:57
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Move db settings from manila-api to manila-base
*** Guest71310 has quit IRC13:58
*** skramaja_ has joined #tripleo14:00
*** skramaja has quit IRC14:00
weshaypanda|sick, you were referring to #15 on the tripleo-ci-status etherpad?14:02
panda|sicklive deploy launched, it will take a while ..14:02
*** saneax is now known as saneax-_-|AFK14:03
jpichjrist: Yeah, because of the self-signed certificate. Unfortunately it looks like it needs to be accepted manually not just for the login page, but also for every URL+port combination too...14:03
slagletherve: any chance you could look at ?14:03
openstackLaunchpad bug 1634260 in tripleo "Missing get_file files don't cause deploy failures" [High,Triaged]14:03
jristjpich: :o14:03
slagleor d0ugal perhaps:
slaglethe problem is in the tripleoclient template processing it looks like14:04
jristjpich: so we have to go through with each URL/port and accept?14:04
jpichjrist: I wonder if we could add an option to accept self-signed certs to the config file. I don't know how we talk to the services from the UI code, but jtomasek / florianf / honza might have an idea if it'd be possible to implement?14:05
panda|sickweshay: it was general agreement that traces 1 and 2 in that bug were caused by the delays caused by gnocchi-metricsd eating all the CPU14:05
*** lblanchard has joined #tripleo14:05
*** cylopez has quit IRC14:06
jpichjrist: As a workaround for now, it seems like it, or maybe the cert can be added manually to the firefox cert management bits * pokes around *14:06
jristjpich: interesting. I know there are lots of ways to do it with react-native14:06
*** skramaja has joined #tripleo14:06
jpichjrist: Do you know if there's a bug open for this? Would be good to document workaround(s) there14:06
*** skramaja_ has quit IRC14:06
weshaypanda|sick, wonder if prad (pradeep) can help w/ that14:06
jristjpich: I don't know, just an email from Udi14:06
jtomasekjrist, jpich: I've tried to look for such option in reqwest which is what we use for ajax requests, but I did not find anything like it14:08
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Move per role Services defaults into environment file
jtomasekjrist, jpich: why are the certs self signed? isn't that insecure? does the ssl make sense when we then work it around with such option?14:09
jpichjtomasek, jrist: I'm going to open a bug so we can track this information14:09
jristthanks jpich14:09
therveslagle, Sure looking14:09
jristjtomasek: lol14:09
jristjtomasek: technically no14:09
jristit doesn't make sense14:09
jtomasekjrist: ...14:09
jristI wonder if we can easily do 'letsencrypt' certificates?14:10
beekneemechjrist: jtomasek: You should talk to jaosorior about this.14:10
d0ugalslagle: I'll take a look and see if I can figure it out14:10
jtomasekjrist: probably, I thought dtrainor was looking into making ssl work14:10
*** beekneemech is now known as bnemec14:10
panda|sickweshay: too much guess work on the logs, when the live env is ready I will start drawing some conclusion again .. *if* i'll be able to think straight, my lunch was a glass of water with sugar.14:10
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Move per role Services defaults into environment file
weshaypanda|sick, feel free to hand off to sshnaidm|afk and get some sleep14:11
slagled0ugal: therve : thanks. if only one of you wants to look, that's fine :) just wanted to make sure we have someone on it14:11
thrashdprince: d0ugal so, indirectly, python-tripleoclient -> instack-undercloud -> openstack-tripleo-heat-templates14:11
thrashthat's the current deps14:11
d0ugalslagle: oh, I didn't see therve has replied. I'll wait a bit and see how that goes :)14:12
*** Guest71310 has joined #tripleo14:12
d0ugalthrash: hah, nice.14:12
thrashso, shouldn't be a stretch for tripleo-common to depend on tripleo-heat-templates14:12
openstackgerritMerged openstack/tripleo-heat-templates: Add an optional extra node admin ssh key parameter
panda|sicksshnaidm|afk: there is a test env deploying HA in rh1, let's see what happens there14:12
panda|sicksshnaidm|afk: I'll ping you when it's ready14:12
thrashd0ugal: and same for tripleo-common -> instack-undercloud -> openstack-tripleo-heat-templates14:12
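The indirect dependency chain thrash describes can be checked mechanically. The following is only a minimal sketch: the `deps` mapping is typed in by hand from the discussion above rather than read from any package metadata, and `depends_on` is a hypothetical helper name.

```python
def depends_on(graph, package, target):
    """Return True if `package` reaches `target` by following dependency edges."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for dep in graph.get(current, ()):
            if dep == target:
                return True
            stack.append(dep)
    return False


# Edges as stated in the channel; an assumption, not queried from RPM metadata.
deps = {
    "python-tripleoclient": ["instack-undercloud"],
    "tripleo-common": ["instack-undercloud"],
    "instack-undercloud": ["openstack-tripleo-heat-templates"],
}

indirect = depends_on(deps, "tripleo-common", "openstack-tripleo-heat-templates")
```

If the edges are as described, `indirect` is true for both tripleo-common and python-tripleoclient, which is the basis for the argument that making the dependency explicit would not add anything new.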
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Reset the parameter_defaults between deployments via the CLI
slagleshadower: merged. please backport it14:14
shadowerslagle: just got the email. I'm on it. Thanks14:14
d0ugaljpich: Not sure if you started testing
*** jprovazn has joined #tripleo14:15
d0ugaljpich: but I just updated it because I did something silly14:15
jpichd0ugal: No, I hadn't (sorry) - thanks for the update!14:15
openstackgerritTomas Sedovic proposed openstack/tripleo-heat-templates: Add an optional extra node admin ssh key parameter
d0ugaljpich: probably a good thing, it would have failed anyway (I found the error in CI)14:15
shadowerslagle: ^14:16
*** noslzzp has quit IRC14:16
slagleshadower: thanks14:17
*** noslzzp has joined #tripleo14:17
*** Guest71310 has quit IRC14:18
*** tesseract has joined #tripleo14:19
hewbroccaweshay: FWIW we already have a patch from jd__ to fix the gnocchi issue14:19
*** tesseract is now known as Guest4111714:19
hewbroccaand it has been backported, or is in the process of that14:19
hewbroccaI don't know what else if anything needs to happen to get it pulled into RDO newton14:19
hewbroccait's not clear that fixing gnocchi alone is sufficient14:20
*** anshul has quit IRC14:20
weshayhewbrocca, rockin14:20
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Configure civetweb bind address in brackets when IPv6
*** tiswanso has joined #tripleo14:21
*** Guest41117 has quit IRC14:22
*** dtrainor has joined #tripleo14:25
jpichjrist, jtomasek: I opened . jtomasek could you add the information about reqwest there? I set the severity as high but I don't know how common undercloud deployments with self-signed certificates are, if most folks use their own maybe it's not as bad. I'll see if there might be a "one-step manual workaround" and add it to the bug14:27
openstackLaunchpad bug 1639807 in tripleo "Can't login to the UI with SSL when using Firefox" [High,Triaged]14:27
*** shardy has joined #tripleo14:27
jtomasekjpich: thanks, I will14:27
*** tesseract- has joined #tripleo14:27
*** anshul has joined #tripleo14:30
*** morazi has joined #tripleo14:30
*** ohamada has joined #tripleo14:30
*** sshnaidm|afk is now known as sshnaidm14:32
*** jaosorior has quit IRC14:32
*** jaosorior has joined #tripleo14:33
sshnaidmpanda|sick, ok14:33
*** chlong has quit IRC14:34
weshaysshnaidm, fyi
sshnaidmweshay, yeah, I know :)14:35
*** hoobaman has joined #tripleo14:36
sshnaidmpanda|sick, what is the current issue you check? the redis failure?14:36
hoobamancurrently deploying mitaka14:36
hoobamanhowever the parameter "CloudDomain" does not seem to work14:36
hoobamanit is normally used to provide your overcloud hosts with a valid fqdn14:37
hoobamaninstead it gives us localdomain14:37
*** morazi has quit IRC14:37
hoobamanIs this issue known? any workaround available?14:37
bnemechoobaman: Did you set the domain on the undercloud neutron as well?  I believe CloudDomain just ensures the right domain is added to things like the hosts file on overcloud nodes.14:38
openstackgerritMerged openstack/tripleo-heat-templates: Updated Nuage neutron plugin name
shardyhoobaman: see
openstackLaunchpad bug 1581472 in tripleo "CloudDomain doesn't correctly set hostname" [High,Triaged] - Assigned to Giulio Fidente (gfidente)14:38
openstackgerritMerged openstack/tripleo-quickstart: Clone tripleo-ci in the undercloud
shardythere is a workaround, which is to set dhcp_domain in nova.conf on the undercloud14:38
bnemec(I think it's neutron.conf)14:39
shardybnemec: maybe so, perhaps someone can update the bug if that's confirmed, I thought I set it in nova.conf but could be mistaken14:39
*** masco has quit IRC14:40
bnemecI could be wrong, but Nova shouldn't be setting DHCP parameters.14:40
hoobamanbnemec: thx for your reply. Do you mean domain in neutron subnet-update?14:40
hoobamanshardy: thx14:40
shardyit's in nova.conf AFAICS14:41
*** lmiccini has quit IRC14:41
*** tbonds has joined #tripleo14:41
bnemecOh good, they both have settings for this.  That's not confusing at all.14:42
*** ccamacho|lunch is now known as ccamacho14:42
bnemec /sarcasm14:42
tzumainnd0ugal, jtomasek hi!  not sure who to ask this question to, but here it is: if I use create_default_deployment_plan, should the resulting plan be deployable with no further tinkering?  or is tinkering required?14:43
jtomasektzumainn: it should afaik (if you have the nodes available)14:43
tzumainnd0ugal, jtomasek, and second question - where do the templates from the default deployment plan come from?  I just want to know the baseline used in case I need to create a plan with custom templates14:43
*** lmiccini has joined #tripleo14:44
jtomasektzumainn: those are templates installed at /usr/share/openstack-tripleo-heat-templates14:44
tzumainnjtomasek, okay, great - thanks!14:45
openstackgerritMerged openstack/tripleo-common: Add CephClusterFSID to generated passwords
openstackgerritMerged openstack/tripleo-heat-templates: Change nova ram_allocation_ratio to match puppet-nova
openstackgerritMerged openstack/tripleo-ui: Validate JSON parameters
d0ugaltzumainn: I think it should be possible to deploy it at that point.14:48
d0ugaltzumainn: if not, that is a bug :)14:49
*** morazi has joined #tripleo14:51
*** ebalduf has joined #tripleo14:52
openstackgerritThomas Herve proposed openstack/python-tripleoclient: Fix handling of missing environment files
therveslagle,  d0ugal ^^^14:54
tzumainnd0ugal, \o/  awesome!14:54
therveThis may be wrong, but this has a test at least :)14:54
d0ugaltzumainn: jtomasek or jpich will know if there are any required params as I think they have been doing GUI testing14:54
d0ugaltherve: looking14:54
tzumainnd0ugal, ah, okay, thanks!14:55
slagletherve: thanks!14:56
slagleshardy: can you review therve's fix?
shardyslagle: will do14:57
*** saneax-_-|AFK is now known as saneax15:00
*** sudswas__ has quit IRC15:01
*** sudipto has quit IRC15:01
shardyouch, yeah AFAICS the fix is good, thanks therve15:01
tzumainnjtomasek, jpich, I'll be testing this out, but just let me know if you guys think of any required params that need to be set15:02
therveshardy, No problem :). The flush/safe_dump looks clear, I was wondering about the naming (though it shouldn't matter too much)15:02
jpichtzumainn: I think the "count" variables (controllercount, etc) might be needed still somewhere, but I could be misunderstanding15:03
*** jpena|lunch is now known as jpena15:04
shardyYeah I think it probably doesn't matter too much but this way is more consistent with the file containing env_map15:04
tzumainnjpich, okay, thanks!  I'll test it out15:04
tzumainnjpich, would you happen to have an overcloud deployed through the UI available somewhere?  if so, would it be possible to paste 'heat stack-show' somewhere so I can compare the parameters against my own?15:05
tzumainnif not, don't worry about it!15:05
d0ugaljpich: I think the counts are all optional - by default you should get 1 control and 1 compute I think15:06
jpichtzumainn: I think the mistral environment might be the relevant bit here, in meetings right now, will let you know after :)15:06
tzumainnjpich, okay, thanks!15:07
shardytzumainn: you may find "openstack stack environment show overcloud" useful if you want to compare heat environments without introspecting all the stacks15:07
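Once two environments have been dumped with the command shardy mentions, comparing their parameters is a plain dictionary diff. A minimal sketch, assuming each environment's parameters have already been loaded into a Python dict; `diff_params` and the sample values are hypothetical and do not come from a real deployment.

```python
UNSET = "<unset>"


def diff_params(mine, theirs):
    """Return {name: (mine_value, theirs_value)} for every parameter that differs."""
    diff = {}
    for key in sorted(set(mine) | set(theirs)):
        a = mine.get(key, UNSET)
        b = theirs.get(key, UNSET)
        if a != b:
            diff[key] = (a, b)
    return diff


# Illustrative values only.
mine = {"ControllerCount": 1, "ComputeCount": 1}
theirs = {"ControllerCount": 3, "ComputeCount": 1, "CloudDomain": "example.com"}

changes = diff_params(mine, theirs)
```

Parameters present on only one side show up with the `<unset>` marker, which makes missing count parameters easy to spot.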
jaosoriorbnemec: jrist , what's up?15:07
tzumainnshardy, ahh, thanks!15:07
*** jaosorior is now known as jaosorior_sick15:09
*** pblaho has joined #tripleo15:10
Ngwould any kind core folks like to cast an eye on ? :)15:10
jaosorior_sickjrist, bnemec we're not doing letsencrypt certs, because we would need the undercloud to be publicly accessible (which we can't assure) and the deployer would also need to own a domain (which is no big deal, and we already can support hostnames instead of IPs in the endpoints)15:10
jpichjaosorior_sick: I think the context for SSL is at the moment, the self-signed stuff is giving us a couple of issues on the Firefox side15:11
openstackLaunchpad bug 1639807 in tripleo "Can't login to the UI with SSL when using Firefox" [High,Triaged]15:11
bnemecjaosorior_sick: Okay, thanks.  I knew you had looked into it, so when it came up today I wanted to make sure you were in the loop.15:11
*** cylopez has joined #tripleo15:12
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Neutron L3 service cleanups for hiera json hook
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Horizon service cleanups for hiera json hook
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Hiera optimization: use a new hiera hook
jpichjaosorior_sick: Looks like dtrainor is gonna be on it so maybe don't worry about it if you're sick - or maybe it's a different issue you were talking about, sorry about that15:13
jaosorior_sickjpich: well, it's actually quite a normal thing that services that offer SSL by default use self-signed certs (it's the same with FreeIPA). We can't assure that everyone has access to a CA. and in the case of the undercloud deployment, we can't assure that there is external access that the letsencrypt service will use to assure that you own the domain15:13
jaosorior_sickjpich: if someone wants to add letsencrypt support and add it as an option I'm cool with that. I'm just explaining why things are the way they are15:14
jaosorior_sickanyway, Imma go back to the couch, feeling a bit feverish. lets talk about it tomorrow.15:14
jpichjaosorior_sick: Yeah, apparently it's more due to lack of information in the cert itself, looks like what I wrote is mostly wrong and dtrainor has different, better ideas on fixing it :)15:14
jpichjaosorior_sick: Absolutely, take care15:14
dtrainordon't blame me yet until i get this test done :)15:14
trozetshardy: hi.  Remember this change ?  It looks like something doesn't work when you use network isolation and specify internal_api network...and I cannot figure out what it is15:16
trozetshardy: doing a getattr on the ServiceNetMap for OpenDaylightApiNetwork returns nothing15:16
jpichdtrainor: Thanks for all the information on that thread, I am now full of hope :)15:17
dtrainorI'll have it docu'd in the launchpad bug shortly, just want to make sure i have enough info on it15:17
dtrainor*in it15:17
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Make pep8 task run template generation
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Add local template generation tox task
*** anshul has quit IRC15:19
jpichdtrainor: Thank you!15:20
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: Change nova ram_allocation_ratio to match puppet-nova
*** hoobaman has quit IRC15:20
*** aufi has quit IRC15:24
openstackgerritZane Bitter proposed openstack/tripleo-heat-templates: Configure region correctly for heat-api-cfn service
slaglecan we get another +2 on this backport: ?15:28
*** pradk has joined #tripleo15:28
shardyslagle: done15:30
openstackgerritJulie Pichon proposed openstack/tripleo-common: Add CephClusterFSID to generated passwords
*** yamahata has joined #tripleo15:33
*** links has joined #tripleo15:34
*** owalsh_ has joined #tripleo15:34
*** numans has quit IRC15:36
*** absubram has joined #tripleo15:36
*** owalsh has quit IRC15:38
*** coolsvap has quit IRC15:42
*** mwhahaha has quit IRC15:42
*** igorbelikov has quit IRC15:42
*** gregwork has quit IRC15:42
*** florianf has quit IRC15:43
*** Ng has quit IRC15:43
*** fungi has quit IRC15:43
*** igorbelikov has joined #tripleo15:44
*** mwhahaha has joined #tripleo15:45
*** rcarrillocruz has quit IRC15:45
*** coolsvap has joined #tripleo15:45
*** Ng has joined #tripleo15:46
*** ChanServ sets mode: +v Ng15:46
*** gregwork has joined #tripleo15:46
shadowershardy: thanks for your replies on :-)15:49
shardyshadower: np, thanks for the feedback15:50
openstackgerritMerged openstack/tripleo-quickstart: Properly reload kvm module when trying to set up nested virtualization
*** rcarrillocruz has joined #tripleo15:52
*** radeks has joined #tripleo15:54
*** radeksmg has joined #tripleo15:54
*** florianf has joined #tripleo15:54
*** radeks has quit IRC15:54
*** rcernin has quit IRC15:55
*** fungi has joined #tripleo15:55
bandinijd__, pradk: I just filed are missing any tunable that slows things down or is that rate expected?15:55
openstackLaunchpad bug 1639842 in tripleo "Newton - gnocchi-metricd is hammering redis" [Undecided,New]15:55
pradkbandini, was it because redis was down? we had a bug on that15:56
pradkbandini, or redis is up and still getting hammered?15:56
bandinipradk: nope redis is up, that is why I filed a new one15:56
jd__bandini: metricd uses redis a lot but I am surprised it keeps "reconnecting"15:56
bandinijd__, pradk: let me investigate a bit more15:57
jd__ok :)15:58
jristjaosorior_sick: noted15:58
jristjaosorior_sick: thanks15:58
jristjaosorior_sick: get well!15:58
openstackgerritAna Krivokapic proposed openstack/tripleo-heat-templates: Add constraint for ControllerCount
d0ugalMistral meeting time in #openstack-meeting for those interested.16:02
d0ugalrbrady: ^16:02
*** yamahata has quit IRC16:06
openstackgerritAttila Darazs proposed openstack/tripleo-quickstart: Skip new ansible-lint rule until fixing the roles
openstackgerritThomas Herve proposed openstack/python-tripleoclient: Fix handling of missing environment files
*** owalsh_ is now known as owalsh16:08
openstackgerritMerged openstack/tripleo-heat-templates: Move per role Services defaults into environment file
weshaysshnaidm, you have a minute?16:09
sshnaidmweshay, yep16:09
weshaysshnaidm, join me in bluejeans for a minute16:09
*** ebarrera has quit IRC16:10
*** chandankumar has quit IRC16:11
*** maeca1 has joined #tripleo16:12
*** pcaruana has quit IRC16:12
panda|sickmy deployment failed for a different reason than the one we're seeing on the gates. Waiting for postci to complete ... but it's very very slow generally16:15
*** bana_k has joined #tripleo16:16
openstackgerritMerged openstack/tripleo-heat-templates: Ensure we update ceph and composable nodes
dtantsurfolks, may I get reviews on and please?16:19
dtantsurthese are the problems with our workflows that seem to affect real people already :)16:19
openstackgerritMerged openstack/tripleo-quickstart: Add ability to deploy an overcloud with ssl
*** ealcaniz has quit IRC16:22
*** tremble has quit IRC16:24
openstackgerritFlorian Fuchs proposed openstack/tripleo-ui: Use mistral action to create new containers
openstackgerritMartin Mágr proposed openstack/python-tripleoclient: Use correct region value
*** nyechiel has quit IRC16:29
*** oshvartz has quit IRC16:29
*** fpan has quit IRC16:30
*** rhallisey has quit IRC16:30
*** rhallisey has joined #tripleo16:30
sshnaidmpanda|sick, postci on dev env doesn't work so good, you can stop it (if you're on dev env)16:31
*** tesseract- has quit IRC16:31
panda|sicksshnaidm: actually, it was only my connection that froze16:31
jistrccamacho: hey looking at i think you might hit what we noticed earlier with bandini and jaosorior that the ControllerPostPuppet (where the restart script belongs to) doesn't get executed at all16:33
*** limao_ has quit IRC16:33
jistrccamacho: i might have a suggestion how to tackle that, will comment on the review16:34
*** cwolferh has joined #tripleo16:34
*** chem has quit IRC16:35
ccamachohey jistr thanks man! Im also hitting when updating the stack.. All feedback is welcome16:35
openstackLaunchpad bug 1639302 in tripleo "Started Mistral Workflow fails due to malformed template" [High,Confirmed]16:35
*** fpan has joined #tripleo16:37
*** paramite has quit IRC16:37
*** fpan has quit IRC16:38
*** fpan has joined #tripleo16:38
*** ebarrera has joined #tripleo16:39
*** chem has joined #tripleo16:40
panda|sickso, I see all controllers with loadavg > 10, beam is usually on top of top, and I think the queue is flooded with requests. Also redis on slaves are unable to contact master16:40
*** paramite has joined #tripleo16:41
openstackgerritMerged openstack/tripleo-common: Do not try "manage" actions on nodes that are not in "enroll" state
sshnaidmpanda|sick, do I understand right it's swift that's going crazy with reconnections to redis?16:44
panda|sicksshnaidm: It's possible, unfortunately things are calming down here, load averages are dropping16:45
openstackgerritMerged openstack/tripleo-quickstart: Skip new ansible-lint rule until fixing the roles
sshnaidmpanda|sick, hmm...16:47
sshnaidmpanda|sick, I see two controllers have the same ip16:47
panda|sicksshnaidm: the weird thing is that loadavg is at 10, cpu is at 60% all the time, but there isn't a single process that is taking all the cpu16:47
sshnaidmpanda|sick, seems like HA fail16:47
*** paramite has quit IRC16:48
panda|sicksshnaidm: which IP16:48
sshnaidmpanda|sick, yeah, also was searching in top..16:48
*** zoliXXL is now known as zoli|gone16:51
sshnaidmpanda|sick, no, sorry, wrong alert16:51
sshnaidmpanda|sick, was looking for IP that swift search for redis in16:51
*** jkilpatr_ has joined #tripleo16:52
*** aufi has joined #tripleo16:53
*** michapma_alt has quit IRC16:54
*** bana_k has quit IRC16:54
*** jkilpatr has quit IRC16:55
*** stendulker has joined #tripleo16:56
*** fragatina has joined #tripleo16:59
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Ensure heat-domain hiera is in nodes that contain keystone
*** nyechiel has joined #tripleo17:00
*** maeca1 has quit IRC17:00
panda|sicksshnaidm: I keep seeing Nov  7 17:00:28 localhost proxy-server: STDERR: WARNING:ceilometermiddleware.swift:Send queue FULL: Event dc8dc681-d50c-5476-ac35-05a54f1396c3 not added (txn: txa5336d17dd4d418686de5-005820b32c) (client_ip:
panda|sickNov  7 17:00:28 localhost proxy-server: 07/Nov/2016/17/00/28 GET /v1/AUTH_0ce54ff3166a4f639721fda10b83f1d8/measure%3Fformat%3Djson%26limit%3D64%26delimiter%3D/ HTTP/1.0 200 - python-swiftclient-3.1.0 3fa69b2072324712... - 2 - txa5336d17dd4d418686de5-005820b32c - 0.0374 - - 1478538028.511590958 1478538028.549038887 017:00
*** fragatina has quit IRC17:01
panda|sickis ceilometer bombarding swift ?17:01
*** ealcaniz has joined #tripleo17:01
*** fragatina has joined #tripleo17:01
openstackgerritMerged openstack/instack-undercloud: Disable Swift auditors and replicators on the undercloud
openstackgerritMerged openstack/tripleo-heat-templates: Unset Keystone public_endpoint
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Add missing Barbican endpoint from tls-everywhere environment
sshnaidmpanda|sick, I don't know, how are they connected? Does ceilometer keep its stats in swift?17:02
hewbroccajd__: ^^^17:02
*** radeksmg has quit IRC17:04
*** abregman_ has quit IRC17:04
sshnaidmpanda|sick, do you have the live setup?17:06
panda|sicksshnaidm: yes17:08
*** owalsh has quit IRC17:08
sshnaidmpanda|sick, can you try just the tcp connectivity to redis port from controllers?17:08
openstackgerritPaul Belanger proposed openstack/diskimage-builder: Add simple-playbook element
sshnaidmpanda|sick, and to virtual ip, in my logs it's
*** rbrady is now known as rbrady-afk17:11
panda|sickfrom controller-0 to controller-2(master) it's working. But redis log says unable to contact master.17:13
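The plain TCP check sshnaidm asks for can be scripted so it is easy to repeat from each controller. A minimal sketch using only the standard library; `can_connect` is a hypothetical helper, the host names are placeholders, and 6379 is the Redis port seen in the log lines earlier.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example use (placeholder host names, not taken from the live environment):
# for host in ("controller-0", "controller-1", "controller-2"):
#     print(host, can_connect(host, 6379))
```

As the result above shows, a successful TCP connection only proves the port is reachable; Redis can still log that it cannot contact its master for other reasons.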
jd__panda|sick: Gnocchi stores data in Swift, and Ceilometer stores its data in Gnocchi17:13
hewbroccaso if gnocchi is having trouble reaching swift, you'd have a problem17:14
*** jlinkes has quit IRC17:14
sshnaidmjd__, and where is redis in this matryoshka? does swift send messages to it..?17:16
panda|sickalso, what happens if the redis servers are unable to sync with the master ?17:17
sshnaidmpanda|sick, and what is I/O CPU on these hosts?17:17
jd__sshnaidm: no, Redis is used by Ceilometer and Gnocchi for caching and/or coordination17:18
panda|sicksshnaidm: %Cpu(s): 65.3 us, 19.0 sy,  0.0 ni, 15.0 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st17:18
openstackgerritMerged openstack/tripleo-common: Power off new nodes when making them available, not right after enrolling
panda|sicksshnaidm: almost nothing.17:18
*** owalsh has joined #tripleo17:18
sshnaidmpanda|sick, shouldn't it consider itself as a master then..?17:18
openstackgerritDmitry Tantsur proposed openstack/tripleo-common: Power off new nodes when making them available, not right after enrolling
*** links has quit IRC17:20
morazipradk, ^^ any thoughts on that connectivity bit re: gnocchi/swift/ceilo ?17:20
pradkreading up17:21
akrivokaflorianf: what value to you have listed for swift in tripleo_ui_config.js ?17:22
akrivokaflorianf: (trying to test your container patch)17:22
*** panda|sick is now known as panda|weak17:22
*** ebarrera has quit IRC17:22
pradkhmm so regarding ceilometermiddleware and swift, we recently added ceilometer to the swift pipeline17:23
pradknot sure if that has any implications we're seeing here17:23
florianfakrivoka: I use the output of `openstack catalog show swift`, but with the hostname that the UI is using17:24
pradkpanda|weak, can you paste me the swift-proxy.conf17:24
*** rasca has quit IRC17:26
*** bana_k has joined #tripleo17:26
panda|weakpradk: You mean swift/proxy-server.conf17:26
panda|weakpradk: ?17:26
pradkpanda|weak, yes17:27
pradkpanda|weak, I keep seeing Nov  7 17:00:28 localhost proxy-server: STDERR: WARNING:ceilometermiddleware.swift:Send queue FULL: Event dc8dc681-d50c-5476-ac35-05a54f1396c3 not added (txn: txa5336d17dd4d418686de5-005820b32c) (client_ip:
pradkthat concerns me17:28
*** numans has joined #tripleo17:28
akrivokaflorianf: thanks!17:28
*** fragatina has quit IRC17:28
*** nyechiel has quit IRC17:28
pradkjd__, ^^ can you check as well17:29
*** mhenkel has joined #tripleo17:30
florianfakrivoka: thanks for testing!17:30
*** amoralej is now known as amoralej|off17:30
*** sudipto has joined #tripleo17:31
*** sudipto_ has joined #tripleo17:31
*** trown is now known as trown|lunch17:31
jd__pradk: something is odd17:31
pradkhm weird, even though nonblocking_notify is false that shows up17:32
jpichCI is green on this patch with three +2s, would it be possible to get the +A?
jd__pradk: exactly17:32
jpichd0ugal: ^ It's your name on it ;)17:32
pradkpanda|weak, can you paste me the rpm version of python-ceilometermiddleware?17:33
jd__pradk: the conf option is not translated to a boolean17:33
panda|weakVersion     : 0.5.017:33
pradkyea the notification should be sent directly as we set that to false17:33
jd__pradk: so setting it in the conf file enable it17:33
panda|weakRelease     : 0.20161004113850.7f502e2.el7.centos17:34
jd__pradk: facepalm17:34
jd__pradk: conf is just a dict of string I think17:34
jd__so removing that line should fix that17:34
panda|weakhm, explanation ? please ?17:34
pradkpanda|weak, the nonblocking_notify option in the conf17:35
pradkshould ideally trigger a bypass and send notify17:35
pradkpanda|weak, but it's not, instead it's enabled by just adding it to conf17:35
*** stendulker has quit IRC17:35
pradkpanda|weak, a bug in ceilomiddleware imo17:35
pradkbut as a work around we can remove the line from conf17:36
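The behaviour jd__ and pradk describe above can be sketched in Python. This is a minimal illustration, assuming the middleware receives its options as a plain dict of strings (as jd__ suspects); `naive_flag`, `parse_bool` and `TRUE_STRINGS` are hypothetical names for this sketch, not code from ceilometermiddleware.

```python
# Hypothetical illustration only; not taken from ceilometermiddleware.
# Assumption: the filter options arrive as a dict of strings.

TRUE_STRINGS = {"1", "true", "yes", "on", "t", "y"}


def naive_flag(conf, key, default=False):
    """Use the raw value's truthiness, the way an untranslated option behaves."""
    return bool(conf.get(key, default))


def parse_bool(value, default=False):
    """Translate a config string into a real boolean."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in TRUE_STRINGS


conf = {"nonblocking_notify": "False"}

# Any non-empty string is truthy, so "False" still switches the option on.
print(naive_flag(conf, "nonblocking_notify"))

# Parsing the string gives the value the operator intended.
print(parse_bool(conf.get("nonblocking_notify")))

# With the line removed from the conf, the default applies (the workaround).
print(naive_flag({}, "nonblocking_notify"))
```

Under that assumption, deleting the option from the conf file and parsing the string explicitly both end up with the option disabled, which matches the workaround suggested in the channel.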
panda|weakpradk: what is this option causing in the end ?17:36
*** ebarrera has joined #tripleo17:36
pradkpanda|weak, falling to the except block there17:36
d0ugaljpich: Thanks!17:37
jpichd0ugal: Thank you \o/17:37
panda|weakpradk: I mean, is this causing performance issue ?17:37
jd__IIUC should fix this "problem"17:37
pradkpanda|weak, well thats why the queue is full17:37
panda|weakjd__: do you know what happens when redis slaves are unable to contact the master ?17:37
pradkand possibly what's causing cpu load17:38
jd__panda|weak: no, googling for "redis sentinel mode" should give explanation17:38
panda|weakjd__: and I can't stop thinking about scrubs.17:38
pradkjd__, i +2'ed it, if you can release asap, i'll rebase ceilomiddleware packages17:39
*** ealcaniz has quit IRC17:39
jd__panda|weak: I have the chance to share my initial with Dr Dorian so… that's why I picked that nick 10 years ago17:39
panda|weakduckduckgoing "redis sentinel mode" doesn't give much results17:39
sshnaidmjd__, do we run sentinel? isn't it just haproxy?17:39
jd__pradk: thanks17:40
jd__sshnaidm: pradk might know better17:40
*** dtrainor has quit IRC17:40
jd__panda|weak: it says it works automagically17:41
*** dtrainor has joined #tripleo17:41
pradksshnaidm, we do use sentinel in redis17:41
*** ebarrera has quit IRC17:41
pradksshnaidm, i dont know how the new ng ha architecture changed things, but we do configure sentinel in puppet redis17:41
panda|weakok, but in this case, is ceilometer affected in any way by this ? or is just tries to contact the master17:42
panda|weakI see gnocchi logs are clean, so it looks like it's happy with redis.17:42
pradkwhats the issue with redis  if i may ask17:42
pradkwe fixed the missing firewall rule issue last week17:42
pradkso all should be fine17:42
panda|weakpradk: I don't know exactly, the slaves have this output in the logs17:42
panda|weak23087:S 07 Nov 17:42:46.210 # Unable to connect to MASTER: Connection timed out17:43
panda|weak23087:S 07 Nov 17:42:47.219 * Connecting to MASTER no-such-master:637917:43
panda|weakfrom a slave host to a master host, telnet to port 6379 is working17:43
pradkhmm suspect the same firewall thing ..17:44
pradkpanda|weak, this is with latest build?17:44
pradkpanda|weak, can you check iptables -L |grep redis17:44
panda|weakpradk: latest build of what ? redis ?17:44
panda|weakACCEPT     tcp  --  anywhere             anywhere             multiport dports 6379,26379 /* 108 redis */ state NEW17:44
pradkpanda|weak, assume you're seeing this in osp ?17:44
panda|weakon slave17:44
panda|weakACCEPT     tcp  --  anywhere             anywhere             multiport dports 6379,26379 /* 108 redis */ state NEW17:45
panda|weakon master17:45
pradkyep that looks fine17:45
*** lucasagomes is now known as lucas-afk17:45
pradkpanda|weak, if central agent, gnocchi metricd are all coordinating correctly i assume all is fine17:45
panda|weakpradk: rdo, we're trying to track down why operations after deployment in tripleo CI are timing out17:45
*** hewbrocca is now known as hewbrocca_afk17:46
*** jpich has quit IRC17:47
pradkpanda|weak, ok any connection errors to redis in ceilometer/central.log , gnocchi/metricd.log17:47
panda|weakpradk: lots of delays and performance issue, in the past two weeks17:47
pradkpanda|weak, understand, anything specific pointing to redis as the reason for performance issue?17:47
*** dtantsur is now known as dtantsur|afk17:47
*** athomas has quit IRC17:47
pradkpanda|weak, bandini also mentioned this morning that metricd is taking up some cpu .. not sure if thats the same17:48
panda|weakpradk: that was the main issue last week, redis was not starting properly and metricd was eating all the CPU17:49
*** florianf has quit IRC17:49
panda|weakthis may be interesting17:49
panda|weak2016-11-07 17:38:29.244 16318 INFO swiftclient [-] REQ: curl -i -X GET -H "Accept-Encoding: gzip" -H "X-Auth-Token: 4cf0a0a7670a4160..."17:49
panda|weak2016-11-07 17:38:29.245 16318 INFO swiftclient [-] RESP STATUS: 401 Unauthorized17:49
sshnaidmpradk, how can I see that this sentinel is running? I don't see something related in process17:49
panda|weak2016-11-07 17:38:29.245 16318 INFO swiftclient [-] RESP HEADERS: {u'Content-Length': u'131', u'Www-Authenticate': u'Swift realm="AUTH_0ce54ff3166a4f639721fda10b83f1d8", Keystone uri=\'\'', u'X-Trans-Id': u'txbb70c1375ce24056b4388-005820bc15', u'Date': u'Mon, 07 Nov 2016 17:38:30 GMT', u'Content-Type': u'text/html; charset=UTF-8', u'X-Openstack-Request-Id':17:49
openstackgerritPaul Belanger proposed openstack/diskimage-builder: Add simple-playbook element
panda|weak2016-11-07 17:38:29.245 16318 INFO swiftclient [-] RESP BODY: <html><h1>Unauthorized</h1><p>This server could not verify that you are authorized to access the document you requested.</p></html>17:49
panda|weakin metricd17:49
openstackgerritMartin André proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles
pradkhmm that looks like a keystone issue17:50
pradkthere was a swift issue where proxy server was not able to talk to account server17:51
* panda|weak adds keystone to the stack of troublemakers today17:51
pradkand hence not finding the account17:51
pradkso this looks like gnocchi is trying to post measures to swift with a token and getting unauthorized17:53
pradkdid you check keystone if thats a valid token?17:53
panda|weakpradk: this error does appear regularly, but it's not flooding the logs17:54
bnemecpanda|weak: In case you aren't having enough fun today:
openstackLaunchpad bug 1639881 in tripleo "Bogus rabbit server address with ipv6" [Critical,Triaged]17:55
sshnaidmpanda|weak, pradk sorry, but I don't see any sentinel running on hosts.. Do I miss something?17:57
*** jkilpatr_ has quit IRC17:58
panda|weaksshnaidm:  maybe that's the problem17:58
panda|weakpradk: I'm trying to look at the token17:58
panda|weakbnemec: yay.17:59
gfidentesshnaidm panda|weak pradk there isnt any redis sentinel with pcmk17:59
panda|weakbnemec: I remember those happy times when I was working on some features, instead of fixing nested issues ...18:00
sshnaidmgfidente, that's what I suspected..18:00
gfidentesshnaidm yeah that is expected18:00
pradkyea with ng arch i guess things changed .. but we do try to configure it still ..
pradki guess we havent removed the code yet18:00
sshnaidmgfidente, does something manage redis then?18:00
gfidentesshnaidm pcmk does18:01
bnemecpanda|weak: Welcome to tripleo, aka the project that finds everyone else's bugs. :-)18:01
gfidenteshadower it uses a RA to set the replica master across the set18:01
gfidentesshnaidm ^18:01
sshnaidmpanda|weak, weshay  let's track it in the new issue:
openstackLaunchpad bug 1639885 in tripleo "CI: pingtest timeouts cause by performance issues (redis, swift, ceiliometer)" [Undecided,New]18:01
sshnaidmgfidente, RA?18:02
gfidenteresource agent18:02
gfidentehaproxy has a rule to gather what is the redis master too18:02
gfidentewhich is set by pcmk18:02
*** derekh has quit IRC18:03
sshnaidmgfidente, any link to read about it? or how to trace it18:03
sshnaidmgfidente, I try to understand why redis doesn't connect to its master..18:03
gfidentestart by checking the haproxy listener config18:03
sshnaidmgfidente, seems ok18:03
gfidenteso is the problem that non-primary nodes don't connect to the master?18:04
*** fragatina has joined #tripleo18:04
panda|weakpradk: ... python-keystoneclient is installed in undercloud but there is no keystone or openstack identity command ... am I missing something ?18:05
pradkpanda|weak, part of openstack cli?18:05
pradkopenstack endpoint blah i think18:06
*** yamahata has joined #tripleo18:06
sshnaidmgfidente, yes18:07
sshnaidmgfidente, they don't know about it AFAIU18:07
*** shardy has quit IRC18:07
*** jkilpatr_ has joined #tripleo18:07
gfidentesshnaidm right that's what the pcmk resource agent is meant to control18:08
gfidentesshnaidm is there a bug for this?18:08
sshnaidmgfidente, not special, only here:
openstackLaunchpad bug 1639885 in tripleo "CI: pingtest timeouts cause by performance issues (redis, swift, ceiliometer)" [High,Triaged]18:09
*** rhallisey has quit IRC18:09
panda|weakpradk: .. ok I don't know how to check the token ..18:09
*** jayg is now known as jayg|g0n318:09
pradkyea i dont think that should cause any performance issues though18:10
pradkjust sounds like a mis config18:10
*** dsariel has quit IRC18:10
*** jcoufal_ has joined #tripleo18:10
*** rhallisey has joined #tripleo18:10
panda|weakpradk: doesn't that mean that ceilometer is unable to store data on swift ?18:10
gfidentesshnaidm can you paste the output from18:10
*** chandankumar has joined #tripleo18:10
gfidentereplication info18:10
gfidentefrom all nodes?18:10
gfidente(into the bug)18:10
gfidentehaproxy knows the password to send commands to redis, you can take it from there18:11
pradkpanda|weak, gnocchi is not able to post measures .. yea which looks like a config issue18:11
sshnaidmgfidente, what is replication info..?18:11
gfidentesshnaidm can I get on the environment?18:11
sshnaidmgfidente, yep18:11
gfidentewe do it together18:11
panda|weak[root@overcloud-controller-0 log]# redis-cli replication info18:11
*** jayg|g0n3 is now known as jayg18:11
panda|weakCould not connect to Redis at Connection refused18:11
panda|weakCould not connect to Redis at Connection refused18:11
sshnaidmgfidente, your public key, please18:11
gfidentepanda|weak nah it's not binding on
*** chandankumar has quit IRC18:11
sshnaidmpanda|weak, where is this log  from?18:12
*** jcoufal has quit IRC18:13
sshnaidmgfidente, look at priv18:13
panda|weakso recapping. nonblocking_notify is causing performance issues. Redis is unable to replicate to slaves, and gnocchi is unable to store measures in swift18:13
*** jkilpatr_ has quit IRC18:13
sshnaidmgfidente, it's panda|weak's environment :)18:13
panda|weakdid I forget something ?18:13
*** fzdarsky is now known as fzdarsky|afk18:13
panda|weaksshnaidm: no logs, just a command18:14
*** jaosorior_sick has quit IRC18:14
*** rbrady-afk is now known as rbrady18:14
*** rhallisey has quit IRC18:15
*** akrivoka has quit IRC18:15
*** rhallisey has joined #tripleo18:15
panda|weaksshnaidm: maybe the wrongest command ever18:16
*** akrivoka has joined #tripleo18:17
panda|weakmh redis-cli  -h is taking a lot of time to answer to any command18:17
*** dtrainor has quit IRC18:18
sshnaidmpanda|weak, how can I get to overcloud nodes in your env?18:18
*** aufi has quit IRC18:18
*** yamahata has quit IRC18:19
panda|weaksshnaidm: ssh heat-admin@
sshnaidmpanda|weak, omg, it's so sloooooow18:20
panda|weaksshnaidm: dns reverse check has to timeout first ...18:20
sshnaidmpanda|weak, and Permission denied (publickey,gssapi-keyex,gssapi-with-mic)18:20
sshnaidmpanda|weak, root or jenkins?18:21
panda|weaksshnaidm: should be jenkins18:21
sshnaidmpanda|weak, nothing.. if you get access, can you give it to gfidente please?18:21
gfidenteI am there18:22
sshnaidmpanda|weak, adding jenkins keys there18:22
gfidentewaiting for info replication to return18:22
sshnaidmgfidente, great18:22
*** yamahata has joined #tripleo18:22
panda|weakgfidente: does it usually take so long to reply ?18:22
gfidentepanda|weak not at all18:22
sshnaidmpanda|weak, ssh to overcloud node is extremely slow, box seems very busy18:23
gfidenteso yeah this will cause cascading issues I suppose18:23
panda|weaksshnaidm: ssh is slow because it's trying to reverse dns before letting you in18:23
sshnaidmpanda|weak, still, so much time18:24
gfidentebut yes looks like pcmk failed to set the master node18:24
panda|weakgfidente: any ideas why it could be so slow ?18:24
*** dtrainor has joined #tripleo18:24
*** jpena is now known as jpena|off18:24
*** dougbtv has quit IRC18:24
*** liverpooler has quit IRC18:24
panda|weakpcs status says differently18:24
panda|weak Master/Slave Set: redis-master [redis]18:24
panda|weak     Masters: [ overcloud-controller-2 ]18:24
gfidenteright replica in between controller-1 and controller-2 is working fine18:26
gfidenteand they are much faster as well18:26
*** jkilpatr_ has joined #tripleo18:26
*** sshnaidm is now known as sshnaidm|brb18:26
panda|weakbnemec: is TripleoCI like this by design ?18:27
openstackgerritMerged openstack/python-tripleoclient: Pass clients to get the get_password function
panda|weakgfidente: replica between the two slaves ?18:28
gfidentepanda|weak no overcloud-controller-2 is the master18:29
*** mcornea has quit IRC18:29
*** sudipto has quit IRC18:30
*** sudipto_ has quit IRC18:30
panda|weakoh right.18:30
*** fragatina has quit IRC18:30
gfidenteso it's really only controller-0 which is so slow18:30
*** fragatina has joined #tripleo18:31
*** ohamada has quit IRC18:31
*** yamahata has quit IRC18:31
gfidenteI think -0 is unable to join the replica because it is too slow18:31
gfidenteI tried to issue slaveof manually and it didn't work either18:32
gfidentewhy only -0 is so slow remains to be seen18:32
gfidenteit joined the cluster now18:34
gfidenteand seems much quicker18:34
gfidentein fact it now goes at the same speed as the others18:34
gfidenteWOW :)18:34
panda|weakgfidente: check now18:34
gfidenteyeah it's good now18:35
gfidenteI set slaveof18:35
gfidenteand as soon as that was set it went to normal speed18:35
gfidenteand joined the replica18:35
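The check gfidente ran (`info replication` from redis-cli, then `slaveof` on the node that had not joined) can be sketched as a small parser. This is only an illustration: the sample text below is made up from the hostnames seen in this log rather than captured from the environment, and `parse_info` and `replica_is_healthy` are hypothetical helper names. `role`, `master_host`, `master_port` and `master_link_status` are fields Redis reports in the replication section of INFO.

```python
# Hypothetical sketch: interpret the text returned by `redis-cli info replication`.
# The sample outputs are illustrative, not captured from the CI environment.


def parse_info(text):
    """Turn 'key:value' lines from INFO output into a dict, skipping '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def replica_is_healthy(fields):
    """A replica is considered healthy here when its link to the master is up."""
    return fields.get("role") == "slave" and fields.get("master_link_status") == "up"


SAMPLE_BEFORE = """# Replication
role:slave
master_host:no-such-master
master_port:6379
master_link_status:down
"""

SAMPLE_AFTER = """# Replication
role:slave
master_host:overcloud-controller-2
master_port:6379
master_link_status:up
"""

for label, sample in (("before slaveof", SAMPLE_BEFORE), ("after slaveof", SAMPLE_AFTER)):
    info = parse_info(sample)
    print(label, info["master_host"], replica_is_healthy(info))
```

Running the same check against every controller would show which node still points at `no-such-master`, which is the state the slave logs earlier in the channel suggest for controller-0.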
*** mgould is now known as mgould|afk18:36
panda|weakgfidente: so it's not CPU usage on - the problem ?18:36
gfidenteso it seems to be working fine to me now18:36
gfidenteno it's not18:36
gfidentecpu usage remains high, but after it joined the cluster redis became responsive18:36
gfidentethough I don't think this would cause any issue to the clients18:36
gfidentewhich are forwarded to the master node only anyway18:36
gfidentesorry guys, going for dinner, be back later18:37
panda|weakgfidente: ok, thanks!18:37
panda|weaksshnaidm|brb: I think it's better to create different bugs for the different issues.18:37
openstackgerritBen Nemec proposed openstack/python-tripleoclient: Pass clients to get the get_password function
*** hjensas has quit IRC18:39
dsneddonA change in disk-image-builder will affect our use of networking in the overcloud images.
dsneddonOnce this patch goes in, we have to choose between the network service and NetworkManager, instead of running both. ^^^18:39
dsneddonI may write an email discussing this to openstack-dev, but I was wondering if anyone had any insights on possible effects of disabling NetworkManager in our overcloud images universally.18:40
dsneddondprince, gfidente, bnemec, slagle, any thoughts? ^^^18:40
slagledsneddon: i don't know of any possible side effects. pretty much the only reason i know of why it's not disabled already is b/c we've understood that you shouldn't have to18:42
slaglebut i take it that is no longer true?18:42
*** milan has quit IRC18:42
dprincedsneddon: I suppose our existing approach was mostly around letting distro defaults persist, and simply setting NM managed = no via os-net-config where we needed it18:43
dprincedsneddon: is there a reason this approach isn't working anymore?18:43
dsneddonslagle, dprince: With the recent release of RHEL 7.3 we started seeing selinux AVC alerts because both services were trying to run dhclient for the same interface.18:44
dsneddonIt hasn't caused any problems per se, but it seems like bad behavior.18:44
*** akrivoka has quit IRC18:44
openstackgerritMerged openstack/tripleo-heat-templates: Change nova ram_allocation_ratio to match puppet-nova
*** akrivoka has joined #tripleo18:45
dprincedsneddon: in that case, a setting to control/select which one we want running seems reasonable18:45
dsneddondprince, What concerns me is that in Noam's DIB patch, he claims that "NetworkManager know how to handle link carrier and network service don't. This crucial for scenarios like nova suspend resume,shelve unshelve,and co. NetworkManager know when this signal received to initiate DHCP"18:45
dsneddondprince, I hadn't heard that before, but I think his use case is at odds with ours, and we would likely set DIB_NETWORK_MANAGER='network' to disable NM.18:46
dprincedsneddon: I suppose I'd rather not have it baked into an image though. It would be much better to have it configured dynamically18:46
dsneddondprince, I agree, it seems like a big hammer to use to address the issue of both services being enabled.18:47
slagledsneddon: is this the bz? bug 1390011 in rhel-osp-director "dhclient related selinux avcs on the overcloud nodes" [Urgent,Assigned] - Assigned to bfournie18:47
*** saneax is now known as saneax-_-|AFK18:48
bfournieslagle: yes the problem with both network and NetworkManager running mainly comes into play with dhcp-all-interfaces because NM_CONTROLLED=no is not set there18:48
slagleit sounds like a result of NM + dhcp-all-interfaces18:48
slaglein which case, that's why baking it into the image feels necessary18:48
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements
dsneddonslagle, I think you're right.18:49
*** cylopez has left #tripleo18:49
bfournieslagle: yes, we'd have to either choose a networking service for the image or set NM_CONTROLLED=no, but the 2nd one causes problems for anyone using dhcp-all-interfaces who wants to use NetworkManager, as Noam's patch refers to18:51
dsneddonbfournie: If I understand the patch correctly, there will no longer be an option to have both networking services enabled, so I don't think that just setting NM_CONTROLLED=no and not specifying a DIB_NETWORK_MANAGER is an option.18:55
bfourniedsneddon: yes, I agree18:55
bfourniedsneddon: once that patch goes in we wouldn't have that option, and also should remove the setting of NM_CONTROLLED=no from os-net-config and just rely on 'network' being set for DIB_NETWORK_MANAGER18:57
dsneddonbfournie, I see no reason to change os-net-config, especially as that might affect upgrades (the removal of NM_CONTROLLED=no will cause os-net-config to restart the interfaces).18:59
lblanchardjrist-afk, jtomasek|afk, honza…and anyone else who may be interested…I wanted to show you all some ideas I had on composable roles in the UI. This is future thinking, but just wanted to throw it out there:
dsneddonbfournie, Although I do think we could make it an os-net-config parameter, potentially.19:00
bfourniedsneddon: true, ok. It would only matter if DIB_NETWORK_MANAGER was set to NetworkManager19:00
honzalblanchard: nom nom nom19:00
lblanchardhonza: :)19:01
openstackgerritPaul Belanger proposed openstack/diskimage-builder: Add simple-playbook element
*** ayoung has joined #tripleo19:06
honzalblanchard: given how little i know about this feature, i see no issues with the wireframes; it makes sense and it's simple19:07
*** rbowen has quit IRC19:08
*** trown|lunch is now known as trown19:10
*** radeksmg has joined #tripleo19:12
openstackgerritMichele Baldessari proposed openstack/puppet-pacemaker: WIP DO NOT MERGE Allow the creation of resources in disabled state
dsneddondprince, I think we may need to call 'systemctl restart network' after os-net-config runs. In which repo does the script live now that calls os-net-config?19:20
weshaypanda|weak, sshnaidm|brb thanks guys :)19:21
dprincedsneddon: it still lives in the elements19:22
dprincedsneddon: tripleo-image-elements19:22
dsneddondprince, Ah, I thought we had moved away from that, thanks.19:22
dprincedsneddon: we need to update steve's review to move it into t-h-t again. I would much prefer if we moved away from elements for this19:23
*** rbowen has joined #tripleo19:23
dsneddondprince, Yes, I agree, I'll take a look at Steve's review.19:23
dprincedsneddon: he may have abandoned it btw. You might need to dig a bit .... ;)19:24
dsneddondprince, Ah, yes, found it, I remember reviewing this many moons ago19:24
dsneddondprince, Which explains the vague memory that we had already moved it from t-i-e19:25
slaglebnemec: can you review
*** d0ugal has quit IRC19:30
*** d0ugal has joined #tripleo19:31
*** d0ugal has quit IRC19:31
*** d0ugal has joined #tripleo19:31
openstackgerritMerged openstack-infra/tripleo-ci: Add Barbican key order to scenario002
openstackgerritMerged openstack/tripleo-heat-templates: Include keystone authtoken config in manila-share service
*** cylopez has joined #tripleo19:38
*** yamahata has joined #tripleo19:39
lblanchardhonza: thanks for reviewing!! I don't know a whole lot about the feature either unfortunately :( Maybe jtomasek|afk can comment more tomorrow and enlighten us.19:41
openstackgerritMerged openstack/tripleo-heat-templates: Move db settings from manila-api to manila-base
*** pkovar has quit IRC19:44
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Pass clients to get the get_password function
*** akrivoka has quit IRC19:47
*** akrivoka has joined #tripleo19:48
*** sshnaidm|brb is now known as sshnaidm19:52
openstackgerritLeif Madsen proposed openstack/tripleo-docs: Link to RDO built images
leifmadsentrown: ^^ fyi19:53
leifmadsenthx again19:53
*** yamahata has quit IRC19:54
openstackgerritBrent Eagles proposed openstack/puppet-tripleo: WIP: Call VF configuration from udev rules
*** dprince has quit IRC20:00
sshnaidmgfidente, what did you do exactly so controller was back to life?20:00
*** cylopez has quit IRC20:02
*** dsneddon_ has joined #tripleo20:06
*** dciabrin has quit IRC20:16
*** dciabrin has joined #tripleo20:18
*** mcornea has joined #tripleo20:18
panda|weaksshnaidm: issued slaveof command from redis-cli20:18
panda|weaksshnaidm: I'm splitting the issues, one per bug in launchpad20:19
sshnaidmpanda|weak, sure, agree20:19
*** dougbtv has joined #tripleo20:22
*** noslzzp has quit IRC20:24
*** noslzzp has joined #tripleo20:24
*** akrivoka has quit IRC20:25
*** d0ugal has quit IRC20:26
*** d0ugal has joined #tripleo20:27
*** d0ugal has quit IRC20:27
*** d0ugal has joined #tripleo20:27
*** maeca1 has joined #tripleo20:33
*** dsneddon_ has quit IRC20:37
openstackgerritMerged openstack/tripleo-heat-templates: Add an optional extra node admin ssh key parameter
openstackgerritLeif Madsen proposed openstack/tripleo-docs: Link to RDO built images
*** mhenkel has quit IRC20:53
openstackgerritMerged openstack/python-tripleoclient: Fix handling of missing environment files
*** Goneri has quit IRC20:57
*** dsariel has joined #tripleo20:58
*** mcornea has quit IRC21:01
openstackgerritJames Slagle proposed openstack/python-tripleoclient: Fix handling of missing environment files
*** ayoung has quit IRC21:04
mwhahahaanyone have any thoughts as to why an HA deploy fails at the pcs cluster setup with Unable to authenticate to overcloud-controller-0 - (HTTP error: 401)21:11
*** rhallisey has quit IRC21:12
mwhahahaoh i guess not having a corosync.conf might be problematic21:13
trozethi can someone tell me what the overcloud parameter outputs for <service>InternalVip are used for? like  NovaInternalVip:21:13
*** iranzo has quit IRC21:14
*** ebarrera has joined #tripleo21:15
mwhahahatrozet: aren't they for like internal access between the services?21:15
trozetmwhahaha: I just can't find any reference to the variable in THT, so not sure where it is being used21:16
trozetmwhahaha: the description says stuff like VIP for Neutron API internal endpoint21:16
trozetmwhahaha: I looked at endpoint_map I dont see it using them21:17
trozetdsneddon maybe you know^^^^^?21:17
dsneddontrozet, That gets constructed, based on the setting for NovaApiNetwork in the ServiceNetMap21:18
trozetdsneddon: yeah i see how it gets created, I'm just not sure what it is used for afterwards21:19
dsneddontrozet, You can override which network that lives on, in which case the IP will be different as a result of this line in overcloud.j2.yaml: value: {get_attr: [VipMap, net_ip_map, {get_attr: [ServiceNetMap, service_net_map, NovaApiNetwork]}]}21:19
trozetdsneddon: like I don't see any reference to *InternalVip anywhere in THT21:19
*** jayg is now known as jayg|g0n321:19
dsneddontrozet, Yeah, I don't see it used anywhere, either.21:19
trozetdsneddon: I am trying a deployment now and just deleted the ODL one21:20
trozetdsneddon: if it works maybe i will try another deployment and delete all of them?21:20
bnemectrozet: It's possible they aren't being used yet.  There's ongoing work to have ssl everywhere that I think those may have been added for.21:22
*** rbrady is now known as rbrady-afk21:24
*** ayoung has joined #tripleo21:27
*** jkilpatr_ has quit IRC21:27
*** maeca1 has left #tripleo21:29
trozetbnemec, dsneddon, mwhahaha:
trozetit's not used anymore in the endpoint map21:31
trozetI'm going to push a patch to remove them21:31
dsneddontrozet, Sounds good to me21:34
bnemecAh, interesting.21:35
openstackgerritTim Rozet proposed openstack/tripleo-heat-templates: Fixes incorrect reference to OpendaylightApiNetwork
*** rbowen has quit IRC21:38
trozetdsneddon, bnemec: does it need a bug ID?21:38
*** yamahata has joined #tripleo21:41
*** rbowen has joined #tripleo21:41
*** jcoufal_ has quit IRC21:41
openstackgerritBen Nemec proposed openstack/instack-undercloud: Newtonthing to see here
dsneddontrozet, Hmm, it never hurts, although we do sometimes remove cruft without a bug ID.21:43
dsneddontrozet, For this many lines, I would vote yes on a bug ID21:43
trozetdsneddon: ok21:43
trozetlol on the bnemec commit msg^^^^21:44
*** trown is now known as trown|outtypewww21:44
bnemecGotta differentiate from my master Nothing to see here patch. ;-)21:45
openstackgerritTim Rozet proposed openstack/tripleo-heat-templates: Removes deprecated overcloud VIP outputs
openstackgerritMerged openstack/tripleo-common: Sets defaults in swift connection related to retries
*** jkilpatr has joined #tripleo21:58
*** cylopez has joined #tripleo21:59
openstackgerritMichele Baldessari proposed openstack/puppet-pacemaker: WIP DO NOT MERGE Allow the creation of resources in disabled state
*** lblanchard has quit IRC22:05
*** yamahata has quit IRC22:06
*** mhenkel has joined #tripleo22:07
*** jprovazn has quit IRC22:13
*** tiswanso has quit IRC22:23
*** cylopez has quit IRC22:23
*** radeksmg has quit IRC22:26
*** absubram has quit IRC22:26
*** pblaho has quit IRC22:31
*** pblaho has joined #tripleo22:33
*** fragatin_ has joined #tripleo22:37
*** eglynn has quit IRC22:38
openstackgerritBrent Eagles proposed openstack/os-net-config: WIP: Add support for enabling hotplug on interfaces
openstackgerritBrent Eagles proposed openstack/os-net-config: WIP: Add support for enabling hotplug on interfaces
*** fragatina has quit IRC22:40
dsneddonbeagles, Would HOTPLUG=yes/no really apply to *any* interface type? Instead of applying it to BaseOpts, shouldn't it only be added to objects which support hotplug events?22:40
beaglesdsneddon: good point... should only be relevant to Interface... will fix22:42
dsneddonbeagles, Yeah, interface, and *maybe* Infiniband interfaces, but since we don't have hardware to test, probably just Interface for now.22:42
*** tiswanso has joined #tripleo22:47
*** bfournie has quit IRC22:51
*** tiswanso has quit IRC22:52
*** dsariel has quit IRC22:54
panda|weaksshnaidm: I don't think anything we've found so far is the real issue with HA jobs22:56
openstackLaunchpad bug 1639970 in tripleo "CI: cinder fails to allocate memory while creating volume for ping test tenant" [Critical,Confirmed]22:56
*** limao has joined #tripleo22:56
panda|weakexhausted, going to bed.22:56
*** panda|weak is now known as panda|zZ22:56
sshnaidmpanda|zZ, this is something new22:57
sshnaidmpanda|zZ, g'nite!22:57
panda|zZsshnaidm: therve showed me a few hours ago, but I was blind22:57
panda|zZall the failing jobs of the past hours have that message22:58
sshnaidmpanda|zZ, you were weak22:58
*** limao_ has joined #tripleo22:58
sshnaidmpanda|zZ, arxcruz saw this in tempest already..22:58
panda|zZsshnaidm: heh.22:58
panda|zZmaybe 6G is not enough anymore for the overcloud nodes ?22:58
openstackgerritBrent Eagles proposed openstack/os-net-config: WIP: Add support for enabling hotplug on interfaces
*** limao has quit IRC23:01
sshnaidmpanda|zZ, or to create an image smaller than 1GB23:02
sshnaidmbut seems it's not possible23:03
*** saneax-_-|AFK is now known as saneax23:07
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Reload haproxy configuration as a post-deployment step
*** morazi has quit IRC23:17
*** gfidente has quit IRC23:21
*** rlandy has quit IRC23:24
*** pradk has quit IRC23:25
*** ayoung has quit IRC23:31
*** tiswanso has joined #tripleo23:34
*** ayoung has joined #tripleo23:34
*** bfournie has joined #tripleo23:38
*** tiswanso has quit IRC23:38
*** limao_ has quit IRC23:46
*** ayoung has quit IRC23:49
*** sshnaidm is now known as sshnaidm|away23:53
*** dciabrin has quit IRC23:56