Tuesday, 2016-10-04

*** cdearborn has quit IRC00:04
*** [1]cdearborn has joined #tripleo00:19
*** social has quit IRC00:22
*** bana_k has quit IRC00:29
*** saneax is now known as saneax-_-|AFK00:30
*** [1]cdearborn has quit IRC00:39
*** bana_k has joined #tripleo00:45
*** thrash is now known as thrash|g0ne00:50
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Migrate overcloud update to a mistral workflow  https://review.openstack.org/38135100:51
*** tiswanso has joined #tripleo00:57
*** limao_ has quit IRC01:03
*** limao has joined #tripleo01:04
*** limao has quit IRC01:07
*** bana_k has quit IRC01:37
*** bana_k has joined #tripleo01:40
larsksI'm seeing deployments fail with: [overcloud]: CREATE_FAILED  Resource CREATE failed: Expression consumed too much memory01:54
*** bana_k has quit IRC01:54
larsksI am pretty sure this is not the undercloud itself running out of memory.01:54
larsksAny hints as to what this error is referring to?01:55
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Include ceilometer in swift proxy pipeline  https://review.openstack.org/37195002:16
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: Clean out UI httpd configuration file  https://review.openstack.org/38015202:18
*** yamahata has quit IRC02:36
*** dmacpher is now known as dmacpher-afk02:39
*** myoung|bbl is now known as myoung02:44
openstackgerritMerged openstack/tripleo-heat-templates: Cinder volume service is not managed by Pacemaker on BlockStorage  https://review.openstack.org/38124003:00
openstackgerritMerged openstack/tripleo-heat-templates: Set ceph osd max object name and namespace len on upgrade when on ext4  https://review.openstack.org/37940103:01
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: reload HAProxy config in HA setups when certificate is updated  https://review.openstack.org/38137403:04
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Set ceph osd max object name and namespace len on upgrade when on ext4  https://review.openstack.org/38137503:05
openstackgerritMerged openstack/python-tripleoclient: Remove another openstackclient import  https://review.openstack.org/38124703:07
*** sudipto has joined #tripleo03:08
*** sudipto has joined #tripleo03:09
*** sudipto_ has joined #tripleo03:09
*** links has joined #tripleo03:10
openstackgerritTuan Luong-Anh proposed openstack/tripleo-specs: Fix a typo in documentation  https://review.openstack.org/38137903:13
*** sudipto has quit IRC03:23
*** sudipto_ has quit IRC03:23
openstackgerritRedHat RDO CI proposed openstack/tripleo-heat-templates: GATE TEST, please ignore  https://review.openstack.org/36544903:30
*** tiswanso has quit IRC03:45
*** sudipto_ has joined #tripleo04:07
*** sudipto has joined #tripleo04:07
*** dmacpher-afk is now known as dmacpher04:09
*** absubram has joined #tripleo04:17
*** absubram_ has joined #tripleo04:18
*** absubram has quit IRC04:21
*** absubram_ is now known as absubram04:21
*** tzumainn has quit IRC04:27
*** saneax-_-|AFK is now known as saneax04:33
*** pmannidi has quit IRC04:39
*** pgadiya has joined #tripleo04:43
*** pmannidi has joined #tripleo04:56
*** rwsu has quit IRC04:59
*** masco has joined #tripleo05:03
openstackgerritMerged openstack/tripleo-heat-templates: Make keystone api network hiera composable  https://review.openstack.org/38038605:18
*** fultonj has quit IRC05:36
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875005:40
*** ccamacho has quit IRC05:43
*** jaosorior has joined #tripleo05:46
openstackgerritMerged openstack/puppet-tripleo: Clean out UI httpd configuration file  https://review.openstack.org/38015205:48
*** flaper87 has joined #tripleo05:49
*** flaper87 has joined #tripleo05:49
*** mbozhenko has joined #tripleo05:51
*** yamahata has joined #tripleo05:59
jaosoriorbandini: it was still failing pretty randomly regarding the redis password06:06
jaosoriorbut at least today things look better06:06
jaosoriorI think the old package was still cached somewhere06:06
bandinijaosorior: ah that would explain things. I did go wtf this morning ;)06:07
jaosoriorindeed06:07
*** numans has joined #tripleo06:11
*** mbozhenko has quit IRC06:15
*** lmiccini has joined #tripleo06:15
*** bana_k has joined #tripleo06:18
*** rasca has joined #tripleo06:26
*** mbozhenko has joined #tripleo06:27
*** jbadiapa has quit IRC06:29
*** jbadiapa has joined #tripleo06:30
*** jprovazn has joined #tripleo06:30
openstackgerritMerged openstack/tripleo-heat-templates: reload HAProxy config in HA setups when certificate is updated  https://review.openstack.org/38137406:30
*** hjensas has quit IRC06:30
*** bana_k has quit IRC06:40
*** rcernin has joined #tripleo06:43
*** jlinkes has joined #tripleo06:45
*** ccamacho has joined #tripleo06:47
*** chem has joined #tripleo06:48
*** jlinkes has quit IRC06:50
openstackgerritSharat Sharma proposed openstack/tripleo-common: Changed the link to home-page  https://review.openstack.org/38144306:51
*** tesseract- has joined #tripleo06:52
openstackgerritSharat Sharma proposed openstack/tripleo-validations: Changed the link to home-page  https://review.openstack.org/38144506:54
*** cylopez has joined #tripleo06:54
*** jlinkes has joined #tripleo06:55
*** b00tcat has joined #tripleo06:56
*** yamahata has quit IRC06:56
*** jlinkes_ has joined #tripleo06:59
*** jlinkes_ has quit IRC06:59
*** jlinkes has quit IRC07:00
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove usage of --compute-scale from the baremetal overcloud docs  https://review.openstack.org/38145407:14
*** b00tcat has quit IRC07:15
*** b00tcat has joined #tripleo07:15
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove usage of --control-scale from the HA docs  https://review.openstack.org/38145807:17
*** fzdarsky has joined #tripleo07:21
*** panda|Zz is now known as panda07:22
*** dtantsur|afk is now known as dtantsur07:26
*** abehl has joined #tripleo07:28
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove the replace controller docs  https://review.openstack.org/38146907:29
*** aufi has joined #tripleo07:31
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875007:32
*** hjensas has joined #tripleo07:35
*** hjensas has quit IRC07:35
*** hjensas has joined #tripleo07:35
*** tremble has joined #tripleo07:37
*** tremble has joined #tripleo07:37
*** jlinkes has joined #tripleo07:37
*** snecklifter has joined #tripleo07:44
*** jpich has joined #tripleo07:44
snecklifterMorning #tripleo07:44
snecklifterOn my Mitaka undercloud node, keystone-manage token_flush doesn't appear to be reducing the db size.07:45
snecklifterAny ideas?07:45
openstackgerritMichele Baldessari proposed openstack/tripleo-heat-templates: Change the rabbitmq ha policies during an M/N Upgrade  https://review.openstack.org/38148507:45
*** jtomasek|afk is now known as jtomasek07:51
b00tcathi, I'm deploying overcloud-full and everything goes fine, but when the `deploy` command ends my controller is gone and only the compute node is up07:52
b00tcatI can see that the controller is up at one point using `nmap`07:53
b00tcatbut then it's gone07:53
b00tcatanybody saw this before?07:53
b00tcatI'm redeploying after a reboot and see if this helps..07:53
*** jpena|off is now known as jpena07:53
*** dbecker has quit IRC07:54
openstackgerritMichele Baldessari proposed openstack/tripleo-heat-templates: Change rabbitmq queues HA mode from ha-all to ha-exactly  https://review.openstack.org/38148907:55
*** dbecker has joined #tripleo07:56
*** yamahata has joined #tripleo07:56
bandinijaosorior: ok https://review.openstack.org/#/c/380665/ passed ci now (you were right something was probably cached somewhere). ok to +A it?07:56
*** jistr has joined #tripleo07:57
*** mcornea has joined #tripleo07:57
jaosoriorbandini: sure07:58
*** yamahata has quit IRC07:58
jaosoriorbandini: done07:58
bandinijaosorior++ thanks!07:58
*** rasca has quit IRC08:00
*** hewbrocca-afk is now known as hewbrocca08:01
*** radeks has joined #tripleo08:01
*** zoli_gone-proxy is now known as zoliXXL08:02
*** rwsu has joined #tripleo08:03
*** amoralej|off is now known as amoralej08:03
*** rasca has joined #tripleo08:03
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875008:06
skramajahello.. we got a new lab setup which we are trying to bring up for TripleO master. during the deploy process, the IPMI is able to boot the machine, DHCP request is reaching the undercloud and an IP is being assigned to the overcloud node. But after acquiring DHCP, the overcloud node is trying for a TFTP connect and it is timing out. what i understand is that it should use http to boot the kernel and ipa ramdisk.08:07
skramajaany idea whether TFTP is right and supposed to work, or is there any bios setting to use http only?08:07
*** yamahata has joined #tripleo08:07
skramajadtantsur: jaosorior ccamacho ^08:08
*** ohamada has joined #tripleo08:08
dtantsurskramaja, TFTP is needed to bootstrap iPXE, if your hardware does not support it natively08:09
ccamachojaosorior, hey man sorry for not making the reviews yesterday... Ill check your reviews now :)08:10
skramajadtantsur: TFTP connection is timing out. i tried tcpdump on port 69 there is no trace. any idea on how to debug the TFTP connection timeout?08:10
*** egafford has joined #tripleo08:11
*** stendulker has joined #tripleo08:11
jaosoriorccamacho: don't worry about it, thanks for checking them out08:11
dtantsurskramaja, hard to tell, maybe something about your hardware, maybe something about your networking08:11
dtantsurskramaja, first ensure that you access TFTP at least locally08:12
skramajadtantsur: yes. i did that locally and verified tftp is able to download the files successfully.08:12
*** yamahata has quit IRC08:13
*** yamahata has joined #tripleo08:13
bandinijaosorior: can you take a peek at this one as well please? https://review.openstack.org/#/c/379586/08:14
*** athomas has joined #tripleo08:15
*** milan has joined #tripleo08:20
mariosbandini: looks like there was more update at https://review.openstack.org/#/c/379586/08:24
mariosbandini: going over review requests from yesterday08:24
ccamachojaosorior, just to double check From: http://jaormx.github.io/2016/how-is-tls-powered-by-certmonger-being-done/08:24
ccamachoThen check: https://review.openstack.org/#/c/366548/ and https://review.openstack.org/#/c/356430/08:24
ccamachoright??08:24
bandinimarios: I addressed all the comments08:24
bandinimarios: or did I miss any?08:25
mariosbandini: no is fine i was just looking,.... jaosorior already +A it looks like anyway08:25
jaosoriorccamacho: yep08:26
ccamachoack08:26
mariosbandini: i mean, i didn't expect changes is all but it looks fine08:26
bandinimarios: I addressed all the comments last night, so it should be good afaict08:26
b00tcatis it possible to force a tripleo deployment to be sequential? a.k.a. first deploy the controller, then the compute?08:29
*** fzdarsky has quit IRC08:29
*** tosky has joined #tripleo08:31
*** akrivoka has joined #tripleo08:36
*** yamahata has quit IRC08:37
*** abehl has quit IRC08:39
d0ugalradeks: I have replied to your review here: https://review.openstack.org/#/c/38145408:51
*** dtantsur is now known as dtantsur|bbl08:52
*** abehl has joined #tripleo08:52
*** mbozhenko has quit IRC09:00
*** electrofelix has joined #tripleo09:00
openstackgerritJulie Pichon proposed openstack/puppet-tripleo: Clean out UI httpd configuration file  https://review.openstack.org/38156809:03
openstackgerritJulie Pichon proposed openstack/puppet-tripleo: Use FallbackResource instead of Rewrite for UI  https://review.openstack.org/38127609:04
*** derekh has joined #tripleo09:04
*** paramite has joined #tripleo09:04
*** gfidente has joined #tripleo09:11
*** mbozhenko has joined #tripleo09:11
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Move the opensuse mkinitrd script to the zypper element  https://review.openstack.org/38157409:11
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add zypper-minimal element  https://review.openstack.org/38157509:11
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add opensuse-minimal element  https://review.openstack.org/38157609:11
openstackgerritMerged openstack/puppet-tripleo: Fix the timeout for pacemaker systemd resources  https://review.openstack.org/38066509:12
*** abehl has quit IRC09:13
saneaxdtantsur|bbl, trying to debug the tftp issue, it seems the undercloud is getting a request like this -09:15
saneax0.60.21.4.echo > 10.60.21.255.echo: [udp sum ok] UDP, length 4309:15
saneax09:14:42.718948 IP (tos 0x0, ttl 128, id 0, offset 0, flags [none], proto UDP (17), length 71)09:15
saneaxshould the undercloud reply for echo request for 10.60.21.255?09:15
skramajadtantsur|bbl: saneax and i are working on the same issue ^^09:19
*** fzdarsky has joined #tripleo09:24
*** abehl has joined #tripleo09:25
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875009:27
*** zoliXXL is now known as zoli|mtg09:28
*** fzdarsky has joined #tripleo09:30
openstackgerritSharat Sharma proposed openstack/puppet-pacemaker: Changed the home-page to point Openstack Puppet Homepage  https://review.openstack.org/38158209:31
openstackgerritCarlos Camacho proposed openstack/python-tripleoclient: Add optional roles_data.yaml override  https://review.openstack.org/37874009:34
*** ramishra has quit IRC09:39
*** ramishra has joined #tripleo09:42
openstackgerritMichele Baldessari proposed openstack/puppet-tripleo: Fix the timeout for pacemaker systemd resources  https://review.openstack.org/38158409:42
*** shardy has joined #tripleo09:47
*** dtantsur|bbl is now known as dtantsur09:50
*** gfidente has quit IRC09:51
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158709:51
*** gfidente has joined #tripleo09:53
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158709:54
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add zypper-minimal element  https://review.openstack.org/38157509:57
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add opensuse-minimal element  https://review.openstack.org/38157609:57
*** jistr is now known as jistr|mtg09:59
*** zigo has quit IRC10:01
matbumarios: hey man, i'm trying to understand the CI failure for the tripleoclient review:10:02
matbuhttp://logs.openstack.org/47/379547/14/check/gate-tripleo-ci-centos-7-nonha-multinode-updates-nv/1ee38a4/console.html#_2016-10-03_15_17_44_15356810:02
matbumarios: i don't understand where this user-files/97850658acf0327e3882d6e8e00a2f2d-net-config-multinode.yaml comes from10:03
matbumarios: do you have an idea ?10:03
*** radeks has quit IRC10:03
*** zigo has joined #tripleo10:04
*** zigo is now known as Guest8478010:05
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in generic.role.j2.yaml  https://review.openstack.org/38159310:05
dtantsursaneax, skramaja, no idea what this echo request is, to be honest10:08
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158710:08
*** mbozhenko has quit IRC10:08
dtantsursaneax, skramaja, what is the hardware you're using? anything interesting about network topology? do you see TFTP requests coming to undercloud (port 69 IIRC)10:09
mariosmatbu: looking10:09
mariosmatbu: could it be something went wrong with the copied file?10:10
mariosmatbu: i mean if /tmp/tmp.... is what you just pulled from swift with the change you made10:10
shardymatbu: The additional -e foo.yaml files get added to the plan under a special user-files directory10:11
shardyso I suspect the path prefix is broken or something, looking at the logs10:11
mariosmatbu: shardy well there is an extra '/' in the path there: [Errno 2] No such file or directory: '/tmp/tmpTjatcP/tripleo-heat-templates//user-files/97850658acf0327e3882d6e8e00a2f2d-net-config-multinode.yaml'10:12
marioscould it be that i wonder10:12
*** leanderthal|afk is now known as leanderthal10:12
*** mbozhenko has joined #tripleo10:12
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875010:12
shardyIs it an ordering thing, e.g you're downloading from the plan before user-files is created?10:12
matbumarios: nop the extra / is not a pb10:12
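A quick way to check matbu's point that the doubled slash is harmless: the sketch below normalizes the path quoted in the error message using only the Python standard library. Nothing here comes from tripleoclient itself; it only illustrates how POSIX paths treat a repeated separator, so the "No such file or directory" is more likely about the file not existing than about the path syntax.

```python
import posixpath

# Path copied from the error message above, including the doubled slash.
reported = (
    "/tmp/tmpTjatcP/tripleo-heat-templates//user-files/"
    "97850658acf0327e3882d6e8e00a2f2d-net-config-multinode.yaml"
)

# A repeated slash inside a POSIX path is treated as a single separator,
# so normalizing only collapses it and leaves every component unchanged.
normalized = posixpath.normpath(reported)
print(normalized)
```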
shardyyou should be able to reproduce by passing a -e something_outside_tht.yaml10:13
matbushardy: perhaps, idk what is this user-file10:13
mariosmatbu: getting you link it is from ci10:13
matbushardy: hm k, i already tried on my env, with an extra file outside of tht10:13
marioshttps://github.com/openstack-infra/tripleo-ci/tree/master/heat-templates here matbu10:13
shardymatbu: d0ugal can probably explain it - basically it's where we put the stuff resolved locally which doesn't exist in the plan created from --templates foo10:14
*** Guest84780 has quit IRC10:14
shardyI think it's fair to say all of this logic needs work to simplify, but I'll take a look and see if I can spot where the patch is breaking10:14
saneaxno dtantsur - the tcpdump shows, DHCP request which gives the host IP address, after which we see two more requests -10:16
saneax04:44:42.149235 IP 10.60.21.81.newlixengine > rhbuildhost.nfv.rh.tftp:  35 RRQ "undionly.kpxe" octet blksize 145610:16
matbuack i'll try to reproduce it10:16
saneax04:45:18.175627 IP 10.60.21.81.newlixconfig > rhbuildhost.nfv.rh.tftp:  35 RRQ "undionly.kpxe" octet blksize 145610:16
matbumarios: thx marios10:16
*** zigo_ has joined #tripleo10:16
gfidentethanks jaosorior :)10:17
matbui think i will use the heat templates from tripleo-ci. Maybe i missed something, it probably needs additional unit tests10:17
dtantsursaneax, ok, you can grep journalctl over "in.tftp" and see what it says there10:18
jaosoriorgfidente: no biggie10:20
saneaxthere is nothing for in.tftp :(10:21
saneaxexcept the normal xinetd startup logs10:22
saneaxneed to run a more complete tcpdump on the undercloud and host machine10:23
*** hjensas has quit IRC10:24
saneaxI will ping you back dtantsur10:24
saneaxdtantsur, just for my knowledge, generally if UEFI boot is enabled,the request should be a http, after the DHCP offer ?10:27
shardyccamacho: Hey, I notice you rebased https://review.openstack.org/#/c/378740/10:28
shardyI was planning to add some tests and a similar interface for plan create today, unless you're planning to do it?10:28
shardywhat I posted should work tho, I tested it locally10:29
shardyif that would help share the load of the remaining composable roles patches, I can do that while you focus on the tht patches?10:29
openstackgerritMartin Mágr proposed openstack/puppet-tripleo: Deploy monitoring/logging agents sooner  https://review.openstack.org/38160410:31
ccamachoshardy yeah! If you can help there would be awesome10:32
ccamachoI notice that we are missing also the generation of the main template for custom roles10:33
shardyccamacho: ack, OK I'll focus on the tripleoclient stuff, and pull/test/review your tht patches10:33
shardyccamacho: yes, that's the final piece10:33
ccamachoack I already pushed the submission10:33
ccamachojust testing it locally10:33
shardyI think there's going to be a few ugly special-cases there to cope with inconsistent parameter naming etc10:33
shardyso we can either leave the old per-role templates in place (and ignore generating them), or have a few j2 conditionals10:34
ccamachoand the tripleo-common update as we should not replace i.e. compute.yaml with the generic one10:34
ccamachowhat im testing is to create a generic.role.j2.yaml10:34
shardyyeah, but if we choose the do-not-replace option, we've got a problem for updates, as we won't know when to replace10:34
shardyccamacho: ack, I'll check out your patch and we can discuss further if needed10:35
shardythanks for picking it up :)10:35
ccamachosure10:35
ccamachonp, let's hope we can land this before Thursday10:35
*** thrash|g0ne is now known as thrash10:37
dtantsursaneax, these are not quite related.. if your machine does not have iPXE ROM, it first has to download ipxe.efi (undionly.kpxe is for BIOS)10:39
saneaxdtantsur, so before it requests for undionly.kpxe, it should request for ipxe.efi?10:40
saneaxIt will help if I get a reference doc I can dig more into it10:40
*** leanderthal has quit IRC10:41
dtantsursaneax, instead10:43
dtantsurundionly.kpxe is for BIOS only, you don't need it for UEFI10:43
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in generic.role.j2.yaml  https://review.openstack.org/38159310:44
saneaxok, clear now, and the url should be undercloud_ip:69/undionly.kpxe10:45
saneaxthe url is working locally on the undercloud, however on the console for the remote server it's timing out10:45
*** rain has joined #tripleo10:45
*** rain is now known as leanderthal10:46
saneaxI can see a lot of udp echo requests landing on the undercloud with dest ip as the broadcast IP of the provisioning network10:46
saneaxundercloud is not replying10:46
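One way to narrow down a remote TFTP timeout like this is to send the same kind of read request from another machine on the provisioning network and see whether anything comes back. The sketch below builds an RRQ in the layout from RFC 1350 with the blksize option from RFC 2348, mirroring the request seen in the tcpdump output earlier ("undionly.kpxe", octet, blksize 1456). The helper names build_rrq and probe are hypothetical, and the probe only reports the opcode of the first reply; it is not a full TFTP client.

```python
import socket
import struct

OPCODE_RRQ = 1  # read request, per RFC 1350


def build_rrq(filename, mode="octet", blksize=1456):
    """Build a TFTP read request with a blksize option (RFC 2348)."""
    parts = [
        struct.pack("!H", OPCODE_RRQ),
        filename.encode("ascii") + b"\x00",
        mode.encode("ascii") + b"\x00",
        b"blksize\x00",
        str(blksize).encode("ascii") + b"\x00",
    ]
    return b"".join(parts)


def probe(host, filename, port=69, timeout=5.0):
    """Send one RRQ and return the opcode of the first reply, or None.

    Replies to expect: 3 (DATA), 5 (ERROR) or 6 (OACK). None means the
    request or the answer was lost, which points at the network path.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (host, port))
        try:
            data, _ = sock.recvfrom(2048)
        except socket.timeout:
            return None
    if len(data) < 2:
        return None
    return struct.unpack("!H", data[:2])[0]


packet = build_rrq("undionly.kpxe")
print(len(packet), packet)
```

The packet built here is 35 bytes long, which appears to line up with the "35 RRQ" in the captured requests. Calling probe("<undercloud ip>", "undionly.kpxe") from a host on the same segment as the overcloud node, while tcpdump runs on the undercloud, would show whether the request arrives and whether the reply makes it back.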
jaosoriorshardy: seems that the commit fixing the hostnames is failing10:51
jaosoriorshardy: I believe it's because, now that mysql is binding on the right place, trying to do actions on it fails, since we need to specify the hostname to the provider10:52
jaosoriorit used to work cause it used to bind to ctlplane10:52
shardyjaosorior: aha, I didn't see that locally as I didn't try with net-iso10:52
shardyjaosorior: do you have a fix you can push?10:53
jaosoriorshardy: I don't, I just started looking into it a bit ago10:53
shardyjaosorior: Ok, I was about to dig into it, but you've ahead of me in figuring it out, are you OK to go ahead and dig into the solution?10:54
shardys/you've/you're/10:55
jaosoriorshardy: looking into it10:56
shardyjaosorior: ack, thanks!10:57
*** rcernin has quit IRC10:57
*** rcernin has joined #tripleo10:58
*** jistr|mtg is now known as jistr11:00
openstackgerritMerged openstack/tripleo-heat-templates: Use netapp_host_type instead of netapp_eseries_host_type  https://review.openstack.org/36395511:00
*** dprince has joined #tripleo11:01
*** coolsvap is now known as coolsvap_11:01
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Use netapp_host_type instead of netapp_eseries_host_type  https://review.openstack.org/38165611:02
*** jpena is now known as jpena|lunch11:09
*** radeks has joined #tripleo11:10
jschlueterlarsks: you saw "[overcloud]: CREATE_FAILED  Resource CREATE failed: Expression consumed too much memory" failure ... did you get any hints or ideas why it was failing?11:11
jschlueterccamacho: ^^11:12
*** sudipto has quit IRC11:12
jschlueterwe just saw it again today and was wondering if anyone else had hints as to why or what they did to resolve this issue.11:12
*** amoralej is now known as amoralej|lunch11:13
ccamachojschlueter there was a patch to increase the heat memory11:13
shardyccamacho: Added comments to https://review.openstack.org/#/c/381593/11:13
ccamachomerged already11:13
*** sudipto has joined #tripleo11:13
shardyI like the general approach, but I think we can simplify things a bit11:13
jschlueterccamacho: ahh when did that get merged?11:13
jaosoriorshardy: so... the issue stems farther than I thought11:13
jschlueterafazekas: ^^11:13
shardyhttps://github.com/openstack/instack-undercloud/commit/55ccd0e1e8c66c9f474112358063bc263720d84f11:13
shardyjschlueter: ^^11:13
ccamachoshardy ack, good idea11:14
jaosoriornevermind11:14
jschluetershardy: thanks11:14
* jaosorior keeps reading11:14
*** ccamacho is now known as ccamacho|lunch11:14
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add zypper-minimal element  https://review.openstack.org/38157511:14
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add opensuse-minimal element  https://review.openstack.org/38157611:14
*** pkovar has joined #tripleo11:15
*** rhallisey has joined #tripleo11:15
afazekasjschlueter, shardy : thx11:16
*** zigo_ is now known as zigo11:23
*** tzumainn has joined #tripleo11:27
jschluetershardy, ccamacho|lunch: we are still seeing this in a CI job  internally11:28
jschluetereven with that patch applied11:28
shardyjschlueter: is it a particularly large deployment?  E.g much more than we test in upstream CI?11:29
jschluetershardy: actually a minimal 1 controller 1 compute 1 ceph deployment by InfraRed11:30
shardyjschlueter: ack, Ok that's strange that we're not seeing consistent errors11:30
jschlueterwhat should we be looking for to figure out what's causing the issue?11:31
jschlueterwhat are the moving pieces that affect this?11:31
*** karthiks has quit IRC11:33
larsksshardy, I'm seeing the same error as jschlueter, in approx the same situation (1 controller/1 compute).11:33
d0ugalmatbu: sorry, I was at the vet - did you get the answer you needed about the user-file?11:33
shardylarsks: Ok and you have the instack-undercloud patch?11:33
shardylarsks: what t-h-t commit are you using?11:34
shardysome more yaql stuff landed yesterday ...11:34
openstackgerritDougal Matthews proposed openstack/tripleo-docs: [WIP] Mistral API Documentation  https://review.openstack.org/35868511:36
larsksshardy, blew away the environment yesterday after repeated failures (I ran into some possibly unrelated issues and decided I needed to start fresh).  I will retry w/ current master as soon as I finish getting kids to school etc.11:37
shardylarsks: ack - I've not hit this locally so it'd be good to figure out what is different11:37
shardyone thing is I don't use tripleo-quickstart, so possibly there could be differences in the VM setup/specs11:37
EmilienMhi11:38
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Do Not Merge - Test Undercloud upgrade mitaka -> newton  https://review.openstack.org/38130911:38
jschluetershardy: yes we have the patch and THT is at 4cdc4fc11:39
*** stendulker has quit IRC11:40
jschluetershardy, larsks: we have a CI job that is hitting this issue if you need access to it let us know, or what to look for11:40
*** Goneri has quit IRC11:43
openstackgerritJames Slagle proposed openstack/tripleo-common: Centos images no longer require epel element  https://review.openstack.org/36897611:45
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove usage of --compute-scale from the baremetal overcloud docs  https://review.openstack.org/38145411:46
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove usage of --control-scale from the HA docs  https://review.openstack.org/38145811:47
openstackgerritDougal Matthews proposed openstack/tripleo-docs: Remove usage of --control-scale from the scale roles docs  https://review.openstack.org/38118711:47
*** zoli|mtg is now known as zoliXXL11:48
*** karthiks has joined #tripleo11:49
jschluetershardy, larsks, ccamacho|lunch: what am I looking for here? this is new debugging area for me but willing to learn and do foot work to figure this out but could use a pointer to where to start11:54
shardyjschlueter: I'd start by either trying to reproduce locally, or testing with your CI job and a temporary patch which increases the limits more (ref the instack-undercloud patch I linked)11:55
shardyeither we didn't increase the limit enough, or there's something about your environment causing the queries to use more memory11:55
shardyalso it'd be good to figure out from the heat logs if we're hitting the memory quota or limit_iterators limit11:56
* shardy hasn't yet checked if the errors raised are different11:56
jschluetershardy: thanks will attempt to see if an increase in that max will get it to work, but any tips to see what it was working on and why it was so big?11:57
shardyjschlueter: No, that's probably something we could add to heat, e.g when raising an error include the query (or reference to where it was defined)11:58
shardyjschlueter: I'd probably reproduce locally, then drop a line of debug in to the yaql function so we can see exactly which query blows up11:58
shardymaybe we can rework it to be more efficient11:58
shardyhttps://github.com/openstack/heat/blob/master/heat/engine/hot/functions.py#L108211:59
*** lucas-afk is now known as lucasagomes11:59
jschluetershardy: ack got another person who ran into it as well12:00
shardyjschlueter: that's where you could add some debug to see what's going on when it breaks12:00
jschluetershardy: thanks12:00
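The "line of debug" shardy describes could look roughly like the sketch below: log each query string just before it is evaluated, and log it again if evaluation raises, so the failing expression shows up in the logs. This is not the code in heat's functions.py; evaluate_logged, the logger name and the stand-in evaluator are all hypothetical, and the real evaluation call would have to be passed in as the evaluate argument.

```python
import logging

LOG = logging.getLogger("yaql_debug")  # hypothetical logger name


def evaluate_logged(expression, context, evaluate):
    """Log a query string, then hand it to the real evaluator.

    `evaluate` is whatever callable actually runs the query; it is an
    assumption of this sketch, not a heat or yaql API.
    """
    LOG.debug("evaluating expression: %s", expression)
    try:
        return evaluate(expression, context)
    except Exception:
        # Record which expression blew up before re-raising unchanged.
        LOG.exception("expression failed: %s", expression)
        raise


def fake_evaluate(expression, context):
    """Stand-in evaluator used only to exercise the wrapper."""
    if expression == "boom":
        raise MemoryError("Expression consumed too much memory")
    return len(context)


logging.basicConfig(level=logging.DEBUG)
result = evaluate_logged("$.data.len()", [1, 2, 3], fake_evaluate)
print(result)
```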
*** soc_off has joined #tripleo12:01
shardyjschlueter: I'm not sure if the context is included in the limit evaluation, but you might want to try https://review.openstack.org/#/c/381588/ if you can locally reproduce12:01
soc_offshardy: I'll try that soon12:02
*** flepied has quit IRC12:04
pradkmarios, Hi12:06
*** fultonj has joined #tripleo12:06
*** karthiks has quit IRC12:06
pradkmarios, saw your comments.. so yea our assumption to leverage hiera data from new templates is what led to this manifest12:06
mariospradk: hey man, did you see the comments ?12:06
mariospradk: yeah we get away with the stuff that was already there12:07
pradkmarios, i'm ok with adding the config params to the manifest12:07
pradkif there is a way to access that value from manifest12:07
pradkthe problem is we dont have this data in mitaka12:07
pradkas we used eventlet there12:07
mariospradk: the things that fails isn't trying to reference the new hiera directly, but rather, assuming the config from the new templates is already in place (wsgi.conf for ceilo-api for example)12:07
pradkmarios, hmm i'm referring to the servername syntax issue12:08
mariospradk: yeah we are saying same... the 'syntax' issue is because the server isn't set in the config file (empty) since we don't pass/set that from the hiera properly until the converge12:09
pradkright12:09
pradkwe can pass the servername => blah to apache class12:09
pradkbut is there a way to get that value from the new templates12:10
*** jayg|g0n3 is now known as jayg12:10
mariospradk: i think we can only rely on any existing hiera at this point in the upgrade i.e. not directly by referencing the config_settings of https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceilometer-api.yaml#L7412:11
jaosoriorpradk: aren't we already passing the servername to the vhost?12:11
pradkyea but as marios says thats not accessible until converge apparently12:12
mariospradk: so i think we can get to the values we need... am currently getting ready to push a sahara upgrades related change then i'll switch to mitaka branch and poke some more12:12
mariosjaosorior: problem is we want to set config here https://review.openstack.org/#/c/360004/16/extraconfig/tasks/mitaka_to_newton_ceilometer_wsgi_upgrade.pp12:12
mariosjaosorior: which is mitaka --> newton ... the setup of ceilometer under wsgi like in https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceilometer-api.yaml#L74 for example, won't happen until the converge step since we don't apply puppet before12:13
jpichjtomasek: FYI - https://bugs.launchpad.net/tripleo/+bug/1630222 . I don't know if this affects deployments, if you've run a successful deploy with these settings recently maybe you can confirm it's not a big problem :)12:13
openstackLaunchpad bug 1630222 in tripleo "Kernel / Ramdisk not set when registering nodes with the UI" [Medium,Triaged]12:13
jaosoriorpradk, marios: well, servername is not REALLY necessary for the vhost since for mitaka we're still accessing everything via IP addresses12:14
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: [WIP] Save the result of direct action calls in Mistral  https://review.openstack.org/38173912:15
jaosoriorwouldn't the servername be set in a later step once the puppet stuff runs? it should be fine12:15
pradkjaosorior, problem is apache restart fails with syntax error12:15
pradkon missing servername in conf file12:15
jtomasekjpich: thanks for bringing this up, yeah it is a problem. I am quite confident that image names were defaulted in the action before12:16
pradkjaosorior, https://paste.fedoraproject.org/442730/52346714/12:16
jaosoriorah crap12:16
jaosoriorwhy the hell are we passing an empty servername12:16
jpichjtomasek: It's possible, I remember struggling around that in order to allow the '--no-deploy-images' to work on the CLI with Mistral, so it's possible we need the UI to add the default as well or that will break12:16
soc_offshardy: increased mem limit didn't help, applying patch12:16
jaosoriorwe're supposed to be using defaults, and if it's empty it should not even write it X_x12:17
shardysoc_off: did you try increasing both limit_iterators and memory_quota?12:17
jaosoriorpradk, it's hacky... but what about giving a provisional servername coming from the $::hostname fact? (which is the actual default)12:17
soc_offshardy: yep added 0 to both12:18
jaosoriorpradk: later that will be overwritten anyway12:18
shardyhmm12:18
shardyOk, it's quite surprising that didn't fix it unless we've got a bad query blowing up huge memory usage somewhere12:18
shardy(which I'd expect to see locally and in upstream CI)12:18
*** akshai has joined #tripleo12:18
*** karthiks has joined #tripleo12:19
soc_offshardy: how do I find out what query is wrong?12:19
pradkjaosorior, hmm so you're suggesting just set servername => $::hostname in the manifest itself and let converge override it12:20
jaosoriorpradk: pretty much12:20
shardysoc_off: as I mentioned above, probably will need a line of debug added to the heat function, so you can dump out each query string just before it's evaluated12:20
pradkthat could work but i dont know how functional the api would be until the converge12:20
*** flepied has joined #tripleo12:20
pradks/functional/reachable perhaps12:20
jaosoriorpradk: we either way (until now) access the nodes via the IPs12:20
shardyhttps://github.com/openstack/heat/blob/master/heat/engine/hot/functions.py#L108112:21
jaosoriorso we don't really use the servername to route (yet)12:21
shardye.g log self._expression from there12:21
pradkmarios, ^^ what do you think of that12:21
jaosorioronly if you really modify your deployment (which I do)12:21
*** trown|outtypewww is now known as trown12:22
openstackgerritMarkos Chandras proposed openstack/diskimage-builder: Add opensuse-minimal element  https://review.openstack.org/38157612:22
mariospradk: it could be something to fall back on if we can't get a better fix... but we definitely expect services to be up and 'normal' after the step (there could be time lapse of days between them)12:22
EmilienMjaosorior, jpich, matbu, bandini, gfidente: we merged a bunch of patches recently, please make sure everything candidate for newton is backported in stable/newton, thanks12:23
mariosjaosorior: pradk i mean, assuming that is the only thing we need to set (or do we need to set all the config from https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceilometer-api.yaml#L7412:23
jpichEmilienM: Will do, thanks!12:23
jtomasekjpich: ok, we don't allow to specify node images in GUI yet, so the fix is quite straightforward - hardcode the default image names in the workflow input in GUI, I can create a fix for that unless you want to do it12:24
jaosoriormarios: what do you mean?12:24
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Make keystone api network hiera composable  https://review.openstack.org/38175412:24
mariospradk: well.. the service name you're setting already in https://review.openstack.org/#/c/360004/16/extraconfig/tasks/mitaka_to_newton_ceilometer_wsgi_upgrade.pp12:24
mariosjaosorior: i'm saying that we may need to pass all config at this stage like we do to setup the service under wsgi... it fails and we are discussing the servername one now... i was wondering about the rest of them12:25
EmilienMjaosorior, shardy: do we need something else to make https://review.openstack.org/#/c/378764/ passing CI?12:26
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Replace per role manifests with a common role manifest  https://review.openstack.org/38175712:26
jaosoriorwell, most of them are set, only ones I'm not sure about are the host bindings12:26
jaosoriorEmilienM: I'm investigating that. I have a theory on why it fails.12:26
jaosoriorso it might need something more12:27
pradki think we'll need bind_host and servername12:27
shardyEmilienM: yes, it's broken with network isolation, jaosorior is working on a fix12:27
pradkrest can default12:27
EmilienMshardy: ack. Also, can you look at https://review.openstack.org/#/c/379547/ ?12:27
mariosjaosorior: would appreciate any comments on the review esp your idea about using the $::hostname for now if needs be12:27
bandiniEmilienM: ack. am waiting for the rabbit folks to comment the last ones. If we do not get positive responses I will just drop them for newton12:28
pandahttps://review.openstack.org/363674 is failing liberty and mitaka, but I don't think it's related to the change ? is something missing to merge it ?12:28
jpichjtomasek: Yeah, I agree the fix doesn't look too complicated. If you have the fix ready to go, don't wait on me!! I have to leave for a couple of hours now but can look at it after the TripleO meeting otherwise12:28
shardyEmilienM: yup, I'd like to test that one locally, on my todo list for this afternoon12:28
jaosoriormarios: done12:28
EmilienMshardy: ack12:29
jtomasekjpich: ok, if I manage to create one before that I'll let you know12:29
mariosjaosorior: thanks12:29
EmilienMbandini: ok, let me know. Ideally, we need the patch merged by tomorrow12:29
bandiniEmilienM: yeah if it is not done today I will drop it most likely12:29
jpichjtomasek: Cool, I'll test+review if that's the case!12:29
pradkmarios, if you look at my initial patch https://review.openstack.org/#/c/360004/1/extraconfig/tasks/mitaka_to_newton_ceilometer_wsgi_upgrade.pp12:30
pradkmarios, i explicitly commented this out with a note12:30
EmilienMbandini: k12:30
pradkmarios, then i was told hiera data is accessible12:30
pradkmarios, and we move towards that approach12:30
pradkon another note, my undercloud upgrade seems to hang .. any known issues12:31
soc_offshardy: so first off changing just undercloud hiera won't work as the heat yaql options will be ignored by puppet and second even with the patch to functions.py it fails.12:31
*** cylopez has left #tripleo12:31
shardysoc_off: why are the heat yaql options ignored by puppet?12:32
soc_offshardy: because it does not set them, there is no support for them12:32
soc_offshardy: let me check why it got ignored12:33
shardysoc_off: Ok, so you don't have https://review.openstack.org/#/q/Id41001d74ce1008dbb5a98b962d5c53dbf39c90312:33
hewbroccasoc_off: you don't look off to me12:34
openstackgerritMerged openstack/tripleo-common: Default to Ironic API v1.15  https://review.openstack.org/36431912:34
mariospradk: my osp9 undercloud update was ok today (will share process sec) ... with the hiera, i think the new hiera is only available after we do a puppet run ... so the things set/passed from the https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceilometer-api.yaml for example, wouldn't be there if you originally deployed mitaka, when that file didn't exist (unless we were already passing those items in the templates but in other files previously)12:34
*** akshai has quit IRC12:35
mariospradk: i assume you are still on a downstream osp9 env? in which case this is what i did today for example http://paste.openstack.org/show/584187/12:35
pradkmarios, i use upstream repos for mitaka not rhos12:36
pradkthis worked for me couple of weeks ago12:36
pradksince yesterday undercloud upgrade has been hanging12:36
mariospradk: ah ok i thought you were using downstream setup then matbu may have idea about current undercloud upgrade issue upstream?12:36
*** amoralej|lunch is now known as amoralej12:37
soc_offmarios: we should provide all dependencies of puppet before running puppet in undercloud. update puppet, facter, hiera and puppet modules before running it12:38
soc_offmarios: something like https://review.openstack.org/#/c/328264/12:38
*** jeckersb is now known as jeckersb_gone12:38
mariospradk: i don't think it came up in yesterday so may be a new issue12:39
pradkok, i'll do another fresh one today and report back12:40
soc_offshardy: https://review.openstack.org/#/c/381588/1/heat/engine/hot/functions.py didn't work but manually changing yaql options in hiera seem to have worked12:41
*** pgadiya has quit IRC12:41
soc_off*hiera=heat.cnf12:41
soc_offshardy: so it's downstream too old puppet issue12:41
mariossoc_off: i know you've brought this up before and honestly i don't think it is something that we get for n..m ... do we have a spec/blueprint for it? we have the https://github.com/openstack/tripleo-specs/blob/master/specs/newton/undercloud-upgrade.rst for now... so that may be a good start12:42
*** jpena|lunch is now known as jpena12:42
*** ccamacho|lunch is now known as ccamacho12:42
mariossoc_off: i mean write something, maybe just a bug, doesn't have to be a spec, that says why you think we should do that12:42
hewbroccaIf you want to write a bug you need gfidente , right?12:43
*** yolanda has quit IRC12:44
marioshewbrocca: only if you need it written and well solved all at once12:44
hewbroccamarios: seems like gfidente could save himself some effort by not writing the bugs in the first place12:46
*** yolanda has joined #tripleo12:46
mariosspeaking of italians, gfidente perhaps you know/can confirm what i wrote at https://review.openstack.org/#/c/360004/16 - first comment from today from me - the hiera keys we set on newton templates won't be available on the nodes until the post-deploy puppet config is first applied (ie converge) right?12:47
soc_offshardy: so yaql options in heat fix the issue12:49
shardysoc_off: ack, thanks for confirming12:49
soc_offmarios: you mean launchpad bug yes?12:50
*** jcoufal has joined #tripleo12:50
mcorneajaosorior: hey man,if you have few minutes could you please have a look at https://bugs.launchpad.net/tripleo/+bug/1629098? it's related to haproxy servers names, looks like they are always set to controllers even though they're using the correct ips12:53
openstackLaunchpad bug 1629098 in tripleo "The server names in haproxy configs do not match the ip addresses" [Undecided,New]12:53
EmilienMpradk: https://review.openstack.org/#/c/371950/12:54
EmilienMswift is broken in your patch12:54
EmilienMOct  4 03:23:23 localhost swift-proxy-server: ImportError: No module named ceilometermiddleware.swift12:54
EmilienMlooks like packaging?12:55
jaosoriormcornea: that is the case, and it is an issue12:55
jaosoriormcornea: we need another fix12:55
jaosoriormcornea: first we need to finish this one https://review.openstack.org/#/c/378764/812:55
jaosoriormcornea: then we need to fix manifests/haproxy.pp since that's where it's hardcoded (it's been always like this unfortunately :/)12:56
*** shardy is now known as shardy_mtg12:56
*** limao has joined #tripleo12:56
pradkEmilienM, is this the ha job?12:56
pradkEmilienM, hmm packaging should be there12:57
mcorneajaosorior: gotcha, is https://review.openstack.org/#/c/378764/8 ready for testing? I can try adding it to my environment12:57
*** limao_ has joined #tripleo12:58
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in generic.role.j2.yaml  https://review.openstack.org/38159312:58
EmilienMpradk: yeah only HA job12:58
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Add Flag to Keep or Remove Sahara during controller upgrade  https://review.openstack.org/37551712:59
mariosjistr: tosky added a flag for now ^^^^12:59
mariosjistr: tosky i think we need a launchpad bug for discussing what we will do here12:59
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158712:59
jaosoriormcornea: it's not. It's failing on net-iso environments, which is what I'm trying to figure out at the moment12:59
openstackgerritMerged openstack/tripleo-quickstart: Customize undercloud and overcloud with virt-customize  https://review.openstack.org/37011412:59
pradkEmilienM, packages are in the repo https://trunk.rdoproject.org/centos7/current/python-ceilometermiddleware-0.5.0-0.20161003160220.7f502e2.el7.centos.noarch.rpm12:59
*** sudipto_ has quit IRC13:00
*** sudipto has quit IRC13:00
toskyoh13:00
pradkEmilienM, perhaps puppet is not pulling the pkg down13:00
pradki'll investigate13:00
mariosjistr: tosky basically we can 'detect' if sahara is running during controller_pacemaker_1.sh because all the things are still running then. but by pacemaker_3.sh we've moved to next gen HA so we can't tell then if sahara was running earlier13:00
*** limao has quit IRC13:01
EmilienMpradk: I'm looking too13:01
mariosjistr: tosky so another thought was writing to file/signal between those two steps but that is ugly :/13:01
toskyI would kind of expect that you can read the old status before starting to upgrade it, and then you can't properly know the old status because it was upgraded :)13:01
*** akshai has joined #tripleo13:01
mariostosky: jistr: we'd ideally use 'enabled_services' from the newton templates, but default is for sahara to be off now so it wouldn't tell us if it was previously running (no enabled_services for mitaka)13:02
*** Goneri has joined #tripleo13:02
jistrmarios, tosky: but reading the old status might not help, right? It's not so much whether sahara was running or not, as we know it *was* running in Mitaka. It's more about whether the user wants to keep it or not...13:02
*** akshai has quit IRC13:02
openstackgerritOpenStack Proposal Bot proposed openstack/python-tripleoclient: Updated from global requirements  https://review.openstack.org/37512513:02
jistri.e. the gap between the Mitaka and Newton defaults13:02
*** yolanda has quit IRC13:02
mariosjistr: tosky so tosky was trying to answer the 'can we just detect' which was what was asked of us to do13:03
toskyright, we know that it was on13:03
*** paramite has quit IRC13:03
toskymarios: but as jistr pointed out, we know that from the beginning13:03
toskyif you stripped out sahara, you did some magic custom post-config manipulation13:04
mariostosky: jistr yeah was thinking some more... like right, so it would be not trivial to just remove it then ^^13:04
mariostosky: jistr so that assumption means we don't need to try and detect and doing the env file/flag like the current review is ok13:04
*** akshai has joined #tripleo13:04
slagleEmilienM: have a look at https://review.openstack.org/#/c/381790/13:05
mariostosky: jistr unless they manually stopped sahara for some reason but then they'd know to explicitly disable it from the docs i guess13:05
jistrmarios, tosky: yea i think so. What we're asking is inherently undetectable AFAICT, it's a user decision.13:05
jistryea it might be detectable in cases when the users did something special with their deployment, as tosky wrote earlier, but that probably can't be considered the general case13:06
EmilienMslagle: yes Sir13:06
*** sudipto has joined #tripleo13:06
toskyand if they did it once, they know they should explicitly remove it with the new switch13:06
toskyyep: if they know about it and they don't want anymore, they can force its removal now; if they don't care, default -> keep13:06
*** yolanda has joined #tripleo13:07
jistr+1 on default = keep13:07
*** [1]cdearborn has joined #tripleo13:07
EmilienMslagle: commented13:07
jistr+2'd the patch13:08
mariosjistr: advantage of doing this https://review.openstack.org/#/c/375517/2/environments/major-upgrade-pacemaker.yaml is we should be able to reuse for converge (though may not help at that point, i mean we need to use the existing  'deploy sahara' environment file13:08
slagleEmilienM: doh, thx13:08
*** sudipto_ has joined #tripleo13:09
mariosjistr: by 'this' i mean having a KeepSaharaServices in parameter_defaults from the controller step... will be persisted13:09
*** panda is now known as panda|afk13:10
jistrmarios: yea we'll need some amount of docs around this whole issue simply b/c the Mitaka vs. Newton defaults differ, so one either needs to set the param on upgrade to false, or add the sahara env file on converge and beyond13:10
jistrbut i kinda like the explicitness here, e.g. in case user forgets about the whole issue, their sahara won't disappear just by itself, and they can still fix the issue later relatively easily (either stop it manually, or start passing the env file to start managing sahara configs properly)13:11
jistr(depending on which way they want to go wrt sahara-ful vs. sahara-less deployment :) )13:12
*** jeckersb_gone is now known as jeckersb13:12
mariosmy what a saharaful deployment !13:12
EmilienMslagle: modest +113:13
jistr:)13:13
gfidenteEmilienM, so I realized ensure_packages is not safe if you call it multiple times with same list of packages13:15
gfidenteare there 'known' workarounds for that?13:15
gfidenteI read about creating a class which uses ensure_packages and include the class instead?13:15
*** akshai has quit IRC13:16
openstackgerritMerged openstack/tripleo-docs: Update docs for undercloud upgrade  https://review.openstack.org/36813813:16
*** lblanchard has joined #tripleo13:16
toskyjistr, marios: are you saying that the default configuration will still require some work to reach a proper configuration?13:16
mariostosky: jistr on converge we will need to explicitly enable the sahara services yes13:17
toskyI kind of understand the need for being explicit (as python tries to enforce), but I'd still argue that the default configuration would lead to a working setup13:17
EmilienMgfidente: why is it not safe?13:17
toskywhich matches the environment available before13:17
mariostosky: this review is about whether we keep/remove the sahara services during the controller upgrade13:17
EmilienMgfidente: like resource duplications?13:17
gfidenteresource duplications yes13:17
*** limao_ has quit IRC13:18
gfidenteit's safe if it gets multiple times same package from a single call13:18
gfidentebut not from multiple calls13:18
*** limao has joined #tripleo13:18
jaosoriorshardy_mtg: so... I tried running your patch and it actually worked on my local deployment... so that means my theory was wrong. Now I don't know why it's failing.13:19
jistrtosky: it's sorta inevitable due to Newton being saharaless by default. So if we want to keep Sahara, we will need to start passing an env file on converge and beyond. And if we want to remove Sahara, we'd need to pass KeepSaharaOnUpgrade: false during the upgrade.13:19
openstackgerritMerged openstack/tripleo-quickstart: Update get-overcloud-nodes script  https://review.openstack.org/37419013:19
toskyjistr: even if it is sahara-less by default, we are talking about an upgrade here, and an upgrade is from mitaka, no other possibilities, so it should be possible to have the env file for convergence by default13:20
jistrtosky, marios: the only way i can see this could be changed is to default to `KeepSaharaOnUpgrade: false` so that the "remove sahara" use case is without work (== the upgrade by default converges to the Newton defaults), which shifts the work to the "keep Sahara" case13:20
*** akshai has joined #tripleo13:20
toskywhich is against the suggested direction of "whatever was available before"13:20
*** tiswanso has joined #tripleo13:21
jistrwell i don't think it's possible (within reasonable implementation limits) to include env files automatically based on what we're upgrading from... i mean even if we baked Sahara env into the converge by default (which would make "remove Sahara" case a bit more complicated again perhaps), the user would still need to start passing the sahara env file *after* the upgrade, with any `overcloud deploy` commands13:23
mariosjistr: the requirement we currently have is 'whatever was there previously' ... going strictly on what you can deploy with the mitaka templates, then we can assume it was default on,13:23
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158713:23
mariosjistr: we could try work out how to detect that if we really need to - detect at pacemaker_1 and signal to pacemaker_313:24
toskythere is only one path from where we're upgrading from, that's my point13:24
openstackgerritHironori Shiina proposed openstack/diskimage-builder: Fix a command in Developer Documentation  https://review.openstack.org/38183313:24
jistrmarios, tosky: regardless of what we do for upgrade, we'll still need to start passing the env file after the upgrade, unless we go and change Newton defaults to deploy Sahara by default13:25
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: [WIP] Save the result of direct action calls in Mistral  https://review.openstack.org/38173913:25
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158713:25
jistrso we can't make Newton behave the same way as Mitaka did without passing the additional env file13:26
toskyjistr: uhm, and really no possibility of working around this?13:26
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make MySQL client default host be same as bind address  https://review.openstack.org/38183413:26
toskyI guess no13:26
toskyunfortunate13:26
mariosjistr: 'any stack update operation henceforth' right13:26
jistryea...13:26
mariosjistr: we need a 'resource_registry_defaults' :)13:27
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Select per-network hostnames for service_node_names  https://review.openstack.org/37876413:27
jistrtosky: not so unfortunate though, as this means that all Newton deployments behave the same, regardless what they were upgraded from, which IMO should be prioritized over "i don't want to change the set of env files i pass in"13:27
jistrmarios: ^13:27
toskyjistr: all? So what happens to custom changes to the enabled services when you will migrate from N to O or later?13:28
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158713:28
mariosjistr: what do you mean? fresh newton deployment won't have sahara if they don't enable it explicitly during their deploy13:28
jistri mean "Does this Newton env have Sahara?" should be a question answerable by looking at what we pass into the deployment command, not by going over the history of that deployment :)13:28
jistrtosky: as long as users keep passing their custom changes, they should persist13:29
mariosjistr: ah so if they have newton with sahara, regardless of from upgrade or from fresh, they would have to include the -e sahara henceforth forever and ever till evermore anyway13:29
jistrmarios: yea re "fresh newton deployment won't have sahara if they don't enable it explicitly during their deploy" -- that was actually my point. Upgraded deployments should behave the same.13:29
jistrmarios: exactly13:30
jistrmarios, tosky: if we don't stick to such approach, we'll go crazy just after a few releases13:30
jistressentially our compute vs. novacompute problem, scaled up :)13:30
toskyso back to the initial point, with the difference that, if you don't specify -e sahara, after the upgrade process you should be able to recover it, while if you want to kill it you need to explicitly pass another env13:31
toskyor configuration, or whatever it is relevant in this case13:31
mariostosky: yeah another env if you like or just set the KeepSaharaOnUpgrade from the environment/major-upgrade-pacemaker.yaml for the controller upgrade13:32
openstackgerritMerged openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/37875013:32
dprincepradk: Hi, I replied on https://bugs.launchpad.net/tripleo/+bug/162993413:32
openstackLaunchpad bug 1629934 in tripleo "firewall rules defined in service templates missing on overcloud" [Critical,Incomplete]13:32
dprincepradk: could you have another look at the report there?13:32
jistrtosky: yea. Basically for both options you need to do an explicit action. We could make the "remove Sahara" case non-explicit too, but it would make it slightly more dangerous perhaps, and the explicitness wouldn't be totally gone, it would just shift to the "keep Sahara" case.13:32
jistri.e. if we default to KeepSahara: false, we'd need to pass KeepSahara: true when we want to keep it13:33
toskyjistr: explicit, but keeping the service (minus the final convergence) should be still the default, so that you don't lose your data if you forgot -e sahara and you redeploy with it13:33
*** jaosorior has quit IRC13:34
toskywhich means, if I get it correctly: please set KeepSahara true13:34
toskyby default13:34
dprincepradk: oh, wait. I think it isn't working still.13:34
jistrtosky: yea +1, safer13:34
*** jaosorior has joined #tripleo13:35
mariosjistr: tosky filing a launchpad bug to try capture some of this discussion and so we can track the patches (there may yet be something we need on converge too)13:35
openstackgerritMerged openstack/tripleo-heat-templates: Use netapp_host_type instead of netapp_eseries_host_type  https://review.openstack.org/38165613:36
*** fultonj has quit IRC13:37
jistrmarios, tosky: generally, anytime we want to change the default set of deployed services, we'll have this problem (especially painful if the new default means removing some service i think)13:37
*** fultonj has joined #tripleo13:38
*** limao has quit IRC13:38
*** limao has joined #tripleo13:39
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Deployment states optimization  https://review.openstack.org/38184513:39
pradkdprince, if you look in one of our ci jobs.. the rules seems to be missing there too13:40
*** rodrigods has quit IRC13:40
*** rodrigods has joined #tripleo13:40
dprincepradk: okay, I missed the last line of the bug ticket. You are saying that only the redis, mongo, ports are missing?13:40
EmilienMdprince: why don't we enable firewall by default? I thought bnemec enable it13:40
pradkdprince, from what i could tell yea13:41
*** skramaja has quit IRC13:41
dprincepradk: I don't see those either. But I did see keystone13:41
EmilienMI think we should enable environments/manage-firewall.yaml  by default in our CI13:42
EmilienMwdyt?13:42
pradkdprince, it seems like whatever is defined in puppet-tripleo in haproxy config make it fine13:42
EmilienMor set ManageFirewall to True by default13:43
dprinceEmilienM: step at a time. Lets fix what pradk is seeing first13:43
mariostosky: jistr not sure how well the irc chat works there but fwiw https://bugs.launchpad.net/tripleo/+bug/163024713:43
openstackLaunchpad bug 1630247 in tripleo "sahara services during mitaka to newton upgrade" [Undecided,Triaged] - Assigned to Marios Andreou (marios-b)13:43
soc_offmarios: https://bugs.launchpad.net/tripleo/+bug/163024913:43
openstackLaunchpad bug 1630249 in tripleo " [instack-undercloud] updated puppet and its dependancies before running it" [Undecided,New]13:43
EmilienMI remember bnemec switched ManageFirewall to be True by default13:43
EmilienMand I see it false now13:43
EmilienMsomeone changed it13:43
dprinceEmilienM: it may have gotten lost in the composable services somewhere13:45
jistrmarios: works as a reference in case someone (mainly other than us) needs/wants to re-live that discussion :)) thanks13:45
EmilienMyeah13:45
dprinceEmilienM: I will re-enable it, sure. But first I'd like to fix what pradk is seeing too13:46
EmilienMfor history: bnemec enabled firewall by default in June: https://review.openstack.org/#/c/321833/13:46
EmilienMdprince: yeah, it makes sense!13:46
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Add Flag to Keep or Remove Sahara during controller upgrade  https://review.openstack.org/37551713:47
mariosjistr: plus we will need to get this to stable/newton so will need a bug# for it13:47
jistrright13:47
jaosoriorshardy_mtg: nevermind, it didn't fail cause the update didn't restart galera. so it still has the old config13:47
dprincepradk: so I guess what I'd like to know from you is did you manually set ManageFirewall to true? Or use the environments/manage-firewall.yaml?13:48
toskyjistr, marios: not only for other people, also for us (at least for me and my memory)13:49
*** links has quit IRC13:49
pradkdprince, whatever is the default, i didn't change anything explicitly .. i just use the same workflow our ci jobs use13:51
dprincepradk: okay, so that is part of the problem. But I did enable it. I saw some firewall rules, not from haproxy.pp (see the rules.txt I linked in the bug)13:52
dprincepradk: but I'm still not seeing mongo rules for whatever reason. SO I'm looking into that now13:52
dprincepradk: no, I see them13:52
pradkdprince, ok did redis show up ?13:52
mariossoc_off: thanks13:52
*** jprovazn has quit IRC13:53
dprincepradk: no redis rules though13:53
dprincepradk: so just redis is missing. but I'm seeing everything else13:53
* dprince checks hiera13:54
pradkdprince, so in our ci we don't enable ManageFirewall either i presume?13:54
openstackgerritJiri Tomasek proposed openstack/tripleo-ui: Add image names to Nodes registration workflow  https://review.openstack.org/38185513:54
dprincepradk: exactly, I will push a patch to re-enable that13:54
pradkunderstood13:54
dprincepradk: but I did use it... and I'm still missing redis?13:54
jtomasekjpich: ^^^^13:56
jpichjtomasek: Awesome, thank you!13:56
*** panda|afk is now known as panda13:57
jtomasekjpich: I thought gerrit is supposed to update the launchpad bug when a patch is sent for it but for some reason it is not happening13:57
EmilienMmeeting in 2 min13:58
jtomasekjpich: oh, now it's there...13:58
jpichjtomasek: It did :)13:58
jpichjtomasek: Will test and review now13:58
jtomasekjpich: thanks13:58
toskymarios: bandini told me about an upgrade statement for the sahara database which is currently commented, did you touch it so far and/or is it relevant for the patch above?13:59
EmilienMpradk: we did13:59
dprincepradk: got it, it is a bug in the pacemaker template for redis13:59
EmilienMbut someone disabled it, i'll find which commit to understand why13:59
*** coolsvap_ is now known as coolsvap13:59
*** limao_ has joined #tripleo14:00
mariostosky: ah yes thanks for reminder bandini lemme get you a pointer tosky sec14:00
soc_offmarios: one thing about it, I'm not sure where to actually put the update, my patch puts it into element but maybe we could consider tripleo-client (dunno) or even just making it package dependency of instack-undercloud so if you update that you'll get updated puppet and opm14:01
soc_offmarios: package dep seems to be cleaner to me but I don't have technical argument for that14:01
mariostosky: https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/major_upgrade_controller_pacemaker_2.sh#L6814:01
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Re-enable ManageFirewall by default.  https://review.openstack.org/38186414:01
-openstackstatus- NOTICE: The Gerrit service on review.openstack.org is being restarted to address performance degradation and should return momentarily14:02
toskymarios: that statement is enabled in the grenade jobs which upgrade sahara, so I would say that it should be enabled14:02
*** limao has quit IRC14:03
mariostosky: ack will include it with https://review.openstack.org/#/c/375517/ i mean uncomment it there14:03
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements  https://review.openstack.org/37599714:03
toskymarios: thanks!14:04
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in generic.role.j2.yaml  https://review.openstack.org/38159314:04
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Add Flag to Keep or Remove Sahara during controller upgrade  https://review.openstack.org/37551714:05
*** shardy_mtg is now known as shardy14:06
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Include redis/mongo hiera when using pacemaker  https://review.openstack.org/38186914:06
dprincepradk, EmilienM: ^^14:06
EmilienMdprince: I'll look after meeting14:07
dprinceEmilienM: also, this https://review.openstack.org/#/c/381864/14:07
*** morazi has quit IRC14:07
openstackgerritMerged openstack/tripleo-quickstart: Revert "Return to using ping test in minimal jobs"  https://review.openstack.org/37954514:07
pradkdprince, nice catch14:07
dprincepradk: well, you caught the symptom here ;)14:08
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in generic.role.j2.yaml  https://review.openstack.org/38159314:08
pradk:)14:08
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159314:10
dprincebnemec: hi, so per https://review.openstack.org/#/c/381864/ should we actually just remove environments/manage-firewall.yaml?14:12
*** gfidente has quit IRC14:13
dprincebnemec: it doesn't hurt but setting this default makes that file moot14:13
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159314:13
*** masco has quit IRC14:14
openstackgerritDan Prince proposed openstack/puppet-tripleo: Cleanup the firewall logic.  https://review.openstack.org/38187514:14
bnemecdprince: Yeah, that probably makes sense.14:15
dprincebnemec: okay, so update this patch? Or do it in a separate one?14:16
dprincebnemec: I'm inclined to add it to the patch...14:16
*** jeckersb is now known as jeckersb_gone14:18
bnemecdprince: I'm fine either way.14:19
jaosoriorbandini: in the undercloud, where is mysql listening on? is it only on localhost?14:19
*** tosky has quit IRC14:20
dprincebnemec: patch updated.14:20
*** jeckersb_gone is now known as jeckersb14:21
*** tosky has joined #tripleo14:21
*** mcornea has quit IRC14:21
bandinijaosorior: don't think so. the only reason we are somewhat protected is because of iptables on the undercloud not because of properly listening to interfaces14:22
*** mcornea has joined #tripleo14:23
jaosoriorok14:23
bandinijaosorior: lemme check though14:23
bandinijaosorior: no it actually binds to the br-ctlplane (via /etc/my.cnf.d/galera.cnf). iptables filters that off though14:24
openstackgerritJuan Antonio Osorio Robles proposed openstack/instack-undercloud: Set MySQL bind-address via parameter  https://review.openstack.org/38188714:25
openstackgerritJuan Antonio Osorio Robles proposed openstack/puppet-tripleo: Make MySQL client default host be same as bind address  https://review.openstack.org/38183414:25
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Select per-network hostnames for service_node_names  https://review.openstack.org/37876414:26
*** limao_ has quit IRC14:30
*** apetrich has quit IRC14:34
*** apetrich has joined #tripleo14:34
*** dmacpher has quit IRC14:37
*** dmacpher has joined #tripleo14:38
*** rook_ is now known as rook14:39
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158714:39
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159314:40
*** numans has quit IRC14:42
hrybackiayoung: I've got this deploying locally and passing tests (not that it had any reason to break them) mind reviewing my updates? https://review.openstack.org/#/c/315749/14:44
ayoung++14:44
EmilienMdprince: ok looking now.14:44
ayounghrybacki, the "false" values in there are for debugging, right?14:45
ayoungchanged_when: false14:45
EmilienMdprince: thanks :)14:45
*** ayoung has quit IRC14:46
*** ayoung has joined #tripleo14:46
hrybackiayoung: not sure -- I didn't add those. sounds like clever stuff from adaraz14:46
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Downloads templates from swift before processing update  https://review.openstack.org/38189914:48
thrashd0ugal: ^^^^14:48
trownayoung: that is an ansible thing so that we don't need to do "ignore errors" and such... makes ansible-lint happy14:48
thrashd0ugal: untested. But theoretically... :)14:48
ayoungtrown, ah, ok14:48
thrashd0ugal: I'm testing now.14:48
d0ugalrbrady: FYI ^14:48
hrybackitrown: interesting, are we pushing that as standard across oooq?14:49
d0ugalthrash: cool, I'll test it too14:49
rbradyd0ugal, thrash: taking a look14:49
trownhrybacki: we have gates for ansible-lint, so ya :)14:49
hrybackitrown: are they catching extra roles as well? /me wonders14:50
rhalliseythrash, there's already a patch for that: https://review.openstack.org/#/c/379547/14:50
trownhrybacki: doubt it, they are openstack-infra gates14:50
rhalliseyjust an fyi14:50
hrybackitrown: ack, thanks for the intel :)14:50
thrashrhallisey: that doesn't cover the update scenario14:51
rhalliseythrash, gotcha14:52
thrashrhallisey: but the code is pretty much the same. :)14:53
rhalliseywhatever works :)14:53
EmilienMdprince: I'm adding tripleo/rc3 gerrit topic to your firewall fixes14:53
d0ugalrhallisey: lol, at this point that is all that matters.14:53
EmilienMdprince: they sound critical to have to me, right?14:53
EmilienMdprince: or no?14:53
rhalliseyd0ugal, :D14:53
EmilienMmy hope is we don't add a new regression in upgrades but I don't think so14:53
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: j2 template per-role ServiceNetMapDefaults  https://review.openstack.org/38190214:55
*** mbozhenko has quit IRC14:56
openstackgerritSteven Hardy proposed openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/38190314:56
d0ugalthrash: it seems to just be hanging14:59
EmilienMdprince: so it sounds like shardy removed the global parameter which was True by default https://review.openstack.org/#/c/347050/14:59
EmilienMdprince: but the parameter was False in puppet/controller.yaml14:59
EmilienMwhich is the original reason why we have it to False since this time15:00
thrashd0ugal: did you pass -i?15:00
EmilienMdprince: so we have been disabling firewall on controller for 9 weeks iiuc15:00
openstackgerritTomas Sedovic proposed openstack/tripleo-validations: Validate the instackenv format  https://review.openstack.org/35395615:00
*** rajinir has joined #tripleo15:01
shardyouch :(15:01
*** r-mibu has quit IRC15:04
*** r-mibu has joined #tripleo15:04
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159315:04
*** lmiccini has quit IRC15:05
dprinceEmilienM: yep, plus there was a different regression affecting mongo and redis15:05
EmilienMnice catch 2 days before RC3 :P15:05
EmilienMjust don't tell anyone please15:05
dprinceEmilienM: that was what confused me about the ticket15:05
dprinceEmilienM: FWIW, I missed we enabled them by default (I was using that environment file)15:06
dprinceEmilienM: so I saw these working... again confusing15:06
EmilienMyeah, the puppet-tripleo managed rules work by default15:07
EmilienMis it expected?15:07
dprinceEmilienM: if we had removed that environment file when we enabled it I might have caught it sooner15:07
*** electrofelix has quit IRC15:07
dprinceEmilienM: but probably not the redis and mongo issue. That was different15:07
*** electrofelix has joined #tripleo15:07
jaosoriordprince: what was the issue with redis and mongo?15:08
dprincejaosorior: when using pacemaker their firewall rules wouldn't get set, because the pacemaker templates extended the wrong base serices15:08
dprinceserices15:08
dprinceservices15:08
* dprince apparently my left index finger isn't typing 'v's today15:09
dprincejaosorior: https://review.openstack.org/#/c/381869/15:10
dprincejaosorior: just look at the patch. It is pretty simple15:10
*** rcernin has quit IRC15:10
jaosoriordprince: it is15:11
*** tremble has quit IRC15:12
*** yamahata has joined #tripleo15:15
*** lucasagomes is now known as lucas-hungry15:16
*** jlinkes has quit IRC15:17
*** paramite has joined #tripleo15:17
jaosoriorshardy: I need some help with the hostnames CR15:18
d0ugalthrash: no, I didn't lol15:18
jaosoriorshardy: I'm figuring out what's up with the ha failure, but the nonha gate fails because of ceph for some reason15:18
d0ugalthrash: trying with that now.15:18
jaosoriorshardy: http://logs.openstack.org/64/378764/8/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/afd9ef2/logs/postci.txt.gz15:19
thrashd0ugal: it still should have returned though...15:19
thrashd0ugal: make sure you run with --debu15:19
thrash*debug15:19
d0ugalthrash: how long should it take?15:21
d0ugalthrash: What command are you running exactly?15:21
*** panda is now known as panda|bbl15:21
thrashd0ugal: I'm just getting to it now. I suppose passing the environments is not completely necessary with a plan??15:22
thrashd0ugal: or does it need to maintain that?15:22
thrashd0ugal: I guess we do since I put in the create or update plan part...15:22
*** akshai has quit IRC15:24
*** saneax is now known as saneax-_-|AFK15:24
thrashd0ugal: openstack overcloud update stack --templates <path> -i -e <environments> overcloud15:27
*** aufi has quit IRC15:28
d0ugalthrash: k, so, since I just deployed initially with "openstack overcloud deploy --templates" I am trying "openstack overcloud update stack overcloud --templates -i"15:28
d0ugalthrash: and I get this error...15:28
d0ugalthrash: http://paste.openstack.org/show/584240/15:28
d0ugalthrash: oh, maybe I need to install a newer tripleo-common?15:29
thrashd0ugal: would you?15:29
thrashd0ugal: maybe...15:29
thrashd0ugal: mine seems to be working...15:29
d0ugalthrash: I remember some mergepy references were removed recently15:29
thrashd0ugal: that could do it.15:29
shardyceph-osd-activate-/srv/data]/returns: change from notrun to 0 failed: Command exceeded timeout15:30
shardyjaosorior: Hmm, not sure but I assume something got broken in the ceph service hieradata which does consume the hostname, looking15:30
thrashd0ugal: nope. I hit that too...15:30
thrashd0ugal: maybe tripleo-common does need to be fixed too.15:30
thrashd0ugal: let me look at the UpdateManager code.15:31
thrashd0ugal: constants.TEMPLATE_NAME15:32
thrashshould I change that reference to OVERCLOUD_YAML_NAME and get rid of the TEMPLATE_NAME?15:32
*** akshai has joined #tripleo15:32
*** mcornea has quit IRC15:32
d0ugalthrash: https://review.openstack.org/#/c/375540/15:32
*** leanderthal is now known as leanderthal|afk15:33
d0ugalthrash: I thought it had landed already15:33
thrashd0ugal: ahhh15:33
thrashyep. that'd do it.15:33
shardyjaosorior: I assume it means ceph_mon_node_names is now pointing at a different network because CephMonNetwork is mapped to storage, not ctlplane15:33
* shardy looks around for gfidente15:33
thrashd0ugal: I'll add a depends-on15:33
jaosoriorshardy: and the mysql failure makes no sense either... it actually seems to be working on my deployment15:34
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Downloads templates from swift before processing update  https://review.openstack.org/38189915:34
shardyjaosorior: and you're deploying with net-iso enabled?15:34
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Select per-network hostnames for service_node_names  https://review.openstack.org/37876415:34
jaosoriorshardy: yes15:34
jaosoriorshardy: maybe we're missing some iptables setting?15:35
*** flepied1 has joined #tripleo15:35
jaosoriorshardy: do we set anything for that? or in CI?15:35
*** flepied has quit IRC15:37
shardyjaosorior: well yes, but I'm not sure how your local tests work if the problem is with the firewall config15:38
shardyEmilienM, dprince: the patch you were referring to earlier, where the firewall was disabled due to my patch, did the fix for that just land?15:38
shardyand/or can you point to how per-network firewall rules are set up?15:38
dprinceshardy: https://review.openstack.org/#/c/381864/15:39
shardywe've been sending some traffic over ctlplane and it's now breaking when mapped to the ServiceNetMap networks15:39
EmilienMno we haven't merged anything wrt this topic15:39
dprinceshardy: there is also a related bugfix here that affects only pacemaker https://review.openstack.org/#/c/381869/15:40
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceph-osd.yaml#L4515:40
shardydprince, EmilienM: Ok thanks, so how do we define which bind network to apply those rules to?15:40
shardye.g if ceph needs tcp/6800, do we use the ServiceNetMap network to define the firewall rule, or apply it for all networks?15:41
EmilienMit's using 0.0.0.0 by default AFAIK https://github.com/openstack/puppet-tripleo/blob/master/manifests/firewall/rule.pp#L7315:41
EmilienMdestination is "undef" by default oo15:41
EmilienMtoo15:41
EmilienMwe might want to make some dynamic binding here, re-using tht15:42
*** masco has joined #tripleo15:42
shardyOk, so yeah we can probably narrow the rules a little, but it's most likely not the reason the patch is failing15:42
shardythanks15:42
jaosoriorcrap15:42
jaosoriorso if that's not the reason then I'm out of ideas :/15:42
shardyjaosorior: can you reproduce the ceph failure locally?15:43
shardyI'd probably start with that, and run the failing command manually while tcpdumping to see where it gets stuck15:43
d0ugalthrash: now I get the same error but with overcloud.yaml :-D15:43
jaosoriorjaosorior: I haven't tried, I'm running another overcloud deploy trying out things :/15:43
EmilienMcan someone approve this backport please? https://review.openstack.org/#/c/381375/15:44
EmilienMsame for https://review.openstack.org/#/c/381276/15:44
*** mcornea has joined #tripleo15:46
d0ugalthrash: I think I have got it working15:50
d0ugalthrash: I think I have got it working15:50
d0ugaloops15:50
openstackgerritmathieu bultel proposed openstack/python-tripleoclient: Download templates from swift before processing with heatclient  https://review.openstack.org/37954715:50
*** akshai has quit IRC15:51
shardyCan I get a final review/+A for https://review.openstack.org/#/c/378737/ please?15:52
EmilienMshardy: looking15:52
EmilienMshardy: we don't wait for ovb?15:53
d0ugalthrash: Yeah, I am getting an UPDATE_IN_PROGRESS now... so looking promising.15:53
shardyEmilienM: Oh, yeah sorry I missed that hadn't voted yet, sorry for the noise15:53
EmilienMshardy: i'll +A it15:53
EmilienMonce it's passed OVB15:53
shardyack, thanks15:53
*** yamahata has quit IRC15:55
*** masco has quit IRC15:56
*** tiswanso has quit IRC15:56
dsneddonshardy, When you have a few minutes, can you look at https://bugzilla.redhat.com/show_bug.cgi?id=132774215:58
openstackbugzilla.redhat.com bug 1327742 in rhel-osp-director "[RFE] Director Should Allow For Multiple Compute Network Configurations" [Medium,New] - Assigned to athomas15:58
shardydsneddon: Will do, sounds like something we should solve via multiple compute roles using the new custom roles interfaces15:58
dsneddonshardy, I don't think this requires immediate action to document, but at least I'd like to communicate back to the customer when this will be ready to test.15:59
*** tiswanso has joined #tripleo15:59
*** tiswanso has quit IRC15:59
shardydsneddon: Ok, will do - tbh we're still working out some kinks in the custom-roles usage, but hopefully the worst of those will be fixed by the final newton deadline15:59
EmilienMdsneddon: could we use a launchpad bug, tracked to ocata-1 maybe?15:59
*** karthiks has quit IRC15:59
dsneddonEmilienM, Good idea, I'll create one.16:00
EmilienMdsneddon: check if it's not already there16:00
shardyYeah, although it may end up a docs-only fix16:00
*** akshai has joined #tripleo16:00
*** tiswanso has joined #tripleo16:00
jaosoriorshardy: hey dude, I gotta go :/16:01
jaosoriorHey guys, if someone can help out figuring out what's wrong with https://review.openstack.org/#/c/378764/ it would be greatly appreciated.16:01
thrashd0ugal: me too! :)16:01
shardyjaosorior: Ok, I'll try to take a look later & we can sync about it tomorrow, thanks for investigating!16:01
thrashd0ugal: I got to a breakpoint16:01
d0ugalthrash: Nice! I'm still waiting...16:02
*** jaosorior has quit IRC16:02
d0ugalthrash: oh, actually, I am on one too! woo16:02
d0ugalthrash: not that easy to spot with the debug output16:02
thrashd0ugal: yah16:02
d0ugalthrash: What do I do now?16:02
thrashd0ugal: just hit enter16:02
thrashthat will clear a breakpoint and continue. Then it will hit another.16:02
*** zoliXXL is now known as zoli|gone16:03
*** morazi has joined #tripleo16:03
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158716:05
*** zoli|gone is now known as zoli_gone-proxy16:06
*** dhill_ has quit IRC16:08
dsneddonEmilienM, I found this bug in Launchpad already, which seems to cover the custom Compute role: https://bugs.launchpad.net/tripleo/+bug/162697616:08
openstackLaunchpad bug 1626976 in tripleo "Custom role requires manual environment/files" [High,In progress] - Assigned to Carlos Camacho (ccamacho)16:08
openstackgerritJohn Trowbridge proposed openstack/tripleo-quickstart: Add centosci release configs for cloudsig-testing repos  https://review.openstack.org/38195716:08
*** radeks has quit IRC16:10
dsneddonEmilienM, Although that bug is slightly different. I will create a new bug for the custom networking and link to this existing bug.16:10
*** rbrady is now known as rbrady-mtg16:11
dsneddonEmilienM, OK, found an existing bug that covers this exactly, I'll link them: https://bugs.launchpad.net/tripleo/+bug/162555816:11
openstackLaunchpad bug 1625558 in tripleo "NIC templates examples for usual custom roles" [Medium,Triaged] - Assigned to Steven Hardy (shardy)16:11
*** jcoufal_ has joined #tripleo16:11
*** jcoufal has quit IRC16:12
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Save the result of direct action calls in Mistral  https://review.openstack.org/38173916:13
*** dhill_ has joined #tripleo16:13
EmilienMdsneddon: perfect.16:13
EmilienMcan someone approve https://review.openstack.org/381875 ?16:14
*** dtrainor has joined #tripleo16:14
openstackgerritMerged openstack/tripleo-quickstart: Revert "Temporarily pin pycparser"  https://review.openstack.org/38119016:15
*** karthiks has joined #tripleo16:15
EmilienMdprince: sounds like we have iptables now http://logs.openstack.org/64/381864/2/check/gate-tripleo-ci-centos-7-nonha-multinode/5325ef3/logs/subnode-2/iptables.txt.gz16:15
*** weshay is now known as weshay_lunch16:15
EmilienMpradk: can you review https://review.openstack.org/#/c/381864/ ? i'll approve it once OVB jobs are green16:16
pradklooking16:16
*** links has joined #tripleo16:17
pradkEmilienM, done16:18
*** links has quit IRC16:18
*** lucas-hungry is now known as lucasagomes16:21
EmilienMpradk, dprince: sounds like pingtest is failing http://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/console.html#_2016-10-04_16_08_27_29042616:22
EmilienMto ping the fip16:22
EmilienMmhh16:23
EmilienMhttp://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/console.html#_2016-10-04_16_08_27_24325916:23
EmilienMin fact it sounds like the instance didn't get a private IP16:23
EmilienMlet's see firewall rules16:23
*** tiswanso has quit IRC16:23
*** tiswanso has joined #tripleo16:25
*** openstackgerrit has quit IRC16:26
*** openstackgerrit has joined #tripleo16:27
EmilienMbnemec: do you know where this debug comes from in cirros VMs? http://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/console.html#_2016-10-04_16_08_27_24318916:27
EmilienMwhat is running it?16:27
bnemecEmilienM: Looks like that's part of the cirros image itself.16:28
*** openstackgerrit has quit IRC16:28
EmilienM2016-10-04 16:08:27.241977 | failed 20/20: up 229.08. request failed in the VM shows that VM can't reach neutron-dhcp16:28
EmilienMbnemec: ok16:28
*** tesseract- has quit IRC16:29
*** openstackgerrit has joined #tripleo16:29
EmilienMbnemec: so we don't ssh it, right? we get the info with metadata,16:29
bnemecEmilienM: That's just the console log of the vm, so no we don't ssh to it.16:29
EmilienMok16:29
EmilienMso yeah, something's wrong in iptables16:29
*** morazi has quit IRC16:29
*** openstackgerrit has quit IRC16:30
EmilienMwe have -A INPUT -p udp -m multiport --dports 67 -m comment --comment "115 neutron dhcp input" -m state --state NEW -j ACCEPT16:30
*** openstackgerrit has joined #tripleo16:30
EmilienMand -A OUTPUT -p udp -m multiport --dports 68 -m comment --comment "116 neutron dhcp output" -m state --state NEW -j ACCEPT16:30
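
[Editor's note] The two DHCP rules quoted above are in iptables-save format, so they can be checked mechanically. What follows is a minimal sketch, not anything used in the channel: the helper names `parse_rule` and `accepts` are hypothetical, and it only understands the handful of flags that appear in the quoted rules.

```python
import shlex

# The two rules quoted in the log above, verbatim.
RULES = [
    '-A INPUT -p udp -m multiport --dports 67 -m comment '
    '--comment "115 neutron dhcp input" -m state --state NEW -j ACCEPT',
    '-A OUTPUT -p udp -m multiport --dports 68 -m comment '
    '--comment "116 neutron dhcp output" -m state --state NEW -j ACCEPT',
]

# Flags this sketch understands, mapped to the dict key they fill.
FLAGS = {"-A": "chain", "-p": "proto", "--dports": "dports",
         "--dport": "dports", "--comment": "comment", "-j": "target"}


def parse_rule(line):
    """Parse one iptables-save style rule into a small dict (hypothetical helper)."""
    tokens = shlex.split(line)  # keeps the quoted comment as one token
    rule = {"chain": None, "proto": None, "dports": [], "comment": None, "target": None}
    i = 0
    while i < len(tokens):
        key = FLAGS.get(tokens[i])
        if key is not None and i + 1 < len(tokens):
            value = tokens[i + 1]
            # Ports stay as strings so ranges such as "6800:7300" survive.
            rule[key] = value.split(",") if key == "dports" else value
            i += 2
        else:
            i += 1  # ignore flags this sketch does not model (-m, --state, ...)
    return rule


def accepts(rules, chain, proto, port):
    """True if some parsed rule ACCEPTs the given chain/protocol/port literally."""
    return any(
        r["chain"] == chain and r["proto"] == proto
        and str(port) in r["dports"] and r["target"] == "ACCEPT"
        for r in rules
    )


parsed = [parse_rule(r) for r in RULES]
print(accepts(parsed, "INPUT", "udp", 67))
print(accepts(parsed, "OUTPUT", "udp", 68))
```

The check is purely textual: it does not expand port ranges or evaluate rule order, so it can only confirm that a rule is present, not that traffic actually passes.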
*** abehl has quit IRC16:31
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Move the main template files for defalut services to new syntax generation  https://review.openstack.org/38197516:32
openstackgerritLucas Alvares Gomes proposed openstack/tripleo-quickstart: Force the use of python/pip version 2  https://review.openstack.org/36801316:33
openstackgerritMerged openstack/tripleo-quickstart: Add centosci release configs for cloudsig-testing repos  https://review.openstack.org/38195716:34
*** ohamada has quit IRC16:35
*** karthiks has quit IRC16:35
*** milan has quit IRC16:37
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158716:37
*** rbowen has quit IRC16:37
*** yamahata has joined #tripleo16:38
EmilienMdprince: I'll continue to dig after lunch16:39
*** trown is now known as trown|lunch16:40
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Add generic template for custom roles.  https://review.openstack.org/38158716:41
*** yamahata has quit IRC16:49
openstackgerritDougal Matthews proposed openstack/tripleo-docs: [WIP] Mistral API Documentation  https://review.openstack.org/35868516:50
*** rbowen has joined #tripleo16:50
*** tiswanso has quit IRC16:50
*** karthiks has joined #tripleo16:52
*** weshay_lunch is now known as weshay16:54
*** hewbrocca is now known as hewbrocca-afk16:55
openstackgerritMerged openstack/tripleo-common: Remove references to overcloud-without-mergepy  https://review.openstack.org/37554016:55
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159316:55
*** rbrady-mtg is now known as rbrady16:57
*** yamahata has joined #tripleo16:57
*** fzdarsky is now known as fzdarsky|afk16:57
*** weshay is now known as weshay_mtg16:58
*** ccamacho has quit IRC17:00
*** derekh has quit IRC17:00
openstackgerritMerged openstack/tripleo-specs: Fix a typo in documentation  https://review.openstack.org/38137917:01
openstackgerritMerged openstack/tripleo-heat-templates: Set ceph osd max object name and namespace len on upgrade when on ext4  https://review.openstack.org/38137517:03
openstackgerritMerged openstack/puppet-tripleo: Use FallbackResource instead of Rewrite for UI  https://review.openstack.org/38127617:03
rbowenExcellent17:03
*** mcornea has quit IRC17:03
openstackgerritMerged openstack/tripleo-heat-templates: Include redis/mongo hiera when using pacemaker  https://review.openstack.org/38186917:03
*** penick has joined #tripleo17:05
*** social has joined #tripleo17:07
*** tiswanso has joined #tripleo17:08
*** jpich has quit IRC17:09
*** jprovazn has joined #tripleo17:15
openstackgerritmathieu bultel proposed openstack/python-tripleoclient: Download templates from swift before processing with heatclient  https://review.openstack.org/37954717:16
*** dtantsur is now known as dtantsur|afk17:18
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Test with scheduler hints  https://review.openstack.org/37804017:19
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Add support for testing predictable placement  https://review.openstack.org/37801417:19
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Test hostname map  https://review.openstack.org/37801717:19
openstackgerritBen Nemec proposed openstack-infra/tripleo-ci: Fall back to previous failure list for older releases  https://review.openstack.org/38199817:19
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Do Not Merge - Test Undercloud upgrade mitaka -> newton  https://review.openstack.org/38130917:20
*** tiswanso has quit IRC17:22
*** pkovar has quit IRC17:23
*** dmacpher is now known as dmacpher-afk17:24
*** sudipto_ has quit IRC17:24
*** sudipto has quit IRC17:24
*** tiswanso has joined #tripleo17:26
*** ayoung has quit IRC17:27
*** ayoung has joined #tripleo17:27
*** flepied has joined #tripleo17:34
larsksshardy, jschlueter: I'm still seeing that "Expression consumed too much memory" error using current tripleo-heat-templates master.17:36
larsksAny luck investigating that earlier?17:36
EmilienMlarsks: have you looked in logstash if it's consistent?17:36
larsksEmilienM, I'm seeing this locally, and pretty consistently.17:37
*** flepied1 has quit IRC17:37
EmilienMin logstash?17:37
*** tosky has quit IRC17:37
*** rbowen has quit IRC17:37
larsksEmilienM, I've got no logstash.  This is on the console, when running "overcloud deploy".17:37
EmilienMlet's see in http://logstash.openstack.org/#/dashboard/file/logstash.json17:37
shardylarsks: yes the downstream build is missing a puppet-heat patch17:37
shardyso the increase wasn't doing anything17:37
shardySee ChangeId Id41001d74ce1008dbb5a98b962d5c53dbf39c90317:38
larsksshardy, awesome.  So if I upgrade to puppet-heat master I should be good?17:38
shardyhttps://github.com/openstack/instack-undercloud/commit/55ccd0e1e8c66c9f474112358063bc263720d84f17:38
gregworkany thoughts on cleaning out a failed overcloud deploy where the status of overcloud is DELETE_FAILED17:38
larsksshardy, thanks!17:38
shardylarsks: I believe so, yes17:38
EmilienMshardy: no17:38
EmilienMnot master17:38
EmilienMto stable/newton17:38
shardyyou can confirm by checking heat.conf to see if you have actually changed the heat config settings17:39
EmilienMand I'm preparing a new release of puppet-heat at this time17:39
EmilienMis it the yaql fix?17:39
*** rasca has quit IRC17:39
shardyEmilienM: ack, yeah, I mean you need that patch, which is on master and stable/newton now17:39
shardyEmilienM: yup17:39
EmilienMlarsks: ^17:39
EmilienMdo not update on master17:39
shardysome folks pulled the instack-undercloud fix without the puppet-heat one17:39
shardyso it does nothing17:39
EmilienMI haven't seen it in our CI FYI: build_name: *tripleo* AND message: "Expression consumed too much memory"17:39
shardyEmilienM: Yes, some folks are testing downstream builds17:40
EmilienMshardy: ok, makes sense then.17:40
EmilienMi'm working on releasing a new puppet-heat: https://review.openstack.org/#/c/381901/17:40
EmilienMit will be done by today i think17:40
larsksshardy, my local heat is 7.0.0-0.20160923054727.e4c4c56. Will I need to upgrade that as well?17:40
shardylarsks: No it just needs to be configured to increase the limit17:41
larsksAck.17:41
EmilienMlarsks: nope, just puppet17:41
EmilienMor the config manually, indeed17:41
*** weshay_mtg is now known as weshay17:41
slaglematbu: is https://bugs.launchpad.net/tripleo/+bug/1624448 even still an issue? based on the latest PS of https://review.openstack.org/#/c/323750/, you are not even using multinode jobs to test the upgrades. you're using ovb17:42
openstackLaunchpad bug 1624448 in tripleo "CI Upgrade job hang or failed on undercloud install" [High,In progress] - Assigned to James Slagle (james-slagle)17:42
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Downloads templates from swift before processing update  https://review.openstack.org/38189917:44
thrashjprovazn: do you remember why you originally made 'openstack overcloud update stack' auth_required = False?17:45
*** Goneri has quit IRC17:46
*** rbowen has joined #tripleo17:49
dprinceEmilienM: weird. I just rebased my dev environment and  now I'm hitting this: http://paste.openstack.org/show/584276/17:50
EmilienMdprince: weird indeed. The error in CI is different17:52
EmilienMdprince: pingtest doesn't work, it seems like the instance created by the heat template is not getting an IP address17:52
EmilienMdprince: I'm rebasing my env too now.17:56
*** radeks has joined #tripleo17:56
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Include ceilometer in swift proxy pipeline  https://review.openstack.org/37195017:58
*** rcernin has joined #tripleo17:58
EmilienMpradk: let's try again ^17:58
EmilienMpradk: ok I know why17:59
EmilienMpradk: puppet-tripleo patch needs to be backported17:59
EmilienMour CI still checks out stable/newton I think17:59
*** ccamacho has joined #tripleo18:00
pradkEmilienM, i proposed the backport already18:00
dprinceEmilienM: I rechecked my patch BTW. Now that the redis, mongo fix landed. I don't think it is related but figured it worth a try since we are testing locally too18:00
EmilienMpradk: I'm approving it now18:00
pradkcool18:00
EmilienMpradk: it will work for sure after that18:00
EmilienMdprince: yes, my env is almost deployed, I'll debug18:00
EmilienMdprince: have you seen the failure in CI?18:00
EmilienMhttp://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/console.html#_2016-10-04_16_08_27_24322418:00
dprinceEmilienM: Yes, saw the pingtest18:00
EmilienMk18:01
dprinceEmilienM: the firewall rules seem to be in place too18:01
EmilienMright18:01
dprinceEmilienM: it's been off for a while so hard to say what this is18:01
EmilienMmaybe we are missing a rule18:01
*** ayoung has quit IRC18:01
EmilienMyeah, 9 weeks18:01
*** ayoung has joined #tripleo18:01
EmilienMI'll compare iptables rules since then18:01
EmilienMI'll compare when it worked and now I mean18:01
dprinceEmilienM: it's good it wasn't 9 1/2 weeks. Then we'd be in trouble18:01
EmilienMdprince: why?18:02
EmilienMyou're trolling me :P18:02
dprinceEmilienM: I'll let you google that one :)18:02
*** tiswanso has quit IRC18:02
EmilienMhttps://en.wikipedia.org/wiki/9%C2%BD_Weeks18:03
EmilienMok nice18:03
EmilienMdamn, it's not safe for work18:03
*** radeks has quit IRC18:03
*** tiswanso has joined #tripleo18:03
EmilienMdprince: dude, I didn't know that movie, thanks18:03
jprovaznthrash, not sure what you mean by 'auth_required = False"?18:04
EmilienMdprince: sad, we don't have logs anymore from 9 weeks ago18:05
*** abehl has joined #tripleo18:06
*** electrofelix has quit IRC18:06
Slower_is there a way to force delete a stack now?18:09
*** Slower_ is now known as Slower18:09
Slowerfailed stacks with mistral seem to break things and I'm not sure how to debug that yet18:10
EmilienMdprince: I found old logs with iptables, from August!18:10
EmilienMI'm comparing now18:10
shardySlower: you should always be able to do heat stack-delete overcloud && openstack plan delete overcloud18:10
shardypossibly you didn't do the plan delete?18:10
shardySlower: if you really need to reset the stack status, you can force it with heat-manage reset_stack_status18:11
Slowerplan?18:11
Sloweropenstack: 'plan' is not an openstack command. See 'openstack --help'.18:11
*** mcornea has joined #tripleo18:11
Slower| c820d2de-6b11-436a-bdda-4d2a146e2508 | overcloud  | DELETE_FAILED | 2016-10-04T17:08:45Z | 2016-10-04T17:15:31Z |18:11
shardySlower: sorry openstack overcloud plan delete18:11
Slowerah ok18:12
shardyit deletes the swift bucket and mistral environment18:12
Slowerrighto18:12
Slowerthanks I needed that :)18:12
shardyif you end up with that stuff partially created things can fail non-obviously18:12
* shardy raised a bug about it IIRC18:12
EmilienMdprince: I have a diff between old iptables rules and new ones https://www.diffchecker.com/7LjRfmZY18:12
Slower Cannot delete a plan that has an associated stack.18:12
Slowershardy: so I'm guessing I have to enable stack abandon and do that?18:13
shardySlower: you have to delete the stack first, then the plan18:13
SlowerDELETE_FAILED18:13
shardywhy is it delete failed tho?18:13
Slowerdo you really want to know? :)18:13
EmilienMsounds like we have a bunch of new rules18:13
*** Goneri has joined #tripleo18:13
shardySlower: hehe, I just mean that shouldn't ever happen unless something somewhere is broken18:14
dprinceEmilienM: vrrp?18:14
Slowershardy: well my stack is always broken..18:14
shardySlower: heat doesn't care how broken your stack is tho, it should always be able to delete it18:14
Slowercould be an ironic issue..18:14
Sloweror something18:14
shardyundeletable stacks are either a critical heat bug, or a misconfiguration somewhere18:14
Slowerthere are no instances up tho18:15
EmilienMdprince: indeed I don't see vrrp any more in new iptables18:15
EmilienMdprince: though vrrp is not useful for dhcp18:15
Slowershardy: it happens every once in a while18:15
Slowershardy: I always just enabled stack abandon and did that18:15
EmilienMhttp://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/console.html#_2016-10-04_16_08_21_79933918:15
EmilienMand we can see dhcp agent running fine18:15
shardySlower: I never see stack delete failures unless there's a bug or I somehow broke my undercloud18:16
EmilienMlooking in neutron logs to see if we see the dhcp request18:16
shardyso sure, you can abandon, but it's possible you're papering over some other issue18:16
Slowershardy: ResourceFailure: resources.ServiceChain: resources.ObjectStorageServiceChain.(pymysql.err.InternalError) (1205, u'Lock wait timeout exceeded; try restarting transaction') [SQL: u'DELETE FROM resource WHERE resource.id = %s'] [parameters: (410,)]18:16
EmilienMdprince: http://logs.openstack.org/64/381864/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/1886035/logs/overcloud-controller-0/var/log/neutron/dhcp-agent.txt.gz#_2016-10-04_16_01_38_08118:16
EmilienMCould not load neutron.agent.linux.interface.OVSInterfaceDriver18:17
Slowershardy: guessing that's it.. not sure yet18:17
EmilienMdo we have that log in current jobs?18:17
*** abehl has quit IRC18:17
shardySlower: ack, so that sounds like it could well be a bug - if you can capture some data from the heat-engine logs we might be able to figure out what18:17
shardyparticularly if it's failing consistently every delete18:17
EmilienMyes we have it, nevermind18:17
*** akshai has quit IRC18:18
dprinceEmilienM: all of the neutron API stuff is missing. I got it18:18
dprinceEmilienM: we specified neutron_server instead of neutron_api18:18
dprinceEmilienM: sec and I'll push the patch18:18
dprinceEmilienM: that diff was really helpful FWIW18:19
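[Editor's note] The old-versus-new iptables comparison discussed here can be approximated with a simple ordered set difference. This is only a minimal sketch: the function name and the rule strings are hypothetical placeholders, not rules taken from the CI logs.

```python
def missing_rules(old_rules, new_rules):
    """Return rules present in the old dump but absent from the new one.

    The order of the old dump is preserved; blank lines are ignored.
    """
    new_set = {rule.strip() for rule in new_rules}
    return [
        rule.strip()
        for rule in old_rules
        if rule.strip() and rule.strip() not in new_set
    ]


# Hypothetical rule dumps, for illustration only.
old = [
    "-A INPUT -p tcp --dport 9696 -j ACCEPT",
    "-A INPUT -p tcp --dport 22 -j ACCEPT",
]
new = ["-A INPUT -p tcp --dport 22 -j ACCEPT"]

print(missing_rules(old, new))
```

Swapping the arguments lists rules that are new rather than missing.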
Slowershardy: also if I run openstack stack failures list overcloud on the failed-to-delete stack it backtraces too18:19
Slowershardy: but that may not be  a bug, not sure18:19
EmilienMdprince: indeed18:19
EmilienMdprince: nice catch dude18:19
dprinceEmilienM: same patch? or new one. I'll have to rebase the other one anyway....18:19
EmilienMdprince: same patch18:19
EmilienMdprince: let's save time & resources18:20
EmilienMat this stage of the week18:20
openstackgerritSteven Hardy proposed openstack/python-tripleoclient: Add optional overcloud deploy roles_data.yaml override  https://review.openstack.org/37874018:20
EmilienMshardy: do you remember what we did with sahara this cycle?18:20
Slowershardy: https://paste.fedoraproject.org/443483/14756052/18:20
EmilienMI see iptables rules for sahara removed by default, we stopped deploying it by default right?18:20
shardyEmilienM: Not without looking at the git logs tbh18:21
shardyEmilienM: that reminds me I meant to discuss release notes in the meeting18:21
EmilienMshardy: reno?18:21
*** trown|lunch is now known as trown18:21
openstackgerritDan Prince proposed openstack/tripleo-heat-templates: Re-enable ManageFirewall by default.  https://review.openstack.org/38186418:21
EmilienMshardy: it's on my looong TODO list18:21
shardywe've not used reno this cycle (we should for ocata)18:21
*** bana_k has joined #tripleo18:21
dprinceEmilienM: ^^18:21
EmilienMdprince: ack thx!18:21
shardyEmilienM: yeah, but we need to generate some release notes manually this week18:21
EmilienMshardy: yes. I'm adding it today18:21
EmilienMshardy: I know how reno works, I don't need much time18:22
EmilienMshardy: the question is about newton18:22
EmilienMhow are we going to add release notes and where18:22
EmilienMimho it's too late, we should document in tripleo-docs18:22
EmilienMand use reno in ocata18:22
shardyEmilienM: yes, my assumption was that we'd collaborate on an etherpad which can be used for the release notes18:22
shardybut if that's not possible we can do it elsewhere18:23
shardyseems like something which could be done between the final code and cycle-trailing deadline to me18:23
Slowershardy: oh and this time it worked..18:23
EmilienMshardy: etherpad works for me as a draft before putting it in tripleo-docs18:23
shardyEmilienM: Ok, I'd prefer it to be included with the main OpenStack release notes, but if that's not possible then I guess tripleo-docs will work18:24
EmilienMshardy: how would it be included in main OpenStack release notes?18:24
shardylets capture the information & discuss with the release team18:25
EmilienMfor Newton I mean18:25
EmilienMwe could do it with reno but it's probably too late18:25
EmilienMin terms of tags, etc18:25
EmilienM(reno works with tags and branches)18:25
*** athomas has quit IRC18:25
shardyEmilienM: Ok maybe it is, I assumed there was some manual fallback method we could use, like all pre-reno releases18:25
shardyand also that by observing the cycle trailing deadline we'd be allowed more time to finalize the release notes18:26
Slowershardy: https://paste.fedoraproject.org/443493/60558414/18:26
EmilienMshardy: I think it's possible18:26
Slowershardy: can I abandon or do you want more info?18:26
shardySlower: something is obviously really broken but I don't have time to debug unfortunately18:27
shardyperhaps raise a heat bug if you think it's possible to reproduce18:27
*** shardy is now known as shardy_afk18:27
*** amoralej is now known as amoralej|off18:29
*** jpena is now known as jpena|off18:30
*** radeks has joined #tripleo18:31
*** bnemec has quit IRC18:34
*** dsariel_ has joined #tripleo18:42
openstackgerritCarlos Camacho proposed openstack/tripleo-common: Add support to create role main template file based in role.role.j2.yaml  https://review.openstack.org/38159318:49
*** akshai has joined #tripleo18:50
ccamachoEmilienM around? Mind approving https://review.openstack.org/#/c/378737/  ??18:52
ccamacho+2'ed and passing ovb jobs18:52
openstackgerritEmilien Macchi proposed openstack/python-tripleoclient: Add ReNo support  https://review.openstack.org/38204618:53
EmilienMccamacho: looking18:53
ccamachothanks!18:53
EmilienMccamacho: done18:54
*** rwsu_ has joined #tripleo19:01
*** rwsu has quit IRC19:04
openstackgerritPradeep Kilambi proposed openstack/tripleo-heat-templates: Ceilometer Wsgi Mitaka->Newton upgrades  https://review.openstack.org/36000419:04
EmilienMccamacho: the patch is not going to land19:09
EmilienMbecause of the Depends-On19:09
EmilienMbut we first need https://review.openstack.org/#/c/381903/19:10
EmilienMI restored it and ran recheck, let's see19:10
*** radeks has quit IRC19:11
openstackgerritSagi Shnaidman proposed openstack-infra/tripleo-ci: POC: WIP: Full quickstart gate run on OVB  https://review.openstack.org/38109419:14
rookbeagles sai hey - so back in the day I made a change to nova to reserve the host a bit more memory than upstream (512 -> 2048MB)... If we switch OOO to deploy DVR, and with the metadata service bug (memory hungry)... we should reserve the host a bit more memory... And with the latest DVR it might be a bit silly... (i.e., should we set a minimum to 10GB).19:15
d0ugalSlower: This may be useful, but it isn't very complete yet: http://www.dougalmatthews.com/2016/Sep/21/debugging-mistral-in-tripleo/19:16
saibeagles: yup, i can confirm the memory growth with DVR on controller..19:16
saii have data19:16
saisorry i meant compute19:17
Slowerd0ugal: I saw that.  Mostly it went over my head tbh :)19:18
Slowerit's pretty crazy when you list and there's so many actions and the seemingly random error/non-error status etc.19:18
d0ugalSlower: Yeah, I can imagine. I'll try and do a follow up to filter that - the number keeps increasing at the moment too19:19
* d0ugal isn't really here19:19
openstackgerritHonza Pokorny proposed openstack/tripleo-ui: Integrate node tagging workflow  https://review.openstack.org/36756219:20
openstackgerritmathieu bultel proposed openstack/python-tripleoclient: Download templates from swift before processing with heatclient  https://review.openstack.org/37954719:26
thrashjprovazn: https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_update.py#L3219:29
thrashsomehow I missed your response earlier... :)19:29
rookEmilienM so, I have an idea to make deployments a bit easier at scale, not sure if you are the right person to discuss the idea with.. However, here it goes... Right now when we do an OC deploy, and the nodes go from build->active, I have seen many times where either 1) the host doesn't finish up with the PXE install and/or the guest doesn't reboot. So a deployment/scale deployment ends up timing out (if you are not actively babysitting the deployment).19:33
rookto get around this -- manually, I typically just ping the set of nodes provisioning interface (after it goes from building-> active)19:33
jprovaznthrash, no idea (this is really looong time ago), but IIRC an existing overcloud command was used as a "template" for it19:33
rookif it doesn't ping after 5 minutes, I start rebooting the host via ironic... that typically "fixes" things, however sometimes that doesn't unwedge things...19:34
*** rbrady is now known as rbrady-afk19:34
rookAnyway, maybe there is somewhere we can add this logic?19:34
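[Editor's note] The manual workaround rook describes (ping each node's provisioning interface once it goes active, then reboot whatever is still silent after about five minutes) could be sketched roughly as follows. This is an assumption-laden illustration, not existing TripleO code: `watch_nodes`, `ping`, and `reboot` are hypothetical names, and the real ping and ironic reboot calls are left to the caller.

```python
import time


def watch_nodes(nodes, ping, reboot, timeout=300, interval=10,
                sleep=time.sleep, clock=time.monotonic):
    """Poll nodes until they answer a ping; reboot the ones that never do.

    `ping(node) -> bool` and `reboot(node)` are caller-supplied hooks
    (hypothetical; e.g. an ICMP check and an ironic power cycle).
    Returns the sorted list of nodes that were rebooted.
    """
    deadline = clock() + timeout
    pending = set(nodes)
    while pending and clock() < deadline:
        # Keep only the nodes that still do not respond.
        pending = {node for node in pending if not ping(node)}
        if pending:
            sleep(interval)
    rebooted = sorted(pending)
    for node in rebooted:
        reboot(node)
    return rebooted
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays. As noted in the discussion, a reboot does not always unwedge a node, so a fuller version would probably need a retry limit and a way to reschedule to a different node.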
jprovaznthrash, e.g. I can see in "git log" that rdomanager_oscplugin/v1/overcloud_deploy.py  used the same, so there is a solid chance that if you find answer why it was used for other commands, the same will apply for this one :)19:34
thrashjprovazn: ok. I'm really wondering if that should still be the case. Anyway, thanks for the answer. I'll review all of them.19:40
*** shardy_afk is now known as shardy19:41
shardyrook: that's really interesting feedback, thanks19:41
ccamachoEmilienM this already landed19:41
ccamachohttps://review.openstack.org/#/c/378750/19:42
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Downloads templates from swift before processing update  https://review.openstack.org/38189919:42
EmilienMccamacho: not backported19:42
shardyrook: I think there may be somewhere we could wire that in provided the host actually booted19:42
ccamachoaaaaaaaaaaaa I see19:42
EmilienMccamacho: our CI is running stable/newton packages...19:42
shardyrook: it's more difficult if it failed to boot and the ironic reboot fixes a boot issue19:42
rookshardy if there is somewhere I could add this check, I don't mind looking at implementing it... it is a PITA to find out that a host is either A) Stuck in PXE or B) Stuck due to some fubar raid controller, at which point we should reschedule to a different node.19:42
shardyrook: yeah, I think it's a discussion to have with dtantsur|afk and lucasagomes when they're around19:43
shardyrook: there are probably things we could do via mistral to e.g ping nodes before attempting to configure them, but right now the only ping validation we do is inside the nodes19:43
shardyso clearly we're not getting to that if it's a boot time problem19:44
rookshardy yeah... it happens quite often... which isn't a _huge_ deal... but if a customer is scaling an install out, and they hit this -- they need to debug the reason for the failure, IE look at nova list, see which hosts don't ping, then ironic node-list | grep nova-uuid, then open up the console to that host and figure out wtf went wrong.19:44
shardyrook: could you perhaps start a ML thread, or raise a bug about the symptoms and workaround you have in mind?19:44
rookright shardy19:44
shardyrook: I think we definitely want to act on this operational experience, it's just worth some wider discussion before we do a tripleo specific workaround I think19:45
shardyrook: one thing which would really help is if you can describe in detail the recovery workflow you want to automate19:46
shardyincluding which API (nova, ironic) and what state things are in19:46
shardyrook: I started looking into a mistral workflow which deploys nodes directly via ironic (see https://review.openstack.org/#/c/313048/), but it's possible we could have a workflow that instead automates the nova->ironic dance you're describing19:47
lucasagomesshardy, rook hi there. Yeah we do lack a mechanism to check if the machine did boot correctly or not19:48
rookshardy sure - so a RFE bz and a us discussion19:49
larsksshardy, in https://github.com/openstack/instack-undercloud/commit/55ccd0e1e8c66c9f474112358063bc263720d84f, shouldn't heat::limit_iterators be heat::yaql_limit_iterators?  I am looking at e.g. https://github.com/openstack/puppet-heat/blob/stable/newton/manifests/init.pp#L42219:51
*** mbozhenko has joined #tripleo19:52
shardyrook: yup, and/or just file a launchpad bug where we can discuss further19:54
shardythanks for the feedback19:54
shardylarsks: yes!  Good catch :)19:55
shardyhttp://logs.openstack.org/64/381864/3/check/gate-tripleo-ci-centos-7-nonha-multinode/3eff38e/logs/etc/heat/heat.conf.txt.gz19:55
shardylarsks: you can see in that recent CI job we're only setting memory_quota19:55
larsksshardy, I will file a change...19:55
shardylarsks: thanks19:55
shardyI don't think we actually hit that limit, but it seemed a good idea to bump them both at the same time19:56
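[Editor's note] The mistake larsks spotted (a hiera key named `heat::limit_iterators` where the class parameter is `heat::yaql_limit_iterators`) is the kind of typo a simple key check can catch. The sketch below is illustrative only: the helper name, the set of known parameters, and the values are assumptions, and a real check would read the parameter list from the puppet class itself.

```python
def unknown_keys(hieradata, class_name, known_params):
    """Return hiera keys for `class_name` that match no known class parameter.

    Only direct parameters (class_name::param) are checked; keys for
    nested classes such as class_name::sub::param are skipped.
    """
    prefix = class_name + "::"
    bad = []
    for key in hieradata:
        if not key.startswith(prefix):
            continue
        param = key[len(prefix):]
        if "::" in param:
            continue  # belongs to a nested class, out of scope here
        if param not in known_params:
            bad.append(key)
    return sorted(bad)


# Assumed parameter names and made-up values, for illustration only.
known = {"yaql_limit_iterators", "yaql_memory_quota"}
hiera = {"heat::limit_iterators": 1000, "heat::yaql_memory_quota": 100000}

print(unknown_keys(hiera, "heat", known))
```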
*** mbozhenko has quit IRC19:57
*** paramite has quit IRC19:58
openstackgerritLars Kellogg-Stedman proposed openstack/instack-undercloud: correctly spell yaql_limit_iterators  https://review.openstack.org/38205619:58
shardyEmilienM: FYI I abandoned https://review.openstack.org/#/c/381903/ because https://review.openstack.org/#/c/378737/ won't merge to master with the backport posted19:58
shardyI'll re-approve to see if we can get the master tht patch to gate, then we can restore19:58
EmilienMwe have a chicken and egg problem then19:58
EmilienMour CI is using packages from stable/newton19:58
shardyFor patches to master?19:59
shardywe need to land both dependent patches to master, then propose them to stable/newton, no?19:59
EmilienMshardy: yes19:59
EmilienMuntil we have ocata repo, tripleo takes newton packages.19:59
EmilienMsince Friday.20:00
EmilienMit's how delorean works, they take latest branch20:00
shardyHmm, OK20:00
shardythen I guess we have to land https://review.openstack.org/#/c/381903/20:00
EmilienMyes20:01
*** dprince has quit IRC20:01
EmilienMshardy: i'm approving it20:01
shardyEmilienM: Ok, thanks, I hadn't realized we had such a branch issue with CI20:01
EmilienMshardy: I hope it will pass CI though20:01
EmilienMbecause our multinode job is voting20:01
shardyI don't think there's any reason for this to fail the multinode job, so hopefully it will be OK20:02
EmilienMok20:03
*** shardy has quit IRC20:03
*** lblanchard has quit IRC20:06
EmilienMdprince left, firewall still not passing pingtest http://logs.openstack.org/64/381864/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/045a9cc/console.html#_2016-10-04_19_59_14_92698420:06
*** paramite has joined #tripleo20:06
*** ccamacho has quit IRC20:08
*** Goneri has quit IRC20:09
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: Move the main template files for defalut services to new syntax generation  https://review.openstack.org/38197520:15
*** ipsecguy has quit IRC20:20
*** ipsecguy has joined #tripleo20:20
*** gchamoul has quit IRC20:21
*** gchamoul has joined #tripleo20:22
*** rbowen has quit IRC20:23
*** zeroshft has joined #tripleo20:28
*** zeroshft has quit IRC20:28
*** egafford has quit IRC20:29
*** coolsvap has quit IRC20:32
*** maticue has joined #tripleo20:34
*** jayg is now known as jayg|g0n320:37
*** rbowen has joined #tripleo20:38
*** shardy has joined #tripleo20:39
*** paramite has quit IRC20:47
*** jprovazn has quit IRC20:54
*** dhill_ has quit IRC20:54
*** chem has quit IRC20:54
*** oneswig has joined #tripleo20:58
*** dhill_ has joined #tripleo20:58
*** dhill_ has quit IRC21:01
*** dhill_ has joined #tripleo21:01
*** rbowen has quit IRC21:10
*** akrivoka has quit IRC21:11
*** trown is now known as trown|outtypewww21:12
*** shardy has quit IRC21:15
*** tiswanso has quit IRC21:15
*** mcornea has quit IRC21:15
*** ipsecguy has quit IRC21:18
*** ipsecguy has joined #tripleo21:18
*** jeckersb is now known as jeckersb_gone21:25
*** adam_g` is now known as adam_g21:25
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Do Not Merge - Test Undercloud upgrade mitaka -> newton  https://review.openstack.org/38130921:27
openstackgerritJames Slagle proposed openstack-infra/tripleo-ci: Undercloud upgrade for mitaka and newton  https://review.openstack.org/38128621:27
*** akshai has quit IRC21:30
openstackgerritMerged openstack/tripleo-common: Modify j2 templating to allow role files generation  https://review.openstack.org/38190321:36
*** rbrady-afk is now known as rbrady21:38
openstackgerritMerged openstack/tripleo-heat-templates: j2 template role config templates  https://review.openstack.org/37873721:40
*** dbecker has quit IRC21:40
*** b00tcat has quit IRC21:43
*** jcoufal_ has quit IRC21:45
*** akshai has joined #tripleo21:48
*** akshai has quit IRC21:49
*** bana_k has quit IRC21:50
*** bana_k has joined #tripleo21:52
*** mbozhenko has joined #tripleo21:53
openstackgerritEmilien Macchi proposed openstack-infra/tripleo-ci: enable undercloud/ssh on multinode jobs  https://review.openstack.org/38208221:56
*** mbozhenko has quit IRC21:58
slagleMOAR ssh21:59
*** oneswig has quit IRC22:00
EmilienM:)22:03
*** mburned is now known as mburned_out22:04
dmsimardslagle: fyi https://review.rdoproject.org/r/#/c/3044/122:12
*** tobias-fiberdata has joined #tripleo22:29
*** tobias_fiberdata has quit IRC22:34
openstackgerritHonza Pokorny proposed openstack/tripleo-common: Support node untagging  https://review.openstack.org/37262822:34
*** dsariel_ has quit IRC22:36
*** pradk has quit IRC22:48
*** rcernin has quit IRC22:50
*** rhallisey has quit IRC22:55
*** yamahata has quit IRC22:55
*** dsariel_ has joined #tripleo23:04
*** rajinir has quit IRC23:05
*** tiswanso has joined #tripleo23:09
*** tiswanso has quit IRC23:13
*** saneax-_-|AFK is now known as saneax23:13
*** sthillma has joined #tripleo23:16
*** yamahata has joined #tripleo23:17
*** sthillma has quit IRC23:21
*** penick has quit IRC23:26
*** penick has joined #tripleo23:30
*** bana_k has quit IRC23:39
*** bana_k has joined #tripleo23:42
*** maticue has quit IRC23:42
openstackgerritBrad P. Crochet proposed openstack/python-tripleoclient: Downloads templates from swift before processing update  https://review.openstack.org/38189923:51
*** mbozhenko has joined #tripleo23:53
*** mbozhenko has quit IRC23:58
