Friday, 2016-11-04

*** bank_ has quit IRC00:02
*** apetrich has quit IRC00:04
*** apetrich has joined #tripleo00:04
*** ooolpbot has joined #tripleo00:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION00:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796100:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)00:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835000:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886400:10
*** ooolpbot has quit IRC00:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)00:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)00:10
*** fultonj has quit IRC00:12
*** fultonj has joined #tripleo00:14
*** panda is now known as panda|zZ00:18
*** links has joined #tripleo00:27
*** limao has joined #tripleo00:36
*** b00tcat has quit IRC00:44
*** dougbtv has joined #tripleo00:46
*** percevalbot has quit IRC00:58
*** cdearborn has quit IRC01:00
*** maticue has quit IRC01:06
*** jkilpatr has quit IRC01:07
*** ooolpbot has joined #tripleo01:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION01:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796101:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)01:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835001:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886401:10
*** ooolpbot has quit IRC01:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)01:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)01:10
*** b00tcat has joined #tripleo01:11
*** saneax is now known as saneax-_-|AFK01:15
*** lblanchard has quit IRC01:19
*** jerrygb has joined #tripleo01:26
*** jerrygb_ has quit IRC01:27
*** jerrygb_ has joined #tripleo01:32
*** jerrygb has quit IRC01:35
*** jerrygb_ has quit IRC01:39
*** jerrygb has joined #tripleo01:41
*** dougbtv has quit IRC01:44
*** jerrygb has quit IRC01:49
*** jerrygb has joined #tripleo01:50
*** chlong has joined #tripleo02:06
*** ooolpbot has joined #tripleo02:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION02:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796102:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835002:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886402:10
*** ooolpbot has quit IRC02:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)02:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)02:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)02:10
*** jerrygb has quit IRC02:12
*** jerrygb has joined #tripleo02:13
*** fzdarsky__ has joined #tripleo02:14
*** rlandy has quit IRC02:14
*** fzdarsky_ has quit IRC02:17
*** rhallisey has quit IRC02:17
openstackgerritSteve Baker proposed openstack/tripleo-common: Clean up configure_containers.sh script  https://review.openstack.org/38486502:21
openstackgerritSteve Baker proposed openstack/tripleo-common: Allow building heat-agents image from master  https://review.openstack.org/38486602:21
openstackgerritSteve Baker proposed openstack/tripleo-common: Create new docker command hook  https://review.openstack.org/31272302:21
openstackgerritSteve Baker proposed openstack/tripleo-common: Install configuration files for all downloaded packages  https://review.openstack.org/34741202:21
openstackgerritSteve Baker proposed openstack/tripleo-common: Create new docker command hook  https://review.openstack.org/31272302:24
*** jerrygb has quit IRC02:29
*** jerrygb has joined #tripleo02:30
*** jerrygb_ has joined #tripleo02:42
*** jerrygb has quit IRC02:44
*** jerrygb has joined #tripleo02:47
*** jerrygb_ has quit IRC02:50
*** dmacpher has joined #tripleo02:54
*** ooolpbot has joined #tripleo03:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION03:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796103:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)03:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835003:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886403:10
*** ooolpbot has quit IRC03:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)03:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)03:10
*** yamahata has quit IRC03:10
openstackgerritSteve Baker proposed openstack/tripleo-common: Create new docker command hook  https://review.openstack.org/31272303:14
openstackgerritRedHat RDO CI proposed openstack/tripleo-heat-templates: GATE TEST, please ignore  https://review.openstack.org/36544903:30
*** apetrich has quit IRC03:35
*** apetrich has joined #tripleo03:36
*** sudipto has joined #tripleo03:46
*** sudipto_ has joined #tripleo03:47
*** ebalduf has quit IRC03:50
*** numans has joined #tripleo03:51
*** ooolpbot has joined #tripleo04:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION04:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796104:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)04:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835004:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886404:10
*** ooolpbot has quit IRC04:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)04:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)04:10
*** jerrygb has quit IRC04:24
*** abregman has quit IRC04:25
*** tzumainn has quit IRC04:46
*** sudipto_ has quit IRC05:10
*** sudipto has quit IRC05:10
*** ooolpbot has joined #tripleo05:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION05:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796105:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835005:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163886405:10
*** ooolpbot has quit IRC05:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)05:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)05:10
openstackLaunchpad bug 1638864 in tripleo "mariadb-10.1.18 breaks the resource agent" [Critical,In progress] - Assigned to Michele Baldessari (michele)05:10
*** abehl has joined #tripleo05:13
*** masco has joined #tripleo05:24
*** ramishra has quit IRC05:29
*** ramishra has joined #tripleo05:31
*** saneax-_-|AFK is now known as saneax05:52
*** rcernin has joined #tripleo05:55
*** sudipto_ has joined #tripleo05:56
*** sudipto has joined #tripleo05:56
*** ooolpbot has joined #tripleo06:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION06:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796106:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)06:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835006:10
*** ooolpbot has quit IRC06:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)06:10
*** florianf has joined #tripleo06:38
*** florianf has quit IRC06:42
*** florianf has joined #tripleo06:49
*** gfidente has quit IRC06:52
*** nyechiel has joined #tripleo06:55
*** bana_k has joined #tripleo06:57
*** mrunge has quit IRC07:03
*** lmiccini has joined #tripleo07:04
*** tesseract has joined #tripleo07:04
*** tesseract is now known as Guest1319407:04
*** rasca has joined #tripleo07:05
*** mrunge has joined #tripleo07:07
*** ooolpbot has joined #tripleo07:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION07:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796107:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)07:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835007:10
*** ooolpbot has quit IRC07:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)07:10
*** bana_k has quit IRC07:12
*** asalkeld has joined #tripleo07:14
*** liverpooler has joined #tripleo07:20
*** jprovazn has joined #tripleo07:22
openstackgerritMerged openstack/tripleo-validations: Pass the the custom cacert to nova and heat client  https://review.openstack.org/39083307:30
*** pcaruana has joined #tripleo07:33
*** chlong has quit IRC07:33
*** b00tcat` has joined #tripleo07:34
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-validations: Updated from global requirements  https://review.openstack.org/39117007:38
*** cylopez has joined #tripleo07:38
*** mhenkel has joined #tripleo07:43
*** mcornea has joined #tripleo07:46
*** zoli|gone is now known as zoli|trng-afk07:48
*** dmacpher has quit IRC07:52
*** chandankumar has joined #tripleo07:52
*** ccamacho has quit IRC07:53
*** florianf has quit IRC07:58
*** ohamada has joined #tripleo07:59
*** shardy has joined #tripleo08:00
*** florianf has joined #tripleo08:04
*** ccamacho has joined #tripleo08:05
*** jaosorior has joined #tripleo08:07
*** d0ugal has joined #tripleo08:08
*** fragatina has joined #tripleo08:09
*** fragatina has quit IRC08:10
*** ooolpbot has joined #tripleo08:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION08:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796108:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835008:10
*** ooolpbot has quit IRC08:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)08:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)08:10
*** fragatina has joined #tripleo08:10
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: WIP prototyping composable upgrades with Heat+Ansible  https://review.openstack.org/39344808:11
shardymarios: Hey g'morning - FYI I started trying some things related to the discussion on the spec, see ^^08:12
shardynot yet functional, but if you're happy with the approach I'll spend some more time on it today08:12
shardyit's pretty similar to the previous ansible prototyping, but it's wired in with the new composable services interfaces08:13
mariosshardy: awesome. i tried keeping it updated after the initial discussion this morning (thanks for including the context/chat with emilienm there nice to have it all in one place)08:15
mariosshardy: the spec i mean08:15
mariosheh s/this morning/ this week  even :/08:15
* marios gets the coffee08:15
shardymarios: yup thanks for updating the spec - perhaps we can push a little more on the first pass of prototyping then do a final spec update when we're comfortable with the approach/interfaces08:15
mariosshardy: yeah sounds good... i think by next week end we should have a fairly good idea on the overall approach so we can land it in time for O1 easy08:16
shardyI feel fairly happy with how it fits now, and we could still reuse the same pieces in future even if ansible was run by something other than heat08:16
*** ebarrera has joined #tripleo08:16
mariosshardy: thanks very much for picking that up shardy even though I'd love to have the time to play a bit too :(08:16
shardymarios: np - honestly I think the hard part will be writing all the per-service snippets and not so much the initial architecture08:17
*** yamahata has joined #tripleo08:17
mariosshardy: check email when you have a chance pls08:17
shardyI'm thinking we may be able to reuse some of the stuff from jistr's earlier ansible prototype there tho08:17
mariosshardy: yeah i softened the 'no ansible' on the alternatives... it had occurred to me that since we are using ansible now we may be able to pick something from there... though that was designed to be run stand-alone. I guess the output of the upgrade snippet munging on the tht side will be something like the stuff that jistr wrote (which were the e-e upgrades playbooks)08:19
*** athomas has joined #tripleo08:19
shardymarios: yeah exactly - I expect the output to be a playbook with a bunch of tasks tagged with each step08:20
shardywe can even write it to e.g /root/tripleo_upgrade_steps.yml on the nodes (we could do the same with the per-role puppet manifests)08:20
shardyor at least make it easy to grab it if you want to run it via ansible directly08:21
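The playbook shape described above (one play against localhost, with every task tagged by upgrade step) can be sketched roughly as follows. This is only an illustration under stated assumptions: the service names, the task contents, and the `build_upgrade_playbook` helper are hypothetical and are not taken from the review under discussion. The result is dumped as JSON purely to stay within the standard library.

```python
import json

# Hypothetical per-service upgrade snippets. The service names, tasks and
# step numbers are illustrative assumptions, not the real tripleo data.
SERVICE_UPGRADE_TASKS = {
    "glance_api": [
        {"step": 1, "name": "Stop glance-api",
         "service": {"name": "openstack-glance-api", "state": "stopped"}},
        {"step": 2, "name": "Update glance packages",
         "yum": {"name": "openstack-glance", "state": "latest"}},
    ],
    "swift_proxy": [
        {"step": 1, "name": "Stop swift-proxy",
         "service": {"name": "openstack-swift-proxy", "state": "stopped"}},
    ],
}


def build_upgrade_playbook(service_tasks):
    """Merge per-service snippets into one play, tagging tasks by step."""
    collected = []
    for service in sorted(service_tasks):
        for snippet in service_tasks[service]:
            # Copy everything except the bookkeeping "step" key.
            task = {k: v for k, v in snippet.items() if k != "step"}
            task["tags"] = ["step%d" % snippet["step"]]
            collected.append((snippet["step"], task))
    # Stable sort: tasks stay grouped by service within each step.
    collected.sort(key=lambda item: item[0])
    return [{
        "hosts": "localhost",
        "connection": "local",
        "tasks": [task for _, task in collected],
    }]


playbook = build_upgrade_playbook(SERVICE_UPGRADE_TASKS)
print(json.dumps(playbook, indent=2))
```

With tasks tagged this way, a single step could in principle be selected with `ansible-playbook --tags step1`, which is one way to read "easy to grab it if you want to run it via ansible directly".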
mariosshardy: yeah that would be cool... we could then invoke that with w/e - even just a 'upgrade-my-node.sh' which invoked the playbook. i mean we already have upgrade-non-controller.sh08:21
shardymarios: yeah, I was thinking we'd retain the option to just deploy an upgrade script on the node, and optionally run the upgrade08:22
shardythat way folks can do their special snowflake things on upgrade if they really want to08:22
mariosshardy: i mean that assumes a /root/tripleo_upgrade_node.sh08:22
shardymarios: yeah, but that could just be a wrapper for ansible or $whatever08:23
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: WIP TLS everywhere job  https://review.openstack.org/39173808:23
shardybrb08:23
mariosshardy: right... i'm saying the 'deliver but not invoke yet' is what we do for the non controller upgrade scripts08:23
matbushardy: hey, i was looking briefly at your review08:24
matbushardy: the ansible playbook would be run on localhost on the nodes ?08:25
*** d0ugal has quit IRC08:28
matbumarios: by the way, i kick the pingtest between controller upgrade and compute, and it failed, glance is not reachable, i need to investigate08:28
*** amoralej|off is now known as amoralej08:29
openstackgerritCarlos Camacho proposed openstack/tripleo-heat-templates: Reload haproxy configuration as a post-deployment step  https://review.openstack.org/39364408:30
mariosmatbu: check if swift is started on overcloud they might be down08:30
mariosmatbu: https://review.openstack.org/#/c/392680/08:31
matbumarios: yup i'll check, but i applied the review (afair, it was late :))08:31
mariosmatbu: was the pingtest fail like this https://bugzilla.redhat.com/attachment.cgi?id=121657408:32
marios500 Internal Server Error tripleo.sh -- Overcloud pingtest, uploading demo tenant image to glance08:32
matbumarios: 500 Internal Server Error: Failed to upload image08:33
matbumarios: i didn't check the log08:33
mariosmatbu: and on controllers there is this trace https://bugzilla.redhat.com/attachment.cgi?id=1216577 ...08:33
mariosmatbu: yeah sounds similar anyway08:33
mariosmatbu: but even after i fixed swift, still have an issue with heat stack domain... there is info in https://bugzilla.redhat.com/show_bug.cgi?id=1386719#c808:33
openstackbugzilla.redhat.com bug 1386719 in rhel-osp-director "OSP9 to OSP10 upgrade pingtest fails." [High,New] - Assigned to mandreou08:33
matbumarios: actually it depends on the priority, should i spend some time on that ? or just try to go further (i mean compute nodes upgrade and converge)08:34
mariosmatbu: so i'd say try and continue for now. i think it would be good if you could first verify that you are hitting the same 'swift was down' main issue there and that review fixes it08:34
*** jpena|off is now known as jpena08:35
mariosmatbu: if you also then hit the subsequent heat stack domain auth issue then perhaps we should file BZ for it as another issue (i.e. use https://bugzilla.redhat.com/show_bug.cgi?id=1386719 for the 'fix swift' which is a problem anyway)08:35
openstackbugzilla.redhat.com bug 1386719 in rhel-osp-director "OSP9 to OSP10 upgrade pingtest fails." [High,New] - Assigned to mandreou08:35
matbumarios: ack08:35
mariosmatbu: s/we/i will file a bz08:36
mariosmatbu: thanks08:36
matbumarios: yep pretty much the same stack trace08:38
matbubut i applied the fix08:38
*** ebarrera has quit IRC08:40
mariosmatbu: hmm there may be a nit there then I set -1 on the review https://review.openstack.org/#/c/392680/208:41
mariosmatbu: can you check the logs. was there an attempt to start the swift services?08:42
matbumarios: yep08:42
matbumarios: but actually swift is running08:43
mariosmatbu: k, unsetting -1 then if swift is running  :)08:44
matbumarios: lol seconds08:45
mariosmatbu: /me jumpy08:45
matbuhehe08:45
matbumarios: so yes swift is running, the review works fine, but there is another issue08:46
mariosmatbu: ok08:47
*** hewbrocca_afk is now known as hewbrocca08:48
mariosmatbu: so don't spend *too* long imo... capture some debug info in case but then try to get onto the converge etc08:48
socialmoin08:49
matbumarios: yep, it looks like it's an authentication issue08:49
mariosmatbu: right so it may be the overcloud heat domain issue i saw then08:49
marioshttps://bugzilla.redhat.com/show_bug.cgi?id=1386719#c708:50
openstackbugzilla.redhat.com bug 1386719 in rhel-osp-director "OSP9 to OSP10 upgrade pingtest fails." [High,New] - Assigned to mandreou08:50
mariosmatbu: like this https://bugzilla.redhat.com/attachment.cgi?id=121658608:50
mariosERROR: Authorization failed.08:50
shardymarios: are you sure that's not another manifestation of https://bugzilla.redhat.com/show_bug.cgi?id=1388474 ?08:51
openstackbugzilla.redhat.com bug 1388474 in openstack-tripleo "Overcloud heat fails to create an IAM user" [Unspecified,On_dev] - Assigned to shardy08:51
shardyfixed by https://review.openstack.org/#/c/39228808:51
shardypuppet misconfigures the heat.conf after the upgrade unless you have that fix08:52
* matbu lost in the bz08:52
mariosshardy: thanks i was not aware of that bug reading08:52
shardyyou can tell by looking inside heat.conf - without the fix the heat domain settings are all unset, and openstack domain list with the overcloudrc.v3 won't have the special heat_stack domain08:53
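The heat.conf check described above could be scripted along these lines. This is a minimal sketch, assuming the options live in the `[DEFAULT]` section; `stack_domain_admin` and `stack_domain_admin_password` appear in this discussion, while including `stack_user_domain_name` in the list and the `missing_stack_domain_options` helper itself are assumptions made for illustration.

```python
import configparser

# Options expected to be set when the heat stack domain is configured.
# The third name is an assumption added for illustration.
STACK_DOMAIN_OPTIONS = (
    "stack_domain_admin",
    "stack_domain_admin_password",
    "stack_user_domain_name",
)


def missing_stack_domain_options(conf_text, options=STACK_DOMAIN_OPTIONS):
    """Return the stack-domain options that are unset or empty in [DEFAULT]."""
    # interpolation=None: passwords may contain '%'.
    # strict=False: tolerate repeated keys in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    defaults = parser.defaults()
    return [opt for opt in options if not defaults.get(opt, "").strip()]


# Made-up file contents for the two cases, not real configuration.
configured = """[DEFAULT]
stack_domain_admin = heat_stack_domain_admin
stack_domain_admin_password = not-a-real-password
stack_user_domain_name = heat_stack
"""
unconfigured = """[DEFAULT]
#stack_domain_admin = <None>
"""

print(missing_stack_domain_options(configured))
print(missing_stack_domain_options(unconfigured))
```

On a node, the text would come from reading the deployed heat.conf; an empty result suggests the settings survived the upgrade, which is the case marios reports a few lines later.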
*** paramite has joined #tripleo08:53
mariosshardy: i went looking for earlier 8/9 heat domain related issues we had in the past yesterday but this seems to be a new thing08:53
shardymatbu: Hey sorry missed your question, yes for now at least the playbook will be run just on the localhost on a node by node basis08:55
shardythat is the only ansible model heat supports08:55
shardyin future we might enable a different mode where the same playbooks are driven from the undercloud without heat, but this is IMO an easier first step08:55
mariosshardy: cool would be fantastic if this fixes it ... i just checked heat.conf and i do see things like stack_domain_admin = heat_stack_domain_admin08:55
mariosstack_domain_admin_password = BQqY etc08:56
shardymarios: ah, possibly a different issue then, IME those are unset when the bug I referenced is present08:56
shardypuppet actually unconfigures them on update08:56
matbushardy: k, yes i was wondering of something more "ansible oriented".08:57
matbushardy: actually heat is limiting us08:57
mariosshardy: ok... it still sounds relevant though i mean it is about keystone domain auth failure which isn't happening before we upgrade the controllers08:57
shardymatbu: Yeah - like, this is the first step, e.g generating the playbook with all the per-upgrade-step tags, for all services08:57
shardymatbu: heat provides an easy way to run those, but we could also dump the playbook out and let the user run it some other way08:58
shardyand/or automate doing that via a mistral workflow like we do for validations08:58
*** d0ugal_ has joined #tripleo08:58
shardybut that's a more difficult fit for our existing heat orientated architecture, so I'd prefer to tackle that as a later feature08:58
*** gfidente has joined #tripleo08:58
*** gfidente has quit IRC08:58
*** gfidente has joined #tripleo08:58
matbushardy: yep, k08:58
shadowerjaosorior: could you have a look at these one liners, pls? https://review.openstack.org/#/c/391280/ https://review.openstack.org/#/c/391272/ and https://review.openstack.org/#/c/391267/08:58
shardymatbu: well heat isn't limiting us exactly, it's just that there is overlap between heat and ansible in this case08:59
matbushardy: and what about handling ansible via the tripleoclient ?08:59
shardye.g heat or ansible could orchestrate a rolling upgrade, but not both at the same time08:59
matbushardy: yes agree with the overlap08:59
*** jpich has joined #tripleo08:59
*** jlinkes has joined #tripleo08:59
shardymatbu: not sure I follow, what is the ansible/tripleoclient requirement?09:00
openstackgerritMerged openstack/tripleo-quickstart: Drop *openstack/common* in flake8 exclude list  https://review.openstack.org/37387409:00
jaosoriorshardy: sure09:01
matbushardy: well, it's just a thought, but i was wondering about just using heat for managing deploying nodes and ansible for applying config (and so upgrade things) ... but it's really far away from what we have today09:01
shardymatbu: yes, that's one possible end-goal here, but not something I'm considering in this iteration09:01
matbushardy: so the client could orchestrate ansible (through the python ansible api)09:01
shardymatbu: I think when we have this model in place, it'd be pretty simple to go that extra step and disable heat running both puppet and ansible09:02
shardyand dump out a playbook which contains the deploy and upgrade steps09:02
matbushardy: i was also wondering of something that could make a graph of the dependencies of the services/roles for ordering the upgrade (in the client)09:02
shardythen folks can do whatever they want (and stop complaining about heat :\)09:02
matbushardy: hehe yep09:03
shardymatbu: So, yes, but we have to avoid putting business logic in tripleoclient itself, particularly if we want that functionality to be available via the UI09:03
shardyso it might be in a mistral workflow instead09:03
matbushardy: ha yes right , i always forget the UI :/09:04
shardyI can imagine e.g openstack overcloud deploy --templates -e $tht/environments/no-config.yaml09:04
openstackgerritMerged openstack/tripleo-validations: Fix the pacemaker-status validation  https://review.openstack.org/39128009:05
shardywhich would deploy the nodes and noop all puppet/ansible configuration09:05
shardythen we'd provide the per-role puppet and ansible stuff as outputs from the stack09:05
openstackgerritMerged openstack/tripleo-validations: Fix the rabbitmq-limits validations  https://review.openstack.org/39127209:05
openstackgerritMerged openstack/tripleo-validations: Fix the check-network-gateway validation  https://review.openstack.org/39126709:05
shardyor write them into swift, or whatever09:05
matbuyep09:06
shardymatbu: something that graphs the dependencies for both deploy and upgrade would be really useful09:06
shardyI suspect it might be fairly tricky to write tho09:06
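The ordering half of such a dependency graph is essentially a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The dependency map is entirely hypothetical, and deriving real dependencies from the service templates, the tricky part mentioned above, is not attempted here.

```python
from graphlib import TopologicalSorter

# Hypothetical map: service -> services that must be upgraded before it.
# These edges are illustrative assumptions, not real tripleo dependencies.
UPGRADE_DEPENDENCIES = {
    "mysql": set(),
    "rabbitmq": set(),
    "keystone": {"mysql", "rabbitmq"},
    "glance": {"keystone"},
    "nova": {"keystone", "glance"},
}


def upgrade_batches(dependencies):
    """Group services into batches; each batch depends only on earlier ones."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable order
        batches.append(ready)
        sorter.done(*ready)
    return batches


for number, batch in enumerate(upgrade_batches(UPGRADE_DEPENDENCIES), start=1):
    print("batch %d: %s" % (number, ", ".join(batch)))
```

Services in the same batch have no ordering constraint between them, so each batch could map onto one upgrade step.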
openstackgerritJulie Pichon proposed openstack/python-tripleoclient: Pass clients to get the get_password function  https://review.openstack.org/39319209:06
matbushardy: so the UI could handle the deploy/upgrade with this solution ? (calling mistral heat api only)09:07
shardymatbu: yes, that's one reason I've left driving ansible inside heat09:07
matbushardy: yes i was wondering about that, i wanted to test something, but the upgrade bugs are killing me :)09:07
shardythe UI doesn't have to change at all, and all existing upgrade docs still work09:07
matbu(i mean graph)09:07
matbushardy: k, but is it too late for ocata ? or do you think we could try something like this for ocata ?09:09
shardythe other thing to consider is containers - I actually think the heat model will work much better for upgrades in that case09:09
shardyso I wanted to wire this all in via the heat model now, then we can review the next step when containerization is done09:09
shardymatbu: this upgrade work is targeted at ocata09:09
shardyit's basically essential since we released composable roles09:10
*** ooolpbot has joined #tripleo09:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION09:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796109:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835009:10
*** ooolpbot has quit IRC09:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)09:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)09:10
matbushardy: yep but i mean the model : let heat manage/deploy nodes and output the playbook/manifest to ansible/puppet .. could this model be targeted for ocata ?09:10
*** psanchez has quit IRC09:10
*** lucas-afk is now known as lucasagomes09:12
*** nyechiel has quit IRC09:13
*** katkapilatova has joined #tripleo09:13
*** psanchez has joined #tripleo09:13
shardymatbu: possibly but I don't view it as a super high priority given all the other work we have this (short) cycle09:16
shardymatbu: basically we have to get composable upgrades done first, and you might consider your requirement related to https://blueprints.launchpad.net/tripleo/+spec/split-stack-default09:16
shardyI suspect by the end of ocata what you're asking for will be possible, but perhaps not fully tested/supported yet09:16
shardylet's see how long the composable upgrades stuff takes to get finished, then we can assess how much work remains to fully split things09:17
hewbroccashardy: I never would have predicted the shape this is taking, but it's not bad :)09:17
matbushardy: ack09:18
shardyhewbrocca: heh, thanks (I think?! ;)09:19
shardyI think it actually fits together fairly well, even though it does end up with heat being essentially a translation layer for the software config09:19
shardyprovided we still want to use heat for the node/network orchestration that's probably not so bad tho09:20
hewbroccaThat's the thing I wouldn't have predicted09:20
hewbroccaand, yes09:20
hewbroccashardy: makes me wonder if we shouldn't resurrect that ironic resource for Heat09:20
matbushardy: agree09:21
openstackgerritArx Cruz proposed openstack-infra/tripleo-ci: Reducing a few minutes from the job timeout to save the logs  https://review.openstack.org/39330909:21
hewbroccaThere's even less reason to have Nova in the picture if we're really just driving Ironic09:21
shardyhewbrocca: actually I think those are already getting resurrected by ricolin, but I'm not sure they are vital to our use-case09:21
shardyhewbrocca: I was planning instead a mistral workflow that coordinates the dance between ironic and neutron ref https://review.openstack.org/#/c/313048/09:22
hewbroccaAgree, I wouldn't say vital09:22
hewbroccawell, you are two steps ahead of me, as usual09:22
shardyneed to spend some time getting that working, then we could remove nova potentially09:22
hewbroccacarry on :)09:22
shardyhehe :)09:22
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: WIP TLS everywhere job  https://review.openstack.org/39173809:23
*** lmiccini has quit IRC09:24
*** hjensas has quit IRC09:25
*** ebarrera has joined #tripleo09:26
openstackgerritJulie Pichon proposed openstack-infra/tripleo-ci: Add UI to undercloud sanity checks  https://review.openstack.org/39084509:28
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: WIP TLS everywhere job  https://review.openstack.org/39173809:29
*** dtantsur|afk is now known as dtantsur09:32
openstackgerritMerged openstack/tripleo-common: Install configuration files for all downloaded packages  https://review.openstack.org/34741209:36
openstackgerritKaterina Pilatova proposed openstack/tripleo-validations: undercloud-disk-space.yaml: improved output  https://review.openstack.org/39335409:36
jaosoriorHas anybody attempted to reproduce the CI issues locally?09:37
openstackgerritKaterina Pilatova proposed openstack/tripleo-validations: undercloud-disk-space.yaml: improved output  https://review.openstack.org/39335409:39
*** charliejllewelly has joined #tripleo09:40
*** Steve__ has joined #tripleo09:40
*** Steve__ is now known as TheRealCharlieJl09:41
*** TheRealCharlieJl is now known as SteveRelf09:41
jaosoriorccamacho: have you tested this? https://review.openstack.org/#/c/393644/109:42
charliejllewellyHi All, is anyone able to help me understand how the auth_url for os-collect-config is set? Currently it is returning the internal API address space which prevents it accessing the HEAT API, I would expect it to be using the Public URL.09:43
ccamachojaosorior checking that with https://bugzilla.redhat.com/show_bug.cgi?id=139096209:43
openstackbugzilla.redhat.com bug 1390962 in rhel-osp-director "HAProxy doesn't load the new configuration after scaling out the role running the Openstack API services" [Urgent,Assigned] - Assigned to ccamacho09:43
ccamachojaosorior, good morning man :)09:43
*** milan has joined #tripleo09:44
jaosoriorccamacho: I remember at the summit, bandini, mcornea, jistr and I arrived at the grim conclusion that no pacemaker restarts were happening due to a slip-up in the composable roles work, not sure if that made it into a bug report, but yeah09:44
jaosoriorccamacho: so it might be that the stuff you put there doesn't even get set up09:44
jaosorior:/09:44
jaosorior*doesn't even get ran09:44
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements  https://review.openstack.org/38995709:45
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-validations: Updated from global requirements  https://review.openstack.org/39117009:45
ccamachojaosorior, aaaaaa crap... Do you have a bug for that?09:45
ccamachoim adding some echo's locally to test09:45
jaosoriorccamacho: that's what I said, I'm not sure if that made it into a bug report, we just figured out in the summit cause of another bug09:45
ccamachommmmm jaosorior thanks man09:45
jaosoriorccamacho: yeah :/09:46
mcorneajaosorior: ccamacho I don't think there's a bug except the SSL cert related one09:46
ccamacho:) thanks for the hint dude09:46
jaosoriorccamacho: so, hopefully this saves you some time debugging the same stuff we did :/09:46
jaosoriorccamacho: good morning, by the way :D09:46
jpichd0ugal_: Good morning! Is it fair to assume you're going to keep the approach of updating the passwords in the mistral env for https://review.openstack.org/#/c/392593/ ? As in, should I make my patch dependent on yours I can solve All My Problems and assume get_password() will expect passwords to be in the new format (name), regardless of whether this is Mitaka/Newton/an upgrade?09:47
jtomasekshardy: Hi, so I've been looking into whether it is possible to add additional sections to environment and it seems that it is not possible (I am getting environment has wrong section "metadata") did I misunderstand something during the summit session?09:47
*** percevalbot has joined #tripleo09:47
jaosoriormcornea: was the patch that I sent you the other day about the VIP hosts useful?09:48
thervejtomasek, It's not possible, sections are validated09:48
shardyjtomasek: It's not possible, we'd need to propose changes to both heat and heatclient09:48
jaosoriorshardy: you might know the answer to charliejllewelly's question :O09:48
shardyjtomasek: I suggested an alternative approach where we abuse a specially named parameter_default09:48
therveUhhhhh09:49
shardybut we can revive the discussion wrt just adding some things to the environment too09:49
* therve didn't see that09:49
shardylike at least a description field09:49
shardytherve: I was only mentioning it in the context of a temporary workaround09:49
jtomasekshardy: yeah, I outlined the intention here https://review.openstack.org/#/c/393365/1/specs/ocata/gui-deployment-configuration.rst09:50
therveshardy, I know :)09:50
d0ugal_jpich: Yup, I will still be doing that, one way or another09:50
mcorneajaosorior: yep, I was looking for you yesterday to confirm but it was too late. I got an environment deployed with CloudName and it worked well. I believe we can document the parameters as the different domain for public/internal makes sense.09:50
d0ugal_jpich: I think I just need to be a bit smarter about when I do that, as bnemec pointed out it will get a bit messy after the upgrade unless the user deletes the password file.09:50
jtomasekshardy: as I looked into it more, 'title' and 'description' would be sufficient09:50
mcorneajaosorior: probably we need to take them into consideration when we're going to test SSL for internal services?09:50
jtomasekshardy: could you please point me to the 'description' discussion?09:50
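The section validation that therve and shardy describe, and the "specially named parameter_default" workaround, can be sketched roughly as follows. This is a simplified Python illustration, not Heat's actual code: the `ALLOWED_SECTIONS` set is an assumed subset, and the parameter name `EnvironmentMetadata` is hypothetical. Only the error text mirrors the message jtomasek quoted.

```python
# Assumed subset of the sections Heat accepts; the authoritative list lives in Heat itself.
ALLOWED_SECTIONS = {"parameters", "parameter_defaults", "resource_registry"}


def validate_environment(env):
    """Reject any top-level section that is not in the allowed set."""
    for section in env:
        if section not in ALLOWED_SECTIONS:
            raise ValueError('environment has wrong section "%s"' % section)
    return env


# A top-level "metadata" section is rejected by the validation.
rejected = {
    "metadata": {"title": "Example", "description": "Example environment"},
    "parameter_defaults": {},
}

# Workaround sketch: carry title/description inside parameter_defaults
# under a specially named (here: hypothetical) parameter.
workaround = {
    "parameter_defaults": {
        "EnvironmentMetadata": {
            "title": "Example",
            "description": "Example environment",
        },
    },
}
```

The workaround passes validation only because the extra data hides inside a section that is already allowed, which is why shardy calls it a temporary measure.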
*** d0ugal_ is now known as d0ugal09:51
jaosoriormcornea: yep, actually, I only test internal SSL with CLOUDNAME, FreeIPA doesn't give you certs for IP addresses anyway09:51
jpichd0ugal_: Cool! We get to play the "patch dependency chain" game again :)09:51
mcorneajaosorior: interesting, I'm going to look into that after I finish with the composable roles testing09:52
*** d0ugal has joined #tripleo09:52
shardyjtomasek: this thread http://lists.openstack.org/pipermail/openstack-dev/2016-June/097178.html09:52
jtomasekshardy: thank you09:52
jaosoriormcornea: if you have time to try it out, I have this blog post about it :D http://jaormx.github.io/2016/testing-out-the-tls-everywhere-patches-for-tripleo/09:52
mcorneajaosorior: yep, it's on my to do list :)09:53
*** rhallisey has joined #tripleo09:53
*** panda|zZ is now known as panda09:55
openstackgerritMerged openstack/tripleo-validations: undercloud-disk-space.yaml: improved output  https://review.openstack.org/39335409:59
shardycharliejllewelly: Hi, you can configure the heat.conf on the undercloud to influence the os-collect-config settings10:00
shardybasically we get the endpoint from keystone on the undercloud, and inject it via cloud-init userdata10:00
openstackgerritMerged openstack/tripleo-validations: Updated from global requirements  https://review.openstack.org/39117010:00
*** hjensas has joined #tripleo10:00
*** stevemul has joined #tripleo10:00
shardyyou can set the [clients_heat] endpoint_type to e.g publicURL10:00
shardycharliejllewelly: note that on recent TripleO versions os-collect-config is actually polling swift, not heat10:01
shardyso that will change which config setting is modified10:01
openstackgerritMerged openstack/tripleo-validations: Change HAProxy timeouts to match the defaults  https://review.openstack.org/39135910:01
openstackgerritChris Jones proposed openstack/puppet-tripleo: Improve failed mysql node removal time in HA deploys.  https://review.openstack.org/39367310:03
charliejllewellyshardy: thanks for the pointer but this is for our users of the overcloud. Unfortunately we are not running swift in our cloud hence having to use cfn or heat. I couldn't see a way to set the auth_url in heat.conf...must be being silly10:03
openstackgerritJulie Pichon proposed openstack/python-tripleoclient: Pass clients to get the get_password function  https://review.openstack.org/39319210:04
shardycharliejllewelly: the same applies I think, sec let me see how to configure it10:04
*** thrash|g0ne is now known as thrash10:06
shardycharliejllewelly: do you have HeatApiNetwork assigned to external in the ServiceNetMap ?10:07
charliejllewellyshardy: not sure just checking10:08
*** ooolpbot has joined #tripleo10:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION10:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796110:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)10:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835010:10
*** ooolpbot has quit IRC10:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)10:10
charliejllewellyshardy: Could you explain how I would check that please?10:11
*** limao has quit IRC10:11
charliejllewellya catalog list only shows a heat endpoint in the service catalog with public, internal and admin endpoints10:12
shardycharliejllewelly: by default all services listen on the internal_api network, but there's a parameter available which lets you assign services to whatever network you want10:12
shardycharliejllewelly: Ok so all the endpoints in catalog list for heat are on your internal_api network?10:12
shardyopenstack endpoint show heat should also tell you10:13
charliejllewellyshardy: correct10:13
shardycharliejllewelly: what version of TripleO are you running please?10:14
charliejllewellyalthough they also list public and admin10:14
charliejllewellyRedhat OSP810:14
openstackgerritGiulio Fidente proposed openstack/tripleo-heat-templates: Defaults kernel.pid_max to 1048576  https://review.openstack.org/39368210:14
*** athomas has quit IRC10:15
jaosoriorgfidente: hey dude10:16
shardycharliejllewelly: Ok, so if you want to reassign heat to your external network, you need to pass a parameter which overrides the default ServiceNetMap, see here:10:16
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/stable/liberty/overcloud.yaml#L67210:16
jaosoriorgfidente: so, I've been digging into the CI issue involving cinder... it seems that cinder is actually up and running, but for some reason it takes a VERY long time to respond.10:16
jaosoriorgfidente: doing a simple volume list, it seems that it takes around 7 to 8 seconds to do anything in my deployment. And I think it's the same issue in CI.10:17
jaosoriorEmilienM: ^^10:17
shardycharliejllewelly: it would look something like this:10:18
shardyhttp://paste.openstack.org/show/587877/10:18
shardynote that on liberty and mitaka you need to copy the whole map (for all services), but since Newton we allow just passing those services you want to override10:18
shardyyou can assign any other services you want on the external net there too, then check it all looks good via the catalog10:19
*** akrivoka has joined #tripleo10:19
shardybasically heat puts the value from the catalog in the userdata for os-collect-config by default10:19
shardybut if for any reason that doesn't work for you, there is a config option to override it10:19
shardyhttps://github.com/openstack/heat/blob/master/heat/engine/clients/os/heat_plugin.py#L7810:20
shardyprobably not needed though if you just set ServiceNetMap how you want10:20
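The difference shardy mentions between passing the whole ServiceNetMap (liberty and mitaka) and passing only the overrides (Newton onwards) can be illustrated with a small Python sketch. It models the behaviour as a plain dictionary merge and is not the template logic itself. `HeatApiNetwork`, `internal_api` and `external` come from the discussion above; the other keys and the default values shown are illustrative assumptions.

```python
# Illustrative defaults only; the real map in tripleo-heat-templates has many more entries.
DEFAULT_SERVICE_NET_MAP = {
    "HeatApiNetwork": "internal_api",
    "CinderApiNetwork": "internal_api",
    "GlanceApiNetwork": "internal_api",
}


def replace_whole_map(defaults, user_map):
    """Liberty/Mitaka-style: the user-supplied map replaces the default entirely."""
    return dict(user_map)


def merge_overrides(defaults, user_map):
    """Newton-style: only the listed services are overridden, the rest keep defaults."""
    merged = dict(defaults)
    merged.update(user_map)
    return merged


overrides = {"HeatApiNetwork": "external"}

old_style = replace_whole_map(DEFAULT_SERVICE_NET_MAP, overrides)
new_style = merge_overrides(DEFAULT_SERVICE_NET_MAP, overrides)
```

With the replace behaviour a partial map loses every service that was not listed, which is why the whole map has to be copied on the older releases.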
*** athomas has joined #tripleo10:21
*** tremble has joined #tripleo10:22
dtantsurhey folks! I've heard CI is having numerous problems. is there something I could help with from Ironic side?10:22
charliejllewellyshardy: brilliant thanks for the info, I'll have a read10:22
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: CINDER TEST  https://review.openstack.org/39368710:22
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Add special case handling for OVS upgrade in updates and upgrades  https://review.openstack.org/39368810:23
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Add replacepkgs to the manual ovs upgrade workaround and fix a typo  https://review.openstack.org/39368910:23
shadowershardy: how should we proceed with this? https://review.openstack.org/#/c/390854/10:24
shadowershardy: you wrote the first revision so I'm not sure whether you can review it or not10:24
*** lmiccini has joined #tripleo10:24
shadowerbut it's a prerequisite for an important validation fix so I'd like to land it asap10:25
jaosoriorshadower: well, we do need to fix CI before landing more stuff10:25
shadowerjaosorior: yeah that's true10:26
*** ealcaniz has joined #tripleo10:26
pandaimage builder is not honouring OVERCLOUD_IMAGES_ARGS and I still don't understand why ... selinux is still on enforcing ...10:26
jaosoriorpanda: well, we do modify the images after being built10:27
jaosoriorpanda: why not do that?10:28
gfidentejaosorior I think I need to update my image templates because I still get cinder-api as service10:28
pandajaosorior: where, with what ? virt-customize is not installed in undercloud ?10:29
jaosoriorgfidente: probably that and the puppet-tripleo from the images10:29
shardyshadower: I'm fine for it to land when we fix CI10:30
shadowershardy: thanks10:30
shardyI +2'd it but someone else should probably approve since I'm a co-author10:30
*** yamahata has quit IRC10:30
jaosoriorpanda: look at the update_image function in common_functions.sh10:30
shardyshadower: Also I'd welcome your feedback on https://review.openstack.org/#/c/393448/10:31
shadowershardy: I'll have a look10:31
shardyshadower: I'm experimenting with building an ansible playbook for upgrades from the composable service interfaces10:31
shardyit occurs to me that the same approach might work well for ansible (or $whatever) based validations of the deployed overcloud10:31
*** skramaja has quit IRC10:32
shardyparticularly if we get to per-service validations as has recently been proposed by flepied in https://review.openstack.org/#/c/372336/10:32
bogdandofolks, I tried the tripleo quickstart https://github.com/openstack/tripleo-quickstart/ and it fails with "modprobe: ERROR: could not insert 'kvm_intel': Operation not supported" Is it because my VIRTHOST is a VM?10:34
pandabogdando: yes10:35
bogdandois the VM case without HW accel supported? :(10:35
bogdandohow could I try it on the GCE instance?10:35
bogdandoany w/a for that, to run non HW accelerated VMs by installer?10:36
bogdandoor what is another, simpler alternative to fit my case?10:36
openstackgerritGabriele Cerami proposed openstack-infra/tripleo-ci: Workaround: Set selinux to permissive to workaround bug LP#1637961  https://review.openstack.org/39270310:37
jaosoriorpanda: did you find the function I mentioned?10:40
*** kbyrne has quit IRC10:40
*** abehl has quit IRC10:40
pandajaosorior: yes, but it will take a while, in the meantime, I'll try with my original ugly hack.10:40
*** NobodyCam has quit IRC10:42
*** Ng has quit IRC10:42
*** hrybacki has quit IRC10:42
*** kbyrne has joined #tripleo10:42
*** ramishra has quit IRC10:43
*** akrivoka has quit IRC10:44
*** afazekas has quit IRC10:44
*** fungi has quit IRC10:44
*** tdasilva has quit IRC10:44
pandabogdando: I think using non-accelerated VMs on top of another VM would take so much to deploy that I'm not sure it's even worth it. If you still want to do it, you may try to modify the ansible tasks to not install the kvm_intel module and use qemu to launch the instances.10:44
*** mgould|afk is now known as mgould10:44
*** ramishra has joined #tripleo10:45
*** ramishra has quit IRC10:45
*** Ng has joined #tripleo10:45
*** ChanServ sets mode: +v Ng10:45
*** hrybacki has joined #tripleo10:46
pandajaosorior: I'm confused. we use update_image when we're not building the image, but the gates are actually building it, and it's weird because the image should be built only during periodic jobs ...10:46
*** ramishra has joined #tripleo10:46
bogdandopanda, so recommended way to make dev envs is use only a BM host?10:46
shardybogdando: most folks are either using a BM host or an environment where nested virt is enabled10:47
jaosoriorpanda: we don't build the images in every gate. tripleo-ci builds them, but AFAIK when it's run on tripleo-heat-templates we don't.10:47
pandajaosorior: anyway, since I just want to change selinux and not update the whole image, I'll have to use part of this function up to some point, and use it only to change the SELINUX variable10:47
bogdandoshardy, got it thanks10:47
*** openstackgerrit has quit IRC10:47
*** openstackgerrit has joined #tripleo10:48
*** afazekas has joined #tripleo10:49
pandajaosorior: or maybe I'll just install libguestfs-tools and use virt-customize10:50
jaosoriorpanda: sounds reasonable too10:50
jaosoriorpanda: this is a workaround anyway, doesn't need to be perfect.10:51
*** afazekas has quit IRC10:54
openstackgerritMerged openstack/tripleo-validations: Fix the ctlplane-ip-range validation  https://review.openstack.org/39286810:54
openstackgerritMerged openstack/tripleo-validations: Fix the mysql-open-files-limit validation  https://review.openstack.org/39286910:54
*** afazekas has joined #tripleo10:55
*** NobodyCam has joined #tripleo10:56
*** akrivoka has joined #tripleo10:57
*** fungi has joined #tripleo10:58
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: CINDER TEST  https://review.openstack.org/39368711:00
*** jlinkes_ has joined #tripleo11:05
*** jkilpatr has joined #tripleo11:07
openstackgerritFlorian Fuchs proposed openstack/tripleo-ui: Validate JSON parameters  https://review.openstack.org/39371311:09
pandalibguestfs method works locally ...11:09
*** jlinkes has quit IRC11:09
*** ooolpbot has joined #tripleo11:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION11:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796111:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835011:10
*** ooolpbot has quit IRC11:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)11:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)11:10
charliejllewellyshardy: thanks for the advice earlier, I have found that by changing the auth_uri in heat.conf it allows the correct auth endpoint to be set via os-collect-config. Would this be considered an acceptable solution or is the correct method still to change the endpoint map?11:10
charliejllewellyMy concern is changing the isolated network that some of the internal service interactions take place over may be insecure?11:11
shardycharliejllewelly: so, you want to have keystone listen on the external network, and heat only listen on the internal_api network?11:12
shardymaybe I misunderstood, I thought you wanted the connection to heat to happen via a different network11:12
*** katkapilatova has left #tripleo11:12
charliejllewellyI was attempting to get VMs in the overcloud to interact with heat via the public URLs but if other services internal to the overcloud consume heat they should use the internal API network11:14
*** jlinkes_ has quit IRC11:14
*** jlinkes_ has joined #tripleo11:15
shardycharliejllewelly: ah, so yeah looking at the liberty templates we actually force the URL to the vip for the HeatApiNetwork11:18
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/stable/liberty/overcloud.yaml#L101111:18
shardywhich overrides the heat logic that detects the endpoint from the catalog11:18
shardywe fixed that in Newton11:18
charliejllewellyAh...that makes sense :)11:18
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/stable/liberty/puppet/controller.yaml#L95311:19
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/stable/liberty/puppet/controller.yaml#L139011:19
openstackgerritGabriele Cerami proposed openstack-infra/tripleo-ci: Workaround: Set selinux to permissive to workaround bug LP#1637961  https://review.openstack.org/39270311:20
openstackgerritMerged openstack/tripleo-heat-templates: Enable proxy headers parsing for Neutron  https://review.openstack.org/39313011:21
*** pblaho has joined #tripleo11:22
shardycharliejllewelly: yeah so if you just want to force a different value in the heat.conf you can do that like this:11:22
shardyhttp://paste.openstack.org/show/587886/11:22
shardythen you can set whatever url you want to override what we set in the template as linked above11:22
*** dsariel has quit IRC11:24
shardycharliejllewelly: you can use the same trick if you wish to change the auth_uri when using the native heat transport11:24
openstackgerritJulie Pichon proposed openstack/instack-undercloud: Open firewall port for the TripleO UI  https://review.openstack.org/39371911:25
charliejllewellyshardy: perfect thanks for the help, I'll try that now11:25
*** jprovazn has quit IRC11:26
openstackgerritBogdan Dobrelya proposed openstack/tripleo-quickstart: Add support of deploy on a VM w/o nested virt  https://review.openstack.org/39372011:28
bogdandopanda, shardy , mwhahaha ^^ (still testing tho)11:28
openstackgerritMerged openstack/tripleo-common: Configure run-validation to use the custom output  https://review.openstack.org/39346711:29
*** mburned_out is now known as mburned11:31
*** dprince has joined #tripleo11:36
cschwedeHello there! Question: is it possible to disable a single service on one role only? I know I can modify a role (eg. Controller), but if I want to disable only a single service there might be an easier way?11:37
cschwedeMaybe using resource_registry & OS::Heat::None but restrict that to one role only (not global)?11:38
shardycschwede: you can map the service to OS::Heat::None, or modify roles_data.yaml to remove the service you don't want11:38
shardyboth basically do the same, other than the None approach giving you a spurious service when looking at the deployed heat stack11:39
shardyhttps://github.com/openstack/tripleo-heat-templates/blob/master/overcloud-resource-registry-puppet.j2.yaml#L10511:39
*** tdasilva has joined #tripleo11:40
*** d0ugal has quit IRC11:40
shardycschwede: it's actually how we currently disable services which we don't want deployed by default11:40
cschwedeshardy: ok, but if I map a service (say OS::TripleO::Services::SwiftStorage) to OS::Heat::None, it will affect all roles, and I only want this for the controller role11:40
shardybut pretty soon we'll be moving away from that as heat allows for merging environments now11:40
cschwedeso i need to copy the default controller role definition, and just remove the unwanted service11:40
EmilienMhello11:40
shardycschwede: Ok, then you have to either remove it from the roles_data.yaml or pass a modified ControllerServices list as a parameter11:40
shardycschwede: yes, or just copy the list of services and pass a different ControllerServices list11:41
shardyhttps://github.com/openstack-infra/tripleo-ci/blob/master/test-environments/scenario001-multinode.yaml#L511:41
shardycschwede: ^^ like that11:41
cschwedeshardy: yes, both is working for me, thx. i was just wondering if there is an easier way when i only want to disable one service on one role11:41
cschwedeshardy: like not adding 50-1 services to my custom template11:42
shardycschwede: currently not really - you could pass hierdata via ControllerExtraConfig which tells puppet not to start the service, but that might cause other issues because it would still be wired in to any other services which expect to interface with it11:42
cschwedeshardy: yes, tried that, failed for exactly the reason you described11:43
EmilienMdo we have progress on our CI issues?11:43
shardycschwede: Yeah, maybe we can enhance the heat parameter merging stuff to allow an xor merge or something11:43
cschwedeshardy: I'll add that to my list of topics/questions for next month's OOO workshop :)11:43
shardycschwede: it'd also be pretty easy to have e.g ControllerExcludedServices as a parameter which we then use to filter the ControllerServices list11:43
shardybut that seems like a kind of klunky interface11:44
pandaEmilienM: mariadb issue fixed, my workaround needed more passes instead11:44
EmilienMpanda: right, mariadb we updated packages last night again11:44
shardyone of the aims with the composable services/roles was to keep the interface simple, and the explicit list does at least keep things simple11:44
EmilienMbut what about the cinder problem?11:44
EmilienMjaosorior: ^11:44
pandaEmilienM: there's a review from jaosorior11:44
pandaEmilienM: https://review.openstack.org/39368711:45
cschwedeshardy: agree; i was just a bit worried that eg the controller list gets updated between releases, and an operator might forget to update the customized services then11:45
*** d0ugal has joined #tripleo11:45
EmilienMpanda: interesting11:45
shardycschwede: yeah, either way it's a compromise, but so far I've preferred the explicit approach of having the operator just define the list of services they want11:46
*** ealcaniz has quit IRC11:46
shardyseems like it should yield the least surprising results in most cases despite the minor cut/paste inconvenience11:46
*** lmiccini has quit IRC11:46
cschwedeshardy: yep, makes sense to me11:46
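The `ControllerExcludedServices` idea shardy floats above amounts to filtering the role's service list against an exclusion list. A minimal Python sketch of that filtering follows. It only illustrates the idea, not anything implemented in tripleo-heat-templates: `OS::TripleO::Services::SwiftStorage` is taken from the discussion, and the other service names are placeholders.

```python
def filter_services(services, excluded):
    """Return the service list without the excluded entries, preserving order."""
    excluded = set(excluded)
    return [service for service in services if service not in excluded]


# Placeholder stand-in for a role's full ControllerServices list.
controller_services = [
    "OS::TripleO::Services::Keystone",
    "OS::TripleO::Services::SwiftProxy",
    "OS::TripleO::Services::SwiftStorage",
]

# Hypothetical ControllerExcludedServices parameter value.
controller_excluded_services = ["OS::TripleO::Services::SwiftStorage"]

effective = filter_services(controller_services, controller_excluded_services)
```

The trade-off discussed in the channel still applies: an exclusion list is shorter to write, while an explicit list makes the deployed set of services visible in one place.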
openstackgerritSteven Hardy proposed openstack/tripleo-heat-templates: WIP prototyping composable upgrades with Heat+Ansible  https://review.openstack.org/39344811:48
shardymarios: ^^ it now works! :)11:48
shardywell, appending to /root/upgrade.log does, it doesn't really implement the upgrade workflow yet11:48
shardyquestion, do we still need major-upgrade-pacemaker-init.yaml if we can configure the new repos e.g as step1 of the upgrade, or run the UpgradeInitCommand before running the steps11:50
*** zoli|trng-afk is now known as zoli11:52
*** zoli is now known as zoli|lunch11:52
*** bfournie has quit IRC11:55
*** morazi has quit IRC11:57
cschwedewould be great if someone has a few minutes for a small t-h-t bugfix review: https://review.openstack.org/#/c/391222/11:59
EmilienMcschwede: looking12:00
cschwedeEmilienM: thx a lot!12:00
weshaysshnaidm, panda how's the ci situation?12:02
EmilienMshardy: I don't think we need this init thing, if we have step1,2,3,4,5,etc in place12:02
EmilienMweshay: I see HA job very unstable12:03
*** jayg|g0n3 is now known as jayg12:03
EmilienMweshay: tripleo.org/cistatus.html12:03
weshayya.. I'm looking12:03
weshayha ping test12:03
pandaweshay: mariadb solved, redis workaround still under work12:03
EmilienMslagle: we have 3nodes job in place12:06
*** fultonj_ has joined #tripleo12:06
shardyEmilienM: ack, I'll see how the init stuff could be wired in via the steps instead12:08
slagleEmilienM: yes i saw12:08
slagleEmilienM: it will be a while before i can work on it, as i'm working on other things12:09
EmilienMslagle: right, just fyi12:09
*** ooolpbot has joined #tripleo12:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION12:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796112:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835012:10
*** ooolpbot has quit IRC12:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)12:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Emilien Macchi (emilienm)12:10
*** chlong has joined #tripleo12:13
pandaI can't believe it ... it's easier to fix the issue than to put a workaround ...12:14
openstackgerritGabriele Cerami proposed openstack-infra/tripleo-ci: Workaround: Set selinux to permissive to workaround bug LP#1637961  https://review.openstack.org/39270312:18
mariosshardy: cool i will have a closer look still todo :) thanks a lot12:21
*** cdearborn has joined #tripleo12:21
EmilienMpanda: can I assign 1638350 to you please? I'm leaving on PTO tonight, I'm not going to take care of it during the next days12:22
pandaEmilienM: yes12:23
EmilienMpanda: thx12:23
slagleEmilienM: you're going to get rid of the 2 node job?12:23
*** jerrygb has joined #tripleo12:23
EmilienMslagle: not yet i think, we might want to wait to have a stable 3nodes jobs in place before, no?12:23
slaglei wasnt thinking of removing it at all honestly12:24
EmilienMI was actually thinking at keeping it for some projects or kind of patches12:24
EmilienMand keep 3nodes only for some projects (THT & puppet-tripleo)12:24
shadowerjaosorior: can I have one more 1 liner?  https://review.openstack.org/#/c/391259/12:25
EmilienMI see some value in keeping the 2nodes job, where a project can't break composable roles & network isolation12:25
*** sudipto_ has quit IRC12:25
*** sudipto has quit IRC12:25
slagleEmilienM: ok. your comment sounded like you planned to get rid of it12:25
EmilienMslagle: I was wrong, let me phrase it again in the review12:25
*** maticue has joined #tripleo12:29
*** bfournie has joined #tripleo12:29
*** jerrygb has quit IRC12:31
*** maeca1 has joined #tripleo12:33
weshaypanda, fyi.. the ping test issues have been escalated12:36
*** rlandy has joined #tripleo12:36
*** dsariel has joined #tripleo12:41
EmilienMweshay: what does it mean?12:41
*** jerrygb has joined #tripleo12:42
*** kjw3 has joined #tripleo12:44
weshayEmilienM, bugs that stop CI related to the prod chain, like upstream CI are now treated like gss/customer escalations and use the same process12:44
EmilienMnice12:44
*** lmiccini has joined #tripleo12:45
beaglesis there a way to sort gerrit search results?12:45
weshayit helps to get visibility, getting people to help out, and offers some amount of root cause analysis to prevent the same issue in the future12:45
*** jlinkes_ has quit IRC12:45
beagleslike by subject, etc12:45
shardybeagles: have you tried gerrit dash creator? https://github.com/openstack/gerrit-dash-creator12:46
*** jlinkes_ has joined #tripleo12:46
*** chlong has quit IRC12:46
*** rodrigods has quit IRC12:46
beaglesshardy: nope.. not yet :)12:46
*** rodrigods has joined #tripleo12:46
beaglesshardy: but it looks like I'm about to12:46
pandabeagles: shardy gertty allows sorting by change number and last update12:47
shardyI created some tripleo dashboards with that then bookmarked them, you can hack the tripleo.dash and/or tripleo-stable.dash to suit your needs :)12:47
*** jlinkes_ has quit IRC12:47
*** tzumainn has joined #tripleo12:48
beaglesshardy: cool.. I've reached my tipping point where procrastinating on building some custom dashboards is no longer an option :)12:49
*** dougbtv has joined #tripleo12:49
*** lucasagomes is now known as lucas-hungry12:52
*** jlinkes has joined #tripleo12:52
*** percevalbot has quit IRC12:53
*** maeca1 has quit IRC12:55
*** morazi has joined #tripleo12:55
*** percevalbot has joined #tripleo12:56
EmilienMmarios: approving https://review.openstack.org/#/c/392680/ please backport it asap12:57
EmilienMslagle: I think we can also approve https://review.openstack.org/#/c/392313/ -- wdyt?12:58
*** jcoufal has joined #tripleo12:58
slagleit's not passed ha12:59
EmilienMsocial: we can approve https://review.openstack.org/#/c/392123/ as it won't break CI - please make sure you backport it into stable/newton asap12:59
openstackgerritMarios Andreou proposed openstack/tripleo-heat-templates: Fixup the start of swift services  https://review.openstack.org/39376012:59
mariosEmilienM: thanks done setting -1 till master merges12:59
slagleEmilienM: i guess it got past the point where that code is executed12:59
EmilienMslagle: right, that's why I think we can go ahead13:00
EmilienMbut I'm fine waiting13:00
*** jerrygb has quit IRC13:01
*** morazi has quit IRC13:02
*** masco has quit IRC13:02
socialEmilienM: https://review.openstack.org/#/c/392260/13:02
socialEmilienM: related commit13:03
EmilienMsocial: I know, waiting for CI though13:03
*** lblanchard has joined #tripleo13:03
socialEmilienM: ack13:03
*** maeca1 has joined #tripleo13:06
*** tobias_fiberdata has joined #tripleo13:06
*** tobias-fiberdata has quit IRC13:08
openstackgerritArx Cruz proposed openstack-infra/tripleo-ci: Reducing a few minutes from the job timeout to save the logs  https://review.openstack.org/39330913:09
EmilienMshardy: https://review.openstack.org/#/c/391064/ is also a good candidate for backport13:09
dtantsurfolks, I've started a blueprint to get back RAID: https://blueprints.launchpad.net/tripleo/+spec/raid-workflow13:09
dtantsurmgould, I've assigned you just because you've been working on it ^^^13:10
dtantsurmaybe we'll shuffle assignees (e.g. lucas-hungry might want to help you)13:10
mgoulddtantsur: OK, thanks13:10
shardyEmilienM: yeah I was waiting for it to land before proposing the backport13:10
shardybut I can cherry-pick it now13:10
mariosshardy: thanks the review looks great its excellent we have actual code to point at end of the first week after summit (session was last thursday)13:10
ansiwenwhat is the current status of CI? working again? (sorry, didn't follow here)13:10
EmilienMshardy: no, just making sure it's in the radar13:10
EmilienMansiwen: right now, AFAIK it's unstable13:11
shardyEmilienM: ack, thanks13:11
EmilienMansiwen: you can use tripleo.org/cistatus.html to follow it13:11
ansiwenEmilienM: thank you!13:11
*** ooolpbot has joined #tripleo13:11
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION13:11
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796113:11
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)13:11
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835013:11
*** ooolpbot has quit IRC13:11
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)13:11
*** d0ugal has quit IRC13:12
*** pradk has quit IRC13:12
*** pradk has joined #tripleo13:13
*** links has quit IRC13:13
*** zoli|lunch is now known as zoli13:15
*** morazi has joined #tripleo13:15
*** zoli is now known as zoliXXL13:15
*** akshai has joined #tripleo13:19
*** d0ugal has joined #tripleo13:21
*** jprovazn has joined #tripleo13:22
openstackgerritGabriele Cerami proposed openstack-infra/tripleo-ci: Workaround: Set selinux to permissive to workaround bug LP#1637961  https://review.openstack.org/39270313:24
openstackgerritMerged openstack/tripleo-heat-templates: Add option to disable "d1" Swift device  https://review.openstack.org/39122213:24
openstackgerritMerged openstack/puppet-tripleo: Deploy monitoring/logging agents sooner  https://review.openstack.org/39080213:24
*** [1]cdearborn has joined #tripleo13:25
openstackgerritChristian Schwede proposed openstack/tripleo-heat-templates: Add option to disable "d1" Swift device  https://review.openstack.org/39376913:26
openstackgerritChristian Schwede proposed openstack/puppet-tripleo: swift/proxy: configure rabbitmq properly  https://review.openstack.org/39377013:27
EmilienMcschwede: you missed -x option to cherry-pick ^13:27
jaosoriorEmilienM: did you see what I posted about the cinder issue?13:27
EmilienMcschwede: also, master patch is not merged yet, we shouldn't propose backports before13:28
cschwedeEmilienM: bummer; I did that in Gerrit13:28
EmilienMjaosorior: not much13:28
cschwedeEmilienM: eh, i have too many patches open. abandoning that, it was the wrong one. sorry for the noise13:28
EmilienMcschwede: np13:28
EmilienMcschwede: we need to backport it but only when master patch is merged13:29
cschwedeEmilienM: yep, fully agree13:29
jaosoriorEmilienM: right, so I did a local deployment, and at first it actually shows the same deal, no cinder logs, but no error in any logs either. However, when running volume list, the first runs take a VERY long time, and then it works. So right now my theory is that it's apache's lazy load being the culprit. Cinder is working, but doing the initial load takes too long and it times out.13:29
jaosoriorEmilienM: I think we could merge the eventlet patch, and right now I'm figuring out a way to speed up apache's loading of the virtual host13:30
EmilienMjaosorior: ok. Is it the number of workers?13:30
EmilienMjaosorior: look, it failed on latest run https://review.openstack.org/#/c/392647/13:31
EmilienMhttp://logs.openstack.org/47/392647/7/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/f920b85/console.html#_2016-11-04_12_23_52_58857613:31
EmilienManother issue13:31
jaosoriorEmilienM: that was an ironic issue13:31
EmilienMd0ugal: ^ can you look at this one?13:31
EmilienMah13:31
EmilienMa known issue?13:31
d0ugalEmilienM: looking13:31
d0ugalEmilienM: just in a meeting13:31
jaosoriorEmilienM: not that I know of13:32
*** amoralej is now known as amoralej|lunch13:32
jaosoriorEmilienM: seen it a couple of times now in the 5 minutes I've been checking for those. It also happened in another patch where I wanted to try loading cinder before the pingtest: https://review.openstack.org/#/c/393687/13:32
*** ccamacho is now known as ccamacho|lunch13:32
d0ugalEmilienM: I've not seen that before, is it a one off?13:33
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: CINDER TEST  https://review.openstack.org/39368713:34
d0ugalEmilienM: oh, actually, it might be related to ... https://review.openstack.org/#/c/388920/13:34
jaosoriord0ugal: it isn't13:34
jaosoriord0ugal: * it isn't a one-off13:34
*** d0ugal has quit IRC13:35
*** d0ugal has joined #tripleo13:37
jpichhttps://review.openstack.org/#/c/392148/ might also be of interest for the node locked error13:37
d0ugalEmilienM, jaosorior - sorry, I am on an unstable connection.13:38
*** jpena is now known as jpena|lunch13:39
*** jerrygb has joined #tripleo13:39
*** jerrygb has quit IRC13:39
d0ugalEmilienM, jaosorior - I think it might be related to the fact that the Ironic client mistral creates doesn't do any retrying by default13:39
*** cdearborn has quit IRC13:40
*** jroll is now known as jrollinhatin13:40
d0ugali.e. we do this: https://github.com/openstack/tripleo-common/blob/d0fb1822def70068b9f8a4aa70602e8ff6d06920/tripleo_common/actions/base.py#L52-L6413:40
d0ugaland depending on the path in the workflow, you get that client, or the one mistral creates without any retrying values13:41
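A minimal sketch of the retrying behaviour d0ugal describes above, written as a generic wrapper rather than as the Ironic client's own API. All names here (`call_with_retries`, `NodeLocked`, `make_flaky_call`) and the default values are hypothetical and only illustrate why a client without retry settings fails on the first transient "node locked" error while a retrying one does not.

```python
import time


class NodeLocked(Exception):
    """Hypothetical stand-in for a transient 'node locked' conflict error."""


def call_with_retries(func, max_retries=3, retry_interval=0.0,
                      retry_on=(Exception,)):
    """Call func(); on a listed error, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return func()
        except retry_on:
            if attempt >= max_retries:
                # Out of retries: surface the last error to the caller.
                raise
            attempt += 1
            time.sleep(retry_interval)


def make_flaky_call(failures):
    """Build a callable that fails `failures` times, then returns 'ok'."""
    state = {"left": failures}

    def call():
        if state["left"] > 0:
            state["left"] -= 1
            raise NodeLocked("node is locked, try again")
        return "ok"

    return call


if __name__ == "__main__":
    # With retries, two transient failures are absorbed.
    print(call_with_retries(make_flaky_call(2), max_retries=3,
                            retry_on=(NodeLocked,)))
```

A call made with `max_retries=0` behaves like the client without retry values: the first lock error propagates straight to the workflow.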
openstackgerritTomas Sedovic proposed openstack/tripleo-specs: Allow specifying multiple sources of validations  https://review.openstack.org/39377513:41
EmilienMcschwede: please submit the backport again, but with -x option. I think it's fine. I don't want to forget this one13:42
cschwedeEmilienM: looking13:42
EmilienMd0ugal: ok thanks13:44
*** jerrygb has joined #tripleo13:46
socialEmilienM: and from updates I think we need to push more for https://review.openstack.org/#/c/392593  and https://review.openstack.org/#/c/389830/ :(13:46
EmilienMsocial: none of them pass CI and have positive review13:48
*** jerrygb has quit IRC13:48
EmilienMI'm afraid some work needs to be done before we land them13:48
openstackgerritChristian Schwede proposed openstack/puppet-tripleo: swift/proxy: configure rabbitmq properly  https://review.openstack.org/39377013:48
*** d0ugal_ has joined #tripleo13:51
openstackgerritMerged openstack/tripleo-quickstart: Add roles-gate playbook run to OVB ci-script  https://review.openstack.org/39287313:53
gfidentejaosorior so cinder is not that slow for me13:53
gfidentenot slower than other APIs13:53
*** d0ugal has quit IRC13:53
jaosoriorgfidente: well, it's timing out in CI13:54
gfidentejaosorior the pingtest you mean?13:55
jaosoriorgfidente: my theory at the moment is that the issue is the first command given to an instance (a vhost in a specific host), since that takes some time for apache to load it (apache does lazy initialization of the vhosts)13:55
jaosoriorgfidente: correct13:55
*** ebalduf has joined #tripleo13:56
*** lucas-hungry is now known as lucasagomes13:56
gfidentejaosorior but the nonha job is passing13:57
*** tobias-fiberdata has joined #tripleo13:57
jaosoriorgfidente: so it seems to be the case that one cinder instance in the ha job has logs (meaning it can respond quickly) while the other two nodes have no logs for cinder (yet), which is something I could reproduce locally.13:57
jaosoriorgfidente: did you do an ha deployment? Can you verify that in your case cinder is working as well?13:58
gfidenteI did an HA deployment but with a single controller13:58
*** Guest13194 has quit IRC13:58
gfidenteand it's working fine it seems13:58
gfidenteit couldn't start redis instead13:59
jaosoriorgfidente: right, that is the issue that panda is trying to solve13:59
jaosoriorgfidente: can you do an HA deployment with 3 controllers and verify?14:00
gfidentejaosorior yep14:00
jaosoriorgfidente: awesome :D14:00
*** fultonj_ has quit IRC14:00
gfidentein CI I see the nova api returning bad status line14:01
pandawell, the issue is solved with the package ... it's the workaround that is actually taking ages :(14:01
*** tobias_fiberdata has quit IRC14:01
*** fultonj has quit IRC14:01
pandagfidente: link ?14:01
gfidentehttp://logs.openstack.org/82/393682/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d2b3ff2/console.html#_2016-11-04_12_16_09_45687614:02
*** fultonj has joined #tripleo14:02
jaosoriorgfidente: check the logs, that seems to be when attempting to boot from volume14:03
shardygfidente: Hey are you planning to refresh your patch adding the ansible hook to overcloud-full?14:04
shardyI think we just need to install the python-heat-agent-ansible package now14:04
gfidenteshardy instead of heat-config-ansible ?14:05
openstackgerritDougal Matthews proposed openstack/python-tripleoclient: Fix password handling for users upgrading from Mitaka  https://review.openstack.org/39259314:05
gfidenteoh no you mean the rpm, not the element14:06
shardygfidente: Yeah I don't think we need the element anymore, now that it's packaged14:06
gfidenteso that patch should go against a different repo14:06
gfidentesend it if you have it already14:07
shardygfidente: oh yeah, not got a patch yet but I can push one, thanks!14:07
openstackgerritMerged openstack/tripleo-common: Clean up configure_containers.sh script  https://review.openstack.org/38486514:07
gfidenteshardy thanks for pinging though :)14:07
openstackgerritMerged openstack/tripleo-common: Allow building heat-agents image from master  https://review.openstack.org/38486614:07
openstackgerritJuan Antonio Osorio Robles proposed openstack/tripleo-heat-templates: Add preload wsgi script option for cinder  https://review.openstack.org/39379014:07
openstackgerritMerged openstack/tripleo-heat-templates: Add special case handling for OVS upgrade in updates and upgrades  https://review.openstack.org/39368814:07
openstackgerritMerged openstack/tripleo-heat-templates: Add replacepkgs to the manual ovs upgrade workaround and fix a typo  https://review.openstack.org/39368914:08
*** eggmaster has joined #tripleo14:08
openstackgerritMerged openstack/tripleo-heat-templates: Fixup the start of swift services  https://review.openstack.org/39268014:08
openstackgerritMerged openstack/tripleo-heat-templates: Update openstack-puppet-modules dependencies  https://review.openstack.org/39212314:08
*** Goneri has joined #tripleo14:09
*** ooolpbot has joined #tripleo14:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION14:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796114:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)14:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835014:10
*** ooolpbot has quit IRC14:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)14:10
*** lmiccini has quit IRC14:11
*** lmiccini has joined #tripleo14:11
*** dtantsur is now known as creepy_owlet14:11
*** Guest13194 has joined #tripleo14:13
openstackgerritMerged openstack/tripleo-ui: Prepare 1.0.6  https://review.openstack.org/39179814:16
jtomasekflorianf: do you intend to fix the node power state bug by introducing polling?14:16
openstackgerritOpenStack Proposal Bot proposed openstack/tripleo-common: Updated from global requirements  https://review.openstack.org/38995714:17
florianfjtomasek: not sure yet. But I guess there is nothing coming through the websocket...14:17
*** zoliXXL is now known as zoli|brb14:17
jtomasekflorianf: yeah, problem is that ironic would have to send zaqar messages14:17
florianfjtomasek: exactly. do you think that is something worth investigating?14:18
jtomasekflorianf: I am quite in favor of introducing polling14:18
openstackgerritPaul Belanger proposed openstack/diskimage-builder: Switch to bindep to manage OS requirements  https://review.openstack.org/39193114:18
florianfjtomasek: I think for now that's the best option14:18
*** saneax is now known as saneax-_-|AFK14:19
gfidentejaosorior panda apparently cinder-volume created and attached the volume at 12:14:3614:19
jtomasekflorianf: we can either file it as a bug in ironic or just hide the specific node under transparent loader14:19
jaosoriorgfidente: so it did get attached? right, so then it seems that it's just a slowness issue14:20
jtomasekflorianf: I am not sure if the node power state is at all important during introspection14:20
gfidentejaosorior I am still not sure14:20
gfidentelooking at nova logs14:20
gfidenteto see what causes the badstatusline14:20
florianfjtomasek: AFAIK it doesn't update the maintenance state, so power state is probably the only column that is updated during introspection.14:21
*** apetrich has quit IRC14:21
gfidentejaosorior see the heat volume1 resource went in COMPLETE state14:22
florianfjtomasek: So are you saying it's not even worth updating it?14:22
jtomasekflorianf: yes14:22
*** apetrich has joined #tripleo14:23
jristhonza: might need a rebase on this? https://review.openstack.org/#/c/392589/14:24
openstackgerritLukas Bezdicka proposed openstack/tripleo-heat-templates: Update openstack-puppet-modules dependencies  https://review.openstack.org/39379714:25
*** rajinir has quit IRC14:26
florianfjtomasek: hmm. but then we shouldn't just hide the power state, but all columns.14:27
honzajrist: might have to redo the whole thing :)14:27
florianfjtomasek: except for the mac address, name and role14:27
jpichjtomasek: Hey, a couple of days ago folks mentioned letting users switch nodes back to 'manageable', looks like we do have a workflow for changing state already -> https://github.com/openstack/tripleo-common/blob/master/workbooks/baremetal.yaml#L814:28
jpichjtomasek: (I made the mistake a couple of times where I forgot to run introspection and ended up stuck, from the UI perspective)14:28
*** ccamacho|lunch is now known as ccamacho14:28
*** jistr is now known as jistr|call14:29
*** jaosorior is now known as jaosorior_mtg14:29
*** jaosorior_mtg has quit IRC14:31
jtomasekjpich: cool14:31
jtomaseksure we can add it then14:31
EmilienMmwhahaha: can you confirm we don't need this backport in stable/newton https://review.openstack.org/#/c/393455/ ? It's an Ocata thing, right?14:32
*** jaosorior_mtg has joined #tripleo14:32
jpichjtomasek: Cool! I'll open a bug to remind us to do it sometime14:32
mwhahahaEmilienM: i thought we did that change in newton upstream14:32
jtomasekjpich: thanks!14:32
EmilienMmwhahaha: https://review.openstack.org/#/c/368169/ was not backported14:32
mwhahahaEmilienM: technically I don't think we need it in newton14:33
EmilienMok14:33
jtomasekflorianf: so, I am not against doing polling, but we need to make sure we only start polling when a node operation starts and stop when it finishes14:33
mwhahahaEmilienM: it's in stable/newton, it was merged 6+ weeks ago14:33
jtomasekflorianf: we will not solve the problem of restarting the polling when user refreshes the page (for now)14:33
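The polling rule jtomasek states above (start when a node operation starts, stop when it finishes) can be sketched as a bounded loop. This is an illustration in Python, not tripleo-ui code; `poll_until_done`, its parameters and the example states are all hypothetical.

```python
import time


def poll_until_done(fetch_state, is_finished, interval=0.0, max_polls=100,
                    on_update=None):
    """Poll fetch_state() until is_finished(state) is true.

    Returns the final state, or None if max_polls is reached first.
    The caller starts this when a node operation begins, so nothing is
    polled while no operation is running.
    """
    for _ in range(max_polls):
        state = fetch_state()
        if on_update is not None:
            on_update(state)  # e.g. refresh the power state column
        if is_finished(state):
            return state  # operation finished: stop polling
        time.sleep(interval)
    return None


if __name__ == "__main__":
    # Simulated sequence of power states reported during an operation.
    states = iter(["power off", "power off", "power on"])
    final = poll_until_done(lambda: next(states),
                            lambda s: s == "power on",
                            on_update=print)
    print("finished with:", final)
```

As noted in the discussion, a loop like this does not resume by itself after a page refresh; that would need separate state recovery.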
mwhahahaEmilienM: there's nothing to backport in puppet-keystone14:33
mwhahahaEmilienM: it was in 9.4.014:34
EmilienMmwhahaha: mhh ok14:34
EmilienMso I can approve it14:34
jtomasekflorianf: (that is an app state recovery problem)14:34
mwhahahaEmilienM: yea14:34
*** zoli|brb is now known as zoli14:36
mariosEmilienM: thanks for the backport magic14:37
*** SteveRelf has quit IRC14:38
*** d0ugal_ has quit IRC14:39
*** d0ugal has joined #tripleo14:39
jpichjtomasek: FYI - https://bugs.launchpad.net/tripleo/+bug/163926214:40
openstackLaunchpad bug 1639262 in tripleo "Cannot set nodes to 'manageable' from the UI" [Medium,Triaged]14:40
*** fragatina has quit IRC14:42
*** fragatina has joined #tripleo14:42
jristhonza: ah14:42
jristhonza: is that the one with the circular dependencies14:42
honzajrist: yes, sir14:43
*** sudipto has joined #tripleo14:43
jristok don't mind me :)14:43
*** sudipto_ has joined #tripleo14:43
*** rajinir has joined #tripleo14:43
EmilienMmarios: yw :)14:45
*** amoralej|lunch is now known as amoralej14:46
*** dmacpher has joined #tripleo14:49
*** jpena|lunch is now known as jpena14:50
openstackgerritJames Slagle proposed openstack/tripleo-heat-templates: TEST: Disable convergence on the overcloud  https://review.openstack.org/39381314:51
*** jerrygb_ has joined #tripleo14:52
florianfjtomasek: ok, makes sense. are we tackling app state recovery for ocata?14:56
jtomasekflorianf: probably not14:56
therveslagle, Dang, I really hope convergence is not an issue :/14:58
d0ugalshardy: Hey14:59
d0ugalor any other Heat experts around :)15:00
therved0ugal, Try me :)15:00
d0ugaltherve: Am I correct in thinking I can get the parameters back out of Heat?15:00
slagletherve: can you see why the stack is stuck in create in progress here? http://logs.openstack.org/13/392313/7/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/f383107/15:00
jaosorior_mtgEmilienM: this is my current attempt at going at the cinder issue https://review.openstack.org/#/q/topic:wsgi_script_preload15:00
therveslagle, Looking15:00
therved0ugal, You mean with stack-show?15:01
d0ugaltherve: That might be it, and can I get hidden params with that too? passwords?15:01
*** jerrygb has joined #tripleo15:01
EmilienMjaosorior_mtg: ack15:01
therved0ugal, Good question, let me check15:01
slagletherve: i see "Stack CREATE COMPLETE (tenant-stack): Stack CREATE completed successfully" in the heat-engine.log from controller 0, yet when we resource-list at the end of the pingtest, the server1 resource is still CREATE_IN_PROGRESS15:02
jaosorior_mtgEmilienM: the result of this https://review.openstack.org/#/c/393687/ will let us know if it's actually the lag of apache's lazy load that's actually the issue15:02
EmilienMjaosorior_mtg: ok cool. it would be cool to also test your series of patches in tripleo ci15:03
*** jistr|call is now known as jistr15:03
therves;15:03
jaosorior_mtgEmilienM: right, so the last patch in that series is to tripleo-heat-templates15:03
therveslagle, The tenant-stack is still in progress, so that message is probably wrong15:04
jaosorior_mtgEmilienM: so that'll trigger the ovb-ha job anyway15:04
shardyd0ugal: Hey, sorry was in a meeting - yeah stack-show will give you parameters but IIRC it obfuscates hidden parameters15:04
shardyif they end up in server metadata and/or a SoftwareDeployment they are actually visible via the resource-metadata or deployment-show commands tho...15:04
*** jerrygb_ has quit IRC15:04
therveslagle, Oh, it just happened to be very very slow15:04
EmilienMjaosorior_mtg: right15:05
therveslagle, finished at 14:12:23, but your tests timed out at 300s15:05
EmilienMgood15:05
slagletherve: ah, indeed15:05
therveslagle, at 14:1015:05
therveSo it took 7mins instead of 515:05
slagleok, so we need to increase the timeout15:05
therveMaybe15:06
therveOTOH it took 30sec to do a stack-show, which is not a good sign15:06
pandaslagle: therve: this is the redis bug again.15:07
thervepanda, Is it? It completed successfully though15:08
pandatherve: http://logs.openstack.org/13/392313/7/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/f383107/logs/overcloud-controller-0/var/log/audit/audit.txt.gz15:08
therved0ugal, it looks like you may not get the hidden value, but I didn't try15:08
pandatherve: search for "denied"15:08
d0ugalshardy: hrm, I'll have to try. Trying to find a sensible way to get the passwords for Mitaka users upgrading, my current approach isn't sensible.15:09
d0ugaltherve: dang, shardy suggested looking in the resource-metadata or deployment-show commands - so I can try that if I can navigate my way in15:09
thervepanda, Wouldn't that break things though?15:09
*** ooolpbot has joined #tripleo15:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION15:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796115:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)15:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835015:10
*** ooolpbot has quit IRC15:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)15:10
slagleok, so gnocchi-metricd is consuming all CPU15:10
*** jistr is now known as jistr|biab15:10
slaglepanda: yea, sounds like the same issue i guess15:11
therveslagle, gnocchi is connecting to redis, so that might be it15:11
*** chandankumar has quit IRC15:12
pandatherve: yep, metricd eating all CPU, everything slows down.15:12
therveOK15:12
pandatherve: the weird thing is that deploy reports CREATE_COMPLETE even if redis resource is stopped in all the nodes.15:14
*** jistr|biab is now known as jistr15:14
openstackgerritMerged openstack/tripleo-heat-templates: Fixup the start of swift services  https://review.openstack.org/39376015:15
thervepanda, I think very few services depend on redis, but if it makes things slow it can impact the test stack15:15
EmilienMjaosorior_mtg: can you remove -1 on  https://review.openstack.org/#/c/392647/  please?15:15
therveShould https://review.openstack.org/#/c/392703/ land then?15:16
jaosorior_mtgEmilienM: I am not convinced that that is the real issue, which is why I did the -1 there.15:16
pandatherve: I'm waiting for it to pass the gates15:17
*** hjensas has quit IRC15:17
jaosorior_mtgEmilienM: panda's fix even makes more sense to me.15:17
slaglepanda: it looks like the liberty jobs have already failed on the patch15:17
slagleis that a known issue?15:17
shardyd0ugal: aha, you don't need to fish in the stack parameters at all15:17
shardyd0ugal: try openstack stack environment show overcloud15:18
shardyI think I even added that API but forgot about it :)15:18
slagleImportError: cannot import name secretutils15:18
pandaslagle: is there a way to see how the liberty gate failed before zuul reports back to the review?15:18
EmilienMpanda: http://status.openstack.org/zuul/15:18
slaglepanda: yea, you can look here: http://status.openstack.org/zuul/15:18
slaglepanda: type in the patch # in the filter15:19
slagle39270315:19
EmilienMlooks like it could be a backport in liberty? let me check15:19
pandaslagle: EmilienM, when the build has already failed, telnet to host is impossible15:19
*** sudipto_ has quit IRC15:19
*** sudipto has quit IRC15:19
EmilienMlooks like something in https://github.com/openstack/osprofiler15:20
EmilienMpanda: right but you have logs15:20
slaglefiled bug: https://bugs.launchpad.net/tripleo/+bug/163927415:20
openstackLaunchpad bug 1639274 in tripleo "ImportError: cannot import name secretutils" [Undecided,New]15:20
openstackgerritMerged openstack/tripleo-heat-templates: Use correct password for keystone bootstrap  https://review.openstack.org/39345515:20
pandaEmilienM: yes, my question is where I can see the log of that build when telnet is not available and logs links are not in the review15:20
slaglepanda: from here: http://status.openstack.org/zuul/15:21
slagleuse the filter, and then click on the failed job name15:21
EmilienMslagle: it looks like a packaging issue with either osprofiler or oslo.utils, I'm investigating15:21
pandaslagle: All the links I have in zuul for the builds on my review are telnet URIs ...15:23
d0ugalshardy: oh, wow. That is perfect. Thanks!15:23
pandaslagle: nevermind, was looking at the wrong page15:23
d0ugaltherve: FYI, this does it: openstack stack environment show overcloud15:23
EmilienMslagle: https://github.com/openstack/osprofiler/commit/a3accb74512b5355a020c876190e3d1a44455b6c15:24
EmilienMI think we don't pin osprofiler in RDO/liberty15:24
EmilienMlet me check15:24
*** rcernin has quit IRC15:25
*** dsariel has quit IRC15:25
*** rcernin has joined #tripleo15:27
EmilienMslagle: bingo15:28
EmilienMI'm sending a patch in rdo15:28
openstackgerritMerged openstack/puppet-tripleo: Add port to rabbitmq node ip list  https://review.openstack.org/39299615:28
pandaoh, my. This review is a bug magnet. memcache, mariadb, osprofiler. We'll fix internet before landing this15:29
openstackgerritThiago da Silva proposed openstack/tripleo-heat-templates: set url_base option in static web middleware  https://review.openstack.org/39291815:29
*** absubram has joined #tripleo15:32
EmilienMpanda: pingtest is running :)15:34
EmilienMlet's see15:34
pandaEmilienM: I'm glued to the monitor ...15:34
*** jprovazn has quit IRC15:35
pandaEmilienM: why doesn't osprofiler have stable/liberty, and we're using master even for newton ?15:36
*** yamahata has joined #tripleo15:36
pandaEmilienM: tenant-stack creation should take ~3 minutes15:37
EmilienMpanda: I don't know about osprofiler15:37
EmilienMI never contributed to it yet15:37
panda\o/ woohoo15:38
EmilienMOvercloud pingtest, heat stack CREATE_COMPLETE15:38
EmilienMwell, we set selinux in permissive15:38
pandayes15:38
*** Guest13194 is now known as tesseract-15:38
pandabut it took some attempts to get it right15:39
EmilienMpanda: please revert this change as soon as we have resource-agent package15:39
pandaEmilienM: of course15:39
pandamaybe at this point I should've patched the resource agent directly15:40
therved0ugal, Nice15:40
bogdandoso folks, the patch https://review.openstack.org/#/c/393720 seems almost working for me, almost as it fails randomly on *different* places :) please comment if you think the case is interesting at all15:40
therveshardy, So I started at looking at using zaqar for events. What's the procedure to propose something, a spec, a spec-lite bug?15:41
bogdandofailures are mostly related to intermittent libguestfs / virt-customize errors15:41
shardytherve: we've mostly been using blueprints for features but a spec lite bug is OK too15:42
EmilienMpanda: what does the patch look like? any example?15:42
shardytherve: if you feel a spec will be useful you're free to propose one, but we've not enforced them for smaller features15:42
therveshardy, OK. It should be a small one, it's relatively self contained15:42
pandaEmilienM: https://github.com/ClusterLabs/resource-agents/commit/ba604dde3a58e1aaf4218487fcf40543a0e79db215:43
EmilienMpanda: sounds tricky ;) but if you can do it... The thing is, you need to make sure to apply this patch *after* the package install15:44
pandaEmilienM: this means modifying some manifests, I think it's simpler what I did at this time .. and It's easier to remove once all landed. It's not touching any project15:45
*** artom has quit IRC15:45
*** artom_ has joined #tripleo15:46
EmilienMslagle: can you review https://review.openstack.org/#/c/393797/ please?15:46
EmilienMpanda: I see nonha job failing though15:46
slagleEmilienM: done15:47
pandajaosorior_mtg: http://logs.openstack.org/03/392703/9/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/a72e0e2/console.html#_2016-11-04_15_32_47_857988 working around redis bug did not fix cinder problem it seems15:47
*** dsariel has joined #tripleo15:48
jaosorior_mtgpanda: so the selinux issue didn't fix the cinder issue?15:48
jaosorior_mtg*selinux fix15:48
pandaEmilienM: jaosorior_mtg , ClientException: resources.volume1: Gateway Time-out (HTTP 504)15:49
jaosorior_mtgor workaround15:49
slagleEmilienM: should we go ahead and merge https://review.openstack.org/#/c/392703/ anyway?15:49
EmilienMit doesn't sound like15:49
EmilienMslagle: wait15:49
slagleok15:49
EmilienMslagle: nonha job failed for cinder timeout15:49
thervepanda, http://logs.openstack.org/03/392703/9/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/a72e0e2/logs/overcloud-controller-0/var/log/gnocchi/metricd.txt.gz redis issue still here?15:50
openstackgerritPradeep Kilambi proposed openstack/tripleo-heat-templates: swift/proxy: remove swift::proxy::ceilometer::rabbit_host  https://review.openstack.org/39186415:50
openstackgerritDougal Matthews proposed openstack/tripleo-common: Use parameters from existing Heat stack if it already exists  https://review.openstack.org/39383115:51
pandatherve:  I don't see any denied in audit.log15:52
thervepanda, Yeah, but it can't still connect somehow15:52
thervehttp://logs.openstack.org/03/392703/9/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/a72e0e2/logs/overcloud-controller-0/var/log/redis/redis.txt.gz is fishy as well15:52
d0ugalrbrady, marios, bnemec, chem: I *think* this is a better solution. https://review.openstack.org/#/c/393831/ - thoughts? (NOTE: not tested yet, doing that now)15:53
*** rbowen has quit IRC15:54
*** dsariel has quit IRC15:54
pandatherve: damn ... pacemaker is not even used in nonha15:54
matbud0ugal: +1 agree with patching tripleo-common instead15:54
matbud0ugal: looking at the review atm15:54
openstackgerritMerged openstack/puppet-tripleo: pacemaker/mysql: wait step 2 to remove default accounts  https://review.openstack.org/39331715:55
*** athomas has quit IRC15:55
d0ugalmatbu: Thanks - it is much much simpler15:55
*** numans has quit IRC15:55
matbud0ugal: yep15:55
pandatherve: with pacemaker it's the resource agent that creates the /var/run/redis dir15:55
pandanot sure what's supposed to create it without pacemaker15:56
openstackgerritDougal Matthews proposed openstack/tripleo-common: Use parameters from existing Heat stack if it already exists  https://review.openstack.org/39383115:56
*** rcernin has quit IRC15:56
openstackgerritMerged openstack/puppet-tripleo: Make sure keepalived is restarted before haproxy.  https://review.openstack.org/39336115:56
d0ugalshardy: If you have time to look at https://review.openstack.org/#/c/393831 I would really appreciate it15:57
openstackgerritJuan Antonio Osorio Robles proposed openstack-infra/tripleo-ci: CINDER TEST  https://review.openstack.org/39368715:57
*** dprince has quit IRC15:58
socialslagle: EmilienM: can you have a look on https://review.openstack.org/#/c/392260/15:58
*** pcaruana has quit IRC15:58
EmilienMsocial: recheck15:59
*** tesseract- has quit IRC15:59
pandahow did it work until now ?15:59
mariosd0ugal: you must have been sneezing a few minutes ago, we were talking about you16:00
marios(lifecycle scrum) talking about the overcloudrc issue16:00
mariosd0ugal: will look thanks16:00
*** athomas has joined #tripleo16:01
openstackgerritMerged openstack/puppet-tripleo: Set redis file descriptor limit when run via pacemaker  https://review.openstack.org/39334316:02
d0ugalmarios: haha, oh dear, I can't imagine you were saying anything good :)16:03
mariosd0ugal: course not!16:03
*** ebarrera has quit IRC16:03
openstackgerritMerged openstack/tripleo-heat-templates: gnocchi statsd should be able to send data to port 8125  https://review.openstack.org/39329616:05
ccamachojaosorior dude upstream updates are brooooken mmm a mistral workflow snag I believe...16:05
d0ugalccamacho: related to the passwords?16:06
ccamachod0ugal there is an empty env file generated from the jinja templates and it's breaking my "update"; here is part of the logs: http://paste.openstack.org/show/587908/16:08
d0ugalccamacho: oh, weird - I've not seen that one before.16:08
ccamachojust reprovisioned the server, and launched the deployment command 2 times..16:09
ccamachothe second time failed with the mistral error16:09
*** ooolpbot has joined #tripleo16:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION16:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796116:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835016:10
*** ooolpbot has quit IRC16:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)16:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)16:10
*** jaosorior_mtg is now known as jaosorior16:11
jaosoriorccamacho: fuck, so they're broken again?16:12
jaosorior:(16:12
d0ugalccamacho: Can you share the mistral executor logs?16:12
openstackgerritCharlie Llewellyn proposed openstack/tripleo-heat-templates: Add method to retry registration as we expect occasional network issues  https://review.openstack.org/38652916:13
pandathere's nothing in redis that directly supports having a socket in tmpfs ...16:13
*** tiswanso has joined #tripleo16:14
ccamachod0ugal sure let me re-reproduce the error to confirm and Ill get the executor logs.16:15
pandaThe only workaround that I can think of is adding an ExecStartPre in systemd to create the dir16:15
pandabut with hardcoded path16:15
*** akshai has quit IRC16:15
*** percevalbot has quit IRC16:17
*** percevalbot has joined #tripleo16:19
*** saneax-_-|AFK is now known as saneax16:20
*** dprince has joined #tripleo16:21
*** dsariel has joined #tripleo16:26
*** abehl has joined #tripleo16:27
pandaEmilienM: is it possible to use something like this in erb files? <%= File.dirname(@unixsocket) %>16:28
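For illustration only, the idea behind panda's workaround (derive the directory from the configured socket path instead of hardcoding it, then create it before the service starts) looks like this in Python. The helper name and the example socket filename are hypothetical, and this says nothing about whether the ERB expression above is accepted by the puppet module.

```python
import os
import tempfile


def ensure_socket_dir(socket_path, mode=0o755):
    """Create the parent directory of socket_path if missing; return it."""
    directory = os.path.dirname(socket_path)
    # exist_ok makes this safe to run on every service start.
    os.makedirs(directory, mode=mode, exist_ok=True)
    return directory


if __name__ == "__main__":
    # Use a scratch directory so the sketch does not touch /var/run.
    with tempfile.TemporaryDirectory() as root:
        sock = os.path.join(root, "run", "redis", "redis.sock")
        print(ensure_socket_dir(sock))
```

Because the directory is computed from the socket path, changing the socket location would not leave a stale hardcoded path behind.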
*** bnemec is now known as beekneemech16:28
openstackgerritJulie Pichon proposed openstack/python-tripleoclient: Pass clients to get the get_password function  https://review.openstack.org/39319216:29
*** panda is now known as panda|bbl16:29
panda|bblbe back in ~3hours16:29
*** jcoufal has quit IRC16:29
*** jcoufal has joined #tripleo16:30
*** dsariel has quit IRC16:33
openstackgerritDougal Matthews proposed openstack/tripleo-common: Use parameters from existing Heat stack if it already exists  https://review.openstack.org/39383116:34
*** rbrady is now known as rbrady-run16:35
ayoungNo op redeploy started 15:31:36Z  ended 16:09:59Z  one controller, one compute, virtualized env16:35
ayoung10 minutes were between 2016-11-04 15:59:20Z [NovaComputeDeployment]: SIGNAL_COMPLETE  Unknown16:36
ayoung  and 2016-11-04 16:09:38Z [overcloud-AllNodesDeploySteps-tg2bq2qb4ogw-ControllerDeployment_Step5-duronstn7yce.0]: SIGNAL_IN_PROGRESS  Signal: deployment d3ddc624-372d-4379-b0d1-219e4d9a4134 succeeded16:36
ayoungSomething is hanging.  How do I debug?16:36
ayoungsame thing earlier in the deploy:  10 minutes between 2016-11-04 15:46:31Z [NovaComputeDeployment]: SIGNAL_COMPLETE  Unknown  and  SIGNAL_IN_PROGRESS  Signal: deployment caa21456-320d-418b-9392-4a8f2d171a44 succeeded16:37
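To see where the time goes in an event list like the one ayoung pasted, the gaps between timestamps can be computed directly. A small sketch, assuming the `YYYY-MM-DD HH:MM:SSZ` format shown in the events above; the function names are hypothetical.

```python
from datetime import datetime

# Timestamp layout as it appears in the Heat event lines quoted above.
EVENT_TIME_FORMAT = "%Y-%m-%d %H:%M:%SZ"


def parse_event_time(stamp):
    """Parse a timestamp such as '2016-11-04 15:59:20Z'."""
    return datetime.strptime(stamp, EVENT_TIME_FORMAT)


def gap_seconds(start, end):
    """Return the whole number of seconds between two event timestamps."""
    delta = parse_event_time(end) - parse_event_time(start)
    return int(delta.total_seconds())


if __name__ == "__main__":
    # Gap between the two events quoted for the second stall.
    stall = gap_seconds("2016-11-04 15:59:20Z", "2016-11-04 16:09:38Z")
    # Whole no-op redeploy, from the reported start and end times.
    total = gap_seconds("2016-11-04 15:31:36Z", "2016-11-04 16:09:59Z")
    print("stall: %d s, total: %d s" % (stall, total))
```

Running this over consecutive event timestamps and sorting by gap points at the resources worth inspecting first.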
*** cylopez has quit IRC16:41
*** zoli is now known as zoli|gone16:42
*** chandankumar has joined #tripleo16:42
shardyayoung: log onto the node and ps ax | grep heat - see if there's a heat-config hook running and stuck16:43
*** jlinkes has quit IRC16:43
ayoungshardy, ok...lets see...16:43
*** jaosorior has quit IRC16:43
shardyif there is you can copy the command, kill the hook and re-run puppet with --debug to see where it's stuck16:43
shardyayoung: note if you do that, you'll need to note the step it's stuck on and pass that to puppet16:44
shardye.g by adding step: 3 to a hieradata file or something16:44
ayoungshardy, so the deploy finally finished16:44
ayoungI'm on the controller, and Heat is running as a service .  Did you mean I should log in to a compute?16:44
shardywell, log in to whichever node has a stuck Deployment resource16:45
*** yamahata has quit IRC16:45
ayoungshardy, so kick it off again and ssh in when it hangs.  But it did actually complete.  Is there some log I should look at from the last time first?16:46
ayoung/var/log/os-apply-config.log  ?16:46
shardyayoung: you can look at the stdout from the deployment e.g16:47
ayoungshardy, that is what I pasted above16:47
shardyopenstack software deployment output show caa21456-320d-418b-9392-4a8f2d171a44 --all --long16:47
shardyno that's just the events16:47
ayounghttp://paste.openstack.org/show/587912/16:47
ayoungah...ok16:47
shardysounds like you want to profile the puppet run to make it faster?16:48
ayounggah just lost all output in this window...one sec16:48
ayoungshardy, yeah, or at least understand where the time is going16:48
EmilienMslagle: can you please approve https://review.openstack.org/#/c/393769/ ?16:50
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Include redis/mongo hiera when using pacemaker  https://review.openstack.org/39331816:51
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Remove duplicate metadata keys from nova-api.yaml  https://review.openstack.org/39332716:51
*** mcornea has quit IRC16:51
*** ayoung has quit IRC16:51
openstackgerritJames Slagle proposed openstack/python-tripleoclient: Fix nodes count check on stack-update with custom roles  https://review.openstack.org/39385516:51
*** ayoung has joined #tripleo16:52
*** paramite has quit IRC16:54
*** bana_k has joined #tripleo17:03
gfidentepanda|bbl so I don't see any particular slowdown in HA job with three controllers17:03
gfidenteand pingtest succeeded17:03
gfidenteI can see initial request for cinder taking a bit longer than the others, so wsgi preload might help17:03
gfidentebut I am inclined to think that this isn't the root cause17:03
gfidenteas the volume resource in heat goes into COMPLETE state17:04
gfidenteit's when we boot the nova guest that things fail, probably because of CPU overload17:04
*** flepied has joined #tripleo17:04
ccamachod0ugal reported here: https://bugs.launchpad.net/tripleo/+bug/163930217:04
openstackLaunchpad bug 1639302 in tripleo "Started Mistral Workflow fails due to malformed template" [Undecided,New]17:04
*** hewbrocca is now known as hewbrocca_afk17:07
d0ugalccamacho: Thanks.17:08
shardyayoung: Note that you can get the puppet manifest for a role from heat, e.g17:10
shardyopenstack software config list  | grep ComputePuppetConfigImpl17:10
*** ooolpbot has joined #tripleo17:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION17:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796117:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835017:10
*** ooolpbot has quit IRC17:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)17:10
shardyopenstack software config show <ID>17:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)17:10
shardywhich makes it fairly easy to cut/paste then run on the node manually17:10
ayoungshardy, oh, I did not realize, but it makes sense17:10
shardyI've been thinking we should write the manifest on each node to make this easier17:10
ayoungshardy, yes17:10
ayoungshardy, so I was thinking it should be like this:17:11
shardyayoung: the other option is when the heat hook is running, you can copy the <deploymentid>.pp file somewhere17:11
shardythen run it again after the heat hook finishes17:11
ayoung1. User calls an API to update the manifest in heat.  That is saved as a delta17:11
ayoung2. User can see both the old and new state in heat17:11
shardy(in both cases you have to add a step value to hiera)17:11
ayoung3.  User can run the deploy in "test only" mode17:11
ayoungfinally, apply....17:11
*** fragatina has quit IRC17:11
ayoungbut even then, we need to be able to trace and see the log as it happens to diagnose the unpredicted breakages, and to fix a broken redeploy17:12
ayoungis a workflow like that possible?17:12
openstackgerritHonza Pokorny proposed openstack/tripleo-ui: Redirect user to login page when token expires  https://review.openstack.org/39258917:12
shardyayoung: probably yes, it'd take a bit of work to wire it all together tho17:12
*** jpich has quit IRC17:13
shardythe biggest missing piece is a way for the heat hook to pass line-by-line data back instead of waiting for puppet to finish (or hang..)17:13
shardyayoung: if we write the manifest then exist, it'd be pretty easy to allow folks to run it directly or via another tool tho17:13
shardys/exist/exit17:14
*** jprovazn has joined #tripleo17:17
openstackgerritGiulio Fidente proposed openstack/tripleo-common: Install Heat's Ansible agent in the overcloud-full image  https://review.openstack.org/39387217:20
*** tremble has quit IRC17:22
*** ohamada has quit IRC17:23
openstackgerritMerged openstack/tripleo-heat-templates: Updated Nuage neutron plugin name  https://review.openstack.org/39192317:23
*** saneax is now known as saneax-_-|AFK17:23
*** rbowen has joined #tripleo17:23
*** florianf has quit IRC17:23
*** florianf has joined #tripleo17:29
openstackgerritAlfredo Moralejo proposed openstack/tripleo-quickstart: Increase haproxy timeouts  https://review.openstack.org/39387617:30
slagleamoralej: should we just fix that properly? instead of a ci workaround?17:32
*** lucasagomes is now known as lucas-afk17:32
amoralejslagle, i'm open to any suggestion, but i think slow hardware in rdo-ci is behind this17:32
amoralejin rdo-ci GET heat api calls can take more than 60 seconds to respond17:33
amoralejwhile in better hardware it takes 12seconds max17:33
amoralejwe are trying to tune things  in rdo-ci in parallel17:34
*** ebarrera has joined #tripleo17:35
*** lmiccini has quit IRC17:36
*** d0ugal has quit IRC17:39
*** rasca has quit IRC17:40
slaglesounds like shardy saw the same issue in his env. makes me think the root cause might be something other than slow hardware since this appears to have just come up17:40
*** jkilpatr has quit IRC17:42
*** jkilpatr has joined #tripleo17:43
shardyYeah, I didn't have much luck figuring out the root cause tho unfortunately17:44
shardyit was definitely the post to swift which caused it tho17:44
*** creepy_owlet is now known as dtantsur|afk17:44
*** ebarrera has quit IRC17:44
*** ebarrera has joined #tripleo17:47
*** shardy has quit IRC17:51
*** ebarrera has quit IRC17:54
*** charliejllewelly has quit IRC18:00
*** akshai has joined #tripleo18:00
*** yamahata has joined #tripleo18:03
*** rbrady-run is now known as rbrady18:03
openstackgerritDavid Moreau Simard proposed openstack/tripleo-quickstart: Properly reload kvm module when trying to set up nested virtualization  https://review.openstack.org/38601218:07
amoralejshardy, slagle it may be different issues, have you seen https://bugzilla.redhat.com/show_bug.cgi?id=1320164#c21 ?18:07
openstackbugzilla.redhat.com bug 1320164 in openstack-swift "2 tests for object storage failed with BadStatusLine" [Medium,Post] - Assigned to zaitcev18:07
*** links has joined #tripleo18:09
*** links has quit IRC18:10
openstackgerritEmilien Macchi proposed openstack/tripleo-heat-templates: Add preload wsgi script option for cinder  https://review.openstack.org/39379018:11
*** liverpooler has quit IRC18:14
*** jpena is now known as jpena|off18:16
openstackgerritJiri Stransky proposed openstack/tripleo-heat-templates: Updated Nuage neutron plugin name  https://review.openstack.org/39389218:18
*** stevemul has left #tripleo18:20
openstackgerritMerged openstack/python-tripleoclient: Fix nodes count check on stack-update with custom roles  https://review.openstack.org/39231318:20
*** sshnaidm has quit IRC18:24
*** sshnaidm has joined #tripleo18:27
*** rhefner has quit IRC18:27
openstackgerritBob Fournier proposed openstack/diskimage-builder: Set "NM_CONTROLLED=no" for ifcfg files in dhcp-all-interfaces.sh  https://review.openstack.org/39389718:28
EmilienMslagle: can you remove -1 on https://review.openstack.org/#/c/393855/ please?18:30
openstackgerritBob Fournier proposed openstack/diskimage-builder: Set "NM_CONTROLLED=no" for ifcfg files in dhcp-all-interfaces.sh  https://review.openstack.org/39389718:32
*** pblaho has quit IRC18:33
panda|bblbkero: what's the upstream for puppet-redis ?18:35
*** mgould is now known as mgould|afk18:37
*** milan has quit IRC18:38
*** saneax-_-|AFK has quit IRC18:38
EmilienMpanda|bbl: https://github.com/arioch/puppet-redis18:41
EmilienMI think18:41
EmilienMlet me check18:41
EmilienMyes18:41
panda|bblEmilienM: thanks.18:42
EmilienMpanda|bbl: FYI it's here https://github.com/redhat-openstack/rdoinfo/blob/master/rdo.yml#L32418:42
panda|bblEmilienM: ok. I'm not sure how any of this worked, but if we really want redis socket to be in /var/run, something has to be changed. The best way is to change the systemd service file. I'm not sure if it's better to propose a change upstream or put a patch in the package. Maybe both and then remove the patch when upstream lands ...18:44
*** jerrygb has quit IRC18:45
Sloweris there some magic to killing a stack while it's being started?18:45
Slowerevery time I try it ends up taking forever18:45
beaglesbandini: some other info on the whole "what creates this route" question of yesterday18:45
Slowereg while in CREATE_IN_PROGRESS18:45
*** ebalduf has quit IRC18:45
*** jerrygb has joined #tripleo18:45
bandinibeagles: oh I found out what happened18:45
bandinibeagles: your pointer gave me the right tip18:46
*** akshai has quit IRC18:46
beaglesbandini: ok cool.. fwiw: if net-config-noop.yaml is spec'd in the resource registry, something else is used...18:46
bandinibeagles: basically it was a wrongly configured EC2MetadataIP and ControlPlaneDefaultRoute. So the templates were adding routes to 169.254... that were impossible so the kernel would not create them and everything broke apart18:47
bandinibeagles: ah cool18:47
beaglesbandini: ahhh.. yeah, that'd do it :)18:48
bandinibeagles: I'd have to check if validations would have caught this one and file a bug ;)18:48
bandiniit's not entirely trivial to debug because the deployment simply times out18:49
beaglesbandini: ugh18:49
*** jkilpatr has quit IRC18:50
beaglesbandini: speaks to something that's been brewing for me.. I'd like us to explore better separation between the essential "infrastructure" and overcloud networking elements. For example, right now we configure neutron in the overcloud with mappings to the same bridge we use for the control plane18:52
beaglesbandini: we could have them as two distinct bridges and link them with patch ports18:52
beaglesbandini: so if we screw up configuration of a neutron service in the overcloud, it is less likely to mess up the networking for the control plane18:53
* beagles is just thinking out loud18:53
bandinibeagles: sounds like worth exploring, yes18:53
bandinibeagles: I need to play and see how much we can cover of these issues via validations18:54
beaglesbandini: cool18:54
bandinithe case I looked at the other day, screamed for a warning saying "dude, not going to work" ;)18:54
beagles:)18:55
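[editor's sketch] A validation along the lines bandini describes could check, before deploying, that `EC2MetadataIP` and `ControlPlaneDefaultRoute` are addresses on the control-plane subnet, so the routes that use them as gateways can actually be created. This is a sketch under that assumption, not an existing tripleo-validations check; `check_control_plane_params` and the example addresses are made up.

```python
import ipaddress


def check_control_plane_params(control_plane_cidr, params):
    """Return warnings for parameter addresses outside the control-plane subnet."""
    network = ipaddress.ip_network(control_plane_cidr)
    warnings = []
    for name, value in params.items():
        if ipaddress.ip_address(value) not in network:
            warnings.append("%s=%s is not inside %s" % (name, value, network))
    return warnings


# Hypothetical values: the second gateway is not reachable on the subnet.
problems = check_control_plane_params(
    "192.0.2.0/24",
    {"EC2MetadataIP": "192.0.2.1", "ControlPlaneDefaultRoute": "10.0.0.1"},
)
for problem in problems:
    print("warning:", problem)
```

A warning like this up front is cheaper to act on than a deployment that simply times out.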
beekneemechNow that we have the api, we should stop requiring users to specify the route to the undercloud api endpoints.18:55
beekneemechWe have a thing that knows what address the undercloud is using and can pass that into the overcloud config automatically.18:55
bandiniyeah that would definitely be an improvement as well18:56
*** jkilpatr has joined #tripleo19:03
*** dsariel has joined #tripleo19:07
*** fzdarsky__ is now known as fzdarsky|afk19:08
EmilienMslagle: can you approve https://review.openstack.org/#/c/393770/ please?19:08
slagleEmilienM: the master patch hasn't merged19:09
EmilienMmy bad19:09
*** ooolpbot has joined #tripleo19:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION19:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796119:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835019:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)19:10
*** ooolpbot has quit IRC19:10
openstackgerritEmilien Macchi proposed openstack/puppet-tripleo: swift/proxy: configure rabbitmq properly  https://review.openstack.org/39186219:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)19:10
EmilienMrebasing it/ recheck -- failed to pass CI19:10
openstackgerritMartin André proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles  https://review.openstack.org/33065919:10
openstackgerritMerged openstack/tripleo-heat-templates: Update openstack-puppet-modules dependencies  https://review.openstack.org/39379719:10
slagleEmilienM: should we w-1 the backport?19:10
slaglesomeone might come along and merge it accidentally19:11
openstackgerritIan Main proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles  https://review.openstack.org/33065919:11
EmilienMslagle: you can, I won't otherwise you'll be stuck19:11
EmilienMslagle: or -1 it19:11
EmilienMit's visible enough19:11
*** rbrady is now known as rbrady-afk19:11
*** amoralej is now known as amoralej|off19:11
slagleoh, well if i decide to never come back to work after today, i guess it may be stuck19:11
EmilienMbut won't block if we want to approve19:11
EmilienMlol19:11
EmilienMslagle: please come back19:12
slagleok, yea, i just -1'd19:12
slagleif i'm not around when the master patch lands, someone else can push it through if they so desire19:12
EmilienMsounds good19:12
openstackgerritJames Slagle proposed openstack/python-tripleoclient: Fix nodes count check on stack-update with custom roles  https://review.openstack.org/39385519:13
slagleEmilienM: can you approve that ^ :)19:14
EmilienMslagle: -219:14
slagleit fully passed CI and all i did was update commit message19:14
EmilienMyeah19:14
slaglethanks19:14
*** jerrygb has quit IRC19:14
*** saneax-_-|AFK has joined #tripleo19:16
*** jerrygb has joined #tripleo19:16
EmilienMslagle: the list of things we know critical is getting smaller, wdyt about doing an etherpad maybe? or a gerrit topic19:21
slagleEmilienM: that might help19:26
EmilienMslagle: etherpad.openstack.org/p/tripleo-newton-219:27
*** jcoufal has quit IRC19:28
*** akshai has joined #tripleo19:34
*** akshai has quit IRC19:35
*** fragatina has joined #tripleo19:44
openstackgerritIan Main proposed openstack/tripleo-heat-templates: Containerized Services for Composable Roles  https://review.openstack.org/33065919:44
panda|bblfirst hastily made attempt to solve redis issue even without a resource agent: https://github.com/arioch/puppet-redis/pull/13119:44
panda|bblbut if this ever merges, it will take ages .... and we need a workaround anyway ...19:45
panda|bblthe easiest thing is really use just /tmp/redis.sock as socket. Any suggestion appreciated19:47
*** chandankumar has quit IRC19:54
openstackgerritMerged openstack/puppet-tripleo: Create heat user in keystone profile  https://review.openstack.org/39300019:57
openstackgerritMerged openstack/tripleo-heat-templates: Remove duplicate metadata keys from nova-api.yaml  https://review.openstack.org/39332719:59
*** jerrygb_ has joined #tripleo20:01
*** fragatina has quit IRC20:03
*** jerrygb has quit IRC20:05
*** jerrygb_ has quit IRC20:06
*** tiswanso has quit IRC20:06
*** coolsvap has quit IRC20:07
*** morazi has quit IRC20:09
*** dprince has quit IRC20:12
*** abregman has joined #tripleo20:14
*** jayg is now known as jayg|g0n320:14
*** florianf has quit IRC20:15
*** jprovazn has quit IRC20:15
openstackgerritMerged openstack/tripleo-heat-templates: Include redis/mongo hiera when using pacemaker  https://review.openstack.org/39331820:22
*** jkilpatr has quit IRC20:25
*** athomas has quit IRC20:25
*** derekh has joined #tripleo20:27
derekhAre we still hitting CI issues? anything that needs an extra pair of hands to debug?20:28
*** panda|bbl is now known as panda20:31
EmilienMpanda: managing this file is imho a bad idea20:31
EmilienMit should be done by packaging20:31
EmilienMderekh, slagle: FYI I also see some issues when deploying undercloud20:32
EmilienMironic api sounds unreachable sometimes20:32
EmilienMhttp://logs.openstack.org/76/391876/5/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/ab8a98e/console.html#_2016-11-04_19_14_23_94967420:32
derekhEmilienM: I'll try to spin up an env and see which of the problems I hit first, then debug20:33
pandaEmilienM: the problem is that with the package it is not easy to get the config file options for the unix socket20:33
pandais there something that works this week ? :(20:34
pandaEmilienM: but I agree that is a bad idea ...20:35
EmilienMfwiw, stable/newton seems pretty stable20:35
pandaEmilienM: I'd like to check if redis starts there ...20:36
slagleEmilienM: you wanna know why?20:36
beaglesthis still the selinux issue20:36
beaglesovercloud nodes are permissive in newton :)20:36
slagleit's b/c we didnt backport https://review.openstack.org/#/c/393472/20:37
slagleright20:37
slaglethat's why i -1'd the backport20:37
slaglemaybe we should revert on master20:37
beaglesit would help unblock CI, but it kind of sucks20:37
pandaslagle: it's not enough, for nonha it is not a selinux problem. In HA redis is started by the resource agent that creates the /var/run/redis dir. Without pacemaker there is nothing that creates that dir, the unix socket cannot be created and redis does not start anyway.20:38
slaglepanda: do we know what has changed to cause that problem?20:39
* beagles reflects that is kind of weird that is happening now20:39
beaglesyeah, what he said20:40
pandaslagle: no, I wonder if this ever worked ... /var/run should have been a tmpfs for a long time now20:40
slagleright. /var/run has been tmpfs for at least all of centos 720:40
beaglesI imagine we've been down the obvious routes of having the service file do something.. like in the ExecStartPre or something?20:41
pandawe don't have overcloud logs for newton ... nodes are deleted when all goes well, or seems to go well20:44
EmilienMpanda: what?20:45
*** tiswanso has joined #tripleo20:45
*** afazekas_ has joined #tripleo20:45
EmilienMpanda: that's newton/ovb-ha logs http://logs.openstack.org/18/393318/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/9c6ffa2/logs/20:45
pandaEmilienM: https://bugs.launchpad.net/tripleo/+bug/163194220:46
openstackLaunchpad bug 1631942 in tripleo "Ha periodic jobs don't gather overcloud nodes logs on successful runs" [Undecided,New] - Assigned to Gabriele Cerami (gcerami)20:46
EmilienMah, periodic jobs20:46
*** jkilpatr has joined #tripleo20:46
*** afazekas has quit IRC20:48
pandaEmilienM: http://logs.openstack.org/18/393318/2/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/0b5af97/logs/overcloud-controller-0/var/log/redis/redis.txt.gz20:49
pandaredis did not start in newton nonha!20:49
pandabut maible ceilometer was disabled20:49
pandamaybe*20:49
*** tiswanso has quit IRC20:50
pandanothing was using it20:50
EmilienMpanda: have we compared the version of redis over the last days/weeks?20:50
openstackgerritMerged openstack/tripleo-heat-templates: Add option to disable "d1" Swift device  https://review.openstack.org/39376920:51
pandaEmilienM: I didn't. Not even sure where to start to do it.20:51
EmilienMI'm doing it but let me show you:20:51
EmilienMyou go on http://tripleo.org/cistatus.html and you take a job from 1 or 2 weeks ago. You inspect the logs and look what version of redis is installed on the overcloud (/var/log/hosts-info.txt)20:52
EmilienMand you compare with the version we have right now20:52
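[editor's sketch] The comparison EmilienM describes can be scripted once the two host info files are downloaded. The sketch below assumes the files contain `rpm -qa` style lines such as `redis-3.2.4-1.el7.x86_64`; `package_version` is a hypothetical helper, and the release part of the older package line is invented for the example.

```python
import re


def package_version(package_lines, name):
    """Return the version of package `name` from name-version-release.arch lines."""
    pattern = re.compile(r"^%s-(\d[\w.]*)-([\w.]+)\.(\w+)$" % re.escape(name))
    for line in package_lines:
        match = pattern.match(line.strip())
        if match:
            return match.group(1)
    return None


# Hypothetical excerpts from an old job and a current job.
old_job = ["redis-3.2.3-1.el7.x86_64", "puppet-redis-1.2.3-1.el7.noarch"]
new_job = ["redis-3.2.4-1.el7.x86_64", "puppet-redis-1.2.3-1.el7.noarch"]

old_version = package_version(old_job, "redis")
new_version = package_version(new_job, "redis")
if old_version != new_version:
    print("redis changed: %s -> %s" % (old_version, new_version))
```

Requiring a digit right after the package name keeps `redis` from matching unrelated packages whose names merely start with `redis-`.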
*** dbecker has joined #tripleo20:54
*** dbecker has quit IRC20:54
pandaEmilienM: last job I see is from 25 October20:54
EmilienMok so version of redis looks ok20:54
EmilienMpanda: is this file ok ? http://logs.openstack.org/56/390556/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/5ede534/logs/overcloud-controller-0/var/log/redis/redis.txt.gz20:55
*** pradk has quit IRC20:55
EmilienMit's an ovb job from 10 days ago, redis seems to start20:55
EmilienMlet's now compare packages between there and now (diff will be huge but still helpful)20:55
pandaEmilienM: this is ha, the resource agent takes care of creating the /var/run/redis dir20:55
pandaEmilienM: you have to look at nonha20:56
EmilienMok20:56
EmilienMpanda: http://logs.openstack.org/56/390556/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/f3972f2/logs/overcloud-controller-0/var/log/redis/redis.txt.gz20:56
EmilienMis it ok?20:56
pandaEmilienM: yes20:57
EmilienM● redis.service                                                               loaded    failed   failed    Redis persistent key-value database20:57
EmilienMI'm not sure20:57
pandaEmilienM: if there's that error, it did not start20:58
EmilienMsee in http://logs.openstack.org/56/390556/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/f3972f2/logs/overcloud-controller-0/var/log/host_info.txt.gz20:58
EmilienMso it was 10 days ago, which means redis was already broken?20:58
pandaEmilienM: yes20:58
EmilienMwe really need to get scenario001 green again, it's a telemetry CI job20:58
EmilienMpanda: ok I'm looking for old jobs now20:59
EmilienMok I found redis working21:00
EmilienMhttp://logs.openstack.org/67/371567/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/f48f1f0/logs/overcloud-controller-0/var/log/redis/redis.txt.gz21:00
EmilienMSep 1721:00
openstackgerritMerged openstack/python-tripleoclient: Fix nodes count check on stack-update with custom roles  https://review.openstack.org/39385521:00
EmilienMso now we have redis-3.2.4-1.el7.x86_6421:01
EmilienMbut it worked with redis version 3.2.321:01
EmilienMI guess we need to investigate diff between 3.2.3 and 3.2.421:01
EmilienMlooking at https://github.com/antirez/redis/releases21:02
EmilienM3.2.5 was released end of september which makes sense regarding our failures21:02
*** flepied has quit IRC21:03
mwhahahamore package fun?21:03
pandaEmilienM: https://github.com/antirez/redis/blob/3.2/00-RELEASENOTES21:04
*** dougbtv has quit IRC21:04
EmilienMmwhahaha: yes, I think redis broke us end of september and we didn't catch it21:04
mwhahahanice21:04
EmilienMI know a guy, called mwhahaha, he's super good at finding what commit broke us :P21:04
mwhahahateach me to speak up on a friday afternoon21:05
pandaEmilienM: maybe we should look at packaging changes21:05
EmilienMpanda: both code and packaging changes could break us21:06
EmilienMlooking at release notes, I think it's in code21:06
EmilienMhttps://github.com/antirez/redis/compare/3.2.3...3.2.421:06
EmilienMdiff is not too bad21:06
EmilienMI like their commit messages21:07
EmilienMhttps://github.com/antirez/redis/commit/c01abcdebf4fa2b1cd3d3a89049651d528ed565621:07
EmilienM"fix the fix"21:07
pandaI don't see anything related. They changed the message for socket bind problem, but that's it21:09
EmilienMpanda: maybe I'm wrong wrt redis version21:11
EmilienMlet me find a CI job right after redis upgrade and right before21:11
pandaEmilienM: in the meantime I checked if in case, we enabled pacemaker for nonha jobs too. It doesn't seem the case.21:13
*** abregman has quit IRC21:13
*** absubram has quit IRC21:13
EmilienMpanda: ok so I checked the day after redis release and redis was failing in tripleo CI21:15
jidaris there any example of using something like tripleo-common/blob/master/scripts/upload-puppet-modules to augment an existing puppet post-config update?21:16
jidarI'm not sure what glue there is between using that script to create a swift object for the puppet modules and then using it on the other end in post-config deployment pattern21:16
jidarI would suppose you just build a heat resource that pulls it down and unzips it in /etc/puppet/modules ?21:17
EmilienMpanda: and 2 days earlier with previous version it was working21:17
EmilienMso I'm quite sure now, that redis version has to do something here21:17
EmilienMslagle: do we already have a bug for redis?21:20
EmilienMotherwise i'll create one in launchpad21:20
slagleEmilienM: not one outside of the 2 that have been alerting already21:21
EmilienMok21:21
EmilienM3.2.5 is out also, I'm wondering if this one would work21:22
EmilienMlooks like no, it's just a compilation fix21:23
mwhahahawell the redis package works with no configuration so it must be the way we're configuring it21:23
mwhahahado we capture the redis configs?21:23
beekneemechjidar: Try http://hardysteven.blogspot.com/2016/08/tripleo-deploy-artifacts-and-puppet.html21:24
EmilienMslagle: https://bugs.launchpad.net/tripleo/+bug/163935621:24
openstackLaunchpad bug 1639356 in tripleo "Redis 3.2.4 breaks TripleO CI" [Critical,In progress]21:24
EmilienMmwhahaha: yes you need to download log file21:25
mwhahahathe /etc/redis folder is empty21:25
*** rhallisey has quit IRC21:25
mwhahaha(from the extracted overcloud-controller-0.tar.xz)21:26
jidarbeekneemech: yea,that's where I found this in the first place - was hoping for some code of how to recreate that image21:26
EmilienMo_O21:26
mwhahahathere's an etc/redis.conf.puppet21:26
mwhahahawhich seems weird21:27
mwhahahaoh its /etc/redis.conf21:27
pandawhat else changed that day ?21:28
pandaEmilienM: what's the day redis was upgraded ?21:28
EmilienMpanda: Sep 2621:29
beekneemechjidar: image?21:30
jidarbeekneemech: oh, the image there just shows a workflow where you pull data out of swift using "deploy-artifacts.sh"21:30
jidarvia the heat templates21:31
*** rlandy has quit IRC21:32
beekneemechjidar: Right.  upload-puppet-modules creates an env file for you that you pass to your deployment to apply the artifacts to the deployed servers.21:33
beekneemechBy default, $HOME/.tripleo/environments/puppet-modules-url.yaml21:33
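[editor's sketch] Passing the generated environment file to a deployment could look like the command built below. This is a sketch only: the default path is the one beekneemech quotes, `deploy_command` is a hypothetical helper, and any further arguments a real deployment needs are left out.

```python
import os


def deploy_command(home, extra_env_files=()):
    """Build an `openstack overcloud deploy` argv that includes the artifacts env file."""
    artifacts_env = os.path.join(
        home, ".tripleo", "environments", "puppet-modules-url.yaml"
    )
    argv = ["openstack", "overcloud", "deploy", "--templates"]
    for env_file in list(extra_env_files) + [artifacts_env]:
        argv += ["-e", env_file]
    return argv


# "/home/stack" is an assumed home directory for the example.
print(" ".join(deploy_command("/home/stack")))
```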
*** maeca1 has quit IRC21:34
EmilienMmwhahaha: even when redis was working, no config in /etc/redis21:35
EmilienMactually we have etc/redis-sentinel.conf21:35
*** jrollinhatin is now known as jroll21:36
*** paramite has joined #tripleo21:37
pandaEmilienM: look in /etc/redis.con21:37
pandaEmilienM: look in /etc/redis.conf21:37
pandaIs there a way to download the logs ?21:37
EmilienMpanda: yes, go in http://logs.openstack.org/39/375339/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-nonha/c3b122d/logs/21:37
EmilienMyou download the tar gz21:37
mwhahahaYea I found the redis.conf and was looking at it21:38
*** abehl has quit IRC21:40
*** ebalduf has joined #tripleo21:42
*** jeckersb is now known as jeckersb_gone21:44
EmilienMmhh only diff in config is the IP @ we use for binding21:45
EmilienMbefore (working) it was 192.0.2.11 and now it's 172.17.0.2121:46
EmilienMI wonder if we care21:46
beekneemechHow are the stable/newton jobs passing with redis 3.2.4?21:49
EmilienMbeekneemech: same problem21:50
EmilienMwe cut stable/newton after 3.2.4 anyway21:50
pandabeekneemech: because there's nothing that uses redis there ...21:50
EmilienMso if it's a config problem, it's in newton21:50
EmilienMpanda: I have to go now, please use the launchpad bug if you find something21:51
beekneemechAh, so redis is still broken there, but it doesn't matter.21:51
pandabeekneemech: it doesn't matter for our CI, which is probably disabling some service21:51
*** jerrygb has joined #tripleo21:51
pandaEmilienM: sure, have a nice PTO21:51
pandaEmilienM: leave us in despair21:51
pandaEmilienM: :)21:52
*** morazi has joined #tripleo21:52
mwhahahai think it's the socket file dir21:52
pandamwhahaha: ?21:52
mwhahahayea21:53
mwhahahaso /var/run/redis doesn't exist21:53
mwhahahaso it won't start21:53
mwhahahayou comment that out of the config and it works fine21:53
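[editor's sketch] The failure mode mwhahaha describes can be checked for directly: read the `unixsocket` directive from redis.conf and see whether its parent directory exists. A minimal sketch, with hypothetical helper names; the sample config is reduced to a few lines, using the bind address and socket path mentioned in the channel.

```python
import os


def unix_socket_path(redis_conf_text):
    """Return the path set by the `unixsocket` directive, or None if unset."""
    for raw in redis_conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out directive
        parts = line.split(None, 1)
        if parts[0].lower() == "unixsocket" and len(parts) == 2:
            return parts[1].strip().strip('"')
    return None


def socket_dir_missing(redis_conf_text, isdir=os.path.isdir):
    """True if a unix socket is configured but its directory does not exist."""
    path = unix_socket_path(redis_conf_text)
    if path is None:
        return False
    return not isdir(os.path.dirname(path))


sample_conf = """\
bind 172.17.0.21
port 6379
unixsocket /var/run/redis/redis.sock
unixsocketperm 755
"""
print(unix_socket_path(sample_conf))
```

On a node, `socket_dir_missing(open("/etc/redis.conf").read())` returning True would match the symptom seen here, since nothing recreates /var/run/redis on tmpfs.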
*** dtrainor has quit IRC21:54
mwhahahathe old spec used to create /var/run/redis21:54
pandamwhahaha: oh? where ?21:55
pandamwhahaha: link to the old spec ?21:55
pandamwhahaha: if it's really doing that, it's broken anyway21:55
mwhahahai pulled the srpm from cbs21:55
pandamwhahaha: /var/run is tmpfs, at first reboot /var/run/redis is deleted, and redis won't start21:55
mwhahahaso we need to stop configuring that21:56
mwhahahait gets put in from the puppet21:56
*** jerrygb has quit IRC21:56
mwhahaha- remove /var/run/redis with systemd #137472821:56
mwhahahaso that was in 3.2.3-221:56
*** paramite has quit IRC21:56
jidarbeekneemech: oh crap, so you can just generate the yaml there and then `-e $HOME/.path/to/yaml-file.yaml` straight out of the script? Super neat. I did an artifact deploy to drop some custom images in there and didn't realize I could just include it directly21:57
mwhahahabut if you look in the redis.conf it's configuring /var/run/redis/redis.sock21:57
jidarI feel silly haha21:57
mwhahahapanda: https://cbs.centos.org/koji/buildinfo?buildID=12030 old file21:57
mwhahahaor old package (you can just download and extract the srpm)21:57
pandamwhahaha: puppet-redis is also configuring that socket21:57
*** derekh has quit IRC21:58
mwhahahayea that's the problem so we need to either pass a new dir (or stop configuring it)21:58
beekneemechjidar: Yeah, I think that's it.21:58
pandamwhahaha: we have other alternatives, but I'd like to know if: 1) we really need to use unix socket (I think we do) 2) the socket file has to be in /var/run21:59
pandathe unix socket should be faster21:59
mwhahahapanda: gimme a min to look, what puppet module do we use?21:59
pandamwhahaha: https://github.com/arioch/puppet-redis.git22:00
pandamwhahaha: if the answer is yes for both, this may prove useful https://github.com/arioch/puppet-redis/pull/13122:00
mwhahahahttps://bugzilla.redhat.com/show_bug.cgi?id=137472822:00
openstackbugzilla.redhat.com bug 1374728 in redis "/var/run/redis and /usr/lib/tmpfiles/redis.conf not used" [Unspecified,New] - Assigned to fpercoco22:00
mwhahahai like how it said it wasn't used22:01
mwhahaha:(22:01
mwhahahaunless we're configuring something to consume the unix socket, i don't think we need to configure it22:02
mwhahahai wonder if it would just be ok if we didn't configure it22:02
pandamwhahaha: what is gnocchi-metricd using ?22:02
mwhahahai have no idea22:03
*** Vijayendra has quit IRC22:03
*** Vijayendra has joined #tripleo22:04
mwhahahaso we're getting that socket config from the defaults in the module, https://github.com/arioch/puppet-redis/blob/master/manifests/params.pp#L75-L7622:04
mwhahahapanda: is there a bug for this?22:04
openstackgerritAlex Schultz proposed openstack/puppet-tripleo: Skip unix socket configuration for redis  https://review.openstack.org/39394022:06
mwhahahapanda: EmilienM -^22:06
pandamwhahaha: https://bugs.launchpad.net/tripleo/+bug/163935622:07
openstackLaunchpad bug 1639356 in tripleo "Redis 3.2.4 breaks TripleO CI" [Critical,In progress] - Assigned to Alex Schultz (alex-schultz)22:07
*** derekh has joined #tripleo22:09
*** ooolpbot has joined #tripleo22:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION22:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796122:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)22:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835022:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163935622:10
*** ooolpbot has quit IRC22:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)22:10
openstackLaunchpad bug 1639356 in tripleo "Redis 3.2.4 breaks TripleO CI" [Critical,In progress] - Assigned to Alex Schultz (alex-schultz)22:10
mwhahahaoh that's going to be annoying if it spams by name on a timer22:10
pandamwhahaha: no pressure :)22:10
pandamwhahaha: can you depend on change I1fc28d522d05033add53dad41b7519a1bd033b62 ?22:11
openstackgerritAlex Schultz proposed openstack/puppet-tripleo: Skip unix socket configuration for redis  https://review.openstack.org/39394022:11
mwhahahak22:11
pandamwhahaha: thanks. If we end up not using the socket, maybe my change is not needed.22:13
mwhahahai don't think we need it22:13
mwhahahabut i'm looking22:13
pandagnocchi configuration points to the IP, not the socket, for redis22:13
mwhahahaseems that in THT the only thing we configure a socketdir for is ovs22:14
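[editor's note] The question above is whether anything actually consumes the redis unix socket, or whether clients such as gnocchi only use the TCP listener. A minimal sketch, not from the channel, of how one might check which listeners a redis.conf declares. The function name and the sample config are hypothetical; `unixsocket`, `port` and `bind` are standard redis.conf directives. The parsing is deliberately naive (no quoting or include handling).

```python
def redis_listeners(conf_text):
    """Return the listener-related directives found in redis.conf text.

    Hypothetical helper for illustration only: it splits each
    non-comment line into "directive value" and keeps the last
    occurrence of each directive.
    """
    settings = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0].lower()] = parts[1].strip()
    # A missing key means the directive was not set explicitly in this text.
    return {
        "unixsocket": settings.get("unixsocket"),
        "port": settings.get("port"),
        "bind": settings.get("bind"),
    }


# Hypothetical sample, modelled on the socket path mentioned in the channel.
SAMPLE = """\
# listeners
bind 192.0.2.10
port 6379
unixsocket /var/run/redis/redis.sock
"""

if __name__ == "__main__":
    for name, value in redis_listeners(SAMPLE).items():
        print(name, "=", value)
```

If `unixsocket` comes back as None, the config text does not set a socket path explicitly, which is the state the "Skip unix socket configuration for redis" change is aiming for.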
*** mhenkel has quit IRC22:14
*** fultonj has quit IRC22:19
*** artom_ has quit IRC22:19
*** gfidente has quit IRC22:19
*** noslzzp has quit IRC22:24
*** noslzzp has joined #tripleo22:24
*** flepied has joined #tripleo22:27
*** bfournie has quit IRC22:29
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Include keystone authtoken config in manila-share service  https://review.openstack.org/39394722:34
openstackgerritBen Nemec proposed openstack/tripleo-heat-templates: Move db settings from manila-api to manila-base  https://review.openstack.org/39394822:34
*** rajinir has quit IRC22:36
*** artom has joined #tripleo22:36
*** amoralej|off is now known as amoralej22:46
*** jerrygb has joined #tripleo22:52
*** jerrygb has quit IRC22:58
derekhmy attempt at reproducing failed with this error, is it one of the current issues?23:00
derekh++ timeout -k 10 240 ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=Verbose -o PasswordAuthentication=no -o ConnectionAttempts=32 heat-admin@192.0.2.10 sudo crm_resource -r openstack-heat-api --wait23:00
derekhcrm_resource for openstack-heat-api has failed!23:00
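[editor's note] The command derekh pasted wraps an ssh call to `crm_resource -r <resource> --wait` in a 240 second timeout. A rough Python sketch of the same idea, assuming nothing beyond what is in the pasted line. The function names and the injectable `runner` parameter are hypothetical, and `subprocess.run(timeout=...)` kills the child outright on expiry, unlike `timeout -k 10 240`, which sends TERM first and KILL ten seconds later.

```python
import subprocess


def build_wait_command(host, resource, user="heat-admin"):
    """Build the ssh argv from the pasted line (ssh options copied from it)."""
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
        "-o", "LogLevel=Verbose",
        "-o", "PasswordAuthentication=no",
        "-o", "ConnectionAttempts=32",
        f"{user}@{host}",
        "sudo", "crm_resource", "-r", resource, "--wait",
    ]


def wait_for_resource(host, resource, limit=240, runner=subprocess.run):
    """Return True if crm_resource --wait exits 0 within `limit` seconds.

    `runner` is a hypothetical hook so the call can be replaced in tests.
    """
    try:
        result = runner(build_wait_command(host, resource), timeout=limit)
    except subprocess.TimeoutExpired:
        return False  # the resource did not settle in time
    return result.returncode == 0
```

Usage would look like `wait_for_resource("192.0.2.10", "openstack-heat-api")`. A False result only says the wait failed or timed out, not why, which matches how little the pasted error message reveals.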
*** b00tcat` has quit IRC23:01
*** ooolpbot has joined #tripleo23:10
ooolpbotURGENT TRIPLEO TASKS NEED ATTENTION23:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163796123:10
openstackLaunchpad bug 1637961 in tripleo "periodic HA master job pingtest times out" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)23:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163835023:10
ooolpbothttps://bugs.launchpad.net/tripleo/+bug/163935623:10
*** ooolpbot has quit IRC23:10
openstackLaunchpad bug 1638350 in tripleo "pingtest failing on OVB jobs to create Cinder volume and Nova server" [Critical,In progress] - Assigned to Gabriele Cerami (gcerami)23:10
openstackLaunchpad bug 1639356 in tripleo "Redis 3.2.4 breaks TripleO CI" [Critical,In progress]23:10
*** saneax-_-|AFK is now known as saneax23:19
*** bfournie has joined #tripleo23:29
*** akshai has joined #tripleo23:35
*** akshai has quit IRC23:36
EmilienMmwhahaha, panda: back, what's up?23:42
EmilienMslagle: can you approve https://review.openstack.org/#/c/393770/ ?23:43
EmilienMerr https://review.openstack.org/#/c/391862/23:43
EmilienMwell we can approve both23:43
*** akshai has joined #tripleo23:44
pandaEmilienM: we have the root cause of all root causes23:46
pandaEmilienM: all the work of the past week can be removed, abandoned, destroyed.23:46
panda(the work to fix the breakage)23:47
*** akshai has quit IRC23:48
mwhahahaEmilienM: https://bugzilla.redhat.com/show_bug.cgi?id=1374728 it's a packaging fix23:53
openstackbugzilla.redhat.com bug 1374728 in redis "/var/run/redis and /usr/lib/tmpfiles/redis.conf not used" [Unspecified,Post] - Assigned to apevec23:53
*** maticue has quit IRC23:55
openstackgerritOpenStack Proposal Bot proposed openstack/python-tripleoclient: Updated from global requirements  https://review.openstack.org/38994523:58
*** amoralej is now known as amoralej|off23:59
