openstackgerrit | James Slagle proposed openstack/tripleo-common master: Config download support for all deployments https://review.openstack.org/508189 | 00:02 |
---|---|---|
*** salmankhan has joined #tripleo | 00:07 | |
*** ooolpbot has joined #tripleo | 00:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722640 | 00:10 |
*** ooolpbot has quit IRC | 00:10 | |
openstack | Launchpad bug 1722640 in tripleo "pike promotion errors, overcloud images and containers were not promoted properly" [Critical,Triaged] - Assigned to John Trowbridge (trown) | 00:10 |
*** salmankhan has quit IRC | 00:11 | |
*** toure is now known as toure_biab | 00:15 | |
*** yamahata has quit IRC | 00:18 | |
*** ecerquei has joined #tripleo | 00:26 | |
*** ecerquei has quit IRC | 00:39 | |
*** psahoo has joined #tripleo | 00:42 | |
EmilienM | slagle: nice work | 00:43 |
openstackgerrit | Merged openstack/puppet-tripleo stable/newton: Allow to configure snmpd_config https://review.openstack.org/510218 | 00:43 |
*** Goneri has quit IRC | 00:53 | |
*** tongl has joined #tripleo | 01:05 | |
*** ooolpbot has joined #tripleo | 01:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722640 | 01:10 |
*** ooolpbot has quit IRC | 01:10 | |
openstack | Launchpad bug 1722640 in tripleo "pike promotion errors, overcloud images and containers were not promoted properly" [Critical,Triaged] - Assigned to John Trowbridge (trown) | 01:10 |
*** liverpooler has quit IRC | 01:10 | |
*** jkilpatr has quit IRC | 01:10 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: overcloud-deploy: add config-download + ansible run feature https://review.openstack.org/508306 | 01:15 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-common master: Config download support for all deployments https://review.openstack.org/508189 | 01:15 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-heat-templates master: Config download support for standalone deployments https://review.openstack.org/505827 | 01:15 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: fs10: deploy steps with ansible https://review.openstack.org/508307 | 01:16 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Add logs for config-download https://review.openstack.org/510709 | 01:21 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Update document url and fix a spelling error https://review.openstack.org/505475 | 01:22 |
*** daidv has quit IRC | 01:24 | |
*** daidv has joined #tripleo | 01:24 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Remove Heat Cloudwatch API during upgrade and disable by default https://review.openstack.org/510101 | 01:26 |
*** daidv has quit IRC | 01:26 | |
*** daidv has joined #tripleo | 01:26 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Include gate jobs in the cistatus page https://review.openstack.org/482316 | 01:27 |
*** daidv has quit IRC | 01:27 | |
*** daidv has joined #tripleo | 01:27 | |
openstackgerrit | Merged openstack/puppet-tripleo stable/ocata: Disable SSH login for nova_migration user when migration over ssh is disabled. https://review.openstack.org/510793 | 01:41 |
openstackgerrit | Merged openstack/puppet-tripleo stable/newton: Disable SSH login for nova_migration user when migration over ssh is disabled. https://review.openstack.org/510798 | 01:41 |
*** dmacpher has joined #tripleo | 01:49 | |
*** psachin has joined #tripleo | 01:51 | |
*** owalsh_ has joined #tripleo | 02:08 | |
*** ooolpbot has joined #tripleo | 02:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722640 | 02:10 |
openstack | Launchpad bug 1722640 in tripleo "pike promotion errors, overcloud images and containers were not promoted properly" [Critical,Triaged] - Assigned to John Trowbridge (trown) | 02:10 |
*** ooolpbot has quit IRC | 02:10 | |
*** owalsh has quit IRC | 02:11 | |
*** Shatadru|Gone is now known as Shatadru | 02:13 | |
*** Shatadru is now known as Shatadru|coffee| | 02:13 | |
*** catintheroof has joined #tripleo | 02:20 | |
*** catintheroof has quit IRC | 02:22 | |
*** Shatadru|coffee| is now known as Shatadru | 02:23 | |
*** ramishra has joined #tripleo | 02:39 | |
*** Shatadru is now known as Shatadru|afk | 02:47 | |
*** marrusl has quit IRC | 02:53 | |
*** links has joined #tripleo | 02:56 | |
*** marrusl has joined #tripleo | 03:06 | |
*** daidv has quit IRC | 03:08 | |
*** ooolpbot has joined #tripleo | 03:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722640 | 03:10 |
*** ooolpbot has quit IRC | 03:10 | |
openstack | Launchpad bug 1722640 in tripleo "pike promotion errors, overcloud images and containers were not promoted properly" [Critical,Triaged] - Assigned to John Trowbridge (trown) | 03:10 |
*** yamahata has joined #tripleo | 03:17 | |
*** tzumainn has quit IRC | 03:21 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/instack-undercloud master: Updated from global requirements https://review.openstack.org/510309 | 03:33 |
*** pdeore has joined #tripleo | 03:54 | |
*** noslzzp has quit IRC | 03:55 | |
*** noslzzp has joined #tripleo | 03:56 | |
*** aditya_r has joined #tripleo | 04:03 | |
*** ykarel|afk has joined #tripleo | 04:04 | |
*** ratailor has joined #tripleo | 04:05 | |
*** mdnadeem has joined #tripleo | 04:06 | |
*** sshnaidm|off is now known as sshnaidm | 04:06 | |
*** ratailor_ has joined #tripleo | 04:07 | |
*** daidv has joined #tripleo | 04:11 | |
*** ratailor has quit IRC | 04:11 | |
*** fpan has quit IRC | 04:11 | |
*** zaneb has quit IRC | 04:17 | |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Add networking-sfc support https://review.openstack.org/489749 | 04:18 |
*** dpawar has joined #tripleo | 04:19 | |
*** aditya_r has quit IRC | 04:20 | |
*** ratailor__ has joined #tripleo | 04:21 | |
*** jtomasek has joined #tripleo | 04:22 | |
*** ratailor_ has quit IRC | 04:22 | |
*** Shatadru|afk is now known as Shatadru | 04:23 | |
*** dpawar has quit IRC | 04:24 | |
EmilienM | jaosorior: FYI https://review.openstack.org/#/c/510983 | 04:26 |
jaosorior | bummer | 04:28 |
*** gbarros has joined #tripleo | 04:28 | |
EmilienM | jaosorior: yeah | 04:28 |
*** aditya_r has joined #tripleo | 04:28 | |
EmilienM | jaosorior: see my comment | 04:28 |
EmilienM | jaosorior: maybe we can engage with the ec2-api community | 04:28 |
jaosorior | EmilienM: when we introduced it we only ran pingtest jobs | 04:28 |
EmilienM | ansiwen: ^ when you're up, please look | 04:28 |
jaosorior | maybe they'll find it more useful now that we run tempest. | 04:29 |
EmilienM | yeah | 04:29 |
EmilienM | anyway I go to bed | 04:29 |
EmilienM | ttyl | 04:29 |
jaosorior | have a good one | 04:29 |
* EmilienM will try to not dream about gate | 04:29 | |
pdeore | EmilienM, need your review on https://review.openstack.org/#/c/509079/ .. could you please have a look on this? | 04:31 |
*** ratailor_ has joined #tripleo | 04:32 | |
*** shreshtha has joined #tripleo | 04:33 | |
pdeore | EmilienM, jaosorior , Thanks !! | 04:33 |
*** ratailor__ has quit IRC | 04:36 | |
*** gbarros has quit IRC | 04:37 | |
*** fpan has joined #tripleo | 04:40 | |
*** zaneb has joined #tripleo | 04:41 | |
*** dparkes has quit IRC | 05:00 | |
*** ebarrera has quit IRC | 05:06 | |
*** skramaja has joined #tripleo | 05:07 | |
*** mdnadeem_ has joined #tripleo | 05:08 | |
*** mdnadeem has quit IRC | 05:10 | |
*** gkadam has joined #tripleo | 05:15 | |
*** Guest2849 is now known as sdake | 05:20 | |
*** sdake is now known as Guest95430 | 05:21 | |
*** Guest95430 is now known as sdake_fixing | 05:23 | |
*** ratailor__ has joined #tripleo | 05:23 | |
*** ratailor_ has quit IRC | 05:26 | |
*** sdake_fixing has quit IRC | 05:29 | |
*** sdake_fixing has joined #tripleo | 05:29 | |
*** jaganathan has joined #tripleo | 05:30 | |
*** sdake_fixing has quit IRC | 05:35 | |
*** udesale has joined #tripleo | 05:38 | |
*** yog_ has quit IRC | 05:38 | |
*** aufi has joined #tripleo | 05:42 | |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart master: Fix IP address for ssh-tunnel for UI https://review.openstack.org/511143 | 05:43 |
*** mdnadeem_ has quit IRC | 05:44 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/instack-undercloud master: Updated from global requirements https://review.openstack.org/510309 | 05:44 |
*** cshastri has joined #tripleo | 05:49 | |
*** dpawar has joined #tripleo | 05:49 | |
*** dparkes has joined #tripleo | 05:51 | |
*** yog_ has joined #tripleo | 05:51 | |
sshnaidm | https://stackoverflow.com is down, let's go home | 05:52 |
*** sshnaidm is now known as sshnaidm|afk | 05:52 | |
*** achadha has joined #tripleo | 05:53 | |
*** marios has joined #tripleo | 05:55 | |
*** janki has joined #tripleo | 05:55 | |
*** yog_ has quit IRC | 05:56 | |
*** achadha has quit IRC | 05:58 | |
*** mdnadeem_ has joined #tripleo | 06:00 | |
*** dmacpher has quit IRC | 06:02 | |
*** jfrancoa has joined #tripleo | 06:04 | |
*** aditya_ra has joined #tripleo | 06:04 | |
*** aditya_r has quit IRC | 06:05 | |
*** yprokule has joined #tripleo | 06:07 | |
*** yog_ has joined #tripleo | 06:08 | |
*** aditya_ra has quit IRC | 06:14 | |
*** aditya_r has joined #tripleo | 06:15 | |
*** fragatina has joined #tripleo | 06:16 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-common master: Add novajoin images https://review.openstack.org/467074 | 06:16 |
Tengu | hello jaosorior :9 | 06:16 |
jaosorior | Tengu: hey dude, how's it going? | 06:18 |
Tengu | fine and you? | 06:18 |
jaosorior | pretty good | 06:19 |
jaosorior | getting my morning coffee | 06:19 |
Tengu | eh :). getting my morning tea in here. and preparing a new deploy for my ceph nodes - I found out the issue yesterday, the colleague who plugged the additional network card plugged it in the wrong socket - meaning the deploy failed because it didn't find the 4 interfaces we configured… | 06:20 |
Tengu | so hopefully today will be THE day where all works as expected. wish me luck ;) | 06:21 |
jaosorior | oooh damn | 06:21 |
jaosorior | but good that you found it | 06:21 |
jaosorior | good luck! | 06:21 |
Tengu | funnily, this issue also reported an "unknown" code/message. | 06:21 |
Tengu | had to go on the node and check its logs directly. | 06:22 |
*** spectr has quit IRC | 06:27 | |
*** spectr has joined #tripleo | 06:28 | |
*** jprovazn has joined #tripleo | 06:30 | |
*** nyechiel_ has joined #tripleo | 06:31 | |
*** ratailor_ has joined #tripleo | 06:32 | |
openstackgerrit | Merged openstack/tripleo-heat-templates stable/newton: Correctly set DEFAULT/dhcp_domain to empty string. https://review.openstack.org/506688 | 06:34 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Fix some missed hard-coded network references https://review.openstack.org/508947 | 06:34 |
*** ratailor__ has quit IRC | 06:35 | |
*** ebarrera has joined #tripleo | 06:36 | |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-common master: Make validation inputs configurable via Mistral https://review.openstack.org/383763 | 06:37 |
*** ccamacho has joined #tripleo | 06:47 | |
*** owalsh_ is now known as owalsh | 06:49 | |
*** threestrands has quit IRC | 06:50 | |
*** dpawar has quit IRC | 06:54 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: EARLY WIP: Convert tags to when statements for Q major upgrade workflow https://review.openstack.org/510902 | 06:54 |
openstackgerrit | Merged openstack/tripleo-heat-templates stable/ocata: Addition of Nuage as mechanism driver for ML2 https://review.openstack.org/492224 | 06:58 |
*** aditya_r has quit IRC | 06:58 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates stable/pike: Remove Heat Cloudwatch API during upgrade and disable by default https://review.openstack.org/511155 | 06:58 |
*** sdake has joined #tripleo | 07:01 | |
*** aditya_r has joined #tripleo | 07:02 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: set neutron mtu to match system mtu https://review.openstack.org/509761 | 07:02 |
*** openstackgerrit has quit IRC | 07:03 | |
*** rcernin has joined #tripleo | 07:05 | |
*** jlinkes has joined #tripleo | 07:05 | |
*** jpena|off is now known as jpena | 07:08 | |
Tengu | oh, new issue now: deploy seems to be stuck. | 07:12 |
*** pcaruana has joined #tripleo | 07:13 | |
*** ykarel_ has joined #tripleo | 07:14 | |
*** ffiore has joined #tripleo | 07:15 | |
*** ykarel|afk has quit IRC | 07:16 | |
*** tesseract has joined #tripleo | 07:19 | |
*** sid1 has joined #tripleo | 07:19 | |
*** dpawar has joined #tripleo | 07:19 | |
*** amoralej|off is now known as amoralej | 07:20 | |
*** florianf has joined #tripleo | 07:20 | |
*** ykarel_ is now known as ykarel | 07:21 | |
*** ratailor__ has joined #tripleo | 07:21 | |
*** ratailor_ has quit IRC | 07:24 | |
*** tongl has quit IRC | 07:30 | |
*** ffiore has quit IRC | 07:35 | |
*** ratailor__ has quit IRC | 07:36 | |
*** cylopez has joined #tripleo | 07:36 | |
*** jpich has joined #tripleo | 07:39 | |
lvdombrkr | folks, if i deploy my overcloud with dvr, how can i remove it? | 07:43 |
*** openstackgerrit has joined #tripleo | 07:43 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-docs master: Update documentation for O to P upgrade and update https://review.openstack.org/496223 | 07:43 |
Tengu | lvdombrkr: I think you'd better redeploy it from 0 - we did try something similar but in the end we couldn't change the network setting because a lot of things are done in a one-way way, meaning we should have searched everywhere in order to ensure all variables/configurations are properly reverted/managed | 07:46 |
*** openstackgerrit has quit IRC | 07:48 | |
lvdombrkr | Tengu: thanks, i will redeploy from 0...but if i want to remove it from existing cloud (just to know) i need to remove from deploy command dvr templates? | 07:49 |
Tengu | lvdombrkr: you need to replace all dvr stuff by the network plan you want, but that won't be enough | 07:50 |
Tengu | because many configurations won't be reverted/removed, as they will be "unmanaged" once you drop dvr templates. | 07:50 |
lvdombrkr | Tengu: thanks now its more clear.. | 07:51 |
*** itlinux has joined #tripleo | 07:52 | |
Tengu | :) | 07:52 |
*** ffiore has joined #tripleo | 07:54 | |
*** openstackgerrit has joined #tripleo | 07:55 | |
openstackgerrit | Jose Luis Franco proposed openstack/tripleo-heat-templates master: Remove Heat Cloudwatch API https://review.openstack.org/508964 | 07:55 |
*** sid2 has joined #tripleo | 07:59 | |
*** florianf has quit IRC | 08:02 | |
*** sid1 has quit IRC | 08:03 | |
*** ffiore has quit IRC | 08:04 | |
*** itlinux has quit IRC | 08:04 | |
*** ffiore has joined #tripleo | 08:05 | |
*** florianf has joined #tripleo | 08:07 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: DNM: test containers update https://review.openstack.org/511175 | 08:12 |
*** florianf has quit IRC | 08:13 | |
*** florianf has joined #tripleo | 08:13 | |
*** sid1 has joined #tripleo | 08:15 | |
*** nyechiel_ has quit IRC | 08:15 | |
*** ebarrera has quit IRC | 08:18 | |
*** sid2 has quit IRC | 08:18 | |
*** yamahata has quit IRC | 08:19 | |
*** hjensas has joined #tripleo | 08:20 | |
*** ebarrera has joined #tripleo | 08:20 | |
Tengu | jaosorior: so, deploy failed, got stuck in some process without any output. I suspect I kind of broke the stack with all the failures I got lately trying to add the ceph storages… Last try now, else I'll drop the whole stack and re-create it from zero… | 08:25 |
jaosorior | Tengu: or you could wait for shardy to log in. He could probably help you out with that. | 08:25 |
jaosorior | dammit | 08:26 |
jaosorior | I just wrecked my development environment | 08:27 |
jaosorior | X_x | 08:27 |
jaosorior | looks like a good time to go for lunch | 08:27 |
*** pcaruana has quit IRC | 08:27 | |
*** nyechiel_ has joined #tripleo | 08:28 | |
*** shardy has joined #tripleo | 08:29 | |
*** pcaruana has joined #tripleo | 08:31 | |
*** gfidente has joined #tripleo | 08:32 | |
*** openstackgerrit has quit IRC | 08:33 | |
*** spectr has quit IRC | 08:40 | |
*** egonzalez has joined #tripleo | 08:41 | |
Tengu | jaosorior: duh. well, we have to change something deep inside the overcloud, so I think we will re-deploy all the stuff anyway. | 08:42 |
*** openstackgerrit has joined #tripleo | 08:42 | |
openstackgerrit | Merged openstack/tripleo-heat-templates stable/ocata: Use sub_nodes_private instead of node_private https://review.openstack.org/509704 | 08:42 |
*** nyechiel_ has quit IRC | 08:47 | |
*** aditya_r has quit IRC | 08:48 | |
*** spectr has joined #tripleo | 08:50 | |
*** hjensas has quit IRC | 08:52 | |
*** fzdarsky has joined #tripleo | 08:52 | |
*** derekh has joined #tripleo | 08:52 | |
*** salmankhan has joined #tripleo | 08:53 | |
*** lucas-afk is now known as lucasagomes | 08:55 | |
*** ykarel is now known as ykarel|lunch | 08:56 | |
*** hjensas has joined #tripleo | 08:58 | |
*** hjensas has quit IRC | 08:58 | |
*** hjensas has joined #tripleo | 08:58 | |
openstackgerrit | Thomas Herve proposed openstack/instack-undercloud master: Handle relative hieradata_override https://review.openstack.org/510813 | 09:03 |
*** milan has joined #tripleo | 09:07 | |
*** fzdarsky has quit IRC | 09:10 | |
*** dtantsur|afk is now known as dtantsur | 09:11 | |
*** suuuper has joined #tripleo | 09:14 | |
*** ratailor has joined #tripleo | 09:14 | |
*** salmankhan has quit IRC | 09:14 | |
*** dr_gogeta86 has joined #tripleo | 09:16 | |
*** salmankhan has joined #tripleo | 09:20 | |
*** tosky has joined #tripleo | 09:21 | |
ramishra | therve: tempest change has broken our gate, we probably need https://review.openstack.org/#/c/510919/ | 09:24 |
jaosorior | sshnaidm|afk, adarazs: I'm trying to run quickstart and it fails the repo setup step with: Failed to parse dlrn hash | 09:25 |
jaosorior | any idea why? | 09:25 |
ramishra | ignore it here:) | 09:25 |
sshnaidm|afk | jaosorior, no, but logs would help | 09:25 |
adarazs | jaosorior: yeah, how are you running? where? what job? :) | 09:26 |
jaosorior | adarazs: no job, locally | 09:26 |
jaosorior | adarazs, sshnaidm|afk: This is what I'm running: ./quickstart.sh --no-clone --teardown all --clean -p quickstart-extras.yml -N config/nodes/1ctlr_1comp.yml -c config/general_config/minimal.yml -R master-tripleo-ci --tags "all" $HOST | 09:26 |
*** mcornea has joined #tripleo | 09:27 | |
*** mcornea has quit IRC | 09:27 | |
adarazs | jaosorior: so probably your env is not clean and you have some variables defined from an older run of maybe devmode where you specified some changes that the playbook picks up? | 09:27 |
jaosorior | adarazs, sshnaidm|afk: This is the step where it breaks http://paste.openstack.org/show/623299/ | 09:28 |
adarazs | jaosorior: that would be my guess. | 09:28 |
jaosorior | adarazs: ok, gonna try deleting ~/.quickstart | 09:28 |
*** hewbrocca_afk is now known as hewbrocca | 09:28 | |
*** mcornea has joined #tripleo | 09:28 | |
adarazs | jaosorior: and also try to start in a fresh shell session. | 09:28 |
sshnaidm|afk | jaosorior, and what is in /home/stack/_home_stack_repo_setup.sh.log ? | 09:28 |
adarazs | (from the virthost) | 09:29 |
*** psahoo has quit IRC | 09:29 | |
jaosorior | sshnaidm|afk: https://paste.fedoraproject.org/paste/abm3yhWoKOfH0fxxRYZEGw | 09:31 |
*** mcornea has quit IRC | 09:31 | |
adarazs | jaosorior: hmm, that's weird. | 09:32 |
adarazs | not what I expected. so it tries to curl https://trunk.rdoproject.org/centos7/current-tripleo/delorean.repo and it failed to parse the hash. | 09:32 |
adarazs | not sure why that would happen apart from random download error from the DLRN server. | 09:32 |
jaosorior | adarazs: I've had that happen three times in a row now :/ | 09:32 |
adarazs | that is weird. | 09:33 |
adarazs | we should look at that code that tries to parse it. | 09:33 |
*** mcornea has joined #tripleo | 09:34 | |
*** jaosorior has quit IRC | 09:36 | |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Fix ansible-lint.sh to check playbooks https://review.openstack.org/446525 | 09:37 |
*** fragatina has quit IRC | 09:37 | |
*** mdnadeem has joined #tripleo | 09:38 | |
sshnaidm|afk | adarazs, jaosorior now current and current-tripleo are different, that's why script fails | 09:39 |
sshnaidm|afk | not sure why it's different and why it should be the same according to pabelanger, checking.. | 09:39 |
*** mdnadeem_ has quit IRC | 09:39 | |
*** dparkes has quit IRC | 09:41 | |
sshnaidm|afk | adarazs, jaosorior oops, sorry, it shouldn't be the same, just checking if it's not empty.. | 09:41 |
openstackgerrit | Rajesh Tailor proposed openstack/tripleo-heat-templates master: Enable TLS for ec2api service https://review.openstack.org/509393 | 09:42 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Add pre-deployment negative tests for validations https://review.openstack.org/488495 | 09:43 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Add post-deployment negative tests for validations https://review.openstack.org/504014 | 09:43 |
pdeore | shardy, gfidente, need your review on https://review.openstack.org/#/c/510846/ please... | 09:44 |
adarazs | sshnaidm|afk: yeah, it shouldn't be the same :) | 09:45 |
*** dparkes has joined #tripleo | 09:46 | |
*** pdeore has quit IRC | 09:47 | |
*** jaosorior has joined #tripleo | 09:47 | |
Tengu | jaosorior: good news: deploy is working fine. I have all my seven nodes recognized, and they were already rebooted on the final system, now it's configuration time for their different roles. | 09:52 |
jaosorior | Tengu: nice! | 09:53 |
ykarel|lunch | jaosorior, to me it looks like there is some network issue(connectivity to trunk.rdoproject.org) in your environment, can you try hacking your master-tripleo-ci.yml | 09:54 |
*** ykarel|lunch is now known as ykarel | 09:54 | |
*** sid1 has quit IRC | 09:55 | |
ykarel | i mean the output of: curl --silent ${NODEPOOL_RDO_PROXY}/centos7/current/delorean.repo | 09:55 |
Tengu | jaosorior: small note though: might be good to redefine the specs for the undercloud node, as they are pretty wrong in the end. Especially regarding RAM. and mentioning it's an I/O-intensive task would be nice as well. | 09:55 |
jaosorior | ykarel: trying to run it out again. restarted my router and such. | 09:57 |
jaosorior | Tengu: where? in the docs? | 09:57 |
ykarel | jaosorior, Ok | 09:58 |
*** udesale has quit IRC | 09:58 | |
Tengu | jaosorior: yup. | 09:58 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: EARLY WIP: Convert tags to when statements for Q major upgrade workflow https://review.openstack.org/510902 | 10:00 |
jaosorior | Tengu: you could propose a patch for this https://github.com/openstack/tripleo-docs/blob/master/doc/source/install/environments/baremetal.rst | 10:02 |
Tengu | jaosorior: hmm yup. | 10:03 |
Tengu | will do that now as I have some spare time during the deploy process. thanks for the link :) | 10:03 |
openstackgerrit | Numan Siddique proposed openstack/tripleo-heat-templates master: ci-ovn: Disable Swift services in scenario 007 container job https://review.openstack.org/511199 | 10:04 |
*** dpawar has quit IRC | 10:05 | |
*** dpawar has joined #tripleo | 10:06 | |
-openstackstatus- NOTICE: The CI system will be offline starting at 11:00 UTC (in just under an hour) for Zuul v3 rollout: http://lists.openstack.org/pipermail/openstack-dev/2017-October/123337.html | 10:09 | |
openstackgerrit | Cédric Jeanneret proposed openstack/tripleo-docs master: added some remarks for undercloud requirements https://review.openstack.org/511201 | 10:11 |
Tengu | jaosorior: -^ :) | 10:11 |
*** mdnadeem has quit IRC | 10:12 | |
*** dbecker has joined #tripleo | 10:12 | |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Do not override the docker image for scenario001 https://review.openstack.org/510984 | 10:14 |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Mirror images from RDO server https://review.openstack.org/510362 | 10:15 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart master: Update images location to new ones https://review.openstack.org/510363 | 10:16 |
Tengu | jaosorior: thank you :) | 10:16 |
sshnaidm|afk | adarazs, ^^ | 10:17 |
*** ebarrera has quit IRC | 10:19 | |
*** ratailor has quit IRC | 10:21 | |
*** ebarrera has joined #tripleo | 10:21 | |
*** ratailor has joined #tripleo | 10:22 | |
*** rhallisey has joined #tripleo | 10:22 | |
adarazs | sshnaidm|afk: https://review.openstack.org/510363 -- reviewed, realized some things are missing :) | 10:23 |
sshnaidm|afk | adarazs, let's do it in different patch, this one is only for tripleo-ci | 10:24 |
Tengu | oh. hmmm. the fact we will be able to add custom backends to haproxy makes me think we might have a nice way to get let's encrypt certificate in a way that won't require any external storage. | 10:24 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-common stable/ocata: Take wwn_with_extension into account, when configuring a boot device https://review.openstack.org/510817 | 10:25 |
*** salmankhan has quit IRC | 10:26 | |
*** mdnadeem has joined #tripleo | 10:26 | |
adarazs | sshnaidm|afk: you know how difficult it is to push through a change nowadays, one less thing to recheck 20 times. | 10:27 |
adarazs | sshnaidm|afk: again it was just rebased so we don't lose any time with pushing another patchset now. | 10:28 |
sshnaidm|afk | adarazs, ci will shutdown soon anyway, so no hurry.. | 10:28 |
adarazs | sshnaidm|afk: still do you think it will be easier to push things through after? :D | 10:29 |
adarazs | sshnaidm|afk: I just mean the less change we need to merge the better. | 10:29 |
sshnaidm|afk | adarazs, I'd like to do more atomic changes for easy reverting them in case we need to | 10:29 |
adarazs | especially for 3 line extra changes. | 10:29 |
*** dobson has quit IRC | 10:29 | |
*** ffiore has quit IRC | 10:29 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo master: Use cacertfile option to get CA certificate https://review.openstack.org/507515 | 10:29 |
adarazs | sshnaidm|afk: well, your patch is mainly concerned with renaming all the ipa_images.tar mentions, so that fits in the style. | 10:29 |
adarazs | sshnaidm|afk: same thing just for 3 different files. | 10:30 |
adarazs | if we need to revert it, it needs to be reverted for both. | 10:30 |
adarazs | those links are pointing to the same things. | 10:30 |
sshnaidm|afk | adarazs, yeah, but these are patches for tripleo-ci anyway.. | 10:31 |
adarazs | sshnaidm|afk: eh, okay. | 10:32 |
adarazs | you'll help me with the rechecks. :P | 10:32 |
*** dobson has joined #tripleo | 10:32 | |
*** salmankhan has joined #tripleo | 10:32 | |
sshnaidm|afk | adarazs, it's not a new issue :) | 10:33 |
sshnaidm|afk | adarazs, and with zuulv3 it will be much easier and better (as infra promises) | 10:33 |
adarazs | riiiiiiight. | 10:33 |
sshnaidm|afk | next quarter maybe.. | 10:33 |
Tengu | hmm so no need to stress the "recheck" command for CI stuff right now, right? :) | 10:34 |
Tengu | uhu, I see people on other openstack related channels stressing their merge requests/reviews :Þ | 10:34 |
adarazs | Tengu: it's probably as useful as beating a dead horse. | 10:34 |
Tengu | poor horse. | 10:35 |
Tengu | shall we eat it instead? :] | 10:35 |
adarazs | I wouldn't dare, it's possessed by Zuul :P :) | 10:35 |
Tengu | it's ok, I'm a daemon myself, I don't fear Zuul ;) | 10:36 |
Tengu | deploy's almost done. pfiouu. | 10:39 |
Tengu | took over an hour… thanks to our undercloud instance that's just… well. anyway. it's working. | 10:39 |
*** ratailor_ has joined #tripleo | 10:41 | |
ansiwen | EmilienM: ok, I looked. Yes, it's a pity if they remove it, thanks for your comment. on the other hand, I can understand that they feel uncomfortable that something is gating their job that they don't control. If tripleo CI job is broken, they can't merge... | 10:42 |
ansiwen | *gating their project... | 10:42 |
*** fandrieu has quit IRC | 10:43 | |
rook | shardy hey - thanks for checking out the doc... the only concern I have is, I was able to scale out beyond 32 nodes w/o convergence. | 10:44 |
*** ratailor has quit IRC | 10:45 | |
*** sid1 has joined #tripleo | 10:45 | |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci master: Mirror images from RDO server https://review.openstack.org/510362 | 10:46 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart master: Fix ironic-python-agent.tar name for centosci https://review.openstack.org/511206 | 10:46 |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: WIP Fix ConfigDebug for puppet host runs https://review.openstack.org/511207 | 10:48 |
openstackgerrit | Dougal Matthews proposed openstack/tripleo-common master: Migrate to the new Mistral context class https://review.openstack.org/506186 | 10:48 |
*** jkilpatr has joined #tripleo | 10:49 | |
*** dpawar has quit IRC | 10:52 | |
*** achadha has joined #tripleo | 10:53 | |
*** achadha has quit IRC | 10:58 | |
*** fzdarsky has joined #tripleo | 10:59 | |
*** dpawar has joined #tripleo | 11:00 | |
*** dbecker has quit IRC | 11:02 | |
*** ffiore has joined #tripleo | 11:02 | |
lvdombrkr | folks, someone see this error, when tried to update existing overcloud? http://paste.openstack.org/raw/623307/ | 11:03 |
*** stee_3 has quit IRC | 11:04 | |
*** brault has quit IRC | 11:04 | |
*** stee_3 has joined #tripleo | 11:05 | |
*** pkovar has joined #tripleo | 11:05 | |
*** itlinux has joined #tripleo | 11:05 | |
itlinux | hello all.. I have a question since I have a client that wants to use provider network only what's the best way to adopt that using OOO? | 11:06 |
mandre | jistr: https://review.openstack.org/#/c/501872/ is green!! | 11:06 |
shardy | rook: Hi, so your nova error went away after disabling convergence? | 11:07 |
rook | shardy the nova/keystone-auth issue didn't happen with convergence disabled,no. | 11:07 |
rook | shardy: I also find it interesting it happened during a scale-up operation, not during the initial deploy | 11:08 |
itlinux | also has anyone used or checked cell2 config on OOO | 11:10 |
jistr | mandre: whohooo \o/ +A'd both of them | 11:11 |
* jistr grabs tea | 11:11 | |
mandre | jistr: looks like the upgrade failures are valid bugs, "Could not find the requested service openstack-ceilometer-central" | 11:12 |
mandre | wow a lot of sweat in these two patches, thanks jistr | 11:12 |
*** fandrieu has joined #tripleo | 11:13 | |
*** sshnaidm|afk is now known as sshnaidm | 11:15 | |
rook | scaling up w/o convergence shardy | 11:16 |
shardy | rook: Ok, would be good to see the nova logs, as currently I'm not clear why convergence would cause a nova error | 11:17 |
shardy | in theory the calls to nova from heat should be about the same | 11:18 |
shardy | the timing may be different but otherwise I'm unclear why it would make any difference to nova | 11:18 |
shardy | unless we dos'd the DB or something? | 11:18 |
* shardy bbiab lunchtime | 11:18 | |
*** shardy is now known as shardy_lunch | 11:19 | |
sshnaidm | for whom it could be relevant, zuul-v3 status page is here: http://cistatus.tripleo.org:8000/ | 11:20 |
sshnaidm | zuul-v2 is here: http://cistatus.tripleo.org | 11:20 |
itlinux | hi.. all about zuul is there a module then to install zuul? | 11:21 |
*** udesale has joined #tripleo | 11:21 | |
jaosorior | itlinux: where do you want to install zuul? | 11:22 |
*** fzdarsky_ has joined #tripleo | 11:22 | |
*** lucasagomes is now known as lucas-hungry | 11:23 | |
*** bfournie has quit IRC | 11:26 | |
jistr | mandre: hm right, there's not a single recent upgrade job success... | 11:26 |
jistr | mandre: and mainly thank *you* for those patches! :) | 11:26 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-validations master: Add new SELinux validation check https://review.openstack.org/511211 | 11:26 |
rook | shardy_lunch i don't see how the ddos of the db could cause this | 11:28 |
rook | we did see an additional CPU WAIT time of 2% | 11:28 |
rook | but i don't see how that would impact the scale | 11:28 |
rook | i will look once I get back to add more information | 11:28 |
*** morazi has joined #tripleo | 11:29 | |
*** mburned is now known as mburned_out | 11:31 | |
*** mburned_out is now known as mburned | 11:33 | |
lvdombrkr | folks, has anyone seen this error when trying to update an existing overcloud? http://paste.openstack.org/raw/623307/ | 11:36 |
jistr | mandre: interesting -- the upgrade job on https://review.openstack.org/#/c/501872/ went *much* further than on other patches currently proposed | 11:37 |
jistr | mandre: we're fixing things by accident, that's a new level | 11:37 |
*** ratailor__ has joined #tripleo | 11:38 | |
*** ffiore has quit IRC | 11:38 | |
*** Shatadru is now known as Shatadru|Gone | 11:38 | |
jistr | everything else on http://tripleo.org/cistatus.html fails around the 2 hour mark with some dbus permission issue, it looks like. I'm not even sure if it's worth reporting and investigating given that the problem might disappear shortly... | 11:40 |
*** shreshtha has quit IRC | 11:41 | |
*** ratailor_ has quit IRC | 11:41 | |
*** jpena is now known as jpena|lunch | 11:42 | |
*** spectr has quit IRC | 11:44 | |
*** skramaja has quit IRC | 11:44 | |
*** spectr has joined #tripleo | 11:46 | |
*** ffiore has joined #tripleo | 11:47 | |
*** ratailor__ has quit IRC | 11:49 | |
*** pdeore has joined #tripleo | 11:49 | |
*** pdeore has quit IRC | 11:50 | |
*** pdeore has joined #tripleo | 11:51 | |
*** ffiore has quit IRC | 11:53 | |
*** fzdarsky_ has quit IRC | 11:54 | |
*** spectr-RH has joined #tripleo | 11:54 | |
*** fzdarsky has quit IRC | 11:54 | |
*** fzdarsky_ has joined #tripleo | 11:56 | |
*** fzdarsky has joined #tripleo | 11:56 | |
mwhahaha | ugh looks like v3 ovb jobs are failing | 11:57 |
*** athomas has joined #tripleo | 11:57 | |
*** spectr has quit IRC | 11:57 | |
*** jaganathan has quit IRC | 11:57 | |
mwhahaha | sshnaidm: do you know why the v3 ovb jobs are failing or do we have a bug for it? | 11:58 |
sshnaidm | well first failures of v3 are coming.. | 11:58 |
sshnaidm | mwhahaha, "rm: cannot remove ‘/home/zuul/cache/files/cirros-0.3.5-x86_64-disk.vhd.tgz’: Permission denied" | 11:58 |
*** jcoufal has joined #tripleo | 11:58 | |
sshnaidm | mwhahaha, no, it's something new | 11:59 |
mwhahaha | :( | 11:59 |
sshnaidm | mwhahaha, I think related patch was merged just yesterday in infra, about cache dir.. | 12:00 |
*** amoralej is now known as amoralej|lunch | 12:00 | |
mwhahaha | k might want to let them know | 12:00 |
*** lucas-hungry is now known as lucasagomes | 12:01 | |
*** bfournie has joined #tripleo | 12:01 | |
*** dprince has joined #tripleo | 12:03 | |
sshnaidm | mwhahaha, yeah, we have an etherpad: https://etherpad.openstack.org/p/zuulv3-issues reported there | 12:03 |
*** abishop has joined #tripleo | 12:03 | |
mwhahaha | sounds good | 12:04 |
sshnaidm | seems like it's not for us only | 12:04 |
*** jlabarre has joined #tripleo | 12:04 | |
sshnaidm | mwhahaha, EmilienM btw, v3 jobs are here: http://cistatus.tripleo.org:8000/ | 12:05 |
*** pdeore has quit IRC | 12:05 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: EARLY WIP: Convert tags to when statements for Q major upgrade workflow https://review.openstack.org/510902 | 12:09 |
*** fzdarsky has quit IRC | 12:10 | |
*** fzdarsky_ has quit IRC | 12:10 | |
*** itlinux has quit IRC | 12:10 | |
*** shardy_lunch is now known as shardy | 12:11 | |
*** ffiore has joined #tripleo | 12:11 | |
*** spectr-RH has quit IRC | 12:13 | |
*** spectr has joined #tripleo | 12:13 | |
*** itlinux has joined #tripleo | 12:14 | |
*** spectr-RH has joined #tripleo | 12:14 | |
*** panda|off is now known as panda | 12:15 | |
jaosorior | mandre, shardy: Is there a way to run an ansible task immediately after a container is spawned? | 12:16 |
*** spectr has quit IRC | 12:18 | |
*** openstackgerrit has quit IRC | 12:18 | |
*** ebarrera_ has joined #tripleo | 12:19 | |
*** openstackgerrit has joined #tripleo | 12:20 | |
openstackgerrit | Michele Baldessari proposed openstack/puppet-pacemaker master: Implement stonith levels resource https://review.openstack.org/508678 | 12:20 |
*** aditya_r has joined #tripleo | 12:21 | |
*** aputtur has joined #tripleo | 12:26 | |
*** tzumainn has joined #tripleo | 12:26 | |
jaosorior | jistr: the containerized undercloud currently doesn't support SSL, or does it? | 12:27 |
shardy | jaosorior: you can run bootstrapping tasks via docker_config which could run a command/script but it's not integrated to run ansible tasks specifically atm | 12:27 |
shardy | jaosorior: the ordering is controlled via the steps, so you could launch the container in step3 then do some docker_config in step4 | 12:28 |
shardy | jaosorior: there is also docker_puppet_tasks which does bootstrapping operations (only on the bootstrap host) via puppet | 12:28 |
*** rlandy has joined #tripleo | 12:28 | |
jaosorior | shardy: I would need to run a task on the host that runs the container :/ | 12:29 |
*** trown|outtypewww is now known as trown | 12:29 | |
jaosorior | shardy: so this is the case: Currently we use certmonger to auto-generate certificates for the undercloud (when you set generate_service_certificate to True). I was trying to do something similar for the containerized undercloud, which would use a containerized certmonger. | 12:31 |
shardy | jaosorior: can you expand on why? The normal pattern is to run tasks in a temporary init container | 12:31 |
shardy | jaosorior: as we can no longer be sure any dependencies exist on the host | 12:31 |
thrash | weshay: remember that issue I was telling you about quickstart where a tag occurs "after" my patch, and devmode won't install the dlrn package with my patch in it? | 12:31 |
thrash | weshay: trown https://bugs.launchpad.net/tripleo/+bug/1722783 | 12:31 |
openstack | Launchpad bug 1722783 in tripleo "devmode Quickstart does not install built package" [Medium,Triaged] | 12:31 |
*** ecerquei has joined #tripleo | 12:31 | |
jaosorior | Now, while we can run certmonger containerized, and request certificates with it, this wouldn't be very useful, as the CA certificate needs to be trusted by the host, else we won't be able to use that certificate | 12:32 |
thrash | weshay: trown I've hit it again. | 12:32 |
shardy | jaosorior: so can you spin up a container via docker_config with some directory bind mounted from the host? | 12:32 |
weshay | thrash, ok.. thanks for opening a bug | 12:32 |
weshay | we'll poke at it | 12:32 |
jaosorior | shardy: sure! but then I need to run a command to actually get the host to trust that cert | 12:32 |
*** sshnaidm is now known as sshnaidm|mtg | 12:32 | |
shardy | jaosorior: Ok so a command needs to run on the host, or just some config data be written? | 12:32 |
thrash | weshay: sure thing | 12:32 |
shardy | jaosorior: ack, I'm just not familiar with the command, what does it change, and can that be done via a bind mount? | 12:33 |
jaosorior | shardy: yep, a command; namely update-ca-trust | 12:33 |
*** liverpooler has joined #tripleo | 12:33 | |
jaosorior | shardy: it grabs whatever CA certificates we add to /etc/pki/ca-trust/source/anchors/ and appends it to the trusted bundle. | 12:33 |
jaosorior | shardy: if we don't do that, we're gonna have to specify the CA certificate for every command we do. | 12:34 |
jistr | but the bundle is also just a file in /etc isn't it? | 12:34 |
jaosorior | right | 12:34 |
jaosorior | oh | 12:35 |
*** itlinux has quit IRC | 12:35 | |
*** jpena|lunch is now known as jpena | 12:35 | |
shardy | yeah so we just bind the ca bundle file or etc dir into the bootstrapping container | 12:35 |
shardy | and run that via docker_config? | 12:35 |
jaosorior | jistr: so you're suggesting I could bind-mount in r/w mode /etc/pki/ca-trust and run the task | 12:35 |
*** itlinux has joined #tripleo | 12:35 | |
jistr | yea | 12:35 |
jaosorior | riight | 12:35 |
jaosorior | well yeah, I guess that could work | 12:35 |
shardy | not super clean but I think that may work | 12:35 |
*** pchavva has joined #tripleo | 12:35 | |
jaosorior | ok | 12:36 |
jaosorior | sounds like a way forward | 12:36 |
matbu | shardy: o/ hey, do you have any idea why the tests are failing there: http://logs.openstack.org/88/487488/31/check/gate-python-tripleoclient-python27-ubuntu-xenial/9e8817c/console.html#_2017-10-11_06_40_56_616364 | 12:36 |
jaosorior | shardy, jistr: thanks | 12:36 |
jistr | jaosorior, larsks, jbadiapa: btw just finished reviewing the logging spec and i love it :) That's exactly what we need for k8s. Can't wait to be able to access all logs centrally just via `kubectl logs`. | 12:37 |
*** aputtur has quit IRC | 12:37 | |
jaosorior | jistr: yay :D | 12:37 |
jaosorior | jistr: thanks for taking a look | 12:38 |
shardy | http://logs.openstack.org/88/487488/31/check/gate-python-tripleoclient-python27-ubuntu-xenial/9e8817c/console.html#_2017-10-11_06_40_58_999041 | 12:38 |
shardy | matbu: looks like an out of memory problem? | 12:38 |
shardy | o/ btw :) | 12:39 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-validations stable/pike: Update services in the process count validation https://review.openstack.org/511220 | 12:39 |
matbu | shardy: yes, looks like it, but that's weird | 12:39 |
aputtur_ | aputtur | 12:39 |
openstackgerrit | Dougal Matthews proposed openstack/tripleo-common master: Migrate to the new Mistral context class https://review.openstack.org/506186 | 12:42 |
openstackgerrit | Dmitry Tantsur proposed openstack/tripleo-docs master: Rework documentation for ironic in the overcloud https://review.openstack.org/510867 | 12:43 |
*** catintheroof has joined #tripleo | 12:43 | |
jaosorior | mwhahaha: hey, if you have some time could you give this a read https://review.openstack.org/#/c/510001/ ? | 12:44 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Validate roles data and network data https://review.openstack.org/508567 | 12:45 |
mwhahaha | jaosorior: yea it's on my list today | 12:45 |
*** rodolof has joined #tripleo | 12:45 | |
jaosorior | mwhahaha: thanks | 12:45 |
*** jcoufal has quit IRC | 12:47 | |
shardy | matbu: perhaps try running the unit tests locally and monitor the memory usage? | 12:47 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Set NeutronGlobalPhysnetMtu=1350 for ovb based deployment https://review.openstack.org/506152 | 12:47 |
*** jcoufal has joined #tripleo | 12:48 | |
lvdombrkr | folks, who is using pacemaker and dvr at the same time? | 12:48 |
openstackgerrit | Merged openstack/instack-undercloud master: Updated from global requirements https://review.openstack.org/510309 | 12:48 |
*** jmelvin has joined #tripleo | 12:49 | |
*** hewbrocca is now known as hewbrocca_afk | 12:49 | |
*** fragatina has joined #tripleo | 12:51 | |
*** hewbrocca_afk is now known as hewbrocca | 12:52 | |
*** amoralej|lunch is now known as amoralej | 12:55 | |
*** ykarel has quit IRC | 12:56 | |
*** ykarel has joined #tripleo | 12:56 | |
EmilienM | hello | 12:58 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: Add post_upgrade_tasks and upgrade_batch_tasks to deploy output https://review.openstack.org/511228 | 12:58 |
EmilienM | sshnaidm|mtg: ok | 12:58 |
openstackgerrit | Tim Rozet proposed openstack/tripleo-heat-templates stable/pike: Fixes dynamic networks falling back to ctlplane https://review.openstack.org/511005 | 12:59 |
*** toure_biab is now known as toure | 13:00 | |
*** ebarrera has quit IRC | 13:01 | |
*** ebarrera has joined #tripleo | 13:02 | |
*** openstackgerrit has quit IRC | 13:03 | |
*** dpawar has quit IRC | 13:05 | |
-openstackstatus- NOTICE: Due to unrelated emergencies, the Zuul v3 rollout has not started yet; stay tuned for further updates | 13:07 | |
*** dpawar has joined #tripleo | 13:07 | |
*** bobh has joined #tripleo | 13:07 | |
*** nyechiel_ has joined #tripleo | 13:09 | |
gfidente | mwhahaha fultonj regarding my comment in https://bugs.launchpad.net/tripleo/+bug/1722633/comments/3 | 13:09 |
openstack | Launchpad bug 1722633 in tripleo "Puppet Duplicate declaration error when role is composed with both Ceph OSDs and Mons" [Low,Triaged] - Assigned to John Fulton (jfulton-org) | 13:09 |
gfidente | I am just not sure there is an easy way to figure out if the ::mon profile is enabled at puppet compile time | 13:09 |
*** sdake has quit IRC | 13:11 | |
*** sdake has joined #tripleo | 13:11 | |
*** sdake has joined #tripleo | 13:11 | |
*** ebarrera_ has quit IRC | 13:14 | |
*** openstackgerrit has joined #tripleo | 13:14 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/paunch master: Add option to configure uts namespace https://review.openstack.org/511234 | 13:15 |
*** dbecker has joined #tripleo | 13:15 | |
*** sdake has quit IRC | 13:15 | |
*** ebarrera has quit IRC | 13:15 | |
*** ebarrera has joined #tripleo | 13:17 | |
rook | shardy: so updating the cloud /scaling up w/o convergence worked :/ | 13:17 |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates master: Set metric procssing delay for metricd https://review.openstack.org/511236 | 13:17 |
shardy | rook: ack, any more clues from the logs? I'm still not really clear how convergence can cause errors in nova | 13:17 |
rook | looking now shardy | 13:17 |
*** nyechiel_ has quit IRC | 13:18 | |
*** sdake has joined #tripleo | 13:19 | |
*** sdake has quit IRC | 13:19 | |
*** sdake has joined #tripleo | 13:19 | |
openstackgerrit | Thomas Herve proposed openstack/instack-undercloud master: Handle utf-8 when running python commands https://review.openstack.org/511238 | 13:23 |
rook | shardy so Nova -> Neutron issues it seems... With Convergence does heat begin to query things more often? https://gist.github.com/jtaleric/e1ff2d3579dd902719c6d5838c26e4bd | 13:25 |
*** marrusl has quit IRC | 13:31 | |
*** cdearborn has joined #tripleo | 13:35 | |
ykarel | slagle, hi | 13:35 |
*** tosky_ has joined #tripleo | 13:36 | |
*** itlinux has quit IRC | 13:36 | |
*** aditya_r has quit IRC | 13:37 | |
*** aditya_r has joined #tripleo | 13:38 | |
*** tosky has quit IRC | 13:38 | |
*** jistr is now known as jistr|mtg | 13:38 | |
*** pblaho has quit IRC | 13:38 | |
*** aditya_r has quit IRC | 13:40 | |
*** pblaho has joined #tripleo | 13:41 | |
slagle | ykarel: hello | 13:41 |
ykarel | slagle, actually i was trying trass, | 13:42 |
ykarel | and it was failing and now i got that it was because of default: tripleo_ci_remote | 13:42 |
*** lblanchard has joined #tripleo | 13:43 | |
ykarel | so will try with openstack-infra/tripleo-ci | 13:43 |
*** udesale has quit IRC | 13:43 | |
ykarel | actually this patch https://review.openstack.org/#/c/510839/ is not there in slagle/tripleo-ci that's why it was failing | 13:43 |
slagle | ykarel: how was it failing? the default value for the tripleo_ci_remote parameter should just work | 13:43 |
slagle | ykarel: oh i see. i may need to rebase my fork | 13:44 |
ykarel | slagle, yes and in trass branch as that's the default | 13:44 |
Tengu | uhu. I have a *doc* build that fails. I think I won't do anything for now, right? :) | 13:44 |
slagle | ykarel: the reason i have the fork is because this patch is needed: https://review.openstack.org/#/c/490032/ | 13:44 |
slagle | EmilienM: could you have another look at https://review.openstack.org/#/c/490032/ ? | 13:45 |
ykarel | slagle, Ok | 13:45 |
EmilienM | slagle: of course. Do you mind if I rebase it and make sure CI pass well, and also re-verify from where package is deployed? | 13:46 |
slagle | EmilienM: i think i need to update the patch anyway because my history undo command there could cause a problem | 13:46 |
EmilienM | slagle: yeah | 13:46 |
*** mdnadeem has quit IRC | 13:46 | |
*** paramite has quit IRC | 13:47 | |
rook | shardy nothing in the neutron logs. | 13:48 |
rook | shardy: i wonder if haproxy is killing the connection? | 13:49 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Remove workaround for LP#1701239 bug https://review.openstack.org/511242 | 13:49 |
openstackgerrit | James Slagle proposed openstack-infra/tripleo-ci master: Enable repo for python-os-testr https://review.openstack.org/490032 | 13:49 |
slagle | EmilienM: there, the logic should be correct now. just need to confirm where the rpm is coming from | 13:49 |
EmilienM | slagle: the only problem I see is if we have a RDO outage, the job will fail because it doesn't use the AFS mirror | 13:53 |
*** mcornea has quit IRC | 13:53 | |
*** mcornea has joined #tripleo | 13:53 | |
EmilienM | why not run `generate-subunit $(date +%s) 10 fail pingtest | gzip - > $WORKSPACE/logs/testrepository.subunit.gz` later? | 13:53 |
*** ffiore has quit IRC | 13:53 | |
EmilienM | like once we have the repos in place (with AFS mirrors) | 13:54 |
slagle | EmilienM: not sure, i didn't add that part | 13:54 |
EmilienM | I'm not trying to be picky, but I just don't want us to do recheck because RDO server was down | 13:54 |
slagle | EmilienM: how do the AFS mirrors get configured? | 13:54 |
EmilienM | and we have elastic queries that prove it's the case very often | 13:54 |
*** udesale has joined #tripleo | 13:55 | |
EmilienM | slagle: in quickstart for the oooq jobs and in tripleo-ci for legacy jobs | 13:55 |
EmilienM | sshnaidm|mtg: you added this code, maybe we can discuss | 13:55 |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates master: Set host name explicitly for telemetry https://review.openstack.org/509199 | 13:55 |
sshnaidm|mtg | EmilienM, on mtg now, is it urgent? | 13:55 |
slagle | EmilienM: really, we shouldn't be using this openstack package (ostestr) until the repos have been setup | 13:56 |
*** links has quit IRC | 13:56 | |
EmilienM | sshnaidm|mtg: no | 13:56 |
*** janki has quit IRC | 13:56 | |
EmilienM | slagle: I can't agree more | 13:56 |
sshnaidm|mtg | EmilienM, ok, I'll ping you then | 13:57 |
EmilienM | sshnaidm|mtg: ack | 13:57 |
EmilienM | slagle: we'll figure that out once Sagi is back - meantime I'll look where we can run that | 13:57 |
*** ffiore has joined #tripleo | 13:57 | |
*** paramite has joined #tripleo | 13:57 | |
*** Guest85006 is now known as sc | 13:59 | |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart-extras master: GATE CHECK for quickstart-extras https://review.openstack.org/472607 | 14:00 |
*** jistr|mtg is now known as jistr | 14:00 | |
openstackgerrit | Adriano Petrich proposed openstack/tripleo-common master: WIP validate parameters before updating https://review.openstack.org/511249 | 14:00 |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Add pre-deployment negative tests for validations https://review.openstack.org/488495 | 14:00 |
*** leitan has joined #tripleo | 14:00 | |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Add post-deployment negative tests for validations https://review.openstack.org/504014 | 14:00 |
openstackgerrit | Andreas Scheuring proposed openstack/diskimage-builder master: Fix rendering issues in DIB "building_an_image" doc https://review.openstack.org/511250 | 14:02 |
openstackgerrit | Jason E. Rist proposed openstack/tripleo-quickstart master: SSH Tunnel wrongly including http/https https://review.openstack.org/511251 | 14:02 |
*** dpawar has quit IRC | 14:03 | |
openstackgerrit | Andy Smith proposed openstack/puppet-tripleo master: WIP: Updates to separate messaging backends alternative https://review.openstack.org/510684 | 14:03 |
*** spectr-RH has quit IRC | 14:04 | |
*** chlong has joined #tripleo | 14:04 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: overcloud-deploy: add config-download + ansible run feature https://review.openstack.org/508306 | 14:04 |
*** spectr has joined #tripleo | 14:04 | |
*** lblanchard1 has joined #tripleo | 14:05 | |
EmilienM | slagle: please review https://review.openstack.org/#/c/508306/19..20 | 14:05 |
EmilienM | slagle: I forgot something in the commit message | 14:05 |
*** lblanchard has quit IRC | 14:06 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart-extras master: overcloud-deploy: add config-download + ansible run feature https://review.openstack.org/508306 | 14:07 |
*** tosky_ is now known as tosky | 14:10 | |
openstackgerrit | Andy Smith proposed openstack/puppet-tripleo master: WIP: Updates to separate messaging backends alternative https://review.openstack.org/510684 | 14:11 |
*** dpawar has joined #tripleo | 14:12 | |
*** aditya_r has joined #tripleo | 14:12 | |
*** gbarros has joined #tripleo | 14:12 | |
*** rajinir-afk is now known as rajinir | 14:14 | |
EmilienM | mwhahaha: have we found a way to track new timeouts on ovb-ha? | 14:14 |
EmilienM | the query doesn't seem to work | 14:14 |
EmilienM | I remember we had a query for that | 14:14 |
mwhahaha | there's a generic timeout one | 14:14 |
mwhahaha | that one? | 14:14 |
*** artom has joined #tripleo | 14:14 | |
EmilienM | yeah | 14:15 |
EmilienM | but it doesn't work on some jobs | 14:15 |
mwhahaha | it should | 14:15 |
mwhahaha | maybe elastic check is lagged | 14:15 |
mwhahaha | i saw it pop up a few times recently | 14:15 |
EmilienM | ok I'll investigate a bit | 14:16 |
EmilienM | mwhahaha: this one? Bug 1686542 - Generic job timeout bug | 14:16 |
openstack | bug 1686542 in OpenStack-Gate "Generic job timeout bug" [Low,Confirmed] https://launchpad.net/bugs/1686542 | 14:16 |
mwhahaha | yea | 14:16 |
*** gbarros has quit IRC | 14:17 | |
EmilienM | ouch indeed | 14:17 |
dalvarez | egonzalez, ping | 14:17 |
egonzalez | dalvarez, sup | 14:17 |
*** gbarros has joined #tripleo | 14:17 | |
dalvarez | egonzalez, re. https://review.openstack.org/#/c/511225/ i'd like to fix the sources thing but i'm super new to kolla so im unsure about how to do it... | 14:18 |
dalvarez | egonzalez, i took a look at: https://github.com/openstack/kolla/blob/master/docker/neutron/neutron-server-ovn/Dockerfile.j2 so maybe i could use its {% elif install_type == 'source' %} block? | 14:18 |
EmilienM | mwhahaha: stats aren't good, 28 timeouts in 3h | 14:18 |
mwhahaha | i wonder if it's because of increased load from v2/v3 | 14:19 |
EmilienM | mwhahaha: zuul or keystone? | 14:19 |
mwhahaha | zuul | 14:20 |
mwhahaha | since we're 2x the ovb jobs | 14:21 |
mwhahaha | though i thought we had an upper limit of total running ovb jobs | 14:21 |
* mwhahaha shrugs | 14:21 | |
mwhahaha | what does bnemec say :D | 14:21 |
EmilienM | mwhahaha: the queue should be bigger but not the load, iiuc | 14:21 |
egonzalez | dalvarez, more like this example https://review.openstack.org/#/c/454745/14/docker/blazar/blazar-base/Dockerfile.j2 server-ovn uses plugins feature | 14:22 |
EmilienM | slagle: any thoughts on the way we deploy "jq" in https://review.openstack.org/#/c/508189/11/tripleo_common/templates/deployments.yaml ? can you think of a better way? | 14:22 |
egonzalez | dalvarez, and https://review.openstack.org/#/c/454745/14/kolla/common/config.py to add the source tarball | 14:22 |
openstackgerrit | mathieu bultel proposed openstack/tripleo-common master: WIP - POC on callback plugin for ansible to collect result https://review.openstack.org/511254 | 14:22 |
bnemec | mwhahaha: I think we're still only running 70 jobs at once between 2 and 3 though, which is what we had before. | 14:23 |
dalvarez | egonzalez, python-networking-ovn-metadata-agent is a subpackage of python-networking-ovn. Also neutron-server-ovn is installed from python-networking-ovn so that's why i thought I had to use it | 14:24 |
EmilienM | mwhahaha, bnemec: https://goo.gl/UkPEs6 | 14:24 |
bnemec | They split them 55/15 during the rollback. | 14:24 |
slagle | EmilienM: yes. we need a way to run some bootstrap style tasks | 14:24 |
slagle | EmilienM: i plan to add that in a subsequent patch | 14:24 |
EmilienM | bnemec, mwhahaha: it keeps being worse :( | 14:24 |
slagle | the first patch is already quite large | 14:24 |
EmilienM | slagle: I was thinking to add "jq" as a dependency of something we deploy on the overcloud | 14:25 |
egonzalez | dalvarez, source code for metadata-ovn comes in networking-ovn git | 14:25 |
dalvarez | egonzalez, exactly | 14:25 |
mwhahaha | EmilienM: plz don't hide the deps that way | 14:25 |
EmilienM | ok | 14:25 |
mwhahaha | it's better to be explicit in the playbook than wedge a dep on a random package that may or may not continue to be deployed | 14:26 |
EmilienM | ok | 14:26 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Validate roles data and network data https://review.openstack.org/508567 | 14:26 |
mwhahaha | remember we want to get rid of the images so let's think about creating a dep task to capture some of these software requirements | 14:26 |
slagle | we need to add some bootstrap tasks anyway to do things like install the initial heat hooks, etc | 14:26 |
mwhahaha | yes that :D | 14:26 |
slagle | so all that should just be done explicitly in an ansible task | 14:26 |
EmilienM | slagle: in tripleo-common, right? | 14:27 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Validate roles data and network data https://review.openstack.org/508567 | 14:27 |
slagle | EmilienM: yes | 14:27 |
*** paramite has quit IRC | 14:27 | |
EmilienM | (I just didn't want to see it in quickstart-extras) | 14:27 |
EmilienM | slagle: perfect | 14:27 |
bnemec | EmilienM: mwhahaha: I suppose it's possible convergence has increased the load to the point where we can't run as much at once, but the load on rh1 doesn't look that high. | 14:27 |
honza | Hello folks, this patch seems to be stuck in CI (unknown time left). Is there a way to kill it/reset it? https://review.openstack.org/#/c/505609/ | 14:27 |
mwhahaha | bnemec: i'd hate to revert the convergence switch, should we lower the limit on rh1 to like 65 jobs and see if that clears it up? | 14:28 |
EmilienM | bnemec, mwhahaha : so look at logstash: https://www.awesomescreenshot.com/image/2882878/02ab2173a93f284dcc6b64080220f4ba | 14:28 |
mwhahaha | honza: no need it's queued upstream | 14:29 |
mwhahaha | honza: you don't want to reset it | 14:29 |
bnemec | mwhahaha: We could try that. We did lose a couple of compute nodes in the reboot so it's possible that is a factor too, which would be helped by reducing the job concurrency. | 14:29 |
EmilienM | sorry that sounds stupid but we switched to heat convergence? | 14:29 |
mwhahaha | on the undercloud | 14:29 |
honza | mwhahaha: It says 'Remaining time unknown'. Should I ignore that? All things as they should be? | 14:29 |
EmilienM | o_O when? | 14:30 |
honza | mwhahaha: Moar patience required? | 14:30 |
mwhahaha | honza: click on it you'll see two jobs are queued (waiting on nodes most likely) | 14:30 |
honza | mwhahaha: ok, thanks | 14:30 |
EmilienM | ok bnemec did https://review.openstack.org/#/c/499283/ | 14:30 |
EmilienM | I didn't see it | 14:30 |
mwhahaha | 2 days ago | 14:30 |
mwhahaha | that's when it spiked the 2nd time | 14:30 |
mwhahaha | looks like we were timing out more starting on 10/1 | 14:31 |
mwhahaha | bnemec: when did you do the reboots? | 14:31 |
EmilienM | yeah and it doubled | 14:31 |
*** sadasu has joined #tripleo | 14:31 | |
*** udesale has quit IRC | 14:34 | |
*** dpawar has quit IRC | 14:36 | |
openstackgerrit | Thomas Herve proposed openstack/instack-undercloud master: Use transport_url for Heat https://review.openstack.org/511256 | 14:37 |
*** sid1 has quit IRC | 14:39 | |
gfidente | fultonj, don't pay attention marios always finds additional work for me | 14:40 |
EmilienM | mwhahaha: it sounds like we didn't have logs from 5 days ago | 14:40 |
EmilienM | mwhahaha: but timeouts started 5 days ago if we look at logstash | 14:40 |
mwhahaha | wat | 14:40 |
EmilienM | ah no, 10 days | 14:41 |
EmilienM | weird logstash paging | 14:41 |
EmilienM | I can't see hits after 6 days but the chart shows 10 days | 14:41 |
EmilienM | it started to be serious on October 3 | 14:42 |
bnemec | mwhahaha: According to my records, I finished the reboot on/around Sept. 26. | 14:43 |
*** spectr has quit IRC | 14:43 | |
fultonj | gfidente: ack | 14:45 |
marios | gfidente: fultonj i +2 man it was a _suggestion_ but really after this comment I think i have to revote. | 14:46 |
marios | sec | 14:46 |
*** dpawar has joined #tripleo | 14:46 | |
*** shardy has quit IRC | 14:47 | |
gfidente | marios oh I was joking | 14:50 |
fultonj | :) | 14:50 |
marios | gfidente: me too :) | 14:50 |
*** dpawar has quit IRC | 14:50 | |
gfidente | marios thanks for pointing it out | 14:51 |
gfidente | (you were joking too) | 14:51 |
gfidente | ahahahha | 14:51 |
marios | gfidente: did you just throw some kind of exception? | 14:51 |
gfidente | no I was actually writing some release notes | 14:51 |
gfidente | got distracted | 14:51 |
* marios stops harassing the faidentee | 14:52 | |
marios | though i should point out you started it | 14:52 |
gfidente | yes but you can take over the submission | 14:53 |
gfidente | ! | 14:53 |
EmilienM | so cute :) | 14:56 |
EmilienM | bnemec, mwhahaha : ok so what's the plan for this timeout? we need to take a look and see if indeed heat convergence has an effect. We could compare overcloud deployment time now and 10 days ago | 14:57 |
bnemec | I would love to, but I don't think we have those numbers in a consumable format anywhere these days. | 14:57 |
*** spectr has joined #tripleo | 14:57 | |
EmilienM | mwhahaha: re: https://review.openstack.org/#/c/508189/11/tripleo_common/templates/deployments.yaml - why would jq be a dep in the package feature in ansible? | 14:58 |
EmilienM | bnemec: we can look at the ansible logs though | 14:59 |
*** dbecker has quit IRC | 14:59 | |
EmilienM | bnemec: and see task run time | 14:59 |
EmilienM | bnemec: I'll take a look... | 14:59 |
mwhahaha | EmilienM: use proper ansible thank ansible == bash++ | 14:59 |
mwhahaha | s/thank/than | 14:59 |
* bnemec is going to start referring to Ansible as Bash++ now | 15:00 | |
EmilienM | I'm trying to have productive discussion here :) | 15:00 |
EmilienM | the timeouts are really bad, we need to do something | 15:01 |
mwhahaha | i'm not trolling | 15:01 |
mwhahaha | use proper ansible to do package installs | 15:01 |
*** cshastri has quit IRC | 15:01 | |
mwhahaha | as for CI, drop the capacity | 15:01 |
mwhahaha | step 1 | 15:01 |
mwhahaha | step 2 figure out why we had to do step 1 | 15:01 |
bnemec | I'm sort of trolling but also serious. The whole reason we had graphite stats was for debugging stuff like this. | 15:02 |
*** rhallisey has quit IRC | 15:02 | |
EmilienM | mwhahaha: I thought you were talking about the ansible logs, ok for package, it makes sense. slagle said we follow-up with jq in a later patch | 15:02 |
*** psachin has quit IRC | 15:02 | |
EmilienM | bnemec: we can still restore it | 15:02 |
EmilienM | someone has to do it | 15:03 |
mwhahaha | if only there was a team who should be focusing on ci | 15:03 |
* mwhahaha now starts trolling | 15:03 | |
* mwhahaha pokes weshay & company | 15:03 | |
mwhahaha | by the way, when are we cutting over to rdo cloud for ovb | 15:03 |
slagle | mwhahaha: EmilienM : i think what i will do is start an etherpad (with bugs) of stuff we need to clean up and improve for the ansible effort | 15:04 |
weshay | which timeout, upgrades? | 15:04 |
EmilienM | weshay: not only, ovb | 15:04 |
EmilienM | slagle: lgtm | 15:04 |
slagle | i can see a situation where this tripleo-common patch never gets merged b/c we can nitpick it to death | 15:04 |
weshay | ovb is migrating to rdo-cloud starting now | 15:04 |
weshay | we're walking through the planning on that as I type :) | 15:05 |
mwhahaha | weshay: we're getting timeouts on ovb, seems like it might be capacity related. was wondering when rh1->rdo cloud | 15:05 |
EmilienM | slagle: no, we'll merge it and iterate. | 15:05 |
weshay | mwhahaha, yup.. in planning and process now | 15:05 |
EmilienM | slagle: we want this thing in the gate asap | 15:05 |
mwhahaha | slagle, EmilienM: yea i don't want to nitpick it, i'm just pointing out that we shouldn't be doing that. it can be a follow up but a lot of times we lose track of those follow ups and they become set in stone | 15:05 |
*** chlong has quit IRC | 15:06 | |
mwhahaha | slagle, EmilienM: i'm all for iteration but we have to follow through on the clean up | 15:06 |
mwhahaha | our clean up track record is terrible | 15:06 |
*** yprokule has quit IRC | 15:06 | |
*** salmankhan has quit IRC | 15:07 | |
trown | mwhahaha: we will be sending an email to the ML after our planning meeting to lay out our plan for moving OVB jobs, but that is the focus of this sprint for the CI squad | 15:07 |
mwhahaha | also i wasn't -1 because of that either, but if we do have to touch it it would be good to switch out | 15:07 |
*** salmankhan has joined #tripleo | 15:07 | |
mwhahaha | trown: k sounds good | 15:07 |
EmilienM | mwhahaha: we will follow-up | 15:08 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Add a GetNetworksAction https://review.openstack.org/509419 | 15:08 |
EmilienM | slagle: can I start an etherpad or you're already on it? | 15:08 |
slagle | EmilienM: go4it | 15:08 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Add a Get Networks workflow https://review.openstack.org/509419 | 15:08 |
EmilienM | slagle: ok | 15:09 |
EmilienM | bnemec, mwhahaha: another option is to revert the convergence switch on the undercloud | 15:09 |
EmilienM | give me a reason not to do so | 15:09 |
mwhahaha | because it started before the convergence switch | 15:09 |
EmilienM | slagle: https://etherpad.openstack.org/p/tripleo-config-download | 15:09 |
mwhahaha | so it's not going to 100% fix it | 15:09 |
*** ebarrera_ has joined #tripleo | 15:09 | |
EmilienM | please look at logstash | 15:10 |
mwhahaha | i did | 15:10 |
EmilienM | it doubled the day we switched | 15:10 |
mwhahaha | and it started on 10/1 | 15:10 |
EmilienM | 85 hits in 12h | 15:10 |
mwhahaha | drop the capacity of rh1 | 15:11 |
*** links has joined #tripleo | 15:11 | |
* mwhahaha has 0 visibility into rh1's status | 15:11 | |
*** ebarrera has quit IRC | 15:11 | |
mwhahaha | i'm all for revert if it's a complete fix | 15:12 |
mwhahaha | for timeouts the revert is not one of those | 15:12 |
mwhahaha | it takes it from awful to slightly less awful | 15:12 |
*** aditya_r has quit IRC | 15:13 | |
*** aditya_r has joined #tripleo | 15:14 | |
bnemec | https://fedorapeople.org/~bnemec/CIMon3.png is the current load on rh1, for the record. | 15:14 |
bnemec | None of the nodes are running above 50% utilization and most are far below that. | 15:14 |
mwhahaha | network throughput maybe? | 15:14 |
*** aputtur has joined #tripleo | 15:15 | |
*** ebarrera has joined #tripleo | 15:15 | |
bnemec | Possibly. I haven't looked at that since we started doing containers. | 15:15 |
mwhahaha | the ha job is the one that's timing out more regularly i think | 15:15 |
* mwhahaha starts digging into logs | 15:15 | |
mwhahaha | so it's taking ~30 mins to get to the undercloud install | 15:17 |
bnemec | I wonder, did we lose the optimization I did to disable unused services on the undercloud? | 15:17 |
mwhahaha | i don't think so | 15:18 |
*** ebarrera_ has quit IRC | 15:18 | |
mwhahaha | we haven't really merged much in instack-undercloud | 15:18 |
mwhahaha | unless that was a quickstart change | 15:18 |
*** aditya_ra has joined #tripleo | 15:19 | |
*** aditya_r has quit IRC | 15:19 | |
mwhahaha | so it seems the modifying of the image is taking a decent chunk of time | 15:20 |
EmilienM | mwhahaha: have you compared tasks with a run 10 days ago for ex? | 15:20 |
mwhahaha | no just spot checking timings right now looking for the big ones | 15:21 |
*** tongl has joined #tripleo | 15:21 | |
mwhahaha | it's literally like 30 mins of prep around everything before we even start installing the undercloud | 15:21 |
mwhahaha | that seems crazy in general | 15:22 |
*** achadha has joined #tripleo | 15:22 | |
mwhahaha | well i can tell you that we're running a different ordering of stuff from ~2 weeks ago | 15:25 |
mwhahaha | or at least it seems this way | 15:25 |
* mwhahaha suppresses his hatred for ansible outputs | 15:26 | |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates master: Add global deployment tasks executed on undercloud https://review.openstack.org/510122 | 15:26 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates master: Deploy kubernetes using TripleO on the overcloud https://review.openstack.org/471759 | 15:26 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-heat-templates master: Support Kubespray installation via config download mechanism https://review.openstack.org/511272 | 15:26 |
openstackgerrit | Brad P. Crochet proposed openstack/tripleo-common master: Validate roles data and network data https://review.openstack.org/508567 | 15:26 |
* mwhahaha gives ARA a try | 15:26 | |
mwhahaha | so everything is taking slightly longer | 15:27 |
EmilienM | jistr, flaper87: do we still need https://review.openstack.org/#/c/483434/ ? | 15:27 |
*** aditya_ra has quit IRC | 15:28 | |
jistr | EmilienM: we don't have another way to test in CI, but right now i'm not pursuing testing in CI... It would be preferable to have it packaged. I hear bogdando pursued that with centos community, i gotta sync up on it with him. | 15:28 |
*** jaganathan has joined #tripleo | 15:29 | |
bnemec | mwhahaha: No, I mean from the pre-quickstart days. This: https://review.openstack.org/#/c/422785/ | 15:29 |
jistr | flaper87, gfidente, slagle, dprince: btw i successfully installed overcloud with external k8s installer using the config download + ansible-playbook mechanism https://review.openstack.org/#/c/511272/ | 15:29 |
bnemec | 4.5 minutes per job adds up quite quickly. | 15:30 |
EmilienM | jistr: it would be great to test in CI imho | 15:30 |
mwhahaha | bnemec: it's possible but I see that we're getting an image or something now that we weren't 2 weeks ago | 15:30 |
EmilienM | jistr: and good job! | 15:30 |
* mwhahaha is checking quickstart | 15:30 | |
jaosorior | bnemec: hey, could you give this a read? https://review.openstack.org/#/c/510001/ | 15:30 |
jaosorior | jistr: nice!!! | 15:30 |
*** aputtur has quit IRC | 15:30 | |
gfidente | jistr nice, what does the inventory look like? | 15:31 |
*** jlinkes has quit IRC | 15:31 | |
mwhahaha | bnemec, EmilienM: fetch-images : Get image is a new task that takes ~1.5 mins | 15:31 |
jistr | gfidente: see line 55 https://review.openstack.org/#/c/511272/1/extraconfig/services/kubernetes-master.yaml | 15:31 |
mwhahaha | bnemec, EmilienM http://logs.openstack.org/02/510902/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/4fc841a/logs/ara_oooq/ vs http://logs.openstack.org/83/499283/5/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/c95fb83/logs/ara_oooq/ | 15:31 |
jistr | gfidente: i reuse the outer inventory and essentially add aliases for what kubespray expects | 15:31 |
mwhahaha | bnemec, EmilienM: we're doing a bunch of image stuff we weren't doing 2 weeks ago | 15:33 |
EmilienM | oops | 15:33 |
mwhahaha | modify-image : make sure an image and script are provided | 15:33 |
*** dparkes has quit IRC | 15:33 | |
mwhahaha | that used to be true | 15:33 |
mwhahaha | is now false | 15:33 |
mwhahaha | EmilienM: so there's yer problem | 15:33 |
dprince | jistr: very nice on the kubespray patch | 15:34 |
*** mcornea has quit IRC | 15:34 | |
EmilienM | slagle: is https://review.openstack.org/#/c/505827 safe to land even if gate-tripleo-ci-centos-7-ovb-ha-oooq timeouts? i think yes | 15:35 |
EmilienM | mwhahaha: ok let me look at git history | 15:35 |
mwhahaha | EmilienM: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/modify-image/tasks/main.yml#L8 hasn't changed | 15:36 |
slagle | EmilienM: has it ever passed? | 15:36 |
mwhahaha | so it must be that config | 15:36 |
EmilienM | slagle: 6 days ago | 15:36 |
jistr | gfidente: i'm not sure if that's directly applicable to ceph-ansible too? But even if it weren't, there should be ways to introspect the outer Ansible inventory and generate the inner one without including the file itself, we could try and figure out whatever you need. | 15:36 |
slagle | EmilienM: actually, legacy-tripleo-ci-centos-7-ovb-ha-oooq passes | 15:37 |
slagle | EmilienM: so i think it's good | 15:37 |
*** chlong has joined #tripleo | 15:37 | |
slagle | aren't those jobs currently the same? | 15:37 |
mwhahaha | yea they are | 15:38 |
gfidente | jistr so ceph-ansible we used to have roles based on service_name | 15:38 |
EmilienM | mwhahaha: maybe https://review.openstack.org/#/c/508868/ ? | 15:38 |
gfidente | and collect the list of ips where such a role is installed | 15:38 |
EmilienM | slagle: they are same | 15:38 |
EmilienM | slagle: I'll approve | 15:38 |
openstackgerrit | Alan Bishop proposed openstack/tripleo-heat-templates master: Enable Cinder as a backend for Glance https://review.openstack.org/511275 | 15:38 |
mwhahaha | EmilienM: no i don't think so | 15:38 |
gfidente | jistr like http://paste.openstack.org/show/623365/ | 15:38 |
mwhahaha | EmilienM: did we bump ansible version? maybe it's that conditional | 15:39 |
*** rodolof has quit IRC | 15:39 | |
jistr | gfidente: we could generate that somehow i think, but it might be easiest to do what i did in the kubespray patch. tripleo-ansible-inventory script generates host groups based on the service_manifest names | 15:39 |
*** ebarrera has quit IRC | 15:40 | |
jistr | so you could do | 15:40 |
* jistr looking at code | 15:40 | |
*** aufi has quit IRC | 15:40 | |
jistr | https://github.com/openstack/tripleo-heat-templates/blob/master/puppet/services/ceph-mon.yaml#L105 | 15:40 |
jistr | [mons:children] | 15:40 |
jistr | ceph_mon | 15:40 |
jistr | and that's it IIUC | 15:40 |
jistr | sorta like an alias... :) | 15:41 |
EmilienM | mwhahaha: which conditional sorry? | 15:41 |
mwhahaha | EmilienM: modify-image : make sure an image and script are provided | 15:41 |
mwhahaha | EmilienM: https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/modify-image/tasks/main.yml#L8 | 15:41 |
mwhahaha | EmilienM: or we weren't previously updating the repos (a bug) and now we are (not a bug) | 15:42 |
openstackgerrit | Ricardo Noriega proposed openstack/tripleo-heat-templates master: Using stevedore alias for BGPVPN Service Plugin https://review.openstack.org/511277 | 15:42 |
EmilienM | mwhahaha: indeed, that could be a reason | 15:43 |
gfidente | jistr is ceph_mon collected by tripleo-inventory | 15:43 |
gfidente | I mean, what does it represent? | 15:43 |
gfidente | I am not familiar with what the outer ansible inventory looks like | 15:44 |
*** gkadam has quit IRC | 15:44 | |
EmilienM | trown, sshnaidm|mtg : please ping when you're back | 15:44 |
mwhahaha | actually let me see maybe i'm reading this wrong | 15:44 |
jistr | gfidente: yes it collects which services are on which roles, and generates an ansible host group for each service. here's a snippet http://paste.openstack.org/show/623368/ | 15:45 |
mwhahaha | so the cache image doesn't exist anymore | 15:46 |
mwhahaha | wrong conditional | 15:46 |
mwhahaha | fetch-images : Ensure image cache directory exists | 15:46 |
mwhahaha | which i think is triggering all the stuff | 15:47 |
mwhahaha | it's creating /home/jenkins/images-cache | 15:47 |
mwhahaha | that's what's triggering the fetch-images task | 15:47 |
jistr | gfidente, slagle, dprince: also i reused bogdando's logging approach (like we use for puppet) for the inner ansible run, so the logs are nicely readable within the outer playbook. Snippet: http://paste.openstack.org/show/623369/ | 15:47 |
mwhahaha | EmilienM: http://logs.openstack.org/99/511199/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/5dc995f/logs/ara_oooq/result/15f97876-4230-4dca-b6f9-0a6fd9b0dede/ | 15:49 |
mwhahaha | EmilienM: http://logs.openstack.org/02/510902/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/4fc841a/logs/ara_oooq/result/995c35ce-cbeb-4f58-96e9-a17538ce7262/ | 15:49 |
mwhahaha | instead of http://logs.openstack.org/83/499283/5/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/c95fb83/logs/ara_oooq/result/e8df6ac3-ac89-4f9a-990e-a4dc8a738102/ | 15:49 |
dprince | jistr: are there any controls with that for Debug, vs Info logging levels? | 15:49 |
*** thrash is now known as thrash|biab | 15:51 | |
*** chlong has quit IRC | 15:51 | |
jistr | dprince: not now i think (neither the inner ansible, nor puppet -- though i'm not sure about puppet). We could surely add a flag on the outer playbook and pass appropriate flag or set a variable on the inner one. The only limit is whatever the inner tool supports. | 15:51 |
mwhahaha | EmilienM, trown, sshnaidm|mtg: We're purging the image cache dir on ovb http://logs.openstack.org/02/510902/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/4fc841a/logs/ara_oooq/result/76fe6364-48ae-404b-a4f6-b1594c0418ba/ | 15:52 |
sshnaidm|mtg | mwhahaha, yes, so...? | 15:53 |
mwhahaha | sshnaidm|mtg: so every ovb run is rebuilding them | 15:53 |
numans | EmilienM, hi, FYI - the job gate-tripleo-ci-centos-7-scenario007-multinode-oooq-container-nv is passing in https://review.openstack.org/#/c/511199/ | 15:53 |
sshnaidm|mtg | mwhahaha, no | 15:53 |
mwhahaha | sshnaidm|mtg: yes | 15:53 |
sshnaidm|mtg | mwhahaha, no, we use cached images | 15:53 |
mwhahaha | no | 15:54 |
sshnaidm|mtg | yes | 15:54 |
mwhahaha | sshnaidm|mtg: please read scroll back | 15:54 |
*** ratailor has joined #tripleo | 15:54 | |
*** egonzalez has quit IRC | 15:54 | |
mwhahaha | sshnaidm|mtg: we are running the fetch-image tasks where 2 weeks ago we were not | 15:54 |
mwhahaha | this leads to longer ovb run times and timeouts | 15:54 |
sshnaidm|mtg | mwhahaha, if nothing was changed recently we use cached images | 15:55 |
mwhahaha | how often are we changing stuff in master? | 15:55 |
sshnaidm|mtg | mwhahaha, will check after mtg | 15:55 |
EmilienM | numans: AWESOME! why did you disable swift? can you explain in commit message? | 15:55 |
EmilienM | numans: or at least in comment in the review | 15:55 |
sshnaidm|mtg | mwhahaha, anyway removing image-cache is not related | 15:56 |
EmilienM | numans: also, please propose a patch to project-config to make it voting once this THT patch is merged | 15:56 |
mwhahaha | sshnaidm|mtg: then what's the problem then | 15:56 |
*** marios has quit IRC | 15:57 | |
*** pcaruana has quit IRC | 15:59 | |
*** rcernin has quit IRC | 15:59 | |
EmilienM | shardy, gfidente, jistr: i've been asked why skydive patches haven't been reviewed, https://review.openstack.org/#/c/502353/ and https://review.openstack.org/#/c/502350/ - I was wondering if you could help, at least to make sure it's ok to do the same workflow thing as we do for ceph-ansible | 16:00 |
sshnaidm|mtg | mwhahaha, do you have a link? | 16:01 |
EmilienM | gfidente, jistr: or if no, they should do another way | 16:01 |
mwhahaha | sshnaidm|mtg: wait why aren't we building the images anymore | 16:01 |
gfidente | EmilienM didn't we have this conversation already | 16:01 |
gfidente | and decided with slagle to postpone it until the new mechanism was in place? | 16:01 |
EmilienM | gfidente: yeah but the reviews didn't happen | 16:01 |
sshnaidm|mtg | mwhahaha, we build always on tripleo-common and instack-undercloud | 16:01 |
EmilienM | gfidente: we might want to write that in Gerrit | 16:01 |
EmilienM | safchain: ^ fyi | 16:02 |
*** rodolof has joined #tripleo | 16:02 | |
sshnaidm|mtg | mwhahaha, https://github.com/openstack-infra/tripleo-ci/blob/1e0099dc0beaee4bf6884fcd27820aab92492c1e/scripts/to_build#L16 | 16:02 |
*** lucasagomes is now known as lucas-afk | 16:02 | |
gfidente | EmilienM I remember asking slagle to do so because he suggested waiting | 16:02 |
sshnaidm|mtg | on these repos we always build | 16:02 |
mwhahaha | sshnaidm|mtg: it seems that the cached version is no faster than building the images in terms of timing | 16:02 |
mwhahaha | so let me go get a different ci run to compare against | 16:03 |
*** ratailor has quit IRC | 16:03 | |
safchain | EmilienM, gfidente thx if could have a status on the patches | 16:03 |
mwhahaha | sshnaidm|mtg: for the record i was comparing http://logs.openstack.org/99/511199/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/5dc995f/logs/ara_oooq/ vs http://logs.openstack.org/83/499283/5/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/c95fb83/logs/ara_oooq/ | 16:03 |
*** rodolof has quit IRC | 16:04 | |
*** rodolof has joined #tripleo | 16:04 | |
slagle | gfidente: EmilienM : all I had said was that I was going to block any patches | 16:05 |
slagle | err, wasn't | 16:05 |
sshnaidm|mtg | mwhahaha, 8 minutes to update the image, not sure it's a problem (first link) | 16:05 |
EmilienM | I think these guys just want to know what happens | 16:05 |
slagle | i honestly don't know what the urgency around skydive was, so i had no plans to block anything | 16:05 |
mwhahaha | sshnaidm|mtg: so now i'm going to compare http://logs.openstack.org/02/510902/4/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/4fc841a/logs/ara_oooq/ against http://logs.openstack.org/39/502039/7/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha-oooq/f1b940d/logs/ara_oooq/ | 16:05 |
mwhahaha | sshnaidm|mtg: 8 mins is significant when we're constantly bumping up against the wall clock | 16:06 |
*** rodolof has quit IRC | 16:06 | |
sshnaidm|mtg | mwhahaha, it always was 10 | 16:06 |
*** ffiore has quit IRC | 16:06 | |
*** rhallisey has joined #tripleo | 16:06 | |
*** rodolof has joined #tripleo | 16:06 | |
*** milan has quit IRC | 16:07 | |
*** dbecker has joined #tripleo | 16:07 | |
*** rodolof has quit IRC | 16:08 | |
*** jpich has quit IRC | 16:08 | |
*** rodolof has joined #tripleo | 16:08 | |
mwhahaha | sshnaidm|mtg: yup, actually the image cache was longer previously. looks like the deploy is just taking long for some reason | 16:09 |
bnemec | We just got one of the compute nodes in rh1 back. | 16:10 |
mwhahaha | EmilienM, bnemec: so the undercloud install is about 2 mins longer and the image prep is about 2 mins longer, but the overall time to get to the deploy is the same. the deploy is taking longer | 16:10 |
mwhahaha | so it's something in the overcloud deployment that's going long | 16:10 |
bnemec | That gives us like 3.3% more capacity. :-) | 16:10 |
mwhahaha | 3.333333333333333333333333333333333333333% | 16:10 |
*** chlong has joined #tripleo | 16:10 | |
EmilienM | we've lost alex | 16:10 |
EmilienM | he was a nice guy | 16:11 |
*** itlinux has joined #tripleo | 16:11 | |
*** lebauce has quit IRC | 16:11 | |
trown | mostly | 16:11 |
mwhahaha | partially | 16:11 |
EmilienM | rarely | 16:12 |
trown | from canadian to french in 1 minute | 16:13 |
EmilienM | yeah :) | 16:14 |
pabelanger | 2017-10-11 15:50:47,205 INFO: Error: Evaluation Error: Error while evaluating a Function Call, Could not find class ::swift for centos-7-2-node-infracloud-vanilla-11334692 at /etc/puppet/manifests/puppet-stack-config.pp:329:1 on node centos-7-2-node-infracloud-vanilla-11334692 | 16:14 |
pabelanger | known issue? | 16:14 |
mwhahaha | no | 16:15 |
EmilienM | slagle, gfidente : I discussed with the skydive team and they want it asap and probably iterate with the new framework later - but right now they don't know what to do to make progress and kind of wait on us | 16:15 |
EmilienM | pabelanger: which change id? | 16:15 |
pabelanger | http://logs.openstack.org/43/497543/14/gate/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/592662a/logs/undercloud/home/jenkins/undercloud_install.log.txt.gz | 16:15 |
pabelanger | that is in gate | 16:15 |
pabelanger | which just caused a reset | 16:15 |
EmilienM | let me check the puppet error | 16:16 |
gfidente | slagle EmilienM ok so while on one hand it looks like we might have to redo it within the queens cycle | 16:16 |
EmilienM | safchain: ^ | 16:16 |
gfidente | if that helps testing we can merge it as is | 16:16 |
sshnaidm|mtg | mwhahaha, but I see downloading image took 5 mins, when it should take 1 maximum.. and it's suspicious | 16:16 |
*** sshnaidm|mtg is now known as sshnaidm | 16:16 | |
mwhahaha | sshnaidm: i saw 10 on the old job | 16:17 |
sshnaidm | mwhahaha, that's very bad | 16:17 |
mwhahaha | sshnaidm: it would be nice to get all the various prep pieces < 15 mins. cause we spend ~30 mins messing with images | 16:17 |
mwhahaha | which is significant in the CI | 16:17 |
mwhahaha | 30 mins for prep, 30mins for undercloud install and that leaves like ~1hour for the overcloud deploy | 16:18 |
mwhahaha | +/- various other things | 16:18 |
sshnaidm | mwhahaha, actually it was about 15 mins or less before | 16:18 |
mwhahaha | not sure what you mean by before | 16:18 |
mwhahaha | the test results from the end of sept the timing is the same as it is now | 16:19 |
sshnaidm | mwhahaha, when I counted it, month(?) ago | 16:19 |
itlinux | hello all.. does anyone have the puppet mani for ooo to push designate ? | 16:19 |
mwhahaha | pabelanger: EmilienM puppet-swift is missing from the rpm list | 16:19 |
mwhahaha | pabelanger: EmilienM: http://logs.openstack.org/43/497543/14/gate/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-puppet/592662a/logs/undercloud/var/log/extra/rpm-list.txt.gz | 16:19 |
sshnaidm | mwhahaha, 5 mins for closing qcow2 image! | 16:19 |
sshnaidm | mwhahaha, pabelanger that's vm disk performance sucks | 16:20 |
EmilienM | mwhahaha: oh wait | 16:20 |
mwhahaha | bnemec: -^ io issues maybe? | 16:20 |
mwhahaha | EmilienM: did you forget to add puppet-swift to the deps for puppet-tripleo | 16:20 |
EmilienM | oops | 16:20 |
EmilienM | fuck | 16:21 |
mwhahaha | not sure how that didn't fall out anywhere else | 16:21 |
mwhahaha | that seems odd | 16:21 |
pabelanger | sshnaidm: feel free to donate more resources | 16:21 |
mwhahaha | sshnaidm: this is on rh1 anyway so that's not upstream's fault. /me points to rdocloud migration questions from earlier | 16:21 |
EmilienM | mwhahaha: https://review.rdoproject.org/r/10112 | 16:22 |
EmilienM | mwhahaha: I'll double check we don't miss any other module now | 16:22 |
EmilienM | mwhahaha: it's weird, I copy-pasted from OPM | 16:22 |
*** lebauce has joined #tripleo | 16:22 | |
sshnaidm | mwhahaha, next 2 weeks: https://trello.com/c/wyUOPIhP/377-ovb-migration-to-rdo-cloud-and-related-work | 16:22 |
EmilienM | mwhahaha: but I obviously made a mistake | 16:22 |
bnemec | mwhahaha: Maybe? That would be weird though because rh1 is configured with unsafe caching and SSDs specifically to avoid io performance issues. | 16:22 |
sshnaidm | pabelanger, it seems more like disk problem, did something change recently? | 16:23 |
*** rodolof has quit IRC | 16:23 | |
sshnaidm | pabelanger, sorry, it was for bnemec | 16:23 |
safchain | gfidente, EmilienM in case something changed during the last few weeks we are going to retest it, otherwise that's ok for us to merge it, it will help to test | 16:23 |
bnemec | mwhahaha: Is it happening on more than one job? We do have one or two compute nodes that are missing SSDs, so it's possible that if a vm got scheduled to one of those it would be slow. | 16:23 |
bnemec | But even then it's usually not a problem because of the cache settings. | 16:24 |
mwhahaha | bnemec: the ha job more consistently | 16:24 |
akrivoka | jtomasek: do you remember if there's a bug open for the inconsistency issue in parameter names in tht (ObjectStorage vs SwiftStorage, etc)? | 16:24 |
mwhahaha | which is why i asked about io | 16:24 |
*** cylopez has quit IRC | 16:24 | |
mwhahaha | dat galera | 16:24 |
EmilienM | safchain: it works for me, I'll have a look at the patches again today | 16:24 |
mwhahaha | bnemec: i don't suppose there's a way to track nodepool vm to hypervisor? | 16:25 |
mwhahaha | tripleo-centos-7-tripleo-test-cloud-rh1-11332064 | 16:25 |
safchain | EmilienM, great thx | 16:25 |
*** mugsie has joined #tripleo | 16:26 | |
*** sadasu has quit IRC | 16:26 | |
EmilienM | mwhahaha: missed ovn as well | 16:27 |
bnemec | EmilienM: Maybe, but I don't think one or even a couple of bad computes would explain the failure rates we're seeing. | 16:27 |
EmilienM | mwhahaha: I'll patch and merge | 16:27 |
bnemec | But I need to run for a meeting like 10 minutes ago. | 16:27 |
bnemec | I'll be back in ~1 hour. | 16:27 |
*** shreshtha has joined #tripleo | 16:28 | |
*** yamahata has joined #tripleo | 16:30 | |
*** Goneri has joined #tripleo | 16:31 | |
sshnaidm | mwhahaha, btw, overcloud deployment was taking about 1 hour, if it's more, it's also a problem.. | 16:31 |
mwhahaha | yea but that could be io related | 16:31 |
sshnaidm | mwhahaha, it may be network problem, especially if downloads take so much time | 16:31 |
mwhahaha | which is why it happens more on the ha than the non-ha | 16:31 |
openstackgerrit | Giulio Fidente proposed openstack/tripleo-heat-templates master: Update CephPools format in the docker templates to fit ceph-ansible https://review.openstack.org/508859 | 16:32 |
sshnaidm | mwhahaha, yeah, either disk io or network | 16:32 |
*** weshay is now known as weshay|ruck | 16:32 | |
mwhahaha | if it was network i would assume it'd happen more consistently on all the jobs | 16:32 |
*** panda is now known as panda|rover | 16:32 | |
mwhahaha | since the non-ha job (as far as i've seen) is fine, but the scenario/ha job is timing out i think it's io | 16:33 |
*** dparkes has joined #tripleo | 16:33 | |
mwhahaha | it's taking ~10 mins longer | 16:33 |
sshnaidm | mwhahaha, which non-ha job? multinode? | 16:34 |
EmilienM | mwhahaha: sorry for that, human error - can you review https://review.rdoproject.org/r/#/c/10112/ again plz? | 16:34 |
mwhahaha | sshnaidm: containers | 16:35 |
mwhahaha | gate-tripleo-ci-centos-7-ovb-containers-oooq | 16:35 |
*** d0ugal has quit IRC | 16:35 | |
sshnaidm | yeah, it also had problems with timeouts, but less, as well as gate-tripleo-ci-centos-7-ovb-1ctlr_1comp_1ceph-featureset024 job | 16:36 |
EmilienM | mwhahaha: when ready, let's approve it | 16:36 |
mwhahaha | EmilienM: k done | 16:37 |
EmilienM | mwhahaha: we should have 72 modules, let me check | 16:38 |
sshnaidm | mwhahaha, all these problems started from this Monday, btw | 16:38 |
*** thrash|biab is now known as thrash | 16:38 | |
mwhahaha | sshnaidm: no we had timeouts starting on 10/1 | 16:39 |
mwhahaha | sshnaidm: it got worse starting on monday (probably related to the undercloud convergence switch) | 16:39 |
*** dtantsur is now known as dtantsur|afk | 16:39 | |
sshnaidm | mwhahaha, yeah, I see.. | 16:39 |
*** yog_ has quit IRC | 16:40 | |
*** dparkes has quit IRC | 16:40 | |
sshnaidm | mwhahaha, do you have a bug? | 16:40 |
mwhahaha | i don't think there is one yet | 16:40 |
sshnaidm | weshay|ruck, can we handle this issue? ^^ timeouts in ovb jobs | 16:40 |
sshnaidm | weshay|ruck, http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22setup%20script%20run%20by%20this%20job%20failed%20-%20exit%20code:%20143%5C%22%20AND%20build_name:*tripleo-ci-*%20AND%20tags:console%20AND%20voting:1%20AND%20build_status:FAILURE | 16:41 |
weshay|ruck | sshnaidm, with regards to this sprint? | 16:41 |
sshnaidm | weshay|ruck, not sure, it still should work until we move.. | 16:42 |
sshnaidm | weshay|ruck, seems like disk io or network problem actually | 16:42 |
sshnaidm | on rh1 | 16:42 |
weshay|ruck | sshnaidm, ya.. should be tracked in a bug | 16:42 |
weshay|ruck | sshnaidm, feel free to pick it up | 16:43 |
weshay|ruck | :) | 16:43 |
sshnaidm | :D | 16:43 |
sshnaidm | where is |rover ?? | 16:43 |
openstackgerrit | Andy Smith proposed openstack/puppet-tripleo master: WIP: Updates to separate messaging backends alternative https://review.openstack.org/510684 | 16:44 |
weshay|ruck | panda|rover, <---- | 16:44 |
mwhahaha | red rover, red rover send weshay over | 16:44 |
weshay|ruck | mwhahah ahahahhaha :) | 16:44 |
*** tzumainn has quit IRC | 16:45 | |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-common master: DO NOT MERGE - Add no-op change for containers update testing https://review.openstack.org/498898 | 16:45 |
weshay|ruck | that works anytime alex tells a joke | 16:45 |
openstackgerrit | Andy Smith proposed openstack/tripleo-heat-templates master: WIP Separate rpc and notify messaging backends https://review.openstack.org/507963 | 16:45 |
*** chlong has quit IRC | 16:48 | |
*** dprince has quit IRC | 16:48 | |
*** milan has joined #tripleo | 16:48 | |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: Fix ConfigDebug for puppet host runs https://review.openstack.org/511207 | 16:49 |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-quickstart-extras master: Add the option to run the container-check script https://review.openstack.org/501028 | 16:49 |
*** dprince has joined #tripleo | 16:50 | |
sshnaidm | weshay|ruck, https://bugs.launchpad.net/tripleo/+bug/1722864 | 16:51 |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 16:51 |
weshay|ruck | thanks | 16:51 |
*** sshnaidm is now known as sshnaidm|off | 16:52 | |
*** yog_ has joined #tripleo | 16:53 | |
*** links has quit IRC | 16:54 | |
*** achadha has quit IRC | 16:55 | |
honza | I'm getting errors in CI where some binary packages aren't installed. Is this a known issue? http://logs.openstack.org/09/505609/5/check/gate-tripleo-ui-nodejs6-npm-run-test/4b53880/console.html | 16:57 |
*** tzumainn has joined #tripleo | 16:57 | |
*** d0ugal has joined #tripleo | 16:58 | |
*** derekh has quit IRC | 16:58 | |
weshay|ruck | arxcruz, ok.. the doc looks good for the sprint end, please send to openstack-dev[tripleo] | 17:00 |
honza | What does 'see alerts' mean? | 17:01 |
EmilienM | honza: alerts are posted here on IRC by a bot | 17:03 |
EmilienM | honza: they represent launchpad bugs with "alert" tag, see https://bugs.launchpad.net/tripleo/+bugs?field.tag=alert | 17:03 |
EmilienM | they usually are critical issues in our CI | 17:03 |
honza | EmilienM: thanks! | 17:04 |
*** MaxPC has joined #tripleo | 17:06 | |
EmilienM | honza: sure, anytime | 17:06 |
*** paramite has joined #tripleo | 17:06 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: fs10: deploy steps with ansible https://review.openstack.org/508307 | 17:08 |
EmilienM | slagle: ^ this one is also ready for review plz | 17:09 |
*** jpena is now known as jpena|off | 17:09 | |
*** ooolpbot has joined #tripleo | 17:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 17:10 |
*** ooolpbot has quit IRC | 17:10 | |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 17:10 |
*** trown is now known as trown|lunch | 17:10 | |
*** marrusl has joined #tripleo | 17:11 | |
*** jfrancoa has quit IRC | 17:12 | |
EmilienM | honza: this is an alert ^ | 17:13 |
honza | nice | 17:13 |
*** tesseract has quit IRC | 17:15 | |
*** athomas has quit IRC | 17:15 | |
*** aditya_r has joined #tripleo | 17:16 | |
*** aditya_r has quit IRC | 17:17 | |
openstackgerrit | Ana Krivokapic proposed openstack/tripleo-validations master: Warn if there are not enough node IPs in pools https://review.openstack.org/511304 | 17:18 |
*** aditya_r has joined #tripleo | 17:18 | |
*** hewbrocca is now known as hewbrocca_afk | 17:20 | |
*** panda|rover is now known as panda|rover|off | 17:22 | |
slagle | EmilienM: we don't actually want to take over fs10 do we? | 17:24 |
EmilienM | trown|lunch, sshnaidm|off, mwhahaha: if some of you can review https://review.openstack.org/#/c/508306/ please | 17:24 |
EmilienM | slagle: oh ok | 17:24 |
slagle | i dunno | 17:24 |
EmilienM | slagle: so where should we take over? | 17:24 |
trozet | hi what happened to stable directory in https://images.rdoproject.org/pike/delorean/current-tripleo/ ? | 17:24 |
EmilienM | slagle: all? | 17:24 |
slagle | i guess i was thinking new job | 17:27 |
*** achadha has joined #tripleo | 17:27 | |
slagle | i dont know what job we'd get rid of | 17:27 |
*** links has joined #tripleo | 17:27 | |
*** ebarrera has joined #tripleo | 17:28 | |
EmilienM | slagle: on previous fs10, deploying the overcloud took 31 minutes and 10 seconds. On the new fs10 with config-download and ansible steps, overcloud deployment takes 33 minutes | 17:28 |
EmilienM | slagle: just fyi | 17:28 |
slagle | that's about what i'd expect | 17:29 |
EmilienM | slagle: but we aren't testing the mistral workflow yet (not written yet) | 17:30 |
EmilienM | slagle: I would keep an eye on these numbers and see | 17:30 |
*** achadha_ has joined #tripleo | 17:30 | |
EmilienM | slagle: you said "it might offset the agent sleep/poll times" - where can we see that? | 17:30 |
slagle | see what? | 17:31 |
slagle | i said it might | 17:31 |
*** achadha has quit IRC | 17:31 | |
slagle | you'd "see" it in the timing data | 17:31 |
*** achadha has joined #tripleo | 17:32 | |
weshay|ruck | trown|lunch, sshnaidm|off adarazs panda|rover|off https://review.openstack.org/#/c/509232/ | 17:32 |
EmilienM | slagle: of which tasks? | 17:32 |
slagle | EmilienM: i dont know what youre talking about | 17:32 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart-extras master: Always pass docker-ha env when passing docker env https://review.openstack.org/509605 | 17:32 |
*** achadha has quit IRC | 17:33 | |
*** achadha has joined #tripleo | 17:33 | |
weshay|ruck | trown|lunch, we also need to get your patch through https://review.openstack.org/#/c/509605/ | 17:33 |
EmilienM | slagle: I'm just asking which metrics can we monitor in these changes to see if there are improvements or not... that's all | 17:33 |
*** achadha has quit IRC | 17:33 | |
*** achadha_ has quit IRC | 17:34 | |
slagle | EmilienM: are we collecting any metrics in the ci jobs? | 17:34 |
slagle | EmilienM: if we're not, we can't see it | 17:34 |
*** achadha has joined #tripleo | 17:34 | |
EmilienM | slagle: ARA collects the length of Ansible tasks, that's all I'm aware of | 17:34 |
*** achadha has quit IRC | 17:34 | |
EmilienM | the duration of tasks | 17:34 |
*** tosky has quit IRC | 17:34 | |
dmsimard | I believe Sagi might have something to feed task duration from ara into graphite | 17:34 |
*** fragatina has quit IRC | 17:35 | |
EmilienM | https://review.openstack.org/#/c/480121/ | 17:35 |
dmsimard | https://review.openstack.org/#/c/480121/ | 17:35 |
dmsimard | EmilienM: wtf | 17:35 |
EmilienM | and https://review.openstack.org/#/c/479882/ | 17:36 |
EmilienM | slagle: so I guess it's wip to have metrics, at this point | 17:36 |
dmsimard | EmilienM: otherwise job-wide duration is available on grafana.o.o | 17:37 |
dmsimard | although jeblair is still ironing out some issues from v3 for that | 17:37 |
EmilienM | dmsimard: we're not interested in this data | 17:38 |
EmilienM | dmsimard: we want details of what happens | 17:38 |
*** ykarel is now known as ykarel|afk | 17:38 | |
EmilienM | exactly what sshnaidm is doing | 17:38 |
*** achadha has joined #tripleo | 17:38 | |
dmsimard | https://review.openstack.org/#/q/topic:zuulv3-statsd | 17:38 |
slagle | EmilienM: we'd have to be collecting metrics around exactly what we want to measure | 17:38 |
dmsimard | EmilienM: bah | 17:38 |
slagle | EmilienM: we aren't measuring ansible tasks, we should be measuring overcloud deploy times, for example | 17:39 |
EmilienM | slagle: re: config-download job - I'm not sure we want a new job | 17:39 |
*** fzdarsky has joined #tripleo | 17:39 | |
EmilienM | slagle: yes, it's exactly what sshnaidm is working on | 17:39 |
slagle | EmilienM: if it happens to correspond to 1 specific task, that might work | 17:39 |
EmilienM | or at least I think | 17:39 |
*** fzdarsky_ has joined #tripleo | 17:39 | |
dmsimard | another approach is that I believe tasks are pushed to firehose | 17:39 |
EmilienM | anyway, we could already discuss about what jobs we want to switch to use config-download | 17:40 |
*** itlinux has quit IRC | 17:40 | |
EmilienM | until we have a proper mistral workflow, I suggest we not switch all the jobs to it | 17:40 |
EmilienM | just maybe one for now is enough | 17:40 |
EmilienM | so we can continue testing the features and iterate | 17:40 |
EmilienM | slagle: what do you think? | 17:41 |
slagle | EmilienM: that's fine. just not sure which we can do without? | 17:41 |
trozet | weshay|ruck: do you know where stable directory went? https://images.rdoproject.org/pike/delorean/current-tripleo/ | 17:42 |
slagle | every job we have, we currently need | 17:42 |
dmsimard | EmilienM: fyi https://github.com/openstack-infra/system-config/blob/master/modules/openstack_project/files/puppetmaster/mqtt.py | 17:42 |
slagle | EmilienM: which one were you thinking we don't need coverage on any longer? | 17:42 |
EmilienM | dmsimard: uh? | 17:43 |
*** dparkes has joined #tripleo | 17:43 | |
*** suuuper has quit IRC | 17:43 | |
EmilienM | slagle: I proposed to switch fs010 (container-multinode) as a first iteration for now, as this job is run almost every time | 17:43 |
dmsimard | EmilienM: just saying the task data might also be available through mqtt but it doesn't look like we're using it for v3 yet | 17:44 |
slagle | EmilienM: ok, if you don't think we need coverage on that any longer | 17:44 |
weshay|ruck | trozet, kind of.. I didn't realize those paths were changing. However they have.. they are now using the same directory structure as delorean | 17:44 |
trozet | weshay|ruck: but ocata path still has stable | 17:45 |
EmilienM | slagle: can you explain what you mean by coverage? what coverage are you talking about? | 17:45 |
*** paramite has quit IRC | 17:45 | |
EmilienM | slagle: do you mean "classic overcloud deployment"? | 17:45 |
weshay|ruck | trozet, ya next time it promotes it will look like pike | 17:46 |
slagle | EmilienM: i mean what the job does today before the patch. that would no longer be covered (obviously) if it's changed to use config download | 17:46 |
trozet | weshay|ruck: ok sounds like it will stay this way. I'll change my patch and just remove stable. thanks | 17:46 |
slagle | EmilienM: so we've lost coverage on how we actually tell people to deploy containers | 17:47 |
slagle | EmilienM: what other jobs even deploy containers? does ovb? | 17:47 |
EmilienM | slagle: ok yeah, that's what I thought. So, I think we still have coverage on gate-tripleo-ci-centos-7-nonha-multinode-oooq for example but I agree this is not the exact same environment | 17:47 |
EmilienM | slagle: gate-tripleo-ci-centos-7-ovb-containers-oooq | 17:47 |
EmilienM | dprince: gate-tripleo-ci-centos-7-undercloud-containers-nv is green, you might want to propose a patch to make it voting, so we stop breaking it. | 17:48 |
EmilienM | slagle: we still have gate-tripleo-ci-centos-7-scenario00x-multinode-oooq-container, which runs quite often every day; if we break something I think we'll see it | 17:49 |
EmilienM | slagle: so having said that, I would go with fs010 | 17:49 |
slagle | EmilienM: yea, it wfm, as I said, if you don't think we need coverage on that any longer | 17:50 |
dprince | EmilienM: ack, thanks | 17:51 |
EmilienM | slagle: cool - so let's target fs010 until we have a mistral workflow | 17:53 |
EmilienM | slagle: once we have it, maybe we can switch all the other jobs? | 17:53 |
*** links has quit IRC | 17:53 | |
*** amoralej is now known as amoralej|off | 17:56 | |
*** tosky has joined #tripleo | 17:58 | |
*** fandrieu has quit IRC | 17:58 | |
*** jmelvin has quit IRC | 17:59 | |
slagle | EmilienM: yea, i think once we have the mistral workflow ... it's a more transparent change, almost an implementation detail | 17:59 |
EmilienM | ok | 18:00 |
openstackgerrit | Tim Rozet proposed openstack/tripleo-heat-templates stable/pike: Fix some missed hard-coded network references https://review.openstack.org/511317 | 18:00 |
trozet | weshay|ruck: i just realized...undercloud.qcow2 is missing | 18:02 |
trozet | weshay|ruck: from https://images.rdoproject.org/pike/delorean/current-tripleo/ | 18:02 |
*** hrybacki is now known as hrybacki|trainin | 18:03 | |
weshay|ruck | trozet, which config/release file are you using? | 18:04 |
trozet | weshay|ruck: im not using anything, just the url | 18:04 |
weshay|ruck | trozet, so you don't need the undercloud if you have this set, https://github.com/openstack/tripleo-quickstart/blob/master/config/release/tripleo-ci/pike.yml#L2 | 18:05 |
weshay|ruck | trozet, let me look at the test jobs to see if someone botched something w/ the default release files | 18:06 |
*** cylopez has joined #tripleo | 18:06 | |
trozet | weshay|ruck: im not using oooqs | 18:06 |
*** salmankhan has quit IRC | 18:06 | |
weshay|ruck | k right but you guys were sharing the undercloud | 18:07 |
weshay|ruck | so let me poke for a minute | 18:07 |
weshay|ruck | I think I know what's up.. trown|lunch ping me when you are back | 18:07 |
*** trown|lunch is now known as trown | 18:07 | |
trown | weshay|ruck: back | 18:07 |
weshay|ruck | trown, can you please jump on my blue | 18:08 |
trown | weshay|ruck: sure, supposed to have 1-1 in 20min just fyi | 18:08 |
weshay|ruck | trown, this won't take that long | 18:09 |
*** salmankhan has joined #tripleo | 18:09 | |
*** shreshtha_ has joined #tripleo | 18:10 | |
*** ooolpbot has joined #tripleo | 18:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 18:10 |
*** ooolpbot has quit IRC | 18:10 | |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 18:10 |
*** fzdarsky_ has quit IRC | 18:10 | |
openstackgerrit | Sai Sindhur Malleni proposed openstack/tripleo-heat-templates stable/newton: Bump fs.inotify.max_user_instances for scale https://review.openstack.org/509521 | 18:10 |
*** fzdarsky has quit IRC | 18:10 | |
*** shreshtha has quit IRC | 18:11 | |
*** sid1 has joined #tripleo | 18:12 | |
*** salmankhan has quit IRC | 18:13 | |
*** sid2 has joined #tripleo | 18:14 | |
*** sid1 has quit IRC | 18:15 | |
*** brault has joined #tripleo | 18:19 | |
*** artom_ has joined #tripleo | 18:20 | |
Tengu | jaosorior: for information, I'll try tomorrow to integrate the haproxy dynamic endpoint patch on the undercloud - I'll make a simple proxy for prometheus. Once this one is OK, I think it can be merged in master, then cherry-picked into stable/pike, and I'll provide another patch in order to support basic auth. | 18:20 |
*** pkovar has quit IRC | 18:23 | |
*** artom has quit IRC | 18:23 | |
*** brault has quit IRC | 18:23 | |
*** d0ugal has quit IRC | 18:24 | |
*** cylopez has quit IRC | 18:27 | |
*** yolanda has quit IRC | 18:28 | |
*** suuuper has joined #tripleo | 18:30 | |
*** suuuper has quit IRC | 18:30 | |
*** CaptTofu has quit IRC | 18:31 | |
*** jaganathan has quit IRC | 18:32 | |
*** jlabarre has quit IRC | 18:33 | |
*** aditya_r has quit IRC | 18:33 | |
*** vpickard is now known as vpickard_ | 18:34 | |
openstackgerrit | Tim Rozet proposed openstack/tripleo-heat-templates master: Fixes OpenDaylight deployments docker support https://review.openstack.org/511319 | 18:34 |
vkmc | hey guys, is the gate broken? | 18:36 |
*** tongl has quit IRC | 18:36 | |
weshay|ruck | trozet, I have to open a bug or two on this and start fixing it | 18:36 |
*** gfidente has quit IRC | 18:37 | |
trozet | weshay|ruck: ok cool. In the meantime is it possible to manually upload the missing file to the repo? | 18:37 |
weshay|ruck | trozet, we can ask dmsimard but we don't have the creds to the image server | 18:37 |
weshay|ruck | bit of a disconnect between ci and infra there | 18:38 |
trozet | weshay|ruck: ok. Can you include me on the bugs please? | 18:38 |
weshay|ruck | aye | 18:38 |
trozet | dmsimard: can you help me out? | 18:38 |
trozet | weshay|ruck: thanks | 18:38 |
mwhahaha | slagle, bnemec: When you get a minute can you review https://review.openstack.org/#/c/510233/ | 18:39 |
EmilienM | sshnaidm, trown : please review when you have time https://review.openstack.org/#/c/508306/ | 18:40 |
EmilienM | vkmc: we had some outage, should be back to normal. Do you have an example? | 18:40 |
vkmc | EmilienM, https://review.openstack.org/#/c/490974/2 | 18:41 |
*** dbecker has quit IRC | 18:41 | |
EmilienM | vkmc: October 5, yes you want to recheck. A bunch of things were fixed | 18:41 |
vkmc | EmilienM, very well, I triggered the recheck a few hours ago | 18:42 |
vkmc | waiting for the results then | 18:42 |
vkmc | thanks | 18:42 |
vkmc | :) | 18:42 |
EmilienM | cool | 18:43 |
dmsimard | trozet: reading, hang on | 18:45 |
dmsimard | trozet: upload what where ? | 18:45 |
*** rbrady has quit IRC | 18:45 | |
trozet | dmsimard: the missing undercloud.qcow2 file to https://images.rdoproject.org/pike/delorean/current-tripleo/ | 18:46 |
*** MaxPC has quit IRC | 18:48 | |
openstackgerrit | Alex Schultz proposed openstack/tripleo-specs master: Remove python3 squad https://review.openstack.org/511321 | 18:51 |
*** milan has quit IRC | 18:51 | |
*** rwsu has quit IRC | 18:51 | |
dmsimard | trozet: well, the file is definitely not there | 18:53 |
dmsimard | trozet: I can put a file there, but I need to know where it would be | 18:53 |
*** yolanda has joined #tripleo | 18:54 | |
dmsimard | weshay|ruck, trown ^ was there an issue with image upload or something ? | 18:54 |
*** aputtur has joined #tripleo | 18:54 | |
weshay|ruck | dmsimard, no.. we created a bug that trozet just hit | 18:54 |
weshay|ruck | and our jobs | 18:54 |
*** aputtur has quit IRC | 18:55 | |
weshay|ruck | dmsimard, we could do a one off and upload something for trozet but it's probably better to wait for the proper fix unless we're causing too much pain | 18:55 |
dmsimard | weshay|ruck: so there was no issue with the upload or the symlink but there's a missing file ? o_O | 18:55 |
weshay|ruck | I'm about to open a bug, details will be there | 18:56 |
dmsimard | weshay|ruck: just tell me what to do and I'll do it :p | 18:56 |
dmsimard | weshay|ruck: if you know what the hash before that one was, I can change the symlink | 18:56 |
weshay|ruck | ur fine right now, trozet was just asking if I could upload something for him | 18:56 |
* weshay|ruck goes to write a bug on the ci team | 18:56 | |
dmsimard | whatever works, we can either upload something there or revert to a prior hash | 18:57 |
dmsimard | ping me if you want to do something | 18:57 |
*** milan has joined #tripleo | 18:57 | |
trozet | weshay|ruck, dmsimard: yeah would be good if you can manually fix it for the time being. trying to test something and it kind of blocks me | 18:57 |
weshay|ruck | that will take a little time | 18:58 |
*** akrivoka has quit IRC | 18:58 | |
trown | dmsimard: could you give access to the image server and I can fix it ... and we can disable further promotes until we fix the bug weshay|ruck is filing? | 18:59 |
*** milan has quit IRC | 18:59 | |
trown | EmilienM: do we have any job actually showing that patch working? | 18:59 |
trown | EmilienM: a little hard to review that... seems fine in that it wont break anything, but pretty hard to say whether it works | 19:00 |
*** jlabarre has joined #tripleo | 19:01 | |
*** MaxPC has joined #tripleo | 19:04 | |
*** brault has joined #tripleo | 19:09 | |
*** ooolpbot has joined #tripleo | 19:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 19:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 19:10 |
*** ooolpbot has quit IRC | 19:10 | |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 19:10 |
*** brault has quit IRC | 19:13 | |
EmilienM | trown: yes, in https://review.openstack.org/#/c/508307/ | 19:21 |
*** brault has joined #tripleo | 19:22 | |
EmilienM | trown: fs010 is container-multinode | 19:26 |
*** milan has joined #tripleo | 19:34 | |
*** ykarel|afk has quit IRC | 19:50 | |
trown | EmilienM: thanks | 19:50 |
EmilienM | trown: seeking your feedback on the way we did it in oooq and oooq-extras | 19:51 |
EmilienM | trown: we document the WIP here: https://etherpad.openstack.org/p/tripleo-config-download and also the tech debt | 19:51 |
EmilienM | trown: since it's WIP, it will evolve over the next weeks in iterations | 19:52 |
trown | EmilienM: ya seems implemented differently than most quickstart stuff | 19:54 |
trown | EmilienM: seems like it should be done with more upfront collaboration from CI squad as well | 19:54 |
trown | EmilienM: like bringing it up to ci squad at some point before ... hey review this :P | 19:54 |
EmilienM | trown: I guess we wanted a prototype working in CI before using your time | 19:56 |
EmilienM | trown: now that we have something that works, we want to use your time to look at what we did and see how we can do better | 19:56 |
EmilienM | trown: I would be happy to work on next patchsets with CI squad | 19:57 |
*** jprovazn has quit IRC | 19:57 | |
*** liverpooler has quit IRC | 20:00 | |
*** itlinux has joined #tripleo | 20:03 | |
*** rbrady has joined #tripleo | 20:06 | |
*** rbrady has joined #tripleo | 20:06 | |
*** apetrich has quit IRC | 20:07 | |
mwhahaha | fultonj: looks like scenario001-containers might be broken again http://logs.openstack.org/52/500552/7/gate/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-container/4cefca2/ | 20:07 |
mwhahaha | fultonj: http://logs.openstack.org/52/500552/7/gate/gate-tripleo-ci-centos-7-scenario001-multinode-oooq-container/4cefca2/logs/undercloud/home/jenkins/tempest_output.log.txt.gz#_2017-10-11_19_26_24 | 20:07 |
EmilienM | trown: to be honest, our prototype in oooq-extras is 70 lines of code, so it's not a big deal that we didn't interrupt you for that - we wanted a PoC, we got it working - now we're looking at how to make it better | 20:07 |
EmilienM | trown: it would have been bad to implement a PoC of 300 lines of code or more and ask you to review | 20:08 |
EmilienM | trown: but that's not the case, our PoC is really basic, just some tasks and some doc - that we're happy to rethink | 20:08 |
fultonj | mwhahaha: looking | 20:09 |
*** tongl has joined #tripleo | 20:09 | |
*** apetrich has joined #tripleo | 20:10 | |
*** ooolpbot has joined #tripleo | 20:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 20:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 20:10 |
*** ooolpbot has quit IRC | 20:10 | |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 20:10 |
*** tbonds has joined #tripleo | 20:10 | |
fultonj | mwhahaha: indeed ceph is broken there, no OSD, I am checking to see if container and ceph-ansible version are out of sync | 20:14 |
*** toure is now known as toure_biab | 20:18 | |
*** jmelvin has joined #tripleo | 20:18 | |
*** achadha_ has joined #tripleo | 20:21 | |
*** achadha_ has quit IRC | 20:21 | |
fultonj | mwhahaha: the fix is to merge https://review.openstack.org/#/c/510984 i can explain | 20:21 |
fultonj | gfidente isn't here | 20:21 |
*** achadha_ has joined #tripleo | 20:22 | |
fultonj | we use the centos-storage-sig's -release repo to provide the ceph-ansible rpm | 20:22 |
*** achadha_ has quit IRC | 20:23 | |
fultonj | a new ceph-ansible was recently promoted from -candidate to -release | 20:23 |
*** achadha_ has joined #tripleo | 20:23 | |
fultonj | it requires the new docker container (which will default) once we don't hard code it | 20:24 |
fultonj | mwhahaha: what do you think? | 20:24 |
mwhahaha | fultonj: sec meeting | 20:25 |
*** achadha has quit IRC | 20:25 | |
fultonj | the alternative is to undo the promotion but we want the promotion ; we just landed things out of sync, breaking the scenario while we wait for the other | 20:25 |
* fultonj waits | 20:25 | |
*** sid2 has quit IRC | 20:26 | |
*** milan has quit IRC | 20:26 | |
fultonj | mwhahaha: i will make a bug documenting all this | 20:27 |
*** achadha_ has quit IRC | 20:27 | |
fultonj | even when we fix we need better process | 20:28 |
*** fragatina has joined #tripleo | 20:31 | |
*** achadha has joined #tripleo | 20:35 | |
fultonj | mwhahaha: https://bugs.launchpad.net/tripleo/+bug/1722908 | 20:35 |
openstack | Launchpad bug 1722908 in tripleo "scenario001-containers fail on tempest run; missing OSD from ceph-ansible and ceph-docker being out of sync" [Critical,Triaged] - Assigned to John Fulton (jfulton-org) | 20:36 |
weshay|ruck | trozet, fyi you may be able to use http://artifacts.ci.centos.org/artifacts/rdo/images/pike/cbs/cloudsig-stable/stable/ | 20:36 |
weshay|ruck | trown, ^ fyi | 20:36 |
trozet | weshay|ruck: trown said cbs was broken for a long time with a bug where it would not update. Is that fixed now or should we keep using rdo images server? | 20:38 |
*** achadha has quit IRC | 20:39 | |
*** dprince has quit IRC | 20:40 | |
weshay|ruck | I know it's very slow, but it's better than nothing atm | 20:40 |
weshay|ruck | not sure if it's busted | 20:40 |
openstackgerrit | Ihar Hrachyshka proposed openstack/tripleo-quickstart-extras master: Revert "Adding test_mtu_sized_frames to skip list" https://review.openstack.org/507642 | 20:46 |
*** gbarros has quit IRC | 20:47 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: update release config to use overcloud-as-undercloud https://review.openstack.org/511334 | 20:49 |
*** jkilpatr has quit IRC | 20:50 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: fs10: deploy steps with ansible https://review.openstack.org/508307 | 20:50 |
*** pchavva has quit IRC | 20:50 | |
*** rodolof has joined #tripleo | 20:55 | |
*** ccamacho has quit IRC | 20:55 | |
dmsimard | trown: I totally missed your ping from earlier | 20:57 |
dmsimard | still need something done ? | 20:57 |
*** trown is now known as trown|outtypewww | 20:59 | |
*** lblanchard1 has quit IRC | 20:59 | |
*** dprince has joined #tripleo | 21:02 | |
*** dprince has quit IRC | 21:02 | |
*** catintheroof has quit IRC | 21:02 | |
*** rodolof has quit IRC | 21:02 | |
*** leitan has quit IRC | 21:03 | |
*** rodolof has joined #tripleo | 21:03 | |
mwhahaha | fultonj: so should we restore the unpin? | 21:04 |
fultonj | mwhahaha: why not unabandon https://review.openstack.org/#/c/510984 ? | 21:04 |
mwhahaha | yea that's what i mean | 21:05 |
fultonj | ok | 21:05 |
fultonj | mwhahaha: thanks | 21:05 |
mwhahaha | https://review.openstack.org/#/c/509673/ never merged tho | 21:05 |
mwhahaha | that might have been the problem | 21:05 |
fultonj | it's true | 21:06 |
fultonj | technically it doesn't have to depend on it | 21:06 |
fultonj | i can revise and remove the depends-on | 21:06 |
mwhahaha | fultonj: please do thanks | 21:06 |
fultonj | mwhahaha: ok, coming up | 21:06 |
*** dprince has joined #tripleo | 21:06 | |
*** rodolof has quit IRC | 21:08 | |
*** rodolof has joined #tripleo | 21:08 | |
*** achadha has joined #tripleo | 21:09 | |
*** ooolpbot has joined #tripleo | 21:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 21:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 21:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722908 | 21:10 |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 21:10 |
*** ooolpbot has quit IRC | 21:10 | |
openstack | Launchpad bug 1722908 in tripleo "scenario001-containers fail on tempest run; missing OSD from ceph-ansible and ceph-docker being out of sync" [Critical,Triaged] - Assigned to John Fulton (jfulton-org) | 21:10 |
*** mwhahaha changes topic to "CI Status: YELLOWish - scenario001-containers and ovb timeouts (see alerts) | http://tripleo.org/ | https://docs.openstack.org/tripleo-docs/latest/" | 21:10 | |
openstackgerrit | John Fulton proposed openstack/tripleo-heat-templates master: Hardcode tag-stable-3.0-jewel-centos-7 in scenario001-containers https://review.openstack.org/511338 | 21:11 |
*** florianf has quit IRC | 21:12 | |
*** rodolof has quit IRC | 21:13 | |
*** rodolof has joined #tripleo | 21:14 | |
*** jtomasek has quit IRC | 21:14 | |
*** achadha has quit IRC | 21:14 | |
*** abishop has quit IRC | 21:15 | |
fultonj | mwhahaha: I went with https://review.openstack.org/511338 instead ; explanation in comment #3 of bug | 21:16 |
mwhahaha | k | 21:17 |
mwhahaha | fultonj: so should I re-abandon the other one? | 21:17 |
fultonj | mwhahaha: no, please don't | 21:18 |
fultonj | mwhahaha: i like the other, but 511338 is quicker | 21:18 |
mwhahaha | k | 21:18 |
fultonj | merging two vs one | 21:18 |
fultonj | but two is good for long run | 21:18 |
fultonj | hard code only in repos generating tool | 21:18 |
fultonj | s/repos/registry | 21:19 |
*** dprince has quit IRC | 21:19 | |
fultonj | mwhahaha: i am going to open another bug about improving the process | 21:19 |
mwhahaha | fultonj: sounds good | 21:19 |
fultonj | it's about lock-stepping docker.io and centos-storage sig promotions | 21:20 |
fultonj | a hard problem to solve | 21:20 |
fultonj | so perhaps hard coding in CI is the best fix | 21:20 |
fultonj | they don't always become incompatible but this is a case where they did | 21:21 |
*** MrRoot has joined #tripleo | 21:23 | |
trozet | weshay|ruck: did you ever file a bug? | 21:27 |
*** rodolof has quit IRC | 21:27 | |
*** jkilpatr has joined #tripleo | 21:27 | |
*** rodolof has joined #tripleo | 21:27 | |
weshay|ruck | trozet, yup | 21:28 |
* weshay|ruck gets | 21:28 | |
weshay|ruck | https://bugs.launchpad.net/tripleo/+bug/1722885 | 21:28 |
openstack | Launchpad bug 1722885 in tripleo "default release configs in config/release point to an unavailable undercloud image" [Critical,In progress] - Assigned to wes hayutin (weshayutin) | 21:28 |
weshay|ruck | https://bugs.launchpad.net/tripleo/+bug/1722889 | 21:29 |
openstack | Launchpad bug 1722889 in tripleo "tripleo-quickstart public pipelines no longer generate an undercloud image" [High,Triaged] | 21:29 |
*** cdearborn has quit IRC | 21:30 | |
*** ecerquei has quit IRC | 21:30 | |
fultonj | mwhahaha: do you need me to hang around right now to see that bug through? | 21:32 |
mwhahaha | fultonj: no we'll just see how the results come back later | 21:32 |
trozet | weshay|ruck: thanks | 21:32 |
fultonj | ok, i'll check back later | 21:32 |
fultonj | i'm just going to go AFK for a few hours as nothing else I can do til then | 21:32 |
mwhahaha | fultonj: might send giulio an email about it and maybe he can look at it in the morning | 21:33 |
fultonj | yes, i will | 21:33 |
fultonj | i can cc you | 21:33 |
* mwhahaha has to go afk for a bit | 21:33 | |
mwhahaha | sounds good | 21:33 |
fultonj | ack | 21:33 |
*** gbarros has joined #tripleo | 21:33 | |
*** rodolof has quit IRC | 21:33 | |
*** rodolof has joined #tripleo | 21:33 | |
openstackgerrit | Ronelle Landy proposed openstack/tripleo-quickstart-extras master: Add the option to run the container-check script https://review.openstack.org/501028 | 21:33 |
*** rodolof has quit IRC | 21:38 | |
*** rodolof has joined #tripleo | 21:38 | |
*** rodolof has quit IRC | 21:43 | |
*** rodolof has joined #tripleo | 21:43 | |
*** rodolof has quit IRC | 21:48 | |
*** rodolof has joined #tripleo | 21:49 | |
*** leitan has joined #tripleo | 21:50 | |
*** gbarros has quit IRC | 21:53 | |
*** threestrands has joined #tripleo | 21:56 | |
*** jkilpatr has quit IRC | 21:57 | |
*** jkilpatr has joined #tripleo | 21:57 | |
*** rodolof has quit IRC | 21:58 | |
*** rodolof has joined #tripleo | 21:58 | |
*** rodolof has quit IRC | 22:03 | |
*** rodolof has joined #tripleo | 22:04 | |
EmilienM | weshay|ruck: please join #openstack-infra-incident | 22:07 |
*** rlandy has quit IRC | 22:08 | |
*** ooolpbot has joined #tripleo | 22:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 22:10 |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 22:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722908 | 22:10 |
*** ooolpbot has quit IRC | 22:10 | |
openstack | Launchpad bug 1722908 in tripleo "scenario001-containers fail on tempest run; missing OSD from ceph-ansible and ceph-docker being out of sync" [Critical,In progress] - Assigned to John Fulton (jfulton-org) | 22:10 |
*** bfournie has quit IRC | 22:11 | |
*** bobh has quit IRC | 22:11 | |
*** fzdarsky_ has joined #tripleo | 22:13 | |
*** fzdarsky has joined #tripleo | 22:14 | |
*** rodolof has quit IRC | 22:16 | |
openstackgerrit | Emilien Macchi proposed openstack-infra/tripleo-ci master: Revert "Collect mistral-ansible execution files from /tmp" https://review.openstack.org/511347 | 22:16 |
EmilienM | weshay|ruck: ^ | 22:16 |
openstackgerrit | Emilien Macchi proposed openstack-infra/tripleo-ci master: Revert "Collect mistral-ansible execution files from /tmp" https://review.openstack.org/511347 | 22:16 |
*** MaxPC has quit IRC | 22:17 | |
dmsimard | EmilienM, weshay|ruck: with the /tmp/ansible fire extinguished, I think it would be a nice improvement opportunity to avoid completely globbing /var/log and /etc by default in https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/collect-logs/defaults/main.yml | 22:18 |
dmsimard | and pick the directories we're really interested in | 22:18 |
dmsimard | we do this in puppet-openstack-integration https://github.com/openstack/puppet-openstack-integration/blob/master/copy_logs.sh | 22:18 |
weshay|ruck | EmilienM, k.. re: https://github.com/openstack-infra/tripleo-ci/blob/master/toci-quickstart/config/collect-logs.yml#L9 | 22:19 |
weshay|ruck | EmilienM, k.. I thought we did that but apparently not | 22:19 |
openstackgerrit | Steve Baker proposed openstack/tripleo-common master: Action to always populate container image parameters https://review.openstack.org/503296 | 22:21 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Revert "Collect mistral-ansible execution files from /tmp" https://review.openstack.org/511347 | 22:22 |
openstackgerrit | Monty Taylor proposed openstack-infra/tripleo-ci master: Stop collecting ephemeral temp dirs https://review.openstack.org/511348 | 22:23 |
openstackgerrit | wes hayutin proposed openstack-infra/tripleo-ci master: remove the collection of /tmp/* from upstream jobs https://review.openstack.org/511349 | 22:24 |
weshay|ruck | EmilienM, ^ | 22:24 |
EmilienM | weshay|ruck: thanks | 22:25 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: remove the collection of /tmp/* from upstream jobs https://review.openstack.org/511349 | 22:25 |
openstackgerrit | Pradeep Kilambi proposed openstack/python-tripleoclient master: Support to drive undercloud deploy via undercloud.conf https://review.openstack.org/511350 | 22:29 |
*** pmannidi has joined #tripleo | 22:37 | |
*** kbyrne has quit IRC | 22:48 | |
*** kbyrne has joined #tripleo | 22:50 | |
*** jcoufal has quit IRC | 22:53 | |
openstackgerrit | Lars Kellogg-Stedman proposed openstack/tripleo-specs master: Add blueprint for Logging to stdout and rsyslog https://review.openstack.org/510001 | 22:56 |
*** bfournie has joined #tripleo | 22:59 | |
*** fzdarsky has quit IRC | 23:00 | |
*** fzdarsky_ has quit IRC | 23:00 | |
*** bfournie has quit IRC | 23:00 | |
*** bfournie has joined #tripleo | 23:01 | |
*** noslzzp has quit IRC | 23:04 | |
*** dsneddon_afk has quit IRC | 23:06 | |
*** dsneddon has joined #tripleo | 23:07 | |
*** dsariel has quit IRC | 23:09 | |
*** ooolpbot has joined #tripleo | 23:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722864 | 23:10 |
openstack | Launchpad bug 1722864 in tripleo "OVB frequent timeouts on rh1 cloud" [Critical,Triaged] | 23:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1722908 | 23:10 |
*** ooolpbot has quit IRC | 23:10 | |
openstack | Launchpad bug 1722908 in tripleo "scenario001-containers fail on tempest run; missing OSD from ceph-ansible and ceph-docker being out of sync" [Critical,In progress] - Assigned to John Fulton (jfulton-org) | 23:10 |
openstackgerrit | Andy Smith proposed openstack/tripleo-heat-templates master: WIP Separate rpc and notify messaging backends https://review.openstack.org/507963 | 23:15 |
pabelanger | EmilienM: at this point, you might consider abandoning the gate. That will give logs.o.o time to recover, for now, the pipeline is red | 23:19 |
*** morazi has quit IRC | 23:21 | |
*** fandrieu has joined #tripleo | 23:24 | |
*** tbonds has quit IRC | 23:26 | |
EmilienM | pabelanger: what does it mean abandon the gate? | 23:30 |
* EmilienM afk | 23:37 | |
*** myoung|remote has joined #tripleo | 23:42 | |
*** tosky has quit IRC | 23:43 | |
*** myoung|remote has quit IRC | 23:43 | |
*** fragatina has quit IRC | 23:52 | |
*** itlinux has quit IRC | 23:57 |