*** flepied has quit IRC | 00:05 | |
*** dparkes has quit IRC | 00:08 | |
*** ooolpbot has joined #tripleo | 00:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 00:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 00:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 00:10 |
*** ooolpbot has quit IRC | 00:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 00:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 00:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 00:10 |
*** pkovar has joined #tripleo | 00:14 | |
dmsimard | rlandy|afk: I'm back, did you manage to find anything ? | 00:16 |
openstackgerrit | Merged openstack/diskimage-builder master: Typo fix: curent => current https://review.openstack.org/448966 | 00:16 |
*** Goneri has quit IRC | 00:17 | |
*** Goneri has joined #tripleo | 00:19 | |
*** tbonds has quit IRC | 00:19 | |
*** flepied has joined #tripleo | 00:25 | |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: WIP, make script work in CI and interactive https://review.openstack.org/449370 | 00:30 |
*** limao has joined #tripleo | 00:30 | |
*** rlandy|afk is now known as rlandy | 00:36 | |
*** tbonds has joined #tripleo | 00:55 | |
openstackgerrit | Merged openstack/diskimage-builder master: functests: skip qcow2 generically but add specific test https://review.openstack.org/448837 | 01:07 |
*** ooolpbot has joined #tripleo | 01:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 01:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 01:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 01:10 |
*** ooolpbot has quit IRC | 01:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 01:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 01:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 01:10 |
*** japestinho has quit IRC | 01:46 | |
*** japestinho has joined #tripleo | 01:47 | |
*** rlandy has quit IRC | 01:51 | |
*** apevec has quit IRC | 01:56 | |
*** ooolpbot has joined #tripleo | 02:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 02:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 02:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 02:10 |
*** ooolpbot has quit IRC | 02:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 02:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 02:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 02:10 |
*** nyechiel has joined #tripleo | 02:11 | |
*** dmacpher-afk is now known as dmacpher | 02:20 | |
*** pkovar has quit IRC | 02:27 | |
*** fzdarsky_ has joined #tripleo | 02:29 | |
*** fzdarsky|afk has quit IRC | 02:32 | |
*** gkadam has joined #tripleo | 02:35 | |
*** limao has quit IRC | 02:36 | |
*** limao_ has joined #tripleo | 02:36 | |
*** tbarron has quit IRC | 02:38 | |
*** yamahata has quit IRC | 02:43 | |
*** tbarron has joined #tripleo | 02:48 | |
*** nyechiel has quit IRC | 02:58 | |
*** ooolpbot has joined #tripleo | 03:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 03:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 03:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 03:10 |
*** ooolpbot has quit IRC | 03:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 03:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 03:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 03:10 |
*** ramishra has joined #tripleo | 03:12 | |
openstackgerrit | Merged openstack/puppet-pacemaker master: Update test-requirements.txt https://review.openstack.org/448953 | 03:25 |
*** psahoo has joined #tripleo | 03:29 | |
*** gbarros has quit IRC | 03:47 | |
*** yamahata has joined #tripleo | 03:47 | |
*** limao_ has quit IRC | 04:09 | |
*** ooolpbot has joined #tripleo | 04:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 04:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 04:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 04:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 04:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 04:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 04:10 |
*** ooolpbot has quit IRC | 04:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 04:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 04:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 04:10 |
*** limao has joined #tripleo | 04:10 | |
*** limao has quit IRC | 04:14 | |
*** links has joined #tripleo | 04:20 | |
*** ratailor has joined #tripleo | 04:40 | |
*** limao has joined #tripleo | 04:43 | |
openstackgerrit | Merged openstack/diskimage-builder master: Use correct Ubuntu distro url on non-x86 arches https://review.openstack.org/448848 | 04:43 |
*** skramaja has joined #tripleo | 04:44 | |
*** janki has joined #tripleo | 04:53 | |
*** radeks has joined #tripleo | 04:55 | |
*** udesale has joined #tripleo | 04:57 | |
*** fragatin_ has joined #tripleo | 05:01 | |
*** fragati__ has joined #tripleo | 05:03 | |
*** fragatina has quit IRC | 05:05 | |
*** fragatin_ has quit IRC | 05:06 | |
*** fragati__ has quit IRC | 05:07 | |
*** ooolpbot has joined #tripleo | 05:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 05:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 05:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 05:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 05:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 05:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 05:10 |
*** ooolpbot has quit IRC | 05:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 05:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 05:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 05:10 |
*** udesale has quit IRC | 05:11 | |
*** udesale__ has joined #tripleo | 05:11 | |
*** pgadiya has joined #tripleo | 05:18 | |
*** udesale__ has quit IRC | 05:20 | |
*** udesale has joined #tripleo | 05:20 | |
*** udesale has quit IRC | 05:21 | |
*** fragatina has joined #tripleo | 05:21 | |
*** udesale has joined #tripleo | 05:22 | |
*** fragatina has quit IRC | 05:25 | |
*** prateek has joined #tripleo | 05:31 | |
*** fragatina has joined #tripleo | 05:33 | |
*** fragatina has quit IRC | 05:33 | |
*** fragatina has joined #tripleo | 05:34 | |
*** masco has joined #tripleo | 05:34 | |
*** masco has quit IRC | 05:51 | |
*** mdnadeem has joined #tripleo | 05:53 | |
*** yprokule has joined #tripleo | 05:53 | |
*** iranzo has joined #tripleo | 06:00 | |
*** ooolpbot has joined #tripleo | 06:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 06:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 06:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 06:10 |
*** ooolpbot has quit IRC | 06:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 06:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 06:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 06:10 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Create PReP boot partition for PPC https://review.openstack.org/447739 | 06:10 |
*** aufi has joined #tripleo | 06:18 | |
*** lmiccini has joined #tripleo | 06:41 | |
*** dparkes has joined #tripleo | 06:47 | |
*** karimb has joined #tripleo | 06:51 | |
*** karimb has quit IRC | 06:51 | |
bandini | morning | 07:06 |
*** ooolpbot has joined #tripleo | 07:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 07:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 07:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 07:10 |
*** ooolpbot has quit IRC | 07:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 07:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 07:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 07:10 |
*** ealcaniz has joined #tripleo | 07:13 | |
cschwede | Good Morning! This fix needs only a +A https://review.openstack.org/#/c/448208/ - would be great so I can propose a backport to stable/ocata soon | 07:14 |
bandini | cschwede: done | 07:15 |
cschwede | bandini: \o/ thx! | 07:15 |
openstackgerrit | Michele Baldessari proposed openstack/puppet-tripleo master: Qpid dispatch router puppet profile https://review.openstack.org/425710 | 07:15 |
bandini | chem: https://review.openstack.org/#/c/446893/ is this one okay for you? | 07:19 |
*** pmannidi has quit IRC | 07:21 | |
openstackgerrit | Luke Hinds proposed openstack/tripleo-heat-templates master: SSHD Service extensions https://review.openstack.org/444622 | 07:22 |
*** pmannidi has joined #tripleo | 07:28 | |
openstackgerrit | Luke Hinds proposed openstack/tripleo-heat-templates master: SSHD Service extensions https://review.openstack.org/444622 | 07:30 |
*** tesseract has joined #tripleo | 07:31 | |
*** pmannidi has quit IRC | 07:32 | |
chem | bandini: looking | 07:39 |
*** jprovazn has joined #tripleo | 07:40 | |
bandini | chem: good morning, sir! | 07:43 |
chem | bandini: good morning | 07:44 |
matbu | bandini: chem bonjour | 07:44 |
bandini | bonjour matbu! | 07:44 |
chem | matbu: buongiorno | 07:45 |
*** jaosorior has joined #tripleo | 07:55 | |
*** cylopez has joined #tripleo | 07:55 | |
*** ccamacho has joined #tripleo | 07:58 | |
*** zzzeek has quit IRC | 08:00 | |
openstackgerrit | Adriano Petrich proposed openstack/tripleo-common master: add caching the GetParametersAction https://review.openstack.org/444220 | 08:00 |
*** zzzeek has joined #tripleo | 08:01 | |
*** percevalbot has quit IRC | 08:02 | |
*** bogdando has joined #tripleo | 08:03 | |
*** jlinkes has joined #tripleo | 08:05 | |
d0ugal | How is CI today? | 08:07 |
*** yamahata has quit IRC | 08:07 | |
openstackgerrit | mathieu bultel proposed openstack/tripleo-quickstart master: Download overcloud_release rpm for mixed upgrade https://review.openstack.org/449349 | 08:08 |
*** shardy_afk is now known as shardy | 08:09 | |
*** florianf has joined #tripleo | 08:09 | |
*** ooolpbot has joined #tripleo | 08:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 08:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 08:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 08:10 |
*** ooolpbot has quit IRC | 08:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 08:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 08:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 08:10 |
jaosorior | d0ugal: I guess that's the answer ^^ | 08:10 |
d0ugal | jaosorior: yikes | 08:11 |
*** percevalbot has joined #tripleo | 08:11 | |
* d0ugal attempts to understand them | 08:11 | |
openstackgerrit | mathieu bultel proposed openstack/tripleo-quickstart master: Download overcloud_release rpm for mixed upgrade https://review.openstack.org/449349 | 08:16 |
ccamacho | jaosorior d0ugal :( CI is having a big Influenza shoot.. | 08:16 |
d0ugal | ccamacho: no kidding, I am finding it harder to help as I am less familiar with quickstart too | 08:17 |
d0ugal | I need to get up to speed. | 08:17 |
jaosorior | ccamacho: yeah dude, it's been a rough week | 08:17 |
openstackgerrit | Michele Baldessari proposed openstack/tripleo-heat-templates master: WIP DO NOT MERGE Move rabbitmq behind haproxy https://review.openstack.org/390114 | 08:18 |
d0ugal | It would be awesome if somebody did a "debugging quickstart CI deep dive" - or maybe something like this exists? | 08:18 |
bandini | d0ugal: +1 | 08:18 |
jaosorior | not that I know of | 08:18 |
jaosorior | I'm acquainted with the classic bash-based CI | 08:18 |
jaosorior | but with this quickstart move I'm pretty lost | 08:18 |
remix_tj | hi guys, i've a failing gate saying this: http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/logs/undercloud/home/jenkins/overcloud_validate.log.txt.gz | 08:19 |
d0ugal | jaosorior: yeah, same, I was never that good at understanding/helping but now I am terrible lol | 08:19 |
bandini | remix_tj: looking | 08:19 |
ccamacho | :( ya with oooq i.e. if you use the --config it won't append your yaml.. that option is not working.. :P And I don't know why I have to launch the deploy command 2 times to have it working :P | 08:19 |
bogdando | o/ | 08:19 |
ccamacho | +1 to that deep dive session | 08:19 |
bogdando | please merge https://review.openstack.org/#/c/444308/ | 08:19 |
shardy | +1 also, that is a good idea | 08:19 |
d0ugal | /cc trown|outtypewww ^ :) | 08:19 |
bogdando | and https://review.openstack.org/#/c/445883/ please | 08:20 |
jaosorior | d0ugal: wouldn't the overcloudrc be generated by a mistral workflow nowadays? | 08:20 |
remix_tj | bandini: ansible play says that's a single failure http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/console.html#_2017-03-24_07_52_56_111847 | 08:20 |
d0ugal | jaosorior: yeah, it is | 08:20 |
bogdando | one more https://review.openstack.org/#/c/448294/ . thank you! | 08:20 |
*** janki has quit IRC | 08:20 | |
jaosorior | d0ugal: remix_tj is seeing a successful overcloud deploy, and when validation is tried out, it fails cause it doesn't find the overcloudrc :/ | 08:21 |
jaosorior | remix_tj: what's the patch? | 08:21 |
remix_tj | https://review.openstack.org/446887 | 08:21 |
*** mhenkel has quit IRC | 08:21 | |
ccamacho | bogdando, check the last command for #/c/448294 | 08:21 |
ccamacho | sorry ^ 445883 | 08:21 |
bandini | remix_tj: hohum that's a weird one | 08:22 |
remix_tj | I've been rechecking this patch and all its backports since the 17th, CI was quite weird this week | 08:22 |
bandini | maybe d0ugal can help as to why overcloudrc is missing | 08:22 |
bogdando | ccamacho: missed that, thanks! | 08:22 |
jaosorior | d0ugal: what's the name of the workflow? we should find it having been executed in the mistral logs, right? http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/logs/undercloud/var/log/mistral/ | 08:22 |
* d0ugal looks at the lods | 08:22 | |
d0ugal | logs* | 08:23 |
d0ugal | jaosorior: it isn't a workflow, just an action call | 08:23 |
*** amoralej|off is now known as amoralej | 08:23 | |
bogdando | and these two require some review from oooq folks please https://review.openstack.org/#/c/447409/ https://review.openstack.org/#/c/447000/ | 08:23 |
d0ugal | and it is called... | 08:23 |
*** mhenkel has joined #tripleo | 08:23 | |
d0ugal | jaosorior: https://github.com/openstack/tripleo-common/blob/master/setup.cfg#L71 | 08:23 |
d0ugal | jaosorior: I don't see the action mentioned in the logs at all | 08:24 |
jaosorior | funky | 08:24 |
jaosorior | mandre, jistr: Hey guys, could you check these out https://review.openstack.org/#/q/status:open+project:openstack/tripleo-heat-templates+branch:master+topic:keystone-fernet-docker ? | 08:25 |
d0ugal | http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/logs/undercloud/var/log/mistral/executor.log.txt.gz#_2017-03-24_07_21_51_061 | 08:26 |
d0ugal | That is the last log entry matching "tripleo_common" | 08:26 |
d0ugal | so nothing in Mistral is started after the deploy. | 08:26 |
d0ugal | Where can I see the CLI output? | 08:27 |
*** leanderthal|afk is now known as leanderthal | 08:27 | |
*** leanderthal is now known as leanderthal|afk | 08:28 | |
openstackgerrit | Bogdan Dobrelya proposed openstack/tripleo-heat-templates master: Rework container volumes as hostpath mounts https://review.openstack.org/448510 | 08:29 |
jaosorior | d0ugal: http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/logs/undercloud/home/jenkins/overcloud_deploy.log.txt.gz | 08:29 |
d0ugal | jaosorior: weird, that doesn't get to the end of the deploy command. | 08:31 |
d0ugal | Where is https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L1170-L1171 | 08:31 |
jaosorior | but it passed O_o | 08:31 |
d0ugal | oh man, I have a theory | 08:32 |
* d0ugal digs | 08:32 | |
bogdando | need some help with solving this puzzle http://logs.openstack.org/43/448543/3/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/9ece507/console.html (RC of failure?) | 08:33 |
bogdando | so overcloud-validate had failed, where are logs?.. | 08:34 |
*** social has quit IRC | 08:34 | |
jaosorior | d0ugal: in the same directory as the deploy logs I showed you | 08:35 |
d0ugal | jaosorior: was that for bogdando ? | 08:35 |
*** stendulker has joined #tripleo | 08:35 | |
jaosorior | d0ugal: http://logs.openstack.org/87/446887/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/cf58322/logs/undercloud/home/jenkins/overcloud_validate.log.txt.gz | 08:35 |
bandini | bogdando: http://logs.openstack.org/43/448543/3/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/9ece507/logs/undercloud/home/jenkins/overcloud_deploy.log.txt.gz | 08:35 |
jaosorior | d0ugal: oh. In weechat both your handles are green. Got confused hahaha | 08:35 |
d0ugal | :) | 08:36 |
bandini | bogdando: the reason it went to do the validate even though the deploy failed is bug 1674955 | 08:36 |
openstack | bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] https://launchpad.net/bugs/1674955 - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 08:36 |
bogdando | bandini: how are those related? so I could find it next time? | 08:36 |
bogdando | I mean, how could I follow the build log to find it | 08:36 |
jaosorior | bogdando: when is that run from? | 08:36 |
jaosorior | bogdando: the ssl_depth issue should have been fixed already | 08:36 |
bandini | bogdando: well I am not super expert after the oooq move, but in general those commands are run from the undercloud, so the undercloud/home/jenkins folder is the one with the most interesting logs most of the time | 08:37 |
bogdando | jaosorior: https://review.openstack.org/#/c/448543/3 I'm only learning to find failures in CI builds, not the RCA yet ) | 08:37 |
bogdando | so I need to know how to locate that failed check log | 08:37 |
jaosorior | RCA? | 08:37 |
bogdando | u, root cause | 08:37 |
bogdando | um* | 08:38 |
jaosorior | I need coffee | 08:38 |
bogdando | and a=analysis :) | 08:38 |
bogdando | bandini: oh, thank you! | 08:38 |
bandini | bogdando: so now that we know the deployment failed (as opposed to the validation), I usually hop on the nodes that failed and check /var/log/messages | 08:39 |
bandini | bogdando: in your case the culprit is http://logs.openstack.org/43/448543/3/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/9ece507/logs/subnode-1/var/log/messages.txt.gz#_Mar_22_13_34_08 | 08:39 |
bogdando | I need to raise kibana unarmed fighting skill | 08:40 |
bandini | ahahahah | 08:40 |
bogdando | btw, in fuel I used to search for errors with some grep and perl magic | 08:40 |
*** milan has joined #tripleo | 08:40 | |
bogdando | it served well for years | 08:41 |
bogdando | need to find something similar here as well | 08:41 |
bogdando | https://github.com/bogdando/fuel-log-parse :) | 08:41 |
bandini | yeah something along those lines would be useful for ooo as well | 08:42 |
bogdando | it looks ugly but works | 08:42 |
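The "grep for errors" approach mentioned above can be sketched in a few lines of Python. This is only an illustration of the idea, not what fuel-log-parse does: the function name `find_error_lines`, the error pattern, and the assumption that the job logs have already been downloaded into a local directory (plain or gzipped) are all hypothetical.

```python
import gzip
import re
from pathlib import Path

# Assumed set of markers worth flagging; tune this for the logs at hand.
ERROR_RE = re.compile(r"\b(ERROR|CRITICAL|Traceback|FAILED)\b")


def find_error_lines(log_dir):
    """Yield (file name, line number, text) for suspicious lines under log_dir."""
    for path in sorted(Path(log_dir).rglob("*")):
        if not path.is_file():
            continue
        # CI logs are usually published gzipped (*.txt.gz); fall back to plain text.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if ERROR_RE.search(line):
                    yield path.name, number, line.rstrip("\n")
```

Running `for hit in find_error_lines("./joblogs"): print(hit)` over a downloaded log tree would list every matching line with its file and line number, which is roughly the starting point for the triage described in the channel.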
d0ugal | I am 100% confused. Something causes the deploy command to exit early, but with a status code 0. | 08:42 |
bandini | we used to have a postci.txt that contained the proper error most of the time, I think we are working on reinstantiating it in oooq | 08:42 |
bandini | iirc I saw some bug/review flying by about it | 08:43 |
jaosorior | d0ugal: maybe this https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L1147 ? | 08:43 |
jaosorior | bandini: postci still exists, it's just in a different directory | 08:43 |
d0ugal | jaosorior: I checked that, it should be false - it isn't passed at least. | 08:43 |
jaosorior | dafuq | 08:43 |
d0ugal | jaosorior: but we should maybe add a print for that branch, just to be sure | 08:43 |
bandini | jaosorior: oh!? | 08:43 |
*** jpena|off is now known as jpena | 08:44 | |
jaosorior | bandini: http://logs.openstack.org/43/448543/3/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/9ece507/logs/undercloud/var/log/postci.txt.gz | 08:44 |
jaosorior | it's in /var/log/ in the undercloud | 08:44 |
bandini | oh blimey | 08:44 |
bandini | jaosorior: thanks, that will save me some time! | 08:44 |
jaosorior | I was also very confused and started ranting about it the other day | 08:44 |
jaosorior | so I was pointed to it | 08:45 |
bandini | eheh well good to know! | 08:46 |
jaosorior | anyway, I think it would still be better to move that from there, to the place where it was before | 08:46 |
jaosorior | it's a bit more intuitive | 08:46 |
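The triage order discussed above (deploy and validate logs under undercloud/home/jenkins, postci under the undercloud's /var/log, then /var/log/messages on the failing node) can be captured as a small helper. A minimal sketch: the relative paths are copied from the links pasted in this channel, while the function name `triage_urls` and the assumption that the failing node is `subnode-1` are hypothetical and will not hold for every job.

```python
from urllib.parse import urljoin

# Relative paths taken from the log links pasted in the channel.
TRIAGE_PATHS = [
    "logs/undercloud/home/jenkins/overcloud_deploy.log.txt.gz",
    "logs/undercloud/home/jenkins/overcloud_validate.log.txt.gz",
    "logs/undercloud/var/log/postci.txt.gz",
    # Assumption: the failing node is subnode-1, as in the job linked above.
    "logs/subnode-1/var/log/messages.txt.gz",
]


def triage_urls(job_base_url):
    """Return the log URLs worth opening first for one CI job run."""
    # urljoin replaces the last path segment unless the base ends with "/".
    base = job_base_url if job_base_url.endswith("/") else job_base_url + "/"
    return [urljoin(base, path) for path in TRIAGE_PATHS]
```

Passing the job's base URL (everything up to and including the short build hash, such as `.../gate-tripleo-ci-centos-7-nonha-multinode-oooq/9ece507`) returns the four links in the order they are usually worth reading.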
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: Default --update-plan-only to False https://review.openstack.org/449491 | 08:46 |
bogdando | bandini: is it possible to get all build artifacts in the tarball? | 08:46 |
bogdando | to grep logs locally, or yes, kibana stuff is good (but slow I'm afraid) | 08:47 |
jaosorior | d0ugal: not sure if the default=False is really necessary | 08:47 |
jaosorior | d0ugal: but the print sure helps | 08:47 |
bandini | bogdando: dunno tbh. you mean just the env files? | 08:47 |
bogdando | this should be asked rather to infra folks... but may be some one ate that dog already! :) | 08:47 |
d0ugal | jaosorior: agreed, but I figured explicit is better etc. :) | 08:47 |
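The point about `default=False` and the extra print can be illustrated with plain `argparse`. This is a sketch under assumptions, not the python-tripleoclient code: only the flag name `--update-plan-only` comes from the review title above; the parser, help text, message, and return values are made up for illustration. With `action="store_true"`, argparse already defaults the value to `False`, so the explicit default changes nothing at runtime and only documents intent.

```python
import argparse


def build_parser():
    """Build a stand-in parser; only the flag name comes from the review."""
    parser = argparse.ArgumentParser(prog="deploy-sketch")
    parser.add_argument(
        "--update-plan-only",
        action="store_true",
        default=False,  # redundant with store_true, kept to be explicit
        help="Only update the plan and skip the deployment (hypothetical text).",
    )
    return parser


def run(argv):
    """Return which branch was taken so an early exit is never silent."""
    args = build_parser().parse_args(argv)
    if args.update_plan_only:
        # Printing here makes a successful early exit visible in the CLI log.
        print("Plan updated; skipping deployment (--update-plan-only).")
        return "plan-only"
    return "deploy"
```

With a message on the early-exit branch, a deploy log that stops short with status code 0 would at least show whether this path was taken.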
bandini | bogdando: I guess it can all be reconstructed by looking at the logs | 08:47 |
bandini | but yeah having it all in a single place would be nice | 08:48 |
bogdando | bandini: yeah, my concern is to grab everything the failed job has produced and inspect locally with perl magic | 08:48 |
bandini | yeah | 08:49 |
bogdando | then to add yet another file to my repo :) | 08:49 |
bandini | eheh | 08:49 |
bogdando | so may be there is some url to fetch all... | 08:49 |
* bogdando gonna ask infra folks | 08:50 | |
bogdando | wget recursive to the rescue | 08:51 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: Apply setfiles on all mountpoints https://review.openstack.org/447076 | 08:52 |
shardy | bogdando: sec, there's a script that does that in tripleo-ci | 08:56 |
bogdando | shardy: right in time, thank you! :) | 08:56 |
shardy | https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/getthelogs | 08:57 |
*** athomas has joined #tripleo | 08:57 | |
*** jpena is now known as jpena|off | 08:57 | |
shardy | bogdando: that may help ^^ ? | 08:57 |
bogdando | yes, thanks | 08:57 |
*** zoli|gone is now known as zoli | 08:58 | |
openstackgerrit | mathieu bultel proposed openstack/tripleo-quickstart master: Download overcloud_release rpm for mixed upgrade https://review.openstack.org/449349 | 08:58 |
*** milan has quit IRC | 09:00 | |
*** mcornea has joined #tripleo | 09:03 | |
*** jpena|off is now known as jpena | 09:04 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/tripleo-ui master: Imported Translations from Zanata https://review.openstack.org/449507 | 09:06 |
bogdando | shardy: hm, not sure if that works "while in the tmp directory run "getthelogs" with no params | 09:09 |
bogdando | to download any log files you hadn't previously downloaded" | 09:09 |
shardy | bogdando: Hmm, OK I've not used it in a while, I was just aware it existed | 09:10 |
*** ooolpbot has joined #tripleo | 09:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 09:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 09:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 09:10 |
*** ooolpbot has quit IRC | 09:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 09:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 09:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 09:10 |
shardy | I don't see where but perhaps it's been broken by the move to quickstart and/or multinode jobs | 09:10 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: WIP: Add lvm management to diskimage-builder https://review.openstack.org/444403 | 09:11 |
*** jlinkes has quit IRC | 09:12 | |
*** suuuper has joined #tripleo | 09:12 | |
bogdando | folks, is elastic-recheck still in use? | 09:13 |
bogdando | I've read about that, it looks cool | 09:13 |
bogdando | but f.e. I can't find the aforementioned bug in the https://git.openstack.org/cgit/openstack-infra/elastic-recheck/tree/queries list | 09:13 |
bogdando | https://www.elastic.co/blog/openstack-elastic-recheck-powered-elk-stack | 09:13 |
shardy | http://status.openstack.org/elastic-recheck/ | 09:13 |
shardy | bogdando: I think so, but we've not really been making use of it for TripleO | 09:13 |
bogdando | I see. But the thing is just great, nothing to say more | 09:14 |
*** udesale has quit IRC | 09:15 | |
shardy | bogdando: yeah, I think it could be useful, we've discussed it in the past, but nobody has had time to really push on getting regular queries added for bugs | 09:15 |
bogdando | hehe. This could be some bot using LP tags :) a nice exercise | 09:16 |
bogdando | so someone has only to merge them | 09:16 |
bogdando | patches | 09:16 |
shardy | bogdando: yeah, I think the problem is/was that it takes human analysis of logs to figure out the query and propose it | 09:16 |
shardy | what would be really cool is to have a special comment format, so when raising a bug you could do e.g "ER_QUERY=foo" | 09:17 |
shardy | or something | 09:17 |
bogdando | yeah, you're right. In google for example, they consider things went really bad if SREs ever have to read logs. | 09:17 |
shardy | so maybe a bot could help there | 09:17 |
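The special comment format floated above could be picked up by a bot with a single regular expression. A minimal sketch, assuming the marker sits on its own line in the bug description as `ER_QUERY=<query>`; that format is only the suggestion made in this channel, not an existing Launchpad or elastic-recheck convention, and the function name `extract_er_queries` is hypothetical.

```python
import re

# Hypothetical marker: one "ER_QUERY=<query>" per line of the bug description.
ER_QUERY_RE = re.compile(r"^ER_QUERY=(.+)$", re.MULTILINE)


def extract_er_queries(bug_description):
    """Return every query string tagged with ER_QUERY= in the description."""
    return [match.strip() for match in ER_QUERY_RE.findall(bug_description)]
```

A bot could then turn each extracted string into a proposed query patch for a human to review and merge, which is the semi-automatic flow discussed here.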
bogdando | it eats time | 09:18 |
shardy | bogdando: yeah, our CI is not really at google levels of automation at this point ;) | 09:18 |
bogdando | we will make it so! | 09:18 |
bogdando | and even better | 09:18 |
shardy | It's something to aim for, certainly ;) | 09:18 |
shardy | One issue we have in TripleO is that we're not in the gate of every other project, so we deal with a large number of regressions, which makes automated analysis/recovery harder | 09:19 |
*** limao has quit IRC | 09:21 | |
bogdando | right. Although having a dummy bot that greps for errors and submits patches to the elastic-recheck could help to offload ppl here perhaps. At least we will have frequencies for each type of error | 09:24 |
bogdando | cuz each failure will be linked to some bug, IIUC how it works | 09:24 |
bogdando | so it will be semi automatic. A person puts a tag to a known bug and here it is - each related error is linked to that bug | 09:25 |
bogdando | but it's easy to say, not to do... | 09:26 |
jaosorior | mandre: by the way, the bind-mount patch depends on the one setting fernet as the default | 09:26 |
jaosorior | so if you want you could try those two together | 09:26 |
bandini | d0ugal, remix_tj: fwiw I just spotted another place where we fail the validate due to no overcloudrc but deploy seemed to have succeeded http://logs.openstack.org/19/447319/2/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/5ea42f4/logs/undercloud/home/jenkins/overcloud_validate.log.txt.gz | 09:26 |
d0ugal | bandini: thanks | 09:27 |
d0ugal | apetrich: ^ another one. | 09:27 |
bandini | d0ugal: I'll start by opening a bug to track this | 09:27 |
mandre | jaosorior: yep, I'm testing the series | 09:27 |
remix_tj | bandini: yep, i wait for other news | 09:28 |
d0ugal | bandini: good idea. I am still looking into it. The weirdest bug I have seen in a while :) | 09:28 |
bandini | d0ugal: gotta love fridays ;) | 09:28 |
d0ugal | haha | 09:28 |
apetrich | bandini, best bugs best days | 09:28 |
bandini | lol | 09:28 |
*** lucas-afk is now known as lucasagomes | 09:29 | |
remix_tj | I suggest you stop touching things on Fridays after 3pm local time. Here, touching anything after that time leads to an interesting weekend | 09:29 |
*** flepied has quit IRC | 09:29 | |
*** dbecker has joined #tripleo | 09:30 | |
remix_tj | (anyway, I still can't start my oooq environment) | 09:31 |
bandini | d0ugal, apetrich: https://bugs.launchpad.net/tripleo/+bug/1675709 to track it | 09:31 |
openstack | Launchpad bug 1675709 in tripleo "deploy succeeded but no overcloudrc was generated" [High,Triaged] | 09:31 |
d0ugal | bandini: I was hoping the title would be "Weirdest bug in a while" :P | 09:32 |
d0ugal | thanks! | 09:32 |
bandini | ahahah | 09:33 |
apetrich | d0ugal, ok linking my bug to that | 09:33 |
*** derekh has joined #tripleo | 09:33 | |
*** jrist has quit IRC | 09:34 | |
openstackgerrit | Yurii Prokulevych proposed openstack/tripleo-heat-templates master: Run cluster check on nodes configured in wsrep_cluster_address. https://review.openstack.org/449154 | 09:35 |
openstackgerrit | Thomas Herve proposed openstack/tripleo-heat-templates master: Remove yaql call when building logging_groups https://review.openstack.org/447605 | 09:36 |
openstackgerrit | Karthik S proposed openstack/puppet-tripleo master: vhostuser socket dir shall be created for vhostuserclient mode https://review.openstack.org/449530 | 09:38 |
*** janki has joined #tripleo | 09:38 | |
*** deadnull has joined #tripleo | 09:43 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo master: Ensure directory exists for certificates for httpd https://review.openstack.org/449536 | 09:44 |
*** akrivoka has joined #tripleo | 09:45 | |
*** fzdarsky_ is now known as fzdarsky | 09:46 | |
*** snecklifter has joined #tripleo | 09:48 | |
*** gkadam has quit IRC | 09:51 | |
*** skramaja_ has joined #tripleo | 09:51 | |
*** ckyriakidou has joined #tripleo | 09:53 | |
*** dixiaoli has joined #tripleo | 09:53 | |
*** skramaja has quit IRC | 09:54 | |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: WIP: Add lvm management to diskimage-builder https://review.openstack.org/444403 | 09:54 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: Use stevedore for module config of block device https://review.openstack.org/447090 | 09:54 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: Refactor: block-device filesystem creation, mount and fstab https://review.openstack.org/444586 | 09:54 |
openstackgerrit | Luke Hinds proposed openstack/tripleo-specs master: blueprint for TCP Wrapper Service https://review.openstack.org/441211 | 09:57 |
*** flepied has joined #tripleo | 09:59 | |
openstackgerrit | Karthik S proposed openstack/puppet-tripleo master: vhostuser socket dir shall be created for vhostuserclient mode https://review.openstack.org/449530 | 10:00 |
openstackgerrit | Merged openstack/puppet-tripleo stable/ocata: Fixes issues with raising mysql file limit https://review.openstack.org/447515 | 10:01 |
openstackgerrit | Luke Hinds proposed openstack/tripleo-specs master: blueprint for TCP Wrapper Service https://review.openstack.org/441211 | 10:02 |
*** karthiks is now known as karthiks_afk | 10:02 | |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: Refactor: block-device filesystem creation, mount and fstab https://review.openstack.org/444586 | 10:02 |
*** pcaruana has joined #tripleo | 10:05 | |
openstackgerrit | Luke Hinds proposed openstack/tripleo-heat-templates master: Extends audit serivce https://review.openstack.org/444804 | 10:07 |
d0ugal | Do we have a CI run that passed recently? | 10:07 |
d0ugal | oh, cistatus. oops | 10:08 |
*** Vijayendra has quit IRC | 10:09 | |
d0ugal | bandini, apetrich: I think this is the problem: http://logs.openstack.org/19/447319/2/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/5ea42f4/logs/undercloud/var/log/mistral/engine.log.txt.gz#_2017-03-24_07_57_11_455 | 10:09 |
d0ugal | I don't know what it means | 10:09 |
*** salmankhan has joined #tripleo | 10:09 | |
d0ugal | but I think it is related to these timeouts | 10:09 |
d0ugal | I just checked a couple of jobs that passed, they don't have these. | 10:09 |
*** skramaja_ is now known as skramaja | 10:10 | |
d0ugal | so my guess is that tripleoclient calls the overcloudrc action and it just gets stuck. | 10:10 |
d0ugal | .... I don't know why it doesn't error tho' | 10:10 |
bandini | d0ugal: ha, interesting | 10:13 |
apetrich | d0ugal, I'm looking at the other fail example that I have and I see those timeouts also | 10:14 |
shardy | jtomasek: Hi! Sorry, another tripleo-ui related question :) | 10:15 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: WIP: Add lvm management to diskimage-builder https://review.openstack.org/444403 | 10:15 |
shardy | jtomasek: in tripleoclient, we have a validation which ensures NtpServer is set when ControllerCount > 1 | 10:15 |
shardy | I don't see that in tripleo-common, does the UI do that same validation? | 10:15 |
d0ugal | bandini, apetrich: I actually saw them in my environment the other day - but thought it was a one off | 10:16 |
d0ugal | bandini, apetrich: re-creating it at the moment, so maybe I'll hit it again. | 10:16 |
bandini | ack | 10:18 |
openstackgerrit | Alfredo Moralejo proposed openstack-infra/tripleo-ci master: [DNM] Adding yum debug in oooq playbook https://review.openstack.org/449548 | 10:20 |
matbu | can someone add workflow on https://review.openstack.org/#/c/448274/ | 10:25 |
matbu | plz | 10:25 |
*** social has joined #tripleo | 10:26 | |
*** zoli is now known as zoli|lunch | 10:27 | |
openstackgerrit | Bogdan Dobrelya proposed openstack-infra/tripleo-ci master: Adapt getthelogs UX for more use cases https://review.openstack.org/449552 | 10:27 |
bogdando | bandini, shardy: https://review.openstack.org/449552 | 10:28 |
jtomasek | shardy: not really afaik | 10:29 |
openstackgerrit | Steven Hardy proposed openstack/tripleo-image-elements master: 51-hosts fails if given lots of changes https://review.openstack.org/449198 | 10:30 |
bogdando | mandre: ^ ^ | 10:30 |
jtomasek | shardy: how is that validation run in tripleoclient? | 10:31 |
shardy | https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_deploy.py#L227 | 10:32 |
shardy | jtomasek: it just looks at the merged environment before calling heat | 10:32 |
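A minimal sketch of the kind of check shardy describes, assuming the merged environment is a dict with parameters under `parameter_defaults` and that `ControllerCount` defaults to 1 when unset. The names `validate_ntp_server` and `InvalidConfiguration` are hypothetical; this is not the tripleoclient code linked above.

```python
class InvalidConfiguration(Exception):
    """Raised when the merged environment fails a pre-deployment check."""


def validate_ntp_server(merged_env):
    """Require NtpServer when more than one controller is deployed.

    `merged_env` is assumed to be the merged Heat environment as a dict.
    """
    params = merged_env.get("parameter_defaults", {})
    # Assumption: a missing ControllerCount means a single controller.
    controller_count = int(params.get("ControllerCount", 1))
    ntp_server = params.get("NtpServer")
    if controller_count > 1 and not ntp_server:
        raise InvalidConfiguration(
            "NtpServer must be set when ControllerCount > 1"
        )
```

Because the check only reads two values from the plan data, it needs nothing from the deployed nodes, which is the argument made later in the discussion for not routing it through ansible.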
shardy | jtomasek: I'd like to remove that, but I think we need to add something similar to either t-h-t or tripleo-common first | 10:33 |
jtomasek | shardy: I could see it as a normal pre-deployment validation in tripleo-validations. | 10:33 |
shardy | jtomasek: can't those happen before the plan is created? | 10:33 |
jaosorior | mandre: yay, thanks for checking it out | 10:34 |
shardy | jtomasek: but yeah, we need to find a better place for this, I want to remove it from tripleoclient | 10:34 |
jtomasek | shardy: good question, tripleo-validations probably have access just to plan in swift | 10:34 |
shardy | jtomasek: Ok, I think I'll leave it here for now, with a FIXME so we can work out removing it later | 10:35 |
jaosorior | bandini: sshnaidm|off posted a patch to get back postci.log :D | 10:35 |
jaosorior | bandini: https://review.openstack.org/#/c/448135/ | 10:35 |
shardy | jtomasek: what I would like is a plan update workflow, which includes any validation required on the data | 10:35 |
jtomasek | shardy: ack, I think having it as pre-deployment validation in tripleo-validations should be fine. Filing it as bug would be helpful I think | 10:35 |
shardy | jtomasek: which could include specific parameters like this, but also e.g dependencies described in the capabilities map | 10:35 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: DO NOT MERGE: Testing novajoin authtoken https://review.openstack.org/446348 | 10:36 |
shardy | jtomasek: Ok, FWIW I don't really see the value in having ansible do this, I'd probably prefer something in tripleo-common | 10:36 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: DO NOT MERGE: Testing ensure dir for httpd certs https://review.openstack.org/446348 | 10:36 |
shardy | but I'll raise a bug and we can discuss it, thanks! | 10:36 |
jtomasek | shardy: so that plan update workflow would get a list of validations as input? | 10:37 |
jtomasek | shardy: oh, I see, the validation would be just part of the plan update workflow... | 10:38 |
shardy | jtomasek: No, it'd get a list of environment files, then a validation would automatically run that fails the plan update if the data is bad | 10:38 |
shardy | jtomasek: yeah | 10:38 |
shardy | jtomasek: vs the action we currently have that just enables environments | 10:38 |
*** tosky has joined #tripleo | 10:38 | |
jtomasek | shardy: in case of the parameters you mentioned, we would want the update to happen anyway because user just selects environments but he is still able to update the parameters later | 10:40 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates master: Change the directory for httpd certs/keys to be service-specific https://review.openstack.org/449558 | 10:40 |
shardy | jtomasek: I think it'd still be the same plan_update workflow, but perhaps we'd have a validate=False option | 10:41 |
jtomasek | shardy: that is why I see it fit as separate pre-deployment validation. Although I am aware it is a bit complicated for tripleoclient workflow as it does everything at once and | 10:41 |
*** nyechiel has joined #tripleo | 10:41 | |
shardy | jtomasek: I think we need to split it into two workflows, one updates the plan (including all operations related to environment files, and that includes parameters) | 10:41 |
shardy | the other just deploys the plan | 10:41 |
shardy | right now we have those two steps mixed up, despite making them separate in both Ux's | 10:42 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: DO NOT MERGE: Testing change dir for httpd certs/keys https://review.openstack.org/449560 | 10:42 |
jtomasek | shardy: yes, that's what UI does anyway | 10:42 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: WIP - Add upgrade_batch_tasks to neutron-l3-agent https://review.openstack.org/445494 | 10:42 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-quickstart-extras master: Upgrade to containerized overcloud https://review.openstack.org/448576 | 10:43 |
jtomasek | shardy: only reason why I mention tripleo-validations is that we already have this concept and introducing different kind of validations makes things more confusing | 10:43 |
shardy | jtomasek: sure, well FWIW I think all tripleo validations should be described in terms of mistral workflows, even if those then run ansible | 10:44 |
jtomasek | shardy: it would be possible to run tripleo-validation workflow as a subworkflow of plan update workflow btw. | 10:44 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Clone projects if they aren't cloned by ZUUL https://review.openstack.org/449562 | 10:44 |
shardy | jtomasek: I don't want to be forced to run mistral->ansible->swift when I can just do mistral->swift | 10:45 |
jtomasek | shardy: +1 | 10:45 |
jtomasek | shardy: agree | 10:45 |
*** gaurangt has joined #tripleo | 10:45 | |
shardy | jtomasek: Ok, thanks, I guess we have some work to do here, I'll raise a bug to track it | 10:46 |
openstackgerrit | Sagi Shnaidman proposed openstack/tripleo-quickstart-extras master: Clone projects if they aren't cloned by ZUUL https://review.openstack.org/449562 | 10:46 |
jtomasek | shardy: thanks for bringing this up | 10:46 |
jtomasek | shardy: btw. is there progress on flattening parameters in tripleo-common? I am just curious. It is all fine if it is deferred due to low priority:) | 10:47 |
shardy | jtomasek: I think skramaja was planning to work on it, but I'm not aware of any patches yet | 10:48 |
florianf | shardy, jtomasek: I think we should be careful not to spread validation logic over multiple projects. | 10:48 |
shardy | jtomasek: I suspect we can do it with yaql in a mistral workflow | 10:48 |
jtomasek | shardy: ok | 10:49 |
shardy | florianf: I agree, but I think tripleo-validations started as kind of a "bolt on" addition to tripleo | 10:49 |
shardy | I'd like to see the validations properly integrated into our architecture | 10:49 |
shardy | and we can't do that if every kind of validation must be done only via ansible | 10:50 |
*** athomas has quit IRC | 10:50 | |
d0ugal | shardy: can you share the bug with me when you open it? | 10:50 |
* skramaja reading through | 10:50 | |
shardy | IMHO mistral makes a much more flexible integration point, and it fits well with our current architecture | 10:50 |
shardy | even if a bunch of validations still use ansible | 10:50 |
*** jprovazn has quit IRC | 10:51 | |
d0ugal | Makes sense. Ansible would be overkill for some simple validations | 10:51 |
shardy | yeah, all I want is to check two values in the plan | 10:51 |
jtomasek | florianf:, shardy: so we have means to run tripleo-validations as part of mistral workflow. next good step could be that a tripleo-validation could run any code, not just ansible | 10:51 |
d0ugal | shardy: Have you seen these? https://review.openstack.org/#/q/topic:validations-in-workflows | 10:51 |
*** athomas has joined #tripleo | 10:51 | |
jtomasek | (I am not entirely sure how it is implemented atm) | 10:51 |
*** athomas has quit IRC | 10:52 | |
florianf | jtomasek: We already have a bunch of validations that extend ansible via custom ansible modules. | 10:52 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: Apply setfiles on all mountpoints https://review.openstack.org/447076 | 10:52 |
shardy | d0ugal: Ah, no, cool, yeah that's exactly what I'm talking about :) | 10:52 |
shardy | florianf: so, we already have a split implementation, mistral workflows, custom mistral actions, ansible playbooks and custom ansible modules | 10:53 |
* shardy sighs | 10:53 | |
shardy | oh well, we'll have to work out ways to rationalize that over time I guess | 10:53 |
florianf | shardy, d0ugal: Yeah, it seems overkill. How about validations contributors who want a single place to see what's already being validated, plus a straightforward way to contribute new ones? | 10:53 |
d0ugal | shardy: they are ports of the validations in tripleoclient, I think thrash|g0ne plans to move them to tripleo-validations eventually - but maybe that isn't needed. | 10:53 |
*** [1]cdearborn has joined #tripleo | 10:54 | |
florianf | shardy, d0ugal: I asked thrash|g0ne to put in that comment :-) | 10:54 |
d0ugal | aha | 10:54 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates master: Bind mount directories that contain the key/certs for keystone https://review.openstack.org/449569 | 10:55 |
florianf | (the TODO to port the tripleoclient validations to ansible validations.) | 10:55 |
florianf | So I'm not advocating using ansible for every single validation, but keeping validation logic in a single place | 10:55 |
d0ugal | bandini, apetrich: my local env is currently being flooded with timeouts... so I am expecting the CLI command to fail (but not lol) any second. | 10:55 |
shardy | florianf: sure, I think that's a good idea, but it still means we've got two layers of validations, e.g pure mistral and mistral driving ansible | 10:56 |
bandini | d0ugal: ah nice that you can reproduce it! | 10:56 |
shardy | florianf: maybe that's OK and we just need to maintain the mistral workflows in the tripleo-validations repo | 10:56 |
florianf | One problem we'll already see with the legacy tripleoclient validations that are about to land in tripleo-common: They will not show up in the list of validations (in the UI for instance). | 10:57 |
apetrich | d0ugal, seeing the same here. I'm not seeing floods of it but some | 10:57 |
d0ugal | bandini: dang, it actually completed properly - I have an overcloudrc but the mistral log is still displaying these timeouts every couple of seconds - so I think something is wrong. | 10:57 |
skramaja | shardy: jtomasek yes. i started with flatten params, but couldn't progress because of ovs2.6 issues.. i think i should be able to focus on it next week. will keep you posted. as of now, i got the parameters, and resources. but couldn't get the service name though as it is in output. | 10:57 |
bandini | d0ugal: oh damn :/ | 10:57 |
d0ugal | apetrich: I am tailing all three mistral logs - when a timeout happens in one it seems to happen in them all, makes the rate seem higher :) | 10:57 |
*** athomas has joined #tripleo | 10:58 | |
d0ugal | shardy, florianf - we could add custom Mistral actions to tripleo-validations. | 10:58 |
apetrich | d0ugal, bandini mine about to finish. | 10:58 |
d0ugal | there is no reason they all need to be in tripleo-common - but that could be confusing for different reasons. | 10:58 |
jtomasek | skramaja: ok, let me know if you need anything, It would be great to make sure the result matches the implementation we currently have in UI | 10:59 |
shardy | florianf: ack, yeah that sounds like something we can and should solve, e.g by enabling pure mistral and mistral+ansible validations to be discovered via a mistral workflow | 10:59 |
florianf | shardy: Sure, ansible shouldn't be the hammer to hit every nail. But I like it as an easy way to contribute new validations for outsiders. | 10:59 |
shardy | skramaja: ack, thanks - I don't think it's super urgent, but let me know if you need any help :) | 10:59 |
skramaja | sure shardy jtomasek | 11:00 |
florianf | shardy: And that discovery workflow should live in -common? or -validations? | 11:00 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack-infra/tripleo-ci master: DO NOT MERGE: Testing TLS-everywhere with keystone container https://review.openstack.org/449570 | 11:01 |
shardy | florianf: right now it'll be in tripleo-common, but it sounds like a discussion about moving the mistral pieces into the tripleo-validations repo is worthwhile | 11:01 |
florianf | shardy: ack. | 11:01 |
*** jlinkes has joined #tripleo | 11:02 | |
*** dixiaoli has quit IRC | 11:03 | |
florianf | shardy: Not sure if I'm putting too much emphasis on potential contributors here. But I always thought it's nice that the tripleo-validations can be run independently from a simple validations repo checkout. I have absolutely no idea, of course, how many user-contributed validations we can expect in the future. But the ability to easily develop new ones seems like an asset to me, which we might not want to make more complicated than necessary. | 11:05 |
*** nyechiel has quit IRC | 11:06 | |
*** stendulker has quit IRC | 11:06 | |
*** thrash|g0ne is now known as thrash | 11:07 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: WIP: O->N Upgrade, make sure all nova placement parameter properly set. https://review.openstack.org/449572 | 11:13 |
chem | marios: hey, do you think that makes sense https://review.openstack.org/449572 ? | 11:15 |
*** tvignaud has quit IRC | 11:16 | |
chem | marios:or am i missing something ? | 11:16 |
d0ugal | apetrich: how did yours turn out? | 11:18 |
*** dtantsur|afk is now known as dtantsur | 11:18 | |
*** udesale has joined #tripleo | 11:19 | |
chem | mcornea: the quick patch https://review.openstack.org/449572 (owalsh) ... | 11:19 |
dtantsur | folks, could you please approve 2 easy backports with CI passed and 1x +2? https://review.openstack.org/#/c/448028/ and https://review.openstack.org/#/c/448029/ | 11:19 |
marios | chem: hey, looking | 11:19 |
* dtantsur needs these landed before his PTO next week.. | 11:19 | |
chem | marios: sorry for the lack of explanation. It causes this error 2017-03-24 10:54:48.707 12174 ERROR nova.compute.manager MissingRequiredOptions: Auth plugin requires parameters which were not given: auth_url on the compute node | 11:20 |
chem | marios: and we hope that it's the root of another bz for nova upgrade :) | 11:20 |
marios | chem: yeah, it's ok. I am the one that added that crudini stuff there; it is meant to go to the non-controllers, right? I mean wherever the manual upgrade will run | 11:21 |
marios | chem: and i think we originally had a restart | 11:21 |
mcornea | chem: thanks, will test it | 11:21 |
marios | chem: will comment on the review | 11:21 |
chem | marios: oki, thanks. | 11:21 |
*** udesale has quit IRC | 11:21 | |
marios | chem: btw had a call with jakub and he had some more info/ideas/fix for package (neutron-*) but for now i added mask/unmask at https://review.openstack.org/#/c/445494/5/puppet/services/neutron-ovs-agent.yaml not sure if it will help but a possibility | 11:22 |
marios | chem: thanks for the idea (you mentioned disable i think mask does what we need) | 11:22 |
chem | marios: this is not run during the manual step, it's done during the delivery of the script | 11:22 |
marios | chem: yeah it is run during delivery but it is only done for those nodes that will be manually upgraded | 11:23 |
thrash | shardy: d0ugal florianf jtomasek so, reading back (and not sure I got everything), I see the validations doing multiple things... The stuff that's in tripleo-validations are doing "hardware" checks it seems. Things that ansible would be good at, since it has the facts there. | 11:23 |
chem | marios: oki, we are on the same page then :) | 11:23 |
thrash | The validations being done in the client and soon in a mistral workflow are more surrounding the deployment. | 11:24 |
chem | marios: so many variations of systemctl, didn't know the mask command | 11:24 |
marios | chem: yeah, neither did I, I found it today :) | 11:27 |
jaosorior | sshnaidm|off: seems some ha jobs are failing cause they couldn't get a test-env environment :/ | 11:27 |
EmilienM | hello | 11:28 |
*** tvignaud has joined #tripleo | 11:29 | |
shardy | thrash: Hey, yeah that's the way I see it too, and I don't think the ansible approach is needed for the validations only concerned with the plan contents vs the actual nodes | 11:29 |
jaosorior | EmilienM: hey dude, for the TLS-everywhere work that involves getting it to work with containers I opened another blueprint: https://blueprints.launchpad.net/tripleo/+spec/tls-via-certmonger-containers if you have time to check it out. | 11:30 |
EmilienM | jaosorior: ack | 11:32 |
owalsh | chem: so all of the crudini command should have been within the if statement? | 11:34 |
jaosorior | d0ugal: damn dude, saw the missing overcloudrc error again: http://logs.openstack.org/48/446348/4/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/d4ea9f3/ | 11:35 |
chem | owalsh: well like this we won't have the auth_url error anymore, hope that it will fix the migration too | 11:35 |
owalsh | chem: how about the ROLE= line? | 11:35 |
d0ugal | jaosorior: ugh, there is something seriously wrong here | 11:37 |
d0ugal | jaosorior: I've put all I've learned on the bug: https://bugs.launchpad.net/tripleo/+bug/1675709 | 11:37 |
openstack | Launchpad bug 1675709 in tripleo "deploy succeeded but no overcloudrc was generated" [High,Triaged] | 11:37 |
chem | owalsh: it's unrelated, it's kind of a hack to pass the current ROLE name down to the written script | 11:37 |
chem | owalsh: but you're right, it belongs outside of the if | 11:37 |
chem | owalsh: thanks | 11:37 |
*** tvignaud has quit IRC | 11:37 | |
d0ugal | jaosorior: yours is slightly different which is interesting. | 11:37 |
*** pkovar has joined #tripleo | 11:38 | |
jaosorior | what the hell man | 11:38 |
jaosorior | funky | 11:38 |
owalsh | chem: ack, yea, thats what I meant, assumed it should be outside | 11:38 |
jtomasek | thrash: I agree ansible is not necessary there, although what we probably should do is keep validations in single place and make sure they use same api and are provided by same api | 11:38 |
jaosorior | d0ugal: well, if the print statement you added doesn't appear, then it might be another issue. | 11:38 |
*** pgadiya has quit IRC | 11:38 | |
jtomasek | thrash: so clients are able to list the validations as well as access validation results | 11:39 |
openstackgerrit | Tom Barron proposed openstack/tripleo-heat-templates stable/ocata: Configure horizon to use keystone v2 https://review.openstack.org/449027 | 11:40 |
apetrich | d0ugal, btw mine passed the timeouts happened but early on. not close to the finish | 11:40 |
jaosorior | d0ugal: DUDE, your commit failed with the same issue | 11:40 |
d0ugal | jaosorior: I don't think that print statement will appear tbh | 11:40 |
jaosorior | d0ugal: and the print didn't appear there. | 11:40 |
d0ugal | jaosorior: woah | 11:40 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: N->O Upgrade, make sure all nova placement parameter properly set. https://review.openstack.org/449572 | 11:40 |
d0ugal | jaosorior: yeah, I didn't think it would be that. but good to eliminate it | 11:40 |
apetrich | d0ugal, I lie. they are there close to the finish | 11:40 |
*** bfournie has quit IRC | 11:41 | |
openstackgerrit | Bogdan Dobrelya proposed openstack-infra/tripleo-ci master: Adapt getthelogs UX for more use cases https://review.openstack.org/449552 | 11:43 |
d0ugal | apetrich, jaosorior: I am running out of ideas :( | 11:43 |
d0ugal | it is almost like somebody snuck a sys.exit(0) in there somewhere lol | 11:43 |
jaosorior | d0ugal: would it be that there's an Exception going on somewhere that cliff is not setting up as fatal? | 11:44 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: N->O Upgrade, make sure all nova placement parameter properly set. https://review.openstack.org/449572 | 11:44 |
florianf | shardy, thrash, jtomasek: Maybe the question isn't so much if we use ansible or something else, but: What is our interface to list and execute validations? atm ansible is a common interface for all, which unifies listing/execution and fact-gathering. Of course the drawback is that it's overkill for simple stuff. But can we really keep simple stuff simple if we want a unified API for all validations? | 11:44 |
d0ugal | jaosorior: Good question. I'll test for that. | 11:44 |
thrash | florianf: it's extreme overkill for quite a bit. And I'm not sure that we should be calling both of these "validations" tbh | 11:45 |
openstackgerrit | Bogdan Dobrelya proposed openstack-infra/tripleo-ci master: Adapt getthelogs UX for more use cases https://review.openstack.org/449552 | 11:45 |
*** abishop has joined #tripleo | 11:46 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-quickstart-extras master: DO NOT MERGE: Setting debug for overcloud deploy https://review.openstack.org/449580 | 11:46 |
jaosorior | d0ugal: ^^ | 11:46 |
thrash | as shardy said... There's really no point to go mistral -> ansible -> swift when we can go mistral -> swift. | 11:46 |
thrash | florianf: perhaps the proper way to do this would be to move the validations workbook and associated actions into tripleo-validations. | 11:47 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/tripleo-heat-templates master: DO NOT MERGE: Trying to replicate missing overcloudrc https://review.openstack.org/449582 | 11:47 |
jaosorior | d0ugal: ^^ | 11:47 |
thrash | and that all validations, whether they are in ansible or directly in mistral, should have a workflow. | 11:47 |
*** abishop has quit IRC | 11:48 | |
florianf | thrash: Right, so you mean we should distinguish between simple plan checks and validations? | 11:48 |
thrash | florianf: similar to how I did it. | 11:48 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: TESTING print any exceptions https://review.openstack.org/449583 | 11:48 |
d0ugal | jaosorior: nice. this too ^ | 11:48 |
thrash | florianf: that's one way of looking at it, yes. | 11:48 |
*** abishop has joined #tripleo | 11:48 | |
jaosorior | d0ugal: dude! good idea! | 11:48 |
d0ugal | jaosorior: and now we wait a bit :) | 11:49 |
*** tbonds has quit IRC | 11:49 | |
d0ugal | jaosorior: meanwhile I am trying to break my deploy locally with no luck... the one time you want tripleo to fail it doesn't :-D | 11:49 |
jaosorior | haha dammit! | 11:49 |
*** abishop has quit IRC | 11:49 | |
thrash | florianf: it's likely that we can convert some of those ansible modules into mistral actions (and they can probably have dual citizenship) | 11:49 |
*** abishop has joined #tripleo | 11:50 | |
*** abishop has quit IRC | 11:51 | |
*** tvignaud has joined #tripleo | 11:51 | |
*** abishop has joined #tripleo | 11:51 | |
*** zoli|lunch is now known as zoli | 11:51 | |
thrash | florianf: shardy but first step right now should be getting them out of tripleoclient... | 11:52 |
*** abishop has quit IRC | 11:52 | |
thrash | florianf: shardy and my preference for things coming out of tripleoclient should go directly to tripleo-common. | 11:52 |
EmilienM | jaosorior: * Separate the certificate requests from the puppet files of the services | 11:52 |
*** abishop has joined #tripleo | 11:52 | |
florianf | thrash: oh. like a library of validation logic that can be a mistral action or be run on some host via ansible. | 11:52 |
EmilienM | jaosorior: what does it mean exactly? | 11:52 |
thrash | florianf: yeah... | 11:54 |
thrash | If you want to run it from ansible, fine. Here's the core logic. | 11:54 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: TESTING is it any different without the websocket events? https://review.openstack.org/449585 | 11:54 |
thrash | florianf: then the ansible modules become thin wrappers just to make them modules. | 11:54 |
jaosorior | EmilienM: https://review.openstack.org/#/c/444891/ | 11:54 |
jaosorior | EmilienM: so instead, the requests are done here https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/base/certmonger_user.pp | 11:55 |
EmilienM | jaosorior: ah, nice | 11:55 |
jaosorior | EmilienM: that's still done on the baremetal node though. | 11:55 |
jtomasek | thrash, florianf: I think so far the only problem is with listing validations, as that's tied to ansible (AFAIK). If we were able to change this, then all we need to do is keep the validation workflow execution output matching a certain format - so we can get validation results | 11:55 |
*** rbrady-afk is now known as rbrady | 11:55 | |
jaosorior | EmilienM: hopefully at some point we can containerize all the stuff that's needed. So there would be one container that has the credentials to the CA and does the requests | 11:56 |
*** jpena is now known as jpena|lunch | 11:56 | |
jaosorior | EmilienM: so in that sense, this is moving in the right direction as well | 11:56 |
thrash | jtomasek: I think that should be a relatively easy problem to solve. | 11:56 |
thrash | jtomasek: you call a mistral workflow to list them, right? | 11:56 |
*** jayg|g0n3 is now known as jayg | 11:57 | |
jtomasek | thrash: we're calling mistral actions: VALIDATIONS_LIST: 'tripleo.validations.list_validations', | 11:57 |
jtomasek | VALIDATIONS_RUN: 'tripleo.validations.v1.run_validation', | 11:57 |
jtomasek | VALIDATIONS_RUN_GROUPS: 'tripleo.validations.v1.run_groups' | 11:57 |
thrash | jtomasek: just out of curiosity, why are y'all calling actions directly, and not workflows? | 11:58 |
*** tvignaud has quit IRC | 11:58 | |
jtomasek | thrash: calling action is a request->response thing which is much faster/simpler for GUI to consume - it is basically ordinary API call | 11:58 |
*** salmankhan has quit IRC | 11:59 | |
jtomasek | thrash: operations which don't need to be wrapped in workflow are faster to call as actions | 11:59 |
thrash | jtomasek: but calling workflows at some point is ok, right? | 12:00 |
thrash | :D | 12:00 |
jtomasek | thrash: yes, it is | 12:00 |
thrash | jtomasek: ok... Just making sure. :D | 12:00 |
jtomasek | thrash: :) | 12:00 |
openstackgerrit | Christian Schwede proposed openstack/tripleo-quickstart-extras master: Use subjectAltName in self-generated SSL certs https://review.openstack.org/449588 | 12:00 |
*** abishop has quit IRC | 12:00 | |
thrash | jtomasek: anyway... I think we can have validation discovery from mistral too... | 12:01 |
*** udesale has joined #tripleo | 12:01 | |
jaosorior | cschwede: nice! | 12:01 |
*** salmankhan has joined #tripleo | 12:01 | |
thrash | jtomasek: we'd just namespace the validation workflows. | 12:02 |
jtomasek | thrash: yes | 12:02 |
*** abishop has joined #tripleo | 12:02 | |
*** adarazs is now known as adarazs_lunch | 12:02 | |
jtomasek | thrash: so each validation would basically be separate workflow? | 12:02 |
thrash | jtomasek: then you can make a direct call to list the workflows, right? | 12:02 |
jtomasek | yes | 12:02 |
*** links has quit IRC | 12:02 | |
thrash | jtomasek: ideally, yes. | 12:02 |
florianf | thrash, jtomasek: Currently most (if not all) tripleo-validations are things that potentially run on multiple hosts. So we need ansible for this. So the question is probably more: Do these "simple" deployment checks qualify as validations, and do we in fact want them available/executable in clients like the UI? | 12:03 |
thrash | jtomasek: so you'd ask mistral for 'give me all workflows named "tripleo.validations.v1.impl.*"' | 12:03 |
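A minimal sketch of the namespace lookup thrash describes above: filter a list of workflow names with a glob pattern. The workflow list is hard-coded and the `check_*` names are hypothetical; a real client would fetch the names from Mistral, and no Mistral API call is shown here.

```python
from fnmatch import fnmatchcase

# Hypothetical workflow names; a real client would fetch these from Mistral.
WORKFLOWS = [
    "tripleo.validations.v1.impl.check_node_count",   # hypothetical
    "tripleo.validations.v1.impl.check_flavors",      # hypothetical
    "tripleo.validations.v1.run_groups",
    "tripleo.plan_management.v1.update_deployment_plan",  # assumed name
]


def list_validations(names, pattern="tripleo.validations.v1.impl.*"):
    """Return the sorted workflow names that match the namespace pattern."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


if __name__ == "__main__":
    for name in list_validations(WORKFLOWS):
        print(name)
```

With this convention, discovery only depends on how the workflows are named, so adding a new validation would not require a separate registry.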
thrash | florianf: we do want them available/executable from the clients. | 12:03 |
thrash | florianf: shardy unless we can somehow move these into a heat template. | 12:03 |
thrash | which I'm not sure that would even work. | 12:04 |
thrash | florianf: but I definitely see the demarcation there... | 12:04 |
thrash | florianf: if need to run across multiple hosts, absolutely that makes sense to be in ansible, | 12:04 |
jtomasek | thrash: well primarily those are going to be triggered from other workflows as subworkflows | 12:04 |
shardy | florianf: I think it's two different types of validation, one is check if the environment is broken, one is did the user do something wrong and the plan data is broken (or e.g there aren't enough nodes, or there's a profile mismatch, or whatever) | 12:04 |
*** links has joined #tripleo | 12:05 | |
shardy | thrash: some things are already validated via constraints in the heat templates, but really we want to do this earlier | 12:05 |
thrash | shardy: right... We aren't checking the nodes themselves for errors. | 12:05 |
thrash | shardy: ack | 12:05 |
shardy | thrash: perhaps if the plan create workflow did a stack preview we could do that | 12:05 |
shardy | thrash: but yeah, and there's a bunch of other stuff like checking ironic node tagging against flavors etc | 12:06 |
shardy | which we can't do in heat, so I like the approach you've already taken with the mistral based validations | 12:06 |
shardy | maybe we can solve this debate by just namespacing them "plan_validation" ;) | 12:06 |
thrash | shardy: ack. I really think we just need a rename. :) Stop calling both of these things "validations" | 12:07 |
jtomasek | in any case, all the validations are workflows which usually run as a subworkflow of another workflow | 12:07 |
openstackgerrit | Marius Cornea proposed openstack/tripleo-heat-templates master: WIP: Stop openstack-nova-compute during nova-ironic upgrade https://review.openstack.org/449596 | 12:07 |
jtomasek | that is why I think it would be nice if they shared the same format | 12:07 |
jaosorior | d0ugal: mistral creates the overcloudrc files. But does it write them in the filesystem as well? or is that tripleoclient? | 12:07 |
d0ugal | jaosorior: that is still in tripleoclient (mistral doesn't know where or what the users home dir is) | 12:08 |
jaosorior | ok | 12:08 |
d0ugal | jaosorior: https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/utils.py#L66 | 12:08 |
*** salmankhan has quit IRC | 12:08 | |
thrash | jtomasek: what is that format? Are you talking about the stdout/stderr thing? | 12:08 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: WIP - Add upgrade_batch_tasks to neutron-l3-agent https://review.openstack.org/445494 | 12:08 |
d0ugal | jaosorior: which is called from..... https://github.com/openstack/python-tripleoclient/blob/36b6b09fb307399458a9bfadef497cbcae35f3c4/tripleoclient/v1/overcloud_deploy.py#L1160 | 12:09 |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: WIP - Add upgrade_batch_tasks to neutron-l3-agent https://review.openstack.org/445494 | 12:09 |
jtomasek | thrash: yes mostly, + the api to be able to list them. But if we agree that those plan_validations are something that is not supposed to be listed and run from client, but solely as part of certain workflow, then fine | 12:10 |
jaosorior | d0ugal: thanks dude; makes sense | 12:10 |
thrash | jtomasek: that doesn't work for the checks that I'm doing... We have errors and warnings. Errors are things that must be fixed. Warnings are just that. Warnings that something *could* go wrong should you proceed. | 12:10 |
jaosorior | d0ugal: somehow I really think there's an exception somewhere that's being ignored. | 12:10 |
jtomasek | thrash: ack | 12:10 |
*** vpickard_ is now known as vpickard | 12:11 | |
thrash | jtomasek: yeah... it probably should just be a part of the deploy workflow anyway, since that's where they were in the client. | 12:11 |
openstackgerrit | Marius Cornea proposed openstack/tripleo-heat-templates master: WIP: Stop openstack-nova-compute during nova-ironic upgrade https://review.openstack.org/449596 | 12:11 |
thrash | although some could probably be done when uploading the plan. | 12:11 |
jaosorior | EmilienM: could I get your opinion on this patch? https://review.openstack.org/#/c/449536/ | 12:11 |
jtomasek | thrash: so in case that validation fails, whole workflow fails and that validation populates output of the workflow, so client can display error | 12:11 |
shardy | thrash: yeah I think many of these can be done when creating/updating the plan | 12:11 |
thrash | jtomasek: yes | 12:12 |
*** tvignaud has joined #tripleo | 12:12 | |
shardy | so we'd have a plan_update workflow which can optionally run plan_validations, which can return errors and warnings | 12:12 |
shardy | the thing we're missing is the top-level workflow to wire it all together I think | 12:12 |
thrash | shardy: exactly. And if each check is a workflow, that should be easy to integrate into the plan update/create workflows. | 12:12 |
d0ugal | jaosorior: yeah, hopefully - that would be the "best" outcome | 12:12 |
shardy | thrash: cool, yeah exactly :) | 12:13 |
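The errors-versus-warnings idea discussed above could be sketched roughly as follows. This is an illustration in plain Python, not Mistral workflow code: `update_plan`, the two checks, and the plan keys are all hypothetical names chosen for the example. Errors fail the whole run, warnings are only reported.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationResult:
    errors: list = field(default_factory=list)    # must be fixed before deploying
    warnings: list = field(default_factory=list)  # something *could* go wrong


def check_node_count(plan):
    """Hypothetical check: error if fewer nodes are available than requested."""
    result = ValidationResult()
    if plan["available_nodes"] < plan["requested_nodes"]:
        result.errors.append("not enough nodes for the requested deployment")
    return result


def check_profiles(plan):
    """Hypothetical check: warn about nodes without a profile tag."""
    result = ValidationResult()
    if plan.get("untagged_nodes", 0) > 0:
        result.warnings.append("some nodes have no profile assigned")
    return result


def update_plan(plan, validations=()):
    """Run the optional validations and fold their output into one result."""
    combined = ValidationResult()
    for validate in validations:
        outcome = validate(plan)
        combined.errors.extend(outcome.errors)
        combined.warnings.extend(outcome.warnings)
    status = "FAILED" if combined.errors else "SUCCESS"
    return {"status": status,
            "errors": combined.errors,
            "warnings": combined.warnings}


if __name__ == "__main__":
    plan = {"requested_nodes": 3, "available_nodes": 2, "untagged_nodes": 1}
    print(update_plan(plan, [check_node_count, check_profiles]))
```

Because every check returns the same shape, a client can display the combined output without knowing which individual check produced it.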
EmilienM | jaosorior: sure | 12:14 |
jtomasek | shardy: by top level workflow you mean something that wraps plan_update workflow (e.g. deploy workflow?) | 12:14 |
*** lucasagomes is now known as lucas-hungry | 12:14 | |
*** nyechiel has joined #tripleo | 12:14 | |
EmilienM | jaosorior: can you have multiple certificates request with the same cert_dir? | 12:15 |
shardy | jtomasek: I guess we can enhance update_deployment_plan to do it | 12:15 |
shardy | jtomasek: the problem is that doesn't accept all data associated with the plan atm, e.g the list of environments etc, so we end up having to create the plan, then do stuff, then do validation | 12:15 |
EmilienM | jaosorior: because if yes, you'll have duplicated resource in your catalog, unless you use ensure_resource from puppetlabs-stdlib | 12:15 |
jaosorior | EmilienM: oh shit. | 12:15 |
jaosorior | I'll use ensure_resource then | 12:16 |
jaosorior | EmilienM: thanks man! | 12:16 |
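The duplicate-resource problem EmilienM points out can be modelled with a toy catalog. This is a simplified Python sketch under assumptions, not Puppet code: `Catalog`, `declare` and `ensure` are hypothetical names, and `ensure` only approximates the behaviour of `ensure_resource` (reuse an identical declaration, otherwise declare it).

```python
class DuplicateResourceError(Exception):
    """Raised when the same resource is declared twice."""


class Catalog:
    """Toy stand-in for a Puppet catalog, keyed by (type, title)."""

    def __init__(self):
        self._resources = {}

    def declare(self, rtype, title, **params):
        # A plain declaration fails if the resource is already in the catalog.
        key = (rtype, title)
        if key in self._resources:
            raise DuplicateResourceError(f"{rtype}[{title}] is already declared")
        self._resources[key] = params

    def ensure(self, rtype, title, **params):
        # Idempotent variant: an identical existing declaration is reused.
        key = (rtype, title)
        if self._resources.get(key) == params:
            return
        self.declare(rtype, title, **params)

    def __len__(self):
        return len(self._resources)


if __name__ == "__main__":
    catalog = Catalog()
    cert_dir = "/etc/pki/tls/certs/httpd"  # hypothetical path
    # Two certificate requests sharing one directory both ensure it exists.
    catalog.ensure("file", cert_dir, ensure="directory")
    catalog.ensure("file", cert_dir, ensure="directory")
    print(len(catalog))
```

In this model, two requests that share a `cert_dir` can both ask for the directory without the second one failing.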
shardy | jtomasek: I'm just saying we can perhaps integrate the steps a little better in future | 12:16 |
jtomasek | ok | 12:16 |
florianf | thrash, jtomasek, shardy: but if those plan_validations make the create/update workflows fail, do we really want/need them to be separately executable from a client? | 12:16 |
*** bfournie has joined #tripleo | 12:16 | |
jtomasek | florianf: yes, we don't need that - that is why it is different from 'ansible validations we have' (IIUC) | 12:17 |
shardy | florianf: probably not, but it's still useful to define each validation as a discrete workflow | 12:17 |
shardy | florianf: maybe we just have a way to say they're internal vs something you click on in the UI | 12:17 |
*** ratailor has quit IRC | 12:17 | |
*** dougbtv_ has joined #tripleo | 12:18 | |
*** tvignaud has quit IRC | 12:18 | |
* shardy votes for plan_validation namespace :) | 12:18 | |
* florianf agrees with shardy | 12:18 | |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/puppet-tripleo master: Ensure directory exists for certificates for httpd https://review.openstack.org/449536 | 12:19 |
jaosorior | EmilienM: what about this? ^^ | 12:19 |
EmilienM | jaosorior: better | 12:20 |
*** katkapilatova has joined #tripleo | 12:20 | |
jaosorior | EmilienM: nice catch. Thanks for the feedback. | 12:21 |
EmilienM | jaosorior: anytime | 12:21 |
openstackgerrit | Pradeep Kilambi proposed openstack/instack-undercloud master: Run ceilometer-upgrade for gnocchi conditionally https://review.openstack.org/448762 | 12:21 |
EmilienM | jaosorior: it would be nice to have a tls everywhere testing in the multinode jobs, would it be possible? | 12:22 |
*** nyechiel has quit IRC | 12:23 | |
*** katkapilatova has left #tripleo | 12:24 | |
jaosorior | EmilienM: I gotta investigate. | 12:24 |
*** katkapilatova has joined #tripleo | 12:24 | |
EmilienM | jaosorior: even if we have to deploy a service on the undercloud | 12:24 |
jaosorior | EmilienM: so, deploying the service in the undercloud is not really an issue. | 12:24 |
jaosorior | EmilienM: there are two deterrents currently. | 12:24 |
*** jprovazn has joined #tripleo | 12:24 | |
*** d0ugal has quit IRC | 12:24 | |
jaosorior | EmilienM: 1. We need to deploy the CA somewhere, so it would require another node that gets set up before the undercloud (this one we can probably address and is not a big deal) | 12:25 |
*** dprince has joined #tripleo | 12:25 | |
jaosorior | 2. multinode jobs use DeployedServer resources from heat, and currently it's not possible to set metadata for these; so the kerberos principals would have issues there. | 12:25 |
EmilienM | jaosorior: I se | 12:26 |
EmilienM | see* | 12:26 |
*** liverpooler has quit IRC | 12:26 | |
jaosorior | EmilienM: I tried adding that https://review.openstack.org/#/c/422868/ but apparently we can't assure that a DeployedServer will come from OpenStack | 12:26 |
*** dprince has quit IRC | 12:26 | |
jaosorior | so that's the main blocker at the moment. | 12:26 |
*** liverpooler has joined #tripleo | 12:26 | |
*** dprince has joined #tripleo | 12:26 | |
EmilienM | jaosorior: ok thanks... we'll have to figure out something later probably | 12:27 |
jaosorior | EmilienM: I really need to dig more into DeployedServer and come up with a solution for this before we can do this. | 12:27 |
jaosorior | EmilienM: so meanwhile, we currently are pretty tied to OVB | 12:27 |
thrash | florianf: jtomasek the ansible validations are selectable because they don't necessarily apply to every deployment, right? | 12:29 |
*** janki has quit IRC | 12:30 | |
*** tvignaud has joined #tripleo | 12:31 | |
*** shardy is now known as shardy_lunch | 12:31 | |
snecklifter | jprovazn: hello, what are the chances of getting https://bugs.launchpad.net/tripleo/+bug/1644784 as a backport to Newton? | 12:31 |
openstack | Launchpad bug 1644784 in tripleo "Support deploying of Manila / CephFS with managed Ceph" [Medium,Fix released] - Assigned to Jan Provaznik (jan-provaznik) | 12:31 |
jtomasek | thrash: they should, they are meant to persistently warn the user that something with their setup is wrong and that they should not deploy before those issues are fixed | 12:32 |
florianf | thrash: They all belong to a group, so in principle they should be run in every deployment. | 12:32 |
florianf | thrash: You can run them separately so you can restart a failed one after you made changes for instance. | 12:33 |
*** morazi has joined #tripleo | 12:33 | |
*** pkovar has quit IRC | 12:35 | |
*** pkovar has joined #tripleo | 12:35 | |
*** flepied has quit IRC | 12:36 | |
openstackgerrit | Merged openstack/tripleo-docs master: Switch trunk/cbs/buildlogs to use https https://review.openstack.org/448044 | 12:38 |
*** trown|outtypewww is now known as trown | 12:39 | |
jprovazn | snecklifter: hello, I think that chances are minimal, it was a new feature consisting of about 7 patches across multiple repos | 12:39 |
snecklifter | jprovazn: ack, thanks for responding | 12:39 |
jprovazn | snecklifter: np | 12:40 |
snecklifter | jprovazn: oh and thanks for implementing it in Ocata too! | 12:41 |
jprovazn | snecklifter: it was gfidente who added ceph mds support :) | 12:41 |
*** udesale has quit IRC | 12:42 | |
jprovazn | snecklifter: I helped with some manila specific things | 12:42 |
snecklifter | jprovazn: a true team effort \o/ | 12:42 |
jprovazn | snecklifter: anyway, you are welcome :) | 12:42 |
*** rbowen has quit IRC | 12:42 | |
*** dougbtv_ has quit IRC | 12:42 | |
*** rbowen has joined #tripleo | 12:44 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: WIP: N->O upgrade, blanks ipv6 rules before activating it. https://review.openstack.org/449613 | 12:45 |
*** dougbtv_ has joined #tripleo | 12:45 | |
*** dougbtv_ is now known as dougbtv|laptop | 12:46 | |
chem | matbu: bandini could you cross check -> https://review.openstack.org/#/c/449613/ | 12:47 |
chem | matbu: bandini it's about the firewall ipv6 | 12:48 |
*** psahoo has quit IRC | 12:48 | |
chem | matbu: bandini I've just coded that without running it once, so beware :) | 12:48 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-quickstart-extras master: Upgrade to containerized overcloud https://review.openstack.org/448576 | 12:49 |
*** eck` is now known as eck`gone | 12:49 | |
chem | dsneddon: hi, we have an issue with upgrade with this https://github.com/openstack/tripleo-heat-templates/commit/a3f03eb307797ac5eef1251b9252e642db326e07 | 12:50 |
chem | dsneddon: basically what do you think would be the best course of action to migrate those parameters from the previous version to the new one? | 12:51 |
*** thrash is now known as thrash|brb | 12:51 | |
*** rlandy has joined #tripleo | 12:52 | |
*** flepied has joined #tripleo | 12:54 | |
*** tzumainn has joined #tripleo | 12:56 | |
*** ramishra has quit IRC | 12:56 | |
matbu | chem: /me looks | 12:56 |
rlandy | dmsimard: hi - 1450 didn't help - we got an undercloud install failure on the third run | 12:58 |
dmsimard | rlandy: what did it fail on ? | 12:58 |
*** jpena|lunch is now known as jpena | 13:03 | |
*** tvignaud has quit IRC | 13:03 | |
*** flepied has quit IRC | 13:03 | |
*** adarazs_lunch is now known as adarazs | 13:04 | |
*** milan has joined #tripleo | 13:04 | |
openstackgerrit | Bogdan Dobrelya proposed openstack-infra/tripleo-ci master: Adapt getthelogs UX for more use cases https://review.openstack.org/449552 | 13:09 |
*** lblanchard has joined #tripleo | 13:13 | |
*** thrash|brb is now known as thrash | 13:13 | |
*** jcoufal has joined #tripleo | 13:15 | |
*** tvignaud has joined #tripleo | 13:16 | |
*** shardy_lunch is now known as shardy | 13:17 | |
*** flepied has joined #tripleo | 13:17 | |
*** zoli is now known as zoli|afk-FBCI | 13:19 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-ui master: Add favicon icons https://review.openstack.org/420111 | 13:20 |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-ui master: Add favicon icons https://review.openstack.org/420111 | 13:21 |
*** jrist has joined #tripleo | 13:22 | |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Switch trunk/cbs/buildlogs to use https https://review.openstack.org/448037 | 13:25 |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-common master: Add MigrationSshKey to generated passwords https://review.openstack.org/449239 | 13:26 |
*** lucas-hungry is now known as lucasagomes | 13:27 | |
EmilienM | owalsh: ^ can you add a release note please? | 13:30 |
owalsh | EmilienM: ack | 13:31 |
*** zoli|afk-FBCI is now known as zoli | 13:32 | |
*** mburned is now known as mburned_out | 13:33 | |
*** mburned_out is now known as mburned | 13:33 | |
*** lucasagomes is now known as lucas-brb | 13:35 | |
*** prateek has quit IRC | 13:36 | |
bandini | chem: ack will look shortly | 13:39 |
chem | matbu: bandini thanks | 13:40 |
*** jkilpatr has quit IRC | 13:44 | |
*** jcoufal_ has joined #tripleo | 13:44 | |
*** d0ugal has joined #tripleo | 13:45 | |
*** jcoufal has quit IRC | 13:46 | |
*** skramaja has quit IRC | 13:48 | |
d0ugal | jaosorior: lol, so my debugging patches both passed >_< | 13:48 |
jaosorior | agh!! | 13:49 |
jaosorior | d0ugal: do you think it's due to the patches that get several messages at once? | 13:49 |
d0ugal | jaosorior: no, because none of that CLI work has landed yet. | 13:49 |
d0ugal | jaosorior: well, other than the initial code that isn't used - so I don't think so, but maybe worth looking into | 13:50 |
*** gbarros has joined #tripleo | 13:50 | |
*** salmankhan has joined #tripleo | 13:51 | |
jaosorior | I see | 13:51 |
*** links has quit IRC | 13:53 | |
openstackgerrit | Carlos Camacho proposed openstack/tripleo-ui master: Add favicon icons https://review.openstack.org/420111 | 13:53 |
*** sshnaidm|off has quit IRC | 13:54 | |
openstackgerrit | Gael Chamoulaud proposed openstack/tripleo-quickstart-extras master: Add blank newline at the end of file https://review.openstack.org/449028 | 13:56 |
openstackgerrit | Steven Hardy proposed openstack/python-tripleoclient master: Don't track added_files in deploy environment processing https://review.openstack.org/446045 | 13:57 |
openstackgerrit | Steven Hardy proposed openstack/python-tripleoclient master: Move clients into class constructor https://review.openstack.org/449633 | 13:57 |
bandini | chem: I have added a comment | 14:00 |
remix_tj | http://logs.openstack.org/16/447416/1/gate/gate-tripleo-ci-centos-7-multinode-upgrades/9848094/console.html#_2017-03-23_22_23_51_587577 how do i troubleshoot this? | 14:00 |
dmsimard | rlandy: I didn't hear back earlier, what did the third undercloud install fail on ? | 14:01 |
*** lucas-brb is now known as lucasagomes | 14:01 | |
openstackgerrit | Luke Hinds proposed openstack/tripleo-heat-templates master: Adds service for managing securetty https://review.openstack.org/449153 | 14:02 |
rlandy | dmsimard: sorry - instance got cleaned up - got another job with logs going now - will see | 14:02 |
jrist | akrivoka: I think your import/export stuff is getting close to merging | 14:03 |
chem | bandini: thanks | 14:03 |
bandini | remix_tj: I usually go on the nodes themselves (like http://logs.openstack.org/16/447416/1/gate/gate-tripleo-ci-centos-7-multinode-upgrades/9848094/logs/subnode-2/var/log/messages in that case) | 14:03 |
jrist | who else do we need to poke? | 14:03 |
akrivoka | jrist: it's been "close" for ages | 14:03 |
akrivoka | jrist: any tripleo core | 14:03 |
jrist | k | 14:03 |
jrist | d0ugal: ^ | 14:03 |
jrist | :) | 14:03 |
d0ugal | jrist: link? | 14:04 |
akrivoka | d0ugal: https://review.openstack.org/#/c/414169/ | 14:04 |
akrivoka | that's the one we really need to land first, all the other dependent ones already have +2s | 14:04 |
d0ugal | akrivoka: thanks. I feel like I've +2ed this in the past... | 14:04 |
jrist | d0ugal: (ノ◕ヮ◕)ノ*:・゚✧ | 14:04 |
d0ugal | huh, I've not even reviewed it | 14:04 |
d0ugal | I'm sure I've looked at it before... :/ | 14:05 |
* jrist cries for argentina | 14:05 | |
dmsimard | jrist: wow that is a magical emoji | 14:05 |
jrist | dmsimard: yesssss | 14:05 |
jrist | dmsimard: do you want a kawaii emoji? | 14:05 |
dmsimard | kawaiiiiiiiii | 14:05 |
remix_tj | bandini: 15.184.64.1 is not pingable. Recheck is enough? | 14:06 |
*** mdnadeem has quit IRC | 14:06 | |
jrist | \(^○^)人(^○^)/ | 14:06 |
jrist | hi 5! | 14:06 |
bandini | remix_tj: I *think* so, yes | 14:06 |
bandini | i am also getting a bunch of odd failures on multinode jobs atm | 14:06 |
slagle | did you run that emoji by the foundation first? | 14:06 |
slagle | not sure it fits with the existing branding rules | 14:06 |
dmsimard | burn | 14:07 |
akrivoka | lol | 14:07 |
jrist | lol | 14:07 |
jrist | slagle: welcome back! | 14:07 |
dmsimard | ok </friday> | 14:07 |
jrist | fine, different high five ┏(^0^)┛┗(^0^) ┓ | 14:07 |
amoralej | rlandy, are you using patched versions to get debug info if it fails?, that'd be nice | 14:08 |
jaosorior | EmilienM: can you check this one out https://review.openstack.org/#/c/447953/ ? | 14:08 |
EmilienM | sure | 14:08 |
jtomasek | akrivoka: how many of your patches are still waiting to merge | 14:09 |
jtomasek | ? | 14:09 |
EmilienM | jaosorior: have you seen dtantsur's comment? | 14:09 |
jaosorior | EmilienM: no | 14:09 |
openstackgerrit | Luke Hinds proposed openstack/puppet-tripleo master: Adds service for managing securetty https://review.openstack.org/449148 | 14:09 |
*** ealcaniz has quit IRC | 14:09 | |
EmilienM | jaosorior: I can approve this one and you follow up with the domain params | 14:09 |
akrivoka | jtomasek: https://review.openstack.org/#/c/414169/ https://review.openstack.org/#/c/422789/ https://review.openstack.org/#/c/425858/ https://review.openstack.org/#/c/437676/ | 14:10 |
rlandy | amoralej: I am not getting extra debug info - but we collect a decent log | 14:10 |
akrivoka | jtomasek: but the bottleneck is only the first one, all the other ones have +2s (and depend on the first one) | 14:10 |
jtomasek | akrivoka: thanks | 14:10 |
jaosorior | EmilienM: we'll do that for every service's auth | 14:11 |
jaosorior | EmilienM: I'm starting to think we should just have a hiera parameter that contains the domain | 14:11 |
jaosorior | EmilienM: and do some overriding of the keystone resource | 14:11 |
amoralej | rlandy, it'd be fantastic if we could get debug info by setting additional environment variable, i'm trying it in https://review.openstack.org/#/c/449548 | 14:11 |
EmilienM | jaosorior: yes | 14:11 |
d0ugal | akrivoka: Does this patch handle upgrades? | 14:12 |
EmilienM | jaosorior: approving your patch now and I'll let you update the rest if needed | 14:12 |
rlandy | amoralej: sure - on next run | 14:12 |
*** psahoo has joined #tripleo | 14:12 | |
d0ugal | akrivoka: i.e. I have a plan in Mistral, I upgrade to Pike, will it move the plan to Swift? | 14:12 |
*** psahoo is now known as psahoo|away | 14:12 | |
amoralej | thanks | 14:12 |
jaosorior | EmilienM: thanks | 14:12 |
jaosorior | EmilienM: oh... wait, can you check the depends-on for that patch? | 14:12 |
d0ugal | akrivoka: oh wait, I see a comment about this already. /me continues reading | 14:12 |
EmilienM | jaosorior: approved | 14:13 |
jaosorior | thanks dude | 14:13 |
*** noslzzp has quit IRC | 14:17 | |
*** noslzzp has joined #tripleo | 14:17 | |
*** jbadiapa has quit IRC | 14:18 | |
dprince | jistr: I rebuilt Ironic-conductor's container and tried it w/ the new heat templates. It seems to be restarted due to sudoers issue... | 14:19 |
bogdando | bandini, shardy: folks who fight CI for errors (EmilienM, mwhahaha ?), I improved the getthelogs tool, PTAL https://review.openstack.org/#/c/449552/. Also I gave examples for my custom parser script https://github.com/bogdando/fuel-log-parse#examples-for-tripleo-ci-openstack-infra-logs. The given example easily finds that 'ssl_depth' error we'd discussed above. | 14:19 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: DO NOT MERGE: difference without websockets? https://review.openstack.org/449585 | 14:20 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: DO NOT MERGE: print any exceptions https://review.openstack.org/449583 | 14:20 |
dprince | jistr: just wondering if you ever saw any of this with the new 'host_prep_task' stuff for this service... | 14:20 |
bogdando | ofc you don't need any of this if you're a kibana guru :) | 14:20 |
jistr | dprince: i'm deploying a full e2e upgrade from BM to containerized right now, will check when it's upgraded | 14:20 |
EmilienM | bogdando: ok i'll look shortly | 14:20 |
dprince | jistr: could be a regression somewhere else. Just wondering what might have changed | 14:21 |
jistr | dprince: do you have the exact error message? | 14:21 |
bogdando | EmilienM: I used to use that tool to find all nasty HA bugs and race conditions in fuel. Now it seems applicable to ooo as well | 14:21 |
jistr | dprince: i know kolla has quite often custom sudoers to allow their entrypoint scripts to sudo for very specific things but nothing else | 14:21 |
*** jmelvin has joined #tripleo | 14:21 | |
*** amoralej is now known as amoralej|lunch | 14:21 | |
bogdando | if you think it is usable (please try it) I'll perhaps make an announcement | 14:22 |
openstackgerrit | Juan Antonio Osorio Robles proposed openstack/instack-undercloud master: Set default domain for all keystone users https://review.openstack.org/449644 | 14:22 |
jaosorior | EmilienM: what about this? ^^ | 14:22 |
bandini | bogdando: nice. I will take a look | 14:23 |
tbarron | EmilienM: jaosorior I got better results on https://review.openstack.org/#/c/448614 and think we can go with that approach. | 14:23 |
*** bnemec is now known as beekneemech | 14:23 | |
jaosorior | tbarron: nice! thanks for the update | 14:24 |
bogdando | I believe in the scope of a single job investigation, it outperforms ELK/Kibana by speed and UX perhaps | 14:24 |
tbarron | jaosorior: ty! | 14:24 |
dprince | jistr: weird. It looks like sudo -E kolla_set_configs is failing for Ironic | 14:25 |
dprince | jistr: wonder what changed | 14:25 |
bogdando | but for the full world few, we really ought to adopt some bots for elastic-recheck! | 14:25 |
dprince | jistr: you removed the 'user: root' in your patch I think.... | 14:25 |
*** chlong has joined #tripleo | 14:25 | |
*** eck`gone is now known as eck` | 14:25 | |
dprince | jistr: this was required for it to functionally work I think. It is probably a bug on the container but I think we'll need it set for it to functionally work in the meantime | 14:26 |
bogdando | oops, I mean world view* | 14:26 |
*** pkovar has quit IRC | 14:26 | |
jistr | dprince: i removed the full container that has done `mkdir`s, which had user: root, but we don't need that container anymore, as the dirs are managed by host_prep_tasks now. https://github.com/openstack/tripleo-heat-templates/commit/1a4ece16cea40075fe7332ed048b9c289b3ff424 | 14:28 |
jistr | dprince: as far as the normal ironic container goes, i don't think there was user: root to begin with | 14:28 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-specs master: Create patch abandonment policy https://review.openstack.org/449332 | 14:28 |
dprince | jistr: yeah, weird. I see that now | 14:29 |
*** yprokule has quit IRC | 14:30 | |
jistr | https://github.com/openstack/kolla/commit/5752c7eb0b1f9c5978dd4e9271ded346cea231e0 | 14:31 |
jistr | https://github.com/openstack/kolla/commit/05c0d6998bdda155c925d563b75ac353303f93ff | 14:31 |
*** pkovar has joined #tripleo | 14:31 | |
jistr | dprince: ^ these do sth with sudoers, could be related maybe... i think the ironic images we have in dockerhub are quite old now, so this may have gone unnoticed for a while | 14:31 |
mwhahaha | bandini: you may need to -A+A to unstick https://review.openstack.org/#/c/445479/ | 14:31 |
d0ugal | akrivoka: reviewed | 14:32 |
dprince | jistr: exactly, I just rebuilt them. I was aiming to push new ones today fwiw.... | 14:32 |
dprince | jistr: it seems to be kolla_set_configs that fails though. Very weird | 14:33 |
*** jbadiapa has joined #tripleo | 14:33 | |
bandini | mwhahaha: oh just did. will that be enough? | 14:33 |
dprince | jistr: the https://github.com/openstack/kolla/commit/5752c7eb0b1f9c5978dd4e9271ded346cea231e0 is almost certainly related to what I'm seeing | 14:34 |
mwhahaha | bandini: it should, sometimes when it doesn't start gating it's because the notification of +A gets lost | 14:34 |
bandini | ooh | 14:34 |
bandini | did not know that | 14:34 |
dprince | jistr: I'll file a bug and try to fix it | 14:35 |
dprince | jistr: sorry I pinged about your commit. I saw 'user: root' got removed there and it confused me :) | 14:35 |
jistr | dprince: np :) and thanks for looking into that issue | 14:35 |
dmsimard | rlandy: any info yet ? | 14:36 |
mwhahaha | bandini: oh looks like it's been stuck in the gate for 12+ hours on status.openstack.org | 14:36 |
mwhahaha | bandini: so we just need to be patient | 14:37 |
*** morazi has quit IRC | 14:37 | |
rlandy | dmsimard: this time, passed | 14:37 |
rlandy | I am still running with 1450 mtu | 14:37 |
rlandy | but afaict - not much difference | 14:37 |
bandini | mwhahaha: I hope it is the golden pass, it's been 10 days of rechecks :) | 14:37 |
dmsimard | rlandy: ok, let us know if you see a failure :( | 14:37 |
akrivoka | d0ugal: thanks! | 14:37 |
jistr | dprince: interesting. the commits only touch files in /etc/sudoers.d, not /etc/sudoers itself (where the kolla_set_configs lives). I wonder if perhaps one of the files in /etc/sudoers.d/ is malformed, preventing sudo from working in general. | 14:38 |
rlandy | dmsimard: sure - spinning all the time - and I enabled full log collection now - so it should be easier to share | 14:38 |
*** udesale has joined #tripleo | 14:38 | |
jistr | dprince: s/where the kolla_set_configs lives/where the kolla_set_configs rule lives/ | 14:38 |
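One way to test the hypothesis above (a malformed file in /etc/sudoers.d/ breaking sudo in general) is to syntax-check each drop-in file on its own. The following is a minimal sketch, assuming `visudo` is available inside the image and the files are readable by the caller (which usually means running as root). `check_sudoers_dir` is a hypothetical helper name used for illustration, not something kolla provides.

```shell
#!/bin/sh
# check_sudoers_dir is a hypothetical helper, not something kolla ships.
# It syntax-checks every regular file in a sudoers drop-in directory.
check_sudoers_dir() {
    dir=${1:-/etc/sudoers.d}
    rc=0
    for f in "$dir"/*; do
        # Skip the unexpanded glob (empty directory) and non-files.
        [ -f "$f" ] || continue
        # visudo -c checks syntax only; -f selects the file to check.
        if ! visudo -c -f "$f" >/dev/null 2>&1; then
            echo "check failed: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Example call (commented out so sourcing this file has no side effects):
# check_sudoers_dir /etc/sudoers.d || echo "at least one drop-in needs attention"
```

A non-zero return only says that some file did not pass the check; a missing `visudo` binary or a permission problem would be reported the same way, so the printed file names are a starting point rather than a diagnosis.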
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: WIP, make script work in CI and interactive https://review.openstack.org/449370 | 14:39 |
dprince | jistr: could be | 14:39 |
openstackgerrit | Pradeep Kilambi proposed openstack/instack-undercloud master: Run ceilometer-upgrade for gnocchi conditionally https://review.openstack.org/448762 | 14:41 |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: WIP: SSH known_hosts config https://review.openstack.org/449660 | 14:41 |
openstackgerrit | mathieu bultel proposed openstack/tripleo-quickstart-extras master: Allow complex upgrade deployment for N to O https://review.openstack.org/439598 | 14:43 |
pradk | EmilienM, can you review this https://review.openstack.org/#/c/448762/9 .. i confirmed this is working locally (added some notes as well) | 14:44 |
pradk | blocking qe :( | 14:44 |
beekneemech | Heads up: the compute node hosting tripleo.org is up for reboot. The site will be down for a little while until that node is back up. | 14:45 |
pradk | mwhahaha, when you're around ^^ .. finally, works :) | 14:46 |
mwhahaha | pradk: ok i'll take a look | 14:47 |
pradk | ty sir | 14:47 |
*** fragatina has quit IRC | 14:47 | |
chem | owalsh: hey again, ... so we still have the issue. It boils down to "vif_type=binding_failed" | 14:51 |
chem | owalsh: does it ring a bell ? | 14:51 |
jaosorior | dtantsur: well, I'm not sure if it's relevant to add a release note for the keystone domain bits, since it's not something that the users can really see. We do use the default keystone domain that comes out of the box anyway | 14:51 |
jaosorior | in that patch | 14:51 |
*** morazi has joined #tripleo | 14:51 | |
dtantsur | jaosorior, well, it enabled keystone v3 support, so it's kinda a feature | 14:52 |
dtantsur | or a fix for keystone v3 | 14:52 |
jaosorior | it doesn't enable it yet | 14:52 |
dtantsur | well, I think it mostly does not work because of missing domain | 14:53 |
jaosorior | dtantsur: this one enables it https://review.openstack.org/#/c/446752/ | 14:53 |
dtantsur | assuming we already switched from versioned auth_url | 14:53 |
jaosorior | dtantsur: we haven't switched | 14:53 |
*** fragatina has joined #tripleo | 14:53 | |
jaosorior | dtantsur: so actually, that commit doesn't really do much until we actually make the switch | 14:53 |
dtantsur | jaosorior, then this change does not enable it. stackrc is unrelated to authtoken configuration | 14:53 |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: WIP: N->O upgrade, blanks ipv6 rules before activating it. https://review.openstack.org/449613 | 14:54 |
dtantsur | ok, if we use versioned auth_url still (I thought EmilienM has fixed it), then indeed this change does not affect keystone v3 support | 14:54 |
jaosorior | uhm... | 14:54 |
jaosorior | wait up | 14:54 |
jaosorior | dtantsur: I think you're right | 14:54 |
dtantsur | let's wait for the CI, I guess, and see what it actually uses | 14:55 |
jaosorior | right | 14:55 |
*** jbadiapa has quit IRC | 14:55 | |
*** jbadiapa has joined #tripleo | 14:56 | |
EmilienM | tbarron: excellent, I'll look at it | 14:57 |
tbarron | EmilienM: ty! | 14:57 |
openstackgerrit | Jiri Stransky proposed openstack/tripleo-quickstart-extras master: Upgrade to containerized overcloud https://review.openstack.org/448576 | 14:57 |
EmilienM | jaosorior: dtantsur is right about release note | 14:57 |
*** jkilpatr has joined #tripleo | 14:58 | |
*** prateek has joined #tripleo | 14:58 | |
jaosorior | EmilienM: why? we're using the default and it's not something the deployer cares at all. Unless we make it configurable. | 14:58 |
jaosorior | EmilienM: the release notes are getting very confusing and they should be useful for deployers | 14:58 |
EmilienM | jaosorior: ok | 14:59 |
jaosorior | EmilienM: at least that's the way I see it. | 14:59 |
EmilienM | tbarron: so we can move forward with https://review.openstack.org/#/c/448614/ ? | 14:59 |
EmilienM | tbarron: last call? :) | 14:59 |
pradk | is there any nifty script to clean up undercloud install ? | 14:59 |
tbarron | EmilienM: yeah, go for it | 14:59 |
EmilienM | tbarron: approving | 14:59 |
tbarron | EmilienM: thanks | 15:00 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: WIP, make script work in CI and interactive https://review.openstack.org/449370 | 15:00 |
*** prateek has quit IRC | 15:01 | |
*** prateek has joined #tripleo | 15:02 | |
chem | owalsh: this is certainly an openvswitch issue: | 15:02 |
jaosorior | d0ugal: I've been still seeing that error every once in a while. For some reason it's not constant. | 15:02 |
jaosorior | d0ugal: what was the LP bug again? | 15:02 |
openstackgerrit | Pradeep Kilambi proposed openstack/instack-undercloud master: Run ceilometer-upgrade for gnocchi conditionally https://review.openstack.org/448762 | 15:02 |
jrist | can we get a +a on https://review.openstack.org/#/c/414169/ plz | 15:03 |
chem | 2017-03-24T15:02:55.075Z|03596|rconn|WARN|br-tun<->tcp:127.0.0.1:6633: connection failed (Connection refused) | 15:03 |
chem | 2017-03-24T15:02:55.075Z|03597|rconn|WARN|br-int<->tcp:127.0.0.1:6633: connection failed (Connection refused) | 15:03 |
chem | 2017-03-24T15:02:55.075Z|03598|rconn|WARN|br-ex<->tcp:127.0.0.1:6633: connection failed (Connection refused) | 15:03 |
chem | 2017-03-24T15:02:55.075Z|03599|rconn|WARN|br-infra<->tcp:127.0.0.1:6633: connection failed (Connection refused) | 15:03 |
jrist | has a +2 and 3 +1s | 15:03 |
chem | owalsh: ^ | 15:03 |
chem | mcornea: ^ | 15:03 |
mcornea | chem: yeah, I think it's my workaround, I shouldn't apply it on computes | 15:04 |
*** gbarros has quit IRC | 15:04 | |
*** gbarros has joined #tripleo | 15:05 | |
d0ugal | jaosorior: https://bugs.launchpad.net/tripleo/+bug/1675709 | 15:07 |
openstack | Launchpad bug 1675709 in tripleo "deploy succeeded but no overcloudrc was generated" [High,Triaged] | 15:07 |
jaosorior | d0ugal: I added the alert tag | 15:07 |
d0ugal | jaosorior: cool, good idea. | 15:08 |
chem | bandini: about the ipv6 firewall stuff, do you think it's at the right step and place ? | 15:08 |
d0ugal | jaosorior: I am never sure when to use that. | 15:08 |
jaosorior | d0ugal: I just thought the bug was annoying enough to merit that :P | 15:09 |
bandini | chem: good question. let me relook at the steps | 15:09 |
d0ugal | jaosorior: Yup, that's for sure - more eyes may help too | 15:09 |
chem | bandini: I mean it doesn't strike you as completely idiotic :) | 15:09 |
*** morazi has quit IRC | 15:09 | |
jaosorior | d0ugal: just popped again here http://logs.openstack.org/58/449558/1/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/2ff609a/logs/undercloud/home/jenkins/overcloud_validate.log.txt.gz | 15:10 |
jaosorior | * popped up | 15:10 |
*** ooolpbot has joined #tripleo | 15:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 15:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 15:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 15:10 |
*** ooolpbot has quit IRC | 15:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 15:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 15:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 15:10 |
bandini | chem: with my small brain, just about everything not done by me strikes me as awesome and amazing ;) | 15:10 |
d0ugal | jaosorior: http://logs.openstack.org/83/449583/2/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/ed030c5/logs/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2017-03-24_15_02_12_684262 | 15:10 |
d0ugal | jaosorior: and go down a bit to the traceback! | 15:10 |
d0ugal | jaosorior: despite the fact that I re-raise it, it is still a status_code 0 | 15:11 |
openstackgerrit | Rob Crittenden proposed openstack/tripleo-docs master: Small fixups for the TLS everywhere documentation https://review.openstack.org/449672 | 15:11 |
jaosorior | duuuuude | 15:11 |
jaosorior | wow | 15:12 |
jaosorior | what the hell | 15:12 |
chem | bandini:haha | 15:12 |
d0ugal | jaosorior: Maybe RuntimeError is a bad choice? https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/exceptions.py#L35 | 15:12 |
*** jprovazn has quit IRC | 15:12 | |
d0ugal | apetrich: ^ | 15:13 |
jaosorior | d0ugal: why would it be skipping RuntimeErrors? | 15:13 |
d0ugal | jaosorior: No idea. | 15:13 |
jaosorior | well, lets try switching it to Exception then | 15:13 |
d0ugal | jaosorior: Maybe the thing that checks the status_code is broken? | 15:13 |
d0ugal | I don't know where/how that happens | 15:14 |
apetrich | d0ugal, oooohhhh | 15:14 |
trown | dobson: jaosorior I see the problem | 15:14 |
trown | d0ugal: rather | 15:14 |
trown | unping dobson | 15:14 |
trown | false && status_code=0 || status_code=$? | 15:14 |
trown | that statement gives exit code of 0 | 15:14 |
d0ugal | trown: oh, lol | 15:15 |
trown | wait... that is what we want there though... because we need to not fail ansible at that spot | 15:15 |
trown | the statement above does set status_code to 1 | 15:15 |
d0ugal | trown: oh, so the status_code=0 doesn't mean the command returned 0? | 15:16 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: DO NOT MERGE: Are RuntimeErrors handled badly? https://review.openstack.org/449676 | 15:16 |
trown | d0ugal: well ya if status_code=0 after that it means the command returned 0 exit code | 15:17 |
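The behaviour trown describes can be reproduced in isolation. The sketch below is an illustration only, not the code from tripleo-quickstart; `capture_status` is a hypothetical helper name. It shows that the compound statement always exits 0 (so a caller such as Ansible does not fail at that point), while `status_code` still records the real exit code of the command.

```shell
#!/bin/sh
# capture_status is a hypothetical helper for illustration only.
capture_status() {
    "$@" && status_code=0 || status_code=$?
    # $? is now 0 whether or not the command failed, because the last
    # thing executed in the list was a successful variable assignment.
    compound_exit=$?
    echo "status_code=$status_code compound_exit=$compound_exit"
}

capture_status true            # status_code=0 compound_exit=0
capture_status false           # status_code=1 compound_exit=0
capture_status sh -c 'exit 3'  # status_code=3 compound_exit=0
```

So a logged `status_code=0` does mean the wrapped command itself returned 0; the idiom hides the failure from the shell's own exit status, not from the variable.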
*** gaurangt has quit IRC | 15:19 | |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates master: Add upgrade tasks for gnocchi container services https://review.openstack.org/445627 | 15:19 |
bandini | chem: looks good to me, I added a comment | 15:21 |
beekneemech | tripleo.org is back up | 15:21 |
jistr | weshay, matbu: so i've just had a full successful e2e upgrade from non-containerized master to containerized master with this https://review.openstack.org/#/c/448576/ | 15:22 |
jistr | weshay, matbu: however, i'd like to ask you for some help, if you have bandwidth for it (e.g. next week or so) with actually getting CI to run it | 15:22 |
*** morazi has joined #tripleo | 15:22 | |
openstackgerrit | Numan Siddique proposed openstack/puppet-tripleo master: Pacemaker support for OVN DB servers https://review.openstack.org/372274 | 15:23 |
jistr | there seem to be multiple ways to run CI jobs, from my understanding of reading the current toci repo | 15:23 |
jistr | and i don't mean just tripleo.sh vs OOOQ. IIUC there's actually multiple different approaches to trigger jobs via OOOQ | 15:23 |
*** gkadam has joined #tripleo | 15:24 | |
jistr | so i would definitely appreciate some assistance in figuring this out (even if i figured it out, i may not figure out a way which is in sync with what's the intended direction of TOCI in following weeks/months) | 15:25 |
jaosorior | Alright, I'm off. Have a good weekend everyone! | 15:25 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-docs master: Basic structure of TripleO Deployment Guide https://review.openstack.org/449684 | 15:26 |
jistr | after we've got CI to actually execute this, we can figure out how to make it do ocata->master rather than master->master, and perhaps add converge etc. | 15:26 |
*** udesale has quit IRC | 15:27 | |
jistr | though on my machine which didn't do anything else, it took 2.5 hours to run the whole thing (without converge or pingtest) | 15:27 |
jistr | so it might be quite tight w/r/t job runtime | 15:28 |
jistr | weshay, matbu ^ | 15:28 |
shardy | jistr: I think we'll have to optimize this before it'll run in CI | 15:28 |
shardy | jistr: when I was testing local composable upgrades the CI runtimes were considerably longer | 15:28 |
shardy | so I suspect this will be the same/similar | 15:28 |
jistr | shardy: right, most likely. But i'd quite like the next step to be the CI at least attempting to execute what we have. It's quite difficult to judge how far we're from that. | 15:29 |
shardy | that said we can still be getting a WIP/experimental job to quantify that :) | 15:29 |
jistr | e.g. multinode OOOQ has its own playbook that cannot do upgrade so far | 15:29 |
shardy | jistr: ack, yeah sounds good | 15:29 |
jistr | https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/quickstart/multinode-playbook.yml | 15:29 |
jistr | so i expect there's quite some work ahead to at least get the CI to red :D | 15:29 |
jistr | and then we can figure out how to make it green :)) | 15:29 |
EmilienM | adarazs: https://review.openstack.org/#/c/448511/ | 15:29 |
shardy | lol | 15:29 |
shardy | probably true :) | 15:29 |
EmilienM | adarazs: Sagi is not here today, can you update it please? so we can merge it today. | 15:29 |
adarazs | EmilienM: okay. | 15:30 |
*** jaosorior has quit IRC | 15:30 | |
jistr | flaper87, mandre: ^^ fyi re upgrades CI status | 15:31 |
openstackgerrit | Marius Cornea proposed openstack/tripleo-heat-templates master: Stop openstack-nova-compute during nova-ironic upgrade https://review.openstack.org/449596 | 15:31 |
openstackgerrit | Pradeep Kilambi proposed openstack/tripleo-heat-templates master: Add ceilometer ipmi agent https://review.openstack.org/430447 | 15:32 |
trozet | mwhahaha: when you have a minute, can you please review: https://review.openstack.org/#/c/448827/ | 15:32 |
EmilienM | adarazs: on #openstack-infra, pabelanger is helping to promote some tripleo gate jobs in zuul | 15:32 |
*** athomas has quit IRC | 15:33 | |
EmilienM | adarazs: and he asked me for the patches we want to merge ASAP. Please look if I missed some | 15:33 |
flaper87 | jistr: thanks | 15:33 |
EmilienM | adarazs: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/latest.log.html (look at the end) | 15:34 |
adarazs | EmilienM: looking | 15:34 |
*** dparkes has quit IRC | 15:34 | |
*** jkilpatr has quit IRC | 15:35 | |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: WIP: SSH known_hosts config https://review.openstack.org/449660 | 15:36 |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-docs master: Basic structure of TripleO Deployment Guide https://review.openstack.org/449684 | 15:36 |
*** aufi has quit IRC | 15:41 | |
adarazs | EmilienM: I think those should be enough, and of course this repo one I just -1'd :) | 15:41 |
EmilienM | adarazs: yeah please update it so we can approve it and land it today | 15:42 |
adarazs | damn, that reponame is already used in there before... | 15:42 |
*** prateek has quit IRC | 15:42 | |
openstackgerrit | yolanda.robla proposed openstack/tripleo-common master: WIP: Add creation of security hardened images https://review.openstack.org/448528 | 15:44 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: DO NOT MERGE: Are RuntimeErrors handled badly? https://review.openstack.org/449676 | 15:45 |
EmilienM | adarazs, trown : I'm talking with Paul Belanger now about our situation and the fact patches don't merge | 15:45 |
EmilienM | I'm proposing to force-merge some patches without waiting for zuul | 15:46 |
EmilienM | that would be outstanding | 15:46 |
d0ugal | sounds dangerous :) | 15:46 |
EmilienM | I know | 15:46 |
d0ugal | but obviously useful | 15:46 |
EmilienM | but jobs are failing a lot | 15:46 |
adarazs | it's non working gates vs. probably non working gates, so I think we should take our chances :P | 15:46 |
openstackgerrit | yolanda.robla proposed openstack/tripleo-image-elements master: Add overcloud-secure-block-device element https://review.openstack.org/449122 | 15:46 |
EmilienM | gimme a list of patches you want to merge and I'll see what I can do | 15:47 |
d0ugal | adarazs: lol, I like our chances! | 15:47 |
*** social has quit IRC | 15:48 | |
EmilienM | trown, adarazs: please help me to set gerrit topic to tripleo/outstanding for the patches we want to land today | 15:48 |
owalsh | chem: ack | 15:48 |
*** yamahata has joined #tripleo | 15:48 | |
*** flepied has quit IRC | 15:49 | |
adarazs | EmilienM: you know if we're so much in crunch mode we could submit https://review.openstack.org/448511 as is, I was even unsure if I should -1 for these. | 15:49 |
adarazs | I'll give it a +2 and +w and see what happens. | 15:49 |
dmsimard | EmilienM: the patches in the buildlogs bug are probably worth looking at ( current repo issue and https ) | 15:50 |
dmsimard | They're contributing to the gate instability situation. | 15:50 |
EmilienM | dmsimard: yes | 15:51 |
EmilienM | can someone review https://review.openstack.org/#/c/448041/2 please ? | 15:51 |
EmilienM | at least +2, so we can approve later | 15:51 |
*** d0ugal has quit IRC | 15:52 | |
*** amoralej|lunch is now known as amoralej | 15:52 | |
* adarazs still can't +2 there. | 15:53 | |
*** gbarros has quit IRC | 15:55 | |
*** gbarros has joined #tripleo | 15:55 | |
jrist | trying again: can we get a +A? https://review.openstack.org/#/c/414169/ - EmilienM? | 15:56 |
EmilienM | jrist: bad time now | 15:56 |
jrist | sorry :( | 15:57 |
EmilienM | I'm trying to get CI back | 15:57 |
jrist | EmilienM: I understand. anything I can do to help? | 15:57 |
EmilienM | jrist: and we have plenty of reviewers here, please don't ping me | 15:57 |
EmilienM | jrist: don't ping me for reviews | 15:57 |
jrist | EmilienM: noted. sorry | 15:57 |
EmilienM | that will help me to focus | 15:57 |
shardy | jrist: I'll take a look, I've reviewed that one before | 15:57 |
jrist | shardy: thanks yeah you already +1'd | 15:58 |
openstackgerrit | Attila Darazs proposed openstack/tripleo-quickstart-extras master: GATE CHECK https://review.openstack.org/430277 | 16:00 |
*** pcaruana has quit IRC | 16:03 | |
*** udesale has joined #tripleo | 16:04 | |
EmilienM | I would gently ask folks to not recheck patches from now | 16:07 |
EmilienM | pabelanger and I are working on merging critical patches in tripleo that would help to have a more stable CI | 16:07 |
EmilienM | the queue in zuul is huge | 16:08 |
jrist | EmilienM: noted. thanks | 16:08 |
jrist | http://img2.wikia.nocookie.net/__cb20140425031646/villains/images/f/fe/Vlcsnap-2014-04-24-23h13m48s25.png | 16:08 |
EmilienM | that's /me now :D | 16:09 |
*** udesale has quit IRC | 16:09 | |
*** pabelanger has joined #tripleo | 16:09 | |
*** udesale has joined #tripleo | 16:09 | |
*** ooolpbot has joined #tripleo | 16:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 16:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 16:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 16:10 |
*** ooolpbot has quit IRC | 16:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 16:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 16:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 16:10 |
*** trown is now known as trown|lunch | 16:10 | |
*** d0ugal has joined #tripleo | 16:14 | |
*** Goneri has quit IRC | 16:15 | |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: WIP: SSH known_hosts config https://review.openstack.org/449660 | 16:16 |
openstackgerrit | yolanda.robla proposed openstack/tripleo-common master: Add creation of security hardened images https://review.openstack.org/448528 | 16:17 |
*** thrash is now known as thrash|f00dz | 16:18 | |
*** sshnaidm|off has joined #tripleo | 16:19 | |
*** skramaja has joined #tripleo | 16:19 | |
EmilienM | I had to -2 some patches to kill them from zuul and free some resources | 16:19 |
EmilienM | I'll remove -2 before EOD | 16:20 |
*** d0ugal has quit IRC | 16:22 | |
akrivoka | shardy: many thanks | 16:23 |
*** jzimnowoda has quit IRC | 16:26 | |
*** jzim has quit IRC | 16:27 | |
* adarazs off, have a nice weekend folks! | 16:29 | |
openstackgerrit | Luke Hinds proposed openstack/tripleo-heat-templates master: Adds service for managing securetty https://review.openstack.org/449153 | 16:32 |
*** rlandy is now known as rlandy|brb | 16:32 | |
openstackgerrit | Luke Hinds proposed openstack/puppet-tripleo master: Adds service for managing securetty https://review.openstack.org/449148 | 16:33 |
shadower | UI people: how do you set up your undercloud these days? Still a custom script or is it quickstart now? | 16:33 |
* shadower has a 4 month-old setup that's beyond broken | 16:34 | |
shadower | jrist, florianf ^ | 16:34 |
jrist | hehe | 16:34 |
jrist | triple OHHHHHH q | 16:35 |
jrist | is the recommended way | 16:35 |
jrist | oooq | 16:35 |
shadower | oooqay | 16:35 |
jrist | :) | 16:35 |
mwhahaha | lots of crying and hoping it works | 16:35 |
mwhahaha | that's how i deploy | 16:35 |
jrist | hahaha | 16:35 |
shardy | lol :) | 16:35 |
jrist | mwhahaha: in the instructions it specifically said weeping | 16:36 |
jrist | preferably in the corner in a fetal position | 16:36 |
mwhahaha | yes it is the most comfortable way | 16:36 |
*** bogdando has quit IRC | 16:36 | |
openstackgerrit | Marios Andreou proposed openstack/tripleo-heat-templates master: WIP - Add upgrade_batch_tasks to neutron-l3-agent https://review.openstack.org/445494 | 16:37 |
deadnull | im running into a validation error on ocata deploy, maybe someone can point me in the right direction... "unexpected keyword argument 'user_domain_id'" - https://gist.github.com/rvalente/e1ecde794b393b6d3f90d97a13aac4cb | 16:38 |
florianf | shadower: last time I still used instack-virt-setup because oooq didn't work for me immediately and I didn't have the time to look into it | 16:39 |
shadower | hm okay | 16:39 |
shadower | thanks | 16:39 |
openstackgerrit | Adam Harwell proposed openstack/diskimage-builder master: Use DIB_PYTHON_VERSION to run commands https://review.openstack.org/449721 | 16:39 |
openstackgerrit | Alex Schultz proposed openstack/tripleo-heat-templates master: Run update after RHEL registration https://review.openstack.org/449724 | 16:42 |
*** thrash|f00dz is now known as thrash | 16:44 | |
*** fragatina has quit IRC | 16:46 | |
*** jlinkes has quit IRC | 16:47 | |
*** salmankhan has quit IRC | 16:48 | |
chem | good weekend people | 16:52 |
*** rwsu has quit IRC | 16:57 | |
*** yamahata has quit IRC | 17:00 | |
*** rwsu has joined #tripleo | 17:00 | |
openstackgerrit | Athlan-Guyot sofer proposed openstack/tripleo-heat-templates master: WIP: N->O upgrade, blanks ipv6 rules before activating it. https://review.openstack.org/449613 | 17:02 |
openstackgerrit | yolanda.robla proposed openstack/diskimage-builder master: WIP: Add lvm management to diskimage-builder https://review.openstack.org/444403 | 17:03 |
*** udesale has quit IRC | 17:06 | |
weshay | jistr, nice man.. yes I think we can get you help :) | 17:10 |
*** ooolpbot has joined #tripleo | 17:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 17:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 17:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 17:10 |
*** ooolpbot has quit IRC | 17:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 17:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 17:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 17:10 |
*** rlandy|brb is now known as rlandy | 17:11 | |
jistr | weshay: great, thanks :) | 17:11 |
*** lmiccini has quit IRC | 17:12 | |
weshay | rlandy, my local deploy w/ the script worked.. full deploy + validate | 17:13 |
weshay | jistr, I would have thought that needed much more code | 17:13 |
*** zoli is now known as zoli|gone | 17:13 | |
weshay | impressvie | 17:13 |
weshay | impressive | 17:13 |
jistr | well it's just master->master, no converge, no pingtest | 17:14 |
jistr | just the most crucial stuff | 17:14 |
*** cylopez has quit IRC | 17:14 | |
jistr | a lot of logic was already in place for reuse | 17:14 |
jistr | like prep-containers, and the upgrade skeleton | 17:15 |
jistr | mainly i'm concerned about how will we get CI to run the upgrade on multinode, which will be new | 17:15 |
morazi | jistr, awesome! | 17:15 |
*** mcornea has quit IRC | 17:15 | |
jistr | and about the run time in general | 17:15 |
openstackgerrit | Dougal Matthews proposed openstack/python-tripleoclient master: DO NOT MERGE: Are RuntimeErrors handled badly https://review.openstack.org/449676 | 17:15 |
morazi | jistr, (not about the concern, but about getting it all to work) | 17:16 |
jistr | haha yea i understood | 17:16 |
jistr | thanks :) | 17:16 |
*** d0ugal has joined #tripleo | 17:18 | |
weshay | rlandy, I'd like to update the internal ovb osp gate to also take changes to quickstart | 17:18 |
rlandy | weshay: ok - sure | 17:21 |
jrist | shadower: any luck? | 17:22 |
rlandy | weshay; we run very little from quickstart there though | 17:22 |
jrist | I'm genuinely curious if that is the best way | 17:22 |
*** dtantsur is now known as dtantsur|pto | 17:22 | |
weshay | rlandy, I'm going to only trigger w/ changes to the ovb scripts | 17:22 |
shadower | jrist: just started quickstart and keeping the fingers crossed | 17:22 |
jrist | shadower: nice let me know how that goes :) | 17:23 |
rlandy | ok | 17:23 |
*** suuuper has quit IRC | 17:24 | |
fultonj | dprince: would you mind dropping a comment into https://review.openstack.org/#/c/387631 ? | 17:25 |
fultonj | dprince: i know we had talked about it tuesday but if you are unable to comment on it, then can you recommend someone else from the containers squad who might be a good person to review it? thanks. | 17:26 |
dprince | fultonj: sure, I've got it up | 17:26 |
fultonj | thanks | 17:26 |
dprince | fultonj: will comment on the spec too | 17:26 |
dprince | fultonj: the SPEC has been on my todo. Sorry for being slow | 17:27 |
fultonj | dprince: no problem, i'm glad it's on your list. thanks | 17:27 |
*** flepied has joined #tripleo | 17:28 | |
openstackgerrit | Luke Hinds proposed openstack/puppet-tripleo master: Adds service for managing securetty https://review.openstack.org/449148 | 17:29 |
*** jmelvin has quit IRC | 17:33 | |
weshay | rlandy, did the port quotas get lifted? | 17:34 |
*** trown|lunch is now known as trown | 17:34 | |
bkero | trown, adarazs: Could I get some eyes on https://review.openstack.org/#/c/446934/ and https://review.openstack.org/#/c/447142/ ? | 17:36 |
*** yamahata has joined #tripleo | 17:36 | |
weshay | trown, re: ovb I thought I'd try to reuse the full-deploy-ovb.sh script for something that devs would use. It's a little limited due to backwards compat until everything is updated and merged. https://review.openstack.org/#/c/449370/3/ci-scripts/full-deploy-ovb.sh | 17:37 |
trown | bkero we need some patch to tht stable/ocata to test https://review.openstack.org/447142 actually works | 17:38 |
trown | bkero: so a dummy tht patch that depends on ^ | 17:38 |
bkero | trown: https://review.openstack.org/#/c/447714/ | 17:38 |
bkero | There's newton | 17:38 |
bkero | trown: https://review.openstack.org/#/c/446865/ | 17:38 |
bkero | There's ocata | 17:38 |
trown | bkero: sweet | 17:38 |
*** psahoo_ has joined #tripleo | 17:39 | |
trown | thanks | 17:39 |
*** psahoo|away has quit IRC | 17:39 | |
*** derekh has quit IRC | 17:40 | |
openstackgerrit | Ben Kero proposed openstack/tripleo-quickstart-extras master: Refactor the toci-vxlan-networking script to be more correct https://review.openstack.org/446934 | 17:40 |
*** milan has quit IRC | 17:41 | |
bkero | ^ That's just changing the commit message | 17:41 |
trown | bkero: reviewing that now, sorry it has taken me so long | 17:43 |
bkero | No worries | 17:43 |
*** lucasagomes is now known as lucas-afk | 17:44 | |
*** akrivoka has quit IRC | 17:46 | |
pradk | EmilienM, mwhahaha, this is passing ci https://review.openstack.org/#/c/448762/ .. can we get it in please | 17:47 |
*** dparkes has joined #tripleo | 17:48 | |
*** tesseract has quit IRC | 17:50 | |
*** psahoo_ has quit IRC | 17:50 | |
EmilienM | pradk: I'll approve it once our zuul queue is low again | 17:51 |
EmilienM | we're still trying to land CI patches | 17:51 |
trown | cascading failures | 17:53 |
trown | that's the name of my new harsh noise band | 17:53 |
pradk | EmilienM, sure, we just need this soon to unblock 11 testing .. if we can get this merged and into ocata by this weekend i'm cool | 17:54 |
EmilienM | pradk: no worries, it will merge today | 17:55 |
*** jpena is now known as jpena|off | 17:55 | |
trown | that seems a bit optimistic :P | 17:55 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Only force http for repos that allow it https://review.openstack.org/448187 | 17:57 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Exit with error code when overcloud wasn't deployed https://review.openstack.org/448541 | 17:58 |
EmilienM | woot | 17:58 |
EmilienM | 2 down | 17:58 |
*** jcoufal has joined #tripleo | 17:58 | |
trown | finally | 17:59 |
trown | though I am not sure either of those fixes the root issue with DLRN | 17:59 |
dmsimard | one at a time, there's a couple outstanding issues | 18:00 |
*** ckyriakidou has quit IRC | 18:00 | |
*** skramaja has quit IRC | 18:00 | |
*** jcoufal__ has joined #tripleo | 18:00 | |
*** jcoufal_ has quit IRC | 18:01 | |
openstackgerrit | Ben Kero proposed openstack/tripleo-quickstart-extras master: Refactor the toci-vxlan-networking script to be more correct https://review.openstack.org/446934 | 18:02 |
bkero | trown: dealt with all your referenced issues :) | 18:02 |
trown | bkero: awesome thanks | 18:02 |
pabelanger | ya, looks like things are catching up | 18:02 |
pabelanger | at least on gate pipeline | 18:02 |
bkero | It's worth noting that ovb already had a default variable called 'mtu' | 18:02 |
bkero | So we could simply use that, or use vxlan_mtu | 18:03 |
*** jcoufal has quit IRC | 18:03 | |
EmilienM | beekneemech: pabelanger told me rh1 doesn't have capacity to handle gate-tripleo-ci-centos-7-ovb-containers-oooq-nv in check pipeline anymore. Thoughts? | 18:03 |
pabelanger | The check-tripleo pipeline is another story. You likely want to remove gate-tripleo-ci-centos-7-ovb-containers-oooq-nv from the tripleo-check pipeline. tripleo-test-cloud-rh1 doesn't have the capacity to keep up with demand now | 18:03 |
beekneemech | EmilienM: I said that at the PTG and people went ahead and added the job anyway. | 18:04 |
pabelanger | some quick math shows me, anything about 55 items in the pipeline, will take 24 hours to drain | 18:04 |
beekneemech | Yeah, we've been 100% utilized all week, and were the same for the last 3 days of last week. | 18:04 |
beekneemech | experimental jobs haven't run in like 3 days | 18:05 |
EmilienM | dprince: ^ any thoughts on this? | 18:05 |
pabelanger | beekneemech: right, need to scale back the jobs to help drain the pipelines | 18:06 |
dprince | EmilienM: if our aim is to deliver containers in Pike I think we pick another job perhaps | 18:06 |
*** sshnaidm|off has quit IRC | 18:06 | |
*** axisys has quit IRC | 18:06 | |
EmilienM | dprince: multinode probably | 18:07 |
dprince | EmilienM: and also, make the containers overcloud job run faster. Is tempest still running there? | 18:07 |
EmilienM | dprince: not afaik | 18:07 |
dprince | EmilienM: okay, mandre must have disabled it. | 18:07 |
pabelanger | do you need ovb for the container stuff? | 18:07 |
pabelanger | or can that be on public clouds | 18:07 |
dprince | pabelanger: ideally we would be testing it end to end | 18:08 |
dprince | pabelanger: somewhere... | 18:08 |
dprince | we should cut a job | 18:08 |
dprince | pabelanger: I support you in this | 18:08 |
dprince | EmilienM: unclear to me if containers is the job to cut. Perhaps we cut something else? Or make it periodic | 18:09 |
dprince | EmilienM: or make something else that isn't being worked on everyday a periodic job | 18:09 |
*** ooolpbot has joined #tripleo | 18:10 | |
ooolpbot | URGENT TRIPLEO TASKS NEED ATTENTION | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674681 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674770 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1674955 | 18:10 |
ooolpbot | https://bugs.launchpad.net/tripleo/+bug/1675174 | 18:10 |
openstack | Launchpad bug 1674681 in tripleo "buildlogs.centos.org CDN issues" [Critical,Triaged] | 18:10 |
*** ooolpbot has quit IRC | 18:10 | |
openstack | Launchpad bug 1674770 in tripleo "Update timeout too long in CI" [Critical,Triaged] | 18:10 |
openstack | Launchpad bug 1674955 in tripleo "oooq now believes that overcloud deploy succeeded even though it failed" [Critical,Triaged] - Assigned to Sagi (Sergey) Shnaidman (sshnaidm) | 18:10 |
openstack | Launchpad bug 1675174 in tripleo "Timeout passed to overcloud deploy not effective" [Critical,Triaged] | 18:10 |
dprince | beekneemech, EmilienM if jobs are pegged we have to cut something. We should do this quickly and start an email thread about what to keep with our given resources | 18:10 |
pabelanger | Ya, I think you are at the point to make the tough call on what to scale back on, until you get more OVB resources | 18:12 |
bkero | trown: ok, retriggered all the test jobs. Depending on how nodepool is feeling today we'll see if the last change didn't break anything in about 1.5hr. | 18:13 |
*** gkadam has quit IRC | 18:15 | |
EmilienM | beekneemech, dprince: could we move non-ha features into ha? | 18:16 |
EmilienM | so we keep ovb-ha, ovb-updates and ovb-containers | 18:16 |
dprince | EmilienM: fine by me | 18:16 |
*** jkilpatr has joined #tripleo | 18:16 | |
beekneemech | EmilienM: Some, but a big part of the reason for three jobs was testing all three major net archs - sans net-iso, ipv4 net-iso, and ipv6 net-iso. | 18:17 |
*** eck` is now known as eck`gone | 18:17 | |
EmilienM | beekneemech: the "sans net-iso" might be skipped? | 18:17 |
*** d0ugal has quit IRC | 18:17 | |
dsneddon | chem, How's the stuff going with the IPv6 FixedIPs? I haven't tested the external loadbalancer/IPv6 combo, but I think you are on the right track going from the *Vip to the *FixedIPs parameters. | 18:18 |
beekneemech | EmilienM: It's what developers are most likely to use, so if it breaks a) it blocks a bunch of people and b) it probably gets fixed quickly. So that might be an acceptable compromise. | 18:19 |
EmilienM | beekneemech: I would give it a try until it breaks us next time and see how we can do | 18:19 |
beekneemech | The other problem we have is that we can't switch to containers until containers is passing consistently. | 18:19 |
beekneemech | And preferably not using the full 2:45 every run. | 18:20 |
beekneemech | Trading jobs is still going to hurt our capacity if we swap a 100 minute nonha job for a 160 minute containers job. | 18:21 |
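A back-of-the-envelope sketch of the capacity point above. It assumes a fixed pool of OVB environments that each run one job at a time, back to back. The pool size is a made-up number for illustration; only the 100 and 160 minute durations come from the discussion.

```python
def runs_per_day(job_minutes: int, environments: int) -> float:
    """Jobs a pool can finish in 24 hours if every environment stays busy."""
    minutes_per_day = 24 * 60
    return environments * minutes_per_day / job_minutes


POOL = 10  # hypothetical number of concurrent OVB environments

nonha = runs_per_day(100, POOL)        # the 100 minute nonha job
containers = runs_per_day(160, POOL)   # the 160 minute containers job
ratio = containers / nonha             # share of the old throughput that remains

print(f"nonha: {nonha:.0f} runs/day, containers: {containers:.0f} runs/day")
print(f"throughput after the swap: {ratio:.1%} of before")
```

The ratio does not depend on the pool size, so the relative loss holds for whatever capacity rh1 actually has, as long as the pool is saturated.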
*** dtrainor has quit IRC | 18:21 | |
chem | dsneddon: not sure what you're referring to, I'm mostly interested in a working solution to https://bugzilla.redhat.com/show_bug.cgi?id=1430757 | 18:21 |
openstack | bugzilla.redhat.com bug 1430757 in rhosp-director "OSP10 -> OSP11 upgrade fails for deployments using external loadbalancer" [Urgent,Assigned] - Assigned to sathlang | 18:21 |
*** iranzo has quit IRC | 18:21 | |
chem | dsneddon: If you could have a look at what could be the best course of action that would be great | 18:22 |
chem | dsneddon: I must go now :) | 18:22 |
dsneddon | chem, I must have misunderstood the IRC log. I'll comment on this BZ. | 18:22 |
dsneddon | chem, What's in -e ~/openstack_deployment/environments/external-lb.yaml | 18:24 |
EmilienM | beekneemech: I agree | 18:24 |
openstackgerrit | Justin Kilpatrick proposed openstack/tripleo-quickstart-extras master: Fix introspection with retries edge case https://review.openstack.org/437946 | 18:24 |
dsneddon | chem, And why are you importing both that and environments/external-loadbalancer-vip.yaml? | 18:24 |
*** chlong has quit IRC | 18:24 | |
chem | dsneddon: I'm not :) it's a bug report I didn't do it. The patch shown inside the bz is kind of pretty clear, but if mcornea missed something just add it to the bz | 18:25 |
*** chem is now known as chem_off | 18:25 | |
dsneddon | chem_off, I found the environment files linked... thanks. | 18:26 |
*** jmelvin has joined #tripleo | 18:26 | |
EmilienM | dprince: could you help on this topic please? you're focused on containers. It would be great to transform the words ^ into a patch in tripleo-ci asap, our CI issues are super serious | 18:27 |
dprince | EmilienM: yes | 18:27 |
*** jcoufal has joined #tripleo | 18:29 | |
*** jcoufal__ has quit IRC | 18:31 | |
*** sshnaidm|off has joined #tripleo | 18:34 | |
openstackgerrit | Ben Kero proposed openstack/tripleo-quickstart-extras master: Refactor the toci-vxlan-networking script to be more correct https://review.openstack.org/446934 | 18:38 |
*** dtrainor has joined #tripleo | 18:38 | |
EmilienM | fyi all: I'm -2 all patches in gate, abandon & restore them, to clear the queue and let ci patches to land in priority | 18:40 |
*** tosky has quit IRC | 18:43 | |
dprince | EmilienM: ack, do it | 18:44 |
weshay | trown, I think we have validated rlandy's ovb changes | 18:45 |
weshay | should be good to go there | 18:45 |
trown | weshay: what do you mean? | 18:46 |
EmilienM | pabelanger: all done, the queue is low now | 18:46 |
weshay | trown, the two changes listed in https://review.openstack.org/#/c/449370/ | 18:46 |
*** dtrainor has quit IRC | 18:46 | |
weshay | lines 154 155 | 18:46 |
pabelanger | EmilienM: okay, so lets not enqueue anything more | 18:46 |
pabelanger | until we land the current series | 18:47 |
EmilienM | ok | 18:47 |
pabelanger | then, we can enqueue and promote again if needed | 18:47 |
EmilienM | pabelanger: what if one of the patches we need fails again? | 18:48 |
pabelanger | EmilienM: enqueue / promote | 18:48 |
EmilienM | pabelanger: what about asking fungi to force-merge them | 18:48 |
pabelanger | lets see | 18:49 |
pabelanger | if they still fail now, we have some other gate issue | 18:49 |
pabelanger | because all fixes should be in gate | 18:49 |
dprince | beekneemech: containers runs without network iso ATM so it should be okay I think right? | 18:49 |
dprince | beekneemech: it would cover that case still... | 18:49 |
EmilienM | dprince: if beekneemech is ok with that, one of the first actions is to see what ovb-nonha does and migrate it to ovb-ha (except the net-iso thing, where ovb-ha will keep current config) - and retire ovb-nonha | 18:50 |
dprince | EmilienM: ack | 18:50 |
*** jmelvin has quit IRC | 18:50 | |
beekneemech | Oh, we have another problem: http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/README.rst#n98 | 18:51 |
beekneemech | ha is the only job type that doesn't use ceph | 18:51 |
EmilienM | well, let's use ceph... | 18:51 |
*** jmelvin has joined #tripleo | 18:51 | |
EmilienM | or maybe keep updates with ceph | 18:52 |
beekneemech | Also, it's going to break our existing ssl certs. | 18:52 |
beekneemech | (which can be fixed, but something to be aware of) | 18:52 |
beekneemech | EmilienM: No, opposite problem. We'll _only_ be testing with ceph if we merge ha and nonha. | 18:52 |
trozet | EmilienM: whats going on with my patch? | 18:52 |
*** dtrainor has joined #tripleo | 18:52 | |
beekneemech | If containers doesn't use ceph then that's probably okay though. | 18:52 |
EmilienM | trozet: please read the backlog here... | 18:53 |
EmilienM | trozet: "+EmilienM | fyi all: I'm -2 all patches in gate, abandon & restore them, to clear the queue and let ci patches to land in priority" | 18:53 |
* beekneemech badly needs lunch | 18:53 | |
trozet | EmilienM: oh ok | 18:53 |
*** jmelvin has quit IRC | 18:56 | |
*** jmelvin has joined #tripleo | 18:56 | |
fungi | EmilienM: fwiw it doesn't need to be my call to bypass ci votes. any of our infra-root gerrit admins have that ability. however if we bypass testing to merge something which makes your jobs even more broken, that's not a great situation so one we tend to be very cautious about | 18:57 |
*** jcoufal_ has joined #tripleo | 18:57 | |
*** hexo has quit IRC | 18:57 | |
pabelanger | EmilienM: fungi: Ya, that is the scenario I am trying to avoid honestly | 18:58 |
*** hexo has joined #tripleo | 18:58 | |
pabelanger | I think promote / enqueue is a good option currently | 18:58 |
openstackgerrit | wes hayutin proposed openstack/tripleo-quickstart master: WIP, make script work in CI and interactive https://review.openstack.org/449370 | 18:58 |
EmilienM | fungi: I agree with you. Though the situation is outstanding for us atm | 18:58 |
*** jcoufal has quit IRC | 18:58 | |
*** sshnaidm|off has quit IRC | 18:59 | |
fungi | the idea with pre-merge gating is that it should hopefully prevent you from merging changes which completely break your jobs, so if we bypass the gate we lose that assurance | 18:59 |
fungi | and increase the chances that we'll need to force another change through to fix the previous one | 19:00 |
pabelanger | Ya, I've heard too many horror stories of force merges gone bad | 19:00 |
EmilienM | fungi: the patches we want to land already passed check jobs (and failed randomly in gate) | 19:00 |
pabelanger | EmilienM: the issue however is, none of the other patches in check have tested against that patch. | 19:01 |
pabelanger | thats why promote / enqueue in gate is powerful | 19:01 |
fungi | or depends-on | 19:01 |
pabelanger | it is a dependent pipeline, so every patch behind it gets said patch | 19:01 |
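The promote / enqueue point can be illustrated with a toy model of a dependent queue. This is a simplified sketch, not Zuul's implementation: the function names are hypothetical, and the change numbers are just the ones mentioned in the channel.

```python
def speculative_states(gate_queue):
    """For each change, list the changes ahead of it that it is tested on top of."""
    states = {}
    for position, change in enumerate(gate_queue):
        states[change] = list(gate_queue[:position])
    return states


def promote(gate_queue, change):
    """Move one change to the head of the queue, keeping the others in order."""
    return [change] + [c for c in gate_queue if c != change]


queue = ["448041", "449023", "448512"]
queue = promote(queue, "448512")

for change, ahead in speculative_states(queue).items():
    print(change, "is tested on top of", ahead or "the branch tip")
```

In this model a promoted fix is included in the test run of every change behind it, which is the property a force-merge skips: nothing else in the queue would have been tested together with the fix before it landed.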
EmilienM | this week was a "horror story" for tripleo CI; really, things can't get any worse | 19:02 |
*** pkovar has quit IRC | 19:02 | |
pabelanger | I only found out how bad things were today | 19:02 |
pabelanger | and was able to jump in and help | 19:02 |
pabelanger | right now, we are about 1h9min from the tripleo change queue for gate being empty | 19:03 |
pabelanger | lets see what fails | 19:03 |
EmilienM | pabelanger: http://tripleo.org/cistatus.html - a nice overview to see how red was our week | 19:04 |
*** jmelvin has quit IRC | 19:05 | |
*** eck`gone is now known as eck` | 19:15 | |
dmsimard | so far so good... http://i.imgur.com/iwMsRiB.png | 19:16 |
pabelanger | almost | 19:19 |
pabelanger | http://logs.openstack.org/41/448041/2/gate/gate-tripleo-ci-centos-7-undercloud-oooq/3f7a9c8/logs/undercloud/home/jenkins/install_packages.sh.log.txt.gz | 19:19 |
pabelanger | that just failed | 19:19 |
pabelanger | [Errno 14] curl#35 - "TCP connection reset by peer" | 19:19 |
EmilienM | https://trunk.rdoproject.org/centos7/current/puppet-mysql-3.10.1-0.20170324145036.b9a0a55.el7.centos.noarch.rpm: [Errno 14] curl#35 - "TCP connection reset by peer" | 19:20 |
EmilienM | dmsimard: ^ | 19:20 |
pabelanger | dmsimard: how large is dlrn current? | 19:20 |
pabelanger | GB wise | 19:20 |
dmsimard | pabelanger: not large, iirc it's like 250MB ? The problem is it's fast moving so hard to mirror | 19:21 |
dmsimard | let me look | 19:21 |
pabelanger | dmsimard: how fast would you say | 19:22 |
dmsimard | on a good day it could change 100 times a day ? | 19:22 |
pabelanger | is this part of the promotion stuff? | 19:22 |
dmsimard | /current/ is a symlink that changes when a new package is built | 19:22 |
pabelanger | k | 19:23 |
dmsimard | EmilienM: what is using /current/ anyway ? | 19:23 |
EmilienM | dmsimard: tripleo-ci | 19:23 |
dmsimard | EmilienM: there's nothing, nowhere, that should be using /current/ | 19:23 |
pabelanger | depending on the size, we could have more aggressive mirroring | 19:23 |
EmilienM | dmsimard: we use current for tripleo packages | 19:23 |
pabelanger | depends on what uses it | 19:23 |
EmilienM | (tht, puppet-tripleo, etc) | 19:23 |
pabelanger | EmilienM: how far back could that lag? | 19:24 |
dmsimard | EmilienM: why not consistent ? or the CDN repository for promoted packages ? | 19:24 |
*** hewbrocca_afk is now known as hewbrocca | 19:24 | |
openstackgerrit | Dan Prince proposed openstack-infra/tripleo-ci master: Combine HA and non-ha features https://review.openstack.org/449785 | 19:24 |
*** jkilpatr has quit IRC | 19:24 | |
dprince | EmilienM, beekneemech ^^ | 19:24 |
*** hewbrocca is now known as hewbrocca_afk | 19:24 | |
EmilienM | we don't want to lag on tripleo packages | 19:25 |
EmilienM | dprince: sounds good to me, I'll review it, thanks | 19:26 |
dmsimard | EmilienM: we have a whole process for promoting tripleo packages to a cdn repository for the purpose of the gate, I don't understand why current is used | 19:26 |
dprince | EmilienM: I will push a project-configs patch to remove nonha now too? | 19:26 |
dprince | EmilienM: or you are doing that? | 19:26 |
EmilienM | dmsimard: like I said, just for tht, instack, puppet-tripleo | 19:26 |
dmsimard | EmilienM: that failure came from puppet-mysql | 19:27 |
*** jmelvin has joined #tripleo | 19:27 | |
EmilienM | dprince: you can push for it. Some jobs in puppet ci are using ovb-nonha, we need to find an alternative | 19:27 |
dmsimard | so it's a dependency from puppet-tripleo I guess | 19:27 |
dprince | EmilienM: ovb-ha will work I think there? | 19:27 |
EmilienM | dmsimard: probably and it's not our intention to skip promoted repo on puppet-* | 19:28 |
EmilienM | dprince: probably, yes | 19:28 |
slagle | we use current, b/c we want the absolute latest tripleo packages and puppet modules | 19:28 |
slagle | since we are so heavily dependent on the puppet modules | 19:28 |
*** jcoufal_ has quit IRC | 19:28 | |
slagle | otherwise, we aren't testing tripleo patches against "latest". it would be against last promote | 19:28 |
pabelanger | dmsimard: this appears to be how it is setup: http://logs.openstack.org/41/448041/2/gate/gate-tripleo-ci-centos-7-undercloud-oooq/3f7a9c8/logs/undercloud/home/jenkins/repo_setup.sh.txt.gz | 19:29 |
*** jcoufal has joined #tripleo | 19:29 | |
EmilienM | slagle: yes that | 19:29 |
dmsimard | slagle: but I thought that was the whole point of the promotion process, to test things so it doesn't break the gate | 19:29 |
dmsimard | slagle: and the gate uses the promoted packages | 19:29 |
EmilienM | dmsimard: except tripleo projects | 19:29 |
slagle | dmsimard: no, that's not the point | 19:29 |
EmilienM | dmsimard: because tripleo projects are tested by tripleo jobs | 19:29 |
dmsimard | EmilienM: but they're rebuilt in-flight anyway if required/depends-on, right ? | 19:30 |
slagle | the promotion process is to validate an entire repo | 19:30 |
dmsimard | pabelanger: baseurl=https://trunk.rdoproject.org/centos7/current | 19:30 |
dmsimard | that's bad | 19:30 |
pabelanger | we should just mirror that to AFS | 19:30 |
dmsimard | pabelanger: there's a patch in the queue | 19:30 |
dmsimard | to fix it | 19:30 |
pabelanger | dmsimard: which? | 19:31 |
EmilienM | yes | 19:31 |
dmsimard | let me find it | 19:31 |
EmilienM | https://review.openstack.org/#/c/448512/ | 19:31 |
pabelanger | ya | 19:31 |
pabelanger | it is on list to promote | 19:31 |
dmsimard | pabelanger: https://review.openstack.org/#/c/448512/ | 19:31 |
pabelanger | once other 2 merge | 19:31 |
dmsimard | since /current/ changes every time a package is built, jobs can land on 404s or other errors since the symlink is updated | 19:32 |
EmilienM | that's one of the things we're fixing I think | 19:32 |
dmsimard | yeah | 19:33 |
pabelanger | dmsimard: we get around that issue in AFS by not deleting existing packages on disk, then mirroring new things. Only after 2 hours do the original packages get deleted | 19:33 |
dmsimard | just saying that might explain the yum error we just got | 19:33 |
pabelanger | in an effort not to break jobs like you say | 19:33 |
pabelanger | but, I am not sure how dlrn works when you combine things | 19:33 |
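A minimal sketch of the grace-period idea described for the AFS mirrors: new files are mirrored right away, but a file that disappeared upstream is only deleted once it has been missing for two hours. This is not the actual infra mirror script; the function name and the bookkeeping dictionary are assumptions made for illustration.

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=2)  # the delay mentioned in the discussion


def files_to_delete(mirror_files, upstream_files, first_missing, now):
    """Return mirrored files that have been absent upstream for at least GRACE.

    first_missing maps a filename to the time it was first seen missing
    upstream; it is updated in place on every sync run.
    """
    doomed = []
    for name in sorted(mirror_files):
        if name in upstream_files:
            # Still published upstream: forget any earlier "missing" mark.
            first_missing.pop(name, None)
            continue
        seen = first_missing.setdefault(name, now)
        if now - seen >= GRACE:
            doomed.append(name)
    return doomed


state = {}
start = datetime(2017, 3, 24, 19, 0)
mirror = {"old.rpm", "new.rpm"}
upstream = {"new.rpm"}

print(files_to_delete(mirror, upstream, state, start))
print(files_to_delete(mirror, upstream, state, start + GRACE))
```

A job that started just before a sync can still fetch the old package during the grace window, so it does not hit a 404 in the middle of a yum transaction.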
dmsimard | pabelanger: /current/ and /consistent/ were never meant to be used as baseurls for yum repositories | 19:34 |
dmsimard | people and systems must take the delorean hashed repository (i.e, the target of the symlink) and use that | 19:34 |
dmsimard | so that it doesn't change in flight | 19:34 |
dmsimard | so for example | 19:35 |
*** gbarros has quit IRC | 19:35 | |
dmsimard | https://trunk.rdoproject.org/centos7-master/current/delorean.repo <-- the base url: baseurl=https://trunk.rdoproject.org/centos7/77/e0/77e0621a8968d898877613ebabecda4f32ed353b_cd690faf | 19:35 |
dmsimard | https://trunk.rdoproject.org/centos7/77/e0/77e0621a8968d898877613ebabecda4f32ed353b_cd690faf will never change, it is persistent | 19:35 |
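A small sketch of the approach described above: read the `baseurl` out of the published `delorean.repo` and pin jobs to that hashed URL instead of the moving `/current/` symlink. The sample text stands in for a downloaded file. Only the `baseurl` value comes from the discussion; the `[delorean]` section name and the other keys are assumptions about the file layout.

```python
import configparser

# Stand-in for the contents of .../current/delorean.repo (layout assumed).
SAMPLE_REPO = """\
[delorean]
name=delorean
baseurl=https://trunk.rdoproject.org/centos7/77/e0/77e0621a8968d898877613ebabecda4f32ed353b_cd690faf
enabled=1
gpgcheck=0
"""


def pinned_baseurl(repo_text: str, section: str = "delorean") -> str:
    """Extract the hashed, immutable baseurl from a .repo file's text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(repo_text)
    return parser[section]["baseurl"]


url = pinned_baseurl(SAMPLE_REPO)
print(url)
```

Resolving the URL once at the start of a job and reusing it for every later yum call keeps the package set stable even if `/current/` is repointed while the job runs.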
dprince | EmilienM: also this https://review.openstack.org/#/c/449791/ | 19:35 |
dprince | EmilienM: that one is going to be tedious to review :/ | 19:36 |
pabelanger | dmsimard: why do you have /current then? | 19:36 |
*** jmelvin has quit IRC | 19:36 | |
*** gbarros has joined #tripleo | 19:36 | |
EmilienM | dprince: sounds good | 19:37 |
dmsimard | pabelanger: people are supposed to use /current/delorean.repo, not craft their own .repo file that points to /current/ | 19:37 |
EmilienM | dprince: bookmarked, will review asap | 19:37 |
dprince | EmilienM: feel free to push on them too. I don't mind, but I will check back to fix this ASAP | 19:37 |
openstackgerrit | Merged openstack/tripleo-quickstart master: Switch trunk/cbs/buildlogs to use https https://review.openstack.org/448036 | 19:37 |
pabelanger | dmsimard: seems like an easy fix, don't drop packages into /current :) | 19:37 |
dmsimard | ¯\_(ツ)_/¯ | 19:38 |
EmilienM | dprince: ack | 19:38 |
dmsimard | There's a .repo file in there people should be using | 19:38 |
EmilienM | pabelanger: https://review.openstack.org/#/c/448512/ is in merge conflict, I need to rebase and push again | 19:40 |
*** gbarros has quit IRC | 19:41 | |
openstackgerrit | Emilien Macchi proposed openstack/tripleo-quickstart master: Change current repo to exact delorean hash https://review.openstack.org/448512 | 19:41 |
EmilienM | pabelanger: can we promote 448512 please? | 19:41 |
*** pkovar has joined #tripleo | 19:43 | |
*** dprince has quit IRC | 19:44 | |
openstackgerrit | Mikhail S Medvedev proposed openstack/diskimage-builder master: Create PReP boot partition for PPC https://review.openstack.org/447739 | 19:44 |
pabelanger | EmilienM: yes, let the next 2 merge first | 19:45 |
pabelanger | actually | 19:45 |
pabelanger | 1 sec | 19:45 |
pabelanger | I can enqueue | 19:45 |
*** jcoufal_ has joined #tripleo | 19:46 | |
pabelanger | EmilienM: enqueue to gate | 19:47 |
EmilienM | done | 19:47 |
EmilienM | thanks | 19:47 |
*** jcoufal has quit IRC | 19:47 | |
EmilienM | pabelanger: what to do with https://review.openstack.org/#/c/448041/ ? should I unqueue it? | 19:48 |
EmilienM | pabelanger: and recheck? | 19:48 |
pabelanger | EmilienM: leave for now | 19:48 |
*** florianf has quit IRC | 19:49 | |
*** toure is now known as toure|afk | 19:49 | |
pabelanger | once it clears out, I will enqueue it again | 19:49 |
EmilienM | ok | 19:49 |
*** pkovar has quit IRC | 19:53 | |
*** liverpooler has quit IRC | 19:55 | |
*** jmelvin has joined #tripleo | 19:59 | |
*** axisys has joined #tripleo | 20:02 | |
*** dprince has joined #tripleo | 20:03 | |
*** rbrady is now known as rbrady-afk | 20:09 | |
*** gbarros has joined #tripleo | 20:12 | |
*** jcoufal_ has quit IRC | 20:12 | |
*** gbarros has quit IRC | 20:14 | |
*** jayg is now known as jayg|g0n3 | 20:20 | |
EmilienM | pabelanger: http://logs.openstack.org/23/449023/2/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/8995277/logs/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2017-03-24_20_09_15_997769 | 20:20 |
EmilienM | I don't know why this one failed | 20:20 |
EmilienM | it sounds like a multinode issue on this one | 20:21 |
*** amoralej is now known as amoralej|off | 20:25 | |
*** jobewan has joined #tripleo | 20:25 | |
EmilienM | pabelanger: should I recheck https://review.openstack.org/#/c/449023/ ? | 20:35 |
*** dougbtv|laptop has quit IRC | 20:35 | |
openstackgerrit | Ian Main proposed openstack/tripleo-heat-templates master: Remove kolla_config copy from keystone service. https://review.openstack.org/447676 | 20:35 |
pabelanger | EmilienM: I should be able to enqueue again | 20:35 |
pabelanger | once 448511 is done | 20:36 |
*** jmelvin has quit IRC | 20:36 | |
*** jmelvin has joined #tripleo | 20:38 | |
*** vpickard is now known as vpickard_ | 20:39 | |
*** trown is now known as trown|outtypewww | 20:41 | |
*** jcoufal has joined #tripleo | 20:49 | |
*** abishop has quit IRC | 20:50 | |
openstackgerrit | Ian Main proposed openstack/tripleo-heat-templates master: Remove kolla config files from ironic. https://review.openstack.org/445620 | 20:50 |
EmilienM | Slower: could we make it for all services in a row? ^ it would save CI resources ... | 20:52 |
EmilienM | dprince: ^ | 20:52 |
EmilienM | and easier to review imho | 20:53 |
dprince | EmilienM, Slower: 2-3 services at a time would be fine I think | 20:54 |
dprince | EmilienM: but I think we can go ahead and approve what he has posted probably too | 20:54 |
EmilienM | why not all? | 20:54 |
dprince | EmilienM: unless there is a rework | 20:54 |
dprince | EmilienM: just saying, batches might be the middle ground | 20:54 |
*** yolanda has quit IRC | 20:54 | |
EmilienM | why not all in the same patch, see how CI works and merge it | 20:54 |
dprince | EmilienM: all in one is fine w/ me | 20:54 |
dprince | EmilienM: I post big patches :) so no need to convince me man | 20:54 |
EmilienM | yeah :D | 20:55 |
*** yolanda has joined #tripleo | 20:56 | |
dprince | Slower: the gist is, the CI pipeline got jammed up (periodic jobs haven't run for days). This was due to the containers job. Anything we can do to help save CI resources is nice I think | 20:57 |
EmilienM | mburned: wasn't it fixed? https://bugs.launchpad.net/tripleo/+bug/1675914 | 20:59 |
openstack | Launchpad bug 1675914 in tripleo "docker-registry fails to start when installing undercloud on rhel7.3 with rdo-ocata" [Undecided,New] | 20:59 |
mburned | EmilienM: i thought so | 21:00 |
mburned | EmilienM: https://review.openstack.org/#/q/I89e14cc2a27299ce4c191d2a823deb0424693831 | 21:03 |
mburned | EmilienM: that should have fixed it | 21:03 |
EmilienM | mburned: yeah | 21:03 |
EmilienM | dhill_: can you self triage when you file a bug? | 21:03 |
*** yolanda has quit IRC | 21:04 | |
mburned | EmilienM: only thing i can think of is if the test was with released rdo ocata | 21:04 |
mburned | and maybe it hasn't hit an rpm there yet? | 21:04 |
*** lblanchard has quit IRC | 21:04 | |
mburned | but rdo trunk ocata should work | 21:05 |
EmilienM | probably | 21:05 |
*** dprince has quit IRC | 21:05 | |
*** yolanda has joined #tripleo | 21:05 | |
*** links has joined #tripleo | 21:05 | |
dmsimard | EmilienM: fyi I just temporarily disabled DLRN from picking up new package builds for centos7-master/current/ | 21:06 |
dmsimard | once it finishes its current queue it won't pick up any more packages | 21:06 |
dmsimard | it should help until the necessary patch lands | 21:06 |
EmilienM | why? | 21:06 |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: WIP: SSH known_hosts config https://review.openstack.org/449660 | 21:06 |
EmilienM | ah I see ok | 21:06 |
EmilienM | thanks, it will help indeed | 21:07 |
dmsimard | EmilienM: yeah it means /current/ will stop moving | 21:07 |
dmsimard | EmilienM: I won't interrupt what's currently building but after that it'll be stopped | 21:07 |
EmilienM | dmsimard: subscribe to https://review.openstack.org/#/c/448512/ so when it merges you can rollback | 21:07 |
dmsimard | done | 21:07 |
EmilienM | pabelanger: we need to enqueue https://review.openstack.org/#/c/448511/ and https://review.openstack.org/#/c/448512/ again please | 21:11 |
mburned | EmilienM: looks like this is the build: http://cbs.centos.org/koji/buildinfo?buildID=15964 | 21:12 |
mburned | but it's not in the release tag yet | 21:12 |
EmilienM | mburned: ah | 21:12 |
mburned | so need to get one of the RDO people to promote it | 21:12 |
mburned | only rc1 is in the release tag | 21:12 |
EmilienM | right, rdo promotion | 21:13 |
dmsimard | EmilienM: crap.. https://review.openstack.org/#/c/448512/ passed but failed due to merge fail | 21:15 |
EmilienM | pabelanger: thx | 21:15 |
EmilienM | dmsimard: should I rebase https://review.openstack.org/#/c/448511/ ? | 21:16 |
EmilienM | I think that's fine now | 21:16 |
*** jcoufal_ has joined #tripleo | 21:16 | |
dmsimard | EmilienM: I don't know enough about the relationship between oooq and oooq-extras | 21:17 |
*** yolanda has quit IRC | 21:17 | |
*** jcoufal has quit IRC | 21:17 | |
*** mburned is now known as mburned_out | 21:17 | |
dmsimard | I guess if we're going to "recheck" it, might as well rebase | 21:17 |
dmsimard | in case it could benefit from one of the patches that managed to merge | 21:18 |
pabelanger | 448511 is already in the gate | 21:18 |
pabelanger | so, don't bother rebasing | 21:18 |
dmsimard | pabelanger: ack | 21:18 |
pabelanger | okay, stepping away for a bit | 21:19 |
EmilienM | same | 21:19 |
*** yolanda has joined #tripleo | 21:19 | |
openstackgerrit | Ian Main proposed openstack/tripleo-heat-templates master: Remove kolla config entries from heat services. https://review.openstack.org/443974 | 21:21 |
*** bfournie has quit IRC | 21:24 | |
*** rlandy has quit IRC | 21:25 | |
*** yolanda has quit IRC | 21:26 | |
*** fragatina has joined #tripleo | 21:29 | |
*** radeks has quit IRC | 21:31 | |
*** jcoufal has joined #tripleo | 21:31 | |
*** yolanda has joined #tripleo | 21:32 | |
*** jcoufal_ has quit IRC | 21:32 | |
*** thrash is now known as thrash|g0ne | 21:34 | |
*** pradk has quit IRC | 21:36 | |
*** yolanda has quit IRC | 21:39 | |
*** yolanda has joined #tripleo | 21:39 | |
openstackgerrit | Andreas Florath proposed openstack/diskimage-builder master: Use stevedore for plugin config of block device https://review.openstack.org/447090 | 21:42 |
*** eck` is now known as eck`gone | 21:45 | |
*** jcoufal_ has joined #tripleo | 21:47 | |
*** jcoufal has quit IRC | 21:47 | |
*** morazi has quit IRC | 21:50 | |
*** sshnaidm|off has joined #tripleo | 22:02 | |
openstackgerrit | James Slagle proposed openstack/tripleo-docs master: Additional Networking docs for deployed-server https://review.openstack.org/442222 | 22:04 |
*** jmelvin has quit IRC | 22:08 | |
openstackgerrit | Oliver Walsh proposed openstack/tripleo-heat-templates master: WIP: SSH known_hosts config https://review.openstack.org/449660 | 22:29 |
*** jobewan has quit IRC | 22:33 | |
*** yamahata has quit IRC | 22:36 | |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Fix timeout for quickstart jobs https://review.openstack.org/449023 | 22:45 |
* mwhahaha cheers on the CI queue | 22:45 | |
mwhahaha | YOU CAN DOO EEEET | 22:46 |
pabelanger | next 3 should merge | 23:00 |
pabelanger | but still 3 failures because of the network | 23:00 |
EmilienM | yeah | 23:00 |
EmilienM | pabelanger: do you have telnet on 448041 ? | 23:01 |
EmilienM | I don't have ipv6 | 23:01 |
pabelanger | EmilienM: yes, should pass | 23:01 |
EmilienM | ok | 23:01 |
pabelanger | just doing log collection | 23:01 |
EmilienM | ah ok | 23:01 |
pabelanger | same with 448511 | 23:02 |
EmilienM | pabelanger: well the last one would be https://review.openstack.org/#/c/448512/ | 23:02 |
EmilienM | once this one merges we can ping dmsimard to unblock current | 23:02 |
pabelanger | EmilienM: yes, that failed because github.com timeout | 23:03 |
EmilienM | lol | 23:03 |
EmilienM | someone hates us this week :D | 23:03 |
pabelanger | EmilienM: so, next week, you should make a list of the things you depend on from github.com and we'll try to get a mirror solution | 23:03 |
EmilienM | I wonder why we clean hosts at the end of jobs | 23:03 |
EmilienM | it sounds very useless | 23:04 |
pabelanger | if you stop going out of network, things will fail less often I think | 23:04 |
openstackgerrit | Merged openstack-infra/tripleo-ci master: Switch trunk/cbs/buildlogs to use https https://review.openstack.org/448041 | 23:04 |
EmilienM | pabelanger: yes, thanks. Added on my list | 23:04 |
openstackgerrit | Merged openstack/tripleo-heat-templates master: Clarify Kolla build overrides for tripleo https://review.openstack.org/444308 | 23:04 |
EmilienM | pabelanger: do you know which repo? | 23:04 |
openstackgerrit | Merged openstack/tripleo-quickstart-extras master: Support includepkgs option in downloaded repos https://review.openstack.org/448511 | 23:05 |
pabelanger | EmilienM: I am guessing: Getting https://github.com/rdo-packages/tripleo-heat-templates-distgit.git | 23:06 |
EmilienM | pabelanger: please enqueue 448512 | 23:06 |
EmilienM | we can't mirror rdo-packages I guess | 23:06 |
pabelanger | EmilienM: we should be able to mirror anything off github | 23:06 |
EmilienM | pabelanger: in openstack git? | 23:07 |
pabelanger | EmilienM: that has been suggested before | 23:07 |
pabelanger | we have a pretty big git farm | 23:08 |
EmilienM | yeah | 23:08 |
EmilienM | I would be in favor of that for sure | 23:08 |
pabelanger | yes, the goal should be to remove any dependency on the network now | 23:09 |
pabelanger | this is what we did for devstack | 23:09 |
pabelanger | it rarely times out now because of download failures | 23:10 |
EmilienM | bkero: you around by any chance? | 23:10 |
bkero | EmilienM: Hi | 23:10 |
EmilienM | bkero: do we teardown something at the end of oooq multinode jobs? | 23:10 |
bkero | Tear down? No, I don't believe so. | 23:11 |
mwhahaha | yea you do | 23:11 |
bkero | Could you point it out? | 23:12 |
mwhahaha | oh sec no it's misleading | 23:12 |
EmilienM | let me find it in logs | 23:12 |
mwhahaha | --skip-tags teardown-all | 23:12 |
EmilienM | here | 23:12 |
EmilienM | http://logs.openstack.org/11/448511/7/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/c0cd008/console.html#_2017-03-24_12_27_01_671287 | 23:12 |
EmilienM | what happens in 5 min? | 23:12 |
mwhahaha | that's the log collection isn't it | 23:12 |
EmilienM | how come it takes 5 min? | 23:12 |
mwhahaha | it's skipping the teardown-all | 23:13 |
EmilienM | collecting logs should be fast no? | 23:13 |
mwhahaha | not from remote nodes and all those files | 23:13 |
bkero | It's collecting them and gzipping them and rsyncing them | 23:13 |
bkero | I don't know why collect-logs takes so long, that's an adarazs question. | 23:13 |
mwhahaha | sosreport takes like 5 minutes but that collects way more logs | 23:14 |
bkero | Here's the collectlogs run: http://logs.openstack.org/11/448511/7/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/c0cd008/logs/quickstart_collect_logs.txt.gz | 23:14 |
EmilienM | oh thanks | 23:14 |
mwhahaha | it really should just run sosreport | 23:15 |
EmilienM | hopefully we'll have timestamps soon | 23:15 |
bkero | collect-logs : Gather the logs to /tmp --------------------------------- 99.49s | 23:15 |
bkero | collect-logs : Call get_host_info script ------------------------------- 86.72s | 23:15 |
EmilienM | mwhahaha: yeah, therefore I know someone who did a cli in openstack tripleo client | 23:15 |
bkero | Those are the two major timesinks | 23:15 |
* mwhahaha is working on it | 23:15 | |
mwhahaha | stupid swift | 23:15 |
EmilienM | I've heard it's cool | 23:16 |
EmilienM | ok this is not the critical topic at this time | 23:16 |
EmilienM | even if saving time would be great | 23:16 |
bkero | Here's the script that takes 87s: http://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/tree/roles/collect-logs/templates/get_host_info.sh.j2 | 23:17 |
*** yolanda has quit IRC | 23:17 | |
EmilienM | the heat command takes time I think | 23:17 |
mwhahaha | pcs isn't fast either | 23:17 |
EmilienM | openstack stack event list --nested-depth 2 -f json overcloud | 23:18 |
EmilienM | I think this one is the winner | 23:18 |
EmilienM | (but quite useful for debug though) | 23:18 |
bkero | Something quite convenient: at the end of every ansible run, the longest-running tasks are listed. | 23:19 |
bkero | (which is where my earlier pastes came from) | 23:19 |
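Editor's sketch: the two timing lines bkero pasted above can be extracted from a saved run log with a few lines of Python. This is only an illustration. The line layout (task name, a run of dashes, seconds followed by `s`) is assumed from those two pasted lines, and the names `TASK_LINE` and `parse_task_timings` are hypothetical, not part of tripleo-quickstart.

```python
import re

# Assumed layout, taken from the two lines pasted in the channel:
#   "<role> : <task name> ---------- <seconds>s"
TASK_LINE = re.compile(r"^(?P<name>.+?)\s-+\s(?P<seconds>\d+(?:\.\d+)?)s$")


def parse_task_timings(lines):
    """Return (task name, seconds) pairs, slowest first.

    Lines that do not look like a timing summary row are ignored.
    """
    timings = []
    for line in lines:
        match = TASK_LINE.match(line.strip())
        if match:
            timings.append((match.group("name").strip(),
                            float(match.group("seconds"))))
    # Sort by duration so the biggest time sinks come first.
    return sorted(timings, key=lambda item: item[1], reverse=True)
```

Fed the two lines above, this would rank "Gather the logs to /tmp" ahead of "Call get_host_info script", matching bkero's reading of the two major time sinks.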
pabelanger | http://logs.openstack.org/19/447419/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/ec733d3/logs/delorean_logs/23/e8/23e86e4116fed979b0ccb5bca8286f41d4c3ba89_dev/build.log.txt.gz | 23:19 |
pabelanger | that looks like a broken package | 23:20 |
pabelanger | which makes me ask, how are tripleo-heat-templates getting into the gate | 23:20 |
pabelanger | if the package is broken | 23:20 |
*** links has quit IRC | 23:24 | |
pabelanger | EmilienM: http://logs.openstack.org/19/447419/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/ec733d3/console.html#_2017-03-24_23_05_18_606139 | 23:27 |
pabelanger | that is wrong | 23:27 |
pabelanger | 447419 is stable/newton | 23:28 |
pabelanger | but dlrn is using rpm-master | 23:28 |
pabelanger | how did check not catch that | 23:28 |
*** gbarros has joined #tripleo | 23:28 | |
EmilienM | I thought we fixed that | 23:28 |
EmilienM | bkero: can you help on this one? ^ | 23:29 |
mwhahaha | rebase? | 23:29 |
EmilienM | bizarre | 23:29 |
bkero | Let me go look at the code. | 23:30 |
bkero | "zuul_changes": "openstack-infra/tripleo-ci:master:refs/changes/23/449023/2^openstack-infra/tripleo-ci:master:refs/changes/41/448041/2^openstack/tripleo-heat-templates:master:refs/changes/08/444308/1^openstack/tripleo-quickstart-extras:master:refs/changes/11/448511/7^openstack/tripleo-heat-templates:master:refs/changes/74/449174/1^openstack/tripleo-ui:master:refs/changes/07/449507/1^openstack/tripleo-heat-templates:stable/newton:refs/changes/19/447419/1" | 23:33 |
*** toure|afk is now known as toure|gone | 23:33 | |
bkero | That's quite the patch stack | 23:33 |
bkero | EmilienM: for some reason it thinks the branch is 'master' | 23:34 |
pabelanger | Wow | 23:37 |
pabelanger | there is a massive amount of jinja2 magic in that playbook | 23:38 |
pabelanger | bkero: http://logs.openstack.org/19/447419/1/gate/gate-tripleo-ci-centos-7-nonha-multinode-oooq/ec733d3/console.html#_2017-03-24_23_05_18_191869 | 23:40 |
pabelanger | is the issue | 23:40 |
pabelanger | it is saying master branch | 23:40 |
bkero | http://git.openstack.org/cgit/openstack/tripleo-quickstart-extras/tree/roles/build-test-packages/library/zuul_deps.py | 23:42 |
*** tbonds has joined #tripleo | 23:43 | |
pabelanger | bkero: ya, looks to be the issue | 23:43 |
EmilienM | I'm down for this week | 23:43 |
EmilienM | bkero: if you have an idea, feel free to send a patch and irc me, I'll review it over the weekend | 23:44 |
EmilienM | see you all | 23:44 |
* EmilienM out | 23:44 | |
bkero | friday 4:45pm issues, heh | 23:44 |
pabelanger | Ya, you'll have to stop approving tripleo-heat-templates patches | 23:46 |
bkero | pabelanger: correct me if I'm reading that wrong, but isn't it saying openstack/tripleo-heat-templates:stable/newton:refs/changes/19/447419/1? | 23:46 |
*** jcoufal has joined #tripleo | 23:46 | |
pabelanger | bkero: Oh | 23:47 |
pabelanger | I see the issue | 23:47 |
bkero | ? | 23:47 |
pabelanger | it is likely the length of zuul_changes in the gate pipeline that is breaking the parsing in zuul_deps | 23:47 |
pabelanger | check will always be a single change | 23:48 |
pabelanger | http://logs.openstack.org/81/439681/2/check/gate-tripleo-ci-centos-7-nonha-multinode-oooq/90015ed/console.html#_2017-03-24_18_49_33_012931 | 23:48 |
pabelanger | that is why it passes check | 23:48 |
pabelanger | independent pipelines only have 1 change | 23:48 |
pabelanger | but gate is dependent, which has all the changes | 23:49 |
pabelanger | so, until you fix that bug | 23:49 |
pabelanger | there is no point approving tripleo-heat-templates patches | 23:49 |
bkero | You think the variable is too long and getting truncated? | 23:49 |
pabelanger | they will just slow down the gate | 23:49 |
*** jcoufal_ has quit IRC | 23:49 | |
pabelanger | bkero: let me run the code quickly | 23:49 |
*** jcoufal has quit IRC | 23:50 | |
bkero | The only data types here are an environment variable and an ansible (python) variable (string) | 23:51 |
pabelanger | bkero: ah, i see the issue | 23:56 |
bkero | ?? | 23:56 |
pabelanger | when there are duplicate projects, they are selecting the first instance, and skipping all others | 23:56 |
bkero | oh, ugh | 23:57 |
pabelanger | ya | 23:57 |
pabelanger | it is backwards | 23:57 |
pabelanger | they need to keep the last | 23:57 |
pabelanger | as that will be the latest code path | 23:57 |
*** lblanchard has joined #tripleo | 23:58 | |
pabelanger | in fact | 23:58 |
pabelanger | why does a stable project need to build dlrn package from master? | 23:58 |
pabelanger | so much wrong here | 23:58 |
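Editor's sketch: the first-versus-last behaviour pabelanger describes can be reproduced with a small, self-contained parser. This is not the code in zuul_deps.py. The function names below are hypothetical, and the only assumption about the input is the `project:branch:ref` entries joined by `^` that appear in the zuul_changes string bkero pasted at 23:33.

```python
def parse_zuul_changes(zuul_changes):
    """Split a zuul_changes string into (project, branch, ref) tuples."""
    changes = []
    for entry in zuul_changes.split("^"):
        # Branches such as "stable/newton" contain "/" but no ":",
        # so two splits on ":" yield exactly three fields.
        project, branch, ref = entry.split(":", 2)
        changes.append((project, branch, ref))
    return changes


def first_change_per_project(zuul_changes):
    """Keep the FIRST entry seen for each project (the behaviour described as the bug)."""
    selected = {}
    for project, branch, ref in parse_zuul_changes(zuul_changes):
        selected.setdefault(project, {"branch": branch, "refspec": ref})
    return selected


def last_change_per_project(zuul_changes):
    """Keep the LAST entry seen for each project (the fix suggested in the channel)."""
    selected = {}
    for project, branch, ref in parse_zuul_changes(zuul_changes):
        # Later entries overwrite earlier ones for the same project.
        selected[project] = {"branch": branch, "refspec": ref}
    return selected
```

With the gate string from 23:33, the first-wins variant picks `master` for openstack/tripleo-heat-templates, while the last-wins variant picks `stable/newton` with ref `refs/changes/19/447419/1`. In check, where the string holds a single change, both variants agree, which is consistent with the job passing check and failing in gate. Keying on project alone is a simplification; a real fix might also need to take the branch into account.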