*** rlandy has quit IRC | 01:34 | |
*** ykarel|away has joined #oooq | 01:35 | |
*** apetrich has quit IRC | 02:14 | |
*** skramaja has joined #oooq | 03:32 | |
*** udesale has joined #oooq | 03:59 | |
*** hamzy has joined #oooq | 04:15 | |
*** gkadam has joined #oooq | 05:04 | |
*** ykarel_ has joined #oooq | 05:29 | |
*** ykarel|away has quit IRC | 05:32 | |
*** sanjayu__ has joined #oooq | 05:32 | |
*** ykarel_ is now known as ykarel | 05:53 | |
*** ykarel_ has joined #oooq | 05:55 | |
*** ykarel_ is now known as ykarel|afk | 05:56 | |
*** ykarel has quit IRC | 05:58 | |
*** ykarel|afk has quit IRC | 06:01 | |
*** jbadiapa has joined #oooq | 06:01 | |
*** holser_ has joined #oooq | 06:20 | |
*** apetrich has joined #oooq | 06:24 | |
*** holser_ has quit IRC | 06:29 | |
*** ccamacho has joined #oooq | 06:43 | |
*** ccamacho has quit IRC | 06:44 | |
*** ccamacho has joined #oooq | 06:52 | |
*** ccamacho has quit IRC | 06:57 | |
*** ccamacho has joined #oooq | 06:58 | |
*** sanjayu__ is now known as saneax | 06:59 | |
*** jfrancoa has joined #oooq | 07:12 | |
jfrancoa | Good morning folks! hey, have you also seen that most of the RDO cloud legacy jobs are failing due to some dependency issue? it's really similar to this LP reported in May https://bugs.launchpad.net/tripleo/+bug/1773936 | 07:13 |
openstack | Launchpad bug 1773936 in tripleo "ci failing on cmd2 pip" [Critical,Fix released] | 07:13 |
*** holser_ has joined #oooq | 07:29 | |
*** holser_ has quit IRC | 07:30 | |
*** holser_ has joined #oooq | 07:30 | |
ssbarnea | jfrancoa: last update on this bug was 3 months ago. | 07:50 |
jfrancoa | ssbarnea: yes, but the logs which are appearing are very similar to the ones stated in that LP: https://logs.rdoproject.org/74/590774/1/openstack-check/legacy-tripleo-ci-centos-7-containers-multinode-upgrades-pike-branch/e3044a8/job-output.txt.gz#_2018-08-30_18_35_47_059395 | 07:53 |
jfrancoa | ssbarnea: I'm wondering if the ansible version increase couldn't have anything to do with it. This log could give some clue https://logs.rdoproject.org/74/590774/1/openstack-check/legacy-tripleo-ci-centos-7-containers-multinode-upgrades-pike-branch/e3044a8/job-output.txt.gz#_2018-08-30_18_35_39_493665 | 07:53 |
ssbarnea | jfrancoa: wrong, this was due to a wrong version of ansible, which was found and fixed last night; the fix was merged 3h ago. | 07:54 |
ssbarnea | Unexpected Exception: No module named manager - is caused by ansible 2.3, when we need 2.5 to run | 07:54 |
jfrancoa | ssbarnea: ah, ok. Good then, could you please send me the link to the patch that fixed it? | 07:55 |
ssbarnea | see https://review.openstack.org/#/c/591527/ | 07:55 |
jfrancoa | ssbarnea: thanks a lot | 07:55 |
ssbarnea | i suspect that we will continue to see ansible issues as zuul, ara and tripleo are all defining their own ansible versions. small variations are ok-ish, but when one is downgrading ansible from 2.5.x to 2.3.x, you can bet that everything will go wrong. | 07:57 |
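A minimal sketch of the kind of guard ssbarnea is describing here, assuming a job step could simply assert the expected ansible version before anything else runs; the 2.5 floor comes from the chat above, and the script itself is illustrative, not the merged fix in review 591527:

    # fail fast if the environment got downgraded to ansible 2.3.x, instead of
    # hitting "Unexpected Exception: No module named manager" later in the job
    set -euo pipefail
    required="2.5"
    current="$(ansible --version | head -n1 | awk '{print $2}')"
    if [ "$(printf '%s\n' "$required" "$current" | sort -V | head -n1)" != "$required" ]; then
        echo "ERROR: ansible $current found, expected >= $required" >&2
        exit 1
    fi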
*** dbecker has quit IRC | 08:04 | |
*** dbecker has joined #oooq | 08:05 | |
ssbarnea | panda|rover: Hi! can you help me with something? like explaining the first POST_FAILURE on https://review.openstack.org/#/c/560445/ -- for some reason collect was killed after 30min when it was run with 40min timeout and clearly it was not zuul as the total duration was <2h | 08:10 |
ssbarnea | i suspect some timing issues around, which could be the cause of a serious number of such errors we have seen in the last weeks. | 08:11 |
ssbarnea | i am asking because I am not sure it is a bug; if it is, I will raise it on LP. | 08:13 |
panda|rover | ssbarnea: latest patches, containerized upgrade job ? | 08:26 |
ssbarnea | yes, this one http://logs.openstack.org/45/560445/128/check/tripleo-ci-centos-7-containerized-undercloud-upgrades/ba31b85/ | 08:28 |
panda|rover | ssbarnea: collect logs shouldn't take that much | 08:28 |
ssbarnea | yep, but i suspect "du" can cause this. i remember NFS issues causing it to get stuck or be very slow. | 08:29 |
panda|rover | it could take du hast, du hast mich though. | 08:29 |
panda|rover | we don't have nfs anywhere AFAIK | 08:29 |
ssbarnea | yep, but still it may be disk and we don't want this to break our build. | 08:30 |
panda|rover | du has completed correctly | 08:30 |
panda|rover | it's not the problem | 08:30 |
ssbarnea | i see two options: removing it, or running it in the background as soon as possible and collecting the result, even if the result may be incomplete. | 08:31 |
panda|rover | the write pipe error happens always | 08:31 |
ssbarnea | running out of temp disk or memory? | 08:32 |
ssbarnea | still, the culprit line is the du one; i have almost always seen the timeout happening on it. | 08:33 |
panda|rover | I think it's happening *after* it | 08:33 |
ssbarnea | still, there is something weird because I see the "Broken pipe" but the kill happens 15 minutes later. | 08:34 |
ssbarnea | maybe we should not trust the timings; could it be related to buffering in zuul? | 08:34 |
panda|rover | the broken pipe is "normal" | 08:34 |
ssbarnea | why? | 08:35 |
panda|rover | zuul timing has always been correct AFAIR | 08:35 |
panda|rover | ssbarnea: don't know why, but it happens always | 08:36 |
panda|rover | it seems here ansible is not starting the next task in the post.yaml playbook | 08:36 |
ssbarnea | i would use async on this command | 08:36 |
ssbarnea | i really don't want to fail a build due to this command, regardless if we succeed or not from collecting its result. | 08:37 |
panda|rover | the task calls the collect logs script in its entirety, not just that command | 08:37 |
ssbarnea | if it takes more than 5min, i don't care about its result, i would only print a warning and continue | 08:38 |
ssbarnea | well, i know how to do it in bash, no need for ansible. | 08:38 |
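A rough bash sketch of the "warn and continue" idea being discussed, assuming the du call sits in a collect-logs shell function; the 5-minute cap matches the limit ssbarnea suggests above, while the /var/log path and output file are illustrative:

    # never let a slow disk-usage report eat the collect timeout:
    # cap it at 5 minutes and only warn if it is cut short
    if ! timeout 300 du -L -ch /var/log > /tmp/disk_usage.txt 2>&1; then
        echo "WARNING: disk usage report hit the 5 minute cap, result may be partial or missing"
    fi
    # the rest of log collection continues either way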
panda|rover | ssbarnea: http://logs.openstack.org/45/560445/127/check/tripleo-ci-centos-7-containerized-undercloud-upgrades/9aeb137/job-output.txt.gz#_2018-08-30_09_55_04_738134 | 08:41 |
*** chem has joined #oooq | 08:41 | |
ssbarnea | panda|rover: searching for "du -L -ch" got code in 3 places in tripleo-ci -- which one needs to be updated? | 08:41 |
*** ykarel has joined #oooq | 08:43 | |
ssbarnea | i will do some tests, i think i remember what caused this with sort. still, which line needs fixing? (2nd question is why we need 3 copies of something that looks like the same code logic) | 08:43 |
panda|rover | ssbarnea: it's not consistent, it will be hard to understand if the patch is really solving anything. | 08:43 |
panda|rover | ssbarnea: anyway the du here is the one in playbooks/tripleo-ci/templates/oooq_common_functions.sh.j2 | 08:44 |
panda|rover | ssbarnea: and the answer to your second question is "legacy, legacy, legacy stuff" | 08:44 |
ssbarnea | thanks. the comment about "tail -n +1" preventing the broken pipe message is funny, because it clearly doesn't work. | 08:46 |
panda|rover | maybe it worked for a while | 08:49 |
panda|rover | the mystery thickens | 08:49 |
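For context on why the broken pipe can be "normal": when a downstream command in a pipeline exits before the upstream writer is done, the writer gets SIGPIPE on its next write, which some coreutils report as a broken-pipe error even though the useful output was already produced. A standalone illustration, not the actual CI pipeline (whose exact shape is not shown in the log):

    # head exits after 5 lines; sort's remaining writes then fail with EPIPE,
    # and depending on how signals are set up it may print
    # "write failed: standard output: Broken pipe" while the result is still fine
    du -L -ch /usr 2>/dev/null | sort -rh | head -n 5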
panda|rover | ssbarnea: updated the promoter config on the server directly, let's see if we can get a promotion for phase1 now | 08:55 |
panda|rover | ssbarnea: how's the master phase1 looking ? | 08:55 |
ssbarnea | panda|rover: didn't run, for a long time. http://cistatus.tripleo.org/phase1/ | 08:58 |
ssbarnea | panda|rover: most of the builds failed because of the ansible bug whose fix was merged only ~3-4h ago. | 09:01 |
*** dtantsur|afk is now known as dtantsur | 09:07 | |
*** cgoncalves has joined #oooq | 09:09 | |
*** dbecker has quit IRC | 09:14 | |
*** dbecker has joined #oooq | 09:17 | |
ykarel | ssbarnea, panda|rover somehow http://cistatus.tripleo.org/phase1/ is not updated, as phase1 ran yesterday: https://ci.centos.org/view/rdo/view/promotion-pipeline/job/rdo_trunk-promote-master-current-tripleo/384/ | 09:20 |
ykarel | ssbarnea, panda|rover and it has 1 failure, fixing efforts for ^^ going on here:- https://review.openstack.org/#/c/598095/ and https://review.rdoproject.org/r/#/c/16044/ | 09:22 |
panda|rover | ykarel: yep, I know about master and virtualbmc, I was also looking at rocky, that seemed to pass | 09:24 |
ykarel | panda|rover, yes rocky was good, i was replying to: "didn't run, for a long time. http://cistatus.tripleo.org/phase1/" | 09:25 |
panda|rover | rocky phase1 is promoting | 09:28 |
panda|rover | \o/ | 09:28 |
panda|rover | promoting containers right now | 09:28 |
cgoncalves | I started getting vbmc errors yesterday: http://paste.openstack.org/show/729215/ ideas? | 09:29 |
panda|rover | cgoncalves: known bug | 09:30 |
cgoncalves | panda|rover, thanks. is there a bug# I could follow? | 09:31 |
*** ykarel is now known as ykarel|lunch | 09:33 | |
*** hamzy_ has joined #oooq | 09:37 | |
panda|rover | cgoncalves: hhmm apparently, no. This is the work in progress to fix it though https://review.openstack.org/598095 | 09:37 |
*** hamzy has quit IRC | 09:37 | |
panda|rover | ssbarnea: maybe we should open a bug on the virtualbmc, I thought there was one | 09:37 |
cgoncalves | panda|rover, what's better than a bug report? a bug fix! thanks :) | 09:37 |
panda|rover | yes, but only if it fixes the bug | 09:37 |
ssbarnea | panda|rover: doing it now, the bug | 09:39 |
ssbarnea | what is the behaviour of that virtualbmc bug, a link to the build error would help me. | 09:45 |
ssbarnea | cgoncalves: please help me fix the missing bits on the bug https://bugs.launchpad.net/tripleo/+bug/1790109 -- we need bugs before fixes, especially if they are not obvious. | 09:50 |
openstack | Launchpad bug 1790109 in tripleo "virtualbmc>=1.4 is not supported" [Undecided,New] | 09:50 |
cgoncalves | ssbarnea, I can paste what I have in http://paste.openstack.org/show/729215/ if that helps | 09:51 |
cgoncalves | although not much meaty info is available there | 09:51 |
panda|rover | ssbarnea: it's the bug hitting the phase1 master promotion right now | 09:51 |
panda|rover | so it's a promotion-blocker | 09:52 |
ssbarnea | brb 15min | 09:52 |
panda|rover | maybe it's already there | 09:52 |
cgoncalves | cherry-picked the proposed fix. testing | 09:52 |
cgoncalves | nooo! :( http://paste.openstack.org/show/729217/ | 09:53 |
-openstackstatus- NOTICE: Jobs using devstack-gate (legacy devstack jobs) have been failing due to an ara update. We use now a newer ansible version, it's safe to recheck if you see "ImportError: No module named manager" in the logs. | 09:55 | |
*** ykarel|lunch is now known as ykarel | 09:58 | |
ykarel | cgoncalves, it also needs the package changes:- https://review.rdoproject.org/r/#/c/16044/, cgoncalves can u try https://logs.rdoproject.org/44/16044/2/check/legacy-DLRN-rpmbuild/869b1f4/buildset/centos/current/python2-virtualbmc-1.4.1-0.20180831065059.fa04f7b.el7.noarch.rpm | 09:59 |
cgoncalves | ykarel, oooq newbie here. how could I test with that RPM if every time I ran oooq it rebuilds the undercloud node? | 10:02 |
cgoncalves | one possible way I see is changing roles/virtbmc/tasks/configure-vbmc.yml to point to that URL | 10:03 |
ykarel | cgoncalves, i think adding it in undercloud_rpm_dependencies should work | 10:04 |
cgoncalves | ok, should do it. it will run sudo yum install -y {{ undercloud_rpm_dependencies }} | 10:06 |
ykarel | cgoncalves, mmm that would not work as virtbmc is run before undercloud install | 10:10 |
*** skramaja has quit IRC | 10:18 | |
panda|rover | phase1 rocky successfully promoted | 10:26 |
ykarel | cool | 10:32 |
cgoncalves | ykarel, managed to get python2-virtualbmc-1.4.1-0.20180831065059.fa04f7b installed by instructing roles/virtbmc/tasks/configure-vbmc.yml to install from the url you provided | 10:37 |
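For reference, the manual equivalent of what cgoncalves describes is installing the pre-built package from the DLRN build ykarel linked above by hand on the node; the exact configure-vbmc.yml change itself is not shown in the log, and reaching the node before the virtbmc role runs is an assumption:

    # yum can install straight from the DLRN artifact URL quoted in the chat
    sudo yum install -y \
      https://logs.rdoproject.org/44/16044/2/check/legacy-DLRN-rpmbuild/869b1f4/buildset/centos/current/python2-virtualbmc-1.4.1-0.20180831065059.fa04f7b.el7.noarch.rpm
    rpm -qa | grep virtualbmc   # should now show python2-virtualbmc-1.4.1-0.20180831065059.fa04f7b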
ykarel | cgoncalves, okk cool, did the virtbmc role finish successfully with the tq patch? | 10:38 |
cgoncalves | new error: http://paste.openstack.org/show/729220/ | 10:38 |
cgoncalves | [stack@undercloud ~]$ rpm -qa | grep virtual | 10:39 |
cgoncalves | python2-virtualbmc-1.4.1-0.20180831065059.fa04f7b.el7.noarch | 10:39 |
ykarel | cgoncalves, hmm permission issues | 10:40 |
cgoncalves | yep | 10:40 |
*** dsneddon has quit IRC | 10:41 | |
ykarel | cgoncalves, looks like become: true is not needed | 10:42 |
cgoncalves | ok. testing | 10:44 |
ykarel | i pinged etingof on #rdo, can u try without become:true | 10:44 |
cgoncalves | in-progress | 10:46 |
*** cgoncalves has quit IRC | 10:48 | |
*** cgoncalves has joined #oooq | 10:50 | |
*** ykarel_ has joined #oooq | 10:52 | |
*** ykarel has quit IRC | 10:53 | |
*** udesale has quit IRC | 11:12 | |
*** ykarel_ is now known as ykarel|afk | 11:12 | |
panda|rover | ykarel|afk: did you open the bug for the metadata in the end ? | 11:15 |
*** ykarel|afk has quit IRC | 11:17 | |
ssbarnea | panda|rover: i am looking at the gate-check at https://review.openstack.org/#/c/560445/ and there is only one more job failing, but it is failing with a timeout while running tempest for ~10mins. i suspect the reason is something before that which took too much time. | 11:29 |
ssbarnea | scenario001 | 11:29 |
*** dbecker has quit IRC | 11:38 | |
*** ykarel has joined #oooq | 11:40 | |
panda|rover | ykarel: did you open the bug for the metadata in the end ? | 11:40 |
panda|rover | ykarel: it's blocking all ovb jobs | 11:40 |
ykarel | panda|rover, i haven't | 11:41 |
ykarel | was i the one supposed to create it? | 11:41 |
ykarel | i thought u were going to associate it with the rdo cloud networking issue | 11:41 |
*** saneax has quit IRC | 11:42 | |
* ykarel also thought u were working on it | 11:42 | |
ssbarnea | panda|rover: ykarel : filed the metadata error as https://bugs.launchpad.net/tripleo/+bug/1790127 | 11:45 |
openstack | Launchpad bug 1790127 in tripleo "curl fails to get http://169.254.169.254/openstack/2015-10-15/meta_data.json" [Undecided,New] | 11:45 |
ykarel | ssbarnea, cool, also add promotion_blocker | 11:46 |
ykarel | panda|rover, did u find the issue with the metadata? maybe we need to reach rhos-ops? | 11:47 |
panda|rover | ykarel: I already asked there | 11:49 |
ykarel | panda|rover, cool, | 11:49 |
*** gkadam has quit IRC | 12:00 | |
*** dsneddon has joined #oooq | 12:06 | |
*** trown|outtypewww is now known as trown | 12:07 | |
*** rlandy has joined #oooq | 12:40 | |
*** abishop has joined #oooq | 12:41 | |
abishop | hi, I could use some help troubleshooting ci check failures on stable/queens patches | 12:42 |
abishop | after https://review.openstack.org/597141 merged, scenarios 2,3 pass but 1,4 seem to consistently fail | 12:43 |
abishop | see https://review.openstack.org/595357 and failure http://logs.openstack.org/57/595357/1/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/a3f7856/logs/undercloud/var/log/mistral/executor.log.txt.gz#_2018-08-30_00_40_15_950 | 12:43 |
abishop | any debug tips are appreciated! | 12:43 |
abishop | ssbarnea: you assisted last time, so any thoughts on ^^ :D | 12:47 |
ssbarnea | abishop: i can only confirm the timeouts issue, i am affected by it too and it looks pretty bad http://dashboard-ci.tripleo.org/d/cEEjGFFmz/cockpit?orgId=1 | 12:51 |
ssbarnea | and what is worse is that this time it is not slow infra that causes the timeouts, it is normal performance. i don't really know what to do. | 12:52 |
ssbarnea | i am working on investigating scenario001 timeout causes but feel free to do the same. | 12:53 |
abishop | ssbarnea: yikes! thx for the info. I was concerned the problem was specific to the scenario 1 and 4 jobs | 12:53 |
ssbarnea | based on http://cistatus.tripleo.org/ I would say the problem is only without containers on scenario001; all those with containers passed. | 12:56 |
*** tosky has joined #oooq | 13:02 | |
abishop | ssbarnea: they pass on master and rocky, but I'm not seeing any recent passes on queens | 13:06 |
ssbarnea | abishop: indeed, not sure what fails http://logs.openstack.org/45/594145/2/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/036bb98/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 13:15 |
ssbarnea | i do see resources.WorkflowTasks_Step2_Execution: ERROR but it is not clear to me what is causing it. | 13:16 |
ykarel | ssbarnea, abishop can check mistral log for ^^ | 13:16 |
ykarel | ssbarnea, abishop http://logs.openstack.org/57/595357/1/check/tripleo-ci-centos-7-scenario001-multinode-oooq-container/a3f7856/logs/undercloud/var/log/mistral/executor.log.txt.gz#_2018-08-30_00_40_15_950 | 13:17 |
ssbarnea | "timed out waiting for ping module test success | 13:20 |
panda|rover | jfrancoa: https://review.openstack.org/597450 are these changes needed for master and rocky too ? | 13:21 |
rlandy | https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-multinode-os-net-config.yaml#L136 | 13:21 |
rlandy | and it keeps going | 13:21 |
panda|rover | oh dear | 13:22 |
jfrancoa | panda|rover: do you mean because only queens is included in the conditional? the same question was asked to weshay|pto in the patch I based this on, and he said that rocky/master make use of other config_download_args https://review.openstack.org/#/c/597141/1/config/general_config/featureset016.yml@103 | 13:23 |
jfrancoa | panda|rover: so I guess not | 13:23 |
panda|rover | rlandy: I have no idea how to fix that | 13:23 |
panda|rover | rlandy: other than creating that file anyway | 13:23 |
rlandy | panda|rover: well I am wondering if that is the right template to be used | 13:23 |
rlandy | we moved the relevant ci templates to tripleo-ci | 13:23 |
rlandy | as in | 13:23 |
rlandy | https://github.com/openstack-infra/tripleo-ci/blob/master/heat-templates/net-config-multinode.yaml | 13:24 |
panda|rover | jfrancoa: ok then | 13:24 |
rlandy | this one still resides in tht | 13:24 |
rlandy | https://github.com/openstack/tripleo-heat-templates/blob/master/ci/common/net-config-multinode-os-net-config.yaml | 13:24 |
rlandy | either way, we have no equivalent in tripleo-ci | 13:25 |
rlandy | so this is a PROBLEM | 13:26 |
jfrancoa | panda|rover: anyway, I still need to rework that patch, because some jobs are failing as they can't find /home/zuul/config-download.yaml http://logs.openstack.org/42/597142/3/check/tripleo-ci-centos-7-scenario008-multinode-oooq-container/050f23b/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz#_2018-08-29_17_29_42 | 13:26 |
jfrancoa | panda|rover: I'll -1W until I have it working | 13:26 |
rlandy | panda|rover: on the upside, /etc/nodepool is only in this file | 13:27 |
panda|rover | jfrancoa: oh | 13:27 |
rlandy | wrt tht | 13:27 |
panda|rover | jfrancoa: it's blocking promotion, we need to raise priority on this patch | 13:27 |
panda|rover | rlandy: at least .... | 13:27 |
jfrancoa | panda|rover: ok then, I'll try to give it an eye after I finish one thing | 13:28 |
panda|rover | rlandy: but we have no ansible interaction with these templates, so even the new subnodes variable will not help here. Either we move it somewhere that takes ansible variables, or we need to create that file anyway | 13:28 |
*** dtrainor has quit IRC | 13:28 | |
panda|rover | jfrancoa: thanks | 13:28 |
panda|rover | rascasoft: I need rascahard right now. The card has reached the prod chain board | 13:29 |
rascasoft | panda|rover, I saw it | 13:30 |
panda|rover | rascasoft: ok so do you need a transformation, hulk style ? | 13:31 |
rascasoft | panda|rover, gimme 5 mins to regulate the gamma radiation levels | 13:32 |
*** myoung has joined #oooq | 13:33 | |
panda|rover | and when you're ready for the bug, I'll shout "Rasca, SMASH!" | 13:35 |
rascasoft | panda|rover, I'm ready now, but even if Hulk is one of my favs, my dark side is Raoh, king of Hokuto | 13:36 |
*** rascasoft is now known as raoh | 13:36 | |
raoh | Now I'm ready | 13:36 |
rlandy | panda|rover: Emilien added those lines, maybe he can advise - will ask | 13:36 |
raoh | panda|rover, first of all the DFG here is TripleoCI, for sure, isn't it? | 13:37 |
panda|rover | raoh: bluejeans ? | 13:37 |
raoh | panda|rover, sure | 13:37 |
panda|rover | rlandy: that template is referenced pretty much everywhere in the jobs, part of the multinode environment | 13:38 |
raoh | panda|rover, https://bluejeans.com/9579113890 | 13:38 |
rlandy | panda|rover: yep - asking on #tripleo | 13:38 |
ssbarnea | added https://bugs.launchpad.net/tripleo/+bug/1790144 - panda|rover please check (if you are not also overprovisioned already) | 13:50 |
openstack | Launchpad bug 1790144 in tripleo "queens: overcloud-deploy.sh fail with mistra timed out waiting for ping module test success: SSH Error: data could not be sent to remote host " [Critical,Triaged] | 13:50 |
*** raoh is now known as rascasoft | 13:52 | |
panda|rover | ssbarnea: is it multinode in openstack ci ? | 13:53 |
ssbarnea | panda|rover: job name is tripleo-ci-centos-7-scenario001-multinode-oooq-container | 13:54 |
ssbarnea | was reported by ykarel half an hour ago, I looked at it but that is all I was able to find. | 13:56 |
ykarel | mmm, i think abishop reported ^^ | 13:58 |
panda|rover | Is there something that is not failing today :( | 13:58 |
ykarel | are gate jobs master/rocky also failing | 13:59 |
abishop | ykarel, panda|rover: I hadn't filed a bug, just chatted w/ ssbarnea about it. failures are on stable/queens | 14:01 |
panda|rover | abishop: is this something related with the config download parameters ? | 14:02 |
panda|rover | ssbarnea: ^ | 14:02 |
abishop | panda|rover: don't think so, in fact https://review.openstack.org/597141 got two jobs working (scenarios 2,3). 1 and 4 still consistently fail | 14:03 |
rlandy | pls can I get another core vote on https://review.openstack.org/#/c/581488 | 14:06 |
rlandy | sshnaidm|off already approved. | 14:06 |
rlandy | panda|rover: you are the only one around - pls | 14:06 |
ykarel | panda|rover, abishop then possibly related with ceph, | 14:07 |
ykarel | slagle might have idea on it | 14:07 |
panda|rover | rlandy: approved | 14:12 |
rlandy | thank you, sir | 14:12 |
ykarel | also it doesn't look like config-download is working there | 14:13 |
panda|rover | ssbarnea: https://review.rdoproject.org/zuul/stream.html?uuid=b99085d01cff43149aa8b4c682571fb3&logfile=console.log | 14:19 |
panda|rover | ssbarnea: metadata bug is fixed | 14:19 |
*** saneax has joined #oooq | 14:20 | |
*** sanjayu_ has joined #oooq | 14:23 | |
*** hamzy_ is now known as hamzy | 14:26 | |
*** ykarel is now known as ykarel|away | 14:26 | |
*** saneax has quit IRC | 14:26 | |
*** sanjayu_ has quit IRC | 14:36 | |
*** dtrainor has joined #oooq | 14:37 | |
rascasoft | panda|rover, wow that was fast, I moved the card to done | 14:37 |
*** dtrainor has quit IRC | 14:37 | |
*** dtrainor has joined #oooq | 14:38 | |
*** dsneddon has quit IRC | 14:39 | |
rlandy | panda|rover: I am sort of out of ideas here - waiting on more direction from alex/emilien | 14:40 |
rlandy | and to think | 14:40 |
rlandy | all I ever wanted to do was add a browbeat job | 14:40 |
rlandy | it's become a whole career | 14:41 |
*** jaosorior has quit IRC | 14:43 | |
rascasoft | myoung, hey man you around? | 14:46 |
panda|rover | rlandy: you and your acts of kindness :) | 14:46 |
rlandy | no good deed goes unpunished | 14:47 |
rascasoft | myoung, so I just realized that I can't use rocky nic-configs with queens, so I'm separating all the nic configs per release inside a review, we'll need it to be merged quickly to make the jobs green again... So in short: incoming review :D | 14:47 |
*** dsneddon has joined #oooq | 14:49 | |
myoung | rascasoft, aye, gotcha | 15:01 |
myoung | rascasoft, ongoing VPN issues, I'm just now as of a few mins ago able to get in via phx2, rdu still down/flaky | 15:01 |
rascasoft | myoung, yeah everybody hurts today | 15:02 |
myoung | rascasoft, last night also i noticed all the slaves in rdo-manager-64 on jenkins are offline with ssh connectivity issues, trying to get in now to check...and those are needed to launch the jobs | 15:02 |
myoung | (well for some of them) - I have a patch or two coming for rdo2:rocky as well, tweaks from the patch you merged a day or so ago | 15:04 |
*** rook has quit IRC | 15:04 | |
*** myoung is now known as myoung_ | 15:04 | |
rascasoft | myoung_, ack, put me as a reviewer | 15:05 |
*** myoung has joined #oooq | 15:06 | |
*** ykarel|away has quit IRC | 15:08 | |
*** myoung_ has quit IRC | 15:10 | |
*** ccamacho has quit IRC | 15:12 | |
rascasoft | myoung, https://code.engineering.redhat.com/gerrit/148891 testing it here https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/oooq-queens-rdo_trunk-bmu-ha-lab-cygnus-float_nic_with_vlans/7/console if the pip cache doesn't make us mad as usual | 15:32 |
rascasoft | :D | 15:32 |
*** jfrancoa has quit IRC | 15:45 | |
myoung | rascasoft: methinks we should just make all those jobs obliterate the pip cache each time. I still think because we're not rev'ing the pip module version per commit we're always going to have these issues. that or we just stop using it entirely | 15:56 |
myoung | rascasoft: would rather just pay the 4 min tax each job than spend hours debugging what ends up being "pip doing the right thing" when we have stale ansible bits in the cache | 15:57 |
myoung | (playbooks, ymls, etc) | 15:57 |
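The "obliterate the pip cache each time" option myoung describes amounts to one extra step at the top of each job; a hedged sketch, where the cache path is pip's default location and the requirements file name is only illustrative:

    # pay the ~4 minute tax up front so stale quickstart/ansible bits
    # can never be reused from a previous run on the same slave
    rm -rf ~/.cache/pip
    pip install --no-cache-dir -r requirements.txt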
*** dsneddon has quit IRC | 16:03 | |
*** dsneddon has joined #oooq | 16:04 | |
rascasoft | myoung, agreed all down the line here | 16:05 |
*** holser_ has quit IRC | 16:07 | |
*** abishop has quit IRC | 16:21 | |
*** abishop has joined #oooq | 16:22 | |
myoung | rascasoft: re 148891, makes sense what you're doing. The only other route is to put conditionals inside the yamls, but I think that's worse and not in line with eventually merging this with upstream/zuul design | 16:24 |
rascasoft | myoung, exactly, I thought the same | 16:24 |
myoung | (e.g. per release) | 16:24 |
*** trown is now known as trown|lunch | 16:34 | |
*** ssbarnea|ruck has joined #oooq | 16:36 | |
ssbarnea|ruck | it seems that the matrix server died again about an hour ago, so I may have missed a few messages. | 16:37 |
rlandy | rascasoft: so do you ever use ci-rhos? | 16:57 |
ssbarnea|ruck | panda|rover still around here? can you kick https://review.openstack.org/#/c/571176/ ? | 16:58 |
*** trown|lunch is now known as trown | 17:34 | |
*** sshnaidm|off has quit IRC | 17:37 | |
*** sshnaidm|off has joined #oooq | 17:38 | |
*** sshnaidm|off has quit IRC | 17:41 | |
*** sshnaidm|off has joined #oooq | 17:41 | |
*** dsneddon has quit IRC | 17:58 | |
*** dtantsur is now known as dtantsur|afk | 17:59 | |
*** dsneddon has joined #oooq | 18:00 | |
ssbarnea|ruck | found something weird, this reports exit 1 even though I see the stack was created successfully and there is no reason to fail, http://logs.rdoproject.org/50/597450/3/openstack-check/legacy-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/fcafc96/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 18:12 |
*** dsneddon has quit IRC | 18:35 | |
ssbarnea|ruck | I raised it as https://bugs.launchpad.net/tripleo/+bug/1790199 | 18:35 |
openstack | Launchpad bug 1790199 in tripleo "fs035: openstack overcloud deploy returns error code without any signs or error" [Critical,Triaged] | 18:35 |
*** dsneddon has joined #oooq | 18:38 | |
*** dsneddon has quit IRC | 18:43 | |
*** tosky has quit IRC | 18:52 | |
*** dsneddon has joined #oooq | 18:55 | |
*** openstackstatus has quit IRC | 18:58 | |
rlandy | ssbarnea|ruck: any progress on the vxlan networking error? | 19:05 |
rlandy | https://logs.rdoproject.org/22/596422/32/openstack-check/legacy-tripleo-ci-centos-7-containers-multinode-upgrades-pike-branch/79a26b5/logs/undercloud/home/zuul/vxlan_networking.sh.log.txt.gz#_2018-08-31_13_05_42 | 19:06 |
*** openstackstatus has joined #oooq | 19:36 | |
*** ChanServ sets mode: +v openstackstatus | 19:36 | |
*** openstackstatus has quit IRC | 19:55 | |
*** openstackstatus has joined #oooq | 19:57 | |
*** ChanServ sets mode: +v openstackstatus | 19:57 | |
*** openstackstatus has quit IRC | 20:11 | |
*** openstackstatus has joined #oooq | 20:11 | |
*** ChanServ sets mode: +v openstackstatus | 20:11 | |
*** holser_ has joined #oooq | 20:23 | |
*** trown is now known as trown|outtypewww | 20:36 | |
*** openstackstatus has quit IRC | 20:36 | |
*** openstackstatus has joined #oooq | 20:36 | |
*** ChanServ sets mode: +v openstackstatus | 20:36 | |
*** myoung has quit IRC | 20:44 | |
*** holser_ has quit IRC | 20:58 | |
*** dtrainor has quit IRC | 22:01 | |
*** dtrainor has joined #oooq | 22:02 | |
*** tosky has joined #oooq | 22:09 | |
*** rlandy has quit IRC | 22:16 | |
*** jrist has quit IRC | 22:58 | |
*** tosky has quit IRC | 23:14 |