*** dsneddon has joined #oooq | 00:00 | |
*** dsneddon has quit IRC | 00:05 | |
*** d0ugal has quit IRC | 00:05 | |
*** d0ugal has joined #oooq | 00:21 | |
*** dsneddon has joined #oooq | 00:38 | |
*** dsneddon has quit IRC | 00:43 | |
*** dsneddon has joined #oooq | 01:11 | |
*** dsneddon has quit IRC | 01:16 | |
*** dsneddon has joined #oooq | 01:50 | |
*** dsneddon has quit IRC | 01:55 | |
*** dsneddon has joined #oooq | 02:27 | |
*** dsneddon has quit IRC | 02:32 | |
*** dsneddon has joined #oooq | 03:06 | |
*** dsneddon has quit IRC | 03:11 | |
*** dsneddon has joined #oooq | 03:44 | |
*** hamzy_ has joined #oooq | 03:48 | |
*** aakarsh|2 has quit IRC | 03:49 | |
*** dsneddon has quit IRC | 03:49 | |
*** hamzy has quit IRC | 03:51 | |
*** ykarel has joined #oooq | 03:59 | |
*** udesale has joined #oooq | 04:05 | |
*** surpatil has joined #oooq | 04:18 | |
*** ykarel has quit IRC | 04:19 | |
*** dsneddon has joined #oooq | 04:23 | |
*** ykarel has joined #oooq | 04:24 | |
*** dsneddon has quit IRC | 04:27 | |
*** raukadah is now known as chkumar|rover | 04:37 | |
*** dsneddon has joined #oooq | 05:02 | |
*** bhagyashris has joined #oooq | 05:11 | |
*** dsneddon has quit IRC | 05:35 | |
*** jaosorior has joined #oooq | 05:51 | |
*** surpatil has quit IRC | 06:02 | |
*** dsneddon has joined #oooq | 06:06 | |
*** dsneddon has quit IRC | 06:11 | |
*** ykarel is now known as ykarel|afk | 06:21 | |
*** skramaja has joined #oooq | 06:23 | |
*** ykarel|afk is now known as ykarel | 06:32 | |
*** dsneddon has joined #oooq | 06:37 | |
*** dsneddon has quit IRC | 06:41 | |
*** bhagyashris has quit IRC | 06:47 | |
*** ykarel is now known as ykarel|afk | 07:01 | |
*** jtomasek has joined #oooq | 07:07 | |
*** bogdando has joined #oooq | 07:08 | |
*** dsneddon has joined #oooq | 07:10 | |
*** jpena|off is now known as jpena | 07:11 | |
*** dsneddon has quit IRC | 07:15 | |
*** sshnaidm|afk is now known as sshnaidm | 07:16 | |
sshnaidm | chkumar|rover, commented on https://review.opendev.org/#/c/678472/ | 07:29 |
*** jfrancoa has joined #oooq | 07:29 | |
sshnaidm | chkumar|rover, well, it's not enough.. I'll start from reverting this | 07:34 |
chkumar|rover | sshnaidm: https://review.opendev.org/#/c/678472/1/roles/run-test/templates/oooq_common_functions.sh.j2@86 it should be venvpath/share /usr/local/share/tripleo-quickstart/roles, right? | 07:36 |
chkumar|rover | sshnaidm: ok | 07:38 |
*** ykarel|afk is now known as ykarel | 07:43 | |
*** apetrich has joined #oooq | 07:43 | |
*** dsneddon has joined #oooq | 07:50 | |
*** dsneddon has quit IRC | 07:58 | |
sshnaidm | chkumar|rover, https://review.rdoproject.org/r/#/c/21881/ | 07:59 |
chkumar|rover | sshnaidm: shall I also +w it? | 08:02 |
sshnaidm | chkumar|rover, feel free :) | 08:02 |
*** jfrancoa has quit IRC | 08:03 | |
*** derekh has joined #oooq | 08:09 | |
*** jfrancoa has joined #oooq | 08:18 | |
*** ykarel is now known as ykarel|lunch | 08:26 | |
*** dsneddon has joined #oooq | 08:27 | |
*** dsneddon has quit IRC | 08:32 | |
chkumar|rover | sshnaidm: fs039 tempest runs using os_tempest https://logs.rdoproject.org/24/639324/43/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp_1supp-featureset039/69d3757/job-output.txt.gz#_2019-08-26_08_31_14_619952 but it failed | 08:35 |
chkumar|rover | no logs so we cannot do anything | 08:36 |
sshnaidm | ipalib.errors.NotFound: controller-0.ooo.test: host not found | 08:40 |
sshnaidm | seems like ipa dns didn't work | 08:40 |
chkumar|rover | sshnaidm: where ctrlplane network is used? | 08:41 |
chkumar|rover | is it related to undercloud | 08:42 |
sshnaidm | chkumar|rover, it's related to IPA setup. On supplemental node we have IPA server which serves as DNS for all nodes - undercloud, overcloud. | 08:43 |
sshnaidm | chkumar|rover, maybe it's worth seeing how they run tempest downstream, it may need some additional settings | 08:44 |
*** dtantsur|afk is now known as dtantsur | 08:46 | |
*** dsneddon has joined #oooq | 09:01 | |
*** dsneddon has quit IRC | 09:06 | |
*** ykarel|lunch is now known as ykarel | 09:07 | |
chkumar|rover | sshnaidm: sure | 09:21 |
chkumar|rover | sshnaidm: filing a bug and then taking it from there | 09:21 |
chkumar|rover | sshnaidm: https://9b8de9f26d5cfa9d157d-87a4d4201f92d032de2e8328e3e57fa8.ssl.cf2.rackcdn.com/678476/2/check/tripleo-ci-centos-7-scenario004-standalone/0f3b02d/logs/ | 09:34 |
sshnaidm | chkumar|rover, ? | 09:35 |
chkumar|rover | sshnaidm: https://review.opendev.org/#/c/678476/ | 09:35 |
chkumar|rover | sshnaidm: let's merge the revert | 09:35 |
sshnaidm | chkumar|rover, aah, ok | 09:35 |
*** chem has joined #oooq | 09:37 | |
*** dsneddon has joined #oooq | 09:40 | |
*** dsneddon has quit IRC | 09:44 | |
*** apetrich has quit IRC | 09:49 | |
*** saneax has joined #oooq | 09:59 | |
chkumar|rover | arxcruz: https://review.opendev.org/#/c/678418/ will fix the undefined tempest_install_method | 10:04 |
arxcruz | chkumar|rover: okay, for now i'm adding all the variables on fs001, then i'll move to baremetal-full-overcloud-validate.yml | 10:05 |
arxcruz | because there are a lot of vars, every run i got a different error | 10:05 |
arxcruz | and i don't want to keep adding depends-on, that will increase the time of checking | 10:05 |
*** dsneddon has joined #oooq | 10:19 | |
*** ksambor is now known as ksambor|lunch | 10:23 | |
*** dsneddon has quit IRC | 10:24 | |
*** jaosorior has quit IRC | 10:26 | |
*** sanjayu_ has joined #oooq | 10:43 | |
*** saneax has quit IRC | 10:43 | |
*** surpatil has joined #oooq | 10:55 | |
*** ksambor|lunch is now known as ksambor | 10:56 | |
*** dsneddon has joined #oooq | 10:59 | |
sshnaidm | arxcruz, chkumar|rover, do you know why that is happening? No package pytho3n-barbican-tests-tempest available. http://logs.rdoproject.org/45/21945/3/check/periodic-tripleo-ci-rhel-8-scenario002-standalone-master/1c89d62/job-output.txt.gz | 11:01 |
chkumar|rover | sshnaidm: checking | 11:02 |
chkumar|rover | sshnaidm: fixing it | 11:02 |
arxcruz | sshnaidm: chkumar|rover pytho3n | 11:04 |
arxcruz | sshnaidm: chkumar|rover https://review.opendev.org/#/c/678520/ | 11:04 |
*** dsneddon has quit IRC | 11:04 | |
*** udesale has quit IRC | 11:04 | |
sshnaidm | arxcruz, cool :) | 11:04 |
*** tesseract has joined #oooq | 11:12 | |
*** ccamacho has joined #oooq | 11:21 | |
*** jpena is now known as jpena|lunch | 11:25 | |
*** dsneddon has joined #oooq | 11:30 | |
*** dsneddon has quit IRC | 11:35 | |
*** apetrich has joined #oooq | 11:41 | |
*** jaosorior has joined #oooq | 11:44 | |
*** rlandy has joined #oooq | 11:59 | |
*** rlandy is now known as rlandy|ruck | 11:59 | |
*** derekh has quit IRC | 12:00 | |
rlandy|ruck | chkumar|rover: hi | 12:00 |
rlandy|ruck | chkumar|rover: just two more days :) | 12:01 |
rlandy|ruck | chkumar|rover: ok if I move https://trello.com/c/ZvV5ul81/1029-cixlp1836046tripleociproa-tempestscenariotestnetworkbasicopstestnetworkbasicops-failing-on-queens to done? | 12:01 |
weshay_MOD | ykarel chkumar|rover you guys looking at scen003 packstack in master? | 12:05 |
weshay_MOD | Error while evaluating a Resource Statement, Could not find declared class ::ceilometer::agent::central | 12:05 |
*** dsneddon has joined #oooq | 12:06 | |
ykarel | weshay_MOD, https://review.opendev.org/#/c/678169/ | 12:06 |
weshay_MOD | ykarel thank you.. please open a lp in case it doesn't promote in time | 12:07 |
*** weshay_MOD is now known as weshay | 12:07 | |
ykarel | weshay, ack, will open against packstack | 12:08 |
*** dsneddon has quit IRC | 12:10 | |
rlandy|ruck | chkumar|rover: sshnaidm: does the revert close out https://bugs.launchpad.net/tripleo/+bug/1841405 or is there still more work required here? | 12:11 |
openstack | Launchpad bug 1841405 in tripleo "role 'dump_vars' not found leading to logs not getting collect in post" [Critical,Confirmed] | 12:11 |
chkumar|rover | rlandy|ruck: revert worked | 12:12 |
rlandy|ruck | chkumar|rover: k - thank - closing that bug | 12:12 |
arxcruz | chkumar|rover: woot, tempest run, it fails, but it runs | 12:13 |
chkumar|rover | arxcruz: error log? | 12:13 |
chkumar|rover | arxcruz: please try with config_drive fix | 12:14 |
chkumar|rover | arxcruz: https://review.opendev.org/#/c/639324/45/config/general_config/featureset039.yml@150 | 12:15 |
rlandy|ruck | chkumar|rover: from ^^ ... ok if I move https://trello.com/c/ZvV5ul81/1029-cixlp1836046tripleociproa-tempestscenariotestnetworkbasicopstestnetworkbasicops-failing-on-queens to done? | 12:15 |
rlandy|ruck | there was a failure in the last run but it didn't get to tempest | 12:15 |
chkumar|rover | rlandy|ruck: yes, move it to done | 12:16 |
arxcruz | chkumar|rover: i believe it's related to the ip address for tempest_cidr | 12:16 |
chkumar|rover | arxcruz: might be or might not be as we have no logs | 12:17 |
arxcruz | chkumar|rover: http://logs.rdoproject.org/00/673400/13/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/d534e97/job-output.txt.gz | 12:17 |
rlandy|ruck | weshay: still want to keep https://trello.com/c/rPZGwJmU/1081-cixlp1839532tripleociproa-tripleo-gate-jobs-are-failing-to-pull-containers-when-running-on-ovh-provider-with-unauthorized-error on the escalation board? | 12:19 |
weshay | rlandy|ruck let's explain it today and close it today | 12:20 |
panda | rfolco: ping me whenever | 12:21 |
rfolco | panda, ping | 12:21 |
chkumar|rover | rlandy|ruck: http://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-scenario001-standalone does not look good | 12:21 |
rlandy|ruck | chkumar|rover: better than it was :) - adding on etherpad to investigate | 12:22 |
rfolco | panda, do you have a task for me ? | 12:24 |
panda | rfolco: wanna sync ? | 12:24 |
rfolco | panda, sure | 12:25 |
*** surpatil has quit IRC | 12:25 | |
rfolco | panda, my bj or yours? | 12:26 |
chkumar|rover | arxcruz: please try fs01 with compute_feature_enabled.config_drive: 'True' and let's see if it works or not, then we need to dig into what is wrong | 12:28 |
*** jpena|lunch is now known as jpena | 12:30 | |
*** Vorrtex has joined #oooq | 12:31 | |
panda | rfolco: I'm in yours | 12:32 |
rfolco | k k | 12:33 |
*** dsneddon has joined #oooq | 12:44 | |
weshay | rlandy|ruck chkumar|rover let's look at https://trello.com/c/dq5ntFjS/1084-cixlp1840763tripleociproa-queensfs037-upgrade-jobs-are-failiing-while-doing-overcloud-minor-update-with-ssh-error | 12:45 |
rlandy|ruck | weshay: k - your bj? | 12:45 |
weshay | rlandy|ruck looks like ssh is failing.. is that the consistent issue? | 12:46 |
* rlandy|ruck will confirm with latest failure - sec | 12:46 | |
chkumar|rover | weshay: that ssh issue is a consistent failure | 12:47 |
weshay | chkumar|rover rlandy|ruck ok.. any chance we can setup a reproducer and ask jose luis or other upgrade folks to take a look? | 12:48 |
rlandy|ruck | 2019-08-26 06:15:10 | "msg": "SSH Error: data could not be sent to remote host \"192.168.24.3\". Make sure this host can be reached over ssh", | 12:48 |
rlandy|ruck | weshay: yep - same one | 12:48 |
chkumar|rover | weshay: that we can do | 12:48 |
*** dsneddon has quit IRC | 12:48 | |
rlandy|ruck | weshay: ack - we can try a reproducer - just want to check if we have similar failures on other upgrade | 12:50 |
weshay | k.. thanks | 12:50 |
rlandy|ruck | weshay: chkumar|rover: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-stein | 12:51 |
rlandy|ruck | ^^ stein is all green | 12:51 |
weshay | rockin.. that is still the important one :) | 12:52 |
rlandy|ruck | as a comparison | 12:52 |
weshay | ah.. thought you meant the status.. ignore me | 12:52 |
rlandy|ruck | weshay: no the same upgrade in stein | 12:52 |
rfolco | scrum time | 13:00 |
* chkumar|rover headed home, will connect from there | 13:00 | |
*** Goneri has joined #oooq | 13:02 | |
jfrancoa | weshay: rlandy|ruck hey, I'm having a look at the very same issue. Just for my understanding, is this something that has been happening for a long time? | 13:10 |
jfrancoa | weshay: rlandy|ruck or did it start recently? | 13:10 |
rlandy|ruck | jfrancoa: it's been going on for a while | 13:10 |
rlandy|ruck | jfrancoa: here is the history ... | 13:11 |
rlandy|ruck | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-queens | 13:11 |
rlandy|ruck | ^^ so pretty much as long as we are keeping history | 13:12 |
jfrancoa | rlandy|ruck: oh yes..it's already been a while. Ok, I think I'll give the reproducer a try to debug the issue, but I suspect it could be the missing overcloud-ssh-user in "overcloud update prepare" | 13:12 |
rlandy|ruck | jfrancoa: as a comparison, the stein and rocky jobs are green ... https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset037-updates-rocky | 13:13 |
jfrancoa | rlandy|ruck: if you check any job in rocky or stein, we pass the user zuul during overcloud deploy and during prepare too | 13:13 |
rlandy|ruck | jfrancoa: this job runs in periodic ... so no check | 13:13 |
rlandy|ruck | jfrancoa: but yes, the rocky and stein equivalent jobs pass | 13:14 |
rlandy|ruck | per links above | 13:14 |
rlandy|ruck | jfrancoa: I was just starting to set up a reproducer ... | 13:14 |
jfrancoa | rlandy|ruck: could you give me access once it finishes, please? or are you running it locally? | 13:14 |
rlandy|ruck | jfrancoa: but if you have a fix you suspect might work, we can spin on that | 13:15 |
rlandy|ruck | jfrancoa: ack - it will be on my rdocloud tenant - pls pm send me your key | 13:15 |
jfrancoa | rlandy|ruck: if not, I'll try one in my rdo cloud account. So far it's a hint...that's why I would like to have an environment to confirm it | 13:15 |
rlandy|ruck | jfrancoa: ack - will ping you with env | 13:15 |
jfrancoa | rlandy|ruck: http://pastebin.test.redhat.com/791671 | 13:16 |
rlandy|ruck | thanks | 13:16 |
jfrancoa | rlandy|ruck: thanks. Meanwhile I'll investigate why we pass the zuul user in rocky and stein, but nothing for queens.. | 13:16 |
rlandy|ruck | so that might really be it | 13:18 |
*** dsneddon has joined #oooq | 13:18 | |
weshay | chkumar|rover you are coming to the pre-planning mtg right? | 13:21 |
arxcruz | chkumar|rover: that option is already in the fs001 | 13:22 |
*** dsneddon has quit IRC | 13:23 | |
*** skramaja has quit IRC | 13:25 | |
*** ykarel is now known as ykarel|away | 13:26 | |
chkumar|rover | weshay: yes | 13:30 |
chkumar|rover | arxcruz: the syntax is different | 13:31 |
arxcruz | chkumar|rover: what you mean ? | 13:31 |
chkumar|rover | tempest_tempest_conf_overrides: | 13:31 |
chkumar|rover | compute_feature_enabled.config_drive: 'True' | 13:32 |
chkumar|rover | arxcruz: ^^ | 13:32 |
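As a featureset fragment, the override chkumar|rover quotes would look like this (a sketch assuming the os_tempest role's tempest_tempest_conf_overrides variable, where each key is a tempest.conf "section.option" pair):

```yaml
# Asks os_tempest to write config_drive = True under the
# [compute-feature-enabled] section of tempest.conf
tempest_tempest_conf_overrides:
  compute_feature_enabled.config_drive: 'True'
```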
arxcruz | chkumar|rover: oh, yeah, because it's os_tempest, shit... | 13:38 |
chkumar|rover | ack | 13:45 |
*** dsneddon has joined #oooq | 13:56 | |
rlandy|ruck | sshnaidm: hi - reproducer keeps failing on ansible-role-tripleo-ci-reproducer : Wait for zuul tenant error copied in http://pastebin.test.redhat.com/791691. any suggestions? | 14:00 |
*** dsneddon has quit IRC | 14:00 | |
*** apetrich has quit IRC | 14:00 | |
rlandy|ruck | I can see the :9000 zuul | 14:01 |
rlandy|ruck | no jobs are launched | 14:01 |
sshnaidm | rlandy|ruck, need to see logs, could be anything.. take a look at scheduler logs, it's most critical service | 14:02 |
*** apetrich has joined #oooq | 14:03 | |
chkumar|rover | rlandy|ruck: time to remove pike job https://review.opendev.org/#/c/678154/ from where we will start? | 14:03 |
rlandy|ruck | chkumar|rover: not sure I understand question 'from where we will start'? | 14:04 |
rlandy|ruck | ie: which jobs to remove first? | 14:04 |
chkumar|rover | rlandy|ruck: yes which jobs to remove first? | 14:05 |
chkumar|rover | first remove from promotion criteria | 14:05 |
rlandy|ruck | chkumar|rover: just fighting with reproducer now - will look in a bit | 14:05 |
chkumar|rover | rlandy|ruck: ack | 14:05 |
rlandy|ruck | basically job definitions go last | 14:05 |
rlandy|ruck | chkumar|rover: everywhere they are called can be removed | 14:05 |
rlandy|ruck | pipeline etc. | 14:06 |
chkumar|rover | rlandy|ruck: ok, I will take a look and propose the patches | 14:06 |
rlandy|ruck | chkumar|rover: go to review.rdoproject.org | 14:06 |
rlandy|ruck | remove pipeline | 14:06 |
rlandy|ruck | first | 14:06 |
rlandy|ruck | then other check inclusions | 14:07 |
ykarel|away | rlandy|ruck, chkumar|rover fyi ocata-eol tag job failed | 14:11 |
ykarel|away | i see tags are created, but stable/ocata not removed | 14:11 |
ykarel|away | will followup with openstack/releases | 14:11 |
ykarel|away | https://storage.gra1.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/logs_9c/9c30af482980c34ed865db11bbcd56bf3fd4cf1f/release-post/tag-releases/ba9f773/job-output.txt | 14:12 |
*** aakarsh|2 has joined #oooq | 14:12 | |
chkumar|rover | ykarel|away: checking | 14:13 |
rlandy|ruck | sshnaidm: k - http://pastebin.test.redhat.com/791697 - scheduler not starting properly | 14:13 |
rlandy|ruck | did we qualify against python 3.7? | 14:14 |
chkumar|rover | ykarel|away: openstack/os-cloud-config.git this one failed | 14:15 |
rlandy|ruck | panda: web search shows you hit this error with reproducer ... http://pastebin.test.redhat.com/791697 - remember how you got out of it? | 14:16 |
ykarel|away | chkumar|rover, hmm i see tag not exist for it, neither ocata-em nor ocata-eol | 14:16 |
ykarel|away | https://github.com/openstack/os-cloud-config/releases | 14:17 |
chkumar|rover | rlandy|ruck: yes | 14:18 |
panda | rlandy|ruck: does the web search show you if I ever did ? Seems you need to add some host to the /etc/hosts, but as I said, I had to give up on testing the reproducer because it was not working for me, maybe that's one of the errors that I could not get out of. | 14:20 |
jfrancoa | rlandy|ruck: I believe this could fix the job https://review.opendev.org/#/c/678572/ , let's try it out once the reproducer is ready | 14:20 |
rlandy|ruck | jfrancoa: I'm having some reproducer troubles - I will set up a testproject job to run your fix with fs037 queens | 14:21 |
chkumar|rover | rlandy|ruck: I am taking care of above upgrade test | 14:21 |
chkumar|rover | testing in testproject | 14:21 |
rlandy|ruck | chkumar|rover: k - pls add jfrancoa to the review | 14:22 |
jfrancoa | chkumar|rover: rlandy|ruck awesome, thanks. let's see if at least we can progress from that step (some different issue might appear though) | 14:23 |
rlandy|ruck | ok - will watch progress | 14:23 |
chkumar|rover | rlandy|ruck: jfrancoa https://review.rdoproject.org/r/#/c/21946/ | 14:25 |
rlandy|ruck | chkumar|rover: thanks | 14:26 |
rlandy|ruck | chkumar|rover: why there - rather than testproject? | 14:27 |
chkumar|rover | rlandy|ruck: it also works there | 14:27 |
rlandy|ruck | yeah - just interested | 14:27 |
chkumar|rover | rlandy|ruck: I stopped using testproject, since we can do the same thing with rdo-jobs | 14:28 |
*** dsneddon has joined #oooq | 14:30 | |
*** dsneddon has quit IRC | 14:35 | |
*** jbadiapa has quit IRC | 14:36 | |
rlandy|ruck | got something going now | 14:38 |
*** jbadiapa has joined #oooq | 14:43 | |
rlandy|ruck | jfrancoa: pls try ssh zuul@38.145.32.46 | 14:45 |
rlandy|ruck | that is your reproducer env | 14:45 |
jfrancoa | rlandy|ruck: I'm in, did the job already fail? Is that host supposed to be the undercloud? | 14:46 |
jfrancoa | rlandy|ruck: I can't locate the stackrc | 14:47 |
rlandy|ruck | jfrancoa: you're on the undercloud | 14:47 |
rlandy|ruck | it's just started running | 14:47 |
rlandy|ruck | sending you zuul info | 14:47 |
jfrancoa | rlandy|ruck: ahh ok. I'll wait then | 14:47 |
rlandy|ruck | jfrancoa: will ping you when the test stops | 14:49 |
jfrancoa | rlandy|ruck: ack, thank you | 14:49 |
rlandy|ruck | weshay: ping re: https://bugs.launchpad.net/tripleo/+bug/1833465 - hitting that error | 14:54 |
openstack | Launchpad bug 1833465 in tripleo "tripleo reproducer fails w/ "waiting on logger"" [Critical,Incomplete] | 14:54 |
rlandy|ruck | attached patch was never merged | 14:54 |
rlandy|ruck | what did you do to fix? | 14:54 |
rlandy|ruck | upstream-cloudinit-centos-7 image | 14:56 |
rlandy|ruck | Updated At | 14:56 |
rlandy|ruck | 2019-05-10T11:31:44Z | 14:56 |
rlandy|ruck | nvm - it carried on | 14:58 |
rlandy|ruck | oh no- it's going again | 14:58 |
rlandy|ruck | chkumar|rover: how's it going with the pike job removals? need any help there? | 15:00 |
chkumar|rover | rlandy|ruck: working on reviews | 15:00 |
rlandy|ruck | k | 15:00 |
chkumar|rover | weshay: rfolco meeting time | 15:01 |
rfolco | yeah I am there | 15:01 |
rfolco | weshay, ping | 15:03 |
*** dsneddon has joined #oooq | 15:08 | |
chkumar|rover | rfolco: ping, we are waiting on bj | 15:11 |
*** sanjayu_ has quit IRC | 15:11 | |
*** bogdando has quit IRC | 15:12 | |
*** ykarel|away has quit IRC | 15:12 | |
*** dsneddon has quit IRC | 15:13 | |
sshnaidm | rlandy|ruck, Name or service not known - need to check dns config | 15:23 |
rlandy|ruck | sshnaidm: got past that error | 15:24 |
rlandy|ruck | sshnaidm: now dealing with 'waiting on logger' as in https://bugs.launchpad.net/tripleo/+bug/183346 | 15:24 |
openstack | Launchpad bug 183346 in ubuntu-dev-tools (Ubuntu) "requestsync files a bug with the wrong version in the title, not in the changelog" [Low,Fix released] - Assigned to Michael Bienia (geser) | 15:24 |
rlandy|ruck | oops | 15:24 |
rlandy|ruck | https://bugs.launchpad.net/tripleo/+bug/1833465 | 15:24 |
openstack | Launchpad bug 1833465 in tripleo "tripleo reproducer fails w/ "waiting on logger"" [Critical,Incomplete] | 15:24 |
rlandy|ruck | ^^ that one | 15:24 |
rlandy|ruck | sshnaidm: possible my tenant has old images? | 15:25 |
rlandy|ruck | openstack-infra-centos-7? | 15:25 |
sshnaidm | rlandy|ruck, wait, it's not from scheduler? | 15:25 |
rlandy|ruck | sshnaidm: the initial error was - fixed that one | 15:25 |
rlandy|ruck | on to another error completely | 15:26 |
sshnaidm | rlandy|ruck, and what is the error now? | 15:26 |
rlandy|ruck | sshnaidm: "waiting on logger" as shown in https://bugs.launchpad.net/tripleo/+bug/1833465 | 15:26 |
openstack | Launchpad bug 1833465 in tripleo "tripleo reproducer fails w/ "waiting on logger"" [Critical,Incomplete] | 15:26 |
rlandy|ruck | there is a review attached | 15:26 |
rlandy|ruck | never merged | 15:26 |
sshnaidm | mm.. haven't seen this a long time | 15:27 |
sshnaidm | rlandy|ruck, on which task did it start? | 15:27 |
rlandy|ruck | sshnaidm: http://10.10.125.142:9000/t/tripleo-ci-reproducer/stream/b29bf1a2b633452f8132e25a2e3c2b39?logfile=console.log | 15:27 |
rlandy|ruck | sshnaidm: 2019-08-26 15:22:43.524755 | TASK [persistent-firewall : List current ipv4 rules] | 15:28 |
rlandy|ruck | ^^ starts on that task | 15:28 |
rlandy|ruck | happens on a few of them | 15:28 |
rlandy|ruck | until the job dies | 15:28 |
sshnaidm | rlandy|ruck, can I log in to this machine? | 15:28 |
rlandy|ruck | sshnaidm: the machine running zuul or the instance? | 15:29 |
sshnaidm | rlandy|ruck, where containers are | 15:29 |
sshnaidm | rlandy|ruck, it was suspected that it's related to bad performance of the rdo cloud host | 15:29 |
sshnaidm | and poor network connection | 15:30 |
rlandy|ruck | sshnaidm: it's running from my minidell - instances on my rdocloud tenant | 15:30 |
rlandy|ruck | I can add your key to my minidell | 15:30 |
*** hamzy_ is now known as hamzy | 15:30 | |
sshnaidm | rlandy|ruck, try to rerun the job, just curious if it's reproducible | 15:32 |
rlandy|ruck | sshnaidm: this is my second run | 15:32 |
rlandy|ruck | same deal | 15:32 |
rlandy|ruck | not sure how weshay got rid of his errors | 15:32 |
rlandy|ruck | also, the images should be upstream-cloudinit-centos-7 | 15:33 |
rlandy|ruck | maintained centrally | 15:33 |
sshnaidm | rlandy|ruck, started job on my tenant, let's see if it has same error | 15:33 |
sshnaidm | rlandy|ruck, yeah | 15:33 |
rlandy|ruck | sshnaidm: thanks | 15:33 |
rlandy|ruck | I'm running multinode | 15:33 |
weshay | rfolco chkumar|rover https://docs.google.com/document/d/1lXxKY794_TCByXovoFwkf6JfKOkOirL6dyAcw1zYjrY/edit# | 15:44 |
*** dsneddon has joined #oooq | 15:45 | |
*** dsneddon has quit IRC | 15:50 | |
rlandy|ruck | sshnaidm: second job - died | 15:53 |
rlandy|ruck | weshay: chat about fs021? | 16:02 |
*** jpena is now known as jpena|off | 16:02 | |
*** jfrancoa has quit IRC | 16:03 | |
rlandy|ruck | why is tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master-vexxhost running in periodic??? | 16:05 |
*** sabedevops has joined #oooq | 16:06 | |
rlandy|ruck | https://review.rdoproject.org/r/#/c/21881/' | 16:06 |
rlandy|ruck | really??? | 16:06 |
chkumar|rover | rlandy|ruck: it was added temporarily with the check pipeline | 16:07 |
rlandy|ruck | not sure I would agree | 16:07 |
rlandy|ruck | more failures | 16:07 |
chkumar|rover | rlandy|ruck: it will help to test the stuff for vexxhost | 16:08 |
chkumar|rover | sshnaidm: ^^ | 16:08 |
*** dsneddon has joined #oooq | 16:08 | |
rlandy|ruck | chkumar|rover: every 4 hours? | 16:09 |
rlandy|ruck | I have no problem with a periodic vexxhost job | 16:09 |
rlandy|ruck | that pipeline is for promotions | 16:09 |
rlandy|ruck | but it | 16:09 |
sshnaidm | rlandy|ruck, why not? | 16:09 |
rlandy|ruck | 's there now | 16:09 |
sshnaidm | rlandy|ruck, the pipeline is just time trigger based | 16:10 |
sshnaidm | rlandy|ruck, it shouldn't affect promotions | 16:10 |
rlandy|ruck | sshnaidm: it is - but I don't get why we need to debug that failing job every four hours | 16:10 |
rlandy|ruck | it wont | 16:10 |
sshnaidm | I need to run it periodically as frequent as possible | 16:10 |
sshnaidm | rlandy|ruck, why do you need to debug it? | 16:10 |
rlandy|ruck | imho there are better places to run that job | 16:11 |
rlandy|ruck | I am not going to remove it or anything | 16:11 |
rlandy|ruck | just my personal opinion | 16:11 |
sshnaidm | rlandy|ruck, where for example? | 16:12 |
rlandy|ruck | sshnaidm; any other temp periodic pipeline | 16:12 |
rlandy|ruck | or at least call the job periodic | 16:12 |
sshnaidm | rlandy|ruck, we run promotions in all periodic pipelines | 16:12 |
sshnaidm | rlandy|ruck, and to define same zuul pipeline for one job is an overhead | 16:13 |
rlandy|ruck | sshnaidm: question is - is your intention to check vexxhost job with the latest hash - or just to run a check job 4 times a day? | 16:14 |
sshnaidm | rlandy|ruck, I didn't call it periodic just for that reason, not to confuse it with periodic promotion jobs | 16:14 |
sshnaidm | rlandy|ruck, just to run a job as much as possible | 16:14 |
rlandy|ruck | honestly not our biggest problem right now | 16:14 |
rlandy|ruck | just surprised to see it show up | 16:14 |
rlandy|ruck | and I assume we agree that the expectation is not on ruck/rover to keep debugging this | 16:15 |
sshnaidm | rlandy|ruck, yeah, no need atm, I'm working on this | 16:15 |
*** matbu has quit IRC | 16:17 | |
*** sabedevops has quit IRC | 16:19 | |
rlandy|ruck | sshnaidm: back to reproducer - do you see the logger issue? | 16:29 |
rlandy|ruck | I might be better off going the libvirt option | 16:29 |
rlandy|ruck | if it's an rdocloud network issue | 16:29 |
weshay | rlandy|ruck I was wrong about 21.. 7.7 rpms are not there yet.. sorry | 16:43 |
weshay | chkumar|rover set me straight | 16:43 |
rlandy|ruck | chkumar|rover++ | 16:45 |
* chkumar|rover is working on bringing a bot soon | 16:45 |
rlandy|ruck | ugh reproducer | 16:46 |
*** tesseract has quit IRC | 16:57 | |
sshnaidm | rlandy|ruck, for me the job worked in the reproducer | 17:01 |
sshnaidm | rlandy|ruck, do you use this image? | 6a6d23d7-65e7-43ea-9307-71b2e17d2ead | upstream-cloudinit-centos-7 | active | | 17:03 |
sshnaidm | rlandy|ruck, maybe try "docker-compose down -v; docker-compose up -d" | 17:04 |
rlandy|ruck | sshnaidm: trying that | 17:18 |
rlandy|ruck | same image | 17:18 |
chkumar|rover | see ya tomorrow, Good night, Bye! | 17:21 |
*** chkumar|rover is now known as raukadah | 17:21 | |
*** dtantsur is now known as dtantsur|afk | 17:25 | |
raukadah | rlandy|ruck: https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/logs_60/652060/6/check/tripleo-ci-centos-7-standalone-os-tempest/97289e7/logs/undercloud/home/zuul/install_packages.sh.log.txt.gz | 17:31 |
raukadah | more breakage is coming | 17:31 |
raukadah | rlandy|ruck: https://review.opendev.org/#/c/678622/ and we need a change in the tripleo-ansible spec file | 17:31 |
rlandy|ruck | raukadah: over and above https://review.opendev.org/#/c/673366/ | 17:33 |
rlandy|ruck | I see discussion on #tripleo | 17:33 |
raukadah | rlandy|ruck: :-) | 17:37 |
rlandy|ruck | 2019-08-26 12:06:13.362741 | primary | TASK [os_tempest : Execute tempest tests] ************************************** | 17:41 |
rlandy|ruck | 2019-08-26 12:06:13.847162 | primary | Monday 26 August 2019 12:06:13 -0400 (0:00:00.750) 2:02:29.520 ********* | 17:41 |
rlandy|ruck | 2019-08-26 16:27:29.380476 | RUN END RESULT_TIMED_OUT: [untrusted : opendev.org/openstack/tripleo-ci/playbooks/tripleo-ci/run-v3.yaml@master] | 17:41 |
rlandy|ruck | 2019-08-26 16:27:29.380779 | POST-RUN START: [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-periodic-base/post.yaml@master] | 17:41 |
rlandy|ruck | 2019-08-26 16:27:31.116683 | | 17:41 |
raukadah | rlandy|ruck: which job? | 17:41 |
rlandy|ruck | raukadah: looking at master promotion ... http://logs.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-standalone-master/f93ae1d/logs/undercloud/home/zuul/ | 17:41 |
rlandy|ruck | 4 hours on tempest | 17:42 |
rlandy|ruck | 4.5 | 17:42 |
rlandy|ruck | excessive, no? | 17:42 |
raukadah | rlandy|ruck: way too excessive and expensive | 17:42 |
rlandy|ruck | raukadah: too many tests running? | 17:43 |
rlandy|ruck | raukadah: not commonly ... https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-rhel-8-standalone-master | 17:43 |
rlandy|ruck | raukadah: ^^ rerunning to check | 17:44 |
rlandy|ruck | first timeout we've seen | 17:44 |
rlandy|ruck | possibly a hitch | 17:44 |
raukadah | rlandy|ruck: ok, I am adding to my morning checklist | 17:44 |
rlandy|ruck | raukadah: let's see if it reproduces first | 17:44 |
rlandy|ruck | oh no | 17:45 |
rlandy|ruck | log failures again | 17:45 |
rlandy|ruck | weshay: when was the last time you ran a libvirt reproducer? | 18:38 |
rlandy|ruck | weshay: did it pass? | 18:38 |
weshay | rlandy|ruck w/ the new repro? | 18:38 |
weshay | it's been quite some time | 18:38 |
rlandy|ruck | weshay: wow - it's not easy getting the reproducer functional | 18:39 |
rlandy|ruck | The error was: libvirtError: Cannot get interface MTU on 'virbr0': No such device | 18:39 |
rlandy|ruck | failed: [localhost] (item={u'flavor': u'control', u'name': u'subnode-1'}) => {"changed": false, "item": {"flavor": "control", "name": "subnode-1"}, "msg": "Cannot get interface MTU on 'virbr0': No such device"} | 18:39 |
rlandy|ruck | ^^ seen that one? | 18:39 |
rlandy|ruck | trying libvirt since using my tenant gave me the logger error | 18:40 |
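The "Cannot get interface MTU on 'virbr0'" failure usually means libvirt's 'default' NAT network is not active, so the virbr0 bridge does not exist. A possible diagnostic sketch (this cause is an assumption, not something confirmed later in the log):

```sh
# virbr0 only exists while libvirt's 'default' network is active
virsh net-list --all          # check whether 'default' shows as inactive
virsh net-start default       # starting the network creates virbr0
virsh net-autostart default   # keep it up across reboots
```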
weshay | rlandy|ruck ur trying on an rdo box? | 18:40 |
rlandy|ruck | ./reproducer-zuul-based-quickstart.sh -l -e libvirt_volume_path=/home/temp/images | 18:40 |
rlandy|ruck | ^^ no, on my minidell | 18:40 |
rlandy|ruck | worked a while back | 18:40 |
weshay | rlandy|ruck I have no insight.. other than the ci jobs running daily | 18:49 |
rlandy|ruck | ok | 18:50 |
*** brault has joined #oooq | 18:51 | |
*** brault has quit IRC | 18:56 | |
rlandy|ruck | weshay: can you vote on https://review.opendev.org/#/c/678290/ pls | 19:50 |
weshay | done | 19:54 |
weshay | rlandy|ruck have you seen: | 20:00 |
weshay | Transaction check error: | 20:00 |
weshay | 2019-08-26 16:05:31 | file /usr/share/ansible/roles/tripleo-hieradata/tasks/hieradata_vars.yaml conflicts between attempted installs of openstack-tripleo-common-11.1.1-0.20190826025903.29b7c8a.el7.noarch and tripleo-ansible-0.2.1-0.20190826144854.bf61a6f.el7.noarch | 20:00 |
weshay | https://openstack.fortnebula.com:13808/v1/AUTH_e8fd161dc34c421a979a9e6421f823e9/logs_28/678028/3/gate/tripleo-ci-centos-7-undercloud-containers/ed39d44/logs/undercloud/home/zuul/install_packages.sh.log.txt.gz | 20:00 |
rlandy|ruck | weshay: ack | 20:00 |
rlandy|ruck | was supposed to be fixed by kevin's patches | 20:00 |
rlandy|ruck | discussion on #tripleo | 20:00 |
weshay | +1 | 20:01 |
rlandy|ruck | weshay: merged about 1 hr ago | 20:01 |
rlandy|ruck | if you are still seeing it | 20:01 |
rlandy|ruck | we have a problem | 20:01 |
* weshay just catching up | 20:02 | |
weshay | impossible | 20:02 |
rlandy|ruck | gate jobs are clearer now | 20:02 |
rlandy|ruck | we had mass failures a while back | 20:02 |
rlandy|ruck | check jobs may need to be rebased | 20:02 |
rlandy|ruck | 2019-08-26 19:37:28,050 29646 INFO promoter Promoting the container images for dlrn hash 2485770cdc7f29d75b79dc9b6c7c1d099e926c86 on master to current-tripleo | 20:04 |
rlandy|ruck | weshay: redhat8 is promoting now | 20:04 |
rlandy|ruck | was stalled for a bit | 20:04 |
weshay | wooot! | 20:04 |
rlandy|ruck | stein should also - if fs020 passes | 20:05 |
rlandy|ruck | queens promoted as well | 20:05 |
rlandy|ruck | log server keeps clogging though | 20:05 |
rlandy|ruck | we are running at 95% | 20:05 |
rlandy|ruck | and each time the prune script runs, we go over the top | 20:06 |
rlandy|ruck | other than getting reproducer to work, other stuff is ok | 20:06 |
rlandy|ruck | reproducer is killing me | 20:28 |
weshay | rlandy|ruck ya.. if I could squeeze out time, I'd be working on that | 20:29 |
weshay | my own pet project... maybe in november | 20:30 |
rlandy|ruck | I am giving up on rdocloud | 20:31 |
rlandy|ruck | need to get libvirt working then | 20:31 |
rlandy|ruck | I am the only one with this logger issue? | 20:31 |
*** aakarsh|2 has quit IRC | 20:40 | |
*** Vorrtex has quit IRC | 20:45 | |
*** Goneri has quit IRC | 20:57 | |
rlandy|ruck | weshay: can you tell me how virbr0 is configured on the box where you run libvirt? | 23:03 |
Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!