hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/560445 | 00:16 |
---|---|---|
*** EmilienM is now known as EmilienM_PTO | 00:18 | |
*** matbu has quit IRC | 00:19 | |
*** jtomasek has quit IRC | 00:25 | |
*** jtomasek has joined #oooq | 00:29 | |
*** jtomasek has quit IRC | 00:53 | |
*** jtomasek has joined #oooq | 01:15 | |
*** tcw has quit IRC | 01:17 | |
*** tcw has joined #oooq | 01:17 | |
*** jtomasek_ has joined #oooq | 01:27 | |
*** jtomasek has quit IRC | 01:28 | |
myoung | rlandy|rvr|bbl: i can add ya... | 02:15 |
myoung | rlandy|rvr|bbl: doing that now | 02:15 |
hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-containerized-undercloud-upgrades @ https://review.openstack.org/560445 | 02:16 |
myoung | rlandy|rvr|bbl: the reason the dashboards at http://rhos-release.virt.bos.redhat.com:3030/rhosp are showing 7d for queens promotions is that they are looking at the date the hash was created. | 02:17 |
myoung | rlandy|rvr|bbl: the dashboards @ http://dashboards.rdoproject.org/queens have both create and promote dates listed for this reason...so for rdo2 (current-tripleo-rdo-internal) https://trunk.rdoproject.org/centos7-queens/61/15/61152f1f452f02d2f0bccc8e3b3b1695103c4114_ba256d89 is the currently promoted hash, created 5/17, promoted 5/22 | 02:18 |
myoung | rlandy|rvr|bbl: next rdo2 jobs will pull the current-tripleo-rdo hash (https://trunk.rdoproject.org/centos7-queens/85/de/85de06e2c40bfdc8dee80506f8d1d809a93b900e_25e5ea4b), created today, promoted today | 02:19 |
*** rlandy|rvr|bbl is now known as rlandy|rover | 02:20 | |
rlandy|rover | myoung: hi - reading back | 02:20 |
rlandy|rover | 2018-05-24 17:57:42,051 4326 INFO promoter Skipping promotion of current-tripleo-rdo to current-tripleo-rdo-internal, missing successful jobs: ['oooq-queens-rdo_trunk-bmu-haa16-lab-float_nic_with_vlans'] | 02:21 |
myoung | rlandy|rover: looking at current job https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rdo-promote-queens-rdo_trunk/88/ | 02:21 |
myoung | rlandy|rover: IMHO this log is the pits and hard to read/parse lol. | 02:21 |
rlandy|rover | I know | 02:21 |
* myoung looks at http://38.145.34.55/queens.log | 02:21 | |
rlandy|rover | queens says it is 7 days out | 02:22 |
rlandy|rover | that job did run | 02:22 |
myoung | so the snippet above was the promoter trying to promote 23e8921b6f52e0361bf5e78123ff3843a4c7328c_1fed7df5 | 02:23 |
myoung | i think...sec...always takes me a few mins. | 02:23 |
* myoung thinks we should spend a sprint on making ruck/rover eyes bleed less | 02:23 | |
myoung | rlandy|rover: i end up looking here btw http://sol.usersys.redhat.com/dlrnapi-reports/queens-combined.txt | 02:24 |
myoung | ^^ all promotion activity for queens | 02:24 |
myoung | and there are filtered ones too by phase e.g. http://sol.usersys.redhat.com/dlrnapi-reports/queens-current-tripleo-rdo-internal.txt | 02:24 |
rlandy|rover | 2018-05-22 08:37:10, https://trunk.rdoproject.org/centos7-queens/61/15/61152f1f452f02d2f0bccc8e3b3b1695103c4114_ba256d89, current-tripleo-rdo-internal | 02:24 |
rlandy|rover | not 7 days old | 02:25 |
myoung | http://dashboards.rdoproject.org/queens | 02:25 |
myoung | so the internal dashboard is saying 7 days because it's checking the creation date of the repo file, which is 5/17 | 02:26 |
rlandy|rover | why doesn't it promote a more current hash? | 02:26 |
rlandy|rover | jobs passed today | 02:26 |
rlandy|rover | ugh ... promotion failure | 02:27 |
myoung | 2018-05-17 01:27 is the creation date | 02:27 |
myoung | looking at log now to see why | 02:27 |
myoung | jobs passed (fs20 and oooq-bmu) | 02:27 |
rlandy|rover | "overcloud_deploy_result": "failed" | 02:27 |
rlandy|rover | sorry - looking at that failure | 02:27 |
myoung | no maybe because this | 02:27 |
myoung | 2018-05-25 00:40:25,799 17311 ERROR promoter Unable to acquire lock. Another promoter process is running. Aborting. | 02:27 |
myoung | this is a wedged promoter process, or another one | 02:28 |
myoung | have a quick sec? getting into promoter server and adding your key | 02:28 |
rlandy|rover | ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. , Code: 500" | 02:28 |
myoung | ^^ which is that? | 02:28 |
rlandy|rover | myoung: looking at the error in the current promotion job ... Went to status ERROR due to "Message: No valid host was found. There are not enough hosts available., Code: 500" | 02:34 |
rlandy|rover | fs001 failing both queens and master | 02:34 |
myoung | rlandy|rover: ack..re promoter server it's in the middle of promoting master containers | 02:34 |
myoung | 2018-05-25 00:54:07,222 17991 INFO promoter Promoting the container images for dlrn hash 98e667518f7aaa0aa9f31e2d41bbd6d3124cc7e3 on master to current-tripleo-rdo-internal | 02:34 |
rlandy|rover | ok - but queens should promote a newer hash | 02:34 |
myoung | yes...after master | 02:35 |
myoung | promoter.sh does master --> pike --> ocata --> queens serially | 02:35 |
rlandy|rover | o gee | 02:35 |
myoung | last cycle wes and I spent some friday night time plowing thru this, dns at that time was messed so stuff was just straight up failing...but a full container push was taking around 88m | 02:36 |
myoung | one of the pain points around our promoter is the output is all hidden until it's done | 02:38 |
myoung | so can't tell if there's a problem or not...other than sleuthing as root on the promoter vm | 02:38 |
rlandy|rover | ok - thanks for your help | 02:40 |
rlandy|rover | it's so confusing | 02:40 |
rlandy|rover | and I'm kind of on my own here | 02:40 |
rlandy|rover | myoung: sorry - just logging a bug | 02:41 |
rlandy|rover | I'll need you to check me | 02:41 |
rlandy|rover | it's been a while since I logged a promotion blocker | 02:41 |
rlandy|rover | myoung: pls check this ... https://bugs.launchpad.net/tripleo/+bug/1773289 | 02:46 |
openstack | Launchpad bug 1773289 in tripleo "[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'" [Critical,Triaged] | 02:46 |
rlandy|rover | did I get the tags/status etc. correct? | 02:46 |
myoung | rlandy|rover: yup | 02:47 |
myoung | alert will add an alert in the #tripleo channel | 02:47 |
myoung | promotion-blocker will autocreate a CIX card | 02:47 |
*** myoung is now known as myoung|off | 02:57 | |
*** myoung|off is now known as myoung|zzz | 03:12 | |
rlandy|rover | arxcruz|ruck: when you get in ... https://bugs.launchpad.net/tripleo/+bug/1773289 | 03:19 |
openstack | Launchpad bug 1773289 in tripleo "[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'" [Critical,Triaged] | 03:19 |
rlandy|rover | I started investigating but it's getting late here | 03:20 |
*** skramaja has joined #oooq | 04:09 | |
*** skramaja has quit IRC | 04:12 | |
*** skramaja has joined #oooq | 04:12 | |
*** skramaja has quit IRC | 04:13 | |
*** skramaja has joined #oooq | 04:13 | |
hubbot | FAILING CHECK JOBS on stable/ocata: gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-ocata, tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, tripleo-ci-centos-7-containerized-undercloud-upgrades, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ (1 more message) | 04:16 |
*** ykarel|away has joined #oooq | 04:17 | |
*** links has joined #oooq | 04:19 | |
*** jaganathan has joined #oooq | 04:23 | |
*** jtomasek_ has quit IRC | 04:30 | |
*** pgadiya has joined #oooq | 04:43 | |
*** pgadiya has quit IRC | 04:43 | |
*** ratailor has joined #oooq | 04:44 | |
*** ykarel|away is now known as ykarel | 04:55 | |
*** links has quit IRC | 04:58 | |
*** saneax has joined #oooq | 05:06 | |
*** links has joined #oooq | 05:08 | |
*** links has quit IRC | 05:12 | |
*** links has joined #oooq | 05:14 | |
*** udesale has joined #oooq | 05:21 | |
*** marios has joined #oooq | 05:36 | |
*** quiquell|off is now known as quiquell | 05:37 | |
quiquell | arxcruz|ruck: Do you know why this is failing ? http://logs.openstack.org/67/570167/1/gate/tripleo-ci-centos-7-scenario002-multinode-oooq-container/18e0db5/job-output.txt.gz | 05:38 |
quiquell | It's around tempest | 05:38 |
quiquell | It's a reproducer change | 05:38 |
quiquell | ft1.1: setUpClass (tempest.api.object_storage.test_object_services.ObjectTest)_StringException: Traceback (most recent call last): | 05:39 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/test.py", line 172, in setUpClass | 05:39 |
quiquell | six.reraise(etype, value, trace) | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/test.py", line 165, in setUpClass | 05:40 |
quiquell | cls.resource_setup() | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/api/object_storage/test_object_services.py", line 36, in resource_setup | 05:40 |
quiquell | cls.container_name = cls.create_container() | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/api/object_storage/base.py", line 113, in create_container | 05:40 |
quiquell | cls.container_client.update_container(container_name) | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/lib/services/object_storage/container_client.py", line 37, in update_container | 05:40 |
quiquell | resp, body = self.put(url, body=None, headers=headers) | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 343, in put | 05:40 |
quiquell | return self.request('PUT', url, extra_headers, headers, body, chunked) | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 668, in request | 05:40 |
quiquell | self._error_checker(resp, resp_body) | 05:40 |
quiquell | File "/usr/lib/python2.7/site-packages/tempest/lib/common/rest_client.py", line 794, in _error_checker | 05:40 |
quiquell | raise exceptions.PreconditionFailed(resp_body, resp=resp) | 05:40 |
quiquell | tempest.lib.exceptions.PreconditionFailed: Precondition Failed | 05:40 |
quiquell | Details: Bad URL | 05:40 |
quiquell | 05:40 | |
*** jfrancoa has joined #oooq | 05:40 | |
hubbot | FAILING CHECK JOBS on stable/ocata: gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-ocata, tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, tripleo-ci-centos-7-containerized-undercloud-upgrades, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ (1 more message) | 06:16 |
*** ratailor has quit IRC | 06:40 | |
*** ratailor has joined #oooq | 06:42 | |
*** ccamacho has quit IRC | 06:53 | |
*** ccamacho has joined #oooq | 06:53 | |
chandankumar | Hey guys, fs001 is failing on pike and queens while deploying overcloud | 07:13 |
chandankumar | https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-pike-branch/4837/ | 07:13 |
chandankumar | https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch/1795/ | 07:13 |
*** tesseract has joined #oooq | 07:17 | |
*** rlandy|rover has quit IRC | 07:22 | |
*** matbu has joined #oooq | 07:23 | |
*** ykarel is now known as ykarel|lunch | 07:33 | |
*** tesseract-RH has joined #oooq | 07:38 | |
*** bogdando has joined #oooq | 07:48 | |
arxcruz|ruck | quiquell: investigating | 07:53 |
*** saneax has quit IRC | 07:58 | |
*** holser__ has joined #oooq | 08:02 | |
*** amoralej|off is now known as amoralej | 08:04 | |
arxcruz|ruck | quiquell: can you please send me again that grafana link? | 08:12 |
arxcruz|ruck | I forgot to add it to my bookmarks and yesterday I restarted my laptop :( | 08:12 |
hubbot | FAILING CHECK JOBS on stable/ocata: gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-ocata, tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, tripleo-ci-centos-7-containerized-undercloud-upgrades, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ (1 more message) | 08:16 |
*** dtantsur|afk is now known as dtantsur | 08:33 | |
*** kopecmartin has joined #oooq | 08:34 | |
*** panda|off is now known as panda | 08:51 | |
*** panda is now known as Guest45658 | 08:52 | |
*** Guest45658 is now known as panda | 08:53 | |
*** ykarel|lunch is now known as ykarel | 08:54 | |
*** tesseract-RH has quit IRC | 08:56 | |
quiquell | arxcruz|ruck: http://38.145.34.131:3000/d/pgdr_WVmk/ruck-rover?orgId=1 | 09:03 |
quiquell | Changing it now | 09:03 |
arxcruz|ruck | changing what ? | 09:03 |
quiquell | the dashboard | 09:04 |
quiquell | Have you skipped the test ? | 09:04 |
arxcruz|ruck | quiquell: yes, I'm finishing deploying a scenario002 and I'll test | 09:10 |
arxcruz|ruck | but it's continuously failing, so in order to not block anything it's better to skip it for now | 09:11 |
arxcruz|ruck | I also opened the lp | 09:11 |
*** jbadiapa has quit IRC | 09:20 | |
*** jbadiapa has joined #oooq | 09:21 | |
*** udesale_ has joined #oooq | 09:30 | |
*** udesale__ has joined #oooq | 09:32 | |
*** udesale has quit IRC | 09:33 | |
*** udesale_ has quit IRC | 09:34 | |
*** marios has quit IRC | 09:39 | |
*** marios has joined #oooq | 09:40 | |
*** holser__ has quit IRC | 09:45 | |
*** zoli is now known as zoli|lunch | 09:52 | |
quiquell | panda: Good morning | 09:56 |
quiquell | Have an idea for the upgrades delta type | 09:56 |
quiquell | https://trello.com/c/pfQ867XP/779-differentiate-the-same-featureset-to-do-the-two-begin-release-end-release-combinationsç+ | 09:56 |
quiquell | https://review.openstack.org/#/c/570551 | 09:56 |
quiquell | https://review.openstack.org/#/c/57055 | 09:56 |
quiquell | Shit | 09:56 |
quiquell | https://review.openstack.org/#/c/570551 | 09:56 |
panda | quiquell: so we need to duplicate job definitions for the two types of jobs | 10:00 |
quiquell | This was going to happen | 10:01 |
quiquell | We want to run two types of upgrades | 10:01 |
panda | quiquell: that is adding another job cloned from this but with upgrade_delta_type: decrement | 10:01 |
quiquell | We can have a base job | 10:01 |
quiquell | For upgrades | 10:01 |
quiquell | Or the base is the default | 10:01 |
quiquell | idea is not to add jobs with n -> n + 1 ? | 10:02 |
quiquell | We were going to have to do new jobs anyways | 10:02 |
panda | quiquell: don't take anything I say as a no, I'm just evaluating the consequences | 10:03 |
quiquell | panda: Ok ok, feels always like a negative | 10:03 |
quiquell | panda: Other option would be like a job builder with all the upgrade possibilities | 10:04 |
quiquell | we can have a base job with | 10:05 |
quiquell | job: | 10:05 |
quiquell | name: tripleo-ci-centos-7-containerized-undercloud-upgrades | 10:05 |
quiquell | parent: tripleo-ci-dsvm | 10:05 |
quiquell | run: playbooks/tripleo-ci/run.yaml | 10:05 |
quiquell | post-run: playbooks/tripleo-ci/post.yaml | 10:05 |
quiquell | timeout: 10800 | 10:05 |
quiquell | nodeset: legacy-centos-7 | 10:05 |
quiquell | voting: false | 10:05 |
quiquell | branches: ^(?!stable/(newton|ocata|pike|queens)).*$ | 10:06 |
quiquell | vars: | 10:06 |
quiquell | toci_jobtype: singlenode-featureset050 | 10:06 |
quiquell | and then the children | 10:06 |
quiquell | just one with upgrade_delta_type: increment and other with upgrade_delta_type: decrement | 10:06 |
quiquell | panda: Can we override jobs vars at project-template ? | 10:06 |
panda | quiquell: the only concern I have is that we may be forced to call two jobs in a different way if they do something different | 10:07 |
*** hamzy has quit IRC | 10:07 | |
quiquell | panda: What do you mean ? | 10:08 |
panda | with this solution we are calling two jobs that do two different things with the same name. I'm just not sure if they are or not different enough that we need to call them in a different way | 10:08 |
panda | we're really playing with releases here | 10:08 |
quiquell | panda: They will have different names | 10:08 |
quiquell | Changing job names is a problem I think | 10:08 |
quiquell | We can just add the new ones with a suffix | 10:09 |
panda | quiquell: and if we add a suffix we don't need the variable | 10:09 |
quiquell | is normal to parse the job name ? | 10:09 |
panda | quiquell: because we will handle the suffix in the TOCI_JOBTYPE handling loop | 10:09 |
panda | quiquell: not the name, but the type | 10:10 |
quiquell | You mean have a new TOCI_JOBTYPE | 10:10 |
quiquell | but we still need new jobs | 10:10 |
panda | quiquell: yes | 10:10 |
quiquell | pointing to the TOCI_JOBTYPE | 10:10 |
quiquell | TOCI_JOBTYPE is more consistent with what we have | 10:11 |
quiquell | But kind of cryptic if we add more stuff | 10:11 |
panda | quiquell: yes | 10:12 |
quiquell | But the variable is not good for non upgrade jobs | 10:12 |
quiquell | It will add a variable that doesn't make sense | 10:12 |
panda | quiquell: first thing to understand really is if we really need to have two different job types for the two sides of the job | 10:12 |
quiquell | panda: We have to do two different runs | 10:13 |
quiquell | panda: At different zuul executions | 10:14 |
quiquell | Don't know if we can enqueue two builds from the same job | 10:14 |
quiquell | Everytime a job runs | 10:14 |
panda | food for questions :) | 10:15 |
arxcruz|ruck | quiquell: so, the test object storage, i wasn't able to reproduce, on rdocloud is working :( | 10:15 |
panda | quiquell: probably openstack-infra is the best place to dump our concerns on the name | 10:16 |
quiquell | or #zuul | 10:16 |
hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, tripleo-ci-centos-7-containerized-undercloud-upgrades, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445 | 10:16 |
panda | quiquell: #zuul is probably more focused on zuul itself, not on the policy for naming in infra | 10:17 |
quiquell | panda: The main question if we need additional jobs | 10:18 |
quiquell | panda: in that case, we have multiple solutions to pass the upgrade delta type | 10:18 |
panda | I think it's absolutely inevitable to add additional jobs | 10:21 |
quiquell | panda: Then we have two options, new variable or TOCI_JOBTYPE | 10:21 |
panda | quiquell: yes, and that depends at least partially on if we need a different job name or not | 10:22 |
panda | job name and type | 10:22 |
quiquell | panda: New name is good for debugging | 10:22 |
panda | we need to find a good name to address this | 10:22 |
panda | a big historic name | 10:23 |
panda | like little endian and big endian | 10:23 |
quiquell | My 2 cents is about deltas | 10:23 |
quiquell | upgrade delta type | 10:23 |
quiquell | increment/decrement is about deltas | 10:24 |
panda | delta to me is a difference between two values ... too much physics and math in my curriculum | 10:25 |
quiquell | lt or gt is a difference | 10:26 |
quiquell | both of them | 10:26 |
quiquell | increment decrement | 10:26 |
panda | downstep and upstep ? we need brands, like slogans, mottos | 10:27 |
panda | this is advertisement, marketing, we need a single word to encapsulate an entire concept | 10:27 |
quiquell | ffu is more than one step | 10:27 |
panda | like featureset | 10:28 |
quiquell | ffu is delta=3 | 10:28 |
panda | which nobody likes apparently, but is the solution that sucks less | 10:28 |
quiquell | Yep, let's not spend too much time on naming | 10:28 |
panda | yes, but with ffu it is easy, you always have downsteps | 10:29 |
quiquell | nop | 10:29 |
panda | but with n-1 -> n or n -> n+ 1 the delta is always 1 | 10:29 |
quiquell | we don't have to support n -> n + 3 ? | 10:29 |
quiquell | A change in newton and need to test newton -> queens | 10:29 |
quiquell | Or a change in ocata and need to test ocata -> rocky | 10:30 |
panda | quiquell: no | 10:30 |
panda | quiquell: n -3 is usually so OEL that most of the times is not even possible to put a change in n-3 | 10:31 |
panda | EOL* | 10:31 |
quiquell | I see | 10:31 |
quiquell | ok | 10:31 |
panda | quiquell: and not all the releases will have ffu | 10:31 |
quiquell | It has to be hell to support it btw | 10:31 |
panda | newton -> queens, then queens -> T release | 10:32 |
quiquell | ok ok | 10:32 |
panda | quiquell: yep | 10:32 |
quiquell | we have to focus on n -> n + 1 | 10:32 |
panda | quiquell: how do I access the wiki page with the table with upgrade map ? | 10:32 |
panda | I want to add the other map | 10:32 |
quiquell | Going to link it in the trello card | 10:33 |
panda | quiquell: do I need credentials ? | 10:33 |
quiquell | I was thinking about scripting it out | 10:33 |
quiquell | To generate the table from git repos | 10:33 |
quiquell | https://wiki.openstack.org/wiki/Tripleo-upgrades-fs-variables | 10:33 |
quiquell | Don't think so | 10:33 |
*** udesale__ has quit IRC | 10:36 | |
*** udesale__ has joined #oooq | 10:36 | |
chandankumar | myoung|zzz: arxcruz|ruck I and kopecmartin have populated the backlog for this sprint with all description feel free to take a look https://trello.com/c/dksT94bI/768-sprint-14-release-python-tempestconf-200 checklist | 10:39 |
panda | quiquell: I was thinking about something like this | 10:40 |
panda | quiquell: updated the page with a new table | 10:40 |
*** holser__ has joined #oooq | 10:41 | |
panda | quiquell: updated again | 10:44 |
panda | quiquell: and regarding your idea of making a script to update the table: https://xkcd.com/1319/ :) | 10:46 |
quiquell | panda: ^ Programmers are lazy people :-) | 10:51 |
panda | quiquell: going to discuss this with jistr | 10:51 |
quiquell | panda: Ask him why we need an undercloud + overcloud upgrade job | 10:52 |
quiquell | Doesn't make any sense to me | 10:52 |
quiquell | Maybe to discover problems from an upgraded undercloud when you upgrade de overcloud ? | 10:52 |
quiquell | s/de/the/ | 10:53 |
panda | I asked him yesterday. This is the complete workflow, this is one of the things that customers do, and it has value to check all the steps in an upgrade; he mentioned ssl as a pain point | 10:53 |
panda | and one of the reasons it's important to test it | 10:53 |
quiquell | panda: Maybe we can do the undercloud upgrade, shelve it and next job do a overcloud upgrade | 10:54 |
quiquell | To be able to chop it | 10:54 |
panda | quiquell: I doubt it :( | 10:54 |
quiquell | panda: Crazy idea | 10:54 |
*** marios has quit IRC | 11:00 | |
*** marios has joined #oooq | 11:00 | |
*** jbadiapa has quit IRC | 11:01 | |
*** zoli|lunch is now known as zoli | 11:04 | |
*** hubbot has quit IRC | 11:04 | |
*** hubbot has joined #oooq | 11:05 | |
*** jbadiapa has joined #oooq | 11:16 | |
*** udesale__ has quit IRC | 11:31 | |
*** jaosorior has quit IRC | 11:47 | |
*** jaosorior has joined #oooq | 11:48 | |
*** Guest60997 is now known as honza | 11:49 | |
*** ratailor has quit IRC | 11:49 | |
hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-scenario002-multinode-oooq-container, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445 | 12:17 |
*** rlandy has joined #oooq | 12:39 | |
*** rlandy is now known as rlandy|rover | 12:39 | |
rlandy|rover | arxcruz|ruck: hello | 12:42 |
arxcruz|ruck | rlandy|rover: hey, not so good today | 12:42 |
*** myoung|zzz is now known as myoung | 12:42 | |
rlandy|rover | arxcruz|ruck: we need PTL permission for https://review.openstack.org/#/c/570533/ | 12:42 |
arxcruz|ruck | rlandy|rover: I opened a bug for the object store test that is failing in scenario002 | 12:42 |
arxcruz|ruck | rlandy|rover: scenario002 doesn't use featureset019 | 12:43 |
rlandy|rover | yes - I saw the commit | 12:43 |
arxcruz|ruck | but yeah, forgot to check if 019 runs this test | 12:43 |
myoung | rlandy|rover, arxcruz|ruck, do you guys have time to sync in a bit ~2 hrs ? | 12:43 |
rlandy|rover | featureset016 - featureset019 | 12:43 |
rlandy|rover | yeah | 12:43 |
myoung | or sooner if tempest squad sprint 14 planning goes on the faster side | 12:44 |
arxcruz|ruck | myoung: i might not, but i'll let you know | 12:44 |
rlandy|rover | arxcruz|ruck: opened https://bugs.launchpad.net/tripleo/+bug/1773289 yesterday | 12:44 |
openstack | Launchpad bug 1773289 in tripleo "[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'" [Critical,Triaged] | 12:44 |
chandankumar | myoung: it will go faster, we are almost done with everything | 12:44 |
rlandy|rover | arxcruz|ruck: that did not get to the escalation board | 12:45 |
rlandy|rover | not sure what I missed there | 12:45 |
rlandy|rover | arxcruz|ruck: the promotion server needs work | 12:45 |
rlandy|rover | http://38.145.34.55/queens.log | 12:45 |
arxcruz|ruck | rlandy|rover: this test is not in fs016 or 019 | 12:46 |
rlandy|rover | that's fine - ignore my comment | 12:46 |
rlandy|rover | arxcruz|ruck: are you looking at the promotion logs? | 12:46 |
arxcruz|ruck | rlandy|rover: might need the milestone? | 12:46 |
rlandy|rover | arxcruz|ruck: milestone? | 12:47 |
arxcruz|ruck | rlandy|rover: no, i wasn't | 12:47 |
arxcruz|ruck | target milestone | 12:47 |
arxcruz|ruck | I added rocky-3 in the bug | 12:47 |
arxcruz|ruck | not sure | 12:47 |
rlandy|rover | ah | 12:47 |
rlandy|rover | arxcruz|ruck: bugger question | 12:47 |
rlandy|rover | bigger | 12:47 |
arxcruz|ruck | 42 | 12:47 |
rlandy|rover | will check if it's still failing | 12:48 |
rlandy|rover | rokcy-2 | 12:48 |
rlandy|rover | rocky-2 | 12:48 |
rlandy|rover | | 2018-05-25 07:26 || 118.0 min || Overcloud stack: FAILED. /home/jenkins/overcloud-deploy.sh fail. || Logs || openstack-periodic | | 12:49 |
rlandy|rover | | 2018-05-25 01:30 || 118.0 min || Overcloud stack: FAILED. /home/jenkins/overcloud-deploy.sh fail. || Logs || openstack-periodic | | 12:49 |
rlandy|rover | still failing | 12:49 |
rlandy|rover | ugh | 12:49 |
rlandy|rover | arxcruz|ruck: I am concerned about why queens phase 2 has not promoted | 12:51 |
rlandy|rover | it should have | 12:51 |
*** amoralej is now known as amoralej|lunch | 12:52 | |
rlandy|rover | ykarel: hello - you ping'ed about https://bugs.launchpad.net/tripleo/+bug/1773289 late last night? | 12:52 |
openstack | Launchpad bug 1773289 in tripleo "[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'" [Critical,Triaged] - Assigned to Ronelle Landy (rlandy) | 12:52 |
rlandy|rover | arxcruz|ruck: ocata is also a pain - I can't get that to reproduce | 12:53 |
rlandy|rover | it fails on a diff error each time | 12:53 |
*** jaosorior has quit IRC | 12:55 | |
rlandy|rover | ykarel: I can reproduce the error on my own tenant for fs001 - Went to status ERROR due to "Message: No valid host was found. , Code: 500" | 12:55 |
rlandy|rover | provisioning state - clean failed | 12:56 |
ykarel | rlandy|rover, yes i pinged, and seeing the bug created yesterday made me think that can it be because of rdo cloud minor update yesterday | 12:57 |
rlandy|rover | ykarel: it could be - but here is what I see in my reproducer | 12:58 |
panda | quiquell: I have updated the table after an exhausting meeting with jistr | 12:58 |
rlandy|rover | ykarel: introspection passes | 12:58 |
rlandy|rover | 4 node(s) successfully moved to the "available" state. | 12:58 |
quiquell | panda: Can you give me some summary ? | 12:58 |
quiquell | Or the table is enough | 12:58 |
panda | quiquell: lol | 12:58 |
rlandy|rover | but when I look at the state of the nodes after deploy failed | 12:58 |
rlandy|rover | they are in provisioning state - clean failed | 12:59 |
panda | quiquell: the table is not enough, but you have to read it first | 12:59 |
panda | quiquell: it's actually very complex | 12:59 |
rlandy|rover | ykarel: that is the first time I have seen such a provisioning state | 12:59 |
rlandy|rover | panda: EmilienM_PTO asked if we could land https://review.openstack.org/#/c/568946/ last night | 13:00 |
rlandy|rover | trown|outtypewww: ^^ | 13:00 |
rlandy|rover | any objections? | 13:00 |
panda | it's too late to land it last night | 13:00 |
myoung | chandankumar: kopecmartin arxcruz|ruck i'll be a few mins late to planning | 13:00 |
rlandy|rover | https://review.openstack.org/#/c/567060/ merged | 13:00 |
rlandy|rover | panda: ^^ | 13:00 |
chandankumar | myoung: ack | 13:00 |
panda | \o/ | 13:01 |
rlandy|rover | panda: will need to recheck https://review.openstack.org/#/c/568946/ | 13:01 |
rlandy|rover | but any objection to merging it? | 13:01 |
panda | rlandy|rover: it's ok to land it, we may need to modify this implementation in this sprint | 13:01 |
rlandy|rover | panda: ok - thanks | 13:01 |
rlandy|rover | will recheck it | 13:02 |
rlandy|rover | once we figure out fs001 | 13:02 |
rlandy|rover | ykarel: any thoughts? | 13:02 |
rlandy|rover | 2018-05-25 05:22:44 | 2018-05-25 05:21:37Z [1.Controller]: CREATE_FAILED ResourceInError: resources.Controller: Went to status ERROR due to "Message: Exceeded maximum number of retries. Exhausted all hosts available for retrying build failures for instance 43016df9-31a6-42d9-a391-05d27df25a56., Code: 500" | 13:03 |
rlandy|rover | that is usually a state of the overcloud nodes not being sufficient for the deployment | 13:04 |
ykarel | rlandy|rover, I have seen this error earlier also, ironic guys would have more insight on it. rlandy|rover do you see any errors in nova? | 13:05 |
rlandy|rover | dtantsur: hi - what would put nodes in a 'clean failed ' state if introspection passed | 13:05 |
*** skramaja has quit IRC | 13:05 | |
rlandy|rover | ykarel: yep - asking ironic gurus | 13:06 |
ykarel | rlandy|rover, on one of the logs i can see: Insufficient compute resources: Free disk 39.00 GB < requested 40 GB, do you see similar in your local reproducer | 13:06 |
rlandy|rover | I am looking there now | 13:07 |
rlandy|rover | we nay need bigger nodes | 13:07 |
rlandy|rover | may | 13:07 |
rlandy|rover | why only fs001, though? | 13:08 |
rlandy|rover | fs035 passes | 13:08 |
ykarel | good point, let's check wh | 13:10 |
ykarel | containerized undercloud? | 13:10 |
rlandy|rover | ykarel: that error makes sense though | 13:10 |
rlandy|rover | queens and master failing | 13:10 |
rlandy|rover | for what we see happen to the nodes | 13:11 |
rlandy|rover | containerized_overcloud: >- | 13:11 |
rlandy|rover | {% if release in ['newton', 'ocata', 'pike'] -%} | 13:11 |
rlandy|rover | false | 13:11 |
rlandy|rover | {%- else -%} | 13:11 |
rlandy|rover | true | 13:11 |
rlandy|rover | {%- endif -%} | 13:11 |
rlandy|rover | ykarel: yes ^^ | 13:12 |
dtantsur | rlandy|rover: cleaning process failed | 13:12 |
dtantsur | if you have it enabled, it will run when the node goes to "available" | 13:12 |
rlandy|rover | dtantsur: is it a problem if cleaning process fails? | 13:12 |
dtantsur | rlandy|rover: yes, it should not | 13:12 |
rlandy|rover | undercloud_clean_nodes: >- | 13:13 |
rlandy|rover | {% if release not in ['newton','ocata','pike'] -%} | 13:13 |
rlandy|rover | true | 13:13 |
rlandy|rover | {%- else -%} | 13:13 |
rlandy|rover | false | 13:13 |
rlandy|rover | {%- endif -%} | 13:13 |
rlandy|rover | dtantsur: ^^ that setting | 13:13 |
arxcruz|ruck | rlandy|rover: how to log into promotion? what's the username? centos? | 13:13 |
rlandy|rover | arxcruz|ruck: ask myoung to add your key | 13:14 |
dtantsur | rlandy|rover: right, this should pass on queens and master | 13:14 |
rlandy|rover | I can do it in a but | 13:14 |
rlandy|rover | bit | 13:14 |
rlandy|rover | ykarel: dtantsur: ok - that setting is in fs001 and not in fs035 | 13:14 |
panda | quiquell: everything clear, right ? :) | 13:14 |
rlandy|rover | and it's failing | 13:14 |
rlandy|rover | I have a reproducer | 13:15 |
quiquell | panda: I am digesting | 13:15 |
dtantsur | well, then we need to fix it :) it's not enabled in fs035 indeed | 13:15 |
rlandy|rover | dtantsur: ok - pls advise as to how - is it an infra issue | 13:16 |
rlandy|rover | I can give you access to my reproducer env | 13:16 |
rlandy|rover | https://bugs.launchpad.net/tripleo/+bug/1773289 | 13:16 |
openstack | Launchpad bug 1773289 in tripleo "[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'" [Critical,Triaged] - Assigned to Ronelle Landy (rlandy) | 13:16 |
rlandy|rover | ^^ related bug | 13:16 |
panda | quiquell: bon appetit | 13:17 |
rlandy|rover | it's blocking gates/promotion since yesterday | 13:17 |
rlandy|rover | we are getting no available host - if that makes sense from nodes in clean failed state | 13:18 |
dtantsur | rlandy|rover: "Timeout reached while waiting for callback for node" it may be an infra issue, hard to tell | 13:18 |
rlandy|rover | dtantsur: we have a third possibility .... | 13:19 |
rlandy|rover | <ykarel> rlandy|rover, on one of the logs i can see: Insufficient compute resources: Free disk 39.00 GB < requested 40 GB, do you see similar in your local reproducer | 13:19 |
dtantsur | rlandy|rover: I wonder if we can be hitting https://storyboard.openstack.org/#!/story/2002079 | 13:19 |
rlandy|rover | but that does not kill fs035 | 13:19 |
dtantsur | tbh in your logs I don't see cleaning failures, but rather: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/512d712/undercloud/var/log/containers/ironic/ironic-conductor.log.txt.gz?#_2018-05-25_01_13_16_594 | 13:20 |
dtantsur | which is a deploy failure | 13:20 |
dtantsur | what makes you think that cleaning is related? | 13:20 |
rlandy|rover | ha | 13:20 |
rlandy|rover | just that I saw clean failed state on my reproducer | 13:21 |
rlandy|rover | may be a side problem | 13:21 |
rlandy|rover | dtantsur: we can increase the nodes size | 13:21 |
dtantsur | btw this log is stripped - it does not have the beginning | 13:21 |
rlandy|rover | for the overcloud | 13:21 |
dtantsur | is it okay? | 13:21 |
dtantsur | rlandy|rover: can I get a link to the "Insufficient resources" in the logs please? | 13:22 |
rlandy|rover | ykarel saw that | 13:22 |
rlandy|rover | that error would make the most sense but I can't tell why then fs035 would pass | 13:22 |
* rlandy|rover gets full logs | 13:23 | |
ykarel | dtantsur, https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/512d712/undercloud/var/log/extra/errors.txt.gz#_2018-05-25_01_14_27_346 | 13:23 |
ykarel | rlandy|rover, ^^ | 13:23 |
dtantsur | hmm | 13:24 |
*** tcw has quit IRC | 13:25 | |
dtantsur | on master we should not use disk resources at all, I wonder why it's popping up | 13:25 |
rlandy|rover | on master and queens | 13:25 |
*** tcw has joined #oooq | 13:25 | |
rlandy|rover | since yesterday | 13:25 |
rlandy|rover | we had an rdocloud update | 13:25 |
rlandy|rover | perhaps something changed there?? | 13:26 |
*** tcw has quit IRC | 13:26 | |
dtantsur | rlandy|rover: well, I suspect the immediate problem is that (contrary to common sense) you cannot put an instance requiring 40 GiB to a 40 GiB node | 13:26 |
dtantsur | s/node/machine/ | 13:26 |
dtantsur | Ironic needs some space for partition table, configdrive, etc | 13:27 |
dtantsur | so when introspecting we report <real disk size> - 1 | 13:27 |
dtantsur | which should no longer matter since queens, but for some reason it does | 13:27 |
* dtantsur goes hunting for nova folks | 13:27 | |
rlandy|rover | dtantsur: I'd agree :) - but it seems strange that it only hots fs001 | 13:27 |
rlandy|rover | hits | 13:28 |
dtantsur | agreed | 13:28 |
dtantsur | there is one more difference between these jobs | 13:28 |
dtantsur | fs001 runs a simpler version of introspection. it should not affect local disk discovery, but... | 13:29 |
dtantsur | rlandy|rover: anyway, asking owalsh on #tripleo | 13:29 |
rlandy|rover | dtantsur: thanks - following | 13:29 |
rlandy|rover | I can change the node size but I would prefer not to | 13:30 |
dtantsur | well, I think this problem can be ignored, this is why: | 13:31 |
dtantsur | the failures begin with 4 nodes getting timeouts on callback (from the ramdisk) | 13:31 |
dtantsur | then nova tries to reschedule the nodes, BUT we have cleaning enabled, so the nodes may not be immediately available | 13:32 |
dtantsur | during this time we see all kinds of weird messages from nova | 13:32 |
dtantsur | but the key problem seems to be the callback timeout | 13:32 |
dtantsur | I wonder if we actually have similar timeouts on fs035 but they're masked by retries? | 13:32 |
dtantsur | rlandy|rover: thought dump ^^ | 13:33 |
rlandy|rover | dtantsur: fs035 does not have cleaning enabled so we would not have that extra step | 13:33 |
dtantsur | rlandy|rover: well, it's fine, but the cleaning itself may not be an issue | 13:33 |
dtantsur | it may simply uncover something requiring rescheduling | 13:34 |
dtantsur | but this is just a wild guess | 13:34 |
dtantsur | can I get similar logs from fs035 please? | 13:34 |
rlandy|rover | yep - posting | 13:34 |
rlandy|rover | dtantsur: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/ebcb753/undercloud/ | 13:35 |
rlandy|rover | logs from the same promotion run | 13:35 |
dtantsur | no, no timeouts there | 13:36 |
dtantsur | so, it's time to start my favourite rant: we don't publish console logs from the "baremetals", do we? | 13:36 |
dtantsur | aka https://bugs.launchpad.net/tripleo/+bug/1771082 | 13:38 |
openstack | Launchpad bug 1771082 in tripleo "[rfe] TripleO CI must collect and publish console logs from fake overcloud nodes during introspection and deployment" [High,Triaged] | 13:38 |
rlandy|rover | no :( - but I'll raise that priority | 13:38 |
quiquell | panda: this table mixes updates + upgrades | 13:40 |
quiquell | Let's separate them | 13:40 |
quiquell | I don't understand the package type | 13:40 |
rlandy|rover | hmmm ... we should deploy large nodes | 13:40 |
rlandy|rover | https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/te-broker/create-env#L51 | 13:41 |
rlandy|rover | ci.m1.large | 8192 | 80 | 0 | 4 | False | 13:41 |
rlandy|rover | why is it complaining about 40? | 13:41 |
*** jfrancoa has quit IRC | 13:42 | |
rlandy|rover | comparing: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035-master/ebcb753/undercloud/home/jenkins/instackenv.json.txt.gz | 13:43 |
rlandy|rover | and | 13:43 |
rlandy|rover | https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/64c9587/undercloud/home/jenkins/instackenv.json.txt.gz | 13:44 |
rlandy|rover | does say 80 | 13:44 |
*** jfrancoa has joined #oooq | 13:46 | |
dtantsur | rlandy|rover: they have only one disk, right? | 13:46 |
panda | quiquell: I have a personal issue to solve, then we can chat. | 13:47 |
quiquell | panda: Let's talk next week, I will disconnect soon | 13:47 |
rlandy|rover | I think so | 13:47 |
ykarel | rlandy|rover, have you seen any failures recently? looks like we are not seeing it now. in the morning i asked kforde to look into the issue, then he applied some patches and said the patch should fix the issue | 13:47 |
dtantsur | rlandy|rover: because apparently introspection detects 40.. | 13:48 |
panda | quiquell: regarding package type, we have repos that are never installed in overcloud. THT for example is only installed in the undercloud, so the injection on the overcloud is never needed | 13:48 |
ykarel | i can see some passes in fs001 for queens/master gate jobs | 13:48 |
* dtantsur MAAAAAGIIIC | 13:48 | |
panda | quiquell: other repos like nova, for example, will eventually need to be injected in both undercloud and overcloud | 13:48 |
rlandy|rover | ykarel: that would be nice | 13:48 |
quiquell | panda: So you mean excluded overcloud packages ? | 13:48 |
rlandy|rover | I'll ask kforde what he did | 13:48 |
rlandy|rover | dtantsur: thanks for your time | 13:49 |
ykarel | rlandy|rover, yes we should know that :) | 13:49 |
rlandy|rover | sorry for the runaround | 13:49 |
dtantsur | rlandy|rover: you're always welcome :) | 13:49 |
panda | quiquell: I mean that for some changes, we'll need to ignore overcloud injection, because it's not needed | 13:49 |
*** d0ugal_ has joined #oooq | 13:50 | |
rlandy|rover | dtantsur: I am going to find out what happened on the infra side so we don't have this fun again :) | 13:50 |
quiquell | panda: Ok, but the package type, are the packages excluded in the overcloud, isn't it ? | 13:50 |
quiquell | panda: But let's talk next week | 13:50 |
dtantsur | :) | 13:50 |
*** d0ugal has quit IRC | 13:51 | |
panda | quiquell: if the package type is undercloud only, yes | 13:53 |
quiquell | panda: A lot of stuff is mixed in the table, we have to simplify man | 13:53 |
panda | quiquell: upgrades are complicated. Let's sync next week. This is what we came up with jiri | 13:54 |
quiquell | panda: Let's sync, have a good weekend | 13:54 |
*** tesseract-RH has joined #oooq | 13:57 | |
rlandy|rover | arxcruz|ruck: hi - if you send me your keys, I can get you on the promotion server | 13:59 |
arxcruz|ruck | rlandy|rover: https://github.com/arxcruz.keys | 14:00 |
*** tesseract has quit IRC | 14:00 | |
*** ykarel is now known as ykarel|away | 14:00 | |
*** amoralej|lunch is now known as amoralej | 14:02 | |
quiquell | rlandy|rover: About promoter, life I check the promoter.sh script was still running in the tmux | 14:02 |
quiquell | s/life/last/ | 14:02 |
rlandy|rover | quiquell: master takes forever and I want queens to promote :) | 14:02 |
rlandy|rover | queens phase 2 | 14:02 |
quiquell | ok | 14:02 |
rlandy|rover | those jobs are running and the dashboard still says 7d old | 14:02 |
quiquell | There is now a promoter.sh script instead of crontab | 14:02 |
rlandy|rover | which it is not | 14:03 |
quiquell | To run them sequentially | 14:03 |
quiquell | If you do a tmux a | 14:03 |
quiquell | You go to the execution of the script | 14:03 |
quiquell | No one has ever had time to productify that | 14:03 |
quiquell | rlandy|rover, arxcruz|ruck: https://review.rdoproject.org/r/#/c/13622/ | 14:04 |
rlandy|rover | looking | 14:04 |
rlandy|rover | arxcruz|ruck: promotion just failed | 14:05 |
quiquell | It's blocked at master | 14:05 |
rlandy|rover | Install the undercloud | 14:05 |
arxcruz|ruck | quiquell: you have my -1 | 14:06 |
rlandy|rover | oh dear | 14:06 |
rlandy|rover | promotion failing all over the place | 14:06 |
arxcruz|ruck | rlandy|rover: checking | 14:06 |
quiquell | arxcruz|ruck: This is just a WIP | 14:07 |
rlandy|rover | quiquell: will have to review this when the world is not on fire | 14:07 |
quiquell | rlandy|rover: Sure thing, we didn't manage to productify the state of promoter | 14:08 |
*** moguimar has joined #oooq | 14:08 | |
arxcruz|ruck | rlandy|rover: which job fails ? | 14:09 |
arxcruz|ruck | i'm not seeing | 14:09 |
arxcruz|ruck | which phase | 14:09 |
ykarel|away | arxcruz|ruck, queens/master | 14:10 |
ykarel|away | https://review.rdoproject.org/zuul/ | 14:10 |
rlandy|rover | arxcruz|ruck: the promotion is blood red | 14:10 |
rlandy|rover | master and queens are failing for diff reasons | 14:11 |
*** d0ugal_ has quit IRC | 14:11 | |
*** d0ugal has joined #oooq | 14:11 | |
arxcruz|ruck | checking | 14:12 |
rlandy|rover | Prepare for the containerized deployment on master | 14:12 |
arxcruz|ruck | ImageUploaderException: Could not pull image docker.io/tripleomaster/centos-binary-rsyslog-base | 14:12 |
rlandy|rover | yep | 14:12 |
rlandy|rover | we got a promoter issue | 14:13 |
rlandy|rover | arxcruz|ruck: I'll check the queens problem | 14:13 |
rlandy|rover | if you can investigate this one | 14:13 |
arxcruz|ruck | rlandy|rover: opening the bug | 14:13 |
arxcruz|ruck | rlandy|rover: I can for like one hour, i'm in the office today, so i need to leave soon, it's 4pm here | 14:14 |
ykarel|away | why docker.io is used ^^, here rdo registry should have been used | 14:14 |
*** quiquell is now known as quiquell|off | 14:14 | |
arxcruz|ruck | ykarel|away: in this phase ? | 14:14 |
arxcruz|ruck | i think only on phase 2 rdo registry is being used no ? | 14:15 |
rlandy|rover | queens also has containerized_deployment issues | 14:15 |
ykarel|away | arxcruz|ruck, in promotion jobs rdo registry should be used as container-build job pushes to rdo registry, no? | 14:15 |
rlandy|rover | ImageUploaderException: Could not pull image docker.io/tripleoqueens/centos-binary-cron | 14:15 |
rlandy|rover | imagename: docker.io/tripleoqueens/centos-binary-manila-scheduler:1f0eaa23ba556a9af5abd7f394374dd81d657d3d_5a496b9e | 14:16 |
hubbot | FAILING CHECK JOBS on stable/queens: gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens @ https://review.openstack.org/567224, stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-scenario002-multinode-oooq-container, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, gate- (1 more message) | 14:17 |
rlandy|rover | ok - well we got one major issue going on here | 14:17 |
rlandy|rover | ykarel|away: arxcruz|ruck; ^^ so at least it's not two as I though | 14:17 |
rlandy|rover | thought | 14:17 |
arxcruz|ruck | rlandy|rover: https://bugs.launchpad.net/tripleo/+bug/1773381 opened for the docker issue | 14:18 |
openstack | Launchpad bug 1773381 in tripleo "Promotion failing to pull image from docker" [Critical,Triaged] - Assigned to Arx Cruz (arxcruz) | 14:18 |
arxcruz|ruck | ykarel|away: so i'll move to rdo | 14:18 |
rlandy|rover | ok | 14:18 |
rlandy|rover | not so fast with the moving - let's compare | 14:18 |
rlandy|rover | imagename: trunk.registry.rdoproject.org/tripleoqueens/centos-binary-aodh-api:d29f833a5977b89a110d19c02fa6bf860709af34_f0f5884b | 14:19 |
rlandy|rover | yeah - that looks diff | 14:19 |
rlandy|rover | what happened to change that? | 14:19 |
arxcruz|ruck | rlandy|rover: ykarel|away hmmm https://github.com/openstack-infra/tripleo-ci/blob/master/toci-quickstart/config/testenv/multinode-rdocloud.yml#L40 | 14:21 |
arxcruz|ruck | should be using registry instead of docker, perhaps the PERIODIC var isn't set anymore ? | 14:21 |
rlandy|rover | looking at what might have changed | 14:22 |
rlandy|rover | https://github.com/openstack-infra/tripleo-ci/blob/master/toci-quickstart/config/testenv/ovb-rdocloud.yml#L26 | 14:23 |
rlandy|rover | same there | 14:23 |
myoung | chandankumar, arxcruz|ruck, kopecmartin: thanks for your time and a great sprint planning! I think we're in great shape for sprint 14. The pre-planning and time put into cards for this sprint ahead of planning meeting worked very well. | 14:26 |
myoung | chandankumar++ | 14:26 |
hubbot | myoung: chandankumar's karma is now 4 | 14:26 |
myoung | kopecmartin++ | 14:26 |
hubbot | myoung: kopecmartin's karma is now 1 | 14:26 |
rlandy|rover | [undercloud-deploy : Install the undercloud] issue - but maybe that's a side problem | 14:26 |
myoung | arxcruz|ruck++ | 14:26 |
hubbot | myoung: arxcruz|ruck's karma is now 1 | 14:26 |
arxcruz|ruck | rlandy|rover: periodic is set to 1 | 14:26 |
arxcruz|ruck | https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-master/e77b063/console.txt.gz#_2018-05-25_13_35_33_592 | 14:26 |
myoung | chandankumar: do you have a hot sec? forgot to ask you something | 14:26 |
myoung | chandankumar: bluejeans.com/matyoung, only need 45sec | 14:27 |
rlandy|rover | hmmm ... maybe it's hardcoded now | 14:30 |
arxcruz|ruck | rlandy|rover: what is hardcoded ? | 14:30 |
arxcruz|ruck | okay, latest success had pull from trunk.registry.rdoproject.org | 14:34 |
rlandy|rover | 13:35:33 +(/opt/stack/new/tripleo-ci/toci_gate_test.sh:209): PERIODIC=1 | 14:35 |
rlandy|rover | arxcruz|ruck: latest success> | 14:36 |
rlandy|rover | ? | 14:36 |
arxcruz|ruck | rlandy|rover: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-master/ece9569/console.txt.gz | 14:37 |
rlandy|rover | arxcruz|ruck: oh you mean before this failure? | 14:37 |
arxcruz|ruck | rlandy|rover: https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset016-master/ece9569/undercloud/home/jenkins/overcloud_prep_containers.log.txt.gz#_2018-05-25_06_00_04 | 14:37 |
arxcruz|ruck | rlandy|rover: yes | 14:37 |
rlandy|rover | question is - what changed?? | 14:37 |
rlandy|rover | arxcruz|ruck: maybe this review messed things up? https://review.openstack.org/#/c/567060/34/toci_quickstart.sh | 14:39 |
rlandy|rover | I don't see how though | 14:39 |
rlandy|rover | 13:38:28 TASK [Always build images in the periodic jobs] ******************************** | 14:41 |
rlandy|rover | 13:38:28 Friday 25 May 2018 13:38:28 +0000 (0:00:00.058) 0:04:59.672 ************ | 14:41 |
rlandy|rover | 13:38:28 [WARNING]: when statements should not include jinja2 templating delimiters | 14:41 |
rlandy|rover | 13:38:28 such as {{ }} or {% %}. Found: {{ lookup('env', 'PERIODIC')|default('0')|int == | 14:41 |
rlandy|rover | 13:38:28 1 }} | 14:41 |
rlandy|rover | 13:38:28 ok: [undercloud] | 14:41 |
rlandy|rover | ^^ knows it's set | 14:41 |
arxcruz|ruck | rlandy|rover: where are you seeing this task ? | 14:45 |
*** links has quit IRC | 14:45 | |
rlandy|rover | in fs035 console log | 14:45 |
rlandy|rover | seems fine though | 14:45 |
arxcruz|ruck | rlandy|rover: thing is, periodic is being set to 1 | 14:46 |
rlandy|rover | arxcruz|ruck: merged this morning Zuul | 14:48 |
rlandy|rover | Change has been successfully merged by Zuul | 14:48 |
rlandy|rover | 6:59 AM | 14:48 |
rlandy|rover | https://review.openstack.org/#/c/567060/ | 14:49 |
rlandy|rover | idk what else changed | 14:49 |
ykarel|away | rlandy|rover, i think you are correct ^^ patch is causing the issue | 14:49 |
ykarel|away | this messed with the install command | 14:49 |
rlandy|rover | ykarel|away: arxcruz|ruck: I am going to revert it | 14:50 |
ykarel|away | and changed the order of --extra-args passed, multinode-rdocloud should override the release file params | 14:50 |
*** jtomasek has joined #oooq | 14:50 | |
ykarel|away | rlandy|rover, yup revert should fix that, | 14:50 |
rlandy|rover | ykarel|away: arxcruz|ruck: https://review.openstack.org/#/c/570585/ | 14:50 |
ykarel|away | ${RELEASE_ARGS[$playbook]} $PLAYBOOK_REPEATED_ARGS should correct, but good to check complete order | 14:51 |
rlandy|rover | ykarel|away: the tests on the patch never failed | 14:51 |
ykarel|away | rlandy|rover, because we use docker.io and promoted container images on gate jobs | 14:51 |
rasca | myoung, rlandy|rover hey folks how's going? After an entire day of debugging I should be able to fix the master Rdophase2 job... testing it right now, hopefully soon I'll ask you for a review... | 14:51 |
rlandy|rover | $PLAYBOOK_REPEATED_ARGS should correct? | 14:52 |
rlandy|rover | should be correct? | 14:52 |
rlandy|rover | ykarel|away: ^^ not sure what you mean | 14:52 |
ykarel|away | rlandy|rover, so multinode-rdocloud.yml should be in the end of the command(after promoting-testing-hash-master.yml) | 14:52 |
ykarel|away | but the patch reversed the order | 14:53 |
*** tesseract has joined #oooq | 14:53 | |
arxcruz|ruck | rlandy|rover: maybe add a related-bug or closes bug https://bugs.launchpad.net/tripleo/+bug/1773381 ? | 14:53 |
openstack | Launchpad bug 1773381 in tripleo "Promotion failing to pull image from docker" [Critical,Triaged] - Assigned to Arx Cruz (arxcruz) | 14:53 |
rlandy|rover | yep - will add that | 14:54 |
rlandy|rover | I'll fix that patch later | 14:54 |
ykarel|away | rlandy|rover, you got what i meant? | 14:55 |
*** tesseract has quit IRC | 14:56 | |
*** tesseract-RH has quit IRC | 14:56 | |
rlandy|rover | ykarel|away: yep - will rework order | 14:56 |
*** tesseract has joined #oooq | 14:56 | |
rlandy|rover | ykarel|away: arxcruz|ruck: https://review.openstack.org/#/c/570585/ | 14:56 |
rlandy|rover | pls vote | 14:57 |
arxcruz|ruck | rlandy|rover: i'm just a +1 guy :/ | 14:57 |
rlandy|rover | panda: ^^ pls vote | 14:58 |
rlandy|rover | ykarel|away: do you have +2? | 14:59 |
arxcruz|ruck | we should not merge stuff on friday lol | 14:59 |
arxcruz|ruck | rlandy|rover: already talked with alex | 14:59 |
ykarel|away | rlandy|rover, voted +1 :) | 14:59 |
*** moguimar has quit IRC | 14:59 | |
rlandy|rover | thanks | 14:59 |
rlandy|rover | I'll rework the playbooks order there | 15:00 |
* ykarel|away leaving | 15:00 | |
rlandy|rover | we should get ykarel|away +2 rights | 15:00 |
rlandy|rover | more than earned it | 15:01 |
arxcruz|ruck | we should steal ykarel|away for our team | 15:01 |
ykarel|away | there is still lot more to learn | 15:01 |
rlandy|rover | panda: ping ping there | 15:02 |
myoung | panda: ping, still want to take a whack at backlog grooming? | 15:02 |
arxcruz|ruck | myoung: rlandy|rover want to talk, otherwise i'm heading to home | 15:03 |
rlandy|rover | arxcruz|ruck: ant news on stuck promotion script> | 15:03 |
rlandy|rover | any | 15:03 |
rlandy|rover | arxcruz|ruck: also I am out on monday | 15:04 |
myoung | arxcruz|ruck: rlandy|rover: I can chat if you like or are blocked, or need help with the promoter / promotion script, or ______. | 15:04 |
arxcruz|ruck | rlandy|rover: not yet, but i can take a look | 15:04 |
arxcruz|ruck | i didn't because myoung said he was looking into it after the meeting | 15:04 |
myoung | arxcruz|ruck: rlandy|rover: I wanted to sync prior to holiday just to sync on where we are. | 15:04 |
rlandy|rover | myoung: arxcruz|ruck: ready to sync when you are | 15:05 |
myoung | arxcruz|ruck, rlandy|rover, I'm in my room (bj/matyoung). panda hasn't entered yet so guessing I have this hour free :) | 15:05 |
* myoung logs into promoter to poke at it | 15:05 | |
rlandy|rover | I need panda to +2 my patch first | 15:05 |
panda | ouch | 15:06 |
arxcruz|ruck | panda: bad panda | 15:06 |
myoung | panda: lol...i'm trying on sarcasm :) | 15:06 |
panda | sorry | 15:06 |
rlandy|rover | panda: before you go anywhere https://review.openstack.org/#/c/570585/ | 15:06 |
panda | I'm trying to solve too many problems at once | 15:06 |
rlandy|rover | need to rework the order there | 15:06 |
rlandy|rover | but blocking promotion atm | 15:07 |
panda | myoung: coming | 15:07 |
rlandy|rover | I need another +2 | 15:07 |
panda | rlandy|rover: approved, what did it break ? | 15:07 |
rlandy|rover | panda: the order of arguments | 15:07 |
rlandy|rover | easy enough fix | 15:08 |
rlandy|rover | I'll do it later | 15:08 |
panda | oh | 15:08 |
panda | ok | 15:08 |
rlandy|rover | just want to revert to get the promotions moving | 15:08 |
panda | sure | 15:08 |
rlandy|rover | edge case | 15:08 |
*** holser__ has quit IRC | 15:09 | |
*** holser__ has joined #oooq | 15:12 | |
*** jfrancoa has quit IRC | 15:12 | |
*** bogdando has quit IRC | 15:14 | |
*** holser__ has quit IRC | 15:17 | |
*** dtrainor has joined #oooq | 15:18 | |
*** matbu has quit IRC | 15:19 | |
*** ykarel|away has quit IRC | 15:19 | |
*** tcw has joined #oooq | 15:23 | |
*** marios has quit IRC | 15:25 | |
*** jtomasek has quit IRC | 15:27 | |
*** dtrainor has quit IRC | 15:36 | |
rasca | rlandy|rover, myoung, https://code.engineering.redhat.com/gerrit/#/c/139887/ if you can please have a look by the end of today and maybe merge it, this should fix BM Rdophase2 deployments | 15:50 |
*** zoli is now known as zoli|gone | 16:02 | |
*** ccamacho has quit IRC | 16:06 | |
rlandy|rover | sorry - busy with upstream fires atm | 16:13 |
*** ccamacho has joined #oooq | 16:15 | |
hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-scenario002-multinode-oooq-container, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445, stable/queens: gate- (1 more message) | 16:17 |
*** jtomasek has joined #oooq | 16:28 | |
rlandy|rover | panda: https://review.openstack.org/#/c/570593/ - your vote pls | 16:33 |
rlandy|rover | alex ok'ed it | 16:33 |
panda | rlandy|rover: quotes are wrong | 16:34 |
rlandy|rover | panda: overcloud_templates_path is a var | 16:35 |
rlandy|rover | the rest is an actual string | 16:35 |
panda | does this work ? | 16:35 |
rlandy|rover | we will find out | 16:35 |
rlandy|rover | panda: so ... | 16:36 |
rlandy|rover | https://github.com/openstack/tripleo-quickstart/blob/master/config/release/tripleo-ci/master.yml#L21 | 16:36 |
rlandy|rover | we default to a var w/o brackets | 16:36 |
rlandy|rover | idk how else to denote a string and a var in default | 16:37 |
*** jtomasek has quit IRC | 16:37 | |
panda | rlandy|rover: yeah it may be the only way, I've just never done it before. I'll +2, you can +1W when checks pass | 16:38 |
rlandy|rover | panda:ack | 16:38 |
rlandy|rover | sorry for the rush - a lot of breaking changes :( | 16:39 |
panda | yeah, let's try not to add some more :P | 16:40 |
rlandy|rover | myoung: seen this error before: https://logs.rdoproject.org/85/570585/2/openstack-check/gate-tripleo-ci-centos-7-container-to-container-upgrades-master-nv/Zb3586619a40848269762ae63b5fe5c5f/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2018-05-25_16_20_55? | 16:46 |
rlandy|rover | you mentioned something about hostname but maybe that was unrelated | 16:46 |
rlandy|rover | https://bugs.launchpad.net/tripleo/+bug/1771997 | 16:47 |
openstack | Launchpad bug 1771997 in tripleo "Mixed version (R/Q) deploy with config-download cannot find role_name variable" [High,Fix released] - Assigned to Jiří Stránský (jistr) | 16:47 |
rlandy|rover | nvm - been there for a wheil | 16:50 |
rlandy|rover | while | 16:50 |
rlandy|rover | panda: hmmm - string didn't quite work | 16:55 |
* myoung finds lunch and will biab | 16:56 | |
panda | rlandy|rover: :( | 16:56 |
rlandy|rover | thinking about how to do this | 16:56 |
rlandy|rover | -r {{undercloud_roles_data|default("{{ overcloud_templates_path/roles_data_undercloud.yaml}}")}} | 16:57 |
rlandy|rover | panda: ^^ how do we feel about that one | 16:57 |
myoung | panda: thanks for spending an hour with me backlog grooming. | 16:58 |
myoung | panda++ | 16:58 |
hubbot | myoung: panda's karma is now 1 | 16:58 |
rlandy|rover | "{{ overcloud_templates_path }} /roles_data_undercloud.yaml" | 16:58 |
rlandy|rover | better yet | 16:58 |
myoung | all: if curious where the "new" and "technical debt" cols on our trello board went, we've been plowing thru, grooming, sorting, cleaning, etc. | 16:59 |
myoung | https://trello.com/b/N9gHLMyP/tripleo-ci-backlog-grooming is the sandbox. Nothing is being arbitrarily nuked, but moved to columns that can be reviewed by whomever is interested. | 16:59 |
panda | rlandy|rover: yeah, let's try the second one | 16:59 |
rlandy|rover | okie dokie | 16:59 |
rlandy|rover | brb | 17:02 |
*** rlandy|rover is now known as rlandy|rover|brb | 17:02 | |
rlandy|rover|brb | let's hope the world does not catch yet another fire in the next few minutes | 17:02 |
*** tesseract has quit IRC | 17:15 | |
*** zoli|gone is now known as zoli | 17:23 | |
*** rlandy|rover|brb is now known as rlandy|rover | 17:34 | |
*** amoralej is now known as amoralej|off | 17:43 | |
*** myoung is now known as myoung||bbl | 17:45 | |
*** kopecmartin has quit IRC | 17:56 | |
*** dtantsur is now known as dtantsur|afk | 18:06 | |
hubbot | FAILING CHECK JOBS on stable/ocata: tripleo-ci-centos-7-undercloud-upgrades @ https://review.openstack.org/564291, master: tripleo-ci-centos-7-3nodes-multinode, tripleo-ci-centos-7-scenario002-multinode-oooq-container, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens-branch, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445 | 18:17 |
*** ccamacho has quit IRC | 18:43 | |
arxcruz|ruck | rlandy|rover: just arrived home, if you need my help with something | 18:55 |
rlandy|rover | arxcruz|ruck: hi | 18:59 |
rlandy|rover | we have patches in flight | 18:59 |
rlandy|rover | https://review.openstack.org/#/c/570593/ | 19:00 |
rlandy|rover | to fic fs001 | 19:00 |
rlandy|rover | fix | 19:00 |
rlandy|rover | and https://review.openstack.org/#/c/570585 waiting for gates | 19:00 |
rlandy|rover | arxcruz|ruck: queens still says 7d | 19:01 |
rlandy|rover | myoung||bbl: ^^ did we get anywhere with that? | 19:01 |
rlandy|rover | 2018-05-25 15:33:40,714 6680 INFO promoter Promoting the container images for dlrn hash 85de06e2c40bfdc8dee80506f8d1d809a93b900e on queens to current-tripleo-rdo-internal | 19:02 |
rlandy|rover | arxcruz|ruck: ocata needs work | 19:03 |
rlandy|rover | will get back to that after resubmitting reverted patch | 19:03 |
*** holser__ has joined #oooq | 19:04 | |
*** myoung||bbl is now known as myoung | 19:09 | |
myoung | rlandy|rover: looking at it now | 19:09 |
myoung | rlandy|rover: it pulled all 98 images from rdo for the promoted queens hash. the queens promo script is still running | 19:10 |
myoung | rlandy|rover: looking at it now to see where it's hung up | 19:10 |
myoung | rlandy|rover: seems to be working, just slowly. very slowly | 19:11 |
myoung | root 30635 0.0 0.2 138292 8964 pts/2 Sl+ 19:01 0:00 /usr/bin/docker-current push trunk.registry.rdoproject.org/tripleoqueens/centos-binary-cinder-api:current-tripleo-rdo-internal | 19:11 |
myoung | rlandy|rover: cinder-api is the 12th image...so the prev 11 should be tagged with current-tripleo-rdo-internal in rdoregistry... | 19:12 |
* myoung checks https://console.registry.rdoproject.org/registry#/images/tripleomaster | 19:12 | |
*** holser__ has quit IRC | 19:12 | |
*** holser__ has joined #oooq | 19:12 | |
* myoung meant https://console.registry.rdoproject.org/registry#/images/tripleoqueens | 19:13 | |
rlandy|rover | myoung: thanks | 19:20 |
rlandy|rover | yay - passing fs001 https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/14852/console | 19:21 |
rlandy|rover | waiting to check logs there | 19:21 |
panda | rlandy|rover: I'll let you approve | 19:33 |
*** panda is now known as panda|off | 19:33 | |
rlandy|rover | panda|off: thanks | 19:34 |
rlandy|rover | waiting on gates | 19:34 |
rlandy|rover | tripleo-quickstart-extras-gate-newton-delorean-full-minimal FAILURE | 19:34 |
rlandy|rover | myoung: hi - https://ci.centos.org/view/rdo/view/tripleo-gate/job/tripleo-quickstart-extras-gate-newton-delorean-full-minimal/ looks like this has been failing for a while | 19:37 |
rlandy|rover | do we still care? | 19:37 |
rlandy|rover | https://ci.centos.org/artifacts/rdo/jenkins-tripleo-quickstart-extras-gate-newton-delorean-full-minimal-5960/undercloud/home/stack/undercloud_install.log.gz | 19:39 |
rlandy|rover | failing since 05/-9 | 19:40 |
rlandy|rover | 09 | 19:40 |
rlandy|rover | myoung: ^^ | 19:41 |
rlandy|rover | 2018-05-09 20:25:19 - was the last pass | 19:41 |
myoung | rlandy|rover: ok so the promoter for queens is running, but DREADFULLY slow | 19:51 |
myoung | you can see the current status here: | 19:51 |
myoung | https://console.registry.rdoproject.org/registry#/?namespace=tripleoqueens | 19:51 |
myoung | it's pushing cinder-scheduler now | 19:51 |
myoung | but at this rate it's going to take a couple hours. something's not right | 19:51 |
myoung | we can run it from another machine (we did this last sprint too) - it's not too hard and I have things set up to do it already | 19:52 |
rlandy|rover | myoung: thanks for watching | 19:52 |
myoung | i can do it with you, but I need to go pick up the wee one (ok she's 13) | 19:52 |
myoung | something is not right with rdocloud networking (again) or with our promoter vm | 19:52 |
myoung | things are taking 4x or more longer than they should | 19:52 |
myoung | rlandy|rover: on the promoter server, /home/centos/ci-config/ci-scripts/container-push/parsed_containers-queens.txt is the list (in order) of containers that the (running) script is using | 19:55 |
myoung | [centos@promoter-server container-push]$ ps aux | grep docker | grep push | 19:57 |
myoung | root 978 0.0 0.2 138292 8836 pts/2 Sl+ 19:46 0:00 /usr/bin/docker-current push trunk.registry.rdoproject.org/tripleoqueens/centos-binary-cinder-volume:current-tripleo-rdo-internal | 19:57 |
myoung | cinder-volume is 14/98 images, and it's been running for hours | 19:57 |
myoung | will check again when I'm back at keyboard, will plan to run this promotion from a box outside rdocloud if it's still poking along later | 19:57 |
*** myoung is now known as myoung|bbl | 19:57 | |
myoung|bbl | arxcruz|ruck: ^^ | 20:01 |
arxcruz|ruck | myoung|bbl: dns issues maybe? | 20:06 |
arxcruz|ruck | i would vote to reboot the vm just in case | 20:06 |
*** holser__ has quit IRC | 20:08 | |
rlandy|rover | arxcruz|ruck: fyi - if you have time on monday ... https://bugs.launchpad.net/tripleo/+bug/1773445 | 20:11 |
openstack | Launchpad bug 1773445 in tripleo "tripleo-quickstart-extras-gate-newton-delorean-full-minimal fails to install undercloud - Access denied for user 'heat'@'192.168.24.1" [High,Triaged] | 20:11 |
rlandy|rover | it's been broken since 05/09 | 20:11 |
rlandy|rover | so it was not us | 20:12 |
arxcruz|ruck | ok | 20:15 |
hubbot | FAILING CHECK JOBS on master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445 | 20:17 |
arxcruz|ruck | :GG: | 20:21 |
*** yolanda has quit IRC | 20:34 | |
rlandy|rover | myoung|bbl: ping when you are back | 21:18 |
hubbot | FAILING CHECK JOBS on master: tripleo-ci-centos-7-3nodes-multinode, gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master @ https://review.openstack.org/560445 | 22:17 |
*** myoung|bbl is now known as myoung | 22:56 | |
myoung | rlandy|rover: this one now... root 13355 0.0 0.2 138292 8748 pts/2 Sl+ 22:46 0:00 /usr/bin/docker-current push trunk.registry.rdoproject.org/tripleoqueens/centos-binary-gnocchi-metricd:current-tripleo-rdo-internal | 22:57 |
myoung | ^^ 24/98 | 22:57 |
myoung | want me to run this locally? | 22:57 |
*** hamzy has joined #oooq | 23:18 | |
rlandy|rover | it's ok | 23:57 |
*** rlandy|rover has quit IRC | 23:58 |