hubbot | FAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 00:52 |
---|---|---|
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224, master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/560445 | 02:52 |
*** skramaja_ has quit IRC | 03:06 | |
*** skramaja has joined #oooq | 03:08 | |
*** skramaja has quit IRC | 03:34 | |
*** ykarel has joined #oooq | 03:35 | |
*** udesale has joined #oooq | 03:46 | |
*** skramaja has joined #oooq | 04:18 | |
*** ratailor has joined #oooq | 04:33 | |
*** ykarel has quit IRC | 04:35 | |
hubbot | FAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario003-multinode-oooq-container @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 04:52 |
*** ykarel has joined #oooq | 05:02 | |
*** quiquell|off is now known as quiquell|rover | 05:31 | |
*** skramaja_ has joined #oooq | 05:44 | |
*** skramaja has quit IRC | 05:47 | |
*** ccamacho has joined #oooq | 05:59 | |
*** jbadiapa has joined #oooq | 06:04 | |
*** holser_ has joined #oooq | 06:07 | |
*** holser_ has quit IRC | 06:08 | |
*** holser_ has joined #oooq | 06:08 | |
*** ccamacho has quit IRC | 06:20 | |
*** ccamacho has joined #oooq | 06:20 | |
*** quiquell|rover is now known as quique|rover|bbl | 06:31 | |
*** yolanda__ has joined #oooq | 06:42 | |
*** yolanda__ is now known as yolanda | 06:43 | |
hubbot | FAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario003-multinode-oooq-container @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 06:52 |
*** bogdando has joined #oooq | 06:53 | |
*** quique|rover|bbl is now known as quiquell|rover | 07:02 | |
*** amoralej|off is now known as amoralej | 07:07 | |
*** tesseract has joined #oooq | 07:10 | |
*** ratailor_ has joined #oooq | 07:29 | |
*** tosky has joined #oooq | 07:32 | |
*** ratailor has quit IRC | 07:32 | |
*** ratailor_ has quit IRC | 07:53 | |
*** ratailor has joined #oooq | 07:55 | |
quiquell|rover | arxcruz, chandankumar: Do you know anything about this ? https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-queens/3d190d9/undercloud/home/jenkins/tempest/tempest.html.gz | 07:57 |
*** ratailor_ has joined #oooq | 07:58 | |
*** dtrainor has quit IRC | 07:58 | |
*** dtrainor has joined #oooq | 07:58 | |
*** ratailor_ has quit IRC | 07:59 | |
*** ratailor has quit IRC | 07:59 | |
*** gkadam has joined #oooq | 08:02 | |
*** d0ugal has joined #oooq | 08:03 | |
ykarel | quiquell|rover, so only fs020 failed yesterday | 08:05 |
ykarel | https://trunk-primary.rdoproject.org/api-centos-queens/api/civotes_detail.html?commit_hash=927e8115f6605ea18c883251b05e76b2f448eeba&distro_hash=43b025ca6c0610b81fefa32f5c826ba5385c10fd | 08:05 |
ykarel | do you know why we have not considered promotion by skipping fs020 | 08:05 |
ykarel | queens is 8 days behind | 08:06 |
*** ratailor has joined #oooq | 08:06 | |
quiquell|rover | ykarel: Don't know, I don't know if I can make those decisions | 08:06 |
ykarel | hmm but you can propose :) | 08:07 |
quiquell|rover | ykarel: I will, I will... | 08:07 |
ykarel | yup as fixing tempest tests can go in parallel | 08:08 |
bogdando | o/ PTAL https://review.openstack.org/#/c/577391/ | 08:11 |
*** skramaja_ has quit IRC | 08:13 | |
*** skramaja_ has joined #oooq | 08:15 | |
*** ykarel is now known as ykarel|lunch | 08:16 | |
quiquell|rover | sshnaidm: https://review.rdoproject.org/r/#/c/14507/, https://review.rdoproject.org/r/#/c/14473/ to merge latest from RR dashboard | 08:27 |
ykarel|lunch | quiquell|rover, isn't the tempest issue fixed in master | 08:30 |
ykarel|lunch | https://logs.rdoproject.org/51/579051/1/openstack-experimental/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/Z5c0b8cc927614b77a6537a4dced72a0e/undercloud/home/jenkins/tempest/tempest.html.gz | 08:31 |
ykarel|lunch | hmm tempestconf change is not promoted, local tempest run timed out | 08:32 |
ykarel|lunch | running again tempest locally | 08:34 |
quiquell|rover | Don't see this thing passing https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset020-master/ | 08:35 |
quiquell|rover | what do you mean ? | 08:35 |
ykarel|lunch | quiquell|rover, ignore for now | 08:37 |
ykarel|lunch | checking again | 08:37 |
*** gkadam_ has joined #oooq | 08:43 | |
*** gkadam_ has quit IRC | 08:43 | |
*** gkadam has quit IRC | 08:46 | |
hubbot | FAILING CHECK JOBS on master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario003-multinode-oooq-container @ https://review.openstack.org/560445, stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 08:52 |
*** panda|off is now known as panda | 08:54 | |
*** skramaja_ has quit IRC | 09:09 | |
quiquell|rover | panda, sshnaidm: To have more info at the promoter logs https://review.rdoproject.org/r/#/c/14523/ | 09:12 |
quiquell|rover | just print all the hashes at Skipping | 09:12 |
*** zoli is now known as zoli|lunch | 09:13 | |
*** skramaja has joined #oooq | 09:16 | |
*** sshnaidm is now known as sshnaidm|off | 10:02 | |
ykarel|lunch | quiquell|rover, as python2-tripleo-common is included in tq, we are creating the package again | 10:05 |
*** ykarel|lunch is now known as ykarel | 10:05 | |
ykarel | quiquell|rover, https://review.rdoproject.org/r/#/c/14522/ | 10:05 |
quiquell|rover | ykarel: You mean the change on the releases file | 10:05 |
ykarel | quiquell|rover, yes | 10:05 |
quiquell|rover | ykarel: ok, so you are reverting the revert | 10:06 |
ykarel | yes | 10:06 |
quiquell|rover | ykarel: Ok, have to work a little on the other one with the proper release file | 10:06 |
quiquell|rover | will have it in a few | 10:06 |
ykarel | but that will not affect https://review.rdoproject.org/r/#/c/14522/, right? | 10:06 |
quiquell|rover | ykarel: Don't expect so, still don't really know why we have to use those release files | 10:10 |
ykarel | ack | 10:11 |
*** holser_ has quit IRC | 10:12 | |
*** holser_ has joined #oooq | 10:22 | |
sshnaidm|off | quiquell|rover, left comments on https://review.rdoproject.org/r/#/c/14507/ | 10:25 |
quiquell|rover | sshnaidm|off: Have to show you something cool | 10:26 |
sshnaidm|off | que? | 10:27 |
quiquell|rover | sshnaidm|off: http://38.145.34.131:3000/d/2kHMNHvik/exploration?orgId=1 | 10:28 |
quiquell|rover | sshnaidm|off: Check out the filter part, grafana ad-hoc is super powerful | 10:28 |
quiquell|rover | sshnaidm|off: It's clear we have more latency at the limestone cloud provider for the container job | 10:28 |
sshnaidm|off | quiquell|rover, niiiice | 10:31 |
quiquell|rover | sshnaidm|off: You see the possibilities of these grafana ad-hoc variables ? | 10:31 |
quiquell|rover | sshnaidm|off: Now we can really explore stuff | 10:31 |
sshnaidm|off | quiquell|rover, but we need to remove all short jobs from there to avoid noise, like openstack-linters | 10:31 |
quiquell|rover | sshnaidm|off: I am changing the patch with your comments, agree with all of them | 10:31 |
sshnaidm|off | quiquell|rover, yep, variables are great, we use them in tripleo-ci dashboard too | 10:31 |
quiquell|rover | sshnaidm|off: ad-hoc variable type is the one, I think we are not using it there | 10:32 |
*** holser_ has quit IRC | 10:32 | |
sshnaidm|off | quiquell|rover, cool, will look in Sun more closely, gotta run | 10:32 |
quiquell|rover | sshnaidm|off: np, go go go | 10:32 |
sshnaidm|off | quiquell|rover, thanks for creating this! | 10:32 |
quiquell|rover | sshnaidm|off: Let's make it better with all of us | 10:33 |
*** dtantsur|afk is now known as dtantsur | 10:43 | |
*** skramaja has quit IRC | 10:45 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224, master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal, tripleo-ci-centos-7-scenario003-multinode-oooq-container @ https://review.openstack.org/560445 | 10:52 |
*** panda is now known as panda|ko | 10:53 | |
*** kopecmartin has joined #oooq | 11:01 | |
*** d0ugal_ has joined #oooq | 11:04 | |
*** d0ugal has quit IRC | 11:05 | |
*** zoli|lunch is now known as zoli | 11:10 | |
*** udesale has quit IRC | 11:24 | |
*** sshnaidm|off has quit IRC | 11:29 | |
*** ratailor has quit IRC | 11:32 | |
ykarel | quiquell|rover, nodepool fix working for both master and queens | 11:37 |
ykarel | fs018 queens: tempest running, | 11:37 |
ykarel | and master passed beyond config-download | 11:37 |
quiquell|rover | ykarel: Let's see what we get about fs017 timeout | 11:38 |
quiquell|rover | ykarel: \o/ !!! | 11:38 |
quiquell|rover | ykarel: We have to celebrate if we have promotions | 11:38 |
ykarel | we need the fs027 fix for master | 11:39 |
ykarel | we can expect queens promotions by skipping fs20 | 11:39 |
quiquell|rover | ykarel: Let's see if the fix lands | 11:39 |
ykarel | queue reset again | 11:39 |
quiquell|rover | ykarel: :-(((( | 11:39 |
ykarel | it's 16+ hrs now | 11:39 |
quiquell|rover | ykarel: Nah we have to wait | 11:39 |
*** trown|outtypewww is now known as trown | 11:49 | |
*** holser_ has joined #oooq | 11:50 | |
trown | panda|ko: ya I think what you did with the empty list peers might work ... if not that, maybe we can have "- secondary" there... not sure if zuul inventory creator will complain about secondary being undefined ... we could also put up a patch to the multi-node-bridge role to add a "when: groups['peers']" to those includes | 12:01 |
panda|ko | trown: in that way, the peers tasks will try to run on a non-existent node | 12:02 |
trown | panda|ko: actually empty list peers already made it past pre, so that seems like it is working | 12:02 |
panda|ko | trown: crossing fingers | 12:02 |
*** d0ugal_ has quit IRC | 12:15 | |
*** d0ugal_ has joined #oooq | 12:17 | |
quiquell|rover | panda|ko, trown: Do you know if wes is on PTO? | 12:18 |
panda|ko | quiquell|rover: I don't think he is, it's 6:30 in denver, maybe sometimes he doesn't want to wake up before dawn | 12:24 |
*** holser_ has quit IRC | 12:35 | |
*** ccamacho has quit IRC | 12:38 | |
*** holser_ has joined #oooq | 12:39 | |
*** quiquell|rover is now known as quique|rover|lch | 12:40 | |
*** rlandy has joined #oooq | 12:42 | |
*** ccamacho has joined #oooq | 12:46 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224, master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/560445 | 12:53 |
*** holser_ has quit IRC | 13:04 | |
ykarel | quique|rover|lch, all(except 002,027 in master and 002 in queens) jobs in queens/master passed | 13:10 |
ykarel | 002 in queens/master still running which should fail | 13:10 |
*** d0ugal_ has quit IRC | 13:12 | |
*** amoralej is now known as amoralej|lunch | 13:13 | |
*** d0ugal has joined #oooq | 13:13 | |
*** d0ugal has quit IRC | 13:13 | |
*** d0ugal has joined #oooq | 13:13 | |
*** udesale has joined #oooq | 13:14 | |
panda|ko | trown: I'm seeing undercloud jobs passing | 13:17 |
panda|ko | trown: I think we did it. | 13:17 |
trown | panda|ko: yep +2'd looking good | 13:18 |
trown | panda|ko: I will analyze times | 13:18 |
*** tcw has quit IRC | 13:21 | |
*** quiquell|tmp has joined #oooq | 13:23 | |
*** tcw has joined #oooq | 13:29 | |
*** agopi has quit IRC | 13:31 | |
quiquell|tmp | ykarel: queens is promoting ! | 13:33 |
ykarel | quiquell|tmp, Great | 13:33 |
ykarel | quiquell|tmp, now can think about what to do with pike | 13:34 |
ykarel | if it's only fs20 same issue as queens | 13:34 |
quiquell|tmp | It was the nodepool issue + fs20 | 13:34 |
quiquell|tmp | We can try to run the periodic on pike to confirm that | 13:34 |
ykarel | nodepool issue should be fixed similar to master/queens | 13:39 |
ykarel | if it's only fs020 it should be promoted in tomorrow's run | 13:39 |
ykarel | if we skip it today | 13:39 |
ykarel | in promotion criteria | 13:39 |
ykarel | then on Monday we can see all green and move with zuul v3 :) | 13:40 |
quiquell|tmp | It was multinode jobs and the fs020 | 13:40 |
ykarel | hoping to get fs027 fix merged by then | 13:40 |
quiquell|tmp | Ykarel: green summertime | 13:40 |
quiquell|tmp | Ykarel would be nice for reviews that close bugs to have more priority in the queue | 13:41 |
quiquell|tmp | Like detecting Closes-Bug or similar | 13:42 |
ykarel | the one marked with Critical | 13:42 |
ykarel | as most reviews have Closes-Bug in them | 13:42 |
ykarel | but zuul tests all reviews in the queue together to avoid issues | 13:43 |
quiquell|tmp | Would be nice | 13:44 |
quiquell|tmp | Sometimes those unblock the queue | 13:44 |
quiquell|tmp | Make sense to merge them first | 13:44 |
*** quiquell__ has joined #oooq | 13:47 | |
*** quiquell|tmp has quit IRC | 13:50 | |
*** bogdando has quit IRC | 13:56 | |
rlandy | panda: just fyi ... for reproducer https://review.openstack.org/#/c/579154 ... https://review.openstack.org/#/c/579161 - still testing | 13:57 |
panda|ko | rlandy: thanks | 13:57 |
*** agopi has joined #oooq | 14:02 | |
panda|ko | rlandy: are you able to test those with an undercloud-only job in multinode reproducer ? I'm anticipating some hiccups during bridge creation if we use a static group definition for the host | 14:02 |
panda|ko | rlandy: adding a comment in the review | 14:02 |
rlandy | I am just checking that we actually attempt the bridge creation for now | 14:02 |
rlandy | and yes, I can switch fs | 14:03 |
*** agopi_ has joined #oooq | 14:03 | |
rlandy | but I'd like to see how far I can get | 14:03 |
*** agopi_ has quit IRC | 14:03 | |
*** quiquell__ has quit IRC | 14:04 | |
*** agopi_ has joined #oooq | 14:05 | |
*** quique|rover|lch is now known as quiquell|rover | 14:06 | |
*** agopi has quit IRC | 14:06 | |
*** ccamacho has quit IRC | 14:09 | |
*** d0ugal has quit IRC | 14:15 | |
rlandy | panda: is the plan still to use the pre playbook from https://github.com/openstack-infra/zuul-jobs? | 14:17 |
rlandy | or will we be rolling our won | 14:17 |
*** d0ugal has joined #oooq | 14:17 | |
rlandy | own | 14:17 |
rlandy | panda|ko: ^^? | 14:17 |
rlandy | looking at the reviews out there, looks like transition legacy stuff | 14:17 |
trown | rlandy: we will use the multinode pre playbook | 14:19 |
* rlandy looks for that | 14:20 | |
trown | rlandy: https://github.com/openstack-infra/zuul-jobs/blob/master/playbooks/multinode/pre.yaml | 14:21 |
*** amoralej|lunch is now known as amoralej | 14:22 | |
rlandy | trown:ok - so the same one | 14:22 |
trown | ya exactly | 14:23 |
*** ccamacho has joined #oooq | 14:27 | |
panda|ko | rlandy: we can use the roles from zuul-jobs and openstack-zuul-jobs for free upstream, we just call them when needed. playbooks however need to be copied in our own repo | 14:34 |
panda|ko | trown: https://github.com/openstack-infra/zuul-jobs/blob/master/playbooks/multinode/pre.yaml is used by our base parent, so hierarchy comes to help, and we don't need to copy it | 14:35 |
panda|ko | I mean ^ | 14:36 |
panda|ko | rlandy: ^ | 14:36 |
panda|ko | quiquell|rover: I'm not sure if we'll migrate the jobs today, but you should know what is happening, ping me if you have 5 minutes to sync | 14:37 |
rlandy | panda|ko: that's fine - I was just not sure if we were still going to call anything directly from zuul-jobs | 14:38 |
panda|ko | rlandy: trown we are running multi-node-firewall twice | 14:44 |
panda|ko | rlandy: trown I think we can remove multinode-networking from the pre-run list in a next patch | 14:45 |
panda|ko | trown rlandy as the only thing it does is to call the multi-node-firewall and it seems to be already done by our new parent | 14:45 |
*** kopecmartin has quit IRC | 14:51 | |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224, master: tripleo-quickstart-extras-gate-newton-delorean-full-minimal @ https://review.openstack.org/560445 | 14:53 |
*** d0ugal has quit IRC | 14:55 | |
panda|ko | 3-nodes passed and collecting logs | 14:56 |
rlandy | panda|ko: I was calling the playbook directly | 14:58 |
panda|ko | rlandy: which playbooks and where ? | 14:59 |
rlandy | https://github.com/openstack-infra/zuul-jobs/blob/master/playbooks/multinode/pre.yaml - there is no base job in the reproducer | 15:00 |
rlandy | but I don't want to confuse your workflow | 15:00 |
*** d0ugal has joined #oooq | 15:02 | |
panda|ko | rlandy: ah yeah, I understand now. | 15:06 |
panda|ko | both patches passed. We're ready to migrate when we get the reproducer feedback | 15:07 |
*** openstack has quit IRC | 15:22 | |
*** openstack has joined #oooq | 15:23 | |
*** d0ugal has quit IRC | 15:23 | |
rlandy | weshay: 1-on-1? | 15:30 |
*** d0ugal_ has quit IRC | 15:32 | |
*** d0ugal__ has joined #oooq | 15:32 | |
panda|ko | rlandy: we didn't see wes today ... | 15:33 |
rlandy | panda|ko: yeah - but weird because he sent me an updated 1-on-1 invite for now | 15:34 |
*** ykarel is now known as ykarel|afk | 15:40 | |
*** tesseract has quit IRC | 15:42 | |
*** d0ugal__ has quit IRC | 15:42 | |
*** ccamacho has quit IRC | 15:44 | |
*** zoli is now known as zoli|gone | 15:46 | |
*** zoli|gone is now known as zoli | 15:46 | |
*** sshnaidm|off has joined #oooq | 15:55 | |
weshay | rlandy, aye.. | 15:56 |
weshay | rlandy, sorry I'm late | 15:56 |
rlandy | weshay: np - we can ski if you don't have time | 15:57 |
rlandy | ship | 15:57 |
rlandy | skip | 15:57 |
weshay | im in | 15:57 |
*** weshay has quit IRC | 16:09 | |
*** honza has quit IRC | 16:09 | |
*** gchamoul has quit IRC | 16:09 | |
*** zoli has quit IRC | 16:09 | |
*** dtantsur has quit IRC | 16:09 | |
*** weshay has joined #oooq | 16:14 | |
*** honza has joined #oooq | 16:14 | |
*** gchamoul has joined #oooq | 16:14 | |
*** zoli has joined #oooq | 16:14 | |
*** udesale has quit IRC | 16:15 | |
*** jtomasek has quit IRC | 16:18 | |
*** dtantsur has joined #oooq | 16:23 | |
*** dtantsur is now known as dtantsur|afk | 16:25 | |
*** trown is now known as trown|lunch | 16:31 | |
*** ykarel|afk is now known as ykarel|away | 16:31 | |
*** ykarel|away has quit IRC | 16:37 | |
*** rlandy is now known as rlandy|brb | 16:43 | |
weshay | rlandy|brb, fyi https://engineering.redhat.com/rt/Ticket/Display.html?id=473869 | 16:50 |
weshay | not sure which location is the right place for the ticket | 16:51 |
weshay | asking in irc as well | 16:51 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 16:53 |
*** openstack has quit IRC | 17:11 | |
*** openstack has joined #oooq | 17:12 | |
*** rlandy|brb is now known as rlandy | 17:16 | |
rlandy | weshay: thanks | 17:17 |
*** jfrancoa has quit IRC | 17:19 | |
*** trown|lunch is now known as trown | 17:41 | |
trown | hmm singlenode jobs arent tracked with graphite? | 17:52 |
rlandy | reproducer got as far as under cloud install | 18:30 |
*** ykarel|away has joined #oooq | 18:42 | |
panda|ko | rlandy: then stopped ? | 18:46 |
rlandy | panda|ko: I see the problem ... br-ex is not available. | 18:46 |
rlandy | PLAY [Configure a multi node environment] **************************************************************************************************************************************************** | 18:46 |
rlandy | skipping: no hosts matched | 18:46 |
rlandy | [WARNING]: Could not match supplied host pattern, ignoring: switch | 18:46 |
rlandy | [WARNING]: Could not match supplied host pattern, ignoring: peers | 18:46 |
rlandy | ^^ my hosts must be wrongly configured | 18:47 |
rlandy | fixing | 18:47 |
*** jaganathan has quit IRC | 18:50 | |
*** jaganathan has joined #oooq | 18:50 | |
rlandy | Create inventory suitable for zuul-jobs/multinode - skipped that task | 18:50 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 18:53 |
rlandy | | properties | {u'memory_mb': u'8192', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'4', u'capabilities': u'cpu_aes:true,cpu_hugepages:true,boot_option:local,cpu_vt:true,cpu_hugepages_1g:true,boot_mode:bios'} | 19:12 |
rlandy | interesting | 19:12 |
*** yolanda_ has joined #oooq | 19:19 | |
*** d0ugal__ has joined #oooq | 19:19 | |
*** yolanda__ has joined #oooq | 19:22 | |
*** yolanda has quit IRC | 19:23 | |
*** yolanda_ has quit IRC | 19:24 | |
*** d0ugal__ has quit IRC | 19:28 | |
rlandy | so it is accepting the attached disk | 19:30 |
rlandy | we can increase that | 19:31 |
rlandy | weshay: so, I have misjudged this rhos-13 error ... we are deploying with a volume and ironic is picking it up ... | 19:47 |
rlandy | | properties | {u'memory_mb': u'8192', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'4', u'capabilities': u'cpu_aes:true,cpu_hugepages:true,boot_option:local,cpu_vt:true,cpu_hugepages_1g:true,boot_mode:bios'} | 19:47 |
rlandy | the current job is at overcloud deploy ... | 19:47 |
rlandy | https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/tq-gate-rhos-13-ci-rhos-ovb-featureset001/83/console | 19:47 |
rlandy | but may or may not be stuck ... | 19:47 |
rlandy | 2018-06-29 15:27:57 | 2018-06-29 19:27:53Z [overcloud.Controller.0.UserData]: CREATE_IN_PROGRESS state changed | 19:48 |
rlandy | ^^ was a while back | 19:48 |
rlandy | I removed the env delete - so we can debug | 19:48 |
rlandy | 2018-06-29 16:06:58 | 2018-06-29 20:05:55Z [overcloud.Compute.0.NovaCompute]: CREATE_FAILED ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. , Code: 500" | 20:11 |
*** ykarel|away has quit IRC | 20:19 | |
weshay | rlandy, hrm.. so the no hosts found is due to another reason, not the diskspace | 20:38 |
weshay | is that the sum? | 20:38 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 20:53 |
rlandy | weshay: it may be - will probably have to ask from here - going through errors in logs | 21:01 |
rlandy | https://thirdparty.logs.rdoproject.org/jenkins-tq-gate-rhos-13-ci-rhos-ovb-featureset001-83/undercloud/var/log/ironic/ironic-conductor.log.txt.gz#_2018-06-29_16_02_39_883 | 21:02 |
rlandy | 2018-06-29 14:29:46.895 25429 ERROR nova.compute.manager [req-0a7e6434-2fe7-4a94-994b-180e9eb26312 - - - - -] No compute node record for host undercloud.localdomain: ComputeHostNotFound_Remote: Compute host undercloud.localdomain could not be found. | 21:02 |
rlandy | https://thirdparty.logs.rdoproject.org/jenkins-tq-gate-rhos-13-ci-rhos-ovb-featureset001-83/undercloud/var/log/nova/nova-compute.log.txt.gz#_2018-06-29_14_29_46_895 | 21:03 |
rlandy | weshay: ^^ seen that before? | 21:03 |
* weshay looks | 21:03 | |
rlandy | need to check for other successful rhos-13 deploys | 21:04 |
*** trown is now known as trown|outtypewww | 21:05 | |
weshay | rlandy, I've seen that error, just not always as something fatal | 21:06 |
* weshay looks more | 21:06 | |
*** agopi_ is now known as agopi | 21:08 | |
weshay | rlandy, instance_uuid=instance.uuid, reason=e.format_message())\n', u'RescheduledException: Build of instance d75c3b43-da3f-4813-9972-f43bb0a62eeb was re-scheduled: Insufficient compute resources: Free memory 0.00 MB < requested 4096 MB; Free disk -31.00 GB < requested 40 GB.\n'] | 21:10 |
weshay | https://thirdparty.logs.rdoproject.org/jenkins-tq-gate-rhos-13-ci-rhos-ovb-featureset001-83/undercloud/var/log/extra/errors.txt.gz#_2018-06-29_16_04_06_928 | 21:10 |
rlandy | so it was the memory | 21:11 |
rlandy | and disk space | 21:11 |
weshay | too bad you can't bond instances | 21:11 |
rlandy | we can do better on the disk space | 21:11 |
rlandy | can get a bigger volume | 21:11 |
rlandy | memory is interesting | 21:12 |
rlandy | look at the flavors | 21:12 |
rlandy | we deploy a large overcloud | 21:12 |
rlandy | and an extra large undercloud | 21:12 |
*** dougbtv_ has quit IRC | 21:13 | |
rlandy | | properties | {u'memory_mb': u'8192', u'cpu_arch': u'x86_64', u'local_gb': u'49', u'cpus': u'4', u'capabilities': u'cpu_aes:true,cpu_hugepages:true,boot_option:local,cpu_vt:true,cpu_hugepages_1g:true,boot_mode:bios'} | 21:14 |
rlandy | u'local_gb': u'49 - that I accept is lower | 21:14 |
rlandy | but memory is the same as rdocloud | 21:14 |
rlandy | (undercloud) [stack@undercloud ~]$ sudo df -h | 21:16 |
rlandy | Filesystem Size Used Avail Use% Mounted on | 21:16 |
rlandy | weshay: I assume from the message it's complaining about the resources on the overcloud - not undercloud | 21:22 |
weshay | rlandy, guess what | 21:23 |
rlandy | surprise me | 21:23 |
weshay | we're the only ones using OVB | 21:23 |
weshay | internal is not | 21:23 |
weshay | "looking into it" | 21:23 |
rlandy | wow - so why doesn't someone else complain? | 21:24 |
weshay | that is why | 21:24 |
weshay | no one is trying osp w/ ovb except for us | 21:24 |
agopi | dontcha worry rlandy weshay soon we'll be there with you to complain | 21:24 |
weshay | agopi, hey.. I just tested setup.py | 21:25 |
rlandy | agopi is going to run away from our OVB idea | 21:25 |
weshay | agopi, not seeing any problem w/ changes being sucked into the local working dir | 21:25 |
weshay | agopi, made changes to tqe, tripleo-env.. playbooks | 21:25 |
weshay | it's all good | 21:25 |
agopi | weshay, the update fixed all problems, i made changes to the gdoc immediately after, thought you checked it | 21:25 |
agopi | weshay++ | 21:25 |
hubbot | agopi: weshay's karma is now 5 | 21:25 |
weshay | OH | 21:25 |
weshay | oh this fixed it.. | 21:26 |
weshay | https://bugs.launchpad.net/tripleo/+bug/1778748 | 21:26 |
openstack | Launchpad bug 1778748 in tripleo "CI: parsing error while installing cliff " [High,Won't fix] | 21:26 |
weshay | ok | 21:26 |
weshay | k k | 21:26 |
weshay | ya.. we need to bail out in the right place w/ pip errors | 21:26 |
agopi | yes sir! | 21:26 |
weshay | cool | 21:26 |
agopi | im looking into reproducer script | 21:27 |
weshay | k | 21:28 |
rlandy | https://github.com/cybertron/openstack-virtual-baremetal/blob/master/templates/undercloud-volume.yaml#L16 | 21:30 |
rlandy | ^^ can change that setting | 21:30 |
rlandy | weshay: can I delete that deployment and try again with a bigger volume? | 21:33 |
weshay | rlandy, NUKE IT | 21:33 |
rlandy | at least I was not totally off about resources problem :( | 21:33 |
*** agopi is now known as agopi|off | 21:39 | |
*** agopi|off has quit IRC | 21:44 | |
*** dtantsur|afk has quit IRC | 22:06 | |
*** dtantsur has joined #oooq | 22:07 | |
*** matbu has quit IRC | 22:17 | |
*** agopi|off has joined #oooq | 22:18 | |
weshay | bah matt strikes again | 22:32 |
weshay | (pending—There are no nodes with the label ‘rdo-manager-64-74proto’) | 22:32 |
rlandy | baremetal_boot_from_volume_size: 50 | 22:33 |
rlandy | undercloud_boot_from_volume_size: 50 | 22:33 |
weshay | cool | 22:34 |
rlandy | ha - we can set this - is it will read it | 22:34 |
rlandy | if it | 22:34 |
weshay | rlandy, ping me w/ a link when it kicks | 22:34 |
weshay | https://code.engineering.redhat.com/gerrit/#/c/142864/ | 22:36 |
weshay | rlandy, any reasons why this shouldn't be removed? | 22:37 |
weshay | that you know of? | 22:37 |
weshay | we're not building images for pike, queens, master | 22:38 |
rlandy | weshay: none - but I don't know a lot about this stuff | 22:38 |
rlandy | weshay: watching https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/tq-gate-rhos-13-ci-rhos-ovb-featureset001/84/console | 22:39 |
rlandy | has volumes set to 80 | 22:39 |
rlandy | for undercloud and baremetal | 22:39 |
weshay | niiiice | 22:39 |
rlandy | let's see if it takes | 22:40 |
weshay | where does the volume get added? | 22:40 |
rlandy | -e baremetal_boot_from_volume_size=80 -e undercloud_boot_from_volume_size=80 | 22:40 |
rlandy | I added it in the job for now | 22:40 |
weshay | ya.. but on the file system | 22:40 |
rlandy | https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/ovb-manage-stack/defaults/main.yml#L31 | 22:41 |
rlandy | https://github.com/cybertron/openstack-virtual-baremetal/blob/master/templates/virtual-baremetal-servers-volume.yaml#L11 | 22:41 |
rlandy | that sets the env in | 22:41 |
rlandy | https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/ovb-manage-stack/templates/env.yaml.j2#L42 | 22:41 |
rlandy | 6 "| stack_status | CREATE_FAILED | 22:42 |
rlandy | we probably overran our quota | 22:42 |
weshay | :( | 22:49 |
weshay | rlandy, if you need to shutdown the other internal ovb jobs.. you have my blessing | 22:49 |
rlandy | weshay: nah - I think we are ok now | 22:50 |
rlandy | we're create complete | 22:51 |
rlandy | https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/tq-gate-rhos-13-ci-rhos-ovb-featureset001/85/console | 22:51 |
rlandy | let's see what this one does | 22:52 |
rlandy | I am at a loss about the memory shortage though. The disk we have changed | 22:52 |
hubbot | FAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/567224 | 22:53 |
*** rlandy has quit IRC | 23:41 |
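
Note on the 21:10 exchange: the RescheduledException text quoted by weshay reports both shortfalls in the same "Free X < requested Y" shape. Below is a minimal sketch, assuming that shape also holds for other occurrences of the message, of pulling the numbers out of such a line. The names `PATTERN` and `parse_shortfalls` are hypothetical helpers for illustration, not part of nova or tripleo.

```python
import re

# Matches fragments such as "Free memory 0.00 MB < requested 4096 MB"
# or "Free disk -31.00 GB < requested 40 GB" (shape taken from the log line).
PATTERN = re.compile(
    r"Free (?P<resource>\w+) (?P<free>-?\d+(?:\.\d+)?) (?P<unit>\w+)"
    r" < requested (?P<requested>\d+(?:\.\d+)?) (?P=unit)"
)


def parse_shortfalls(message):
    """Return {resource: (free, requested, unit)} for each shortfall found."""
    found = {}
    for match in PATTERN.finditer(message):
        found[match.group("resource")] = (
            float(match.group("free")),
            float(match.group("requested")),
            match.group("unit"),
        )
    return found


# Message text copied from the chat line at 21:10.
message = (
    "Insufficient compute resources: Free memory 0.00 MB < requested 4096 MB; "
    "Free disk -31.00 GB < requested 40 GB."
)

for resource, (free, requested, unit) in parse_shortfalls(message).items():
    # The gap is how much more of the resource the scheduler wanted.
    print(f"{resource}: short by {requested - free:.2f} {unit}")
```

Run against errors.txt, something like this would make it quicker to see that both memory and disk were short on the node, which is what rlandy concluded at 21:11.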