Wednesday, 2018-06-13

*** rlandy|rover is now known as rlandy|rover|bbl00:00
*** agopi has quit IRC00:38
*** agopi has joined #oooq00:39
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722400:40
*** agopi has quit IRC01:26
*** rfolco_doctor is now known as rfolco02:00
*** agopi has joined #oooq02:23
*** myoung|bbl is now known as myoung02:28
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722402:40
*** rlandy|rover|bbl is now known as rlandy|rover02:59
*** rlandy|rover has quit IRC02:59
*** agopi has quit IRC03:36
*** agopi has joined #oooq03:43
*** udesale has joined #oooq03:45
*** skramaja has joined #oooq03:48
*** agopi has quit IRC03:51
*** agopi has joined #oooq03:51
*** agopi has quit IRC04:00
*** links has joined #oooq04:24
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722404:40
*** ykarel|away has joined #oooq04:43
*** holser__ has joined #oooq04:43
*** holser___ has joined #oooq05:01
*** holser___ has quit IRC05:02
*** holser__ has quit IRC05:05
*** ykarel|away is now known as ykarel05:20
*** ratailor has joined #oooq05:25
*** jbadiapa has joined #oooq05:27
*** kopecmartin has joined #oooq05:51
*** bogdando has joined #oooq06:18
*** quiquell|off is now known as quiquell06:29
*** holser__ has joined #oooq06:34
*** saneax has joined #oooq06:38
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722406:40
*** tosky has joined #oooq06:40
*** sanjay__u has joined #oooq06:46
*** tesseract has joined #oooq07:07
*** ccamacho has joined #oooq07:31
*** amoralej|off is now known as amoralej07:36
*** brault has joined #oooq07:37
*** zoli is now known as zoli|wfh07:38
*** zoli|wfh is now known as zoli07:38
*** brault has quit IRC07:46
*** arxcruz|brb is now known as arxcruz|ruck07:59
*** jaganathan has quit IRC07:59
*** brault has joined #oooq08:03
*** brault has quit IRC08:06
*** brault has joined #oooq08:06
*** gkadam has joined #oooq08:20
*** udesale_ has joined #oooq08:21
quiquellsshnaidm: Fixed some ut of the promoter08:22
quiquellhttps://review.rdoproject.org/r/#/c/14084/08:23
*** udesale has quit IRC08:24
quiquellmarios: Added you to the review too08:25
quiquellIt add some unit testing to the promoter08:25
*** udesale__ has joined #oooq08:26
*** udesale_ has quit IRC08:29
ykarelarxcruz|ruck, https://bugs.launchpad.net/tripleo/+bug/177659608:29
openstackLaunchpad bug 1776596 in tripleo "[QUEENS] Promotion Jobs failing at overcloud deployment with AttributeError: 'IronicNodeState' object has no attribute 'failed_builds'" [Undecided,Confirmed]08:29
arxcruz|ruckjesus christ08:29
mariosquiquell: ack08:30
*** ykarel is now known as ykarel|lunch08:31
quiquellsshnaidm, marios: also this one https://review.openstack.org/#/c/572736/ for the sprint08:38
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722408:40
*** dtantsur|afk is now known as dtantsur08:58
quiquellarxcruz|ruck: panda's change on promoter is merged09:04
quiquellarxcruz|ruck: You can close this https://bugs.launchpad.net/tripleo/+bug/176809009:04
openstackLaunchpad bug 1768090 in tripleo "promoter script is not comparing timestamps correctly when folding hashes" [High,In progress] - Assigned to Gabriele Cerami (gcerami)09:04
*** sshnaidm has quit IRC09:06
*** links has quit IRC09:08
arxcruz|ruckquiquell: /me hates pytest09:09
quiquellarxcruz|ruck: Why ? the fixture is great09:10
quiquellarxcruz|ruck: you can express setup/teardown in the same function09:10
quiquellarxcruz|ruck: OOP on those test is adding complexity09:10
arxcruz|ruckquiquell: complexity is the openstack way :P09:11
arxcruz|ruckquiquell: just wanted to have it like openstack standards but again, i didn't -1 :)09:11
quiquellarxcruz|ruck: Thanks for the comments btw, let's get more ut on our stuff.09:13
quiquellmarios: the other part of trown's changes https://review.openstack.org/#/c/57242009:22
*** links has joined #oooq09:25
*** udesale_ has joined #oooq09:26
*** ykarel|lunch is now known as ykarel09:26
*** udesale__ has quit IRC09:26
*** udesale__ has joined #oooq09:28
mariosthank you quiquell09:30
*** udesale_ has quit IRC09:31
quiquellmarios: Let's see if we can get those merged, it's a pain with zuul's queues so big09:34
quiquellarxcruz|ruck: checking the r&r board, looks like promoter has still master commented out09:41
quiquellarxcruz|ruck: Is still broken ?09:41
arxcruz|ruckquiquell: oh yeah09:41
quiquellarxcruz|ruck: promotions looks really fucked up09:41
arxcruz|ruckquiquell: yup09:42
arxcruz|ruckmaster and queens09:42
quiquellarxcruz|ruck: just out of curiosity what's the issue ?09:42
toskys/f.../not working/09:42
arxcruz|ruckquiquell: on master: keystone flaskination messed with some endpoints09:42
*** sshnaidm has joined #oooq09:42
arxcruz|ruckon queens: fix was merged, but the job was killed by admin this morning, not sure why09:42
quiquellarxcruz|ruck: They are moving it to flask ?09:42
arxcruz|ruckquiquell: yes09:43
toskythey already moved (at least partially)09:43
quiquellarxcruz|ruck: What a sprint to be ruck/rover... :-/09:43
quiquelltosky: I see... so we will suffer the next sprint too09:43
arxcruz|ruckquiquell: it was fun ronelly really makes my job easier :)09:43
quiquellarxcruz|ruck: Yep, rlandy is super hero too09:44
arxcruz|ruckquiquell: also there's the problem with error 503 on containers, that ykarel fix, but we weren't able to test yet09:44
arxcruz|rucktest i mean, on the promotion pipeline09:44
quiquellarxcruz|ruck: So we are near to mordor in promotions09:44
*** tcw1 has joined #oooq09:45
ykarelarxcruz|ruck, i asked to abort queens job, to clean up rdo zuul queue, as queens were anyway going to fail because of https://bugs.launchpad.net/tripleo/+bug/177659609:46
openstackLaunchpad bug 1776596 in tripleo "[QUEENS] Promotion Jobs failing at overcloud deployment with AttributeError: 'IronicNodeState' object has no attribute 'failed_builds'" [Undecided,Confirmed]09:46
arxcruz|ruckykarel: oh, yeah, sorry, it was queens09:47
arxcruz|rucki only got one coffee today, and doing a lot of reviews09:47
*** tcw has quit IRC09:47
ykarelarxcruz|ruck, np. can you add triaged/critical to ^^09:47
ykarelarxcruz|ruck, chandankumar is patch to tempestconf pushed?09:48
arxcruz|ruckykarel: on the gates09:49
ykarelarxcruz|ruck, Ok good09:49
quiquellarxcruz|ruck: RDO's zuul does not support Depends-On between projects ?09:50
arxcruz|ruckquiquell: i think it does, better ask to rdo guys09:51
quiquellarxcruz|ruck: Cool thanks09:51
sshnaidmquiquell, hi09:51
sshnaidmquiquell, have time for sync?09:51
quiquellsshnaidm: What's up man09:51
quiquellsshnaidm: Sure09:51
quiquellsshnaidm: https://bluejeans.com/789106523209:52
*** links has quit IRC10:12
*** d0ugal has quit IRC10:17
*** jbadiapa has quit IRC10:19
*** links has joined #oooq10:25
*** udesale_ has joined #oooq10:26
*** udesale__ has quit IRC10:29
*** udesale__ has joined #oooq10:30
*** jbadiapa has joined #oooq10:30
*** udesale_ has quit IRC10:33
*** zoli is now known as zoli|lunch10:33
*** d0ugal has joined #oooq10:37
*** jaganathan has joined #oooq10:40
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722410:41
arxcruz|ruckykarel: tempestconf patch merged11:02
ykarelarxcruz|ruck, ack and we should be consistent in some time11:02
ykarellet's hope it's before next run11:03
*** anande has joined #oooq11:03
arxcruz|ruckyup11:03
*** ratailor has quit IRC11:04
*** florianf has joined #oooq11:04
arxcruz|ruckykarel: if not, we can ask admins to kill again11:06
ykarelarxcruz|ruck, ack11:06
*** udesale_ has joined #oooq11:07
*** quiquell is now known as quiquell|afk11:08
*** udesale__ has quit IRC11:10
*** jaganathan has quit IRC11:11
*** florianf has quit IRC11:15
*** florianf has joined #oooq11:16
*** atoth has joined #oooq11:33
ykarelarxcruz|ruck, we are consistent, so next run should go with merged python-tempestconf11:35
*** quiquell|afk is now known as quiquell11:35
*** zoli|lunch is now known as zoli|wfh11:47
*** rlandy has joined #oooq11:49
*** rlandy is now known as rlandy|rover11:50
quiquellsshnaidm: To test the add release to gating repo11:52
quiquellsshnaidm: https://trello.com/c/CuEgOpw3/794-ci-job-create-a-new-job-that-uses-fs50-but-tests-n-n-111:52
quiquellsshnaidm: I have put there some test patches that exercise also the  new job for n -> n + 111:53
sshnaidmquiquell, I'd like to start with testing backward compatibility for patches with one release without any upgrades11:53
sshnaidmquiquell, I'll make one..11:54
quiquellsshnaidm: Ok, going to use {{ working_dir }} for your comment11:54
weshayarxcruz|ruck, rlandy|rover who is joining the program call?11:56
weshayand updating the doc?11:56
weshayarxcruz|ruck, ah. you did it11:57
weshaythanks11:57
weshayarxcruz|ruck, don't try to predict :)11:57
rlandy|rovernow it's upstream with the long queues :( 21 hr 6 min11:58
arxcruz|ruckweshay: i don't, i have ykarel :P11:59
weshayarxcruz|ruck, :)11:59
weshayykarel++11:59
hubbotweshay: ykarel's karma is now 611:59
weshayarxcruz|ruck, we merge that tempest patch?11:59
arxcruz|ruckweshay: yup, next run will have it11:59
rlandy|roveryay - pike promoted12:00
weshayoh nice12:01
*** trown|outtypewww is now known as trown12:04
*** amoralej is now known as amoralej|lunch12:04
rlandy|roverarxcruz|ruck: ykarel: editing the card on the escalations board ... https://trello.com/c/nuMokZMf/618-cixlp1776596tripleociproa-queens-promotion-jobs-failing-at-overcloud-deployment-with-attributeerror-ironicnodestate-object-has-n12:05
rlandy|roverpls check that the labels etc. are correct12:05
rlandy|roverI set the DFG to hardware prov12:06
rlandy|roverbecause of ironic but really the bug change is in nova12:06
ykarelrlandy|rover, ack12:06
weshayarxcruz|ruck, so this fixes tempest config? https://review.openstack.org/#/c/574735/12:08
weshayarxcruz|ruck, have you confirmed that locally?12:09
*** quiquell is now known as quiquell|lunch12:09
arxcruz|ruckweshay: doesn't matter, the python-tempestconf will check both endpoints if one fail, it will check the second12:09
arxcruz|ruckso with this patch or not, it will work12:09
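[editor's sketch] The fallback described above ("check both endpoints, if one fails check the second") can be sketched as below. This is not python-tempestconf's actual code: `first_working_endpoint`, `fake_probe`, and both URLs are assumptions made up for illustration, and the probe is injected so the example runs without a network.

```python
from typing import Callable, Iterable, Optional


def first_working_endpoint(
    candidates: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first candidate URL for which probe() succeeds, else None."""
    for url in candidates:
        try:
            if probe(url):
                return url
        except OSError:
            # Treat connection-level errors as "this endpoint failed" and move on.
            continue
    return None


def fake_probe(url: str) -> bool:
    """Stand-in probe: pretend only the URL ending in /v3 answers."""
    if url.endswith("/v3"):
        return True
    raise ConnectionError("endpoint not reachable")


chosen = first_working_endpoint(
    ["http://keystone.example/identity", "http://keystone.example/v3"],
    fake_probe,
)
```

Because the loop only gives up after every candidate has failed, a change in which endpoint is published would not matter as long as one of the two still responds, which matches the "with this patch or not, it will work" remark.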
weshayk.. and that's all merged up?12:10
arxcruz|ruckweshay: it was merged this morning, ykarel already confirm that is already in consistent, and next periodic run will have it12:10
weshayrock on12:10
weshaythanks guys12:10
weshaychandankumar, where can I see this run w/ venv? https://review.openstack.org/#/c/572347/5/roles/validate-tempest/templates/configure-tempest.sh.j212:14
weshayarxcruz|ruck, do we know why containers-build failed on master?12:16
rlandy|rover%gatestatus12:16
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722412:16
weshayarxcruz|ruck, ERROR:kolla.common.utils:qdrouterd Failed with status: error"12:17
ykarelweshay, https://trello.com/c/5PoOjYc312:18
chandankumarweshay: https://review.openstack.org/#/q/topic:tempestconf_200+(status:open+OR+status:merged) -> https://review.openstack.org/#/c/572094/ -> http://logs.openstack.org/94/572094/5/check/tripleo-ci-centos-7-undercloud-oooq/9cd7fda/logs/undercloud/home/zuul/tempest.log.txt.gz#_2018-06-12_06_05_1112:20
weshayykarel, why is that not on the CIX board?12:20
*** marios has quit IRC12:20
*** marios has joined #oooq12:20
ykarelweshay, i checked in tripleo once it was about to fix12:20
weshayykarel, doesn't matter12:21
ykarelweshay, ok we can move there12:21
ykareli thought if it takes few hours(4+) then we move there12:22
ykarelweshay, ^^12:22
weshayykarel, is there a lp?12:23
ykarelweshay, no12:23
*** udesale__ has joined #oooq12:23
ykareli haven't seen any lp for it12:23
weshayykarel, if there is something that blocks a job in the chain, once you have an understanding ensure there is a lp bug w/ promotion_blocker tag.  Then let the bug handle the cix card12:25
weshayand the timing12:25
weshaythe lp bug is used to inform12:25
weshayand also for metrics of what has impacted the chain12:25
ykarelweshay, sure will take care next time12:25
weshayykarel, thank you sir12:25
*** udesale_ has quit IRC12:25
weshayykarel, we're skewing the metrics to some degree by not reporting it12:27
weshayarxcruz|ruck, thanks for the update12:27
ykarelweshay, understood12:28
ykarelwill take care now onwards12:28
*** saneax has quit IRC12:28
*** saneax has joined #oooq12:28
*** udesale__ has quit IRC12:31
*** anande has quit IRC12:35
*** quiquell|lunch is now known as quiquell12:40
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722412:41
weshaymyoung, let's shoot out some pings / reminders to fill out the retrospective12:42
weshayboad12:42
weshayboard12:42
*** tcw1 has quit IRC12:43
*** tcw has joined #oooq12:47
weshayarxcruz|ruck, rlandy|rover does this card cover the container build issue in our master job? https://trello.com/c/5PoOjYc3/804-puppet-cipoi-scenario-001-tripleocontainers-buildmasterfailing-1st-puppet-run-while-installing-qpid-dispatch-router12:51
rlandy|roverwe never use that column12:53
rlandy|roverI did not witness this failure - checking the jobs history12:55
myoungweshay: ack, prepping board now12:55
weshaysshnaidm, https://github.com/sshnaidm/sova/pulls12:55
sshnaidmweshay, +w12:56
myoungall squads: retrospective board is live for sprint 14 --> https://trello.com/b/0VFswmht/rdo-infra-retrospective?menu=filter&filter=label:Sprint%201412:57
myoungLet's fill out cards for first 10m12:58
rlandy|roverweshay: ack - looks like the same error12:58
rlandy|roverykarel: hi - any reason you added the containers build card to failing jobs rather than a bug? looking if this should be on the escalations board?12:59
rlandy|roverbecause the reviews are already out there?13:00
rlandy|rovermerged in early june13:01
weshaychandankumar, retro13:01
rlandy|roverhttps://review.rdoproject.org/r/#/c/14194/ merged this morning13:01
chandankumarweshay: bj link13:02
sshnaidmchandankumar, it's different channel from this in calendar13:03
sshnaidmmyoung, can we update calendar with right bluejeans links?13:04
chandankumarsshnaidm: got the right link13:04
*** Goneri has joined #oooq13:07
*** skramaja has quit IRC13:21
*** amoralej|lunch is now known as amoralej13:28
*** hrybacki is now known as hrybacki|mtg13:31
*** kopecmartin has quit IRC13:34
rascamyoung, rlandy|rover, hey folks, can you help me understand why this pipeline seems stuck https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-director-promote-13-puddle/ ?14:09
rascawhat is it waiting for?14:09
myoungrasca: we are in retrospective14:10
myoungrasca: can look in a bit sorry14:10
rascamyoung, no worries14:10
*** sanjay__u has quit IRC14:10
rlandy|roverrasca: hey - as myoung says, a bit distracted - what do you mean by stuck?14:12
*** dtrainor has joined #oooq14:13
*** florianf has quit IRC14:15
*** florianf has joined #oooq14:19
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722414:41
*** kopecmartin has joined #oooq14:45
rascarlandy|rover, if you look at https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-director-promote-13-puddle/13/ you'll see that the jobs seems waiting for something14:51
rascamyoung ^^^14:52
rascamyoung, rlandy|rover, specifically https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-director-promote-13-puddle/13/console14:53
*** hrybacki|mtg is now known as hrybacki15:04
*** agopi has joined #oooq15:07
rascatrown, hey, regarding https://review.openstack.org/#/c/553465/6/doc/source/node-configuration.rst what is a TOC link?15:09
trownrasca: table of contents, check the CI result that I linked15:09
trownrasca: the doc is kind of hard to see that there are different sections15:09
rascatrown, ooooh I see you mean on the left15:10
trownrasca: ya, the "====" formatting doesnt add any header in the text...  only a table of contents entry on the left15:11
trownrasca: I am fine if you put a patch on top of yours to fix up doc, and we can merge and unblock you15:11
rascatrown, but I see it as a general problem of this doc, isn't it? Apart from the additions15:12
rascatrown, ok, I get it15:12
rascatrown, let me see if I'm able :D15:12
trownrasca: well the doc before your modification had just that section as "Node configuration" you split it into sections, but the sections are not actually clear15:13
rascatrown, Ok, the section was "Libvirt Node Configuration", so what if I turn it into "Node Configuration" as === and then all the subsections as ----?15:13
rascadid I get correctly what you meant?15:14
trownrasca: ya I think that would be great15:14
rascatrown, and why it could be not in this review?15:14
trownrasca: it can be, just feel bad blocking your work over docs15:14
rascatrown, no worries, I'll do in this one15:15
*** ykarel is now known as ykarel|away15:33
*** links has quit IRC15:36
*** saneax has quit IRC15:39
ykarel|awayarxcruz|ruck, fs018 failed with one tempest test, Good part is tempestconf issue fixed15:43
ykarel|away:(15:43
arxcruz|ruckykarel|away: :( do you have the log easy ?15:44
rascatrown, I think I fixed it15:44
ykarel|awayarxcruz|ruck, https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset018-master/c940db4/undercloud/home/jenkins/tempest/tempest.html.gz15:44
trownrasca: thanks15:45
rascatrown, let's see what the gates tell us15:45
arxcruz|ruckquiquell: told you, no thanks, beer15:45
rlandy|roverrasca: ok - finally out of meeting - checking15:46
rlandy|roverarxcruz|ruck: change your nick :)15:46
*** dtantsur is now known as dtantsur|afk15:46
quiquellarxcruz|ruck: Another keg for you15:46
quiquellLeave now15:46
*** quiquell is now known as quiquell|off15:47
rascarlandy|rover, ok, I'm open for suggestions15:47
rascabecause I don't know why this is not working15:47
rlandy|roverok - I'm looking at it now15:47
rascaor proceeding, to be precise15:47
*** bogdando has quit IRC15:50
*** bogdando has joined #oooq15:52
*** jaganathan has joined #oooq15:52
rlandy|rovertaking forever to resolve15:54
rlandy|roverrasca: which job is stuck - the get_puddle or the ones that trigger afterwards15:55
rlandy|roverbuild 13 ran yesterday15:56
*** myoung is now known as myoung|biaf15:58
rascarlandy|rover, myoung|biaf, https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-director-promote-13-puddle/13/15:59
rlandy|roverdoes this trigger daily?15:59
rlandy|rovercan rekick it15:59
rlandy|roverlooking at config15:59
rlandy|roverjenkins is so slow16:00
rascano wait16:00
rlandy|roverI am on the get_puddle job16:00
rascaI already rekicked it and it's almost the same16:00
*** weshay is now known as weshay|ruck16:00
rascaand this does not run each day16:00
rascait just run when there's a new puddle16:00
myoung|biafrasca: the virt-ha machine's slave is probably hanging again16:00
weshay|ruckarxcruz|ruck, rlandy|rover fyi.. we may have enough to promote master16:00
rlandy|roverrasca: can we bj? so I understand what you are getting at?16:00
*** florianf has quit IRC16:00
weshay|ruckdepending.. on this one tempest failure16:00
myoung|biafhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-director-promote-13-puddle/13/console16:00
weshay|ruckit may be a fluke16:01
myoung|biafhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/osp-rhos-13-promote-puddle-virtha-3ctlr_1comp_192gb/16:01
rascarlandy|rover, sure, but I think myoung|biaf hit the problem16:01
rlandy|roverweshay|ruck; it's not done yet16:01
rascahit == found what this is about16:01
arxcruz|ruckweshay|ruck: so, even with fs018 failing ?16:01
myoung|biaf^^ that slave is probably offline again, that's the one that over the past few months has been hitting intermittent freezes.  in the past I've had to ssh into virthost for that slave VM and restart it via virsh16:01
*** arxcruz|ruck is now known as arxcruz16:01
rlandy|roveroh ok16:01
rlandy|roverfs020 and fs035 are not done yet16:02
weshay|ruckarxcruz, how much energy do you have left.. it's one test failing on that job... before promoting it.. I would run it by the tripleo ptls16:02
myoung|biafif it's the same issue, journalctl for libvirt will show a weird HW fault that causes the VM to freeze up.  I think it's just entropy and old hardware...we should IMHO move it.  For now if we disable the job and/or kill the pending instances of the virt-ha job it'll finish the multijob16:02
myoung|biafrasca: I can help if you need, or if you know how to resolve you should have the perms / access16:03
*** bogdando has quit IRC16:03
myoung|biafrasca: confirmed: https://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/computer/rdo-manager-18-virt-ha/ is off in space16:03
rlandy|roverok fs001 and fs035  are ok16:03
*** myoung|biaf is now known as myoung16:03
rascamyoung|biaf, who have the rights to act on that machine?16:04
* myoung didn't actually leave yet and chuckles16:04
weshay|ruckrlandy|rover, arxcruz we'll see what fs20 returns16:04
rlandy|roverweshay|ruck: you want to change the promotion criteria?16:04
weshay|ruckrlandy|rover, I may.. but will discuss w/ tripleo ptls before doing so16:04
myoungrasca: my BJ if you like, unless weshay|ruck you're on it already16:04
rlandy|rovermyoung; can use mine16:04
weshay|ruckrlandy|rover, its not great to hold back openstack patches from tripleo for too long16:04
rlandy|roverwas going to discuss with rasca anyways16:04
myoungno i meant you folks are looking at promotion blocker stuff16:05
rlandy|roverweshay|ruck: ack - I agree at this point16:05
myoungwant me to reboot the slave VM for virt-ha?16:05
weshay|ruckrlandy|rover, arxcruz you can renick :)16:05
rascarlandy|rover, can you give me the link?16:05
rlandy|roverweshay|ruck: you have no rover now16:05
weshay|ruckrlandy|rover, arxcruz well done :) thanks for your time as ruck/rover16:05
myoungrlandy++16:05
myoungarxcruz++16:05
hubbotmyoung: arxcruz's karma is now 216:05
weshay|ruckya. it will be the same thing tomorrow afternoon16:05
*** florianf has joined #oooq16:05
rlandy|roverweshay|ruck: I'll renick but you can call on me16:05
weshay|ruckthanks16:06
*** rlandy|rover is now known as rlandy16:06
myoungrlandy++16:06
hubbotmyoung: rlandy's karma is now 416:06
myoungweshay|ruck: do you want me to sort rasca / osp13  virt-ha slave?16:06
* myoung doesn't want to overstep16:06
rlandyrasca: https://bluejeans.com/u/rlandy/16:06
rlandymyoung: ^^16:06
myoungack16:07
rlandywe have slave options16:07
* myoung actually fades away this time for a few16:07
*** myoung is now known as myoung|biaf16:07
*** gkadam has quit IRC16:08
*** ykarel|away has quit IRC16:12
*** kopecmartin has quit IRC16:17
*** ccamacho has quit IRC16:21
*** agopi has quit IRC16:23
*** florianf has quit IRC16:24
*** ykarel|away has joined #oooq16:25
*** myoung|biaf is now known as myoung|lunch16:29
*** brault has quit IRC16:31
*** tesseract has quit IRC16:33
*** ykarel|away has quit IRC16:37
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722416:41
*** holser__ has quit IRC16:42
weshay|ruckrlandy, slave issues?16:42
rlandyweshay|ruck; in which context? rasca?16:44
*** ykarel|away has joined #oooq16:52
myoung|lunchweshay|ruck: i rebooted the virt-ha slave, it was switched off (brn slave virthost)16:53
myoung|lunchit's the third time in a year this has happened...libvirt throws a hardware fault and the VM comes crashing down16:53
myoung|lunchjust had to ssh in and start it.  we should really relocate slaves and / or have IT ticket logged as it's now a pattern16:54
myoung|lunchweshay|ruck: i have some notes buried in a card on this i can dig up.  back in a few...importing calories16:54
weshay|ruckis that our slave or rasca's?16:54
myoung|lunchit's the jenkins slave VM that drives jobs targeted at  PIDONE 40 core + 196 gb virthost.  the actual machine in PIDONE lab is fine.16:55
myoung|lunchhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/computer/rdo-manager-18-virt-ha16:55
weshay|ruckis it our slave or rascas?16:55
myoung|lunch^^ that VM is the issue16:55
myoung|lunchour slave, located on (fetches)16:55
myoung|lunchhp-dl320g8-10.lab.eng.brq.rh.com16:56
myoung|lunchhp-dl320g8-10.lab.eng.brq.{REDACTED}.com16:56
weshay|ruckquiquell|off, fyi the gate job failure log links are busted17:05
weshay|ruckquiquell|off, the prepend http://38.145.34.131:3000/17:05
weshay|ruckto the url to the log17:05
weshay|rucksshnaidm, you still around?17:09
sshnaidmweshay|ruck, yeah17:09
weshay|rucksshnaidm, RFE for sova... adding a column w/ the nodepool_provider17:09
weshay|ruckwdyt?17:10
sshnaidmweshay|ruck, what does it mean?17:10
sshnaidmweshay|ruck, sorting by infra clouds?17:12
weshay|rucksshnaidm, so in sova I can see which cloud provider the job ran on17:12
weshay|ruckwell next to the "Length" column17:12
sshnaidmweshay|ruck, I have it in grafana.. and will include it in new dashboard too17:12
weshay|ruckso I can detect a pattern w/ a certain cloud provider17:12
weshay|ruckk.. new dashboard would be good too17:12
sshnaidmweshay|ruck, well, about length you can check right now17:13
weshay|ruckgrafana meaning the one in rdo-sf?17:13
weshay|ruckya.. correlating the length w/ provider17:13
sshnaidmweshay|ruck, and one of quiquell|off too17:13
weshay|rucksshnaidm, here? https://review.rdoproject.org/grafana/dashboard/db/tripleo-ci?orgId=117:14
weshay|ruckah ya17:14
weshay|rucksee it17:14
sshnaidmweshay|ruck, go to https://review.rdoproject.org/grafana/dashboard/db/tripleo-ci?orgId=1  set there "cloud"17:14
sshnaidmweshay|ruck, and "duration" graph will change accordingly17:14
weshay|ruckya did17:14
weshay|rucknot seeing a clear trend17:15
weshay|ruckhttps://review.rdoproject.org/grafana/dashboard/db/tripleo-ci?orgId=1&var-pipeline=All&var-branch=All&var-cloud=rax-iad&var-type=All&var-jobtype=All17:15
*** amoralej is now known as amoralej|off17:18
*** myoung|lunch is now known as myoung17:19
sshnaidmweshay|ruck, added red line for 2 h 50 min there..17:19
weshay|rucksshnaidm, thanks17:20
weshay|rucksshnaidm, doesn't seem to be specific to a host17:20
weshay|ruckto a provider17:21
weshay|rucksee the same trend on vex as rax17:21
sshnaidmweshay|ruck,limestone and vex are the worst17:21
sshnaidmweshay|ruck, rax is not good too17:22
sshnaidmweshay|ruck, just going one by one..17:22
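[editor's sketch] Going through providers one by one, as above, amounts to grouping job durations by nodepool provider and comparing each group against the 2 h 50 min line. A minimal sketch of that grouping follows; the function name is hypothetical and the sample rows are invented for illustration, not taken from the grafana dashboard.

```python
from collections import defaultdict
from statistics import median

THRESHOLD_MIN = 2 * 60 + 50  # the 2 h 50 min line mentioned above, in minutes


def slow_share_by_provider(runs):
    """Map provider -> (median duration in minutes, share of runs over the threshold)."""
    by_provider = defaultdict(list)
    for provider, minutes in runs:
        by_provider[provider].append(minutes)
    return {
        provider: (
            median(durations),
            sum(d > THRESHOLD_MIN for d in durations) / len(durations),
        )
        for provider, durations in by_provider.items()
    }


# Invented sample rows (provider, minutes); not real CI data.
sample = [("rax-iad", 150), ("rax-iad", 180), ("vex", 175), ("vex", 190)]
summary = slow_share_by_provider(sample)
```

Reporting the share of runs over the threshold next to the median makes it easier to tell "one provider is consistently slow" apart from "every provider has a few slow runs", which is the distinction being discussed.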
*** trown is now known as trown|lunch17:24
*** agopi has joined #oooq17:24
*** ykarel|away has quit IRC17:37
*** zoli|wfh is now known as zoli|gone17:38
*** zoli|gone is now known as zoli17:39
*** Goneri has quit IRC17:43
*** yolanda has quit IRC17:50
*** brault has joined #oooq17:55
*** brault has quit IRC17:59
*** yolanda has joined #oooq18:05
*** myoung is now known as myoung|biab18:08
*** jaganathan has quit IRC18:17
chandankumarquiquell|off: myoung|biab sshnaidm arxcruz https://www.youtube.com/watch?v=rhMFX9dE4cw -> to learn about rpm packaging and rdo packaging18:17
sshnaidmchandankumar++18:20
hubbotsshnaidm: chandankumar's karma is now 418:20
*** Goneri has joined #oooq18:28
*** trown|lunch is now known as trown18:30
weshay|ruckrlandy, ^18:37
rlandyweshay|ruck: thanks18:38
*** matbu has quit IRC18:38
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722418:41
*** brault has joined #oooq18:54
*** brault has quit IRC18:59
rlandyweshay|ruck: https://review.openstack.org/#/c/574894/ https://code.engineering.redhat.com/gerrit/#/c/141378/. afaict, browbeat and rasca use different jjb builders19:09
rlandysince we are not touching the full-deploy-baremetal script, we should be able to merge without impacting other jobs19:09
*** myoung|biab is now known as myoung19:15
myoungchandankumar++, bookmarked to watch it when i have some time (today is not the day :))19:16
rlandymyoung: pls take a look at ^^ JJb change. I think I changed the right one for the bm pipelines19:27
myoungrlandy: you have, i commented on that earlier today with some suggestions (optional)19:27
myoungrlandy: double checking...19:28
myoungrlandy: ack, that's the correct builder.  let me know what you think re: gerrit notes i left there19:29
rlandyk - cool - I'll look19:29
weshay|ruckrlandy, want me to merge this? https://review.openstack.org/#/c/574894/219:32
weshay|ruckit will only take two days to merge lolz19:32
rlandyweshay|ruck: if I am really lucky19:33
rlandyit need to merge on top of rasca patch19:33
weshay|ruckoops19:33
rlandybut sure - no harm addition19:33
*** tcw has quit IRC19:34
*** tcw has joined #oooq19:35
weshay|ruckrlandy, on second thought.. merging19:35
weshay|ruckso I can get on it w/ it and test19:35
rlandyweshay|ruck: are we talking about the same review??19:36
rlandythe t-q one19:36
weshay|ruckrlandy,https://review.openstack.org/#/c/574894/19:36
rlandyyeah ok19:37
weshay|ruckrlandy, re: the jjb I think it would make sense to be able to pass a featureset via jjb template variable19:37
weshay|ruckand the node as well19:37
rlandyweshay|ruck: basically, I just needed those two review to email out to other users and explain the change19:37
weshay|ruckrlandy, k k19:38
rlandymyoung had the same comment - looking at how to pass those vars rather19:38
weshay|ruckrlandy, so you have these two reviews, plus a  job that matches this config19:38
weshay|ruckproof it works..19:38
rlandy{functional_config} models "--config" {topology} models "--nodes"19:38
weshay|ruckmakes sense19:38
weshay|ruckrook, is ignoring us :)19:38
rlandyI don't blame him :(19:38
myoungrlandy: ack, if it helps can point to other examples19:38
rlandyI should have had this done a while back19:38
myoungrlandy: but didn't want to -1 anything if yall are running fast and want to hard code configs19:39
weshay|ruckrlandy, jebus19:39
weshay|ruckrlandy, you said it19:39
rlandysec updating review19:39
rlandythen we need to make rhos-13 work - get it as far as overcloud deploy19:40
weshay|ruckrlandy, oh I remember now... we do not want to initially include the fs and node config as vars19:40
weshay|ruckrlandy, that will end up changing the name of the jobs19:40
weshay|ruckand screw all that for now19:40
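[editor's sketch] Why passing the featureset and node config as template variables would rename the jobs: in a templated job definition the same variables usually feed both the job name and the command line. The sketch below mimics that substitution with plain `str.format`; both template strings and all values are assumptions for illustration, since the real JJB templates are not shown in this log.

```python
# Hypothetical templates; the actual job-template and builder text is not in this log.
JOB_NAME_TEMPLATE = "bm-{release}-{functional_config}-{topology}"
ARGS_TEMPLATE = "--config {functional_config} --nodes {topology}"


def expand(template, **variables):
    """Substitute {name} placeholders, as a templating tool would per project entry."""
    return template.format(**variables)


# Made-up values standing in for one project entry.
variables = {"release": "rhos-13", "functional_config": "fs001", "topology": "1ctlr_1comp"}

job_name = expand(JOB_NAME_TEMPLATE, **variables)
args = expand(ARGS_TEMPLATE, **variables)

# Changing only the featureset changes the generated job name as well.
renamed = expand(JOB_NAME_TEMPLATE, **{**variables, "functional_config": "fs002"})
```

Any variable that appears in the name template produces a new job whenever its value changes, so the old job's history stays behind under the old name.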
myoungrlandy: http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/jenkins/jobs/tripleo-quickstart/tripleo-quickstart-main.yml#n192 is example of using those variables from project section19:40
rlandyweshay|ruck:so I thought about that - and yes, it would change the job name19:41
rlandybut that is not a bad thing19:41
weshay|ruckrlandy, ya.. so let's not19:41
weshay|rucknot right now19:41
weshay|ruckon top of this patch later19:41
* myoung nods at weshay|ruck and rlandy and goes back to other stuff19:41
rlandyideally we should name it like upstream19:41
weshay|ruckrlandy, right19:41
weshay|ruckhow about that19:41
rlandywe could19:42
weshay|ruckwe could, but later19:42
rlandyso we need to keep the name under a certain number of cars19:42
weshay|ruckthis is wes pushing off work19:42
rlandychars19:42
weshay|ruckya19:42
rlandyand we would probably still want the env name19:42
rlandyand single-nic19:42
weshay|ruckrlandy, if we play our cards right.. soon only the bm jobs will be left here19:42
rlandyright - and I thought we trigger those from zuul as well eventually19:43
rlandyso I actually thought about this a bit ...19:43
weshay|ruckya.. it "could" happen19:43
rlandyand changing the job name is not a big deal19:43
weshay|ruckrlandy, I just don't want to deal w/ it19:44
rlandyb/c we only impact ourselves19:44
weshay|ruckrlandy, removing the old jobs etc19:44
rlandyweshay|ruck: lol - I don't blame you19:44
weshay|rucklosing history19:44
weshay|ruckblah blah blah19:44
rlandyupdating promote scripts etc.19:44
rlandyokie dokie19:44
weshay|rucklet's just see it work w/ upstream configs for a bit and be merry19:44
* weshay|ruck going through the configs19:45
rlandyso what do I do about myoung's comment and using the {functional_config} models "--config" {topology} models "--nodes"19:45
rlandychange the review or leave as is?19:45
rlandyweshay|ruck: we are missing C and A19:45
rlandy(not in use in CI atm)19:45
rlandyif all goes well,  I can add those afterwards19:45
rlandyreally just cut/paste work19:45
rlandyor I can just add them now - ugh - will take me 15 minutes19:46
weshay|ruckheh.. rlandy upstream zuul queue at 1.1 days19:46
rlandyoh dear me19:46
weshay|rucklolz19:46
rlandyworse than rdo zuul yesterday19:47
rlandyweshay|ruck: it's like playing those 90's video games, when you can withstand enough pressure, you get moved to a level where it gets worse19:47
weshay|ruckheh19:48
rlandyall openstack devs - vacation tomorrow - we need to clear the queues19:48
weshay|ruckrlandy, heh.. Alex can shoot it in the head if he wanted to19:49
rlandyweshay|ruck; well if nothing is really broken, and we just have  volume, and we DO want people to run tests, how will killing zuul help?19:50
rlandywe'd have to kill off a few contributors19:50
weshay|ruckrlandy, he clear the gate queue for tripleo I meant19:52
weshay|ruckhe can19:52
toskyit's only two days from Friday19:52
weshay|ruckafaict it looks like there is some variation across nodepool providers on some jobs19:52
rlandyby friday, the queue will be 5 days long19:52
myoungI think the longer queue times absent infra issues are useful data on where we need to either add resources, optimize things, etc.  Some of the ideas around how CI time can be re-used and/or restructured to be more efficient almost need a wheel to squeak, right?19:52
weshay|ruck0/ tosky :)19:53
toskynah, it will go down on Saturday19:53
tosky(unless zuul explodes)19:53
rlandyweshay|ruck: ack - we have seen trouble with some providers more than others19:53
rlandybut we don't get to pick19:53
rlandyor do we?19:53
weshay|ruckrlandy, we don't pick.. lolz no19:54
weshay|ruckrlandy, oh my19:56
weshay|ruckthe change in config is not easy to follow or walk through19:56
myoungspeaking of gate times, 11.5 hours after +w and https://review.openstack.org/#/c/574970 is into gates \o/19:57
rlandyweshay|ruck: everything got deleted except instackenv.json and 1 env_settings per hardware environment19:57
weshay|ruckmyoung, you've only just begun the journey.. oy vey.. 1.1 more days to go for that patch19:57
weshay|ruckrlandy, that I get19:57
rlandyweshay|ruck: we can walk  through it if you like19:57
myoungHello darkness my old friend...19:58
weshay|ruckrlandy, however I'm trying to trace variables from the deleted config19:58
myoungweshay|ruck: {chuckle}19:58
weshay|ruckto the env_settings19:58
rlandyok19:58
weshay|ruckmaybe that is the wrong approach19:58
rlandynothing wrong with that19:58
rlandyit's possible19:58
rlandythe single_nic_vlans.yaml file becomes the network_environment args19:59
weshay|ruckrlandy, so given that if there are failures here.. I'll probably ping you for extra eyes..19:59
rlandyweshay|ruck: ack19:59
weshay|ruckrlandy, any other good ways to try and verify this prior to merge19:59
rlandyweshay|ruck: yes ...20:00
rlandyhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rlandy-poc-tripleo-quickstart-master-baremetal-hp_dl360_envE-single_nic_vlans/20:00
rlandywe change the env name and run20:00
rlandyas far as the jjb, not really20:00
rlandybut the command should be the same20:00
weshay|ruckrlandy, oh wait.. I know a way20:01
weshay|ruckrlandy, just following your pattern20:01
weshay|ruckI'll create a test job.. rlandy-poc$job for B,D20:02
weshay|ruckwe could consider breaking this review into three20:02
weshay|rucknot needed really though20:02
weshay|ruckmerge, run the rlandy-poc jobs and profit20:02
weshay|ruckif we needed to revert we could20:03
rlandyweshay|ruck: we can test w/o merge20:03
rlandysee the current poc20:03
weshay|ruckk20:03
rlandyjust takes in the changes20:03
rlandybefore quickstart run20:03
weshay|ruckah yes20:04
weshay|ruckk20:04
* weshay|ruck creates two new jobs20:04
rlandyweshay|ruck: feel free to edit that job20:04
rlandychange the env called in the quickstart.sh run - and the virthost20:05
weshay|ruckrlandy, if we ran this against pike too.. that would be informative20:06
weshay|ruckas they all pass minus the bond_with_vlan20:06
rlandyweshay|ruck: sure - run one env against queens, one pike20:07
rlandybond templates are not tested20:07
rlandywhich is why I want to get rid of them in the gates20:07
rlandyand here20:07
rlandythe point of this stage is not to test new templates20:07
weshay|ruckaye20:07
rlandyas with the rhos-13 gates20:07
rlandythey failed on overcloud deploy20:08
rlandymajor debug to figure that template out20:08
rlandymove to fs001 - more streamlined20:08
rlandyforget the bond env20:08
rlandyunless someone tells us to bring it back20:08
rlandyweshay|ruck, ^^ I have tried to get help in debugging that template before20:09
rlandylittle success20:09
weshay|ruckk20:09
rlandyha - we have  a problem20:19
rlandythe templates need to change20:23
weshay|ruckhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rlandy-poc-tripleo-quickstart-master-rdo_trunk-baremetal-dell_fc430_envB-single_nic_vlans/20:24
weshay|ruckhttps://rhos-dev-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rlandy-poc-tripleo-quickstart-master-rdo_trunk-baremetal-hp_dl360_envD-single_nic_vlans/20:25
weshay|ruckbah.. virthost20:25
weshay|ruckrlandy, ok.. those look good to go, not 100% sure re: the throttle but just copied what you had20:30
rlandyweshay|ruck: we have a problem with the templates20:30
weshay|ruckjjb?20:30
rlandytht20:30
rlandyhttps://github.com/openstack/tripleo-heat-templates/tree/master/environments20:30
rlandyno longer exists20:31
rlandyit's a j2 file20:31
rlandywe had a bug reported that we are using hardcoded templates in ci - generally20:31
rlandyand we should be using the new j2 templates20:32
rlandyfor ovb rdocloud, we use https://github.com/openstack/tripleo-heat-templates/tree/master/ci/environments20:32
rlandy^^ status20:32
rlandystatic20:32
rlandyfor single_nic there are no more static templates20:33
weshay|ruckrlandy, that change must have just happened no?20:33
rlandyrecently, I think20:34
rlandyI am trying to figure out how this worked yesterday20:35
weshay|ruckso this implies to me...20:36
weshay|ruckif we choose single-nic-vlan w/ config_download.. the templates are rendered for us20:37
weshay|ruckso far it's in master and queens20:37
weshay|ruckrlandy, it's this way in pike too20:38
weshay|ruckrlandy, so interesting that it's working20:38
rlandyweshay|ruck: if I look on the undercloud20:38
rlandyhttps://thirdparty.logs.rdoproject.org/jenkins-rlandy-poc-tripleo-quickstart-master-baremetal-hp_dl360_envE-single_nic_vlans-4/undercloud/usr/share/openstack-tripleo-heat-templates/environments/20:39
rlandywhere are those files?20:39
weshay|ruckbluejeans20:40
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722420:41
*** matbu has joined #oooq20:57
*** trown is now known as trown|outtypewww21:01
*** dtrainor has quit IRC21:35
*** atoth has quit IRC21:54
*** Goneri has quit IRC22:06
*** myoung is now known as myoung|off22:20
hubbotFAILING CHECK JOBS on stable/queens: tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades @ https://review.openstack.org/56722422:41
*** gkadam has joined #oooq22:46
rlandypike promoting23:01
rlandyEmilienM: hi - do you have a moment for an 'openstack overcloud deploy' question?23:30
EmilienMrlandy: hey, debugging gate on #triploe23:31
EmilienM#tripleo23:31
rlandyEmilienM: I think this is unrelated - and definitely not urgent - I will paste in my question in pvt channel - for whenever you have the time to look at it.23:33
*** dtrainor has joined #oooq23:38
*** tosky has quit IRC23:51
*** dtrainor has quit IRC23:58