*** ysandeep|away is now known as ysandeep | 06:32 | |
*** sshnaidm|afk is now known as sshnaidm | 06:34 | |
pojadhav | ysandeep, 0/ | 07:37 |
---|---|---|
*** jpena|off is now known as jpena | 07:37 | |
ysandeep | pojadhav, hi | 07:37 |
pojadhav | ysandeep, do you know ceph related issue was going on.. is still there ? https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-scenario010-standalone-network-rhos-17/58d5590/logs/undercloud/home/zuul/standalone_deploy.log | 07:37 |
ysandeep | tripleo component need promotion to fix that, fix is stuck at current hash, A FTBFS is ongoing.. https://trello.com/c/1aKt6YKf/2064-cixftbfsosp-170 | 07:40 |
pojadhav | ysandeep, ohh okay.. this is the same issue which ronelle discussed. got it.. thanks! | 07:41 |
ysandeep | yup.. same issue | 07:41 |
arxcruz|off | zbr: frenzy_friday i'm getting this error when i try to do a make up | 07:56 |
arxcruz|off | er-cron | elasticsearch.exceptions.SSLError: ConnectionError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate is not yet valid (_ssl.c:1131)) caused by: SSLError([SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate is not yet valid (_ssl.c:1131)) | 07:56 |
arxcruz|off | should i set some variable to ignore it ? | 07:57 |
frenzy_friday | arxcruz|off, didnt see this error till now. Checking | 07:58 |
frenzy_friday | which elasticsearch version is it? | 07:59 |
arxcruz|off | frenzy_friday: i got from your latest patch | 07:59 |
arxcruz|off | let me check | 07:59 |
arxcruz|off | elasticsearch==7.14.0 | 07:59 |
frenzy_friday | hm.. not sure which url it is failing for. me checks | 08:01 |
*** ykarel is now known as ykarel|lunch | 08:23 | |
arxcruz|off | frenzy_friday: fixed, date issues :) | 08:24 |
frenzy_friday | oh, okay. What did you change? I'll update the doc | 08:25 |
arxcruz|off | frenzy_friday: no, actually, i'm running docker on a vm because i don't have on my machine | 08:26 |
arxcruz|off | and i paused the vm when i turn my computer off | 08:27 |
arxcruz|off | when i bring it up back, it did not update the date, it was august 17 | 08:27 |
arxcruz|off | so when tried to connect to es, it was failing | 08:27 |
frenzy_friday | got it | 08:27 |
arxcruz|off | see, certificate is NOT YET valid | 08:27 |
arxcruz|off | lol | 08:27 |
arxcruz|off | frenzy_friday: i was getting a lot of unknown bug | 09:08 |
arxcruz|off | but then, after reach https://review.rdoproject.org/elasticsearch/logstash-2021.08.10/_stats | 09:08 |
arxcruz|off | now the page is empty | 09:08 |
frenzy_friday | arxcruz|off, shall we have a call? | 09:13 |
arxcruz|off | sure | 09:13 |
arxcruz|off | give me 5 min | 09:13 |
frenzy_friday | sure | 09:13 |
arxcruz|off | frenzy_friday: ready | 09:17 |
frenzy_friday | arxcruz|off, https://meet.google.com/ykw-xxva-akx | 09:17 |
*** ykarel|lunch is now known as ykarel | 09:36 | |
*** ykarel is now known as ykarel|afk | 10:53 | |
*** arxcruz|off is now known as arxcruz | 11:01 | |
weshay|ruck | pojadhav, 0/ | 11:31 |
*** dviroel|out is now known as dviroel|ruck | 11:31 | |
*** jpena is now known as jpena|lunch | 11:33 | |
pojadhav | weshay|ruck, hi | 11:36 |
weshay|ruck | pojadhav, you coming ? | 11:36 |
pojadhav | weshay|ruck, yup | 11:37 |
weshay|ruck | chandankumar++ nice job w/ the FTBFS | 11:37 |
chandankumar | weshay|ruck: I have done nothing there | 11:37 |
chandankumar | just got the trello card from CIX | 11:38 |
*** rlandy is now known as rlandy|rover | 11:43 | |
rlandy|rover | weshay|ruck: hi - you wanted to talk about https://trello.com/c/1aKt6YKf/2064-cixftbfsosp-170? | 11:45 |
chandankumar | dviroel|ruck: \o | 11:49 |
chandankumar | dviroel|ruck: please have a look at this https://review.opendev.org/c/openstack/tripleo-common/+/804797/1#message-c7c650b1231a53a24e1238d90a36b4cd19c273a7 when free, thanks! | 11:49 |
weshay|ruck | ya.. but talking to pooja | 11:50 |
dviroel|ruck | chandankumar: hi, ok | 11:51 |
rlandy|rover | dviroel|ruck: hi - how are things? | 11:58 |
dviroel|ruck | rlandy|rover: hi, so far so good, it seems that only periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-ussuri was missing for ussuri promotion | 12:00 |
dviroel|ruck | openstack-periodic-integration-stable3 just started | 12:03 |
rlandy|rover | cool | 12:05 |
rlandy|rover | ysandeep: hey - do you know what's up with all the standalone failures in 17 line or should I investigate that? | 12:25 |
ysandeep | rlandy|rover: they started failing in the current run; when I checked in my morning, all except 010 were passing | 12:28 |
rlandy|rover | ysandeep: k - will check into it after meetings | 12:28 |
rlandy|rover | maybe cloud hitch | 12:28 |
*** jpena|lunch is now known as jpena | 12:29 | |
rlandy|rover | dviroel|ruck: feel free to comment at CIX - will jump in when needed | 12:31 |
dviroel|ruck | rlandy|rover: ok, tks | 12:31 |
rlandy|rover | chandankumar: hey - want to touch base on el9? | 12:52 |
chandankumar | rlandy|rover: yes sure | 12:52 |
rlandy|rover | chandankumar: https://meet.google.com/rku-bupo-rqz?pli=1&authuser=0 | 12:53 |
chandankumar | arxcruz, zbr, sshnaidm, rlandy, marios, ysandeep, bhagyashris, svyas, soniya29, pojadhav, akahat, weshay, chandankumar, frenzy_friday, dviroel Scrum: https://meet.google.com/bqx-xwht-wky | 13:00 |
chandankumar | sorry https://meet.google.com/xnf-tvdh-pmk?authuser=0 | 13:01 |
rlandy|rover | zbr: arxcruz: ^^ | 13:03 |
*** amoralej is now known as amoralej|lunch | 13:07 | |
zbr | i am in another meeting, i will join later | 13:09 |
*** amoralej|lunch is now known as amoralej | 13:46 | |
rlandy|rover | zbr: no worries | 13:48 |
rlandy|rover | ysandeep: pls ping when you are EoD - I will follow through promoting the tripleo component if needed | 13:51 |
rlandy|rover | we are relying on OVB in place of BM, right? | 13:51 |
ysandeep | rlandy|rover, ack for eod, yes we are relying on OVB.. I am fixing BM for 17.. currently in manual testing mode.. hitting some issue.. but I am making progress | 13:52 |
weshay|ruck | dviroel|ruck, need anything? | 13:53 |
arxcruz | rlandy|rover: System is going down. Unprivileged users are not permitted to log in anymore. For technical details, see pam_nologin(8). | 13:53 |
arxcruz | when i try to access the vm | 13:53 |
rlandy|rover | arxcruz: the candidate vm | 13:53 |
dviroel|ruck | weshay|ruck: investigating that https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-8-standalone-upgrade-ussuri | 13:53 |
arxcruz | rlandy|rover: the candidate vm yes | 13:53 |
rlandy|rover | ysandeep may be working on it | 13:53 |
dviroel|ruck | weshay|ruck: trying to search for more error than https://4d34507513d46a298d9b-c450cedbad3a93d818e1040974f11faf.ssl.cf1.rackcdn.com/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/3e39fdd/logs/quickstart_install.log | 13:54 |
rlandy|rover | arxcruz: checking | 13:54 |
ysandeep | rlandy|rover, arxcruz, I am working on candidate vm | 13:54 |
arxcruz | ysandeep: so you're the one to blame | 13:54 |
weshay|ruck | dviroel|ruck, probably best to debug the "periodic" version of those... | 13:55 |
arxcruz | ysandeep: https://www.youtube.com/watch?v=SrDSqODtEFM | 13:55 |
weshay|ruck | that way.. there is no additional code change to muck it up | 13:55 |
ysandeep | arxcruz, :) I will ping you once I am done deploying standalone env | 13:56 |
weshay|ruck | dviroel|ruck, so fails here: 2021-08-22 10:00:56.111025 | primary | TASK [os_tempest : Ping router ip address] ************************************* | 13:56 |
dviroel|ruck | yep | 13:56 |
weshay|ruck | 2021-08-22 08:59:54.415 ERROR /var/log/containers/neutron/server.log: 19 ERROR neutron.db.agentschedulers_db [req-d53e2492-8504-4b2f-922f-ce1edaccd4c1 - - - - -] Exception encountered during network rescheduling: oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.24.3' ([Errno 113] EHOSTUNREACH)") | 13:57 |
weshay|ruck | dviroel|ruck, https://9bbde2059085467a4330-af2016a5632320f910deb9dcbf495ac6.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/eb7bbdc/logs/undercloud/var/log/extra/errors.txt | 13:58 |
weshay|ruck | https://9bbde2059085467a4330-af2016a5632320f910deb9dcbf495ac6.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/eb7bbdc/logs/undercloud/var/log/containers/mysql/mysqld.log | 13:59 |
weshay|ruck | dviroel|ruck, looks like the upgrade never brought it back up | 13:59 |
weshay|ruck | https://9bbde2059085467a4330-af2016a5632320f910deb9dcbf495ac6.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/eb7bbdc/logs/undercloud/var/log/containers/mysql/mysqld-upgrade.log | 13:59 |
weshay|ruck | let's compare to a successful job | 13:59 |
weshay|ruck | ok.. nope.. the restart is not logged .. successful job https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b1e/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/b1e21a3/logs/undercloud/var/log/containers/mysql/mysqld-upgrade.log | 14:00 |
rlandy|rover | dviroel|ruck: sorry ... need help with anything other than ^^? | 14:00 |
rlandy|rover | can look now | 14:00 |
dviroel|ruck | rlandy|rover: no, not really, waiting ussuri jobs yet | 14:01 |
weshay|ruck | dviroel|ruck, this is not an error: | 14:04 |
weshay|ruck | 2021-08-22 09:40:39 | "<13>Aug 22 09:40:12 puppet-user: Warning: Unknown variable: '::deployment_type'. (file: /etc/puppet/modules/tripleo/manifests/packages.pp, line: 39, column: 69)", | 14:04 |
weshay|ruck | but.. suspicious | 14:04 |
weshay|ruck | https://9bbde2059085467a4330-af2016a5632320f910deb9dcbf495ac6.ssl.cf5.rackcdn.com/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/eb7bbdc/logs/undercloud/home/zuul/standalone_upgrade.log | 14:04 |
dviroel|ruck | weshay|ruck: hum, same warning in the successful job too https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b1e/periodic/opendev.org/openstack/tripleo-heat-templates/stable/ussuri/tripleo-ci-centos-8-standalone-upgrade-ussuri/b1e21a3/logs/undercloud/home/zuul/standalone_upgrade.log | 14:06 |
weshay|ruck | dviroel|ruck, aye.. agree | 14:07 |
* weshay|ruck hunting https://review.opendev.org/q/tripleo+-age:5d+status:merged | 14:07 | |
weshay|ruck | https://review.opendev.org/q/tripleo+-age:5d+status:merged+branch:stable/ussuri | 14:08 |
weshay|ruck | ¯\_(ツ)_/¯ | 14:09 |
weshay|ruck | https://review.opendev.org/c/openstack/puppet-tripleo/+/804155/3/manifests/profile/pacemaker/database/mysql_bundle.pp | 14:09 |
weshay|ruck | upgrades are tough.. spend a couple hours.. write up what you can.. and cix for help | 14:12 |
*** pojadhav- is now known as pojadhav | 14:14 | |
rlandy|rover | dviroel|ruck: weshay|ruck: note - vexxhost is having issues | 14:15 |
rlandy|rover | see #rhos-ops | 14:15 |
rlandy|rover | arxcruz: ^^ fyi | 14:16 |
rlandy|rover | login problem | 14:16 |
ysandeep | arxcruz, rlandy|rover instance is back, you can try to login now | 14:20 |
rlandy|rover | arxcruz: ^^ ols try so we can see if you have access | 14:21 |
rlandy|rover | pls | 14:21 |
weshay|ruck | dviroel|ruck, let's get the bug open so that people see why their check jobs are getting blocked | 14:29 |
dviroel|ruck | weshay|ruck: ok | 14:30 |
dviroel|ruck | weshay|ruck: the change that you pasted above, is it related or is a guess? | 14:31 |
weshay|ruck | it's ussuri so low traffic | 14:31 |
weshay|ruck | dviroel|ruck, I'm guessing at what could be the problem there.. | 14:31 |
weshay|ruck | not a lot of changes in train and ussuri though.. | 14:31 |
dviroel|ruck | weshay|ruck: ok, i'll open the bug with the os_tempest error: "Ping router ip address.." | 14:32 |
weshay|ruck | dviroel|ruck, that and the mysql error | 14:33 |
dviroel|ruck | weshay|ruck: yeah, maybe some patch that is in victoria and not in ussuri since: https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-8-standalone-upgrade-victoria | 14:34 |
rlandy|rover | dviroel|ruck: just fyi - testprojected the two wallaby failures | 14:35 |
arxcruz | rlandy|rover: ysandeep i was able to log in | 14:43 |
ysandeep | arxcruz: cool! | 14:43 |
rlandy|rover | yay! | 14:44 |
rlandy|rover | progress | 14:45 |
dviroel|ruck | weshay|ruck: https://bugs.launchpad.net/tripleo/+bug/1940844 | 14:53 |
dviroel|ruck | weshay|ruck: will continue to investigate to add more details to it | 14:53 |
weshay|ruck | dviroel|ruck++ thank you | 14:54 |
dviroel|ruck | weshay|ruck: didn't add promotion-blocker tag to this one | 14:57 |
weshay|ruck | dviroel|ruck, that's fine for now.. but as you get closer to wanting to hand off.. add it | 14:58 |
*** amoralej is now known as amoralej|off | 15:08 | |
zbr | using redhat sso seems to become something that needs training | 15:13 |
zbr | i tried to login to jira, ended up on sso.redhat.com which redirected me to https://sbarnea.com/ss/Screen-Shot-2021-08-23-16-14-25.42.png --- apparently google was not enough, now we also got salesforge in. | 15:14 |
*** dviroel|ruck is now known as dviroel|ruck|lunch | 15:17 | |
*** jpena is now known as jpena|off | 15:36 | |
sshnaidm | rlandy|rover, hey | 15:49 |
sshnaidm | rlandy|rover, can we talk about psi c9 issue? | 15:49 |
rlandy|rover | sshnaidm: ack | 15:49 |
sshnaidm | rlandy|rover, https://meet.google.com/vji-fzhb-bkp | 15:50 |
*** ysandeep is now known as ysandeep|dinner | 15:51 | |
rlandy|rover | chandankumar: ^^ hey was talking with sshnaidm about c9 - booking some time tomorrow so we can sync up | 16:05 |
rlandy|rover | he has standalone working with c8 containers | 16:05 |
rlandy|rover | we could put c9 containers in there | 16:06 |
*** ysandeep|dinner is now known as ysandeep | 16:10 | |
rlandy|rover | https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?pipeline=%09openstack-promote-component | 16:17 |
rlandy|rover | woohoo - only one component not promoted | 16:17 |
rlandy|rover | network 17 - getting there | 16:17 |
rlandy|rover | weshay|ruck: ^^ | 16:17 |
rlandy|rover | need 17 promotion | 16:18 |
*** dviroel|ruck|lunch is now known as dviroel|ruck | 16:19 | |
rlandy|rover | ysandeep: ^^ :) | 16:20 |
ysandeep | rlandy|rover, tripleo component run report back: https://code.engineering.redhat.com/gerrit/c/testproject/+/211643 | 16:21 |
ysandeep | all passed except bm failure(expected not in criteria) | 16:21 |
ysandeep | rlandy|rover, tripleo will promote in the next promote-to-promoted run... then we will need integration line promotion for other components to pick up the sc010 fix | 16:22 |
rlandy|rover | ysandeep: yep - saw - thanks | 16:23 |
rlandy|rover | it's in promoted-components | 16:23 |
rlandy|rover | so can rekick 17 line | 16:23 |
rlandy|rover | ysandeep: ^^ rekicking 17 line | 16:26 |
ysandeep | ack o/ | 16:47 |
* ysandeep wondering if you/wes can demo in community call.. how to retrigger the pipelines | 16:48 | |
rlandy|rover | ack can do | 16:53 |
rlandy|rover | zuul admin instructions | 16:53 |
rlandy|rover | weshay|ruck: re: your patch on previous tripleo-ci-testing | 16:54 |
rlandy|rover | promoted-components :) | 16:54 |
weshay|ruck | hrm.. that ran on components? | 16:56 |
rlandy|rover | https://review.rdoproject.org/r/c/config/+/34934/3/playbooks/tripleo-ci-base-promote-consistent-to-tripleo-ci-testing/run.yaml | 16:57 |
rlandy|rover | that playbook moves consistent to tripleo-ci-testing | 16:57 |
rlandy|rover | we don't do that anymore | 16:57 |
rlandy|rover | promoted-components -> tripleo-ci-testing | 16:58 |
weshay|ruck | oh ya.. ur right | 16:58 |
rlandy|rover | so we need ... | 16:58 |
weshay|ruck | I think I just made the change in the wrong place | 16:58 |
weshay|ruck | and this is duplicated elsewhere | 16:58 |
rlandy|rover | https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-centos-8-wallaby-promote-promoted-components-to-tripleo-ci-testing/8d1667f/job-output.txt | 16:58 |
rlandy|rover | taking example ^^ | 16:58 |
rlandy|rover | [trusted : review.rdoproject.org/config/playbooks/tripleo-ci-base-promote-hash/run.yaml@master] | 16:59 |
weshay|ruck | review.rdoproject.org/config/playbooks/tripleo-ci-base-promote-hash/run.yaml | 16:59 |
rlandy|rover | https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/playbooks/tripleo-ci-base-promote-hash/run.yaml | 17:00 |
rlandy|rover | right thta | 17:00 |
rlandy|rover | that | 17:00 |
rlandy|rover | promote-primary-distro.yaml | 17:00 |
rlandy|rover | https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/roles/promote-hash/tasks/promote-primary-distro.yaml | 17:01 |
weshay|ruck | wonder if it's a better use case after tripleo-repos get hash.. has been integrated here | 17:14 |
dviroel|ruck | weshay|ruck: didn't make progress on that https://bugs.launchpad.net/tripleo/+bug/1940844 - tricky | 17:17 |
dviroel|ruck | alex made some comment in the LP | 17:17 |
dviroel|ruck | maybe a network issue that we can't see in the logs | 17:18 |
weshay|ruck | aye.. k.. if you add promotion-blocker it won't cix for 5 hours.. | 17:18 |
weshay|ruck | so .. | 17:18 |
weshay|ruck | I looked at the network settings too.. didn't see much diff.. damian will probably pick it up in his morning | 17:19 |
weshay|ruck | dviroel|ruck, fyi https://zuul.openstack.org/builds?job_name=tripleo-ci-centos-8-scenario001-standalone | 17:20 |
weshay|ruck | tempest.lib.exceptions.Conflict: Conflict with state of target resource | 17:20 |
weshay|ruck | Details: {'type': 'SecurityGroupInUse', 'message': 'Security Group deca90a1-f2ef-4c8b-8ae8-8f7f11a1dbc9 in use.', 'detail': ''} | 17:20 |
weshay|ruck | }}} | 17:20 |
weshay|ruck | just 1 hit.. will be watching for more | 17:20 |
dviroel|ruck | ok | 17:20 |
dviroel|ruck | so, will add the tag to it | 17:21 |
rlandy|rover | ci.centos rekicked | 17:37 |
weshay|ruck | dviroel|ruck, rlandy|rover for the compute component: https://review.opendev.org/c/openstack/nova/+/805663 | 17:39 |
dviroel|ruck | ++ | 17:40 |
rlandy|rover | nice | 17:41 |
dviroel|ruck | rlandy|rover: openstack-periodic-integration-stable3 disaster might be related with vexxhost earlier issue, right? | 17:43 |
rlandy|rover | dviroel|ruck: ack | 17:43 |
rlandy|rover | no worries it will rerun | 17:43 |
dviroel|ruck | ok | 17:43 |
ysandeep | rlandy|rover, weshay|ruck fyi.. I have analyzed https://bugs.launchpad.net/tripleo/+bug/1940729 which was discussed on df call, We only hit this on second rerun of overcloud deploy on existing environment.. We don't have a CI job that do overcloud deploy rerun(Update / Scale operation) in upstream/component ci.. | 18:08 |
* ysandeep will check rabi's bug tomorrow | 18:08 | |
weshay|ruck | ysandeep, rock.. and you have a handle on the ipv6 mis-config right? | 18:08 |
weshay|ruck | re: another issue rabi was speaking to | 18:09 |
weshay|ruck | ah.. that's what ur talking about in your second comment | 18:09 |
weshay|ruck | nevermind | 18:09 |
weshay|ruck | ur ahead of me | 18:09 |
* ysandeep hoping once baremetal is up again we can possibly add scaleout on one of a baremetal env.. | 18:11 | |
weshay|ruck | that may or may not be worth the effort because it's covered by qe... | 18:24 |
weshay|ruck | ysandeep, if you have a sec.. let's chat | 18:24 |
weshay|ruck | can be tomorrow as well.. but you AIN'T GOT NOTHIN GOING ON... you are a FREE MAN | 18:25 |
ysandeep | weshay|ruck, lets chat | 18:26 |
weshay|ruck | ysandeep, meet.google.com/hdn-puut-rcm | 18:27 |
rlandy|rover | periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby | 18:35 |
rlandy|rover | - ok - see that failed again | 18:35 |
dviroel|ruck | same tempest tests | 18:39 |
dviroel|ruck | there are two wallaby failures for that: | 18:48 |
dviroel|ruck | https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby/d97fad9/logs/undercloud/var/log/tempest/stestr_results.html.gz | 18:48 |
dviroel|ruck | https://logserver.rdoproject.org/95/24995/93/check/periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby/23c3023/logs/undercloud/var/log/tempest/stestr_results.html.gz | 18:48 |
*** ysandeep is now known as ysandeep|away | 19:05 | |
dviroel|ruck | ^ nova-api stops responding after GET http://192.168.24.3:8774/v2.1/servers/88242305-5529-4f74-a342-5a37f7d25005 | 19:18 |
dviroel|ruck | https://logserver.rdoproject.org/95/24995/93/check/periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby/23c3023/logs/undercloud/var/log/containers/nova/nova-api.log.txt.gz req-bef84584 | 19:18 |
rlandy|rover | dviroel|ruck: hey - you're looking into scenario001 failure? | 19:49 |
dviroel|ruck | yes | 19:50 |
*** slaweq is now known as slaweq_ | 19:50 | |
rlandy|rover | dviroel|ruck: k - need help? | 19:51 |
rlandy|rover | tempest failure | 19:52 |
dviroel|ruck | rlandy|rover: should we wait for one more failure? it seems that in master we have two tests failing, and one in wallaby | 19:53 |
dviroel|ruck | i was looking into wallaby, now looking at master | 19:54 |
rlandy|rover | dviroel|ruck: wallaby has one in gate and one in testproject integration line | 19:55 |
rlandy|rover | both tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern | 19:55 |
rlandy|rover | ^^ right? | 19:55 |
rlandy|rover | you're seeing that failure? | 19:55 |
dviroel|ruck | here https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_1dd/800341/23/check/tripleo-ci-centos-8-scenario001-standalone/1dda775/logs/undercloud/var/log/tempest/stestr_results.html we also have tempest.scenario.test_snapshot_pattern.TestSnapshotPattern | 19:56 |
dviroel|ruck | ^ and this one seems to be a glance issue | 19:56 |
dviroel|ruck | just trying to fing the root cause | 19:57 |
dviroel|ruck | find* | 19:57 |
rlandy|rover | so we have different tempest failures | 19:58 |
rlandy|rover | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_1dd/800341/23/check/tripleo-ci-centos-8-scenario001-standalone/1dda775/logs/undercloud/var/log/tempest/stestr_results.html | 19:59 |
rlandy|rover | same | 19:59 |
rlandy|rover | tempest.lib.exceptions.UnexpectedResponseCode: Unexpected response code received | 19:59 |
rlandy|rover | Details: 503 | 19:59 |
dviroel|ruck | https://17c6cb2f9fa2917c93e9-c8c2a5181a911daab043eb1d43163b4b.ssl.cf2.rackcdn.com/805559/1/gate/tripleo-ci-centos-8-scenario001-standalone/5ffb422/logs/undercloud/var/log/tempest/stestr_results.html tempest.scenario.test_snapshot_pattern.TestSnapshotPattern with a different error | 20:01 |
* dviroel|ruck has so many tabs open | 20:02 | |
rlandy|rover | dviroel|ruck: lol - welcome to ruck/rover life | 20:03 |
rlandy|rover | when your laptop starts smoking from overuse, you've made it :) | 20:03 |
dviroel|ruck | ah, i just got a new one, still have lot of resources :) | 20:04 |
rlandy|rover | let's see when this hit integration | 20:05 |
rlandy|rover | and check/gate | 20:05 |
dviroel|ruck | glance failing here https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_1dd/800341/23/check/tripleo-ci-centos-8-scenario001-standalone/1dda775/logs/undercloud/var/log/containers/glance/api.log | 20:06 |
dviroel|ruck | ^test_snapshot_pattern.TestSnapshotPattern | 20:06 |
rlandy|rover | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby | 20:06 |
rlandy|rover | last success 08/22 | 20:06 |
rlandy|rover | w promoted yesterday | 20:07 |
rlandy|rover | let's check when glance updated | 20:07 |
rlandy|rover | https://trunk.rdoproject.org/centos8-wallaby/component/glance/ hasn't updated in ages | 20:08 |
rlandy|rover | more likely tripleo | 20:09 |
rlandy|rover | https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby | 20:11 |
rlandy|rover | ^^ see when that first started | 20:11 |
rlandy|rover | those could be vexx impacted | 20:13 |
dviroel|ruck | 2021-08-19 18:02 https://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby/dc1177a/logs/undercloud/var/log/tempest/stestr_results.html.gz | 20:14 |
dviroel|ruck | same errors: http://logserver.rdoproject.org/openstack-periodic-integration-stable1/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-8-scenario001-standalone-wallaby/dc1177a/logs/undercloud/var/log/containers/glance/api.log.txt.gz | 20:16 |
rlandy|rover | 2021-08-23 16:23:31.177031 | localhost | Provider: vexxhost-nodepool-tripleo | 20:17 |
rlandy|rover | ^^ gate failure | 20:17 |
rlandy|rover | scratch that | 20:17 |
rlandy|rover | 2021-08-23 18:09:39.920713 | localhost | Provider: inap-mtl01 | 20:18 |
rlandy|rover | 2021-08-23 18:09:39.920803 | localhost | Label: centos-8-stream | 20:18 |
rlandy|rover | other than just vexx | 20:18 |
rlandy|rover | comparing passing and failing rpms | 20:19 |
rlandy|rover | dviroel|ruck: hmmm https://review.opendev.org/c/openstack/tripleo-heat-templates/+/805280/ passes | 20:21 |
rlandy|rover | wonder if we need that | 20:21 |
rlandy|rover | to get the rest to pass | 20:21 |
rlandy|rover | https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario001-standalone&branch=master | 20:22 |
rlandy|rover | much more consistent failure | 20:22 |
rlandy|rover | https://review.opendev.org/c/openstack/tripleo-heat-templates/+/804896 merged august 20 | 20:22 |
dviroel|ruck | https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_e46/805280/3/check/tripleo-ci-centos-8-scenario001-standalone/e469d21/logs/undercloud/var/log/containers/glance/api.log this one passes and also has that glance error | 20:24 |
dviroel|ruck | there are other passing jobs that don't have this error, weird | 20:24 |
rlandy|rover | https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-8-scenario001-standalone&branch=master - prob most consistent | 20:26 |
rlandy|rover | https://review.opendev.org/c/openstack/tripleo-heat-templates/+/805541/ | 20:26 |
rlandy|rover | passes | 20:26 |
dviroel|ruck | ^ has a depends on that is failing on scenario001 https://review.opendev.org/c/openstack/puppet-tripleo/+/805540 | 20:29 |
rlandy|rover | yeah | 20:29 |
rlandy|rover | puppet-tripleo | 20:29 |
weshay|ruck | ok.. scenario001 tempest.. anyone on that or shall I write it up | 20:29 |
dviroel|ruck | rlandy|rover: but isn't failing on tempest :p | 20:30 |
rlandy|rover | weshay|ruck: lol | 20:31 |
rlandy|rover | dviroel|ruck: weshay|ruck: so here's what we know ... | 20:31 |
rlandy|rover | wallaby and master | 20:31 |
weshay|ruck | in the upstream gate it is.. in wallaby | 20:31 |
rlandy|rover | mostly TestSnapshotPattern | 20:31 |
weshay|ruck | perhaps two diff issues.. yes... | 20:32 |
weshay|ruck | snapshot | 20:32 |
dviroel|ruck | in wallaby i see tempest.scenario.test_volume_boot_pattern.TestVolumeBootPattern failing only | 20:32 |
dviroel|ruck | in master has tempest.scenario.test_snapshot_pattern.TestSnapshotPattern too | 20:32 |
weshay|ruck | k.. so.. create one bug, two different entries in tempest skip file | 20:33 |
rlandy|rover | dviroel|ruck: weshay|ruck: k - I'll write one bug | 20:33 |
rlandy|rover | let's start there | 20:33 |
weshay|ruck | ya | 20:33 |
rlandy|rover | we can edit | 20:33 |
rlandy|rover | at least we have a place to drop all this debug | 20:33 |
rlandy|rover | sec - coming up | 20:33 |
dviroel|ruck | +1 | 20:33 |
weshay|ruck | 2 wallaby hits, 1 master here: http://dashboard-ci.tripleo.org/d/Z4vLSmOGk/cockpit?orgId=1 | 20:34 |
abregman | hey. do we have any docs on how to add new jobs to component pipeline? | 20:41 |
weshay|ruck | abregman, zuul or jenkins | 20:42 |
abregman | k maybe a different question...where component pipeline jobs are running today? :) | 20:42 |
weshay|ruck | zuul and jenkins | 20:43 |
weshay|ruck | pick your poison | 20:43 |
rlandy|rover | dviroel|ruck: weshay|ruck: https://bugs.launchpad.net/tripleo/+bug/1940866 | 20:43 |
rlandy|rover | ^^ let's capture debug there | 20:43 |
weshay|ruck | ++ | 20:43 |
abregman | I honestly don't care. what is the "right" place? | 20:44 |
weshay|ruck | both | 20:44 |
rlandy|rover | abregman: depends if you want the job to run in zuul or just be triggered with the component line and report there | 20:44 |
weshay|ruck | depends on what you want to execute.. if you want to run an upstream job.. use zuul | 20:44 |
weshay|ruck | if you want to run something from p1/2/3 use jenkins | 20:44 |
dviroel|ruck | rlandy|rover: will create the skiplist patch | 20:44 |
rlandy|rover | dviroel|ruck++ thanks | 20:44 |
abregman | so probably Jenkins | 20:45 |
rlandy|rover | abregman: if you already have a jenkins job that does what you want | 20:45 |
rlandy|rover | then let's just trigger it and report | 20:45 |
rlandy|rover | if not, look into zuul jobs | 20:45 |
rlandy|rover | you can use attila's jobs as examples | 20:45 |
weshay|ruck | abregman, you'll want to use https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/pipeline/ as example.. | 20:46 |
weshay|ruck | ya.. what rlandy|rover said | 20:46 |
abregman | sounds good. I'll have a look. thank you both | 20:46 |
rlandy|rover | you will trigger off https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-integration-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-rhel-8-upload-job-trigger-rhos-17/56796a4/ | 20:46 |
weshay|ruck | np | 20:46 |
rlandy|rover | for example | 20:46 |
rlandy|rover | and then just report back to jenkins | 20:46 |
rlandy|rover | you need to match the hash under test | 20:46 |
rlandy|rover | and pick up the right set of containers | 20:47 |
weshay|ruck | I wouldn't start w/ 17 though.. they are still getting 17 working | 20:47 |
rlandy|rover | simple as that :) | 20:47 |
weshay|ruck | go 4 the stable branch luke | 20:47 |
abregman | will it be visible in tripleo-cockpit? | 20:47 |
rlandy|rover | yes | 20:47 |
weshay|ruck | abregman, yes | 20:47 |
abregman | cool | 20:47 |
rlandy|rover | https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-integration-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-rhel-8-upload-job-trigger-rhos-17/56796a4/ | 20:47 |
rlandy|rover | oops | 20:47 |
rlandy|rover | http://tripleo-cockpit.usersys.redhat.com/d/KyHCwLHMk/rhos-16-2-full-component-pipeline?orgId=1 | 20:47 |
rlandy|rover | if you scroll down | 20:47 |
rlandy|rover | you'll see the pipeline jenkins jobs there | 20:48 |
rlandy|rover | once you have the job name - we can make sure it's captured | 20:48 |
abregman | that's great. Easier than I imagined | 20:48 |
weshay|ruck | we're here to please | 20:48 |
rlandy|rover | abregman: attila is your new best friend here | 20:49 |
rlandy|rover | we'll provide containers and overcloud images you can use etc. | 20:50 |
rlandy|rover | abregman: when your job is ready, we add it to criteria | 20:50 |
rlandy|rover | and then it decides whether the component promotes or not | 20:51 |
rlandy|rover | done and done | 20:51 |
dviroel|ruck | rlandy|rover: so, should i skip tripleo-ci-centos-8-scenario001-standalone + both periodics? | 20:52 |
rlandy|rover | excellent question | 20:52 |
abregman | rlandy|rover: where the criteria is defined? in tripleo-environments repo? | 20:52 |
* rlandy|rover gets | 20:53 | |
weshay|ruck | dviroel|ruck, I think so for now.. yes.. all variants of scenario001 | 20:53 |
weshay|ruck | for those branches | 20:53 |
rlandy|rover | http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/ci-scripts/dlrnapi_promoter/config/RedHat-8/component | 20:53 |
abregman | great. I definitely know much more now. thanks again | 20:55 |
rlandy|rover | dviroel|ruck; thanks - voted | 21:02 |
dviroel|ruck | \o/ | 21:02 |
abregman | rlandy|rover: I see that Jenkins jobs are triggered every time there is a change in rhel8-osp16-2/component/network/component-ci-testing/commit.yaml | 21:09 |
abregman | rlandy|rover: is this the same also for Zuul? | 21:09 |
rlandy|rover | yes | 21:09 |
rlandy|rover | we trigger those lines once a day | 21:09 |
rlandy|rover | which changes component-ci-testing | 21:09 |
rlandy|rover | so from the zuul side, it's a time trigger | 21:10 |
abregman | rlandy|rover: so a periodic 24h trigger? but what/who updates the content of commit.yaml file? | 21:11 |
rlandy|rover | the first job in that line | 21:11 |
rlandy|rover | so it works like this ... | 21:12 |
rlandy|rover | taking networking component as an example | 21:12 |
abregman | rlandy|rover: k so the first job is triggered every 24 hours and then all the other jobs are triggered based on monitoring changes in commit.yaml? | 21:12 |
rlandy|rover | the first job gets triggered once a day (once a cycle) | 21:13 |
rlandy|rover | that job uses dlrn to move consistent to component-ci-testing | 21:13 |
rlandy|rover | if that job is successful, the rest of the zuul jobs run | 21:13 |
rlandy|rover | it's a zuul dependency | 21:14 |
rlandy|rover | jenkins jobs pick their trigger as you saw | 21:14 |
rlandy|rover | abregman: http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-jobs.git/tree/zuul.d/project-templates-components.yaml | 21:15 |
rlandy|rover | http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-config.git/tree/zuul.d/pipelines.yaml#n273 | 21:16 |
rlandy|rover | abregman: ^^ line trigger | 21:16 |
rlandy|rover | and the one above, the zuul dependencies | 21:16 |
abregman | aha..k. I need to write it all down somewhere | 21:16 |
rlandy|rover | abregman: we'll need to write this all up for a bunch of people I guess | 21:17 |
abregman | rlandy|rover: k one last question for today because I need to do some processing as well, I see the jobs are not running networking tests (only some basic tempest tests). Will it be fine to add more component-specific tests in the future? | 21:17 |
rlandy|rover | we should put together a hackmd or doc of sorts | 21:17 |
rlandy|rover | we have some of it here: | 21:18 |
rlandy|rover | abregman: to answer question above - YES please | 21:18 |
rlandy|rover | the more testing earlier, the better | 21:18 |
weshay|ruck | abregman, YES please add a better deployment + better tests | 21:18 |
rlandy|rover | lol | 21:18 |
weshay|ruck | think of the current jobs as HELLO-WORLD | 21:18 |
rlandy|rover | we are like stereo | 21:19 |
weshay|ruck | it's boiler plate.. | 21:19 |
weshay|ruck | but we don't know what is the right job or tests .. hence the subject matter expert | 21:19 |
rlandy|rover | https://docs.openstack.org/tripleo-docs/latest/ci/stages-overview.html#the-component-promotion-pipeline | 21:20 |
abregman | k great. will do. but first I need to write some notes, probably draw a little bit...just to understand the workflow here better | 21:20 |
rlandy|rover | abregman: ^^ that's upstream doc but the concept is the same | 21:21 |
abregman | great. I have enough docs now to process until the end of the year | 21:21 |
abregman | but seriously, thanks a lot. this is all very helpful | 21:22 |
rlandy|rover | end of the year is only a few weeks away :) | 21:22 |
rlandy|rover | in fact a few days | 21:23 |
weshay|ruck | lolz | 21:23 |
weshay|ruck | oh heeb humor | 21:23 |
abregman | good one :D | 21:23 |
-opendevstatus- NOTICE: The Gerrit service on review.opendev.org has been restarted for a patch version upgrade, resulting in a brief outage | 21:41 | |
rlandy|rover | weshay|ruck: hey - do you see arxcruz's open reviews? | 21:52 |
rlandy|rover | for fs001? | 21:52 |
arxcruz | rlandy|rover: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/804396 and https://review.opendev.org/c/openstack/tripleo-quickstart/+/804399 | 22:02 |
rlandy|rover | ah thanks looking | 22:03 |
arxcruz | rlandy|rover: i'll update jira tomorrow morning | 22:03 |
rlandy|rover | arxcruz: we should be able to merge https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/804396 w/o issue right | 22:04 |
rlandy|rover | arxcruz: which one needs to merge first? | 22:05 |
rlandy|rover | oh depends | 22:05 |
rlandy|rover | I see it | 22:05 |
rlandy|rover | arxcruz: ok - so here's what I'd like to do ... | 22:06 |
rlandy|rover | https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/804396 | 22:06 |
rlandy|rover | merge ^^ | 22:06 |
rlandy|rover | (should not impact w/o https://review.opendev.org/c/openstack/tripleo-quickstart/+/804399) | 22:07 |
rlandy|rover | and merge https://review.opendev.org/c/openstack/tripleo-quickstart/+/804399 tomorrow when ruck/rover can watch it | 22:07 |
rlandy|rover | ok? | 22:07 |
rlandy|rover | dviroel|ruck: weshay|ruck: ^^ fyi | 22:07 |
dviroel|ruck | ack | 22:15 |
* rlandy|rover back in a bit | 22:25 | |
*** dviroel|ruck is now known as dviroel|out | 22:37 | |
*** chem is now known as Guest5201 | 22:51 | |
weshay|ruck | rlandy|rover, ack | 22:59 |