*** dviroel is now known as dviroel|out | 00:09 | |
*** rlandy|bbl is now known as rlandy | 01:44 | |
rlandy | jm1[m]: added https://bugs.launchpad.net/tripleo/+bug/1988347 - fs064 bug - master and wallaby | 01:48 |
*** rlandy is now known as rlandy|out | 02:20 | |
*** ysandeep|out is now known as ysandeep | 05:36 | |
soniya29 | chandankumar, hello | 06:29 |
soniya29 | around? | 06:29 |
soniya29 | can you help me out with this patch:- https://code.engineering.redhat.com/gerrit/c/testproject/+/397264 | 06:29 |
soniya29 | c8 sc10 kvm internal standalone is failing on tempest | 06:31 |
jm1 | hello #oooq | 06:42 |
*** jm1|ruck is now known as jm1|rover | 06:42 | |
soniya29 | jm1, hello | 06:43 |
chandankumar | soniya29: checking | 06:44 |
soniya29 | chandankumar, thanks | 06:46 |
*** jpena|off is now known as jpena | 07:38 | |
soniya29 | pojadhav tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039 is failing with post_failure issue | 07:44 |
soniya29 | is this issue known/reported? | 07:44 |
*** pojadhav is now known as pojadhav|ruck | 07:44 | |
pojadhav|ruck | soniya29, jm1 has more context on this, but I can see some fs39 failures on integration lines, and some notes on the RR hackmd as well. | 07:51 |
pojadhav|ruck | jm1, did we report any bug against fs39 ? | 07:52 |
soniya29 | i dont see any cix against it | 07:52 |
pojadhav|ruck | need to check first whether it is consistent one or not | 07:53 |
soniya29 | https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039&skip=0 | 07:55 |
pojadhav|ruck | soniya29, the issue is not a consistent one, every time it fails with a different error | 07:55 |
soniya29 | pojadhav|ruck, ack | 08:00 |
jm1 | pojadhav|ruck, soniya29: we are having trouble with a lot of intermittent failures on c9 wallaby and c9 master. see rr notes in section "Intermittent Failures" for some ideas of what will be solved by a simple rerun | 08:03 |
soniya29 | jm1, well i have rechecked for 2 times :) | 08:04 |
jm1 | soniya29: 2 times is nothing, you can get worried after 5 times 🙂 no kidding, fs64 on c9 master had to be rekicked 5 times yesterday until it passed | 08:05 |
soniya29 | jm1, i have rechecked once more, let's see | 08:05 |
jm1 | soniya29: this one is known intermittent https://logserver.rdoproject.org/44/852844/2/openstack-check/tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039/3163945/job-output.txt | 08:06 |
soniya29 | jm1: ack | 08:06 |
jm1 | soniya29: did you rebase your patches on top of master? because this looks somehow familiar https://logserver.rdoproject.org/26/855126/3/openstack-check/tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset039/94c69f0/job-output.txt | 08:07 |
soniya29 | jm1, i have rebased the patch now, waiting for the new results | 08:10 |
jm1 | soniya29: it should pass eventually. we had a couple of passes of fs39 today | 08:11 |
soniya29 | jm1, that would be great | 08:11 |
jm1 | pojadhav|ruck, rlandy|out: we had a promotion of c9 master yesterday 🥳 fs64 finally passed. trying to do the same for c9 wallaby fs64 again.. it MUST pass at some point ^^ | 08:15 |
pojadhav|ruck | jm1, great | 08:15 |
jm1 | ysandeep, chandankumar: did you see this error before? any hint on what to do? https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-centos-9-push-master/ea8ed37/logs/report.html | 08:20 |
ysandeep | jm1, looking | 08:20 |
ysandeep | https://logserver.rdoproject.org/openstack-periodic-integration-main/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-build-containers-centos-9-push-master/ea8ed37/job-output.txt | 08:21 |
ysandeep | 2022-09-01 01:20:54.644089 | primary | "msg": "Depsolve Error occured: \n Problem: problem with installed package catatonit-3:0.1.7-7.el9.x86_64\n - package podman-2:4.2.0-3.el9.x86_64 conflicts with catatonit provided by catatonit-3:0.1.7-7.el9.x86_64\n - cannot install the best candidate for the job", | 08:21 |
jm1 | ysandeep: omg | 08:21 |
jm1 | ysandeep: thanks! | 08:21 |
jm1 | ysandeep, chandankumar: lets see if a rerun solves that | 08:22 |
ysandeep | fyi.. there was a bug related to catatonit few days back: https://bugs.launchpad.net/tripleo/+bug/1985981 | 08:22 |
ysandeep | https://review.opendev.org/c/openstack/tripleo-quickstart/+/853142 was temp workaround | 08:23 |
ykarel | hi, is the issue with undercloud upgrade known? | 08:24 |
ykarel | last 2 runs failed due to the same reason | 08:24 |
ykarel | https://zuul.opendev.org/t/openstack/builds?job_name=tripleo-ci-centos-9-undercloud-upgrade | 08:24 |
ykarel | jm1|rover, pojadhav|ruck ^ | 08:25 |
ysandeep | https://32f730b73845cbdfd0fc-4cf4bb096d4774f141f32d30757edf40.ssl.cf1.rackcdn.com/853411/2/check/tripleo-ci-centos-9-undercloud-upgrade/b6a6d75/logs/undercloud/home/zuul/undercloud_upgrade.log | 08:27 |
jm1 | ysandeep: thanks! | 08:27 |
ysandeep | hmm, another deps issue: Depsolve Error occurred: \n Problem: package os-net-config-15.2.1-0.20220629114404.6505f24.el9.noarch requires NetworkManager-ovs, but none of the providers can be installed\n - package NetworkManager-ovs-1:1.39.10-1.el9.x86_64 requires NetworkManager(x86-64) = 1:1.39.10-1.el9, but none of the providers can be installed\n - package NetworkManager-ovs-1:1.39.12-1.el9.x86_64 requires NetworkManager(x86-64) = 1:1.39.12-1.el9, but | 08:27 |
ysandeep | none of the providers can be installed\n - package NetworkManager-ovs-1:1.39.6-1.el9.x86_64 requires NetworkManager(x86-64) = 1:1.39.6-1.el9, but none of the providers can be installed\n | 08:27 |
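The depsolve failures being pasted here follow a fixed dnf message shape, so they can be triaged mechanically. A minimal sketch (the helper name is illustrative, not an existing CI tool) that pulls the package names out of a job log:

```python
import re

# Matches dnf depsolve problem lines like the ones pasted above, e.g.
#   "package X requires Y, but none of the providers can be installed"
#   "package X conflicts with Y provided by Z"
DEPSOLVE_RE = re.compile(
    r"package (?P<pkg>[\w.+:-]+) "
    r"(?:requires [\w().=: -]+?, but none|conflicts with [\w.+:-]+)"
)

def depsolve_packages(log_text):
    """Return the set of packages named in depsolve problems, so repeated
    failures can be grouped by package when filing or matching bugs."""
    return {m.group("pkg") for m in DEPSOLVE_RE.finditer(log_text)}
```

Grouping failures this way makes it easier to tell a one-off mirror hiccup from a recurring packaging bug.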
ykarel | yes, both runs failed with it ^ | 08:28 |
pojadhav|ruck | ykarel, this is not a known issue yet, new bug... | 08:28 |
ykarel | pojadhav|ruck, ack ok please check then | 08:29 |
pojadhav|ruck | reporting new bug against ^^ | 08:29 |
ykarel | may be due to mirror sync | 08:29 |
ysandeep | ykarel: yes, we hit a similar issue last month when there was a mirror sync issue: https://trello.com/c/b5yeyBus/2667-cixlp1984175tripleociproa-tripleo-ci-centos-9-undercloud-upgrade-cannot-install-both-networkmanager-11395-1el9x8664-and-networkm | 08:29 |
ykarel | ysandeep, ok then likely it's same | 08:31 |
ykarel | might reopen previous bug itself with fresh details | 08:31 |
jm1 | ysandeep, pojadhav|ruck: it passed 2 hours earlier, so i would simply... wait? | 08:32 |
jm1 | ykarel: ^ | 08:32 |
ykarel | jm1, i think it's still good to check what's still missing | 08:33 |
jm1 | ysandeep, pojadhav|ruck, ykarel: looks like waiting and rerunning is the most successful issue handling strategy we have for a while now... 😬 | 08:34 |
ykarel | if mirrors are up to date then no action needed | 08:34 |
ysandeep | the last 2 runs failed, which means we now have issues (if the latest job passed then okay), otherwise I would open a bug; most probably the content-provider issue you highlighted above is also down to the same cause (mirror sync) | 08:35 |
pojadhav|ruck | ysandeep, ykarel, jm1 : I have opened a bug https://bugs.launchpad.net/tripleo/+bug/1988397 if we have green runs then will close out. | 08:36 |
tosky | arxcruz: in fact that python-tempestconf change passed after a recheck, but I left a comment about... a removed comment | 08:39 |
ysandeep | ykarel, jm1 pojadhav|ruck I think we still have mirror issues:- | 08:42 |
ysandeep | job has .40 version installed | 08:42 |
ysandeep | https://32f730b73845cbdfd0fc-4cf4bb096d4774f141f32d30757edf40.ssl.cf1.rackcdn.com/853411/2/check/tripleo-ci-centos-9-undercloud-upgrade/b6a6d75/logs/undercloud/var/log/extra/package-list-installed.txt | 08:42 |
ysandeep | ~~~ | 08:42 |
ysandeep | NetworkManager.x86_64 1:1.40.0-1.el9 @baseos | 08:42 |
ysandeep | NetworkManager-libnm.x86_64 1:1.40.0-1.el9 @baseos | 08:42 |
ysandeep | ~~~ | 08:42 |
ysandeep | I don't see .40 rpms for NetworkManager on rackspace mirror: http://dfw.mirror.rackspace.com/centos-stream/9-stream/AppStream/x86_64/os/Packages/ | 08:42 |
ykarel | ysandeep, Thanks for checking | 08:44 |
ykarel | facebook mirror looks good | 08:46 |
ykarel | rackspace one not updated for 10 hours | 08:47 |
ykarel | http://dfw.mirror.rackspace.com/centos-stream/timestamp.txt | 08:47 |
jm1 | ysandeep, ykarel: and neither on http://mirror.mtl01.iweb.opendev.org/centos-stream/9-stream/AppStream/x86_64/os/Packages/ | 08:49 |
jm1 | ysandeep, ykarel: ^ this has not been updated since 2022-08-23? | 08:50 |
ykarel | yes ^ syncs from mirror which is rax currently so should be same | 08:50 |
ykarel | http://mirror.mtl01.iweb.opendev.org/centos-stream/timestamp.txt | 08:50 |
ykarel | 2022-09-01T06:08:45,769043757+00:00 | 08:50 |
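The manual timestamp comparison above is easy to script. A small sketch (assuming the timestamp.txt format shown here, with a nanosecond field after a comma; the function name is made up) that computes a mirror's age:

```python
from datetime import datetime, timezone

def mirror_age_hours(stamp, now=None):
    """Age in hours of a CentOS Stream mirror, from the contents of its
    timestamp.txt, e.g. '2022-09-01T06:08:45,769043757+00:00'.
    The nanosecond field after the comma is dropped, since datetime
    cannot portably parse 9 fractional digits."""
    base, _, rest = stamp.strip().partition(",")
    offset = rest[9:]  # skip the 9 nanosecond digits, keep '+00:00'
    ts = datetime.fromisoformat(base + offset)
    now = now or datetime.now(timezone.utc)
    return (now - ts).total_seconds() / 3600.0
```

Comparing the result across mirrors shows at a glance which one is lagging.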
jm1 | ykarel: ack, thanks! | 08:51 |
jm1 | pojadhav|ruck: added your bug with description to rr notes, thanks for logging it! | 09:04 |
pojadhav|ruck | jm1, ack | 09:04 |
* pojadhav|ruck lunch | 09:05 | |
jm1 | ysandeep, pojadhav|ruck, rlandy|out: having issues with rr notes "Sorry, you've reached the max length this note can be. Please reduce the content or divide it to more notes, thank you!" 😱 | 09:46 |
ysandeep | time to archive :D | 09:46 |
jm1 | limit seems to be 100,000 chars | 09:46 |
jm1 | ysandeep: good that we have thursday | 09:46 |
jm1 | ysandeep, rlandy|out, pojadhav|ruck: hackmd forces me to stop working 🙂 | 09:47 |
ysandeep | jm1, pojadhav|ruck Are you writing a book on ruck rover this week :D | 09:47 |
pojadhav|ruck | ysandeep, jm1 : I made very short updates.. this book was written by jm1 :D | 09:48 |
jm1 | pojadhav|ruck: let me clean up the old days... | 09:49 |
pojadhav|ruck | jm1, sure, maybe today's and yesterday's upstream updates are enough for the RR hand-off | 09:50 |
jm1 | pojadhav|ruck: left today in rr notes and moved the rest to new rr doc | 09:55 |
jm1 | pojadhav|ruck: rr stuff until yesterday is linked in "Previous RR notes" | 09:55 |
pojadhav|ruck | jm1, ack | 09:55 |
jm1 | pojadhav|ruck: the doc is much more responsive now ^^ | 09:56 |
jm1 | ysandeep: regarding the book: i logged the reasons for all intermittent failures in rr notes to track down real issues. a couple of those intermittent failures are happening REALLY often | 09:58 |
pojadhav|ruck | jm1, thats great ! | 09:58 |
ysandeep | sure, whatever works for rr tag team :) | 09:59 |
pojadhav|ruck | ysandeep, do we need CIX card for https://bugs.launchpad.net/tripleo/+bug/1988397 ?? | 10:04 |
ysandeep | If its still happening and blocking patches from getting merged then yes | 10:05 |
pojadhav|ruck | ack thank you! | 10:06 |
jm1 | pojadhav|ruck: i would leave it open because we have other c9 master and c9 wallaby jobs which might be failing for the same bug (on different packages though) | 10:10 |
* jm1 lunch | 10:10 | |
ysandeep | ykarel|lunch, chandankumar: the metadata issue is back downstream. Could you please review https://review.rdoproject.org/r/c/config/+/44717 and https://review.rdoproject.org/r/c/config/+/44718, which enable the config drive in our jobs for the PSI env. | 10:21 |
ysandeep | reviewbot, please add in review list: https://review.rdoproject.org/r/q/topic:enable_config_downstream | 10:22 |
reviewbot | I could not add the review to Review List | 10:22 |
*** rlandy|out is now known as rlandy | 10:31 | |
rlandy | jm1: lol - I never knew hackmd would be the limiting factor | 10:32 |
rlandy | jm1: pojadhav|ruck: want to sync quickly? | 10:36 |
rlandy | we will need to resync with new rr later | 10:37 |
pojadhav|ruck | rlandy, sure | 10:37 |
rlandy | ysandeep: can I just merge https://review.rdoproject.org/r/q/topic:enable_config_downstream? | 10:38 |
rlandy | still testing/ danger? | 10:39 |
ysandeep | rlandy, I have tested locally and it works (I was not able to test using testproject with pre-run, as ovb pre triggers first). I have requested review from ykarel - let's wait for his review. | 10:40 |
rlandy | oh I see waiting for ykarel to review | 10:40 |
ysandeep | rlandy, yes | 10:41 |
rlandy | ysandeep: also https://review.rdoproject.org/r/c/config/+/44717/4/roles/ovb-manage/tasks/find_undercloud_uuid.yml would run everywhere | 10:41 |
ysandeep | rlandy, yes we need that to run everywhere because our C9 image is virt-customize build currently and we want to mount config-drive there. | 10:42 |
rlandy | pojadhav|ruck: waiting for jm1 | 10:44 |
pojadhav|ruck | ack | 10:44 |
rlandy | ysandeep: chandankumar: let's meet for a few re: travel | 10:51 |
ysandeep | ack | 10:52 |
rlandy | https://meet.google.com/gux-rwxv-wwg?pli=1&authuser=0 | 10:52 |
* soniya29 tea break | 10:54 | |
ysandeep | chandankumar: you around o/ | 10:55 |
chandankumar | ysandeep: sorry I was afk | 11:07 |
chandankumar | ysandeep: still on call? | 11:07 |
ysandeep | chandankumar, yes | 11:07 |
ysandeep | please jump on | 11:07 |
Tengu | hello there! apparently there's once again a mirror issue or something? Problem: package os-net-config-15.2.1-0.20220629114404.6505f24.el9.noarch requires NetworkManager-ovs, but none of the providers can be installed | 11:11 |
Tengu | full trace: https://6831567f6cd28d1c548e-a47ddc14dbf9c2ea3d1835942b1575b6.ssl.cf1.rackcdn.com/854360/4/check/tripleo-ci-centos-9-undercloud-upgrade/5583fdc/logs/undercloud/home/zuul/undercloud_upgrade.log | 11:11 |
ysandeep | Tengu, yes mirror issues again, it's known | 11:13 |
Tengu | ok! | 11:14 |
jm1 | rlandy, pojadhav|ruck: ready for sync | 11:15 |
jm1 | chandankumar, arxcruz: as i was creating rr notes anyway, i have created a new rr doc for you and added links to rr status page https://hackmd.io/0RguTgCAQsiZz_SnF5209A | 11:17 |
rlandy | jm1: pojadhav|ruck: ok - ... https://meet.google.com/ybb-ddea-xmo?pli=1&authuser=0 - should be quick | 11:18 |
arxcruz | jm1 oh, great, thanks!!! | 11:18 |
ysandeep | chandankumar, could you please +w https://review.rdoproject.org/r/c/config/+/44718 | 11:20 |
ysandeep | chandankumar++ | 11:21 |
ysandeep | pojadhav|ruck, jm1 fyi.. please let me know in case any ovb job fails on ovb stack setup.. we have merged https://review.rdoproject.org/r/q/topic:enable_config_downstream to fix downstream jobs. | 11:23 |
pojadhav|ruck | rlandy, fyi https://bugs.launchpad.net/tripleo/+bug/1988397 | 11:25 |
*** dviroel|out is now known as dviroel | 11:27 | |
jm1 | pojadhav|ruck: did the one job which required rebooting the bm node pass? | 11:27 |
jm1 | ysandeep: awesome, thanks :) | 11:28 |
pojadhav|ruck | jm1, it is still running | 11:28 |
pojadhav|ruck | jm1, jfyi https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/status#276230 | 11:29 |
pojadhav|ruck | chandankumar, rlandy : allow list discussion | 11:30 |
chandankumar | joining | 11:30 |
rlandy | one sec | 11:31 |
jm1 | pojadhav|ruck: ah! atm it is deploying the overcloud, so rebooting the node seems to have "solved" the issue, great ^^ | 11:32 |
jm1 | pojadhav|ruck: need help with anything? | 11:32 |
pojadhav|ruck | jm1, nope.. will let you know.. thanks! | 11:33 |
jm1 | pojadhav|ruck: okidoki :) then i will chase some bugs | 11:34 |
ysandeep | jm1, run that job again to crosscheck.. I wonder if you will hit the same issue. | 11:34 |
ysandeep | regarding bm | 11:34 |
jm1 | ysandeep: which job? pojadhav|ruck's bm job which is still wip? | 11:35 |
ysandeep | jm1, yes | 11:35 |
jm1 | pojadhav|ruck: ^ ;) | 11:36 |
pojadhav|ruck | jm1, ysandeep : ack | 11:36 |
* pojadhav|ruck tea break.. brb in few mins | 11:48 | |
*** ysandeep is now known as ysandeep|afk | 11:51 | |
sshnaidm | I have standalone and multinode Centos9 jobs failing on the ansible-podman repo; they run on RDO. Was there any change to centos9 nodesets recently? Like from yesterday. https://review.rdoproject.org/zuul/builds?pipeline=github-check&skip=0 | 11:59 |
sshnaidm | Or new image of centos9 built on RDO for standalone/multinode nodeset, because OVB is fine | 12:01 |
* sshnaidm is trying to recall what can break | 12:01 | |
rlandy | there were reviews up to rebuild those nodes using dib | 12:04 |
reviewbot | Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. | 12:04 |
sshnaidm | rlandy, the error: "Error: invalid policy in \"/etc/containers/policy.json\": Unknown key \"keyPaths\"" | 12:07 |
sshnaidm | probably it has prebaked incompatible /etc/containers/policy.json | 12:08 |
sshnaidm | for podman | 12:08 |
sshnaidm | rlandy, can someone take a look? | 12:09 |
rlandy | sshnaidm: yeah - sorry - was on another channel | 12:09 |
rlandy | let me get you those reviews | 12:09 |
reviewbot | Do you want me to add your patch to the Review list? Please type something like add to review list <your_patch> so that I can understand. Thanks. | 12:09 |
rlandy | dasm|off: pls see #rhos-ops when you are in - we're taking down zuul - questioning latest rr tool patches querying zuul | 12:10 |
ykarel | jm1, ysandeep|afk rax mirror still not refreshed, pushed the revert https://review.opendev.org/c/opendev/system-config/+/855457 | 12:10 |
rlandy | sshnaidm: this was the ticket: https://issues.redhat.com/browse/RHOSZUUL-701 - doesn't look completed though | 12:15 |
sshnaidm | dpawlik, hey, did you do changes to centos9 images according to this ticket? ^^ | 12:16 |
sshnaidm | rlandy, ack, thanks, let's see who can fix it.. | 12:16 |
dpawlik | sshnaidm: today I updated centos 9 stream images to be up to date | 12:17 |
dpawlik | but for upstream-centos-9-stream image, we are still working on it | 12:17 |
sshnaidm | dpawlik, now we know whom to blame! :D | 12:17 |
dpawlik | o lol | 12:17 |
dpawlik | whats happening? | 12:17 |
sshnaidm | dpawlik, kidding, can you please see the problem above? | 12:17 |
rlandy | dpawlik: you're having quite an EoD week :) | 12:18 |
sshnaidm | dpawlik, I run standalone/multinode on RDO and it started to fail from yesterday: "Error: invalid policy in \"/etc/containers/policy.json\": Unknown key \"keyPaths\"" | 12:18 |
sshnaidm | dpawlik, probably wrong version of /etc/containers/policy.json for podman in image | 12:18 |
sshnaidm | dpawlik, logs: https://review.rdoproject.org/zuul/builds?pipeline=github-check&skip=0 | 12:19 |
sshnaidm | dpawlik, we use centos9 image from RDO nodeset and run upstream job on it | 12:19 |
dpawlik | but | 12:19 |
dpawlik | images has been updated today | 12:19 |
dpawlik | and upstream centos 9 stream image is not available yet | 12:19 |
dpawlik | due https://review.opendev.org/c/openstack/diskimage-builder/+/855014 | 12:19 |
dpawlik | so the issue is mostly in podman, not in image :) | 12:20 |
dpawlik | but similar issue I see on IBM instance | 12:20 |
sshnaidm | hmm, I'd double check since this started to happen only on centos9 images in RDO, even OVB images in RDO work fine (it's a different nodeset) | 12:21 |
dpawlik | s/has/have been updated today | 12:21 |
sshnaidm | dpawlik, the problem is only with standalone/multinode nodeset | 12:21 |
dpawlik | o | 12:21 |
dpawlik | < we should use upstream DIB base images > | 12:22 |
sshnaidm | dpawlik, iirc this is what's used: https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/nodesets.yaml#L552-L563 | 12:24 |
sshnaidm | "faked" | 12:24 |
sshnaidm | probably you don't fake it anymore :) | 12:25 |
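For reference, the `Unknown key "keyPaths"` error above most likely means the image ships an /etc/containers/policy.json written for a newer containers stack than the podman libraries being run against it. As a point of comparison, a minimal policy that old and new podman alike can parse (a default-accept sketch for debugging, not a security recommendation) looks like:

```json
{
  "default": [
    {
      "type": "insecureAcceptAnything"
    }
  ]
}
```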
rlandy | hmmm ... we're still a no show on OVB stacks in downstream ... Resource CREATE failed: WaitConditionTimeout: resources.baremetal_env.resources.bmc.resources.bmc_wait_condition: 0 of 1 received | 12:28 |
rlandy | checking merge of bmc template | 12:28 |
rlandy | bmc_image: bmc-template | 12:29 |
rlandy | still using old bmc template | 12:29 |
rlandy | pojadhav|ruck: ^^ note diff OVB failure | 12:30 |
pojadhav|ruck | rlandy, ack | 12:31 |
bhagyashris | rlandy, hi, let me know when you are free to talk about 17.1, or maybe we can talk in scrum | 12:33 |
rlandy | bhagyashris: hey - any movement on that bug to unblock you? | 12:33 |
rlandy | just looking at downstream OVB | 12:33 |
rlandy | needs debug | 12:33 |
rlandy | let's chat at scrum | 12:34 |
bhagyashris | bogdando submitted this patch https://code.engineering.redhat.com/gerrit/c/openstack-tripleo-heat-templates/+/426525 but this is not fixing the issue | 12:34 |
bhagyashris | i talked with him on rhos-dev | 12:35 |
bhagyashris | so what i got to know that | 12:35 |
bhagyashris | we will not allow fresh install of 17.1 on rhel 8 | 12:36 |
bhagyashris | that is not something we will support | 12:36 |
bhagyashris | the only way to get to 17.1 on rhel8 is to deploy with 16.2 and upgrade | 12:36 |
bhagyashris | there is to be no scale out or fresh install of 17.1 on rhel 8.4 hosts | 12:36 |
bhagyashris | so i am not sure about the exact plan | 12:37 |
bhagyashris | you can check <sean-k-mooney> messages on rhos-dev | 12:37 |
*** ysandeep|afk is now known as ysandeep | 12:38 | |
rlandy | bhagyashris: ok - let's talk at scrum about the right way to go here | 12:40 |
rlandy | pojadhav|ruck: this needs debug | 12:40 |
rlandy | I am looking at the BMC console | 12:40 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/logs/72/190672/275/check/periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-rhos-17.1/5aba9c7/logs/bmc_275_90843-console.log | 12:40 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/logs/72/190672/275/check/periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-rhos-17.1/5aba9c7/logs/failed_ovb_stack.log | 12:40 |
pojadhav|ruck | rlandy, ^^ for these logs, right ?? | 12:41 |
rlandy | pojadhav|ruck: look at failed OVB jobs | 12:41 |
rlandy | see the times | 12:41 |
rlandy | cross check OVB stack failure | 12:41 |
pojadhav|ruck | ok | 12:41 |
dpawlik | should we hold the node and check what's going on? | 12:43 |
rlandy | sorry - was on something else | 12:50 |
rlandy | looking at console of passed job | 12:50 |
ysandeep | rlandy, https://sf.hosted.upshift.rdu2.redhat.com/logs/58/418658/21/check/periodic-tripleo-ci-rhel-9-ovb-3ctlr_1comp-featureset001-internal-rhos-17.1/b9302c9/logs/bmc_21_9836-console.log | 12:51 |
* ysandeep debugging failure | 12:51 | |
rlandy | yep - comparing | 12:52 |
*** dasm|off is now known as dasm | 12:54 | |
dasm | o/ | 12:54 |
dasm | rlandy: will check it in a few | 12:54 |
dasm | wrt killikg zuul | 12:54 |
ysandeep | rlandy, fyi.. disabling ovb cleanup script in downstream to debug bmc failure | 12:55 |
rlandy | dasm: pls chat to dpawlik and see conversation on #rhos-ops | 13:00 |
pojadhav|ruck | rlandy, akahat, chandankumar, ysandeep : scrum | 13:01 |
dasm | jm1: where are we hosting telegraf which runs all queries to zuul? | 13:05 |
jm1 | dasm: dashboard-ci.tripleo.org | 13:11 |
jm1 | dasm: and downstream cockpit as well | 13:11 |
chandankumar | https://review.opendev.org/q/topic:tripleo-ansible-ee | 13:13 |
dasm | jm1: ack, thx. i accessed both locations. | 13:14 |
chandankumar | https://review.opendev.org/c/openstack/tripleo-ci/+/843836/25/zuul.d/standalone-jobs.yaml | 13:14 |
bhagyashris | rlandy, here is test project patch https://code.engineering.redhat.com/gerrit/c/testproject/+/424992 logs https://sf.hosted.upshift.rdu2.redhat.com/logs/92/424992/27/check/periodic-tripleo-ci-rhel-8-standalone-rhos-17.1/d96df59/logs/undercloud/home/zuul/standalone_deploy.log here is definition patch https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/424293 | 13:15 |
bhagyashris | fixed is here https://code.engineering.redhat.com/gerrit/c/openstack-tripleo-heat-templates/+/426525 | 13:15 |
ysandeep | rlandy, recent run passed stack create: https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/stream/1c30dad6bbb54e038df97d93d3da9897?logfile=console.log | 13:15 |
ysandeep | rlandy, may be transient issue earlier | 13:15 |
rlandy | ysandeep: good sign | 13:16 |
rlandy | maybe it took a while to get everything in sync | 13:16 |
ysandeep | re-enabling stack cleanup script | 13:16 |
chandankumar | rlandy: https://review.opendev.org/c/openstack/tripleo-ci/+/850736 please +w it | 13:16 |
ysandeep | rlandy, probably | 13:17 |
bhagyashris | new patch fix for upstream : https://review.opendev.org/c/openstack/tripleo-heat-templates/+/855503 | 13:17 |
jm1 | dasm: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/43307/28#message-d0baab4ed36f3b27cafda6e76f1cc74261bb019d | 13:19 |
dasm | jm1: dpawlik mentioned that the issue started before 07/18 | 13:20 |
dasm | my change probably exposed that issue | 13:20 |
jm1 | dasm: my comment is unrelated to zuul crashes | 13:21 |
dasm | jm1: answering your question: it's fetching ZUUL_JOBS_LIMIT because i'm querying zuul jobs in batches. if the number were variable based on the number of jobs, we would miss responses. | 13:21 |
dasm | we could end up with duplicated jobs, because eg. job1, job1, job1 ran 3x and not any other | 13:22 |
dasm | it's tricky. i'm gonna look into that one more time tho | 13:22 |
jm1 | dasm: but its running once for all jobs | 13:22 |
dasm | jm1: what do you mean? | 13:23 |
jm1 | dasm: the call done by this requests.get is e.g. curl 'https://review.rdoproject.org/zuul/api/builds?job_name=periodic-tripleo-ci-centos-9-standalone-baremetal-master&job_name=periodic-tripleo-ci-centos-9-scenario012-standalone-baremetal-master&job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-baremetal-master&job_name=periodic-tripleo-ci-centos-9-scenario000-multinode-oooq-container-updates-baremetal-master&limit=1000' | 13:25 |
rlandy | chandankumar: question on https://review.opendev.org/c/openstack/tripleo-ci/+/850736 | 13:26 |
jm1 | dasm: but maybe you wanted to get 1000 results per job? | 13:27 |
soniya29 | https://review.opendev.org/q/topic:tempest_allow_list | 13:28 |
dasm | jm1: no, i'm actually asking for "all jobs last 1000 results" not "one job 1000 results". | 13:28 |
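The batched call jm1 quotes can be put together like this; a sketch (the helper name is illustrative) of building one builds-API request covering many jobs:

```python
from urllib.parse import urlencode

ZUUL_API = "https://review.rdoproject.org/zuul/api/builds"

def batched_builds_url(job_names, limit=1000):
    """Build one query for the last `limit` builds across all of the given
    jobs -- a single request, rather than one request per job, which is the
    batching described above. Repeating the job_name parameter is how the
    curl example quoted earlier passes multiple jobs."""
    params = [("job_name", name) for name in job_names]
    params.append(("limit", str(limit)))
    return ZUUL_API + "?" + urlencode(params)
```

One batched request per interval is also what keeps the load on zuul low compared with the old query-per-job approach.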
rlandy | chandankumar: if that is ok for all jobs - ie: having the stream9 image on 8 jobs etc. | 13:28 |
ysandeep | closed config drive epic :D | 13:28 |
rlandy | then I can merge | 13:28 |
pojadhav|ruck | https://review.opendev.org/q/topic:tempest_allow_list | 13:28 |
dasm | jm1: in the past we were issuing query per job | 13:28 |
jm1 | dasm: ok, great! | 13:30 |
* dviroel coffee | 13:30 | |
chandankumar | rlandy: yes, it will be only used in tripleo-ee container | 13:31 |
chandankumar | it is not going to affect anything else | 13:31 |
tosky | arxcruz: sooo, for that python-tempestconf patch :) | 13:51 |
arxcruz | tosky sorry, I forgot to check, let me see it | 13:52 |
arxcruz | tosky lol, it was me who wrote the patch, let me fix it, I might have removed it by accident | 13:53 |
tosky | arxcruz: thanks | 13:55 |
arxcruz | tosky done | 13:55 |
akahat | Mixed OS compute job: https://review.rdoproject.org/r/c/rdo-jobs/+/44415, https://review.rdoproject.org/r/c/testproject/+/44499, https://review.opendev.org/c/openstack/tripleo-quickstart/+/853860 | 13:57 |
akahat | folks please take a look when you are free. Thanks :) | 13:57 |
arxcruz | tosky i'm assuming that was your only complaint ;) | 14:09 |
tosky | arxcruz: yes, it was :) | 14:12 |
*** jm1|ruck is now known as jm1|rover | 14:13 | |
arxcruz | tosky great :) | 14:17 |
pojadhav|ruck | rlandy, bm job running here https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/status#20987 | 14:19 |
rlandy | ty | 14:20 |
*** ysandeep is now known as ysandeep|afk | 14:22 | |
*** pojadhav|ruck is now known as pojadhav|afk | 14:23 | |
tosky | chandankumar: if you have some time for https://review.opendev.org/c/openinfra/python-tempestconf/+/849127, it passed in the previous review and the review being tested now just adds a few comments | 14:30 |
dasm | dviroel: jm1 can you +2+W ? https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44724 | 14:32 |
dasm | let's make it work again | 14:32 |
rlandy | bhagyashris: mind if I edit your patch? | 14:35 |
jm1 | dasm: did you test it in cockpit? | 14:38 |
dasm | jm1: yes, dpawlik said no issues with zuul atm | 14:41 |
dasm | upstream cockpit | 14:41 |
dasm | we need to have the same on downstream one | 14:41 |
rlandy | openstack-tripleo-heat-templates.noarch 14.3.1-1.20220829144516.f7e97cb.el8osttrunk @delorean-component-tripleo | 14:42 |
rlandy | let's see if this works with promoted content | 14:43 |
tosky | chandankumar: thanks! | 14:49 |
rlandy | bhagyashris: rebuilding containers with new content | 14:55 |
rlandy | job is in rerun | 14:55 |
bhagyashris | rlandy, np you can edit | 15:04 |
*** pojadhav|afk is now known as pojadhav|ruck | 15:06 | |
dviroel | dasm: W+1 | 15:10 |
dasm | dviroel: ack, thx. dpawlik we're good to go | 15:10 |
*** ysandeep|afk is now known as ysandeep | 15:21 | |
ysandeep | dasm, dviroel should we also tune interval if each run takes ~100 mins... https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44724/1#message-fbd3542bcb0f2dfe9f96ab95374b389a97cbf23c | 15:22 |
dasm | ysandeep: why would it run 100 minutes? | 15:24 |
dasm | it's a simple API query | 15:24 |
dviroel | dasm: because of the sleep time | 15:24 |
ysandeep | dasm, I am reading dviroel comment - If we consider the average time, it will take ~100 minutes to finish. | 15:24 |
dasm | it's not one after another. it starts all of jobs at once | 15:25 |
dasm | that's enough to not overwhelm zuul | 15:25 |
dasm | htat's enough to not overwhelm zuul | 15:25 |
dasm | does it make sense? | 15:26 |
ysandeep | dasm, yeah.. looks like exec plugin executes all the commands in parallel | 15:27 |
ysandeep | I am reading docs atm: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/exec/README.md#exec-input-plugin | 15:27 |
dviroel | dasm: hum, correct, they run in parallel, so my sentence is not correct there | 15:27 |
dasm | ysandeep: that was the main issue with queries. all of them started at once, causing zuul to sweat | 15:28 |
dasm | ++ | 15:28 |
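The "spread it across a few seconds" idea dasm describes can be sketched in a few lines (illustrative only, not the actual telegraf mechanism, which applies a random sleep inside each exec command):

```python
import random
import threading
import time

def start_with_jitter(tasks, max_jitter=5.0):
    """Launch all tasks in parallel, but delay each by a random
    0..max_jitter seconds so N pollers don't hit the API at the same
    instant -- enough to avoid overwhelming the server while leaving
    the overall polling interval unchanged."""
    def run(task):
        time.sleep(random.uniform(0.0, max_jitter))
        task()
    threads = [threading.Thread(target=run, args=(t,)) for t in tasks]
    for t in threads:
        t.start()
    return threads
```

The jitter only shifts start times; the total work per interval stays the same, which is why it doesn't change the ~100-minute math discussed above.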
rlandy | jm1: guess what??? | 15:37 |
rlandy | periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-wallaby "success": true, | 15:38 |
rlandy | rekicked periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-wallaby | 15:39 |
rlandy | oops | 15:39 |
rlandy | https://review.rdoproject.org/r/c/testproject/+/44662 | 15:40 |
rlandy | for fs001 and fs035 | 15:40 |
*** ysandeep is now known as ysandeep|out | 15:40 | |
ysandeep|out | see you all tomorrow o/ | 15:40 |
dviroel | o/ | 15:40 |
dasm | ysandeep|out: o/ | 15:41 |
jm1 | rlandy: periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-wallaby is still wip? | 15:45 |
rlandy | wip? | 15:53 |
rlandy | in what way? | 15:53 |
* dviroel lunch | 15:59 | |
*** dviroel is now known as dviroel|lunch | 15:59 | |
jm1 | rlandy: the job is still running. not sure why you get success true | 16:02 |
jm1 | rlandy: https://review.rdoproject.org/zuul/status/change/44661,8 | 16:03 |
jm1 | rlandy: last pass of that job was 2 days ago https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp_1supp-featureset064-wallaby&skip=0 | 16:04 |
rlandy | https://review.rdoproject.org/zuul/stream/b8fb1de2e9df42128a4024000ca052a2?logfile=console.log | 16:05 |
jm1 | rlandy: oh, it's uploading logs and has passed tempest 😱 | 16:06 |
rlandy | jm1: I know - do the happy dance | 16:06 |
jm1 | rlandy: a promotion of c9 wallaby, our present for chandankumar and arxcruz ;) | 16:07 |
rlandy | jm1++ yep | 16:07 |
arxcruz | jm1 thanks a lot! I like that very much | 16:07 |
jm1 | rlandy, arxcruz: it took 10 runs until it passed 🙂 | 16:09 |
rlandy | jm1: as we used to say in south africa ... https://mymemory.translated.net/en/Afrikaans/English/aanhouer-wen | 16:11 |
jm1 | rlandy, arxcruz: it has been gathering logs for 50 minutes now. not sure whether it's failing again... | 16:11 |
rlandy | dlrn reported | 16:12 |
arxcruz | yeah, if dlrn reports, the logs don't matter | 16:12 |
jm1 | rlandy, arxcruz: oh ok great | 16:12 |
rlandy | http://promoter.rdoproject.org/promoter_logs/container-push/20220901-153339.log | 16:12 |
arxcruz | although i would check with the rdo team about the logs | 16:12 |
rlandy | promoting | 16:12 |
jm1 | arxcruz: go ahead if you want ;) i am eod now | 16:12 |
jm1 | rlandy: with c9 wallaby being promoted i will be eod | 16:13 |
rlandy | jm1: have a good night | 16:13 |
rlandy | I am just watching bhagyashris's work | 16:13 |
rlandy | after that will look into ovb rerun | 16:13 |
rlandy | for downstream components | 16:13 |
jm1 | rlandy: ok thanks! it looks like downstream is making more trouble than upstream this week | 16:14 |
rlandy | like having twins - each one takes a turn to make you crazy | 16:14 |
jm1 | rlandy: 😄 | 16:14 |
rlandy | my sister has twins | 16:14 |
jm1 | rlandy: not sure i could handle that for more than a week ;) | 16:15 |
rlandy | lol -she has 5 other kids | 16:15 |
jm1 | rlandy: oh wow... we were 5 kids in total and even that was a lot of fun.. | 16:16 |
* jm1 out for today, have a nice evening! | 16:16 | |
arxcruz | rlandy 7 in total? | 16:24 |
arxcruz | jesus christ... | 16:24 |
rlandy | arxcruz: yep | 16:24 |
rlandy | bhagyashris: so I think this is fixed | 16:32 |
rlandy | error is later now | 16:32 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/build/7aaac9ad36f8402682dbe9d1aab9cf48 | 16:38 |
*** jpena is now known as jpena|off | 16:50 | |
rlandy | lunch - brb | 17:11 |
*** dviroel|lunch is now known as dviroel | 17:18 | |
dviroel | dasm: i think that you can vote now https://softwarefactory-project.io/r/c/config/+/25973 | 17:33 |
dviroel | :) | 17:33 |
dasm | checking | 17:33 |
dasm | > Forbidden | 17:34 |
dasm | lemme try relogging | 17:34 |
dasm | nope, clean browser, the same | 17:35 |
dasm | dviroel: not so "fun fact" i'm getting emails from softwarefactory about updates. | 17:36 |
dasm | and i can see it when not logged in. but after logging in -- Forbidden | 17:36 |
dviroel | dasm: weird, i see that apavec added you to the reviewer list - maybe something is still missing on your account, you should ping him | 17:38 |
dasm | i tried that yesterday | 17:41 |
dasm | the same outcome | 17:41 |
dasm | dviroel: i can access it now. However it doesn't change that I'm not an expert in this particular area :D | 17:46 |
dviroel | dasm: of course you are | 17:49 |
dasm | dviroel: True. We all are experts. In everything :) | 17:50 |
dviroel | dasm: yeah, nothing that you can't learn in 5min reading ;) | 17:51 |
rlandy | is the cockpit stuck again | 18:29 |
dviroel | looks like it is working | 18:32 |
dviroel | rlandy: you don't see updates there? | 18:32 |
dviroel | maybe it's missing new data | 18:33 |
rlandy | dviroel: cockpit is ok - w c9 pushed containers but didn't promote | 18:34 |
rlandy | will look at that after meetings | 18:34 |
dviroel | hum | 18:35 |
rlandy | dviroel: http://promoter.rdoproject.org/promoter_logs/centos9_wallaby_2022-09-01T16:19.log | 18:56 |
rlandy | says 1 hash promoted | 18:56 |
rlandy | http://dashboard-ci.tripleo.org/d/HkOLImOMk/upstream-and-rdo-promotions?orgId=1 - says 3 days ago | 18:57 |
rlandy | dasm: ^^ per our conversation earlier | 18:57 |
rlandy | looks like that promo was not picked up | 18:57 |
rlandy | also w c8 promoted today | 18:57 |
rlandy | so cockpit is out of date | 18:57 |
rlandy | dviroel: rcastillo|rover: dasm: only one review on review hackmd - needs infra w+ | 18:58 |
rlandy | so I think we can skip review time | 18:58 |
dviroel | rlandy: yeah | 19:01 |
rlandy | ok | 19:01 |
rlandy | so ... dasm .. think we have your first cockpit task :) | 19:01 |
dasm | i see some issue in logs. dpawlik our recent change doesn't work | 19:03 |
dasm | > Error in plugin: exec: exit status 1 for command 'sleep $((RANDOM % 180)); ruck_rover.py --influx --release wallaby --distro centos-9 --component cloudops': sleep: unrecognized option '--influx' | 19:03 |
dasm | interesting, because there is no issue when running the same thing manually | 19:04 |
dasm | i need to see why it's doing that | 19:04 |
rlandy | dpawlik is probably end of day now | 19:11 |
dasm | he's definitely eod. i wanted to give him some heads up | 19:13 |
jm1[m] | @dasm: how did you test locally? did you restart the telegraf container after changing the config? | 19:53 |
jm1[m] | @dasm: how about changing telegraf config to '/bin/bash -c "sleep $(( … )); ruck_rover.py … cloudops"'? | 19:57 |
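(The failure above happens because telegraf's exec input splits the command string into arguments itself rather than handing it to a shell, so `$((RANDOM % 180))`, `;`, and `--influx` all land on `sleep` literally. A minimal sketch of the difference, using `echo` as a hypothetical stand-in for `ruck_rover.py`:)

```shell
# Without a shell, the separator and the script's flags become literal
# arguments to `sleep`, which rejects them - mirroring the telegraf error:
sleep '$((RANDOM % 180));' --influx 2>&1 || true
# sleep: invalid option -- '-'  (or "unrecognized option", depending on coreutils)

# jm1's suggested fix: run through an explicit shell so the arithmetic
# expansion and the ";" command separator are interpreted first.
/bin/bash -c 'sleep $((RANDOM % 2)); echo "ruck_rover.py would run here"'
```

The `bash -c` wrapper is the standard workaround for any exec-style plugin that does not invoke a shell of its own.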
dviroel | dasm: ^ you can test inside the running container to see if works | 20:01 |
* dviroel let's test in prod | 20:02 | |
dasm | rlandy: dviroel i ran it inside the container. it didn't complain | 20:09 |
dasm | i'm gonna give it one more try. | 20:09 |
dviroel | ack | 20:11 |
dasm | i'm checking for another option to run the command | 20:13 |
rlandy | ok | 20:15 |
dasm | rlandy: your idea works: https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44725 cc dviroel | 20:29 |
* rlandy looks | 20:35 | |
rlandy | dasm: dviroel: should I merge that? | 20:36 |
rlandy | broken anyways | 20:36 |
dasm | rlandy: please go ahead. indeed | 20:36 |
dasm | i ran the command inside the container. this time i've seen it actually gathering results | 20:37 |
rlandy | dasm: pls keep an eye on the cockpit data | 20:37 |
dasm | last time apparently i didn't check long enough to catch the issue | 20:37 |
* dviroel bbl | 20:57 | |
*** dviroel is now known as dviroel|afk | 20:57 | |
*** dasm is now known as dasm|off | 21:42 | |
dasm|off | o/ | 21:42 |
*** dviroel|afk is now known as dviroel | 23:55 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!