*** ysandeep has joined #oooq | 00:52 | |
ysandeep | weshay|ruck, rlandy we promoted tripleo in 16.2? that had legit issue and now everything is hosed :) | 00:56 |
---|---|---|
ysandeep | https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-tempest/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-standalone-tempest-rhos-16.2/4b94358/logs/undercloud/home/zuul/install_packages.sh.log | 00:56 |
ysandeep | imho..We should have waited for ansible dist-git changes before we promoted tripleo.. | 00:59 |
* ysandeep checking where we are on merging that podman patch | 00:59 | |
ysandeep | hurrah we finally merged.. https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/792904 | 01:00 |
ysandeep | checking if sshnaidm already created reverts for distgits, otherwise i will create those reverts now | 01:01 |
rlandy | ysandeep: we didn't | 01:08 |
rlandy | promoted tripleo component | 01:08 |
rlandy | rdo zuul went down | 01:08 |
rlandy | we need to revert the distgit changes | 01:08 |
rlandy | ysandeep: https://review.rdoproject.org/r/c/openstack/tripleo-ansible-distgit/+/33840 | 01:09 |
ysandeep | rlandy, i have just created the reverts | 01:10 |
ysandeep | https://review.rdoproject.org/r/c/openstack/tripleo-ansible-distgit/+/33843 https://review.rdoproject.org/r/c/openstack/tripleo-ansible-distgit/+/33844 and https://review.rdoproject.org/r/c/openstack/tripleo-ansible-distgit/+/33845 | 01:10 |
ysandeep | rlandy, weird.. 16.2 current-tripleo for tripleo looks new to me: http://osp-trunk.hosted.upshift.rdu2.redhat.com/rhel8-osp16-2/component/tripleo/ | 01:11 |
rlandy | ysandeep: yep - weshay|ruck promoted the component through as they were cutting beta | 01:12 |
rlandy | but | 01:12 |
rlandy | tripleo-ci-testing is nehind | 01:12 |
rlandy | behind | 01:12 |
rlandy | you can see my comment above | 01:12 |
rlandy | <weshay|ruck> I promoted tripleo 16.2 component all the way through | 01:12 |
rlandy | but somehow one was missed | 01:13 |
rlandy | maybe it got recreated | 01:13 |
rlandy | we also need to promote network component | 01:13 |
rlandy | which died during rerun due to ^^ | 01:14 |
ysandeep | ohh.. zuul went down in internal again :) fun.. | 01:15 |
rlandy | ysandeep: it went down everywhere . | 01:15 |
rlandy | <apevec> rlandy, looks like neutron router for the whole infra-rdo project went into bad state, vexxhost is looking into it now | 01:15 |
rlandy | <apevec> "somehow we're in a state where neutron route l3 ha is active on 2 nodes and standby on one" | 01:15 |
rlandy | <apevec> sounds familiar | 01:15 |
rlandy | <apevec> rlandy, neutron router fixed, VMs are reachable | 01:15 |
rlandy | ysandeep: also I have one more fix for container updates for multinode | 01:17 |
rlandy | standalone is working fine | 01:17 |
rlandy | will test that new patch tomorrow | 01:17 |
ysandeep | rlandy, just wondering if tripleo-ci-testing still contains old tripleo.. why its failing in integration line | 01:17 |
rlandy | ysandeep: yep that's a mess | 01:18 |
ysandeep | my bad.. i have seen component line result.. https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-tempest/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-standalone-tempest-rhos-16.2/4b94358/logs/undercloud/home/zuul/install_packages.sh.log | 01:18 |
rlandy | idk how that one hash got missed | 01:18 |
rlandy | the next run should pick up promoted components | 01:18 |
rlandy | which is newer | 01:18 |
rlandy | ysandeep: if you can just get those reverts merged, I can pick this up when I come back on line | 01:19 |
rlandy | didn't help that zuul died at the same time :( | 01:19 |
rlandy | ysandeep: see you in a few hours | 01:20 |
* rlandy out | 01:20 | |
ysandeep | rlandy, yeah.. lets get those revert merged, promote tripleo and network component, and hopefully promotion tomorrow in your morning | 01:20 |
*** rlandy has quit IRC | 01:20 | |
weshay|ruck | ysandeep, don't worry about 16.2 it promoted... | 02:19 |
weshay|ruck | they have the content now.. we're done for a bit | 02:19 |
ysandeep | weshay|ruck, but 16.2 tripleo had legit issue.. | 02:20 |
ysandeep | weshay|ruck, now everything failing with https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-component-tempest/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-standalone-tempest-rhos-16.2/4b94358/logs/undercloud/home/zuul/install_packages.sh.log | 02:20 |
weshay|ruck | ysandeep, yes.. I warned jjoyce about the issues | 02:20 |
weshay|ruck | they needed to cut the beta.. | 02:20 |
weshay|ruck | this is just kind of how it goes... we're not the last line of defense.. but when we get to this point of the release... | 02:21 |
ysandeep | weshay|ruck, ack fine then.. but beta will be hosed then with same issue :) | 02:21 |
weshay|ruck | I saw the baremetal tests passing | 02:21 |
weshay|ruck | so.. it's ok | 02:21 |
weshay|ruck | both in integration and on the tripleo component hash we promoted.. | 02:21 |
ysandeep | weshay|ruck, want to chat for quick sec | 02:22 |
weshay|ruck | we'll see what happens in phase1/2 .. but jjoyce was warned | 02:22 |
weshay|ruck | sure | 02:22 |
ysandeep | grabbing my earphone | 02:22 |
weshay|ruck | meet.google.com/xrh-oaup-pph | 02:22 |
ysandeep | sshnaidm, arxcruz|rover fyi.. regarding that msr issue.. The node on which overcloud deployment failed because it's not reachable.. i was able to access that after reboot.. i noticed this traceback in cloud-init log.. http://paste.openstack.org/show/806297/ | 02:49 |
*** frenzy_friday has quit IRC | 02:54 | |
*** frenzy_friday has joined #oooq | 02:54 | |
*** ysandeep is now known as ysandeep|afk | 03:23 | |
*** ysandeep|afk has quit IRC | 03:54 | |
*** ykarel|away has joined #oooq | 03:56 | |
*** ykarel|away is now known as ykarel | 04:14 | |
*** ysandeep|afk has joined #oooq | 04:30 | |
*** ysandeep|afk is now known as ysandeep|ruck | 04:35 | |
*** ysandeep__ has joined #oooq | 04:38 | |
*** soniya29 has joined #oooq | 04:40 | |
*** ysandeep|ruck has quit IRC | 04:44 | |
*** soniya29 has quit IRC | 05:04 | |
*** marios has joined #oooq | 05:15 | |
*** soniya29 has joined #oooq | 05:19 | |
*** ysandeep__ is now known as ysandeep_afk | 05:26 | |
*** soniya29 has quit IRC | 05:31 | |
*** soniya29 has joined #oooq | 05:32 | |
*** pojadhav- has joined #oooq | 05:37 | |
*** pojadhav has quit IRC | 05:41 | |
*** soniya29 has quit IRC | 05:43 | |
*** soniya29 has joined #oooq | 05:43 | |
*** ysandeep_afk is now known as ysandeep|ruck | 05:51 | |
bhagyashris | ysandeep|ruck, hey i want to know do we have promotable hash for ussuri because i want to test one change on new promoter server | 06:06 |
ysandeep|ruck | bhagyashris, ussuri is in good shape.. only fs02/fs035 are failing with ovb issue.. but that's not ussuri specific.. but i would let weshay|ruck comment on whether we can waive that off.. we did yesterday for victoria.. but i don't want to take that call myself :) | 06:10 |
*** slaweq has joined #oooq | 06:16 | |
bhagyashris | ysandeep|ruck, ack | 06:26 |
bhagyashris | thank you :) | 06:26 |
bhagyashris | ysandeep|ruck, akahat i am looking into the c7 train promotion | 06:26 |
*** amoralej|off has joined #oooq | 06:28 | |
*** pojadhav- is now known as pojadhav | 06:29 | |
*** amoralej|off is now known as amoralej | 06:30 | |
*** slaweq[m] has joined #oooq | 06:46 | |
*** slaweq has quit IRC | 06:57 | |
*** slaweq[m] is now known as slaweq | 07:15 | |
*** jpena|off is now known as jpena | 07:20 | |
*** tosky has joined #oooq | 07:20 | |
*** pojadhav has quit IRC | 07:44 | |
*** pojadhav has joined #oooq | 07:44 | |
bhagyashris | ysandeep|ruck, hey just want to confirm the c7 train images are stored here http://images.rdoproject.org/centos7/train/rdo_trunk/ right? | 08:10 |
*** derekh has joined #oooq | 08:11 | |
ysandeep|ruck | bhagyashris, yes afaik | 08:11 |
bhagyashris | ysandeep|ruck, ok, wondering the promotable hash is not there https://trunk.rdoproject.org/api-centos-train/api/civotes_detail.html?commit_hash=6059d84f17fae8745e10fd5aa37fc3b787159dd1&distro_hash=583c625c52d3a386d4cae90db02a1ddfe62b3a70 | 08:12 |
ysandeep|ruck | bhagyashris, you can check with weshay|ruck if he decided to waive off some jobs.. | 08:16 |
bhagyashris | ysandeep|ruck, ok dropping mail thanks | 08:17 |
*** ysandeep|ruck is now known as ysandeep|lunch | 08:32 | |
*** ykarel is now known as ykarel|lunch | 08:48 | |
*** pojadhav has quit IRC | 09:01 | |
*** jbadiapa_ has joined #oooq | 09:19 | |
arxcruz|rover | ysandeep|lunch: let me know when you're back | 09:23 |
*** jbadiapa has quit IRC | 09:24 | |
*** strider has joined #oooq | 09:37 | |
*** ysandeep|lunch is now known as ysandeep|ruck | 09:41 | |
ysandeep|ruck | arxcruz|rover, back o/ | 09:42 |
*** soniya29 has quit IRC | 09:45 | |
arxcruz|rover | ysandeep|ruck: so, how did you get that info from the vm that was failing ? | 09:47 |
ysandeep|ruck | arxcruz|rover, reboot the failing vm, then you will be able to access the vm.. login to vm via undercloud node.. | 09:48 |
*** ykarel|lunch is now known as ykarel | 09:49 | |
ysandeep|ruck | arxcruz|rover, wanna tmate? i can show you how.. | 09:49 |
ysandeep|ruck | arxcruz|rover, we can play on env which yatin gave to us. | 09:50 |
arxcruz|rover | ysandeep|ruck: sure | 09:52 |
ysandeep|ruck | pm | 09:52 |
sshnaidm | ysandeep|ruck, seems like cloud-init doesn't find user "centos" on image, does it exist there? | 10:09 |
sshnaidm | ysandeep|ruck, probably broken by https://gitlab.cee.redhat.com/virt/cloud-init/-/merge_requests/28 | 10:13 |
sshnaidm | maybe worth to try cloud-init before cloud-init-20.3-6.el8 | 10:13 |
sshnaidm | or after 20.3-9, where they reverted - https://gitlab.cee.redhat.com/virt/cloud-init/-/commit/4dde2a9bed58aba13c730bf4a7314b21038d7a31 | 10:15 |
ysandeep|ruck | sshnaidm, Issue reproduced in a test patch even when using the older cloud-init:- details here: https://bugs.launchpad.net/tripleo/+bug/1929745/comments/10 | 10:16 |
opendevmeet | Launchpad bug 1929745 in tripleo "Unchecked MSR access error - overcloud deploy "timed out waiting for ping module test" [Critical,Triaged] | 10:16 |
ysandeep|ruck | we were using cloud-init 20.3-10.el8 for a long time and 20.3-10.el8_4.2 came recently, but adding the new one to exclude didn't help | 10:19 |
sshnaidm | ysandeep|ruck, so we use the latest: https://gitlab.cee.redhat.com/virt/cloud-init/-/commits/rhel840/master-20.3/ | 10:19 |
sshnaidm | ysandeep|ruck, an we try 20.3-5.el8 ? | 10:20 |
sshnaidm | s/an/can/ | 10:20 |
ysandeep|ruck | sshnaidm: thanks, I will explore that o/ | 10:21 |
sshnaidm | ysandeep|ruck, can you check also - does user centos exist on the image? | 10:21 |
ysandeep|ruck | sshnaidm: cat /etc/passwd | grep -i centos -> returns nothing | 10:22 |
sshnaidm | ysandeep|ruck, I think that's the problem.. | 10:23 |
ysandeep|ruck | i ran this on overcloud node | 10:23 |
sshnaidm | ysandeep|ruck, I'd try to add user centos into image and try this image in job | 10:23 |
sshnaidm | seems like cloud-init is looking for that user and fails, not sure why.. | 10:24 |
sshnaidm | ysandeep|ruck, can you paste the full cloud-init log? it's in /var/log/cloud-init.log and /var/log/cloud-init-output.log | 10:25 |
ysandeep|ruck | sure 1 mins | 10:26 |
ysandeep|ruck | sshnaidm, http://paste.openstack.org/show/806312/ | 10:32 |
ysandeep|ruck | sshnaidm, and http://paste.openstack.org/show/806313/ | 10:33 |
sshnaidm | ysandeep|ruck, hmm.. I see network is up | 10:34 |
sshnaidm | in failed cases I think it wasn't up.. | 10:34 |
*** soniya29 has joined #oooq | 10:34 | |
ysandeep|ruck | sshnaidm, arxcruz|rover a friend from cloud-init engineering is with us helping to debug | 10:35 |
sshnaidm | ysandeep|ruck, great | 10:36 |
sshnaidm | ysandeep|ruck, look at failed image, it doesn't have network up: https://logserver.rdoproject.org/63/794363/2/openstack-check/tripleo-ci-centos-8-ovb-1ctlr_1comp-featureset001/53b2ebc/logs/baremetal_2_24756_0-console.log | 10:36 |
sshnaidm | no interfaces table in the log | 10:36 |
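The user check run earlier in this exchange (`cat /etc/passwd | grep -i centos`) matches any line containing the string, so it can also hit unrelated entries. A minimal sketch of a stricter check follows, assuming standard passwd(5) colon-separated content. The function name and the sample text are illustrative and are not taken from the failing image.

```python
def user_in_passwd(passwd_text: str, username: str) -> bool:
    """Return True if `username` has its own entry in passwd-format text."""
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd(5) fields are colon-separated; the first field is the login name
        if line.split(":", 1)[0] == username:
            return True
    return False


# Hypothetical sample content, NOT copied from the overcloud node
sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "cloud-user:x:1000:1000::/home/cloud-user:/bin/bash\n"
)
print(user_in_passwd(sample, "centos"))
```

On a live node the text could be read from `/etc/passwd`; an exact match on the first field avoids false positives from home directories or comment fields that merely contain the name.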
*** soniya has joined #oooq | 10:57 | |
*** soniya29 has quit IRC | 10:57 | |
*** soniya29 has joined #oooq | 11:15 | |
*** pojadhav has joined #oooq | 11:18 | |
*** dviroel|out is now known as dviroel | 11:19 | |
*** soniya has quit IRC | 11:20 | |
*** jpena is now known as jpena|lunch | 11:33 | |
ysandeep|ruck | sshnaidm: after setting password for console, we can cloud-init manually and noticed that intermittently we are losing IP from interface. | 11:34 |
ysandeep|ruck | we ran* | 11:34 |
sshnaidm | ysandeep|ruck, yep, it's what we see in logs also, it can't set interfaces up | 11:35 |
sshnaidm | ysandeep|ruck, maybe something with networking in centos stream | 11:35 |
ysandeep|ruck | sshnaidm, diff b/w older logs and current rpm versions https://termbin.com/9r2r1 | 11:36 |
sshnaidm | ysandeep|ruck, yeah, there are a lot of changes :) | 11:39 |
sshnaidm | is there anything in logs about losing IP, maybe network manager logs? | 11:40 |
sshnaidm | probably need to find logs before this started and right after, to minimize the diff | 11:40 |
ysandeep|ruck | sshnaidm, I will ping back in sometime | 11:42 |
sshnaidm | sure | 11:42 |
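The idea above of comparing package versions from before and after the breakage can be sketched as a small diff over two installed-package lists. This is only an illustration: it assumes each line looks like `name.arch version repo` (the shape of `yum list installed` output), and the function names are hypothetical. The cloud-init versions in the sample are the two mentioned in this log; the other package is a placeholder.

```python
def parse_installed(text):
    """Map 'name.arch' -> 'version' from lines shaped like: name.arch version repo."""
    pkgs = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        pkgs[parts[0]] = parts[1]
    return pkgs


def diff_packages(old_text, new_text):
    """Return (name, old_version, new_version) for every package that differs."""
    old, new = parse_installed(old_text), parse_installed(new_text)
    changes = []
    for name in sorted(set(old) | set(new)):
        before, after = old.get(name), new.get(name)
        if before != after:
            changes.append((name, before, after))
    return changes


# Sample input; 'example-pkg' is a placeholder, not a real package from the jobs
old_list = "cloud-init.noarch 20.3-10.el8 @appstream\nexample-pkg.x86_64 1.0-1 @baseos\n"
new_list = "cloud-init.noarch 20.3-10.el8_4.2 @appstream\nexample-pkg.x86_64 1.0-1 @baseos\n"
for change in diff_packages(old_list, new_list):
    print(change)
```

Packages present in only one list show up with `None` on the missing side, which helps when a package was added or removed rather than upgraded.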
*** soniya29 has quit IRC | 11:43 | |
*** soniya29 has joined #oooq | 11:43 | |
*** pojadhav has quit IRC | 11:43 | |
*** akahat_ has joined #oooq | 11:44 | |
*** akahat has joined #oooq | 11:48 | |
*** akahat_ has quit IRC | 11:49 | |
*** soniya29 has quit IRC | 11:51 | |
*** rlandy has joined #oooq | 11:53 | |
*** akahat_ has joined #oooq | 11:55 | |
rlandy | ysandeep|ruck: how are things downstream? | 11:56 |
*** akahat_ has quit IRC | 11:56 | |
ysandeep|ruck | rlandy, hosed , we merged tripleo-ansible distgit change, i ran upstream train component so that we get them to promoted-component soon.. https://review.rdoproject.org/r/c/testproject/+/32054 | 12:00 |
rlandy | ysandeep|ruck: it looks like that passed | 12:00 |
ysandeep|ruck | ^^ passed.. we should have/ will get new package in downstream soon.. | 12:00 |
rlandy | k | 12:00 |
*** ysandeep|ruck is now known as ysandeep|debug_ssn | 12:01 | |
rlandy | did we run the promote to promoted components job? | 12:01 |
ysandeep|debug_ssn | rlandy, i haven't .. not sure automatic build ran or not ..i didn't get chance to check status after throwing that test patch.. | 12:02 |
rlandy | ysandeep|debug_ssn: ok - I'll pick that up now | 12:02 |
ysandeep|debug_ssn | thanks! | 12:02 |
*** pojadhav has joined #oooq | 12:04 | |
rlandy | periodic-tripleo-centos-8-train-component-tripleo-promote-to-promoted-components | 12:04 |
rlandy | ^^ job is currently queued | 12:05 |
*** amoralej is now known as amoralej|lunch | 12:12 | |
weshay|ruck | arxcruz|rover, hey.. so repos on the overcloud images need a full rm -Rf /etc/yum.repos.d/ | 12:13 |
arxcruz|rover | weshay|ruck: you mean because the exclude thing ? | 12:13 |
weshay|ruck | ya | 12:14 |
arxcruz|rover | weshay|ruck: ok | 12:15 |
arxcruz|rover | weshay|ruck: can we sync? | 12:15 |
arxcruz|rover | ysandeep|debug_ssn: ^ | 12:15 |
weshay|ruck | 2021-06-03 03:54:59.755 | cloud-init noarch 20.3-10.el8_4.2 appstream 1.0 M | 12:15 |
arxcruz|rover | weshay|ruck: well, we are also hitting the same thing with 20.3-10.el8 | 12:15 |
weshay|ruck | meet.google.com/tqj-hbmi-ymv | 12:15 |
akahat | need +1 and +w : https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/33855 | 12:16 |
ysandeep|debug_ssn | arxcruz|rover, I am in a debug session for same issue with yatin, arxcruz|rover can fill weshay|ruck with details what we find. | 12:17 |
akahat | weshay|ruck, arxcruz|rover zbr marios ^^ | 12:17 |
*** jpena|lunch is now known as jpena | 12:23 | |
pojadhav | marios, 0/ | 12:29 |
pojadhav | marios, I just started with the adding upgrades jobs.. please have a look once you free : https://review.opendev.org/c/openstack/tripleo-ci/+/794571 | 12:29 |
pojadhav | marios, also one concern is there- standalone upgrades and undercloud upgrades jobs are present atm at : https://github.com/openstack/tripleo-ci/blob/3f9eb85616ac96d29135d930c380afd420527457/zuul.d/upgrades-jobs-templates.yaml. and now we are adding them into periodic. so should we remove them from where they are present now? or we need to keep them as it..? | 12:35 |
*** soniya29 has joined #oooq | 12:35 | |
marios | pojadhav: ok added for next reviews will check | 12:39 |
marios | pojadhav: no you should not move the job definitions into the periodic.yaml template leave them in the upgrades-jobs-templates.yaml please | 12:40 |
marios | pojadhav: you should only put the jobs into the zuul layout in periodic.yaml | 12:41 |
pojadhav | marios, ack.. will do the changes.. thanks for clarification. | 12:42 |
marios | pojadhav: np | 12:42 |
weshay|ruck | ysandeep|debug_ssn, check out cloud-init-20.3-10.el8_4.3.noarch.rpm (2021-06-02 15:15, 1.0M) | 12:51 |
bhagyashris | akahat, arxcruz|rover marios pojadhav zbr soniya29 rlandy weshay|ruck sshnaidm | 13:00 |
bhagyashris | scrum time | 13:00 |
bhagyashris | frenzy_friday, ^ | 13:01 |
*** soniya has joined #oooq | 13:06 | |
*** amoralej|lunch is now known as amoralej | 13:07 | |
soniya | bhagyashris, here there is power supply failure, hence i may lose internet connection during the meeting | 13:11 |
bhagyashris | soniya, ack | 13:11 |
*** soniya29 has quit IRC | 13:13 | |
marios | weshay|ruck: rlandy: https://review.opendev.org/q/topic:tripleo-get-hash | 13:15 |
arxcruz|rover | ysandeep|debug_ssn: ykarel i'll update https://review.opendev.org/c/openstack/tripleo-ci/+/794585 | 13:19 |
rlandy | http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-jobs.git/tree/zuul.d/required-projects-overrides.yaml | 13:19 |
arxcruz|rover | with a new cloud-init version 4.3 in an image that me and weshay|ruck uploaded right now | 13:19 |
ysandeep|debug_ssn | ack o/ | 13:20 |
ysandeep|debug_ssn | i am throwing a test patch also.. to try pinning NetworkManager instead.. | 13:21 |
arxcruz|rover | ykarel: can you put a hold on this ? | 13:21 |
marios | https://review.opendev.org/c/openstack/tripleo-ci/+/794194/1#message-b75e024ca7d96cef5baa940270adc8f0152d7aa6 | 13:42 |
ykarel | arxcruz|rover, which job and which patch? | 14:03 |
arxcruz|rover | ykarel: https://review.rdoproject.org/r/c/rdo-jobs/+/33961 | 14:03 |
arxcruz|rover | patchset 3 | 14:03 |
*** ysandeep|debug_ssn is now known as ysandeep | 14:05 | |
*** ysandeep is now known as ysandeep|ruck | 14:05 | |
ykarel | arxcruz|rover, done | 14:06 |
weshay|ruck | arxcruz|rover, ysandeep|ruck https://meet.google.com/xnf-tvdh-pmk?authuser=1 | 14:17 |
ysandeep|ruck | joining | 14:17 |
arxcruz|rover | sec, grab coffee | 14:18 |
*** ykarel is now known as ykarel|away | 14:37 | |
*** soniya has quit IRC | 14:41 | |
weshay|ruck | bhagyashris, can you promote.. wallaby w/ | 14:44 |
weshay|ruck | <ysandeep|ruck> weshay|ruck 3e4ca88391cf85cd127b130745319d45 | 14:44 |
weshay|ruck | [08:43:10] <ysandeep|ruck> │ periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby │ | 14:44 |
weshay|ruck | [08:43:10] <ysandeep|ruck> │ periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby │ | 14:44 |
weshay|ruck | [08:43:10] <ysandeep|ruck> │ periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-wallaby | 14:44 |
weshay|ruck | skip those three jobs | 14:44 |
weshay|ruck | results are here: https://trunk.rdoproject.org/api-centos8-wallaby/api/civotes_agg_detail.html?ref_hash=3e4ca88391cf85cd127b130745319d45 | 14:56 |
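The request above (promote a hash while skipping three jobs) amounts to checking required jobs against passed jobs with a waived set removed. A minimal sketch of that logic follows. It is not the promoter's actual code: the function name is an assumption, the two `hypothetical-job-*` names are placeholders, and the pass results are made up for illustration. The three waived job names are the ones listed in the log.

```python
def is_promotable(required_jobs, passed_jobs, waived_jobs=()):
    """Return (ok, missing): ok is True when every non-waived required job passed."""
    missing = set(required_jobs) - set(waived_jobs) - set(passed_jobs)
    return (not missing, sorted(missing))


# The three jobs that were asked to be skipped in the log
waived = [
    "periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset035-wallaby",
    "periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-wallaby",
    "periodic-tripleo-ci-centos-8-ovb-3ctlr_1comp_1supp-featureset039-wallaby",
]
# Placeholder job names and made-up results, for illustration only
required = waived + ["hypothetical-job-a", "hypothetical-job-b"]
passed = ["hypothetical-job-a", "hypothetical-job-b"]

print(is_promotable(required, passed))
print(is_promotable(required, passed, waived_jobs=waived))
```

Returning the list of missing jobs alongside the boolean makes it easy to see exactly which criteria blocked a promotion before deciding to waive anything.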
rlandy | marios: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/245008 | 15:00 |
rlandy | may also need an entry in sf-config | 15:01 |
rlandy | checking | 15:01 |
rlandy | http://git.app.eng.bos.redhat.com/git/openstack/sf-config.git/tree/resources/tripleo-ci-internal.yaml#n213 | 15:01 |
rlandy | has that - ok | 15:01 |
rlandy | marios: ^^ ok - so the above review should be all that is needed | 15:02 |
rlandy | ysandeep|ruck: weshay|ruck: if all the rest of the train jobs pass for tripleo component - ok if we remove OVB from criteria? | 15:07 |
rlandy | waiting on the third attempt for that job | 15:07 |
marios | rlandy: sorry dealing something else right now will check in bit | 15:07 |
ysandeep|ruck | rlandy, yes we will need to waive that off .. ovb in bad condition | 15:07 |
rlandy | needed to clean up 16.2 | 15:07 |
rlandy | ysandeep|ruck: k - will take care of that when the rest of the jobs pass | 15:08 |
rlandy | will put in a dnm patch and run the promote job | 15:08 |
weshay|ruck | sshnaidm, fyi https://review.rdoproject.org/r/c/rdo-jobs/+/33982 | 15:12 |
weshay|ruck | rlandy, fs002 pass? | 15:12 |
sshnaidm | weshay|ruck, when previous-current-tripleo-rdo is from? | 15:14 |
rlandy | weshay|ruck: np fs002 | 15:14 |
rlandy | no | 15:14 |
rlandy | it's the component line | 15:14 |
sshnaidm | oh, it's 2021-05-18, seems fine | 15:14 |
weshay|ruck | arxcruz|rover, https://4d0f809b29b439d33d96-ee10ddbfa4b3a945502b6b74773c0853.ssl.cf1.rackcdn.com/793507/1/gate/tripleo-ci-centos-8-containers-multinode/18e1500/logs/undercloud/var/log/tempest/stestr_results.html | 15:14 |
rlandy | we need to get the change to promoted-components | 15:14 |
rlandy | to import downstream | 15:15 |
weshay|ruck | sshnaidm, fyi https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/794608 | 15:16 |
sshnaidm | weshay|ruck++ | 15:17 |
*** ykarel|away has quit IRC | 15:24 | |
rlandy | ysandeep|ruck: weshay|ruck: lol - nvm - I guess it doesn't take much to promote train tripleo these days: https://github.com/rdo-infra/ci-config/blob/master/ci-scripts/dlrnapi_promoter/config/CentOS-8/component/train.yaml#L82 | 15:27 |
weshay|ruck | arxcruz|rover, https://389a51b11efe9e84b938-e69135b470b32777c2fd508640e5c31f.ssl.cf1.rackcdn.com/792888/2/gate/tripleo-ci-centos-8-containers-multinode/284389a/logs/undercloud/var/log/tempest/stestr_results.html | 15:35 |
arxcruz|rover | weshay|ruck: adding | 15:35 |
weshay|ruck | arxcruz|rover, use a new patch | 15:36 |
weshay|ruck | arxcruz|rover, 608 is about to merge | 15:37 |
arxcruz|rover | weshay|ruck: yes, just waiting to gate finish, to not have conflict later | 15:37 |
weshay|ruck | arxcruz|rover++ | 15:37 |
weshay|ruck | woot.. https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/794608 merged | 15:37 |
weshay|ruck | THANK YOU for not using the tripleo pipeline :)) so awesome | 15:37 |
arxcruz|rover | weshay|ruck: https://review.opendev.org/c/openstack/openstack-tempest-skiplist/+/794612 | 15:43 |
weshay|ruck | arxcruz|rover, also.. remember http://dashboard-ci.tripleo.org/d/3pUqDadGk/tempest-skipped-tests?orgId=1 ? | 15:43 |
weshay|ruck | arxcruz|rover, the skip list may be able to be pruned.. not today.. but I'd like to work w/ you on this.. | 15:44 |
arxcruz|rover | it's been a long time... | 15:44 |
weshay|ruck | and figuring out why wallaby and victoria are not showing up :) | 15:44 |
arxcruz|rover | weshay|ruck: probably the job name | 15:45 |
weshay|ruck | arxcruz|rover, jobs run here: https://review.rdoproject.org/zuul/builds?pipeline=openstack-periodic-weekend | 15:46 |
weshay|ruck | bhagyashris, fyi https://review.rdoproject.org/r/c/rdo-jobs/+/33984 | 15:54 |
ysandeep|ruck | weshay|ruck, sshnaidm arxcruz|rover fyi.. https://review.opendev.org/c/openstack/tripleo-quickstart/+/794636 , testprojecting based on it | 16:02 |
sshnaidm | ysandeep|ruck, thanks | 16:03 |
rlandy | periodic-tripleo-ci-centos-8-multinode-1ctlr-featureset010-tripleo-train reporting now | 16:07 |
rlandy | getting promotion patch ready | 16:08 |
*** marios is now known as marios|out | 16:08 | |
weshay|ruck | ysandeep|ruck, you have a testproject for NM? | 16:13 |
rlandy | periodic-tripleo-centos-8-train-component-tripleo-promote-to-promoted-components running | 16:17 |
ysandeep|ruck | weshay|ruck, https://review.rdoproject.org/r/c/testproject/+/28446 | 16:17 |
weshay|ruck | thanks.. | 16:17 |
weshay|ruck | ysandeep|ruck, arxcruz|rover I've organized our three test projects in the google tasks | 16:17 |
ysandeep|ruck | weshay|ruck, rlandy you need anything before i leave for the day? | 16:21 |
weshay|ruck | ysandeep|ruck, I'm going to promote wallaby | 16:21 |
rlandy | ysandeep|ruck: I don't think so | 16:22 |
weshay|ruck | ysandeep|ruck, /me looks at ur patch | 16:22 |
rlandy | waiting for promotion | 16:22 |
weshay|ruck | sec | 16:22 |
rlandy | train tripleo | 16:22 |
rlandy | then will pick it up downstream | 16:22 |
ysandeep|ruck | weshay|ruck, sure | 16:22 |
rlandy | ysandeep|ruck: chat with you your time tomorrow | 16:23 |
*** amoralej is now known as amoralej|off | 16:23 | |
weshay|ruck | ysandeep|ruck, k.. nothing needed in tripleo-ci to exclude network-manager? | 16:23 |
weshay|ruck | ya.. we're good | 16:23 |
weshay|ruck | ysandeep|ruck, k.. I'll watch it through.. good night :) | 16:23 |
ysandeep|ruck | weshay|ruck, i don't think if we are building images | 16:23 |
weshay|ruck | see you in 8 hours lolz | 16:23 |
weshay|ruck | 10 hours | 16:24 |
ysandeep|ruck | weshay|ruck, sshnaidm one more thing .. i have enabled the cloud cleanup script back (we disabled it to debug).. it seems to be hitting some timeout trying to reach vexxhost.. the openstack token issue command is also facing the same issue.. probably some network issue that will clear in some time.. horizon seems to be working | 16:25 |
ysandeep|ruck | on toolbox | 16:26 |
*** jpena is now known as jpena|off | 16:27 | |
*** ysandeep|ruck is now known as ysandeep|away | 16:27 | |
*** sshnaidm is now known as sshnaidm|afk | 16:32 | |
*** marios|out has quit IRC | 16:40 | |
*** jlarriba has quit IRC | 16:46 | |
arxcruz|rover | weshay|ruck: failure :( | 16:54 |
weshay|ruck | where? | 16:54 |
arxcruz|rover | https://review.rdoproject.org/zuul/stream/2b5e93f00cff4e1db30b5c53b78302a5?logfile=console.log | 16:54 |
weshay|ruck | arxcruz|rover, https://logserver.rdoproject.org/61/33961/3/check/tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-2/080b663/logs/undercloud/home/zuul/build.log.txt.gz | 16:58 |
weshay|ruck | cloud-init was removed | 16:58 |
weshay|ruck | weird | 16:58 |
weshay|ruck | cloud-init.noarch 20.3-10.el8_4.3 @appstream | 16:59 |
weshay|ruck | https://logserver.rdoproject.org/61/33961/3/check/tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-2/080b663/logs/overcloud-controller-0/var/log/extra/package-list-installed.txt.gz | 16:59 |
weshay|ruck | arxcruz|rover, so.. on to the Network-Manager theory? | 16:59 |
arxcruz|rover | cloud-init.noarch 20.3-10.el8_4.3 @appstream | 16:59 |
arxcruz|rover | weshay|ruck: cloud init is latest one, the new one | 17:00 |
weshay|ruck | yes | 17:00 |
weshay|ruck | and didn't fix it | 17:00 |
weshay|ruck | rlandy, we may need to hold back / pin some core networking / os packages and use a dep pipeline | 17:01 |
weshay|ruck | if NetworkManager is the culprit here... it's just going to get worse in el9 | 17:02 |
arxcruz|rover | weshay|ruck: well, it's worse now | 17:03 |
arxcruz|rover | https://logserver.rdoproject.org/61/33961/3/check/tripleo-ci-centos-8-ovb-3ctlr_1comp-featureset001-2/080b663/logs/undercloud/home/zuul/overcloud_deploy.log.txt.gz | 17:03 |
arxcruz|rover | now two controllers fail | 17:03 |
arxcruz|rover | so far we saw only one failing | 17:03 |
rlandy | weshay|ruck: k - give me a list and I'll set something up | 17:04 |
weshay|ruck | arxcruz|rover, I've seen two I think previously | 17:04 |
weshay|ruck | rlandy, not ready yet.. just sowing seeds... we probably will need to close out tripleo-repos prior | 17:05 |
rlandy | k - trying to clean up train -16.2 | 17:05 |
weshay|ruck | arxcruz|rover, is it worth trying again w/ a much older version? | 17:05 |
weshay|ruck | of cloud-init | 17:06 |
weshay|ruck | or just go after NM now? | 17:06 |
*** derekh has quit IRC | 17:06 | |
arxcruz|rover | you mean old cloud-init and nm version? | 17:08 |
arxcruz|rover | at this point, i don't know, i'm out of ideas already | 17:08 |
weshay|ruck | arxcruz|rover, let's see what happens w/ https://review.rdoproject.org/r/c/testproject/+/28446 | 17:10 |
weshay|ruck | and to some degree https://review.rdoproject.org/r/c/rdo-jobs/+/33982 | 17:10 |
rlandy | train promoted | 17:11 |
rlandy | promoted-components | 17:11 |
weshay|ruck | rlandy, rdo? | 17:13 |
rlandy | weshay|ruck: ack - waiting for that to hit downstream | 17:13 |
rlandy | then we go with that | 17:13 |
rlandy | http://osp-trunk.hosted.upshift.rdu2.redhat.com/rhel8-osp16-2/component/tripleo/consistent/commit.yaml | 17:14 |
rlandy | hasn't hit there yet | 17:14 |
rlandy | weshay|ruck: once we get that through, we can use double component job | 17:14 |
bhagyashris | weshay|ruck, ack | 17:14 |
rlandy | and promote network and tripleo | 17:14 |
rlandy | weshay|ruck: do you know how often consistent updates? | 17:17 |
weshay|ruck | date stamp | 17:17 |
weshay|ruck | ya | 17:17 |
bhagyashris | weshay|ruck, http://10.0.148.74/promoter_logs/container-push/20210603-170952.log | 17:18 |
weshay|ruck | bhagyashris++ | 17:19 |
*** ysandeep|away has quit IRC | 17:31 | |
arxcruz|rover | weshay|ruck: lol https://review.opendev.org/c/openstack/tripleo-ci/+/794585 | 17:32 |
arxcruz|rover | bleh, didn't build a new image, it's with cloud-init 4.2 | 17:34 |
weshay|ruck | arxcruz|rover, ? | 17:36 |
weshay|ruck | arxcruz|rover, but you had to_build true | 17:36 |
weshay|ruck | and the updated cloud-init was installed | 17:36 |
weshay|ruck | frenzy_friday, you got a sec? | 17:42 |
*** pojadhav has quit IRC | 17:43 | |
*** pojadhav has joined #oooq | 17:44 | |
frenzy_friday | weshay|ruck, O/ | 17:44 |
weshay|ruck | frenzy_friday, we can add a key to sova: regexes: I assume to mark whether or not there is a matching query.yml right? | 17:45 |
weshay|ruck | harmless extra key for human eyes only | 17:45 |
weshay|ruck | or is there a better way.. to note in sova-patterns.yml it's converted.. | 17:46 |
weshay|ruck | I guess I can cross reference queries.. because that has it | 17:46 |
frenzy_friday | didnt get that. queries.yml and sova regexes will always be in sync after we shift to the new format right? | 17:46 |
frenzy_friday | so all regexes in sova will have an entry in queries.yml | 17:46 |
weshay|ruck | ya.. if you add something to queries.yml it will fail if there isn't a match.. in sova-patterns.yml iirc | 17:47 |
frenzy_friday | no, it will fail if you dont add a string against which it will be checked by tox. When you add a query tox will automatically convert it for sova and for ER. So all regexes in queries, sova and er will be in sync | 17:49 |
weshay|ruck | k | 17:52 |
bhagyashris | weshay|ruck, wallaby promoted https://trunk.rdoproject.org/centos8-wallaby/current-tripleo/delorean.repo.md5 , reverted criteria file changes | 17:59 |
weshay|ruck | bhagyashris, thank you!!!! | 18:00 |
bhagyashris | :) | 18:00 |
* bhagyashris out | 18:00 | |
weshay|ruck | frenzy_friday, have you had any luck w/ wild cards.. for things like build_name? | 18:01 |
weshay|ruck | build_name:"tempest-slow" | 18:01 |
weshay|ruck | like that? | 18:01 |
frenzy_friday | weshay|ruck, In the regex? | 18:02 |
weshay|ruck | y | 18:02 |
* weshay|ruck messing w/ | 18:03 | |
weshay|ruck | message:"Could not resolve host:" AND (tags:"console") AND NOT build_name:"openstack-*" | 18:03 |
frenzy_friday | Special chars in regex should work | 18:04 |
frenzy_friday | for build_name tag I need to check | 18:04 |
weshay|ruck | eye.. looking at https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html#wildcard-query-field-params | 18:04 |
frenzy_friday | ^ this has a different structure.. with more fields | 18:06 |
weshay|ruck | frenzy_friday, sent a screenshot w/ something that is working | 18:07 |
weshay|ruck | querystring... | 18:08 |
weshay|ruck | but I'm not familiar enough w/ the query format.. on how to do that properly | 18:08 |
frenzy_friday | ooh, ok. build_name wont work now, I'll add it as a param tomorrow. | 18:08 |
frenzy_friday | NOT is missing in out structure as well | 18:09 |
frenzy_friday | *our | 18:10 |
weshay|ruck | frenzy_friday, some of these errors we should filter the name... because we don't want hits from jobs we don't care about | 18:10 |
weshay|ruck | frenzy_friday, k | 18:10 |
weshay|ruck | infra errors... for the most part.. we should filter in.. build_name:"tripleo*" or AND NOT build_name:"openstack-*" | 18:11 |
frenzy_friday | ok, I'll add build_name param and check how to define NOT in our queries format. | 18:12 |
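A rough sketch of the kind of query being discussed above, assuming the backend accepts an Elasticsearch `query_string` query (which supports `AND`, `NOT` and `*` wildcards). The helper names `build_query_string` and `build_search_body` are hypothetical and are not part of the queries.yml tooling. The wildcard pattern is left unquoted because Lucene query-string syntax generally treats a quoted term as a phrase. Whether that matches how `build_name` is indexed here is an assumption.

```python
import json


def build_query_string(message, tags=None, include_build=None, exclude_build=None):
    """Assemble a Lucene-style query string (hypothetical helper).

    Note: this sketch does not escape quotes or other special
    characters inside `message`.
    """
    clauses = [f'message:"{message}"']
    if tags:
        clauses.append(f'tags:"{tags}"')
    if include_build:
        # Wildcard pattern kept unquoted so `*` can act as a wildcard.
        clauses.append(f"build_name:{include_build}")
    query = " AND ".join(clauses)
    if exclude_build:
        query += f" AND NOT build_name:{exclude_build}"
    return query


def build_search_body(query):
    """Wrap the query string in an Elasticsearch query_string request body."""
    return {"query": {"query_string": {"query": query}}}


query = build_query_string(
    "Could not resolve host:",
    tags="console",
    exclude_build="openstack-*",
)
body = build_search_body(query)
print(json.dumps(body, indent=2))
```

Swapping `exclude_build="openstack-*"` for `include_build="tripleo*"` gives the "filter in" variant mentioned above. A `build_status` clause could be appended the same way once that parameter exists.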
rlandy | weshay|ruck: promoting tripleo to component-ci-testing and trying one standalone job to see if we have the right hash | 18:18 |
weshay|ruck | frenzy_friday, and.. we need.. "build_status" | 18:20 |
*** amoralej|off has quit IRC | 18:21 | |
*** amoralej|off has joined #oooq | 18:30 | |
weshay|ruck | rlandy, https://review.opendev.org/c/openstack/ansible-role-collect-logs/+/794664 | 18:32 |
rlandy | weshay|ruck: https://review.opendev.org/c/openstack/ansible-role-collect-logs/+/794664 nit on placement | 18:34 |
weshay|ruck | rlandy, sorin says it needs to be at the top.. | 18:34 |
weshay|ruck | to ensure it's collected | 18:34 |
rlandy | okie dokie | 18:35 |
weshay|ruck | so it should not be alphabetical, but ranked by importance | 18:35 |
rlandy | revoted | 18:35 |
*** jlarriba has joined #oooq | 18:59 | |
weshay|ruck | rlandy, hey.. want to look at 16.2 for a minute w/ me and chat about it moving forward? | 19:04 |
rlandy | weshay|ruck: ack | 19:20 |
rlandy | just looking at log that ran | 19:20 |
rlandy | failed deploy | 19:20 |
weshay|ruck | rlandy, heh.. old overcloud-images fixes the issue w/ ovb | 19:21 |
weshay|ruck | arxcruz|rover, ^ | 19:22 |
weshay|ruck | from 5/18 | 19:22 |
weshay|ruck | at least we can get a proper rpm diff now | 19:22 |
rlandy | weshay|ruck: k - let's meet | 19:24 |
weshay|ruck | k | 19:24 |
weshay|ruck | meet.google.com/jbk-tntw-quv | 19:24 |
rlandy | Not found image: https://docker-registry.upshift.redhat.com/v2/tripleorhos-16-2/openstack-swift-account/manifests/a68f2f7c63f98fb923595bdd144ce819"] | 19:25 |
*** amoralej|off has quit IRC | 19:37 | |
weshay|ruck | rlandy, https://review.opendev.org/c/openstack/tripleo-quickstart-extras/+/792904 | 19:38 |
rlandy | https://review.opendev.org/c/openstack/tripleo-quickstart/+/792818 | 20:08 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?pipeline=openstack-periodic-integration-rhos-17 - not too bad | 20:25 |
rlandy | oh - the registry is defined in a diff place | 21:00 |
weshay|ruck | really? | 21:01 |
rlandy | yeah - the file gets built from tripleo-ansible or something | 21:02 |
rlandy | https://sf.hosted.upshift.rdu2.redhat.com/logs/openstack-periodic-integration-rhos-17/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-rhel-8-multinode-1ctlr-featureset010-rhos-17/278840f/logs/undercloud/etc/containers/registries.conf | 21:03 |
rlandy | we made a change on container builds | 21:04 |
weshay|ruck | rlandy, ya.. that file changed iirc w/ containers_tools 2.0 -> 3.0 | 21:05 |
rlandy | http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-jobs.git/tree/zuul.d/tripleo-build-containers.yaml | 21:05 |
rlandy | http://git.app.eng.bos.redhat.com/git/openstack/tripleo-ci-internal-jobs.git/tree/zuul.d/tripleo-build-containers.yaml#n47 | 21:05 |
rlandy | need to see that that is matched in settings somewhere | 21:05 |
rlandy | http://git.app.eng.bos.redhat.com/git/tripleo-environments.git/tree/config/release/promotion-testing-hash-rhos-17.yml#n195 | 21:07 |
rlandy | it is - but maybe we switch the order | 21:07 |
rlandy | interesting how standalone passes | 21:17 |
rlandy | oh | 21:17 |
rlandy | I defined that in standalone config | 21:17 |
rlandy | 2021-06-03 16:46:31.953066 | primary | TASK [os_tempest : Execute tempest tests] ************************************** | 21:22 |
rlandy | 2021-06-03 16:46:31.953072 | primary | Thursday 03 June 2021 16:46:31 -0400 (0:00:00.048) 0:50:30.484 ********* | 21:22 |
rlandy | 2021-06-03 16:46:37.327022 | primary | fatal: [undercloud]: FAILED! => { | 21:22 |
rlandy | 2021-06-03 16:46:37.327416 | primary | "changed": false, | 21:22 |
rlandy | 2021-06-03 16:46:37.327458 | primary | "cmd": "set -e\nif [ -d /openstack/venvs/tempest-untagged/bin ];\nthen\n. /openstack/venvs/tempest-untagged/bin/activate\nfi\ntempest run --concurrency 2 --blacklist-file /home/zuul/tempest/etc/tempest_blacklist.txt --whitelist-file /home/zuul/tempest/etc/tempest_whitelist.txt > /var/log/tempest/tempest_run.log\n", | 21:22 |
rlandy | 2021-06-03 16:46:37.327481 | primary | "delta": "0:00:04.887361", | 21:22 |
rlandy | 2021-06-03 16:46:37.327490 | primary | "end": "2021-06-03 20:46:37.288135", | 21:22 |
rlandy | 2021-06-03 16:46:37.327507 | primary | "rc": 1, | 21:22 |
rlandy | 2021-06-03 16:46:37.327515 | primary | "start": "2021-06-03 20:46:32.400774" | 21:22 |
rlandy | 2021-06-03 16:46:37.327523 | primary | } | 21:22 |
rlandy | ha no tempest tests running on scenario007 | 21:22 |
weshay|ruck | oh no | 21:31 |
weshay|ruck | probably because we accidentally skipped | 21:31 |
rlandy | yeah | 21:32 |
rlandy | digging through that | 21:32 |
weshay|ruck | rlandy, fyi.. I pointed arx at http://dashboard-ci.tripleo.org/d/3pUqDadGk/tempest-skipped-tests?orgId=1 | 21:32 |
rlandy | nice | 21:32 |
weshay|ruck | after we get through the ovb fire drill hopefully he can restore some tests | 21:32 |
weshay|ruck | rlandy, do you know what to change to get the registry right? | 21:37 |
* weshay|ruck steps away | 21:39 | |
rlandy | weshay|ruck: yep | 22:20 |
rlandy | under test | 22:20 |
* rlandy going to warehouse | 22:20 | |
rlandy | back later | 22:20 |
*** rlandy is now known as rlandy|bbl | 22:20 | |
*** yamamoto has quit IRC | 22:45 | |
*** tosky has quit IRC | 23:00 |