*** ysandeep|out is now known as ysandeep | 04:36 | |
ysandeep | good morning team o/ | 04:40 |
---|---|---|
akahat | Good morning o/ | 04:43 |
ysandeep | akahat: hey Amol o/ I heard about heavy rainfall near Pune, How is everything near your place? | 04:44 |
akahat | ysandeep, hey.. it's raining a lot.. rivers are flooded. | 04:50 |
akahat | weather forecast is showing heavy rain for a few more days. | 04:50 |
ysandeep | For a few more days - okay, stay safe and enjoy the weather mate :) | 04:51 |
akahat | ysandeep, you might be missing monsoon trips around Pune. :P | 04:57 |
ysandeep | Yes missssssing it a lot, I once went to Tamani ghat and we also went to hidden waterfall in Monsoon, It was an awesome experience. | 04:59 |
ysandeep | akahat, also missing bike trip to mahabaleswar and lonavala :D | 05:03 |
ysandeep | akahat, you got a chance to visit any nearby mountains in this season? | 05:03 |
akahat | ysandeep, no.. not this time.. | 05:04 |
akahat | ysandeep, come back to Pune | 05:04 |
akahat | Every weekend we will go for ride. :D | 05:04 |
ysandeep | thanks mate, I can definitely visit for a short trip for a week :D | 05:07 |
tonyb | rlandy, frenzyfriday, marios: I think I have updated https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427518 Add 17.1 standalone jobs to promotion criteria correctly. | 05:48 |
tonyb | rlandy, frenzyfriday, marios: Sorry for missing that part. I'll update our (CRE team) internal docs to include the missing step. | 05:50 |
marios | tonyb: o/ will check thanks | 05:53 |
marios | tonyb: not your fault it was really ours for not spotting it ;) | 05:53 |
tonyb | I'll go through the open changes from our team and try to address feedback and update the pipeline similarly | 05:54 |
*** ysandeep is now known as ysandeep|afk | 05:54 | |
abregman | hey everyone | 06:00 |
abregman | can anyone tell me what's the issue here? https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/build/d6f8439050ab4cc6b75aa906f652e74e | 06:00 |
marios | abregman: looks like tempest issue | 06:02 |
marios | i mean tempest test fail | 06:02 |
marios | abregman: that https://sf.hosted.upshift.rdu2.redhat.com/logs/86/427986/3/check/periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17.1/d6f8439/logs/undercloud/var/log/tempest/stestr_results.html | 06:02 |
marios | abregman: for future reference i found that in 2 steps. 1. open job-output.txt search for "failed: 1" (https://sf.hosted.upshift.rdu2.redhat.com/logs/86/427986/3/check/periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17.1/d6f8439/job-output.txt) | 06:04 |
marios | abregman: from there you learn which file you need for step2 (in this case tempest so i knew to go look in /var/log/tempest ) | 06:04 |
abregman | oh k, I saw the failed: 1 but totally missed the "TASK [os_tempest : Fail if tempest tests did not succeed]" | 06:12 |
abregman | thanks! | 06:12 |
abregman | marios: in such case where a test fails, what do we do exactly with this change? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427710 | 06:13 |
abregman | marios: to clarify that job executed as part of testing the change 427710 | 06:14 |
marios | abregman: yeah you have to scroll up from failed: 1 i forgot to say that | 06:16 |
marios | abregman: so "what to do" 1. could be a legit issue, so we file a bug and track it as CIX (is it legit? is it consistently failing on this in consecutive executions?) | 06:17 |
marios | abregman: and 2. check with ruck|rover if this is a known issue. for that start with this irc channel's topic: tripleo-ci || rr status: https://hackmd.io/2hB-P772SqyqDs0KKZzZEQ?view | 06:18 |
abregman | marios: got it. already executed again. so if it fails once more with that test failure, I'll open a bug and escalate it | 06:18 |
marios | abregman: yeah but also check if there is an existing bug/known issue for this. for example the ruck|rover may decide to skip a test or some other temp workaround that may unblock you | 06:18 |
abregman | marios: sure, will do. thanks | 06:19 |
marios | abregman: don't see anything in the current notes (there https://hackmd.io/s4TgnCY-QQGKv2ONxTjOZA) | 06:19 |
marios | abregman: if you do file a new bug please alert the ruck|rover about it ( bhagyashris|ruck and jm1[m] ) | 06:20 |
abregman | marios: should I ask Bhagyashris or Jakob? (or both?) | 06:20 |
marios | abregman: usually you tell the 'ruck' about it | 06:20 |
marios | abregman: but either is fine if one may be away/different timezone etc | 06:20 |
abregman | got it. thanks again | 06:20 |
abregman | marios++ | 06:20 |
marios | np abregman | 06:20 |
abregman | no karma bot I guess | 06:20 |
marios | :) appreciate it all the same | 06:21 |
*** ysandeep|afk is now known as ysandeep | 06:30 | |
abregman | added the procedure of what we discussed here https://docs.engineering.redhat.com/display/PRODCHAIN/Component+pipeline+%5B17.1%5D+standup+Notes#Componentpipeline[17.1]standupNotes-Unabletoverifytestprojectchangetestproject_verification | 06:35 |
*** jm1|ruck is now known as jm1|rover | 06:45 | |
jm1 | moin #oooq | 06:45 |
* bhagyashris|ruck lunch | 06:49 | |
jm1 | abregman: o/ a quick way to check what has failed is to go to the logs/ subdirectory and watch out for files such as "_Tempest_tests_FAILED.log" https://sf.hosted.upshift.rdu2.redhat.com/logs/86/427986/3/check/periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17.1/d6f8439/logs/ | 06:55 |
jm1 | abregman: sometimes it says something like "no failure reason found" but often you get an idea where to look next | 06:55 |
jm1 | abregman: you will appreciate this micro optimization when you have to check dozens of logs a day ;) | 06:56 |
marios | \o good morning jm1|rover | 06:57 |
jm1 | marios: o/ | 06:58 |
*** jpena|off is now known as jpena | 07:10 | |
tonyb | jm1: nice. | 07:12 |
tonyb | So we have a few changes that will conflict with each other as they all edit the pipeline file. Apart from building a review chain is there a good way to make the review/merge process easier? | 07:13 |
tonyb | see: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 "Merge Conflicts" for example | 07:14 |
jm1|rover | marios: ^ | 07:26 |
jm1 | marios: pls :) | 07:27 |
abregman | jm1: good tip. I'll add it. thanks! | 07:28 |
marios | tonyb: not aware of another way than to create a chain/rebase | 07:29 |
tonyb | marios: Okay. | 07:30 |
abregman | jm1, bhagyashris|ruck: is there a bug for this issue with the OSP 17.0 job? https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/build/23752a8efa844e99afdbf084453bb428/log/job-output.txt | 07:39 |
abregman | perhaps related to what I see in the 17.1 job although different failure | 07:40 |
abregman | jm1, bhagyashris|ruck: if not, then I guess I'll open a bug for this issue https://sf.hosted.upshift.rdu2.redhat.com/logs/86/427986/3/check/periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17.1/d6f8439/logs/undercloud/var/log/tempest/stestr_results.html | 07:40 |
bhagyashris|ruck | abregman, hey let me check | 07:40 |
marios | jm1: bhagyashris|ruck: let me know if you want to discuss anything or need any help | 08:00 |
jm1 | marios: ack. still wading through today's fallout | 08:01 |
marios | bhagyashris|ruck: jm1: gentle reminder that we need to have an update for latest status on all active cix cards for the call this afternoon | 08:02 |
bhagyashris|ruck | abregman, hey is this a consistent issue ^ i don't see it failing consistently https://sf.hosted.upshift.rdu2.redhat.com/zuul/t/tripleo-ci-internal/builds?job_name=periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17&skip=0 but will hit testproject and will verify it | 08:03 |
abregman | bhagyashris|ruck: should I open a bug for this one? (executed twice) https://sf.hosted.upshift.rdu2.redhat.com/logs/86/427986/3/check/periodic-tripleo-ci-rhel-9-scenario004-standalone-glance-rhos-17.1/d6f8439/logs/undercloud/var/log/tempest/stestr_results.html | 08:07 |
jm1 | marios: updated cix cards, will need some input from rlandy on some cards | 08:32 |
bhagyashris|ruck | abregman, yes you can | 08:39 |
bhagyashris|ruck | i will also check at my end | 08:40 |
bhagyashris|ruck | marios, ack | 08:40 |
*** amoralej is now known as amoralej|afk | 08:51 | |
* jm1 mtg 1h | 08:58 | |
frenzyfriday | jm1, pojadhav Hey, have you seen this error while setting up the cockpit manually? ERROR: Missing mandatory value for "environment" option interpolating | 09:51 |
frenzyfriday | (I am trying to change the influxdb container version to 1.8 on the development server which does not have ansible pull) | 09:52 |
frenzyfriday | it works if I export the missing env values, but I didnt see this error the last time | 09:56 |
* marios food biab | 10:08 | |
rlandy | jm1: bhagyashris|ruck: hey - I'm around if you need anything | 10:31 |
bhagyashris|ruck | rlandy, ack | 10:32 |
bhagyashris|ruck | chasing 16.2 promotion rest are good ... | 10:32 |
bhagyashris|ruck | sc010 is away for 16.2 re-running that | 10:33 |
*** ysandeep is now known as ysandeep|lunch | 10:34 | |
rlandy | tonyb: thanks for the update - merging https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427518 - marios +2'ed | 10:36 |
rlandy | bhagyashris|ruck: nice | 10:39 |
rlandy | bhagyashris|ruck: components ok? | 10:40 |
bhagyashris|ruck | checking | 10:40 |
bhagyashris|ruck | on that | 10:42 |
rlandy | thanks | 10:43 |
rlandy | chandankumar: jm1: bhagyashris|ruck: I see some passes here: https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001&skip=0 :) | 10:44 |
rlandy | did we merge a patch to fix? | 10:44 |
rlandy | are there further fixes to stabilize check? | 10:44 |
chandankumar | rlandy: nope, promotion and daniel kernel update in image fixed it | 10:44 |
rlandy | dpawlik++ | 10:45 |
rlandy | thank you for that | 10:45 |
chandankumar | rlandy: we still need this https://review.opendev.org/c/openstack/tripleo-quickstart/+/856603 to avoid it | 10:46 |
dpawlik | hey, soon we will move from our customized image to upstream image | 10:46 |
dpawlik | today I will try to push that topic more | 10:47 |
rlandy | chandankumar: dpawlik: ok - I added https://review.opendev.org/c/openstack/tripleo-quickstart/+/856603 to our review list | 10:47 |
rlandy | and +2'ed it | 10:48 |
rlandy | dpawlik: thanks - that should help our stats on OVB check | 10:48 |
rlandy | Tengu: hi - will miss DF call again today ... pls tell the DF about ^^ | 10:49 |
rlandy | we are seeing some passes on OVB check now | 10:49 |
jm1 | rlandy: o/ had a long mtg, will have lunch now, then we can sync | 10:49 |
rlandy | also ... Tengu pls remind DF members to vote for TC | 10:49 |
bhagyashris|ruck | 16.2 promoted | 10:49 |
rlandy | jm1: sure ... pls ping bhagyashris|ruck and me when ready | 10:49 |
rlandy | jm1: bhagyashris|ruck: we just need to run through CIX | 10:51 |
rlandy | bhagyashris|ruck: thanks for taking care of https://trello.com/c/8tGYExhe/2603-cixlp1980255tripleociproa-tripleo-ci-centos-9-standalone-and-multinode-ipa-are-failing-the-testminimumbasicinstancehardrebootaft - can you post your results when you have them? | 10:57 |
bhagyashris|ruck | rlandy, sure | 10:58 |
Tengu | shall we +W that oooq patch, rlandy ? | 11:04 |
Tengu | to me, it makes perfect sense to NOT update the OC image kernel - only the tripleo packages. | 11:05 |
rlandy | Tengu: I think so - added it to today's review list to see if anyone else can spot an issue with doing that - if not - we will w+ shortly | 11:06 |
Tengu | I +2 it | 11:06 |
rlandy | thanks | 11:14 |
Tengu | ysandeep|lunch: I'll add the nftables switch topic for tomorrow's CI community call. | 11:16 |
rlandy | Tengu: we have Robert coming to tomorrow's call to discuss our scrum methodologies | 11:17 |
Tengu | rlandy: meh... | 11:17 |
rlandy | can we move you to next week? | 11:17 |
Tengu | we're ready to switch (missing one or 2 patches), and I have a slot in next week all-hand to present the nftables thingy | 11:17 |
Tengu | while we can wait a bit, it would be nice to get things as close as possible to "it's switched" | 11:18 |
rlandy | Tengu: maybe you can come to today's scrum? | 11:18 |
Tengu | sure? when is it? is there a hackmd as well? | 11:19 |
rlandy | 1:30 pm UTC | 11:19 |
rlandy | pojadhav: ^^ | 11:19 |
Tengu | I don't see it in my calendar - care to invite me? | 11:19 |
Tengu | and I'll have to jump on the DF call at 2pm UTC. not a big deal, topic shouldn't take too long anyway :) | 11:20 |
jm1 | frenzyfriday: cockpit is expected to fail when environment variables are not defined, please refer to https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44680 | 11:39 |
jm1 | rlandy, bhagyashris|ruck: ready for sync | 11:39 |
rlandy | sec - still on review time | 11:39 |
rlandy | will ping | 11:39 |
chandankumar | rlandy: pojadhav our today's scrum collides with Product Engineering open office hours | 11:45 |
*** dviroel_ is now known as dviroel | 11:45 | |
chandankumar | if we finish it in 30 mins then there is no overlap | 11:45 |
rlandy | chandankumar: pojadhav; yeah - we can try be quick | 11:45 |
rlandy | we did scrum on thurs | 11:45 |
rlandy | so we can do Tengu's topic and blockers only | 11:46 |
rlandy | Tengu: sent you meeting invite | 11:46 |
Tengu | thanks! | 11:48 |
rlandy | jm1: bhagyashris|ruck: can sync now if you want | 11:52 |
rlandy | jm1: bhagyashris|ruck: https://meet.google.com/hvc-qpjh-hna?pli=1&authuser=0 - when ready | 11:53 |
rlandy | we should run through CIX | 11:54 |
*** amoralej is now known as amoralej|lunch | 12:18 | |
rcastillo | o/ happy monday all | 12:29 |
rlandy | arxcruz: chandankumar: added revert https://review.rdoproject.org/r/c/rdo-jobs/+/44757 per bug https://bugs.launchpad.net/tripleo/+bug/1988810 | 12:38 |
rlandy | bhagyashris|ruck: https://review.opendev.org/c/openstack/tripleo-quickstart/+/856603 | 12:38 |
*** ysandeep|lunch is now known as ysandeep | 12:44 | |
*** amoralej|lunch is now known as amoralej | 12:51 | |
abregman | can anyone send me an invite to tripleo reviews meeting? | 12:57 |
marios | abregman: sent | 13:12 |
jm1 | marios: looks like this one merged a couple of minutes too early XD https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44872 | 13:12 |
abregman | marios: thanks :) | 13:13 |
marios | jm1: yes :D :( | 13:13 |
rlandy | marios: jm1: let's revert that tomorrow if needed | 13:15 |
rlandy | line promoted 20 hours ago | 13:15 |
marios | jm1: rlandy: i am posting the criteria removal now sec | 13:16 |
jm1 | rlandy, marios: 20 hours ago... we have plenty of time! lets wait for rdo folks to get ovs updated | 13:17 |
marios | jm1: rlandy: will add the info on the bug and then up to you rlandy .. you can at least get the gate clear today once zuul reports workflow https://review.opendev.org/c/openstack/tripleo-ci/+/857142 | 13:18 |
jm1 | rlandy, marios: ah saw your decision on #rdo | 13:18 |
marios | jm1: yea added the info in comment#1 on the bug | 13:18 |
marios | jm1: i mean about the 'plan' | 13:18 |
marios | added now the patches | 13:18 |
jm1 | marios: yep, thank you :) | 13:19 |
rlandy | https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44885 - so are we merging or not? | 13:21 |
rlandy | both 8 and 9 criteria there | 13:21 |
rlandy | and we need a 9 promo | 13:22 |
rlandy | jm1; marios: ^^ | 13:22 |
dpawlik | chandankumar, ysandeep: hey, may I ask you one thing: some of the settings https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_kernel/vars/main.yml are decreased a lot, compared to example values set in https://www.rabbitmq.com/networking.html#dealing-with-high-connection-churn-time-wait , so don't you have many | 13:23 |
dpawlik | "errors" in the rabbitmq logs or nova conductor? | 13:23 |
marios | rlandy: yeah so let me check if it hits both (should do since we have bump on c9 should hit both lines) but yes we do want to merge those per our plan https://bugs.launchpad.net/tripleo/+bug/1989341/comments/1 | 13:28 |
ysandeep | dpawlik, decreased recently or in general? | 13:29 |
dpawlik | general | 13:30 |
rlandy | ok - will vote | 13:30 |
dpawlik | ysandeep: as I see last time change was done some time ago | 13:30 |
ysandeep | dpawlik, need to check some ci jobs result, I will get back to you(In a mtg) | 13:31 |
*** dasm|off is now known as dasm | 13:33 | |
dasm | o/ | 13:33 |
dasm | frenzyfriday: o/ i'm trying to understand what is done and what needs to be done wrt our infra. If you can go to our board's backlog, you might see a few infra named tasks. | 13:49 |
dasm | frenzyfriday: I'm not sure about how much is done, but I'd like to use your knowledge to see if i'm missing something | 13:50 |
dasm | frenzyfriday: do you have a few minutes to sync up? | 13:50 |
frenzyfriday | dasm, ack, lemme check | 13:50 |
frenzyfriday | yep sure | 13:50 |
marios | rlandy: re 'is it in both 8 and 9' for the mixed os thing - yes both https://bugs.launchpad.net/tripleo/+bug/1989341/comments/4 | 13:54 |
chandankumar | jm1: bhagyashris|ruck good train hash 489d88a4b22eb070acc39b218844ac82 & buildset: https://review.rdoproject.org/zuul/buildset/34a8704d9fb3424ea756be801bc17c3b | 14:11 |
chandankumar | fs039 and full-tempest-api are failing | 14:11 |
chandankumar | while running the testproject on ibmcloud, please include these vars : https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/integration-pipeline-wallaby.yaml | 14:12 |
dasm | jm1: thanks for your comment on trello board wrt oauthlib releas. | 14:13 |
dasm | *release | 14:13 |
rlandy | marios: voted on https://review.rdoproject.org/r/c/rdo-infra/ci-config/+/44885 - you can w+ when ready | 14:17 |
marios | rlandy: ack thanks it can go but may as well hold that till tomorrow morning maybe will be avoided... but gate one can go in as soon as ready i think | 14:24 |
* jm1 having a longer break now | 15:05 | |
bhagyashris|ruck | rlandy, fyi jenkins component 17.1 on rhel9 jobs trigger... | 15:06 |
bhagyashris|ruck | hopefully it will pass ... | 15:06 |
bhagyashris|ruck | and rest of the stuff updated on hackmd | 15:07 |
bhagyashris|ruck | leaving for the day | 15:07 |
jm1 | bhagyashris|ruck: have a nice evening o/ | 15:07 |
bhagyashris|ruck | jm1, thanks see you tomorrow | 15:08 |
chandankumar | see ya people! | 15:08 |
rlandy | bhagyashris|ruck: thank you | 15:12 |
rlandy | chandankumar: have a good night | 15:12 |
*** ysandeep is now known as ysandeep|out | 15:16 | |
*** ysandeep|out is now known as ysandeep | 15:16 | |
ysandeep | dpawlik, error grep from green upstream ci standalone job - https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_49e/853860/15/check/tripleo-ci-centos-9-standalone/49eaf6d/logs/undercloud/var/log/extra/errors.txt | 15:18 |
ysandeep | seeing nothing rabbit related from quick look | 15:18 |
ysandeep | its getting late for me, but I can discuss more in my morning | 15:20 |
*** ysandeep is now known as ysandeep|out | 15:20 | |
dpawlik | ysandeep|out: that's right. Nothing interesting in https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_49e/853860/15/check/tripleo-ci-centos-9-standalone/49eaf6d/logs/undercloud/var/log/containers/nova/nova-conductor.log | 15:20 |
dpawlik | whereas on IBM02 there are many rabbit issues -,- | 15:21 |
ysandeep|out | dpawlik, maybe worth pinging someone from pidone | 15:21 |
ysandeep|out | may be damien/eck from pidone | 15:22 |
dpawlik | from time to time it disconnects and then reconnects - that's fine, but when I have "s unreachable: Server unexpectedly closed connection. Trying again in 1 seconds.: OSError: Server unexpectedly closed connection" | 15:22 |
dpawlik | "A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer" | 15:22 |
dpawlik | tomorrow | 15:22 |
dpawlik | thanks ysandeep|out! | 15:22 |
ysandeep|out | o/ let's continue tomorrow | 15:22 |
* ysandeep|out out | 15:22 | |
rlandy | jm1: hey - what can I help with? | 15:29 |
marios | rlandy: jm1: i set workflow on that rlandy so you may want to keep an eye to get it through gate | 15:31 |
marios | rlandy: 'that' https://review.opendev.org/c/openstack/tripleo-ci/+/857142/1#message-41d7487dcb1b979f906120f51fb066c944d73e76 :) | 15:31 |
rlandy | k - will do | 15:32 |
rlandy | if that clears, we don't need the opendev email | 15:32 |
rlandy | arxcruz: hey - have some time now - want to move your 1-1 up? | 15:36 |
arxcruz | rlandy sure | 15:37 |
arxcruz | joining | 15:37 |
rlandy | joined | 15:37 |
rlandy | arxcruz: ^^ | 15:38 |
arxcruz | rl | 15:38 |
arxcruz | rlandy authenticating, one sec | 15:38 |
rlandy | k | 15:38 |
*** marios is now known as marios|out | 15:47 | |
*** dviroel is now known as dviroel|lunch | 15:55 | |
rlandy | lunch - brb | 16:09 |
*** jpena|off is now known as jpena | 16:09 | |
*** jpena is now known as jpena|off | 16:10 | |
rlandy | rekicked https://review.rdoproject.org/r/c/testproject/+/44661 for wallaby c9 | 16:39 |
rlandy | cool - train only out of fs039 - rerun | 16:40 |
abregman | rlandy, arxcruz, chandankumar: is there anything else we need to do here or can we merge it? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427708 | 16:49 |
*** dviroel|lunch is now known as dviroel | 16:52 | |
*** amoralej is now known as amoralej|off | 16:52 | |
rlandy | abregman: looking | 17:19 |
rlandy | abregman: I linked the testproject to the review | 17:21 |
rlandy | https://code.engineering.redhat.com/gerrit/c/testproject/+/427709 | 17:21 |
rlandy | extra-vars @/home/zuul/workspace/.quickstart/config/release/tripleo-ci/RedHat-9/rhos-17.1.yml git right file | 17:24 |
rlandy | dviroel: I +2'ed https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427708 - can you review? | 17:25 |
rlandy | dasm: ^^ | 17:25 |
rlandy | then we can merge for abregman | 17:25 |
* dviroel looks | 17:26 | |
dviroel | rlandy: reviewed | 17:31 |
rlandy | ty | 17:33 |
dasm | rlandy: i can't review. kerberos is playing games with me: "not found in Kerberos database while getting initial credentials" | 17:34 |
dasm | probably i need to restart pc | 17:34 |
rlandy | dasm: bhagyashris|ruck reported some issues today | 17:34 |
rlandy | dviroel reviewed so we are ok for now | 17:34 |
abregman | rlandy, dasm: ty, can we merge also this https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727? | 17:50 |
dasm | abregman: quick check shows merge conflict. dviroel can you check that too? i still can't login | 17:52 |
rlandy | going to need a rebase | 17:53 |
rlandy | abregman: also - we'll need a criteria patch for sc001 | 17:53 |
abregman | I'll rebase it now | 17:53 |
dviroel | yep, needs rebase since scn001 just merged | 17:54 |
abregman | rebased but let's see if the gates pass | 17:59 |
abregman | rlandy, dviroel, dasm: criteria patch for sc001: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427729 | 18:01 |
rlandy | openstack-promote-component running again now downstream | 18:01 |
rlandy | hoping it will pick up new 17.1 | 18:01 |
rlandy | abregman: thanks - will merge after current promote jobs run | 18:01 |
rlandy | I want to see 17.1 components promote first | 18:01 |
rlandy | they were missing the jenkins jobs links | 18:02 |
abregman | sure | 18:02 |
rcastillo | lunch, brb | 18:07 |
abregman | rlandy, dviroel, dasm: rebased sc002: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 18:13 |
abregman | test project change: https://code.engineering.redhat.com/gerrit/c/testproject/+/427725 | 18:13 |
abregman | rlandy, dviroel, dasm: the criteria patch for sc002: https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427877 | 18:15 |
rlandy | abregman: thanks | 18:17 |
rlandy | 14 minutes ago, baremetal, promoted-components, 3 days ago, 6b374ec1359019e602c296db57c728e4 | 18:17 |
rlandy | 13 minutes ago, ui, promoted-components, 2 days ago, 5f26c39a9e57e972904b976f29f9d2d7 | 18:17 |
rlandy | 13 minutes ago, tempest, promoted-components, 5 days ago, 92436dd5bd89e8fa1364e8644e757a2e | 18:17 |
rlandy | 13 minutes ago, validation, promoted-components, 5 days ago, 76cd2c2752da8817c0f976961006bf4b | 18:17 |
rlandy | 13 minutes ago, compute, promoted-components, 2 days ago, 2febf07d2f0e2b528da4fb390826324f | 18:17 |
rlandy | good | 18:17 |
rlandy | looking better | 18:18 |
rlandy | waiting for common | 18:18 |
rlandy | then we can add more criteria | 18:18 |
abregman | rlandy: a question on sc004 - we know the sc004 jobs fail due to a certain bug (which was escalated as a CIX), should we wait for the CIX/bug to be resolved before we merge it? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427710 | 18:18 |
rlandy | it's ok to merge job definitions and add the job | 18:19 |
rlandy | just not criteria | 18:19 |
abregman | ack | 18:20 |
abregman | k, once we merge sc002, I'll rebase it | 18:20 |
rlandy | ok - looking at criteria now | 18:29 |
rlandy | abregman: left one comment | 18:34 |
rlandy | https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427729 | 18:34 |
rlandy | can fix that in a later patch if you want | 18:34 |
abregman | rlandy: fixed | 18:48 |
jm1 | rlandy: i bet we have to cheat again to get a c9 master promotion | 18:59 |
rlandy | isk yet | 18:59 |
rlandy | idk yer | 18:59 |
rlandy | ugh - yet | 18:59 |
jm1 | rlandy: ^^ | 19:00 |
jm1 | rlandy: i updated and rerun all failed jobs once again (except for the ones you reran already) | 19:00 |
jm1 | rlandy: rr notes has been updated | 19:00 |
rlandy | jm1: thanks - it would be a lot to skip | 19:00 |
jm1 | rlandy: jobs failed on either known bugs or intermittent failures | 19:01 |
jm1 | rlandy: c9 wallaby tripleo component should promote | 19:01 |
rlandy | jm1: also - there is a master run going on now | 19:01 |
jm1 | rlandy: c9 wallaby network component is the oldest one and is still failing on intermittent errors | 19:01 |
rlandy | maybe that one will be better | 19:01 |
rlandy | jm1: so that one we may want to skip promote | 19:02 |
jm1 | rlandy: you can give it a couple of rechecks | 19:02 |
rlandy | jm1: ok | 19:02 |
jm1 | rlandy: its the only job left for c9 wallaby components | 19:03 |
rlandy | I'll leave you notes for tomorrow | 19:03 |
rlandy | you can decide that when you get in | 19:03 |
* rlandy looks at component | 19:03 | |
rlandy | periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-network-wallaby | 19:03 |
rlandy | only job missing | 19:03 |
jm1 | rlandy: yeah this one is in rerun right now | 19:04 |
rlandy | maybe we try with depends-on chandan's patch | 19:04 |
rlandy | I see it's in deploy now | 19:04 |
rlandy | ok - I'll watch it | 19:04 |
jm1 | rlandy: last time it failed on tempest | 19:04 |
* rlandy checks that | 19:04 | |
jm1 | rlandy: link in rr notes ;) | 19:04 |
jm1 | rlandy: https://logserver.rdoproject.org/58/44658/6/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-network-wallaby/8ada848/logs/undercloud/var/log/tempest/stestr_results.html.gz | 19:04 |
rlandy | https://logserver.rdoproject.org/openstack-component-network/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-network-wallaby/448033a/logs/undercloud/var/log/ - deploy failure - from line | 19:05 |
jm1 | rlandy: previous error reasons for that job are also listed in rr notes ;) | 19:06 |
rlandy | https://logserver.rdoproject.org/58/44658/6/check/periodic-tripleo-ci-centos-9-ovb-3ctlr_1comp-featureset001-network-wallaby/8ada848/logs/undercloud/var/log/tempest/stestr_results.html.gz - yep - ok - this one is tempest | 19:06 |
rlandy | jm1: late for you - I'll watch it | 19:06 |
jm1 | rlandy: we should invest some time and write an auto-rerun script. i burned most of my day with rerunning jobs and wading through intermittent errors. | 19:09 |
jm1 | rlandy: did not even have time to watch robert's scrum literature | 19:09 |
rlandy | jm1: that is what frenzyfriday's elastic recheck is supposed to do | 19:11 |
rlandy | catch repeated error traces | 19:11 |
rlandy | I know OVB is a time sink | 19:11 |
rlandy | I know we need to do something about it | 19:11 |
jm1 | rlandy: can we please put that on top of our scrum board, highest prio? :D | 19:11 |
rlandy | on whose time though? | 19:12 |
rlandy | dasm is slammed with infra now | 19:12 |
dasm | -? | 19:12 |
rlandy | rcastillo and you are stuck in collections | 19:12 |
jm1 | rlandy: you said frenzyfriday is working on elastic recheck, so she is already on it | 19:12 |
rlandy | ysandeep|out, chandankumar and dviroel are on next gen | 19:12 |
dasm | currently i've changed the way how we're querying zuul to spare some infra resources | 19:12 |
rlandy | arxcruz is busy with tempest | 19:12 |
rlandy | so it goes | 19:13 |
dasm | i might look into auto-rechecks soon | 19:13 |
rlandy | dasm: don't want to derail you from fixing infra | 19:13 |
dasm | frenzyfriday did a great head start on that | 19:13 |
dasm | rlandy: it's not gonna be one-time thing. infra is gonna be ongoing effort, which will take next few monshr | 19:14 |
dasm | *months | 19:14 |
rlandy | correct | 19:14 |
rlandy | so derailing it will delay stuff | 19:14 |
jm1 | rlandy: okeeee, will try to hack something when i get some time | 19:16 |
jm1 | dviroel: still online? | 19:16 |
frenzyfriday | jm1, rlandy the graph is broken but the bot should work fine. Try adding a query to https://review.opendev.org/c/openstack/tripleo-ci-health-queries/+/853505 | 19:16 |
dviroel | jm1: yes, sup | 19:17 |
frenzyfriday | the erbot should comment on the patches linking the existing bug and asking people to recheck | 19:17 |
jm1 | dviroel: oh great :) regarding this cpu issue.. what do we do about it? we are still facing it a lot | 19:17 |
jm1 | dviroel: cant we edit your patch and choose the lowest possible qemu cpu model? then simply merge it and hope for the best? | 19:18 |
dviroel | jm1: yeah, I see your comment, this is not happening on the job that I was testing (16-2) - but we can try with those failing on master | 19:19 |
dviroel | jm1: master job does not consume bits from internal repos, we need to create a new upstream change to test it | 19:19 |
dviroel | jm1: I can do that | 19:19 |
jm1 | dviroel: that would be awesome | 19:21 |
dviroel | i will give a try | 19:21 |
jm1 | dviroel: somewhere i saw a cpu model qemu64-x86_64-cpu | 19:21 |
jm1 | dviroel: qemu64-x86_64-cpu is deprecated but it reads as if it is very generic | 19:21 |
rlandy | [error] RuntimeError: Certificate issuance failed (CA_UNREACHABLE: Error 7 connecting to http://ipa.ooo.test:8080/ca/ee/ca//profileSubmit: Couldn't connect to server.) | 19:21 |
rlandy | Certificate issuance failed (CA_UNREACHABLE: Error 7 connecting to http://ipa.ooo.test:8080/ca/ee/ca//profileSubmit: Couldn't connect to server.) | 19:21 |
rlandy | The ipa-server-install command failed. See /var/log/ipaserver-install.log for more information | 19:21 |
rlandy | ^^ real issue of fs039 | 19:22 |
jm1 | rlandy: fs39 c9 what? | 19:22 |
rlandy | master atm | 19:22 |
rlandy | only one of those errors | 19:23 |
rlandy | so ignore | 19:23 |
rlandy | will see if it repeats | 19:23 |
jm1 | rlandy: i bet its intermittent. someone wants to place another bet? XD | 19:24 |
jm1 | dviroel: oh wait, there is qemu64: https://qemu.readthedocs.io/en/latest/system/qemu-cpu-models.html | 19:25 |
rlandy | jm1: no - because you will win | 19:25 |
jm1 | rlandy: yeah this is an easy one ;) | 19:25 |
rlandy | jm1: well train will promote | 19:31 |
rlandy | jm1: maybe I'll try ibm cloud for master/wallaby failure | 19:32 |
jm1 | rlandy: kvm internal? | 19:34 |
rlandy | na - for ovb | 19:34 |
rlandy | wallaby is missing 39, 64 and 20 | 19:35 |
* rlandy sees if c9 nodes are available there | 19:36 | |
jm1 | rlandy: atm c9 master fs39 fs35 fs64 fail. for c9 wallaby its fs20 fs39 fs64. you want to move those to ibm? | 19:36 |
rlandy | going to testproject it | 19:36 |
rlandy | c9 wallaby its fs20 fs39 fs64 | 19:36 |
jm1 | rlandy: we could update our promoter to support conditionals, e.g. "c9-master-fs20-vexxhost||c9-master-fs20-ibm" | 19:37 |
rlandy | sec - trying it out first | 19:37 |
rlandy | https://review.rdoproject.org/r/c/testproject/+/31165 - let's see what it does | 19:40 |
jm1 | rlandy: aye, aye, i am out for today :) | 19:44 |
dasm | jm1: o/ take care | 19:46 |
* jm1 have a nice evening, oooci#oooq :) | 19:46 | |
rlandy | jm1[m]: have a good night | 19:48 |
abregman | hey. checking again. are we good to go with change? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 20:06 |
abregman | this* | 20:14 |
dviroel | rlandy: voted on https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 20:20 |
* dviroel biab | 20:28 | |
*** dviroel is now known as dviroel|afk | 20:28 | |
afuscoar | dviroel|afk: Hello, I'll add you as reviewer also for this one https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/428034 Thx | 20:35 |
abregman | rlandy, dasm: ^ | 20:35 |
abregman | afuscoar: seems like there's merge conflict there | 20:36 |
afuscoar | I see all of them have it, e.g. https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427754 | 20:37 |
afuscoar | Well, i don't see merge conflict on https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/428034 just same topic area | 20:38 |
abregman | maybe, but they really shouldn't have. I'll fix 007, maybe you can do it for 012 in the meantime | 20:38 |
afuscoar | Oh I see | 20:38 |
afuscoar | Mmm, I'll check what happens. | 20:38 |
abregman | well, that was quite a rebase | 20:46 |
afuscoar | Better then | 20:47 |
afuscoar | I'm checking this one https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/428072 | 20:47 |
afuscoar | Checking why it fails | 20:47 |
abregman | afuscoar: clone -> cherry-pick -> tox -e linters -> amend -> submit | 20:48 |
afuscoar | Oh | 20:49 |
afuscoar | I'll check, thank you abregman | 20:49 |
rlandy | we just need the testproject listed | 20:50 |
abregman | rlandy: can we merge this one? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427754 | 21:03 |
rlandy | checking | 21:03 |
rlandy | abregman: so ... going back to https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427729/3/ci-scripts/dlrnapi_promoter/config/RedHat-9/component/rhos-17.1.yaml | 21:07 |
rlandy | the order is still off here | 21:08 |
rlandy | abregman: same here ... | 21:09 |
rlandy | https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 21:09 |
rlandy | promote job is the bottom of the line | 21:09 |
abregman | yup, will fix it | 21:09 |
rlandy | abregman: ^^ will merge this one | 21:09 |
rlandy | but can you fix the order in the next patch | 21:09 |
abregman | rlandy: no no | 21:10 |
abregman | will do it now | 21:10 |
rlandy | there's a logic to it :) | 21:10 |
rlandy | ie: not random | 21:10 |
abregman | yup | 21:11 |
abregman | rlandy: fixed https://code.engineering.redhat.com/gerrit/c/tripleo-environments/+/427729 | 21:11 |
rlandy | ok - merging that | 21:12 |
rlandy | abregman: pls fix the order here next https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727/6/zuul.d/component-pipeline-rhos-17.1-rhel9.yaml | 21:13 |
rlandy | pls copy the 17 line | 21:13 |
rlandy | and I'll merge that next | 21:13 |
abregman | yes, doing it right now. I need 2m | 21:13 |
rlandy | no problem | 21:14 |
abregman | I think it's good now but let's see what the gates say | 21:17 |
abregman | rlandy: how does it look now? https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 21:20 |
*** abregman is now known as abregman|afk | 21:22 | |
rlandy | ok - I see the issue | 21:22 |
rlandy | scen001 was added before standalone | 21:23 |
* rlandy edits | 21:23 | |
abregman|afk | it should be standalone -> scenario001 -> scenario002? | 21:24 |
abregman|afk | when you said sorted I thought alphabetically :D | 21:24 |
abregman|afk | rlandy: should I modify it to be standalone -> scenario001 -> scenario002? | 21:25 |
rlandy | abregman|afk: no worries | 21:25 |
rlandy | I am editing the patch | 21:25 |
rlandy | late for you | 21:25 |
rlandy | then it will be simple to carry one | 21:26 |
rlandy | on | 21:26 |
abregman|afk | k, thank you. yes, we (the team) will continue tomorrow morning with the other changes | 21:26 |
abregman|afk | rlandy: just one question to understand it better - the order matters because they are triggered sequentially and so standalone is the most basic one and it fails no reason to run the other scenarios? | 21:28 |
abregman|afk | if it fails* | 21:28 |
rlandy | no - they all trigger after the deps | 21:28 |
rlandy | it's just easier to find them in this order | 21:29 |
rlandy | all files kind of comply | 21:29 |
rlandy | really a small change | 21:29 |
rlandy | standalone at top | 21:29 |
abregman|afk | oh k. got it | 21:29 |
rlandy | promote job at bottom | 21:29 |
rlandy | etc. | 21:29 |
rlandy | no big deal | 21:29 |
rlandy | abregman|afk: ok ... https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 Add scenario002-standalone-component jobs for OSP17.1 | 21:33 |
rlandy | dasm: dviroel|afk: can you take one more look at: https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/427727 | 21:36 |
rlandy | then we can merge | 21:36 |
afuscoar | rlandy: in this case https://code.engineering.redhat.com/gerrit/c/openstack/tripleo-ci-internal-jobs/+/428072 should I move them? | 21:39 |
afuscoar | This is the test with the depends-on https://code.engineering.redhat.com/gerrit/c/testproject/+/428074 | 21:39 |
rlandy | afuscoar: yes pls try to match the 17 pattern | 21:39 |
rlandy | and link your testproject in the commit message | 21:40 |
rlandy | afuscoar: ^^ helps reviewers | 21:40 |
afuscoar | oh yes, that's true | 21:42 |
afuscoar | I don't get the order, I've checked the zuul.d/component-jobs-rhel-9.yam and the ovb jobs are at the end | 21:44 |
rlandy | afuscoar: it's the line | 21:58 |
rlandy | biab | 22:29 |
*** dasm is now known as dasm|off | 22:59 | |
dasm|off | leaving for tonight. take care | 23:00 |
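
jm1's suggestion at 19:37, letting the promoter accept conditionals such as "c9-master-fs20-vexxhost||c9-master-fs20-ibm", could look roughly like the sketch below. This is not the promoter's actual code: the function name `unmet_criteria`, the data shapes, and the job name `c9-master-standalone` are hypothetical. It assumes each criterion is a string of job names separated by `||`, and that the criterion is satisfied when any one of the alternatives is in the set of successful jobs.

```python
from typing import Iterable, List, Set


def unmet_criteria(criteria: Iterable[str], successful_jobs: Set[str]) -> List[str]:
    """Return the criteria for which no alternative job succeeded."""
    unmet = []
    for criterion in criteria:
        # "a||b" means: either job a or job b passing satisfies this criterion.
        alternatives = [name.strip() for name in criterion.split("||")]
        if not any(name in successful_jobs for name in alternatives):
            unmet.append(criterion)
    return unmet


criteria = [
    "c9-master-fs20-vexxhost||c9-master-fs20-ibm",
    "c9-master-standalone",  # hypothetical job name, for illustration only
]
passed = {"c9-master-fs20-ibm", "c9-master-standalone"}

# An empty list means every criterion has at least one passing alternative.
print(unmet_criteria(criteria, passed))
```

With this shape a plain job name is just a criterion with a single alternative, so existing criteria lists would not need to change.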
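
The ordering rlandy describes between 21:28 and 21:29 (standalone at the top, scenario jobs after it, promote job at the bottom) is a readability convention, not an execution order, since all jobs trigger after their dependencies. A minimal sketch of that convention as a sort key follows. The job names and the helper names `order_rank` and `sort_jobs` are hypothetical, and placing other jobs (for example OVB) between the scenarios and the promote job is an assumption, because the log only says "etc." for the rest.

```python
from typing import Iterable, List


def order_rank(job_name: str) -> int:
    """Rank a job name by the layout convention described in the log."""
    if "promote" in job_name:
        return 3  # promote job at the bottom
    if "scenario" in job_name:
        return 1  # scenario jobs follow the plain standalone job
    if "standalone" in job_name:
        return 0  # standalone at the top
    return 2  # assumption: everything else sits before the promote job


def sort_jobs(job_names: Iterable[str]) -> List[str]:
    # Scenario jobs are ordered by name (scenario001 before scenario002).
    # Other jobs keep their relative order because sorted() is stable.
    return sorted(
        job_names,
        key=lambda name: (order_rank(name), name if order_rank(name) == 1 else ""),
    )


jobs = [
    "promote-component",       # hypothetical names, for illustration only
    "scenario002-standalone",
    "ovb-featureset001",
    "standalone",
    "scenario001-standalone",
]
for job in sort_jobs(jobs):
    print(job)
```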