*** holser has quit IRC | 00:56 | |
*** holser has joined #oooq | 01:06 | |
*** jmasud has joined #oooq | 01:14 | |
*** holser has quit IRC | 01:29 | |
*** jmasud has quit IRC | 01:45 | |
*** jmasud has joined #oooq | 01:46 | |
*** jmasud has quit IRC | 01:58 | |
*** jmasud has joined #oooq | 02:00 | |
*** jmasud has quit IRC | 02:21 | |
*** jmasud has joined #oooq | 02:22 | |
*** jmasud has quit IRC | 03:06 | |
*** ykarel|away is now known as ykarel | 04:07 | |
*** raukadah is now known as chkumar|rover | 04:55 | |
*** jmasud has joined #oooq | 05:18 | |
*** soniya29 has joined #oooq | 05:45 | |
*** skramaja has joined #oooq | 05:49 | |
*** udesale has joined #oooq | 05:49 | |
* chkumar|rover headed to wework | 06:04 | |
*** sanjayu_ has joined #oooq | 06:12 | |
*** marios has joined #oooq | 06:14 | |
*** ratailor has joined #oooq | 06:17 | |
*** sanjayu__ has joined #oooq | 06:22 | |
*** sanjayu_ has quit IRC | 06:25 | |
*** jfrancoa has joined #oooq | 06:52 | |
*** sanjayu__ has quit IRC | 06:56 | |
*** saneax has joined #oooq | 06:58 | |
*** dtantsur|afk is now known as dtantsur | 07:20 | |
*** jbadiapa has joined #oooq | 07:39 | |
chkumar|rover | zbr, morning, Do we still have missing patches for fix ovb logs, https://logserver.rdoproject.org/openstack-periodic-master/opendev.org/openstack/tripleo-ci/master/periodic-tripleo-ci-centos-7-ovb-1ctlr_2comp-featureset020-master/7a86edf/logs/ logs are not getting collected | 07:40 |
---|---|---|
zbr | nope, but i hope sagi will look into it. i had the -vvvv removal which was blocked. | 07:44 |
*** ykarel is now known as ykarel|lunch | 07:51 | |
*** kopecmartin has joined #oooq | 08:00 | |
*** jmasud has quit IRC | 08:13 | |
*** jmasud has joined #oooq | 08:14 | |
*** tesseract has joined #oooq | 08:14 | |
*** apetrich has joined #oooq | 08:16 | |
zbr | i think someone asked about isort last week; in short, isort is good but we need to wait ~a month until they finish addressing conflicts with black. | 08:25 |
*** tosky has joined #oooq | 08:29 | |
arxcruz | chkumar|rover: wondering why fs021 is running on tempest project... | 08:58 |
chkumar|rover | arxcruz, need to do a git blame there | 08:58 |
arxcruz | marios: so, I was able to run full tempest on fs020 with 4:30 min increasing the cpu and the concurrency to 4 | 08:59 |
marios | arxcruz: ack i recall you saying that on friday | 09:00 |
marios | arxcruz: maybe add note on https://tree.taiga.io/project/tripleo-ci-board/task/1383 and/or comment on the reviews? we can discuss this afternoon on calls? | 09:00 |
*** yolanda has joined #oooq | 09:03 | |
*** d0ugal has quit IRC | 09:29 | |
*** ykarel|lunch is now known as ykarel | 09:29 | |
*** sshnaidm|off is now known as sshnaidm | 09:30 | |
*** d0ugal has joined #oooq | 09:31 | |
sshnaidm | zbr, the logs problem is not related to -vvvv, it started with last gzipping, I hope you'll check it. Actually w/o debug we won't be able to find the problem | 09:31 |
zbr | sshnaidm: i am not going to check it, i already pointed to the dangerous piece of code, which I consider a "time-bomb": it has zero protection against becoming a DDOS. | 09:35 |
sshnaidm | zbr, what are you talking about? I'm talking about logs are not collected | 09:35 |
zbr | if anyone wants, they can put a gz to that -vvvv line. | 09:35 |
zbr | i am ready to bet that ansible dies writing to that debug log file. | 09:36 |
sshnaidm | zbr, debug file worked last 2 years, instead of blaming long working code I think it's worth to check last changes | 09:37 |
zbr | sshnaidm: sorry but i will let you and wes deal with that aspect, i am with alex side on this. | 09:38 |
sshnaidm | zbr, we are talking about logs are not collected | 09:38 |
zbr | i am not saying that is the only issue, but I consider this one a blocker. | 09:38 |
*** derekh has joined #oooq | 09:39 | |
sshnaidm | zbr, there are no sides here, there are logs not collected | 09:39 |
zbr | the code that fails to run to finish is the one with -vvvv | 09:39 |
sshnaidm | zbr, I'm not sure you're serious now.. | 09:39 |
sshnaidm | zbr, here's the code that ran without -vvvv, where are the logs? https://logserver.rdoproject.org/59/705059/2/openstack-check/tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001/53209e9/logs/ | 09:41 |
sshnaidm | zbr, maybe you can leave this debug alone and we can start working on a real problem? | 09:41 |
sshnaidm | chkumar|rover, do you know if we had changed podman recently? | 09:42 |
*** bogdando has joined #oooq | 09:43 | |
*** Tengu has quit IRC | 09:46 | |
*** Tengu has joined #oooq | 09:48 | |
*** Tengu has quit IRC | 09:54 | |
*** Tengu has joined #oooq | 09:55 | |
*** jaosorior has joined #oooq | 10:01 | |
zbr | chkumar|rover: should we install missing tools during collection? like lsof, lvm2, netstat, lspci, pstree? | 10:02 |
*** Tengu has quit IRC | 10:03 | |
*** skramaja has quit IRC | 10:03 | |
*** Tengu has joined #oooq | 10:03 | |
zbr | currently these are not installed and the collecting commands are likely failing. we have two options: a) install them, or b) make collection command run only when the tools are installed. | 10:04 |
zbr | i would personally install lsof all the time, but lvm2 does not make much sense for containers for example. | 10:05 |
zbr | so maybe adding only lsof and netstat and pstree, which makes sense in all use-cases, and the other two "hw" specific, to run only when already installed. | 10:05 |
*** holser has joined #oooq | 10:07 | |
*** Tengu has quit IRC | 10:08 | |
sshnaidm | zbr, you have "ss" instead of netstat | 10:09 |
sshnaidm | lvm2 doesn't make sense in upstream, but might be useful in infrared case I think | 10:09 |
zbr | sshnaidm: my question was re https://review.opendev.org/#/c/705175/ | 10:10 |
zbr | maybe we should not add any new rpm yet, and delay that for other changes. | 10:10 |
sshnaidm | we can see in logs that collections is stopped on containers part, I think it may be copying of files from podman containers stuck forever | 10:10 |
zbr | first we need to move the package list somewhere where it is configurable. | 10:10 |
zbr | ouch.. doing cp with podman was a known issue. | 10:11 |
sshnaidm | zbr, sorry, I'm focusing now on getting logs back, I don't care in which form | 10:11 |
zbr | sure. makes sense. | 10:11 |
zbr | we would have a big laugh if we discover that this was a bug introduced by newer podman. | 10:11 |
zbr | i think they upgraded it recently. | 10:11 |
zbr | maybe we should try to switch molecule to use podman when testing this job, we may have a revelation. | 10:12 |
*** Tengu has joined #oooq | 10:13 | |
*** Tengu_ has joined #oooq | 10:14 | |
*** Tengu_ has quit IRC | 10:15 | |
chkumar|rover | sshnaidm, nope | 10:30 |
chkumar|rover | sshnaidm, Any issue you are seeing? | 10:30 |
sshnaidm | chkumar|rover, I don't know yet, maybe issues with copying files from containers | 10:31 |
sshnaidm | zbr, I don't understand your statement here https://review.opendev.org/#/c/705347/3/tasks/collect/container.yml | 10:39 |
sshnaidm | zbr, what is the issue you see? | 10:39 |
zbr | there is no need to load the container engine into an ansible fact. | 10:40 |
zbr | mainly you explode one shell task into 3 tasks w/o good reasons. | 10:40 |
zbr | putting that one big shell script in-line was an old mistake, how about moving that script into a standalone file? | 10:41 |
sshnaidm | zbr, I think it's reasonable improvement | 10:41 |
zbr | it would be easier for us to edit it, lint it, maintain it. | 10:42 |
zbr | it would be one standalone piece of bash | 10:42 |
zbr | the best part is that it doesn't even have to be a template, it can be pure bash. | 10:42 |
zbr | inline shell in ansible is good until you reach ~15 lines, after it can become a liability. | 10:43 |
zbr | in fact you could have sorted the original bug with a very simple hack | 10:44 |
sshnaidm | zbr, what is the issue you see in this patch? | 10:44 |
zbr | engine=`command -v podman docker|head -n1` | 10:44 |
zbr | i doubt we have any single production place with both engines used. | 10:44 |
sshnaidm | zbr, no, that won't work | 10:45 |
zbr | why? | 10:45 |
sshnaidm | zbr, that's the problem, we have always both engines | 10:45 |
sshnaidm | zbr, did you look at output of this task in jobs? | 10:46 |
zbr | i know we have both on molecule jobs, but i was not aware about doing this trick on other places. | 10:46 |
zbr | do you want me to try to make that shell a script and combine your change? | 10:47 |
zbr | happy to help, and keep the detection logic proposed | 10:47 |
*** jbadiapa has quit IRC | 10:49 | |
*** holser has quit IRC | 11:00 | |
sshnaidm | zbr, currently I'd like to understand on which stage the collection is stuck | 11:01 |
sshnaidm | zbr, although I still don't understand what is the issue you see in this patch | 11:02 |
sshnaidm | zbr, if you think it deserves -1, please write exact well based reasons behind it | 11:03 |
zbr | does it always happen or is random? i would personally create a change to add `| gzip` to that -vvv line and see what happens. at least we rule-out the outofdisk/time case. big log also means that rsync could fail trying to move it. | 11:04 |
sshnaidm | zbr, https://docs.openstack.org/project-team-guide/review-the-openstack-way.html#code-review-minus-1 | 11:05 |
zbr | yep, this reminds that i wanted to ask something on infra. | 11:06 |
*** whoami-rajat is now known as whoami-rajat|lun | 11:07 | |
*** whoami-rajat|lun is now known as whoami-rajat | 11:07 | |
zbr | it is a common assumption that any change should be covered by a test that prevents regression, but that is not mentioned in the guidelines. | 11:08 |
zbr | if not needed, i will remove my -1, and just leave a comment. | 11:08 |
zbr | but experience told me that comments are ignored. | 11:08 |
sshnaidm | zbr, also wrt this patch https://review.opendev.org/#/c/705335/ | 11:10 |
sshnaidm | zbr, you shouldn't set -1 just to leave your comment, please read a code review guide I posted above | 11:11 |
sshnaidm | zbr, you can actually set +1 with a comment, not +2 | 11:11 |
ykarel | sshnaidm, me noticed in some of my patches where logs were not collected was container collect info stuck on compute node, i checked only few logs so not sure though | 11:11 |
sshnaidm | zbr, we don't have yet unittests for sova code, only functional | 11:11 |
sshnaidm | ykarel, yeah, it's exactly what I saw too | 11:12 |
ykarel | sshnaidm, ack then it's related :) | 11:12 |
sshnaidm | ykarel, that's why I wondered if it's the new podman or something like that | 11:12 |
zbr | add some, writing the first is less than 10 lines! | 11:12 |
sshnaidm | zbr, not in this patch, definitely | 11:12 |
sshnaidm | zbr, you can work on it, you're also part of Ci | 11:12 |
ykarel | sshnaidm,new podman causing timeout? | 11:12 |
ykarel | at least it's not updated in last few days | 11:13 |
sshnaidm | ykarel, well, you say it's not a new | 11:13 |
ykarel | i think month | 11:13 |
ykarel | yes it's not updated for many days | 11:13 |
sshnaidm | ykarel, yeah, and this started from something like a week ago, with all these gzips | 11:13 |
zbr | i bet that the same was said about all the previous patches, not in this one. i offer to write one to break that habit. | 11:13 |
ykarel | sshnaidm, last update for podman was in september | 11:14 |
sshnaidm | zbr, because patches should not mix things | 11:15 |
ykarel | yes possible it's related to gzip patches | 11:15 |
sshnaidm | ykarel, I see.. I wonder if it's related to gzipping | 11:15 |
ykarel | me not aware of any logs missing | 11:15 |
ykarel | before that | 11:15 |
sshnaidm | ykarel, maybe worth to do containers task as async and add a timeout | 11:16 |
sshnaidm | gonna try it | 11:16 |
ykarel | sshnaidm, for debugging i think could be done | 11:16 |
ykarel | but not permanent if logs are missing | 11:16 |
marios | sshnaidm: arxcruz: zbr: chkumar|rover: panda: * reviews please when you next have time thank you "Refresh start_named_hashes after promotion to prevent false positive" https://review.rdoproject.org/r/#/c/24665/ | 11:19 |
arxcruz | marios: i'm not familiar with the code, so i can only review from the point of view of the logic | 11:20 |
marios | arxcruz: cool thanks sure just review what you can, review the python | 11:20 |
marios | chkumar|rover: thanks for checking added pointer to where the tasks are executed by molecule https://review.rdoproject.org/r/#/c/24771/ | 11:24 |
sshnaidm | zbr, I think you're much better in all related linters and tests, so why wouldn't you contribute from your experience and write some tests for these files in followups? | 11:28 |
ykarel | sshnaidm, other thing i noticed is it's affecting only master | 11:37 |
ykarel | https://review.rdoproject.org/zuul/builds?job_name=tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset035&branch=master | 11:38 |
ykarel | similar can be seen for other jobs | 11:38 |
ykarel | and it started since 30th Jan | 11:41 |
ykarel | in fact 29th | 11:42 |
marios | ack sshnaidm checking | 11:45 |
marios | (was meant for tripleo ^ sshnaidm ) | 11:45 |
sshnaidm | ykarel, 035 in check is running on vexxhost | 11:47 |
chkumar|rover | sshnaidm, https://review.opendev.org/#/c/705007/2 is good to go | 11:47 |
chkumar|rover | marios, https://review.opendev.org/#/c/704805/ needs +w on this, thanks :-) | 11:48 |
ykarel | sshnaidm, so it's affecting both rdo and vexx, so cloud thing should be related i think | 11:48 |
sshnaidm | ykarel, yeah, shouldn't | 11:48 |
sshnaidm | ykarel, how do you see the problem from builds table? | 11:48 |
ykarel | sshnaidm, from job timing | 11:49 |
ykarel | 13k+ | 11:49 |
sshnaidm | oh, right | 11:49 |
ykarel | that's not strict but could be used | 11:49 |
marios | chkumar|rover: ack | 11:51 |
sshnaidm | ykarel, are you sure it's only master? Because if so, it shouldn't be related to collect-logs.. | 11:53 |
ykarel | sshnaidm, from what i saw it's master, but please cross check | 11:54 |
*** holser has joined #oooq | 11:54 | |
sshnaidm | chkumar|rover, do you know why we don't have train job in 3 party? https://review.opendev.org/#/c/702844/ | 11:55 |
chkumar|rover | sshnaidm, checking | 11:56 |
chkumar|rover | sshnaidm, I think it got missed adding it right now https://github.com/rdo-infra/review.rdoproject.org-config/blob/master/zuul.d/tripleo.yaml#L72 | 11:57 |
sshnaidm | chkumar|rover, I think it should be in ovb-branchless template | 11:58 |
sshnaidm | chkumar|rover, thanks! | 11:58 |
*** jbadiapa has joined #oooq | 12:03 | |
chkumar|rover | sshnaidm, under ovb branchless template it is there https://github.com/rdo-infra/rdo-jobs/blob/master/zuul.d/project-templates.yaml#L627 | 12:04 |
marios | sshnaidm: chkumar|rover: can you please check https://review.rdoproject.org/r/#/c/24771/ & https://review.rdoproject.org/r/#/c/24665/ | 12:04 |
sshnaidm | chkumar|rover, hmm.. then why doesn't it run, maybe problem with anchor? | 12:08 |
chkumar|rover | marios, regarding this one https://review.rdoproject.org/r/#/c/24665/ and I think I am watching the correct logs http://logs.rdoproject.org/65/24665/4/check/tripleo-ci-promotion-staging/2137100/logs/promoter_logs/centos7_master.log and http://localhost:58080/api/civotes_detail.html?commit_hash=360d335e94246d7095672c5aa92b59afa380a059&distro_hash=9e5988125e88f803ba20743be7aa99079dd275f2 | 12:19 |
chkumar|rover | ? sorry for the confusion | 12:19 |
weshay | chkumar|rover, :) things look a little greener this morning eh? | 12:21 |
marios | chkumar|rover: yes | 12:21 |
chkumar|rover | marios, thanks, done | 12:22 |
chkumar|rover | weshay, yes, with few hiccups | 12:22 |
chkumar|rover | weshay, in few master, ovb jobs logs went missing | 12:22 |
marios | chkumar|rover: i'll comment on the review too https://review.rdoproject.org/r/#/c/24665/ but where did you get that one //localhost:58080/api/civotes_detail.html?commit_hash=360d335e94246d7095672c5aa92b59afa380a059&distro_hash=9e5988125e88f803ba20743be7aa99079dd275f2 | 12:22 |
marios | chkumar|rover: thanks | 12:23 |
chkumar|rover | marios, http://logs.rdoproject.org/65/24665/4/check/tripleo-ci-promotion-staging/2137100/logs/promoter_logs/centos7_master.log just go down the logs | 12:24 |
*** rfolco has joined #oooq | 12:24 | |
marios | chkumar|rover: i see, i thought you were pointing to some other file | 12:24 |
marios | chkumar|rover: ack yes | 12:24 |
chkumar|rover | weshay, https://review.opendev.org/#/c/705007/ good to go, one less bug to worry | 12:25 |
weshay | chkumar|rover, go ahead and wf | 12:25 |
weshay | kopecmartin, mtg | 12:32 |
weshay | https://projects.engineering.redhat.com/browse/RHOSINFRA-2954 | 12:32 |
*** soniya29 has quit IRC | 12:41 | |
chkumar|rover | weshay, see ya directly during CIX, heading home | 12:51 |
*** udesale_ has joined #oooq | 12:55 | |
*** udesale has quit IRC | 12:57 | |
*** ratailor has quit IRC | 12:57 | |
*** ratailor has joined #oooq | 12:58 | |
*** marios is now known as marios|call | 13:01 | |
zbr | ykarel: am I correct to assume that on https://review.opendev.org/#/c/705378/1 you are trying to address the issue of having a html file gzipped? | 13:01 |
*** rlandy has joined #oooq | 13:01 | |
ykarel | zbr, nope | 13:01 |
ykarel | zbr, me trying to fix log footer which contains wrong links when artcl_gzip: false | 13:02 |
rfolco | weshay, joining scrum ? | 13:02 |
ykarel | zbr, see https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_fa0/705378/1/check/tripleo-ci-centos-7-undercloud-containers/fa06d68/logs/README.html without that patch | 13:03 |
ykarel | zbr, see https://f36f3fd4b8223f0221b9-992fbb76369d370eec805eb398c9de6e.ssl.cf2.rackcdn.com/705061/1/check/tripleo-ci-centos-7-standalone/31ec969/logs/README.html without the patch | 13:03 |
ykarel | in upstream logs are not gzipped but links there point to non-existent .gz files | 13:04 |
zbr | ykarel: ok, so the bug is in the generation of the readme file | 13:05 |
zbr | there are two approaches here | 13:06 |
zbr | a) generate the readme file after the archival runs, so it has right files | 13:06 |
zbr | b) alter log server configuration to transparently return .gz files when these exist, also known as pre-archived static file serving (which was even working at some point if I remember well) | 13:07 |
zbr | mainly the UI (readme file) should not need to know about that the file is kept compressed on the log server. | 13:08 |
ykarel | zbr, readme file is static | 13:09 |
zbr | maybe it should not be, i do not find it hard to make it more dynamic and include it in artcl | 13:10 |
ykarel | zbr, ack if you want to convert to dynamic u can get that, no issue from my side | 13:11 |
zbr | which could also avoid noise in the file like stuff that do not exist in current execution | 13:11 |
zbr | ykarel: I can give it a try, i bet it would require less maintenance than one that is updated with sed. | 13:12 |
ykarel | zbr, ack go for it | 13:12 |
ykarel | for me the important thing is to fix that, if there's a better way it's fine | 13:12 |
zbr | ykarel: send me link on current static copy and I will create a change today | 13:13 |
marios|call | https://code.engineering.redhat.com/gerrit/#/q/topic:17-standup+status:open | 13:13 |
zbr | ykarel: sure, I do see its use. | 13:13 |
ykarel | zbr, https://review.opendev.org/#/c/705379/ separate review depends on the other one | 13:13 |
ykarel | sshnaidm, i see docker got updated on 28th, what do u think if it can be related? | 13:20 |
*** ratailor has quit IRC | 13:21 | |
sshnaidm | ykarel, idk, but we used docker for logs collection before together with podman, I have a patch to separate them and not touch docker if we use podman | 13:21 |
ykarel | sshnaidm, but docker is still used in jobs | 13:22 |
ykarel | in jobs where pacemaker is there | 13:22 |
sshnaidm | ykarel, yes, and it should be used in logs collection in those jobs only | 13:23 |
sshnaidm | ykarel, I mean we ran both in every job in logs collection, no matter what is used | 13:23 |
weshay | chkumar|rover, may be an issue w/ fs001 rhel8 | 13:23 |
weshay | master | 13:23 |
ykarel | sshnaidm, ack iirc we used to run containers with both docker/podman in same job, not sure about recent though | 13:24 |
chkumar|rover | weshay, 500 no valid host | 13:24 |
ykarel | sshnaidm, example was ceph with docker others with podman | 13:24 |
marios|call | chkumar|rover: any idea molecule-container-push NODE_FAILURE in 0s https://review.rdoproject.org/r/#/c/24705/ | 13:24 |
*** derekh has quit IRC | 13:24 | |
marios|call | rechecking | 13:24 |
ykarel | but that was long ago | 13:24 |
sshnaidm | ykarel, I hope we don't. Otherwise need to manage containers lists for both engines | 13:25 |
chkumar|rover | ykarel, does on undercloud we have both podman and docker? | 13:25 |
ykarel | chkumar|rover, undercloud i think only one | 13:26 |
ykarel | the thing both one used to be there in overcloud in ceph jobs | 13:26 |
ykarel | but now i think everything ported to podman | 13:26 |
marios|call | https://bugs.launchpad.net/tripleo/+bug/1861342 https://review.rdoproject.org/r/24771 | 13:27 |
openstack | Launchpad bug 1861342 in tripleo "tripleo-ci promotion failing on "pull ppc64le tagged containers"" [Critical,Triaged] - Assigned to Marios Andreou (marios-b) | 13:27 |
chkumar|rover | sshnaidm, please comment on the latest update on this https://trello.com/c/0pT1zkSe/1316-cixlp1861378tripleociproa-multiple-postfailure-on-master-periodic-pipeline | 13:28 |
chkumar|rover | sshnaidm, since we still have logs missing on few master ovb jobs | 13:29 |
ykarel | sshnaidm, me now checks running jobs where it's stuck | 13:30 |
sshnaidm | ykarel, we don't have it on rhel8 ovb master, btw | 13:30 |
ykarel | sshnaidm, hmm i saw that | 13:31 |
ykarel | sshnaidm, so it can be related to docker update imo | 13:31 |
ykarel | as that's only centos | 13:31 |
sshnaidm | ykarel, or their combination with podman.. | 13:31 |
ykarel | sshnaidm, yes | 13:31 |
sshnaidm | ykarel, this has errors, but it set engine for podman only: https://review.opendev.org/#/c/705347/ let's see if it helps | 13:32 |
sshnaidm | ykarel, at least it worked once | 13:33 |
ykarel | sshnaidm, i logged in to a running job | 13:33 |
ykarel | sshnaidm, i see /usr/bin/docker-current stats --all --no-stream is stuck for 5 minutes in novacompute | 13:33 |
sshnaidm | ykarel, aha! | 13:33 |
ykarel | sshnaidm, share your keys, i have to leave now, i have a meeting in next half hour | 13:34 |
ykarel | sshnaidm, will add your keys so u can check more before node get destroyed | 13:34 |
sshnaidm | ykarel, https://github.com/sshnaidm.keys | 13:34 |
sshnaidm | ykarel, I think we have the same meeting :) | 13:34 |
ykarel | sshnaidm, i have PCD one | 13:34 |
ykarel | i will not be able to join bootcamp one today | 13:35 |
sshnaidm | ack | 13:35 |
ykarel | sshnaidm, try zuul@38.145.35.80 | 13:35 |
ykarel | sshnaidm, and then ssh heat-admin@192.168.24.16 | 13:35 |
*** ykarel is now known as ykarel|afk | 13:36 | |
ykarel|afk | i have installed strace there, i see it's stuck at some place | 13:36 |
ykarel|afk | i see FUTEX_WAIT | 13:37 |
sshnaidm | ykarel|afk, manually it's stuck too.. but works w/o --stream | 13:38 |
ykarel|afk | sshnaidm, ack /me leaving will be back after some time | 13:39 |
*** ykarel|afk is now known as ykarel|away | 13:39 | |
sshnaidm | ykarel|away, ack, thanks for the node | 13:39 |
*** Goneri has joined #oooq | 13:40 | |
rlandy | https://review.opendev.org/705052 | 13:42 |
rlandy | rfolco: ^^ | 13:44 |
sshnaidm | rlandy, I'm fine to merge it if you can please have the followup with changes, the current way looks unsafe to me and should be changed | 13:45 |
rlandy | sshnaidm: I have no problem making your suggested change | 13:45 |
sshnaidm | rlandy, cool, thanks | 13:45 |
*** marios|call is now known as marios | 13:45 | |
rlandy | I'd just like everyone to take one last look so we can merge it | 13:45 |
rlandy | we are holding steve at this point | 13:46 |
chkumar|rover | weshay, sshnaidm https://review.opendev.org/#/q/topic:cellv2+(status:open+OR+status:merged) I workflowed it, cellv2 patches | 13:49 |
chkumar|rover | weshay, sshnaidm do we want to run fs062 in periodic also? | 13:50 |
sshnaidm | chkumar|rover, yeah, would be great | 13:51 |
sshnaidm | chkumar|rover, thanks | 13:51 |
weshay | marios, please cancel todays boot camp sync | 13:52 |
* weshay needs to chat w/ phill | 13:52 | |
marios | weshay: ack doing | 13:53 |
weshay | marios, thanks | 13:53 |
marios | weshay: postpone or just cancel? | 13:53 |
arxcruz | have a doctor appointment, be back in 1 hour | 13:53 |
weshay | marios, not sure :) | 13:53 |
weshay | will let you know | 13:54 |
marios | weshay: no i mean the sync | 13:54 |
weshay | marios, updated our 1-1 | 13:54 |
marios | weshay: ack ok i'lll cancel it we can re-setup a call np | 13:54 |
weshay | ya.. we'll kick it later this week | 13:54 |
weshay | if we're still on | 13:54 |
sshnaidm | rlandy, commented https://review.opendev.org/#/c/705052/ | 13:55 |
weshay | chkumar|rover, updating https://bugs.launchpad.net/tripleo/+bug/1856016 to triaged | 13:57 |
openstack | Launchpad bug 1856016 in tripleo "Tempest basic ops test_cross_tenant_traffic tests failed on fs020 train with Timed out waiting for 10.0.0.102 to become reachable from 10.0.0.117" [Critical,Triaged] | 13:57 |
rlandy | sshnaidm: updated | 13:58 |
chkumar|rover | rlandy, small nit pick on metalsmith patch | 13:58 |
weshay | chkumar|rover, did you see lines 98 - 111 on https://etherpad.openstack.org/p/ruckroversprint21 | 13:59 |
rlandy | chkumar|rover: patch is updated | 14:00 |
chkumar|rover | weshay, yes, need to ping cgoncalves for the same | 14:00 |
weshay | chkumar|rover, looks like we should bug it | 14:00 |
weshay | chkumar|rover, I've seen it twice in the gates over the last few weeks | 14:00 |
chkumar|rover | weshay, file it sir :-) | 14:01 |
weshay | k.. will do | 14:01 |
chkumar|rover | weshay, going to generate some ansible related tasks for next sprint related to os_tempest | 14:01 |
chkumar|rover | octavia and ironic support in os_tempest | 14:01 |
weshay | k | 14:01 |
*** holser has quit IRC | 14:02 | |
chkumar|rover | weshay, rlandy merge these patches in morning https://review.opendev.org/#/q/topic:unskipvolume+(status:open+OR+status:merged) | 14:03 |
chkumar|rover | sorry in your evening | 14:03 |
rlandy | ok | 14:03 |
chkumar|rover | it might increase the time of fs020 | 14:03 |
chkumar|rover | so many tests getting unskipped | 14:03 |
*** ykarel|away is now known as ykarel | 14:04 | |
chkumar|rover | weshay, we need to keep an eye on these patches https://review.opendev.org/#/q/topic:mistral_to_ansible+(status:open+OR+status:merged)+status:open+label:verified%253D%252B1%252Cuser%253Dzuul they might break periodic master jobs | 14:04 |
rlandy | chkumar|rover: looking for test run here | 14:04 |
weshay | chkumar|rover, what is your additional concern? | 14:05 |
weshay | re: greater than the normal potential for breakage? | 14:05 |
chkumar|rover | weshay, nothing | 14:06 |
weshay | chkumar|rover, this cleared 3rd party https://review.opendev.org/#/c/705323/ | 14:06 |
rlandy | https://review.opendev.org/#/c/704919 | 14:06 |
rlandy | not through gate either | 14:06 |
rlandy | chkumar|rover: ^^ | 14:06 |
weshay | chkumar|rover, /me more concerned about the recent fs001 rhel 8 failures | 14:07 |
* weshay will dig in | 14:07 | |
chkumar|rover | weshay, you mean this one https://logserver.rdoproject.org/79/24779/1/check/periodic-tripleo-ci-rhel-8-ovb-3ctlr_1comp-featureset001-master/941e294/logs/undercloud/home/zuul/overcloud_deploy.log 500 error | 14:08 |
weshay | chkumar|rover, https://bugs.launchpad.net/tripleo/+bug/1861685 | 14:11 |
openstack | Launchpad bug 1861685 in tripleo "scenario10 tempest random tempest failures in check / gate, cloud related" [High,Triaged] | 14:11 |
weshay | put notes there | 14:11 |
chkumar|rover | weshay, sure | 14:12 |
*** holser has joined #oooq | 14:14 | |
*** derekh has joined #oooq | 14:17 | |
*** ykarel is now known as ykarel|mtg | 14:19 | |
rfolco | panda, what am I missing to run my delegated molecule test here https://review.rdoproject.org/r/#/c/24762/ | 14:20 |
rfolco | panda, tox.ini has molecule_delegated env, why is it not finding my new test | 14:20 |
marios | panda: please check when you next have reviews time thanks https://review.rdoproject.org/r/#/c/24771 (and as discussed added there new task https://tree.taiga.io/project/tripleo-ci-board/task/1510 under story 1493 consolidate tests) | 14:25 |
*** marios is now known as marios|call | 14:32 | |
*** TrevorV has joined #oooq | 14:33 | |
chkumar|rover | weshay, anything more needed on my side, I will be logging off now | 14:37 |
chkumar|rover | feel free to drop emails | 14:37 |
chkumar|rover | see ya, Have a nice day and evening ahead | 14:37 |
*** chkumar|rover is now known as raukadah | 14:37 | |
weshay | raukadah++ | 14:38 |
panda | rfolco: tox.ini ignores delegated tests, you need to add another job | 14:44 |
panda | rfolco: and non-delegated tests are broken right now. | 14:44 |
rfolco | panda, ok I realized that looking at the other delegated ones at zuul.d/jobs.yaml | 14:45 |
rfolco | panda, thanks | 14:47 |
*** marios|call is now known as marios | 14:49 | |
*** dtantsur is now known as dtantsur|brb | 14:50 | |
*** ykarel|mtg is now known as ykarel | 14:52 | |
*** holser has quit IRC | 14:57 | |
*** jbadiapa has quit IRC | 15:02 | |
*** apetrich has quit IRC | 15:07 | |
panda | marios: do you have the powers of merge on https://review.rdoproject.org/r/#/c/24771 ? You can merge at your discretion. | 15:13 |
marios | ack thanks panda. reviews please @ https://review.rdoproject.org/r/#/c/24771 when you next have time cc rfolco weshay sshnaidm ** i will merge it if it's still around without -1 tomorrow morning | 15:16 |
marios | panda: in fact, it won't do anything until we post to re-enable it weshay (revert revert https://review.rdoproject.org/r/#/c/24750) | 15:17 |
ykarel | sshnaidm, zbr revert temp patch https://review.rdoproject.org/r/#/c/24778/ | 15:26 |
*** zbr has quit IRC | 15:30 | |
arxcruz | back | 15:31 |
*** zbr has joined #oooq | 15:33 | |
*** zbr has quit IRC | 15:33 | |
*** zbr has joined #oooq | 15:34 | |
raukadah | rlandy, sshnaidm, please have a look at this one https://review.opendev.org/#/c/705175/2 when free | 15:36 |
*** rfolco is now known as rfolco|eats | 15:43 | |
*** ykarel is now known as ykarel|away | 15:44 | |
marios | sshnaidm: do you have something to point me at for the 'no logs in ovb' we discussed earlier please | 15:47 |
marios | sshnaidm: like bug or review or taiga | 15:47 |
sshnaidm | marios, https://bugs.launchpad.net/tripleo/+bug/1861694 | 15:47 |
openstack | Launchpad bug 1861694 in tripleo "Nonstop restarting ovn_metadata_haproxy container" [Critical,Triaged] | 15:47 |
marios | sshnaidm: thank you | 15:48 |
sshnaidm | marios, review: https://review.opendev.org/705446 | 15:48 |
marios | sshnaidm: thanks | 15:49 |
*** holser has joined #oooq | 15:57 | |
*** jbadiapa has joined #oooq | 16:03 | |
*** dtantsur|brb is now known as dtantsur | 16:05 | |
weshay | zbr, ping | 16:24 |
weshay | re: logging | 16:24 |
zbr | o/ | 16:24 |
zbr | i was now working to fix the readme generation, i will have a change ready for review before tomorrow | 16:25 |
zbr | POC already worked. | 16:25 |
zbr | it will be kickass readme. | 16:25 |
weshay | zbr, what's broken re: the readme? | 16:26 |
zbr | yatin tried to fix the the broken .gz links in the readme. | 16:26 |
zbr | new one will only have working links. w/ or w/o .gz based on the case. | 16:27 |
*** sshnaidm is now known as sshnaidm|afk | 16:27 | |
weshay | oh this link undercloud/var/log/tripleo-container-image-prepare.log.txt.gz - the container download, container update and provision log | 16:27 |
weshay | and this one https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_db1/705397/1/gate/tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates/db10d98/logs/undercloud/var/log/extra/errors.txt.txt.gz | 16:28 |
weshay | zbr, let's chat about this | 16:28 |
weshay | https://meet.google.com/otr-pmkh-rer?authuser=1 | 16:29 |
weshay | zbr, you avail? | 16:32 |
*** udesale_ has quit IRC | 16:32 | |
zbr | sure. joining. | 16:32 |
* marios home time | 16:44 | |
*** marios is now known as marios|out | 16:50 | |
*** rfolco|eats is now known as rfolco | 16:51 | |
*** marios|out has quit IRC | 16:58 | |
*** bogdando has quit IRC | 16:59 | |
weshay | zbr, https://opendev.org/openstack/tripleo-quickstart-extras/src/branch/master/roles/validate-tempest/vars | 17:02 |
rlandy | weshay: ever hit this error when running standalone on your VM setup? "reset failed: reset: standard error: Inappropriate ioctl for device" | 17:30 |
weshay | hrm... not that I recall.. sec | 17:31 |
*** jbadiapa has quit IRC | 17:40 | |
*** derekh has quit IRC | 18:00 | |
weshay | rlandy, which part is that failing on? | 18:14 |
weshay | virt-customize? | 18:14 |
rlandy | weshay: nvm - I got by it following https://bugs.launchpad.net/tripleo/+bug/1842677 | 18:14 |
openstack | Launchpad bug 1842677 in tripleo "reset failed: reset: standard error: Inappropriate ioctl for device" [Low,Fix released] - Assigned to Alex Schultz (alex-schultz) | 18:14 |
rlandy | I hacked that part not to reset the console | 18:14 |
rlandy | not needed | 18:14 |
rlandy | on other errors now | 18:14 |
raukadah | weshay, regarding tempest 23.0.0 I will drop an email tomorrow so that downstream ci can keep an eye | 18:19 |
weshay | raukadah, that merged for us? the bump to 23.0? | 18:19 |
raukadah | weshay, nope, still in testing phase | 18:19 |
raukadah | weshay, we are now good to go to merge the stuff | 18:20 |
weshay | k.. and not related at all to multicell? | 18:20 |
raukadah | weshay, nope | 18:20 |
weshay | k k.. | 18:21 |
weshay | thanks | 18:21 |
weshay | raukadah, go to bed | 18:21 |
raukadah | weshay, multinode failure is unrelated, need to keep an eye on it | 18:21 |
weshay | raukadah, we do http://dashboard-ci.tripleo.org/d/jobs/jobs-exploration?orgId=1&fullscreen&panelId=16&from=now-90d&to=now | 18:23 |
weshay | still living on the edge | 18:24 |
*** weshay is now known as weshay|ruck | 18:30 | |
*** holser has quit IRC | 18:36 | |
*** dtantsur is now known as dtantsur|afk | 19:53 | |
*** tesseract has quit IRC | 20:01 | |
*** irclogbot_2 has quit IRC | 20:05 | |
*** irclogbot_3 has joined #oooq | 20:05 | |
*** jmasud has quit IRC | 20:12 | |
*** jmasud has joined #oooq | 20:13 | |
weshay|ruck | rlandy, when you have a sec https://review.opendev.org/#/c/705549/ | 20:39 |
rlandy | looking | 20:40 |
weshay|ruck | rfolco, did we get a new count on centos-8? | 20:48 |
rfolco | weshay|ruck, no, I'm retrying with the new patches merged and the new skips | 21:12 |
rfolco | weshay|ruck, ceph patch merged | 21:12 |
*** jfrancoa has quit IRC | 21:14 | |
weshay|ruck | k | 21:15 |
*** Goneri has quit IRC | 21:16 | |
weshay|ruck | rfolco, has this executed anywhere? https://review.rdoproject.org/r/#/c/24775/ | 21:18 |
rfolco | weshay|ruck, no, I don't know what's going on with marios' patch | 21:19 |
rfolco | weshay|ruck, need to sync with him tomorrow | 21:19 |
weshay|ruck | rfolco, think we just need to update zuul config no? | 21:20 |
rfolco | weshay|ruck, https://review.opendev.org/#/c/701937/ zuul -1 | 21:20 |
weshay|ruck | rfolco, wasn't so hard ;) https://review.rdoproject.org/r/#/c/24775/ | 21:28 |
weshay|ruck | ah crud maybe it is so hard | 21:28 |
weshay|ruck | hrm | 21:28 |
weshay|ruck | rlandy, what are we missing here to trigger build containers on a change to zuul.d/build-containers.yaml file | 21:35 |
rlandy | ? | 21:35 |
weshay|ruck | oh.. sorry | 21:35 |
rlandy | rlandy or rfolco? | 21:35 |
weshay|ruck | https://review.rdoproject.org/r/#/c/24775/5/zuul.d/build-containers.yaml | 21:35 |
weshay|ruck | rlandy, | 21:35 |
weshay|ruck | rlandy, need to define in projects.yaml? | 21:36 |
rlandy | weshay|ruck: it should work .. have you tried changing the /build-containers.yaml | 21:38 |
rlandy | file | 21:38 |
rlandy | so usually I would add the files to where the job is called | 21:39 |
rlandy | weshay|ruck: may I edit that? | 21:41 |
weshay|ruck | rlandy, latest patch is up now | 21:42 |
weshay|ruck | rlandy, there it goes | 21:42 |
weshay|ruck | rlandy, you can't add a template there I guess? | 21:43 |
weshay|ruck | rfolco, it's running man | 21:43 |
weshay|ruck | rfolco, https://review.rdoproject.org/zuul/status 24775 | 21:43 |
rlandy | weshay|ruck: k - so while we are talking ... getting somewhere with the tls standalone ... question ... | 21:43 |
weshay|ruck | rlandy, aye | 21:43 |
rlandy | standalone deploy fails with ERROR! the role 'ipaclient' was not found in ... | 21:43 |
rlandy | the ipaclient is in https://github.com/freeipa/ansible-freeipa/tree/master/roles/ipaclient | 21:44 |
rlandy | how do I get THT to pick that up? | 21:44 |
rlandy | add it to the role path for the deploy? | 21:44 |
weshay|ruck | that's a good question... | 21:45 |
weshay|ruck | I could poke at this with you but I don't remember offhand, or I may never have known | 21:45 |
weshay|ruck | sagi would know | 21:45 |
weshay|ruck | but happy to look at it w/ you | 21:46 |
rlandy | I'll dig a bit more - if no results, I'll ping you | 21:46 |
weshay|ruck | k | 21:46 |
*** saneax has quit IRC | 22:25 | |
*** rfolco has quit IRC | 22:36 | |
weshay|ruck | rlandy, fwiw.. pretty sure the ansible path is set by tripleo-common; cloudnull could help | 22:42 |
rlandy | weshay|ruck: sorry - found it | 22:42 |
weshay|ruck | :) | 22:42 |
rlandy | /usr/share/ansible/roles | 22:42 |
*** TrevorV has quit IRC | 22:53 | |
*** jaosorior has quit IRC | 23:34 | |
*** jaosorior has joined #oooq | 23:50 |