openstackgerrit | James E. Blair proposed zuul/zuul master: Fix multi-tenant caching of extra config files https://review.opendev.org/669008 | 00:36 |
---|---|---|
*** rlandy|bbl is now known as rlandy | 00:48 | |
*** igordc has quit IRC | 00:49 | |
*** saneax has joined #zuul | 00:55 | |
*** rlandy has quit IRC | 01:09 | |
*** bhavikdbavishi has joined #zuul | 01:39 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: Fix multi-tenant caching of extra config files https://review.opendev.org/669008 | 01:43 |
*** swest has quit IRC | 01:45 | |
*** bhavikdbavishi has quit IRC | 01:52 | |
*** swest has joined #zuul | 01:59 | |
*** bhavikdbavishi has joined #zuul | 03:12 | |
*** bhavikdbavishi has quit IRC | 03:19 | |
*** bhavikdbavishi has joined #zuul | 03:25 | |
*** swest has quit IRC | 04:35 | |
*** altlogbot_0 has quit IRC | 04:57 | |
*** altlogbot_0 has joined #zuul | 04:59 | |
*** swest has joined #zuul | 05:29 | |
*** jamesmcarthur has joined #zuul | 05:46 | |
*** saneax has quit IRC | 05:50 | |
*** jamesmcarthur has quit IRC | 05:50 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: DNM: Test whether jobs run https://review.opendev.org/669056 | 06:08 |
*** bhavikdbavishi has quit IRC | 07:07 | |
*** pcaruana has joined #zuul | 07:28 | |
*** themroc has joined #zuul | 07:30 | |
*** bhavikdbavishi has joined #zuul | 07:32 | |
*** wxy-xiyuan has joined #zuul | 07:32 | |
*** tosky has joined #zuul | 07:44 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 07:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 07:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 08:02 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 08:02 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 08:05 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 08:05 |
*** sshnaidm|afk is now known as sshnaidm|ruck | 08:11 | |
*** jangutter has joined #zuul | 08:13 | |
AJaeger | corvus, I'm puzzled why the zuul-cloner test fails in 668061, let me do one more test... | 08:17 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 08:19 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 08:19 |
*** pcaruana has quit IRC | 08:28 | |
*** saneax has joined #zuul | 08:31 | |
*** pcaruana has joined #zuul | 09:04 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 09:06 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 09:06 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 09:16 |
AJaeger | corvus: I think I found it - you really need the use-cached-repos role... | 09:17 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 09:21 |
*** hashar has joined #zuul | 09:30 | |
AJaeger | corvus: that fixes it ^ - please check whether that is what you intended ;) | 09:31 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 09:32 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 09:36 |
*** hwangbo has quit IRC | 09:40 | |
AJaeger | corvus: and the node-failures come from using as labels fedora-latest (valid nodeset but not label) etc, fix coming | 09:44 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add a script to make platform-specific versions of jobs https://review.opendev.org/668955 | 09:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 09:46 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 09:46 |
AJaeger | corvus: let's try again ;) | 09:47 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add a script to make platform-specific versions of jobs https://review.opendev.org/668955 | 09:53 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 09:53 |
AJaeger | what fun ;( | 09:53 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 09:53 |
AJaeger | corvus: this should fix now finally the node-failures | 09:56 |
*** hashar has quit IRC | 09:57 | |
*** pcaruana has quit IRC | 10:12 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 10:14 |
*** bhavikdbavishi has quit IRC | 10:50 | |
AJaeger | corvus: full stack passes now. Exception is 668767 where the ubuntu-trusty job failed. | 11:00 |
AJaeger | And that failure is a mirror failure, hope recheck will succeed | 11:01 |
AJaeger | zuul-jobs maintainers, config-core, please review stack starting at https://review.opendev.org/668955 | 11:01 |
*** hashar has joined #zuul | 11:09 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add Gentoo integration tests https://review.opendev.org/669147 | 11:22 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add Gentoo integration tests https://review.opendev.org/669147 | 11:26 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: DNM: Trigger gentoo runs https://review.opendev.org/669148 | 11:26 |
*** bhavikdbavishi has joined #zuul | 12:03 | |
*** saneax has quit IRC | 12:13 | |
*** saneax has joined #zuul | 12:14 | |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add a script to make platform-specific versions of jobs https://review.opendev.org/668955 | 12:16 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add base role integration jobs https://review.opendev.org/668061 | 12:16 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add multi-node integration jobs https://review.opendev.org/668767 | 12:16 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: Add Gentoo integration tests https://review.opendev.org/669147 | 12:16 |
openstackgerrit | Andreas Jaeger proposed zuul/zuul-jobs master: DNM: Trigger gentoo runs https://review.opendev.org/669148 | 12:16 |
*** rfolco has joined #zuul | 12:28 | |
*** rlandy has joined #zuul | 12:30 | |
*** bkorren has joined #zuul | 12:55 | |
bkorren | hi there - is there a way to make a jos not show up in the report zuul send to gerrit? | 12:56 |
bkorren | s/jos/job | 12:56 |
*** bhavikdbavishi has quit IRC | 13:13 | |
AJaeger | bkorren: I'm not aware of that - why would you want to do so? | 13:13 |
bkorren | AJaeger, well the job is doing some background setup task that I don't want my users to see | 13:14 |
*** bhavikdbavishi has joined #zuul | 13:14 | |
bkorren | AJaeger, it should have been a 'pre' playbook, but issues with the static nodepool provider forced me to make it into a separate job | 13:15 |
AJaeger | bkorren: and what if it fails? | 13:17 |
bkorren | AJaeger, highly unlikely - and I'll get an email | 13:18 |
AJaeger | ;) | 13:19 |
AJaeger | bkorren: just double checked - and couldn't find any job variables for this. | 13:19 |
AJaeger | so, cannot help further myself | 13:20 |
bkorren | AJaeger, ok, thanks anyway | 13:20 |
*** bhavikdbavishi has quit IRC | 13:33 | |
AJaeger | corvus, config-core, stack starting at 668955 for zuul-jobs passes \o/ | 13:54 |
AJaeger | but enjoy 4th of July, no urgency on the stack ;) | 13:54 |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Spec: Add a Kubernetes Operator for Zuul https://review.opendev.org/659180 | 14:20 |
*** bkorren has quit IRC | 14:32 | |
pabelanger | Is there any way a running job, could know if autohold has been requested? | 14:33 |
pabelanger | given the cleanup-run phase, was thinking of maybe auto add of ssh key | 14:33 |
*** hashar has quit IRC | 15:07 | |
*** tosky has quit IRC | 15:14 | |
flaper87 | ofosos: you should be able to omit it, if not there may be a bug | 15:28 |
flaper87 | ofosos: you can also try setting it to something that is not valid, it'll be ignored. But ideally, you should be able to omit that setting | 15:29 |
openstackgerrit | Monty Taylor proposed zuul/zuul master: Spec: Add a Kubernetes Operator for Zuul https://review.opendev.org/659180 | 15:33 |
*** chandankumar is now known as raukadah | 15:39 | |
*** egustafson has quit IRC | 16:10 | |
*** themroc has quit IRC | 16:46 | |
SpamapS | pabelanger: maybe go the other way. Let a trusted job request autohold. Then you can have a post job that checks for things that suggest the node should be held. | 17:56 |
*** saneax has quit IRC | 18:58 | |
*** tosky has joined #zuul | 19:13 | |
*** dmyrhorodskyi has joined #zuul | 19:32 | |
*** dmyrhorodskyi has quit IRC | 19:43 | |
*** gtema_ has joined #zuul | 19:49 | |
*** EmilienM is now known as EvilienM | 19:54 | |
*** EvilienM is now known as EmilienM | 19:56 | |
*** dmyrhorodskyi has joined #zuul | 20:03 | |
dmyrhorodskyi | Hi, I'm running Zuul for image testing purposes with upstream playbooks and gate scripts. We currently have an issue where a playbook defined in the job's RUN stage is executed twice on the same worker, in parallel. Since both runs execute the same scripts, they usually fail at some point. I've taken some time to investigate this issue but | 20:06 |
dmyrhorodskyi | could not find a solution. So far we have a K8s-deployed Zuul 3.9.1.dev63, and we can see the same playbook running in two separate ssh connections, as you can see in the attached snippet. This problem persists in 90% of runs, but sometimes it is absent. Could someone please help investigate this problem? | 20:06 |
dmyrhorodskyi | 14:22 0:00 \_ sshd: zuul [priv] | 20:06 |
dmyrhorodskyi | 20:06 | |
dmyrhorodskyi | 20:06 | |
dmyrhorodskyi | \_ /bin/bash ./tools/deployment/osh-infra-logging/020-ceph.sh | 20:06 |
dmyrhorodskyi | 0.0 0.0 7924 776 ? S 14:39 0:00 | \_ sleep 5 | 20:06 |
dmyrhorodskyi | 20:06 | |
dmyrhorodskyi | ELM_ARGS='' OSH_PATH=../openstack-helm/ OSH_INFRA_PATH=../openstack-helm-infra/ OPENSTACK_RELEASE=newton | 20:06 |
dmyrhorodskyi | S 14:32 0:00 | \_ /bin/sh -c set -xe; ./tools/deployment/osh-infra-logging/020-ceph.sh | 20:06 |
dmyrhorodskyi | 20:06 | |
dmyrhorodskyi | 20:06 | |
*** dmyrhorodskyi has quit IRC | 20:13 | |
fungi | just a guess, but could your node be listed twice in the ansible inventory for some reason? | 20:15 |
fungi | or are your nodes using the static driver and somehow ending up running more than one build concurrently? | 20:16 |
*** kkalina has joined #zuul | 20:30 | |
kkalina | fungi, hi, @dmytri and I are doing the same thing. I have checked the inventory multiple times during the execution, since my first guess was that it adds the same host to the inventory twice, but no, the inventory has 1 host only. | 20:37 |
kkalina | we are using openstack driver, and using ephemeral instances, i have double checked that we are not using the same instance twice | 20:38 |
fungi | that's definitely strange... does the job have a parent which also includes the same playbook? | 20:40 |
fungi | but that wouldn't explain why it only happens 10% of the time so guessing not | 20:40 |
kkalina | sorry, if i have misled you, it happens 90% of the time | 20:41 |
fungi | ahh | 20:41 |
fungi | possible i misread | 20:41 |
kkalina | well maybe not 90% but most of the time | 20:41 |
fungi | the executor's debug log entries for one of the impacted builds might yield clues | 20:41 |
fungi | probably named something like /var/log/zuul/executor-debug.log | 20:42 |
kkalina | it doesn't have parent, other than base jobs, that just prepare workspace etc, from zuul-jobs repo | 20:42 |
kkalina | i will double check the logs, we are running the executor in k8s currently, using the latest docker image zuul/zuul-executor, as this is kind of a POC. I can upload a whole bunch of logs so you can see it, but there are no errors, other than stream_log connection resets. but as i understand it, that is a websocket connection that shouldn't affect anything | 20:46 |
*** gtema_ has quit IRC | 20:47 | |
kkalina | and we also have a bunch of `defunct` child processes of the zuul executor appearing and disappearing in `ps` output | 20:49 |
fungi | you should be able to filter the executor log for a particular build uuid and see the sequence of actions it's taking | 20:51 |
mordred | fungi, kkalina: I feel like we OCCASIONALLY saw something that sounds similar like a year ago or something. I'm not sure we ever fully tracked it down ... pabelanger or clarkb might remember | 21:33 |
kkalina | whenever this happens, i can see message from zuul 2019-07-04 21:32:31.422228 | [Zuul] Log Stream did not terminate | 21:35 |
kkalina | in the log stream, at the web component, i can see that it emits logs from the first playbook run, but when logs are downloaded with the upload-logs role, they are different, and the output of `ps auxf`, for example, contains two entries of the same playbook belonging to different ssh connections, with a difference of 5 minutes. | 21:41 |
pabelanger | if 2 jobs are running, on the same node, do they have the same build uuid? | 22:09 |
*** rlandy is now known as rlandy|bbl | 22:16 | |
*** tosky has quit IRC | 22:56 | |
*** sshnaidm|ruck is now known as sshnaidm|off | 23:14 |
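
fungi's suggestion above (filter the executor debug log for one build UUID and read off the sequence of actions) could be sketched roughly as follows. This is only an illustration: it assumes that every relevant log line contains the build UUID as a plain substring, the helper name `filter_build_lines` is hypothetical, and the log path is just the guess given in the channel, not a confirmed location.

```python
import sys
from typing import Iterable, Iterator


def filter_build_lines(lines: Iterable[str], build_uuid: str) -> Iterator[str]:
    """Yield, in original order, the log lines that mention build_uuid.

    Assumption: the UUID appears verbatim somewhere in each relevant line.
    """
    for line in lines:
        if build_uuid in line:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Usage: python filter_build.py <log file> <build uuid>
    # e.g. a file like /var/log/zuul/executor-debug.log (path is a guess).
    log_path, uuid = sys.argv[1], sys.argv[2]
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for matched in filter_build_lines(handle, uuid):
            print(matched)
```

If the same playbook shows up as started twice under a single build UUID, that would point at the executor rather than at the inventory or at node reuse, which is the distinction pabelanger's question is after.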