openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: helm-template: enable using values file https://review.opendev.org/721365 | 00:11 |
---|---|---|
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: helm-template: allow users to disable wait-for-pods https://review.opendev.org/721369 | 00:11 |
*** kmalloc has joined #zuul | 00:15 | |
*** cdearborn has quit IRC | 00:24 | |
openstackgerrit | Mohammed Naser proposed zuul/zuul-jobs master: helm-template: enable using values file https://review.opendev.org/721365 | 00:35 |
*** dmsimard1 has joined #zuul | 00:50 | |
*** dmsimard has quit IRC | 00:51 | |
*** dmsimard1 is now known as dmsimard | 00:52 | |
*** swest has quit IRC | 01:11 | |
*** swest has joined #zuul | 01:26 | |
*** rlandy has quit IRC | 01:47 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Dockerfile: incoprorate workaround deboostrap https://review.opendev.org/721394 | 01:53 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Dockerfile: incoprorate workaround deboostrap https://review.opendev.org/721394 | 02:23 |
*** kmalloc has quit IRC | 02:24 | |
*** bhavikdbavishi has joined #zuul | 02:56 | |
*** bhavikdbavishi has quit IRC | 03:03 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Functional container tests: update to CentOS 8 https://review.opendev.org/721509 | 03:10 |
*** bhavikdbavishi has joined #zuul | 03:52 | |
*** bhavikdbavishi1 has joined #zuul | 03:54 | |
*** bhavikdbavishi has quit IRC | 03:56 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 03:56 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: [wip] set work dir for image build job to nodepool https://review.opendev.org/721514 | 03:57 |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Mandatory Zookeeper connection for ZuulWeb https://review.opendev.org/721254 | 04:20 |
*** ysandeep|afk is now known as ysandeep | 04:20 | |
*** olaph has quit IRC | 04:22 | |
*** evrardjp has quit IRC | 04:35 | |
*** evrardjp has joined #zuul | 04:35 | |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Dockerfile: incorporate workaround deboostrap https://review.opendev.org/721394 | 05:10 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: Functional container tests: update to CentOS 8 https://review.opendev.org/721509 | 05:10 |
openstackgerrit | Ian Wienand proposed zuul/nodepool master: [wip] set work dir for image build job to nodepool https://review.opendev.org/721514 | 05:10 |
*** sgw has quit IRC | 05:22 | |
*** sgw has joined #zuul | 05:40 | |
*** saneax has joined #zuul | 06:03 | |
*** dpawlik has joined #zuul | 06:08 | |
*** Goneri has quit IRC | 06:18 | |
*** bhavikdbavishi has quit IRC | 07:01 | |
*** rpittau|afk is now known as rpittau | 07:03 | |
*** yolanda has joined #zuul | 07:13 | |
*** bhavikdbavishi has joined #zuul | 07:14 | |
avass | anyone wanna take a look at these two? https://review.opendev.org/#/c/721248/ https://review.opendev.org/#/c/721237/ | 07:19 |
*** jcapitao has joined #zuul | 07:24 | |
*** tosky has joined #zuul | 07:44 | |
openstackgerrit | Tobias Henkel proposed zuul/nodepool master: Parallelize initial static node synchronization https://review.opendev.org/721205 | 07:44 |
*** bhavikdbavishi has quit IRC | 07:46 | |
*** threestrands has quit IRC | 07:55 | |
*** jpena|off is now known as jpena | 07:57 | |
*** lennyb has joined #zuul | 08:04 | |
*** bhavikdbavishi has joined #zuul | 08:09 | |
*** ysandeep is now known as ysandeep|lunch | 08:17 | |
*** bhavikdbavishi has quit IRC | 08:42 | |
*** bhavikdbavishi has joined #zuul | 09:06 | |
openstackgerrit | Paul Albertella proposed zuul/zuul-jobs master: Add Bazel build and install roles https://review.opendev.org/693513 | 09:08 |
openstackgerrit | Paul Albertella proposed zuul/zuul-jobs master: Add Bazel build and install roles https://review.opendev.org/693513 | 09:26 |
reiterative | I finally found time to fix my Bazel roles submission to zuul-jobs if someone has time to review it: https://review.opendev.org/#/c/693513 | 09:39 |
*** ysandeep|lunch is now known as ysandeep | 09:39 | |
*** sshnaidm|afk is now known as sshnaidm | 09:54 | |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use cached 'tox_executable' in fetch-tox-output https://review.opendev.org/721192 | 10:01 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use cached 'tox_executable' in fetch-tox-output https://review.opendev.org/721192 | 10:06 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use cached 'tox_executable' in fetch-tox-output https://review.opendev.org/721192 | 10:10 |
openstackgerrit | Albin Vass proposed zuul/zuul-jobs master: Use cached 'tox_executable' in fetch-tox-output https://review.opendev.org/721192 | 10:11 |
*** rpittau is now known as rpittau|bbl | 10:31 | |
avass | reiterative: I guess we haven't announced it yet, but we deprecated the install-* roles and renamed them to ensure-* to have some consistency | 10:35 |
avass | reiterative: so can you rename the install-bazel to ensure-bazel? :) | 10:36 |
zbr | apparently fetch-sphinx-tarball can fail with bzip2: command not found | 10:37 |
zbr | and there is no code inside entire zuul-jobs to install bzip2 | 10:37 |
zbr | i am going to add a task for this | 10:37 |
avass | zbr: on the remote host? | 10:37 |
zbr | see https://object-storage-ca-ymq-1.vexxhost.net/v1/a0b4156a37f9453eb4ec7db5422272df/ansible_64/2664/d7678ba186eb19e5f36529588af292dc52f17e9b/check/molecule-tox-docs/e644d18/job-output.html | 10:38 |
zbr | is from ansible zuul instance, but it does not matter, role should be self-contained and at least attempt to install its own reqs. | 10:38 |
avass | reiterative: otherwise I think it looks good, just going to take a look at it a bit later again since I usually miss something the first time | 10:38 |
avass | zbr: yeah that's probably a good idea | 10:39 |
zbr | i know I can workaround and install it, but i prefer to fix it at the source, so others can benefit from a more reliable role. | 10:39 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: fetch-sphinx-tarball: install bzip2 https://review.opendev.org/721571 | 10:42 |
zbr | avass: funny, that i only mentioned on a change you made to the same role today that it lacks tests. | 10:42 |
zbr | at least multi-platforms tests for sure. | 10:43 |
avass | zbr: you mean the fetch-sphinx-tarball --no-same-owner? yeah, I'm planning on taking the time to go through all of the unarchive/synchronize and update them to all work like that soonish | 10:45 |
zbr | avass: no problem, I am now working to add tests for this role. | 10:46 |
avass | maybe I could add tests that create a new user on the remote so they fail if they try to keep owner/group | 10:46 |
zbr | avass: no need for your change. | 10:46 |
zbr | my comment was generic to the role, not to your change. | 10:46 |
avass | ah | 10:47 |
avass | this should be ready as well: https://review.opendev.org/#/c/721192/ | 10:48 |
*** jcapitao is now known as jcapitao_lunch | 10:51 | |
*** bhavikdbavishi has quit IRC | 10:59 | |
*** ysandeep is now known as ysandeep|afk | 11:01 | |
*** bhavikdbavishi has joined #zuul | 11:08 | |
*** ysandeep|afk is now known as ysandeep | 11:32 | |
reiterative | avass Yes, happy to rename | 11:34 |
*** jpena is now known as jpena|lunch | 11:35 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Add testing of fetch-sphinx-tarball role https://review.opendev.org/721584 | 11:36 |
*** harrymichal has joined #zuul | 11:46 | |
*** hashar has joined #zuul | 11:52 | |
openstackgerrit | Jan Kubovy proposed zuul/zuul master: Mandatory Zookeeper connection for ZuulWeb https://review.opendev.org/721254 | 11:52 |
*** bhavikdbavishi has quit IRC | 11:54 | |
*** jcapitao_lunch is now known as jcapitao | 12:09 | |
openstackgerrit | Paul Albertella proposed zuul/zuul-jobs master: Add Bazel build and ensure roles https://review.opendev.org/693513 | 12:11 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Add testing of fetch-sphinx-tarball role https://review.opendev.org/721584 | 12:12 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Add testing of fetch-sphinx-tarball role https://review.opendev.org/721584 | 12:13 |
*** rlandy has joined #zuul | 12:17 | |
*** Goneri has joined #zuul | 12:23 | |
zbr | avass: do you have any idea why on zuul-jobs we call ansible-lint with each role instead of calling with all of them at once? also affects performance. | 12:27 |
zbr | the find | xargs | ansible-lint ... | 12:27 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Make linting use of find portable https://review.opendev.org/721595 | 12:28 |
zbr | to me the -n1 added to xargs does not make sense | 12:28 |
reiterative | avass I've now uploaded a patch changing 'install' to 'ensure' for https://review.opendev.org/693513 | 12:28 |
*** rpittau|bbl is now known as rpittau | 12:31 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: WIP: Add testing of fetch-sphinx-tarball role https://review.opendev.org/721584 | 12:33 |
*** bhavikdbavishi has joined #zuul | 12:35 | |
*** jpena|lunch is now known as jpena | 12:35 | |
avass | reiterative: thanks! | 12:37 |
avass | zbr: don't know actually | 12:37 |
*** bhavikdbavishi has quit IRC | 12:54 | |
*** bhavikdbavishi has joined #zuul | 12:54 | |
*** bhavikdbavishi1 has joined #zuul | 12:57 | |
*** bhavikdbavishi has quit IRC | 12:59 | |
*** bhavikdbavishi1 is now known as bhavikdbavishi | 12:59 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: Add testing of fetch-sphinx-tarball role https://review.opendev.org/721584 | 13:10 |
*** zxiiro has joined #zuul | 13:17 | |
*** toabctl has quit IRC | 13:25 | |
*** toabctl has joined #zuul | 13:27 | |
zbr | avass: clarkb ^ small nit | 13:34 |
zbr | oops, i wanted to point to https://review.opendev.org/#/c/721595/ - the other one is in progress. | 13:35 |
*** harrymichal has quit IRC | 13:35 | |
*** harrymichal has joined #zuul | 13:36 | |
zbr | i think i need a hint on https://review.opendev.org/#/c/721584/ as zuul fails to run the code from fetch-sphinx-tarball, and runs the code from master | 13:44 |
zbr | not sure how can I enable its testing | 13:45 |
*** bhavikdbavishi has quit IRC | 13:47 | |
*** gtema has joined #zuul | 13:59 | |
*** y2kenny has joined #zuul | 14:00 | |
y2kenny | how does start-zuul-console work? Does it work with the ansible command or shell module to stream the stdout/stderr? | 14:04 |
mordred | y2kenny: yes, that's right | 14:05 |
y2kenny | is there anything I need to install on the remote side for the streaming to work? | 14:06 |
mordred | y2kenny: it starts a daemon that the callback plugin on the executor connects to to stream the command output - and the command/shell modules write their output to a file that the daemon streams | 14:06 |
mordred | y2kenny: nope - start-zuul-console should be self-contained | 14:06 |
y2kenny | I am getting log stream back from the executor but I don't seem to get log stream back from a container remote | 14:06 |
y2kenny | the logs exist in job-output.json I just don't see the stream | 14:07 |
*** gtema has quit IRC | 14:09 | |
y2kenny | mordred: is there any way to interact with the stream daemon at a lower level for debug? (i.e., can I echo some string into that file?) | 14:14 |
AJaeger | mordred: could you review https://review.opendev.org/721245 again, please? | 14:15 |
*** maxamillion has quit IRC | 14:16 | |
*** maxamillion has joined #zuul | 14:17 | |
*** michael-beaver has joined #zuul | 14:22 | |
*** harrymichal has quit IRC | 14:23 | |
*** cdearborn has joined #zuul | 14:33 | |
avass | y2kenny: can you reach the tcp port 19885 on that container? | 14:34 |
avass | zbr: I guess you could grab two nodes, install ansible on one of them and execute the role from that node on the other node? | 14:37 |
y2kenny | avass: I will give it a try | 14:37 |
avass | zbr: or wait, what's the reason it fails actually | 14:38 |
zbr | avass: no way, too much resource waste, i am sure there is a less complex way to do it. | 14:38 |
zbr | i think it has to do with secure/unsecure context, what happens is that code runs without current change being tested, zuul uses the role from master, not from the CR. | 14:39 |
zbr | so basically, I add the new task, which never runs. | 14:39 |
avass | zbr: yeah that's what I was guessing | 14:39 |
avass | zbr: maybe you could do the same but execute it on the same node with connection: localhost? | 14:40 |
avass | or something like that | 14:40 |
openstackgerrit | Merged zuul/zuul-jobs master: Use main.yaml, not .yml https://review.opendev.org/721245 | 14:42 |
clarkb | zbr: I thought we were fixing that by using gzip instead? | 14:45 |
clarkb | since gzip is more universal | 14:45 |
clarkb | mnaser: had a change for it iirc | 14:45 |
clarkb | basically was switch tar -j to tar -z | 14:46 |
zbr | clarkb: i was not able to find any proposed change addressing this issue. | 14:47 |
fungi | could it have been in a different but similar role? | 14:47 |
zbr | and even if fixed this, we still avoid the lack of testing issue | 14:48 |
zbr | we do need to enable testing either way | 14:48 |
clarkb | zbr: https://review.opendev.org/#/c/715028/ looks like it is still in progress | 14:48 |
clarkb | yes that adds testing too | 14:48 |
clarkb | but we control the file format there so the decision was made to go to tar -z instead of tar -j to simplify overall | 14:49 |
*** dmellado has quit IRC | 14:49 | |
clarkb | looks like we need to update a second set of promote jobs in unison though to make that work properly | 14:49 |
zbr | good, but that change is 1 month old, and blocked by AJaeger so not much practical use. | 14:50 |
clarkb | zbr: no thats not the right attitude | 14:50 |
clarkb | AJaeger has accurately called out that there is an issue that must be addressed. We should do that | 14:50 |
clarkb | its not blocked | 14:50 |
zbr | to be clear, i was not trying to blame anyone. i am just stating the fact that bzip2 is still an issue. | 14:51 |
clarkb | zbr: yes, and I'm asserting that the better fix is not to try and install bz2 everywhere but change our generated file format instead | 14:51 |
zbr | do we really need to combine the addition of testing with changing behavior? | 14:52 |
clarkb | zbr: if the testing can't work without reinforcing the bad format then yes | 14:52 |
zbr | i guess the fix should be to make the promote to work with both extensions first, so we can merge the change? | 14:53 |
clarkb | ya I think promote likely needs to simply drop the tar flag to select compression type as modern tar should figure it out from the file itself on extraction iirc | 14:56 |
clarkb | (that should be tested but I think from local use of tar that should be fine) | 14:56 |
AJaeger | zbr: yes, that would be best. Update the promote docs jobs to handle both formats, merge that - and then we can merge 715028 | 14:59 |
clarkb | zbr: as for why your tests fail, I believe it is because fetch-sphinx-tarball is running in post as part of the trusted job component from the base job? | 14:59 |
clarkb | zbr: I seem to recall that for zuul-jobs role tests that test post-run roles they need to dep on base-minimal and do the testing in the run component | 15:00 |
zbr | yep, it would be great to find a solution to test that part. i am sure we will need it anyway, | 15:00 |
clarkb | for speculative execution to work properly | 15:00 |
clarkb | basically you can't rely on post-run of parent jobs to test what you want there due to inheritance and security | 15:00 |
AJaeger | yes, fetch-sphinx-tarball invocation is trusted in post | 15:00 |
clarkb | AJaeger: can you link to where promote fails? | 15:00 |
clarkb | (the playbook or role if you have it) | 15:01 |
clarkb | zbr: ya checkout zuul-jobs-test-base-roles | 15:02 |
clarkb | I think the sphinx tarball role falls into the same category as the roles tested there | 15:02 |
AJaeger | clarkb: https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/docs/promote.yaml#L23 | 15:05 |
AJaeger | But we need to check all promote docs playbooks, I think there's one in project-config as well with small changes | 15:06 |
zbr | clarkb: i am doing now a change to make promoter work with both | 15:06 |
AJaeger | bbl | 15:06 |
clarkb | AJaeger: looks like we need to register the actual file from the artifact url | 15:07 |
clarkb | zbr: ^ doing that we should be able to let tar figure out how to decompress based on file type | 15:07 |
y2kenny | avass: does this port access work the same way for node that has the kubectl connection type? | 15:10 |
zbr | clarkb: https://review.opendev.org/#/c/715028/ | 15:11 |
avass | y2kenny: I would guess so, since it's the executor trying to connect to that port | 15:11 |
avass | y2kenny: not ansible itself | 15:11 |
avass | y2kenny: or, let me check. I could remember that wrong | 15:13 |
y2kenny | avass: right... but if ansible connects to the remote differently depending on the node type, shouldn't the executor need a different way to connect to the node? | 15:13 |
y2kenny | avass: ok | 15:13 |
avass | y2kenny: probably mixed it up with something else | 15:16 |
avass | https://review.opendev.org/gitweb?p=zuul/zuul.git;a=blob;f=zuul/ansible/base/callback/zuul_stream.py;h=ca34dc2abce2eccc2508ee18eab87cfbdad107a0;hb=refs/heads/master#l272 | 15:16 |
*** dmellado has joined #zuul | 15:17 | |
clarkb | zbr: ? | 15:18 |
clarkb | looking at download-artifact we don't currently register the files that are downloaded | 15:20 |
zbr | clarkb: i am not against that but i am already too deep with the chain to lead that change. | 15:20 |
clarkb | we do register the json file from zuul though | 15:20 |
*** yolanda has quit IRC | 15:21 | |
clarkb | so we could assume that if we get this far the download has succeeded and simply check for those values. Or look for a bz2 or .gz file | 15:21 |
clarkb | AJaeger: ^ curious if you have opinions on that | 15:21 |
mnaser | zbr, clarkb, AJaeger: i did a follow up change i never had time to finish up which created some sort of artifact mapping | 15:21 |
mnaser | which meant we dont have to hardcode .tgz urls | 15:21 |
mnaser | https://review.opendev.org/#/c/715045/ | 15:21 |
mnaser | with that, we can avoid hardcoding urls and extensions | 15:21 |
clarkb | mnaser: thats exactly what I was looking for thanks | 15:22 |
zbr | my proposal: make it work with either one of them now, and aim to migrate to artifact urls later. | 15:22 |
y2kenny | avass: oh right... I think there was a conversation about this before. I need to add the stream port to the pod I launched manually. Is the port always 19885? | 15:22 |
avass | y2kenny: looking at that code i would guess it's not :) | 15:24 |
mnaser | clarkb: yeah i _think_ that works but maybe a depends-on for promote jobs or something might show that it works, i dont have time currently to push the rest through | 15:25 |
avass | y2kenny: haven't actually done anything with it myself | 15:25 |
y2kenny | avass: ok | 15:25 |
y2kenny | avass: thanks for the tip, this is useful. | 15:26 |
andreykurilin | Hi folks! Can anyone help me to understand why https://github.com/openstack/rally-openstack/blob/master/.zuul.d/zuul.yaml#L114-L118 was never executed by zuul? http://zuul.openstack.org/builds?job_name=rally-openstack-docker-build-and-push shows nothing. Is anything wrong with the job itself? | 15:27 |
*** y2kenny has quit IRC | 15:29 | |
*** dmellado has quit IRC | 15:35 | |
clarkb | andreykurilin: that job will only run if changes merge to that repo. So I guess first question is have you merged anything? | 15:37 |
*** dmellado has joined #zuul | 15:37 | |
andreykurilin | clarkb: I thought it should be called for a PR that introduces it. I made this assumption on how things are working for gate and check pipelines | 15:38 |
fungi | clarkb: i guess the commit to add that job merged ;) | 15:38 |
clarkb | fungi: ya I've stopped making that assumption with the prevalence of file filters | 15:39 |
clarkb | but in general yes | 15:39 |
fungi | file filters effectively don't filter job configuration, right? | 15:39 |
*** ysandeep is now known as ysandeep|away | 15:40 | |
andreykurilin | https://review.opendev.org/#/c/720199/ <- it was introduced here and it doesn't have any files or irrelevant-files filters | 15:40 |
fungi | since we ignore file filters when making changes to a job which might otherwise have been filtered | 15:40 |
clarkb | also post is weird in that it doesn't apply within a branch context right? so if you have branch matchers you get broken too | 15:40 |
fungi | that's a good point, defining post pipeline jobs in a branched repo have an empty implicit branch, so the existence of any branch matcher at all will prevent it from being run | 15:41 |
fungi | er, rephrasing | 15:41 |
clarkb | https://opendev.org/openstack/rally/src/branch/master/.zuul.d/docker-jobs.yaml#L47 may also be related | 15:41 |
fungi | and no, post pipeline triggers do have a branch in them, right? | 15:42 |
fungi | it's tags which lack a branch | 15:42 |
clarkb | fungi: I think its both because they are both triggered by refs/* | 15:42 |
clarkb | and not change foo with branch bar | 15:43 |
fungi | anyway, i can take a look in the debug log to see why that job was not selected | 15:43 |
clarkb | fwiw no openstack tenant config errors related to rally repos | 15:43 |
andreykurilin | clarkb: rally-docker-build-and-push doesn't pass secrets to the parent since it overrides run playbook (where the parent job might use secret) | 15:43 |
clarkb | so ya I think at this point we may have to look at zuul server logs | 15:43 |
clarkb | andreykurilin: ya I'm just wondering if maybe that breaks the transitive chain | 15:43 |
clarkb | andreykurilin: and zuul is breaking at runtime trying to apply the secrets to the job maybe | 15:44 |
clarkb | (I don't know just trying to ingest the configs right now) | 15:44 |
andreykurilin | clarkb: I trust parent job, so I can change the flag | 15:44 |
fungi | so digging up what happened when f134067 merged now | 15:45 |
clarkb | andreykurilin: out of curiosity why are you overriding the run and post run playbooks? | 15:45 |
andreykurilin | clarkb: this job not only builds a docker image, but runs a script that ensures that the image is valid and designed workflow works. post job just fetches the results of check script | 15:46 |
clarkb | andreykurilin: ya I think we've got that workflow pretty well established without needing to edit jobs | 15:47 |
clarkb | let me find an example for you | 15:47 |
fungi | btw, this sounds like a prime candidate for our promote pipeline workflow to publish container images | 15:47 |
andreykurilin | also (do not know if it relates), I faced the similar issue for release pipeline (https://opendev.org/openstack/rally/src/branch/master/.zuul.d/zuul.yaml#L72-L74) - http://zuul.openstack.org/builds?job_name=rally-docker-build-and-push&pipeline=release | 15:48 |
andreykurilin | https://opendev.org/openstack/releases/commit/2f33d6b8542b86b02ba118b985389d2d8142751b was merged after changes to release pipeline of rally | 15:48 |
clarkb | andreykurilin: https://opendev.org/zuul/zuul/src/branch/master/.zuul.yaml#L126-L183 this is zuul's build, build+upload, and finally promote jobs. No special playbook overrides. Then https://opendev.org/zuul/zuul/src/branch/master/.zuul.yaml#L248-L250 validates it is working | 15:49 |
clarkb | andreykurilin: because that quickstart job is gating we only promote the image once the tests of the image function | 15:49 |
clarkb | this is nice because it separates the build from the testing. You can bring whatever tooling to testing that you need and only need to pass or fail. You don't need to modify the jobs beyond setting parameters around which image to build | 15:50 |
andreykurilin | I am always happy to remove duplicated code:) | 15:50 |
fungi | and it still ensures that the image which was tested in the gate pipeline is the exact same file which gets published | 15:50 |
andreykurilin | thanks! checking | 15:51 |
*** yolanda has joined #zuul | 15:52 | |
andreykurilin | clarkb: also, I have one more feature which base docker roles misses: updating readme at docker hub. But I guess it can be fixed by the similar workflow as well or even embedded into original role(it can be helpful for everyone) | 15:54 |
clarkb | andreykurilin: ya that sounds like a candidate for role improvement | 15:54 |
fungi | i found the event id from when that commit merged, still digging in the log entries for it to find out why rally-openstack-docker-build-and-push didn't get built | 15:55 |
fungi | Exception: Project openstack/rally-openstack is not allowed to run job rally-openstack-docker-build-and-push | 15:56 |
fungi | 2020-04-17 16:10:17,306 ERROR zuul.Pipeline.openstack.post: [e: 5c21b6cc8ada40099c58b25ac832fd4d] Error freezing job graph for <QueueItem 0x7f39b3425f60 for <Branch 0x7f3aca500710 openstack/rally-openstack refs/heads/master updated 710d3bfd7708736899e989021b39f198c7da7413..f13406795a50a29e9ad4ed67b340216a1cd7920c> in post> | 15:56 |
clarkb | oh ya we can't share jobs with secrets like that outside of trusted projects? | 15:56 |
fungi | right | 15:56 |
clarkb | so the fix actually is to stop specializing those jobs :) | 15:56 |
fungi | the secret has to be *in* openstack/rally-openstack | 15:56 |
fungi | for that to work | 15:57 |
fungi | and then passed to the parent job | 15:57 |
clarkb | fungi: it is, but there is also a secret in openstack/rally | 15:57 |
clarkb | I think going to the intended use of those jobs will address this problem | 15:57 |
fungi | yeah, agreed | 15:57 |
clarkb | or rally can duplicate the jobs in their repos | 15:57 |
andreykurilin | hm...but rally-openstack-docker-build-and-push has own secret and passes it to the parent | 15:58 |
fungi | oh, and the secret-using job would still need to be in a trusted config repo right? even if passing the secret | 15:58 |
clarkb | fungi: or in the current repo | 15:59 |
fungi | sure | 15:59 |
fungi | but not in a different untrusted repo | 15:59 |
clarkb | andreykurilin: I think this is an odd corner case. Zuul has to be conservative here due to a bug found long ago that allowed parenting to jobs like that to leak secrets between projects | 15:59 |
clarkb | andreykurilin: in this case even though pass to parent is true zuul doesn't want to expose the parent's secret so says that is an error | 16:00 |
clarkb | andreykurilin: you either need to use the jobs as intended or switch to defining your unique stacks in each projects separately | 16:00 |
andreykurilin | :( | 16:00 |
andreykurilin | ok, one more question. How I can reuse roles from rally by rally-openstack project? I saw that job has 'roles' section, but failed to use it | 16:02 |
fungi | roles should be a list of other projects containing roles you want to use in the job | 16:03 |
clarkb | andreykurilin: https://zuul-ci.org/docs/zuul/reference/job_def.html#attr-job.roles has a good example | 16:03 |
andreykurilin | I guess I failed because `roles` dir are not located at the top dir of the repository | 16:07 |
andreykurilin | *is not located | 16:07 |
zbr | clarkb: how can I break this vicious circle of not being able to add tests for a role? | 16:07 |
clarkb | zbr: can we fix the role so that it is testable? | 16:07 |
clarkb | at the very least the tests need to be updated to test properly then we can see that installing bzip2 even works. I'm not sure this is desirable since we've already said we shouldn't use bzip2. My preference would be to update the role to produce gzip files and be done with it | 16:08 |
zbr | i am ok with that, but that sends to another change, that also lacks tests for its role,... and so on. | 16:09 |
clarkb | zbr: yes testing is desirable but this came up before because bzip2 isn't installed in production for many people | 16:10 |
clarkb | thats why your tests are failing | 16:10 |
andreykurilin | clarkb fungi: what about https://opendev.org/openstack/rally/src/branch/master/.zuul.d/zuul.yaml#L72-L74 ? It has everything in one place and works for post pipeline but not for release - http://zuul.openstack.org/builds?job_name=rally-docker-build-and-push&pipeline=release | 16:10 |
clarkb | if we fix that issue entirely we can move on from it | 16:10 |
zbr | but we need https://review.opendev.org/#/c/721652/ -- which is, not tested. | 16:11 |
zbr | looks simple and safe, but experience told me to not rely on how it looks :D | 16:12 |
fungi | andreykurilin: release is a different problem. because that repo has branches, the branch where the job is defined has an implicit branch matcher. when a tag is pushed, there is no distinct branch associated with it so it cannot match the job | 16:12 |
clarkb | zbr: or https://review.opendev.org/#/c/715045/1 | 16:12 |
clarkb | zbr: but yes that often is the nature of adding testing to that which is untested | 16:12 |
zbr | which again, is not tested | 16:12 |
clarkb | zbr: I wish there was a simple answer, but if the end state goal is reducing debt and more reliable software I think we need to invest in removing the bzip2 debt | 16:13 |
andreykurilin | fungi: oh...got it. so removing branches should solve the issue | 16:14 |
andreykurilin | fungi clarkb: thanks guys! | 16:14 |
zbr | so basically, my new goal is to find a way to test download-artifact, in order to change it. | 16:14 |
fungi | andreykurilin: or putting the job definition in a project with no branches, right | 16:14 |
clarkb | andreykurilin: usually we deal with that not by removing branches, but by applying the config from a branchless repo | 16:14 |
zbr | usually I would have used molecule to test roles but i have seen that it is not a tool of choice for zuul-roles. | 16:15 |
andreykurilin | clarkb: I prefer to remove old branches that are not used for several years | 16:17 |
*** kklimonda has quit IRC | 16:28 | |
*** kklimonda has joined #zuul | 16:30 | |
*** evrardjp has quit IRC | 16:35 | |
*** evrardjp has joined #zuul | 16:35 | |
*** zxiiro has quit IRC | 16:37 | |
*** michael-beaver has quit IRC | 16:37 | |
*** kklimonda has quit IRC | 16:38 | |
*** lseki has quit IRC | 16:38 | |
*** zxiiro has joined #zuul | 16:39 | |
*** rpittau is now known as rpittau|afk | 16:39 | |
*** kklimonda has joined #zuul | 16:39 | |
*** michael-beaver has joined #zuul | 16:40 | |
*** klindgren has quit IRC | 16:41 | |
*** klindgren has joined #zuul | 16:41 | |
*** mnaser has quit IRC | 16:41 | |
*** vblando has quit IRC | 16:42 | |
*** lseki has joined #zuul | 16:42 | |
*** maxamillion has quit IRC | 16:42 | |
*** Open10K8S has quit IRC | 16:42 | |
*** Open10K8S has joined #zuul | 16:42 | |
*** vblando has joined #zuul | 16:44 | |
*** mnaser has joined #zuul | 16:44 | |
*** mnaser has quit IRC | 16:46 | |
*** maxamillion has joined #zuul | 16:46 | |
*** mnaser has joined #zuul | 16:47 | |
zbr | i need an example json result for "Query Zuul API for artifact information" in order to mock it | 16:48 |
*** mnaser has quit IRC | 16:49 | |
tobiash | corvus, clarkb: we're sometimes facing connection issues like those: http://paste.openstack.org/show/792486/ . Those are currently hard to debug as nodepool auto-deletes them right after the timeout. Do you think it might make sense to implement some autohold that can catch launch errors? I think that would simplify debugging a lot. | 16:49 |
*** mnaser has joined #zuul | 16:50 | |
tristanC | tobiash: that sounds like a good idea, we also have trouble debugging launch error when nodepool cleanup faster than we can inspect | 16:50 |
corvus | tobiash: i guess you want to check the console log for the vm? and maybe ssh into it and check dmesg/syslog if it eventually comes up? | 16:52 |
tobiash | corvus: yes | 16:52 |
corvus | didn't we, at some point, have something that grabbed the console log for failures? | 16:52 |
tobiash | corvus: we grab the openstack error of the instance, however when hitting the connection timeout issue openstack didn't report any error | 16:53 |
corvus | i thought we had something that grabbed an openstack console log | 16:53 |
tobiash | hrm, let me check | 16:53 |
corvus | i may not even be thinking of nodepool.... | 16:53 |
corvus | clarkb, mordred: ^ does that ring a bell? | 16:53 |
mordred | corvus: yes - I remember the thing you're talking about vaguely | 16:55 |
mordred | I don't think it was in nodepool | 16:55 |
mordred | but I don't remember what it was from | 16:55 |
mordred | maybe launch-node? | 16:56 |
corvus | mordred: was just looking there, but i don't see it | 16:56 |
mordred | corvus: I feel like I should know where this is | 16:56 |
tobiash | I only know about https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/openstack/handler.py#L264 | 16:56 |
tobiash | but that only handles the case when openstack itself had an error during instance creation | 16:57 |
mordred | I'm 99% sure you're not imagining this ... but I cannot think of where we might have done that | 16:57 |
mordred | tobiash: yeah - you definitely need the console log for boot related issues | 16:57 |
tobiash | oh wait, there is something about a console log: https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/openstack/handler.py#L50 | 16:57 |
*** jcapitao has quit IRC | 16:57 | |
mordred | yes! | 16:58 |
mordred | there's where it is | 16:58 |
corvus | yep that's it! | 16:58 |
mordred | it's behind a variable because it could otherwise be quite verbose | 16:58 |
corvus | https://opendev.org/zuul/nodepool/src/branch/master/nodepool/driver/openstack/handler.py#L51 | 16:58 |
corvus | tobiash: maybe you can try that real quick and see if it gets you enough info? | 16:58 |
tobiash | hrm, no console log in our logs | 16:58 |
mordred | tobiash: you need to turn on a flag | 16:58 |
corvus | tobiash: yeah, you'll need to set the variable in the config | 16:59 |
tobiash | oh | 16:59 |
mordred | on the label | 16:59 |
tobiash | ah got it | 16:59 |
corvus | ah neat, it's even documented | 16:59 |
tobiash | sorry for not reading the docs, I thought I had all the docs in my head :D | 17:00 |
* tobiash was wrong apparently | 17:00 | |
*** hashar has quit IRC | 17:01 | |
clarkb | corvus: yes we should grab the console log | 17:01 |
* mordred is surprised - had assumed tobiash had everything memorized | 17:01 | |
clarkb | ah it's a config flag, got it | 17:01 |
clarkb | fwiw the way I like to debug those is to manually boot the image and see what it does | 17:01 |
clarkb | then if that doesn't make it clear dig harder from nodepool side | 17:01 |
tobiash | clarkb: it's a random error that happens to roughly 1/1000 instance creations throughout all of the images | 17:02 |
clarkb | tobiash: ah | 17:02 |
tobiash | and it's pretty nasty as it randomly delays some jobs by ~20min while the cloud still has plenty of free resources | 17:03 |
corvus | tobiash, tristanC: to the original question, i don't object to a 'hold on error' option, but maybe if this works (or maybe a small enhancement to this if we need something else), we might be able to avoid the extra complexity, and it's probably a better user experience too if we can get the necessary debug info without having to deal with a held node | 17:03 |
tobiash | I'll try with the console log first | 17:04 |
*** bhavikdbavishi has joined #zuul | 17:04 | |
tobiash | if that's not enough I guess I need either the hold functionality or run a python script that spawns instances until it hits that error so we're able to debug on neutron/network level | 17:05 |
clarkb | corvus: unrelated I've just realized the simple nodepool driver already sets nodepool_pool_name metadata | 17:05 |
tristanC | corvus: being able to debug operational issues without having to restart the service with a special flag to increase debug sounds much better to me. Especially since once you restart in debug, you are likely to stick to debug | 17:05 |
corvus | tristanC: if we added a 'hold on error' flag, it absolutely would be disabled by default | 17:05 |
corvus | so there's no advantage there | 17:05 |
tristanC | ah, so not like zuul autohold feature? | 17:06 |
corvus | tristanC: also, you never have to restart nodepool to change a label option | 17:06 |
tristanC | corvus: iiuc logConsole is only available in DEBUG | 17:06 |
corvus | tristanC: i can't imagine why we would go to all that trouble to implement something like the nodepool autohold when in general all of the nodepool config is in a config file | 17:07 |
corvus | tristanC: we can change the log level pretty easily. it should probably be info, warning, or error since it only happens at user request | 17:08 |
tristanC | corvus: changing nodepool config file works for me, as long as we don't have to restart the service | 17:08 |
corvus | yeah, i think if we change the log level, we'll have satisfied that | 17:08 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: Log openstack console at info https://review.opendev.org/721694 | 17:10 |
tristanC | corvus: warning sounds better to me | 17:11 |
corvus | tristanC: i thought about it, and info seemed more appropriate, so that warning can still mean "there may be something wrong you should fix". we'll already have warning or error messages telling someone the launch has failed, so they can look for these. | 17:13 |
*** jpena is now known as jpena|off | 17:13 | |
corvus | tristanC: but if other folks think warning would be better, i can change it | 17:14 |
corvus | (i don't feel very strongly about this) | 17:14 |
mordred | corvus: I think info seems right | 17:16 |
tristanC | corvus: isn't console log enabled when there is something to fix? | 17:16 |
mordred | that's why I think info is right - you already have to opt-in to wanting this info, and it already only outputs if there is a boot error | 17:17 |
mordred | so it is the info you're asking for | 17:17 |
mordred | but - I also don't feel strongly | 17:17 |
mordred | and would be happy with the other level too | 17:17 |
corvus | tristanC: i'm considering the use case of someone wanting to know if there's anything wrong with their nodepool system. that user may want to grep for all 'warning' or 'error' messages. that user would not want to see these. | 17:17 |
corvus | in other words, putting too many messages at warning or error may cause real warnings or errors to get lost | 17:18 |
tristanC | mordred: well we do have a case where we write >INFO to a public place where our users can look for nodepool issues, having the failed console log be available would be useful too. | 17:18 |
corvus | (i would probably feel differently if this were one log line, but it's probably a couple of thousand) | 17:18 |
tristanC | perhaps console log could be dumped to a file instead of printed in the log? | 17:20 |
corvus | anything is possible. sounds like overkill to me. | 17:20 |
fungi | the same could be said of other high-volume on-demand data zuul logs, like thread dumps | 17:20 |
fungi | also a minor amount of tweaking could probably make it possible to redirect those to a separate log anyway with an appropriate python-logging config | 17:21 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: Log openstack console at info https://review.opendev.org/721694 | 17:22 |
tristanC | corvus: one of the big issues we've been having recently is diagnosing zuul and nodepool issues when the information needed to understand a failure is in the service logs. | 17:22 |
tristanC | some users bring nodepool resources and a zuul connection, and zuul may only report 'FAILURE' when one of those external components is failing on us, and the user complains that zuul is failing | 17:23 |
tristanC | until we dig in the log to find that it's actually the user-provided resources that are failing | 17:24 |
fungi | i'm sympathetic to that challenge and also spend a fair amount of time hunting through zuul and nodepool service logs, but being able to figure out for sure that they can be safely exposed is not easy (especially executor logs) | 17:25 |
tristanC | thus we are looking for ways to expose zuul internals so that users can at least check if the failure can be fixed from their side | 17:25 |
tristanC | fungi: yeah, we already noticed it is not acceptable to share raw logs | 17:26 |
fungi | i think we've mostly convinced ourselves that openstacksdk is not going to leak credentials into builder and launcher logs, but no idea if the same can be said for non-openstack clouds | 17:27 |
fungi | so far the approach has been to plumb what we know to be safe up through our usual user interfaces (buildset reporting, web dashboard, api...) | 17:28 |
fungi | for opendev, we've also started publishing our nodepool builder logs so that members of our community can diagnose image build problems for platforms they're familiar with | 17:29 |
fungi | (and also publishing copies of the corresponding images built) | 17:29 |
tristanC | in this document we described the issue in more detail: https://tree.taiga.io/project/morucci-software-factory/us/3538 basically we are looking for a process to diagnose node_failure, job pending for too long, post failure and logless retry limit | 17:30 |
fungi | i have a feeling each of those three things is going to require a different approach | 17:30 |
tristanC | fungi: image build logs are very useful indeed, we share them too by default at fqdn/nodepool-log | 17:31 |
fungi | exposing node_failure reasons may be easy if we can figure out where they belong to be most easily consumed. that just boils down to launchers refusing the request and the reasons given by each | 17:32 |
*** yolanda has quit IRC | 17:33 | |
tristanC | and we recently added fqdn/nodepool-launcher-log for >INFO log of the nodepool handler | 17:33 |
tristanC | fungi: node_failure can also happen when the diskimage fails to provide a working ssh service | 17:34 |
fungi | like i said, i think we've in opendev convinced ourselves that publishing our launcher logs is probably safe because we trust openstacksdk to not leak credentials. i don't know if that evaluation carries over to environments using other kinds of resource providers | 17:34 |
fungi | oh, do we return node_failure when the executor can't ssh to the node? i thought those just bubbled up as retry_limit | 17:35 |
fungi | (after trying and failing however many times) | 17:35 |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: Set pool info on leaked instances https://review.opendev.org/721359 | 17:38 |
clarkb | corvus: tristanC tobiash ^ now that has a test | 17:38 |
clarkb | I think we can probably land that patchset? | 17:38 |
clarkb | oh wait I might fail pep8 /me checks | 17:38 |
openstackgerrit | Clark Boylan proposed zuul/nodepool master: Set pool info on leaked instances https://review.opendev.org/721359 | 17:41 |
clarkb | yup needed pep8 fix. All should be well now | 17:41 |
clarkb | and if that lands I'll restart opendev's nl03 and confirm it makes things at least no worse than they currently are | 17:42 |
*** zxiiro has quit IRC | 17:47 | |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: POC: download-artifacts: provide a dictionary with tests https://review.opendev.org/721703 | 17:52 |
openstackgerrit | Sorin Sbarnea proposed zuul/zuul-jobs master: POC: download-artifacts: provide a dictionary with tests https://review.opendev.org/721703 | 17:54 |
tobiash | fungi: node failure can happen if nodepool cannot gather ssh keys | 17:55 |
fungi | host keys? | 17:56 |
fungi | i guess it doesn't have a way to un-accept an allocation request so it can be picked up by another launcher | 17:57 |
clarkb | fungi: it can once it fails X times in a row (3 by default) | 17:58 |
clarkb | https://review.opendev.org/#/c/721359/4 should make an interaction with ^ better too fwiw | 17:59 |
tobiash | fungi: nodepool checks the instance for connectivity before it succeeds the node request. In this process it gathers the ssh host keys | 18:00 |
fungi | ahh, okay, but i guess if all launchers fail to get a host key however many times for nodes they tried to allocate, then the request can't be fulfilled so you get a node_failure | 18:00 |
openstackgerrit | Merged zuul/nodepool master: Log openstack console at info https://review.opendev.org/721694 | 18:13 |
*** bhavikdbavishi has quit IRC | 18:23 | |
zbr | can I tell zuul to add some ansible skip tags or is this not supported? | 18:24 |
corvus | zbr: can you be more specific about 'skip tags' ? | 18:27 |
zbr | corvus: aka --skip-tags feature from ansible, https://docs.ansible.com/ansible/latest/user_guide/playbooks_tags.html | 18:28 |
zbr | and before someone asks why, it's for testing purposes, some tasks can be marked with tags like "notest" in order to be skipped during testing. | 18:29 |
corvus | zbr: no; but we have proposed that https://zuul-ci.org/docs/zuul/reference/job_def.html#attr-job.tags could be passed through, but no one has implemented that | 18:29 |
corvus | zbr: in the openstack context, we would push back on that, because we try to test like production. | 18:30 |
zbr | corvus: so there was a plan to expose zuul tags as ansible tags. | 18:30 |
corvus | zbr: more like an idea, but no one needed it enough to write the code. | 18:30 |
corvus | usually people just say, well, actually i can test that | 18:30 |
zbr | i wonder if there is a danger of overlapping on this. | 18:30 |
*** gmann is now known as gmann_lunch | 18:31 | |
tristanC | zbr: there is an attempt at ansible tags here: https://review.opendev.org/575672 | 18:31 |
zbr | haha, from my colleagues :D | 18:32 |
zbr | here is a practical example for use of tags: https://review.opendev.org/#/c/721703/2/roles/download-artifact/tasks/main.yaml | 18:32 |
fungi | "notest" a.k.a. "nottested" a.k.a. "broken" ;) | 18:33 |
corvus | if anyone wants to get that passing tests, i don't object to the feature and will be happy to review it | 18:33 |
zbr | fungi: sorry to say it but skipping some network specific tasks and mocking them is far better than having no tests at all | 18:33 |
corvus | zbr: but if this is for an openstack project, please feel free to discuss it with me in #openstack-infra because i'd like to understand why we aren't able to test something | 18:33 |
zbr | and I keep facing roles in zuul-jobs that have no tests at all or which "cannot be tested". | 18:34 |
corvus | zbr: yes, we consider them broken and in need of fixing | 18:35 |
corvus | zbr: if this is for zuul-jobs, we certainly will want to fully test something | 18:35 |
avass | anyone wanna take a quick look at this: https://review.opendev.org/#/c/721192/ just need another +2 :) | 18:35 |
corvus | zbr: i doubt we would ever want to land a "tag: notest" in zuul jobs, even if zuul did support that | 18:36 |
zbr | integration testing should not be seen as a way to avoid functional testing, they complement each other. | 18:36 |
corvus | zbr: i understand and agree with your perspective to some extent. | 18:37 |
zbr | huh,... happy to hear that. | 18:37 |
*** saneax has quit IRC | 18:38 | |
corvus | but i don't think a conversation in generalities is going to be useful here. :) let's confine ourselves to specific problems. | 18:38 |
corvus | zbr: is there something in zuul-jobs you'd like to test but need help finding a way to? i'd be happy to help. | 18:38 |
zbr | download-artifacts would be a recent example, as mnaser made a change that ended up lingering due to lack of tests, https://review.opendev.org/#/c/715045/ | 18:40 |
*** bhavikdbavishi has joined #zuul | 18:40 | |
zbr | i was interested about this change as it was needed for enabling testing on another role. | 18:40 |
zbr | some of the roles from zuul-roles run in post, like fetch-sphinx-tarball -- which also prevents testing CRs made towards them. | 18:41 |
zbr | corvus: if we can address these two it would be a great start. | 18:42 |
zbr | it's quite late for me, but I will be more than happy to work on that tomorrow morning. still, I need a few directions | 18:43 |
clarkb | zbr: it actually lingered because it needed a readme update | 18:43 |
clarkb | at least that was the reason it got a -1 | 18:43 |
zbr | adding the readme was not enough, the code does not run at all and I believe it may be broken (not sure) | 18:44 |
corvus | yeah, i agree a test would be good there; i'll think about how to test that | 18:44 |
zbr | i need an example json file to mock it | 18:44 |
mnaser | tbh, part of it is me not pushing up a revision for it | 18:45 |
corvus | zbr: example json should be pretty easy to get from a zuul job that ran in openstack | 18:45 |
mnaser | please feel free to pick it up, i'm a little overwhelmed with stuff going on right now | 18:45 |
zbr | my previous change was attempting to test mnaser's change by avoiding the uri calls, and pre-loading some sample results. | 18:45 |
mnaser | i even have the ensure-packages stack which im bummed about not having time to clean up too, but life | 18:45 |
corvus | i guess we could spin up a little http server; i don't know if the uri module would handle a file url? | 18:46 |
zbr | i am almost sure it does | 18:46 |
corvus | (if not, for this, we could have a little 3 line http server in python or something) | 18:46 |
mnaser | (or convert that task to a python module and unit test it) | 18:48 |
corvus | avass: what does the change to all capital letters for ALL mean? | 18:48 |
zbr | mnaser: i tried to build a sample json at https://review.opendev.org/#/c/721703/2/roles/download-artifact/molecule/default/converge.yml but I got something wrong. | 18:48 |
mnaser | because the ansible bits of it seem _really_ complicated at this point, imho | 18:48 |
mnaser | zbr: i'd just grab something from the opendev zuul api | 18:49 |
*** saneax has joined #zuul | 18:49 | |
corvus | zbr: i'm pretty sure i've said this before, but we're going to need a really good reason to introduce yet another test framework to zuul-jobs; so i'd sure like to see an attempt to test this using the "just run the role" approach. | 18:50 |
AJaeger | clarkb, want to review https://review.opendev.org/721706 - that's part 1 of the promote changes? | 18:51 |
*** gmann_lunch is now known as gmann | 18:51 | |
clarkb | AJaeger: ya sorry I've had a million things to juggle this morning | 18:51 |
avass | corvus: tox -e ALL runs all environments, I think the earlier 'all' was a mistake since that makes tox try to run the 'all' testenv | 18:52 |
AJaeger | clarkb: a lot is going on right now - no worries | 18:52 |
corvus | avass: ok, thx | 18:52 |
corvus | avass: is that a feature of tox, or is that just a 'safe' word we've chosen in this role? | 18:53 |
avass | corvus: it's a feature in tox, or at least on my local tox and the one we're running at Volvo as far as i know :) | 18:54 |
avass | corvus: I'll take a look and see if it's documented somewhere | 18:54 |
clarkb | if you run tox without -e it runs the listed envs | 18:54 |
zbr | corvus: i am aware of that anti-test framework approach, which is mainly forcing any change to go only through zuul. only linting can be done locally atm. | 18:55 |
clarkb | like make | 18:55 |
zbr | mnaser: something is not working for me with https://zuul.opendev.org/openapi -- whatever I put as "tenant" for ./builds, I am unable to load anything pressing execute, it only resets the form. | 18:56 |
mnaser | zbr: my hacky way is open up a build page, open dev tools and copy the url it polls | 18:57 |
corvus | zbr: yes, we think that's the appropriate choice for zuul-jobs. it's not necessarily appropriate for everything, but we're pretty convinced it's right for zuul-jobs. so you might have the most success if you work with us on that. | 18:57 |
avass | corvus: here https://tox.readthedocs.io/en/latest/config.html#cmdoption-tox-e | 18:57 |
corvus | avass: neat, thanks :) | 18:57 |
zbr | corvus: not going to fight w/ you on that, i will do whatever is needed to improve its testing. | 18:58 |
mnaser | zbr: in this case, i got http://zuul.opendev.org/api/tenant/openstack/build/aeb36977954e4a4892058ef81c236212 | 18:58 |
mnaser | from a kolla change | 18:58 |
mnaser | that one has artifacts | 18:58 |
corvus | zbr: great :) | 18:58 |
corvus | avass: i think maybe soon we won't need the 'reinstall-tox' playbook. +2, but i'd also like either AJaeger or clarkb to give it a quick review and +W | 18:59 |
avass | corvus: sure :) | 19:00 |
corvus | AJaeger, clarkb: https://review.opendev.org/721192 (just in case i missed something in the ongoing tox work) | 19:00 |
zbr | thanks, i will try to write some testing for it tomorrow, hopefully we will manage to improve the testing coverage to make it easier to patch. | 19:00 |
zbr | tbh, it was a bit frustrating to try to fix a bug in role foo, just to discover that in order to test it you need to make 5-6 other changes to other roles. | 19:01 |
openstackgerrit | Albin Vass proposed zuul/zuul master: Enables whitelisting and configuring callbacks https://review.opendev.org/717260 | 19:11 |
openstackgerrit | Merged zuul/nodepool master: Set pool info on leaked instances https://review.opendev.org/721359 | 19:13 |
avass | ^ that's pretty much done as well unless someone wants any changes to it :) | 19:23 |
openstackgerrit | Merged zuul/zuul-jobs master: Use cached 'tox_executable' in fetch-tox-output https://review.opendev.org/721192 | 19:32 |
avass | actually nevermind, I forgot I broke my tests | 19:48 |
openstackgerrit | Merged zuul/zuul-jobs master: Make linting use of find portable https://review.opendev.org/721595 | 20:00 |
openstackgerrit | Albin Vass proposed zuul/zuul master: Enables whitelisting and configuring callbacks https://review.opendev.org/717260 | 20:02 |
avass | that should be better | 20:03 |
openstackgerrit | Albin Vass proposed zuul/zuul master: Enables whitelisting and configuring callbacks https://review.opendev.org/717260 | 20:13 |
*** bhavikdbavishi has quit IRC | 20:26 | |
mordred | corvus, zbr: sorry - missed the above scrollback - but we actually have a use case for skipping a particular thing in opendev - for accessbot - in the gate we want to write all the config and do all of the things ... but we do not want to run the script itself, because the script connects to freenode irc and does some things | 20:47 |
mordred | in my ansible patch I split it into two playbooks - one that runs in the gate and in prod and one that only runs in prod | 20:47 |
mordred | we also have the inverse case in a few places where we've worked around it with flag variables - we want to always start a container in the gate, because we need to make sure the service runs - but in prod, we don't want to restart gerrit every time there is a new image available - so in the gate we set a variable which says "run the start tasks" and in prod we do not set that variable | 20:49 |
fungi | in theory we could run charybdis and ircseven bots equivalent, but that may be overkill | 20:49 |
mordred | fungi: yeah- I think the "don't start gerrit" example is one we can't really work around | 20:50 |
mordred | but - I also think that the flag vars work fine | 20:50 |
mordred | and although I do think we have a valid use case for using something to indicate a divergence between prod and test - I don't know that tags would make it any clearer | 20:51 |
mordred | but maybe they might | 20:51 |
*** rlandy is now known as rlandy|brb | 20:51 | |
corvus | mordred: yep. i agree the use case is there. and "when: job_var" is roughly equal to --skip-tags foo + "tag: foo", so we're effectively doing that. the tags might make it a bit more explicit. | 21:10 |
mordred | corvus: ++ | 21:11 |
*** rlandy|brb is now known as rlandy | 21:19 | |
*** saneax has quit IRC | 21:24 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add first haskell job https://review.opendev.org/721735 | 21:28 |
*** dpawlik has quit IRC | 21:34 | |
*** sgw has quit IRC | 21:34 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add first haskell job https://review.opendev.org/721735 | 21:41 |
*** michael-beaver has quit IRC | 21:52 | |
*** sgw has joined #zuul | 21:53 | |
openstackgerrit | Merged zuul/zuul-jobs master: helm-template: allow users to disable wait-for-pods https://review.opendev.org/721369 | 21:55 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add first haskell job https://review.opendev.org/721735 | 22:00 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add first haskell job https://review.opendev.org/721735 | 22:26 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add initial haskell job https://review.opendev.org/721735 | 22:56 |
*** tosky has quit IRC | 23:03 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: cabal-test: add initial haskell job https://review.opendev.org/721735 | 23:05 |
*** y2kenny has joined #zuul | 23:09 | |
y2kenny | If a job defined with 'requires' runs before another job defined with 'provides', how can I debug what may be missing? | 23:14 |
y2kenny | both jobs are triggered off the same change | 23:14 |
y2kenny | run before the 'provides' job finished* | 23:14 |
mordred | y2kenny: requires and provides do not imply dependencies - they talk about artifacts zuul needs - and also describe relationships potentially across changesets | 23:15 |
mordred | y2kenny: so you _also_ need to make the second job list the first job in dependencies: | 23:15 |
y2kenny | OH | 23:16 |
y2kenny | mordred: thanks for highlighting that. | 23:16 |
mordred | dependencies are for in-the-same-changeset ordering - but you could actually do soft: true on a depend - and still have a requires/provides relationship - and have 2 changes in a stack where the first change runs the provides job and the second change triggers the requires job and will pick up the artifact from the previous change | 23:17 |
mordred | it's possible to describe some pretty crazy relationships :) | 23:17 |
mordred | (and incidentally - we actually do make use of that in opendev/system-config) | 23:17 |
y2kenny | um... I think this is worth a picture in the documentation to illustrate the power of Zuul | 23:18 |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul-jobs master: Add remove-zuul-sshkey https://review.opendev.org/680712 | 23:45 |
y2kenny | mordred: do you have any tips on debugging a dependent job that stopped due to error? (not failure) and the parent job was successful. Are there other ways than running the executor in debug mode? | 23:53 |
y2kenny | oh wait... the error was actually in the change comment. Um... playbook not found... that's weird... | 23:55 |
y2kenny | nevermind... found the problem. sorry for the noise | 23:55 |
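
Editor's sketch for the 18:46 suggestion of a "little 3 line http server in python" to stand in for the Zuul API when testing download-artifacts. This is only an illustration, not something proposed in the channel: the payload fields (`uuid`, `artifacts`, `name`, `url`) and the helper names `start_mock_server` and `fetch_json` are assumptions, and a real test would more likely copy a response from an actual build URL such as the one mnaser pasted at 18:58.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical sample payload. The real shape should be copied from an
# actual Zuul build API response rather than taken from this sketch.
SAMPLE_BUILD = {
    "uuid": "example-build-uuid",
    "artifacts": [
        {"name": "example-artifact",
         "url": "http://example.invalid/artifact.tar.gz"},
    ],
}


class SampleHandler(BaseHTTPRequestHandler):
    """Answer every GET with the same canned JSON document."""

    def do_GET(self):
        body = json.dumps(SAMPLE_BUILD).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep test output quiet; the default logs each request to stderr.
        pass


def start_mock_server(port=0):
    """Start the server in a daemon thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), SampleHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_json(url):
    """Fetch and decode JSON, ignoring any proxy settings in the environment."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

A role under test could then be pointed at `http://127.0.0.1:<port>/` instead of the real API, which keeps the "just run the role" approach while avoiding network access.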
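
Editor's sketch for fungi's 17:21 remark that high-volume on-demand output could be redirected to a separate log with an appropriate python-logging config. The logger name `example.console` and the function `build_logging_config` are hypothetical; the log does not say which logger nodepool uses for console output, so that name would have to be looked up before applying anything like this.

```python
import logging
import logging.config


def build_logging_config(console_log_path):
    """Return a dictConfig that sends one logger's output to its own file.

    "example.console" is a placeholder logger name, not a real nodepool one.
    """
    return {
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "plain": {"format": "%(levelname)s %(name)s: %(message)s"},
        },
        "handlers": {
            # Main service log (stderr here for simplicity).
            "main": {"class": "logging.StreamHandler", "formatter": "plain"},
            # Separate file for the verbose console dumps.
            "console_file": {
                "class": "logging.FileHandler",
                "filename": console_log_path,
                "formatter": "plain",
            },
        },
        "loggers": {
            "example.console": {
                "handlers": ["console_file"],
                "level": "INFO",
                # Do not also copy these records into the main log.
                "propagate": False,
            },
        },
        "root": {"handlers": ["main"], "level": "INFO"},
    }
```

With `propagate` set to false, someone grepping the main log for warnings and errors would not see the console dump, which addresses the concern corvus raised at 17:18 about real warnings getting lost.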