-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul-jobs] 856225: upload-npm : support authToken argument https://review.opendev.org/c/zuul/zuul-jobs/+/856225 | 01:13 | |
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul] 853208: zuul-stream : Test against a Python 2.7 container https://review.opendev.org/c/zuul/zuul/+/853208 | 07:17 | |
@sameer.deshpande:matrix.org | > <@clarkb:matrix.org> Yes, this is the correct location for questions about zuul and nodepool. | 08:45 |
---|---|---|
Thanks Clark. I'm facing an issue while trying to install Nodepool builder 3.12.0 on Ubuntu 20.04. The nodepool-builder service throws an error while starting. Any pointers to resolve this issue? | ||
systemctl status nodepool-builder.service | ||
● nodepool-builder.service - LSB: Nodepool-builder | ||
Loaded: loaded (/etc/init.d/nodepool-builder; generated) | ||
Active: active (exited) since Wed 2022-09-07 09:30:47 UTC; 20h ago | ||
Docs: man:systemd-sysv-generator(8) | ||
Tasks: 0 (limit: 9508) | ||
Memory: 0B | ||
CGroup: /system.slice/nodepool-builder.service | ||
File "/usr/local/lib/python3.8/dist-packages/d> | ||
if is_socket(stdin_fd): | ||
File "/usr/local/lib/python3.8/dist-packages/d> | ||
file_socket = socket.fromfd(fd, socket.AF_IN> | ||
File "/usr/lib/python3.8/socket.py", line 544,> | ||
return socket(family, type, proto, nfd) | ||
File "/usr/lib/python3.8/socket.py", line 231,> | ||
_socket.socket.__init__(self, family, type, > | ||
OSError: [Errno 88] Socket operation on non-sock> | ||
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 856321: Add initial telemetry tracing to the executor component https://review.opendev.org/c/zuul/zuul/+/856321 | 13:02 | |
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 856523: wip: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 | 13:25 | |
@westphahl:matrix.org | tristanC: ^ this is a wip that propagates the parent span info via the build request | 13:25 |
@westphahl:matrix.org | corvus: just wondering if we always want to propagate the full span_info (including links + attributes) or if we have something similar to the w3c traceparent that mainly contains the trace and parent span id | 13:27 |
@jim:acmegating.com | swest: we need it to end the span since the span end may happen on a different host if we want to stick with the opentelemetry api (where links can only be added at the start). if we want to stray farther from the api, we could set links/attrs only at the end, then we don't need to save them. | 13:43 |
@westphahl:matrix.org | corvus: for the use case of propagating the parent span info of the build via the build request I think we don't have to include the full span info of the build with the request. | 13:44 |
@jim:acmegating.com | swest: the parent span of the build is the buildset span, right? the buildset span can start and end on any scheduler. it's true that we don't need that info to construct the build span, but we *do* need it when we end the buildset span. so rather than save all the info we need for the buildset span in one place, then also save a partial copy of it in another place for the build, we just save one copy. it doesn't matter that we don't use the extra info. | 13:48 |
@westphahl:matrix.org | * corvus: I'm not sure I can follow. I was thinking about spans that may live on an executor (e.g. in tristanC s case) and where we might want to establish a relationship to the parent span. one option would be to include the full parent span info in the build request. Another option would be to only propagate enough info to establish the relationship to the parent span (e.g. similar to the w3c trace context) | 13:53 |
@jim:acmegating.com | swest: yes, what i'm saying is that any span that lives only on a single host does not need to be saved in zk. any span that can start and end on a different host needs all the info to be saved in zk. if we encounter a case where a parent span exists on only one host and then has a child span that exists on a different host, we will have a situation like you describe. but so far, i think any time we have a child that can start/end on a different host than its parent, the parent itself will also need to start/end on a different host. therefore the full parent needs to be saved in zk. | 13:56 |
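The save/restore idea described here (a span that may start on one host and end on another needs all of its info persisted) can be illustrated with a minimal sketch. Everything below is an assumption for illustration: `SpanInfo`, `save_span`, and `restore_span` are hypothetical names, and plain JSON bytes stand in for whatever is actually written to ZooKeeper.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class SpanInfo:
    """Hypothetical record of what another host needs to end a span."""
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    start_time: float
    # Links and attributes are kept because, as discussed above, the
    # OpenTelemetry API only accepts links when a span is started.
    links: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def save_span(info: SpanInfo) -> bytes:
    """Serialize the span info so it can be stored in shared state."""
    return json.dumps(asdict(info)).encode("utf-8")


def restore_span(data: bytes) -> SpanInfo:
    """Rebuild the span info on whichever host ends the span."""
    return SpanInfo(**json.loads(data.decode("utf-8")))
```

A span that starts and ends on the same host never needs to go through `save_span` at all, which is the distinction drawn in the message above.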
@jim:acmegating.com | swest: it looks like you're saying that on the executor, we don't have the buildset object available to restore the parent, so we should save less info. i'll take a closer look at your change later today. | 13:59 |
@westphahl:matrix.org | corvus: yes, we need to include some bit of information with the build request that allows us to establish a relationship to the parent span as we don't have the buildset available there. same goes for e.g. merge or files changes jobs | 14:00 |
@westphahl:matrix.org | corvus: what I was trying to say is: I think this bit of information doesn't have to be the full span info (it would certainly do the trick) but maybe something smaller similar to the w3c trace context | 14:01 |
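For reference, the W3C trace context mentioned here carries only a version, the trace id, the parent span id, and a flags byte in its `traceparent` value. A minimal sketch of formatting and parsing that value follows; the `TraceParent` type and both function names are made up for illustration, and only version `00` is handled.

```python
import re
from typing import NamedTuple, Optional

# version "00": 2 hex digits, 32 hex trace id, 16 hex parent id, 2 hex flags
_TRACEPARENT_RE = re.compile(
    r"00-(?P<trace_id>[0-9a-f]{32})-(?P<parent_id>[0-9a-f]{16})"
    r"-(?P<flags>[0-9a-f]{2})"
)


class TraceParent(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def format_traceparent(tp: TraceParent) -> str:
    """Render the compact value that could travel with a request."""
    flags = "01" if tp.sampled else "00"
    return f"00-{tp.trace_id}-{tp.parent_id}-{flags}"


def parse_traceparent(value: str) -> Optional[TraceParent]:
    """Return the parsed value, or None if it is malformed."""
    match = _TRACEPARENT_RE.fullmatch(value.strip())
    if match is None:
        return None
    trace_id = match["trace_id"]
    parent_id = match["parent_id"]
    # All-zero ids are invalid according to the specification.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    sampled = bool(int(match["flags"], 16) & 0x01)
    return TraceParent(trace_id, parent_id, sampled)
```

This is enough to establish a parent/child relationship on the receiving side, but not enough to end the parent span there, which is the trade-off being discussed.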
-@gerrit:opendev.org- Simon Westphahl proposed: [zuul/zuul] 856523: wip: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 | 14:06 | |
@tristanc_:matrix.org | swest: corvus I am just trying to add some telemetry around git operations. I was wondering if we could use standalone spans, with just event_id /build_id attribute, but it sounds even better if they could be attached to a build/event parent span. | 14:09 |
@westphahl:matrix.org | tristanC: yeah, I think we should always establish a relationship to other/parent spans | 14:19 |
@clarkb:matrix.org | > <@sameer.deshpande:matrix.org> Thanks Clark. I'm facing an issue while trying to install Nodepool builder 3.12.0 on Ubuntu 20.04. The nodepool-builder service throws an error while starting. Any pointers to resolve this issue? | 14:33 |
> | ||
> systemctl status nodepool-builder.service | ||
> ● nodepool-builder.service - LSB: Nodepool-builder | ||
> Loaded: loaded (/etc/init.d/nodepool-builder; generated) | ||
> Active: active (exited) since Wed 2022-09-07 09:30:47 UTC; 20h ago | ||
> Docs: man:systemd-sysv-generator(8) | ||
> Tasks: 0 (limit: 9508) | ||
> Memory: 0B | ||
> CGroup: /system.slice/nodepool-builder.service | ||
> | ||
> File "/usr/local/lib/python3.8/dist-packages/d> | ||
> if is_socket(stdin_fd): | ||
> File "/usr/local/lib/python3.8/dist-packages/d> | ||
> file_socket = socket.fromfd(fd, socket.AF_IN> | ||
> File "/usr/lib/python3.8/socket.py", line 544,> | ||
> return socket(family, type, proto, nfd) | ||
> File "/usr/lib/python3.8/socket.py", line 231,> | ||
> _socket.socket.__init__(self, family, type, > | ||
> OSError: [Errno 88] Socket operation on non-sock> | ||
Nodepool 3.12.0 is about 2.5 years old and was created before Ubuntu 20.04 was released. I suspect that the version of Python there is simply too new for (and untested with) Nodepool 3.12.0. However, it is difficult to say because the traceback you pasted had truncated lines. If you can, I think upgrading is likely to be helpful, as you would then be running software that we can support. 3.12.0 isn't something we would be fixing or updating at this point. | ||
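The frames in the pasted traceback are truncated, so the root cause cannot be confirmed from the log. The final line does show `Errno 88` (`ENOTSOCK` on Linux), meaning a socket operation was attempted on a file descriptor that is not a socket. As a rough illustration only (this is not the code from the truncated frames), one way to ask whether a descriptor is a socket without triggering that error is to inspect its file type:

```python
import os
import stat


def is_socket(fd: int) -> bool:
    """Return True if the file descriptor refers to a socket.

    Uses fstat() instead of building a socket object from the
    descriptor, so a pipe or regular file yields False rather
    than an OSError.
    """
    try:
        mode = os.fstat(fd).st_mode
    except OSError:
        # Closed or otherwise invalid descriptor.
        return False
    return stat.S_ISSOCK(mode)
```

Under systemd a service's stdin is typically not a socket, so a check of this kind would return `False` there.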
-@gerrit:opendev.org- Thomas Cardonne marked as active: [zuul/zuul] 856294: feat(elasticsearch): support datastreams and rollover aliases https://review.opendev.org/c/zuul/zuul/+/856294 | 14:44 | |
@jim:acmegating.com | swest: yep, got it. i think the flaw in my plan was that the executor does not have a buildset object, so i agree we should add something for this case. how about i look into some options today and propose an update to my change and yours? | 14:46 |
@westphahl:matrix.org | corvus: sounds good | 14:47 |
@avass:vassast.org | `event_enqueue_processing_time` and `job_wait_time` seem to be kinda long for us (10-30 sec) while the schedulers are not using a lot of CPU and the executors are accepting jobs. But maybe I just don't know what's happening during those periods? | 14:56 |
@mbecker12:matrix.org | > <@mbecker12:matrix.org> Hi, I left a comment on my patch https://review.opendev.org/c/zuul/nodepool/+/853993 recently regarding the tests. corvus maybe you could have another look if you have a minute :) | 14:59 |
corvus: Could you take another look? I don't know how else to move forward with the tests there | ||
@jim:acmegating.com | mbecker12: yep, i'll try to take a look today, sorry i didn't get to it yesterday | 15:04 |
@avass:vassast.org | `job_wait_time` is the worst offender for us, because as far as I understand that's just time wasted holding a node while waiting for an executor to start running a job. | 15:05 |
@clarkb:matrix.org | Albin Vass: have you checked if you have a lot of executor load that is causing them to remove themselves from the available executor list (the various governors for resources can do this) | 15:35 |
@clarkb:matrix.org | Albin Vass: this might be an indication that you need more executors or bigger executors, but that data should be recorded via statsd and/or logging | 15:36 |
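To make the governor idea concrete: a governor compares some resource reading against a threshold and, when the reading is too high, the executor stops offering itself for new work. The sketch below is only an illustration of that pattern; `LoadGovernor`, its parameters, and the threshold value are hypothetical and are not taken from Zuul's code or configuration.

```python
import os
from typing import Callable, Optional


class LoadGovernor:
    """Hypothetical governor that pauses job acceptance under high load."""

    def __init__(
        self,
        max_load_per_cpu: float = 2.0,  # arbitrary value for illustration
        cpu_count: Optional[int] = None,
        get_load: Optional[Callable[[], float]] = None,
    ) -> None:
        self.cpu_count = cpu_count or os.cpu_count() or 1
        self.max_load = max_load_per_cpu * self.cpu_count
        # Default to the 1-minute load average; injectable for testing.
        self._get_load = get_load or (lambda: os.getloadavg()[0])

    def accepting(self) -> bool:
        """Return True while the host is under the load threshold."""
        return self._get_load() <= self.max_load
```

If every executor reports that it is accepting work, as described in the next messages, a governor of this kind is unlikely to be the cause of the wait.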
@avass:vassast.org | Clark: well, statsd shows that all executors are accepting jobs, and at least the load avg shouldn't be high enough to stop them accepting jobs. But I'll have to look into it more tomorrow | 15:37 |
@avass:vassast.org | and adding another two executors didn't change much except for distributing jobs to those as well. | 15:38 |
@noonedeadpunk:matrix.org | Hey! I've just realized, that it's not allowed somehow to copy workdir to remote host. So what's the way to actually copy current repo state to the third party host? | 16:00 |
@fungicide:matrix.org | Dmitriy Rabotyagov: https://zuul-ci.org/docs/zuul-jobs/general-roles.html#role-mirror-workspace-git-repos | 16:02 |
@noonedeadpunk:matrix.org | ah, thanks, indeed | 16:03 |
@noonedeadpunk:matrix.org | I guess I need to clone roles from repo first though | 16:03 |
@fungicide:matrix.org | Dmitriy Rabotyagov: zuul should prepare repo states in the local workspace for any required projects that job specifies | 16:04 |
@noonedeadpunk:matrix.org | I wonder if I can just run prepare-workspace against any host | 16:04 |
@fungicide:matrix.org | prepare-workspace is an alternative to mirror-workspace-git-repos | 16:06 |
@noonedeadpunk:matrix.org | Though prepare-workspace I believe is designed for localhost | 16:06 |
@noonedeadpunk:matrix.org | Ok, thanks, I will try this out now | 16:07 |
@fungicide:matrix.org | oh, yes sorry i meant to link https://zuul-ci.org/docs/zuul-jobs/general-roles.html#role-prepare-workspace-git | 16:09 |
@fungicide:matrix.org | (which is also not prepare-workspace) | 16:09 |
@fungicide:matrix.org | Dmitriy Rabotyagov: we just chuck it into our base job's pre-phase playbook like this: https://opendev.org/opendev/base-jobs/src/branch/master/playbooks/base/pre.yaml#L61 | 16:10 |
@noonedeadpunk:matrix.org | Yes, so the thing is it's also in base for our deployment. Though, I'm still trying to test post-deployment in a pre-review pipeline | 16:11 |
@noonedeadpunk:matrix.org | *post-merge | 16:11 |
@noonedeadpunk:matrix.org | Eventually, in post-merge it will be simply repo state from git... | 16:12 |
@noonedeadpunk:matrix.org | So I don't really need to sync workdir, but just clone the repo.... | 16:12 |
@fungicide:matrix.org | i also don't recall now why we have both mirror-workspace-git-repos and prepare-workspace-git, the descriptions look almost identical except for the rolevars they take | 16:12 |
@fungicide:matrix.org | both say they use git operations to copy the local git state to remote nodes | 16:13 |
@noonedeadpunk:matrix.org | For me it feels that prepare-workspace-git prepares basically workdir, while mirror-workspace-git-repos clones workdir content to remote nodes | 16:14 |
@noonedeadpunk:matrix.org | But I haven't looked deep enough to say for sure | 16:14 |
@noonedeadpunk:matrix.org | I just for some reason decided that I need the repo state from zuul/gerrit rather than just from git | 16:16 |
@noonedeadpunk:matrix.org | As if it's post-merge it means that anything I might need has already been in repo.... | 16:16 |
@clarkb:matrix.org | > <@fungicide:matrix.org> i also don't recall now why we have both mirror-workspace-git-repos and prepare-workspace-git, the descriptions look almost identical except for the rolevars they take | 16:16 |
One uses rsync and the other git operations. The one that uses git operations should likely be preferred | ||
@noonedeadpunk:matrix.org | So thanks for helping, seems I just had to say the issue aloud to realize that it's a stupid way to deal with it | 16:16 |
@noonedeadpunk:matrix.org | @clark the one that uses rsync is prepare-workspace | 16:17 |
@noonedeadpunk:matrix.org | these 2 both use git | 16:17 |
@clarkb:matrix.org | oh I didn't realize there was a third. Interesting | 16:17 |
@noonedeadpunk:matrix.org | Hm, but how to clone gerrit repo given it requires auth.... Zuul obviously does have access to it, but I bet keypair is removed quite early... | 16:23 |
@noonedeadpunk:matrix.org | Pffff.... | 16:23 |
@clarkb:matrix.org | > <@noonedeadpunk:matrix.org> Hm, but how to clone gerrit repo given it requires auth.... Zuul obviously does have access to it, but I bet keypair is removed quite early... | 16:26 |
The idea is that zuul would set the state for you and then your job can move that copy around without interacting with Gerrit. If you need to be able to push back to Gerrit though you'll need to manage a secret | ||
@noonedeadpunk:matrix.org | What I want is to copy the repo from localhost to a host that is not in the inventory at all. Likely the easiest thing is to make this host a static nodepool node.... | 16:27 |
@clarkb:matrix.org | you can use add_host to add it to the inventory as well. But even that isn't necessary you could just connect to that host directly | 16:28 |
@noonedeadpunk:matrix.org | So I was trying not to spawn a node and just deal with localhost, as all I need is to copy the current project to $host | 16:28 |
@noonedeadpunk:matrix.org | And apparently I can't do that | 16:28 |
@noonedeadpunk:matrix.org | As `"Syncing files from outside the working dir /var/lib/zuul/builds/0eaa2f0882c548f98828911e7be27e0a/work is prohibited"` | 16:30 |
@jim:acmegating.com | Dmitriy Rabotyagov: what version of zuul are you running? | 16:31 |
@noonedeadpunk:matrix.org | 4.9.0 | 16:31 |
@jim:acmegating.com | Dmitriy Rabotyagov: consider upgrading. 6.0.0 will let you do what you want in an untrusted playbook. | 16:32 |
@jim:acmegating.com | (< 6.0.0 you would need to do that in a trusted playbook) | 16:32 |
@jim:acmegating.com | https://zuul-ci.org/docs/zuul/latest/releasenotes.html#relnotes-6-0-0-upgrade-notes for details | 16:34 |
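The "outside the working dir" error quoted above is the result of a path containment check. The sketch below shows one common way such a check can be written on a POSIX system; `is_inside` is a hypothetical helper and is not taken from Zuul's source.

```python
import os


def is_inside(path: str, workdir: str) -> bool:
    """Return True if path resolves to a location under workdir.

    realpath() collapses ".." components and follows symlinks, so a
    path cannot escape the work dir through either trick.
    """
    real_path = os.path.realpath(path)
    real_root = os.path.realpath(workdir)
    # commonpath() compares whole path components, so a sibling such
    # as "/builds/x/work-other" is not mistaken for "/builds/x/work".
    return os.path.commonpath([real_path, real_root]) == real_root
```

A source such as `/etc/hosts` would fail this check for any build work dir, which is the kind of request the quoted message rejects.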
@noonedeadpunk:matrix.org | Well... Quite a few upgrade notes to read.... | 16:35 |
@noonedeadpunk:matrix.org | Also ansible-core 2.12 is sweet | 16:35 |
@noonedeadpunk:matrix.org | Ugh | 16:35 |
@noonedeadpunk:matrix.org | Tough choice :D | 16:36 |
@noonedeadpunk:matrix.org | sorry, another question: shouldn't vars defined for a job be available for the playbook "hosts"? | 17:10 |
@noonedeadpunk:matrix.org | As it seems that job vars are not yet loaded when the playbook is being launched | 17:11 |
@noonedeadpunk:matrix.org | Maybe it's fixed already in later versions though... | 17:11 |
@clarkb:matrix.org | Dmitriy Rabotyagov: the vars defined on jobs should be available in the playbooks | 17:22 |
@clarkb:matrix.org | they are just top level variables iirc. You can do host specific vars too. | 17:23 |
@jim:acmegating.com | fungi: i replied to your comments on the spec | 18:03 |
@fungicide:matrix.org | > <@jim:acmegating.com> fungi: i replied to your comments on the spec | 18:36 |
thanks! | ||
@noonedeadpunk:matrix.org | @clark:matrix.org Well, I have a playbook prepare_deploy_host.yml: https://paste.openstack.org/show/bBTYTEGVTJbchRGrR7ac/ and a job: https://paste.openstack.org/show/bSuSHH4qkw0JpP6JoyTn/ | 18:41 |
@noonedeadpunk:matrix.org | and it fails like https://paste.openstack.org/show/b8CrzxoJpzeTKOtLdlFN/ | 18:42 |
@noonedeadpunk:matrix.org | And I o_O | 18:42 |
@jim:acmegating.com | zuul-maint: would any one else like to look at the change to add ansible 6? https://review.opendev.org/853552 | 19:50 |
@jim:acmegating.com | that's the next change needed for 6.4.0 | 19:50 |
@clarkb:matrix.org | corvus: I can | 19:56 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855801: Add nodeset alternatives https://review.opendev.org/c/zuul/zuul/+/855801 | 20:01 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855691: Remove deprecated pipeline queue configuration https://review.opendev.org/c/zuul/zuul/+/855691 | 20:31 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 855691: Remove deprecated pipeline queue configuration https://review.opendev.org/c/zuul/zuul/+/855691 | 20:32 | |
-@gerrit:opendev.org- Zuul merged on behalf of Ian Wienand: [zuul/zuul] 856214: zuul_stream : Use !127.0.0.1 for loopback https://review.opendev.org/c/zuul/zuul/+/856214 | 21:05 | |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 853552: Add Ansible 6 https://review.opendev.org/c/zuul/zuul/+/853552 | 21:30 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 22:44 | |
- [zuul/zuul] 854458: Add support for configuring and testing tracing https://review.opendev.org/c/zuul/zuul/+/854458 | ||
- [zuul/zuul] 855096: Tracing: implement span save/restore https://review.opendev.org/c/zuul/zuul/+/855096 | ||
- [zuul/zuul] 855293: Add tracing tutorial https://review.opendev.org/c/zuul/zuul/+/855293 | ||
- [zuul/zuul] 856567: Add startSpanInContext tracing method https://review.opendev.org/c/zuul/zuul/+/856567 | ||
- [zuul/zuul] 856568: Use implicit trace context in build requests https://review.opendev.org/c/zuul/zuul/+/856568 | ||
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed on behalf of Simon Westphahl: [zuul/zuul] 856523: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 | 22:44 | |
@jim:acmegating.com | swest: tristanC ^ okay i did some surgery there. tristanC i pulled some of your change into swest's and then updated that to use the new simplified context swest was advocating. i added all of us as co-authors to that one. i also added another change on top of the stack which i consider optional -- we can decide how much we want to use the implicit context (basically, thread local variables) to save on having to pass spans around all the time. | 22:46 |
@jim:acmegating.com | tristanC: i think you could cherry-pick your change onto the end of the stack if you want, and maybe update it to look more like what i did with the context manager in executeJob | 22:47 |
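The "implicit context" mentioned above means the current span is kept in ambient state instead of being passed to every function. A minimal sketch of that pattern follows, using the standard library's `contextvars` (which behaves like a thread-local variable when used from plain threads). `use_span` and `current_span` are hypothetical names, not the methods added in the changes listed above.

```python
import contextlib
import contextvars

# Holds the span that is "current" for the running thread or task.
_current_span = contextvars.ContextVar("current_span", default=None)


@contextlib.contextmanager
def use_span(span):
    """Make span the implicit current span inside the with-block."""
    token = _current_span.set(span)
    try:
        yield span
    finally:
        # Restore whatever was current before, even on error.
        _current_span.reset(token)


def current_span():
    """Return the implicit current span, or None if there is none."""
    return _current_span.get()
```

Code deep in a call chain can then call `current_span()` to find its parent, at the cost of the relationship being less visible than an explicit argument.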
@jim:acmegating.com | the whole stack is updated because i updated the first change to use a noop tracer if tracing is disabled so that we are guaranteed working context managers everywhere | 22:48 |
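The point of a no-op tracer is that calling code can always use the same `with` statement whether or not tracing is enabled. A rough sketch of the pattern, with hypothetical class and function names (the method name `start_as_current_span` mirrors the OpenTelemetry tracer method, but nothing here uses the real library):

```python
import contextlib


class NoopSpan:
    """Span stand-in whose methods accept calls and do nothing."""

    def set_attribute(self, key, value):
        pass

    def end(self):
        pass


class NoopTracer:
    """Tracer stand-in that always yields a usable context manager."""

    @contextlib.contextmanager
    def start_as_current_span(self, name, **kwargs):
        yield NoopSpan()


def get_tracer(enabled, real_tracer_factory=None):
    """Return a real tracer if enabled, else the no-op one.

    real_tracer_factory is a hypothetical callable that builds the
    real tracer; it is only invoked when tracing is enabled.
    """
    if enabled and real_tracer_factory is not None:
        return real_tracer_factory()
    return NoopTracer()
```

With this in place, callers never need an `if tracing_enabled:` branch around their instrumentation.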
@iwienand:matrix.org | zuul-main: https://review.opendev.org/c/zuul/nodepool/+/853914 doesn't solve any of the openshift issues we discussed, but it does get the extant test working with f36 by installing a statically built client; removing a f35 dependency (the packages for the 3.11 oc client have been orphaned and don't work any more). this isn't the long-term solution, but at least gets us off a f35 dependency for now | 23:27 |
@clarkb:matrix.org | ianw: I guess the fedora 36 packages are also statically linked but they are built with the wrong compiler if I read the upstream issue correctly? The tarball from github would've been built with the time appropriate golang compiler and is happy | 23:31 |
@iwienand:matrix.org | Clark: yeah i think all go things are essentially statically linked? but yeah, building with the current go the platform ships is incompatible | 23:32 |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!