-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 849273: Dockerfile: move into separate group when running under cgroupsv2 https://review.opendev.org/c/zuul/nodepool/+/849273 | 01:13 | |
@tristanc_:matrix.org | tony.breeds: Clark: unfortunately the jobs defined in opendev.org/zuul/zuul-jobs often use `become`, sometimes just to ensure a piece of software is installed, and that does not work well with unprivileged containers. In that case, we set up a passwd alias for zuul that has uid 0; that way become is a no-op and the job can run without using setuid or sudo. | 12:29 |
---|---|---|
@jpew:matrix.org | The `tox-linters` appear to be broken and preventing gating for zuul? | 14:42 |
@jpew:matrix.org | e.g. https://review.opendev.org/c/zuul/zuul/+/850685 | 14:42 |
@clarkb:matrix.org | It is the new "missing whitespace after keyword" rule. That was brought up somewhere else as a problem for things like `assert(foo)` and `del(foo)` which is exactly the sort of thing failing in zuul. | 14:43 |
@jpew:matrix.org | Ah, it wants `assert (foo)` or `assert foo` b/c it's a keyword not a function? | 14:45 |
@clarkb:matrix.org | correct. I think `assert foo` is what they are likely looking for | 14:46 |
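
The rule discussed above can be illustrated with a small sketch. This is not pycodestyle's implementation; `keywords_missing_whitespace` is a hypothetical helper that uses the standard `tokenize` and `keyword` modules to find a Python keyword immediately followed by `(`, which is the pattern that `assert(foo)` and `del(foo)` share.

```python
import io
import keyword
import tokenize


def keywords_missing_whitespace(source):
    """Return (line, keyword) pairs where a keyword is directly followed by '('.

    Hypothetical helper for illustration only; the real linter rule may
    cover more cases than this sketch does.
    """
    hits = []
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    for tok, nxt in zip(tokens, tokens[1:]):
        is_keyword = tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
        is_paren = nxt.type == tokenize.OP and nxt.string == "("
        # tok.end == nxt.start means there is no whitespace between them.
        if is_keyword and is_paren and tok.end == nxt.start:
            hits.append((tok.start[0], tok.string))
    return hits


sample = "assert(foo)\ndel(foo)\nassert foo\nassert (foo)\n"
for line_number, name in keywords_missing_whitespace(sample):
    print(f"line {line_number}: missing whitespace after keyword {name!r}")
```

Both `assert foo` and `assert (foo)` pass this check, matching the two forms mentioned in the conversation.
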
-@gerrit:opendev.org- Tristan Cacqueray proposed: [zuul/zuul] 850575: doc: fix liveness probes path rendering https://review.opendev.org/c/zuul/zuul/+/850575 | 14:51 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 851895: Add whitespace around keywords https://review.opendev.org/c/zuul/zuul/+/851895 | 15:03 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: | 15:03 | |
- [zuul/zuul] 850109: Add tests for zuul-client job-graph https://review.opendev.org/c/zuul/zuul/+/850109 | ||
- [zuul/zuul] 850111: Add test for zuul-client freeze-job https://review.opendev.org/c/zuul/zuul/+/850111 | ||
- [zuul/zuul] 851107: Add job graph support to web UI https://review.opendev.org/c/zuul/zuul/+/851107 | ||
- [zuul/zuul] 851268: Add freeze job to web UI https://review.opendev.org/c/zuul/zuul/+/851268 | ||
- [zuul/zuul] 851604: Use internal links in job graph display https://review.opendev.org/c/zuul/zuul/+/851604 | ||
@jim:acmegating.com | zuul-maint: https://review.opendev.org/851895 would be great to merge asap | 15:04 |
@tobias.henkel:matrix.org | zuul-maint: did any of you observe random task hangs in the past similar to the issue in opendev a few weeks ago? I think we face something similar (we're still on python 3.8 and ansible 2.9). | 15:52 |
@tobias.henkel:matrix.org | all gdb backtraces I got so far from hanging tasks look the same: | 15:52 |
```
Thread 1 (Thread 0x7f9ba968a740 (LWP 1758660)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1 0x00007f9ba99e67d1 in __GI___pthread_mutex_lock (mutex=0x7f9ba9dfc990 <_rtld_global+2352>) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f9ba9ddf1ce in _dl_add_to_namespace_list (new=0x55b53d175e10, nsid=0) at dl-object.c:33
#3 0x00007f9ba9dda792 in _dl_map_object_from_fd (name=name@entry=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", origname=origname@entry=0x0, fd=-1, fbp=fbp@entry=0x7ffc97c7ca90, realname=<optimized out>, loader=loader@entry=0x0, l_type=<optimized out>, mode=<optimized out>, stack_endp=<optimized out>, nsid=<optimized out>) at dl-load.c:1382
#4 0x00007f9ba9ddca8d in _dl_map_object (loader=0x0, loader@entry=0x7f9ba9dce000, name=name@entry=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", type=type@entry=2, trace_mode=trace_mode@entry=0, mode=mode@entry=-1879048190, nsid=<optimized out>) at dl-load.c:2466
#5 0x00007f9ba9de6feb in dl_open_worker (a=a@entry=0x7ffc97c7cfe0) at dl-open.c:228
#6 0x00007f9ba97c157f in __GI__dl_catch_exception (exception=exception@entry=0x7ffc97c7cfc0, operate=operate@entry=0x7f9ba9de6f60 <dl_open_worker>, args=args@entry=0x7ffc97c7cfe0) at dl-error-skeleton.c:196
#7 0x00007f9ba9de6bba in _dl_open (file=0x7f9ba58b54b0 "/usr/local/lib/python3.8/lib-dynload/_crypt.cpython-38-x86_64-linux-gnu.so", mode=-2147483646, caller_dlopen=0x7f9ba9c675d1 <_PyImport_FindSharedFuncptr+113>, nsid=<optimized out>, argc=17, argv=0x7ffc97c82d58, env=0x7ffc97c82de8) at dl-open.c:599
```
@tobias.henkel:matrix.org | all hang during dl_open of _crypt.cpython-38-x86_64-linux-gnu.so | 15:53 |
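
When checking whether several gdb backtraces "look the same", it can help to reduce each one to its sequence of function names. The following is only a sketch under that assumption: `FRAME_RE` and `frame_functions` are hypothetical names, and the regular expression is written against the frame layout shown in the trace above, not against every format gdb can emit.

```python
import re

# Matches frame lines such as
#   "#0 __lll_lock_wait () at ..."
#   "#2 0x00007f9ba9ddf1ce in _dl_add_to_namespace_list (new=...) at ..."
# and captures the function name. Hypothetical pattern for illustration.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?(\S+) \(")


def frame_functions(backtrace):
    """Return the function names in a gdb backtrace, innermost frame first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


trace = """Thread 1 (Thread 0x7f9ba968a740 (LWP 1758660)):
#0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:103
#1 0x00007f9ba99e67d1 in __GI___pthread_mutex_lock (mutex=0x7f9ba9dfc990) at ../nptl/pthread_mutex_lock.c:115
#2 0x00007f9ba9ddf1ce in _dl_add_to_namespace_list (new=0x55b53d175e10, nsid=0) at dl-object.c:33
"""
print(frame_functions(trace))
```

Two traces can then be compared with `frame_functions(a) == frame_functions(b)`, which ignores addresses and argument values that differ between processes.
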
@fungicide:matrix.org | Albin Vass: that ^ was the thing you ran down initially, right? | 16:08 |
@tobias.henkel:matrix.org | I think it's different, but similar | 16:09 |
@tobias.henkel:matrix.org | for reference, that is Albin Vass's issue: https://github.com/ansible/ansible/issues/78270 | 16:11 |
@tobias.henkel:matrix.org | the stack traces look different | 16:11 |
@fungicide:matrix.org | and this isn't in pty allocation, it's loading a c lib | 16:13 |
@fungicide:matrix.org | out of curiosity, what glibc version? | 16:13 |
@fungicide:matrix.org | no signs that reads are generally hanging from that fs? is that on a job node or an executor? | 16:15 |
@tobias.henkel:matrix.org | it's debian buster 2.28-10+deb10u1 | 16:16 |
@tobias.henkel:matrix.org | it's on an executor | 16:16 |
@tobias.henkel:matrix.org | nope, no signs, io seems fine | 16:19 |
@clarkb:matrix.org | tobiash: that looks like you're waiting for a pthread mutex while opening the python crypt module's C component? | 17:20 |
@tobias.henkel:matrix.org | yes | 17:21 |
@tobias.henkel:matrix.org | however in the meantime I found a few different traces as well | 17:21 |
@clarkb:matrix.org | my initial hunch is that seems like a python bug | 17:21 |
@tobias.henkel:matrix.org | so no idea yet which direction to look in | 17:22 |
@tobias.henkel:matrix.org | maybe I'll try the shotgun method of upgrading to py3.10 and bullseye | 17:22 |
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 851895: Add whitespace around keywords https://review.opendev.org/c/zuul/zuul/+/851895 | 18:11 | |
@jpew:matrix.org | Clark: I can't seem to regate those 2 patches you reviewed yesterday, maybe I don't have permission? | 19:42 |
@clarkb:matrix.org | The string is `recheck` not `regate` | 19:43 |
@jpew:matrix.org | Clark: Ah got it. Thanks | 19:44 |
@iwienand:matrix.org | tobiash: yeah when i debugged that issue the stack clearly had grantpt() in it, so i do think it's different. but i guess be careful with 3.10 + bullseye because you might actually hit that issue. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1015740 does have a patch that i backported but i do think it's unlikely to make it into stable | 19:46 |
-@gerrit:opendev.org- Joshua Watt proposed: [zuul/zuul] 851931: doc: Fix Nodepool monitoring stats https://review.opendev.org/c/zuul/zuul/+/851931 | 20:15 | |
@clarkb:matrix.org | corvus: did you want to review https://review.opendev.org/c/zuul/nodepool/+/849273 since it changes how nodepool-builder is invoked on cgroupv2 hosts? | 20:19 |
@jim:acmegating.com | Clark: lgtm, thx | 20:21 |
-@gerrit:opendev.org- Zuul merged on behalf of Joshua Watt: [zuul/zuul] 850685: web: openapi: Fix item_ahead and items_behind https://review.opendev.org/c/zuul/zuul/+/850685 | 20:50 | |
-@gerrit:opendev.org- Zuul merged on behalf of Joshua Watt: [zuul/zuul] 851550: smtpreporter: Add pipeline to subject https://review.opendev.org/c/zuul/zuul/+/851550 | 20:54 | |
@clarkb:matrix.org | corvus: do you know why we required python2.7 support in ansible? re https://review.opendev.org/c/zuul/zuul-jobs/+/851343 is it because ansible may itself run against python2.7 on the remote end and this gives us some checking that it is going to function in that case? | 21:22 |
@clarkb:matrix.org | ianw: ^ fyi I suspect that may be the reason and we may not be able to remove the testing if that is the case | 21:23 |
@jim:acmegating.com | Clark: yeah, we need to at least issue the next major rev of zuul where we drop older ansible to do that i think. and even then, i think it's just that we can look into what the current ansible support policy is. is there a reason we need to drop that now? | 21:51 |
@iwienand:matrix.org | something started failing with the py27 job as i was working on the linter stack and i put in the bits to remove it. i'm afraid i can't remember off the top of my head what that something was | 22:00 |
@jim:acmegating.com | ianw: looks like it's okay in general? was just a one-time fluke? | 22:01 |
@iwienand:matrix.org | i'll have to dig back | 22:04 |
@iwienand:matrix.org | https://docs.ansible.com/ansible/latest/dev_guide/:ref:managed-node-requirements is currently a 404 | 22:11 |
@iwienand:matrix.org | https://github.com/ansible/ansible/commit/2fc73a9dc357e776dbbbfd035c86fe880415e60a appears to have introduced this | 22:18 |
@iwienand:matrix.org | as usual from ansible a super unhelpful commit message with no context. WHY DO THEY DO THIS!!!!!! | 22:20 |
@iwienand:matrix.org | i submitted https://github.com/ansible/ansible/pull/78424 to fix the control node link, and https://github.com/ansible/ansible/issues/78423 to try and figure out what ansible thinks works where | 23:17 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/nodepool] 851940: Bump dib to 3.23.0 https://review.opendev.org/c/zuul/nodepool/+/851940 | 23:53 |