-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 857796: Remove support for Ansible 2 https://review.opendev.org/c/zuul/zuul/+/857796 | 00:14 | |
@jim:acmegating.com | i think that's everything staged for 7.0.0 | 00:15 |
---|---|---|
@iwienand:matrix.org | corvus: Clark if you have a sec can you look at https://gerrit-review.googlesource.com/c/plugins/zuul-results-summary/+/345474. it seems upstream has both disabled being able to upload new patchsets, and the ability for the submitter to approve/merge their change. So I couldn't make edits to the original, and can't merge the updated change I had to upload either | 00:45 |
@jim:acmegating.com | ianw: are you sure that first part (patchsets) is intentional? | 00:50 |
@iwienand:matrix.org | no ... i tried adding permissions for registered users to be able to upload patchsets but it didn't work. i figured that it's more complex than that and relates to cla's etc. as to what would have to be enabled for that to work | 00:51 |
@jim:acmegating.com | ianw: is any of the metadata associated with the comments available to the plugin? | 00:53 |
@jim:acmegating.com | yeah, we have the tags there | 00:54 |
@jim:acmegating.com | ianw: can we encode the pipeline name in a tag? | 00:54 |
@jim:acmegating.com | also maybe the status too? maybe drop that whole re? | 00:54 |
@iwienand:matrix.org | probably, we already look for the autogenerated | 00:55 |
@jim:acmegating.com | oh the pipeline is already there | 00:55 |
@jim:acmegating.com | it's actually autogenerated:zuul:pipeline | 00:55 |
@iwienand:matrix.org | istr having to add in the non-pipeline matching after the fact to support, something. old zuul maybe | 00:56 |
@jim:acmegating.com | well, old zuul hasn't been supported for years | 00:57 |
@jim:acmegating.com | at any rate, seems like looking for the tag first and then falling back to old regexes would be doable/best? | 00:58 |
@jim:acmegating.com | and then adding in a new tag for status might be a further improvement | 00:59 |
@jim:acmegating.com | i do think that's worthwhile, 'cause i don't love the idea that we would say "you must configure a non-default setting for this to work" | 00:59 |
@iwienand:matrix.org | ok, happy for that to have comments as such. i don't think i'm going to let myself get sidetracked into that atm :) | 01:01 |
@jim:acmegating.com | (even if i also agree that changing the default would be a good idea) | 01:01 |
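
The "tag first, then fall back to the old regexes" idea above could look roughly like the sketch below. It is written in Python for illustration only (the plugin itself is not Python). The `autogenerated:zuul:` prefix comes from the discussion; the fallback pattern and the function name `pipeline_from_comment` are assumptions, not the plugin's actual code.

```python
import re
from typing import Optional

# Tag prefix mentioned in the discussion ("autogenerated:zuul:<pipeline>").
TAG_PREFIX = "autogenerated:zuul:"

# Hypothetical fallback pattern for comments without a usable tag.
# The exact message format matched here is an assumption.
FALLBACK_RE = re.compile(
    r"^Build \w+ \((?P<pipeline>[\w-]+) pipeline\)", re.MULTILINE
)


def pipeline_from_comment(tag: Optional[str], message: str) -> Optional[str]:
    """Return the pipeline name, preferring the tag over the message text."""
    if tag and tag.startswith(TAG_PREFIX):
        name = tag[len(TAG_PREFIX):]
        if name:
            return name
    # No tag (or an empty one): fall back to parsing the comment body.
    match = FALLBACK_RE.search(message)
    return match.group("pipeline") if match else None
```

With this ordering, a deployment that uses the default tagging needs no extra configuration, and the regex only matters for comments that lack the tag.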
@clarkb:matrix.org | ianw: now they are offset the other direction :/ | 01:53 |
@iwienand:matrix.org | sigh -- i tried to copy in the classes that are styling the button | 01:54 |
@iwienand:matrix.org | i don't seem to have a screen that replicates it | 01:55 |
@clarkb:matrix.org | on mobile they seem to be aligned | 01:55 |
@iwienand:matrix.org | i guess we have a mac laptop with a retina display ... maybe that will | 01:56 |
@clarkb:matrix.org | if I reduce my screen's horizontal width they appear to align as on mobile | 01:56 |
@clarkb:matrix.org | but with my wide fullscreen default browser I get the shift | 01:57 |
@clarkb:matrix.org | I'll have to look more closely tomorrow, but wanted to followup on that tonight at least | 01:57 |
@jim:acmegating.com | that's 857794 for me | 01:58 |
@clarkb:matrix.org | ya that is what I get | 01:59 |
@clarkb:matrix.org | Maybe because the button is hidden it doesn't count towards the alignment? | 01:59 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul] 857794: web: better non-expandable console padding https://review.opendev.org/c/zuul/zuul/+/857794 | 02:04 | |
@iwienand:matrix.org | ^ that overrides it "harder" ... instead of using className it just sets class | 02:04 |
@iwienand:matrix.org | that *should* make it so it's a <div pf-c-datalist__item-control> -> <div pf-c-datalist__item-toggle> -> <button style="visibility: hidden" disabled> | 02:06 |
@iwienand:matrix.org | AFAICS, that's the hierarchy react makes | 02:06 |
@iwienand:matrix.org | https://13b50ff454f43e41fea1-70aee045aa856a76767e0cd0433cf359.ssl.cf5.rackcdn.com/857794/3/check/zuul-build-dashboard-opendev/71d3b80/npm/html/ | 02:34 |
@iwienand:matrix.org | lines up for me on the retina display, at least | 02:34 |
-@gerrit:opendev.org- Simon Westphahl proposed: | 09:52 | |
- [zuul/zuul] 856523: Add span for builds and propagate via request https://review.opendev.org/c/zuul/zuul/+/856523 | ||
- [zuul/zuul] 857421: Trace merge requests and merger operations https://review.opendev.org/c/zuul/zuul/+/857421 | ||
-@gerrit:opendev.org- Simon Westphahl proposed on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul] 855096: Tracing: implement span save/restore https://review.opendev.org/c/zuul/zuul/+/855096 | 09:52 | |
-@gerrit:opendev.org- Alfredo Moralejo proposed: [zuul/zuul-jobs] 857730: Use AFS mirrors for extras-common in CS9 https://review.opendev.org/c/zuul/zuul-jobs/+/857730 | 10:17 | |
@westphahl:matrix.org | corvus: I fixed the failing tracing tests in 855096. had to reset the global tracer provider in the otlp fixture | 13:40 |
@fungicide:matrix.org | is there a consensus position/policy for when non-backward-compatible changes to roles in zuul-jobs need to be announced to the ml? | 15:22 |
@fungicide:matrix.org | trying to understand whether https://review.opendev.org/857730 needs advance notification | 15:23 |
@clarkb:matrix.org | I would announce changes like that or put them behind a toggle | 15:24 |
@clarkb:matrix.org | ianw: corvus the latest version of the row content alignment change looks great on both my laptop and desktop | 15:29 |
@fungicide:matrix.org | > <@clarkb:matrix.org> I would announce changes like that or put them behind a toggle | 15:29 |
yeah, it does seem like "you should just be copying whatever opendev does, and if you don't keep up to date on unannounced changes in their systems then your jobs will break" isn't an ideal policy for our standard jobs library | ||
@jim:acmegating.com | fungi: Clark https://zuul-ci.org/docs/zuul-jobs/policy.html#deprecation-policy says 2 week notice for backwards incompat changes | 15:36 |
@jim:acmegating.com | fungi: i left a comment, thanks | 15:39 |
@fungicide:matrix.org | perfect, thanks! i should have looked harder for that, but i forgot we'd codified it | 15:42 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 857916: DNM: Test jobs with Ansible 6 https://review.opendev.org/c/zuul/zuul/+/857916 | 15:57 | |
@jim:acmegating.com | Clark: ^ let's see if that shows any surprises... if not, i think we can make the release | 15:58 |
@clarkb:matrix.org | ++ | 16:00 |
-@gerrit:opendev.org- Alfredo Moralejo proposed: [zuul/zuul-jobs] 857730: Use extras-common repo in CS9 from package_mirror https://review.opendev.org/c/zuul/zuul-jobs/+/857730 | 16:03 | |
-@gerrit:opendev.org- Alfredo Moralejo proposed: [zuul/zuul-jobs] 857730: Use extras-common repo in CS9 from package_mirror https://review.opendev.org/c/zuul/zuul-jobs/+/857730 | 16:06 | |
@clarkb:matrix.org | corvus: looks like post failures for one job. Probably worth checking that is just because of an ansible version incompatibility. But otherwise seems it is running things | 16:09 |
@clarkb:matrix.org | ya that seems consistent so either a playbook compat issue (fine) or something bigger (maybe not fine) | 16:10 |
@clarkb:matrix.org | corvus: localhost -> localhost | ModuleNotFoundError: No module named 'openstack' when doing the swift upload | 16:12 |
@jim:acmegating.com | i don't see any errors in the build when we set up the virtualenv https://zuul.opendev.org/t/zuul/build/3651181f9c4a4b4291055c4efbf76927/log/job-output.txt#7527 | 16:17 |
@clarkb:matrix.org | maybe a difference in how new ansible looks up external libs? | 16:17 |
@jim:acmegating.com | i'm running a zuul-executor:latest container locally. if i activate the venvs for 5 and 6, "import openstack" works in both | 16:19 |
@clarkb:matrix.org | I think the import that is failing is failing in the ansiballz injected python to "remote" context | 16:20 |
@clarkb:matrix.org | They do copy all the ansible module utils import content they can find. I'm not sure what the expectation is for external libs | 16:21 |
@jim:acmegating.com | well, it's on localhost, so it should use the same python interpreter | 16:22 |
@clarkb:matrix.org | good point | 16:22 |
@jim:acmegating.com | (i mean, that's the expectation, but maybe that is broken; maybe it's reverting to system python) | 16:23 |
@jim:acmegating.com | i'll see if i can make a simple local playbook | 16:23 |
@clarkb:matrix.org | I half wonder if this could be a side effect of pipelining since that changes the way ansiballz works. Theoretically possible ansible 5 would've had similar issues with pipelining depending on what the issue is | 16:27 |
@jim:acmegating.com | i have failed to reproduce it... my simple test has crashed at `/usr/local/lib/zuul/ansible/6/lib/python3.10/site-packages/openstack/config/loader.py\", line 507, in _get_base_cloud_config\n raise exceptions.ConfigException(\nopenstack.exceptions.ConfigException: Cloud test was not found.\n` | 16:29 |
@jim:acmegating.com | which appears to be after the openstack import | 16:29 |
@clarkb:matrix.org | the job I took that from was c8fee434b0eb415a80d17ea0ff7bca6f and it ran on ze05.opendev.org which is running the newer of the two executor images. That rules out perhaps a corrupt older image and :latest being different and working | 16:30 |
@clarkb:matrix.org | corvus: https://paste.opendev.org/show/bGkFMtUPYpRn2aFkmfAz/ is the full traceback | 16:32 |
@clarkb:matrix.org | /usr/local/lib/python3.10/runpy.py is in the traceback and not some other path for the ansible venv | 16:33 |
@clarkb:matrix.org | I suspect that your hunch is correct | 16:33 |
@jim:acmegating.com | why wouldn't that be reproducible though? | 16:40 |
@jim:acmegating.com | i'm starting to wonder if we need to hold build dirs and poke around on the prod servers | 16:40 |
@clarkb:matrix.org | corvus: I can half reproduce this by creating a test.py file with 'import openstack' in it then using /usr/bin/python3 to runpy.run_path('test.py'). If I use /usr/local/lib/zuul/ansible/6/bin/python3 then no error occurs | 16:42 |
@jim:acmegating.com | (i made an ansible.cfg with as much of the relevant settings as i could) | 16:42 |
@clarkb:matrix.org | I think that confirms your hunch must be the mechanism, now we need to figure out why ansible causes that to happen | 16:42 |
@clarkb:matrix.org | (I'm doing this locally in the zuul-executor:latest image) | 16:42 |
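
A minimal sketch of the check described above: ask each interpreter, in a fresh process, whether it can import a module. The helper name `can_import` is hypothetical. The two interpreter paths are the ones quoted in the discussion and are assumed to exist only inside the executor image.

```python
import subprocess
import sys


def can_import(interpreter: str, module: str) -> bool:
    """Return True if `interpreter` can import `module` in a fresh process."""
    result = subprocess.run(
        [interpreter, "-c", f"import {module}"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Paths taken from the discussion; they may not exist on other machines.
    for python in (
        "/usr/bin/python3",
        "/usr/local/lib/zuul/ansible/6/bin/python3",
    ):
        try:
            print(python, can_import(python, "openstack"))
        except FileNotFoundError:
            print(python, "not present on this machine")
```

If the system interpreter reports False and the venv interpreter reports True, that is consistent with the module running under the wrong Python rather than with a broken install.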
@jim:acmegating.com | to clarify: my test is running "ansible-playbook" with the zuul-jobs upload role and an ansible.cfg, but with a bunch of stuff deleted | 16:43 |
@jim:acmegating.com | so if it's *just* ansible, i would expect it to fail. it seems like there may be a "how zuul runs ansible" component to this | 16:43 |
@clarkb:matrix.org | corvus: are you using delegate_to or hosts: localhost? I wonder if the behavior possibly differs | 16:45 |
@clarkb:matrix.org | (the failing role uses delegate_to) | 16:45 |
@jim:acmegating.com | no, was using hosts:localhost; will try | 16:51 |
@clarkb:matrix.org | https://github.com/ansible/ansible/issues/63180 is a similar issue but in theory fixed long ago | 16:53 |
@clarkb:matrix.org | But maybe a similar regression was reintroduced | 16:53 |
@jim:acmegating.com | Clark: same result running with a real host in inventory (but with task delegated to localhost) | 16:55 |
@jim:acmegating.com | i'm going to hold the build dirs | 16:56 |
@clarkb:matrix.org | In the issue I linked the problem was a task running before the delegate_to would populate the python interpreter fact. Then when you delegate_to the python_interpreter fact from before was polluting the delegate | 16:57 |
@jim:acmegating.com | oh, i'll futz with that a bit in my test then before i set keep | 17:02 |
@clarkb:matrix.org | The issue there seemed to be that the discovered interpreter would leak across multiple delegate_tos which is weird. Seems like they should rediscover for sure. | 17:05 |
@clarkb:matrix.org | Here's openstack ansible doing the inverse of what we want: https://opendev.org/openstack/openstack-ansible-openstack_hosts/commit/393175577c3d4a8024fab2563b683dece10d46eb | 17:05 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul] 857916: DNM: Test jobs with Ansible 6. https://review.opendev.org/c/zuul/zuul/+/857916 | 17:11 | |
@jim:acmegating.com | i set keep everywhere and set an autohold on tox-linters | 17:12 |
@jim:acmegating.com | tox-linters is running on ze12 and is build ee3921cb334542df9920d116375ce0ed | 17:14 |
@jim:acmegating.com | okay, i have a bwrapped shell and a test playbook inside of that build dir on ze12 | 17:40 |
@jim:acmegating.com | and i can reproduce the problem there | 17:40 |
@jim:acmegating.com | my playbook is two tasks: a debug output of ansible_python_interpreter, then an include_role for the swift upload | 17:45 |
@jim:acmegating.com | "ansible_python_interpreter": "/usr/local/lib/zuul/ansible/6/bin/python" | 17:45 |
@clarkb:matrix.org | corvus: I've also discovered ansible_playbook_python which is a var pointing at the python running on the controller | 17:46 |
@jim:acmegating.com | Clark: same value | 17:47 |
@jim:acmegating.com | i'm going to try flipping switches in ansible.cfg | 17:48 |
@jim:acmegating.com | removing pipelining and fact caching doesn't change anything | 17:51 |
@jim:acmegating.com | https://paste.opendev.org/show/btfPjDuVNTWxXk26q7H9/ is the current minimal config i'm using | 17:52 |
@clarkb:matrix.org | corvus: looking at the discovery file (https://github.com/ansible/ansible/blob/v2.13.4/lib/ansible/executor/interpreter_discovery.py) there should be some debug printing that is helpful in understanding its decision making | 17:55 |
@clarkb:matrix.org | I get '<localhost> Python interpreter discovery fallback (unsupported Linux distribution: opensuse-tumbleweed)' on my local machine when gathering facts for example. Then it uses /usr/bin/python3.10 | 17:57 |
@clarkb:matrix.org | And sure enough when I do a delegate_to: localhost it uses that python version. Not the python in my virtualenv | 17:58 |
@jim:acmegating.com | Clark: i'm not following. can you explain in more detail? | 18:00 |
@jim:acmegating.com | like... when you say "it uses" what are you referencing? | 18:00 |
@clarkb:matrix.org | corvus: The ansiballz executable that ansible executes to accept module execution stuff runs under that python that discovery found. This appears true whether or not pipelining is enabled | 18:01 |
@clarkb:matrix.org | It doesn't seem to be running using the local interpreter for the control process when running delegate_to. | 18:02 |
@clarkb:matrix.org | I'm going to downgrade to ansible 5 and see if that continues to happen | 18:02 |
@jim:acmegating.com | i'm able to reproduce in my local container now | 18:03 |
@jim:acmegating.com | the problem before was that i had the venv activated | 18:03 |
@jim:acmegating.com | if i run /usr/local/lib/zuul/ansible/5/bin/ansible-playbook in my local container, it succeeds, and if i run /usr/local/lib/zuul/ansible/6/bin/ansible-playbook it fails | 18:04 |
@jim:acmegating.com | as long as the venv isn't activated | 18:04 |
@jim:acmegating.com | Clark: how do you know what interpreter it's using? | 18:05 |
@jim:acmegating.com | from the vvv output? | 18:05 |
@clarkb:matrix.org | yes it tells you in the vvv output as part of the ssh commands it lists | 18:06 |
@jim:acmegating.com | ansible 5: | 18:06 |
localhost | EXEC /bin/sh -c '/usr/local/lib/zuul/ansible/5/bin/python /root/.ansible/tmp/ansible-tmp-1663264992.1060188-1538-134089289929445/AnsiballZ_zuul_swift_upload.py && sleep 0' | |
ansible 6: | ||
localhost | EXEC /bin/sh -c '/usr/bin/env python3 /root/.ansible/tmp/ansible-tmp-1663264998.1375575-1573-7321559521200/AnsiballZ_zuul_swift_upload.py && sleep 0' | |
@jim:acmegating.com | those are both from my local container, but in prod, i see it's also using `env` (my prod bwrap doesn't have ansible 5 mapped into it, so i only have 6 available to test easily there) | 18:07 |
@jim:acmegating.com | Clark: i don't see any discover output... do you think gather_facts: true is needed for that? | 18:09 |
@clarkb:matrix.org | corvus: that may be the case. Its part of fact gathering in my local playbook at least according to -vvv | 18:09 |
@clarkb:matrix.org | _execute_module() in lib/ansible/plugins/action/__init__.py is where this appears to happen | 18:13 |
@jim:acmegating.com | i don't see any interesting debug messages in the fact gathering phase | 18:16 |
@clarkb:matrix.org | corvus: does local_action instead of delegate_to change behavior? https://docs.ansible.com/ansible/latest/inventory/implicit_localhost.html indicates that it uses ansible_playbook_python for local_action | 18:23 |
@jim:acmegating.com | Clark: i'll check -- meanwhile, i just found this suspicious commit 8d41b97329cae281ce194dbb8cb3ce35fdce23ec | 18:24 |
@clarkb:matrix.org | I'm having a hard time getting an ansible yaml inventory that looks like the one in that doc to stop ssh'ing | 18:24 |
@jim:acmegating.com | Clark: i'm not sure how to use local_action with the swift upload module | 18:27 |
@clarkb:matrix.org | https://github.com/ansible/ansible/issues/16724#issuecomment-269994368 is maybe helpful background | 18:30 |
@jim:acmegating.com | Clark: that suggests that local_action is an alias for delegate_to... are we sure we want to run this down? it seems like understanding that commit ^ might be helpful? | 18:31 |
@clarkb:matrix.org | corvus: that commit == 8d41b97329cae281ce194dbb8cb3ce35fdce23ec ? and yes looks like local_action and delegate_to use the same implicit localhost | 18:32 |
@clarkb:matrix.org | so I don't expect them to be different | 18:33 |
@jim:acmegating.com | yeah | 18:33 |
@clarkb:matrix.org | https://github.com/ansible/ansible/commit/8d41b97329cae281ce194dbb8cb3ce35fdce23ec link for anyone wanting to pull that up quickly | 18:41 |
@gobi_g:matrix.org | Hi | 18:45 |
Have one weird question. | ||
Is it possible to limit the re-gate of MRs? | ||
Reason: people not trying to solve/look into the failures they're just re-gating the MRs. Its wasting lot of resources. | ||
@clarkb:matrix.org | corvus: check out 9142be2f6cabbe6597c9254c5bb9186d17036d55 | 18:47 |
@clarkb:matrix.org | corvus: I think they are looking at the shebang in the swift role | 18:47 |
@clarkb:matrix.org | That's why you get the `/usr/bin/env python3`: that is what is in the file | 18:48 |
@jim:acmegating.com | Clark: ah yes, that makes sense... | 18:48 |
@clarkb:matrix.org | Thats super unexpected to me because the shebang should only be interpreted by the kernel if the file is being executed directly | 18:48 |
@jim:acmegating.com | and i have no idea how we're going to deal with that | 18:48 |
@clarkb:matrix.org | corvus: can we just drop the shebang? I don't know that the file needs to be directly executable without specifying an interpreter on the command line | 18:49 |
@jim:acmegating.com | oh hrm | 18:49 |
@jim:acmegating.com | Clark: yes that works | 18:50 |
@clarkb:matrix.org | > <@gobi_g:matrix.org> Hi | 18:50 |
> Have one weird question. | ||
> Is it possible to limit the re-gate of MRs? | ||
> | ||
> Reason: people not trying to solve/look into the failures they're just re-gating the MRs. Its wasting lot of resources. | ||
Zuul doesn't currently have a way to limit based on the number of previous attempts. With Gerrit you can look at the vote state and prevent gating if there is already a -2 as one option. In general though I would communicate with your users and address this with them directly | ||
@jim:acmegating.com | so i think we have 2 options: | 18:51 |
1) make that change in zuul-jobs and update our release notes to highlight it as a potential gotcha | ||
2) or try to emulate venv activation when we invoke ansible-playbook | ||
@jim:acmegating.com | i think i like door number 1....? | 18:51 |
@clarkb:matrix.org | corvus: I'm going to withhold judgement on that change as I don't know what motivated it. But my initial hunch is that this is a bug in ansible. The shebang is for execvpe or whatever the syscall is | 18:51 |
@clarkb:matrix.org | corvus: yes I like the simplicity of 1. | 18:51 |
@jim:acmegating.com | Clark: yes, i'm with you on that. | 18:52 |
@clarkb:matrix.org | people that want to run those scripts on the command line outside of ansible can run them directly under the interpreter they want to use | 18:52 |
@clarkb:matrix.org | that shouldn't be a big burden | 18:52 |
@jim:acmegating.com | Clark: i get the expected behavior in my local test container with all 4 versions of ansible after removing the shebang | 18:55 |
@clarkb:matrix.org | corvus: my vote is on that then :) | 18:55 |
@clarkb:matrix.org | corvus: and maybe we need to check for shebangs across zuul-jobs too | 18:55 |
@jim:acmegating.com | yeah, i'll propose a change that removes them from all python libraries in zuul-jobs | 18:56 |
@vlotorev:matrix.org | > <@iwienand:matrix.org> corvus: Clark if you have a sec can you look at https://gerrit-review.googlesource.com/c/plugins/zuul-results-summary/+/345474. it seems upstream has both disabled being able to upload new patchsets, and the ability for the submitter to approve/merge their change. So I couldn't make edits to the original, and can't merge the updated change I had to upload either | 18:56 |
Hi, regarding supporting pipeline name in zuul-results-summary. Gerrit has introduced Checks API, will Zuul support it? If 'yes', then how is zuul-results-summary going to be useful? Or does the Checks API not play nicely with Zuul's dynamic job creation and multiple pipelines? | ||
@jim:acmegating.com | vlotorev: the checks plugin is deprecated, don't use it. the javascript checks api should be compatible with zuul, but to my knowledge, no one has volunteered to work on an implementation. | 18:58 |
@clarkb:matrix.org | ya, we just need someone to write the checks api thing to render the results | 18:58 |
@clarkb:matrix.org | corvus: knowing what we know now it is probably safe to proceed with the zuul release using the commit already chosen. But probably a good idea to confirm that updating zuul-jobs doesn't create regressions more broadly first? It should be pretty safe though as old ansible seems to have ignored the shebang (which is what I would expect it to do) | 19:02 |
@jim:acmegating.com | Clark: agreed | 19:06 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 857948: Remove shebang from all python ansible modules https://review.opendev.org/c/zuul/zuul-jobs/+/857948 | 19:06 | |
@jim:acmegating.com | Clark: based on our knowledge, do you think we need a base-test cycle, or merge and watch/revert? | 19:07 |
@clarkb:matrix.org | corvus: I'm like 95% comfortable that this won't break spectacularly | 19:08 |
@clarkb:matrix.org | I don't think we were relying on a specific python2 vs python3 for any of those | 19:08 |
@jim:acmegating.com | zuul-maint: would you please review 857948 with some urgency? it's needed for ansible 6 and is blocking our release | 19:10 |
@clarkb:matrix.org | There was clearly some sort of shebang handling in the past though and I'm not sure what that may have been doing | 19:11 |
@clarkb:matrix.org | I don't think we ever want shebang handling; that just makes no sense to me | 19:11 |
@clarkb:matrix.org | ansible runs on such a variety of systems that it necessarily needs to interpret the best option for the given context and use that | 19:12 |
@clarkb:matrix.org | being explicit in a shebang is there as a fallback when nothing else is possible. It isn't a directive that this is the only option | 19:12 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 857970: DNM: check sfio sphinx jobs https://review.opendev.org/c/zuul/zuul-jobs/+/857970 | 19:14 | |
@jim:acmegating.com | tristanC: thanks for the comment; i just double checked the failures with ^ and confirmed they are unrelated | 19:17 |
@clarkb:matrix.org | corvus: in the old side of the ansible diff is a bit of code that says assume binary when shebang isn't found | 19:17 |
@clarkb:matrix.org | I think that may be the only concern I've got is that it may try to treat these as binaries rather than python scripts | 19:17 |
@clarkb:matrix.org | But your local testing seems to indicate that doesn't happen? | 19:17 |
@jim:acmegating.com | yeah, i got the "cloud test not found" error from python | 19:18 |
@clarkb:matrix.org | it's possible another fix may be to just drop the use of env? | 19:18 |
@clarkb:matrix.org | It seems like they are trying to infer python versions and maybe they would find the right python version in the virtualenv. but since we use env python it is using the linux tool to find the global install | 19:19 |
@clarkb:matrix.org | That may be worth testing if you still have your env available | 19:19 |
@jim:acmegating.com | "module_stderr": "/bin/sh: 1: /usr/bin/python3: not found\n", | 19:20 |
@clarkb:matrix.org | I'm happy with dropping the shebangs if that works. That seems the least surprising to me as less magic should be happening if we provide fewer inputs to magic off of | 19:20 |
@jim:acmegating.com | that's with `#!/usr/bin/python3` | 19:20 |
@jim:acmegating.com | that what you meant? | 19:20 |
@clarkb:matrix.org | corvus: yes, I was hoping maybe ansible elsewhere would look for python3 in the venv, seems like not so that won't work | 19:20 |
@jim:acmegating.com | yep, no joy | 19:20 |
@clarkb:matrix.org | might also be good to ask ansible what the intended behavior is here | 19:21 |
@clarkb:matrix.org | because this seems like hidden magic that I would never expect | 19:21 |
@clarkb:matrix.org | but thats because I know about length limits in shebangs for the kernel in exec paths because linux | 19:21 |
@clarkb:matrix.org | Now that we've mostly run this down I'm going to pop out for lunch and a bike ride. I'm good with proceeding with the shebang removal. I guess be ready for quick revert if that becomes necessary? fungi you may have thoughts too | 19:23 |
@clarkb:matrix.org | I'll be back in a bit | 19:23 |
@jim:acmegating.com | sounds good. i just made a sandwich i will eat at my desk so i'll be around if a revert is needed | 19:27 |
@jim:acmegating.com | there was a failure in one of the eighty bazillion jobs that ran... | 19:42 |
@jim:acmegating.com | https://zuul.opendev.org/t/zuul/build/3b670c2e429045b3bb723a79321ee7be/console#1/0/2/ubuntu-focal | 19:42 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 857973: DNM: check generate-manifest job https://review.opendev.org/c/zuul/zuul-jobs/+/857973 | 19:56 | |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 857974: DNM: check generate-manifest job 2.9 https://review.opendev.org/c/zuul/zuul-jobs/+/857974 | 19:58 | |
@jim:acmegating.com | okay the "good" news is that the noop change failed the same way | 20:05 |
@jim:acmegating.com | so the failure isn't related to the shebang change | 20:05 |
@jim:acmegating.com | the 2.9 job succeeded, so whatever broke *that* happened between 2.9 and 5 | 20:05 |
@gobi_g:matrix.org | > <@clarkb:matrix.org> Zuul doesn't currently have a way to limit based on the number of previous attempts. With Gerrit you can look at the vote state and prevent gating if there is already a -2 as one option. In general though I would communicate with your users and address this with them directly | 20:12 |
Thanks. Found one option. Using the get build details API, I filtered by change and result. So, if there are more than 3, I can block the re-gate (by moving the MR to draft state) or add some comments | ||
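
The counting step described above might be sketched as follows. It assumes the build records have already been fetched and that each record is a dict with `pipeline` and `result` keys; those field names, the `gate` pipeline name and the helper `should_block_regate` are assumptions for illustration, not a documented Zuul feature.

```python
from typing import Dict, Iterable

# Threshold from the message above ("more than 3").
MAX_FAILED_GATE_BUILDS = 3


def should_block_regate(builds: Iterable[Dict], pipeline: str = "gate") -> bool:
    """Return True when a change has too many failed builds in `pipeline`.

    `builds` is assumed to be the already-filtered list of build records
    for a single change.
    """
    failed = sum(
        1
        for build in builds
        if build.get("pipeline") == pipeline and build.get("result") == "FAILURE"
    )
    return failed > MAX_FAILED_GATE_BUILDS
```

Note that this counts individual builds, so one gate attempt with several failing jobs contributes more than once; grouping by buildset would be needed to count attempts instead.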
@gobi_g:matrix.org | Also, Is there a way to merge independent and dependent pipeline feature? | 20:17 |
Like 1,2,3 MRs and there is a 4th MR it will cause merge conflict with 3rd MR. | ||
In that time | ||
I'm expecting | ||
1 | ||
1+2 | ||
1+2+3 | ||
1+2+3+4 (not possible bcz of merge-conflict) | ||
So, | ||
1+2+4 | ||
If all the MRs were passed, | ||
Merge 1,2,3 | ||
Instead of merging the 4th, add a comment saying the job passed, rebase the change and merge it manually.
Because in my case my 4th is built on top of the 3rd. So, the 4th will contain the 3rd's changes.
@jim:acmegating.com | as best as i can tell, the generate_manifest failure is related to executing on a remote node and the sys.stdin check that we do is failing | 20:32 |
@jim:acmegating.com | so maybe under ansible 5, stdin appears to be a tty on remote nodes, but that's fixed in ansible 6 | 20:33 |
@jim:acmegating.com | maybe that has something to do with pipelining | 20:33 |
@iwienand:matrix.org | in the recent issue corvus raised on locking around output wasn't there discussion on removing stdin from plugins? is that related? | 20:43 |
@jim:acmegating.com | ianw: i don't think so because here we're talking about a module run on the other end of the ssh connection... but anything is possible in the ansible source code! :) | 20:43 |
@iwienand:matrix.org | https://github.com/ansible/ansible/pull/78679#issuecomment-1233434703 was the comment i was thinking of, for posterity in logs :) | 20:44 |
@jim:acmegating.com | (basically, ansible modules are always going to have a stdin, because ansible sends the input parameters over that. it just seems that in ansible 5, that file descriptor is a tty, but is not in other versions. so probably some ssh-specific options happening) | 20:44 |
@iwienand:matrix.org | yeah, i guess that is saying *not inherited* | 20:45 |
-@gerrit:opendev.org- James E. Blair https://matrix.to/#/@jim:acmegating.com proposed: [zuul/zuul-jobs] 857948: Remove shebang from all python ansible modules https://review.opendev.org/c/zuul/zuul-jobs/+/857948 | 20:49 | |
@jim:acmegating.com | zuul-maint: ^ i believe that should address the issue with testing generate_manifest. the issue only appears when a module is used on a remote node under ansible 5 (which is why the issue did not appear in production, only testing, since in production it is delegated to localhost). | 20:51 |
@iwienand:matrix.org | corvus: could we add a simple-ish linter to look at the first line of library/.*.py and make sure it doesn't start with #!? that would probably be enough to stop it coming back? | 20:55 |
@iwienand:matrix.org | btw zuul_azure_storage_upload still has https://review.opendev.org/c/zuul/zuul-jobs/+/852932 as i think that has a typo | 20:57 |
@jim:acmegating.com | ianw: i think that would be a good addition and i would welcome someone other than me contributing that :) | 21:00 |
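A rough sketch of the linter idea above, assuming modules live in directories named `library` and that only the first line matters. This is not the implementation from change 857981; the function name `find_shebang_modules` and the sample file names are hypothetical.

```python
import pathlib
import tempfile


def find_shebang_modules(root):
    """Return sorted paths of library/*.py files whose first line starts with '#!'."""
    offenders = []
    for path in pathlib.Path(root).rglob("*.py"):
        # Only consider files that sit directly inside a "library" directory.
        if path.parent.name != "library":
            continue
        with open(path, "rb") as handle:
            if handle.readline().startswith(b"#!"):
                offenders.append(path)
    return sorted(offenders)


# Small self-contained demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    lib = pathlib.Path(tmp) / "roles" / "example" / "library"
    lib.mkdir(parents=True)
    (lib / "bad_module.py").write_text("#!/usr/bin/python\nprint('x')\n")
    (lib / "good_module.py").write_text("# no shebang here\nprint('y')\n")
    demo_offenders = [p.name for p in find_shebang_modules(tmp)]
    print(demo_offenders)
```

A linter job could fail whenever the returned list is non-empty, which is probably enough to stop shebangs from creeping back in.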
@clarkb:matrix.org | Just getting back and will review that fix momentarily | 21:03 |
@jim:acmegating.com | the ansible docs apparently want a shebang in the module: https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_documenting.html#python-shebang-utf-8-coding | 21:05 |
@jim:acmegating.com | i'm not sure that should change our plan here since we found something that apparently works, but it is worth noting it appears contrary to doc recommendations | 21:06 |
@jim:acmegating.com | maybe that means the "activate the virtualenv and use a /usr/bin/env" solution would be more in-line with the docs... maybe we take a look at doing that in the future? | 21:08 |
@clarkb:matrix.org | corvus: does that imply "running code in a virtualenv is never supported" now? | 21:09 |
@clarkb:matrix.org | maybe if we explicitly set the interpreter value that would be honored instead? | 21:09 |
@clarkb:matrix.org | I also wonder if that doc entry is just really old and the expectations have changed since this seems to currently operate contrary to ansible_python_interpreter | 21:10 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 857981: linters: lint that library files don't start with #! https://review.opendev.org/c/zuul/zuul-jobs/+/857981 | 21:10 | |
@jim:acmegating.com | Clark: explicitly set what interpreter value? | 21:11 |
@clarkb:matrix.org | corvus: I think ansible_python_interpreter | 21:11 |
@jim:acmegating.com | i mean i don't understand the suggestion | 21:12 |
@jim:acmegating.com | set ansible_python_interpreter for what node and where? | 21:12 |
-@gerrit:opendev.org- Ian Wienand proposed: [zuul/zuul-jobs] 857981: linters: lint that library files don't start with #! https://review.opendev.org/c/zuul/zuul-jobs/+/857981 | 21:12 | |
@clarkb:matrix.org | corvus: basically it seems that ansible is throwing out whatever it knows about the remote node and replacing it with what is in the shebang. If we set ansible_python_interpreter explicitly would that override the throwing out behavior | 21:12 |
@jim:acmegating.com | Clark: maybe, but is there a way to do that for the implicit localhost without adding an explicit localhost to the inventory? | 21:13 |
@clarkb:matrix.org | Because I'm not sure how upstream expects this to ever work if they just blindly replace executables with what is in the shebang rather than using what is known about the runtime | 21:13 |
@clarkb:matrix.org | I think you can set the value on a per task level using vars: | 21:14 |
@clarkb:matrix.org | Clarifying with them seems like a good idea at least | 21:14 |
@jim:acmegating.com | Clark: i don't think our zuul-jobs roles can know the correct path to the ansible interpreter to set that for delegated-to-localhost tasks | 21:15 |
@clarkb:matrix.org | yes, I think that is correct. Which is why this behavior feels very off to me. It's like Ansible doesn't think dynamic execution environments should be supported any longer | 21:15 |
@clarkb:matrix.org | Basically what ansible is doing seems to have some intention behind it, but I can't understand it at all | 21:16 |
@jim:acmegating.com | i'm not getting "we don't support venv" vibes. afaict, if you run ansible in an activated venv then delegated-to-localhost tasks that have a /usr/bin/env python3 shebang will use the venv python because they now honor the shebang. | 21:16 |
@iwienand:matrix.org | > <@clarkb:matrix.org> Basically what ansible is doing seems to have some intention behind it, but I can't understand it at all | 21:17 |
yeah, it's the type of thing a decent changelog entry might be able to clear up, isn't it :/ | ||
@clarkb:matrix.org | corvus: but it's never been required to activate a venv to run inside of one | 21:17 |
@clarkb:matrix.org | but also their docs don't suggest using env either | 21:18 |
@clarkb:matrix.org | they say you should use an explicit path to a python | 21:18 |
@fungicide:matrix.org | i almost never activate the venvs i use, i just call entrypoints inside them directly or via symlink | 21:18 |
@jim:acmegating.com | Clark: yep. i suspect either this case hasn't been considered, or if it has, it may have only been thought about inasmuch as "yes i checked it works in a venv" and maybe no one tried the "invoked via venv-specific binary path" approach. | 21:18 |
@jim:acmegating.com | but i could almost (*almost*) see "delegated task to localhost uses the host python" as an intentional feature too (even though it's an anti-feature for us). so yeah... as ianw said.... :) | 21:19 |
@clarkb:matrix.org | I've +2'd the shebang removal. I think we can likely do that safely and followup with ansible to better understand what they expect out of modules and adjust from there if necessary | 21:19 |
@jim:acmegating.com | cool, yeah, i think that's still my preference. basically, i'm not wedded to this if we find out something better we should really be doing | 21:20 |
@clarkb:matrix.org | There is evidence that they didn't want to do that in the past though https://github.com/ansible/ansible/issues/13773#issuecomment-170071068 https://github.com/ansible/ansible/issues/16724#issuecomment-269994368 | 21:21 |
@jim:acmegating.com | i approved the change -- the one job that failed last time passed this time, so i don't think we need to wait for check jobs any longer. it's in gate now. | 21:21 |
@jim:acmegating.com | Clark: that's a good find that they consider "exec using ansible's interpreter" as the correct behavior for localhost. makes me wonder even more about their thought regarding this change. | 21:23 |
@clarkb:matrix.org | > <@gobi_g:matrix.org> ------------------ | 21:23 |
> Also, Is there a way to merge independent and dependent pipeline feature? | ||
> | ||
> Like 1,2,3 MRs and there is a 4th MR it will cause merge conflict with 3rd MR. | ||
> In that time | ||
> I'm expecting | ||
> 1 | ||
> 1+2 | ||
> 1+2+3 | ||
> 1+2+3+4 (not possible bcz of merge-conflict) | ||
> So, | ||
> 1+2+4 | ||
> | ||
> If all the MRs were passed, | ||
> Merge 1,2,3 | ||
> Instead merge 4th add comment as job passed rebase the change and merge it manually. | ||
> | ||
> Because in my case my 4th built on top of 3rd. So, 4th will contain 3rd changes. | ||
You should use the git tree to resolve conflicts and zuul will enqueue them in the correct order. | ||
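The ordering described in the question can be sketched as a toy model. This only illustrates what the question expects; it is not how Zuul builds its speculative states, and the function name `speculative_states` and the pairwise `conflicts` set are hypothetical.

```python
def speculative_states(changes, conflicts):
    """For each change, list the earlier changes it can sit on top of, plus itself.

    changes: ordered list of change ids
    conflicts: set of frozenset pairs that cannot be merged together
    """
    states = []
    for index, change in enumerate(changes):
        # Keep only the earlier changes that do not conflict with this one.
        ahead = [
            earlier
            for earlier in changes[:index]
            if frozenset((earlier, change)) not in conflicts
        ]
        states.append(ahead + [change])
    return states


# Change 4 conflicts with change 3, as in the question.
states = speculative_states([1, 2, 3, 4], {frozenset((3, 4))})
for state in states:
    print("+".join(str(change) for change in state))
```

This prints `1`, `1+2`, `1+2+3` and `1+2+4`, matching the sequence in the question. Expressing the dependency in the git tree instead makes the conflict between 3 and 4 something the author resolves before the changes are enqueued.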
@gobi_g:matrix.org | Hi, | 21:23 |
Is there any way to cancel the running job/pipeline? | ||
@clarkb:matrix.org | > <@gobi_g:matrix.org> Hi, | 21:23 |
> Is there any way to cancel the running job/pipeline? | ||
There is a dequeue command/api that will remove an entire buildset. | ||
@gobi_g:matrix.org | > <@clarkb:matrix.org> You should use the git tree to resolve conflicts and zuul will enqueue them in the correct order. | 21:24 |
Are there any references available for this? | ||
@jim:acmegating.com | karthi: have you checked the docs for zuul-client ? | 21:24 |
@gobi_g:matrix.org | > <@clarkb:matrix.org> There is a dequeue command/api that will remove an entire buildset. | 21:25 |
I know the command. Could you please share the API reference? | ||
@clarkb:matrix.org | I'm not sure there is an api reference. The zuul-client command implements it though | 21:25 |
@jim:acmegating.com | karthi: https://zuul-ci.org/docs/zuul-client/commands.html#dequeue | 21:25 |
@jim:acmegating.com | there's also a button on the web page if you're authenticated. | 21:27 |
@clarkb:matrix.org | > <@clarkb:matrix.org> You should use the git tree to resolve conflicts and zuul will enqueue them in the correct order. | 21:27 |
It has just occurred to me that you are using gitlab merge requests which may not allow for ordering between merge requests. With Gerrit we order the changes using the git tree and avoid this problem entirely. For other systems you may need to do something else | ||
@gobi_g:matrix.org | > <@clarkb:matrix.org> It has just occurred to me that you are using gitlab merge requests which may not allow for ordering between merge requests. With Gerrit we order the changes using the git tree and avoid this problem entirely. For other systems you may need to do something else | 21:29 |
Oh okay. So no options for other SCM systems. | ||
@gobi_g:matrix.org | > <@jim:acmegating.com> there's also a button on the web page if you're authenticated. | 21:31 |
I can't find any options. Do we have authentication in zuul? | ||
You mean username and password? | ||
@clarkb:matrix.org | > <@gobi_g:matrix.org> Oh okay. So no options for other SCM systems. | 21:31 |
Well you could do something like put everything into a super merge request (not great for reviewing), or have people coordinate their changes and know to rebase when other changes merge. I'm sure there are other tactics too, but I don't deal with tools that force me to figure it out so I'm not the best person to ask | ||
@jim:acmegating.com | > <@gobi_g:matrix.org> I can't find any options. Do we have authentication in zuul? | 21:35 |
> You mean username and password? | ||
yes, there is authentication in zuul. it's described in the manual here https://zuul-ci.org/docs/zuul/latest/authentication.html | ||
@gobi_g:matrix.org | > <@gobi_g:matrix.org> I can't find any options. Do we have authentication in zuul? | 21:36 |
> You mean username and password? | ||
If possible, could you share what the cancel button looks like? | ||
-@gerrit:opendev.org- Zuul merged on behalf of James E. Blair https://matrix.to/#/@jim:acmegating.com: [zuul/zuul-jobs] 857948: Remove shebang from all python ansible modules https://review.opendev.org/c/zuul/zuul-jobs/+/857948 | 21:39 | |
@clarkb:matrix.org | Now we can recheck your change? | 21:40 |
@clarkb:matrix.org | https://review.opendev.org/c/zuul/zuul/+/857916 that one? | 21:40 |
@jim:acmegating.com | Clark: yep done | 21:40 |
@gobi_g:matrix.org | > <@jim:acmegating.com> sent an image. | 21:41 |
Wow. Nice. | ||
@clarkb:matrix.org | tox-linters on the recheck passed with ansible 6 | 21:48 |
@clarkb:matrix.org | I see log files too so the upload happened | 21:49 |
@clarkb:matrix.org | still waiting on ansible 5 jobs to complete in the zuul and openstack tenants | 21:49 |
@jim:acmegating.com | the docs job on my test change is ansible 5, so that's a good check of the status quo | 21:51 |
@clarkb:matrix.org | yup and there is a cinder change in the openstack gate that went in after the zuul-jobs update so all those jobs should be ansible 5 too | 21:52 |
@clarkb:matrix.org | the zuul docs job is done and I can see the docs in the log dir and browse them normally | 21:52 |
@clarkb:matrix.org | fingers crossed but everything is looking happy so far | 21:52 |
@gobi_g:matrix.org | https://zuul-ci.org/docs/zuul/latest/configuration.html?highlight=realm#attr-auth%20%3Cauthenticator%20name%3E.realm | 22:03 |
What is the realm? | ||
@clarkb:matrix.org | > <@gobi_g:matrix.org> https://zuul-ci.org/docs/zuul/latest/configuration.html?highlight=realm#attr-auth%20%3Cauthenticator%20name%3E.realm | 22:14 |
> | ||
> What is the realm? | ||
https://www.rfc-editor.org/rfc/rfc7235#section-2.2 it identifies the scope of protection. | ||
@jim:acmegating.com | zuul-maint: i'm going to push that tag now | 23:00 |
@clarkb:matrix.org | Sounds good | 23:04 |
@jim:acmegating.com | Clark: though before i do that... there are several zuul-build image post_failures... | 23:04 |
@jim:acmegating.com | that can't be related... but i don't understand it yet | 23:04 |
@clarkb:matrix.org | Looks like they timed out | 23:06 |
@clarkb:matrix.org | https://zuul.opendev.org/t/zuul/build/ecb46df665df4478b62a6dffb80eaad0/log/job-output.txt#9470-9471 does the intermediate registry need to be restarted again? | 23:07 |
@jim:acmegating.com | why did they post-fail then? it looks like the cleanup playbook is getting unreachable | 23:07 |
@clarkb:matrix.org | The timeout happened in post-run | 23:08 |
@jim:acmegating.com | oh, got it. thx | 23:08 |
@clarkb:matrix.org | I'm looking at https://zuul.opendev.org/t/zuul/stream/7d3c4dc486be42eebaf2ac94ead1c11f?logfile=console.log to see what cleanup does | 23:10 |
@jim:acmegating.com | okay, confirmed it's unrelated. will push tag now | 23:13 |
@clarkb:matrix.org | I confirm that the cleanup is getting permission denied publickey which is unexpected | 23:13 |
@clarkb:matrix.org | something in that job could be removing or changing auth details? | 23:14 |
@jim:acmegating.com | does cleanup run after remove temp ssh key? | 23:14 |
@clarkb:matrix.org | oh it may. And when it works it works because control persistence is still open | 23:15 |
@clarkb:matrix.org | in this case we're stalling for half an hour and the control persistence will timeout | 23:15 |
@jim:acmegating.com | yep, that sounds logical | 23:15 |
@clarkb:matrix.org | (I learned all about that control persistence staying open earlier today trying to determine if I was using a local connection or ssh) | 23:17 |
@jim:acmegating.com | yeah, timeout is about 1 min i think | 23:18 |
@clarkb:matrix.org | once the release is out we should clean up the glibc backport. ianw you have a change for that right? | 23:20 |
@clarkb:matrix.org | and we should be able to land the web ui updates? | 23:20 |
@clarkb:matrix.org | But we can worry about all of that tomorrow :) | 23:23 |
@jim:acmegating.com | also we can land the deprecation patches | 23:27 |
@jim:acmegating.com | Clark: you are needed on https://review.opendev.org/854556 | 23:27 |
@iwienand:matrix.org | > <@clarkb:matrix.org> once the release is out we should clean up the glibc backport. ianw you have a change for that right? | 23:33 |
https://review.opendev.org/c/zuul/zuul/+/854939 | ||
@clarkb:matrix.org | done and done | 23:39 |
@jim:acmegating.com | Clark: tobiash i'm going to carry over your +2s on https://review.opendev.org/856317 since i only added an extra test assertion | 23:45 |
@jim:acmegating.com | zuul-maint: https://review.opendev.org/855691 and https://review.opendev.org/857796 are ready and are the next steps for 7.0.0 | 23:47 |
@jim:acmegating.com | Clark: https://review.opendev.org/857981 would be good while it's fresh (shebang lint) | 23:48 |
@jim:acmegating.com | ianw: what's the story with https://review.opendev.org/856984 ? | 23:50 |
@iwienand:matrix.org | > <@jim:acmegating.com> ianw: what's the story with https://review.opendev.org/856984 ? | 23:52 |
i could not get the elements we have in the current "masthead" area to float right; it seemed to be a bug and i mentioned it @ https://github.com/patternfly/patternfly-react/issues/7960#issuecomment-1243197706 | ||
@jim:acmegating.com | ianw: also https://review.opendev.org/856217 ? | 23:52 |
@iwienand:matrix.org | i may try again after they release the update to not asynchronously load css and see if it works better. the <Page> documentation says that <Masthead> is the way to do it moving forward -- but I don't imagine it will become deprecated for a long time | 23:53 |
@iwienand:matrix.org | > <@jim:acmegating.com> ianw: also https://review.opendev.org/856217 ? | 23:55 |
I would like to get back to that, but it's not ready. The idea would be to test zuul_stream against a series of containers from 2.7-><insert current python> to ensure full coverage. it needs refactoring though so that we can do that and have each container with a different port. doable but some effort | ||
@jim:acmegating.com | okay, i thought we merged something related to that | 23:56 |
@jim:acmegating.com | was it just for 2.7? | 23:56 |
@iwienand:matrix.org | > <@iwienand:matrix.org> i could not get the elements we have in the current "masthead" area to float right; it seemed to be a bug and i mentioned it @ https://github.com/patternfly/patternfly-react/issues/7960#issuecomment-1243197706 | 23:56 |
what i would really like to try is moving those links in the header (components, api, tenant, etc) into a side nav-bar ... this seems a more common pattern | ||
@iwienand:matrix.org | corvus: yes, currently we test against a 2.7 container just as a lower bound. the upper-bound testing happens against focal/jammy nodes and whatever python they are running | 23:57 |
@jim:acmegating.com | ianw: ok. want to gerrit-wip those 2 for now? | 23:58 |
@clarkb:matrix.org | > <@jim:acmegating.com> Clark: https://review.opendev.org/857981 would be good while it's fresh (shebang lint) | 23:59 |
Approved |