opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/898150 | 03:07 |
---|---|---|
yoctozepto | could something have affected the opendev images? because it seems the rsync is gone: https://zuul.opendev.org/t/nebulous/build/9b45ae58c4be4583a7e5e98cd956ea1c | 08:01 |
yoctozepto | hmm, another job works | 08:03 |
yoctozepto | strange | 08:03 |
yoctozepto | ok, ignore it, my colleague has introduced the issue by modifying /etc/environment | 08:06 |
mrunge | is there someone who has an idea why an approved patch is not being picked up to get merged? https://review.opendev.org/c/openstack/python-observabilityclient/+/898092 | 08:22 |
mrunge | the repo is this one https://opendev.org/openstack/python-observabilityclient | 08:24 |
frickler | mrunge: because there is no CI job defined and thus no V+1 | 08:25 |
mrunge | thank you frickler . How would I get a CI job defined/merged then? | 08:25 |
mrunge | Would zuul then merge a job for zuul? | 08:25 |
frickler | mrunge: I think https://review.opendev.org/c/openstack/project-config/+/898098 needs to be finished and merged first | 08:27 |
mrunge | thank you for the pointers frickler! | 08:27 |
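(For context: a change only receives a Verified vote once the repo has at least one job attached in Zuul's configuration. A rough, illustrative sketch of the kind of project entry openstack/project-config carries for this; the template name and layout here are assumptions, not copied from change 898098:)

```yaml
# Illustrative sketch only; the actual change is
# https://review.opendev.org/c/openstack/project-config/+/898098
- project:
    name: openstack/python-observabilityclient
    templates:
      - noop-jobs   # attaches a no-op job so changes at least get a Verified vote
```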
opendevreview | Martin Magr proposed openstack/project-config master: Complete config for python-observabilityclient https://review.opendev.org/c/openstack/project-config/+/898098 | 08:44 |
opendevreview | Martin Magr proposed openstack/project-config master: Add observabilityclient to zuul https://review.opendev.org/c/openstack/project-config/+/898210 | 09:04 |
SvenKieske | what reason could there be for a build job to not provide any logs, at all? example: https://zuul.opendev.org/t/openstack/build/3f6816ba15ec489bb6479836b659e695 | 10:08 |
SvenKieske | mhm okay, POST_FAILURE seems to indicate the post playbook didn't run; I'd assume that's what collects the logs etc. | 10:10 |
frickler | POST_FAILURE says that the post playbook did run and that run resulted in a failure. one possible reason for that is that the log upload to the log storage failed | 11:35 |
frickler | if there are no logs for a build, that is a very likely reason; it could either be an issue in one of our storage providers or the logs being so huge that the upload fails with a timeout | 11:36 |
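(For anyone unfamiliar with the terms: the "post playbook" is whatever playbook a job attaches via post-run, which is normally where log collection and upload happen; if a task in that phase fails, the build is reported as POST_FAILURE. A minimal sketch, with hypothetical job and playbook names rather than opendev's actual base job:)

```yaml
# Hypothetical sketch of where a post playbook hangs off a job.
- job:
    name: example-base
    post-run: playbooks/base/post.yaml  # collects artifacts and uploads logs;
                                        # any failure here becomes POST_FAILURE
```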
opendevreview | Merged openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/898150 | 12:24 |
opendevreview | Merged openstack/project-config master: Add observabilityclient to zuul https://review.opendev.org/c/openstack/project-config/+/898210 | 12:31 |
opendevreview | Martin Magr proposed openstack/project-config master: Complete config for python-observabilityclient https://review.opendev.org/c/openstack/project-config/+/898098 | 12:41 |
SvenKieske | frickler: thanks for the information, very interesting | 12:48 |
fungi | SvenKieske: i hunted down the zuul executor responsible for that build, the failing task was this one: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/upload-logs-swift/tasks/main.yaml#L20-L33 | 12:58 |
fungi | unfortunately, as you can see there, it specifies "no_log: true" so the most that shows up in the service's debug log is: | 12:59 |
fungi | 2023-10-13 09:53:10,137 DEBUG zuul.AnsibleJob.output: [e: 570eeb6972454b8d8b8a8fa95d84b82a] [build: 3f6816ba15ec489bb6479836b659e695] Ansible output: b'fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that \'no_log: true\' was specified for this result", "changed": false}' | 12:59 |
SvenKieske | fungi: has this one "no_log" because of credential leaking or does it spam? | 12:59 |
fungi | SvenKieske: to avoid accidentally leaking swift api credentials into the log, yes | 12:59 |
SvenKieske | ah I see: https://opendev.org/zuul/zuul-jobs/commit/622baa65bfa02a592024a867c5bd94a8a4567339 | 13:00 |
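(For illustration, a task guarded by no_log looks roughly like this; a simplified, hypothetical sketch, not the actual upload-logs-swift task, with made-up module and variable names:)

```yaml
# Hypothetical sketch: with no_log set, Ansible replaces the task result,
# including any failure details, with the "censored" stub quoted above.
- name: Upload logs to swift
  example_swift_upload:                  # placeholder module name
    cloud: "{{ example_cloud_config }}"  # credentials would otherwise leak via the result
    container: logs
  no_log: true
```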
fungi | 2023-10-13 09:52:28,420 DEBUG zuul.AnsibleJob.output: [e: 570eeb6972454b8d8b8a8fa95d84b82a] [build: 3f6816ba15ec489bb6479836b659e695] Ansible output: b'ok: [localhost] => (item=rax_iad) => {"ansible_facts": {"_swift_provider_name": "rax_iad"}, "ansible_loop_var": "opendev_base_item", "changed": false, "opendev_base_item": "rax_iad"}' | 13:02 |
fungi | so it was trying to upload to the swift api endpoint in rackspace's iad region | 13:02 |
fungi | https://rackspace.service-now.com/system_status?id=child_service_status&service=7239ceb0db6cf200e93ff2e9af961916 | 13:04 |
fungi | no indication of any reported outages or maintenance there | 13:04 |
fungi | might have just been a fluke, but if we see a lot of these we can try to correlate by the provider/region to which we tried to upload, or by the executor which was attempting the upload, and see if commonalities point to an ongoing problem | 13:05 |
SvenKieske | the oslo people already retried the job; I was just curious whether this is expected/"normal" behaviour, as I had never seen a case with no logs at all | 13:19 |
SvenKieske | now I have a circular dependency, seems to become an interesting end of the week :D | 13:19 |
fungi | SvenKieske: yes, sometimes log uploads fail. we've talked about splitting the uploader into batches and uploading the console log and build manifest separately from other collected job logs in order to increase the chances that you at least get those | 13:21 |
fungi | or redesigning the uploader to fall back to another provider in the event of an upload failure, or upload logs redundantly to multiple providers | 13:22 |
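(Both of those are ideas that have been discussed rather than current behaviour. As a rough illustration of the fallback idea only, an Ansible block/rescue around the upload could retry against a second provider; the variable names below are assumptions:)

```yaml
# Rough illustration of "fall back to another provider" using block/rescue;
# not how zuul-jobs actually behaves today. Variable names are assumptions.
- name: Upload logs, falling back to a secondary provider on failure
  block:
    - include_role:
        name: upload-logs-swift
      vars:
        log_cloud_config: "{{ primary_log_cloud }}"
  rescue:
    - include_role:
        name: upload-logs-swift
      vars:
        log_cloud_config: "{{ secondary_log_cloud }}"
```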
SvenKieske | streaming the logs directly would be nice, but also quite challenging, I assume. | 13:22 |
fungi | well, it does stream the logs directly while the build is running | 13:22 |
fungi | we've made so much of zuul redundantly fault-tolerant at this point that the remaining places like log uploading become more obvious in their fragility | 13:24 |
opendevreview | Merged openstack/project-config master: Complete config for python-observabilityclient https://review.opendev.org/c/openstack/project-config/+/898098 | 13:32 |
SvenKieske | sure, there's always another bottleneck :) | 13:32 |
SvenKieske | do you happen to know how I can deal with cross-repo circular dependencies? the feature in zuul doesn't seem to be enabled, afaik? (I don't know if I've understood the docs just yet tbh) | 13:33 |
SvenKieske | mhm, it's not really clear to me from the docs if I strictly need this feature? https://zuul-ci.org/docs/zuul/latest/developer/specs/circular-dependencies.html | 13:35 |
fungi | SvenKieske: the openstack tenant is configured to disallow circular dependencies, to ensure that commits merge across all of openstack in a predictable linear order | 13:35 |
SvenKieske | mhm, so how do I fix that? I can't be the first person with this problem? :D | 13:36 |
fungi | circular dependency relationships are risky since zuul doesn't have direct control over whether commits actually merge (it asks gerrit to merge them once testing completes). as such, there's a potential problem if zuul asks gerrit to merge two changes at the same time and it merges the first but refuses to merge the second | 13:37 |
fungi | but also nothing can truly merge simultaneously, so there's always a window of time, no matter how brief, where one of the changes in that group merges before the other | 13:37 |
SvenKieske | adding only one "Depends-On" in change "a" in repo "a" shouldn't help, because change "b" in repo "b" - which is the dependency - can't be merged: the pipeline can't go green because it needs "a" as well. | 13:38 |
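(The "Depends-On" being discussed is a footer in the Gerrit commit message; for example, referencing one of the changes that comes up later in this conversation:)

```
Depends-On: https://review.opendev.org/c/openstack/kolla/+/894948
```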
fungi | and in that short window (which can become a long window in unusual circumstances), the repositories are mutually broken | 13:38 |
SvenKieske | basically, how can I force temporarily broken code to merge? I guess via a core override? | 13:38 |
fungi | SvenKieske: the position openstack has taken is that you should arrange your commits so that any one commit results in a fully working state for all integrated projects. sometimes that means temporary backward compatibility, feature flags, or other similar sorts of design patterns | 13:39 |
SvenKieske | fungi: yeah, that's the nature of circular dependencies, after all. but how do I merge this stuff? I can't magically make the changes not depend on each other; they really do, and that can't be circumvented :D | 13:40 |
fungi | and depending on the situation, you may even need to make sure backward compatibility is maintained up through the next coordinated release so that upgrades will still work (because the two projects aren't necessarily upgraded simultaneously) | 13:40 |
SvenKieske | okay, then I need to adapt the kolla-ansible change to detect the fluentd version and change paths and users accordingly; that might work | 13:40 |
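(A very rough sketch of that kind of version-based switching in Ansible; this is not the actual kolla-ansible change, and the paths, users, and detection method are all made up for illustration:)

```yaml
# Hypothetical sketch: pick config paths and the service user based on
# whether the new fluentd binary is present; not the real implementation.
- name: Check which fluentd flavour is available
  command: fluentd --version
  register: fluentd_version_result
  failed_when: false
  changed_when: false

- name: Choose paths and user accordingly
  set_fact:
    fluentd_conf_dir: "{{ '/etc/fluent' if fluentd_version_result.rc == 0 else '/etc/td-agent' }}"
    fluentd_user: "{{ 'fluentd' if fluentd_version_result.rc == 0 else 'td-agent' }}"
```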
fungi | it generally results in more robust (if also more complex) systems, because a circular dependency implies not only simultaneous merging but also simultaneous releasing, simultaneous upgrading, and so on | 13:42 |
SvenKieske | sometimes I want a monorepo :D | 13:43 |
fungi | if changes between projects become that lock-step, then they're one project split between multiple repositories that should instead probably be in the same repository | 13:43 |
SvenKieske | well yeah, this is about these two: https://review.opendev.org/c/openstack/kolla/+/894948 https://review.opendev.org/c/openstack/kolla-ansible/+/898227 | 13:44 |
fungi | openstack's multi-repo approach is, as much as anything, about decoupling changes between different components of a system | 13:44 |
SvenKieske | basically in order to use the new software the configuration for said software needs to be adapted. I don't know why this is not done in the container layer itself, but there must be a good reason I guess. | 13:44 |
fungi | infra-root: i've stopped and disabled exim on the old lists.openstack.org server. apparently some of the mailservers for the provider that hosts red hat's corporate e-mail don't obey dns ttls or otherwise have broken resolution caching, and sent two messages to the old server today, where they were accepted into a black hole | 14:52 |
Clark[m] | fungi: thanks, that makes sense to me. We could also shut down the instance if we want, but kernel stuff probably makes that less desirable | 17:13 |
fungi | right, i'd rather have it reachable for a little while still in case we need anything off of it | 17:51 |