Friday, 2023-10-13

<opendevreview> OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml
<yoctozepto> could something have affected the opendev images? because it seems the rsync is gone:
<yoctozepto> hmm, another job works  [08:03]
<yoctozepto> ok, ignore it; my colleague introduced the issue by modifying /etc/environment  [08:06]
<mrunge> is there someone who has an idea why an approved patch is not picked up to get merged?
<mrunge> the repo is this one
<frickler> mrunge: because there is no CI job defined and thus no V+1  [08:25]
<mrunge> thank you, frickler. How would I get a CI job defined/merged then?  [08:25]
<mrunge> Would zuul merge a job for zuul then?  [08:25]
<frickler> mrunge: I think needs to be finished and merged first  [08:27]
<mrunge> thank you for the pointers, frickler!  [08:27]
<opendevreview> Martin Magr proposed openstack/project-config master: Complete config for python-observabilityclient
<opendevreview> Martin Magr proposed openstack/project-config master: Add observabilityclient to zuul
<SvenKieske> what reason could there be for a build job to not provide any logs at all? example:
<SvenKieske> mhm, okay, POST_FAILURE seems to indicate the post playbook didn't run; I'd assume that this collects the logs etc.  [10:10]
<frickler> POST_FAILURE says that the post playbook did run and that the run resulted in a failure. one possible reason for that is that the log upload to the log storage failed  [11:35]
<frickler> if there are no logs for a build, this is a very likely reason: it could either be an issue in one of our storage providers, or the logs being too huge and the upload therefore failing with a timeout  [11:36]
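For context on frickler's point: in Zuul's job model, log collection and upload happen in a post-run playbook, which executes after the main run even when it fails; a failure there produces the POST_FAILURE result. A minimal sketch, with hypothetical job and playbook names:

```yaml
# Sketch of a Zuul job definition (hypothetical names, not the actual
# opendev config). The post-run playbook gathers and uploads logs;
# if it fails, the build is reported as POST_FAILURE.
- job:
    name: example-job
    run: playbooks/run.yaml
    post-run: playbooks/post.yaml
```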
<opendevreview> Merged openstack/project-config master: Normalize projects.yaml
<opendevreview> Merged openstack/project-config master: Add observabilityclient to zuul
<opendevreview> Martin Magr proposed openstack/project-config master: Complete config for python-observabilityclient
<SvenKieske> frickler: thanks for the information, very interesting  [12:48]
<fungi> SvenKieske: i hunted down the zuul executor responsible for that build; the failing task was this one:
<fungi> unfortunately, as you can see there, it specifies "no_log: true", so the most that shows up in the service's debug log is:  [12:59]
<fungi> 2023-10-13 09:53:10,137 DEBUG zuul.AnsibleJob.output: [e: 570eeb6972454b8d8b8a8fa95d84b82a] [build: 3f6816ba15ec489bb6479836b659e695] Ansible output: b'fatal: [localhost]: FAILED! => {"censored": "the output has been hidden due to the fact that \'no_log: true\' was specified for this result", "changed": false}'  [12:59]
<SvenKieske> fungi: does this one have "no_log" because of credential leaking, or because it spams?  [12:59]
<fungi> SvenKieske: to avoid accidentally leaking swift api credentials into the log, yes  [12:59]
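As a hypothetical illustration of what fungi describes (this is not the actual opendev upload role), an Ansible task that handles credentials is typically marked like this:

```yaml
# Hypothetical sketch: "no_log: true" keeps Ansible from printing the
# task's result, so secrets in its parameters or output never reach the
# debug log. On failure, the result is replaced by a "censored" message,
# as seen in the executor log above.
- name: Upload build logs to swift
  command: upload-logs --container "{{ log_container }}"  # hypothetical command
  no_log: true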
<SvenKieske> ah, I see:
<fungi> 2023-10-13 09:52:28,420 DEBUG zuul.AnsibleJob.output: [e: 570eeb6972454b8d8b8a8fa95d84b82a] [build: 3f6816ba15ec489bb6479836b659e695] Ansible output: b'ok: [localhost] => (item=rax_iad) => {"ansible_facts": {"_swift_provider_name": "rax_iad"}, "ansible_loop_var": "opendev_base_item", "changed": false, "opendev_base_item": "rax_iad"}'  [13:02]
<fungi> so it was trying to upload to the swift api endpoint in rackspace's iad region  [13:02]
<fungi> no indication of any reported outages or maintenance there  [13:04]
<fungi> might have just been a fluke, but if we see a lot of these we can try to correlate by the provider/region to which we tried to upload, or by the executor which was attempting the upload, and see if commonalities point to an ongoing problem  [13:05]
<SvenKieske> the oslo people already retried the job; I was just curious whether this is expected/"normal" behaviour, as I had never seen the case of no logs at all  [13:19]
<SvenKieske> now I have a circular dependency; seems to become an interesting end of the week :D  [13:19]
<fungi> SvenKieske: yes, sometimes log uploads fail. we've talked about splitting the uploader into batches and uploading the console log and build manifest separately from other collected job logs, in order to increase the chances that you at least get those  [13:21]
<fungi> or redesigning the uploader to fall back to another provider in the event of an upload failure, or to upload logs redundantly to multiple providers  [13:22]
<SvenKieske> streaming the logs directly would be nice, but also quite challenging, I assume.  [13:22]
<fungi> well, it does stream the logs directly while the build is running  [13:22]
<fungi> we've made so much of zuul redundantly fault-tolerant at this point that the remaining places like log uploading become more obvious in their fragility  [13:24]
<opendevreview> Merged openstack/project-config master: Complete config for python-observabilityclient
<SvenKieske> sure, there's always another bottleneck :)  [13:32]
<SvenKieske> do you happen to know how I can deal with cross-repo circular dependencies? the feature in zuul seems not to be activated, afaik? (I don't know if I understood the docs just yet, tbh)  [13:33]
<SvenKieske> mhm, it's not really clear to me from the docs whether I strictly need this feature?
<fungi> SvenKieske: the openstack tenant is configured to disallow circular dependencies, to ensure that commits merge across all of openstack in a predictable linear order  [13:35]
<SvenKieske> mhm, so how do I fix that? I can't be the first person with this problem? :D  [13:36]
<fungi> circular dependency relationships are risky since zuul doesn't have direct control over whether commits actually merge (it asks gerrit to merge them once testing completes). as such, there's a potential problem if zuul asks gerrit to merge two changes at the same time and it merges the first but refuses to merge the second  [13:37]
<fungi> but also nothing can truly merge simultaneously, so there's always a window of time, no matter how brief, where one of the changes in that group merges before the other  [13:37]
<SvenKieske> adding only one "Depends-On" in change "a" in repo "a" shouldn't help, because change "b" in repo "b" - which is the dependency - can't be merged, because its pipeline can't become green, because it needs "a" as well.  [13:38]
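For reference, the "Depends-On" SvenKieske mentions is a footer in the Gerrit commit message that Zuul uses to discover cross-repo dependencies. A sketch of such a commit message, with the change URL and Change-Id left as placeholders:

```text
Fix fluentd paths for new image layout

This change only works together with the matching kolla change.

Depends-On: https://review.opendev.org/c/<project>/+/<change-number>
Change-Id: I<hash>
```

Making two changes each Depends-On the other is exactly the circular case the openstack tenant disallows.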
<fungi> and in that short window (which can become a long window in unusual circumstances), the repositories are mutually broken  [13:38]
<SvenKieske> basically, how can I force-merge temporarily broken code? I guess via a core override?  [13:38]
<fungi> SvenKieske: the position openstack has taken is that you should arrange your commits so that any one commit results in a fully working state for all integrated projects. sometimes that means temporary backward compatibility, feature flags, or other similar sorts of design patterns  [13:39]
<SvenKieske> fungi: yeah, that's the nature of circular dependencies, after all. but how do I merge this stuff? I can't make the changes magically not depend on each other; the dependency is real and can't be circumvented :D  [13:40]
<fungi> and depending on the situation, you may even need to make sure backward compatibility is maintained up through the next coordinated release so that upgrades will still work (because the two projects aren't necessarily upgraded simultaneously)  [13:40]
<SvenKieske> okay, then I need to adapt the kolla-ansible change to detect the fluentd version and change paths and users accordingly; that might work  [13:40]
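A rough sketch of the kind of version-based compatibility switch SvenKieske describes, assuming hypothetical variable names (this is not the actual kolla-ansible change):

```yaml
# Hypothetical Ansible sketch: detect the installed fluentd major version
# and pick the matching paths/user, so one change works with both images.
- name: Detect fluentd version
  command: fluentd --version
  register: fluentd_version_result
  changed_when: false

- name: Select version-dependent settings
  set_fact:
    # assumption: the newer image runs as "fluentd", the older as "td-agent"
    fluentd_user: "{{ 'fluentd' if fluentd_version_result.stdout is search('v1\\.') else 'td-agent' }}"
```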
<fungi> it generally results in more robust (if also more complex) systems, because a circular dependency implies not only simultaneous merging but also simultaneous releasing, simultaneous upgrading, and so on  [13:42]
<SvenKieske> sometimes I want a monorepo :D  [13:43]
<fungi> if changes between projects become that lock-step, then they're one project split between multiple repositories, and should instead probably be in the same repository  [13:43]
<SvenKieske> well yeah, this is about these two:
<fungi> openstack's multi-repo approach is, as much as anything, about decoupling changes between different components of a system  [13:44]
<SvenKieske> basically, in order to use the new software, the configuration for said software needs to be adapted. I don't know why this is not done in the container layer itself, but there must be a good reason, I guess.  [13:44]
<fungi> infra-root: i've stopped and disabled exim on the old server. apparently some of the mailservers for the provider that hosts red hat's corporate e-mail don't obey dns ttls or have otherwise broken resolution caching, and sent two messages to the old server today, where they were accepted into a black hole  [14:52]
<Clark[m]> fungi: thanks, that makes sense to me. we could also shut down the instance if we want, but kernel stuff probably makes that less desirable  [17:13]
<fungi> right, i'd rather have it reachable for a little while still in case we need anything off of it  [17:51]
