| -@gerrit:opendev.org- Michal Nasiadka proposed: [opendev/system-config] 989567: Add backup03 https://review.opendev.org/c/opendev/system-config/+/989567 | 05:15 | |
| @mnasiadka:matrix.org | Clark: since follow: true is the default in ansible.builtin.file, Ansible will follow the link, and when the target is a directory it should be fine | 05:21 |
|---|---|---|
| -@gerrit:opendev.org- Martin Zobel proposed: [openstack/diskimage-builder] 989657: Add DIB_USE_RSYNC option to copy image contents with rsync https://review.opendev.org/c/openstack/diskimage-builder/+/989657 | 11:34 | |
| @clarkb:matrix.org | mnasiadka: got it. Is that documented somewhere? | 14:44 |
| @clarkb:matrix.org | probably not explicitly, just have to interpret what follow: true will do | 14:45 |
| @mnasiadka:matrix.org | Clark: Not really explicitly - there's https://docs.ansible.com/projects/ansible/latest/collections/ansible/builtin/file_module.html#parameter-follow | 14:45 |
| @clarkb:matrix.org | thanks | 14:45 |
| @clarkb:matrix.org | The changes lgtm now and I did some checks on the server too. Not sure if Friday is the best day to add a third backup target (I'm also trying to take advantage of some good weather today to get out on the bike if possible). But I'll let other reviewers decide if they want to proceed | 14:47 |
| @clarkb:matrix.org | moving the opendev system-config-run job debugging here | 15:03 |
| @clarkb:matrix.org | https://zuul.opendev.org/t/openstack/build/c70cb69401c943068b43223f9c607032 failed to lookup opendev.org to clone the git repos for dns zones | 15:04 |
| @clarkb:matrix.org | I can currently lookup opendev.org against both ns03 and ns04 over ipv4 (what the job env would have used) using dig | 15:04 |
| @clarkb:matrix.org | so whatever the issue is is either intermittent or maybe environment/route/network specific? | 15:04 |
| @fungicide:matrix.org | Clark: i'm about three problems out on other stuff at the moment so haven't paged that change back into brain context, but suspect that it could be something to do with what addresses the resolver is listening on. if memory serves we do some gymnastics to make bind and unbound coexist on adns* by binding to different addresses | 15:05 |
| @clarkb:matrix.org | and the problem could be between our test nodes and cloudflare/google resolvers or between cloudflare/google resolvers and ns03/ns04 | 15:06 |
| @clarkb:matrix.org | fungi: oh you think that maybe the local deployment of dns tooling is breaking our unbound resolver that should service things? I could see that being the case | 15:06 |
| @clarkb:matrix.org | maybe we should put a hold in place and see if we catch one and try to debug from there? I can do that | 15:06 |
| @fungicide:matrix.org | given that the change itself is altering what addresses we're using in which vars | 15:07 |
| @clarkb:matrix.org | ya though it's doing that at an ansible level, which I'm not sure affects unbound or bind/nsd | 15:07 |
| @fungicide:matrix.org | it may affect the configuration templates we deploy with ansible | 15:07 |
| @fungicide:matrix.org | but this is entirely conjecture on my part | 15:08 |
| @clarkb:matrix.org | yes that is possible. I'm skimming those now. The bind template seems to use ansible gathered facts so not the values in inventory | 15:08 |
| @fungicide:matrix.org | and yeah, seems like it must be cloud-specific behavior differences since it's passing in check, though also suspicious that we keep consistently passing in check and failing in gate, that part is probably mere coincidence | 15:09 |
| @fungicide:matrix.org | in theory we don't run the job any differently between the two pipelines | 15:10 |
| @clarkb:matrix.org | ok our normal test nodes use unbound in a forwarding setup. But then because these are system-config-run jobs we apply the prod setup which recurses? I wonder if we are racing unbound config updates and reloading/restarting and catching unbound at a time where it isn't functional | 15:10 |
| @clarkb:matrix.org | no looks like while we update the unbound config we don't force it to restart so it should continue running with the old config. Unless it automatically reloads its config later and we're racing that? | 15:13 |
| @clarkb:matrix.org | so maybe before holding a node we can shim in some dns lookups and unbound state debugging to better characterize (and maybe trip) this | 15:14 |
| @clarkb:matrix.org | I'll work on a change that does that | 15:14 |
| -@gerrit:opendev.org- Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org proposed: [openstack/project-config] 989781: Reset PuppetForge password https://review.opendev.org/c/openstack/project-config/+/989781 | 15:21 | |
| @clarkb:matrix.org | In other news etherpad survived the night. Upstream confirmed the bug and has a fix. We will probably have to upgrade to 3.1.1 or newer to get it though | 15:31 |
| -@gerrit:opendev.org- Clark Boylan proposed: [opendev/system-config] 989784: Add some dns lookup debugging to adns test server https://review.opendev.org/c/opendev/system-config/+/989784 | 15:33 | |
| @clarkb:matrix.org | something like that maybe | 15:33 |
| -@gerrit:opendev.org- Zuul merged on behalf of Jeremy Stanley https://matrix.to/#/@fungicide:matrix.org: [openstack/project-config] 989781: Reset PuppetForge password https://review.opendev.org/c/openstack/project-config/+/989781 | 15:35 | |
| @clarkb:matrix.org | feel free to suggest additional debugging items. I think my main questions are "is unbound running" and if so with what config. Then what do some explicit lookups look like | 15:35 |
| @fungicide:matrix.org | Clark: upstream claude came up with a fix? ;) | 15:35 |
| @clarkb:matrix.org | and take it from there | 15:36 |
| @clarkb:matrix.org | fungi: I haven't read the commit messages to see but I would expect so :) | 15:36 |
| @fungicide:matrix.org | `Co-Authored-By: Claude Opus 4.7 ...` yep! | 15:40 |
| @clarkb:matrix.org | the gitea 1.26.2 screenshots here: https://e427a35962f8c8fc85a5-28f35d2f5f941083fd95522e9f6bd028.ssl.cf1.rackcdn.com/openstack/890938b8e5014de38c98eff46c2d1af5/bridge99.opendev.org/screenshots/ look good to me. The change passed our testing checks too | 15:51 |
| -@gerrit:opendev.org- Sabbir Ahmed proposed: [openstack/project-config] 989446: Add starlingx/app-machine-operator project https://review.opendev.org/c/openstack/project-config/+/989446 | 16:34 | |
| @clarkb:matrix.org | ok I'm going to pop out for a bit now. There is a hill to climb before it gets too hot | 17:00 |
| @fungicide:matrix.org | good luck! | 17:09 |
| @clarkb:matrix.org | https://cdf674af691ebd2f77e5-7909900ed724dc9ef77c09a94566cfc0.ssl.cf1.rackcdn.com/openstack/446e5621d04a445185fd4b8d7965e251/bridge99.opendev.org/ara-report/results/373.html my debug dns change passed and the output in the debug script looks good (expected since there wasn't a failure). I guess I'll recheck | 20:21 |
| @clarkb:matrix.org | but that seems to confirm we are not restarting unbound. It is possible that we may wish to in order to ensure the config we expect is in use. But I think the test node config should be fine too | 20:22 |
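
The "explicit lookups" part of the debugging discussed above could be sketched roughly as follows. This is an illustration only, not the contents of change 989784: the function name `check_lookups` and its `resolve` parameter are hypothetical, and the sketch only exercises whatever resolver the host is configured to use (on a test node that would presumably be the local unbound). It does not check whether unbound is running or which config it has loaded.

```python
import socket
from typing import Callable, Dict, Iterable


def check_lookups(
    names: Iterable[str],
    resolve: Callable[..., list] = socket.getaddrinfo,
) -> Dict[str, str]:
    """Resolve each name via the system resolver and record the outcome.

    `resolve` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    results: Dict[str, str] = {}
    for name in names:
        try:
            # getaddrinfo returns 5-tuples; the last element is the sockaddr,
            # whose first field is the resolved address.
            infos = resolve(name, 443, proto=socket.IPPROTO_TCP)
            addresses = sorted({info[4][0] for info in infos})
            results[name] = "ok: " + ", ".join(addresses)
        except socket.gaierror as exc:
            # Record the failure instead of aborting, so every name is tried.
            results[name] = f"FAILED: {exc}"
    return results


if __name__ == "__main__":
    for name, outcome in check_lookups(["opendev.org"]).items():
        print(f"{name}: {outcome}")
```

Running something like this before and after the config update in the job might help show whether a failure is intermittent or tied to a particular moment in the run, since each lookup failure is recorded rather than ending the script.
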