| hemanth | Hey I am still seeing RETRY_LIMIT issues in some CI builds, is it a known issue? | 01:25 |
|---|---|---|
| hemanth | 2025-10-15 01:09:21.360122 \| ubuntu-jammy \| [Zuul] Lost log stream connection to [23.253.164.120:19885]... (full message at <https://matrix.org/oftc/media/v1/media/download/AYerL9bDzIy-IJ_YNCBYeafjtEnit820ds8s13zB2YxrEl3OeR7y9UQDM4huTqlqLKT6g4DksCf-_z3L0LZVHJVCeaLZGYqQAG1hdHJpeC5vcmcvbkFTTGFaQlNuS3RYQ1VRU2ZrVklScmJr>) | 01:25 |
| hemanth | Reference build: https://zuul.opendev.org/t/openstack/build/7beb2f59c6de42e5aee90f3452c5ead3 | 01:26 |
| fungi | hemanth: "still seeing" since... when did it start? | 01:27 |
| fungi | "Lost log stream connection" typically implies that the test node crashed or rebooted while running the job | 01:28 |
| hemanth | It started on Monday. ykarel reported it in this channel, but with some other serves (raxflex-DFW3) | 01:28 |
| fungi | or there was some network disconnect between the executor and test node | 01:28 |
| hemanth | s/serves/servers/ | 01:28 |
| fungi | or that the lg streaming process was killed on the test node | 01:29 |
| tonyb | hemanth: I can look, the issue ykarel brought up was resolved | 01:29 |
| fungi | log streaming | 01:29 |
| hemanth | yeah the issue ykarel brought up was resolved .. | 01:30 |
| fungi | apologies for lag and typos, airplane wifi | 01:30 |
| ianychoi | fungi: thank you for approving 962454 - to enable translation on 2025.2 . Safe flight! | 01:32 |
| fungi | thanks! | 01:33 |
| tonyb | hemanth: `2025-10-15 01:02:20.333571 \| TASK [charm-build : reset ssh connection to apply permissions from new group]` makes me a little nervous, do you know if that was recently added to the build jobs? | 01:34 |
| hemanth | no, it wasn't added recently | 01:35 |
| tonyb | okay | 01:36 |
| tonyb | hemanth: I can't see anything in the logs that's helpful. I'll check with the cloud provider | 02:02 |
| hemanth | ack.. note it is not always happening.. | 02:03 |
| tonyb | Ah okay, that's good to note | 02:04 |
| opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/962557 | 02:15 |
| * tonyb takes a break to get some air | 02:40 | |
| tonyb | hemanth: I'll keep looking when I'm back for any kind of pattern | 02:40 |
| clarkb | hemanth: tonyb rax flex is our only floating ip cloud. Is it possible that the network is being modified (looks like you're deploying k8s) in some way that specifically breaks floating IPs? | 02:43 |
| clarkb | oh though the example above is not rax flex it is rax classic so maybe not specific to floating IPs then | 02:44 |
| clarkb | but I've also not seen anyone else complain about host networking just breaking so I still suspect that it could be related to the job workload | 02:44 |
| clarkb | as a side note the connectivity to the console stream is not what causes the job to fail. It's connectivity over ssh breaking that fails the job. So tcp/ip networking for both port 22 and 19885 is breaking for some reason | 02:46 |
| hemanth | I am not sure... there are around 20 jobs that do similar stuff (charm-build-*-k8s), out of which only one failed and a lot succeeded. | 02:46 |
| hemanth | https://review.opendev.org/c/openstack/sunbeam-charms/+/963705 .. see the last zuul one in the changelog | 02:47 |
| tonyb | clarkb: the job I was looking at ran on rax classic | 02:47 |
| clarkb | tonyb: ya the link above is for rax classic | 02:47 |
| clarkb | but raxflex dfw3 was called out so I got confused for a moment | 02:47 |
| tonyb | ah yes. that explains why I was also confused | 02:48 |
| hemanth | For now, I will keep an eye on it, and if it recurs multiple times in the next few days I will report back | 02:49 |
| clarkb | is it possible to have charmcraft log more verbosely to stdout or stderr rather than the file it records it is logging to? | 02:50 |
| clarkb | that may help us see what it is doing before things go sideways. hemanth is it always in the same step of the log when it happens? | 02:50 |
| hemanth | clarkb: charmcraft log is already in debug mode.. I will look at other failures from yesterday and let you know if they failed at the same step | 02:52 |
| hemanth | yesterday on a different change, it is in the same step on a similar job | 02:55 |
| hemanth | 2025-10-14 17:59:29.084127 \| [Zuul] Log Stream did not terminate... (full message at <https://matrix.org/oftc/media/v1/media/download/AeUy-dN-D7SRiDqRKCCER0rwmajeI1l1azltDs0KTqG0-oeVtVXTyCbzi2qsQcwRlNEuF5kT0ZF8z6yGwKyz4bpCeaLeSfGwAG1hdHJpeC5vcmcvT0lCZ1lJY1RCWFVUblZOTmtRU0FnZWRR>) | 02:55 |
| hemanth | Reference build: https://64ac1fc4c571c3ef24ca-89f45418e048e284a0aa8c1a9bb175a2.ssl.cf5.rackcdn.com/openstack/c27f0ca35e584199b90b0cb4bb67e33b/job-output.txt | 02:56 |
| hemanth | charmcraft usually creates an LXD container at this step .. typically with a network range 10.x.x.x | 02:59 |
| fungi | note that nodes in rax classic have interfaces on a 10.x.x.x network too, which is how they reach the local mirrors | 03:31 |
| fungi | probably not related, but worth keeping in mind | 03:31 |
| opendevreview | Merged openstack/project-config master: update_constraints.sh: Better describe what we're skipping https://review.opendev.org/c/openstack/project-config/+/959628 | 07:36 |
| gboutry | Hemanth Nakkina: we're only doing charmcraft -v pack, we could look into doing `charmcraft --verbose --verbosity trace pack`, this will output a lot of logs though :) | 09:48 |
| sfernand | clarkb fungi the nodes held for me can be dropped now. I think I figured out the problem with the stale nfs mounts, it seems to be a bug impacting NFS over TCP in Ubuntu Noble https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2103802 | 13:10 |
| sfernand | The issue can't be reproduced on Ubuntu Jammy so I might possibly just fix the nodeset and wait for the fix to get released | 13:18 |
| clarkb | sfernand: the autoholds have been cleaned up. Thank you for letting us know this was tracked down | 14:58 |
| clarkb | sfernand: you might want to follow up with ubuntu to see if that is still getting patched in noble as it looks like they indicated in march it would be (and it is now october) | 14:59 |
| sfernand | will do! | 15:00 |
| *** __ministry is now known as Guest29080 | 17:04 | |
| opendevreview | Merged openstack/project-config master: Have more files trigger test-release-openstack https://review.opendev.org/c/openstack/project-config/+/958709 | 22:04 |
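
The discussion above notes that charmcraft's LXD container typically uses a 10.x.x.x range and that rax classic nodes also have a 10.x.x.x interface, which fungi considered probably unrelated. For anyone who wants to rule an address clash in or out on a held node, a minimal sketch using Python's standard `ipaddress` module follows. The function name `overlapping_networks` and all CIDR values are hypothetical placeholders, not values taken from the failing builds.

```python
import ipaddress


def overlapping_networks(host_cidrs, container_cidrs):
    """Return (host, container) network pairs whose address ranges overlap."""
    # strict=False lets us pass interface addresses such as 10.208.16.1/24
    # and have them normalised to the enclosing network.
    hosts = [ipaddress.ip_network(c, strict=False) for c in host_cidrs]
    containers = [ipaddress.ip_network(c, strict=False) for c in container_cidrs]
    return [(h, c) for h in hosts for c in containers if h.overlaps(c)]


# Hypothetical example values; substitute the output of `ip -o addr`
# on the test node and the subnet of the LXD bridge.
host = ["10.208.0.0/19", "23.253.164.0/24"]
lxd = ["10.208.16.1/24"]

conflicts = overlapping_networks(host, lxd)
for h, c in conflicts:
    print(f"host network {h} overlaps container network {c}")
```

An empty result would suggest the two 10.x.x.x ranges do not collide, which would be consistent with fungi's guess that the shared prefix is a coincidence.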