Wednesday, 2025-10-15

hemanthHey I am still seeing RETRY_LIMIT issues in some CI builds, is it a known issue?01:25
hemanth2025-10-15 01:09:21.360122 | ubuntu-jammy | [Zuul] Lost log stream connection to [23.253.164.120:19885]... (full message at <https://matrix.org/oftc/media/v1/media/download/AYerL9bDzIy-IJ_YNCBYeafjtEnit820ds8s13zB2YxrEl3OeR7y9UQDM4huTqlqLKT6g4DksCf-_z3L0LZVHJVCeaLZGYqQAG1hdHJpeC5vcmcvbkFTTGFaQlNuS3RYQ1VRU2ZrVklScmJr>)01:25
hemanthReference build: https://zuul.opendev.org/t/openstack/build/7beb2f59c6de42e5aee90f3452c5ead301:26
fungihemanth: "still seeing" since... when did it start?01:27
fungi"Lost log stream connection" typically implies that the test node crashed or rebooted while running the job01:28
hemanthIt started on Monday; ykarel reported it in this channel but with some other servers, raxflex-DFW301:28
fungior there was some network disconnect between the executor and test node01:28
fungior that the log streaming process was killed on the test node01:29
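A minimal triage sketch for a held or recovered test node, to tell apart the three suspects fungi lists above (node crash/reboot, network drop, or the log streaming process being killed); port 19885 comes from the log line earlier, and the rest is an assumption about the node environment:

```bash
# Rough triage on a held/recovered test node (assumes systemd + procps).
uptime -s                                   # boot time: did the node reboot mid-job?
journalctl -k --no-pager | grep -iE 'out of memory|oom' | tail -n 5   # kernel OOM kills?
ss -ltnp | grep 19885                       # is the console log streamer still listening?
```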
tonybhemanth: I can look, the issue ykarel brought up was resolved01:29
hemanthyeah the issue ykarel brought up was resolved .. 01:30
fungiapologies for lag and typos, airplane wifi01:30
ianychoifungi: thank you for approving 962454 - to enable translation on 2025.2. Safe flight!01:32
fungithanks!01:33
tonybhemanth: `2025-10-15 01:02:20.333571 | TASK [charm-build : reset ssh connection to apply permissions from new group]` makes me a little nervous, do you know if that was recently added to the build jobs?01:34
hemanthno, it's not added recently01:35
tonybokay01:36
tonybhemanth: I can't see anything in the logs that's helpful.  I'll check with the cloud provider02:02
hemanthack.. note it is not always happening.. 02:03
tonybAh okay, that's good to note02:04
opendevreviewOpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml  https://review.opendev.org/c/openstack/project-config/+/96255702:15
* tonyb takes a break to get some air02:40
tonybhemanth: I'll keep looking when I'm back for any kind of pattern02:40
clarkbhemanth: tonyb rax flex is our only floating ip cloud. Is it possible that the network is being modified (looks like you're deploying k8s) in some way that specifically breaks floating IPs?02:43
clarkboh, though the example above is not rax flex, it is rax classic, so maybe not specific to floating IPs then02:44
clarkbbut I've also not seen anyone else complain about host networking just breaking, so I still suspect it could be related to the job workload02:44
clarkbas a side note, the connectivity to the console stream is not what causes the job to fail. It's the ssh connection breaking that fails the job. So tcp/ip networking for both port 22 and 19885 is breaking for some reason02:46
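As a rough illustration of that point, a probe one could run from the executor side while a job is in flight; the node IP below is just the example address from the log snippet earlier, and the loop interval is an arbitrary assumption:

```bash
#!/usr/bin/env bash
# Periodically probe the two ports clarkb mentions: 22 (the ansible/ssh
# connection whose loss actually fails the job) and 19885 (the Zuul
# console log stream). NODE_IP is the example address from the log above.
NODE_IP=23.253.164.120
while sleep 30; do
    for port in 22 19885; do
        if nc -z -w 5 "$NODE_IP" "$port"; then
            echo "$(date -u +%FT%TZ) port $port reachable"
        else
            echo "$(date -u +%FT%TZ) port $port UNREACHABLE"
        fi
    done
done
```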
hemanthI am not sure.. there are around 20 jobs (charm-build-*-k8s) which do similar stuff, and out of those only one failure with lots of successes..02:46
hemanthhttps://review.opendev.org/c/openstack/sunbeam-charms/+/963705 .. see the last zuul one in the changelog02:47
tonybclarkb: the job I was looking at ran on rax classic02:47
clarkbtonyb: ya the link above is for rax classic02:47
clarkbbut raxflex dfw3 was called out so I got confused for a moment02:47
tonybah yes, that explains why I was also confused02:48
hemanthFor now, I will keep an eye on it, and if it recurs multiple times in the next few days I will report back02:49
clarkbis it possible to have charmcraft log more verbosely to stdout or stderr rather than the file it records it is logging to?02:50
clarkbthat may help us see what it is doing before things go sideways. hemanth is it always in the same step of the log when it happens?02:50
hemanthclarkb: the charmcraft log is already in debug mode.. I will look at the other failures from yesterday and will let you know if it failed at the same step02:52
hemanthyesterday on a different change, it was at the same step on a similar job02:55
hemanth2025-10-14 17:59:29.084127 | [Zuul] Log Stream did not terminate... (full message at <https://matrix.org/oftc/media/v1/media/download/AeUy-dN-D7SRiDqRKCCER0rwmajeI1l1azltDs0KTqG0-oeVtVXTyCbzi2qsQcwRlNEuF5kT0ZF8z6yGwKyz4bpCeaLeSfGwAG1hdHJpeC5vcmcvT0lCZ1lJY1RCWFVUblZOTmtRU0FnZWRR>)02:55
hemanthReference build: https://64ac1fc4c571c3ef24ca-89f45418e048e284a0aa8c1a9bb175a2.ssl.cf5.rackcdn.com/openstack/c27f0ca35e584199b90b0cb4bb67e33b/job-output.txt02:56
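One way to check whether the failures all die at the same step is to pull the tail of each failed build's job-output.txt and look at the last task markers before the stream drops; a sketch, with placeholder URLs standing in for the build links above:

```bash
#!/usr/bin/env bash
# Print the last few task/stream markers from each failed build's
# job-output.txt to see whether every failure lands in the same step.
# The URLs here are placeholders, not real build logs.
BUILD_LOGS="
https://example.invalid/build-1/job-output.txt
https://example.invalid/build-2/job-output.txt
"
for url in $BUILD_LOGS; do
    echo "== $url"
    curl -s "$url" | grep -E 'TASK \[|[Ll]og [Ss]tream' | tail -n 10
done
```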
hemanthcharmcraft usually creates an lxd container at this step.. typically with a network range of 10.x.x.x02:59
funginote that nodes in rax classic have interfaces on a 10.x.x.x network too, which is how they reach the local mirrors03:31
fungiprobably not related, but worth keeping in mind03:31
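If that overlap were ever a factor, a quick check on an affected node might look like the sketch below; lxdbr0 is LXD's default bridge name and only an assumption, since charmcraft's managed LXD instance may use its own bridge:

```bash
#!/usr/bin/env bash
# Compare the node's pre-existing 10.x routes (e.g. the path to the local
# mirror in rax classic) with the subnet LXD chose for its bridge.
# "lxdbr0" is the LXD default; charmcraft may use a different bridge name.
echo "--- existing 10.x routes on the node ---"
ip route | grep '^10\.'
echo "--- LXD bridge address ---"
lxc network get lxdbr0 ipv4.address 2>/dev/null || lxc network list
```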
opendevreviewMerged openstack/project-config master: update_constraints.sh: Better describe what we're skipping  https://review.opendev.org/c/openstack/project-config/+/95962807:36
gboutryHemanth Nakkina: we're only doing `charmcraft -v pack`; we could look into doing `charmcraft --verbose --verbosity trace pack`, though this will output a lot of logs :)09:48
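For reference, a sketch of what the noisier build invocation might look like; the flag comes from the command quoted above (dropping the redundant --verbose), its availability depends on the charmcraft version in use, and the log filename is just an example:

```bash
# Hypothetical build step at trace verbosity, so the console shows what
# charmcraft was doing right before the node disappears. Expect very
# large output; flag support varies by charmcraft release.
charmcraft --verbosity trace pack 2>&1 | tee charmcraft-pack.log
```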
sfernandclarkb fungi the nodes held for me can be dropped now. I think I figured out the problem with the stale nfs mounts; it seems to be a bug impacting NFS over TCP in Ubuntu Noble: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/210380213:10
sfernandThe issue can't be reproduced on Ubuntu Jammy, so I might just fix the nodeset and wait for the fix to get released13:18
clarkbsfernand: the autoholds have been cleaned up. Thank you for letting us know this was tracked down14:58
clarkbsfernand: you might want to follow up with Ubuntu to see if that is still getting patched in Noble, as it looks like they indicated in March it would be (and it is now October)14:59
sfernandwill do!15:00
*** __ministry is now known as Guest2908017:04
opendevreviewMerged openstack/project-config master: Have more files trigger test-release-openstack  https://review.opendev.org/c/openstack/project-config/+/95870922:04

Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!