Wednesday, 2025-06-18

zigohberaud[m]: Did you tag Eventlet 0.40.1 ?11:25
zigoWe're still affected by https://github.com/eventlet/eventlet/commit/e470c1f493a87e867a36ee779573d7cbe964d53b even after the last merge (ie: tip of master still has the bug).11:34
hberaud[m]zigo: not yet, the release is pending https://github.com/eventlet/eventlet/issues/1051 11:36
hberaud[m]Weird... Guillaume confirmed that it solved his problem https://github.com/eventlet/eventlet/issues/1030#issuecomment-2970250157 11:37
zigohberaud[m]: Different issue, no ?11:38
hberaud[m]It depends on what you are talking about. I was thinking you were mostly worried about the Nova issue reported by Guillaume through https://github.com/eventlet/eventlet/issues/1030 11:40
zigohberaud[m]: The blocker for me, when I spawn a VM with nova, is this one:11:42
zigohttps://github.com/eventlet/eventlet/issues/1032 11:42
zigoalso reported here:11:42
zigohttps://bugs.launchpad.net/nova/+bug/2103413 11:42
zigoWe know it's a gc issue because when doing gc.disable(), it kind of works (knowing that gc.disable() is not a viable solution).11:42
zigoIf that one is fixed, then probably everything else will start working.11:43
zigoI don't think it's useful to do a release if that one isn't fixed.11:43
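The gc.disable() experiment zigo describes can be scoped to a single code path instead of the whole process. The following is a minimal sketch, not code from Nova or eventlet: `gc_paused` and `suspect_code_path` are hypothetical names, and the latter stands in for whatever call fails. Only the cyclic collector is switched off; reference counting still frees objects as usual.

```python
import contextlib
import gc


@contextlib.contextmanager
def gc_paused():
    """Temporarily turn off the cyclic garbage collector (diagnostic only)."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        # Restore the previous state so the experiment does not leak.
        if was_enabled:
            gc.enable()


def suspect_code_path():
    # Hypothetical stand-in for the failing operation (e.g. spawning the VM).
    # Here it only reports whether the collector is active during the call.
    return gc.isenabled()


with gc_paused():
    collector_active_during_call = suspect_code_path()
```

If the failure disappears only while the collector is paused, that points at a collection pass as the trigger, which matches what is reported above. It is a diagnostic, not a fix.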
zigohttps://github.com/eventlet/eventlet/commit/e470c1f493a87e867a36ee779573d7cbe964d53b was supposed to address it, but I don't see any resolution to https://bugs.launchpad.net/nova/+bug/2103413 (i.e. Nova still can't query Neutron as it loses the reference to its keystone object...).11:44
zigohberaud[m]: Does this make more sense now?11:45
hberaud[m]So for now I think this is a different issue from the initial one reported by Guillaume (the fork one). Guillaume told us that this gc problem is indeed still visible even with both the fork patch and the gc patch. We do not have a solution yet, but I think it is worth releasing eventlet in any case; nothing stops us from making another release later, once we have a solution for this problem.11:45
zigoAs you like, though as far as I'm concerned, we're still "dans la merde" [in deep trouble] ! :)11:46
hberaud[m]hahaha11:46
zigoThanks again for your work on this though. :)11:47
hberaud[m]Thanks, and thanks for your precious help11:47
*** croeland1 is now known as croelandt12:18
itamarstif someone can produce a minimal reproducer that would be very helpful13:29
JayFzigo: ^14:19
zigoJayF: The only way I know is setting up OpenStack and trying to spawn a VM. :/14:19
zigoI can give the traceback though.14:19
JayFitamarst: zigo: if it's easily reproducible in something like DevStack, I could maybe set up a test harness.14:19
JayFBut I know that probably the goal is to not have to troubleshoot any of the openstack side of the problem, because isolating what is eventlet and what is Nova will be very difficult I imagine14:20
zigoThat's my traceback:14:21
zigohttps://paste.opendev.org/show/bMyqNoJ43Kshsj6iXu4Y/ 14:21
zigoitamarst: Does this help?14:25
itamarstnot really14:25
itamarstit would be useful to know whether or not fork() is being used14:25
itamarstif this only happens when using fork()... the solution is to not use fork()14:27
itamarstif this happens without fork()... an object's dictionary being wiped is... potentially even a bug in Python14:30
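One way to answer the fork() question is to have the interpreter report it. A minimal sketch, assuming a Unix host and Python 3.7+, where `os.register_at_fork` is available; `fork_events` and `install_fork_logger` are hypothetical names. The hook fires for `os.fork()` calls that return to Python, but a typical `subprocess` launch does not trigger it.

```python
import os
import traceback

fork_events = []  # (pid, stack summary) recorded just before each fork


def _record_fork():
    # Runs in the parent process immediately before os.fork() executes.
    stack = "".join(traceback.format_stack(limit=8))
    fork_events.append((os.getpid(), stack))


def install_fork_logger():
    """Register the hook. Hooks cannot be unregistered once installed."""
    os.register_at_fork(before=_record_fork)
```

Calling `install_fork_logger()` early in service startup and inspecting `fork_events` when the failure occurs would show whether any fork happened, and from where.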
hberaud[m]Makes sense14:32
itamarstanother experiment to try14:32
itamarstremove __del__ methods14:32
zigoWould it help if I tried to bisect interpreter versions ?14:32
itamarstI would first rule out fork(), then remove __del__ methods14:33
zigoIn what class ?14:33
itamarstthere's one in keystoneauth1.session.Session at minimum14:33
itamarstbut I would check SessionClient too14:34
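The "remove __del__ methods" experiment can be run without editing installed packages by stripping the finalizers at runtime. This is a sketch under assumptions: `strip_finalizers`, `restore_finalizers` and `FakeSession` are hypothetical, with `FakeSession` standing in for a class such as keystoneauth1.session.Session. It only works for classes defined in Python, not for extension types.

```python
def strip_finalizers(cls):
    """Remove __del__ from cls and its Python-level base classes.

    Returns the removed functions keyed by class so they can be restored.
    """
    removed = {}
    for klass in cls.__mro__:
        if klass is object:
            continue
        if "__del__" in vars(klass):
            removed[klass] = vars(klass)["__del__"]
            delattr(klass, "__del__")
    return removed


def restore_finalizers(removed):
    """Put back the finalizers returned by strip_finalizers()."""
    for klass, finalizer in removed.items():
        setattr(klass, "__del__", finalizer)


class FakeSession:  # hypothetical stand-in for the real session class
    finalized = []

    def __del__(self):
        FakeSession.finalized.append(id(self))
```

Running `strip_finalizers(SomeClass)` before the first client is created covers both the class itself and any base class that defines `__del__`.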
itamarst(https://github.com/python/cpython/issues/135552 is a bug with __del__ in all Python 3 versions, but it may only be exposed in some situations which 3.13 makes more likely. or it may be completely unrelated to what you are seeing)14:34
itamarstalso worth testing with latest patch release of 3.13 14:36
itamarst(and that specifically means _not_ the distro version since they don't ship most bug fixes)14:37
zigoCommenting out the __del__() method in keystoneauth1.session.Session has no effect at least.14:37
zigoIf you find a patch to apply to the distro version, it's easy to test. Saying "try the latest" is harder.14:38
itamarstuv will download them for you14:39
itamarstor you can just download from python.org14:39
itamarston ubuntu there's deadsnakes PPA, etc14:40
zigoI can try switching from 3.13.3 to 3.13.5.14:41
zigoNope, that doesn't fix it... :/14:43
* zigo goes back home.14:45
itamarsthow about fork()?14:45
jkulikhm ... I was able to reproduce it with this: https://paste.opendev.org/show/bJW0VtRkzfanKa3N4Zso/ - pretty hacked together and needs a working Neutron + Keystone for now15:12
jkulikit started happening once I moved the `neutron.get_client()` calls into the `network()` function. I had passed them into `NetworkInfoAsyncWrapper` as arguments previously and that worked.15:15
itamarstgreat. so no fork(). and it works with older versions of python?15:18
jkulikhm ... need to test. currently 3.13.2 15:19
itamarstanother fun and plausible place to be causing this is greenlet15:19
itamarstdepending if I understood the problem correctly. where is SessionClient implemented so I can see its source?15:21
jkulikseems to work with Python 3.12.9 - no exception15:22
jkulikSessionClient should come from here https://github.com/openstack/python-neutronclient/blob/master/neutronclient/client.py#L305 15:24
itamarstso doesn't look like endpoint_override is del'd, not seeing anything with __dict__... so does seem like a weird bug15:27
itamarstand for all the terrible stuff eventlet does I don't see how it would cause this15:28
hberaud[m]jkulik: thanks for feedback15:30
jkulik`(Pdb) [o for o in gc.get_objects() if isinstance(o, t)]` - using this, I can see 2 clients when the exception gets raised. one (the second one) has a completely empty __dict__ 15:30
itamarstthat's a pretty good bug15:33
itamarstso, eventlet honestly seems like an unlikely source for this kind of bug, so it's more likely either greenlet, or CPython15:36
jkulikgc.is_finalized() returns False for both objects, fyi15:37
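The two checks jkulik ran from pdb can be wrapped in one helper. This is a sketch, not his actual code: `live_instances` and `FakeClient` are hypothetical, `cls` plays the role of `t` in his expression, and the wiped instance is simulated here only to show what the output looks like. `gc.is_finalized()` requires Python 3.9+.

```python
import gc


def live_instances(cls):
    """Return (object, attribute count, gc.is_finalized) for tracked instances."""
    report = []
    for obj in gc.get_objects():
        if isinstance(obj, cls):
            attrs = getattr(obj, "__dict__", {})
            report.append((obj, len(attrs), gc.is_finalized(obj)))
    return report


class FakeClient:  # hypothetical stand-in for the real client class
    def __init__(self):
        self.endpoint_override = None


healthy = FakeClient()
wiped = FakeClient()
wiped.__dict__.clear()  # simulate the symptom: an instance with an empty __dict__

# Instances that are still alive but have lost every attribute.
suspects = [obj for obj, n_attrs, _ in live_instances(FakeClient) if n_attrs == 0]
```

In the real reproducer, a live instance with zero attributes and `is_finalized` equal to False matches what is reported above: the object has not been finalized, yet its `__dict__` is empty.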
jkulikCannot reproduce with `gc.disable()` added into the code before the first spawn()15:41
itamarstan ideal next step would be to find a reproducer that doesn't rely on third party libraries, or third party servers15:43
itamarstI do wonder how greenlet and gc interact in edge cases so will spend a few minutes looking at that in a bit15:44
itamarstbut it could also be a bug elsewhere in greenlet's 3.13 support15:44
itamarstor it could be CPython, somehow somewhere15:44
itamarst(but if I were a CPython dev my first reflex would be to ask for reproducer _without_ eventlet)15:45
itamarster15:45
itamarstwithout greenlet15:45
itamarstbut in any case a reproducer that uses just eventlet would be very helpful next step15:45
jkulikhm ... I was able to reduce the number of clients it needs. one is enough. what I noticed: I get thrown into the breakpoint after the/a client was used once16:09
