tobias-urdin | yoctozepto: yeah, if you check my previous patchsets there i initially tried using the official async lib without success, and we all know how fun it also is to troubleshoot threading and eventlet issues :p | 06:41 |
---|---|---|
tobias-urdin | unfortunately there is probably some stuff that needs to be fixed in the lib before we can consider using it as well; the async lib has a lot of retry logic etc., but we could potentially just port the same logic over to that library | 06:43 |
tobias-urdin | so i guess the first plan of action is to continue investigating what needs to be done in that library, propose a spec and work on that until we're satisfied | 06:43 |
tobias-urdin | i would also like to take some time to investigate the old spec and old proposed implementation as well so that we can learn from it | 06:44 |
hberaud | yoctozepto: I'd like to add that eventlet/greenlet are not really active projects, see https://twitter.com/ossmkitty/status/1562750638890827777 | 07:56 |
hberaud | yoctozepto: Python core maintainers and distro maintainers raised warnings about the current situation of these projects (eventlet/greenlet) https://github.com/python-greenlet/greenlet/issues/305 | 07:58 |
yoctozepto | tobias-urdin: agreed, doing the same | 07:58 |
yoctozepto | hberaud: long-term I would love to see OpenStack moving to asyncio (I prefer explicit async to implicit async) | 07:59 |
yoctozepto | but I have no idea how feasible that is | 07:59 |
yoctozepto | it would involve all libs | 07:59 |
hberaud | Wouldn't it be a good time to consider starting to move away from eventlet? | 08:00 |
yoctozepto | we need a plan to support two models at once | 08:00 |
yoctozepto | hberaud: probably, yeah | 08:00 |
hberaud | Maybe libs should be transitioned first | 08:00 |
hberaud | And then services | 08:00 |
hberaud | I don't have the big picture of eventlet in OpenStack | 08:01 |
hberaud | But you are right, moving away from eventlet/greenlet seems to be a real journey | 08:03 |
yoctozepto | indeed | 08:04 |
hberaud | However I really think it would be beneficial for us, as eventlet is often the root cause of many of our pains | 08:04 |
yoctozepto | well, it goes against the grain of python preferring explicit to implicit | 08:04 |
yoctozepto | and goes low-level with that | 08:04 |
hberaud | yeah | 08:05 |
yoctozepto | I wonder if anyone thinks the opposite, i.e. that staying with greenlet is beneficial | 08:08 |
yoctozepto | (not including the migration costs of course, i.e. it is *directly* beneficial) | 08:08 |
hberaud | yoctozepto: I left some comments related to our previous discussion here: https://review.opendev.org/c/openstack/oslo.messaging/+/848338 | 08:18 |
tobias-urdin | good question I'm interested in that as well, I don't have enough knowledge on the internals to comment on it, but I don't like blackboxes | 08:18 |
yoctozepto | hberaud: ack | 08:23 |
hberaud | Depending on the libs we decide to use (nats.py, nats-python, etc) I think the specs should reflect how to make eventlet and async code coexist. | 08:23 |
yoctozepto | another rmq question on the openstack-discuss | 08:23 |
yoctozepto | aye, as you saw, it seems it is generally impossible to have the two used in the same process without running into bugs | 08:24 |
yoctozepto | I would be more inclined to discuss the effort to have the libs support either model (i.e. not two at once, but be usable with similar interfaces with any - choose one and go go go) | 08:25 |
hberaud | good idea | 08:33 |
*** sean-k-mooney1 is now known as sean-k-mooney | 09:31 | |
sean-k-mooney | hberaud: heh, i forgot i reviewed https://review.opendev.org/c/openstack/oslo-specs/+/692784 also ussuri was 2 years ago, how... | 09:41 |
yoctozepto | happy to remind you all, seems we have gathered quite a strong team around this idea; let's hope for the best ;p | 09:44 |
sean-k-mooney | hberaud: so to continue the conversation i was starting regarding https://review.opendev.org/c/openstack/oslo.messaging/+/848338/17#message-0d8a5e811a0f54e24081f244e8287122bf9dba76 | 11:27 |
sean-k-mooney | you see nats.py as a roadblock because of asyncio | 11:27 |
sean-k-mooney | and the fact it really is not happy with eventlet | 11:27 |
sean-k-mooney | but you mention oslo.db | 11:27 |
sean-k-mooney | however we use eventlet in many other contexts where we would use oslo.messaging too | 11:28 |
sean-k-mooney | one suboptimal approach would be to take a privsep-style approach | 11:28 |
sean-k-mooney | run the oslo.messaging driver in a separate process using asyncio | 11:28 |
sean-k-mooney | so that it is isolated from the rest of the client application | 11:29 |
sean-k-mooney | we have had issues using pthread with the heartbeat work so i don't think that's an option here | 11:29 |
sean-k-mooney | the downside of the separate process is that all messages would have to flow through a unix socket or named pipe between the processes | 11:30 |
hberaud | yes | 11:30 |
hberaud | if we are able to release the two new drivers at the same time that would be ok | 11:31 |
hberaud | (yours based on a unix socket and the one based on nats) | 11:32 |
sean-k-mooney | oh i wasnt even thinking about that | 11:32 |
sean-k-mooney | that might be a way to do it | 11:32 |
sean-k-mooney | i mentioned privsep more because it already does this with its channels | 11:33 |
hberaud | ok | 11:33 |
hberaud | could be an option indeed | 11:33 |
sean-k-mooney | but i guess ya, you could reuse the unix socket driver as a bridge maybe | 11:33 |
sean-k-mooney | anyway do you see https://github.com/Gr1N/nats-python as a blocker to moving forward? | 11:34 |
sean-k-mooney | ah the pr has not been addressed in some time | 11:35 |
sean-k-mooney | so the challenge is in finding a lib that is maintained and eventlet compatible | 11:36 |
hberaud | sean-k-mooney: yes I see it as a blocker, I left related comments on the nats review | 11:36 |
hberaud | and the lib (nats-python) seems to lack a lot of basic features | 11:38 |
sean-k-mooney | ya | 11:39 |
sean-k-mooney | just re read your comments | 11:39 |
hberaud | Could be worth creating a comparison table to define what we need, what we want, and what we have | 11:39 |
hberaud | (in those libs) | 11:39 |
sean-k-mooney | am, i think moving the services away from evently would be a multi-cycle effort by the way | 11:39 |
sean-k-mooney | *away from eventlet | 11:39 |
sean-k-mooney | its threading model is deeply baked into nova | 11:40 |
sean-k-mooney | to the point that nova can't really run un-monkey-patched | 11:40 |
sean-k-mooney | it will end up in an infinite loop waiting for RPC messages | 11:41 |
sean-k-mooney | in the compute-agent at least | 11:41 |
hberaud | I see | 11:42 |
sean-k-mooney | the topic has come up before but we expect it would take a substantial rewrite to address | 11:42 |
sean-k-mooney | we contemplated starting a new agent based on asyncio (don't have the time) as a poc and then seeing if we could port services so they could run under either mode | 11:43 |
sean-k-mooney | but i don't see that happening in the next year or two unless new contributors sign up to work on it | 11:44 |
sean-k-mooney | part of the concern is whether we would be able to share code and move things bit by bit or not | 11:44 |
yoctozepto | hberaud, sean-k-mooney : the legacy lib cannot handle more than one server, so it's not feasible unless we do NATS with a TCP loadbalancer | 12:44 |
hberaud | good to know | 12:45 |
yoctozepto | yeah, plus it lacks at least those features I mentioned in gerrit | 12:47 |
yoctozepto | all in all, it does not make sense to base off of it unless we plan to build upon it quite a bit | 12:47 |
sean-k-mooney | ack | 12:54 |
sean-k-mooney | so short term i think running the nats part in a separate process spawned by the oslo driver and bridging it within the nats driver would be a "simple" way to move to the official asyncio implementation | 12:55 |
sean-k-mooney | but it's not the nicest approach | 12:55 |
sean-k-mooney | it should just work however | 12:55 |
sean-k-mooney | as it will provide a clean seam between the eventlet part and asyncio part | 12:56 |
yoctozepto | RPC to get RPC... | 12:57 |
yoctozepto | 😂 | 12:57 |
sean-k-mooney | IPC to get RPC | 12:58 |
sean-k-mooney | but kind of | 12:58 |
sean-k-mooney | it could be an mmapped queue or any other IPC mechanism | 12:58 |
sean-k-mooney | you just need to keep the two event loops in different processes | 12:59 |
yoctozepto | yeah | 13:03 |
yoctozepto | though I am curious about final eventlet removal | 13:04 |
hberaud | I prefer this solution, however, it will surely have performance impacts somewhere | 13:04 |
hberaud | (IPC to get RPC) | 13:05 |
yoctozepto | yeah | 13:05 |
hberaud | however, that's no worse than being stuck on eventlet somewhere | 13:05 |
yoctozepto | I must say I am curious about removing the dep on eventlet | 13:06 |
hberaud | this is a kind of start... | 13:07 |
yoctozepto | indeed, although I am still trying to wrap my head around it more holistically | 13:13 |
yoctozepto | I mean, staying with that IPC will be ugly | 13:14 |
yoctozepto | we need to have a path to eventlet-free openstack | 13:14 |
tobias-urdin | seems like there was some (way way back) planning for that https://wiki.openstack.org/wiki/Oslo/blueprints/asyncio https://blueprints.launchpad.net/oslo.messaging/+spec/asyncio-executor | 13:35 |
tobias-urdin | to run eventlet inside asyncio until everything was ported or something | 13:35 |
tobias-urdin | "The greenio project allows to "plug" asyncio in eventlet" the other way around :pp | 13:36 |
hberaud | good catch | 13:38 |
hberaud | interesting documents | 13:38 |
yoctozepto | https://github.com/miguelgrinberg/greenletio | 13:46 |
yoctozepto | also, it looks like SQLAlchemy is using greenlet to do asyncio | 13:46 |
yoctozepto | https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html | 13:47 |
yoctozepto | I think I am confused now :S | 13:47 |
tobias-urdin | me too, what a spider web | 13:54 |
hberaud | +1 | 13:55 |
hberaud | From the doc: "Where above, an API call always starts as asyncio, flows through the synchronous API, and ends as asyncio, before results are propagated through this same chain in the opposite direction. In between, the message is adapted first into sync-style API use, and then back out to async style. Event hooks then by their nature occur in the middle of the “sync-style | 13:57 |
hberaud | API use”. From this it follows that the API presented within event hooks occurs inside the process by which asyncio API requests have been adapted to sync, and outgoing messages to the database API will be converted to asyncio transparently." | 13:57 |
hberaud | https://docs.sqlalchemy.org/en/14/orm/extensions/asyncio.html (look for the table under "asyncio and events, two opposites") | 13:58 |
hberaud | I think sqlalchemy does the magical abstraction by calling: AsyncSession.run_sync() | 14:01 |
* hberaud goes to the release meeting | 14:01 | |
yoctozepto | o/ | 14:02 |
hberaud | in other words I think sqlalchemy is doing something similar to greenio under the hood | 14:03 |
yoctozepto | maybe, though that part is only about the events, not the general flow | 14:06 |
hberaud | indeed | 14:06 |
tobias-urdin | link so that i don't forget https://lists.openstack.org/pipermail/openstack-dev/2013-May/009784.html | 14:24 |
hberaud | could be worth putting all of these links into the review too, to centralize things. | 14:28 |
frickler | I was just about to suggest an etherpad | 14:28 |
hberaud | also, why not | 14:29 |
yoctozepto | I have been adding some more to the review so that they stay longer | 14:29 |
hberaud | thx | 14:30 |
yoctozepto | btw, regarding the pthread/eventlet issue with heartbeats - was there a debugging conclusion? or only that we revert because it behaves oddly? | 14:33 |
hberaud | we just reverted the default value to avoid breaking nova and neutron parts by default. The option should be turned on manually to run under pthread. | 14:35 |
hberaud | From an oslo viewpoint we don't have a lot of options here. We are stuck with these eventlet/uwsgi constraints from the heartbeat viewpoint | 14:36 |
yoctozepto | I mean, has anyone discovered *why* nova-compute and friends were failing with heartbeats being on pthread | 14:37 |
hberaud | because nova-compute is not running under uwsgi so this flag should be set to false and turn off by default. | 14:40 |
hberaud | s/turn off/turned off/ | 14:42 |
yoctozepto | but pthread is not wsgi-specific; I mean you are talking about the symptoms and I am asking whether anyone ran a Root Cause Analysis | 14:45 |
yoctozepto | because I can't find it in the bug report | 14:45 |
hberaud | I'm not aware of an existing RCA | 14:46 |
yoctozepto | ack | 14:52 |
yoctozepto | the context for my queries is that greenletio seems to be using threads to support both styles (unless that was too quick an analysis) | 14:57 |
yoctozepto | that said, it's greenlet only, without eventlet | 14:58 |
hberaud | greenletio? You mean greenio? | 15:00 |
hberaud | or greenlet? | 15:00 |
yoctozepto | https://github.com/miguelgrinberg/greenletio | 15:01 |
yoctozepto | we have been discussing this already | 15:01 |
hberaud | oh I missed this one | 15:03 |
hberaud | I was thinking about https://github.com/1st1/greenio/ | 15:03 |
yoctozepto | ok | 15:16 |
gibi | heads up, I'm in the middle of debugging https://bugs.launchpad.net/nova/+bug/1988311/ : oslo.concurrency fair lock + eventlet is broken, see https://gist.github.com/gibizer/9051369e67fd46a20d52963dac534852 . This is probably the same issue melwitt reported in https://github.com/eventlet/eventlet/issues/731 when we fixed a test-only break in nova https://review.opendev.org/c/openstack/nova/+/813114 | 16:54 |
gibi | I wanted to say: I think ... broken | 16:55 |
gibi | I summarized what we know so far in https://bugs.launchpad.net/oslo.concurrency/+bug/1988311/comments/4 | 17:39 |
yoctozepto | another reason to drop eventlet :S | 19:06 |
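
The "IPC to get RPC" idea from the log (keep the asyncio event loop in its own process and bridge to it from the eventlet side) can be sketched roughly as below. This is only a minimal illustration under stated assumptions, not the oslo.messaging driver design: `CHILD_SOURCE`, `handle` and `AsyncioBridge` are hypothetical names, `handle` merely echoes where a real driver would await an asyncio client such as nats.py, and anonymous stdin/stdout pipes with JSON lines stand in for the unix socket or named pipe mentioned in the discussion.

```python
import json
import subprocess
import sys

# Source of the child process. The asyncio event loop exists only there,
# so it never shares a process with eventlet monkey patching.
CHILD_SOURCE = r'''
import asyncio, json, sys

async def handle(msg):
    # Hypothetical stand-in for an awaitable broker call.
    await asyncio.sleep(0)
    return {"echo": msg}

async def main():
    loop = asyncio.get_running_loop()
    while True:
        # readline() blocks, so run it in the default thread pool.
        line = await loop.run_in_executor(None, sys.stdin.readline)
        if not line:  # EOF: the parent closed its end of the pipe
            break
        reply = await handle(json.loads(line))
        sys.stdout.write(json.dumps(reply) + "\n")
        sys.stdout.flush()

asyncio.run(main())
'''


class AsyncioBridge:
    """Blocking facade used by the parent (eventlet-side) process."""

    def __init__(self):
        self._proc = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )

    def call(self, msg):
        # One JSON document per line in each direction.
        self._proc.stdin.write(json.dumps(msg) + "\n")
        self._proc.stdin.flush()
        return json.loads(self._proc.stdout.readline())

    def close(self):
        # Closing stdin signals EOF, which ends the child's loop.
        self._proc.stdin.close()
        self._proc.wait(timeout=5)
        self._proc.stdout.close()


if __name__ == "__main__":
    bridge = AsyncioBridge()
    print(bridge.call({"method": "ping"}))
    bridge.close()
```

Every call crosses the pipe twice and is serialized on the way, which is the performance cost raised at 13:04; the benefit is the clean seam between the two event loops described at 12:56.
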