Which component is this bug for?
Traceloop SDK
📜 Description
I'm seeing a memory leak in a service that uses traceloop-sdk and opentelemetry-instrumentation-langchain and makes an HTTP call (via aiohttp) inside one of its langgraph nodes.
Memory keeps growing until the service is killed and restarted with an OOM error.
👟 Reproduction steps
Relevant requirements.txt
fastapi[standard]==0.115.6
asgi-correlation-id==4.3.1
uvicorn==0.29.0
aiohttp==3.11.11
langchain==0.2.16
langchain-openai==0.1.25
langgraph==0.2.6
langchain-community==0.2.16
pydantic-settings==2.7.1
opentelemetry-api==1.29.0
opentelemetry-sdk==1.29.0
openinference-semantic-conventions==0.1.12
opentelemetry-exporter-otlp-proto-http==1.29.0
opentelemetry-instrumentation-fastapi==0.50b0
opentelemetry-instrumentation-aiohttp-client==0.50b0
traceloop-sdk==0.35.0
pydantic==2.10.4
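I used tracemalloc to find the source of the leak. The allocations that keep growing come from the JSON decoder (json/decoder.py), which is reached through aiohttp's response JSON parsing.
After 40 calls to my API that triggered langgraph, the memory attributed to this traceback was about 14.8 MiB. See the tracemalloc output below: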
{
"stat": ".../plugins/python-ce/helpers/pydev/pydevd.py:2270: size=14.8 MiB (+14.8 MiB), count=115519 (+115519), average=135 B",
"frames": [
"$('.../plugins/python-ce/helpers/pydev/pydevd.py', 2270)\n",
"$('.../plugins/python-ce/helpers/pydev/pydevd.py', 2252)\n",
"$('.../plugins/python-ce/helpers/pydev/pydevd.py', 1563)\n",
"$('.../plugins/python-ce/helpers/pydev/pydevd.py', 1570)\n",
"$('.../plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py', 18)\n",
"$('.../app/main.py', 69)\n",
"$('.../venv/lib/python3.11/site-packages/uvicorn/main.py', 575)\n",
"$('.../venv/lib/python3.11/site-packages/uvicorn/server.py', 65)\n",
"$('.../[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py', 190)\n",
"$('.../[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/asyncio/runners.py', 118)\n",
"$('.../app/tools/base/tool.py', 144)\n",
"$('.../venv/lib/python3.11/site-packages/traceloop/sdk/decorators/base.py', 193)\n",
"$('.../app/tools/banking_question/banking_question_tool.py', 108)\n",
"$('.../venv/lib/python3.11/site-packages/oz_logger/decorators.py', 24)\n",
"$('.../app/client/intent_to_answer_client.py', 51)\n",
"$('.../venv/lib/python3.11/site-packages/tenacity/asyncio/__init__.py', 189)\n",
"$('.../venv/lib/python3.11/site-packages/tenacity/asyncio/__init__.py', 114)\n",
"$('.../app/client/intent_to_answer_client.py', 74)\n",
"$('.../venv/lib/python3.11/site-packages/aiohttp/client_reqrep.py', 1298)\n",
"$('.../[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py', 346)\n",
"$('.../[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py', 337)\n",
"$('.../[email protected]/3.11.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/decoder.py', 353)\n"
]
}
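For anyone trying to reproduce the measurement, here is a minimal sketch of how a report like the one above can be produced with tracemalloc by comparing two snapshots grouped by full traceback. The `snapshot_growth` and `fake_workload` names are hypothetical, and the fake workload only stands in for the 40 API calls; it is not the actual service code.

```python
import json
import tracemalloc


def snapshot_growth(workload, nframes=25, limit=5):
    """Run `workload` between two snapshots and report the largest diffs.

    `nframes` controls how many stack frames tracemalloc records per
    allocation; a deep value is needed to see the whole call chain.
    """
    tracemalloc.start(nframes)
    before = tracemalloc.take_snapshot()
    retained = workload()  # keep the result alive so it shows up in the diff
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()

    # Group allocations by full traceback; results are sorted largest first.
    stats = after.compare_to(before, "traceback")
    report = [
        {
            "stat": str(stat),
            "frames": [f"{frame.filename}:{frame.lineno}" for frame in stat.traceback],
        }
        for stat in stats[:limit]
    ]
    return report, retained


def fake_workload():
    # Hypothetical stand-in: decode a JSON body 40 times and retain the results.
    payload = json.dumps({"answer": "x" * 1000, "items": list(range(100))})
    return [json.loads(payload) for _ in range(40)]


if __name__ == "__main__":
    report, _ = snapshot_growth(fake_workload)
    print(json.dumps(report, indent=2))
```

In the real service the first snapshot would be taken before sending the requests and the second one afterwards, rather than wrapping a function call.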
👍 Expected behavior
No memory leak
👎 Actual Behavior with Screenshots
Memory usage keeps growing and the pods restart due to OOM errors
🤖 Python Version
3.11.11
📃 Provide any additional context for the Bug.
The leak disappears if I disable either opentelemetry-instrumentation-langchain or the aiohttp call to the other service.
As a workaround, I disabled opentelemetry-instrumentation-langchain and added custom spans with the @workflow/@task decorators around the nodes and tools.
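A rough sketch of what that workaround looks like, assuming the `workflow` and `task` decorators from `traceloop.sdk.decorators`. The function names, span names, and the stubbed downstream call are hypothetical placeholders, not the real service code. The `ImportError` fallback is only there so the snippet runs without traceloop-sdk installed.

```python
import asyncio

try:
    from traceloop.sdk.decorators import task, workflow
except ImportError:
    # Fallback no-op decorators so this sketch runs without traceloop-sdk.
    def _noop(name=None, **_kwargs):
        def decorator(fn):
            return fn
        return decorator

    task = workflow = _noop


@task(name="call_downstream_service")
async def call_downstream_service(question: str) -> dict:
    # Hypothetical stand-in for the aiohttp call made inside the langgraph node.
    await asyncio.sleep(0)
    return {"question": question, "answer": "stub"}


@workflow(name="banking_question_graph")
async def run_graph(question: str) -> dict:
    # Hypothetical stand-in for invoking the graph; each node or tool that
    # should appear as a span gets its own @task decorator.
    return await call_downstream_service(question)


if __name__ == "__main__":
    print(asyncio.run(run_graph("What is my balance?")))
```

In the actual service the SDK is still initialized at startup (for example with `Traceloop.init`), with the langchain instrumentation turned off.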
👀 Have you spent some time to check if this bug has been raised before?
- I checked and didn't find a similar issue
Are you willing to submit a PR?
None