Python SDK 1.x: Memory leak in streaming responses
Nina Patel · Mar 12, 2026
We're seeing a memory leak in our long-running service that uses streaming responses. After ~1000 streaming requests, memory usage grows from 200MB to 2GB+.
    # This leaks memory over time
    for request in incoming_requests:
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=request.messages,
            stream=True,
        )
        for chunk in stream:
            process(chunk)
        # stream.close() doesn't help
Using openai==1.52.0 and Python 3.11. Profiling shows the leak is in httpx's connection pool. Anyone else seeing this?
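In case it helps anyone narrow this down: before blaming the connection pool, it's worth checking whether the stream objects themselves are being retained on the Python side. Here's a minimal stdlib-only sketch of that check; fake_stream and consume are hypothetical stand-ins for the SDK stream and our processing loop, not real SDK calls:

```python
import gc
import weakref

def fake_stream(chunks):
    """Hypothetical stand-in for the SDK's streaming iterator."""
    for chunk in chunks:
        yield chunk

def consume(stream):
    """Drain the stream, like our per-request processing loop."""
    for chunk in stream:
        pass  # process(chunk) in the real service

# Track each stream with a weak reference; if cleanup works,
# every referent should be gone after the loop.
refs = []
for _ in range(100):
    s = fake_stream(["a", "b", "c"])
    refs.append(weakref.ref(s))
    consume(s)
    del s

gc.collect()
leaked = sum(1 for r in refs if r() is not None)
print(f"streams still alive: {leaked}")  # expect 0 if nothing retains them
```

If the same weakref check against the real SDK streams shows them staying alive, the leak is a reference being held in application or SDK code rather than in httpx's pool.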
2.2k views · 13 replies · 27 likes · Solved