Python SDK 1.x: Memory leak in streaming responses

Nina Patel
Mar 12, 2026

We're seeing a memory leak in a long-running service that uses streaming responses. After roughly 1,000 streaming requests, memory usage grows from ~200 MB to over 2 GB.

# This leaks memory over time
from openai import OpenAI

client = OpenAI()  # created once at startup, reused for all requests

for request in incoming_requests:
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=request.messages,
        stream=True,
    )
    for chunk in stream:
        process(chunk)
    # calling stream.close() after the loop doesn't help

Using openai==1.52.0 and Python 3.11. Profiling shows the leak is in httpx's connection pool. Anyone else seeing this?
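One thing we've tried as a workaround is guaranteeing the stream gets closed even when `process(chunk)` raises mid-stream, using `contextlib.closing`. Here's a minimal sketch of the pattern; `DummyStream` is a hypothetical stand-in for the SDK's stream object (just enough to show the cleanup guarantee), since the real question is whether an unclosed stream is what's pinning connections in the httpx pool.

```python
import contextlib

class DummyStream:
    """Hypothetical stand-in for the SDK's Stream object; tracks close()."""
    def __init__(self, chunks):
        self._chunks = iter(chunks)
        self.closed = False

    def __iter__(self):
        return self._chunks

    def close(self):
        self.closed = True

def consume(stream, process):
    # contextlib.closing calls stream.close() on exit,
    # even if process() raises partway through the stream
    with contextlib.closing(stream) as s:
        for chunk in s:
            process(chunk)

received = []
stream = DummyStream(["a", "b", "c"])
consume(stream, received.append)
# stream.closed is now True regardless of how consume() exited
```

This guarantees cleanup on the happy path and on exceptions alike, but in our service it hasn't stopped the pool growth so far, which is why I suspect the leak is below the stream object itself.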

