Context window management strategies for long conversations

Emma Rodriguez
Jan 28, 2026

When building chatbots, you eventually hit the context window limit. Here are strategies I've used:

1. Sliding window

Keep only the last N messages. Simple, but anything said before the window is lost.
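A minimal sketch of the sliding window, assuming messages are dicts with "role" and "content" keys like in the summarization code in this post. The name sliding_window and the choice to always keep a leading system prompt are my own assumptions, not something any particular library does for you:

```python
def sliding_window(messages, n=10):
    """Keep a leading system prompt (if present) plus the last n other messages."""
    if not messages:
        return []
    # Treat the first message as the system prompt only if it is marked as one
    head = messages[:1] if messages[0].get("role") == "system" else []
    tail = messages[len(head):]
    # Guard n <= 0: tail[-0:] would return the whole list, not an empty one
    return head + tail[-n:] if n > 0 else head
```

Keeping the system prompt outside the window avoids the bot forgetting its instructions once the conversation gets longer than N messages.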

2. Summarization

Periodically summarize older messages and prepend the summary.

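def manage_context(messages, max_tokens=100000, keep_recent=10):
    # count_tokens() and summarize() are helpers you supply
    # (a tokenizer-based counter and an LLM summarization call).
    total = count_tokens(messages)
    if total <= max_tokens:
        return messages

    # Assumes messages[0] is the system prompt; split the rest into old and recent
    system, rest = messages[0], messages[1:]
    recent = rest[-keep_recent:]
    old = rest[:-keep_recent]
    if not old:
        # Nothing older than the recent window, so there is nothing to summarize
        return messages

    # Summarize old messages and put the summary right after the system prompt
    summary = summarize(old)
    return [system, {"role": "system", "content": f"Previous conversation summary: {summary}"}] + recent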
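The function above calls count_tokens and summarize, which I didn't define. Here are placeholder versions so the snippet can be run end to end. Both are assumptions for illustration: the 4-characters-per-token figure is only a rough rule of thumb for English text, and a real summarize would call an LLM rather than truncate.

```python
def count_tokens(messages):
    # Rough estimate (assumption): about 4 characters per token.
    # Swap in your model's real tokenizer for accurate counts.
    return sum(len(m["content"]) for m in messages) // 4


def summarize(messages, max_chars=500):
    # Placeholder (assumption): a real implementation would ask an LLM
    # for a summary. This just joins the messages and truncates the result.
    text = " ".join(f'{m["role"]}: {m["content"]}' for m in messages)
    return text[:max_chars]
```

With stand-ins like these you can test the trimming logic offline before wiring in a real tokenizer and model call.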

3. RAG over conversation history

Embed all messages, then retrieve the most relevant ones for each new query.
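A rough sketch of the retrieval step. To keep it self-contained, embed here is a toy word-count vector standing in for a real embedding model, and the names embed, cosine, and retrieve are hypothetical. In practice you would call an embedding model and probably cache the vectors in a vector store instead of recomputing them for every query:

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy stand-in (assumption) for a real embedding model: word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(messages, query, k=3):
    """Return the k messages most similar to the query, in original order."""
    q = embed(query)
    scored = [(cosine(embed(m["content"]), q), i) for i, m in enumerate(messages)]
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    # Restore chronological order so the model sees turns in sequence
    return [messages[i] for _, i in sorted(top, key=lambda s: s[1])]
```

The retrieved messages would then go into the prompt alongside the system prompt and the most recent few turns.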

I use approach #2 for most chatbots. What do you use?

