Cost of fine-tuning vs in-context learning at scale

Robert Chang
Feb 28, 2026

We ran the numbers on fine-tuning GPT-4o-mini versus few-shot prompting with GPT-4o for our classification task (10K requests/day).

Option A: Few-shot GPT-4o

  • 5 examples in each prompt (~800 tokens overhead)
  • $2.50/M input tokens
  • Daily cost: ~$75
Option B: Fine-tuned GPT-4o-mini

  • No examples needed (~800 prompt tokens saved per request)
  • $0.30/M input tokens (mini) + $25 training cost
  • Daily cost: ~$9
Quality comparison

  • Few-shot GPT-4o: 91% accuracy
  • Fine-tuned mini: 89% accuracy
The fine-tuned model saves us ~$2K/month for only a 2-point accuracy drop (91% → 89%), and the $25 training cost pays for itself in less than a day.

Fine-tuning at scale is an absolute no-brainer for well-defined tasks.
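The arithmetic above can be sketched as a small cost model. One caveat: the ~3,000 input tokens per request is an assumption, back-solved from $75/day at $2.50/M over 10K requests; it is not stated in the post. The sketch shows the fine-tuned cost both at the same prompt length (which reproduces the ~$9/day figure) and with the ~800-token example block stripped out.

```python
def daily_cost(requests_per_day: int, tokens_per_request: int,
               price_per_m_tokens: float) -> float:
    """Daily input-token cost in dollars."""
    return requests_per_day * tokens_per_request / 1e6 * price_per_m_tokens

REQS = 10_000          # requests/day, from the post
TOKENS = 3_000         # assumed input tokens/request (back-solved from $75/day)
EXAMPLE_OVERHEAD = 800 # few-shot example block, from the post

few_shot = daily_cost(REQS, TOKENS, 2.50)                      # ~$75/day
ft_same_prompt = daily_cost(REQS, TOKENS, 0.30)                # ~$9/day
ft_trimmed = daily_cost(REQS, TOKENS - EXAMPLE_OVERHEAD, 0.30) # ~$6.60/day

# Days for the $25 training run to pay for itself (conservative case).
break_even_days = 25.0 / (few_shot - ft_same_prompt)

print(f"few-shot GPT-4o:      ${few_shot:.2f}/day")
print(f"fine-tuned mini:      ${ft_same_prompt:.2f}/day (same prompt length)")
print(f"fine-tuned, trimmed:  ${ft_trimmed:.2f}/day")
print(f"training recouped in  {break_even_days:.2f} days")
```

Even in the conservative case (no prompt trimming), the $25 training run is recouped in well under a day, which is where the "pays for itself in less than a day" claim comes from.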
