Cost of fine-tuning vs in-context learning at scale

Robert Chang
Feb 28, 2026

We ran the numbers on fine-tuning GPT-4o-mini versus few-shot prompting with GPT-4o for our classification task (10K requests/day).

Option A: Few-shot GPT-4o

  • 5 examples in each prompt (~800 tokens overhead)
  • $2.50/M input tokens
  • Daily cost: ~$75
Option B: Fine-tuned GPT-4o-mini

  • No examples needed (~800 prompt tokens saved per request)
  • $0.30/M input tokens (mini) + $25 training cost
  • Daily cost: ~$9
Quality comparison

  • Few-shot GPT-4o: 91% accuracy
  • Fine-tuned mini: 89% accuracy
The fine-tuned model saves us ~$2K/month for only a 2-point accuracy drop (91% → 89%), and the $25 training cost pays for itself in less than a day.

Fine-tuning at scale is an absolute no-brainer for well-defined tasks.
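The arithmetic above can be sketched as a small cost model. One caveat: the ~3,000 input tokens per request is an assumption, back-solved from $75/day at $2.50/M over 10K requests; it is not stated in the post. The sketch shows the fine-tuned cost both at the same prompt length (which reproduces the ~$9/day figure) and with the ~800-token example block stripped out.

```python
def daily_cost(requests_per_day: int, tokens_per_request: int,
               price_per_m_tokens: float) -> float:
    """Daily input-token cost in dollars."""
    return requests_per_day * tokens_per_request / 1e6 * price_per_m_tokens

REQS = 10_000          # requests/day, from the post
TOKENS = 3_000         # assumed input tokens/request (back-solved from $75/day)
EXAMPLE_OVERHEAD = 800 # few-shot example block, from the post

few_shot = daily_cost(REQS, TOKENS, 2.50)                      # ~$75/day
ft_same_prompt = daily_cost(REQS, TOKENS, 0.30)                # ~$9/day
ft_trimmed = daily_cost(REQS, TOKENS - EXAMPLE_OVERHEAD, 0.30) # ~$6.60/day

# Days for the $25 training run to pay for itself (conservative case).
break_even_days = 25.0 / (few_shot - ft_same_prompt)

print(f"few-shot GPT-4o:      ${few_shot:.2f}/day")
print(f"fine-tuned mini:      ${ft_same_prompt:.2f}/day (same prompt length)")
print(f"fine-tuned, trimmed:  ${ft_trimmed:.2f}/day")
print(f"training recouped in  {break_even_days:.2f} days")
```

Even in the conservative case (no prompt trimming), the $25 training run is recouped in well under a day, which is where the "pays for itself in less than a day" claim comes from.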
