Training data format for function calling fine-tuning
Priya Sharma, Feb 28, 2025
I'm trying to fine-tune a model to be better at function calling for my specific use case, but I'm struggling with the training data format.
The docs show:
{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in NYC?"},
    {"role": "assistant", "tool_calls": [{
      "id": "call_1",
      "type": "function",
      "function": {"name": "get_weather", "arguments": "{\"location\": \"NYC\"}"}
    }]},
    {"role": "tool", "tool_call_id": "call_1", "content": "72F, sunny"},
    {"role": "assistant", "content": "It's 72°F and sunny in NYC."}
  ],
  "tools": [{"type": "function", "function": {...}}]
}
But when I upload this, I get a validation error. What am I missing?
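The post doesn't include the error text, so the cause can only be guessed at. Assuming the upload expects JSONL (one complete JSON object per line), two things in a snippet like the one above would fail validation: a pretty-printed example spread over several lines, and the literal {...} placeholder, which is not valid JSON. Below is a minimal local pre-check sketch in Python. The helper names check_example and check_jsonl are hypothetical, and the rules it enforces are assumptions drawn from the example in the post, not the service's actual validator.

```python
import json

# Roles that appear in the example from the post (assumed to be the full set).
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def check_example(example):
    """Return a list of problems found in one training example (empty if none)."""
    messages = example.get("messages") if isinstance(example, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems = []
    seen_call_ids = set()
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: must be an object")
            continue
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {role!r}")
        for call in msg.get("tool_calls") or []:
            if not isinstance(call, dict):
                problems.append(f"message {i}: tool call must be an object")
                continue
            seen_call_ids.add(call.get("id"))
            func = call.get("function")
            args = func.get("arguments") if isinstance(func, dict) else None
            # In the post's example, 'arguments' is a JSON-encoded string,
            # not a nested object.
            if not isinstance(args, str):
                problems.append(f"message {i}: 'arguments' should be a JSON string")
                continue
            try:
                json.loads(args)
            except json.JSONDecodeError:
                problems.append(f"message {i}: 'arguments' is not valid JSON")
        # A tool result should answer a tool call made earlier in the conversation.
        if role == "tool" and msg.get("tool_call_id") not in seen_call_ids:
            problems.append(f"message {i}: tool_call_id has no matching tool call")
    return problems


def check_jsonl(text):
    """Check text in which each non-empty line should be one complete example."""
    problems = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            example = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {lineno}: not valid JSON ({err.msg})")
            continue
        problems.extend(f"line {lineno}: {p}" for p in check_example(example))
    return problems
```

Running check_jsonl over the file contents before uploading flags any line that is not a standalone JSON object. Writing each example with json.dumps(example) and no indent keeps it on a single line.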
1 Reply
For my use case (product descriptions), text-embedding-3-small was actually sufficient. The quality difference between small and large is negligible for short texts (<200 words). Saved us 5x on embedding costs.