# The Claude Model Family
Anthropic publishes three tiers of Claude models, each optimized for a different speed and capability tradeoff. Choosing the right model is one of the most impactful cost and latency decisions you will make.
## The Three Tiers
- **Claude Haiku** — Fastest and most affordable. Ideal for high-volume, latency-sensitive tasks where the reasoning requirements are modest.
- **Claude Sonnet** — Balanced performance. The workhorse for most production agentic workflows: capable enough for complex reasoning, fast enough for interactive use.
- **Claude Opus** — Most capable. Best for the hardest reasoning tasks, but slower and more expensive. Use sparingly, for tasks that genuinely require frontier intelligence.
## Model Selection in Code
```python
import anthropic

client = anthropic.Anthropic()

# High-volume classification or routing
haiku_response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Classify this email as spam or not: ..."}],
)

# Production agent with tool use
sonnet_response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=4096,
    tools=my_tools,  # my_tools: a list of tool definitions declared elsewhere
    messages=[{"role": "user", "content": "Research and summarize the top 5 competitors."}],
)

# Complex reasoning or analysis
opus_response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Analyze the long-term strategic implications of..."}],
)
```
The same tier selection works in TypeScript; centralizing the model ids in one config object keeps them easy to swap:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Use environment variables or config to control model selection
const MODEL_CONFIGS = {
  routing: "claude-haiku-4-5",
  agent: "claude-sonnet-4-5",
  analysis: "claude-opus-4-5",
};

const response = await client.messages.create({
  model: MODEL_CONFIGS.agent,
  max_tokens: 4096,
  messages: [{ role: "user", content: userMessage }],
});
```
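The comment above suggests driving model choice from environment variables. A minimal sketch of that idea in Python, with fallback defaults — the variable names (`ROUTING_MODEL`, `AGENT_MODEL`, `ANALYSIS_MODEL`) are illustrative, not an SDK convention:

```python
import os

# Each tier can be overridden per-deployment without a code change;
# the second argument to os.environ.get is the default.
MODEL_CONFIGS = {
    "routing": os.environ.get("ROUTING_MODEL", "claude-haiku-4-5"),
    "agent": os.environ.get("AGENT_MODEL", "claude-sonnet-4-5"),
    "analysis": os.environ.get("ANALYSIS_MODEL", "claude-opus-4-5"),
}
```

This lets you, for example, downgrade a staging environment to Haiku everywhere by setting one variable, without touching application code.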
## Cost Comparison Strategy
A practical pattern for production: use Haiku to route and filter requests, Sonnet for the main agent loop, and Opus only when confidence is low after multiple retries.
```python
def smart_route(query: str) -> str:
    """Use Haiku to decide which model should handle this query."""
    routing_response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=16,
        system="Reply with only 'simple', 'standard', or 'complex'.",
        messages=[{"role": "user", "content": query}],
    )
    return routing_response.content[0].text.strip()
```
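The router's label then needs to be mapped to a model id. A sketch of that dispatch step — the `TIER_TO_MODEL` mapping and `model_for` helper are illustrative names, not part of the SDK:

```python
# Map the router's label to a model id (illustrative mapping).
TIER_TO_MODEL = {
    "simple": "claude-haiku-4-5",
    "standard": "claude-sonnet-4-5",
    "complex": "claude-opus-4-5",
}

def model_for(label: str) -> str:
    """Fall back to the balanced tier if the router emits an unexpected label."""
    return TIER_TO_MODEL.get(label.lower(), "claude-sonnet-4-5")
```

In use: `client.messages.create(model=model_for(smart_route(query)), ...)`. Escalation — retrying with `"complex"` when a Sonnet answer fails validation — plugs into the same helper.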
Matching model tier to task complexity is the single biggest lever for controlling costs in production.