# The Claude Model Family
Anthropic publishes three tiers of Claude models, each optimized for a different speed and capability tradeoff. Choosing the right model is one of the most impactful cost and latency decisions you will make.
## The Three Tiers
- **Claude Haiku** — Fastest and most affordable. Ideal for high-volume, latency-sensitive tasks where the reasoning requirements are modest.
- **Claude Sonnet** — Balanced performance. The workhorse for most production agentic workflows: capable enough for complex reasoning, fast enough for interactive use.
- **Claude Opus** — Most capable. Best for the hardest reasoning tasks, but slower and more expensive. Use sparingly, for tasks that genuinely require frontier intelligence.
## Model Selection in Code
```python
import anthropic

client = anthropic.Anthropic()

# High-volume classification or routing
haiku_response = client.messages.create(
    model="claude-haiku-4-5",
    max_tokens=256,
    messages=[{"role": "user", "content": "Classify this email as spam or not: ..."}],
)

# Production agent with tool use
sonnet_response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=4096,
    tools=my_tools,  # my_tools: a list of tool definitions declared elsewhere
    messages=[{"role": "user", "content": "Research and summarize the top 5 competitors."}],
)

# Complex reasoning or analysis
opus_response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Analyze the long-term strategic implications of..."}],
)
```
The same tier selection works in TypeScript; centralizing the model ids in one config object keeps them easy to swap:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Use environment variables or config to control model selection
const MODEL_CONFIGS = {
  routing: "claude-haiku-4-5",
  agent: "claude-sonnet-4-5",
  analysis: "claude-opus-4-5",
};

const response = await client.messages.create({
  model: MODEL_CONFIGS.agent,
  max_tokens: 4096,
  messages: [{ role: "user", content: userMessage }],
});
```
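The comment above suggests driving model choice from environment variables. A minimal sketch of that idea in Python, with fallback defaults — the variable names (`ROUTING_MODEL`, `AGENT_MODEL`, `ANALYSIS_MODEL`) are illustrative, not an SDK convention:

```python
import os

# Each tier can be overridden per-deployment without a code change;
# the second argument to os.environ.get is the default.
MODEL_CONFIGS = {
    "routing": os.environ.get("ROUTING_MODEL", "claude-haiku-4-5"),
    "agent": os.environ.get("AGENT_MODEL", "claude-sonnet-4-5"),
    "analysis": os.environ.get("ANALYSIS_MODEL", "claude-opus-4-5"),
}
```

This lets you, for example, downgrade a staging environment to Haiku everywhere by setting one variable, without touching application code.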
## Cost Comparison Strategy
A practical pattern for production: use Haiku to route and filter requests, Sonnet for the main agent loop, and Opus only when confidence is low after multiple retries.
```python
def smart_route(query: str) -> str:
    """Use Haiku to decide which model should handle this query."""
    routing_response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=16,
        system="Reply with only 'simple', 'standard', or 'complex'.",
        messages=[{"role": "user", "content": query}],
    )
    return routing_response.content[0].text.strip()
```
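The router's label then needs to be mapped to a model id. A sketch of that dispatch step — the `TIER_TO_MODEL` mapping and `model_for` helper are illustrative names, not part of the SDK:

```python
# Map the router's label to a model id (illustrative mapping).
TIER_TO_MODEL = {
    "simple": "claude-haiku-4-5",
    "standard": "claude-sonnet-4-5",
    "complex": "claude-opus-4-5",
}

def model_for(label: str) -> str:
    """Fall back to the balanced tier if the router emits an unexpected label."""
    return TIER_TO_MODEL.get(label.lower(), "claude-sonnet-4-5")
```

In use: `client.messages.create(model=model_for(smart_route(query)), ...)`. Escalation — retrying with `"complex"` when a Sonnet answer fails validation — plugs into the same helper.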
Matching model tier to task complexity is the single biggest lever for controlling costs in production.