ARTICLES · 2026-04-28 · BY EFFLOOW CONTENT FACTORY

Claude Opus 4.7: Effort Controls and Migration Guide

Claude Opus 4.7 drops temperature and budget_tokens. Migrate to adaptive thinking and effort controls with this step-by-step API guide.

Claude Opus 4.7 shipped on April 16, 2026, and it comes with two hard breaking changes that will break any existing Opus 4.6 integration the moment you swap the model string. If your code sets temperature, top_p, or top_k, you will get a 400 error. If your code passes thinking.budget_tokens, same result.

This guide covers everything you need to migrate cleanly: what changed, why it changed, and the exact code to fix each issue. It also covers the new features worth adopting — effort controls, task budgets for agentic loops, and the expanded vision resolution ceiling.

Effloow Lab verified the SDK-level API surface using the anthropic Python SDK 0.97.0 running on macOS (Apple Silicon). See data/lab-runs/claude-opus-4-7-developer-guide-2026.md for the raw validation notes.

What's New in Claude Opus 4.7

Before diving into migration, here's the full picture of what changed.

Effort controls replace sampling parameters. Instead of tuning temperature to balance creativity vs. precision, you now set an effort level: low, medium, high, xhigh, or max. The model manages its own sampling internally based on the effort level and task complexity.

Adaptive thinking replaces fixed thinking budgets. Rather than manually allocating budget_tokens for extended thinking, you set thinking: {type: "adaptive"} and let the model decide how much reasoning to apply based on the difficulty of each request.

xhigh effort is a new level. It sits between high and max and is now the default in Claude Code across all subscription tiers. It's optimized for repeated tool calls, deep code search, and complex agentic tasks that need extended exploration without the full overhead of max.

High-res vision. Opus 4.7 accepts images up to 2,576 pixels on the long edge (approximately 3.75 megapixels), which is more than three times the resolution cap on prior Claude models. This matters for UI automation, document analysis, and computer-use tasks where detail in screenshots is the bottleneck.

Task budgets for agentic loops. A new beta feature (task-budgets-2026-03-13 header) lets you pass a total token budget across the full agentic loop — thinking, tool calls, tool results, and final output. The model sees a running countdown and self-adjusts its behavior to finish gracefully rather than running out mid-task.

| Feature | Opus 4.6 | Opus 4.7 |
| --- | --- | --- |
| Model string | claude-opus-4-6 | claude-opus-4-7 |
| Temperature support | Yes | No (400 error) |
| Thinking mode | enabled + budget_tokens | adaptive only |
| Effort levels | low / medium / high / max | low / medium / high / xhigh / max |
| Max image resolution | ~1,000px long edge | 2,576px (~3.75MP) |
| SWE-bench Verified | 82.1% | 87.6% |
| OSWorld (computer use) | 72.7% | 78.0% |
| Pricing (input/output) | $5/$25 per MTok | $5/$25 per MTok |
| Context / output | 1M / 128K | 1M / 128K |
| Availability | API, Bedrock, Vertex | API, Bedrock, Vertex, Foundry |

Breaking Change 1: Temperature Is Now Locked

Any call to claude-opus-4-7 that includes temperature, top_p, or top_k set to a non-default value returns HTTP 400. There is no warning — it fails immediately.

Before (Opus 4.6 — will break on Opus 4.7):

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=4096,
    temperature=0.7,       # ERROR on Opus 4.7: HTTP 400
    top_p=0.9,             # ERROR on Opus 4.7: HTTP 400
    messages=[
        {"role": "user", "content": "Write a product description."}
    ]
)

After (Opus 4.7 — correct):

import anthropic

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    # temperature removed — use effort level instead
    messages=[
        {"role": "user", "content": "Write a product description."}
    ]
)

The mental model shift: temperature tuned the randomness of token sampling. On Opus 4.7, you instead choose an effort level that maps to appropriate internal sampling behavior for the task type. Use low for classification or simple lookups, high for complex reasoning, and xhigh or max for deep agentic work.

If you relied on high temperature for creative diversity (e.g., generating multiple variants), the new pattern is to make multiple calls at high effort without temperature — the model will naturally vary outputs across calls.
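In code, the repeated-call pattern is a plain loop. A minimal sketch, assuming the Opus 4.7 behavior described above; generate_variants is a hypothetical helper, with the client passed in (e.g. an anthropic.Anthropic() instance) rather than constructed inside:

```python
def generate_variants(client, prompt: str, n: int = 3) -> list[str]:
    """Generate n independent variants by repeating the call.

    No temperature or top_p: on Opus 4.7 the model varies its own
    sampling internally, so each call naturally produces a different output.
    """
    variants = []
    for _ in range(n):
        response = client.messages.create(
            model="claude-opus-4-7",
            max_tokens=4096,
            # effort defaults to 'high' when output_config is omitted
            messages=[{"role": "user", "content": prompt}],
        )
        variants.append(response.content[0].text)
    return variants
```

Passing the client as a parameter also makes the helper trivial to exercise with a stub in tests.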

Breaking Change 2: budget_tokens Is Removed

Extended thinking on Opus 4.7 is adaptive only. The enabled type with a manual budget_tokens allocation is no longer accepted for this model. The old interleaved-thinking-2025-05-14 beta header is also no longer needed.

Before (Opus 4.6 — will break on Opus 4.7):

response = client.messages.create(
    model="claude-opus-4-6",
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000    # ERROR on Opus 4.7: HTTP 400
    },
    messages=[
        {"role": "user", "content": "Analyze this codebase for security issues."}
    ],
    betas=["interleaved-thinking-2025-05-14"]  # no longer needed on 4.7
)

After (Opus 4.7 — correct):

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    thinking={
        "type": "adaptive"         # Claude decides when and how much to think
    },
    messages=[
        {"role": "user", "content": "Analyze this codebase for security issues."}
    ]
    # No beta header needed
)

Adaptive thinking lets the model skip extended reasoning on simple requests and apply deep chains of thought on complex ones, within the same API call pattern. If you want to see the thinking trace, add "display": "summarized" to the thinking config. Without it, thinking blocks are still streamed but their thinking field is empty by default.

# Opt in to visible thinking summaries
thinking={
    "type": "adaptive",
    "display": "summarized"   # returns a summary of the reasoning chain
}

How to Use Effort Controls

Effort is set through the output_config parameter (or at the top level, depending on SDK version). Effloow Lab confirmed in SDK 0.97.0 that OutputConfigParam has two fields: effort and format.

import anthropic

client = anthropic.Anthropic()

# Example: high effort for a complex code review
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    thinking={"type": "adaptive"},
    messages=[
        {"role": "user", "content": "Review this PR for security vulnerabilities."}
    ]
    # effort defaults to 'high' when not specified
)

# Example: xhigh effort for an agentic coding loop
response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    thinking={"type": "adaptive"},
    output_config={
        "effort": "xhigh"
    },
    messages=[
        {"role": "user", "content": "Refactor the authentication module."}
    ]
)

Choosing the right effort level:

  • low — Fast, minimal reasoning. Use for classification, tagging, simple lookups, high-volume pipelines where marginal quality doesn't justify the extra tokens.
  • medium — Balanced. Solid reasoning without full token expenditure. Good for drafting, summarization, moderate complexity.
  • high — Default on the API. Claude's best reasoning for complex tasks: architecture decisions, nuanced analysis, multi-step code problems.
  • xhigh — Between high and max. Designed for extended tool-call chains, deep code exploration, and agentic tasks. Default in Claude Code on all plans.
  • max — No constraints. Maximum capability with the deepest reasoning and highest token usage. Reserved for the most demanding tasks.

When running at xhigh or max, set a large max_tokens; starting at 64K and tuning from there is a safe default.
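The guidance above can be collapsed into a simple lookup. A sketch; the task categories and the choose_effort helper are our own naming, not part of the SDK:

```python
# Hypothetical mapping from task category to Opus 4.7 effort level,
# following the guidance above. The category names are our own.
EFFORT_BY_TASK = {
    "classification": "low",
    "tagging": "low",
    "drafting": "medium",
    "summarization": "medium",
    "code_review": "high",
    "architecture": "high",
    "agentic_coding": "xhigh",
    "deep_research": "max",
}

def choose_effort(task: str) -> str:
    # Fall back to the API default ('high') for unknown task types.
    return EFFORT_BY_TASK.get(task, "high")

# Usage: output_config={"effort": choose_effort("agentic_coding")}
```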

Task Budgets for Agentic Loops

Task budgets are a beta feature that gives Claude a token estimate for its entire agentic loop, including thinking, tool calls, and final output. The model sees a running countdown and adjusts its behavior to finish gracefully rather than truncating mid-task.

import anthropic

client = anthropic.Anthropic()

response = client.beta.messages.create(
    model="claude-opus-4-7",
    max_tokens=128000,
    output_config={
        "effort": "high",
        "task_budget": {
            "type": "tokens",
            "total": 128000
        }
    },
    messages=[
        {"role": "user", "content": "Refactor the data pipeline and add tests."}
    ],
    betas=["task-budgets-2026-03-13"]
)

Important constraints: the minimum total is 20,000 tokens. The task budget is a soft limit — it is a target the model aims for, not a hard cap that truncates output. If the task requires more, the model may use additional tokens.

The practical value is in long agentic sessions. Without a task budget, Claude may allocate too many tokens early in the loop (deep reasoning on step one) and run short near the end. With a task budget, it paces itself across the full sequence of tool calls.
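Because the 20,000-token minimum is easy to trip over in config-driven code, a small guard before the call is cheap insurance. A sketch; build_task_budget is a hypothetical helper, and only the shape of the dict comes from the example above:

```python
MIN_TASK_BUDGET = 20_000  # minimum accepted 'total' for the beta feature

def build_task_budget(total: int) -> dict:
    """Build the task_budget dict for output_config, validating the minimum.

    The budget is a soft target rather than a hard cap, so there is
    deliberately no upper-bound check here.
    """
    if total < MIN_TASK_BUDGET:
        raise ValueError(
            f"task budget must be at least {MIN_TASK_BUDGET} tokens, got {total}"
        )
    return {"type": "tokens", "total": total}
```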

High-Resolution Vision

The roughly 3x increase in image resolution makes Opus 4.7 substantially more useful for computer-use tasks. Previous Claude models topped out around 1,000 pixels on the long edge. At up to 2,576 pixels on the long edge (about 3.75 megapixels), the model can read small UI text, identify fine details in diagrams, and process high-DPI screenshots without significant downsampling artifacts.

import anthropic
import base64

client = anthropic.Anthropic()

with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

response = client.messages.create(
    model="claude-opus-4-7",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    }
                },
                {
                    "type": "text",
                    "text": "What errors are visible in this screenshot?"
                }
            ]
        }
    ]
)

For computer-use agents, this translates directly to fewer misidentifications of UI elements — the model can distinguish between buttons, labels, and icons at the resolution of a standard 2x/Retina display without pre-processing.
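If you control the capture pipeline, it is worth checking screenshots against the new ceiling client-side before upload. A sketch; the limits are the figures quoted above, and downscale_factor is our own helper:

```python
MAX_LONG_EDGE = 2576      # Opus 4.7 long-edge ceiling, per the text above
MAX_PIXELS = 3_750_000    # ~3.75 megapixels

def downscale_factor(width: int, height: int) -> float:
    """Return the factor (<= 1.0) to multiply both dimensions by so the
    image fits within the resolution ceiling. 1.0 means no resize needed."""
    edge_scale = MAX_LONG_EDGE / max(width, height)
    pixel_scale = (MAX_PIXELS / (width * height)) ** 0.5
    return min(1.0, edge_scale, pixel_scale)
```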

Benchmark Context

Opus 4.7 improves meaningfully on the metrics that matter most to the developers likely to use it.

Coding: SWE-bench Verified climbs to 87.6% (from 82.1% on Opus 4.6) and SWE-bench Pro reaches 64.3%. These benchmarks measure the ability to resolve real-world GitHub issues autonomously — the gap is meaningful for teams running coding agents in CI pipelines.

Computer use: OSWorld-Verified moves from 72.7% to 78.0%, putting Opus 4.7 ahead of GPT-5.5 (75.0%) and within 2 points of Claude Mythos Preview (79.6%), which remains gated to select organizations.

One important note on cost: Anthropic's new tokenizer may produce up to 1.35x as many tokens as older models for equivalent text inputs, depending on content. Pricing is unchanged at $5/$25 per million tokens, but effective cost on token-heavy workloads may be up to 35% higher. Monitor actual token counts in your production logs after migrating.
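To put the tokenizer effect in dollar terms, here is a rough estimator using the published $5/$25 per MTok pricing and a configurable token multiplier. The helper name is ours; treat the output as a ballpark, not a bill:

```python
INPUT_PRICE_PER_MTOK = 5.00    # USD, per the pricing table above
OUTPUT_PRICE_PER_MTOK = 25.00

def estimate_cost(input_tokens: int, output_tokens: int,
                  tokenizer_multiplier: float = 1.0) -> float:
    """Estimate USD cost of a call. tokenizer_multiplier models the new
    tokenizer's expansion (1.0 to 1.35x vs. older models)."""
    eff_in = input_tokens * tokenizer_multiplier
    eff_out = output_tokens * tokenizer_multiplier
    return (eff_in * INPUT_PRICE_PER_MTOK
            + eff_out * OUTPUT_PRICE_PER_MTOK) / 1_000_000
```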

Availability

Claude Opus 4.7 is generally available on:

  • Anthropic API — direct via platform.anthropic.com
  • Amazon Bedrock — US East (N. Virginia), Asia Pacific (Tokyo), Europe (Ireland), Europe (Stockholm)
  • Google Cloud Vertex AI — preview
  • Microsoft Azure AI Foundry — available

The model ID is claude-opus-4-7 across all platforms. On Bedrock, the ARN-format model ID follows the region-specific Bedrock naming convention.

Migration Checklist

Run through this before switching any production integration to Opus 4.7:

  1. Remove temperature, top_p, top_k — delete them from all messages.create calls targeting Opus 4.7.
  2. Replace thinking: {type: "enabled", budget_tokens: N} with thinking: {type: "adaptive"}.
  3. Remove interleaved-thinking-2025-05-14 from betas — it is no longer needed.
  4. Replace output_format with output_config.format if your code used the old field.
  5. Add "display": "summarized" to thinking config if you were relying on reading thinking block content.
  6. Choose an effort level — the default is high, which matches the old default. Add output_config: {effort: "xhigh"} for agentic workloads.
  7. Increase max_tokens — Opus 4.7 supports 128K output. For agentic use, start at 64K or higher.
  8. Audit token usage — the new tokenizer may produce up to 35% more tokens on the same text inputs.
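Steps 1 through 4 are mechanical enough to script. A sketch of a hypothetical migrate_kwargs helper that rewrites a dict of Opus 4.6-style messages.create arguments; it does not cover the judgment calls in steps 5 through 8:

```python
def migrate_kwargs(kwargs: dict) -> dict:
    """Rewrite Opus 4.6-style messages.create kwargs for claude-opus-4-7."""
    out = dict(kwargs)
    out["model"] = "claude-opus-4-7"
    # Step 1: sampling parameters now return HTTP 400
    for key in ("temperature", "top_p", "top_k"):
        out.pop(key, None)
    # Step 2: enabled + budget_tokens -> adaptive
    if out.get("thinking", {}).get("type") == "enabled":
        out["thinking"] = {"type": "adaptive"}
    # Step 3: the interleaved-thinking beta header is no longer needed
    if "betas" in out:
        out["betas"] = [b for b in out["betas"]
                        if b != "interleaved-thinking-2025-05-14"]
        if not out["betas"]:
            del out["betas"]
    # Step 4: old output_format field -> output_config.format
    if "output_format" in out:
        out.setdefault("output_config", {})["format"] = out.pop("output_format")
    return out
```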

FAQ

Q: Can I still use temperature on older Claude models?

Yes. The temperature lock is specific to Claude Opus 4.7 and newer models. Calls to claude-opus-4-6, claude-sonnet-4-6, and earlier model strings continue to accept temperature as before.

Q: What happens to my existing Opus 4.6 budget_tokens code when the model is retired?

Anthropic has not announced an Opus 4.6 retirement date. Until then, keep Opus 4.6 calls as-is and migrate new workloads to Opus 4.7. Do not just change the model string without removing budget_tokens and temperature.

Q: Does xhigh effort cost more than high?

Effort levels control internal reasoning depth, which affects token usage. xhigh will generally produce more thinking tokens than high, increasing cost. The exact multiplier depends on task complexity — for simple requests the difference is small, for deep agentic work it can be significant.

Q: Is task budget available on all plans?

Task budgets require the task-budgets-2026-03-13 beta header. They are available on the Anthropic API across all plans but may not yet be available on all third-party providers (Bedrock, Vertex).

Q: How do I verify that adaptive thinking is actually being used?

Add "display": "summarized" to your thinking config and check for thinking blocks in the response. If thinking was applied, those blocks will contain a summary of the reasoning chain. If thinking was skipped (simple request), the blocks will be absent.
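Programmatically, that check is a scan of the response content blocks. A sketch; has_thinking is our own helper and assumes blocks expose type and thinking attributes, per the thinking-field behavior described earlier:

```python
def has_thinking(response) -> bool:
    """Return True if the response contains at least one thinking block
    with non-empty content (visible when display is 'summarized')."""
    return any(
        getattr(block, "type", None) == "thinking"
        and getattr(block, "thinking", "")
        for block in response.content
    )
```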

Key Takeaways

Claude Opus 4.7 makes two large breaking changes that affect any existing integration: temperature is locked and budget_tokens is removed. The migration is a clean substitution in both cases — remove temperature, replace enabled+budget_tokens with adaptive.

The new effort controls are worth adopting even if you weren't using temperature before. Choosing xhigh for agentic loops and high for reasoning tasks is a cleaner abstraction than manual sampling parameters, and it aligns with how Claude Code itself operates internally.

The vision resolution increase and task budget feature are the highest-value additions for teams running computer-use or long-horizon agentic workflows. Both require no architectural changes — just a model string update and the additional parameters covered above.

Bottom Line

Opus 4.7 is a meaningful upgrade for agentic and vision workloads, but the two breaking changes mean you cannot safely swap the model string without code changes. Fix temperature and budget_tokens first, then add effort controls — the migration is straightforward and the benchmark gains are real.
