Models & Pricingupdate

Claude Opus 4.8: what's new, and why migration is easy this time

In brief

Anthropic's new flagship launched May 28, 2026. Pricing is unchanged at $5/$25. Effort now defaults to high, fast mode is 3x cheaper, and a 1M context window ships by default. Unlike the 4.6→4.7 jump, there are no new breaking changes — moving from Opus 4.7 is a model-name swap.

8 min read·Large Language Model

Contents

♡Sign in to save

Claude Opus 4.8 launched May 28, 2026. It is Anthropic's most capable generally available model, and the model ID is claude-opus-4-8. Pricing is unchanged from Opus 4.7: $5 per million input tokens, $25 per million output tokens. It is available on the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

The headline for anyone already on Opus 4.7: this is the easy upgrade. Opus 4.8 supports the same set of tools and platform features as 4.7, and there are no new API breaking changes. If your code already runs on Opus 4.7, switching is a model-name swap — change claude-opus-4-7 to claude-opus-4-8 and re-test.

What's new

Effort defaults to high. On Opus 4.8 the effort parameter defaults to high across every surface — Claude Code and the Messages API included. On 4.7 you had to opt into higher effort; now the strong default is the floor, and you dial down (low, medium) when you want speed. The full ladder is low, medium, high, xhigh, max.

1M token context by default. The 1M token context window is on by default on the Claude API, Bedrock, and Vertex AI (200k on Microsoft Foundry) at standard pricing — no beta header. Maximum output is 128k tokens.

Smarter adaptive thinking. With adaptive thinking enabled, Opus 4.8 triggers reasoning only when a turn actually needs it. At the same effort level, it spends fewer thinking tokens than 4.7 — you pay for reasoning where it helps and skip it where it doesn't.

Fast mode, 3x cheaper. Fast mode for Opus 4.8 runs at roughly 2.5x the speed of standard generation and is priced at $10/$50 per million tokens — three times cheaper than fast mode on previous Opus models. It is a research preview on the Claude API; in Claude Code, Max-plan users now default to fast mode on Opus 4.8. (Fast mode for Opus 4.6 is deprecated, with removal about 30 days after launch — migrate to 4.8 or 4.7.)

Mid-conversation system messages. You can now send role: "system" messages after a user turn, in the middle of a long-running session, to change instructions without breaking your prompt cache. Previously, changing the system prompt mid-session invalidated the cache from that point on. No beta header required. This matters for long agent loops where the rules shift partway through a task.

Lower cache threshold. The minimum cacheable prompt length drops to 1,024 tokens on Opus 4.8 (it was higher on 4.7), so more of your prompts qualify for caching.

Documented refusal categories. Refusal responses now expose a stop_details field with a category (cyber, bio, or null) and a human-readable explanation, so your app can route different classes of refusal to the right next step instead of treating every block the same.

How it performs

Anthropic reports Opus 4.8 outperforms GPT-5.5 on at least 12 benchmarks. The two most relevant to practitioners:

Agentic coding — SWE-Bench Pro: 69.2% (vs. 58.6% for GPT-5.5).
Computer use — OSWorld-Verified: 83.4% (vs. 78.7% for GPT-5.5). On the Online-Mind2Web browser-agent benchmark it scores 84%.

Two quality claims worth noting for production work. Anthropic says 4.8 is about 4x less likely than 4.7 to let a code flaw pass unremarked in review — directly relevant if you use Claude for code review or security scanning. And it reports improved honesty: 4.8 is better at flagging its own uncertainty and avoiding confident, unsupported claims — the failure mode behind most hallucination incidents.

What carries over from 4.7

These were introduced with Opus 4.7 and still apply, so if you migrated to 4.7 you've already handled them:

Sampling parameters are gone. Setting temperature, top_p, or top_k to a non-default value returns a 400 error. Use structured outputs or prompt instructions for determinism instead.
High-resolution vision. Images up to 2,576 pixels on the long edge (~3.75 MP). Full-resolution images cost more tokens — budget accordingly.
Task budgets, advisor tool, computer use all support 4.8.

If you are still on Opus 4.6 or earlier, read Migrating from Opus 4.6 to Opus 4.7 first — those breaking changes (extended thinking removal, prefill removal, the new tokenizer) apply on the way to 4.8 as well.

The biggest new capability isn't in the model

Alongside Opus 4.8, Claude Code shipped Dynamic Workflows (research preview): Claude can now author a multi-step plan and orchestrate work across tens — up to roughly a thousand — subagents running in the background, for jobs like a codebase-scale migration. That's a large enough shift to cover on its own: see Dynamic Workflows in Claude Code.

Who should upgrade, and when

Now, if you're on Opus 4.7. It's a model-name swap with no new breaking changes. Re-test because the defaults changed (effort is now high, which can raise latency and token use on simple calls) — but there's no code to rewrite.

After a test pass, if you have tuned 4.7 prompts in production. The honesty and effort-calibration changes can shift outputs. Run your eval suite before flipping the default.

Full migration guidance: platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-8

Try this today

Swap one non-critical service to claude-opus-4-8, leave everything else identical, and watch two numbers for a day: cost per request (effort now defaults to high, so this may rise) and your escaped-error or eval-pass rate. If quality climbs and cost is acceptable, roll it forward. If cost jumps on simple calls, set effort: "medium" on those routes — you don't have to take the new default everywhere.