Available Now — February 5, 2026

Meet Claude Opus 4.6

Anthropic's most advanced AI model. Elite coding, million-token context, adaptive reasoning, and autonomous agent teams — redefining what's possible.

1M Token Context
128K Output Tokens
90.2% BigLaw Bench
2× vs Opus 4.5 on Bio

Built for the hardest problems

Opus 4.6 focuses attention where it matters most — automatically applying deeper reasoning to challenging components without explicit instruction.

Elite Coding

Highest score on Terminal-Bench 2.0 for agentic coding. Operates reliably in large codebases with sustained autonomous task execution.

1M Token Context

First Opus-class model with million-token context. 76% accuracy on 8-needle MRCR v2 retrieval, versus Sonnet 4.5's 18.5%.

Safety Aligned

Lowest rate of over-refusals among recent Claude versions, with strong alignment and low rates of misaligned behavior across evaluations.

Context Compaction

Automatically summarizes older context during extended sessions, enabling longer productive work without hitting context limits.

Information Retrieval

Leads frontier models on BrowseComp for locating hard-to-find information, and on Humanity's Last Exam for multidisciplinary reasoning.

State-of-the-art benchmarks

Opus 4.6 sets new records across coding, reasoning, knowledge work, and long-context retrieval.

GDPval-AA

Finance, Legal, Technical
Opus 4.6: +144 Elo vs GPT-5.2
GPT-5.2: baseline
Opus 4.5: -190 Elo

BigLaw Bench

Legal Reasoning
90.2% accuracy
40% perfect scores across legal tasks

MRCR v2 (8-needle, 1M)

Long-Context Retrieval
76%
Opus 4.6
18.5%
Sonnet 4.5

Terminal-Bench 2.0

Agentic Coding
Highest score among all frontier models tested
2× improvement over Opus 4.5 on computational biology

Agent Teams

Coordinate multiple Claude Code instances working as a team. One lead orchestrates, teammates work in parallel, and a shared task system keeps everything synchronized.

Team Lead: orchestrates & delegates
Security: reviews vulnerabilities
Performance: optimizes speed
Testing: validates coverage
Frontend: builds UI components

Shared task list: auth module review, API optimization, integration tests, deploy pipeline

1. Parallel Exploration

Multiple teammates investigate different aspects simultaneously — competing hypotheses, different review lenses, or independent modules — then share and challenge findings.

2. Direct Messaging

Unlike subagents that only report back, teammates message each other directly. They debate approaches, share discoveries, and coordinate without bottlenecking through the lead.

3. Plan Approval Gates

Require teammates to present their plans before implementing. The lead reviews each plan and approves or rejects it with feedback, ensuring quality control before any code changes.

4. Delegate Mode

Restrict the lead to coordination-only tools. It focuses purely on orchestration — breaking down work, assigning tasks, synthesizing results — while teammates handle implementation.

5. Flexible Display

Run all teammates in-process within your terminal, or use tmux/iTerm2 split panes to see everyone's output at once. Message any teammate directly at any time.

6. Smart Task Dependencies

Tasks can depend on other tasks. When a blocking task completes, dependents automatically unblock. File locking prevents race conditions during concurrent claims.
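
The dependency and unblocking behavior can be sketched with a small in-process model. This is illustrative only, not Claude Code's implementation: it uses a thread lock in place of the file locking described above, and all class and method names are hypothetical.

```python
import threading

class TaskList:
    """Illustrative shared task list: tasks unblock when dependencies complete."""

    def __init__(self):
        # Stand-in for the file locking that guards concurrent claims.
        self._lock = threading.Lock()
        self._deps = {}       # task name -> set of unfinished dependencies
        self._done = set()
        self._claimed = set()

    def add(self, name, deps=()):
        with self._lock:
            self._deps[name] = {d for d in deps if d not in self._done}

    def claim(self):
        """Atomically claim one unblocked, unclaimed task, or return None."""
        with self._lock:
            for name, deps in self._deps.items():
                if not deps and name not in self._claimed and name not in self._done:
                    self._claimed.add(name)
                    return name
            return None

    def complete(self, name):
        """Mark a task done; dependents drop it from their dependency sets."""
        with self._lock:
            self._done.add(name)
            for deps in self._deps.values():
                deps.discard(name)

tasks = TaskList()
tasks.add("auth module review")
tasks.add("integration tests", deps=["auth module review"])

assert tasks.claim() == "auth module review"   # only unblocked task
assert tasks.claim() is None                   # "integration tests" still blocked
tasks.complete("auth module review")
assert tasks.claim() == "integration tests"    # dependent automatically unblocked
```

The lock around claim-and-mark is the essential piece: two teammates polling the list concurrently can never both claim the same task.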

Best use cases

🔍 Parallel Code Review

Three reviewers with distinct lenses: security, performance, and test coverage. Each applies a focused filter while the lead synthesizes across all findings.

🔧 Competing Hypotheses

Five investigators each explore a different theory for a bug, actively trying to disprove each other through scientific debate. The theory that survives scrutiny is treated as the root cause.

🛠 Cross-Layer Features

Frontend, backend, and test teammates each own their layer. They coordinate via the shared task list and direct messages without stepping on each other's files.

Specifications & Pricing

Everything you need to integrate Opus 4.6 into your applications.

Model Access

Model ID: claude-opus-4-6
Platforms: claude.ai, API, AWS, GCP
Context Window: 1M tokens (beta)
Max Output: 128K tokens
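
The model ID above maps directly onto a Messages API request. A minimal sketch, assuming the public shape of the Anthropic Messages API; the prompt text is a placeholder.

```python
# Minimal request body for the Anthropic Messages API, using the model ID
# from the access table above. The prompt is illustrative only.
request = {
    "model": "claude-opus-4-6",
    "max_tokens": 1024,  # may be raised, up to the 128K output limit
    "messages": [
        {"role": "user", "content": "Summarize the trade-offs of context compaction."}
    ],
}

# With the official `anthropic` Python SDK this becomes:
#   client = anthropic.Anthropic()            # reads ANTHROPIC_API_KEY
#   response = client.messages.create(**request)
#   print(response.content[0].text)
```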

Pricing

Input (standard): $5 / M tokens
Output (standard): $25 / M tokens
Input (>200K): $10 / M tokens
Output (>200K): $37.50 / M tokens
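
These rates plug into a simple per-request cost estimate. A sketch only, which assumes (the table does not say) that the >200K rates apply to the entire request once the input exceeds 200K tokens.

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from the rate card above.

    Assumption: the >200K rates apply to the whole request once the
    input exceeds 200K tokens.
    """
    if input_tokens > 200_000:
        in_rate, out_rate = 10.00, 37.50   # >200K pricing, $ per M tokens
    else:
        in_rate, out_rate = 5.00, 25.00    # standard pricing, $ per M tokens
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate

# 100K input / 10K output at standard rates:
print(f"${estimate_cost(100_000, 10_000):.2f}")  # $0.75
```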

API Features

Adaptive Thinking
Effort Levels
Context Compaction: Beta
US-Only Inference: 1.1× pricing

Safety

Alignment: Matches/exceeds Opus 4.5
Over-refusals: Lowest among Claude models
Cyber Probes: 6 new evaluations
Misalignment: Low across all evals

Ready to build with Opus 4.6?

Access the most advanced Claude model through the API, Claude Code, or claude.ai.