
Anthropic released Claude Sonnet 4.6, describing it as “our most capable Sonnet model yet.” The model is now the default for free and Pro users in both Anthropic’s Claude AI chatbot and its Claude Cowork productivity tool, replacing Sonnet 4.5.
The release comes just twelve days after Anthropic launched Claude Opus 4.6, its flagship model, highlighting the pace at which the company is pushing new releases as it competes with OpenAI, Meta, and Google.
Flagship-Level Performance at a Lower Cost
The headline claim Anthropic is making with Sonnet 4.6 is straightforward: it performs at near-Opus quality without the Opus price tag. Pricing remains unchanged from Sonnet 4.5, starting at $3 per million input tokens and $15 per million output tokens.
The lower price relative to Opus makes Sonnet 4.6 a practical choice for businesses running high-volume AI workloads. Anthropic reports that Sonnet 4.6 excels on orchestration evaluations, handles the most complex agentic workloads, and continues to improve at higher effort settings. On OfficeQA, a benchmark that tests how well a model reads enterprise documents such as charts, PDFs, and tables, Sonnet 4.6 matches Opus 4.6.
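To make the quoted pricing concrete, here is a minimal sketch of a cost estimate. It assumes that "$3/$15" means $3 per million input tokens and $15 per million output tokens, and it ignores any higher pricing tiers implied by "starting at". The function name and the example workload are hypothetical.

```python
# Illustrative cost estimate at the list prices quoted in the article.
# Assumption: "$3/$15" means $3 per million input tokens and $15 per
# million output tokens; any higher pricing tiers are ignored.

INPUT_USD_PER_MILLION = 3.0
OUTPUT_USD_PER_MILLION = 15.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for a given token volume."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 500K output tokens.
print(estimate_cost_usd(2_000_000, 500_000))  # 13.5
```

Output tokens cost five times as much as input tokens at these rates, so workloads that generate long responses will be dominated by the output side of the bill.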
Stronger Coding and Developer Preference
Coding improvements are one of the most concrete things Anthropic points to with this release. In Claude Code, early testing found that users preferred Sonnet 4.6 over Sonnet 4.5 roughly 70% of the time. Users even preferred Sonnet 4.6 to Opus 4.5 59% of the time.
Developers also rated Sonnet 4.6 as significantly less prone to overengineering and “laziness,” and meaningfully better at instruction following. They reported fewer false claims of success, fewer hallucinations, and more consistent follow-through on multi-step tasks. On the SWE-bench Verified coding benchmark, Sonnet 4.6 scored 79.6%, comparable to other frontier models.
Computer use, the model’s ability to navigate a screen, click, type, and interact with software the way a person would, has been one of Anthropic’s key areas of development, and the numbers reflect that. On the OSWorld benchmark, Claude 3.5 Sonnet scored 14.9% in October 2024. Claude 3.7 Sonnet reached 28.0% in February 2025, Sonnet 4 hit 42.2% in June 2025, and Sonnet 4.5 climbed to 61.4% in October 2025. Sonnet 4.6 has now reached 72.5%, nearly a fivefold improvement in 16 months.
Anthropic acknowledges that the model still falls short of the most skilled humans in this area, but early users are already reporting human-level capability in tasks such as navigating complex spreadsheets and completing multi-step web forms.
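The "nearly fivefold in 16 months" figure can be checked from the OSWorld scores quoted above. The sketch below is only an arithmetic check. It assumes the June and October results fall in 2025 and that the Sonnet 4.6 result dates to February 2026, which is inferred from the 16-month span and not stated directly.

```python
# OSWorld scores as quoted in the article: (label, (year, month), score in %).
# Assumption: the 2025 dates and the February 2026 date for Sonnet 4.6 are
# inferred from the article's "16 months" span, not stated outright.
scores = [
    ("Sonnet 3.5", (2024, 10), 14.9),
    ("Sonnet 3.7", (2025, 2), 28.0),
    ("Sonnet 4", (2025, 6), 42.2),
    ("Sonnet 4.5", (2025, 10), 61.4),
    ("Sonnet 4.6", (2026, 2), 72.5),
]


def months_between(start, end):
    """Whole months between two (year, month) pairs."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


first, last = scores[0], scores[-1]
improvement_ratio = last[2] / first[2]
elapsed_months = months_between(first[1], last[1])

# Percentage-point gain from each release to the next.
gains = [
    (later[0], round(later[2] - earlier[2], 1))
    for earlier, later in zip(scores, scores[1:])
]

print(f"{improvement_ratio:.2f}x over {elapsed_months} months")
for label, gain in gains:
    print(f"{label}: +{gain} points")
```

The ratio works out to about 4.87, which matches the "nearly fivefold" description. The per-release gains also show that the latest step (11.1 points) is the smallest of the four in absolute terms.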
Safety and Availability
Anthropic ran extensive safety evaluations, which showed Sonnet 4.6 to be as safe as, or safer than, its other recent models. On prompt injection resistance, where malicious content attempts to hijack the model’s behavior, Sonnet 4.6 is a major improvement over Sonnet 4.5 and performs similarly to Opus 4.6.
The model is available across all Claude plans, Claude Code, the API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
With Sonnet 4.6 now covering a performance range that previously required Opus-level models, the key question for Anthropic’s competitors is how much longer flagship pricing can hold up as a differentiator when a cheaper model delivers comparable results.
