Anthropic launches Claude Sonnet 5 as a cheaper way to run agents

As shipping agentic capabilities becomes table stakes among foundation model companies, Anthropic is releasing Claude Sonnet 5, a more powerful and agentic version of the lab’s midsize model.

“It can make plans, use tools like browsers and terminals, and run autonomously at a level that, just a few months ago, required larger and more expensive models,” Anthropic said in a blog post.

That framing mirrors what OpenAI and Google have said about their own new releases. OpenAI’s GPT-5.6 Sol was launched in preview last week, and it is also the firm’s most agentic model yet, allowing users to split work across subagents for longer autonomous tasks. Google’s Gemini 3.5 Flash, which launched in May, was pitched as a shift from a conversational chatbot to an agentic tool that plans, builds, and iterates on real work with minimal human input.

Sonnet 5’s pitch is confirmation that agentic capability is the new baseline expectation at every price tier. Now the differentiator isn’t going to be who can do agentic work best, but how cheaply they can do it and how reliably without human oversight.

Sonnet 5 promises performance close to that of Opus 4.8, but at a much lower cost. Starting Tuesday, Claude Sonnet 5 will be the default model for free and Pro plans, and is available for every subscription.

At launch, Sonnet 5 is priced at $2 per million input tokens and $10 per million output tokens through August 31, after which the price will jump to $3 per million input tokens and $10 per million output tokens. That makes Sonnet 5 cheaper than Opus 4.8, as well as OpenAI’s GPT-5.5 and Gemini 3.1 Pro. (It’s still more expensive than Gemini 3.5 Flash.)
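
To make the pricing concrete, here is a minimal sketch of how the launch and post-August 31 rates quoted above translate into a bill. The per-million-token prices come from the article; the workload size and the names `cost_usd`, `LAUNCH`, and `STANDARD` are hypothetical, chosen only for illustration, and real invoices may include factors not covered here.

```python
from decimal import Decimal

# Prices in USD per million tokens, as quoted in the article.
LAUNCH = {"input": Decimal("2"), "output": Decimal("10")}    # through August 31
STANDARD = {"input": Decimal("3"), "output": Decimal("10")}  # after August 31

MILLION = Decimal(1_000_000)


def cost_usd(prices, input_tokens, output_tokens):
    """Return the USD cost of a token count at per-million-token prices."""
    total = prices["input"] * input_tokens + prices["output"] * output_tokens
    return total / MILLION


# Hypothetical monthly workload (an assumption, not a figure from the article).
input_tokens = 50_000_000
output_tokens = 5_000_000

launch_cost = cost_usd(LAUNCH, input_tokens, output_tokens)
standard_cost = cost_usd(STANDARD, input_tokens, output_tokens)

print(f"launch pricing:   ${launch_cost}")
print(f"standard pricing: ${standard_cost}")
print(f"increase:         ${standard_cost - launch_cost}")
```

Because only the input rate changes in the figures quoted, the size of the increase depends entirely on how input-heavy a workload is.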

The new model also demonstrates significant improvements over its predecessor Sonnet 4.6, released in February, on agentic performance areas like reasoning, tool use, software coding, and knowledge work, according to Anthropic.

For example, on one benchmark, Sonnet 5 scores 63.2% on agentic coding, compared to Opus 4.8’s 69.2% and Sonnet 4.6’s 58.1%. On a knowledge work benchmark, Sonnet 5 actually slightly outperforms Opus 4.8, which is known for winning on the hardest problems, like making subtle judgment calls and deep research.

“Opus 4.8 is still the model of choice for higher accuracy on these tasks, but Sonnet 5 provides developers with lower-priced options that are of much higher quality than what was previously available,” Anthropic says. “Between Sonnet 5 and Opus 4.8, users can adjust the effort level to find the right balance of cost and performance.”

According to testers cited in the blog post, Sonnet 5 also excels at finishing complex tasks where previous model versions would have stopped short and “checks its own output without explicitly being asked.”

“We handed Claude Sonnet 5 a two-part job (update Salesforce account tiers, send a launch announcement to enterprise contacts) and it finished end to end,” Daniel Shepard, a senior engineer at Zapier, said in a statement. “That used to stall halfway. For day-to-day automation, it’s a no-brainer.”

On safety, Sonnet 5 also demonstrates a lower rate of “undesirable behaviors” like cooperation with misuse and deception than its predecessor, making it safer to use in agentic contexts. It’s better at refusing malicious requests and sidestepping hijack attempts in prompt injection attacks. It also hallucinates and engages in sycophantic behavior at a lower rate than Sonnet 4.6.

That said, it’s not on the same level as Opus 4.8 and Claude Mythos Preview when it comes to misaligned behavior. “Evaluations also show that it has a much lower ability to execute unsafe cybersecurity tasks than our current Opus models,” reads the blog post.

Lovable co-founder Fabian Hedin said in a statement that Claude Sonnet 5 “refuses unsafe requests cleanly and consistently.”

“At Lovable, we’re putting powerful tools in the hands of millions of builders,” Hedin said. “A model that knows when to say no is just as important as one that knows how to build.”
