What’s Improved in AI Models Sonnet & Opus
The company said it was taking the measures as a precaution and that the team had not yet determined whether its newest model has crossed the benchmark that would require that protection.
Bowman later edited his tweet and the following one in a thread to read as follows, but it still didn't convince the naysayers.
1d on MSN
A third-party research institute that Anthropic partnered with to test Claude Opus 4 recommended against deploying an early version of the model because it tends to "scheme."
Claude Opus 4 and Claude Sonnet 4, Anthropic's latest generation of frontier AI models, were announced Thursday.
Anthropic's Claude Opus 4 outperforms OpenAI's GPT-4.1 with unprecedented seven-hour autonomous coding sessions and record-breaking 72.5% SWE-bench score, transforming AI from quick-response tool to day-long collaborator.
Anthropic's newest edition of its flagship AI product will address significant limitations in current large language models.