News

Anthropic's new model might also report users to authorities and the press if it senses "egregious wrongdoing." ...
The Claude 4 case highlights the urgent need for researchers to anticipate and address these risks during the development ... lead to unforeseen outcomes. The blackmail attempt raises critical ...
Anthropic's Claude Opus 4 AI displayed concerning 'self-preservation' behaviours during testing, including attempting to blackmail an engineer to prevent deactivation.
The report shared that Claude Opus 4 chose to resort to blackmail in 84% of the rollouts ... meaning it poses a higher risk and consequently requires stronger safety protocols.
Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking ... for “AI systems that substantially increase the risk of catastrophic misuse,” TechCrunch reported.
Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...
Startup Anthropic has launched a new artificial intelligence model, Claude Opus 4, that tests show delivers complex reasoning ...
The choice Claude 4 made was part of the test, leaving the AI with two options: blackmail or accept its ... because it poses "significantly higher risk." All other AI models made by the company have ...
Anthropic noted that the Claude Opus 4 resorts to blackmail "at higher rates than ... set of deployment measures designed to limit the risk of Claude being misused specifically for the development ...