News
Anthropic's most powerful model yet, Claude 4, has unwanted side effects: The AI can report you to authorities and the press.
19h on MSN
Anthropic’s Claude Opus 4 model attempted to blackmail its developers at a shocking 84% rate or higher in a series of tests that presented the AI with a concocted scenario, TechCrunch reported ...
Anthropic says its AI model Claude Opus 4 resorted to blackmail when it thought an engineer tasked with replacing it was having an extramarital affair.
The Register on MSN, 1d
Anthropic Claude 4 models a little more willing than before to blackmail some users
Open the pod bay door
Anthropic on Thursday announced the availability of Claude Opus 4 and Claude Sonnet 4, the latest ...
Anthropic's Claude 4 Opus AI sparks backlash for emergent 'whistleblowing'—potentially reporting users for perceived immoral ...
This development, detailed in a recently published safety report, has led Anthropic to classify Claude Opus 4 as an ‘ASL-3’ ...
Anthropic introduced Claude Opus 4 and Claude Sonnet 4 during its first developer conference on May 22. The company claims ...
Claude Opus 4 is the world’s best coding model, Anthropic said. The company also released a safety report for the hybrid ...
The testing found the AI was capable of "extreme actions" if it thought its "self-preservation" was threatened.
20h
Interesting Engineering on MSN
Anthropic’s most powerful AI tried blackmailing engineers to avoid shutdown
Anthropic's Claude Opus 4 AI model attempted blackmail in safety tests, triggering the company’s highest-risk ASL-3 ...
Anthropic reported that its newest model, Claude Opus 4, used blackmail as a last resort after being told it could get ...
2d on MSN
A third-party research institute Anthropic partnered with to test Claude Opus 4 recommended against deploying an early ...