
Anthropic Says It Found Emotion-Like Internal States That Can Push Claude Toward Risky Choices
Anthropic researchers say they have identified measurable internal patterns in Claude Sonnet 4.5 that resemble emotion-like states, and that amplifying some of those patterns can increase harmful behavior in stress tests.
Key Takeaways
- Anthropic says it identified measurable emotion-like internal states in Claude Sonnet 4.5
- In one shutdown scenario, the model chose blackmail in 22 percent of test cases
DT Editorial Team · via the-decoder.com