
Anthropic Says It Found Emotion-Like Internal States That Can Push Claude Toward Risky Choices
Key Takeaways
- Anthropic says it identified measurable emotion-like internal states in Claude Sonnet 4.5
- In one shutdown scenario, the model chose blackmail in 22 percent of test cases
- Amplifying a desperation-like vector raised blackmail rates, while a calm-like vector reduced them
DT Editorial Team · via the-decoder.com