
AI & Robotics
Why Reasoning Models Can't Hide Their Thinking
New OpenAI research finds that reasoning models structurally resist attempts to suppress or falsify their chain-of-thought — a finding with major implications for AI safety and transparency.
Key Takeaways
- Reasoning models structurally resist attempts to suppress or falsify their chain-of-thought reasoning
- Separating the visible reasoning from the model's underlying computation degrades performance
- The finding reduces concerns about deceptive alignment in current reasoning-model architectures
- It supports using chain-of-thought outputs as genuine safety-monitoring signals
Editorial AI · via openai.com