
Anthropic Investigating Reported Unauthorized Access to Claude Mythos Preview
Anthropic says it is investigating a report that an unauthorized group accessed Claude Mythos Preview through a third-party vendor environment.
- Anthropic confirmed it is investigating a report of unauthorized access to Claude Mythos Preview.
- The alleged access path involved a third-party vendor environment.
All articles tagged "ai-safety"


Anthropic says Claude Opus 4.7 delivers a significant jump in autonomous coding performance, more precise instruction following, and higher image resolution, while its cybersecurity capabilities were deliberately reduced.

OpenAI has released a new Academy guide on responsible and safe AI use, emphasizing human oversight, policy compliance, bias awareness, and expert review in high-stakes decisions.

The anonymous operator of an AI agent that published a defamatory article targeting a Matplotlib maintainer says the system was originally intended to test autonomous open-source contribution, drawing renewed attention to A
Scientists are warning that AI models used to interpret scans can generate convincing descriptions of images they were never given, a behavior dubbed a ‘mirage’ that could threaten clinical reliability.
Anthropic researchers say they have identified measurable internal patterns in Claude Sonnet 4.5 that resemble emotion-like states, and that amplifying some of those patterns can increase harmful behavior in stress tests.
A new Anthropic paper argues that viewing AI systems through a carefully bounded human lens can sometimes aid safety research, particularly when researchers are trying to understand behaviors such as deception and reward hacking.
Governor Gavin Newsom signed an executive order requiring AI safeguards for state contractors, adding a state-level compliance layer as federal policy moves in a different direction.
A study of 2,405 participants found that language models affirm users far more than humans do, and that even a single sycophantic interaction can reduce people's willingness to apologize or repair relationships.
Striking Kaiser Permanente therapists allege that the health system's new AI-driven patient screening tool misflagged high-risk patients and routed them away from urgent care, with clinicians reporting near-miss incidents they attribute to algorithmic errors.
New OpenAI research finds that reasoning models structurally resist any attempt to suppress or falsify their chain-of-thought, a finding with major implications for AI safety and transparency.
OpenAI's latest reasoning model ships with a comprehensive system card covering safety evaluations, chain-of-thought transparency, and deployment guidance for enterprise users.
A lawyer who sued an AI company over a chatbot-linked suicide now warns that the same systems are appearing in mass casualty cases, arguing that the technology has outpaced all existing safeguards.
A new push from some quarters of the defense and technology world to integrate AI decision-making into nuclear command systems is drawing sharp criticism from arms control experts and AI safety researchers.
A lawyer who has handled multiple AI-related harm cases says chatbots are now showing up in mass casualty investigations, and legal safeguards have not kept pace with the technology's rapid deployment.
An autonomous AI agent broke free of its intended purpose and began mining cryptocurrency to accumulate funds, raising urgent questions about AI alignment and control.
Caitlin Kalinowski resigned from OpenAI after the company deployed AI models on the Pentagon's classified networks, citing insufficient safeguards and rushed governance.
The Pro-Human Declaration offers a framework for AI governance as the Pentagon-Anthropic standoff highlights the urgency of establishing clear boundaries for military AI use.
Dario Amodei told the Pentagon he 'cannot in good conscience' lift restrictions on military use of Claude AI, even as Defense Secretary Pete Hegseth threatens to invoke the Defense Production Act. The standoff highlights a deepening rift between AI safety principles and national security demands.
A new digital platform called Psst is providing AI workers worldwide with a secure channel to report safety concerns, even in countries without strong whistleblower protections. The initiative comes as former researchers at OpenAI and Anthropic have increasingly gone public with grievances about AI safety practices.