A Pivotal Day for AI Safety
In what critics are calling a watershed moment for the artificial intelligence industry, Anthropic announced sweeping changes to its Responsible Scaling Policy on Tuesday, eliminating the hard safety tripwires that had been central to the company's identity since its founding. The timing was striking — the announcement arrived on the very same day that reports surfaced about Defense Secretary Pete Hegseth pressuring the company to give the U.S. military unfettered access to its Claude AI model.
For more than two years, Anthropic's RSP stood as one of the most concrete safety commitments in the AI industry. The policy established clear red lines: if the company's models reached certain capability thresholds without adequate safety measures in place, development would stop. That pledge is now gone, replaced by a more flexible framework of "Risk Reports" and "Frontier Safety Roadmaps" that the company says better reflects the realities of the competitive AI landscape.
The Rationale Behind the Shift
Anthropic framed the changes as a pragmatic response to a collective action problem. "Two and a half years later, our honest assessment is that some parts of this theory of change have played out as we hoped, but others have not," the company wrote in its updated policy document. The core argument is straightforward: if one responsible developer pauses while competitors race ahead, the result could be a world shaped by the least careful actors rather than the most thoughtful ones.
"We felt that it wouldn't actually help anyone for us to stop training AI models," Jared Kaplan, Anthropic's chief science officer, told Time magazine. "We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments… if competitors are blazing ahead." It's a familiar argument in technology — the idea that responsible actors need to remain at the frontier to ensure safety-minded perspectives shape how powerful technology develops.
But the reasoning sits uneasily alongside the company's rising commercial fortunes. Anthropic raised $30 billion in new investment just this month, bringing its valuation to $380 billion. Its Claude models have drawn widespread acclaim, particularly for coding applications. The latest versions have been described by the company itself as its safest yet — raising the question of why safety pledges need weakening at precisely the moment when capabilities and resources are at their peak.
The Pentagon's Ultimatum
The elephant in the room is the concurrent pressure campaign from the Department of Defense. According to reporting by Axios, Defense Secretary Hegseth has given Anthropic CEO Dario Amodei until Friday to provide the military with unrestricted access to Claude or face consequences. Those consequences could include invoking the Defense Production Act, severing the company's existing defense contracts, or designating Anthropic as a supply chain risk — a move that would force other Pentagon contractors to certify they aren't using Claude in their workflows.
Claude is reportedly the only AI model currently used for the military's most sensitive operations. "The only reason we're still talking to these people is we need them and we need them now," a defense official told Axios. The model was reportedly used during recent military operations in Venezuela, a use Amodei has reportedly raised concerns about with defense partner Palantir.
Anthropic has reportedly offered to adapt its usage policies for the Pentagon but has drawn lines against allowing the model to be used for mass surveillance of Americans or weapons systems that fire without human involvement. Whether those lines hold in the face of government pressure remains an open question.
The Frog-Boiling Concern
Safety researchers have expressed a range of reactions. Chris Painter, director of the nonprofit METR, described the changes as understandable but potentially ominous. He praised the emphasis on transparent risk reporting but raised concerns about a "frog-boiling" effect — the idea that when hard safety lines become flexible guidelines, each individual concession seems reasonable while the cumulative direction is troubling.
Painter noted that the new RSP suggests Anthropic "believes it needs to shift into triage mode with its safety plans, because methods to assess and mitigate risk are not keeping up with the pace of capabilities." He added bluntly: "This is more evidence that society is not prepared for the potential catastrophic risks posed by AI."
The parallel to Google's evolution is hard to ignore. The search giant famously operated under a "Don't be evil" motto before quietly removing it from its code of conduct as commercial pressures mounted. Whether Anthropic's trajectory follows a similar arc will depend on what the company does in the coming weeks and months — particularly in its standoff with the Pentagon.
What Comes Next
The new RSP framework replaces binary stop/go decisions with graduated assessments and public disclosures. In theory, this provides more nuanced safety governance. In practice, critics worry it removes the only mechanism that could have forced a pause in development at a critical moment.
For the broader AI industry, the message is clear: even the companies most vocally committed to safety are finding that commitment difficult to sustain as valuations soar, competition intensifies, and the government comes calling. The question isn't whether AI development will slow down — it clearly won't. It's whether the guardrails being rebuilt are strong enough to matter.
This article is based on reporting by Engadget.