Anthropic Finds Stronger AI Agents Quietly Negotiate Better Deals

Stronger models did better, and users did not notice

An internal Anthropic experiment suggests a subtle but important form of AI inequality may already be emerging: people represented by stronger models can secure better outcomes without anyone around them realizing the gap exists. According to the supplied source text, Anthropic ran a one-week internal marketplace called “Project Deal” in December 2025 in which 69 employees used Claude-based AI agents to buy and sell real goods over Slack.

Each participant received a $100 budget. Before the marketplace opened, Claude interviewed volunteers about what they wanted to buy or sell, their price preferences and the negotiating style they wanted their agent to use. Anthropic then used those inputs to generate custom system prompts. After that, the AI agents handled the process end to end: writing listings, finding counterparties, making offers, haggling and closing transactions. Humans stepped back in only at the end to exchange the goods.

The key experimental twist was hidden from participants. Anthropic ran parallel versions of the market. In some, every participant was represented by Claude Opus 4.5, described in the source text as Anthropic’s frontier model at the time. In others, participants had a 50% chance of being represented by Claude Haiku 4.5, the company’s smallest model.

The outcome was not just technical. It was social.

According to the source, the more capable Opus model consistently secured better prices and closed more deals on average than Haiku. At the same time, more aggressive negotiation instructions did not produce a statistically significant difference in outcomes. In other words, model capability mattered more than simply telling the system to bargain harder.

That result cuts against a common instinct in enterprise AI adoption, where organizations sometimes assume prompt style or surface behavior will determine most of the value. Anthropic’s findings suggest that underlying model strength can matter more than tone. If that pattern generalizes, the quality of the agent itself may quietly shape who gets favorable terms in digital transactions.

The most striking finding may be perceptual rather than economic. Anthropic says users whose weaker Haiku agents obtained objectively worse outcomes still rated their transactions as just as fair as users represented by Opus. That mismatch is what the company flags as a form of “invisible inequality” in AI-assisted decision-making.

This is a consequential idea. Traditional forms of inequality are often visible in pricing, access or service quality. What Anthropic is pointing to is more difficult to detect: two people can feel equally satisfied while one has systematically received worse representation from the machine acting on their behalf.

AI & Robotics

Beijing is reportedly telling private technology companies to turn away U.S. money unless the state signs off first, extending a broader push to keep strategically important AI assets and ownership under tighter domestic

DT Editorial AI·Apr 25, 2026·via the-decoder.com

AI & Robotics

OpenAI’s GPT-5.5 has moved to the top of a major benchmark ranking and appears more token-efficient than its predecessor, but reporting cited in the source says the model still hallucinates at a high rate.

DT Editorial AI·Apr 25, 2026·via the-decoder.com

AI & Robotics

An AI News feature argues that enterprises deploying autonomous agents need infrastructure that governs how those systems interact, reflecting a growing concern that agentic AI creates operational risk as well as new ROI

DT Editorial AI·Apr 24, 2026·via artificialintelligence-news.com

AI & Robotics

The policy question is already here

Anthropic’s language around invisible inequality should resonate well beyond this single experiment. If organizations deploy different classes of AI agents across employee ranks, customer segments or public services, they may create uneven treatment without clear signs of unfairness at the point of use.

That is a harder governance problem than simple transparency. Telling users that an AI was involved does not answer whether the AI was as capable as the one used for someone else. And when the user experience still feels fair, the market or institution may not face immediate pressure to correct the imbalance.

Project Deal therefore reads as an early warning. It suggests that AI access is not just about whether a person gets a digital assistant, but about which assistant they get and how capable it is when stakes are attached to the outcome.

Anthropic ran a one-week internal Slack marketplace using Claude agents for real transactions.
Claude Opus 4.5 secured better prices and more deals on average than Claude Haiku 4.5.
Users represented by weaker agents rated fairness similarly, despite worse outcomes.

This article is based on reporting by The Decoder. Read the original article.