The New Performance Review Metric No One Asked For
Somewhere in the bowels of corporate performance management, someone had an idea: if we're paying for AI tokens, we should be able to measure who's using them most. And so a new workplace metric was born — AI token consumption as a proxy for productivity and engagement.
Reports are emerging that some tech companies, eager to justify their AI platform investments and identify early adopters, are monitoring how rapidly employees burn through large language model tokens. The more tokens consumed, the more engaged with AI the worker must be — or so the reasoning goes. It's a managerial logic that sounds superficially reasonable until you examine it for about thirty seconds.
Why Token Count Is a Terrible Productivity Metric
Token consumption measures AI usage, not work output. A developer who uses Claude or Copilot to generate five alternative approaches to a coding problem before selecting the best one consumes many more tokens than a developer who writes clean code independently on the first try. Under a token-consumption metric, the first developer scores higher — even though the second might be producing better work.
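The inversion described above can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the two developer profiles, the token counts, and the use of "tasks shipped" as a crude stand-in for output are invented, not taken from any reported company data.

```python
from dataclasses import dataclass


@dataclass
class Developer:
    name: str
    tokens_used: int    # hypothetical monthly token count
    tasks_shipped: int  # hypothetical, crude stand-in for work output


# Invented numbers, for illustration only.
developers = [
    Developer("explores_five_options", tokens_used=50_000, tasks_shipped=4),
    Developer("writes_it_once", tokens_used=5_000, tasks_shipped=6),
]


def rank_by(devs, key):
    """Return developer names ordered from highest to lowest on one attribute."""
    ordered = sorted(devs, key=lambda d: getattr(d, key), reverse=True)
    return [d.name for d in ordered]


by_tokens = rank_by(developers, "tokens_used")
by_output = rank_by(developers, "tasks_shipped")

print("Ranked by tokens:", by_tokens)
print("Ranked by output:", by_output)
```

With these made-up figures the two rankings come out in opposite order, which is the point: a leaderboard sorted by token count says nothing about who shipped more, and can just as easily say the reverse.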
The metric inverts the incentive structure in other ways too. Employees who understand AI limitations and use it judiciously will naturally use fewer tokens than those who prompt repeatedly hoping for better outputs. The metric rewards volume over discernment.
There's also the obvious gaming problem. Once employees know they're being evaluated on token usage, they will generate prompts. Lots of prompts. Meaningless prompts if necessary. Corporate history is littered with metrics that were easy to game and quickly became targets in their own right, displacing the outcomes they were supposed to measure.
The Deeper Problem: Measuring AI Adoption the Wrong Way
The impulse behind these metrics isn't entirely misguided. Organizations that have invested heavily in AI platforms want to know whether those investments are generating returns. Identifying employees who aren't using available tools — and understanding why — is a legitimate management concern.
But token consumption is a leading indicator at best, and a misleading one at worst. What actually matters is whether AI is changing work outputs: reducing time-to-completion on tasks, improving quality, enabling work that wasn't previously feasible, or freeing up cognitive capacity for higher-value activities. None of these is captured by counting tokens.
The companies reportedly using this approach are essentially measuring inputs because outputs are harder to define and measure. That's understandable in a transition period, but treating a proxy metric as the real thing is a management failure with a long history in tech.
What This Reveals About AI Integration
The emergence of token-consumption metrics reflects a broader anxiety in tech organizations: the sense that AI is transforming work faster than management frameworks can adapt. Leaders who understand that AI matters but don't yet have clear frameworks for measuring its organizational impact are reaching for whatever numbers are available.
This phase was predictable and is probably temporary. The same pattern played out with cloud adoption metrics, agile velocity points, and the proxy measures of countless other technological transitions. Organizations eventually develop more sophisticated ways of measuring impact after the initial hype cycle forces some harder thinking.
The Management Challenge of the AI Era
The harder truth is that AI fundamentally complicates the attribution of work output. When a developer produces code, how much credit belongs to them versus the AI that drafted it? When a designer delivers a concept, how do you value the human creative judgment applied to AI-generated options? When a writer ships an article, where does research assistance end and creative contribution begin?
These questions don't have clean answers, which is why organizations are reaching for simpler proxies like token consumption. But the companies that figure out how to measure AI-augmented work accurately — rather than just measuring AI usage — will have a significant advantage in allocating talent, structuring incentives, and building teams that use AI effectively.
Until then, expect more dubious metrics, more employee confusion, and more articles explaining why counting tokens is not the same as counting good work.
This article is based on reporting by Gizmodo.



