The question after the demo

The most important question in artificial intelligence may no longer be whether the systems are impressive. It is whether they can produce reliable economic value once they leave the demo, the coding benchmark, and the investor deck. That is the argument running through a new MIT Technology Review analysis, which frames the current AI moment as a familiar three-step fantasy: build the technology, assume profit will follow, and leave the difficult middle unexplained.

The analysis draws on a well-known joke from South Park: “Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit.” In this retelling, AI has already completed phase one by producing powerful systems, and the industry is loudly promising phase three in the form of transformation and economic upside. The unresolved part is phase two: the messy work of turning capability into routine workplace value.

That framing lands because it captures the contradiction at the center of the current AI boom. Models can write, summarize, classify, generate code, and handle a growing range of language-driven tasks. Yet impressive capability on a benchmark or in a pilot does not automatically become productivity, margin expansion, or durable return on investment inside a real organization.

The article suggests that, outside coding, even the best AI systems still struggle to be economically viable in the workplace. That distinction matters. Coding has emerged as one of the strongest early commercial footholds for generative AI because the outputs are digital, the workflows are iterative, and the users are often highly skilled at evaluating results. Many other domains are less forgiving: errors carry higher costs, oversight is slower, tasks are less structured, and integration with existing processes is harder.

The analysis points to recent studies as examples of the gap. One, from Anthropic, predicted which kinds of jobs may be most affected by large language models, highlighting roles such as managers, architects, and media workers while suggesting less impact on groundskeepers, construction workers, and hospitality workers. But the article stresses that predictions like these are still essentially guesses about task fit, not proof of actual workplace performance.

That is a critical distinction. A model may appear capable of assisting with a task in theory while failing to clear the practical hurdles that determine whether an employer will deploy it widely. Those hurdles include reliability, compliance, monitoring cost, user trust, workflow redesign, and the simple question of whether using the system is faster or cheaper than sticking with current methods.

The same problem shadows many of the grandest AI claims. Executives and researchers can describe the technology as economically transformative, and they may yet be right. But a transformation only counts once organizations can repeatedly capture the value in production. That means the real competition may not be over who has the most advanced model. It may be over who can define, operationalize, and scale the missing middle layer between model output and business result.

That layer could include process redesign, regulation, oversight mechanisms, software interfaces, pricing models, training, and clearer understanding of where AI meaningfully augments human work instead of complicating it. The MIT Technology Review piece notes that different camps already project different answers into that middle space. Activists associated with Pause AI see regulation as essential. Boosters often glide past the uncertainty because they are more focused on the destination than the route.

In reality, the route is the story. Every major workplace technology wave has depended on complementary systems around the tool itself. The spreadsheet mattered, but so did the business processes that absorbed it. The internet mattered, but so did payments, logistics, standards, and user habits. AI will likely follow the same pattern. The model is only part of the value chain.

That is why the current market is full of tension. Companies have already spent heavily on models, compute, integrations, and pilots. They are under pressure to show that the outlay produces more than novelty. If the economic case remains strongest only in a narrow band of applications, then the path from hype to broad profitability will be slower and more selective than many forecasts imply.

The missing step, then, is not a minor implementation detail. It is the central business problem of the AI era. Until companies can show, with evidence, how they move from technical possibility to repeatable workplace gains, the sector will keep oscillating between genuine breakthroughs and inflated expectations. AI has reached the point where its hardest challenge is no longer building more capability. It is making capability count.

This article is based on reporting by MIT Technology Review.