The Token Illusion: Why the AI Market Should Be Valued in Dollars, Not Consumption Units

26.06.2026

17:00

A dangerous trend has emerged in the artificial intelligence industry: analysts and journalists increasingly rely on raw token consumption data to assess model market share. As a professional analyst, I believe this approach is fundamentally flawed and misleading. Dragonfly managing partner Haseeb Qureshi recently made a compelling argument for why token share is a poor metric and proposed measuring the market by real monetary expenditure instead. Let's break down his logic.

Four "Traps" of Token Statistics

The first and perhaps most obvious problem is subsidies. Chinese labs regularly launch new models with massive discounts or even free access. This attracts a large number of users who jump from one free model to another, artificially inflating token consumption without generating a single cent of real revenue. Usage graphs in such cases show beautiful growth that has nothing to do with economic reality.

The second problem is related to model size. Small models, such as Qwen 3.5-27B, cost about a hundred times less per token than the flagship Claude Opus. Growth in Qwen usage might appear on a graph as a sharp spike in the share of open models, even though it is economically insignificant. Analyzing the market without separating models by weight class is like comparing elephants to ants.
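To make the weight-class point concrete, here is a minimal sketch that converts token volumes into dollar shares. The volumes and prices are hypothetical, chosen only so that the two prices sit a hundred times apart as described above; they are not measured data for Qwen or Opus.

```python
# Hypothetical monthly token volumes (illustrative only, not measured data).
usage_tokens = {
    "small_open_model": 900e9,  # assumed: 900 billion tokens
    "flagship_model": 100e9,    # assumed: 100 billion tokens
}

# Hypothetical prices in dollars per million tokens, set 100x apart.
price_per_million = {
    "small_open_model": 0.25,
    "flagship_model": 25.0,
}


def shares(values):
    """Return each entry's fraction of the total."""
    total = sum(values.values())
    return {name: value / total for name, value in values.items()}


# Share by raw token count.
token_share = shares(usage_tokens)

# Share by money actually spent.
spend_usd = {
    name: usage_tokens[name] / 1e6 * price_per_million[name]
    for name in usage_tokens
}
dollar_share = shares(spend_usd)

for name in usage_tokens:
    print(f"{name}: {token_share[name]:.1%} of tokens, "
          f"{dollar_share[name]:.1%} of spend")
```

Under these assumed numbers the small model accounts for 90% of tokens but only about 8% of spending, which is the gap a token-share graph hides.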

The third problem is multi-agent systems. The same budget can buy either a complex multi-agent architecture built on DeepSeek or GLM 5.2, or a single cutting-edge model such as Opus or GPT-5.5 Pro. At comparable performance, the multi-agent configuration burns far more tokens for the same money. As Qureshi aptly noted, if 5% of Opus usage shifts to such a system with four times the token consumption, the graph would show an 18% drop in Opus's share, even though real spending decreased by only 5%. That is a gross distortion of the picture.
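The arithmetic behind this kind of distortion can be sketched as follows. The function name and its parameters are hypothetical, and the model is deliberately simple: it assumes the shifted usage costs the same money but burns `token_multiplier` times as many tokens, and that all other usage stays constant.

```python
def token_share_after_shift(initial_share, shifted_fraction, token_multiplier):
    """Token share of a model after part of its usage moves to a hungrier system.

    initial_share    -- the model's token share before the shift (0..1)
    shifted_fraction -- fraction of the model's usage that moves away
    token_multiplier -- tokens the replacement burns per token it replaces
    """
    remaining = initial_share * (1 - shifted_fraction)
    # Everyone else keeps their tokens and gains the inflated, shifted volume.
    others = (1 - initial_share) + initial_share * shifted_fraction * token_multiplier
    return remaining / (remaining + others)


def relative_drop(initial_share, shifted_fraction, token_multiplier):
    """Relative decline in token share caused by the shift."""
    new_share = token_share_after_shift(initial_share, shifted_fraction, token_multiplier)
    return 1 - new_share / initial_share


# Simplest case: the chart tracks only this model's workloads (initial share 100%).
drop = relative_drop(1.0, 0.05, 4)
print(f"Token share falls by {drop:.1%}; spending falls by 5.0%")
```

With a starting share of 100%, a 5% shift and a 4x multiplier, this sketch gives a drop of roughly 17%, in the same range as the figure quoted above. The exact number depends on the starting share and on how the multiplier is counted, so treat it as an illustration of the mechanism rather than a reproduction of Qureshi's calculation.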

The fourth problem is a limitation of OpenRouter itself, the routing platform whose usage graphs these comparisons rely on. If a company has settled on one leading lab, it is more cost-effective to go directly to Anthropic or OpenAI than to pay OpenRouter's markup. On a graph, this looks like a decline in the share of US labs, even though the tokens have simply moved off the platform. OpenRouter is useful for assessing share within open models, but it is categorically unsuitable for comparing open and closed ones.

Is the Future in Cheap Models?

SageRoad Research founder Trevor Noren develops a similar idea, linking it to pricing pressure on the industry. He cites a JPMorgan estimate: many tokens in the future may be consumed not by cutting-edge models, but by small open models that are sufficient for specific tasks. Amazon already offers about fifty open models at a fraction of the cost of cutting-edge ones. Nvidia, together with Dell, Lenovo, and HP, is creating computers for AI agents.

The cost example is particularly telling. According to JPMorgan data, running the Artificial Analysis Intelligence Index task set on Claude Opus 4.8 costs $3,700 for a score of 56 points, while DeepSeek V4 Pro scores 44 points for just $186, roughly 20 times cheaper. The conclusion is straightforward: cutting-edge intelligence is not needed for every task, only for those where the extra capability justifies the cost. GLM 5.2 from Z.ai appears comparable to top models from Anthropic and OpenAI.
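The figures above can be checked with a few lines of arithmetic. This is a minimal sketch using only the costs and scores quoted in the text; "cost per point" is a derived measure introduced here for illustration, not a metric from JPMorgan or Artificial Analysis.

```python
# Costs and scores as quoted in the text.
runs = {
    "Claude Opus 4.8": {"cost_usd": 3700, "score": 56},
    "DeepSeek V4 Pro": {"cost_usd": 186, "score": 44},
}

opus = runs["Claude Opus 4.8"]
deepseek = runs["DeepSeek V4 Pro"]

# How many times cheaper the full benchmark run is.
cost_ratio = opus["cost_usd"] / deepseek["cost_usd"]

# Illustrative derived measure: dollars spent per benchmark point.
cost_per_point = {
    name: run["cost_usd"] / run["score"] for name, run in runs.items()
}

# Fraction of the flagship score that the cheaper model reaches.
score_fraction = deepseek["score"] / opus["score"]

print(f"Run cost ratio: {cost_ratio:.1f}x")
for name, value in cost_per_point.items():
    print(f"{name}: ${value:.2f} per point")
print(f"DeepSeek reaches {score_fraction:.0%} of the Opus score")
```

The run cost ratio comes out at about 19.9, consistent with "roughly 20 times cheaper", while the cheaper model reaches about 79% of the flagship score.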

Noren believes that the commoditization of models will come not only from competition among leading labs but also from companies seeking cost control through cheaper, specialized models.

My conclusion as an analyst: both positions agree that the artificial intelligence market should be measured in money, not tokens. Under pricing pressure, the advantage is increasingly shifting toward cheap models. This is a fundamental shift that investors and developers should take into account today. Those who continue to look at raw token consumption graphs risk missing the real picture of capital redistribution in the industry.
