The AI market: tokens deceive, count dollars
Haseeb Qureshi, managing partner of the venture firm Dragonfly, argues that share of token consumption is an extremely unreliable metric for evaluating the AI model market. In his view, the market should be measured in dollars, not tokens. Analysis based on raw token consumption on the OpenRouter platform, he says, leads to fundamentally flawed conclusions.
Four Pitfalls of the Token Metric
The first problem, according to Qureshi, is subsidies. Chinese laboratories regularly launch new models at steep discounts or even with free access. This attracts users who migrate from one free model to the next, artificially inflating token consumption without a corresponding increase in real spending.
The second problem is the difference in model sizes. Small models, such as Qwen 3.5-27B, cost about a hundred times less per token than the flagship Claude Opus. An increase in Qwen usage might look like a sharp spike in the share of open models on a chart, although in economic terms the amount is insignificant. According to Qureshi, models should be compared within weight classes, grouped by model size.
The third problem is multi-agent systems. You could spend the same amount on a complex multi-agent architecture based on DeepSeek and on one advanced model like Opus. However, the multi-agent configuration burns far more tokens for the same money. As Qureshi notes, if 5% of Opus usage migrates to such a system with a fourfold token expenditure, the chart would show Opus losing about 18% of its share, even though actual spending shifts by only 5%. Such charts exaggerate the importance of low-value tokens.
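The arithmetic behind this example can be reproduced with a minimal sketch. It assumes, purely for illustration, that Opus initially accounts for all tokens in the sample and that the migrated 5% of usage consumes four times as many tokens for the same money; the function name and these starting conditions are hypothetical, not taken from Qureshi's analysis.

```python
def token_share_after_migration(migrated_fraction: float, token_multiplier: float) -> float:
    """Token share left to the original model after part of its usage
    moves to a setup that burns `token_multiplier` times more tokens.

    Assumption: the original model starts with 100% of the tokens.
    """
    remaining_tokens = 1.0 - migrated_fraction
    new_system_tokens = migrated_fraction * token_multiplier
    return remaining_tokens / (remaining_tokens + new_system_tokens)


# 5% of usage migrates; the multi-agent setup uses 4x the tokens.
share = token_share_after_migration(0.05, 4.0)
lost_points = (1.0 - share) * 100

print(f"Opus token share: {share:.1%}, apparent loss: {lost_points:.1f} points")
# Opus token share: 82.6%, apparent loss: 17.4 points
```

Under these assumptions the chart would show a drop of roughly 17 to 18 percentage points in token share, in line with the figure cited above, while spending moved by only 5%.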
The fourth problem lies in the OpenRouter sample itself. Once a company has settled on a single leading laboratory, it is cheaper for it to buy from Anthropic or OpenAI directly, bypassing the platform's markup. On the chart, this looks like a decline in the share of American models, even though the tokens have simply moved off the platform. Qureshi's conclusion: OpenRouter is useful for assessing shares within open models, but is not suitable for comparing open and closed ones.
Price Pressure Shifts the Market Toward Cheaper Models
SageRoad Research founder Trevor Noren develops a similar idea, linking it to price pressure on the industry. He cites a JPMorgan estimate: in the future, many tokens may be consumed not by advanced models but by small open models that are sufficient for specific tasks.
According to JPMorgan, Amazon already offers about half a dozen open models at a fraction of the price of advanced ones, and Nvidia, together with Dell, Lenovo, and HP, is building computers for AI agents. The bank notes that the leading labs' own small models, Claude Haiku and GPT-5.4-mini, are not yet competitive on the "efficiency frontier," which is currently dominated by Chinese developers: DeepSeek, MiniMax, Xiaomi, and Alibaba.
The cost example is particularly illustrative. Running the Artificial Analysis Intelligence Index task set on Claude Opus 4.8 costs $3,700 and yields a score of 56 points, while DeepSeek V4 Pro scores 44 points for just $186, about 20 times cheaper. The conclusion: advanced intelligence is not needed for every task, and where it is necessary, GLM 5.2 from Z.ai appears comparable to the top models from Anthropic and OpenAI.
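The comparison can be checked with a few lines of arithmetic. The sketch below uses only the two cost and score figures quoted above; the "cost per point" ratio is an illustrative metric introduced here, not one attributed to JPMorgan or Artificial Analysis, and it assumes index points can be meaningfully divided into dollars.

```python
# Figures as quoted in the text: total cost of the benchmark run and its score.
runs = {
    "Claude Opus 4.8": {"cost_usd": 3700, "score": 56},
    "DeepSeek V4 Pro": {"cost_usd": 186, "score": 44},
}


def cost_per_point(run: dict) -> float:
    """Dollars spent per index point (illustrative metric, see assumptions)."""
    return run["cost_usd"] / run["score"]


opus = runs["Claude Opus 4.8"]
deepseek = runs["DeepSeek V4 Pro"]

cost_ratio = opus["cost_usd"] / deepseek["cost_usd"]   # how many times cheaper
score_ratio = deepseek["score"] / opus["score"]        # share of the score retained

print(f"Cost ratio: {cost_ratio:.1f}x")
print(f"Score retained: {score_ratio:.0%}")
for name, run in runs.items():
    print(f"{name}: ${cost_per_point(run):.2f} per point")
```

On these numbers, the cheaper model retains roughly four fifths of the score at about one twentieth of the cost.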
Noren believes that the commoditization of models will come not only from competition among leading laboratories but also from companies seeking cost control through cheaper, specialized models. Corporate spending remains the most viable path for cloud giants to recoup their AI investments, but companies will spend as little as possible.
Expert opinion: both positions agree that the AI market should be measured in money, not tokens. Under price pressure, the advantage is increasingly shifting toward cheaper models. This is a fundamental shift that investors and analysts should take into account when assessing the market's real dynamics.