NVIDIA gives away powerful AI for free, and earns more than its competitors from it - 21.06.2026


10:16


On June 4, 2026, NVIDIA released Nemotron 3 Ultra, the largest open AI model in the Nemotron 3 line. The company released the model weights, training data, and training methodologies under a free license. The model is designed for long-running autonomous agents and complex reasoning.

Unlike closed flagship models such as ChatGPT or Claude, Nemotron 3 Ultra can be downloaded, fine-tuned on your own data, and run on your own infrastructure. The focus here is not on maximum intelligence, but on openness, efficiency, and control over the model.

What makes the model's architecture special

Nemotron 3 Ultra is not just a "scaled-up transformer." It uses a hybrid architecture that combines three components: Mamba-2 layers, attention layers, and Latent Mixture of Experts (Latent MoE), a mechanism that routes each token only to the "specialists" within the model that it needs.
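To make the idea of routing to "specialists" concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is an illustration of the general Mixture of Experts technique, not NVIDIA's implementation: the toy experts, the router scores, and the choice of two experts per token are all assumptions made up for the example.

```python
import math

def softmax(scores):
    """Turn raw router scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k=2):
    """Pick the top_k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    weight_sum = sum(probs[i] for i in chosen)
    return {i: probs[i] / weight_sum for i in chosen}

def moe_layer(token, experts, router_scores, top_k=2):
    """Run only the chosen experts and blend their outputs by weight."""
    weights = route_token(router_scores, top_k)
    return sum(w * experts[i](token) for i, w in weights.items())

# Hypothetical toy experts: each is a plain function of a scalar "token".
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
scores = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

weights = route_token(scores, top_k=2)
output = moe_layer(3.0, experts, scores, top_k=2)
print(weights, output)
```

Only two of the four experts run for this token; the other two cost nothing. That is the basic reason a model can hold far more parameters than it uses on any single step.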

Mamba-2 layers process long texts quickly and efficiently: their cost grows linearly with sequence length, not quadratically as in the standard attention mechanism. Attention layers, in turn, keep precise track of details across large volumes of text. Latent MoE compresses data before passing it to the experts, allowing each expert to work narrowly and precisely without requiring additional computation.
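The difference between linear and quadratic growth is easy to see with a rough cost model. The sketch below counts abstract "operations" only: full attention compares every token with every other token, while a state-space layer is modeled as one fixed-size state update per token. The function names and the state size of 16 are hypothetical and chosen purely for illustration; real costs depend on hidden dimensions, hardware, and kernel implementations.

```python
def attention_cost(n_tokens):
    """Pairwise token comparisons in full self-attention: grows as n^2."""
    return n_tokens * n_tokens

def linear_scan_cost(n_tokens, state_size=16):
    """One fixed-size state update per token: grows as n."""
    return n_tokens * state_size

def growth_when_doubling(cost_fn, n_tokens):
    """How many times the cost rises when the input length doubles."""
    return cost_fn(2 * n_tokens) / cost_fn(n_tokens)

for n in (1_000, 10_000, 100_000, 1_000_000):
    print(n, attention_cost(n), linear_scan_cost(n))

attention_growth = growth_when_doubling(attention_cost, 1_000)
linear_growth = growth_when_doubling(linear_scan_cost, 1_000)
print(attention_growth, linear_growth)
```

Doubling the text quadruples the attention cost in this model but only doubles the cost of the linear layer, which is why the gap widens so sharply at million-token lengths.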

The model has approximately 550 billion parameters in total, but only about 55 billion are activated for processing each token. This lets it draw on the capacity of a massive system while costing about as much to run as a much more compact one. Combined with a 1-million-token context window and a speed of over 300 tokens per second, this provides five to six times greater throughput and roughly 30% lower task costs.
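The figures above lend themselves to some back-of-the-envelope arithmetic. The sketch below uses only the numbers quoted in this article, plus one outside assumption: the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token. That rule ignores attention and other overheads, so treat the results as orders of magnitude rather than measurements.

```python
TOTAL_PARAMS = 550e9        # approximate total parameters (from the article)
ACTIVE_PARAMS = 55e9        # approximate parameters active per token
TOKENS_PER_SECOND = 300     # quoted lower bound on speed
CONTEXT_WINDOW = 1_000_000  # tokens

# Share of the model that actually runs for each token.
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

# Rule-of-thumb assumption: about 2 FLOPs per active parameter per token.
flops_per_token_sparse = 2 * ACTIVE_PARAMS
flops_per_token_if_dense = 2 * TOTAL_PARAMS
compute_saving = flops_per_token_if_dense / flops_per_token_sparse

# Time to emit a full context window's worth of tokens at the quoted speed.
minutes_for_full_window = CONTEXT_WINDOW / TOKENS_PER_SECOND / 60

print(f"active share: {active_fraction:.0%}")
print(f"compute per token vs. a dense model of the same size: 1/{compute_saving:.0f}")
print(f"minutes to produce 1M tokens at 300 tok/s: {minutes_for_full_window:.1f}")
```

Under these assumptions, about a tenth of the model runs per token, and producing a million tokens at 300 tokens per second would take just under an hour.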

NVIDIA's strategy and bet on the ecosystem

The main value of the release, according to industry analysts, is not the model itself, but the ecosystem that NVIDIA is building around its hardware. The logic is simple: whoever runs Nemotron almost certainly does so on NVIDIA graphics cards, fine-tunes it with NVIDIA's software tools, and deploys it on NVIDIA's software stack. Openness here is not charity, but a way to steer developers toward purchasing the company's hardware.

NVIDIA can afford this because its financial resources dwarf the cost of the model itself. For a company with a market capitalization exceeding $5 trillion, training Nemotron 3 Ultra, which likely cost hundreds of millions of dollars, is a nearly negligible expense. Graphics card sales more than cover the research, so NVIDIA can give the model away for free and still earn more than closed competitors make from paid access.

The political context adds further weight to the release. An open American model can be inspected, modified, and run on one's own servers — this has made it attractive to countries building independent national AI, from Europe to Southeast Asia. No one can remotely shut down such a model, and this is especially valuable in light of recent restrictions surrounding closed models.

Where the model falls short and what comes next

Despite all its advantages, Nemotron 3 Ultra is not the smartest model on the market. In the independent Artificial Analysis Intelligence Index, it scored 48 points — the best result among open US models, but globally it trails leaders like Kimi K2.6 (54 points) and DeepSeek. Open models, according to analysts, lag behind closed ones by three to seven months.

But this gap, in my opinion, matters less and less if the open model is simply sufficient for real-world tasks. A bank deploying Nemotron 3 Ultra to process loans on its own servers does not need flagship-level intelligence. It needs a model that can be fine-tuned on private data, that stays within its secure perimeter, and that does not share confidential information with outsiders.

NVIDIA's bet on efficiency, rather than benchmark records, may prove more far-sighted. In the mass adoption of AI, the cost of running a model comes to the forefront, and one that is nearly as smart but five times cheaper wins in real-world deployment. Analysts expect the open ecosystem to keep strengthening: NVIDIA has the resources, motivation, and distribution channels to release increasingly powerful open models faster than any other company.

Expert opinion: In the next 12–18 months, we will see open models based on the Nemotron architecture become the standard for the enterprise sector, while closed systems will remain a niche product for tasks where top-1% quality is critical. By giving away AI for free, NVIDIA is effectively cutting off the oxygen to competitors trying to monetize access to their models. This is a classic example of the "razor and blades" strategy, but on an industry-wide scale.
