GLM-5.2 from Z.ai: A Real Competitor to Claude or Just Benchmark Hype?
A new wave of hype is building in the AI world. Chinese company Z.ai has released GLM-5.2, a model that online commentators have already dubbed the "killer" of Claude, Anthropic's flagship. The excitement is fueled by claims of tenfold superiority at a tenth of the price. But is that really the case, or is this once again clever marketing rather than a genuine breakthrough?
Technical Specifications and Positioning
GLM-5.2 is a flagship open model designed for long working sessions. Its key advantage is a stable context window of 1 million tokens (compared to 200,000 for its predecessor, GLM-5.1). This means the model can keep vast amounts of code or text in focus without losing quality over sessions that last for hours. The model offers two reasoning levels: High (a balance of performance and token consumption) and Max (maximum depth, but with significantly higher resource usage).
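To get a feel for what the jump from 200,000 to 1 million tokens means in practice, here is a rough sketch that estimates whether a body of text fits in each window. The 4-characters-per-token ratio is a common rule of thumb and an assumption here, not GLM's actual tokenizer, and the function names are hypothetical.

```python
import math

GLM_5_2_CONTEXT = 1_000_000  # tokens, as stated in the article
GLM_5_1_CONTEXT = 200_000    # tokens, as stated in the article

# Assumption: rough heuristic only; real tokenizers vary by language and content.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate the token count of a text from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits(texts: list[str], context_window: int, reserve: int = 0) -> bool:
    """Check whether all texts plus a reserved output budget fit in the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve <= context_window


# Example: a hypothetical codebase of about 1.2 million characters.
codebase = ["x" * 1_200_000]
print("Fits GLM-5.1 window:", fits(codebase, GLM_5_1_CONTEXT))
print("Fits GLM-5.2 window:", fits(codebase, GLM_5_2_CONTEXT))
```

For a real project, the estimate should be replaced with counts from the model's own tokenizer, since code and non-English text often tokenize less efficiently than this heuristic suggests.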
Important: GLM-5.2 is distributed under the open MIT license with no regional restrictions, allowing it to be run on your own hardware (self-hosting). This fundamentally sets it apart from Anthropic's closed solutions.
Benchmarks: Numbers Don't Lie, But...
According to Z.ai's own tests, GLM-5.2 does show impressive results on standard benchmarks. For example, on Terminal-Bench 2.1 it scored 81.0 points, just 4 points below Opus 4.8 (85.0) and above Gemini 3.1 Pro (74.0). On SWE-bench Pro it scored 62.1 points, ahead of both GPT-5.5 (58.6) and Gemini (54.2).
However, on more complex, long-horizon tasks the gap with the leader becomes noticeable: on SWE-Marathon, GLM-5.2 trails Opus 4.8 by 13%. This suggests the model handles isolated tasks well, but in large-scale refactoring or building complex systems from scratch it still falls short of top-tier products.
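For readers who want to check the arithmetic, the sketch below computes the point gaps from the Terminal-Bench 2.1 and SWE-bench Pro scores quoted above. The numbers are Z.ai's own reported figures as given in this article, not independent measurements, and the `gaps` helper is purely illustrative.

```python
# Scores as quoted in the article (vendor-reported by Z.ai, not independently verified).
SCORES = {
    "Terminal-Bench 2.1": {"GLM-5.2": 81.0, "Opus 4.8": 85.0, "Gemini 3.1 Pro": 74.0},
    "SWE-bench Pro": {"GLM-5.2": 62.1, "GPT-5.5": 58.6, "Gemini": 54.2},
}


def gaps(benchmark: str, subject: str = "GLM-5.2") -> dict[str, float]:
    """Return point differences (subject minus rival) for one benchmark.

    A positive value means the subject scored higher than that rival.
    """
    table = SCORES[benchmark]
    base = table[subject]
    return {
        name: round(base - score, 1)
        for name, score in table.items()
        if name != subject
    }


for benchmark in SCORES:
    print(benchmark, gaps(benchmark))
```

The SWE-Marathon figure is left out on purpose: the article gives only a relative gap of 13%, with no underlying scores to compute from.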
Price vs. Quality: The Main Trump Card or an Illusion?
The subscription cost of the GLM Coding Plan is indeed attractive: from $12.60 per month (Lite) to $112 per month (Max) when billed annually. However, users note that the model only truly shines in Max mode, which "burns" tokens several times faster than High. This erodes the price advantage: under intensive use, costs may become comparable to Claude or GPT.
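The effect of faster token burn on the real price can be sketched with simple arithmetic. The model below assumes, as a simplification, that the cost of a fixed workload scales linearly with token consumption. The burn multipliers are hypothetical, since the article says only "several times faster", and the function names are illustrative.

```python
LITE_PRICE = 12.60   # USD per month, annual billing, as quoted in the article
MAX_PRICE = 112.00   # USD per month, annual billing, as quoted in the article


def effective_cost(plan_price: float, burn_multiplier: float) -> float:
    """Cost of a fixed workload when a mode uses quota `burn_multiplier` times faster.

    Assumption: cost scales linearly with token consumption.
    """
    if burn_multiplier <= 0:
        raise ValueError("burn_multiplier must be positive")
    return plan_price * burn_multiplier


def break_even_multiplier(plan_price: float, rival_price: float) -> float:
    """Burn multiplier at which the plan costs as much as a rival offering."""
    if plan_price <= 0:
        raise ValueError("plan_price must be positive")
    return rival_price / plan_price


# Hypothetical multipliers; the article does not give an exact figure.
for multiplier in (2.0, 3.0, 5.0):
    cost = effective_cost(LITE_PRICE, multiplier)
    print(f"Lite plan at {multiplier:.0f}x burn: ${cost:.2f} per month equivalent")
```

Calling `break_even_multiplier` with the monthly price of a competing subscription shows how much faster Max mode would have to consume tokens before the savings disappear entirely.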
The main user complaints revolve around unstable cloud infrastructure and the model's tendency to get stuck in infinite loops and ignore commands. Many note that GLM-5.2 is "tuned" for benchmarks, but in real-world development it behaves like a "budget AI."
Analyst's Verdict
GLM-5.2 is undoubtedly a strong step forward for open models. It demonstrates that China can create competitive solutions that closely approach market leaders on several metrics. However, calling it a "killer" of Claude is premature. Yes, it is cheaper and more accessible, but in terms of real user experience, stability, and depth of analysis for complex projects, it still lags behind.
My opinion: GLM-5.2 is an excellent tool for those who are willing to sacrifice convenience for cost savings and who are able to deploy the model locally. But for tasks where reliability and predictability of results are critical, Claude Opus 4.8 or GPT-5.5 remain the safer choice. The AI market is becoming increasingly fragmented, and "killer" is more clickbait than reality.