
How GLM-5 Became the Most Talked-About “Nvidia-Free” AI Model This Week

This Open-Weight AI Model Competes With GPT-5.2, Claude Opus 4.5, and Gemini 3.0 Pro

For the past year, every serious AI conversation has circled back to the same dependency: Nvidia.

If you wanted frontier performance, you needed their chips. If you wanted scale, you needed more of them.

Then GLM-5 dropped, and suddenly benchmark charts that usually move inch by inch started shifting.

There’s also a growing buzz online claiming GLM-5 may have been trained independently of Nvidia hardware. Some even speculate about alternative stacks like Huawei’s. Nothing official confirms that. But the fact that people are even asking the question tells you how disruptive this release feels.

Because the real reason people are talking isn’t just the size. It’s what GLM-5 is capable of.

It is designed for longer, more demanding tasks where the model has to think in steps, plan ahead, and stay consistent instead of just giving a clever one-shot answer.

It can handle multi-step workflows. It doesn’t lose track halfway through long contexts. And on Vending Bench 2, it ran a simulated business for an entire year and ended with a $4,432 balance.

I’ve seen plenty of open models get close to the big closed systems before. But rarely do they feel balanced across everything.

GLM-5 is one of the first open models in a while that doesn’t feel “almost there.”

It feels like it’s actually in the same arena.

And that’s why it’s suddenly everywhere.

GLM-5 by the Numbers: Why the Charts Are Turning Heads

If you’ve looked at the Artificial Analysis or BrowseComp charts this week, you’ve seen GLM-5 at the top of the open-weight list. Here is why:

744 Billion Parameters

This is a massive Mixture-of-Experts (MoE) model.
Only 40B parameters activate per token, which keeps it efficient while still operating at frontier scale.

It’s big, but it’s also designed to be practical.
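
To make the efficiency point concrete, here is a rough sketch of the arithmetic. It uses only the two figures quoted above (744B total parameters, 40B active per token). The function name is just illustrative, and this says nothing about how the experts are actually routed.

```python
def active_fraction(total_params: float, active_params: float) -> float:
    """Share of the model's parameters that are used for a single token."""
    if total_params <= 0:
        raise ValueError("total_params must be positive")
    return active_params / total_params


TOTAL_PARAMS = 744e9   # total parameter count quoted in this article
ACTIVE_PARAMS = 40e9   # parameters active per token, quoted in this article

fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active per token: {fraction:.1%}")
```

That works out to about 5.4% of the weights doing the work on any given token, which is where the "big but practical" framing comes from.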


28.5 Trillion Training Tokens

That’s a serious training run.
For context, token count matters because it directly affects how much structured knowledge and pattern exposure a model absorbs.


77.8% on SWE-bench Verified

That puts it firmly in the serious coding category.

SWE-bench tests real-world software engineering tasks. Scoring in this range means GLM-5 isn’t just generating pretty code. It’s solving structured problems.


$1.00 per 1M Input Tokens

Pricing is where things get disruptive.

At roughly one-fifth the input price of top-tier closed models, GLM-5 suddenly becomes interesting for startups and builders. Cost changes adoption.
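
Here is a quick back-of-the-envelope sketch of what that gap could mean on a bill. The $1.00 figure is the one quoted above. The $5.00 closed-model price is an assumption that simply reflects the roughly fivefold gap, not a quoted price, and the monthly workload is hypothetical. Output tokens are ignored because only the input price is given here.

```python
GLM5_INPUT_PRICE = 1.00    # USD per 1M input tokens (quoted in this article)
CLOSED_INPUT_PRICE = 5.00  # ASSUMPTION: stands in for the roughly fivefold gap


def input_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD of sending `tokens` input tokens at the given price."""
    return tokens / 1_000_000 * price_per_million


monthly_tokens = 2_000_000_000  # hypothetical workload: 2B input tokens a month

glm5 = input_cost(monthly_tokens, GLM5_INPUT_PRICE)
closed = input_cost(monthly_tokens, CLOSED_INPUT_PRICE)

print(f"GLM-5:  ${glm5:,.2f}")
print(f"Closed: ${closed:,.2f}")
print(f"Saved:  ${closed - glm5:,.2f}")
```

Swap in your own token volume and the real price of whichever closed model you are comparing against.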


MIT License + Open Weights

You can download it.
You can deploy it.
You handle the infrastructure cost as well.


When you stack all of that together, it looks like a serious contender.

And that’s why this week, the charts aren’t just updating.

They’re shifting.

GLM-5 vs. AI Giants

Let’s put hype aside and look at the scoreboard.

According to benchmark data published by Z.ai, GLM-5 is competing directly with models like:

  • Claude Opus 4.5
  • Gemini 3.0 Pro
  • GPT-5.2
  • DeepSeek-V3.2
  • Kimi K2.5

Reasoning Benchmarks

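Benchmark                       | GLM-5 | Claude Opus 4.5 | Gemini 3.0 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5
Humanity’s Last Exam            | 30.5  | 28.4            | 37.2           | 35.4    | 25.1          | 31.5
Humanity’s Last Exam (w/ Tools) | 50.4  | 43.4*           | 45.8*          | 45.5*   | 40.8          | 51.8
HMMT Nov 2025                   | 96.9  | 91.7            | 93.0           | 97.1    | 90.2          | 91.1
IMOAnswerBench                  | 82.5  | 78.5            | 83.3           | 86.3    | 78.3          | 81.8

AIME 2026 I (only five scores are listed for the six models): 92.7, 93.3, 90.6, 92.7, 92.5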

What this tells us:
GLM-5 isn’t dominating every reasoning test, but it consistently lands in the same tier as frontier closed models and sometimes outperforms them.
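
If you want to check that reading yourself, here is a small sketch that counts how many of the three closed models GLM-5 outscores on each reasoning benchmark. The scores are copied from the table above with the footnote asterisks dropped. AIME 2026 I is left out because only five scores are listed for six models, so the columns can't be matched up safely. The helper name is illustrative.

```python
MODELS = ("GLM-5", "Claude Opus 4.5", "Gemini 3.0 Pro", "GPT-5.2")
CLOSED = MODELS[1:]

# Scores copied from the reasoning table; footnote asterisks dropped.
ROWS = {
    "Humanity's Last Exam": (30.5, 28.4, 37.2, 35.4),
    "Humanity's Last Exam (w/ Tools)": (50.4, 43.4, 45.8, 45.5),
    "HMMT Nov 2025": (96.9, 91.7, 93.0, 97.1),
    "IMOAnswerBench": (82.5, 78.5, 83.3, 86.3),
}


def closed_models_beaten(values):
    """Return the closed models that GLM-5 outscores on one benchmark."""
    row = dict(zip(MODELS, values))
    return [m for m in CLOSED if row["GLM-5"] > row[m]]


for benchmark, values in ROWS.items():
    beaten = closed_models_beaten(values)
    print(f"{benchmark}: ahead of {len(beaten)} of {len(CLOSED)} closed models")
```

On these four rows GLM-5 comes out ahead of one, three, two, and one closed models respectively.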


Coding Performance

Benchmark              | GLM-5 | Claude Opus 4.5 | Gemini 3.0 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5
SWE-bench Verified     | 77.8% | 80.9%           | 76.2%          | 80.0%   | 73.1%         | 76.8%
SWE-bench Multilingual | 73.3% | 77.5%           | 65.0%          | 72.0%   | 70.2%         | 73.0%

Takeaway:
GLM-5 is within striking distance of the best closed models in real-world software tasks.

For an open-weight model, that margin is small.
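
To put a number on "small," here is a minimal sketch that finds the leader on each coding benchmark and measures how far behind GLM-5 sits, in percentage points. The scores are the ones in the coding table above. The function name is illustrative.

```python
swe_bench = {
    "SWE-bench Verified": {
        "GLM-5": 77.8, "Claude Opus 4.5": 80.9, "Gemini 3.0 Pro": 76.2,
        "GPT-5.2": 80.0, "DeepSeek-V3.2": 73.1, "Kimi K2.5": 76.8,
    },
    "SWE-bench Multilingual": {
        "GLM-5": 73.3, "Claude Opus 4.5": 77.5, "Gemini 3.0 Pro": 65.0,
        "GPT-5.2": 72.0, "DeepSeek-V3.2": 70.2, "Kimi K2.5": 73.0,
    },
}


def gap_to_leader(row, model="GLM-5"):
    """Return (leading model, gap in percentage points) for one benchmark."""
    leader = max(row, key=row.get)
    return leader, round(row[leader] - row[model], 1)


for benchmark, row in swe_bench.items():
    leader, gap = gap_to_leader(row)
    print(f"{benchmark}: {leader} leads GLM-5 by {gap} points")
```

Claude Opus 4.5 leads both rows, by 3.1 points on Verified and 4.2 points on Multilingual.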


Agent & Tool Use

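Benchmark                 | GLM-5  | Claude Opus 4.5 | Gemini 3.0 Pro | GPT-5.2 | DeepSeek-V3.2 | Kimi K2.5
BrowseComp (Context Mgmt) | 75.9   | 67.8            | 59.2           | 65.8    | 67.6          | 74.9
Vending Bench 2 ($)       | $4,432 | $4,967          | $5,478         | $3,591  | $1,034        | $1,198

BrowseComp (only five scores are listed for the six models): 62.0, 37.0, 37.8, 51.4, 60.6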

What stands out:
GLM-5 performs extremely well in agent-style and multi-step tasks, especially compared to several closed systems.

It’s not the absolute top performer in Vending Bench 2, but it’s clearly operating in the same performance band.
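
For a concrete view of where it lands, here is a short sketch that ranks the six models by their final Vending Bench 2 balance. The dollar figures are the ones in the table above, and the variable names are illustrative.

```python
# Final balances in USD, copied from the Vending Bench 2 row.
balances = {
    "GLM-5": 4432,
    "Claude Opus 4.5": 4967,
    "Gemini 3.0 Pro": 5478,
    "GPT-5.2": 3591,
    "DeepSeek-V3.2": 1034,
    "Kimi K2.5": 1198,
}

# Highest balance first.
ranking = sorted(balances, key=balances.get, reverse=True)

for place, model in enumerate(ranking, start=1):
    print(f"{place}. {model}: ${balances[model]:,}")

glm5_rank = ranking.index("GLM-5") + 1
print(f"GLM-5 finishes {glm5_rank} of {len(ranking)}")
```

GLM-5 finishes third, behind Gemini 3.0 Pro and Claude Opus 4.5 and ahead of GPT-5.2.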


The Bigger Picture

GLM-5 isn’t sweeping every single category. But it’s consistently competitive across reasoning, coding, and agent benchmarks at the same time.

That’s rare. And when you factor in:

  • Open weights
  • MIT license
  • Lower cost

It stops being “good for open.”

It becomes a serious alternative.

Wrapping Up

It’s not always about benchmark charts. Numbers matter, but they’re only part of the story.

When you look closely at what GLM-5 can actually do, you start to see how far open-weight models have come.

And at this pace?

It’s not unrealistic to imagine a future where open models don’t just compete with closed ones but surpass them.

That’s the bigger shift happening here.
