Claude Just Doubled Its Usage Limits. The Real Story Is the SpaceX Deal Behind It

Claude users have spent months playing a very specific game: how much work can you squeeze out of Opus before the rate limits slam shut?

Anthropic is finally loosening things up. The company says it’s doubling Claude Code limits, removing peak-hour reductions for paid users, and significantly raising Opus API caps. The reason is right there in the same announcement: Anthropic now has access to all of the compute capacity at SpaceX’s Colossus 1 data center. That’s over 220,000 NVIDIA GPUs.

That’s the kind of announcement that makes you realize AI companies aren’t just shipping models anymore. They’re building power infrastructure.

This Isn’t Really About Chatbots Anymore

The easiest way to understand Anthropic’s announcement is to look at what people are actually using Claude for now.

A normal chatbot session doesn’t burn through compute like this. You ask a few questions, maybe upload a file, then leave. That’s not the workload forcing companies to secure hundreds of thousands of GPUs.

Claude Code is different. People are leaving it running for hours inside terminals. They’re asking it to refactor projects, debug broken dependencies, analyze huge codebases, write tests, retry failed tasks, call tools, and keep context alive across long sessions. One coding agent can easily consume far more tokens than a casual chat user ever would.
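
To see why, it helps to look at how a multi-turn session is billed. Here’s a minimal sketch in Python using the official anthropic SDK (the model ID and the hard-coded task list are placeholder assumptions; a real Claude Code session generates its own steps). Every turn resends the entire conversation, so input-token consumption compounds as the session grows:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

history = []              # the full conversation, resent on every request
total_in = total_out = 0

# Placeholder tasks; a real agent session generates these itself.
for step, task in enumerate(["refactor the parser", "run the tests", "fix failures"]):
    history.append({"role": "user", "content": task})
    response = client.messages.create(
        model="claude-opus-4-20250514",  # placeholder model ID
        max_tokens=4096,
        messages=history,
    )
    # The reply joins the history, so the next request's input includes
    # everything said so far: input cost grows quadratically, not linearly.
    history.append({"role": "assistant", "content": response.content})
    total_in += response.usage.input_tokens
    total_out += response.usage.output_tokens
    print(f"step {step}: {total_in:,} input / {total_out:,} output tokens so far")
```

Three turns is nothing. Run that loop for a few hours, with tool results and file contents piling into the history, and the gap to a casual chat session becomes enormous.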

That’s probably why the doubled-limits part of Anthropic’s announcement matters more than it first appears.

For months, Claude had this weird split reputation among developers. The model itself was excellent, especially for coding, but heavy users kept running into walls. Long sessions would suddenly slow down. Opus users learned to ration prompts. Some developers even changed workflows entirely around avoiding rate limits.

Now Anthropic is signaling something pretty clearly: it expects usage to keep getting heavier. And honestly, that tracks with where AI tooling is heading. The industry keeps talking about chatbots, but the real compute monster may end up being autonomous agents quietly running in the background for hours at a time.

Tier | Maximum Input Tokens per Minute | Maximum Output Tokens per Minute
1    | 30,000 -> 500,000               | 8,000 -> 80,000
2    | 450,000 -> 2,000,000            | 90,000 -> 200,000
3    | 800,000 -> 5,000,000            | 160,000 -> 400,000
4    | 2,000,000 -> 10,000,000         | 400,000 -> 800,000

Those new limits aren’t small bumps either. Some Opus tiers are seeing output caps increase by 10x, which also explains what Anthropic announced right alongside them: a compute partnership with SpaceX.
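
For developers, hitting a cap surfaces as an HTTP 429 error, and the standard workaround is exponential backoff. A minimal sketch, again in Python with the official anthropic SDK (the model ID is a placeholder; the SDK can also retry for you via its max_retries option):

```python
import time

import anthropic

client = anthropic.Anthropic(max_retries=0)  # disable built-in retries; back off manually

def call_with_backoff(messages, attempts=5):
    """Retry a request that hit a tokens-per-minute cap (HTTP 429)."""
    for attempt in range(attempts):
        try:
            return client.messages.create(
                model="claude-opus-4-20250514",  # placeholder model ID
                max_tokens=2048,
                messages=messages,
            )
        except anthropic.RateLimitError:
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ...
    raise RuntimeError("still rate-limited after backoff")
```

With output caps up to 10x higher, that except branch simply fires less often. That’s the practical upgrade here: fewer retry loops, not cleverer ones.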

The real flex wasn’t the higher limits

The company says its SpaceX partnership gives it access to all of the compute capacity at Colossus 1, adding more than 300 megawatts of capacity and over 220,000 NVIDIA GPUs within the month.
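
As a rough sanity check, 300 megawatts spread across 220,000 GPUs comes out to about 1.4 kW per GPU, which is the right ballpark for a modern accelerator once cooling and networking overhead are counted in, so the two figures hang together.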

That’s an absurd amount of compute for what most people still casually describe as “a chatbot.”

And Anthropic isn’t talking like this in isolation anymore. Over the past year, major AI companies have increasingly announced infrastructure deals the way they used to announce model launches. Amazon, Google, Microsoft, NVIDIA, xAI: everybody suddenly sounds half software company, half utility provider.

The shift makes sense once you look at where AI usage is heading. Reasoning models are expensive to run. Coding agents stay active for long stretches. Enterprise workloads don’t disappear after a few prompts. The more capable these systems become, the more compute they consume in the background.

A year ago, companies competed on benchmark screenshots. Now they’re competing on who can secure enough GPUs to keep the systems online at scale.

This feels like the next phase of the AI race

For a while, AI companies competed mostly on model quality: better benchmarks, reasoning, coding. But now the conversation is shifting toward who can actually keep these systems running at scale without constantly slamming users into limits.

Anthropic partnering with SpaceX for compute sounds unusual today. It probably won’t sound unusual a year from now, because at this point the bottleneck isn’t really ideas anymore. It’s power, GPUs, and who can secure enough infrastructure before everyone else does.
