Anthropic announced that starting August 28, 2025, Claude users - particularly those using Claude Code - will face new weekly usage limits, described in hours rather than raw token counts.

These limits will apply to both Pro and Max subscribers, and are being added on top of the existing rolling 5-hour usage reset system.

What Anthropic Is Saying

Anthropic explained the change is designed to curb excessive usage, such as “users running Claude 24/7” or sharing access in ways that violate the terms of use.

They emphasize that:

“Fewer than 5% of users will be affected by these limits.”
- Anthropic spokesperson, via TechCrunch

They’ve also confirmed that extra usage will be available for purchase if users need more time beyond their weekly cap.

What That Means (In Human Words)

If you’re using Claude the way most people do - asking a few questions, getting help writing or brainstorming here and there - you’ll probably never hit these new limits.

But if you’re using Claude Code to work all day, especially if it’s running quietly in the background while you build, debug, or automate tasks, this change will matter.

Anthropic is adding weekly usage limits on top of the existing 5-hour rolling cap that already pauses your access after long sessions. So now there are two things to watch:

  1. The 5-hour rolling window (still in place)

  2. A new weekly token cap - described in estimated “hours per week”

These new limits apply to Claude’s most capable models:

  • Claude Sonnet 4 - used in Claude Code for general dev work and writing

  • Claude Opus 4 - used for more complex coding, reasoning, and larger tasks

Here’s what Anthropic says that looks like in practical terms:

Plan           | Estimated Sonnet 4 Usage | Estimated Opus 4 Usage
Pro ($20/mo)   | ~40–80 hrs/week          | Not included
Max ($100/mo)  | ~140–280 hrs/week        | ~15–35 hrs/week
Max ($200/mo)  | ~240–480 hrs/week        | ~24–40 hrs/week

Let’s Connect the Dots

If you’re a Claude user - especially on Pro or Max - this rate-limit update may feel abrupt. Suddenly there’s talk of “weekly caps,” “token usage,” and overlapping limits, all tied to outages and behavior. Let’s bring order to that confusion:

We’ll walk through what happened, why it’s happening, and how it will actually affect you - so you know whether your usual workflow is at risk.

What Happened?

When Claude Sonnet 4 and Claude Opus 4 launched on May 22, 2025, Anthropic introduced major upgrades to its Claude platform - especially for developers using Claude Code. These new models offered better reasoning, faster responses, and longer working sessions designed to support real-time workflows like coding, debugging, and task planning. But they didn’t come without guardrails.

From day one, Claude Code was released with a 5-hour rolling usage limit, enforced through token consumption - not literal clock time. This system was designed to prevent overload and abuse, and it was already in place at launch (https://techcrunch.com).

Still, as adoption of the new models grew, problems began to surface.

Between July 23 and July 25, Anthropic’s status page recorded multiple major service incidents, including elevated error rates and complete outages affecting Sonnet 4 and Opus 4 - especially inside Claude Code (https://status.anthropic.com). External monitors like StatusGator logged over 11 hours of downtime during that week alone (https://statusgator.com/services/anthropic).

The root cause? Some users were:

  • Running Claude Code nearly 24/7

  • Using Claude as a backend compute engine

  • Sharing or reselling their Max accounts at scale

In one case, a single Max user reportedly consumed tens of thousands of dollars in model usage - far beyond what a subscription tier was meant to handle.

And here we are. 

What “Hours” Really Mean

Anthropic keeps referring to usage limits in terms of “hours per week” - but this can be misleading if you take it literally.

In reality, there is no timer running in Claude.
There is no stopwatch tracking how long you’re using the model.
What’s being measured is your token consumption.

Anthropic is using the term “hours” as a way to help you estimate how much usage you’ll get - based on average prompt size, model type, and interaction style.

So when you see something like:

  • “15–35 hours per week” for Claude Opus

  • “140–280 hours” for Claude Sonnet

What it really means is:

You’ll get enough tokens to support roughly that many hours of typical usage.

If you’re writing long prompts, generating large code outputs, or calling Claude repeatedly with high-complexity tasks, you’ll burn through those “hours” faster - even if your actual session time is short.

So while it’s helpful as a reference, it’s not a real limit on your time.
It’s a soft estimate based on how fast your tokens are adding up.
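
To make that concrete, here’s a rough back-of-envelope sketch in Python. The TOKENS_PER_HOUR conversion factor is purely our assumption - Anthropic hasn’t published the token count behind one “hour” - so treat this as a way to think about the math, not the real formula:

```python
# Back-of-envelope converter. TOKENS_PER_HOUR is an assumption -
# Anthropic has not published the token count behind one "hour".
TOKENS_PER_HOUR = 200_000  # assumed average burn for one "hour" of typical use

def estimated_hours_used(tokens_consumed: int) -> float:
    """Translate raw token consumption into Anthropic-style estimated 'hours'."""
    return tokens_consumed / TOKENS_PER_HOUR

def weekly_hours_remaining(weekly_cap_hours: float, tokens_consumed: int) -> float:
    """Estimated 'hours' left out of a weekly cap such as Pro's ~40-80."""
    return max(0.0, weekly_cap_hours - estimated_hours_used(tokens_consumed))

# Example: a heavy Claude Code day that burned 3M tokens, measured against
# the low end of the Pro cap (~40 "hours"/week):
print(estimated_hours_used(3_000_000))        # 15.0 estimated "hours" consumed
print(weekly_hours_remaining(40, 3_000_000))  # 25.0 estimated "hours" left
```

Change TOKENS_PER_HOUR and the numbers shift completely - which is exactly why the “hours” framing is an estimate, not a guarantee.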

Do We Know the Actual Token Limits?

Anthropic has not published official numbers stating exactly how many tokens you get per week or per session.

But external reports and user testing offer some useful estimates:

  • External reports and user testing point to rough per-session token budgets, though the exact figures vary by source.

  • Based on those estimates, the weekly caps (like 40–80 hours for Pro) likely reflect total token budgets in line with those per-session numbers.

  • Anthropic has confirmed that it’s token usage, not literal time, that triggers both the rolling and weekly limits - even though they describe access in “hours” to help users estimate. (A toy sketch of how two stacked token windows could interact follows below.)
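
Since the real budgets aren’t public, here’s that toy model: a rolling 5-hour token budget stacked with a rolling weekly one. Every number in it is invented for illustration:

```python
import time
from collections import deque

class StackedTokenLimiter:
    """Toy model of two stacked limits: a rolling 5-hour token budget
    plus a rolling weekly token budget. The budgets passed in below are
    invented for illustration - Anthropic has not published real figures."""

    def __init__(self, session_budget: int, weekly_budget: int):
        self.session_budget = session_budget  # assumed tokens per rolling 5 hours
        self.weekly_budget = weekly_budget    # assumed tokens per rolling 7 days
        self.events = deque()                 # (timestamp, tokens) pairs

    def _spent_since(self, window_seconds, now):
        return sum(tokens for ts, tokens in self.events if now - ts <= window_seconds)

    def allow(self, tokens, now=None):
        """Record a request only if BOTH windows still have budget left."""
        now = time.time() if now is None else now
        if self._spent_since(5 * 3600, now) + tokens > self.session_budget:
            return False  # would blow the rolling 5-hour cap
        if self._spent_since(7 * 24 * 3600, now) + tokens > self.weekly_budget:
            return False  # would blow the weekly cap
        self.events.append((now, tokens))
        return True

limiter = StackedTokenLimiter(session_budget=500_000, weekly_budget=8_000_000)
print(limiter.allow(100_000))  # True - fits inside both windows
print(limiter.allow(450_000))  # False - would exceed the 5-hour budget
```

The takeaway: you can be fine on one window and throttled by the other - which is why there are now two things to watch.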

Let’s Compare… Or Can We?

When it comes to LLM usage, people love asking:
“Which one gives me the most?”
But the truth is…

Comparing Claude, ChatGPT, and Gemini side-by-side isn’t just apples vs. oranges. It’s apples vs. fruit salads vs. mystery smoothies.

Let’s break it down.

🍏 Everyone Counts Differently

  • OpenAI (ChatGPT) gives you a message limit (80 per 3 hours), but doesn’t show tokens.

  • Anthropic (Claude) gives you “hours”, which are really just token-based, but not transparently so.

  • Google (Gemini) gives you daily requests, but not token info unless you're using the API.

So:

Even if you wanted to track your usage - you’re speaking three different languages.
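
You can feel how incompatible these schemes are if you try to write them down in one data structure. Even a minimal sketch (in Python, using only the figures cited above) needs a free-form unit field, because there’s no shared currency:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Quota:
    """One vendor's consumer quota, in its own native unit. The 'unit'
    field is the whole point: the three schemes share no common currency."""
    vendor: str
    amount: Optional[float]  # None where the vendor publishes no number
    unit: str                # "messages", "estimated hours", "requests"
    window: str              # the period the quota applies to

# Only the figures cited in this article; everything else stays unknown.
quotas = [
    Quota("OpenAI (ChatGPT)", 80, "messages", "3 hours"),
    Quota("Anthropic (Claude Pro)", 40, "estimated hours (low end)", "1 week"),
    Quota("Google (Gemini)", None, "requests", "1 day"),
]

for q in quotas:
    amount = "unpublished" if q.amount is None else q.amount
    print(f"{q.vendor}: {amount} {q.unit} per {q.window}")
```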

🧮 Tokens? Sure. But What’s a Token?

Every model counts tokens differently based on how it processes language.

  • A simple sentence might be 15 tokens in one model… and 25 in another.

  • A long response might “cost” you more from Claude than GPT‑4o, even if they said the same thing.

No vendor gives you a universal token calculator - especially not in consumer plans.
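
You can check this yourself with OpenAI’s open-source tiktoken library. It doesn’t ship Claude’s tokenizer, so the sketch below compares two OpenAI-era encodings - but it demonstrates the principle that the same sentence costs a different number of tokens under different schemes:

```python
# pip install tiktoken
import tiktoken

sentence = "Weekly caps are measured in tokens, not wall-clock hours."

# cl100k_base (GPT-4 era) vs o200k_base (GPT-4o era): same sentence,
# different token counts, because each encoding splits text differently.
for name in ("cl100k_base", "o200k_base"):
    enc = tiktoken.get_encoding(name)
    print(f"{name}: {len(enc.encode(sentence))} tokens")
```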

💰 And What About Cost?

Each company has different expenses behind the scenes:

  • Claude runs big models on rented servers → more expensive to operate

  • GPT‑4o is lighter, faster, and cheaper to serve

  • Gemini uses Google’s own chips → they control their cost stack

So pricing and usage limits are not based on fairness - they’re based on cost-per-use math we don’t get to see.

🛠️ Plus: It’s Not Just About Quantity

Even if you could compare raw tokens:

  • Does the model respond better?

  • Is it faster, more useful, or easier to follow?

  • Does it actually help you get more done?

One smart response can be worth more than 10 weak ones.

So… What Does This All Mean?

Comparing usage across LLMs looks simple - but it’s not.
Every vendor uses different formulas, infrastructures, and business models to shape your experience.

What we can compare is:

  • Transparency

  • How often you hit limits

  • And whether you’re getting real value before you do.

Bottom Line

Effective Date:
August 28, 2025 - new token-based rate limits will be enforced.

Impacted Models:

  • Claude Sonnet 4

  • Claude Opus 4

These limits apply to Pro and Max subscribers.

What’s Changing:

  • New throttling system triggered by high sustained token use

  • These limits stack on top of the existing “5-hour cooldown” mechanism

  • No real-time token visibility for users (still a black box)

Cost (unchanged):

  • Claude Pro: $20/month

  • Claude Max: $100/month or $200/month

  • Claude API pricing: unchanged

Frozen Light Team Perspective

When we look at strategic moves like this, we see two things: the gap between having users and real adoption, and the story of AI evolution - and how both are shaped by infrastructure and cost.

We read about AI every day and how fast it’s scaling, but most of us aren’t thinking about tokens.
The users who understand tokens are the ones who can stretch AI to its limit and get the best out of it - while for the rest of us, it’s still just a wow moment.

Because behind every wow moment is a real cost - and tokens are the real game.

If a single user can disrupt a company’s service, that shows how big the gap is between light users and heavy ones.
And if Anthropic says this will impact only 5% of users, well - that tells us a lot about where most users still are.

For us, that’s the real signal: the majority isn’t pushing tools to their limits yet.
There’s still a big gap to close.

In the bigger picture, token usage is going to shape AI pricing and evolution.
There are already signs that pricing will change as vendors learn what average use really looks like and get better at forecasting costs.

You can already see which vendors have more flexibility and control over the infrastructure they use.
This will play a big role in which LLM we choose to use - and for what.

We could be wrong.
But we’ll just have to wait and see.
