The Rate Limits API: read your org limits in code

In brief

Anthropic shipped the Rate Limits API on April 24, 2026. It lets admins read the rate limits configured for their organization and workspaces over HTTP, so gateways, alerting, and provisioning automation no longer depend on hardcoded numbers that drift out of date. Here's what it returns, how to call it, and where it fits in an admin stack.

Until April 24, 2026, the only place to read your org's Claude rate limits was the Limits page in the Claude Console. If a gateway, proxy, or alerting system needed those numbers, you copied them by hand and re-copied them whenever Anthropic adjusted them.

The new Rate Limits API exposes the same data over HTTP. It returns the limits configured for your organization and any per-workspace overrides, broken out by model group, batch jobs, files, skills, and the web search tool.

It is read-only. You still set workspace overrides in the Console.

Who this is for

This is an Admin API endpoint. Three concrete use cases the official docs call out:

  • Gateways and proxies — read your current limits at startup and on a schedule, instead of hardcoding values that drift over time.
  • Internal alerting — combine these limits with the Usage and Cost API to alert when a workspace is approaching its ceiling.
  • Provisioning audits — verify that the workspace overrides set by your automation match what's actually in place.

If your org runs Claude through a single shared API key with no Console workspaces, you probably don't need this. If you've split usage into workspaces — per team, per environment, per customer — and have anything reading limits today, this replaces a manual step.

Authentication

The Rate Limits API is part of the Admin API. It requires an Admin API key (it starts with sk-ant-admin...), which is different from a regular API key. Only an org admin can mint one, in Console → Settings → Admin Keys.

Standard sk-ant-api03-... keys will get a 401 on these endpoints.
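A gateway can catch the wrong key type before it makes a request. The sketch below checks only the sk-ant-admin prefix mentioned above. The function names are hypothetical, and a matching prefix does not prove the key is valid or active.

```typescript
// Heuristic only: Admin API keys start with "sk-ant-admin" per the text above.
function looksLikeAdminKey(key: string): boolean {
  return key.startsWith("sk-ant-admin");
}

// Return the key if it is present and has the admin prefix, otherwise throw
// a descriptive error instead of waiting for the API to reject the request.
function requireAdminKey(key: string | undefined): string {
  if (!key) {
    throw new Error("No Admin API key configured");
  }
  if (!looksLikeAdminKey(key)) {
    throw new Error("Key does not look like an Admin API key (expected sk-ant-admin... prefix)");
  }
  return key;
}
```

Call requireAdminKey once at startup with whatever your configuration provides, so a misconfigured deployment fails at boot rather than on the first limits refresh.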

Two endpoints

Organization limits

curl "https://api.anthropic.com/v1/organizations/rate_limits" \
  --header "anthropic-version: 2023-06-01" \
  --header "x-api-key: $ANTHROPIC_ADMIN_KEY"

The response is a list of rate limit groups. Each group covers a category of resources — for example, all Opus models share one limit group; the Message Batches API has its own; agent skills have their own.

A typical entry:

{
  "type": "rate_limit",
  "group_type": "model_group",
  "models": [
    "claude-opus-4-5",
    "claude-opus-4-5-20251101",
    "claude-opus-4-6",
    "claude-opus-4-7"
  ],
  "limits": [
    { "type": "requests_per_minute",       "value": 4000 },
    { "type": "input_tokens_per_minute",   "value": 2000000 },
    { "type": "output_tokens_per_minute",  "value": 400000 }
  ]
}

The models field tells you which model strings count against this group. Every model ID and alias you can pass to the Messages API appears in exactly one model_group entry.
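Because each model string belongs to exactly one model_group entry, a client that already holds the full list can resolve a model locally. This is a minimal sketch: the interfaces are inferred from the sample entry above and model only the fields shown there, and the function names are hypothetical.

```typescript
// Shapes inferred from the sample entry; fields not shown there are not modeled.
interface Limit {
  type: string;
  value: number;
}

interface RateLimitGroup {
  type: string;
  group_type: string;
  models?: string[];
  limits: Limit[];
}

// Find the model_group entry whose `models` list contains the given model string.
function groupForModel(groups: RateLimitGroup[], model: string): RateLimitGroup | undefined {
  return groups.find(
    (g) => g.group_type === "model_group" && (g.models ?? []).includes(model),
  );
}

// Read one limiter (for example "requests_per_minute") from a group.
// Returns undefined when the group has no limiter of that type.
function limitValue(group: RateLimitGroup, limitType: string): number | undefined {
  return group.limits.find((l) => l.type === limitType)?.value;
}
```

With the sample entry loaded, groupForModel(groups, "claude-opus-4-7") returns that entry, and limitValue(entry, "requests_per_minute") reads its 4000 value.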

To look up the limits for a specific model, pass it as a query parameter:

curl "https://api.anthropic.com/v1/organizations/rate_limits?model=claude-opus-4-7" \
  --header "anthropic-version: 2023-06-01" \
  --header "x-api-key: $ANTHROPIC_ADMIN_KEY"

If the model string doesn't match any group, you get a 404.

Workspace overrides

curl "https://api.anthropic.com/v1/organizations/workspaces/$WORKSPACE_ID/rate_limits" \
  --header "anthropic-version: 2023-06-01" \
  --header "x-api-key: $ANTHROPIC_ADMIN_KEY"

This endpoint returns only the overrides set on the workspace. A limit that is missing from the response is inherited from the organization; it is not unlimited.

For each overridden limiter, the response includes the workspace value and the org value side by side:

{
  "limits": [
    { "type": "requests_per_minute",     "value": 1000,  "org_limit": 4000    },
    { "type": "input_tokens_per_minute", "value": 500000,"org_limit": 2000000 }
  ]
}

To get the effective limits a workspace is operating under, you merge the workspace response on top of the org response. A short helper:

// Shared request helper: sends the Admin API headers and fails loudly on non-2xx responses
// (for example, a 401 when a standard API key is used).
async function getJson(url: string) {
  const res = await fetch(url, {
    headers: { 'anthropic-version': '2023-06-01', 'x-api-key': process.env.ANTHROPIC_ADMIN_KEY! },
  })
  if (!res.ok) throw new Error(`GET ${url} failed with status ${res.status}`)
  return res.json()
}

async function effectiveLimits(workspaceId: string) {
  const base = 'https://api.anthropic.com/v1/organizations'
  const [org, ws] = await Promise.all([
    getJson(`${base}/rate_limits`),
    getJson(`${base}/workspaces/${workspaceId}/rate_limits`),
  ])

  // Key each group by group_type plus its first model (if any) so org and workspace entries line up.
  // Assumes workspace entries carry group_type and models in the same form as org entries.
  const keyOf = (g: any) => g.group_type + ':' + (g.models?.[0] ?? '')

  // Index workspace overrides: group key -> (limit type -> workspace value)
  const overrides = new Map<string, Map<string, number>>()
  for (const g of ws.data) {
    overrides.set(keyOf(g), new Map(g.limits.map((l: any) => [l.type, l.value])))
  }

  // Walk the org groups and substitute the workspace value wherever an override exists
  return org.data.map((g: any) => {
    const o = overrides.get(keyOf(g))
    return {
      ...g,
      limits: g.limits.map((l: any) => ({
        type: l.type,
        value: o?.get(l.type) ?? l.value,
        source: o?.has(l.type) ? 'workspace' : 'org',
      })),
    }
  })
}

You can also filter either endpoint by group_type. Valid values are model_group, batch, token_count, files, skills, and web_search.
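A small URL builder keeps the filter values in one place. This sketch assumes the filter is sent as a group_type query parameter, mirroring the model parameter shown earlier. The function names and the workspace ID in the usage note are hypothetical.

```typescript
// The filter values listed above, kept as a readonly tuple so the type stays in sync.
const GROUP_TYPES = ["model_group", "batch", "token_count", "files", "skills", "web_search"] as const;
type GroupType = (typeof GROUP_TYPES)[number];

// Runtime check for values that arrive from config files or CLI flags.
function isGroupType(value: string): value is GroupType {
  return (GROUP_TYPES as readonly string[]).includes(value);
}

// Build the org-level URL, or the workspace-level URL when a workspace ID is given.
// Assumption: the filter is passed as a `group_type` query parameter.
function rateLimitsUrl(groupType?: GroupType, workspaceId?: string): string {
  const base = "https://api.anthropic.com/v1/organizations";
  const path = workspaceId
    ? `${base}/workspaces/${encodeURIComponent(workspaceId)}/rate_limits`
    : `${base}/rate_limits`;
  return groupType ? `${path}?group_type=${encodeURIComponent(groupType)}` : path;
}
```

For example, rateLimitsUrl("batch") targets the org endpoint filtered to batch limits, and rateLimitsUrl("model_group", "wrkspc_example") targets one workspace's model group overrides.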

What's not in the response

A few things to know up front:

  • Limits for Claude Managed Agents are not included. That product has its own resource model.
  • You can't update limits with this API. Workspace overrides are still set in the Console.
  • The default workspace has no entry on the workspace endpoint. Use the org endpoint to read its limits.
  • Pagination is in place, but every response is currently a single page, so next_page is always null. The docs recommend looping on next_page anyway so your client doesn't break when responses grow.
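The pagination advice can be followed with a small loop. This sketch takes the data and next_page field names from this article. How the token is sent back to the API is not covered here, so that step is left to a hypothetical fetchPage callback supplied by the caller.

```typescript
// One page of results, using the `data` and `next_page` fields described in this article.
interface Page<T> {
  data: T[];
  next_page: string | null;
}

// Collect every item across pages. `fetchPage` is a hypothetical callback:
// it receives null for the first page, then each next_page token in turn.
async function collectAll<T>(
  fetchPage: (token: string | null) => Promise<Page<T>>,
): Promise<T[]> {
  const items: T[] = [];
  let token: string | null = null;
  do {
    const page: Page<T> = await fetchPage(token);
    items.push(...page.data);
    token = page.next_page;
  } while (token !== null);
  return items;
}
```

Today the loop body runs once because next_page is null. The same code keeps working if responses later span several pages.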

Where this fits in an admin stack

Two patterns the API was built for:

At gateway startup. A team running an internal Claude gateway can fetch its workspace's effective limits at boot and use them to size local rate limiters. Re-fetch on a schedule (every few hours) so a Console change in limits propagates without a deploy.
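One way to size a local limiter from a fetched requests_per_minute value is a token bucket. This is a sketch under assumptions, not the way any particular gateway works. The class name is hypothetical, and time is passed in explicitly so the behavior stays deterministic and easy to test.

```typescript
// Token bucket sized from a per-minute limit. It starts full and refills
// continuously at `perMinute` tokens per 60 seconds.
class MinuteBucket {
  private perMinute: number;
  private tokens: number;
  private lastMs: number;

  constructor(perMinute: number, nowMs: number) {
    this.perMinute = perMinute;
    this.tokens = perMinute;
    this.lastMs = nowMs;
  }

  // Apply a freshly fetched limit without discarding current state.
  resize(perMinute: number): void {
    this.perMinute = perMinute;
    this.tokens = Math.min(this.tokens, perMinute);
  }

  // Refill for the elapsed time, then try to spend `cost` tokens.
  tryTake(nowMs: number, cost: number = 1): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastMs);
    const refill = (elapsedMs / 60000) * this.perMinute;
    this.tokens = Math.min(this.perMinute, this.tokens + refill);
    this.lastMs = nowMs;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}
```

On each scheduled re-fetch, call resize with the new effective value instead of constructing a new bucket, so a lowered limit takes effect immediately and a raised one fills in over time.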

As an alerting input. Combine this endpoint with the Usage and Cost API. The Usage API tells you tokens consumed in the last interval; this endpoint tells you the ceiling. Compute the ratio per workspace and fire an alert at, say, 80%. This catches workspaces that are about to start failing before users feel it.
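The ratio check itself is a few lines. In this sketch the WorkspaceUsage shape and function name are hypothetical, and it assumes you have already reduced usage to the same unit and window as the limit (for example, input tokens in the busiest minute against input_tokens_per_minute).

```typescript
// One row per workspace and limiter. `used` and `limit` must share a unit and window.
interface WorkspaceUsage {
  workspaceId: string;
  used: number;
  limit: number;
}

interface UtilizationAlert {
  workspaceId: string;
  ratio: number;
}

// Return the workspaces whose usage is at or above `threshold` of their limit.
// Rows with a non-positive limit are skipped to avoid dividing by zero.
function utilizationAlerts(rows: WorkspaceUsage[], threshold: number = 0.8): UtilizationAlert[] {
  return rows
    .filter((r) => r.limit > 0)
    .map((r) => ({ workspaceId: r.workspaceId, ratio: r.used / r.limit }))
    .filter((a) => a.ratio >= threshold);
}
```

Feed it the effective limit per workspace rather than the org value, so a workspace with a lower override alerts against its own ceiling.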

The right cadence depends on how often your org changes limits. Most orgs can poll daily. Gateways can re-fetch every few hours.


Source: Rate Limits API, released April 24, 2026 per the Claude Platform release notes.
