The Rate Limits API: read your org limits in code
In brief
Anthropic shipped the Rate Limits API on April 24, 2026. It lets admins read the rate limits configured for their organization and workspaces over HTTP, so gateways, alerting, and provisioning automation stop drifting against hardcoded numbers. Here's what it returns, how to call it, and where it fits in an admin stack.
Until April 24, 2026, the only place to read your org's Claude rate limits was the Limits page in the Claude Console. If a gateway, proxy, or alerting system needed those numbers, you copied them by hand and re-copied them whenever Anthropic adjusted them.
The new Rate Limits API exposes the same data over HTTP. It returns the limits configured for your organization and any per-workspace overrides, broken out by model group, batch jobs, files, skills, and the web search tool.
It is read-only. You still set workspace overrides in the Console.
Who this is for
This is an Admin API endpoint. The official docs call out three concrete use cases:
- Gateways and proxies — read your current limits at startup and on a schedule, instead of hardcoding values that drift over time.
- Internal alerting — combine these limits with the Usage and Cost API to alert when a workspace is approaching its ceiling.
- Provisioning audits — verify that the workspace overrides set by your automation match what's actually in place.
If your org runs Claude through a single shared API key with no Console workspaces, you probably don't need this. If you've split usage into workspaces — per team, per environment, per customer — and have anything reading limits today, this replaces a manual step.
Authentication
The Rate Limits API is part of the Admin API. It requires an Admin API key (it starts with sk-ant-admin...), which is different from a regular API key. Only an org admin can mint one, in Console → Settings → Admin Keys.
Standard sk-ant-api03-... keys will get a 401 on these endpoints.
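A gateway or script can fail fast on a misconfigured key before it calls the endpoint at all. The sketch below checks only the prefix mentioned above. `assertAdminKey` is a hypothetical helper name, and a matching prefix does not prove the key is valid or still active.

```typescript
// Hypothetical guard: checks only the documented key prefix, not validity.
const ADMIN_KEY_PREFIX = 'sk-ant-admin'

function assertAdminKey(key: string | undefined): string {
  if (!key) {
    throw new Error('No Admin API key provided')
  }
  if (!key.startsWith(ADMIN_KEY_PREFIX)) {
    throw new Error('Expected an Admin API key (sk-ant-admin...), got a different key type')
  }
  return key
}
```

Calling it once at startup, on whatever value you read from your secret store, turns a confusing 401 at request time into a clear configuration error at boot.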
Two endpoints
Organization limits
curl "https://api.anthropic.com/v1/organizations/rate_limits" \
--header "anthropic-version: 2023-06-01" \
--header "x-api-key: $ANTHROPIC_ADMIN_KEY"
The response is a list of rate limit groups. Each group covers a category of resources — for example, all Opus models share one limit group; the Message Batches API has its own; agent skills have their own.
A typical entry:
{
  "type": "rate_limit",
  "group_type": "model_group",
  "models": [
    "claude-opus-4-5",
    "claude-opus-4-5-20251101",
    "claude-opus-4-6",
    "claude-opus-4-7"
  ],
  "limits": [
    { "type": "requests_per_minute", "value": 4000 },
    { "type": "input_tokens_per_minute", "value": 2000000 },
    { "type": "output_tokens_per_minute", "value": 400000 }
  ]
}
The models field tells you which model strings count against this group. Every model ID and alias you can pass to the Messages API appears in exactly one model_group entry.
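Because each model string belongs to exactly one group, a client that has already fetched the full list can resolve a model locally. A minimal sketch, assuming the entry shape shown in the sample above; `groupForModel` and `limitFor` are illustrative names, not part of any SDK.

```typescript
// Shapes inferred from the sample entry; no fields beyond these are assumed.
interface Limit {
  type: string
  value: number
}
interface RateLimitGroup {
  type: string
  group_type: string
  models?: string[]
  limits: Limit[]
}

// Return the model_group entry whose `models` list contains `model`, if any.
function groupForModel(groups: RateLimitGroup[], model: string): RateLimitGroup | undefined {
  return groups.find(g => g.group_type === 'model_group' && (g.models ?? []).includes(model))
}

// Read a single limiter (for example 'requests_per_minute') for a model.
function limitFor(groups: RateLimitGroup[], model: string, limitType: string): number | undefined {
  return groupForModel(groups, model)?.limits.find(l => l.type === limitType)?.value
}
```

With the sample entry loaded, `limitFor(groups, 'claude-opus-4-7', 'requests_per_minute')` would yield the group's shared requests-per-minute value, and an unknown model string yields `undefined`.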
To look up the limits for a specific model, pass it as a query parameter:
curl "https://api.anthropic.com/v1/organizations/rate_limits?model=claude-opus-4-7" \
--header "anthropic-version: 2023-06-01" \
--header "x-api-key: $ANTHROPIC_ADMIN_KEY"
If the model string doesn't match any group, you get a 404.
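In client code it is often convenient to treat that 404 as "no such model" rather than as a failure. The sketch below does this with an injected fetch-like function so it can be tested offline. The `Fetcher` type and `lookupModelLimits` name are assumptions made for illustration; in Node 18+ the global `fetch` can be passed as the fetcher.

```typescript
// Minimal response and fetch shapes, so the sketch does not depend on a runtime's fetch types.
interface HttpResponse {
  ok: boolean
  status: number
  json(): Promise<unknown>
}
type Fetcher = (url: string, init: { headers: Record<string, string> }) => Promise<HttpResponse>

// Look up limits for one model. Resolves to null when the API answers 404
// (the model string matches no group) and throws on any other non-2xx status.
async function lookupModelLimits(
  model: string,
  adminKey: string,
  fetcher: Fetcher,
): Promise<unknown> {
  const url =
    'https://api.anthropic.com/v1/organizations/rate_limits?model=' + encodeURIComponent(model)
  const res = await fetcher(url, {
    headers: { 'anthropic-version': '2023-06-01', 'x-api-key': adminKey },
  })
  if (res.status === 404) return null
  if (!res.ok) throw new Error(`Rate limit lookup failed with status ${res.status}`)
  return res.json()
}
```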
Workspace overrides
curl "https://api.anthropic.com/v1/organizations/workspaces/$WORKSPACE_ID/rate_limits" \
--header "anthropic-version: 2023-06-01" \
--header "x-api-key: $ANTHROPIC_ADMIN_KEY"
This endpoint returns only the overrides set on the workspace. A limiter that is absent from the response is inherited from the organization; it is not unlimited.
For each overridden limiter, the response includes the workspace value and the org value side by side:
{
  "limits": [
    { "type": "requests_per_minute", "value": 1000, "org_limit": 4000 },
    { "type": "input_tokens_per_minute", "value": 500000, "org_limit": 2000000 }
  ]
}
To get the effective limits a workspace is operating under, you merge the workspace response on top of the org response. A short helper:
async function effectiveLimits(workspaceId: string) {
  const base = 'https://api.anthropic.com/v1/organizations'
  const headers = {
    'anthropic-version': '2023-06-01',
    'x-api-key': process.env.ANTHROPIC_ADMIN_KEY!,
  }
  // Fetch JSON and fail loudly on non-2xx responses (for example, a 401 from a non-admin key)
  const getJson = async (url: string) => {
    const r = await fetch(url, { headers })
    if (!r.ok) throw new Error(`GET ${url} failed with status ${r.status}`)
    return r.json()
  }
  const [org, ws] = await Promise.all([
    getJson(`${base}/rate_limits`),
    getJson(`${base}/workspaces/${workspaceId}/rate_limits`),
  ])
  // Index workspace overrides by group_type + (first model, if any) for merge.
  // This assumes both endpoints list a group's models in the same order.
  const keyOf = (g: any) => g.group_type + ':' + (g.models?.[0] ?? '')
  const overrides = new Map<string, Map<string, number>>()
  for (const g of ws.data) {
    overrides.set(keyOf(g), new Map(g.limits.map((l: any) => [l.type, l.value])))
  }
  // Overlay each org limit with the workspace value where one exists
  return org.data.map((g: any) => {
    const o = overrides.get(keyOf(g))
    return {
      ...g,
      limits: g.limits.map((l: any) => ({
        type: l.type,
        value: o?.get(l.type) ?? l.value,
        source: o?.has(l.type) ? 'workspace' : 'org',
      })),
    }
  })
}
You can also filter either endpoint by group_type. Valid values are model_group, batch, token_count, files, skills, and web_search.
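A small URL builder can reject a mistyped filter before it reaches the network. This is a sketch under one stated assumption: that the filter is sent as a query parameter named `group_type`, mirroring the `model` parameter shown earlier. `rateLimitsUrl` and `isGroupType` are hypothetical helper names.

```typescript
// The six values listed above.
const GROUP_TYPES = ['model_group', 'batch', 'token_count', 'files', 'skills', 'web_search'] as const
type GroupType = (typeof GROUP_TYPES)[number]

function isGroupType(value: string): value is GroupType {
  return (GROUP_TYPES as readonly string[]).includes(value)
}

// Build a rate-limits URL for the org (no workspaceId) or a workspace,
// with an optional group_type filter.
// Assumption: the filter is a query parameter named `group_type`.
function rateLimitsUrl(groupType?: string, workspaceId?: string): string {
  const base = 'https://api.anthropic.com/v1/organizations'
  const path = workspaceId
    ? `${base}/workspaces/${encodeURIComponent(workspaceId)}/rate_limits`
    : `${base}/rate_limits`
  if (groupType === undefined) return path
  if (!isGroupType(groupType)) {
    throw new Error(`Unknown group_type "${groupType}"; expected one of ${GROUP_TYPES.join(', ')}`)
  }
  return `${path}?group_type=${groupType}`
}
```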
What's not in the response
A few things to know up front:
- Limits for Claude Managed Agents are not included. That product has its own resource model.
- You can't update limits with this API. Workspace overrides are still set in the Console.
- The default workspace has no entry on the workspace endpoint. Use the org endpoint to read its limits.
- Pagination is in place, but today every response is a single page and next_page is always null. The docs recommend looping on next_page anyway so your client doesn't break when responses grow.
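The pagination advice can be followed without knowing yet how the cursor travels on the wire. The sketch below loops until `next_page` is null and leaves the actual request to a callback. `collectAllPages` and `fetchPage` are hypothetical names, and how `next_page` is passed back to the API is deliberately left to the caller rather than guessed here.

```typescript
// One page of results. `data` and `next_page` are the only fields relied on.
interface Page<T> {
  data: T[]
  next_page: string | null
}

// Collect every item by following next_page until it is null.
// `fetchPage` receives the previous next_page value (null for the first
// request) and returns the corresponding page.
async function collectAllPages<T>(
  fetchPage: (cursor: string | null) => Promise<Page<T>>,
  maxPages = 100, // safety cap against a cursor that never terminates
): Promise<T[]> {
  const items: T[] = []
  let cursor: string | null = null
  for (let i = 0; i < maxPages; i++) {
    const page: Page<T> = await fetchPage(cursor)
    items.push(...page.data)
    if (page.next_page === null) return items
    cursor = page.next_page
  }
  throw new Error(`Gave up after ${maxPages} pages`)
}
```

With today's single-page responses the loop runs once and returns, so it costs nothing to adopt early.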
Where this fits in an admin stack
Two patterns the API was built for:
At gateway startup. A team running an internal Claude gateway can fetch its workspace's effective limits at boot and use them to size local rate limiters. Re-fetch on a schedule (every few hours) so a Console change in limits propagates without a deploy.
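One way to size a local limiter from a fetched value is a token bucket whose capacity and refill rate both come from `requests_per_minute`. The class below is an illustrative sketch, not something the API or its docs prescribe. `MinuteBucket` is a hypothetical name, and the bucket is per process, so a gateway running several instances would need to divide the limit among them or share state.

```typescript
// Token bucket allowing up to `perMinute` requests per minute, refilled continuously.
class MinuteBucket {
  private perMinute: number
  private tokens: number
  private last: number
  private now: () => number

  // `now` is injectable so the bucket can be tested with a fake clock.
  constructor(perMinute: number, now: () => number = Date.now) {
    this.perMinute = perMinute
    this.now = now
    this.tokens = perMinute
    this.last = now()
  }

  // Apply a newly fetched limit without restarting the process.
  resize(perMinute: number): void {
    this.refill()
    this.perMinute = perMinute
    this.tokens = Math.min(this.tokens, perMinute)
  }

  // Returns true and spends one token if the request may proceed.
  tryAcquire(): boolean {
    this.refill()
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }

  // Add tokens for the time elapsed since the last call, capped at capacity.
  private refill(): void {
    const t = this.now()
    const elapsedMs = t - this.last
    this.last = t
    this.tokens = Math.min(this.perMinute, this.tokens + (elapsedMs / 60_000) * this.perMinute)
  }
}
```

At boot the gateway would construct the bucket from the effective `requests_per_minute`, and the scheduled re-fetch would call `resize` with the new value.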
As an alerting input. Combine this endpoint with the Usage and Cost API. The Usage API tells you tokens consumed in the last interval; this endpoint tells you the ceiling. Compute the ratio per workspace and fire an alert at, say, 80%. This catches workspaces that are about to start failing before users feel it.
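The ratio check itself is a few lines once usage and limits are in the same unit. The sketch below assumes you have already reduced your usage data to a per-minute figure for each workspace and limiter, since the limits are per-minute values. The `WorkspaceUsage` shape and `limitAlerts` name are hypothetical and do not mirror the Usage and Cost API's actual response.

```typescript
// Hypothetical input row: one workspace, one limiter, already in per-minute units.
interface WorkspaceUsage {
  workspaceId: string
  limitType: string // for example 'input_tokens_per_minute'
  used: number // peak per-minute consumption in the interval, from your usage data
  limit: number // effective ceiling from the rate limits endpoints
}

interface LimitAlert {
  workspaceId: string
  limitType: string
  ratio: number
}

// Return one alert per row whose used/limit ratio meets the threshold.
// Rows with a non-positive limit are skipped to avoid division by zero.
function limitAlerts(rows: WorkspaceUsage[], threshold = 0.8): LimitAlert[] {
  return rows
    .filter(r => r.limit > 0 && r.used / r.limit >= threshold)
    .map(r => ({ workspaceId: r.workspaceId, limitType: r.limitType, ratio: r.used / r.limit }))
}
```

Using the peak minute rather than the interval average is a design choice: an average can sit well under the threshold while individual minutes are already hitting the ceiling.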
The right cadence depends on how often your org changes limits. Daily polling is enough for most orgs; gateways are the exception and should re-fetch every few hours.
Related reading
- Building a business case for Claude — when admin overhead like this matters
- Claude admin ongoing maintenance — fits in the regular admin pass
- Rate limiting Claude API — the client-side companion: handling 429s in your app
Source: Rate Limits API, released April 24, 2026 per the Claude Platform release notes.