Running Claude Managed Agents on your own infrastructure

In brief

Self-hosted sandboxes let Claude Managed Agents keep their orchestration on Anthropic's side while tool execution — the agent's filesystem, processes, and network — runs on infrastructure you control. It's the answer for agents that touch data that can't leave your network. Here's the work-queue model, the environment worker, and how MCP tunnels, scheduling, and vaults fit in.

Claude Managed Agents normally execute their tool calls — running code, reading and writing files, reaching the network — inside Anthropic-managed cloud sandboxes. That's the easy path, and for most agents it's fine.

It stops being fine the moment the agent needs to operate on data that can't leave your network, reach internal services that aren't publicly routable, or run under your own compliance and audit controls. Self-hosted sandboxes are the answer to all three. They reached general availability across the Claude API and Claude Platform on AWS over May–June 2026, with scheduled deployments and vault credentials landing alongside the Fable 5 release on June 9.

This is a builder/operator guide to how they work and when to reach for them.

What stays where

The split is the whole point:

Orchestration stays on Anthropic's side. Claude — the model deciding what to do next — runs on Anthropic's control plane. You don't host the model.
Tool execution moves to your infrastructure. The filesystem the agent reads and writes, the processes it spawns, and the network it can reach are all on a host you control, under your network policy and lifecycle.
Tool inputs and outputs still flow to Anthropic's control plane so Claude can see results and pick the next step. That's the data-flow boundary to understand before you commit; the security model spells it out.

In a table:

	Cloud sandbox (default)	Self-hosted sandbox
Where tools run	Anthropic-managed	Your infrastructure
Network reach	Anthropic's egress controls	Your network policy
File / repo mounting	Managed by Anthropic	Managed by you
Lifecycle	Managed by Anthropic	Managed by you

Self-hosting is also what unlocks Zero Data Retention and HIPAA BAA eligibility for the execution layer — see API and data retention.

The mental model: a work queue

A self-hosted environment is a work queue. When a session is assigned to it, Anthropic enqueues that session as a work item. A process you run — the environment worker — claims items from the queue, downloads the agent's skills, runs the tool calls locally, and posts the results back.
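The claim, run, and post cycle can be sketched with an in-memory stand-in for the queue. This is an illustration only: `FakeQueue`, `WorkItem`, `ToolResult`, and `drainOnce` are hypothetical names, the item shape is assumed rather than taken from the API, and the skill-download step is left out.

```typescript
// Hypothetical shapes for illustration; the real wire format is not shown in this article.
type WorkItem = { workId: string; sessionId: string; toolCalls: string[] };
type ToolResult = { workId: string; call: string; output: string };

// In-memory stand-in for the queue that Anthropic hosts.
class FakeQueue {
  private items: WorkItem[] = [];
  readonly results: ToolResult[] = [];

  enqueue(item: WorkItem): void {
    this.items.push(item);
  }

  // Hands out the oldest waiting item, or undefined when the queue is empty.
  claim(): WorkItem | undefined {
    return this.items.shift();
  }

  post(result: ToolResult): void {
    this.results.push(result);
  }

  get depth(): number {
    return this.items.length;
  }
}

// One pass of a worker: claim items until the queue is empty, run each
// tool call locally, and post every result back. Returns items processed.
function drainOnce(queue: FakeQueue, runTool: (call: string) => string): number {
  let processed = 0;
  while (true) {
    const item = queue.claim();
    if (item === undefined) break;
    for (const call of item.toolCalls) {
      queue.post({ workId: item.workId, call, output: runTool(call) });
    }
    processed += 1;
  }
  return processed;
}

const demoQueue = new FakeQueue();
demoQueue.enqueue({ workId: "work_1", sessionId: "sess_1", toolCalls: ["ls", "pwd"] });
const handled = drainOnce(demoQueue, (call) => `ran ${call}`);
```

The point of the sketch is the direction of traffic: the worker always initiates, which is why the always-on pattern needs only outbound HTTPS.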

You run the worker one of two ways:

Always-on: a long-running process polls the queue continuously. Needs only outbound HTTPS. Simplest setup.
Webhook-triggered: a handler wakes on the session.status_run_started webhook and starts polling at that point. Avoids an idle poller, but needs an endpoint Anthropic can reach.

Both the ant CLI and the SDKs ship pre-built workers. The CLI supports the always-on pattern; the SDK supports both.
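For the webhook-triggered pattern, the handler's first job is deciding whether an incoming event should wake the poller. A minimal sketch follows; only the event name comes from this article, while the payload shape (a `type` field and an optional `environment_id`) and the `shouldWakeWorker` helper are assumptions for illustration.

```typescript
// Hypothetical webhook payload: only the event name is taken from the article.
type WebhookEvent = { type: string; environment_id?: string };

const RUN_STARTED = "session.status_run_started";

// Decide whether an incoming webhook should wake the poller for this environment.
function shouldWakeWorker(event: WebhookEvent, myEnvironmentId: string): boolean {
  if (event.type !== RUN_STARTED) return false;
  // If the payload names an environment, ignore events meant for other environments.
  return event.environment_id === undefined || event.environment_id === myEnvironmentId;
}
```

A real handler would also verify that the request came from Anthropic before acting on it; that step is omitted here because the verification scheme is not covered in this article.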

Standing one up

1. Create the environment with config.type: "self_hosted":

curl -sS --fail-with-body https://api.anthropic.com/v1/environments \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{ "name": "self-hosted", "config": {"type": "self_hosted"} }'

2. Generate an environment key in the Console (key generation is Console-only, even if you created the environment over the API). On the worker host:

export ANTHROPIC_ENVIRONMENT_KEY="sk-ant-oat01-..."
export ANTHROPIC_ENVIRONMENT_ID="env_..."

3. Run the worker. With the CLI:

ant beta:worker poll --workdir /workspace

Or with the SDK (TypeScript shown):

import Anthropic from "@anthropic-ai/sdk";
import { EnvironmentWorker } from "@anthropic-ai/sdk/helpers/beta/environments";

const environmentKey = process.env.ANTHROPIC_ENVIRONMENT_KEY!;
const environmentId = process.env.ANTHROPIC_ENVIRONMENT_ID!;
const client = new Anthropic({ authToken: environmentKey });

const controller = new AbortController();
process.once("SIGTERM", () => controller.abort());

await new EnvironmentWorker({
  client,
  environmentId,
  environmentKey,
  workdir: "/workspace",
  signal: controller.signal,
}).run();

The worker drains in-flight tool calls and exits cleanly on SIGTERM.

4. Create a session that targets the environment. It enters the queue and waits there until a worker claims it — if no worker is connected, the session stays queued rather than failing:

curl -sS --fail-with-body https://api.anthropic.com/v1/sessions \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01" \
  -H "content-type: application/json" \
  -d '{ "agent": "'"$AGENT_ID"'", "environment_id": "'"$ANTHROPIC_ENVIRONMENT_ID"'" }'

The one security rule that matters most

Run monitoring and session-management calls (work.stats, work.stop, session creation) with your organization API key, from outside the worker host. The worker itself authenticates with the environment key only.

Never set ANTHROPIC_API_KEY on the worker host. Doing so exposes an organization-scoped credential to the agent's own tool calls.

The environment key is scoped to claiming and completing work. The API key can spend money and read org data. Keep them on different machines.
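One way to enforce that rule is a preflight check that refuses to start the worker when the organization key is present. The sketch below is an assumption about how you might wire it, not part of the SDK; `checkWorkerEnv` is a hypothetical helper that takes the environment as a plain object.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of problems; an empty list means the worker may start.
function checkWorkerEnv(env: Env): string[] {
  const problems: string[] = [];
  // The organization key must never be visible to the agent's tool calls.
  if (env["ANTHROPIC_API_KEY"]) {
    problems.push("ANTHROPIC_API_KEY must not be set on the worker host");
  }
  // The worker needs the environment-scoped credentials and nothing broader.
  if (!env["ANTHROPIC_ENVIRONMENT_KEY"]) {
    problems.push("ANTHROPIC_ENVIRONMENT_KEY is missing");
  }
  if (!env["ANTHROPIC_ENVIRONMENT_ID"]) {
    problems.push("ANTHROPIC_ENVIRONMENT_ID is missing");
  }
  return problems;
}
```

In a Node.js worker you would call it with `process.env` before constructing the worker and exit with a non-zero status if the list is not empty.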

Stronger isolation: a sandbox per session

ant beta:worker poll runs tool calls in-process, in the working directory. If you need a fresh filesystem, resource limits, or per-session network controls, run each session in its own sandbox. Build an image that has ant installed and uses ant beta:worker run as its entrypoint, then point the poller at a spawn script:

ant beta:worker poll --on-work ./spawn.sh

The poller injects ANTHROPIC_SESSION_ID, ANTHROPIC_WORK_ID, ANTHROPIC_ENVIRONMENT_ID, and ANTHROPIC_ENVIRONMENT_KEY into the script's environment; the script then launches a fresh container per session (Docker, or a managed sandbox provider). To retrieve the agent's deliverables after the container exits, bind-mount a host directory to the sandbox's /mnt/session/outputs.
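If the spawn step uses Docker, its core is assembling the `docker run` arguments from the injected variables. The sketch below only builds the argument list; `dockerArgs`, the image name, and the per-session output directory layout are assumptions, and the image is presumed to already have `ant beta:worker run` as its entrypoint.

```typescript
type SpawnEnv = Record<string, string | undefined>;

// Variables the poller injects, per the description above.
const INJECTED = [
  "ANTHROPIC_SESSION_ID",
  "ANTHROPIC_WORK_ID",
  "ANTHROPIC_ENVIRONMENT_ID",
  "ANTHROPIC_ENVIRONMENT_KEY",
] as const;

// Builds the argument list for `docker run`, one fresh container per session.
function dockerArgs(env: SpawnEnv, image: string, outputsRoot: string): string[] {
  const sessionId = env["ANTHROPIC_SESSION_ID"];
  if (!sessionId) throw new Error("ANTHROPIC_SESSION_ID is not set");

  const args = ["run", "--rm"];
  for (const name of INJECTED) {
    if (!env[name]) throw new Error(`${name} is not set`);
    // "-e NAME" without a value forwards the variable from the caller's
    // environment, which keeps the key out of the process argument list.
    args.push("-e", name);
  }
  // Hypothetical layout: one host directory per session for deliverables.
  args.push("-v", `${outputsRoot}/${sessionId}:/mnt/session/outputs`);
  args.push(image);
  return args;
}
```

Called with the script's environment, a hypothetical image `my-agent-sandbox:latest`, and `/srv/agent-outputs`, it yields arguments you could pass to a process-spawning API. Resource limits and network flags would be added at the same point.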

If you'd rather not build the harness, there are pre-built integrations for Blaxel, Cloudflare, Daytona, E2B, Modal, Namespace, Superserve, and Vercel — each gives you per-session sandboxes without writing the spawn logic yourself.

Reaching private MCP servers

Self-hosting controls where the agent's code executes. MCP tunnels control how Anthropic reaches MCP servers inside your network. They're independent:

A session in Anthropic's cloud sandbox can still reach a private MCP server through a tunnel.
A self-hosted session can use tunneled or public MCP servers.

Use both when you want execution and tool access to stay inside your boundary — the full-isolation configuration for sensitive agents.

Two companion features (June 9)

Shipped alongside Fable 5 and worth knowing if you operate agents:

Scheduled deployments — run agent sessions on a cron schedule without standing up your own scheduler. The natural fit for recurring jobs: a nightly reconciliation agent, a Monday-morning report builder.
Vault environment-variable credentials — securely inject secrets into the agent's sandbox as environment variables, for CLIs, SDKs, and other services that authenticate that way. Keeps keys out of your prompts and out of the agent's code.

Operating the fleet

work.stats returns queue health — call it from your ops tooling with the org API key:

{
  "type": "work_queue_stats",
  "depth": 0,            // items waiting to be claimed -> scale or alert on backlog
  "pending": 0,          // items a worker is currently processing
  "oldest_queued_at": null,
  "workers_polling": 0   // workers seen in the last 30s -> liveness alerting
}

Alert when workers_polling drops to 0 (no worker is connected, so queued sessions will not be claimed) and when depth climbs (you're under-provisioned). Use work.stop for a graceful shutdown of a specific session: it finishes the in-flight tool call, posts a final status, and releases the session.
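Those alerting rules can be written as a small pure function over the stats payload. The field names below match the work.stats response shown earlier; the thresholds, the `stale_item` rule, and the assumption that `oldest_queued_at` is an ISO 8601 timestamp are illustrative choices rather than documented behavior.

```typescript
// Field names match the work.stats response shown earlier in this section.
type WorkQueueStats = {
  type: "work_queue_stats";
  depth: number;
  pending: number;
  oldest_queued_at: string | null; // assumed to be an ISO 8601 timestamp
  workers_polling: number;
};

type Alert = "no_workers" | "backlog" | "stale_item";

// Evaluates one stats snapshot against operator-chosen thresholds.
function evaluateStats(
  stats: WorkQueueStats,
  now: Date,
  thresholds: { maxDepth: number; maxWaitMs: number },
): Alert[] {
  const alerts: Alert[] = [];
  // No worker seen recently: queued sessions will wait until one connects.
  if (stats.workers_polling === 0) alerts.push("no_workers");
  // Backlog larger than the fleet is expected to absorb.
  if (stats.depth > thresholds.maxDepth) alerts.push("backlog");
  // Oldest item has waited too long, even if the queue is short.
  if (stats.oldest_queued_at !== null) {
    const waitedMs = now.getTime() - Date.parse(stats.oldest_queued_at);
    if (waitedMs > thresholds.maxWaitMs) alerts.push("stale_item");
  }
  return alerts;
}
```

Keeping the evaluation separate from the HTTP call makes it easy to unit-test the thresholds and to reuse the same rules in a dashboard and a pager.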

Two limits to plan around

Files aren't mounted. Anthropic doesn't stage files or GitHub repos into self-hosted sandboxes. Pass a reference (an S3 path, a commit SHA) in the session metadata, and have your worker read that from the claimed item and stage the files before tool execution begins.
Memory isn't supported yet. The Managed Agents memory stores feature doesn't work with self-hosted sandboxes at launch. If your agent depends on persistent memory, stay on cloud sandboxes for now.
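For the file-staging limit, the worker needs to turn whatever reference you put in the session metadata into something it can fetch. The sketch below assumes metadata is a flat string map and uses two made-up keys, `source_s3_path` and `source_commit`; neither key is defined by the API, and `parseStagingRef` is a hypothetical helper.

```typescript
type StagingRef =
  | { kind: "s3"; bucket: string; key: string }
  | { kind: "git"; commit: string }
  | { kind: "none" };

// Parses a staging reference out of session metadata (hypothetical key names).
function parseStagingRef(metadata: Record<string, string | undefined>): StagingRef {
  const s3Path = metadata["source_s3_path"];
  if (s3Path !== undefined) {
    if (s3Path.indexOf("s3://") !== 0) throw new Error(`not an s3:// path: ${s3Path}`);
    const rest = s3Path.slice("s3://".length);
    const slash = rest.indexOf("/");
    // Require both a bucket and a non-empty object key.
    if (slash <= 0 || slash === rest.length - 1) throw new Error(`incomplete S3 path: ${s3Path}`);
    return { kind: "s3", bucket: rest.slice(0, slash), key: rest.slice(slash + 1) };
  }

  const commit = metadata["source_commit"];
  if (commit !== undefined) {
    // Accept only a full 40-character SHA-1 so the checkout is unambiguous.
    if (!/^[0-9a-f]{40}$/.test(commit)) throw new Error(`not a full commit SHA: ${commit}`);
    return { kind: "git", commit };
  }

  return { kind: "none" };
}
```

The worker would call this on the claimed item's metadata, then download or check out the files into the working directory before the first tool call runs.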

When to reach for this

Default to cloud sandboxes, which are less to operate. Move to self-hosted when at least one of these is true: the agent must touch data that can't leave your network, it needs to reach internal non-public services, or your compliance regime (ZDR, HIPAA) requires execution to stay inside your boundary. Those are the cases worth running a worker fleet for; nothing else is.

Claude Managed Agents — the foundation: what the hosted agent loop is
Managed Agents memory — the persistence layer (cloud sandboxes only, for now)
MCP for production agents — connecting agents to real systems safely

Source: Self-hosted sandboxes and the Claude Platform release notes, May–June 2026.
