AI Codex
AI Agent

Also: autonomous agent

An AI system set up to take a sequence of actions to complete a goal, rather than answer a single question. Beyond responding to your message, an agent can search the web, read documents, write code, send requests to other systems, and keep working through multiple steps until it finishes the task. Claude can act as an agent when given the right tools. The key difference from a regular chatbot: agents do, not just say.

In practice

Instead of asking Claude "what should I do next?" and acting on its answer yourself, an AI agent does the acting. You give it a goal — "monitor this inbox and draft replies to anything tagged urgent" — and it runs, takes actions, checks results, and keeps going until the job is done.
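
The loop described above (pick an action, run it, check the result, repeat until done) can be sketched in a few lines. This is a minimal illustration, not how Claude or any Anthropic product implements agents: `choose_next_action` is a hypothetical stand-in for the model's decision, and `fetch_urgent`, `draft_reply`, and the message format are made-up tools and data for the inbox example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Action = Tuple[str, Dict[str, Any]]  # (tool name, tool arguments)


def choose_next_action(state: Dict[str, Any]) -> Optional[Action]:
    """Hypothetical stand-in for the model: pick the next tool call, or None when done."""
    if state["unread"] is None:
        return ("fetch_urgent", {})  # nothing inspected yet, so look at the inbox
    if state["unread"]:
        return ("draft_reply", {"message": state["unread"][0]})
    return None  # no urgent messages left: the goal is met


def fetch_urgent(state: Dict[str, Any]) -> str:
    """Hypothetical tool: collect messages tagged urgent."""
    state["unread"] = [m for m in state["inbox"] if "urgent" in m["tags"]]
    return f"found {len(state['unread'])} urgent message(s)"


def draft_reply(state: Dict[str, Any], message: Dict[str, Any]) -> str:
    """Hypothetical tool: draft a reply and mark the message as handled."""
    state["unread"].remove(message)
    state["drafts"].append(f"Re: {message['subject']}")
    return "drafted"


TOOLS: Dict[str, Callable[..., str]] = {
    "fetch_urgent": fetch_urgent,
    "draft_reply": draft_reply,
}


def run_agent(inbox: List[Dict[str, Any]], max_steps: int = 20) -> Dict[str, Any]:
    """Run the act-and-check loop until the goal is met or the step budget runs out."""
    state: Dict[str, Any] = {"inbox": inbox, "unread": None, "drafts": [], "log": []}
    for _ in range(max_steps):
        action = choose_next_action(state)
        if action is None:
            break
        name, args = action
        result = TOOLS[name](state, **args)  # take the action
        state["log"].append((name, result))  # record the result for later steps
    return state
```

Calling `run_agent` with a list of messages such as `{"subject": "Server down", "tags": ["urgent"]}` returns the final state, including the drafts and a log of every action taken. The `max_steps` cap is a design choice worth keeping in any real loop: it stops the agent from running forever if it never decides the job is done.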

Where AI Agent shows up

28 articles

Most developers focus on the model. The engineers building production AI applications focus on everything around it. Here is what the agent harness is, why it determines whether your app actually works, and where to start building it intentionally.

Implementation guide·The agent harness: why your infrastructure matters more than your model·9 min

A new Claude API feature lets Sonnet or Haiku call Opus mid-task when they need help. You pay Opus rates only for those calls — everything else runs at Sonnet or Haiku cost. Here's what it does and when to use it.

Implementation guide·The advisor tool: Opus-level reasoning at Sonnet prices·7 min

Anthropic now runs the full agent loop for you — sandboxed execution, built-in tools, and event streaming included. Here's what you get and when it makes sense over building the loop yourself.

Implementation guide·Claude Managed Agents: a hosted agent loop without the infrastructure·5 min

System prompts, ticket workflows, escalation patterns, and QBR prep — the operational guide for deploying Claude across a customer success team.

Implementation guide·The Claude playbook for CS teams·10 min

Claude Code's desktop app was rebuilt for running multiple coding tasks at once. A new sidebar manages sessions across repos, an integrated terminal and diff viewer replace external tools, and side chat lets you branch conversations without interrupting ongoing work.

Implementation guide·Parallel agents in Claude Code: the desktop redesign·6 min

Routines let you configure a Claude Code task once — a nightly bug triage, a PR review on every push, an alert fix triggered by your monitoring system — and have it run in the cloud on its own schedule. Here's how they work and what they're useful for.

Implementation guide·Claude Code Routines: automations that run without you·6 min

Agent Teams let you run multiple Claude Code instances with distinct roles — a frontend dev, a backend dev, a QA reviewer — all coordinating in parallel. There's a real setup cost and most tasks don't need it. Here's how to tell when it's worth it.

Implementation guide·Agent Teams: When Parallel Agents Actually Help (And When They Don't)·7 min

Aaron Levie said career counselors should be figuring out how to help students get these jobs. The path exists — 800% hiring growth, $180K–$700K+ comp, every major AI company hiring. It just hasn't been written down anywhere useful. Until now.

Implementation guide·How to become a Forward Deployed Engineer — the path nobody has written down yet·10 min

A GitHub full of side projects tells an FDE hiring manager that you can code. What they actually want to see is evidence that you can build in the real world — against legacy systems, ambiguous requirements, and non-technical stakeholders. These five projects show exactly that.

Implementation guide·The 5 portfolio projects that actually signal FDE readiness·9 min

Most people handed AI responsibility try to do everything at once and ship nothing reliable, or wait for a perfect plan and never start. The 90-day path is simpler: one team, one workflow, one agent that actually works. Then you expand.

Implementation guide·Your first 90 days as an Agent Operator — what to build, in what order·10 min

Most agents in production have never been formally tested. The person who set them up tried a few examples and it seemed fine. That's how you end up with a contract review agent that hallucinates clause details. Evaluation doesn't require code — it requires a spreadsheet and 30 minutes a week.

Implementation guide·How to evaluate your agents — without being a developer·9 min

Most Agent Operators think connecting their internal systems to Claude requires an engineer. For the majority of use cases, it doesn't. Four levels of integration exist — and Level 2 (native connectors for Google/Microsoft) or Level 3 (Zapier) solve 80% of what you need.

Implementation guide·Wiring your internal systems to Claude — what's actually possible without an engineer·9 min

Your Claude bill went from $200 to $2,000 and you can't explain why. The four cost drivers — bloated system prompts, unnecessary context loading, high failure rates, and no usage monitoring — each have fixes. Cost per task is the metric that matters, not total spend.

Implementation guide·Keeping your agent costs under control as you scale·7 min

Aaron Levie said career counselors should quickly figure out how to get students into forward deployed engineer roles. The role exists, it's exploding, and it pays $150K–$700K+ total comp. The career infrastructure just hasn't caught up yet. Here's what to tell students.

Role-Specific·The Forward Deployed Engineer: a guide for career advisors and CS departments·8 min

Your CEO doesn't want a technology update. They want to know if the investment is working and whether to do more. Three types of evidence actually work: time saved, error rate improvement, and throughput. Here's how to measure them and how to present them.

Role-Specific·Showing AI ROI to your CEO — what to measure and how to report it·7 min

There's a clean line between a model that responds to questions and one that takes actions in the world. Understanding that line is the most important thing to know about building with AI right now.

Core Definition·When Claude stops answering and starts doing·5 min

Anthropic and OpenAI both launched billion-dollar deployment companies in the same week — and both are built around the same type of engineer: someone who moves into a company, builds production AI systems against their actual messy environment, and leaves something that lasts. That engineer has a name now.

Core Definition·What is a Forward Deployed Engineer — and why every AI company is hiring for them now·11 min

Aaron Levie says 500,000 to 1 million companies will hire for this role. Most won't call it 'Agent Operator.' Some will call it an AI program manager, an automation lead, an AI systems admin. Whatever the title, the job is the same: you are responsible for making AI agents actually work inside your company.

Core Definition·What is an Agent Operator — the role Aaron Levie says 500,000 companies are about to hire for·9 min

Building an AI agent that demos well is easy. Building one that works reliably in production is hard. The gap between the two is almost always one of the same five problems.

Failure Modes·Why most AI agent pilots fail in the first month·6 min

Agents fail differently than APIs. When a sub-agent times out halfway through a pipeline, you don't just get an error; you get partial state. Here are the patterns that help multi-agent systems actually recover.

Failure Modes·Multi-agent failure handling: timeouts, partial outputs, and recovery patterns·8 min

When your agent starts producing bad outputs, the instinct is to assume the model got worse. It usually didn't. 90% of agent failures are context failures or prompt failures — both of which you can diagnose and fix without any technical help.

Failure Modes·When your agent breaks — how to diagnose it and fix it·8 min

Building the agent is the easy part. Getting people to use it is where most Agent Operators fail. Three types of resistance — trust, speed, job fear — each with a different fix. And one thing that kills adoption faster than anything else.

Failure Modes·Getting your team to actually use the agent — the change management problem nobody warned you about·8 min

An agent isn't just a chatbot that can click buttons. It's a fundamentally different relationship between a human and an AI. Here's what that looks like when it's working.

Field Note·When Claude starts doing the work: what AI agents look like in practice·5 min

A CS manager who uses Claude well can do meaningful work on renewals, QBRs, and escalations in the gaps between other work. Here's what that workflow actually looks like across a full day.

Field Note·What using Claude actually looks like for a CS manager·8 min

Where Claude genuinely saves hours for marketing managers, where it falls flat, and what the actual workflow looks like.

Field Note·What using Claude actually looks like for a marketing manager·8 min

By default, a Managed Agents session starts fresh and forgets everything when it ends. Memory stores change that — they're workspace-scoped document collections the agent reads and writes across sessions. The feature entered public beta on April 23, 2026.

Update·Persistent Memory for Claude Managed Agents·8 min

On May 5, 2026 Anthropic shipped 10 ready-made agent templates for financial services — pitch builder, KYC screener, month-end closer, and seven more. Each one bundles skills, connectors, and subagents in a way that's reusable beyond finance. Here's what shipped and what the pattern teaches anyone designing agents.

Update·Claude finance agents: 10 reference templates and the pattern behind them·8 min

On May 6, 2026 Anthropic shipped three new capabilities for Claude Managed Agents: multiagent sessions (public beta), outcomes (public beta), and dreaming (research preview). Multiagent lets a lead agent delegate to specialist subagents on a shared filesystem; outcomes turns a rubric into a self-correction loop; dreaming lets an agent review its past sessions overnight and curate its memory.

Update·Claude Managed Agents update: multiagent sessions, outcomes, and dreaming·9 min