AI Codex

Articles

Practical AI for operators.

Decision guides, failure patterns, and what implementing AI actually looks like. Written for people doing the work, not people writing about it.

Start here

If you're new to implementing AI at your company, read these first.

Tools · Decision guide · 5 min

Five habits that separate operators who get results from those who don't

Most of the gap between teams getting real value from Claude and teams that are still in pilot mode comes down to a handful of specific practices. Here's what the effective ones do differently.

Evaluation · In practice · 6 min

Running your first AI pilot: a 30-day plan

Most AI pilots either drag on for six months without a decision, or get declared a success after two weeks based on nothing. Here's a structure that produces a real answer in 30 days.

Making the call

Decision guides for the questions that actually matter.

Business · Decision guide · 5 min

Opus, Sonnet, or Haiku: which Claude model should your team use?

Claude has three model tiers. Here is which one to use for what — and why defaulting to the most powerful one is usually a mistake.

Agents · Decision guide · 6 min

Managed Agents: what they are and what they mean for your organisation

Anthropic just launched Managed Agents. Here is what they do, who they are for, and how to think about whether your team should use them.

Tools · Decision guide · 6 min

Writing a system prompt that actually works

The system prompt is the highest-leverage thing you control when deploying Claude. Most are either too vague or too long. Here's what good looks like.

Tools · Decision guide · 4 min

Prompt caching: why it matters when you're building with Claude at scale

If your application sends the same long system prompt on every request, you're paying to re-process it every time. Prompt caching stops that.

Foundation · Decision guide · 4 min

How to think about Claude's context window (and when it actually matters)

200,000 tokens sounds enormous. In practice, how you use that space changes everything about the quality of your outputs.

Tools · Decision guide · 4 min

Skills and Connectors: how to make Claude actually useful at work

By default, Claude only knows what you tell it in the conversation. Skills and Connectors change that — here's what they do and which ones are worth turning on.

Tools · Decision guide · 5 min

What to automate first with AI

Every company has ten things they could automate with AI. About two of them are actually good starting points. Here's the framework for finding them.

Tools · Decision guide · 5 min

What MCP actually means for your business (it's not just for developers)

The Model Context Protocol sounds technical. The practical implication is simple: AI tools can now connect to your actual systems in a standardised, safe way. Here's what that unlocks.

Tools · Decision guide · 4 min

When to use extended thinking — and when it's a waste

Extended thinking makes Claude noticeably better on hard problems. But most tasks don't need it, and using it everywhere will slow you down and cost more.

Tools · Decision guide · 5 min

How to set up Claude Projects for your team (and what most people miss)

Projects are the most underused feature in Claude. Here's how to configure them so your whole team gets consistent outputs — not whatever each person happens to type.

Evaluation · Decision guide · 6 min

How to know if your Claude integration is actually working

Most teams go live on gut feel and find out six weeks later that Claude has been quietly giving wrong answers. Here's how to know before that happens — without being an engineer.

Foundation · Decision guide · 5 min

How to work with Claude when accuracy matters

Hallucination isn't a reason to avoid Claude for high-stakes work. It's a constraint to design around. Teams that get this right build AI into their most important workflows; teams that don't keep AI confined to the low-stakes ones.

Retrieval · Decision guide · 5 min

Do you actually need RAG? The decision most operators get wrong

Most teams jump to RAG because it sounds like the right answer. Half of them don't need it. Here's how to know which situation you're in — before you build anything.

What goes wrong

The failure patterns that catch most teams off guard.

Tools · What goes wrong · 5 min

Why RAG implementations fail (and how to avoid the most common mistakes)

RAG is one of the most powerful things you can build with Claude. It's also where a lot of teams get stuck. Here are the failure patterns worth knowing before you start.

Foundation · What goes wrong · 5 min

The hallucination patterns that catch operators off guard

Everyone knows AI can make things up. What surprises people is which specific situations trigger it — and how confident Claude sounds when it does.

Tools · What goes wrong · 5 min

Why your first AI pilot probably failed

Most AI pilots don't fail because the AI wasn't good enough. They fail for three very predictable reasons — none of which are technical.

Agents · What goes wrong · 6 min

Why most AI agent pilots fail in the first month

Building an AI agent that demos well is easy. Building one that works reliably in production is hard. The gap between the two is almost always one of the same five problems.

Prompt · What goes wrong · 5 min

The system prompt mistakes that make Claude worse, not better

More instructions don't mean better results. Most system prompts fail in one of five predictable ways — and fixing them is usually the highest-leverage thing you can do to improve your Claude integration.

In practice

What it actually looks like when teams implement AI.

Business · In practice · 5 min

What AI actually looks like for an HR team

HR involves a lot of writing, reviewing, and communicating. Here's where Claude saves real time — and where to be careful.

Business · In practice · 6 min

What AI actually looks like for an operations team

Ops has more to gain from AI than almost any other function — but the use cases look different to what most people expect.

Business · In practice · 6 min

What AI actually looks like for a sales team

Not "AI will write your emails." What sales teams are genuinely using Claude for, what works, and the one thing most reps get wrong.

Business · In practice · 5 min

How to actually evaluate whether your AI rollout is working

Most AI rollout evaluations are either too vague ("the team likes it") or too technical (automated test suites that miss what users actually care about). Here's what works.

Agents · In practice · 5 min

When Claude starts doing the work: what AI agents look like in practice

An agent isn't just a chatbot that can click buttons. It's a fundamentally different relationship between a human and an AI. Here's what that looks like when it's working.

Business · In practice · 5 min

How marketing teams are actually using Claude

Content is the obvious use case. But the marketing teams getting the most value from AI have figured out something different.

Business · In practice · 6 min

What AI actually looks like in a customer success team

Not a demo, not a prediction. What CS teams are actually using AI for right now — what's working, what isn't, and what nobody tells you before you start.

Tools · In practice · 7 min

Using Claude for customer support: what actually works

Customer support is the most common first AI use case for a reason — and the place where teams most often get burned. Here's what a working implementation looks like, and what the common shortcuts miss.

The concepts

Clear explanations of the ideas behind the tools.

Tools · How it works · 5 min

When to use Deep Research and how to get the most from it

Deep Research is not just "web search but longer." It is a different tool for a different kind of question. Here is when it is worth the time and tokens.

Tools · How it works · 5 min

Claude Memory: what it remembers, how to use it, and how to manage it

Claude now remembers things about you across conversations. Here is how it works, what to tell it to remember, and how to keep it useful.

Tools · How it works · 6 min

Cowork and Dispatch: Claude working on your computer

Claude can now control your desktop and complete tasks while you do other things. Here is how it works, what it is good at, and what to be careful about.

Tools · How it works · 5 min

Connectors: which to enable, which to disable, and why it matters

Connectors give Claude access to your tools. But having all of them on all the time costs tokens and introduces noise. Here is how to manage them.

Tools · How it works · 5 min

Claude Skills: what they are, which to enable, and when to use them

Skills give Claude superpowers — web search, code execution, file creation. Here is which ones matter, how to set them up, and when to turn them off.

Business · How it works · 6 min

How to minimise your Claude token usage without sacrificing quality

Tokens are what you pay for. Here are the practical things you can do to use fewer of them — from how you prompt to which model you choose.

Business · How it works · 5 min

Step-by-step: researching a prospect with Claude before a call

A concrete workflow for turning 30 minutes of pre-call research into 5 minutes — without losing the signal that makes a call go well.

Business · How it works · 7 min

How to set up Claude for your whole company

You've been asked to "get Claude set up for the team." Here's exactly what that means, what decisions you need to make, and what to do in what order.

Tools · How it works · 4 min

How tool use works: what happens when Claude calls a function

Tool use is the mechanism that lets Claude do things, not just say things. Here's exactly what happens when Claude uses a tool.

Foundation · Concept · 3 min

Adaptive thinking: how Claude decides how hard to think

Claude doesn't apply the same effort to every question. Here's what adaptive thinking is, how it works, and why it matters for the outputs you get.

Infrastructure · Concept · 3 min

Why Claude starts talking before it's finished thinking

Streaming sends Claude's response token by token as it's generated, instead of waiting until the full response is ready. The difference in perceived speed is significant — and the implementation is simpler than you'd expect.

Evaluation · Concept · 5 min

The problem of making AI do what you actually mean

Alignment is the core challenge of AI development: building systems that reliably do what humans intend. It's harder than it sounds, and understanding why helps you build better applications today.

Foundation · Concept · 4 min

The unit everything in AI is priced and measured in

Tokens are how language models read and write text — and how every AI API charges you. Understanding them turns abstract pricing into something you can predict and control.

Foundation · Concept · 5 min

When a well-crafted prompt isn't enough

Fine-tuning is how you train a model on your specific data to change its behaviour at a deeper level than prompting can reach. It's powerful — and often unnecessary. Knowing which situation you're in saves a lot of time.

Foundation · Concept · 4 min

The dial between predictable and creative

Temperature controls how much Claude surprises you. Turn it down for consistent, focused answers. Turn it up for more varied, exploratory ones. Knowing when to do each is a real skill.

Foundation · Concept · 5 min

The engine under everything

A large language model is what Claude is at its core — and understanding how it works changes how you think about everything else in AI.

Evaluation · Concept · 5 min

How to know if your Claude integration is actually working

Evals are the testing framework for AI — and they work differently from software tests. You're not checking for correct answers. You're measuring behaviour across a range of realistic situations.

Infrastructure · How it works · 4 min

Pay for your context once, not every time

Prompt caching is Claude's way of remembering the expensive part of a conversation so you don't have to re-send — and re-pay for — the same context on every request.

Foundation · Concept · 5 min

Why AI gets confident things wrong — and how to design around it

Hallucination isn't a bug that gets patched. It's a structural feature of how language models work. Understanding why it happens is the first step to building applications that aren't derailed by it.

Agents · Concept · 5 min

How Claude reaches beyond the conversation

Tool use is the mechanism that turns Claude from a text generator into something that can actually do things — search the web, run code, query your database, send messages.

Agents · Concept · 5 min

When Claude stops answering and starts doing

There's a clean line between a model that responds to questions and one that takes actions in the world. Understanding that line is the most important thing to know about building with AI right now.

Foundation · Concept · 5 min

The whiteboard every AI conversation shares

Context window is the single number that shapes everything about how Claude thinks with you — and most people are using only a fraction of it.

Prompt · Concept · 4 min

How to brief Claude before the conversation starts

The system prompt is where you stop asking Claude to be general-purpose and start making it yours. Most operators underuse it.

Retrieval · Concept · 5 min

How to give Claude a memory it doesn't have by default

RAG is the most practical technique in AI engineering — and the most misnamed. It's not magic. It's just giving the model the right pages of the book before it answers.

Foundation · Concept · 5 min

Why Claude has values instead of just rules

Most AI safety is a list of don'ts. Constitutional AI is the method Anthropic used to teach Claude to reason about right and wrong — the same way you'd want a thoughtful colleague to.