AI Codex

Building a Claude-powered client deliverable

In brief

Some agencies are not just using Claude internally — they are building Claude into the work itself. What that looks like, what clients actually want, and how to avoid the obvious failure modes.



There is a category of agency work that used to be a novelty and is now a real line item: building Claude-powered tools for clients. A content brief generator. A competitive analysis tool. An onboarding assistant trained on client documentation. These are not advanced engineering projects — they are Claude with a structured interface and well-crafted system prompts. Agencies with even one technical person can build and deploy them.

What clients actually want

Most clients do not want "an AI tool." They want a specific problem solved. The agencies that win this work have learned to translate vague AI interest into concrete use cases.

The pattern that works: Find a task your client's team does repeatedly that requires judgment but follows a predictable structure. "Reviewing incoming customer support tickets and drafting first responses." "Generating weekly social posts from a brief." "Turning a sales call transcript into a CRM summary." These are small enough to ship in weeks, concrete enough to evaluate, and high enough volume that the ROI is visible.

The pattern that fails: "An AI strategy assistant for the executive team." Too broad, too vague, too hard to evaluate. It gets built, used twice, and abandoned.

The architecture of a simple Claude-powered deliverable

For most agency-built tools, the architecture is the same: a form or interface that collects input → a system prompt that defines behavior → Claude's API → an output formatted for what the client actually needs to do with it.

You do not need a custom frontend for this. For internal tools, a simple Next.js page, a Retool dashboard, or even a structured Notion database with a Zapier integration can collect the input and return Claude's output. The complexity is in the prompt engineering, not the frontend.
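The middle two steps of that pipeline fit in a few lines. Here is a minimal sketch that calls the Messages API over plain HTTP using only the standard library (the official `anthropic` SDK wraps the same request). The model name and `max_tokens` value are placeholders; use whatever your tool needs.

```python
import json
import urllib.request

API_URL = "https://api.anthropic.com/v1/messages"

def build_request(system_prompt: str, user_input: str,
                  model: str = "claude-sonnet-4-5") -> dict:
    """Assemble the Messages API payload: the form input becomes the
    user message, and the tool's behavior lives in the system prompt."""
    return {
        "model": model,
        "max_tokens": 1024,
        "system": system_prompt,
        "messages": [{"role": "user", "content": user_input}],
    }

def call_claude(payload: dict, api_key: str) -> str:
    """POST the payload to the Messages API and return the text output."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Response content is a list of blocks; the first is the text output.
    return body["content"][0]["text"]
```

Everything client-specific lives in `system_prompt`; the plumbing above never changes between deliverables.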

The system prompt is the product. For a client deliverable, this means:

  1. A clear persona and role that reflects the client's brand and the task's purpose
  2. Output format instructions that match how the output will be used (markdown, JSON, plain text, numbered list)
  3. Client context — the same material as the client context document, but baked into the system prompt permanently
  4. Quality guardrails — what it should refuse, what it should flag for human review, what it should never do

A content brief generator for a fashion brand looks different from one for a B2B software company. The tool is the same; the system prompt is the product differentiation.
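As a sketch of that differentiation, the four components can be assembled from a small per-client config. The client name, voice, headings, and context below are invented for illustration:

```python
def build_system_prompt(client: dict) -> str:
    """Assemble the four components into one system prompt.
    `client` holds per-client config; the rest is the shared template."""
    sections = [
        # 1. Persona and role, reflecting the client's brand
        f"You are a content brief generator for {client['name']}, "
        f"a {client['industry']} brand. {client['voice']}",
        # 2. Output format matching how the output will be used
        "Return your output as markdown with the headings: "
        "Objective, Audience, Key Messages, Deliverables.",
        # 3. Client context, baked in permanently
        "Client context:\n" + client["context"],
        # 4. Quality guardrails
        "Never invent statistics or competitor claims. "
        "Flag any legal or medical content for human review.",
    ]
    return "\n\n".join(sections)

# Hypothetical fashion-brand config; a B2B software client would swap
# only this dict, not the template.
fashion = {
    "name": "Maison Vela",
    "industry": "fashion",
    "voice": "Write with an editorial, image-led sensibility.",
    "context": "Core lines: womenswear, accessories. Season: SS26.",
}
prompt = build_system_prompt(fashion)
```

Swapping the config dict is the whole customization step, which is what makes the same tool sellable to very different clients.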

Managing the approval and review layer

Claude-powered deliverables for agencies usually need a review step — output goes to a human before it reaches the end audience or gets used in a live campaign. Designing for review is as important as designing for generation.

Make review easy, not optional. If the review step is friction, it gets skipped. The output format should make review natural: side-by-side comparison with the source material, highlighted fields that require human judgment, a simple approve/edit/reject interface.

Log what gets rejected and why. This is your improvement loop. When a client rejects an output, you need to understand whether it is a prompt problem (Claude produced the wrong kind of output) or an expectation problem (the client's standards shifted). Build a lightweight feedback mechanism — even a simple comments field — that captures this.
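A feedback mechanism like that can be very small. Here is a sketch that appends each review decision to a JSONL log, with a category field separating prompt problems from expectation problems; the field names are illustrative:

```python
import json
from datetime import datetime, timezone

def log_review(path: str, output_id: str, decision: str,
               reason: str = "", category: str = "") -> dict:
    """Append one review decision to a JSONL log file."""
    assert decision in {"approve", "edit", "reject"}
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "output_id": output_id,
        "decision": decision,
        "reason": reason,       # free-text comments field
        # "prompt_problem" (Claude produced the wrong kind of output)
        # or "expectation_problem" (the client's standards shifted)
        "category": category,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Reading the log back and counting rejections by category tells you whether to tune the system prompt or renegotiate expectations.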

Set version expectations early. Claude-powered tools rarely produce perfect output in v1. Set the expectation with clients that the first month is a calibration period, not a launch. Use the rejection data to improve the system prompt. Most tools get significantly better between month one and month three.

Pricing this work

The question agencies struggle with: how do you price a tool you built in a week using Claude's API at $0.003 per call?

You are not pricing the API cost or the build time. You are pricing:

  • The expertise to identify the right use case
  • The prompt engineering and iteration to make the output useful
  • The ongoing maintenance as the client's needs evolve
  • The liability for output quality

Agencies that position these tools well charge a project fee for the build and a monthly retainer for maintenance and improvements. The retainer is the part of the model that lasts: the tool is only as good as its prompts, and the prompts always need tuning.
