
What to actually build with Claude as a first product

In brief

Not every idea that works with AI makes a good business. How to filter the options, spot the structural advantages, and choose the problem worth building for.


Most first AI products fail for the same reason: the founder built something Claude is good at, not something customers urgently need. The capability is not the business. The problem is the business.

Here is a framework for thinking about what to build — and a filter for ideas that sound good but are not.

The four quadrants

Draw a 2x2. X-axis: how painful is the problem (low pain → high pain). Y-axis: how well does Claude solve it (poorly → well).

You want the top-right: high pain, Claude solves it well. Everything else is a trap.

Top-left (high pain, Claude solves poorly): You spend all your time on prompt engineering and the output still is not good enough. Customer acquisition is hard because the product does not work reliably. Stay away until the models improve.

Bottom-right (low pain, Claude solves well): Claude is impressive but users do not actually need this. No urgency, no retention, no willingness to pay. The demo is more compelling than the product.

Bottom-left (low pain, Claude solves poorly): Do not build this.

Top-right (high pain, Claude solves well): This is your target. Someone is doing this manually right now and it is costing them time, money, or quality they cannot afford to lose.
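The four quadrants can be made concrete with a trivial scoring sketch. The scale and thresholds here are arbitrary illustrations, not part of the framework itself:

```python
def quadrant(pain: int, capability: int) -> str:
    """Classify an idea on the 2x2. Scores are 1-10; thresholds are arbitrary."""
    high_pain = pain >= 6
    solves_well = capability >= 6
    if high_pain and solves_well:
        return "build"       # top-right: the target
    if high_pain:
        return "wait"        # top-left: revisit when the models improve
    if solves_well:
        return "demo-ware"   # bottom-right: impressive, not needed
    return "skip"            # bottom-left: do not build

print(quadrant(pain=9, capability=8))  # → build
print(quadrant(pain=2, capability=9))  # → demo-ware
```

The point is not the scoring; it is that three of the four outcomes are "do not build this now."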

What Claude is actually good at

Be specific. "Claude can write" is not useful. "Claude can draft first-call summaries from transcript + CRM data faster than an SDR" is useful.

Claude is genuinely good at:

  • Synthesizing unstructured text (transcripts, emails, documents, notes)
  • Drafting high-variability content (responses, reports, personalized outreach)
  • Classification and routing at volume
  • Extracting structured data from unstructured input
  • Answering questions about a specific knowledge base
  • First-pass research and summarization

Claude is unreliable at:

  • Anything requiring precise numeric calculation
  • Real-time data (Claude has a knowledge cutoff)
  • Tasks requiring perfect recall of specific facts
  • Anything where being 95% right is worse than being 0% right (medical diagnosis, legal advice, financial calculation)

The "10x better or 10x cheaper" test

Your product needs to be either dramatically better than the current solution or dramatically cheaper. Usually not both.

For AI products, "10x cheaper" is often achievable: if a task costs $50 in human labor and you can do it for $0.50 in API costs, you have a structural advantage. "10x better" is harder — the question is whether better output actually changes the customer's outcome, or whether "good enough" was always fine.

Ask: what does the customer do differently because the output is better? If the answer is "not much, they just appreciate the quality," the quality advantage does not convert to willingness to pay.

Structural advantages worth looking for

Vertical specificity: A generic AI tool competes with Claude.ai directly. A tool that knows the specific vocabulary, templates, workflows, and context of one industry (property management, clinical trial coordination, franchise operations) is defensible. The moat is domain data and workflow integration, not the underlying model.

Workflow integration: Products that embed in where the work already happens (Gmail, Slack, Notion, Salesforce) have lower activation energy than products that require a behavior change. If you have to convince someone to open a new tab, you are fighting the status quo. If Claude appears where they already are, adoption is a technical problem, not a sales problem.

Volume and repetition: The ROI of AI compounds with volume. A task done 10 times per year does not justify a subscription. A task done 200 times per month, at 5 minutes each, at $50/hour labor cost — that is roughly $830/month in labor. A $200/month product paying for itself 4x over is an easy sell.
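That back-of-envelope math is worth checking explicitly. A minimal sketch, using the illustrative figures from the example rather than real data:

```python
def monthly_labor_cost(tasks_per_month: int, minutes_per_task: float, hourly_rate: float) -> float:
    """Monthly labor cost of a repetitive task."""
    hours = tasks_per_month * minutes_per_task / 60
    return hours * hourly_rate

# Illustrative figures: 200 tasks/month, 5 minutes each, $50/hour labor.
labor = monthly_labor_cost(tasks_per_month=200, minutes_per_task=5, hourly_rate=50)
payback = labor / 200  # versus a $200/month subscription

print(f"${labor:,.0f}/month in labor, pays for itself {payback:.1f}x over")
```

Run the same calculation with your customer's actual numbers before quoting an ROI — the multiple moves fast with task frequency and duration.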

Proprietary context: Products that get better with customer data (their tone, their terminology, their historical patterns) become stickier over time. A generic summary tool is replaceable. A tool trained on three years of a company's support tickets and outcomes is not.

The test before you build

Before writing any code:

  1. Find three people who have this problem badly enough to talk to you about it for 45 minutes
  2. Ask them to show you how they currently solve it — watch, do not describe
  3. Identify the specific moment where the current solution fails them
  4. Describe your solution in one sentence and ask if they would pay for it

If you cannot find three people willing to have the conversation, the pain is not real enough. If you find them but they would not pay, the pain is real but the solution does not fit. If you find them, they talk, and they ask how to sign up — you have something.

The AI part is not the hard part. Finding the right problem is.
