Why starting a new chat almost always saves money
In brief
Ten turns in one conversation costs roughly 5.5x as much as ten separate conversations. Here's why that happens, when it matters, and the one habit that changes how you use Claude.
Contents
Most people open one Claude conversation and keep going. It feels efficient — Claude has context, you don't have to re-explain things, the conversation is building on itself. But this is one of the most expensive habits in Claude usage, and it's almost never actually necessary.
Here's why.
How Claude charges for each message
Every time you send a message, Claude reads the entire conversation from the beginning. Not just your new message — everything. Every question, every answer, every code block, every clarification since the start of the chat.
This matters because Claude charges per token — roughly per word or word-fragment — for everything it processes. The longer a conversation gets, the more tokens Claude reads to respond to each new message, even if earlier messages have nothing to do with your current question.
In practice: a conversation that's 5 messages long means message 6 also processes messages 1–5. A 20-message conversation means every new message also processes messages 1–19. The cost compounds with every turn.
The 5.5x problem
Ten turns kept in one long conversation costs roughly 5.5x as much as ten separate conversations asking the same questions.
The first conversation: message 1 processes nothing, message 2 processes message 1, message 3 processes messages 1–2, and so on. By message 10, you're processing the equivalent of the entire conversation just for that one response.
Ten separate conversations: each one starts fresh. Message 1 in each chat processes only your question. There's no accumulation, no overhead.
If you're using Claude for distinct tasks across a day — a draft email, a data question, a summary of a document, a reply to a customer — there's almost no reason to keep them in the same conversation. Each one starts fresh, each one is cheaper, and each one actually gets better answers because the context isn't cluttered with unrelated earlier messages.
When you actually want a long conversation
There are real cases where keeping a conversation going makes sense:
Multi-step work on the same thing. If you're iterating on a document, a design, or a piece of code and each message builds on the previous one, staying in the conversation is correct. The context is load-bearing.
Ongoing projects. If you're doing a research deep dive and each message refines the previous analysis, the accumulating context is the point.
When you're debugging. Showing Claude what you tried and why it didn't work is valuable context for the next attempt.
The test: does this new message actually need the previous messages to make sense? If yes, stay in the conversation. If no, start a new one.
The Projects escape hatch
The reason most people keep long conversations going is that they need Claude to remember something — a project brief, their writing style, their codebase context, their preferences.
That's what Projects are for. Anything you reference more than twice belongs in Project Knowledge, not in a conversation. Project Knowledge is cached — Claude reads it efficiently without adding conversation overhead. It doesn't count against your message costs the way conversation history does.
If you find yourself pasting the same context into conversations repeatedly, that's a sign it belongs in a Project instead. Set it up once; every new conversation in that Project has it automatically. See how Projects work for setup.
A note on the Claude API: The same economics apply if you're using Claude through the API. Every API call sends the full message history by default, so context accumulates the same way. If you're building an app, managing conversation history carefully — trimming old messages, using system prompts instead of early conversation turns for stable context — has the same cost impact as starting fresh chats in the consumer product.
The PDF trap
An easy way to accidentally balloon your costs: uploading a 50-page PDF and asking questions across multiple messages in the same conversation. The entire PDF content sits in the conversation history and gets re-processed with every message, even when you're asking about page 3 and Claude has already answered everything on pages 1–40.
If you need to ask several questions about a long document, extract the relevant section and paste it as text — then ask one question per conversation rather than ten questions in one. Or put the document in a Project where it's cached properly.
The model habit
One other lever that compounds with session economics: model choice. Claude Haiku handles most daily tasks — summarizing, drafting, answering straightforward questions — at a fraction of Sonnet's cost. Sonnet handles complex reasoning, nuanced writing, and tasks that need depth.
The default habit: start with Haiku. If the answer feels shallow, switch to Sonnet for that specific task. Don't use Sonnet for a task that Haiku would have handled fine.
Combined with starting new conversations for distinct tasks, these two habits account for the majority of the gap between moderate and efficient Claude usage.
The short version
- Start a new chat for each distinct task
- Put context you use repeatedly into Projects, not conversations
- Extract relevant text from documents rather than uploading entire files
- Default to Haiku, escalate to Sonnet when the task needs it
None of these require changing how you work — just changing which button you click at the start of each task.
Related: How Projects work · Minimising token usage · Setting up Claude for your team