AI Codex
Foundation Models & LLMsCore Definition

The dial between predictable and creative

Temperature controls how much Claude surprises you. Turn it down for consistent, focused answers. Turn it up for more varied, exploratory ones. Knowing when to do each is a real skill.

4 min read·Temperature

Temperature is one of the most misunderstood settings in AI. People assume higher is better for creative tasks and lower is better for factual ones. The reality is more nuanced — and getting it right changes the quality of your outputs significantly.

The basic idea

When Claude generates text, each next token is chosen based on a probability distribution. Some tokens are far more likely than others given the context. Temperature controls how much that distribution gets "spread out" or "sharpened."

Low temperature (closer to 0): The highest-probability tokens become even more dominant. Claude almost always picks the most likely next word. Responses are consistent, predictable, and focused.

High temperature (closer to 1 and beyond): The distribution flattens. Lower-probability tokens get more chances. Claude is more likely to take an unexpected path.

Think of it like this: low temperature is Claude playing it safe, choosing the most obvious next word at each step. High temperature is Claude willing to take a detour, pick a less expected word, and see where it leads.

What this looks like in practice

Ask Claude to summarize a document at temperature 0.1 and temperature 0.9. The summaries will be similar in content but noticeably different in character.

At 0.1: tight, structured, hitting the expected beats in a predictable order.

At 0.9: still accurate, but with different choices about what to emphasize, different sentence structures, maybe an unexpected observation or framing.

Neither is wrong. One is more useful for consistent outputs; the other for generative exploration.

When to go low

Use lower temperature when:

  • You need consistent outputs across many runs (summaries, classifications, extractions)
  • You're building a product where users expect predictable behavior
  • The task has a clear "right answer" and you want Claude to converge on it
  • You're running automated pipelines where variance is a problem

A temperature around 0.2–0.4 gives you focused, reliable responses without the rigidity of 0.

When to go higher

Use higher temperature when:

  • You're brainstorming and want diverse options, not just the obvious ones
  • You're generating creative content where variety is the point
  • You're exploring a problem space and want Claude to surface unexpected angles
  • You're running multiple passes and picking the best output

A temperature around 0.7–0.9 gives you genuine variety. Above 1.0, outputs can start to feel random or incoherent for most tasks.

The practical default

For most Claude applications, a temperature of around 0.5–0.7 is a reasonable starting point. It's creative enough to produce varied, natural-sounding text without being so variable that outputs become unpredictable.

The real skill is matching temperature to task. Extraction and classification: low. Summarization and Q&A: medium. Brainstorming and ideation: high. And if you're building a product: test different settings against your actual use cases, because the right answer varies more than you'd expect.