AI Codex
Tools & Ecosystem ClaudeDevelopersOperators

Prompt Caching

Also: context caching

Prompt caching saves a portion of your prompt — typically a large system prompt or document — so Claude doesn't need to re-process it on every request. The first call includes the full context; subsequent calls reference the cached version. The practical effect: faster responses and lower API costs when you're repeatedly sending the same long context, which is common in production applications.

In practice

Your app sends the same 2,000-word system prompt on every API call. Without caching, you pay to process those 2,000 words every single time. With prompt caching, Anthropic processes the system prompt once and keeps it in memory — subsequent calls that reuse it cost 90% less for that portion. It's one of the most impactful cost levers for production apps.

Related concepts

Where Prompt Caching shows up

3 articles