RAG
Also: Retrieval-Augmented Generation, retrieval augmented generation
RAG (Retrieval-Augmented Generation) is an architecture in which relevant information is retrieved from an external source and passed to Claude before it generates a response. Instead of relying purely on training data, the system searches your documents, database, or knowledge base for the right context, and Claude uses that context to answer. The result: Claude can answer questions about your specific products, policies, or data, things it couldn't have known from training alone.
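The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration, not a production design: the `DOCUMENTS` list, the `retrieve` and `build_prompt` helpers, and the keyword-overlap scoring are all hypothetical stand-ins (real systems typically use embeddings and a vector store), and the final step of sending the prompt to Claude is left out.

```python
import re

# Hypothetical in-memory knowledge base; a real system would query a
# document store, database, or search index instead.
DOCUMENTS = [
    "Refunds are available within 30 days of purchase with a receipt.",
    "Standard shipping takes 5 to 7 business days.",
    "Support is open Monday through Friday, 9am to 5pm.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (a toy scorer)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents, top_k=1):
    """Place the retrieved passages ahead of the question in one prompt."""
    passages = retrieve(question, documents, top_k)
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # The resulting string is what would be sent to the model.
    print(build_prompt("How long does shipping take?", DOCUMENTS))
```

The point of the sketch is the ordering: retrieval happens first, and the model only sees the question together with the passages that were selected for it.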
Articles
How to give Claude a memory it doesn't have by default
RAG is the most practical technique in AI engineering — and the most misnamed. It's not magic. It's just giving the model the right pages of the book before it answers.
Do you actually need RAG? The decision most operators get wrong
Most teams jump to RAG because it sounds like the right answer. Half of them don't need it. Here's how to know which situation you're in, before you build anything.
Why RAG implementations fail (and how to avoid the most common mistakes)
RAG is one of the most powerful things you can build with Claude. It's also where a lot of teams get stuck. Here are the failure patterns worth knowing before you start.