AI Codex
Evaluation & Safety ClaudeDevelopersOperatorsCTOs

Guardrails

Rules and filters built into or around an AI system to constrain what it can do — preventing harmful outputs, keeping it on-topic, enforcing compliance requirements, or protecting user privacy. Examples: blocking Claude from discussing competitor products, requiring a legal disclaimer on financial questions, filtering out personally identifiable information before logging. Guardrails are what you build when "Claude should use good judgment" isn't specific enough for a production system.

In practice

You're deploying Claude for customer service. You add guardrails in your system prompt: never discuss competitor pricing, always recommend speaking to a human for refunds over $500, never make promises about delivery dates. These are guardrails — the boundaries you define so Claude stays in its lane no matter what the customer asks.

Related concepts