AI Codex
Evaluation & Safety | Executives | CTOs | Operators

Explainability

Also: explainable AI, XAI

The ability to explain, in plain English, why an AI system produced a particular output. It matters most in regulated industries (banking, healthcare, legal) and anywhere decisions affect people's lives: if someone is denied a loan, they are often entitled to know why. Claude can often explain its reasoning in natural language, but the underlying neural network computation is not itself interpretable in the way a decision tree is. With Claude, explainability usually means prompting it to show its reasoning alongside its answer.
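As a rough illustration of "show its reasoning alongside its answer", the sketch below builds a prompt that asks for a decision plus a list of reasons, then parses the reply. The instruction wording, the JSON field names (`decision`, `reasons`), and the helper names are all assumptions made for this example, not a prescribed format. The call to the model itself is left out, so the reply is treated as a plain string.

```python
import json
from dataclasses import dataclass

# Hypothetical instruction text: the wording and JSON field names are
# assumptions for illustration, not a required or official format.
EXPLAIN_INSTRUCTIONS = (
    "Assess the application below. Reply with a single JSON object containing "
    '"decision" ("approve" or "deny") and "reasons" (a list of short, '
    "plain-English factors that led to the decision)."
)


@dataclass
class ExplainedAnswer:
    decision: str
    reasons: list[str]


def build_prompt(application_summary: str) -> str:
    """Combine the fixed instructions with the details of one case."""
    return f"{EXPLAIN_INSTRUCTIONS}\n\nApplication:\n{application_summary}"


def parse_reply(reply_text: str) -> ExplainedAnswer:
    """Parse the model's JSON reply and reject replies that carry no reasons."""
    data = json.loads(reply_text)
    decision = data["decision"]
    if decision not in ("approve", "deny"):
        raise ValueError(f"unexpected decision: {decision!r}")
    # Drop empty entries so a list of blank strings does not count as a reason.
    reasons = [str(r).strip() for r in data.get("reasons", []) if str(r).strip()]
    if not reasons:
        raise ValueError("reply contained a decision but no reasons")
    return ExplainedAnswer(decision=decision, reasons=reasons)
```

Rejecting a reply that has a decision but no reasons is a deliberate choice here: it keeps unexplained outputs from flowing downstream. The reasons are still the model's own account of its answer, not a readout of the underlying computation.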

In practice

Your Claude-powered loan tool denies an application. The applicant asks why. Explainability is whether you can give a real answer ("the model flagged a high debt-to-income ratio and a short employment history") rather than "the AI said no." In regulated industries, you are often legally required to explain automated decisions.
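The difference between a real answer and "the AI said no" can be made concrete with a small sketch. It assumes your tool has already recorded a decision and a list of reasons for each application. The function name and message wording are hypothetical, and any legally required notice would need review by your compliance team.

```python
def explain_decision(decision: str, reasons: list[str]) -> str:
    """Turn a recorded decision and its reasons into an applicant-facing message."""
    if decision not in ("approve", "deny"):
        raise ValueError(f"unexpected decision: {decision!r}")
    cleaned = [r.strip() for r in reasons if r.strip()]
    if not cleaned:
        # With no recorded reasons, the only possible answer is "the AI said no",
        # so fail loudly instead of sending an unexplained decision.
        raise ValueError("no reasons recorded; decision cannot be explained")
    outcome = "approved" if decision == "approve" else "denied"
    if len(cleaned) == 1:
        factors = cleaned[0]
    else:
        factors = ", ".join(cleaned[:-1]) + " and " + cleaned[-1]
    return f"Your application was {outcome}. Main factors: {factors}."


message = explain_decision(
    "deny", ["high debt-to-income ratio", "short employment history"]
)
print(message)
```

An application with no recorded reasons raises an error here, which is one way to surface the gap before the applicant asks.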

Related concepts