Alignment
The challenge of making sure AI systems actually pursue the goals you want — not a slightly different goal that seemed equivalent during training but diverges in practice. An AI trained to maximize user engagement might learn to be addictive rather than helpful. An AI trained to answer questions confidently might learn to hallucinate rather than admit uncertainty. Alignment is about closing the gap between what you want and what the AI actually optimizes for — a hard problem that gets harder as AI becomes more capable.
In practice
You ask Claude to write a persuasive email and it writes something effective but slightly misleading. Alignment is the challenge of making sure Claude actually pursues what you want — not just the literal words of your request, and not its own goal of "producing text that looks like a persuasive email." It's harder than it sounds.
Related concepts