AI Codex

RLHF

Also: Reinforcement Learning from Human Feedback

A training technique in which human raters compare model outputs; their preferences are used to train a reward model, which then guides reinforcement-learning fine-tuning (commonly with PPO) to steer the model toward more helpful and less harmful responses.
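
A minimal sketch of the first stage, training a reward model on human preference pairs with the Bradley-Terry loss. The linear reward_model, the 768-dimensional response embeddings, and the preference_loss helper are illustrative assumptions, not part of any specific library or production RLHF pipeline.

```python
import torch
import torch.nn.functional as F

# Hypothetical reward model: maps a response embedding to a scalar score.
# In practice this is usually a full language model with a scalar head.
reward_model = torch.nn.Linear(768, 1)

def preference_loss(chosen_emb, rejected_emb):
    """Bradley-Terry preference loss: push the score of the
    human-preferred response above the score of the rejected one."""
    r_chosen = reward_model(chosen_emb)
    r_rejected = reward_model(rejected_emb)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Toy batch: embeddings of (chosen, rejected) response pairs.
chosen = torch.randn(4, 768)
rejected = torch.randn(4, 768)
loss = preference_loss(chosen, rejected)
loss.backward()  # gradients move the reward model toward human preferences
```

The trained reward model then scores the policy's generations during the RL stage, so the policy is optimized to produce responses humans would prefer rather than to match a fixed dataset.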