RLHF
Also: Reinforcement Learning from Human Feedback
A training technique in which human raters compare or score model outputs, a reward model is trained on those preference judgments, and the language model is then fine-tuned with reinforcement learning to maximize the learned reward, steering its behavior toward more helpful and less harmful responses.
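The reward-model step is the heart of the pipeline: given pairs of responses where a human marked one as preferred, a model is trained to score the preferred response higher than the rejected one (a Bradley-Terry-style objective). The sketch below is a minimal, hypothetical illustration in PyTorch, not any particular system's implementation; the RewardModel class, the embedding dimension, and the random stand-in data are all assumptions. Real reward models are full language models that score token sequences.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical toy reward model: maps a fixed-size "response embedding"
# to a scalar score. In practice this is a full transformer LM head.
class RewardModel(nn.Module):
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.score(x).squeeze(-1)

def preference_loss(model, chosen, rejected):
    # Bradley-Terry objective: push the score of the human-preferred
    # response above the score of the rejected one.
    return -F.logsigmoid(model(chosen) - model(rejected)).mean()

model = RewardModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in data: embeddings of (preferred, rejected) response pairs.
chosen = torch.randn(32, 16)
rejected = torch.randn(32, 16)

for step in range(100):
    opt.zero_grad()
    loss = preference_loss(model, chosen, rejected)
    loss.backward()
    opt.step()

The trained model's scores then serve as the reward signal for a reinforcement-learning step (commonly PPO) that fine-tunes the language model itself to produce higher-scoring responses.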