Reinforcement Learning from Human Feedback (RLHF)
An approach in reinforcement learning where human feedback is used to shape agent learning and decision-making.
Expanded definition
Reinforcement Learning from Human Feedback (RLHF) is a technique that incorporates human judgment into the training of reinforcement learning agents. In a typical pipeline, humans compare pairs of agent behaviors (for example, trajectory clips or model responses), a reward model is trained to predict those preferences, and the agent is then optimized against the learned reward. Because the feedback reflects human values and expectations, RLHF helps align agent behavior with them and reduces the risk of undesirable outcomes. A common misconception is that RLHF entirely replaces traditional reward signals; rather, it supplements them to improve learning efficiency and effectiveness.
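To make the mechanism concrete, the sketch below shows the reward-modeling step that typically sits at the heart of RLHF: a small network is trained on pairwise human preferences with the Bradley-Terry objective, and its output can then supplement the environment's reward during policy optimization. This is a minimal illustration assuming PyTorch; RewardModel, preference_loss, and the synthetic preference data are hypothetical names introduced here, not taken from the original text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Maps a trajectory (here: a fixed-size feature vector) to a scalar reward.

    Illustrative architecture; real systems score full trajectories or text.
    """

    def __init__(self, feature_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)

def preference_loss(r_preferred: torch.Tensor, r_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry objective: maximize
    # P(preferred > rejected) = sigmoid(r_preferred - r_rejected).
    return -F.logsigmoid(r_preferred - r_rejected).mean()

feature_dim = 8
model = RewardModel(feature_dim)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Synthetic stand-in for human comparisons: each row of `preferred`
# was judged better than the corresponding row of `rejected`.
preferred = torch.randn(256, feature_dim)
rejected = torch.randn(256, feature_dim)

for step in range(100):
    loss = preference_loss(model(preferred), model(rejected))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The learned reward can then supplement the environment's own signal
# when optimizing the policy, e.g.:
#     total_reward = env_reward + beta * model(trajectory_features)
```

In practice, the comparisons come from human annotators ranking trajectory clips or model responses, and the agent's policy is subsequently optimized against the learned reward with an algorithm such as PPO.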