Category
Alignment
2 articles
Direct Preference Optimization (DPO)
Deep Learning, Machine Learning, Natural Language Processing
Reinforcement Learning from Human Feedback (RLHF)
Deep Learning, Machine Learning, Natural Language Processing