Reinforcement learning
60 articles
Action (Reinforcement Learning)
Machine Learning
AlphaGo
Artificial Intelligence, Google
AlphaStar
Artificial Intelligence, DeepMind, Game AI
AlphaZero
Artificial Intelligence, DeepMind, Game AI
Bellman Equation
Machine Learning, Mathematics
Control theory
Engineering, Mathematics, Robotics
Critic
Deep Learning, Machine Learning
DDPG (Deep Deterministic Policy Gradient)
Deep Learning
DQN
Deep Learning, DeepMind
Deep Q-Network (DQN)
Deep Learning, Machine Learning
Discount Factor
Machine Learning
Embodied AI
Artificial Intelligence, Deep Learning, Robotics
Environment
Machine Learning
Episode (Reinforcement Learning)
Machine Learning
Epsilon Greedy Policy
Machine Learning
Experience Replay
Machine Learning
GRPO
AI Techniques, AI Training, DeepSeek
Greedy Policy
Machine Learning
Imitation Learning
Machine Learning, Robotics
Importance sampling
Monte Carlo Methods, Statistics, Variational Inference
KTO
AI Alignment, AI Techniques, AI Training
Kimi K1.5
AI Models, Chinese AI, Large Language Models
Machine learning terms/Reinforcement Learning
Glossaries, Machine Learning
Markov Decision Process (MDP)
Machine Learning, Mathematics
Monte Carlo Tree Search
Algorithms, Game AI
MuJoCo
Open Source, Robotics, Simulation
MuZero
DeepMind, Models
NVIDIA Isaac Lab
NVIDIA, Robot Simulation, Robotics
OpenAI Five
Artificial Intelligence, Game AI, OpenAI
Pieter Abbeel
AI Researchers, People, Robotics
Policy
Machine Learning
Policy gradient methods
Machine Learning, Optimization
Process reward model (PRM)
AI Safety, Machine Learning, Model Evaluation
Proximal Policy Optimization (PPO)
Machine Learning, Optimization
Q-Function
Machine Learning
Q-Learning
Machine Learning
RLAIF
AI Safety, Machine Learning
RLVR
AI Techniques, AI Training, Reasoning Models
Random Policy
Machine Learning
Reinforcement Learning (RL)
Deep Learning, Machine Learning
Reinforcement Learning Models
AI Concepts, Machine Learning
Reinforcement learning
Artificial Intelligence, Deep Learning, Machine Learning
Replay Buffer
Deep Learning, Machine Learning
Return (Reinforcement Learning)
Machine Learning
Reward
Machine Learning
Reward hacking
AI Alignment, AI Safety, Machine Learning
Robot learning
Deep Learning, Machine Learning, Robotics
SARSA (State-Action-Reward-State-Action)
Machine Learning, Temporal-Difference Learning
Sergey Levine
AI Researchers, People, Robotics
Sim-to-real transfer
Robot Learning, Robotics
Simulation (in AI and robotics)
Robotics
Soft Actor-Critic
Algorithms, Deep Learning
State (Reinforcement Learning)
Machine Learning
State-Action Value Function
Machine Learning
Tabular Q-Learning
Machine Learning
Target Network
Deep Learning, Machine Learning
Temporal-difference learning
Algorithms, Machine Learning
Trajectory (Reinforcement Learning)
Machine Learning
Twin Delayed DDPG
Algorithms, Deep Learning
Tülu 3
AI Research, Allen Institute for AI, Open Source AI