Reinforcement Learning
107 articles
Action (Reinforcement Learning)
Machine Learning
AlphaChip
AI Hardware, Google DeepMind
AlphaDev
Algorithms, Google DeepMind
AlphaGo
Artificial Intelligence, Google
AlphaGo Zero
AI in Gaming, Google DeepMind
AlphaStar
AI in Gaming, Artificial Intelligence, Google DeepMind
AlphaTensor
Google DeepMind, Mathematics
AlphaZero
AI in Gaming, Artificial Intelligence, Google DeepMind
Andrew Barto
Machine Learning, People
Bellman Equation
Machine Learning, Mathematics
Best-of-N sampling
Machine Learning
Control theory
Mathematics, Robotics
Critic
Deep Learning, Machine Learning
DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization)
Machine Learning
DARE (Drop And REscale)
Machine Learning
DDPG (Deep Deterministic Policy Gradient)
Deep Learning
DQN
Deep Learning, Google DeepMind
Dactyl (OpenAI)
OpenAI, Robotics
David Silver
Google DeepMind, People
Deep Q-Network (DQN)
Deep Learning, Machine Learning
Depth up-scaling (DUS)
Machine Learning
Discount Factor
Machine Learning
DoReMi
Machine Learning
Dreamer (reinforcement learning)
World Models
Embodied AI
Artificial Intelligence, Deep Learning, Robotics
Environment
Machine Learning
Episode (Reinforcement Learning)
Machine Learning
Epsilon Greedy Policy
Machine Learning
Evol-Instruct
Machine Learning
Experience Replay
Machine Learning
GRPO
AI Inference, Chinese AI, Reasoning Models
Gato (DeepMind)
AI Models, Google DeepMind
Greedy Policy
Machine Learning
Group Sequence Policy Optimization (GSPO)
Machine Learning
Gym (OpenAI Gym / Gymnasium)
Developer Tools, OpenAI
HuggingFace TRL
Open Source AI, Training & Optimization
Imitation Learning
Machine Learning, Robotics
Importance sampling
Statistics
Instruction backtranslation (Humpback)
Machine Learning
Ioannis Antonoglou
Google DeepMind, People
Jeff Clune
AI Research, People
Joelle Pineau
Meta AI, People
John Schulman
OpenAI, People
KTO
AI Alignment, AI Inference, Training & Optimization
Kimi K1.5
AI Models, Chinese AI, Large Language Models
Machine learning terms/Reinforcement Learning
Machine Learning
Markov Decision Process (MDP)
Machine Learning, Mathematics
Misha Laskin
AI Companies, People
Model soups
Machine Learning
Monte Carlo Tree Search
AI in Gaming, Algorithms
MuJoCo
Open Source AI, Robotics
MuZero
AI Models, Google DeepMind
NVIDIA Isaac Lab
NVIDIA, Robotics
Online learning
Machine Learning
OpenAI Baselines
Open Source AI, OpenAI
OpenAI Five
AI in Gaming, Artificial Intelligence, OpenAI
Pieter Abbeel
People, Robotics
Pluribus (poker AI)
AI in Gaming, Meta AI
Policy
Machine Learning
Policy gradient methods
Machine Learning, Training & Optimization