Category: Reinforcement Learning

107 articles

Action (Reinforcement Learning)

Machine Learning

AlphaChip

AI Hardware, Google DeepMind

AlphaDev

Algorithms, Google DeepMind

AlphaGo

Artificial Intelligence, Google

AlphaGo Zero

AI in Gaming, Google DeepMind

AlphaStar

AI in Gaming, Artificial Intelligence, Google DeepMind

AlphaTensor

Google DeepMind, Mathematics

AlphaZero

AI in Gaming, Artificial Intelligence, Google DeepMind

Andrew Barto

Machine Learning, People

Bellman Equation

Machine Learning, Mathematics

Best-of-N sampling

Machine Learning

Control theory

Mathematics, Robotics

Critic

Deep Learning, Machine Learning

DAPO (Decoupled Clip and Dynamic Sampling Policy Optimization)

Machine Learning

DARE (Drop And REscale)

Machine Learning

DDPG (Deep Deterministic Policy Gradient)

Deep Learning

DQN

Deep Learning, Google DeepMind

Dactyl (OpenAI)

OpenAI, Robotics

David Silver

Google DeepMind, People

Deep Q-Network (DQN)

Deep Learning, Machine Learning

Depth up-scaling (DUS)

Machine Learning

Discount Factor

Machine Learning

DoReMi

Machine Learning

Dreamer (reinforcement learning)

World Models

Embodied AI

Artificial Intelligence, Deep Learning, Robotics

Environment

Machine Learning

Episode (Reinforcement Learning)

Machine Learning

Epsilon Greedy Policy

Machine Learning

Evol-Instruct

Machine Learning

Experience Replay

Machine Learning

GRPO

AI Inference, Chinese AI, Reasoning Models

Gato (DeepMind)

AI Models, Google DeepMind

Greedy Policy

Machine Learning

Group Sequence Policy Optimization (GSPO)

Machine Learning

Gym (OpenAI Gym / Gymnasium)

Developer Tools, OpenAI

HuggingFace TRL

Open Source AI, Training & Optimization

Imitation Learning

Machine Learning, Robotics

Importance sampling

Statistics

Instruction backtranslation (Humpback)

Machine Learning

Ioannis Antonoglou

Google DeepMind, People

Jeff Clune

AI Research, People

Joelle Pineau

Meta AI, People

John Schulman

OpenAI, People

KTO

AI Alignment, AI Inference, Training & Optimization

Kimi K1.5

AI Models, Chinese AI, Large Language Models

Machine learning terms/Reinforcement Learning

Machine Learning

Markov Decision Process (MDP)

Machine Learning, Mathematics

Misha Laskin

AI Companies, People

Model soups

Machine Learning

Monte Carlo Tree Search

AI in Gaming, Algorithms

MuJoCo

Open Source AI, Robotics

MuZero

AI Models, Google DeepMind

NVIDIA Isaac Lab

NVIDIA, Robotics

Online learning

Machine Learning

OpenAI Baselines

Open Source AI, OpenAI

OpenAI Five

AI in Gaming, Artificial Intelligence, OpenAI

Pieter Abbeel

People, Robotics

Pluribus (poker AI)

AI in Gaming, Meta AI

Policy

Machine Learning

Policy gradient methods

Machine Learning, Training & Optimization