(Created page with "*action *agent *Bellman equation *critic *Deep Q-Network (DQN) *DQN *environment *episode *epsilon greedy policy *experience replay *greedy policy *Markov decision process (MDP) *Markov property *policy *Q-function *Q-learning *random policy *reinforcement learning (RL) *replay buffer *return *reward *state *state-action value function *tabular Q-learning *target network *...") |
No edit summary |
Line 1: | Line 1:
*[[action]] | <noinclude>{{see also|Machine learning terms}}</noinclude>*[[action]]
*[[agent]] | *[[agent]]
*[[Bellman equation]] | *[[Bellman equation]]