Reinforcement learning
MDPs
Markov reward processes
Reward function
Discount factor
Return
State-value function
Bellman equation
Markov decision processes
Actions
Policy
Action-value function
Markov processes
State transition matrix
States
Dynamic programming
Policy improvement
Policy evaluation
Trial-and-error learning
Monte-Carlo learning
Temporal-Difference learning
Q-learning
Function approximation
Value function approximation
Reward approximation
Policy approximation
Deep learning
Model-based learning
Model-free learning