Reinforcement Learning :checkered_flag:
Methods
Model Free
Temporal Difference (TD)
R(s,a) + γ Σ_{s′} P(s,a,s′) max_{a′} Q(s′,a′) − Q(s,a)
On-Policy (SARSA)
Q(S,A) ← Q(S,A) + α[R + γ Q(S′,A′) − Q(S,A)]
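A minimal tabular sketch of this update, assuming a hypothetical toy environment whose `reset()` returns a state index and whose `step(a)` returns `(next_state, reward, done)`:

```python
import numpy as np

def epsilon_greedy(Q, s, n_actions, eps=0.1):
    # Behavior policy: explore with probability eps, otherwise act greedily.
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))

def sarsa(env, n_states, n_actions, episodes=500, alpha=0.1, gamma=0.99):
    # env is an assumed interface: reset() -> s, step(a) -> (s', r, done)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        a = epsilon_greedy(Q, s, n_actions)
        done = False
        while not done:
            s_next, r, done = env.step(a)
            a_next = epsilon_greedy(Q, s_next, n_actions)
            # On-policy TD target uses the action actually taken next: A'.
            Q[s, a] += alpha * (r + gamma * Q[s_next, a_next] * (not done) - Q[s, a])
            s, a = s_next, a_next
    return Q
```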
Off-Policy (Q-Learning)
Q(S_t, A_t) ← Q(S_t, A_t) + α[R_{t+1} + γ max_a Q(S_{t+1}, a) − Q(S_t, A_t)]
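The same loop becomes off-policy by bootstrapping from the greedy action rather than the action the behavior policy actually takes next; a sketch under the same assumed environment interface:

```python
import numpy as np

def q_learning(env, n_states, n_actions, episodes=500, alpha=0.1, gamma=0.99, eps=0.1):
    # env is an assumed interface: reset() -> s, step(a) -> (s', r, done)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # epsilon-greedy behavior policy
            a = np.random.randint(n_actions) if np.random.rand() < eps else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # Off-policy target: bootstrap from max_a Q(S', a),
            # not from the action the behavior policy will take.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) * (not done) - Q[s, a])
            s = s_next
    return Q
```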
Monte Carlo (MC)
π′(s) = argmax_a Σ_{s′,r} p(s′,r|s,a)[r + γ v(s′)]
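A sketch of first-visit Monte Carlo evaluation of v, from which the greedy improvement above can then derive π′; the `(state, reward)` trajectory format is an assumption of this sketch:

```python
import numpy as np
from collections import defaultdict

def mc_first_visit_V(episodes, gamma=0.99):
    # `episodes` is assumed to be a list of trajectories [(s0, r1), (s1, r2), ...],
    # where each reward follows the state it is paired with.
    returns = defaultdict(list)
    for episode in episodes:
        G = 0.0
        for t in reversed(range(len(episode))):
            s, r = episode[t]
            G = gamma * G + r  # discounted return from time t
            earlier_states = [episode[k][0] for k in range(t)]
            if s not in earlier_states:  # first visit to s in this episode
                returns[s].append(G)
    return {s: float(np.mean(g)) for s, g in returns.items()}
```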
Function Approximation
Incremental Methods
Batch Methods
DQN (Deep Q-Network): a reinforcement-learning algorithm that trains a deep neural network to approximate the action-value function, from which the agent picks its action in each state; training on minibatches sampled from a replay buffer makes it a batch method.
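A minimal sketch of one DQN training step in PyTorch, assuming CartPole-like shapes (4-dimensional states, 2 actions) and a `batch` of tensors already sampled from a replay buffer; real implementations also refresh the target network periodically:

```python
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99

def dqn_step(batch):
    # Assumed batch layout: states (B,4) float, actions (B,) long,
    # rewards (B,) float, next_states (B,4) float, dones (B,) float.
    states, actions, rewards, next_states, dones = batch
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bootstrapped target from a frozen target network stabilizes training.
        q_next = target_net(next_states).max(dim=1).values
        target = rewards + gamma * q_next * (1 - dones)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```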
Least Squares for Control
Least Squares: fit f to minimize the squared residuals d_n = y_n − f(x_n)
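A toy NumPy sketch of a linear least-squares fit (the data here is synthetic):

```python
import numpy as np

# Fit a linear approximation f(x) = w.x by minimizing sum_n (y_n - f(x_n))^2.
X = np.random.randn(100, 3)          # feature vectors x_n (toy data)
y = X @ np.array([1.0, -2.0, 0.5])   # targets y_n (noise-free, so a perfect fit exists)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ w                # d_n = y_n - f(x_n), ~0 here
```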
Policy Gradients
Policy Gradient Theorem: It describes the gradient of the expected discounted return with respect to an agent's policy parameters.
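One standard statement of the theorem, for reference:

```latex
\nabla_\theta J(\theta)
  = \mathbb{E}_{\pi_\theta}\!\left[
      \nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi_\theta}(s, a)
    \right]
```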
No Value Function (MC Policy Gradient = REINFORCE)
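A sketch of a REINFORCE update on one episode, reusing the PyTorch shapes assumed in the DQN sketch (`states` is `(T,4)` float, `actions` is `(T,)` long); the sampled Monte Carlo returns stand in for any learned value function:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def reinforce_update(states, actions, rewards, gamma=0.99):
    # rewards: list of T floats from one complete episode.
    # Monte Carlo returns G_t computed backwards -- no value function needed.
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns = torch.tensor(list(reversed(returns)))
    log_probs = F.log_softmax(policy(states), dim=1)
    chosen = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)
    # Gradient ascent on E[log pi(a|s) * G_t] (so minimize the negative).
    loss = -(chosen * returns).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```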
Actor-Critic (approximates both value and policy): a reinforcement-learning technique in which you simultaneously learn a policy function (the actor) and a value function (the critic).
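A sketch of a one-step actor-critic update under the same assumed shapes (`s`, `s_next` are `(4,)` float tensors, `a` an action index, `r` and `done` floats); the critic's TD error drives both networks:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

actor = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
critic = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

def actor_critic_step(s, a, r, s_next, done, gamma=0.99):
    # One-step TD error from the critic serves as the actor's learning signal.
    v = critic(s).squeeze(-1)
    with torch.no_grad():
        v_next = critic(s_next).squeeze(-1)
        td_target = r + gamma * v_next * (1.0 - done)
    td_error = td_target - v
    log_pi = F.log_softmax(actor(s), dim=-1)[a]
    # Critic regresses toward the TD target; actor follows the policy
    # gradient weighted by the (detached) TD error.
    loss = td_error.pow(2) - td_error.detach() * log_pi
    opt.zero_grad()
    loss.backward()
    opt.step()
```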
Model Based
Basic idea: a training method based on rewarding desired behaviors and/or punishing undesired ones.
Planning: deciding on future actions without actually performing them.
Integration (Dyna): combines learning from real experience with planning over a learned model; the Dyna-Q+ variant is designed for changing environments and gives a bonus reward to insufficiently explored state-action pairs to drive the agent to explore.
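A sketch of tabular Dyna-Q, the simplest such integration (the Dyna-Q+ exploration bonus mentioned above is omitted for brevity); the environment interface is the same assumption as in the TD sketches:

```python
import numpy as np

def dyna_q(env, n_states, n_actions, episodes=200, planning_steps=10,
           alpha=0.1, gamma=0.99, eps=0.1):
    Q = np.zeros((n_states, n_actions))
    model = {}  # (s, a) -> (r, s') learned from real experience
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            a = np.random.randint(n_actions) if np.random.rand() < eps else int(np.argmax(Q[s]))
            s_next, r, done = env.step(a)
            # Direct RL update from real experience.
            Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) * (not done) - Q[s, a])
            model[(s, a)] = (r, s_next)
            # Planning: replay simulated transitions from the learned model.
            for _ in range(planning_steps):
                (ps, pa), (pr, ps_next) = list(model.items())[np.random.randint(len(model))]
                Q[ps, pa] += alpha * (pr + gamma * np.max(Q[ps_next]) - Q[ps, pa])
            s = s_next
    return Q
```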
Search: a subfield of reinforcement learning that focuses on finding good parameters for a given policy parametrization.
Introduction
Markov Decision Process
Value Iteration
V_{k+1}(s) = max_a Σ_{s′} P^a(s,s′)[R^a(s,s′) + γ V_k(s′)]
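A NumPy sketch of this backup, assuming the model is given as arrays `P[a, s, s′]` (transition probabilities) and `R[a, s]` (expected rewards):

```python
import numpy as np

def value_iteration(P, R, gamma=0.99, theta=1e-8):
    # P: (n_actions, n_states, n_states), R: (n_actions, n_states) -- assumed layout.
    n_actions, n_states = R.shape
    V = np.zeros(n_states)
    while True:
        # Bellman optimality backup: V_{k+1}(s) = max_a sum_{s'} P [R + gamma V_k(s')]
        Q = R + gamma * np.einsum('ast,t->as', P, V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < theta:
            return V_new
        V = V_new
```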
Policy Iteration
V(s) = Σ_{s′,r} p(s′,r|s,π(s))[r + γ V(s′)]
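A sketch of the iterative policy-evaluation step, reusing the model arrays assumed in the value-iteration sketch and a deterministic policy `pi[s]` (an assumption):

```python
import numpy as np

def policy_evaluation(P, R, pi, gamma=0.99, theta=1e-8):
    # Repeatedly apply V(s) = R[pi(s), s] + gamma * sum_{s'} P[pi(s), s, s'] V(s').
    n_states = len(pi)
    V = np.zeros(n_states)
    while True:
        V_new = np.array([
            R[pi[s], s] + gamma * P[pi[s], s] @ V
            for s in range(n_states)
        ])
        if np.max(np.abs(V_new - V)) < theta:
            return V_new
        V = V_new
```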
Value: v(s)
Policy: π(s)
Reward: r
Exploration vs Exploitation
Bandits: a machine-learning framework in which an agent must select actions (arms) in order to maximize its cumulative reward in the long term.
Greedy Algorithms: the agent always performs the action currently believed to yield the highest expected reward.
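A sketch of an ε-greedy agent on a toy Gaussian bandit, illustrating both nodes above: pure greed (ε = 0) can lock onto a suboptimal arm, while occasional exploration keeps every arm's estimate fresh (the arm means here are synthetic):

```python
import numpy as np

def epsilon_greedy_bandit(true_means, steps=1000, eps=0.1):
    # true_means: unknown expected reward of each arm (synthetic for this demo).
    k = len(true_means)
    Q, N = np.zeros(k), np.zeros(k)
    total = 0.0
    for _ in range(steps):
        # Explore with probability eps, otherwise exploit the best estimate.
        a = np.random.randint(k) if np.random.rand() < eps else int(np.argmax(Q))
        r = np.random.normal(true_means[a], 1.0)
        N[a] += 1
        Q[a] += (r - Q[a]) / N[a]  # incremental sample mean
        total += r
    return Q, total

Q, total = epsilon_greedy_bandit([0.1, 0.5, 0.9])
```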
Dynamic Programming: used for planning in an MDP, to solve either the policy-evaluation or the control problem.
Bellman's Equation
V(s) = max_a [R(s,a) + γ V(s′)]