Reinforcement learning: A bot trying to find path(steps are actions left, right, up, down) to an object(reward) in a room(environment), State is the current location.
Trace the path robot took to find the object, and then provide feedback like higher numerical value for steps taking toward the object, and lower for the steps taking away from object.