Temporal Difference Learning

Temporal Difference (TD) Learning is an unsupervised technique in which a learning agent learns to predict the expected value of a variable that occurs at the end of a sequence of states. More generally, it is an approach to learning how to predict a quantity that depends on future values of a given signal. Reinforcement learning (RL) extends this technique by allowing the learned state values to guide actions, which in turn change the environment state. The TD learning algorithm is related to the temporal difference model of animal learning.
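To make the prediction idea concrete, the following is a minimal sketch of one-step tabular TD prediction, often called TD(0), on a small random-walk task. The task, the function name td0_random_walk, and every parameter value (five states, step size alpha, discount gamma, episode count, seed) are assumptions chosen for illustration and do not come from the text above. Each episode starts in the middle state and moves left or right at random; the variable to be predicted is the outcome at the end of the sequence, 1 for leaving on the right and 0 for leaving on the left.

```python
import random


def td0_random_walk(num_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate state values for a symmetric random walk with TD(0).

    Illustrative assumptions: states 0..num_states-1, start in the middle,
    outcome 1.0 when stepping off the right end and 0.0 off the left end.
    """
    rng = random.Random(seed)
    values = [0.5] * num_states  # initial guess for every state

    for _ in range(episodes):
        state = num_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                # Left terminal: outcome 0, no future value.
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= num_states:
                # Right terminal: outcome 1, no future value.
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False

            # Temporal difference: the gap between the current prediction
            # and a one-step-later estimate of the same quantity.
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error

            if done:
                break
            state = next_state

    return values


estimates = td0_random_walk()
for index, value in enumerate(estimates):
    print(f"state {index}: estimated value {value:.3f}")
```

For this particular walk the exact expected outcomes are 1/6, 2/6, 3/6, 4/6 and 5/6 from left to right, so the printed estimates should land near those values, with some noise because the step size is held constant. Note that each prediction is updated from the next prediction rather than from the final outcome alone, which is the sense in which the learned quantity depends on future values of the signal.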