This article describes reinforcement learning, an area of machine learning inspired by behaviorist psychology and concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. It differs from standard supervised learning in that correct input/output pairs are never presented, nor are sub-optimal actions explicitly corrected. The field in which reinforcement learning methods are studied is also called approximate dynamic programming. Reinforcement learning is particularly well suited to problems that involve a trade-off between long-term and short-term reward. It has been applied successfully to various problems, including robot control, elevator scheduling, telecommunications, backgammon, and checkers.
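The trade-off between short-term and long-term reward can be illustrated with a minimal sketch. The two-state environment, the `step` and `train` helpers, and all parameter values below are hypothetical and chosen only for illustration; tabular Q-learning is used here as one common reinforcement learning method, not as the only or definitive one. The agent is never told which action is correct. It only observes rewards, yet it should come to prefer waiting for the larger delayed reward over grabbing the small immediate one.

```python
import random

# Hypothetical toy environment (an assumption, not taken from the article):
#   state 0: action 0 ("grab") -> reward 1, episode ends
#            action 1 ("wait") -> reward 0, move to state 1
#   state 1: any action        -> reward 10, episode ends
def step(state, action):
    """Return (next_state, reward, done) for the toy environment."""
    if state == 0 and action == 0:
        return None, 1.0, True
    if state == 0 and action == 1:
        return 1, 0.0, False
    return None, 10.0, True


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = max((0, 1), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # The target combines immediate reward with discounted future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q = train()
best = max((0, 1), key=lambda a: q[0][a])
print("Q-values in state 0:", q[0])
print("Preferred first action:", "wait" if best == 1 else "grab")
```

In this sketch the discount factor `gamma` controls how much the agent values delayed reward; with a value close to 0 the immediate reward of 1 would win instead.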